In the previous section, I discussed how popular pages (as judged by links) rank higher. By this logic, you might expect that the Internet's most popular pages would rank for everything. To a certain extent they do (think Wikipedia!), but the reason they don't dominate the rankings for every search result page is that search engines put a lot of emphasis on determining relevancy.
Text Is the Currency of the Internet
Relevancy is a measure of the theoretical distance between two items in terms of how closely they are related. Luckily for Google and Microsoft, modern-day computers are quite good at calculating this measurement for text.
By my estimations, Google owns and operates well over a million servers. The electricity to power these servers is likely one of Google's larger operating expenses. This energy limitation has helped shape modern search engines by putting text analysis at the forefront of search. Quite simply, it takes less computing power and is much simpler programmatically to determine relevancy between a text query and a text document than it is between a text query and an image or video file. This is the reason why text results are so much more prominent in search results than videos and images.
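To make that idea concrete, here is a minimal sketch of the kind of arithmetic involved: scoring two made-up documents against a text query using simple bag-of-words term counts and cosine similarity. The query, the documents, and the scoring method are illustrative assumptions only; real search engines use far more elaborate (and proprietary) models, but the basic step of turning text into numbers and comparing them is the same.

```python
# Illustrative sketch (not Google's algorithm): score query/document relevancy
# with plain term-count vectors and cosine similarity.
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    shared = set(a) & set(b)
    dot = sum(a[t] * b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical query and pages, invented for the example.
query = Counter(tokenize("jessica simpson"))
documents = {
    "page_about_jessica": "Jessica Simpson released a new album this week.",
    "page_about_tuna":    "Chicken of the Sea is a brand of canned tuna.",
}

for name, text in documents.items():
    score = cosine_similarity(query, Counter(tokenize(text)))
    print(f"{name}: {score:.3f}")
```

Even this crude version shows why text is cheap to rank: the whole comparison is a handful of multiplications, which is far less work than trying to decide what an image or video is about.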
As of this writing, the most recent time that Google publicly released the size of its indices was in 2006. At that time it released the numbers shown in Table 1-1.
Table 1-1: Size of Google Indices

Data                    Size in Terabytes
Crawl Index             800
Google Analytics        200
Google Base             2
Google Earth            70
Orkut                   9
Personalized Search     4
So what does this emphasis on textual content mean for SEOs? To me, it indicates that my time is better spent optimizing text than images or videos. This strategy will likely have to change in the future as computers get more powerful and energy efficient, but for right now text should be every SEO's primary focus.
This is especially true until Google finds better ways to interpret and grade non-textual media.
But Why Content?
The most basic structure a functional website could take would be a blank page with a URL. For example purposes, pretend your blank page is on the fake domain www.WhatIsJessicaSimpsonThinking.com. (Get it? It is a blank page.) Unfortunately for the search engines, clues like top-level domains (.com, .org, and so on), domain owners (WHOIS records), code validation, and copyright dates are poor signals for determining relevancy. This means your page with the dumb domain name needs some content before it is able to rank in search engines.
The search engines must use their analysis of content as their primary indication of relevancy for determining rankings for a given search query. For SEOs, this means the content on a given page is essential for manipulating—that is, earning—rankings. In the old days of AltaVista and other search engines, SEOs would just need to write "Jessica Simpson" hundreds of times on the site to make it rank #1 for that query. What could be more relevant for the query "Jessica Simpson" than a page that says Jessica Simpson 100 times? (Clever SEOs will realize the answer is a page that says "Jessica Simpson" 101 times.) This metric, called keyword density, was quickly manipulated, and the search engines of the time diluted the power of this metric on rankings until it became almost useless. Similar dilution has happened to the keywords meta tag, some kinds of internal links, and H1 tags.
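For the curious, keyword density itself is trivial to compute, which is part of why it was so easy to game. The sketch below, using an invented page and phrase, shows the metric the old engines were effectively rewarding: the share of a page's words taken up by the target phrase.

```python
# Keyword density, the easily gamed metric described above: the fraction of a
# page's words occupied by the target phrase. The page text is made up.
def keyword_density(text, phrase):
    words = text.lower().split()
    phrase_words = phrase.lower().split()
    n = len(phrase_words)
    matches = sum(
        1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase_words
    )
    return (matches * n) / len(words) if words else 0.0

page = "Jessica Simpson news about Jessica Simpson and more Jessica Simpson"
print(f"{keyword_density(page, 'Jessica Simpson'):.1%}")  # 60.0%
```

Stuffing the phrase a few more times pushes the number toward 100 percent, which is exactly why the engines stopped trusting it.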
Despite being more sophisticated, modern-day search engines still work essentially the same way they did in the past—by analyzing content on the page.
Hey, Ben Stein, thanks for the history lesson, but how does this apply to modern search engines? The funny thing is that modern-day search engines still work essentially the same way they did back in the time of keyword density. The big difference is that they are now much more sophisticated. Instead of simply counting the number of times a word or phrase is on a webpage, they use natural language processing algorithms and other signals on a page to determine relevancy. For example, it is now fairly trivial for search engines to determine that a piece of content is about Jessica Simpson if it mentions related phrases like "Nick Lachey" (her ex-husband), "Ashlee Simpson" (her sister), and "Chicken of the Sea" (she is infamous for thinking the tuna brand "Chicken of the Sea" was made from chicken). The engines can do this for a multitude of languages and with astonishing accuracy.
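Here is a toy illustration of that idea, assuming a hand-picked list of related phrases (it is not drawn from any real engine's data): a page that naturally mentions terms known to co-occur with "Jessica Simpson" scores higher than one that merely repeats her name.

```python
# Toy topical scoring via related phrases. The related-term list is invented
# for this example; a real engine would learn such associations at scale.
RELATED_TERMS = {"nick lachey", "ashlee simpson", "chicken of the sea"}

def topical_score(text, related_terms=RELATED_TERMS):
    """Count how many known related phrases appear in the page text."""
    text = text.lower()
    return sum(1 for term in related_terms if term in text)

stuffed_page = "jessica simpson " * 100
natural_page = (
    "Jessica Simpson, sister of Ashlee Simpson and ex-wife of Nick Lachey, "
    "famously wondered whether Chicken of the Sea was chicken or tuna."
)
print(topical_score(stuffed_page))  # 0
print(topical_score(natural_page))  # 3
```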
Don't believe me? Try going to Google right now and searching related:www.jessicasimpson.com. If your results are like mine, you will see websites about her movies, songs, and sister. Computers are amazing things.

In addition to the words on a page, search engines use signals like image meta information (alt attribute), link profile and site architecture, and information hierarchy to determine how relevant a given page that mentions "Jessica" is to a search query for "The Simpsons."
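As a rough illustration of where those on-page signals live, the sketch below pulls the title, H1 headings, and image alt text out of a snippet of invented HTML using only Python's standard library. A real crawler's extraction pipeline is vastly more involved; this only shows that the signals are sitting right there in the markup for anyone to read.

```python
# Sketch: extract a few on-page signals (title, h1, image alt text) from raw
# HTML with the standard-library parser. The sample markup is invented.
from html.parser import HTMLParser

class SignalExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.signals = {"title": [], "h1": [], "img_alt": []}
        self._current = None  # tag whose text we are currently collecting

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self._current = tag
        elif tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.signals["img_alt"].append(alt)

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current and data.strip():
            self.signals[self._current].append(data.strip())

html = """
<html><head><title>Jessica Simpson Biography</title></head>
<body><h1>Jessica Simpson</h1>
<img src="jessica.jpg" alt="Jessica Simpson on stage"></body></html>
"""

parser = SignalExtractor()
parser.feed(html)
print(parser.signals)
```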