It’s normal for us to open a web browser, go to a search engine’s main page (if it isn’t already our homepage), and type in a search query. Search engines are our main way of locating specific information on the World Wide Web; without them, it would be virtually impossible to find anything without knowing a particular URL down to the last character. But come to think of it, how do search engines work?
When we use the term “search engine”, we usually mean the search form that combs through the billions of HTML documents on the Web that a robot has gathered. But there are actually three varieties of search engine: spider-driven, human-driven, and hybrids of the two.
Spider-driven search engines are those that use automated computer programs (a.k.a. spiders, ants, or bots) that visit a web page, read the information it contains, and follow its hyperlinks to index the site’s remaining pages as well as the other websites it links to. All of the information gathered by crawlers is then brought back to a central database, where it is indexed. Crawlers revisit websites from time to time to check for updates. Human-driven search engines, by contrast, depend on human users to submit information, which is then indexed and categorized.
For the purposes of SEO, let’s talk about crawler-based search engines. Such search engines have two main functions:
- Crawling and indexing
- Determining content relevance, ranking pages, and providing results
Crawling and Indexing
To make it easier to visualize how search engines work, imagine that the vast expanse of the Internet is a huge subway system with a complicated network of stops. Each stop represents a document (e.g., a web page, JPG, or PDF). Search engine spiders need to crawl the entire subway system and locate all of the stops using the only paths available: hyperlinks. Through those links, search engine spiders reach the billions of pages, files, photos, videos, and other interconnected documents.
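To make the crawling step concrete, here is a minimal sketch of a breadth-first crawler in Python. The seed URL, the page limit, and the use of the requests and BeautifulSoup libraries are all assumptions for illustration; real spiders are massively distributed and also honor robots.txt, throttle their requests, and deduplicate far more carefully.

```python
# A minimal, hypothetical crawler sketch: fetch a page, extract its
# hyperlinks, and follow them breadth-first.
from collections import deque
from urllib.parse import urljoin

import requests
from bs4 import BeautifulSoup

def crawl(seed_url, max_pages=10):
    seen = {seed_url}             # URLs already discovered
    frontier = deque([seed_url])  # URLs waiting to be visited
    visited = 0
    while frontier and visited < max_pages:
        url = frontier.popleft()
        try:
            html = requests.get(url, timeout=5).text
        except requests.RequestException:
            continue  # skip pages that fail to load
        visited += 1
        print("Visited:", url)
        soup = BeautifulSoup(html, "html.parser")
        # Hyperlinks are the only paths between "stops": follow each one.
        for anchor in soup.find_all("a", href=True):
            link = urljoin(url, anchor["href"])
            if link not in seen:
                seen.add(link)
                frontier.append(link)

crawl("https://example.com")  # hypothetical seed URL
```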
Once the crawlers land on these pages, they analyze the source code and store selected data on massive storage arrays, to be recalled later when users issue a search query. To carry out the colossal task of storing billions of web pages so that they can be retrieved in a split second, search engines have built enormous datacenters in cities around the world. These facilities house numerous machines capable of processing big data. After all, human users demand search results on the spot; even a three-second delay can displease them, so search engines work hard to deliver results as fast as they can.
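The data structure at the heart of that split-second retrieval is an inverted index, which maps every term to the documents containing it, so a lookup never has to scan the whole collection. A toy sketch in Python, with an invented three-page corpus:

```python
# A toy inverted index: map each term to the set of documents that
# contain it, so a lookup is a single dictionary access rather than
# a scan of every stored page.
from collections import defaultdict

documents = {  # hypothetical mini-corpus
    "page1.html": "how search engines crawl the web",
    "page2.html": "subway systems and network stops",
    "page3.html": "search engines index the web",
}

index = defaultdict(set)
for doc_id, text in documents.items():
    for term in text.lower().split():
        index[term].add(doc_id)

print(index["search"])  # the pages containing "search": page1 and page3
print(index["subway"])  # {'page2.html'}
```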
Providing Results
When you search for something online, search engines have to scour their billions of stored documents and:
- Select those that are relevant to the query
- Rank the results based on their perceived value
Because of this, relevance and importance (or popularity) are exactly what a search engine optimization (SEO) campaign aims to influence.
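As a rough illustration of both steps, the sketch below selects documents that contain every query term (relevance) and orders them by a simple term-frequency score (a crude stand-in for importance). Real engines blend hundreds of signals, including link-based popularity; this scoring rule and the mini-corpus are assumptions purely for demonstration.

```python
# Toy retrieval and ranking: keep documents containing all query terms,
# then order them by how often the terms appear (term frequency).
documents = {  # hypothetical mini-corpus
    "a.html": "seo tips for search rankings and search visibility",
    "b.html": "search engines rank pages by relevance",
    "c.html": "subway maps of the city",
}

def search(query):
    terms = query.lower().split()
    results = []
    for doc_id, text in documents.items():
        words = text.lower().split()
        if all(term in words for term in terms):              # relevance filter
            score = sum(words.count(term) for term in terms)  # crude importance score
            results.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(results, reverse=True)]

print(search("search rankings"))  # ['a.html']: the only page with both terms
```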
Have you noticed that search engines sometimes return dead links? This is precisely because search results are drawn from the engine’s records, or index. If the index isn’t up to date and hasn’t been revised since a page went offline, the search engine still treats that page as if it were active.
Have you also noticed that the same query returns different results on different search engines? Partly this is because no two engines’ indices are identical; each depends on what its crawlers find or what human users submit. Another reason is that search engines use different algorithms to determine how relevant the indexed data is to what users are searching for.
How Can I Design a Successful SEO Campaign?
Search engines are very stingy when it comes to sharing the algorithms they use to rank pages and determine relevance. They provide only limited guidance on how webmasters can drive more traffic to their websites and earn a better slot in search engine results pages. A summary of what little SEO-related information they have revealed follows.
Google
- Design websites for humans, not for search engines.
- Make all of your web pages reachable through static text links.
- Create pages with useful and informative content. Don’t forget to fill in the <title> tag and ALT attributes.
- Don’t overload pages with links; fewer than 100 per page is a reasonable guideline. (A simple audit of these points is sketched after this list.)
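A quick way to check a few of these points mechanically is a small audit script. The sketch below, which assumes the BeautifulSoup library and an invented sample page, flags a missing <title>, images without ALT text, and pages with 100 or more links:

```python
# A small audit sketch for the guidelines above: look for a <title>,
# images missing ALT text, and an excessive number of links.
from bs4 import BeautifulSoup

def audit(html):
    soup = BeautifulSoup(html, "html.parser")
    problems = []
    if soup.title is None or not soup.title.get_text(strip=True):
        problems.append("missing or empty <title> tag")
    for img in soup.find_all("img"):
        if not img.get("alt"):
            problems.append(f"image without ALT text: {img.get('src')}")
    if len(soup.find_all("a", href=True)) >= 100:
        problems.append("100 or more links on one page")
    return problems

sample = "<html><head></head><body><img src='logo.png'></body></html>"
print(audit(sample))
# ['missing or empty <title> tag', 'image without ALT text: logo.png']
```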
Yahoo!
Yahoo! says the following factors can affect search results:
- Number of inbound links
- Content
- Updates to indices
- Discovery of other websites
- Changes to their search algorithm
Bing
Bing recommends the following:
- Include targeted keywords in the visible body text.
- Limit each page to roughly one topic and to a practical size; an HTML page without images should be at most 150 KB. (A quick way to check this is sketched after this list.)
- Pages should be accessible by static text links.
- Don’t put targeted keywords or phrases you want indexed inside an image. Use text, not images of text.
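The 150 KB guideline above is easy to verify mechanically. A minimal sketch, assuming the page’s HTML has been saved to a local file:

```python
# Check a saved HTML file against the 150 KB size guideline above.
import os

MAX_BYTES = 150 * 1024  # 150 KB

def check_page_size(path):
    size = os.path.getsize(path)
    status = "exceeds" if size > MAX_BYTES else "is within"
    print(f"{path}: {size:,} bytes, {status} the 150 KB guideline")

check_page_size("index.html")  # hypothetical local copy of a page
```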