How do search engines work, and how is your content crawled, indexed and ranked? Learn it now!
In chapter 1, you learnt the ‘SEO 101 Basics’ and got familiar with some of the popular search engines used across the globe. Now that you know what search engines are, it is time to learn how they behave. You will see how search engines discover, understand and organize the content available on the internet so that relevant results are presented to searchers.
It is very important that your content can be easily found by search engines. Remember, if your site cannot be found by search engines, it won't show up on the Search Engine Results Page (SERP).
Know How Search Engines Work
The three primary functions of search engines are crawling, indexing and ranking. Let us look at each of them in detail.
Crawling
It is a process in which a website is scanned by search engine crawlers/bots/spiders and details (titles, linked pages, images, keywords etc.) about each and every page are collected. So, for example, if Googlebot is visiting your website, it is basically analysing the content or code placed on the web pages.
Googlebot starts by fetching a few web pages, then follows the links placed on those pages, thus finding new URLs. Along with discovering new links, the crawler also finds the new content placed at those links and adds it to its index, known as Caffeine. This 'Caffeine' is a database of discovered URLs, which Google later retrieves to answer user queries for which the content placed on those URLs is an ideal match.
For a website, it is very important that all of its web pages are available for crawling.
Indexing
If the content on your website is crawled, it will then be processed for indexing. Once crawled, the content is stored and organized in huge databases so that it can appear as an answer to relevant search queries. For example, if you search for 'Europe history' on Google, its algorithm will look for your search terms in the indexed files and ensure that appropriate pages pertaining to Europe's history are presented to you. If your website consistently gains domain authority, search engines are very likely to crawl and index it more frequently.
Ranking
Ranking, in simple words, is presenting pieces of content to a searcher in response to their search query. While ranking content for a searcher, search engines order it from most relevant to least relevant, putting in their best effort to solve the query and offer the most appropriate content choices.
In fact, the higher a website ranks, the more relevant the search engine believes it to be for its vertical or for the targeted keywords it solves queries for.
You can, however, block search engines from visiting certain portions of your website, or even block them from visiting your website completely. But if you want search engines to discover and rank your website with ease, make sure that it can be easily crawled and indexed.
Can Your Web Pages Be Easily Indexed? Here’s How You Can Check.
As we discussed above, to rank on the Search Engine Results Pages (SERPs), it is essential that your website is easily crawled and indexed. If you have a site ready, or if you're working on a client's site, you should check how many pages of the website have been crawled and indexed. Doing this will show you whether all the pages you wanted search engines to crawl and index have actually been fetched. Besides this, you will also get to know whether the pages you did not want to appear in search engines are indeed hidden.
As we discussed in the previous chapter, out of all the search engines, Google has the highest market share at more than 91%, and it is certainly the most efficient one. Why do we call it efficient? Because the cumulative traffic coming from Google Search and YouTube search is not only the largest, it also comes from the places where the volume of web searches is highest. So, most of the things in the search engine space revolve around Google.
Now, let's see how to check which pages of your website are indexed. There's a very simple command you can run in the Google search bar: just type site:yourdomain.com. For example, if you want to check the indexed pages of blog.hubspot.com, type site:blog.hubspot.com into the Google search bar.
Once you do this, you will see the number of your website's pages indexed on Google. Although the number of pages shown in the results might not be exact, it will give you a fair understanding of how your pages are appearing on Google.
If you wish to track and monitor accurate details of your website’s indexing, you can start using Google Search Console and use its Index Coverage option. Using Google Search Console will give you a better understanding of your website’s indexed pages, site links, errors and more. Just start using Google Search console simply by signing up for a free account.
Using Google Search Console, you can also submit sitemaps for your website and check how many of the submitted pages have been added to Google's index. In case you're not able to find your website, or some of its pages, in the index, one of the following could be the reason:
- Your website is new and it has not been crawled by Google yet.
- Some code on your website is blocking Googlebot from visiting the page(s).
- Your website's navigation is so complex that Google's robots can't crawl the website easily.
- Due to spammy tactics, your website has been penalized by Google.
- No external website links to your website.
Like we discussed earlier, there are some pages on your website which you want Googlebot to find, but there could also be pages which you don't want to appear in the search engine results, for example duplicate URLs, thin content pages, sign-up form pages etc. To stop Googlebot from crawling such pages or website portions, you need to use robots.txt.
Let us read in detail what robots.txt is and understand how you can leverage it so that search engines crawl only the pages that are meant to be crawled.
Robots.txt
Robots.txt files tell search engines which portions of a website their bots should crawl and which portions they should avoid. These files are located in the root directory of a website (e.g. www.dhirenvyas.com/robots.txt) and can also suggest the speed at which bots should crawl the website.
Robots.txt is basically a text file created by webmasters to instruct search engine bots on how to crawl a website's pages. It is part of the Robots Exclusion Protocol (REP), a set of standards that governs how robots crawl the web and how content is stored, indexed and served to searchers. The REP also covers directives such as meta robots tags, along with page-, sub-directory- and site-wide instructions, which help search engines determine how to treat the links on a page ("Follow" or "Nofollow").
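To make this concrete, here is a minimal robots.txt sketch. The domain, directory names and crawl-delay value are purely hypothetical; your actual directives will depend on which sections of your site you want crawled:

# robots.txt placed at https://www.example.com/robots.txt (hypothetical)
User-agent: *                  # rules for all crawlers
Disallow: /admin/              # keep the admin area out of the crawl
Disallow: /thank-you/          # keep thin "thank you" pages out of the crawl
Crawl-delay: 10                # crawl-speed hint honoured by some bots (Google ignores it)

User-agent: Googlebot          # rules only for Googlebot
Disallow: /internal-search/

Sitemap: https://www.example.com/sitemap.xml

Note that Disallow only discourages crawling; as you will see below, it does not guarantee that a page stays out of the index.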
Now, as you know what Robots.txt is, let us find out how Googlebot treats robots.txt files while crawling.
How Googlebot Treats robots.txt Files
Here is how Googlebot behaves in each of the three possible cases:
- When Googlebot cannot find a robots.txt file for a website, it continues crawling the website.
- When Googlebot encounters an error while accessing a website's robots.txt file and cannot resolve it, it stops crawling the website.
- When Googlebot finds a robots.txt file without encountering any error, it follows the directives provided in the file and continues crawling the website.
However, there is a twist in the tale when it comes to how Googlebot responds to your website's robots.txt file. John Mueller has said in a podcast that even a page which is blocked from crawling through robots.txt may get indexed and ranked if the search engine considers the content on the page valuable.
It is always a good idea to optimize for "crawl budget". What is crawl budget? It is basically the average number of URLs Googlebot will crawl on your site in one visit. Optimizing it ensures that Googlebot doesn't waste time crawling pages that are of no importance and doesn't ignore pages that are highly important for the website. Crawl budget usually matters most for sites with more than 10,000 URLs. Content that is not important at all can be blocked, but make sure you do not block the crawler's access to pages that carry other directives such as noindex or canonical tags, because the crawler has to be able to see those directives in order to act on them.
You have to make sure that private content pages, such as login and administration pages, are noindexed and placed behind a login form rather than listed in the robots.txt file. This prevents people with malicious intent from using the robots.txt file to locate and access that data and steal your website's private information. You must also understand that not all bots abide by robots.txt. People with wrong intentions, such as email scrapers, do not follow this protocol; therefore, besides guiding the crawlers to crawl your website optimally, keeping private content safe is also important.
How Can You Define URL Parameters in Google Search Console?
Websites, usually ecommerce ones, often have the same content available at different URLs. This happens because they add parameters to their URLs. For example, when you search for clothing, say "men's t-shirts", on Walmart.com and then apply a filter, say a maximum price of $10, you will notice that the URL changes slightly:
https://www.walmart.com/browse/clothing/t-shirts-tank-tops/5438_133197_4237948
https://www.walmart.com/browse/clothing/t-shirts-tank-tops/5438_133197_4237948?max_price=10&page=1
But have you ever wondered how Google decides which URL it should serve to its users? Here's the answer. Google itself is smart enough to gauge which URL is the representative one, but you can also help it through Google Search Console, where the URL Parameters feature lets you guide Google on how it should treat your pages.
Using this feature, you can configure a parameter to inform Googlebot which URLs it shouldn't crawl. In Search Console, this is done with an instruction like "crawl no URLs with XXXXX parameter". It tells Googlebot to hide, or ultimately remove, the affected pages from the search results.
Are Crawlers Able to Crawl The Important Content?
Now that you have an idea of how to stop search engines from crawling the pages that are least important to you, let's see how you can optimize the important pages of your website so that Googlebot can crawl them easily.
It often happens that a search engine is able to find only some parts of your website while crawling. In such a scenario, many important pages get ignored for one reason or another. It is therefore essential that bots are able to crawl all the content you want them to, not just the homepage but the rest of the important pages too.
You should spend time finding out the crawl and index status of all the web pages on your website. As mentioned earlier, a tool like Search Console can be crucial in finding this out. Apart from this, also check whether any of the following is happening with your website:
When Content is Placed Behind Login Forms
If your website requires visitors to fill out forms, log in, or answer surveys before showing a piece of content, this is a problem for crawlers. You have to understand that search engines are not going to see pages that are protected, and it is impossible for a crawler to crawl a page that requires a login.
When You Rely on Search Boxes for Crawling
Search forms cannot be crawled by bots. Placing a search box on your website and assuming that crawlers will be able to find whatever searchers look for is therefore a misconception.
When You Hide Useful Text Inside Non-text Content
It is not a good idea to place content that is intended to be indexed inside a non-text media item (image, GIF, video etc.). Always make sure that content you want indexed is placed as HTML markup on your web page. Search engines have improved at recognizing text placed in images, but this still doesn't guarantee the same results as placing useful content as text on your webpage.
Do You Have Seamless Site Navigation for Search Engines?
We have read in this chapter that a search engine crawler discovers your website through links from other sites. Similarly, it also needs a navigation path of links on your website itself so that it can crawl your website page by page. If a page that is important for your website isn't linked to from any other page of your website, it is less likely to be noticed by search engines and therefore may not appear in the search results. Many websites make the mistake of structuring their navigation so poorly that it becomes impossible for search engines to access some of their pages.
Here are the navigation mistakes that may prevent crawlers from viewing your entire site:
- If your website's navigation is not coded as HTML menu items, crawlers may find it difficult to crawl your website; JavaScript-based navigation is a common example. Googlebot's technology has improved considerably, but when it comes to crawling JavaScript navigation, perfect crawling still isn't guaranteed.
- If your website has different mobile and desktop navigations, there are chances that optimum crawling might not be achieved.
- If your website displays personalized or unique navigations to different users, this may appear as cloaking to search engine bots, thus impacting the display of your website pages in SERPs.
- If you do not link a new page from your website navigation or its parent page, crawlers may not find and crawl it.
So you have just learnt how to identify whether your website has seamless navigation for crawlers, and which mistakes to avoid so that search engine crawlers can read your website efficiently. Let's now see whether your website is user-friendly enough and has a clean information architecture for users.
Is Your Website User-friendly, With a Clean Information Architecture?
This is very important from a user's point of view. Ideally, a website should be user-friendly enough to guide users in taking their own actions on its pages. It should be properly organized, with clearly labelled content, so that users can easily browse through the pages and consume information. The ultimate objective of a clean information architecture is to make users feel that they needn't put in effort to find something on your website.
Sitemaps: An Ideal Way to Attract Crawlers
We discussed earlier in this chapter that crawling and indexing are the two key essentials for making your web pages rank on Google. Feeding Search Console with a sitemap can further simplify the job of crawling and indexing your high-priority pages.
A sitemap is basically a list of URLs that tells crawlers which pages exist on your website and which URLs they need to crawl. To ensure that a search engine like Google finds all the important pages of your website, submit a sitemap using Google Search Console.
Submitting a sitemap gives crawlers a path to all the important pages of your website. If your website isn't linked to from any other site, submitting an XML sitemap may still get your web pages crawled.
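For reference, here is what a minimal XML sitemap might look like, using hypothetical URLs; in practice, most CMSs and SEO plugins generate this file for you:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>            <!-- homepage -->
    <lastmod>2021-01-15</lastmod>                  <!-- date of last change -->
  </url>
  <url>
    <loc>https://www.example.com/important-page/</loc>
    <lastmod>2021-01-10</lastmod>
  </url>
</urlset>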
Are Crawlers Encountering Errors While Reading Your Site's URLs?
It is possible that while crawling the URLs on your website, the search engine crawler encounters an error. To find crawling errors, check the "Crawl Errors" report in Google Search Console. This report will show you the affected URLs, giving you useful insight into "not found" errors as well as server errors.
The next step for SEO learners is to understand the types of not-found and server errors. So, let's get familiar with them.
Decoding The 4xx Errors
The 4xx errors appear due to a client error, when search engine bots are not able to access the content on your website. They typically occur when a requested URL has improper syntax or cannot be fulfilled. A very common example of a 4xx error is the "404 error", aka "404 not found". Such an error appears in situations like a typing mistake in a URL, a broken redirect, or when an indexed page has been deleted. From a user's point of view, a page with a 404 error offers no relevant information that solves their query. And from a search engine's point of view, such errors are troublesome too, as the search engine can't access the URL.
Decoding The 5xx Errors
The 5xx errors appear when, due to a server error, crawlers are not able to access the content on your website. This happens when a searcher tries to access a page, but the server on which it is located fails to fulfil the request. Such errors can be tracked in Google Search Console using the crawl error report. When a request for a URL times out, Googlebot abandons the request, and these errors get reported.
What Are 301 Redirects and How Can You Use Them?
A 301 redirect, also known as a 301 permanent redirect, is used to avoid situations where an old page is replaced by a new page and searchers or search engines cannot reach the new page. Take an example: a website has a page at the URL seohacks.com/facebook-instagram/ and you move its content to a new page at seohacks.com/social-media/. In this case, you need to give searchers as well as the search engines a bridge so that they can smoothly transition from the old URL to the new URL. You build that bridge using a 301 redirect.
Let us now go through the following points and understand the significance of 301 redirects:
- 301 redirects transfer link equity from a page's old URL to its new URL.
- 301 redirects are essential if you want to pass the authority of an old URL to a new URL.
- 301 redirects allow Googlebot to crawl and index the new version of the page.
- 301 redirects can be the saviour of your rankings if you redirect the old URL to the new URL in time, helping you sustain your website's reputation.
- 301 redirects give searchers the information that lived at the old URL by smoothly transferring them to the new URL.
A 301 redirect signifies that a page has been permanently moved to a new URL (location). You should always ensure that such a redirect never points to an irrelevant page; in other words, make sure the 301 redirect points to the page where the content of the old page now resides. Otherwise, it may adversely impact your rankings. Use 301 redirects sensibly, as they directly relate to your website's rankings.
What Is a 302 Redirect? How Is It Different From a 301 Redirect?
You have learnt how 301 redirects function and why they matter. A 302 redirect, by contrast, is a temporary solution: while a 301 redirect tells search engines that a page has permanently moved to a new location and the old page should be replaced with the new one, a 302 redirect indicates that the page has moved only temporarily.
The 302 redirects aren't used as often as the 301 (permanent) redirects. From an SEO professional's point of view, it is very important to know which type of redirect is right for your particular need. Choosing the wrong type may confuse search engines, resulting in a loss of rankings and traffic.
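As an illustration, here is how the two redirect types might be set up in an .htaccess file on an Apache server (a sketch only, assuming Apache's mod_alias; other servers and CMS plugins have their own equivalents). The first rule reuses the example URLs above, while the /sale/ paths are hypothetical:

# Permanent move: the old social page now lives at /social-media/
Redirect 301 /facebook-instagram/ https://seohacks.com/social-media/

# Temporary move: send visitors to a holding page while /sale/ is reworked
Redirect 302 /sale/ https://seohacks.com/sale-coming-soon/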
Keep An Eye on Redirect Chains
For Googlebot, crawling a page can be difficult if it has been redirected multiple times. Such multiple redirections are what Google calls "redirect chains". It is always good to limit the number of redirects so that the path from the old URL to the new URL is clear to Googlebot.
For example, suppose your website has a page at theseoglobe.com/a which you redirect to a new URL, theseoglobe.com/b, and then you redirect that page again to theseoglobe.com/c.
In such a situation, it is better to remove the middle URL, i.e. theseoglobe.com/b, and redirect theseoglobe.com/a straight to theseoglobe.com/c.
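In .htaccess terms (again a sketch, assuming an Apache server), collapsing the chain simply means pointing every old URL straight at the final destination:

# Before: /a -> /b -> /c (a redirect chain)
# After: both old URLs point directly to the final page
Redirect 301 /a https://theseoglobe.com/c
Redirect 301 /b https://theseoglobe.com/c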
Keep an eye on your website's crawling status, and once you're sure that the crawlability of your website is optimized, check whether your website's indexing is good enough too.
Understanding How Search Engines Index and Store Your Website’s Pages
After your site is crawled, the next step is to make sure that it gets indexed by the search engines. This is really important because mere crawling of your website doesn’t guarantee that the data on your pages will be stored in search engines’ indexes.
Once a crawler discovers a page, the search engine analyzes the content on the page and then stores it in its index. As we mentioned earlier in this chapter, Google has its own index called "Caffeine", where it stores all the data crawled by Googlebot.
Now, let's understand how indexing works and how you should prepare your website to get it indexed by the search engines.
Here’s How A Googlebot Crawler Reads Your Pages
As an SEO practitioner, it is very important for you to know how and when Googlebot reads your pages. To find out, you can view the cached version of your website and see when the crawler last crawled it.
Google’s crawler crawls and caches the websites at different frequencies. For e.g. a popular website like www.bbc.com is crawled more frequently as compared to less popular sites.
Search for your website on Google. On the Search Engine Results Page (SERP), click the green dropdown arrow right next to your website's URL, then click "Cached", and a cached version of your website will appear.
To know if your website’s important content is crawled and cached or not, you can also check the text-only version of your website.
Can Pages Be Taken Off From The Index?
It is certainly possible for pages to be removed from the index. Here are the reasons why a URL could be removed from a search engine's index:
- When a URL returns a 4xx error (client error, such as "not found") or a 5xx error (server error), it may be removed from the index. This might be intentional, if a page was deleted and given a 404 status so that it would be removed from the index, or accidental, if a page was moved to a new location but wasn't given a 301 redirect.
- When site owners want the search engine to exclude a page from its index, they give the page a noindex meta tag, which instructs the search engine to remove the page from its index.
- If a URL violates a search engine's Webmaster Guidelines, it could be manually penalized and therefore removed from the search engine's index.
- If a URL requires visitors to enter a password to access the page, crawlers are blocked as well. In such a scenario, the page will not be crawled and will hence be excluded from the index.
If you see an existing page excluded from Google's index, you can check its status using the URL Inspection tool in Search Console. If Google tells you that the page has not been indexed yet, click the "Request Indexing" button once Google has finished inspecting the URL. This submits the URL to Google's index. In Search Console, you can also check whether there is any kind of issue with your page, which will help you gauge how Google has interpreted it.
Communicating to Search Engines How to Index Your Website
Importance of Robots Meta Directives
You should use the meta directives, also known as meta tags, to instruct the search engines on how to treat your web page.
Site owners often use either robots meta tags in the <head> of their HTML pages or the X-Robots-Tag in the HTTP header to communicate instructions to crawlers, such as "do not index this page in search results" or "do not pass link equity to any link on the page".
Using Robots Meta Tag
As we mentioned above, the robots meta tag is usually used in the HTML <head> of a web page. You can use it to block all or some specific search engines. Let’s go through the common meta directives used by SEOs in certain situations and understand the utility of robots meta tag.
Index/Noindex
This is quite a common robots meta directive, through which you tell search engines whether they should index the page for retrieval at a later stage. If "noindex" is used on a page, it tells crawlers that the page should be excluded from the search results. The default signal passed to search engines is "index", meaning they crawl and index the page and display it in the search results. You do not need to use the "index" value, as it already is the default.
When should you use index/noindex?
As mentioned, the "index" value does not need to be specified because it is the default for search engines. However, you can use the "noindex" value when you do not want pages with thin content to be stored in Google's index but still want users to be able to access them for certain actions.
Follow/Nofollow
Follow/Nofollow meta directives tell search engines whether the links on a particular page should be followed or not. "Follow" allows bots to follow the links residing on a page and pass link equity through them. "Nofollow", on the other hand, instructs search engines not to follow the links on a page and not to pass link equity through them.
When should you use follow/nofollow?
By default, search engines follow all the links existing on a page. However, you can use nofollow if you do not wish to pass link equity to the URLs placed on a page. Quite often, nofollow and noindex are used together to prevent the page from being indexed and to stop the crawler from following the links residing on it.
Noarchive
This is used to stop search engines from storing a cached copy of your web page. By default, search engines keep visible copies of all the pages they have indexed and make them accessible to searchers via the cached links in search results.
When should you use noarchive?
Noarchive can be used by ecommerce websites where product prices change quite frequently so that searchers do not see outdated prices.
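For instance, an ecommerce product page could carry the following tag in its <head>, built the same way as the noindex examples shown next:

<meta name="robots" content="noarchive" />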
So now you know about the index & noindex, follow & nofollow, and noarchive robots meta tags. Let us move on to the code you have to place in the HTML <head> of a page so that search engines noindex it:
<!DOCTYPE html>
<html><head>
<meta name="robots" content="noindex" />
(…)
</head>
<body>(…)</body>
</html>
And if you wish search engines to both noindex and nofollow a page, you can place the following HTML code in the page's <head>:
<!DOCTYPE html>
<html>
<head>
<meta name="robots" content="noindex, nofollow" />
</head>
<body>…</body>
</html>
If, for example, you wish to exclude specific crawlers such as Googlebot and Bingbot, you can use multiple robots exclusion tags, one per crawler.
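For example, to noindex a page only for Googlebot and Bingbot, while leaving other crawlers under the default rules, you could place crawler-specific tags like these in the page's <head>:

<meta name="googlebot" content="noindex" />
<meta name="bingbot" content="noindex" />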
Using X-Robots-Tag
Compared to the robots meta tag, the X-Robots-Tag gives you more functionality and flexibility, as it lets you control indexing at a larger scale. It is sent in the HTTP header of a URL's response, so you can apply directives to non-HTML files and, using regular expressions in your server configuration, enable sitewide noindex rules.
Using these tags, you can exclude specific file types or even entire folders. You can read more about the robots meta tag here.
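As a minimal sketch, here is an X-Robots-Tag rule added via an .htaccess file (assuming an Apache server with mod_headers enabled) that keeps every PDF file out of the index:

<FilesMatch "\.pdf$">
  Header set X-Robots-Tag "noindex, nofollow"
</FilesMatch>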
Tip for WordPress users
In WordPress, go to Dashboard, then Settings, and then click on Reading to find the "Search Engine Visibility" option. If you want search engines to crawl and index your site, make sure the "Discourage search engines from indexing this site" box is unchecked; tick it only if you deliberately want to keep search engines away from your site.
Unfolding Ranking: How Are URLs Ranked by Search Engines?
Understanding how the ranking of URLs works from a search engine's point of view is very important for SEO learners. Search engines present results to searchers as answers to their queries. They present the results on a Search Engine Results Page (SERP) in order from most relevant to least relevant, and this ordering is what we call ranking.
But, did you ever think how search engines are able to rank the URLs? Well, the answer is right here!
Search engines use their algorithms to gauge relevance. These algorithms are essentially processes or formulas that help a search engine retrieve the stored information and order it in the most logical manner.
Over the years, the algorithms of all the search engines have gone through changes to improve the quality of the answers given to users' queries. For example, the world's largest search engine, Google, is known to make minor algorithm alterations every day; these are small changes intended to improve quality. However, core algorithm updates are also introduced from time to time to counter bigger issues. For example, Google introduced its Penguin update in 2012 to tackle link spam.
You can read about all the algorithm updates of Google here.
Although Google rarely reveals why it changes its algorithms so often or what it intends to do next, there is one thing that we, who work in the search engine space, can sense: Google wants to make search quality better.
Google regularly highlights that it makes quality updates, and such updates may at times impact your website too. It is therefore necessary for you to check Google's Quality Guidelines from time to time.
What Do Search Engines Need?
Providing useful answers in the most helpful formats has always been the key objective of search engines. And over time, in making this happen, the SEO space has witnessed a lot of changes. Just as we gradually learn a language, search engines too have learnt our behaviour and our languages.
Earlier, when search engines were at an evolving stage, it was very easy to use tactics and tricks to gain an advantage against the search engines' quality guidelines. For example, keyword stuffing was enough to rank a page for a keyword like "Michael Jackson songs".
In those early days of SEO, if you wanted to rank for "Michael Jackson songs", using this keyword multiple times on a page and making it bold was sufficient; the ranking for a particular keyword could easily be boosted.
That was the time when baseless content stuffed with keywords was placed on a page merely for the purpose of ranking. Here's an example:
You want to listen to Michael Jackson songs? Here you will listen to Michael Jackson songs. Michael Jackson songs of all genres are listed here. These Michael Jackson songs will take you to the era of Michael Jackson. By listening to these Michael Jackson songs you can dance and groove, try that moonwalk.
Such tactics gave users a terrible experience rather than the information they were looking for. This used to work in the past, but nowadays search engines have become far smarter. They not only want websites to serve users with the sensible content they are looking for, they also want that information to be detailed, unique and presented in the most suitable formats.
Importance of Links in SEO
Links are mainly of two types: backlinks and internal links. Backlinks, or inbound links, are links from other websites pointing to your website, while internal links are links on your website pointing to other pages of your own website.
In SEO, links have always played an important role. Earlier, search engines relied on links (URLs) heavily to gauge the trustworthiness and ranking credibility of websites; the number of links pointing to a website helped search engines calculate its authority. Although today's search engines do not rely on links as heavily, since things like user experience, site speed & load time, content formats etc. have come into the picture, links still hold their importance in the world of SEO.
To measure the quality and quantity of a website's links, Google came up with PageRank. PageRank is part of the search engine's core algorithm, and it estimates the importance of a webpage by gauging the quality and quantity of the URLs pointing to it. It works on the logic that a webpage which is trustworthy, important and relevant will earn more links.
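For the mathematically curious, the simplified formula from the original PageRank paper looked like this; treat it purely as an illustration, since today's implementation is far more complex:

PR(A) = (1 - d) + d * ( PR(T1)/C(T1) + ... + PR(Tn)/C(Tn) )

Here T1 to Tn are the pages linking to page A, C(Ti) is the number of outbound links on page Ti, and d is a damping factor commonly set to 0.85.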
In order to rank higher on search results, it is good to gain natural links coming from websites which have higher authority.
Importance of Content in SEO
Links (URLs) direct searchers to a source where they can find answers to their queries and solutions to their problems. And to give answers to searchers, websites need content! Content can be offered to searchers in multiple formats: images, videos, PDFs and, most importantly, text.
Search engines are robots, or say answer machines, and their primary objective is to serve their users with answers. Whenever a searcher searches for something, there are numerous possible results. For a search engine to identify the right answers to display first, or say rank higher, it needs the most suitable content to answer its user's query. Search engines always give ranking preference to pages whose content adds value in solving the user's query. In simple words, pages that match the query intent are very likely to rank.
Today, user satisfaction is the top priority for search engines, and therefore there is no guideline that fixes the length of the content or the number of times a keyword should appear in it. What search engines want is that their users are served with detailed information in the most natural form of content. For every website, the focus should be the users, the consumers of the content.
The ranking signals today are numerous, but the top three among them are the number of links coming to your website from external sources, the quality of the content placed on your website, and RankBrain.
RankBrain: The Key Component of Google’s Core Algorithm
RankBrain, introduced by Google on 26 October 2015, is a machine-learning-based component of Google's core algorithm that helps it process queries and deliver more relevant search results to searchers. Because it runs on a machine learning program, it constantly improves search results through training data and new observations.
If a word or phrase is not familiar to Google, the RankBrain machine learning algorithm works out what words or phrases might have a similar meaning, thus filtering and providing the most appropriate search results to users. It emphasizes better user satisfaction through its ability to understand what people mean when they type in a word or a phrase.
Along with content and links, RankBrain is one of the three most important signals for ranking on Google search. Therefore, it is an intelligent practice to use as many keyword permutations and combinations as possible while developing content for your page's subject, and to make sure these keywords are used in your content in the most natural way. This gives Google a diversified set of keywords for a single topic and helps you rank even for uncommon or low-competition keywords.
How to utilize RankBrain for better SEO?
As Google will keep utilizing RankBrain to encourage relevant and useful content for its users, SEO professionals should concentrate on fulfilling the user intent. Ensure that you provide useful and detailed information to the users, and focus on providing the best user experience to the searchers who land on your website.
Understanding the Engagement Metrics
Your ranking doesn't make a significant impact unless it can be measured. And to measure the effectiveness of your ranking, content quality, user experience and many other related factors, engagement metrics play a key role. It will not be wrong to say that engagement metrics are not only correlated with Google rankings but are also partially the outcome of your presence in the search results.
Engagement metrics are basically the data that tell you how searchers are interacting with your website.
The following are included under engagement metrics:
- Clicks – The number of visits coming from search.
- Time on page – The amount of time a searcher spends on your webpage before leaving.
- Bounce rate – The percentage of sessions in which users viewed only a single page. In other words, bounce rate is the percentage of visits in which a visitor lands on a page and leaves your website without browsing further.
- Pogo sticking – This occurs when a user clicks on an organic result on a SERP and then quickly returns to the SERP and clicks a different result. It is often confused with bounce rate, but it is different: while bounce rate indicates that a user visited a page to get an answer to their query and did not visit the site any further, pogo sticking highlights complete dissatisfaction on the part of the user.
It has been mentioned by various sources that engagement metrics are directly related to higher ranking. And we agree too! It won’t be wrong to comment that good engagement metrics may make your site rank on a higher spot.
What Does Google Say on Engagement Metrics and Ranking?
It is to be noted that Google has never used a term like "direct ranking signal" for engagement metrics. Google has, however, been quite specific that it consumes click data to alter the SERP for particular queries.
Udi Manber, Google's former Chief of Search Quality, commented that if, for a particular query, 80 percent of people click on the result placed at #2 and only 10 percent click on the result at #1, Google will switch the results, considering that the result at #2 is the one people find more useful for that query.
Similarly, Edmond Lau, a former Google engineer, states that any reasonable search engine would use click data to improve the quality of its search results.
How exactly Google utilizes click data is solely its own call, but these statements indicate that click data is indeed used.
As Google always looks to improve search quality, it is safe to say that engagement metrics matter more than many would expect. However, it is to be understood that Google also doesn't call engagement metrics a "ranking signal". This could be because the primary objective of these metrics is to improve search quality, and the ranking of URLs on a SERP is simply a byproduct of that.
What do the tests confirm?
There have been various tests which confirm that Google adjusts the order of the SERP depending on searcher engagement.
A test done by Rand Fishkin in 2014 showed a search result moving from position #7 to #1 on the SERP after around 200 people were asked to click on the URL. The ranking was also influenced by the location of the people who clicked: in the US, where a large share of the clicks came from, the ranking improved dramatically, while it remained lower in regions where fewer people clicked on the link.
Another test, by Larry Kim, concluded that Google's machine-learning-based algorithm demotes the ranking of pages on which people spend less time.
Another test by Darren Shaw brought an interesting insight. It highlighted user behavior’s impact on map pack results and local search.
It will not be wrong to say that SEOs should consider optimizing for engagement, as various tests have shown that user engagement metrics are used to modify the SERPs to achieve higher quality levels, and this also largely changes rank positions as a byproduct. Remember that even if your web page hasn't changed and its backlinks haven't been affected, its ranking may decline if searcher behaviour indicates to the search engine that people like other pages more than yours for a particular query.
When it comes to ranking pages, engagement metrics function like a fact-checker for Google. While factors such as the content and backlinks of a page feed Google with answers to search queries and signals of authority, engagement metrics enable it to order the pages rightly and to check whether the ranking order is justified.
How Have Search Results Evolved?
Earlier, when search engines were not as smart, the SERPs were very plain and just had the usual "10 blue links". Whenever a searcher searched for anything on Google, only 10 organic results would appear, always in the same "10 blue links" format.
It was only the #1 spot on the SERP that every website desired. But later on, Google started introducing new formats on the search results pages, known as SERP features, which allowed websites to rank in one format or another and increase their chances of being visible to the potential audience. Below are some of the most common SERP features:
- Site Links
- Featured Snippets
- Knowledge Panel
- Paid Advertisements
- People Also Ask Boxes
- Local (map) Pack
Google continuously experiments with new formats and keeps adding new SERP features regularly. On one hand, this has had a negative impact on the organic results of multiple websites, as many organic results have been pushed down by the introduction of new SERP features; on the other hand, Google has improved the user experience to a great extent by showing different content formats that better satisfy the queries.
Let us now have a look at different types of query intent and the SERP features used to answer those intents.
| Query Intent | SERP Feature Likely Used |
| --- | --- |
| Local Search | Map Pack |
| Transactional | Shopping |
| Informational | Featured Snippet |
| Informational with 1 answer | Instant Answer / Knowledge Graph |
In the next chapter, i.e. Chapter 3, we will be discussing the intent part in detail. For now, it is very important for you to understand that the answers to searches can be delivered in numerous formats. And the most important part is that your content’s structure can impact the format in which it will show on the SERP.
Localized Search
Google has its own index for local businesses, which it uses to create local search results. If you are optimizing for a local business, i.e. a business with a physical outlet that customers can visit (for example, a sunglasses store), it is very necessary for you to register that business on Google's free local business listing platform, Google My Business.
To determine the ranking of localized search results, Google uses three important factors:
- Relevance
- Distance
- Prominence
Relevance
Relevance, from Google's point of view, allows it to determine how appropriate a local business is for a searcher. To feed Google the relevance of your business, it is very important for you to thoroughly fill in the business information in Google My Business, and to make sure that the information you fill in is accurate.
Distance
Distance is another key factor used by Google to rank you in localized search results. Google uses your business's geographic location to decide its presence in the local results. Note that the local search results shown by Google are very sensitive to proximity: Google takes very seriously the geo-location of the searcher, or the location mentioned in the searcher's query.
Prominence
Google considers the prominence of a business listed on Google My Business aka Google Business Profile and also rewards it by preferring it in the local search results. Other than the offline prominence of a business, Google also checks the online factors to evaluate its ranking position. Some of the online ranking factors based on prominence are:
Customer Reviews
Google reviews given by customers for a local business, along with the sentiments highlighted in those reviews, notably impact the ranking of the business in the local results.
Business Citations
Another important local ranking factor is business citation, also known as business listing. It is basically a local business’ web-based reference having inputs such as Name, Address and Phone Number (NAP). Such business citations are created on local platforms such as Infogroup, YP, Yelp, Acxiom, Localeze etc.
Google uses the number and consistency of these local business citations to derive the prominence of a business. Using data from such citations, Google is able to build its own local business index. When Google comes across consistent references to a business multiple times, the business gains Google's trust, increasing its chances of being shown in the local search results. Apart from this, Google also utilizes other information about a local business available on the internet, such as articles and links.
Focus on Local Engagement
Engagement is another key factor that may bring you delightful results in local business optimization. You have to understand that Google improves local results by using real-world data such as the average length of visits, popular times to visit etc. It also allows searchers to ask questions to the business. Engagement has, over time, become crucial in deciding the local results shown to searchers.
Google is utilizing real-world data to display local results. It observes how searchers respond to local businesses and how they interact with them. Rather than just displaying results on the basis of links, citations etc., Google goes deeper into the sentiments of customers. Therefore, businesses that wish to operate successfully in the world of local SEO should brainstorm on how to improve the quality of their services and how to make customers happy, so that positive sentiment is generated.
So, in this chapter you have learnt how search engines work, how crawling, indexing and ranking function, and how local SEO works. In the next chapter, you will learn everything about keyword research. So proceed to Chapter 3 and discover how you can gauge what your audience is looking for.