Googlebot - Eazy Walkers
Googlebot

What is Googlebot?

Googlebot is the generic name for Google’s web crawler. It collects documents from webpages to build a searchable index for Google Search, and it constantly visits billions of pages across the web.

What is a Web Crawler?

Web crawlers (also known as bots, robots or spiders) are a type of software designed to follow links, gather information and then send that information somewhere.
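The "follow links" step can be illustrated with a small sketch. It is not how Googlebot is implemented; it only shows one way to pull links out of a page using Python's standard library. The class name `LinkCollector`, the helper `extract_links`, and the sample HTML and URLs are all hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute URLs from the href attribute of <a> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the page's own URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(base_url, html):
    """Return every link found in the given HTML, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


# Hypothetical page content, used only for illustration.
sample_html = '<p><a href="/about">About</a> <a href="https://example.org/">Other</a></p>'
found = extract_links("https://example.com/index.html", sample_html)
print(found)
```

A real crawler would fetch the HTML over the network and then pass each extracted link on to whatever stores or processes the gathered information.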

How does Googlebot work?

Googlebot uses sitemaps and databases of links discovered during previous crawls to determine where to go next. Whenever the crawler finds new links on a site, it adds them to the list of pages to visit next. If it finds changed or broken links, it makes a note of them so the index can be updated. The program itself determines how often it crawls each page. To make sure Googlebot can correctly index your site, check the site’s crawlability: if your pages are accessible to crawlers, they can be crawled regularly.
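The "list of pages to visit next" described above is often called a crawl frontier. The following is a minimal sketch of that idea, assuming a breadth-first queue and an in-memory link graph in place of real network fetches. `LINK_GRAPH`, `crawl`, and the URLs are hypothetical, and Google's actual scheduling is far more elaborate.

```python
from collections import deque

# Hypothetical link graph standing in for real pages: URL -> links found on it.
LINK_GRAPH = {
    "https://example.com/": ["https://example.com/a", "https://example.com/b"],
    "https://example.com/a": ["https://example.com/b", "https://example.com/missing"],
    "https://example.com/b": ["https://example.com/"],
}


def crawl(seed_urls, link_graph):
    """Visit pages breadth-first; return (visit order, broken links)."""
    frontier = deque(seed_urls)  # pages to visit next
    seen = set(seed_urls)        # URLs already queued, to avoid repeats
    visited, broken = [], []

    while frontier:
        url = frontier.popleft()
        if url not in link_graph:
            # Stand-in for a failed fetch: make a note of the broken link.
            broken.append(url)
            continue
        visited.append(url)
        for link in link_graph[url]:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return visited, broken


visited, broken = crawl(["https://example.com/"], LINK_GRAPH)
print(visited)
print(broken)
```

The `seen` set is what keeps the crawler from looping forever when pages link back to each other.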

Googlebot was designed to run simultaneously on thousands of machines to improve performance and to scale as the web grows. Generally, Googlebot crawls over HTTP/1.1. However, since November 2020, Googlebot may crawl a site over HTTP/2 when the site supports it and may benefit from it. This may save computing resources (for example, CPU and RAM) for both the site and Googlebot, but otherwise it doesn’t affect the indexing or ranking of your site.

The difference between Googlebot and the Google index

Googlebot

  • Googlebot retrieves content from the web.
  • Googlebot does not judge the content in any way; it only retrieves it.
  • The only concerns Googlebot has are “Can I access this content?” and “Is there any further content that I can access?”

The Google index

  • The Google index takes the content it receives from Googlebot and uses it to rank pages.
  • The first step of being ranked by Google is to be retrieved by Googlebot.

How to block Googlebot from visiting your site?

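To stop Googlebot from gathering the information available on your site, you can use the following methods:

  • Use appropriate directives in robots.txt, as Googlebot follows the instructions in it.
  • Add robots instructions in a meta tag on the web page, for example <meta name="googlebot" content="noindex, nofollow" />. Here noindex keeps the page out of the index and nofollow tells Googlebot not to follow the links on the page; the page must still be crawlable for the tag to be read.
  • Keep the XML sitemap file accurate and leave out URLs you do not want crawled. A sitemap only suggests URLs to crawl; it does not block crawling by itself.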
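To check how robots.txt directives would apply to a given URL, one option is Python's standard `urllib.robotparser` module. The sketch below parses a hypothetical robots.txt (the rules and the example.com URLs are assumptions for illustration) and asks whether a crawler identifying itself with a given token may fetch a page. Note that this module implements its own reading of the robots.txt conventions, which may differ in edge cases from Google's parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot is blocked from /private/ but allowed elsewhere.
blocked = parser.can_fetch("Googlebot", "https://example.com/private/report.html")
allowed = parser.can_fetch("Googlebot", "https://example.com/blog/post.html")
# Other crawlers fall back to the "*" group, which allows everything.
other = parser.can_fetch("OtherBot", "https://example.com/private/report.html")

print(blocked, allowed, other)  # prints: False True True
```

To block Googlebot from the whole site rather than one directory, the `Disallow` line under `User-agent: Googlebot` would be `Disallow: /`.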

Types of Google crawlers (user agents)

There are several types of Google crawlers. Google’s main crawler is called Googlebot.

  • Crawler is a generic term for any program (such as a robot or spider) that is used to automatically discover and scan websites by following links from one webpage to another.
  • User agent token is used in the User-agent: line in robots.txt to match a crawler type when writing crawl rules for your site.
Crawler | User agent token (product token)
APIs-Google | APIs-Google
AdSense | Mediapartners-Google
AdsBot Mobile Web Android (Checks Android web page ad quality) | AdsBot-Google-Mobile
AdsBot Mobile Web (Checks iPhone web page ad quality) | AdsBot-Google-Mobile
AdsBot (Checks desktop web page ad quality) | AdsBot-Google
Googlebot Image | Googlebot-Image, Googlebot
Googlebot News | Googlebot-News, Googlebot
Googlebot Video | Googlebot-Video, Googlebot
Googlebot (Desktop) | Googlebot
Googlebot (Smartphone) | Googlebot
Mobile AdSense | Mediapartners-Google
Mobile Apps Android (Checks Android app page ad quality. Obeys AdsBot-Google robots rules.) | AdsBot-Google-Mobile-Apps
Feedfetcher | FeedFetcher-Google
Google Read Aloud | Google-Read-Aloud
Duplex on the web | DuplexWeb-Google
Google Favicon (Retrieves favicons for various services) | Google Favicon
Web Light | googleweblight
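When reading server logs, it can help to map a request's User-Agent header to one of the product tokens in the table above. The following is a rough sketch, assuming a simple case-insensitive substring match that prefers the longest token (so that Googlebot-Image is not reported as plain Googlebot). The function name `match_token` and the sample header strings are illustrative, and a User-Agent header can be spoofed, so a match alone does not prove the request came from Google.

```python
# Product tokens copied from the table above.
TOKENS = [
    "APIs-Google",
    "Mediapartners-Google",
    "AdsBot-Google-Mobile-Apps",
    "AdsBot-Google-Mobile",
    "AdsBot-Google",
    "Googlebot-Image",
    "Googlebot-News",
    "Googlebot-Video",
    "Googlebot",
    "FeedFetcher-Google",
    "Google-Read-Aloud",
    "DuplexWeb-Google",
    "Google Favicon",
    "googleweblight",
]


def match_token(user_agent):
    """Return the longest token contained in the header, or None."""
    ua = user_agent.lower()
    matches = [token for token in TOKENS if token.lower() in ua]
    return max(matches, key=len) if matches else None


# Sample header values, used only for illustration.
print(match_token("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))
print(match_token("Googlebot-Image/1.0"))
print(match_token("Mozilla/5.0 (Windows NT 10.0; Win64; x64)"))
```

Choosing the longest match mirrors the idea that a more specific token should win over the generic Googlebot token when both appear in the same string.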
