A brief on the robots meta tag

What is a robots meta tag?

Robots meta tags (also known as "robots meta directives") give crawlers instructions on how to crawl and index the content of an individual web page. These tags are placed in the head section of the web page.

What is crawling and indexing?

We have mentioned the jargon terms crawling and indexing but never explained them. Now is the time to understand them.

Crawling: Programs called crawlers (also known as spiders) start by fetching a few web pages and gathering information from them. They then follow the links on those pages to discover and crawl new pages. This is how crawlers work through billions of web pages and gather their information.

Indexing: This is the process of storing the information gathered while crawling a web page in a database, so it can be looked up later for search results.
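
To make the two terms concrete, here is a minimal sketch of a crawl-and-index loop. It is only an illustration, not how any real search engine is built: the PAGES dictionary is a made-up in-memory "web" that stands in for real HTTP fetches, and the crawl function name is hypothetical.

```python
from collections import deque

# Hypothetical in-memory "web": URL -> (page text, outgoing links).
PAGES = {
    "/home": ("welcome to the site", ["/about", "/blog"]),
    "/about": ("about the author", ["/home"]),
    "/blog": ("posts about crawling", ["/about", "/missing"]),
}


def crawl(seed_urls):
    """Breadth-first crawl: fetch a page, store it, queue its unseen links."""
    index = {}                 # the "database": URL -> stored text
    queue = deque(seed_urls)
    seen = set(seed_urls)
    while queue:
        url = queue.popleft()
        page = PAGES.get(url)  # stands in for an HTTP fetch
        if page is None:       # broken link: nothing to index
            continue
        text, links = page
        index[url] = text      # indexing: store what was gathered
        for link in links:     # crawling: follow links not seen before
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


print(sorted(crawl(["/home"])))  # ['/about', '/blog', '/home']
```

Starting from a single seed page, the loop reaches every linked page once, and the broken /missing link is simply skipped.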

How does the robots meta tag work?

<head>

  <!-- Example 1: Targets only Googlebot -->
  <meta name="googlebot" content="noindex">

  <!-- Example 2: Targeting all bots -->
  <meta name="robots" content="noindex, nofollow">

</head>

name="" This attribute targets a specific crawler bot (i.e. the user agent), as shown in Example 1. To target all crawler bots, we set the value to robots, as shown in Example 2.

content="" This attribute specifies the instructions or rules for the crawler bots. We can assign multiple instructions at once by separating them with commas, as shown in Example 2.
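
To see how a bot might read these two attributes, here is a rough sketch using Python's standard html.parser module. The RobotsMetaParser class and the directives_for helper are hypothetical names made up for this example, and it assumes a crawler simply merges the rules addressed to all bots with the rules addressed to it by name. Real crawlers may resolve conflicts differently.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the content of named <meta> tags, keyed by the name value."""

    def __init__(self):
        super().__init__()
        self.rules = {}  # e.g. {"robots": {"noindex", "nofollow"}}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name = (attrs.get("name") or "").strip().lower()
        content = attrs.get("content") or ""
        if not name or not content:
            return
        # Comma-separated instructions become a set of lowercase rules.
        rules = {r.strip().lower() for r in content.split(",") if r.strip()}
        self.rules.setdefault(name, set()).update(rules)


def directives_for(html, user_agent):
    """Rules addressed to all bots plus those addressed to user_agent."""
    parser = RobotsMetaParser()
    parser.feed(html)
    general = parser.rules.get("robots", set())
    specific = parser.rules.get(user_agent.lower(), set())
    return general | specific


PAGE = """
<head>
  <meta name="googlebot" content="noindex">
  <meta name="robots" content="noarchive, nofollow">
</head>
"""

print(sorted(directives_for(PAGE, "googlebot")))  # ['noarchive', 'nofollow', 'noindex']
print(sorted(directives_for(PAGE, "bingbot")))    # ['noarchive', 'nofollow']
```

Only the bot named googlebot picks up the noindex rule here, while every bot gets the rules listed under robots.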

Most commonly used robots meta tag values

1. noindex: The search engine doesn't index the page, so it will not appear in search results.

<meta name="robots" content="noindex">

2. nofollow: Tells a crawler not to follow any links on a page or pass along any link equity.

<meta name="robots" content="nofollow">

Link equity: A search engine ranking factor based on the idea that certain links pass value and authority from one page to another. It is also referred to as "link juice".

3. noarchive: The search engine doesn't show a cached or archived copy of the page in search results.

<meta name="robots" content="noarchive">

4. nosnippet: The search engine doesn't show any text snippet or video preview for the page in search results. The text snippet is the short description under the result title, which is often taken from the meta description.

<meta name="robots" content="nosnippet">

5. notranslate: The search engine doesn't offer to translate the page in search results.

<meta name="robots" content="notranslate">

6. unavailable_after: We use this when we don't want the page to show up in search results after a certain date and time.

<meta name="robots" content="unavailable_after: 2022-07-17">

7. none: Restricts all the permissions that a crawler might need. It is equivalent to using both the noindex and nofollow values together.

<meta name="robots" content="none">

8. all: No restrictions are placed on the crawler or the search engine. This is the default behavior, so listing it explicitly has no effect.

<meta name="robots" content="all">

9. noimageindex: Tells the search engine not to index any images on the web page.

<meta name="robots" content="noimageindex">

These are some of the most commonly used robots meta tag values.
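
As a rough illustration of how some of these values combine, the sketch below expands none into noindex plus nofollow, treats all as adding no restrictions, and checks an unavailable_after date. The function names parse_directives and may_show_in_results are hypothetical. It also assumes the date is written as a plain ISO 8601 calendar date (YYYY-MM-DD) and that the page is still shown on the stated date itself, which a real search engine may handle differently.

```python
from datetime import date


def parse_directives(content):
    """Split a content="" value into plain flags and key:value rules."""
    flags, values = set(), {}
    for part in content.split(","):
        part = part.strip().lower()
        if not part:
            continue
        key, sep, value = part.partition(":")
        if sep:
            values[key.strip()] = value.strip()
        else:
            flags.add(part)
    if "none" in flags:       # none is shorthand for noindex + nofollow
        flags |= {"noindex", "nofollow"}
    flags -= {"none", "all"}  # all adds no restrictions
    return flags, values


def may_show_in_results(content, today):
    """True if the page may still appear in search results on `today`."""
    flags, values = parse_directives(content)
    if "noindex" in flags:
        return False
    expiry = values.get("unavailable_after")
    if expiry:
        # Assumption: the date is a plain YYYY-MM-DD calendar date.
        return today <= date.fromisoformat(expiry)
    return True


print(may_show_in_results("none", date(2022, 1, 1)))                            # False
print(may_show_in_results("all", date(2022, 1, 1)))                             # True
print(may_show_in_results("unavailable_after: 2022-07-17", date(2022, 7, 18)))  # False
```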

How are robots meta tags different from robots.txt?

The key difference is that the robots.txt file controls crawling, not indexing, so it is not suitable for safely excluding content from the index. Incoming links may still cause a blocked URL to be indexed under certain circumstances. It is therefore advised to use the robots.txt file to manage crawling traffic and to prevent image, video, and audio files from appearing in search results.

By using robots meta tags with the noindex instruction, you reliably prevent pages from appearing in search results, provided the crawler is allowed to fetch the page and read the tag. However, because the tag lives in HTML, you cannot use it to exclude individual image, audio, or video files from indexation.
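
The difference can be sketched with Python's standard urllib.robotparser module. The robots.txt rules, the example.com URLs, and the crawler_can_read_meta_tags helper below are all made up for illustration. The point is that robots.txt only answers "may this URL be fetched?", so a page blocked there never gets its meta tags read at all.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks crawling of everything under /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())


def crawler_can_read_meta_tags(url, user_agent="Googlebot"):
    """A bot can only see a page's robots meta tag if it may fetch the page."""
    return robots.can_fetch(user_agent, url)


print(crawler_can_read_meta_tags("https://example.com/public/page.html"))     # True
print(crawler_can_read_meta_tags("https://example.com/private/report.html"))  # False
```

In this sketch, a noindex tag on /private/report.html would never be seen by the bot, so the URL could still end up indexed through incoming links.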

NOTE: Most of the information in this article applies to Google Search and Googlebot. Other search engines may support a different set of values.


Aarya's opinion on code