Noindex

What it is, why it matters, and examples

What Noindex Is in a Web Page

Noindex is a value used in the robots meta tag of the HTML code of a URL to prevent the indexing of a page by search engines such as Google, Bing or Yahoo.

Google treats the noindex tag as a directive: if it finds it, it will not show that page to users on its results pages.

The counterpart of noindex is “index”, which explicitly allows indexing, although using it is unnecessary: search engines interpret the absence of the tag as a green light to index the content.
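
For reference, this is what the explicit version looks like; assuming no conflicting signals elsewhere, it has the same effect as omitting the tag entirely:

<meta name="robots" content="index,follow">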

Why It Is Important

The noindex tag allows you to decide whether a particular URL should be included in the search engine index or not.

Therefore, noindex is a great resource that allows us to control the indexing of each individual page with very little effort.

For this very reason, this directive is one of the favorite optimization tools among SEOs.

Noindex Tag and Syntax Example

Here is an example of the syntax of the noindex tag:

<meta name="robots" content="noindex">

Another variation is the combined noindex,nofollow directive, which also tells crawlers not to follow the links on the page:

<meta name="robots" content="noindex,nofollow">
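
Whichever variant you use, the tag must be placed in the <head> section of the page. A minimal sketch, with placeholder title and body content:

<!DOCTYPE html>
<html>
<head>
  <title>Example page</title>
  <!-- Tells compliant crawlers not to index this page or follow its links -->
  <meta name="robots" content="noindex,nofollow">
</head>
<body>
  ...
</body>
</html>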

In addition, we can also prevent the indexing of a page by a specific bot.

Here are some examples:

<meta name="googlebot" content="noindex">
<meta name="googlebot-news" content="noindex">
<meta name="bingbot" content="noindex">
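
The same noindex directive can also be delivered as an HTTP response header, X-Robots-Tag, which is useful for resources where no meta tag can be placed, such as PDFs. A response carrying it might look like this (other headers abbreviated):

HTTP/1.1 200 OK
Content-Type: application/pdf
X-Robots-Tag: noindex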

When to Use the Noindex Tag

The general recommendation when applying this directive is very simple:

  • Use the meta robots noindex tag for content of little value to the user.

This can be very subjective, so here are some examples of content or pages that you should not index:

  • Author pages
  • Internal search results
  • Restricted access pages
  • Certain types of (custom) entries generated by plugins
  • Certain category or tag pages

Depending on the type of website or page you manage, the right criterion will vary; to be safe, always ask yourself whether the page in question has value for the user.

Noindex vs Disallow

It is very important to emphasize that the noindex tag on a page does not prevent search engine crawlers from crawling that URL.

It only prevents them from displaying it to users in their search results.

Therefore, if we are looking to prevent a page from being crawled at all, we must resort to robots.txt: specifically, the Disallow directive.

In this way, we prevent the crawling of the page and, usually, its subsequent indexing (although a blocked URL can still end up indexed, typically without a description, if other pages link to it).

In any case, some SEOs used to combine both directives directly in the robots.txt file:

User-agent: *
Disallow: /example-page-1/
Noindex: /example-page-1/

However, the Noindex directive in robots.txt was never official, and Google stopped supporting it in 2019, so it should not be relied on.

WARNING: a Disallow in robots.txt cannot be combined with a noindex tag on the page itself: because the page is blocked, search engines will never crawl it, and therefore never see the tag telling them to leave the page out of the index.