Google just sent out a reminder to all webmasters that they need to remove noindex directives from their robots.txt files and stop depending on them. A couple of days ago, the SEO community received notifications from Google Search Console with the subject line “Remove ‘noindex’ statements from the robots.txt of…”
But What is Robots.txt?
The robots.txt file, also known as the robots exclusion protocol, is a text file site owners and optimizers create to tell search engine robots how to crawl their websites (or which pages not to crawl). It is part of a group of web standards that govern how robots crawl the web, access and index content, and serve that content up to web users.
Search engine bots usually look for a robots.txt file after arriving at a website and before crawling any of its pages. If a bot finds the file, it reads it first and then decides how to proceed. If the file does not include any directives telling the robot to skip a page, the robot will proceed to crawl the information on that page.
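As an illustration, a minimal robots.txt might look like the sketch below (the paths and sitemap URL are hypothetical examples, not recommendations):

```text
# Applies to all crawlers
User-agent: *

# Block crawling of these (hypothetical) paths
Disallow: /admin/
Disallow: /tmp/

# A "Noindex" rule like the one below was never part of the official
# protocol, and it is exactly what Google is now telling you to remove:
# Noindex: /private/

Sitemap: https://www.example.com/sitemap.xml
```

Note that `Disallow` only controls crawling; it does not by itself remove a page from Google's index, which is why the noindex alternatives below matter.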
Why Is This Important?
If you receive this notification, check your site and make sure whatever that noindex directive was doing is supported a different way. The gist of the notification is that you need to move your noindex directives out of the robots.txt file. Google has given you until September 1, 2019 to make those changes and find a better place for them – preferably somewhere its crawlers still honor.
- Google suggests moving noindex directives into robots meta tags. This works both in HTML pages and, for non-HTML resources, as an `X-Robots-Tag` HTTP response header.
- Return 404 or 410 HTTP status codes – both tell search engines that the page does not exist, and such pages are dropped from the index once recrawled.
- Use the Search Console removal tool – a quick and easy way to temporarily remove a URL from Google’s search results.
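To make the first option above concrete, here is what a noindex robots meta tag looks like in practice – this is the documented tag syntax, placed in a page you do not want indexed:

```html
<!-- In the <head> of the page: tell all search engines not to index it -->
<meta name="robots" content="noindex">

<!-- Or target Google's crawler specifically -->
<meta name="googlebot" content="noindex">
```

For non-HTML resources such as PDFs, the same directive can be sent as an HTTP response header: `X-Robots-Tag: noindex`. In either case, the page must remain crawlable (not blocked in robots.txt), or Google will never see the directive.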