It is officially The End Of Robots.txt Indexing As We Know It! With the dramatics behind us, let's begin by addressing what robots.txt and indexing are and why this change is likely to shake up marketing methods as experts shift over to the supported directives that Google uses to crawl websites and relay that information to consumers. While noindex in robots.txt was never an officially supported Google directive, the robots.txt file itself is a helpful tool for marketers and developers: it is essentially a short plain-text set of rules that tells Google which pages to crawl for information and which pages to ignore entirely. This is helpful because it frees up a robot's limited crawl time for the important parts of your website.
Starting September 1st, Google will no longer recognize noindex rules placed in robots.txt. To review the recent news from Google, please click here!
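To see whether a site is affected, one option is to scan its robots.txt for noindex lines. The snippet below is a minimal sketch rather than an official tool: the function name `find_noindex_rules` and the sample file contents are made up for illustration, and it assumes the rules are written in the common `Noindex: /path/` form.

```python
import re

# Matches lines such as "Noindex: /thank-you/" (case-insensitive).
NOINDEX_RULE = re.compile(r"^\s*noindex\s*:\s*(\S*)", re.IGNORECASE)


def find_noindex_rules(robots_txt: str) -> list[tuple[int, str]]:
    """Return (line_number, path) for every noindex rule in a robots.txt body."""
    hits = []
    for number, line in enumerate(robots_txt.splitlines(), start=1):
        # Drop trailing comments before matching.
        rule = line.split("#", 1)[0]
        match = NOINDEX_RULE.match(rule)
        if match:
            hits.append((number, match.group(1)))
    return hits


# Hypothetical robots.txt contents, used only as sample input.
SAMPLE = """\
User-agent: *
Disallow: /client-login/
Noindex: /thank-you/
"""

print(find_noindex_rules(SAMPLE))  # [(3, '/thank-you/')]
```

Any path this reports is one that will need a different method once the rule stops being honored.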
What is robots.txt?
The Robots Exclusion Protocol is a collection of web standards that determines how web crawlers can access and index the content on your site, and from there how search engines display this information to users around the world. This affects how people can access your site and find information, and it also lets you tell a search engine to ignore a page entirely. If you are using a WordPress-based site this may not be a major concern, as there are many other ways to apply noindex and nofollow to a page. However, if your site is built in plain HTML, this update is something that you should keep a close eye on.
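As a rough illustration of how a crawler reads these rules, Python's standard library ships a robots.txt parser. This is only a sketch: the rules, the `example.com` URLs and the `may_crawl` helper are assumptions for the example, and Google's own crawler may interpret edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block two private sections for every crawler.
RULES = """\
User-agent: *
Disallow: /client-login/
Disallow: /thank-you/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def may_crawl(path: str, agent: str = "Googlebot") -> bool:
    """Report whether the given user agent is allowed to fetch the path."""
    return parser.can_fetch(agent, "https://example.com" + path)


print(may_crawl("/blog/"))       # True
print(may_crawl("/thank-you/"))  # False
```

Note that a Disallow rule only stops crawling. It does not by itself guarantee that a URL stays out of search results, which is why the other methods matter.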
Why it benefits SEO
Having a properly formatted robots.txt file lets robots work through your website page by page to determine relevancy and, more importantly, which information Google will display for a given keyword. Essentially, when someone types in a query and your website is displayed as a result, Google has deemed the information on that page more relevant than other pages. Alternatively, noindexing can benefit your site by preventing people from finding certain pages through search, whether that is a landing page, a client login page, and so on.
Implementing a noindex / nofollow directive would be ideal for a thank-you page that a user lands on after submitting a contact request. This is important for marketers because the thank-you page can be used to track contact conversions. You wouldn't want people to reach that page from anywhere else, as those visits will skew your data.
If you are creating a landing page that you do not want users to reach publicly on your website, or if you are posting a blog or a webpage with content from another site, placing a noindex directive tells Google not to treat the page as original content, which can help keep your website from being penalized.
Options going forward
For the time being, it is still possible to tell search engine robots to ignore a specific page through robots.txt, but with this recent news it is important to begin looking into alternative methods before the change occurs. This helps ensure that your website doesn't take an unexpected dip when the update rolls out and robots start indexing pages they were never intended to. Stay ahead of the curve, and make sure you are ready for these upcoming changes to robots.txt, noindex and nofollow! There are a variety of other ways to tell search engines to ignore certain pages:
- Noindex in robots meta tags
- 404 and 410 HTTP status codes
- Password protection
- Disallow in robots.txt
- Search Console Remove URL tool
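For the first option in the list, a robots meta tag in the page's head section, a quick check like the following can confirm that a page actually carries the directive. This is a sketch under stated assumptions: the `RobotsMetaParser` class, the `is_noindexed` helper and the sample page are illustrative names, and the check only looks at `<meta name="robots">` tags, not HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for directive in attributes.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def is_noindexed(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical thank-you page used as sample input.
PAGE = """<html><head>
<meta name="robots" content="noindex, nofollow">
</head><body>Thanks for contacting us!</body></html>"""

print(is_noindexed(PAGE))  # True
```

A page must remain crawlable for this tag to be seen, so it should not also be blocked by a Disallow rule in robots.txt.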
Boston Web Marketing: SEO and Marketing Professionals
Boston Web Marketing can work alongside your internal team of marketing professionals to optimize your business and get your website back in front of the people you are trying to reach. Need a site consultation? If you are interested in growing your business online, give Boston Web Marketing a call today at 857-526-0096!