Duplicate content is a tricky thing to figure out. You may have seen multiple websites online that have the exact same content but still rank well. While it is still best practice to write unique content on your website, John Mueller of Google Switzerland explains how Google’s algorithm works when trying to filter through multiple sites with the same content.
“This is kind of a tricky question in the sense that if just pages of your website are duplicated across the web — maybe you have your main website and a marketing website and it’s exactly the same content, you just use one maybe for offline advertising, something like that — then in most situations we will recognize that and just pick one of these URLs to show in search.
That’s not something where we demote a website for having this kind of duplication, be it internally on the website or across websites like that. Essentially what we do there is we’ll try to recognize that these pages are equivalent, and fold them together in the search results.
So it’s not that they’ll rank lower, it’s just that we’ll show one of these because we know these are essentially equivalent and just show one in search. And that’s not something that would trigger a penalty or that would lower the rankings, that’s not a negative signal from Google.”
Basically, John is saying that duplicate content does not necessarily trigger a penalty. If two sites have the same content and are of equal value, Google will choose one to display over the other; how the algorithm chooses is likely based on other SEO signals. He did mention that there are exceptions to this rule: if a website simply scrapes content from multiple other sites and passes it off as its own, Google may take manual action and remove it from search results.
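Google does not disclose how it decides that two pages are “essentially equivalent,” but a common textbook approach to near-duplicate detection is w-shingling with Jaccard similarity. The sketch below is purely illustrative — it is not Google’s actual method, and the page texts are made up — but it shows how an algorithm can recognize that two documents carry the same content and should be folded together:

```python
def shingles(text, k=5):
    """Split text into a set of overlapping k-word shingles."""
    words = text.lower().split()
    return {" ".join(words[i:i + k]) for i in range(max(1, len(words) - k + 1))}

def jaccard(a, b):
    """Jaccard similarity between two shingle sets (1.0 = identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical page texts: two duplicates and one unrelated page.
page_a = "Our product helps teams ship software faster with less risk."
page_b = "Our product helps teams ship software faster with less risk."
page_c = "A completely unrelated article about cooking pasta at home."

sim_dup = jaccard(shingles(page_a), shingles(page_b))   # high similarity
sim_diff = jaccard(shingles(page_a), shingles(page_c))  # low similarity
```

In a setup like this, pages whose similarity exceeds some threshold would be grouped into one cluster, and a single representative URL from the cluster would be shown — which mirrors what Mueller describes, without claiming to reproduce how Google does it.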
“The types of situations where we might take, for example, manual action on duplicate content is more if one website is just a compilation of content from a bunch of other websites — where we kind of look at this website and see they’re scraping the New York Times, they’re scraping some other newspaper sites, they’re rewriting some articles, they’re spinning some other articles here — and we can see that this is really just a mix of all different kinds of content sources and that there’s really no additional value in actually even bothering to crawl this website.
And that’s the type of situation where the web spam team may take a look at that and say well we really don’t need to waste our resources on this, we can essentially just drop this out of the index. And when the webmaster is ready and has something unique and compelling on the website and kind of has removed all of this duplicated content then we can talk about a reconsideration request and go through that process there.”
Check out the Google Webmaster Central office-hours hangout where John talks about this.
It is important to note that while duplicate content is not officially called a “penalty,” Google’s Panda algorithm still targets low-quality or thin content and keeps it from ranking well in search results. It is best to stick to the rules and follow best practice by writing your own unique content.