Banish Devilish Duplicate Content to the Nether Regions of the Web
Out of the corner of your eye you catch a glimpse of someone you recognise. It’s you! You look closer to see if there’s a mirror but no, there’s no mirror. You’ve seen your double!
If this has happened to you then you may have been in the presence of the sinister doppelganger.
The sight of a doppelganger is thought to be a bad omen, signalling misfortune to come. Unfortunately the doppelganger has not restricted itself to myths and folklore. The nefarious duplicating spirit has found its way from the pages of fairytales on to the pages of the internet.
Well, maybe it’s not the doppelganger, who knows? Whether it’s the uncanny imp or just the mysterious nature of the World Wide Web, duplication abounds and it spells misfortune for your website.
Duplicate content is the double you don’t want to see on your website. While perhaps not as problematic as seeing a doppelganger in real life, two pages with identical content can make things difficult for your site.
Search engines don’t want to show the same content seven times on one result page. They want a variety of results. That means only one of two sites or pages with the same content is likely to be shown. Watch the video below from Matt Cutts as he explains how Google handles duplicate content.
Duplicate content causes problems for the spiders or bots that search engines send to crawl and index your site. The bot sees two identical pages. How does it know which one to rank? It can take a guess, but do you really want a bot making that decision for you?
Search engines won’t directly penalise you for having duplicate content but they may rank other websites ahead of yours with the same content.
There are many reasons, some more technical than others, why you might end up with duplicate content. One place we see a lot of duplicate content is on e-commerce sites that have different domains in different countries. Companies tend to cut and paste the content from their main site to their .com, .co.uk and .ie sites.
Another common cause of duplicate content is your content management system. Content management systems often point to the same file via two different URLs but a search engine sees this as two different pages. If different sites link via the two different URLs the value of those links is split.
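To make the two-URLs-one-page problem concrete, here is a minimal Python sketch that reduces different URL variants to a single form so you can spot when they point at the same page. The rules it applies (dropping "www.", trailing slashes, "index.html" and a handful of tracking parameters) and the names TRACKING_PARAMS and normalise_url are our own assumptions for illustration. Your content management system may create duplicates in other ways, so treat this as a starting point rather than a complete rule set.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameters that do not change what the page shows.
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def normalise_url(url: str) -> str:
    """Reduce common URL variants of the same page to one comparable form."""
    parts = urlsplit(url)

    # Treat "www.example.com" and "example.com" as the same host (an assumption).
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]

    # Treat "/shoes", "/shoes/" and "/shoes/index.html" as the same path.
    path = parts.path.rstrip("/") or "/"
    if path.endswith("/index.html"):
        path = path[: -len("index.html")].rstrip("/") or "/"

    # Drop tracking parameters and sort the rest so their order does not matter.
    kept = sorted((k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING_PARAMS)

    return urlunsplit((parts.scheme.lower(), host, path, urlencode(kept), ""))


if __name__ == "__main__":
    for url in (
        "http://www.example.com/shoes/?utm_source=newsletter",
        "http://example.com/shoes/index.html",
    ):
        print(url, "->", normalise_url(url))
```

If two URLs from your site normalise to the same string, they are good candidates for a redirect or a canonical tag.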
Other causes may be that you have copied large sections of text from a competitor’s website, PDF, brochure or other source. (We don’t recommend this, by the way!)
While it is believed that nothing can be done to get rid of the ominous doppelganger, getting rid of duplicate content is relatively easy. The problem is that like the doppelganger it can be quite elusive.
There are a number of ways that you can track down and eliminate your elusive doppelganger for good. Here are some common issues you may come across along with some recommended solutions:
Duplicate content across your website
Use a tool called Plagspotter to find content that has been duplicated across your website. Once you find the duplicated content either re-write it or remove it from your website.
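If you would like a rough first pass before reaching for a dedicated tool, the idea can be sketched in a few lines of Python. This is not how Plagspotter works. It is a simple stand-in technique that fingerprints the text of each page and groups pages whose text is identical once case and punctuation are ignored. The function names and the sample pages are hypothetical, and the sketch will not catch near-duplicates where only a few words differ.

```python
import hashlib
import re
from collections import defaultdict


def fingerprint(text: str) -> str:
    """Hash the words of a page, ignoring case, punctuation and spacing."""
    words = re.findall(r"\w+", text.lower())
    return hashlib.sha256(" ".join(words).encode("utf-8")).hexdigest()


def find_duplicates(pages: dict[str, str]) -> list[list[str]]:
    """Return groups of URLs whose page text has the same fingerprint."""
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Only groups with more than one URL are duplicates.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical page text keyed by URL path.
    sample = {
        "/mens/red-trainers": "Red trainers, size 9!",
        "/sale/red-trainers": "red trainers size 9",
        "/hats/blue-hat": "Blue hat",
    }
    print(find_duplicates(sample))
```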
Somebody has copied content from your website
You can use Copyscape to find websites that have copied your content and re-published it on their own website. In order to get the copied content removed from their site we recommend that you contact the Webmaster and ask them to remove the content first. If that doesn’t work you can submit a request to Google to have the page de-indexed. You can do that here.
Duplicate Meta Data
You can use Google Webmaster Tools to diagnose duplicate Meta Titles and Descriptions on your website.
When you log in to Webmaster Tools, go to “Search Appearance” and select the “HTML Improvements” option. Here you will find pages that have duplicate and short Titles and Descriptions.
If you do find some doppelganger Meta Data we recommend you re-write it. Every page on your website should have unique Meta Data.
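You can also run a similar check on your own pages. The Python sketch below uses only the standard library to pull the title out of each page's HTML and list the titles that appear on more than one URL. The class and function names and the sample pages are hypothetical, and the sketch assumes you already have the HTML of each page to hand. It looks at titles only, not descriptions.

```python
from collections import defaultdict
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found inside the <title> element of a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def duplicate_titles(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each title used on more than one page to the URLs that share it."""
    by_title = defaultdict(list)
    for url, html in pages.items():
        parser = TitleParser()
        parser.feed(html)
        # Compare titles without regard to case or surrounding spaces.
        by_title[parser.title.strip().lower()].append(url)
    return {t: sorted(urls) for t, urls in by_title.items() if t and len(urls) > 1}


if __name__ == "__main__":
    # Hypothetical pages keyed by URL path.
    sample = {
        "/a": "<html><head><title>Shoes</title></head></html>",
        "/b": "<html><head><title> shoes </title></head></html>",
        "/c": "<html><head><title>Hats</title></head></html>",
    }
    print(duplicate_titles(sample))
```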
Duplicate pages that you need on your website
Sometimes you cannot do anything about duplicate pages on your website. It may be a product listed under two categories or a testimonial that you need in two areas of your website.
In this instance you can use a Rel=Canonical Tag to indicate to Google that the page is a duplicate of another. Remember that Google treats the Rel=Canonical Tag as a strong hint and will generally index only the canonical page, so make sure that it is the page that holds the most importance for you.
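For reference, the tag is a single link element placed in the head section of the duplicate page, pointing at the version you prefer. The short Python sketch below builds that element for a given URL. The function name and the example.com address are hypothetical, and how you get the tag into your templates depends on your content management system.

```python
from html import escape


def canonical_tag(preferred_url: str) -> str:
    """Build the link element that names the preferred version of a page."""
    # Escape the URL so characters such as & and " are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}">'


if __name__ == "__main__":
    # Hypothetical product page listed under two categories.
    print(canonical_tag("https://example.com/shoes/red-trainers"))
```

The same tag goes on every duplicate of the page, and each copy should point at the same preferred URL.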
If you follow these simple guidelines you should free yourself of the problems of duplicate content. Your site will fare better in search indexes and the right people will arrive at the page you want them to see.
As for doppelgangers, they aren’t as easily dealt with. But if you do see your double, try not to panic. You never know, it might just be your long-lost twin.