Duplicate content is one of the most common search engine optimization problems, and it most often affects websites with a large number of pages. Duplicate content refers to the same webpage content appearing at more than one URL, either on your own website or elsewhere. Since search engines scan and index every URL as an individual page, duplicate content can have a significant impact on your standing in the search results. This guide explains in more detail why duplicate content is a problem and how to identify and work around it.
Why Is Duplicate Content a Problem?
When two or more URLs contain the same content, search engines have trouble identifying which page is most relevant to a given search query. Google and other search engines avoid displaying the same content twice in the results, so only the page considered most relevant is indexed and displayed, while the others are effectively ignored. Duplicated pages carry far lower ranking power, and as a result they can severely reduce your website traffic.
Why You May Have Duplicate Content
There are many reasons why your website can end up with duplicate content, but whether it got there intentionally or accidentally, the end result is the same. Duplicate content is often created deliberately, as with print-optimized versions of a webpage or content sorted in an alternative order. Even though such pages may be useful to visitors, they still present the same content, so the consequences are the same unless you take extra steps to work around the problem. Other ways content ends up duplicated include session IDs used to track website visitors in applications such as online shopping carts, affiliate codes identifying referrers, and even alternative domains serving the same site.
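As an illustration, all of the following URLs might serve exactly the same page, yet a crawler treats each one as a separate document (the domain, paths, and parameter names here are made up for the example):

```text
https://www.example.com/shoes/                    # the original page
https://www.example.com/shoes/?sessionid=8f3a2c   # session-ID variant
https://www.example.com/shoes/?ref=partner42      # affiliate-code variant
https://www.example.com/shoes/print/              # print-optimized version
```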
Identifying Duplicate Content with Google Webmaster Tools
By far the easiest way to identify duplicate content on your website is to use Google Webmaster Tools. Create a Google account if you haven’t done so already, and then log in and navigate to Optimization > HTML Improvements. A complete list of duplicate pages will appear here.
Alternatively, you can use a website crawler which works in a similar way to those that the search engines use to view and index content. These applications will show you how your website looks to the search engines while helping you to identify any issues. One of the most popular options is Screaming Frog, a freemium desktop application which you can use to scan your website from an SEO perspective.
How to Solve Problems with Duplicate Content
Dealing with duplicate content is not necessarily all that complicated, and you don’t need to get rid of it if it benefits your website and its usability. Fortunately, there are a number of options you can use to alter the way the search engine crawlers access and index your website. The following are the most common:
- You can add 301 redirect rules to the .htaccess file in the root directory of your website’s server. These automatically redirect both your visitors and the search engines to the original page, entirely negating the influence of the duplicate content. However, this method effectively removes the duplicate page altogether, so you’ll only want to use it if the duplicate content was created unintentionally.
- Adding a rel="canonical" link tag to the `<head>` section of the HTML code of the offending webpage tells the search engines which URL is the original, so ranking signals are consolidated onto that page. The main advantage of this method is that your visitors can still access the duplicate page, while the search engines will not count it as separate content when indexing your website.
- Use Google Webmaster Tools if you have multiple domain names (mirror websites) containing exactly the same content. You can set the preferred domain by navigating to Configuration > Settings.
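As a rough sketch of the 301-redirect option, here is what such rules might look like in an Apache .htaccess file. The paths and domain are hypothetical, chosen purely for illustration:

```apache
# Permanently redirect a duplicate page (e.g. a print version) to its original
Redirect 301 /shoes/print/ https://www.example.com/shoes/

# Permanently redirect an entire alternative domain to the preferred one
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.net$ [NC]
RewriteRule ^(.*)$ https://www.example.com/$1 [R=301,L]
```

The first line uses mod_alias’s Redirect directive for a single page; the RewriteCond/RewriteRule pair requires mod_rewrite and catches every URL on the alternative domain.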
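A minimal sketch of the rel=canonical approach, assuming https://www.example.com/shoes/ is the hypothetical page you want to rank: the tag goes in the `<head>` of every duplicate version, such as the print page:

```html
<head>
  <link rel="canonical" href="https://www.example.com/shoes/">
</head>
```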
Other ways to deal with duplicate content include editing the robots.txt file in your website’s root directory or by using an automatic URL rewriting tool.
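For example, a robots.txt sketch that keeps crawlers away from hypothetical print versions and session-ID URLs might look like this (note that the `*` wildcard inside a path is understood by Google’s crawler but is not part of the original robots.txt standard):

```text
User-agent: *
Disallow: /print/
Disallow: /*?sessionid=
```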
How do you deal with duplicate content? Share with us in the comments.