Softpedia
 


June 16th, 2008, 15:35 GMT · By Catalin Bocanu

The Value of Duplicate Content

Website content is considered duplicate when substantial parts of already published content are repeated on multiple pages, whether within one website or across several websites and domain names. Building a website from automatically generated content, for example by publishing RSS feeds as HTML pages, may sound tempting, but the consequences are often exactly what no webmaster wants.

In practice, search engine spiders detect copies of the same content. They index and display only what they judge to be the original and may exclude, or even ban, websites or web pages that consist of duplicate content.
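Search engines do not publish how they detect duplicates, so the following is only an illustrative sketch of one textbook approach: split each text into overlapping word sequences ("shingles") and measure how many the two texts share (Jaccard similarity). The function names, the shingle size of three words and the sample sentences are assumptions made for this example, not anything a particular search engine is known to use.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of length `size`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(a: str, b: str, size: int = 3) -> float:
    """Share of shingles common to both texts, between 0.0 and 1.0."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    original = "the quick brown fox jumps"
    copy = "the quick brown fox sleeps"
    # A score close to 1.0 suggests the second text largely repeats the first.
    print(jaccard_similarity(original, copy))
```

A webmaster could run a comparison of this kind over pairs of pages on the same site to spot pages that repeat each other almost word for word.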

Duplicate content can also arise without your intention: identical content may be published more than once on your own website; other webmasters may reproduce parts of your content by publishing your RSS feeds on HTML pages; or two versions of the same content may coexist on one website, such as a printer-friendly page alongside the regular HTML page.

If the same content is published two or more times on the same website, simple solutions exist to avoid the consequences of duplicate content detection by search engine spiders.

For example, one of the two versions of a page (printer-friendly or regular) can be hidden from search engine spiders so that only the other is indexed. To do this, add a noindex meta tag to the pages that must not be indexed by search engine robots. Alternatively, the robots.txt file can block spiders from crawling those pages, and the sitemap can point them to the version you do want indexed.
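As a rough illustration of the two blocking options, the sketch below holds a noindex meta tag as a string and checks a sample robots.txt with Python's standard urllib.robotparser module. The /print/ directory, the example.com URLs and the is_crawlable helper are hypothetical and chosen only for this example; your own site layout will differ.

```python
from urllib.robotparser import RobotFileParser

# Option 1: a tag placed in the <head> of each page that must not be indexed.
NOINDEX_META = '<meta name="robots" content="noindex">'

# Option 2: a robots.txt rule. This assumes (hypothetically) that all
# printer-friendly pages live under a /print/ directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
"""


def is_crawlable(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    regular = "https://example.com/articles/duplicate-content.html"
    printable = "https://example.com/print/duplicate-content.html"
    print(is_crawlable(ROBOTS_TXT, regular))    # regular page stays crawlable
    print(is_crawlable(ROBOTS_TXT, printable))  # printer-friendly copy is blocked
```

The two options are alternatives rather than a pair: a spider that is blocked by robots.txt never fetches the page, so it never sees a noindex tag placed on it.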

As far as republishing articles is concerned, search engine spiders try to identify the original source of an article. If you reproduce an article on another website, follow the by-now familiar advice of linking back to the source (the website containing the original content). More generally, the best way to avoid duplicate content and its detection is to minimize the number of ways in which it can arise.

READER COMMENTS:


Comment #1 by: james on 02 Feb 2009, 18:36 UTC

With regard to the duplicate content part of this blog post, I personally use the http://www.copygator.com website to find and stop duplicate content:

1. It's automated and brings me results instead of me searching for duplicated content. All I had to do was submit my feed, and it started monitoring it and showing me who has republished my articles on the web.

2. I get notified by email, so it contacts me when it finds copies of my articles online.

3. I use their image badge feature to alert me directly on my website when my content is being lifted.

4. It's a free service, as opposed to the "per page" cost of copyscape/copysentry.

Copyright © 2001-2012 Softpedia.