Monday, November 20, 2006
A Lesson in Duplicate Content -- and Ways to Avoid It
I did a simple test yesterday that showed me that I was running afoul of Google's duplicate content filters. What does that mean? It means Google was detecting pages that it felt were too similar and was treating those pages as spam. Yep, those pages had almost no chance of turning up in any search results at all.
How did this happen? Two ways.
Each newsletter that I write I eventually post in my newsletter archive. I also post the individual articles under their respective topics so that visitors can find the articles grouped by topic rather than sorting through my whole article archive.
I had assumed that Google would index the standalone articles and treat the articles in the newsletters as part of a larger entity. No such luck. Both were being treated the same as spam articles that are posted over and over on spam sites in a deluded attempt to completely dominate the search results for a keyword.
The other pages that were tagged as duplicate content were guest articles I was reprinting from top marketers. I can't say that being tagged as duplicate content was unexpected in this case. Hey, the articles do appear on other selected sites.
So what can I do about this? With the articles I have written for my newsletter, I'll have to make sure that I create a shortened version in my newsletter and an expanded version for the website. That should avoid the duplication issue.
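Here's a rough way to check whether the newsletter version and the website version of an article really differ. This is only a sketch: it compares overlapping five-word phrases ("shingles") and reports the share the two texts have in common. It is not how Google's filter works (I don't know the details of that), and the function names and the five-word window are my own assumptions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences (shingles) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=5):
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a = shingles(text_a, size)
    b = shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)
```

A score near 1.0 means the two versions are nearly word-for-word copies; the lower the score, the more the expanded version actually differs from the shortened one. What score counts as "different enough" is a judgment call, not something this sketch can tell you.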
With the guest articles, I'll need to add some more commentary, some editor's notes, to the version I print on the site to break up the articles a little bit and add some of my own insights to them.
I'll also consider blocking the spiders from crawling my newsletters. I need to avoid having pages on my site treated as second-class citizens in the search engines' eyes. Content is the lifeblood of the Internet. I can't afford to have my content dumped into the dreaded supplemental listings.
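If I do go the blocking route, the usual tool is a robots.txt file. The sketch below uses Python's standard urllib.robotparser module to check that a set of rules blocks the newsletter archive while leaving the standalone articles crawlable. The /newsletters/ and /articles/ paths and the yoursite.com domain are placeholders, not my real site layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every spider out of the newsletter archive.
ROBOTS_TXT = """\
User-agent: *
Disallow: /newsletters/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://yoursite.com/newsletters/2006-11.html",
        "http://yoursite.com/articles/duplicate-content.html",
    ):
        print(url, "->", "allowed" if is_crawlable(ROBOTS_TXT, url) else "blocked")
```

Testing the rules this way before uploading them helps make sure I'm only hiding the newsletter copies and not the article pages I actually want indexed.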
By the way, if you want to see if you're suffering any duplicate content penalties, here's the test you can do:
Type the following into Google's search box "site:yoursite.com" (without the quotes and replacing "yoursite.com" with the domain name of the site you want to check).
Go through each page of the search results and scan the entries for the words "Supplemental Result" after the URL of each page. Any URL that sports those dreaded words has been deemed spam and dumped into Google's trash bin as unworthy of appearing in real search results.
Figure out what's causing your penalty and fix it—that is, unless you want your pages not to show up in anyone's search results. ;)
Jeff
© 2005, 2006, 2007, 2008, 2009, 2010 Jeff Baas, One Stop Web Support

