Duplicate content is now a real problem for websites. You must have original content; otherwise, the site will be strongly affected. That is why it is important to eliminate any duplicate or copied content that may exist on the site in any form. SEO strategy stopped being purely about linking and coding once Google Panda and Penguin began to roam the web: original content is now king, but quality will prevail.
Writing an article or post yourself does not usually cause any problems. However, it is common for some people to try to look smart by making a "patchwork quilt" of content copied from various websites; others simply copy the whole text without even linking to the source it was taken from; and, of course, there are websites with scripts that copy all content. This exists and will continue to exist.
Google is constantly refining and updating its algorithms (Google Panda and Google Penguin) to keep sites that create original content from being harmed by sites that copy it, since the copies often obtain better results in the SERPs than the site that created the original. Other factors also count, however: the site that creates the original content must not use techniques penalized by Google and must not be under a manual or algorithmic sanction, among other things. Doing one thing well does not mean that the rest will be overlooked or ignored.
Google has a form for reporting cases where copied content ranks better than the original content in Google search results; Google uses these reports to refine its algorithms.
Now let's look at some ways to tackle each of these cases.
Check the content that users want to publish on the site
For many sites it is important to check the content that users want to submit, so that the site does not end up hosting copied content. The best approach is to search Google for fragments of the text (approximately 16 words each) in quotation marks; this gives a clear idea of whether the content is original or comes from another site.
There are also sites that let you check content, such as PlagiarismChecker, but the best practice is to search directly in the search engines (Google, Bing or Yahoo) that matter to you.
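The quoted-fragment check described above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the function names, the limit of three fragments and the rule for dropping a short trailing fragment are hypothetical choices, not part of any Google tool. The script only builds the search URLs; you still open them in a browser and read the results yourself.

```python
from urllib.parse import quote_plus


def quoted_queries(text, fragment_words=16, max_fragments=3):
    """Split text into fragments of about `fragment_words` words, each wrapped in quotes."""
    words = text.split()
    fragments = [
        " ".join(words[i:i + fragment_words])
        for i in range(0, len(words), fragment_words)
    ]
    # Assumption: drop a trailing fragment shorter than half the target size,
    # because very short phrases match too many unrelated pages.
    fragments = [f for f in fragments if len(f.split()) >= fragment_words // 2]
    return ['"{}"'.format(f) for f in fragments[:max_fragments]]


def google_search_url(query):
    """Build a Google search URL for an exact-phrase (quoted) query."""
    return "https://www.google.com/search?q=" + quote_plus(query)


if __name__ == "__main__":
    submitted = "Paste here the text that a user wants to publish on the site " * 4
    for query in quoted_queries(submitted):
        print(google_search_url(query))
```

Checking a few fragments from different parts of the text is usually more telling than checking only the opening sentence, since copied articles often have a rewritten introduction.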
Our original content has been copied. Now what do I do?
Often what we want is to verify whether the content of our website has been copied and whether others are using it as their own. Having our content copied is not all bad when the copier links back to our site, and even less so when our work is credited correctly, but it is good to take the necessary precautions from time to time in case we need to file a complaint with Google about a spam site.
Sites like Copyscape let you check whether the content of our site appears on other sites. The free tool returns only 10 results, but that is a good guide and starting point.
If we suspect that a site is copying all or part of our content, we can use Webconfs to compare our website with the suspect site. Any similarity between the two sites is shown as a percentage. A value above 10% (preferably it should be below 5%) may point to a problem such as copied or duplicate content. Keep in mind, however, that very standard content can also be seen as duplicate content, in which case the problem may not lie with the other site.
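To get a feel for what a similarity percentage means, here is a minimal Python sketch that compares two texts word by word with the standard library's difflib. It is not the method Webconfs uses (that is not documented here), so its numbers will not match the tool's; the function names are hypothetical, and only the 10% and 5% thresholds come from the guideline above.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a, text_b):
    """Return the word-level similarity of two texts as a percentage (0 to 100)."""
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    # autojunk=False keeps frequent words from being ignored in long texts.
    ratio = SequenceMatcher(None, words_a, words_b, autojunk=False).ratio()
    return round(ratio * 100, 1)


def assess(percent):
    """Apply the rough thresholds mentioned above: over 10% is suspicious, under 5% is fine."""
    if percent > 10:
        return "possible copied or duplicate content"
    if percent < 5:
        return "no obvious problem"
    return "borderline, review manually"


if __name__ == "__main__":
    ours = "original content is king but the quality will prevail"
    theirs = "original content is king and the links do not matter"
    score = similarity_percent(ours, theirs)
    print(score, assess(score))
```

Pass in the visible text of each page rather than the raw HTML; otherwise shared markup and navigation inflate the score.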
If a site is copying the content of your website (a scraper), you can try to locate its webmaster. If the site offers no way to contact the webmaster, you can look up the person responsible with who.is, a site that shows who registered the domain and the contact email (the email may not always be available). Once you have the contact email (sometimes you have to be ingenious), you can ask politely for the copying to stop, with something like this:
Cordial greetings,
My name is [Name] and I am the owner (responsible, administrator, etc.) of the website [FullDomain]. I have found that you have been copying content from my site, in whole or in part, to your website [ScraperDomainSite] without my express permission. I therefore cordially request that you delete this content from your website and do not repeat these actions.
If you do not remove this content from your website within 15 days, I will interpret it as a refusal to do so. I will then take appropriate legal action to protect my content and obtain compensation for damages (here you can cite the applicable law of each country, among others). I will also report your actions against me, my brand and my company (as applicable) to your hosting service (who.is helps to find it), to Google AdSense (if the site uses AdSense) and to Google.
Thank you very much for your attention and cooperation,
With this, they will certainly not want to do it again ;-D
It is very likely that after this the scraper site will not want to expose itself, even more so once the hosting service, Google AdSense and Google itself have been mentioned. If it comes to that and we have to inform the hosting service, we get a little more serious and cite some laws, etc. Most probably the hosting service will not want to be involved in legal problems or in problems with Google, so it will force its client to rectify or will remove the client from its hosting service. If the copying is very serious, you can send a copy to the hosting service when you send the first message to the scraper site, to show that the matter is being taken very seriously.
Here are the links for the respective reports to Google:
The link to report to Google AdSense.
The link for a general report to Google.
The link to report to Blogger.
It is very, very important for these procedures that authorship of the content is properly accredited with Google.
|Accredit authorship of the original content with the profile in Google+|
|Accredit content with PingShot in FeedBurner|
When the content of our site is duplicated
Content is often duplicated within our own site because duplicate addresses exist, typically because adequate redirects were not set up when a directory or link was moved. 301 redirects and, in other cases, canonicalization using rel="canonical" can solve duplicate content problems on the website.
A site that has several pages with the same or largely similar content can use the canonical tag to declare which page is the main one within that set.
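As a rough illustration, the sketch below shows what a canonical declaration looks like inside a page's head and how to check which canonical address a page declares, using only Python's standard HTML parser. The sample page, the example.com addresses and the names `CanonicalFinder` and `find_canonical` are hypothetical.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before comparing.
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attributes.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


if __name__ == "__main__":
    # Hypothetical page: a print-friendly variant pointing at the main article.
    page = """
    <html><head>
      <title>Article (print version)</title>
      <link rel="canonical" href="https://www.example.com/article/" />
    </head><body>...</body></html>
    """
    print(find_canonical(page))
```

Running a check like this over every variant of a page (print versions, sorted listings, tracking parameters) is a quick way to confirm that they all point to the same main address.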
A site with a tool that lets us check whether we have duplicate addresses is Cuwhois. We just enter the website address, and if entries appear in blue in the results (two addresses with "cod: 200 - Ok"), it means there are two links (duplicate content) that should be corrected.
|No problem of duplication|
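The same kind of check can be approximated by hand. The Python sketch below is not how Cuwhois works internally; it is a hypothetical illustration that assumes the usual duplicates are the http/https and www/non-www variants of the home page. It requests each variant without following redirects and reports a problem when more than one answers 200 directly.

```python
import http.client
from urllib.parse import urlsplit


def address_variants(domain):
    """List the common home-page variants of a domain (assumed set, not exhaustive)."""
    bare = domain.lower()
    if bare.startswith("www."):
        bare = bare[4:]
    return [
        "{}://{}{}/".format(scheme, prefix, bare)
        for scheme in ("http", "https")
        for prefix in ("", "www.")
    ]


def fetch_status(url, timeout=10):
    """Return the raw HTTP status of a HEAD request; redirects are not followed."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        return conn.getresponse().status
    finally:
        conn.close()


def duplicate_addresses(status_by_url):
    """Return the addresses answering 200 when more than one variant does."""
    ok = sorted(url for url, status in status_by_url.items() if status == 200)
    return ok if len(ok) > 1 else []


if __name__ == "__main__":
    statuses = {url: fetch_status(url) for url in address_variants("example.com")}
    print(duplicate_addresses(statuses) or "No problem of duplication")
```

In a healthy setup only one variant answers 200 and the others answer with a 301 redirect to it.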
How to find out with Google whether duplicate content exists
To find out whether Google identifies very similar results, you can run an experiment. Enter the site's domain name in Google search (without the "www" and without the ".DomainType" extension, only the name) and go to the last page of the results [1 ... 10 ... 20 ... 30, etc.] to see whether something like this appears: "... we have omitted some entries very similar to the XXX already displayed. If you want, you can repeat the search with the omitted results included". This message tells us that there are other results that could be duplicate content on our own site or on other sites that link to our domain, such as directories, sites where we have tested our website and many others. It is Google's way of letting us know that results with similar content may exist on our site or on other sites.
If we repeat the experiment using the "site:Fulldomain" command and find "we have omitted ... ... XYZ results ... if you want to repeat your search with the omitted results.", then content on the website is being seen as duplicate content and should be corrected promptly.
Copying content partially or fully, duplicating our own content, publishing very generic content and even lacking content can all make Google see your site as irrelevant or as using bad practices, so it is important to remove copied and/or duplicate content and to take appropriate corrective action quickly.