Have you ever faced the challenge of duplicate content? Has your hard work gone in vain because somebody scraped your content and copied it onto their own website? Let me put it this way: has your content been plagiarised?
It happens that you search for your content and bump into it on somebody else’s site. Content creation is an uphill task, whereas content stealing is the easiest one.
Consider an example. Google crawls site A every hour, whereas it crawls site B every second day. Suppose your content on site B has been copied by site A with a changed timestamp, and the copy gets indexed first. Site B becomes the victim of plagiarism because of the authority of site A and its PageRank. So does this mean site B cannot get justice?
Don’t worry. You can file a DMCA (Digital Millennium Copyright Act) request and provide the URL of the copied content. Google takes action after confirming the copyright infringement and removes the respective content or URL. If site A continues doing it, its ranking will be affected.
Google has updated its ranking algorithm with a system called RankBrain. It is artificial-intelligence-based software designed to give the best user experience by matching results precisely to the query.
It also distinguishes between strings and things. Keeping a quality check on content is a priority of this machine learning system, so the chances of it showing duplicate content are almost nil when a machine can retrieve answers in a fraction of a second.
A surfeit of content is written every day, and crawlers are never at leisure. They keep working, finding, analysing and indexing new content.
Google does not treat duplicate content as malicious until it smells something fishy. Google will never compromise on the user’s search experience, so it will not show the same content across different domains or URLs. It indexes only the original and does not index the others.
It assesses all possibilities to pull the plug on the malicious intentions of frivolous marketers, who often duplicate content to gain ranking and try to fool the search engine.
Let us analyse what duplicate content is as per Google’s content guidelines:
“Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
First we need to understand what duplicate content is.
Duplicate content is content that is present in the same format across the same domain or different domains.
1-Duplicate content on the same domain
It is very much possible that you have many URLs with the same content. Most bloggers and business owners use the popular CMS WordPress, which automatically generates categories and tags. All these categories and tags create many URLs pointing to the same content.
You may be thinking that your site will face a penalty from Google. Actually, this is not the case, as Google understands the characteristics of most CMSs and generally doesn’t harm the site’s ranking.
Here you need to show a clear intention that you want to appear for the original URL rather than populating the same content through tag and category URLs.
If crawlers see that you are reposting the same content across different pages of your site, they will not index all of them but possibly just one, or even none. I am sure this is an ugly outcome you will never want.
You can add a “noindex” meta tag to your category and tag URLs, so that it is clear to Google which page it should index for search.
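To make the noindex idea concrete, here is a minimal Python sketch that checks whether a page carries a robots meta tag with the noindex directive. The two sample pages and the helper names (RobotsMetaParser, is_noindexed) are made up for illustration; it only uses the html.parser module from the standard library and is not how Google itself reads the tag.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute is a comma-separated list of directives.
            for part in attr_map.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def is_noindexed(html):
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


# Hypothetical tag-archive page carrying the noindex directive.
tag_page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
# Hypothetical original post with no robots meta tag at all.
post_page = "<html><head><title>My post</title></head></html>"

print(is_noindexed(tag_page))   # True
print(is_noindexed(post_page))  # False
```

You could run a check like this over your category and tag URLs to confirm that only the original post URLs are left indexable.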
Until some time back, most unscrupulous marketers used to resort to copying content that performed well. They used to deluge their pages with thin content and finagled keyword placement. That copied content became their nemesis. Their entire sites went for a toss after the Panda update. Almost all pages lost ranking and they suffered huge business losses.
Most of them had to exit the business. Google has been stern on content duplication. That’s the reason it frequently changes its ranking algorithm. With every algorithm update, it educates webmasters. When webmasters don’t heed the warning bells, the marketers involved in thin and duplicate content end up singing the blues.
There is no surety that your content, even if original, will keep getting traffic and ranking for keywords. It all depends on the authority and trust signals of the domain that has copied it from you. You run the risk of being de-indexed completely from the search results, depending on the power of that domain and its audience.
This is possible even when your site is syndicated to other sites. We will discuss this point later in this post. You are always at risk. But you need not be worried, as Google has been girding its loins and taking stock of the situation in order to delight its users.
Most search engines believe in showing a variety of results on the same topic. They show the most relevant results for the user’s query and filter out all similar or duplicate content available at different URLs.
Let us find out what happens when Google encounters the same content.
Google determines the plagiarism issue in four ways.
It crawls the new content and thereafter scouts through similar content in its database to establish the uniqueness of the content.
As per Matt Cutts, roughly 25 to 30% of the content on the web is duplicate, and Google does not treat that as spam by itself. For instance, while writing this article I researched some of the best articles on similar topics and drew some of their points into this article too, which is OK. What matters to Google is how you give credit to the original source.
It spreads its wings to discard pages that come from link farms, MFA (made-for-AdSense) sites and blacklisted IPs.
“Link farming is the process of creating automated link amongst the number of sites. It generally aims at giving backlinks to the sites involved in it. All these sites in the group create a link web.”
It then dissects all those pages and looks at inbound links, anchor text, link juice and the quality of the linking domains. It weighs 301 and 302 redirects too.
At last, after reviewing the topical links and the time of discovery, it sniffs out the originator of the content and establishes which page it considers to be the original. It looks at the canonical tag or noindex status of the duplicate page too. Based on this, it brands the page as duplicate or original and includes the original in the index. It then fetches the search results for the query from its index.
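The first of these steps, comparing new content against content already seen, can be illustrated with a classic textbook technique: break each text into overlapping word sequences ("shingles") and measure how many the two texts share (Jaccard similarity). This is only a sketch of the general idea; Google has not published its exact method, and the function names and sample sentences below are my own assumptions.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Share of shingles the two texts have in common, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Illustrative sentences, not taken from any real site.
original = "Content creation is an uphill task for every marketer"
copied = "Content creation is an uphill task for every blogger"
unrelated = "Crawlers keep finding and indexing new pages every day"

print(jaccard_similarity(original, original))   # 1.0
print(jaccard_similarity(original, copied))     # 0.75
print(jaccard_similarity(original, unrelated))  # 0.0
```

A score close to 1.0 suggests a near copy, while a score close to 0.0 suggests unrelated text. Where to draw the line between the two is a judgment call.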
1-Try to get links from more authoritative sites with a great PageRank and a niche in their segment. Don’t waste time buying or exchanging links.
2-Never over-optimise your anchor text, and never focus on getting links through the same keyword. Focus on highlighting the phrase.
3-Add a canonical tag to your mobile site. That tag should use your desktop site URL.
4-Always share your content on social media once it is published so that people like it, share it and link to it. Google will then follow those links and find your page sooner than it discovers the content on another site.
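Point 3 above can be verified with a few lines of Python. The sketch below reads the canonical link from a page and checks that it points to the expected desktop URL. The example.com URLs and the helper names are placeholders I made up; only the standard html.parser module is used.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        # rel may hold several space-separated values.
        if "canonical" in attr_map.get("rel", "").lower().split():
            self.canonical = attr_map.get("href") or None


def canonical_url(html):
    """Return the canonical URL declared in the page, or None."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


def points_to(html, expected_url):
    """True if the page's canonical URL matches expected_url (ignoring a trailing slash)."""
    found = canonical_url(html)
    return found is not None and found.rstrip("/") == expected_url.rstrip("/")


# Hypothetical mobile page whose canonical tag names the desktop URL.
mobile_page = '<head><link rel="canonical" href="https://www.example.com/post/"></head>'

print(canonical_url(mobile_page))                              # https://www.example.com/post/
print(points_to(mobile_page, "https://www.example.com/post"))  # True
```

The same check works for a syndicated copy: pass the partner page’s HTML and your original URL.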
The issue of content duplication can also arise when you have permanently redirected your URL to another URL. In this case, the canonical tag should point to the new URL, so that Google treats the new page as the original and indexes it next time in the results. The redirect also passes the link juice to the new page.
A temporary 302 redirect should likewise be paired with a canonical tag, plus a noindex tag on the redirected page, so that you tell Google which URL is the original while it is temporarily unavailable.
A permanent redirect (HTTP status code 301) is the process of redirecting your current URL to a new destination. There can be lots of reasons for doing so: you have altered your URL, inserted keywords into it, the old URL has misspelled words, or it is not in sync with the page title or content of the page, etc.
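As a rough illustration of the difference between the two redirect types, the sketch below maps old paths to new ones and returns the status code and Location header a server would send. The paths and the REDIRECTS table are hypothetical. Real sites usually configure this in the web server or CMS rather than in hand-written code.

```python
from http import HTTPStatus

# Hypothetical mapping of old paths to (new path, is_permanent).
REDIRECTS = {
    "/old-post": ("/new-post", True),    # URL changed for good -> 301
    "/sale": ("/holiday-sale", False),   # temporary campaign page -> 302
}


def redirect_response(path):
    """Return (status_code, headers) for a requested path."""
    if path not in REDIRECTS:
        # No redirect configured: serve the page normally.
        return int(HTTPStatus.OK), {}
    target, permanent = REDIRECTS[path]
    status = HTTPStatus.MOVED_PERMANENTLY if permanent else HTTPStatus.FOUND
    return int(status), {"Location": target}


print(redirect_response("/old-post"))  # (301, {'Location': '/new-post'})
print(redirect_response("/sale"))      # (302, {'Location': '/holiday-sale'})
print(redirect_response("/about"))     # (200, {})
```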
2-Duplicate content across different domains
In this case, your content is present across different domains at different URLs.
This can happen through content syndication, where you have syndicated your content to lots of other sites.
These are the possible reasons:
1-The same news is published across different sites in the same format.
2-The same article has been used by webmasters from an article directory.
3-You have been publishing the same article in its original format across article directories.
4-E-commerce sites may be using the same product description given by manufacturers.
5-A press release is being distributed across different websites.
So you may be wondering whether one should stop content syndication if it causes duplication. Absolutely not. It has two sides.
People tend to use syndication to command awareness, authority and, more importantly, traffic to their sites.
If we talk about the benefits:
A) It brings you traffic.
B) It expands your reach and awareness.
C) And last, it gets you associated with high-traffic sites that have your target audience.
On the downside:
Your site runs the risk of ranking lower than the sites that syndicate your content.
Google may not consider your site’s authority strong enough compared to your syndicator, which commands trust and high domain authority.
You may find your entire page de-indexed from the search results.
The third point is the scariest. So is there any remedial action?
You can consider the following points while syndicating content.
1-First, get your page indexed. You can do this by fetching the URL in your webmaster tool.
2-Wait until your page gets indexed.
3-You can create different formats of the content, like videos, infographics or e-books.
4-Ask for a backlink pointing to your page or site, so that Google can assess that you are the content creator.
5-Ask for a noindex tag to be incorporated on the syndicating sites. Most sites would not agree, especially if they are high in authority. But you can also ask them to add a canonical tag on their URL pointing to your original content URL.
6-Always ask the publisher to mention this at the bottom of the article: “This article was originally published at www.example.com/Content-duplicacy-issue.”
7-You can also publish your link across your social profiles, especially Google Plus, and try to get your audience to talk about, share and like it.
When I am writing my posts, I make sure that I follow all the above-mentioned points. Having said so, it does not diminish my nightmare of content being stolen. Here are the ways you can check.
You can use Copyscape to check for duplicated content. Paste the URL of your page into the search box and press the button. It will show you the sites possibly using your content. They may be using part of it as a reference too.
Here I pasted my URL: http://mobizdom.in/how-to-plan-powerful-content-marketing-8-ways-to-do-it/
It returns the parts of the content available on other sites. These may not be duplicates, as most sites quote references, so some part of the content is bound to overlap.
You can also try another free tool, Small SEO Tools.
For example, I took a part of my previous article, Why Startups Need These Nine Growth Hacking Ideas,
pasted it into the box, entered the code and hit the button. It returned the results marked as plagiarised, along with the percentage of unique content, showing only 1% of the content as unique. That is expected, since I checked my own published article. You can then click any highlighted plagiarised passage and it will point back to the originator of the content, which is my URL. This also shows that Google has given priority to the original content.
To reiterate, Matt Cutts of Google has said that roughly 25 to 30% of web content is duplicate, so some overlap is not a problem in itself.
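To give a feel for what a "percentage of unique content" means, here is a simple sketch that counts how many sentences of a text do not appear in a set of already published sources. It is my own simplified assumption of how such a score could be computed, with made-up sample text. The tools mentioned above do not publish their exact method.

```python
import re


def split_sentences(text):
    """Split text into lowercase sentences with whitespace normalised."""
    parts = re.split(r"[.!?]+", text)
    return [" ".join(p.lower().split()) for p in parts if p.strip()]


def unique_percentage(text, known_sources):
    """Percentage of sentences in text that appear in none of known_sources."""
    known = set()
    for source in known_sources:
        known.update(split_sentences(source))
    sentences = split_sentences(text)
    if not sentences:
        return 100.0
    unique = [s for s in sentences if s not in known]
    return 100.0 * len(unique) / len(sentences)


# Illustrative text: two of the three sentences already exist online.
article = "Startups need growth hacking. Content is king. Measure everything."
already_online = ["Content is king. Startups need growth hacking."]

# One of three sentences is unique, so this prints roughly 33.3.
print(round(unique_percentage(article, already_online), 1))
```

Checking your own published article against itself gives 0% unique, which mirrors the very low score in the test above.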
When you are writing, keep your audience in mind, and always make a habit of keeping your content unique and factual.
You as a marketer should always work on creating the best user experience. Google shows only relevant content to its users, and off-page optimization can help your content qualify. You can repurpose your original content and republish it in a different format to attract users. You can alter the headline, change keywords, and add or remove some parts.
You can also create unique and targeted content based on the customer’s journey through your site. When a customer is a stranger, you can create a story around your brand; once he is a visitor, you can give him blogs and infographics to read. You can capture his information by offering him e-books or downloads. You can send him an email for conversion. After he buys, you can ask for a referral.
All these different content tactics can work in your favor and help you avoid duplicate content or plagiarism issues. They will bolster your business and give you the upper hand once your customer becomes engaged with you.
I am also mentioning here the top 9 tools where you can check plagiarism online. I will review these tools in some other article.
Plagiarism in content is a big issue, which you need to handle with utmost care. Most marketers struggle to produce content in a consistent and frequent manner. It may be because of the lack of the right team, limited resources, budgetary constraints or intense competition. It often results in using and publishing somebody’s content without their permission.
Remember, it is not a crime to write about somebody’s article. You can do it, but never forget to show courtesy: always mention the source and link to it. You can add a nofollow tag to the link, but the credit itself is still a good gesture.
There is always a chance for the best and unique content to win, provided it has the right flavor to communicate and the right SEO strategy behind it. Always be careful in planning content syndication. You should make sure that you are getting backlinks pointing to the source page.
Most CMSs generate duplicate URLs for the same page, but that should bother neither you nor Google. Your ranking depends on changes made by you, Google, your competitors and your customers.
So what do you think now? Are you able to address your woes? You can publish your content without fear. I am sure you may have more points to add; please do let us know. We would love to hear them.