What is Duplicate Content?
Duplicate content refers to blocks of content that appear in more than one location on the web. The content can be identical or highly similar, and the copies can sit on different pages within a single website or across multiple websites.
Estimates show that about 30% of everything on the web is duplicate content. At first, this sounds like a huge amount, but it actually makes perfect sense…
Imagine two independent online marketers each explaining in their own words what “duplicate content” means. One marketer’s explanation will almost certainly closely match the other’s. The same thing happens all over the web. There are of course exceptions, and some types of duplicate content are more harmful than others; we cover these later in this article.
We will cover in this guide:
- Why duplicate content is damaging to your website.
- What internal and external duplicate content is.
- How to control duplicate content.
- How to avoid duplicate content on your website.
How Damaging is Duplicate Content for SEO?
Duplicate content can harm your rankings for several reasons:
- The search engine does not know which page to index and which to exclude. Search engines like Google want to provide a good visitor experience, which is why they will rarely show multiple versions of the same content.
- The search engine doesn’t know which page the statistics belong to. This makes it difficult to accurately analyze how a website is performing in terms of metrics such as Trust Flow and Domain Authority.
The search engine does not know which page is the “right” one, so it has to choose which page to index. In most cases, search engines choose the domain with the most authority.
However, Google itself states that it does not hand out so-called “penalties” for duplicate content.
This is what Google says: “duplicate content on a site is not a reason to take action against that site unless it appears that the duplicate content is intended to mislead and manipulate search engine results.”
INTERNAL DUPLICATE CONTENT
Internal duplicate content usually occurs unintentionally within your own website. It has nothing to do with copying content; it is usually the result of various (technical) problems within your website’s URL structure. Here are the most common causes:
Nearly identical URLs with the same purpose
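In many web stores, you can reach the same destination page in different ways through the site’s navigation. For example (Example 1), both of these URLs lead to the same product page:
- examplewebshop.co.uk/phones/apple/iphone-12
- examplewebshop.co.uk/brands/apple/iphone-12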
Almost identical URLs with different purposes
Many web shops offer products in different colors, with a separate URL for each color, which makes the URLs look almost identical.
HTTP & HTTPS / www. vs. non-www.
Many websites have several versions of their domain live with the same content, for example ‘www.seeders.co.uk’ and ‘seeders.co.uk’, or the HTTP and HTTPS versions of the same address.
In the cases above, Google chooses which page to index, but it is difficult for Google’s crawlers to determine which page is the ‘original,’ which often leads to poor or declining rankings. There are solutions to this, which we’ll come back to later.
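To make the protocol and www variants concrete, here is a minimal Python sketch that collapses them into one preferred address. The function name `preferred_url` is hypothetical, and treating HTTPS without "www." as the preferred version is an assumption for illustration; use whichever version your site actually serves.

```python
from urllib.parse import urlsplit, urlunsplit


def preferred_url(url: str) -> str:
    """Map protocol and www variants of a URL onto one preferred version.

    Assumption: the preferred version is HTTPS without "www.".
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    # Treat an empty path and "/" as the same page.
    path = parts.path or "/"
    return urlunsplit(("https", host, path, parts.query, ""))


variants = [
    "http://seeders.co.uk",
    "http://www.seeders.co.uk/",
    "https://seeders.co.uk/",
    "https://www.seeders.co.uk",
]

# All four variants collapse into a single preferred URL.
unique = {preferred_url(v) for v in variants}
print(unique)
```

If the set contains more than one entry for what should be a single page, those addresses are candidates for a redirect or a canonical tag.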
EXTERNAL DUPLICATE CONTENT
External duplicate content occurs when different websites copy each other or when their content is very similar. Take company X, which does drop-shipping. Company X gets its products from a Chinese supplier, and that supplier delivers to many other customers as well. The supplier sends a product description along with every product. Many of these customers are also drop shippers and post the supplied description on their websites, and so does company X. Voilà, duplicate content has been created. As a result, Google will have trouble deciding which of these companies’ pages to index and rank. This is just one example, but you can probably imagine many other ways external duplicate content can be created.
DUPLICATE CONTENT CHECK
To find out how much duplicate content your website has, you can use several tools.
To check your website for duplicate content, you can use Siteliner. With this free tool, you simply enter the URL of your website or page and get a report on the amount of duplicate content.
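To get a feel for what "amount of duplicate content" can mean, here is a rough Python sketch that scores how similar two texts are. It uses word shingles and Jaccard similarity, a common general-purpose technique; this is not a description of how Siteliner works, and the function names are hypothetical.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences ("shingles") in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Jaccard similarity: 0.0 means nothing shared, 1.0 means identical shingles."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0  # An empty text is never reported as a duplicate.
    return len(a & b) / len(a | b)


page_a = "Duplicate content refers to blocks of content that appear in multiple locations."
page_b = "Duplicate content refers to blocks of content that appear in several places."

print(round(similarity(page_a, page_b), 2))
```

The closer the score is to 1.0, the more the two texts overlap. Where you draw the line between "similar" and "duplicate" is a judgment call.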
Google Incognito Window
To check whether your website is suffering from duplicate content, copy the first 10 to 15 words of text from one of your pages. Then paste them into Google’s search bar in an incognito window. If your page appears at the top, Google most likely treats your content as the original source. If not, there may be a problem with duplicate content. Repeat this process with different pages of your website.
Step 1: Copy the first part of your content
Step 2: Paste the text into an incognito window and make sure the page is at the top
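If you want to repeat the two steps above for many pages, a small Python helper can prepare the search link for you. This is only a sketch: the function name is hypothetical, the Google search URL format is an assumption, and wrapping the phrase in quotation marks to request an exact match is an addition that is not part of the steps above. You still open the link in an incognito window and check the results yourself.

```python
from urllib.parse import quote_plus


def exact_match_search_url(page_text: str, word_count: int = 15) -> str:
    """Build a Google search URL for the first `word_count` words of a page."""
    words = page_text.split()[:word_count]
    phrase = " ".join(words)
    # Quotation marks ask the search engine for the exact phrase (our addition).
    return "https://www.google.com/search?q=" + quote_plus(f'"{phrase}"')


text = (
    "Duplicate content refers to blocks of content that appear in multiple "
    "locations across a website. It can refer to identical or highly similar content."
)
print(exact_match_search_url(text))
```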
You can also check for external duplicate content and plagiarism with a free tool called Copyscape. With Copyscape, you can receive automatic notifications when the tool suspects that an external website has copied content from your site.
Tips and Tricks to Avoid Duplicate Content
We know by now that duplicate content can be detrimental to your rankings. To prevent this, there are a number of things you can do.
The very best way to avoid duplicate content is to make sure your site structure is well thought out. In most cases, this automatically minimizes duplicate content. However, if you already have a complete page structure in place, this is not something you can fix on the fly. If other solutions are hard to apply, restructuring your website is something to consider carefully.
To minimize external duplicate content, you need to publish unique content. Google rewards unique content. It is also important to update your content regularly. Above all, write in your own way and back up your claims with sources.
Build a strong domain
Google is more likely to treat domains with strong authority as the original source of duplicated content than domains with weaker authority. This is not true in all cases; for example, Google also pays attention to the publication date of the content. Still, a strong domain helps secure a strong position on the World Wide Web, which contains a large amount of duplicate content.
With a 301 redirect, you permanently send visitors and search engines from the duplicate pages to one page, passing on the link value those duplicates generated. In other words, you no longer use the remaining pages. However, this approach does not work for filter pages, because each filter combination produces its own URL parameters.
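As an illustration of what a 301 redirect does, here is a minimal Python sketch written as a small WSGI application. The `REDIRECTS` mapping and the paths in it are hypothetical (borrowed from the example web shop in this article), and the `demo` helper only simulates a request so you can see the response without starting a server.

```python
# Hypothetical mapping: duplicate path -> the single page that should remain.
REDIRECTS = {
    "/brands/apple/iphone-12": "/phones/apple/iphone-12",
}


def app(environ, start_response):
    """Minimal WSGI app: answer duplicate paths with a 301 to the kept page."""
    path = environ.get("PATH_INFO", "/")
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 tells browsers and crawlers that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [f"Content for {path}".encode("utf-8")]


def demo(path):
    """Call the app with a fake request and return (status, headers, body)."""
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app({"PATH_INFO": path}, start_response))
    return captured["status"], captured["headers"], body


print(demo("/brands/apple/iphone-12"))
print(demo("/phones/apple/iphone-12"))
```

In practice, redirects are usually configured in your web server or CMS rather than in application code; the sketch only shows the status code and the `Location` header that the duplicate URL should return.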
A canonical tag is a label that tells search engines which page is the original source of a piece of content. To illustrate, we’ll use Example 1 under the heading “Internal duplicate content.” If you want to indicate that examplewebshop.co.uk/phones/apple/iphone-12 is the original page, add a canonical tag to the page examplewebshop.co.uk/brands/apple/iphone-12 that points to the URL of the original page. It is important to set canonical tags on all pages where duplicate content may occur. However, a canonical tag is not a 100% guarantee of success; the search engine decides whether to honor it or not.
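To show what such a tag looks like, here is a short Python sketch that builds the canonical tag for the example above and then reads it back out of a page's HTML. The helper names are hypothetical, and the `https://` prefix is an assumption, since the example URLs in this article are written without a protocol.

```python
from html import escape
from html.parser import HTMLParser


def canonical_tag(original_url: str) -> str:
    """Build the <link> element that goes in the <head> of the duplicate page."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Collect the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        is_canonical = (attr.get("rel") or "").lower() == "canonical"
        if tag == "link" and is_canonical and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


# Tag to place on examplewebshop.co.uk/brands/apple/iphone-12 (the duplicate).
tag = canonical_tag("https://examplewebshop.co.uk/phones/apple/iphone-12")
page = f"<html><head><title>iPhone 12</title>{tag}</head><body></body></html>"

print(tag)
print(find_canonical(page))
```

You can use a check like `find_canonical` to verify that each duplicate page points to the page you consider the original.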
Genevieve is a Corporate Content Marketer at Seeders Group, bringing a creative flair to the world of digital marketing. With a knack for storytelling and a keen eye for detail, she excels at crafting compelling content that resonates with audiences. Her expertise spans SEO, SEA, Digital PR, and Link Building, and she's passionate about sharing her knowledge to help businesses grow.