Duplicate Content SEO Advice From Google


Duplicate Content SEO Best Practice

Webmasters are often confused about getting penalised for duplicate content, which is a natural part of the web landscape, especially at a time when Google claims there is NO duplicate content penalty. The reality in 2015 is that if Google classifies your duplicate content as THIN content, then you DO have a very serious problem that violates Google’s webmaster guidelines, and this ‘violation’ will need to be cleaned up.

What is duplicate content?

Here is a definition from Google:

Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar. Mostly, this is not deceptive in origin…

It’s very important to understand that if, in 2015, as a webmaster you republish posts, press releases, news stories or product descriptions found on other sites, then your pages are very definitely going to struggle to gain traction in Google’s SERPs (search engine results pages).

Google doesn’t like using the word ‘penalty’ but if your entire site is made entirely of republished content – Google does not want to rank it.

If you have a multiple site strategy selling the same products – you are probably going to cannibalise your own traffic in the long run, rather than dominate a niche, as you used to be able to do.

This is all down to how a search engine filters duplicate content found on other sites – and the experience Google aims to deliver for its users – and its competitors.

Mess up with duplicate content on a website, and it might look like a penalty, as the end result is the same – important pages that once ranked might not rank again – and new content might not get crawled as fast as a result.

Your website might even get a ‘manual action’ for thin content.

Worst-case scenario: your website is hit by the Google Panda algorithm.

A good rule of thumb is do NOT expect to rank high in Google with content found on other, more trusted sites, and don’t expect to rank at all if all you are using is automatically generated pages with no ‘value add’.

While there are exceptions to the rule (and Google certainly treats duplicate content on your own site differently), your best bet for ranking in 2015 is to have one single version of content on your site with rich, unique text content that is written specifically for that page.

Google wants to reward RICH, UNIQUE, RELEVANT, INFORMATIVE and REMARKABLE content in its organic listings – and it has really raised the quality bar over the last few years.

If you want to rank high in Google for valuable key phrases and for a long time – you better have good, original content for a start – and lots of it.

A very interesting statement in a recent webmaster hangout was “how much quality content do you have compared to low quality content”. That indicates Google is looking at this ratio. John says to identify “which pages are high quality, which pages are lower quality, so that the pages that do get indexed are really the high quality ones.”

Onsite Problems

If you have many pages of similar content on your site, Google might have trouble choosing the page you want to rank, and it might dilute your capability to rank for what you do want to rank for.

For instance, if you have PRINT ONLY versions of web pages (Joomla used to have major issues with this), that can end up displaying in Google instead of your web page, if you’ve not handled it properly. That’s probably going to have an impact on conversions – for instance. Poorly implemented mobile sites can cause duplicate content problems, too.

Google Penalty For Duplicate Content On-Site?

Google has given us some explicit guidelines when it comes to managing duplication of content.

[Image: filtering of duplicate content]

John Mueller clearly states in the video where I grabbed the above image:

 “We don’t have a duplicate content penalty. It’s not that we would demote a site for having a lot of duplicate content.

and

You don’t get penalized for having this kind of duplicate content

…in which he was talking about very similar pages. John says to “provide… real unique value” on your pages.

I think that can be read as: Google is not compelled to rank your duplicate content. If it ignores it, that’s different from a penalty. Your original content can still rank, for instance.

An ecommerce SEO tip from John:

You might have variations of product “colors… for a product page, but you wouldn’t create separate pages for that”. With these types of pages you are “always balancing having really, really strong pages for these products, versus having, kind of, medium strength pages for a lot of different products”.

John says:

“one kind of really, really strong generic page” trumps “hundreds” of mediocre ones.

If “essentially, they’re the same, and just variations of keywords” that should be OK, but if you have ‘millions’ of them – Googlebot might think you are building doorway pages, and that IS risky.

Generally speaking, Google will identify the best pages on your site if you have a decent on-site architecture. Whether it returns a specific duplicate page usually depends on a number of other factors.

The advice is to avoid duplicate content issues if you can, and this should be common sense. Google wants and rewards original content – it’s a great way to push up the cost of SEO and create a better user experience at the same time.

Google doesn’t like it when ANY tactic is used to manipulate its results, and republishing content found on other websites is a common tactic of a lot of spam sites.

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. Google.

You don’t want to look anything like a spam site, that’s for sure – and Google WILL classify your site… as something.

The more you can make it look like a human built every page on a page-by-page basis, with content that doesn’t appear exactly in other areas of the site, the more Google will like it. Google does not like automation when it comes to building a website; that’s clear in 2015.

I don’t mind multiple copies of articles on the same site – as you find with WordPress categories or tags, but I wouldn’t have tags and categories, for instance, and expect them to rank well on a small site with a lot of higher quality competition, and especially not targeting the same keyword phrases.

I prefer to avoid unnecessary repeated content on my site, and when I do have automatically generated content on a site, I tell Google not to index it with a noindex in meta tags or in X-Robots-Tag headers. That is probably the safest thing to do, as getting that kind of content indexed could be seen as manipulative.
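For illustration, the noindex can be applied in the page markup itself – a minimal sketch:

```html
<!-- in the <head> of the auto-generated page you do not want indexed -->
<meta name="robots" content="noindex">
```

The equivalent server-side approach, useful for non-HTML files, is to send an `X-Robots-Tag: noindex` HTTP response header for the URL.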

Google won’t thank you, either, for spidering a calendar folder with 10,000 blank pages on it, or a blog with more categories than original content – why would they?

Offsite Problems

…in some cases, content is deliberately duplicated across domains in an attempt to manipulate search engine rankings or win more traffic. Deceptive practices like this can result in a poor user experience, when a visitor sees substantially the same content repeated within a set of search results. Google tries hard to index and show pages with distinct information. This filtering means, for instance, that if your site has a “regular” and “printer” version of each article, and neither of these is blocked with a noindex meta tag, we’ll choose one of them to list. In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results. GOOGLE

If you are trying to compete in competitive niches you need original content that’s not found on other pages in the exact same form on your site, and THIS IS EVEN MORE IMPORTANT WHEN THAT CONTENT IS FOUND ON OTHER PAGES ON OTHER WEBSITES.

Google isn’t under any obligation to rank your version of content – in the end, it depends whose site has the most domain authority or the most links coming to the page.

Don’t unnecessarily compete with these duplicate pages – always rewrite your content if you think the content will appear on other sites (especially if you are not the first to ‘break’ it, if it’s news).

How To Check For Duplicate Content

An easy way to find duplicate content is to use Google. Just take a piece of text content from your site and put it “in quotes” as a search in Google. Google will tell you on how many pages it found that piece of content in its index of the web. The best-known online duplicate content checker tool is Copyscape, and I particularly like this little tool too, which checks the duplicate content ratio between two selections of text.
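If you want a rough local check of the duplicate content ratio between two selections of text, you can sketch one in a few lines of Python’s standard library (my illustration, not the tool mentioned above):

```python
from difflib import SequenceMatcher


def duplicate_ratio(text_a: str, text_b: str) -> float:
    """Rough word-level similarity between two selections of text.

    Returns a value between 0.0 (nothing shared) and 1.0 (identical).
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    return SequenceMatcher(None, words_a, words_b).ratio()
```

Identical text scores 1.0; the lower the ratio, the more ‘unique’ one selection is relative to the other. It is only a heuristic – Google’s own duplicate detection is far more sophisticated.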

If you find evidence of plagiarism, you can file a DMCA complaint or contact Google, but I haven’t ever bothered with that, and many folk have republished my articles over the years. I even saw one in a paid advert in a magazine once.

A Dupe Content Strategy?

There are strategies where this will still work, in the short term. Opportunities are (in my experience) reserved for long tail SERPs where the top ten results page is already crammed full of low quality results and the SERPs are shabby – certainly not a strategy for competitive terms.

There’s not a lot of traffic in long tail results unless you do it en masse, and that could invite further site quality issues, but sometimes it’s worth exploring whether using very similar content with geographic modifiers (for instance) on a site with some domain authority offers an opportunity. Very similar content can be useful across TLDs too. A bit spammy, but if the top ten results are already a bit spammy…

If low quality pages are performing well in the top ten of an existing long tail SERP – then it’s worth exploring – I’ve used it in the past. I always thought if it improves user experience and is better than what’s there in those long tail searches at present, who’s complaining?

Unfortunately that’s not exactly best practice SEO in 2015, and I’d be nervous creating any type of low quality pages on your site these days.

Too many low quality pages might cause you site-wide issues in the future, not just page-level issues.

Original Content Is King, they say

Stick to original content, found on only one page on your site, for best results – especially if you have a new/young site and are building it page by page over time… and you’ll get better rankings and more traffic to your site (affiliates too!).

Yes – you can be creative – and reuse and repackage content – but if I am asked to rank a page, I will require original content on that page.

There is NO NEED to block your own Duplicate Content

There was a useful post in Google forums a while back with advice from Google how to handle very similar or identical content:

“We now recommend not blocking access to duplicate content on your website, whether with a robots.txt file or other methods” John Mueller

John also goes on to say some good advice about how to handle duplicate content on your own site:

  1. Recognize duplicate content on your website.
  2. Determine your preferred URLs.
  3. Be consistent within your website.
  4. Apply 301 permanent redirects where necessary and possible.
  5. Implement the rel=”canonical” link element on your pages where you can. (Note – Soon we’ll be able to use the Canonical Tag across multiple sites/domains too.)
  6. Use the URL parameter handling tool in Google Webmaster Tools where possible.

Webmaster guidelines on content duplication used to say:

Consider blocking pages from indexing: Rather than letting Google’s algorithms determine the “best” version of a document, you may wish to help guide us to your preferred version. For instance, if you don’t want us to index the printer versions of your site’s articles, disallow those directories or make use of regular expressions in your robots.txt file. Google

but now Google is pretty clear they do NOT want us to block duplicate content, and that is reflected in the guidelines.

Google does not recommend blocking crawler access to duplicate content (dc) on your website, whether with a robots.txt file or other methods. If search engines can’t crawl pages with dc, they can’t automatically detect that these URLs point to the same content and will therefore effectively have to treat them as separate, unique pages. A better solution is to allow search engines to crawl these URLs, but mark them as duplicates by using the rel="canonical" link element, the URL parameter handling tool, or 301 redirects. In cases where DC leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools. DC on a site is not grounds for action on that site unless it appears that the intent of the DC is to be deceptive and manipulate search engine results. If your site suffers from DC issues, and you don’t follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

Basically, you want to minimise dupe content rather than block it. I find the best solution is to handle each problem on a case-by-case basis. Sometimes I will block Google.

Google says it really needs to detect an INTENT to manipulate Google to incur a penalty, and you should be OK if your intent is innocent, BUT it’s easy to screw up and LOOK as if you are up to something fishy.

It is also easy to fail to get the benefit of proper canonicalisation and consolidation of primary relevant content if you don’t do basic housekeeping, for want of a better turn of phrase.

Advice on content spread across multiple domains:

Reporting News

Content Spread Across Multiple TLDs

Mobile SEO Advice

Canonical Link Element Best Practice

Google also recommends using the canonical link element to help minimise content duplication problems.

If your site contains multiple pages with largely identical content, there are a number of ways you can indicate your preferred URL to Google. (This is called “canonicalization”.)

Google SEO – Matt Cutts from Google shared tips on the rel=”canonical” tag (more accurately – the canonical link element) that the top three search engines now support. Google, Yahoo!, and Microsoft agreed to work together in a

“joint effort to help reduce duplicate content for larger, more complex sites, and the result is the new Canonical Tag”.

Example Canonical Tag From Google Webmaster Central blog:

<link rel="canonical" href="http://www.example.com/product.php?item=swedish-fish" />

You can put this link tag in the head section of the problem URLs, if you think you need it.

I add a self referring canonical link element as standard these days – to ANY web page.
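As an aside, it is easy to check which canonical (if any) a page declares. A quick sketch using Python’s standard-library HTML parser (my illustration, not an official tool):

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of the first <link rel="canonical"> element seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "canonical" and self.canonical is None:
            self.canonical = attrs.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL of an HTML document, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Running this over your own templates is a cheap way to confirm every page really does emit the self-referring canonical you intended.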

Is rel=”canonical” a hint or a directive? 
It’s a hint that we honor strongly. We’ll take your preference into account, in conjunction with other signals, when calculating the most relevant page to display in search results.

Can I use a relative path to specify the canonical, such as <link rel=”canonical” href=”product.php?item=swedish-fish” />?
Yes, relative paths are recognized as expected with the <link> tag. Also, if you include a <base> link in your document, relative paths will resolve according to the base URL.

Is it okay if the canonical is not an exact duplicate of the content?
We allow slight differences, e.g., in the sort order of a table of products. We also recognize that we may crawl the canonical and the duplicate pages at different points in time, so we may occasionally see different versions of your content. All of that is okay with us.

What if the rel=”canonical” returns a 404?
We’ll continue to index your content and use a heuristic to find a canonical, but we recommend that you specify existent URLs as canonicals.

What if the rel=”canonical” hasn’t yet been indexed?
Like all public content on the web, we strive to discover and crawl a designated canonical URL quickly. As soon as we index it, we’ll immediately reconsider the rel=”canonical” hint.

Can rel=”canonical” be a redirect?
Yes, you can specify a URL that redirects as a canonical URL. Google will then process the redirect as usual and try to index it.

What if I have contradictory rel=”canonical” designations?
Our algorithm is lenient: We can follow canonical chains, but we strongly recommend that you update links to point to a single canonical page to ensure optimal canonicalization results.

Can this link tag be used to suggest a canonical URL on a completely different domain?
**Update on 12/17/2009: The answer is yes! We now support a cross-domain rel=”canonical” link element.**

Tip – Redirect old, out of date content to new, freshly updated articles on the subject, minimising low quality pages and duplicate content whilst at the same time, improving the depth and quality of the page you want to rank. See our page on 301 redirects – http://www.hobo-web.co.uk/how-to-change-domain-names-keep-your-rankings-in-google/.

Tips from Google

As with everything Google does, Google has had its own critics about its use of duplicate content on its own site for its own purposes:

[Image: example of a scraper site]

There are some steps you can take to proactively address duplicate content issues, and ensure that visitors see the content you want them to. Use 301s: If you’ve restructured your site, use 301 redirects (“RedirectPermanent”) in your .htaccess file to smartly redirect users, Googlebot, and other spiders. (In Apache, you can do this with an .htaccess file; in IIS, you can do this through the administrative console.)

Be consistent: Try to keep your internal linking consistent. For example, don’t link to http://www.example.com/page/ and http://www.example.com/page and http://www.example.com/page/index.htm.

I would also add: ensure your links are all the same case, and avoid capitalised and lowercase variations of the same URL. This type of duplication can be easily sorted by keeping internal linking consistent and proper use of canonical link elements.
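One way to keep internal links consistent is to normalise every URL to a single agreed form before it goes into a template. A sketch of the idea (the exact rules – lowercasing, trailing slashes, index files – are a policy choice for your own site):

```python
from urllib.parse import urlsplit, urlunsplit


def normalise_url(url: str) -> str:
    """Normalise a URL to one consistent internal-link form.

    Policy used here (illustrative): lowercase scheme, host and path;
    strip /index.htm(l); always end the path with a single trailing slash.
    """
    parts = urlsplit(url)
    path = parts.path.lower()
    if path.endswith("/index.htm") or path.endswith("/index.html"):
        path = path.rsplit("/", 1)[0] + "/"
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path,
                       parts.query, parts.fragment))
```

Run every internal link through one function like this and the `/page/` vs `/page` vs `/Page/index.htm` variations from Google’s example simply cannot appear in your markup.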

Use top-level domains: To help us serve the most appropriate version of a document, use top-level domains whenever possible to handle country-specific content. We’re more likely to know that http://www.example.de contains Germany-focused content, for instance, than http://www.example.com/de or http://de.example.com.

Google also tells webmasters to choose a preferred domain to rank in Google:

Use Webmaster Tools to tell us how you prefer your site to be indexed: You can tell Google your preferred domain (for example, http://www.example.com or http://example.com).

…although you should really ensure you handle such redirects server side, with 301 redirects sending all versions of a URL to one canonical URL (with a self-referring canonical link element).
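On Apache, for example, the non-www to www consolidation might be handled with a 301 along these lines – a sketch only, assuming mod_rewrite is enabled; test rules like this carefully on your own server before deploying:

```apache
# .htaccess – 301 redirect all non-www requests to the www version
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```

Whichever hostname you pick, the point is that only ONE version of each URL should ever answer with a 200.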

Minimize boilerplate repetition: For instance, instead of including lengthy copyright text on the bottom of every page, include a very brief summary and then link to a page with more details. In addition, you can use the Parameter Handling tool to specify how you would like Google to treat URL parameters.

Understand Your CMS

Google says:

Understand your content management system: Make sure you’re familiar with how content is displayed on your web site. Blogs, forums, and related systems often show the same content in multiple formats. For example, a blog entry may appear on the home page of a blog, in an archive page, and in a page of other entries with the same label.

WordPress, Magento, Joomla, Drupal – they all come with slightly different duplicate content (and crawl equity performance) challenges.

Syndicating Content Comes At A Risk

When it comes to publishing your content on other websites:

Syndicate carefully: If you syndicate your content on other sites, Google will always show the version we think is most appropriate for users in each given search, which may or may not be the version you’d prefer. However, it is helpful to ensure that each site on which your content is syndicated includes a link back to your original article. You can also ask those who use your syndicated material to use the noindex meta tag to prevent search engines from indexing their version of the content.

The problem with syndicating your content is you can never tell if this will ultimately cost you organic traffic. If it’s on other people’s websites – they might be getting ALL the benefit – not you.

It’s also worth noting that Google still clearly says in 2015 that you can put links back to your original article in posts that are republished elsewhere. But you need to be careful with that too – as those links could be classified as unnatural links.

A few years ago I made an observation: I think links on duplicate posts which have been stolen – duplicated and republished – STILL pass anchor text value (even if it is only a light boost).

Take this cheeky beggar – he nicked my ‘what is SEO’ post from 2007, stripped out all my links (cheek!) and published the article as his own.

Well he stripped out all the links apart from one link he missed:


[Image: snippet of the republished article]
Yes, the link to http://www.duny*.com.pk/ was actually still pointing to my home page.

This gave me an opportunity to look at something…..

The article itself wasn’t 100% duplicate – there was a small unique intro text, as far as I can see. It was clear from looking at Copyscape just how much of the article was unique and how much was duplicate.

So this was a 3-year-old article republished on a low quality site, with a link back to my site within a portion of the page that’s clearly dupe text.

I would have *thought* Google just ignored that link.

But no, Google did return my page for the following query (at the time):

[Image: Google SERP returning my page for the query]

This Google Cache notification (now no longer available) sometimes tells fibs, but it was pretty accurate this time:

[Image: Google Cache view of the page]

…which suggests to me that Google will count links (AT SOME LEVEL) even on duplicate articles republished on other sites – probably depending on the search query and the quality of the SERP at that time (perhaps even taking into consideration the quality score of the site with the most trust?).

I’d imagine this to be the case even today.

How to take advantage of this?

Well, you get an idea of just how much original text you need to add to a page for that page to pass some kind of anchor text value (perhaps useful for article marketers). And in this case, it’s not much! Kind of lazy though. And certainly not good enough in 2015.

It seems syndicating your content via RSS and encouraging folk to republish your content will get you links that count, on some level (which might be useful for longer tail searches). I still always make sure even duplicate (in essence) press releases and articles we publish are ‘unique’ at some level.

Google is quite good at identifying the original article, especially if the site it’s published on has a measure of trust – I’ve never had a problem with syndication of my content via RSS and letting others cross-post… but I do like at least a link back, nofollow or not.

Original Articles Come Top (usually)

The bigger problem with content syndication in 2015 is unnatural links and whether or not Google classifies your intent as manipulative.

Thin Content Classifier

Google also has this to say about ‘thin’ content:

Avoid publishing stubs: Users don’t like seeing “empty” pages, so avoid placeholders where possible. For example, don’t publish pages for which you don’t yet have real content. If you do create placeholder pages, use the noindex meta tag to block these pages from being indexed.

and

Minimize similar content: If you have many pages that are similar, consider expanding each page or consolidating the pages into one. For instance, if you have a travel site with separate pages for two cities, but the same information on both pages, you could either merge the pages into one page about both cities or you could expand each page to contain unique content about each city.

The key takeaway about duplicate content is this:

Duplicate content is a normal part of the churn of the web. Google will rank it – for a time. Human or machine generated, there is a lot of it – Google has a lot of experience handling it, and there are many circumstances where Google finds duplicate content on websites. Not all duplicate content is a bad thing.

If a page ranks well and Google finds it is a manipulative use of duplicate content, Google can demote the page if it wants to. If the intent is deemed manipulative and low quality with no value add, Google can take action on it – using manual or algorithmic actions.

There is a very thin line between reasonable duplicate content and thin content. This is where the confusion comes in.

Google clearly states they don’t have a duplicate content penalty – but they do have a ‘thin content’ manual action… which looks and feels a lot like a penalty. They also have Google Panda.

Google Panda

Part of the Google Panda algorithm is focused on thin pages and the ratio of good quality content to low quality content on a site. In the original announcement about Google Panda, we were specifically told that the following was a ‘bad’ thing:

Does the site have duplicate, overlapping, or redundant articles?

If Google is rating your pages on content quality, or lack of it, as we are told, and user signals – on some level – and a lot of your site is duplicate content that gets no user signal – then that may be a problem too.

Google offers some advice on thin pages (emphasis mine):

Here are a few common examples of pages that often have thin content with little or no added value:

  1. Automatically generated content
  2. Thin affiliate pages
  3. Content from other sources. For example: Scraped content or low-quality guest blog posts.
  4. Doorway pages

Everything I’ve bolded in the last two quotes is essentially about duplicate content.

Google is even more explicit when it tells you how to clean up this ‘violation’:

Next, follow the steps below to identify and correct the violation(s) on your site: Check for content on your site that duplicates content found elsewhere.

So beware. Google says there is NO duplicate content penalty but if Google classifies your duplicate content as thin content, then you DO have a problem.

A serious problem if your entire site is built like that.

And how Google rates thin pages changes over time, with a quality bar that is always going to rise and that your pages need to keep up with. Especially if rehashing content is what you do.

TIP – Look out for soft 404 errors in Google Webmaster Tools as examples of pages Google is classing as low quality, user-unfriendly thin pages.

More reading




24 Responses

  1. Alan Bleiweiss says:

    Well I’m going to continue blocking duplicate content. First, Google may be the biggest and most important search engine to focus on, however they’re not the only ones. A while back they made a deal with Adobe to claim that Flash was now more SEO friendly. Anyone who actually used that financially motivated marketing hype as an excuse to change their anti-Flash views was a fool. And Google does not state anywhere (nor can they nor will they) that content you want kept out of the SERPs is guaranteed to be kept out by the new recommended methods. In fact, they say “In cases where duplicate content still leads to us crawling too much of your website, you can also adjust the crawl rate setting in Webmaster Tools.” Well guess what – that is the most arcane and pitiful directive I’ve seen in a long time. If I want to keep the googlebot’s grubby digital hands off of certain content, I’ll do it the intelligent, proven way and not rely on Google hacks and their guestimating algorithm…

  2. Ian Brodie says:

    Hi Shaun, I rewrite my articles before syndicating to article directories & other blogs. but I have to admit it’s a huge pain. What’s your view on the value of a link from a page regarded as duplicate content. So, for example, imagine I put the same article on a load of article directories and many of them end up in the “we have omitted some entries very similar…” bit of search. That’s OK (I prefer the original to rank, of course), but do the links back to my site from these pages still count? Ian

  3. George says:

    Hi, I understand your line of thought but wonder how I can write unique content on technical details on products. For example I sell shower pumps and have added content explaining when and how to use them. These details can be found on many web sites including manufacturers’ sites. No scope for original content! Regards George

  4. seoslayer says:

    I think that if you have a new/young site, the worst thing you can do is provide duplicate content. This practice can “kill” your SERPs. But if I create a page on which I copy another page and buy a bunch of links to that page, could I rank better than the original page?

  5. Joseph Geraghty says:

    Nice uncomplicated post. I have one page that is ranked by Google and is number one organically for Emotional Intelligence coaching. I can see I need to put the same love and attention into the other pages to get them recognised. I downloaded some time ago your free book, I need to schedule an appointment with myself to read them. Thanks for sharing, Joseph

  6. Pippa says:

    Just wondering on the topic of duplicate content whether Google treats copies of pages translated into different languages as duplicates? For example, client designed site in English, we’ve sorted keywords, titles, descriptions etc. He was just going to get it all translated as is, into Spanish (it’s for an English painter living in Spain), including keywords, titles etc. Will this count against him?

  7. Adelson (Gerenciando) says:

    Hi, Shaun! Wise words! But I have a question: even on HOBO, you have partial duplicate content between the main page and article pages. This happens because the beginnings of articles are also shown on the main page. Same happens on my blog. Is that a problem? I notice that Google first shows results for my main page, before showing article pages. If so, how can I avoid that? Thanks!

  8. Nick says:

    Shaun, On my website http://www.pimlico-flats.co.uk I have lots of pages where a flat is described – these pages are often identical because the flats are identical. Also I use 3 different pages ( see http://www.pimlico-flats.co.uk/rent_london_flats_75_11_b.html and http://www.pimlico-flats.co.uk/rent_london_flats_75_11_a.html ) to create the picking a view effect. Is any of this causing a problem for me?

    • Shaun Anderson (Hobo) says:

      @Nick – Possibly – I’d consider 301ing old pages or using the canonical tag if you have a lot of dupe content pages. @Adelson – Google knows this site is a blog. It can handle that kind of dupe content no problem. @Pippa – I’m not 100% up to speed with translated duplicate content, but if you are going to the trouble to have it translated, why not rewrite a little. @George – that’s no excuse! lol If I had your site every product page would have unique content – even padded out by an article writer. @Ian – It’s a low quality link at best and at worst an invitation to a keyword ranking filter in my experience.

  9. Asif says:

    I am considering developing a site for a colleague who has been sending emails out based on links to articles featured elsewhere on the web. Articles will be appropriately referenced. The site aims to be a good reference source for the email base, but he would like to have a platform to evolve the site with respect to organic search listings. Is there a way of isolating the duplicate content from being evaluated by Google? Thanks,

  10. Nick says:

    Shaun – these aren’t old or redundant pages; they are pages that describe different flats, and because the flats are identical the pages are very similar. This can’t be an unusual situation – lots of companies carry similar and identical products. Great blog BTW, I follow it, but can’t always find the “Leave a Reply” box so don’t comment as much as I’d like to. I wanted to comment yesterday about WordPress plugins but there was no reply box. Today I can’t even find the blog! I wanted to recommend LinkWithin & Zemanta as plugins.

    • Shaun Anderson (Hobo) says:

      Hi Nick – I close old posts just to keep my spam down and the focus on new discussions :) Yeah, I had a look at the pages you linked to ;) and saw that. I work on some sites with similar issues, but I try to ensure an internal navigation structure emphasises at least one of those types of pages (even a sort of category page that would lead to those detail pages), to at least ENSURE one page has a chance to rank for the terms you are optimising for. As I say in the article, NEAR duplicate content is not always a bad thing, but you need a sensible strategy. If it’s not clear which page you ideally want to rank for a certain term, how can you possibly expect Google to pick it out? But as I say, near dupe content can be of limited use in supplemental, or low quality, results.

  11. drew says:

    How bad are framed pages? I have a section of my homepage that is framed and a news feed comes through. How much text exactly should I have? I see a lot of top ranking sites in my category with very little wording on their home page – they just have links and search boxes, pictures, etc. Thanks, and these daily tips have been very helpful.

    • Shaun Anderson (Hobo) says:

      @Drew – depends on your intent. I’ve heard of Google penalising such pages because of a LOT of content in hidden scroll bar DIVs (like frames). I have always avoided introducing hidden text and elements to pages, as I think: if you don’t think it is important to display prominently on your page, why would Google think it’s a valuable addition to the page? I wouldn’t ever build a site with HTML FRAMES, of course. @Asif – Duplicate content ranks – it depends on how you implement it and how it is linked to. It’s not really a strategy I like to use for real sites, though.

  12. Christian Buckland says:

    Hey, this is my first post and I hope I’m doing it right! I worked at the Priory before starting up on my own, and I did their website. As many pages were about the same thing (depression, addictions etc.), an SEO charged us a fortune just to point that out, so thanks for doing this for free!

  13. Alok says:

    I wonder how duplicate content checking algorithms really work, and how the search engines figure out if words have just been replaced with their synonyms. And what about duplicate content across multiple sites: is it really feasible to detect it among multiple sites if the copies have been altered slightly?

  14. Ian Brodie says:

    Thanks for the reply Shaun. Guess I have to keep slogging away to make everything unique then! Ian

  15. Dave says:

    Hi guys, if I use a forwarding URL from someone like GoDaddy (which I think uses a 302 redirect), will the SEs look at the forwarded URL as a duplicate site? E.g. if I forward aaa.net to fff.com and aaa.net is just a domain, with no pages or content. Thx, Dave
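
[Editor’s note] If you control hosting for the forwarded domain, a common alternative to registrar-level forwarding (which is often a 302, i.e. “temporary”) is a server-side 301 permanent redirect, which tells search engines the old domain has moved for good. A minimal Apache .htaccess sketch, using Dave’s example domain names:

```apache
# .htaccess on aaa.net: permanently redirect every request to fff.com,
# preserving the requested path in $1
RewriteEngine On
RewriteCond %{HTTP_HOST} ^(www\.)?aaa\.net$ [NC]
RewriteRule ^(.*)$ http://fff.com/$1 [R=301,L]
```

With a 301 in place, search engines should consolidate the two URLs rather than index the forwarding domain as a separate duplicate site.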

  16. Jenny Stradling says:

    @George… I had a client once with a clean site; he was offering products that came with manufacturer’s descriptions. The site was ranking well for many keywords, but for some reason we just couldn’t get one particular SP to rank. I ran the page through Copyscape and found that the content was flagged as duplicate. I asked the client to modify the content. He explained that because it was the manufacturer’s content, they required it to be exact. Long story short, he finally got permission to change the content and, just like that, the ban was lifted: the page was indexed and within 2 weeks was ranking in the SERPS. I think this is a common mistake people make with shopping sites. Even if you are reselling a product, your own website content should be unique! If your manufacturer has a description, explain to them that for SEO value your content needs to be unique.

  17. Shaun Anderson (Hobo) says:

    @Alok – I’ve seen software that checks for dupe content on a line-by-line basis, not just page by page. Whether Google is using that much effort is another matter…
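
[Editor’s note] On Alok’s question about how such checks might work: one well-known technique from the information retrieval literature is w-shingling – comparing the sets of overlapping word sequences two documents contain, scored with Jaccard similarity. This is a minimal sketch of the general idea only, not a claim about what Google or any commercial checker actually runs:

```python
def shingles(text, k=3):
    """Return the set of all k-word sequences ('shingles') in text."""
    words = text.lower().split()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# A single-word synonym swap still leaves many shingles shared,
# which is why naive synonym replacement rarely defeats such checks.
doc1 = "the quick brown fox jumps over the lazy dog"
doc2 = "the quick brown fox leaps over the lazy dog"
print(jaccard(shingles(doc1), shingles(doc2)))  # prints 0.4
```

Identical documents score 1.0; unrelated documents score near 0. Production systems typically scale this up with hashing tricks (e.g. MinHash) rather than comparing raw shingle sets.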

  18. Jenny says:

    Hi, on the line-for-line note: if you have, say, one duplicate paragraph that changes only slightly across a lot of your pages, will that be better? Or will it still get penalised? Thanks, Jen – Web Design Wales, Graphic Design Wales

  19. George says:

    Hi Jenny, thanks for the tips. I have now changed the content details on my pump page and will see if the page rank improves. Your views are logical. Thanks for the reply. Regards, George

  20. Michael Brandon says:

    I love your post – always good to hear something that I myself have found. Young sites definitely are more affected by dup content. I get incredibly frustrated with Google regarding duplicate content. I get clients’ homepages ranked well, then, because they are top ten, scrapers copy their metas and words, including the first instance of the search phrases. Since the client websites are in general new, the duplicate is enough to knock them off the ranking perch. It has even happened with a powerful page on my own SearchMasters.co.nz site – I was ranking top ten for “marketing” on Google.co.nz. Scrapers copied my content and my page dived to outside the top 1000. I made the content unique again, and the rankings came back up to the top 20, now 40th. Unfortunately, Google seems to have a memory of such dup content – even after the content has been made unique, it holds a black mark against you. While you say “Don’t unnecessarily compete with these dupe pages by always rewriting your content”, I would rather be safe than sorry, so I am often rewriting meta descriptions and first instances of search phrases on my own and client pages. Shopping cart type sites are that much easier when you have a formula-generated opening paragraph. Just change the formula and, voila, you have unique opening paragraphs again.
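
[Editor’s note] The “formula-generated opening paragraph” Michael describes can be as simple as a template function over product attributes you already store – edit the template once and every generated opening on the site changes with it. A hypothetical sketch (the function name and field names are invented for illustration):

```python
def opening_paragraph(name, category, town, feature):
    # One template string; rotating or rewording this string
    # regenerates a fresh opening for every product page at once.
    return (f"Looking for a {category} in {town}? {name} offers "
            f"{feature}, with full details and photos below.")

print(opening_paragraph("Flat 75b", "one-bedroom flat", "Pimlico",
                        "an identical layout with a different view"))
```

This only varies the presentation, of course – the more of each page that is genuinely unique, the less the template matters.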

  21. Oisin says:

    I just found out that my site is being scraped constantly by someone who now has my entire site (albeit in a static version) available on a different URL. Should I be worried? I am, a little, as Google doesn’t seem to be crawling my site that strongly lately, with some new pages I put up not having been revisited for around 2 months now. (I’ve got sitelinks, so I presume Google considers me a decent site at least.)


