Prevent Google Indexing Internal Search Results Pages?

Google says to do it using robots.txt.

“Use robots.txt to prevent crawling of search results pages or other auto-generated pages that don’t add much value for users coming from search engines.” (Google Webmaster Help)
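
As a rough sketch of what that advice looks like in practice, the snippet below checks a sample robots.txt rule with Python's standard urllib.robotparser module. The /search path and the example.com URLs are assumptions for illustration only; your own search results may live under a different path. Bear in mind that robots.txt controls crawling, so a blocked URL that other sites link to can still show up in the index without a snippet.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: assumes internal search results live under /search.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the sample rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # An internal search results URL (hypothetical) and a normal page.
    print(is_crawlable("https://www.example.com/search?q=shoes"))
    print(is_crawlable("https://www.example.com/products/shoes"))
```

With these sample rules, the search URL is reported as blocked and the product page as allowed.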

Perhaps it comes down to how much value these pages add for visitors who land on them from search results. I see many websites (especially big sites) that allow their internal search results to be spidered by Googlebot. Of course, if you follow Google’s advice and block your search results pages in robots.txt while people are linking to them, you might be losing out on incoming link equity. Andy Beard wrote about something similar some time ago: SEO Linking Gotchas Even the Pros Make.

Search Engine Land wrote a couple of years ago about this same issue, as did Matt Cutts.

I have used various methods to manage internal search engine results pages over the last few years, but it annoys me to see bigger brands ignore this Google guideline (where they benefit). Of course, they could well be hurting themselves in other ways by ignoring this guideline and allowing Google to crawl and return these internal SERPs.

I am no expert in this area. Do you prevent Google from indexing the internal search results of your website, and if so, how do you do it?

Interested in learning more about robots.txt? Check out our robots.txt beginners guide with Sebastian, the respected writer of Sebastian's Pamphlets :)

My Twitter buddy Edward Lewis pointed me to this site for more information on Robots Meta Tags if you want to get deep :)
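
For anyone who wants to try the meta tag route instead, here is a minimal sketch of a template helper that emits a noindex robots meta tag only on internal search pages. The function name, the /search prefix and the example URLs are all hypothetical. Note that Googlebot has to be able to crawl a page to see this tag, so it should not be combined with a robots.txt block on the same URLs.

```python
from urllib.parse import urlsplit

# Hypothetical path prefix for internal search results pages.
SEARCH_PREFIX = "/search"


def robots_meta_tag(url: str) -> str:
    """Return a noindex meta tag for search results URLs, else an empty string."""
    path = urlsplit(url).path
    if path == SEARCH_PREFIX or path.startswith(SEARCH_PREFIX + "/"):
        # "follow" lets crawlers keep following links found on the page.
        return '<meta name="robots" content="noindex, follow">'
    return ""


if __name__ == "__main__":
    print(robots_meta_tag("https://www.example.com/search?q=shoes"))
    print(robots_meta_tag("https://www.example.com/products/shoes"))
```

The returned string would go inside the page's head section; regular pages get nothing added.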


Written by Shaun Anderson

5 Responses to “Prevent Google Indexing Internal Search Results Pages?”

  1. Brandon says:

    Instead of preventing Google from indexing internal search result pages, re-write the URLs and remove search result distinctions (search results in title, h1, url, etc). Make them faceted search pages.

    Good examples can be seen on Overstock

  2. Jeet says:

    Many times I simply use a noindex on search results page rather than exposing my search path with robots.txt

    I do the same for thankyou pages and order confirmation pages that have tracking code.

  3. Well, I think both ideas work in such cases: (1) rewrite the URLs of internal search results pages, or (2) disallow those pages in robots.txt.
    I was struggling with this issue, so I consulted Google customer service two months ago, and they suggested disallowing such URLs in robots.txt.
    Anyway, a pretty informative post, thanks.

  4. Richard Ball says:

    Because they use HTTP, robot spider indexers can be slower than local file indexers, and can put more pressure on your web server, as they ask for each page. Some older webservers may crash during this process, either from the number of requests or because they uncover file corruption.
