Vanity Check: “Keyword Intelligence Tools”

It’s not about the tools you have, it’s about how you interpret the data. Most importantly, it’s how you evaluate what your data is really telling you. Look at this chart: Now, ask yourself what you see. Say it out loud, or in your head, whatever. I bet the first thing you said was: “The […]

The post Vanity Check: “Keyword Intelligence Tools” appeared first on SEOgadget.

Weighting the Clusters of Ranking Factors in Google’s Algorithm – Whiteboard Friday

Posted by randfish

One thing we collect for our biennial ranking factors survey is the opinions of a group of SEO experts (128 of them this year!) about the relative weights of the categories of ranking factors. In other words, how important each of those categories is for SEO relative to the others.

In today’s Whiteboard Friday, Rand explains some key takeaways from the results of that particular survey question. In addition, the pie chart below shows what the categories are and just where each of them ended up.

Whiteboard Friday – Weighting the Clusters of Ranking Factors in Google’s Algorithm

For reference, here’s a still of this week’s whiteboard and a fancy version of the chart from this week’s video!

Weighting of Thematic Clusters of Ranking Factors in Google

larger version

Video Transcription

Howdy, Moz fans, and welcome to another edition of Whiteboard Friday. This week I’m going to talk a little bit about the ranking factors survey that we did this year and specifically some of the results from that.

One of my favorite questions that we ask in our ranking factors survey, which happens every two years, goes out to a number of SEO experts. This year, 128 SEO experts responded, folks who were hand-chosen by us as being very, very knowledgeable in the field. We asked them, based on these thematic clusters of ranking elements, things like domain-level link authority versus page-level keyword-agnostic features: weight them for us. You know, give a percentage that you would assign if you were giving an overall assessment of the importance of this factor in Google’s ranking algorithm.

So this is opinion data. This is not fact. This is not actually what Google’s using. This is merely the aggregated collective opinions of a lot of smart people who study this field pretty well. This week what I want to do is run through what these elements are, the scores that people gave them, and then some takeaways, and I even have an exercise for you all at home or at the office as the case may be.

So interestingly, the largest portion that was given credit by the SEOs who answered this question was domain-level link authority. This is sort of the classic thing we think of in the Moz scoring system as domain authority, DA. They said 20.94%, which is fairly substantive. It was the largest one.

Just underneath that, page-level link features, meaning external links, how many, how high-quality, where are they coming from, those kinds of things for ranking a specific page.

Then they went to page-level keyword and content features. This isn’t just raw keyword usage (keyword in the title tag, how many times you repeat it on the page); it also covers content features, like whether they think Google is using topic modeling algorithms or semantic analysis models, those types of things. That would also fit in here. That was given about 15%, or 14.94% exactly.

At 9.8% we get to page-level keyword-agnostic features, and from there they all get pretty small; everything between here and here is between 5% and 10%. Page-level keyword-agnostic features might be things like how much content is on the page, to what degree Google might be analyzing the quality of that content, whether there are images on the page, stuff like this. “How fast does the page load” could go in there too.

Domain-level brand features. Does this domain, or the brand name associated with the website, get mentioned a lot on the Internet? Does the domain itself get mentioned around the Web, with lots of people writing about it and saying, “Moz.com, blah, blah, blah”?

User usage and traffic or query data. This one’s particularly fascinating, got an 8.06%, which is smaller but still sizeable. The interesting thing about this is I think this is something that’s been on the rise. In years past, it had always been under 5%. So it’s growing. This is things like: Are there lots of people visiting your website? Are people searching for your domain name, for your pages, for your brand name? How are people using the site? Do you have a high bounce rate or a lot of engagement on the site? All that kind of stuff.

Social metrics: Twitter, Facebook, Google+, etc. Then domain-level keyword usage, meaning things like: if I’m trying to rank for blue shoes, do I have blue shoes in the domain name, like blueshoes.com or blue-shoes.com? This is one that’s been declining.

Then domain-level keyword-agnostic features. This would be things like: What’s the length of the domain name registration? How long is the domain name? What’s the domain name extension? Other features like that, which aren’t related to the keywords but are related to the domain.

So, from this picture I think there are really some interesting takeaways, and I wanted to walk through a few of those that I’ve seen. Hopefully, it’s also helpful just to understand the thematic clusters themselves.

Number one: What we’re seeing year after year after year is complexity increasing. This picture has never gotten simpler any two years in a row that we’ve done this study. It’s never that one factor, you know, used to be smaller and now it’s kind of dominant and it’s just one thing. Years ago, I bet if we were to run this survey in 2001, it’d be like page rank, Pac-Man, everything else, little tiny chunk of Pac-Man’s mouth.

Number two: Links are still a big deal. Look here, right? I mean what we’re essentially seeing in this portion here is domain-level link authority and page-level link features, all of them. You could sort of think of this as maybe page authority being a proxy for this and domain authority being a proxy for this. That’s still a good 40% of how SEOs are perceiving Google’s algorithm. So links being a big important portion, but not the overwhelming portion.

It has almost always been the case in years past that the link features, when combined, were above 50%. So we’re seeing that they’re a big deal at both the page and domain level, just not as big or as overwhelming as they used to be, and I think this is reflected in people’s attitudes towards link acquisition, which is, “Hey, that’s still a really important practice. That’s still something I’m looking forward to and trying to accomplish.”

Number three: Brand-related and brand-driven metrics are on the rise. Take a look. Domain-level brand features plus user/usage/traffic/query data together comprise a percentage that actually exceeds page-level keyword and content features. This is really the branding world happening right here. So if you’re not building a brand on the Web, that could be seriously hurting your SEO, maybe to the same degree that not doing on-page optimization is. Actually, that’s a conclusion I personally would agree with as well.

Number four: Social is still perceived to have a minor impact despite some metrics to the contrary. Social you can see up here at 7.24%, which is reasonably small; it’s the third-smallest factor on there. And yet, when we look at how social metrics correlate with things that rank highly versus things that rank poorly, we’re seeing very high numbers, numbers that in many cases equal or exceed the link metrics we look at. So here at Moz we look at those and we go, “Well, obviously correlation does not imply causation.” It could be the case that there are other things Google’s measuring that just happen to perform well and happen to correlate quite nicely with social metrics, like +1s and shares and tweets and those kinds of things.

But certainly it’s surprising to us to see such a high correlation and such a low perception. My guess is, if I had to take a guess, what I’d say is that SEOs have a very hard time connecting these directly. Essentially, you go and you see a page that’s ranking number nine, and you think, “Hey, let me try to get a bunch of tweets and shares and +1s, and I’m going to acquire those in some fashion. Still ranking number nine. I don’t think social does all that much.” Versus, you go out and get links, and you can see the page kind of rising in the search results. You get good links from good places, from authoritative sites and many of them. Boom, boom, boom, boom. “I look like I’m rising; links are it.”

I think what might be missed there is that the content of the page, the quality of the page and the domain and the brand, and the amplification it can achieve from social are an integral part. I don’t know exactly how Google’s measuring that, and I’m not going to speculate on what they are or aren’t doing. The only thing they’ve told us specifically is, in effect, “We are not using just +1s to increase rankings, unless it’s personalized results, in which case maybe we are.” To me, that kind of hyper-specificity says there’s a bigger secret story hiding behind the more complex things that they are not saying they aren’t doing.

Number five, the last one: Keyword-based domain names, which I know have been kind of a darling of the SEO world (or historically a darling of the SEO world) and particularly of the affiliate marketing world for a long time, continue to shrink. You can see that in the correlation data. You can see it in the performance data. You can see it in the MozCast data set, which monitors sort of what appears in Google and doesn’t.

Our experience reinforces that. So remember Moz switched from the domain name SEOmoz, which had the keyword SEO right in there, to the Moz domain name not very long ago, and we did see kind of a rankings dive for a little while. Now almost all of those numbers are right back up where they were. So I think that’s (a) a successful domain shift, and I give huge credit to folks like Ruth Burr and Cyrus Shepard who worked so hard and so long on making that happen, Casey Henry too. But I think there’s also a story to be told there that having SEO in the domain name might not have been the source of as many rankings for SEO-related terms as we may have perceived it to be. I think that’s fascinating as well.

My recommendation, my suggestion to all of you, if you get the chance, try this. Go grab your SEO team or your SEO colleagues, buddies, friends in the field. Sit down in a room with a whiteboard or with some pen and paper. Don’t take a laptop in. Don’t use your phones. List out these features and go do this yourself. Go try making these percentages for what you think the algorithm actually looks like, what your team thinks the algorithm looks like, and then compare. What is it that’s the difference between kind of the aggregate of these numbers and the perception that you have personally or you have as a team?

I think that can be a wonderful exercise. It can really open up a great dialogue about why these things are happening. I think it’s some fun homework if you get a chance over the next week.

Until then, see you next week. Take care.

Video transcription by Speechpad.com

Sign up for The Moz Top 10, a semimonthly mailer updating you on the top ten hottest pieces of SEO news, tips, and rad links uncovered by the Moz team. Think of it as your exclusive digest of stuff you don’t have time to hunt down but want to read!

AdWords Debuts Offline Conversion Tracking For Full Sales Cycle Optimization

Running lead generation campaigns on Google AdWords? Measuring success and optimizing campaigns for actual sealed-deal conversions, not just lead intake, just got a whole lot easier. On the heels of rolling out cross-account conversion tracking and se…

SPONSOR MESSAGE: Enterprise SEO Tools 2013: A Buyer’s Guide

In this Buyer’s Guide, learn about the latest trends, opportunities and challenges facing the market for Enterprise SEO Tools. Drawing from interviews with industry leaders and their customers, this report profiles vendors in the SEO Tools marketplace and what you need to look for in an SEO…

Please visit Search Engine Land for the full article.

Foursquare Aims To Improve Venue Database By Expanding Superuser Program

Foursquare’s superuser program, now 40,000 people strong, is getting an overhaul. The company says it’ll soon launch an automated test that will make it easier for users to become superusers — they’re the ones with special privileges to edit business listings and venues in…

Please visit Search Engine Land for the full article.

Bing News Gets Easy Way To Spot Trending Stories On Twitter & Facebook

What are the top stories being shared on Facebook and Twitter? There’s a new way to get a sense of that. Bing News got a makeover today, including new “Trending” boxes that show news content that’s popular on both of those social networks. Bing News Now Features Trending…

Please visit Search Engine Land for the full article.

Search Privacy Low On List Of Privacy Concerns For US Internet Users

Despite the fact that search history can be so revealing about people’s desires and interests, a new survey of US internet users finds that concern over search privacy — while significant — still ranks far behind other online privacy issues. The survey by the Pew Research Center…

Please visit Search Engine Land for the full article.

Shouldn’t my Developer know SEO?

Dev Versus SEO
One problem that SEOs face is that we rely heavily on work carried out by other digital professionals: creatives, developers, PR people, and so on. So how do we overcome this?

Post on State of Search
Shouldn’t my Developer know SEO?

How To Automate Mobile Bid Multipliers — And Other New AdWords Scripts Tricks

September 5th marks the first anniversary of AdWords Scripts. I’ve been using them extensively to automate tasks ranging from reporting Quality Score to creating entire AdWords accounts from spreadsheets of products and services. Scripts have come a long way since they were launched, so to…

Please visit Search Engine Land for the full article.

The ultimate guide to… the 404 status code and SEO

Status codes

What is it?

Every page you visit on the Internet returns something called a ‘status code’: a three-digit code that communicates to the requester the status of their request for a particular page.

These can be set by the administrator of a server or be a default server communication based on certain criteria being met (or not met). The following are the most frequently returned status codes:

  • 200 OK: The page you are requesting has been found, and here it is.
  • 301 Moved Permanently: The page you have requested has moved permanently from the location you requested it from (Location A) to another location (Location B), and here it is.
  • 302 Found: The page you have requested has moved temporarily from the location you requested it from (Location A) to another location (Location B), and here it is.
  • 404 Not Found: The server you are requesting the page from has acknowledged your request, but the page you are requesting could not be found.

The last of these, the 404, is an ambiguous status code as the server cannot find what you are looking for but has made no attempt to contextualise why that might be.

Is it because the page was removed by the webmaster, or the URL was mistyped by a user? Is it because a malformed internal or external link was followed to the failed location from another website?

Or is it because the page was deleted or renamed, intentionally or unintentionally?

A 404 is ultimately an error message by default, and one of the most frequent and recognisable messages experienced by every Internet user.

You can check whether a URL is delivering a 404 response by using the ‘Fetch as Googlebot’ feature in Google Webmaster Tools, or with any of the many tools that will crawl your site and identify them all, such as Xenu.

The check is important because many people have 404 pages that look like 404 pages, complete with a standard ‘Something went wrong’ message, but the implementation was incorrect and the response actually delivered is a ‘200 OK’. That is, it looks like a 404 and reads like one, but technically isn’t, because the status code returned isn’t a 404.

That is called a ‘Soft 404’ and is far more common than you might think; even Intel and academic institutions have made that mistake.

Intel 404
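If you want to run that check programmatically rather than one URL at a time, a short script can fetch a suspect URL and report the status code the server actually returns. Here’s a minimal sketch in Python, assuming the third-party ‘requests’ library is installed and using a placeholder URL:

    # Minimal soft-404 check: fetch a URL and report the raw status code.
    import requests

    def status_of(url):
        # Don't follow redirects, so we see the first status code
        # the server returns for this exact URL.
        response = requests.get(url, allow_redirects=False, timeout=10)
        return response.status_code

    url = "http://www.example.com/a-page-that-should-not-exist"
    code = status_of(url)
    if code == 200:
        print("200 OK on an error page: a likely soft 404")
    else:
        print("Returned %d" % code)

A deliberately nonsense URL on your own site makes a quick test: if it comes back 200 instead of 404, your error pages are soft 404s.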

What are the SEO impacts?

Firstly, 404s are not inherently bad. They exist for a very good reason and the search engines expect to see them on most sites. Their ambiguous nature however means that search engines (and your users, and your rankings) will often benefit from some direction on what action to take when they come across them.

Without this direction and left unmanaged, 404 errors are problematic for two reasons:

Firstly, 404s often introduce link, page, and site integrity issues. At the most basic level, 404s on your site can break crawl paths and impact accessibility, and attempts to manage 404s often create even bigger problems, for example when SEOs and webmasters make poor decisions about where to 301 redirect them.

Furthermore, a search engine must make a judgement call on a site in its entirety if it is seeing a huge number of 404s as a percentage of all pages on the site.

Secondly, search engines allocate link equity across the pages of the Internet by following links from page to page. A 404 response breaks that chain, so a search engine needs to decide how to deal with that algorithmically.

Let’s call that a ‘link sink’, with the implication of a ‘sunk cost’ quite intentional, given the marketing effort and proactivity that may have earned a link which now ends in a 404 on your site.

With big sites, and those that may have accumulated a large number of 404 pages over time, the quantity of lost link juice may be substantial, and herding it is a legitimate and good use of your time, as is designing your default approach to 404s to pre-empt the most common problems.

Ultimately, SEOs and webmasters will very likely have existing 404 problems, deficiencies, and inefficiencies to resolve, but also need to put in place a robust infrastructure and process for it to be as self-maintaining and optimising as possible, particularly for huge sites.

What are the possible solutions?

SEOs and webmasters typically believe that they have four choices with regard to how to manage 404 pages.

1. Do nothing

Search engines are really smart these days, and some SEOs and webmasters believe there’s very little value in trying to manage 404s: assuming the site is configured properly, the search engines will pretty much take care of everything.

2. Use a soft 404 rather than a real one

The rationale here for many is that a real 404 cannot be fully exploited from an SEO perspective, because by its nature you are instructing a search engine to purge the page from its index.

With a soft 404 the page can contain links to your commercial pages and you can ‘funnel’ link equity around the site like the administrator of a complex aqueduct.

3. 301 redirect all 404 pages to the homepage

Some SEOs believe that there should be no 404s returned by the web server…ever.

This school of thought dictates that every 404 be 301 redirected to the homepage automatically, as and when it materialises, to preserve link equity and to give consumers a starting position if they come to the site via that 404.

4. 301 redirect all 404 pages to a related and relevant live page

As above, but with some logic to dictate where a 404 page should be redirected based on page relevance, funnelling link equity to pages that are arguably more appropriate than just the homepage.

In reality, the right answer is a combination of those four solutions, and the mix will differ depending on the site in question.

Guidelines to maximise SEO value

All solutions, however, must be consistent with the following guidelines to maximise SEO value:

1. Do not use soft 404s, and test your 404s to make sure they have been implemented correctly.

Alternatively, you can use a 410 status code rather than a 404. In 2007, Google suggested that it treated 404 and 410 identically; by 2009, it suggested that they were treated differently and that a 410 may expedite the purge.

At the very least, they’ll be deemed comparable in intent. If you want to create a custom 404 page, you can do so while still returning a real 404.

You may be tempted to create a custom page that is quirky and innovative so that it can attract links, whose link juice can then be funnelled around the site via links on that custom 404.

This would only work if it were a soft 404 page, not a real one (as the search engines won’t follow links from a real 404 page).

Whilst these can be incredibly cool, they do not come with the other benefits of using real 404s (automatic housekeeping, link juice preservation, intelligent redirection for consumers, etc, etc).

So, if you want a custom, novelty 404, just make sure it returns a real 404 status code, but be willing to forego any links that it might attract; consider it viral marketing, as opposed to SEO marketing.

2. 404 pages that receive traffic should be redirected

Redirect them to the page most appropriate to the original topic, one that also won’t jar with human users if and when they are redirected.

A good way of doing this is to put a search box on your 404 page and see what people search for after arriving there, which will help determine where those people should be redirected.

Always remember that in many cases you aren’t just redirecting pages and search engines, but real people, with real money, and real buying intent.

3. 404 pages that have inbound links from other websites should be redirected to pages that are consistent with the anchor text mix of the links 

If there isn’t a page whose content is consistent with the inbound anchor text mix, then 301 redirect the 404 to your sitemap page instead (or to the homepage, if the anchor text profile is consistent with it).
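To make that concrete, here is a hypothetical sketch of the matching step in Python. The data structures are assumptions: the anchors would come from your link data, and each candidate page’s terms from your own page inventory.

    # Score candidate live pages by how well their target terms overlap
    # the anchor text pointing at the dead URL.
    from collections import Counter

    def pick_redirect_target(inbound_anchors, candidate_pages):
        anchor_terms = Counter(
            term
            for anchor in inbound_anchors
            for term in anchor.lower().split()
        )
        best_url, best_score = None, 0
        for url, terms in candidate_pages.items():
            score = sum(anchor_terms[term] for term in terms)
            if score > best_score:
                best_url, best_score = url, score
        return best_url  # None means: fall back to the sitemap

    anchors = ["blue shoes", "cheap blue shoes", "buy blue shoes online"]
    pages = {"/shoes/blue": {"blue", "shoes"}, "/shoes/red": {"red", "shoes"}}
    print(pick_redirect_target(anchors, pages) or "/sitemap")  # /shoes/blue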

4. Leave 404 pages that have no traffic or link value as they are and the search engines will purge them from the index.

If speed is of the essence then a 410 may expedite matters. Remove all links to those pages from your site though to conserve link equity and improve your user experience. 

Bespoking an approach based on the guidelines above can be done in a number of ways, including building a custom 404 handler.

This is a method of adding your own custom code to how your server deals with 404s, including conditional logic that runs before the 404 message is returned. For example, you could code your 404 handler to look up the requested URL in a database of redirect mappings to determine where to send it.

You could even have the 404 handler check whether the URL receives traffic: if it does, redirect it to the page with the closest anchor text link profile; if it receives neither traffic nor links, leave it be.
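Here is a minimal sketch of such a handler in Python, using Flask purely for illustration; the redirect map and the traffic/link check are hypothetical stand-ins for the database and analytics lookups described above:

    from flask import Flask, redirect, request

    app = Flask(__name__)

    # Hypothetical mapping of dead URLs to their most relevant live pages,
    # e.g. chosen using the anchor-text matching sketched earlier.
    REDIRECT_MAP = {
        "/old-blue-shoes": "/shoes/blue",
    }

    def has_traffic_or_links(path):
        # Stub: in practice this would query your analytics and link data.
        return False

    @app.errorhandler(404)
    def handle_missing_page(error):
        path = request.path
        if path in REDIRECT_MAP:
            # A known, relevant home for this URL: 301 preserves link equity.
            return redirect(REDIRECT_MAP[path], code=301)
        if has_traffic_or_links(path):
            # Valuable but unmapped: fall back to the sitemap page.
            return redirect("/sitemap", code=301)
        # No traffic, no links: return a real 404 and let it be purged.
        return "Page not found", 404

The important property is the final branch: URLs with no traffic or link value still return a genuine 404 (or 410), so the automatic housekeeping described above happens on its own.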

There are practically no limitations to the power of a customised 404 handler other than determining what the effort versus benefit might be of the coding effort.

Typically, it is monstrously large and complex websites that benefit most from that level of automated intelligence. All sites, large or small, will however benefit from an optimal and consistent approach to the management of 404s.

Pros and cons of 404 solutions

How to Recover When Your Content Is Stolen

Last month, I shared a case study of a client I’m currently working with on a duplicate content issue. It turns out that this particular site had significantly lost rankings over the past year because of other sites “lifting” their content, often verbatim, causing the…

Please visit Search Engine Land for the full article.