These notes are largely about how to make a high-quality website, dealing with Google Panda and other quality updates.
They touch on improving user experience (UX) and what you need to know if you are serious about building an effective flagship site that will perform in Google in 2016.
Google is on record that ‘user experience’ itself is not a ranking ‘factor‘, but poor UX may still seriously affect how your website performs in Google this year.
Once you have an optimised page on which to publish it, high-quality content is the number 1 user experience area to focus on across most websites if they are to earn traffic from Google in 2016.
Google Quality Rater Guidelines 2015
I spent a lot of time reading through the leaked 2014 and 2015 Google Quality Rater Guidelines when they came out.
I did this because of some strange things I was seeing as I reviewed sites and pages with Panda-like problems, to see if the answer to some of this lay in MACHINE IDENTIFIABLE Google Panda signals.
This document is now well-cited around the net in various sources.
It’s an incredibly important document for all Webmasters because it gives you an idea that you ARE up against an ARMY of human website reviewers who are rating how user-friendly your website is.
Quality raters are also grading the quality and trustworthiness of your business (from signals they can detect online, starting with your company website).
Quality Raters Do Not Directly Impact YOUR site
Ratings from evaluators do not determine individual site rankings. GOOGLE
While Google is on record as stating these quality raters do not directly influence where you rank (without more senior analysts making a call on the quality of your website, I presume?) – there are some things in this document, mostly of a user experience nature (UX) that all search engine optimisers and Webmasters of any kind should note going forward.
From what I’ve seen of the recent Panda drops, an unsatisfying user experience signal can impact rankings even on a REPUTABLE DOMAIN and even with SOUND, SATISFYING CONTENT.
Quality Bar – Always Rising – Always Relative?
You’ve got to imagine all these quality ratings are getting passed along to the engineers at Google in some form (at some stage) to improve future algorithms – and identify borderline cases.
This is the ‘quality’ bar I’ve mentioned a couple of times in past posts.
Google is always raising the bar – always adding new signals, sometimes, in time, taking signals away.
It helps them
- satisfy users
- control the bulk of transactional web traffic.
That positioning has always been a win-win for Google – and a recognisable strategy from them after all these years.
Take unnatural links out of the equation (which have a history of trumping most other signals) and you are left with page level, site level and off-site signals.
All will need to be addressed to insulate against Google Panda (if that can ever be fully successful, against an algorithm that is modified to periodically “ask questions” of your site and overall quality score).
Google holds different types of sites to different standards for different kinds of keywords – which would suggest not all websites need all signals satisfied to rank well in SERPs – not ALL THE TIME.
OBSERVATION – You can have the content and the links – but if your site falls short on even a single user satisfaction signal (even if it is picked up by the algorithm, and not a human reviewer) then your rankings for particular terms could collapse – OR – rankings can be held back – IF Google thinks your organisation, with its resources or ‘reputation’, should be delivering… a little more to users.
OBSERVATION: A site often rises (in terms of traffic numbers) in Google, before a Panda ‘penalty’. Maybe your site has been rated more in-depth – however that has been done.
- Perhaps a problem was only identified on a closer look?
- Perhaps a closer look is only initiated when you’re ‘ticking boxes’ in Google and getting high marks – and more traffic – for each check until your site hits a PROBLEM in an area of damaging user experience, for instance.
- Perhaps the introduction of a certain SEO technique initially artificially raised your rankings for your pages in a way that Google’s algorithms do not like, and once that problem is spread out throughout your site, traffic begins to deteriorate or is slammed in an algorithm update.
Google says about the guide:
“Note: Some Webmasters have read these rating guidelines and have included information on their sites to influence your Page Quality rating!”
Surely – that’s NOT a bad thing, to make your site HIGHER QUALITY and correctly MARKETING your business to customers – and search quality raters, in the process. Black hats will obviously fake all that (which is why it would be self-defeating of me to publish a handy list of signals to manipulate SERPs that’s not just unnatural links).
Businesses that care about their performance in Google organic search should be noting ALL the following points very carefully.
This isn’t about manipulating quality Raters – it is about making it EASY for them to realise you are a REAL business, with a GOOD REPUTATION, and have a LEVEL of EXPERTISE you wish to share with the world.
The aim is to create a good user experience, not fake it:
What is the PURPOSE of your page?
Is it to “sell products or services”, “to entertain” or “to share information about a topic”?
MAKE THE PURPOSE OF YOUR PAGE SINGULAR and OBVIOUS to help quality raters and algorithms.
The name of the game in 2016 (if you’re not faking everything) is VISITOR SATISFACTION.
If a visitor lands on your page – are they satisfied and can they successfully complete WHY they are there?
What Are YMYL Pages?
Google classifies web pages that “potentially impact the future happiness, health, or wealth of users” as “Your Money or Your Life” pages (YMYL) and holds these types of pages to higher standards than, for instance, hobby and informational sites.
Essentially, if you are selling something to visitors or advising on important matters like finance, law or medical advice – your page will be held to this higher standard.
Main Content (MC) of a Page
What Is Google Focused On?
Google is VERY interested in the MAIN CONTENT of a page, the SUPPLEMENTARY CONTENT of a page and HOW THAT PAGE IS monetised, and if that monetisation impacts the user experience of consuming the MAIN CONTENT.
Be careful optimising your site for CONVERSION first, if that gets in the way of the main content.
Google also has a Page Layout Algorithm that demotes pages with a lot of ads “above the fold”, or pages where users have to scroll past advertisements to get to the main content.
High-quality supplementary content should “(contribute) to a satisfying user experience on the page and website.“
Google says, “(Main CONTENT) is (or should be!) the reason the page exists.” so this is probably the most important part of the page, to Google.
An example of “supplementary” content is “navigation links that allow users to visit other parts of the website” and “footers” and “headers.”
How Reputable & User-Friendly Is Your Website?
Help quality raters EASILY research the reputation of your website, if you have any history.
Make “reputation information about the website” easy to access for a quality rater, as judging the reputation of your website is a large part of what they do.
You will need to monitor, or influence, ‘independent’ reviews about your business – because if they are negative – Google will “trust the independent sources”.
Consider a page that highlights your good press, if you have any.
- Google will consider “positive user reviews as evidence of positive reputation”, so come up with a way to get legitimate positive reviews – and Google itself would be a good place to start.
- Google states, “News articles, Wikipedia articles, blog posts, magazine articles, forum discussions, and ratings from independent organizations can all be sources of reputation information”, but they also state specifically that boasts about a lot of internet traffic, for example, should not influence the quality rating of a web page. What should influence the reputation of a page is WHO has shared it on social media etc. rather than just raw numbers of shares. CONSIDER CREATING A PAGE with nofollow links to good reviews on other websites as proof of excellence.
- Google wants quality raters to examine sub pages of your site and often “the URL of its associated homepage” so ensure your home page is modern, up to date, informative and largely ON TOPIC with your internal pages.
- Google wants to know a few things about your website, including:
- Who is moderating the content on the site
- Who is responsible for the website
- Who owns copyright of the content
- Business details (which is important to have synced and accurate across important social media profiles)
- When was this content updated?
- Be careful syndicating other people’s content. Algorithmic duplicate problems aside…..if there is a problem with that content, Google will hold the site it finds content on as ‘responsible’ for that content.
- If you take money online, in any way, you NEED to have an accessible and satisfying ‘customer service’ type page. Google says, “Contact information and customer service information are extremely important for websites that handle money, such as stores, banks, credit card companies, etc. Users need a way to ask questions or get help when a problem occurs. For shopping websites, we’ll ask you to do some special checks. Look for contact information—including the store’s policies on payment, exchanges, and returns. “ Google urges quality raters to be a ‘detective’ in finding this information about you – so it must be important to them.
- Keep webpages updated regularly and let users know when the content was last updated. Google wants raters to “search for evidence that effort is being made to keep the website up to date and running smoothly.“
- Google quality raters are trained to be sceptical of any reviews found. It’s normal for all businesses to have mixed reviews, but “Credible, convincing reports of fraud and financial wrongdoing is evidence of extremely negative reputation“.
- Google asks quality raters to investigate your reputation by searching, giving the example “[“ibm.com” reviews -site:ibm.com]: A search on Google for reviews of “ibm.com” which excludes pages on ibm.com.” So do that search yourself and judge for yourself what your reputation is. Very low ratings on independent websites could play a factor in where you rank in the future, with Google stating clearly that it considers “very low ratings on the BBB site to be evidence for a negative reputation”. Other sites mentioned to review your business include YELP and Amazon. Often – using rich snippets containing schema.org information – you can get Google to display user ratings in the actual SERPs. I noted you can get ‘stars in SERPs’ within two days after I added the code (March 2014).
- If you can get a Wikipedia page – get one! Keep it updated too. The rest of us will just need to work harder to prove we are real businesses that have earned our rankings.
- If you have a lot of NEGATIVE reviews – expect to be treated as a business with an “Extremely negative reputation” – and back in 2013 – Google mentioned they had an algorithm for this, too. Google has said the odd bad review is not what this algorithm looks for, as bad reviews are a natural part of the web.
- For quality raters, Google has a Page Quality Rating Scale with 5 rating options along a spectrum of “Lowest, Low, Medium, High, and Highest.”
- Google says “High-quality pages are satisfying and achieve their purpose well” and have lots of “satisfying” content, written by an expert or authority in their field – they go on to include “About Us information” pages, and easy to access “Contact or Customer Service information, etc.”
- Google is looking for a “website that is well cared for and maintained” so you need to keep content management systems updated, check for broken image links and HTML links. If you create a frustrating user experience through sloppy website maintenance – expect that to be reflected in some way with a lower quality rating. Google Panda October 2014 went for e-commerce pages that were optimised ‘the old way’ and are now classed as ‘thin content’.
- Google wants raters to navigate your site and ‘test’ it out to see if it is working. They tell raters to check your shopping cart function is working properly, for instance.
- Google expects pages to “be edited, reviewed, and updated on a regular basis” especially if they are for important issues like medical information, and states not all pages are held to such standards, but one can expect that Google wants information updated in a reasonable timescale. How reasonable is dependent on TOPIC and the PURPOSE of this web page – RELATIVE to competing pages.
- Google wants to rank pages by expert authors, not from content farms.
- You can’t have a great piece of content on a site with a negative reputation and expect it to perform well. A “High rating cannot be used for any website that has a convincing negative reputation.”
- A very positive reputation can lift your content from “medium” to “high-quality“.
- Google doesn’t care about ‘pretty‘ over substance and clearly instructs raters to “not rate based on how “nice” the page looks“.
- Just about every webpage should have a CLEAR way to contact the site manager to achieve a high rating.
- Highlighting ads in your design is BAD practice, and Google gives clear advice to rate the page LOW – Google wants you to optimise for A SATISFYING EXPERIENCE FIRST, CONVERSION SECOND! Conversion optimisers especially should take note of this, and aren’t we all?
- Good news for web designers, content managers and search engine optimisers! Google clearly states, “If the website feels inadequately updated and inadequately maintained for its purpose, the Low rating is probably warranted.” although it does stipulate again that it’s horses for courses… if everybody else is crap, then you’ll still fly – though there are not many of those SERPs about these days.
- If your intent is to deceive, be malicious or present pages with no purpose other than to monetise free traffic with no value add – Google is not your friend.
- Domains that are ‘related’ in Whois can lead to a low-quality score, so be careful how you direct people around multiple sites you own.
- Keyword stuffing your pages is not recommended, even if you do get past the algorithms.
- Quality raters are on the lookout for content that is “copied with minimal alteration” and crediting the original source is not a way to get around this. Google rates this type of activity low-quality.
- How can Google trust a page if it is blocked from it or from reading critical elements that make up that page? Be VERY careful blocking Google from important directories (blocking CSS and .js files is very risky these days). REVIEW your ROBOTS.txt and know exactly what you are blocking and why you are blocking it.
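As a rough illustration of the robots.txt review in the last point above, here is a minimal sketch in Python using the standard library `urllib.robotparser`. The robots.txt rules, the example.com URLs and the `blocked_urls` helper are all hypothetical, made up for this example, and Python's parser is not guaranteed to interpret every rule exactly as Googlebot does, so treat the output as a prompt for a closer look rather than a verdict.

```python
from typing import List
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: List[str], user_agent: str = "Googlebot") -> List[str]:
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical robots.txt that blocks a whole assets directory.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /assets/
Disallow: /admin/
"""

# Hypothetical resources a page needs in order to render properly.
EXAMPLE_RESOURCES = [
    "https://www.example.com/assets/site.css",
    "https://www.example.com/assets/app.js",
    "https://www.example.com/blog/post.html",
]

if __name__ == "__main__":
    for url in blocked_urls(EXAMPLE_ROBOTS, EXAMPLE_RESOURCES):
        print("Blocked for Googlebot:", url)
```

In this made-up case the stylesheet and the script are both caught by the `/assets/` rule, which is exactly the kind of accidental block worth knowing about.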
Ratings Can Be Relative
It’s important to note your website quality is often judged on the quality of competing pages for this keyword phrase.
SEO is still a horserace.
A lot of this is all RELATIVE to what YOUR COMPETITION are doing.
Big sites v small sites?
Sites with a lot of links v not a lot of links?
Big companies with a lot of staff v small companies with a few staff?
Do sites at the top of Google get asked more of? Algorithmically and manually? Just…. because they are at the top?
Whether it’s algorithmic or manual – based on technical, architectural, reputation or content – Google can decide and will decide if your site meets its quality requirements to rank on page one.
The likelihood of you holding a stable number one ranking is almost non-existent in any competitive niche where you have more than a few players aiming to rank number one.
Not en-masse, not unless you are bending the rules.
My own strategy for visibility over the last few years has been to avoid focusing entirely on ranking for particular keywords and rather improve the search experience of my entire website.
The entire budget of my time went on content improvement, content reorganisation, website architecture improvement, and lately, mobile experience improvement.
I have technical improvements to speed, usability and accessibility in the pipeline.
In simple terms I took thin content and made it fat to make old content perform better.
Unsurprisingly, ranking fat content comes with its own challenges as the years go by.
Can Thin Content Still Rank In Google?
Ranking top depends on the query and level of competition for the query.
Google’s high-quality recommendations are often for specific niches and specific searches as most of the web would not meet the very highest requirements.
Generally speaking – real quality will stand out, in any niche with a lack of it, at the moment.
The time it takes for this to happen (at Google’s end) leaves a lot to be desired in some niches.
TIME – something Google has an almost infinite supply of compared to 99% of the businesses on the planet – is on Google’s side.
That’s for another post, though….
What Are The High-Quality Characteristics of a Web Page?
The following are examples of what Google calls ‘high-quality characteristics’ of a page and should be remembered:
- “A satisfying or comprehensive amount of very high-quality” main content (MC)
- Copyright notifications up to date
- Functional page design
- Page author has Topical Authority
- High-Quality Main Content
- Positive Reputation or expertise of website or author (Google yourself)
- Very helpful SUPPLEMENTARY content “which improves the user experience.“
- Google wants to reward ‘expertise’ and ‘everyday expertise’ or experience so make this clear (Author Box?)
- Accurate information
- Ads can be at the top of your page as long as it does not distract from the main content on the page
- Highly satisfying website contact information
- Customised and very helpful 404 error pages
- Evidence of expertise
- Attention to detail
If Google can detect investment in time and labour on your site – there are indications that they will reward you for this (or at least – you won’t be affected when others are, meaning you rise in Google SERPs when others fall).
What Characteristics Do The Highest Quality Pages Exhibit?
You obviously want the highest quality ‘score’ but looking at the guide that is a lot of work to achieve.
Google wants to rate you on the effort you put into your website, and how satisfying a visit is to your pages.
- “Very high or highest quality MC, with demonstrated expertise, talent, and/or skill.“
- “Very high level of expertise, authoritativeness, and trustworthiness (page and website) on the topic of the page.”
- “Very good reputation (website or author) on the topic of the page.”
At least for competitive niches where Google intends to police this quality recommendation, Google wants to reward high-quality pages and “the Highest rating may be justified for pages with a satisfying or comprehensive amount of very high-quality” main content.
If your main content is very poor, with “grammar, spelling, capitalization, and punctuation errors“, or not helpful or trustworthy – ANYTHING that can be interpreted as a bad user experience – you can expect to get a low rating.
“We will consider content to be Low quality if it is created without adequate time, effort, expertise, or talent/skill. Pages with low-quality (main content) do not achieve their purpose well.”
Note – not ALL thin pages are low-quality.
If you can satisfy the user with a page thin on content – you are ok (but probably more susceptible to someone building a better page than yours, I’d say).
This is a good article about long clicks.
Google expects more from big brands than they do from your store (but that does not mean you shouldn’t be aiming to meet ALL these high-quality guidelines above).
If you violate Google Webmaster recommendations for performance in their indexes of the web – you automatically get a low-quality rating.
Poor page design and poor main content and too many ads = you are toast.
If a rater is subject to a sneaky redirect – they are instructed to rate your site low.
What Are The Low-Quality Signals Google Looks For?
These include but are not limited to:
- Lots of spammy comments
- Low quality content that lacks E-A-T signals (Expertise + Authority + Trust)
- NO Added Value for users
- Poor page design
- Malicious, harmful or deceptive practices detected
- Negative reputation
- Auto generated content
- No website contact information
- Fakery or INACCURATE information
- Website not maintained
- Pages just created to link to others
- Pages lack purpose
- Keyword stuffing
- Inadequate customer service pages
- Sites that use practices Google doesn’t want you to use
Pages can get a neutral rating too.
Pages that have “Nothing wrong, but nothing special” about them don’t “display characteristics associated with a High rating”, which puts you in the middle ground – probably not a sensible place to be a year or so down the line.
Pages Can Be Rated ‘Medium Quality’
Quality raters will give content a Medium rating when the author or entity responsible for it is unknown.
If you have multiple editors contributing to your site, you had better have a HIGH EDITORIAL STANDARD.
One could take from all this that Google Quality raters are out to get you if you manage to get past the algorithms, but equally, Google quality raters could be friends you just haven’t met yet.
Somebody must be getting rated highly, right?
Impress a Google Quality rater and get a high rating.
If you are a spammer you’ll be pulling out the stops to fake this, naturally, but this is a chance for real businesses to put their best foot forward and HELP quality raters correctly judge the size and relative quality of your business and website.
Real reputation is hard to fake – so if you have it – make sure it’s on your website and is EASY to access from contact and about pages.
The quality raters handbook is a good training guide for looking for links to disavow, too.
It’s pretty clear.
Google organic listings are reserved for ‘remarkable’ and ‘reputable’ content, expertise and trusted businesses.
A high bar to meet – and one that is designed for you to never quite meet unless you are serious about competing, as there is so much work involved.
I think the implied message is: call your Adwords rep if you are an unremarkable business.
Google Quality Algorithms
Google has numerous algorithm updates during a year. The May 2015 Google Quality Algorithm Update known as the Phantom 2 Update or the Quality Update is very reminiscent of Google Panda – and focuses on similar ‘low-quality’ SEO techniques we have been told Panda focuses on.
Google Panda Updates
The last Google Panda update I researched in depth (and the one I have most experience in dealing with) is:
- July 2015 Google Panda 4.2 (a rolling update set to last months, we are told)
My technical SEO audit identifies most of my findings if present on a site.
I’ve worked on quite a few sites that got hit in October 2014 and May 2015, and that have suffered a loss in traffic since July 2015 – and the similarities between them and the extent of their problems are striking and enlightening – at least in part.
There is little chance of Google Panda recovery if you do not take action to deal with what the Google Panda algorithm has identified as ‘a poor user experience’ on your website.
The problem is usually to do with content.
Likewise, old SEO techniques need to be cleaned up to avoid related Google Quality Algorithms.
What Is Google Panda?
Google Panda aims to rate the quality of your pages and website and is based on things about your site that Google can rate, or algorithmically identify.
We are told the current Panda is an attempt to basically stop low-quality thin content pages ranking for keywords they shouldn’t rank for.
Panda evolves – signals can come and go – and Google can get better at determining quality, as a spokesman from Google has confirmed:
“So it’s not something where we’d say, if your website was previously affected, then it will always be affected. Or if it wasn’t previously affected, it will never be affected.… sometimes we do change the criteria…. category pages…. (I) wouldn’t see that as something where Panda would say, this looks bad.… Ask them the questions from the Panda blog post….. usability, you need to work on.“ John Mueller.
In my notes about Google Penguin, I list the original, somewhat abstract, Panda ranking ‘factors’ published as a guideline for creating high-quality pages. I also list these Panda points below:
(PS – I have emphasised two of the bullet points below, at the top and bottom because I think it’s easier to understand these points as a question, how to work that question out, and ultimately, what Google really cares about – what their users think.)
- Would you trust the information presented in this article? (YES or NO)
- Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature? EXPERTISE
- Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations? LOW QUALITY CONTENT/THIN CONTENT
- Would you be comfortable giving your credit card information to this site? (HTTPS? OTHER TRUST SIGNALS (CONTACT / ABOUT / PRIVACY / COPYRIGHT etc.))
- Does this article have spelling, stylistic, or factual errors? (SPELLING + GRAMMAR + CONTENT QUALITY – perhaps wrong dates in content, on old articles, for instance)
- Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines? (OLD SEO TACTICS|DOORWAY PAGES)
- Does the article provide original content or information, original reporting, original research, or original analysis? (ORIGINAL RESEARCH & SATISFYING CONTENT)
- Does the page provide substantial value when compared to other pages in search results? (WHAT’S THE RELATIVE QUALITY OF COMPETITION LIKE FOR THIS TERM?)
- How much quality control is done on content? (WHEN WAS THIS LAST EDITED? Is CONTENT OUTDATED? IS SUPPLEMENTARY CONTENT OUTDATED (External links and images?))
- Does the article describe both sides of a story? (IS THIS A PRESS RELEASE?)
- Is the site a recognized authority on its topic? (EXPERTISE)
- Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care? (IS THIS CONTENT BOUGHT FROM A $5 per article content factory? Or is it written by an EXPERT or someone with a lot of EXPERIENCE of the subject matter?)
- Was the article edited well, or does it appear sloppy or hastily produced? (QUALITY CONTROL on EDITORIALS)
- For a health related query, would you trust information from this site? (EXPERTISE NEEDED)
- Would you recognize this site as an authoritative source when mentioned by name? (EXPERTISE NEEDED)
- Does this article provide a complete or comprehensive description of the topic? (Is the page text designed to help a visitor or shake them down for their cash?)
- Does this article contain insightful analysis or interesting information that is beyond obvious? (LOW QUALITY CONTENT – You know it when you see it)
- Is this the sort of page you’d want to bookmark, share with a friend, or recommend? (Would sharing this page make you look smart or dumb to your friends? This should be reflected in social signals)
- Does this article have an excessive amount of ads that distract from or interfere with the main content? (OPTIMISE FOR SATISFACTION FIRST – CONVERSION SECOND – do not let the conversion get in the way of satisfying the INTENT of the page. For example – if you rank with INFORMATIONAL CONTENT with a purpose to SERVE those visitors – the visitor should land on your destination page and not be diverted from the PURPOSE of the page – and that was informational, in this example – to educate. SO – educate first – beg for social shares on those articles – and leave the conversion to merit and slightly more subtle influences rather than massive banners or whatever else annoys users). We KNOW ads (OR DISTRACTING CALLS TO ACTION) convert well at the tops of articles – but Google says it is sometimes a bad user experience. You run the risk of Google screwing with your rankings as you optimise for conversion so be careful and keep everything simple and obvious.
- Would you expect to see this article in a printed magazine, encyclopedia or book? (Is this a HIGH QUALITY article?)… no? then….
- Are the articles short, unsubstantial, or otherwise lacking in helpful specifics? (Is this a LOW or MEDIUM QUALITY ARTICLE? LOW WORD COUNTS ACROSS PAGES?)
- Are the pages produced with great care and attention to detail vs. less attention to detail? (Does this page impress?)
- Would users complain when they see pages from this site? (WILL THIS PAGE MAKE GOOGLE LOOK STUPID IF IT RANKS TOP?)
All that sits quite nicely with information you can view in the Quality rating guidelines.
If you fail to meet these standards (even some) your rankings can fluctuate wildly (and often, as we are told Google updates Panda every month, and you can often spot an update rolling in).
It all probably correlates quite nicely too, with the type of sites you don’t want links from.
Google is raising the quality bar, and forcing optimisers and content creators to spend HOURS, DAYS or WEEKS longer on websites if they ‘expect’ to rank HIGH in natural results.
If someone is putting the hours in to rank their site through legitimate efforts – Google will want to reward that – because it keeps the barrier to entry HIGH for most competitors.
The higher it is – the more attractive an option Adwords becomes for businesses.
When Google does not reward effort – a new black hat is born, I’d say.
Google might be asking things of YOUR site it is NOT ASKING of competing sites once it’s worked out whatever your RELATIVE AUTHORITY AND SIZE are in comparison to these other sites.
I **think** I have actually spotted this when testing for ‘quality signals’ on a granular level that are machine identifiable.
Thresholds that are detected ALGORITHMICALLY and MANUALLY in, probably, a self-feeding cycle to improve algorithms.
I mean – Google says a quality rater does not affect your site, but if your site gets multiple LOW QUALITY notices from manual reviewers – that stuff is coming back to get you later, surely.
Identifying Which Pages On Your Own Site Hurt Or Help Your Rankings
Separating the wheat from the chaff.
Being ‘indexed’ is important. If a page isn’t indexed, the page can’t be returned by Google in Search Engine Results Pages.
While getting as many pages indexed in Google was historically a priority for an SEO, Google is now rating the quality of pages on your site and the type of pages it is indexing. So bulk indexation is no guarantee of success – in fact, it’s a risk in 2016 to index all pages on your site, especially if you have a large, sprawling site.
If you have a lot of low-quality pages (URLs) indexed on your site compared to high-quality pages (URLs)…. Google has told us it is marking certain sites down for that.
Some URLs are just not welcome to be indexed as part of your website content anymore.
Do I need to know which pages are indexed?
No. Knowing is useful, of course, but largely unnecessary. Indexation is never a guarantee of traffic.
Some SEOs tend to scrape Google to get indexation data on a website. I’ve never bothered with that. Most sites I work with have XML sitemap files, so an obvious place to start to look at such issues is Google Search Console.
Google will tell you how many pages you have submitted in a sitemap, and how many pages are indexed. It will not tell you which pages are indexed, but if there is a LARGE discrepancy between SUBMITTED and INDEXED, it’s very much worth digging deeper.
If Google is de-indexing large swaths of your content that you have actually submitted as part of an xml sitemap, then a problem is often afoot.
Unfortunately, with this method you don’t get to see the pages the CMS produces outside the XML sitemap – so this is not a full picture of the ‘health’ of your website.
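The submitted-versus-indexed check above comes down to simple arithmetic, which a minimal Python sketch can illustrate. The function name, the example counts and the 20% cut-off for what counts as a LARGE discrepancy are all assumptions made for illustration; Google does not publish a threshold, and the two counts are read by hand from the sitemap report.

```python
def indexation_gap(submitted: int, indexed: int, threshold: float = 0.20) -> dict:
    """Compare URLs submitted in a sitemap with URLs reported as indexed.

    `threshold` is an assumed cut-off for a "large" gap, not a Google figure.
    """
    if submitted <= 0:
        raise ValueError("submitted must be a positive count")
    # Indexed can never usefully exceed submitted here, so clamp at zero.
    missing = max(submitted - indexed, 0)
    gap = missing / submitted
    return {
        "missing": missing,          # URLs submitted but not indexed
        "gap": gap,                  # share of submitted URLs not indexed
        "worth_digging": gap >= threshold,
    }


# Hypothetical counts read from the sitemap report.
report = indexation_gap(submitted=5000, indexed=3200)
print(report)
```

A gap on its own proves nothing; it only tells you where to start digging.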
Identifying Dead Pages
I usually start with a performance analysis that involves merging data from a physical crawl of a website with analytics data and Webmaster Tools data. A content type analysis will identify the type of pages the CMS generates. A content performance analysis will gauge how well each section of the site performs.
If you have 100,000 pages on a site, and only 1,000 pages get organic traffic from Google over a 3-6 month period – you can make the argument 99% of the site is rated as ‘crap’ (at least as far as Google rates pages these days).
I group pages like these together as ‘dead pages‘ for further analysis. Deadweight, ‘dead’ for short.
The thinking is if the pages were high-quality, they would be getting some kind of organic traffic.
Identifying which pages receive no organic visitors over a sensible timeframe is a quick, if noisy, way to separate pages that obviously WORK from pages that DON’T – and will help you clean up a large portion of redundant URLs on the site.
It helps to see page performance in the context of longer timeframes as some types of content can be seasonal, for instance, and produce false positives over a shorter timescale. It is important to trim content pages carefully – and there are nuances.
Experience can educate you when a page is high-quality and yet receives no traffic. If the page is thin, but is not manipulative, is indeed ‘unique’ and delivers on a purpose with little obvious detectable reason to mark it down, then you can say it is a high-quality page – just with very little search demand for it. Ignored content is not the same as ‘toxic’ content.
False positives aside, once you identify the pages receiving no traffic, you very largely isolate the type of pages on your site that Google doesn’t rate – for whatever reason. A strategy for these pages can then be developed.
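The ‘dead pages’ grouping described above can be sketched in a few lines of Python, assuming you already have a list of crawled URLs and an export of organic sessions per URL. The function names and the sample URLs are hypothetical; any crawler or analytics export would do.

```python
def find_dead_pages(crawled_urls, organic_sessions):
    """Return crawled URLs that received no organic sessions in the period.

    crawled_urls: iterable of URLs found by an actual crawl of the site.
    organic_sessions: dict of URL -> organic sessions from analytics.
        URLs with no traffic are usually just absent from this export,
        which is why the crawl list is needed as the starting point.
    """
    return sorted(
        url for url in set(crawled_urls)
        if organic_sessions.get(url, 0) == 0
    )


def dead_share(crawled_urls, organic_sessions):
    """Share of crawled URLs that are 'dead' (0.0 to 1.0)."""
    crawled = set(crawled_urls)
    if not crawled:
        return 0.0
    return len(find_dead_pages(crawled, organic_sessions)) / len(crawled)


# Hypothetical crawl and analytics data for a 3-6 month window.
crawl = ["/", "/widgets", "/widgets?sort=price", "/tag/blue", "/about"]
sessions = {"/": 420, "/widgets": 95, "/about": 3}

dead = find_dead_pages(crawl, sessions)
print(dead)
print(f"{dead_share(crawl, sessions):.0%} of crawled URLs are 'dead'")
```

With 100,000 crawled URLs and only 1,000 receiving organic traffic, the same calculation gives the 99% figure mentioned earlier. The output is a list for review, not a delete list: seasonal and low-demand pages will show up as false positives.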
Identifying Content That Can Potentially Hurt Your Rankings
As you review the pages, you’re probably going to find pages that include:
- out of date, overlapping or irrelevant content
- collections of pages not paginated properly
- indexable pages that shouldn’t be indexed
- stub pages
- indexed search pages
- pages with malformed HTML and broken images
- auto-generated pages with little value
You will probably find ‘dead’ pages you didn’t even know your CMS produced (hence why an actual crawl of your site is required, rather than just working from a list of URLs from an XML sitemap, for instance).
Those pages need to be cleaned up, Google has said. And remaining pages should:
‘stand on their own’ J.Mueller
Google doesn’t like auto-generated pages in 2016, so you don’t want Google indexing these pages in a normal fashion. Judicious use of the ‘noindex,follow’ directive in robots meta tags, and sensible use of the canonical link element, are required implementation on most sites I see these days.
The aim in 2016 is to have as few ‘low-quality’ pages on a site as possible, using as few aged SEO techniques as possible.
The pages that remain after a URL clear-out can be reworked and improved.
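To audit which pages already carry these hints, a short script can read the robots meta tag and the canonical link element out of each page’s HTML. This is only a sketch using Python’s standard html.parser; the class name, function name and example pages are hypothetical, and it looks only at the HTML itself.

```python
from html.parser import HTMLParser


class IndexingHints(HTMLParser):
    """Collect the robots meta directive and canonical URL from a page."""

    def __init__(self):
        super().__init__()
        self.robots = None
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # Normalise "noindex, follow" to "noindex,follow".
            self.robots = (attrs.get("content") or "").lower().replace(" ", "")
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            self.canonical = attrs.get("href")


def indexing_hints(html: str) -> dict:
    parser = IndexingHints()
    parser.feed(html)
    directives = parser.robots.split(",") if parser.robots else []
    return {
        "noindex": "noindex" in directives,
        "follow": "nofollow" not in directives,
        "canonical": parser.canonical,
    }


# Hypothetical internal search page kept out of the index, links still followed.
search_page = '<head><meta name="robots" content="noindex, follow"></head>'
# Hypothetical duplicate listing pointing at its preferred version.
sorted_page = '<head><link rel="canonical" href="https://www.example.com/widgets/"></head>'

print(indexing_hints(search_page))
print(indexing_hints(sorted_page))
```

Run over the crawl, a report like this shows quickly which stub, search and auto-generated pages are still being left open to normal indexing.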
In fact – they MUST BE improved if you are to win more rankings and get more Google organic traffic in future.
This is time-consuming – just like Google wants it to be. You need to review DEAD pages with a forensic eye and ask:
- Are these pages high-quality and very relevant to a search term?
- Is the content on these pages unique, rather than duplicating content on other pages on the site?
- Do these pages carry substantial unique text content, rather than being automatically generated?
- Is the purpose of this page met WITHOUT sending visitors to another page e.g. doorway pages?
- Will these pages ever pick up natural links?
- Is the intent of these pages to inform first? ( or profit from organic traffic through advertising?)
- Are these pages FAR superior to the competition in Google presently for the search term you want to rank for? This is actually very important.
If the answer to any of the above is NO – then it is imperative you take action to minimise the number of these types of pages on your site.
What about DEAD pages with incoming backlinks or a lot of text content?
Bingo! Use 301 redirects (or canonical link elements) to point any asset that has some value to Googlebot at an equivalent, up-to-date section of your site. Do NOT just redirect these pages to your homepage.
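One way to keep that rule honest on a large clear-out is to build the redirect list from a hand-picked mapping and refuse any entry that points at the homepage. The sketch below assumes you have chosen the equivalents yourself; the function name and URLs are hypothetical, and turning the result into actual 301 rules depends on your server.

```python
def build_redirect_map(dead_pages, equivalents, homepage="/"):
    """Pair retired URLs with equivalent live URLs for 301 redirects.

    dead_pages: retired URLs that still have backlinks or useful content.
    equivalents: dict of retired URL -> closest equivalent live URL,
        chosen by hand during the content review.
    Returns (redirects, unresolved). A URL is left unresolved when it has
    no equivalent, points at itself, or would be dumped on the homepage.
    """
    redirects, unresolved = {}, []
    for url in dead_pages:
        target = equivalents.get(url)
        if target is None or target == homepage or target == url:
            unresolved.append(url)
        else:
            redirects[url] = target
    return redirects, unresolved


# Hypothetical retired pages and hand-picked targets.
retired = ["/old-guide", "/2012-offer", "/blue-widgets-cheap"]
equivalents = {
    "/old-guide": "/guides/widgets",
    "/2012-offer": "/",  # rejected: homepage is not an equivalent page
}

redirects, unresolved = build_redirect_map(retired, equivalents)
print(redirects)
print(unresolved)
```

Anything left in the unresolved list needs a decision of its own: rework it, fold it into another page, or let it go.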
Rework available content before you bin it
High-quality content is expensive – so rework content when it is available. Medium-quality content can always be made higher quality – in fact – a page is hardly ever finished in 2016. EXPECT to come back to your articles every six months to improve them to keep them moving in the right direction.
Sensible grouping of content types across the site can often leave you with substantial text content that can be reused and repackaged. The same content, originally spread over multiple pages and now consolidated into one page reworked and shaped around a topic, has a considerably more successful time of it in Google SERPs in 2016.
Well, it does if the page you make is useful and has a purpose other than just to make money.
REMEMBER – DEAD PAGES are only one aspect of a site review. There’s going to be a large percentage of any site that gets a little organic traffic but still severely underperforms, too – tomorrow’s DEAD pages. I call these POOR pages in my reviews.
Minimise Low-Quality Content & Overlapping Text Content
Google may well be able to recognise ‘low-quality’ a lot better than it does ‘high-quality’ – so having a lot of ‘low-quality’ pages on your site is potentially what you are actually going to be rated on (if it makes up most of your site) – now, or in the future. NOT your high-quality content.
This is more or less explained by Google spokespeople like John Mueller. He is constantly on about ‘folding’ thin pages together, these days (and I can say that certainly has a positive impact on many sites).
While his advice in this instance might be specifically about UGC (user generated content like forums) – I am more interested in what he has to say when he talks about the algorithm looking at the site “overall” and how it ‘thinks’ when it finds a mixture of high-quality pages and low-quality pages.
And Google has clearly said in print:
low-quality content on part of a site can impact a site’s ranking as a whole
Avoid Google’s punitive algorithms
Fortunately, we don’t actually need to know and fully understand the ins-and-outs of Google’s algorithms to know what the best course of action is.
The sensible thing in light of Google’s punitive algorithms is just to not let Google index (or more accurately, rate) low-quality pages on your site. And certainly – stop publishing new ‘thin’ pages. Don’t put your site at risk.
If pages get no organic traffic anyway, are out-of-date for instance, and improving them would take a lot of effort and expense, why let Google index them normally, if by rating them it impacts your overall score? Clearing away the low-quality stuff lets you focus on building better stuff on other pages that Google will rank in 2016 and beyond.
Ideally you would have a giant site and every page would be high-quality – but that’s not practical.
A myth is that pages need a lot of text to rank. They don’t, but a lot of people still try to make text bulkier and unique page-to-page.
While that theory is sound (when focused on a single page, when the intent is to deliver utility content to a Google user), using old-school SEO techniques across many pages, especially on a large site, seems to amplify site quality problems after recent algorithm changes. This type of optimisation, without keeping an eye on overall site quality, is self-defeating in the long run.
Investigating A Traffic Crunch
Every site is impacted by how highly Google rates it.
There are many reasons a website loses traffic from Google. Server changes, website problems, content changes, downtimes, redesigns, migrations… the list is extensive.
Sometimes, Google turns up the dial on demands on ‘quality’, and if your site falls short, a website traffic crunch is assured. Some sites invite problems by ignoring Google’s ‘rules’, and some sites inadvertently introduce technical problems to their site after the date of a major algorithm update, and are then impacted negatively by later refreshes of the algorithm.
Comparing your Google Analytics data side by side with the dates of official algorithm updates is useful in diagnosing a site health issue or traffic drop. In one case, the client thought it was a switch to HTTPS and server downtime that caused the drop when it was actually the May 6 2015 Google Quality Algorithm (originally called Phantom 2 in some circles) that caused the sudden drop in organic traffic.
A quick check of how the site was laid out soon uncovered a lot of unnecessary pages, or what Google calls thin, overlapping content. This observation would go a long way to confirming that the traffic drop was indeed caused by the May algorithm change.
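The side-by-side comparison of traffic and dates can be roughed out in code, assuming you have exported daily organic sessions from analytics. Everything in the sample data is made up for illustration, including the session counts and the date of the HTTPS switch; only the May 6 2015 update date comes from the case above.

```python
from datetime import date, timedelta
from statistics import mean


def change_around(daily_sessions, event_day, window=14):
    """Average daily organic sessions before and after a candidate date.

    daily_sessions: dict of date -> organic sessions.
    Returns (before_avg, after_avg, pct_change), where "after" starts
    on event_day itself.
    """
    before = [daily_sessions[event_day - timedelta(days=i)]
              for i in range(1, window + 1)
              if event_day - timedelta(days=i) in daily_sessions]
    after = [daily_sessions[event_day + timedelta(days=i)]
             for i in range(window)
             if event_day + timedelta(days=i) in daily_sessions]
    if not before or not after:
        raise ValueError("not enough data around the event date")
    before_avg, after_avg = mean(before), mean(after)
    if before_avg == 0:
        raise ValueError("no traffic before the event date to compare against")
    return before_avg, after_avg, (after_avg - before_avg) / before_avg * 100


# Made-up export: steady traffic that drops on May 6 2015.
daily = {}
for offset in range(61):
    day = date(2015, 4, 1) + timedelta(days=offset)
    daily[day] = 1000 if day < date(2015, 5, 6) else 600

candidates = {
    "HTTPS switch (hypothetical date)": date(2015, 4, 20),
    "May 6 2015 quality update": date(2015, 5, 6),
}
results = {name: change_around(daily, day, window=7)
           for name, day in candidates.items()}
for name, (before_avg, after_avg, pct) in results.items():
    print(f"{name}: {before_avg:.0f} -> {after_avg:.0f} ({pct:+.1f}%)")
```

Lining up each candidate cause against the same window makes it harder to blame the wrong event. Real data is far noisier than this, so treat the numbers as a pointer for further review rather than proof.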
Another obvious way to gauge the health of a site is to see which pages on the site get zero traffic from Google over a certain period of time. I do this by merging analytics data with crawl data – as analytics doesn’t give you data on pages that receive no traffic.
Often, this process can highlight low-quality pages on a site.
Google calls a lot of pages ‘thin’ or ‘overlapping’ content these days. I go into some of that in my duplicate content penalty post.
Algorithm changes in 2016 seem to centre on reducing the effectiveness of old-school SEO techniques, with the May 2015 Google ‘Quality’ algorithm update bruisingly familiar. An algorithm change is usually akin to ‘community service’ for the business impacted negatively.
If your pages were designed to get the most out of Google with commonly known and now outdated SEO techniques, chances are Google has identified this and is throttling your rankings in some way. Google will continue to throttle rankings until you clean your pages up.
If Google thinks your links are manipulative, they want them cleaned up, too.
Actually – looking at the backlink profile of this customer, they are going to need a disavow file prepared too.
That is unsurprising in today’s SEO climate.
What could be argued was ‘highly relevant’ or ‘optimised’ on-site SEO for Google just a few years ago is now being treated more like ‘web spam’ by punitive algorithms, rather than just ‘over-optimisation’.
Google went through the SEO playbook, identified old techniques and now uses them against you – meaning every SEO job you take on always has a clean-up aspect now.
Google has left a very narrow band of opportunity when it comes to SEO – and punishments are designed to take you out of the game for some time while you clean up the infractions.
Google has a LONG list of technical requirements it advises you meet, on top of all the things it tells you NOT to do to optimise your website. Meeting Google’s technical guidelines is no magic bullet to success – but failing to meet them can impact your rankings in the long run – and the odd technical issue can actually severely impact your entire site if rolled out across multiple pages.
The benefit of adhering to technical guidelines is often a second order benefit.
You don’t get penalised, or filtered, when others do. When others fall, you rise.
Mostly – individual technical issues will not be the reason you have ranking problems, but they still need to be addressed for any second order benefit they provide.
Google spokespeople say ‘user-experience’ is NOT A RANKING FACTOR but this might be splitting hairs, as lots of the rules are designed to guarantee as good a ‘user experience’ as possible for Google’s users.
For instance, take good 404 pages. A poor 404 page, and user interaction with it, can only lead to a ‘poor user experience’ signal at Google’s end, for a number of reasons. I will highlight a poor 404 page in my audits and actually programmatically look for signs of this issue when I scan a site. I don’t know if Google looks at your site that way to rate it, e.g. algorithmically determines if you have a good 404 page – or if it is a UX factor, something to be taken into consideration further down the line – or purely to get you thinking about 404 pages (in general) to help prevent Google wasting resources indexing crud pages and presenting poor results to searchers. I think rather that any rating would be a second order scoring including data from user activity on the SERPs – stuff we as SEOs can’t see.
At any rate – I don’t need to know why we need to do something, exactly, if it is in black and white like:
Create useful 404 pages
Tell visitors clearly that the page they’re looking for can’t be found. Use language that is friendly and inviting. Make sure your 404 page uses the same look and feel (including navigation) as the rest of your site. Consider adding links to your most popular articles or posts, as well as a link to your site’s home page. Think about providing a way for users to report a broken link.
No matter how beautiful and useful your custom 404 page, you probably don’t want it to appear in Google search results. In order to prevent 404 pages from being indexed by Google and other search engines, make sure that your webserver returns an actual 404 HTTP status code when a missing page is requested
… all that needs doing is to follow the guideline exactly as Google tells you to.
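The last part of that guideline, returning an actual 404 HTTP status code, is easy to check programmatically. Here is a minimal sketch using Python’s standard urllib; the function names and the random probe path are assumptions for illustration, not a description of how Google tests it.

```python
import urllib.error
import urllib.request
import uuid


def classify_missing_page(status_code: int) -> str:
    """Judge the status code a site returned for a page that cannot exist."""
    if status_code == 404:
        return "ok"
    if 200 <= status_code < 300:
        # The custom error page (or a redirect target) answered with success.
        return "soft 404"
    return "check manually"


def status_for_missing_page(base_url: str, timeout: float = 10.0) -> int:
    """Request a random, almost certainly missing URL and return its status.

    urlopen follows redirects, so a site that redirects missing pages to
    its homepage will report 200 here and be classified as a soft 404.
    """
    probe = base_url.rstrip("/") + "/" + uuid.uuid4().hex
    try:
        with urllib.request.urlopen(probe, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx and 5xx responses; the code is what we want.
        return error.code


# Classifying codes needs no network access.
for code in (404, 200, 500):
    print(code, classify_missing_page(code))
```

Against a live site you would call `classify_missing_page(status_for_missing_page("https://www.example.com"))`, with your own domain in place of the placeholder.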
Most of Google’s technical guidelines can be interpreted in this way. And most need to be followed, whether addressing these issues has any immediate positive impact on the site or not.
Whether or not your site has been impacted in a noticeable way by these algorithms, every SEO project must start with a historical analysis of site performance. Every site has things to clean up and to optimise in a modern way.
The sooner you understand why Google is sending you less traffic than it did last year, the sooner you can clean it up and focus on proactive SEO that starts to impact your rankings in a positive way.
User Experience = SEO WIN!
“Google engineer Gary Illyes talked a lot about user experience and how Webmasters really need to focus on that.” Search Engine Land.
Content quality is the most significant user experience issue Google cares about.
Nobody knows exactly which signals Google uses to rank pages on every SERP at any given time.
The information we have – unless you test things for yourself – comes from Google and is meant to discombobulate you.
I knew one day I would use that word.
Website Usability Tips
If we want to improve user experience in other areas, then we all need to listen to the real usability and user experience experts.
The following usability tips have been taken from Nielsen and I have filtered them to focus on items that could be useful, now and in the future, to help maximise the chance you meet the bar on required algorithmic or manual quality ratings:
Home Page Tips
- Show the company name and/or logo in a reasonable size and noticeable location.
- Include a tag line that explicitly summarizes what the site or company does.
- Emphasize what your site does that’s valuable from the user’s point of view, as well as how you differ from key competitors.
- Emphasize the highest priority tasks so that users have a clear starting point on the homepage.
- Clearly designate one page per site as the official homepage.
- On your main company website, don’t use the word “website” to refer to anything but the totality of the company’s web presence.
- Design the homepage to be clearly different from all the other pages on the site.
- Group corporate information, such as About Us, Investor Relations, Press Room, Employment and other information about the company, in one distinct area.
- Include a homepage link to an “About Us” section that gives users an overview about the company and links to any relevant details about your products, services, company values, business proposition, management team, and so forth.
- If you want to get press coverage for your company, include a “Press Room” or “News Room” link on your homepage.
- Present a unified face to the customer, in which the website is one of the touchpoints rather than an entity unto itself.
- Include a “Contact Us” link on the homepage that goes to a page with all contact information for your company.
- If you provide a “feedback” mechanism, specify the purpose of the link and whether customer service or the webmaster will read it, and so forth.
- Don’t include internal company information (which is targeted for employees and should go on the intranet) on the public website.
- Explain how the website makes money if it’s not self-evident.
Content Writing Tips
- Use customer-focused language.
- Avoid redundant content
- Don’t use clever phrases and marketing lingo that make people work too hard to figure out what you’re saying
- Use consistent capitalization and other style standards.
- Don’t label a clearly defined area of the page if the content is sufficiently self-explanatory.
- Avoid single-item categories and single-item bulleted lists.
- Use non-breaking spaces between words in phrases that need to go together in order to be scannable and understood.
- Only use imperative language such as “Enter a City or Zip Code” for mandatory tasks, or qualify the statement appropriately.
- Spell out abbreviations, initialisms, and acronyms, and immediately follow them by the abbreviation, in the first instance.
- Avoid exclamation marks!
- Use all uppercase letters sparingly or not at all as a formatting style.
- Avoid using spaces and punctuation inappropriately, for emphasis.
- Use examples to reveal the site’s content, rather than just describing it.
- For each example, have a link that goes directly to the detailed page for that example, rather than to a general category page of which that item is a part.
- Provide a link to the broader category next to the specific example.
- Make sure it’s obvious which links lead to follow-up information about each example and which links lead to general information about the category as a whole.
- Make it easy to access anything that has been recently featured on your homepage, for example, in the last two weeks or month, by providing a list of recent features as well as putting recent items into the permanent archives.
- Differentiate links and make them scannable. Begin links with the information-carrying word, because users often scan through the first word or two of links to compare them.
- Don’t use generic instructions, such as “Click Here” as a link name.
- Don’t use generic links, such as “More…” at the end of a list of items.
- Allow link colors to show visited and unvisited states. Reserve blue for unvisited links and use a clearly discernible and less saturated color for visited links. NOT GREY. HIGH CONTRAST!
- Don’t use the word “Links” to indicate links on the page.
- If a link does anything other than go to another web page, such as linking to a PDF file or launching an audio or video player, email message, or another application, make sure the link explicitly indicates what will happen.
- Give users an input box on the homepage to enter search queries, instead of just giving them a link to a search page.
- Explain the benefits and frequency of publication to users before asking them for their email addresses.
- Don’t offer users features to customize the basic look of the homepage UI, such as color schemes.
- Don’t automatically refresh the homepage to push updates to users
- Don’t waste space crediting the search engine, design firm, favorite browser company, or the technology behind the scenes.
- Have a plan for handling critical content on your website in the event of an emergency.
- Don’t literally welcome users to your site. Before you give up prime homepage real estate to a salutation, consider using it for a tag line instead.
- Avoid popup windows.
- If you place ads outside the standard banner area at the top of the page, label them as advertising so that users don’t confuse them with your site’s content.
- Keep external ads (ads for companies other than your own) as small and discreet as possible relative to your core homepage content.
- Keep ads for outside companies on the periphery of the page.
- Take users to your “real” homepage when they type your main URL or click a link to your site.
- Be very careful that users do not have to SCROLL to get to your content.
- As long as all news stories on the homepage occurred within the week, there’s no need to list the date and time in the deck of each story, unless it is truly a breaking news item that has frequent updates.
- Link headlines, rather than the deck, to the full news story.
- Write and edit specific summaries for press releases and news stories that you feature on your homepage to minimise duplication.
- Headlines should be succinct, yet descriptive, to give maximum information in as few words as possible.
- If available, register domain names for alternative spellings, abbreviations, or common misspellings of the site name.
- If you have alternative domain name spellings, choose one as the authorized version and redirect users to it from all the other spellings.
- For any website that has an identity closely connected to a specific country other than the United States, use that country’s top-level domain.
- Homepages for commercial websites should have the URL http://www.company.com (or an equivalent for your country or non-commercial top-level domain). Do not append complex codes or even “index.html” after the domain name.
- Don’t provide tools that reproduce browser functionality
- Don’t include tools unrelated to tasks users come to your site to do.
- Offer users direct access to high-priority tasks on the homepage.
Note – These items are from A USABILITY point of view and here for your reference. See my advice on Page titles for SEO benefits for more information on this. For instance – there are SEO benefits for LONGER TITLES than usability experts recommend.
- Limit window titles to no more than seven or eight words and fewer than 64 total characters.
- Include a short description of the site in the window title.
- Don’t include “homepage” in the title. This adds verbiage without value.
- Don’t include the top-level domain name, such as “.com” in the window title unless it is actually part of the company name, such as “Amazon.com.”
- Begin the window title with the information-carrying word — usually the company name.
- Use drop down menus sparingly, especially if the items in them are not self-explanatory.
- Avoid using multiple text entry boxes on the homepage, especially in the upper part of the page where people tend to look for the search feature.
- Never use widgets for parts of the screen that you don’t want people to click. Make sure widgets are clickable
- Use logos judiciously.
- Use a liquid layout/responsive design so the homepage size adjusts to different screen resolutions.
- The most critical page elements should be visible “above the fold” (in the first screen of content, without scrolling) at the most prevalent window size
- Avoid horizontal scrolling at 1024×768.
- Use high-contrast text and background colors so that type is as legible as possible.
- Limit font styles and other text formatting, such as sizes, colors, and so forth on the page because over-designed text can actually detract from the meaning of the words.
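The window title limits in the list above are mechanical enough to check with a script. This sketch covers only those title rules, and only from the USABILITY point of view (as noted, there can be SEO benefits to longer titles). The function name and the sample titles are hypothetical.

```python
def check_window_title(title: str, company: str) -> list:
    """Return the usability limits from the list above that a title breaks."""
    problems = []
    lowered = title.lower()
    if len(title.split()) > 8:
        problems.append("more than eight words")
    if len(title) >= 64:
        problems.append("64 characters or more")
    if "homepage" in lowered or "home page" in lowered:
        problems.append('contains "homepage"')
    # ".com" is acceptable only when it is part of the company name.
    if ".com" in lowered and ".com" not in company.lower():
        problems.append('includes ".com" that is not part of the company name')
    if not lowered.startswith(company.lower()):
        problems.append("does not begin with the company name")
    return problems


# Hypothetical titles for a hypothetical company.
good = check_window_title("Acme Tools: Hand Tools and Workshop Supplies", "Acme Tools")
bad = check_window_title("Welcome to the Acme Tools.com Homepage", "Acme Tools")
print(good)
print(bad)
```

An empty list means the title stays within these usability limits; it says nothing about how well the title will perform in search.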
Be consistent. Be Transparent. Meet Visitor Expectation.
Deliver on Purpose.
Example ‘High Quality’ Ecommerce Site
Google has released a new version of the search quality rating guidelines. After numerous ‘leaks’, this previously ‘secretive’ document has now been made available for anyone to download.
This document gives you an idea of the type of quality websites Google wants to display in its search engine results pages.
There is a lot that’s the same as the 2014 version of this document.
I use these quality rating documents and the Google Webmaster Guidelines as the foundation of my audits for e-commerce sites.
You can download the 2015 version of the rating guidelines pdf here, all 150+ pages of it.
What are these quality raters doing?
Quality Raters are rating Google’s ‘experiments’ and manually reviewing web pages that are presented to them in Google’s search engine results pages (SERPs). We are told that these ratings don’t impact your site, directly.
Ratings from evaluators do not determine individual site rankings, but are used help us understand our experiments. The evaluators base their ratings on guidelines we give them; the guidelines reflect what Google thinks search users want. GOOGLE.
What Does Google class as a high-quality product page on an e-commerce site?
This page, and site, appears to check all the boxes Google wants to see in a high-quality e-commerce website these days.
This product page is an example of a YMYL page exhibiting “A satisfying or comprehensive amount of very high-quality MC (main content)” and “Very high level of expertise, highly authoritative/highly trustworthy for the purpose of the page” with a “Very positive reputation”.
What is YMYL?
This is a classification of certain pages by Google:
Your Money or Your Life (YMYL) Pages.
where Google explains:
Some types of pages could potentially impact the future happiness, health, or wealth of users. We call such pages “Your Money or Your Life” pages, or YMYL
and in this instance, it refers to a very common type of page:
Shopping or financial transaction pages: webpages which allow users to make purchases, transfer money, pay bills, etc. online (such as online stores and online banking pages)…..We have very high Page Quality rating standards for YMYL pages because low-quality YMYL pages could potentially negatively impact users’ happiness, health, or wealth.
It is interesting to note that (1) this example of a ‘high-quality’ website in the guidelines is from 2013 and (2) the website looks different today.
But – this is a clear example of the kind of ‘user experience’ you are trying to mimic if you have an online e-commerce store and want more Google organic traffic to your product pages.
You might not be able to mimic the positive reputation this US site has, but you are going to have to build your product pages to compete with it, and others like it.
If you want a review of your website taking some of these guidelines into consideration, you can buy it here.
Google Panda explained? Not quite… I have tested what I have talked about here – and my testing over the last few years has been to identify priorities Google seems to be taking into consideration when rating sites.
You do need to realise that Google Panda and other algorithms deal with pages where the intent is largely INFORMATIONAL or TRANSACTIONAL – and Google has strategies to deal with both.
Thin content on an informational site, for instance, is different from thin content on an e-commerce site.
The quality signals an e-commerce site will have to display will be different from those of a hobby site, and the e-commerce site will be held to higher standards – and somebody somewhere will always be ready to compete for that if the prize is free traffic from Google.
Below is the response to a test page when I implemented ONE of THE most critical ‘user experience’ improvements.
NOTE – High Quality ‘Supplemental’ Content leading to external sites!
Here is some high-quality ‘supplemental content‘ for you to continue learning about Google Panda and ratings guidelines, and some others.
Although I think I’ve covered the most important aspects above, for most webmasters, the following write-ups and analysis should expand your horizons on Page Quality and the Google Panda Algorithm:
- Guidelines for Homepage Usability – http://www.nngroup.com/articles/113-design-guidelines-homepage-usability/
- The Companies Act 2007 – http://www.hobo-web.co.uk/the-companies-act/
- Panda 4.1 Analysis – http://www.hmtweb.com/marketing-blog/panda-4-1-analysis/
- Understanding the latest Panda patent – http://www.SEObythesea.com/2014/09/new-panda-update-new-panda-patent/
- Google Panda tips – http://macedynamics.com/research/content-quality-score/
- Another overview of the Quality Ratings Guide – http://macedynamics.com/research/content-quality-score/
- Interview with a quality rater – http://searchengineland.com/interview-google-search-quality-rater-108702
- More info about the Quality Guidelines – http://www.thesempost.com/google-rewrites-quality-rating-guide-SEOs-need-know/
- Google Quality Raters can’t cause a ranking drop (on their own, at least) – http://searchengineland.com/google-quality-raters-cant-cause-site-to-drop-in-rankings-103850
- Another earlier review of the quality guidelines – http://www.potpiegirl.com/2011/10/how-google-makes-algorithm-changes/
- Google Panda updates history – http://searchengineland.com/panda-update-rolling-204313
- Website Usability Tips – http://www.usereffect.com/topic/25-point-website-usability-checklist