New Evidence: Does Google Count Keywords in Anchor Text In Internal Links?


ADVICE WARNING: This is part of a series of SEO ranking tests that look at traditional ranking signals and their effects on SERP positions. I would certainly NOT go and implement sitewide changes to your site based on this or other recent posts. WAIT for the series to END so you can make a more informed decision based on more observations, because **things get a bit weird later in the series of tests**! AS WITH ANY SEO TESTING – don’t just rely on my findings. Test things yourself on your own site, and realise that there are not many QUICK WINS that Google isn’t policing in some way, so **avoid doing anything on your site that would leave a footprint that would indicate manipulation.**

Recently I looked at whether Google counts keywords in the URL to influence rankings for specific keywords, and I explained how I investigated this.

Today I am looking at the value of an internal link and its impact on rankings in Google. These posts were originally published in two parts, but I have folded them together to make it easier to read.

My observations from these tests (and my experience) include:

  • witnessing the impact of removing contextual signals from the anchor text of a single internal link pointing to a target page (April 15 impact in the image below)
  • watching as an irrelevant page on the same site takes the place in rankings of the relevant target page when the signal is removed (19 April Impact)
  • watching as the target page was again made to rank by re-introducing the contextual signal, this time to a single on-page element e.g. one instance of the keyword phrase in exact match form (May 5 Impact)
  • potential evidence of a SERP Rollback @ May 19/20th
  • potentially successfully measuring the impact of one ranking signal over another (a keyword phrase in one element versus another), which would seem to differ slightly from recent advice on MOZ, for instance.

As a result of some of this testing, and my experience in cleaning up sites, I present a theory (or at least a metaphor) on the Google Quality Metric and how it might be deployed.

Will Google Count Keywords in Internal Anchor Text Links?

The video above, from quite some time ago now (OCT 2015), shows John Mueller talking about ‘Internal Links’ and their influence on Google SERPS.

we do use internal links to better understand the context of content of your sites

Essentially my tests revolve around ranking pages for keywords where the actual keyphrase is not present in exact match form anywhere on the website, either in internal links to the page or on the target page itself.

The relevance signal (mentions of the exact match keyword) IS present in what I call the Redirect Zone – that is – there are backlinks and even exact match domains pointing at the target page but they pass through redirects to get to the final destination URL.

In the image below where it says “Ranking Test Implemented” I introduced one exact match internal anchor text link to the target page from another high-quality page on the site – thereby re-introducing the ‘signal’ for this exact match term on the target site (pointing at the target page).

Where it says ‘Test Removed‘ in the image below, I removed the solitary internal anchor text link to the page, thereby, as I think about it, shortcutting the relevance signal again and leaving the only signal present in the ‘redirect zone’.
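To make the mechanics of the test concrete, here is a minimal sketch of how you might verify which internal pages (if any) carry an exact match anchor text signal to a target page. This is my own illustration rather than the tooling I actually used, and the site URL and keyword phrase are hypothetical placeholders.

```python
# Sketch: check whether an exact-match keyword phrase appears in any
# internal anchor text on a site (hypothetical URL/phrase for illustration).
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin, urlparse

SITE = "https://www.example.com/"          # hypothetical site root
PHRASE = "example keyword phrase"          # hypothetical test focus keyword
MAX_PAGES = 200                            # keep the crawl small

def find_exact_match_anchors(start_url, phrase, max_pages=MAX_PAGES):
    domain = urlparse(start_url).netloc
    to_visit, seen, matches = [start_url], set(), []
    while to_visit and len(seen) < max_pages:
        url = to_visit.pop()
        if url in seen:
            continue
        seen.add(url)
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException:
            continue
        soup = BeautifulSoup(html, "html.parser")
        for a in soup.find_all("a", href=True):
            target = urljoin(url, a["href"])
            if urlparse(target).netloc != domain:
                continue                    # only internal links matter here
            if a.get_text(strip=True).lower() == phrase.lower():
                matches.append((url, target))   # source page -> target page
            if target not in seen:
                to_visit.append(target)
    return matches

if __name__ == "__main__":
    for source, target in find_exact_match_anchors(SITE, PHRASE):
        print(f"Exact-match anchor on {source} pointing to {target}")
```

Running a check like this before and after each change simply confirms that the only live exact match signal in the crawlable HTML is the one you deliberately added or removed.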

Graph: Does Google Count Anchor Text In Internal Links?

It is evident from the screenshot above that something happened to my rankings for that keyword phrase and long tail variants exactly at the same time as my tests were implemented to influence them.

Over recent years, it has been difficult for me, at least, to pin down with any real confidence the influence of anchor text from internal pages on an aged domain. Too much is going on at the same time, and most of it is out of an observer’s control.

I’ve also always presumed Google would look at too much of this sort of onsite SEO activity as attempted manipulation if deployed improperly or quickly, so I have kind of just avoided this kind of manipulation and focused on improving individual page quality ratings.

TEST RESULTS

  1. It seems to me that, YES, Google does look at keyword-rich internal anchor text to provide a context and relevance signal, on some level, for some queries, at least.
  2. Where the internal anchor text pointing to a page is the only mention of the target keyword phrase on the site (as my test indicates), it only takes ONE internal anchor text link (from another internal page) to provide the signal required to have a NOTICEABLE influence on specific keyword phrase rankings (and so ‘relevance’).

——————————————————-

 

Test Results: Removing the Test Focus Keyword Phrase from Internal Links and Putting the Keyword Phrase IN AN ELEMENT on the Page

Screenshot 2016-05-25 15.43.32

To recap in my testing:

I am seeing if I can get a page to rank by introducing and removing individual ranking signals.

Up to now, if the signal is not present, the page does not rank at all for the target keyword phrase.

I showed how having a keyword in the URL impacts rankings, and how having the exact keyword phrase in ONE internal anchor text to the target page provides said signal.

Ranking WEIRDNESS 1: Observing an ‘irrelevant’ page on the same site rank when ranking signal is ‘shortcutted’.

The graph above illustrates that when I removed the signal (removed the keyword from internal anchor text) there WAS a visible impact on rankings for the specific keyword phrase – rankings disintegrated again.

BUT – THIS TIME – an irrelevant page on the site started ranking for a long tail variant of the target keyword phrase during the period when there was no signal present at all in the site (apart from the underlying redirect zone).

Screenshot 2016-05-25 15.36.05

This was true UNTIL I implemented a further ranking test, this time by optimising ANOTHER ELEMENT actually on the page, which reintroduced the test focus keyword phrase (or HEAD TERM, as I have it in the first image on this page) to the page – the first time the keyword phrase had been present on the actual page (in an element) for a long time.

WEIRDNESS 2 – SERP Rollback?

On May 1st I added the test focus keyword to the actual page in a specific element to test the impact of having the signal ONLY in a particular element on the page.

As expected, the signal provided by having the test keyword phrase ONLY in one on-page element DID have some positive impact (although LESS than the impact when the signal was present in internal links – a comparison I found very useful).

That’s not the anomaly – the results from RANKING TEST 3 were almost exactly as I expected. A signal was recognised, but that solitary signal was not enough to make the page as relevant to Google as it was when the signal was in internal links.

The weirdness begins on May 17, when I again removed the keyword phrase from the target page. I expected that with NO SIGNAL present anywhere on the site or the page, Google rankings would return to their normal state (zero visibility).

The opposite happened.

Screenshot 2016-05-25 15.36.20

WTF?

Rankings returned to the best positions they have been in for the term SINCE I started implementing these ranking tests – even WITHOUT any signal present in any of the areas I have been modifying.

Like a memory effect, the rankings I achieved when the signal was present only in internal links (the strongest signal I have provided yet) have returned.

THINKING OUT LOUD

It’s always extremely difficult to test Google and impossible to make any claims 100% one way or another.

The entire ecosystem is built to obfuscate and confuse anyone trying to understand it better.

Why have rankings returned when there is no live signal present that would directly influence this specific keyword phrase?

My hunch is that this actually might be evidence of what SEOs call a SERP ROLL-BACK – when Google randomly ‘rolls’ the set of results back to a previous week’s SERPs to keep us guessing.

If this is a rollback, the rollback time frame must fall within the period of my RANKING TEST 2 (a month or so at maximum), as the page did not rank for these terms like this at all for the year before – and yet they came back almost exactly as they were during my test period.

In the following image, I show this impact on the variant keyword (the keyword phrase, with no spaces) during this possible ‘roll back’.

Screenshot 2016-05-25 23.58.46

An observation about SERP ROLL BACKS.

If a rollback is in place, it does not seem to affect EVERY keyword and every SERP equally. NOT all the time, at least.

MORE IMPORTANTLY – Why did an irrelevant page on the same website rank when the signal was removed?

The target page was still way more relevant than the page Google picked out to present in long tail SERPs – hence my question.

Google was at least confused and at worst apathetic and probably purposefully lazy when it comes to the long-tail SERPs.

Because my signal to them was not explicit, ranking just seemed to fail completely, until the signal was reintroduced and I specifically picked out a page for Google to rank.

To be clear, something else might be at play. Google might now be relying on other signals – perhaps even the redirect zone – or the relative link strength of the ‘irrelevant’ page – but no matter – Google was ranking the less relevant page and CLEARLY IGNORING any relevance signals passing through the redirect zone to my target page.

Observations

From my ranking test 2 (internal links), it is evident that modifications to internal links CAN make IRRELEVANT PAGES on your site rank instead of the target page, and for some time, IF by modifying these internal links you SHORTCUT the signal that these links once provided to the target page in a way that entirely removes that signal from the LIVE signals your site provides for a specific keyword phrase. Let’s call these live signals the “CRAWL ZONE” – i.e. what can be picked up in a crawl of your HTML pages, as Google would do – which, in my imagination, sits above the REDIRECT ZONE, which is simply where signals need to pass through a 301 redirect.
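To illustrate the distinction, here is a minimal sketch (hypothetical URLs; it assumes the Python requests library) that classifies where a given linking URL’s signal sits – in the crawl zone, because it resolves directly to the target page, or in the redirect zone, because it only reaches the target through one or more redirect hops.

```python
# Sketch: classify where a linking URL's signal sits - in the "crawl zone"
# (it resolves directly to the target page) or the "redirect zone" (it only
# reaches the target through one or more 301/302 hops). URLs are hypothetical.
import requests

TARGET = "https://www.example.com/target-page/"   # hypothetical target page

def classify_signal(linking_url, target=TARGET):
    resp = requests.get(linking_url, timeout=10, allow_redirects=True)
    hops = [r.url for r in resp.history]           # each redirect hop, in order
    if resp.url.rstrip("/") != target.rstrip("/"):
        return "does not reach the target at all", hops
    return ("redirect zone" if hops else "crawl zone"), hops

if __name__ == "__main__":
    # e.g. an old exact-match domain that 301s to the target page
    zone, hops = classify_signal("http://old-exact-match-domain.example/")
    print(zone, hops)
```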

That test was modifying only ONE anchor text link.

This might be very pertinent to site migrations, where you are modifying hundreds of links at the same time as you migrate through redirects, changes of URLs and new internal anchor text.

YES – the signal may return – but this doesn’t look to be anything that happens over a quick timescale. Site migrations like this are potentially going to be very tricky if low-quality pages are in the mix.

Which Ranking Signal Carries the Most Weight?

In these two tests, I switched the signal from internal links to another element, this time on the page, to observe the impact on the change in terms of rankings for the page for the test focus keyword phrase.

Screenshot 2016-05-25 21.29.04

The signal I switched it to would seem to have less of an impact on rankings than internal links, therefore making a recent Whiteboard Friday at least potentially inaccurate in terms of the weighting of signals for relevance as they are presented on the whiteboard.

This would be UNLESS I have misinterpreted the presumed rollback activity in the SERPs in May 2016 and those rankings are caused by something else I am misunderstanding. Only time will shed some light on that, I think.

Screenshot 2016-05-25 21.15.00

Yes – I replaced the signal originally in H (in Rand’s list) with an element that was in A to G (on Rand’s list) – and the result was to make the page markedly LESS relevant, not more.

Which element did I switch it with?

That answer comes later in this series of tests. But to be clear – internal links don’t look to be last in this hierarchy of keyword targeting for a page.

PS – To understand this test fully, you really need to read my posts where I investigate: Is a keyword in the URL a ranking factor?

Site Quality Algorithm THEORY

If you have site quality problems, then any help you can get will be a good thing.

Any advice I would offer, especially in a theory, would really need to be sensible, at worst! I have clearly shown in past posts that simply improving pages substantially clearly does improve organic traffic levels.

This is the sort of thing you could expect in a ‘fairish’ system, I think, and we apparently have this ‘fairness’, in some shape or form, baked in.

If you improve INDIVIDUAL pages to satisfy users – Google responds favourably by sending you more visitors – ESPECIALLY when increased user satisfaction manifests in MORE HIGH-QUALITY LINKS (which are probably still the most important ranking signal other than the content quality and user satisfaction algorithms):

Screenshot 2016-05-26 01.01.06

In these SEO tests I have been DELIBERATELY isolating a specific element that provides the signal for a specific keyword phrase to rank and then SHORTCUTTING it, like in an electrical switch, to remove the signal to the target page.

What if this process is how Google also shortcuts your site from a site quality point of view e.g. in algorithm updates?

Let us presume you have a thousand pages on your site and 65% of them fail to meet the quality score threshold set for multiple keyword phrases the site attempts to rank for – a threshold Google constantly tweaks, day to day, by a fractional amount to produce flux in the SERPs. In effect, most of your website is rated low-quality.

Would it be reasonable to presume that a page rated low-quality by Google is neutered in a way that it might not pass along the signals it once did to other pages on your site, yet still remain indexed?

We have been told that pages that rank (i.e. are indexed) sometimes do not have the ability to transfer PageRank to other pages.

Why would Google want the signals a low-quality page provides, anyway, after it is marked as low-quality or, more specifically, not preferred by users?

SEOs know that pages deemed extremely low-quality can always be deindexed by Google – but what of pages that Google has in their index that might not pass signals along to other pages?

It would again be reasonable to suggest, I think, that this state of affairs is a possibility – because it is another layer of obfuscation – and Google relies on this type of practice in lots of areas to confuse observers.

SO – let us presume pages can be indexed but can sometimes offer no signals to other pages on your site.

If Google’s algorithms nuke 65% of your pages’ ability to relay signals to other pages on your site, you have effectively been shortcutted in the manner I have illustrated in these tests – and that might end up in a state of affairs where irrelevant pages on your website start to rank in place of once relevant pages, because the pages that did provide the signal no longer pass the quality bar to provide context and signal to your target pages (in the site structure).
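Purely to illustrate the theory – this is a toy model of my metaphor, not anything Google has confirmed, and every number in it is invented – here is a sketch in which a page only relays its internal anchor text signal to the target if its quality score clears a threshold. Nudge the threshold up and the target page’s only relevance signal can vanish, even though the links are still physically there.

```python
# Toy model of the "shortcut" theory: a page only relays its internal anchor
# text signal if its quality score clears a (constantly tweaked) threshold.
# Pages, scores and IDs are invented purely for illustration.
import random

random.seed(1)

# 1,000 pages with made-up quality scores between 0 and 1.
pages = [{"id": i, "quality": random.random()} for i in range(1000)]

# A handful of those pages link to the target with exact-match anchor text.
linking_ids = {3, 17, 42}

def target_receives_signal(threshold):
    """Does any linking page still clear the quality bar and relay the signal?"""
    return any(p["quality"] >= threshold for p in pages if p["id"] in linking_ids)

for threshold in (0.50, 0.65, 0.80, 0.95):
    relayed = target_receives_signal(threshold)
    print(f"threshold={threshold:.2f} -> signal relayed to target: {relayed}")
```

In a model like this, raising the quality bar by a fraction is enough to flip the target page from ‘receives the signal’ to ‘receives nothing’ – which is exactly the shortcut effect described above.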

Google has clearly stated that they:

“use internal links to better understand the context of content of your sites” – John Mueller

If my theory held water, there would be lots of people on the net with irrelevant pages on their site ranking where other, relevant pages once did – and this test would be repeatable.

If this was true – then the advice we would get from Google would be to IMPROVE pages rather than just REMOVE them, because when you remove them you do not necessarily reintroduce the signal you need – the signal you would get if you IMPROVED the page with the quality issues, i.e. the ideal scenario in a world with no complication.

Especially if webmasters were thinking removing pages was the only answer to their ranking woes – and I think it is fair to say many did, at the outset of this challenge.

Guess what?

Screenshot 2016-05-25 19.32.25

This would make that statement by Google entirely correct but monumentally difficult to achieve, in practice, on sites with lots of pages.

Site Quality punishment is a nightmare scenario for site owners with a lot of low-quality pages and especially where it is actually CONTENT QUALITY (as opposed to a purely technical quality issue) that is the primary issue.

It is only going to become a more acute problem as authorship, in whatever form Google assigns it, becomes more prevalent – e.g. you can have high-quality content that is exactly what Google wants, but this will be outranked by content from authors Google wants to hear from, e.g. Danny Sullivan over the rest of us, as Matt Cutts was often fond of saying.

There is incredible opportunity ahead for authors with recognised topical expertise in their fields. Google WANTS you to write stuff it WILL rank. WHO writes the content on your website might be even more important than what is written on your page (in much the same way we recognised what we called ‘domain strength’, where Google gave that ability to rank to domains with a lot of ‘link juice’). To be clear – I still think a lot of the old stuff is still baked in. Links still matter, although ‘building’ external backlinks is probably going to be self-defeating and perhaps crushing in future.

Theoretically, fixing the content quality ‘scores’ across multiple pages would be the only way to reintroduce the signal once present on a site impacted by Google Panda or Site Quality Algorithms, and this would be, from the outset, an incredibly arduous – almost impossible – undertaking for larger sites – and AT LEAST a REAL investment in time and labour – and again – I think there would be lots of sites out there in this sort of scenario, if my theory held water, and you accepted that low-quality content on your site can impact the rankings for other pages.

Google actually has said this in print:

“low-quality content on part of a site can impact a site’s ranking as a whole” – GOOGLE

It is probably a nearly impossible task for content farms with multiple authors of varying degrees of expertise in topics – mostly zilch – and the only way I can see of recovering from that would be at best distasteful and at worst highly unethical, and rather obvious, so I won’t print it.

Let’s look at that statement in full, from Google, with emphasis and numbers added by me:

One other specific piece of guidance we’ve offered is that low-quality content on some parts of a website can impact the whole site’s rankings, and thus 1. removing low quality pages, 2. merging or 3. improving the content of individual shallow pages into more useful pages, or 4. moving low quality pages to a different domain could eventually help the rankings of your higher-quality content. – GOOGLE

To recover from Google Panda and site quality algorithms, unless you are waiting for Google to release a softer Panda…. you really need to focus on doing ALL of points 1-3 – but lots of webmasters stop at number 1, thinking that will be sufficient, when any SEO with any experience knows that is way too simple for what Google wants this entire process to achieve – to take TIME – and, as accusations in many a forum have it, to drive the cost of organic SEO UP to levels comparable with, and beyond, those of Adwords.

Just recently I advised a client to do no.4 (4. moving low-quality pages to a different domain) and move an old blog with zero positive signal for the business to another domain to expedite the ‘this content isn’t here anymore – don’t rate my site on this‘ process for re-evaluation by Google.

Site quality problems are, BY DESIGN, MEANT to take a long time to sort out – JUST LIKE GOOGLE PENGUIN and the clean up of unnatural links – but, contrary to the complaints of many webmasters who accuse Google of being opaque on this subject, Google tells you exactly how to fix Google Panda problems and Matt Cutts has been telling people all along to “focus on the user” – which again is probably an absolute truth he can feel he is morally correct in relaying to us (which many guffawed at as lies).

If a site quality algorithm was deployed in this fashion, then punishment would be relative to the infraction and cause the maximum amount of problems for the site owner relative to the methods used to generate rankings. All that once helped a site rank could be made to demote it and hold it under the water, so to speak. In a beautiful system, I think, you would actually be penalising yourself, rather than Google penalising you, and users would indeed determine the final ranking order in organic SERPs, rather than it being ‘sharded’ by Google for their own benefit.

We would, of course, need to assume Google has a ‘Quality Metric‘ separate from relevance signals that is deployed in this fashion.

Guess what?

If your site is impacted by this shortcut effect, then identifying important pages in your hierarchy and user journey and improving them is a sensible way to proceed, as you may well be providing important signals for other pages on your site, too.

Why does Amazon rank for everything during these updates? That is the accusation, at least, and this theory would have an answer.

I would presume that when you remove an important signal from your website, you don’t have many other pages that provide said signals. Amazon ALWAYS has multiple higher-quality pages on the same topic, and so other signals to fall back on – EASY for an algorithm not to f*&^ up. Amazon, too, probably has every other positive signal in bucket loads, let’s not forget.

Site quality algorithms deployed in this manner would be a real answer to a wayward use of ‘domain authority’, I’ve long thought.

What about webmasters who have, in good faith, targeted low-quality out of date content on a site and removed it, in order to combat Panda problems?

This was natural after Google said in effect to clean up sites.

I imagine somewhere in Google’s algorithm there is a slight reward for this activity – almost as if Google says to itself, “OK, this webmaster has cleaned up the site and brought the number of lower quality pages down, thereby incrementally improving quality scores, so we will allow traffic levels to improve” – but NOT to the extent that would ever bring back traffic levels to a site hit by Content Quality Algorithms (after May 2015, especially).

Google, I think, must seek to reward white hat webmasters (on some level) if the intent is to adhere to the rules (even if those recommendations have been slightly misunderstood) or what is the point of listening to them at all? Most distrust Google and most evidently consistently fail to understand the advice given.

Again – if my theory held water, there would be a lot of webmasters who spent a lot of time cleaning up sites to comply with Panda who DO see positive numbers month to month in terms of increased organic traffic to a site – but rarely do they see a quick return to former ranking glory WITHOUT a severe investment in sitewide page quality improvement.

I have certainly observed this.

From my own experience – you cannot just delete pages to bring traffic levels back to a site in the same numbers after a Panda algorithm change that impacts your site. You must also improve remaining content, substantially.

It is baked into the system that if you have ranked with low-quality techniques, it is going to take a monumental effort deployed quickly to get your site moving again.

Confirmation Bias? Conspiracy theory?

You can tell me.

This theory posits that the site quality metric can shortcut your site and cause irrelevant pages to rank, and maybe even relevant pages to rank lower than they could if ALL the pages on the site were high quality.

Is this theory just wishful thinking because I currently sell site quality audits and my research revolves around white hat SEO testing? I’ve been educating myself on entity optimisation, too. I have loved SEO for over 15 years, but site quality optimisation and sifting through the rubble of toxic backlinks is not exactly what I signed up for, even as my services adjusted to the changes in Google since 2013/14 as I perceived them.

I can confirm I offer this information in return for building my topical relevance on my subject – that is my primary motive. That has always been valuable. I perceive this to be important going forward. Publishing erroneous or harmful theories wouldn’t achieve that, for me.

In my audits, I basically prioritise tasks by impact and risk factor that a webmaster would need to address to achieve long term rankings. At the same time, I need to educate people as to why exactly their site is probably rated garbage-level.

I’ve deployed these tactics on this very site over the last few years to see if it drove traffic (and this is why I do SEO the way I currently do it):

Screenshot 2016-05-25 23.37.34

SEO has become part of the legitimate long-term marketing mix, with quick or even manipulated results now, at the very least, a potentially business-damaging exercise.

Conversely, though, if you achieve ranking nirvana through improved, legitimate site and content quality efforts, you can rest assured the effort required to dislodge you is probably going to be relative to the effort you put in (if applied correctly in the first place) and a great barrier to entry for all but your most dedicated competitors.

On that point, those that do rank and Google themselves are more than happy with that scenario.

I did like it when SEO was fast, gratification was instant and risk was a distant prospect. Today risk always seems close by, and gratification is an increasingly distant prospect.

I am wondering if the black hat SEO tests might be more fun.

How To Proceed

Improve pages in the user journey. Try to NOT present low-quality pages to users.

I think it reasonable to say that identifying DEAD PAGES is still an incredibly important first step, but how you handle the challenge from there is of equal importance.

Removal of irrelevant, out of date, obviously low-quality content on a domain should still be a prerequisite for most webmasters. Canonicals and redirects are STILL your VERY BEST FRIEND when merging ANY PAGES (but pay close attention to Google dicking about with your redirect chains, or all your work goes in the toilet).
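On that last point, a simple way to keep an eye on your redirect chains when merging pages is something like the sketch below (hypothetical URLs; it assumes the Python requests library). Anything longer than a single hop – or anything that never reaches a 200 – is worth flattening rather than leaving for Google to follow.

```python
# Sketch: audit redirect chains for a list of old/merged URLs. Flags any URL
# that takes more than one hop, or that never reaches a 200. URLs hypothetical.
import requests

old_urls = [
    "https://www.example.com/old-page/",
    "https://www.example.com/merged-article/",
]

def audit_redirect_chain(url):
    resp = requests.get(url, timeout=10, allow_redirects=True)
    hops = [(r.status_code, r.url) for r in resp.history]
    return {
        "url": url,
        "hops": len(hops),
        "chain": hops + [(resp.status_code, resp.url)],
        "final_status": resp.status_code,
    }

if __name__ == "__main__":
    for url in old_urls:
        report = audit_redirect_chain(url)
        if report["hops"] > 1 or report["final_status"] != 200:
            print("CHECK:", report)   # chain longer than one hop, or broken
        else:
            print("OK:   ", report)
```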

Paying close attention to where important signals lie on your site is the only way to attempt to protect them, especially during periods of change, where Google is happy to shortcut your ranking ability.

If you fail to preserve certain signals during changes, you remove signals, and irrelevant pages can rank instead of relevant pages.

If you ultimately don’t improve pages in a way that satisfies users, your quality score is probably coming down, too.

For more on site quality issues, see my Google Panda recovery post.



FREE REVIEW

You can give your site a quick technical SEO audit yourself with our free SEO tool. It will check your site for any obvious technical problems and offer some advice on how to deal with any issues it finds.

Test Your Site