How to Check Which Links Can Harm Your Site’s Rankings

Posted by Modesto Siotos

Matt Cutts' statement in March 2012 that Google would be rolling out an update against “overoptimised” websites caused great turmoil within the SEO community. A few days later, thousands of blogs were removed from Google's index, and Matt tweeted to confirm that Google had started taking action against blog networks.

Even though thousands of low-quality blogs of low or average authority were manually removed from Google's index, they weren't the only victims. For instance, www.rachaelwestdesigns.com, a PR7, DA70 domain, was also removed, probably due to its very high number of blogroll (site-wide) backlinks.

These actions indicate that the new update on "overoptimised" websites has already begun to roll out but it is uncertain how much of it we have seen so far.

At around the same time, Google sent thousands of webmasters the following message via Google's Webmaster Tools:

In the above statement, it is unclear what Google’s further actions will be. In any case, working out the number of “artificial” or “unnatural links” with precision is a laborious, almost impossible task. Some low-quality links may not be reported by third-party link data providers. Even worse, because Google has started deindexing several low-quality domains, the task can end up being a real nightmare, as several domains cannot be found even in Google's index.

Nevertheless, there are some actions that can help SEOs assess the backlink profile of any website. Because, in theory, any significant number of low-quality links could hurt, it makes sense to gather as much data as possible rather than examine only the most recent backlinks. Several thousand domains have already been removed from Google's index, resulting in millions of links being completely devalued, according to Distilled's Tom Anthony (2012 Linklove).

Therefore, the impact on the SERPs has already been significant and, as always happens on these occasions, there will be new winners and losers once the dust settles. However, at this stage it is a bit early to draw any conclusions because it is unclear what Google's next actions are going to be. Nevertheless, getting ready for those changes makes perfect sense, and spotting them as soon as they occur would allow for quicker decision making and immediate action as far as link building strategies are concerned.

As Pedro Dias, an ex-Googler from the search quality/web spam team, tweeted, "Link building, the way we know it, is not going to last until the end of the year" (translated from Portuguese).

The Right Time For a Backlinks Risk Assessment

Carrying out a backlinks audit in order to identify the percentage of low-quality backlinks would be a good starting point. A manual, thorough assessment would only be possible for relatively small websites as it is much easier to gather and analyse backlinks data – for bigger sites with thousands of backlinks that would be pointless. The following process expands on Richard Baxter's solution on 'How to check for low quality links', and I hope it makes it more complete.

  1. Identify as many linking root domains as possible using various backlinks data sources.
  2. Check the ToolBar PageRank (TBPR) for all linking root domains and pay attention to the TBPR distribution.
  3. Work out the percentage of linking root domains that have been deindexed.
  4. Check social metrics distribution (optional).
  5. Repeat steps 2, 3 and 4 periodically (e.g. weekly, monthly) and check for the following:
  • A spike towards the low end of the TBPR distribution
  • Increasing number of deindexed linking root domains on a weekly/monthly basis
  • Social metrics that remain unchanged at very low levels

A Few Caveats

The above process does come with some caveats, but on the whole it should provide some insight and help with a backlink risk assessment in order to work out a short- or long-term action plan. Even though the results may not be 100% accurate, it should be fairly straightforward to spot negative trends over a period of time.

Data from backlinks intelligence services have flaws. No matter where you get your data from (e.g. Majestic SEO, Open Site Explorer, Ahrefs, Blekko, Sistrix), there is no way to get the same depth of data Google has. Third-party tools are often not up to date, and in some cases the linking root domains are not even linking back anymore. Therefore, it makes sense to filter all identified linking root domains and keep only those still linking to your website. At iCrossing we use a proprietary tool, but there are commercial link check services available in the market (e.g. Buzzstream, Raven Tools).
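For readers without access to a link check service, the "still linking" test can be approximated in a few lines of Python. This is only a minimal sketch and not the proprietary tool mentioned above: the function name `still_links_to` is made up for illustration, and the page HTML is assumed to have been downloaded already by whatever crawler you use.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def still_links_to(html, target_domain):
    """Return True if the HTML contains a link to target_domain or a subdomain of it.

    Hypothetical helper: it only inspects <a href> attributes, so links
    injected by JavaScript or hidden behind redirects are not detected.
    """
    collector = LinkCollector()
    collector.feed(html)
    target = target_domain.lower()
    for href in collector.hrefs:
        host = urlsplit(href).hostname or ""  # relative links have no hostname
        if host == target or host.endswith("." + target):
            return True
    return False


page = '<p>See <a href="http://www.example.com/guide">this guide</a>.</p>'
print(still_links_to(page, "example.com"))
```

A plain-text mention of the domain does not count as a link here, which is deliberate: only actual anchors pass any value.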

ToolBar PageRank gets updated infrequently (roughly 4-5 times a year), so in most cases the returned TBPR values represent the TBPR the linking root domain gained in the last TBPR update. It would therefore be wise to check when TBPR was last updated before drawing any conclusions. Carrying out the above process straight after a TBPR update would probably give more accurate results. However, in some cases Google may instantly drop a site's TBPR in order to make public that the site violates their quality guidelines and to discourage advertisers. As a result, low TBPR values such as n/a (greyed out) or 0 can in many cases flag up low-quality linking root domains.

Deindexation may be natural. Even though Google is currently deindexing thousands of low-quality blogs, coming across a website with no indexed pages in Google's SERPs doesn’t necessarily mean that it has been penalised. It may be an expired domain that no longer exists, an accidental deindexation (e.g. a meta robots noindex on every page of the site), or some other technical glitch. However, deindexed domains that still have a positive TBPR value could flag websites that Google has recently removed from its index due to guideline violations (e.g. link exchanges, PageRank manipulation).

Required Tools

For large data sets, NetPeak Checker performs faster than SEO Tools, where large data sets can make Excel freeze for a while. NetPeak Checker is a standalone free application which provides very useful information for a given list of URLs, such as domain PageRank, page PageRank, Majestic SEO data, OSE data (PA, DA, mozRank, mozTrust etc.), server responses (e.g. 404, 200, 301), number of indexed pages in Google and a lot more. All results can then be exported and processed further in Excel.

1. Collect linking root domains

Identifying as many linking root domains as possible is fundamental, and relying on just one data provider isn't ideal. Combining data from Webmaster Tools, Majestic SEO and Open Site Explorer may be enough, but the more data the better, especially if the examined domain has been around for a long time and has received a large number of backlinks over time. Multiple backlinks from the same linking root domain should be collapsed so that we end up with a long list of unique linking root domains. Linking root domains that return not found (404) should also be removed.
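The merge-and-deduplicate step can be sketched in Python as follows. Treat it as an illustration under stated assumptions: the function names are invented for this example, each source is assumed to be a plain list of backlink URLs exported from a tool, and the "root domain" is approximated by the hostname with a leading "www." stripped. A stricter approach would consult the Public Suffix List, so that blog.example.co.uk and example.co.uk collapse into one entry.

```python
from urllib.parse import urlsplit


def naive_root_domain(url):
    """Return the lowercased hostname without a leading 'www.'.

    Simplification (assumption): subdomains are NOT folded into their
    registered domain; that would require the Public Suffix List.
    """
    # urlsplit only recognises the hostname when a scheme or '//' is present.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def unique_linking_domains(*sources):
    """Merge any number of backlink URL lists into a sorted list of unique domains."""
    seen = set()
    for source in sources:
        for url in source:
            domain = naive_root_domain(url)
            if domain:
                seen.add(domain)
    return sorted(seen)


# Made-up sample exports from two different data providers.
provider_a = ["http://www.example.com/a", "https://example.com/b"]
provider_b = ["blog.example.org/post", "http://WWW.Example.com/c"]
print(unique_linking_domains(provider_a, provider_b))
```

Removing 404 domains is left out on purpose, since it needs live HTTP requests; NetPeak Checker's server response column covers that part.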

2. Check PageRank distribution

Once a good number of unique linking root domains has been identified, the next step is scraping the ToolBar PageRank for each one of them. Ideally, this step should be applied only to those root domains that are still linking to our website. The ones that don't should be discarded, if doing so is not too complicated. Then, using a pivot chart in Excel, we can conclude whether the current PageRank distribution should be a concern or not. A spike towards the lower end values (such as 0s and n/a) should be treated as a rather negative indication, as in the graph below.
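Outside Excel, the same distribution can be tallied with a short script. The sketch below assumes the TBPR values have already been exported (for example from NetPeak Checker) into a mapping of domain to value, with `None` standing in for n/a; the function names and the choice of "n/a or 0" as the low end are illustrative assumptions, not a fixed rule.

```python
from collections import Counter


def tbpr_distribution(tbpr_by_domain):
    """Count how many linking root domains fall into each TBPR bucket.

    A value of None is reported under the label 'n/a'.
    """
    return Counter("n/a" if value is None else value
                   for value in tbpr_by_domain.values())


def low_end_share(tbpr_by_domain):
    """Fraction of domains whose TBPR is n/a or 0 (assumed 'low end')."""
    total = len(tbpr_by_domain)
    if total == 0:
        return 0.0
    low = sum(1 for value in tbpr_by_domain.values()
              if value is None or value == 0)
    return low / total


# Made-up sample data for illustration only.
sample = {"a.com": None, "b.com": 0, "c.com": 3, "d.com": 3}
for bucket, count in tbpr_distribution(sample).items():
    print(bucket, count)
print("low-end share:", low_end_share(sample))
```

Tracking the low-end share as a single number makes it easier to compare one audit with the next than eyeballing two charts.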

3. Check for deindexed root domains

Working out the percentage of linking root domains which are not indexed is essential. If deindexed linking root domains still have a positive TBPR value, most likely they have been recently deindexed by Google.
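As a rough sketch, both checks in this step come down to simple filters over the exported data. The `DomainRecord` structure and the function names below are assumptions made for the example; the indexed page counts and TBPR values would come from whichever tool you use.

```python
from typing import NamedTuple, Optional


class DomainRecord(NamedTuple):
    domain: str
    indexed_pages: int    # number of pages Google reports as indexed
    tbpr: Optional[int]   # None when the toolbar shows n/a


def deindexed_percentage(records):
    """Percentage of linking root domains with no indexed pages."""
    if not records:
        return 0.0
    deindexed = sum(1 for r in records if r.indexed_pages == 0)
    return 100.0 * deindexed / len(records)


def likely_recently_deindexed(records):
    """Domains with no indexed pages that still show a positive TBPR."""
    return [r.domain for r in records
            if r.indexed_pages == 0 and r.tbpr is not None and r.tbpr > 0]


# Made-up sample data for illustration only.
records = [
    DomainRecord("a.com", 0, 3),
    DomainRecord("b.com", 0, None),
    DomainRecord("c.com", 120, 4),
    DomainRecord("d.com", 15, 0),
]
print(deindexed_percentage(records))
print(likely_recently_deindexed(records))
```

Bear in mind the caveat above: a domain with zero indexed pages may simply have expired, so the flagged list is a starting point for manual review rather than a verdict.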

4. Check social metrics distribution (optional)

Adding the social metrics (e.g. Facebook Likes, Tweets and +1s) of all identified linking root domains to the mix may be useful in some cases. The basic idea here is that low-quality websites would have a very low number of social mentions, as users wouldn't find them useful. Linking root domains with few or no social mentions could therefore point towards low-quality domains.
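One possible way to apply this idea is to sum the social counts per domain and flag anything under a cut-off. The sketch below is only illustrative: the function name, the metric keys and the default threshold of 5 are all assumptions, and a sensible threshold will depend on the niche.

```python
def low_social_domains(social_counts, threshold=5):
    """Return domains whose combined social mentions fall below threshold.

    social_counts maps each domain to a dict of metric name -> count.
    The threshold is an arbitrary illustrative value, not a recommendation.
    """
    flagged = []
    for domain, metrics in social_counts.items():
        if sum(metrics.values()) < threshold:
            flagged.append(domain)
    return sorted(flagged)


# Made-up sample data for illustration only.
sample = {
    "a.com": {"likes": 0, "tweets": 1, "plus_ones": 0},
    "b.com": {"likes": 120, "tweets": 40, "plus_ones": 9},
}
print(low_social_domains(sample))
```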

5. Check periodically

Repeating steps 2, 3 and 4 on a weekly or monthly basis could help identify whether there is a negative trend due to an increasing number of linking root domains being removed. If both the PageRank distribution and deindexation rates are deteriorating, sooner or later the website will experience ranking drops that will result in traffic loss. A weekly deindexation rate graph like the following one could give an indication of the degree of link equity loss:

Note: For more details on how to set up NetPeak and apply the above process using Excel, please refer to my post on Connect.icrossing.co.uk.
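For those who prefer a script to a spreadsheet, the periodic check can be reduced to comparing successive snapshots. The sketch below assumes you record one deindexation rate (as a percentage) per week; the function names and the "every week is worse than the last" rule are illustrative assumptions, and a looser rule may suit noisy data better.

```python
def week_over_week_changes(rates):
    """Differences between consecutive weekly deindexation rates (percentage points)."""
    return [round(later - earlier, 2)
            for earlier, later in zip(rates, rates[1:])]


def is_deteriorating(rates):
    """True when the rate rose in every recorded week-over-week comparison."""
    changes = week_over_week_changes(rates)
    return bool(changes) and all(change > 0 for change in changes)


# Made-up weekly snapshots for illustration only.
weekly_rates = [2.0, 3.5, 5.0, 8.0]
print(week_over_week_changes(weekly_rates))
print(is_deteriorating(weekly_rates))
```

The same comparison works for the low-end TBPR share, so both indicators can be tracked side by side.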

Remedies & Actions

So far, several websites have seen ranking drops as a result of some of their linking root domains being removed from Google's index. Linking root domains with very low PageRank values and low social shares over a period of time should be manually/editorially reviewed in order to assess their quality. Such links are likely to be devalued sooner or later, so a new link building strategy should be devised. Working towards a more balanced PageRank distribution should be the main objective, bearing in mind that links from low-quality websites will keep coming naturally to some extent.

In general, the more authoritative and trusted a website is, the more low-quality linking root domains could be linking to it without causing any issues. Big brands' websites are less likely to be impacted because they are more trusted domains. That means that low authority/trust websites are more at risk, especially if most of their backlinks come from low-quality domains, if they have a high number of site-wide links, or if their backlink profile shows an unnatural anchor text distribution.

Therefore, if any of the above issues have been identified, increasing the website's trust, reducing the number of unnatural site-wide links and making the anchor text distribution look more natural should be the primary remedies.

About the author

Modesto Siotos (@macmodi) works as a Senior Natural Search Analyst for iCrossing UK, where he focuses on technical SEO issues, link tactics and content strategy. Modesto is happy to share his experiences with others and posts regularly on Connect, a UK digital marketing blog.
