
Google’s Disavow Conundrum - SEO Guide


What you’re about to read is excerpted from an older, longer premium article. The excerpts provided below omit details and context provided by even older articles from which they were taken. Be careful not to infer meaning beyond what you see here.

I received the following question (reformatted for this article): “Michael, you’re convinced no one knows how to disavow properly. What would you disavow?”

Long-time subscribers to the newsletter may recall that I’ve shared my criteria for identifying spammy links in the past. Obviously to answer this question I’ll have to go down the list again. But first let me summarize what can happen with link spam:

  1. The search engine never sees it, so no effect
  2. The search engine indexes it w/o sufficient PageRank-like value, so no effect
  3. The search engine indexes and accepts the spam, so the links pass value
  4. The search engine rejects the links and never indexes them, so no effect
  5. The search engine indexes the spam but doesn’t trust it, so the links DO NOT pass value
  6. The search engine identifies a pattern and penalizes the destination
  7. The search engine identifies the spam as such and assigns a NEGATIVE value

Item 3 is important. Spam sometimes helps. It still works in 2019 (Update: Yes, in 2023, too). Spam may always work.
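If you track link-spam outcomes during an audit, one way to keep these seven cases straight is a simple enumeration. This is just a minimal sketch in Python; the names and the effect groupings are my own shorthand for the list above, not anything Google publishes.

```python
from enum import Enum

class SpamLinkOutcome(Enum):
    """The seven possible fates of a spammy link, per the list above."""
    NOT_SEEN = "never crawled"                  # 1: no effect
    INDEXED_NO_VALUE = "indexed, no PageRank"   # 2: no effect
    ACCEPTED = "indexed and accepted"           # 3: passes value
    REJECTED = "rejected, never indexed"        # 4: no effect
    DISTRUSTED = "indexed but not trusted"      # 5: passes no value
    PATTERN_PENALTY = "pattern detected"        # 6: destination penalized
    NEGATIVE_VALUE = "flagged, negative value"  # 7: negative value assigned

# Only outcome 3 helps; 6 and 7 actively hurt; the rest are neutral.
HELPS = {SpamLinkOutcome.ACCEPTED}
HURTS = {SpamLinkOutcome.PATTERN_PENALTY, SpamLinkOutcome.NEGATIVE_VALUE}
```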

To the best of my knowledge, Google has never explicitly stated that links can pass negative value but they’ve come close several times. I first openly speculated about “Negative PageRank” in 2011 after the Panda algorithm was released. [When I asked Matt Cutts about this, he pointedly said nothing – leaving us to speculate.]

With the Penguin 1.0 algorithm they went after “Home Page Backlinks”, specifically the blog networks that were publishing entire blog posts on the home page. That was a very easy pattern to identify. Google came up with that idea after the spam team manually tracked down and delisted thousands of paid HPBL networks in March and April 2012.

Penguins 2.0 and 3.0 went deeper into sites and apparently refined or expanded the patterns they were looking for. The fact that Google was relying on patterns means they implemented machine learning sometime in the process, probably with Penguin 1.0.

Up to this point, to recover from a Penguin downgrade you had to get rid of the links. In early 2012 I asked Matt Cutts why he wouldn’t give us a Disavow tool. He said he wasn’t sure it would be used correctly, or if there was even a real need for one. That was when I began talking about “toxic links” in earnest. A Toxic Link (by my definition) is one that the search engine has identified as bad and is using to punish the Website that was seeking to benefit from it.

Penguins 1.0-3.0 appear to have assigned Negative PageRank-like value to the links they identified. Hence, Penguin-marked links were Toxic Links. Unfortunately, people in the SEO community took “toxic links” to mean something else (or it evolved into something else). I rarely speak of “toxic links” any more because there really isn’t a need to.

Penguin 4.0 introduced a continuous evaluation of links (essentially a new document classifier that is run on every Web page as or soon after it is crawled). Penguin 4.0 also reversed the polarity. Instead of using the identified spam links to punish Websites they are simply ignored, dropped from the link graph.

Some people believed that Google may have “grandfathered” the old toxic links from Penguins 1.0-3.0 into the link graph. It could be that Google merely needed time to re-crawl the Web and apply Penguin 4.X to those links, and until that happened their negative values continued to suppress Website rankings. If so, the Grandfather Effect was incidental rather than intentional.

So the disavowals continued after Penguin 4.0 rolled out, but the justification for them has declined. Over the past 2 years (Update: add another 3 years for 2023), as people have continued to disavow links, they have become more likely to complain about negative effects of Google updates. In other words, when Google changes something, many (but by no means all) of the people who continued disavowing links find themselves losing traffic.

Without the good value-passing links that were disavowed, Websites become more volatile and unstable in the search index. It’s inevitable that they’ll experience more fluctuations in search visibility and rankings.

What Would I Disavow?

Notice that John Mueller advises people to preemptively disavow links they believe would lead to a manual action. He knows what would lead to a manual action. Anyone who is buying and selling links should also know what would lead to a manual action.

Everything else is acceptable.

But what about PBN (Private Blog Network) and GBN (Guest Blog Network) links? Won’t they lead to manual actions? In my opinion that depends on the PBN/GBN. If you’ve been operating a PBN for the last 5 years and have never been delisted or received a manual action notice, you have no reason to disavow your own links. They are still spam. Building private blog networks for search-influencing links is a link scheme, and that violates Google’s search engine guidelines.

But just because a link violates the guidelines doesn’t mean the algorithms will see it as a violation (they’re not perfect). And even if they do see it as a violation, they may decide it isn’t severe enough to warrant a manual action.

At worst your unpenalized PBN (and GBN) links are passing no value (neither negative nor positive). At best they are passing positive value.

I see no reason to disavow in a situation like that.

On the other hand, if the destination site(s) received a manual action (an actual penalty) then I would go ahead and disavow the PBN/GBN links (or add “nofollow” attributes). One would need to do something before filing a reconsideration request.

I would keep the sites online if they were sending non-search referrals. You’re not breaking any laws by spamming the search engines and if you neutralize the links Google doesn’t care what happens between your Websites. All they care about is the set of links they have identified as violating their guidelines.

Characteristics of Spammy Links

In Volume 6, Issue 33 (September 1, 2017) we published an article titled “Free Web Hosting for Links”. Here is the concluding section of that article:

[ Citation ]

What is the best way to identify links that should be disavowed? Well, begin with your own quality test for the linking site:

  1. Does the site seem to legitimately represent someone’s voice?
  2. Does the linking article provide good information regardless of how well it is written?
  3. Does the linking article present what appears to be a valid opinion?
  4. Do the links in the article support its points or explain vague references?
  5. Does the linking site publish more than 1 article?

If you can answer “YES” to all of the above, my first inclination is to leave the links alone. It may not be a high-quality site, but that doesn’t mean it’s spam, it doesn’t mean it’s toxic, and it doesn’t mean it’s hurting your (client) site. These are the kinds of questions you should ask yourself, but come up with your own list that represents your own values.

Next, apply a toxicity test for the linking site:

  1. Is it a hastily made site that obviously won’t ever be updated?
  2. Do the links only use targeted competitive keyword-rich anchor text?
  3. Is the site similar in design, content, tone, and style to others you find in the backlink profile?
  4. Do you know that the link was paid for?
  5. If you scan the backlinks for the linking site, do they look even more suspicious?

If you can answer “YES” to all of the above, then my first inclination is to assume the links are spammy, easily identified, and probably toxic. These are probably common-sense talking points for anyone with a few years or more of SEO experience. Voluminous link spam looks very, very bad. People still use it, but mostly to propel their “Web 2.0 links” into Bing and Google’s good graces (they hope). They use the Web 2.0 sites for “link whitewashing” or “link laundering”.
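If you want to apply the quality and toxicity checklists above at scale during a link audit, a minimal sketch might look like the following. The LinkingSite record and its field names are hypothetical; you would populate them however your own crawl or backlink export reports the data, and the final call should still be a human one.

```python
from dataclasses import dataclass

@dataclass
class LinkingSite:
    # Hypothetical audit fields, one per checklist question above.
    represents_real_voice: bool         # quality 1
    provides_good_information: bool     # quality 2
    presents_valid_opinion: bool        # quality 3
    links_support_points: bool          # quality 4
    publishes_multiple_articles: bool   # quality 5
    hastily_made: bool                  # toxicity 1
    only_keyword_anchors: bool          # toxicity 2
    matches_other_backlink_sites: bool  # toxicity 3
    known_paid_link: bool               # toxicity 4
    suspicious_backlinks: bool          # toxicity 5

def triage(site: LinkingSite) -> str:
    passes_quality = all([
        site.represents_real_voice,
        site.provides_good_information,
        site.presents_valid_opinion,
        site.links_support_points,
        site.publishes_multiple_articles,
    ])
    fails_toxicity = all([
        site.hastily_made,
        site.only_keyword_anchors,
        site.matches_other_backlink_sites,
        site.known_paid_link,
        site.suspicious_backlinks,
    ])
    if passes_quality:
        return "leave alone"             # passes the quality test outright
    if fails_toxicity:
        return "candidate for disavow"   # "YES" to every toxicity question
    return "needs human judgment"        # mixed answers: review manually
```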

These links do (sometimes) appear to still work. I know because we have a reputation management client whose search results went to hell when a hostile entity spammed Google with a lot of crappy links. Those crappy links don’t always meet the above criteria but they come pretty close.

There is what I would call a Character Test for the linking content.

  • Does it look out of place?
  • Does it feel like it was obviously created for only a specific purpose?

You’re more likely to encounter obvious link spam that fails the Character Test in a hostile reputation management campaign.

Call that Negative SEO, if you will. It’s just dirty, rotten content that would not normally be where it is if someone were not trying to torpedo someone else (or a Website). Someone once linked to my personal domain (Michael-Martinez.Com) with 30,000 links accusing me of being a pedophile and bestiophile (via their link anchors). These links were in forum comments, social media profile pages, and other easily spammed content.

Honest links may be vitriolic but the vitriol tends to be of a different nature. “He’s a lying jerk” is a more honest link anchor than “this pedophile child porn”, in my opinion.

In Volume 6, Bonus Issue 4 (December 29, 2017) we published an article titled “The Risks of Using Nofollow Links”. Here is the concluding section of that article:

[ Citation ]

…if you’re discussing the risks of “nofollow” links with clients or business decision-makers, consider the following order of risk severity:

  1. Known spam links to known link spam recipient
  2. Known spam links to UNknown link spam recipient (not yet caught)
  3. Unknown spam (passes filters) links to known link spam recipient
  4. Unknown spam links to UNknown link spam recipient (not yet caught)
  5. [Some spam] links to non-spam link recipient [INNOCENT]
  6. Non-spam links to known link spam recipient [- probably no harm -]
  7. Non-spam links to UNknown link spam recipient (not yet caught)
  8. Non-spam links to non-spam link recipient

Assuming these are all “nofollow” links, none of them should have any immediate, direct effect on the receiving Websites. But in the first scenario Google may be counting noses and taking names. Otherwise, I don’t think there is (yet) any risk from receiving “nofollow” links. Any such risk should be very rare, although obviously any site that is caught passing or receiving spammy links (even if they use “nofollow”) will move up the risk scale.
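If you want to sort an audit spreadsheet by that ordering, here is a minimal sketch. The two-axis model (whether the link itself is spam, and whether the recipient is a known, not-yet-caught, or non-spam recipient) and the rank numbers simply encode the list above; the dictionary keys and function name are my own.

```python
# Risk rank for "nofollow" link scenarios: 1 = highest concern, 8 = lowest.
# Lower numbers are where Google may be "counting noses and taking names".
RISK_RANK = {
    ("known spam",   "known spam recipient"):    1,
    ("known spam",   "uncaught spam recipient"): 2,
    ("unknown spam", "known spam recipient"):    3,
    ("unknown spam", "uncaught spam recipient"): 4,
    ("known spam",   "non-spam recipient"):      5,  # innocent recipient
    ("unknown spam", "non-spam recipient"):      5,  # "[Some spam]" covers both
    ("non-spam",     "known spam recipient"):    6,  # probably no harm
    ("non-spam",     "uncaught spam recipient"): 7,
    ("non-spam",     "non-spam recipient"):      8,
}

def nofollow_risk(link_status: str, recipient_status: str) -> int:
    """Look up the severity rank for a nofollow-link scenario."""
    return RISK_RANK[(link_status, recipient_status)]
```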

Yes, people sometimes ask if they should disavow “nofollow” links. It’s a harmless time-wasting exercise.

In Volume 5, Issue 38 (October 14, 2016) we published an article titled “SEO Micro Case Study No. 132 (Marketing Land Podcast)”. We dissected the key takeaways from the podcast, which was an interview with Googler Gary Illyes. Here are excerpts from our article:

[ Citation ]

Key takeaways from part one include:

  1. Penguin is looking at the linking site (quality signals associated with the linking page) not the site receiving links
  2. Google annotates links with “labels” such as “footer link”, “Penguin real-time”, “disavowed link”, and many more
  3. Gary Illyes pronounced his name as “(e)+ISH” and Danny Sullivan pronounced it as “ee-ish”
  4. Google prefers to ignore a modest number of spammy links through the algorithm
  5. If they detect (maybe 70%) spammy links the Webspam team may apply a manual penalty for egregious behavior
  6. Penguin 4.0 looks at other types of spam (on the source site) than just links
  7. By the time the podcast was recorded only one data center had not yet picked up the new Penguin algorithm

Gary mentioned a negative case study that had been submitted to Google for evaluation. He said that Penguin was devaluing the links and the site was not being hurt by the links. In an unrelated case study, another site received a manual action because of a negative SEO campaign. The company was able to quickly disavow the links and get the penalty lifted. I think Penguin may make it easier for people to see when negative SEO is happening because of the manual actions. Of course, someone might conduct multiple consecutive negative SEO campaigns against you to build up a history of penalties with Google. Gary maintains that they still have not seen negative SEO “working as people think that they should work”.

Key takeaways from part two include:

  1. Most people cannot identify the good directories
  2. Keep your directory links to a small percentage
  3. Legitimate business directories should be okay
  4. Machine learning will not take over the core algorithm
  5. Expect more launches around Google AMP and structured data
  6. Click through rates are “drastically lower” to AMP pages
  7. Larger, well known sites may see better CTRs for their AMP pages than small sites
  8. Machine learning is used to identify new signals and signal aggregations
  9. RankBrain “reranks based on historical signals” that “comes up with a new multiplier”
  10. Panda “measures the quality of a site by looking at a majority of the pages”
  11. Panda only demotes sites; it cannot promote sites (negative signal only)
  12. Panda concludes the site is trying to game the system
  13. Small business sites tend to adapt to new guidelines and signals faster than large business sites
  14. Technical SEO tools like structured data and AMP don’t improve rankings but may get you into more widgets
  15. The “HTTPS signal” just looks at the first 5 characters in the URL
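Takeaway 15 is concrete enough to illustrate literally. This is a minimal sketch of my reading of that claim: the check is reportedly string-level (the literal characters “https”), not certificate-level.

```python
def https_signal(url: str) -> bool:
    # Per takeaway 15: reportedly only the first 5 characters of the URL
    # are examined, i.e. the literal string "https".
    return url[:5].lower() == "https"

assert https_signal("https://example.com/page") is True
assert https_signal("http://example.com/page") is False
```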

In Volume 3, Bonus Issue 4 (October 31, 2014) we published an article titled “SEO QUESTION NO. 177 (What is the Difference between Interferometry and Correlation Analysis?)” Part of the article dealt with trustworthy vs. untrustworthy links:

[ Citation ]

For example, a trustworthy link may be:

  1. Given without incentive
  2. Part of critical on-site navigation
  3. Using unhelpful anchor text
  4. Rare

An untrustworthy link may:

  1. Only be given with incentive
  2. Be superfluous or unnecessary
  3. Use targeted anchor text
  4. Be common (sitewide or found on many sites)

In Volume 4, Issue 11 (March 20, 2015) we published an article titled “SEO QUESTION NO. 215 (How to Decide if Something is Natural)”. Here is an excerpt from the article:

[ Citation ]

So here is how I evaluate links for naturality:

  1. Would I normally include a link like this on my personal Websites?
  2. Would I normally place a link like this at this spot on the page?
  3. Would I normally place this link next to this kind of content?
  4. Is this the kind of content I would normally create for myself or my clients?
  5. Does this content appear to have a non-manipulating purpose?

And another excerpt:

[ Citation ]

Option 2: Instead, let’s look at some acid tests you can apply without having to crawl the Web, build spreadsheets, etc.

  1. How prominently is the linking page featured on the Website? If the Website doesn’t make the page seem important then that is a red flag.
  2. Does the page get a lot of social media attention beyond self-promotion? If you can’t find gratuitous social media activity for the page then that is a red flag.
  3. Does the page seem exceptional compared to other content on the site? Are the articles random in nature or is it just one out of many articles about how to bake cakes? Content that is largely diverse is a red flag.
  4. How much care was given to the details of the site design? If the site owner made little to no effort to give the design some personality then that is a red flag.
  5. How many links are embedded in the page content? If it’s just an article and you see 2-3 offsite links in every paragraph then that is a red flag.
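For anyone who wants to tally those acid tests while reviewing linking pages, here is a minimal sketch. The keys, descriptions, and the idea of counting flags are just my own way of operationalizing the list above; how many flags warrant action remains a judgment call.

```python
# One entry per acid test above; values describe the red flag.
ACID_TESTS = {
    "buried_on_site": "page is not featured prominently on the site",
    "no_organic_social_activity": "no social attention beyond self-promotion",
    "content_out_of_place": "page diverges oddly from the rest of the site",
    "careless_design": "little or no care given to the site design",
    "link_stuffed_paragraphs": "2-3 offsite links in every paragraph",
}

def red_flags(answers: dict[str, bool]) -> list[str]:
    """Return the descriptions of every acid test the page failed."""
    return [desc for key, desc in ACID_TESTS.items() if answers.get(key)]

# Example usage with hypothetical reviewer judgments:
flags = red_flags({"buried_on_site": True, "link_stuffed_paragraphs": True})
# -> two red flags; how many justify distrusting the link is up to you.
```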

In Volume 6, Bonus Issue 2 (June 29, 2017) we published an article titled “Was There A June 25 Google Update?” Here is an excerpt from the article:

[ Citation ]

Low quality sites hurt the user experience. They are most likely to:

  1. Obscure information (not just “content” but “information”)
  2. Use vague language in content
  3. Rely on duplicate information (not just “content” but “information”)
  4. Contain a lot of page clutter (widgets, images, videos, ads, etc.)
  5. Diffuse PageRank through flat site architecture
  6. Use PageRank sculpting (adding “nofollow” to internal links)
  7. Create orphans and weaken intrasite navigational support for “leaf” pages (deep content)
  8. Publish content through hard-to-render Javascript
  9. Provide less annotation for graphics
  10. Use more boilerplate text than similar sites

Low quality (spammy) links are most likely to:

  1. Be self-placed (directly or indirectly)
  2. Use preferred (“targeted”) anchor text
  3. Be limited to “relevant” context (natural links are most often embedded in IRrelevant contexts)
  4. Be quickly made
  5. Not support and reinforce useful content or information (in the linking context)
  6. Be found in similar or (near-)duplicate content across multiple sites
  7. Make less sense than normal in-context links would
  8. Match query terms more than they match non-query phrases
  9. Be placed in “guest” or “user-generated” content
  10. Use shorter anchor text than normal links (based on context norms)
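Several of the items in that list (preferred anchor text, query-term matching, short anchors) are measurable across a backlink export. Here is a minimal sketch of one such measurement; the function name and sample anchors are hypothetical, and what counts as a “high” share is something you would calibrate against profiles you trust.

```python
from collections import Counter

def top_anchor_share(anchors: list[str]) -> float:
    """Share of backlinks that use the single most common anchor text.

    Heavy repetition of one keyword-rich anchor is one of the spam
    signatures listed above; natural profiles are usually messier
    (brand names, bare URLs, "click here", and so on).
    """
    if not anchors:
        return 0.0
    counts = Counter(anchor.strip().lower() for anchor in anchors)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(anchors)

# Hypothetical usage with anchors pulled from a backlink export:
sample = ["best cheap widgets", "best cheap widgets", "example.com",
          "best cheap widgets", "click here"]
print(f"{top_anchor_share(sample):.0%}")  # 60% - heavily concentrated
```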

You are most likely to find self-placed links in:

  1. Guest posts (body content or bio boxes)
  2. User account profile pages
  3. Social media shares
  4. Blog comments
  5. Forum discussion posts (in the post or in the signature)
  6. Widgets
  7. Attribution text (such as image or infographic captions)

Regardless of whether you are evaluating content or links, you should be asking these kinds of questions:

  1. Why is this thing here?
  2. How is this thing different from similar things on the page / site?
  3. Why is this thing different from similar things on the page / site?
  4. Why is this emphasis necessary?
  5. To a disinterested reader, does this thing add real value to the page?
  6. Would this thing be here if [SEO metric] was lower or zero?
  7. Why does this site exist?
  8. Would the page or site exist if the link were not there?
  9. Whom would a disinterested reader conclude is responsible for the link?
  10. Whom would a disinterested reader conclude is responsible for the content?

Coming back to the original question, “What would [Michael Martinez] disavow?” I would disavow anything that didn’t look natural. But natural links can look really weird. All those crawling sites out there that publish meta information about your sites are creating natural links. All those aggregators that take your RSS feeds and publish excerpts of your articles are creating natural links. All those sitewide links that people you’ve never heard of put into their blogrolls are natural links. All those links in forum discussions you were unaware of are natural links.

And yet these are most often the kinds of links people say they want to disavow.

Just because you feel a Website looks ugly and unprofessional in design doesn’t mean its links are spammy. Spammy sites may indeed be ugly and quickly built, but they are a step below the usual ugly fluff people fear. Ugly is okay. There is no search engine guideline that says a Website cannot or should not be ugly.

Conclusion

If you’re looking at a couple hundred 1-page Weebly Websites with spun content, you would probably be safe to disavow that stuff. Google should be able to identify obvious link spam as obvious link spam. Then again, if it’s still helping and you ain’t been penalized after all these years, why rush to judgment? Most likely your traffic woes are due to something else.

And yet if you can find thousands of links that were clearly intended to manipulate search results for any reason, then don’t wait for the penalty.

If you’re in doubt about what kinds of links to disavow or distrust, then in my opinion you should do nothing. But if you’re absolutely convinced you know why the links exist and it’s not because someone decided to link to 3 million Websites, then give it a shot.
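For reference, if you do decide to act on a pile of throwaway domains like the ones described above, the disavow file itself is just a plain-text list you upload through Search Console. Here is a minimal sketch for generating one; the domains are hypothetical, while the “domain:” prefix and “#” comment lines follow Google’s documented disavow file format.

```python
# Minimal sketch: write a disavow file for a list of throwaway domains.
# The "domain:" prefix disavows every link from that host; lines starting
# with "#" are comments. Upload the resulting UTF-8 text file through
# Google Search Console's disavow links tool.

spun_content_domains = [
    # Hypothetical examples of the one-page spun-content sites discussed above.
    "cheap-widgets-blog.weebly.com",
    "best-widgets-review.weebly.com",
]

with open("disavow.txt", "w", encoding="utf-8") as f:
    f.write("# Spun-content link network identified in the audit\n")
    for domain in spun_content_domains:
        f.write(f"domain:{domain}\n")
```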
 

This article originally appeared in Volume 8, Issue 17 (May 3, 2019) of the SEO Theory Premium Newsletter. Subscribe now for $25/month or $200/year.

