SophosLabs details how a combination of keyword-stuffing and link-spamming tricks has polluted Google's search results

Jul 9, 2015 06:44 GMT  ·  By
Sophos researchers report on Google search results poisoning using PDF files

SophosLabs has exposed a new variant of the search poisoning technique, which involves PDF documents instead of HTML files, leading users to websites that don't provide the content they advertise.

To understand the problem, it helps to first look at how Google crawls and ranks pages.

To differentiate itself from regular website visitors, Google's crawler identifies itself with a special user-agent string containing "Googlebot," which lets webmasters see when their site was last crawled and filter the crawler out of traffic analytics programs.

Because of this, black hat SEO experts are able to detect Google's presence on their site and, using a technique called "cloaking," send Google's search crawler a different version of the page than the one delivered to users.

The version of the website served to the search bot is much cleaner, is stuffed with meta keywords, follows all of Google's recommendations, and helps the site rank much higher in search results.

When a user searches for one of those keywords, the website that employed this technique appears much higher in the search results, yet the page it actually serves can contain malware and scams that never appeared in the version Google saw.
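One way to picture how cloaking could be spotted is to fetch the same URL twice, once announcing a Googlebot user-agent and once as a regular browser, and then compare the two responses. The Python sketch below assumes both page texts have already been retrieved. The function names and the 0.5 threshold are illustrative assumptions, not anything described by Google or Sophos.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of distinct words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(text_a, text_b):
    """Share of distinct words the two texts have in common (0.0 to 1.0)."""
    words_a, words_b = tokenize(text_a), tokenize(text_b)
    if not words_a and not words_b:
        return 1.0  # two empty pages are trivially identical
    return len(words_a & words_b) / len(words_a | words_b)


def looks_cloaked(crawler_view, browser_view, threshold=0.5):
    """Flag a page whose crawler and browser versions barely overlap.

    The threshold is an arbitrary value chosen for illustration only.
    """
    return jaccard_similarity(crawler_view, browser_view) < threshold


# Hypothetical example texts: what the crawler saw vs. what a visitor saw.
crawler_view = "Cheap flights to Paris, hotel deals and travel guides"
browser_view = "Your PC is infected! Download this cleaner now"
suspicious = looks_cloaked(crawler_view, browser_view)
```

A word-overlap score is a crude signal: legitimate pages also vary between requests (ads, personalization), so a real check would need a more careful comparison.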

This is one reason Google does not rely on page content alone and also weighs backlinks, using the links between sites as a way to detect popular content. The more reputable the sites linking to you, the higher your website will appear in the results.

In response, black hat SEO experts have used various link-spamming techniques to game Google's search algorithm, some of them involving a collection of websites that link to each other, in what's called a link farm.
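To see why a link farm pays off, consider a simplified PageRank-style score, in which each page splits its score evenly among the pages it links to. This is a textbook sketch in Python, not Google's actual ranking algorithm; the damping factor of 0.85, the iteration count, and the page names are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# A small web where nobody links to "target".
base_links = {"a": ["b"], "b": ["c"], "c": ["a"], "target": ["a"]}

# The same web plus four hypothetical farm pages that link to each
# other and to "target".
farm = ["f1", "f2", "f3", "f4"]
farmed_links = dict(base_links)
for page in farm:
    farmed_links[page] = [other for other in farm if other != page] + ["target"]

base_rank = pagerank(base_links)
farmed_rank = pagerank(farmed_links)
```

In this toy graph the score of "target" rises once the farm is added, even though no independent site links to it, which is the effect link farmers are after.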

Google, the all-knowing master, fooled by an old trick

Google has successfully weeded out tricksters of this kind in the past, but the SophosLabs team details a chink in Google's armor: the older technique still works today when PDF files are used instead of HTML pages.

Using the same combination of keyword-stuffing and link-spamming tricks employed in the past, various website owners have created link farms with PDF files stuffed with different keywords, all linking to each other in a tightly knit network.
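The two signals described here, documents that all link to each other and text dominated by a few repeated keywords, can be approximated with very simple measures. The Python sketch below is only an illustration of the idea and is not the method Sophos used; the function names, file names, and sample data are hypothetical.

```python
import re
from collections import Counter


def reciprocity(links):
    """Fraction of directed links whose reverse link also exists.

    `links` maps each document to the list of documents it links to.
    A value near 1.0 suggests a tightly knit, mutually linking cluster.
    """
    edges = {
        (src, dst)
        for src, targets in links.items()
        for dst in targets
        if src != dst
    }
    if not edges:
        return 0.0
    mutual = sum(1 for (src, dst) in edges if (dst, src) in edges)
    return mutual / len(edges)


def top_keyword_density(text):
    """Share of all words taken up by the single most frequent word."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)


# Hypothetical PDF cluster in which every document links to every other one.
farm_links = {
    "a.pdf": ["b.pdf", "c.pdf"],
    "b.pdf": ["a.pdf", "c.pdf"],
    "c.pdf": ["a.pdf", "b.pdf"],
}
# A chain of ordinary pages with no links pointing back.
organic_links = {"x": ["y"], "y": ["z"], "z": []}

farm_score = reciprocity(farm_links)
organic_score = reciprocity(organic_links)
stuffed = top_keyword_density("buy pills buy pills buy cheap")
```

Neither number proves abuse on its own, since small legitimate sites also cross-link heavily, but together they give an analyst a starting point for a closer look.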

This has helped various websites rank higher in search results. When users clicked through, however, they landed on a different kind of website than the one Google had seen.

The scheme was exposed by the Sophos team after Sophos Antivirus kept detecting hundreds of thousands of suspicious PDF documents per day. When a human analyst examined them in depth, they revealed a well-organized link farming network.

As the SophosLabs team reports, "We provided detailed information about our findings to Google, along with notice about our intent to publish. Google acknowledged our communication but chose not to comment further."