It all comes down to cloaking detection

May 22, 2009 11:36 GMT

The constant evolution of Microsoft's search engine affects all aspects of Live Search, including the crawler. MSNBot, as the Live Search crawler is dubbed, was patched at the end of last week as a direct consequence of feedback received by the Redmond company. According to the software giant, the move is part of the natural process of upgrading and updating the search engine technology to improve the search experience for end users. This time around, the update was designed to reduce the load MSNBot placed on websites when scanning for cloaking.

“We have modified the cloaking detector. Using the valuable feedback we received regarding the feed crawling issues, we proactively released a patch late last week that should significantly reduce the number of requests to a more acceptable rate,” revealed Brett Yount, from the Live Search Webmaster Center.

Cloaking is a so-called black-hat search engine optimization technique. Essentially, websites are set up to distinguish between search engine crawlers and actual visitors: bots are presented with search-engine-friendly content, while human visitors get the real website. The deceptive technique does not sit well with search engine companies, which work to identify such websites and cut them off from their indexes.
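To make the distinction concrete, here is a minimal, purely hypothetical Python sketch of user-agent-based cloaking; the bot names and page contents are placeholders and are not taken from anything Microsoft described.

```python
# Hypothetical sketch of user-agent-based cloaking: the server inspects the
# User-Agent header and serves crawler-friendly markup to bots only.
from http.server import BaseHTTPRequestHandler, HTTPServer

KNOWN_BOTS = ("msnbot", "googlebot")  # crude crawler detection by name


class CloakingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        user_agent = self.headers.get("User-Agent", "").lower()
        if any(bot in user_agent for bot in KNOWN_BOTS):
            # Crawlers see keyword-stuffed, "search engine friendly" markup.
            body = b"<html><body>keyword keyword keyword ...</body></html>"
        else:
            # Human visitors see the real page.
            body = b"<html><body>Welcome to the actual site.</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    HTTPServer(("localhost", 8000), CloakingHandler).serve_forever()
```

Cloaking detectors typically try to catch this kind of trickery by requesting pages under different identities and comparing the responses, which is why such scans can add noticeable load to a server.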

“The initial complaints, that we were over-crawling some servers with our cloaking detector, was compounded by and also confused with the new release of our feed crawler that was also overzealous in its attempt to crawl and provide up-to-the-minute results. However, we have taken all of the feedback you have provided and made some improvements,” Yount added.

Microsoft has asked website owners and webmasters to make it easier for MSNBot to crawl their websites by integrating sitemaps, meta properties, or RSS updates. Still, the company indicated that, while it was ready to change its crawler, the differences between websites resulted in different “symptoms.”
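As an illustration of the first of those hints, the sketch below generates a bare-bones sitemap.xml with Python's standard library; the URLs and dates are invented placeholders, not anything tied to the article.

```python
# Hypothetical sketch: build a minimal sitemap.xml so crawlers such as MSNBot
# can discover pages without aggressive re-crawling.
from xml.etree import ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
urlset = ET.Element("urlset", xmlns=NS)

# Placeholder URLs and last-modified dates.
for loc, lastmod in [("http://example.com/", "2009-05-22"),
                     ("http://example.com/news", "2009-05-21")]:
    url = ET.SubElement(urlset, "url")
    ET.SubElement(url, "loc").text = loc
    ET.SubElement(url, "lastmod").text = lastmod

ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8",
                             xml_declaration=True)
```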

“Despite releasing a patch for our feed crawler, not all sites are the same, so it is a challenge to gauge a feed crawl-rate that is considered reasonable for all sites. If you believe we are still crawling more than necessary, an alternative option would be setting a crawl-delay. We would urge caution while setting the crawl-delay times as they can severely hamper our ability to crawl your site,” Yount added.
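For context, the crawl-delay Yount refers to is set in a site's robots.txt file. The sketch below, using a hypothetical ten-second value, shows such a directive and how it can be read back with Python's standard robotparser module; the figure is an illustration only, not a recommendation.

```python
# Hypothetical sketch: a robots.txt Crawl-delay directive for msnbot, parsed
# with Python's built-in robots.txt parser.
import urllib.robotparser

robots_txt = """\
User-agent: msnbot
Crawl-delay: 10
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt.splitlines())

# The delay (in seconds) a polite crawler should wait between requests.
print(parser.crawl_delay("msnbot"))  # -> 10
```

As the quote warns, a large delay also caps how many pages a crawler can fetch in a day, so overly aggressive values can keep fresh content out of the index.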