Recommendations from Microsoft

Aug 12, 2009 11:30 GMT

Crawl delay is one of the options webmasters have at their disposal to control the search engine bots indexing their websites. To prevent web server load issues, website owners can reduce the frequency at which their sites are crawled, a move that becomes necessary when the indexing of a large website has a palpable impact on the available server resources. MSNBot, the crawler Bing inherited from its precursor, Live Search, can be served a crawl-delay parameter via the robots.txt file, explained Rick DeJarnette of the Bing Webmaster Center.

“Bing supports the directives of the Robots Exclusion Protocol (REP) as listed in a site’s robots.txt file, which is stored at the root folder of a website. The robots.txt file is the only valid place to set a crawl-delay directive for MSNBot,” DeJarnette added. “The robots.txt file can be configured to employ directives set for specific bots and/or a generic directive for all REP-compliant bots. Bing recommends that any crawl-delay directive be made in the generic directive section for all bots to minimize the chance of code mistakes that can affect how a site is indexed by a particular search engine.”
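That recommendation might look like the following sketch, in which the second user-agent section, the bot name and the paths are hypothetical and included only to show the layout; the crawl-delay line sits in the generic section that applies to all REP-compliant bots:

    User-agent: *
    Crawl-delay: 1
    Disallow: /private/

    User-agent: ExampleBot
    Disallow: /archive/

Keeping the directive in the generic section, per Bing's recommendation, reduces the chance that a mistake in a bot-specific block affects how one particular search engine indexes the site.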

Here is how a webmaster can set the crawl-delay parameter: simply enter “User-agent: *” and “Crawl-delay: 1” in the robots.txt file, under the generic user-agent section, as shown in the sketch below. If no crawl delay is set, MSNBot will crawl a website at its normal speed. Furthermore, the company indicated that the bot and the associated indexing process tailor themselves to the content and refresh rates of specific websites. A crawl-delay value of 1 means that the Bing MSNBot will crawl a website at a slow speed, a value of 5 is equivalent to a very slow crawl, while 10 will cause the bot to crawl content extremely slowly.
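A minimal sketch of that entry, with the crawl speeds the company associates with each value noted in a comment line (the comment is annotation only, not a directive):

    # Crawl-delay values: 1 = slow, 5 = very slow, 10 = extremely slow
    User-agent: *
    Crawl-delay: 1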

“The crawl-delay directive accepts only positive, whole numbers as values. Consider the value listed after the colon as a relative amount of throttling down you want to apply to MSNBot from its default crawl rate. The higher the value, the more throttled down the crawl rate will be,” DeJarnette explained. “Bing recommends using the lowest value possible, if you must use any delay, in order to keep the index as fresh as possible with your latest content. We recommend against using any value higher than 10, as that will severely affect the ability of the bot to effectively crawl your site for index freshness.”