Google's Guide on How to Take Down Your Site Correctly, for SOPA or Maintenance
Many websites may need to be temporarily taken down, but it must be done the right way
SOPA has been put aside for now, or so it seems, but several major websites are still going through with the blackout protest, and smaller websites have joined in as well. Taking down your site in protest is somewhat extreme, but it clearly draws attention. There may be other, more peaceful reasons for taking down a website, such as maintenance or an upgrade. When faced with taking down their website, many people worry about the impact this will have on their Google ranking, and rightly so.
Luckily, Google Webmaster Trends Analyst Pierre Far has put together a guide showing how to take down a site temporarily the right way.
"The most common scenario we’re seeing webmasters talk about implementing is to replace the contents on all or some of their pages with an error message (“site offline”) or a protest message," he wrote.
"The most important point: Webmasters should return a 503 HTTP header for all the URLs participating in the blackout (parts of a site or the whole site)," he explained.
Returning a 503 (Service Unavailable) status tells Google that the content on those pages is not 'real', so it won't be indexed. It also means that even though the same message is displayed on all pages, there won't be any duplicate content issues.
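As a rough illustration of what "return a 503 for every URL" can look like, here is a minimal sketch using only Python's standard library. The message text, the `blackout_response` helper, and the port are assumptions made for this example and do not come from Far's guide. A production site would more likely configure this in its existing web server.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical placeholder text; any protest or maintenance message works.
BLACKOUT_BODY = b"<html><body><h1>Site offline</h1></body></html>"


def blackout_response():
    """Return (status, headers, body) for a page taking part in the blackout."""
    headers = {
        "Content-Type": "text/html; charset=utf-8",
        "Content-Length": str(len(BLACKOUT_BODY)),
    }
    # 503 marks the outage as temporary, so the message is not treated as content.
    return 503, headers, BLACKOUT_BODY


class BlackoutHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same 503 response."""

    def do_GET(self):
        status, headers, body = blackout_response()
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the blackout server until interrupted (port is an arbitrary choice)."""
    HTTPServer(("", port), BlackoutHandler).serve_forever()
```

Calling `serve()` starts the server. Every path then gets the same page, and the 503 status is what signals that the page is a temporary stand-in.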
"Googlebot's crawling rate will drop when it sees a spike in 503 headers. This is unavoidable but as long as the blackout is only a transient event, it shouldn't cause any long-term problems and the crawl rate will recover fairly quickly to the pre-blackout rate," he added.
Once the site is back to normal, he said, crawling picks up again, usually in a matter of days.
Far also warned against blocking crawlers through robots.txt. Some webmasters may be inclined to add "Disallow: /" to the file, thinking that blocking Google or other crawlers entirely will prevent any of the problems listed previously. But this will cause more long-term problems than simply returning a 503 HTTP status code.
At the same time, it's not a great idea to return a 503 for the robots.txt file itself, as this also prevents Googlebot from reading it. In that case, the Google crawler prefers not to crawl anything on the site, to be on the safe side.
Finally, Far added that the blackout will be reported in Webmaster Tools. He also advised webmasters to change as little as possible during the blackout, so that fewer things can go wrong.
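The robots.txt advice boils down to a simple routing rule: blacked-out pages get a 503, while robots.txt keeps being served as usual and unchanged. The sketch below shows one way to express that rule. The names `EXEMPT_PATHS` and `status_for` are hypothetical, and returning 200 for non-blacked-out paths assumes the requested file actually exists.

```python
# Paths that must keep responding normally during the blackout (assumed set).
EXEMPT_PATHS = {"/robots.txt"}


def status_for(path, blackout_active=True):
    """Pick the HTTP status code for a request path.

    Returns 503 for pages taking part in the blackout, and 200 for
    exempt paths or when no blackout is running.
    """
    # Ignore any query string so "/robots.txt?x=1" is still exempt.
    clean_path = path.split("?", 1)[0]
    if blackout_active and clean_path not in EXEMPT_PATHS:
        return 503
    return 200


if __name__ == "__main__":
    for example in ("/", "/about", "/robots.txt"):
        print(example, status_for(example))
```

The robots.txt content itself stays as it was before the blackout, with no added "Disallow: /" line.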