Making thousands of applications unavailable

Jul 3, 2009 08:25 GMT

Google App Engine, the cloud-based application hosting and development service, was down for over six hours yesterday. The service started having problems at around 6:30 am Pacific Time (PT) and shortly afterwards went into “unplanned maintenance mode” while Google engineers worked to get to the root of the problem. Full service was restored about six hours later.

"Today at 8 am PT datastore access for App Engine applications was affected due to a cluster-wide issue," a Google representative stated after the incident, adding that the bulk of the disruption lasted about four and a half hours. "The team identified and fixed the underlying problem and service has now been restored. We apologize for the inconvenience and encourage anyone having technical difficulty to visit the System Status Dashboard or the Downtime Notify Group, which are both linked from the Google App Engine Community site."

The first issues popped up at around 6:30 am PT, but they only became widespread at around 8 am, when they began affecting all applications accessing the Datastore. A few minutes later, App Engine went into read-only mode, with all Datastore and memcache writes disabled. This was followed by a complete outage, during which none of the apps using the service were available. All of the problems were fixed by 12:35 pm PT.
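The "over six hours" and "about four and a half hours" figures describe the same incident measured from two different starting points. A minimal sketch of that arithmetic follows, using the approximate times reported here; the calendar date (July 2, 2009) and the reading of 12:35 as 12:35 pm are assumptions made for illustration, not details confirmed by Google.

```python
from datetime import datetime

# Approximate times (Pacific Time) as reported in this article.
# Assumptions: the date is July 2, 2009, and "12:35" means 12:35 pm.
FMT = "%Y-%m-%d %H:%M"
first_issues = datetime.strptime("2009-07-02 06:30", FMT)
widespread = datetime.strptime("2009-07-02 08:00", FMT)
restored = datetime.strptime("2009-07-02 12:35", FMT)


def hours_minutes(delta):
    """Split a timedelta into whole hours and remaining minutes."""
    total_minutes = int(delta.total_seconds()) // 60
    return divmod(total_minutes, 60)


# Measured from the first reported problems.
total_outage = hours_minutes(restored - first_issues)
# Measured from the point where the problems became widespread.
main_outage = hours_minutes(restored - widespread)

print("First issues to full restoration: %dh %02dm" % total_outage)
print("Widespread impact to full restoration: %dh %02dm" % main_outage)
```

Counted from the first problems at 6:30 am, the incident ran a little over six hours; counted from 8 am, when it became widespread, it ran a little over four and a half.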

Google App Engine is the Internet giant's answer to cloud computing and, while enterprise services may not be Google's traditional area of expertise, the product has been adopted by a large number of projects. Because of the nature of the service, a large number of applications were affected, and many developers were unhappy with the outage. Bugs in computer software are nothing new and aren't going away anytime soon, but when they hit a service of this magnitude, the frustration is bound to be widespread.