Apr 22, 2011 09:00 GMT

Amazon's cloud troubles continue, with the outage now running for more than 24 hours. Yesterday, a number of large sites using Amazon Web Services were down or affected by problems at one of Amazon's data centers. Some of the issues have been resolved and some of the sites have found workarounds, but the problem hasn't been fixed entirely.

The issues first arose at about 1:41 AM Pacific Daylight Time, according to Amazon's own status dashboard.

Now, 24 hours later, the outage continues. The latest update, from 10:58 PM PDT, indicates that the team is still working toward a resolution.

"The team continues to be all-hands on deck trying to add capacity to the affected Availability Zone to re-mirror stuck volumes. It's taking us longer than we anticipated to add capacity to this fleet," the most recent update reads.

Several hours after the outage started, Amazon provided an explanation for the issues.

"A networking event early this morning triggered a large amount of re-mirroring of EBS volumes in US-EAST-1. This re-mirroring created a shortage of capacity in one of the US-EAST-1 Availability Zones, which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes," Amazon explained.

"Additionally, one of our internal control planes for EBS has become inundated such that it's difficult to create new EBS volumes and EBS backed instances," it added.

One of the solutions was to add more capacity to speed up the re-mirroring process, and the move worked to a degree. Later in the day, Amazon announced that it had managed to restore full service in all Availability Zones except one.

That was at 1:48 PM PDT. Since then, Amazon has been working to bring the last Availability Zone back, without much success.

Amazon suggests that affected users move to other Availability Zones. EC2 instances can simply be relaunched, so long as users don't select a specific Availability Zone. For most users, this means that they can at least get their sites and apps back up and running, though some functionality may still be affected.

Those using EBS volumes have it harder, as they still aren't able to access their data. However, users who have snapshots of their storage devices can recover by creating new EBS volumes from them in a different Availability Zone.
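
For readers who script their infrastructure, the snapshot recovery described above could look roughly like the Python sketch below. It is an illustration under stated assumptions, not Amazon's documented procedure: it assumes the boto3 SDK, the zone names and snapshot ID are placeholders, and the helper functions are hypothetical. Amazon has not said which zone is impaired, so the caller has to supply that.

```python
def pick_recovery_zone(zones, impaired_zone):
    """Return the first zone in `zones` that is not the impaired one."""
    for zone in zones:
        if zone != impaired_zone:
            return zone
    raise ValueError("no healthy Availability Zone available")


def build_restore_request(snapshot_id, zone):
    """Build keyword arguments for the EC2 client's create_volume call."""
    return {"SnapshotId": snapshot_id, "AvailabilityZone": zone}


def restore_volume(snapshot_id, zones, impaired_zone, region="us-east-1"):
    """Create a new EBS volume from a snapshot in a zone other than the impaired one.

    Assumes boto3 is installed and AWS credentials are configured.
    """
    import boto3  # imported lazily so the helpers above work without it

    ec2 = boto3.client("ec2", region_name=region)
    zone = pick_recovery_zone(zones, impaired_zone)
    return ec2.create_volume(**build_restore_request(snapshot_id, zone))


# Placeholder values for illustration only.
zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
request = build_restore_request(
    "snap-0123456789abcdef0", pick_recovery_zone(zones, "us-east-1a")
)
print(request)
```

The new volume would then need to be attached to an instance running in the same zone, since an EBS volume can only be attached to an instance in its own Availability Zone.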