Wikipedia Is Running on Just One Data Center After Outage

Wikipedia has provided a more detailed explanation for yesterday's outage and, as the saying goes, "the plot thickens." The outage was indeed caused by two cut cables.

The problem is that those two cables shouldn't have been anywhere near each other, as they were supposed to be redundant, i.e. one would take over for the other if there were problems.

That obviously failed when both cables were put out of commission. This left Wikimedia's two data centers unable to communicate, as the two cables provided the direct link between them.

"The data centers — one in Ashburn, Virginia and the other in Tampa, Florida — are connected by two separate fiber links (for redundancy). While Ashburn serves most of the traffic, it needs to talk to our Tampa data center for backend services (e.g. database)," Wikimedia explained.

"We do operate two 10-g separate fibers between the data centers. We are now working with our network provider to determine how and why we were impacted by that fiber cut when we are supposed to have redundancy in our network. We are still waiting for their full report," it said.

The Wikimedia tech team initially fixed the problem by directing all the traffic to the Tampa data center. The two fiber connections were later restored and are now operational. However, Wikimedia is wary of using them until the exact problem is diagnosed.

This means that, for now, all the traffic is diverted to the Tampa data center. Users should not see any difference or notice any performance issues as a result. But it does mean that Wikipedia is running on a single data center at the moment.

Wikimedia is right to be cautious about this: a redundancy solution that isn't redundant isn't "ideal." Barring some extraordinary event, a problem affecting one cable shouldn't have affected the other.

Wikimedia is waiting for a full explanation from the ISP on what happened