No RFO was ever posted.
4/14/07 - Service Outage
None has been released yet. But here's a quick rundown of what happened:
On Friday, Dallas had a lot of thunderstorms roll through, including at least one tornado nearby. Utility power in parts of Dallas is notorious for glitches and outages during strong storms, so SL switched to generator power to help ensure that no problems would arise. (This was also a great real-world scenario to see that everything worked over an extended period.) Finally, on Saturday night, they went to switch back to utility power, and in doing so a 2500-amp main breaker failed. This breaker controls generator and utility power into parts of their datacenter. When it failed, all they had to run on was battery power, which lasted roughly 30 minutes. After the batteries ran out, the inevitable occurred and many racks were left without power. Replacing a breaker like this is far from a simple task. They had a spare onsite, but it's still a replacement process that can easily take a few hours. Once it was replaced, power was restored without any problems. However, additional time was taken to power on various sections and racks a few at a time to prevent a power surge from causing even further problems.