The Planet's outage really hurts.

Over the weekend ThePlanet had a huge outage, and service has only just been restored. Apparently there was some kind of explosion in their Houston, Texas datacenter and they were forced to take the entire datacenter down.

What hurts is that the excellent WebFaction hosts its servers at ThePlanet. Down went AutomaticRomantic along with thousands of other websites. It could have been worse: the servers could have been destroyed. As it was, no servers were damaged and only ThePlanet's infrastructure was fried.

I've never had a good experience with datacenters. I've been with a company that left one datacenter because of a massive outage, only for the new datacenter to suffer a similar 100% outage within 6 months of the move. Now ThePlanet. The whole 100% uptime thing is a myth. It doesn't exist in real life, no matter how good the datacenter's people, procedures or technology. Critical systems just fail.

So what is the answer? Maybe it's systems like Amazon's EC2 and Google's AppEngine: highly distributed storage and (in the case of AppEngine) processing, which allow you to build systems that can scale easily and withstand failures. Google or Amazon could still have datacenters fail, but they can afford the kind of geographical separation that would mitigate the problem. Writing apps for these kinds of systems takes a different skill set, but I think there's safety in the cloud.


