Posted by: krystalstatus | 07/06/2009

Up and running again

We will be asking our datacentre why the backup generator failed during the outage and will post the results here.

In the meantime everything is up and running again and we offer our sincerest apologies for the poor service you received over the past hour.

Regards,

The Krystal Team


Responses

  1. Good to hear everything’s running again. Thanks for solving this so quickly.

  2. I agree – Thank you. It’s clearly not your fault, so well done for being able to get things back up again so fast :)

  3. Hi – good to hear you’re on top of these things. Thanks for responding so quickly.

  4. Cheers for the feedback guys – we have since received an incident report which is labeled private and confidential so I can’t copy and paste it across, but I can disclose the basic points for you:

    1. At 19:15 EDF Energy removed primary mains power feed to site – described by EDF as unplanned works due to a potential severe fault. At no time were *the datacentre* advised of such works.

    2. At 19:18 EDF Energy removed secondary mains power feed to site – described again by EDF as unplanned works due to a potential severe fault. At no time were *the datacentre* advised of such works.

    3. UPS supplies supported the load of *the datacentre* during these mains power interruptions while switch gear control systems determined which source of mains power to route to the UPSs.

    4. At 19:18 all mains power feeds became unavailable. The generation system received an automatic start signal from the switch gear control systems to start and subsequently entered the spin-up cycle.

    5. The generation system reached a state where it was able to take the full *datacentre* load. However, the signal informing the switch gear control system that generator power was steady and available did not reach it as it should have. The cause of this is detailed further in the corrective measures overleaf.

    6. The UPSs continued to support the load until they expired.

    7. Mains power to *the datacentre* was restored at 20:30 hrs.

    Needless to say, action has been taken to ensure the incident doesn’t occur again. The engineer in question has been disciplined.

    This seems a very unusual and isolated incident and is the first problem we have had with our datacentre since we moved there a few years ago.

