Thursday, 31 January 2013

Emergency maintenance work complete

This morning we carried out emergency maintenance on the filers, which are an important part of our IT infrastructure. There was a low but non-zero risk that things would go wrong, in which case we could have lost many key services.

In preparation, we took down MOLE, MOLE 2 and the web CMS. We then carried out the maintenance from 6:30 this morning.

The risky part of the maintenance was completed successfully with no interruption to service. However, on restoring the filers we noticed an orange error light. This didn't affect services but is not a good thing. We checked the config, restarted the servers and tried other measures, but the error light didn't go away.

In the end we declared the work complete and restored the services we had taken down, and we were ready for the working day.

The error light remains. We will continue to troubleshoot it in a non-disruptive way, and if further maintenance or restarts are needed to resolve it, we will schedule a follow-up maintenance period.