In the past year, headlines around the world have been ablaze with data centre outages and downtime that affected hundreds of businesses, costing them money and reputation. By focusing on people, process and toolsets, data centres can strive towards operational excellence.
- Amazon AWS S3 outage is breaking a lot of things for a lot of websites and apps
- Delta: 5-hour computer outage cost us $150 million
- Data centre cooling outage disrupts Azure cloud in Japan
These are just some of the headlines that have taken the industry by surprise in the past year. No one is immune from downtime – from service providers to operators, many businesses have faced immense financial loss and reputation risk because of it. Consumers were affected too: when IT systems failed at British Airways (BA) and Delta Air Lines, passengers faced massive delays and lost productivity.
These instances certainly aren’t isolated events. A 2016 report by the Ponemon Institute found that downtime costs for the most data centre-dependent businesses are rising faster than average, with downtime now costing $740,357 on average, a 38% increase since the institute's first study in 2010.
What is at stake: business disruption, missed revenue, reduced productivity and reputation risk. When IT infrastructure is not available, neither are business operations. Given the costs of downtime, how can data centres strive towards operational excellence? Working across people, process and toolsets, data centre operators can instil a quality mindset in their operations. Here are seven areas for improvement:
(1) Hire the right person for the job: It is essential to recruit experienced staff at all levels of the organisation. For a data centre to run smoothly, you will need experienced engineers as well as experienced managers who can train and mentor junior staff. People with actual operational experience in mission-critical environments are needed even at the front line of the organisation.
Recruiting across similar industries is also essential. Globally, data centre talent is in short supply, and one way to lessen the pressure is to look at experienced people who come from similar mission-critical facilities. Given the pace of change in the industry, the right skillset may well be found in an adjacent industry.
(2) Ensure the person you hire embodies your corporate values: Staff must not only understand their day-to-day responsibilities but also embody corporate values. This will ensure your employees align their actions around a common set of principles. Operational roles require day-to-day decision making, and not every situation will be clear. Having a set of organisational principles that everyone understands will help ensure that decisions support the overall group objectives.
(3) Training is essential: It is not enough for any employee to rely on their previous experience and knowledge base. The data centre industry is continuously evolving, and the technology supporting it requires a more sophisticated user than ever before. In addition, some roles will require demonstrated certifications and others on-the-job training and experience. Organisations need to build robust training programmes for all levels and ensure that staff and managers make training a priority.
(4) Adhere to industry standards: There are several industry standards that data centre operators must align to, such as internationally recognised best-practice ISO frameworks, PCI Data Security Standards or even green building standards. These standards matter not only to the operator, as a way to ensure consistent quality and workmanship, but also to customers, who need them to demonstrate to regulatory bodies that the customer or citizen data residing on their IT systems is being managed appropriately. Adherence to these standards also means that operators must regularly conduct simulations to identify weaknesses and then work on ways to counter negative scenarios.
(5) Share best practices: Data centre operators need to take a proactive approach to sharing best practices, including failures as well as successes. As more businesses look to place their data in the cloud, the colocation industry is increasingly handling sensitive company information. There are organisations across industries that provide a neutral forum in which operators can share their experiences and best practices. In this way, data centre operators are able to provide their customers with best-in-class services and ensure that operational issues don’t cascade through the industry.
(6) Continuous improvement is key to data centre operational efficiency: Operational procedures are meant to help facilities run smoothly. Over time, they may become dated or no longer efficient. The term “Kaizen”, the Japanese word for improvement, describes a technique of continuous improvement – focusing on small improvements that add up to big changes in the efficiency and quality of a process. The data centre environment is not static, and by taking this approach organisations will always be looking for ways to deliver high-quality services.
(7) Make sure you have the right tool for the job: Today’s data centre environment is an interdependent web of complex technology systems spanning electrical, mechanical, security and safety. Operators, as well as their customers, need a clear view into the environment at any point in time.
As machine learning and artificial intelligence penetrate deeper into the operational realm, we will see an evolution in how these systems interact with each other and in how we operate the overall environment.
The relentless pursuit of perfection
Finding the right talent, honing best-practice facilities standards and even implementing new and emerging technologies are all part of an ongoing process. However, the best technology, the right employee or even the best process will not, on its own, deliver the efficient, high-quality data centre that modern businesses demand.
It takes an organisation of experienced, well-trained, collaborative staff, committed to adhering to rigorous standards, to deliver on the promise to always be up and running.