Carrie Higbie, of network infrastructure supplier Siemon, argues that a business continuity plan will be much more valuable to your future than disaster recovery.
All financial institutions know that downtime is expensive. Many companies have also evaluated the biggest threats to their systems based on industry information and risk assessments. But some still do not understand the difference between a disaster recovery plan and a business continuity plan. Years ago, a proper disaster recovery strategy was a goal for many businesses. But today, our understanding has changed. Disaster recovery takes time, during which business operations can suffer. A robust business continuity plan, on the other hand, lets a company mitigate the effect of a disaster by staying operational. The threat of terrorism and the increasingly dramatic effects of changing weather patterns bring these issues to the foreground again.
Financial services companies can best understand the difference between disaster recovery and business continuity through example. A hypothetical insurance company’s disaster recovery plan entails nominal protections such as offsite backups, cold spares and important phone numbers for carriers and service providers. If the company’s facility were destroyed by a disaster, offsite storage in a safe place would help, but if some data was waiting to be backed up when the event occurred, then all of the work since the last backup would have to be redone. Information about some new accounts, quotation requests, claims, and premium payments would probably be lost.
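The gap between the last backup and the disaster can be made concrete. The following is a minimal sketch, assuming a hypothetical list of timestamped work items and a known time for the last offsite backup; the record names, dates and the work_at_risk helper are invented for illustration and do not come from any real system.

```python
from datetime import datetime


def work_at_risk(records, last_backup):
    """Return the records created after the last backup.

    These are the items that would have to be re-entered if the
    primary site were lost before the next backup ran.
    """
    return [r for r in records if r["created"] > last_backup]


# Hypothetical data: the backup ran at 02:00 and the disaster struck later that day.
last_backup = datetime(2024, 1, 15, 2, 0)
records = [
    {"id": "claim-1", "created": datetime(2024, 1, 14, 16, 30)},
    {"id": "quote-2", "created": datetime(2024, 1, 15, 9, 15)},
    {"id": "payment-3", "created": datetime(2024, 1, 15, 11, 45)},
]

lost = work_at_risk(records, last_backup)
print([r["id"] for r in lost])  # ['quote-2', 'payment-3']
```

In this invented case, one quotation request and one premium payment fall outside the backup, which is exactly the kind of loss a recovery-only plan accepts.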
When disaster struck, the insurance firm would not only have to face the challenge of reinstating day-to-day operations, but it would probably be under more pressure from customers, all of whom might be readying insurance claims because of the same disaster that brought its operations to a standstill.
Our hypothetical company should ask itself some questions. How will the business function while recovery is taking place? How will it communicate with its customers? What is the chain of command and notification for internal staff? Where is the command centre? Is there a strategy for dealing with the increased customer demands stemming from a disaster while trying to efficiently continue its own operations? This is where a continuity plan comes into play.
Business continuity planning depends on exposure and available resources. Some companies get together and offer to provide redundant equipment for each other. The newer trend is to outsource a disaster centre or a redundant data centre. This option, while optimal, is not always fiscally possible.
The idea is to provide 30 days’ worth of offsite operations, assuming that if your facility were completely destroyed you could find a suitable replacement in 30 days. With this in mind, important pieces of information and other tangibles should be stored offsite with your backups. Consider the following as a good starting list:
A vendor listing
A listing of all people to be notified
Service provider information
Forms that will be needed (including checks)
Contact details for everyone in the continuity group
Contact details for all IT personnel
But don’t forget to include IT resources. From a computing perspective, one component often missing from such a list is system logs, which can be crucial to IT continuity. There should be several logs, covering patches (and any associated deployment issues) and operating system revisions, along with documentation of all custom code.
Computing and communications are core competencies in the financial sector, perhaps more so than in some other sectors, meaning that extra care must be taken when considering the preservation of IT assets and resources. Failing to acknowledge this can have serious consequences, which again can best be illustrated by example. Our hypothetical insurance company, already strained by incomplete data backups, must restore the operating system and applications to process its files before recovering its data. But when it restores the software, nothing works. Someone at a supplier had assisted it with a custom device driver manipulation some months ago, after trying several other solutions that failed to solve a system problem. Without logs stating who did what, in what order, and what the results were, the company is now doomed to repeat the process before it can resume operations. In the meantime, customers are growing increasingly impatient for results.
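A log of "who did what, in what order, and what the results were" can be very simple. The sketch below is one possible shape for such a record, not a prescribed format: the ChangeEntry fields, the rebuild_steps helper and the sample entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ChangeEntry:
    when: datetime   # when the change was made
    who: str         # person or supplier who made it
    system: str      # system the change applies to
    action: str      # what was done
    result: str      # "success" or "failed"


def rebuild_steps(log, system):
    """Return the successful changes for one system, oldest first.

    Failed attempts stay in the log for reference but are skipped here,
    so nobody repeats them during a rebuild.
    """
    entries = [e for e in log if e.system == system and e.result == "success"]
    return sorted(entries, key=lambda e: e.when)


# Hypothetical log, deliberately out of order.
log = [
    ChangeEntry(datetime(2024, 3, 4, 15, 0), "supplier engineer", "claims-server",
                "apply custom device driver settings", "success"),
    ChangeEntry(datetime(2024, 3, 4, 10, 0), "in-house admin", "claims-server",
                "reinstall stock driver", "failed"),
    ChangeEntry(datetime(2024, 2, 1, 9, 0), "in-house admin", "claims-server",
                "install operating system patch", "success"),
]

for entry in rebuild_steps(log, "claims-server"):
    print(entry.when.date(), entry.who, "-", entry.action)
```

With a record like this stored offsite, the company in the example could replay the patch and the driver change in order instead of rediscovering them by trial and error.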
A good change management strategy based around a configuration management database (CMDB) can help here. Documenting system configurations, along with the processes used to plan and implement them, can provide useful references when it comes to rebuilding systems. IT services frameworks such as the IT Infrastructure Library (ITIL) can help to codify such processes.
One quick approach to documenting procedures is to have the person performing the work write down each and every task along the way. Giving the written procedures to another staff member completely unfamiliar with the task and asking them to follow the procedures can help validate the documentation, which should include everything from boot-up to basic troubleshooting. The person who is testing the documentation will be able to identify any missing points. This also provides the hidden benefit of cross-training within the department. If necessary, hire a temp to follow the procedures (but not during a critical time).
Disasters and business continuity notwithstanding, an understanding of your own processes, including how and when they occur, can help you to better understand the flow of your business, and to begin asking why things are done this way and whether they can be improved. Process documentation, just like configuration management, can be both a useful operational asset and a contingency tool. Understanding such benefits can help when justifying the associated administrative and capital expenses to senior management.
Continuity plans should also address security concerns, both internal and external. Security systems may not be working after a disaster, so you will need to protect your assets, including your employees. There may be some overtime involved for security personnel, but you are most vulnerable when you are down. Because most security problems are internal, you will want protection on both sides of your firewall and for all of your systems. Opportunity is all that an intruder needs to strike, and you do not want to give anyone that opportunity.
While much of this article has focused on IT, it is important to remember that the IT department should not have to deal with the business continuity challenge alone. IT tends to be invisible when things go right, but as soon as something goes wrong, many people will blame the IT department first. To avoid the blame game and ensure that business continuity plans run more efficiently, it is important to engage stakeholders throughout the organisation. Each department should have someone involved in the continuity plan. That person must have defined tasks, notification procedures, and logging requirements; otherwise a company will scramble for days, weeks, or months trying to catch up and assign blame.
Any plan can be completely crippled by a lack of testing. People tend to think that disasters won’t happen, but in reality they do. A savvy company will make sure that all systems are go, and that all disaster recovery and business continuity plans are tested and updated at least quarterly.
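Keeping to a quarterly schedule is easier when overdue tests are flagged automatically. The following is a minimal sketch under stated assumptions: a quarter is approximated as 92 days, and the plan names, test dates and the overdue_plans helper are hypothetical.

```python
from datetime import date, timedelta

# Assumption: "quarterly" is approximated as a maximum of 92 days between tests.
QUARTER = timedelta(days=92)


def overdue_plans(last_tested, today, max_age=QUARTER):
    """Return the names of plans whose last test is older than max_age, sorted by name."""
    return sorted(
        name for name, tested in last_tested.items() if today - tested > max_age
    )


# Hypothetical record of when each plan was last tested.
last_tested = {
    "disaster recovery": date(2024, 1, 10),
    "business continuity": date(2024, 5, 20),
}

print(overdue_plans(last_tested, today=date(2024, 6, 30)))  # ['disaster recovery']
```

Here the disaster recovery plan was last tested 172 days before the check and is flagged, while the continuity plan, tested 41 days before, is not.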
As a concept, disaster recovery has served many companies well over the decades. However, those who have not mapped recovery windows properly against business operations and produced a financial impact analysis could be vulnerable. For fast-moving companies in the financial sector, continuity is the key word. When you have to conduct business at network speeds, a stitch in time really does save nine.