Human Aspect of Disaster Recovery Part 2

If you missed it, check out Part 1 on setting up your DR plan.

Setting up your team

First, you need someone in charge of assessing the damage. This person not only says which parts are broken, but also determines how bad the damage is and how long it will take to fix. If you sell over the internet and your network is down, that is a mission-critical resource for your company and needs to be addressed quickly. If your fax lines are down, on the other hand, that might only affect invoicing, and a few days of downtime might not be considered a problem. Every major problem should also be assigned a level of downtime:

  • Level 1 – Minor 4-8 hours
  • Level 2 – Moderate 8-24 hours
  • Level 3 – Major 24-72 hours
  • Level 4 – Business relocation

This person is in charge of determining how critical each part of the business is to continued operation, and of judging how severe the disaster is.
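As a rough illustration, the downtime levels above could be written down as a small lookup so that everyone classifies an outage the same way. This is only a sketch under a few assumptions that are not part of the plan itself: the function name `classify_downtime` is made up for the example, anything under 4 hours is treated as Level 1, the upper bound of each range belongs to the lower level, and an estimate beyond 72 hours is treated as Level 4.

```python
from typing import NamedTuple


class DowntimeLevel(NamedTuple):
    level: int
    label: str


def classify_downtime(estimated_hours: float,
                      relocation_required: bool = False) -> DowntimeLevel:
    """Map an estimated repair time to one of the four downtime levels.

    Assumptions (not from the plan itself): upper bounds are inclusive,
    estimates under 4 hours count as Level 1, and anything over
    72 hours is escalated to Level 4.
    """
    if relocation_required or estimated_hours > 72:
        return DowntimeLevel(4, "Business relocation")
    if estimated_hours > 24:
        return DowntimeLevel(3, "Major")
    if estimated_hours > 8:
        return DowntimeLevel(2, "Moderate")
    return DowntimeLevel(1, "Minor")
```

For example, `classify_downtime(12)` falls in the 8-24 hour range and comes back as Level 2, while passing `relocation_required=True` always gives Level 4 regardless of the hour estimate.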

The next person on your team is in charge of actually fixing the problem and runs the operation until the problems are fixed. It is important that everyone knows this person is in charge during downtime, because they are not necessarily the person in charge during normal operations, which could lead to a conflict in leadership. Having these roles defined in advance cuts the time spent deciding who is doing what. This person also leads the recovery, which, depending on the disaster, could mean deploying IT resources or calling vendors for parts.

Finally, we need someone in charge of checking that everything is being done and that the DR plan actually works. Testing the technical part of the plan will be tasked to a member of IT: checking that if a server shuts down it starts up on another machine, and that tape backups are being made and are recoverable. The technical parts are important and vary from company to company, but on the human side, this member of the DR team is tasked with making sure everything else is being done. If only one person updates the contact list and they forget or become lazy, they become a single point of failure. By having one person who verifies that everyone is doing their job, we have the security of knowing our plan is in place and is being updated as needed.
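One way the person doing this checking might keep track is a simple list of recurring DR chores with the date each was last completed. The sketch below is hypothetical: the `DRTask` fields, the `overdue_tasks` helper, and the idea of a per-task maximum age in days are all assumptions made for the example, not something the plan prescribes.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class DRTask:
    name: str          # e.g. "Update contact list"
    owner: str         # person responsible for the task
    last_done: date    # when the task was last completed
    max_age_days: int  # assumed limit before the task counts as overdue


def overdue_tasks(tasks: list[DRTask], today: date) -> list[DRTask]:
    """Return the tasks whose last completion is older than their limit."""
    return [
        task for task in tasks
        if today - task.last_done > timedelta(days=task.max_age_days)
    ]
```

Running this once a week against the current date gives the checker a short list of owners to follow up with, so a forgotten contact list shows up long before a disaster does.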

The last thing to keep in mind about avoiding human error is automating and segmenting. The more automated something is, the less likely someone is to mess it up. Virtualization has automated many tasks, from restarting servers to creating backups. Segmenting means giving permissions to some people and barring others. The people who answer phones don’t need access to sales records, and the people in accounting don’t need access to marketing files. This limited access seems obvious to anyone who works in a corporate setting, but it is important for any small to medium business with multiple departments to keep in mind. The less access people have to the important parts of IT, the easier it will be for IT to manage.
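Segmenting can be pictured as a table of which department may touch which resource, with everything else denied by default. The following is a minimal sketch only; the department and resource names and the `can_access` helper are invented for illustration, and in practice this would be enforced by your file server or directory service rather than by a script.

```python
# Hypothetical mapping of departments to the resources they may use.
ACCESS = {
    "reception": {"phone_system"},
    "accounting": {"sales_records", "invoices"},
    "marketing": {"marketing_files"},
}


def can_access(department: str, resource: str) -> bool:
    """Allow access only if the resource is listed for the department.

    Unknown departments and unlisted resources are denied by default.
    """
    return resource in ACCESS.get(department, set())
```

With this table, `can_access("reception", "sales_records")` is `False` and `can_access("accounting", "sales_records")` is `True`. The design choice worth copying is the default: access is granted only when it has been listed explicitly.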

These steps are just the beginning of setting up and implementing a DR plan for your company and shouldn’t be considered an exhaustive list. It doesn’t matter if you spend the money to have a redundant internet line installed if someone hits the pole with their car and knocks both lines out. But preparing and implementing a plan is a start, and preparing for most disasters is better than preparing for none.

About The Author

President of NSI, Tom has been helping small and medium businesses succeed in Connecticut for over 25 years.