Not big enough to virtualize but still need a solid disaster recovery plan? Maybe it's time for a ZoomBox

By Tom McDonald | Apr 27, 2011 11:15:00 AM

Are your backups taking too long? How often do you test them, and are you sure they would restore properly when you need them? The problem with most disaster recovery solutions is that there is no middle ground for SMBs (small and medium businesses). Large corporations can invest in complex virtualization strategies using technology from VMware, which is a great option, but companies with limited IT support, or without the funds to invest in virtualizing their servers, are stuck with strategies that don't give them the support they need. Many are forced to continue using tape as a backup solution, which is notorious for failing to restore when it matters. Others rely on a RAID array, which lets a hard drive crash without losing data; that does give them some security, but only in that one respect. If the server itself were to die, the data would be fine but wouldn't be accessible until the server was back up and running. This leaves SMBs with old, outdated, and extremely limited disaster recovery and business continuity plans that don't come close to the benefits virtualization gives larger corporations.

NSI’s main target audience has always been SMBs, and having seen this gap in technology, NSI brought its technicians together to create the ZoomBox. The ZoomBox is an NSI-owned and -operated product that gives SMBs the virtualization protection their business needs without having to change their entire network. NSI installs a client on any Windows machine for which the customer wants to ensure uptime and data protection. The ZoomBox then creates virtual images of each server or desktop one to three times a day, and each image is also backed up to the cloud for extra protection, ensuring that all your data is safe regardless of what might happen to your business environment.
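As a rough sketch of the cadence described above (local image a few times a day, then an off-site copy), the Python below shows one way such a loop could be structured. The function names, machine names, and interval are illustrative assumptions, not part of any NSI or ZoomBox software.

```python
import time
from datetime import datetime

# Illustrative sketch only: image each protected machine a few times a day,
# then replicate the image off-site. take_image and replicate_to_cloud are
# hypothetical placeholders, not a real ZoomBox API.

PROTECTED_MACHINES = ["file-server", "mail-server", "front-desk-pc"]
IMAGES_PER_DAY = 3
INTERVAL_SECONDS = 24 * 60 * 60 // IMAGES_PER_DAY

def take_image(machine: str) -> str:
    """Pretend to create a point-in-time image and return its identifier."""
    return f"{machine}-{datetime.now():%Y%m%d-%H%M%S}.img"

def replicate_to_cloud(image_id: str) -> None:
    """Pretend to copy the local image to off-site (cloud) storage."""
    print(f"replicated {image_id} off-site")

def backup_cycle() -> None:
    for machine in PROTECTED_MACHINES:
        replicate_to_cloud(take_image(machine))

if __name__ == "__main__":
    while True:
        backup_cycle()
        time.sleep(INTERVAL_SECONDS)
```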

Read More >

Comparison between a traditional IT BC plan and a VMware implementation

By Tom McDonald | Apr 15, 2011 12:17:00 PM

Many businesses’ IT infrastructures are based around the traditional setup: the operating system is bound to a specific set of hardware, and a specific application is bound to that OS. From there, the server runs at about 5-10% of its capacity for most of the day, peaking only during heavy usage. The data has to be backed up to a local SAN for recovery purposes, generally requiring special software to ensure it is backed up fully and efficiently.

If this is a vital server with a disaster recovery and business continuity plan implemented around it to keep downtime as low as possible, then it will have an identical server installed for failover. That server is only used if the original fails, but it still uses power and space. Not only that, but the failover server has to be the identical model, containing the same hardware configuration, firmware, and local storage, to ensure immediate and complete compatibility with the original. This adds cost, since you need a second set of that same hardware, and it limits upgrade paths for the business.

This setup generally falls into the “Boot and Pray” model of disaster recovery: the complexity of the configuration leaves the admin hoping that the failover works, rather than being able to guarantee a smooth transition from one server to the other. This has to be done for every vital server that needs a redundant backup, and each one has its own unique setup, creating a great deal of complexity in managing all these different machines. That complexity increases the company’s RTO (recovery time objective) and RPO (recovery point objective) and makes recovery a much larger ordeal.
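To make the RTO/RPO point concrete, here is a small sketch using made-up numbers (a nightly backup, an eight-hour rebuild); none of these figures come from the article, they only show how the backup interval bounds potential data loss and how rebuild time drives downtime.

```python
# Illustrative arithmetic only; the numbers below are invented examples.

backup_interval_hours = 24   # nightly backup window
restore_hours = 8            # time to rebuild the server and restore data
verify_hours = 2             # time to confirm the restore actually worked

# Worst-case RPO: a failure just before the next backup loses almost a
# full interval of data.
worst_case_rpo_hours = backup_interval_hours

# RTO: everything that has to happen before users are back online.
rto_hours = restore_hours + verify_hours

print(f"Worst-case data loss (RPO): {worst_case_rpo_hours} hours")
print(f"Time to be back online (RTO): {rto_hours} hours")
```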

Read More >

Downtime not an option? Learn the basics of VMware's Fault Tolerance and what you will need to get up and running

By Tom McDonald | Mar 25, 2011 11:32:00 AM

Is a server crash not an option for your company? Is having your server up and running the life and soul of your business? Then you may want to consider VMware’s Fault Tolerance (FT) feature. VMware Fault Tolerance is a step up from VMware High Availability (HA). High Availability is VMware’s safety net for a host crash: if a server running a VM goes down, the VM is restarted on a different host. This allows for only a minute or two of downtime while the virtual machine starts up on a new server and the crashed host is restarted, if possible. This is extremely useful and can keep a business functioning with only a moment of downtime. What Fault Tolerance does is eliminate even that couple of minutes of downtime, so that if a server crashes, nothing is felt by the user. This feature gives companies that can’t stop functioning, even for a minute, the security they need to run their businesses.

How does FT work? With HA, there is a primary host that runs the VM and a dedicated secondary host standing by in case of failure; if and when that failure occurs, the secondary host takes over and the VM is restarted there. The failure is detected using VMware’s heartbeat function, which pings each host every second to ensure it is still active on the network; if a host stops responding, it is considered to have failed and its VMs are moved to a new machine. FT builds on this, but instead of waiting for a host to fail and then restarting the VM, it uses vLockstep to keep the primary and secondary in sync, so that if one were to fail the other would continue running without the user ever noticing the failure. By sharing virtualized storage, all the files are accessible to both hosts, and the primary constantly updates the secondary to keep both hosts’ RAM in sync. FT has a few rules to ensure it works properly (a toy sketch of the heartbeat logic follows the list below):

  • Hosts must be in an HA cluster
  • Primary and secondary VMs must run on different hosts
  • Anti-affinity must be enabled (a setting that ensures the primary and secondary VMs cannot run on the same host)
  • The VMs must be stored on shared storage
  • A minimum of two Gigabit NICs, to allow for vMotion and FT logging traffic
  • Additional NICs for VM and management network traffic
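The sketch below is a toy model of the heartbeat-based failure detection described above: hosts are expected to check in roughly every second, and a host that misses several consecutive heartbeats is declared failed, after which its VM is brought up elsewhere. Host names, thresholds, and the restart function are invented for illustration; this is not VMware code.

```python
import time

HEARTBEAT_INTERVAL = 1.0   # seconds between expected heartbeats
MISSED_LIMIT = 5           # consecutive misses before a host is declared failed

# Last time each host was heard from; host names are invented examples.
last_seen = {"esxi-01": time.time(), "esxi-02": time.time()}

def record_heartbeat(host: str) -> None:
    """Note that the host checked in just now."""
    last_seen[host] = time.time()

def failed_hosts() -> list[str]:
    """Hosts that have been silent longer than the allowed window."""
    now = time.time()
    return [h for h, t in last_seen.items()
            if now - t > MISSED_LIMIT * HEARTBEAT_INTERVAL]

def restart_on_other_host(vm: str, failed: str) -> None:
    """HA-style response: bring the VM back up on a surviving host."""
    survivors = [h for h in last_seen if h != failed]
    print(f"restarting {vm} from {failed} on {survivors[0]}")

# Simulate esxi-02 going silent for longer than the threshold.
record_heartbeat("esxi-01")
last_seen["esxi-02"] = time.time() - (MISSED_LIMIT + 1) * HEARTBEAT_INTERVAL
for host in failed_hosts():
    restart_on_other_host("file-server-vm", host)
```
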
Read More >

Creating a Disaster Recovery Plan: How to Set Up the Right Team

By Tom McDonald | Mar 16, 2011 10:09:00 AM

NSI is a New England IT consulting company specializing in virtualization, disaster recovery, and managed print. This has exposed us to a wide array of IT problems across Connecticut, New York, and Massachusetts, and has given us a good understanding not only of the problems within different companies’ IT departments, but also of where most companies fall short in IT policy and the general flaws in their DR plans. We hope this will help people prepare for disaster from the human side of things.

When most people think of a network going down, they generally attribute the problem to a hardware failure; whether it is a server or a hard drive that fails, most people blame the devices themselves. But 29% of all data loss is attributed to human error, whether from an IT professional who forgot to perform the correct backup or an office employee who accidentally deletes an important file; data loss is real and happens all too often. Unplanned downtime occurs whenever something serious and unexpected happens to your network, and its effects depend on when it happens and how bad the incident is: if a server crashes at 2 am and your business operates 9-5, you are probably all right, but if things shut down at 10 am during the holiday season, the company can suffer serious revenue loss. Juniper Networks reports that human error is the cause of 50-80% of all downtime. This, together with the 29% of data loss that is also human error, shows that a great deal of lost money and many headaches can be avoided by implementing policies that don’t just focus on hardware and software fixes, but also help prevent mistakes from happening in the first place.

Read More >

Prevent IT Disasters: How VMware High Availability protects your data center

By Tom McDonald | Mar 9, 2011 10:46:00 AM

VMware HA (High Availability) is a major step in setting up a disaster recovery objective. With HA enabled, each ESXi host checks in on the other hosts and watches for a failure; if a failure occurs, the VMs on the failed host are restarted on another server. To enable HA on your network, a few prerequisites must be met:

  • All VMs and their configuration files must reside on shared storage, so that every host has access to a VM if the host running it fails.
  • Each host in a VMware HA cluster must have a host name and a static IP address, which guarantees that the hosts can monitor each other without false positives if a host’s IP address were to change.
  • Hosts must be configured to have access to the VM network.
  • Finally, VMware recommends a redundant network connection: if one network card fails, communication with the host continues over the other; without this redundancy, the host would be seen as failing.
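As a minimal sketch of the checklist above, the Python below walks a set of host records and reports which prerequisites are missing. The host entries and field names are invented examples; this is not vSphere or pyVmomi API code.

```python
# Hypothetical host inventory; values are illustrative assumptions.
hosts = [
    {"name": "esxi-01", "static_ip": "10.0.0.11", "shared_storage": True,
     "vm_network_access": True, "redundant_nics": 2},
    {"name": "esxi-02", "static_ip": None, "shared_storage": True,
     "vm_network_access": True, "redundant_nics": 1},
]

def ha_readiness_problems(host: dict) -> list[str]:
    """Return the reasons this host is not ready to join the HA cluster."""
    problems = []
    if not host["name"]:
        problems.append("missing host name")
    if not host["static_ip"]:
        problems.append("no static IP (risk of false failure detection)")
    if not host["shared_storage"]:
        problems.append("VMs and config files not on shared storage")
    if not host["vm_network_access"]:
        problems.append("no access to the VM network")
    if host["redundant_nics"] < 2:
        problems.append("no redundant network connection")
    return problems

for h in hosts:
    issues = ha_readiness_problems(h)
    print(f'{h["name"]}: {"OK" if not issues else "; ".join(issues)}')
```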

Read More >