Learn how to back up virtual machines


George Crump
01.23.2008


What you will learn: Two approaches to backing up virtual machines.

Virtual machine disk format (VMDK) files created for virtual machines (VMs) reside in a VMware file system referred to as VMFS. Each VMDK file represents a physical hard drive that VMFS presents to your virtual machine. All user data and configuration information about the virtual server is stored in the VMDK file.

In general, VMDK files tend to be quite large; files as large as 2 TB are not uncommon. Because of this, they are characterized by large-block I/O patterns. The VMDK file is updated whenever user data or the virtual server configuration changes. Since the VMDK has no built-in incremental data capture, any change to this file means that the whole file needs to be backed up again.
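
To put rough numbers on that last point, here is a minimal sketch in Python. The VMDK sizes and change amounts are hypothetical figures picked for illustration, and the function name is ours, not part of any VMware tool. It compares how much data has to be copied when every changed VMDK is backed up in full against how much data actually changed.

```python
def backup_volume_gb(vmdk_sizes_gb, changed_gb_per_vm):
    """Return (full_file_total, changed_total) in GB for one backup pass.

    Assumes, as described above, that a VMDK with any change at all
    has to be copied in full.
    """
    full_file_total = sum(
        size
        for size, changed in zip(vmdk_sizes_gb, changed_gb_per_vm)
        if changed > 0
    )
    changed_total = sum(changed_gb_per_vm)
    return full_file_total, changed_total


# Hypothetical environment: three VMs with small daily changes.
sizes = [200, 500, 2000]   # VMDK sizes in GB
changes = [1, 5, 0.5]      # GB modified since the last backup

full, changed = backup_volume_gb(sizes, changes)
print(f"copied as whole files: {full} GB")
print(f"data that actually changed: {changed} GB")
```

With these made-up inputs, 2,700 GB is copied to protect 6.5 GB of changed data, which is the gap the rest of this tip is concerned with.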

How you back up VMDK files depends on what version of VMware ESX you are using.

Backup approaches for ESX 3.0 (or earlier)

If you are running a pre-3.0 version of VMware or if your ESX 3.0 server is not connected to network storage, you might choose to install backup agents on each virtual machine to get file-level backup, effectively treating them as physical systems. In that case, you would be using your traditional backup process to back up the data. The challenge with this approach is that the backup agent puts a significant I/O and CPU load on the virtual server being backed up. If many virtual machines are backed up at the same time, the multiple active agents are likely to overload the ESX host server.

Alternatively, you might choose to install an agent on the ESX Service Console. This delivers a disaster recovery (DR) backup capability by grabbing the entire set of virtual machines, redo files, service consoles and host states.

The ESX Service Console backup does not give you the ability to do file-level recoveries, so you still need the agent backup on each virtual machine if that is your requirement. If you are doing both, there is a lot of redundant data in these two backup types. Not only is each virtual machine instance essentially being backed up twice (once by each process), but there is also typically a lot of similarity between the VMDK files. You may have 14 Windows virtual machines, each with its own application, yet all 14 OS installations are very similar.

This type of backup is the ideal scenario for a data deduplication device and is likely to produce unusually high deduplication ratios. Efficiencies of more than 40:1 for VMware backups are possible. Without data deduplication, you are less likely to do frequent backups from the ESX Service Console because of the backup capacity they consume. Performance is also an issue, as backup windows are finite in a production virtualized environment.
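
The effect of similar OS installations on a deduplicating target can be sketched with a toy model. The Python below is an illustration under stated assumptions, not how any particular product works: it uses fixed 4 KB segments and SHA-256 fingerprints (real systems often use variable-size segments), and `make_image`, `DedupStore` and the image contents are hypothetical. It builds 14 fake images that share a common base and differ only in a small application portion.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumed fixed segment size, chosen only for illustration


def make_image(segment_ids):
    """Build a fake disk image: one segment per id, each filled with that id."""
    return b"".join(i.to_bytes(4, "big") * (SEGMENT_SIZE // 4) for i in segment_ids)


class DedupStore:
    """Toy deduplicating store that keeps each distinct segment once."""

    def __init__(self):
        self.unique = {}        # fingerprint -> segment bytes
        self.logical_bytes = 0  # total bytes presented for backup

    def ingest(self, image):
        """Store an image and return its list of segment fingerprints."""
        recipe = []
        for offset in range(0, len(image), SEGMENT_SIZE):
            segment = image[offset:offset + SEGMENT_SIZE]
            digest = hashlib.sha256(segment).hexdigest()
            self.unique.setdefault(digest, segment)  # store only if unseen
            recipe.append(digest)
        self.logical_bytes += len(image)
        return recipe

    def stored_bytes(self):
        return sum(len(segment) for segment in self.unique.values())

    def ratio(self):
        return self.logical_bytes / self.stored_bytes()


# 14 hypothetical VM images: 100 shared "OS" segments plus 2 unique
# "application" segments each.
os_base = list(range(100))
store = DedupStore()
for vm in range(14):
    app_ids = [10_000 + 2 * vm, 10_001 + 2 * vm]
    store.ingest(make_image(os_base + app_ids))

print(f"logical: {store.logical_bytes} bytes")
print(f"stored:  {store.stored_bytes()} bytes")
print(f"ratio:   {store.ratio():.2f}:1")
```

In this toy data set the shared base is stored once, so 1,428 segments of input shrink to 128 stored segments, roughly 11:1 from a single pass. The figure is a property of the made-up inputs; ratios in practice depend on how similar the images are and on how many generations are retained.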

VMware DR backups are particularly hard to replicate to another site. Since each console's backup is a large net new file (or image set), replication across a WAN segment is problematic. Again, this is where a data deduplication system capable of optimised replication shines. Even though you are backing up the entire image to disk, only the segment-level differences between the new image and the existing backup of the image are stored, and only those deltas need to be replicated to the remote site.
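
The idea of replicating only segment-level deltas can be sketched as follows. This is an illustrative Python model, assuming fixed 4 KB segments identified by SHA-256 fingerprints and a remote site that already holds yesterday's backup; `make_image`, `split_and_hash` and `plan_replication` are hypothetical names, not any vendor's API.

```python
import hashlib

SEGMENT_SIZE = 4096  # assumed fixed segment size, for illustration only


def make_image(segment_ids):
    """Build a fake disk image: one segment per id, each filled with that id."""
    return b"".join(i.to_bytes(4, "big") * (SEGMENT_SIZE // 4) for i in segment_ids)


def split_and_hash(image):
    """Yield (fingerprint, segment) pairs for each fixed-size segment."""
    for offset in range(0, len(image), SEGMENT_SIZE):
        segment = image[offset:offset + SEGMENT_SIZE]
        yield hashlib.sha256(segment).hexdigest(), segment


def plan_replication(new_image, remote_digests):
    """Return (recipe, to_send) for a new backup image.

    recipe  : ordered fingerprints needed to rebuild the image remotely
    to_send : only the segments the remote site does not already hold
    """
    recipe, to_send = [], {}
    for digest, segment in split_and_hash(new_image):
        recipe.append(digest)
        if digest not in remote_digests:
            to_send[digest] = segment
    return recipe, to_send


# Yesterday's image has 50 segments; today two of them have changed.
yesterday_ids = list(range(50))
today_ids = list(yesterday_ids)
today_ids[3] = 1000
today_ids[17] = 1001

remote_digests = {d for d, _ in split_and_hash(make_image(yesterday_ids))}
recipe, to_send = plan_replication(make_image(today_ids), remote_digests)

sent = sum(len(segment) for segment in to_send.values())
print(f"image size: {len(recipe) * SEGMENT_SIZE} bytes")
print(f"sent over the WAN: {sent} bytes in {len(to_send)} segments")
```

The whole image is still read and fingerprinted locally, but only the two changed segments plus the recipe would need to cross the WAN in this model.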

SAN suppliers will suggest that you use their built-in replication to move your virtual server content to a remote site, again replicating at a block level. That strategy has three problems. First, you have to have a SAN, and that SAN has to be used for all virtual machines and images. Second, the disk at the DR site has to be from the same SAN supplier. Third, the capacity at the remote site must equal the capacity at the primary site. All three of these issues drive costs up. In addition, there is a fair amount of complexity in getting SAN-based replication working, and it still does not solve the core backup issue. It also means paying for high bandwidth, because the replicated data is not optimised.

Backing up to a deduplication storage system, on the other hand, helps solve the backup issue while at the same time driving down costs. Multiple generations of the local VMDK files can be stored for months, and they can be replicated to a DR site with the same segment-level efficiency as SAN-based replication. In addition, both sites will benefit from data deduplication and, as stated earlier, the reduction in data storage and associated costs could be significant.

Backup approaches for VMware 3.1 users

If you are using VMware 3.1 (the latest version) and your ESX server is on a SAN, VMware 3.1 makes the process substantially easier with VMware Consolidated Backup (VCB). With VCB, you can get centralized file-level backups without installing agents on each guest VM. VCB moves the backup process out of the virtual machine and into the infrastructure. Essentially, the ESX server takes a file system-consistent live snapshot of a selected virtual machine. That snapshot can then be mounted on a backup server attached to the SAN, which can direct the data to a backup target.
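
The order of operations described above (snapshot, mount on the backup server, copy to the target, then clean up) can be sketched in Python. This is a simulation only: `RecordingOps`, `mounted_snapshot` and `back_up_vm` are hypothetical names that stand in for whatever your backup software actually calls, and none of them is a VCB command or API. The point of the sketch is the cleanup guarantee, so that a failed copy does not leave a snapshot mounted on the proxy.

```python
from contextlib import contextmanager


class RecordingOps:
    """Hypothetical stand-in for snapshot and mount operations; it only logs calls."""

    def __init__(self):
        self.events = []

    def take_snapshot(self, vm_name):
        self.events.append(("snapshot", vm_name))
        return vm_name + "-snap"

    def mount(self, snapshot_id):
        self.events.append(("mount", snapshot_id))
        return "/mnt/" + snapshot_id

    def unmount(self, mount_point):
        self.events.append(("unmount", mount_point))

    def remove_snapshot(self, snapshot_id):
        self.events.append(("remove_snapshot", snapshot_id))


@contextmanager
def mounted_snapshot(ops, vm_name):
    """Snapshot a VM and mount it, always undoing both steps afterwards."""
    snapshot_id = ops.take_snapshot(vm_name)
    try:
        mount_point = ops.mount(snapshot_id)
        try:
            yield mount_point
        finally:
            ops.unmount(mount_point)
    finally:
        ops.remove_snapshot(snapshot_id)


def back_up_vm(ops, vm_name, copy_to_target):
    """Run one proxy-style backup: snapshot, mount, copy, clean up."""
    with mounted_snapshot(ops, vm_name) as mount_point:
        copy_to_target(mount_point)


ops = RecordingOps()
back_up_vm(ops, "vm01", lambda path: ops.events.append(("copy", path)))
for event in ops.events:
    print(event)
```

Because the guest VM is only involved in the snapshot step, the copy itself runs on the backup server, which is the load-shifting benefit described above.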

Again, a data deduplication system is the ideal target for VCB. Your primary goal with VCB is to get the proxy to mount the image and back it up rapidly to relieve system resources. The advantage of using a disk target that has data deduplication is that, as stated earlier, there is highly redundant data within the image being backed up, and that image is also highly redundant with the images already backed up on disk.

From a DR perspective, VMware discusses two options: replicating the VMFS disks using the replication capability that probably came with the SAN, or backing up the VMs to tape with an enterprise backup application and then recovering at the DR hot site. For the same reasons discussed above, SAN replication is less than ideal. The problems with tape are numerous, and recovering from tape at the hot site is too time consuming.

As with non-VCB VMware backups, a data deduplication system's ability to replicate at a block level provides the ability to have up-to-date DR server farms across the country or around the world. In some cases, this can provide for long distance business continuance by replicating servers via backups, multiple times per day, and having those servers in a stand-by mode at the DR site by restoring them periodically to a remote ESX host.

Backing up VMware environments has created storage management challenges and increased backup costs for IT staff. Using a deduplication storage system can reduce those backup costs and improve the quality of your DR site.

About the author: George Crump is an independent storage consultant with over 20 years of experience.


All Rights Reserved, Copyright 2008, TechTarget