Data Deduplication and Virtual Machine Backups
The Backup Exec 2012 SP2 Agent for VMware and Hyper-V enables optimised, image-level backups of both VMware and Hyper-V virtual machines. This is accomplished by capturing backups through communication with the VMware or Hyper-V virtual host. However, deduplication recommendations differ between VMware and Hyper-V environments being protected by Backup Exec 2012 SP2.
Deduplication of Image-level VMware Backups
While it is possible to utilise client-side deduplication when protecting VMware virtual machines, this configuration requires that backups be processed by locally installed agents within the virtual machines themselves (either the Agent for Windows or the Agent for Linux). This configuration bypasses the optimised, image-level backup capabilities of the Agent for VMware and Hyper-V in VMware environments that take advantage of Backup Exec’s advanced integration with the VMware vStorage API. For this reason, using client-side deduplication in VMware environments is generally not recommended. Backup Exec server-side deduplication is optimal.
Backup Exec Server-side Deduplication of VMware Image-level Backups
Deduplication of Image-level Hyper-V Backups
Client-side deduplication can be used when capturing image-level backups of Hyper-V virtual machines using the Agent for VMware and the Agent for Microsoft Hyper-V. In this configuration, optimised, image-level backups of virtual machines are captured and deduplicated through the Backup Exec Agent for Windows installed locally on the Hyper-V host. It is not necessary to install an individual agent into each Hyper-V virtual machine in order to realise client-side deduplication in Hyper-V environments.
Client-side Deduplication of Hyper-V Image-level Backups
VMDK and VHD Stream Handlers
Backup Exec 2012 SP2 includes stream handler technology designed specifically for image-level backups of VMware and Hyper-V virtual machines captured through the Agent for VMware and Agent for Microsoft Hyper-V. The stream handler technology within Backup Exec operates invisibly, meaning no additional management or configuration adjustments are required on the part of the administrator.
The stream handler technology within Backup Exec applies to both client-side and Backup Exec server-side deduplication. The stream handlers enable variable-length segmenting of VMware (VMDK) and Hyper-V (VHD) disk files during deduplication calculations. This aligns deduplication blocks to file extent boundaries within the virtual disk, so that data changes over time within virtual disk files result in fewer unique blocks. This translates into better storage savings across both VMware and Hyper-V backups when using the Backup Exec 2012 SP2 Agent for VMware and Agent for Microsoft Hyper-V in conjunction with the Backup Exec 2012 SP2 Deduplication Option.
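The effect of aligning segments to file extent boundaries can be illustrated with a toy model. The sketch below is not the Backup Exec stream handler; the helper names, the 1024-byte fixed segment size and the way the "disk image" is built by concatenating files are all assumptions made for illustration. It compares fixed-length segmenting with extent-aligned segmenting when a small new file shifts the offsets of existing files inside the image.

```python
import hashlib


def make_file(name: str, length: int) -> bytes:
    """Deterministic filler content standing in for a file inside the virtual disk."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(f"{name}:{counter}".encode()).digest()
        counter += 1
    return bytes(out[:length])


def build_image(files: list[bytes]) -> tuple[bytes, list[tuple[int, int]]]:
    """Concatenate files into a toy disk image; return the image and (offset, length) extents."""
    extents, offset = [], 0
    for content in files:
        extents.append((offset, len(content)))
        offset += len(content)
    return b"".join(files), extents


def fixed_segments(image: bytes, size: int) -> list[bytes]:
    """Cut the image every `size` bytes, ignoring what the bytes belong to."""
    return [image[i:i + size] for i in range(0, len(image), size)]


def extent_segments(image: bytes, extents: list[tuple[int, int]]) -> list[bytes]:
    """Cut the image at file extent boundaries (variable-length segments)."""
    return [image[start:start + length] for start, length in extents]


def fingerprints(segments: list[bytes]) -> set[str]:
    return {hashlib.sha256(segment).hexdigest() for segment in segments}


file_a, file_b, file_c = make_file("a", 3000), make_file("b", 5000), make_file("c", 4000)
new_file = make_file("new", 100)

# First backup holds three files; the second has a small new file ahead of them.
first_image, first_extents = build_image([file_a, file_b, file_c])
second_image, second_extents = build_image([new_file, file_a, file_b, file_c])

new_fixed = fingerprints(fixed_segments(second_image, 1024)) - fingerprints(
    fixed_segments(first_image, 1024)
)
new_aligned = fingerprints(extent_segments(second_image, second_extents)) - fingerprints(
    extent_segments(first_image, first_extents)
)

print("new segments, fixed-length:  ", len(new_fixed))
print("new segments, extent-aligned:", len(new_aligned))
```

In this model the extent-aligned segments of the unchanged files keep the same fingerprints, so only the new file has to be stored, whereas the fixed-length segments all shift and look new.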
Combining the Agent for VMware and the Agent for Microsoft Hyper-V with the Deduplication Option can offer significant storage savings for Backup Exec administrators, allowing them to reduce storage costs by getting the most out of the backup storage resources at their disposal.
Additional information on the Backup Exec 2012 SP2 Agent for VMware and the Agent for Microsoft Hyper-V can be found in the Backup Exec 2012 SP2 Agent for VMware and Hyper-V Technical Feature Brief, and in the Backup Exec 2012 Administrator’s Guide and the Backup Exec 2012 SP2 Administrator’s Guide Addendum.
Data Deduplication and Storing Backups to Tape
An environment with a disk-to-disk-to-tape architecture is fairly common among customers who are interested in deduplication. It’s important to note that all of the Backup Exec 2012 SP2 methods of deduplication mentioned here are disk-based; deduplicated data is never stored directly to tape in its deduplicated form. However, the process of migrating deduplicated data to tape is very simple. Customers simply add an additional stage to their backup workflow that sends the data to tape storage.
For data that was backed up using client-side or Backup Exec server-side deduplication, the Backup Exec server is responsible for “rehydrating” the deduplicated data – meaning the process of recreating whole files from deduplicated blocks – before transferring the data to tape. There will be some impact to processor and memory usage during the tape stage of the backup workflow due to the rehydration process. While resource consumption varies based on data set, at most the tape stage of the backup workflow will use 100% of one processor core while rehydrating deduplicated data and copying it to tape.
For data that was backed up to a deduplication appliance, the deduplication appliance itself is responsible for “rehydrating” the deduplicated data prior to it being sent to tape.
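Rehydration can be pictured as replaying an ordered list of block fingerprints against the block store. The following is a minimal sketch under assumed names (`deduplicate`, `rehydrate`, a 600-byte block size); it does not reflect the internal format of the Backup Exec deduplication store.

```python
import hashlib
from typing import Iterator


def deduplicate(data: bytes, block_size: int) -> tuple[dict[str, bytes], list[str]]:
    """Split data into blocks; keep one copy per unique block plus an ordered recipe."""
    blocks: dict[str, bytes] = {}
    recipe: list[str] = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        blocks.setdefault(fingerprint, block)
        recipe.append(fingerprint)
    return blocks, recipe


def rehydrate(recipe: list[str], blocks: dict[str, bytes]) -> Iterator[bytes]:
    """Yield the original blocks in order, as they would be streamed to tape."""
    for fingerprint in recipe:
        yield blocks[fingerprint]


original = b"A" * 600 + b"B" * 600 + b"A" * 600
blocks, recipe = deduplicate(original, 600)
restored = b"".join(rehydrate(recipe, blocks))

print(len(recipe), "blocks in the recipe,", len(blocks), "unique blocks stored")
print("restored matches original:", restored == original)
```

Every block in the recipe has to be looked up and copied again, which is why the tape stage costs processor and memory on whichever system performs the rehydration.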
Backup Exec Server-side Deduplication
Do you have VMware ESX or vSphere servers with high average processor utilization? If so, the Backup Exec server-side deduplication method can be a useful and effective deduplication solution for these environments. This method of deduplication is performed entirely on the Backup Exec server and does not impact source systems any more than a typical backup would.
The Backup Exec server-side deduplication method performs the deduplication processes against data when it arrives at the Backup Exec server – that is, just before the data is laid down on disk. Data is transmitted in its whole, un-deduplicated form, and then decomposed into deduplication blocks in-line by the Backup Exec server. Only the unique data blocks (that is, the data that the deduplication disk storage device doesn’t yet contain) are stored.
Figure 4: Backup Exec Server-side Deduplication
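The in-line process described above can be sketched as a small content-addressed store. The class name `DedupStore`, the fixed 4096-byte block size and the use of SHA-256 fingerprints are assumptions for illustration only, not the Backup Exec implementation.

```python
import hashlib


class DedupStore:
    """Toy deduplication disk storage: one stored copy per unique block."""

    def __init__(self, block_size: int = 4096) -> None:
        self.block_size = block_size
        self.blocks: dict[str, bytes] = {}

    def ingest(self, data: bytes) -> tuple[list[str], int]:
        """Receive whole data, split it into blocks and store only unseen blocks.

        Returns the ordered list of fingerprints and the number of bytes newly stored.
        """
        recipe: list[str] = []
        new_bytes = 0
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in self.blocks:
                self.blocks[fingerprint] = block
                new_bytes += len(block)
            recipe.append(fingerprint)
        return recipe, new_bytes


store = DedupStore()

# Two backups of the same source; only the middle block changed on the second day.
monday = b"A" * 4096 + b"B" * 4096 + b"C" * 4096
tuesday = b"A" * 4096 + b"X" * 4096 + b"C" * 4096

_, monday_new = store.ingest(monday)
_, tuesday_new = store.ingest(tuesday)

print("bytes stored for Monday: ", monday_new)
print("bytes stored for Tuesday:", tuesday_new)
```

Both backups arrive at the store in full, yet the second one adds only the single changed block to disk.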
The Backup Exec server-side deduplication method is optimal in the following situations:
• High Processor Utilization on Remote Servers
If the remote system has no processor cycles to spare for deduplication calculations, Backup Exec server deduplication can take the load and still perform deduplication.
• VMware Environments
When using the Agent for VMware and Hyper-V to capture image-level backups of VMware virtual machines, Backup Exec server-side deduplication must be used.
Backup Exec server-side deduplication is not recommended for the following environments:
• Remote Office Protection Over a WAN
With Backup Exec server-side deduplication, the Backup Exec server receives the entire data set before deduplication takes place. This is not a WAN-friendly method of deduplication. Generally, remote office protection without local storage should use client-side deduplication.
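A rough calculation shows why this matters over a WAN. The figures below (a 100 GiB data set, 128 KiB blocks, 32-byte fingerprints, 2% of blocks not yet in the store) are hypothetical, and the model assumes that client-side deduplication sends one fingerprint per block plus only the blocks the store does not yet hold; actual protocol overheads will differ.

```python
def server_side_wan_bytes(data_bytes: int) -> int:
    """Server-side deduplication: the whole data set crosses the WAN."""
    return data_bytes


def client_side_wan_bytes(
    data_bytes: int, block_size: int, fingerprint_size: int, unique_percent: int
) -> int:
    """Client-side deduplication (assumed model): fingerprints plus unique blocks only."""
    total_blocks = -(-data_bytes // block_size)  # ceiling division
    unique_blocks = total_blocks * unique_percent // 100
    return total_blocks * fingerprint_size + unique_blocks * block_size


GIB = 1024 ** 3
data_set = 100 * GIB

server_side = server_side_wan_bytes(data_set)
client_side = client_side_wan_bytes(
    data_set, block_size=128 * 1024, fingerprint_size=32, unique_percent=2
)

print(f"server-side: {server_side / GIB:.2f} GiB over the WAN")
print(f"client-side: {client_side / GIB:.2f} GiB over the WAN")
```

Under these assumptions the client-side transfer is roughly 2 GiB against 100 GiB for server-side deduplication.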
Any Backup Exec server that has the Deduplication Option licensed can utilize the Backup Exec server-side deduplication method. Most agents and backup types supported by Backup Exec can take advantage of the space savings inherent with Backup Exec server-side deduplication.
Backup Exec 2012 SP2 Agent                    Backup Exec Server-side Deduplication Support
Agent for Windows                             Yes
Agent for Linux                               Yes
Agent for Mac                                 Yes
Agent for Applications and Databases          Yes
Agent for VMware and Hyper-V (VMware)         Yes
Agent for VMware and Hyper-V (Hyper-V)        Yes
Some Backup Exec customer environments have an existing investment in deduplication-enabled appliances for onsite backup, offsite storage (disaster recovery), and remote office protection. The appliance deduplication method is an excellent fit for these environments.
The appliance deduplication method uses Symantec’s OpenStorage (OST) technology in conjunction with both a 3rd-party deduplication appliance and a manufacturer-developed OST plug-in. Together, these components enable the following:
• Intelligent Replication Tracking
Many 3rd-party deduplication appliances include a replication feature enabling data to be efficiently copied from one device to another downstream device. When backup data is transferred by a Backup Exec server to a deduplication appliance through the OST plug-in, the Backup Exec server is able to track when data is replicated to additional appliances. This allows the Backup Exec server to restore data either from the original deduplication appliance or from any of the additional appliance replication destinations.
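Conceptually, replication tracking amounts to a catalog that records every appliance holding a copy of a backup set. The sketch below is purely illustrative: `ReplicationCatalog`, its methods and the appliance names are hypothetical and are not part of the OST plug-in interface or the Backup Exec catalog.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReplicationCatalog:
    """Hypothetical catalog: backup set ID -> appliances known to hold a copy."""

    copies: dict[str, list[str]] = field(default_factory=dict)

    def record_backup(self, backup_id: str, appliance: str) -> None:
        """Note the appliance that received the original backup."""
        self.copies[backup_id] = [appliance]

    def record_replication(self, backup_id: str, target: str) -> None:
        """Note that the appliance replicated the backup set to another device."""
        locations = self.copies[backup_id]
        if target not in locations:
            locations.append(target)

    def restore_source(self, backup_id: str, reachable: set[str]) -> Optional[str]:
        """Pick the first known copy on an appliance that is currently reachable."""
        for appliance in self.copies.get(backup_id, []):
            if appliance in reachable:
                return appliance
        return None


catalog = ReplicationCatalog()
catalog.record_backup("fileserver-full", "onsite-appliance")
catalog.record_replication("fileserver-full", "dr-site-appliance")

# With the onsite appliance unavailable, the restore falls back to the replica.
source = catalog.restore_source("fileserver-full", reachable={"dr-site-appliance"})
print("restore from:", source)
```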
Appliance deduplication requires that the Backup Exec server be paired with one or more supported OST-based deduplication appliances. Symantec Backup Exec is committed to expanding the breadth and depth of OST partners certified to work with Backup Exec, so additional OST devices are being certified and supported as they complete Backup Exec’s internal qualification processes.
For more information on supported 3rd-party appliances compatible with the OST-based appliance deduplication technology within Backup Exec 2012 SP2, please refer to the Backup Exec 2012 SP2 Hardware Compatibility List (HCL) available online.
I’m at TechEd in Berlin … great party for the 2009 anniversary of the fall of the Berlin Wall (shameful timing) – once again Microsoft mess up my weekend. Windows Server 2008 R2 is pretty bold and it will have a significant impact on the market. Piles of guys I have spoken to are interested in the new capabilities.
There are some significant features in the R2 operating system that can help to boost productivity and help administrators gain more management control. It will be of specific interest to companies that have an extensive investment in, or plan a complex deployment of, Hyper-V-based virtualisation; any company that has vast swaths of Windows servers in data centres where space, power or both are becoming tight; as well as any company that is planning to deploy Windows 7 on a wide scale in the near future.
In terms of support for R2, BE is already there with Backup Exec 12.5 for Windows Servers revision 2213 Hotfix 331998. This hotfix contains recommended fixes for Backup Exec for Windows Servers version 12.5 revision 2213, including new support for Windows 2008 R2 (RAWS – Remote Agent support only) and an Agent for VMware Virtual Infrastructure (AVVI) fix.
- Backup Exec 12.5 revision 2213 32-bit Media Servers
- Backup Exec 12.5 revision 2213 64-bit Media Servers
Before installing this hotfix, Backup Exec for Windows 12.5 Service Pack 2 must be installed. Service Pack 2 can be obtained here: http://library.veritas.com/docs/334937. Administrative privileges are required to install this hotfix.
A full backup is recommended after installing this hotfix. Backup Exec Remote Agents must be updated.
- Backup Exec 12.5 for Windows Servers revision 2213 Hotfix 327135 – 32 bit download: http://support.veritas.com/docs/334937
- Backup Exec 12.5 for Windows Servers revision 2213 Hotfix 327135 – 64 bit download: http://support.veritas.com/docs/334938
After applying Backup Exec 12.5 Hotfix 328462, an Agent for VMware Virtual Infrastructure (AVVI) backup job with the “Granular Recovery Technology” (GRT) option enabled completes with the exception “Failed to mount one or more virtual disk images”. (For more details, please refer to this document: http://support.veritas.com/docs/331927)
Installation Guide – The installation guide here contains general information for installing Backup Exec product updates as well as special instructions for configurations including CPS, Remote Agents for Windows Servers, Remote Agent for Linux/Unix/Macintosh Server (RALUS/RAMS), Clustered Backup Exec, Shared Storage (SSO) installations, Central Admin Servers (CASO) installations, and SAP/R3 Oracle Agents. http://support.veritas.com/docs/300795
Protecting the VMware environment has its own unique set of data protection challenges. There are basically three ways to protect VMware: the guest OS method, the console backup method and the VMware Consolidated Backup (VCB) method. The guest OS method treats each virtual machine as a standalone server, and backups take place as usual, as if the virtual machine were a physical server. The second is the console backup method, in which virtualisation administrators back up the VMware ESX Server with no regard for the underlying virtual machines in the ESX environment. (There is a “free” product, ESXi, but it has no console, and requires add-ons to manage.)
VCB Backup requires VMware Infrastructure 3 (VI3), together with a dedicated Windows Server 2003 machine that acts as the backup proxy. It initially required SAN-attached disk (iSCSI or Fibre Channel), but now supports VMFS with local, JBOD, iSCSI and Fibre Channel-attached disk, network file system (NFS) and virtual compatibility mode raw device mapping (RDM). The only mode not currently supported is physical compatibility mode RDM. You then install the VCB software on the Windows Server and provide access to the same SAN Logical Unit Number (LUN) used for the VMware Virtual Disk Files.
The Symantec Backup Exec 12.5 Agent for VMware Virtual Infrastructure (AVVI) is specifically related to the VMware Consolidated Backup framework and is designed and built to communicate directly with VMware ESX and VirtualCenter. VCB was originally introduced in 2006 as nothing more than a collection of interfaces and utilities that backup vendors could exploit. Since then, VCB itself and backup vendor support have expanded considerably. The many different code levels for both VCB and backup applications have caused considerable confusion around what environments are supported and what VCB is today.
It is best to think of VCB as a backup framework with a collection of VMware utilities that facilitates backups. Today VCB utilises standard backup products together with snapshot capabilities. It uses command line interface (CLI) capabilities in VMware to take a snapshot of Windows-based VMs, offloading a copy of the data that Backup Exec then mounts and backs up.
Effectively, VCB provides a centralised backup facility that enables you to use Backup Exec to protect system, application, and user data in your virtual machines while reducing the load on virtualised servers. This allows you to back up your virtual machines without disrupting users and applications. So, VCB provides a way to do server-free and LAN-free backup; VM snapshots can be NFS mounted for quicker recovery and GRT; and backups can be managed centrally to simplify management of IT resources.
Cool so far?
If you are not using VCB you do not need the BE 12.5 AVVI. Most organisations not using VCB are likely to be using ESXi. Although ESXi is free, there is no service console anymore. So you can’t use local agents on your ESXi host. Everything needs to be able to communicate with the VI API or any other remote connect method to gather information – not so cool.
So, the bottom line is AVVI is only needed when there is a VCB framework around the virtual infrastructure.
VMware’s Virtual Infrastructure 3 (VI3) family includes: VMware ESX, VirtualCenter, VCB, VMware Converter & VMotion. Backup Exec 12.5’s Agent for VMware Virtual Infrastructure (AVVI) can leverage all of these components of VMware VI3 to automatically discover, protect, and recover virtual machines and their data. All guest virtual machines (VMs) hosted by VI3, including Windows and Linux virtual machines, can be protected using Backup Exec’s AVVI integrated support of VCB.