Deduplication Methods – Client-side Deduplication
With Backup Exec 2012 SP2, exciting possibilities for remote office protection are available. Client-side deduplication, where the remote system performs the deduplication calculations and sends backup data over the network in its deduplicated form, can make protecting remote offices a much more streamlined experience. Remote offices can be challenging to protect effectively: a WAN typically offers only a fraction of the bandwidth available to a LAN backup, so backups over the WAN can be difficult to set up and to complete.
In addition, some environments include backup servers that are not as powerful as the application servers they protect; often, the SQL server or the Exchange server is the most powerful machine available in terms of processor speed or disk throughput. Where appropriate, why not leverage some of this remote computing power to achieve faster backups? Client-side deduplication can address both of these challenges (limited WAN bandwidth and an underpowered backup server).
Generally, remote office backup strategies have two basic architectures. First, there are remote offices which do not have local storage, and where backup data is sent directly over the LAN or WAN to the central data center for storage. Second, there are remote offices that employ local storage and then “forward” that locally stored backup data to the central data center for protection. Both of these configurations can use the Backup Exec 2012 SP2 Deduplication Option to streamline and improve backup and recovery for remote offices.
Client-side deduplication is the act of skipping redundant data blocks at the backup source before transmitting the backup stream to the Backup Exec server. Data from the source system is refined into smaller deduplication blocks, and only the unique blocks (that is, the data the Backup Exec server doesn’t yet contain) are sent to the Backup Exec server’s deduplication disk storage device.
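The idea can be illustrated with a minimal Python sketch. It is not Backup Exec's implementation: the fixed 4-byte block size, the SHA-256 fingerprints, and the `DedupStore` class (a stand-in for the server's deduplication disk storage device) are all assumptions chosen for illustration, and the `has` call stands in for a query that would really cross the network.

```python
import hashlib

BLOCK_SIZE = 4  # tiny illustrative block size in bytes (assumption, not a Backup Exec value)


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> list[bytes]:
    """Cut the source data into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


class DedupStore:
    """Hypothetical stand-in for the server's deduplication disk storage device."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}

    def has(self, fingerprint: str) -> bool:
        return fingerprint in self.blocks

    def put(self, fingerprint: str, block: bytes) -> None:
        self.blocks[fingerprint] = block


def client_side_backup(data: bytes, store: DedupStore) -> tuple[list[str], int]:
    """Run on the client: return the ordered fingerprints and the bytes actually sent."""
    recipe = []
    bytes_sent = 0
    for block in split_into_blocks(data):
        fingerprint = hashlib.sha256(block).hexdigest()
        recipe.append(fingerprint)
        if not store.has(fingerprint):  # only blocks the server lacks are transmitted
            store.put(fingerprint, block)
            bytes_sent += len(block)
    return recipe, bytes_sent


def restore(recipe: list[str], store: DedupStore) -> bytes:
    """Rebuild the original data from the stored blocks, in recipe order."""
    return b"".join(store.blocks[fp] for fp in recipe)
```

In this sketch, backing up `b"AAAABBBBAAAACCCC"` to an empty store sends only three of the four blocks, because the repeated `AAAA` block is already known by the time it is seen again.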
A deduplication disk storage device is a special type of disk storage configured by Backup Exec where all deduplication data blocks are stored. With the client-side deduplication method, the majority of the processing necessary for deduplication is done on the remote system rather than as the data arrives at the Backup Exec server. Client-side deduplication is the default deduplication method Symantec recommends for several reasons:
Client-side deduplication enables greater scalability by spreading processor usage out across all clients running backups, enabling the Backup Exec server to process more concurrent backups.
Reduced Network Data Transfers
Client-side deduplication minimizes network data transfers as only unique data blocks – not yet stored by the Backup Exec server – are transferred. Most environments – either LAN or WAN environments – can benefit from less data being sent across the network.
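To get a feel for what this means on a slow link, here is a rough back-of-the-envelope calculation in Python. Every number in it is a hypothetical assumption (100 GB of source data, 5% unique blocks, a 10 Mbit/s WAN link), and it ignores fingerprint exchange and protocol overhead, so treat it as an illustration rather than a sizing tool.

```python
def bytes_to_send(total_bytes: int, unique_percent: int) -> int:
    """Bytes that still cross the network when only unique blocks are sent."""
    return total_bytes * unique_percent // 100


def transfer_seconds(num_bytes: int, link_bits_per_second: int) -> float:
    """Idealized transfer time: payload bits divided by link speed."""
    return num_bytes * 8 / link_bits_per_second


if __name__ == "__main__":
    total = 100 * 10**9  # 100 GB of source data (assumed)
    link = 10 * 10**6    # 10 Mbit/s WAN link (assumed)
    full = transfer_seconds(total, link)
    dedup = transfer_seconds(bytes_to_send(total, 5), link)  # 5% unique (assumed)
    print(f"full: {full / 3600:.1f} h, deduplicated: {dedup / 3600:.1f} h")
```

Under these assumed figures, the full transfer takes about 22 hours while the deduplicated one takes a little over an hour.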
Each Backup Exec Agent for Windows and Agent for Linux has the built-in capability to perform client-side deduplication calculations. Note that all deduplication operations require the Deduplication Option to be licensed on the Backup Exec server.
|Backup Exec 2012 SP2 Agent|Client Deduplication Support|
|---|---|
|Agent for Windows|Yes|
|Agent for Linux|Yes|
|Agent for Mac|No|
|Agent for Applications and Databases|Yes|
|Agent for VMware and Hyper-V (VMware)|No*|
|Agent for VMware and Hyper-V (Hyper-V)|Yes**|

*While it is possible to utilize client-side deduplication when protecting VMware virtual machines, this configuration requires that backups be processed by locally installed agents within the virtual machines themselves (the Agent for Windows or the Agent for Linux). This configuration bypasses the optimized, image-level backup capabilities of the Agent for VMware and Hyper-V in VMware environments. For these reasons, using client-side deduplication in VMware environments is generally not recommended. Backup Exec server-side deduplication is usually optimal.
**Client-side deduplication can be used when protecting Hyper-V environments using the Agent for VMware and Hyper-V. In this configuration, optimized, image-level backups of virtual machines are captured and deduplicated through the Backup Exec Agent for Windows installed locally on the Hyper-V host. It is not necessary to install an individual agent into each Hyper-V virtual machine in order to realize client-side deduplication in Hyper-V environments.
One of the really cool features of BE is Granular Recovery Technology (GRT). By the way, anytime you need more information on any aspect of BE, please see the Backup Exec for Windows Servers Administrator’s Guide. In fact, don’t take my word for it, download from here:
Just a few tips to help you get the best out of BE’s GRT:
- Review the requirements for staging locations in the Administrator’s Guide.
- You must use a staging location for GRT-enabled jobs in the following scenarios:
  - You back up to or restore from a volume with file size limitations.
  - You restore granular items from tape.
  - You run an off-host backup job.
- You are better off creating a separate backup-to-disk folder specifically for all GRT-enabled backup jobs; this really simplifies media management, because you need to manage the IMG media that GRT-enabled jobs create differently than other backup-to-disk media.
- Don’t allocate a maximum size for backup-to-disk files. If you do, you are in danger of getting failed jobs because of low disk space. GRT information is stored in IMG media, so the backup-to-disk file often occupies extra space, yet Backup Exec will only create a backup-to-disk file as large as the size that you specified.
- If you run frequent incremental GRT-enabled jobs, it is a really good idea to run a full GRT-enabled backup job every so often. Each incremental GRT-enabled job requires a small amount of internal storage, and if this storage grows too much it can affect system resources. Running the full GRT-enabled backup job frees the storage space that has accumulated from the incremental jobs.
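The staging-location rule from the tips above boils down to a simple "any of these three conditions" check. The Python sketch below only illustrates that logic; the `GrtJob` class and its field names are hypothetical and do not correspond to any Backup Exec API or setting name.

```python
from dataclasses import dataclass


@dataclass
class GrtJob:
    """Hypothetical description of a GRT-enabled job (not a Backup Exec object)."""
    volume_has_file_size_limit: bool = False
    restores_granular_items_from_tape: bool = False
    is_off_host_backup: bool = False


def needs_staging_location(job: GrtJob) -> bool:
    """A staging location is required if any one of the three scenarios applies."""
    return (
        job.volume_has_file_size_limit
        or job.restores_granular_items_from_tape
        or job.is_off_host_backup
    )
```

For example, `needs_staging_location(GrtJob(restores_granular_items_from_tape=True))` returns `True`, while a job with none of the three conditions returns `False`.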
Backup Exec 12.5 delivers GRT for Exchange, Active Directory, SharePoint Server, and SharePoint Services, which gives you the ability to recover granular data quickly and efficiently from a single-pass backup. It means, for example, that you do not have to run separate Exchange mailbox backups to recover individual items, and the same single-pass backup lets you recover granular data such as documents, list items, and user attributes or properties.