Modern Data Management and Protection Challenges
Customers of all types and sizes are seeking new and innovative ways to overcome challenges associated with data growth and storage management. While these challenges are not necessarily new, they continue to become more complex and more difficult to overcome due to the following:
- Pace of data growth has accelerated
- Location of data has become more dispersed
- Linkages between data sets have become more complex
Data and storage management challenges are compounded by the need for companies to protect critical data assets against disaster through backup and recovery solutions. Maintaining backups of critical data assets requires additional secondary storage resources, and this additional layer of backup storage must be implemented wherever backups occur, including central data centers and remote offices.
Storage Efficiencies through Data Deduplication
Backup Exec 2012 includes advanced data deduplication technology that allows companies to dramatically reduce the amount of storage required for backups, and to more efficiently centralize backup data from multiple sites for assured disaster recovery. These data deduplication capabilities are available in the Backup Exec 2012 Deduplication Option.
Backup Exec 2012 Data Deduplication Technology
The data deduplication technology within Backup Exec 2012 breaks down streams of backup data into “blocks.” Each data block is identified as either unique or non-unique, and a tracking database is used to ensure that only a single copy of a data block is saved to storage by that Backup Exec server. For subsequent backups, the tracking database identifies which blocks have already been protected and only stores the blocks that are new or unique. For example, if five different client systems are sending backup data to a Backup Exec server and a data block is found in backup streams from all five of those client systems, only a single copy of the data block is actually stored by the Backup Exec server. This process of eliminating redundant data blocks saved to backup storage leads to a significant reduction in the storage space needed for backups.
Figure 1: Deduplication Process
The deduplication technology within Backup Exec is applied across all backups managed by a deduplication-enabled Backup Exec server.
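As an illustrative sketch of the general technique (not Symantec's actual implementation, which is proprietary), block-level deduplication can be modeled as splitting a backup stream into fixed-size blocks, fingerprinting each block with a cryptographic hash, and storing each unique block only once. The hash-to-block index plays the role of the tracking database described above:

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative fixed block size; real products vary


class DedupStore:
    """Toy block store: keeps exactly one copy of each unique block."""

    def __init__(self):
        self.blocks = {}       # hash -> block bytes (the "tracking database")
        self.raw_bytes = 0     # total backup bytes received
        self.stored_bytes = 0  # bytes actually written to storage

    def backup(self, data: bytes) -> list:
        """Split a backup stream into blocks and store only unseen blocks.
        Returns the list of block hashes (the 'recipe' needed for restore)."""
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            self.raw_bytes += len(block)
            if digest not in self.blocks:   # unique block: store it once
                self.blocks[digest] = block
                self.stored_bytes += len(block)
            recipe.append(digest)           # non-unique: store a reference only
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Reassemble the original stream from stored blocks."""
        return b"".join(self.blocks[h] for h in recipe)


store = DedupStore()
payload = b"A" * 8192 + b"B" * 4096     # two identical "A" blocks, one "B" block
recipe = store.backup(payload)
assert store.restore(recipe) == payload
print(len(store.blocks), store.raw_bytes, store.stored_bytes)  # → 2 12288 8192
```

Even in this toy example, 12 KB of backup data produces only 8 KB of stored data because the duplicated block is kept once; across many client systems backing up similar operating systems and applications, the reduction is far larger.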
Deduplication Methods within Backup Exec 2012
The Backup Exec 2012 Deduplication Option gives backup administrators the flexibility to choose when and where deduplication calculations take place. Three deduplication methods are supported by Backup Exec 2012. These are as follows:
Backup Exec Client-side Deduplication
The client-side deduplication method is a software-driven process. Deduplication takes place at the source, or protected client, and backup data is sent over the network in deduplicated form to the Backup Exec server. Only unique blocks of backup data are sent to the backup server and saved to backup storage; non-unique blocks are skipped.
Backup Exec Server-side Deduplication
The server-side deduplication method is also a software-driven process. Deduplication takes place after backup data has arrived at the Backup Exec server and just before data is stored to disk (also known as inline deduplication). Only unique blocks of backup data are stored; non-unique blocks are skipped.
Third-party Appliance Deduplication
The third-party appliance deduplication method is a hardware-driven process built on the Symantec OpenStorage (OST) APIs. Deduplication takes place on the third-party deduplication appliance itself, which can use inline or post-process deduplication (ExaGrid or Quantum appliances, for example). Third-party appliance deduplication devices handle all aspects of deduplication.
Administrators can mix and match deduplication methods to fit their unique needs. For example, a single Backup Exec server enabled for deduplication can simultaneously use client-side deduplication for some jobs, server-side deduplication for other jobs, and third-party appliance deduplication for yet another set of jobs.
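The practical difference between the client-side and server-side methods is where block fingerprints are computed, and therefore how much data crosses the network. A hypothetical sketch under the same toy model as before (function names are illustrative, not Backup Exec's API):

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative fixed block size


def split(data: bytes):
    """Break a backup stream into fixed-size blocks."""
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


def client_side_backup(data: bytes, server_index: dict) -> int:
    """Client hashes blocks locally and transmits only blocks the server
    has not seen: deduplicated data travels over the wire.
    Returns bytes transmitted."""
    transmitted = 0
    for block in split(data):
        h = hashlib.sha256(block).hexdigest()
        if h not in server_index:
            server_index[h] = block     # unique block crosses the network
            transmitted += len(block)
    return transmitted


def server_side_backup(data: bytes, server_index: dict) -> int:
    """Client sends the full stream; the server deduplicates inline,
    just before writing to disk. Returns bytes transmitted."""
    transmitted = len(data)             # everything crosses the network
    for block in split(data):
        h = hashlib.sha256(block).hexdigest()
        server_index.setdefault(h, block)   # store unique blocks only
    return transmitted


data = b"X" * 4096 * 3                      # three identical blocks
print(client_side_backup(data, {}))         # → 4096  (one unique block sent)
print(server_side_backup(data, {}))         # → 12288 (full stream sent)
```

Both methods end up storing the same single unique block; they differ only in network cost, which is why client-side deduplication is attractive for remote offices and constrained links.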
Figure 2: Deduplication Methods
Each of the deduplication methods supported by Backup Exec 2012 is best suited to particular configurations. The benefits of each method, and the configurations for which it is best suited, will be detailed in the following weeks.
When searching for a backup and recovery solution for virtual environments, here are a few “must have” features to consider:
1) Granular Recovery
Granular and application-level recovery is paramount to any virtual backup strategy. If you can’t restore what you need, when you need it, then your entire backup strategy is flawed from day one. Make sure your chosen solution provides all levels of recovery – full virtual machine, individual virtual disks, and virtualized application and database servers, along with standards like files, folders, and granular objects such as an individual email.
Backup Exec leverages Symantec’s patented Granular Recovery Technology (GRT) to provide all the recovery methods mentioned above. The innovative GRT feature helps IT Administrators save time and headaches by enabling them to restore individual files, folders and granular objects within a guest virtual machine from a single-pass image backup. In addition, Backup Exec also provides the ability to recover an entire VM or virtual disk, virtualized applications and databases. Backup Exec even includes physical to virtual conversion technology, so you can accelerate your transition to virtual environments. Overall, Backup Exec provides one product and any recovery.
2) Application Awareness
Application awareness is an essential component of virtual machine backup. While most backup products can create consistent backups of applications through integration with technologies like Microsoft’s VSS, many do not perform required post-process functions like log truncation, which ensure you are protecting the application completely. Many backup applications can’t perform granular recovery of those virtualized applications either.
Many business-critical applications – like Microsoft Exchange or SQL Server – will only perform certain types of maintenance when a successful backup occurs. Application-aware backup solutions ensure this maintenance can take place. Usually, this requires some sort of software (i.e. an agent, whether deployed beforehand or injected and uninstalled on demand) in the virtualized application server. The most capable backup applications, such as Backup Exec, are able to index, catalog, or otherwise capture important application metadata that is necessary for fast search and recovery of granular application items.
3) Data Deduplication
We’ve all heard the saying that VMs are multiplying like bunny rabbits. According to a recent ESG survey companies have about 16 virtual machines per physical host, with a plan to grow to 26 per host. This number will continue to move upwards as hardware is built to accommodate this trend. It’s no surprise that between all these guest machines there is significant duplication of data from both applications and operating systems.
To manage data growth and storage costs while optimizing network bandwidth, data deduplication is a must. However, not all data deduplication solutions are equal. Look for a solution that offers source-side deduplication. Why? Removing redundant data as close to the source as possible maximizes the benefits of deduplication: it decreases network traffic, reduces the storage footprint, and lowers resource consumption, thereby helping to beat backup windows and make backup strategies more successful.
Also, ensure your data deduplication solution works across everything you protect – all virtual machines and any physical servers too – otherwise the storage savings from deduplication will be severely reduced. You want to deduplicate your data as effectively as possible, and having multiple backup jobs containing the same data isn’t very efficient. For example, if you are protecting 100 VMs and 50 physical servers running Windows, true global data deduplication would reduce the backup to just one instance of the operating system as opposed to 150.
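As a back-of-the-envelope illustration of that example (the per-machine OS footprint is an assumed figure, not from the source), global deduplication's impact on OS-image storage alone is easy to quantify:

```python
# Hypothetical numbers: 150 Windows machines (100 VMs + 50 physical servers),
# each carrying an assumed 10 GB operating-system footprint.
machines = 150
os_gb = 10  # assumed per-machine OS footprint, for illustration only

without_dedup = machines * os_gb   # every backup job stores its own OS copy
with_global_dedup = 1 * os_gb      # one shared instance across all jobs
savings_pct = 100 * (1 - with_global_dedup / without_dedup)

print(without_dedup, with_global_dedup, round(savings_pct, 1))  # → 1500 10 99.3
```

The exact ratio depends on block sizes and how much the images actually overlap, but the direction of the math holds: duplicated operating-system data dominates, and deduplicating it globally rather than per job is where the savings come from.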
Backup Exec enables customers to choose the deduplication method that best suits their environment. Backup Exec’s Deduplication Option offers three methods for deduplicating data across the enterprise (across all backup jobs). These methods are Client (or source) Deduplication, Media Server Deduplication, and Appliance Deduplication.
4) Physical Server and Multi-Hypervisor Support
More and more organizations are running multiple hypervisors within their environment as alternatives to VMware gain popularity – particularly Microsoft’s Hyper-V. Finding a single solution that supports all of your hypervisors will reduce backup complexity and licensing, streamline management, and lower costs.
While some IT organizations have invested in separate tools for backup – one for physical servers and another for virtual servers – customers have consistently asked for a single vendor to manage both environments. Differing approaches to backup lead to inconsistent data management, backup confusion, increased cost, and even conflict between IT teams. The solution is for IT to bring the virtualization and backup teams together and assign ownership, authority, and resources for backup of both physical and virtual machines.
With the release of Backup Exec 2012, now you can eliminate backup complexity and the need for specialized point products through a single solution that unifies virtual and physical, deduplication, and replication while offering the choice of on-premise software, appliance, or cloud delivery models. Unlike other solutions, Backup Exec is powered by Symantec V-Ray technology, which enables visibility across both virtual and physical environments for fast and efficient backup and recovery.
What are your must haves in a backup and recovery solution for VMs and why?
Actually, I’ve discovered it’s not just me – thank goodness. When you go through a product launch process there is always a chance that the general ”noise” is just you banging on about something, and that it’s limited to the inside of your own head.
Not so in the case of BE 2010. I’ve been running around Europe over the last month or so, and the feedback I’ve been getting is that there is an awful lot of interest in deduplication. Even smaller companies who thought that deduplication was probably too much for their needs are seriously looking at getting rid of some of the duplicate data on their primary storage, and even more are looking at deduplication as a way of improving backup and restore times.
I was at an event a couple of months back where every conversation I had was about the length of time it takes to back up. Data continues to grow everywhere across the IT infrastructure – laptops, disparate storage devices, remote offices, as well as the good old data centre – and this has created a fundamental shift in the way organisations need to manage information. Keeping information on disk for faster DR restores is all fine and dandy, but there is simply too much data around. Disk-based backup is now getting as tricky to manage and as cumbersome as tape-based backup.
A number of customers are turning to deduplication technologies in order to facilitate faster backups, reduce primary storage, and not only cut the amount of disk being used up but also improve tape media rotation and management.
Deduplication gives you the ability to take a strategic approach to storage and backups. Organisations now have the ability to deploy an integrated platform that is easy to manage and supports both source- and target-based deduplication.
Primary storage deduplication will become widely deployed in the next 6 to 12 months. Most organisations have not yet gone down this route; however, with the Option now built into Backup Exec 2010 this is all the more likely, because it’s simple to get hold of, really easy to install, and the benefits are huge.