Backup Exec Server-side Deduplication
Do you have VMware ESX or vSphere servers with high average processor utilization? If so, the Backup Exec server-side deduplication method can be a useful and effective deduplication solution for these environments. This method of deduplication is performed entirely on the Backup Exec server and does not impact source systems any more than a typical backup would.
The Backup Exec server-side deduplication method performs the deduplication processes against data when it arrives at the Backup Exec server – that is, just before the data is laid down on disk. Data is transmitted in its whole, un-deduplicated form, and then decomposed into deduplication blocks in-line by the Backup Exec server. Only the unique data blocks (that is, the data that the deduplication disk storage device doesn’t yet contain) are stored.
Figure 4: Backup Exec Server-side Deduplication
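As a rough illustration of the in-line flow described above, the following sketch receives a whole backup stream, splits it into blocks, and keeps only blocks it has not seen before. It is not Backup Exec's implementation: the fixed 4-byte block size, the SHA-256 fingerprint, and the `DedupStore` class are all assumptions made for the example.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical and tiny, so the example is easy to follow


class DedupStore:
    """Toy stand-in for a deduplication disk storage device (illustrative only)."""

    def __init__(self):
        self.blocks = {}         # fingerprint -> block bytes, one copy each
        self.bytes_received = 0  # everything that crossed the network

    def ingest(self, stream: bytes) -> list:
        """Receive a full stream, deduplicate it in-line, return its block recipe."""
        self.bytes_received += len(stream)
        recipe = []
        for i in range(0, len(stream), BLOCK_SIZE):
            block = stream[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()  # assumed fingerprint function
            if fp not in self.blocks:               # store unique blocks only
                self.blocks[fp] = block
            recipe.append(fp)
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Rebuild a stream from its recipe of fingerprints."""
        return b"".join(self.blocks[fp] for fp in recipe)

    @property
    def bytes_stored(self) -> int:
        return sum(len(block) for block in self.blocks.values())


store = DedupStore()
monday = store.ingest(b"AAAABBBBCCCC")
tuesday = store.ingest(b"AAAABBBBDDDD")  # only the last block is new
print(store.bytes_received, store.bytes_stored)
```

In this sketch the server still receives every byte of both streams, yet stores only the four distinct blocks. That is the trade-off of the server-side method: no extra load on the source, but no reduction in network traffic.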
The Backup Exec server-side deduplication method is optimal for situations where:
• High Processor Utilization on Remote Servers
If the remote system has no processor cycles to spare for deduplication calculations, Backup Exec server deduplication can take the load and still perform deduplication.
• VMware Environments
When using the Agent for VMware and Hyper-V to capture image-level backups of VMware virtual machines, Backup Exec server-side deduplication must be used.
Backup Exec server-side deduplication is not recommended for the following environments:
• Remote Office Protection Over a WAN
With Backup Exec server-side deduplication, the Backup Exec server receives the entire data set before deduplication takes place. This is not a WAN-friendly method of deduplication. Generally, remote office protection without local storage should use client-side deduplication.
Any Backup Exec server that has the Deduplication Option licensed can utilize the Backup Exec server-side deduplication method. Most agents and backup types supported by Backup Exec can take advantage of the space savings inherent with Backup Exec server-side deduplication.
|Backup Exec 2012 SP2 Agent||Backup Exec Server-side Deduplication Support|
|Agent for Windows||Yes|
|Agent for Linux||Yes|
|Agent for Mac||Yes|
|Agent for Applications and Databases||Yes|
|Agent for VMware and Hyper-V (VMware)||Yes|
|Agent for VMware and Hyper-V (Hyper-V)||Yes|
Some Backup Exec customer environments have an existing investment in deduplication-enabled appliances for onsite backup, offsite storage (disaster recovery), and remote office protection. The appliance deduplication method is an excellent fit for these environments.
The appliance deduplication method uses Symantec’s OpenStorage (OST) technology in conjunction with both a 3rd-party deduplication appliance and a manufacturer-developed OST plug-in. Together, these components enable the following:
• Intelligent Replication Tracking
Many 3rd-party deduplication appliances include a replication feature that enables data to be copied efficiently from one device to another downstream device. When a Backup Exec server transfers backup data to a deduplication appliance through the OST plug-in, the Backup Exec server can track when that data is replicated to additional appliances. This allows the Backup Exec server to restore data either from the original deduplication appliance or from any of the replication destinations.
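The idea of replication tracking can be pictured as a catalog that records every appliance holding a copy of a backup set. The sketch below is purely conceptual: the `Catalog` class, its methods, and the appliance names are hypothetical and do not represent the OST plug-in interface or any Backup Exec API.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical record of which appliances hold each backup set."""
    locations: dict = field(default_factory=dict)  # backup set -> list of appliances

    def record_backup(self, backup_set: str, appliance: str) -> None:
        # The original write to the first appliance.
        self.locations.setdefault(backup_set, []).append(appliance)

    def record_replication(self, backup_set: str, target: str) -> None:
        # A downstream copy reported after appliance-side replication.
        if backup_set not in self.locations:
            raise KeyError(f"unknown backup set: {backup_set}")
        if target not in self.locations[backup_set]:
            self.locations[backup_set].append(target)

    def restore_source(self, backup_set: str, online: set):
        # Prefer the original appliance, then fall back to replicas in order.
        for appliance in self.locations.get(backup_set, []):
            if appliance in online:
                return appliance
        return None


catalog = Catalog()
catalog.record_backup("fileserver-full", "hq-appliance")
catalog.record_replication("fileserver-full", "dr-appliance")

# With the original appliance offline, the replica is chosen for restore.
source = catalog.restore_source("fileserver-full", online={"dr-appliance"})
print(source)
```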
Appliance deduplication requires that the Backup Exec server be paired with one or more supported OST-based deduplication appliances. Symantec Backup Exec is committed to expanding the breadth and depth of OST partners certified to work with Backup Exec, so additional OST devices are being certified and supported as they complete Backup Exec’s internal qualification processes.
For more information on supported 3rd-party appliances compatible with the OST-based appliance deduplication technology within Backup Exec 2012 SP2, please refer to the Backup Exec 2012 SP2 Hardware Compatibility List (HCL) available online.
Deduplication Methods – Client-side Deduplication
With Backup Exec 2012 SP2, exciting possibilities for remote office protection are available. The concept of client-side deduplication – where the remote system is responsible for deduplication calculations and where backup data is sent over the network in its deduplicated form – can make the process of protecting remote offices a much more streamlined experience. Remote offices can be challenging to protect effectively; WAN environments may only utilize a fraction of the bandwidth available to a LAN backup. Backups over the WAN can be a challenge to set up, as well as to complete. Some environments include backup servers that are not as powerful as the application servers they are protecting – often, the SQL server or the Exchange server in the environment is the most powerful machine available in terms of processor speed or disk throughput. Where appropriate, why not leverage some of this remote computing power to achieve faster backups? Both of these situations are problems where client-side deduplication can offer a comprehensive solution to the data protection challenges brought on by the environment.
Generally, remote office backup strategies have two basic architectures. First, there are remote offices which do not have local storage, and where backup data is sent directly over the LAN or WAN to the central data center for storage. Second, there are remote offices that employ local storage and then “forward” that locally stored backup data to the central data center for protection. Both of these configurations can use the Backup Exec 2012 SP2 Deduplication Option to streamline and improve backup and recovery for remote offices.
Client-side deduplication is the act of skipping redundant data blocks at the backup source before transmitting the backup stream to the Backup Exec server. Data from the source system is refined into smaller deduplication blocks, and only the unique blocks (that is, the data the Backup Exec server doesn’t yet contain) are sent to the Backup Exec server’s deduplication disk storage device.
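The exchange described above can be sketched as a two-step conversation: the client fingerprints its blocks and asks the server which ones are missing, then sends only those. This is a simplified model under stated assumptions, not the actual agent protocol: the `Server` class, `client_backup` function, 4-byte block size, and SHA-256 fingerprint are all hypothetical, and the small cost of sending the fingerprints themselves is ignored.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for readability


def fingerprint(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()  # assumed fingerprint function


def split_blocks(data: bytes) -> list:
    return [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]


class Server:
    """Toy Backup Exec server holding the deduplication store."""

    def __init__(self):
        self.blocks = {}  # fingerprint -> block bytes

    def missing(self, fingerprints) -> set:
        # Tell the client which fingerprints the store does not yet contain.
        return {fp for fp in fingerprints if fp not in self.blocks}

    def store(self, new_blocks: dict) -> None:
        self.blocks.update(new_blocks)


def client_backup(data: bytes, server: Server) -> int:
    """Back up data and return the number of block bytes sent over the network."""
    blocks = {fingerprint(b): b for b in split_blocks(data)}  # work done on the client
    wanted = server.missing(blocks.keys())
    to_send = {fp: blocks[fp] for fp in wanted}
    server.store(to_send)
    return sum(len(b) for b in to_send.values())


server = Server()
first = client_backup(b"AAAABBBBCCCC", server)   # nothing stored yet, all blocks sent
second = client_backup(b"AAAABBBBDDDD", server)  # only the changed block is sent
print(first, second)
```

Here the second backup sends a third of the data the first one did, which is why this method suits links with limited bandwidth.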
A deduplication disk storage device is a special type of disk storage configured by Backup Exec where all deduplication data blocks are stored. With the client-side deduplication method, the majority of the processing necessary for deduplication is done on the remote system rather than as the data arrives at the Backup Exec server. Client-side deduplication is the default deduplication method Symantec recommends for several reasons:
Client-side deduplication enables greater scalability by spreading processor usage out across all clients running backups, enabling the Backup Exec server to process more concurrent backups.
Reduced Network Data Transfers
Client-side deduplication minimizes network data transfers as only unique data blocks – not yet stored by the Backup Exec server – are transferred. Most environments – either LAN or WAN environments – can benefit from less data being sent across the network.
Each Backup Exec Agent for Windows and Agent for Linux has the built-in capability to perform client-side deduplication calculations. Note that all deduplication operations require the Deduplication Option to be licensed on the Backup Exec server.
|Backup Exec 2012 SP2 Agent||Client Deduplication Support|
|Agent for Windows||Yes|
|Agent for Linux||Yes|
|Agent for Mac||No|
|Agent for Applications and Databases||Yes|
|Agent for VMware and Hyper-V (VMware)||No*|
|Agent for VMware and Hyper-V (Hyper-V)||Yes**|
|*While it is possible to utilize client-side deduplication when protecting VMware virtual machines, this configuration requires that backups be processed by locally installed agents within the virtual machines themselves (the Agent for Windows or the Agent for Linux). This configuration bypasses the optimized, image-level backup capabilities of the Agent for VMware and Hyper-V in VMware environments. For these reasons, using client-side deduplication in VMware environments is generally not recommended. Backup Exec server-side deduplication is usually optimal.|
|**Client-side deduplication can be used when protecting Hyper-V environments using the Agent for VMware and Hyper-V. In this configuration, optimized, image-level backups of virtual machines are captured and deduplicated through the Backup Exec Agent for Windows installed locally to the Hyper-V host. It is not necessary to install an individual agent into each Hyper-V virtual machine in order to realize client-side deduplication in Hyper-V environments.|
Modern Data Management and Protection Challenges
Customers of all types and sizes are seeking new and innovative ways to overcome challenges associated with data growth and storage management. While these challenges are not necessarily new, they continue to become more complex and more difficult to overcome due to the following:
- Pace of data growth has accelerated
- Location of data has become more dispersed
- Linkages between data sets have become more complex
Data and storage management challenges are compounded by the need for companies to protect critical data assets against disaster through backup and recovery solutions. In order to maintain backups of critical data assets, additional secondary storage resources are required. This additional layer of backup storage must be implemented wherever backups occur, including central data centers and remote offices.
Storage Efficiencies through Data Deduplication
Backup Exec 2012 includes advanced data deduplication technology that allows companies to dramatically reduce the amount of storage required for backups, and to more efficiently centralize backup data from multiple sites for assured disaster recovery. These data deduplication capabilities are available in the Backup Exec 2012 Deduplication Option.
Backup Exec 2012 Data Deduplication Technology
The data deduplication technology within Backup Exec 2012 breaks down streams of backup data into “blocks.” Each data block is identified as either unique or non-unique, and a tracking database is used to ensure that only a single copy of a data block is saved to storage by that Backup Exec server. For subsequent backups, the tracking database identifies which blocks have been protected and only stores the blocks that are new or unique. For example, if five different client systems are sending backup data to a Backup Exec server and a data block is found in backup streams from all five of those client systems, only a single copy of the data block is actually stored by the Backup Exec server. This process of reducing redundant data blocks that are saved to backup storage leads to significant reduction in storage space needed for backups.
Figure 1: Deduplication Process
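The five-client example can be worked through with a toy tracking database. In this sketch the tracking database is a plain dictionary plus a reference counter, and the fingerprint is SHA-256. Both are assumptions for illustration and say nothing about how Backup Exec stores its tracking data internally.

```python
import hashlib
from collections import Counter

stored_blocks = {}      # fingerprint -> block data, a single copy per block
references = Counter()  # fingerprint -> how many times backups refer to it


def protect(client_blocks: list) -> None:
    """Record one client's backup, storing only blocks not already tracked."""
    for block in client_blocks:
        fp = hashlib.sha256(block).hexdigest()
        if fp not in stored_blocks:
            stored_blocks[fp] = block
        references[fp] += 1


# Five clients each send one shared block and one block of their own.
shared = b"common operating system file"
for n in range(1, 6):
    protect([shared, f"data unique to client {n}".encode()])

shared_fp = hashlib.sha256(shared).hexdigest()
print(len(stored_blocks), references[shared_fp])
```

Ten blocks arrive in total, but only six are stored: five unique blocks and one copy of the shared block, which is referenced five times.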
The deduplication technology within Backup Exec is applied across all backups managed by a deduplication-enabled Backup Exec server.
Deduplication Methods within Backup Exec 2012
The Backup Exec 2012 Deduplication Option gives administrators the flexibility to choose when and where deduplication calculations take place. Three deduplication methods are supported by Backup Exec 2012. These are as follows:
Client-side Deduplication
The client-side deduplication method is a software-driven process. Deduplication takes place at the source or protected client, and backup data is sent over the network in deduplicated form to the Backup Exec server. Only unique blocks of backup data are sent to the backup server and saved to backup storage; non-unique blocks are skipped.
Backup Exec Server-side Deduplication
The Backup Exec server-side deduplication method is also a software-driven process. Deduplication takes place after backup data has arrived at the Backup Exec server and just before data is stored to disk (also known as inline deduplication). Only unique blocks of backup data are stored; non-unique blocks are skipped.
Appliance Deduplication
The appliance deduplication method is a hardware-driven process. Deduplication takes place on the deduplication appliance, either in-line or post-process depending on the device (for example, an ExaGrid or Quantum appliance). The 3rd-party deduplication device handles all aspects of deduplication.
Administrators can mix and match deduplication methods to fit their unique needs. For example, a single Backup Exec server enabled for deduplication can simultaneously use client-side deduplication for some jobs, Backup Exec server-side deduplication for others, and appliance deduplication for yet another set of jobs.
Figure 2: Deduplication Methods
The different deduplication methods supported by Backup Exec 2012 have various configurations for which they are best suited. The benefits of each method, as well as the configurations for which each method is best suited, will be detailed in the following weeks.
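The guidance given for each method can be condensed into a small decision helper. This is only a sketch of the recommendations stated in this document. The function name, its parameters, and in particular the order in which conflicting conditions are resolved are assumptions, not Symantec guidance.

```python
def suggest_method(*, vmware_image_backup=False, has_ost_appliance=False,
                   wan_without_local_storage=False, source_cpu_busy=False) -> str:
    """Suggest a deduplication method; the rule precedence is an assumption."""
    if vmware_image_backup:
        # Image-level VMware backups use Backup Exec server-side deduplication.
        return "server-side"
    if has_ost_appliance:
        # An existing investment in OST-based appliances favours appliance dedup.
        return "appliance"
    if wan_without_local_storage:
        # Sending whole data sets over a WAN is not recommended.
        return "client-side"
    if source_cpu_busy:
        # No spare processor cycles on the source: let the server do the work.
        return "server-side"
    # Client-side is the default method recommended in this document.
    return "client-side"


print(suggest_method(source_cpu_busy=True))
print(suggest_method(wan_without_local_storage=True))
```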
How to help reduce the amount of data you back up …
BEDAT – What’s that?
The Backup Exec Deduplication Assessment Tool (BEDAT) is a utility designed to help partners demonstrate the value of Backup Exec and its deduplication technology to their customers – without having to have Backup Exec installed on the system! BEDAT scans user-selected data sets on one or more Windows-based systems in a customer’s network environment and estimates the deduplication savings that would be experienced if the same systems were protected using Backup Exec or the Backup Exec 3600 Appliance and deduplication – using the same algorithms used in BE itself. BEDAT returns global deduplication results, per resource deduplication results, and per data type deduplication results. BEDAT does not actually capture or transport any customer data during the assessment process; it only captures deduplication fingerprint information and transmits this data to be included in deduplication results.
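To give a feel for how savings can be estimated from fingerprints alone, the sketch below computes a global figure and a per-resource figure without retaining any block contents. It is not BEDAT's algorithm: the fixed 4-byte blocks, the SHA-256 fingerprint, and the `estimate_savings` function are assumptions chosen to keep the arithmetic easy to check.

```python
import hashlib

BLOCK_SIZE = 4  # hypothetical block size, kept tiny for readability


def fingerprints(data: bytes) -> list:
    """Return (fingerprint, size) pairs; block contents are not kept."""
    pairs = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        pairs.append((hashlib.sha256(block).hexdigest(), len(block)))
    return pairs


def estimate_savings(resources: dict):
    """Return (global saving, per-resource savings) as fractions between 0 and 1."""
    seen_global = set()
    total = unique_total = 0
    per_resource = {}
    for name, data in resources.items():
        seen_local = set()
        local_unique = 0
        for fp, size in fingerprints(data):
            total += size
            if fp not in seen_global:   # first sighting across all resources
                seen_global.add(fp)
                unique_total += size
            if fp not in seen_local:    # first sighting within this resource
                seen_local.add(fp)
                local_unique += size
        per_resource[name] = 1 - local_unique / len(data) if data else 0.0
    overall = 1 - unique_total / total if total else 0.0
    return overall, per_resource


overall, per_resource = estimate_savings({
    "server1": b"AAAAAAAABBBB",  # blocks AAAA, AAAA, BBBB
    "server2": b"AAAACCCCCCCC",  # blocks AAAA, CCCC, CCCC
})
print(overall, per_resource)
```

In this example each server on its own would save a third of its data, while deduplicating across both saves half, because the `AAAA` block is shared between them.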
How does it work?
The Backup Exec Deduplication Assessment Tool (BEDAT) installs to almost any Windows-based, x86 or x64 computer system. When run, it can calculate deduplication results for the system on which it is installed, as well as other systems available on the network. When capturing deduplication data from remote network systems, a small agent is temporarily installed to the remote servers and removed after deduplication calculations have been completed. BEDAT is designed to be as simple and as easy to use as possible. It is a wizard-driven utility that does not require any specific IT expertise to use successfully.
Who is it for?
The Backup Exec Deduplication Assessment Tool (BEDAT) is designed to be used by Backup Exec partners as they help customers understand the storage optimisation benefits of the deduplication technology found in Backup Exec. If you are a customer please feel free to ask your Symantec IT supplier to provide this service.
Where can Partners get it?
The Backup Exec Deduplication Assessment Tool (BEDAT) is available for partners to download at the Symantec PartnerNet site. For end user customers interested in using BEDAT in their environments, please contact a local Symantec partner and talk to them about how Backup Exec can help optimise your backup storage and network utilisation.
What platforms and data types are supported?
The Backup Exec Deduplication Assessment Tool (BEDAT) supports Windows 2003 and Windows 2008 x86 and x64 platforms, including both physical and virtual systems. It supports estimating deduplication results for file system data, Exchange data, and SQL data.
Please note: while designed to be highly accurate, the results offered by the Backup Exec Deduplication Assessment Tool (BEDAT) represent estimates of the storage savings that would be gained by using Backup Exec deduplication technology.
Very loosely, we were instructed to delete everything pre dot com bubble bursting (2000), keep everything post and now we are fast running out of data centre disk allocation space, err?
In fact it’s a wonder we manage to do anything given the amount of information we need to process. As a consequence we are now facing a greater threat – too much information. Somewhere between 60 and 160 billion emails are sent around the world every single day. These emails include attachments such as reports, presentations, letters and pictures. In spite of limitations such as privacy concerns and too much unwanted mail, email is the best way to communicate efficiently, quickly and cheaply. The danger with email, as with any other way of sharing information, is that too much information simply clogs up the system and becomes a bottleneck to productivity.
Here are some useful top tips that may help:
- Understand the new business user – organisations must better understand the challenges employees are facing when navigating the world of information management. Look at when and how employees are accessing their information, make sure that data is indexed and categorised, and that intelligent archiving and search tools are available
- Prepare the infrastructure – with the relentless flow of information only set to continue, IT infrastructure must be able to cost effectively manage the increasing requirements for storage by implementing solutions able to dedupe and archive appropriately, automate processes and monitor and report on system status across all different devices and environments
- Prepare people – create IT policies that educate employees on how to manage their information – from email practices like limiting the ‘CC’ and ‘reply to all’ culture, to saving only the latest document version and overcoming the fear of the delete button. Help employees understand the company’s information retention strategy so they know what information is recoverable. This will empower them to take charge of information control and maintain productivity and efficiency
- Keep security front of mind – it seems like an obvious statement, but reinforcing company security policies around mobile devices could protect against significant and damaging data loss. Make sure employees know the company processes and take advantage of technologies that enable the IT department to see where the most important information is, at all times
- Encourage staff to switch off – with the information era in full swing and with more and more opportunity for employees to stay connected at all times, it’s important that organisations support staff welfare and encourage them to switch off every once in a while
Seriously consider optimising your storage to reduce overall front-end storage usage. You can improve capacity through integrated archiving and deduplication, as well as by tiering your storage. Archiving moves old data to a separate store so you don’t have to back up the same data day-in, day-out – forever. Deduplication backs up each block of data only once and uses a pointer to that unique block wherever it recurs. So with archiving and data deduplication you can both reduce the amount you back up and dramatically shorten your backup window.
But, I hear you say, if I implement deduplication technology what are the benefits? Well, Backup Exec can help with that too. Read all about the Backup Exec Deduplication Assessment Tool in Part III.
… or so Computerworld UK said on Monday. Something that I’ve been saying for 3 years – but, hey, what do I know? According to Computerworld UK the consumerisation of IT is an unavoidable phenomenon that will force businesses to rethink their security policies. What about all that stuff some poor soul will have to back up? CIOs need to deal with consumerisation. For many years we were able to say to our end users – “no you can’t have that”, or “no we don’t support this”. But those days are over. The security of this phenomenon is certainly a concern, but IDC reckon that 80% of data created by individual consumers will end up on corporate networks. This will inevitably cause an overload on our already overloaded systems. Until we have the management capabilities for streaming applications to the desktop (oh, sorry, Symantec already does that). OK, so when we get around to migrating to this model, where all our data is held inside the network, consumerisation won’t be nearly as scary – until then data growth will continue on its upward curve.
That’s why deduplication and archiving are key to our backup strategies. We are challenged with managing and protecting ever-increasing amounts of data. Backup Exec offers deduplication across physical and virtual machines to reduce the length and size of backups. Deduplication has the power to transform information management; it is great for backup, it is great for archiving, and it can even make virtualised server backup manageable. Symantec believes that deduplication should live in every part of the information architecture.
Much of the content produced now consists of email, documents, presentations, and other types of unstructured information. This explosion of information has a significant impact on storage spending and IT’s ability to meet the needs of its internal customers and business units. Backup Exec’s integrated archiving option is focused on reducing the amount of information backed up. Together with Enterprise Vault (EV), Backup Exec’s Agent for EV helps organisations to unify content sources, apply retention policies, reduce backup windows, shorten recovery times, and optimise storage resources making it easier for companies of all sizes to store, manage, and protect all unstructured data.
I don’t know about you but I’ve been following the EMC vs NetApp Data Domain saga with some interest. Well, the waiting eventually ended with EMC stealing Data Domain from under NetApp’s nose for a measly $2.4B (a mere bagatelle?). Earlier this year EMC announced the pulling together of all its “data reduction” technologies to give it some sort of coherence. A great strategy but one that is pretty difficult to accomplish with so many disparate technologies in the EMC portfolio. Adding Data Domain to the mix makes it even less likely to happen. Which is unfortunate when you consider that for most customers data deduplication is a pretty important requirement.
In reality it is important for EMC that the substantial Data Domain/EMC integration effort takes precedence over short-term data deduplication integration, and so EMC will do whatever it takes to show how effortless the integration of Data Domain has been. Data Domain sales teams are likely to be incredibly aggressive, pushing their deduplication capabilities at the expense of everything else. I would be pretty careful about what you purchase from EMC and why.
From Symantec’s point of view Backup Exec is looking to manage all these points. For BE customers it’s going to be dead simple. If you want to improve recovery and manage tape- and disk-based backup, as well as system recovery and virtualisation technologies, and you’ve already got Backup Exec, you will be able to simply plug a solution into your existing processes using an agent for deduplication. Remember, if you are considering moving your backup solution to anything but BE, you’ll have to do some serious thinking about your backup architecture, and you won’t simply be plugging into your existing backup software.
The conversations you should be having with your IT partner are around strategic fit. What’s the right technology to solve my business problem? You need a tool that does the job, or all the jobs, you need it to do, not something that falls short in one or more areas. So, whatever the question is … the answer is The Backup Exec Family.