Backup Exec and Integrated Archiving for Exchange – Part XI … More Performance Notes and Recommendations
Archiving Backups Stored to Tape
The Backup Exec 2012 SP2 Exchange Mailbox Archiving Option does not currently support archiving from backup data stored to tape media.
Backup Storage Types Supported for Exchange Archiving
In order to be eligible for archiving, backup sets must be stored to one of the following storage location types:
- Non-removable disk storage
- Deduplication disk storage
- A storage array in a Storage Provisioning Option environment
Data Removal Best Practices
It is recommended that vault store properties be set so that items are deleted from the original location only after vault stores are backed up. This is the default setting. Schedule vault store backups (full or incremental) to run between every run of archive tasks. That way, every archive task will be able to remove those items from the original location that were archived by the previous run of the archive task, thereby providing storage savings on the primary system and its subsequent backups.
End User Recovery of Archived Emails
The Backup Exec 2012 SP2 Exchange Mailbox Archiving Option provides the user-friendly Virtual Vault feature designed to make end user recovery of archived emails a very simple and painless process. It is recommended that Backup Exec 2012 SP2 administrators take full advantage of this feature. For details on configuring Virtual Vault for end users, please refer to the Backup Exec 2012 SP2 Administrator’s Guide Addendum.
Archive Storage Configuration Best Practices
Configure your backup destination storage to use different disks than your vault store partitions. This will give better performance for archive tasks that read data from backup sets and ingest the data into archives, as reading and ingestion processes will have separate physical disk resources at their disposal.
Configure your archive indexing location to use different disks than your vault store partitions. This will give better performance for archive tasks that index data as it is being archived, as indexing and ingestion processes will have separate physical disk resources at their disposal.
Microsoft Outlook Required
The Backup Exec 2012 SP2 Exchange Mailbox Archiving Option requires Outlook 2007 SP2 (including Microsoft hot fix 968858) to be installed to the Backup Exec 2012 SP2 server. Microsoft Outlook should be installed before installing the Exchange Mailbox Archiving Option.
Before installing the Exchange Mailbox Archiving Option, be sure that DNS has been configured correctly. The Backup Exec 2012 SP2 server adds its own alias into DNS, and Symantec has found that a large percentage of customer issues relating to the installation and configuration of the Backup Exec 2012 SP2 Exchange Mailbox Archiving Option are related to environments where DNS is not configured correctly. The installation wizard will prompt for a fully qualified domain name in order to create the DNS alias.
Exchange Objects not archived by an Exchange Mailbox Archiving Task
The following Exchange objects are not valid candidates for archiving tasks:
- Mail messages that have pending reminders
- Any Exchange items other than mail messages, such as address book entries and calendar items
- Mail messages in Exchange managed folders, journal mailboxes, or in public folders
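The exclusions above can be read as a simple filter. The sketch below is illustrative only: the `ExchangeItem` type, its field names, and the location labels are hypothetical and are not part of any Backup Exec interface. It also covers only the exclusions listed here, not the archive rules (such as message age) that an administrator configures.

```python
from dataclasses import dataclass

@dataclass
class ExchangeItem:
    # Hypothetical fields, chosen only to mirror the exclusion list above.
    kind: str                       # e.g. "mail", "contact", "calendar"
    location: str                   # e.g. "mailbox", "managed_folder", "journal", "public_folder"
    has_pending_reminder: bool = False

# Locations that the list above rules out for archiving.
EXCLUDED_LOCATIONS = {"managed_folder", "journal", "public_folder"}

def is_archive_candidate(item: ExchangeItem) -> bool:
    """Return True if none of the listed exclusions apply to the item."""
    return (
        item.kind == "mail"
        and not item.has_pending_reminder
        and item.location not in EXCLUDED_LOCATIONS
    )

items = [
    ExchangeItem("mail", "mailbox"),
    ExchangeItem("mail", "mailbox", has_pending_reminder=True),
    ExchangeItem("calendar", "mailbox"),
    ExchangeItem("mail", "public_folder"),
]
candidates = [i for i in items if is_archive_candidate(i)]
```

Only the first item in the sample list passes the filter; the other three each hit one of the exclusions.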
First Exchange Archive Task Performance
The first time an Exchange archive task is run, it will find a large number of mail messages that are valid candidates for archiving. This can result in an Exchange archive task taking a substantial amount of time to finish the first time it is run. Subsequent runs of the Exchange archive task will only pick up messages that became eligible since the last run, and will not take as long.
It is advisable to plan around the first run of Exchange archive tasks by scheduling it to run over the weekend or over a longer block of available time. Another approach would be to add mailboxes to archive tasks in phases rather than all at once.
Archive Task Scheduling Recommendations
Schedule your archive tasks to run after full backups. Because archive tasks source data from backup images, an archive task that runs after a full backup is faster: it processes the latest full backup rather than a chain of incremental backups going back to the latest full.
Schedule your archive tasks so that they run outside the backup window. Because archive tasks source data from backup images, archive tasks impact the backup server and not production systems. Scheduling archive tasks to run outside the backup window allows them to make full use of backup server processing cycles and prevents scheduling conflicts between backup tasks and archive tasks.
General Best Practices
Emails Must Be Backed Up Before They Can Be Archived
Due to the unique implementation of archiving capabilities within Backup Exec 2012 SP2, only data that is being protected by Backup Exec 2012 SP2 through backup jobs can be archived. If no backup job exists to protect the Exchange server in question, the email data associated with that Exchange server cannot be archived.
In Backup Exec 2012 SP2, archiving tasks are implemented as an additional stage to a backup job. Archiving tasks will only employ archiving rules against source data from the backup job of which they are a part. When adding an archive stage to a backup job, it is advisable to configure the backup job to be the most compatible with archiving, such as storing backups to disk rather than directly to tape.
The Exchange Mailbox Archiving Option does not support archiving data from backup sets stored only to tape.
The Exchange Mailbox Archiving Option does not currently support clustering. It will install to a Backup Exec 2012 SP2 server that is a cluster node, but it will not be allowed to join the Backup Exec 2012 SP2 cluster.
For the Backup Exec 2012 SP2 Exchange Mailbox Archiving Option, the Backup Exec 2012 SP2 server must be in a domain. For configurations involving multiple domains, the domain of the Backup Exec 2012 SP2 services account must be trusted by the Backup Exec 2012 SP2 server domain as well as the domain of the Exchange servers targeted for archiving.
In addition, the Backup Exec services account must be granted permissions on each Exchange server targeted for archiving tasks. Please refer to the Administrator’s Guide for details of the permissions that the Backup Exec 2012 SP2 services account needs on Exchange mailboxes in order to enable archiving.
Exchange Mailbox Archiving Option Sizing Guidelines
The Backup Exec 2012 SP2 Exchange Mailbox Archiving Option requires permanent disk space for the following archiving components:
- Vault store
- Vault store partitions
- Index locations
- The following SQL Express or SQL Server databases:
- Directory database
- Vault store databases
- Fingerprint databases
As the data in a vault store grows, additional vault store partitions can be added to provide additional capacity. Local drives or network shares can be used for vault store partitions. The following section offers introductory sizing guidelines for administrators; for further details, please refer to the Backup Exec 2012 Administrator’s Guide.
Sizing Guidelines for the Exchange Mailbox Archiving Option
Symantec supplies certain formulas that administrators can use to estimate disk space requirements for the Exchange Mailbox Archiving Option. The following values and variables are used in the formulas:
- ‘N’ is the number of emails
- ‘m’ is the average number of identical copies of attachments across user mailboxes
- The compression factor for attachments is estimated as 60%; if the attachments are mostly Office 2007 files, the compression factor to use is 90%
- The average number of emails that have attachments is estimated at 20%
- The average size of an email attachment is estimated at 250 KB
Vault Store Partition Size
The size of a vault store partition used for Exchange Mailbox Archiving Option depends on the following items:
- Size of the emails
- Type of attachments
- Number and size of the attachments
- Number of emails with attachments
Vault store partition sizing formula for the Exchange Mailbox Archiving Option for which single instance storage is enabled:
(N x 16) + ((1/m) x N x 0.2 x 0.6 x 250) kilobytes
For example, suppose you want to know the disk space requirements for a vault store partition for 100,000 emails, and you estimate that each email attachment is shared across three people on average. The calculation for the approximate disk space requirements would be as follows:
(100000 x 16) + ((1/3) x 100000 x 0.2 x 0.6 x 250) kilobytes = 2.6 GB approximately
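For anyone who wants to try other mailbox profiles, the partition formula can be wrapped in a small function. This is a minimal sketch, not a Symantec tool: the function and parameter names are made up for illustration, the defaults are the estimates quoted above, and the result is converted using decimal gigabytes (1 GB = 1,000,000 KB) to match the worked example.

```python
def vault_partition_kb(n_emails, copies_per_attachment,
                       per_email_kb=16, attach_fraction=0.2,
                       compression=0.6, attach_kb=250):
    """Estimate vault store partition size in KB (single instance storage enabled)."""
    # Fixed per-email cost: N x 16
    base = n_emails * per_email_kb
    # Attachment cost, shared across m identical copies: (1/m) x N x 0.2 x 0.6 x 250
    attachments = n_emails * attach_fraction * compression * attach_kb / copies_per_attachment
    return base + attachments

size_kb = vault_partition_kb(100_000, 3)
print(f"{size_kb / 1_000_000:.1f} GB")  # prints "2.6 GB"
```

If the attachments are mostly Office 2007 files, pass `compression=0.9` as suggested in the list of values above.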
The size of an index is approximately 8% of the total size of the items that are archived. The percentage may be less if there is less content to index. For example, there is less content to index when there are large attachments such as MP3 or .jpeg files.
Example: You have 100,000 emails that each have a body size of 8 KB. About 20% of the emails have attachments, each with an average total size of 250 KB. The index size is approximately 450 MB.
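The index example can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the function name and parameters are hypothetical, and it assumes that "total size of the items" means message bodies plus attachments and that 1 MB = 1,024 KB.

```python
def index_size_kb(n_emails, body_kb=8, attach_fraction=0.2,
                  attach_kb=250, index_ratio=0.08):
    """Estimate index size in KB as roughly 8% of the archived item size."""
    bodies = n_emails * body_kb
    attachments = n_emails * attach_fraction * attach_kb
    return (bodies + attachments) * index_ratio

index_kb = index_size_kb(100_000)
index_mb = index_kb / 1024  # assumption: binary megabytes
```

For the example values this gives about 464,000 KB, or roughly 453 MB, which is in line with the "approximately 450 MB" figure quoted above.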
Directory Database Size
The Directory database only grows when a new mailbox or share is archived for the first time. The recommended disk space allocation is 500 MB.
Vault Store Database Size
The size of a vault store database is approximately:
N x 500 bytes
The vault store database grows with every item that is archived. Temporary space is used to hold information on the items that have not been backed up or indexed.
Fingerprint Database Size
The fingerprint database is created only if you enable single instance storage of archived items. Backup Exec 2012 SP2 initially allocates 212 MB for the fingerprint database. The fingerprint database grows with every item that is archived.
If the database grows to more than 212 MB, use the following calculation to estimate the disk space that it requires:
(1/m) x N x 0.2 x 500 bytes
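The two database formulas can be sketched in the same way. The names below are hypothetical, the 212 MB initial allocation is treated as decimal megabytes (an assumption), and taking the larger of the initial allocation and the formula result is just one reading of the guidance above.

```python
MB = 1_000_000  # assumption: decimal megabytes
INITIAL_FINGERPRINT_BYTES = 212 * MB

def vault_store_db_bytes(n_emails):
    """Vault store database: N x 500 bytes."""
    return n_emails * 500

def fingerprint_formula_bytes(n_emails, copies_per_attachment, attach_fraction=0.2):
    """Fingerprint database: (1/m) x N x 0.2 x 500 bytes."""
    return n_emails * attach_fraction * 500 / copies_per_attachment

n, m = 100_000, 3
vault_db = vault_store_db_bytes(n)                 # 50,000,000 bytes, about 50 MB
fp_formula = fingerprint_formula_bytes(n, m)       # about 3.3 MB
fp_needed = max(INITIAL_FINGERPRINT_BYTES, fp_formula)
```

For 100,000 emails the fingerprint formula stays well below the initial allocation, so the 212 MB figure is the one to plan for at that scale.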
Exchange Mailbox Analyzer Tool
To assist with planning efforts around Exchange email archiving, an Exchange mailbox analyzer utility is available for download from the SymIQ for Partners portal.
Exchange Mailbox Analyzer (EMA) is a tool that examines Microsoft Exchange Server 2003, 2007, and 2010 environments to collect information on:
- Number of messages
- Size of messages
- Age of messages
- Number of attachments
- Size of attachments
- Top users
- Duplicate attachments*
- Duplicate message bodies*
The results collected can be imported into the Enterprise Vault 8.0 sizing tool to help estimate sizing requirements for Enterprise Vault.
Each license of the Backup Exec 2012 SP2 Exchange Mailbox Archiving Option enables archiving protection for 10 user mailboxes. Additionally, one license of the Backup Exec 2012 SP2 Agent for Applications and Databases is required for each Exchange server that you want to archive.
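As a rough aid, the licensing rule can be written as a short calculation. This is a sketch under assumptions: the function name is made up, the mailbox and server counts in the example are hypothetical, and it assumes that a partly used block of 10 mailboxes still needs a whole license.

```python
import math

def required_licenses(mailboxes, exchange_servers, mailboxes_per_license=10):
    """Count licenses for archiving, following the rule described above."""
    return {
        # One option license per 10 mailboxes, rounded up (assumption).
        "Exchange Mailbox Archiving Option": math.ceil(mailboxes / mailboxes_per_license),
        # One agent license per Exchange server to be archived.
        "Agent for Applications and Databases": exchange_servers,
    }

licenses = required_licenses(mailboxes=95, exchange_servers=3)
```

For 95 mailboxes on 3 Exchange servers this returns 10 option licenses and 3 agent licenses.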
Example Licensing Environment
Here is an example Backup Exec 2012 SP2 environment with Exchange servers to be archived:
Exchange Mailbox Archiving Option Example Licensing Diagram
This environment would require the following licenses in order to be fully enabled with Exchange archiving capabilities and remain in compliance with Backup Exec 2012 SP2 license requirements:
|Backup Exec Licensable Component||Required Licenses|
|Backup Exec Server||1|
|Exchange Mailbox Archiving Option (10 Users Each)||10|
|Agent for Microsoft Exchange||3|
Exchange Mailbox Archiving Option and the Agent for Applications and Databases
The Exchange Mailbox Archiving Option can only archive emails from GRT backup sets captured from Exchange servers using the Agent for Applications and Databases. As a result, the Agent for Applications and Databases is required for each Exchange server that will be involved in the archiving process.
Exchange Mailbox Archiving Option and the Agent for VMware and the Agent for Microsoft Hyper-V
The Agents for VMware and Hyper-V offer advanced technology designed specifically for the backup and recovery of VMware and Hyper-V environments. This includes optimized, image-based backups of VMware and Hyper-V virtual machines, including virtual machines hosting key applications such as Microsoft Exchange.
At this time, the Backup Exec 2012 SP2 Exchange Mailbox Archiving Option does not support archiving of Exchange email objects from image-level backups of Exchange virtual machines captured by the Agents for VMware and Hyper-V. In order to enable archiving support of virtual machines hosting Microsoft Exchange, the virtual machines must be protected using agent-based backups, which essentially treat the virtual machines as if they were standalone physical servers.
Exchange Mailbox Archiving Option and the Deduplication Option
The Backup Exec 2012 SP2 Exchange Mailbox Archiving Option includes SIS deduplication technology which is enabled by default. No additional licenses are required to enable the SIS deduplication technology within the Exchange Mailbox Archiving Option beyond the license for the Exchange Mailbox Archiving Option itself. However, this deduplication technology is limited to optimizing storage usage within the vault store, and does not extend to basic backup data storage.
The Backup Exec 2012 SP2 Exchange Mailbox Archiving Option can be used in conjunction with the Backup Exec 2012 SP2 Deduplication Option to realize additional data storage savings.
When the Backup Exec 2012 SP2 Deduplication Option is added to the environment, deduplication technology is also employed against the deduplication disk storage device where backup sets are stored. The Backup Exec 2012 SP2 Deduplication Option enables several block-level deduplication capabilities that can greatly benefit administrators looking to control storage growth. There are three different methods of deduplication that are available with the Deduplication Option:
- Client-side deduplication
- Backup Exec server-side deduplication
- Appliance deduplication
Underlying Technical Principles
Exchange Mailbox Archiving Option Basic Architecture
The archiving technology embedded within Backup Exec 2012 SP2 is based upon Enterprise Vault and uses the same core archiving storage components found in Enterprise Vault. So it’s pretty fine technology. The storage components in EV & BE include:
- Vault store
- Vault store partitions
- Fingerprint database (Single Instance Storage (SIS) Deduplication)
The Backup Exec 2012 SP2 server manages the vault store and vault store partitions as a storage device and writes data into archives from backup data sources according to the backup jobs configured by the administrator. So the Administrator defines the archiving policies for Exchange – 90 day/200 day whatever …
Figure 2: Archiving Storage Components Diagram
The vault store is the parent container for archived data. The vault store is managed as a storage device by the Backup Exec 2012 SP2 server. The vault store is separated into partitions which contain the actual archive data. Only one partition can be open at any given time to receive new archive data. However, archives can span more than one partition.
Vault Store Partition
A vault store partition is a path to storage (e.g. ‘E:\Archive’). A vault store partition can be in one of two states: open or closed. The partition with the open state is the partition to which new archive data is written and stored. As mentioned previously, only one partition can be open at a time, so you can only write to one partition at any one time even though archives can span across partitions.
Vault store partitions with the closed state do not receive new archive data; however, closed partitions can still be read for data recovery purposes and can have data elements deleted according to archive expiration policies configured by the Backup Exec 2012 SP2 Administrator.
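The open/closed behaviour is easier to picture with a toy model. The classes below are purely illustrative and are not Backup Exec or Enterprise Vault code; in particular, the sketch assumes that adding a new partition closes the previously open one, which is a simplification for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Partition:
    path: str
    is_open: bool = False
    items: list = field(default_factory=list)

@dataclass
class VaultStore:
    partitions: list = field(default_factory=list)

    def add_partition(self, path):
        """Add a partition and make it the only open one (assumption of this sketch)."""
        for p in self.partitions:
            p.is_open = False
        new = Partition(path, is_open=True)
        self.partitions.append(new)
        return new

    def open_partition(self):
        """Return the single partition that receives new archive data."""
        return next(p for p in self.partitions if p.is_open)

    def archive(self, item):
        """New archive data always goes to the open partition."""
        self.open_partition().items.append(item)

    def read_all(self):
        """Closed partitions remain readable for recovery."""
        return [i for p in self.partitions for i in p.items]

store = VaultStore()
store.add_partition(r"E:\Archive")
store.archive("msg-1")
store.add_partition(r"F:\Archive")   # E:\Archive is now closed
store.archive("msg-2")               # lands in F:\Archive
```

After these steps the archived data spans both partitions: `msg-1` sits in the closed partition and `msg-2` in the open one, and both can still be read back.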
An archive is a collection of archived data. For the purposes of the Backup Exec 2012 SP2 Exchange Mailbox Archiving Option, one archive corresponds to one backed up user mailbox.
Archives can have new data added to them and can have old data deleted from them according to the settings configured by the Backup Exec 2012 SP2 administrator. An archive can span more than one vault store partition, since the partition that was open at the time the archive was created may not be the same partition that is open and receives new data for the archive at a later time.
Although the data that’s been archived is deleted at source – thereby creating storage space on the Exchange Server – the end user can still locate the archive data in the same way that they could using Enterprise Vault – dead cool frankly!
Unique Value of Backup Exec 2012 SP2 and Integrated Archiving
For small and medium size environments, Backup Exec 2012 SP2 offers a unique approach to archiving through the unification of backup and archiving processes into a single offering. By linking backup and archiving technologies into a single solution, administrators can both protect critical servers and applications for disaster recovery and also archive email and file system data to secondary or tertiary storage. Backup Exec 2012 SP2 enables administrators to realize both backup and archiving benefits while only ‘touching’ critical servers and applications once.
Integration with Enterprise Vault Technology
Backup Exec 2012 SP2 includes integrated archiving solutions for both file system data and Exchange email data. The archiving technology within Backup Exec 2012 SP2 is based on the proven, market-leading Enterprise Vault family of products. By leveraging this proven technology, Backup Exec 2012 SP2 is able to offer the following advantages to administrators:
- A single, integrated solution for both backup and archiving
- Lower total cost of ownership from using a single product to solve two key IT problems
(1) Protection of critical servers and applications for disaster recovery
(2) Controlled archiving of file system and Exchange email data to lower cost storage
- Lower impact on production servers as backup and archiving are achieved from a single ‘touch’
- Compatibility and interoperability assurance from true technology integration
- Reliability from utilizing market-proven archiving technology
Archiving Process Diagram
As organizations grow and expand, upgrade paths are available that enable organizations to transition from the integrated version of Enterprise Vault in Backup Exec 2012 SP2 to the full Enterprise Vault solution.
Exchange Mailbox Archiving Option
Backup Exec 2012 SP2 licenses its integrated archiving technology through two product options: the Exchange Mailbox Archiving Option and the File System Archiving Option. This blog is designed to assist partners and customers as they design and implement Backup Exec 2012 SP2 and the Exchange Mailbox Archiving Option.
Archive … Don’t always pump for Dedupe
Not many people know about the integrated archiving option in Backup Exec – most people go for the Deduplication Option to reduce back end storage and mistakenly believe that deduplication will speed up the backups – because you’re storing less data … but deduplication still needs to process the data so it can actually take longer (unless you are deduplicating at the source/remote site).
If all you want to do is to speed up your backups and you are finding that the speed of the backup is primarily governed by the amount of data you are now having to shift over the network, then archiving is the puppy you’re after.
In this new series I am going to cover:
Integration of Enterprise Vault into Backup Exec 2012 SP2
- Underlying Principles and Technology
- Licensing Considerations
- Performance factors
- Best practices
If you want more detailed instructions on how to install and manage Backup Exec 2012 SP2 and the Exchange Mailbox Archiving Option, please refer to the Administrator’s Guide and the Backup Exec 2012 SP2 addendum, as well as the copious Tech Notes on the subject on www.backupexec.com.
Data and Storage Management Challenges
The evolution of applications and technologies designed to enable the creation, sharing, and management of data is continuing to drive data volumes higher and higher. Today, it’s easier than ever for end users to create and share data. As a result of these technologies and the dramatic data growth they facilitate, administrators are struggling to ensure their company’s critical data and application assets remain functional and protected, and are looking for ways to better manage the storage resources they employ to match the different data assets they are responsible for.
Data storage solutions come in a variety of flavors. Depending on the type, size, and priority of the associated server or application, selected storage devices may have a higher or lower capacity or have a higher or lower performance level. Some applications and servers demand high performance storage systems, while other applications and servers can be satisfied with cheaper, lower performance storage. As the performance of the storage device increases, so does its price.
Also, not all data assets are created equal. Some types of data may be accessed frequently, while other types of data may be touched only once and quickly become old or stagnant. Also, not all data assets are of the same size. PowerPoint presentations, video files, and similar types of data can be quite large. Other data types, such as Word documents and text files, are usually quite small.
Storage Efficiencies through Archiving
Archiving is an important technology through which administrators can control storage management costs. Archiving allows administrators to control what types of data reside in what types of storage. This includes controlling what data remains resident on primary storage resources — commonly expensive, high-performance storage solutions — and what data is moved to secondary or tertiary storage resources — commonly slower, long term storage devices — which might have a much higher storage capacity. Data attributes such as size and age can be leveraged by archiving solutions to help administrators control the archiving process.
By using archiving technology, administrators can better manage investments into expensive, high-performance primary storage resources by ensuring that non-critical data assets are moved to cheaper storage solutions, increasing the available capacity of primary storage in production environments and helping administrators get the most out of their current storage investments.
Archiving solutions can also improve application performance. By archiving application objects to secondary or tertiary storage resources and removing them from the original application server, application databases shrink and performance improves.
Very loosely, we were instructed to delete everything pre dot com bubble bursting (2000), keep everything post and now we are fast running out of data centre disk allocation space, err?
In fact it’s a wonder we manage to do anything given the amount of information we need to process. As a consequence we are now facing a greater threat – too much information. Somewhere between 60 and 160 billion emails are sent around the world every single day. These emails include attachments such as reports, presentations, letters and pictures. In spite of limitations such as privacy concerns and too much unwanted mail, email is the best way to communicate efficiently, quickly and cheaply. The danger with email, as with any other way of sharing information, is that too much information simply clogs the system up and becomes a bottleneck to productivity.
Here are some useful top tips that may help:
- Understand the new business user – organisations must better understand the challenges employees are facing when navigating the world of information management. Look at when and how employees are accessing their information, make sure that data is indexed and categorised, and that intelligent archiving and search tools are available
- Prepare the infrastructure – with the relentless flow of information only set to continue, IT infrastructure must be able to cost effectively manage the increasing requirements for storage by implementing solutions able to dedupe and archive appropriately, automate processes and monitor and report on system status across all different devices and environments
- Prepare people – create IT policies that educate employees on how to manage their information – from email practices like limiting the ‘CC’ and ‘reply to all culture’, to saving only the latest document version and overcoming the fear of the delete button. Help employees understand the company’s information retention strategy so they know what information is recoverable. This will empower them to take charge of information control and maintain productivity and efficiency
- Keep security front of mind – it seems like an obvious statement, but reinforcing company security policies around mobile devices could protect against significant and damaging data loss. Make sure employees know the company processes and take advantage of technologies that enable the IT department to see where the most important information is, at all times
- Encourage staff to switch off – with the information era in full swing and with more and more opportunity for employees to stay connected at all times, it’s important that organisations support staff welfare and encourage them to switch off every once in a while
Seriously consider optimising your storage to reduce overall front end storage usage. Improving capacity can be done through integrated archiving and deduplication as well as tiering your storage. Archiving moves old data to a separate store so you don’t have to back up the same data day-in, day-out – forever. Deduplication only backs up data (at a block level) once, using a pointer to the unique data. So you can both reduce the amount you back up and dramatically reduce your backup window with archiving and data deduplication.
But, I hear you say, if I implement deduplication technology what are the benefits? Well, Backup Exec can help with that too. Read all about the Backup Exec Deduplication Assessment Tool in Part III.
Yes, it’s true – we are becoming a nation of information addicts – at least according to a survey Symantec recently carried out. Symantec wanted to find out more about how the so-called information explosion is affecting the everyday lives of British office workers. What was abundantly clear is that we are all suffering from this 21st century ailment – Information Overload – sounds like a Tom Cruise film, or AC/DC album – and it is overtaking not only our working lives, but our personal ones too.
Accessing work information out of hours, compulsively checking emails, texts and social media and hoarding endless emails and multiple versions of the same file are all symptoms of information overload experienced by those we surveyed. See the stats here.
But whereas the technology enabling us to do this (fantastic mobile devices and faster connectivity) all purport to make us more productive in the workplace, is our mismanagement of information actually counter-productive?
IDC has recently estimated that in 2011 over 1.8 zettabytes of information was created and replicated (IDC, “The 2011 Digital Universe Study: Extracting Value from Chaos”) and if we go by Moore’s Law this will continue to grow almost immeasurably over the coming years. What does this mean for our state of mind and the systems we work with – will we reach a moment when we are essentially ‘drowning’ in information?
Not if the technologies that store and manage information also continue to improve. We are working very hard to make managing information easier, faster and more efficient for businesses of all sizes. This means making sure that what is actually useful and valuable is stored, archived and backed up correctly, while the rest is relegated to permanent deletion.
But technology can only go so far; some of the onus is still on businesses and individuals to moderate their work behaviour to take into account this new work paradigm.
Part II – What can we do about it?
I need a new service, so I need an application, and a new server, and perhaps some storage … and if we’re lucky we ask ourselves “oh, yes, what about the backup?” Have you noticed how we never really turn IT off, we just add to it? So we end up with a backup strategy that encompassed everything 3 or 4 years ago, but one that falls pretty short today; that’s how it really works.
Even though we know that we really should back up all our data – just in case – are we absolutely convinced we actually are? Backup is our critical data protection solution and yet we rarely review our backup strategy.
Server virtualisation, the need for fast, reliable application recovery, the exponential growth of unstructured data and poor data lifecycle management are some of the root causes of operational inefficiencies in IT, and they are why we are changing the way we approach our backup strategies.
With more and more companies adopting virtualisation technologies to improve efficiencies and reduce CAPEX costs, organisations are looking for ways of protecting both virtual and physical environments with a single backup tool. It makes sense to use a solution that gives you granular recovery from a single pass backup, saving time, money and any amount of effort – don’t use separate tools and end up backing up the backup; it turns the recovery process into a nightmare!
The backup and recovery of Microsoft applications is an inherently challenging process that becomes more difficult as the databases grow and the demands on their online availability increase, further limiting the time available for backup and recovery operations. Granular Recovery of Exchange, SQL and Active Directory from a single pass backup makes it easy and efficient to identify and recover only those objects needed.
Optimising storage to reduce overall storage capacity can be done through integrated archiving and deduplication. Archiving moves old data to a separate store so you don’t have to back up the same data day-in, day-out – forever. Deduplication only backs up data (at a block level) once, using a pointer to the unique data. You can reduce the backup window dramatically with both archiving and data deduplication.
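To make the "store each block once and keep a pointer" idea concrete, here is a toy sketch of block-level deduplication. It is not how Backup Exec implements it: the fixed 4-byte block size, the use of SHA-256 as the fingerprint, and the function names are all assumptions chosen to keep the example small.

```python
import hashlib

def dedupe(data: bytes, block_size: int = 4):
    """Split data into fixed-size blocks and keep each unique block once."""
    store = {}      # fingerprint -> block contents (unique blocks only)
    pointers = []   # one fingerprint per original block, in order
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        store.setdefault(fingerprint, block)  # stored only the first time it is seen
        pointers.append(fingerprint)
    return store, pointers

def restore(store, pointers):
    """Rebuild the original data by following the pointers."""
    return b"".join(store[fp] for fp in pointers)

data = b"ABCDABCDABCDWXYZ"
store, pointers = dedupe(data)
```

The sample data splits into four blocks, but only two unique blocks are stored; the other two are represented by pointers, and `restore(store, pointers)` returns the original bytes.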
Backup Exec 2010
Backup Exec Agents and Options enhance and extend platform and feature support for your backup environments for Microsoft applications, virtual environments (VMware and Microsoft Server 2008 R2 Hyper-V) as well as storage reduction or optimisation technologies.