Do we actually need to back up our Exchange data anymore?


A number of administrators are currently reconsidering their approach to backing up their Exchange Servers. In large implementations the costs associated with traditional backups are considerable, and the emergence of disk-based backups using products such as DPM has stimulated more debate.  A lot of companies are now implementing or planning to implement Exchange 2007, and this introduces a number of points which contribute to this discussion.

I am certainly not advocating that any administrator abandon their backup solution without considerable thought and planning. But if you are reconsidering your approach to backups, the best way to start is to understand the kinds of situations that will require some form of recovery, and which of these mean resorting to backup.  What follows is a quick list of scenarios. It is by no means exhaustive and might be different for every implementation:

  • Disk Failure: assuming it is not mitigated by software or hardware RAID, i.e. Volume\LUN failure
  • Server Failure
  • Lost email (recent): including hard delete but deleted in error - 10 days ago for example
  • Lost email (historical): including hard delete but deleted in error - 6 months ago for example
  • Mailbox deletion (recent): by administrator - 10 days ago for example
  • Mailbox deletion (historical): by administrator - 6 months ago for example
  • Physical Database Corruption 
  • Logical Database Corruption
  • Data Centre Failure: data centre failure as a result of flooding, for example
  • Email restore request (recent): Internal investigation requires that an individual email or series of emails be located - from 10 days ago for example
  • Email restore request (historical): Internal investigation requires that an individual email or series of emails be located - from 6 months ago for example
  • Email thread request (recent): Internal investigation requires that an email thread be retrieved to include the dates, times, sender & recipients - from 10 days ago for example
  • Email thread request (historical): Internal investigation requires that an email thread be retrieved to include the dates, times, sender & recipients - from 6 months ago for example
  • Long term data retention: Legal requirement exists to retain data for 8 years for example

From this list we need to understand whether our business actually needs to be able to recover service and\or data in the event of every one of these scenarios.  We also need to understand the risks to service and\or data associated with each of these scenarios.  The approach I would advocate is to note down how your proposed design mitigates each scenario, and then to use the risk to understand whether the design needs to be altered to cater for the scenario in question.

For example:

| Scenario | Mitigation | Backup Required? |
|---|---|---|
| Disk Failure | E2K7 Continuous Replication - Local or Cluster (LCR or CCR) | No |
| Server Failure | Clustered solution - Single Copy Cluster (SCC) or Cluster Continuous Replication (CCR) | No |
| Lost email (recent) | Deleted Item Retention | No |
| Lost email (historical) | Backup - unless your backup solution involves taking brick-level backups, the recovery path would include restoring an entire mailbox database and then extracting a mailbox. The user would have to be very specific about when the mail was deleted, otherwise this will be very time consuming. Deleted Item Retention - could be used if set to a long enough period. | Yes |
| Mailbox deletion (recent) | Deleted Mailbox Retention | No |
| Mailbox deletion (historical) | Backup - unless your backup solution involves taking brick-level backups, the recovery path would include restoring an entire mailbox database and then extracting a mailbox. The administrator would have to be very specific about when the mailbox was deleted, otherwise this will be very time consuming. | Yes |
| Physical Database Corruption | Exchange 2007 Continuous Replication (LCR or CCR) - physical corruption causing damage to transaction logs will prevent them from being played into the replica database. Physical corruption causing damage to the active database is mitigated by the database replica. | No |
| Logical Database Corruption | Exchange 2007 Continuous Replication - SCR (Standby Continuous Replication) - by introducing a lag time into our SCR configuration, together with appropriate monitoring, logical corruption would be prevented from impacting the SCR target database. Backup provides an additional option here, especially for companies where SCR is not being implemented. | No |
| Data Centre Failure | SCR | No |
| Email restore request (recent) | Might be a combination of an administrator-led search of active mailboxes, including deleted item retention, plus a restore from backup | Yes |
| Email restore request (historical) | Backup - unless your backup solution involves taking brick-level backups, the recovery path would include restoring an entire mailbox database and then extracting a mailbox or mailboxes. The administrator would have to be very specific about when the email was sent, received or deleted. This will likely be very time consuming and involve a lot of administrative resource. | Yes |
| Email thread request (recent) | Message Tracking Logs | No |
| Email thread request (historical) | Message Tracking Logs (if within the threshold defined by the administrator). Backup - from a backup of the message tracking logs or a backup of Exchange data. | Yes |
| Long term data retention | Backup - traditionally data is backed up to tape and kept offsite to satisfy this requirement | Yes |

Using my set of scenarios I believe that in most companies the reasons that you might deploy a backup solution are as follows:

  1. Lost email (historical): including hard delete but deleted in error - 6 months ago for example
  2. Mailbox deletion (historical): by administrator - 6 months ago
  3. Email restore request (recent): Internal investigation requires that an individual email or series of emails be located - from 10 days ago
  4. Email restore request (historical): Internal investigation requires that an individual email or series of emails be located - from 6 months ago
  5. Email thread request (historical): Internal investigation requires that an email thread be retrieved to include the dates, times, sender & recipients - from 6 months ago
  6. Long term data retention: Legal requirement exists to retain data for 8 years for example

(As an aside I think many administrators might be surprised at the above list.  Traditionally we back up to protect against data corruption, and this list does not include that as a scenario.)
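The exercise above can also be kept as a small script so that the list is re-derived whenever the design changes. What follows is only a minimal sketch in Python: the `Scenario` class and the `needs_backup` helper are hypothetical names introduced for illustration, and the mitigation strings are abbreviated from the example table in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    """One row of the scenario matrix (hypothetical structure, for illustration)."""
    name: str
    mitigation: str
    backup_required: bool


# Rows abbreviated from the example table in this post.
SCENARIOS = [
    Scenario("Disk Failure", "LCR or CCR", False),
    Scenario("Server Failure", "SCC or CCR", False),
    Scenario("Lost email (recent)", "Deleted Item Retention", False),
    Scenario("Lost email (historical)", "Backup, or long Deleted Item Retention", True),
    Scenario("Mailbox deletion (recent)", "Deleted Mailbox Retention", False),
    Scenario("Mailbox deletion (historical)", "Backup", True),
    Scenario("Physical Database Corruption", "LCR or CCR", False),
    Scenario("Logical Database Corruption", "SCR with lag time", False),
    Scenario("Data Centre Failure", "SCR", False),
    Scenario("Email restore request (recent)", "Mailbox search plus restore from backup", True),
    Scenario("Email restore request (historical)", "Backup", True),
    Scenario("Email thread request (recent)", "Message Tracking Logs", False),
    Scenario("Email thread request (historical)", "Message Tracking Logs or backup", True),
    Scenario("Long term data retention", "Backup", True),
]


def needs_backup(scenarios):
    """Return the names of the scenarios marked as requiring a backup."""
    return [s.name for s in scenarios if s.backup_required]


if __name__ == "__main__":
    for name in needs_backup(SCENARIOS):
        print(name)
```

Changing a row's `backup_required` flag, for example after extending a retention period, immediately shows whether any reason to deploy a backup solution remains.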

Many companies that must satisfy the requirements above will still need to deploy a backup solution.  However I believe that there are many implementations where the requirements above either do not need to be met or can be met in other ways.

  1. Lost email (historical):
    Why not increase deleted item retention to a much longer period? As long as this is considered in the process of designing your Exchange Servers for capacity and performance, this may provide sufficient mitigation.  If an SLA is defined that email retrieval beyond this period will not be possible then you may not need to provide further mitigation.  If your company has already deployed an archive solution this might also be sufficient mitigation.
  2. Mailbox deletion (historical):
    Increasing deleted mailbox retention to a much longer period would to some extent mitigate this.  Again, an appropriate SLA might protect you from having to meet this requirement beyond 3 months, for example. In addition an archive solution, if already deployed, can provide the ability to recover an entire mailbox so might also provide a solution.  (I am in no way recommending, or advising that you avoid, a 3rd party message archival solution but in some companies one is already deployed and if so it makes sense to make use of it.)
  3. Email restore request (recent):
    This might be satisfied by an administrator led search of mailboxes and deleted item retention.  Legal requirements might mean that you need to resort to your backup and restore to a recovery server. It may be that the only way to adequately satisfy this requirement is to envelope journal all email.  This has its own implications for any design which will need to be catered for. (Journaling everything is potentially expensive in terms of performance and storage and is an administrative overhead but it may be the only solution to meeting this requirement.)
  4. Email restore request (historical):
    This is normally satisfied by restoring from a traditional Exchange backup.  This is potentially very time consuming and might involve multiple restores of multiple databases to more than one recovery server. This needs to be understood before it is articulated to the business that this type of scenario is satisfied by a traditional backup. It may be that the only way to adequately satisfy this requirement is to envelope journal all email.  This has its own implications for any design which will need to be catered for. (Journaling everything is potentially expensive in terms of performance and storage and is an administrative overhead but it may be the only solution to meeting this requirement.)
  5. Email thread request (historical):
    If it is just the times and dates of emails and sender\recipient information then this might be satisfied using Message Tracking Logs.  Either extend the length of time for which tracking logs are kept and\or back the historical logs up to disk using NTBackup for example.  This may be sufficient for this requirement.  Alternatively you may need to consider envelope journaling.  Traditional Exchange backups might do the job but this might be considered overkill in this situation. (Journaling everything is potentially expensive in terms of performance and storage and is an administrative overhead but it may be the only solution to meeting this requirement.)
  6. Long term data retention:
    Make sure you understand exactly what the requirements are in this situation.  Does this requirement actually exist? If so, in what form must the data be retained, for how long, how quickly does it need to be restored, and where does it need to be restored?  A traditional Exchange backup may or may not be able to meet these requirements.
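For item 1 above, the capacity side of a longer deleted item retention period can be roughed out before committing to it. The sketch below is a deliberately simple steady-state estimate and not official sizing guidance: it assumes every mailbox deletes a constant volume per day and that deleted items are purged exactly when the retention period expires. The function name and all input figures are hypothetical.

```python
def retention_overhead_gb(mailboxes, deleted_mb_per_mailbox_per_day, retention_days):
    """Estimate the extra database space (in GB) held by retained deleted items.

    Assumption: steady state, so the retained volume is roughly the daily
    deletion volume multiplied by the retention period.
    """
    total_mb = mailboxes * deleted_mb_per_mailbox_per_day * retention_days
    return total_mb / 1024


if __name__ == "__main__":
    # Hypothetical figures: 2000 mailboxes, 5 MB deleted per mailbox per day.
    for days in (14, 90, 180):
        print(f"{days} days: {retention_overhead_gb(2000, 5, days):.1f} GB")
```

Under these assumptions the overhead grows linearly with the retention period, so moving from 14 to 180 days multiplies the space held by deleted items by roughly thirteen. That is the figure to feed back into the capacity and performance design.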

Hopefully, having followed this exercise, you will at least have a much better understanding of why you run backups of your Exchange data.  It might be that you can eliminate backups altogether or, as in most cases, that you can dramatically reduce the number of backups you take and the costs associated with them.

Comments (11)
  1. What about PF? They’re still present in Exchange organizations.

  2. Jaen Snyman says:

    You will be running out of log file space very soon if you are not doing any backups. You need some way to clean out the logs, unless you are turning circular logging on, which has its own challenges in an environment.

  3. Yes agreed - public folders need to be considered.  If you only have a single public folder store then CCR\LCR might be enough to protect this data, or if you have multiple databases then public folder replication may be enough.  Or dump out the contents to disk on a regular basis via a script?  These are some options but more thought is required, I think.

  4. Yes you do need to consider what to do with the transaction logs..  Truncating them with NTBackup on a weekly basis perhaps?  Important point for capacity planning…

  5. A blog on this same subject has just been posted to You Had Me At EHLO… and there are a few points

  6. I recently posted an article to the Exchange Team site here and received a number of comments. I thought

  7. Carol Chisholm says:

    I look after several small businesses, never use Small Business Server if I can help it because it is such a pain in acquisitions and mergers.

    I need offsite backup on tape, often have to get old mailboxes and old mailstores back, indeed did an Exchange 5.5 restore last month.

    When confidential data is concerned, no-one trusts their ISP or hosting company!

    We need a simple backup and clear logs tool (most small sites have very small logs)

  8. There will always be companies and implementations where data needs to be backed up in the traditional sense.  As E2K7 is adopted (not mentioning future versions) there are more choices about what actually constitutes a backup and how to approach data protection.  For long term retention of data though the realistic choices are only really journaling and backup.

  9. Ok so here’s my thoughts on the whole SAN ( Storage Area Network ) versus DAS ( Direct Attached Storage

  10. Asolomio says:

    It’s not hard to imagine mass events (organization-wide) that could leave people wishing that they (a) had a discrete backup of their Exchange data outside the context of Exchange, and (b) kept their resume more up to date.  A lot of these events are missing from your analysis.  

    For example, a hacker, or more likely, a disgruntled administrator with administrative access to one Exchange server is going to have the same access to all the other Exchange servers in the environment, and could wreak absolute havoc deleting data from both the production and replica servers.  

    Similarly, a widespread virus will also affect both production and replicas (log replay delay on an SCR target can help, but it’s pretty expensive).  Whether you fix that with virus software or backups, it’s wise to have both options.

  11. Yes agreed - there will be cases where we might need to revert to a copy of our data that is kept outside of Exchange..  but if this is the only requirement for our backup then this radically changes most people's approach of backing up every night and keeping backups for the long term..  This might only require a few days' worth of backup on disk - which might mean reducing the requirement for a massive tape infrastructure?

    When it comes to virus infections I would want to tackle this with virus software to begin with, but again I think that many companies might deploy a backup solution that protects for the short term to disk only...  Plus with a combination of, say, Edge and Forefront, where you are using multiple virus\spam engines, this risk is very small.

Comments are closed.
