Lee Wilbur
IT Solutions
A service of Multiverse Enterprises Inc.

A Backup and Disaster Recovery Primer

If you discover any links that no longer work or have questions and/or comments about this page, my site, or me, please contact me.

CONTENTS

  1. Introduction
  2. Backup Types
    • Major Types
    • Volume Shadow Copy
    • RAID
  3. Software
    • Agents
    • Compression
  4. Analyzing Your Needs
  5. Media and Hardware
  6. Testing
  7. Restores
  8. Links
    • Imaging Software
    • Backup Software Products
    • Backup Hardware Products
    • Tape Formats
    • Internet Backup Services
    • Other Backup Links
  9. Credits
  10. Disclaimer

INTRODUCTION

Over the years I have managed to gain a significant amount of experience backing up and archiving data in business environments. In addition, I have benefited from debates and the experience of others whom I have worked with as well as colleagues online.

Strictly speaking, a backup is simply a copy of a particular piece or segment of data that is held in reserve in the event the primary data becomes unavailable. Simply making a copy of a file creates a backup. This kind of backup is rarely sufficient as problems could arise with the media the file is stored on (for example, a failed hard disk) or something might happen to the physical location of the data (for example a fire destroys the server).

While there is no universally appropriate backup solution, understanding all your needs as well as the benefits and drawbacks of the various technologies currently available can help you make the best decision possible to keep your data and business safe in the event of a system failure... or worse.

Most people start out with the preconceived notion that backups involve tapes and will order their server with a tape drive. As I'll explain later, though tape still has its uses, for many environments it is no longer an appropriate or cost-effective medium. The following attempts to explain key concepts as well as what I consider important when evaluating a client's needs and implementing backup solutions. This document is NOT intended to illustrate exactly how backups are done with given software or hardware. It is intended to provide an overview of the various backup technologies, services, and options, to help you decide what is right for your needs. You will then need to obtain the appropriate items and reference their documentation, technical support, and other implementation resources.

Before we begin, it is important to note that my area of expertise is Windows and this article is heavily Windows-centric. However, I will include comments for other platforms, such as Linux and Mac, when I have something relevant to say about them. In most cases, the information provided can be applied to all platforms with little need for modification.

Lastly, while I provide links to some topics and articles I've found online, an excellent site for additional information on any given topic here is Wikipedia, an online encyclopedia with entries in dozens of languages. If there's something you are uncertain about while reading, I encourage you to look it up on Wikipedia.

BACKUP TYPES

There are three MAJOR types of backups:

  1. Full Backups - They back up every file on the system (in theory). Disk images, such as those created by Symantec's Norton Ghost or Acronis True Image, can be considered full backups. You might also do a full backup of only a drive or a folder.
  2. Differential Backups - They back up everything that has changed since the last full backup. Expect the differential to grow larger and larger each night it runs without an intervening full.
  3. Incremental Backups - They back up everything that has changed since the LAST full OR incremental backup. These backups are often fairly small and consistent in size (assuming your work habits don't change much).

Depending on how much data you have, full backups can take a LONG time to complete and aren't typically recommended or practical to do on a daily basis. Usually, fulls are done on a weekly, every-other-week, or monthly basis (NOTE: depending on your business/purpose, they may occur more or less frequently). Most backup programs, when performing full backups, mark every file backed up as having been backed up, so that on subsequent differential or incremental backups, only changed files are backed up.

Differential backups, as noted, grow in size and in turn, time required to complete. Eventually, they can get quite large and time consuming. This is why full backups are often scheduled regularly, to prevent differentials from growing too large. A Monday-Friday Differential followed by a weekend full is a common practice. If the worst happened and you needed to restore data on Friday morning, you would need the last full backup and the last differential backup to restore all your data - effectively two restores. You also have the benefit of potentially having multiple copies of the same file on multiple backup tapes. A file changed on Tuesday will appear on Tuesday's backup as well as Wednesday, Thursday, and Friday, until the next full is done. In the event the backup is corrupt or the media fails (tape breaks, disk fails, etc), you have other backups you may be able to retrieve this file from.

Incrementals are often the fastest backups because they only back up the data changed since the last backup. While they use less backup space every night, an incremental backup scheme requires that EVERY backup job since and including the last full backup be restored. So if on Friday morning you needed to restore your systems, you would need to restore the full from the weekend, then Monday's incremental, Tuesday's incremental, Wednesday's incremental, and Thursday's incremental. Not terribly efficient for restores. Additionally, if a file changed on Tuesday but not since then, and you need to restore that file, you need to know the exact date the file changed to determine which backup to restore from. In my experience, incremental backups are not done very often, in part for this reason. In Linux and Unix systems, it is not uncommon to use multi-level backups, which utilize only full and incremental backup strategies. For more information on such backups, review http://www.faqs.org/docs/linux_admin/x2615.html.
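To make the difference in restore effort concrete, here is a minimal Python sketch that works out which backup jobs a restore would need under each scheme. The BackupJob class and restore_chain function are hypothetical names invented for this illustration; they are not part of any backup product, and the sketch deliberately refuses to model schedules that mix differentials and incrementals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupJob:
    day: str
    kind: str  # "full", "differential", or "incremental"


def restore_chain(jobs):
    """Return the jobs needed for a complete restore, oldest first.

    `jobs` must be ordered oldest to newest and contain at least one full
    (otherwise max() raises ValueError).
    """
    last_full = max(i for i, job in enumerate(jobs) if job.kind == "full")
    later = jobs[last_full + 1:]
    differentials = [job for job in later if job.kind == "differential"]
    incrementals = [job for job in later if job.kind == "incremental"]
    if differentials and incrementals:
        raise ValueError("mixed differential/incremental schedules are not modeled")
    if differentials:
        # A differential holds everything changed since the full,
        # so only the most recent one is needed.
        return [jobs[last_full], differentials[-1]]
    # Each incremental holds one slice of changes, so all are needed.
    return [jobs[last_full], *incrementals]


weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday"]
differential_week = [BackupJob("Weekend", "full")] + [
    BackupJob(day, "differential") for day in weekdays
]
incremental_week = [BackupJob("Weekend", "full")] + [
    BackupJob(day, "incremental") for day in weekdays
]

print([job.day for job in restore_chain(differential_week)])
print([job.day for job in restore_chain(incremental_week)])
```

For a Friday morning failure, the differential week needs two restores while the incremental week needs five, matching the examples above.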

A fourth backup type seen in some utilities, including the backup program included with Windows 2000/2003/XP is Copy. Copy is like a full backup, only the files aren't marked as having been recently backed up. Subsequent differential or incremental backups will still see files changed since the last full as needing to be backed up. To keep your backups simple(r), try to use fulls whenever a copy seems appropriate, unless there is a good specific reason to use Copy.
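The way the four backup types interact through the "has been backed up" marker can be sketched with a small simulation. This is only a model under stated assumptions: each file carries a single changed flag (similar in spirit to an archive attribute), full and incremental jobs clear it, and differential and copy jobs leave it alone. The run_backup function is a hypothetical helper, not the behavior of any specific product.

```python
def run_backup(files, kind):
    """Simulate one backup job.

    `files` maps a file name to True when the file is flagged as changed
    since it was last marked as backed up. Returns the sorted names this
    job would copy and updates the flags in place.
    """
    if kind in ("full", "copy"):
        selected = sorted(files)
    elif kind in ("differential", "incremental"):
        selected = sorted(name for name, changed in files.items() if changed)
    else:
        raise ValueError(f"unknown backup type: {kind}")

    # Assumption: only full and incremental jobs mark files as backed up.
    if kind in ("full", "incremental"):
        for name in selected:
            files[name] = False
    return selected


files = {"budget.xls": True, "letter.doc": True}
print(run_backup(files, "full"))          # everything, flags cleared
files["budget.xls"] = True                # the file is edited on Monday
print(run_backup(files, "differential"))  # changed file, flag left set
print(run_backup(files, "copy"))          # everything, flags untouched
print(run_backup(files, "incremental"))   # changed file, flag cleared
print(run_backup(files, "incremental"))   # nothing left to back up
```

Because the copy job leaves the flags alone, the incremental that follows it still picks up the Monday change.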

Volume Shadow Copy
One of the new features of Windows Server 2003 is Volume Shadow Copy. This service, when enabled, takes periodic "snapshots" of the drive contents and stores only the changed parts of a file in a special area allowing you to recover previous versions.

While a very useful feature (one I often cite as an important reason to upgrade to Windows Server 2003), it should NOT be your sole backup. Such abilities cannot provide off-site backups and do not protect against hardware failures.

Other products can provide similar features. High-end Network Attached Storage (NAS) devices have offered similar features for several years.

For more information on the Volume Shadow Copy service and its implementation, review the links section of this document.

RAID
Redundant Array of Inexpensive Disks (RAID) is a technology that is intended to help protect against the failure of one or more hard disks by combining two or more hard disks and making them appear as one. With RARE exception, no server should be set up without using some form of RAID. As useful as RAID is, it should NEVER be used as a form of backup. RAID IS NOT A SUBSTITUTE FOR BACKUPS. RAID cannot protect against file corruption, accidental deletion, or disasters that might destroy the site or server. This article will not cover RAID technologies in depth, but will highlight the three most commonly used forms of RAID, also known as Levels.

For more information on RAID, review the links section of this document.

SOFTWARE

What software to use? As noted before, I'm not very familiar with Linux, Unix, or Mac backup solutions (though in the Mac world, Retrospect is fairly popular). Backup, also referred to as NTBackup, included with Windows XP Professional, Windows 2000, and Windows 2003, is a suitable if not terribly feature-rich program. It lacks some abilities you might otherwise like to have, but it will back up everything you NEED backed up on a Windows system for no additional cost. If you have a large server environment, you'll most likely want to purchase third party software such as Symantec Backup Exec (formerly Veritas) or Brightstor ArcServe and their various agents for enterprise products like Microsoft Exchange, SQL database products, and other operating systems.

Deciding which software is best for your organization can be difficult. Fortunately, both of these products offer free trials with most, if not all, of the associated agents. You can also explore countless other products available for your backup needs. A list of several products can be found at the end of this document, some of which have already been mentioned.

Agents
Backup agents provide extra functionality to backup software. Some popular agents include Open File agents, SQL agents, and Exchange agents.

An Open File agent is intended to allow you to backup files that are otherwise in use ("open") while the backup takes place.

An Exchange agent could provide brick level backups where you can backup individual e-mails and items in Exchange. (Brick-level backups should not be used as the primary backup of Exchange as they do not clear Exchange log files, which is necessary for a properly maintained Exchange server).

Database servers (most frequently, SQL servers) have various built-in methods of backup. Some can even backup directly to tape. These built-in methods can allow for their backup without the use of agents. However, agents can help keep the database online during the backups and make restores easier, especially in large enterprises.

Compression
Hardware or software compression? Hardware is almost always faster. Most backup software will DISABLE software compression if hardware compression is available. However, review your backup software's manuals for their recommendations. Some backup software, such as the open source Amanda for Linux, recommends using software compression, as this allows the software to know how much data can fit on a given piece of media.

Do NOT base your capacity and tape requirements on the advertised compressed capacity of a tape. Most people will find the supposed 800 GB LTO3 tape only holds 550-650 GB compressed. Your mileage WILL vary based on the type of data being backed up. I have backed up a wide variety of data over the years and NEVER seen a tape come CLOSE to a 2:1 compression ratio as advertised. The amount of compression attained will be highly dependent upon what you are backing up. If you are backing up MPEG, JPG, MP3 and ZIP files, they are already highly compressed and will not compress much, if at all, when backed up. It is even possible that such highly compressed files will take more space on a backup medium than they do normally. Files such as large text files, databases, and uncompressed graphics will compress much better, and if that is all you are backing up, you might see compression ratios much closer to the advertised 2:1.
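The effect of data type on compression is easy to see for yourself. The following Python sketch uses the standard zlib module as a stand-in for whatever compression your drive or backup software applies, so the exact ratios are only illustrative. Random bytes play the role of already-compressed files such as JPG, MP3, or ZIP, and the sample data, the 1200 GB data set, and the 1.4:1 planning ratio are all assumptions made up for the example.

```python
import math
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by compressed size (2.0 means 2:1)."""
    return len(data) / len(zlib.compress(data))


def tapes_needed(data_gb: float, native_gb: float, ratio: float) -> int:
    """Tapes required when each tape holds native_gb * ratio of your data."""
    return math.ceil(data_gb / (native_gb * ratio))


# Highly repetitive text, similar in spirit to logs or plain-text exports.
text_like = b"Invoice 0001: widgets, quantity 10, unit price 4.25\n" * 20000
# Random bytes behave like data that is already compressed.
already_compressed = os.urandom(1_000_000)

print(f"text-like data:       {compression_ratio(text_like):.1f}:1")
print(f"already-compressed:   {compression_ratio(already_compressed):.3f}:1")

# Planning 1200 GB on 400 GB native tapes at the advertised 2:1
# versus a more cautious 1.4:1.
print(tapes_needed(1200, 400, 2.0), "tapes at the advertised ratio")
print(tapes_needed(1200, 400, 1.4), "tapes at a cautious ratio")
```

The random data should come out at or slightly below 1:1, which is the "takes more space than it did originally" case described above.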

ANALYZING YOUR NEEDS

It's important to understand that this is not just about backup. The flashy words are "business continuity" and "disaster recovery". People often think of backups for situations when a server crashes or someone deletes a file. For larger businesses, September 11 and Hurricane Katrina made people realize that if you are to survive, you need to have contingency plans for every possible eventuality. Retail businesses are far more difficult - if not impossible - to keep running in the event of disaster. But service-based businesses, such as realty, law firms, insurance agencies, and consultancies can continue with minimal downtime if they plan appropriately -- or they could suffer needless financial losses and loss of business by not planning and investing in business continuity solutions.

When evaluating backup solutions you need to consider a variety of factors:

MEDIA AND HARDWARE

While tape is the most referenced backup media, anything that can store another copy of your important data can be considered backup media. USB flash drives, CDs, REV drives, Zip drives, and hard drives all qualify. Technically, even paper is a backup when you print out the contents of a file.

Proprietary drives include Iomega REV drives and other uncommon/non-standard drives and media. Do the math. Take the cost of the device used to access the proprietary media (the drive itself), then add in the cost of enough pieces of media to support your backup needs - at least 3 will be required for any reasonable backup solution. Now compare that to the cost of an external hard disk or an external hard disk adapter, such as a DriveDock from www.wiebetech.com. While it depends on the proprietary technology used, in almost every instance the proprietary drive and disks will cost more per GB stored. And if something ever happens to the proprietary drive itself, you could have issues restoring your backup, especially if the drive fails just as you need to perform a restore. With external hard drives, if the casing fails (which is unlikely), you can open up the casing and plug the disk into ANY computer internally, which gives you relatively quick access to the disk so you can recover your data.

Today, Iomega REV drives are the most popular proprietary drive format. Remember Zip drives? Jaz drives? Bernoulli drives? The REV drive seems to be more popular than a Jaz drive, but not as popular as the Zip drive (Zip drives, at one point, could be purchased built in to your new computer, even laptops).

If you need to store each and every backup (or each and every full backup) long term, then you should consider using tape. Long term, it's cheaper than any other method. And even if there are problems with tape heads and reading the media, OFTEN BUT NOT ALWAYS, some expensive data recovery services can get you access to the data. This should rarely happen though. If, on the other hand, you can overwrite data 6 months old and older, and you are not backing up LARGE amounts of data (terabytes or closing in on terabytes), then I tend to recommend a hard disk solution. These, in my experience, tend to be more reliable, faster, and cheaper in cost per GB.

Even if you don't need to keep your backups forever, you should consider keeping periodic fulls. File corruption can occur and not be noticed for months or even years. You might consider using older media to store quarterly copies of full backups and one copy of a full every year indefinitely. This will provide a lengthy archive of old files and give you an excellent chance of recovering data found corrupt that had not been accessed and confirmed good for months or years. (For a personal example, imagine you write your resume and store it on a computer for 6 years. Then when you need to edit it, you find it's corrupt. You would have had NO REASON to open the file for 6 years, so unless you have a backup from 6 years ago, you can't recover it. Now you have to spend 6 hours rewriting it rather than 20 minutes updating it.)

If your data isn't changing much and you only have a few megs per day, you may want to consider using a third party service to back up your data offsite. They would effectively upload the data to their site at scheduled times, instead of using tape or other media. You could possibly get cheaper service by using a 3rd party web host with large amounts of storage, if you are prepared to do a little extra legwork yourself in maintaining things (removing old files, scripting the upload, etc).

Even with this solution, certain aspects of your systems will still be best served with a more local backup solution. Windows systems should have their System States backed up regularly, especially Domain Controllers, as the Active Directory is backed up when the System State of the domain controller is backed up.
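The idea of keeping quarterly fulls plus one full per year can be sketched as a small selection routine. This is only one possible reading of that suggestion: the archive_candidates function is a hypothetical helper, and picking the first full of each quarter and of each year is my assumption, since any consistent choice would serve.

```python
from datetime import date


def archive_candidates(full_backup_dates):
    """Pick which full backups to set aside.

    Returns two sorted lists: the first full of each calendar quarter,
    and the first full of each calendar year (the ones to keep indefinitely).
    """
    quarterly = {}
    yearly = {}
    for day in sorted(full_backup_dates):
        quarter = (day.year, (day.month - 1) // 3 + 1)
        quarterly.setdefault(quarter, day)  # keep only the earliest per quarter
        yearly.setdefault(day.year, day)    # keep only the earliest per year
    return sorted(quarterly.values()), sorted(yearly.values())


fulls = [
    date(2007, 1, 6), date(2007, 1, 13), date(2007, 4, 7),
    date(2007, 12, 29), date(2008, 1, 5),
]
quarterly, yearly = archive_candidates(fulls)
print("quarterly archives:", [d.isoformat() for d in quarterly])
print("yearly archives:   ", [d.isoformat() for d in yearly])
```

How long to hold on to the quarterly copies is a business decision the sketch does not make for you.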

What you are backing up will make a difference in your overall required costs. If it's JUST files, then you don't need any special software. If it's Exchange, then you might be better off buying backup software that can do a "brick level backup" (this ability is often part of a separately purchased agent that works with commercial backup software), which would allow you to restore individual email messages. The built-in backup tool with 2000/2003 will back up Exchange and restore it, but restoring an individual email would be more cumbersome than with a brick level backup. (If you have Exchange, you might want to review the information provided here: http://www.petri.co.il/backup_exchange_2000_2003_with_ntbackup.htm).

As mentioned before, in a Windows environment, you also should pay attention to the Windows system states - A normal FULL, DIFFERENTIAL, or INCREMENTAL backup will NOT properly backup Windows OR the Windows Active Directory. To do this, you MUST do a system state backup. The built in backup tool will do this for you and save it to a file. I STRONGLY recommend doing system state backups of ALL domain controllers and Exchange Servers whenever making changes to the domain. Not doing so is an unnecessary and risky gamble. High end backup software, such as Veritas Backup Exec or Brightstor ArcServe will do system state backups as well. Then there's your database servers. If your company runs a SQL class database, you need to consider the expensive backup tools like Veritas or Brightstor. They have available (at extra charge) agents that will backup the databases without shutting them down. This can be critical if your database needs to be running 24x7x365.

Lastly, cost per GB. The cost per GB for backing up LARGE amounts of data (TB in size) via tape still can't be beat. LARGE tapes cost between 30% and 50% less than a hard drive of equal size. But the tape drives often cost many hundreds or even thousands of dollars. So, for example, if you are backing up 10 GB of data every night and want a way of doing this automatically, then I would suggest backing up to and cycling three external hard drives off site. This would cost you perhaps $300 and potentially last you 2-4 years. With tape, even though the tapes might be $50 each, the drive will likely cost $400 or more, depending on the type of drive. So you end up spending $500-$800 over 2-4 years, at least, and you're using a technology where, if your tape drive dies, you have absolutely no access to your backups. On the other hand, you can always attach a hard drive to any computer and read the backups.
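The arithmetic behind that comparison can be written down as a rough sketch. The $300 for three drives, $50 per tape, and $400 tape drive come from the example above; the three-tape rotation and the assumption that one tape replaces one hard drive are mine, and plan_cost and break_even_media_count are hypothetical helper names.

```python
import math


def plan_cost(device_cost, media_unit_cost, media_count):
    """Total outlay: the drive (if any) plus every piece of media."""
    return device_cost + media_unit_cost * media_count


def break_even_media_count(device_cost, disk_unit_cost, tape_unit_cost):
    """Number of media pieces at which tape stops costing more than disk."""
    saving_per_piece = disk_unit_cost - tape_unit_cost
    if saving_per_piece <= 0:
        raise ValueError("tape media must be cheaper per piece than disk")
    return math.ceil(device_cost / saving_per_piece)


disk_total = plan_cost(device_cost=0, media_unit_cost=100, media_count=3)
tape_total = plan_cost(device_cost=400, media_unit_cost=50, media_count=3)

print(f"three external hard drives: ${disk_total}")
print(f"tape drive plus three tapes: ${tape_total}")
print("tape catches up at", break_even_media_count(400, 100, 50), "pieces of media")
```

With these figures, the cheaper tapes only make up for the cost of the drive once you need around eight pieces of media, which is why tape tends to pay off for long-term archiving rather than a small rotation.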

Flash Drives
  Pros
    • No moving parts
    • Easily portable
    • Potentially allows random access to data, depending on format used to backup.
  Cons
    • Limited lifespan
    • Tends to operate slower as more and more writes take place
    • Very high cost per GB
  Cost/GB (see note 1): $4.25

Proprietary (such as Iomega REV drive)
  Pros
    • Very low initial cost for small environments
    • Easily portable
    • Potentially allows random access to data, depending on format used to backup.
  Cons
    • Proprietary - failure of the drive can impede backups and/or restores - if this happens at a critical time, it can be a major problem.
    • Trusting the proprietary solution for long term archival purposes can be problematic if the maker goes out of business.
  Cost/GB (see note 1): $1.43

CD/DVD
  Pros
    • Media is low cost, typically $.10 to $.25 per disc in bulk.
    • Easily portable
    • Very low initial costs.
    • Potentially allows random access to data, depending on format used to backup.
    • No moving parts
  Cons
    • Media capacity is quite small. CDs hold 700 MB, DVDs hold 4-8 GB.
    • Media is write-once and cannot be erased. (More expensive RW media can be erased)
  Cost/GB (see note 1): $0.038 (DVD), $0.136 (DVD-DL), $0.680 (BluRay), $0.243 (CD)

External Hard Disk
  Pros
    • Large capacity
    • Potentially allows random access to data, depending on format used to backup
    • Fast
    • Easily accessed from any computer.
  Cons
    • Most delicate media - while still relatively sturdy, drops and falls can cause the drive and all data to become inaccessible without an expensive recovery service.
    • When used for long term archiving, costs can become considerable.
  Cost/GB (see note 1): $0.175

Tape
  Pros
    • Media cost is lowest when data must be frequently archived for long periods of time.
    • Devices are available that can support backups of multiple TB without user intervention.
  Cons
    • High initial cost.
    • No possibility of random access to data.
    • Data recorded with old drives may sometimes be unreadable on new drives.
  Cost/GB (see note 1): $0.138 (DAT72), $0.088 (LTO3)

Note 1. Cost per uncompressed gigabyte is for media only and does not take into account number of write cycles or cost of hardware to write to the media. Actual price (US Dollars) may vary based on exact capacity of media, rebates, and changing market conditions. Source of prices is NewEgg.com on January 4, 2008.

    CD/DVD
    If your data doesn't grow that much, you can use a CD/DVD recorder to backup your differential data. The backups are fairly fast, the media is compact and cheap, and the data can be accessed - usually - by any system with a DVD drive. The problem is, most people can't get a complete backup on a DVD and getting them to work in an automated fashion CAN be difficult.

    External Hard Drive
    Costs are relatively low and, depending on the hardware you buy, can be as little as 30 cents per GB - or less. For fast, reliable, easily performed, and easily recovered backups, I believe this is the best solution. You will, of course, need at least two drives so you can cycle one off site. The only drawback is that if you need to store data for long periods or have large amounts of data to back up (over 400GB), it can be more expensive than tape.

    Be careful - some external hard drives, notably the larger drives from Lacie, use RAID 0 internally. If one of the internal drives fails, you lose ALL the data on the disk. For this reason, when using external drives for backup purposes I strongly recommend ONLY using products based on single drives. Note: If you are using Windows and you want to utilize external hard drives for backups, be sure to use the NTFS file system and not FAT32. FAT32 has a 4 GB file size limit which could cause your backups to fail if they reach or exceed 4 GB. For more information on NTFS and FAT32, check out www.ntfs.com
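If you are stuck with a FAT32 drive, a quick check can tell you whether any backup file would hit the size limit before the job fails. The sketch below is a minimal example: FAT32 cannot store a single file larger than 4 GiB minus one byte, and fits_on_fat32 and oversized_files are hypothetical helper names chosen for this illustration.

```python
from pathlib import Path

# Largest single file FAT32 can hold: 4 GiB minus one byte.
FAT32_MAX_FILE_BYTES = 2**32 - 1


def fits_on_fat32(size_bytes: int) -> bool:
    """True when a file of this size can be written to a FAT32 volume."""
    return size_bytes <= FAT32_MAX_FILE_BYTES


def oversized_files(folder):
    """List files under `folder` that are too large for FAT32."""
    return [
        path
        for path in Path(folder).rglob("*")
        if path.is_file() and not fits_on_fat32(path.stat().st_size)
    ]
```

Running oversized_files against the folder that holds your backup files before copying them to the external drive lists anything that would not fit.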

    Tape
    When people talk of backups, MOST people think of tape. For most small businesses, tape is not the most cost effective solution. Unless you are backing up terabytes of data and/or need to keep each backup for a lengthy period of time, tape can be more expensive and less reliable than an external hard drive. For LARGE backups and storing backups for long times, it is still the most cost effective solution.

    Given costs of drives, media, hardware and tape delicacy, and various other issues with tape I discuss later, getting a tape drive with a native capacity of less than 300 GB just isn't a wise idea for most people. This will limit you to getting SDLT or LTO tape technology. LTO3 can hold 400 GB native (uncompressed) and the newest SDLT technology can handle 300 GB per tape, also native.

    If you need to keep your backups for lengthy periods of time, for example to comply with HIPAA or Sarbanes-Oxley, and you otherwise don't have much data to back up regularly, then you might want a smaller tape drive for backup purposes. That said, I am not certain tape alone is sufficient for compliance. Be sure you research the requirements of any regulations before assuming any given technology will suffice.

    Internet Backup
    I don't necessarily recommend this company, but here's one option - http://www.remotedatabackups.com/. The idea is great - you get an off-site, quickly recoverable backup of your important data. This is an important factor that can make the cost per GB (compared to other backup methods) less important. You will typically want to have a reasonably fast internet connection, and keep in mind that LARGE amounts of data (GB's worth) CAN take hours to back up initially and to restore. Once the initial backup is done, you can typically back up changes fairly quickly. For a "poor man's" method of doing this, you can always script an FTP connection to a remote ISP and upload important files, such as accounting files, via the script nightly - you just need to periodically delete old backups or you will most likely run out of space.
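As a rough sketch of the "poor man's" approach, the following Python uses the standard ftplib module to upload one file and then delete the oldest backups on the server. The host, credentials, file naming, and the number of copies to keep are all placeholders, and the sketch assumes the remote directory holds nothing but dated backup files whose names sort chronologically (for example accounting-2008-01-04.zip). Keep in mind that plain FTP sends credentials unencrypted.

```python
from ftplib import FTP
from pathlib import Path


def expired_backups(names, keep):
    """Return the oldest names, leaving the `keep` newest in place.

    Assumes names sort chronologically, e.g. accounting-2008-01-04.zip.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    return sorted(names)[:-keep]


def upload_and_prune(host, user, password, local_file, keep=14):
    """Upload one backup file, then delete all but the newest `keep` files."""
    local_file = Path(local_file)
    with FTP(host) as ftp:
        ftp.login(user, password)
        with local_file.open("rb") as handle:
            ftp.storbinary(f"STOR {local_file.name}", handle)
        # Assumption: the remote directory contains only backup files.
        for name in expired_backups(ftp.nlst(), keep):
            ftp.delete(name)
```

A scheduled task could call upload_and_prune nightly with the path to that day's archive.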

    Disk-to-Disk-to-Tape
    Sometimes referred to as D2D2T, this is a process where data is backed up to a disk-based backup system that can allow for rapid restores of data. Data on the D2D2T devices is then backed up to tape, sometimes with less frequency than the initial disk-to-disk backup.

    This style of backup doesn't generally make sense for small business, often being found in large enterprises. Such systems usually require terabytes of available storage for multiple backups and can be quite expensive to implement.

    Other Things to Consider

    Aside from backing up your critical data, don't forget to configure backups for your SQL databases and Microsoft Exchange (if used), and when backing up Windows 2000 or 2003 servers or 2000/XP workstations, MAKE SURE you back up the System State, which contains Active Directory information on domain controllers, your registry, and essential files that your computer uses to know its vital information.

    Regardless of media used, a certain degree of common sense is necessary. All media is sensitive to abuse. People might think tapes are the hardiest media - but drop one and break the casing, and getting the data off the tape can be VERY difficult. Hard drives CAN be more sensitive to drops, but good ones, and ones in good enclosures, will have technology that can help ensure they survive falls. CD/DVD media can typically survive drops fine. Picking them up, though, can scratch the media - one grain of sand and you can destroy portions of the media or the entire disc trying to pick it up from the floor. I don't have any hands-on experience with REV disks, but I don't think they are made of carbon nanotubes, so breakage is still a possible result, probably as likely as with a tape. So handle with care - not necessarily kid gloves, but don't play catch or Frisbee with the media either.

    Backup Networks
    If your network sees high utilization at all hours, you might want to consider putting in a backup network. This would be a separate network adapter in each server and dedicated network switches that would provide a dedicated path between the backup server(s) and the other servers that need backing up on your network. A backup network could expand your backup window and would allow your backups to take place without saturating your network equipment and potentially disrupting your users.
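Whether a dedicated backup network is worth it comes down to simple arithmetic on how long the data takes to move. The sketch below is a back-of-the-envelope estimate only: the data size, link speeds, window length, and the efficiency factor (the share of the link's rated speed you actually get) are all assumptions, and backup_hours and fits_window are hypothetical helper names.

```python
def backup_hours(data_gb, link_mbps, efficiency=0.7):
    """Estimated hours to move data_gb over a link rated at link_mbps."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be greater than 0 and at most 1")
    megabits = data_gb * 8000  # 1 GB = 8,000 megabits (decimal units)
    seconds = megabits / (link_mbps * efficiency)
    return seconds / 3600


def fits_window(data_gb, link_mbps, window_hours, efficiency=0.7):
    """True when the estimated transfer finishes inside the backup window."""
    return backup_hours(data_gb, link_mbps, efficiency) <= window_hours


# 450 GB over a shared 100 Mbps network versus a dedicated gigabit link,
# against an 8 hour overnight window.
print(f"100 Mbps:  {backup_hours(450, 100):.1f} hours")
print(f"1000 Mbps: {backup_hours(450, 1000):.1f} hours")
print("fits on 100 Mbps: ", fits_window(450, 100, 8))
print("fits on 1000 Mbps:", fits_window(450, 1000, 8))
```

Real throughput also depends on the disks and the backup software, so treat the result as a planning figure rather than a promise.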

    TESTING

    Oh yes, one more VERY important detail. TEST YOUR BACKUP PLAN. What good are your backups if, when the time comes and you need them, you can't get them to work? Testing also has the benefit of creating familiarity with restore and recovery procedures, allowing you to know exactly what to do in the event you need to recover data.

    In 2006, a computer tech accidentally deleted and reformatted drives containing information on the distribution of 38 BILLION dollars, distributed to Alaskan residents by the Alaska Department of Revenue. So they went to backups, only to discover the tapes were unreadable. They ended up recovering (for $200,000) the necessary information by reentering data from the paper work stored in 300 boxes. AP Story at Yahoo!

    Pick a weekend and fake a problem. For example, turn off your server and consider it dead. Rebuild the server on another system and do a restore to see that everything works. Test your database recovery plan. Test your e-mail recovery plan. Test your web server recovery plan. Test your fill in the blank recovery plan. And once is never enough. I would recommend testing twice per year at least.

    It is almost impossible to think of every possible issue that could impact your systems but by planning and testing a variety of scenarios, you will become more confident in your ability to recover from any problem.
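One way to make a test restore more than a visual spot check is to compare the restored files against the originals by checksum. The following Python is a minimal sketch using the standard hashlib module; compare_trees and file_digest are hypothetical helper names, and the sketch only checks that every file in the original folder exists, unchanged, in the restored folder.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """SHA-256 of a file, read in 1 MB chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(original, restored):
    """Return (relative path, problem) pairs for files that failed to restore."""
    original, restored = Path(original), Path(restored)
    problems = []
    for source in sorted(p for p in original.rglob("*") if p.is_file()):
        relative = source.relative_to(original)
        target = restored / relative
        if not target.is_file():
            problems.append((str(relative), "missing"))
        elif file_digest(source) != file_digest(target):
            problems.append((str(relative), "differs"))
    return problems
```

An empty list from compare_trees means every original file came back intact. Files that legitimately changed after the backup ran will show up as "differs", so run the comparison against data you know was static.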

    RESTORES

    There are two primary types of restores:

    File restores are generally easy enough. Just restore the file you need to restore. When possible, to be absolutely safe, you should restore data to an alternate location so you don't overwrite the existing data. Why not overwrite? What happens if you restore the wrong data? Or if there are two versions of the file(s) with different data that needs to be merged? Restoring to a different location will allow your users to reconcile the data themselves.
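If you script any part of a file restore, the alternate-location rule is easy to build in. This is a minimal sketch, not how any particular backup product behaves: alternate_restore_path and restore_copy are hypothetical helpers that rebuild the original folder structure under a separate restore folder and refuse to overwrite anything already there.

```python
import shutil
from pathlib import Path


def alternate_restore_path(original, restore_root):
    """Map a file's original path to the same relative path under restore_root."""
    original = Path(original)
    # Drop the drive or root so the path can be nested under restore_root.
    relative = Path(*original.parts[1:]) if original.is_absolute() else original
    return Path(restore_root) / relative


def restore_copy(backup_file, original, restore_root):
    """Copy a file pulled from backup into the alternate location."""
    target = alternate_restore_path(original, restore_root)
    if target.exists():
        raise FileExistsError(f"refusing to overwrite {target}")
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(backup_file, target)  # copy2 also preserves timestamps
    return target
```

The user can then compare the restored copy with the live file and merge whatever is needed.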

    System level restores are far less frequent but also an order of magnitude more dangerous. Restoring a system or database to a prior point in time can result in lost data. If at all possible, before performing a System State or Database restore (including an Exchange restore), I strongly recommend performing a backup beforehand. This will help ensure that if the intended restore doesn't solve your problem, or if it creates different or worse problems, you can bring the system back to the way it was.

    Another important note regarding system level restores, especially those involving System States - they must typically be restored to similar hardware, primarily system boards that use the same motherboard chipset.

    This might seem obvious, but knowing what to restore and when is vital. Recently, I found a person who restored their Active Directory by using a three month old backup. Obviously, if this is the only backup available, you need to make do, but if it is the only backup available, then you really need to re-evaluate your backup plan.

    So why was this a bad idea? System State data has a limited shelf life - for more information you can review Microsoft Knowledge Base (KB) article 216993. In short, some of the problems that can occur with such an old system state restore include:

    Remember, databases also need special consideration. Blindly restoring databases can cause data to be permanently lost. Depending on your business, this could be devastating to you.

    LINKS

    Please do not consider the following list of links to be complete. There are likely other products, services, and technologies available which may be emerging, aging, or otherwise not listed here. Further, these links are provided for convenience and not intended to be recommendations for the use of the product, service, or technology (unless specifically noted earlier). If you find any of them to be outdated, please use my contact page to report them to me.

    My Scripts

    BACKUP.CMD Version 1.0 Beta 1
    This is my backup script that I recently completed. It uses the built-in NTBackup utility included with Windows 2003 and XP. It is designed to backup data to an external drive and use a tool called GBMailer to e-mail a report concerning the backup to an e-mail address specified.

    Imaging Software Products

    Backup Software Products

    Tape Formats

    Backup Hardware Products

    Internet/Online Backup Services

    Other Backup Links

    CREDITS

    This document was inspired by the questions of others on the community support site, Experts-Exchange. My Experts-Exchange profile is available here.

    Other experts who frequently participate at Experts-Exchange have also contributed directly to this document. They include:

    In addition, I would like to thank kmcferrin of Tek-Tips Forums for providing clarifications regarding RAID 10/0+1.

    Lastly, you can find the question, asked at Experts-Exchange, requesting additional editing and supplemental material here.

    DISCLAIMER

    The information provided here is as accurate as possible, but still may contain errors. Use of the information provided is entirely at your own risk.

    You may copy the content of the page in whole or in part provided you include a link directly back to this page.

    Version 2.406. Last Modified August 27, 2007 - Revision History