Replication vs Offsite Backup

You may have read some of my previous postings where I encourage everyone to maintain offsite copies of data. One method for getting your data off site to protect against a disaster at your main site is replication. Most database servers have built-in replication functionality, and there are third-party products that can be used to replicate data across a network. Replication has one advantage over backup in that the data is usually replicated soon after it arrives at the primary server. Relatively little data would be lost if the primary server were lost, because the replication server would be within a few minutes of the primary server. Of course, for replication to be considered a form of offsite backup, the replication server must be in a different physical facility than the primary server, preferably with some distance between the two.

While using replication to back up your data to an offsite location has the advantage of being very current, there are some significant exposures that you should be aware of. First of all, human error or malicious code that destroys data on the primary server can cause the same data to be destroyed on the replication server. With replication, many of the causes of data loss are also replicated to the secondary server. Another significant area that needs to be covered is historical backups. With pure replication, you don't have any way to retrieve an older copy of your data.
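To make the exposure concrete, here is a toy sketch in Python. It is only a model under stated assumptions: the dictionaries stand in for servers, and the `replicate` function and `snapshots` store are hypothetical names invented for this illustration, not part of any real replication product.

```python
import copy

def replicate(primary, replica):
    """Mirror the primary's current state onto the replica (pure replication)."""
    replica.clear()
    replica.update(primary)

# Hypothetical data set: file name -> contents
primary = {"orders.db": "v1", "customers.db": "v1"}
replica = {}
snapshots = {}  # hypothetical dated backups: label -> independent copy

# Monday: replication runs and a backup is taken
replicate(primary, replica)
snapshots["monday"] = copy.deepcopy(primary)

# Tuesday: a mistaken command deletes a file, and replication mirrors the loss
del primary["customers.db"]
replicate(primary, replica)

replica_has_file = "customers.db" in replica       # the replica lost it too
restored = snapshots["monday"]["customers.db"]     # the older copy survives
print(replica_has_file, restored)
```

The replica faithfully repeats the deletion, while the dated copy still holds the file as it was on Monday.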

The bottom line is that you may need to do both. Replication serves to keep a standby copy of your data that can be used in case of a disaster at your primary facility. But replication does not protect you against some of the more common forms of data loss involving intrusions, viruses, and human error. A recent backup may be needed to restore erroneously updated data. A good assessment of your backup and recovery policy is the best way to make sure that you are adequately protected against costly data losses.

In-file Delta Backups - Taking Incremental Backup to the Next Level

Incremental backup methods only copy files that have changed since they were last backed up. This saves substantial time and backup media because only a fraction of the data on any particular system is changed daily. Traditional incremental backup methods determine which files have changed by using one or more of the following methods:

  • Archive flag setting
  • Time stamp comparison
  • CRC comparison
  • File size comparison
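As a rough sketch of how the checks listed above might be combined, the Python below records a file's size, time stamp, and CRC at backup time and compares them later. The function names (`file_crc32`, `snapshot`, `has_changed`) are made up for this example, and the archive flag is left out because it is specific to Windows file systems.

```python
import os
import zlib

def file_crc32(path, chunk_size=65536):
    """Compute a CRC-32 over the whole file, reading it in chunks."""
    crc = 0
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            crc = zlib.crc32(chunk, crc)
    return crc

def snapshot(path):
    """Record what the file looked like at backup time."""
    st = os.stat(path)
    return {"size": st.st_size, "mtime": st.st_mtime, "crc": file_crc32(path)}

def has_changed(path, previous):
    """Return True if the file differs from the snapshot taken at the last backup.

    `previous` is None when the file has never been backed up.
    """
    if previous is None:
        return True
    st = os.stat(path)
    if st.st_size != previous["size"]:      # file size comparison
        return True
    if st.st_mtime != previous["mtime"]:    # time stamp comparison
        return True
    return file_crc32(path) != previous["crc"]  # CRC comparison
```

The cheap checks run first; the CRC, which requires reading the whole file, is only computed when size and time stamp both match.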

Once it has been determined that a file has changed since it was last backed up, the entire file must be backed up again. While this is much more efficient than backing up the entire disk, incremental backup still makes copies of vast amounts of data that have not changed. My Outlook .PST file, for example, is about 2 GB and contains many thousands of saved emails. Furthermore, my Outlook .PST file changes every day, so incremental backups would copy the entire file every day. I may only save 10 new emails in my .PST file on a typical day, but the entire file must be backed up in order to have a current backup copy. So effectively, I would be backing up many thousands of emails so I can have a backup copy of the last 10.

Delta file backup methods improve on the traditional incremental backup methods by only backing up the parts of files that have changed. In my case, only about 20 KB of my .PST file gets backed up on a daily basis with delta file technology. Delta technology examines a file in chunks called blocks. A CRC value is computed for each block of the disk file and then compared to the CRC value of the corresponding block in the copy of the file on the backup system. Each block that differs is backed up. Blocks that have not changed are not backed up because an exact copy of those blocks is already stored on the backup system. If the file needs to be restored, the blocks of the file on the backup system are used to reconstruct the file, including all of the most recent updates.
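The block comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of any particular product: the fixed 4096-byte block size and the function names are assumptions, and real implementations may use stronger checksums or cope with data that shifts position inside a file.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size; real products choose their own

def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut the file contents into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def block_crcs(data, block_size=BLOCK_SIZE):
    """CRC-32 of every block, in order."""
    return [zlib.crc32(block) for block in split_blocks(data, block_size)]

def compute_delta(new_data, backup_crcs, block_size=BLOCK_SIZE):
    """Return {block_index: bytes} for blocks that differ from, or are missing on, the backup."""
    delta = {}
    for index, block in enumerate(split_blocks(new_data, block_size)):
        if index >= len(backup_crcs) or zlib.crc32(block) != backup_crcs[index]:
            delta[index] = block
    return delta

def apply_delta(backup_blocks, delta, new_block_count):
    """Rebuild the block list on the backup side from stored blocks plus the delta."""
    blocks = list(backup_blocks[:new_block_count])
    blocks.extend([b""] * (new_block_count - len(blocks)))
    for index, block in delta.items():
        blocks[index] = block
    return blocks

# Three blocks already on the backup system
old = b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE + b"C" * BLOCK_SIZE
# Today's file: one byte changed inside block 1, plus some appended data
new = old[:5000] + b"!" + old[5001:] + b"new mail"

delta = compute_delta(new, block_crcs(old))
restored = b"".join(apply_delta(split_blocks(old), delta, len(split_blocks(new))))
print(sorted(delta), restored == new)
```

Only the modified block and the newly appended block travel to the backup system, yet the restored file matches the current one byte for byte. Note that with fixed-size blocks, inserting data in the middle of a file shifts every later block, so this simple scheme suits files that are mostly changed in place or appended to.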

When I first looked at delta backup methods, I had a certain amount of apprehension. I had a fear that the blocks would get out of sync or that for some other reason the file would not be the same once it had been taken apart and put back together. I spent quite a bit of time looking at the algorithms and testing actual backups and restores using delta technology. I was quite surprised at the results. Delta file technology is solid and mature. In fact, I haven't seen a single case where file integrity was violated. I have come to the conclusion that the algorithms and implementations are very good and leave little to chance. It is also apparent that testing the software that implements delta file technology is fairly straightforward, making the process verifiable. I liken delta file technology to SSL or ZIP compression when it comes to the ability to transform files to another form and back again. It simply works, every time.

In-file delta technology is especially useful in online backup systems. The better online backup systems are designed to use bandwidth efficiently by avoiding movement of data that is already at the destination. In my case, my .PST file is backed up daily to an offsite backup service in a few seconds. It would take at least 2 hours without the delta file technology. Online backup to offsite storage locations is great technology, but it is the in-file delta technology that makes it practical to back up large files over the Internet.
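A quick back-of-the-envelope calculation shows where numbers like these come from. The 2 Mbit/s upload speed below is purely an assumption chosen for illustration; the 2 GB and 20 KB figures are the ones from my own .PST file.

```python
def transfer_seconds(size_bytes, uplink_bits_per_second):
    """Raw transfer time, ignoring protocol overhead and file scanning."""
    return size_bytes * 8 / uplink_bits_per_second

UPLINK_BPS = 2_000_000      # assumed 2 Mbit/s upload link
FULL_FILE = 2 * 1024**3     # the whole 2 GB .PST file
DAILY_DELTA = 20 * 1024     # roughly 20 KB of changed blocks

full_hours = transfer_seconds(FULL_FILE, UPLINK_BPS) / 3600
delta_secs = transfer_seconds(DAILY_DELTA, UPLINK_BPS)

print(f"full file: {full_hours:.1f} hours")    # full file: 2.4 hours
print(f"delta:     {delta_secs:.2f} seconds")  # delta:     0.08 seconds
```

Under that assumption, the raw transfer of the delta takes a fraction of a second, so the few seconds a real backup takes would be spent mostly on scanning the file and talking to the backup service rather than on moving data.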

Hard Drive Goes Click, Click, Click, ...

If you have ever fired up your computer only to hear your hard drive make clicking sounds, then you know what I am talking about. This is often the sound of death for a hard drive. It is not a pleasant experience. In fact, it can be a downright sickening feeling when you hear that clicking sound coming from your computer and see a message like "Invalid System Disk" or "No System Disk"... I have been using personal computers since the early 1980s and larger systems before that. I have experienced many hard drive failures over the years.

When one of my disk drives dies, it is only a minor inconvenience to me. I have to buy a new one, install it, and restore my data. That all takes time, but I don't have that sinking feeling that comes with losing data. Unlike most other small business and home computer users, I expect my hard drives to fail. I am fully aware that any of my hard drives can fail at any time. Therefore, I diligently back up my data. Actually, I don't do much of anything. My data is automatically backed up by an online backup system. I rarely do any manual backups.

I think most of us who are computer professionals realize that any data that is worth keeping needs to be backed up. I hope the rest of you will get the message and understand that all hard disk drives fail. Even a new drive can fail. The only way to ensure that your data will not be lost is to back it up.