The Danger of Using Replication for Data Backup

I only get one chance every four years to write a blog post on February 29th. However, I am not writing this just to get something posted with a Feb 29 date. I want to warn you about a serious danger of using replication as a backup strategy. Replicating files or databases ensures that you have a secondary copy of your data in a different location in the event that your primary storage fails or is destroyed. The replicated data can be in an off-site location which adds some additional protection against disasters that may render your entire office or data-center unusable.

While replication protects data against many hardware and connectivity problems, there are many other hazards to your data that replication offers absolutely no protection for. Suppose your data is accidentally deleted or overwritten, which are common causes of data loss. Guess what? Your replicated data is also deleted or overwritten. What if a virus, malicious code, hacker, or other security problem affects your data? Your replicated data will almost certainly be affected in the same way. The fact is that replication does not protect against the most common causes of data loss.

Data replication is not a substitute for backup and recovery procedures. If data is important enough to be replicated, then it also needs to be backed up.
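
To make the point concrete, here is a minimal sketch in Python. It is a toy model, not any real replication or backup product: the dictionaries stand in for storage, and the function names replicate and take_snapshot are made up for illustration. It shows a replica faithfully copying an accidental deletion while an earlier point-in-time backup still holds the lost file.

```python
import copy

def replicate(primary):
    """Replication: the replica always mirrors the primary's current state."""
    return copy.deepcopy(primary)

def take_snapshot(primary, history):
    """Backup: append a point-in-time copy and keep all earlier copies."""
    history.append(copy.deepcopy(primary))

# Hypothetical file contents, used only for this illustration.
primary = {"report.doc": "final draft", "budget.xls": "Q1 numbers"}
history = []

take_snapshot(primary, history)    # last night's backup
replica = replicate(primary)       # replica is in sync

del primary["report.doc"]          # accidental deletion
primary["budget.xls"] = "garbage"  # accidental overwrite
replica = replicate(primary)       # replication copies the damage too

print("report.doc" in replica)     # the replica has lost the file as well
print(history[0]["report.doc"])    # the backup still has the old version
```

The replica is only ever as good as the primary at this moment. The backup history is what lets you go back in time.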

Mozy Pro Prices Increasing Dramatically

According to an Information Week article on Feb 27, EMC has announced new pricing for its online backup service Mozy Pro. Under the new pricing, the bill for a SOHO customer with 1 server and 10GB of data will increase from $8.95/mo to $24.95/mo. A 100GB customer will pay $224.50/mo, up from $89.50.
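
To put those numbers in perspective, here is a quick sketch in Python that works out the percentage increase and the added cost per year. The prices are the ones quoted above; the function name and plan labels are just for illustration.

```python
def percent_increase(old, new):
    """Return the percentage increase from old to new."""
    return (new - old) / old * 100

# (old monthly price, new monthly price) as quoted in the article
plans = {
    "1 server, 10GB": (8.95, 24.95),
    "100GB": (89.50, 224.50),
}

for name, (old, new) in plans.items():
    extra_per_year = (new - old) * 12
    print(f"{name}: ${old:.2f} -> ${new:.2f} per month "
          f"(+{percent_increase(old, new):.0f}%, "
          f"+${extra_per_year:.2f} per year)")
```

That works out to roughly a 179% increase for the small customer and about 151% for the 100GB customer.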

I am sure a lot of people are thinking that EMC is screwing up Mozy. I don't think we should blame EMC for this. Long before EMC came into the picture, Mozy had been giving away free and dirt-cheap online backup accounts. This was a very effective marketing strategy for Mozy. They amassed a customer base in the hundreds of thousands, most paying nothing and others paying relatively small amounts. That business model can't be sustained over the long term. I think Mozy's plan from the outset was to build a big customer base with the free accounts, and then sell the company and make a ton of money for the management and VC investors. Seeing how EMC bought Mozy for $76M last fall, I would say they accomplished their goal.

EMC didn't buy Mozy to give away free accounts and dirt-cheap backup services. They probably don't have any intention of selling the business either. I have to believe that EMC intends to turn the Mozy business into a profit for their shareholders. I would also expect the other Mozy offerings to be adjusted with that goal in mind. Don't blame EMC for this; they are not the ones who created a business model that was designed to suck up market share and then sell the business.

Using Online Backup for Offsite Storage Needs

Magnetic tapes have been the preferred media for backing up data for about four decades. Tapes were used for backup when programs and data were stored on Hollerith punch cards. The tapes and machinery have changed many times over the years, but tape is still the most widely used backup media in 2008.

Over the past several years, many organizations have migrated toward disk-based backup systems. Disk-based systems are more reliable than tapes and they have become less expensive to own. Although there are clear advantages to disk-based backup systems, many organizations have large investments in tape backup hardware and media as well as software and processes. Some CIOs and IT directors are reluctant to scrap their investments in tape backup systems and convert over to disk-based systems. Disk-based backup has only recently become cost-effective and widely accepted, and it still does not have the long-term history that tape-based systems have.

While many IT executives have been reluctant to make the switch to disk-based backup systems, others have felt compelled to move to disk-based systems due to the cost-saving and reliability advantages. For the traditional shops that have no intention of moving to disk-based systems anytime soon, there is one area where immediate benefits can be had without scrapping the tape backup systems. Moving tapes to an offsite storage facility has always been part of tape backup systems. Online backup services are an ideal way to move critical data off-site automatically, eliminating the error-prone procedures of manually selecting tapes and transporting them to and from an offsite facility. Those who are determined to keep their tape backup systems in place can continue to back up to tape, and also use an online backup service to move critical data off-site. Having the backup data moved off-site to a disk-based online backup service provider is much simpler, less expensive, and more reliable than the traditional methods of physically moving tapes from location to location.

Secure Online Backup Can Be an Inexpensive Solution to Costly Data Breaches

I received a notice from my mortgage company that my personal information may have been stolen. They said that a backup tape containing customer information had been lost and they were notifying all affected customers in case their information was used in an identity theft. To help me protect my credit, they paid for a credit monitoring service that I could use for one year. I occasionally hear about similar data breaches, but this is the first time my data was involved.

The letter from the mortgage company went on to explain that they create regular backup tapes of important data and the tapes are transported to a secure offsite storage facility. In this case, one or more of the backup tapes were lost in transit. My first thought was, "Why weren't they encrypting the backup data?" If they had been encrypting the data on the backup tapes, then there would be no need to go through the expense and embarrassment of notifying thousands, if not millions, of customers, because no data would have been exposed. And what if the data was actually used for identity theft? I suppose some very expensive remedies and lawsuits would be forthcoming. Thinking back to my first job out of college, when I worked as a systems programmer in a bank data center, we didn't encrypt our backup tapes either, but that was in the 1980s.

Fast-forward twenty years--encryption technology is widely available, affordable, efficient, and very secure. There is no good excuse for sending data offsite in plain-text form. Many of the long-time IT operations are probably operating pretty much the same way that we did back at the bank: at the end of nightly processing, copy the databases out to tape and put them in cases to be transported to the offsite storage location. Another factor may be pure ignorance on the part of small and medium-sized organizations that don't have professional IT staff. In any case, these companies are probably feeling pretty good about the fact that they are doing regular backups and storing the data off-site.

The problem is getting to be more critical than ever because companies are storing more data on computers and retaining the data for longer periods of time. That trend will certainly continue. If large and small IT operations don't improve their backup methods with encryption, then the frequency and volume of data breaches related to mishandled backup data will increase also.

One very simple solution to the problem is a secure online backup service. With online backup, there is no tape to get lost. Secure online backup services encrypt the data before it leaves your computers, and the backup data remains encrypted at all times. Although backup tapes can be encrypted, it is much easier to skip the tapes and the transportation, and just back up the data straight to an online backup service. In almost every case, online backup services are a much less expensive offsite backup solution than tapes and other media.

Not all online backup services are the same. You can search the Internet for professional and secure online backup services and find quite a few that are worth considering. I also advise you to read this article: SSL is Not Enough Security for Online Backup. When you consider the cost to purchase and maintain backup hardware and media, and then add in the IT operational costs, transportation, and software, secure online backup services are clearly an inexpensive solution to offsite backup of sensitive information with very low exposure to data breaches.

Backup Your Entire Drive or Just The Data?

"Should I backup my entire hard drive, or just my data files?"

If you have a huge hard drive with lots of software on it, then backing up your entire hard drive can be a challenge. If you purchase an external hard drive to use as your backup, you will need to make copies of your primary drive frequently to keep your external drive up-to-date. You will also need to make sure that you are using a backup program that clones everything, including the boot sectors. This is not trivial and usually requires creating some kind of boot disk, booting your computer from that disk, and duplicating your hard drive while your computer is running specialized software outside of your normal operating system. If you go the route of backing up your data only, then you will have some much simpler and quicker options. However, in the event of a hard drive crash you will be required to reinstall your operating system and software before you can restore your data. Power-users can easily handle either method, but not everyone is a power-user.

Christopher Null recently wrote a blog entry that says: "Do you feel comfortable reinstalling Windows and your various programs on a bare hard drive? If not, then back up everything. Power users can forgo the full drive backup and just grab data files, typically the stuff that lives in your My Documents folder." It seems to me that you need to be a power user either way. Duplicating the entire disk requires a certain amount of expertise for both the backup and the restore. I think he has a very good point, but it is not as simple as that. Making a complete clone of your hard drive is not a bad idea, but you also need to do the data backups on top of that. Here is something that I posted a few weeks ago on the topic: Full Disk Backup vs Data Backup.

If you have decided to back up your data files only, then you need to make sure you are backing up the correct files. See Selecting Which Files to Backup and also A Useful Tip for Software.
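
For readers who are comfortable with a little scripting, a data-only backup can be as simple as the Python sketch below. It archives one folder into a timestamped .tar.gz file on another drive. The function name and the example paths are my own placeholders, not part of any particular backup product, and a real backup routine would also need to handle open files, verification, and cleanup of old archives.

```python
import tarfile
from datetime import datetime
from pathlib import Path

def backup_data_folder(source_dir, backup_dir):
    """Archive source_dir into a new timestamped .tar.gz inside backup_dir."""
    source = Path(source_dir)
    target_dir = Path(backup_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    # A timestamp in the name keeps earlier backups instead of overwriting them.
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive_path = target_dir / f"{source.name}-{stamp}.tar.gz"

    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname stores paths relative to the folder name, not the full path.
        archive.add(source, arcname=source.name)
    return archive_path

# Example call (hypothetical paths, adjust for your own system):
# backup_data_folder(Path.home() / "Documents", "E:/backups")
```

Because each run creates a new archive, you can go back to an earlier copy of a file, which a simple mirror of the folder would not allow.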

Whether you are duplicating your entire disk, backing up your data files only, or doing a combination of both, you will need a certain amount of computer know-how to successfully restore. If you are not a power-user or you are not sure if your backup methods are adequate, then you should enlist the help of a friend, relative, or professional who knows what they are doing. Don't wait until your hard disk is fried to find out that your backup is useless.

Selecting the Best Hard Disk Drive

When consumers look to upgrade their computer hard drives, they tend to look for the most capacity for the lowest cost. In most cases this is not a bad strategy for desktop or home computers. The drives from the top name manufacturers are all fairly reliable. You can find great deals by using sites like http://www.pricewatch.com/ or even searching Google, MSN, or Yahoo. However, if you are purchasing a hard drive for something other than a general purpose desktop computer, then there are other factors to consider.

First of all, if the drive is being purchased for a drive array or a server, then the desktop hard drives are not well suited for this purpose. This includes RAID systems, Storage Area Networks (SAN), and Network Attached Storage (NAS). If you are selecting drives for one of these purposes, then you will be better served by selecting an enterprise class drive or a near-line storage drive.

Differences in desktop hard drives versus enterprise or near-line drives:

  • Desktop drives are not designed for a continuous duty cycle. Enterprise and near-line drives are built to run 24x7.

  • Desktop drives have a much lower mean time between failures (MTBF) rating. While desktop drives usually have an MTBF around 500,000 hours, enterprise and near-line drives are usually rated at 1,000,000 hours or more. But don't get wrapped up in these MTBF numbers; just like a lot of other published specs, they serve more of a marketing purpose than a reality. Each drive manufacturer tests their own drives using methods and numbers that are going to reflect best on their products. 500,000 hours is over 57 years! No hard drive is going to last nearly that long. Not even close! If you get five years out of a hard drive, then consider yourself lucky. The MTBF numbers are actually calculated based on the service lifetimes of a large number of the same drive. They can be used for comparison, but the service life or warranty of a drive model is an important factor to be used in conjunction with the MTBF.

  • Enterprise and near-line drives have longer warranties. Now these numbers are worth comparing. If the manufacturer guarantees a desktop drive for one year and their enterprise class drives for five years, then you can safely conclude that the manufacturer expects the enterprise drives to last much longer. The numbers are even more significant than you might think. Since desktop drives are typically designed for usage patterns of around 8 hours per day, and enterprise drives are designed for usage around the clock, the manufacturers are much more confident in the quality of the enterprise and near-line drives. By the way, the warranty and RMA procedures for all the name-brand providers that I have experience with (Seagate, Maxtor, Western Digital, Samsung, Hitachi... ) are pretty good. I say pretty good because most of them require you to pay shipping, handling and expediting costs to replace drives that fail due to their faulty materials and workmanship.


  • Error correction works very differently in desktop drives as opposed to enterprise and near-line drives. A desktop drive will retry disk operations many times before giving up and reporting an error. A disk operation that should take a split second may take a minute or more in a desktop drive if an error is encountered. The operation may eventually succeed and the desktop computer user will just experience a pause or temporary lockup. Disk array systems are usually designed and configured to recover from errors in a completely different way. They store redundant data on separate disks so if the data can't be accessed on one physical disk, it is still available on another. If a desktop drive is used in an array, the drive itself will keep retrying for a lengthy period of time, causing the array logic to time out and possibly drop the drive from the array. The near-line and enterprise drives usually perform more robust error detection and correction in their caching systems as opposed to retrying the physical disk operation for a lengthy period of time.

  • Desktop drives are also optimized for single-user access, whereas near-line and enterprise drives are engineered to handle access from many users at the same time. The near-line and enterprise drives have certain firmware commands and capabilities to organize near-simultaneous requests so that they can be handled in the most efficient way.

  • Near-line and enterprise drives are usually designed to withstand much stronger and more sustained vibration than desktop drives. Drives that are configured in arrays and server racks are subjected to constant vibration from other drives and equipment. Vibration can accelerate the deterioration of a hard drive and end its life prematurely.

  • Many of the near-line and enterprise drives are engineered to manage workload by slowing down or validating each read and write when the drive temperature rises. This effectively causes the drives to cool down and continue to operate indefinitely while a desktop drive would fail under similar circumstances.

  • Some desktop drives are designed to spin down during periods of non-use. This saves wear and tear on the drive and also saves energy. While this may extend the life of a hard drive and be desirable for a desktop computer, it can cause problems in arrays. Arrays don't expect to have to wait for a drive to spin up before accessing data. Enterprise and near-line drives have other ways of saving energy, such as powering down parts of the circuitry that are not being used, but they do not spin down the drive, which takes much longer to start back up than a microelectronic circuit.
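
The MTBF arithmetic in the list above is easy to check, and it also shows what the rating is better suited for. The Python sketch below converts an MTBF into years of continuous running and into a rough chance of failure within one year. The failure-rate formula assumes a constant failure rate (an exponential model), which is my simplifying assumption for illustration and not necessarily how any particular manufacturer derives its numbers.

```python
import math

HOURS_PER_YEAR = 24 * 365  # 8,760 hours of continuous operation

def mtbf_years(mtbf_hours):
    """Convert an MTBF rating in hours to years of 24x7 running."""
    return mtbf_hours / HOURS_PER_YEAR

def annualized_failure_rate(mtbf_hours, hours_per_year=HOURS_PER_YEAR):
    """Chance that a drive fails within a year, assuming a constant
    failure rate (exponential model). This is an illustrative assumption."""
    return 1 - math.exp(-hours_per_year / mtbf_hours)

for label, mtbf in [("desktop", 500_000), ("enterprise", 1_000_000)]:
    print(f"{label}: {mtbf:,} hours = {mtbf_years(mtbf):.1f} years, "
          f"about {annualized_failure_rate(mtbf):.2%} failing per year")
```

Under that assumption, a 500,000 hour rating means a bit under 2% of a large population of drives failing each year, not that any one drive will run for 57 years. That is why the rating is more useful for comparing models than for predicting the life of a single drive.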

For desktop applications, the drives designed for desktops may be the best choice. They are certainly less expensive and in many cases they are faster for desktop type applications. If you are sensitive to noise, then you can find quieter desktop drives than server or enterprise drives.

There are other types of hard drives that are designed for automotive use, portable devices, appliances and more. Those drives are built for the specific application and environment they are intended to be used in. Heat, vibration, reliability, noise, performance, costs, and access profiles are all factors in designing a hard drive for a particular application. Don't just find the cheapest drive with the most capacity; you may be making a mistake that puts data at unnecessary risk. Even worse, a device or system may be more likely to fail, which could have even more costly consequences.

Corporate IT Considering Online Backup Options

The high upfront costs and ongoing cost of ownership of data backup systems are driving corporate IT to seriously consider online backup systems as viable options. As bandwidth becomes more affordable and backup hardware and software get more expensive to own, the scale has tipped in favor of hosted services.

Large corporations have already made investments in tape and backup hardware, so they may be reluctant to scrap those investments and move to online backup services. However, as backup hardware becomes obsolete and in need of replacement, corporate IT is looking at the alternatives, and hosted offsite backup services are looking more attractive every day. Some shops are keeping their backup hardware in place for now and supplementing their backup programs with online services that store data off-site. Eventually there will be no need to maintain the old systems, because online backup has reached a level of maturity and IT managers are quickly getting comfortable with the security and reliability of online services.

Quoted from http://www.computerworld.com:

Corporate IT warms up to online backup services

February 4, 2008 (Computerworld) Large businesses are looking more closely at online backup options as a way to ease systems administration headaches and avoid security concerns linked to physical backup procedures.