Selecting Which Files to Back Up

With the enormous hard drives that are available today, backing up the entire hard disk drive is usually not a practical option. While there are a few situations where a full backup of the hard drive makes sense, in most cases it is counterproductive. If you are backing up an entire 500+ GB hard drive, then you are probably not doing it very often and you are probably not keeping any historical archives of your data. It just gets to be too time-consuming and expensive to maintain multiple copies of huge data stores.

Of the segment of the population who actually back up their data on a regular basis, most are using some type of incremental backup strategy. Incremental backups copy only the data that has changed since the last backup. Since software program installations now use far more disk space than they did a few years ago, many of us are now using a more selective strategy for backing up our data.
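To make the idea of an incremental backup concrete, here is a minimal Python sketch. It assumes the simplest possible change test, a file's modification time compared against the time of the previous backup; real backup products may track changes differently. The function name and its parameters are hypothetical, chosen only for this illustration.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, dest: Path, last_backup_time: float) -> list[Path]:
    """Copy files under source that changed after last_backup_time into dest.

    last_backup_time is a Unix timestamp (an assumption of this sketch).
    Returns the relative paths of the files that were copied.
    """
    copied = []
    for path in sorted(source.rglob("*")):
        if not path.is_file():
            continue
        # Only files modified since the previous backup are copied.
        if path.stat().st_mtime > last_backup_time:
            relative = path.relative_to(source)
            target = dest / relative
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also preserves the timestamp
            copied.append(relative)
    return copied
```

Files that have not changed since the last run are skipped entirely, which is why an incremental backup finishes so much faster than a full one.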

Since online backup services are quickly replacing tapes, CDs, and other local backup options as the best and most effective way to back up home and small business computer data, it has become more important than ever to use a selective strategy when determining what data needs to be backed up. Besides the obvious advantages of online backup services in the area of automation and cost savings over traditional backup methods, they also store your data securely off-site; the lack of off-site storage has been a major flaw in most small business and home user backup methods in the past. Online backup solutions are best used to selectively back up data.
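A quick back-of-the-envelope calculation shows why selectivity matters so much for online backup. The sketch below assumes a 1 Mbps upload speed and a 20 GB set of documents; both figures are hypothetical and chosen only for illustration, and the calculation ignores compression and protocol overhead.

```python
def upload_hours(size_gb: float, upload_mbps: float) -> float:
    """Hours needed to upload size_gb gigabytes at upload_mbps megabits per second."""
    size_megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = size_megabits / upload_mbps
    return seconds / 3600


# Compare a full-drive upload with a selective one at the assumed speed.
for label, size_gb in [("entire 500 GB drive", 500), ("20 GB of documents", 20)]:
    hours = upload_hours(size_gb, upload_mbps=1.0)
    print(f"{label}: {hours:.0f} hours ({hours / 24:.1f} days)")
```

Under these assumptions the full drive takes roughly 46 days of continuous uploading, while the selected documents take a little under two days. Plug in your own upload speed and data size to get numbers that apply to you.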

Fortunately, most modern operating systems, such as Windows XP, Windows Vista, and Mac OS, store user data files in a disk location separate from the program files and operating system files by default. This makes it much easier for administrators and users to locate their personal and important data that needs to be backed up. But please note: the operating systems only store data in the /Users or /My Documents folders by default, so you may have data in other locations. In many cases the default location for a software program is determined by the settings in that program itself, not the operating system. It is very possible that your applications are configured to store data in non-standard locations. It is also possible, although somewhat unusual, that a software program may store data under the program files folders by default; however, this is not typical for recent versions of software. Another possibility that you have to account for is the fact that you, as a user, can often change the default location where documents will be stored. For example, most programs have a "Save As" function that allows documents to be saved anywhere on the disk.

If you are opting to back up your data files only, which is often the most practical choice, then the burden is on you to make sure that you are backing up the important folders, and to make sure you store new documents in a location that is being backed up. This is not all that difficult to manage. The steps below will help you identify where your data files are being stored:

  1. Make a list of the applications that you use. Then open each application and determine where your data files are being stored. You can either open some of your existing documents in the application, noting the location where they are opened from, or you can save a new document and make note of where the application is saving your new document by default. I recommend you do both.

  2. Use your operating system's search functionality to search for known document types like .doc, .xls, .psd, .ai, .jpg, .mp3, and .pst. Browse through the search results for each file type and determine which files are yours and then look at the folder they are stored in.

  3. Use your backup software to specify that the locations found in the steps above are to be backed up.
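If you are comfortable running a small script, the search in step 2 can be automated. The following Python sketch walks a folder tree and counts, per folder, the files whose extensions match the document types listed above. The function name is hypothetical, and the extension list is only a starting point that you should extend for your own applications.

```python
import os
from collections import Counter
from pathlib import Path

# Extensions taken from the list in step 2; add your own as needed.
DOCUMENT_EXTENSIONS = {".doc", ".xls", ".psd", ".ai", ".jpg", ".mp3", ".pst"}


def find_document_folders(root: Path) -> Counter:
    """Count matching document files per folder under root."""
    folders = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively (.JPG and .jpg both match).
            if Path(name).suffix.lower() in DOCUMENT_EXTENSIONS:
                folders[Path(dirpath)] += 1
    return folders
```

Calling find_document_folders(Path.home()).most_common(20) would list the twenty folders under your home directory that hold the most matching files. Run it against other drives or folders as well, since the whole point of the exercise is to find data stored outside the default locations.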

If you find the steps above to be intimidating, then you may want to enlist the help of a friend or relative who is more computer literate. Chances are that if you are a casual computer user, and your software is reasonably modern, then your data files are all stored in the default locations. It would still be a good idea to verify that your important files are being backed up.


Online Backup for Enterprises

Are large enterprises ready for online backup? Apparently EMC thinks so. They acquired Mozy last year for $76M, and now they are branching out of the consumer and small business market and into the enterprise. This is starting to make some sense. Mozy was positioned as an economy backup solution, but EMC has been in the business of selling high-end solutions to big business. I don't expect them to dump the free and cheap online backup users right away, but you can bet they will be focused on these enterprise deals.

Quoted from http://www.informationweek.com

EMC (NYSE: EMC) is losing no time in taking its acquisition of Berkeley Data Systems' online Mozy backup technology and targeting it at large enterprises. In an announcement Tuesday, EMC unveiled MozyEnterprise as well as a brace of resellers with clout in the enterprise world to market it.

Noting that it plans to build on the Mozy brand's success in the consumer and small business market, EMC said it will begin offering MozyEnterprise with its EMC Fortress data storage platform. EMC acquired Mozy as part of its Berkeley Data Systems buy for $76 million last October.

Accessibility of Offsite Backup Data

It's snowing here in Atlanta today. There is barely enough snow to cover the ground but it may be enough to shut down the city. If you have ever been in a southern city when it snows, then you know what I am talking about. The kids are sitting in front of the television watching the local news, hoping to see their school's name crawl across the bottom of the screen in the list of closed schools and businesses. The city of Atlanta isn't very good at clearing snow from the roads, and the Atlanta drivers are not very good at driving on snow and ice. Everyone is better off just staying off the roads. I don't mind, I can get my work done from home without much problem. That is if the power stays on. Snow and ice tend to make tree limbs break and knock down power lines.

These weather conditions cause me to think about what would happen to a business if they suffered a serious computer failure at their office or data center, or perhaps completely lost access to their data center. Suppose the business had properly backed up their data and stored copies off-site. If the roads are closed and people can't get to work, getting access to offsite backup data could be a serious problem. Suppose your entire city is struck by a disaster such as a hurricane, flood, or earthquake.

There is a simple and relatively inexpensive solution to this problem: use an online backup service. Professionally managed online backup services store data in secure data centers with redundant bandwidth and power. Some of them use multiple data centers to ensure that backup data will be accessible when needed. Even if you are determined to continue to use tape backup systems, you should consider using online backup for some of your most critical data. The data stored on an online backup service will be available in situations where your offsite tape backups may be inaccessible.

Choose Your Online Backup Provider Carefully

When selecting an Online Backup Service, it is critical that you understand how the provider protects your data. Back in September 2007, I posted this: Caution: Online Backup Startups.

"A relatively novice computer owner with a cable modem and fixed IP address can buy a terabyte disk drive and some software and call himself an online backup service. Imagine your backup data that you thought was safe and secure at an online vault is actually stored on someone's Dell home computer in the bonus room above their garage."

Don't just assume that a larger company is the answer to ensuring that your backup data is safe. Large companies providing backup services are only as good as the IT people who are actually operating the backup service. I spent plenty of time consulting with large companies who are mostly interested in doing everything on the cheap so they can maximize profits. It also seems obvious to me that hot new online backup services that are offering huge free accounts or unlimited backup storage for a small fee are aggressively trying to get customers, even if they are losing money on them. These companies are likely positioning themselves to either sell the company, or extract money out of those customers once they are on the hook and have critical mass. You can bet that they will change their business model at some point, whether they sell out or not.

Quoted from http://webworkerdaily.com/

Who Protects Your Cloud Data?

Back in April, we speculated about one of the hidden dangers of depending on web services to store your data: the possibility that no one was doing backups. Now that possibility may have turned to reality for users of Omnidrive (once touted as the “clear leader” in the online storage field by TechCrunch). The service has been offline for some days, with its servers currently not responding at all. A December article at ReadWriteWeb contains serious allegations of fraud from the company’s ex-CTO (as well as a defense from the CEO).

My sympathies at this point are with Omnidrive’s users, particularly those who have their only copies of documents on an unreachable server. I can think of plenty of times when a days-long outage (let alone a permanent loss) of my own document storage would be devastating. The larger question, though, is what you as a user can (or should) do about this? Online document storage is certainly attractive to the web worker; being able to access and share your work easily in any browser is definitely a killer feature. But how do you balance that off against the fact that your documents could simply vanish overnight?

Online Backup and Online Storage are not the same thing. Online backup implies that the files stored at the service provider are backup copies, not the primary copy. Although online backup may be inherently less risky than online storage, the last thing you want to find out when you need to restore your data is that your data is not available.

A few suggestions that may help avoid making the wrong choice when selecting an Online Backup Service:

  • Look for an online backup service provider that has a sustainable business model, with account plans that are priced fairly. Ultra-cheap, free, and unlimited are not sustainable without some other sources of income, such as advertising or selling customer lists.

  • Online backup services that have a primary focus on providing online backup services are more likely to continue to provide good service, as opposed to a company that has multiple business lines where they can refocus their investments on the more profitable lines.

  • Look for clear statements about security and redundancy on their websites. Make sure your data is stored in at least two real data-centers, not in a back-office server room.

  • Be skeptical of any company that does not provide a telephone number on their website.

  • Get a trial account, back up some data for a few days, and then restore a set of files (as large as possible). Make sure you can restore your files quickly and easily. You don't want to find out that restoring data is a problem when you need to recover from a loss.
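For the trial restore in the last suggestion, it helps to check that the restored files are identical to the originals rather than just eyeballing the file names. Here is a minimal Python sketch that compares SHA-256 checksums between an original folder and a restored copy. The function names are hypothetical, and the sketch assumes you restored into a separate folder with the same layout as the original.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, reading it in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(original: Path, restored: Path) -> list[str]:
    """Compare every file under original with its restored counterpart.

    Returns a list of problems; an empty list means every file matched.
    """
    problems = []
    for source_file in sorted(original.rglob("*")):
        if not source_file.is_file():
            continue
        relative = source_file.relative_to(original)
        restored_file = restored / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"differs: {relative}")
    return problems
```

An empty result means the provider gave back exactly what you sent. Keep in mind that files you edited after the backup ran will legitimately show up as different.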

There are a lot of good online backup services to choose from. Many offer good prices. Don't just pick a low-cost provider. Assess the value of your data: your family photos, your business records, your music. Choose a provider that is dedicated to protecting those assets.



Online storage market set to explode, says researcher - Network World

We have been posting about the advantages of online backup for almost two years. It looks like the word is getting out. IDC is now following the rapidly growing Online Backup market. With the expanding availability of fast broadband service to consumers and small businesses, along with the ineffectiveness of most other backup solutions, Online Backup is quickly becoming the first and best solution for most data backup needs.

Quoted from http://www.networkworld.com:

"Consumers and small businesses especially are interested in alternative methods of protecting their data, as traditional backup methods fall short regarding efficiency, reliability, and ease-of-use," said Doug Chandler, research director for storage services at IDC. "Online backup has become an attractive approach for many customers, with the advent of cheaper broadband access, users' greater comfort level with web-based services, and the growing business need for a second site for remote data protection purposes."

Online Backup Providers Opt For Disk Over Tape

Quoted from http://www.enterprisestorageforum.com:

January 10, 2008
By Drew Robb

After years of unfulfilled promise and false starts, online backup has become one of the hottest segments of the storage market, fueling startups and sending the biggest storage vendors on an acquisition spree.

"Online backup is a very nascent market that is fragmented in terms of the kinds of players," said Doug Chandler, an analyst with International Data Corp. (IDC). "There are about a couple of dozen standalone vendors in this field, many of them very small."

Perhaps as a consequence of this startup dynamic, the online backup field reveals a surprising trend. Rather than relying on tape libraries to hold millions of customer files, most of these vendors appear to prefer disk as a medium.

"We feel tape is a legacy technology that really should have no place in data protection today, given advances that disk offers," said Scott Bush, director of marketing at AmeriVault Corp. of Waltham, Mass.

AmeriVault is tape free. So are DS3 DataVaulting of Fairfax, Va., ElephantDrive Inc. of Los Angeles, Remote Backup Systems of Memphis, and many others. Their tape-less inclination could be a sign of things to come in the broader storage market.

"Many of the established companies started with tape, but some of the newer ones back up to disk," said Chandler. "As a result, we are seeing a move away from tape, with tape becoming more of a medium for long-term retention."

Nascent Market

Online backup hasn't merited much attention from analysts up till now. That is about to be remedied, though, as IDC has just released a research report on this branch of the market.

Chandler said the market remains relatively small but is expanding rapidly. IDC expects 33 percent compound annual growth through 2011, reaching $715 million in annual revenues.

While mature markets typically have the top few companies accounting for 60 to 80 percent of the pie, that isn't the case with online backup — although consolidation is already setting in. According to IDC, the biggest players are Iron Mountain (acquired online backup providers Connected and LiveVault), EMC (acquired Mozy), Seagate (acquired Evault) and IBM (acquired Arsenal Digital). Nine-year-old Arsenal survived both the dot-com storage service provider era and the first wave of online backup consolidation that began a few years ago before finally succumbing to IBM's offer. Mozy, on the other hand, has a large customer base but a small revenue stream so far, said Chandler.

Other big companies are eyeing this area with interest.

"Symantec has a beta service for online backup," said Chandler. "Imation is another company with an existing customer base and established channel relationships, so they have a good opportunity to grow rapidly."

The majority of these business models, however, favor disk. DS3 DataVaulting, for example, uses Fibre Channel EMC Clariion for primary disk and backs this up with copies stored on MAID (massive array of idle disks) storage from Copan Systems.

"We maintain three data centers: our primary in Ashburn Va., a secondary for replicated data (gold service) in Allen Tex., and a third in Chantilly, Va., for lower tiers of service," said Stacy Hayes, COO and co-founder of DS3 DataVaulting.com.

Primary backup is in the $5 to $7 per GB stored range based on volume and level of service. Archive is cheaper, at $2 to $3 per GB stored.

Hayes said DS3's main competition is the status quo — companies either not prepared to spend to protect their data or unwilling to change their existing backup habits. She also said she is facing increasing pressure from in-house solutions involving software (such as Double-Take Software and CommVault Systems) and hardware (such as Data Domain and FalconStor Software) combinations.

"Our customer base isn't interested in building and maintaining in-house solutions," said Hayes. "They recognize the value of utility-based (pay for what you use) service."

AmeriVault harnesses tier 3 data centers, which provide fire suppression, climate controls, redundant power, UPS, diesel and security/entrance controls. Customer backups are stored there on replicated RAID arrays and a third copy is sent to a business continuity site provided by SunGard Availability Services of Wayne, Pa., more than 1,000 miles away. This is on disk.

"We deploy a NetApp replication solution to the continuity vault that serves as a failover contingency should our main data center suffer an outage," said Bush.

For premium service (backup data in triplicate), it costs $7 to $17 per GB per month. AmeriVault also offers an economy version that stores one copy off site and is priced at $6 to $12 per GB per month. For archiving, costs drop to about $5 per GB per month.

"We compete with tape vendors and other service providers," said Bush. "We are able to provide automated online backup, secure offsite storage and a solid disaster recovery plan at affordable pricing."

Yet another tapeless outfit is ElephantDrive. It secures data by replicating it among multiple storage pools within its Storage Virtual Network (SVN). The SVN is a collection of storage network nodes — some nodes are high-end redundant arrays, some nodes are redundantly configured JBOD shelves, and some are actually storage services like Amazon's S3.

"Because a core principle of our SVN design is that multiple copies of each object be available at all times, they are all disk-based and we don't employ any tape-based media or any other archiving solution as part of the production application," said Ben Widhelm, CTO of ElephantDrive. "Every data object is available on at least two geographically independent nodes. If a disaster in the vicinity of one of our storage nodes results in the incapacitation of that node, the restore and access requests are automatically routed to alternative nodes."

He admits that all of the top players in the space offer effective backup tools. He said his company differentiates itself by delivering real-time access to all secured objects and the fastest transfer times available.

"Our biggest competitor remains inaction on the part of small and medium-sized enterprises," said Widhelm. "There is still a massive amount of unsecured data, completely vulnerable to a variety of threats ranging from spilled coffee and user error to natural disaster and intentional attack."

The Battle Ahead

As IDC notes and most of the vendors are experiencing, online backup is an area that is just beginning to take off. But take off it will — and soon. How soon probably depends on how well vendors can scale out their environments to deliver fast, cheap and secure services to vast quantities of users.

"Whoever manages to scale this out will be in the driver seat over time," said IDC's Chandler. "EMC wanted Mozy, as it has an infrastructure based on scalability. To deliver in volume, you have to have the right infrastructure."

NextPhase Wireless to Offer Comprehensive Online Backup Solutions for Business Customers

NextPhase is a wireless Internet service provider. They are building a high-bandwidth wireless network to provide Internet connectivity to business and residential customers. They are giving their customers the opportunity to scrap their old, expensive hardware backup solutions and use their Internet bandwidth to safely and securely back up their data to an offsite storage location. See the article below from Fox Business:

Quoted from http://www.foxbusiness.com

ANAHEIM, CA, Jan 09, 2008 (MARKET WIRE via COMTEX) -- NextPhase Wireless, Inc. (OTCBB: NPHS ), a nationwide developer of WiMAX-ready networks and provider of advanced wireless broadband solutions, today announced that it is launching a comprehensive suite of robust, secure, scalable and cost-effective online backup services for businesses of all sizes.

"The ever-increasing deluge of information that companies require to run their businesses, and the mission-critical nature of that information, means that it is essential that they adopt comprehensive data backup and recovery strategies to ensure data integrity, reliability and business continuity in the event of a disaster. Sadly, too many companies wait until they experience significant data loss before they realize that they even need a backup strategy, or worse, determine that what they had in place was insufficient and that they're unable to recover the lost information," stated Tom Hemingway, Chairman and COO of NextPhase.

"Traditionally, tape media has been used for data backups. However, this medium is relatively slow and, being hardware-based, introduces potential points of failure (e.g. dirty heads, tape wears out, etc.). In addition, tape-based technologies cannot scale to meet rapidly growing tomes of information that requires backing up, and so businesses -- irrespective of size, require a different solution. Fortunately, many of these companies already have a business-class Internet connection which can be utilized to rapidly and cost-effectively deploy robust, secure and scalable online backup services that easily configure to meet their specific needs," added Hemingway.

Exchange 2007 Local Continuous Replication

Local Continuous Replication (LCR) is a database replication tool new to Exchange Server 2007. It is built into Exchange 2007 and replicates the Storage Group databases to a backup location. The backup location needs to be local to the server where the "Mailbox" role is hosted. The backup location can be another disk in the same computer, or an externally attached storage system. You can also replicate the databases to the same disk as the live Storage Group databases, but that does not provide as much value.

LCR asynchronously replicates all changes to the primary databases by sending transaction log files to the backup location as changes occur. The transactions in the log files are then applied to the backup databases, keeping them virtually in sync with the primary databases. LCR is very similar to SQL Server Log Shipping, except it is not intended to replicate to remote systems.
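The log shipping idea is easier to picture with a toy model. The Python sketch below is not Exchange code and does not use any Exchange API; every class and method name is hypothetical. It only illustrates the general pattern described above: the primary records each change in a log, the log is copied to the backup location, and the copy catches up by replaying it. For simplicity it ships individual log records rather than whole log files.

```python
class Database:
    """A toy key-value database that changes only by applying log records."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log records applied so far

    def apply(self, record):
        key, value = record
        self.data[key] = value
        self.applied += 1


class ReplicatedStore:
    """Toy model of asynchronous log shipping to a local backup copy."""

    def __init__(self):
        self.primary = Database()
        self.copy = Database()
        self.log = []      # transaction log written by the primary
        self.shipped = []  # log records already copied to the backup location

    def write(self, key, value):
        record = (key, value)
        self.log.append(record)  # the change is logged first
        self.primary.apply(record)

    def ship_logs(self):
        """Copy log records that have not been shipped yet."""
        self.shipped.extend(self.log[len(self.shipped):])

    def replay(self):
        """Apply shipped records that the copy has not seen yet."""
        for record in self.shipped[self.copy.applied:]:
            self.copy.apply(record)


store = ReplicatedStore()
store.write("inbox/1", "hello")
store.write("inbox/2", "world")
# Replication is asynchronous: the copy lags until logs are shipped and replayed.
store.ship_logs()
store.replay()
```

Because shipping and replay happen after the write, the copy can briefly trail the primary, which is why the text above says "virtually in sync" rather than exactly in sync.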

While LCR is not a replacement for regular backups, it can have a significant impact on your backup and recovery strategy for Exchange 2007. Since your data is being replicated continuously, the need for daily backups of your Exchange 2007 databases is reduced for most organizations. With LCR your data is protected against hard drive failures and disk corruption. In the event of a loss or corruption of your primary database, Exchange 2007 can be manually switched to the backup database within minutes with little or no data loss. Taking regular full backups and storing backup copies off site is still very necessary, but you may find that your organization has reduced requirements for daily backups.


See Local Continuous Replication for more information.

Windows Home Server: Data Backup

Microsoft's Windows Home Server is designed for homes that have multiple computers. Besides file sharing, security and organization of data, it has some great capabilities for backing up and protecting data.

Quoted from http://www.microsoft.com:

Windows Home Server: Frequently Asked Questions

Windows Home Server provides automated “no touch” backup for all of the PCs in the home. It will automatically backup your home computers, including the operating system, applications and data, and allow you to easily restore the entire computer or an individual file or folder to a previous point in time. The Windows Home Server Backup solution uses an innovative system to backup only the data that has not already been backed up before. Even if you have several copies of the same data on different computers, the data is backed up only once and Windows Home Server keeps track of what data was stored on each home computer on each day. This makes it very efficient in terms of the time it takes for backups and the amount of space that is used on Windows Home Server.
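The "backed up only once" behavior described in the FAQ can be illustrated with a small content-hashing sketch in Python. This is not how Windows Home Server is implemented internally (the FAQ does not say), and all names here are hypothetical. The sketch assumes whole-file hashing for simplicity: identical content is stored once, while a catalog remembers which files each computer had on each day.

```python
import hashlib


class SingleInstanceStore:
    """Toy backup store that keeps one copy of each distinct file content."""

    def __init__(self):
        self.blobs = {}    # content hash -> bytes, stored only once
        self.catalog = {}  # (computer, day) -> {file path: content hash}

    def backup(self, computer, day, files):
        """Back up files (a dict of path -> bytes); return how many new blobs were stored."""
        new_blobs = 0
        snapshot = {}
        for path, content in files.items():
            digest = hashlib.sha256(content).hexdigest()
            if digest not in self.blobs:
                # Content never seen before on any computer: store it.
                self.blobs[digest] = content
                new_blobs += 1
            snapshot[path] = digest
        self.catalog[(computer, day)] = snapshot
        return new_blobs

    def restore(self, computer, day, path):
        """Return the content of path as it was on computer on the given day."""
        return self.blobs[self.catalog[(computer, day)][path]]
```

If two computers hold the same photo, the second backup stores nothing new for it, yet either computer can still restore its own copy from the shared blob.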

Just Sit Back and Relax, Don't Worry About Your Data

Data backup and recovery is an incredibly boring subject to most people. On any given day, there are a lot more exciting things to do and talk about. There are some ways to avoid the boredom.

You can sit back and relax and just not worry about data. Why make yourself sick over worrying about all the bad things that can happen to your data? Just ignore it; say it will never happen to you. This reminds me of the smoker who claims he knows a guy who lived to be 101 years old and smoked two packs a day. I am not sure if he thinks he will be just as lucky, or if he is just saying "the hell with it, I am going to live the way I want to."

Sure, you can find examples of smokers who don't die of cancer or suffer from some other smoking-related ailment, just like you can find people who have never lost their data. But just like the smoker, if you don't back up your data, odds are that you will eventually pay a price. The big difference is that you can protect your data without making a major sacrifice, or even changing your habits.

It only takes a few minutes to sign up for an online backup service, and it only costs a few bucks a month. You can still sit back and relax, and you don't have to worry about your data. When that hard drive does meet its demise, your family photos, documents, letters, music files and all of your valuable data will be easily recovered.

A Useful Tip for Software

While I have always recommended storing copies of software installation media in a safe off-site location, that advice is becoming a little dated. I recently rebuilt my computer on new hardware and on a new operating system (Windows Vista Ultimate Edition, 64-bit). Once the operating system was up and running, the next task was to install all of the software that I use on a daily basis. Something is different now. I rarely use CDs and DVDs. I find myself downloading the software from the vendors' websites so that I will have the latest and greatest version. I usually need my license keys and sometimes some login credentials to the vendor's website.

I keep a paper file with copies of all of my receipts, license agreements, keys, serial numbers and anything else that I may need to prove I have a license and can install the software. I have come to the conclusion that this paper file is a vulnerability. If my office burns, my paper files burn. I can get a new computer and restore all of my data from my online backup service provider. While I have backup copies of my software media, I am more likely to download the latest versions. But, if my office is destroyed, so is a lot of information that I would want and need to get my software installed and up to date.

Now I scan my software license documentation into PDF files. I also have a spreadsheet with a list of my software, the vendor's website URL, my login credentials and my license keys. These files are automatically backed up off-site every time they change. This seems to be a much more effective and efficient way to ensure that my software will not be lost than actually making copies of the disks and physically taking them home with me (occasionally).

Full Disk Backup vs Data Backup

There are two distinct types of full-disk backups: one is a backup of every file on the disk, and the other is an image backup of your disk. The biggest difference comes at restore time. Let's take a few minutes and look at the advantages and disadvantages of the two types of full-disk backups and also data-only backups.

Disk Image Backup

Advantages: Can completely restore your entire hard drive in the event of a failure. Disk image backup software often provides a method for creating bootable media. This is the fastest way to get your computer back up and running after a catastrophic hard drive failure.

Disadvantages: Media requirements can be excessive, especially with the large hard drives that are available today. It can also be very difficult to keep your disk image backups current because of the time needed to backup the entire hard drive. However, some image backup software allows for the images to be updated incrementally. This saves time, but it can create multiple physical media sets that must be kept up with in order to do a complete restore. While restoring an entire disk image is an excellent recovery option when the disk drive itself fails, it is not so useful for recovering from a destroyed or stolen computer. Disk images need to be restored to the same or very similar computer hardware. If your computer needs to be replaced, it is highly unlikely that you will end up with the same hardware.

Full-Disk Backups

Advantages: Every file is backed up. Very little effort or expertise is needed when deciding what to backup.

Disadvantages: Media requirements can be excessive, especially with large hard drives. System files and program files are often in use or locked, so backing them up causes errors or other problems. Restoring system files and program files is usually not recommended because components get out of sync, causing unpredictable results.

Data-Only Backups

Advantages: Significantly less media requirements. Faster than full-disk or image backups. Can be easily used to recover important data after replacing a hard drive or an entire computer.

Disadvantages: After a hard drive or computer replacement, the operating system and programs need to be reinstalled before the data backups can be used. The correct files must be selected for backup; however, this is less of a problem on modern operating systems because they store documents in a common location by default (My Documents).


Conclusion

Both types of full-disk backups require significant media and therefore more time to create and keep updated. As a result, full-disk backups are usually not performed as often as data-only backups. In the event of a catastrophe, it is likely that the full-disk backup will be outdated, requiring more current data to be restored after the restoration of the disk itself. This multi-step process and the additional effort required to make the backups and store them offsite defeat part of the purpose of using them in the first place.

Data-only backups are fast and can easily be completely automated and stored offsite with a good online backup service. Data-only backups require extra effort to reinstall the operating system and programs, but the result is usually a very clean and trouble free system after the recovery. If you are using data-only backups, be sure to keep copies of your operating system and software media in a place where you can access them in case of an emergency.