CLIENT: BakBone Software
October 2004: Dell Power Solutions
By deploying virtual disk libraries (VDLs), IT organizations can take advantage of high-speed disk technologies for backup processes, while benefiting from the portability and security of tape for long-term data retention and disaster recovery. This article demonstrates how using BakBone® NetVault tape backup and restore software with VDLs can help organizations make effective use of tape media, achieve quick recovery, and back up data within short backup windows.
Virtual disk libraries (VDLs), which behave much like tape libraries, offer an efficient new backup media paradigm for organizations that use Dell hardware. When using VDLs, tape remains a strategic component of an organization's data protection strategy, but it is no longer the major element of that strategy. IT staff can create multiple, duplicate copies of backup jobs from the VDL to tape, or vice versa. Storing backup data on a VDL also allows data copy jobs to be run offline so they have little or no impact on the network, application servers, or workstations. Administrators can also set up specific backup policies such as retention dates, rotation schemes, and media groups for VDLs.
Because they are disk-based rather than mechanical tape devices, VDLs are immune to the mechanical difficulties of tape backup over a network. One example of such an issue is shoe shining, a scenario in which a tape drive repeatedly stops, backs up, and repositions over the same stretch of tape while it waits for the host to send more data. Another example is a host with a slow data stream, which is not a problem for a VDL because it can capture data in "drips" or "blasts," then copy it to virtual media slots as a save set. These advantages can help administrators streamline backup processes.
The mechanics of VDLs
A VDL consists of directories called drives and slots on a disk. These drives and slots each contain numbered directories, which are unique for each drive or slot. The media file that resides in each numbered slot directory represents a "tape" in the virtual library, while a media file in each numbered drive directory represents a virtual disk.
BakBone NetVault tape backup and recovery software treats VDLs as if they were physical libraries. The more drives the VDL contains, the more simultaneous backups can be performed. Each VDL is usually configured with a minimum of eight slots, but it will always have many more slots than drives. These extra slots allow for proper handling of backup retention cycles. In addition, different operating systems may impose limits on the maximum file size, which can affect the number of slots needed for handling the data. When the NetVault VDL is configured for Dell servers and the number of slots and the media capacity are defined, media files are created and the space is pre-allocated in the VDL.
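The layout described above can be pictured with a small sketch. This is only an illustration of numbered drive and slot directories with media files sized up front; the directory names, the `media` file name, and the `create_vdl` helper are all hypothetical and are not NetVault's actual on-disk format or API.

```python
import os
import tempfile


def create_vdl(root, drives, slots, media_bytes):
    """Build a toy VDL layout: numbered drive and slot directories under root.

    All directory and file names here are hypothetical, chosen for illustration.
    """
    for n in range(1, drives + 1):
        os.makedirs(os.path.join(root, "drives", str(n)))
    media_files = []
    for n in range(1, slots + 1):
        slot_dir = os.path.join(root, "slots", str(n))
        os.makedirs(slot_dir)
        media_path = os.path.join(slot_dir, "media")
        with open(media_path, "wb") as f:
            # truncate() sets the file length up front; on many file systems
            # this yields a sparse file rather than reserving real blocks.
            f.truncate(media_bytes)
        media_files.append(media_path)
    return media_files


with tempfile.TemporaryDirectory() as root:
    # Small media size so the sketch runs quickly; real media would be far larger.
    media = create_vdl(root, drives=4, slots=15, media_bytes=1024 * 1024)
    drive_count = len(os.listdir(os.path.join(root, "drives")))
    slot_count = len(os.listdir(os.path.join(root, "slots")))
    media_size = os.path.getsize(media[0])
    print(drive_count, slot_count, media_size)
```

Each slot's media file plays the role of one "tape," so the slot count times the media size gives the total space the library claims on disk.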
Administrators may also install an optional Application Plugin Module (APM) for Oracle®, MySQL, Sybase, PostgreSQL, IBM® Informix, and various other applications. These modules automatically add application-specific components to the backup and restore selection criteria that appear on the NetVault graphical user interface (GUI). From this common GUI, administrators can manage all backup and restore operations across a storage area network (SAN), network attached storage (NAS), wide area network (WAN), or local area network (LAN).
Advantages of VDL staging
VDL staging can be useful in two types of situations. First, if a company has a huge file system with millions of files, a typical server might not be able to read the files fast enough to stream data to today's high-performance tape drives. This can lead to shoe shining and premature drive or media failures. Because shoe shining occurs only with tape storage, not disk storage, the performance degradation it causes is not an issue when backing up to a VDL.
Another case is a backup window that is too small to back up several clients onto a limited number of tape drives. In this case, a VDL with enough virtual drives could back up all clients simultaneously. Performance in this situation would depend on network bandwidth; a Gigabit Ethernet network would likely be required to handle a heavy backup load. For example, suppose an administrator must back up five clients, each with 10 GB of data, within one hour, but has only one tape drive performing at 18 GB/hour. Writing the 50 GB total directly to that drive would take nearly three hours. Using VDL staging, the administrator could define enough virtual tape drives to back up all the clients to the VDL simultaneously within the allotted hour, then copy the backups to physical tape afterward.
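The arithmetic behind this example can be checked with a short sketch. It uses only the figures given above (five clients, 10 GB each, one drive at 18 GB/hour) and assumes decimal units (1 GB = 1000 MB); the variable names are illustrative.

```python
clients = 5
gb_per_client = 10
tape_rate_gb_per_hour = 18
window_hours = 1

total_gb = clients * gb_per_client

# Time to write everything serially to the single physical tape drive.
single_drive_hours = total_gb / tape_rate_gb_per_hour

# Aggregate rate the network and VDL must sustain to finish inside the window.
required_gb_per_hour = total_gb / window_hours
required_mb_per_s = required_gb_per_hour * 1000 / 3600

# Theoretical line rates, ignoring protocol overhead.
fast_ethernet_mb_per_s = 100 / 8    # 100 Mbit/s
gigabit_mb_per_s = 1000 / 8         # 1 Gbit/s

print(f"single tape drive: {single_drive_hours:.2f} hours")
print(f"required aggregate rate: {required_mb_per_s:.1f} MB/s")
print(f"fits Fast Ethernet: {required_mb_per_s <= fast_ethernet_mb_per_s}")
print(f"fits Gigabit Ethernet: {required_mb_per_s <= gigabit_mb_per_s}")
```

Under these assumptions the required aggregate rate is just above what a 100 Mbit/s link could carry even in theory, which is consistent with the article's point that Gigabit Ethernet would likely be needed.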
Staging versus multiplexing
Multiplexing is another approach for backing up multiple clients to limited tape drives within short backup windows.
In multiplexing, multiple streams of backup data are sent to one tape device. This method has several drawbacks. First, a backup of any given client will use more tape than is actually required; therefore, this approach may necessitate handling multiple tapes per client backup. Because it requires more media, the probability of a media failure is higher. The time required for restoring data is also longer because more tape needs to be scanned for a given restore and data must be reconstructed from multiple data streams. Multiplexing also uses more CPU resources on the backup server, because data streams must be reorganized and packed into a multiplexed stream.
Although the VDL staging approach requires extra disk space for the virtual library resource allocation, each client's backups are contiguous on tape, which helps to conserve media and enable fast restore jobs.
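The difference in tape layout can be illustrated with a toy model. This sketch assumes a simple round-robin interleave of equal-sized blocks, which is a simplification; real multiplexing depends on how data arrives from each client. The function names are illustrative and not part of any backup product.

```python
from itertools import zip_longest


def multiplex(streams):
    """Interleave blocks round-robin from several client streams onto one tape."""
    tape = []
    for group in zip_longest(*streams.values()):
        tape.extend(block for block in group if block is not None)
    return tape


def staged(streams):
    """Write each client's blocks contiguously, one client after another."""
    return [block for blocks in streams.values() for block in blocks]


def blocks_scanned(tape, client):
    """Length of tape between a client's first and last block, inclusive."""
    positions = [i for i, (owner, _) in enumerate(tape) if owner == client]
    return positions[-1] - positions[0] + 1


# Three clients, four blocks each; a block is (client name, sequence number).
streams = {name: [(name, i) for i in range(4)] for name in ("A", "B", "C")}

mux_span = blocks_scanned(multiplex(streams), "A")
staged_span = blocks_scanned(staged(streams), "A")
print(mux_span, staged_span)
```

In this model, restoring client A from the multiplexed tape means passing over other clients' blocks between each of A's blocks, while the staged layout keeps A's data in one contiguous run.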
A VDL staging example
The following example illustrates an enterprise backup scenario using VDL staging. Acme Engineering, Inc. has approximately 250 employees in several functional units: Finance, Engineering, and Sales and Marketing. The company has worldwide operations, with corporate headquarters in the United States and sales offices in Europe and Asia.
The Acme IT infrastructure includes:
Network configuration. The NetVault backup server is running the Red Hat Enterprise Linux ES 3 OS and hosted on a rack-mounted Dell PowerEdge 2650 server connected to the Gigabit Ethernet LAN backbone. Attached to the NetVault backup server is a Dell/EMC AX100 storage array with 3 TB of disk capacity and a Dell PowerVault 136T tape library configured with three Linear Tape-Open 2 (LTO-2) tape drives and 72 slots. The AX100 storage array is configured within NetVault as a VDL with four drives and 15 slots for disk-to-disk (D2D) backup.
NetVault clients are installed and configured on each of the four Dell PowerEdge 2600 servers and the Dell PowerVault 775N NAS server. A NetVault SmartClient is installed on the CRM application server because of the size (1.3 TB) of the underlying Oracle® database. The SmartClient allows the client to write directly to the SAN-attached storage media (VDL and tape library). Additionally, NetVault APMs for Microsoft Exchange, Oracle, and MySQL are installed and configured on the appropriate application servers to address the requirement for hot backups of these applications, that is, backups performed while the applications are up and running.
In this example, Acme has 5 TB of data in its environment that changes an average of 10 percent (500 GB) per day. The PowerVault 775N NAS file server has 2 TB of storage containing more than two million files. The backup window is limited to just six hours (between 10 p.m. and 4 a.m.) because the company has operations in multiple time zones.
Acme's D2D2T backup policy. Acme uses a disk-to-disk-to-tape (D2D2T) backup strategy:
Backup policy considerations. The backup policies deployed in the Acme example were driven by several key considerations. The first was the limited backup window. The sheer volume of data prevented Acme from conducting full nightly backups, which led to incremental backups Monday through Friday and full backups on Saturday. The result was an average of 500 GB backed up each night and more than 5 TB each weekend.
Judged solely on network bandwidth capacity, the number of physical tape drives, and the write speeds of the LTO-2 drives, Acme's backup needs would not have dictated a D2D2T data protection strategy. Nevertheless, several factors drove Acme to implement a VDL solution in which incremental backups were written to disk and later transferred to physical tape.
One factor concerned the characteristics of the data being backed up on the PowerVault 775N NAS server. Because of the large number of very small files, it was difficult to keep the LTO-2 tape drives streaming. The impact was long backup times and significant wear and tear on both tape media and drives because of shoe shining. Writing these files to disk by using VDL staging addressed both issues.
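A rough calculation shows why raw throughput alone would not have forced the move to disk. The nightly volume, window, and drive count come from the example above; the per-drive rate of 30 MB/s is an assumed sustained figure for an LTO-2 drive, not a number from the article, and decimal units are assumed throughout.

```python
nightly_gb = 500          # average incremental volume per night
window_hours = 6          # 10 p.m. to 4 a.m.
tape_drives = 3           # LTO-2 drives in the PowerVault 136T

# ASSUMPTION: sustained native rate per drive in MB/s (not from the article).
assumed_drive_mb_per_s = 30

required_mb_per_s = nightly_gb * 1000 / (window_hours * 3600)
per_drive_mb_per_s = required_mb_per_s / tape_drives
aggregate_tape_mb_per_s = tape_drives * assumed_drive_mb_per_s
tape_is_sufficient = aggregate_tape_mb_per_s >= required_mb_per_s

print(f"required: {required_mb_per_s:.2f} MB/s overall, "
      f"{per_drive_mb_per_s:.2f} MB/s per drive")
print(f"tape alone sufficient on paper: {tape_is_sufficient}")
```

The per-drive requirement is well below the assumed drive rate, and that gap is itself part of the problem: a host that feeds a drive much more slowly than the drive can write is the condition under which shoe shining appears.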
A second important consideration for using a VDL was the restore pattern for this data. The patterns showed that nearly 80 percent of restore requests were associated with the data that had been backed up from the PowerVault 775N NAS server. Of those restore requests, 80 percent occurred within four days of the last access of the file. Therefore, keeping at least four days of data on the VDL resulted in much faster restore times compared to tape. These fast restore times helped increase user productivity. The 3 TB VDL on the AX100 storage array allowed more than a week of incremental backup data to be stored on disk, which further increased the likelihood that restore requests could be serviced directly from disk.
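The figures in this paragraph combine as follows. The sketch uses only numbers stated in the example and makes two simplifying assumptions: the full 3 TB is available for incremental data, and 1 TB = 1000 GB.

```python
nas_share_of_restores = 0.80      # restores that target NAS data
within_four_days_share = 0.80     # of those, requested within four days

vdl_capacity_gb = 3000            # 3 TB on the AX100
nightly_incremental_gb = 500      # average nightly change
incrementals_per_week = 5         # Monday through Friday

# Share of ALL restore requests that four days of NAS data on disk can serve.
served_from_disk = nas_share_of_restores * within_four_days_share

# Number of nightly incrementals the VDL can hold at once.
nights_on_disk = vdl_capacity_gb / nightly_incremental_gb

print(f"restores served from disk: {served_from_disk:.0%}")
print(f"nightly incrementals held on the VDL: {nights_on_disk:.0f}")
print(f"more than one week's worth: {nights_on_disk > incrementals_per_week}")
```

So roughly two thirds of all restore requests fall into the group that four days of disk-resident NAS backups can satisfy, and the VDL has room for more nights than the five incrementals taken in a week.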
Consolidation of file system backups
Combining a backup-to-disk approach with the NetVault Consolidated backup feature lets administrators create a synthetic full backup without actually running a full backup each week. Initially, one full backup to serve as the base must be run; but once this full backup has been created, all future full backups can be performed by combining the previous incremental backups kept on disk and the last synthetic full backup written to tape.
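The idea of building a full backup from earlier backups can be sketched generically. This is not NetVault's implementation; it is a minimal model in which a backup is a mapping from file path to file version, and each incremental records changed and deleted paths. The `consolidate` function and the record layout are assumptions made for illustration.

```python
def consolidate(last_full, incrementals):
    """Merge incrementals, oldest first, onto a copy of the last full backup."""
    synthetic = dict(last_full)
    for inc in incrementals:
        # Newer versions of changed or newly created files replace older ones.
        synthetic.update(inc["changed"])
        # Files deleted since the previous backup drop out of the new full.
        for path in inc["deleted"]:
            synthetic.pop(path, None)
    return synthetic


last_full = {"/a.txt": "v1", "/b.txt": "v1", "/c.txt": "v1"}
incrementals = [
    {"changed": {"/a.txt": "v2"}, "deleted": []},
    {"changed": {"/d.txt": "v1"}, "deleted": ["/b.txt"]},
    {"changed": {"/a.txt": "v3"}, "deleted": []},
]

synthetic_full = consolidate(last_full, incrementals)
print(synthetic_full)
```

The merge reads only backup data that already exists on disk and tape, so the clients being protected are not contacted while the new full backup is assembled.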
The consolidated backup feature eliminates the need to run weekly, resource-intensive full backups, which tend to consume large amounts of network bandwidth (particularly if backing up across the LAN) and server bandwidth that may be put to better use elsewhere. In addition, consolidated backups do not consume production system resources (network or application server bandwidth), thus allowing full backups to be run at any time of the day without an adverse effect on production operations. Although consolidated backups consume backup server, VDL, and tape library resources, these resources are typically not needed during normal business hours. This arrangement enables consolidated full backups to be run easily during business hours, when IT staff is more likely to be available to monitor progress and help ensure good backups.