The many complications and risks of tape

Magnetic tape technology was adopted for backup many years ago because it met most of the physical storage requirements, primarily by being portable so that it could be transported to an off-site facility. This gave rise to a sizeable ecosystem of related backup technologies and services, including tape media, tape drives, autoloaders, large scale libraries, device and subsystem firmware, peripheral interfaces, protocols, cables, backup software with numerous agents and options, off-site storage service providers, courier services, and a wide variety of consulting practices to help companies of all sizes understand how to implement and use it all effectively.

Tape media
Tape complexity starts with its physical construction. In one respect, it is almost miraculous that tape engineers have been able to design and manufacture media that meets so many challenging and conflicting requirements. Magnetic tape is a long ribbon of multiple laminated layers, including a microscopically jagged layer of extremely small metallic particles that record the data and a super-smooth base layer of polyester-like material that gives the media its strength and flexibility. It must be able to tolerate being wound and unwound and pulled and positioned through a high-tension alignment mechanism without losing the integrity of its dimensions. Manufacturing data grade magnetic tapes involves sophisticated chemistry, magnetics, materials, and processes.

Unfortunately, there are many environmental threats to tape, mostly because metals tend to oxidize and break apart. Tape manufacturers are moving to increase the environmental range that their products can withstand, but historically, they have recommended storing them in a fairly narrow humidity and temperature range. There is no question that the IT teams with the most success using tape take care to restrict its exposure to increased temperatures and humidity. Also, as the density of tape increases, vibration during transport has become a factor, resulting in new packaging and handling requirements. Given that tapes are stored in warehouses prior to being purchased and that they are regularly transported by courier services and stored off-site, there are environmental variables beyond the IT team’s control—and that makes people suspicious of its reliability.

Tape’s metallic layer is abrasive to tape recording heads and constantly causes wear and tear to them. Over time the heads wear out, sometimes much faster than expected. It can be very difficult to determine if the problem is head wear, tape defects, or dirty tape heads. Sometimes the only remedy is to replace both the tape heads and all the tapes. The time, effort, and cost involved in managing wear-and-tear issues can be a sizeable burden on the IT group with no possible return on that investment to the organization. Tape aficionados are very careful about the tapes they buy and how they care for them, but many IT leaders no longer think it is worthwhile to maintain tapes and tape equipment.

Media management and rotation
Transporting tapes also exposes them to the risk of being lost, misplaced, or stolen. The exposure to the organization from lost tapes can be extremely negative, especially if they contain customer account information, financial data, or logon credentials. Businesses that have lost tapes in transit have not only had to pay for extensive customer notification and education programs, but they have also suffered the loss of reputation.

Backup software determines the order that tapes are used, as well as the generation of tape names. Unfortunately, tapes are sometimes mislabeled, which can lead to incomplete backup coverage and make restores and recoveries more challenging. It sounds like a simple problem to solve, but when you consider that multiple tapes may have been used as part of a single backup job and that some tapes (or copies of tapes) are off site and cannot be physically checked, it turns out that there is not always a fast way to clear up any confusion.

Tape rotation is the schedule that is used by backup software to determine which tapes should be used for the next backup operation. If an administrator loads the wrong tape in a tape drive, the backup software may not run, which means new data is not protected.

Conversely, the backup software may choose to overwrite existing data on the tape, making it impossible to recover any of it. A similar problem occurs when a backup administrator erroneously deletes tape records from the backup system’s database or erases the wrong tapes. Backup only works correctly when the database used to track data on tape accurately reflects the data that is recorded on tapes.
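The two failure modes above (a skipped backup and an accidental overwrite) can be illustrated with a small sketch. This is not how any particular backup product works; the `TapeRecord` type, the `retain_until` field, and the `check_loaded_tape` function are hypothetical names invented for this example. It assumes the backup catalog records a retention date for every tape label.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class TapeRecord:
    label: str
    retain_until: date  # hypothetical: data on the tape must be kept until this date


def check_loaded_tape(expected_label, loaded_label, catalog, today):
    """Decide whether the tape in the drive is safe to write to.

    `catalog` maps tape labels to TapeRecord entries (an assumed structure).
    """
    if loaded_label != expected_label:
        # The rotation schedule asked for a different tape: the job does not run.
        return "wrong tape: backup skipped"
    record = catalog.get(loaded_label)
    if record is None:
        # The catalog has no record of this tape, so its contents are unknown.
        return "unknown tape: not in catalog"
    if record.retain_until > today:
        # Writing now would destroy data that is still inside its retention period.
        return "protected: would overwrite retained data"
    return "ok"
```

The check is only as good as the catalog behind it: if a tape record is deleted or a label is wrong, the function has nothing accurate to compare against.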

These sorts of problems are well-known to backup administrators and are more common than one might think. Backup administration and tape management tend to be repetitive, uninteresting work, which sets the stage for operator oversights and errors. This is the reality of tape backup, and it is why automated data protection with the Microsoft HCS solution is such an important breakthrough. It removes the responsibility for error-prone processes from people who would rather be doing something else. When you look at all the problems with tape, it is highly questionable as an infrastructure technology. Infrastructures should be dependable above all else and yet, that is the consistent weakness of tape technology in nearly all its facets.

Synthetic full backups
An alternative to making full backup copies is to make what are called synthetic full copies, which aggregate data from multiple tapes or disk-based backups onto a tape (or tapes) that contains all the data that would be captured if a full backup were to be run. They reduce the time needed to complete backup processing, but they still consume administrative resources and suffer from the same gremlins that haunt all tape processes.
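As a rough illustration of the idea, the sketch below builds a synthetic full copy by merging the last full backup with the incremental backups taken after it. It is a simplified model, not a description of any backup product: backups are represented as plain dictionaries mapping file paths to contents, and the convention that `None` marks a deleted file is an assumption made for this example.

```python
def synthesize_full(last_full, incrementals):
    """Merge a full backup with later incrementals, applied oldest first.

    Each backup is a dict of {path: content}. A value of None in an
    incremental means the file was deleted (an assumed convention).
    """
    synthetic = dict(last_full)  # start from the full copy; leave the original untouched
    for incremental in incrementals:
        for path, content in incremental.items():
            if content is None:
                synthetic.pop(path, None)  # file no longer exists
            else:
                synthetic[path] = content  # newer version replaces the older one
    return synthetic
```

No data is read from the protected servers in this process, which is where the time savings come from, but every input backup still has to be located, mounted, and readable.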

The real issue is why it should be necessary to make so many copies of data that have already been made so many times before. Considering the incredible advances in computing technology over the years, it seems absurd that more intelligence could not be applied to data protection, and it highlights the fundamental weakness of tape as a portable medium for off-site storage.

Restoring from tape
It would almost be comical if it weren’t so vexing, but exceptions are normal where recovering from tape is concerned. Things often go wrong with backup that keep it from completing as expected. It’s never a problem until it’s time to recover data and then it can suddenly become extremely important in an unpleasant sort of way. Data that was skipped during backup cannot be recovered. Even worse, tape failures during recovery prevent data from being restored.

Unpleasant surprises tend to be just the beginning of a long detour where restores are concerned. Fortunately, there may be copies from earlier backup jobs that are available to recover. Unfortunately, several weeks or months of data could be lost. When this happens, somebody has a lot of reconstruction work to do to recreate the data that couldn’t be restored.

One thing to expect from disaster recovery is that more tapes will need to be used than assumed. Another is that two different administrators are likely to vary the process enough so that the tapes they use are different—as well as the time they spend before deciding the job is done, which implies the job is never completely finished. Most people who have conducted a disaster recovery would say there was unfinished business that they didn’t have time to figure out. Their efforts were good enough—they passed the test—but unknown problems were still lurking.

Source of Information : Rethinking Enterprise Storage

The inefficiencies and risks of backup processes

If cloud storage had existed decades ago, it’s unlikely that the industry would have developed the backup processes that are commonly used today. However, the cloud didn’t exist, and IT teams had to come up with ways to protect data from a diverse number of threats, including large storms, power outages, computer viruses, and operator errors. That’s why vendors and IT professionals developed backup technologies and best practices, to make copies of data and store them off site in remote facilities where they could be retrieved after a disaster. A single “backup system” is constructed from many different components that must be implemented and managed correctly for backup to achieve its ultimate goal: the ability to restore the organization’s data after a disaster has destroyed it.

Many companies have multiple, sometimes incompatible, backup systems and technologies protecting different types of computing equipment. Many standards were developed over the years, prescribing various technologies, such as tape formats and communication interfaces, to achieve basic interoperability. Despite these efforts, IT teams have often had a difficult time recognizing the commonality between their backup systems. To many, it is a byzantine mess of arcane processes.

Technology obsolescence is another difficult aspect of data protection. As new backup storage technologies are introduced, IT teams have to manage the transition to those technologies as well as retain access to data across multiple technologies. This tends to be more problematic for long-term data archiving than backup, but it is a consideration that weighs on IT teams nonetheless.

Disaster recovery is the most stressful, complex undertaking in all of IT. Recreating replacement systems from tape backups involves many intricate details that are very difficult to foresee and plan for. Doing this without the usual set of online resources is the ultimate test of the IT team’s skills, a test with a very high bar and no chance for a retry. Most IT teams do not know what their own recovery capabilities are; for example, how much data they could restore and how long it would take. When you consider how much time, money, and energy has been invested in backup, this is a sad state of affairs for the IT industry. Data growth is only making the situation worse.


Hybrid cloud storage architecture

Hybrid cloud storage overcomes the problems of managing data and storage by integrating on-premises storage with cloud storage services. In this architecture, on-premises storage uses the capacity on internal SSDs and HDDs, as well as on the expanded storage resources that are provided by cloud storage. A key element of the architecture is that the distance over which data is stored is extended far beyond the on-premises data center, thereby providing disaster protection. The transparent access to cloud storage from a storage system on-premises is technology that was developed by StorSimple and it is called Cloud-integrated Storage, or CiS. CiS is made up of both hardware and software. The hardware is an industry-standard iSCSI SAN array that is optimized to perform automated data and storage management tasks that are implemented in software.

The combination of CiS and Windows Azure Storage creates a new hybrid cloud storage architecture with expanded online storage capacity that is located an extended distance from the data center.

Change the architecture and change the function
CiS performs a number of familiar data and storage management functions that are significantly transformed when implemented within the hybrid cloud storage architecture.

CiS takes periodic snapshots to automatically capture changes to data at regular intervals. Snapshots give storage administrators the ability to restore historical versions of files for end users who need to work with an older version of a file. Storage administrators highly value snapshots for their efficiency and ease of use—especially compared to restoring data from tape. The main limitation with snapshots is that they are restricted to on-premises storage and susceptible to the same threats that can destroy data on primary storage. Implementing snapshots in a hybrid cloud storage architecture adds the element of extended distance, which makes them useful for backup and disaster recovery purposes.

Data tiering
CiS transparently performs data tiering, a process which moves data between the SSDs and HDDs in the CiS system according to the data’s activity level with the goal of placing data on the optimal cost/performance devices. Expanding data tiering with a hybrid cloud storage architecture transparently moves dormant data off site to the cloud so it no longer occupies on-premises storage. This transparent, online “cold data” tier is a whole new storage level that is not available with traditional storage architectures, and it provides a way to have archived data available online.
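The tiering decision can be sketched in a few lines. The source does not describe the actual CiS placement algorithm, so what follows is only an illustrative model: the function names, the use of "days since last access" as the activity measure, and the 7-day and 90-day thresholds are all assumptions.

```python
def choose_tier(days_since_access, hot_days=7, cold_days=90):
    """Pick a tier from how recently data was used (thresholds are assumed)."""
    if days_since_access <= hot_days:
        return "ssd"    # active data stays on the fastest devices
    if days_since_access <= cold_days:
        return "hdd"    # cooler data moves to cheaper on-premises disk
    return "cloud"      # dormant data moves off site to cloud storage


def plan_tiers(items):
    """Map each item name to a tier, given days since its last access."""
    return {name: choose_tier(days) for name, days in items.items()}
```

For example, `plan_tiers({"orders": 1, "reports": 30, "archive": 400})` places the three items on the SSD, HDD, and cloud tiers respectively. The point of the hybrid architecture is the third branch: traditional arrays have no equivalent of the `"cloud"` tier.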

Thin provisioning
SAN storage is a multitenant environment where storage resources are shared among multiple servers. Thin provisioning allocates storage capacity to servers in small increments on a first-come, first-served basis, as opposed to reserving it in advance for each server. The caveat almost always mentioned with thin provisioning is the concern about over-committing resources, running out of capacity, and experiencing the nightmare of system crashes, data corruptions, and prolonged downtime.

However, thin provisioning in the context of hybrid cloud storage operates in an environment where data tiering to the cloud is automated and can respond to capacity-full scenarios on demand. In other words, data tiering from CiS to Windows Azure Storage provides a capacity safety valve for thin provisioning that significantly eases the task of managing storage capacity on-premises.
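The safety-valve behavior can be modeled with a toy storage pool. This is a sketch under stated assumptions, not the CiS implementation: the `ThinPool` class, its `high_water_gb` threshold, and the idea of tracking a single `dormant_gb` figure are all invented for illustration.

```python
class ThinPool:
    """Toy thin-provisioned pool that tiers dormant data to the cloud when nearly full."""

    def __init__(self, physical_gb, high_water_gb, used_gb=0, dormant_gb=0):
        self.physical_gb = physical_gb      # real on-premises capacity
        self.high_water_gb = high_water_gb  # assumed threshold that triggers tiering
        self.used_gb = used_gb              # capacity currently consumed on-premises
        self.dormant_gb = dormant_gb        # part of used_gb that is eligible to move
        self.cloud_gb = 0                   # data already tiered to the cloud

    def write(self, gb):
        """Accept a write if space exists, tiering dormant data first when needed."""
        overflow = self.used_gb + gb - self.high_water_gb
        if overflow > 0:
            moved = min(self.dormant_gb, overflow)
            self.dormant_gb -= moved
            self.used_gb -= moved
            self.cloud_gb += moved
        if self.used_gb + gb > self.physical_gb:
            return False  # genuinely out of space: nothing left to tier
        self.used_gb += gb
        return True
```

Without the tiering step, a pool in this model rejects writes as soon as the servers' combined demand reaches physical capacity, which is the over-commitment risk described above. With it, the pool fails only when no dormant data remains to be moved.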


Best practices or obsolete practices

The IT team does a great deal of work to ensure data is protected from threats such as natural disasters, power outages, bugs, hardware glitches, and security intrusions. Many of the best practices for protecting data that we use today were developed for mainframe environments half a century ago. They are respected by IT professionals who have used them for many years to manage data and storage, but some of these practices have become far less effective in light of data growth realities. Some best practices for protecting data are under pressure for their costs, the time they take to perform, and their inability to adapt to change.

One best practice area that many IT teams find impractical is disaster recovery (DR). DR experts all stress the importance of simulating and practicing recovery, but simulating a recovery takes a lot of time to prepare for and tends to be disruptive to production operations. As a result, many IT teams never get around to practicing their DR plans.

Another best practice area under scrutiny is backup, due to chronic problems with data growth, media errors, equipment problems, and operator miscues. Dedupe backup systems significantly reduce the amount of backup data stored and help many IT teams successfully complete daily backups. But dedupe systems tend to be costly, and the benefits are limited to backup operations and don’t include the recovery side of the equation. Dedupe does not change the necessity to store data off-site on tapes, which is a technology that many IT teams would prefer to do away with. Many IT teams are questioning the effectiveness of their storage best practices and are looking for ways to change or replace those that aren’t working well for them anymore.

Doing things the same old way doesn’t solve new problems
The root cause of most storage problems is the large amount of data being stored. Enterprise storage arrays lack capacity “safety valves” to deal with capacity-full scenarios and slow to a crawl or crash when they run out of space. As a result, capacity planning can take a lot of time that could be used for other things. What many IT leaders dislike most about capacity management is the loss of reputation that comes with having to spend money unexpectedly on storage that was targeted for other projects. In addition, copying large amounts of data during backup takes a long time even when they are using dedupe backup systems. Technologies like InfiniBand and Server Message Block (SMB) 3.0 can significantly reduce the amount of time it takes to transfer data, but they can only do so much.

More intelligence and different ways of managing data and storage are needed to change the dynamics of data center management. IT teams that are already under pressure to work more efficiently are looking for new technologies to reduce the amount of time they spend on it. The Microsoft HCS solution discussed in this book is a solution for existing management technologies and methods that can’t keep up.


Solid State Disks under the covers

SSDs are one of the hottest technologies in storage. Made with nonvolatile flash memory, they are unencumbered by seek time and rotational latencies. From a storage administrator’s perspective, they are simply a lot faster than disk drives. However, they are far from being a “bunch of memory chips” that act like a disk drive. The challenge with flash is that individual memory cells can wear out over time, particularly if they are used for low-latency transaction processing applications. To alleviate this challenge, SSD engineers design a number of safeguards, including metadata tracking for all cells and data, compressing data to use fewer cells, parity striping to protect against cell failures, wear-leveling to use cells uniformly, “garbage collecting” to remove obsolete data, trimming to remove deleted data, and metering to indicate when the device will stop being usable. SSDs manage everything that needs to be managed internally. Users are advised not to use defrag or other utilities that reorganize data on SSDs. The SSDs won’t perform faster, but they will wear out faster.
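Of the safeguards listed above, wear-leveling is the easiest to illustrate. The sketch below is a deliberately simplified model and not how any real SSD controller is implemented: it assumes every write lands on a whole block, that all blocks are always available, and that the only goal is to keep erase counts even. The function name is hypothetical.

```python
def write_with_wear_leveling(erase_counts, writes):
    """Direct each write to the least-worn block and return the updated counts.

    `erase_counts` is a list holding the number of erase cycles per block.
    Ties go to the lowest block index.
    """
    counts = list(erase_counts)  # work on a copy
    for _ in range(writes):
        block = min(range(len(counts)), key=lambda b: (counts[b], b))
        counts[block] += 1       # each write costs the chosen block one erase cycle
    return counts
```

Starting from four unused blocks, ten writes leave the counts within one cycle of each other instead of concentrating all ten on a single block. A utility that rewrites data to reorganize it, such as defrag, simply adds to `writes` without any benefit on a device that has no seek time.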


Virtual systems and hybrid cloud storage

IT teams use virtualization technology to consolidate, relocate, and scale applications to keep pace with the organization’s business demands and to reduce their operating costs. Hypervisors, such as ESX and ESXi from VMware and Hyper-V from Microsoft, create logical system images called virtual machines (VMs) that are independent of system hardware thereby enabling IT teams to work much more efficiently and quickly.

But virtualization creates problems for storage administrators who need more time to plan and implement changes. The storage resources for ESX and ESXi hypervisors are Virtual Machine Disk Format (VMDK) files, and for Hyper-V hypervisors, they are Virtual Hard Disk (VHD) files. While VMs are rapidly moved from one server to another, moving the associated VMDKs and VHDs from one storage system to another is a much slower process. VMs can be relocated from one server to another without relocating the VMDKs and VHDs, but the process of load balancing for performance usually involves shifting both VMs and VMDKs/VHDs. Data growth complicates the situation by consuming storage capacity, which degrades performance for certain VMs, and forces the IT team to move VMDKs/VHDs from one storage system to another, which can set off a chain reaction of VMDK/VHD relocations along the way. Hybrid cloud storage gracefully expands the capacity of storage, including VMDKs and VHDs, eliminating the need to move them for capacity reasons. By alleviating the pressures of data growth, hybrid cloud storage creates a more stable environment for VMs.


The constant nemesis: data growth

IDC’s Digital Universe study estimates that the amount of data stored worldwide is more than doubling every two years, so it is no surprise that managing data growth is often listed as one of the top priorities by IT leaders. IT professionals have ample experience with this problem and are well aware of the difficulties of managing data growth in their corporate data centers. Balancing performance and data protection requirements with power and space constraints is a constant challenge.
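To put the doubling figure in planning terms, the short calculation below converts a doubling period into an annual growth rate and projects capacity forward. The function names and the 100 TB starting point are illustrative assumptions, and because the study says data is more than doubling, the results should be read as a lower bound.

```python
def annual_growth_rate(doubling_period_years=2.0):
    """Annual growth rate implied by a given doubling period."""
    return 2 ** (1 / doubling_period_years) - 1


def capacity_after(years, start_tb, doubling_period_years=2.0):
    """Projected capacity in TB after `years` of steady exponential growth."""
    return start_tb * 2 ** (years / doubling_period_years)
```

Doubling every two years works out to roughly 41 percent growth per year. At that rate, a hypothetical 100 TB environment would need about 3,200 TB after ten years, which is why capacity that looks generous at purchase time rarely stays that way.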

IT leaders cannot surrender to the problems of data growth, so they need a strategy that will diminish the impact of it on their organizations. The hybrid cloud storage approach leverages cloud storage to offload data growth pressures to the cloud. Storage, which has always had an integral role in computing, will continue to have a fundamental role in the transformation to hybrid cloud computing—for its primary functionality (storing data) as well as its impact on those responsible for managing it.

