Everyone's doing it.
Having overcome the habit of remaining committed to a single type of computing, enterprises have become technologically adventurous.
They've been "swinging" with powerful desktop and client/server computing for more than a decade. But in the process they've learned -- sometimes the hard way -- that they can't afford to forget some basic procedural principles. For example, if they don't practice safe storage while surfing the networks, they're flirting with disaster.
"People are exploring new approaches with different vendors and on different platforms," says C.D. Larson, an IBM storage advisory specialist in Costa Mesa, Calif. "Many simply assumed that everything would be as safe as it had been in the comfy old data center. But they're finding some security services they once took for granted -- such as backing up and archiving data -- simply aren't there. And some solutions which answered these problems in the mainframe environment don't work in a distributed environment."
As a result, data storage and security for the client/server marketplace has become a booming business. In 1993, U.S. organizations spent about $226 million on all kinds of client/server storage technologies. And they'll spend about $1.5 billion annually by 1998, according to Peripheral Strategies Inc., a market-research firm in Santa Barbara, Calif.
Many enterprises migrating to client/server platforms are enhancing their existing mainframe systems with fast new tape drives, "juke boxes" for compact disk storage, and robotic tape libraries that sort through hundreds of tapes, mount the proper one, and spin it -- faster than a speeding bullet -- to the point where required data is located.
Spending on storage by corporations and other large-scale computing enterprises is changing dramatically, too. During 1993, in client/server environments, only about 14 percent of all spending for storage hardware and software was for enterprisewide client/server solutions, 31 percent was spent on storage solely for UNIX platforms in the organization, and 55 percent was spent for PC platforms. It was almost as if the PCs weren't a part of the overall enterprise. Peripheral Strategies estimates that the 1994 spending breakout will be nearly reversed from the year before, with 53 percent spent on developing enterprisewide solutions, 26 percent spent in UNIX environments, and 21 percent for storage on PC platforms.
What's driving these trends? Today's corporation is a glutton for information. Much of it comes from mainframes, of course, but it's also created on midrange hosts, on powerful RISC-based workstations in UNIX environments, and on PCs, laptops, notebooks, and personal digital assistants.
For example, at ARCO's data processing arm, ARCO Exploration and Production Technology in Plano, Texas, there are 1.4 terabytes (TB) of "corporate" data available on disk storage devices at any time and countless TB in tape storage to meet the worldwide needs of the seventh largest oil company.
At Providan Corp., a financial services and insurance company in Louisville, Ky., dozens of Novell local area networks (LANs) throughout the enterprise each contain an average of three to five gigabytes (GB) of data in addition to data residing on corporate mainframe systems.
All this corporate "enterprise" information, in something of a technological mating dance, routinely meshes with unfathomable amounts of information on individual PCs as it is "spreadsheeted" and "word processed" on desktops throughout the organization. Some of it finds its way into presentation or publishing application files. Other bits and pieces of corporate information are incorporated into groupware applications. And in all of these desktop application environments, internal information is mixed with all kinds of information from either online services or market-research, credit, or financial firms.
The problem of managing this growing menagerie of information is exacerbated by the rapidly increasing number of desktop and portable PC networks. These networks themselves create their own kinds of information to manage resources and communication -- further adding to the glut of data.
Keeping up with all the data on these widely scattered machines from a central location is a daunting task -- even if the data created on each device simply stays put. But information created on one platform is often forwarded to another.
And there's the rub. On the one hand, client/server and other distributed systems make corporate information more valuable by helping ensure the data are available when and where needed. On the other hand, protecting this new value-added information asset -- and making it available on other desktops where it can become even more valuable -- is, at best, an iffy process for most organizations.
The growing reliance on PCs and LANs by businesses creates a far greater need to protect and use data resources efficiently, says Glenn McDermed, who analyzes the data-storage and management technology industry for the Gartner Group in Stamford, Conn. In many instances, impatient users installed PCs and LANs without any help or input from their company's information technology (IT) organization.
Now companies are beginning to understand why IT staffs had been cautious: although playing with data is more fun than using safe storage practices, the practices are becoming more necessary in today's environment.
Other factors are also contributing to the problem.
To start with, the typical LAN administrator, who may have performed some small-scale backup operations at the server level, has done little or nothing with the data residing on individual PCs, says McDermed. Backing up data from dozens -- perhaps even hundreds -- of PCs on one LAN "is a labor-intensive effort, and LAN administrators don't want it. That's why the task of implementing robust backup procedures and disciplines is going back to the data center."
Another problem is the resistance of PC users, who don't want anyone monkeying with data stored on their machines.
As a result, many organizations often rely on end users to handle their own desktop-storage and data-security requirements. But users aren't disciplined in the art of data management, nor do they have the tools to do the jobs for their own desktop systems or the enterprise's LANs.
Then, too, end users who do practice safe storage often do so unproductively with unreliable technologies. So the time users save downloading mainframe data into a desktop spreadsheet is easily offset by the time spent saving that data to a floppy disk and locking it in a fireproof filing cabinet.
For example, users may think they've done enough by backing up data files. However, they often overlook other files, created by hours of work, that customize their graphical user interfaces (GUIs) and configure their systems to work in the environment -- printer drivers, modem access commands, and embedded commands in applications that tell the computer how the user wants to store and retrieve data, for example. On the other hand, some users will spend hours -- and considerable computing resources -- backing up megabytes (MB) of hard-disk data that never changes and that could be restored by more efficient means. And all too often, they assume that safe storage means putting data where nobody can get at it -- like on a mainframe storage device that can only be accessed by data processing staffers.
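To make the selection problem concrete in modern terms, here is a minimal sketch -- with hypothetical paths and file patterns, not any particular product's method -- of an incremental pass that picks up configuration files and skips data that hasn't changed since the last run:

```python
import os
import time

# Hypothetical illustration: what a thorough incremental pass should pick up.
CONFIG_SUFFIXES = (".ini", ".cfg", ".drv")     # GUI settings, printer drivers, etc.
LAST_BACKUP = time.time() - 24 * 3600          # assume the previous run was a day ago

def files_to_back_up(root):
    """Yield configuration files plus anything changed since the last backup."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            is_config = name.lower().endswith(CONFIG_SUFFIXES)
            changed = os.path.getmtime(path) > LAST_BACKUP
            if is_config or changed:
                yield path   # copy this file
            # unchanged data files are skipped -- no point copying them again

if __name__ == "__main__":
    for path in files_to_back_up("."):
        print("back up:", path)
```

Anything matching a configuration pattern or modified since the last run gets copied; the bulk of unchanged data is left alone.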
"In today's environment, it's equally important that you don't back up certain things," says Dean Smith, storage management consultant for ARCO Exploration and Production Technology. "If you don't believe me, ask Oliver North."
Smith, of course, refers to E-mail messages recovered from White House backup files that became the "smoking gun" in the Iran-Contra affair. But his point is well taken: much information created in the day-to-day course of business is useful only for a limited time and should not be kept beyond it.
"You need an auditable solution that can demonstrate that the files which ought not be backed up are in fact not backed up," says Smith.
Another source of lost data is the occasional earthquake. After last year's quake in Los Angeles, many experts expected to find mainframes lying "casters up" in demolished computer rooms. Not so. The biggest damage was to PCs, which were either destroyed or stranded in wrecked buildings that users couldn't enter.
"When the ground starts shaking, your average PC tape drive just isn't going to cut it," in terms of protecting the enterprise, says IBM's Larson. In the recent southern California earthquake, many users lost access to their PCs and backups. One who agrees with that conclusion is Scott Cumbie, an 11-year veteran storage administrator for Providan Corp. and a member of the Storage Management Steering Committee of the SHARE user group that is studying storage management issues that enterprises will encounter in the future.
"In general, the corporate data ought to reside on the mainframe," Cumbie says bluntly. "Applications ought to reside on the LAN."
As the number of networked PCs increases, so does the complexity of the issue. Some storage schemes, for instance, are designed to save only the data files. They ignore upgrades to configuration files, printer drivers, operating systems and application software, and other system elements -- all of which allow users to customize their work environments so that they're easy to use and compatible with other platforms in their organizations.
With all those problems, it doesn't take much to turn even the strongest PC power users into backup advocates -- one lightning bolt, one hard-disk crash in the middle of a project, or one spreadsheet file lost on the network can do the trick. So can a secretary's hard disk crashing just before an important company meeting, taking a vice president's charts with it.
The good news is that PC hard disks don't crash as often as they did when 10 MB and 20 MB units were the staples of the office. The bad news is, when a crash does happen, a user has to cope with hard disks that hold up to 100 MB, 200 MB, and 500 MB of data. Moreover, the typical Novell LAN installation, which now stores some 40 GB of data, will hold about 100 GB in just a few years. Such rapid growth by itself renders old storage techniques totally inadequate.
But a subtler problem is associated with the rapid growth of desktop computing. Many proponents of LAN technology have said so often that networks of powerful PCs will replace mainframes that they have come to believe it -- and have built products that do not preserve the finer features of the mainframe environment, mundane though they may be. As an example of how "unsophisticated" the PC world is about such mainframe-style basics as storage management, Cumbie discovered as Providan was deploying its centralized storage management system for LANs that its Novell file servers "only recognize real memory."
"It was eye-opening for me," says Cumbie. "Here we are in the middle of the 1990s and the most widely used file server in the world has no idea what virtual storage is."
Before you get into any kind of centralized backup scenario, says Cumbie, "you really need to be current with all your software and operating systems, including all the fixes and enhancements." And, he adds, you've got to be intimately familiar with your network topology.
Cumbie's organization learned the last lesson the hard way. Although it had decided not to back up the remote LAN file servers in its branch offices centrally, because of the strain the required data transfer would place on its WAN, some of its TCP/IP-based servers didn't get the message. Set up to choose "the first route available instead of the best route available," the TCP/IP servers in Louisville sent their traffic through Philadelphia and back to the mainframe in Louisville. Cumbie says he was surprised the roundabout WAN routing performed as well as it did; even so, it was about 10 times slower than the direct backup to the mainframe that occurred once the problem was corrected.
Today, only about 20 percent of data-loss catastrophes are attributable to hardware failure (a figure that some IT experts like ARCO's Dean Smith will argue is too high). The rest can be chalked up to "pilot error" -- end-user mistakes.
In some instances, these storage, data security, and management issues have hindered the migration to client/server environments. When a strategic application is placed on a platform, users want it to be absolutely secure. In a way, it's good that users are considering these issues before they migrate legacy applications and information to desktops. When it comes to data security, enterprises can't take chances anymore.
"In client/server environments today," stresses Larson, "some of the big issues that must be addressed are data security, data integrity, and data availability across the enterprise. Most companies aren't equipped to do that at any level for their distributed systems."
The solution? An enterprise's management -- which is responsible for access to information, for protecting it, and for productivity and profitability -- must develop an enterprisewide approach to storage management solutions.
That includes IT professionals. For years, IT staffs resisted the proliferation of PCs in their enterprises because managing the data and security of widespread units is such a formidable task. But they lost the battle. And now, with the migration of strategic platforms to client/server -- and the attendant requirement for better storage disciplines and practices -- they may be regaining control.
But the job won't be easy. First, IT staffs often don't have enough personnel; many departments have been downsized. Second, in a typical enterprise today, the IT staff faces a variety of technologically diverse platforms and networks. They need sophisticated software tools to help them do a harder job, with fewer resources, as well as they did it in the close-quartered computing environment of the past. The key to the solution rests not in bigger disk drives and greater bandwidth on networks. The key is software that can virtually do your data management thinking for you and that can grow -- or, more appropriately in today's environment, shrink -- with you.
Some companies moving toward client/server realize the value of their enterprise's information and are taking steps to protect it and use it more efficiently. One such company is Western Geophysical in Houston, Texas, which specializes in gathering detailed data about the earth's crust for oil and gas exploration. This information is its principal product and is vital to its operations, so protecting it is essential.
"The systems and tools in client/server today aren't conducive to handling terabytes of data in any focused process," says Tim Mather, a systems programmer for Western Geophysical. "Therefore, it's key to have effective tools for managing storage resources, so you can get the data to where a user needs it. You've got to build those tools before you move your strategic platform to client/servers."
The tool Western Geo uses is IBM's ADSTAR Distributed Storage Manager (ADSM), a network-based storage solution for backing up and archiving data in distributed, multivendor workstations and LAN fileservers.
Western Geo manipulates enormous quantities of data to create cross-sections of the earth. Raw information collected in the field is processed against algorithms to create images. A typical "sample" of field data ranges from 500 GB to two or three TB, says Mather. It's necessary to divide that into useful chunks, a process that relies on the speed of parallel processors.
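For illustration only -- this is not Western Geophysical's actual pipeline -- the chunking idea can be sketched as splitting a field sample into fixed-size ranges and handing each range to a parallel worker:

```python
from multiprocessing import Pool

# Illustrative only -- not Western Geophysical's actual system.
CHUNK_SIZE = 1_000_000   # traces per chunk (hypothetical unit)

def split_into_chunks(total_traces, chunk_size=CHUNK_SIZE):
    """Return (start, end) ranges covering the whole field sample."""
    return [(start, min(start + chunk_size, total_traces))
            for start in range(0, total_traces, chunk_size)]

def process_chunk(bounds):
    start, end = bounds
    # A real job would run signal-processing algorithms over traces[start:end];
    # here we only report the range each parallel worker would handle.
    return "processed traces %d..%d" % (start, end)

if __name__ == "__main__":
    with Pool() as workers:
        for result in workers.map(split_into_chunks(5_500_000) and process_chunk, split_into_chunks(5_500_000)):
            print(result)
```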
"In gathering, dividing, and processing information, distributed storage management is critical to our operation," says Mather. "We probably create and deal with more data than any other company -- or even industry."
Although it's migrating some tasks to workstations, using load-leveling tools to put jobs on multiple systems, the company primarily uses mainframes for batch processing. Western Geo's biggest problem in data management is moving data from one place to another.
Unlike Western Geo, some companies haven't mastered centralized data management techniques. In some instances, centralization can be a monumental undertaking.
To regain control of their widely dispersed LAN environments, a number of companies are placing network file servers in a central location. There, many backup schemes are available, including the use of a mainframe to manage data. For geographically scattered LANs -- where it's impossible to have workstations connected to file servers in the data center -- using a mainframe is possible, but there are technological limitations.
"When LANs are dispersed," says the Gartner's McDermed, "the issue is bandwidth. Data backup is essentially transferring large-scale files. Today's typical WAN is designed for small -- not large -- bursts of data. Even dedicated T-1 networks aren't very fast for transferring massive files."
New technologies, such as Integrated Services Digital Network (ISDN) and asynchronous transfer mode (ATM), could help once they are embraced by an enterprise. That's likely to happen because exciting new applications -- videoconferencing, multimedia, replication, and imaging -- demand the extra bandwidth, McDermed explains, not because the bandwidth is required for backup.
As a result, enterprises today are exploring other techniques for their LANs and PCs, such as data compression and incremental backups -- and hierarchical storage management (HSM), the long-standing backup and storage standard used in mainframe environments. HSM separates infrequently and frequently used files. The latter includes active data, such as new files created each day, configuration files, and -- in the Windows environment -- the *.ini files that are created and updated with virtually every keystroke and click of the mouse.
"About 80 percent of the hard disk data on any PC has not been referenced in the previous 30 days," says McDermed. "That data should be incrementally backed up; doing it often is a waste. The other 20 percent of the data should be backed up frequently." These techniques are common in the mainframe environment, but they are foreign to the LAN environment.
"Today, PC users are running into the same problems that mainframe users encountered in the early 1970s," says IBM's Larson. "Since we know the solutions, let's not reinvent them. Let's use what we learned, drop any baggage, and move forward to new technologies and products."
And the Winner Is...