
Computer scientists develop solutions for long-term storage of digital data

April 22, 2008

SANTA CRUZ, CA--Although the digital age is well under way, one crucial detail remains to be worked out--how to store vast amounts of digital information in a way that allows future generations to recover it.

"The problem is how to build a large-scale data storage system to last 50 to 100 years," said Ethan Miller, associate professor of computer science in the Baskin School of Engineering at the University of California, Santa Cruz.




Tape libraries are widely used for data storage, but digital tape has many shortcomings as an archival medium. Miller's group has come up with a new approach, called Pergamum, which uses hard disk drives to provide energy-efficient, cost-effective storage. The declining cost of hard drives has made them more competitive with tape, and they offer numerous advantages for searching and retrieving data. "It's like the difference between a VCR and TiVo," Miller said.

Pergamum, named after the ancient Greek library that made the transition from fragile papyrus to more durable parchment, is a distributed network of intelligent, disk-based storage devices. The team that developed it includes UCSC graduate students Mark Storer and Kevin Greenan, along with researcher Kaladhar Voruganti of NetApp (formerly Network Appliance), a company that focuses on storage and data management solutions.

Archival storage is a big issue for businesses, partly due to legal requirements for the preservation of financial and business records, and also because data mining strategies can turn stored data into a valuable resource. Long-term storage is also a growing issue for individuals who are filling their personal computers with digital photos, movies, and documents.

"There is a risk that an entire generation's cultural history could be lost if people aren't able to retrieve that data," Storer said. "Everyone is switching to digital cameras, but we've never demonstrated that digital data can be reliably preserved for a long time."

Pergamum has attracted a lot of attention from industry since Storer presented it at a leading conference in the field, the USENIX Conference on File and Storage Technologies (FAST '08), held in San Jose in February. Robin Harris, an industry consultant who writes an influential blog called StorageMojo, called the Pergamum paper his "favorite FAST '08 paper" (see http://storagemojo.com/2008/03/14/storagemojos-favorite-fast-08-paper/).

The researchers designed the system to provide reliable, energy-efficient data storage using off-the-shelf components. It also has the ability to evolve over time as storage technologies change. "You want to avoid 'forklift upgrades,' where you have to get rid of the old system and transfer all your data to a whole new system," Miller said.

According to Storer, businesses are beginning to recognize that archival storage is very different from simply backing up their data. "A backup is a safety net--you hope you won't need it. Archival data you do want to use--it's a valuable resource and you want to be able to mine it for information," he said.

Tapes work well for backups, in which data are written once, rarely read, and not kept indefinitely. But archival data should be easy to read, query, browse, and search, and tape has inherent weaknesses in these areas. Existing disk-based systems offer excellent performance, but rely on power-hungry central controllers.

"Energy usage is a big issue, so a lot of our effort in designing Pergamum focused on dramatically reducing power use," Miller said.

Pergamum uses individual building blocks consisting of a hard drive; a small, low-power processor (like the chip in an iPhone); a flash memory card; and an ethernet port. These units, called "tomes," are connected using relatively inexpensive ethernet switches.

"Each tome is like a minicomputer, but with very low power demands," Miller said. "When not in use, it can shut down almost completely."

Even when active, the devices use very little power (less than 13 watts), which can be delivered over the network using Power over Ethernet technology. As a result, each unit is essentially a self-contained box with a network connection. The flash memory provides low-power, persistent storage so that many operations can be performed without activating the hard drive.
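The role of the flash memory can be illustrated with a small sketch. It is not Pergamum's actual code: the `Tome` class, its method names, and the choice to keep a size and SHA-256 checksum in flash are all assumptions made for illustration. The point is only that questions about stored data can be answered from the flash index while the disk stays powered down.

```python
import hashlib


class Tome:
    """Toy model of one storage unit: a flash index in front of a disk.

    Hypothetical illustration only; names and fields are assumptions.
    """

    def __init__(self):
        self.flash_index = {}    # name -> (size, sha256 hex); stands in for the flash card
        self.disk = {}           # name -> bytes; stands in for the hard drive
        self.disk_spinning = False
        self.spin_ups = 0        # counts how often the disk had to be woken

    def _spin_up(self):
        if not self.disk_spinning:
            self.disk_spinning = True
            self.spin_ups += 1

    def spin_down(self):
        self.disk_spinning = False

    def store(self, name, data):
        """Write data to disk and record its metadata in flash."""
        self._spin_up()
        self.disk[name] = data
        self.flash_index[name] = (len(data), hashlib.sha256(data).hexdigest())

    def describe(self, name):
        """Answer from flash only; the disk is never touched."""
        return self.flash_index.get(name)

    def read(self, name):
        """Reading the data itself does require the disk."""
        self._spin_up()
        return self.disk[name]


tome = Tome()
tome.store("photo.jpg", b"pixels")
tome.spin_down()
size, checksum = tome.describe("photo.jpg")   # served from flash
disk_still_idle = not tome.disk_spinning
```

In this toy model, `describe` leaves the disk idle, and only a later `read` would wake it again.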

For reliability, Pergamum uses two levels of redundancy--within and between disks--to protect against both disk failures and errors in writing data to a disk (so-called "latent sector errors"). Tomes can be easily added to expand the system or to replace failed disks. And if hard disk drives become obsolete in 10 years, Pergamum won't suffer the same fate. The system doesn't care what the actual storage medium is, as long as the device can implement the simple protocol that will allow it to function as part of the network.
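The idea of two levels of redundancy can be sketched with simple XOR parity. The article does not say which codes Pergamum uses, so the XOR scheme, the helper names `xor_blocks` and `rebuild_missing`, and the tiny four-byte blocks below are assumptions chosen for brevity. One parity block per disk repairs a single bad block on that disk, and one parity block per stripe across disks rebuilds a whole failed disk.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing block from the survivors plus the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Level 1: parity within a disk guards against a single unreadable block.
disk_a = [b"AAAA", b"BBBB", b"CCCC"]
intra_parity_a = xor_blocks(disk_a)
recovered_block = rebuild_missing([disk_a[0], disk_a[2]], intra_parity_a)

# Level 2: parity between disks guards against losing an entire disk.
disk_b = [b"dddd", b"eeee", b"ffff"]
disk_c = [b"1111", b"2222", b"3333"]
inter_parity = [xor_blocks(stripe) for stripe in zip(disk_a, disk_b, disk_c)]

# Suppose disk_b fails completely: rebuild it stripe by stripe.
rebuilt_b = [
    rebuild_missing([a, c], p)
    for a, c, p in zip(disk_a, disk_c, inter_parity)
]
```

Single-parity XOR tolerates only one loss per group; a real archival system would likely use stronger codes, which this sketch does not attempt to model.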

"In 50 years, the devices might use holographic storage," Storer said. "As long as you can wrap the new storage medium in this intelligent layer that speaks the protocol, it can participate in the network."
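Storer's point about wrapping a new medium in a layer that "speaks the protocol" is, in software terms, an interface. The sketch below is a loose illustration, not Pergamum's protocol: `ArchiveNode`, its `put` and `get` methods, the two node classes, and `migrate` are hypothetical names, and both "media" are ordinary in-memory structures.

```python
from abc import ABC, abstractmethod


class ArchiveNode(ABC):
    """Hypothetical minimal protocol every storage unit must speak."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class DiskNode(ArchiveNode):
    """Stand-in for a disk-based unit; a dict plays the role of the drive."""

    def __init__(self):
        self._blocks = {}

    def put(self, key, data):
        self._blocks[key] = data

    def get(self, key):
        return self._blocks[key]


class FutureMediumNode(ArchiveNode):
    """Stand-in for some later medium with different internals (append-only log)."""

    def __init__(self):
        self._records = []

    def put(self, key, data):
        self._records.append((key, data))

    def get(self, key):
        for stored_key, data in reversed(self._records):
            if stored_key == key:
                return data
        raise KeyError(key)


def migrate(source: ArchiveNode, target: ArchiveNode, keys):
    """Copy data between nodes using only the shared protocol."""
    for key in keys:
        target.put(key, source.get(key))


old_node = DiskNode()
old_node.put("report", b"2008 records")
new_node = FutureMediumNode()
migrate(old_node, new_node, ["report"])
```

Because `migrate` relies only on `put` and `get`, data can move gradually from old units to new ones, which is the opposite of the "forklift upgrade" Miller describes.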

Pergamum is one of several related projects being developed by researchers in the Storage Systems Research Center (SSRC) at UCSC's Baskin School of Engineering. The center's other archival storage projects include Deep Store, which dramatically reduces the amount of space required to store data, and POTSHARDS, which provides long-term secure storage using "secret splitting" instead of traditional encryption. Both of these projects would be compatible with Pergamum, Miller said.
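"Secret splitting" can be illustrated with the simplest textbook variant, in which a secret is XORed with random data so that every share is needed to recover it. The article does not describe the scheme POTSHARDS actually uses, so the functions `split_secret` and `combine_shares` below are an assumed, minimal example of the general idea rather than the project's method.

```python
import secrets
from functools import reduce


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, n: int) -> list:
    """Split a secret into n shares; all n are required to rebuild it."""
    if n < 2:
        raise ValueError("need at least two shares")
    # n - 1 shares are pure randomness of the same length as the secret.
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The last share is the secret XORed with every random share.
    shares.append(reduce(_xor, shares, secret))
    return shares


def combine_shares(shares) -> bytes:
    """XOR all shares together to recover the original secret."""
    return reduce(_xor, shares)


shares = split_secret(b"archive key", 3)
restored = combine_shares(shares)
```

Unlike encryption, there is no key to lose or to become breakable over decades: any subset of fewer than all shares is indistinguishable from random bytes. The trade-off in this simple variant is that losing a single share loses the secret, which is why practical systems tend to use threshold schemes instead.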

University of California - Santa Cruz





© 2008 BrightSurf.com