U-M leads $4 million project to preserve poll and survey data

September 30, 2004

ANN ARBOR, Mich.---In the thick of a presidential election, the latest findings from surveys and polls are reported on a daily basis. But much of the data behind the news on American public opinion is literally here today and gone tomorrow.

"At least half the survey and poll data collected since the 1940s has disappeared," said historian Myron Gutmann, director of the Inter-university Consortium for Political and Social Research (ICPSR) at the University of Michigan Institute for Social Research. "We're not sure yet if it's gone permanently or not."

Gutmann is the principal investigator on a new $4.1 million project to acquire and preserve data from opinion polls, voting records, large-scale surveys and other social science studies. Funded primarily by the Library of Congress, the world's largest library, the three-year project is a broad-based partnership between ICPSR, the world's largest academic social science data archive, and five other institutions.

Other institutions involved in the project are the Roper Center for Public Opinion Research at the University of Connecticut, the Howard W. Odum Institute at the University of North Carolina-Chapel Hill, the Henry A. Murray Research Center at Harvard's Radcliffe Institute, the National Archives and Records Administration, and the Harvard-MIT Data Center.

"This effort will ensure that future generations of Americans have access to vital material that will allow them to understand their nation, its social organization and its policies and politics," Gutmann said.

For three-quarters of a century, public opinion polls, social surveys and other kinds of structured interviews have tracked people's values, attitudes, knowledge and behavior. Surveys have done more than predict the outcomes of elections or tell us when presidents gain or lose popularity. They inform us about aging, health and health care, race relations, women's rights, employment and family life---the full story of the social and cultural tapestry that makes up our nation. They provide the data necessary for sound, empirically based policy-making.

But a huge quantity of this data is missing or at risk. "It has not been archived and without aggressive activities to locate and preserve it, it will disappear for good," Gutmann said. "This at-risk data can be found on the computers of individual researchers and research institutions, in bookcases and libraries, even in boxes of punched cards stored in warehouses. Some data reside on websites that don't have truly persistent URLs."

The good news, Gutmann says, is that the missing material has left tracks that researchers affiliated with the new project will follow, in the form of news releases, public grant announcements and publications describing the research. After identifying and finding at-risk content, the project aims to acquire the data, assure its security and prepare public use files that safeguard confidentiality.

"Our goal is to assure that the material remains accessible, complete, uncorrupted and usable over time," Gutmann said. "Rapid technological change will always threaten the viability of digital materials produced in previous years under obsolete technological conditions. But this project will greatly enhance our ability to preserve important data collections."
-end-
Established in 1948, the Institute for Social Research (ISR) is among the world's oldest survey research organizations, and a world leader in the development and application of social science methodology. ISR conducts some of the most widely cited studies in the nation, including the Survey of Consumer Attitudes, the National Election Studies, the Monitoring the Future Study, the Panel Study of Income Dynamics, the Health and Retirement Study, the Columbia County Longitudinal Study and the National Survey of Black Americans. ISR researchers also collaborate with social scientists in more than 60 nations on the World Values Surveys and other projects, and the Institute has established formal ties with universities in Poland, China, and South Africa. ISR is also home to the Inter-university Consortium for Political and Social Research (ICPSR), the world's largest computerized social science data archive. Visit the ISR Web site at www.isr.umich.edu for more information.

Through its National Digital Library (NDL) Program, the Library of Congress is one of the leading providers of noncommercial intellectual content on the Internet (www.loc.gov). The NDL Program's flagship American Memory project, in collaboration with other institutions nationwide, makes freely available more than 8.5 million American historical items. In December 2000, Congress authorized the Library of Congress to develop and execute a congressionally approved plan for a National Digital Information Infrastructure and Preservation Program. A $99.8 million congressional appropriation was made to establish the program. The goal is to build a network throughout the country of committed partners working through a preservation architecture with defined roles and responsibilities. The complete text of the "Plan for the National Digital Information Infrastructure and Preservation Program" is available at www.digitalpreservation.gov. This includes an explanation of how the plan was developed, who the Library worked with to develop the plan and the key components of the digital preservation infrastructure. The plan was approved by Congress in December 2002.

University of Michigan
