Participants in environmental health studies vulnerable to re-identification

January 13, 2020

Newton, Mass. (January 13, 2020) -- Before sharing human research data, scientists routinely strip it of personal information such as name, address, and birthdate in order to protect the privacy of their study participants. However, reporting in the journal Environmental Health Perspectives, researchers at Silent Spring Institute and their colleagues show that for environmental health studies, that might not be enough--even anonymized data can sometimes be traced back to individuals.

The new study highlights the need for greater protections for participants in human research studies. It also has implications for a proposed federal rule by the U.S. Environmental Protection Agency (EPA) that would require scientists to make their data public in order for their research to be used as a basis for environmental regulations.

"Researchers promise to protect the privacy of their study participants--a routine practice in nearly all scientific studies involving people," says lead author Katherine Boronow, a staff scientist at Silent Spring. "Our research shows that making data publicly available from environmental health studies, even after obvious identifiers are removed, could violate these pledges."

In a previous study, Silent Spring researchers conducted an experiment in which they shared anonymized data from the Institute's Household Exposure Study in California with a team of Harvard researchers skilled in re-identification techniques. The team linked housing and demographic data from the study to publicly available data such as tax assessor records. They also used other information described in the study, such as the location of the housing developments and the levels of indoor air pollutants measured. With these sources combined, the team successfully re-identified 25 percent of participants from one housing development by name.
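To make the linkage idea concrete, here is a minimal sketch in Python. It is not the Harvard team's method, which this release does not describe in detail. Every record, field name (year_built, bedrooms, household_size), and resident name below is invented for illustration. The sketch only shows the general principle: when a combination of attributes in a de-identified record matches exactly one entry in a named outside database, that record can be tied to a name.

```python
from collections import defaultdict

# Hypothetical de-identified study records: no names, only quasi-identifiers.
study_records = [
    {"participant": "P1", "year_built": 1978, "bedrooms": 3, "household_size": 4},
    {"participant": "P2", "year_built": 1978, "bedrooms": 2, "household_size": 2},
    {"participant": "P3", "year_built": 1992, "bedrooms": 3, "household_size": 5},
    {"participant": "P4", "year_built": 2001, "bedrooms": 4, "household_size": 3},
]

# Hypothetical outside records that carry names alongside the same attributes.
public_records = [
    {"name": "Resident A", "year_built": 1978, "bedrooms": 3, "household_size": 4},
    {"name": "Resident B", "year_built": 1978, "bedrooms": 2, "household_size": 2},
    {"name": "Resident C", "year_built": 1978, "bedrooms": 2, "household_size": 2},
    {"name": "Resident D", "year_built": 1992, "bedrooms": 3, "household_size": 5},
    {"name": "Resident E", "year_built": 2010, "bedrooms": 4, "household_size": 3},
]

KEYS = ("year_built", "bedrooms", "household_size")


def link(study, public, keys=KEYS):
    """Return {participant: name} for study rows with exactly one public match."""
    index = defaultdict(list)
    for row in public:
        index[tuple(row[k] for k in keys)].append(row["name"])

    matches = {}
    for row in study:
        candidates = index.get(tuple(row[k] for k in keys), [])
        # Only a unique candidate counts as a re-identification.
        if len(candidates) == 1:
            matches[row["participant"]] = candidates[0]
    return matches


matches = link(study_records, public_records)
rate = len(matches) / len(study_records)
print(matches, f"re-identified: {rate:.0%}")
```

In this toy data, P2 stays protected because two residents share its attributes, and P4 has no match at all. Adding more attributes to the shared data tends to shrink such ties, which is why each additional data type raises the risk.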

Now, in this latest investigation, the researchers show that vulnerability to re-identification is a common aspect of environmental health data. They reviewed a dozen environmental health studies and identified five different types of data (location, medical, genetic, occupation, and housing) that overlap with outside databases and could contribute to the risk of re-identification.

The researchers found that all 12 studies included at least two out of the five data types, and three studies included all five. "Having multiple data types provides more opportunities for someone to match research data against existing commercial or public databases," says Boronow.

Measurements of pollutants in people's bodies or in their homes are also a characteristic data type of many environmental health studies. Currently, however, these measurements alone are less vulnerable to data linkage because there are few databases that include chemical measurements that could be used for matching.

To explore a different way that chemical exposure data might be used in re-identification, the team conducted a cluster analysis using data from Silent Spring's Household Exposure Study in California and in Massachusetts and from the Centers for Disease Control and Prevention's Green Housing Study in Boston and Cincinnati. They fed the raw chemical measurements to an algorithm that sorted the data within each study into two groups. The groups created by the algorithm corresponded to geographic location with 80 to 98 percent accuracy.
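The cluster analysis can be illustrated with a short Python sketch. The release does not name the algorithm the team used, so the basic two-group k-means below is an assumption, as are the site names, the three "chemical" columns, and all of the synthetic measurements. The sketch shows only the shape of the test: cluster on chemical values alone, then check how well the two clusters line up with location.

```python
import random


def kmeans_two(points, iterations=20):
    """Split points into two clusters with a basic k-means loop."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic start: the first point and the point farthest from it.
    first = points[0]
    second = max(points, key=lambda p: dist2(p, first))
    centroids = [first, second]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [
            0 if dist2(p, centroids[0]) <= dist2(p, centroids[1]) else 1
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def cluster_accuracy(labels, sites):
    """Share of points whose cluster matches their site, allowing label swap."""
    site_names = sorted(set(sites))
    agree = sum(1 for lab, s in zip(labels, sites) if site_names[lab] == s)
    return max(agree, len(labels) - agree) / len(labels)


# Synthetic measurements for two hypothetical sites (not real study data).
rng = random.Random(0)
SITE_MEANS = {"site_A": (1.0, 2.0, 0.5), "site_B": (3.0, 0.5, 2.0)}
points, sites = [], []
for site, means in SITE_MEANS.items():
    for _ in range(20):
        points.append(tuple(rng.gauss(m, 0.3) for m in means))
        sites.append(site)

labels = kmeans_two(points)  # location is never shown to the algorithm
accuracy = cluster_accuracy(labels, sites)
print(f"cluster/location agreement: {accuracy:.0%}")
```

The synthetic sites here are deliberately well separated, so the agreement is high by construction. Real measurements overlap more, which is consistent with the 80 to 98 percent range reported for the actual studies.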

If the data cluster into groups by location, says Boronow, then each group can be matched to data narrowed to that location, making it more likely for a re-identification attack to produce correct matches. This shows how someone could use chemical data to infer a characteristic of people in a study even if that characteristic is excluded when the study data are shared.

Data sharing has many benefits. By pooling data, researchers can create larger, more diverse datasets that could lead to advances in knowledge. It can also give researchers access to data that are difficult or expensive to obtain, such as data from biological or environmental samples collected after an environmental disaster. However, as the new study shows, it also has its risks.

Dr. Julia Brody, executive director at Silent Spring and a co-author of the study, says the implications of privacy risks are not trivial. Loss of privacy could result in stigma for individuals and communities. It could affect property values, insurance, or a person's chances of employment. It could also damage trust in research.

In 2018, EPA released a proposed rule called "Strengthening Transparency in Regulatory Science" that would require researchers to disclose their raw data as a precondition for the agency using a study to support regulatory decisions. Because the requirement could jeopardize confidential information about study participants, it could disqualify critical environmental health studies that form the basis of existing regulations, such as current limits on air pollutants. EPA is expected to release a revised version of the proposed rule early this year.

"Thousands of Americans have contributed personal data to scientific research with the goal of improving health for all," says Brody. "We must not take advantage of their generosity with rules that threaten their privacy and discourage future participation in research."

With growing pressure on scientists to share their data, and with more consumer data available online, Brody says it is important to fully characterize the risks of data sharing and identify solutions. Results from their research, she says, could help scientists develop informed consent documents that are more forthcoming about the risks and could help determine what types of data should be excluded from public sharing. The results could also lay the groundwork for legal and policy protections for participants should they fall victim to re-identification.

Funding for this project was provided by the National Institute of Environmental Health Sciences of the National Institutes of Health.

Reference: Boronow, K.E., L.J. Perovich, L. Sweeney, J.S. Yoo, R.A. Rudel, P. Brown, J.G. Brody. 2020. Privacy Risks of Sharing Data from Environmental Health Studies. Environmental Health Perspectives. 128(1): 17008. DOI:10.1289/EHP4817

About Silent Spring Institute:

Silent Spring Institute, located in Newton, Mass., is the leading scientific research organization dedicated to uncovering the link between chemicals in our everyday environments and women's health, with a focus on breast cancer prevention. Founded in 1994, the institute is developing innovative tools to accelerate the transition to safer chemicals, while translating its science into policies that protect health. Follow us on Twitter @SilentSpringIns.

Silent Spring Institute
