Studies of vulnerable populations get a 'bootstrapped' boost from statisticians

December 13, 2016

A hallmark of good government is policies which lift up vulnerable or neglected populations. But crafting effective policy requires sound knowledge of vulnerable groups. And that is a daunting task since these populations -- which include undocumented immigrants, homeless people or drug users -- are usually hidden in the margins thanks to cultural taboos, murky legal status or simple neglect from society.

"These are not groups where there's a directory you can go to and look up a random sample," said Adrian Raftery, a professor of statistics and sociology at the University of Washington. "That makes it very difficult to make inferences or draw conclusions about these 'hidden' groups."

Since these groups are hard to identify and reach, researchers like Raftery can struggle to make accurate inferences about them, determine their needs and find effective ways to reach them. And government policies to help vulnerable groups run a high risk of failing.

Sociologists once hoped that an approach called respondent-driven sampling -- or RDS -- would help them make reliable inferences about hard-to-reach groups. But subsequent analyses cast doubt on the efficacy of RDS studies.

In a paper published online Dec. 7 in the Proceedings of the National Academy of Sciences, Raftery and his team report how a statistical approach called "tree bootstrapping" can accurately assess uncertainty in RDS studies. That would put RDS on firm ground as one of the few methods to study vulnerable groups.

First described in 1997, respondent-driven sampling works around the "problem" of recruitment. Normally, social scientists try to recruit study subjects at random from their target population. But this is not possible when social or legal issues act as barriers between researchers and subjects.

"This is an underlying problem when you're trying to access and make inferences about populations that are hard to access, like drug users," said Raftery.

With the RDS method, researchers can start with a handful of participants, and use them to recruit additional participants using existing social connections.

"You can set up a storefront and find a few people in the hard-to-reach population: You interview them, collect data and give them vouchers to give to their friends -- who can come in as well," said Raftery. "It was immediately useful for accessing these populations."

To date, over 460 RDS studies of vulnerable populations have been conducted. But researchers have shown that the standard estimates of uncertainty are wrong, making it hard to use RDS in a valid way. It turns out that the inferences that researchers drew about these populations were biased by the fact that their study subjects weren't chosen at random.

"RDS is kind of like trying to describe an elephant when you're blindfolded and only get to touch one part of the elephant," said Raftery. "You can get a lot of data about that one part of the elephant, but we -- the researchers -- didn't have the proper methods to draw firm, scientifically sound conclusions about the elephant as a whole."

Raftery and his team started looking for methods to assess the uncertainty in RDS studies. They quickly settled on bootstrapping, a statistical approach used to assess uncertainty in estimates based on a random sample. In traditional bootstrapping, researchers take an existing dataset -- for example, condom use among 1,000 HIV-positive men -- and randomly resample a new dataset, calculating condom use in the new dataset. They then do this many times, yielding a distribution of values of condom use that reflects the uncertainty in the original sample.
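The resampling idea described above can be sketched in a few lines of Python. This is only an illustration of an ordinary bootstrap on a simple random sample, not the team's code: the data are invented (620 of 1,000 men reporting condom use), and the function names `bootstrap_proportions` and `percentile_interval` are hypothetical.

```python
import random

def bootstrap_proportions(sample, n_resamples=2000, seed=0):
    """Resample the data with replacement many times; return one estimate per resample."""
    rng = random.Random(seed)
    n = len(sample)
    estimates = []
    for _ in range(n_resamples):
        # Draw n observations with replacement from the original sample.
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(resample) / n)
    return estimates

def percentile_interval(estimates, level=0.95):
    """Simple percentile interval taken from the bootstrap distribution."""
    ordered = sorted(estimates)
    tail = (1 - level) / 2
    lower = ordered[int(tail * (len(ordered) - 1))]
    upper = ordered[int((1 - tail) * (len(ordered) - 1))]
    return lower, upper

# Invented data for illustration: 1 = reports condom use, 0 = does not.
sample = [1] * 620 + [0] * 380
estimates = bootstrap_proportions(sample)
low, high = percentile_interval(estimates)
print(f"Observed proportion: {sum(sample) / len(sample):.3f}")
print(f"95% bootstrap interval: {low:.3f} to {high:.3f}")
```

The spread of the resampled estimates stands in for the uncertainty of the original estimate. This only works as stated when the original sample was drawn at random, which is exactly what RDS data lack.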

The team modified bootstrapping for RDS datasets. But instead of bootstrapping data on individuals, they bootstrapped data about the connections among individuals.
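One way to picture resampling connections rather than individuals is shown below. The article does not spell out the procedure, so this sketch rests on an assumption: the initial participants are resampled with replacement, and then, for each person drawn, that person's recruits are resampled with replacement, and so on down the recruitment tree. The recruitment tree, the HIV statuses and the function name `tree_bootstrap` are all invented for illustration, and the plain average used here ignores the weighting that real RDS estimators typically apply.

```python
import random

def tree_bootstrap(seeds, recruits, rng):
    """Return one resampled list of participant ids, drawn level by level down the tree."""
    def resample_subtree(node):
        drawn = [node]
        children = recruits.get(node, [])
        # Resample this person's recruits with replacement, then recurse into each draw.
        for _ in range(len(children)):
            drawn.extend(resample_subtree(rng.choice(children)))
        return drawn

    sample = []
    # Resample the initial participants (seeds) with replacement.
    for _ in range(len(seeds)):
        sample.extend(resample_subtree(rng.choice(seeds)))
    return sample

# Invented recruitment tree: who handed a voucher to whom.
seeds = ["a", "b"]
recruits = {"a": ["c", "d"], "b": ["e"], "c": ["f", "g"], "e": ["h"]}
# Invented outcome: 1 = HIV positive, 0 = HIV negative.
status = {"a": 1, "b": 0, "c": 0, "d": 1, "e": 0, "f": 0, "g": 1, "h": 0}

rng = random.Random(0)
estimates = []
for _ in range(1000):
    drawn = tree_bootstrap(seeds, recruits, rng)
    estimates.append(sum(status[p] for p in drawn) / len(drawn))

ordered = sorted(estimates)
low, high = ordered[25], ordered[974]
print(f"Approximate 95% interval for the proportion: {low:.2f} to {high:.2f}")
```

Because whole branches of the tree are kept or dropped together, the resamples preserve the fact that people recruited by the same person tend to resemble one another.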

To see if this "tree bootstrapping" could attach certainty to conclusions from RDS datasets, they turned to two large, publicly available datasets. One was a multiyear survey of health and achievement among more than 90,000 adolescents, while the other was a survey of social contacts and sexual and drug habits among about 5,400 heterosexual adults. Neither dataset was collected using the RDS method. But since both datasets included information about the social contacts among subjects, the researchers could modify them to "simulate" data from an RDS study.
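A rough sketch of how RDS-style data might be simulated from a dataset with known social contacts follows. The article does not describe the team's simulation, so everything here is an assumption: the ring-shaped contact network is invented, and the function `simulate_rds` and its parameters (`n_seeds`, `coupons`, `target_size`) are hypothetical names.

```python
import random
from collections import deque

def simulate_rds(contacts, n_seeds, coupons, target_size, rng):
    """Simulate voucher-style recruitment over a known contact network."""
    seeds = rng.sample(sorted(contacts), n_seeds)
    recruited = set(seeds)
    recruits = {}  # recruiter -> list of people they brought in
    queue = deque(seeds)
    while queue and len(recruited) < target_size:
        person = queue.popleft()
        # Only contacts who are not yet in the study can accept a voucher.
        eligible = [c for c in contacts[person] if c not in recruited]
        rng.shuffle(eligible)
        room = target_size - len(recruited)
        chosen = eligible[:min(coupons, room)]
        recruits[person] = chosen
        recruited.update(chosen)
        queue.extend(chosen)
    return seeds, recruits

# Invented network: 30 people in a ring, each linked to the two nearest on either side.
contacts = {i: [(i + d) % 30 for d in (-2, -1, 1, 2)] for i in range(30)}

rng = random.Random(0)
seeds, recruits = simulate_rds(contacts, n_seeds=2, coupons=2, target_size=12, rng=rng)
sample_size = len(seeds) + sum(len(r) for r in recruits.values())
print(f"Seeds: {seeds}; simulated sample size: {sample_size}")
```

Because the full network is known in such a simulation, estimates from the simulated sample can be compared with the true values for the whole population.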

By tree bootstrapping, Raftery's team found that they could get much better statements of scientific certainty about their conclusions from these RDS-like studies. They then applied their method to a third dataset -- an RDS study of intravenous drug users in Ukraine. Again, Raftery's team found that they could draw firm conclusions.

"Previously, RDS might give an estimate of 20 percent of drug users in an area being HIV positive, but little idea how accurate this would be. Now you can say with confidence that at least 10 percent are," said Raftery. "That's something firm you can say. And that can form the basis of a policy to respond, as well as additional studies of these groups."

With tree bootstrapping, Raftery believes researchers can draw more certain, less variable conclusions from RDS studies. He wants other groups to examine and use tree bootstrapping on both existing RDS datasets and future RDS studies.

"I hope this paper will help put RDS on a firm basis, and tell us what we can and can't conclude from RDS studies," said Raftery.

Lead author is UW statistics graduate student Aaron Baraff. Co-author is UW associate professor of statistics and sociology Tyler McCormick. The research was funded by the National Institutes of Health and the U.S. Army Research Office.

For more information, contact Raftery at 206-543-4505.

Grant numbers: R01-HD054511, R01-HD070936, U54-HL127624, K01-HD078452 and 62389-CS-YIP.

University of Washington
