
NIH-led effort examines use of big data for infectious disease surveillance

November 14, 2016

Big data derived from electronic health records, social media, the internet and other digital sources have the potential to provide more timely and detailed information on infectious disease threats or outbreaks than traditional surveillance methods. A team of scientists led by the National Institutes of Health reviewed the growing body of research on the subject and has published its analyses in a special issue of The Journal of Infectious Diseases.

Traditional infectious disease surveillance -- typically based on laboratory tests and other data collected by public health institutions -- is the gold standard. But the authors note that it can involve time lags, is expensive to produce, and typically lacks the local resolution needed for accurate monitoring. Further, it can be cost-prohibitive in low-income countries. In contrast, big data streams, such as those from internet queries, are available in real time and can track disease activity locally, but have their own biases. Hybrid tools that combine traditional surveillance and big data sets may provide a way forward, the scientists suggest, serving to complement, rather than replace, existing methods.

"The ultimate goal is to be able to forecast the size, peak or trajectory of an outbreak weeks or months in advance in order to better respond to infectious disease threats. Integrating big data in surveillance is a first step toward this long-term goal," says Cecile Viboud, Ph.D., co-editor of the supplement and a senior scientist at the NIH's Fogarty International Center. "Now that we have demonstrated proof of concept by comparing data sets in high-income countries, we can examine these models in low-resource settings where traditional surveillance is sparse."

Experts in epidemiology, computer science and modeling collaborated on the supplement's 10 articles. They report on the opportunities and challenges associated with three types of data: medical encounter files, such as records from healthcare facilities and insurance claim forms; crowdsourced data collected from volunteers who self-report symptoms in near real time; and data generated by the use of social media, the internet and mobile phones, which may include self-reporting of health, behavior and travel information to help elucidate disease transmission.

But big data's potential must be tempered with caution, the authors say. Non-traditional data streams may lack key demographic identifiers such as age and sex, or provide information that underrepresents infants, children, the elderly and developing countries. Social media outlets may not be stable sources of data, as they can disappear if there is a loss of interest or financing. Most importantly, any novel data stream must be validated against established infectious disease surveillance data and systems, the authors stress.

Each article features a promising example of the use of big data to monitor and model infectious disease activity:
  • In the United States, researchers found what they describe as "excellent alignment" between medical insurance claim data for flu-like illnesses and proven influenza activity reported by the Centers for Disease Control and Prevention.

  • A European surveillance system that began collecting crowdsourced data on influenza as part of a research project is now considered an adjunct to existing surveillance activities. Influenzanet uses standardized online surveys to gather information from volunteers who self-report their symptoms on a weekly basis. A number of European Union member states are now using the tool and expanding it to include Zika, salmonella and other diseases.

  • An online platform, ResistanceOpen, was developed by U.S. and Canadian scientists to monitor antibiotic resistance at the regional level. The site takes advantage of publicly available, online data from community healthcare institutions as well as regional, national and international bodies. An analysis showed online information compared favorably with traditional reporting systems in the two countries.

  • Multiple studies have looked at social media and internet health forums for information on drug use and to detect adverse drug reactions. While there are technical and ethical challenges, the authors suggest internet search logs and social media posts can provide information more quickly than traditional physician-based reporting systems.

  • In a comparison of the relatively new field of epidemic forecasting with the better-established field of weather forecasting, the authors note the former is much more difficult, given that there is less observational data for disease and that human behavior has the potential to rapidly alter the course of an epidemic.

  • An examination of spatial data -- including from insurance claims and social media posts -- shows their potential for filling geographical information gaps but also identifies technical, practical and privacy challenges that must be addressed.

  • With appropriate safeguards to ensure anonymity, call data records from mobile phones may provide "an unprecedented opportunity" to determine how travel affects disease transmission. Studies of malaria and rubella in Kenya showed call data improved the understanding of the spatial transmission of those diseases.

  • Online news articles and health bulletins from public health agencies were manually extracted and modeled to elucidate transmission patterns for two recent outbreaks -- the Ebola epidemic in West Africa and a Middle East Respiratory Syndrome outbreak in South Korea. Internet findings were in line with traditional data, providing proof of concept that the approach can be generalized and automated across a variety of online sources to generate information on disease transmission.

  • Researchers also describe the benefits of a novel, publicly available epidemic simulation data management system, called epiDMS, which provides storage and indexing services for large data simulation sets, as well as search functionality and data analysis to aid decision makers during healthcare emergencies.

While the new hybrid models that combine traditional and digital disease surveillance methods show promise, the scientists agree there is still an overall scarcity of reliable surveillance information, especially compared to other fields such as climatology, where the data sets are huge. "To be able to produce accurate forecasts, we need better observational data that we just don't have in infectious diseases," notes Professor Shweta Bansal of Georgetown University, a co-editor of the supplement. "There's a magnitude of difference between what we need and what we have, so our hope is that big data will help us fill this gap."

Multi-disciplinary initiatives such as the NIH-led Big Data to Knowledge program will be instrumental in expanding the use of big data in research, as noted in the supplement.

The publication's authors include scientists affiliated with Fogarty's Research and Policy for Infectious Diseases program (RAPIDD), grantees from NIH's National Institute of General Medical Sciences, and researchers from nearly 20 universities throughout North America and Europe. The supplement was produced with support from Georgia State University, the Fogarty International Center, Northeastern University and Georgetown University.

About the Fogarty International Center: The Center addresses global health challenges through innovative and collaborative research and training programs, and supports and advances the NIH mission through international partnerships.

About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases.

NIH/Fogarty International Center
