International competition benchmarks metagenomics software

October 02, 2017

Communities of bacteria live everywhere: inside our bodies, on our bodies and all around us. The human gut alone contains hundreds of species of bacteria that help digest food and provide nutrients, but can also make us sick. To learn more about these groups of bacteria and how they impact our lives, scientists need to study them. But this task poses challenges, because taking the bacteria into the laboratory is either impossible or would disrupt the biological processes the scientists wish to study.

To bypass these difficulties, scientists have turned to the field of metagenomics. In metagenomics, researchers use algorithms to piece together DNA from an environmental sample to determine the type and role of bacteria present. Unlike established fields such as chemistry, where researchers evaluate their results against a set of known standards, metagenomics is a relatively young field that lacks such benchmarks.

Mihai Pop, a professor of computer science at the University of Maryland with a joint appointment in the University of Maryland Institute for Advanced Computer Studies, recently helped judge an international challenge called the Critical Assessment of Metagenome Interpretation (CAMI), which benchmarked metagenomics software. The results were published in the journal Nature Methods on October 2, 2017.

"There's no one algorithm that we can say is the best at everything," said Pop, who is also co-director of the Center for Health-related Informatics and Bioimaging at UMD. "What we found was that one tool does better in one context, but another does better in another context. It is important for researchers to know that they need to choose software based on the specific questions they are trying to answer."

The study's results were not surprising to Pop, because of the many challenges metagenomics software developers face. First, DNA analysis is challenging in metagenomics because the recovered DNA often comes from the field, not a tightly controlled laboratory environment. In addition, DNA from many organisms, some of which may not have known genomes, mingles together in a sample, making it difficult to correctly assemble, or piece together, individual genomes. Moreover, DNA degrades in harsh environments.

"I like to think of metagenomics as a new type of microscope," Pop said. "In the old days, you would use a microscope to study bacteria. Now we have a much more powerful microscope, which is DNA sequencing coupled with advanced algorithms. Metagenomics holds the promise of helping us understand what bacteria do in the world. But first we need to tune that microscope."

CAMI's leader invited Pop to help evaluate the submissions by challenge participants because of his expertise in genome and metagenome assembly. In 2009, Pop helped publish Bowtie, one of the most commonly used software packages for aligning DNA sequencing reads to reference genomes. More recently, he collaborated with the University of Maryland School of Medicine to analyze hundreds of thousands of gene sequences as part of the largest, most comprehensive study of childhood diarrheal diseases ever conducted in developing countries.

"We uncovered new, unknown bacteria that cause diarrheal diseases, and we also found interactions between bacteria that might worsen or improve illness," Pop said. "I feel that's one of the most impactful projects I've done using metagenomics."

For the competition, CAMI researchers combined approximately 700 microbial genomes and 600 viral genomes with other DNA sources and simulated how such a collection of DNA might appear in the field. The participants' task was to reconstruct and analyze the genomes of the simulated DNA pool.

CAMI researchers scored the participants' submissions in three areas: how well they assembled the fragmented genomes; how well they "binned," or organized, DNA fragments into related groups to determine the families of organisms in the mixture; and how well they "profiled," or reconstructed, the identity and relative abundance of the organisms present in the mixture. Pop contributed metrics and software for evaluating the submitted assembled genomes.

Nineteen teams submitted 215 entries using six genome assemblers, nine binners and 10 profilers to tackle this challenge.

The results showed that for assembly, algorithms that pieced together a genome using different lengths of smaller DNA fragments outperformed those that used DNA fragments of a fixed length. However, no assembler did well at separating distinct but closely related genomes.

For the binning task, the researchers found tradeoffs in how accurately the software programs identified the group to which a particular DNA fragment belonged, versus how many DNA fragments the software assigned to any groups. This result suggests that researchers need to choose their binning software based on whether accuracy or coverage is more important. In addition, the performance of all binning algorithms decreased when samples included multiple related genomes.
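The tradeoff described above can be illustrated with a minimal sketch. It is not CAMI's evaluation code: the function name, the data layout and the two scores (share of assigned fragments that are correct, and share of all fragments that were assigned) are assumptions chosen only to show why a cautious binner and a greedy binner can each look better on a different measure.

```python
def binning_tradeoff(assignments, truth):
    """Return (accuracy, coverage) for one hypothetical binner.

    assignments: dict mapping fragment id -> predicted group, or None if unassigned
    truth:       dict mapping fragment id -> true group
    Both scores are illustrative assumptions, not the CAMI metrics.
    """
    # Keep only fragments the binner actually placed in a group.
    assigned = {f: g for f, g in assignments.items() if g is not None}
    correct = sum(1 for f, g in assigned.items() if truth.get(f) == g)
    # Accuracy: of the fragments that were assigned, how many were right?
    accuracy = correct / len(assigned) if assigned else 0.0
    # Coverage: of all fragments, how many were assigned at all?
    coverage = len(assigned) / len(truth) if truth else 0.0
    return accuracy, coverage


truth = {"f1": "A", "f2": "A", "f3": "B", "f4": "B"}
cautious = {"f1": "A", "f2": None, "f3": "B", "f4": None}  # assigns only when sure
greedy = {"f1": "A", "f2": "B", "f3": "B", "f4": "B"}      # assigns everything

print(binning_tradeoff(cautious, truth))
print(binning_tradeoff(greedy, truth))
```

In this toy example the cautious binner makes no mistakes but leaves half of the fragments unassigned, while the greedy binner assigns every fragment and gets one wrong.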

In profiling, software tools fell into two groups: those that better recovered the relative abundance of bacteria in the sample, and those that better detected organisms, even at very low quantities. However, the tools in the second group identified the wrong organism more often.
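The two profiling goals can be scored separately, as the following minimal sketch suggests. The function name and the three scores (total abundance error, and detection precision and recall) are assumptions made for illustration and are not taken from the CAMI paper.

```python
def profile_scores(predicted, truth):
    """Return (abundance_error, precision, recall) for one hypothetical profiler.

    predicted, truth: dicts mapping organism name -> relative abundance
    (fractions of the sample). All three scores are illustrative assumptions.
    """
    organisms = set(predicted) | set(truth)
    # Abundance error: sum of absolute differences across all organisms.
    abundance_error = sum(
        abs(predicted.get(o, 0.0) - truth.get(o, 0.0)) for o in organisms
    )
    detected = {o for o, a in predicted.items() if a > 0}
    present = {o for o, a in truth.items() if a > 0}
    true_positives = len(detected & present)
    # Precision: share of reported organisms that are really in the sample.
    precision = true_positives / len(detected) if detected else 0.0
    # Recall: share of organisms in the sample that the profiler reported.
    recall = true_positives / len(present) if present else 0.0
    return abundance_error, precision, recall


truth = {"A": 0.5, "B": 0.5}
# Finds both real organisms but also reports "C", which is not in the sample.
predicted = {"A": 0.5, "B": 0.25, "C": 0.25}
print(profile_scores(predicted, truth))
```

Here the hypothetical profiler detects every organism that is present, but one of the three organisms it reports is wrong, which is the kind of error the article attributes to the more sensitive tools.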

Going forward, Pop said the CAMI group will continue to run new challenges with different data sets and new evaluations aimed at more specific aspects of software performance. Pop is excited to see scientists use the benchmarks to address research questions in the laboratory and the clinic.

"The field of metagenomics needs standards to ensure that results are correct, well validated and follow best practices," Pop said. "For instance, if a doctor is going to stage an intervention based on results from metagenomic software, it's essential that those results be correct. Our work provides a roadmap for choosing appropriate software."

This work was led by Alice McHardy of the Department for Computational Biology of Infection Research at the Helmholtz Centre for Infection Research and the Braunschweig Integrated Centre of Systems Biology in Braunschweig, Germany.

This work was supported by an Engineering and Physical Sciences Research Council Grant (Award No. EP/K032208/1), a U.S. Department of Energy contract (Award No. DEAC02-05CH11231) and the Cluster of Excellence on Plant Sciences program funded by the Deutsche Forschungsgemeinschaft. The content of this article does not necessarily reflect the views of these organizations.

The research paper, "Critical Assessment of Metagenome Interpretation - a benchmark of computational metagenomics software," Alice McHardy et al., was published in the journal Nature Methods on October 2, 2017.

Media Relations Contact:

Irene Ying

University of Maryland
College of Computer, Mathematical, and Natural Sciences
2300 Symons Hall
College Park, MD 20742

About the College of Computer, Mathematical, and Natural Sciences

The College of Computer, Mathematical, and Natural Sciences at the University of Maryland educates more than 7,000 future scientific leaders in its undergraduate and graduate programs each year. The college's 10 departments and more than a dozen interdisciplinary research centers foster scientific discovery with annual sponsored research funding exceeding $150 million.

University of Maryland
