
Biological experiments become transparent - anywhere, any time

February 14, 2017

Biological experiments are generating increasingly large and complex sets of data. This has made it difficult to reproduce experiments at other research laboratories to confirm - or refute - the results. The difficulty lies not only in the complexity of the data, but also in the elaborate computer programs and systems needed to analyse them. Scientists from the Luxembourg Centre for Systems Biomedicine (LCSB) of the University of Luxembourg have now developed a new bioinformatics tool that will make the analysis of biological and biomedical experiments more transparent and reproducible.

The tool was developed under the direction of Prof. Paul Wilmes, head of the LCSB group Eco-Systems Biology, in close cooperation with the LCSB Bioinformatics Core. A paper describing the tool has been published in the highly ranked open access journal Genome Biology. The new bioinformatics tool, called IMP, is also available to researchers online.

Biological and biomedical research is being inundated with a flood of data as new studies delve into increasingly complex subjects - such as the entire microbiome of the gut - using ever-faster automated techniques that allow so-called high-throughput experiments. Experiments that not long ago had to be carried out laboriously by hand can now be repeated swiftly and systematically, almost as often as needed. Analytical methods for interpreting these data have yet to catch up with the trend. "Each time you use a different method to analyse these complex systems, something different comes out of it," says Paul Wilmes. Every laboratory uses its own computational programs, and these are often kept secret. The computational methods also change frequently, sometimes simply because of a new operating system. "So it is extremely difficult, and often even impossible, to reproduce certain results at a different lab," Wilmes explains. "Yet, that is the very foundation of science: an experiment must be reproducible anywhere, any time, and must lead to the same results. Otherwise, we couldn't draw any meaningful conclusions from it."

The scientists at LCSB are now helping to rectify this situation. An initiative called "R3 - Reproducible Research Results" has been launched at the LCSB Bioinformatics Core. "With R3, we want to enable scientists around the world to increase the reproducibility and transparency of their research - through systematic training, through the development of methods and tools, and through establishing the necessary infrastructure," says Dr. Reinhard Schneider, head of the Bioinformatics Core.

The insights from the R3 initiative then feed into projects such as IMP. "IMP is a reproducible pipeline for the analysis of highly complex data," says Dr. Shaman Narayanasamy, a co-author of the study who has just completed his doctorate on this subject in Paul Wilmes' group. "We preserve computer programs in the very state in which they delivered certain experimental data. From this quasi-frozen state, we can later thaw the programs out again if the data ever need reprocessing, or if new data need to be analysed in the same way." The scientists also aggregate the different components of the analytical software into so-called containers. These can be combined in different ways without risking interference between the different program parts.
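
The principle can be illustrated with a short sketch. The snippet below is not IMP's own code; it is a minimal illustration, assuming a Docker installation, of how an analysis step can be pinned to an immutable container image so that exactly the same software state can be "thawed" and re-run later. The image name and digest are hypothetical placeholders:

```python
import subprocess

# Hypothetical example: the analysis step is pinned to an immutable image
# digest rather than a floating tag, so the software is effectively frozen
# in the state that produced the original results.
IMAGE = "example.org/assembly-step@sha256:<digest>"  # placeholder, not a real image

def rerun_frozen_step(input_dir: str, output_dir: str) -> None:
    """'Thaw' the preserved environment and re-run the analysis inside it."""
    subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{input_dir}:/data/in:ro",  # raw data mounted read-only
            "-v", f"{output_dir}:/data/out",   # results written here
            IMAGE,                             # the frozen environment
        ],
        check=True,  # fail loudly if the step cannot be reproduced
    )
```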

"The subprograms in the containers can be stringed together in series as needed," says the first author of the study, Yohan Jarosz of the Bioinformatics Core. This creates a pipeline for the data to flow through. Because the computational operators are frozen in containers, one does not need reference data to know the conditions - e.g. type of operating system or computer processor - under which to perform the analysis. "The whole process remains entirely open and transparent," says Jarosz. Every scientist can thus modify any step of the program - of course diligently recording every part of the process in a logbook to ensure full traceability.

Paul Wilmes is especially interested in using this method to analyse metagenomic and metatranscriptomic data. Such data are produced, for example, when researching entire bacterial communities in the human gut or in wastewater treatment plants. Knowing the full complement of DNA in a sample and all the gene products, the scientists can determine which bacterial species are present and active in the gut or treatment plant. What is more, they can also tell how big the population of each bacterial species is, what substances it produces at a given point in time, and what influences the organisms have on one another.
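
As a toy illustration of that last point (not IMP's actual computation; the read counts below are invented), combining DNA-based and RNA-based counts for the same sample separates how abundant a species is from how active it is:

```python
# Hypothetical per-species read counts from a metagenomic (DNA) and a
# metatranscriptomic (RNA) experiment on the same sample.
dna_reads = {"Species A": 9000, "Species B": 800, "Species C": 200}
rna_reads = {"Species A": 1500, "Species B": 4000, "Species C": 50}

total_dna = sum(dna_reads.values())
for species, dna in dna_reads.items():
    abundance = dna / total_dna          # how big the population is
    activity = rna_reads[species] / dna  # transcripts per gene copy: how active it is
    print(f"{species}: {abundance:.1%} of community, activity index {activity:.2f}")
```

In this invented example, Species B makes up only a small fraction of the community yet shows by far the highest activity index, the kind of distinction the DNA/RNA comparison is meant to reveal.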

The catch, until recently, was that researchers at other laboratories have had a hard time reproducing the experimental results. With IMP, that has now changed, Wilmes continues: "We have already put data from other laboratories through the first tests with IMP. The results are clear: We can reproduce them - and our computations in IMP bring far more details to light than the original studies did, for example identifying genes that play a crucial role in the metabolism of bacterial communities."

"Thanks to IMP, only standardised and reproducible methods are now used in microbiome research at LCSB - from the wet lab, where the experiments are done, to the dry lab, where above all computer simulations and models are run. We have an internationally pioneering role in this," says Wilmes. "Thanks to R3, IMP also sets standards which other institutes, not only LCSB, will surely be interested to apply," adds Reinhard Schneider of the Bioinformatics Core. "We therefore make the technology of other researchers openly available - the standard ought to be quickly adopted. Only reproducible analyses of results will advance biomedicine in the long term."
-end-


University of Luxembourg
