Swiss army knife for genome research

November 26, 2019

It is the dream of every molecular geneticist: an easy-to-use program that compares data sets from different cellular conditions, identifies enhancer regions and then assigns them to their target genes. A research team led by Martin Vingron at the Max Planck Institute for Molecular Genetics in Berlin has now developed a program that masters all of this.

"DNA is pretty boring, since it is practically the same in every cell," says Martin Vingron, Director and Head of the Department of Bioinformatics at the Max Planck Institute for Molecular Genetics in Berlin. "When the genome is like the book of life, I am most interested in the side notes." These "notes" are small chemical marks attached to the DNA molecule that do not alter the genetic information itself, but influence what happens to the DNA at the respective site. In other words, these marks have an epigenetic effect. They serve as regulators of genomic regions that are responsible for the activation and deactivation of genes, such as promoters and enhancers.

In many complex diseases, the epigenetic control of genes does not work correctly, which makes these regulatory regions of great interest to scientists. Analyzing them in the lab, however, is often cumbersome, time-consuming and complicated. That's why Vingron and his team decided to develop a new program package called Condition-specific Regulatory Units Prediction (CRUP), which simplifies the analysis and solves several practical problems.

"We wanted to combine the common steps in the process of enhancer prediction in a simple, universal program," says bioinformatician Verena Heinrich who developed the package. CRUP simplifies the analysis in many regards. The machine learning algorithm is not limited to specific cell or tissue types. It does not need to be recalibrated prior to each analysis of a data set, and it allows the comparative study of several data series. Despite this versatility, the tool, which was developed by Heinrich and doctoral student Anna Ramisch, remains easy to use.

The enhancer's stimulating activity

CRUP specifically identifies and characterizes enhancers - DNA segments that stimulate or "enhance" the transcription of genes. These regions attract proteins that attach to promoter sequences, which function as a switch for each gene. However, which enhancer controls which genes, and at what time, often remains a mystery. "Enhancers and their associated genes can be located far away from each other," says Heinrich. "This makes it difficult for us to assign the regulatory sequences to their respective targets."

The genome contains hundreds of thousands of enhancers, which are active in different phases in the life of a cell, such as growth, maintenance, or disease. When the DNA is tightly packed like a wool thread on spools of carrier proteins called histones, the regulatory sequences are in a "resting" state. They only become active through chemical modifications to the histone proteins. Sections of DNA then unwrap from the clusters, become exposed, and are accessible to molecules that activate genes. The analysis of histone proteins by chromatin immunoprecipitation (ChIP) in tandem with DNA sequencing then reveals which enhancers are active and which are not.

In three steps to a complete analysis

These ChIP data are the input for the newly developed program. CRUP first examines all sequences and decides for each one whether it is an enhancer or not. The classification algorithm is based on artificial intelligence and was trained with information from mouse embryonic stem cells. Even so, it also detects enhancer regions in many other animal species and tissues, as Heinrich and her colleagues demonstrated on a diverse set of data provided by the German Epigenome Program (DEEP).

In the second step, CRUP can be fed multiple data sets and the program finds where they differ. This makes it possible to interpret a series of measurements or pinpoint differences between tissues. Epigenetic changes to enhancers become apparent - over time, or when comparing healthy and diseased tissues. The third and final step of the analysis is the mapping of genes to their respective enhancers. "We asked: What part of the genome is active at the same time in the same place?" explains Heinrich. To achieve this, CRUP links the enhancer analysis with transcription data that reveal which genes are active, and experiments that tell which parts of the DNA strand are close to each other.
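The idea behind the third step can be illustrated with a small toy example. The following Python sketch is not CRUP's own code; it only assumes, for illustration, that an enhancer and a gene are linked when they lie in the same stretch of spatially close DNA and their activity rises and falls together across conditions. All names, numbers, the use of Pearson correlation and the `min_corr` threshold are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equally long lists of numbers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no information
    return cov / (sx * sy)


def link_enhancers_to_genes(enhancers, genes, min_corr=0.9):
    """Pair each enhancer with genes in the same domain whose expression
    follows the enhancer's activity across conditions.

    Both inputs map a name to (domain_label, values_per_condition).
    The domain label stands in for "close to each other in the nucleus".
    """
    links = []
    for e_name, (e_domain, e_activity) in enhancers.items():
        for g_name, (g_domain, g_expr) in genes.items():
            if e_domain != g_domain:
                continue  # not in spatial proximity, so no link
            r = pearson(e_activity, g_expr)
            if r >= min_corr:
                links.append((e_name, g_name, round(r, 3)))
    return links


# Made-up values for three conditions (hypothetical data).
enhancers = {
    "enh_A": ("domain_1", [0.1, 0.5, 0.9]),
    "enh_B": ("domain_2", [0.8, 0.2, 0.1]),
}
genes = {
    "gene_X": ("domain_1", [2.0, 6.0, 10.0]),
    "gene_Y": ("domain_1", [9.0, 5.0, 1.0]),
    "gene_Z": ("domain_2", [7.0, 2.5, 1.0]),
}

links = link_enhancers_to_genes(enhancers, genes)
for enhancer, gene, r in links:
    print(enhancer, "->", gene, "correlation", r)
```

In this toy data set, gene_Y sits in the same domain as enh_A but moves in the opposite direction, so it is not assigned to that enhancer. A real analysis would work with many more conditions and would need statistical safeguards that this sketch leaves out.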

Finally, the researchers tested their program in a practical setting. They analyzed the tissue of mice with the immune disease rheumatoid arthritis and compared it with data from healthy animals. CRUP identified more than 200 differences in enhancer regions, some of which had already been associated with the disease in other studies. The genes that CRUP assigned to these enhancers have also been shown to play a role in the disease.

A catalyst for research

"Our program reliably identifies candidates for disease-associated enhancers and links them to their target genes," says Vingron. His team hopes the new tool will make the field more accessible as well as accelerate research to help identify the causes of complex human diseases. "CRUP should be particularly useful for all the research groups that do not have a team of bioinformaticians at hand."

Original publication

Ramisch A, Heinrich V, Glaser LV, Fuchs A, Yang X, Benner P, Schöpflin R, Li N, Kinkley S, Römer-Hillmann A, Longinotto J, Heyne S, Czepukojc B, Kessler SM, Kiemer AK, Cadenas C, Arrigoni L, Gasparoni N, Manke T, Pap T, Pospisilik A, Hengstler J, Walter J, Meijsing SH, Chung HR, Vingron M. CRUP: a comprehensive framework to predict condition-specific regulatory units. Genome Biology 2019 Nov 08; 20:227

