Neural network fills in data gaps for spatial analysis of chromosomes

November 07, 2019

PITTSBURGH--Computational methods used to fill in missing pixels in low-quality images or video also can help scientists provide missing information for how DNA is organized in the cell, computational biologists at Carnegie Mellon University have shown.

Filling in this missing information will make it possible to more readily study the 3D structure of chromosomes and, in particular, subcompartments that may play a crucial role in both disease formation and determining cell functions, said Jian Ma, associate professor in CMU's Computational Biology Department.

In a research paper published today by the journal Nature Communications, Ma and Kyle Xiong, a CMU Ph.D. student in the CMU-University of Pittsburgh Joint Ph.D. Program in Computational Biology, report that they successfully applied their machine learning method to nine cell lines. This enabled them, for the first time, to study differences in spatial organization related to subcompartments across those lines.

Previously, subcompartments could be revealed in only a single lymphoblastoid cell line, known as GM12878, that has been exhaustively sequenced at great expense using Hi-C technology, which measures spatial interactivity among all regions of the genome.

"We now know a lot about the linear composition of DNA in chromosomes, but in the nuclei of human cells, DNA isn't linear," Xiong said. "Chromosomes in the cell nucleus are folded and packaged into 3D shapes. That 3D structure is critical to understanding the cellular functions in development and diseases." Subcompartments are of particular interest because they reflect spatial segregation of chromosome regions with high interactivity.

Scientists are eager to learn more about the juxtaposition of subcompartments and how it affects cell function, Ma said. But until now, researchers could calculate the patterns of subcompartments only if they had an extremely high-coverage Hi-C dataset -- that is, one in which the DNA had been sequenced in great detail to capture more interactions. That level of detail is missing in the datasets for cell lines other than GM12878.

Working with Ma, Xiong used an artificial neural network called a denoising autoencoder to help fill in the gaps in less-than-complete Hi-C datasets. In computer vision applications, the autoencoder can supply missing pixels by learning what types of pixels typically are found together and making its best guess. Xiong adapted the autoencoder to high-throughput genomics, using the dataset for GM12878 to train it to recognize which pairs of DNA sequences from different chromosomes are likely to be interacting with each other in 3D space in the cell nucleus.
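
For readers unfamiliar with the technique, the general idea of a denoising autoencoder can be illustrated with a toy sketch. The example below is not SNIPER and does not reflect the authors' architecture, data or training procedure. It assumes a small synthetic low-rank matrix as a stand-in for real contact data, randomly zeroes out entries to mimic missing measurements, and trains a one-hidden-layer network to map the incomplete rows back to the complete ones. All function names, layer sizes and settings are hypothetical choices made for illustration.

```python
import numpy as np


def make_low_rank(rng, n_rows=200, n_cols=20, rank=3):
    """Synthetic low-rank matrix; a stand-in for real data (assumption)."""
    left = rng.normal(size=(n_rows, rank))
    right = rng.normal(size=(rank, n_cols))
    return left @ right / np.sqrt(rank)


def drop_entries(rng, x, drop_prob=0.3):
    """Zero out a random fraction of entries to mimic missing measurements."""
    keep = rng.random(x.shape) >= drop_prob
    return x * keep


def train_denoising_autoencoder(x_clean, x_noisy, hidden=8, lr=0.05,
                                epochs=3000, seed=0):
    """Full-batch gradient descent on mean squared reconstruction error.

    The network sees the corrupted rows but is scored against the clean rows,
    which is what makes it a *denoising* autoencoder.
    """
    rng = np.random.default_rng(seed)
    n_cols = x_clean.shape[1]
    w1 = rng.normal(scale=0.1, size=(n_cols, hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.1, size=(hidden, n_cols))
    b2 = np.zeros(n_cols)
    losses = []
    for _ in range(epochs):
        # Forward pass: encode with tanh, decode linearly.
        h = np.tanh(x_noisy @ w1 + b1)
        out = h @ w2 + b2
        err = out - x_clean
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradients of the mean squared error.
        g_out = 2.0 * err / err.size
        g_h = g_out @ w2.T
        g_pre = g_h * (1.0 - h ** 2)
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x_noisy.T @ g_pre)
        b1 -= lr * g_pre.sum(axis=0)
    return (w1, b1, w2, b2), losses


def reconstruct(params, x_noisy):
    """Fill in a corrupted matrix using the trained network."""
    w1, b1, w2, b2 = params
    return np.tanh(x_noisy @ w1 + b1) @ w2 + b2


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = make_low_rank(rng)
    x_noisy = drop_entries(rng, x)
    params, losses = train_denoising_autoencoder(x, x_noisy)
    print("first loss:", losses[0], "last loss:", losses[-1])
```

In this sketch the network can only recover dropped entries because the synthetic rows share a low-rank structure, which loosely mirrors the article's point that the autoencoder learns which values tend to occur together. A real application would train on one richly measured dataset and apply the model to sparser ones, and would need held-out data to judge how well the filled-in values hold up.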

This computational method, which Ma and Xiong have dubbed SNIPER, proved successful in identifying subcompartments in eight cell lines whose interchromosomal interactions based on Hi-C data were only partially known. They also applied SNIPER to the GM12878 data as a control. But Xiong noted that it is not yet known how widely this tool can be applied to other cell types. He and Ma are continuing to enhance the method so that it can be used under a variety of cellular conditions and even in different organisms.

"We need to understand how subcompartment patterns are involved in the basic functions of cells, as well as how mutations can affect these 3D structures," Ma said. "Thus far, in the few cell lines we've been able to study, we see that some subcompartments are consistent across cell types, while others vary. Much remains to be learned."
-end-
The National Institutes of Health and the National Science Foundation supported this work.

Carnegie Mellon University
