Neural network fills in data gaps for spatial analysis of chromosomes

November 07, 2019

PITTSBURGH--Computational methods used to fill in missing pixels in low-quality images or video also can help scientists provide missing information for how DNA is organized in the cell, computational biologists at Carnegie Mellon University have shown.

Filling in this missing information will make it possible to more readily study the 3D structure of chromosomes and, in particular, subcompartments that may play a crucial role in both disease formation and determining cell functions, said Jian Ma, associate professor in CMU's Computational Biology Department.

In a research paper published today by the journal Nature Communications, Ma and Kyle Xiong, a CMU Ph.D. student in the CMU-University of Pittsburgh Joint Ph.D. Program in Computational Biology, report that they successfully applied their machine learning method to nine cell lines. This enabled them, for the first time, to study differences in spatial organization related to subcompartments across those lines.

Previously, subcompartments could be revealed in only a single lymphoblastoid cell line, GM12878, which has been exhaustively sequenced at great expense using Hi-C technology, a method that measures spatial interactivity among all regions of the genome.

"We now know a lot about the linear composition of DNA in chromosomes, but in the nuclei of human cells, DNA isn't linear," Xiong said. "Chromosomes in the cell nucleus are folded and packaged into 3D shapes. That 3D structure is critical to understanding the cellular functions in development and diseases." Subcompartments are of particular interest because they reflect spatial segregation of chromosome regions with high interactivity.

Scientists are eager to learn more about the juxtaposition of subcompartments and how it affects cell function, Ma said. But until now researchers could calculate the patterns of subcompartments only if they had an extremely high coverage Hi-C dataset -- that is, the DNA had been sequenced in great detail to capture more interactions. That level of detail is missing in the datasets for cell lines other than GM12878.

Working with Ma, Xiong used an artificial neural network called a denoising autoencoder to help fill in the gaps in less-than-complete Hi-C datasets. In computer vision applications, the autoencoder can supply missing pixels by learning what types of pixels typically are found together and making its best guess. Xiong adapted the autoencoder to high-throughput genomics, using the dataset for GM12878 to train it to recognize which pairs of DNA regions on different chromosomes typically interact with each other in 3D space in the cell nucleus.
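To make the idea of a denoising autoencoder concrete, here is a minimal sketch in Python with NumPy. It is not the SNIPER code and does not use Hi-C data: the contact profiles, layer sizes, learning rate and every function name below are assumptions chosen for illustration. Each synthetic row follows one of a few repeating patterns, half of its entries are blanked out, and a small network is trained to reproduce the complete row from the incomplete one.

```python
import numpy as np

def make_prototypes(n_patterns=4, n_features=32):
    """Toy contact profiles (hypothetical): pattern k is high where bit k of the column index is set."""
    cols = np.arange(n_features)
    return np.array([np.where((cols >> k) & 1, 0.9, 0.1) for k in range(n_patterns)])

def sample_rows(rng, prototypes, n_rows, noise=0.05):
    """Draw rows that each follow one prototype pattern plus a little Gaussian noise."""
    labels = rng.integers(0, len(prototypes), size=n_rows)
    rows = prototypes[labels] + noise * rng.standard_normal((n_rows, prototypes.shape[1]))
    return np.clip(rows, 0.0, 1.0)

def drop_entries(rng, rows, drop_prob=0.5):
    """Simulate missing data by zeroing a random subset of entries."""
    keep = rng.random(rows.shape) >= drop_prob
    return rows * keep

def train(rng, clean, hidden=16, epochs=2000, lr=0.05, drop_prob=0.5):
    """Full-batch gradient descent on squared error between the output and the clean rows."""
    n, d = clean.shape
    w1 = 0.1 * rng.standard_normal((d, hidden))
    b1 = np.zeros(hidden)
    w2 = 0.1 * rng.standard_normal((hidden, d))
    b2 = np.zeros(d)
    for _ in range(epochs):
        noisy = drop_entries(rng, clean, drop_prob)   # fresh corruption every pass
        h = np.tanh(noisy @ w1 + b1)                  # encoder
        out = h @ w2 + b2                             # linear decoder
        g_out = (out - clean) / n                     # gradient of the loss w.r.t. the output
        g_h = (g_out @ w2.T) * (1.0 - h ** 2)         # backpropagate through tanh
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (noisy.T @ g_h)
        b1 -= lr * g_h.sum(axis=0)
    return w1, b1, w2, b2

def denoise(params, noisy):
    """Reconstruct complete rows from rows with missing entries."""
    w1, b1, w2, b2 = params
    return np.tanh(noisy @ w1 + b1) @ w2 + b2

def run_demo(seed=0):
    """Train on one set of rows, then compare errors on held-out rows."""
    rng = np.random.default_rng(seed)
    prototypes = make_prototypes()
    params = train(rng, sample_rows(rng, prototypes, 200))
    held_out = sample_rows(rng, prototypes, 100)
    noisy = drop_entries(rng, held_out)
    masked_err = float(np.mean((noisy - held_out) ** 2))
    denoised_err = float(np.mean((denoise(params, noisy) - held_out) ** 2))
    return masked_err, denoised_err

masked_err, denoised_err = run_demo()
print(f"error with gaps left in: {masked_err:.4f}; error after denoising: {denoised_err:.4f}")
```

In this toy setting, the network is trained on one set of rows and evaluated on rows it has not seen, loosely mirroring the article's description of training on the well-covered GM12878 data and applying the result to sparser datasets.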

This computational method, which Ma and Xiong have dubbed SNIPER, proved successful in identifying subcompartments in eight cell lines whose interchromosomal interactions based on Hi-C data were only partially known. They also applied SNIPER to the GM12878 data as a control. But Xiong noted that it is not yet known how widely this tool can be applied to other cell types. He and Ma are continuing to enhance the method so it can be used under a variety of cellular conditions and even in different organisms.

"We need to understand how subcompartment patterns are involved in the basic functions of cells, as well as how mutations can affect these 3D structures," Ma said. "Thus far, in the few cell lines we've been able to study, we see that some subcompartments are consistent across cell types, while others vary. Much remains to be learned."

The National Institutes of Health and the National Science Foundation supported this work.

Carnegie Mellon University
