Neural nets supplant marker genes in analyzing single cell RNA sequencing

November 13, 2018

PITTSBURGH--Computer scientists at Carnegie Mellon University say neural networks and supervised machine learning techniques can efficiently characterize cells that have been studied using single cell RNA-sequencing (scRNA-seq). This finding could help researchers identify new cell subtypes and differentiate between healthy and diseased cells.

Rather than rely on marker genes, which are not available for all cell types, this new automated method analyzes all of the scRNA-seq data to select just those parameters that can differentiate one cell from another. This enables the analysis of all cell types and provides a method for comparative analysis of those cells.

Researchers from CMU's Computational Biology Department explain their method today in the online journal Nature Communications. They also describe a web server called scQuery that makes the method usable by all researchers.

Over the past five years, single cell sequencing has become a major tool for cell researchers. In the past, researchers could only obtain DNA or RNA sequence information by processing batches of cells, providing results that only reflected average values of the cells. Analyzing cells one at a time, by contrast, enables researchers to identify subtypes of cells, or to see how a healthy cell differs from a diseased cell, or how a young cell differs from an aged cell.

This type of sequencing will support the National Institutes of Health's new Human BioMolecular Atlas Program (HuBMAP), which is building a 3D map of the human body that shows how tissues differ on a cellular level. Ziv Bar-Joseph, professor of computational biology and machine learning and a co-author of today's paper, leads a CMU-based center contributing computational tools to that project.

"With each experiment yielding hundreds of thousands of data points, this is becoming a Big Data problem," said Amir Alavi, a Ph.D. student in computational biology who was co-lead author of the paper with post-doctoral researcher Matthew Ruffalo. "Traditional analysis methods are insufficient for such large scales."

Alavi, Ruffalo and their colleagues developed an automated pipeline that attempts to download all public scRNA-seq data available for mice -- identifying the genes and proteins expressed in each cell -- from the largest data repositories, including the NIH's Gene Expression Omnibus (GEO). The cells were then labeled by type and processed via a neural network, a computer system modeled on the human brain. By comparing all of the cells with each other, the neural net identified the parameters that make each cell distinct.
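
As a rough illustration of the idea in the paragraph above (labeled cells go into a neural network, which learns a compact set of parameters that separates cell types), here is a minimal Python sketch. It is not the researchers' code: the toy data, the network size, and the function names `make_toy_cells`, `train_embedding_net`, `embed` and `predict` are all assumptions made for this example, and the published method may differ substantially.

```python
import numpy as np

def make_toy_cells(n_per_type=60, n_genes=50, n_types=3, seed=1):
    """Synthetic stand-in for labeled expression data (not real scRNA-seq)."""
    rng = np.random.default_rng(seed)
    centers = rng.normal(0, 1, (n_types, n_genes))
    X = np.vstack([c + rng.normal(0, 0.5, (n_per_type, n_genes)) for c in centers])
    y = np.repeat(np.arange(n_types), n_per_type)
    return X, y

def train_embedding_net(X, y, n_hidden=8, epochs=500, lr=0.1, seed=0):
    """Train a one-hidden-layer softmax classifier with plain gradient descent."""
    rng = np.random.default_rng(seed)
    n_genes, n_types = X.shape[1], int(y.max()) + 1
    W1 = rng.normal(0, 0.1, (n_genes, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, n_types))
    b2 = np.zeros(n_types)
    Y = np.eye(n_types)[y]                      # one-hot cell-type labels
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                # hidden layer = compact representation
        logits = H @ W2 + b2
        logits -= logits.max(axis=1, keepdims=True)
        P = np.exp(logits)
        P /= P.sum(axis=1, keepdims=True)       # softmax probabilities per cell type
        G = (P - Y) / len(X)                    # gradient of mean cross-entropy w.r.t. logits
        GH = (G @ W2.T) * (1 - H ** 2)          # backpropagate through tanh
        W2 -= lr * H.T @ G
        b2 -= lr * G.sum(axis=0)
        W1 -= lr * X.T @ GH
        b1 -= lr * GH.sum(axis=0)
    return W1, b1, W2, b2

def embed(X, W1, b1):
    """Map cells to the learned low-dimensional representation."""
    return np.tanh(X @ W1 + b1)

def predict(X, W1, b1, W2, b2):
    """Predict a cell-type index for each cell."""
    return np.argmax(embed(X, W1, b1) @ W2 + b2, axis=1)

X, y = make_toy_cells()
params = train_embedding_net(X, y)
accuracy = (predict(X, *params) == y).mean()
print(f"training accuracy: {accuracy:.2f}")
```

In this sketch the eight hidden units play the role of the "parameters that make each cell distinct": each cell's 50 toy expression values are reduced to 8 numbers that the network found useful for telling the types apart.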

The researchers tested this model using scRNA-seq data from a mouse study of a disease similar to Alzheimer's. As would be expected, the analysis showed similar levels of brain cells in both healthy and diseased samples, while the diseased samples included substantially more immune cells, such as macrophages, generated in response to the disease.

The researchers used their pipeline and methods to create scQuery, a web server that can speed comparative analysis of new scRNA-seq data. Once a researcher submits a single cell experiment to the server, the group's neural networks and matching methods can quickly identify related cell subtypes and point to earlier studies of similar cells.
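
One plausible way to picture the matching step is a nearest-neighbor lookup: a new cell is placed in the same compact representation as the reference cells, and the closest reference cells suggest its type and the studies it resembles. The Python sketch below assumes that approach; the release does not say how scQuery's matching actually works, and the function names, the 2-D coordinates, the labels and the study IDs are all hypothetical.

```python
import numpy as np
from collections import Counter

def nearest_reference_cells(query, reference, k=3):
    """Return indices of the k reference rows closest to `query` (Euclidean distance)."""
    dists = np.linalg.norm(reference - query, axis=1)
    return np.argsort(dists)[:k]

def summarize_matches(indices, cell_types, study_ids):
    """Majority cell type among the matches, plus the studies they came from."""
    top_type = Counter(cell_types[i] for i in indices).most_common(1)[0][0]
    studies = sorted({study_ids[i] for i in indices})
    return top_type, studies

# Hypothetical reference set: made-up 2-D embeddings, labels and study IDs.
reference = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1], [5.0, 5.0], [5.1, 4.9]])
cell_types = ["neuron", "neuron", "neuron", "macrophage", "macrophage"]
study_ids = ["study_A", "study_A", "study_B", "study_C", "study_C"]

# A newly submitted cell, already mapped into the same 2-D space.
query = np.array([4.8, 5.2])
idx = nearest_reference_cells(query, reference, k=2)
top_type, studies = summarize_matches(idx, cell_types, study_ids)
print(top_type, studies)  # macrophage ['study_C']
```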
-end-
In addition to Ruffalo, Alavi and Bar-Joseph, authors of the research paper include Aiyappa Parvangada and Zhilin Huang, both graduate students in computational biology. The National Institutes of Health, the National Science Foundation, the Pennsylvania Department of Health and the James S. McDonnell Foundation supported this work.

Carnegie Mellon University