
The grammar of cell development branching time

May 29, 2019

One of the greatest achievements of science in recent years is the technology for obtaining information about thousands of individual cells extracted from an organism. This technology includes the so-called "omics" for individual cells (genomics, epigenomics, transcriptomics, proteomics), which give us the data about the genomes of thousands of single cells, the state and activity of various genes in them, as well as the presence of various proteins in these cells.

It is convenient to represent the data about each cell as a point in a high-dimensional space. As a result, using the new technology, scientists around the world obtain thousands of points (cells) in a space of enormous dimension.

Data analysis methods, including topological and geometric analysis, "topological grammars", the "principal graphs method" and "data approximation", are an important element of this new technology for acquiring data about living organisms, a field that is huge in terms of both investment and the number of players. Such data open up colossal and not yet fully realized opportunities for the development of biology and personalized medicine.

The idea of branching development time allows one to convert the resulting mountains of data into a more understandable, readable and interpretable form. We can imagine that each cell lies on some development trajectory. These trajectories can branch at the point where the cell in its development makes the choice of one future variant from several possible ones. Geometrically, these development trajectories with bifurcation points represent the branching development time.
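As a rough illustration of what "branching time" means geometrically, the sketch below assumes that a tree of trajectories is already given and assigns each cell a position on it: the cell is projected onto the nearest edge, and its branching time is the path length along the tree from a chosen root to that projection. The function names, the toy Y-shaped tree and the choice of root are assumptions made for this example; this is not the STREAM implementation.

```python
import math
from collections import deque


def project_on_segment(p, a, b):
    """Return (distance from p to segment ab, fraction t in [0, 1] along ab)."""
    ab = [y - x for x, y in zip(a, b)]
    ap = [y - x for x, y in zip(a, p)]
    denom = sum(c * c for c in ab)
    t = 0.0 if denom == 0 else sum(u * v for u, v in zip(ap, ab)) / denom
    t = max(0.0, min(1.0, t))  # clamp the projection to the segment
    proj = [x + t * c for x, c in zip(a, ab)]
    return math.dist(p, proj), t


def node_distances(nodes, edges, root):
    """Path length along the tree from the root to every node."""
    adj = {i: [] for i in range(len(nodes))}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)
    dist = {root: 0.0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + math.dist(nodes[u], nodes[v])
                queue.append(v)
    return dist


def branching_time(points, nodes, edges, root=0):
    """For each point return (time along the tree, edge it was mapped to)."""
    dist = node_distances(nodes, edges, root)
    result = []
    for p in points:
        best = None
        for a, b in edges:
            if dist[a] > dist[b]:  # orient the edge away from the root
                a, b = b, a
            d, t = project_on_segment(p, nodes[a], nodes[b])
            if best is None or d < best[0]:
                length = math.dist(nodes[a], nodes[b])
                best = (d, dist[a] + t * length, (a, b))
        result.append((best[1], best[2]))
    return result


# Toy Y-shaped tree (hypothetical data): a trunk 0-1 and two branches 1-2, 1-3.
nodes = [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0), (2.0, -1.0)]
edges = [(0, 1), (1, 2), (1, 3)]
cells = [(0.5, 0.1), (2.0, 1.0), (2.0, -1.0)]
for time, edge in branching_time(cells, nodes, edges):
    print(round(time, 3), edge)
```

Two cells that are equally far from the root can still lie on different branches, which is why each cell is described by a time value together with the edge it was mapped to.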

A new technology for extracting this branching time from the data was developed by a large international team of researchers, including 15 scientists from 6 countries: the USA, China, France, Italy, the UK and Russia.

Complex trees are constructed using elementary transformation grammars. At each step of the basic algorithm, the elementary transformation that gives the greatest gain in the quality of data approximation is chosen.
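The greedy step described above can be sketched in a few lines. The example below is a deliberately simplified illustration, not the ElPiGraph or STREAM code: it assumes only two elementary transformations ("bisect an edge" and "attach a new leaf to a node"), it measures approximation quality by the mean squared distance from each data point to its nearest node, and it omits the optimization of node positions that a real implementation would perform after every step. All function names and the toy data are hypothetical.

```python
import math


def mse(points, nodes):
    """Mean squared distance from each data point to its nearest node."""
    return sum(min(math.dist(p, n) ** 2 for n in nodes) for p in points) / len(points)


def centroid(points):
    dim = len(points[0])
    return tuple(sum(p[i] for p in points) / len(points) for i in range(dim))


def nearest(p, nodes):
    """Index of the node closest to point p."""
    return min(range(len(nodes)), key=lambda j: math.dist(p, nodes[j]))


def candidates(points, nodes, edges):
    """Yield (nodes, edges) for every graph reachable by one transformation."""
    new = len(nodes)  # index the added node will receive
    # Rule 1 (assumed): bisect an edge by inserting a node at its midpoint.
    for k, (a, b) in enumerate(edges):
        mid = tuple((x + y) / 2 for x, y in zip(nodes[a], nodes[b]))
        yield nodes + [mid], edges[:k] + edges[k + 1:] + [(a, new), (new, b)]
    # Rule 2 (assumed): attach a leaf to a node, placed at the farthest
    # data point among those that the node currently approximates.
    for i, node in enumerate(nodes):
        owned = [p for p in points if nearest(p, nodes) == i]
        if owned:
            far = max(owned, key=lambda p: math.dist(p, node))
            yield nodes + [far], edges + [(i, new)]


def grow_tree(points, n_steps):
    """Start from one node and greedily apply the best transformation."""
    nodes, edges = [centroid(points)], []
    for _ in range(n_steps):
        # Choose the transformation with the greatest gain in approximation.
        nodes, edges = min(candidates(points, nodes, edges),
                           key=lambda graph: mse(points, graph[0]))
    return nodes, edges


# Toy Y-shaped data set (hypothetical): a trunk and two diverging branches.
points = ([(float(t), 0.0) for t in range(-5, 1)]
          + [(float(t), float(t)) for t in range(1, 6)]
          + [(float(t), float(-t)) for t in range(1, 6)])
nodes, edges = grow_tree(points, n_steps=6)
print(len(nodes), "nodes,", len(edges), "edges")
```

Each transformation adds exactly one node and one net edge, so the graph remains a tree throughout, and because nodes are only ever added, the approximation error in this sketch cannot increase from one step to the next.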

The method of topological grammars for processing complex data of a general nature was proposed as early as 2007 by Professor Alexander Gorban (Great Britain, currently supervising the implementation of a megagrant at the Lobachevsky State University of Nizhny Novgorod) and his former student Andrei Zinovyev (France, currently collaborating with the Lobachevsky University in the implementation of the megagrant).

"The concept of branching time (or, as it is often called, pseudo-time) arises in biology in the following way: cells and events that occur to them are placed along a certain graph (or, more formally, a one-dimensional continuum, since a graph is a discrete object). This branching continuum plays the same role in the analysis of developmental and differentiation events as linear time in other areas (a scale for event placement). No mystery or modification of physical time is involved. People have introduced this concept and many of them use it. It is convenient. The topology of this scale is extracted from data analysis. Next, the data is mapped on this scale," explains Alexander Gorban.

This method was developed further as part of a broad international cooperation organized by L. Pinello from Harvard University and was used to create a specialized software product, STREAM, which builds the branching time of cell development from the "omics" data of single cells.

"Just imagine, fairly recently, we learned with great delight and a sense of wonder about the decoding of the human genome. The proposed new technology makes it possible to determine the status and activity of genes and other important data simultaneously for tens of thousands of cells taken from the body. This information will be determined for each of them individually rather than by giving some sort of average values. Thus, extremely important information will be provided about the development of an individual organism and the origination of various diseases such as cancer. However, one must be able to read, decipher and extract useful information from such data. We provide a tool for working with these data and extracting important information from them," continues Alexander Gorban.

"Both the graph and its embedding in the data space are built simultaneously, and then this embedding is used to map all the data. That is, a highly multidimensional space (with the dimension of hundreds of thousands) is reduced to a branching one-dimensional continuum," concludes Alexander Gorban.

An article describing the method and the first results of its application has been published in the journal Nature Communications:

H Chen, L Albergante, JY Hsu, CA Lareau, G Lo Bosco, J Guan, S Zhou, AN Gorban, DE Bauer, MJ Aryee, DM Langenau, A Zinovyev, JD Buenrostro, GC Yuan, and L Pinello, Single-cell trajectories reconstruction, exploration and mapping of omics data with STREAM, Nature Communications, volume 10, Article number: 1903 (2019).

The STREAM software, its ElPiGraph compute core, and other project-related programs are freely available online.

Lobachevsky University
