Re-learning how to read a genome

November 10, 2014

Cold Spring Harbor, NY - There are roughly 20,000 genes and thousands of other regulatory "elements" stored within the three billion letters of the human genome. Genes encode information that is used to create proteins, while other genomic elements help regulate the activation of genes, among other tasks. Somehow all of this coded information within our DNA needs to be read by complex molecular machinery and transcribed into messages that can be used by our cells.

Usually, reading a gene is thought to be a lot like reading a sentence. The reading machinery is guided to the start of the gene by various sequences in the DNA - the equivalent of a capital letter - and proceeds from left to right, DNA letter by DNA letter, until it reaches a sequence that forms a punctuation mark at the end. The capital letter and punctuation marks that tell the cell where, when, and how to read a gene are known as regulatory elements.

But scientists have recently discovered that genes aren't the only messages read by the cell. In fact, many regulatory elements themselves are also read and transcribed into messages, the equivalent of pronouncing the words "capital letter," "comma," or "period." Even more surprising, genes are read bi-directionally from so-called "start sites" - in effect, generating messages in both forward and backward directions.

With all these messages, how does the cell know which one encodes the information needed to make a protein? Is there something different about the reading process at genes and regulatory elements that helps avoid confusion? New research, published today in Nature Genetics, has revealed that the initial steps of the reading process itself are actually remarkably similar at both genes and regulatory elements. The main differences seem to occur after this initial step, in the length and stability of the messages. Gene messages are long and stable enough to ensure that genes become proteins, whereas regulatory messages are short and unstable, and are rapidly "cleaned up" by the cell.

To make the distinction, the team, which was co-led by CSHL Professor Adam Siepel and Cornell University Professor John Lis, looked for differences between the initial reading processes at genes and a set of regulatory elements called enhancers. "We took advantage of highly sensitive experimental techniques developed in the Lis lab to measure newly made messages in the cell," says Siepel. "It's like having a new, more powerful microscope for observing the process of transcription as it occurs in living cells."

Remarkably, the team found that the reading patterns for enhancer and gene messages are highly similar in many respects, sharing a common architecture. "Our data suggests that the same basic reading process is happening at genes and these non-genic regulatory elements," explains Siepel. "This points to a unified model for how DNA transcription is initiated throughout the genome."

Working together, the biochemists from Lis's laboratory and the computer jockeys from Siepel's group carefully compared the patterns at enhancers and genes, combining their own data with vast public data sets from the NIH's Encyclopedia of DNA Elements (ENCODE) project. "By many different measures, we found that the patterns of transcription initiation are essentially the same at enhancers and genes," says Siepel. "Most RNA messages are rapidly targeted for destruction, but the messages at genes that are read in the right direction - those destined to be a protein - are spared from destruction." The team was able to devise a model to mathematically explain the difference between stable and unstable transcripts, offering insight into what defines a gene. According to Siepel, "Our analysis shows that the 'code' for stability is, in large part, written in the DNA, at enhancers and genes alike."

This work has important implications for the evolutionary origins of new genes, according to Siepel. "Because DNA is read in both directions from any start site, every one of these sites has the potential to generate two protein-coding genes with just a few subtle changes. The genome is full of potential new genes."
-end-
This work was supported by the National Institutes of Health.

"Analysis of transcription start sites from nascent RNA identifies a unified architecture of initiation regions at mammalian promoters and enhancers" appears online in Nature Genetics on November 10, 2014. The authors are: Leighton Core, André Martins, Charles Danko, Colin Waters, Adam Siepel, and John Lis. The paper can be obtained online at: http://dx.doi.org/10.1038/ng.3142

About Cold Spring Harbor Laboratory

Founded in 1890, Cold Spring Harbor Laboratory (CSHL) has shaped contemporary biomedical research and education with programs in cancer, neuroscience, plant biology and quantitative biology. CSHL is ranked number one in the world by Thomson Reuters for the impact of its research in molecular biology and genetics. The Laboratory has been home to eight Nobel Prize winners. Today, CSHL's multidisciplinary scientific community is more than 600 researchers and technicians strong, and its Meetings & Courses program hosts more than 12,000 scientists from around the world each year at its Long Island campus and its China center. For more information, visit http://www.cshl.edu.
