Penn scientists identify patterns of RNA regulation in the nuclei of plants

December 31, 2014

When the human genome was first sequenced, experts predicted they would find about 100,000 genes. The actual number has turned out to be closer to 20,000, just a few thousand more than fruit flies have. The question logically arose: how can a relatively small number of genes lay the blueprint for the complexities of the human body?

The explanation is that genes are subject to many and varied forms of regulation that can alter the form of the protein a gene encodes and can determine whether, and how much of, a gene product is made. Much of this regulation occurs during and just after DNA is transcribed into RNA.

In a new study done in plants, University of Pennsylvania biologists built on earlier work in which they cataloged all the interactions that occur between RNA and the proteins that bind to it. This time, they looked exclusively at these interactions in the nuclei and simultaneously obtained data about the nuclear RNA molecules' structure. By combining these datasets, the researchers obtained a global view of the patterns that can affect the various RNA regulatory processes that occur before these molecules move into the cytoplasm, where they are translated into the proteins that make up a living organism.

In addition, the researchers have provided a vast, publicly available set of data that other scientists can use to address questions about any genes and regulatory mechanisms that interest them, gaining a better understanding of the dynamics of the journey from DNA to protein.

Brian D. Gregory, an assistant professor in Penn's School of Arts & Sciences' Department of Biology, was senior author on the work, which will appear in the journal Molecular Cell. Sager J. Gosai, a research specialist, and Shawn W. Foley, a graduate student, both members of Gregory's lab, were co-first authors. Additional contributors from Penn included Ian M. Silverman, a graduate student in the Gregory lab, along with Fevzi Daldal, a professor in the Department of Biology, and Nur Selamoglu of the Daldal lab. The Penn researchers teamed with Emory University's Dongxue Wang and Roger B. Deal and University of Arizona's Andrew D. L. Nelson and Mark A. Beilstein to conduct the study.

Earlier this year in Genome Biology, Gregory's team reported on a method they developed to obtain a complete catalog of the interactions in live organisms between RNA and RNA-binding proteins, or RBPs, which interact with RNA transcripts to repress, enhance or otherwise alter gene expression in a cell-type-specific manner. The technique is called PIP-seq, for protein interaction profile sequencing. Their initial demonstration of PIP-seq identified the full complement of RBP interaction sites in a human cell line.

In the current work, they used the commonly studied plant Arabidopsis thaliana to map out all of the RBP interaction sites as well as compile a full look at the secondary structure of the RNA transcripts. Unlike the first study, which looked at all the RNA in the cell, a set of material known as the transcriptome, this study looked only in the nucleus.

"By focusing specifically on the nucleus we can get away from all of the features on RNA molecules that are associated with the process of translation into proteins, which occurs in the cytoplasm," Gregory said.

The researchers extracted nuclei from 10-day-old Arabidopsis seedlings. They performed PIP-seq and also obtained information on the secondary structure of the RNA--how the strands of RNA fold, loop or bind together.

Focusing on sections of RNA that bind to RBPs, the team found that these sequences have been conserved over evolutionary time and likely play an important role in gene regulatory mechanisms.

The scientists also found a strong inverse relationship between patterns of RBP binding and secondary structure.

"When structure is low, proteins tend to bind those regions and when structure is high, RBPs tend to not bind those regions," Gregory said. "Time and time again, we've seen that the structural context, and not just the RNA sequence, is a selective force in RBP binding."

Another significant finding was that unique patterns of RBP binding and structure are present around the start codon of each messenger RNA transcript, which is where a cell's protein-making machinery begins the process of translating RNA into proteins.

"This is suggesting that there is a regulatory event happening here even before the RNA comes out of the nucleus and engages with the translation machinery," Gosai said. "It's an exciting place for future studies to start with and figure out what regulation events are happening in the nucleus."

Two key forms of transcript regulation are alternative splicing, in which pieces of RNA undergo a cut-and-paste process to generate new sequences that can code for various proteins, and alternative polyadenylation, which alters where a transcript ends and an adenine "tail" is added, a process that can enhance either stabilization or degradation of the RNA molecule.

In their analysis, the Penn biologists found that RBP-binding sites and certain patterns of secondary structure were much more common at sites where alternative splicing and alternative polyadenylation occur.

"In humans, almost 95 percent of genes are alternatively spliced, and the number is at least 60 percent in plants," said Foley. "To see high levels of RBP binding and an interplay with secondary structure at sites of alternative splicing and polyadenylation in plants is a good indication of where and how regulation is occurring to produce different proteins from one RNA sequence."

As in their previous study using PIP-seq, Gregory and his colleagues identified recurring patterns, known as "motifs," of RNA sequences at sites that tended to be bound by certain RBPs. It's possible, the researchers noted, that these groups of RBPs could bind functionally related genes to coordinate their regulation.

Finally, the team zoomed in on one RBP-bound sequence motif that was particularly abundant and found that it interacted with an RBP called CP29A.

"This protein was known to bind RNA in the chloroplast, but we were able to identify it as a nuclear RBP for the first time," Foley said, suggesting CP29A may be an important regulatory factor in both organelles.

To follow up on this work, the Penn scientists will examine how RNA regulation differs in plant tissues at different developmental stages. They also plan to use PIP-seq and structural analyses to study other types of organisms.

"Now that we've found beautiful patterns that mark alternative splicing and other events that shape the protein-coding capacity of plants, we're going to go in and identify the proteins that lead to those," Gregory said. "And eventually we'd like to go into humans and other organisms and ask if we see similar patterns."

The research was supported by grants from the National Science Foundation and the National Institute of General Medical Sciences. All data from this and other studies from Brian Gregory's lab can be accessed at

University of Pennsylvania
