Evolution reveals a link between DNA and protein shape

December 07, 2011

Fifty years after the pioneering discovery that a protein's three-dimensional structure is determined solely by the sequence of its amino acids, an international team of researchers has taken a major step toward fulfilling a tantalizing promise: predicting the structure of a protein from its sequence alone. This advance could open a series of doors for previously intractable research into important biological processes and development of novel therapeutic drugs.

The team from Harvard Medical School (HMS), Politecnico di Torino / Human Genetics Foundation Torino (HuGeF) and Memorial Sloan-Kettering Cancer Center in New York (MSKCC) reported their results on Dec. 7 in the journal PLoS ONE.

In molecular biology and biomedical engineering, knowing the shape of protein molecules is key to understanding how they perform the work of life, to uncovering the mechanisms of disease and to designing drugs. Normally scientists determine the shape of protein molecules through expensive and complicated experiments, but for most proteins these experiments have not yet been done, leaving many crucial biological questions unanswered.

In principle, this problem could be solved by computing a protein's shape simply from its sequence, which is relatively easy to determine from its DNA. But despite limited success for some smaller proteins, this challenge has remained essentially unsolved. The difficulty lies in the astronomically large number of possible shapes for each protein; without any shortcuts, it would take a supercomputer many years to explore all of these options and find the right one for even a small protein.
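To give a feel for the scale of that search, here is a rough back-of-the-envelope sketch in Python. The three constants (shapes per amino acid, chain length and computing speed) are illustrative assumptions chosen for this example, not figures from the study.

```python
# Back-of-the-envelope estimate of an exhaustive search over protein shapes.
# All three constants are illustrative assumptions, not figures from the study.
STATES_PER_RESIDUE = 3        # assumed number of coarse shapes per amino acid
CHAIN_LENGTH = 100            # assumed length of a small protein
SHAPES_PER_SECOND = 10**15    # assumed evaluation rate of a very fast computer

SECONDS_PER_YEAR = 365 * 24 * 3600


def exhaustive_search_years(states, length, rate):
    """Years needed to enumerate states**length shapes at `rate` shapes per second."""
    total_shapes = states ** length
    return total_shapes / rate / SECONDS_PER_YEAR


years = exhaustive_search_years(STATES_PER_RESIDUE, CHAIN_LENGTH, SHAPES_PER_SECOND)
print(f"{years:.1e} years")
```

Under these assumed numbers the estimate comes out vastly longer than the age of the universe, which is why a shortcut of some kind is needed.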

"Experimental structure determination has a hard time keeping up with the explosion in genetic sequence information," said Debora Marks, a mathematical biologist in the Department of Systems Biology at HMS, who worked closely with Lucy Colwell, a mathematician who recently moved from Harvard to Cambridge University. The two researchers collaborated with physicists Riccardo Zecchina and Andrea Pagnani in Torino in a team effort initiated by Marks and computational biologist Chris Sander of the Computational Biology Program at MSKCC, who had earlier attempted a similar solution to the problem when substantially fewer sequences were available.

The international team tested a bold premise: that evolution can provide a roadmap to how a protein folds. Their approach combined three key elements: evolutionary information accumulated over many millions of years; data from high-throughput genetic sequencing; and a key method from statistical physics, co-developed in the Torino group with Martin Weigt, who recently moved to the University of Paris.

"Collaboration was key," Sander said. "As with many important discoveries in science, no one could provide the answer in isolation."

Using the accumulated evolutionary information, in the form of the sequences of thousands of proteins grouped into families of proteins likely to have similar shapes, the team developed an algorithm to infer which parts of a protein interact to determine its shape. With these internal protein interactions in hand, the researchers used widely used molecular simulation software developed by Axel Brunger at Stanford University to generate the atomic details of the protein shape.
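The idea of reading interactions out of a protein family can be illustrated with a much simpler stand-in. The sketch below is not the team's algorithm, which relies on a method from statistical physics; it uses plain mutual information between positions in a set of aligned sequences, a simpler measure of which positions tend to change together. The function names and the four-sequence toy alignment are hypothetical and exist only for this example.

```python
from collections import Counter
from itertools import combinations
from math import log


def mutual_information(column_a, column_b):
    """Mutual information (in nats) between two alignment columns."""
    n = len(column_a)
    freq_a = Counter(column_a)
    freq_b = Counter(column_b)
    freq_ab = Counter(zip(column_a, column_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log(p_ab * n * n / (freq_a[a] * freq_b[b]))
    return mi


def rank_column_pairs(alignment):
    """Return ((i, j), score) for all column pairs, highest score first."""
    columns = list(zip(*alignment))
    scores = {
        (i, j): mutual_information(columns[i], columns[j])
        for i, j in combinations(range(len(columns)), 2)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy family: position 0 never changes, position 2 varies on
# its own, and positions 1 and 3 always change together.
TOY_ALIGNMENT = ["MKAE", "MKLE", "MRAD", "MRLD"]

ranked = rank_column_pairs(TOY_ALIGNMENT)
best_pair, best_score = ranked[0]
print(best_pair, round(best_score, 3))
```

In this toy family the highest-scoring pair is the one whose positions vary together, which is the kind of signal that can hint at two parts of a protein being in contact. A pairwise score like this can also be misled by indirect correlations, one reason a more global statistical model is attractive for real protein families.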

Using this process, the team was able for the first time to compute remarkably accurate shapes from sequence information alone for a test set of 15 diverse proteins, with no protein size limit in sight.

"Alone, none of the individual pieces are completely novel, but apparently nobody had put all of them together to predict 3D protein structure," Colwell said.

The researchers caution that their method does have some weaknesses. Experimental structures, when available, generally are more accurate in atomic detail, and the method works only when researchers have genetic data for large protein families. However, advances in DNA sequencing have yielded a torrent of such data, which is forecast to continue growing exponentially for the foreseeable future.

The next step, the researchers say, is to predict the structures of unsolved proteins currently being investigated experimentally, before exploring the large uncharted territory of currently unknown protein structures. "Synergy between computational prediction and experimental determination of structures is likely to yield increasingly valuable insight into the large universe of protein shapes that crucially determine their function and evolutionary dynamics," Sander said.

Citation: Marks DS, Colwell LJ, Sheridan R, Hopf TA, Pagnani A, et al. (2011) Protein 3D Structure Computed from Evolutionary Sequence Variation. PLoS ONE 6(12): e28766. doi:10.1371/journal.pone.0028766

Funding: CS and RS have support from the Dana Farber Cancer Institute-Memorial Sloan-Kettering Cancer Center Physical Sciences Oncology Center (NIH U54-CA143798). LC is supported by an Engineering and Physical Sciences Research Council fellowship (EP/H028064/1). TH has support from the German National Academic Foundation. RZ has support from European Community grant 267915. No other financial support was received for the research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

This press release was written by Roger Alan Leo in the HMS Communications office.

LINK TO THE FREELY AVAILABLE ARTICLE: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0028766

