
A new approach to sequence and assemble primate genomes

March 31, 2016

Technical advances in reading long DNA sequences have ramifications in understanding primate evolution and human disease.

The genome of the Western lowland gorilla has now been sequenced and assembled at a level of quality that begins to approach that of the mouse and human genomes.

A new sequencing technology based on longer sequence reads allows missing genes and missing forms of genetic variation to be discovered for the first time. This assembly offers new biological insights into a living species that is second only to chimps in its evolutionary closeness to humans.

Reporting in the April 1 edition of Science, researchers led by Evan Eichler, University of Washington professor of genome sciences, explained why previous genome assemblies for the gorilla and other mammals have been fragmented, incomplete, and potentially misleading:

Massively parallel sequencing technologies, while increasing the speed, improving the accuracy, and reducing the cost of genome sequencing, typically produce only short stretches of sequence called "reads." After sequencing, the reads are pieced back together with genome assembly software.

The program attempts to reconstruct the original genome by using the overlap between the sequence reads. Unfortunately, the presence of long, repetitive DNA, which is common in human and other primate genomes, confuses assembly software and causes it to break the genome into very small fragments.
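The effect of repeats on overlap-based assembly can be illustrated with a toy example. The sketch below is not the software used in the study; it assumes idealized, error-free reads of a fixed length taken from every position of a made-up 17-base sequence, and the names `TOY_GENOME`, `all_reads` and `ambiguous_reads` are hypothetical. It reports the reads whose overlap matches more than one possible next read, which is the point where an assembler cannot decide how to continue.

```python
# Toy sequence (an assumption for illustration): the 4-base repeat "TATC"
# occurs twice, each time followed by different DNA.
TOY_GENOME = "ACG" + "TATC" + "GGA" + "TATC" + "CAG"


def all_reads(genome, read_len):
    """Return every distinct substring of length read_len (idealized reads)."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


def ambiguous_reads(genome, read_len):
    """Return reads whose (read_len - 1)-base suffix overlaps the prefix of
    more than one distinct read, so the next step is not uniquely determined."""
    reads = all_reads(genome, read_len)
    ambiguous = []
    for read in reads:
        suffix = read[1:]
        # Candidate next reads: those that start with this read's suffix.
        successors = {other for other in reads if other[:-1] == suffix}
        if len(successors) > 1:
            ambiguous.append(read)
    return sorted(ambiguous)


# Reads no longer than the repeat: the overlap falls inside the repeat,
# so two different continuations look equally valid.
print(ambiguous_reads(TOY_GENOME, 4))  # ['TATC']

# Reads long enough that the overlap extends past the repeat into unique
# sequence: every read has at most one possible continuation.
print(ambiguous_reads(TOY_GENOME, 6))  # []
```

In this toy case the ambiguity disappears once the overlap between reads is longer than the repeat itself. Real genomes contain repeats that are thousands of bases long, and real reads contain errors, so production assemblers are far more elaborate than this sketch.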

"Such assemblies can be like Swiss cheese," Eichler said, "with a lot of missing biological information in the gaps." The original, published Western lowland gorilla genome, created using the short-read technology, he said, was broken into more than 400,000 pieces.

"These gaps are not random, but are clustered at sites of repeats," he said. "If geneticists can't capture these repeats and determine structural differences in genomes, they have problems understanding the organization of genes and comparing genetic variation within and across species."

His team included UW bioinformatics specialists David Gordon and John Huddleston, as well as postdoctoral fellows Mark Chaisson, Chris Hill and Zev Kronenberg. The research team analyzed DNA in a blood sample of a female Western lowland gorilla from Chicago's Lincoln Park Zoo.

The researchers used Single Molecule, Real-Time (SMRT) sequencing technology, the assembly tools Falcon and QUIVER, and other techniques to generate and assemble long sequence reads. These reads were more than a hundred times longer than those produced by the most popular sequencing technologies. The long reads allowed them to traverse most of the repeat regions of the gorilla genome during the assembly.

The result was a new gorilla genome assembly that was larger and had far fewer pieces. Instead of 400,000 fragments, there are now only 1,800 pieces. The average size of the genome fragments was 800 times larger with approximately 90 percent of all gaps in the original assembly closed.

This additional sequencing information, the researchers observed, greatly improved gene annotation for that species of gorilla. It also led to the discovery of thousands of protein- and peptide-coding segments and new regulatory elements that had been missed as part of the first genome assembly.

Differences in how genes are controlled, or even the loss or disruption of certain gene regulatory elements, may explain why human ancestors evolved to be so different from their great ape relatives.

The scientists also found tens of thousands of new structural variants, such as deletions or insertions of DNA, that are likely to be more important than the smaller single base pair differences that were cataloged before. (Base pairs are the two chemicals that bond into a rung on the DNA ladder.)

"My motivation in studying human and great ape genomes," Eichler said, "is to try to learn what makes us tick as a species. I'd like to see a re-doing of all the great ape genomes, including chimpanzee and orangutan, to get a comprehensive view of the genetic variants that distinguish humans from the great apes. I believe there is far more genetic variation than we had previously thought. The first step is finding it."

Among the areas where the researchers have seen intriguing dissimilarities between humans and gorillas are in genes associated with sensory perception, keratin (a skin protein) production, insulin regulation, immunity, reproduction and cell signaling.

The new genome assembly also provides new clues to the evolutionary history of the lowland gorilla. Prior studies have demonstrated that the gorilla population underwent a bottleneck in the not-so-distant past, but analyses with the new genome show that the bottleneck was more severe than previously thought.

Patterns of genetic variation within the gorilla genome can provide evidence of how disease, climate change and human activity affect lowland gorilla populations.

"I think the take home message," Eichler said, "is that the new genome technology and assembly bring us back to the place we should have been 10 years ago."

"Sequencing technology and computational biology," Eichler and his team wrote in their paper, "have now advanced to the stage where individual laboratories can generate high quality genomes of mammals. This capability has the promise to revolutionize our understanding of genome evolution and species biology."

Eichler added that these advances are also likely to contribute greatly to research on the genetic underpinnings of human disease, especially if more human genomes are sequenced in this way.

"As medical researchers, if we depend only on short read sequences, there's a chink in our armor. The work on gorilla and other human genomes clearly demonstrates that large swathes of genetic variation can't be understood with the short sequence-read approaches. Long read sequencing is allowing us to access new levels of genetic variation that were previously inaccessible," he said.

However, he added, "At $80,000 a pop, the price is not yet right today for clinical sequencing of human genomes using the long reads. Given a few years of cost reduction and further advances in technology, I am willing to bet this is the way we will sequence human genomes to discover disease-causing mutations in the future."
The research reported in the Science paper, "Long-read sequence assembly of the gorilla genome," was supported by grants from the National Institutes of Health. Eichler is a Howard Hughes Medical Institute investigator.

Researchers from the University of California, Santa Cruz, Washington University in St. Louis, and Pacific Biosciences of California collaborated with the UW on this project.

University of Washington Health Sciences/UW Medicine
