Genome superheroes unmask

August 13, 2001

In June 2000, in a legendary bit of scientific heroism, a graduate student at the University of California, Santa Cruz stitched together a massive collection of DNA sequence pieces to create the first public draft of the human genome sequence. The student, James Kent, had worked around the clock to write a computer program that could assemble the draft in time for the Human Genome Project (HGP) to announce its completion on June 26, 2000.

Now, Kent and his colleague David Haussler describe the creation of that computer program, revealing the surprisingly simple ideas behind the most important puzzle-solving exercise in recent history. The program, called GigAssembler, had to trim and assemble the nearly 400,000 pieces of human DNA sequence generated by the HGP over a decade.

To perform this daunting task, GigAssembler used a so-called "greedy" algorithm, which joins the best-fitting sequence pieces first. GigAssembler can consult a wide variety of information to determine how pieces fit, including sequence overlap, gene data, and "maps" generated by the Human Genome Project. For example, if two sequence segments encode parts of the same gene, GigAssembler counts that as evidence that they fit together.
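The "best fit first" idea can be illustrated with a minimal Python sketch. This is not GigAssembler's code: it is a toy greedy merger that scores fit by exact sequence overlap alone and ignores the gene data and maps mentioned above. The function names, the `min_len` threshold, and the example fragments are all assumptions made for illustration.

```python
from itertools import permutations


def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` characters exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len: int = 3):
    """Toy greedy assembly: repeatedly merge the best-overlapping pair.

    Hypothetical illustration only; fit is scored by exact overlap length.
    """
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    while len(frags) > 1:
        best_k, best_a, best_b = 0, None, None
        # Examine every ordered pair and remember the largest overlap.
        for a, b in permutations(frags, 2):
            k = overlap(a, b, min_len)
            if k > best_k:
                best_k, best_a, best_b = k, a, b
        if best_k == 0:
            break  # no remaining pair fits; leave separate pieces
        frags.remove(best_a)
        frags.remove(best_b)
        # Join the pair, writing the shared region only once.
        frags.append(best_a + best_b[best_k:])
    return frags


if __name__ == "__main__":
    pieces = ["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG", "GCCGGAATAC"]
    print(greedy_assemble(pieces))
```

With these four made-up fragments, the sketch joins the pieces into the single sequence `ATTAGACCTGCCGGAATAC`. A greedy strategy commits to each merge as soon as it is made, which is why a real assembler also needs ways to resolve conflicting fits.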

Using these principles, as well as cleverly designed strategies to resolve conflicting fits, GigAssembler successfully assembled the first public genome draft containing 2.7 billion base pairs (88% of the genome). Since then, GigAssembler has performed further assemblies incorporating up to 92% of the human genome.
-end-
Contact (author): David Haussler
University of California, Santa Cruz
Santa Cruz, CA 95064
USA
Haussler@cse.ucsc.edu

Also this month, Genome Research features a commentary on Kent and Haussler's work, entitled "Assembling Puzzles from Preassembled Blocks," by Pavel Pevzner.

Contact: Pavel Pevzner
University of California, San Diego
La Jolla, CA 92093
USA
Ppevzner@cs.ucsd.edu

Cold Spring Harbor Laboratory
