Science Current Events | Science News | Brightsurf.com
 

Establishing standard definitions for genome sequences

October 09, 2009

In 1996, researchers from major genome sequencing centers around the world convened on the island of Bermuda and defined a finished genome as a gapless sequence with a nucleotide error rate of no more than one in 10,000 bases. This effectively set the quality target for the human genome effort and was quickly applied to other genome projects. A genome sequence that did not meet this stringent criterion was simply considered a "draft."
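
The Bermuda criterion can be read as a simple two-part test: no gaps, and no more than one error per 10,000 bases. The sketch below is only an illustration of that arithmetic; the function names and the example figures are hypothetical, and the Phred-scale conversion is a common sequencing convention that the article itself does not mention.

```python
import math

# One error per 10,000 bases, per the 1996 definition described above.
BERMUDA_BASES_PER_ERROR = 10_000


def error_rate_to_phred(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled score (Q = -10 * log10(p))."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in the interval (0, 1]")
    return -10 * math.log10(error_rate)


def meets_bermuda_standard(gap_count: int, estimated_errors: int, total_bases: int) -> bool:
    """Return True if a sequence is gapless and has at most one error per 10,000 bases.

    All three inputs are assumed to be supplied by the caller; estimating the
    number of errors in a real assembly is outside the scope of this sketch.
    """
    if total_bases <= 0:
        raise ValueError("total_bases must be positive")
    # Integer comparison avoids floating-point rounding at the threshold.
    within_error_budget = estimated_errors * BERMUDA_BASES_PER_ERROR <= total_bases
    return gap_count == 0 and within_error_budget


if __name__ == "__main__":
    # Hypothetical 4.6-million-base genome with 400 estimated errors and no gaps.
    print(meets_bermuda_standard(gap_count=0, estimated_errors=400, total_bases=4_600_000))
    print(round(error_rate_to_phred(1 / BERMUDA_BASES_PER_ERROR)))
```

Under this reading, a single remaining gap is enough to make a sequence a "draft" no matter how low its error rate is, which is the all-or-nothing behavior the rest of the article describes as too coarse.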

More than a decade later, researchers are finding that, with the advent of the latest sequencing technologies, the terms "draft" and "finished" are no longer sufficient to describe the varying levels of genome sequence quality being produced. Quality matters to any researcher who wants to use a sequence and needs to know its integrity and reliability. It matters even more for reference genome sequences, such as those from genome projects conducted in support of the U.S. Department of Energy (DOE) missions of bioenergy and environmental clean-up, because they provide the foundational knowledge of gene content and of how these organisms interact with the environment.




As the proverbial "fire hose of data" becomes a Niagara torrent, with conservative estimates of 12,000 draft genomes hitting the public databases by 2012, researchers may be surprised to find that these datasets describe genomes that are not complete. Recognizing the problem, a group of researchers from several sequencing centers, including the DOE Joint Genome Institute (JGI), the Sanger Institute and the Human Microbiome Project (HMP) Jumpstart Consortium sequencing institutes, has proposed a new set of standards that expand upon the so-called "Bermuda standard." In the October 9 issue of the journal Science, they propose four additional categories between "draft" and "finished" status that reflect varying levels of completeness.

"In the past we've been limited to two options, requiring us and the other centers to come up with internal definitions," said DOE JGI metagenomics researcher Patrick Chain at Los Alamos National Laboratory (LANL), first author of the Science paper. "But these are not clear and they're not propagated to the databases to which we submit sequences. So when users try to download genomes they get data of unknown quality with no information, or a complete genome that they assume has been checked for missing-data errors."

Chain said that when he and the other organizers of the Sequencing, Finishing, Analysis in the Future meeting hosted by LANL first gathered in 2005, they were concerned by the varying quality of the new genomes being submitted to public archives. Because the meeting organizers represented major sequencing centers as well as smaller groups, these concerns led them to initiate the genome projects standards group at LANL.

The six categories defined by the group include:

* "Standard draft," which is the minimum amount of information needed for submission to a public database;
* "High quality draft," which is typically generated by large sequencing centers such as DOE JGI, and which has little or no manual review;
* "Improved high quality draft," which consists of data reviewed by either people or machines to some extent so most of the genetic data is assembled correctly, but some errors may still be present;
* "Annotation-directed improvement," which is a sequenced segment that presents all the information in various gene regions as accurately as possible;
* "Noncontiguous finished," which includes sequences that have been reviewed by both people and machines and would be considered complete except for "recalcitrant regions" that are proving problematic;
* "Finished," which defines complete sequences that have minimal errors, if any.
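
Because the six categories run from least to most complete, one way to picture them is as an ordered scale that a user could filter on. The following is a minimal sketch under that assumption; the `AssemblyStatus` name, the numeric values, and the example records are hypothetical and are not taken from the Science paper or from any database schema.

```python
from enum import IntEnum


class AssemblyStatus(IntEnum):
    """The six proposed categories, ordered from least to most complete.

    The numeric values are an assumption made for this sketch; they only
    encode the order in which the categories are listed above.
    """
    STANDARD_DRAFT = 1
    HIGH_QUALITY_DRAFT = 2
    IMPROVED_HIGH_QUALITY_DRAFT = 3
    ANNOTATION_DIRECTED_IMPROVEMENT = 4
    NONCONTIGUOUS_FINISHED = 5
    FINISHED = 6


def meets_minimum(status: AssemblyStatus, minimum: AssemblyStatus) -> bool:
    """Return True if a genome's status is at least as complete as the minimum."""
    return status >= minimum


def filter_by_status(records: dict, minimum: AssemblyStatus) -> list:
    """Return the names of records whose status meets the minimum, sorted by name."""
    return sorted(name for name, status in records.items() if meets_minimum(status, minimum))


if __name__ == "__main__":
    # Hypothetical records: genome name -> declared status.
    records = {
        "genome_a": AssemblyStatus.STANDARD_DRAFT,
        "genome_b": AssemblyStatus.NONCONTIGUOUS_FINISHED,
        "genome_c": AssemblyStatus.FINISHED,
    }
    print(filter_by_status(records, AssemblyStatus.IMPROVED_HIGH_QUALITY_DRAFT))
```

A center that uses only some of the gradations could work with a subset of these members while leaving their relative order unchanged.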

DOE JGI's Chris Detter, one of the paper's senior authors, and head of the LANL Genome Science group, said that the definitions provided in the Science paper are fairly flexible because the group wanted the proposed standards to apply regardless of the genome project or sequencing technologies employed.

"My hope is all the major genome centers and advanced genomics groups use the gradations that fit their needs," he said. "Some centers may want all six, while some may only want three, but as long as they keep them intact we are in good shape. Then, my hope is that the smaller genomics groups adopt the classes as written to help the rest of the scientific community know what they are generating and submitting."

Chain added that coming up with the proposed standards was not an easy task, since all major centers "have different pipelines, different sequencing techniques, different internal standards." The group also recognized that the attempt to develop a "one size fits all" set of standards is still a work in progress, which is why the definitions are meant to be flexible enough to meet each group's needs.

"We do expect that a number of people will comment on these standards, and possibly expand on the categories," he said, "but we feel we've covered all the bases with these six categories."

Chain said the group plans to team with the Genomic Standards Consortium, a grassroots movement begun by scientists who were concerned about the need for data collection standards in genome projects. The group has also talked with public archives such as GenBank about appending these proposed standards to GenBank entries so that researchers can tell whether the sequences will be useful to them. "Standards are a major issue to be tackled in genomics right now," Chain said. "These proposals are guideposts meant to inform users and generators."

DOE/Joint Genome Institute



© 2009 BrightSurf.com