ENCODE consortium publishes scientific strategy

October 21, 2004

BETHESDA, Md., Thurs., Oct. 21, 2004 - A research consortium organized by the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health (NIH), today published a paper in the journal Science detailing the scientific rationale and strategy behind its quest to produce a comprehensive catalog of all parts of the human genome crucial to biological function. Also today, NHGRI announced the award of $5.5 million in technology development grants to provide new tools for the pioneering effort.

In a peer-reviewed article published in the Oct. 22 issue of Science, the ENCyclopedia Of DNA Elements (ENCODE) consortium outlines its plans for achieving its ambitious goal of building a "parts list" of all sequence-based functional elements in the human DNA sequence. The list will include: protein-coding genes; non-protein-coding genes; regulatory elements involved in the control of gene transcription; and DNA sequences that mediate chromosomal structure and dynamics. The ENCODE researchers also anticipate they may uncover additional functional elements that have yet to be recognized.

"Creating this monumental reference work will help us mine and fully utilize the human genome sequence. Such knowledge will lead to a far deeper understanding of human biology and stimulate the development of new strategies for improving human health," said NHGRI Director Francis S. Collins, M.D., Ph.D.

While the completion of the Human Genome Project in April 2003, and the publication of the finished human genome sequence in Nature just this week, marked significant scientific achievements, these are only the first steps toward the ultimate goal of using information about the human genome sequence to diagnose, treat and prevent disease. Over the past several years, researchers have made major strides in using DNA sequence data to help find genes, which are the parts of the genome that code for proteins. The protein-coding component of these genes, however, makes up just a small fraction of the human genome - about 1.5 percent. There is strong evidence that other parts of the genome have important functions, but very little information exists about where these other "functional elements" are located and how they work. The ENCODE project aims to fill this critical gap in genomics research.

Launched in September 2003, the ENCODE project is being implemented in three phases: a pilot phase, a technology development phase and a production phase. In the pilot phase, which is expected to last three years, ENCODE researchers are devising and testing high-throughput ways of efficiently applying known approaches to identify functional elements. Their collaborative efforts are centered on 44 DNA targets, which together cover about 1 percent of the human genome, or about 30 million base pairs. The target regions were strategically selected to provide a representative cross section of the entire human genome sequence. Simultaneously, in the technology development phase, other research groups are striving to develop new methods and technologies that can be applied to the ENCODE project. Guided by the results of the first two phases, NHGRI will decide how to initiate the production phase and expand the ENCODE project to analyze the remaining 99 percent of the human genome.

"Major challenges lie ahead on the road to a complete encyclopedia of DNA elements," said Elise A. Feingold, Ph.D., NHGRI's program director in charge of the ENCODE project. "Such work is well beyond the scope of any single group. However, by bringing together researchers with a broad range of interests and expertise to work in a highly collaborative setting, we expect that the ENCODE consortium will have the power to achieve a goal of this magnitude."

Among the many hurdles facing the ENCODE consortium is the complexity of the problem. No single experimental approach can be used to identify all functional elements, and many current methods may not provide a cost-effective means of finding functional elements in a target as large as the human genome. Furthermore, many functional elements are only active in certain types of cells or at certain stages of development, which means it may be necessary to analyze many different types of human cells. In addition, if a truly comprehensive inventory is to be created, more work needs to be done to learn about functional elements not surveyed in the pilot project, including centromeres (the middles of chromosomes) and telomeres (the ends of chromosomes). In their Science article, ENCODE researchers set forth their plans for addressing these and other challenges.

NHGRI has designated the ENCODE project as a community resource project, which means that all data generated for this project will be deposited in free, public databases as soon as they are experimentally verified. "During the Human Genome Project, our policy of rapid data release enabled researchers to take advantage of human genomic sequence data as soon as they were produced. Similarly, the ENCODE consortium will make valuable data rapidly available for use by scientists around the world," said Mark S. Guyer, Ph.D., director of NHGRI's Division of Extramural Research.

Also today, NHGRI announced the award of a second set of ENCODE technology development grants, which are intended to complement the first set of technology development grants made in 2003 by adding more novel methods and technologies to the consortium's "tool box." "These grants are aimed at broadening the types of functional elements that we are studying under ENCODE and also expanding the portfolio of technologies that we can apply to them," said Peter Good, Ph.D., NHGRI's program director for genome informatics.

Recipients of the 2004 ENCODE Technology Development Grants and their total approximate funding are:

The ENCODE consortium currently comprises several research teams in the United States, as well as groups in Canada, Singapore, Spain and the United Kingdom. The collaborative effort is open to all interested researchers in academia, government and industry who agree to abide by the consortium's guidelines.

For more detailed information on the ENCODE project, including a complete list of participants and the consortium's data release and accessibility policies, go to: www.genome.gov/ENCODE. ENCODE data that can be directly linked to genomic sequence will be made available at the University of California, Santa Cruz ENCODE Genome Browser (www.genome.ucsc.edu/ENCODE) and the ENSEMBL Browser (www.ensembl.org).

NHGRI is one of 27 institutes and centers at NIH, an agency of the Department of Health and Human Services. The NHGRI Division of Extramural Research supports grants for research and for training and career development at sites nationwide. Additional information about NHGRI can be found at: www.genome.gov.

NIH/National Human Genome Research Institute
