1000 Genomes Project releases pilot data

June 21, 2010

HOUSTON -- (June 21, 2010) - The completion of three pilot projects, designed to determine how best to build an extremely detailed map of human genetic variation, begins a new chapter in the international 1000 Genomes Project (http://www.1000genomes.org/page.php?page=home), said the director of the Baylor College of Medicine Human Genome Sequencing Center (http://www.hgsc.bcm.tmc.edu/), a major contributor to the effort.

"Mapping all the shared normal variation in human populations is a critical step to interpreting medically actionable genetic changes," said Dr. Richard Gibbs (http://www.hgsc.bcm.tmc.edu/content-home-HGSC_director-x.hgsc), also a professor in the department of molecular and human genetics at BCM (www.bcm.edu).

The 1000 Genomes Project began in 2008 with the kickoff of three pilot programs. Completion of the pilots launches the full-scale effort to build a public database of human genetic variation from the genomes of 2,500 people from 27 populations around the world. With the announcement, groups involved in the project placed their final data in freely available databases that the worldwide research community can access and use.

"The 1000 Genomes Project has a simple goal: peer more deeply into the genetic variations of the human genome to understand the genetic contribution to common human diseases," said Dr. Eric D. Green, director of the National Human Genome Research Institute, which provides major funding to the effort. "I am excited about the progress being made on this resource for use by scientists around the world and look forward to seeing what we learn from the next stage of the project."

Recent studies looking for variations that contribute to common human ailments, such as heart disease and diabetes, indicate that a host of rare variations account for much of the burden of disease in the human population. Complex and detailed maps such as those to be assembled from the project provide a potent tool for identifying those rare variations.

The pilot program tested the viability of three strategies. BCM designed and coordinated the strategy that involved targeting the sequencing to gene coding regions. This project provided the most complete data for the exons (or coding regions) of 1,000 genes, as it was designed to deeply sample the DNA in each of nearly 700 people. An estimated 2 percent of the human genome is composed of protein-coding genes.

"We also developed new methods to target variation in genes, and showed that this approach gave maximum information about this important class of human variation," said Dr. Fuli Yu, an assistant professor in the BCM Human Genome Sequencing Center and coordinator of the study.

The project's fast pace was made possible only by next-generation sequencing technology, which can produce thousands or millions of sequences rapidly. The techniques involved allow researchers to evaluate all the rare variants found in areas of the genome known to be associated with human disease.

Another of the pilot projects used a variety of sequencing technologies to sequence the genomes of six people (two nuclear families, each consisting of both parents and one daughter) at high coverage (meaning in exacting detail). Each sample was sequenced 20 to 60 times, uncovering a more complete picture of DNA variation in these families. By using different technologies, scientists also obtained a better understanding of the strengths of each sequencing platform.

The other pilot project sequenced the genomes of 179 people in less detail, subjecting each sample to an average of approximately four sequencing passes. Researchers then combined the data from different people to discover which genetic variants they share. This technique will provide valuable information in uncovering those genomic variations shared among people or populations.
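
As a rough illustration of why combining low-coverage data across people helps, the sketch below estimates how likely a shared variant is to be read at least twice when one or more carriers are each sequenced at about four passes. It is a simplified model and not the project's actual analysis method: it assumes read counts at a site follow a Poisson distribution, that each carrier has one copy of the variant (so about half of that person's reads show it), and that there are no sequencing errors. The function name and the two-read threshold are hypothetical choices made for this example.

```python
import math


def prob_at_least_k_variant_reads(mean_depth, carriers, k=2):
    """Probability that at least k reads show the variant allele.

    Assumptions (illustrative only, not from the press release):
    - reads covering a site are Poisson-distributed with mean `mean_depth`
    - each carrier is heterozygous, so half of their reads carry the variant
    - reads from different carriers are independent and error-free
    """
    # Expected number of variant-carrying reads pooled across all carriers.
    lam = carriers * mean_depth / 2.0
    # P(fewer than k variant reads) from the Poisson probability mass function.
    p_fewer = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return 1.0 - p_fewer


# Compare a single carrier with small pools of carriers at ~4x coverage.
for carriers in (1, 2, 5):
    p = prob_at_least_k_variant_reads(mean_depth=4, carriers=carriers)
    print(f"{carriers} carrier(s) at 4x: P(>=2 variant reads) = {p:.3f}")
```

Under these assumptions, the chance of seeing the variant in at least two reads rises as more carriers are pooled, which is the intuition behind sequencing many people lightly rather than a few people deeply.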

Researchers can obtain the data freely through the 1000 Genomes website (www.1000genomes.org), from the NCBI at ftp://ftp-trace.ncbi.nih.gov/1000genomes/ or from the EBI at ftp://ftp.1000genomes.ebi.ac.uk/. Researchers with limited computing power will be able to access the data through Amazon Web Services via the company's Elastic Compute Cloud (Amazon EC2). The database contains all forms of variation found in the genome, from single changes called single nucleotide polymorphisms (SNPs), to small insertions and deletions (of genetic material), to the large changes in the structure and number of copies of chromosomes called copy number variations.

In addition to the Baylor College of Medicine Human Genome Sequencing Center, much of the pilot work was carried out by researchers at the Wellcome Trust Sanger Institute in the United Kingdom; BGI Shenzhen in China; the Broad Institute of MIT and Harvard in Massachusetts; the Washington University Genome Sequencing Center at the Washington University School of Medicine in St. Louis; and Boston College.

Funding for this work comes from 454 Life Sciences, a Roche company, Branford, Conn.; Applied Biosystems, an Applera Corp. business, Foster City, Calif.; Beijing Genomics Institute, Shenzhen, China; Illumina Inc., San Diego; the Max Planck Institute for Molecular Genetics, Berlin, Germany; the Wellcome Trust Sanger Institute, Hinxton, Cambridge, U.K.; and the NHGRI, which supports the work being done by Baylor College of Medicine, Houston, Texas; the Broad Institute, Cambridge, Mass.; and Washington University, St. Louis, Missouri. For more information on basic science research at Baylor College of Medicine, go to www.bcm.edu/fromthelab or www.bcm.edu/news.

Baylor College of Medicine
