What to sequence next: Pick one species at a time

December 02, 2005

After humans, mice, chickens and others, which genomes should scientists sequence next? In a paper published today in PLoS Genetics, Fabio Pardi and Nick Goldman of the EMBL-European Bioinformatics Institute present a way to decide. Surprisingly, they show that always choosing the next best single species is just as effective as planning to sequence several genomes in advance.

DNA sequencing has revealed a vast amount of information about biology. But genome sequencing remains expensive and time-consuming, so scientists need a strategy to help them select the organisms that will give them the most new information.

One solution is to sequence the most distantly related organisms, to get the widest possible diversity of sequences. Biologists represent the relationships between different species as a tree, with the length of the branches varying according to the degree by which their DNA sequences differ. "If we are prepared to assume that the most informative set is the one with the greatest evolutionary divergence, the problem of which species to sequence next can be solved by observing the length of the branches that separate the unsequenced species from those that have already had their genomes sequenced, and choosing the organism that's separated from the others by the longest sequence of branches", explains Fabio Pardi.

The tendency has been for centres to choose a group of new genomes to sequence. However, the current study shows that picking the best candidates one at a time is equally informative. "Computer scientists call this a 'greedy strategy' because it involves always taking the best bet for yourself", says Nick Goldman. "However, if, say, a centre had enough funding to sequence five organisms, we might expect to get a better set of genomes by considering all five together. Counter-intuitively, we found that in this case the greedy strategy is the best. We were surprised because in computer science greed is definitely not good - greedy algorithms seldom provide the best solution to a problem."
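
To make the comparison concrete, here is a minimal Python sketch of the two approaches on a toy tree. It is an illustration under stated assumptions, not the authors' published method: the tree, the species labels A to G and every branch length are invented, and the function names (leaves_below, diversity, greedy_pick, best_group) are hypothetical. "Diversity" is taken to be the total branch length of the smallest subtree connecting the chosen species, which is one common way to quantify evolutionary divergence.

```python
from itertools import combinations

# Invented toy tree: child -> (parent, branch length). "root" has no entry.
TREE = {
    "n1": ("root", 1.0), "n3": ("root", 0.5),
    "n2": ("n1", 2.0), "C": ("n1", 4.0),
    "A": ("n2", 1.0), "B": ("n2", 1.5),
    "D": ("n3", 3.5), "E": ("n3", 2.5), "n4": ("n3", 1.0),
    "F": ("n4", 2.0), "G": ("n4", 0.5),
}


def leaves_below(tree):
    """Return ({node: leaves beneath that node}, set of all leaves)."""
    parents = {parent for parent, _ in tree.values()}
    leaves = {node for node in tree if node not in parents}
    below = {node: set() for node in tree}
    for leaf in leaves:
        node = leaf
        while node in tree:  # walk upwards until the root is reached
            below[node].add(leaf)
            node = tree[node][0]
    return below, leaves


def diversity(tree, below, chosen):
    """Total length of the branches connecting the chosen species."""
    total = 0.0
    for node, (_, length) in tree.items():
        # A branch is needed only if chosen species lie on both sides of it.
        if below[node] & chosen and chosen - below[node]:
            total += length
    return total


def greedy_pick(tree, sequenced, k):
    """Add k species one at a time, always taking the largest gain."""
    below, leaves = leaves_below(tree)
    chosen, order = set(sequenced), []
    for _ in range(k):
        candidates = sorted(leaves - chosen)
        best = max(candidates, key=lambda s: diversity(tree, below, chosen | {s}))
        chosen.add(best)
        order.append(best)
    return order, diversity(tree, below, chosen)


def best_group(tree, sequenced, k):
    """Score every possible group of k new species and return the best total."""
    below, leaves = leaves_below(tree)
    base = set(sequenced)
    return max(diversity(tree, below, base | set(group))
               for group in combinations(sorted(leaves - base), k))


if __name__ == "__main__":
    order, total = greedy_pick(TREE, {"A", "B"}, 3)
    print("greedy order:", order, "diversity:", total)
    print("best planned group:", best_group(TREE, {"A", "B"}, 3))
```

With A and B treated as already sequenced, the one-at-a-time picks and the exhaustive search over all groups of three reach the same total diversity on this toy tree, which is the behaviour the study reports in general. The exhaustive search is included only as a check; it becomes impractical as the number of candidate species grows, whereas the greedy loop stays cheap.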

"Our findings have clear implications for planning large-scale genome sequencing efforts", continues Pardi. "Provided that they remain open about their choices so that two different sequencing centres don't choose the same genome, selecting the next most attractive organism to sequence is just as effective as having a long-term strategy."

Evolutionary divergence isn't the only factor that scientists consider when choosing which genomes to sequence, but other criteria can be factored into Goldman and Pardi's greedy strategy so long as those criteria can be quantified. For example, sequencing costs, or the economic importance of an organism, could be considered. Their strategy can also be applied to different problems, such as conservation biology. "Of course, we're not advocating that genome scientists or conservation biologists stop working cooperatively, but at least they can feel confident about sequencing or conserving the organism of their choice without messing things up for their collaborators", says Goldman.

European Molecular Biology Laboratory
