"Self-Organizing Maps" Help Analyze Thousands Of Genes

March 16, 1999

Using a sophisticated computer algorithm, a team of scientists at the Whitehead Institute has designed a new technique to analyze the massive amounts of data generated by DNA microarrays, also known as DNA chips. This technique will help scientists decipher how our 100,000 genes work together to keep us healthy and how diseases result when they fail.

"DNA arrays have revolutionized DNA analysis by allowing us to observe the activities of thousands of genes simultaneously," says Todd Golub, research scientist at the Whitehead/MIT Center for Genome Research. "But until now, it's been really difficult to interpret this extraordinarily complex raw data. Our technique is among the first in a new generation of tools that will speed up the analysis of the enormous amounts of genetic data emerging from laboratories worldwide."

Dr. Golub and his colleagues at the Whitehead Institute, Dana-Farber Cancer Institute, Dartmouth Medical School, and the Massachusetts Institute of Technology report their technique in the March 16 issue of the Proceedings of the National Academy of Sciences. The Whitehead/MIT Center for Genome Research is one of the flagship centers of the U.S. Human Genome Project, the effort to determine the 3 billion letters that make up the human blueprint.

"The core of the technique is an algorithm, called a self-organizing map (SOM), that takes advantage of the fact that many genes in a cell behave similarly," explains Pablo Tamayo, the lead author of the paper and research scientist at the Whitehead Institute. "Instead of having 2,000 individual genes, all doing different things, you might have 25 groups of genes doing similar things."
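To make the grouping idea concrete, here is a minimal sketch of a self-organizing map written in plain Python. It is not GENECLUSTER and does not reproduce the paper's settings: the function names (`normalize`, `best_node`, `train_som`), the 2 x 2 grid, the learning-rate and neighborhood schedules, and the synthetic "rising", "falling", and "peaked" expression profiles are all assumptions made for illustration. Each grid node holds a representative profile; training repeatedly picks a gene, finds the closest node, and pulls that node and its grid neighbors toward the gene, so that genes with similar behavior tend to end up at the same or adjacent nodes.

```python
import math
import random


def normalize(profile):
    """Rescale one profile to mean 0 and standard deviation 1 (shape only)."""
    mean = sum(profile) / len(profile)
    sd = math.sqrt(sum((v - mean) ** 2 for v in profile) / len(profile)) or 1.0
    return [(v - mean) / sd for v in profile]


def best_node(nodes, x):
    """Index of the node whose vector is closest to x (squared Euclidean)."""
    return min(range(len(nodes)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(nodes[k], x)))


def train_som(profiles, rows, cols, n_iter=2000, lr0=0.1, seed=0):
    """Fit a rows x cols map; returns rows*cols node vectors in row-major order."""
    rng = random.Random(seed)
    # Start each node at a randomly chosen profile (copied, not shared).
    nodes = [list(rng.choice(profiles)) for _ in range(rows * cols)]
    coords = [(r, c) for r in range(rows) for c in range(cols)]
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                  # learning rate shrinks over time
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # neighborhood radius shrinks too
        x = rng.choice(profiles)
        win_r, win_c = coords[best_node(nodes, x)]
        for k, (r, c) in enumerate(coords):
            # Nodes near the winner on the grid move more than distant ones.
            dist2 = (r - win_r) ** 2 + (c - win_c) ** 2
            pull = lr * math.exp(-dist2 / (2.0 * sigma * sigma))
            nodes[k] = [w + pull * (v - w) for w, v in zip(nodes[k], x)]
    return nodes


# Synthetic data (assumption): 60 made-up genes following three base patterns.
data_rng = random.Random(1)
patterns = {
    "rising": [0, 1, 2, 3, 4, 5],
    "falling": [5, 4, 3, 2, 1, 0],
    "peaked": [0, 2, 5, 5, 2, 0],
}
genes = []
for name, base in patterns.items():
    for i in range(20):
        noisy = [v + data_rng.gauss(0, 0.3) for v in base]
        genes.append((f"{name}_{i}", normalize(noisy)))

profiles = [p for _, p in genes]
nodes = train_som(profiles, rows=2, cols=2)

# Assign every gene to its closest node to get the "executive summary".
clusters = {}
for gene_name, profile in genes:
    clusters.setdefault(best_node(nodes, profile), []).append(gene_name)

for k in sorted(clusters):
    print(f"node {k}: {len(clusters[k])} genes, e.g. {clusters[k][:3]}")
```

Printing the clusters shows how many genes land on each node along with a few example names. On real microarray data the grid would be larger (a 5 x 5 grid gives the 25 groups mentioned in the quote above), and the choice of grid size and schedules would need tuning rather than the fixed values assumed here.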

Tamayo compares the final product of the SOM to an executive summary for CEOs. Rather than having to read every page of a 1,000-page report, CEOs can get an overview of the report by simply reading the summary. "It's impossible to visually inspect every gene," he says. "This method produces a quick scan of what's going on with thousands of genes."

The researchers created a computer package called GENECLUSTER, which organizes the activities of thousands of genes in only minutes. To test GENECLUSTER, they analyzed the genes expressed in several models of leukemia cell growth. In many cases, the algorithm identified genes known to be important in this process, but occasionally it also identified unexpected genes. This finding suggests that the method might be useful in helping to identify the function of unknown genes. "Because genes that have similar functions are generally expressed in the same basic pattern, knowing the expression pattern of a gene could help identify its function," explains Tamayo.

SOMs have been used widely in data mining, particularly for large or messy datasets like stock market data, but this study is the first to apply them to gene analysis.

The study was supported in part by a consortium of three companies -- Bristol-Myers Squibb Company; Affymetrix, Inc.; and Millennium Pharmaceuticals Inc. -- that formed a unique corporate partnership to fund a five-year research program in functional genomics at the Whitehead/MIT Genome Center. It was also supported by grants from the National Institutes of Health to the Lander and Dmitrovsky labs.

The paper is titled "Interpreting patterns of gene expression with self-organizing maps: Methods and applications to hematopoietic differentiation." The authors are: Pablo Tamayo, Donna Slonim, and Jill Mesirov, of the Whitehead Institute; Qing Zhu, of the Dana-Farber Cancer Institute; Sutisak Kitareewan and Ethan Dmitrovsky, of the Department of Pharmacology and Toxicology at Dartmouth Medical School; Eric Lander, of the Whitehead Institute and the Massachusetts Institute of Technology; and Todd Golub, of the Whitehead Institute and the Dana-Farber Cancer Institute.

Whitehead Institute for Biomedical Research
