Individuals' medical histories predicted by their noncoding genomes, Stanford study finds

February 04, 2016

Identifying mutations in the control switches of genes can be a surprisingly accurate way to predict a person's medical history, researchers at the Stanford University School of Medicine have found.

When the scientists used the technique to analyze the whole genome sequences of five individuals, they found that a person with narcolepsy had mutations in the regulatory regions of genes controlling alertness; a person with a family history of sudden cardiac death had mutations in regions controlling genes associated with cardiac output; and a person with high blood pressure had mutations in regions controlling circulating sodium levels in the blood.

"The beauty of having whole genomes available for study is that you can then ask completely agnostic questions," said Gill Bejerano, PhD, an associate professor of developmental biology, of pediatrics and of computer science at Stanford. "We set out to find hidden layers of susceptibility in the regulatory regions of these genomes. We were very pleased that our analysis gave such clear and significant associations between the mutations and medical histories."

Bejerano, a genomicist who is a member of the Stanford Artificial Intelligence Lab, Child Health Research Institute, Neurosciences Institute, Cancer Institute and Bio-X, is the senior author of a paper describing the research, which will be published Feb. 4 in PLOS Computational Biology. The first author is Harendra Guturu, PhD, a former Stanford graduate student who is now a research associate in pediatrics at the university.

Importance of regulatory regions

The researchers focused their analyses on a relatively small proportion of each person's genome -- the sequences of regulatory regions that have been faithfully conserved among many species over millions of years of evolution. Proteins called transcription factors bind to regulatory regions to control when, where and how genes are expressed. Some regulatory regions have evolved to generate species-specific differences -- for example, mutating in a way that changes the expression of a gene involved in foot anatomy in humans -- while other regions have stayed mostly the same for millennia.

"In these cases, evolution has given a clear signal that these regions are important to key biological pathways, and it's important for them to stick around," said Bejerano.

All of us have some natural variation in our genome, accumulated through botched DNA replication, chemical mutation and simple errors that arise when each cell tries to successfully copy 3 billion nucleotides prior to each cell division. When these errors occur in our sperm or egg cells, they are passed to our children and perhaps grandchildren. These variations, called polymorphisms, are usually, but not always, harmless.

GREAT work

Guturu looked for what are called single nucleotide polymorphisms, or SNPs, in the DNA of five people who have made their genomes and information about their own or their family's medical history publicly available for use by researchers worldwide. SNPs are places along a chromosome where the DNA sequence varies from a composite human DNA reference sequence by one letter, or nucleotide.
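To make the definition concrete, here is a minimal sketch of how single-letter differences between a reference sequence and a person's sequence could be listed. It assumes the two sequences are already aligned and of equal length; the function name `find_snps` and the toy sequences are hypothetical and are not taken from the study.

```python
def find_snps(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for every single-letter difference."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, alt_base)
        for pos, (ref_base, alt_base) in enumerate(zip(reference, sample))
        if ref_base != alt_base
    ]


# Toy sequences (hypothetical): the sample differs from the reference at one position.
reference = "ACGTTGCA"
sample = "ACGATGCA"
print(find_snps(reference, sample))  # [(3, 'T', 'A')]
```

Real variant calling works from sequencing reads and must handle insertions, deletions and sequencing errors, which this sketch ignores.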

Rather than search through the whole genome, Guturu focused on SNPs in evolutionarily conserved regulatory regions. Even within these regions, each person had many SNPs. So Guturu used a software program, Predicting Regulatory Information of Single Motifs, developed in the Bejerano lab, to predict which nucleotide changes were likely to disrupt the conserved binding of a transcription factor.
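The article does not describe how the Bejerano lab's program scores a disrupted binding site. As a rough illustration of the two steps in the paragraph above (keep only SNPs inside conserved regions, then ask whether a change weakens a binding motif), the sketch below uses a generic position weight matrix. The motif frequencies, the region coordinates, the SNP list and the cutoff of 2.0 are all made-up assumptions.

```python
import math

# Hypothetical base frequencies for a 4-letter binding motif (illustrative numbers only).
MOTIF = [
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
]
BACKGROUND = 0.25  # assumed uniform background frequency for each base


def motif_score(site: str) -> float:
    """Sum of log2 odds of each base under the motif versus the background."""
    return sum(math.log2(MOTIF[i][base] / BACKGROUND) for i, base in enumerate(site))


def in_conserved_region(position: int, regions: list[tuple[int, int]]) -> bool:
    """True if position falls inside any half-open interval [start, end)."""
    return any(start <= position < end for start, end in regions)


def binding_disruption(ref_site: str, alt_site: str) -> float:
    """Score lost when the reference site is replaced by the variant site."""
    return motif_score(ref_site) - motif_score(alt_site)


# Hypothetical inputs: conserved intervals, and SNPs as (position, reference site, variant site).
conserved_regions = [(100, 200), (900, 950)]
snps = [(150, "ACGT", "ACTT"), (500, "ACGT", "AAGT"), (910, "ACTT", "ACGT")]

candidates = [
    (pos, binding_disruption(ref, alt))
    for pos, ref, alt in snps
    if in_conserved_region(pos, conserved_regions)
]
# Keep only changes that lose more than an arbitrary cutoff of 2.0 in score.
disruptive = [(pos, round(loss, 2)) for pos, loss in candidates if loss > 2.0]
print(disruptive)  # [(150, 4.09)]
```

In this toy run, the SNP at position 500 is dropped because it lies outside the conserved intervals, and the one at 910 is dropped because the variant matches the motif better than the reference does.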

Guturu then turned to software called Genomic Regions Enrichment of Annotations Tool, or GREAT, to determine whether the disrupted binding sites were likely to perturb the expression of groups of genes that together control a particular biological function. GREAT, which was also developed in the Bejerano lab, curates knowledge about the diverse functions of thousands of different groups of genes. For any set of genomic regions a user inputs, GREAT determines the most common set or sets of nearby genes.
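The article does not spell out the statistics behind this step. One generic way to ask whether a set of genomic regions lands near one functional group of genes more often than chance would predict is a binomial tail probability, sketched below. This is an illustration of the idea, not a description of GREAT's implementation; the function name and all numbers are hypothetical.

```python
from math import comb


def binomial_tail(hits: int, total: int, coverage: float) -> float:
    """P(X >= hits) for X ~ Binomial(total, coverage)."""
    return sum(
        comb(total, i) * coverage**i * (1 - coverage) ** (total - i)
        for i in range(hits, total + 1)
    )


# Hypothetical numbers: 20 disrupted binding sites, 6 of which lie near genes
# annotated with one function, where such genes' neighborhoods cover 5% of the genome.
total_sites = 20
sites_near_annotated_genes = 6
genome_fraction_near_annotated_genes = 0.05

p_value = binomial_tail(
    sites_near_annotated_genes, total_sites, genome_fraction_near_annotated_genes
)
print(f"{p_value:.2e}")
```

A small probability here means that many sites near that group of genes would be surprising if the sites were scattered at random, which is the kind of signal the paragraph above describes.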

Using this approach to study the genomes of the five individuals, Guturu, Bejerano and their colleagues found that one of the individuals who had a family history of sudden cardiac death had a surprising accumulation of variants associated with "abnormal cardiac output"; another with hypertension had variants likely to affect genes involved in circulating sodium levels; and another with narcolepsy had variants affecting parasympathetic nervous system development. In all five cases, GREAT reported results that jibed with what was known about that individual's self-reported medical history, and that were rarely seen in the more than 1,000 other genomes used as controls.

'Exciting avenue for study'

The researchers would like to create a web portal that would allow others to easily conduct similar studies. However, they concede that, for some diseases, the results may not be so clear-cut.

"We are the sum of billions of transcription-factor-binding events in thousands of cell types throughout our bodies," said Bejerano. "Not every disease will be amenable to this type of analysis. But this study shows that nature, even the noncoding genome, can be very benevolent when you ask the right questions. And it may help us begin to combine our knowledge about variations, or mutations, that occur throughout the genome. It's a very exciting avenue for study."

The research is an example of Stanford Medicine's focus on precision health, the goal of which is to anticipate and prevent disease in the healthy and precisely diagnose and treat disease in the ill.

Other Stanford co-authors of the paper are graduate student Sandeep Chinchali and former graduate student Shoa Clarke, MD, PhD.

The research was supported by the National Institutes of Health (grants R01HG005058 and R01HD059862), the National Science Foundation, the Howard Hughes Medical Institute, a Stanford graduate fellowship and the King Abdullah University of Science and Technology.

Bejerano and Guturu have filed a patent application on the algorithm used in this study.

Stanford's Department of Developmental Biology also supported the work.
The Stanford University School of Medicine consistently ranks among the nation's top medical schools, integrating research, medical education, patient care and community service. The medical school is part of Stanford Medicine, which includes Stanford Health Care and Lucile Packard Children's Hospital Stanford.

Stanford University Medical Center
