Compiling big data in a human-centric way

May 11, 2017

HOUSTON - (May 11, 2017) - When a group of researchers in the Undiagnosed Disease Network at Baylor College of Medicine realized they were spending days combing through databases for information on gene variants, they decided to do something about it. By creating MARRVEL (Model organism Aggregated Resources for Rare Variant ExpLoration), they are now able to help not only their own lab but also researchers everywhere search these databases all at once, in a matter of minutes.

This collaborative effort among Baylor, the Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital and Harvard Medical School is described in the latest online edition of the American Journal of Human Genetics.

Big data search engine

"One big problem we have is that tens of thousands of human genome variants and phenotypes are spread throughout a number of databases, each one with their own organization and nomenclature that aren't easily accessible," said Julia Wang, an M.D./Ph.D. candidate in the Medical Scientist Training Program at Baylor and a McNair Student Scholar in the Bellen lab, as well as first author on the publication. "MARRVEL is a way to assess the large volume of data, providing a concise summary of the most relevant information in a rapid user-friendly format."

MARRVEL displays information from OMIM, ExAC, ClinVar, Geno2MP, DGV, and DECIPHER, all separate databases to which researchers across the globe have contributed, sharing tens of thousands of human genome variants and phenotypes. Because there is no set standard for recording this type of information, each database takes a different approach, and searching each one can yield results organized in different ways. Similarly, decades of research in various model organisms, from mouse to yeast, are stored in their own individual databases with different sets of standards.

Dr. Zhandong Liu, assistant professor in pediatrics - neurology at Baylor, a member of the Jan and Dan Duncan Neurological Research Institute at Texas Children's and co-corresponding author on the publication, explains that MARRVEL works much like an internet search engine.

"This program helps to collate the information in a common language, drawing parallels and putting it together on one single page. Our program curates model organism specific databases to concurrently display a concise summary of the data," Liu said.

Supporting researchers

A user can first search for a gene or variant, Wang explains. Results may include what is known about the gene overall, whether it is associated with a disease, whether it occurs frequently in the general population and how it is affected by certain mutations.

"MARRVEL helps to facilitate analysis of human genes and variants by cross-disciplinary integration of 18 million records so we can speed up the discovery process through computation," Liu said. "All this information is basically inaccessible unless researchers can access it efficiently and apply it to their own work to find causes, treatments and hopefully identify new diseases."

Collaboration

This project started as a necessity for the Model Organism Screening Center for the Undiagnosed Disease Network at Baylor, but as it grew, the group began reaching out to researchers in different disciplines for feedback on how MARRVEL might benefit them.

"This program is just the start. I think our tool is going to be a model for us to help clinicians and basic scientists more efficiently use the information already publicly available," Wang said. "It will help us understand and process all of the different mutations that researchers are discovering."

"The most exciting part is how this project is bringing so many different researchers together," Liu said. "We are working with labs we might not have normally collaborated with, trying to put together a puzzle of all this data."

Both Wang and Liu are thankful for the contributions of the genetics communities that allowed them access to the databases as they developed MARRVEL.

Others who contributed to the findings include Drs. Rami Al-Ouran, Seon-Young Kim, Ying-Wooi Wan, Michael Wangler, Shinya Yamamoto, Hsiao-Tuan Chao, and Hugo Bellen (Howard Hughes Medical Institute at Baylor), all with Baylor College of Medicine; and Yanhui Hu, Aram Comjean, Stephanie E. Mohr, and Norbert Perrimon (Howard Hughes Medical Institute at Harvard Medical School), all with Harvard Medical School.

For full funding and acknowledgements, please see the full publication (available after the embargo lifts).
-end-

Baylor College of Medicine
