New TSRI project helps researchers build a biomedical knowledgebase

April 14, 2016

LA JOLLA, CA - April 14, 2016 - Imagine attempting to bake a cake--except you have to go to different stores for flour and milk, drive across town to get eggs and call a friend to borrow a cake pan.

This is the kind of disjointed scenario many scientists face when they attempt to gather data scattered across small databases and hard-to-search PDF files.

"It's not that the data doesn't exist," said Andrew Su, associate professor at The Scripps Research Institute (TSRI). "The data just isn't stored in a way that scientists can easily access."

"Open data is vital for progress and research," added TSRI Assistant Professor of Molecular and Experimental Medicine Ben Good. "We need to break down those barriers."

To solve this problem, Su, Good and their colleagues at TSRI have integrated biomedical data into Wikidata, a public, editable database where researchers can easily link genes, proteins and more. Their work was announced in two recent papers in the journal Database.

A Better Way to Research

Technological breakthroughs in the last 10 years have rapidly increased the volume and rate of biomedical research, which in turn has led to rapid growth in biomedical knowledge. However, this knowledge is currently fragmented across countless resources--from online databases to supplementary data files to individual facts in individual papers.

"As a research community, we spend a lot of time searching for good resources and trying to link them together," said TSRI Research Associate Tim Putman, who was first author of one of the studies. "It's cringeworthy."

Even when databases are open to the public, current knowledge isn't always organized in a uniform way, Putman explained.

Rather than leave each research group to tackle data integration individually, Wikidata offers a new model for organizing all this information. Built on the same principles as Wikipedia, Wikidata enables anyone to add new information to an open community database.

While other Wikidata editors have added information on millions of items ranging from works of art to U.S. cities, the TSRI team has focused on adding information on biomedical concepts.

TSRI Research Associate Sebastian Burgstaller-Muehlbacher, first author of the other study, added data on all human and mouse genes, all human diseases and all drugs approved by the U.S. Food and Drug Administration.

Putman then extended Wikidata with a focus on microbial genomes. With all this information collected in one system, researchers can more easily spot connections between diseases, pathogens and biological processes. As an example, Putman used the model to show that other microorganisms in the body can influence chlamydia infections.

As a proof of concept, Putman led the development of a genome browser based on Wikidata. Rather than having to develop one browser for every sequenced genome, this genome browser allows users to browse any genome that has been loaded into Wikidata.

"You can zoom in on a gene, click on it and the sequence will pop up," said Good. The genome browser will then link back to the original Wikidata entry.

In the end, the researchers plan to have a comprehensive, uniform database that is easy to search and open to anyone who wants to add data and link related concepts.

"We think this data should all be open," said Su. "This just makes intuitive sense."
-end-
In addition to Su, Good, Putman and Burgstaller-Muehlbacher, authors of the first study, "Centralizing content and distributing labor: a community model for curating the very long tail of microbial genomes" (http://database.oxfordjournals.org/content/2016/baw028.full?sid=4d5e9514-0e4a-40da-aa3d-2c27fc04b743), were Chunlei Wu of TSRI and Andra Waagmeester of Micelio.

In addition to Su, Good, Putman, Burgstaller-Muehlbacher and Waagmeester, authors of the second study, "Wikidata as a semantic framework for the Gene Wiki initiative," were Elvira Mitraka and Lynn Schriml of The University of Maryland, Baltimore; Justin Leong and Paul Pavlidis of the University of British Columbia; and Julia Turner of TSRI.

Both studies were supported by the National Institutes of Health (grants GM089820, GM083924, GM114833 and DA036134).

Scripps Research Institute
