
New TSRI project helps researchers build a biomedical knowledgebase

April 14, 2016

LA JOLLA, CA - April 14, 2016 - Imagine attempting to bake a cake--except you have to go to different stores for flour and milk, drive across town to get eggs and call a friend to borrow a cake pan.

This is the kind of disjointed scenario many scientists face when they attempt to gather data scattered across small databases and hard-to-search PDF files.

"It's not that the data doesn't exist," said Andrew Su, associate professor at The Scripps Research Institute (TSRI). "The data just isn't stored in a way that scientists can easily access."

"Open data is vital for progress and research," added TSRI Assistant Professor of Molecular and Experimental Medicine Ben Good. "We need to break down those barriers."

To solve this problem, Su, Good and their colleagues at TSRI have integrated biomedical data into Wikidata, a public, editable database where researchers can easily link genes, proteins and more. Their work was announced in two recent papers in the journal Database.

A Better Way to Research

Technological breakthroughs in the last 10 years have sharply increased the volume and pace of biomedical research, which in turn has driven rapid growth in biomedical knowledge. However, this knowledge is currently fragmented across countless resources--from online databases to supplementary data files to individual facts in individual papers.

"As a research community, we spend a lot of time searching for good resources and trying to link them together," said TSRI Research Associate Tim Putman, who was first author of one of the studies. "It's cringeworthy."

Even when databases are open to the public, current knowledge isn't always organized in a uniform way, Putman explained.

Rather than leave each research group to tackle data integration individually, Wikidata offers a new model for organizing all this information. Built on the same principles as Wikipedia, Wikidata enables anyone to add new information to an open community database.

While other Wikidata editors have added information on millions of items, ranging from works of art to U.S. cities, the TSRI team has focused on adding information on biomedical concepts.

TSRI Research Associate Sebastian Burgstaller-Muehlbacher, first author of the other study, added data on all human and mouse genes, all human diseases and all drugs approved by the U.S. Food and Drug Administration.

Putman then extended Wikidata with a focus on microbial genomes. With all this information collected in one system, researchers can more easily spot connections between diseases, pathogens and biological processes. As an example, Putman used the model to show that other microorganisms in the body can influence chlamydia infections.
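
To illustrate why collecting linked statements in one system makes such connections easier to spot, here is a minimal sketch in Python. It is not the TSRI team's code and does not query Wikidata. Every item name, property name and statement below is hypothetical, chosen only to show how a chain of links can be followed from a disease to a microbe once the statements sit in a single graph.

```python
from collections import defaultdict, deque

# Hypothetical (subject, property, value) statements, loosely modeled on
# how a linked database connects items. None of these are real records.
STATEMENTS = [
    ("disease D", "associated with", "gene A"),
    ("gene A", "encodes", "protein A"),
    ("protein A", "involved in", "process X"),
    ("microbe M", "has gene", "gene B"),
    ("gene B", "encodes", "protein B"),
    ("protein B", "involved in", "process X"),
]


def build_graph(statements):
    """Index statements so each item lists its (property, neighbor) links."""
    graph = defaultdict(set)
    for subject, prop, value in statements:
        graph[subject].add((prop, value))
        graph[value].add((prop, subject))  # allow traversal in both directions
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the list of items linking start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _, neighbor in sorted(graph[path[-1]]):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None  # no chain of statements connects the two items


graph = build_graph(STATEMENTS)
path = find_path(graph, "disease D", "microbe M")
print(" -> ".join(path))
```

In this toy data the disease and the microbe are connected only through a shared biological process, a link that would be hard to see if the gene, protein and process records lived in separate databases.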

As a proof of concept, Putman led the development of a genome browser based on Wikidata. Rather than having to develop one browser for every sequenced genome, this genome browser allows users to browse any genome that has been loaded into Wikidata.

"You can zoom in on a gene, click on it and the sequence will pop up," said Good. The genome browser will then link back to the original Wikidata entry.

In the end, the researchers plan to have a comprehensive, uniform database that is easy to search and open to anyone who wants to add data and link related concepts.

"We think this data should all be open," said Su. "This just makes intuitive sense."
-end-
In addition to Su, Good, Putman and Burgstaller-Muehlbacher, authors of the paper "Centralizing content and distributing labor: a community model for curating the very long tail of microbial genomes" (http://database.oxfordjournals.org/content/2016/baw028.full?sid=4d5e9514-0e4a-40da-aa3d-2c27fc04b743) were Chunlei Wu of TSRI and Andra Waagmeester of Micelio.

In addition to Su, Good, Putman, Burgstaller-Muehlbacher and Waagmeester, authors of the second study, "Wikidata as a semantic framework for the Gene Wiki initiative," were Elvira Mitraka and Lynn Schriml of The University of Maryland, Baltimore; Justin Leong and Paul Pavlidis of the University of British Columbia; and Julia Turner of TSRI.

Both studies were supported by the National Institutes of Health (grants GM089820, GM083924, GM114833 and DA036134).

Scripps Research Institute
