New studies from the Center for Genome Architecture at Baylor explore 3-D structure of DNA

August 01, 2016

HOUSTON - (August 1, 2016) - In a set of papers published last week in Cell Systems, Dr. Erez Lieberman Aiden, assistant professor of molecular and human genetics and McNair Scholar at Baylor College of Medicine and director of the Center for Genome Architecture (TC4GA), and his colleagues introduce Juicer, an open-source tool used in three-dimensional (3-D) genome sequencing (Hi-C) processes.

Hi-C, invented by Aiden and collaborators in 2009, explores the three-dimensional structure of the genome, creating terabases of sequencing data resulting in high-resolution contact maps that comprehensively chart the loops that form when the genome folds up inside the nucleus of a cell.

In previous Hi-C experiments, Aiden and his team identified the sheer bandwidth of the data as a central challenge. Existing hardware and software simply could not process and analyze the massive amounts of data produced in these experiments, with a single map spanning billions of reads and trillions of base pairs.

To alleviate this bottleneck in data analysis, Aiden and his team at Baylor, led by Dr. Neva Durand, Muhammad Shamim and Ido Machol, designed Juicer, a fully automated pipeline that allows users with little to no computational background to transform raw sequencing data into genome-wide maps of looping with a single click. Juicer produces a Hi-C file in which loops and contact domains are automatically annotated, which facilitates the visualization and analysis of the map and its structural features.

"The studies published in Cell Systems describe our team's new, end-to-end system for analysis of 3-D genome sequencing data. It is the first system of its kind, making it possible to map the loops in a mammalian genome in a fully automated fashion," said Durand, a senior scientist at TC4GA and co-first author on both new studies.

As a demonstration of the power of the new tool, Aiden and his colleagues created the deepest 3-D maps of the genome to date, spanning over three terabytes of data drawn from a single experimental condition.

But improvements in software weren't enough: adequate hardware is also a central challenge. The researchers tracked the performance of Juicer on four cluster systems, including a system based on Edico Genome's DRAGEN Bio-IT processing platform coupled with IBM's Power8 architecture.

Edico's DRAGEN platform accelerated the analysis of the massive data sets derived from this study of 3-D structures of DNA by nearly 20-fold, a dramatic speedup over all of the other systems tested.

Machol, a co-author on both studies, noted, "When we ran our pipeline on a hybrid DRAGEN/Power system, the data analysis was 20-fold faster than running the pipeline on an industry standard cluster. That kind of difference opens the door to many analyses that would have been very impractical before."

DRAGEN generates accelerated implementations of genome pipeline algorithms using a field-programmable gate array (FPGA). The platform is reconfigurable and flexible through remote downloads, allowing users to create custom algorithms and refine existing pipelines.

"Given the dramatic acceleration that we observed, we are excited about the extraordinary potential of FPGA technology in 3-D genomics," said Shamim, an M.D./Ph.D. student at Baylor and co-first author on the Juicer study.

Aiden, who is also a faculty member at Rice University, in the department of Computer Science and at the Center for Theoretical Biological Physics, commented on the experiment, saying, "The partnership between TC4GA and Edico Genome is a game-changer. The results that are possible using DRAGEN are more than a one-off exercise: they are a strong indicator of the future of the 3-D genomics field as a whole. We are confident that our collaboration will lead to a great deal of innovation both within the Texas Medical Center community, and beyond."

Added Pieter van Rooyen, Ph.D., chief executive officer of Edico Genome, "Dr. Aiden and his team's application of DRAGEN to accelerate Juicer is a great example of DRAGEN's effectiveness in processing massive amounts of raw sequencing data in minimal time, and without requiring any additional training or a post-graduate degree. We are continually working to optimize DRAGEN and expect the next version to be even faster than the speed we have already achieved."

Juicer is available as open source software and is compatible with multiple cluster operating systems, Edico's DRAGEN, and Amazon Web Services. It may be downloaded on the web at

Other contributors to this work include James T. Robinson, Jill P. Mesirov, and Eric S. Lander of the Broad Institute of Harvard and MIT, and Suhas Rao and Miriam Huntley, from The Center for Genome Architecture.

This work was supported by an NIH New Innovator Award (1DP2OD008540-01), the National Human Genome Research Institute (NHGRI) Centers of Excellence in Genomic Science (P50HG006193), an NVIDIA Research Center Award, an IBM University Challenge Award, a Google Research Award, a Cancer Prevention Research Institute of Texas Scholar Award (R1304), a McNair Medical Institute Scholar Award, the President's Early Career Award in Science and Engineering, and a grant from the National Science Foundation (NSF) Physics Frontiers Centers (Center for Theoretical Biological Physics).

The authors received grants from the Welch Foundation (to E.L.A.), the National Institute of General Medical Sciences (NIGMS R01GM074024 to J.P.M.), and NHGRI (HG003067 to E.S.L.). The Center for Genome Architecture is grateful to Janice, Robert, and Cary McNair for support.

Read the full papers online:

Juicer Provides a One-Click System for Analyzing Loop-Resolution Hi-C Experiments:

Juicebox Provides a Visualization System for Hi-C Contact Maps with Unlimited Zoom:

Baylor College of Medicine
