Research overcomes key obstacles to scaling up DNA data storage

June 03, 2019

Researchers from North Carolina State University have developed new techniques for labeling and retrieving data files in DNA-based information storage systems, addressing two of the key obstacles to widespread adoption of DNA data storage technologies.

"DNA systems are attractive because of their potential information storage density; they could theoretically store a billion times the amount of data stored in a conventional electronic device of comparable size," says James Tuck, co-corresponding author of a paper on the work and an associate professor of electrical and computer engineering at NC State.

"But two of the big challenges here are, how do you identify the strands of DNA that contain the file you are looking for? And once you identify those strands, how do you remove them so that they can be read - and do so without destroying the strands?"

"Previous work had come up with a system that appends short, 20-monomer-long sequences of DNA called primer-binding sequences to the ends of DNA strands that are storing information," says Albert Keung, co-corresponding author of the paper and an assistant professor of chemical and biomolecular engineering at NC State. "You could use a small DNA primer that matches the corresponding primer-binding sequence to identify the appropriate strands that comprise your desired file. However, there are only an estimated 30,000 of these binding sequences available, which is insufficient for practical use. We wanted to find a way to overcome this limitation."

To address these problems, the researchers developed two techniques that, taken together, they call DNA Enrichment and Nested Separation, or DENSe.

The researchers tackled the file identification challenge by using two nested primer-binding sequences. The system first identifies all of the strands containing the initial primer-binding sequence. It then conducts a second "search" of that subset to single out the strands that contain the second primer-binding sequence.
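The two-pass search can be illustrated with a toy model. The following is a minimal sketch, not the researchers' implementation: it assumes each strand is a plain string laid out as an "outer" binder, then an "inner" binder, then the payload, and it uses 4-letter binders for readability where the real sequences are about 20 monomers long. The function name nested_search and the example sequences are hypothetical. The last lines also show why nesting enlarges the pool of file names, assuming any of the roughly 30,000 usable binders can appear at either level.

```python
def nested_search(pool, outer, inner):
    """Return the strands that carry both binders, found in two passes."""
    # Pass 1: keep every strand that starts with the outer binder.
    subset = [s for s in pool if s.startswith(outer)]
    # Pass 2: search only that subset for the inner binder,
    # which in this toy layout sits right after the outer one.
    return [s for s in subset if s[len(outer):].startswith(inner)]

# Hypothetical strands: outer binder + inner binder + payload.
pool = [
    "ACGT" + "TTAA" + "GATTACA",
    "ACGT" + "CCGG" + "CATCATC",
    "GGCC" + "TTAA" + "TTTTGGG",
]
matches = nested_search(pool, outer="ACGT", inner="TTAA")
print(matches)  # ['ACGTTTAAGATTACA']

# Name-space size: one binder per level, assuming the same
# estimate of ~30,000 usable binders applies at both levels.
binders_per_level = 30_000
nested_names = binders_per_level ** 2
print(f"{nested_names:,}")  # 900,000,000
```

Only the first strand carries both binders, so it is the only one returned. The second pass never looks at the third strand, because that strand was already excluded by the first pass.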

"This increases the number of estimated file names from approximately 30,000 to approximately 900 million," Tuck says.

Once identified, the file still needs to be extracted. Existing techniques use polymerase chain reaction (PCR) to make lots (and lots) of copies of the relevant DNA strands, then sequence the entire sample. Because there are so many copies of the targeted DNA strands, their signal overwhelms the rest of the strands in the sample, making it possible to identify the targeted DNA sequence and read the file.

"That technique is not efficient, and it doesn't work if you are trying to retrieve data from a high-capacity database - there's just too much other DNA in the system," says Kyle Tomek, a Ph.D. student at NC State and co-lead author of the paper.

So the researchers took a different approach to data retrieval, attaching any of several small molecular tags to the primers being used to identify targeted DNA strands. When a primer finds the targeted DNA, PCR makes a copy of the relevant DNA, and that copy is attached to the molecular tag.

The researchers also utilized magnetic microbeads coated with molecules that bind specifically to a given tag. These functionalized microbeads "grab" the tags of targeted DNA strands. The microbeads can then be retrieved with a magnet, bringing the targeted DNA with them.
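The retrieval idea, tagged copies pulled out while the originals stay behind, can also be sketched in a few lines. This is a simplified model under stated assumptions, not the laboratory protocol: strands are strings, a primer "matches" when the strand starts with it, and a single copy is made per matching strand. The Strand class, both function names, and the label "tag-A" are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Strand:
    sequence: str
    tag: Optional[str] = None  # molecular tag carried by a copy, if any

def copy_with_tagged_primer(database, primer, tag):
    """Make one tagged copy of each strand that starts with the primer.
    The original strands are not modified."""
    return [Strand(s.sequence, tag) for s in database
            if s.sequence.startswith(primer)]

def pull_down(mixture, bead_tag):
    """Split a mixture into strands captured by beads that bind bead_tag
    and strands left behind."""
    captured = [s for s in mixture if s.tag == bead_tag]
    remaining = [s for s in mixture if s.tag != bead_tag]
    return captured, remaining

database = [Strand("ACGTGATTACA"), Strand("GGCCTTTTGGG")]
copies = copy_with_tagged_primer(database, primer="ACGT", tag="tag-A")
captured, remaining = pull_down(database + copies, bead_tag="tag-A")

print([s.sequence for s in captured])  # ['ACGTGATTACA']
print(remaining == database)           # True
```

In this model the beads capture only the tagged copy, and the untagged originals are exactly what remains, which mirrors the claim that the database is preserved.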

"This system allows us to retrieve the DNA strands associated with a specific file without having to make many copies of each strand, while also preserving the original DNA strands in the database," Keung says.

"We've implemented the DENSe system experimentally using sample files, and have demonstrated that it can be used to store and retrieve text and image files," Keung adds.

"These techniques, when used in tandem, open the door to developing DNA-based data storage systems with modern capacities and file-access capabilities," Tomek says.

"Next steps include scaling this up and testing the DENSe approach with larger databases," Tuck says. "A big challenge there is cost."

The paper, "Driving the Scalability of DNA-Based Information Storage Systems," is published in the journal ACS Synthetic Biology. Kevin Volkel, a Ph.D. student at NC State, is co-lead author of the paper. The paper was co-authored by Alexander Simpson, a former graduate student at NC State, and by Austin Hass and Elaine Indermaur, both undergraduates at NC State.
The work was done with support from the National Science Foundation under grant number 1650148.

North Carolina State University
