Saturday, March 04, 2017

"Computer Virus" Will Cease Being a Metaphor During My Lifetime

thescientist | Yaniv Erlich and colleagues encoded large media files in DNA, copied the DNA multiple times, and still managed to retrieve the files without any errors, they reported in Science today (March 2). Compared with cassette tapes and 8 mm film, DNA is far less likely to become obsolete, and its storage density is roughly 215 petabytes of data per gram of genetic material, the researchers noted.

To test DNA’s media-storage capabilities, Erlich, an assistant professor of computer science at Columbia University in New York City, and Dina Zielinski, a senior associate scientist at the New York Genome Center, encoded six large files—including a French film and a computer operating system (OS), complete with word-processing software—into DNA. They then recovered the data from PCR-generated copies of that DNA. The Scientist spoke with Erlich about the study, and other potential data-storage applications for DNA.
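To give a rough sense of what "encoding a file into DNA" can mean, here is a minimal sketch that maps every two bits of a file onto one of the four nucleotides. This is only an illustration: the fixed mapping, the `BASES` ordering, and the function names `bytes_to_dna` and `dna_to_bytes` are assumptions made for this example, and the excerpt does not describe the scheme Erlich and Zielinski actually used. A practical scheme would also need error correction and would have to avoid sequences that are hard to synthesize or read.

```python
# Illustrative only: a naive 2-bits-per-nucleotide mapping, not the
# method used in the study described above.
BASES = "ACGT"  # assumed ordering: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Turn each byte into four nucleotides, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; expects a length that is a multiple of four."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            # BASES.index raises ValueError for any letter outside ACGT.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = bytes_to_dna(b"Hi")
    print(strand)
    print(dna_to_bytes(strand))
```

Under this naive mapping each nucleotide carries exactly two bits, which is the theoretical ceiling for a four-letter alphabet before any redundancy is added.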

The Scientist: Why is DNA a good place to store information?

Yaniv Erlich: First, we’re starting to reach the physical limits of hard drives. DNA is much more compact than magnetic media—about 1 million times more compact. Second, it can last for a much longer time. Think about your CDs from the 90s, they’re probably scratched by now. [Today] we can read DNA from a skeleton [that is] 4,000 years old. Third, one of the nice features about DNA is that it is not subject to digital obsoleteness. Think about videocassettes or 8 mm movies. It’s very hard these days to watch these movies because the hardware changes so fast. DNA—that hardware isn’t going anywhere. It’s been around for the last 3 billion years. If humanity loses its ability to read DNA, we have much bigger problems than data storage.

TS: Have other researchers tried to store information in DNA?

YE: There are several groups that have already done this process, and they inspired us, but our approach has several advantages. Ours is 60 percent more efficient than previous strategies and our results are very immune to noise and error. Most previous studies reported some issues getting the data back from the DNA, some gaps [in the information retrieved], but we show it’s easy. We even tried to make it harder for ourselves . . . so we tried to copy the data, and the enzymatic reaction [involved in copying DNA] introduces errors. We copied the data, and then copied that copy, and then copied a copy of that copy—nine times—and we were still able to recover the data without one error. We also . . . achieved a density of 215 petabytes per one gram of DNA. Your laptop has probably one terabyte. Multiply that by 200,000, and we could fit all that information into one gram of DNA.
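The laptop comparison can be checked with a few lines of arithmetic. The sketch below assumes decimal units (1 terabyte = 10^12 bytes, 1 petabyte = 10^15 bytes) and a one-terabyte laptop, as in the quote; the constant and function names are made up for this example.

```python
# Back-of-the-envelope check of the density figure quoted above.
# Assumes decimal (SI) units; binary units would shift the result slightly.
TERABYTE = 10**12
PETABYTE = 10**15

DENSITY_BYTES_PER_GRAM = 215 * PETABYTE  # figure quoted in the interview
LAPTOP_BYTES = 1 * TERABYTE              # "probably one terabyte"


def laptops_per_gram(density: int = DENSITY_BYTES_PER_GRAM,
                     laptop: int = LAPTOP_BYTES) -> int:
    """Number of whole laptop drives whose contents fit in one gram of DNA."""
    return density // laptop


if __name__ == "__main__":
    print(laptops_per_gram())
```

With these assumptions the result is 215,000 laptops per gram, so the "multiply that by 200,000" in the quote is a rounded-down version of the same figure.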