Friday, October 28, 2016

Computational Genomics F'Real...,


WSJ | A quick riddle: What do 100 works of classic literature, a seed database from the nonprofit Crop Trust and the Universal Declaration of Human Rights have in common? All of them were recently converted from bits of digital data to strands of synthetic DNA. In addition to these weighty files, researchers at Microsoft and the University of Washington converted a high-definition music video of “This Too Shall Pass” by the alternative rock band OK Go. The video is an homage to Rube Goldberg-like contraptions, which bear more than a passing resemblance to the labyrinthine process of transforming data into the genetic instructions that shape all living things.

This recent data-to-DNA conversion, completed in July, totaled 200 megabytes—which would barely register on a 16-gigabyte iPhone. It’s not a huge amount of information, but it bested the previous DNA storage record, set by scientists at Harvard University, by a factor of about 10. To achieve this, researchers concocted a convoluted process to encode the data, store it in synthetic DNA and then use DNA sequencing machines to retrieve and, finally, decode the data. The result? The exact same files they began with.
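
For readers curious what "encode, store, retrieve, decode" might look like in miniature, here is a toy sketch in Python. It assumes the simplest possible mapping, two bits per nucleotide (A, C, G, T), and it is not the Microsoft and University of Washington scheme, which the article does not describe. The names `BASES`, `encode` and `decode` are made up for this illustration.

```python
# Toy illustration only: a naive two-bits-per-base mapping.
# Assumption: 00 -> A, 01 -> C, 10 -> G, 11 -> T. The researchers'
# actual encoding is not given in the article and is surely different.
BASES = "ACGT"


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(): pack every four bases back into one byte."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


message = b"This Too Shall Pass"
strand = encode(message)
# The round trip returns the exact same bytes we began with.
assert decode(strand) == message
```

The round trip mirrors the article's point that the files came back exactly as they went in. What the sketch leaves out is the hard part: it ignores synthesis and sequencing errors entirely, so any practical scheme would need some form of redundancy or error correction on top of a mapping like this.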

Which raises the question: Why bother?

“We are seeing this explosion in the amount of data that needs to be stored,” says Karin Strauss, the principal Microsoft researcher on the project. “To continue storing this information, we need radical new approaches.” In an age of gargantuan, power-sucking data centers, the space-saving potential of data stored in DNA is staggering. “You can archive all the data on the internet in a shoebox,” says Luis Ceze, an associate professor of computer science and engineering at the University of Washington.