Data and DNA

Microsoft executives have revealed that they aim to have a "proto-commercial" DNA data storage system available in three years and hope to have an operational model in a decade. The eventual device is expected to be around the size of a 1970s-era Xerox printer.

Microsoft's current system first converts data from zeroes and ones into sequences of the four bases that make up DNA (A, T, C, and G), adding markers that show how the pieces fit together in the original data. These sequences are then synthesized into actual DNA and pooled with the other sequences created.
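To make the conversion step concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping (two bits per base) and a one-byte index as the "marker"; the function names and the mapping are illustrative assumptions, not Microsoft's actual encoding, which the article does not describe in detail.

```python
# Illustrative mapping only: two bits per base. Practical schemes also add
# error correction and avoid long runs of the same base.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def bytes_to_bases(data: bytes) -> str:
    """Turn raw bytes into a string of bases, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def encode(data: bytes, chunk_size: int = 4) -> list[str]:
    """Split data into chunks and prefix each with a one-byte index marker.

    The one-byte index is an assumption made for this sketch and limits the
    input to 256 chunks.
    """
    strands = []
    for index, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        strands.append(bytes_to_bases(bytes([index]) + chunk))
    return strands


strands = encode(b"DNA!")
print(strands)  # one strand: index 0 followed by the four payload bytes
```

Each string in the result would correspond to one sequence to be synthesized; the index prefix is what lets the chunks be put back in order later.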

To extract and access the files, a polymerase chain reaction is used to select the appropriate sequences. These are then read, and the sequences of bases are turned back into data. Both Microsoft's studies and a similar experiment run by Erlich Lab members Dina Zielinski and Yaniv Erlich (who also predicted DNA storage would be usable in about a decade) showed that the extracted content was error-free.
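The retrieval step can be sketched the same way. In this hypothetical Python example, a short tag at the start of each strand stands in for the sequence that the polymerase chain reaction targets, and a one-byte index after the tag gives the chunk order. The tag, the pool contents, and the two-bits-per-base mapping are all assumptions made for illustration; real selection and sequencing are chemical processes and considerably more involved.

```python
# Inverse of an assumed two-bits-per-base mapping (illustrative only).
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def bases_to_bytes(bases: str) -> bytes:
    """Turn a string of bases back into bytes, four bases per byte."""
    bits = "".join(BASE_TO_BITS[base] for base in bases)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def retrieve(pool: list[str], file_tag: str) -> bytes:
    """Select one file's strands from the pool and reassemble its data."""
    # Stand-in for PCR: keep only strands that begin with the file's tag.
    selected = [s[len(file_tag):] for s in pool if s.startswith(file_tag)]
    # Each selected strand decodes to a one-byte index followed by payload.
    decoded = sorted((bases_to_bytes(s) for s in selected), key=lambda b: b[0])
    return b"".join(chunk[1:] for chunk in decoded)


# A toy pool holding strands from two files, deliberately out of order.
pool = [
    "GGTT" + "AAAC" + "CAAC",           # tag GGTT, index 1, payload "A"
    "TTGG" + "AAAA" + "CGAG",           # a strand belonging to another file
    "GGTT" + "AAAA" + "CACA" + "CATG",  # tag GGTT, index 0, payload "DN"
]

print(retrieve(pool, "GGTT"))
```

Selecting by tag and then sorting by index mirrors the two roles described above: picking out the right sequences from the pool, then using the markers to restore the original order.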

While the process has been refined, the cost and time of the procedure are impeding further development. The chemical process used to manufacture DNA strands is both laborious and expensive: the 13,448,372 unique pieces of DNA used in the Microsoft study would cost $800,000 on the open market. That research, while record-breaking in quantity, "did not [show] any progress towards the goal" of increasing speed or decreasing cost, Erlich said in an interview with MIT Technology Review.
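For a sense of scale, the two figures quoted above can be turned into a rough per-sequence cost. This is simple arithmetic on the article's numbers, sketched in Python, and it assumes the $800,000 covers all 13,448,372 sequences equally.

```python
# Figures quoted in the article for the Microsoft study.
total_cost_usd = 800_000
unique_sequences = 13_448_372

# Average cost per synthesized sequence, assuming uniform pricing.
cost_per_sequence = total_cost_usd / unique_sequences
print(f"about {cost_per_sequence * 100:.1f} cents per sequence")
```

That works out to roughly six cents per sequence, which adds up quickly when a single stored file is spread across millions of them.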

Erlich himself has proposed a novel modification to tackle the problem: replace the 40-year-old, time-consuming chemical process currently used to make DNA with one that uses enzymes, as our own bodies do.

Biotech Solution to a Tech Problem

While these obstacles need to be overcome, DNA data storage could be the solution for a world that needs more and more data stored more and more compactly. Victor Zhirnov, Chief Scientist of the Semiconductor Research Corporation, told MIT Technology Review that "efforts to shrink computer memory are hitting physical limits," while Luis Ceze, Associate Professor at the University of Washington, said in a Microsoft video that "we're storing a lot of data, and current storage technologies cannot keep up with it."

DNA offers a solution to this issue, and a possible worldwide data revolution, because of three of its properties: density, longevity, and continued relevance.

"DNA is the densest known storage medium in the universe, just based on the laws of physics," Zhirnov said in the interview. Some of the statistics the scientists quote are mind-boggling: every movie ever made could fit inside a volume of DNA smaller than a sugar cube; the whole accessible internet, estimated to be a quintillion bytes, would fill no more than a shoebox; and all of your data could be stored in a drop of DNA.

The longevity claim remains somewhat controversial. While many tape storage experts (whose technology Microsoft eventually wants to make obsolete) remain doubtful, the teams undertaking these studies vouch that DNA is thousands of times more durable than a silicon device, and cite the example of DNA still being extracted from ancient remains.

Finally, because the medium is the same as our biological make-up, the Microsoft scientists claim that DNA won't be subject to the transient whims of trends and time. Erlich said in an interview with ResearchGate, "humanity is unlikely to lose its ability to read these molecules. If it does, we will have much bigger problems than data storage."

As the world's population grows and becomes increasingly reliant on ever-advancing technology, it produces more and more data, all of which needs to be stored securely. DNA data storage could be the solution that allows the march of big data (which was recently estimated by some to be more valuable than oil) to continue unimpeded.
