According to some estimates, the total amount of digital information produced globally will reach an astounding 40 trillion gigabytes by 2020. With numbers like these on the horizon, researchers around the world have been hard at work trying to come up with more advanced data storage systems that could handle all of that data in a way that’s both cost-efficient and reliable.
So far, these efforts have given us a technique capable of fitting as many as 2,200 terabytes of data on a single gram of DNA – a tremendous achievement, no less.
Last week, however, a team of researchers led by Olgica Milenkovic from the University of Illinois detailed a new system that can store up to 490 exabytes – or 490 billion gigabytes! – of information per gram of DNA. Moreover, the new technique also allows the data to be rewritten and accessed in a selective manner.
With DNA storage, data is first translated into binary code (1s and 0s) and then converted to DNA bases (A, G, T and C). Once the information is laid out, DNA is synthesized to match the data. To read the information contained inside, scientists simply sequence the DNA and convert the data back to binary.
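The round trip described above can be sketched in a few lines of Python. This is only an illustration: the two-bits-per-base mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names `encode` and `decode` are assumptions made for this example, not the scheme the researchers used, and a real system would also need error correction and would avoid sequences that are hard to synthesize.

```python
# Toy mapping of two bits to one DNA base. This particular assignment
# is an assumption for illustration, not the researchers' scheme.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Turn bytes into a string of bases, two bits per base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Turn a string of bases back into the original bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # the "synthesized" sequence
print(decode(strand))  # the "sequenced" data, back in binary form
```

With this mapping every byte becomes four bases, so the strand is always four times as long as the input.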
To make the stored information easier to access, Milenkovic and her team tagged the strands of DNA (each synthesized with 1,000 bytes of data) with two address sequences, one at each end. Once they amplified the strand they wanted, they could rewrite the information contained within using conventional DNA editing techniques.
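As a rough software analogy for this addressing idea, the sketch below treats each strand as a string with an address at either end. Selecting a strand by its addresses stands in for amplification, and swapping the middle part stands in for editing. The function names, the short made-up address sequences and the payloads are all hypothetical; the real process is biochemical, not string matching.

```python
def make_strand(left_addr: str, payload: str, right_addr: str) -> str:
    """Flank a payload with an address sequence at each end."""
    return left_addr + payload + right_addr


def matches(strand: str, left_addr: str, right_addr: str) -> bool:
    """True if the strand carries the given pair of addresses."""
    return strand.startswith(left_addr) and strand.endswith(right_addr)


def select(pool: list[str], left_addr: str, right_addr: str) -> list[str]:
    """Pick out only the addressed strands (a stand-in for amplification)."""
    return [s for s in pool if matches(s, left_addr, right_addr)]


def rewrite(pool: list[str], left_addr: str, right_addr: str,
            new_payload: str) -> list[str]:
    """Replace the payload of the addressed strands, leaving others untouched."""
    return [
        make_strand(left_addr, new_payload, right_addr)
        if matches(s, left_addr, right_addr) else s
        for s in pool
    ]


# Hypothetical addresses and payloads, far shorter than real ones.
pool = [
    make_strand("AACC", "GATTACA", "GGTT"),
    make_strand("CCAA", "TTGACCA", "TTGG"),
]
print(select(pool, "CCAA", "TTGG"))
print(rewrite(pool, "CCAA", "TTGG", "ACGTACG"))
```

The point of the analogy is that only the addressed strand is read or changed, so the rest of the stored data does not have to be sequenced or resynthesized.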
The system was tested on Wikipedia. “We encoded parts of the Wikipedia pages of six universities in the USA, and selected and edited parts of the text written in DNA corresponding to three of these schools,” wrote the authors in their paper, published in the science journal Nature.
Even though the new technique is a significant improvement over the previous ones in terms of storage capacity, it’s also much more expensive per kilobyte. Encoding and storing 17 kilobytes (KB) of data cost $4,023, or roughly $237 per KB, compared to $12,600 for storing 739 KB via one of the previous methods, or about $17 per KB.
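Because the two experiments stored very different amounts of data, the fairest comparison is cost per kilobyte. The short calculation below uses only the figures quoted above; the variable names are chosen for this example.

```python
# Figures quoted in the article.
new_cost_usd, new_size_kb = 4023, 17
old_cost_usd, old_size_kb = 12600, 739

# Normalize both experiments to dollars per kilobyte.
new_cost_per_kb = new_cost_usd / new_size_kb
old_cost_per_kb = old_cost_usd / old_size_kb

# How many times more expensive the new method is per kilobyte.
ratio = new_cost_per_kb / old_cost_per_kb

print(f"New method: ${new_cost_per_kb:.2f} per KB")
print(f"Earlier method: ${old_cost_per_kb:.2f} per KB")
print(f"Ratio: {ratio:.1f}x")
```

On these numbers the new method comes out at more than ten times the per-kilobyte cost of the earlier one, even though its total bill was smaller.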
“The costs of synthesizing DNA (i.e. recording the information) are prohibitively high at the moment to allow this technology to be competitive with flash or other memories,” said Milenkovic.
Despite this glaring problem, the future seems bright – DNA storage costs are declining extremely fast. According to the researchers, the cost of synthesizing 1,000-base-pair (bp) blocks dropped almost 7-fold during the past seven months alone.
With costs plummeting so fast, the new technique could soon be used by various governmental, scientific and historical organisations, as well as “monster” projects like the LHC, which generates about 15 petabytes of data every year.
Things don’t look so good for personal computers, though. “I do not see how one could directly connect classical computers with DNA storage media at the moment, as one needs to do some processing on the DNA output to make it readable by a computer,” said Milenkovic. “But this processing may be incorporated into new generations of DNA sequencers – I am not aware of anyone working on this subject at the moment, though.”