DNA data storage is about to get wild.

DNA offers the best information storage device scientists have ever discovered. But accessing it is incredibly complex.
That's why scientists set out to develop a somewhat simplified method of DNA storage, with impressive results.
The researchers were able to show that theoretically, the entirety of YouTube could fit on a teaspoon.
If ever there were a sprawling website, it would be YouTube. The video-sharing site, launched in 2005, has become one of the dominant forces of the Internet, both for good and for, well, less than good. But there's something both its critics and advocates would agree on: It's big. Really, really big.

That's why it's so shocking that researchers have shown the theoretical possibility that they could store 10 petabytes (10 million gigabytes) of data in a single gram of DNA. Potentially, all of YouTube could fit on a teaspoon.

The study, from researchers at the Technion–Israel Institute of Technology in Haifa and the Interdisciplinary Center (IDC) Herzliya, also in Israel, is meant to examine the possibility of DNA as data storage. With use of the cloud now commonplace, data storage has become increasingly crucial. Server farms, the traditional solution, have raised environmental concerns, given their large demands on electricity. Some companies, like Microsoft, have experimented with putting their servers underwater to deal with the challenge.

DNA already holds the intensely complex code for human life, which makes it potentially amazing for data storage. But it's tough. Encoding information in DNA requires a chain made up of links called nucleotides. These nucleotides are the four building blocks of life, marked with the letters A, C, G, and T. Binary sequences consisting of 0s and 1s are then translated into these four letters.
During a process known as synthesis, DNA molecules are produced representing these same sequences. Then, in a process called sequencing, researchers create an output that represents the original nucleotide sequence.
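To make the translation step concrete, here is a minimal Python sketch of how binary data could be mapped onto the four letters and back. The two-bits-per-letter table is an assumption chosen for illustration, not the mapping used in the study, and the function names `encode` and `decode` are hypothetical. Real schemes also add error correction and avoid problematic sequences, which this sketch ignores.

```python
# Illustrative mapping only (an assumption): each pair of bits becomes one letter.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Translate bytes into a string of nucleotide letters (what synthesis would write)."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> bytes:
    """Translate nucleotide letters back into bytes (what sequencing would recover)."""
    bits = "".join(BASE_TO_BITS[base] for base in strand)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


strand = encode(b"Hi")
print(strand)          # CAGACGGC
print(decode(strand))  # b'Hi'
```

With four letters, each position in the strand carries two bits, so one byte takes four letters.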

Solving these problems, even in theory, is a step forward. In a press statement, the team describes its progress as:
(1) increasing the number of letters used to encode the information (beyond the original 4 letters); (2) significantly reducing the number of synthesis rounds required to store information on DNA; (3) improving the error correction mechanism used.
"The current synthesis and sequencing processes are inherently redundant, because each molecule is produced in large numbers and is read in multiple copies during sequencing," says Professor Zohar Yakhini of the Technion in the press statement. "The method we developed leverages this redundancy to increase the effective number of letters well over the original four letters, making it possible for us to encode and write each unit of information in fewer cycles of synthesis."

The team was able to reduce the number of synthesis rounds required per unit of information by 20 percent. Given the intense complexity of this work, anything that makes it simpler and more efficient is a step in the right direction. The scientists' work could lead to a 75 percent reduction in the future.
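The link between a larger alphabet and fewer synthesis cycles can be sketched with some simple arithmetic. The Python snippet below assumes an idealized case in which each letter carries log2(alphabet size) bits and one letter is written per cycle. It ignores error-correction overhead, and the alphabet sizes shown are hypothetical examples, not the ones the team used. The function names are likewise made up for illustration.

```python
import math


def cycles_needed(num_bits: int, alphabet_size: int) -> int:
    """Idealized number of synthesis cycles to write num_bits, one letter per cycle."""
    bits_per_letter = math.log2(alphabet_size)
    return math.ceil(num_bits / bits_per_letter)


def reduction_vs_four(alphabet_size: int) -> float:
    """Fractional drop in cycles compared with the standard four letters (2 bits each)."""
    return 1 - 2 / math.log2(alphabet_size)


# Hypothetical alphabet sizes, chosen only to show the trend.
for size in (4, 8, 16):
    print(size, cycles_needed(1000, size), f"{reduction_vs_four(size):.0%}")
```

Under these assumptions, the gains grow slowly: doubling the alphabet adds only one bit per letter, so large reductions call for much larger alphabets.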
"In this work, we have implemented a DNA-based storage system that encodes information with synthesis efficiency that is significantly better than the standard approach," says Professor Roee Amit, who runs a synthetic biology lab at the Technion. "The study included the actual implementation of the new coding technique for storing large-volume information on DNA molecules and reconstructing it for testing the process."

Scientists are also considering CRISPR techniques to make DNA more malleable to information storage.