4 min readCollaborators Preserve Archive-Quality Audio Recordings on DNA
San Francisco, CA — Twist Bioscience has announced that, working with Microsoft and University of Washington researchers, they have successfully stored archival-quality audio recordings of two important music performances from the archives of the world-renowned Montreux Jazz Festival.
According to the report, these selections are encoded and stored in nature’s preferred storage medium, DNA. These tiny specks of DNA will preserve a part of UNESCO’s Memory of the World Archive, where valuable cultural heritage collections are recorded. This is the first time DNA has been used as a long-term archival-quality storage medium.
Quincy Jones, Entertainment Executive, Music Composer and Arranger, Musician and Music Producer said, “With advancements in nanotechnology, I believe we can expect to see people living prolonged lives, and with that, we can also expect to see more developments in the enhancement of how we live. For me, life is all about learning where you came from in order to get where you want to go, but in order to do so, you need access to history! And with the unreliability of how archives are often stored, I sometimes worry that our future generations will be left without such access…So, it absolutely makes my soul smile to know that EPFL, Twist Bioscience and others are coming together to preserve the beauty and history of the Montreux Jazz Festival for our future generations, on DNA!…I’ve been a part of this festival for decades and it truly is a magnificent representation of what happens when different cultures unite for the sake of music. Absolute magic. And I’m proud to know that the memory of this special place will never be lost.”
“Our partnership with EPFL in digitizing our archives aims not only at their positive exploration, but also at their preservation for the next generations,” says Thierry Amsallem, president of the Claude Nobs Foundation. “By taking part in this pioneering experiment which writes the songs into DNA strands, we can be certain that they will be saved on a medium that will never become obsolete!”
The Montreux Jazz Digital Project is a collaboration between the Claude Nobs Foundation, curator of the Montreux Jazz Festival audio-visual collection and the École Polytechnique Fédérale de Lausanne (EPFL) to digitize, enrich, store, show, and preserve this notable legacy created by Claude Nobs, the Festival’s founder.
In this proof-of-principle project, two quintessential music performances from the Montreux Jazz Festival – Smoke on the Water, performed by Deep Purple and Tutu, performed by Miles Davis – have been encoded onto DNA and read back with 100 percent accuracy. After being decoded, the songs were played on September 29th at the ArtTech Forum in Lausanne, Switzerland. Smoke on the Water was selected as a tribute to Claude Nobs, the Montreux Jazz Festival’s founder. The song memorializes a fire and Funky Claude’s rescue efforts at the Casino Barrière de Montreuxduring a Frank Zappa concert promoted by Claude Nobs. Miles Davis’ Tutu was selected for the role he played in music history and the Montreux Jazz Festival’s success. Miles Davis died in 1991.
“We archived two magical musical pieces on DNA of this historic collection, equating to 140MB of stored data in DNA,” said Dr. Karin Strauss, a Senior Researcher at Microsoft, and one of the project’s leaders. “The amount of DNA used to store these songs is much smaller than one grain of sand. Amazingly, storing the entire six petabyte Montreux Jazz Festival’s collection would result in DNA smaller than one grain of rice.”
Dr. Luis Ceze, a professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, said, “DNA, nature’s preferred information storage medium, is an ideal fit for digital archives because of its durability, density and eternal relevance. Storing items from the Montreux Jazz Festival is a perfect way to show how fast DNA digital data storage is becoming real.”
Nature’s Preferred Storage Medium
Nature selected DNA as its hard drive billions of years ago to encode all the genetic instructions necessary for life. These instructions include all the information necessary for survival. DNA molecules encode information with sequences of discrete units. In computers, these discrete units are the 0s and 1s of “binary code,” whereas in DNA molecules, the units are the four distinct nucleotide bases: adenine (A), cytosine (C), guanine (G) and thymine (T).
“DNA is a remarkably efficient molecule that can remain stable for millennia,” said Dr. Bill Peck, chief technology officer of Twist Bioscience. “This is a very exciting project: we are now in an age where we can use the remarkable efficiencies of nature to archive master copies of our cultural heritage in DNA. As we develop the economies of this process new performances can be added any time. Unlike current storage technologies, nature’s media will not change and will remain readable through time. There will be no new technology to replace DNA, nature has already optimized the format.”
DNA: Far More Efficient Than a Computer
Each cell within the human body contains approximately three billion base pairs of DNA. With 75 trillion cells in the human body, this equates to the storage of 150 zettabytes (1021) of information within each body. By comparison, the largest data centres can be hundreds of thousands to even millions of square feet to hold a comparable amount of stored data.
The Elegance of DNA as a Storage Medium
Like music, which can be widely varied with a finite number of notes, DNA encodes individuality with only four different letters in varied combinations. When using DNA as a storage medium, there are several advantages in addition to the universality of the format and incredible storage density. DNA can be stable for thousands of years when stored in a cool dry place and is easy to copy using polymerase chain reaction to create back-up copies of archived material. In addition, because of PCR, small data sets can be targeted and recovered quickly from a large dataset without needing to read the entire file.
How to Store Digital Data in DNA
To encode the music performances into archival storage copies in DNA, Twist Bioscience worked with Microsoft and University of Washington researchers to complete four steps: Coding, synthesis/storage, retrieval and decoding. First, the digital files were converted from the binary code using 0s and 1s into sequences of A, C, T and G. For purposes of the example, 00 represents A, 10 represents C, 01 represents G and 11 represents T. Twist Bioscience then synthesizes the DNA in short segments in the sequence order provided. The short DNA segments each contain about 12 bytes of data as well as a sequence number to indicate their place within the overall sequence. This is the process of storage. And finally, to ensure that the file is stored accurately, the sequence is read back to ensure 100 percent accuracy, and then decoded from A, C, T or G into a two-digit binary representation.
Importantly, to encapsulate and preserve encoded DNA, the collaborators are working with Professor Dr. Robert Grass of ETH Zurich. Grass has developed an innovative technology inspired by preservation of DNA within prehistoric fossils. With this technology, digital data encoded in DNA remains preserved for millennia.
“We are incredibly proud to be a part of this momentous event, with the first archived songs placed into the UNESCO Memory of the World Register,” said Dr. Emily Leproust, CEO of Twist Bioscience.
Article adapted from a Twist Bioscience news release.