DNA-based data storage: the next generation of data backup

By Michael Cade, Senior Global Technologist, Veeam and Sandeep Bhambure, Vice President and Managing Director, India & SAARC, Veeam Software

India is one of the largest and fastest-growing markets for digital consumers, and it is witnessing an enormous digital surge among consumers and enterprises alike. With India moving towards a cashless economy, smartphones becoming cheaper and more widely available, and the Indian government stressing self-reliance and data protection through data localisation, the volume of data generated by consumers and organizations is at an all-time high.

According to a report by McKinsey, in 2018 India had 560 million internet subscribers, second only to China. Indian mobile data users consume 8.3 gigabytes (GB) of data each month on average, compared with 5.5 GB for mobile users in China. By 2025, core digital sectors such as IT and business process management (IT-BPM), digital communication services, and electronics manufacturing could double their GDP level to between $355 billion and $435 billion. Newly digitising sectors, including agriculture, education, energy, financial services, healthcare, logistics, and retail, as well as government services and labour markets, could each create $10 billion to $150 billion of incremental economic value in 2025. Growth on this scale implies a staggering increase in the amount of data we’re generating, storing and accessing.

Data has become the common denominator which sits across everything organizations do. Whether it’s driving the day-to-day activities we all take for granted or providing the new insights which shape our thinking around some of humanity’s biggest questions, data is augmenting and empowering human intelligence.

With all of this in mind, it’s likely that we’ll need to fundamentally reconsider the data storage technologies we have on hand. The staggering amount of data we’re generating is already causing challenges, with data centre technologies requiring significant power and cooling, as well as ongoing maintenance and monitoring. We could be moving towards a huge bottleneck in the available capabilities, as both the volume of data and the speed at which it must be accessed increase further. What’s more, hardware such as servers, hard drives and flash storage degrades over time. It may seem unlikely at first, but there’s much we can learn from the natural world about data storage. The medium here is DNA, and when it comes to preserving and archiving vital information, it has an unbeatable track record.

Nature’s storage medium

One alternative to our current storage devices could be DNA-based data storage. DNA has two big advantages: it is ultra-compact and, thanks to its primary role in creating life, easy to replicate. One gram of DNA could potentially hold as much as 455 exabytes of data, according to the New Scientist. At that density, a remarkably small mass of DNA could in principle hold all the digital data currently in the world. And while DNA is itself quite fragile, when stored in the right conditions it can be incredibly stable. Remains thousands of years old have been found with DNA still intact. The longevity of cassettes and CDs just doesn’t compare, and so from an archiving and backup perspective, it could be the perfect material.

Progress on the technology has been extremely promising, with Microsoft and University of Washington researchers last year developing the world’s first DNA storage device that can carry out the entire process automatically. Using the device, researchers encoded the word ‘hello’ on to DNA, and were able to convert it back to data readable by a computer.
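To make the idea of writing ‘hello’ into DNA concrete, here is a minimal sketch in Python. It assumes the simplest possible mapping, two bits per nucleotide (00 = A, 01 = C, 10 = G, 11 = T), which is an illustration only and not the scheme used by the Microsoft and University of Washington device. Practical encodings also add error correction and avoid sequences that are hard to synthesise or read. The function names `encode` and `decode` are hypothetical.

```python
# Illustrative only: a naive 2-bits-per-base mapping (an assumption for this
# sketch, not the encoding used by any real DNA storage system).
BASES = "ACGT"  # index 0..3 corresponds to bit pairs 00, 01, 10, 11


def encode(data: bytes) -> str:
    """Turn each byte into four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Rebuild the original bytes from a strand produced by encode()."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4 bases")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # Shift in two bits per base; unknown letters raise ValueError.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


strand = encode(b"hello")
print(strand)          # 5 bytes become 20 bases
print(decode(strand))  # round-trips back to b'hello'
```

Under this mapping the five bytes of ‘hello’ become a strand of just 20 bases, which hints at why DNA is so compact as a medium. The hard part is not the mapping itself but physically synthesising and sequencing the strand.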

From DNA to glass

In the race to find the data storage medium of the future, glass is another material in the running. Microsoft’s Project Silica, for example, is a proof of concept that uses quartz glass as a storage medium. Lasers permanently change the structure of the glass, making it possible to store data that can then be read by machine learning algorithms. Because it takes up a fraction of the space and needs none of the climate-controlled storage or regular maintenance of typical storage media, it holds immense promise for archiving and backup activity.

As for DNA, techniques are steadily improving, but the time and cost of encoding and decoding the information need to come down before DNA data storage can be used commercially. Scientists have been experimenting with storing digital data in DNA since 2012, yet it still took 21 hours for that 5-byte ‘hello’ message to be written and then read back out. However, progress is steady: sequencing a human genome cost $100m in 2001, whereas today it takes two days and $1,000.

The business of backup could be transformed by DNA. Archives and data centres, and their immense physical footprints, could be eliminated. The sum of the world’s knowledge may well one day be stored on something you need a microscope to observe. And as we generate even more data, and reach the limit of our current storage technologies, the value of powerful alternatives will only become greater. Today’s complex backup efforts could be reduced to a single record, created once, that lasts well beyond any living memory. The next generation of storage technology is in some ways already here; we just need to learn how to harness it. A next-generation data storage technology is critical to delivering faster, more reliable, secure, scalable and cost-efficient backup solutions for cloud data management.

