Data explosion in AI era: PolyU leads breakthroughs in protein-based data storage, delivering high storage capacity, strong stability and encryption capabilities
Facing the massive volumes of data generated by AI training and smart devices, a PolyU research team has achieved a world-first breakthrough: using engineered proteins to store digital data, and completing the full process from data storage to retrieval in de novo designed unnatural proteins. The team comprises Prof. Zhongping YAO, Associate Head and Professor of the Department of Applied Biology and Chemical Technology; Dr Cheuk-chi NG, Research Assistant Professor of the same department; and Prof. Chung-Ming Francis LAU, Associate Dean (Global Engagement) of the Faculty of Engineering and Professor of the Department of Electrical and Electronic Engineering.
Previously, the team used peptides (short amino acid chains) as data carriers, but peptides offered limited storage efficiency and were costly to produce. The team has now turned to proteins, whose much longer amino acid sequences deliver higher storage efficiency and capacity. Proteins can also be mass‑produced in bacteria at low cost and preserved stably in powder or solution form, outperforming DNA in stability.
The team overcame two major challenges:
Data‑bearing proteins tend to be unstable and difficult to express.
The full amino acid sequence must be accurately reconstructed to retrieve the encoded data.
Their innovative solution: Inspired by natural collagen, they designed a protein “backbone” template. By embedding data‑encoded amino acid sequences into this stable template, they successfully expressed the target proteins in E. coli, and retrieved the data using liquid chromatography–tandem mass spectrometry (LC‑MS/MS) combined with custom algorithms.
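The idea of embedding a data-encoded sequence into a fixed template can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the team's work: the 16-residue alphabet (4 bits per residue), the placeholder scaffold segments `SCAFFOLD_N` and `SCAFFOLD_C`, and the function names `encode` and `decode`. The actual template design and coding scheme are described in the published paper, not reproduced here.

```python
# Hypothetical 16-letter alphabet: each residue carries 4 bits (one nibble).
# The choice and order of residues are illustrative assumptions only.
ALPHABET = "ADEFHIKLNQRSTVWY"

# Placeholder scaffold segments standing in for a stable template.
# These are NOT the published collagen-inspired template.
SCAFFOLD_N = "GPPGPP"
SCAFFOLD_C = "GPPGPP"


def encode(data: bytes) -> str:
    """Map each byte to two residues and wrap the payload in the scaffold."""
    payload = "".join(ALPHABET[b >> 4] + ALPHABET[b & 0x0F] for b in data)
    return SCAFFOLD_N + payload + SCAFFOLD_C


def decode(protein: str) -> bytes:
    """Strip the scaffold and turn each residue pair back into one byte."""
    if not (protein.startswith(SCAFFOLD_N) and protein.endswith(SCAFFOLD_C)):
        raise ValueError("scaffold segments not found")
    payload = protein[len(SCAFFOLD_N):len(protein) - len(SCAFFOLD_C)]
    if len(payload) % 2 != 0:
        raise ValueError("payload must contain an even number of residues")
    # str.index raises ValueError for a residue outside the alphabet.
    nibbles = [ALPHABET.index(residue) for residue in payload]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


if __name__ == "__main__":
    sequence = encode(b"PolyU")
    print(sequence)
    print(decode(sequence))
```

In this sketch the "read" step is a simple string operation. In the work described above, the amino acid sequence is instead reconstructed from LC‑MS/MS measurements with custom algorithms, which is a far harder problem than the string round trip shown here.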
Impressive results:
30 times higher storage density than the peptide‑based method
Production cost only 10% of that of the peptide‑based method
Support for random access and cryptographic protection
The research findings have been published in Nature Communications, marking a major step toward sustainable, high‑capacity and highly stable data storage for the AI era.
Source: https://www.polyu.edu.hk/en/media/media-releases/2026/0514_polyu-leads-breakthroughs-in-protein-based-data-storage/