Random access and semantic search in DNA data storage enabled by Cas9 and machine-guided design

Bibliographic Details
Published in: Nature Communications vol. 16, no. 1 (2025), p. 6388
Main Author: Imburgia, Carina
Other Authors: Organick, Lee; Zhang, Karen; Cardozo, Nicolas; McBride, Jeff; Bee, Callista; Wilde, Delaney; Roote, Gwendolin; Jorgensen, Sophia; Ward, David; Anderson, Charlie; Strauss, Karin; Ceze, Luis; Nivala, Jeff
Published:
Nature Publishing Group
Subjects:
Online Access: Citation/Abstract
Full Text
Full Text - PDF

MARC

LEADER 00000nab a2200000uu 4500
001 3228985524
003 UK-CbPIL
022 |a 2041-1723 
024 7 |a 10.1038/s41467-025-61264-5  |2 doi 
035 |a 3228985524 
045 2 |b d20250101  |b d20251231 
084 |a 145839  |2 nlm 
100 1 |a Imburgia, Carina  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
245 1 |a Random access and semantic search in DNA data storage enabled by Cas9 and machine-guided design 
260 |b Nature Publishing Group  |c 2025 
513 |a Journal Article 
520 3 |a DNA is a promising medium for digital data storage due to its exceptional data density and longevity. Practical DNA-based storage systems require selective data retrieval to minimize decoding time and costs. In this work, we introduce CRISPR-Cas9 as a user-friendly tool for multiplexed, low-latency molecular data extraction. We first present a one-pot, multiplexed random access method in which specific data files are selectively cleaved using a CRISPR-Cas9 addressing system and then sequenced via nanopore technology. This approach was validated on a pool of 1.6 million DNA sequences, comprising 25 unique data files. We then developed a molecular similarity-search approach combining machine learning with Cas9-based retrieval. Using a deep neural network, we mapped a database of 1.74 million images into a reduced-dimensional embedding, encoding each embedding as a Cas9 target sequence. These target sequences act as molecular addresses, capturing clusters of semantically related images. By leveraging Cas9’s off-target cleavage activity, query sequences cleave both exact and closely related targets, enabling high-fidelity retrieval of molecular addresses corresponding to in silico image clusters similar to the query. These approaches move towards addressing key challenges in molecular data retrieval by offering simplified, rapid isothermal protocols and new DNA data access capabilities. CRISPR-Cas9 has potential as an efficient tool for information retrieval in DNA data storage. Here the authors present a Cas9-based random access and similarity search approach and test on DNA databases, progressing toward simpler, isothermal protocols. 
653 |a Databases 
653 |a Similarity 
653 |a Metadata 
653 |a CRISPR 
653 |a Random access 
653 |a Information retrieval 
653 |a Digital data 
653 |a Artificial neural networks 
653 |a Data storage 
653 |a Clusters 
653 |a Nucleotide sequence 
653 |a Machine learning 
653 |a Data retrieval 
653 |a Storage systems 
653 |a Information storage 
653 |a Gene sequencing 
653 |a Searching 
653 |a Deoxyribonucleic acid--DNA 
653 |a Multiplexing 
653 |a Design 
653 |a Images 
653 |a Information processing 
653 |a Latency 
653 |a Embedding 
653 |a Neural networks 
653 |a Semantics 
653 |a Environmental 
700 1 |a Organick, Lee  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Zhang, Karen  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Cardozo, Nicolas  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a McBride, Jeff  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Bee, Callista  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Wilde, Delaney  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Roote, Gwendolin  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Jorgensen, Sophia  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Ward, David  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Anderson, Charlie  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Strauss, Karin  |u Microsoft Research, Redmond, USA (GRID:grid.419815.0) (ISNI:0000 0001 2181 3404) 
700 1 |a Ceze, Luis  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
700 1 |a Nivala, Jeff  |u Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657); Molecular Engineering and Sciences Institute, University of Washington, Seattle, USA (GRID:grid.34477.33) (ISNI:0000000122986657) 
773 0 |t Nature Communications  |g vol. 16, no. 1 (2025), p. 6388 
786 0 |d ProQuest  |t Health & Medical Collection 
856 4 1 |3 Citation/Abstract  |u https://www.proquest.com/docview/3228985524/abstract/embedded/H09TXR3UUZB2ISDL?source=fedsrch 
856 4 0 |3 Full Text  |u https://www.proquest.com/docview/3228985524/fulltext/embedded/H09TXR3UUZB2ISDL?source=fedsrch 
856 4 0 |3 Full Text - PDF  |u https://www.proquest.com/docview/3228985524/fulltextPDF/embedded/H09TXR3UUZB2ISDL?source=fedsrch