{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T17:38:39Z","timestamp":1740159519213,"version":"3.37.3"},"reference-count":43,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T00:00:00Z","timestamp":1698624000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T00:00:00Z","timestamp":1698624000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100008967","name":"Philipps-Universit\u00e4t Marburg","doi-asserted-by":"crossref","id":[{"id":"10.13039\/100008967","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Datenbank Spektrum"],"published-print":{"date-parts":[[2023,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Over the past decade, DNA has emerged as a\u00a0new storage medium with intriguing data volume and durability capabilities. Despite its advantages, DNA storage also has crucial limitations, such as intricate data access interfaces and restricted random accessibility. To overcome these limitations, DNAContainer has been introduced with a\u00a0novel storage interface for DNA that spans a\u00a0very large virtual address space on objects and allows random access to DNA at scale. In this paper, we substantially improve the first version of DNAContainer, focusing on the update capabilities of its data structures and optimizing its memory footprint. 
In addition, we extend the previous set of experiments on DNAContainer with new ones whose results reveal the impact of essential parameters on the performance and memory footprint.<\/jats:p>","DOI":"10.1007\/s13222-023-00460-3","type":"journal-article","created":{"date-parts":[[2023,10,30]],"date-time":"2023-10-30T16:02:05Z","timestamp":1698681725000},"page":"211-220","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["An Extension of DNAContainer with a Small Memory Footprint"],"prefix":"10.1007","volume":"23","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6276-4020","authenticated-orcid":false,"given":"Alex","family":"El-Shaikh","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9362-153X","authenticated-orcid":false,"given":"Bernhard","family":"Seeger","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,10,30]]},"reference":[{"key":"460_CR1","volume-title":"12th USENIX workshop on hot topics in storage and file systems (hotstorage 20)","author":"B Li","year":"2020","unstructured":"Li B, Song NY, Ou L, Du DHC (2020) Can we store the whole world\u2019s data in DNA storage? In: 12th USENIX workshop on hot topics in storage and file systems (hotstorage 20). USENIX Association, (https:\/\/www.usenix.org\/conference\/hotstorage20\/presentation\/li)"},{"issue":"1","key":"460_CR2","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s40537-020-00378-7","volume":"7","author":"TJ Ma","year":"2020","unstructured":"Ma TJ, Garcia RJ, Danford F, Patrizi L, Galasso J, Loyd J (2020) Big data actionable intelligence architecture. 
Journal of Big Data 7(1):1\u201319","journal-title":"Journal of Big Data"},{"key":"460_CR3","doi-asserted-by":"publisher","first-page":"637","DOI":"10.1145\/2872362.2872397","volume-title":"Proceedings of the twenty-first international conference on architectural support for programming languages and operating systems","author":"J Bornholt","year":"2016","unstructured":"Bornholt J, Lopez R, Carmean DM, Ceze L, Seelig G, Strauss K (2016) A\u00a0DNA-based archival storage system. In: Proceedings of the twenty-first international conference on architectural support for programming languages and operating systems, pp 637\u2013649"},{"issue":"4","key":"460_CR4","doi-asserted-by":"publisher","first-page":"366","DOI":"10.1038\/nmat4594","volume":"15","author":"V Zhirnov","year":"2016","unstructured":"Zhirnov V, Zadegan RM, Sandhu GS, Church GM, Hughes WL (2016) Nucleic acid memory. Nature Materials 15(4):366\u2013370","journal-title":"Nature Materials"},{"key":"460_CR5","doi-asserted-by":"crossref","unstructured":"Allentoft ME, Collins M, Harker D, Haile J, Oskam CL, Hale ML et al (2012) The half-life of DNA in bone: measuring decay kinetics in 158 dated fossils. Proceedings of the Royal Society B: Biological Sciences 279(1748):4724\u20134733","DOI":"10.1098\/rspb.2012.1745"},{"key":"460_CR6","first-page":"98","volume-title":"Biennial Conference on Innovative Data Systems Research (CIDR 2019)","author":"R Appuswamy","year":"2019","unstructured":"Appuswamy R, Lebrigand K, Barbry P, Antonini M, Madderson O, Freemont P et al (2019) OligoArchive: Using DNA in the DBMS storage hierarchy. In: Biennial Conference on Innovative Data Systems Research (CIDR 2019), p 98"},{"key":"460_CR7","series-title":"arXiv preprint arXiv:220505488","volume-title":"DNA data storage, sequencing data-carrying DNA","author":"J Quah","year":"2022","unstructured":"Quah J, Sella O, Heinis T (2022) DNA data storage, sequencing data-carrying DNA. 
arXiv preprint arXiv:220505488"},{"issue":"3","key":"460_CR8","first-page":"1","volume":"21","author":"YS Lin","year":"2022","unstructured":"Lin YS, Liang YP, Chen TY, Chang YH, Chen SH, Wei HW et al (2022) How to enable index scheme for reducing the writing cost of DNA storage on insertion and deletion. ACM Transactions on Embedded Computing Systems 21(3):1\u201325","journal-title":"ACM Transactions on Embedded Computing Systems"},{"issue":"3","key":"460_CR9","doi-asserted-by":"publisher","first-page":"242","DOI":"10.1038\/nbt.4079","volume":"36","author":"L Organick","year":"2018","unstructured":"Organick L, Ang SD, Chen YJ, Lopez R, Yekhanin S, Makarychev K et al (2018) Random access in large-scale DNA data storage. Nature Biotechnology 36(3):242\u2013248","journal-title":"Nature Biotechnology"},{"key":"460_CR10","first-page":"773","volume-title":"BTW 2023","author":"A El-Shaikh","year":"2023","unstructured":"El-Shaikh A, Seeger B (2023) DNAcontainer: an object-based storage architecture on DNA. In: BTW 2023. Gesellschaft f\u00fcr Informatik e.V., Bonn, pp 773\u2013795"},{"key":"460_CR11","doi-asserted-by":"publisher","first-page":"325","DOI":"10.2741\/e93","volume":"2","author":"H Liu","year":"2010","unstructured":"Liu H, Bebu I, Li X (2010) Microarray probes and probe sets. Frontiers in Bioscience (Elite edition) 2:325","journal-title":"Frontiers in Bioscience (Elite edition)"},{"issue":"6328","key":"460_CR12","doi-asserted-by":"publisher","first-page":"950","DOI":"10.1126\/science.aaj2038","volume":"355","author":"Y Erlich","year":"2017","unstructured":"Erlich Y, Zielinski D (2017) DNA Fountain enables a\u00a0robust and efficient storage architecture. 
Science 355(6328):950\u2013954","journal-title":"Science"},{"issue":"1","key":"460_CR13","doi-asserted-by":"publisher","first-page":"lqab126","DOI":"10.1093\/nargab\/lqab126","volume":"4","author":"A El-Shaikh","year":"2022","unstructured":"El-Shaikh A, Welzel M, Heider D, Seeger B (2022) High-scale random access on DNA storage systems. NAR Genomics and Bioinformatics 4(1):lqab126","journal-title":"NAR Genomics and Bioinformatics"},{"issue":"1","key":"460_CR14","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41467-020-16797-2","volume":"11","author":"KN Lin","year":"2020","unstructured":"Lin KN, Volkel K, Tuck JM, Keung AJ (2020) Dynamic and scalable DNA-based information storage. Nature Communications 11(1):1\u201312","journal-title":"Nature Communications"},{"key":"460_CR15","series-title":"bioRxiv","doi-asserted-by":"publisher","DOI":"10.1101\/2020.02.05.936369","volume-title":"Random access DNA memory in a\u00a0scalable, archival file storage system","author":"JL Banal","year":"2020","unstructured":"Banal JL, Shepherd TR, Berleant JD, Huang H, Reyes M, Ackerman CM et al (2020) Random access DNA memory in a\u00a0scalable, archival file storage system. bioRxiv. https:\/\/doi.org\/10.1101\/2020.02.05.936369"},{"issue":"8","key":"460_CR16","doi-asserted-by":"publisher","first-page":"456","DOI":"10.1038\/s41576-019-0125-3","volume":"20","author":"L Ceze","year":"2019","unstructured":"Ceze L, Nivala J, Strauss K (2019) Molecular digital data storage using DNA. Nature Reviews Genetics 20(8):456\u2013466","journal-title":"Nature Reviews Genetics"},{"issue":"10","key":"460_CR17","doi-asserted-by":"publisher","first-page":"5451","DOI":"10.1093\/nar\/gkab230","volume":"49","author":"C Xu","year":"2021","unstructured":"Xu C, Zhao C, Ma B, Liu H (2021) Uncertainties in synthetic DNA-based data storage. 
Nucleic Acids Research 49(10):5451\u20135469","journal-title":"Nucleic Acids Research"},{"issue":"1","key":"460_CR18","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13036-019-0211-2","volume":"13","author":"Y Wang","year":"2019","unstructured":"Wang Y, Zhang J, Gunawan E, Guan YL, Poh CL et al (2019) High capacity DNA data storage with variable-length Oligonucleotides using repeat accumulate code and hybrid mapping. Journal of Biological Engineering 13(1):1\u201311","journal-title":"Journal of Biological Engineering"},{"issue":"01","key":"460_CR19","doi-asserted-by":"publisher","first-page":"91","DOI":"10.1109\/69.50908","volume":"2","author":"O Deux","year":"1990","unstructured":"Deux O et al (1990) The story of O2. IEEE Transactions on Knowledge & Data Engineering 2(01):91\u2013108","journal-title":"IEEE Transactions on Knowledge & Data Engineering"},{"issue":"3","key":"460_CR20","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/2512961","volume":"46","author":"D Ma","year":"2014","unstructured":"Ma D, Feng J, Li G (2014) A\u00a0survey of address translation technologies for flash memories. ACM Computing Surveys 46(3):1\u201339","journal-title":"ACM Computing Surveys"},{"issue":"5","key":"460_CR21","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1038\/nmeth.2918","volume":"11","author":"S Kosuri","year":"2014","unstructured":"Kosuri S, Church GM (2014) Large-scale de novo DNA synthesis: technologies and applications. Nature Methods 11(5):499\u2013507","journal-title":"Nature Methods"},{"issue":"11","key":"460_CR22","doi-asserted-by":"publisher","first-page":"3322","DOI":"10.1093\/bioinformatics\/btaa140","volume":"36","author":"M Schwarz","year":"2020","unstructured":"Schwarz M, Welzel M, Kabdullayeva T, Becker A, Freisleben B, Heider D (2020) MESA: automated assessment of synthetic DNA fragments and simulation of DNA synthesis, storage, sequencing and PCR errors. 
Bioinformatics 36(11):3322\u20133326","journal-title":"Bioinformatics"},{"issue":"1","key":"460_CR23","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-019-45832-6","volume":"9","author":"R Heckel","year":"2019","unstructured":"Heckel R, Mikutis G, Grass RN (2019) A\u00a0characterization of the DNA data storage channel. Scientific Reports 9(1):1\u201312","journal-title":"Scientific Reports"},{"issue":"6","key":"460_CR24","doi-asserted-by":"publisher","first-page":"333","DOI":"10.1038\/nrg.2016.49","volume":"17","author":"S Goodwin","year":"2016","unstructured":"Goodwin S, McPherson JD, McCombie WR (2016) Coming of age: ten years of next-generation sequencing technologies. Nature Reviews Genetics 17(6):333\u2013351","journal-title":"Nature Reviews Genetics"},{"issue":"1","key":"460_CR25","doi-asserted-by":"publisher","first-page":"129","DOI":"10.1146\/annurev.bioeng.4.020702.153438","volume":"4","author":"MJ Heller","year":"2002","unstructured":"Heller MJ (2002) DNA microarray technology: devices, systems, and applications. Annual Review of Biomedical Engineering 4(1):129\u2013153","journal-title":"Annual Review of Biomedical Engineering"},{"issue":"7435","key":"460_CR26","doi-asserted-by":"publisher","first-page":"77","DOI":"10.1038\/nature11875","volume":"494","author":"N Goldman","year":"2013","unstructured":"Goldman N, Bertone P, Chen S, Dessimoz C, LeProust EM, Sipos B et al (2013) Towards practical, high-capacity, low-maintenance information storage in synthesized DNA. Nature 494(7435):77\u201380","journal-title":"Nature"},{"issue":"6","key":"460_CR27","first-page":"1092","volume":"7","author":"Y Dong","year":"2020","unstructured":"Dong Y, Sun F, Ping Z, Ouyang Q, Qian L (2020) DNA storage: research landscape and future prospects. 
National Science Review 7(6):1092\u20131107","journal-title":"National Science Review"},{"issue":"1","key":"460_CR28","doi-asserted-by":"publisher","first-page":"628","DOI":"10.1038\/s41467-023-36297-3","volume":"14","author":"M Welzel","year":"2023","unstructured":"Welzel M, Schwarz PM, L\u00f6chel HF, Kabdullayeva T, Clemens S, Becker A et al (2023) DNA-Aeon provides flexible arithmetic coding for constraint adherence and error correction in DNA storage. Nature Communications 14(1):628","journal-title":"Nature Communications"},{"key":"460_CR29","doi-asserted-by":"crossref","unstructured":"Park SJ, Park H, Kwak HY, No JS (2023) BIC codes: bit insertion-based constrained codes with error correction for DNA storage. IEEE Transactions on Emerging Topics in Computing 11(3):764\u2013777","DOI":"10.1109\/TETC.2023.3268274"},{"key":"460_CR30","doi-asserted-by":"publisher","first-page":"604","DOI":"10.1145\/276698.276876","volume-title":"Proceedings of the thirtieth annual ACM symposium on theory of computing","author":"P Indyk","year":"1998","unstructured":"Indyk P, Motwani R (1998) Approximate nearest neighbors: towards removing the curse of dimensionality. In: Proceedings of the thirtieth annual ACM symposium on theory of computing, pp 604\u2013613"},{"key":"460_CR31","series-title":"Cat. No. 97TB100171","first-page":"21","volume-title":"Proceedings. Compression and Complexity of SEQUENCES 1997","author":"AZ Broder","year":"1997","unstructured":"Broder AZ (1997) On the resemblance and containment of documents. In: Proceedings. Compression and Complexity of SEQUENCES 1997. Cat. No. 97TB100171. IEEE, pp 21\u201329"},{"key":"460_CR32","volume-title":"Mining of massive datasets","author":"A Rajaraman","year":"2011","unstructured":"Rajaraman A, Ullman JD (2011) Mining of massive datasets. 
Cambridge University Press"},{"issue":"6","key":"460_CR33","doi-asserted-by":"publisher","first-page":"817","DOI":"10.1109\/TC.2011.108","volume":"61","author":"Y Hua","year":"2011","unstructured":"Hua Y, Xiao B, Veeravalli B, Feng D (2011) Locality-sensitive Bloom filter for approximate membership query. IEEE Transactions on Computers 61(6):817\u2013830","journal-title":"IEEE Transactions on Computers"},{"issue":"2","key":"460_CR34","doi-asserted-by":"publisher","first-page":"1912","DOI":"10.1109\/COMST.2018.2889329","volume":"21","author":"L Luo","year":"2018","unstructured":"Luo L, Guo D, Ma RT, Rottenstreich O, Luo X (2018) Optimizing bloom filter: challenges, solutions, and comparisons. IEEE Communications Surveys & Tutorials 21(2):1912\u20131949","journal-title":"IEEE Communications Surveys & Tutorials"},{"issue":"2","key":"460_CR35","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1038\/nmeth.1419","volume":"7","author":"L Mamanova","year":"2010","unstructured":"Mamanova L, Coffey AJ, Scott CE, Kozarewa I, Turner EH, Kumar A et al (2010) Target-enrichment strategies for next-generation sequencing. Nature Methods 7(2):111\u2013118","journal-title":"Nature Methods"},{"issue":"6","key":"460_CR36","doi-asserted-by":"publisher","first-page":"1241","DOI":"10.1021\/acssynbio.9b00100","volume":"8","author":"KJ Tomek","year":"2019","unstructured":"Tomek KJ, Volkel K, Simpson A, Hass AG, Indermaur EW, Tuck JM et al (2019) Driving the scalability of DNA-based information storage systems. ACS Synthetic Biology 8(6):1241\u20131248","journal-title":"ACS Synthetic Biology"},{"issue":"1","key":"460_CR37","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1109\/SURV.2011.031611.00024","volume":"14","author":"S Tarkoma","year":"2011","unstructured":"Tarkoma S, Rothenberg CE, Lagerspetz E (2011) Theory and practice of bloom filters for distributed systems. 
IEEE Communications Surveys & Tutorials 14(1):131\u2013155","journal-title":"IEEE Communications Surveys & Tutorials"},{"issue":"6","key":"460_CR38","doi-asserted-by":"publisher","first-page":"557","DOI":"10.1109\/LCOMM.2010.06.100344","volume":"14","author":"CE Rothenberg","year":"2010","unstructured":"Rothenberg CE, Macapuna CA, Verdi FL, Magalhaes MF (2010) The deletable Bloom filter: a\u00a0new member of the Bloom family. IEEE Communications Letters 14(6):557\u2013559","journal-title":"IEEE Communications Letters"},{"key":"460_CR39","unstructured":"GBIF Org User Occurrence download. The global biodiversity information facility. https:\/\/www.gbif.org\/occurrence\/download\/0165113-230224095556074. Accessed: 26.10.2023"},{"issue":"6","key":"460_CR40","doi-asserted-by":"publisher","first-page":"2551","DOI":"10.1109\/TIT.2006.874390","volume":"52","author":"A Shokrollahi","year":"2006","unstructured":"Shokrollahi A (2006) Raptor codes. IEEE Transactions on Information Theory 52(6):2551\u20132567","journal-title":"IEEE Transactions on Information Theory"},{"issue":"1","key":"460_CR41","doi-asserted-by":"publisher","first-page":"7053","DOI":"10.1038\/s41598-023-34160-5","volume":"13","author":"A El-Shaikh","year":"2023","unstructured":"El-Shaikh A, Seeger B (2023) Content-based filter queries on DNA data storage systems. Scientific Reports 13(1):7053. https:\/\/doi.org\/10.1038\/s41598-023-34160-5","journal-title":"Scientific Reports"},{"key":"460_CR42","series-title":"arXiv:190611062","volume-title":"Survey of information encoding techniques for DNA","author":"T Heinis","year":"2019","unstructured":"Heinis T, Alnasir JJ (2019) Survey of information encoding techniques for DNA. 
arXiv:190611062"},{"issue":"6","key":"460_CR43","doi-asserted-by":"publisher","first-page":"giz075","DOI":"10.1093\/gigascience\/giz075","volume":"8","author":"Z Ping","year":"2019","unstructured":"Ping Z, Ma D, Huang X, Chen S, Liu L, Guo F et al (2019) Carbon-based archiving: current progress and future prospects of DNA-based data storage. GigaScience 8(6):giz075","journal-title":"GigaScience"}],"container-title":["Datenbank-Spektrum"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13222-023-00460-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s13222-023-00460-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s13222-023-00460-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,7,15]],"date-time":"2024-07-15T11:08:45Z","timestamp":1721041725000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s13222-023-00460-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,10,30]]},"references-count":43,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,11]]}},"alternative-id":["460"],"URL":"https:\/\/doi.org\/10.1007\/s13222-023-00460-3","relation":{},"ISSN":["1618-2162","1610-1995"],"issn-type":[{"type":"print","value":"1618-2162"},{"type":"electronic","value":"1610-1995"}],"subject":[],"published":{"date-parts":[[2023,10,30]]},"assertion":[{"value":"31 May 2023","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 October 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"30 October 
2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"No declared competing interests.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}