{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,30]],"date-time":"2025-12-30T23:45:41Z","timestamp":1767138341046,"version":"build-2238731810"},"publisher-location":"Cham","reference-count":31,"publisher":"Springer International Publishing","isbn-type":[{"value":"9783030507428","type":"print"},{"value":"9783030507435","type":"electronic"}],"license":[{"start":{"date-parts":[[2020,1,1]],"date-time":"2020-01-01T00:00:00Z","timestamp":1577836800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2020,6,15]],"date-time":"2020-06-15T00:00:00Z","timestamp":1592179200000},"content-version":"vor","delay-in-days":166,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2020]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Research is increasingly becoming data-driven, and natural sciences are not an exception. In both biology and medicine, we are observing an exponential growth of structured data collections from experiments and population studies, enabling us to gain novel insights that would otherwise not be possible. However, these growing data sets pose a challenge for existing compute infrastructures since data is outgrowing limits within compute. In this work, we present the application of a novel approach, Memory-Driven Computing (MDC), in the life sciences. MDC proposes a data-centric approach that has been designed for growing data sizes and provides a composable infrastructure for changing workloads. In particular, we show how a typical pipeline for genomics data processing can be accelerated, and application modifications required to exploit this novel architecture. Furthermore, we demonstrate how the isolated evaluation of individual tasks misses significant overheads of typical pipelines in genomics data processing.<\/jats:p>","DOI":"10.1007\/978-3-030-50743-5_17","type":"book-chapter","created":{"date-parts":[[2020,6,15]],"date-time":"2020-06-15T15:03:45Z","timestamp":1592233425000},"page":"328-344","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Scaling Genomics Data Processing with Memory-Driven Computing to Accelerate Computational Biology"],"prefix":"10.1007","author":[{"given":"Matthias","family":"Becker","sequence":"first","affiliation":[]},{"given":"Umesh","family":"Worlikar","sequence":"additional","affiliation":[]},{"given":"Shobhit","family":"Agrawal","sequence":"additional","affiliation":[]},{"given":"Hartmut","family":"Schultze","sequence":"additional","affiliation":[]},{"given":"Thomas","family":"Ulas","sequence":"additional","affiliation":[]},{"given":"Sharad","family":"Singhal","sequence":"additional","affiliation":[]},{"given":"Joachim L.","family":"Schultze","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2020,6,15]]},"reference":[{"key":"17_CR1","unstructured":"jeMalloc. http:\/\/jemalloc.net"},{"key":"17_CR2","unstructured":"SAM specification (2019). http:\/\/samtools.github.io\/hts-specs\/SAMv1.pdf"},{"key":"17_CR3","unstructured":"SAMtools 1.9 documentation (2019)"},{"key":"17_CR4","unstructured":"The National Institutes of Health (NIH) Sequence Read Archive (SRA) (2019). https:\/\/www.ncbi.nlm.nih.gov\/sra\/"},{"issue":"21","key":"17_CR5","doi-asserted-by":"publisher","first-page":"3355","DOI":"10.1093\/bioinformatics\/btx342","volume":"33","author":"M Alser","year":"2017","unstructured":"Alser, M., Hassan, H., Xin, H., Ergin, O., Mutlu, O., Alkan, C.: GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping. Bioinform. 33(21), 3355\u20133363 (2017). https:\/\/doi.org\/10.1093\/bioinformatics\/btx342. (Oxford England)","journal-title":"Bioinform."},{"key":"17_CR6","doi-asserted-by":"crossref","unstructured":"Becker, M., et al.: Accelerated genomics data processing using memory-driven computing (accepted). In: Proceedings of the 6th International Workshop on High Performance Computing on Bioinformatics (HPCB 2019) in conjunction with the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2019), San Diego, USA (2019)","DOI":"10.1109\/BIBM47256.2019.8983296"},{"issue":"12","key":"17_CR7","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1109\/JSTQE.2012.2236080","volume":"48","author":"KM Bresniker","year":"2015","unstructured":"Bresniker, K.M., Singhal, S., Williams, R.S.: Adapting to thrive in a new economy of memory abundance. Computer 48(12), 44\u201353 (2015). https:\/\/doi.org\/10.1109\/JSTQE.2012.2236080","journal-title":"Computer"},{"key":"17_CR8","unstructured":"Chen, F., et al.: Billion node graph inference: iterative processing on the machine. Tech. rep. (2016). https:\/\/www.labs.hpe.com\/publications\/HPE-2016-101"},{"issue":"5","key":"17_CR9","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1109\/TCT.1971.1083337","volume":"18","author":"L Chua","year":"1971","unstructured":"Chua, L.: Memristor-the missing circuit element. IEEE Trans. Circuit Theory 18(5), 507\u2013519 (1971). https:\/\/doi.org\/10.1109\/TCT.1971.1083337","journal-title":"IEEE Trans. Circuit Theory"},{"key":"17_CR10","doi-asserted-by":"publisher","DOI":"10.1093\/nar\/gky1124","author":"CE Cook","year":"2019","unstructured":"Cook, C.E., et al.: The European Bioinformatics Institute in 2018: tools, infrastructure and training. Nucl. Acids Res. (2019). https:\/\/doi.org\/10.1093\/nar\/gky1124","journal-title":"Nucl. Acids Res."},{"key":"17_CR11","doi-asserted-by":"publisher","first-page":"476","DOI":"10.3233\/978-1-61499-432-9-476","volume":"205","author":"D Firnkorn","year":"2014","unstructured":"Firnkorn, D., Knaup-Gregori, P., Lorenzo Bermejo, J., Ganzinger, M.: Alignment of high-throughput sequencing data inside in-memory databases. Stud. Health Technol. Inform. 205, 476\u2013480 (2014). https:\/\/doi.org\/10.3233\/978-1-61499-432-9-476","journal-title":"Stud. Health Technol. Inform."},{"key":"17_CR12","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1005331","author":"F Fr\u00f6hlich","year":"2017","unstructured":"Fr\u00f6hlich, F., Kaltenbacher, B., Theis, F.J., Hasenauer, J.: Scalable parameter estimation for genome-scale biochemical reaction networks. PLoS Comput. Biol. (2017). https:\/\/doi.org\/10.1371\/journal.pcbi.1005331","journal-title":"PLoS Comput. Biol."},{"key":"17_CR13","unstructured":"Gen-Z Consortium: Gen-Z core specification 1.0 (2018). https:\/\/genzconsortium.org\/specification\/core-specification-1-0\/"},{"key":"17_CR14","unstructured":"Ghemawat, S., Menage, P.: Tcmalloc: thread-caching malloc (2007). http:\/\/goog-perftools.sourceforge.net\/doc\/tcmalloc.html"},{"key":"17_CR15","doi-asserted-by":"publisher","unstructured":"Hajj, I.E., et al.: SpaceJMP : programming with multiple virtual address spaces. In: ASPLOS, pp. 353\u2013368, No. Section 3 (2016). https:\/\/doi.org\/10.1145\/2872362.2872366","DOI":"10.1145\/2872362.2872366"},{"issue":"2","key":"17_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1371\/journal.pone.0209523","volume":"14","author":"C Herzeel","year":"2019","unstructured":"Herzeel, C., Costanza, P., Decap, D., Fostier, J., Verachtert, W.: elPrep 4: a multithreaded framework for sequence analysis. PLoS ONE 14(2), 1\u201316 (2019). https:\/\/doi.org\/10.1371\/journal.pone.0209523","journal-title":"PLoS ONE"},{"key":"17_CR17","unstructured":"Programming Languages \u2013 Technical Specification for C++ Extensions for Parallelism. ISO\/IEC TS 19570:2018. Standard (November 2018)"},{"key":"17_CR18","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1109\/MM.2018.2890253","volume":"39","author":"R Kaplan","year":"2018","unstructured":"Kaplan, R., Yavits, L., Ginosar, R.: RASSA: resistive pre-alignment accelerator for approximate DNA long read mapping. IEEE Micro 39, 44\u201354 (2018). https:\/\/doi.org\/10.1109\/MM.2018.2890253","journal-title":"IEEE Micro"},{"key":"17_CR19","doi-asserted-by":"crossref","unstructured":"Keeton, K.: The machine : an architecture for memory-centric computing. In: Workshop on Runtime and Operating Systems for Supercomputers (ROSS), p. 2768406 (June 2015)","DOI":"10.1145\/2768405.2768406"},{"key":"17_CR20","doi-asserted-by":"publisher","DOI":"10.1038\/s41587-019-0201-4","author":"D Kim","year":"2019","unstructured":"Kim, D., Paggi, J.M., Park, C., Bennett, C., Salzberg, S.L.: Graph-based genome alignment and genotyping with HISAT2 and HISAT-genotype. Nat. Biotechnol. (2019). https:\/\/doi.org\/10.1038\/s41587-019-0201-4","journal-title":"Nat. Biotechnol."},{"key":"17_CR21","doi-asserted-by":"publisher","unstructured":"Kim, J.S., et al.: GRIM-filter: fast seed location filtering in DNA read mapping using processing-in-memory technologies. BMC Genomics 19(Suppl 2) (2018). https:\/\/doi.org\/10.1186\/s12864-018-4460-0","DOI":"10.1186\/s12864-018-4460-0"},{"key":"17_CR22","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/bts480","author":"J K\u00f6ster","year":"2012","unstructured":"K\u00f6ster, J., Rahmann, S.: Snakemake-a scalable bioinformatics workflow engine. Bioinformatics (2012). https:\/\/doi.org\/10.1093\/bioinformatics\/bts480","journal-title":"Bioinformatics"},{"key":"17_CR23","doi-asserted-by":"publisher","unstructured":"Lavenier, D., Roy, J.F., Furodet, D.: DNA mapping using processor-in-memory architecture. In: Proceedings - 2016 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2016, pp. 1429\u20131435 (2017). https:\/\/doi.org\/10.1109\/BIBM.2016.7822732","DOI":"10.1109\/BIBM.2016.7822732"},{"issue":"16","key":"17_CR24","doi-asserted-by":"publisher","first-page":"2078","DOI":"10.1093\/bioinformatics\/btp352","volume":"25","author":"H Li","year":"2009","unstructured":"Li, H., et al.: The sequence alignment\/map format and SAMtools. Bioinformatics 25(16), 2078\u20132079 (2009). https:\/\/doi.org\/10.1093\/bioinformatics\/btp352","journal-title":"Bioinformatics"},{"issue":"1","key":"17_CR25","doi-asserted-by":"publisher","first-page":"317","DOI":"10.1145\/3200691.3178511","volume":"53","author":"X Li","year":"2018","unstructured":"Li, X., Tan, G., Wang, B., Sun, N.: High-performance genomic analysis framework with in-memory computing. ACM SIGPLAN Not. 53(1), 317\u2013328 (2018). https:\/\/doi.org\/10.1145\/3200691.3178511","journal-title":"ACM SIGPLAN Not."},{"key":"17_CR26","doi-asserted-by":"publisher","unstructured":"Luo, R., et al.: SOAP3-dp: fast, accurate and sensitive GPU-based short read aligner. PLoS ONE 8(5) (2013). https:\/\/doi.org\/10.1371\/journal.pone.0065632","DOI":"10.1371\/journal.pone.0065632"},{"key":"17_CR27","unstructured":"Regev, A., et al.: The Human Cell Atlas White Paper (October 2018). http:\/\/arxiv.org\/abs\/1810.05192"},{"issue":"5","key":"17_CR28","doi-asserted-by":"publisher","first-page":"547","DOI":"10.1038\/s41587-019-0071-9","volume":"37","author":"W Saelens","year":"2019","unstructured":"Saelens, W., Cannoodt, R., Todorov, H., Saeys, Y.: A comparison of single-cell trajectory inference methods. Nat. Biotechnol. 37(5), 547\u2013554 (2019). https:\/\/doi.org\/10.1038\/s41587-019-0071-9","journal-title":"Nat. Biotechnol."},{"key":"17_CR29","doi-asserted-by":"publisher","unstructured":"Schapranow, M.P., Plattner, H.: HIG - an in-memory database platform enabling real-time analyses of genome data. In: Proceedings - 2013 IEEE International Conference on Big Data, Big Data 2013, pp. 691\u2013696 (2013). https:\/\/doi.org\/10.1109\/BigData.2013.6691638","DOI":"10.1109\/BigData.2013.6691638"},{"issue":"November","key":"17_CR30","doi-asserted-by":"publisher","first-page":"2032","DOI":"10.5281\/zenodo.13200.Contact","volume":"31","author":"A Tarasov","year":"2017","unstructured":"Tarasov, A., Vilella, A.J., Cuppen, E., Nijman, I.J., Prins, P.: Genome analysis Sambamba : fast processing of NGS alignment formats. Bioinformatics 31(November), 2032\u20132034 (2017). https:\/\/doi.org\/10.5281\/zenodo.13200.Contact","journal-title":"Bioinformatics"},{"issue":"2","key":"17_CR31","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1109\/MCSE.2017.29","volume":"19","author":"TN Theis","year":"2017","unstructured":"Theis, T.N., Philip Wong, H.S.: The end of Moore\u2019s Law: a new beginning for information technology. Comput. Sci. Eng. 19(2), 41\u201350 (2017). https:\/\/doi.org\/10.1109\/MCSE.2017.29","journal-title":"Comput. Sci. Eng."}],"updated-by":[{"DOI":"10.1007\/978-3-030-50743-5_28","type":"correction","label":"Correction","source":"publisher","updated":{"date-parts":[[2020,6,15]],"date-time":"2020-06-15T00:00:00Z","timestamp":1592179200000}}],"container-title":["Lecture Notes in Computer Science","High Performance Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-030-50743-5_17","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,12,18]],"date-time":"2023-12-18T15:05:18Z","timestamp":1702911918000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-030-50743-5_17"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2020]]},"ISBN":["9783030507428","9783030507435"],"references-count":31,"URL":"https:\/\/doi.org\/10.1007\/978-3-030-50743-5_17","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"value":"0302-9743","type":"print"},{"value":"1611-3349","type":"electronic"}],"subject":[],"published":{"date-parts":[[2020]]},"assertion":[{"value":"15 June 2020","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"15 June 2020","order":2,"name":"change_date","label":"Change Date","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"Correction","order":3,"name":"change_type","label":"Change Type","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"The original version of chapters 17 and 24 were previously published non-open access. They have now been made open access under a CC BY 4.0 license and the copyright holder has been changed to \u2018The Author(s).\u2019 The book has also been updated with the change.","order":4,"name":"change_details","label":"Change Details","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"The chapters 19 and 25 were inadvertently published open access. This has been corrected and the chapters are now non-open access.","order":5,"name":"change_details","label":"Change Details","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"ISC High Performance","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"International Conference on High Performance Computing","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Frankfurt am Main","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Germany","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2020","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"22 June 2020","order":7,"name":"conference_start_date","label":"Conference Start Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"25 June 2020","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"35","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"supercomputing2020","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"https:\/\/www.isc-hpc.com\/","order":11,"name":"conference_url","label":"Conference URL","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Double-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"Linklings","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"87","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"27","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"31% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3.73","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"4.33","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"No","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"The conference was held virtually due to the COVID-19 pandemic.","order":10,"name":"additional_info_on_review_process","label":"Additional Info on Review Process","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}