{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,23]],"date-time":"2025-12-23T15:36:32Z","timestamp":1766504192947,"version":"3.41.0"},"reference-count":25,"publisher":"Association for Computing Machinery (ACM)","issue":"05","license":[{"start":{"date-parts":[[2025,4,29]],"date-time":"2025-04-29T00:00:00Z","timestamp":1745884800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"NSF","doi-asserted-by":"publisher","award":["1908601, 1652294"],"award-info":[{"award-number":["1908601, 1652294"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Applications Driving Architectures (ADA) Research Center"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Commun. ACM"],"published-print":{"date-parts":[[2025,5]]},"abstract":"<jats:p>Genomics is playing an important role in transforming healthcare. Genetic data, however, is being produced at a rate that far outpaces Moore's Law. Many efforts have been made to accelerate genomics kernels on modern commodity hardware, such as CPUs and GPUs, as well as custom accelerators (ASICs) for specific genomics kernels. While ASICs provide higher performance and energy efficiency than general-purpose hardware, they incur a high hardware-design cost. Moreover, to extract the best performance, ASICs tend to have significantly different architectures for different kernels. The divergence of ASIC designs makes it difficult to run commonly used modern sequencing analysis pipelines due to software integration and programming challenges.<\/jats:p>\n          <jats:p>\n            With the observation that many genomics kernels are dominated by dynamic programming (DP) algorithms, this paper presents GenDP, a framework of dynamic programming acceleration including\n            <jats:italic>DPAx<\/jats:italic>\n            , a DP accelerator, and\n            <jats:italic>DPMap<\/jats:italic>\n            , a graph-partitioning algorithm that maps DP objective functions to the accelerator. DPAx supports DP kernels with various dependency patterns, such as 1D and 2D DP tables and long-range dependencies in the graph structure. DPAx also supports different DP objective functions and precisions required for genomics applications. GenDP is evaluated on genomics kernels in both short-read and long-read analysis pipelines, achieving 157.8\n            <jats:inline-formula>\n              <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\">\n                <mml:mo>\u00d7<\/mml:mo>\n              <\/mml:math>\n            <\/jats:inline-formula>\n            <jats:inline-formula>\n              <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\">\n                <mml:mrow>\n                  <mml:mi>t<\/mml:mi>\n                  <mml:mi>h<\/mml:mi>\n                  <mml:mi>r<\/mml:mi>\n                  <mml:mi>o<\/mml:mi>\n                  <mml:mi>u<\/mml:mi>\n                  <mml:mi>g<\/mml:mi>\n                  <mml:mi>h<\/mml:mi>\n                  <mml:mi>p<\/mml:mi>\n                  <mml:mi>u<\/mml:mi>\n                  <mml:mi>t<\/mml:mi>\n                  <mml:mo>\/<\/mml:mo>\n                  <mml:mi>m<\/mml:mi>\n                  <mml:msup>\n                    <mml:mi>m<\/mml:mi>\n                    <mml:mn>2<\/mml:mn>\n                  <\/mml:msup>\n                <\/mml:mrow>\n              <\/mml:math>\n            <\/jats:inline-formula>\n            over GPU baselines and 132.0\n            <jats:inline-formula>\n              <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\">\n                <mml:mo>\u00d7<\/mml:mo>\n              <\/mml:math>\n            <\/jats:inline-formula>\n            <jats:inline-formula>\n              <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\" display=\"inline\">\n                <mml:mrow>\n                  <mml:mi>t<\/mml:mi>\n                  <mml:mi>h<\/mml:mi>\n                  <mml:mi>r<\/mml:mi>\n                  <mml:mi>o<\/mml:mi>\n                  <mml:mi>u<\/mml:mi>\n                  <mml:mi>g<\/mml:mi>\n                  <mml:mi>h<\/mml:mi>\n                  <mml:mi>p<\/mml:mi>\n                  <mml:mi>u<\/mml:mi>\n                  <mml:mi>t<\/mml:mi>\n                  <mml:mo>\/<\/mml:mo>\n                  <mml:mi>m<\/mml:mi>\n                  <mml:msup>\n                    <mml:mi>m<\/mml:mi>\n                    <mml:mn>2<\/mml:mn>\n                  <\/mml:msup>\n                <\/mml:mrow>\n              <\/mml:math>\n            <\/jats:inline-formula>\n            over CPU baselines.\n          <\/jats:p>","DOI":"10.1145\/3712168","type":"journal-article","created":{"date-parts":[[2025,4,28]],"date-time":"2025-04-28T20:39:52Z","timestamp":1745872792000},"page":"81-90","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["GenDP: A Framework of Dynamic Programming Acceleration for Genome\u00a0Sequencing Analysis"],"prefix":"10.1145","volume":"68","author":[{"given":"Yufeng","family":"Gu","sequence":"first","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA, MI, USA"}]},{"given":"Arun","family":"Subramaniyan","sequence":"additional","affiliation":[{"name":"Illumina Inc., San Diego, CA, USA"}]},{"given":"Tim","family":"Dunn","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA"}]},{"given":"Alireza","family":"Khadem","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA"}]},{"given":"Kuan-Yu","family":"Chen","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA"}]},{"given":"Somnath","family":"Paul","sequence":"additional","affiliation":[{"name":"Intel Corporation, Hilsboro, OR, USA"}]},{"given":"Mohammad","family":"Vasimuddin","sequence":"additional","affiliation":[{"name":"Intel Corporation, Bangalore, KA, India"}]},{"given":"Sanchit","family":"Misra","sequence":"additional","affiliation":[{"name":"Intel Corporation, Bangalore, KA, India"}]},{"given":"David","family":"Blaauw","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA"}]},{"given":"Satish","family":"Narayanasamy","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA"}]},{"given":"Reetuparna","family":"Das","sequence":"additional","affiliation":[{"name":"University of Michigan, Ann Arbor, MI, USA"}]}],"member":"320","published-online":{"date-parts":[[2025,4,29]]},"reference":[{"issue":"1","key":"e_1_3_1_2_2","first-page":"1","article-title":"Gasal2: A GPU accelerated sequence alignment library for high-throughput NGS data","volume":"20","author":"Ahmed N.","year":"2019","unstructured":"Ahmed, N. et al. Gasal2: A GPU accelerated sequence alignment library for high-throughput NGS data. BMC bioinformatics 20, 1 (2019), 1\u201320.","journal-title":"BMC bioinformatics"},{"key":"e_1_3_1_3_2","doi-asserted-by":"crossref","unstructured":"Cali D.S. et al. Genasm: A high-performance low-power approximate string matching acceleration framework for genome sequence analysis. In 2020 53rd Annual IEEE\/ACM Intern. Symp. on Microarchitecture (MICRO)\u00a0(2020) 951\u2013966.","DOI":"10.1109\/MICRO50266.2020.00081"},{"key":"e_1_3_1_4_2","unstructured":"DNA sequencing costs: Data. National Human Genome Research Institute; https:\/\/tinyurl.com\/yxr89ude."},{"key":"e_1_3_1_5_2","unstructured":"DRAMPower: Open-source DRAM power and energy estimation tool;\u00a0https:\/\/tinyurl.com\/2a9jqb2f."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3466752.3480117"},{"key":"e_1_3_1_7_2","doi-asserted-by":"crossref","unstructured":"Fujiki D. et al. Genax: A genome sequencing accelerator. In 2018 ACM\/IEEE 45th Annual Intern. Symp. on Computer Architecture (ISCA)\u00a0(2018) 69\u201382.","DOI":"10.1109\/ISCA.2018.00017"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1109\/MM.2012.51"},{"key":"e_1_3_1_9_2","doi-asserted-by":"crossref","unstructured":"Guo L. et al. Hardware acceleration of long read pairwise overlapping in genome sequencing: A race between FPGA and GPU. In 2019 IEEE 27th Annual Intern. Symp. on Field-Programmable Custom Computing Machines (FCCM)\u00a0(2019) 127\u2013135.","DOI":"10.1109\/FCCM.2019.00027"},{"key":"e_1_3_1_10_2","unstructured":"Harris D. Nvidia hopper gpu architecture accelerates dynamic programming up to 40x using new dpx instructions. Nvidia (2022);\u00a0https:\/\/tinyurl.com\/227uqkta"},{"key":"e_1_3_1_11_2","doi-asserted-by":"crossref","unstructured":"Kalikar S. Jain C. Md V. and Misra S. Accelerating long-read analysis on modern cpus. bioRxiv\u00a0(2021).","DOI":"10.1101\/2021.07.21.453294"},{"key":"e_1_3_1_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/LCA.2015.2414456"},{"key":"e_1_3_1_13_2","doi-asserted-by":"crossref","unstructured":"Li C. et al. INC-Seq: Accurate single molecule reads using nanopore sequencing. GigaScience 5 1 (Aug. 2016) s13742-016-0140-7.","DOI":"10.1186\/s13742-016-0140-7"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","unstructured":"Nowatzki T. Gangadhar V. Ardalani N. and Sankaralingam K. Stream-dataflow acceleration. In 2017 ACM\/IEEE 44th Annual Intern. Symp. on Computer Architecture\u00a0(2017) 416\u2013429.","DOI":"10.1145\/3079856.3080255"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.1145\/2485922.2485935"},{"key":"e_1_3_1_16_2","doi-asserted-by":"crossref","unstructured":"Poplin R. et al. Scaling accurate genetic variant discovery to tens of thousands of samples. BioRxiv (2017) 201178.","DOI":"10.1101\/201178"},{"key":"e_1_3_1_17_2","doi-asserted-by":"crossref","unstructured":"Prabhakar R. et al. Plasticine: A reconfigurable architecture for parallel patterns. In 2017 ACM\/IEEE 44th Annual Intern. Symp. on Computer Architecture (ISCA) (2017) 389\u2013402.","DOI":"10.1145\/3079856.3080256"},{"key":"e_1_3_1_18_2","doi-asserted-by":"publisher","DOI":"10.26502\/jbb.2642-91280067"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1145\/1067649.801719"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1016\/0022-2836(81)90087-5"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pbio.1002195"},{"key":"e_1_3_1_22_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.vlsi.2017.02.002"},{"key":"e_1_3_1_23_2","doi-asserted-by":"crossref","unstructured":"Subramaniyan A. et al. Genomicsbench: A benchmark suite for genomics. In 2021 IEEE Intern. Symp. on Performance Analysis of Systems and Software\u00a0(2021) 1\u201312.","DOI":"10.1109\/ISPASS51385.2021.00012"},{"key":"e_1_3_1_24_2","doi-asserted-by":"crossref","unstructured":"Turakhia Y. Bejerano G. and Dally W.J. Darwin: A genomics co-processor provides up to 15 000x acceleration on long read assembly (2018) 199\u2013213.","DOI":"10.1145\/3173162.3173193"},{"key":"e_1_3_1_25_2","doi-asserted-by":"crossref","unstructured":"Vasimuddin M. Misra S. Li H. and Aluru S. Efficient architecture-aware acceleration of BWA-MEM for multicore systems. In 2019 IEEE Intern. Parallel and Distributed Processing Symp. 314\u2013324.","DOI":"10.1109\/IPDPS.2019.00041"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1109\/LSSC.2020.3045148"}],"container-title":["Communications of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712168","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3712168","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3712168","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T01:10:28Z","timestamp":1750295428000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3712168"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,4,29]]},"references-count":25,"journal-issue":{"issue":"05","published-print":{"date-parts":[[2025,5]]}},"alternative-id":["10.1145\/3712168"],"URL":"https:\/\/doi.org\/10.1145\/3712168","relation":{},"ISSN":["0001-0782","1557-7317"],"issn-type":[{"type":"print","value":"0001-0782"},{"type":"electronic","value":"1557-7317"}],"subject":[],"published":{"date-parts":[[2025,4,29]]},"assertion":[{"value":"2025-04-29","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}