{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T09:40:27Z","timestamp":1763458827110,"version":"3.45.0"},"reference-count":24,"publisher":"Association for Computing Machinery (ACM)","issue":"10","license":[{"start":{"date-parts":[[2017,9,22]],"date-time":"2017-09-22T00:00:00Z","timestamp":1506038400000},"content-version":"vor","delay-in-days":365,"URL":"http:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"funder":[{"DOI":"10.13039\/100000001","name":"National Science Foundation","doi-asserted-by":"publisher","award":["CNS 1111407"],"award-info":[{"award-number":["CNS 1111407"]}],"id":[{"id":"10.13039\/100000001","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["Commun. ACM"],"published-print":{"date-parts":[[2016,9,22]]},"abstract":"<jats:p>\n                    This paper proposes an efficient parallel algorithm for an important class of dynamic programming problems that includes Viterbi, Needleman--Wunsch, Smith--Waterman, and Longest Common Subsequence. In dynamic programming, the subproblems that do not depend on each other, and thus can be computed in parallel, form stages, or\n                    <jats:italic toggle=\"yes\">wavefronts.<\/jats:italic>\n                    The algorithm presented in this paper provides additional parallelism allowing multiple stages to be computed in parallel despite dependences among them. The correctness and the performance of the algorithm relies on rank convergence properties of matrix multiplication in the tropical semiring, formed with plus as the multiplicative operation and max as the additive operation.\n                  <\/jats:p>\n                  <jats:p>This paper demonstrates the efficiency of the parallel algorithm by showing significant speedups on a variety of important dynamic programming problems. In particular, the parallel Viterbi decoder is up to 24\u00d7 faster (with 64 processors) than a highly optimized commercial baseline.&lt;!-- END_PAGE_1 --&gt;<\/jats:p>","DOI":"10.1145\/2983553","type":"journal-article","created":{"date-parts":[[2016,9,22]],"date-time":"2016-09-22T13:55:52Z","timestamp":1474552552000},"page":"85-92","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":12,"title":["Efficient parallelization using rank convergence in dynamic programming algorithms"],"prefix":"10.1145","volume":"59","author":[{"given":"Saeed","family":"Maleki","sequence":"first","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]},{"given":"Madanlal","family":"Musuvathi","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]},{"given":"Todd","family":"Mytkowicz","sequence":"additional","affiliation":[{"name":"Microsoft Research, Redmond, WA"}]}],"member":"320","published-online":{"date-parts":[[2016,9,22]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1137\/0219066"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.5555\/1893145"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.5555\/1834610.1834613"},{"key":"e_1_2_1_4_1","first-page":"213","article-title":"On the rank of a tropical matrix","volume":"52","author":"Develin M.","year":"2005","unstructured":"Develin, M., Santos, F., Sturmfels, B. On the rank of a tropical matrix. Combin. Comput Geom. 52 (2005), 213--242.","journal-title":"Combin. Comput Geom."},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1093\/bioinformatics\/btl582"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.1007\/BF00927838"},{"key":"e_1_2_1_7_1","doi-asserted-by":"publisher","DOI":"10.1109\/35.79382"},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.1006\/jpdc.1994.1053"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1145\/7902.7903"},{"key":"e_1_2_1_10_1","doi-asserted-by":"publisher","DOI":"10.1145\/360825.360861"},{"key":"e_1_2_1_11_1","volume-title":"Proceedings of the 15th Australasian Workshop on Combinatorial Algorithms","author":"Hyyro H.","year":"2004","unstructured":"Hyyro, H. Bit-parallel LCS-length computation revisited. In Proceedings of the 15th Australasian Workshop on Combinatorial Algorithms (2004), 16--27."},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1145\/322217.322232"},{"key":"e_1_2_1_13_1","doi-asserted-by":"publisher","DOI":"10.1145\/2555243.2555264"},{"key":"e_1_2_1_14_1","volume-title":"Pacific Symposium on Biocomputing","author":"Martins W.S.","year":"2001","unstructured":"Martins, W.S., Cuvillo, J.B.D., Useche, F.J., Theobald, K.B., Gao, G. A multithreaded parallel implementation of a dynamic programming algorithm for sequence comparison. In Pacific Symposium on Biocomputing (2001), 311--322."},{"key":"e_1_2_1_15_1","doi-asserted-by":"publisher","DOI":"10.5555\/905532"},{"key":"e_1_2_1_16_1","unstructured":"National Center for Biotechnology Information http:\/\/www.ncbi.nlm.nih.gov\/.2013."},{"key":"e_1_2_1_17_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-2836(70)90057-4"},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1109\/JPROC.2004.840306"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1016\/0022-2836(81)90087-5"},{"key":"e_1_2_1_20_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jpdc.2010.01.004"},{"key":"e_1_2_1_21_1","unstructured":"Texas Advanced Computing Center http:\/\/www.tacc.utexas.edu\/resources\/hpc. Stampede: Dell PowerEdge C8220 Cluster with Intel Xeon Phi coprocessors."},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1137\/0212043"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1967.1054010"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.5555\/578474"}],"container-title":["Communications of the ACM"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2983553","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2983553","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2983553","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,11,18]],"date-time":"2025-11-18T09:36:16Z","timestamp":1763458576000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2983553"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,9,22]]},"references-count":24,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2016,9,22]]}},"alternative-id":["10.1145\/2983553"],"URL":"https:\/\/doi.org\/10.1145\/2983553","relation":{},"ISSN":["0001-0782","1557-7317"],"issn-type":[{"type":"print","value":"0001-0782"},{"type":"electronic","value":"1557-7317"}],"subject":[],"published":{"date-parts":[[2016,9,22]]},"assertion":[{"value":"2016-09-22","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}