{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,3,2]],"date-time":"2025-03-02T05:25:56Z","timestamp":1740893156734,"version":"3.38.0"},"reference-count":22,"publisher":"SAGE Publications","issue":"5","license":[{"start":{"date-parts":[[2017,4,18]],"date-time":"2017-04-18T00:00:00Z","timestamp":1492473600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"funder":[{"DOI":"10.13039\/100006168","name":"National Nuclear Security Administration","doi-asserted-by":"publisher","award":["DE-NA0002374"],"award-info":[{"award-number":["DE-NA0002374"]}],"id":[{"id":"10.13039\/100006168","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["The International Journal of High Performance Computing Applications"],"published-print":{"date-parts":[[2018,9]]},"abstract":"<jats:p> Modern machines consist of multiple compute devices and complex memory hierarchies. For many applications, it is imperative that any data movement between and within the various compute devices be done as efficiently as possible in order to obtain maximum performance. However, hand-optimizing code for one architecture will likely sacrifice both performance portability and software maintainability. In addition, some optimization decisions are best made at runtime. This suggests that the problem ought to be tackled on two fronts. First, provide the programmer with a declarative language to describe data layouts and data motion. This would allow the runtime system to be tuned for each architecture by a specialist and free the programmer to concentrate on the application itself. Second, exploit the execution time information to optimize the data movement code further. MPI derived datatypes accomplish the former task and Just In Time (JIT) compilation can be used for the latter. In this paper, we present DAME\u2014a language and interpreter designed to be used as the backend for MPI derived datatypes. We also present DAME-L and DAME-X, two JIT-enabled implementations of DAME, all of which have been integrated into MPICH. We evaluate their performance on DDTBench and two mini-applications written with MPI derived datatypes and obtain communication speedups of up to 20\u00d7 and mini-application speedups of up to 3\u00d7. <\/jats:p>","DOI":"10.1177\/1094342017695444","type":"journal-article","created":{"date-parts":[[2017,4,18]],"date-time":"2017-04-18T11:19:35Z","timestamp":1492514375000},"page":"760-774","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":1,"title":["DAME: Runtime-compilation for data movement"],"prefix":"10.1177","volume":"32","author":[{"given":"Tarun","family":"Prabhu","sequence":"first","affiliation":[{"name":"Department of Computer Science, University of Illinois, Urbana\u2013Champaign, USA"}]},{"given":"William","family":"Gropp","sequence":"additional","affiliation":[{"name":"Department of Computer Science, University of Illinois, Urbana\u2013Champaign, USA"}]}],"member":"179","published-online":{"date-parts":[[2017,4,18]]},"reference":[{"key":"bibr1-1094342017695444","unstructured":"MPI Standards Committee (MPI Forum) (2015) A message-passing interface standard. Version 3.0. Technical report. Available at: http:\/\/www.mpi-forum.org\/docs\/ (accessed 23 October 2016)."},{"key":"bibr2-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1016\/j.ascom.2014.12.001"},{"key":"bibr3-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1007\/11846802_36"},{"key":"bibr4-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1109\/40.591653"},{"key":"bibr5-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/1543135.1542528"},{"key":"bibr6-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-15646-5_14"},{"key":"bibr7-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/2503210.2503246"},{"key":"bibr8-1094342017695444","unstructured":"Kjolstad F, Hoefler T, Snir M (2011) A transformation to convert packing code to compact datatypes for efficient zero-copy data transfer. Technical Report, University of Illinois, USA."},{"key":"bibr9-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/2833157.2833162"},{"key":"bibr10-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1109\/CGO.2004.1281665"},{"key":"bibr11-1094342017695444","first-page":"538","volume-title":"International conference on parallel processing workshops, 2004. ICPP 2004 workshops","author":"Lu Q","year":"2004"},{"key":"bibr12-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/1065010.1065034"},{"key":"bibr13-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.52"},{"key":"bibr14-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/7902.7904"},{"key":"bibr15-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1109\/SC.2014.50"},{"key":"bibr16-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/2802658.2802659"},{"key":"bibr17-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/1014007.1014010"},{"key":"bibr18-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-540-39924-7_55"},{"key":"bibr19-1094342017695444","first-page":"1","volume-title":"Proceedings of the 2013 IEEE\/ACM international symposium on code generation and optimization","author":"Santos HN","year":"2013"},{"key":"bibr20-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/2488551.2488552"},{"key":"bibr21-1094342017695444","doi-asserted-by":"publisher","DOI":"10.1145\/2642769.2642771"},{"key":"bibr22-1094342017695444","first-page":"1","volume-title":"Parallel and distributed processing symposium, 2004. Proceedings of the 18th international","volume":"1","author":"Wu J","year":"2004"}],"container-title":["The International Journal of High Performance Computing Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017695444","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/1094342017695444","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017695444","content-type":"application\/pdf","content-version":"vor","intended-application":"syndication"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/1094342017695444","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T06:29:06Z","timestamp":1740810546000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/1094342017695444"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,4,18]]},"references-count":22,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2018,9]]}},"alternative-id":["10.1177\/1094342017695444"],"URL":"https:\/\/doi.org\/10.1177\/1094342017695444","relation":{},"ISSN":["1094-3420","1741-2846"],"issn-type":[{"type":"print","value":"1094-3420"},{"type":"electronic","value":"1741-2846"}],"subject":[],"published":{"date-parts":[[2017,4,18]]}}}