{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,6,19]],"date-time":"2025-06-19T04:11:14Z","timestamp":1750306274520,"version":"3.41.0"},"reference-count":33,"publisher":"Association for Computing Machinery (ACM)","issue":"2","license":[{"start":{"date-parts":[[2016,6,14]],"date-time":"2016-06-14T00:00:00Z","timestamp":1465862400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Archit. Code Optim."],"published-print":{"date-parts":[[2016,6,27]]},"abstract":"<jats:p>Parallel computers are becoming deeply hierarchical. Locality-aware programming models allow programmers to control locality at one level through establishing affinity between data and executing activities. This, however, does not enable locality exploitation at other levels. Therefore, we must conceive an efficient abstraction of hierarchical locality and develop techniques to exploit it. Techniques applied directly by programmers, beyond the first level, burden the programmer and hinder productivity. In this article, we propose the Parallel Hierarchical Locality Abstraction Model for Execution (PHLAME). PHLAME is an execution model to abstract and exploit machine hierarchical properties through locality-aware programming and a runtime that takes into account machine characteristics, as well as a data sharing and communication profile of the underlying application. This article presents and experiments with concepts and techniques that can drive such a runtime system in support of PHLAME. 
Our experiments show that our techniques scale up and achieve performance gains of up to 88%.<\/jats:p>","DOI":"10.1145\/2897783","type":"journal-article","created":{"date-parts":[[2016,6,14]],"date-time":"2016-06-14T12:29:28Z","timestamp":1465907368000},"page":"1-25","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":8,"title":["Exploiting Hierarchical Locality in Deep Parallel Architectures"],"prefix":"10.1145","volume":"13","author":[{"given":"Ahmad","family":"Anbar","sequence":"first","affiliation":[{"name":"The George Washington University, Washington DC"}]},{"given":"Olivier","family":"Serres","sequence":"additional","affiliation":[{"name":"The George Washington University, Washington DC"}]},{"given":"Engin","family":"Kayraklioglu","sequence":"additional","affiliation":[{"name":"The George Washington University, Washington DC"}]},{"given":"Abdel-Hameed A.","family":"Badawy","sequence":"additional","affiliation":[{"name":"The George Washington University, Washington DC, and Arkansas Tech University, Russellville, AR"}]},{"given":"Tarek","family":"El-Ghazawi","sequence":"additional","affiliation":[{"name":"The George Washington University, Washington DC"}]}],"member":"320","published-online":{"date-parts":[[2016,6,14]]},"reference":[{"doi-asserted-by":"publisher","key":"e_1_2_1_1_1","DOI":"10.1109\/PADSW.2014.7097832"},{"doi-asserted-by":"publisher","key":"e_1_2_1_2_1","DOI":"10.1109\/PGAS.2015.16"},{"doi-asserted-by":"publisher","key":"e_1_2_1_3_1","DOI":"10.1145\/125826.125925"},{"volume-title":"Technical Report.","author":"Bonachea Dan","unstructured":"Dan Bonachea. 2002. GASNet Specification, V1.1. Technical Report. 
Berkeley, CA, USA.","key":"e_1_2_1_4_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_5_1","DOI":"10.1109\/PDP.2010.67"},{"doi-asserted-by":"publisher","key":"e_1_2_1_6_1","DOI":"10.1109\/71.642949"},{"doi-asserted-by":"publisher","key":"e_1_2_1_7_1","DOI":"10.1177\/1094342007078442"},{"doi-asserted-by":"publisher","key":"e_1_2_1_8_1","DOI":"10.1109\/MM.2010.31"},{"doi-asserted-by":"publisher","key":"e_1_2_1_9_1","DOI":"10.1109\/IPDPS.2012.56"},{"doi-asserted-by":"publisher","key":"e_1_2_1_10_1","DOI":"10.1109\/99.660313"},{"doi-asserted-by":"publisher","key":"e_1_2_1_11_1","DOI":"10.1109\/5992.988653"},{"doi-asserted-by":"publisher","key":"e_1_2_1_12_1","DOI":"10.5555\/762761.762821"},{"key":"e_1_2_1_13_1","volume-title":"UPC: Distributed Shared-Memory Programming","author":"El-Ghazawi Tarek","year":"2003","unstructured":"Tarek El-Ghazawi, William Carlson, Thomas Sterling, and Katherine Yelick. 2003. UPC: Distributed Shared-Memory Programming. Wiley-Interscience, New York, NY."},{"doi-asserted-by":"publisher","key":"e_1_2_1_14_1","DOI":"10.5555\/1214527"},{"volume-title":"Proceedings, 11th European PVM\/MPI Users\u2019 Group Meeting","author":"Gabriel Edgar","unstructured":"Edgar Gabriel, Graham E. Fagg, George Bosilca, Thara Angskun, Jack J. Dongarra, Jeffrey M. Squyres, Vishal Sahay, Prabhanjan Kambadur, Brian Barrett, Andrew Lumsdaine, Ralph H. Castain, David J. 
Daniel, Richard L. Graham, and Timothy S. Woodall. 2004. Open MPI: Goals, concept, and design of a next generation MPI implementation. In Proceedings, 11th European PVM\/MPI Users\u2019 Group Meeting. Budapest, Hungary, 97--104.","key":"e_1_2_1_15_1"},{"unstructured":"Habanero-C. 2015. Homepage. Retrieved from https:\/\/wiki.rice.edu\/confluence\/display\/HABANERO\/ Habanero-C.","key":"e_1_2_1_16_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_17_1","DOI":"10.1109\/SC.2014.33"},{"doi-asserted-by":"publisher","key":"e_1_2_1_18_1","DOI":"10.1109\/TPDS.2013.104"},{"doi-asserted-by":"publisher","key":"e_1_2_1_19_1","DOI":"10.1137\/S1064827595287997"},{"key":"e_1_2_1_20_1","volume-title":"Top ten exascale research challenges. DOE ASCAC Subcommittee Report (February","author":"Lucas R.","year":"2014","unstructured":"R. Lucas, J. Ang, K. Bergman, S. Borkar, W. Carlson, L. Carrington, G. Chiu, R. Colwell, W. Dally, J. Dongarra, and others. 2014. Top ten exascale research challenges. DOE ASCAC Subcommittee Report (February 2014)."},{"doi-asserted-by":"publisher","key":"e_1_2_1_21_1","DOI":"10.1109\/2.982916"},{"volume-title":"Euro-Par 2013: Parallel Processing Workshops","author":"Majeti Deepak","unstructured":"Deepak Majeti, Rajkishore Barik, Jisheng Zhao, Max Grossman, and Vivek Sarkar. 2014. Compiler-driven data layout transformation for heterogeneous platforms. 
In Euro-Par 2013: Parallel Processing Workshops. Springer, Berlin, 188--197.","key":"e_1_2_1_22_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_23_1","DOI":"10.1109\/IPDPS.2011.197"},{"unstructured":"PGAS. 2015. Partitioned Global Address Space Languages. (2015). http:\/\/pgas.org Accessed: 7\/2015.","key":"e_1_2_1_24_1"},{"volume-title":"Encyclopedia of Parallel Computing","author":"Poole Stephen W.","unstructured":"Stephen W. Poole, Oscar Hernandez, Jeffery A. Kuehn, Galen M. Shipman, Anthony Curtis, and Karl Feind. 2011. OpenSHMEM - Toward a unified RMA model. In Encyclopedia of Parallel Computing, David Padua (Ed.). Springer, Berlin, 1379--1391. DOI:http:\/\/dx.doi.org\/10.1007\/978-0-387-09766-4_490","key":"e_1_2_1_25_1"},{"doi-asserted-by":"publisher","key":"e_1_2_1_26_1","DOI":"10.1145\/2464996.2465434"},{"doi-asserted-by":"publisher","key":"e_1_2_1_27_1","DOI":"10.1145\/2145816.2145823"},{"doi-asserted-by":"publisher","key":"e_1_2_1_28_1","DOI":"10.1177\/1094342006064482"},{"key":"e_1_2_1_29_1","volume-title":"Proceedings of the IEEE International Symposium on Parallel and Distributed Processing, 2008. IPDPS 2008. 1--8. DOI:http:\/\/dx.doi.org\/ 10","author":"Su Hung-Hsun","year":"2008","unstructured":"Hung-Hsun Su, M. Billingsley, and A. D. George. 2008. Parallel performance wizard: A performance analysis tool for partitioned global-address-space programming. In Proceedings of the IEEE International Symposium on Parallel and Distributed Processing, 2008. IPDPS 2008. 1--8. DOI:http:\/\/dx.doi.org\/10.1109\/IPDPS.2008.4536476"},{"doi-asserted-by":"publisher","key":"e_1_2_1_30_1","DOI":"10.1109\/CLUSTER.2011.43"},{"doi-asserted-by":"publisher","key":"e_1_2_1_31_1","DOI":"10.1109\/SC.2012.63"},{"key":"e_1_2_1_32_1","volume-title":"Samuel Williams, and Yili Zheng.","author":"Yelick Katherine","year":"2015","unstructured":"Katherine Yelick, Vivek Sarkar, John Mellor-Crummey, James Demmel, Krste Asanovi, Armando Fox, Mattan Erez, Dan Quinlan, Surendra Byna, Marc Day, Tony Drummond, Paul Hargrove, Steven Hofmeyr, Costin Iancu, Khaled Ibrahim, Frank Mueller, Leonid Oliker, Eric Roman, John Shalf, David Skinner, Erich Strohmaier, Brian Van Straalen, Samuel Williams, and Yili Zheng. 2015. 
DEGAS: Dynamic Exascale Global Address Space; slides available online, retrieved 5\/2015, http:\/\/goo.gl\/IrfsIs."},{"doi-asserted-by":"publisher","key":"e_1_2_1_33_1","DOI":"10.1109\/CCGrid.2015.97"}],"container-title":["ACM Transactions on Architecture and Code Optimization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2897783","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/2897783","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,18]],"date-time":"2025-06-18T04:38:45Z","timestamp":1750221525000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/2897783"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2016,6,14]]},"references-count":33,"journal-issue":{"issue":"2","published-print":{"date-parts":[[2016,6,27]]}},"alternative-id":["10.1145\/2897783"],"URL":"https:\/\/doi.org\/10.1145\/2897783","relation":{},"ISSN":["1544-3566","1544-3973"],"issn-type":[{"type":"print","value":"1544-3566"},{"type":"electronic","value":"1544-3973"}],"subject":[],"published":{"date-parts":[[2016,6,14]]},"assertion":[{"value":"2015-09-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-02-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2016-06-14","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}