{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,5,13]],"date-time":"2025-05-13T12:07:08Z","timestamp":1747138028581,"version":"3.40.3"},"publisher-location":"Cham","reference-count":20,"publisher":"Springer International Publishing","isbn-type":[{"type":"print","value":"9783031104183"},{"type":"electronic","value":"9783031104190"}],"license":[{"start":{"date-parts":[[2022,1,1]],"date-time":"2022-01-01T00:00:00Z","timestamp":1640995200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,7,1]],"date-time":"2022-07-01T00:00:00Z","timestamp":1656633600000},"content-version":"vor","delay-in-days":181,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2022]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Shared memory mechanisms, e.g., POSIX shmem or XPMEM, are widely used to implement efficient intra-node communication among processes running on the same node. While POSIX shmem allows other processes to access only newly allocated memory, XPMEM allows accessing any existing data and thus enables more efficient communication because the send buffer content can be copied directly to the receive buffer. Recently, the shared address space model has been proposed, where processes on the same node are mapped into the same address space at the time of process creation, allowing processes to access any data in the shared address space. Process-in-Process (PiP) is an implementation of such a mechanism. The functionalities of shared memory mechanisms and the shared address space model look very similar \u2013 both allow accessing the data of other processes \u2013 however, the shared address space model subsumes the shared memory model, and their internal mechanisms are notably different. 
This paper clarifies the differences between the shared memory and the shared address space models, both qualitatively and quantitatively. This paper does not aim to showcase applications of the shared address space model; rather, through minimal modifications to an existing MPI implementation, it highlights the basic differences between the two models. The following four MPI configurations are evaluated and compared: 1) POSIX Shmem, 2) XPMEM, 3) PiP-Shmem, where intra-node communication is implemented to utilize POSIX shmem but MPI processes share the same address space, and 4) PiP-XPMEM, where XPMEM functions are implemented by the PiP library (without the need for linking to the XPMEM library). Evaluation is done using the Intel MPI benchmark suite and six HPC benchmarks (HPCCG, miniGhost, LULESH2.0, miniMD, miniAMR and mpiGraph). Most notably, PiP-XPMEM outperforms the XPMEM implementation on mpiGraph by almost 1.5x. The performance numbers of HPCCG, miniGhost, miniMD, and LULESH2.0 running with PiP-Shmem and PiP-XPMEM are comparable with those of POSIX Shmem and XPMEM. 
PiP is not only a practical implementation of the shared address space model, but it also provides opportunities for developing new optimization techniques, which the paper further elaborates on.\n<\/jats:p>","DOI":"10.1007\/978-3-031-10419-0_5","type":"book-chapter","created":{"date-parts":[[2022,6,30]],"date-time":"2022-06-30T17:07:51Z","timestamp":1656608871000},"page":"59-78","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["On the\u00a0Difference Between Shared Memory and\u00a0Shared Address Space in\u00a0HPC Communication"],"prefix":"10.1007","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7010-8098","authenticated-orcid":false,"given":"Atsushi","family":"Hori","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4775-1835","authenticated-orcid":false,"given":"Kaiming","family":"Ouyang","sequence":"additional","affiliation":[]},{"given":"Balazs","family":"Gerofi","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2286-9770","authenticated-orcid":false,"given":"Yutaka","family":"Ishikawa","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,7,1]]},"reference":[{"unstructured":"Hydrodynamics Challenge Problem, Lawrence Livermore National Laboratory. Technical report, LLNL-TR-490254, July 2011","key":"5_CR1"},{"unstructured":"miniamr, version 00 (2014). https:\/\/www.osti.gov\/\/servlets\/purl\/1253324","key":"5_CR2"},{"doi-asserted-by":"publisher","unstructured":"Barrett, R.F., Heroux, M.A., Vaughan, C.T.: MiniGhost: a miniapp for exploring boundary exchange strategies using stencil computations in scientific parallel computing. https:\/\/doi.org\/10.2172\/1039405. https:\/\/www.osti.gov\/biblio\/1039405","key":"5_CR3","DOI":"10.2172\/1039405"},{"unstructured":"Brightwell, R., Pedretti, K.: An intra-node implementation of OpenSHMEM using virtual address space mapping. 
In: Fifth Partitioned Global Address Space Conference. Galveston Island, Texas (2011). http:\/\/pgas11.rice.edu\/papers\/BrightwellPedretti-OpenSHMEM-PGAS11.pdf","key":"5_CR4"},{"doi-asserted-by":"crossref","unstructured":"Brightwell, R., Pedretti, K., Hudson, T.: SMARTMAP: operating system support for efficient data sharing among processes on a multi-core processor. In: Proceedings of the 2008 ACM\/IEEE Conference on Supercomputing, SC 2008, pp. 25:1\u201325:12. IEEE Press, Piscataway (2008). http:\/\/dl.acm.org\/citation.cfm?id=1413370.1413396","key":"5_CR5","DOI":"10.1109\/SC.2008.5218881"},{"doi-asserted-by":"publisher","unstructured":"Garg, R., Price, G., Cooperman, G.: Mana for MPI: MPI-agnostic network-agnostic transparent checkpointing. In: Proceedings of the 28th International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2019, pp. 49\u201360. Association for Computing Machinery, New York (2019). https:\/\/doi.org\/10.1145\/3307681.3325962","key":"5_CR6","DOI":"10.1145\/3307681.3325962"},{"unstructured":"Hashmi, J.M.: Designing high performance shared-address-space and adaptive communication middlewares for next-generation HPC systems. Ph.D. thesis, The Ohio State University (2020)","key":"5_CR7"},{"doi-asserted-by":"publisher","unstructured":"Heroux, M.A., Dongarra, J., Luszczek, P.: HPCG benchmark technical specification. https:\/\/doi.org\/10.2172\/1113870. https:\/\/www.osti.gov\/biblio\/1113870","key":"5_CR8","DOI":"10.2172\/1113870"},{"unstructured":"Hjelm, N.: Linux Cross-Memory Attach. https:\/\/github.com\/hjelmn\/xpmem","key":"5_CR9"},{"doi-asserted-by":"publisher","unstructured":"Hori, A., et al.: Process-in-process: techniques for practical address-space sharing. In: Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, HPDC 2018, pp. 131\u2013143. Association for Computing Machinery, New York (2018). 
https:\/\/doi.org\/10.1145\/3208040.3208045","key":"5_CR10","DOI":"10.1145\/3208040.3208045"},{"unstructured":"Intel Corp.: Intel MPI Benchmarks - User Guide and Methodology Description, Revision 3.2.4 edn. (2013)","key":"5_CR11"},{"doi-asserted-by":"crossref","unstructured":"Karlin, I., Keasler, J., Neely, R.: Lulesh 2.0 updates and changes. Technical report, LLNL-TR-641973, August 2013","key":"5_CR12","DOI":"10.2172\/1090032"},{"unstructured":"Kerrisk, M.: shm_overview(7) - Linux manual page (2021). https:\/\/man7.org\/linux\/man-pages\/man7\/shm_overview.7.html","key":"5_CR13"},{"doi-asserted-by":"publisher","unstructured":"Li, M., Lin, J., Lu, X., Hamidouche, K., Tomko, K., Panda, D.K.: Scalable MiniMD design with hybrid MPI and OpenSHMEM. In: Proceedings of the 8th International Conference on Partitioned Global Address Space Programming Models, PGAS 2014. Association for Computing Machinery, New York (2014). https:\/\/doi.org\/10.1145\/2676870.2676893","key":"5_CR14","DOI":"10.1145\/2676870.2676893"},{"unstructured":"Moody, A.: mpigraph, version 00 (2007). https:\/\/www.osti.gov\/\/servlets\/purl\/1249421","key":"5_CR15"},{"doi-asserted-by":"publisher","unstructured":"Ouyang, K., Si, M., Hori, A., Chen, Z., Balaji, P.: Daps: a dynamic asynchronous progress stealing model for MPI communication. In: 2021 IEEE International Conference on Cluster Computing (CLUSTER), pp. 516\u2013527 (2021). https:\/\/doi.org\/10.1109\/Cluster48925.2021.00027","key":"5_CR16","DOI":"10.1109\/Cluster48925.2021.00027"},{"doi-asserted-by":"crossref","unstructured":"Ouyang, K., Si, M., Hori, A., Chen, Z., Balaji, P.: CAB-MPI: exploring interprocess work-stealing towards balanced MPI communication. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis. 
IEEE Press (2020)","key":"5_CR17","DOI":"10.1109\/SC41405.2020.00040"},{"key":"5_CR18","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"publisher","first-page":"94","DOI":"10.1007\/978-3-642-03770-2_16","volume-title":"Recent Advances in Parallel Virtual Machine and Message Passing Interface","author":"M P\u00e9rache","year":"2009","unstructured":"P\u00e9rache, M., Carribault, P., Jourdren, H.: MPC-MPI: an MPI implementation reducing the overall memory consumption. In: Ropo, M., Westerholm, J., Dongarra, J. (eds.) EuroPVM\/MPI 2009. LNCS, vol. 5759, pp. 94\u2013103. Springer, Heidelberg (2009). https:\/\/doi.org\/10.1007\/978-3-642-03770-2_16"},{"doi-asserted-by":"publisher","unstructured":"Shimada, A., Gerofi, B., Hori, A., Ishikawa, Y.: Proposing a new task model towards many-core architecture. In: Proceedings of the First International Workshop on Many-Core Embedded Systems, MES 2013, pp. 45\u201348. ACM, New York (2013). https:\/\/doi.org\/10.1145\/2489068.2489075","key":"5_CR19","DOI":"10.1145\/2489068.2489075"},{"unstructured":"The Ohio State University: MVAPICH: MPI over InfiniBand, Omni-Path, Ethernet\/iWARP, and RoCE. 
http:\/\/mvapich.cse.ohio-state.edu","key":"5_CR20"}],"container-title":["Lecture Notes in Computer Science","Supercomputing Frontiers"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/978-3-031-10419-0_5","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,6,30]],"date-time":"2022-06-30T17:13:58Z","timestamp":1656609238000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/978-3-031-10419-0_5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022]]},"ISBN":["9783031104183","9783031104190"],"references-count":20,"URL":"https:\/\/doi.org\/10.1007\/978-3-031-10419-0_5","relation":{},"ISSN":["0302-9743","1611-3349"],"issn-type":[{"type":"print","value":"0302-9743"},{"type":"electronic","value":"1611-3349"}],"subject":[],"published":{"date-parts":[[2022]]},"assertion":[{"value":"1 July 2022","order":1,"name":"first_online","label":"First Online","group":{"name":"ChapterHistory","label":"Chapter History"}},{"value":"SCFA","order":1,"name":"conference_acronym","label":"Conference Acronym","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Asian Conference on Supercomputing Frontiers","order":2,"name":"conference_name","label":"Conference Name","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Singapore","order":3,"name":"conference_city","label":"Conference City","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Singapore","order":4,"name":"conference_country","label":"Conference Country","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"2022","order":5,"name":"conference_year","label":"Conference Year","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"1 March 2022","order":7,"name":"conference_start_date","label":"Conference Start 
Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"3 March 2022","order":8,"name":"conference_end_date","label":"Conference End Date","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"7","order":9,"name":"conference_number","label":"Conference Number","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"scfa2022","order":10,"name":"conference_id","label":"Conference ID","group":{"name":"ConferenceInfo","label":"Conference Information"}},{"value":"Single-blind","order":1,"name":"type","label":"Type","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"EasyChair","order":2,"name":"conference_management_system","label":"Conference Management System","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"21","order":3,"name":"number_of_submissions_sent_for_review","label":"Number of Submissions Sent for Review","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"8","order":4,"name":"number_of_full_papers_accepted","label":"Number of Full Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"0","order":5,"name":"number_of_short_papers_accepted","label":"Number of Short Papers Accepted","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"38% - The value is computed by the equation \"Number of Full Papers Accepted \/ Number of Submissions Sent for Review * 100\" and then rounded to a whole number.","order":6,"name":"acceptance_rate_of_full_papers","label":"Acceptance Rate of Full Papers","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information 
(provided by the conference organizers)"}},{"value":"3.8","order":7,"name":"average_number_of_reviews_per_paper","label":"Average Number of Reviews per Paper","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"3.5","order":8,"name":"average_number_of_papers_per_reviewer","label":"Average Number of Papers per Reviewer","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}},{"value":"No","order":9,"name":"external_reviewers_involved","label":"External Reviewers Involved","group":{"name":"ConfEventPeerReviewInformation","label":"Peer Review Information (provided by the conference organizers)"}}]}}