{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,8]],"date-time":"2025-10-08T16:32:35Z","timestamp":1759941155434,"version":"3.37.3"},"reference-count":23,"publisher":"Springer Science and Business Media LLC","issue":"17","license":[{"start":{"date-parts":[[2023,6,7]],"date-time":"2023-06-07T00:00:00Z","timestamp":1686096000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,7]],"date-time":"2023-06-07T00:00:00Z","timestamp":1686096000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100010661","name":"Horizon 2020 Framework Programme","doi-asserted-by":"publisher","award":["800858"],"award-info":[{"award-number":["800858"]}],"id":[{"id":"10.13039\/100010661","id-type":"DOI","asserted-by":"publisher"}]},{"name":"FernUniversit\u00e4t in Hagen"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Supercomput"],"published-print":{"date-parts":[[2023,11]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Python is becoming increasingly popular in scientific computing. The package MPI for Python (mpi4py) allows writing efficient parallel programs that scale across multiple nodes. However, it does not support non-contiguous data via slices, which is a well-known feature of NumPy. In this work, we therefore evaluate several methods to support the direct transfer of non-contiguous arrays in mpi4py. This significantly simplifies the code, while the performance basically stays the same. In a PingPong-, Stencil- and Lattice-Boltzmann-Benchmark, we compare the common manual copying, a NumPy-Copy design and a design that is based on MPI derived datatypes. In one case, the MPI derived datatype design could achieve a speedup of 15% in a Stencil-Benchmark on four compute nodes. Our designs are superior to naive manual copies, but for maximum performance manual copies with pre-allocated buffers or MPI persistent communication will be a better choice.\n<\/jats:p>","DOI":"10.1007\/s11227-023-05398-7","type":"journal-article","created":{"date-parts":[[2023,6,7]],"date-time":"2023-06-07T04:12:49Z","timestamp":1686111169000},"page":"20019-20040","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Simplifying non-contiguous data transfer with MPI for Python"],"prefix":"10.1007","volume":"79","author":[{"given":"Klaus","family":"N\u00f6lp","sequence":"first","affiliation":[]},{"given":"Lena","family":"Oden","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,7]]},"reference":[{"unstructured":"TIOBE (2023) TIOBE programming community index for April 2023. https:\/\/www.tiobe.com\/tiobe-index\/. Accessed 19 Apr 2023","key":"5398_CR1"},{"unstructured":"Cass S (2022) Top programming languages 2022. https:\/\/spectrum.ieee.org\/top-programming-languages-2022 Accessed 19 Apr 2023","key":"5398_CR2"},{"issue":"7825","key":"5398_CR3","doi-asserted-by":"publisher","first-page":"357","DOI":"10.1038\/s41586-020-2649-2","volume":"585","author":"CR Harris","year":"2020","unstructured":"Harris CR (2020) Array programming with NumPy. Nature 585(7825):357\u2013362","journal-title":"Nature"},{"key":"5398_CR4","doi-asserted-by":"publisher","first-page":"261","DOI":"10.1038\/s41592-019-0686-2","volume":"17","author":"P Virtanen","year":"2020","unstructured":"Virtanen P (2020) SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat Methods 17:261\u2013272","journal-title":"Nat Methods"},{"issue":"4","key":"5398_CR5","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1109\/MCSE.2021.3083216","volume":"23","author":"L Dalcin","year":"2021","unstructured":"Dalcin L (2021) Mpi4py: status update after 12 years of development. Comput Sci Eng 23(4):47\u201354","journal-title":"Comput Sci Eng"},{"unstructured":"Okuta R (2017) Cupy: a numpy-compatible library for nvidia gpu calculations. In: Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS). http:\/\/learningsys.org\/nips17\/assets\/papers\/paper_16.pdf","key":"5398_CR6"},{"doi-asserted-by":"crossref","unstructured":"Lam SK (2015) Numba: A LLVM-based Python JIT compiler. In: Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC. LLVM \u201915. Association for Computing Machinery, New York, NY, USA","key":"5398_CR7","DOI":"10.1145\/2833157.2833162"},{"doi-asserted-by":"crossref","unstructured":"Ziogas AN (2021) NPBench: a benchmarking suite for high-performance NumPy. In: Proceedings of the ACM International Conference on Supercomputing, pp 63\u201374","key":"5398_CR8","DOI":"10.1145\/3447818.3460360"},{"doi-asserted-by":"crossref","unstructured":"Fink Z (2021) Performance evaluation of Python parallel programming models: Charm4Py and mpi4py. CoRR","key":"5398_CR9","DOI":"10.1109\/ESPM254806.2021.00010"},{"doi-asserted-by":"crossref","unstructured":"Alnaasan N (2022) OMB-Py: Python micro-benchmarks for evaluating performance of MPI libraries on HPC systems","key":"5398_CR10","DOI":"10.1109\/IPDPSW55747.2022.00143"},{"doi-asserted-by":"crossref","unstructured":"Ziogas AN (2021) Productivity, portability, performance: data-centric Python. CoRR","key":"5398_CR11","DOI":"10.1145\/3458817.3476176"},{"doi-asserted-by":"crossref","unstructured":"Xiong Q (2018) MPI derived datatypes: performance and portability issues. In: Proceedings of the 25th European MPI Users\u2019 Group Meeting, pp 1\u201310","key":"5398_CR12","DOI":"10.1145\/3236367.3236378"},{"key":"5398_CR13","doi-asserted-by":"publisher","first-page":"98","DOI":"10.1016\/j.parco.2017.08.006","volume":"69","author":"A Carpen-Amarie","year":"2017","unstructured":"Carpen-Amarie A (2017) On expected and observed communication performance with MPI derived datatypes. Parallel Comput 69:98\u2013117","journal-title":"Parallel Comput"},{"doi-asserted-by":"crossref","unstructured":"Hashmi JM (2019) FALCON: efficient designs for zero-copy MPI datatype processing on emerging architectures. In: 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp 355\u2013364","key":"5398_CR14","DOI":"10.1109\/IPDPS.2019.00045"},{"doi-asserted-by":"crossref","unstructured":"Pearson C (2021) TEMPI: an interposed MPI library with a canonical representation of CUDA-aware datatypes","key":"5398_CR15","DOI":"10.1145\/3431379.3460645"},{"doi-asserted-by":"crossref","unstructured":"Chu C-H (2019) High-performance adaptive MPI derived datatype communication for modern multi-GPU systems. In: 2019 IEEE 26th International Conference on High Performance Computing, Data, and Analytics (HiPC), pp 267\u2013276","key":"5398_CR16","DOI":"10.1109\/HiPC.2019.00041"},{"doi-asserted-by":"crossref","unstructured":"Eijkhout V (2020) Performance of MPI sends of non-contiguous data. In: 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pp 889\u2013895","key":"5398_CR17","DOI":"10.1109\/IPDPSW50202.2020.00149"},{"doi-asserted-by":"crossref","unstructured":"Gabriel E (2004) Open MPI: goals, concept, and design of a next generation MPI implementation. In: Proceedings, 11th European PVM\/MPI Users\u2019 Group Meeting, Budapest, Hungary, pp 97\u2013104","key":"5398_CR18","DOI":"10.1007\/978-3-540-30218-6_19"},{"unstructured":"Argonne-National-Laboratory (2023) MPICH: A high-performance, portable implementation of MPI. https:\/\/www.mpich.org Accessed 19 Apr 2023","key":"5398_CR19"},{"unstructured":"MPI-Forum (2021) MPI: A message-passing interface standard, Version 4.0. https:\/\/www.mpi-forum.org\/docs\/mpi-4.0\/mpi40-report.pdf","key":"5398_CR20"},{"doi-asserted-by":"crossref","unstructured":"Di Girolamo S (2019) Network-accelerated non-contiguous memory transfers. In: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp 1\u201314","key":"5398_CR21","DOI":"10.1145\/3295500.3356189"},{"unstructured":"Pastewka L (2022) HPC with Python: an MPI-parallel implementation of the Lattice Boltzmann Method. In: Proceedings of the 5th bwHPC Symposium","key":"5398_CR22"},{"issue":"12","key":"5398_CR23","doi-asserted-by":"publisher","first-page":"4254","DOI":"10.3390\/en15124254","volume":"15","author":"R Gong","year":"2022","unstructured":"Gong R (2022) Lattice Boltzmann modeling of spontaneous imbibition in variable-diameter capillaries. Energies 15(12):4254","journal-title":"Energies"}],"container-title":["The Journal of Supercomputing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-023-05398-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11227-023-05398-7\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11227-023-05398-7.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,10,2]],"date-time":"2023-10-02T08:13:58Z","timestamp":1696234438000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11227-023-05398-7"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,7]]},"references-count":23,"journal-issue":{"issue":"17","published-print":{"date-parts":[[2023,11]]}},"alternative-id":["5398"],"URL":"https:\/\/doi.org\/10.1007\/s11227-023-05398-7","relation":{},"ISSN":["0920-8542","1573-0484"],"issn-type":[{"type":"print","value":"0920-8542"},{"type":"electronic","value":"1573-0484"}],"subject":[],"published":{"date-parts":[[2023,6,7]]},"assertion":[{"value":"13 May 2023","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 June 2023","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"This declaration is not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}},{"value":"The source code and datasets generated during the current study are available at .","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Availability of data and materials"}}]}}