{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T08:39:43Z","timestamp":1766219983365,"version":"3.48.0"},"publisher-location":"New York, NY, USA","reference-count":42,"publisher":"ACM","content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":[],"published-print":{"date-parts":[[2025,9,8]]},"DOI":"10.1145\/3754598.3754648","type":"proceedings-article","created":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T08:34:32Z","timestamp":1766219672000},"page":"753-763","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":0,"title":["pyGinkgo: A Sparse Linear Algebra Operator Framework for Python"],"prefix":"10.1145","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2219-1616","authenticated-orcid":false,"given":"Keshvi","family":"Tuteja","sequence":"first","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0128-3933","authenticated-orcid":false,"given":"Gregor","family":"Olenik","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Heilbronn, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-2099-2898","authenticated-orcid":false,"given":"Roman","family":"Mishchuk","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Heilbronn, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5229-3739","authenticated-orcid":false,"given":"Yu-Hsiang","family":"Tsai","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Heilbronn, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-2233-1041","authenticated-orcid":false,"given":"Markus","family":"G\u00f6tz","sequence":"additional","affiliation":[{"name":"Helmholtz AI, Karlsruhe, Germany and Karlsruhe Institute of Technology, Karlsruhe, 
Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5065-469X","authenticated-orcid":false,"given":"Achim","family":"Streit","sequence":"additional","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2177-952X","authenticated-orcid":false,"given":"Hartwig","family":"Anzt","sequence":"additional","affiliation":[{"name":"Technical University of Munich, Heilbronn, Germany and University of Tennessee, Knoxville, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7156-2022","authenticated-orcid":false,"given":"Charlotte","family":"Debus","sequence":"additional","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}]}],"member":"320","published-online":{"date-parts":[[2025,12,20]]},"reference":[{"key":"e_1_3_3_2_2_2","doi-asserted-by":"publisher","unstructured":"2002. An updated set of basic linear algebra subprograms (BLAS). ACM Trans. Math. Softw. 28 2 (June 2002) 135\u2013151. 10.1145\/567806.567807","DOI":"10.1145\/567806.567807"},{"key":"e_1_3_3_2_3_2","doi-asserted-by":"publisher","unstructured":"Mart\u00edn Abadi Ashish Agarwal Paul Barham Eugene Brevdo Zhifeng Chen Craig Citro Greg\u00a0S. Corrado Andy Davis Jeffrey Dean Matthieu Devin Sanjay Ghemawat Ian Goodfellow Andrew Harp Geoffrey Irving Michael Isard Yangqing Jia Rafal Jozefowicz Lukasz Kaiser Manjunath Kudlur Josh Levenberg Dandelion Man\u00e9 Rajat Monga Sherry Moore Derek Murray Chris Olah Mike Schuster Jonathon Shlens Benoit Steiner Ilya Sutskever Kunal Talwar Paul Tucker Vincent Vanhoucke Vijay Vasudevan Fernanda Vi\u00e9gas Oriol Vinyals Pete Warden Martin Wattenberg Martin Wicke Yuan Yu and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. 
10.48550\/arXiv.1603.04467 Software available from tensorflow.org.","DOI":"10.48550\/arXiv.1603.04467"},{"key":"e_1_3_3_2_4_2","doi-asserted-by":"publisher","unstructured":"Ahmad Abdelfattah Natalie Beams Robert Carson Pieter Ghysels Tzanio Kolev Thomas Stitt Arturo Vargas Stanimire Tomov and Jack Dongarra. 2024. MAGMA: Enabling exascale performance with accelerated BLAS and LAPACK for diverse GPU architectures. The International Journal of High Performance Computing Applications (2024-06 2024). 10.1177\/10943420241261960","DOI":"10.1177\/10943420241261960"},{"key":"e_1_3_3_2_5_2","unstructured":"David Abrahams and Ralf\u00a0W Grosse-Kunstleve. 2003. Building hybrid systems with Boost. Python. C\/C++ Users Journal 21 LBNL-53142 (2003). https:\/\/www.boost.org\/doc\/libs\/1_87_0\/libs\/python\/doc\/html\/index.html"},{"key":"e_1_3_3_2_6_2","doi-asserted-by":"publisher","unstructured":"Martin\u00a0S. Alnaes Anders Logg Kristian\u00a0B. \u00d8lgaard Marie\u00a0E. Rognes and Garth\u00a0N. Wells. 2014. Unified Form Language: A domain-specific language for weak formulations of partial differential equations. ACM Trans. Math. Software 40 (2014). 10.1145\/2566630","DOI":"10.1145\/2566630"},{"key":"e_1_3_3_2_7_2","doi-asserted-by":"publisher","DOI":"10.5555\/323215"},{"key":"e_1_3_3_2_8_2","doi-asserted-by":"publisher","unstructured":"Hartwig Anzt Terry Cojean Yen-Chen Chen Goran Flegar Fritz G\u00f6bel Thomas Gr\u00fctzmacher Pratik Nayak Tobias Ribizel and Yu-Hsiang Tsai. 2020. Ginkgo: A high performance numerical linear algebra library. Journal of Open Source Software 5 52 (2020) 2260. 10.21105\/joss.02260","DOI":"10.21105\/joss.02260"},{"key":"e_1_3_3_2_9_2","doi-asserted-by":"publisher","unstructured":"Hartwig Anzt Terry Cojean Goran Flegar Fritz G\u00f6bel Thomas Gr\u00fctzmacher Pratik Nayak Tobias Ribizel Yuhsiang\u00a0Mike Tsai and Enrique\u00a0S. Quintana-Ort\u00ed. 2022. Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing. ACM Trans. 
Math. Software 48 1 (Feb. 2022) 2:1\u20132:33. 10.1145\/3480935","DOI":"10.1145\/3480935"},{"key":"e_1_3_3_2_10_2","doi-asserted-by":"publisher","unstructured":"Hartwig Anzt Terry Cojean Chen Yen-Chen Jack Dongarra Goran Flegar Pratik Nayak Stanimire Tomov Yuhsiang\u00a0M. Tsai and Weichung Wang. 2020. Load-Balancing Sparse Matrix Vector Product Kernels on GPUs. ACM Trans. Parallel Comput. 7 1 Article 2 (March 2020) 26\u00a0pages. 10.1145\/3380930","DOI":"10.1145\/3380930"},{"key":"e_1_3_3_2_11_2","doi-asserted-by":"crossref","unstructured":"Hartwig Anzt William Sawyer Stanimire Tomov Piotr Luszczek Ichitaro Yamazaki and Jack Dongarra. 2014. Optimizing Krylov Subspace Solvers on Graphics Processing Units. (05-2014 2014).","DOI":"10.1109\/IPDPSW.2014.107"},{"key":"e_1_3_3_2_12_2","doi-asserted-by":"publisher","DOI":"10.1109\/PMBS51919.2020.00009"},{"key":"e_1_3_3_2_13_2","first-page":"74","volume-title":"Tcl\/Tk Workshop","author":"Beazley David\u00a0M","year":"1996","unstructured":"David\u00a0M Beazley et\u00a0al. 1996. Swig: An easy to use tool for integrating scripting languages with C and C++.. In Tcl\/Tk Workshop, Vol.\u00a043. 74. https:\/\/www.swig.org\/"},{"key":"e_1_3_3_2_14_2","doi-asserted-by":"publisher","unstructured":"Stefan Behnel Robert Bradshaw Craig Citro Lisandro Dalcin Dag\u00a0Sverre Seljebotn and Kurt Smith. 2011. Cython: The Best of Both Worlds. Computing in Science & Engineering 13 2 (2011) 31\u201339. 10.1109\/MCSE.2010.118","DOI":"10.1109\/MCSE.2010.118"},{"key":"e_1_3_3_2_15_2","doi-asserted-by":"publisher","unstructured":"Nathan Bell Luke\u00a0N. Olson Jacob Schroder and Ben Southworth. 2023. PyAMG: Algebraic Multigrid Solvers in Python. Journal of Open Source Software 8 87 (2023) 5495. 
10.21105\/joss.05495","DOI":"10.21105\/joss.05495"},{"key":"e_1_3_3_2_16_2","volume-title":"JAX: composable transformations of Python+NumPy programs","author":"Bradbury James","year":"2018","unstructured":"James Bradbury, Roy Frostig, Peter Hawkins, Matthew\u00a0James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. 2018. JAX: composable transformations of Python+NumPy programs. http:\/\/github.com\/jax-ml\/jax"},{"key":"e_1_3_3_2_17_2","unstructured":"Sharan Chetlur Cliff Woolley Philippe Vandermersch Jonathan Cohen John Tran Bryan Catanzaro and Evan Shelhamer. 2014. cuDNN: Efficient Primitives for Deep Learning. arxiv:https:\/\/arXiv.org\/abs\/1410.0759\u00a0[cs.NE] https:\/\/arxiv.org\/abs\/1410.0759"},{"key":"e_1_3_3_2_18_2","unstructured":"Maxwell\u00a0D. Collins and Pushmeet Kohli. 2014. Memory Bounded Deep Convolutional Networks. arxiv:https:\/\/arXiv.org\/abs\/1412.1442\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/1412.1442"},{"key":"e_1_3_3_2_19_2","unstructured":"Steven Dalton Nathan Bell Luke Olson and Michael Garland. 2014. Cusp: Generic Parallel Algorithms for Sparse Matrix and Graph Computations. http:\/\/cusplibrary.github.io\/ Version 0.5.0."},{"key":"e_1_3_3_2_20_2","doi-asserted-by":"publisher","unstructured":"J.\u00a0J. Dongarra Jermey\u00a0Du Cruz Sven Hammarling and I.\u00a0S. Duff. 1990. Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs. ACM Trans. Math. Softw. 16 1 (March 1990) 18\u201328. 10.1145\/77626.77627","DOI":"10.1145\/77626.77627"},{"key":"e_1_3_3_2_21_2","doi-asserted-by":"publisher","unstructured":"J.\u00a0J. Dongarra Jeremy Du\u00a0Croz Sven Hammarling and I.\u00a0S. Duff. 1990. A set of level 3 basic linear algebra subprograms. ACM Trans. Math. Softw. 16 1 (March 1990) 1\u201317. 
10.1145\/77626.79170","DOI":"10.1145\/77626.79170"},{"key":"e_1_3_3_2_22_2","doi-asserted-by":"publisher","unstructured":"Charles\u00a0R. Harris K.\u00a0Jarrod Millman St\u00e9fan\u00a0J. van\u00a0der Walt Ralf Gommers Pauli Virtanen David Cournapeau Eric Wieser Julian Taylor Sebastian Berg Nathaniel\u00a0J. Smith Robert Kern Matti Picus Stephan Hoyer Marten\u00a0H. van Kerkwijk Matthew Brett Allan Haldane Jaime\u00a0Fern\u00e1ndez del R\u00edo Mark Wiebe Pearu Peterson Pierre G\u00e9rard-Marchant Kevin Sheppard Tyler Reddy Warren Weckesser Hameer Abbasi Christoph Gohlke and Travis\u00a0E. Oliphant. 2020. Array programming with NumPy. Nature 585 7825 (Sept. 2020) 357\u2013362. 10.1038\/s41586-020-2649-2","DOI":"10.1038\/s41586-020-2649-2"},{"key":"e_1_3_3_2_23_2","doi-asserted-by":"publisher","unstructured":"Vicente Hernandez Jose\u00a0E. Roman and Vicente Vidal. 2005. SLEPc: A scalable and flexible toolkit for the solution of eigenvalue problems. ACM Trans. Math. Softw. 31 3 (Sept. 2005) 351\u2013362. 10.1145\/1089014.1089019","DOI":"10.1145\/1089014.1089019"},{"key":"e_1_3_3_2_24_2","doi-asserted-by":"publisher","unstructured":"Torsten Hoefler Dan Alistarh Tal Ben-Nun Nikoli Dryden and Alexandra Peste. 2021. Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks. arxiv:https:\/\/arXiv.org\/abs\/2102.00554\u00a0[cs.LG] 10.48550\/arXiv.2102.00554","DOI":"10.48550\/arXiv.2102.00554"},{"key":"e_1_3_3_2_25_2","unstructured":"Wenzel Jakob Jason Rhinelander and Dean Moldovan. 2017. pybind11\u2013Seamless operability between C++ 11 and Python. URL: https:\/\/github. com\/pybind\/pybind11 (2017)."},{"key":"e_1_3_3_2_26_2","doi-asserted-by":"publisher","unstructured":"Scott\u00a0P. Kolodziej Mohsen Aznaveh Matthew Bullock Jarrett David Timothy\u00a0A. Davis Matthew Henderson Yifan Hu and Read Sandstrom. 2019. The SuiteSparse Matrix Collection Website Interface. Journal of Open Source Software 4 35 (2019) 1244. 
10.21105\/joss.01244","DOI":"10.21105\/joss.01244"},{"key":"e_1_3_3_2_27_2","unstructured":"Michele Martone Salvatore Filippone Salvatore Tucci Marcin Paprzycki and Maria Ganzha. 2010. Utilizing Recursive Storage in Sparse Matrix-Vector Multiplication - Preliminary Considerations. (2010) 300\u2013305."},{"key":"e_1_3_3_2_28_2","volume-title":"NVIDIA Collective Communications Library (NCCL)","author":"Corporation NVIDIA","unstructured":"NVIDIA Corporation. [n. d.]. NVIDIA Collective Communications Library (NCCL). NVIDIA. https:\/\/developer.nvidia.com\/nccl"},{"key":"e_1_3_3_2_29_2","volume-title":"cuSPARSE - GPU library APIs for sparse computation.","author":"Corporation NVIDIA","year":"2022","unstructured":"NVIDIA Corporation. 2022. cuSPARSE - GPU library APIs for sparse computation. NVIDIA. https:\/\/docs.nvidia.com\/cuda\/archive\/12.6.1\/pdf\/CUSPARSE_Library.pdf"},{"key":"e_1_3_3_2_30_2","volume-title":"cuSOLVER -Direct Linear Solvers on NVIDIA GPUs","author":"Corporation NVIDIA","year":"2023","unstructured":"NVIDIA Corporation. 2023. cuSOLVER -Direct Linear Solvers on NVIDIA GPUs. NVIDIA. https:\/\/docs.nvidia.com\/cuda\/archive\/12.6.1\/pdf\/CUSOLVER_Library.pdf"},{"key":"e_1_3_3_2_31_2","volume-title":"cuRAND - Random Number Generation on NVIDIA GPUs","author":"Corporation NVIDIA","year":"2024","unstructured":"NVIDIA Corporation. 2024. cuRAND - Random Number Generation on NVIDIA GPUs. NVIDIA. https:\/\/docs.nvidia.com\/cuda\/archive\/12.6.1\/pdf\/CURAND_Library.pdf"},{"key":"e_1_3_3_2_32_2","volume-title":"cuBLAS - Basic Linear Algebra on NVIDIA GPUs","author":"Corporation NVIDIA","year":"2025","unstructured":"NVIDIA Corporation. 2025. cuBLAS - Basic Linear Algebra on NVIDIA GPUs. NVIDIA. https:\/\/docs.nvidia.com\/cuda\/archive\/12.6.1\/pdf\/CUBLAS_Library.pdf"},{"key":"e_1_3_3_2_33_2","volume-title":"cuFFT - GPU-accelerated Fast Fourier Transform (FFT) implementations","author":"Corporation NVIDIA","year":"2025","unstructured":"NVIDIA Corporation. 2025. 
cuFFT - GPU-accelerated Fast Fourier Transform (FFT) implementations. NVIDIA. https:\/\/docs.nvidia.com\/cuda\/pdf\/CUFFT_Library.pdf"},{"key":"e_1_3_3_2_34_2","volume-title":"Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS)","author":"Okuta Ryosuke","year":"2017","unstructured":"Ryosuke Okuta, Yuya Unno, Daisuke Nishino, Shohei Hido, and Crissman. 2017. CuPy : A NumPy-Compatible Library for NVIDIA GPU Calculations. In Proceedings of Workshop on Machine Learning Systems (LearningSys) in The Thirty-first Annual Conference on Neural Information Processing Systems (NIPS). https:\/\/api.semanticscholar.org\/CorpusID:41278748"},{"key":"e_1_3_3_2_35_2","doi-asserted-by":"publisher","unstructured":"Adam Paszke Sam Gross Francisco Massa Adam Lerer James Bradbury Gregory Chanan Trevor Killeen Zeming Lin Natalia Gimelshein Luca Antiga Alban Desmaison Andreas K\u00f6pf Edward Yang Zach DeVito Martin Raison Alykhan Tejani Sasank Chilamkurthy Benoit Steiner Lu Fang Junjie Bai and Soumith Chintala. 2019. PyTorch: An Imperative Style High-Performance Deep Learning Library. arxiv:https:\/\/arXiv.org\/abs\/1912.01703\u00a0[cs.LG] 10.48550\/arXiv.1912.01703","DOI":"10.48550\/arXiv.1912.01703"},{"key":"e_1_3_3_2_36_2","doi-asserted-by":"publisher","unstructured":"F. Pedregosa G. Varoquaux A. Gramfort V. Michel B. Thirion O. Grisel M. Blondel P. Prettenhofer R. Weiss V. Dubourg J. Vanderplas A. Passos D. Cournapeau M. Brucher M. Perrot and E. Duchesnay. 2011. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 12 (2011) 2825\u20132830. 10.48550\/arXiv.1201.0490","DOI":"10.48550\/arXiv.1201.0490"},{"key":"e_1_3_3_2_37_2","doi-asserted-by":"publisher","unstructured":"M. Sala W. Spotz and M. Heroux. 2008. PyTrilinos: High-Performance Distributed-Memory Solvers for Python. ACM Transactions on Mathematical Software (TOMS) 34 (March 2008). Issue 2. 
10.1145\/1326548.1326549","DOI":"10.1145\/1326548.1326549"},{"key":"e_1_3_3_2_38_2","volume-title":"PySparse: A Fast Sparse Matrix Library for Python","unstructured":"SourceForge [n. d.]. PySparse: A Fast Sparse Matrix Library for Python. SourceForge. https:\/\/pysparse.sourceforge.net\/contents.html Version 1.0.2."},{"key":"e_1_3_3_2_39_2","unstructured":"Suraj Srinivas Akshayvarun Subramanya and R.\u00a0Venkatesh Babu. 2016. Training Sparse Neural Networks. arxiv:https:\/\/arXiv.org\/abs\/1611.06694\u00a0[cs.CV] https:\/\/arxiv.org\/abs\/1611.06694"},{"key":"e_1_3_3_2_40_2","doi-asserted-by":"publisher","unstructured":"Christian\u00a0R. Trott Damien Lebrun-Grandi\u00e9 Daniel Arndt Jan Ciesko Vinh Dang Nathan Ellingwood Rahulkumar Gayatri Evan Harvey Daisy\u00a0S. Hollman Dan Ibanez Nevin Liber Jonathan Madsen Jeff Miles David Poliakoff Amy Powell Sivasankaran Rajamanickam Mikael Simberg Dan Sunderland Bruno Turcksin and Jeremiah Wilke. 2022. Kokkos 3: Programming Model Extensions for the Exascale Era. IEEE Transactions on Parallel and Distributed Systems 33 4 (2022) 805\u2013817. 10.1109\/TPDS.2021.3097283","DOI":"10.1109\/TPDS.2021.3097283"},{"key":"e_1_3_3_2_41_2","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-030-50743-516"},{"key":"e_1_3_3_2_42_2","unstructured":"Ashish Vaswani Noam Shazeer Niki Parmar Jakob Uszkoreit Llion Jones Aidan\u00a0N. Gomez Lukasz Kaiser and Illia Polosukhin. 2023. Attention Is All You Need. arxiv:https:\/\/arXiv.org\/abs\/1706.03762\u00a0[cs.CL] https:\/\/arxiv.org\/abs\/1706.03762"},{"key":"e_1_3_3_2_43_2","doi-asserted-by":"publisher","unstructured":"Yulong Yan Haoming Chu Yi Jin Yuxiang Huan Zhuo Zou and Lirong Zheng. 2022. Backpropagation With Sparsity Regularization for Spiking Neural Network Learning. Frontiers in Neuroscience 16 (2022). 
10.3389\/fnins.2022.760298","DOI":"10.3389\/fnins.2022.760298"}],"event":{"name":"ICPP '25: 54th International Conference on Parallel Processing","location":"San Diego CA USA","acronym":"ICPP '25"},"container-title":["Proceedings of the 54th International Conference on Parallel Processing"],"original-title":[],"link":[{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3754598.3754648","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,12,20]],"date-time":"2025-12-20T08:36:18Z","timestamp":1766219778000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3754598.3754648"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,9,8]]},"references-count":42,"alternative-id":["10.1145\/3754598.3754648","10.1145\/3754598"],"URL":"https:\/\/doi.org\/10.1145\/3754598.3754648","relation":{},"subject":[],"published":{"date-parts":[[2025,9,8]]},"assertion":[{"value":"2025-12-20","order":3,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}