{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,7]],"date-time":"2026-03-07T18:56:36Z","timestamp":1772909796246,"version":"3.50.1"},"reference-count":80,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2022,1,31]],"date-time":"2022-01-31T00:00:00Z","timestamp":1643587200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Parallel Comput."],"published-print":{"date-parts":[[2022,3,31]]},"abstract":"<jats:p>We present new sequential and parallel sorting algorithms that now represent the fastest known techniques for a wide range of input sizes, input distributions, data types, and machines. Somewhat surprisingly, part of the speed advantage is due to the additional feature of the algorithms to work in-place, i.e., they do not need a significant amount of space beyond the input array. Previously, the in-place feature often implied performance penalties. Our main algorithmic contribution is a blockwise approach to in-place data distribution that is provably cache-efficient. We also parallelize this approach taking dynamic load balancing and memory locality into account.<\/jats:p>\n          <jats:p>\n            Our new comparison-based algorithm\n            <jats:italic>In-place Parallel Super Scalar Samplesort (<\/jats:italic>\n            <jats:italic>\n              <jats:sans-serif>\n                IPS\n                <jats:sup>4<\/jats:sup>\n                o\n              <\/jats:sans-serif>\n            <\/jats:italic>\n            <jats:italic>)<\/jats:italic>\n            , combines this technique with branchless decision trees. By taking cases with many equal elements into account and by adapting the distribution degree dynamically, we obtain a highly robust algorithm that outperforms the best previous in-place parallel comparison-based sorting algorithms by almost a factor of three. That algorithm also outperforms the best comparison-based competitors regardless of whether we consider in-place or not in-place, parallel or sequential settings.\n          <\/jats:p>\n          <jats:p>\n            Another surprising result is that\n            <jats:sans-serif>\n              IPS\n              <jats:sup>4<\/jats:sup>\n              o\n            <\/jats:sans-serif>\n            even outperforms the best (in-place or not in-place) integer sorting algorithms in a wide range of situations. In many of the remaining cases (often involving near-uniform input distributions, small keys, or a sequential setting), our new\n            <jats:italic>In-place Parallel Super Scalar Radix Sort (<\/jats:italic>\n            <jats:italic>\n              <jats:sans-serif>\n                IPS\n                <jats:sup>2<\/jats:sup>\n                Ra\n              <\/jats:sans-serif>\n            <\/jats:italic>\n            <jats:italic>)<\/jats:italic>\n            turns out to be the best algorithm.\n          <\/jats:p>\n          <jats:p>Claims to have the \u2013 in some sense \u2013 \u201cbest\u201d sorting algorithm can be found in many papers which cannot all be true. Therefore, we base our conclusions on an extensive experimental study involving a large part of the cross product of 21 state-of-the-art sorting codes, 6 data types, 10 input distributions, 4 machines, 4 memory allocation strategies, and input sizes varying over 7 orders of magnitude. This confirms the claims made about the robust performance of our algorithms while revealing major performance problems in many competitors outside the concrete set of measurements reported in the associated publications. This is particularly true for integer sorting algorithms giving one reason to prefer comparison-based algorithms for robust general-purpose sorting.<\/jats:p>","DOI":"10.1145\/3505286","type":"journal-article","created":{"date-parts":[[2022,1,31]],"date-time":"2022-01-31T16:35:52Z","timestamp":1643646952000},"page":"1-62","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":28,"title":["Engineering In-place (Shared-memory) Sorting Algorithms"],"prefix":"10.1145","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0686-0138","authenticated-orcid":false,"given":"Michael","family":"Axtmann","sequence":"first","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7867-3200","authenticated-orcid":false,"given":"Sascha","family":"Witt","sequence":"additional","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9450-8619","authenticated-orcid":false,"given":"Daniel","family":"Ferizovic","sequence":"additional","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3330-9349","authenticated-orcid":false,"given":"Peter","family":"Sanders","sequence":"additional","affiliation":[{"name":"Karlsruhe Institute of Technology, Karlsruhe, Germany"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2022,1,31]]},"reference":[{"key":"e_1_3_4_2_2","first-page":"197","volume-title":"20th Symposium on Parallelism in Algorithms and Architectures (SPAA)","author":"Arge Lars","year":"2008","unstructured":"Lars Arge, Michael T. Goodrich, Michael J. Nelson, and Nodari Sitchinava. 2008. Fundamental parallel algorithms for private-cache chip multiprocessors. In 20th Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, 197\u2013206. https:\/\/doi.org\/10.1145\/1378533.1378573"},{"key":"e_1_3_4_3_2","first-page":"15","volume-title":"21st Workshop on Algorithm Engineering and Experiments (ALENEX)","author":"Aum\u00fcller Martin","year":"2019","unstructured":"Martin Aum\u00fcller and Nikolaj Hass. 2019. Simple and fast blockquicksort using Lomuto\u2019s partitioning scheme. In 21st Workshop on Algorithm Engineering and Experiments (ALENEX). SIAM, 15\u201326. https:\/\/doi.org\/10.1137\/1.9781611975499.2"},{"key":"e_1_3_4_4_2","article-title":"NUMA Array","author":"Axtmann Michael","year":"2020","unstructured":"Michael Axtmann. 2020. NUMA Array. https:\/\/github.com\/ips4o\/NumaArray. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/ips4o\/NumaArray"},{"key":"e_1_3_4_5_2","article-title":"(Parallel) Super Scalar Sample Sort","author":"Axtmann Michael","year":"2020","unstructured":"Michael Axtmann. 2020. (Parallel) Super Scalar Sample Sort. https:\/\/github.com\/ips4o\/ps4o. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/ips4o\/ps4o"},{"key":"e_1_3_4_6_2","first-page":"13","volume-title":"27th Symposium on Parallelism in Algorithms and Architectures (SPAA)","author":"Axtmann Michael","year":"2015","unstructured":"Michael Axtmann, Timo Bingmann, Peter Sanders, and Christian Schulz. 2015. Practical massively parallel sorting. In 27th Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, 13\u201323. https:\/\/doi.org\/10.1145\/2755573.2755595"},{"key":"e_1_3_4_7_2","first-page":"9:1\u20139:14","volume-title":"25th European Symposium on Algorithms (ESA)","author":"Axtmann Michael","year":"2017","unstructured":"Michael Axtmann, Sascha Witt, Daniel Ferizovic, and Peter Sanders. 2017. In-Place parallel super scalar samplesort (IPSSSSo). In 25th European Symposium on Algorithms (ESA), Vol. 87. LIPIcs, 9:1\u20139:14. https:\/\/doi.org\/10.4230\/LIPIcs.ESA.2017.9"},{"key":"e_1_3_4_8_2","article-title":"Engineering In-place (Shared-memory) Sorting Algorithms","author":"Axtmann Michael","year":"2020","unstructured":"Michael Axtmann, Sascha Witt, Daniel Ferizovic, and Peter Sanders. Sept. 2020. Engineering In-place (Shared-memory) Sorting Algorithms. Computing Research Repository (CoRR). arxiv:2009.13569.","journal-title":"Computing Research Repository (CoRR)"},{"key":"e_1_3_4_9_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF01939369"},{"key":"e_1_3_4_10_2","doi-asserted-by":"publisher","DOI":"10.5555\/3299630"},{"key":"e_1_3_4_11_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00453-015-0071-1"},{"key":"e_1_3_4_12_2","doi-asserted-by":"publisher","DOI":"10.1002\/spe.2922"},{"key":"e_1_3_4_13_2","first-page":"507","volume-title":"32nd Symposium on Parallelism in Algorithms and Architectures (SPAA)","author":"Blelloch Guy E.","year":"2020","unstructured":"Guy E. Blelloch, Daniel Anderson, and Laxman Dhulipala. 2020. ParlayLib - A toolkit for parallel algorithms on shared-memory multicore machines. In 32nd Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, 507\u2013509. https:\/\/doi.org\/10.1145\/3350755.3400254"},{"key":"e_1_3_4_14_2","first-page":"189","volume-title":"22nd Symposium on Parallelism in Algorithms and Architectures (SPAA)","author":"Blelloch Guy E.","year":"2010","unstructured":"Guy E. Blelloch, Phillip B. Gibbons, and Harsha Vardhan Simhadri. 2010. Low depth cache-oblivious algorithms. In 22nd Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, 189\u2013199. https:\/\/doi.org\/10.1145\/1810479.1810519"},{"issue":"12","key":"e_1_3_4_15_2","first-page":"273","article-title":"A comparison of sorting algorithms for the connection machine CM-2","volume":"39","author":"Blelloch Guy E.","year":"1996","unstructured":"Guy E. Blelloch, Charles E. Leiserson, Bruce M. Maggs, C. Greg Plaxton, Stephen J. Smith, and Marco Zagha. 1996. A comparison of sorting algorithms for the connection machine CM-2, In 3rd Symposium on Parallel Algorithms and Architectures (SPAA).Commun. ACM 39, 12es, 273\u2013297. https:\/\/doi.org\/10.1145\/113379.113380","journal-title":"3rd Symposium on Parallel Algorithms and Architectures (SPAA)"},{"key":"e_1_3_4_16_2","doi-asserted-by":"publisher","DOI":"10.14569\/IJACSA.2017.081044"},{"key":"e_1_3_4_17_2","first-page":"2.2:1\u20132.2:23","article-title":"Engineering a cache-oblivious sorting algorithm","volume":"12","author":"Brodal Gerth St\u00f8lting","year":"2007","unstructured":"Gerth St\u00f8lting Brodal, Rolf Fagerberg, and Kristoffer Vinther. 2007. Engineering a cache-oblivious sorting algorithm. ACM J. Exp. Algorithmics 12 (2007), 2.2:1\u20132.2:23. https:\/\/doi.org\/10.1145\/1227161.1227164","journal-title":"ACM J. Exp. Algorithmics"},{"key":"e_1_3_4_18_2","doi-asserted-by":"publisher","DOI":"10.14778\/2824032.2824050"},{"key":"e_1_3_4_19_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00165-016-0401-3"},{"key":"e_1_3_4_20_2","article-title":"Intel\u00ae Integrated Performance Primitives","author":"Corporation Intel","year":"2020","unstructured":"Intel Corporation. 2020. Intel\u00ae Integrated Performance Primitives. https:\/\/software.intel.com\/en-us\/ipp-dev-reference. Version 2020 Initial Release.","journal-title":"https:\/\/software.intel.com\/en-us\/ipp-dev-reference"},{"key":"e_1_3_4_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/s101070100263"},{"key":"e_1_3_4_22_2","doi-asserted-by":"publisher","DOI":"10.3138\/FM57-6770-U75U-7727"},{"key":"e_1_3_4_23_2","series-title":"Lecture Notes in Computer Science","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1007\/BFb0016252","volume-title":"Mathematical Foundations of Computer Science","author":"Durian Branislav","year":"1986","unstructured":"Branislav Durian. 1986. Quicksort without a stack. In Mathematical Foundations of Computer Science(Lecture Notes in Computer Science, Vol. 233). Springer, 283\u2013289. https:\/\/doi.org\/10.1007\/BFb0016252"},{"key":"e_1_3_4_24_2","first-page":"38:1\u201338:16","volume-title":"24th European Symposium on Algorithms (ESA)","author":"Edelkamp Stefan","year":"2016","unstructured":"Stefan Edelkamp and Armin Wei\u00df. 2016. BlockQuicksort: Avoiding branch mispredictions in quicksort. In 24th European Symposium on Algorithms (ESA), Vol. 57. LIPIcs, 38:1\u201338:16. https:\/\/doi.org\/10.4230\/LIPIcs.ESA.2016.38"},{"key":"e_1_3_4_25_2","first-page":"1","volume-title":"21st Workshop on Algorithm Engineering and Experiments (ALENEX)","author":"Edelkamp Stefan","year":"2019","unstructured":"Stefan Edelkamp and Armin Wei\u00df. 2019. Worst-case efficient sorting with QuickMergesort. In 21st Workshop on Algorithm Engineering and Experiments (ALENEX). SIAM, 1\u201314. https:\/\/doi.org\/10.1137\/1.9781611975499.1"},{"key":"e_1_3_4_26_2","first-page":"160","volume-title":"11th Symposium on Experimental Algorithms (SEA)","author":"Elmasry Amr","year":"2012","unstructured":"Amr Elmasry, Jyrki Katajainen, and Max Stenmark. 2012. Branch mispredictions don\u2019t affect mergesort. In 11th Symposium on Experimental Algorithms (SEA), Vol. 7276. Springer, 160\u2013171. https:\/\/doi.org\/10.1007\/978-3-642-30850-5_15"},{"key":"e_1_3_4_27_2","doi-asserted-by":"publisher","DOI":"10.1145\/146370.146381"},{"key":"e_1_3_4_28_2","doi-asserted-by":"publisher","DOI":"10.1145\/5666.5673"},{"key":"e_1_3_4_29_2","first-page":"291","volume-title":"15th Symposium on Discrete Algorithms (SODA)","author":"Franceschini Gianni","year":"2004","unstructured":"Gianni Franceschini. 2004. Proximity mergesort: Optimal in-place sorting in the cache-oblivious model. In 15th Symposium on Discrete Algorithms (SODA). SIAM, 291\u2013299."},{"key":"e_1_3_4_30_2","doi-asserted-by":"publisher","DOI":"10.1145\/1082036.1082037"},{"key":"e_1_3_4_31_2","doi-asserted-by":"publisher","DOI":"10.1016\/0167-8191(92)90089-P"},{"key":"e_1_3_4_32_2","doi-asserted-by":"publisher","DOI":"10.1145\/321592.321600"},{"key":"e_1_3_4_33_2","doi-asserted-by":"publisher","DOI":"10.1145\/320831.320833"},{"key":"e_1_3_4_34_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2010.01.034"},{"key":"e_1_3_4_35_2","doi-asserted-by":"publisher","DOI":"10.1145\/72935.72953"},{"key":"e_1_3_4_36_2","article-title":"TimSort","author":"Goro Fuji","year":"2014","unstructured":"Fuji Goro and Morwenn. 2014. TimSort. https:\/\/github.com\/timsort\/cpp-TimSort.git. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/timsort\/cpp-TimSort.git"},{"key":"e_1_3_4_37_2","article-title":"RADULS2","author":"Group REFRESH Bioinformatics","year":"2017","unstructured":"REFRESH Bioinformatics Group. 2017. RADULS2. https:\/\/github.com\/refresh-bio\/RADULS. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/refresh-bio\/RADULS"},{"key":"e_1_3_4_38_2","article-title":"Parallel In-Place Algorithms: Theory and Practice","author":"Gu Yan","year":"2021","unstructured":"Yan Gu, Omar Obeya, and Julian Shun. 2021. Parallel In-Place Algorithms: Theory and Practice. Computing Research Repository (CoRR). arxiv:2103.01216","journal-title":"Computing Research Repository (CoRR)"},{"key":"e_1_3_4_39_2","doi-asserted-by":"publisher","DOI":"10.1109\/12.88483"},{"key":"e_1_3_4_40_2","doi-asserted-by":"publisher","DOI":"10.1109\/12.46289"},{"key":"e_1_3_4_41_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/5.1.10"},{"key":"e_1_3_4_42_2","doi-asserted-by":"publisher","DOI":"10.1145\/2751205.2751247"},{"key":"e_1_3_4_43_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/35.6.643"},{"key":"e_1_3_4_44_2","article-title":"Super Scalar Sample Sort","author":"H\u00fcbschle-Schneider Lorenz","year":"2016","unstructured":"Lorenz H\u00fcbschle-Schneider. 2016. Super Scalar Sample Sort. https:\/\/github.com\/lorenzhs\/ssssort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/lorenzhs\/ssssort"},{"issue":"3","key":"e_1_3_4_45_2","first-page":"159","article-title":"An optimal algorithm for approximating a piecewise linear function","volume":"9","author":"Imai Hiroshi","year":"1986","unstructured":"Hiroshi Imai and Masao Iri. 1986. An optimal algorithm for approximating a piecewise linear function. J. Inf. Process. 9, 3 (1986), 159\u2013162.","journal-title":"J. Inf. Process."},{"key":"e_1_3_4_46_2","doi-asserted-by":"publisher","DOI":"10.1109\/5992.814657"},{"issue":"1","key":"e_1_3_4_47_2","first-page":"1.9:1\u20131.9:28","article-title":"On a model of virtual address translation","volume":"19","author":"Jurkiewicz Tomasz","year":"2014","unstructured":"Tomasz Jurkiewicz and Kurt Mehlhorn. 2014. On a model of virtual address translation. Journal of Experimental Algorithmics (JEA) 19, 1 (2014), 1.9:1\u20131.9:28. https:\/\/doi.org\/10.1145\/2656337","journal-title":"Journal of Experimental Algorithmics (JEA)"},{"key":"e_1_3_4_48_2","first-page":"780","volume-title":"14th European Symposium on Algorithms (ESA)","author":"Kaligosi Kanela","year":"2006","unstructured":"Kanela Kaligosi and Peter Sanders. 2006. How branch mispredictions affect quicksort. In 14th European Symposium on Algorithms (ESA), Vol. 4168. Springer, 780\u2013791. https:\/\/doi.org\/10.1007\/11841036_69"},{"issue":"1","key":"e_1_3_4_49_2","first-page":"27","article-title":"Practical in-place mergesort","volume":"3","author":"Katajainen Jyrki","year":"1996","unstructured":"Jyrki Katajainen, Tomi Pasanen, and Jukka Teuhola. 1996. Practical in-place mergesort. Nord. J. Comput. 3, 1 (1996), 27\u201340.","journal-title":"Nord. J. Comput."},{"key":"e_1_3_4_50_2","first-page":"246","volume-title":"Theory and Applications of Models of Computation (TAMC)","author":"Kim Pok-Son","year":"2008","unstructured":"Pok-Son Kim and Arne Kutzner. 2008. Ratio based stable in-place merging. In Theory and Applications of Models of Computation (TAMC), Vol. 4978. Springer, 246\u2013257. https:\/\/doi.org\/10.1007\/978-3-540-79228-4_22"},{"key":"e_1_3_4_51_2","doi-asserted-by":"crossref","first-page":"481","DOI":"10.1007\/978-3-319-67792-7_47","volume-title":"5th International Conference on Man-Machine Interactions (ICMMI)","volume":"659","author":"Kokot Marek","year":"2017","unstructured":"Marek Kokot, Sebastian Deorowicz, and Maciej Dlugosz. 2017. Even faster sorting of (not only) integers. In 5th International Conference on Man-Machine Interactions (ICMMI), Vol. 659. Springer, 481\u2013491. https:\/\/doi.org\/10.1007\/978-3-319-67792-7_47"},{"key":"e_1_3_4_52_2","article-title":"LearnedSort","author":"Kristo Ani","year":"2020","unstructured":"Ani Kristo. 2020. LearnedSort. https:\/\/github.com\/learnedsystems\/LearnedSort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/learnedsystems\/LearnedSort"},{"key":"e_1_3_4_53_2","first-page":"47","volume-title":"16th Workshop on Algorithm Engineering and Experiments (ALENEX)","author":"Kushagra Shrinu","year":"2014","unstructured":"Shrinu Kushagra, Alejandro L\u00f3pez-Ortiz, Aurick Qiao, and J. Ian Munro. 2014. Multi-pivot quicksort: Theory and experiments. In 16th Workshop on Algorithm Engineering and Experiments (ALENEX). SIAM, 47\u201360. https:\/\/doi.org\/10.1137\/1.9781611973198.6"},{"key":"e_1_3_4_54_2","doi-asserted-by":"publisher","DOI":"10.1016\/0020-0190(89)90116-6"},{"key":"e_1_3_4_55_2","article-title":"WikiSort","author":"McFadden Mike","year":"2014","unstructured":"Mike McFadden. 2014. WikiSort. https:\/\/github.com\/BonzaiThePenguin\/WikiSort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/BonzaiThePenguin\/WikiSort"},{"key":"e_1_3_4_56_2","doi-asserted-by":"publisher","DOI":"10.5555\/2159557"},{"issue":"1","key":"e_1_3_4_57_2","first-page":"5","article-title":"Engineering radix sort","volume":"6","author":"McIlroy Peter M.","year":"1993","unstructured":"Peter M. McIlroy, Keith Bostic, and M. Douglas McIlroy. 1993. Engineering radix sort. Comput. Syst. 6, 1 (1993), 5\u201327.","journal-title":"Comput. Syst."},{"key":"e_1_3_4_58_2","first-page":"63:1\u201363:16","volume-title":"26th European Symposium on Algorithms (ESA)","author":"Munro J. Ian","unstructured":"J. Ian Munro and Sebastian Wild. [n.d.]. Nearly-optimal mergesorts: Fast, practical sorting methods that optimally adapt to existing runs. In 26th European Symposium on Algorithms (ESA). 63:1\u201363:16."},{"key":"e_1_3_4_59_2","doi-asserted-by":"publisher","DOI":"10.1002\/(SICI)1097-024X(199708)27:8<983::AID-SPE117>3.0.CO;2-#"},{"key":"e_1_3_4_60_2","article-title":"RegionSort","author":"Obeya Omar","year":"2019","unstructured":"Omar Obeya, Endrias Kahssay, Edward Fan, and Julian Shun. 2019. RegionSort. https:\/\/github.com\/omarobeya\/parallel-inplace-radixsort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/omarobeya\/parallel-inplace-radixsort"},{"key":"e_1_3_4_61_2","first-page":"213","volume-title":"The 31st ACM on Symposium on Parallelism in Algorithms and Architectures (SPAA)","author":"Obeya Omar","year":"2019","unstructured":"Omar Obeya, Endrias Kahssay, Edward Fan, and Julian Shun. 2019. Theoretically-efficient and practical parallel in-place radix sorting. In The 31st ACM on Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, ACM, 213\u2013224."},{"key":"e_1_3_4_62_2","article-title":"Pattern-defeating quicksort","author":"Peters Orson","year":"2015","unstructured":"Orson Peters. 2015. Pattern-defeating quicksort. https:\/\/github.com\/orlp\/pdqsort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/orlp\/pdqsort"},{"key":"e_1_3_4_63_2","article-title":"Timsort","author":"Peters Tim","year":"2002","unstructured":"Tim Peters. 2002. Timsort. http:\/\/svn.python.org\/projects\/python\/trunk\/Objects\/listsort.txt. Accessed: 2020-03-31.","journal-title":"http:\/\/svn.python.org\/projects\/python\/trunk\/Objects\/listsort.txt"},{"key":"e_1_3_4_64_2","article-title":"In-place MSB","author":"Polychroniou Orestis","year":"2014","unstructured":"Orestis Polychroniou. 2014. In-place MSB. http:\/\/www.cs.columbia.edu\/ orestis\/publications.html. Accessed: 2020-09-01.","journal-title":"http:\/\/www.cs.columbia.edu\/ orestis\/publications.html"},{"key":"e_1_3_4_65_2","doi-asserted-by":"publisher","DOI":"10.1145\/2588555.2610522"},{"key":"e_1_3_4_66_2","first-page":"171","volume-title":"Algorithms for Hardware Caches and TLB","author":"Rahman Naila","year":"2002","unstructured":"Naila Rahman. 2002. Algorithms for Hardware Caches and TLB. Vol. 2625. Springer, 171\u2013192. https:\/\/doi.org\/10.1007\/3-540-36574-5_8"},{"key":"e_1_3_4_67_2","volume-title":"Intel Threading Building Blocks - Outfitting C++ for Multi-core Processor Parallelism","author":"Reinders James","year":"2007","unstructured":"James Reinders. 2007. Intel Threading Building Blocks - Outfitting C++ for Multi-core Processor Parallelism. O\u2019Reilly."},{"key":"e_1_3_4_68_2","doi-asserted-by":"publisher","DOI":"10.5555\/3378988"},{"key":"e_1_3_4_69_2","first-page":"784","volume-title":"12th European Symposium on Algorithms (ESA)","author":"Sanders Peter","year":"2004","unstructured":"Peter Sanders and Sebastian Winkel. 2004. Super scalar sample sort. In 12th European Symposium on Algorithms (ESA), Vol. 3221. Springer, 784\u2013796. https:\/\/doi.org\/10.1007\/978-3-540-30140-0_69"},{"key":"e_1_3_4_70_2","first-page":"68","volume-title":"24th Symposium on Parallelism in Algorithms and Architectures (SPAA)","author":"Shun Julian","year":"2012","unstructured":"Julian Shun, Guy E. Blelloch, Jeremy T. Fineman, Phillip B. Gibbons, Aapo Kyrola, Harsha Vardhan Simhadri, and Kanat Tangwongsan. 2012. Brief announcement: The problem based benchmark suite. In 24th Symposium on Parallelism in Algorithms and Architectures (SPAA). ACM, 68\u201370. https:\/\/doi.org\/10.1145\/2312005.2312018"},{"key":"e_1_3_4_71_2","first-page":"682","volume-title":"Euro-Par","author":"Singler Johannes","year":"2007","unstructured":"Johannes Singler, Peter Sanders, and Felix Putze. 2007. MCSTL: The multi-core standard template library. In Euro-Par, Vol. 4641. Springer, 682\u2013694. https:\/\/doi.org\/10.1007\/978-3-540-74466-5_72"},{"key":"e_1_3_4_72_2","article-title":"I Wrote a Faster Sorting Algorithm","author":"Skarupke Malte","year":"2016","unstructured":"Malte Skarupke. 2016. I Wrote a Faster Sorting Algorithm. https:\/\/probablydance.com\/2016\/12\/27\/i-wrote-a-faster-sorting-algorithm\/. Accessed: 2020-03-31.","journal-title":"https:\/\/probablydance.com\/2016\/12\/27\/i-wrote-a-faster-sorting-algorithm\/"},{"key":"e_1_3_4_73_2","article-title":"Ska Sort","author":"Skarupke Malte","year":"2016","unstructured":"Malte Skarupke. 2016. Ska Sort. https:\/\/github.com\/skarupke\/ska_sort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/skarupke\/ska_sort"},{"key":"e_1_3_4_74_2","article-title":"ASPaS","author":"Lab Virginia Tech SyNeRGy","year":"2018","unstructured":"Virginia Tech SyNeRGy Lab. 2018. ASPaS. https:\/\/github.com\/vtsynergy\/aspas_sort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/vtsynergy\/aspas_sort"},{"key":"e_1_3_4_75_2","first-page":"372","volume-title":"11th Euromicro Workshop on Parallel, Distributed and Network-Based Processing (PDP)","author":"Tsigas Philippas","year":"2003","unstructured":"Philippas Tsigas and Yi Zhang. 2003. A simple, fast parallel implementation of quicksort and its performance evaluation on SUN enterprise 10000. In 11th Euromicro Workshop on Parallel, Distributed and Network-Based Processing (PDP). IEEE Computer Society, 372. https:\/\/doi.org\/10.1109\/EMPDP.2003.1183613"},{"key":"e_1_3_4_76_2","first-page":"160","volume-title":"Euro-Par","author":"Wassenberg Jan","year":"2011","unstructured":"Jan Wassenberg and Peter Sanders. 2011. Engineering a multi-core radix sort. In Euro-Par, Vol. 6853. Springer, 160\u2013169. https:\/\/doi.org\/10.1007\/978-3-642-23397-5_16"},{"issue":"1","key":"e_1_3_4_77_2","first-page":"44","article-title":"A generalized, one-way, stackless quicksort","volume":"27","author":"Wegner Lutz M.","year":"1987","unstructured":"Lutz M. Wegner. 1987. A generalized, one-way, stackless quicksort. BIT Comput. Sci. Sect. 27, 1 (1987), 44\u201348. https:\/\/doi.org\/10.1007\/BF01937353","journal-title":"BIT Comput. Sci. Sect."},{"key":"e_1_3_4_78_2","article-title":"BlockQuicksort","author":"Weiss Armin","year":"2016","unstructured":"Armin Weiss. 2016. BlockQuicksort. https:\/\/github.com\/weissan\/BlockQuicksort. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/weissan\/BlockQuicksort"},{"key":"e_1_3_4_79_2","article-title":"Yaroslavskiy\u2019s Dual-Pivot Quicksort","author":"Weiss Armin","year":"2016","unstructured":"Armin Weiss. 2016. Yaroslavskiy\u2019s Dual-Pivot Quicksort. https:\/\/github.com\/weissan\/BlockQuicksort\/blob\/master\/Yaroslavskiy.h++. Accessed: 2020-09-01.","journal-title":"https:\/\/github.com\/weissan\/BlockQuicksort\/blob\/master\/Yaroslavskiy.h++"},{"key":"e_1_3_4_80_2","article-title":"Intel Threading Building Blocks with CMake build system","author":"Wenzel Jokob","year":"2019","unstructured":"Jokob Wenzel. 2019. Intel Threading Building Blocks with CMake build system. https:\/\/github.com\/wjakob\/tbb. TBB 2019 Update 6.","journal-title":"https:\/\/github.com\/wjakob\/tbb"},{"key":"e_1_3_4_81_2","article-title":"Question on sorting","author":"Yaroslavskiy Vladimir","year":"2010","unstructured":"Vladimir Yaroslavskiy. 2010. Question on sorting. http:\/\/mail.openjdk.java.net\/pipermail\/core-libs-dev\/2010-July\/004649.html. Accessed: 2020-09-01.","journal-title":"http:\/\/mail.openjdk.java.net\/pipermail\/core-libs-dev\/2010-July\/004649.html"}],"container-title":["ACM Transactions on Parallel Computing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3505286","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3505286","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T17:49:24Z","timestamp":1750182564000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3505286"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,31]]},"references-count":80,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,3,31]]}},"alternative-id":["10.1145\/3505286"],"URL":"https:\/\/doi.org\/10.1145\/3505286","relation":{},"ISSN":["2329-4949","2329-4957"],"issn-type":[{"value":"2329-4949","type":"print"},{"value":"2329-4957","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,31]]},"assertion":[{"value":"2020-10-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-01-31","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}