{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,25]],"date-time":"2026-01-25T04:59:22Z","timestamp":1769317162324,"version":"3.49.0"},"reference-count":25,"publisher":"Association for Computing Machinery (ACM)","issue":"3","license":[{"start":{"date-parts":[[2017,5,1]],"date-time":"2017-05-01T00:00:00Z","timestamp":1493596800000},"content-version":"unspecified","delay-in-days":0,"URL":"http:\/\/www.springer.com\/tdm"}],"funder":[{"name":"Israel Science Foundation (IL)","award":["182\/13"],"award-info":[{"award-number":["182\/13"]}]},{"DOI":"10.13039\/501100004836","name":"Det Frie Forskningsr\u00e5d","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004836","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Form. Asp. Comput."],"published-print":{"date-parts":[[2017,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>In this paper, we show how the theory of sorting networks can be applied to synthesize optimized general-purpose sorting libraries. Standard sorting libraries are often based on combinations of the classic Quicksort algorithm, with insertion sort applied as base case for small, fixed, numbers of inputs. Unrolling the code for the base case by ignoring loop conditions eliminates branching, resulting in code equivalent to a sorting network. By replacing it with faster sorting networks, we can improve the performance of these algorithms. We show that by considering the number of comparisons and swaps alone we are not able to predict any real advantage of this approach. However, significant speed-ups are obtained when taking advantage of instruction level parallelism and non-branching conditional assignment instructions, both of which are common in modern CPU architectures. Furthermore, a close control of how often registers have to be spilled to memory gives us a complete explanation of the performance of different sorting networks, allowing us to choose an optimal one for each particular architecture. Our experimental results show that using code synthesized from these efficient sorting networks as the base case for Quicksort libraries results in significant real-world speed-ups.<\/jats:p>","DOI":"10.1007\/s00165-016-0401-3","type":"journal-article","created":{"date-parts":[[2016,11,4]],"date-time":"2016-11-04T13:24:59Z","timestamp":1478265899000},"page":"559-579","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":4,"title":["Optimizing sorting algorithms by using sorting networks"],"prefix":"10.1145","volume":"29","author":[{"given":"Michael","family":"Codish","sequence":"first","affiliation":[{"name":"Department of Computer Science, Ben-Gurion University of the Negev, Beersheba, Israel"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7866-7484","authenticated-orcid":false,"given":"Lu\u00eds","family":"Cruz-Filipe","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Markus","family":"Nebel","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Peter","family":"Schneider-Kamp","sequence":"additional","affiliation":[{"name":"Department of Mathematics and Computer Science, University of Southern Denmark, Campusvej 55, 5230, Odense M, Denmark"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","reference":[{"key":"e_1_2_1_2_1_2","doi-asserted-by":"crossref","unstructured":"Batcher KE (1968) Sorting networks and their applications. In: AFIPS Conference Proceedings vol 32. Thomson Book Company pp 307\u2013314","DOI":"10.1145\/1468075.1468121"},{"key":"e_1_2_1_2_2_2","unstructured":"Baddar SWA-H Batcher KE (2011) Designing sorting networks: a new paradigm. Springer"},{"key":"e_1_2_1_2_3_2","doi-asserted-by":"publisher","DOI":"10.1145\/321119.321126"},{"key":"e_1_2_1_2_4_2","doi-asserted-by":"crossref","unstructured":"Bundala D Z\u00e1vodn\u00fd J (2014) Optimal sorting networks. In: Dediu AH Mart\u00edn-Vide C Sierra-Rodr\u00edguez JL Truthe B (eds) LATA 2014 vol 8370 of LNCS. Springer pp 236\u2013247","DOI":"10.1007\/978-3-319-04921-2_19"},{"key":"e_1_2_1_2_5_2","doi-asserted-by":"crossref","unstructured":"Codish M Cruz-Filipe L Frank M Schneider-Kamp P (2014) Twenty-five comparators is optimal when sorting nine inputs (and twenty-nine for ten). In: ICTAI 2014. IEEE December pp 186\u2013193","DOI":"10.1109\/ICTAI.2014.36"},{"key":"e_1_2_1_2_6_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.jcss.2015.11.014"},{"key":"e_1_2_1_2_7_2","doi-asserted-by":"crossref","unstructured":"Codish M Cruz-Filipe L Nebel M Schneider-Kamp P (2015) Applying sorting networks to synthesize optimized sorting libraries. In: Falaschi M (ed) LOPSTR vol 9527 of LNCS. Springer pp 127\u2013142","DOI":"10.1007\/978-3-319-27436-2_8"},{"key":"e_1_2_1_2_8_2","doi-asserted-by":"crossref","unstructured":"Codish M Cruz-Filipe L Schneider-Kamp P (2015) The quest for optimal sorting networks: efficient generation of two-layer prefixes. In: Winkler F Negru V Ida T Jebelan T Petcu D Watt SM Zaharie D (eds) SYNASC 2014. IEEE pp 359\u2013366","DOI":"10.1109\/SYNASC.2014.55"},{"key":"e_1_2_1_2_9_2","doi-asserted-by":"crossref","unstructured":"Codish M Cruz-Filipe L Schneider-Kamp P (2015) Sorting networks: the end game. In: Dediu AH Formenti E Mart\u00edn-Vide C Truthe B (eds) LATA 2015 vol 8977 of LNCS. Springer pp 664\u2013675","DOI":"10.1007\/978-3-319-15579-1_52"},{"key":"e_1_2_1_2_10_2","doi-asserted-by":"crossref","unstructured":"Eppstein D Goodrich MT Tamassia R (2010) Privacy-preserving data-oblivious geometric algorithms for geographic data. In: GIS 10 ACM pp 13\u201322","DOI":"10.1145\/1869790.1869796"},{"key":"e_1_2_1_2_11_2","doi-asserted-by":"crossref","unstructured":"Ehlers T M\u00fcller M (2015) New bounds on optimal sorting networks. In: Beckmann A Mitrana V Soskova MI (eds) CiE 2015 vol 9136 of LNCS. Springer pp 167\u2013176","DOI":"10.1007\/978-3-319-20028-6_17"},{"key":"e_1_2_1_2_12_2","doi-asserted-by":"crossref","unstructured":"Furtak T Amaral JN Niewiadomski R (2007) Using SIMD registers and instructions to enable instruction-level parallelism in sorting algorithms. In: SPAA. ACM pp 348\u2013357","DOI":"10.1145\/1248377.1248436"},{"key":"e_1_2_1_2_13_2","unstructured":"Fisher JA Faraboschi P Young C (2005) Embedded computing: a VLIW approach to architecture compilers and tools. Morgan Kaufman"},{"key":"e_1_2_1_2_14_2","unstructured":"Gamble J (2011) Algorithm::networksort 1.30. Available from http:\/\/cpansearch.perl.org\/src\/JGAMBLE\/Algorithm-Networksort-1.30\/lib\/Algorithm\/Networksort.pm"},{"key":"e_1_2_1_2_15_2","doi-asserted-by":"crossref","unstructured":"Gre\u00df A Zachmann G (2006) GPU-ABiSort: optimal parallel sorting on stream architectures. In: IPDPS. IEEE","DOI":"10.1109\/IPDPS.2006.1639284"},{"key":"e_1_2_1_2_16_2","doi-asserted-by":"publisher","DOI":"10.1145\/321160.321164"},{"key":"e_1_2_1_2_17_2","doi-asserted-by":"publisher","DOI":"10.1093\/comjnl\/5.1.10"},{"key":"e_1_2_1_2_18_2","unstructured":"Knuth DE (1973) The art of computer programming vol III: sorting and searching. Addison-Wesley"},{"key":"e_1_2_1_2_19_2","unstructured":"Lopez B Cruz-Cortes N (2014) On the usage of sorting networks to big data. In: Arabnia HR Yang MQ Jandieri G Park JJ Solo AMG Tinetti FG (eds) Advances in big data analytics: the 2014 WorldComp International Conference Proceedings. Mercury Learning and Information"},{"key":"e_1_2_1_2_20_2","unstructured":"Paoloni G (2010) How to benchmark code execution times on intel\u00ae IA-32 and IA-64 instruction set architectures. White paper 324264-001 Intel Corporation September"},{"key":"e_1_2_1_2_21_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF02090393"},{"key":"e_1_2_1_2_22_2","doi-asserted-by":"publisher","DOI":"10.1007\/BF00289467"},{"key":"e_1_2_1_2_23_2","unstructured":"Sedgewick R Flajolet P (1996) An introduction to the analysis of algorithms. Addison-Wesley-Longman"},{"key":"e_1_2_1_2_24_2","doi-asserted-by":"crossref","unstructured":"Silc J Robic B Ungerer T (1999) Processor architecture: from dataflow to superscalar and beyond. Springer","DOI":"10.1007\/978-3-642-58589-0"},{"key":"e_1_2_1_2_25_2","unstructured":"Sedgewick R Wayne K (2011) Algorithms. Addison-Wesley 4th edn"}],"container-title":["Formal Aspects of Computing"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/article\/10.1007\/s00165-016-0401-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00165-016-0401-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1007\/s00165-016-0401-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1007\/s00165-016-0401-3","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,1,7]],"date-time":"2022-01-07T06:57:00Z","timestamp":1641538620000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1007\/s00165-016-0401-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2017,5]]},"references-count":25,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2017,5]]}},"alternative-id":["10.1007\/s00165-016-0401-3"],"URL":"https:\/\/doi.org\/10.1007\/s00165-016-0401-3","relation":{},"ISSN":["0934-5043","1433-299X"],"issn-type":[{"value":"0934-5043","type":"print"},{"value":"1433-299X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2017,5]]}}}