{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T03:23:33Z","timestamp":1740108213210,"version":"3.37.3"},"reference-count":25,"publisher":"Springer Science and Business Media LLC","issue":"7","license":[{"start":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T00:00:00Z","timestamp":1693612800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T00:00:00Z","timestamp":1693612800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Comput Stat"],"published-print":{"date-parts":[[2024,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>When dealing with large-scale applications, the availability of simple and efficient algorithms is essential. We focus on the algorithm for calculating the order statistics, i.e. for selecting the <jats:italic>k<\/jats:italic>th smallest element of an array <jats:italic>X<\/jats:italic>. Many statistical procedures rely on this basic operation, that is usually solved by sorting all the elements and selecting the one in position <jats:italic>k<\/jats:italic>. If the dimension of the array to sort is quite large, this simple operation can become excessively time consuming. For this purpose, we propose an original randomised algorithm that reduces the dimension of the selection problem by focusing only on a small subset of elements that contains the solution. Despite its random nature, it always returns the target value. Empirical results shows that, for arrays of dimensions running from <jats:inline-formula><jats:alternatives><jats:tex-math>$$10^5$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msup>\n                    <mml:mn>10<\/mml:mn>\n                    <mml:mn>5<\/mml:mn>\n                  <\/mml:msup>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula> to <jats:inline-formula><jats:alternatives><jats:tex-math>$$10^8$$<\/jats:tex-math><mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                  <mml:msup>\n                    <mml:mn>10<\/mml:mn>\n                    <mml:mn>8<\/mml:mn>\n                  <\/mml:msup>\n                <\/mml:math><\/jats:alternatives><\/jats:inline-formula>, our procedure resulted to be remarkably (up to almost 10 times) faster than the na\u00efve procedure, independently of the programming environment and of the sorting algorithm, and with a relative advantage that tends to growth with the dimension of the array.<\/jats:p>","DOI":"10.1007\/s00180-023-01381-1","type":"journal-article","created":{"date-parts":[[2023,9,2]],"date-time":"2023-09-02T08:01:57Z","timestamp":1693641717000},"page":"3599-3624","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Order statistics in large arrays (OSILA): a simple randomised algorithm for a fast and efficient attainment of the order statistics in very large arrays"],"prefix":"10.1007","volume":"39","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2749-9142","authenticated-orcid":false,"given":"Andrea","family":"Cerasa","sequence":"first","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,9,2]]},"reference":[{"issue":"2","key":"1381_CR1","doi-asserted-by":"publisher","first-page":"117","DOI":"10.1016\/j.jkss.2010.02.007","volume":"39","author":"AC Atkinson","year":"2010","unstructured":"Atkinson AC, Riani M, Cerioli A (2010) The forward search: theory and data analysis. J Korean Stat Soc 39(2):117\u2013134","journal-title":"J Korean Stat Soc"},{"key":"1381_CR2","unstructured":"Azzini I, Perrotta D, Torti F (2023) A practically efficient fixed-pivot selection algorithm and its extensible MATLAB suite. arXiv:2302.05705"},{"issue":"4","key":"1381_CR3","doi-asserted-by":"publisher","first-page":"448","DOI":"10.1016\/S0022-0000(73)80033-9","volume":"7","author":"M Blum","year":"1973","unstructured":"Blum M, Floyd RW, Pratt VR, Rivest RL, Tarjan RE et al (1973) Time bounds for selection. J Comput Syst Sci 7(4):448\u2013461","journal-title":"J Comput Syst Sci"},{"issue":"8","key":"1381_CR4","doi-asserted-by":"publisher","first-page":"1160","DOI":"10.3390\/e24081160","volume":"24","author":"A Cerasa","year":"2022","unstructured":"Cerasa A (2022) Introducing robust statistics in the uncertainty quantification of nuclear safeguards measurements. Entropy 24(8):1160","journal-title":"Entropy"},{"key":"1381_CR5","volume-title":"Order statistics","author":"HA David","year":"2004","unstructured":"David HA, Nagaraja HN (2004) Order statistics. Wiley, New Jersey"},{"key":"1381_CR6","doi-asserted-by":"crossref","unstructured":"Dixit VS, Jain P (2019) Weighted percentile-based context-aware recommender system. In: Applications of artificial intelligence techniques in engineering: SIGMA 2018, New Delhi, vol 2. Springer, pp. 377\u2013388","DOI":"10.1007\/978-981-13-1822-1_35"},{"issue":"3","key":"1381_CR7","doi-asserted-by":"publisher","first-page":"165","DOI":"10.1145\/360680.360691","volume":"18","author":"RW Floyd","year":"1975","unstructured":"Floyd RW, Rivest RL (1975) Expected time bounds for selection. Commun ACM 18(3):165\u2013172","journal-title":"Commun ACM"},{"issue":"3","key":"1381_CR8","doi-asserted-by":"publisher","first-page":"173","DOI":"10.1145\/360680.360694","volume":"18","author":"RW Floyd","year":"1975","unstructured":"Floyd RW, Rivest RL (1975) The algorithm SELECT for finding the ith smallest of n elements [m1] (algorithm 489). Commun ACM 18(3):173","journal-title":"Commun ACM"},{"issue":"3","key":"1381_CR9","doi-asserted-by":"publisher","first-page":"496","DOI":"10.1145\/321592.321600","volume":"17","author":"WD Frazer","year":"1970","unstructured":"Frazer WD, McKellar AC (1970) Samplesort: a sampling approach to minimal storage tree sorting. J ACM (JACM) 17(3):496\u2013507","journal-title":"J ACM (JACM)"},{"issue":"480","key":"1381_CR10","doi-asserted-by":"publisher","first-page":"1300","DOI":"10.1198\/016214507000001166","volume":"102","author":"R Fried","year":"2007","unstructured":"Fried R, Einbeck J, Gather U (2007) Weighted repeated median smoothing and filtering. J Am Stat Assoc 102(480):1300\u20131308","journal-title":"J Am Stat Assoc"},{"issue":"7","key":"1381_CR11","first-page":"321","volume":"4","author":"CAR Hoare","year":"1961","unstructured":"Hoare CAR (1961) Algorithm 64: quicksort. Commun ACM 4(7):321","journal-title":"Commun ACM"},{"issue":"1","key":"1381_CR12","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1145\/362452.362489","volume":"14","author":"CA Hoare","year":"1971","unstructured":"Hoare CA (1971) Proof of a program: FIND. Commun ACM 14(1):39\u201345","journal-title":"Commun ACM"},{"issue":"4","key":"1381_CR13","doi-asserted-by":"publisher","first-page":"361","DOI":"10.1080\/00031305.1996.10473566","volume":"50","author":"RJ Hyndman","year":"1996","unstructured":"Hyndman RJ, Fan Y (1996) Sample quantiles in statistical packages. Am Stat 50(4):361\u2013365","journal-title":"Am Stat"},{"key":"1381_CR14","doi-asserted-by":"crossref","unstructured":"K\u00e9gl B (2003) Robust regression by boosting the median. In: Learning theory and kernel machines: 16th annual conference on learning theory and 7th kernel workshop, COLT\/Kernel, August 24\u201327, 2003. Proceedings, Washington, DC. Springer, pp 258\u2013272","DOI":"10.1007\/978-3-540-45167-9_20"},{"issue":"1\u20132","key":"1381_CR15","doi-asserted-by":"publisher","first-page":"214","DOI":"10.1016\/j.tcs.2005.06.032","volume":"347","author":"KC Kiwiel","year":"2005","unstructured":"Kiwiel KC (2005) On Floyd and Rivest\u2019s SELECT algorithm. Theor Comput Sci 347(1\u20132):214\u2013238","journal-title":"Theor Comput Sci"},{"key":"1381_CR16","volume-title":"The art of computer programming; volume 3 sorting and searching","author":"DE Knuth","year":"1998","unstructured":"Knuth DE (1998) The art of computer programming; volume 3 sorting and searching. Addison-Wesley Professional, Boston"},{"key":"1381_CR17","volume-title":"Probability and computing: randomization and probabilistic techniques in algorithms and data analysis","author":"M Mitzenmacher","year":"2017","unstructured":"Mitzenmacher M, Upfal E (2017) Probability and computing: randomization and probabilistic techniques in algorithms and data analysis. Cambridge University Press, Cambridge"},{"key":"1381_CR18","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1016\/j.chemolab.2012.03.017","volume":"116","author":"M Riani","year":"2012","unstructured":"Riani M, Perrotta D, Torti F (2012) FSDA: a MATLAB toolbox for robust analysis and interactive data exploration. Chemom Intell Lab Syst 116:17\u201332","journal-title":"Chemom Intell Lab Syst"},{"key":"1381_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v067.c01","volume":"67","author":"M Riani","year":"2015","unstructured":"Riani M, Perrotta D, Cerioli A (2015) The forward search for very large datasets. J Stat Softw 67:1\u201320","journal-title":"J Stat Softw"},{"issue":"388","key":"1381_CR20","doi-asserted-by":"publisher","first-page":"871","DOI":"10.1080\/01621459.1984.10477105","volume":"79","author":"PJ Rousseeuw","year":"1984","unstructured":"Rousseeuw PJ (1984) Least median of squares regression. J Am Stat Assoc 79(388):871\u2013880","journal-title":"J Am Stat Assoc"},{"issue":"409","key":"1381_CR21","doi-asserted-by":"publisher","first-page":"97","DOI":"10.1080\/01621459.1990.10475311","volume":"85","author":"PJ Rousseeuw","year":"1990","unstructured":"Rousseeuw PJ, Bassett GW Jr (1990) The remedian: a robust averaging method for large data sets. J Am Stat Assoc 85(409):97\u2013104","journal-title":"J Am Stat Assoc"},{"issue":"424","key":"1381_CR22","doi-asserted-by":"publisher","first-page":"1273","DOI":"10.1080\/01621459.1993.10476408","volume":"88","author":"PJ Rousseeuw","year":"1993","unstructured":"Rousseeuw PJ, Croux C (1993) Alternatives to the median absolute deviation. J Am Stat Assoc 88(424):1273\u20131283","journal-title":"J Am Stat Assoc"},{"issue":"2","key":"1381_CR23","doi-asserted-by":"publisher","first-page":"159","DOI":"10.1016\/0196-6774(86)90001-5","volume":"7","author":"R Sedgewick","year":"1986","unstructured":"Sedgewick R (1986) A new upper bound for Shellsort. J Algorithms 7(2):159\u2013173","journal-title":"J Algorithms"},{"issue":"3","key":"1381_CR24","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1145\/362875.362901","volume":"12","author":"RC Singleton","year":"1969","unstructured":"Singleton RC (1969) Algorithm 347: an efficient algorithm for sorting with minimal storage [m1]. Commun ACM 12(3):185\u2013186","journal-title":"Commun ACM"},{"key":"1381_CR25","doi-asserted-by":"crossref","unstructured":"Uhlig S (1997) Robust estimation of variance components with high breakdown point in the 1-way random effect model. In: Industrial statistics: aims and computational aspects. Proceedings of the satellite conference to the 51st session of the international statistical institute (ISI), August 16\u201317, 1997, Athens. Springer, pp 65\u201373","DOI":"10.1007\/978-3-642-59268-3_5"}],"container-title":["Computational Statistics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-023-01381-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00180-023-01381-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00180-023-01381-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,20]],"date-time":"2024-11-20T08:23:03Z","timestamp":1732090983000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00180-023-01381-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,2]]},"references-count":25,"journal-issue":{"issue":"7","published-print":{"date-parts":[[2024,12]]}},"alternative-id":["1381"],"URL":"https:\/\/doi.org\/10.1007\/s00180-023-01381-1","relation":{},"ISSN":["0943-4062","1613-9658"],"issn-type":[{"type":"print","value":"0943-4062"},{"type":"electronic","value":"1613-9658"}],"subject":[],"published":{"date-parts":[[2023,9,2]]},"assertion":[{"value":"14 November 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 June 2023","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 September 2023","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}