{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,22]],"date-time":"2026-06-22T18:51:42Z","timestamp":1782154302193,"version":"3.54.5"},"reference-count":62,"publisher":"Springer Science and Business Media LLC","issue":"10","license":[{"start":{"date-parts":[[2026,5,1]],"date-time":"2026-05-01T00:00:00Z","timestamp":1777593600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T00:00:00Z","timestamp":1778803200000},"content-version":"vor","delay-in-days":14,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100014597","name":"Universidade da Coru\u00f1a","doi-asserted-by":"crossref","id":[{"id":"10.13039\/501100014597","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Neural Comput &amp; Applic"],"published-print":{"date-parts":[[2026,5]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>In the era of big data, selecting representative samples has become essential to mitigate overfitting, noise, and high computational cost in machine learning. This study systematically reviews the evolution of instance selection (IS) methods, highlighting the growing importance of instance hardness (IH) as a guiding criterion to improve training efficiency and model robustness. Through a comprehensive search in Scopus and Web of Science, fifty-five studies were identified and analyzed following strict inclusion and exclusion criteria. The reviewed works were classified according to their underlying rationale\u2013error-based, geometric, heuristic, or explainability-driven\u2013revealing that IH principles intersect these categories as a transversal perspective on data quality. Most studies focus on enhancing predictive accuracy (56%) and computational efficiency (36%), while bias reduction and privacy preservation remain secondary. Reported outcomes show significant dataset reductions (up to 97%) with minimal accuracy loss and, in some cases, notable performance gains (+32% accuracy, +67% improvement in MSE). Despite these advances, explicit references to IH are rare, though many methods implicitly rely on related metrics such as misclassification frequency or decision-boundary proximity. Overall, IS is gaining relevance across domains such as cybersecurity, biomedicine, and computer vision, yet the field still lacks standardized methodologies and benchmarking frameworks, underscoring the need for unified, IH-informed strategies for robust and generalizable instance selection.<\/jats:p>","DOI":"10.1007\/s00521-026-12124-w","type":"journal-article","created":{"date-parts":[[2026,5,15]],"date-time":"2026-05-15T13:51:35Z","timestamp":1778853095000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Comparison of data set sample selection algorithms for data science: a systematic review"],"prefix":"10.1007","volume":"38","author":[{"ORCID":"https:\/\/orcid.org\/0009-0004-2177-0626","authenticated-orcid":false,"given":"Alberto","family":"Fern\u00e1ndez-S\u00e1nchez","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Marcos Gestal","family":"Pose","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Ver\u00f3nica Bol\u00f3n","family":"Canedo","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Juli\u00e1n Dorado","family":"de la Calle","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Alejandro Pazos","family":"Sierra","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2026,5,15]]},"reference":[{"key":"12124_CR1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compbiomed.2021.104576","volume":"135","author":"A Ahuja","year":"2021","unstructured":"Ahuja A, Al-Zogbi L, Krieger A (2021) Application of noise-reduction techniques to machine learning algorithms for breast cancer tumor identification. Comput Biol Med 135:104576. https:\/\/doi.org\/10.1016\/j.compbiomed.2021.104576","journal-title":"Comput Biol Med"},{"key":"12124_CR2","doi-asserted-by":"publisher","unstructured":"Majeed A, Hwang, S.O.: Data compactness versus prediction performance, (2025) Achieving both by pruning redundant samples with dominant patterns and hamming distance based sampling scheme. IEEE Access 13:79655\u201379677. https:\/\/doi.org\/10.1109\/ACCESS.2025.3566430","DOI":"10.1109\/ACCESS.2025.3566430"},{"issue":"9","key":"12124_CR3","doi-asserted-by":"publisher","first-page":"351","DOI":"10.5455\/jjcit.71-1739734998","volume":"11","author":"R Younisse","year":"2025","unstructured":"Younisse R, Saif A, Al-Madi N, Almajali S, Mahafzah B (2025) Improving iot security: the impact of dimensionality and size reduction on intrusion-detection performance. Jordan J Comput Inf Technol 11(9):351\u2013368. https:\/\/doi.org\/10.5455\/jjcit.71-1739734998","journal-title":"Jordan J Comput Inf Technol"},{"issue":"3","key":"12124_CR4","doi-asserted-by":"publisher","first-page":"255","DOI":"10.1007\/s00723-020-01192-3","volume":"51","author":"S-J Im","year":"2020","unstructured":"Im S-J, Shim J-H, Kim J-Y, Baek H-M (2020) Comparison of differences in brain structure between teenagers and twenties brain using 3t magnetic resonance imaging. Appl Magn Reson 51(3):255\u2013276. https:\/\/doi.org\/10.1007\/s00723-020-01192-3","journal-title":"Appl Magn Reson"},{"key":"12124_CR5","doi-asserted-by":"publisher","unstructured":"Gilles FH (2013) The developing human brain: differences from adult brain. In: Bl\u00fcml S, Panigrahy A (eds) MR Spectroscopy of Pediatric Brain Disorders. Springer, New York, pp 3\u201310. https:\/\/doi.org\/10.1007\/978-1-4419-5864-8_1","DOI":"10.1007\/978-1-4419-5864-8_1"},{"key":"12124_CR6","doi-asserted-by":"publisher","first-page":"48","DOI":"10.3389\/neuro.09.048.2009","volume":"3","author":"S Lipp\u00e9","year":"2009","unstructured":"Lipp\u00e9 S, Kovacevic N, McIntosh R (2009) Differential maturation of brain signal complexity in the human auditory and visual system. Front Hum Neurosci 3:48. https:\/\/doi.org\/10.3389\/neuro.09.048.2009","journal-title":"Front Hum Neurosci"},{"issue":"1\u20132","key":"12124_CR7","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1016\/S0020-0190(99)00156-8","volume":"73","author":"V Pestov","year":"2000","unstructured":"Pestov V (2000) On the geometry of similarity search: dimensionality curse and concentration of measure. Inf Process Lett 73(1\u20132):47\u201351. https:\/\/doi.org\/10.1016\/S0020-0190(99)00156-8","journal-title":"Inf Process Lett"},{"issue":"3","key":"12124_CR8","doi-asserted-by":"publisher","first-page":"257","DOI":"10.1023\/A:1007626913721","volume":"38","author":"DR Wilson","year":"2001","unstructured":"Wilson DR, Martinez TR (2001) Reduction techniques for instance-based learning algorithms. Mach Learn 38(3):257\u2013286. https:\/\/doi.org\/10.1023\/A:1007626913721","journal-title":"Mach Learn"},{"key":"12124_CR9","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2020.113297","volume":"149","author":"M Malhat","year":"2020","unstructured":"Malhat M, Menshawy ME, Mousa H, Sisi AE (2020) A new approach for instance selection: algorithms, evaluation, and comparisons. Expert Syst Appl 149:113297. https:\/\/doi.org\/10.1016\/j.eswa.2020.113297","journal-title":"Expert Syst Appl"},{"issue":"2","key":"12124_CR10","doi-asserted-by":"publisher","first-page":"133","DOI":"10.1007\/s10462-010-9165-y","volume":"34","author":"JA Olvera-L\u00f3pez","year":"2010","unstructured":"Olvera-L\u00f3pez JA, Carrasco-Ochoa JA, Mart\u00ednez-Trinidad JF, Kittler J (2010) A review of instance selection methods. Artif Intell Rev 34(2):133\u2013143. https:\/\/doi.org\/10.1007\/s10462-010-9165-y","journal-title":"Artif Intell Rev"},{"issue":"3","key":"12124_CR11","doi-asserted-by":"publisher","first-page":"417","DOI":"10.1109\/TPAMI.2011.142","volume":"34","author":"S Garc\u00eda","year":"2012","unstructured":"Garc\u00eda S, Derrac J, Cano JR, Herrera F (2012) Prototype selection for nearest neighbor classification: taxonomy and empirical study. IEEE Trans Pattern Anal Mach Intell 34(3):417\u2013435. https:\/\/doi.org\/10.1109\/TPAMI.2011.142","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"12124_CR12","doi-asserted-by":"publisher","DOI":"10.1145\/3582000","author":"W Cunha","year":"2023","unstructured":"Cunha W, Viegas F, Fran\u00e7a C, Rosa T, Rocha L, Gon\u00e7alves MA (2023) A comparative survey of instance selection methods applied to non-neural and transformer-based text classification. ACM Comput Surv. https:\/\/doi.org\/10.1145\/3582000","journal-title":"ACM Comput Surv"},{"issue":"2","key":"12124_CR13","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1007\/s10994-013-5422-z","volume":"95","author":"MR Smith","year":"2014","unstructured":"Smith MR, Martinez T, Giraud-Carrier C (2014) An instance level analysis of data complexity. Mach Learn 95(2):225\u2013256. https:\/\/doi.org\/10.1007\/s10994-013-5422-z","journal-title":"Mach Learn"},{"key":"12124_CR14","doi-asserted-by":"publisher","unstructured":"Mishra V, Mishra MP (2023). Prisma for review of management literature \u2013 method, merits, and limitations \u2013 an academic review. In: Advancing Methodologies of Conducting Literature Review in Management Domain. Emerald Publishing Limited . https:\/\/doi.org\/10.1108\/S2754-586520230000002007","DOI":"10.1108\/S2754-586520230000002007"},{"issue":"7","key":"12124_CR15","doi-asserted-by":"publisher","first-page":"467","DOI":"10.7326\/M18-0850","volume":"169","author":"AC Tricco","year":"2018","unstructured":"Tricco AC, Lillie E, Zarin W, O\u2019Brien KK, Colquhoun H, Levac D, Moher D, Peters MDJ, Horsley T, Weeks L, Hempel S, Akl EA, Chang C, McGowan J, Stewart L, Hartling L, Aldcroft A, Wilson MG, Garritty C, Lewin S, Godfrey CM, Macdonald MT, Langlois EV, Soares-Weiser K, Moriarty J, Clifford T, Tuncalp O, Straus SE (2018) Prisma extension for scoping reviews (prisma-scr): checklist and explanation. Ann Intern Med 169(7):467\u2013473","journal-title":"Ann Intern Med"},{"issue":"01","key":"12124_CR16","doi-asserted-by":"publisher","DOI":"10.1142\/S0218213023500276","volume":"32","author":"Y Nomura","year":"2023","unstructured":"Nomura Y, Kurita T (2023) Data expansion approach with attention mechanism for learning with noisy labels. Int J Artif Intell Tools 32(01):2350027. https:\/\/doi.org\/10.1142\/S0218213023500276","journal-title":"Int J Artif Intell Tools"},{"key":"12124_CR17","doi-asserted-by":"publisher","DOI":"10.1016\/j.eswa.2023.119536","volume":"217","author":"AK Panja","year":"2023","unstructured":"Panja AK, Rayala A, Agarwala A, Neogy S, Chowdhury C (2023) A hybrid tuple selection pipeline for smartphone based human activity recognition. Expert Syst Appl 217:119536. https:\/\/doi.org\/10.1016\/j.eswa.2023.119536","journal-title":"Expert Syst Appl"},{"issue":"Suppl 1","key":"12124_CR18","doi-asserted-by":"publisher","first-page":"89","DOI":"10.1007\/s10586-018-1821-z","volume":"22","author":"M Suganthi","year":"2019","unstructured":"Suganthi M, Karunakaran V (2019) Instance selection and feature extraction using cuttlefish optimization algorithm and principal component analysis using decision tree. Clust Comput 22(Suppl 1):89\u2013101. https:\/\/doi.org\/10.1007\/s10586-018-1821-z","journal-title":"Clust Comput"},{"key":"12124_CR19","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.jss.2015.04.038","volume":"106","author":"W-C Lin","year":"2015","unstructured":"Lin W-C, Tsai C-F, Ke S-W, Hung C-W, Eberle W (2015) Learning to detect representative data for large scale instance selection. J Syst Softw 106:1\u20138. https:\/\/doi.org\/10.1016\/j.jss.2015.04.038","journal-title":"J Syst Softw"},{"key":"12124_CR20","doi-asserted-by":"publisher","unstructured":"Fenza G, Gallo M, Loia V, Orciuoli F, Herrera-Viedma E (2021) .Data set quality in machine learning: Consistency measure based on group decision making. Appl. Soft Comput. 106(C) https:\/\/doi.org\/10.1016\/j.asoc.2021.107366","DOI":"10.1016\/j.asoc.2021.107366"},{"key":"12124_CR21","doi-asserted-by":"publisher","unstructured":"Huang X, Zhou S (2020). Adaptive transmission for edge learning via training loss estimation. In: ICC 2020 - 2020 IEEE International Conference on Communications (ICC), pp. 1\u20136 . https:\/\/doi.org\/10.1109\/ICC40277.2020.9149251","DOI":"10.1109\/ICC40277.2020.9149251"},{"key":"12124_CR22","doi-asserted-by":"publisher","unstructured":"Das S, Simha V, Swamidas J, Soares A, Da Fonseca VP (2024). Unbalanced fault classification using active learning in synthetic fiber manufacturing process. In: 2024 IEEE International Systems Conference (SysCon), pp. 1\u20138 . https:\/\/doi.org\/10.1109\/SysCon61195.2024.10553615","DOI":"10.1109\/SysCon61195.2024.10553615"},{"key":"12124_CR23","doi-asserted-by":"publisher","unstructured":"Ramirez-Cruz J-f, Fuentes O, Alarcon-Aquino V, Garcia-Banuelos L (2006).: Instance selection and feature weighting using evolutionary algorithms. In: 2006 15th International Conference on Computing, pp. 73\u201379 . https:\/\/doi.org\/10.1109\/CIC.2006.42","DOI":"10.1109\/CIC.2006.42"},{"key":"12124_CR24","doi-asserted-by":"publisher","unstructured":"Kim M, Kim D, Hwang E, Kim E, Ko S-G, Lee B-T (2022) .: Inferring socio-demographic information using smart meter data by transfer learning. In: 2022 6th International Conference on Green Energy and Applications (ICGEA), pp. 221\u2013225 . https:\/\/doi.org\/10.1109\/ICGEA54406.2022.9791982","DOI":"10.1109\/ICGEA54406.2022.9791982"},{"key":"12124_CR25","doi-asserted-by":"publisher","DOI":"10.3390\/app12031011","author":"J Cho","year":"2022","unstructured":"Cho J, Gong S, Choi K (2022) A study on high-speed outlier detection method of network abnormal behavior data using heterogeneous multiple classifiers. Appl Sci. https:\/\/doi.org\/10.3390\/app12031011","journal-title":"Appl Sci"},{"key":"12124_CR26","doi-asserted-by":"publisher","unstructured":"Chulif S, Lee SH, Chang YL, Tsun MTK, Chai KC, Then YL (2023) Momo strategy: Learn more from more mistakes. In: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC), pp. 659\u2013665. https:\/\/doi.org\/10.1109\/APSIPAASC58517.2023.10317346","DOI":"10.1109\/APSIPAASC58517.2023.10317346"},{"key":"12124_CR27","doi-asserted-by":"publisher","unstructured":"Guan S, Chen M, Ha H-Y, Chen S-C, Shyu M-L, Zhang C (2025) Deep learning with mca-based instance selection and bootstrapping for imbalanced data classification. In: 2015 IEEE Conference on Collaboration and Internet Computing (CIC), pp. 288\u2013295. https:\/\/doi.org\/10.1109\/CIC.2015.40","DOI":"10.1109\/CIC.2015.40"},{"issue":"2","key":"12124_CR28","doi-asserted-by":"publisher","first-page":"853","DOI":"10.13053\/CyS-26-2-4255","volume":"26","author":"SO Tovias-Alanis","year":"2022","unstructured":"Tovias-Alanis SO, Gomez-Flores W, Toscano-Pulido G (2022) Evolutionary instance selection based on preservation of the data probability density function. Comput Sist 26(2):853\u2013866. https:\/\/doi.org\/10.13053\/CyS-26-2-4255","journal-title":"Comput Sist"},{"key":"12124_CR29","doi-asserted-by":"publisher","DOI":"10.1016\/j.apacoust.2020.107573","author":"Y Dokuz","year":"2021","unstructured":"Dokuz Y, Tufekci Z (2021) Mini-batch sample selection strategies for deep learning based speech recognition. Appl Acoust. https:\/\/doi.org\/10.1016\/j.apacoust.2020.107573","journal-title":"Appl Acoust"},{"key":"12124_CR30","doi-asserted-by":"publisher","unstructured":"Akinyelu AA (2021).: Improving the speed of machine learning algorithms using bio-inspired techniques. In: 2021 International Conference on Electrical, Computer and Energy Technologies (ICECET), pp. 240\u2013249 https:\/\/doi.org\/10.1109\/ICECET52533.2021.9698651","DOI":"10.1109\/ICECET52533.2021.9698651"},{"key":"12124_CR31","doi-asserted-by":"publisher","DOI":"10.1109\/TIM.2024.3352696","author":"W Liao","year":"2024","unstructured":"Liao W, Bak-Jensen B, Pillai JR, Xia X, Ruan G, Yang Z (2024) Reducing annotation efforts in electricity theft detection through optimal sample selection. IEEE Trans Instrum Meas. https:\/\/doi.org\/10.1109\/TIM.2024.3352696","journal-title":"IEEE Trans Instrum Meas"},{"issue":"2","key":"12124_CR32","doi-asserted-by":"publisher","first-page":"1497","DOI":"10.32604\/cmc.2023.034914","volume":"76","author":"S-Y Lee","year":"2023","unstructured":"Lee S-Y, Park J, Kim D-Y (2023) Context awareness by noise-pattern analysis of a smart factory. Comput Mater Contin 76(2):1497\u20131514. https:\/\/doi.org\/10.32604\/cmc.2023.034914","journal-title":"Comput Mater Contin"},{"key":"12124_CR33","doi-asserted-by":"publisher","DOI":"10.1016\/j.compmedimag.2024.102379","volume":"115","author":"S Rajaraman","year":"2024","unstructured":"Rajaraman S, Zamzmi G, Yang F, Liang Z, Xue Z, Antani S (2024) Semantically redundant training data removal and deep model classification performance: a study with chest x-rays. Comput Med Imaging Graph 115:102379. https:\/\/doi.org\/10.1016\/j.compmedimag.2024.102379","journal-title":"Comput Med Imaging Graph"},{"key":"12124_CR34","doi-asserted-by":"publisher","unstructured":"Wang P, Sun L, Li F (2023).: Privacy-aware knowledge distillation based on dynamic sample selection. In: 2023 International Conference on Data Security and Privacy Protection (DSPP), pp. 266\u2013271. https:\/\/doi.org\/10.1109\/DSPP58763.2023.10405225","DOI":"10.1109\/DSPP58763.2023.10405225"},{"key":"12124_CR35","doi-asserted-by":"publisher","DOI":"10.1016\/j.pmcj.2023.101849","author":"B Molina-Coronado","year":"2023","unstructured":"Molina-Coronado B, Mori U, Mendiburu A, Miguel-Alonso J (2023) Efficient concept drift handling for batch Android malware detection models. Pervasive Mob Comput. https:\/\/doi.org\/10.1016\/j.pmcj.2023.101849","journal-title":"Pervasive Mob Comput"},{"key":"12124_CR36","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2401.16193","author":"Z Wan","year":"2024","unstructured":"Wan Z, Wang Z, Wang Y, Wang Z, Zhu H, Satoh S (2024) Contributing dimension structure of deep feature for coreset selection. arXiv. https:\/\/doi.org\/10.48550\/arXiv.2401.16193","journal-title":"arXiv"},{"key":"12124_CR37","doi-asserted-by":"publisher","unstructured":"Shahmiri L, Wong P, Dooley LS (2022) : Accurate medicinal plant identification in natural environments by embedding mutual information in a convolution neural network model. In: 2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS), Five, pp. 1\u20136 . https:\/\/doi.org\/10.1109\/IPAS55744.2022.10053008","DOI":"10.1109\/IPAS55744.2022.10053008"},{"key":"12124_CR38","doi-asserted-by":"publisher","first-page":"152098","DOI":"10.1109\/ACCESS.2021.3127195","volume":"9","author":"B Baek","year":"2021","unstructured":"Baek B, Euh S, Baek D, Kim D, Hwang D (2021) Histogram entropy representation and prototype based machine learning approach for malware family classification. IEEE Access 9:152098\u2013152114. https:\/\/doi.org\/10.1109\/ACCESS.2021.3127195","journal-title":"IEEE Access"},{"issue":"2","key":"12124_CR39","doi-asserted-by":"publisher","first-page":"3719","DOI":"10.32604\/cmc.2022.025196","volume":"72","author":"M Al-Akhras","year":"2022","unstructured":"Al-Akhras M, Darwish Z, Atawneh S, Habib M (2022) Improving association rules accuracy in noisy domains using instance reduction techniques. Comput Mater Contin 72(2):3719\u20133749. https:\/\/doi.org\/10.32604\/cmc.2022.025196","journal-title":"Comput Mater Contin"},{"key":"12124_CR40","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2022.108934","author":"J Kim","year":"2022","unstructured":"Kim J, Lee J (2022) Instance-based transfer learning method via modified domain-adversarial neural network with influence function: applications to design metamodeling and fault diagnosis. Appl Soft Comput. https:\/\/doi.org\/10.1016\/j.asoc.2022.108934","journal-title":"Appl Soft Comput"},{"key":"12124_CR41","doi-asserted-by":"publisher","DOI":"10.1016\/j.ast.2024.109606","author":"J Lou","year":"2024","unstructured":"Lou J, Chen R, Liu J, Bao Y, You Y, Huang L, Xu M (2024) General framework for unsteady aerodynamic prediction of airfoils based on deep transfer learning. Aerosp Sci Technol. https:\/\/doi.org\/10.1016\/j.ast.2024.109606","journal-title":"Aerosp Sci Technol"},{"key":"12124_CR42","doi-asserted-by":"publisher","DOI":"10.1088\/1741-4326\/ad59b5","author":"Y Zhong","year":"2024","unstructured":"Zhong Y, Zheng W, Chen ZY, Yan W, Xia F, Yu LM, Xue FM, Shen CS, Ai XK, Yang ZY, Yu YL, Nie ZS, Ding YH, Liang YF, Chen ZP (2024) High-beta disruption prediction study on HL-2A with instance-based transfer learning. Nucl Fusion. https:\/\/doi.org\/10.1088\/1741-4326\/ad59b5","journal-title":"Nucl Fusion"},{"key":"12124_CR43","doi-asserted-by":"publisher","unstructured":"Li D, Xia T, Li Q, Li X, Wang J (2024) . Robustness and privacy for green learning under noisy labels. In: 2023 IEEE 22nd International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), pp. 1111\u20131118 https:\/\/doi.org\/10.1109\/TrustCom60117.2023.00153","DOI":"10.1109\/TrustCom60117.2023.00153"},{"key":"12124_CR44","doi-asserted-by":"publisher","DOI":"10.3389\/frobt.2024.1281060","author":"S Das","year":"2024","unstructured":"Das S, Fonseca V, Soares A (2024) Active learning strategies for robotic tactile texture recognition tasks. Frontiers in Robotics and AI. https:\/\/doi.org\/10.3389\/frobt.2024.1281060","journal-title":"Frontiers in Robotics and AI"},{"issue":"18","key":"12124_CR45","doi-asserted-by":"publisher","DOI":"10.3390\/molecules28186680","volume":"28","author":"H Tao","year":"2023","unstructured":"Tao H, Shan S, Fu H, Zhu C, Liu B (2023) An augmented sample selection framework for prediction of anticancer peptides. Molecules 28(18):6680. https:\/\/doi.org\/10.3390\/molecules28186680","journal-title":"Molecules"},{"key":"12124_CR46","doi-asserted-by":"publisher","DOI":"10.1016\/j.compag.2023.108373","volume":"215","author":"T Sun","year":"2023","unstructured":"Sun T, Xue C, Chen Y, Zhao L, Qiao C, Huang A, Li F, Liu C, Wang H, Yang Y, Yao Q, Wang G, Sun C, Tian T, Qiao H (2023) Cost-effective identification of the field maturity of tobacco leaves based on deep semi-supervised active learning and smartphone photograph. Comput Electron Agric 215:108373. https:\/\/doi.org\/10.1016\/j.compag.2023.108373","journal-title":"Comput Electron Agric"},{"key":"12124_CR47","doi-asserted-by":"publisher","first-page":"83","DOI":"10.1016\/j.neunet.2023.01.018","volume":"161","author":"J-D Lin","year":"2023","unstructured":"Lin J-D, Han Y-H, Huang P-H, Tan J, Chen J-C, Tanveer M, Hua K-L (2023) Defaek: domain effective fast adaptive network for face anti-spoofing. Neural Netw 161:83\u201391. https:\/\/doi.org\/10.1016\/j.neunet.2023.01.018","journal-title":"Neural Netw"},{"key":"12124_CR48","doi-asserted-by":"publisher","DOI":"10.1016\/j.ecoinf.2020.101160","volume":"60","author":"H Gan","year":"2020","unstructured":"Gan H, Zhang J, Towsey M, Truskinger A, Stark D, Rensburg BJ, Li Y, Roe P (2020) Data selection in frog chorusing recognition with acoustic indices. Eco Inform 60:101160. https:\/\/doi.org\/10.1016\/j.ecoinf.2020.101160","journal-title":"Eco Inform"},{"issue":"19","key":"12124_CR49","doi-asserted-by":"publisher","first-page":"4883","DOI":"10.3390\/rs14194883","volume":"14","author":"Y Zhao","year":"2022","unstructured":"Zhao Y, Zhang X, Feng W, Xu J (2022) Deep learning classification by resnet-18 based on the real spectral dataset from multispectral remote sensing images. Remote Sens 14(19):4883. https:\/\/doi.org\/10.3390\/rs14194883","journal-title":"Remote Sens"},{"key":"12124_CR50","doi-asserted-by":"publisher","unstructured":"Hossain MI, Zamzmi G, Mouton P, Sun Y, Goldgof D (2023) Enhancing neonatal pain assessment transparency via explanatory training examples identification. In: 2023 IEEE 36th International Symposium on Computer-Based Medical Systems (CBMS), pp. 311\u2013316 https:\/\/doi.org\/10.1109\/CBMS58004.2023.00236","DOI":"10.1109\/CBMS58004.2023.00236"},{"issue":"2","key":"12124_CR51","doi-asserted-by":"publisher","first-page":"1519","DOI":"10.1007\/s00521-022-07770-9","volume":"35","author":"Z Wen","year":"2023","unstructured":"Wen Z, Xu H, Ying S (2023) Jsmix: a holistic algorithm for learning with label noise. Neural Comput Appl 35(2):1519\u20131533. https:\/\/doi.org\/10.1007\/s00521-022-07770-9","journal-title":"Neural Comput Appl"},{"key":"12124_CR52","doi-asserted-by":"publisher","first-page":"349","DOI":"10.1016\/j.patrec.2020.03.031","volume":"133","author":"S Das","year":"2020","unstructured":"Das S, Mandal S, Bhoyar A, Bharde M, Ganguly N, Bhattacharya S, Bhattacharya S (2020) Multi-criteria online frame-subset selection for autonomous vehicle videos. Pattern Recogn Lett 133:349\u2013355. https:\/\/doi.org\/10.1016\/j.patrec.2020.03.031","journal-title":"Pattern Recogn Lett"},{"key":"12124_CR53","doi-asserted-by":"publisher","first-page":"530","DOI":"10.1016\/j.isprsjprs.2024.11.005","volume":"218","author":"E De Clerck","year":"2021","unstructured":"De Clerck E, Kov\u00e1cs DD, Berger K, Schlerf M, Verrelst J (2021) Optimizing hybrid models for canopy nitrogen mapping from sentinel-2 in Google Earth Engine. ISPRS J Photogramm Remote Sens 218:530\u2013545. https:\/\/doi.org\/10.1016\/j.isprsjprs.2024.11.005","journal-title":"ISPRS J Photogramm Remote Sens"},{"issue":"6","key":"12124_CR54","doi-asserted-by":"publisher","first-page":"3005","DOI":"10.1007\/s12652-021-03135-7","volume":"13","author":"Y Bai","year":"2022","unstructured":"Bai Y, Bain M (2022) Optimizing weighted lazy learning and Naive Bayes classification using differential evolution algorithm. J Ambient Intell Humaniz Comput 13(6):3005\u20133024. https:\/\/doi.org\/10.1007\/s12652-021-03135-7","journal-title":"J Ambient Intell Humaniz Comput"},{"key":"12124_CR55","doi-asserted-by":"publisher","unstructured":"Sun Z, Shen F, Huang D, Wang Q, Shu X, Yao Y, Tang J (2022) Pnp: Robust learning from noisy labels by probabilistic noise prediction. In: 2022 IEEE\/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5301\u20135310 https:\/\/doi.org\/10.1109\/CVPR52688.2022.00524","DOI":"10.1109\/CVPR52688.2022.00524"},{"issue":"1","key":"12124_CR56","doi-asserted-by":"publisher","first-page":"219","DOI":"10.1007\/s00371-021-02323-y","volume":"39","author":"K Be\u0161eni\u0107","year":"2023","unstructured":"Be\u0161eni\u0107 K, Ahlberg J, Pand\u017ei\u0107 IS (2023) Picking out the bad apples: unsupervised biometric data filtering for refined age estimation. Vis Comput 39(1):219\u2013237. https:\/\/doi.org\/10.1007\/s00371-021-02323-y","journal-title":"Vis Comput"},{"issue":"2","key":"12124_CR57","doi-asserted-by":"publisher","first-page":"689","DOI":"10.1109\/TCYB.2017.2651114","volume":"48","author":"Z Yu","year":"2018","unstructured":"Yu Z, Lu Y, Zhang J, You J, Wong H-S, Wang Y, Han G (2018) Progressive semisupervised learning of multiple classifiers. IEEE Trans Cybern 48(2):689\u2013702. https:\/\/doi.org\/10.1109\/TCYB.2017.2651114","journal-title":"IEEE Trans Cybern"},{"key":"12124_CR58","doi-asserted-by":"publisher","first-page":"2472","DOI":"10.1109\/TASE.2024.3379945","volume":"22","author":"T Yin","year":"2024","unstructured":"Yin T, Zhang W, Kou J, Liu N (2024) Promoting automatic detection of road damage: a high-resolution dataset, a new approach, and a new evaluation criterion. IEEE Trans Autom Sci Eng 22:2472\u20132484. https:\/\/doi.org\/10.1109\/TASE.2024.3379945","journal-title":"IEEE Trans Autom Sci Eng"},{"issue":"1","key":"12124_CR59","doi-asserted-by":"publisher","first-page":"509","DOI":"10.1007\/s11440-023-01950-0","volume":"19","author":"Y Jiang","year":"2024","unstructured":"Jiang Y, Wang W, Zou L, Cao Y (2024) Regional landslide susceptibility assessment based on improved semi-supervised clustering and deep learning. Acta Geotech 19(1):509\u2013529. https:\/\/doi.org\/10.1007\/s11440-023-01950-0","journal-title":"Acta Geotech"},{"key":"12124_CR60","doi-asserted-by":"publisher","DOI":"10.1016\/j.ins.2024.121738","volume":"695","author":"GFA Yeo","year":"2025","unstructured":"Yeo GFA, Hudson I, Akman D, Chan J (2025) Spis: a stochastic approximation approach to minimal subset instance selection. Inf Sci 695:121738. https:\/\/doi.org\/10.1016\/j.ins.2024.121738","journal-title":"Inf Sci"},{"key":"12124_CR61","doi-asserted-by":"publisher","DOI":"10.3390\/electronics12071656","author":"L Radlinski","year":"2023","unstructured":"Radlinski L (2023) The impact of data quality on software testing effort prediction. Electronics. https:\/\/doi.org\/10.3390\/electronics12071656","journal-title":"Electronics"},{"key":"12124_CR62","doi-asserted-by":"publisher","unstructured":"Karim N, Rizve MN, Rahnavard N, Mian A, Shah M (2022) Unicon: Combating label noise through uniform selection and contrastive learning. arXiv preprint arXiv:2203.14542, https:\/\/doi.org\/10.48550\/arXiv.2203.14542","DOI":"10.48550\/arXiv.2203.14542"}],"container-title":["Neural Computing and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-026-12124-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00521-026-12124-w","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00521-026-12124-w.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,6,22]],"date-time":"2026-06-22T18:25:05Z","timestamp":1782152705000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00521-026-12124-w"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,5]]},"references-count":62,"journal-issue":{"issue":"10","published-print":{"date-parts":[[2026,5]]}},"alternative-id":["12124"],"URL":"https:\/\/doi.org\/10.1007\/s00521-026-12124-w","relation":{},"ISSN":["0941-0643","1433-3058"],"issn-type":[{"value":"0941-0643","type":"print"},{"value":"1433-3058","type":"electronic"}],"subject":[],"published":{"date-parts":[[2026,5]]},"assertion":[{"value":"13 October 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 April 2026","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 May 2026","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical Approval"}},{"value":"The authors declare that they have no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflicts of Interest"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Code availability"}},{"value":"Not applicable.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Materials availability"}},{"value":"Not applicable.","order":6,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"Not applicable.","order":7,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}}],"article-number":"408"}}