{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,3]],"date-time":"2025-09-03T10:05:05Z","timestamp":1756893905472,"version":"3.41.2"},"reference-count":50,"publisher":"Frontiers Media SA","license":[{"start":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T00:00:00Z","timestamp":1731974400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":["frontiersin.org"],"crossmark-restriction":true},"short-container-title":["Front. Artif. Intell."],"abstract":"<jats:p>Active learning enables prediction models to achieve better performance faster by adaptively querying an oracle for the labels of data points. Sometimes the oracle is a human, for example when a medical diagnosis is provided by a doctor. According to the behavioral sciences, people, because they employ heuristics, might sometimes exhibit biases in labeling. How does modeling the oracle as a human heuristic affect the performance of active learning algorithms? If there is a drop in performance, can one design active learning algorithms robust to labeling bias? The present article provides answers. We investigate two established human heuristics (fast-and-frugal tree, tallying model) combined with four active learning algorithms (entropy sampling, multi-view learning, conventional information density, and, our proposal, inverse information density) and three standard classifiers (logistic regression, random forests, support vector machines), and apply their combinations to 15 datasets where people routinely provide labels, such as health and other domains like marketing and transportation. There are two main results. First, we show that if a heuristic provides labels, the performance of active learning algorithms significantly drops, sometimes below random. Hence, it is key to design active learning algorithms that are robust to labeling bias. 
Our second contribution is to provide such a robust algorithm. The proposed inverse information density algorithm, which is inspired by human psychology, achieves an overall improvement of 87% over the best of the other algorithms. In conclusion, designing and benchmarking active learning algorithms can benefit from incorporating the modeling of human heuristics.<\/jats:p>","DOI":"10.3389\/frai.2024.1491932","type":"journal-article","created":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T06:16:24Z","timestamp":1731996984000},"update-policy":"https:\/\/doi.org\/10.3389\/crossmark-policy","source":"Crossref","is-referenced-by-count":1,"title":["Active learning with human heuristics: an algorithm robust to labeling bias"],"prefix":"10.3389","volume":"7","author":[{"given":"Sriram","family":"Ravichandran","sequence":"first","affiliation":[]},{"given":"Nandan","family":"Sudarsanam","sequence":"additional","affiliation":[]},{"given":"Balaraman","family":"Ravindran","sequence":"additional","affiliation":[]},{"given":"Konstantinos V.","family":"Katsikopoulos","sequence":"additional","affiliation":[]}],"member":"1965","published-online":{"date-parts":[[2024,11,19]]},"reference":[{"key":"B1","first-page":"256","article-title":"\u201cImpacts of behavioral biases on active learning strategies,\u201d","author":"Agarwal","year":"2022","journal-title":"International Conference On Artificial Intelligence in Information And Communication (ICAIIC)"},{"key":"B2","doi-asserted-by":"publisher","first-page":"1289","DOI":"10.1287\/opre.1070.0485","article-title":"Cumulative dominance and heuristic performance in binary multiattribute choice","volume":"56","author":"Baucells","year":"2008","journal-title":"Oper. Res"},{"key":"B3","doi-asserted-by":"publisher","first-page":"1039","DOI":"10.1007\/s10994-017-5633-9","article-title":"Optimal classification trees","volume":"106","author":"Bertsimas","year":"2017","journal-title":"Mach. 
Learn"},{"volume-title":"Classification and Regression Trees","year":"1984","author":"Breiman","key":"B4"},{"key":"B5","doi-asserted-by":"publisher","first-page":"105233","DOI":"10.1016\/j.cognition.2022.105233","article-title":"Humans adaptively resolve the explore-exploit dilemma under cognitive constraints: evidence from a multi-armed bandit task","volume":"229","author":"Brown","year":"2022","journal-title":"Cognition"},{"key":"B6","doi-asserted-by":"publisher","DOI":"10.30855\/gmbd.2020.03.03","article-title":"Classification of raisin grains using machine vision and artificial intelligence methods","author":"Cinar","year":"2020","journal-title":"Comp. Sci. Agricult. Food Sci"},{"key":"B7","doi-asserted-by":"crossref","first-page":"201","DOI":"10.1007\/BF00993277","article-title":"Improving generalization with active learning","volume":"15","author":"Cohn","year":"1994","journal-title":"Mach. Learn"},{"key":"B8","doi-asserted-by":"publisher","first-page":"571","DOI":"10.1037\/0003-066X.34.7.571","article-title":"The robust beauty of improper linear models in decision making","volume":"34","author":"Dawes","year":"1979","journal-title":"Am. 
Psychol"},{"key":"B9","first-page":"797","article-title":"\u201cActive learning with human-like noisy oracle,\u201d","volume-title":"IEEE International Conference On Data Mining","author":"Du","year":"2010"},{"key":"B10","doi-asserted-by":"crossref","DOI":"10.1093\/acprof:oso\/9780199744282.001.0001","volume-title":"Heuristics: The Foundations of Adaptive Behavior","author":"Gigerenzer","year":"2011"},{"key":"B11","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511808098","volume-title":"Heuristics and Biases: The Psychology of Intuitive Judgment","author":"Gilovich","year":"2002"},{"key":"B12","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-642-21738-8_21","article-title":"\u201cLearning from multiple annotators with Gaussian processes,\u201d","author":"Groot","year":"2011","journal-title":"Artificial Neural Networks And Machine Learning"},{"key":"B13","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-08422-0_41","article-title":"\u201cActive learning based on random forest and its application to terrain classification,\u201d","author":"Gu","year":"2014","journal-title":"Progress in Systems Engineering. Advances in Intelligent Systems and Computing, Vol. 366"},{"key":"B14","first-page":"91","article-title":"\u201cPersonalized active learning for collaborative filtering,\u201d","author":"Harpale","year":"2008","journal-title":"ACM SIGIR 2008"},{"key":"B15","doi-asserted-by":"publisher","first-page":"6453","DOI":"10.1007\/s10994-024-06567-2","article-title":"Evidential uncertainty sampling strategies for active learning","volume":"113","author":"Hoarau","year":"2024","journal-title":"Mach. Learn"},{"key":"B16","unstructured":"\u201cLearning and applying case adaptation rules for classification: an ensemble approach,\u201d Jalali V., Leake D. 
B., Forouzandehmehr N. International Joint Conference on Artificial Intelligence, 2017"},{"key":"B17","doi-asserted-by":"crossref","DOI":"10.1017\/CBO9780511809477","volume-title":"Judgment under Uncertainty: Heuristics and Biases","author":"Kahneman","year":"1982"},{"key":"B18","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1287\/deca.1100.0191","article-title":"Psychological heuristics for making inferences: definition, performance, and the emerging theory and practice","volume":"8","author":"Katsikopoulos","year":"2011","journal-title":"Deci. Analy"},{"key":"B19","doi-asserted-by":"publisher","first-page":"327","DOI":"10.1287\/deca.2013.0281","article-title":"Why Do Simple Heuristics Perform Well in Choices with Binary Attributes?","volume":"10","author":"Katsikopoulos","year":"2013","journal-title":"Deci. Analy"},{"volume-title":"Classification in the Wild: The Science and Art of Transparent Decision Making","year":"2020","author":"Katsikopoulos","key":"B20"},{"key":"B21","unstructured":"Kelly M., Longjohn R., Nottingham K."},{"key":"B22","doi-asserted-by":"publisher","first-page":"1132","DOI":"10.1002\/widm.1132","article-title":"Active learning with support vector machines","volume":"4","author":"Kremer","year":"2014","journal-title":"Wiley Interdisc. Rev.: Data Mining Knowl. Discov"},{"key":"B23","doi-asserted-by":"publisher","first-page":"13779","DOI":"10.1007\/s11042-022-12941-w","article-title":"Adversarial image perturbations with distortions weighted by color on deep neural networks","volume":"82","author":"Kwon","year":"2023","journal-title":"Multimed. 
Tools Appl"},{"key":"B24","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/ACCESS.2023.3245632","article-title":"Dual-mode method for generating adversarial examples to attack deep neural networks","volume":"1","author":"Kwon","year":"2023","journal-title":"IEEE Access"},{"key":"B25","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/ACCESS.2022.3216075","article-title":"Audio adversarial example detection using the audio style transfer learning method","volume":"2022","author":"Kwon","year":"2022","journal-title":"IEEE Access"},{"key":"B26","doi-asserted-by":"publisher","first-page":"4511510","DOI":"10.1155\/2022\/4511510","article-title":"Textual adversarial training of machine learning model for resistance to adversarial examples","volume":"12","author":"Kwon","year":"2022","journal-title":"Secur. Commun. Networ"},{"key":"B27","doi-asserted-by":"publisher","first-page":"123582","DOI":"10.1016\/j.eswa.2024.123582","article-title":"Active learning inspired method in generative models","volume":"249","author":"Lan","year":"2024","journal-title":"Expert Syst. Appl"},{"key":"B28","doi-asserted-by":"publisher","first-page":"120786","DOI":"10.1016\/j.ins.2024.120786","article-title":"Data-efficient software defect prediction: a comparative analysis of active learning-enhanced models and voting ensembles","volume":"676","author":"Liapis","year":"2024","journal-title":"Inf. Sci"},{"key":"B29","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.2307.02719","article-title":"Understanding uncertainty sampling","author":"Liu","year":"2023","journal-title":"arXiv"},{"key":"B30","doi-asserted-by":"publisher","first-page":"352","DOI":"10.1016\/j.jmp.2008.04.003","article-title":"Categorization with limited resources: a family of simple heuristics","volume":"52","author":"Martignon","year":"2008","journal-title":"J. Math. 
Psychol"},{"key":"B31","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1016\/0004-3702(82)90040-6","article-title":"Generalization as search","volume":"18","author":"Mitchell","year":"1982","journal-title":"Artif. Intell"},{"key":"B32","doi-asserted-by":"publisher","first-page":"1898","DOI":"10.3390\/math12121898","article-title":"Exploring data augmentation and active learning benefits in imbalanced datasets","volume":"12","author":"Moles","year":"2024","journal-title":"Mathematics"},{"volume-title":"Human-in-the-Loop Machine Learning: Active Learning and Annotation for Human-Centered AI","year":"2021","author":"Monarch","key":"B33"},{"key":"B34","doi-asserted-by":"publisher","first-page":"203","DOI":"10.1613\/jair.2005","article-title":"Active learning with multiple views","volume":"27","author":"Muslea","year":"2006","journal-title":"J. Artif. Intellig. Res"},{"key":"B35","doi-asserted-by":"publisher","first-page":"344","DOI":"10.1017\/S1930297500006239","article-title":"FFTrees: a toolbox to create, visualize, and evaluate fast-and-frugal decision trees","volume":"12","author":"Phillips","year":"2017","journal-title":"Judgm. Decis. Mak"},{"key":"B36","first-page":"1655","article-title":"Active Learning with Feedback on Features and Instances","volume":"7","author":"Raghavan","year":"2006","journal-title":"J. Mach. Learn. Res"},{"key":"B37","first-page":"18310","article-title":"\u201cConvergence of uncertainty sampling for active learning,\u201d","volume-title":"Proceedings of the 39th International Conference on Machine Learning, Proceedings of Machine Learning Research, Vol. 162","author":"Raj","year":"2022"},{"key":"B38","doi-asserted-by":"publisher","first-page":"1398844","DOI":"10.3389\/frai.2024.1398844","article-title":"Semi-supervised active learning using convolutional auto-encoder and contrastive learning","volume":"7","author":"Roda","year":"2024","journal-title":"Front. Artif. 
Intellig"},{"volume-title":"Active Learning Literature Survey","year":"2009","author":"Settles","key":"B39"},{"key":"B40","first-page":"1070","article-title":"\u201cAn analysis of active learning strategies for sequence labeling tasks,\u201d","volume-title":"Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing","author":"Settles","year":"2008"},{"key":"B41","doi-asserted-by":"publisher","first-page":"379","DOI":"10.1002\/j.1538-7305.1948.tb01338.x","article-title":"Mathematical theory of communication","volume":"27","author":"Shannon","year":"1948","journal-title":"Bell Syst. Tech. J"},{"key":"B42","doi-asserted-by":"crossref","first-page":"614","DOI":"10.1145\/1401890.1401965","article-title":"\u201cGet another label? improving data quality and data mining using multiple, noisy labelers,\u201d","author":"Sheng","year":"2008","journal-title":"Proceedings Of The 14th ACM SIGKDD International Conference On Knowledge Discovery And Data Mining"},{"key":"B43","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1146\/annurev.ps.41.020190.000245","article-title":"Invariants of human behavior","volume":"41","author":"Simon","year":"1990","journal-title":"Annu. Rev. Psychol"},{"key":"B44","article-title":"\u201cLinear decision rule as aspiration for simple decision heuristics,\u201d","author":"Sim\u015fek","year":"2013","journal-title":"Part of Advances in Neural Information Processing Systems 26 (NIPS 2013"},{"journal-title":"Human Behavior in Contextual","year":"2015","author":"Stoji\u0107","key":"B45"},{"key":"B46","doi-asserted-by":"publisher","first-page":"108605","DOI":"10.1016\/j.compbiomed.2024.108605","article-title":"Exploring UMAP in hybrid models of entropy-based and representativeness sampling for active learning in biomedical segmentation","volume":"176","author":"Tan","year":"2024","journal-title":"Comput. Biol. 
Med"},{"volume-title":"Simple Heuristics That Make Us Smart","year":"1999","author":"Todd","key":"B47"},{"key":"B48","first-page":"1162","article-title":"Advances in active learning algorithms based on sampling strategy","volume":"49","author":"Wu","year":"2012","journal-title":"Jisuanji Yanjiu Yu Fazhan\/Computer Res. Dev"},{"key":"B49","unstructured":"On the bias of precision estimation under separate sampling. Xie S., Braga-Neto U. M. Cancer Inform, 2019"},{"key":"B50","doi-asserted-by":"publisher","first-page":"401","DOI":"10.1016\/j.patcog.2018.06.004","article-title":"A benchmark and comparison of active learning for logistic regression","volume":"83","author":"Yang","year":"2018","journal-title":"Pattern Recognit"}],"container-title":["Frontiers in Artificial Intelligence"],"original-title":[],"link":[{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2024.1491932\/full","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,19]],"date-time":"2024-11-19T06:16:34Z","timestamp":1731996994000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.frontiersin.org\/articles\/10.3389\/frai.2024.1491932\/full"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,11,19]]},"references-count":50,"alternative-id":["10.3389\/frai.2024.1491932"],"URL":"https:\/\/doi.org\/10.3389\/frai.2024.1491932","relation":{},"ISSN":["2624-8212"],"issn-type":[{"type":"electronic","value":"2624-8212"}],"subject":[],"published":{"date-parts":[[2024,11,19]]},"article-number":"1491932"}}