{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,12,18]],"date-time":"2025-12-18T14:29:45Z","timestamp":1766068185832,"version":"3.44.0"},"reference-count":54,"publisher":"Springer Science and Business Media LLC","issue":"12","license":[{"start":{"date-parts":[[2025,7,16]],"date-time":"2025-07-16T00:00:00Z","timestamp":1752624000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,7,16]],"date-time":"2025-07-16T00:00:00Z","timestamp":1752624000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>Positive-Unlabelled (PU) learning is a field of machine learning that aims to learn classifiers from data consisting of labelled positive and unlabelled instances, which can be in reality positive or negative, but whose label is unknown. Many PU learning methods have been proposed over the last two decades, so many that selecting an optimal method for a given PU learning task presents a challenge. Our previous work has addressed this by proposing GA-Auto-PU, the first Automated Machine Learning (Auto-ML) system for PU learning. In this work, we propose two new PU learning Auto-ML systems: BO-Auto-PU, based on a Bayesian Optimisation (BO) approach, and EBO-Auto-PU, based on a novel evolutionary\/BO approach. We present an extensive evaluation of the three Auto-ML systems, comparing them to each other and to well-established PU learning methods across 60 datasets (20 datasets, each with 3 versions). 
The results of the comparison show statistically significant improvements in predictive accuracy over the baseline methods, as well as large improvements in computational time for the newly proposed Auto-PU systems over the original Auto-PU system.<\/jats:p>","DOI":"10.1007\/s10489-025-06706-9","type":"journal-article","created":{"date-parts":[[2025,7,16]],"date-time":"2025-07-16T12:03:16Z","timestamp":1752667396000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Automated machine learning for positive-unlabelled learning"],"prefix":"10.1007","volume":"55","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0801-2909","authenticated-orcid":false,"given":"Jack D.","family":"Saunders","sequence":"first","affiliation":[]},{"given":"Alex A.","family":"Freitas","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,7,16]]},"reference":[{"issue":"4","key":"6706_CR1","doi-asserted-by":"publisher","first-page":"719","DOI":"10.1007\/s10994-020-05877-5","volume":"109","author":"J Bekker","year":"2020","unstructured":"Bekker J, Davis J (2020) Learning from positive and unlabeled data: A survey. Mach Learn 109(4):719\u2013760","journal-title":"Mach Learn"},{"key":"6706_CR2","doi-asserted-by":"crossref","unstructured":"Jaskie K, Spanias A (2019) Positive and unlabeled learning algorithms and applications: A survey. In: 2019 10th International conference on information, intelligence, systems and applications (IISA), pp 1\u20138. IEEE","DOI":"10.1109\/IISA.2019.8900698"},{"key":"6706_CR3","doi-asserted-by":"crossref","unstructured":"Zhang Y, Li L, Zhou J, et al (2017) Poster: A PU learning based system for potential malicious URL detection. 
In: Proceedings of the ACM conference on computer and communications security, pp 2599\u20132601","DOI":"10.1145\/3133956.3138825"},{"key":"6706_CR4","doi-asserted-by":"crossref","unstructured":"Luo Y, Cheng S, Liu C, et al (2018) PU learning in payload-based web anomaly detection. In: Proceedings of the third international conference on security of smart cities, industrial control system and communications, pp 1\u20135","DOI":"10.1109\/SSIC.2018.8556662"},{"issue":"20","key":"6706_CR5","doi-asserted-by":"publisher","first-page":"2640","DOI":"10.1093\/bioinformatics\/bts504","volume":"28","author":"P Yang","year":"2012","unstructured":"Yang P, Li X, Mei K et al (2012) Positive-unlabelled learning for disease gene identification. Bioinformatics 28(20):2640\u20132647","journal-title":"Bioinformatics"},{"key":"6706_CR6","doi-asserted-by":"publisher","first-page":"102","DOI":"10.1016\/j.jbi.2018.03.006","volume":"81","author":"O Nikdelfaz","year":"2018","unstructured":"Nikdelfaz O, Jalili S (2018) Disease genes prediction by HMM based PU-learning using gene expression profiles. J Biomed Inf 81:102\u2013111","journal-title":"J Biomed Inf"},{"key":"6706_CR7","doi-asserted-by":"publisher","first-page":"23","DOI":"10.1016\/j.compbiolchem.2018.05.022","volume":"76","author":"A Vasighizaker","year":"2018","unstructured":"Vasighizaker A, Jalili S (2018) C-PUGP: A cluster-based positive unlabeled learning method for disease gene prediction and prioritization. Comput Biol Chem 76:23\u201331","journal-title":"Comput Biol Chem"},{"key":"6706_CR8","unstructured":"Liu B, Lee WS, Yu PS, Li X (2002) Partially supervised classification of text documents. ICML, vol 2. NSW, Sydney, pp 387\u2013394"},{"key":"6706_CR9","doi-asserted-by":"crossref","unstructured":"Ke T, Yang B, Zhen L et al (2012) Building high-performance classifiers using positive and unlabelled examples for text. 
In: International Symposium on Neural Networks, pp 187\u2013195","DOI":"10.1007\/978-3-642-31362-2_21"},{"key":"6706_CR10","first-page":"1463","volume":"30","author":"L Liu","year":"2014","unstructured":"Liu L, Peng T (2014) Clustering-based method for positive and unlabelled text categorization enhanced by improved TFIDF. J Inf Sci Eng 30:1463\u20131481","journal-title":"J Inf Sci Eng"},{"key":"6706_CR11","doi-asserted-by":"crossref","unstructured":"Saunders JD, Freitas AA (2022) GA-Auto-PU: A genetic algorithm-based automated machine learning system for positive-unlabeled learning. In: Proceedings of the genetic and evolutionary computation conference companion. GECCO \u201922, pp 288\u2013291. ACM","DOI":"10.1145\/3520304.3528932"},{"issue":"4","key":"6706_CR12","doi-asserted-by":"publisher","first-page":"1425","DOI":"10.1093\/bib\/bbz080","volume":"21","author":"X Zeng","year":"2020","unstructured":"Zeng X, Zhong Y, Lin W, Zou Q (2020) Predicting disease-associated circular RNAs using deep forests combined with positive-unlabeled learning methods. Briefings in Bioinformatics 21(4):1425\u20131436","journal-title":"Briefings in Bioinformatics"},{"key":"6706_CR13","unstructured":"Li XL, Zhang L, Liu B, Ng SK (2010) Distributional similarity vs. PU learning for entity set expansion. In: Proceedings of the ACL 2010 conference short papers, pp 359\u2013364"},{"key":"6706_CR14","unstructured":"Xia R, Hu X, Lu J, et al (2013) Instance selection and instance weighting for cross-domain sentiment classification via PU learning. In: Twenty-third international joint conference on artificial intelligence, pp 2176\u20132182"},{"key":"6706_CR15","doi-asserted-by":"publisher","first-page":"2373","DOI":"10.1007\/s10489-017-1076-z","volume":"48","author":"T Ke","year":"2018","unstructured":"Ke T, Jing L, Lv H et al (2018) Global and local learning from positive and unlabeled examples. 
Appl Intell 48:2373\u20132392","journal-title":"Appl Intell"},{"issue":"5","key":"6706_CR16","doi-asserted-by":"publisher","first-page":"1332","DOI":"10.1109\/TMM.2018.2871421","volume":"21","author":"J Zhang","year":"2018","unstructured":"Zhang J, Wang Z, Meng J et al (2018) Boosting positive and unlabeled learning for anomaly detection with multi-features. IEEE Trans Multimed 21(5):1332\u20131344","journal-title":"IEEE Trans Multimed"},{"key":"6706_CR17","doi-asserted-by":"crossref","unstructured":"Schrunner S, Geiger B.C, Zernig A, Kern R (2020) A generative semi-supervised classifier for datasets with unknown classes. In: Proceedings of the 35th annual ACM symposium on applied computing, pp 1066\u20131074","DOI":"10.1145\/3341105.3373890"},{"key":"6706_CR18","doi-asserted-by":"crossref","unstructured":"Liu B, Liu Z, Xiao Y (2021) A new dictionary-based positive and unlabeled learning method. Applied Intelligence, pp 1\u201315","DOI":"10.1007\/s10489-021-02344-z"},{"key":"6706_CR19","doi-asserted-by":"crossref","unstructured":"He Y, Li X, Zhang M, et al (2023) A novel observation points-based positive-unlabeled learning algorithm. Chinese Association for Artificial Intelligence Transactions on Intelligence Technology, pp 1\u201319","DOI":"10.1049\/cit2.12152"},{"issue":"1","key":"6706_CR20","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1093\/nsr\/nwy108","volume":"6","author":"Z Zhou","year":"2019","unstructured":"Zhou Z, Feng J (2019) Deep forest. National Sci Rev 6(1):74\u201386","journal-title":"National Sci Rev"},{"key":"6706_CR21","doi-asserted-by":"crossref","unstructured":"Elkan C, Noto K (2008) Learning classifiers from only positive and unlabeled data. 
In: Proceedings of the 14th ACM SIGKDD international conference on knowledge discovery and data mining, pp 213\u2013220","DOI":"10.1145\/1401890.1401920"},{"key":"6706_CR22","doi-asserted-by":"crossref","unstructured":"Japkowicz N, Shah M (2011) Evaluating Learning Algorithms: A Classification Perspective. Cambridge University Press","DOI":"10.1017\/CBO9780511921803"},{"issue":"2","key":"6706_CR23","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1145\/3575637.3575642","volume":"24","author":"JD Saunders","year":"2022","unstructured":"Saunders JD, Freitas AA (2022) Evaluating the predictive performance of positive-unlabelled classifiers: a brief critical review and practical recommendations for improvement. ACM SIGKDD Explorations Newsletter 24(2):5\u201311","journal-title":"ACM SIGKDD Explorations Newsletter"},{"key":"6706_CR24","unstructured":"Yao Q, Wang M, Chen Y, Dai W, Li Y, Tu W, Yang Q, Yu Y (2018) Taking human out of learning applications: A survey on automated machine learning. arXiv:1810.13306"},{"key":"6706_CR25","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2020.106622","volume":"212","author":"X He","year":"2021","unstructured":"He X, Zhao K, Chu X (2021) AutoML: A survey of the state-of-the-art. Knowl-Based Syst 212:106622","journal-title":"Knowl-Based Syst"},{"key":"6706_CR26","doi-asserted-by":"publisher","first-page":"409","DOI":"10.1613\/jair.1.11854","volume":"70","author":"M Z\u00f6ller","year":"2021","unstructured":"Z\u00f6ller M, Huber M (2021) Benchmark and survey of automated machine learning frameworks. J Artif Intell Res 70:409\u2013472","journal-title":"J Artif Intell Res"},{"key":"6706_CR27","doi-asserted-by":"crossref","unstructured":"Eiben A, Smith J (2003) Introduction to Evolutionary Computing, vol 53. 
Springer","DOI":"10.1007\/978-3-662-05094-1"},{"key":"6706_CR28","doi-asserted-by":"crossref","unstructured":"Olson R, Bartley N, Urbanowicz R, Moore J (2016) Evaluation of a tree-based pipeline optimization tool for automating data science. In: Proceedings of the genetic and evolutionary computation conference vol 2016, pp 485\u2013492","DOI":"10.1145\/2908812.2908918"},{"key":"6706_CR29","doi-asserted-by":"crossref","unstructured":"S\u00e1 A, Pinto W.G, Oliveira L, Pappa G (2017) RECIPE: a grammar-based framework for automatically evolving classification pipelines. In: European conference on genetic programming, pp 246\u2013261. Springer","DOI":"10.1007\/978-3-319-55696-3_16"},{"key":"6706_CR30","doi-asserted-by":"crossref","unstructured":"Saunders JD, Freitas AA (2022) Evaluating a new genetic algorithm for automated machine learning in positive-unlabelled learning. In: Artificial evolution - 15th international conference. Lecture Notes in Computer Science, vol 14091, pp 42\u201357. Springer","DOI":"10.1007\/978-3-031-42616-2_4"},{"key":"6706_CR31","doi-asserted-by":"crossref","unstructured":"Frazier P (2018) A tutorial on bayesian optimization. arXiv:1807.02811","DOI":"10.1287\/educ.2018.0188"},{"key":"6706_CR32","doi-asserted-by":"crossref","unstructured":"Mo\u010dkus J (1975) On bayesian methods for seeking the extremum. In: Optimization techniques IFIP technical conference, pp 400\u2013404","DOI":"10.1007\/978-3-662-38527-2_55"},{"issue":"1","key":"6706_CR33","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1145\/3425501","volume":"1","author":"G De Ath","year":"2021","unstructured":"De Ath G, Everson R, Rahat A, Fieldsend J (2021) Greed is good: Exploration and exploitation trade-offs in bayesian optimization. 
ACM Trans Evol Learn Optim 1(1):1\u201322","journal-title":"ACM Trans Evol Learn Optim"},{"issue":"2","key":"6706_CR34","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1016\/j.swevo.2011.05.001","volume":"1","author":"Y Jin","year":"2011","unstructured":"Jin Y (2011) Surrogate-assisted evolutionary computation: Recent advances and future challenges. Swarm Evol Comput 1(2):61\u201370","journal-title":"Swarm Evol Comput"},{"key":"6706_CR35","unstructured":"Asuncion A, Newman D (2007) UCI machine learning repository"},{"issue":"12","key":"6706_CR36","doi-asserted-by":"publisher","first-page":"2677","DOI":"10.1162\/jocn.2009.21407","volume":"22","author":"D Marcus","year":"2010","unstructured":"Marcus D, Fotenos A, Csernansky J, Morris J, Buckner R (2010) Open access series of imaging studies: longitudinal mri data in nondemented and demented older adults. Journal of Cognitive Neuroscience 22(12):2677\u20132684","journal-title":"Journal of Cognitive Neuroscience"},{"issue":"1","key":"6706_CR37","first-page":"1","volume":"7","author":"B Pereira","year":"2016","unstructured":"Pereira B, Chin S, Rueda O, Vollan H, Provenzano E, Bardwell H, Pugh M, Jones L, Russell R, Sammut S et al (2016) The somatic mutation profiles of 2,433 breast cancers refine their genomic and transcriptomic landscapes. Nat Commun 7(1):1\u201316","journal-title":"Nat Commun"},{"key":"6706_CR38","volume-title":"Counting Processes and Survival Analysis","author":"T Fleming","year":"1991","unstructured":"Fleming T, Harrington D (1991) Counting Processes and Survival Analysis. Wiley"},{"key":"6706_CR39","doi-asserted-by":"crossref","unstructured":"Islam M, Ferdousi R, Rahman S, Bushra H (2020) Likelihood prediction of diabetes at early stage using data mining techniques. In: Computer Vision and Machine Intelligence in Medical Image Analysis, pp 113\u2013125. 
Springer","DOI":"10.1007\/978-981-13-8798-2_12"},{"issue":"1","key":"6706_CR40","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12911-020-1023-5","volume":"20","author":"D Chicco","year":"2020","unstructured":"Chicco D, Jurman G (2020) Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Medical Informatics and Decision Making 20(1):1\u201316","journal-title":"BMC Medical Informatics and Decision Making"},{"issue":"1","key":"6706_CR41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-017-00047-5","volume":"7","author":"J Hlavni\u010dka","year":"2017","unstructured":"Hlavni\u010dka J, \u010cmejla R, Tykalov\u00e1 T, \u0160onka K, R\u016f\u017ei\u010dka E, Rusz J (2017) Automated analysis of connected speech reveals early biomarkers of parkinson\u2019s disease in patients with rapid eye movement sleep behaviour disorder. Scientific Reports 7(1):1\u201310","journal-title":"Scientific Reports"},{"key":"6706_CR42","doi-asserted-by":"crossref","unstructured":"Emon M, Keya M, Meghla T, Rahman M, Al Mamun M, Kaiser M (2020) Performance analysis of machine learning approaches in stroke prediction. In: 2020 4th International conference on electronics, communication and aerospace technology (ICECA), pp 1464\u20131469. IEEE","DOI":"10.1109\/ICECA49313.2020.9297525"},{"key":"6706_CR43","doi-asserted-by":"crossref","unstructured":"He J, Zhang Y, Li X, Wang Y (2010) Naive bayes classifier for positive unlabeled learning with uncertainty. In: Proceedings of the 2010 SIAM international conference on data mining, pp 361\u2013372. SIAM","DOI":"10.1137\/1.9781611972801.32"},{"key":"6706_CR44","doi-asserted-by":"crossref","unstructured":"Basile T, Mauro N, Esposito F, Ferilli S, Vergari A (2017) Density estimators for positive-unlabeled learning. In: International workshop on new frontiers in mining complex patterns, pp 49\u201364. 
Springer","DOI":"10.1007\/978-3-319-78680-3_4"},{"issue":"3","key":"6706_CR45","first-page":"1","volume":"12","author":"B Shinkins","year":"2017","unstructured":"Shinkins B, Nicholson BD, Primrose J et al (2017) The diagnostic accuracy of a single CEA blood test in detecting colorectal cancer recurrence: Results from the FACS trial. Public Library of Science One 12(3):1\u201311","journal-title":"Public Library of Science One"},{"issue":"10","key":"6706_CR46","doi-asserted-by":"publisher","first-page":"1551","DOI":"10.1002\/mds.27485","volume":"33","author":"T Beach","year":"2018","unstructured":"Beach T, Adler C (2018) Importance of low diagnostic accuracy for early parkinson\u2019s disease. Movement Disorders 33(10):1551\u20131554","journal-title":"Movement Disorders"},{"issue":"1","key":"6706_CR47","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1472-6947-1-6","volume":"1","author":"G De Bruyn","year":"2001","unstructured":"De Bruyn G, Graviss E (2001) A systematic review of the diagnostic accuracy of physical examination for the detection of cirrhosis. BMC Medical Informatics and Decision Making 1(1):1\u201311","journal-title":"BMC Medical Informatics and Decision Making"},{"issue":"4","key":"6706_CR48","first-page":"616","volume":"47","author":"H Palmedo","year":"2006","unstructured":"Palmedo H, Bucerius J, Joe A et al (2006) Integrated PET\/CT in differentiated thyroid cancer: Diagnostic accuracy and impact on patient management. Journal of Nuclear Medicine 47(4):616\u2013624","journal-title":"Journal of Nuclear Medicine"},{"issue":"5","key":"6706_CR49","doi-asserted-by":"publisher","first-page":"984","DOI":"10.2214\/AJR.15.15785","volume":"207","author":"E Nerad","year":"2016","unstructured":"Nerad E, Lahaye MJ, Maas M et al (2016) Diagnostic accuracy of CT for local staging of colon cancer: a systematic review and meta-analysis. 
American Journal of Roentgenology 207(5):984\u2013995","journal-title":"American Journal of Roentgenology"},{"issue":"5","key":"6706_CR50","doi-asserted-by":"publisher","first-page":"862","DOI":"10.4103\/jcrt.JCRT_678_17","volume":"13","author":"Y Zhang","year":"2017","unstructured":"Zhang Y, Ren H (2017) Meta-analysis of diagnostic accuracy of magnetic resonance imaging and mammography for breast cancer. Journal of Cancer Research and Therapeutics 13(5):862\u2013868","journal-title":"Journal of Cancer Research and Therapeutics"},{"issue":"1","key":"6706_CR51","doi-asserted-by":"publisher","first-page":"2","DOI":"10.1007\/978-3-031-79178-9","volume":"16","author":"K Jaskie","year":"2022","unstructured":"Jaskie K, Spanias A (2022) Positive unlabeled learning. Synthesis Lectures Artif Intell Mach Learn 16(1):2\u2013152","journal-title":"Synthesis Lectures Artif Intell Mach Learn"},{"key":"6706_CR52","unstructured":"Wilcoxon F, Katti S, Wilcox R (1963) Critical Values and Probability Levels for the Wilcoxon Rank Sum Test and the Wilcoxon Signed Rank Test vol 1. American Cyanamid"},{"key":"6706_CR53","first-page":"1","volume":"7","author":"J Dem\u0161ar","year":"2006","unstructured":"Dem\u0161ar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1\u201330","journal-title":"J Mach Learn Res"},{"issue":"5","key":"6706_CR54","doi-asserted-by":"publisher","first-page":"1763","DOI":"10.1213\/ANE.0000000000002864","volume":"126","author":"P Schober","year":"2018","unstructured":"Schober P, Boer C, Schwarte LA (2018) Correlation coefficients: Appropriate use and interpretation. 
Anesthesia & Analgesia 126(5):1763\u20131768","journal-title":"Anesthesia & Analgesia"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-025-06706-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10489-025-06706-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-025-06706-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T15:55:42Z","timestamp":1758297342000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10489-025-06706-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,16]]},"references-count":54,"journal-issue":{"issue":"12","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["6706"],"URL":"https:\/\/doi.org\/10.1007\/s10489-025-06706-9","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"type":"print","value":"0924-669X"},{"type":"electronic","value":"1573-7497"}],"subject":[],"published":{"date-parts":[[2025,7,16]]},"assertion":[{"value":"8 June 2025","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"16 July 2025","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing Interests"}},{"value":"This article does not involve any studies with human participants or animals performed by any 
of the authors.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical and informed consent for data used"}}],"article-number":"875"}}