{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T16:04:13Z","timestamp":1776182653143,"version":"3.50.1"},"reference-count":19,"publisher":"Springer Science and Business Media LLC","issue":"5","license":[{"start":{"date-parts":[[2021,7,31]],"date-time":"2021-07-31T00:00:00Z","timestamp":1627689600000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,7,31]],"date-time":"2021-07-31T00:00:00Z","timestamp":1627689600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001691","name":"Grants in aid for Scientific Research","doi-asserted-by":"crossref","award":["18k11195"],"award-info":[{"award-number":["18k11195"]}],"id":[{"id":"10.13039\/501100001691","id-type":"DOI","asserted-by":"crossref"}]},{"name":"Grants in Aid for Young Scientists","award":["18k17325"],"award-info":[{"award-number":["18k17325"]}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"publisher","award":["P30 CA068485"],"award-info":[{"award-number":["P30 CA068485"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Appl Intell"],"published-print":{"date-parts":[[2022,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>A binary classification problem is common in medical field, and we often use sensitivity, specificity, accuracy, negative and positive predictive values as measures of performance of a binary predictor. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). 
As a single summary measure of a classifier\u2019s performance, <jats:italic>F<\/jats:italic><jats:sub>1<\/jats:sub> score, defined as the harmonic mean of precision and recall, is widely used in the context of information retrieval and information extraction evaluation since it possesses favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the <jats:italic>F<\/jats:italic><jats:sub>1<\/jats:sub> score in binary classification problems; however, they have not been extended to the problem of multi-class classification. There are three types of <jats:italic>F<\/jats:italic><jats:sub>1<\/jats:sub> scores, and statistical properties of these <jats:italic>F<\/jats:italic><jats:sub>1<\/jats:sub> scores have hardly ever been discussed. We propose methods based on the large sample multivariate central limit theorem for estimating <jats:italic>F<\/jats:italic><jats:sub>1<\/jats:sub> scores with confidence intervals.<\/jats:p>","DOI":"10.1007\/s10489-021-02635-5","type":"journal-article","created":{"date-parts":[[2021,7,31]],"date-time":"2021-07-31T04:14:59Z","timestamp":1627704899000},"page":"4961-4972","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":239,"title":["Confidence interval for micro-averaged F1 and macro-averaged F1 
scores"],"prefix":"10.1007","volume":"52","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7745-2342","authenticated-orcid":false,"given":"Kanae","family":"Takahashi","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0696-9659","authenticated-orcid":false,"given":"Kouji","family":"Yamamoto","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6786-3527","authenticated-orcid":false,"given":"Aya","family":"Kuchiba","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-8908-9165","authenticated-orcid":false,"given":"Tatsuki","family":"Koyama","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2021,7,31]]},"reference":[{"key":"2635_CR1","volume-title":"Information retrieval","author":"CJ van Rijsbergen","year":"1979","unstructured":"van Rijsbergen CJ (1979) Information retrieval. Butterworths, Oxford"},{"key":"2635_CR2","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511809071","volume-title":"Introduction to information retrieval","author":"CD Manning","year":"2008","unstructured":"Manning CD, Raghavan P, Sch\u00fctze H (2008) Introduction to information retrieval. Cambridge University Press, Cambridge"},{"key":"2635_CR3","doi-asserted-by":"publisher","first-page":"427","DOI":"10.1016\/j.ipm.2009.03.002","volume":"45","author":"M Sokolova","year":"2009","unstructured":"Sokolova M, Lapalme G (2009) A systematic analysis of performance measures for classification tasks. Inf Process Manage 45:427\u2013437","journal-title":"Inf Process Manage"},{"key":"2635_CR4","doi-asserted-by":"publisher","first-page":"651","DOI":"10.1109\/TKDE.2014.2359667","volume":"27","author":"Y Wang","year":"2015","unstructured":"Wang Y, Li J, Li Y, Wangi R, Yang X (2015) Confidence interval for F1 measure of algorithm performance based on blocked 3 \u00d7 2 cross-validation. 
IEEE Trans Knowl Data Eng 27:651\u2013659","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"2","key":"2635_CR5","doi-asserted-by":"publisher","first-page":"324","DOI":"10.1109\/TNSRE.2017.2733220","volume":"26","author":"H Dong","year":"2018","unstructured":"Dong H, Supratak A, Pan W, Wu C, Matthews PM, Guo Y (2018) Mixed neural network approach for temporal sleep stage classification. IEEE Trans Neural Syst Rehabil Eng 26(2):324\u2013333","journal-title":"IEEE Trans Neural Syst Rehabil Eng"},{"issue":"9 Suppl","key":"2635_CR6","doi-asserted-by":"publisher","first-page":"45","DOI":"10.1186\/s12920-016-0203-8","volume":"2","author":"J Wang","year":"2016","unstructured":"Wang J, Zhang J, An Y, Lin H, Yang Z, Zhang Y, Sun Y (2016) Biomedical event trigger detection by dependency-based word embedding. BMC Med Genomics 2(9 Suppl):45","journal-title":"BMC Med Genomics"},{"key":"2635_CR7","doi-asserted-by":"crossref","unstructured":"Socor\u00f3 JC, Al\u00edas F, Alsina-Pag\u00e8s RM (2017) An anomalous noise events detector for dynamic road traffic noise mapping in real-life urban and suburban environments. Sensors (Basel) 17(10)","DOI":"10.3390\/s17102323"},{"issue":"Suppl 17","key":"2635_CR8","doi-asserted-by":"publisher","first-page":"499","DOI":"10.1186\/s12859-018-2467-9","volume":"19","author":"S Chowdhury","year":"2018","unstructured":"Chowdhury S, Dong X, Qian L, Li X, Guan Y, Yang J, Yu Q (2018) A multitask bi-directional RNN model for named entity recognition on Chinese electronic medical records. BMC Bioinforma 19 (Suppl 17):499","journal-title":"BMC Bioinforma"},{"key":"2635_CR9","doi-asserted-by":"publisher","first-page":"259","DOI":"10.1016\/j.patcog.2017.08.030","volume":"73","author":"A Troya-Galvis","year":"2018","unstructured":"Troya-Galvis A, Gan\u00e7arski P, Berti-\u00c9quille L (2018) Remote sensing image analysis by aggregation of segmentation-classification collaborative agents. 
Pattern Recognit 73:259\u2013274","journal-title":"Pattern Recognit"},{"key":"2635_CR10","doi-asserted-by":"publisher","first-page":"103310","DOI":"10.1016\/j.jbi.2019.103310","volume":"99","author":"N Hong","year":"2019","unstructured":"Hong N, Wen A, Stone DJ, Tsuji S, Kingsbury PR, Rasmussen LV, Pacheco JA, Adekkanattu P, Wang F, Luo Y, Pathak J, Liu H, Jiang G (2019) Developing a FHIR-based EHR phenotyping framework: A case study for identification of patients with obesity and multiple comorbidities from discharge summaries. J Biomed Inform 99:103310","journal-title":"J Biomed Inform"},{"key":"2635_CR11","doi-asserted-by":"publisher","first-page":"105432","DOI":"10.1016\/j.aap.2020.105432","volume":"137","author":"L Li","year":"2020","unstructured":"Li L, Zhong B, Hutmacher C, Liang Y, Horrey WJ, Xu X (2020) Detection of driver manual distraction via image-based hand and ear recognition. Accid Anal Prev 137:105432","journal-title":"Accid Anal Prev"},{"key":"2635_CR12","doi-asserted-by":"crossref","unstructured":"Zhou H, Ma Y, Li X (2020) Feature selection based on term frequency deviation rate for text classification. Appl Intell","DOI":"10.1007\/s10489-020-01937-4"},{"key":"2635_CR13","doi-asserted-by":"crossref","unstructured":"Rashid MM, Kamruzzaman J, Hassan MM, Imam T, Gordon S (2020) Cyberattacks detection in IoT-based smart city applications using machine learning techniques. Int J Environ Res Public Health 17(24)","DOI":"10.3390\/ijerph17249347"},{"key":"2635_CR14","doi-asserted-by":"publisher","first-page":"131","DOI":"10.1016\/j.inffus.2020.11.005","volume":"68","author":"SH Wang","year":"2021","unstructured":"Wang SH, Nayak DR, Guttery DS, Zhang X, Zhang YD (2021) COVID-19 classification by CCSHNet with deep fusion using transfer learning and discriminant correlation analysis. 
Inf Fusion 68:131\u2013148","journal-title":"Inf Fusion"},{"key":"2635_CR15","doi-asserted-by":"crossref","unstructured":"Hao J, Yue K, Zhang B, Duan L, Fu X (2021) Transfer learning of bayesian network for measuring qos of virtual machines. Appl Intell","DOI":"10.1007\/s10489-021-02362-x"},{"key":"2635_CR16","doi-asserted-by":"publisher","first-page":"108265","DOI":"10.1016\/j.anucene.2021.108265","volume":"158","author":"J Li","year":"2021","unstructured":"Li J, Lin M (2021) Ensemble learning with diversified base models for fault diagnosis in nuclear power plants. Ann Nucl Energy 158:108265","journal-title":"Ann Nucl Energy"},{"key":"2635_CR17","doi-asserted-by":"crossref","unstructured":"Zhang D, Wang J, Zhao X (2015) Estimating the uncertainty of average F1 scores. In: Proceedings of the 2015 International conference on the theory of information retrieval","DOI":"10.1145\/2808194.2809488"},{"key":"2635_CR18","doi-asserted-by":"publisher","first-page":"2200106","DOI":"10.1109\/JTEHM.2019.2959331","volume":"8","author":"F Zhu","year":"2020","unstructured":"Zhu F, Li X, Mcgonigle D, Tang H, He Z, Zhang C, Hung GU, Chiu PY, Zhou W (2020) Analyze informant-based questionnaire for the early diagnosis of senile dementia using deep learning. IEEE J Transl Eng Health Med 8:2200106","journal-title":"IEEE J Transl Eng Health Med"},{"issue":"4","key":"2635_CR19","doi-asserted-by":"publisher","first-page":"e0231629","DOI":"10.1371\/journal.pone.0231629","volume":"15","author":"S Bhalla","year":"2020","unstructured":"Bhalla S, Kaur H, Kaur R, Sharma S, Raghava GPS (2020) Expression based biomarkers and models to classify early and late-stage samples of papillary thyroid carcinoma. 
PLoS One 15(4):e0231629","journal-title":"PLoS One"}],"container-title":["Applied Intelligence"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-021-02635-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10489-021-02635-5\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10489-021-02635-5.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,3,5]],"date-time":"2022-03-05T05:14:58Z","timestamp":1646457298000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10489-021-02635-5"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,7,31]]},"references-count":19,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2022,3]]}},"alternative-id":["2635"],"URL":"https:\/\/doi.org\/10.1007\/s10489-021-02635-5","relation":{},"ISSN":["0924-669X","1573-7497"],"issn-type":[{"value":"0924-669X","type":"print"},{"value":"1573-7497","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,7,31]]},"assertion":[{"value":"19 June 2021","order":1,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"31 July 2021","order":2,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"None.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of Interests"}}]}}