{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,1,24]],"date-time":"2026-01-24T15:58:33Z","timestamp":1769270313419,"version":"3.49.0"},"reference-count":31,"publisher":"Springer Science and Business Media LLC","issue":"9","license":[{"start":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T00:00:00Z","timestamp":1756166400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T00:00:00Z","timestamp":1756166400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/100018693","name":"HORIZON EUROPE Framework Programme","doi-asserted-by":"publisher","award":["101094326"],"award-info":[{"award-number":["101094326"]}],"id":[{"id":"10.13039\/100018693","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Ente per le Nuove Tecnologie, l'Energia e l'Ambiente"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Scientometrics"],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>\n                    The role of women in modern society is a central problem in several developed countries. Despite encouraging policies, women\u2019s participation in STEM fields is significantly lower than men\u2019s one. In order to develop solutions for mitigating this disparity, a deeper understanding of the underlying causes is crucial and a proper quantification of the phenomenon represents a first step to any analysis. While the problem of gender gap in scientific communities was long debated, information on authors\u2019 genders is often unavailable (see, for instance, ResearchGate and Scopus). 
Additionally, the lack of open-source software for automated name-based gender prediction calls for time-consuming manual effort, hence the need for novel, effective algorithms. Moreover, as a further challenge, such software should guarantee gender fairness by providing the same performance when recognising male and female names. In this paper, we propose gender-fair software to automatically predict authors\u2019 gender from their given names. The code leverages most of the existing information sources, i.e., Scopus, Semantic Scholar, and the Harvard dataset. We performed an experimental application by analysing two datasets of publications, thus providing interesting insights. Finally, we evaluated the software\u2019s performance in terms of accuracy, precision, recall,\n                    <jats:inline-formula>\n                      <jats:alternatives>\n                        <jats:tex-math>$$\\text {F1-score}$$<\/jats:tex-math>\n                        <mml:math xmlns:mml=\"http:\/\/www.w3.org\/1998\/Math\/MathML\">\n                          <mml:mtext>F1-score<\/mml:mtext>\n                        <\/mml:math>\n                      <\/jats:alternatives>\n                    <\/jats:inline-formula>\n                    , and gender fairness by means of two distinct case studies. 
The proposed solution can enable fairer gender prediction by combining open data with carefully calibrated criteria, matching the performance of commercial tools while offering a transparent and accessible solution.\n                  <\/jats:p>","DOI":"10.1007\/s11192-025-05384-1","type":"journal-article","created":{"date-parts":[[2025,8,26]],"date-time":"2025-08-26T18:29:09Z","timestamp":1756232949000},"page":"4849-4877","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Improving fair name-based prediction of gender in scientific communities"],"prefix":"10.1007","volume":"130","author":[{"given":"Maria","family":"Guariglia Migliore","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Gregorio","family":"D\u2019Agostino","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tatiana","family":"Patriarca","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1045-0510","authenticated-orcid":false,"given":"Antonio","family":"De Nicola","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2025,8,26]]},"reference":[{"issue":"2","key":"5384_CR1","doi-asserted-by":"publisher","first-page":"101144","DOI":"10.1016\/j.joi.2021.101144","volume":"15","author":"G Abramo","year":"2021","unstructured":"Abramo, G., Aksnes, D. W., & D\u2019Angelo, C. A. (2021). Gender differences in research performance within and between countries: Italy vs Norway. Journal of Informetrics, 15(2), 101144. https:\/\/doi.org\/10.1016\/j.joi.2021.101144","journal-title":"Journal of Informetrics"},{"key":"5384_CR2","unstructured":"Alford, R. D. (1987) Naming and identity: A cross cultural study of personal naming practices. 
Retrieved from https:\/\/api.semanticscholar.org\/CorpusID:141787656"},{"key":"5384_CR3","doi-asserted-by":"crossref","unstructured":"B\u00e9rub\u00e9, N., Ghiasi, G., Sainte-Marie, M., & Larivi\u00e8re, V. (2020). Wiki-gendersort: Automatic gender detection using first names in Wikipedia","DOI":"10.31235\/osf.io\/ezw7p"},{"key":"5384_CR4","first-page":"3","volume":"8","author":"C Bonferroni","year":"1936","unstructured":"Bonferroni, C. (1936). Teoria statistica delle classi e calcolo delle probabilit\u00e0. Pubblicazioni del Regio Istituto Superiore di Scienze Economiche e Commerciali di Firenze, 8, 3\u201362.","journal-title":"Pubblicazioni del Regio Istituto Superiore di Scienze Economiche e Commerciali di Firenze"},{"key":"5384_CR5","doi-asserted-by":"publisher","DOI":"10.1371\/journal.pcbi.1005134","author":"KS Bonham","year":"2017","unstructured":"Bonham, K. S., & Stefan, M. I. (2017). Women are underrepresented in computational biology: An analysis of the scholarly literature in biology, computer science and computational biology. PLoS Computational Biology. https:\/\/doi.org\/10.1371\/journal.pcbi.1005134","journal-title":"PLoS Computational Biology"},{"issue":"2","key":"5384_CR6","doi-asserted-by":"publisher","first-page":"48","DOI":"10.1145\/3624700","volume":"67","author":"M Buyl","year":"2024","unstructured":"Buyl, M., & De Bie, T. (2024). Inherent limitations of AI fairness. Communications of the ACM, 67(2), 48\u201355. https:\/\/doi.org\/10.1145\/3624700","journal-title":"Communications of the ACM"},{"key":"5384_CR7","doi-asserted-by":"publisher","DOI":"10.1007\/s11192-024-05005-3","author":"T Choji","year":"2024","unstructured":"Choji, T., Moral-Munoz, J., & Cobo, M. (2024). Is the scientific impact of the LIS themes gender-biased? A bibliometric analysis of the evolution, scientific impact, and relative contribution by gender from 2007 to 2022. Scientometrics. 
https:\/\/doi.org\/10.1007\/s11192-024-05005-3","journal-title":"Scientometrics"},{"key":"5384_CR8","volume-title":"Statistical Power Analysis for the Behavioral Sciences","author":"J Cohen","year":"1988","unstructured":"Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates.","edition":"2"},{"issue":"5","key":"5384_CR9","doi-asserted-by":"publisher","first-page":"3807","DOI":"10.1007\/s11192-021-03885-3","volume":"126","author":"A De Nicola","year":"2021","unstructured":"De Nicola, A., & D\u2019Agostino, G. (2021). Assessment of gender divide in scientific communities. Scientometrics, 126(5), 3807\u20133840. https:\/\/doi.org\/10.1007\/s11192-021-03885-3","journal-title":"Scientometrics"},{"key":"5384_CR10","unstructured":"De\u00a0Nicola, A., Patriarca, T., Fresilli, B., Opromolla, A., Guariglia\u00a0Migliore, M., Leonardi, N., D\u2019Agostino, G., Cellini, M., Mirenda, C., Tagliacozzo, S., Pisacane, L., & Vassillo, C. (2024) D.1.2 - Report on gendered assessment of the energy systems knowledge community and EU policies for sustainable energy systems\u2014Horizon Europe Project gEneSys\u2014Transforming gendered interrelations of power and inequalities in transition pathways to sustainable energy systems, grant agreement no. 101094326. https:\/\/ec.europa.eu\/research\/participants\/documents\/downloadPublic?documentIds=080166e509765b4f&appId=PPGMS"},{"key":"5384_CR11","doi-asserted-by":"publisher","DOI":"10.1038\/srep04770","author":"P Deville","year":"2014","unstructured":"Deville, P., Wang, D., Sinatra, R., Song, C., Blondel, V. D., & Barab\u00e1si, A.-L. (2014). Career on the move: Geography, stratification, and scientific impact. Scientific Reports. 
https:\/\/doi.org\/10.1038\/srep04770","journal-title":"Scientific Reports"},{"key":"5384_CR12","doi-asserted-by":"publisher","DOI":"10.1037\/amp0000494","author":"A Eagly","year":"2019","unstructured":"Eagly, A., Nater, C., Miller, D., Kaufmann, M., Sczesny, S. (2019). Gender stereotypes have changed: A cross-temporal meta-analysis of us public opinion polls from 1946 to 2018. American Psychologist. https:\/\/doi.org\/10.1037\/amp0000494","journal-title":"American Psychologist"},{"key":"5384_CR13","doi-asserted-by":"publisher","first-page":"323","DOI":"10.18653\/v1\/2024.gebnlp-1.20","volume-title":"Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP)","author":"V Gautam","year":"2024","unstructured":"Gautam, V., Subramonian, A., Lauscher, A., & Keyes, O. (2024). Stop! in the name of flaws: Disentangling personal names and sociodemographic attributes in NLP. In A. Fale\u0144ska, C. Basta, M. Costa-Juss\u00e0, S. Goldfarb-Tarrant, & D. Nozza (Eds.), Proceedings of the 5th Workshop on Gender Bias in Natural Language Processing (GeBNLP) (pp. 323\u2013337). Association for Computational Linguistics."},{"key":"5384_CR14","doi-asserted-by":"crossref","unstructured":"Gomide, J., & F.D., & Kling H,. (2017). Name usage pattern in the synonym ambiguity problem in bibliographic data. Scientometrics, 112, 747.","DOI":"10.1007\/s11192-017-2410-2"},{"key":"5384_CR658","doi-asserted-by":"publisher","unstructured":"Guariglia Migliore, M., D\u2019Agostino, G., Patriarca, T., & De Nicola, A. (2025). Datasets for Fair Name-Based Gender Prediction in Scientific Communities. figshare. Dataset. https:\/\/doi.org\/10.6084\/m9.figshare.29909603.v1","DOI":"10.6084\/m9.figshare.29909603.v1"},{"key":"5384_CR200","doi-asserted-by":"crossref","unstructured":"Larivi\u00e8re, V., Ni, C., Gingras, Y., Cronin, B., & Sugimoto, C. R. (2013). Bibliometrics: Global gender disparities in science. 
Nature, 504(7479), 211\u2013213.","DOI":"10.1038\/504211a"},{"key":"5384_CR15","doi-asserted-by":"publisher","DOI":"10.1093\/oxfordhb\/9780199656431.001.0001","volume-title":"The Oxford Handbook of Names and Naming","author":"C Hough","year":"2016","unstructured":"Hough, C. (2016). The Oxford Handbook of Names and Naming. Oxford University Press."},{"issue":"9","key":"5384_CR16","doi-asserted-by":"publisher","first-page":"4609","DOI":"10.1073\/pnas.1914221117","volume":"117","author":"J Huang","year":"2020","unstructured":"Huang, J., Gates, A. J., Sinatra, R., & Barab\u00e1si, A.-L. (2020). Historical comparison of gender inequality in scientific careers across countries and disciplines. Proceedings of the National Academy of Sciences of the United States of America, 117(9), 4609\u20134616. https:\/\/doi.org\/10.1073\/pnas.1914221117","journal-title":"Proceedings of the National Academy of Sciences of the United States of America"},{"key":"5384_CR17","volume-title":"Sample Sizes for Clinical Trials","author":"SA Julious","year":"2004","unstructured":"Julious, S. A. (2004). Sample Sizes for Clinical Trials. Chapman & Hall\/CRC."},{"key":"5384_CR18","doi-asserted-by":"publisher","first-page":"108","DOI":"10.18653\/v1\/W16-5614","volume-title":"Proceedings of the First Workshop on NLP and Computational Social Science","author":"R Knowles","year":"2016","unstructured":"Knowles, R., Carroll, J., & Dredze, M. (2016). Demographer: Extremely simple name demographics. In D. Bamman, A. S. Do\u011fru\u00f6z, J. Eisenstein, D. Hovy, D. Jurgens, B. O\u2019Connor, A. Oh, O. Tsur, & S. Volkova (Eds.), Proceedings of the First Workshop on NLP and Computational Social Science (pp. 108\u2013113). Association for Computational Linguistics."},{"key":"5384_CR19","unstructured":"LGBTQIA Resource Center Lesbian, Gay, Bisexual, Transgender, Queer, Intersex, Asexual. 
Retrieved March 24, 2024, from https:\/\/lgbtqia.ucdavis.edu"},{"issue":"2","key":"5384_CR20","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1007\/bf02295996","volume":"12","author":"Q McNemar","year":"1947","unstructured":"McNemar, Q. (1947). Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika, 12(2), 153\u2013157. https:\/\/doi.org\/10.1007\/bf02295996","journal-title":"Psychometrika"},{"issue":"2","key":"5384_CR21","doi-asserted-by":"publisher","first-page":"317","DOI":"10.1177\/006996670904300205","volume":"43","author":"R Meganathan","year":"2009","unstructured":"Meganathan, R. (2009). The politics of naming: The search for linguistic and ethnic identity in Tamil Nadu. Contributions to Indian Sociology, 43(2), 317\u2013324. https:\/\/doi.org\/10.1177\/006996670904300205","journal-title":"Contributions to Indian Sociology"},{"issue":"5","key":"5384_CR22","doi-asserted-by":"publisher","first-page":"56","DOI":"10.1145\/3319422","volume":"62","author":"FC Payton","year":"2019","unstructured":"Payton, F. C., & Berki, E. (2019). Countering the negative image of women in computing. Communications of the ACM, 62(5), 56\u201363. https:\/\/doi.org\/10.1145\/3319422","journal-title":"Communications of the ACM"},{"key":"5384_CR23","doi-asserted-by":"publisher","unstructured":"Raffo, J. (2021). WGND 2.0. https:\/\/doi.org\/10.7910\/DVN\/MSEGSJ","DOI":"10.7910\/DVN\/MSEGSJ"},{"issue":"3","key":"5384_CR24","doi-asserted-by":"publisher","first-page":"101556","DOI":"10.1016\/j.joi.2024.101556","volume":"18","author":"R S\u00e1nchez-Jim\u00e9nez","year":"2024","unstructured":"S\u00e1nchez-Jim\u00e9nez, R., Guerrero-Castillo, P., Guerrero-Bote, V. P., Halevi, G., & De-Moya-Aneg\u00f3n, F. (2024). Analysis of the distribution of authorship by gender in scientific output: A global perspective. Journal of Informetrics, 18(3), 101556. 
https:\/\/doi.org\/10.1016\/j.joi.2024.101556","journal-title":"Journal of Informetrics"},{"key":"5384_CR25","doi-asserted-by":"publisher","DOI":"10.7717\/peerj-cs.156","author":"L Santamar\u00eda","year":"2021","unstructured":"Santamar\u00eda, L., & Mihaljevi\u0107, H. (2021). Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science. https:\/\/doi.org\/10.7717\/peerj-cs.156","journal-title":"PeerJ Computer Science"},{"key":"5384_CR26","doi-asserted-by":"publisher","first-page":"414","DOI":"10.5195\/jmla.2021.1185","volume":"109","author":"P Sebo","year":"2021","unstructured":"Sebo, P. (2021). Performance of gender detection tools: A comparative study of name-to-gender inference services. Journal of the Medical Library Association, 109, 414.","journal-title":"Journal of the Medical Library Association"},{"key":"5384_CR27","unstructured":"United Nations - Department of Economic and Social Affairs Sustainable Development. (2015). Transforming our world: The 2030 agenda for sustainable development. Journal of Public Health, 37, 13."},{"key":"5384_CR28","doi-asserted-by":"crossref","unstructured":"Van Buskirk, I., Clauset, A., & Larremore, D. B. (2023) An open-source cultural consensus approach to name-based gender classification. In: Proceedings of the Seventeenth International AAAI Conference on Web and Social Media (ICWSM2023)","DOI":"10.1609\/icwsm.v17i1.22195"},{"issue":"11","key":"5384_CR29","doi-asserted-by":"publisher","first-page":"8861","DOI":"10.1007\/s11192-021-04171-y","volume":"126","author":"L Zhang","year":"2021","unstructured":"Zhang, L., Sivertsen, G., Du, H., Huang, Y., & Gl\u00e4nzel, W. (2021). Gender differences in the aims and impacts of research. Scientometrics, 126(11), 8861\u20138886. 
https:\/\/doi.org\/10.1007\/s11192-021-04171-y","journal-title":"Scientometrics"}],"container-title":["Scientometrics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11192-025-05384-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11192-025-05384-1","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11192-025-05384-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,1,23]],"date-time":"2026-01-23T08:13:43Z","timestamp":1769156023000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11192-025-05384-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,26]]},"references-count":31,"journal-issue":{"issue":"9","published-print":{"date-parts":[[2025,9]]}},"alternative-id":["5384"],"URL":"https:\/\/doi.org\/10.1007\/s11192-025-05384-1","relation":{},"ISSN":["0138-9130","1588-2861"],"issn-type":[{"value":"0138-9130","type":"print"},{"value":"1588-2861","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,26]]},"assertion":[{"value":"7 May 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 July 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 August 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All authors involved in this research have declared no Conflict of 
interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}]}}