{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T02:17:38Z","timestamp":1778033858399,"version":"3.51.4"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2023,6,15]],"date-time":"2023-06-15T00:00:00Z","timestamp":1686787200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2023,6,15]],"date-time":"2023-06-15T00:00:00Z","timestamp":1686787200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100002347","name":"Bundesministerium f\u00fcr Bildung und Forschung","doi-asserted-by":"publisher","award":["01DD20003"],"award-info":[{"award-number":["01DD20003"]}],"id":[{"id":"10.13039\/501100002347","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100011199","name":"FP7 Ideas: European Research Council","doi-asserted-by":"publisher","award":["819536"],"award-info":[{"award-number":["819536"]}],"id":[{"id":"10.13039\/100011199","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Int J Digit Libr"],"published-print":{"date-parts":[[2024,3]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>The purpose of this work is to describe the <jats:sc>orkg<\/jats:sc>-Leaderboard software designed to extract <jats:italic>leaderboards<\/jats:italic> defined as <jats:italic>task\u2013dataset\u2013metric<\/jats:italic> tuples automatically from large collections of empirical research papers in artificial intelligence (AI). The software can support both the main workflows of scholarly publishing, viz. as LaTeX files or as PDF files. 
Furthermore, the system is integrated with the open research knowledge graph (ORKG) platform, which fosters the machine-actionable publishing of scholarly findings. Thus, the system\u2019s output, when integrated within the ORKG\u2019s supported Semantic Web infrastructure of representing machine-actionable \u2018resources\u2019 on the Web, enables: (1) broadly, the integration of empirical results of researchers across the world, thus enabling transparency in empirical research with the potential to also be complete contingent on the underlying data source(s) of publications; and (2) specifically, enables researchers to track the progress in AI with an overview of the state-of-the-art across the most common AI tasks and their corresponding datasets via dynamic ORKG frontend views leveraging tables and visualization charts over the machine-actionable data. Our best model achieves performances above 90% F1 on the <jats:italic>leaderboard<\/jats:italic> extraction task, thus proving <jats:sc>orkg<\/jats:sc>-Leaderboards a practically viable tool for real-world usage. 
Going forward, in a sense, <jats:sc>orkg<\/jats:sc>-Leaderboards transforms the <jats:italic>leaderboard<\/jats:italic> extraction task to an automated digitalization task, which has been, for a long time in the community, a crowdsourced endeavor.<\/jats:p>","DOI":"10.1007\/s00799-023-00366-1","type":"journal-article","created":{"date-parts":[[2023,6,15]],"date-time":"2023-06-15T14:02:56Z","timestamp":1686837776000},"page":"41-54","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["ORKG-Leaderboards: a systematic workflow for mining leaderboards as a knowledge graph"],"prefix":"10.1007","volume":"25","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-0021-9729","authenticated-orcid":false,"given":"Salomon","family":"Kabongo","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6616-9509","authenticated-orcid":false,"given":"Jennifer","family":"D\u2019Souza","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0698-2864","authenticated-orcid":false,"given":"S\u00f6ren","family":"Auer","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2023,6,15]]},"reference":[{"key":"366_CR1","doi-asserted-by":"crossref","unstructured":"Parra\u00a0Escart\u00edn, C., Reijers, W., Lynn, T., Moorkens, J., Way, A., Liu, C.-H.: Ethical considerations in NLP shared tasks. In: Proceedings of the First ACL Workshop on Ethics in Natural Language Processing, pp. 66\u201373. Association for Computational Linguistics, Valencia, Spain (2017). 
https:\/\/doi.org\/10.18653\/v1\/W17-1608","DOI":"10.18653\/v1\/W17-1608"},{"issue":"4","key":"366_CR2","doi-asserted-by":"publisher","first-page":"897","DOI":"10.1162\/COLI_a_00304","volume":"43","author":"M Nissim","year":"2017","unstructured":"Nissim, M., Abzianidze, L., Evang, K., van der Goot, R., Haagsma, H., Plank, B., Wieling, M.: Last words: sharing is caring: the future of shared tasks. Comput. Linguist. 43(4), 897\u2013904 (2017)","journal-title":"Comput. Linguist."},{"key":"366_CR3","doi-asserted-by":"publisher","unstructured":"Kim, J.-D., Pyysalo, S.: In: Dubitzky, W., Wolkenhauer, O., Cho, K.-H., Yokota, H. (eds.) BioNLP Shared Task, pp. 138\u2013141. Springer, New York (2013). https:\/\/doi.org\/10.1007\/978-1-4419-9863-7_138","DOI":"10.1007\/978-1-4419-9863-7_138"},{"issue":"3","key":"366_CR4","doi-asserted-by":"publisher","first-page":"258","DOI":"10.1087\/20100308","volume":"23","author":"AE Jinha","year":"2010","unstructured":"Jinha, A.E.: Article 50 million: an estimate of the number of scholarly articles in existence. Learn. Publ. 23(3), 258\u2013263 (2010)","journal-title":"Learn. Publ."},{"key":"366_CR5","unstructured":"Chiarelli, A., Johnson, R., Richens, E., Pinfield, S.: Accelerating scholarly communication: the transformative role of preprints (2019)"},{"key":"366_CR6","unstructured":"paperswithcode.com. https:\/\/paperswithcode.com\/. Accessed 26 Apr 2021"},{"key":"366_CR7","unstructured":"NLP-progress. http:\/\/nlpprogress.com\/. Accessed 26 Apr 2021"},{"key":"366_CR8","unstructured":"AI metrics. https:\/\/www.eff.org\/ai\/metrics. Accessed 26 Apr 2021"},{"key":"366_CR9","unstructured":"SQuAD Explorer. https:\/\/rajpurkar.github.io\/SQuAD-explorer\/. Accessed 26 Apr 2021"},{"key":"366_CR10","unstructured":"Reddit Sota. https:\/\/github.com\/RedditSota\/state-of-the-art-result-for-machine-learning-problems. 
Accessed 26 Apr 2021"},{"key":"366_CR11","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/sdata.2016.18","volume":"3","author":"MD Wilkinson","year":"2016","unstructured":"Wilkinson, M.D., Dumontier, M., Aalbersberg, I.J., Appleton, G., Axton, M., Baak, A., Blomberg, N., Boiten, J.-W., da Silva Santos, L.B., Bourne, P.E., et al.: The fair guiding principles for scientific data management and stewardship. Sci. Data 3, 1\u20139 (2016)","journal-title":"Sci. Data"},{"key":"366_CR12","volume-title":"FAIR Principles: Interpretations and Implementation Considerations","author":"A Jacobsen","year":"2019","unstructured":"Jacobsen, A., de Miranda Azevedo, R., Juty, N., Batista, D., Coles, S., Cornet, R., Courtot, M., Crosas, M., Dumontier, M., Evelo, C.T., et al.: FAIR Principles: Interpretations and Implementation Considerations. MIT Press, Cambridge (2019)"},{"issue":"3","key":"366_CR13","doi-asserted-by":"publisher","first-page":"516","DOI":"10.1515\/bfp-2020-2042","volume":"44","author":"S Auer","year":"2020","unstructured":"Auer, S., Oelen, A., Haris, M., Stocker, M., D\u2019Souza, J., Farfar, K.E., Vogt, L., Prinz, M., Wiens, V., Jaradeh, M.Y.: Improving access to scientific literature with knowledge graphs. Bibliothek Forschung und Praxis 44(3), 516\u2013529 (2020)","journal-title":"Bibliothek Forschung und Praxis"},{"key":"366_CR14","unstructured":"Escart\u00edn, C.P., Lynn, T., Moorkens, J., Dunne, J.: Towards transparency in NLP shared tasks. arXiv preprint arXiv:2105.05020 (2021)"},{"key":"366_CR15","doi-asserted-by":"crossref","unstructured":"Kabongo, S., D\u2019Souza, J., Auer, S.: Automated mining of leaderboards for empirical ai research. In: International Conference on Asian Digital Libraries, pp. 453\u2013470 . 
Springer (2021)","DOI":"10.1007\/978-3-030-91669-5_35"},{"key":"366_CR16","first-page":"17283","volume":"33","author":"M Zaheer","year":"2020","unstructured":"Zaheer, M., Guruganesh, G., Dubey, K.A., Ainslie, J., Alberti, C., Ontanon, S., Pham, P., Ravula, A., Wang, Q., Yang, L.: Big bird: transformers for longer sequences. Adv. Neural. Inf. Process. Syst. 33, 17283\u201317297 (2020)","journal-title":"Adv. Neural. Inf. Process. Syst."},{"key":"366_CR17","doi-asserted-by":"crossref","unstructured":"D\u2019Souza, J., Auer, S.: Computer science named entity recognition in the open research knowledge graph. In: From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries: 24th International Conference on Asian Digital Libraries, ICADL 2022, Hanoi, Vietnam, November 30\u2013December 2, 2022, Proceedings, pp. 35\u201345 . Springer (2022)","DOI":"10.1007\/978-3-031-21756-2_3"},{"key":"366_CR18","unstructured":"Gupta, S., Manning, C.: Analyzing the dynamics of research by extracting key aspects of scientific papers. In: Proceedings of 5th International Joint Conference on Natural Language Processing, pp. 1\u20139. Asian Federation of Natural Language Processing, Chiang Mai, Thailand (2011). https:\/\/aclanthology.org\/I11-1001"},{"issue":"11","key":"366_CR19","doi-asserted-by":"publisher","first-page":"2278","DOI":"10.1109\/5.726791","volume":"86","author":"Y LeCun","year":"1998","unstructured":"LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278\u20132324 (1998)","journal-title":"Proc. IEEE"},{"key":"366_CR20","unstructured":"Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Proceedings of the 26th InternationalConference on Neural Information Processing Systems, vol. 2. NIPS 13, pp. 2787\u20132795. 
Curran Associates Inc., Red Hook, NY, USA (2013)"},{"key":"366_CR21","doi-asserted-by":"crossref","unstructured":"Papineni, K., Roukos, S., Ward, T., Zhu, W.-J.: Bleu: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311\u2013318 (2002)","DOI":"10.3115\/1073083.1073135"},{"issue":"5","key":"366_CR22","first-page":"1","volume":"1","author":"Y Sasaki","year":"2007","unstructured":"Sasaki, Y.: The truth of the f-measure. Teach. Tutor. Mater. 1(5), 1\u20135 (2007)","journal-title":"Teach. Tutor. Mater."},{"key":"366_CR23","doi-asserted-by":"crossref","unstructured":"Voorhees, E.M.: The trec-8 question answering track report. In: Trec, vol. 99, pp. 77\u201382 (1999)","DOI":"10.6028\/NIST.SP.500-246.qa-overview"},{"key":"366_CR24","doi-asserted-by":"crossref","unstructured":"Anteghini, M., D\u2019Souza, J., dos Santos, V.A., Auer, S.: Easy semantification of bioassays. In: International Conference of the Italian Association for Artificial Intelligence, pp. 198\u2013212 . Springer (2022)","DOI":"10.1007\/978-3-031-08421-8_14"},{"issue":"1","key":"366_CR25","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1038\/s41597-018-0005-2","volume":"6","author":"O Kononova","year":"2019","unstructured":"Kononova, O., Huo, H., He, T., Rong, Z., Botari, T., Sun, W., Tshitoyan, V., Ceder, G.: Text-mined dataset of inorganic materials synthesis recipes. Sci. Data 6(1), 1\u201311 (2019)","journal-title":"Sci. Data"},{"key":"366_CR26","doi-asserted-by":"publisher","unstructured":"Kulkarni, C., Xu, W., Ritter, A., Machiraju, R.: An annotated corpus for machine reading of instructions in wet lab protocols. In: NAACL: HLT, Volume 2 (Short Papers), New Orleans, Louisiana, pp. 97\u2013106 (2018). 
https:\/\/doi.org\/10.18653\/v1\/N18-2016","DOI":"10.18653\/v1\/N18-2016"},{"key":"366_CR27","doi-asserted-by":"crossref","unstructured":"Mysore, S., Jensen, Z., Kim, E., Huang, K., Chang, H.-S., Strubell, E., Flanigan, J., McCallum, A., Olivetti, E.: The materials science procedural text corpus: annotating materials synthesis procedures with shallow semantic structures. In: Proceedings of the 13th Linguistic Annotation Workshop, pp. 56\u201364 (2019)","DOI":"10.18653\/v1\/W19-4007"},{"key":"366_CR28","unstructured":"Kuniyoshi, F., Makino, K., Ozawa, J., Miwa, M.: Annotating and extracting synthesis process of all-solid-state batteries from scientific literature. In: LREC, pp. 1941\u20131950 (2020)"},{"key":"366_CR29","unstructured":"Handschuh, S., QasemiZadeh, B.: The acl rd-tec: a dataset for benchmarking terminology extraction and classification in computational linguistics. In: COLING 2014: 4th International Workshop on Computational Terminology (2014)"},{"key":"366_CR30","doi-asserted-by":"crossref","unstructured":"Augenstein, I., Das, M., Riedel, S., Vikraman, L., McCallum, A.: Semeval 2017 task 10: Scienceie - extracting keyphrases and relations from scientific publications. In: SemEval@ACL (2017)","DOI":"10.18653\/v1\/S17-2091"},{"key":"366_CR31","doi-asserted-by":"crossref","unstructured":"Luan, Y., He, L., Ostendorf, M., Hajishirzi, H.: Multi-task identification of entities, relations, and coreference for scientific knowledge graph construction. In: EMNLP (2018)","DOI":"10.18653\/v1\/D18-1360"},{"key":"366_CR32","unstructured":"D\u2019Souza, J., Hoppe, A., Brack, A., Jaradeh, M.Y., Auer, S., Ewerth, R.: The stem-ECR dataset: Grounding scientific entity references in stem scholarly content to authoritative encyclopedic and lexicographic sources. In: LREC, Marseille, France, pp. 
2192\u20132203 (2020)"},{"key":"366_CR33","doi-asserted-by":"publisher","unstructured":"Hou, Y., Jochim, C., Gleize, M., Bonin, F., Ganguly, D.: Identification of tasks, datasets, evaluation metrics, and numeric scores for scientific leaderboards construction. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 5203\u20135213. Association for Computational Linguistics, Florence, Italy (2019). https:\/\/doi.org\/10.18653\/v1\/P19-1513","DOI":"10.18653\/v1\/P19-1513"},{"key":"366_CR34","doi-asserted-by":"crossref","unstructured":"Jain, S., van Zuylen, M., Hajishirzi, H., Beltagy, I.: Scirex: A challenge dataset for document-level information extraction. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pp. 7506\u20137516 (2020)","DOI":"10.18653\/v1\/2020.acl-main.670"},{"key":"366_CR35","doi-asserted-by":"crossref","unstructured":"Mondal, I., Hou, Y., Jochim, C.: End-to-end construction of nlp knowledge graph. In: Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pp. 1885\u20131895 (2021)","DOI":"10.18653\/v1\/2021.findings-acl.165"},{"key":"366_CR36","unstructured":"GROBID. GitHub (2008\u20132022)"},{"key":"366_CR37","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., Polosukhin, I.: Attention is all you need. In: Advances in Neural Information Processing Systems, pp. 5998\u20136008 (2017)"},{"key":"366_CR38","unstructured":"Kenton, J.D.M.-W.C., Toutanova, L.K.: Bert: Pre-training of deep bidirectional transformers for language understanding. In: Proceedings of NAACL-HLT, pp. 4171\u20134186 (2019)"},{"key":"366_CR39","unstructured":"Natural Language Inference. https:\/\/paperswithcode.com\/task\/natural-language-inference. Accessed 22 Apr 2021"},{"key":"366_CR40","doi-asserted-by":"publisher","unstructured":"Beltagy, I., Lo, K., Cohan, A.: SciBERT: a pretrained language model for scientific text. 
In: Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pp. 3615\u20133620. Association for Computational Linguistics, Hong Kong, China (2019). https:\/\/doi.org\/10.18653\/v1\/D19-1371","DOI":"10.18653\/v1\/D19-1371"},{"key":"366_CR41","unstructured":"Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R.R., Le, Q.V.: Xlnet: Generalized autoregressive pretraining for language understanding. In: Advances in Neural Information Processing Systems, vol. 32 (2019)"},{"key":"366_CR42","doi-asserted-by":"crossref","unstructured":"Jiang, M., D\u2019Souza, J., Auer, S., Downie, J.S.: Improving scholarly knowledge representation: Evaluating bert-based models for scientific relation classification. In: International Conference on Asian Digital Libraries, pp. 3\u201319 . Springer (2020)","DOI":"10.1007\/978-3-030-64452-9_1"},{"key":"366_CR43","doi-asserted-by":"crossref","unstructured":"Dai, Z., Yang, Z., Yang, Y., Carbonell, J.G., Le, Q., Salakhutdinov, R.: Transformer-xl: Attentive language models beyond a fixed-length context. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pp. 
2978\u20132988 (2019)","DOI":"10.18653\/v1\/P19-1285"},{"key":"366_CR44","unstructured":"Ware, M., Mabe, M.: The STM report: An overview of scientific and scholarly journal publishing (2015)"}],"updated-by":[{"DOI":"10.1007\/s00799-024-00405-5","type":"correction","label":"Correction","source":"publisher","updated":{"date-parts":[[2024,5,28]],"date-time":"2024-05-28T00:00:00Z","timestamp":1716854400000}}],"container-title":["International Journal on Digital Libraries"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00799-023-00366-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s00799-023-00366-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s00799-023-00366-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,10,22]],"date-time":"2024-10-22T10:10:11Z","timestamp":1729591811000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s00799-023-00366-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,6,15]]},"references-count":44,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2024,3]]}},"alternative-id":["366"],"URL":"https:\/\/doi.org\/10.1007\/s00799-023-00366-1","relation":{"correction":[{"id-type":"doi","id":"10.1007\/s00799-024-00405-5","asserted-by":"object"}]},"ISSN":["1432-5012","1432-1300"],"issn-type":[{"value":"1432-5012","type":"print"},{"value":"1432-1300","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,6,15]]},"assertion":[{"value":"3 August 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"10 May 
2023","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 May 2023","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 June 2023","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 May 2024","order":5,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Correction","order":6,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A Correction to this paper has been published:","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"https:\/\/doi.org\/10.1007\/s00799-024-00405-5","URL":"https:\/\/doi.org\/10.1007\/s00799-024-00405-5","order":8,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}}]}}