{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,25]],"date-time":"2026-03-25T21:58:06Z","timestamp":1774475886541,"version":"3.50.1"},"reference-count":36,"publisher":"Emerald","issue":"5","license":[{"start":{"date-parts":[[2024,4,2]],"date-time":"2024-04-02T00:00:00Z","timestamp":1712016000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["JD"],"published-print":{"date-parts":[[2024,9,3]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>In order to estimate the value of semi-automated subject indexing in operative library catalogues, the study aimed to investigate five different automated implementations of an open source software package on a large set of Swedish union catalogue metadata records, with Dewey Decimal Classification (DDC) as the target classification system. It also aimed to contribute to the body of research on aboutness and related challenges in automated subject indexing and evaluation.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>On a sample of over 230,000 records with close to 12,000 distinct DDC classes, an open source tool Annif, developed by the National Library of Finland, was applied in the following implementations: lexical algorithm, support vector classifier, fastText, Omikuji Bonsai and an ensemble approach combining the former four. 
A qualitative study involving two senior catalogue librarians and three students of library and information studies was also conducted to investigate the value and inter-rater agreement of automatically assigned classes, on a sample of 60 records.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The best results were achieved using the ensemble approach that achieved 66.82% accuracy on the three-digit DDC classification task. The qualitative study confirmed earlier studies reporting low inter-rater agreement but also pointed to the potential value of automatically assigned classes as additional access points in information retrieval.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>The paper presents an extensive study of automated classification in an operative library catalogue, accompanied by a qualitative study of automated classes. It demonstrates the value of applying semi-automated indexing in operative information retrieval systems.<\/jats:p><\/jats:sec>","DOI":"10.1108\/jd-01-2022-0026","type":"journal-article","created":{"date-parts":[[2024,4,1]],"date-time":"2024-04-01T03:38:32Z","timestamp":1711942712000},"page":"1057-1079","source":"Crossref","is-referenced-by-count":17,"title":["Automated Dewey Decimal Classification of Swedish library metadata using Annif software"],"prefix":"10.1108","volume":"80","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-4169-4777","authenticated-orcid":false,"given":"Koraljka","family":"Golub","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Osma","family":"Suominen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ahmed 
Taiye","family":"Mohammed","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7657-5407","authenticated-orcid":false,"given":"Harriet","family":"Aagaard","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Olof","family":"Osterman","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"140","published-online":{"date-parts":[[2024,4,2]]},"reference":[{"issue":"2","key":"key2024083106453853300_ref001","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1016\/s0306-4573(00)00026-1","article-title":"The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part I: research, and the nature of human indexing","volume":"37","year":"2001","journal-title":"Information Processing and Management"},{"key":"key2024083106453853300_ref002","first-page":"115","article-title":"Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures","year":"2013"},{"issue":"2","key":"key2024083106453853300_ref003","doi-asserted-by":"publisher","DOI":"10.7557\/13.2383","article-title":"Why build Dewey numbers? The remediation of the Dewey decimal classification system","volume":"16","year":"2012","journal-title":"Nordlit"},{"key":"key2024083106453853300_ref004","first-page":"3163","article-title":"Taming pretrained transformers for extreme multi-label text classification","year":"2020"},{"key":"key2024083106453853300_ref005","unstructured":"Conradi, E. 
(2017), \u201cDDC and automatic classification\u201d, available at: https:\/\/edug.pansoft.de\/tiki-index.php?page=DDC+and+automatic+classification"},{"key":"key2024083106453853300_ref006","article-title":"Automated indexing - a case study from the national agricultural library","year":"2014"},{"issue":"12","key":"key2024083106453853300_ref007","doi-asserted-by":"publisher","first-page":"10967","DOI":"10.1016\/j.eswa.2012.03.027","article-title":"Automated text classification using a dynamic artificial neural network model","volume":"39","year":"2012","journal-title":"Expert Systems with Applications"},{"issue":"3","key":"key2024083106453853300_ref008","doi-asserted-by":"publisher","first-page":"204","DOI":"10.1080\/10572317.2016.1205406","article-title":"Potential and challenges of subject access in libraries today on the example of Swedish libraries","volume":"48","year":"2016","journal-title":"International Information and Library Review"},{"issue":"4","key":"key2024083106453853300_ref039","doi-asserted-by":"crossref","first-page":"297","DOI":"10.5771\/0943-7444-2018-4-297","article-title":"Subject access in Swedish discovery services","volume":"45","year":"2018","journal-title":"Knowledge Organization"},{"issue":"8","key":"key2024083106453853300_ref009","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1080\/01639374.2021.2012311","article-title":"Automated subject indexing: an overview","volume":"59","year":"2021","journal-title":"Cataloging and Classification Quarterly"},{"issue":"1","key":"key2024083106453853300_ref010","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1002\/asi.23600","article-title":"A framework for evaluating automatic indexing or classification in the context of retrieval","volume":"67","year":"2016","journal-title":"Journal of the Association for Information Science and Technology"},{"issue":"10.75","key":"key2024083106453853300_ref011","first-page":"1","article-title":"Automated KOS-based subject indexing in 
INIS","volume":"10","year":"2018","journal-title":"Journal Article"},{"key":"key2024083106453853300_ref012","volume-title":"Documentation \u2013 Methods for Examining Documents, Determining Their Subjects, and Selecting Index Terms: ISO 5963","author":"International Organization for Standardization","year":"1985"},{"issue":"4","key":"key2024083106453853300_ref013","doi-asserted-by":"publisher","first-page":"422","DOI":"10.1145\/582415.582418","article-title":"Cumulated gain-based evaluation of IR techniques","volume":"20","year":"2002","journal-title":"ACM Transactions on Information Systems (TOIS)"},{"issue":"1-7","key":"key2024083106453853300_ref014","doi-asserted-by":"publisher","first-page":"646","DOI":"10.1016\/s0169-7552(98)00035-x","article-title":"Automatic classification of Web resources using Java and Dewey decimal classification","volume":"30","year":"1998","journal-title":"Computer Networks and ISDN Systems"},{"issue":"2","key":"key2024083106453853300_ref015","doi-asserted-by":"publisher","first-page":"211","DOI":"10.1177\/0165551513514932","article-title":"Towards linking libraries and Wikipedia: automatic subject indexing of library records with Wikipedia concepts","volume":"40","year":"2014","journal-title":"Journal of Information Science"},{"key":"key2024083106453853300_ref016","article-title":"The role of automated categorization in e-government information retrieval","year":"2013"},{"key":"key2024083106453853300_ref017","article-title":"Bag of tricks for efficient text classification","year":"2016","journal-title":"arXiv Preprint arXiv:1607.01759"},{"key":"key2024083106453853300_ref018","article-title":"Automation first\u2013the subject cataloguing policy of the Deutsche Nationalbibliothek","year":"2017"},{"key":"key2024083106453853300_ref019","article-title":"Putting research-based machine learning solutions for subject indexing into 
practice","year":"2020"},{"issue":"11","key":"key2024083106453853300_ref020","doi-asserted-by":"publisher","first-page":"2099","DOI":"10.1007\/s10994-020-05888-2","article-title":"Bonsai: diverse and shallow trees for extreme multi-label classification","volume":"109","year":"2020","journal-title":"Machine Learning"},{"issue":"5","key":"key2024083106453853300_ref021","doi-asserted-by":"publisher","first-page":"976","DOI":"10.1108\/jd-07-2014-0103","article-title":"Augmenting Dublin core digital library metadata with Dewey decimal classification","volume":"71","year":"2015","journal-title":"Journal of Documentation"},{"key":"key2024083106453853300_ref042","volume-title":"Indexing and Abstracting in Theory and Practice","year":"2003","edition":"3rd ed."},{"issue":"1","key":"key2024083106453853300_ref022","doi-asserted-by":"publisher","first-page":"80","DOI":"10.1007\/s10489-011-0314-z","article-title":"An enhanced support vector machine classification framework by using Euclidean distance function for text document categorization","volume":"37","year":"2012","journal-title":"Applied Intelligence"},{"issue":"8","key":"key2024083106453853300_ref040","doi-asserted-by":"publisher","DOI":"10.1186\/s13326-017-0113-5","article-title":"12 years on \u2013 is the NLM medical text indexer still useful and relevant?","volume":"8","year":"2017","journal-title":"Journal of Biomedical Semantics"},{"key":"key2024083106453853300_ref023","article-title":"Scorpion","author":"OCLC","year":"2004","journal-title":"OCLC Software"},{"key":"key2024083106453853300_ref024","first-page":"2825","article-title":"Scikit-learn: machine learning in Python","volume":"12","year":"2011","journal-title":"The Journal of Machine Learning Research"},{"key":"key2024083106453853300_ref025","first-page":"993","article-title":"Parabel: partitioned label trees for extreme classification with application to dynamic search 
advertising","year":"2018"},{"issue":"1","key":"key2024083106453853300_ref026","doi-asserted-by":"publisher","first-page":"70","DOI":"10.1002\/asi.21233","article-title":"Document categorization in legal electronic discovery: computer classification vs manual review","volume":"61","year":"2010","journal-title":"Journal of the American Society for Information Science and Technology"},{"issue":"Supplement 24","key":"key2024083106453853300_ref041","first-page":"76","article-title":"Computer supported indexing: a history and evaluation of NASA\u2019s MAI system","volume":"61","year":"1997","journal-title":"Encyclopedia of Library and Information Services"},{"issue":"1","key":"key2024083106453853300_ref027","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18352\/lq.10285","article-title":"Annif: DIY automated subject indexing using multiple algorithms","volume":"29","year":"2019","journal-title":"LIBER Quarterly"},{"issue":"1","key":"key2024083106453853300_ref028","doi-asserted-by":"publisher","first-page":"265","DOI":"10.4403\/jlis.it-12740","article-title":"Annif and Finto AI: developing and implementing automated subject indexing","volume":"13","year":"2022","journal-title":"JLIS.it"},{"issue":"2","key":"key2024083106453853300_ref029","doi-asserted-by":"publisher","first-page":"169","DOI":"10.1007\/s00799-018-0240-3","article-title":"Fusion architectures for automatic subject indexing under concept drift","volume":"21","year":"2020","journal-title":"International Journal on Digital Libraries"},{"key":"key2024083106453853300_ref030","volume-title":"NLM Medical Text Indexer (MTI)","year":"2019"},{"key":"key2024083106453853300_ref031","doi-asserted-by":"crossref","unstructured":"Wiesenm\u00fcller, H. 
(2017), \u201cDas neue Sacherschlie\u00dfungskonzept der DNB in der FAZ\u201d, available at: https:\/\/www.basiswissen-rda.de\/neues-sacherschliessungskonzept-faz\/ (accessed 2 August 2017).","DOI":"10.1515\/9783110544725"},{"key":"key2024083106453853300_ref032","first-page":"5820","article-title":"AttentionXML: label tree-based attention-aware deep model for high-performance extreme multi-label text classification","year":"2019"}],"container-title":["Journal of Documentation"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/JD-01-2022-0026\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/JD-01-2022-0026\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:33:11Z","timestamp":1753396391000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/jd\/article\/80\/5\/1057-1079\/1236186"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,4,2]]},"references-count":36,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2024,4,2]]},"published-print":{"date-parts":[[2024,9,3]]}},"alternative-id":["10.1108\/JD-01-2022-0026"],"URL":"https:\/\/doi.org\/10.1108\/jd-01-2022-0026","relation":{},"ISSN":["0022-0418"],"issn-type":[{"value":"0022-0418","type":"print"}],"subject":[],"published":{"date-parts":[[2024,4,2]]}}}