{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,16]],"date-time":"2026-03-16T10:12:06Z","timestamp":1773655926167,"version":"3.50.1"},"reference-count":33,"publisher":"Emerald","issue":"4","license":[{"start":{"date-parts":[[2013,11,18]],"date-time":"2013-11-18T00:00:00Z","timestamp":1384732800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2013,11,18]]},"abstract":"<jats:sec>\n               <jats:title content-type=\"abstract-heading\">Purpose<\/jats:title>\n               <jats:p> \u2013 This paper aims to report on the design and development of a new approach for automatic classification and subject indexing of research documents in scientific digital libraries and repositories (DLR) according to library controlled vocabularies such as DDC and FAST. <\/jats:p>\n            <\/jats:sec>\n            <jats:sec>\n               <jats:title content-type=\"abstract-heading\">Design\/methodology\/approach<\/jats:title>\n               <jats:p> \u2013 The proposed concept matching-based approach (CMA) detects key Wikipedia concepts occurring in a document and searches the OPACs of conventional libraries via querying the WorldCat database to retrieve a set of MARC records which share one or more of the detected key concepts. Then the semantic similarity of each retrieved MARC record to the document is measured and, using an inference algorithm, the DDC classes and FAST subjects of those MARC records which have the highest similarity to the document are assigned to it. 
<\/jats:p>\n            <\/jats:sec>\n            <jats:sec>\n               <jats:title content-type=\"abstract-heading\">Findings<\/jats:title>\n               <jats:p> \u2013 The performance of the proposed method in terms of the accuracy of the DDC classes and FAST subjects automatically assigned to a set of research documents is evaluated using standard information retrieval measures of precision, recall, and F1. The authors demonstrate the superiority of the proposed approach in terms of accuracy performance in comparison to a similar system currently deployed in a large scale scientific search engine. <\/jats:p>\n            <\/jats:sec>\n            <jats:sec>\n               <jats:title content-type=\"abstract-heading\">Originality\/value<\/jats:title>\n               <jats:p> \u2013 The proposed approach enables the development of a new type of subject classification system for DLR, and addresses some of the problems similar systems suffer from, such as the problem of imbalanced training data encountered by machine learning-based systems, and the problem of word-sense ambiguity encountered by string matching-based systems.<\/jats:p>\n            <\/jats:sec>","DOI":"10.1108\/lht-03-2013-0030","type":"journal-article","created":{"date-parts":[[2013,10,28]],"date-time":"2013-10-28T13:01:55Z","timestamp":1382965315000},"page":"725-747","source":"Crossref","is-referenced-by-count":11,"title":["Classification of scientific publications according to library controlled vocabularies"],"prefix":"10.1108","volume":"31","author":[{"given":"Arash","family":"Joorabchi","sequence":"first","affiliation":[]},{"given":"Abdulhussain","family":"E. Mahdi","sequence":"additional","affiliation":[]}],"member":"140","reference":[{"key":"key2022021920313417100_b1","doi-asserted-by":"crossref","unstructured":"Adamick, J.\n                and \n                  Reznik-Zellen, R.\n                (2010), \u201cTrends in large-scale subject repositories\u201d, D-Lib Magazine, Vol. 
16 Nos 11\/12.","DOI":"10.1045\/november2010-adamick"},{"key":"key2022021920313417100_b2","doi-asserted-by":"crossref","unstructured":"Beall, J.\n                (2011), \u201cAcademic library databases and the problem of word-sense ambiguity\u201d, The Journal of Academic Librarianship, Vol. 37 No. 1, pp. 64-69.","DOI":"10.1016\/j.acalib.2010.10.008"},{"key":"key2022021920313417100_b3","doi-asserted-by":"crossref","unstructured":"Chung, Y.-M.\n                and \n                  Noh, Y.-H.\n                (2003), \u201cDeveloping a specialized directory system by automatically classifying web documents\u201d, Journal of Information Science, Vol. 29 No. 2, pp. 117-126.","DOI":"10.1177\/016555150302900204"},{"key":"key2022021920313417100_b4","doi-asserted-by":"crossref","unstructured":"Dean, R.J.\n                (2004), \u201cFAST: development of simplified headings for metadata\u201d, Cataloging & Classification Quarterly, Vol. 39 Nos 1-2, pp. 331-352.","DOI":"10.1300\/J104v39n01_03"},{"key":"key2022021920313417100_b5","doi-asserted-by":"crossref","unstructured":"Dolin, R.\n               , \n                  Agrawal, D.\n                and \n                  Abbadi, E.E.\n                (1999), \u201cScalable collection summarization and selection\u201d, Proceedings of the Fourth ACM Conference on Digital Libraries, ACM, Berkeley, CA.","DOI":"10.1145\/313238.313257"},{"key":"key2022021920313417100_b6","doi-asserted-by":"crossref","unstructured":"Frank, E.\n                and \n                  Paynter, G.W.\n                (2004), \u201cPredicting Library of Congress classifications from Library of Congress subject headings\u201d, Journal of the American Society for Information Science and Technology, Vol. 55 No. 3, pp. 214-227.","DOI":"10.1002\/asi.10360"},{"key":"key2022021920313417100_b7","unstructured":"Godby, C.J.\n                and \n                  Smith, D.\n                (2000-2002), Scorpion [Online]. 
OCLC Online Computer Library Center, Inc, available: www.oclc.org\/research\/activities\/scorpion.html (accessed February 2013)."},{"key":"key2022021920313417100_b8","doi-asserted-by":"crossref","unstructured":"Golub, K.\n                (2006), \u201cAutomated subject classification of textual web pages, based on a controlled vocabulary: challenges and recommendations\u201d, New Review of Hypermedia and Multimedia, Vol. 12 No. 1, pp. 11-27.","DOI":"10.1080\/13614560600774313"},{"key":"key2022021920313417100_b9","doi-asserted-by":"crossref","unstructured":"Golub, K.\n               , \n                  Ard\u00f6, A.\n               , \n                  Mladeni\u0107, D.\n                and \n                  Grobelnik, M.\n                (2006), Comparing and Combining Two Approaches to Automated Subject Classification of Text. Research and Advanced Technology for Digital Libraries, Springer, Berlin\/Heidelberg.","DOI":"10.1007\/11863878_45"},{"key":"key2022021920313417100_b10","doi-asserted-by":"crossref","unstructured":"Grineva, M.\n               , \n                  Grinev, M.\n                and \n                  Lizorkin, D.\n                (2009), \u201cExtracting key terms from noisy and multi-theme documents\u201d, 18th International Conference on World Wide Web, Madrid, Spain, ACM, New York, NY.","DOI":"10.1145\/1526709.1526798"},{"key":"key2022021920313417100_b11","doi-asserted-by":"crossref","unstructured":"Hickey, T.B.\n               , \n                  O'Neill, E.T.\n                and \n                  Toves, J.\n                (2002), \u201cExperiments with the IFLA functional requirements for bibliographic records (FRBR)\u201d, D-Lib Magazine, Vol. 8 No. 9, pp. 
1-13.","DOI":"10.1045\/september2002-hickey"},{"key":"key2022021920313417100_b12","doi-asserted-by":"crossref","unstructured":"Hunter, L.\n                and \n                  Cohen, K.B.\n                (2006), \u201cBiomedical language processing: what's beyond PubMed?\u201d, Molecular Cell, Vol. 21 No. 5, pp. 589-594.","DOI":"10.1016\/j.molcel.2006.02.012"},{"key":"key2022021920313417100_b13","doi-asserted-by":"crossref","unstructured":"Jenkins, C.\n               , \n                  Jackson, M.\n               , \n                  Burden, P.\n                and \n                  Wallis, J.\n                (1998), \u201cAutomatic classification of web resources using Java and Dewey Decimal Classification\u201d, Computer Networks and ISDN Systems, Vol. 30 Nos 1-7, pp. 646-648.","DOI":"10.1016\/S0169-7552(98)00035-X"},{"key":"key2022021920313417100_b14","doi-asserted-by":"crossref","unstructured":"Jones, K.S.\n                (2004), \u201cIDF term weighting and IR research lessons\u201d, Journal of Documentation, Vol. 60 No. 5, pp. 521-523.","DOI":"10.1108\/00220410410560591"},{"key":"key2022021920313417100_b15","doi-asserted-by":"crossref","unstructured":"Joorabchi, A.\n                and \n                  Mahdi, A.E.\n                (2013), \u201cAutomatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms\u201d, Journal of Information Science, Vol. 39 No. 3, February 8, pp. 410-426, doi: 10.1177\/0165551512472138.","DOI":"10.1177\/0165551512472138"},{"key":"key2022021920313417100_b16","doi-asserted-by":"crossref","unstructured":"Larson, R.R.\n                (1992), \u201cExperiments in automatic Library of Congress Classification\u201d, Journal of the American Society for Information Science, Vol. 43 No. 7, pp. 
130-148.","DOI":"10.1002\/(SICI)1097-4571(199203)43:2<130::AID-ASI3>3.0.CO;2-S"},{"key":"key2022021920313417100_b17","unstructured":"L\u00f6sch, M.\n                (2011), \u201cA multidisciplinary search engine for scientific open access documents\u201d, in \n                  Depping, R.\n                and \n                  Christiane, S.\n                (Eds), Elektronische Schriftenreihe der Universit\u00e4ts- und Stadtbibliothek K\u00f6ln, 2 Cologne: EBSLG Annual General Conference."},{"key":"key2022021920313417100_b18","unstructured":"L\u00f6sch, M.\n               , \n                  Waltinger, U.\n               , \n                  Horstmann, W.\n                and \n                  Mehler, A.\n                (2011), \u201cBuilding a DDC-annotated Corpus from OAI Metadata\u201d, Journal of Digital Information, Vol. 12 No. 2."},{"key":"key2022021920313417100_b19","doi-asserted-by":"crossref","unstructured":"Mahdi, A.E.\n                and \n                  Joorabchi, A.\n                (2010), \u201cA citation-based approach to automatic topical indexing of scientific literature\u201d, Journal of Information Science, Vol. 36 No. 6, pp. 798-811.","DOI":"10.1177\/0165551510388080"},{"key":"key2022021920313417100_b20","unstructured":"Medelyan, O.\n                (2009), \u201cHuman-competitive automatic topic indexing\u201d, PhD thesis, University of Waikato, Hamilton."},{"key":"key2022021920313417100_b21","doi-asserted-by":"crossref","unstructured":"Medelyan, O.\n                and \n                  Witten, I.H.\n                (2008), \u201cDomain-independent automatic keyphrase indexing with small training sets\u201d, Journal of the American Society for Information Science and Technology, Vol. 59 No. 7, pp. 
1026-1040.","DOI":"10.1002\/asi.20790"},{"key":"key2022021920313417100_b22","unstructured":"Medelyan, O.\n               , \n                  Witten, I.H.\n                and \n                  Milne, D.\n                (2008), Topic Indexing with Wikipedia. First AAAI Workshop on Wikipedia and Artificial Intelligence (WIKIAI'08) Chicago, USA, AAAI Press, Chicago, IL."},{"key":"key2022021920313417100_b23","unstructured":"Milne, D.\n                (2009), \u201cAn open-source toolkit for mining Wikipedia\u201d, paper presented at New Zealand Computer Science Research Student Conference."},{"key":"key2022021920313417100_b24","unstructured":"M\u00f6ller, G.\n               , \n                  Carstensen, K.-U.\n               , \n                  Diekmann, B.\n                and \n                  W\u00e4tjen, H.\n                (1999), \u201cAutomatic classification of the world-wide web using the universal decimal classification\u201d, in \n                  Decker, R.\n                and \n                  Gaul, W.\n                (Eds), Proceedings of the 23rd Annual Conference of the German Classification Society (GfKl), Springer-Verlag, Bielefeld."},{"key":"key2022021920313417100_b25","unstructured":"Osborne, M.\n               , \n                  Petrovic, S.\n               , \n                  McCreadie, R.\n               , \n                  MacDonald, C.\n                and \n                  Ounis, I.\n                (2012), \u201cBieber no more: first story detection using Twitter and Wikipedia\u201d, SIGIR Workshop in Time-aware Information Access (TAIA'12) Portland, Oregon, USA, ACM, New York, NY."},{"key":"key2022021920313417100_b26","doi-asserted-by":"crossref","unstructured":"Pong, J.Y.-H.\n               , \n                  Kwok, R.C.-W.\n               , \n                  Lau, R.Y.-K.\n               , \n                  Hao, J.-X.\n                and \n                  Wong, P.C.-C.\n                (2008), \u201cA 
comparative study of two automatic document classification methods in a library setting\u201d, Journal of Information Science, Vol. 34 No. 2, pp. 213-230.","DOI":"10.1177\/0165551507082592"},{"key":"key2022021920313417100_b27","unstructured":"Roger, T.\n               , \n                  Keith, S.\n                and \n                  Diane, V.-G.\n                (1997), \u201cEvaluating Dewey concepts as a knowledge base for automatic subject assignment\u201d, Proceedings of the Second ACM International Conference on Digital Libraries. Philadelphia, Pennsylvania, United States, ACM, New York, NY."},{"key":"key2022021920313417100_b28","doi-asserted-by":"crossref","unstructured":"Rolling, L.\n                (1981), \u201cIndexing consistency, quality and efficiency\u201d, Information Processing & Management, Vol. 17 No. 2, pp. 69-76.","DOI":"10.1016\/0306-4573(81)90028-5"},{"key":"key2022021920313417100_b29","unstructured":"Traugott, K.\n               , \n                  Anders, A.\n                and \n                  Koraljka, G.\n                (2004), \u201cBrowsing and searching behavior in the renardus web service a study based on log analysis\u201d, Proceedings of the 4th ACM\/IEEE-CS joint conference on Digital libraries. Tucson, AZ, USA, ACM, New York, NY."},{"key":"key2022021920313417100_b30","unstructured":"Vizine-Goetz, D.\n                (2010), \u201cClassify: a FRBR-based research prototype for applying classification numbers\u201d, OCLC NextSpace, 14, January, pp. 
14-15."},{"key":"key2022021920313417100_b31","doi-asserted-by":"crossref","unstructured":"Waltinger, U.\n               , \n                  Mehler, A.\n               , \n                  L\u00f6sch, M.\n                and \n                  Horstmann, W.\n                (2011), \u201cHierarchical classification of OAI metadata using the DDC taxonomy\u201d, in \n                  Bernardi, R.\n               , \n                  Chambers, S.\n               , \n                  Gottfried, B.\n               , \n                  Segond, F.\n                and \n                  Zaihrayeu, I.\n                (Eds), Advanced Language Technologies for Digital Libraries, Springer, Berlin\/Heidelberg.","DOI":"10.1007\/978-3-642-23160-5_3"},{"key":"key2022021920313417100_b32","doi-asserted-by":"crossref","unstructured":"Wang, J.\n                (2009), \u201cAn extensive study on automated Dewey Decimal Classification\u201d, Journal of the American Society for Information Science and Technology, Vol. 60 No. 11, pp. 2269-2286.","DOI":"10.1002\/asi.21147"},{"key":"key2022021920313417100_b33","unstructured":"Yi, K.\n                (2007), \u201cAutomated text classification using library classification schemes: trends, issues, and challenges\u201d, International Cataloguing and Bibliographic Control (ICBC), Vol. 36 No. 4, pp. 
78-82."}],"container-title":["Library Hi Tech"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/www.emeraldinsight.com\/doi\/full-xml\/10.1108\/LHT-03-2013-0030","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/LHT-03-2013-0030\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/LHT-03-2013-0030\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T22:14:10Z","timestamp":1753395250000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/lht\/article\/31\/4\/725-747\/263260"}},"subtitle":["A new concept matching-based approach"],"editor":[{"given":"Jane","family":"Greenberg, Eva Mendez Rodriguez and Gema Bueno de la Fuente","sequence":"first","affiliation":[]}],"short-title":[],"issued":{"date-parts":[[2013,11,18]]},"references-count":33,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2013,11,18]]}},"alternative-id":["10.1108\/LHT-03-2013-0030"],"URL":"https:\/\/doi.org\/10.1108\/lht-03-2013-0030","relation":{},"ISSN":["0737-8831"],"issn-type":[{"value":"0737-8831","type":"print"}],"subject":[],"published":{"date-parts":[[2013,11,18]]}}}