{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,11]],"date-time":"2025-09-11T19:45:33Z","timestamp":1757619933480,"version":"3.44.0"},"reference-count":40,"publisher":"Springer Science and Business Media LLC","issue":"8","license":[{"start":{"date-parts":[[2025,7,28]],"date-time":"2025-07-28T00:00:00Z","timestamp":1753660800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2025,7,28]],"date-time":"2025-07-28T00:00:00Z","timestamp":1753660800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"Tampere University"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Scientometrics"],"published-print":{"date-parts":[[2025,8]]},"abstract":"<jats:title>Abstract<\/jats:title>\n          <jats:p>The paper looks at the methodology of empirical analyses of the content and structure of Information Science (IS). The traditional approach in empirical analysis is intellectual content analysis (ICA) of a representative data set. The high labor cost prohibits the analysis of massive data sets. A recent alternative is based on data mining\/machine learning. Its strength is the capability of analyzing massive datasets efficiently. However, a significant issue is the quality of content analysis. The paper compares latent Dirichlet allocation\/topic modeling (LDA\/TM) based statistical analysis to ICA using the same data set, 1514 scholarly articles from the year 2015 volumes of 30 IS journals. The intellectual analysis provides the mirror for reflecting the TM results. LDA\/TM is strong in identifying new directions of a discipline and processing masses of text. 
Its weaknesses include semantic haziness of topics due to bag-of-words article representation, text pre-processing, tuning of parameters, and being unanalytic in composing topics from words belonging to different categories.<\/jats:p>","DOI":"10.1007\/s11192-025-05376-1","type":"journal-article","created":{"date-parts":[[2025,7,28]],"date-time":"2025-07-28T04:43:19Z","timestamp":1753677799000},"page":"4309-4337","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["Comparing representations of a discipline derived through LDA vs. intellectual content analysis: the case of information science"],"prefix":"10.1007","volume":"130","author":[{"ORCID":"https:\/\/orcid.org\/0009-0008-7591-4639","authenticated-orcid":false,"given":"Kaisa","family":"Ylikruuvi","sequence":"first","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7655-8930","authenticated-orcid":false,"given":"Kalervo","family":"J\u00e4rvelin","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4441-5393","authenticated-orcid":false,"given":"Pertti","family":"Vakkari","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2298-9553","authenticated-orcid":false,"given":"Martti","family":"Juhola","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2025,7,28]]},"reference":[{"key":"5376_CR1","doi-asserted-by":"publisher","first-page":"17","DOI":"10.1007\/978-3-319-73531-3_2","volume-title":"Machine learning for text","author":"CC Aggarwal","year":"2018","unstructured":"Aggarwal, C. C. (2018). Text preparation and similarity computation. Machine learning for text (pp. 17\u201330). Springer."},{"issue":"1","key":"5376_CR2","doi-asserted-by":"publisher","first-page":"27","DOI":"10.1177\/0961000611424819","volume":"44","author":"N Aharony","year":"2012","unstructured":"Aharony, N. (2012). 
Library and Information Science research areas: A content analysis of articles from the top 10 journals 2007\u20138. Journal of Librarianship and Information Science, 44(1), 27\u201335.","journal-title":"Journal of Librarianship and Information Science"},{"issue":"4","key":"5376_CR3","first-page":"1","volume":"42","author":"V Arman-Keown","year":"2020","unstructured":"Arman-Keown, V., & Patterson, L. (2020). Content analysis in library and information science research: An analysis of trends. Library and Information Science Research, 42(4), 1\u201312.","journal-title":"Library and Information Science Research"},{"key":"5376_CR4","first-page":"993","volume":"3","author":"DM Blei","year":"2003","unstructured":"Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. The Journal of Machine Learning Research, 3, 993\u20131022.","journal-title":"The Journal of Machine Learning Research"},{"issue":"1","key":"5376_CR5","doi-asserted-by":"publisher","first-page":"675","DOI":"10.1002\/(SICI)1097-4571(1999)50:8<675::AID-ASI5>3.0.CO;2-B","volume":"50","author":"V Cano","year":"1999","unstructured":"Cano, V. (1999). Bibliometric overview of library and information science research in Spain. Journal of the American Society for Information Science, 50(1), 675\u2013680.","journal-title":"Journal of the American Society for Information Science"},{"key":"5376_CR6","unstructured":"Chang, J., Gerrish, S., et al. (2009). Reading tea leaves: How humans interpret topic models. In Advances in neural information processing systems, pp. 288\u2013296."},{"issue":"3","key":"5376_CR7","doi-asserted-by":"publisher","first-page":"1589","DOI":"10.1007\/s11192-018-2822-7","volume":"116","author":"YW Chang","year":"2018","unstructured":"Chang, Y. W. (2018). Examining interdisciplinarity of library and information science (LIS) based on LIS articles contributed by LIS authors. 
Scientometrics, 116(3), 1589\u20131613.","journal-title":"Scientometrics"},{"issue":"3","key":"5376_CR8","doi-asserted-by":"publisher","first-page":"2071","DOI":"10.1007\/s11192-015-1762-8","volume":"105","author":"YW Chang","year":"2015","unstructured":"Chang, Y. W., Huang, M. H., & Lin, C. W. (2015). Evolution of research subjects in library and information science based on keyword, bibliographical coupling, and co-citation analyses. Scientometrics, 105(3), 2071\u20132087.","journal-title":"Scientometrics"},{"key":"5376_CR9","doi-asserted-by":"publisher","unstructured":"Chauhan, U., & Shah, A. (2021). Topic modeling using latent Dirichlet allocation: A survey. ACM Computing Surveys, 54(7), Article 145, 35 pages. https:\/\/doi.org\/10.1145\/3462478","DOI":"10.1145\/3462478"},{"issue":"1","key":"5376_CR10","doi-asserted-by":"publisher","first-page":"36","DOI":"10.1016\/j.lisr.2014.09.003","volume":"37","author":"H Chu","year":"2015","unstructured":"Chu, H. (2015). Research methods in library and information science: A content Analysis. Library & Information Science Research, 37(1), 36\u201341.","journal-title":"Library & Information Science Research"},{"key":"5376_CR11","unstructured":"Darling, W. M. (2011). A theoretical and practical implementation tutorial on topic modeling and gibbs sampling. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies (pp. 642\u2013647)."},{"issue":"3","key":"5376_CR12","doi-asserted-by":"publisher","first-page":"1507","DOI":"10.1007\/s11192-017-2432-9","volume":"112","author":"CG Figuerola","year":"2017","unstructured":"Figuerola, C. G., Garc\u00eda Marco, F. J., & Pinto, M. (2017). Mapping the evolution of library and information science (1978\u20132014) using topic modeling on LISA. Scientometrics, 112(3), 1507\u20131535.","journal-title":"Scientometrics"},{"key":"5376_CR13","doi-asserted-by":"crossref","unstructured":"Friedman, D. (2019). 
topicdoc: Topic-Specific Diagnostics for LDA and CTM Topic Models. R package version 0.1.0. https:\/\/CRAN.R-project.org\/package=Topicdoc.","DOI":"10.32614\/CRAN.package.topicdoc"},{"issue":"1\/2","key":"5376_CR14","doi-asserted-by":"publisher","first-page":"74","DOI":"10.1108\/NLW-08-2015-0055","volume":"117","author":"V Gauchi Risso","year":"2016","unstructured":"Gauchi Risso, V. (2016). Research methods used in library and information science during the 1970\u20132010. New Library World, 117(1\/2), 74\u201393.","journal-title":"New Library World"},{"issue":"13","key":"5376_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v040.i13","volume":"40","author":"B Gr\u00fcn","year":"2011","unstructured":"Gr\u00fcn, B., & Hornik, K. (2011). topicmodels: An R Package for Fitting Topic Models. Journal of Statistical Software, 40(13), 1\u201330.","journal-title":"Journal of Statistical Software"},{"issue":"3","key":"5376_CR16","doi-asserted-by":"publisher","first-page":"2561","DOI":"10.1007\/s11192-020-03721-0","volume":"125","author":"X Han","year":"2020","unstructured":"Han, X. (2020). Evolution of research topics in LIS between 1996 and 2019: an analysis based on latent Dirichlet allocation topic model. Scientometrics, 125(3), 2561\u20132595.","journal-title":"Scientometrics"},{"key":"5376_CR17","doi-asserted-by":"crossref","unstructured":"Hvitfeldt, E., & Silge, J. (2022). Supervised Machine Learning for Text Analysis in R. https:\/\/smltar.com\/","DOI":"10.1201\/9781003093459"},{"key":"5376_CR18","volume-title":"The turn: Integration of information seeking and retrieval in context","author":"P Ingwersen","year":"2005","unstructured":"Ingwersen, P., & J\u00e4rvelin, K. (2005). The turn: Integration of information seeking and retrieval in context. Springer."},{"issue":"4","key":"5376_CR19","first-page":"395","volume":"12","author":"K J\u00e4rvelin","year":"1990","unstructured":"J\u00e4rvelin, K., & Vakkari, P. (1990). 
Content analysis of research articles in library and information science. Library and Information Science Research, 12(4), 395\u2013421.","journal-title":"Library and Information Science Research"},{"issue":"7","key":"5376_CR20","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1108\/JD-03-2021-0062","volume":"78","author":"K J\u00e4rvelin","year":"2022","unstructured":"J\u00e4rvelin, K., & Vakkari, P. (2022). LIS research across 50 years: Content analysis of journal articles. Journal of Documentation, 78(7), 65\u201388. https:\/\/doi.org\/10.1108\/JD-03-2021-0062","journal-title":"Journal of Documentation"},{"issue":"4","key":"5376_CR21","doi-asserted-by":"publisher","first-page":"548","DOI":"10.1016\/j.lisr.2006.03.018","volume":"28","author":"S-J Kim","year":"2006","unstructured":"Kim, S.-J., & Jeong, D. Y. (2006). An analysis of the development and use of theory in library and information science research articles. Library & Information Science Research, 28(4), 548\u2013562. https:\/\/doi.org\/10.1016\/j.lisr.2006.03.018","journal-title":"Library & Information Science Research"},{"key":"5376_CR22","doi-asserted-by":"publisher","first-page":"278","DOI":"10.1016\/j.acalib.2019.04.001","volume":"45","author":"G Liu","year":"2019","unstructured":"Liu, G., & Yang, L. (2019). Popular research topics in the recent journal publications of library and information science. The Journal of Academic Librarianship, 45, 278\u2013287.","journal-title":"The Journal of Academic Librarianship"},{"issue":"3","key":"5376_CR23","doi-asserted-by":"publisher","first-page":"15","DOI":"10.22452\/mjlis.vol25no3.2","volume":"25","author":"BD Lund","year":"2020","unstructured":"Lund, B. D. (2020). Who really contributes to information science research? An analysis of disciplinarity and nationality of contributors to ten top journals. 
Malaysian Journal of Library and Information Science, 25(3), 15\u201329.","journal-title":"Malaysian Journal of Library and Information Science"},{"issue":"8","key":"5376_CR24","doi-asserted-by":"publisher","first-page":"1059","DOI":"10.1002\/asi.24474","volume":"72","author":"J Ma","year":"2021","unstructured":"Ma, J., & Lund, B. (2021). The evolution and shift of research topics and methods in library and information science. Journal of the Association for Information Science and Technology, 72(8), 1059\u20131074.","journal-title":"Journal of the Association for Information Science and Technology"},{"key":"5376_CR25","unstructured":"Mayo, M. (2017). A general approach to pre-processing text data. Retrieved March 08, 2022, from https:\/\/www.kdnuggets.com\/2017\/12\/general-approachpre-processing-text-data.html."},{"key":"5376_CR26","unstructured":"Mimno, D., Wallach, H., Talley, E., Leenders, M., & McCallum, A. (2011). Optimizing semantic coherence in topic models. In Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (pp. 262\u2013272)."},{"issue":"1","key":"5376_CR27","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1007\/s11192-020-03657-5","volume":"125","author":"Y Miyata","year":"2020","unstructured":"Miyata, Y., Ishita, E., Yang, F., Yamamoto, M., Iwase, A., & Kurata, K. (2020). Knowledge structure transition in library and information science: topic modeling and visualization. Scientometrics, 125(1), 665\u2013687.","journal-title":"Scientometrics"},{"issue":"4","key":"5376_CR28","doi-asserted-by":"publisher","first-page":"256","DOI":"10.1177\/0961000610380820","volume":"42","author":"G Prebor","year":"2010","unstructured":"Prebor, G. (2010). Analysis of the interdisciplinary nature of library and information science. Journal of Librarianship and Information Science, 42(4), 256\u2013267.","journal-title":"Journal of Librarianship and Information Science"},{"key":"5376_CR29","unstructured":"Saracevic, T. (1992). 
Information science: origin, evolution and relations. In Conceptions of library and information science: historical, empirical and theoretical perspectives (pp.\u00a05\u201327). Taylor Graham."},{"key":"5376_CR30","unstructured":"Saracevic, T. (1997). The stratified model of information retrieval interaction: Extension and applications. In Proceedings of the ASIS Annual Meeting, 34, pp. 313\u2013327. Learned Information."},{"issue":"1","key":"5376_CR31","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1002\/asi.21435","volume":"62","author":"CR Sugimoto","year":"2011","unstructured":"Sugimoto, C. R., et al. (2011). The shifting sands of disciplinary development: Analysing North American Library and Information Science dissertations using latent Dirichlet allocation. Journal of the American Society for Information Science and Technology, 62(1), 185\u2013204.","journal-title":"Journal of the American Society for Information Science and Technology"},{"issue":"7","key":"5376_CR32","first-page":"1446","volume":"65","author":"O Tuomaala","year":"2014","unstructured":"Tuomaala, O., J\u00e4rvelin, K., & Vakkari, P. (2014). Evolution of library and information science, 1965\u20132005: Content analysis of journal articles. Journal of the American Society for Information Science and Technology, 65(7), 1446\u20131462.","journal-title":"Journal of the American Society for Information Science and Technology"},{"issue":"1","key":"5376_CR33","doi-asserted-by":"publisher","first-page":"575","DOI":"10.1007\/s11192-020-03471-z","volume":"124","author":"C Urbano","year":"2020","unstructured":"Urbano, C., & Ardanuy, J. (2020). Cross-disciplinary collaboration versus coexistence in LIS serials: Analysis of authorship affiliations in four European countries. 
Scientometrics, 124(1), 575\u20136021.","journal-title":"Scientometrics"},{"key":"5376_CR34","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1108\/S0065-2830(1994)0000018003","volume-title":"Advances in librarianship advances in librarianship","author":"P Vakkari","year":"1994","unstructured":"Vakkari, P. (1994). Library and information science: Its content and scope. In I. P. Godden (Ed.), Advances in librarianship advances in librarianship (Vol. 18, pp. 1\u201355). Emerald."},{"issue":"12","key":"5376_CR35","doi-asserted-by":"publisher","first-page":"4499","DOI":"10.1002\/asi.24690","volume":"73","author":"P Vakkari","year":"2022","unstructured":"Vakkari, P., Chang, Y.-W., & J\u00e4rvelin, K. (2022a). Disciplinary contributions to research topics and methodology in Library and Information Science\u2014Leading to fragmentation? Journal of the Association for Information Science and Technology, 73(12), 4499\u20134522. https:\/\/doi.org\/10.1002\/asi.24690","journal-title":"Journal of the Association for Information Science and Technology"},{"issue":"8","key":"5376_CR36","doi-asserted-by":"publisher","first-page":"4499","DOI":"10.1007\/s11192-022-04452-0","volume":"127","author":"P Vakkari","year":"2022","unstructured":"Vakkari, P., Chang, Y.-W., & J\u00e4rvelin, K. (2022b). Largest contribution to LIS by external disciplines as measured by the characteristics of research articles. Scientometrics, 127(8), 4499\u20134522. https:\/\/doi.org\/10.1007\/s11192-022-04452-0","journal-title":"Scientometrics"},{"issue":"7","key":"5376_CR37","doi-asserted-by":"publisher","first-page":"811","DOI":"10.1002\/asi.24757","volume":"74","author":"P Vakkari","year":"2023","unstructured":"Vakkari, P., Chang, Y.-W., & J\u00e4rvelin, K. (2023). The association of disciplinary background with the evolution of topics and methods in library and information science research 1995\u20132015. Journal of the Association for Information Science and Technology, 74(7), 811\u2013827. 
https:\/\/doi.org\/10.1002\/asi.24757","journal-title":"Journal of the Association for Information Science and Technology"},{"key":"5376_CR38","first-page":"389","volume-title":"The study of information Interdisciplinary messages","author":"P Wilson","year":"1983","unstructured":"Wilson, P. (1983). Bibliographical R&D. In F. Machlup & U. Mansfield (Eds.), The study of information Interdisciplinary messages (pp. 389\u2013397). Wiley."},{"key":"5376_CR39","unstructured":"Ylikruuvi, K. (2023). Automating the discipline analysis with latent Dirichlet allocation: A case study on 30 core journals of library and information science published in 2015. Tampere university, Faculty of information technology and communication sciences, Master\u2019s Thesis, May 2023, https:\/\/urn.fi\/URN:NBN:fi:tuni-202306086623 ."},{"key":"5376_CR40","doi-asserted-by":"publisher","first-page":"3433","DOI":"10.1007\/s11192-024-05048-6","volume":"129","author":"Y Zhang","year":"2024","unstructured":"Zhang, Y., & Zhang, C. (2024). Extracting problem and method sentence from scientific papers: A context-enhanced transformer using formulaic expression desensitization. Scientometrics, 129, 3433\u20133468. 
https:\/\/doi.org\/10.1007\/s11192-024-05048-6","journal-title":"Scientometrics"}],"container-title":["Scientometrics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11192-025-05376-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11192-025-05376-1\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11192-025-05376-1.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,9,8]],"date-time":"2025-09-08T03:55:51Z","timestamp":1757303751000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11192-025-05376-1"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,7,28]]},"references-count":40,"journal-issue":{"issue":"8","published-print":{"date-parts":[[2025,8]]}},"alternative-id":["5376"],"URL":"https:\/\/doi.org\/10.1007\/s11192-025-05376-1","relation":{},"ISSN":["0138-9130","1588-2861"],"issn-type":[{"type":"print","value":"0138-9130"},{"type":"electronic","value":"1588-2861"}],"subject":[],"published":{"date-parts":[[2025,7,28]]},"assertion":[{"value":"23 July 2024","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"20 June 2025","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 July 2025","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethical approval"}},{"value":"Not applicable.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent to participate"}},{"value":"Not applicable.","order":5,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}}]}}