{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T17:11:08Z","timestamp":1780506668203,"version":"3.54.1"},"reference-count":62,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2023,12,29]],"date-time":"2023-12-29T00:00:00Z","timestamp":1703808000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>Educational content recommendation is a cornerstone of AI-enhanced learning. In particular, to facilitate navigating the diverse learning resources available on learning platforms, methods are needed for automatically linking learning materials, e.g., in order to recommend textbook content based on exercises. Such methods are typically based on semantic textual similarity (STS) and the use of embeddings for text representation. However, it remains unclear what types of embeddings should be used for this task. In this study, we carry out an extensive empirical evaluation of embeddings derived from three different types of models: (i) static embeddings trained using a concept-based knowledge graph, (ii) contextual embeddings from a pre-trained language model, and (iii) contextual embeddings from a large language model (LLM). In addition to evaluating the models individually, various ensembles are explored based on different strategies for combining two models in an early vs. late fusion fashion. The evaluation is carried out using digital textbooks in Swedish for three different subjects and two types of exercises. The results show that using contextual embeddings from an LLM leads to superior performance compared to the other models, and that there is no significant improvement when combining these with static embeddings trained using a knowledge graph. When using embeddings derived from a smaller language model, however, it helps to combine them with knowledge graph embeddings. The performance of the best-performing model is high for both types of exercises, resulting in a mean Recall@3 of 0.96 and 0.95 and a mean MRR of 0.87 and 0.86 for quizzes and study questions, respectively, demonstrating the feasibility of using STS based on text embeddings for educational content recommendation. The ability to link digital learning materials in an unsupervised manner\u2014relying only on readily available pre-trained models\u2014facilitates the development of AI-enhanced learning.<\/jats:p>","DOI":"10.3390\/fi16010012","type":"journal-article","created":{"date-parts":[[2023,12,29]],"date-time":"2023-12-29T03:28:41Z","timestamp":1703820521000},"page":"12","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":28,"title":["Evaluating Embeddings from Pre-Trained Language Models and Knowledge Graphs for Educational Content Recommendation"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7860-1784","authenticated-orcid":false,"given":"Xiu","family":"Li","sequence":"first","affiliation":[{"name":"Department of Computer and Systems Sciences, Stockholm University, NOD-Huset, Borgarfjordsgatan 12, 16455 Stockholm, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Aron","family":"Henriksson","sequence":"additional","affiliation":[{"name":"Department of Computer and Systems Sciences, Stockholm University, NOD-Huset, Borgarfjordsgatan 12, 16455 Stockholm, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Martin","family":"Duneld","sequence":"additional","affiliation":[{"name":"Department of Computer and Systems Sciences, Stockholm University, NOD-Huset, Borgarfjordsgatan 12, 16455 Stockholm, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Jalal","family":"Nouri","sequence":"additional","affiliation":[{"name":"Department of Computer and Systems Sciences, Stockholm University, NOD-Huset, Borgarfjordsgatan 12, 16455 Stockholm, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0945-707X","authenticated-orcid":false,"given":"Yongchao","family":"Wu","sequence":"additional","affiliation":[{"name":"Department of Computer and Systems Sciences, Stockholm University, NOD-Huset, Borgarfjordsgatan 12, 16455 Stockholm, Sweden"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2023,12,29]]},"reference":[{"key":"ref_1","unstructured":"Thaker, K., Zhang, L., He, D., and Brusilovsky, P. (2020, January 10\u201313). Recommending Remedial Readings Using Student Knowledge State. Proceedings of the International Conference on Educational Data Mining (EDM), Online."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1109\/TLT.2022.3225206","article-title":"Adaptive Learning Support System Based on Automatic Recommendation of Personalized Review Materials","volume":"16","author":"Okubo","year":"2022","journal-title":"IEEE Trans. Learn. Technol."},{"key":"ref_3","unstructured":"Rahdari, B., Brusilovsky, P., Thaker, K., and Barria-Pineda, J. (2020, January 6). Using knowledge graph for explainable recommendation of external content in electronic textbooks. Proceedings of the Second International Workshop on Intelligent Textbooks 2020, Ifrane, Morocco."},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Rahdari, B., Brusilovsky, P., Thaker, K., and Barria-Pineda, J. (2020, January 14\u201318). Knowledge-driven wikipedia article recommendation for electronic textbooks. Proceedings of the European Conference on Technology Enhanced Learning, Heidelberg, Germany.","DOI":"10.1007\/978-3-030-57717-9_28"},{"key":"ref_5","unstructured":"Barria-Pineda, J., Narayanan, A.B.L., and Brusilovsky, P. (2022). Augmenting Digital Textbooks with Reusable Smart Learning Content: Solutions and Challenges, EasyChair. Technical Report."},{"key":"ref_6","unstructured":"Herlinda, R. (2014, January 7\u20139). The use of textbook in teaching and learning process. Proceedings of the 6th TEFLIN International Conference, Surakarta, Indonesia."},{"key":"ref_7","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_8","unstructured":"OpenAI (2023). GPT-4 Technical Report. arXiv."},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Li\u00e9tard, B., Abdou, M., and S\u00f8gaard, A. (2021). Do Language Models Know the Way to Rome?. arXiv.","DOI":"10.18653\/v1\/2021.blackboxnlp-1.40"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Hadi, M.U., Qureshi, R., Shah, A., Irfan, M., Zafar, A., Shaikh, M., Akhtar, N., Wu, J., and Mirjalili, S. (2023). A survey on large language models: Applications, challenges, limitations, and practical usage. TechRxiv.","DOI":"10.36227\/techrxiv.23589741.v1"},{"key":"ref_11","unstructured":"Zhao, X., Lu, J., Deng, C., Zheng, C., Wang, J., Chowdhury, T., Yun, L., Cui, H., Xuchao, Z., and Zhao, T. (2023). Domain Specialization as the Key to Make Large Language Models Disruptive: A Comprehensive Survey. arXiv."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Rajabi, E., and Etminani, K. (2022). Knowledge-graph-based explainable AI: A systematic review. J. Inf. Sci.","DOI":"10.1177\/01655515221112844"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Zhang, F., Yuan, N.J., Lian, D., Xie, X., and Ma, W.Y. (2016, January 13\u201317). Collaborative knowledge base embedding for recommender systems. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939673"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"494","DOI":"10.1109\/TNNLS.2021.3070843","article-title":"A survey on knowledge graphs: Representation, acquisition, and applications","volume":"33","author":"Ji","year":"2021","journal-title":"IEEE Trans. Neural Netw. Learn. Syst."},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Xiong, C., Power, R., and Callan, J. (2017, January 3\u20137). Explicit semantic ranking for academic search via knowledge graph embedding. Proceedings of the 6th International Conference on World Wide Web, Perth, Australia.","DOI":"10.1145\/3038912.3052558"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Li, X., Henriksson, A., Nouri, J., Duneld, M., and Wu, Y. (2023, January 12\u201314). Linking Swedish Learning Materials to Exercises through an AI-Enhanced der System. Proceedings of the International Conference in Methodologies and Intelligent Systems for Techhnology Enhanced Learning, Guimaraes, Portugal.","DOI":"10.1007\/978-3-031-41226-4_10"},{"key":"ref_17","unstructured":"Le, Q., and Mikolov, T. (June, January 21\u2013). Distributed representations of sentences and documents. Proceedings of the International Conference on Machine Learning, Beijing, China."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Niu, Y., Lin, R., and Xue, H. (2023). Research on Learning Resource Recommendation Based on Knowledge Graph and Collaborative Filtering. Appl. Sci., 13.","DOI":"10.3390\/app131910933"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"16","DOI":"10.1109\/ACCESS.2021.3137960","article-title":"Combining Citation Network Information and Text Similarity for Research Article Recommender Systems","volume":"10","author":"Sterling","year":"2021","journal-title":"IEEE Access"},{"key":"ref_20","unstructured":"Ostendorff, M. (2020). Contextual document similarity for content-based literature recommender systems. arXiv."},{"key":"ref_21","doi-asserted-by":"crossref","first-page":"827","DOI":"10.1109\/TKDE.2019.2895033","article-title":"A hybrid e-learning recommendation approach based on learners\u2019 influence propagation","volume":"32","author":"Wan","year":"2019","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"34166","DOI":"10.1109\/ACCESS.2018.2850376","article-title":"A personalized group-based recommendation approach for Web search in E-learning","volume":"6","author":"Rahman","year":"2018","journal-title":"IEEE Access"},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"78","DOI":"10.1145\/2629489","article-title":"Wikidata: A free collaborative knowledgebase","volume":"57","year":"2014","journal-title":"Commun. ACM"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Suchanek, F.M., Kasneci, G., and Weikum, G. (2007, January 8\u201312). Yago: A core of semantic knowledge. Proceedings of the 16th International Conference on World Wide Web, Banff, AB, Canada.","DOI":"10.1145\/1242572.1242667"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Bollacker, K., Evans, C., Paritosh, P., Sturge, T., and Taylor, J. (2008, January 10\u201312). Freebase: A collaboratively created graph database for structuring human knowledge. Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, Vancouver, BC, Canada.","DOI":"10.1145\/1376616.1376746"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Auer, S., Bizer, C., Kobilarov, G., Lehmann, J., Cyganiak, R., and Ives, Z. (2007, January 11\u201315). Dbpedia: A nucleus for a web of open data. Proceedings of the International Semantic Web Conference, Busan, Republic of Korea.","DOI":"10.1007\/978-3-540-76298-0_52"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Speer, R., Chin, J., and Havasi, C. (2017, January 4\u20139). Conceptnet 5. 5: An open multilingual graph of general knowledge. Proceedings of the AAAI Conference on Artificial Intelligence, San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11164"},{"key":"ref_28","first-page":"2787","article-title":"Translating embeddings for modeling multi-relational data","volume":"26","author":"Bordes","year":"2013","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_29","unstructured":"Hamilton, W., Ying, Z., and Leskovec, J. (2017). Inductive representation learning on large graphs. arXiv."},{"key":"ref_30","doi-asserted-by":"crossref","unstructured":"Schlichtkrull, M., Kipf, T.N., Bloem, P., Van Den Berg, R., Titov, I., and Welling, M. (2018, January 3\u20137). Modeling relational data with graph convolutional networks. Proceedings of the Semantic Web: 15th International Conference, ESWC 2018, Heraklion, Crete, Greece.","DOI":"10.1007\/978-3-319-93417-4_38"},{"key":"ref_31","doi-asserted-by":"crossref","unstructured":"Grover, A., and Leskovec, J. (2016, January 13\u201317). node2vec: Scalable feature learning for networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939754"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Wang, X., He, X., Cao, Y., Liu, M., and Chua, T.S. (2019, January 4\u20138). Kgat: Knowledge graph attention network for recommendation. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA.","DOI":"10.1145\/3292500.3330989"},{"key":"ref_33","unstructured":"Wang, Z., Li, J., Liu, Z., and Tang, J. (2016, January 9\u201315). Text-enhanced representation learning for knowledge graph. Proceedings of the International Joint Conference on Artificial Intelligent (IJCAI), New York, NY, USA."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"126511","DOI":"10.1016\/j.neucom.2023.126511","article-title":"Enhancing text representations separately with entity descriptions","volume":"552","author":"Zhao","year":"2023","journal-title":"Neurocomputing"},{"key":"ref_35","unstructured":"Yu, D., Zhu, C., Yang, Y., and Zeng, M. (March, January 22). Jaket: Joint pre-training of knowledge graph and language understanding. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Yamada, I., Asai, A., Shindo, H., Takeda, H., and Matsumoto, Y. (2020). Luke: Deep contextualized entity representations with entity-aware self-attention. arXiv.","DOI":"10.18653\/v1\/2020.emnlp-main.523"},{"key":"ref_37","unstructured":"El Boukkouri, H., Ferret, O., Lavergne, T., and Zweigenbaum, P. (August, January 28). Embedding strategies for specialized domains: Application to clinical entity recognition. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Florence, Italy."},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wang, B., Shen, T., Long, G., Zhou, T., Wang, Y., and Chang, Y. (2021, January 19\u201313). Structure-augmented text representation learning for efficient knowledge graph completion. Proceedings of the Web Conference 2021, Ljubljana, Slovenia.","DOI":"10.1145\/3442381.3450043"},{"key":"ref_39","unstructured":"Murom\u00e4gi, A., Sirts, K., and Laur, S. (2017). Linear ensembles of word embedding models. arXiv."},{"key":"ref_40","unstructured":"Gammelgaard, M.L., Christiansen, J.G., and S\u00f8gaard, A. (2023). Large language models converge toward human-like concept organization. arXiv."},{"key":"ref_41","unstructured":"Goossens, S. (2023, November 28). A Guide to Building Document Embeddings. Available online: https:\/\/radix.ai\/blog\/2021\/3\/a-guide-to-building-document-embeddings-part-1\/."},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"105151","DOI":"10.1016\/j.engappai.2022.105151","article-title":"Ensemble deep learning: A review","volume":"115","author":"Ganaie","year":"2022","journal-title":"Eng. Appl. Artif. Intell."},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Faruqui, M., Dodge, J., Jauhar, S.K., Dyer, C., Hovy, E., and Smith, N.A. (2014). Retrofitting word vectors to semantic lexicons. arXiv.","DOI":"10.3115\/v1\/N15-1184"},{"key":"ref_44","doi-asserted-by":"crossref","unstructured":"Fang, L., Luo, Y., Feng, K., Zhao, K., and Hu, A. (2019, January 13\u201317). Knowledge-enhanced ensemble learning for word embeddings. Proceedings of the World Wide Web Conference, San Francisco, CA, USA.","DOI":"10.1145\/3308558.3313425"},{"key":"ref_45","first-page":"5534","article-title":"A Knowledge-Enriched Ensemble Method for Word Embedding and Multi-Sense Embedding","volume":"35","author":"Fang","year":"2023","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_46","doi-asserted-by":"crossref","first-page":"10098","DOI":"10.1109\/TKDE.2023.3250499","article-title":"Knowledge graph augmented network towards multiview representation learning for aspect-based sentiment analysis","volume":"35","author":"Zhong","year":"2023","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_47","doi-asserted-by":"crossref","unstructured":"Ri, R., Yamada, I., and Tsuruoka, Y. (2021). mLUKE: The power of entity representations in multilingual pretrained language models. arXiv.","DOI":"10.18653\/v1\/2022.acl-long.505"},{"key":"ref_48","doi-asserted-by":"crossref","unstructured":"Peters, M.E., Neumann, M., Logan IV, R.L., Schwartz, R., Joshi, V., Singh, S., and Smith, N.A. (2019). Knowledge enhanced contextual word representations. arXiv.","DOI":"10.18653\/v1\/D19-1005"},{"key":"ref_49","unstructured":"Malmsten, M., B\u00f6rjeson, L., and Haffenden, C. (2020). Playing with Words at the National Library of Sweden\u2013Making a Swedish BERT. arXiv."},{"key":"ref_50","unstructured":"Rekathati, F. (2023, November 28). The KBLab Blog: Introducing a Swedish Sentence Transformer. Available online: https:\/\/kb-labb.github.io\/posts\/2021-08-23-a-swedish-sentencetransformer."},{"key":"ref_51","unstructured":"Isbister, T., and Sahlgren, M. (2020). Why not simply translate? A first Swedish evaluation benchmark for semantic similarity. arXiv."},{"key":"ref_52","first-page":"1877","article-title":"Language models are few-shot learners","volume":"33","author":"Brown","year":"2020","journal-title":"Adv. Neural Inf. Process. Syst."},{"key":"ref_53","unstructured":"Neelakantan, A., Xu, T., Puri, R., Radford, A., Han, J.M., Tworek, J., Yuan, Q., Tezak, N., Kim, J.W., and Hallacy, C. (2022). Text and code embeddings by contrastive pre-training. arXiv."},{"key":"ref_54","unstructured":"Greene, R., Sanders, T., Weng, L., and Neelakantan, A. (2023, November 28). New and Improved Embedding Model. OpenAI Blog. Available online: https:\/\/openai.com\/blog\/new-and-improved-embedding-model."},{"key":"ref_55","unstructured":"Ekgren, A., Gyllensten, A.C., Gogoulou, E., Heiman, A., Verlinden, S., \u00d6hman, J., Carlsson, F., and Sahlgren, M. (2022, January 20\u201325). Lessons Learned from GPT-SW3: Building the First Large-Scale Generative Language Model for Swedish. Proceedings of the Thirteenth Language Resources and Evaluation Conference, Marseille, France."},{"key":"ref_56","unstructured":"Mihalcea, R., Corley, C., and Strapparava, C. (2006, January 16\u201320). Corpus-based and knowledge-based measures of text semantic similarity. Proceedings of the AAAI, Boston, MA, USA."},{"key":"ref_57","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019). Sentence-bert: Sentence embeddings using siamese bert-networks. arXiv.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_58","doi-asserted-by":"crossref","unstructured":"Su, Y., and Kabala, Z.J. (2023). Public Perception of ChatGPT and Transfer Learning for Tweets Sentiment Analysis Using Wolfram Mathematica. Data, 8.","DOI":"10.3390\/data8120180"},{"key":"ref_59","doi-asserted-by":"crossref","unstructured":"Meng, R., Zhao, S., Han, S., He, D., Brusilovsky, P., and Chi, Y. (2017). Deep keyphrase generation. arXiv.","DOI":"10.18653\/v1\/P17-1054"},{"key":"ref_60","doi-asserted-by":"crossref","unstructured":"Li, X., Nouri, J., Henriksson, A., Duneld, M., and Wu, Y. (2022, January 12\u201314). Automatic Educational Concept Extraction Using NLP. Proceedings of the International Conference in Methodologies and Intelligent Systems for Techhnology Enhanced Learning, L\u2019Aquila, Italy.","DOI":"10.1007\/978-3-031-20617-7_17"},{"key":"ref_61","doi-asserted-by":"crossref","unstructured":"Ferragina, P., and Scaiella, U. (2010, January 26\u201330). Tagme: On-the-fly annotation of short text fragments (by wikipedia entities). Proceedings of the 19th ACM International Conference on INFORMATION and Knowledge Management, Toronto, ON, Canada.","DOI":"10.1145\/1871437.1871689"},{"key":"ref_62","unstructured":"Bougouin, A., Boudin, F., and Daille, B. (2013, January 14\u201319). Topicrank: Graph-based topic ranking for keyphrase extraction. Proceedings of the International Joint Conference on Natural Language Processing (IJCNLP), Nagoya, Japan."}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/16\/1\/12\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:43:51Z","timestamp":1760132631000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/16\/1\/12"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,12,29]]},"references-count":62,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,1]]}},"alternative-id":["fi16010012"],"URL":"https:\/\/doi.org\/10.3390\/fi16010012","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,12,29]]}}}