{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T16:14:04Z","timestamp":1778084044215,"version":"3.51.4"},"reference-count":44,"publisher":"Springer Science and Business Media LLC","issue":"3","license":[{"start":{"date-parts":[[2026,3,1]],"date-time":"2026-03-01T00:00:00Z","timestamp":1772323200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T00:00:00Z","timestamp":1772755200000},"content-version":"vor","delay-in-days":5,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004329","name":"The Slovenian Research and Innovation Agency","doi-asserted-by":"publisher","award":["PR-12394"],"award-info":[{"award-number":["PR-12394"]}],"id":[{"id":"10.13039\/501100004329","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100003407","name":"Ministero dell\u2019Istruzione, dell\u2019Universit\u00e0 e della Ricerca","doi-asserted-by":"publisher","award":["MIUR_PRIN 2020 2020ZSL9F9"],"award-info":[{"award-number":["MIUR_PRIN 2020 2020ZSL9F9"]}],"id":[{"id":"10.13039\/501100003407","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Mach Learn"],"published-print":{"date-parts":[[2026,3]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:p>Building on the success of large language models (LLMs), LLM-based representations have dominated the document representation landscape, achieving strong performance on document embedding benchmarks. However, high-dimensional, computationally expensive LLM embeddings can be too generic or inefficient for domain-specific and resource-scarce applications. 
To address these limitations, we introduce FuDoBa\u2014a Bayesian optimisation-based representation learning method that integrates LLM embeddings with domain-specific structured knowledge, sourced both locally and from external repositories such as WikiData. This fusion produces low-dimensional, task-relevant representations while reducing training complexity and yielding interpretable early-fusion weights for improved classification performance. We demonstrate the effectiveness of our approach on six datasets across two domains, showing that when paired with robust AutoML-based classifiers, our method performs on par with, or surpasses, proprietary LLM-only embedding baselines, while offering modality-wise interpretability and a smaller dimensional footprint.<\/jats:p>","DOI":"10.1007\/s10994-026-07008-y","type":"journal-article","created":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T13:12:55Z","timestamp":1772802775000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":0,"title":["FuDoBa: Fusing Document and Knowledge Graph Based Representations with Bayesian Optimisation"],"prefix":"10.1007","volume":"115","author":[{"given":"Boshko","family":"Koloski","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Senja","family":"Pollak","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Roberto","family":"Navigli","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bla\u017e","family":"\u0160krlj","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2026,3,6]]},"reference":[{"key":"7008_CR1","doi-asserted-by":"crossref","unstructured":"Aggarwal, C.C., Hinneburg, A., Keim, D.A. (2001). On the surprising behavior of distance metrics in high dimensional spaces. 
In: Proceedings of the 8th international conference on database theory. (pp. 420\u2013434). Springer.","DOI":"10.1007\/3-540-44503-X_27"},{"key":"7008_CR2","doi-asserted-by":"crossref","unstructured":"Barba, E., Orlando, R., Cabot, P.-L.H., & Navigli, R. (2024). ReLiK: Retrieve, read and link: Fast and accurate entity linking and relation extraction on an academic budget. https:\/\/openreview.net\/forum?id=b0IRscfEOb","DOI":"10.18653\/v1\/2024.findings-acl.839"},{"key":"7008_CR3","unstructured":"BehnamGhader, P., Adlakha, V., Mosbach, M., Bahdanau, D., Chapados, N., & Reddy, S. (2024). LLM2Vec: Large language models are secretly powerful text encoders. In: First conference on language modeling."},{"key":"7008_CR4","doi-asserted-by":"publisher","first-page":"785","DOI":"10.1145\/2939672.2939785","volume-title":"Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining","author":"T Chen","year":"2016","unstructured":"Chen, T., & Guestrin, C. (2016). Xgboost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining (pp. 785\u2013794). ACM."},{"issue":"12","key":"7008_CR5","doi-asserted-by":"publisher","first-page":"9205","DOI":"10.1109\/JIOT.2021.3093065","volume":"9","author":"Q Chen","year":"2022","unstructured":"Chen, Q., Wang, W., Huang, K., & Coenen, F. (2022). Zero-shot text classification via knowledge graph embedding for social media data. IEEE Internet of Things Journal, 9(12), 9205\u20139213. https:\/\/doi.org\/10.1109\/JIOT.2021.3093065","journal-title":"IEEE Internet of Things Journal"},{"key":"7008_CR6","doi-asserted-by":"crossref","unstructured":"Cocchi, F., Moratelli, N., Cornia, M., Baraldi, L., & Cucchiara, R. (2025). 
Augmenting multimodal LLMs with self-reflective tokens for knowledge-based visual question answering.","DOI":"10.1109\/CVPR52734.2025.00859"},{"key":"7008_CR7","first-page":"1","volume":"7","author":"J Dem\u0161ar","year":"2006","unstructured":"Dem\u0161ar, J. (2006). Statistical comparisons of classifiers over multiple data sets. Journal of Machine Learning Research, 7, 1\u201330.","journal-title":"Journal of Machine Learning Research"},{"key":"7008_CR8","series-title":"Long and Short Papers","doi-asserted-by":"publisher","first-page":"4171","DOI":"10.18653\/v1\/N19-1423","volume-title":"Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies","author":"J Devlin","year":"2019","unstructured":"Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of deep bidirectional transformers for language understanding. In J. Burstein, C. Doran, & T. Solorio (Eds.), Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies (Vol. 1: Long and Short Papers, pp. 4171\u20134186). Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/N19-1423"},{"key":"7008_CR9","unstructured":"Erickson, N., Mueller, J., Shirkov, A., Zhang, H., Larroy, P., Li, M., & Smola, A. (2020). Autogluon-tabular: Robust and accurate automl for structured data. Preprint retrieved from https:\/\/arxiv.org\/abs\/2003.06505"},{"key":"7008_CR10","first-page":"4186","volume-title":"Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP)","author":"A Fan","year":"2019","unstructured":"Fan, A., Gardent, C., Braud, C., & Bordes, A. (2019). Using local knowledge graph construction to scale Seq2Seq models to multi-document inputs. In K. Inui, J. Jiang, V. Ng, & X. 
Wan (Eds.), Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP) (pp. 4186\u20134196). Association for Computational Linguistics."},{"key":"7008_CR11","doi-asserted-by":"publisher","first-page":"6894","DOI":"10.18653\/v1\/2021.emnlp-main.552","volume-title":"Proceedings of the 2021 conference on empirical methods in natural language processing","author":"T Gao","year":"2021","unstructured":"Gao, T., Yao, X., & Chen, D. (2021). SimCSE: Simple contrastive learning of sentence embeddings. In M. F. Moens, X. Huang, L. Specia, & S. W. T. Yih (Eds.), Proceedings of the 2021 conference on empirical methods in natural language processing (pp. 6894\u20136910). Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2021.emnlp-main.552"},{"key":"7008_CR12","unstructured":"Grattafiori, A., Dubey, A., Jauhri, A., Pandey, A., Kadian, A., Al-Dahle, A., Letman, A., Mathur, A., Schelten, A., Vaughan, A. & Yang, A. (2024). The Llama 3 Herd of models"},{"key":"7008_CR13","doi-asserted-by":"publisher","first-page":"319","DOI":"10.1038\/s41586-024-08328-6","volume":"637","author":"N Hollmann","year":"2025","unstructured":"Hollmann, N., M\u00fcller, S., Eggensperger, K., & Hutter, F. (2025). Accurate predictions on small data with a tabular foundation model. Nature, 637, 319\u2013326.","journal-title":"Nature"},{"key":"7008_CR14","doi-asserted-by":"crossref","unstructured":"Holzm\u00fcller, D., Grinsztajn, L., & Steinwart, I. (2024). Better by default: Strong pre-tuned MLPs and boosted trees on tabular data. In: The thirty-eighth annual conference on neural information processing systems. 
https:\/\/openreview.net\/forum?id=3BNPUDvqMt","DOI":"10.52202\/079017-0837"},{"key":"7008_CR15","doi-asserted-by":"publisher","first-page":"2370","DOI":"10.18653\/v1\/2021.findings-emnlp.204","volume-title":"Findings of the association for computational linguistics: EMNLP 2021","author":"P-L Huguet Cabot","year":"2021","unstructured":"Huguet Cabot, P.-L., & Navigli, R. (2021). REBEL: Relation extraction by end-to-end language generation. In M.-F. Moens, X. Huang, L. Specia, & S.W.-T. Yih (Eds.), Findings of the association for computational linguistics: EMNLP 2021 (pp. 2370\u20132381). Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2021.findings-emnlp.204"},{"issue":"1","key":"7008_CR16","doi-asserted-by":"publisher","first-page":"1","DOI":"10.32604\/cmc.2024.053204","volume":"80","author":"T Jiao","year":"2024","unstructured":"Jiao, T., Guo, C., Feng, X., Chen, Y., & Song, J. (2024). A comprehensive survey on deep learning multi-modal fusion: Methods, technologies and applications. Computers, Materials and Continua, 80(1), 1\u201335. https:\/\/doi.org\/10.32604\/cmc.2024.053204","journal-title":"Computers, Materials and Continua"},{"key":"7008_CR17","doi-asserted-by":"crossref","unstructured":"Khosla, S., Tiwari, A., Kafle, K., Jenni, S., Zhao, H., Collomosse, J., & Shi, J. (2025). MAGNET: Augmenting generative decoders with representation learning and infilling capabilities. Preprint retrieved from https:\/\/arxiv.org\/abs\/2501.08648","DOI":"10.18653\/v1\/2025.acl-long.1325"},{"issue":"2","key":"7008_CR18","doi-asserted-by":"publisher","first-page":"164","DOI":"10.1109\/TAC.1980.1102314","volume":"25","author":"V Klema","year":"1980","unstructured":"Klema, V., & Laub, A. (1980). The singular value decomposition: Its computation and some applications. IEEE Transactions on Automatic Control, 25(2), 164\u2013176. 
https:\/\/doi.org\/10.1109\/TAC.1980.1102314","journal-title":"IEEE Transactions on Automatic Control"},{"key":"7008_CR19","doi-asserted-by":"publisher","unstructured":"Koduri, S. (2012). Multisensor data fusion with singular value decomposition. In: 2012 UKSim 14th international conference on computer modelling and simulation, (pp. 422\u2013426). https:\/\/doi.org\/10.1109\/UKSim.2012.65","DOI":"10.1109\/UKSim.2012.65"},{"key":"7008_CR20","doi-asserted-by":"publisher","first-page":"208","DOI":"10.1016\/j.neucom.2022.01.096","volume":"496","author":"B Koloski","year":"2022","unstructured":"Koloski, B., Stepi\u0161nik Perdih, T., Robnik-\u0160ikonja, M., Pollak, S., & \u0160krlj, B. (2022). Knowledge graph informed fake news classification via heterogeneous representation ensembles. Neurocomputing, 496, 208\u2013226. https:\/\/doi.org\/10.1016\/j.neucom.2022.01.096","journal-title":"Neurocomputing"},{"key":"7008_CR21","doi-asserted-by":"publisher","first-page":"101","DOI":"10.1007\/978-3-031-78977-9_7","volume-title":"Discovery science: 27th international conference, DS 2024, Pisa, Italy, October 14\u201316, 2024, proceedings, part I","author":"B Koloski","year":"2025","unstructured":"Koloski, B., Pollak, S., Navigli, R., & \u0160krlj, B. (2025). Automl-guided fusion of entity and llm-based representations for document classification. Discovery science: 27th international conference, DS 2024, Pisa, Italy, October 14\u201316, 2024, proceedings, part I (pp. 101\u2013115). Springer. https:\/\/doi.org\/10.1007\/978-3-031-78977-9_7"},{"key":"7008_CR22","first-page":"1584","volume-title":"Proceedings of the thirteenth language resources and evaluation conference","author":"T Kuzman","year":"2022","unstructured":"Kuzman, T., Rupnik, P., & Ljube\u0161i\u0107, N. (2022). The GINCO training dataset for web genre identification of documents out in the wild. In N. Calzolari, F. B\u00e9chet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. 
Mariani, H. Mazo, J. Odijk, & S. Piperidis (Eds.), Proceedings of the thirteenth language resources and evaluation conference (pp. 1584\u20131594). European Language Resources Association."},{"issue":"9","key":"7008_CR23","doi-asserted-by":"publisher","first-page":"1449","DOI":"10.1109\/JPROC.2015.2460697","volume":"103","author":"D Lahat","year":"2015","unstructured":"Lahat, D., Adali, T., & Jutten, C. (2015). Multimodal data fusion: An overview of methods, challenges, and prospects. Proceedings of the IEEE, 103(9), 1449\u20131477.","journal-title":"Proceedings of the IEEE"},{"key":"7008_CR24","unstructured":"Le, Q., & Mikolov, T. (2014). Distributed representations of sentences and documents. In Xing, E.P., Jebara, T. (Eds.), Proceedings of the 31st international conference on machine learning. Proceedings of machine learning research, (vol. 32, pp. 1188\u20131196). PMLR. https:\/\/proceedings.mlr.press\/v32\/le14.html"},{"issue":"1","key":"7008_CR25","doi-asserted-by":"publisher","first-page":"250","DOI":"10.1093\/bioinformatics\/btz470","volume":"36","author":"TT Le","year":"2020","unstructured":"Le, T. T., Fu, W., & Moore, J. H. (2020). Scaling tree-based automated machine learning to biomedical big data with a feature set selector. Bioinformatics, 36(1), 250\u2013256.","journal-title":"Bioinformatics"},{"key":"7008_CR26","doi-asserted-by":"publisher","unstructured":"Li, X., & Li, J. (2024). AoE: Angle-optimized embeddings for semantic textual similarity. In: Ku, L.-W., Martins, A., & Srikumar, V. (Eds.), Proceedings of the 62nd annual meeting of the association for computational linguistics (volume 1: long papers), (pp. 1825\u20131839). Association for Computational Linguistics. 
https:\/\/doi.org\/10.18653\/v1\/2024.acl-long.101","DOI":"10.18653\/v1\/2024.acl-long.101"},{"key":"7008_CR27","doi-asserted-by":"publisher","first-page":"231","DOI":"10.1162\/tacl_a_00179","volume":"2","author":"A Moro","year":"2014","unstructured":"Moro, A., Raganato, A., & Navigli, R. (2014). Entity linking meets word sense disambiguation: A unified approach. Transactions of the Association for Computational Linguistics, 2, 231\u2013244. https:\/\/doi.org\/10.1162\/tacl_a_00179","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"7008_CR28","doi-asserted-by":"publisher","first-page":"2014","DOI":"10.18653\/v1\/2023.eacl-main.148","volume-title":"Proceedings of the 17th conference of the European chapter of the association for computational linguistics","author":"N Muennighoff","year":"2023","unstructured":"Muennighoff, N., Tazi, N., Magne, L., & Reimers, N. (2023). MTEB: Massive text embedding benchmark. In A. Vlachos & I. Augenstein (Eds.), Proceedings of the 17th conference of the European chapter of the association for computational linguistics (pp. 2014\u20132037). Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2023.eacl-main.148"},{"key":"7008_CR29","unstructured":"Navigli, R., & Ponzetto, S.P. (2010). BabelNet: Building a very large multilingual semantic network. In: Haji\u010d, J., Carberry, S., Clark, S., & Nivre, J. (Eds.), Proceedings of the 48th annual meeting of the association for computational linguistics, (pp. 216\u2013225). Association for Computational Linguistics. https:\/\/aclanthology.org\/P10-1023"},{"key":"7008_CR30","unstructured":"Ostendorff, M., Bourgonje, P., Berger, M., Moreno-Schneider, J., Rehm, G., & Gipp, B. (2019). Enriching bert with knowledge graph embeddings for document classification. Preprint retrieved from https:\/\/arxiv.org\/abs\/1909.08402"},{"key":"7008_CR31","unstructured":"Qu, J., Holzm\u00fcller, D., Varoquaux, G., & Morvan, M.L. (2025). 
TabICL: A tabular foundation model for in-context learning on large data. In: Forty-second international conference on machine learning. https:\/\/openreview.net\/forum?id=0VvD1PmNzM"},{"key":"7008_CR32","doi-asserted-by":"publisher","first-page":"5838","DOI":"10.18653\/v1\/2020.emnlp-main.470","volume-title":"Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP)","author":"T Ranasinghe","year":"2020","unstructured":"Ranasinghe, T., & Zampieri, M. (2020). Multilingual offensive language identification with cross-lingual embeddings. In B. Webber, T. Cohn, Y. He, & Y. Liu (Eds.), Proceedings of the 2020 conference on empirical methods in natural language processing (EMNLP) (pp. 5838\u20135844). Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2020.emnlp-main.470"},{"key":"7008_CR33","doi-asserted-by":"publisher","DOI":"10.7551\/mitpress\/3206.001.0001","volume-title":"Gaussian processes for machine learning","author":"CE Rasmussen","year":"2005","unstructured":"Rasmussen, C. E., & Williams, C. K. I. (2005). Gaussian processes for machine learning. The MIT Press. https:\/\/doi.org\/10.7551\/mitpress\/3206.001.0001"},{"key":"7008_CR34","doi-asserted-by":"publisher","first-page":"3982","DOI":"10.18653\/v1\/D19-1410","volume-title":"Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP)","author":"N Reimers","year":"2019","unstructured":"Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence embeddings using Siamese BERT-networks. In K. Inui, J. Jiang, V. Ng, & X. Wan (Eds.), Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP) (pp. 3982\u20133992). Association for Computational Linguistics. 
https:\/\/doi.org\/10.18653\/v1\/D19-1410"},{"key":"7008_CR35","doi-asserted-by":"publisher","unstructured":"Sarmah, B., Mehta, D., Hall, B., Rao, R., Patel, S., & Pasquali, S. (2024). Hybridrag: Integrating knowledge graphs and vector retrieval augmented generation for efficient information extraction. In: Proceedings of the 5th ACM international conference on AI in finance, (pp. 608\u2013616). Association for Computing Machinery. https:\/\/doi.org\/10.1145\/3677052.3698671","DOI":"10.1145\/3677052.3698671"},{"key":"7008_CR36","volume-title":"Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018)","author":"H Schwenk","year":"2018","unstructured":"Schwenk, H., & Li, X. (2018). A corpus for multilingual document classification in eight languages. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, & T. Tokunaga (Eds.), Proceedings of the eleventh international conference on language resources and evaluation (LREC 2018). European Language Resources Association (ELRA)."},{"key":"7008_CR37","doi-asserted-by":"publisher","unstructured":"\u0160krlj, B., & Petkovi\u0107, M. (2021). Compressibility of distributed document representations. In: 2021 IEEE international conference on data mining (ICDM), (pp. 1330\u20131335). https:\/\/doi.org\/10.1109\/ICDM51629.2021.00166","DOI":"10.1109\/ICDM51629.2021.00166"},{"key":"7008_CR38","doi-asserted-by":"publisher","DOI":"10.1016\/j.csl.2020.101104","volume":"65","author":"B \u0160krlj","year":"2021","unstructured":"\u0160krlj, B., Martinc, M., Kralj, J., Lavra\u010d, N., & Pollak, S. (2021a). tax2vec: Constructing interpretable features from taxonomies for short text classification. Computer Speech & Language, 65, Article 101104. 
https:\/\/doi.org\/10.1016\/j.csl.2020.101104","journal-title":"Computer Speech & Language"},{"key":"7008_CR39","doi-asserted-by":"publisher","DOI":"10.1007\/s10994-021-05968-x","author":"B \u0160krlj","year":"2021","unstructured":"\u0160krlj, B., Martinc, M., Lavra\u010d, N., & Pollak, S. (2021b). Autobot: Evolving neuro-symbolic representations for explainable low resource text classification. Machine Learning. https:\/\/doi.org\/10.1007\/s10994-021-05968-x","journal-title":"Machine Learning"},{"key":"7008_CR40","unstructured":"Snoek, J., Larochelle, H., & Adams, R.P. (2012) Practical bayesian optimization of machine learning algorithms. In: Bartlett, P.L., Pereira, F.C.N., Burges, C.J.C., Bottou, L., & Weinberger, K.Q. (Eds.), Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a Meeting Held December 3\u20136, 2012, Lake Tahoe, Nevada, United States, pp. 2960\u20132968"},{"key":"7008_CR41","doi-asserted-by":"crossref","unstructured":"Speer, R., Chin, J., & Havasi, C. (2017) Conceptnet 5.5: An open multilingual graph of general knowledge. In: Proceedings of the thirty-first AAAI conference on artificial intelligence, (pp. 4444\u20134451). AAAI Press.","DOI":"10.1609\/aaai.v31i1.11164"},{"key":"7008_CR42","unstructured":"Sun, Z., Deng, Z.-H., Nie, J.-Y., & Tang, J. (2019). Rotate: Knowledge graph embedding by relational rotation in complex space. In: International conference on learning representations, (pp. 1\u201318). https:\/\/openreview.net\/forum?id=HkgEQnRqYQ"},{"key":"7008_CR43","doi-asserted-by":"publisher","first-page":"176","DOI":"10.1162\/tacl_a_00360","volume":"9","author":"X Wang","year":"2021","unstructured":"Wang, X., Gao, T., Zhu, Z., Zhang, Z., Liu, Z., Li, J., & Tang, J. (2021). KEPLER: A unified model for knowledge embedding and pre-trained language representation. Transactions of the Association for Computational Linguistics, 9, 176\u2013194. 
https:\/\/doi.org\/10.1162\/tacl_a_00360","journal-title":"Transactions of the Association for Computational Linguistics"},{"key":"7008_CR44","doi-asserted-by":"publisher","first-page":"11897","DOI":"10.18653\/v1\/2024.acl-long.642","volume-title":"Proceedings of the 62nd annual meeting of the association for computational linguistics (volume 1: long papers)","author":"L Wang","year":"2024","unstructured":"Wang, L., Yang, N., Huang, X., Yang, L., Majumder, R., & Wei, F. (2024). Improving text embeddings with large language models. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Proceedings of the 62nd annual meeting of the association for computational linguistics (volume 1: long papers) (pp. 11897\u201311916). Association for Computational Linguistics. https:\/\/doi.org\/10.18653\/v1\/2024.acl-long.642"}],"container-title":["Machine Learning"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-026-07008-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s10994-026-07008-y","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s10994-026-07008-y.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,5,6]],"date-time":"2026-05-06T15:30:30Z","timestamp":1778081430000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s10994-026-07008-y"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2026,3]]},"references-count":44,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2026,3]]}},"alternative-id":["7008"],"URL":"https:\/\/doi.org\/10.1007\/s10994-026-07008-y","relation":{},"ISSN":["0885-6125","1573-0565"],"issn-type":[{"value":"0885-6125","type":"print"},{"value":"1573-0565","typ
e":"electronic"}],"subject":[],"published":{"date-parts":[[2026,3]]},"assertion":[{"value":"23 April 2025","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"1 February 2026","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 February 2026","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 March 2026","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare no conflict of interest.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Conflict of interest"}}],"article-number":"61"}}