{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,6]],"date-time":"2026-03-06T21:12:36Z","timestamp":1772831556656,"version":"3.50.1"},"reference-count":102,"publisher":"SAGE Publications","issue":"4","license":[{"start":{"date-parts":[[2022,8,3]],"date-time":"2022-08-03T00:00:00Z","timestamp":1659484800000},"content-version":"unspecified","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/100007633","name":"stiftelsen f\u00f6r milj\u00f6strategisk forskning","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100007633","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Information Visualization"],"published-print":{"date-parts":[[2022,10]]},"abstract":"<jats:p> Comparing text documents is an essential task for a variety of applications within diverse research fields, and several different methods have been developed for this. However, calculating text similarity is an ambiguous and context-dependent task, so many open challenges still exist. In this paper, we present a novel method for text similarity calculations based on the combination of embedding technology and ensemble methods. By using several embeddings, instead of only one, we show that it is possible to achieve higher quality, which in turn is a key factor for developing high-performing applications for text similarity exploitation. We also provide a prototype visual analytics tool which helps the analyst to find optimal performing ensembles and gain insights to the inner workings of the similarity calculations. Furthermore, we discuss the generalizability of our key ideas to fields beyond the scope of text analysis. 
<\/jats:p>","DOI":"10.1177\/14738716221114372","type":"journal-article","created":{"date-parts":[[2022,8,3]],"date-time":"2022-08-03T10:40:49Z","timestamp":1659523249000},"page":"335-353","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":11,"title":["Interactive optimization of embedding-based text similarity calculations"],"prefix":"10.1177","volume":"21","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6150-0787","authenticated-orcid":false,"given":"Daniel","family":"Witschard","sequence":"first","affiliation":[{"name":"Linnaeus University, V\u00e4xj\u00f6, Sweden"}]},{"given":"Ilir","family":"Jusufi","sequence":"additional","affiliation":[{"name":"Linnaeus University, V\u00e4xj\u00f6, Sweden"}]},{"given":"Rafael M","family":"Martins","sequence":"additional","affiliation":[{"name":"Linnaeus University, V\u00e4xj\u00f6, Sweden"}]},{"given":"Kostiantyn","family":"Kucher","sequence":"additional","affiliation":[{"name":"Linnaeus University, V\u00e4xj\u00f6, Sweden"},{"name":"Link\u00f6ping University, Link\u00f6ping, Sweden"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-0519-2537","authenticated-orcid":false,"given":"Andreas","family":"Kerren","sequence":"additional","affiliation":[{"name":"Linnaeus University, V\u00e4xj\u00f6, Sweden"},{"name":"Link\u00f6ping University, Link\u00f6ping, Sweden"}]}],"member":"179","published-online":{"date-parts":[[2022,8,3]]},"reference":[{"key":"bibr1-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/1376815.1376819"},{"key":"bibr2-14738716221114372","first-page":"385","volume-title":"*SEM 2012: the first joint conference on lexical and computational semantics \u2013 Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth international workshop on semantic evaluation (SemEval 2012)","author":"Agirre 
E"},{"key":"bibr3-14738716221114372","doi-asserted-by":"publisher","DOI":"10.3390\/info11090421"},{"key":"bibr4-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2013.50"},{"key":"bibr5-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1155\/2020\/4750871"},{"key":"bibr6-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/s11192-020-03396-7"},{"key":"bibr7-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/s41109-019-0197-1"},{"key":"bibr8-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1016\/j.knosys.2018.03.022"},{"issue":"5","key":"bibr9-14738716221114372","doi-asserted-by":"crossref","first-page":"833","DOI":"10.1109\/TKDE.2018.2849727","volume":"31","author":"Zhu D","year":"2019","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"bibr10-14738716221114372","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1901.09069."},{"key":"bibr11-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/3490099.3511122."},{"key":"bibr12-14738716221114372","doi-asserted-by":"publisher","DOI":"10.5121\/csit.2020.100402."},{"key":"bibr13-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1093\/comnet\/cnz043"},{"key":"bibr14-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1613\/jair.614"},{"key":"bibr15-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/s11704-019-8208-z"},{"key":"bibr16-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1609\/aimag.v35i4.2513"},{"key":"bibr17-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2870052"},{"key":"bibr18-14738716221114372","first-page":"80","volume-title":"Proceedings of the IEEE international conference on data science and advanced analytics","author":"Gilpin 
LH"},{"key":"bibr19-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/3236009"},{"key":"bibr20-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13092"},{"key":"bibr21-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2017.01.105"},{"key":"bibr22-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2864838"},{"key":"bibr23-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14034"},{"key":"bibr24-14738716221114372","doi-asserted-by":"publisher","DOI":"10.48550\/arXiv.1611.05469."},{"key":"bibr25-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744478"},{"key":"bibr26-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2019.2903946"},{"key":"bibr27-14738716221114372","volume-title":"Mastering the Information Age: Solving Problems with Visual Analytics","author":"Keim DA","year":"2010"},{"key":"bibr28-14738716221114372","volume-title":"Proceedings of the winter simulation conference, WSC\u201912","author":"Kerren A"},{"key":"bibr29-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/2808234"},{"key":"bibr30-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1080\/01605682.2020.1768809"},{"key":"bibr31-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14017"},{"key":"bibr32-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346481"},{"key":"bibr33-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13324"},{"key":"bibr34-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-73531-3"},{"key":"bibr35-14738716221114372","first-page":"1137","volume":"3","author":"Bengio Y","year":"2003","journal-title":"Journal of Machine Learning Research"},{"key":"bibr36-14738716221114372","first-page":"2493","volume":"12","author":"Collobert R","year":"2011","journal-title":"Journal of Machine Learning 
Research"},{"key":"bibr37-14738716221114372","first-page":"384","volume-title":"Proceedings of the 48th annual meeting of the association for computational linguistics, ACL\u201910","author":"Turian J","year":"2010"},{"key":"bibr38-14738716221114372","doi-asserted-by":"publisher","DOI":"10.31219\/osf.io\/tah3y."},{"key":"bibr39-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2021.3105899."},{"key":"bibr40-14738716221114372","first-page":"2013","volume-title":"Proceedings of the 26th international conference on neural information processing systems\u2014Volume 2, NIPS\u201913","author":"Mikolov T"},{"key":"bibr41-14738716221114372","first-page":"2019","volume-title":"Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, Volume 1 (long and short papers), NAACL \u201919","author":"Devlin J"},{"key":"bibr42-14738716221114372","first-page":"1188","volume-title":"Proceedings of the 31st international conference on machine learning, ICML\u201914","author":"Le Q","year":"2014"},{"key":"bibr43-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/j.1551-6709.2010.01106.x"},{"key":"bibr44-14738716221114372","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.acl-long.39."},{"key":"bibr45-14738716221114372","first-page":"1631","volume-title":"Proceedings of the 2013 conference on empirical methods in natural language processing, EMNLP\u201913","author":"Socher R","year":"2013"},{"key":"bibr46-14738716221114372","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/P14-1062"},{"key":"bibr47-14738716221114372","first-page":"3294","volume-title":"Proceedings of the 28th international conference on neural information processing systems\u2014volume 2, NIPS\u201915","author":"Kiros 
R","year":"2015"},{"key":"bibr48-14738716221114372","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D18-2029."},{"key":"bibr49-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/BF00058655"},{"key":"bibr50-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1214\/aos\/1024691079"},{"key":"bibr51-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1016\/S0893-6080(05)80023-1"},{"key":"bibr52-14738716221114372","author":"Speer R","year":"2016","journal-title":"arXiv:1604.01692"},{"key":"bibr53-14738716221114372","first-page":"96","volume-title":"Proceedings of the 21st Nordic conference on computational linguistics, NoDaLiDa\u201917","author":"Murom\u00e4gi A","year":"2017"},{"key":"bibr54-14738716221114372","doi-asserted-by":"crossref","first-page":"309","DOI":"10.1007\/978-3-030-30952-7_31","volume-title":"Web information systems and applications: 16th international conference, WISA 2019","volume":"11817","author":"Xia C"},{"key":"bibr55-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/CMV.2007.20."},{"key":"bibr56-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/2133416.2146416"},{"key":"bibr57-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.124"},{"key":"bibr58-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1177\/1473871611416549"},{"key":"bibr59-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2015.2467551"},{"key":"bibr60-14738716221114372","first-page":"117","volume-title":"Proceedings of the IEEE pacific visualization symposium, PacificVis\u201915","author":"Kucher 
K"},{"key":"bibr61-14738716221114372","doi-asserted-by":"publisher","DOI":"10.2312\/eurovisstar.20151113."},{"key":"bibr62-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2834341"},{"key":"bibr63-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-06793-3"},{"key":"bibr64-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13728"},{"key":"bibr65-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13211"},{"key":"bibr66-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2610422"},{"key":"bibr67-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/BF02023610"},{"key":"bibr68-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1016\/S0167-9236(99)00032-9"},{"key":"bibr69-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1007\/s10732-009-9107-5"},{"issue":"1","key":"bibr70-14738716221114372","first-page":"211","volume":"32","author":"Eskelinen P","year":"2010","journal-title":"Spectrum"},{"key":"bibr71-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2009.156"},{"key":"bibr72-14738716221114372","first-page":"215","volume-title":"Proceedings of the IEEE symposium on visual analytics science and technology, VAST\u201910","author":"Berger W"},{"key":"bibr73-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/j.1467-8659.2011.01940.x"},{"key":"bibr74-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346744"},{"key":"bibr75-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346574"},{"key":"bibr76-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346321"},{"key":"bibr77-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2598667"},{"key":"bibr78-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2745141"},{"key":"bibr79-14738716221114372","first-page":"1532","volume-title":"Proceedings 
of the conference on empirical methods in natural language processing, EMNLP\u201914","author":"Pennington J"},{"key":"bibr80-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.13672"},{"key":"bibr81-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3045918"},{"key":"bibr82-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/3377325.3377514."},{"key":"bibr83-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1145\/1518701.1518895."},{"key":"bibr84-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TBDATA.2018.2877350"},{"key":"bibr85-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3030352"},{"key":"bibr86-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3030354"},{"key":"bibr87-14738716221114372","volume-title":"Foundations of Statistical Natural Language Processing","author":"Manning CD","year":"1999"},{"key":"bibr88-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1016\/0306-4573(88)90021-0"},{"key":"bibr89-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2615308"},{"key":"bibr90-14738716221114372","unstructured":"D3. Data-driven documents, https:\/\/d3js.org\/ (2011, accessed 8 March 2022)."},{"key":"bibr91-14738716221114372","unstructured":"Google Research. 
Google Colaboratory, https:\/\/colab.research.google.com\/ (2014, accessed 8 March 2022)."},{"key":"bibr92-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2011.188"},{"key":"bibr93-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744805"},{"key":"bibr94-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346423"},{"key":"bibr95-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2015.2467436"},{"key":"bibr96-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/INFVIS.2000.885093."},{"key":"bibr97-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2009.195"},{"key":"bibr98-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2013.126"},{"key":"bibr99-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2865146"},{"key":"bibr100-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1613\/jair.1.11640"},{"key":"bibr101-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744199"},{"key":"bibr102-14738716221114372","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3028975"}],"container-title":["Information 
Visualization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14738716221114372","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14738716221114372","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14738716221114372","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,3,1]],"date-time":"2025-03-01T16:09:04Z","timestamp":1740845344000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14738716221114372"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,3]]},"references-count":102,"journal-issue":{"issue":"4","published-print":{"date-parts":[[2022,10]]}},"alternative-id":["10.1177\/14738716221114372"],"URL":"https:\/\/doi.org\/10.1177\/14738716221114372","relation":{},"ISSN":["1473-8716","1473-8724"],"issn-type":[{"value":"1473-8716","type":"print"},{"value":"1473-8724","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,3]]}}}