{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,3]],"date-time":"2026-05-03T03:19:07Z","timestamp":1777778347662,"version":"3.51.4"},"reference-count":107,"publisher":"SAGE Publications","issue":"3","license":[{"start":{"date-parts":[[2023,5,1]],"date-time":"2023-05-01T00:00:00Z","timestamp":1682899200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/journals.sagepub.com\/page\/policies\/text-and-data-mining-license"}],"content-domain":{"domain":["journals.sagepub.com"],"crossmark-restriction":true},"short-container-title":["Information Visualization"],"published-print":{"date-parts":[[2023,7]]},"abstract":"<jats:p>Transformer-based language models such as BERT and its variants have found widespread use in natural language processing (NLP). A common way of using these models is to fine-tune them to improve their performance on a specific task. However, it is currently unclear how the fine-tuning process affects the underlying structure of the word embeddings from these models. We present TopoBERT, a visual analytics system for interactively exploring the fine-tuning process of various transformer-based models \u2013 across multiple fine-tuning batch updates, subsequent layers of the model, and different NLP tasks \u2013 from a topological perspective. The system uses the mapper algorithm from topological data analysis (TDA) to generate a graph that approximates the shape of a model\u2019s embedding space for an input dataset. TopoBERT enables its users (e.g. experts in NLP and linguistics) to (1) interactively explore the fine-tuning process across different model-task pairs, (2) visualize the shape of embedding spaces at multiple scales and layers, and (3) connect linguistic and contextual information about the input dataset with the topology of the embedding space. Using TopoBERT, we provide various use cases to exemplify its applications in exploring fine-tuned word embeddings. We further demonstrate the utility of TopoBERT, which enables users to generate insights about the fine-tuning process and provides support for empirical validation of these insights.<\/jats:p>","DOI":"10.1177\/14738716231168671","type":"journal-article","created":{"date-parts":[[2023,5,2]],"date-time":"2023-05-02T02:51:03Z","timestamp":1682995863000},"page":"186-208","update-policy":"https:\/\/doi.org\/10.1177\/sage-journals-update-policy","source":"Crossref","is-referenced-by-count":14,"title":["TopoBERT: Exploring the topology of fine-tuned word representations"],"prefix":"10.1177","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-2965-3561","authenticated-orcid":false,"given":"Archit","family":"Rathore","sequence":"first","affiliation":[{"name":"School of Computing, University of Utah, Salt Lake City, UT, USA"},{"name":"Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0009-0001-1558-223X","authenticated-orcid":false,"given":"Yichu","family":"Zhou","sequence":"additional","affiliation":[{"name":"School of Computing, University of Utah, Salt Lake City, UT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Vivek","family":"Srikumar","sequence":"additional","affiliation":[{"name":"School of Computing, University of Utah, Salt Lake City, UT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Bei","family":"Wang","sequence":"additional","affiliation":[{"name":"School of Computing, University of Utah, Salt Lake City, UT, USA"},{"name":"Scientific Computing and Imaging Institute, University of Utah, Salt Lake City, UT, USA"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"179","published-online":{"date-parts":[[2023,5,1]]},"reference":[{"key":"bibr1-14738716231168671","first-page":"4171","volume-title":"Proceedings of the conference of the North American Chapter of the Association for computational linguistics: human language technologies","author":"Devlin J","year":"2019"},{"key":"bibr2-14738716231168671","volume":"11692","author":"Liu Y","year":"2019","journal-title":"arXiv preprint"},{"key":"bibr3-14738716231168671","first-page":"5998","volume":"30","author":"Vaswani A","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"bibr4-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1031"},{"key":"bibr5-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1162\/coli_a_00422"},{"key":"bibr6-14738716231168671","volume-title":"International conference on learning representations (ICLR)","author":"Tenney I","year":"2019"},{"key":"bibr7-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.129"},{"key":"bibr8-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.603"},{"key":"bibr9-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2022.acl-long.144"},{"key":"bibr10-14738716231168671","first-page":"1046","volume-title":"Proceedings of the 60th annual meeting of the Association for Computational Linguistics (Volume 1: Long Papers)","author":"Zhou Y"},{"key":"bibr11-14738716231168671","first-page":"91","volume-title":"Eurographics Symposium on Point-Based Graphics","author":"Singh G","year":"2007"},{"key":"bibr12-14738716231168671","first-page":"1634","volume":"30","author":"Hofer C","year":"2017","journal-title":"Adv Neural Inf Process Syst"},{"key":"bibr13-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1111\/cgf.14195"},{"key":"bibr14-14738716231168671","author":"Gabrielsson RB","year":"2018","journal-title":"arXiv preprint"},{"key":"bibr15-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TPAMI.2020.3013679"},{"key":"bibr16-14738716231168671","first-page":"5657","volume":"32","author":"Hu X","year":"2019","journal-title":"Adv Neural Inf Process Syst"},{"key":"bibr17-14738716231168671","first-page":"2573","volume-title":"Proceedings of the 22nd international conference on artificial intelligence and statistics, proceedings of machine learning research","volume":"89","author":"Chen C"},{"key":"bibr18-14738716231168671","author":"Rieck B","year":"2018","journal-title":"arXiv preprint"},{"key":"bibr19-14738716231168671","first-page":"2677","volume-title":"Proceedings of the IEEE\/CVF conference on computer vision and pattern recognition","author":"Corneanu CA"},{"issue":"184","key":"bibr20-14738716231168671","first-page":"1","volume":"21","author":"Naitzat G","year":"2020","journal-title":"J Mach Learn Res"},{"key":"bibr21-14738716231168671","author":"Barannikov S","year":"2020","journal-title":"arXiv preprint"},{"key":"bibr22-14738716231168671","first-page":"117","volume-title":"International conference on statistical language and speech processing","author":"Doshi P"},{"key":"bibr23-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/W19-0605"},{"key":"bibr24-14738716231168671","first-page":"1","volume-title":"IEEE 17th international workshop on signal processing advances in wireless communications (SPAWC)","author":"Guan H"},{"key":"bibr25-14738716231168671","author":"Das S","year":"2021","journal-title":"arXiv preprint"},{"key":"bibr26-14738716231168671","author":"Kushnareva L","year":"2022","journal-title":"arXiv preprint"},{"key":"bibr27-14738716231168671","author":"Kushnareva L","year":"2021","journal-title":"arXiv preprint"},{"key":"bibr28-14738716231168671","author":"Cherniavskii D","year":"2022","journal-title":"arXiv preprint"},{"key":"bibr29-14738716231168671","volume":"15195","author":"Perez I","year":"2022","journal-title":"arXiv preprint"},{"key":"bibr30-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1145\/2939672.2939778"},{"key":"bibr31-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2598831"},{"key":"bibr32-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744358"},{"key":"bibr33-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744878"},{"key":"bibr34-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/MCG.2018.042731661"},{"key":"bibr35-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1016\/j.visinf.2017.01.006"},{"key":"bibr36-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3030418"},{"key":"bibr37-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2019.12.012"},{"key":"bibr38-14738716231168671","first-page":"56","volume":"26","author":"Wexler J","year":"2020","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"bibr39-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.2973258"},{"key":"bibr40-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/VAST50239.2020.00007"},{"key":"bibr41-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2014.2346482"},{"key":"bibr42-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/VAST.2011.6102448"},{"key":"bibr43-14738716231168671","volume-title":"Workshop track proceedings of the 3rd international conference on learning representations (ICLR)","author":"Springenberg JT"},{"key":"bibr44-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2598838"},{"key":"bibr45-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2816223"},{"key":"bibr46-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744718"},{"key":"bibr47-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2019.2934659"},{"key":"bibr48-14738716231168671","first-page":"337","volume-title":"Proceedings of the 33rd annual ACM conference on human factors in computing systems","author":"Amershi S"},{"key":"bibr49-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2598828"},{"key":"bibr50-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2864499"},{"key":"bibr51-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2020.3030449"},{"key":"bibr52-14738716231168671","first-page":"291","author":"Garcea F","journal-title":"Artificial neural networks in pattern recognition"},{"key":"bibr53-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1007\/s41095-020-0191-7"},{"key":"bibr54-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1177\/1473871620904671"},{"key":"bibr55-14738716231168671","doi-asserted-by":"publisher","DOI":"10.2991\/hcis.k.210704.003"},{"key":"bibr56-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2843369"},{"key":"bibr57-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2745141"},{"issue":"1","key":"bibr58-14738716231168671","first-page":"1064","volume":"26","author":"Spinner T","year":"2020","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"bibr59-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1609\/aaai.v32i1.11491"},{"key":"bibr60-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2018.2865230"},{"key":"bibr61-14738716231168671","author":"Chan GYY","year":"2020","journal-title":"arXiv preprint"},{"key":"bibr62-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/VAST.2017.8585721"},{"key":"bibr63-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2017.2744158"},{"key":"bibr64-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/VIS47514.2020.00062"},{"key":"bibr65-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2019.2903946"},{"key":"bibr66-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P19-3007"},{"key":"bibr67-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1080\/14786440109462720"},{"issue":"11","key":"bibr68-14738716231168671","first-page":"2579","volume":"9","author":"Van der Maaten L","year":"2008","journal-title":"J Mach Learn Res"},{"key":"bibr69-14738716231168671","author":"McInnes L","year":"2018","journal-title":"arXiv preprint"},{"key":"bibr70-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TVCG.2016.2640960"},{"key":"bibr71-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1162\/tacl_a_00349"},{"key":"bibr72-14738716231168671","volume-title":"Neural compression: from information theory to applications, workshop at ICLR","author":"Whitney WF"},{"key":"bibr73-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.14"},{"key":"bibr74-14738716231168671","first-page":"428","volume-title":"Proceedings of the 59th annual meeting of the Association for Computational Linguistics and the 11th international joint conference on natural language processing (Volume 1: Long Papers)","author":"Limisiewicz T"},{"key":"bibr75-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.naacl-main.401"},{"key":"bibr76-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2021.emnlp-main.806"},{"key":"bibr77-14738716231168671","first-page":"4129","volume-title":"Proceedings of the 2019 conference of the North American Chapter of the Association for Computational Linguistics: human language technologies","volume":"1","author":"Hewitt J","year":"2019"},{"key":"bibr78-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1006"},{"key":"bibr79-14738716231168671","first-page":"7","volume-title":"Proceedings of the 4th workshop on representation learning for NLP (RepL4NLP-2019)","author":"Peters ME"},{"key":"bibr80-14738716231168671","first-page":"33","volume-title":"Proceedings of the third BlackboxNLP workshop on analyzing and interpreting neural networks for NLP","author":"Merchant A"},{"key":"bibr81-14738716231168671","first-page":"68","volume-title":"Proceedings of the third BlackboxNLP workshop on analyzing and interpreting neural networks for NLP","author":"Mosbach M"},{"key":"bibr82-14738716231168671","first-page":"87","volume-title":"Proceedings of the 1st conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th international joint conference on natural language processing","author":"Hao Y"},{"key":"bibr83-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1007\/BF01451612"},{"key":"bibr84-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1016\/j.tcs.2007.10.018"},{"key":"bibr85-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/SMI.2003.1199624"},{"issue":"12","key":"bibr86-14738716231168671","first-page":"1","volume":"19","author":"Carri\u00e8re M","year":"2018","journal-title":"J Mach Learn Res"},{"key":"bibr87-14738716231168671","volume-title":"IEEE international conference on Big Data (Big Data)","author":"Chalapathi N"},{"key":"bibr88-14738716231168671","first-page":"226","volume-title":"Proceedings of the 2nd International conference on knowledge discovery and data mining","author":"Ester M"},{"key":"bibr89-14738716231168671","first-page":"101","volume-title":"Proceedings of the IEEE 14th Pacific visualization symposium (PacificVis)","author":"Zhou Y"},{"key":"bibr90-14738716231168671","volume-title":"Proceedings of the 37th AAAI Conference on Artificial Intelligence (AAAI)","author":"Purvine E","year":"2021"},{"key":"bibr91-14738716231168671","volume":"03904","author":"Chowdhury S","year":"2021","journal-title":"arXiv preprint"},{"key":"bibr92-14738716231168671","doi-asserted-by":"publisher","DOI":"10.3115\/v1\/W15-1612"},{"key":"bibr93-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/P18-1018"},{"key":"bibr94-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/S17-1022"},{"key":"bibr95-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N19-1112"},{"key":"bibr96-14738716231168671","first-page":"1659","volume-title":"Proceedings of the tenth international conference on language resources and evaluation (LREC)","author":"Nivre J"},{"key":"bibr97-14738716231168671","volume":"08962","author":"Turc I","year":"2019","journal-title":"arXiv preprint"},{"key":"bibr98-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-demos.6"},{"key":"bibr99-14738716231168671","volume-title":"International conference on learning representations (ICLR)","author":"Loshchilov I","year":"2017"},{"key":"bibr100-14738716231168671","doi-asserted-by":"publisher","DOI":"10.1109\/TIT.1982.1056489"},{"key":"bibr101-14738716231168671","first-page":"336","volume-title":"Proceedings of IEEE symposium on visual languages","author":"Shneiderman B."},{"key":"bibr102-14738716231168671","unstructured":"Abadi M, Agarwal A, Barham P, et al. TensorFlow: Large-scale machine learning on heterogeneous systems. https:\/\/www.tensorflow.org\/ (2015)."},{"key":"bibr103-14738716231168671","first-page":"3111","volume":"26","author":"Mikolov T","year":"2013","journal-title":"Adv Neural Inf Process Syst"},{"key":"bibr104-14738716231168671","first-page":"4585","volume-title":"Proceedings of the Ninth international conference on language resources and evaluation (LREC)","author":"de Marneffe MC","year":"2014"},{"key":"bibr105-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/D19-1445"},{"key":"bibr106-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/N18-1202"},{"key":"bibr107-14738716231168671","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.acl-main.747"}],"container-title":["Information Visualization"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14738716231168671","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/full-xml\/10.1177\/14738716231168671","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/journals.sagepub.com\/doi\/pdf\/10.1177\/14738716231168671","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2026,4,29]],"date-time":"2026-04-29T19:19:17Z","timestamp":1777490357000},"score":1,"resource":{"primary":{"URL":"https:\/\/journals.sagepub.com\/doi\/10.1177\/14738716231168671"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,1]]},"references-count":107,"journal-issue":{"issue":"3","published-print":{"date-parts":[[2023,7]]}},"alternative-id":["10.1177\/14738716231168671"],"URL":"https:\/\/doi.org\/10.1177\/14738716231168671","relation":{},"ISSN":["1473-8716","1473-8724"],"issn-type":[{"value":"1473-8716","type":"print"},{"value":"1473-8724","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,1]]}}}