{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T06:20:57Z","timestamp":1774333257539,"version":"3.50.1"},"reference-count":45,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T00:00:00Z","timestamp":1648512000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Analyzing the unstructured textual data contained in electronic health records (EHRs) has always been a challenging task. Word embedding methods have become an essential foundation for neural network-based approaches in natural language processing (NLP), to learn dense and low-dimensional word representations from large unlabeled corpora that capture the implicit semantics of words. Models like Word2Vec, GloVe or FastText have been broadly applied and reviewed in the bioinformatics and healthcare fields, most often to embed clinical notes or activity and diagnostic codes. Visualization of the learned embeddings has been used in a subset of these works, whether for exploratory or evaluation purposes. However, visualization practices tend to be heterogeneous, and lack overall guidelines.<\/jats:p><\/jats:sec><jats:sec><jats:title>Objective<\/jats:title><jats:p>This scoping review aims to describe the methods and strategies used to visualize medical concepts represented using word embedding methods. 
We aim to understand the objectives of the visualizations and their limits.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>This scoping review summarizes different methods used to visualize word embeddings in healthcare. We followed the methodology proposed by Arksey and O\u2019Malley (Int J Soc Res Methodol 8:19\u201332, 2005) and by Levac et al. (Implement Sci 5:69, 2010) to better analyze the data and provide a synthesis of the literature on the matter.<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>We first obtained 471 unique articles from a search conducted in PubMed, MedRxiv and arXiv databases. 30 of these were effectively reviewed, based on our inclusion and exclusion criteria. 23 articles were excluded in the full review stage, resulting in the analysis of 7 papers that fully correspond to our inclusion criteria. Included papers pursued a variety of objectives and used distinct methods to evaluate their embeddings and to visualize them. Visualization also served heterogeneous purposes, being alternatively used as a way to explore the embeddings, to evaluate them or to merely illustrate properties otherwise formally assessed.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Visualization helps to explore embedding results (further dimensionality reduction, synthetic representation). 
However, it does not exhaust the information conveyed by the embeddings nor constitute a self-sustaining evaluation method of their pertinence.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12911-022-01822-9","type":"journal-article","created":{"date-parts":[[2022,3,29]],"date-time":"2022-03-29T08:07:13Z","timestamp":1648541233000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":12,"title":["Visualization of medical concepts represented using word embeddings: a scoping review"],"prefix":"10.1186","volume":"22","author":[{"given":"Naima","family":"Oubenali","sequence":"first","affiliation":[]},{"given":"Sabrina","family":"Messaoud","sequence":"additional","affiliation":[]},{"given":"Alexandre","family":"Filiot","sequence":"additional","affiliation":[]},{"given":"Antoine","family":"Lamer","sequence":"additional","affiliation":[]},{"given":"Paul","family":"Andrey","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,3,29]]},"reference":[{"key":"1822_CR1","doi-asserted-by":"publisher","first-page":"38","DOI":"10.15265\/IY-2017-007","volume":"26","author":"SM Meystre","year":"2017","unstructured":"Meystre SM, Lovis C, B\u00fcrkle T, Tognola G, Budrionis A, Lehmann CU. Clinical data reuse or secondary use: current status and potential future progress. Yearb Med Inform. 2017;26:38\u201352.","journal-title":"Yearb Med Inform"},{"key":"1822_CR2","doi-asserted-by":"publisher","first-page":"e12239","DOI":"10.2196\/12239","volume":"7","author":"S Sheikhalishahi","year":"2019","unstructured":"Sheikhalishahi S, Miotto R, Dudley JT, Lavelli A, Rinaldi F, Osmani V. Natural language processing of clinical notes on chronic diseases: systematic review. JMIR Med Inform. 
2019;7:e12239.","journal-title":"JMIR Med Inform"},{"key":"1822_CR3","doi-asserted-by":"publisher","first-page":"364","DOI":"10.1093\/jamia\/ocy173","volume":"26","author":"TA Koleck","year":"2019","unstructured":"Koleck TA, Dreisbach C, Bourne PE, Bakken S. Natural language processing of symptoms documented in free-text narratives of electronic health records: a systematic review. J Am Med Inform Assoc. 2019;26:364\u201379.","journal-title":"J Am Med Inform Assoc"},{"key":"1822_CR4","first-page":"281","volume":"2018","author":"Y Zhang","year":"2018","unstructured":"Zhang Y, Li H-J, Wang J, Cohen T, Roberts K, Xu H. Adapting word embeddings from multiple domains to symptom recognition from psychiatric notes. AMIA Summits Transl Sci Proc. 2018;2018:281\u20139.","journal-title":"AMIA Summits Transl Sci Proc"},{"key":"1822_CR5","doi-asserted-by":"publisher","first-page":"16","DOI":"10.1186\/s13326-021-00248-y","volume":"12","author":"J Legrand","year":"2021","unstructured":"Legrand J, Toussaint Y, Ra\u00efssi C, Coulet A. Syntax-based transfer learning for the task of biomedical relation extraction. J Biomed Semant. 2021;12:16.","journal-title":"J Biomed Semant"},{"key":"1822_CR6","unstructured":"Mikolov T, Sutskever I, Chen K, Corrado GS, Dean J. Distributed Representations of Words and Phrases and their Compositionality. In: Advances in neural information processing systems. Curran Associates, Inc.; 2013."},{"key":"1822_CR7","unstructured":"Mnih A, Kavukcuoglu K. Learning word embeddings efficiently with noise-contrastive estimation. In: Advances in neural information processing systems. Curran Associates, Inc.; 2013."},{"key":"1822_CR8","doi-asserted-by":"crossref","unstructured":"Bengio S, Heigold G. Word Embeddings for Speech Recognition. Google Research. 2014. https:\/\/research.google\/pubs\/pub42543\/. Accessed 1 Sept 2021.","DOI":"10.21437\/Interspeech.2014-273"},{"key":"1822_CR9","unstructured":"Mikolov T, Le QV, Sutskever I. 
Exploiting similarities among languages for machine translation. ArXiv13094168 Cs. 2013."},{"key":"1822_CR10","doi-asserted-by":"crossref","unstructured":"Wu Y, Xu J, Zhang Y, Xu H. Clinical abbreviation disambiguation using neural word embeddings. In: Proceedings of BioNLP 15. Beijing: Association for Computational Linguistics; 2015. p. 171\u20136.","DOI":"10.18653\/v1\/W15-3822"},{"key":"1822_CR11","unstructured":"Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. ArXiv13013781 Cs. 2013."},{"key":"1822_CR12","doi-asserted-by":"crossref","unstructured":"Pennington J, Socher R, Manning C. GloVe: global vectors for word representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Doha: Association for Computational Linguistics; 2014. p. 1532\u201343.","DOI":"10.3115\/v1\/D14-1162"},{"key":"1822_CR13","doi-asserted-by":"publisher","first-page":"135","DOI":"10.1162\/tacl_a_00051","volume":"5","author":"P Bojanowski","year":"2017","unstructured":"Bojanowski P, Grave E, Joulin A, Mikolov T. Enriching word vectors with subword information. Trans Assoc Comput Linguist. 2017;5:135\u201346.","journal-title":"Trans Assoc Comput Linguist"},{"key":"1822_CR14","unstructured":"Devlin J, Chang M-W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. In: Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies, volume 1 (Long and short papers). Minneapolis: Association for Computational Linguistics; 2019. p. 4171\u201386."},{"key":"1822_CR15","unstructured":"SECNLP: A survey of embeddings in clinical natural language processing-ScienceDirect. https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1532046419302436. 
Accessed 9 Sept 2021."},{"key":"1822_CR16","doi-asserted-by":"publisher","first-page":"895","DOI":"10.3389\/fchem.2019.00895","volume":"7","author":"Y-F Zhang","year":"2020","unstructured":"Zhang Y-F, Wang X, Kaushik AC, Chu Y, Shan X, Zhao M-Z, et al. SPVec: a Word2vec-inspired feature representation method for drug-target interaction prediction. Front Chem. 2020;7:895.","journal-title":"Front Chem"},{"key":"1822_CR17","doi-asserted-by":"publisher","first-page":"122","DOI":"10.3390\/cells8020122","volume":"8","author":"Y Wang","year":"2019","unstructured":"Wang Y, You Z-H, Yang S, Li X, Jiang T-H, Zhou X. A high efficient biological language model for predicting protein-protein interactions. Cells. 2019;8:122.","journal-title":"Cells"},{"key":"1822_CR18","unstructured":"IVS2vec: A tool of inverse virtual screening based on word2vec and deep learning techniques-ScienceDirect. https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1046202318304080. Accessed 9 Sept 2021."},{"key":"1822_CR19","doi-asserted-by":"publisher","first-page":"630","DOI":"10.3389\/fgene.2020.00630","volume":"11","author":"L Wang","year":"2020","unstructured":"Wang L, Wang Q, Bai H, Liu C, Liu W, Zhang Y, et al. EHR2Vec: representation learning of medical concepts from temporal patterns of clinical notes based on self-attention mechanism. Front Genet. 2020;11:630.","journal-title":"Front Genet"},{"key":"1822_CR20","doi-asserted-by":"publisher","unstructured":"Multi-layer Representation Learning for Medical Concepts | Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. https:\/\/doi.org\/10.1145\/2939672.2939823. Accessed 1 Sept 2021.","DOI":"10.1145\/2939672.2939823"},{"key":"1822_CR21","doi-asserted-by":"crossref","unstructured":"Martinez Soriano I, Castro Pe\u00f1a JL, Fernandez Breis JT, San Rom\u00e1n I, Alonso Barriuso A, Guevara Baraza D. Snomed2Vec: representation of SNOMED CT terms with Word2Vec. 
In: 2019 IEEE 32nd international symposium on computer-based medical systems (CBMS). 2019. p. 678\u201383.","DOI":"10.1109\/CBMS.2019.00138"},{"key":"1822_CR22","doi-asserted-by":"crossref","unstructured":"Freitas JKD, Johnson KW, Golden E, Nadkarni GN, Dudley JT, Bottinger EP, et al. Phe2vec: automated disease phenotyping based on unsupervised embeddings from electronic health records. 2021.","DOI":"10.1016\/j.patter.2021.100337"},{"key":"1822_CR23","unstructured":"Zhang Z. Explorations in word embeddings: graph-based word embedding learning and cross-lingual contextual word embedding learning. phdthesis. Universit\u00e9 Paris Saclay (COmUE); 2019."},{"key":"1822_CR24","doi-asserted-by":"publisher","first-page":"12","DOI":"10.1016\/j.jbi.2018.09.008","volume":"87","author":"Y Wang","year":"2018","unstructured":"Wang Y, Liu S, Afzal N, Rastegar-Mojarad M, Wang L, Shen F, et al. A comparison of word embeddings for the biomedical natural language processing. J Biomed Inform. 2018;87:12\u201320.","journal-title":"J Biomed Inform"},{"key":"1822_CR25","unstructured":"Hinton G, Roweis S. Stochastic neighbor embedding, p. 8."},{"key":"1822_CR26","unstructured":"Roweis S. Em algorithms for pca and spca. In: Advances in neural information processing systems. MIT Press; 1998. p. 626\u201332."},{"key":"1822_CR27","unstructured":"McInnes L, Healy J, Melville J. UMAP: uniform manifold approximation and projection for dimension reduction. ArXiv180203426 Cs Stat. 2020."},{"key":"1822_CR28","first-page":"100","volume":"28","author":"JA Hartigan","year":"1979","unstructured":"Hartigan JA, Wong MA. Algorithm AS 136: a K-means clustering algorithm. J R Stat Soc Ser C Appl Stat. 1979;28:100\u20138.","journal-title":"J R Stat Soc Ser C Appl Stat"},{"key":"1822_CR29","doi-asserted-by":"publisher","first-page":"19","DOI":"10.1080\/1364557032000119616","volume":"8","author":"H Arksey","year":"2005","unstructured":"Arksey H, O\u2019Malley L. 
Scoping studies: towards a methodological framework. Int J Soc Res Methodol. 2005;8:19\u201332.","journal-title":"Int J Soc Res Methodol"},{"key":"1822_CR30","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1186\/1748-5908-5-69","volume":"5","author":"D Levac","year":"2010","unstructured":"Levac D, Colquhoun H, O\u2019Brien KK. Scoping studies: advancing the methodology. Implement Sci IS. 2010;5:69.","journal-title":"Implement Sci IS"},{"key":"1822_CR31","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1007\/s13755-018-0062-0","volume":"7","author":"S Shah","year":"2018","unstructured":"Shah S, Luo X, Kanakasabai S, Tuason R, Klopper G. Neural networks for mining the associations between diseases and symptoms in clinical notes. Health Inf Sci Syst. 2018;7:1.","journal-title":"Health Inf Sci Syst"},{"key":"1822_CR32","doi-asserted-by":"crossref","unstructured":"Beaulieu-Jones BK, Kohane IS, Beam AL. Learning contextual hierarchical structure of medical concepts with poincair\u00e9 embeddings to clarify phenotypes. In: Biocomputing 2019. Kohala Coast: WORLD SCIENTIFIC; 2018. p. 8\u201317.","DOI":"10.1142\/9789813279827_0002"},{"key":"1822_CR33","doi-asserted-by":"publisher","first-page":"e12310","DOI":"10.2196\/12310","volume":"7","author":"E Dynomant","year":"2019","unstructured":"Dynomant E, Lelong R, Dahamna B, Massonnaud C, Kerdelhu\u00e9 G, Grosjean J, et al. Word embedding for the French natural language in health care: comparative study. JMIR Med Inform. 2019;7:e12310.","journal-title":"JMIR Med Inform"},{"issue":"Suppl","key":"1822_CR34","first-page":"2","volume":"18","author":"Z Chen","year":"2018","unstructured":"Chen Z, He Z, Liu X, Bian J. Evaluating semantic relations in neural word embeddings with biomedical and general domain knowledge bases. BMC Med Inform Decis Mak. 
2018;18(Suppl):2.","journal-title":"BMC Med Inform Decis Mak"},{"key":"1822_CR35","doi-asserted-by":"publisher","unstructured":"WordNet: a lexical database for English: communications of the ACM: vol 38, No 11. https:\/\/doi.org\/10.1145\/219717.219748. Accessed 29 Nov 2021.","DOI":"10.1145\/219717.219748"},{"key":"1822_CR36","doi-asserted-by":"publisher","first-page":"41","DOI":"10.1055\/s-0038-1637976","volume":"2","author":"DB Lindberg","year":"1993","unstructured":"Lindberg DB, Humphreys BL, McCray AT. The unified medical language system. Yearb Med Inform. 1993;2:41\u201351.","journal-title":"Yearb Med Inform"},{"key":"1822_CR37","first-page":"1001","volume":"26","author":"M El-Assady","year":"2020","unstructured":"El-Assady M, Kehlbeck R, Collins C, Keim D, Deussen O. Semantic concept spaces: guided topic model refinement using word-embedding projections. IEEE Trans Vis Comput Graph. 2020;26:1001\u201311.","journal-title":"IEEE Trans Vis Comput Graph"},{"key":"1822_CR38","unstructured":"Measures of semantic similarity and relatedness in the biomedical domain-ScienceDirect. https:\/\/www.sciencedirect.com\/science\/article\/pii\/S1532046406000645. Accessed 9 Sept 2021."},{"key":"1822_CR39","unstructured":"Hliaoutakis A. Semantic similarity measures in MeSH ontology and their application to information retrieval on medline, p. 79."},{"key":"1822_CR40","doi-asserted-by":"publisher","first-page":"251","DOI":"10.1016\/j.jbi.2010.10.004","volume":"44","author":"SVS Pakhomov","year":"2011","unstructured":"Pakhomov SVS, Pedersen T, McInnes B, Melton GB, Ruggieri A, Chute CG. Towards a framework for developing semantic relatedness reference standards. J Biomed Inform. 
2011;44:251\u201365.","journal-title":"J Biomed Inform"},{"key":"1822_CR41","first-page":"572","volume":"2010","author":"S Pakhomov","year":"2010","unstructured":"Pakhomov S, McInnes B, Adam T, Liu Y, Pedersen T, Melton GB. Semantic similarity and relatedness between clinical terms: an experimental study. AMIA Annu Symp Proc AMIA Symp AMIA Symp. 2010;2010:572\u20136.","journal-title":"AMIA Annu Symp Proc AMIA Symp AMIA Symp"},{"key":"1822_CR42","doi-asserted-by":"crossref","unstructured":"Levy O, Goldberg Y. Dependency-based word embeddings. In: Proceedings of the 52nd annual meeting of the association for computational linguistics (Volume 2: Short Papers). 2014. p. 302\u2013308.","DOI":"10.3115\/v1\/P14-2050"},{"key":"1822_CR43","doi-asserted-by":"crossref","unstructured":"Speer R, Chin J, Havasi C. ConceptNet 5.5: an open multilingual graph of general knowledge. In: Thirty-first AAAI conference on artificial intelligence. 2017.","DOI":"10.1609\/aaai.v31i1.11164"},{"key":"1822_CR44","doi-asserted-by":"publisher","first-page":"036106","DOI":"10.1103\/PhysRevE.82.036106","volume":"82","author":"D Krioukov","year":"2010","unstructured":"Krioukov D, Papadopoulos F, Kitsak M, Vahdat A, Bogu\u00f1\u00e1 M. Hyperbolic geometry of complex networks. Phys Rev E. 2010;82:036106.","journal-title":"Phys Rev E"},{"key":"1822_CR45","first-page":"287","volume":"5","author":"B Kulis","year":"2013","unstructured":"Kulis B. Metric learning: a survey. Mach Learn. 
2013;5:287\u2013364.","journal-title":"Mach Learn"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-022-01822-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12911-022-01822-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-022-01822-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2023,1,30]],"date-time":"2023-01-30T17:28:43Z","timestamp":1675099723000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-022-01822-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,3,29]]},"references-count":45,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["1822"],"URL":"https:\/\/doi.org\/10.1186\/s12911-022-01822-9","relation":{},"ISSN":["1472-6947"],"issn-type":[{"value":"1472-6947","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,3,29]]},"assertion":[{"value":"24 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"7 March 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"29 March 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Not applicable.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not 
applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"83"}}