{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,16]],"date-time":"2026-03-16T23:59:40Z","timestamp":1773705580187,"version":"3.50.1"},"reference-count":46,"publisher":"MDPI AG","issue":"1","license":[{"start":{"date-parts":[[2022,1,9]],"date-time":"2022-01-09T00:00:00Z","timestamp":1641686400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Future Internet"],"abstract":"<jats:p>In the knowledge discovery field of the Big Data domain the analysis of geographic positioning and mobility information plays a key role. At the same time, in the Natural Language Processing (NLP) domain pre-trained models such as BERT and word embedding algorithms such as Word2Vec enabled a rich encoding of words that allows mapping textual data into points of an arbitrary multi-dimensional space, in which the notion of proximity reflects an association among terms or topics. The main contribution of this paper is to show how analytical tools, traditionally adopted to deal with geographic data to measure the mobility of an agent in a time interval, can also be effectively applied to extract knowledge in a semantic realm, such as a semantic space of words and topics, looking for latent trajectories that can benefit the properties of neural network latent representations. As a case study, the Scopus database was queried about works of highly cited researchers in recent years. On this basis, we performed a dynamic analysis, for measuring the Radius of Gyration as an index of the mobility of researchers across scientific topics. The semantic space is built from the automatic analysis of the paper abstracts of each author. In particular, we evaluated two different methodologies to build the semantic space and we found that Word2Vec embeddings perform better than the BERT ones for this task. 
Finally, the scholars\u2019 trajectories show some latent properties of this model, which also represent new scientific contributions of this work. These properties include (i) the correlation between the scientific mobility and the achievement of scientific results, measured through the H-index; (ii) differences in the behavior of researchers working in different countries and subjects; and (iii) some interesting similarities between mobility patterns in this semantic realm and those typically observed in the case of human mobility.<\/jats:p>","DOI":"10.3390\/fi14010025","type":"journal-article","created":{"date-parts":[[2022,1,9]],"date-time":"2022-01-09T20:29:26Z","timestamp":1641760166000},"page":"25","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Mobility in Unsupervised Word Embeddings for Knowledge Extraction\u2014The Scholars\u2019 Trajectories across Research Topics"],"prefix":"10.3390","volume":"14","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1808-4487","authenticated-orcid":false,"given":"Gianfranco","family":"Lombardo","sequence":"first","affiliation":[{"name":"Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6030-9435","authenticated-orcid":false,"given":"Michele","family":"Tomaiuolo","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5916-9770","authenticated-orcid":false,"given":"Monica","family":"Mordonini","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-2857-1617","authenticated-orcid":false,"given":"Gaia","family":"Codeluppi","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture 
(DIA), University of Parma, 43100 Parma, Italy"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3528-0260","authenticated-orcid":false,"given":"Agostino","family":"Poggi","sequence":"additional","affiliation":[{"name":"Department of Engineering and Architecture (DIA), University of Parma, 43100 Parma, Italy"}]}],"member":"1968","published-online":{"date-parts":[[2022,1,9]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"779","DOI":"10.1038\/nature06958","article-title":"Understanding individual human mobility patterns","volume":"453","author":"Gonzalez","year":"2008","journal-title":"Nature"},{"key":"ref_2","first-page":"37","article-title":"Moving destination prediction using sparse dataset: A mobility gradient descent approach","volume":"11","author":"Wang","year":"2017","journal-title":"ACM Trans. Knowl. Discov. Data (TKDD)"},{"key":"ref_3","first-page":"56","article-title":"Spatio-Temporal Routine Mining on Mobile Phone Data","volume":"12","author":"Qin","year":"2018","journal-title":"ACM Trans. Knowl. Discov. Data (TKDD)"},{"key":"ref_4","unstructured":"Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv."},{"key":"ref_5","unstructured":"Devlin, J., Chang, M.W., Lee, K., and Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv."},{"key":"ref_6","unstructured":"(2020, July 16). The Scopus Repository. Available online: https:\/\/www.elsevier.com\/solutions\/scopus."},{"key":"ref_7","doi-asserted-by":"crossref","unstructured":"Reimers, N., and Gurevych, I. (2019). Sentence-bert: Sentence embeddings using siamese bert-networks. 
arXiv.","DOI":"10.18653\/v1\/D19-1410"},{"key":"ref_8","first-page":"37","article-title":"From data mining to knowledge discovery in databases","volume":"17","author":"Fayyad","year":"1996","journal-title":"AI Mag."},{"key":"ref_9","first-page":"1","article-title":"Knowledge discovery from a more than a decade studies on healthcare Big Data systems: A scientometrics study","volume":"6","author":"Ghatari","year":"2019","journal-title":"J. Big Data"},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"95","DOI":"10.1038\/s41586-019-1335-8","article-title":"Unsupervised word embeddings capture latent knowledge from materials science literature","volume":"571","author":"Tshitoyan","year":"2019","journal-title":"Nature"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"114053","DOI":"10.1016\/j.eswa.2020.114053","article-title":"Neural network embeddings on corporate annual filings for portfolio selection","volume":"164","author":"Adosoglou","year":"2020","journal-title":"Expert Syst. Appl."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Grover, A., and Leskovec, J. (2016, January 13\u201317). node2vec: Scalable feature learning for networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA.","DOI":"10.1145\/2939672.2939754"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"420","DOI":"10.1016\/j.future.2021.10.011","article-title":"Continual representation learning for node classification in power-law graphs","volume":"128","author":"Lombardo","year":"2021","journal-title":"Future Gener. Comput. Syst."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Hu, B., Wang, H., Wang, L., and Yuan, W. (2018). Adverse drug reaction predictions using stacking deep heterogeneous information network embedding approach. 
Molecules, 23.","DOI":"10.3390\/molecules23123193"},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"381","DOI":"10.3389\/fgene.2019.00381","article-title":"To embed or not: Network embedding as a paradigm in computational biology","volume":"10","author":"Nelson","year":"2019","journal-title":"Front. Genet."},{"key":"ref_16","first-page":"77","article-title":"ActorNode2Vec: An Actor-based solution for Node Embedding over large networks","volume":"14","author":"Lombardo","year":"2020","journal-title":"Intell. Artif."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Tomaiuolo, M., Lombardo, G., Mordonini, M., Cagnoni, S., and Poggi, A. (2020). A survey on troll detection. Future Internet, 12.","DOI":"10.3390\/fi12020031"},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"61","DOI":"10.1140\/epjst\/e2013-01715-5","article-title":"Understanding the patterns of car travel","volume":"215","author":"Pappalardo","year":"2013","journal-title":"Eur. Phys. J. Spec. Top."},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Pappalardo, L., Pedreschi, D., Smoreda, Z., and Giannotti, F. (2015, January 9\u201312). Using big data to study the link between human mobility and socio-economic development. Proceedings of the 2015 IEEE International Conference on IEEE, Hong Kong, China.","DOI":"10.1109\/BigData.2015.7363835"},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"101419","DOI":"10.1016\/j.compenvurbsys.2019.101419","article-title":"Tracking urban geo-topics based on dynamic topic model","volume":"79","author":"Yao","year":"2020","journal-title":"Comput. Environ. Urban Syst."},{"key":"ref_21","first-page":"38:1","article-title":"Mining Event-Oriented Topics in Microblog Stream with Unsupervised Multi-View Hierarchical Embedding","volume":"20","author":"Peng","year":"2018","journal-title":"ACM Trans. Knowl. Discov. 
Data"},{"key":"ref_22","first-page":"10","article-title":"Semantic text similarity using corpus-based word similarity and string similarity","volume":"2","author":"Islam","year":"2008","journal-title":"ACM Trans. Knowl. Discov. Data (TKDD)"},{"key":"ref_23","unstructured":"Mikolov, T., Sutskever, I., Chen, K., Corrado, G.S., and Dean, J. (2013). Distributed representations of words and phrases and their compositionality. Advances in Neural Information Processing Systems, MIT Press."},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Solomon, A., Bar, A., Yanai, C., Shapira, B., and Rokach, L. (2018, January 8\u201311). Predict demographic information using word2vec on spatial trajectories. Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization, Singapore.","DOI":"10.1145\/3209219.3209224"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"45","DOI":"10.1140\/epjds\/s13688-018-0173-5","article-title":"Identifying and predicting social lifestyles in people\u2019s trajectories by neural networks","volume":"7","author":"Zion","year":"2018","journal-title":"EPJ Data Sci."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Qiang, J., Chen, P., Wang, T., and Wu, X. (2017). Topic modeling over short texts by incorporating word embeddings. Pacific-Asia Conference on Knowledge Discovery and Data Mining, Springer.","DOI":"10.1007\/978-3-319-57529-2_29"},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"104","DOI":"10.1007\/s41109-019-0228-y","article-title":"Graph-based exploration and clustering analysis of semantic spaces","volume":"4","author":"Veremyev","year":"2019","journal-title":"Appl. Netw. Sci."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"5200","DOI":"10.1073\/pnas.0307545100","article-title":"Coauthorship networks and patterns of scientific collaboration","volume":"101","author":"Newman","year":"2004","journal-title":"Proc. Natl. Acad. Sci. 
USA"},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"81","DOI":"10.1145\/1247001.1247010","article-title":"Automatic and versatile publications ranking for research institutions and scholars","volume":"50","author":"Ren","year":"2007","journal-title":"Commun. ACM"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"1320","DOI":"10.1002\/asi.21062","article-title":"Comparing bibliometric statistics obtained from the Web of Science and Scopus","volume":"60","author":"Archambault","year":"2009","journal-title":"J. Assoc. Inf. Sci. Technol."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/1742-5581-3-1","article-title":"Scopus database: A review","volume":"3","author":"Burnham","year":"2006","journal-title":"Biomed. Digit. Libr."},{"key":"ref_32","first-page":"1","article-title":"Knowledge discovery on Scopus","volume":"1959","author":"Fornacciari","year":"2017","journal-title":"CEUR Workshop Proc."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Yegros-Yegros, A., Rafols, I., and D\u2019Este, P. (2015). Does interdisciplinary research lead to higher citation impact? The different effect of proximal and distal interdisciplinarity. PLoS ONE, 10.","DOI":"10.1371\/journal.pone.0135095"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Ying, Q.F., Venkatramanan, S., and Chiu, D.M. (2015, January 18\u201322). Modeling and analysis of scholar mobility on scientific landscape. Proceedings of the 24th International Conference on World Wide Web, ACM, Florence, Italy.","DOI":"10.1145\/2740908.2741737"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"May, C., Wang, A., Bordia, S., Bowman, S.R., and Rudinger, R. (2019). On measuring social biases in sentence encoders. arXiv.","DOI":"10.18653\/v1\/N19-1063"},{"key":"ref_36","unstructured":"Zhang, T., Kishore, V., Wu, F., Weinberger, K.Q., and Artzi, Y. (2019). Bertscore: Evaluating text generation with bert. 
arXiv."},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Baroni, M., Dinu, G., and Kruszewski, G. (2014, January 22\u201327). Do not count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Baltimore, MD, USA.","DOI":"10.3115\/v1\/P14-1023"},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"811","DOI":"10.1007\/s11192-013-1208-0","article-title":"hIa: An individual annual h-index to accommodate disciplinary and career length differences","volume":"99","author":"Harzing","year":"2014","journal-title":"Scientometrics"},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.physrep.2018.01.001","article-title":"Human mobility: Models and applications","volume":"734","author":"Barbosa","year":"2018","journal-title":"Phys. Rep."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"9136","DOI":"10.1038\/srep09136","article-title":"Explaining the power-law distribution of human mobility through transportation modality decomposition","volume":"5","author":"Zhao","year":"2015","journal-title":"Sci. Rep."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Cox, D., and Barndorff-Nielsen, O. (1994). Inference and Asymptotics. Chapman & Hall\/CRC Monographs on Statistics & Applied Probability, Taylor & Francis.","DOI":"10.1007\/978-1-4899-3210-5"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Wasserman, L., and Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference, Springer. Springer Texts in Statistics.","DOI":"10.1007\/978-0-387-21736-9"},{"key":"ref_43","first-page":"307","article-title":"Likelihood ratio tests for model selection and non-nested hypotheses","volume":"57","author":"Vuong","year":"1989","journal-title":"Econom. J. Econom. 
Soc."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"661","DOI":"10.1137\/070710111","article-title":"Power-law distributions in empirical data","volume":"51","author":"Clauset","year":"2009","journal-title":"SIAM Rev."},{"key":"ref_45","first-page":"1002","article-title":"Numerical recipes in C++","volume":"2","author":"Press","year":"1992","journal-title":"Art Sci. Comput."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Aceto, G., Ciuonzo, D., Montieri, A., Persico, V., and Pescap\u00e9, A. (2019, January 19\u201321). Know your big data trade-offs when classifying encrypted mobile traffic with deep learning. Proceedings of the 2019 Network Traffic Measurement and Analysis Conference (TMA), Paris, France.","DOI":"10.23919\/TMA.2019.8784565"}],"container-title":["Future Internet"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1999-5903\/14\/1\/25\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,13]],"date-time":"2025-10-13T14:01:47Z","timestamp":1760364107000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1999-5903\/14\/1\/25"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,1,9]]},"references-count":46,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,1]]}},"alternative-id":["fi14010025"],"URL":"https:\/\/doi.org\/10.3390\/fi14010025","relation":{},"ISSN":["1999-5903"],"issn-type":[{"value":"1999-5903","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,1,9]]}}}