{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,18]],"date-time":"2026-05-18T12:15:21Z","timestamp":1779106521309,"version":"3.51.4"},"reference-count":35,"publisher":"Springer Science and Business Media LLC","issue":"11","license":[{"start":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T00:00:00Z","timestamp":1614384000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T00:00:00Z","timestamp":1614384000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["Multimed Tools Appl"],"published-print":{"date-parts":[[2022,5]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:p>In the last few years, the integration of researches in Computer Science and medical fields has made available to the scientific community an enormous amount of data, stored in databases. In this paper, we analyze the data available in the Parkinson\u2019s Progression Markers Initiative (PPMI), a comprehensive observational, multi-center study designed to identify progression biomarkers important for better treatments for Parkinson\u2019s disease. The data of PPMI participants are collected through a comprehensive battery of tests and assessments including Magnetic Resonance Imaging and DATscan imaging, collection of blood, cerebral spinal fluid, and urine samples, as well as cognitive and motor evaluations. To this aim, we propose a technique to identify a correlation between the biomedical data in the PPMI dataset for verifying the consistency of medical reports formulated during the visits and allow to correctly categorize the various patients. 
To correlate the information of each patient\u2019s medical report, Information Retrieval and Machine Learning techniques have been adopted, including the Latent Semantic Analysis, Text2Vec and Doc2Vec techniques. Then, patients are grouped and classified into affected or not by using clustering algorithms according to the similarity of medical reports. Finally, we have adopted a visualization system based on the D3 framework to visualize correlations among medical reports with an interactive chart, and to support the doctor in analyzing the chronological sequence of visits in order to diagnose Parkinson\u2019s disease early.<\/jats:p>","DOI":"10.1007\/s11042-021-10506-x","type":"journal-article","created":{"date-parts":[[2021,2,27]],"date-time":"2021-02-27T17:03:11Z","timestamp":1614445391000},"page":"14685-14703","update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":8,"title":["Visualizing correlations among Parkinson biomedical data through information retrieval and machine learning techniques"],"prefix":"10.1007","volume":"81","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-3164-1858","authenticated-orcid":false,"given":"Maria","family":"Frasca","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Genoveffa","family":"Tortora","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,2,27]]},"reference":[{"key":"10506_CR1","unstructured":"Alsabti K, Ranka S, Singh V (2000) An efficient k-means clustering algorithm. First workshop high performance data mining"},{"issue":"10","key":"10506_CR2","doi-asserted-by":"publisher","first-page":"3889","DOI":"10.1007\/s12652-018-1160-1","volume":"10","author":"A Anagaw","year":"2019","unstructured":"Anagaw A, Chang Y-L (2019) A new complement na\u00efve bayesian approach for biomedical data classification. 
J Ambient Intell Human Comput 10 (10):3889\u20133897","journal-title":"J Ambient Intell Human Comput"},{"key":"10506_CR3","doi-asserted-by":"crossref","unstructured":"Beam AL, Kompa B, Schmaltz A, Fried I, Weber G, Palmer N, Shi X, Cai T, Kohane IS (2018) Clinical concept embeddings learned from massive sources of multimodal medical data. arXiv:1804.01486","DOI":"10.1142\/9789811215636_0027"},{"key":"10506_CR4","unstructured":"Blaas J, Botha CP, Post FH (2007) Interactive visualization of multi-field medical data using linked physical and feature-space views. In: EuroVis, pp 123\u2013130"},{"issue":"5","key":"10506_CR5","doi-asserted-by":"publisher","first-page":"1211","DOI":"10.1109\/TCBB.2013.16","volume":"10","author":"S Bleik","year":"2013","unstructured":"Bleik S, Mishra M, Huan J, Song M (2013) Text categorization of biomedical data sets using graph kernels and a controlled vocabulary. IEEE\/ACM Trans Comput Biol Bioinform 10(5):1211\u20131217","journal-title":"IEEE\/ACM Trans Comput Biol Bioinform"},{"key":"10506_CR6","doi-asserted-by":"crossref","unstructured":"Bouadjenek MR, Verspoor K (2017) Multi-field query expansion is effective for biomedical dataset retrieval. Database 2017","DOI":"10.1093\/database\/bax062"},{"key":"10506_CR7","doi-asserted-by":"crossref","unstructured":"Chen H, Fuller SS, Friedman C, Hersh W (2005) Knowledge management, data mining, and text mining in medical informatics. In: Medical informatics. Springer, New York, pp 3\u201333","DOI":"10.1007\/0-387-25739-X_1"},{"key":"10506_CR8","unstructured":"Chen Q, Sokolova M (2018) Word2vec and doc2vec in unsupervised sentiment analysis of clinical discharge summaries. arXiv:1805.00352"},{"key":"10506_CR9","doi-asserted-by":"crossref","unstructured":"Chou S, Chang W, Cheng C-Y, Jehng J-C, Chang C (2008) An information retrieval system for medical records & documents. In: 30th annual intl conf of the IEEE eng in medicine and biology society. 
IEEE, pp 1474\u20131477","DOI":"10.1109\/IEMBS.2008.4649446"},{"issue":"1","key":"10506_CR10","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1093\/bmb\/ldn013","volume":"86","author":"CA Davie","year":"2008","unstructured":"Davie CA (2008) A review of parkinson\u2019s disease. British Med Bull 86(1):109\u2013127","journal-title":"British Med Bull"},{"key":"10506_CR11","doi-asserted-by":"crossref","unstructured":"Distante D, Risi M, Scanniello G (2010) Extending web content management systems navigation capabilities with semantic navigation maps. In: 12th IEEE Intl Symposium on Web Systems Evolution (WSE). IEEE, pp 1\u20135","DOI":"10.1109\/WSE.2010.6224336"},{"key":"10506_CR12","unstructured":"Dynomant E, Darmoni SJ, Lejeune \u00c9, Kerdelhu\u00e9 G, Leroy J-P, Lequertier V, Canu S, Grosjean J (2019) Doc2vec on the pubmed corpus: study of a new approach to generate related articles. arXiv:1911.11698"},{"key":"10506_CR13","unstructured":"Euzenat J (2007) Semantic precision and recall for ontology alignment evaluation. In: IJCAI, vol 7, pp 348\u2013353"},{"key":"10506_CR14","doi-asserted-by":"crossref","unstructured":"Fern\u00e1ndez E, Garc\u00eda-Moreno J-M, Mart\u00edn de Pablos A, Chac\u00f3n J (2014) May the thyroid gland and thyroperoxidase participate in nitrosylation of serum proteins and sporadic parkinson\u2019s disease?","DOI":"10.1089\/ars.2014.6072"},{"key":"10506_CR15","doi-asserted-by":"crossref","unstructured":"Gath I, Geva AB (1989) Unsupervised optimal fuzzy clustering. IEEE Trans Pattern Anal Mach Intell 773\u2013780","DOI":"10.1109\/34.192473"},{"issue":"6","key":"10506_CR16","doi-asserted-by":"publisher","first-page":"72","DOI":"10.1145\/3209086","volume":"61","author":"D Gefen","year":"2018","unstructured":"Gefen D, Miller J, Armstrong JK, Cornelius FH, Robertson N, Smith-McLallen A, Taylor JA (2018) Identifying patterns in medical records through latent semantic analysis. 
Commun ACM 61(6):72\u201377","journal-title":"Commun ACM"},{"issue":"1","key":"10506_CR17","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1001\/archneur.56.1.33","volume":"56","author":"DJ Gelb","year":"1999","unstructured":"Gelb DJ, Oliver E, Gilman S (1999) Diagnostic criteria for parkinson disease. Archiv Neurol 56(1):33\u201339","journal-title":"Archiv Neurol"},{"key":"10506_CR18","doi-asserted-by":"crossref","unstructured":"Hu G (2010) Total cholesterol and the risk of parkinson\u2019s disease: A review for some new findings. Parkinson\u2019s disease 2010","DOI":"10.4061\/2010\/836962"},{"issue":"1","key":"10506_CR19","first-page":"4","volume":"1","author":"A Khan","year":"2010","unstructured":"Khan A, Baharudin B, Lee LH, Khan K (2010) A review of machine learning algorithms for text-documents classification. J Adv Inform Technol 1 (1):4\u201320","journal-title":"J Adv Inform Technol"},{"key":"10506_CR20","unstructured":"Le Q, Mikolov T (2014) Distributed representations of sentences and documents. In: International conference on machine learning, pp 1188\u20131196"},{"key":"10506_CR21","volume-title":"Data visualization strategies for the electronic health record","author":"BJ Lesselroth","year":"2011","unstructured":"Lesselroth BJ, Pieczkiewicz DS (2011) Data visualization strategies for the electronic health record. Nova Science Publishers Inc, New York"},{"issue":"6","key":"10506_CR22","doi-asserted-by":"publisher","first-page":"668","DOI":"10.1016\/j.jbi.2006.02.001","volume":"39","author":"Q Li","year":"2006","unstructured":"Li Q, Wu Y-FB (2006) Identifying important concepts from medical documents. 
J Biomed Inform 39(6):668\u2013679","journal-title":"J Biomed Inform"},{"issue":"1","key":"10506_CR23","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1016\/j.datak.2006.02.008","volume":"61","author":"W Mao","year":"2007","unstructured":"Mao W, Chu WW (2007) The phrase-based vector space model for automatic retrieval of free-text medical documents. Data Knowl Eng 61(1):76\u201392","journal-title":"Data Knowl Eng"},{"issue":"4","key":"10506_CR24","doi-asserted-by":"publisher","first-page":"629","DOI":"10.1016\/j.pneurobio.2011.09.005","volume":"95","author":"K Marek","year":"2011","unstructured":"Marek K, Jennings D, Lasch S, Siderowf A, Tanner C, Simuni T, Coffey C, Kieburtz K, Flagg E, Chowdhury S et al (2011) The parkinson progression marker initiative (PPMI). Progress Neurobiol 95(4):629\u2013635","journal-title":"Progress Neurobiol"},{"issue":"6","key":"10506_CR25","doi-asserted-by":"publisher","first-page":"381","DOI":"10.1016\/j.parkreldis.2004.03.008","volume":"10","author":"RP Munhoz","year":"2004","unstructured":"Munhoz RP, Teive HA, Troiano AR, Hauck PR, Leiva MHH, Graff H, Werneck LC (2004) Parkinson\u2019s disease and thyroid dysfunction. Parkinson Relat Disord 10(6):381\u2013383","journal-title":"Parkinson Relat Disord"},{"key":"10506_CR26","doi-asserted-by":"crossref","unstructured":"Pellecchia MT, Frasca M, Citarella AA, Risi M, Francese R, Tortora G, De Marco F (2019) Identifying correlations among biomedical data through information retrieval techniques. In: 2019 23rd international conference information visualisation (IV). IEEE, pp 269\u2013274","DOI":"10.1109\/IV.2019.00052"},{"key":"10506_CR27","first-page":"1","volume-title":"Data mining","author":"A Rajaraman","year":"2011","unstructured":"Rajaraman A, Ullman JD (2011) Data mining. 
Cambridge University Press, Cambridge, pp 1\u201317"},{"issue":"3","key":"10506_CR28","doi-asserted-by":"publisher","first-page":"207","DOI":"10.1561\/1100000039","volume":"5","author":"A Rind","year":"2013","unstructured":"Rind A, Wang TD, Aigner W, Miksch S, Wongsuphasawat K, Plaisant C, Shneiderman B (2013) Interactive information visualization to explore and query electronic health records. Found Trends Human-Comput Interact 5 (3):207\u2013298","journal-title":"Found Trends Human-Comput Interact"},{"key":"10506_CR29","doi-asserted-by":"crossref","unstructured":"Romano S, Scanniello G, Risi M, Gravino C (2011) Clustering and lexical information support for the recovery of design pattern in source code. In: 27th IEEE Intl Conf on software maintenance (ICSM). IEEE, pp 500\u2013503","DOI":"10.1109\/ICSM.2011.6080818"},{"issue":"2","key":"10506_CR30","doi-asserted-by":"publisher","first-page":"392","DOI":"10.1016\/j.cag.2011.01.011","volume":"35","author":"T Ropinski","year":"2011","unstructured":"Ropinski T, Oeltze S, Preim B (2011) Survey of glyph-based visualization techniques for spatial multivariate medical data. Comput Graphics 35 (2):392\u2013401","journal-title":"Comput Graphics"},{"key":"10506_CR31","doi-asserted-by":"crossref","unstructured":"Selivanov D, Wang Q (2016) text2vec: Modern text mining framework for r. Computer software manual(R package version 0.4. 0). Retrieved from https:\/\/CRAN.R-project.org\/package=text2vec","DOI":"10.32614\/CRAN.package.text2vec"},{"issue":"1","key":"10506_CR32","doi-asserted-by":"publisher","first-page":"104","DOI":"10.1016\/j.ipm.2013.08.006","volume":"50","author":"AK Uysal","year":"2014","unstructured":"Uysal AK, Gunal S (2014) The impact of preprocessing on text classification. 
Inform Process Manag 50(1):104\u2013112","journal-title":"Inform Process Manag"},{"issue":"2","key":"10506_CR33","doi-asserted-by":"publisher","first-page":"330","DOI":"10.1136\/amiajnl-2014-002955","volume":"22","author":"VL West","year":"2015","unstructured":"West VL, Borland D, Hammond WE (2015) Innovative information visualization of electronic health record data: A systematic review. J Am Med Inform Assoc 22(2):330\u2013339","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"10506_CR34","doi-asserted-by":"publisher","first-page":"910","DOI":"10.2337\/dc10-1922","volume":"34","author":"Q Xu","year":"2011","unstructured":"Xu Q, Park Y, Huang X, Hollenbeck A, Blair A, Schatzkin A, Chen H (2011) Diabetes and risk of parkinson\u2019s disease. Diabetes Care 34(4):910\u2013915","journal-title":"Diabetes Care"},{"issue":"7","key":"10506_CR35","doi-asserted-by":"publisher","first-page":"1178","DOI":"10.1093\/bioinformatics\/bth060","volume":"20","author":"G Zhou","year":"2004","unstructured":"Zhou G, Zhang J, Su J, Shen D, Tan C (2004) Recognizing names in biomedical texts: A machine learning approach. 
Bioinformatics 20(7):1178\u20131190","journal-title":"Bioinformatics"}],"container-title":["Multimedia Tools and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-021-10506-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1007\/s11042-021-10506-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1007\/s11042-021-10506-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,8,25]],"date-time":"2024-08-25T00:03:17Z","timestamp":1724544197000},"score":1,"resource":{"primary":{"URL":"https:\/\/link.springer.com\/10.1007\/s11042-021-10506-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2,27]]},"references-count":35,"journal-issue":{"issue":"11","published-print":{"date-parts":[[2022,5]]}},"alternative-id":["10506"],"URL":"https:\/\/doi.org\/10.1007\/s11042-021-10506-x","relation":{},"ISSN":["1380-7501","1573-7721"],"issn-type":[{"value":"1380-7501","type":"print"},{"value":"1573-7721","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,2,27]]},"assertion":[{"value":"30 August 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 December 2020","order":2,"name":"revised","label":"Revised","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"5 January 2021","order":3,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"27 February 2021","order":4,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}}]}}