{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,10,12]],"date-time":"2025-10-12T03:34:32Z","timestamp":1760240072402,"version":"build-2065373602"},"reference-count":44,"publisher":"MDPI AG","issue":"5","license":[{"start":{"date-parts":[[2019,3,3]],"date-time":"2019-03-03T00:00:00Z","timestamp":1551571200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["IJERPH"],"abstract":"<jats:p>Medicine is a knowledge area continuously experiencing changes. Every day, discoveries and procedures are tested with the goal of providing improved service and quality of life to patients. With the evolution of computer science, multiple areas experienced an increase in productivity with the implementation of new technical solutions. Medicine is no exception. Providing healthcare services in the future will involve the storage and manipulation of large volumes of data (big data) from medical records, requiring the integration of different data sources, for a multitude of purposes, such as prediction, prevention, personalization, participation, and becoming digital. Data integration and data sharing will be essential to achieve these goals. Our work focuses on the development of a framework process for the integration of data from different sources to increase its usability potential. We integrated data from an internal hospital database, external data, and also structured data resulting from natural language processing (NLP) applied to electronic medical records. 
An extract-transform and load (ETL) process was used to merge different data sources into a single one, allowing more effective use of these data and, eventually, contributing to more efficient use of the available resources.<\/jats:p>","DOI":"10.3390\/ijerph16050769","type":"journal-article","created":{"date-parts":[[2019,3,4]],"date-time":"2019-03-04T05:45:36Z","timestamp":1551678336000},"page":"769","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Merging Data Diversity of Clinical Medical Records to Improve Effectiveness"],"prefix":"10.3390","volume":"16","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-1600-7867","authenticated-orcid":false,"given":"Berit I.","family":"Helgheim","sequence":"first","affiliation":[{"name":"Logistics, Molde University College, Molde, NO-6410 Molde, Norway"}]},{"given":"Rui","family":"Maia","sequence":"additional","affiliation":[{"name":"DEI, Instituto Superior T\u00e9cnico, 1049-001 Lisboa, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6662-0806","authenticated-orcid":false,"given":"Joao C.","family":"Ferreira","sequence":"additional","affiliation":[{"name":"Instituto Universit\u00e1rio de Lisboa (ISCTE-IUL), ISTAR-IUL, 1649-026 Lisbon, Portugal"}]},{"given":"Ana Lucia","family":"Martins","sequence":"additional","affiliation":[{"name":"Instituto Universit\u00e1rio de Lisboa (ISCTE-IUL), BRU-IUL, 1649-026 Lisbon, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2019,3,3]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1186\/s12911-016-0357-5","article-title":"A hybrid solution for extracting structured medical information from unstructured data in medical records via a double-reading\/entry system","volume":"16","author":"Luo","year":"2016","journal-title":"BMC Med. Inform. Decis. 
Mak."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"85","DOI":"10.1097\/00024665-200503000-00008","article-title":"Designing an EMR planning process based on staff attitudes toward and opinions about computers in healthcare","volume":"23","author":"McLane","year":"2005","journal-title":"CIN Comput. Inform. Nurs."},{"key":"ref_3","first-page":"184","article-title":"Challenges of Electronic Medical Record Implementation in the Emergency Department","volume":"22","author":"Yamamoto","year":"2006","journal-title":"Pediatr. Emerg. Care"},{"key":"ref_4","unstructured":"Yadav, P., Steinbach, M., Kumar, V., and Simon, G. (arXiv, 2017). Mining Electronic Health Records: A Survey, arXiv."},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Sun, W., Cai, Z., Li, Y., Liu, F., Fang, S., and Wang, G. (2018). Data processing and text mining technologies on electronic medical records: A review. J. Healthc. Eng.","DOI":"10.1155\/2018\/4302425"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1007\/978-3-030-01746-0_13","article-title":"Extracting clinical information from electronic medical records","volume":"806","author":"Lamy","year":"2018","journal-title":"Adv. Intell. Syst. Comput."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"55","DOI":"10.1016\/j.mpaic.2016.10.006","article-title":"Data, information, knowledge and wisdom","volume":"18","author":"Cooper","year":"2017","journal-title":"Anaesth. Intensive Care Med."},{"key":"ref_8","first-page":"15","article-title":"Hierarchy of Knowledge\u2014from Data to Wisdom","volume":"2","author":"Allen","year":"2004","journal-title":"Int. J. Curr. Res. Multidiscip."},{"key":"ref_9","unstructured":"Garets, D., and Davis, M. (2019, February 27). Electronic Medical Records vs. Electronic Health Records: Yes, There Is a Difference A HIMSS Analytics TM White Paper. 
Available online: https:\/\/s3.amazonaws.com\/rdcms-himss\/files\/production\/public\/HIMSSorg\/Content\/files\/WP_EMR_EHR.pdf."},{"key":"ref_10","unstructured":"(2019, February 27). The Dorenfest Complete Integrated Healthcare Delivery System Plus (Ihds+) Database and Library. Available online: https:\/\/foundation.himss.org\/Dorenfest\/About."},{"key":"ref_11","first-page":"64","article-title":"Data Mining Applications in Healthcare","volume":"19","author":"Koh","year":"2011","journal-title":"J. Healthc. Inf. Manag."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Sun, W., Cai, Z., Liu, F., Fang, S., and Wang, G. (2017, January 12\u201315). A survey of data mining technology on electronic medical records. Proceedings of the 2017 IEEE 19th International Conference on e-Health Networking, Applications and Services (Healthcom), e-Health Networking, Applications and Services (Healthcom), Dalian, China.","DOI":"10.1109\/HealthCom.2017.8210774"},{"key":"ref_13","unstructured":"Roy, S.B., Teredesai, A., Zolfaghar, K., Liu, R., Hazel, D., Newman, S., and Marinez, A. (2015, January 10\u201313). Dynamic Hierarchical Classification for Patient Risk-of Readmission. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Baba, Y., Kashima, H., Nohara, Y., Kai, E., Ghosh, P., Islam, R., Ahmed, A., Kuroda, M., Inoue, S., and Hiramatsu, T. (2015, January 10\u201313). Predictive Approaches for Low-Cost Preventive Medicine Program in Developing Countries. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia.","DOI":"10.1145\/2783258.2788587"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Somanchi, S., Adhikari, S., Lin, A., Eneva, E., and Ghani, R. (2015, January 10\u201313). Early Prediction of Cardiac Arrest (Code Blue) using Electronic Medical Records. 
Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia.","DOI":"10.1145\/2783258.2788588"},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Ravindranath, K.R. (2015, January 8\u201310). Clinical Decision Support System for heart diseases using Extended sub tree. Proceedings of the 2015 International Conference on Pervasive Computing (ICPC), Pune, India.","DOI":"10.1109\/PERVASIVE.2015.7087026"},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Amin, S.U., Agarwal, K., and Beg, R. (2013, January 20\u201322). Genetic neural network based data mining in prediction of heart disease using risk factors. Proceedings of the 2013 IEEE Conference on Information & Communication Technologies, Hanoi, Vietnam.","DOI":"10.1109\/CICT.2013.6558288"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chia, C.-C., and Syed, Z. (2014, January 24\u201327). Scalable noise mining in long-term electrocardiographic time-series to predict death following heart attacks. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, New York, NY, USA.","DOI":"10.1145\/2623330.2623702"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"e49","DOI":"10.1093\/jamia\/ocv124","article-title":"Cardiac catheterization laboratory inpatient forecast tool: A prospective evaluation","volume":"23","author":"Toerper","year":"2016","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"555","DOI":"10.1007\/s10115-014-0740-4","article-title":"Stabilized sparse ordinal regression for medical risk stratification","volume":"43","author":"Tran","year":"2015","journal-title":"Knowl. Inf. Syst."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Kop, R., Hoogendoorn, M., Moons, L.M.G., Numans, M.E., and Teije, A.T. (2015, January 20). 
On the advantage of using dedicated data mining techniques to predict colorectal cancer. Proceedings of the Conference on Artificial Intelligence in Medicine in Europe, Pavia, Italy.","DOI":"10.1007\/978-3-319-19551-3_16"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"130","DOI":"10.1109\/TKDE.2013.76","article-title":"Extending association rule summarization techniques to assess risk of diabetes mellitus","volume":"27","author":"Simon","year":"2015","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Rabbi, K., Mamun, Q., and Islam, M.D.R. (2015, January 15\u201317). Dynamic feature selection (DFS) based Data clustering technique on sensory data streaming in eHealth record system. Proceedings of the 2015 IEEE 10th Conference on Industrial Electronics and Applications (ICIEA), Auckland, New Zealand.","DOI":"10.1109\/ICIEA.2015.7334192"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Sumana, B.V., and Santhanam, T. (2014, January 10\u201311). Prediction of diseases by cascading clustering and classification. Proceedings of the Advances in Electronics, Computers and Communications. International Conference (ICAECC 2014), Bangalore, India.","DOI":"10.1109\/ICAECC.2014.7002426"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Feldman, K., Hazekamp, N., and Chawla, N.V. (2016, January 4\u20137). Mining the Clinical Narrative: All Text are Not Equal. Proceedings of the 2016 IEEE International Conference on Healthcare Informatics (ICHI), Chicago, IL, USA.","DOI":"10.1109\/ICHI.2016.37"},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"507","DOI":"10.1136\/jamia.2009.001560","article-title":"Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): Architecture, component evaluation and applications","volume":"17","author":"Savova","year":"2010","journal-title":"J. Am. Med. Inform. 
Assoc."},{"key":"ref_27","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1016\/j.procs.2016.09.153","article-title":"Towards Developing an Intelligent Agent to Assist in Patient Diagnosis Using Neural Networks on Unstructured Patient Clinical Notes: Initial Analysis and Models","volume":"100","author":"Pulmano","year":"2016","journal-title":"Procedia Comput. Sci."},{"key":"ref_28","first-page":"505","article-title":"The Feasibility of Using Large-Scale Text Mining to Detect Adverse Childhood Experiences in a VA-Treated Population","volume":"28","author":"Araneo","year":"2015","journal-title":"Appl. Comput. Electromagn. Soc. J."},{"key":"ref_29","doi-asserted-by":"crossref","first-page":"1168","DOI":"10.2105\/AJPH.2014.302440","article-title":"Improving identification of fall-related injuries in ambulatory care using statistical text mining","volume":"105","author":"Luther","year":"2015","journal-title":"Am. J. Public Health"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"S203","DOI":"10.1016\/j.jbi.2015.08.003","article-title":"Coronary artery disease risk assessment from unstructured electronic health records using text mining","volume":"58","author":"Jonnagaddala","year":"2015","journal-title":"J. Biomed. Inform."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"89","DOI":"10.1016\/j.jbi.2015.09.011","article-title":"Exploring methods for identifying related patient safety events using structured and unstructured data","volume":"58","author":"Fong","year":"2015","journal-title":"J. Biomed. Inform."},{"key":"ref_32","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1016\/j.cmpb.2018.08.016","article-title":"A hybrid data mining model for diagnosis of patients with clinical suspicion of dementia","volume":"165","author":"Moreira","year":"2018","journal-title":"Comput. Methods Prog. 
Biomed."},{"key":"ref_33","doi-asserted-by":"crossref","first-page":"65","DOI":"10.1016\/j.bdr.2018.05.004","article-title":"Novel Approach to Predict Hospital Readmissions Using Feature Selection from Unstructured Data with Class Imbalance","volume":"13","author":"Sundararaman","year":"2018","journal-title":"Big Data Res."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1499","DOI":"10.1111\/jgs.15411","article-title":"The Value of Unstructured Electronic Health Record Data in Geriatric Syndrome Case Identification","volume":"66","author":"Kharrazi","year":"2018","journal-title":"J. Am. Geriatr. Soc."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"876","DOI":"10.14778\/2983200.2983204","article-title":"A framework for annotating CSV-like data","volume":"9","author":"Arenas","year":"2016","journal-title":"Proc. VLDB Endow."},{"key":"ref_36","unstructured":"(2019, February 27). Pooled Resource Open-Access ALS Clinical Trials Database. Available online: https:\/\/nctu.partners.org\/proact\/data\/index."},{"key":"ref_37","doi-asserted-by":"crossref","first-page":"753","DOI":"10.1177\/0193945916689084","article-title":"Data quality in electronic health records research: Quality domains and assessment methods","volume":"40","author":"Feder","year":"2018","journal-title":"Western J. Nurs. Res."},{"key":"ref_38","doi-asserted-by":"crossref","first-page":"40","DOI":"10.1016\/j.ijmedinf.2016.03.006","article-title":"Data quality assessment framework to assess electronic medical record data for use in research","volume":"90","author":"Reimer","year":"2016","journal-title":"Int. J. Med. Inform."},{"key":"ref_39","doi-asserted-by":"crossref","first-page":"263","DOI":"10.1109\/TBME.2016.2573285","article-title":"Omic and electronic health record big data analytics for precision medicine","volume":"64","author":"Wu","year":"2017","journal-title":"IEEE Trans. Biomed. 
Eng."},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"144","DOI":"10.1136\/amiajnl-2011-000681","article-title":"Methods and dimensions of electronic health record data quality assessment: Enabling reuse for clinical research","volume":"20","author":"Weiskopf","year":"2013","journal-title":"J. Am. Med. Inform. Assoc."},{"key":"ref_41","unstructured":"Tennison, J., Kellogg, G., and Herman, I. (2019, February 27). Model for tabular data and metadata on the web, W3C Working Draft 8. Available online: https:\/\/www.w3.org\/TR\/tabular-data-model\/."},{"key":"ref_42","unstructured":"Amaral, P., Pinto, S., de Carvalho, M., Tom\u00e1s, P., and Madeira, S.C. (2012, January 12). Predicting the need for noninvasive ventilation in patients with amyotrophic lateral sclerosis. Proceedings of the ACM SIGKDD Workshop on Health Informatics (HI-KDD 2012), Beijing, China."},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"133","DOI":"10.1016\/j.jbi.2015.09.021","article-title":"Prognostic models based on patient snapshots and time windows: Predicting disease progression to assisted ventilation in amyotrophic lateral sclerosis","volume":"58","author":"Carreiro","year":"2015","journal-title":"J. Biomed. Inform."},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"138","DOI":"10.1007\/978-3-030-01746-0_16","article-title":"Predictive analysis in healthcare: Emergency wait time prediction","volume":"806","author":"Pereira","year":"2019","journal-title":"Adv. Intell. Syst. 
Comput."}],"container-title":["International Journal of Environmental Research and Public Health"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/1660-4601\/16\/5\/769\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T12:35:59Z","timestamp":1760186159000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/1660-4601\/16\/5\/769"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,3,3]]},"references-count":44,"journal-issue":{"issue":"5","published-online":{"date-parts":[[2019,3]]}},"alternative-id":["ijerph16050769"],"URL":"https:\/\/doi.org\/10.3390\/ijerph16050769","relation":{},"ISSN":["1660-4601"],"issn-type":[{"type":"electronic","value":"1660-4601"}],"subject":[],"published":{"date-parts":[[2019,3,3]]}}}