{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,21]],"date-time":"2026-02-21T02:28:11Z","timestamp":1771640891744,"version":"3.50.1"},"reference-count":29,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,2,22]],"date-time":"2022-02-22T00:00:00Z","timestamp":1645488000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,22]],"date-time":"2022-02-22T00:00:00Z","timestamp":1645488000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100013371","name":"Universit\u00e9 C\u00f4te d\u2019Azur","doi-asserted-by":"publisher","award":["ANR-15-IDEX-00"],"award-info":[{"award-number":["ANR-15-IDEX-00"]}],"id":[{"id":"10.13039\/501100013371","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100013371","name":"Universit\u00e9 C\u00f4te d\u2019Azur","doi-asserted-by":"publisher","award":["19-P3IA-0002"],"award-info":[{"award-number":["19-P3IA-0002"]}],"id":[{"id":"10.13039\/501100013371","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Biomed Semant"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Artificial intelligence methods applied to electronic medical records (EMRs) hold the potential to help physicians save time by sharpening their analysis and decisions, thereby improving the health of patients. On the one hand, machine learning algorithms have proven their effectiveness in extracting information and exploiting knowledge extracted from data. On the other hand, knowledge graphs capture human knowledge by relying on conceptual schemas and formalization and supporting reasoning. 
Leveraging knowledge graphs that are legion in the medical field, it is possible to pre-process and enrich data representation used by machine learning algorithms. Medical data standardization is an opportunity to jointly exploit the richness of knowledge graphs and the capabilities of machine learning algorithms.<\/jats:p><\/jats:sec><jats:sec><jats:title>Methods<\/jats:title><jats:p>We propose to address the problem of hospitalization prediction for patients with an approach that enriches vector representation of EMRs with information extracted from different knowledge graphs before learning and predicting. In addition, we performed an automatic selection of features resulting from knowledge graphs to distinguish noisy ones from those that can benefit the decision making. We report the results of our experiments on the PRIMEGE PACA database that contains more than 600,000 consultations carried out by 17 general practitioners (GPs).<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>A statistical evaluation shows that our proposed approach improves hospitalization prediction. More precisely, injecting features extracted from cross-domain knowledge graphs in the vector representation of EMRs given as input to the prediction algorithm significantly increases the F1 score of the prediction.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>By injecting knowledge from recognized reference sources into the representation of EMRs, it is possible to significantly improve the prediction of medical events. Future work would be to evaluate the impact of a feature selection step coupled with a combination of features extracted from several knowledge graphs. 
A possible avenue is to study more hierarchical levels and properties related to concepts, as well as to integrate more semantic annotators to exploit unstructured data.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s13326-022-00261-9","type":"journal-article","created":{"date-parts":[[2022,2,22]],"date-time":"2022-02-22T11:03:19Z","timestamp":1645527799000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":3,"title":["Extending electronic medical records vector models with knowledge graphs to improve hospitalization prediction"],"prefix":"10.1186","volume":"13","author":[{"given":"Rapha\u00ebl","family":"Gazzotti","sequence":"first","affiliation":[]},{"given":"Catherine","family":"Faron","sequence":"additional","affiliation":[]},{"given":"Fabien","family":"Gandon","sequence":"additional","affiliation":[]},{"given":"Virginie","family":"Lacroix-Hugues","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4425-4163","authenticated-orcid":false,"given":"David","family":"Darmon","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2022,2,22]]},"reference":[{"key":"261_CR1","first-page":"462","volume":"245","author":"V Lacroix-Hugues","year":"2017","unstructured":"Lacroix-Hugues V, Darmon D, Pradier C, Staccini P. Creation of the first french database in primary care using the icpc2: Feasibility study. Stud Health Technol Inform. 2017; 245:462\u20136.","journal-title":"Stud Health Technol Inform"},{"issue":"2","key":"261_CR2","doi-asserted-by":"publisher","first-page":"101","DOI":"10.4068\/cmj.2018.54.2.101","volume":"54","author":"S-M Wang","year":"2018","unstructured":"Wang S-M, Han C, Bahk W-M, Lee S-J, Patkar AA, Masand PS, Pae C-U. Addressing the side effects of contemporary antidepressant drugs: a comprehensive review. Chonnam Med J. 
2018; 54(2):101\u201312.","journal-title":"Chonnam Med J"},{"issue":"1","key":"261_CR3","doi-asserted-by":"publisher","first-page":"39","DOI":"10.1186\/s13326-017-0149-6","volume":"8","author":"H Min","year":"2017","unstructured":"Min H, Mobahi H, Irvin K, Avramovic S, Wojtusiak J. Predicting activities of daily living for cancer patients using an ontology-guided machine learning methodology. J Biomed Semant. 2017; 8(1):39.","journal-title":"J Biomed Semant"},{"key":"261_CR4","doi-asserted-by":"publisher","unstructured":"Choi E, Bahadori MT, Song L, Stewart WF, Sun J. GRAM: Graph-based Attention Model for Healthcare Representation Learning. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, August 13 - 17, 2017. ACM: 2017. p. 787\u201395. https:\/\/doi.org\/10.1145\/3097983.3098126.","DOI":"10.1145\/3097983.3098126"},{"key":"261_CR5","doi-asserted-by":"publisher","unstructured":"Pennington J, Socher R, Manning CD. Glove: Global Vectors for Word Representation In: Moschitti A, Pang B, Daelemans W, editors. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014, October 25-29, 2014, Doha, Qatar, A meeting of SIGDAT, a Special Interest Group of the ACL. ACL: 2014. p. 1532\u201343. https:\/\/doi.org\/10.3115\/v1\/d14-1162.","DOI":"10.3115\/v1\/d14-1162"},{"key":"261_CR6","unstructured":"Peng X, Long G, Shen T, Wang S, Niu Z, Zhang C. MIMO: Mutual Integration of Patient Journey and Medical Ontology for Healthcare Representation Learning. CoRR. 2021; abs\/2107.09288. https:\/\/arxiv.org\/abs\/2107.09288. Accessed 29 July 2021."},{"issue":"6","key":"261_CR7","doi-asserted-by":"publisher","first-page":"801","DOI":"10.1109\/TKDE.2010.152","volume":"23","author":"O Frunza","year":"2011","unstructured":"Frunza O, Inkpen D, Tran T. A machine learning approach for identifying disease-treatment relations in short texts. IEEE Trans Knowl Data Eng. 
2011; 23(6):801\u201314.","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"261_CR8","doi-asserted-by":"publisher","unstructured":"Gazzotti R, Faron-Zucker C, Gandon F, Lacroix-Hugues V, Darmon D. Injecting Domain Knowledge in Electronic Medical Records to Improve Hospitalization Prediction In: Hitzler P, Fern\u00e1ndez M, Janowicz K, Zaveri A, Gray AJG, L\u00f3pez V, Haller A, Hammar K, editors. The Semantic Web - 16th International Conference, ESWC 2019, Portoro\u017e, Slovenia, June 2-6, 2019, Proceedings. Lecture Notes in Computer Science, vol. 11503. Springer: 2019. p. 116\u201330. https:\/\/doi.org\/10.1007\/978-3-030-21348-0_8.","DOI":"10.1007\/978-3-030-21348-0_8"},{"key":"261_CR9","doi-asserted-by":"publisher","unstructured":"Gazzotti R, Faron-Zucker C, Gandon F, Lacroix-Hugues V, Darmon D. Injection of automatically selected DBpedia subjects in electronic medical records to boost hospitalization prediction In: Hung C-C, Cern\u00fd T, Shin D, Bechini A, editors. SAC\u201920: The 35th ACM\/SIGAPP Symposium on Applied Computing, online event, [Brno, Czech Republic], March 30 - April 3, 2020. ACM: 2020. p. 2013\u201320. https:\/\/doi.org\/10.1145\/3341105.3373932.","DOI":"10.1145\/3341105.3373932"},{"key":"261_CR10","unstructured":"Gazzotti R. Knowledge graphs based extension of patients\u2019 files to predict hospitalization. (pr\u00e9diction d\u2019hospitalisation par la g\u00e9n\u00e9ration de caract\u00e9ristiques extraites de graphes de connaissances). PhD thesis, University of C\u00f4te d\u2019Azur, Nice, France. 2020."},{"key":"261_CR11","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4899-3242-6","volume-title":"Generalized Linear Models","author":"P McCullagh","year":"1989","unstructured":"McCullagh P, Nelder JA. Generalized Linear Models. 
London: Chapman & Hall \/ CRC; 1989."},{"issue":"1","key":"261_CR12","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L. Random forests. Mach Learn. 2001; 45(1):5\u201332.","journal-title":"Mach Learn"},{"issue":"3","key":"261_CR13","first-page":"27","volume":"2","author":"C-C Chang","year":"2011","unstructured":"Chang C-C, Lin C-J. Libsvm: a library for support vector machines. ACM Trans Intell Syst Technol (TIST). 2011; 2(3):27.","journal-title":"ACM Trans Intell Syst Technol (TIST)"},{"issue":"1","key":"261_CR14","doi-asserted-by":"publisher","first-page":"198","DOI":"10.1093\/jamia\/ocw042","volume":"24","author":"BA Goldstein","year":"2017","unstructured":"Goldstein BA, Navar AM, Pencina MJ, Ioannidis J. Opportunities and challenges in developing risk prediction models with electronic health records data: a systematic review. J Am Med Inform Assoc. 2017; 24(1):198\u2013208.","journal-title":"J Am Med Inform Assoc"},{"key":"261_CR15","unstructured":"Lafferty JD, McCallum A, Pereira FCN. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. In: Brodley CE, Danyluk AP, editors. Morgan Kaufmann; 2001. p. 282\u20139."},{"issue":"1","key":"261_CR16","doi-asserted-by":"publisher","first-page":"49","DOI":"10.1145\/1882471.1882479","volume":"12","author":"G Forman","year":"2010","unstructured":"Forman G, Scholz M. Apples-to-apples in cross-validation studies: pitfalls in classifier performance measurement. ACM SIGKDD Explor Newsl. 2010; 12(1):49\u201357.","journal-title":"ACM SIGKDD Explor Newsl"},{"key":"261_CR17","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E. Scikit-learn: Machine learning in Python. 
J Mach Learn Res. 2011; 12:2825\u201330.","journal-title":"J Mach Learn Res"},{"issue":"4","key":"261_CR18","doi-asserted-by":"publisher","first-page":"267","DOI":"10.1561\/2200000013","volume":"4","author":"C Sutton","year":"2012","unstructured":"Sutton C, McCallum A, et al. An introduction to conditional random fields. Found Trends\u24c7 Mach Learn. 2012; 4(4):267\u2013373.","journal-title":"Found Trends\u24c7 Mach Learn"},{"issue":"Jul","key":"261_CR19","first-page":"2079","volume":"11","author":"GC Cawley","year":"2010","unstructured":"Cawley GC, Talbot NL. On over-fitting in model selection and subsequent selection bias in performance evaluation. J Mach Learn Res. 2010; 11(Jul):2079\u2013107.","journal-title":"J Mach Learn Res"},{"issue":"Feb","key":"261_CR20","first-page":"281","volume":"13","author":"J Bergstra","year":"2012","unstructured":"Bergstra J, Bengio Y. Random search for hyper-parameter optimization. J Mach Learn Res. 2012; 13(Feb):281\u2013305.","journal-title":"J Mach Learn Res"},{"key":"261_CR21","unstructured":"Gazzotti R. Knowledge graphs based extension of patients\u2019 files to predict hospitalization. PhD thesis, Universit\u00e9 C\u00f4te d\u2019Azur. 2020."},{"key":"261_CR22","doi-asserted-by":"publisher","unstructured":"Daiber J, Jakob M, Hokamp C, Mendes PN. Improving efficiency and accuracy in multilingual entity extraction In: Sabou M, Blomqvist E, Di Noia T, Sack H, Pellegrini T, editors. I-SEMANTICS 2013 - 9th International Conference on Semantic Systems, ISEM \u201913, Graz, Austria, September 4-6, 2013. ACM: 2013. p. 121\u20134. https:\/\/doi.org\/10.1145\/2506182.2506198.","DOI":"10.1145\/2506182.2506198"},{"issue":"1","key":"261_CR23","doi-asserted-by":"publisher","first-page":"61","DOI":"10.1177\/001316447003000105","volume":"30","author":"K Krippendorff","year":"1970","unstructured":"Krippendorff K. Estimating the reliability, systematic error and random error of interval data. Educ Psychol Meas. 
1970; 30(1):61\u201370.","journal-title":"Educ Psychol Meas"},{"issue":"4","key":"261_CR24","doi-asserted-by":"publisher","first-page":"555","DOI":"10.1162\/coli.07-034-R2","volume":"34","author":"R Artstein","year":"2008","unstructured":"Artstein R, Poesio M. Inter-coder agreement for computational linguistics. Comput Linguist. 2008; 34(4):555\u201396.","journal-title":"Comput Linguist"},{"key":"261_CR25","first-page":"338","volume":"1","author":"O Corby","year":"2010","unstructured":"Corby O, Zucker CF. The kgram abstract machine for knowledge graph querying. Web Intell Intell Agent Technol (WI-IAT). 2010; 1:338\u201341.","journal-title":"Web Intell Intell Agent Technol (WI-IAT)"},{"issue":"Jan","key":"261_CR26","first-page":"1","volume":"7","author":"J Dem\u0161ar","year":"2006","unstructured":"Dem\u0161ar J. Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res. 2006; 7(Jan):1\u201330.","journal-title":"J Mach Learn Res"},{"issue":"3","key":"261_CR27","doi-asserted-by":"publisher","first-page":"239","DOI":"10.1023\/A:1024068626366","volume":"52","author":"C Nadeau","year":"2003","unstructured":"Nadeau C, Bengio Y. Inference for the generalization error. Mach Learn. 2003; 52(3):239\u201381.","journal-title":"Mach Learn"},{"issue":"1","key":"261_CR28","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","volume":"58","author":"R Tibshirani","year":"1996","unstructured":"Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodological). 1996; 58(1):267\u201388.","journal-title":"J R Stat Soc Ser B (Methodological)"},{"key":"261_CR29","doi-asserted-by":"publisher","unstructured":"Cunningham H, Maynard D, Bontcheva K, Tablan V. A framework and graphical development environment for robust NLP tools and applications. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, July 6-12, 2002, Philadelphia, PA, USA. ACL: 2002. p. 
168\u201375. https:\/\/doi.org\/10.3115\/1073083.1073112.","DOI":"10.3115\/1073083.1073112"}],"container-title":["Journal of Biomedical Semantics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13326-022-00261-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13326-022-00261-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13326-022-00261-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,9,19]],"date-time":"2024-09-19T00:36:45Z","timestamp":1726706205000},"score":1,"resource":{"primary":{"URL":"https:\/\/jbiomedsem.biomedcentral.com\/articles\/10.1186\/s13326-022-00261-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,22]]},"references-count":29,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["261"],"URL":"https:\/\/doi.org\/10.1186\/s13326-022-00261-9","relation":{},"ISSN":["2041-1480"],"issn-type":[{"value":"2041-1480","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,22]]},"assertion":[{"value":"3 February 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"23 December 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"22 February 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"A declaration was made to CNIL (French supervisory authority for the protection of personal data registration no. 
1585962). An informative poster intended for the patients and explaining the modalities of access and rectification of the data was arranged among the member physicians.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"6"}}