{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,2,21]],"date-time":"2025-02-21T10:56:45Z","timestamp":1740135405560,"version":"3.37.3"},"reference-count":66,"publisher":"Springer Science and Business Media LLC","issue":"S3","license":[{"start":{"date-parts":[[2021,2,1]],"date-time":"2021-02-01T00:00:00Z","timestamp":1612137600000},"content-version":"tdm","delay-in-days":0,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"},{"start":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T00:00:00Z","timestamp":1614124800000},"content-version":"vor","delay-in-days":23,"URL":"http:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"Wenzhou Department of Science and Technology Development","award":["ZG2017020"],"award-info":[{"award-number":["ZG2017020"]}]},{"name":"University of Alabam at Birmingham","award":["startup budget"],"award-info":[{"award-number":["startup budget"]}]},{"name":"National Institute of Health","award":["U54TR002731"],"award-info":[{"award-number":["U54TR002731"]}]},{"DOI":"10.13039\/100000968","name":"American Heart Association","doi-asserted-by":"publisher","award":["data science fellowship award to the Informatics Institute of UAB"],"award-info":[{"award-number":["data science fellowship award to the Informatics Institute of UAB"]}],"id":[{"id":"10.13039\/100000968","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000054","name":"National Cancer Institute","doi-asserted-by":"crossref","award":["U01CA223976"],"award-info":[{"award-number":["U01CA223976"]}],"id":[{"id":"10.13039\/100000054","id-type":"DOI","asserted-by":"crossref"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Med Inform Decis Mak"],"published-print":{"date-parts":[[2021,2]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>In this work, 
we aimed to demonstrate how to utilize lab test results and other clinical information to support precision medicine research and clinical decisions on complex diseases, with the support of electronic medical record facilities. We defined \u201cclinotypes\u201d as clinical information that could be observed and measured objectively using biomedical instruments. From well-known \u2018omic\u2019 problem definitions, we defined problems using clinotype information, including stratifying patients (identifying sub-cohorts of interest for future studies), mining significant associations between clinotypes and specific phenotypes-diseases, and discovering potential linkages between clinotype and genomic information. We solved these problems by integrating public omic databases and applying advanced machine learning and visual analytic techniques on two-year health exam records from a large population of healthy southern Chinese individuals (size n\u2009=\u200991,354). When developing the solution, we carefully addressed the missing information, imbalance and non-uniform data annotation issues.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>We organized the techniques and solutions to address the problems and issues above into the CPA framework (Clinotype Prediction and Association-finding). At the data preprocessing step, we handled the missing value issue with a prediction accuracy of 0.760. We curated 12,635 clinotype-gene associations. We found 147 associations between 147 chronic disease phenotypes and clinotypes, which improved the disease predictive performance to an average AUC of 0.967. 
We mined 182 significant clinotype-clinotype associations among 69 clinotypes.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusions<\/jats:title>\n                <jats:p>Our results showed strong potential connectivity between the omics information and the clinical lab test information. The results further emphasized the need to utilize and integrate the clinical information, especially the lab test results, in future PheWAS and omic studies. Furthermore, they showed that the clinotype information could initiate an alternative research direction and serve as an independent field of data to support the well-known \u2018phenome\u2019 and \u2018genome\u2019 research.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s12911-021-01387-z","type":"journal-article","created":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T07:04:41Z","timestamp":1614150281000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":2,"title":["Linking clinotypes to phenotypes and genotypes from laboratory test results in comprehensive physical 
exams"],"prefix":"10.1186","volume":"21","author":[{"given":"Thanh","family":"Nguyen","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Tongbin","family":"Zhang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Geoffrey","family":"Fox","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Sisi","family":"Zeng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Ni","family":"Cao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chuandi","family":"Pan","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-8829-7504","authenticated-orcid":false,"given":"Jake Y.","family":"Chen","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,2,24]]},"reference":[{"issue":"19","key":"1387_CR1","doi-asserted-by":"publisher","first-page":"1981","DOI":"10.1001\/jama.2018.2009","volume":"319","author":"AK Manrai","year":"2018","unstructured":"Manrai AK, Patel CJ, Ioannidis JPA. In the era of precision medicine and big data, who is normal? JAMA. 2018;319(19):1981\u20132.","journal-title":"JAMA"},{"issue":"4","key":"1387_CR2","doi-asserted-by":"publisher","first-page":"e0123617","DOI":"10.1371\/journal.pone.0123617","volume":"10","author":"S Liu","year":"2015","unstructured":"Liu S, Hou J, Zhang H, Wu Y, Hu M, Zhang L, Xu J, Na R, Jiang H, Ding Q. The evaluation of the risk factors for non-muscle invasive bladder cancer (NMIBC) recurrence after transurethral resection (TURBt) in Chinese population. PLoS ONE. 
2015;10(4):e0123617.","journal-title":"PLoS ONE"},{"key":"1387_CR3","doi-asserted-by":"publisher","first-page":"478","DOI":"10.1111\/biom.12283","volume":"71","author":"BA Goldstein","year":"2015","unstructured":"Goldstein BA, Assimes T, Winkelmayer WC, Hastie T. Detecting clinically meaningful biomarkers with repeated measurements: an illustration with electronic health records. Biometrics. 2015;71:478\u201386.","journal-title":"Biometrics"},{"issue":"5","key":"1387_CR4","doi-asserted-by":"publisher","first-page":"1103","DOI":"10.1377\/hlthaff.24.5.1103","volume":"24","author":"R Hillestad","year":"2005","unstructured":"Hillestad R, Bigelow J, Bower A, Girosi F, Meili R, Scoville R, Taylor R. Can electronic medical record systems transform health care? Potential health benefits, savings, and costs. Health Aff (Millwood). 2005;24(5):1103\u201317.","journal-title":"Health Aff (Millwood)"},{"key":"1387_CR5","doi-asserted-by":"publisher","first-page":"137","DOI":"10.1186\/1472-6963-10-137","volume":"10","author":"L Martirosyan","year":"2010","unstructured":"Martirosyan L, Arah OA, Haaijer-Ruskamp FM, Braspenning J, Denig P. Methods to identify the target population: implications for prescribing quality indicators. BMC health services research. 2010;10:137.","journal-title":"BMC health services research"},{"key":"1387_CR6","doi-asserted-by":"publisher","first-page":"30","DOI":"10.1186\/1472-6947-6-30","volume":"6","author":"QT Zeng","year":"2006","unstructured":"Zeng QT, Goryachev S, Weiss S, Sordo M, Murphy SN, Lazarus R. Extracting principal diagnosis, co-morbidity and smoking status for asthma research: evaluation of a natural language processing system. BMC Med Inform Decis Mak. 
2006;6:30.","journal-title":"BMC Med Inform Decis Mak"},{"issue":"8","key":"1387_CR7","doi-asserted-by":"publisher","first-page":"e1002141","DOI":"10.1371\/journal.pcbi.1002141","volume":"7","author":"FS Roque","year":"2011","unstructured":"Roque FS, Jensen PB, Schmock H, Dalgaard M, Andreatta M, Hansen T, Soeby K, Bredkjaer S, Juul A, Werge T, et al. Using electronic patient records to discover disease correlations and stratify patient cohorts. PLoS Comput Biol. 2011;7(8):e1002141.","journal-title":"PLoS Comput Biol"},{"issue":"Suppl 9","key":"1387_CR8","doi-asserted-by":"publisher","first-page":"S7","DOI":"10.1186\/1471-2105-11-S9-S7","volume":"11","author":"R Harpaz","year":"2010","unstructured":"Harpaz R, Chase HS, Friedman C. Mining multi-item drug adverse effect associations in spontaneous reporting systems. BMC Bioinform. 2010;11(Suppl 9):S7.","journal-title":"BMC Bioinform"},{"issue":"1","key":"1387_CR9","doi-asserted-by":"publisher","first-page":"179","DOI":"10.1136\/amiajnl-2014-002649","volume":"22","author":"H Xu","year":"2015","unstructured":"Xu H, Aldrich MC, Chen Q, Liu H, Peterson NB, Dai Q, Levy M, Shah A, Han X, Ruan X, et al. Validating drug repurposing signals using electronic health records: a case study of metformin associated with reduced cancer mortality. J Am Med Inform Assoc. 2015;22(1):179\u201391.","journal-title":"J Am Med Inform Assoc"},{"key":"1387_CR10","doi-asserted-by":"publisher","first-page":"212278","DOI":"10.7573\/dic.212278","volume":"4","author":"MH Roberts","year":"2015","unstructured":"Roberts MH, Mapel DW, Von Worley A, Beene J. Clinical factors, including All Patient Refined Diagnosis Related Group severity, as predictors of early rehospitalization after COPD exacerbation. Drugs Context. 2015;4:212278.","journal-title":"Drugs Context"},{"issue":"2","key":"1387_CR11","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1309\/LM404L0HHUTWWUDD","volume":"40","author":"FH Wians","year":"2009","unstructured":"Wians FH. 
Clinical laboratory tests: which, why, and what do the results mean? Lab Med. 2009;40(2):105\u201313.","journal-title":"Lab Med"},{"issue":"7","key":"1387_CR12","doi-asserted-by":"publisher","first-page":"e0180332","DOI":"10.1371\/journal.pone.0180332","volume":"12","author":"JH Kim","year":"2017","unstructured":"Kim JH, Lim S, Park KS, Jang HC, Choi SH. Total and differential WBC counts are related with coronary artery atherosclerosis and increase the risk for cardiovascular disease in Koreans. PLoS ONE. 2017;12(7):e0180332.","journal-title":"PLoS ONE"},{"issue":"1","key":"1387_CR13","doi-asserted-by":"publisher","first-page":"e5","DOI":"10.2196\/medinform.3172","volume":"2","author":"T Adamusiak","year":"2014","unstructured":"Adamusiak T, Shimoyama N, Shimoyama M. Next generation phenotyping using the unified medical language system. JMIR Med Inform. 2014;2(1):e5.","journal-title":"JMIR Med Inform"},{"issue":"2\u20133","key":"1387_CR14","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1016\/j.ijmedinf.2006.05.008","volume":"76","author":"R Lenz","year":"2007","unstructured":"Lenz R, Beyer M, Kuhn KA. Semantic integration in healthcare networks. Int J Med Inform. 2007;76(2\u20133):201\u20137.","journal-title":"Int J Med Inform"},{"issue":"16","key":"1387_CR15","doi-asserted-by":"publisher","first-page":"1738","DOI":"10.1056\/NEJMsb0800209","volume":"358","author":"RD Kush","year":"2008","unstructured":"Kush RD, Helton E, Rockhold FW, Hardison CD. Electronic health records, medical research, and the Tower of Babel. N Engl J Med. 2008;358(16):1738\u201340.","journal-title":"N Engl J Med"},{"issue":"5","key":"1387_CR16","doi-asserted-by":"publisher","first-page":"375","DOI":"10.2345\/i0899-8205-40-5-375.1","volume":"40","author":"J Kabachinski","year":"2006","unstructured":"Kabachinski J. What is health level 7? Biomed Instrum Technol Assoc Adv Med Instrum. 
2006;40(5):375\u20139.","journal-title":"Biomed Instrum Technol Assoc Adv Med Instrum"},{"key":"1387_CR17","first-page":"153","volume":"115","author":"D Kalra","year":"2005","unstructured":"Kalra D, Beale T, Heard S. The openEHR foundation. Stud Health Technol Inform. 2005;115:153\u201373.","journal-title":"Stud Health Technol Inform"},{"issue":"5","key":"1387_CR18","doi-asserted-by":"publisher","first-page":"507","DOI":"10.1136\/jamia.2009.001560","volume":"17","author":"GK Savova","year":"2010","unstructured":"Savova GK, Masanz JJ, Ogren PV, Zheng J, Sohn S, Kipper-Schuler KC, Chute CG. Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. J Am Med Inform Assoc. 2010;17(5):507\u201313.","journal-title":"J Am Med Inform Assoc"},{"issue":"Database issue","key":"1387_CR19","doi-asserted-by":"publisher","first-page":"D789","DOI":"10.1093\/nar\/gku1205","volume":"43","author":"JS Amberger","year":"2015","unstructured":"Amberger JS, Bocchini CA, Schiettecatte F, Scott AF, Hamosh A. OMIM.org: Online Mendelian Inheritance in Man (OMIM(R)), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 2015;43(Database issue):D789\u201398.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"1387_CR20","doi-asserted-by":"publisher","first-page":"144","DOI":"10.1038\/ejhg.2013.96","volume":"22","author":"EM Ramos","year":"2014","unstructured":"Ramos EM, Hoffman D, Junkins HA, Maglott D, Phan L, Sherry ST, Feolo M, Hindorff LA. Phenotype-Genotype Integrator (PheGenI): synthesizing genome-wide association study (GWAS) data with existing genomic resources. Eur J Hum Genet. 2014;22(1):144\u20137.","journal-title":"Eur J Hum Genet"},{"issue":"3","key":"1387_CR21","doi-asserted-by":"publisher","first-page":"e89204","DOI":"10.1371\/journal.pone.0089204","volume":"9","author":"B Greshake","year":"2014","unstructured":"Greshake B, Bayer PE, Rausch H, Reda J. 
openSNP\u2013a crowdsourced web resource for personal genomics. PLoS ONE. 2014;9(3):e89204.","journal-title":"PLoS ONE"},{"issue":"3","key":"1387_CR22","doi-asserted-by":"publisher","first-page":"328","DOI":"10.1197\/jamia.M3028","volume":"16","author":"X Wang","year":"2009","unstructured":"Wang X, Hripcsak G, Markatou M, Friedman C. Active computerized pharmacovigilance using natural language processing, statistics, and electronic health records: a feasibility study. J Am Med Inform Assoc JAMIA. 2009;16(3):328\u201337.","journal-title":"J Am Med Inform Assoc JAMIA"},{"issue":"12","key":"1387_CR23","doi-asserted-by":"publisher","first-page":"e84","DOI":"10.1016\/j.ijmedinf.2009.04.007","volume":"78","author":"A Oztekin","year":"2009","unstructured":"Oztekin A, Delen D, Kong ZJ. Predicting the graft survival for heart-lung transplantation patients: an integrated data mining methodology. Int J Med Inform. 2009;78(12):e84-96.","journal-title":"Int J Med Inform"},{"issue":"1","key":"1387_CR24","doi-asserted-by":"publisher","first-page":"33","DOI":"10.1016\/j.artmed.2010.01.002","volume":"49","author":"D Delen","year":"2010","unstructured":"Delen D, Oztekin A, Kong ZJ. A machine learning-based approach to prognostic analysis of thoracic transplantations. Artif Intell Med. 2010;49(1):33\u201342.","journal-title":"Artif Intell Med"},{"key":"1387_CR25","doi-asserted-by":"publisher","first-page":"419","DOI":"10.1146\/annurev.publhealth.012809.103649","volume":"31","author":"RD Gibbons","year":"2010","unstructured":"Gibbons RD, Amatya AK, Brown CH, Hur K, Marcus SM, Bhaumik DK, Mann JJ. Post-approval drug safety surveillance. Annu Rev Public Health. 2010;31:419\u201337.","journal-title":"Annu Rev Public Health"},{"key":"1387_CR26","doi-asserted-by":"crossref","unstructured":"Cox DR. Regression models and life-tables. In: Breakthroughs in statistics. Springer; 1992. p. 
527\u2013541.","DOI":"10.1007\/978-1-4612-4380-9_37"},{"issue":"2","key":"1387_CR27","doi-asserted-by":"publisher","first-page":"113","DOI":"10.1016\/j.artmed.2004.07.002","volume":"34","author":"D Delen","year":"2005","unstructured":"Delen D, Walker G, Kadam A. Predicting breast cancer survivability: a comparison of three data mining methods. Artif Intell Med. 2005;34(2):113\u201327.","journal-title":"Artif Intell Med"},{"issue":"e1","key":"1387_CR28","doi-asserted-by":"publisher","first-page":"e118","DOI":"10.1136\/amiajnl-2012-001360","volume":"20","author":"JS Mathias","year":"2013","unstructured":"Mathias JS, Agrawal A, Feinglass J, Cooper AJ, Baker DW, Choudhary A. Development of a 5 year life expectancy index in older adults using predictive mining of electronic health record data. J Am Med Inform Assoc. 2013;20(e1):e118-124.","journal-title":"J Am Med Inform Assoc"},{"issue":"3","key":"1387_CR29","doi-asserted-by":"publisher","first-page":"283","DOI":"10.1097\/MLR.0000000000000315","volume":"53","author":"E Shadmi","year":"2015","unstructured":"Shadmi E, Flaks-Manov N, Hoshen M, Goldman O, Bitterman H, Balicer RD. Predicting 30-day readmissions with preadmission electronic health record data. Med Care. 2015;53(3):283\u20139.","journal-title":"Med Care"},{"issue":"1","key":"1387_CR30","doi-asserted-by":"publisher","first-page":"155","DOI":"10.1136\/amiajnl-2014-002768","volume":"22","author":"CM Rochefort","year":"2015","unstructured":"Rochefort CM, Verma AD, Eguale T, Lee TC, Buckeridge DL. A novel method of adverse event detection can accurately identify venous thromboembolisms (VTEs) from narrative electronic health record data. J Am Med Inform Assoc. 2015;22(1):155\u201365.","journal-title":"J Am Med Inform Assoc"},{"issue":"4","key":"1387_CR31","doi-asserted-by":"publisher","first-page":"498","DOI":"10.1136\/amiajnl-2011-000217","volume":"18","author":"AA Boxwala","year":"2011","unstructured":"Boxwala AA, Kim J, Grillo JM, Ohno-Machado L. 
Using statistical and machine learning to help institutions detect suspicious access to electronic health records. J Am Med Inform Assoc. 2011;18(4):498\u2013505.","journal-title":"J Am Med Inform Assoc"},{"issue":"9","key":"1387_CR32","doi-asserted-by":"publisher","first-page":"1205","DOI":"10.1093\/bioinformatics\/btq126","volume":"26","author":"JC Denny","year":"2010","unstructured":"Denny JC, Ritchie MD, Basford MA, Pulley JM, Bastarache L, Brown-Gentry K, Wang D, Masys DR, Roden DM, Crawford DC. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics. 2010;26(9):1205\u201310.","journal-title":"Bioinformatics"},{"issue":"5","key":"1387_CR33","doi-asserted-by":"publisher","first-page":"490","DOI":"10.1093\/jamia\/ocz017","volume":"26","author":"TM Herr","year":"2019","unstructured":"Herr TM, Peterson JF, Rasmussen LV, Caraballo PJ, Peissig PL, Starren JB. Corrigendum to: Pharmacogenomic clinical decision support design and multi-site process outcomes analysis in the eMERGE Network. J Am Med Inform Assoc. 2019;26(5):490.","journal-title":"J Am Med Inform Assoc"},{"issue":"2","key":"1387_CR34","doi-asserted-by":"publisher","first-page":"124","DOI":"10.1136\/jamia.2009.000893","volume":"17","author":"SN Murphy","year":"2010","unstructured":"Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, Kohane I. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc. 2010;17(2):124\u201330.","journal-title":"J Am Med Inform Assoc"},{"issue":"13","key":"1387_CR35","doi-asserted-by":"publisher","first-page":"1355","DOI":"10.1001\/jama.2016.11076","volume":"316","author":"MJ Joyner","year":"2016","unstructured":"Joyner MJ, Paneth N, Ioannidis JP. What Happens When Underperforming Big Ideas in Research Become Entrenched? JAMA. 
2016;316(13):1355\u20136.","journal-title":"JAMA"},{"issue":"12","key":"1387_CR36","doi-asserted-by":"publisher","first-page":"e1002823","DOI":"10.1371\/journal.pcbi.1002823","volume":"8","author":"JC Denny","year":"2012","unstructured":"Denny JC. Mining electronic health records in the genomics era. PLoS Comput Biol. 2012;8(12):e1002823.","journal-title":"PLoS Comput Biol"},{"key":"1387_CR37","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1146\/annurev.publhealth.25.102802.124410","volume":"25","author":"TE Raghunathan","year":"2004","unstructured":"Raghunathan TE. What do we do with missing data? Some options for analysis of incomplete data. Annu Rev Public Health. 2004;25:99\u2013117.","journal-title":"Annu Rev Public Health"},{"key":"1387_CR38","doi-asserted-by":"publisher","first-page":"524","DOI":"10.1016\/j.ijmedinf.2015.03.005","volume":"84","author":"A Moreno-Conde","year":"2015","unstructured":"Moreno-Conde A, Jodar-Sanchez F, Kalra D. Requirements for clinical information modelling tools. Int J Med Inform. 2015;84:524\u201336.","journal-title":"Int J Med Inform"},{"issue":"e2","key":"1387_CR39","doi-asserted-by":"publisher","first-page":"e232","DOI":"10.1136\/amiajnl-2013-001932","volume":"20","author":"MR Boland","year":"2013","unstructured":"Boland MR, Hripcsak G, Shen Y, Chung WK, Weng C. Defining a comprehensive verotype using electronic health records for personalized medicine. J Am Med Inform Assoc. 2013;20(e2):e232-238.","journal-title":"J Am Med Inform Assoc"},{"key":"1387_CR40","doi-asserted-by":"publisher","first-page":"925","DOI":"10.1093\/jamia\/ocv008","volume":"22","author":"A Moreno-Conde","year":"2015","unstructured":"Moreno-Conde A, Moner D, Cruz WD, Santos MR, Maldonado JA, Robles M, Kalra D. Clinical information modeling processes for semantic interoperability of electronic health records: systematic review and inductive analysis. J Am Med Inform Assoc. 
2015;22:925\u201334.","journal-title":"J Am Med Inform Assoc"},{"issue":"21","key":"1387_CR41","doi-asserted-by":"publisher","first-page":"8685","DOI":"10.1073\/pnas.0701361104","volume":"104","author":"KI Goh","year":"2007","unstructured":"Goh KI, Cusick ME, Valle D, Childs B, Vidal M, Barabasi AL. The human disease network. Proc Natl Acad Sci USA. 2007;104(21):8685\u201390.","journal-title":"Proc Natl Acad Sci USA"},{"issue":"5","key":"1387_CR42","doi-asserted-by":"publisher","first-page":"429","DOI":"10.3233\/IDA-2002-6504","volume":"6","author":"N Japkowicz","year":"2002","unstructured":"Japkowicz N, Stephen S. The class imbalance problem: a systematic study. Intell Data Anal. 2002;6(5):429\u201349.","journal-title":"Intell Data Anal"},{"issue":"2","key":"1387_CR43","doi-asserted-by":"publisher","first-page":"579","DOI":"10.1109\/JBHI.2016.2634587","volume":"22","author":"G Wang","year":"2018","unstructured":"Wang G, Deng Z, Choi KS. Tackling missing data in community health studies using additive LS-SVM classifier. IEEE J Biomed Health Inform. 2018;22(2):579\u201387.","journal-title":"IEEE J Biomed Health Inform"},{"key":"1387_CR44","volume-title":"Statistical analysis with missing data","author":"RJ Little","year":"2019","unstructured":"Little RJ, Rubin DB. Statistical analysis with missing data, vol. 793. Hoboken: Wiley; 2019."},{"key":"1387_CR45","unstructured":"Smola AJ, Scholkopf B. A tutorial on support vector regression, Berlin, Germany. NeuroCOLT2 Technical Report Series; 1998."},{"issue":"2","key":"1387_CR46","first-page":"223","volume":"35","author":"DA Salazar","year":"2012","unstructured":"Salazar DA, V\u00e9lez JI, Salazar JC. Comparison between SVM and logistic regression: which one is better to discriminate? Rev Colomb Estad. 2012;35(2):223\u201337.","journal-title":"Rev Colomb Estad"},{"key":"1387_CR47","unstructured":"Ibm I. CPLEX optimizer. 
2010."},{"issue":"1","key":"1387_CR48","doi-asserted-by":"publisher","first-page":"18","DOI":"10.1111\/j.0824-7935.2004.t01-1-00228.x","volume":"20","author":"A Estabrooks","year":"2014","unstructured":"Estabrooks A, Jo T, Japkowicz N. A multiple sampling method for learning from imbalanced data sets. Comput Intell. 2014;20(1):18\u201336.","journal-title":"Comput Intell"},{"issue":"D1","key":"1387_CR49","doi-asserted-by":"publisher","first-page":"D668","DOI":"10.1093\/nar\/gkx1040","volume":"46","author":"Z Yue","year":"2018","unstructured":"Yue Z, Zheng Q, Neylon MT, Yoo M, Shin J, Zhao Z, Tan AC, Chen JY. PAGER 2.0: an update to the pathway, annotated-list and gene-signature electronic repository for Human Network Biology. Nucleic Acids Res. 2018;46(D1):D668\u201376.","journal-title":"Nucleic Acids Res"},{"issue":"12","key":"1387_CR50","doi-asserted-by":"publisher","first-page":"i250","DOI":"10.1093\/bioinformatics\/btv265","volume":"31","author":"Z Yue","year":"2015","unstructured":"Yue Z, Kshirsagar MM, Nguyen T, Suphavilai C, Neylon MT, Zhu L, Ratliff T, Chen JY. PAGER: constructing PAGs and new PAG-PAG relationships for network biology. Bioinformatics. 2015;31(12):i250-257.","journal-title":"Bioinformatics"},{"issue":"Database issue","key":"1387_CR51","doi-asserted-by":"publisher","first-page":"D691","DOI":"10.1093\/nar\/gkq1018","volume":"39","author":"D Croft","year":"2011","unstructured":"Croft D, O\u2019Kelly G, Wu G, Haw R, Gillespie M, Matthews L, Caudy M, Garapati P, Gopinath G, Jassal B, et al. Reactome: a database of reactions, pathways and biological processes. Nucleic Acids Res. 2011;39(Database issue):D691\u20137.","journal-title":"Nucleic Acids Res"},{"issue":"D1","key":"1387_CR52","doi-asserted-by":"publisher","first-page":"D649","DOI":"10.1093\/nar\/gkx1132","volume":"46","author":"A Fabregat","year":"2018","unstructured":"Fabregat A, Jupe S, Matthews L, Sidiropoulos K, Gillespie M, Garapati P, Haw R, Jassal B, Korninger F, May B, et al. 
The reactome pathway knowledgebase. Nucleic Acids Res. 2018;46(D1):D649\u201355.","journal-title":"Nucleic Acids Res"},{"key":"1387_CR53","doi-asserted-by":"crossref","unstructured":"Baxevanis AD. Searching Online Mendelian Inheritance in Man (OMIM) for information on genetic loci involved in human disease. Current protocols in human genetics\/editorial board, Jonathan L Haines [et al] 2012, Chapter 9:Unit 9 13. 11\u201310.","DOI":"10.1002\/0471142905.hg0913s73"},{"issue":"D1","key":"1387_CR54","doi-asserted-by":"publisher","first-page":"D1038","DOI":"10.1093\/nar\/gky1151","volume":"47","author":"JS Amberger","year":"2019","unstructured":"Amberger JS, Bocchini CA, Scott AF, Hamosh A. OMIM.org: leveraging knowledge across phenotype-gene relationships. Nucleic Acids Res. 2019;47(D1):D1038\u201343.","journal-title":"Nucleic Acids Res"},{"issue":"Database issue","key":"1387_CR55","doi-asserted-by":"publisher","first-page":"D1060","DOI":"10.1093\/nar\/gkr901","volume":"40","author":"AC Culhane","year":"2012","unstructured":"Culhane AC, Schroder MS, Sultana R, Picard SC, Martinelli EN, Kelly C, Haibe-Kains B, Kapushesky M, St Pierre AA, Flahive W, et al. GeneSigDB: a manually curated database and resource for analysis of gene expression signatures. Nucleic Acids Res. 2012;40(Database issue):D1060\u20136.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"1387_CR56","doi-asserted-by":"publisher","first-page":"44","DOI":"10.1038\/nprot.2008.211","volume":"4","author":"W da Huang","year":"2009","unstructured":"da Huang W, Sherman BT, Lempicki RA. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 
2009;4(1):44\u201357.","journal-title":"Nat Protoc"},{"issue":"Web Server issu","key":"1387_CR57","doi-asserted-by":"publisher","first-page":"W169","DOI":"10.1093\/nar\/gkm415","volume":"35","author":"W da Huang","year":"2007","unstructured":"da Huang W, Sherman BT, Tan Q, Kir J, Liu D, Bryant D, Guo Y, Stephens R, Baseler MW, Lane HC, et al. DAVID Bioinformatics Resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 2007;35(Web Server issue):W169\u201375.","journal-title":"Nucleic Acids Res"},{"key":"1387_CR58","volume-title":"Introduction to statistics and data analysis","author":"R Peck","year":"2015","unstructured":"Peck R, Olsen C, Devore JL. Introduction to statistics and data analysis. Boston: Cengage Learning; 2015."},{"key":"1387_CR59","doi-asserted-by":"publisher","DOI":"10.1017\/CBO9780511810114","volume-title":"Data mining and analysis: fundamental concepts and algorithms","author":"MJ Zaki","year":"2014","unstructured":"Zaki MJ, Meira W Jr. Data mining and analysis: fundamental concepts and algorithms. 1st ed. Cambridge: Cambridge University Press; 2014.","edition":"1"},{"issue":"3","key":"1387_CR60","first-page":"18","volume":"2","author":"A Liaw","year":"2002","unstructured":"Liaw A, Wiener M. Classification and regression by randomForest. R News. 2002;2(3):18\u201322.","journal-title":"R News"},{"issue":"1","key":"1387_CR61","doi-asserted-by":"publisher","first-page":"10","DOI":"10.1145\/1656274.1656278","volume":"11","author":"M Hall","year":"2009","unstructured":"Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. ACM SIGKDD Explor Newslett. 2009;11(1):10\u20138.","journal-title":"ACM SIGKDD Explor Newslett"},{"key":"1387_CR62","doi-asserted-by":"publisher","first-page":"26094","DOI":"10.1038\/srep26094","volume":"6","author":"R Miotto","year":"2016","unstructured":"Miotto R, Li L, Kidd BA, Dudley JT. 
Deep patient: an unsupervised representation to predict the future of patients from the electronic health records. Sci Rep. 2016;6:26094.","journal-title":"Sci Rep"},{"key":"1387_CR63","doi-asserted-by":"crossref","unstructured":"Choi JY, Bae S-H, Qiu X, Fox G. High performance dimension reduction and visualization for large high-dimensional data analysis. In: Proceedings of the 2010 10th IEEE\/ACM international conference on cluster, cloud and grid computing. IEEE Computer Society. 2010; 331\u2013340.","DOI":"10.1109\/CCGRID.2010.104"},{"issue":"02","key":"1387_CR64","doi-asserted-by":"publisher","first-page":"1340006","DOI":"10.1142\/S0129626413400069","volume":"23","author":"G Fox","year":"2013","unstructured":"Fox G. Robust scalable visualized clustering in vector and non vector semi-metric spaces. Parallel Process Lett. 2013;23(02):1340006.","journal-title":"Parallel Process Lett"},{"issue":"1","key":"1387_CR65","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/34.566806","volume":"19","author":"T Hofmann","year":"1997","unstructured":"Hofmann T, Buhmann JM. Pairwise data clustering by deterministic annealing. IEEE Trans Pattern Anal Mach Intell. 1997;19(1):1\u201314.","journal-title":"IEEE Trans Pattern Anal Mach Intell"},{"key":"1387_CR66","doi-asserted-by":"publisher","first-page":"53","DOI":"10.1016\/0377-0427(87)90125-7","volume":"20","author":"P Rousseeuw","year":"1987","unstructured":"Rousseeuw P. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. Comput Appl Math. 
1987;20:53\u201365.","journal-title":"Comput Appl Math"}],"container-title":["BMC Medical Informatics and Decision Making"],"original-title":[],"language":"en","link":[{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01387-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/article\/10.1186\/s12911-021-01387-z\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"http:\/\/link.springer.com\/content\/pdf\/10.1186\/s12911-021-01387-z.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,2,24]],"date-time":"2021-02-24T07:06:10Z","timestamp":1614150370000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcmedinformdecismak.biomedcentral.com\/articles\/10.1186\/s12911-021-01387-z"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,2]]},"references-count":66,"journal-issue":{"issue":"S3","published-print":{"date-parts":[[2021,2]]}},"alternative-id":["1387"],"URL":"https:\/\/doi.org\/10.1186\/s12911-021-01387-z","relation":{},"ISSN":["1472-6947"],"issn-type":[{"type":"electronic","value":"1472-6947"}],"subject":[],"published":{"date-parts":[[2021,2]]},"assertion":[{"value":"11 November 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"6 January 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"24 February 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"The research protocol in this work was approved by Wenzhou Municipal Science and Technology Bureau and The First Affiliated Hospital, Wenzhou Medical University, Wenzhou, Zhejiang, China. 
This is in accordance with the scientific description in Project Number ZG2017020, titled \u201cResearch and Development of Disease Prevention and Prediction System Based on Cloud Computing and Medical Big Data\u201d. Since the protocol used a large number of individuals\u2019 medical records, it was practically impossible to obtain consent from all participants. Therefore, the consent requirement was waived. All authors have completed the training required by the Institutional Review Board in this project.","order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"This work does not include any identifiable details related to individuals.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that this work has no competing interests.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"51"}}