{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,5,2]],"date-time":"2026-05-02T09:17:21Z","timestamp":1777713441268,"version":"3.51.4"},"reference-count":126,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2024,2,2]],"date-time":"2024-02-02T00:00:00Z","timestamp":1706832000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2024,2,2]],"date-time":"2024-02-02T00:00:00Z","timestamp":1706832000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"name":"National Research Foundation of Korea (NRF) grant funded by the Korea government","award":["2018R1C1B6008277"],"award-info":[{"award-number":["2018R1C1B6008277"]}]},{"name":"Bio & Medical Technology Development Program of the National Research Foundation (NRF) funded by the Korean government","award":["2019M3E5D3073365"],"award-info":[{"award-number":["2019M3E5D3073365"]}]},{"name":"National Biobank of Korea, the Korea Disease Control and Prevention Agency, Republic of Korea","award":["KBN-2020-106"],"award-info":[{"award-number":["KBN-2020-106"]}]},{"name":"Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea governmen","award":["No.RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)"],"award-info":[{"award-number":["No.RS-2022-00155885, Artificial Intelligence Convergence Innovation Human Resources Development (Hanyang University ERICA)"]}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"abstract":"<jats:title>Abstract<\/jats:title><jats:sec><jats:title>Background<\/jats:title><jats:p>Genome-wide association studies have successfully identified genetic variants 
associated with human disease. Various statistical approaches based on penalized and machine learning methods have recently been proposed for disease prediction. In this study, we evaluated the performance of several such methods for predicting asthma using the Korean Chip (KORV1.1) from the Korean Genome and Epidemiology Study (KoGES).<\/jats:p><\/jats:sec><jats:sec><jats:title>Results<\/jats:title><jats:p>First, single-nucleotide polymorphisms were selected via single-variant tests using logistic regression with the adjustment of several epidemiological factors. Next, we evaluated the following methods for disease prediction: ridge, least absolute shrinkage and selection operator, elastic net, smoothly clipped absolute deviation, support vector machine, random forest, boosting, bagging, na\u00efve Bayes, and <jats:italic>k<\/jats:italic>-nearest neighbor. Finally, we compared their predictive performance based on the area under the curve of the receiver operating characteristic curves, precision, recall, F1-score, Cohen\u2032s Kappa, balanced accuracy, error rate, Matthews correlation coefficient, and area under the precision-recall curve. Additionally, three oversampling algorithms are used to deal with imbalance problems.<\/jats:p><\/jats:sec><jats:sec><jats:title>Conclusions<\/jats:title><jats:p>Our results show that penalized methods exhibit better predictive performance for asthma than that achieved via machine learning methods. 
On the other hand, in the oversampling study, random forest and boosting methods overall showed better prediction performance than penalized methods.<\/jats:p><\/jats:sec>","DOI":"10.1186\/s12859-024-05677-x","type":"journal-article","created":{"date-parts":[[2024,2,2]],"date-time":"2024-02-02T19:02:35Z","timestamp":1706900555000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":7,"title":["Evaluation of penalized and machine learning methods for asthma disease prediction in the Korean Genome and Epidemiology Study (KoGES)"],"prefix":"10.1186","volume":"25","author":[{"given":"Yongjun","family":"Choi","sequence":"first","affiliation":[]},{"given":"Junho","family":"Cha","sequence":"additional","affiliation":[]},{"given":"Sungkyoung","family":"Choi","sequence":"additional","affiliation":[]}],"member":"297","published-online":{"date-parts":[[2024,2,2]]},"reference":[{"issue":"1","key":"5677_CR1","doi-asserted-by":"publisher","first-page":"2","DOI":"10.5334\/aogh.2412","volume":"85","author":"O Enilari","year":"2019","unstructured":"Enilari O, Sinha S. The global impact of asthma in adult populations. Ann Glob Health. 2019;85(1):2.","journal-title":"Ann Glob Health"},{"issue":"1 Suppl","key":"5677_CR2","doi-asserted-by":"publisher","first-page":"4S","DOI":"10.1378\/chest.130.1_suppl.4S","volume":"130","author":"SS Braman","year":"2006","unstructured":"Braman SS. The global burden of asthma. Chest. 2006;130(1 Suppl):4S-12S.","journal-title":"Chest"},{"issue":"9","key":"5677_CR3","doi-asserted-by":"publisher","first-page":"691","DOI":"10.1016\/S2213-2600(17)30293-X","volume":"5","author":"GCRD Collaborators","year":"2017","unstructured":"Collaborators GCRD. Global, regional, and national deaths, prevalence, disability-adjusted life years, and years lived with disability for chronic obstructive pulmonary disease and asthma, 1990\u20132015: a systematic analysis for the Global Burden of Disease Study 2015. 
Lancet Respir Med. 2017;5(9):691.","journal-title":"Lancet Respir Med"},{"issue":"9743","key":"5677_CR4","doi-asserted-by":"publisher","first-page":"803","DOI":"10.1016\/S0140-6736(10)61087-2","volume":"376","author":"PG Gibson","year":"2010","unstructured":"Gibson PG, McDonald VM, Marks GB. Asthma in older adults. Lancet. 2010;376(9743):803\u201313.","journal-title":"Lancet"},{"issue":"3","key":"5677_CR5","doi-asserted-by":"publisher","first-page":"298","DOI":"10.5021\/ad.2015.27.3.298","volume":"27","author":"C Kim","year":"2015","unstructured":"Kim C, Park KY, Ahn S, Kim DH, Li K, Kim DW, Kim MB, Jo SJ, Yim HW, Seo SJ. Economic Impact of Atopic Dermatitis in Korean Patients. Ann Dermatol. 2015;27(3):298\u2013305.","journal-title":"Ann Dermatol"},{"issue":"12","key":"5677_CR6","doi-asserted-by":"publisher","DOI":"10.1038\/cti.2017.54","volume":"6","author":"CT Vicente","year":"2017","unstructured":"Vicente CT, Revez JA, Ferreira MAR. Lessons from ten years of genome-wide association studies of asthma. Clin Transl Immunol. 2017;6(12): e165.","journal-title":"Clin Transl Immunol"},{"issue":"5","key":"5677_CR7","doi-asserted-by":"publisher","first-page":"2412","DOI":"10.3390\/ijms22052412","volume":"22","author":"P Ntontsi","year":"2021","unstructured":"Ntontsi P, Photiades A, Zervas E, Xanthou G, Samitas K. Genetics and epigenetics in asthma. Int J Mol Sci. 2021;22(5):2412.","journal-title":"Int J Mol Sci"},{"issue":"2","key":"5677_CR8","doi-asserted-by":"publisher","first-page":"170","DOI":"10.4168\/aair.2019.11.2.170","volume":"11","author":"KW Kim","year":"2019","unstructured":"Kim KW, Ober C. Lessons Learned From GWAS of Asthma. Allergy Asthma Immunol Res. 
2019;11(2):170\u201387.","journal-title":"Allergy Asthma Immunol Res"},{"issue":"1","key":"5677_CR9","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1016\/S2213-2600(18)30389-8","volume":"7","author":"N Shrine","year":"2019","unstructured":"Shrine N, Portelli MA, John C, Soler Artigas M, Bennett N, Hall R, Lewis J, Henry AP, Billington CK, Ahmad A, et al. Moderate-to-severe asthma in individuals of European ancestry: a genome-wide association study. Lancet Respir Med. 2019;7(1):20\u201334.","journal-title":"Lancet Respir Med"},{"issue":"1","key":"5677_CR10","doi-asserted-by":"publisher","first-page":"880","DOI":"10.1038\/s41467-019-08469-7","volume":"10","author":"M Daya","year":"2019","unstructured":"Daya M, Rafaels N, Brunetti TM, Chavan S, Levin AM, Shetty A, Gignoux CR, Boorgula MP, Wojcik G, Campbell M, et al. Association study in African-admixed populations across the Americas recapitulates asthma risk loci in non-African populations. Nat Commun. 2019;10(1):880.","journal-title":"Nat Commun"},{"issue":"4","key":"5677_CR11","doi-asserted-by":"publisher","first-page":"665","DOI":"10.1016\/j.ajhg.2019.02.022","volume":"104","author":"MAR Ferreira","year":"2019","unstructured":"Ferreira MAR, Mathur R, Vonk JM, Szwajda A, Brumpton B, Granell R, Brew BK, Ullemar V, Lu Y, Jiang Y, et al. Genetic architectures of childhood- and adult-onset asthma are partly distinct. Am J Hum Genet. 2019;104(4):665\u201384.","journal-title":"Am J Hum Genet"},{"issue":"23","key":"5677_CR12","doi-asserted-by":"publisher","first-page":"4022","DOI":"10.1093\/hmg\/ddz175","volume":"28","author":"A Johansson","year":"2019","unstructured":"Johansson A, Rask-Andersen M, Karlsson T, Ek WE. Genome-wide association analysis of 350 000 Caucasians from the UK Biobank identifies novel loci for asthma, hay fever and eczema. Hum Mol Genet. 
2019;28(23):4022\u201341.","journal-title":"Hum Mol Genet"},{"key":"5677_CR13","doi-asserted-by":"publisher","first-page":"223","DOI":"10.1146\/annurev-genom-083117-021651","volume":"19","author":"SAG Willis-Owen","year":"2018","unstructured":"Willis-Owen SAG, Cookson WOC, Moffatt MF. The Genetics and Genomics of Asthma. Annu Rev Genomics Hum Genet. 2018;19:223\u201346.","journal-title":"Annu Rev Genomics Hum Genet"},{"issue":"7265","key":"5677_CR14","doi-asserted-by":"publisher","first-page":"747","DOI":"10.1038\/nature08494","volume":"461","author":"TA Manolio","year":"2009","unstructured":"Manolio TA, Collins FS, Cox NJ, Goldstein DB, Hindorff LA, Hunter DJ, McCarthy MI, Ramos EM, Cardon LR, Chakravarti A, et al. Finding the missing heritability of complex diseases. Nature. 2009;461(7265):747\u201353.","journal-title":"Nature"},{"issue":"20","key":"5677_CR15","doi-asserted-by":"publisher","first-page":"11462","DOI":"10.1073\/pnas.201162998","volume":"98","author":"M West","year":"2001","unstructured":"West M, Blanchette C, Dressman H, Huang E, Ishida S, Spang R, Zuzan H, Olson JA Jr, Marks JR, Nevins JR. Predicting the clinical status of human breast cancer by using gene expression profiles. Proc Natl Acad Sci U S A. 2001;98(20):11462\u20137.","journal-title":"Proc Natl Acad Sci U S A"},{"issue":"2","key":"5677_CR16","doi-asserted-by":"publisher","first-page":"109","DOI":"10.1038\/nrg1522","volume":"6","author":"WY Wang","year":"2005","unstructured":"Wang WY, Barratt BJ, Clayton DG, Todd JA. Genome-wide association studies: theoretical and practical concerns. Nat Rev Genet. 2005;6(2):109\u201318.","journal-title":"Nat Rev Genet"},{"issue":"18","key":"5677_CR17","doi-asserted-by":"publisher","first-page":"3525","DOI":"10.1093\/hmg\/ddp295","volume":"18","author":"DM Evans","year":"2009","unstructured":"Evans DM, Visscher PM, Wray NR. 
Harnessing the information contained within genome-wide association studies to improve individual prediction of complex disease risk. Hum Mol Genet. 2009;18(18):3525\u201331.","journal-title":"Hum Mol Genet"},{"issue":"7256","key":"5677_CR18","doi-asserted-by":"publisher","first-page":"748","DOI":"10.1038\/nature08185","volume":"460","author":"SM Purcell","year":"2009","unstructured":"International Schizophrenia C, Purcell SM, Wray NR, Stone JL, Visscher PM, O\u2019Donovan MC, Sullivan PF, Sklar P. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature. 2009;460(7256):748\u201352.","journal-title":"Nature"},{"issue":"5","key":"5677_CR19","doi-asserted-by":"publisher","first-page":"468","DOI":"10.1161\/CIRCGENETICS.110.946269","volume":"3","author":"RW Davies","year":"2010","unstructured":"Davies RW, Dandona S, Stewart AF, Chen L, Ellis SG, Tang WH, Hazen SL, Roberts R, McPherson R, Wells GA. Improved prediction of cardiovascular disease based on a panel of single nucleotide polymorphisms identified through genome-wide association studies. Circ Cardiovasc Genet. 2010;3(5):468\u201374.","journal-title":"Circ Cardiovasc Genet"},{"issue":"R2","key":"5677_CR20","doi-asserted-by":"publisher","first-page":"R166","DOI":"10.1093\/hmg\/ddn250","volume":"17","author":"AC Janssens","year":"2008","unstructured":"Janssens AC, van Duijn CM. Genome-based prediction of common diseases: advances and prospects. Hum Mol Genet. 2008;17(R2):R166-173.","journal-title":"Hum Mol Genet"},{"issue":"1","key":"5677_CR21","doi-asserted-by":"publisher","first-page":"105","DOI":"10.1016\/j.ahj.2009.04.022","volume":"158","author":"JB van der Net","year":"2009","unstructured":"van der Net JB, Janssens AC, Sijbrands EJ, Steyerberg EW. Value of genetic profiling for the prediction of coronary heart disease. Am Heart J. 
2009;158(1):105\u201310.","journal-title":"Am Heart J"},{"issue":"10","key":"5677_CR22","doi-asserted-by":"publisher","first-page":"e374","DOI":"10.1371\/journal.pmed.0030374","volume":"3","author":"MN Weedon","year":"2006","unstructured":"Weedon MN, McCarthy MI, Hitman G, Walker M, Groves CJ, Zeggini E, Rayner NW, Shields B, Owen KR, Hattersley AT, et al. Combining information from common type 2 diabetes risk polymorphisms improves disease prediction. PLoS Med. 2006;3(10):e374.","journal-title":"PLoS Med"},{"issue":"3","key":"5677_CR23","doi-asserted-by":"publisher","first-page":"273","DOI":"10.1007\/BF00994018","volume":"20","author":"C Cortes","year":"1995","unstructured":"Cortes C, Vapnik V. Support-Vector Networks. Mach Learn. 1995;20(3):273\u201397.","journal-title":"Mach Learn"},{"issue":"Suppl 2","key":"5677_CR24","doi-asserted-by":"publisher","first-page":"S11","DOI":"10.1186\/1752-0509-6-S2-S11","volume":"6","author":"D Yoon","year":"2012","unstructured":"Yoon D, Kim YJ, Park T. Phenotype prediction from genome-wide association studies: application to smoking behaviors. BMC Syst Biol. 2012;6(Suppl 2):S11.","journal-title":"BMC Syst Biol"},{"issue":"1","key":"5677_CR25","doi-asserted-by":"publisher","first-page":"5","DOI":"10.1023\/A:1010933404324","volume":"45","author":"L Breiman","year":"2001","unstructured":"Breiman L. Random forests. Mach Learn. 2001;45(1):5\u201332.","journal-title":"Mach Learn"},{"issue":"2","key":"5677_CR26","doi-asserted-by":"publisher","first-page":"197","DOI":"10.1007\/BF00116037","volume":"5","author":"RE Schapire","year":"1990","unstructured":"Schapire RE. The strength of weak learnability. Mach Learn. 1990;5(2):197\u2013227.","journal-title":"Mach Learn"},{"issue":"2","key":"5677_CR27","doi-asserted-by":"publisher","first-page":"123","DOI":"10.1007\/BF00058655","volume":"24","author":"L Breiman","year":"1996","unstructured":"Breiman L. Bagging predictors. Mach Learn. 
1996;24(2):123\u201340.","journal-title":"Mach Learn"},{"key":"5677_CR28","unstructured":"Langley P, Iba W, Thompson K. An analysis of Bayesian classifiers. In: Aaai. Citeseer; 1992. pp. 223\u2013228."},{"issue":"1","key":"5677_CR29","doi-asserted-by":"publisher","first-page":"21","DOI":"10.1109\/TIT.1967.1053964","volume":"13","author":"T Cover","year":"1967","unstructured":"Cover T, Hart P. Nearest neighbor pattern classification. IEEE Trans Inf Theory. 1967;13(1):21\u20137.","journal-title":"IEEE Trans Inf Theory"},{"key":"5677_CR30","doi-asserted-by":"publisher","first-page":"43","DOI":"10.1016\/j.artmed.2017.09.005","volume":"85","author":"B Lopez","year":"2018","unstructured":"Lopez B, Torrent-Fontbona F, Vinas R, Fernandez-Real JM. Single Nucleotide Polymorphism relevance learning with Random Forests for Type 2 diabetes risk prediction. Artif Intell Med. 2018;85:43\u20139.","journal-title":"Artif Intell Med"},{"issue":"1","key":"5677_CR31","doi-asserted-by":"publisher","first-page":"12665","DOI":"10.1038\/s41598-017-13056-1","volume":"7","author":"G Pare","year":"2017","unstructured":"Pare G, Mao S, Deng WQ. A machine-learning heuristic to improve gene score prediction of polygenic traits. Sci Rep. 2017;7(1):12665.","journal-title":"Sci Rep"},{"key":"5677_CR32","doi-asserted-by":"publisher","first-page":"267","DOI":"10.3389\/fgene.2019.00267","volume":"10","author":"DSW Ho","year":"2019","unstructured":"Ho DSW, Schierding W, Wake M, Saffery R, O\u2019Sullivan J. Machine learning SNP based prediction for precision medicine. Front Genet. 2019;10:267.","journal-title":"Front Genet"},{"issue":"1","key":"5677_CR33","doi-asserted-by":"publisher","first-page":"55","DOI":"10.1080\/00401706.1970.10488634","volume":"12","author":"AE Hoerl","year":"1970","unstructured":"Hoerl AE, Kennard RW. Ridge regression\u2014biased estimation for nonorthogonal problems. Technometrics. 
1970;12(1):55\u2013000.","journal-title":"Technometrics"},{"issue":"1","key":"5677_CR34","doi-asserted-by":"publisher","first-page":"69","DOI":"10.1080\/00401706.1970.10488635","volume":"12","author":"AE Hoerl","year":"1970","unstructured":"Hoerl AE, Kennard RW. Ridge regression\u2014applications to nonorthogonal problems. Technometrics. 1970;12(1):69\u2013000.","journal-title":"Technometrics"},{"issue":"3","key":"5677_CR35","first-page":"603","volume":"26","author":"AE Hoerl","year":"1970","unstructured":"Hoerl AE. Ridge regression. Biometrics. 1970;26(3):603\u201310.","journal-title":"Biometrics"},{"issue":"1","key":"5677_CR36","doi-asserted-by":"crossref","first-page":"267","DOI":"10.1111\/j.2517-6161.1996.tb02080.x","volume":"58","author":"R Tibshirani","year":"1996","unstructured":"Tibshirani R. Regression shrinkage and selection via the Lasso. J R Stat Soc Ser B-Methodol. 1996;58(1):267\u201388.","journal-title":"J R Stat Soc Ser B-Methodol"},{"key":"5677_CR37","doi-asserted-by":"publisher","first-page":"768","DOI":"10.1111\/j.1467-9868.2005.00527.x","volume":"67","author":"H Zou","year":"2005","unstructured":"Zou H, Hastie T. Regularization and variable selection via the elastic net (vol B 67, pg 301, 2005). J R Stat Soc Ser B-Stat Methodol. 2005;67:768\u2013768.","journal-title":"J R Stat Soc Ser B-Stat Methodol"},{"issue":"456","key":"5677_CR38","doi-asserted-by":"publisher","first-page":"1348","DOI":"10.1198\/016214501753382273","volume":"96","author":"JQ Fan","year":"2001","unstructured":"Fan JQ, Li RZ. Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc. 2001;96(456):1348\u201360.","journal-title":"J Am Stat Assoc"},{"key":"5677_CR39","doi-asserted-by":"publisher","DOI":"10.1007\/978-0-387-84858-7","volume-title":"The elements of statistical learning: data mining, inference, and prediction","author":"T Hastie","year":"2009","unstructured":"Hastie T, Tibshirani R, Friedman JH, Friedman JH. 
The elements of statistical learning: data mining, inference, and prediction, vol. 2. New York: Springer; 2009."},{"issue":"Suppl 7","key":"5677_CR40","doi-asserted-by":"publisher","first-page":"S27","DOI":"10.1186\/1753-6561-3-S7-S27","volume":"3","author":"YJ Sung","year":"2009","unstructured":"Sung YJ, Rice TK, Shi G, Gu CC, Rao D. Comparison between single-marker analysis using Merlin and multi-marker analysis using LASSO for Framingham simulated data. BMC Proc. 2009;3(Suppl 7):S27.","journal-title":"BMC Proc"},{"issue":"6","key":"5677_CR41","doi-asserted-by":"publisher","first-page":"714","DOI":"10.1093\/bioinformatics\/btp041","volume":"25","author":"TT Wu","year":"2009","unstructured":"Wu TT, Chen YF, Hastie T, Sobel E, Lange K. Genome-wide association analysis by lasso penalized logistic regression. Bioinformatics. 2009;25(6):714\u201321.","journal-title":"Bioinformatics"},{"issue":"5","key":"5677_CR42","doi-asserted-by":"publisher","first-page":"416","DOI":"10.1111\/j.1469-1809.2010.00597.x","volume":"74","author":"S Cho","year":"2010","unstructured":"Cho S, Kim K, Kim YJ, Lee JK, Cho YS, Lee JY, Han BG, Kim H, Ott J, Park T. Joint identification of multiple genetic variants via elastic-net variable selection in a genome-wide association analysis. Ann Hum Genet. 2010;74(5):416\u201328.","journal-title":"Ann Hum Genet"},{"key":"5677_CR43","doi-asserted-by":"publisher","first-page":"605891","DOI":"10.1155\/2015\/605891","volume":"2015","author":"S Won","year":"2015","unstructured":"Won S, Choi H, Park S, Lee J, Park C, Kwon S. Evaluation of penalized and nonpenalized methods for disease prediction with large-scale genetic data. Biomed Res Int. 2015;2015:605891.","journal-title":"Biomed Res Int"},{"issue":"2","key":"5677_CR44","doi-asserted-by":"publisher","first-page":"375","DOI":"10.1016\/j.ajhg.2007.10.012","volume":"82","author":"N Malo","year":"2008","unstructured":"Malo N, Libiger O, Schork NJ. 
Accommodating linkage disequilibrium in genetic-association analyses via ridge regression. Am J Hum Genet. 2008;82(2):375\u201385.","journal-title":"Am J Hum Genet"},{"issue":"2","key":"5677_CR45","doi-asserted-by":"publisher","first-page":"e20","DOI":"10.1093\/ije\/dyv316","volume":"46","author":"Y Kim","year":"2017","unstructured":"Kim Y, Han BG. Ko GESg: cohort profile: the Korean Genome and Epidemiology Study (KoGES) Consortium. Int J Epidemiol. 2017;46(2):e20.","journal-title":"Int J Epidemiol"},{"issue":"3","key":"5677_CR46","doi-asserted-by":"publisher","first-page":"185","DOI":"10.1016\/j.phrp.2012.07.007","volume":"3","author":"JE Lee","year":"2012","unstructured":"Lee JE, Kim JH, Hong EJ, Yoo HS, Nam HY, Park O. National Biobank of Korea: quality control programs of collected-human biospecimens. Osong Public Health Res Perspect. 2012;3(3):185\u20139.","journal-title":"Osong Public Health Res Perspect"},{"issue":"1","key":"5677_CR47","doi-asserted-by":"publisher","first-page":"1382","DOI":"10.1038\/s41598-018-37832-9","volume":"9","author":"S Moon","year":"2019","unstructured":"Moon S, Kim YJ, Han S, Hwang MY, Shin DM, Park MY, Lu Y, Yoon K, Jang HM, Kim YK, et al. The Korea Biobank Array: design and identification of coding variants associated with blood biochemical traits. Sci Rep. 2019;9(1):1382.","journal-title":"Sci Rep"},{"issue":"2","key":"5677_CR48","doi-asserted-by":"publisher","first-page":"405","DOI":"10.1109\/TKDE.2012.232","volume":"26","author":"S Barua","year":"2014","unstructured":"Barua S, Islam MM, Yao X, Murase K. MWMOTE-majority weighted minority oversampling technique for imbalanced data set learning. IEEE Trans Knowl Data Eng. 2014;26(2):405\u201325.","journal-title":"IEEE Trans Knowl Data Eng"},{"key":"5677_CR49","doi-asserted-by":"publisher","first-page":"99","DOI":"10.1016\/j.inffus.2013.12.003","volume":"20","author":"HX Zhang","year":"2014","unstructured":"Zhang HX, Li MF. 
RWO-Sampling: a random walk over-sampling approach to imbalanced data classification. Inf Fusion. 2014;20:99\u2013116.","journal-title":"Inf Fusion"},{"key":"5677_CR50","doi-asserted-by":"publisher","first-page":"321","DOI":"10.1613\/jair.953","volume":"16","author":"NV Chawla","year":"2002","unstructured":"Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP. SMOTE: synthetic minority over-sampling technique. J Artif Intell Res. 2002;16:321\u201357.","journal-title":"J Artif Intell Res"},{"issue":"3","key":"5677_CR51","doi-asserted-by":"publisher","first-page":"310","DOI":"10.1038\/ng.2892","volume":"46","author":"M Kircher","year":"2014","unstructured":"Kircher M, Witten DM, Jain P, O\u2019roak BJ, Cooper GM, Shendure J. A general framework for estimating the relative pathogenicity of human genetic variants. Nat Genet. 2014;46(3):310\u20135.","journal-title":"Nat Genet"},{"issue":"5","key":"5677_CR52","doi-asserted-by":"publisher","first-page":"761","DOI":"10.1093\/bioinformatics\/btu703","volume":"31","author":"D Quang","year":"2015","unstructured":"Quang D, Chen Y, Xie X. DANN: a deep learning approach for annotating the pathogenicity of genetic variants. Bioinformatics. 2015;31(5):761\u20133.","journal-title":"Bioinformatics"},{"issue":"2","key":"5677_CR53","doi-asserted-by":"publisher","first-page":"148","DOI":"10.1016\/S0033-3549(04)50006-7","volume":"116","author":"MD Eisner","year":"2001","unstructured":"Eisner MD, Yelin EH, Trupin L, Blanc PD. Asthma and smoking status in a population-based study of California adults. Public Health Rep. 2001;116(2):148\u201357.","journal-title":"Public Health Rep"},{"issue":"2","key":"5677_CR54","doi-asserted-by":"publisher","first-page":"153","DOI":"10.1097\/01.all.0000162308.89857.6c","volume":"5","author":"LK Arruda","year":"2005","unstructured":"Arruda LK, Sol\u00e9 D, Baena-Cagnani CE, Naspitz CK. Risk factors for asthma and atopy. Curr Opin Allergy Clin Immunol. 
2005;5(2):153\u20139.","journal-title":"Curr Opin Allergy Clin Immunol"},{"key":"5677_CR55","doi-asserted-by":"crossref","unstructured":"Toskala E, Kennedy DW. Asthma risk factors. In: International forum of allergy & rhinology. Wiley Online Library; 2015. pp. S11\u2013S16.","DOI":"10.1002\/alr.21557"},{"issue":"1","key":"5677_CR56","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-12-77","volume":"12","author":"X Robin","year":"2011","unstructured":"Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez J-C, M\u00fcller M. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011;12(1):1\u20138.","journal-title":"BMC Bioinform"},{"key":"5677_CR57","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v028.i05","volume":"28","author":"M Kuhn","year":"2008","unstructured":"Kuhn M. Building predictive models in R using the caret package. J Stat Softw. 2008;28:1\u201326.","journal-title":"J Stat Softw"},{"key":"5677_CR58","unstructured":"Gorman B. mltools: Machine learning tools. URL: https:\/\/CRAN.R-project.org\/package=mltools R package version 03 2018, 5."},{"issue":"1","key":"5677_CR59","doi-asserted-by":"publisher","first-page":"145","DOI":"10.1093\/bioinformatics\/btw570","volume":"33","author":"T Saito","year":"2017","unstructured":"Saito T, Rehmsmeier M. Precrec: fast and accurate precision-recall and ROC curve calculations in R. Bioinformatics. 2017;33(1):145\u20137.","journal-title":"Bioinformatics"},{"issue":"7","key":"5677_CR60","doi-asserted-by":"publisher","first-page":"565","DOI":"10.1038\/ng.608","volume":"42","author":"J Yang","year":"2010","unstructured":"Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet. 
2010;42(7):565\u20139.","journal-title":"Nat Genet"},{"issue":"1","key":"5677_CR61","doi-asserted-by":"publisher","first-page":"76","DOI":"10.1016\/j.ajhg.2010.11.011","volume":"88","author":"J Yang","year":"2011","unstructured":"Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet. 2011;88(1):76\u201382.","journal-title":"Am J Hum Genet"},{"key":"5677_CR62","doi-asserted-by":"publisher","first-page":"329","DOI":"10.1016\/j.knosys.2018.07.035","volume":"161","author":"I Cordon","year":"2018","unstructured":"Cordon I, Garcia S, Fernandez A, Herrera F. Imbalance: Oversampling algorithms for imbalanced classification in R. Knowl-Based Syst. 2018;161:329\u201341.","journal-title":"Knowl-Based Syst"},{"issue":"16","key":"5677_CR63","doi-asserted-by":"publisher","first-page":"e164","DOI":"10.1093\/nar\/gkq603","volume":"38","author":"K Wang","year":"2010","unstructured":"Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38(16):e164\u2013e164.","journal-title":"Nucleic Acids Res"},{"issue":"1","key":"5677_CR64","first-page":"100","volume":"173","author":"C-C Lin","year":"2020","unstructured":"Lin C-C, Law BF, Hettick JM. Acute 4, 4\u2032-methylene diphenyl diisocyanate exposure-mediated downregulation of miR-206-3p and miR-381-3p activates inducible nitric oxide synthase transcription by targeting calcineurin\/NFAT signaling in macrophages. Toxicol Sci. 2020;173(1):100\u201313.","journal-title":"Toxicol Sci"},{"issue":"10\u201311","key":"5677_CR65","doi-asserted-by":"publisher","first-page":"813","DOI":"10.1016\/j.clinbiochem.2011.04.021","volume":"44","author":"L-J Li","year":"2011","unstructured":"Li L-J, Gao L-B, Lv M-L, Dong W, Su X-W, Liang W-B, Zhang L. Association between SNPs in pre-miRNA and risk of chronic obstructive pulmonary disease. Clin Biochem. 
2011;44(10\u201311):813\u20136.","journal-title":"Clin Biochem"},{"key":"5677_CR66","doi-asserted-by":"crossref","unstructured":"Akat A, Yilmaz Semerci S, Ugurel OM, Erdemir A, Danhaive O, Cetinkaya M, Turgut-Balik D. Bronchopulmonary dysplasia and wnt pathway-associated single nucleotide polymorphisms. Pediatric Res 2021;1\u201311.","DOI":"10.1038\/s41390-021-01851-6"},{"key":"5677_CR67","doi-asserted-by":"publisher","DOI":"10.1183\/23120541.00802-2020","author":"SSP Nemani","year":"2021","unstructured":"Nemani SSP, Vermeulen CJ, Pech M, Faiz A, Oliver BGG, van den Berge M, Burgess JK, Kopp MV, Weckmann M. COL4A3 expression in asthmatic epithelium depends on intronic methylation and ZNF263 binding. ERJ open Res. 2021. https:\/\/doi.org\/10.1183\/23120541.00802-2020.","journal-title":"ERJ open Res"},{"issue":"6","key":"5677_CR68","doi-asserted-by":"publisher","first-page":"986","DOI":"10.1016\/j.ajhg.2012.04.015","volume":"90","author":"G Lopez-Herrera","year":"2012","unstructured":"Lopez-Herrera G, Tampella G, Pan-Hammarstr\u00f6m Q, Herholz P, Trujillo-Vargas CM, Phadwal K, Simon AK, Moutschen M, Etzioni A, Mory A. Deleterious mutations in LRBA are associated with a syndrome of immune deficiency and autoimmunity. Am J Hum Genet. 2012;90(6):986\u20131001.","journal-title":"Am J Hum Genet"},{"issue":"6","key":"5677_CR69","doi-asserted-by":"publisher","first-page":"1393","DOI":"10.1016\/j.jaci.2008.02.031","volume":"121","author":"Y Yang","year":"2008","unstructured":"Yang Y, Haitchi HM, Cakebread J, Sammut D, Harvey A, Powell RM, Holloway JW, Howarth P, Holgate ST, Davies DE. Epigenetic mechanisms silence a disintegrin and metalloprotease 33 expression in bronchial epithelial cells. J Allergy Clin Immunol. 2008;121(6):1393-1399 e1314.","journal-title":"J Allergy Clin Immunol"},{"key":"5677_CR70","doi-asserted-by":"publisher","DOI":"10.1183\/23120541.00058-2015","author":"T Szul","year":"2016","unstructured":"Szul T, Castaldi P, Cho MH, Blalock JE, Gaggar A. 
Genetic regulation of expression of leukotriene A4 hydrolase. ERJ Open Res. 2016. https:\/\/doi.org\/10.1183\/23120541.00058-2015.","journal-title":"ERJ Open Res"},{"issue":"5","key":"5677_CR71","doi-asserted-by":"publisher","first-page":"1218","DOI":"10.1016\/j.jaci.2012.01.074","volume":"129","author":"M Imboden","year":"2012","unstructured":"Imboden M, Bouzigon E, Curjuric I, Ramasamy A, Kumar A, Hancock DB, Wilk JB, Vonk JM, Thun GA, Siroux V, et al. Genome-wide association study of lung function decline in adults with and without asthma. J Allergy Clin Immunol. 2012;129(5):1218\u201328.","journal-title":"J Allergy Clin Immunol"},{"issue":"1","key":"5677_CR72","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1038\/s41598-021-95887-7","volume":"11","author":"S Sin","year":"2021","unstructured":"Sin S, Choi H-M, Lim J, Kim J, Bak SH, Choi SS, Park J, Lee JH, Oh Y-M, Lee MK. A genome-wide association study of quantitative computed tomographic emphysema in Korean populations. Sci Rep. 2021;11(1):1\u201310.","journal-title":"Sci Rep"},{"key":"5677_CR73","doi-asserted-by":"publisher","DOI":"10.1155\/2016\/3564341","author":"J-C B\u00e9rub\u00e9","year":"2016","unstructured":"B\u00e9rub\u00e9 J-C, Gaudreault N, Lavoie-Charland E, Sbarra L, Henry C, Madore A-M, Par\u00e9 PD, van den Berge M, Nickle D, Laviolette M. Identification of susceptibility genes of adult asthma in French Canadian women. Can Respir J. 2016. https:\/\/doi.org\/10.1155\/2016\/3564341.","journal-title":"Can Respir J"},{"issue":"1","key":"5677_CR74","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12967-020-02581-9","volume":"18","author":"Z G\u00e1l","year":"2020","unstructured":"G\u00e1l Z, G\u00e9zsi A, Semsei \u00c1F, Nagy A, Sult\u00e9sz M, Csoma Z, Tam\u00e1si L, G\u00e1lffy G, Szalai C. Investigation of circulating lncRNAs as potential biomarkers in chronic respiratory diseases. J Transl Med. 
2020;18(1):1\u201315.","journal-title":"J Transl Med"},{"issue":"10","key":"5677_CR75","doi-asserted-by":"publisher","first-page":"e12091","DOI":"10.1002\/clt2.12091","volume":"11","author":"M Suzuki","year":"2021","unstructured":"Suzuki M, Cole JJ, Konno S, Makita H, Kimura H, Nishimura M, Maciewicz RA. Large-scale plasma proteomics can reveal distinct endotypes in chronic obstructive pulmonary disease and severe asthma. Clin Transl Allergy. 2021;11(10):e12091.","journal-title":"Clin Transl Allergy"},{"issue":"1","key":"5677_CR76","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2350-13-110","volume":"13","author":"AS Tulah","year":"2012","unstructured":"Tulah AS, Begh\u00e9 B, Barton SJ, Holloway JW, Sayers I. Leukotriene B4 receptor locus gene characterisation and association studies in asthma. BMC Med Genet. 2012;13(1):1\u201311.","journal-title":"BMC Med Genet"},{"issue":"5","key":"5677_CR77","first-page":"65","volume":"71","author":"C Li","year":"2020","unstructured":"Li C, Liu H, Zhang J, Zhang J, Dai L, Zhao Z, Fang L, Liu L, Shu J, Feng J. LncRNA BMF-AS1 exerts anti-apoptosis function in COPD by regulating BMF expression. Age (Mean\u00b1SD, year). 2020;71(5):65\u201364.","journal-title":"Age (Mean\u00b1SD, year)"},{"issue":"2","key":"5677_CR78","doi-asserted-by":"publisher","first-page":"481","DOI":"10.1016\/j.jaci.2012.05.043","volume":"130","author":"A Alangari","year":"2012","unstructured":"Alangari A, Alsultan A, Adly N, Massaad MJ, Kiani IS, Aljebreen A, Raddaoui E, Almomen A-K, Al-Muhsen S, Geha RS. LPS-responsive beige-like anchor (LRBA) gene mutation in a family with inflammatory bowel disease and combined immunodeficiency. J Allergy Clin Immunol. 2012;130(2):481-488. 
e482.","journal-title":"J Allergy Clin Immunol"},{"issue":"4","key":"5677_CR79","doi-asserted-by":"publisher","first-page":"245","DOI":"10.3390\/jpm10040245","volume":"10","author":"M Michalik","year":"2020","unstructured":"Michalik M, Samet A, Dmowska-Koroblewska A, Podbielska-Kubera A, Waszczuk-Jankowska M, Struck-Lewicka W, Markuszewski MJ. An overview of the application of systems biology in an understanding of chronic rhinosinusitis (CRS) development. J Pers Med. 2020;10(4):245.","journal-title":"J Pers Med"},{"issue":"202","key":"5677_CR80","doi-asserted-by":"publisher","first-page":"ra85","DOI":"10.1126\/scisignal.2001637","volume":"4","author":"T Tanaka","year":"2011","unstructured":"Tanaka T, Yamamoto Y, Muromoto R, Ikeda O, Sekine Y, Grusby MJ, Kaisho T, Matsuda T. PDLIM2 inhibits T helper 17 cell development and granulomatous inflammation through degradation of STAT3. Sci Signal. 2011;4(202):ra85\u2013ra85.","journal-title":"Sci Signal"},{"issue":"4","key":"5677_CR81","doi-asserted-by":"publisher","first-page":"582","DOI":"10.1111\/j.1365-2222.2009.03438.x","volume":"40","author":"M Via","year":"2010","unstructured":"Via M, De Giacomo A, Corvol H, Eng C, Seibold MA, Gillett C, Galanter J, Sen S, Tcheurekdjian H, Chapela R. The role of LTA4H and ALOX5AP genes in the risk for asthma in Latinos. Clin Exp Allergy. 2010;40(4):582\u20139.","journal-title":"Clin Exp Allergy"},{"issue":"8","key":"5677_CR82","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1111\/j.1398-9995.2008.01667.x","volume":"63","author":"J Holloway","year":"2008","unstructured":"Holloway J, Barton S, Holgate S, Rose-Zerilli M, Sayers I. The role of LTA4H and ALOX5AP polymorphism in asthma and allergy susceptibility. Allergy. 
2008;63(8):1046\u201353.","journal-title":"Allergy"},{"issue":"7","key":"5677_CR83","doi-asserted-by":"publisher","first-page":"3055","DOI":"10.21037\/jtd.2019.07.55","volume":"11","author":"J Kim","year":"2019","unstructured":"Kim J, Kim DY, Heo H-R, Choi SS, Hong S-H, Kim WJ. Role of miRNA-181a-2-3p in cadmium-induced inflammatory responses of human bronchial epithelial cells. J Thorac Dis. 2019;11(7):3055.","journal-title":"J Thorac Dis"},{"issue":"1","key":"5677_CR84","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1465-9921-15-58","volume":"15","author":"MM Perry","year":"2014","unstructured":"Perry MM, Tsitsiou E, Austin PJ, Lindsay MA, Gibeon DS, Adcock IM, Chung KF. Role of non-coding RNAs in maintaining primary airway smooth muscle cells. Respir Res. 2014;15(1):1\u201312.","journal-title":"Respir Res"},{"issue":"1","key":"5677_CR85","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1165\/rcmb.2016-0101OC","volume":"56","author":"LP Hayden","year":"2017","unstructured":"Hayden LP, Cho MH, McDonald MLN, Crapo JD, Beaty TH, Silverman EK, Hersh CP. Susceptibility to childhood pneumonia: a genome-wide analysis. Am J Respir Cell Mol Biol. 2017;56(1):20\u20138.","journal-title":"Am J Respir Cell Mol Biol"},{"issue":"1","key":"5677_CR86","doi-asserted-by":"publisher","first-page":"31","DOI":"10.1186\/s13073-021-00835-9","volume":"13","author":"P Rentzsch","year":"2021","unstructured":"Rentzsch P, Schubach M, Shendure J, Kircher M. CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores. Genome Med. 2021;13(1):31.","journal-title":"Genome Med"},{"key":"5677_CR87","doi-asserted-by":"publisher","DOI":"10.1093\/bib\/bbac022","author":"T Jo","year":"2022","unstructured":"Jo T, Nho K, Bice P, Saykin AJ. Alzheimer\u2019s Disease Neuroimaging I: Deep learning-based identification of genetic variants: application to Alzheimer\u2019s disease classification. Brief Bioinform. 2022. 
https:\/\/doi.org\/10.1093\/bib\/bbac022.","journal-title":"Brief Bioinform"},{"issue":"2","key":"5677_CR88","first-page":"449","volume":"19","author":"P Hall","year":"2009","unstructured":"Hall P, Lee ER, Park BU. Bootstrap-based penalty choice for the lasso, achieving oracle performance. Stat Sin. 2009;19(2):449\u201371.","journal-title":"Stat Sin"},{"issue":"1","key":"5677_CR89","doi-asserted-by":"publisher","first-page":"468","DOI":"10.1214\/10-AOAS377","volume":"5","author":"S Wang","year":"2011","unstructured":"Wang S, Nan B, Rosset S, Zhu J. Random Lasso. Ann Appl Stat. 2011;5(1):468\u201385.","journal-title":"Ann Appl Stat"},{"issue":"1","key":"5677_CR90","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-11-523","volume":"11","author":"R Blagus","year":"2010","unstructured":"Blagus R, Lusa L. Class prediction for high-dimensional class-imbalanced data. BMC Bioinform. 2010;11(1):1\u201317.","journal-title":"BMC Bioinform"},{"key":"5677_CR91","doi-asserted-by":"publisher","first-page":"220","DOI":"10.1016\/j.eswa.2016.12.035","volume":"73","author":"G Haixiang","year":"2017","unstructured":"Haixiang G, Yijing L, Shang J, Mingyun G, Yuanyue H, Bing G. Learning from class-imbalanced data: Review of methods and applications. Expert Syst Appl. 2017;73:220\u201339.","journal-title":"Expert Syst Appl"},{"issue":"9","key":"5677_CR92","doi-asserted-by":"publisher","first-page":"1263","DOI":"10.1109\/TKDE.2008.239","volume":"21","author":"H He","year":"2009","unstructured":"He H, Garcia EA. Learning from imbalanced data. IEEE Trans Knowl Data Eng. 2009;21(9):1263\u201384.","journal-title":"IEEE Trans Knowl Data Eng"},{"issue":"4","key":"5677_CR93","doi-asserted-by":"publisher","first-page":"221","DOI":"10.1007\/s13748-016-0094-0","volume":"5","author":"B Krawczyk","year":"2016","unstructured":"Krawczyk B. Learning from imbalanced data: open challenges and future directions. Prog Artif Intell. 
2016;5(4):221\u201332.","journal-title":"Prog Artif Intell"},{"key":"5677_CR94","doi-asserted-by":"publisher","DOI":"10.1007\/978-3-319-98074-4","volume-title":"Learning from imbalanced data sets","author":"A Fern\u00e1ndez","year":"2018","unstructured":"Fern\u00e1ndez A, Garc\u00eda S, Galar M, Prati RC, Krawczyk B, Herrera F. Learning from imbalanced data sets, vol. 10. Cham: Springer; 2018."},{"issue":"7","key":"5677_CR95","doi-asserted-by":"publisher","first-page":"906","DOI":"10.1038\/ng2088","volume":"39","author":"J Marchini","year":"2007","unstructured":"Marchini J, Howie B, Myers S, McVean G, Donnelly P. A new multipoint method for genome-wide association studies by imputation of genotypes. Nat Genet. 2007;39(7):906\u201313.","journal-title":"Nat Genet"},{"key":"5677_CR96","doi-asserted-by":"publisher","first-page":"7","DOI":"10.1186\/s13742-015-0047-8","volume":"4","author":"CC Chang","year":"2015","unstructured":"Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience. 2015;4:7.","journal-title":"Gigascience"},{"issue":"1","key":"5677_CR97","doi-asserted-by":"publisher","first-page":"1","DOI":"10.18637\/jss.v033.i01","volume":"33","author":"J Friedman","year":"2010","unstructured":"Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw. 2010;33(1):1\u201322.","journal-title":"J Stat Softw"},{"key":"5677_CR98","doi-asserted-by":"crossref","unstructured":"Bayes T. LII. An essay towards solving a problem in the doctrine of chances. By the late Rev. Mr. Bayes, FRS communicated by Mr. Price, in a letter to John Canton, AMFR S. Philosophical transactions of the Royal Society of London 1763(53);370\u2013418.","DOI":"10.1098\/rstl.1763.0053"},{"key":"5677_CR99","unstructured":"Meyer D, Dimitriadou E, Hornik K, Weingessel A, Leisch F, Chang C-C, Lin C-C, Meyer MD. Package \u2018e1071\u2019. 
The R Journal 2019."},{"issue":"6","key":"5677_CR100","doi-asserted-by":"publisher","first-page":"585","DOI":"10.1038\/hdy.2017.4","volume":"118","author":"Y Bian","year":"2017","unstructured":"Bian Y, Holland JB. Enhancing genomic prediction with genome-wide association studies in multiparental maize populations. Heredity (Edinb). 2017;118(6):585\u201393.","journal-title":"Heredity (Edinb)"},{"issue":"Suppl 1","key":"5677_CR101","doi-asserted-by":"publisher","first-page":"S65","DOI":"10.1186\/1471-2105-10-S1-S65","volume":"10","author":"R Jiang","year":"2009","unstructured":"Jiang R, Tang W, Wu X, Fu W. A random forest approach to the detection of epistatic interactions in case-control studies. BMC Bioinform. 2009;10(Suppl 1):S65.","journal-title":"BMC Bioinform"},{"issue":"4","key":"5677_CR102","doi-asserted-by":"publisher","first-page":"e93379","DOI":"10.1371\/journal.pone.0093379","volume":"9","author":"V Botta","year":"2014","unstructured":"Botta V, Louppe G, Geurts P, Wehenkel L. Exploiting SNP correlations within random forest for genome-wide association studies. PLoS ONE. 2014;9(4):e93379.","journal-title":"PLoS ONE"},{"key":"5677_CR103","volume-title":"Package \u2018randomforest\u2019","author":"S RColourBrewer","year":"2018","unstructured":"RColourBrewer S, Liaw MA. Package \u2018randomforest.\u2019 Berkeley: University of California; 2018."},{"key":"5677_CR104","doi-asserted-by":"crossref","unstructured":"Ogutu JO, Piepho H-P, Schulz-Streeck T. A comparison of random forests, boosting and support vector machines for genomic selection. In: BMC proceedings. BioMed Central; 2011. pp. 1\u20135.","DOI":"10.1186\/1753-6561-5-S3-S11"},{"key":"5677_CR105","unstructured":"Tan AC, Gilbert D. Ensemble machine learning on gene expression data for cancer classification. 
2003."},{"issue":"3","key":"5677_CR106","doi-asserted-by":"publisher","first-page":"325","DOI":"10.1016\/j.ajhg.2010.07.021","volume":"87","author":"X Wan","year":"2010","unstructured":"Wan X, Yang C, Yang Q, Xue H, Fan X, Tang NL, Yu W. BOOST: A fast approach to detecting gene-gene interactions in genome-wide case-control studies. Am J Hum Genet. 2010;87(3):325\u201340.","journal-title":"Am J Hum Genet"},{"issue":"8","key":"5677_CR107","doi-asserted-by":"publisher","first-page":"523","DOI":"10.1038\/nrg3253","volume":"13","author":"Y Moreau","year":"2012","unstructured":"Moreau Y, Tranchevent LC. Computational tools for prioritizing candidate genes: boosting disease gene discovery. Nat Rev Genet. 2012;13(8):523\u201336.","journal-title":"Nat Rev Genet"},{"key":"5677_CR108","unstructured":"Culp M, Johnson K, Michailidis G. Culp MM: Package \u2018ada\u2019. Available online at: https:\/\/cran.r-project.org\/web\/packages\/ada\/index.html. 2016."},{"key":"5677_CR109","doi-asserted-by":"crossref","unstructured":"Verma A, Mehta S. A comparative study of ensemble learning methods for classification in bioinformatics. In: 2017 7th International Conference on Cloud Computing, Data Science & Engineering-Confluence. IEEE; 2017. pp. 155\u2013158.","DOI":"10.1109\/CONFLUENCE.2017.7943141"},{"key":"5677_CR110","doi-asserted-by":"crossref","unstructured":"Dittman DJ, Khoshgoftaar TM, Napolitano A, Fazelpour A. Select-bagging: Effectively combining gene selection and bagging for balanced bioinformatics data. In: 2014 IEEE international conference on bioinformatics and bioengineering. IEEE; 2014. pp. 413\u2013419.","DOI":"10.1109\/BIBE.2014.66"},{"issue":"1","key":"5677_CR111","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/1471-2105-5-136","volume":"5","author":"B Liu","year":"2004","unstructured":"Liu B, Cui Q, Jiang T, Ma S. A combinational feature selection and ensemble neural network method for classification of gene expression data. BMC Bioinform. 
2004;5(1):1\u201312.","journal-title":"BMC Bioinform"},{"key":"5677_CR112","unstructured":"Peters A, Hothorn T, Hothorn MT. Package \u2018ipred\u2019. R Package 2009:2009."},{"issue":"15","key":"5677_CR113","doi-asserted-by":"publisher","first-page":"2429","DOI":"10.1093\/bioinformatics\/bth267","volume":"20","author":"T Li","year":"2004","unstructured":"Li T, Zhang C, Ogihara M. A comparative study of feature selection and multiclass classification methods for tissue classification based on gene expression. Bioinformatics. 2004;20(15):2429\u201337.","journal-title":"Bioinformatics"},{"issue":"14","key":"5677_CR114","first-page":"1","volume":"13","author":"F Sambo","year":"2012","unstructured":"Sambo F, Trifoglio E, Di Camillo B, Toffolo GM, Cobelli C. Bag of Na\u00efve Bayes: biomarker selection and classification from genome-wide SNP data. BMC Bioinform. 2012;13(14):1\u201310.","journal-title":"BMC Bioinform"},{"issue":"1","key":"5677_CR115","doi-asserted-by":"publisher","first-page":"47","DOI":"10.1007\/s13721-012-0006-6","volume":"1","author":"J Van Hulse","year":"2012","unstructured":"Van Hulse J, Khoshgoftaar TM, Napolitano A, Wald R. Threshold-based feature selection techniques for high-dimensional bioinformatics data. Netw Model Anal Health Inform Bioinform. 2012;1(1):47\u201361.","journal-title":"Netw Model Anal Health Inform Bioinform"},{"issue":"2","key":"5677_CR116","doi-asserted-by":"publisher","first-page":"201","DOI":"10.1007\/s10462-017-9541-y","volume":"50","author":"C Wan","year":"2018","unstructured":"Wan C, Freitas AA. An empirical evaluation of hierarchical feature selection methods for classification in bioinformatics datasets with gene ontology-based features. Artif Intell Rev. 2018;50(2):201\u201340.","journal-title":"Artif Intell Rev"},{"key":"5677_CR117","doi-asserted-by":"crossref","unstructured":"Yao Z, Ruzzo WL. A regression-based K nearest neighbor algorithm for gene function prediction from heterogeneous data. In: BMC bioinformatics. 
BioMed Central; 2006. pp. 1\u201311.","DOI":"10.1186\/1471-2105-7-S1-S11"},{"issue":"1","key":"5677_CR118","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s12957-023-03277-2","volume":"16","author":"C Li","year":"2018","unstructured":"Li C, Zeng X, Yu H, Gu Y, Zhang W. Identification of hub genes with diagnostic values in pancreatic cancer by bioinformatics analyses and supervised learning methods. World Journal of Surgical Oncology. 2018;16(1):1\u201312.","journal-title":"World Journal of Surgical Oncology"},{"key":"5677_CR119","doi-asserted-by":"crossref","unstructured":"Saha S, Biswas S, Acharyya S: Gene selection by sample classification using k nearest neighbor and meta-heuristic algorithms. In: 2016 IEEE 6th international conference on advanced computing (IACC): 2016. IEEE: 250\u2013255.","DOI":"10.1109\/IACC.2016.55"},{"key":"5677_CR120","unstructured":"Cho S-B, Won H-H: Machine learning in DNA microarray analysis for cancer classification. In: Proceedings of the First Asia-Pacific Bioinformatics Conference on Bioinformatics 2003-Volume 19: 2003. 189\u2013198."},{"key":"5677_CR121","first-page":"220","volume":"26","author":"S Narkhede","year":"2018","unstructured":"Narkhede S. Understanding auc-roc curve. Towards Data Sci. 2018;26:220\u20137.","journal-title":"Towards Data Sci"},{"issue":"5","key":"5677_CR122","doi-asserted-by":"publisher","first-page":"404","DOI":"10.1016\/j.jbi.2005.02.008","volume":"38","author":"TA Lasko","year":"2005","unstructured":"Lasko TA, Bhagwat JG, Zou KH, Ohno-Machado L. The use of receiver operating characteristic curves in biomedical informatics. J Biomed Inform. 2005;38(5):404\u201315.","journal-title":"J Biomed Inform"},{"issue":"3","key":"5677_CR123","doi-asserted-by":"publisher","first-page":"e0118432","DOI":"10.1371\/journal.pone.0118432","volume":"10","author":"T Saito","year":"2015","unstructured":"Saito T, Rehmsmeier M. 
The precision-recall plot is more informative than the ROC plot when evaluating binary classifiers on imbalanced datasets. PLoS ONE. 2015;10(3):e0118432.","journal-title":"PLoS ONE"},{"key":"5677_CR124","doi-asserted-by":"publisher","first-page":"35","DOI":"10.1186\/s13040-017-0155-3","volume":"10","author":"D Chicco","year":"2017","unstructured":"Chicco D. Ten quick tips for machine learning in computational biology. BioData Min. 2017;10:35.","journal-title":"BioData Min"},{"issue":"8","key":"5677_CR125","doi-asserted-by":"publisher","first-page":"855","DOI":"10.1016\/j.jclinepi.2015.02.010","volume":"68","author":"B Ozenne","year":"2015","unstructured":"Ozenne B, Subtil F, Maucort-Boulch D. The precision\u2013recall curve overcame the optimism of the receiver operating characteristic curve in rare diseases. J Clin Epidemiol. 2015;68(8):855\u20139.","journal-title":"J Clin Epidemiol"},{"issue":"3","key":"5677_CR126","doi-asserted-by":"publisher","first-page":"e92209","DOI":"10.1371\/journal.pone.0092209","volume":"9","author":"J Keilwagen","year":"2014","unstructured":"Keilwagen J, Grosse I, Grau J. Area under precision-recall curves for weighted and unweighted data. PLoS ONE. 
2014;9(3):e92209.","journal-title":"PLoS ONE"}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-024-05677-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-024-05677-x\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-024-05677-x.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2024,11,9]],"date-time":"2024-11-09T23:37:40Z","timestamp":1731195460000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-024-05677-x"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2024,2,2]]},"references-count":126,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2024,12]]}},"alternative-id":["5677"],"URL":"https:\/\/doi.org\/10.1186\/s12859-024-05677-x","relation":{},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2024,2,2]]},"assertion":[{"value":"9 June 2022","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"26 January 2024","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"2 February 2024","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The study was reviewed and approved by the Institutional Review Board of Hanyang University (IRB No. HYUIRB-202210-013). 
All CAVAS, KARE, and HEXA study participants provided written informed consent. All methods were carried out in accordance with relevant guidelines and regulations (Declaration of Helsinki).","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare no conflict of interest.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"56"}}