{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,12]],"date-time":"2026-03-12T19:07:06Z","timestamp":1773342426354,"version":"3.50.1"},"reference-count":56,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2021,4,19]],"date-time":"2021-04-19T00:00:00Z","timestamp":1618790400000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2021,4,19]],"date-time":"2021-04-19T00:00:00Z","timestamp":1618790400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["BMC Bioinformatics"],"published-print":{"date-parts":[[2021,12]]},"abstract":"<jats:title>Abstract<\/jats:title>\n                  <jats:sec>\n                    <jats:title>Background<\/jats:title>\n                    <jats:p>Genotype\u2013phenotype predictions are of great importance in genetics. These predictions can help to find genetic mutations causing variations in human beings. There are many approaches for finding the association which can be broadly categorized into two classes, statistical techniques, and machine learning. Statistical techniques are good for finding the actual SNPs causing variation where Machine Learning techniques are good where we just want to classify the people into different categories. In this article, we examined the Eye-color and Type-2 diabetes phenotype. The proposed technique is a hybrid approach consisting of some parts from statistical techniques and remaining from Machine learning.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Results<\/jats:title>\n                    <jats:p>The main dataset for Eye-color phenotype consists of 806 people. 
404 people have Blue-Green eyes and 402 people have Brown eyes. After preprocessing, we generated 8 datasets containing different numbers of SNPs, using the mutation difference and thresholding at each individual SNP. We calculated three types of mutation at each SNP: no mutation, partial mutation, and full mutation. The data is then transformed for machine learning algorithms. We used 9 classifiers (RandomForest, Extreme Gradient boosting, ANN, LSTM, GRU, BILSTM, 1DCNN, ensembles of ANN, and ensembles of LSTM), which gave best accuracies of 0.91, 0.9286, 0.945, 0.94, 0.94, 0.92, 0.95, and 0.96 respectively. Stacked ensembles of LSTM outperformed the other algorithms for 1560 SNPs with an overall accuracy of 0.96, AUC\u00a0=\u00a00.98 for brown eyes, and AUC\u00a0=\u00a00.97 for Blue-Green eyes. The main dataset for Type-2 diabetes consists of 107 people, of whom 30 are classified as cases and 74 as controls. We used different linear thresholds to find the optimal number of SNPs for classification. The final model gave an accuracy of 0.97.<\/jats:p>\n                  <\/jats:sec>\n                  <jats:sec>\n                    <jats:title>Conclusion<\/jats:title>\n                    <jats:p>Genotype\u2013phenotype predictions are very useful, especially in forensics. These predictions can help to identify SNP variant associations with traits and diseases. Given more datasets, machine learning model predictions can be improved. Moreover, the non-linearity in the machine learning model and the combination of SNP mutations while training the model improve the prediction. 
We considered binary classification problems but the proposed approach can be extended to multi-class classification.<\/jats:p>\n                  <\/jats:sec>","DOI":"10.1186\/s12859-021-04077-9","type":"journal-article","created":{"date-parts":[[2021,4,19]],"date-time":"2021-04-19T08:04:45Z","timestamp":1618819485000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":10,"title":["Eye-color and Type-2 diabetes phenotype prediction from genotype data using deep learning methods"],"prefix":"10.1186","volume":"22","author":[{"given":"Muhammad","family":"Muneeb","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Andreas","family":"Henschel","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2021,4,19]]},"reference":[{"issue":"4","key":"4077_CR1","doi-asserted-by":"publisher","first-page":"285","DOI":"10.1038\/hdy.2014.103","volume":"115","author":"P Bateson","year":"2014","unstructured":"Bateson P. Why are individuals so different from each other? Heredity. 2014;115(4):285\u201392. https:\/\/doi.org\/10.1038\/hdy.2014.103.","journal-title":"Heredity"},{"issue":"7414","key":"4077_CR2","doi-asserted-by":"publisher","first-page":"57","DOI":"10.1038\/nature11247","volume":"489","author":"The ENCODE Project Consortium","year":"2012","unstructured":"The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57\u201374. https:\/\/doi.org\/10.1038\/nature11247.","journal-title":"Nature"},{"issue":"4","key":"4077_CR3","doi-asserted-by":"publisher","first-page":"80","DOI":"10.3390\/v9040080","volume":"9","author":"MR Kubiak","year":"2017","unstructured":"Kubiak MR, Maka\u0142owska I. Protein-coding genes\u2019 retrocopies and their functions. Viruses. 2017;9(4):80. 
https:\/\/doi.org\/10.3390\/v9040080.","journal-title":"Viruses"},{"key":"4077_CR4","unstructured":"Basic genetics information\u2014understanding genetics\u2014NCBI bookshelf. https:\/\/www.ncbi.nlm.nih.gov\/books\/NBK115558\/. Accessed 30 Nov 2020."},{"key":"4077_CR5","unstructured":"Understanding genetics: a New York, mid-Atlantic guide for patients and health professionals\u2014PubMed. https:\/\/pubmed.ncbi.nlm.nih.gov\/23304754\/. Accessed 30 Nov 2020."},{"key":"4077_CR6","unstructured":"Defective proteins and dominance and recessiveness\u2014modern genetic analysis\u2014NCBI bookshelf. https:\/\/www.ncbi.nlm.nih.gov\/books\/NBK21404\/. Accessed 30 Nov 2020."},{"key":"4077_CR7","unstructured":"The differences between mendelian & polygenic traits. https:\/\/sciencing.com\/differences-between-mendelian-polygenic-traits-8777329.html. Accessed 30 Nov 2020."},{"key":"4077_CR8","unstructured":"Human genetic disorders: studying single-gene (mendelian) diseases|learn science at scitable. https:\/\/www.nature.com\/scitable\/topicpage\/rare-genetic-disorders-learning-about-genetic-disease-979\/. Accessed 30 Nov 2020."},{"key":"4077_CR9","doi-asserted-by":"publisher","DOI":"10.1007\/978-1-4939-9012-2_38","author":"CS Agler","year":"2019","unstructured":"Agler CS, Shungin D, Zandon\u00e1 AGF, Schmadeke P, Basta PV, Luo J, Cantrell J, Pahel TD, Meyer BD, Shaffer JR, Schaefer AS, North KE, Divaris K. Protocols, methods, and tools for genome-wide association studies (GWAS) of dental traits. Methods Mol Biol. 2019;. https:\/\/doi.org\/10.1007\/978-1-4939-9012-2_38.","journal-title":"Methods Mol Biol"},{"issue":"3","key":"4077_CR10","doi-asserted-by":"publisher","first-page":"1505","DOI":"10.1534\/genetics.105.054452","volume":"174","author":"S Furihata","year":"2006","unstructured":"Furihata S, Ito T, Kamatani N. 
Test of association between haplotypes and phenotypes in case-control studies: examination of validity of the application of an algorithm for samples from cohort or clinical trials to case-control samples using simulated and real data. Genetics. 2006;174(3):1505\u201316. https:\/\/doi.org\/10.1534\/genetics.105.054452.","journal-title":"Genetics"},{"issue":"7","key":"4077_CR11","doi-asserted-by":"publisher","first-page":"1607","DOI":"10.1016\/j.sjbs.2018.09.011","volume":"26","author":"J Alghamdi","year":"2019","unstructured":"Alghamdi J, Amoudi M, Kassab AC, Mufarrej MA, Ghamdi SA. Eye color prediction using single nucleotide polymorphisms in Saudi population. Saudi J Biol Sci. 2019;26(7):1607\u201312. https:\/\/doi.org\/10.1016\/j.sjbs.2018.09.011.","journal-title":"Saudi J Biol Sci"},{"key":"4077_CR12","unstructured":"Quantitative trait loci mapping. https:\/\/www.ncbi.nlm.nih.gov\/pmc\/articles\/PMC6875759\/. Accessed 30 Nov 2020."},{"issue":"6","key":"4077_CR13","doi-asserted-by":"publisher","first-page":"116","DOI":"10.1371\/journal.pcbi.0030116","volume":"3","author":"AL Tarca","year":"2007","unstructured":"Tarca AL, Carey VJ, Chen X-W, Romero R, Dr\u0103ghici S. Machine learning and its applications to biology. PLoS Comput Biol. 2007;3(6):116. https:\/\/doi.org\/10.1371\/journal.pcbi.0030116.","journal-title":"PLoS Comput Biol"},{"key":"4077_CR14","doi-asserted-by":"publisher","DOI":"10.3389\/fgene.2019.00267","author":"DSW Ho","year":"2019","unstructured":"Ho DSW, Schierding W, Wake M, Saffery R, O\u2019Sullivan J. Machine learning SNP based prediction for precision medicine. Front Genet. 2019;. https:\/\/doi.org\/10.3389\/fgene.2019.00267.","journal-title":"Front Genet"},{"issue":"10","key":"4077_CR15","doi-asserted-by":"publisher","first-page":"781","DOI":"10.1038\/nrg1916","volume":"7","author":"DJ Balding","year":"2006","unstructured":"Balding DJ. A tutorial on statistical methods for population association studies. Nat Rev Genet. 
2006;7(10):781\u201391. https:\/\/doi.org\/10.1038\/nrg1916.","journal-title":"Nat Rev Genet"},{"key":"4077_CR16","doi-asserted-by":"publisher","DOI":"10.3389\/fgene.2019.01091","author":"Y Liu","year":"2019","unstructured":"Liu Y, Wang D, He F, Wang J, Joshi T, Xu D. Phenotype prediction and genome-wide association study using deep convolutional neural network of soybean. Front Genet. 2019;. https:\/\/doi.org\/10.3389\/fgene.2019.01091.","journal-title":"Front Genet"},{"issue":"8","key":"4077_CR17","doi-asserted-by":"publisher","first-page":"1735","DOI":"10.1162\/neco.1997.9.8.1735","volume":"9","author":"S Hochreiter","year":"1997","unstructured":"Hochreiter S, Schmidhuber J. Long short-term memory. Neural Comput. 1997;9(8):1735\u201380. https:\/\/doi.org\/10.1162\/neco.1997.9.8.1735.","journal-title":"Neural Comput"},{"issue":"9","key":"4077_CR18","doi-asserted-by":"publisher","first-page":"2018","DOI":"10.3390\/s19092018","volume":"19","author":"S Huang","year":"2019","unstructured":"Huang S, Tang J, Dai J, Wang Y. Signal status recognition based on 1DCNN and its feature extraction mechanism analysis. Sensors. 2019;19(9):2018. https:\/\/doi.org\/10.3390\/s19092018.","journal-title":"Sensors"},{"issue":"7","key":"4077_CR19","doi-asserted-by":"publisher","first-page":"2361","DOI":"10.3390\/app10072361","volume":"10","author":"F Yang","year":"2020","unstructured":"Yang F, Zhang W, Tao L, Ma J. Transfer learning strategies for deep learning-based PHM algorithms. Appl Sci. 2020;10(7):2361. https:\/\/doi.org\/10.3390\/app10072361.","journal-title":"Appl Sci"},{"key":"4077_CR20","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-019-40561-2","author":"A Drouin","year":"2019","unstructured":"Drouin A, Letarte G, Raymond F, Marchand M, Corbeil J, Laviolette F. Interpretable genotype-to-phenotype classifiers with performance guarantees. Sci Rep. 2019;. 
https:\/\/doi.org\/10.1038\/s41598-019-40561-2.","journal-title":"Sci Rep"},{"issue":"5","key":"4077_CR21","doi-asserted-by":"publisher","first-page":"192","DOI":"10.1016\/j.cub.2009.01.027","volume":"19","author":"F Liu","year":"2009","unstructured":"Liu F, van Duijn K, Vingerling JR, Hofman A, Uitterlinden AG, Janssens ACJW, Kayser M. Eye color and the prediction of complex phenotypes from genotypes. Curr Biol. 2009;19(5):192\u20133. https:\/\/doi.org\/10.1016\/j.cub.2009.01.027.","journal-title":"Curr Biol"},{"issue":"3","key":"4077_CR22","doi-asserted-by":"publisher","first-page":"330","DOI":"10.1016\/j.fsigen.2011.07.009","volume":"6","author":"S Walsh","year":"2012","unstructured":"Walsh S, Wollstein A, Liu F, Chakravarthy U, Rahu M, Seland JH, Soubrane G, Tomazzoli L, Topouzis F, Vingerling JR, Vioque J, Fletcher AE, Ballantyne KN, Kayser M. DNA-based eye colour prediction across europe with the IrisPlex system. Forensic Sci Int Genet. 2012;6(3):330\u201340. https:\/\/doi.org\/10.1016\/j.fsigen.2011.07.009.","journal-title":"Forensic Sci Int Genet"},{"issue":"1","key":"4077_CR23","doi-asserted-by":"publisher","first-page":"65","DOI":"10.1186\/s41935-020-00200-8","volume":"10","author":"NAM Al-Rashedi","year":"2020","unstructured":"Al-Rashedi NAM, Mandal AM, Alobaidi LA. Eye color prediction using the IrisPlex system: a limited pilot study in the Iraqi population. Egypt J Forensic Sci. 2020;10(1):65. https:\/\/doi.org\/10.1186\/s41935-020-00200-8.","journal-title":"Egypt J Forensic Sci"},{"issue":"4","key":"4077_CR24","doi-asserted-by":"publisher","first-page":"444","DOI":"10.1016\/j.fsigen.2013.03.005","volume":"7","author":"JS Allwood","year":"2013","unstructured":"Allwood JS, Harbison S. SNP model development for the prediction of eye colour in New Zealand. Forensic Sci Int Genet. 2013;7(4):444\u201352. 
https:\/\/doi.org\/10.1016\/j.fsigen.2013.03.005.","journal-title":"Forensic Sci Int Genet"},{"key":"4077_CR25","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1016\/j.fsigen.2013.12.003","volume":"9","author":"GM Dembinski","year":"2014","unstructured":"Dembinski GM, Picard CJ. Evaluation of the IrisPlex DNA-based eye color prediction assay in a United States population. Forensic Sci Int Genet. 2014;9:111\u20137. https:\/\/doi.org\/10.1016\/j.fsigen.2013.12.003.","journal-title":"Forensic Sci Int Genet"},{"issue":"1","key":"4077_CR26","doi-asserted-by":"publisher","first-page":"107","DOI":"10.2991\/jegh.k.191028.001","volume":"10","author":"MAB Khan","year":"2019","unstructured":"Khan MAB, Hashim MJ, King JK, Govender RD, Mustafa H, Kaabi JA. Epidemiology of type 2 diabetes\u2014global Burden of disease and forecasted trends. J Epidemiol Global Health. 2019;10(1):107. https:\/\/doi.org\/10.2991\/jegh.k.191028.001.","journal-title":"J Epidemiol Global Health"},{"key":"4077_CR27","doi-asserted-by":"publisher","first-page":"32","DOI":"10.1002\/dmrr.2352","volume":"28","author":"Y Bi","year":"2012","unstructured":"Bi Y, Wang T, Xu M, Xu Y, Li M, Lu J, Zhu X, Ning G. Advanced research on risk factors of type 2 diabetes. Diabetes Metab Res Rev. 2012;28:32\u20139. https:\/\/doi.org\/10.1002\/dmrr.2352.","journal-title":"Diabetes Metab Res Rev"},{"key":"4077_CR28","doi-asserted-by":"publisher","first-page":"706","DOI":"10.1016\/j.procs.2020.03.336","volume":"167","author":"NP Tigga","year":"2020","unstructured":"Tigga NP, Garg S. Prediction of type 2 diabetes using machine learning classification methods. Procedia Comput Sci. 2020;167:706\u201316. https:\/\/doi.org\/10.1016\/j.procs.2020.03.336.","journal-title":"Procedia Comput Sci"},{"key":"4077_CR29","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-017-17433-8","author":"Y Wang","year":"2017","unstructured":"Wang Y, Liu S, Chen R, Chen Z, Yuan J, Li Q. 
A novel classification indicator of type 1 and type 2 diabetes in china. Sci Rep. 2017;. https:\/\/doi.org\/10.1038\/s41598-017-17433-8.","journal-title":"Sci Rep"},{"issue":"4","key":"4077_CR30","doi-asserted-by":"publisher","first-page":"248","DOI":"10.4258\/hir.2019.25.4.248","volume":"25","author":"S Abhari","year":"2019","unstructured":"Abhari S, Kalhori SRN, Ebrahimi M, Hasannejadasl H, Garavand A. Artificial intelligence applications in type 2 diabetes mellitus care: focus on machine learning methods. Healthc Inform Res. 2019;25(4):248. https:\/\/doi.org\/10.4258\/hir.2019.25.4.248.","journal-title":"Healthc Inform Res"},{"issue":"1","key":"4077_CR31","doi-asserted-by":"publisher","first-page":"26","DOI":"10.1186\/1471-2156-11-26","volume":"11","author":"H-J Ban","year":"2010","unstructured":"Ban H-J, Heo JY, Oh K-S, Park K-J. Identification of type 2 diabetes-associated combination of SNPs using support vector machine. BMC Genet. 2010;11(1):26. https:\/\/doi.org\/10.1186\/1471-2156-11-26.","journal-title":"BMC Genet"},{"key":"4077_CR32","unstructured":"openSNP. https:\/\/opensnp.org\/."},{"key":"4077_CR33","doi-asserted-by":"publisher","DOI":"10.7555\/jbr.29.20140007","author":"P Zeng","year":"2015","unstructured":"Zeng P, et al. Statistical analysis for genome-wide association study. J Biomed Res. 2015;. https:\/\/doi.org\/10.7555\/jbr.29.20140007.","journal-title":"J Biomed Res"},{"issue":"5","key":"4077_CR34","doi-asserted-by":"publisher","first-page":"356","DOI":"10.1038\/nrg2344","volume":"9","author":"MI McCarthy","year":"2008","unstructured":"McCarthy MI, Abecasis GR, Cardon LR, Goldstein DB, Little J, Ioannidis JPA, Hirschhorn JN. Genome-wide association studies for complex traits: consensus, uncertainty and challenges. Nat Rev Genet. 2008;9(5):356\u201369. 
https:\/\/doi.org\/10.1038\/nrg2344.","journal-title":"Nat Rev Genet"},{"issue":"11","key":"4077_CR35","doi-asserted-by":"publisher","first-page":"1243","DOI":"10.1038\/ng1653","volume":"37","author":"DG Clayton","year":"2005","unstructured":"Clayton DG, Walker NM, Smyth DJ, Pask R, Cooper JD, Maier LM, Smink LJ, Lam AC, Ovington NR, Stevens HE, Nutland S, Howson JMM, Faham M, Moorhead M, Jones HB, Falkowski M, Hardenbol P, Willis TD, Todd JA. Population structure, differential bias and genomic control in a large-scale, case-control association study. Nat Genet. 2005;37(11):1243\u20136. https:\/\/doi.org\/10.1038\/ng1653.","journal-title":"Nat Genet"},{"key":"4077_CR36","doi-asserted-by":"publisher","unstructured":"Jabbar HK, Khan RZ. Methods to avoid over-fitting and under-fitting in supervised machine learning (comparative study). In: Computer science, communication and instrumentation devices. Research Publishing Services. . p. 163\u201372. 2014. https:\/\/doi.org\/10.3850\/978-981-09-5247-1_017.","DOI":"10.3850\/978-981-09-5247-1_017"},{"issue":"12","key":"4077_CR37","doi-asserted-by":"publisher","first-page":"1046","DOI":"10.1097\/meg.0b013e3282f198a0","volume":"19","author":"E Grossi","year":"2007","unstructured":"Grossi E, Buscema M. Introduction to artificial neural networks. Eur J Gastroenterol Hepatol. 2007;19(12):1046\u201354. https:\/\/doi.org\/10.1097\/meg.0b013e3282f198a0.","journal-title":"Eur J Gastroenterol Hepatol"},{"key":"4077_CR38","doi-asserted-by":"publisher","unstructured":"Ma W, Qiu Z, Song J, Cheng Q, Ma C. DeepGS: Predicting phenotypes from genotypes using deep learning. 2017. https:\/\/doi.org\/10.1101\/241414.","DOI":"10.1101\/241414"},{"issue":"S1","key":"4077_CR39","doi-asserted-by":"publisher","first-page":"51","DOI":"10.1002\/gepi.20473","volume":"33","author":"S Szymczak","year":"2009","unstructured":"Szymczak S, Biernacka JM, Cordell HJ, Gonz\u00e1lez-Recio O, K\u00f6nig IR, Zhang H, Sun YV. 
Machine learning in genome-wide association studies. Genet Epidemiol. 2009;33(S1):51\u20137. https:\/\/doi.org\/10.1002\/gepi.20473.","journal-title":"Genet Epidemiol"},{"key":"4077_CR40","doi-asserted-by":"publisher","DOI":"10.3389\/fgene.2019.00214","author":"B Tang","year":"2019","unstructured":"Tang B, Pan Z, Yin K, Khateeb A. Recent advances of deep learning in bioinformatics and computational biology. Front Genet. 2019;. https:\/\/doi.org\/10.3389\/fgene.2019.00214.","journal-title":"Front Genet"},{"issue":"03","key":"4077_CR41","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1117\/1.jrs.14.034503","volume":"14","author":"M Khoshboresh-Masouleh","year":"2020","unstructured":"Khoshboresh-Masouleh M, Alidoost F, Arefi H. Multiscale building segmentation based on deep learning for remote sensing RGB images from different sensors. J Appl Remote Sens. 2020;14(03):1. https:\/\/doi.org\/10.1117\/1.jrs.14.034503.","journal-title":"J Appl Remote Sens"},{"issue":"04","key":"4077_CR42","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1117\/1.jrs.12.046018","volume":"12","author":"MK Masouleh","year":"2018","unstructured":"Masouleh MK, Shah-Hosseini R. Fusion of deep learning with adaptive bilateral filter for building outline extraction from remote sensing imagery. J Appl Remote Sens. 2018;12(04):1. https:\/\/doi.org\/10.1117\/1.jrs.12.046018.","journal-title":"J Appl Remote Sens"},{"key":"4077_CR43","doi-asserted-by":"publisher","first-page":"111","DOI":"10.1016\/j.inffus.2020.09.006","volume":"66","author":"F Piccialli","year":"2021","unstructured":"Piccialli F, Somma VD, Giampaolo F, Cuomo S, Fortino G. A survey on deep learning in medicine: why, how and when? Inf Fusion. 2021;66:111\u201337. 
https:\/\/doi.org\/10.1016\/j.inffus.2020.09.006.","journal-title":"Inf Fusion"},{"issue":"02","key":"4077_CR44","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1117\/1.jrs.13.024508","volume":"13","author":"MK Masouleh","year":"2019","unstructured":"Masouleh MK, Sadeghian S. Deep learning-based method for reconstructing three-dimensional building cadastre models from aerial images. J Appl Remote Sens. 2019;13(02):1. https:\/\/doi.org\/10.1117\/1.jrs.13.024508.","journal-title":"J Appl Remote Sens"},{"issue":"5","key":"4077_CR45","doi-asserted-by":"publisher","first-page":"1307","DOI":"10.1007\/s00425-018-2976-9","volume":"248","author":"W Ma","year":"2018","unstructured":"Ma W, Qiu Z, Song J, Li J, Cheng Q, Zhai J, Ma C. A deep convolutional neural network approach for predicting phenotypes from genotypes. Planta. 2018;248(5):1307\u201318. https:\/\/doi.org\/10.1007\/s00425-018-2976-9.","journal-title":"Planta"},{"key":"4077_CR46","doi-asserted-by":"publisher","first-page":"132306","DOI":"10.1016\/j.physd.2019.132306","volume":"404","author":"A Sherstinsky","year":"2020","unstructured":"Sherstinsky A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Physica D. 2020;404:132306. https:\/\/doi.org\/10.1016\/j.physd.2019.132306.","journal-title":"Physica D"},{"key":"4077_CR47","doi-asserted-by":"publisher","first-page":"157","DOI":"10.1007\/978-1-4419-9326-7_5","volume-title":"Ensemble machine learning","author":"A Cutler","year":"2012","unstructured":"Cutler A, Cutler DR, Stevens JR. Random forests. In: Zhang C, Ma Y, editors. Ensemble machine learning. Boston: Springer; 2012. p. 157\u201375. https:\/\/doi.org\/10.1007\/978-1-4419-9326-7_5."},{"issue":"4","key":"4077_CR48","doi-asserted-by":"publisher","first-page":"755","DOI":"10.1111\/1755-0998.12773","volume":"18","author":"MSO Brieuc","year":"2018","unstructured":"Brieuc MSO, Waters CD, Drinan DP, Naish KA. 
A practical introduction to random forest for genetic association studies in ecology and evolution. Mol Ecol Resour. 2018;18(4):755\u201366. https:\/\/doi.org\/10.1111\/1755-0998.12773.","journal-title":"Mol Ecol Resour"},{"issue":"1","key":"4077_CR49","doi-asserted-by":"publisher","first-page":"68","DOI":"10.1186\/1471-2180-13-68","volume":"13","author":"JR Bayjanov","year":"2013","unstructured":"Bayjanov JR, Starrenburg MJ, van der Sijde MR, Siezen RJ, van Hijum SA. Genotype-phenotype matching analysis of 38 lactococcus lactis strains using random forest methods. BMC Microbiol. 2013;13(1):68. https:\/\/doi.org\/10.1186\/1471-2180-13-68.","journal-title":"BMC Microbiol"},{"key":"4077_CR50","doi-asserted-by":"publisher","DOI":"10.1038\/s41598-018-31573-5","author":"H Behravan","year":"2018","unstructured":"Behravan H, Hartikainen JM, Tengstr\u00f6m M, Pylk\u00e4s K, Winqvist R, Kosma V, Mannermaa A. Machine learning identifies interacting genetic variants contributing to breast cancer risk: a case study in finnish cases and controls. Sci Rep. 2018;. https:\/\/doi.org\/10.1038\/s41598-018-31573-5.","journal-title":"Sci Rep"},{"key":"4077_CR51","doi-asserted-by":"publisher","first-page":"3","DOI":"10.1007\/3-540-45808-5_1","volume-title":"Neural nets","author":"G Valentini","year":"2002","unstructured":"Valentini G, Masulli F. Ensembles of learning machines. In: Goos G, Hartmanis J, van Leeuwen J, Marinaro M, Tagliaferri R, editors. Neural nets, vol. 2486. Berlin: Springer; 2002. p. 3\u201320. https:\/\/doi.org\/10.1007\/3-540-45808-5_1."},{"key":"4077_CR52","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1016\/j.inffus.2018.11.008","volume":"52","author":"V Bol\u00f3n-Canedo","year":"2019","unstructured":"Bol\u00f3n-Canedo V, Alonso-Betanzos A. Ensembles for feature selection: a review and future trends. Inf Fusion. 2019;52:1\u201312. 
https:\/\/doi.org\/10.1016\/j.inffus.2018.11.008.","journal-title":"Inf Fusion"},{"issue":"6","key":"4077_CR53","doi-asserted-by":"publisher","first-page":"1141","DOI":"10.1016\/j.kint.2020.02.028","volume":"97","author":"RSG Sealfon","year":"2020","unstructured":"Sealfon RSG, Mariani LH, Kretzler M, Troyanskaya OG. Machine learning, the kidney, and genotype-phenotype analysis. Kidney Int. 2020;97(6):1141\u20139. https:\/\/doi.org\/10.1016\/j.kint.2020.02.028.","journal-title":"Kidney Int"},{"key":"4077_CR54","doi-asserted-by":"publisher","unstructured":"International Inflammatory Bowel Disease Genetics Consortium (IIBDGC), Romagnoni A, J\u00e9gou S, Van Steen K, Wainrib G, Hugot J-P. Comparative performances of machine learning methods for classifying Crohn Disease patients using genome-wide genotyping data. Sci Rep. 2019;9(1):10351. https:\/\/doi.org\/10.1038\/s41598-019-46649-z. Accessed 1 Feb 2021.","DOI":"10.1038\/s41598-019-46649-z"},{"key":"4077_CR55","doi-asserted-by":"publisher","unstructured":"Chen T, Guestrin C. XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. ACM, San Francisco California USA. p. 785\u201394. 2016. https:\/\/doi.org\/10.1145\/2939672.2939785.","DOI":"10.1145\/2939672.2939785"},{"key":"4077_CR56","first-page":"631","volume-title":"Encyclopedia of machine learning","author":"GI Webb","year":"2011","unstructured":"Webb GI, Sammut C, Perlich C, Horv\u00e1th T, Wrobel S, Korb KB, Noble WS, Leslie C, Lagoudakis MG, Quadrianto N, Buntine WL, Quadrianto N, Buntine WL, Getoor L, Namata G, Getoor L, Jiawei Han XJ, Ting J-A, Vijayakumar S, Schaal S. Logistic regression. In: Sammut C, Webb GI, editors. Encyclopedia of machine learning. Boston: Springer; 2011. p. 
631."}],"updated-by":[{"DOI":"10.1186\/s12859-021-04218-0","type":"correction","label":"Correction","source":"publisher","updated":{"date-parts":[[2021,6,11]],"date-time":"2021-06-11T00:00:00Z","timestamp":1623369600000}}],"container-title":["BMC Bioinformatics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04077-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s12859-021-04077-9\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s12859-021-04077-9.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2021,7,1]],"date-time":"2021-07-01T11:03:35Z","timestamp":1625137415000},"score":1,"resource":{"primary":{"URL":"https:\/\/bmcbioinformatics.biomedcentral.com\/articles\/10.1186\/s12859-021-04077-9"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,4,19]]},"references-count":56,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2021,12]]}},"alternative-id":["4077"],"URL":"https:\/\/doi.org\/10.1186\/s12859-021-04077-9","relation":{"has-preprint":[{"id-type":"doi","id":"10.21203\/rs.3.rs-125397\/v1","asserted-by":"object"}]},"ISSN":["1471-2105"],"issn-type":[{"value":"1471-2105","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,4,19]]},"assertion":[{"value":"9 December 2020","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"3 March 2021","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"19 April 2021","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article 
History"}},{"value":"11 June 2021","order":4,"name":"change_date","label":"Change Date","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"Correction","order":5,"name":"change_type","label":"Change Type","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"A Correction to this paper has been published:","order":6,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"https:\/\/doi.org\/10.1186\/s12859-021-04218-0","URL":"https:\/\/doi.org\/10.1186\/s12859-021-04218-0","order":7,"name":"change_details","label":"Change Details","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"All data considered for this study are from a publicly available source, for which no approval is required.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"198"}}