{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,14]],"date-time":"2026-04-14T16:42:00Z","timestamp":1776184920516,"version":"3.50.1"},"reference-count":67,"publisher":"Emerald","issue":"1","license":[{"start":{"date-parts":[[2023,5,3]],"date-time":"2023-05-03T00:00:00Z","timestamp":1683072000000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/www.emerald.com\/insight\/site-policies"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["DTA"],"published-print":{"date-parts":[[2024,1,29]]},"abstract":"<jats:sec><jats:title content-type=\"abstract-subheading\">Purpose<\/jats:title><jats:p>Ovarian cancer (OC) is the most common type of gynecologic cancer in the world with a high rate of mortality. Due to manifestation of generic symptoms and absence of specific biomarkers, OC is usually diagnosed at a late stage. Machine learning models can be employed to predict driver genes implicated in causative mutations.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Design\/methodology\/approach<\/jats:title><jats:p>In the present study, a comprehensive next generation sequencing (NGS) analysis of whole exome sequences of 47 OC patients was carried out to identify clinically significant mutations. Nine functional features of 708 mutations identified were input into a machine learning classification model by employing the eXtreme Gradient Boosting (XGBoost) classifier method for prediction of OC driver genes.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Findings<\/jats:title><jats:p>The XGBoost classifier model yielded a classification accuracy of 0.946, which was superior to that obtained by other classifiers such as decision tree, Naive Bayes, random forest and support vector machine. Further, an interaction network was generated to identify and establish correlations with cancer-associated pathways and gene ontology data.<\/jats:p><\/jats:sec><jats:sec><jats:title content-type=\"abstract-subheading\">Originality\/value<\/jats:title><jats:p>The final results revealed 12 putative candidate cancer driver genes, namely LAMA3, LAMC3, COL6A1, COL5A1, COL2A1, UGT1A1, BDNF, ANK1, WNT10A, FZD4, PLEKHG5 and CYP2C9, that may have implications in clinical diagnosis.<\/jats:p><\/jats:sec>","DOI":"10.1108\/dta-03-2022-0096","type":"journal-article","created":{"date-parts":[[2023,5,3]],"date-time":"2023-05-03T02:28:03Z","timestamp":1683080883000},"page":"62-80","source":"Crossref","is-referenced-by-count":6,"title":["Machine learning approaches for prediction of ovarian cancer driver genes from mutational and network analysis"],"prefix":"10.1108","volume":"58","author":[{"ORCID":"https:\/\/orcid.org\/0000-0003-2781-2095","authenticated-orcid":false,"given":"Rucha","family":"Wadapurkar","sequence":"first","affiliation":[]},{"given":"Sanket","family":"Bapat","sequence":"additional","affiliation":[]},{"given":"Rupali","family":"Mahajan","sequence":"additional","affiliation":[]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6530-3238","authenticated-orcid":false,"given":"Renu","family":"Vyas","sequence":"additional","affiliation":[]}],"member":"140","published-online":{"date-parts":[[2023,5,3]]},"reference":[{"issue":"10","key":"key2024012913143197100_ref001","doi-asserted-by":"crossref","first-page":"2131","DOI":"10.1021\/acs.jcim.8b00414","article-title":"Machine learning classification and structure-functional analysis of cancer mutations reveal unique dynamic and network signatures of driver sites in oncogenes and tumor suppressor genes","volume":"58","year":"2018","journal-title":"Journal of Chemical Information and Modeling"},{"key":"key2024012913143197100_ref002","article-title":"Ovarian Cancer","author":"American Cancer Society","year":"2016"},{"key":"key2024012913143197100_ref003","doi-asserted-by":"crossref","unstructured":"Bartz-Beielstein, T., Chandrasekaran, S. and Rehbach, F. (2023), \u201cCase study II: tuning of gradient boosting (xgboost)\u201d, in IDE+A: Institute for Data Science, Engineering, and Analytics (Ed.), Hyperparameter Tuning for Machine and Deep Learning with R: A Practical Guide, Springer Nature Singapore, Singapore, pp. 221-234.","DOI":"10.1007\/978-981-19-5170-1_9"},{"issue":"7","key":"key2024012913143197100_ref004","article-title":"Patient-specific driver gene prediction and risk assessment through integrated network analysis of cancer omics profiles","volume":"43","year":"2015","journal-title":"Nucleic Acids Research"},{"issue":"5","key":"key2024012913143197100_ref005","doi-asserted-by":"crossref","first-page":"401","DOI":"10.1158\/2159-8290.CD-12-0095","article-title":"The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data","volume":"2","year":"2012","journal-title":"Cancer Discovery"},{"issue":"7","key":"key2024012913143197100_ref006","doi-asserted-by":"crossref","first-page":"433","DOI":"10.1136\/jmedgenet-2012-100918","article-title":"wANNOVAR: annotating genetic variants for personal genomes via the web","volume":"49","year":"2012","journal-title":"Journal of Medical Genetics"},{"key":"key2024012913143197100_ref007","first-page":"491502","article-title":"Classification of cancer primary sites using machine learning and somatic mutations","volume":"2015","year":"2015","journal-title":"BioMed Research International"},{"key":"key2024012913143197100_ref008","doi-asserted-by":"crossref","first-page":"642","DOI":"10.1093\/bib\/bbv068","article-title":"Advances in computational approaches for prioritizing driver mutations and significantly mutated genes in cancer genomes","volume":"17","year":"2016","journal-title":"Briefings in Bioinformatics"},{"key":"key2024012913143197100_ref009","doi-asserted-by":"crossref","first-page":"7","DOI":"10.1186\/1477-7827-1-7","article-title":"Mitogen-activated protein kinases in normal and (pre)neoplastic ovarian surface epithelium","volume":"1","year":"2003","journal-title":"Reproductive Biology and Endocrinology"},{"issue":"16","key":"key2024012913143197100_ref010","doi-asserted-by":"crossref","first-page":"2745","DOI":"10.1093\/bioinformatics\/btv195","article-title":"PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels","volume":"31","year":"2015","journal-title":"Bioinformatics"},{"issue":"1","key":"key2024012913143197100_ref011","doi-asserted-by":"crossref","first-page":"21","DOI":"10.1186\/s12920-019-0652-y","article-title":"Germline variants in DNA repair genes associated with hereditary breast and ovarian cancer syndrome: analysis of a 21 gene panel in the Brazilian population","volume":"13","year":"2020","journal-title":"BMC Medical Genomics"},{"key":"key2024012913143197100_ref012","doi-asserted-by":"crossref","first-page":"151","DOI":"10.12688\/f1000research.4492.2","article-title":"Cytoscape: the network visualization tool for GenomeSpace workflows","volume":"3","year":"2014","journal-title":"F1000Research"},{"issue":"3","key":"key2024012913143197100_ref013","doi-asserted-by":"crossref","first-page":"663","DOI":"10.1007\/s11517-021-02476-x","article-title":"Hybrid gene selection approach using XGBoost and multi-objective genetic algorithm for cancer classification","volume":"60","year":"2022","journal-title":"Medical & Biological Engineering & Computing"},{"key":"key2024012913143197100_ref014","doi-asserted-by":"crossref","first-page":"556","DOI":"10.1038\/nrg3767","article-title":"Expanding the computational toolbox for mining cancer genomes","volume":"15","year":"2014","journal-title":"Nature Reviews Genetics"},{"key":"key2024012913143197100_ref015","doi-asserted-by":"crossref","first-page":"103","DOI":"10.1023\/A:1007413511361","article-title":"On the optimality of the simple Bayesian classifier under zero-one loss","volume":"29","year":"1997","journal-title":"Machine Learning"},{"issue":"8","key":"key2024012913143197100_ref016","doi-asserted-by":"crossref","first-page":"2125","DOI":"10.1093\/hmg\/ddu733","article-title":"Comparison and integration of deleteriousness prediction methods for nonsynonymous SNVs in whole exome sequencing studies","volume":"24","year":"2015","journal-title":"Human Molecular Genetics"},{"key":"key2024012913143197100_ref017","first-page":"905951","article-title":"Identification and analysis of driver missense mutations using rotation forest with feature selection","volume":"2014","year":"2014","journal-title":"BioMed Research International"},{"issue":"1","key":"key2024012913143197100_ref018","doi-asserted-by":"crossref","first-page":"169","DOI":"10.1159\/000493966","article-title":"The profile of genetic mutations in papillary thyroid cancer detected by whole exome sequencing","volume":"50","year":"2018","journal-title":"Cellular Physiology and Biochemistry"},{"issue":"1","key":"key2024012913143197100_ref019","doi-asserted-by":"crossref","first-page":"50","DOI":"10.1186\/s13048-018-0424-x","article-title":"DNA damage repair in ovarian cancer: unlocking the heterogeneity","volume":"11","year":"2018","journal-title":"Journal of Ovarian Research"},{"issue":"11","key":"key2024012913143197100_ref020","doi-asserted-by":"crossref","first-page":"1081","DOI":"10.1038\/nmeth.2642","article-title":"IntOGen-mutations identifies cancer drivers across tumor types","volume":"10","year":"2013","journal-title":"Nature Methods"},{"issue":"97","key":"key2024012913143197100_ref021","first-page":"163","article-title":"Probability and the weighing of evidence","volume":"26","year":"1951","journal-title":"Philosophy, the Royal Institute of Philosophy"},{"key":"key2024012913143197100_ref022","doi-asserted-by":"crossref","first-page":"153","DOI":"10.1038\/nature05610","article-title":"Patterns of somatic mutation in human cancer genomes","volume":"446","year":"2007","journal-title":"Nature"},{"issue":"Suppl_1","key":"key2024012913143197100_ref023","first-page":"i508","article-title":"Prediction of cancer driver genes through network-based moment propagation of mutation scores","volume":"36","year":"2020","journal-title":"Bioinformatics"},{"key":"key2024012913143197100_ref024","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1038\/446145a","article-title":"Cancer: drivers and passengers","volume":"446","year":"2007","journal-title":"Nature"},{"key":"key2024012913143197100_ref025","volume-title":"The Elements of Statistical Learning, Data Mining, Inference, and Prediction","year":"2001"},{"key":"key2024012913143197100_ref026","first-page":"7983236","article-title":"A survey of computational tools to analyze and interpret whole exome sequencing data","volume":"2016","year":"2016","journal-title":"International Journal of Genomics"},{"issue":"5","key":"key2024012913143197100_ref027","first-page":"560","article-title":"The classification of the applicable machine learning methods in robot manipulators","volume":"2","year":"2012","journal-title":"International Journal of Machine Learning and Computing"},{"key":"key2024012913143197100_ref028","doi-asserted-by":"crossref","first-page":"113","DOI":"10.1016\/j.jare.2020.11.006","article-title":"A risk prediction model of gene signatures in ovarian cancer through bagging of GA-XGBoost models","volume":"30","year":"2021","journal-title":"Journal of Advanced Research"},{"key":"key2024012913143197100_ref029","doi-asserted-by":"crossref","first-page":"203","DOI":"10.1007\/978-0-387-98094-2_10","article-title":"Activated epidermal growth factor receptor in ovarian cancer","volume":"149","year":"2009","journal-title":"Cancer Treatment and Research"},{"issue":"2","key":"key2024012913143197100_ref030","doi-asserted-by":"crossref","first-page":"283","DOI":"10.1016\/j.cell.2018.03.042","article-title":"The cancer genome atlas: creating lasting value beyond Its Data","volume":"173","year":"2018","journal-title":"Cell"},{"issue":"1","key":"key2024012913143197100_ref031","doi-asserted-by":"crossref","first-page":"12394","DOI":"10.1038\/s41598-018-30261-8","article-title":"Inflammation is a key contributor to ovarian cancer cell seeding","volume":"8","year":"2018","journal-title":"Scientific Reports"},{"issue":"17","key":"key2024012913143197100_ref032","doi-asserted-by":"crossref","first-page":"2283","DOI":"10.1093\/bioinformatics\/btp373","article-title":"VarScan: variant detection in massively parallel sequencing of individual and pooled samples","volume":"25","year":"2009","journal-title":"Bioinformatics"},{"key":"key2024012913143197100_ref033","first-page":"249","article-title":"Supervised machine learning: a review of classification techniques","volume":"31","year":"2007","journal-title":"Informatica"},{"issue":"D1","key":"key2024012913143197100_ref034","doi-asserted-by":"crossref","first-page":"D1062","DOI":"10.1093\/nar\/gkx1153","article-title":"ClinVar: improving access to variant interpretations and supporting evidence","volume":"46","year":"2018","journal-title":"Nucleic Acids Research"},{"key":"key2024012913143197100_ref035","first-page":"D19","article-title":"International nucleotide sequence database collaboration. the sequence read archive","volume":"39","year":"2011","journal-title":"Nucleic Acids Research"},{"issue":"5","key":"key2024012913143197100_ref036","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1093\/bioinformatics\/btp698","article-title":"Fast and accurate long-read alignment with burrows-wheeler transform","volume":"26","year":"2010","journal-title":"Bioinformatics"},{"issue":"D1","key":"key2024012913143197100_ref037","first-page":"D863","article-title":"DriverDBv3: a multi-omics database for cancer driver gene research","volume":"48","year":"2020","journal-title":"Nucleic Acids Research"},{"issue":"8","key":"key2024012913143197100_ref038","doi-asserted-by":"crossref","first-page":"894","DOI":"10.1002\/humu.21517","article-title":"dbNSFP: a lightweight database of human nonsynonymous SNPs and their functional predictions","volume":"32","year":"2011","journal-title":"Human Mutation"},{"key":"key2024012913143197100_ref039","doi-asserted-by":"crossref","first-page":"10204","DOI":"10.1038\/srep10204","article-title":"Evaluation and integration of cancer gene classifiers: identification and ranking of plausible drivers","volume":"5","year":"2015","journal-title":"Scientific Reports"},{"issue":"Suppl 1","key":"key2024012913143197100_ref040","first-page":"S81","article-title":"Applications of machine learning and data mining methods to detect associations of rare and common variants with complex traits","volume":"38","year":"2014","journal-title":"Genetic Epidemiology"},{"issue":"1","key":"key2024012913143197100_ref041","doi-asserted-by":"crossref","first-page":"16188","DOI":"10.1038\/s41598-017-16286-5","article-title":"Driver pattern identification over the gene co-expression of drug response in ovarian cancer by integrating high throughput genomics data","volume":"7","year":"2017","journal-title":"Scientific Reports"},{"key":"key2024012913143197100_ref042","doi-asserted-by":"crossref","first-page":"13","DOI":"10.3389\/fgene.2019.00013","article-title":"deepDriver: predicting cancer driver genes based on somatic mutations using deep convolutional neural networks","volume":"10","year":"2019","journal-title":"Frontiers in Genetics"},{"key":"key2024012913143197100_ref043","doi-asserted-by":"crossref","first-page":"287","DOI":"10.2147\/IJWH.S197604","article-title":"Ovarian cancer in the world: epidemiology and risk factors","volume":"11","year":"2019","journal-title":"International Journal of Women's Health"},{"issue":"1","key":"key2024012913143197100_ref044","doi-asserted-by":"crossref","first-page":"638","DOI":"10.1186\/s12864-016-2942-5","article-title":"Identifying candidate drivers of drug response in heterogeneous cancer by mining high throughput genomics data","volume":"17","year":"2016","journal-title":"BMC Genomics"},{"issue":"9","key":"key2024012913143197100_ref045","doi-asserted-by":"crossref","first-page":"11705","DOI":"10.3390\/ijms130911705","article-title":"Mechanisms of ovarian cancer metastasis: biochemical pathways","volume":"13","year":"2012","journal-title":"International Journal of Molecular Sciences"},{"issue":"3","key":"key2024012913143197100_ref046","doi-asserted-by":"crossref","first-page":"128","DOI":"10.14445\/22312803\/IJCTT-V48P126","article-title":"Supervised machine learning algorithms: classification and comparison","volume":"48","year":"2017","journal-title":"International Journal of Computer Trends and Technology"},{"issue":"2","key":"key2024012913143197100_ref047","doi-asserted-by":"crossref","first-page":"334","DOI":"10.1038\/sj.bjc.6602315","article-title":"Lack of EGF receptor contributes to drug sensitivity of human germline cells","volume":"92","year":"2005","journal-title":"Journal of Cancer"},{"issue":"2","key":"key2024012913143197100_ref048","first-page":"101","article-title":"Tyrosine kinase \u2013 role and significance in cancer","volume":"1","year":"2004","journal-title":"International Journal of Medical Sciences"},{"key":"key2024012913143197100_ref049","article-title":"Potential consequences on protein level and using prediction tools","volume-title":"Variant effect predictor training course","year":"2018"},{"key":"key2024012913143197100_ref050","doi-asserted-by":"crossref","first-page":"5","DOI":"10.1186\/gm524","article-title":"Identifying driver mutations in sequenced cancer genomes: computational approaches to enable precision medicine","volume":"6","year":"2014","journal-title":"Genome Medicine"},{"issue":"1","key":"key2024012913143197100_ref051","first-page":"15","article-title":"Ovarian cancer screening and early detection in the general population","volume":"4","year":"2011","journal-title":"Reviews in Obstetrics and Gynecology"},{"issue":"11","key":"key2024012913143197100_ref052","article-title":"A new molecular signature method for prediction of driver cancer pathways from transcriptional data","volume":"44","year":"2016","journal-title":"Nucleic Acids Research"},{"issue":"1","key":"key2024012913143197100_ref053","doi-asserted-by":"crossref","first-page":"17217","DOI":"10.1038\/s41598-020-74251-1","article-title":"A network pharmacology-based approach to explore potential targets of Caesalpinia pulcherima: an updated prototype in drug discovery","volume":"10","year":"2020","journal-title":"Scientific Reports"},{"issue":"2","key":"key2024012913143197100_ref054","doi-asserted-by":"crossref","first-page":"51","DOI":"10.1002\/gcc.22507","article-title":"Identification of somatic genetic alterations in ovarian clear cell carcinoma with next generation sequencing","volume":"57","year":"2018","journal-title":"Genes, Chromosomes & Cancer"},{"issue":"3","key":"key2024012913143197100_ref055","doi-asserted-by":"crossref","first-page":"347","DOI":"10.1002\/jcp.1041340305","article-title":"Serial propagation of human ovarian surface epithelium in tissue culture","volume":"134","year":"1988","journal-title":"Journal of Cellular Physiology"},{"issue":"6","key":"key2024012913143197100_ref056","first-page":"852","article-title":"Developing a web based system for breast cancer prediction using XGboost classifier","volume":"9","year":"2020","journal-title":"International Journal of Engineering Research & Technology"},{"issue":"D1","key":"key2024012913143197100_ref057","doi-asserted-by":"crossref","first-page":"D362","DOI":"10.1093\/nar\/gkw937","article-title":"The STRING database in 2017: quality-controlled protein-protein association networks, made broadly accessible","volume":"45","year":"2017","journal-title":"Nucleic Acids Research"},{"key":"key2024012913143197100_ref058","unstructured":"Tableau (c2017), \u201cMeet the Tableau desktop family\u201d, [Internet], Tableau, Seattle, WA, available at: https:\/\/public.tableau.com\/en-us\/s\/download (accessed 23 April 2023)."},{"issue":"1","key":"key2024012913143197100_ref059","doi-asserted-by":"crossref","first-page":"25","DOI":"10.1186\/s13073-018-0531-8","article-title":"Cancer Genome Interpreter annotates the biological and clinical relevance of tumor alterations","volume":"10","year":"2018","journal-title":"Genome Medicine"},{"issue":"5","key":"key2024012913143197100_ref060","doi-asserted-by":"crossref","first-page":"6","DOI":"10.3747\/co.v17i5.668","article-title":"Association of lipid metabolism with ovarian cancer","volume":"17","year":"2010","journal-title":"Current Oncology"},{"issue":"D1","key":"key2024012913143197100_ref061","doi-asserted-by":"crossref","first-page":"D941","DOI":"10.1093\/nar\/gky1015","article-title":"COSMIC: the catalogue of somatic mutations in cancer","volume":"47","year":"2019","journal-title":"Nucleic Acids Research"},{"key":"key2024012913143197100_ref062","first-page":"A68","article-title":"The cancer genome atlas (TCGA): an immeasurable source of knowledge","volume":"19","year":"2015","journal-title":"Contemporary Oncology (Pozn)"},{"issue":"1 Suppl 4","key":"key2024012913143197100_ref063","first-page":"S47","article-title":"The rationale for the combination of selective EGFR inhibitors with cytotoxic drugs and radiotherapy","volume":"22","year":"2007","journal-title":"The International Journal of Biological Markers"},{"issue":"1","key":"key2024012913143197100_ref064","doi-asserted-by":"crossref","first-page":"27","DOI":"10.1109\/TCBB.2016.2621042","article-title":"Application of genetic programming (GP) formalism for building disease predictive models from protein-protein interactions (PPI) data","volume":"15","year":"2018","journal-title":"IEEE\/ACM Transactions on Computational Biology and Bioinformatics"},{"issue":"5","key":"key2024012913143197100_ref065","doi-asserted-by":"crossref","first-page":"1159","DOI":"10.1007\/s00438-019-01569-5","article-title":"Network pharmacology exploration reveals the bioactive compounds and molecular mechanisms of Li-Ru-Kang against hyperplasia of mammary gland","volume":"294","year":"2019","journal-title":"Molecular Genetics and Genomics"},{"issue":"2","key":"key2024012913143197100_ref066","doi-asserted-by":"crossref","first-page":"258","DOI":"10.4236\/tel.2021.112019","article-title":"A study on forecasting the default risk of bond based on xgboost algorithm and over-sampling method","volume":"11","year":"2021","journal-title":"Theoretical Economics Letters"},{"key":"key2024012913143197100_ref067","doi-asserted-by":"publisher","first-page":"585029","DOI":"10.3389\/fgene.2020.585029","article-title":"A novel XGBoost method to identify cancer tissue-of-origin based on copy number variations","volume":"11","year":"2020","journal-title":"Frontiers in Genetics"}],"container-title":["Data Technologies and Applications"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-03-2022-0096\/full\/xml","content-type":"application\/xml","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/www.emerald.com\/insight\/content\/doi\/10.1108\/DTA-03-2022-0096\/full\/html","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,7,24]],"date-time":"2025-07-24T23:15:03Z","timestamp":1753398903000},"score":1,"resource":{"primary":{"URL":"http:\/\/www.emerald.com\/dta\/article\/58\/1\/62-80\/1221185"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,3]]},"references-count":67,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2023,5,3]]},"published-print":{"date-parts":[[2024,1,29]]}},"alternative-id":["10.1108\/DTA-03-2022-0096"],"URL":"https:\/\/doi.org\/10.1108\/dta-03-2022-0096","relation":{},"ISSN":["2514-9288","2514-9288"],"issn-type":[{"value":"2514-9288","type":"print"},{"value":"2514-9288","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,3]]}}}