{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,3]],"date-time":"2026-06-03T00:25:12Z","timestamp":1780446312686,"version":"3.54.1"},"reference-count":37,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,2,25]],"date-time":"2022-02-25T00:00:00Z","timestamp":1645747200000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,2,25]],"date-time":"2022-02-25T00:00:00Z","timestamp":1645747200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100004843","name":"Manipal University","doi-asserted-by":"publisher","id":[{"id":"10.13039\/501100004843","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100004320","name":"Philips","doi-asserted-by":"publisher","id":[{"id":"10.13039\/100004320","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["J Big Data"],"published-print":{"date-parts":[[2022,12]]},"abstract":"<jats:title>Abstract<\/jats:title><jats:sec>\n                <jats:title>Background<\/jats:title>\n                <jats:p>Radiotherapy is frequently used to treat head and neck squamous cell carcinomas (HNSCC). Because treatment outcomes are highly uncertain, there is a significant need for robust predictive tools to improve treatment decision-making and better understand HNSCC by recognizing hidden patterns in data. 
We conducted this study to identify if Machine Learning (ML) could accurately predict outcomes and identify new prognostic variables in HNSCC.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Method<\/jats:title>\n                <jats:p>Retrospective data of 311 HNSCC patients treated with radiotherapy between 2013 and 2018 at our center and having a follow-up of at least three months' duration were collected. Binary-classification prediction models were developed for: Choice of Initial Treatment, Residual disease, Locoregional Recurrence, Distant Recurrence, and Development of New Primary. Clinical data were pre-processed using Imputation, Feature selection, Minority Oversampling, and Feature scaling algorithms. A method to retain original characteristics of dataset in testing samples while performing minority oversampling is illustrated. The classification comparison was performed using Random Forest (RF), Kernel Support Vector Machine (KSVM), and XGBoost classification algorithms for each model.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Results<\/jats:title>\n                <jats:p>For the choice of the initial treatment model, the testing accuracy was 84.58% using RF. The distant recurrence, locoregional recurrence, new-primary, and residual models had a testing accuracy (using KSVM) of 95.12%, 77.55%, 98.61%, and 92.25%, respectively. 
The important clinical determinants were identified using Shapley values for each classification model, and the mean area under the curve (AUC) for the receiver operating characteristic curve was plotted.<\/jats:p>\n              <\/jats:sec><jats:sec>\n                <jats:title>Conclusion<\/jats:title>\n                <jats:p>ML was able to predict several clinically relevant outcomes, and with additional clinical validation, could facilitate recognition of novel prognostic factors in HNSCC.<\/jats:p>\n              <\/jats:sec>","DOI":"10.1186\/s40537-022-00578-3","type":"journal-article","created":{"date-parts":[[2022,2,25]],"date-time":"2022-02-25T16:03:11Z","timestamp":1645804991000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":16,"title":["Predicting clinical outcomes of radiotherapy for head and neck squamous cell carcinoma patients using machine learning algorithms"],"prefix":"10.1186","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-7940-220X","authenticated-orcid":false,"given":"Tarun","family":"Gangil","sequence":"first","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Amina Beevi","family":"Shahabuddin","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-5160-482X","authenticated-orcid":false,"given":"B.","family":"Dinesh 
Rao","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Krishnamoorthy","family":"Palanisamy","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Biswaroop","family":"Chakrabarti","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-7949-1708","authenticated-orcid":false,"given":"Krishna","family":"Sharan","sequence":"additional","affiliation":[],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"297","published-online":{"date-parts":[[2022,2,25]]},"reference":[{"issue":"6","key":"578_CR1","doi-asserted-by":"publisher","first-page":"394","DOI":"10.3322\/caac.21492","volume":"68","author":"F Bray","year":"2018","unstructured":"Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68(6):394\u2013424.","journal-title":"CA Cancer J Clin"},{"issue":"5","key":"578_CR2","doi-asserted-by":"publisher","first-page":"e266","DOI":"10.1016\/S1470-2045(17)30252-8","volume":"18","author":"JJ Caudell","year":"2017","unstructured":"Caudell JJ, Torres-Roca JF, Gillies RJ, Enderling H, Kim S, Rishi A, et al. The future of personalised radiotherapy for head and neck cancer. Lancet Oncol. 2017;18(5):e266\u201373. https:\/\/doi.org\/10.1016\/S1470-2045(17)30252-8.","journal-title":"Lancet Oncol"},{"issue":"13","key":"578_CR3","doi-asserted-by":"publisher","first-page":"1212","DOI":"10.1056\/NEJMp1606181","volume":"375","author":"Z Obermeyer","year":"2016","unstructured":"Obermeyer Z, Ziad MDD, Emanuel EJ. Predicting the Future - Big Data, Machine Learning, and Clinical Medicine. N Engl J Med. 
2016;375(13):1212\u20136.","journal-title":"N Engl J Med"},{"issue":"6","key":"578_CR4","doi-asserted-by":"publisher","first-page":"1095","DOI":"10.1016\/j.hoc.2019.08.003","volume":"33","author":"CR Deig","year":"2019","unstructured":"Deig CR, Kanwar A, Thompson RF. Artificial intelligence in radiation oncology. Hematol Oncol Clin North Am. 2019;33(6):1095\u2013104. https:\/\/doi.org\/10.1016\/j.hoc.2019.08.003.","journal-title":"Hematol Oncol Clin North Am"},{"issue":"4","key":"578_CR5","doi-asserted-by":"publisher","first-page":"378","DOI":"10.1111\/jop.13135","volume":"50","author":"H Alkhadar","year":"2021","unstructured":"Alkhadar H, Macluskey M, White S, Ellis I, Gardner A. Comparison of machine learning algorithms for the prediction of five-year survival in oral squamous cell carcinoma. J Oral Pathol Med. 2021;50(4):378\u201384.","journal-title":"J Oral Pathol Med"},{"issue":"10","key":"578_CR6","doi-asserted-by":"publisher","first-page":"977","DOI":"10.1111\/jop.13089","volume":"49","author":"CS Chu","year":"2020","unstructured":"Chu CS, Lee NP, Adeoye J, Thomson P, Choi SW. Machine learning and treatment outcome prediction for oral cancer. J Oral Pathol Med. 2020;49(10):977\u201385.","journal-title":"J Oral Pathol Med"},{"issue":"12","key":"578_CR7","doi-asserted-by":"publisher","first-page":"1115","DOI":"10.1001\/jamaoto.2019.0981","volume":"145","author":"OA Karadaghy","year":"2019","unstructured":"Karadaghy OA, Shew M, New J, Bur AM. Development and assessment of a machine learning model to help predict survival among patients with oral squamous cell carcinoma. JAMA Otolaryngol Head Neck Surg. 2019;145(12):1115\u201320.","journal-title":"JAMA Otolaryngol Head Neck Surg"},{"issue":"12","key":"578_CR8","doi-asserted-by":"publisher","first-page":"4770","DOI":"10.1016\/j.eswa.2013.02.032","volume":"40","author":"P Rosado","year":"2013","unstructured":"Rosado P, Lequerica-Fernandez P, Villallain L, Pena I, Sanchez-Lasheras F, De Vicente JC. 
Survival model in oral squamous cell carcinoma based on clinicopathological parameters, molecular markers and support vector machines. Expert Syst Appl. 2013;40(12):4770\u20136. https:\/\/doi.org\/10.1016\/j.eswa.2013.02.032.","journal-title":"Expert Syst Appl"},{"key":"578_CR9","doi-asserted-by":"publisher","first-page":"20","DOI":"10.1016\/j.oraloncology.2019.03.011","volume":"92","author":"AM Bur","year":"2019","unstructured":"Bur AM, Holcomb A, Goodwin S, Woodroof J, Karadaghy O, Shnayder Y, et al. Machine learning to predict occult nodal metastasis in early oral squamous cell carcinoma. Oral Oncol. 2019;92:20\u20135. https:\/\/doi.org\/10.1016\/j.oraloncology.2019.03.011.","journal-title":"Oral Oncol"},{"issue":"12","key":"578_CR10","doi-asserted-by":"publisher","first-page":"2208","DOI":"10.1016\/j.joms.2020.06.015","volume":"78","author":"J Shan","year":"2020","unstructured":"Shan J, Jiang R, Chen X, Zhong Y, Zhang W, Xie L, et al. Machine learning predicts lymph node metastasis in early-stage oral tongue squamous cell carcinoma. J Oral Maxillofac Surg. 2020;78(12):2208\u201318. https:\/\/doi.org\/10.1016\/j.joms.2020.06.015.","journal-title":"J Oral Maxillofac Surg"},{"key":"578_CR11","doi-asserted-by":"publisher","first-page":"104068","DOI":"10.1016\/j.ijmedinf.2019.104068","volume":"136","author":"RO Alabi","year":"2020","unstructured":"Alabi RO, Elmusrati M, Sawazaki-Calone I, Kowalski LP, Haglund C, Coletta RD, et al. Comparison of supervised machine learning classification techniques in prediction of locoregional recurrences in early oral tongue cancer. Int J Med Inform. 2020;136:104068. https:\/\/doi.org\/10.1016\/j.ijmedinf.2019.104068.","journal-title":"Int J Med Inform"},{"issue":"4","key":"578_CR12","doi-asserted-by":"publisher","first-page":"489","DOI":"10.1007\/s00428-019-02642-5","volume":"475","author":"RO Alabi","year":"2019","unstructured":"Alabi RO, Elmusrati M, Sawazaki-Calone I, Kowalski LP, Haglund C, Coletta RD, et al. 
Machine learning application for prediction of locoregional recurrences in early oral tongue cancer: a Web-based prognostic tool. Virchows Arch. 2019;475(4):489\u201397.","journal-title":"Virchows Arch"},{"key":"578_CR13","unstructured":"Mandal S, Gupta A, Chanu WP. Survival prediction of head and neck squamous cell carcinoma using machine learning models. 2021;1\u20138. Available from: http:\/\/arxiv.org\/abs\/2105.07390."},{"issue":"4","key":"578_CR14","doi-asserted-by":"publisher","first-page":"1193","DOI":"10.1109\/JBHI.2015.2450362","volume":"19","author":"J Andreu-Perez","year":"2015","unstructured":"Andreu-Perez J, Poon CCY, Merrifield RD, Wong STC, Yang GZ. Big data for health. IEEE J Biomed Heal Informatics. 2015;19(4):1193\u2013208.","journal-title":"IEEE J Biomed Heal Informatics"},{"issue":"12","key":"578_CR15","doi-asserted-by":"publisher","first-page":"1","DOI":"10.3390\/e22121391","volume":"22","author":"I Lopez-Arevalo","year":"2020","unstructured":"Lopez-Arevalo I, Aldana-Bobadilla E, Molina-Villegas A, Galeana-Zapi\u00e9n H, Mu\u00f1iz-Sanchez V, Gausin-Valle S. A memory-efficient encoding method for processing mixed-type data on machine learning. Entropy. 2020;22(12):1\u201321.","journal-title":"Entropy"},{"key":"578_CR16","doi-asserted-by":"publisher","first-page":"106","DOI":"10.1016\/j.chemolab.2012.11.010","volume":"120","author":"Y Liu","year":"2013","unstructured":"Liu Y, Brown SD. Comparison of five iterative imputation methods for multivariate classification. Chemom Intell Lab Syst. 2013;120:106\u201315.","journal-title":"Chemom Intell Lab Syst"},{"issue":"2","key":"578_CR17","first-page":"1561","volume":"11","author":"MO Arowolo","year":"2021","unstructured":"Arowolo MO, Adebiyi MO, Adebiyi AA, Aremu C. An ICA-ensemble learning approaches for prediction of RNAseq malaria vector gene expression data classification. Int J Electr Comput Eng. 
2021;11(2):1561\u20139.","journal-title":"Int J Electr Comput Eng"},{"key":"578_CR18","doi-asserted-by":"publisher","first-page":"182422","DOI":"10.1109\/ACCESS.2020.3029234","volume":"8","author":"MO Arowolo","year":"2020","unstructured":"Arowolo MO, Adebiyi MO, Adebiyi AA, Okesola OJ. A hybrid heuristic dimensionality reduction methods for classifying malaria vector gene expression data. IEEE Access. 2020;8:182422\u201330.","journal-title":"IEEE Access"},{"key":"578_CR19","doi-asserted-by":"publisher","unstructured":"Arowolo MO, Adebiyi MO, Aremu C, Adebiyi AA. A survey of dimension reduction and classification methods for RNA-Seq data on malaria vector. J Big Data. 2021;8(1). https:\/\/doi.org\/10.1186\/s40537-021-00441-x.","DOI":"10.1186\/s40537-021-00441-x"},{"key":"578_CR20","doi-asserted-by":"publisher","unstructured":"Arowolo MO, Adebiyi MO, Adebiyi AA, Olugbara O. Optimized hybrid investigative based dimensionality reduction methods for malaria vector using KNN classifier. J Big Data. 2021;8(1). https:\/\/doi.org\/10.1186\/s40537-021-00415-z","DOI":"10.1186\/s40537-021-00415-z"},{"issue":"9","key":"578_CR21","doi-asserted-by":"publisher","first-page":"2579","DOI":"10.17576\/jsm-2021-5009-07","volume":"50","author":"MO Arowolo","year":"2021","unstructured":"Arowolo MO, Adebiyi MO, Adebiyi AA. Enhanced dimensionality reduction methods for classifying malaria vector dataset using decision tree. Sains Malaysiana. 2021;50(9):2579\u201389.","journal-title":"Sains Malaysiana"},{"key":"578_CR22","first-page":"1091","volume":"2020","author":"YK Saheed","year":"2020","unstructured":"Saheed YK, Hambali MA, Arowolo MO, Olasupo YA. Application of GA feature selection on naive bayes, random forest and SVM for credit card fraud detection. Int Conf Decis Aid Sci Appl DASA. 
2020;2020:1091\u20137.","journal-title":"Int Conf Decis Aid Sci Appl DASA."},{"key":"578_CR23","first-page":"2825","volume":"12","author":"F Pedregosa","year":"2011","unstructured":"Pedregosa F, Varoquaux S, Gramfort A, VincentMichel BT. Scikit-learn: machine learning in Python. J Mach Learn Res. 2011;12:2825\u201330.","journal-title":"J Mach Learn Res"},{"key":"578_CR24","unstructured":"Brownlee J, Sanderson M, Koshy A, Cheremskoy A, Halfyard J. Machine learning mastery with Python: Data Cleaning, Feature Selection, and Data Transforms in Python. 2020"},{"key":"578_CR25","unstructured":"Brownlee J. Imbalanced classification with Python. Mach Learn Mastery. 2020;463."},{"key":"578_CR26","doi-asserted-by":"publisher","first-page":"105662","DOI":"10.1016\/j.asoc.2019.105662","volume":"83","author":"G Kov\u00e1cs","year":"2019","unstructured":"Kov\u00e1cs G. An empirical comparison and evaluation of minority oversampling techniques on a large number of imbalanced datasets. Appl Soft Comput J. 2019;83:105662. https:\/\/doi.org\/10.1016\/j.asoc.2019.105662.","journal-title":"Appl Soft Comput J"},{"issue":"2","key":"578_CR27","doi-asserted-by":"publisher","first-page":"519","DOI":"10.1007\/s10044-017-0649-0","volume":"22","author":"E Debie","year":"2019","unstructured":"Debie E, Shafi K. Implications of the curse of dimensionality for supervised learning classifier systems: theoretical and empirical analyses. Pattern Anal Appl. 2019;22(2):519\u201336.","journal-title":"Pattern Anal Appl"},{"issue":"5","key":"578_CR28","doi-asserted-by":"publisher","first-page":"9029","DOI":"10.30534\/ijatcse\/2020\/307952020","volume":"9","author":"C Akmal","year":"2020","unstructured":"Akmal C, Yahaya C, Firdaus A, Mohamad S, Ernawan F, Faizal M, et al. Automated feature selection using boruta algorithm to detect mobile malware. Int J Adv Trends Comput Sci Eng. 
2020;9(5):9029\u201336.","journal-title":"Int J Adv Trends Comput Sci Eng"},{"key":"578_CR29","doi-asserted-by":"publisher","unstructured":"Naik N, Mohan BR. Optimal feature selection of technical indicator and stock prediction using machine learning technique. In: Communications in computer and information science. vol. 985. Springer Singapore; 2019. p. 261\u2013268. https:\/\/doi.org\/10.1007\/978-981-13-8300-7_22.","DOI":"10.1007\/978-981-13-8300-7_22"},{"issue":"1432","key":"578_CR30","doi-asserted-by":"publisher","first-page":"106036","DOI":"10.1016\/j.compag.2021.106036","volume":"183","author":"S Shafiee","year":"2021","unstructured":"Shafiee S, Lied LM, Burud I, Dieseth JA, Alsheikh M, Lillemo M. Sequential forward selection and support vector regression in comparison to LASSO regression for spring wheat yield prediction based on UAV imagery. Comput Electron Agric. 2021;183(1432):106036. https:\/\/doi.org\/10.1016\/j.compag.2021.106036.","journal-title":"Comput Electron Agric"},{"issue":"6","key":"578_CR31","doi-asserted-by":"publisher","first-page":"1005","DOI":"10.1007\/s11548-014-0992-1","volume":"9","author":"M Tan","year":"2014","unstructured":"Tan M, Pu J, Zheng B. Optimization of breast mass classification using sequential forward floating selection (SFFS) and a support vector machine (SVM) model. Int J Comput Assist Radiol Surg. 2014;9(6):1005\u201320.","journal-title":"Int J Comput Assist Radiol Surg"},{"key":"578_CR32","doi-asserted-by":"publisher","unstructured":"Shi X, Li Q, Qi Y, Huang T, Li J. An accident prediction approach based on XGBoost. 2017;1\u20137. https:\/\/doi.org\/10.1109\/ISKE.2017.8258806.","DOI":"10.1109\/ISKE.2017.8258806"},{"key":"578_CR33","doi-asserted-by":"publisher","first-page":"225","DOI":"10.1007\/978-3-662-44851-9_15","volume-title":"Machine learning and knowledge discovery in databases","author":"ZC Lipton","year":"2014","unstructured":"Lipton ZC, Elkan C, Naryanaswamy B. 
Optimal thresholding of classifiers to maximize F1 measure. In: Calders T, Esposito F, H\u00fcllermeier E, Meo R, editors. Machine learning and knowledge discovery in databases. Heidelberg: Springer; 2014. p. 225\u201339."},{"issue":"7","key":"578_CR34","doi-asserted-by":"publisher","first-page":"1145","DOI":"10.1016\/S0031-3203(96)00142-2","volume":"30","author":"AP Bradley","year":"1997","unstructured":"Bradley AP. The use of the area under the ROC curve in the evaluation of machine learning algorithms. Pattern Recognit. 1997;30(7):1145\u201359.","journal-title":"Pattern Recognit"},{"key":"578_CR35","doi-asserted-by":"crossref","unstructured":"Messalas A, Kanellopoulos Y, Makris C. Model-agnostic interpretability with shapley values. In: 10th Int Conf Information, Intell Syst Appl IISA 2019. 2019;1\u20137.","DOI":"10.1109\/IISA.2019.8900669"},{"issue":"2","key":"578_CR36","doi-asserted-by":"publisher","first-page":"167","DOI":"10.1080\/10485252.2015.1010532","volume":"27","author":"Y Jung","year":"2015","unstructured":"Jung Y, Hu J. A K-fold averaging cross-validation procedure. J Nonparametr Stat. 2015;27(2):167\u201379.","journal-title":"J Nonparametr Stat"},{"issue":"11","key":"578_CR37","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1001\/jamanetworkopen.2020.25881","volume":"3","author":"FM Howard","year":"2020","unstructured":"Howard FM, Kochanny S, Koshy M, Spiotto M, Pearson AT. Machine learning-guided adjuvant treatment of head and neck cancer. JAMA Netw Open. 
2020;3(11):1\u201313.","journal-title":"JAMA Netw Open"}],"container-title":["Journal of Big Data"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-022-00578-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s40537-022-00578-3\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s40537-022-00578-3.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,2,25]],"date-time":"2022-02-25T16:12:26Z","timestamp":1645805546000},"score":1,"resource":{"primary":{"URL":"https:\/\/journalofbigdata.springeropen.com\/articles\/10.1186\/s40537-022-00578-3"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,2,25]]},"references-count":37,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,12]]}},"alternative-id":["578"],"URL":"https:\/\/doi.org\/10.1186\/s40537-022-00578-3","relation":{},"ISSN":["2196-1115"],"issn-type":[{"value":"2196-1115","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,2,25]]},"assertion":[{"value":"26 October 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"15 February 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"25 February 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"Ethical clearance for collecting retrospective data was provided by Kasturba Medical College and Kasturba Hospital Institutional Ethics Committee (Registration No. 
ECR\/146\/Inst\/KA\/2013\/RR-16). The Institutional Ethical clearance number for the study is IEC: 165\/2018. The study was registered with the clinical trials registry of India (CTRI). CTRI Number: CTRI\/2018\/04\/013517 (Registered on: 27\/04\/2018)\u2014Trial Registered Prospectively. Type of study-Observational.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Ethics approval and consent to participate"}},{"value":"Not applicable.","order":3,"name":"Ethics","group":{"name":"EthicsHeading","label":"Consent for publication"}},{"value":"The authors declare that they have no competing interests.","order":4,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"25"}}