{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,27]],"date-time":"2026-02-27T05:03:17Z","timestamp":1772168597283,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"22","license":[{"start":{"date-parts":[[2022,11,12]],"date-time":"2022-11-12T00:00:00Z","timestamp":1668211200000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"name":"National Funds through the Portuguese funding agency, FCT-Foundation for Science and Technology Portugal","award":["LA\/P\/0063\/2020"],"award-info":[{"award-number":["LA\/P\/0063\/2020"]}]},{"name":"National Funds through the Portuguese funding agency, FCT-Foundation for Science and Technology Portugal","award":["2021.05767.BD"],"award-info":[{"award-number":["2021.05767.BD"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Mathematics"],"abstract":"<jats:p>The use of deep learning methods in medical imaging has been able to deliver promising results; however, the success of such models highly relies on large, properly annotated datasets. The annotation of medical images is a laborious, expensive, and time-consuming process. This difficulty is increased for the mutations status label since these require additional exams (usually biopsies) to be obtained. On the other hand, raw images, without annotations, are extensively collected as part of the clinical routine. This work investigated methods that could mitigate the labelled data scarcity problem by using both labelled and unlabelled data to improve the efficiency of predictive models. 
A semi-supervised learning (SSL) approach was developed to predict epidermal growth factor receptor (EGFR) mutation status in lung cancer in a less invasive manner using 3D CT scans. The proposed approach consists of combining a variational autoencoder (VAE) and exploiting the power of adversarial training, intending that the features extracted from unlabelled data to discriminate images can help in the classification task. To incorporate labelled and unlabelled images, adversarial training was used, extending a traditional variational autoencoder. With the developed method, a mean AUC of 0.701 was achieved with the best-performing model, with only 14% of the training data being labelled. This SSL approach improved the discrimination ability by nearly 7 percentage points over a fully supervised model developed with the same amount of labelled data, confirming the advantage of using such methods when few annotated examples are available.<\/jats:p>","DOI":"10.3390\/math10224225","type":"journal-article","created":{"date-parts":[[2022,11,14]],"date-time":"2022-11-14T04:21:45Z","timestamp":1668399705000},"page":"4225","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":6,"title":["Semi-Supervised Approach for EGFR Mutation Prediction on CT Images"],"prefix":"10.3390","volume":"10","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-1192-452X","authenticated-orcid":false,"given":"Cl\u00e1udia","family":"Pinheiro","sequence":"first","affiliation":[{"name":"INESC TEC\u2014Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal"},{"name":"FEUP\u2014Faculty of Engineering, University of Porto, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3069-2282","authenticated-orcid":false,"given":"Francisco","family":"Silva","sequence":"additional","affiliation":[{"name":"INESC TEC\u2014Institute for Systems and Computer Engineering, Technology and Science, 
4200-465 Porto, Portugal"},{"name":"FCUP\u2014Faculty of Science, University of Porto, 4169-007 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-1681-2436","authenticated-orcid":false,"given":"Tania","family":"Pereira","sequence":"additional","affiliation":[{"name":"INESC TEC\u2014Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-6193-8540","authenticated-orcid":false,"given":"H\u00e9lder P.","family":"Oliveira","sequence":"additional","affiliation":[{"name":"INESC TEC\u2014Institute for Systems and Computer Engineering, Technology and Science, 4200-465 Porto, Portugal"},{"name":"FCUP\u2014Faculty of Science, University of Porto, 4169-007 Porto, Portugal"}]}],"member":"1968","published-online":{"date-parts":[[2022,11,12]]},"reference":[{"key":"ref_1","unstructured":"(2021, October 30). The Top 10 Causes of Death. Available online: https:\/\/www.who.int\/news-room\/fact-sheets\/detail\/the-top-10-causes-of-death."},{"key":"ref_2","unstructured":"(2022, March 07). Cancer Today-International Agency for Research on Cancer. Available online: https:\/\/gco.iarc.fr\/today\/home."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"7","DOI":"10.3322\/caac.21654","article-title":"Cancer Statistics, 2021","volume":"71","author":"Siegel","year":"2021","journal-title":"CA Cancer J. Clin."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"4006","DOI":"10.1038\/ncomms5006","article-title":"Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach","volume":"5","author":"Aerts","year":"2014","journal-title":"Nat. Commun."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"441","DOI":"10.1016\/j.ejca.2011.11.036","article-title":"Radiomics: Extracting more information from medical images using advanced feature analysis","volume":"48","author":"Gillies","year":"2012","journal-title":"Eur. J. 
Cancer"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"563","DOI":"10.1148\/radiol.2015151169","article-title":"Radiomics: Images are more than pictures, they are data","volume":"278","author":"Gillies","year":"2016","journal-title":"Radiology"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"1960","DOI":"10.1007\/s00261-019-02028-w","article-title":"Radiogenomics: Bridging imaging and genomics","volume":"44","author":"Bodalal","year":"2019","journal-title":"Abdom. Radiol."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"e13963","DOI":"10.1097\/MD.0000000000013963","article-title":"Can CT radiomic analysis in NSCLC predict histology and EGFR mutation status?","volume":"98","author":"Digumarthy","year":"2019","journal-title":"Medicine"},{"key":"ref_9","doi-asserted-by":"crossref","unstructured":"Pinheiro, G., Pereira, T., Dias, C., Freitas, C., Hespanhol, V., Costa, J.L., Cunha, A., and Oliveira, H.P. (2020). Identifying relationships between imaging phenotypes and lung cancer-related mutation status: EGFR and KRAS. Sci. Rep., 10.","DOI":"10.1038\/s41598-020-60202-3"},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Morgado, J., Pereira, T., Silva, F., Freitas, C., Negr\u00e3o, E., de Lima, B.F., da Silva, M.C., Madureira, A.J., Ramos, I., and Hespanhol, V. (2021). Machine Learning and Feature Selection Methods for EGFR Mutation Status Prediction in Lung Cancer. Appl. Sci., 11.","DOI":"10.3390\/app11073273"},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"1800986","DOI":"10.1183\/13993003.00986-2018","article-title":"Predicting EGFR mutation status in lung adenocarcinoma on computed tomography image using deep learning","volume":"53","author":"Wang","year":"2019","journal-title":"Eur. Respir. 
J."},{"key":"ref_12","first-page":"211","article-title":"ImageNet Large Scale Visual Recognition Challenge","volume":"115","author":"Russakovsky","year":"2014","journal-title":"CoRR"},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3532","DOI":"10.1002\/cam4.2233","article-title":"Toward automatic prediction of EGFR mutation status in pulmonary adenocarcinoma with 3D deep learning","volume":"8","author":"Zhao","year":"2019","journal-title":"Cancer Med."},{"key":"ref_14","unstructured":"Filho, C.J.A.B., Siqueira, H.V., Ferreira, D.D., Bertol, D.W., and de Oliveira, R.C.L. (2021). On Teacher-Student Semi-Supervised Learning for Chest X-ray Image Classification. Anais do 15 Congresso Brasileiro de Intelig\u00eancia Computacional, SBIC."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"4","DOI":"10.1016\/j.compmedimag.2016.07.004","article-title":"Enhancing deep convolutional neural network scheme for breast cancer diagnosis with unlabeled data","volume":"57","author":"Sun","year":"2017","journal-title":"Comput. Med. Imaging Graph."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1016\/j.amsu.2020.12.043","article-title":"Comparing supervised and semi-supervised Machine Learning Models on Diagnosing Breast Cancer","volume":"62","author":"Shatnawi","year":"2021","journal-title":"Ann. Med. Surg."},{"key":"ref_17","doi-asserted-by":"crossref","first-page":"2276","DOI":"10.1109\/JBHI.2021.3131103","article-title":"NAS-SGAN: A Semi-supervised Generative Adversarial Network Model for Atypia Scoring of Breast Cancer Histopathological Images","volume":"26","author":"Das","year":"2021","journal-title":"IEEE J. Biomed. Health Inform."},{"key":"ref_18","doi-asserted-by":"crossref","first-page":"237","DOI":"10.1016\/j.media.2019.07.004","article-title":"Semi-supervised adversarial model for benign\u2013malignant lung nodule classification on chest CT","volume":"57","author":"Xie","year":"2019","journal-title":"Med. 
Image Anal."},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"180202","DOI":"10.1038\/sdata.2018.202","article-title":"A radiogenomic dataset of non-small cell lung cancer","volume":"5","author":"Bakr","year":"2018","journal-title":"Sci. Data"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"(2011). Reduced Lung-Cancer Mortality with Low-Dose Computed Tomographic Screening. N. Engl. J. Med., 365, 395\u2013409.","DOI":"10.1056\/NEJMoa1102873"},{"key":"ref_21","first-page":"243","article-title":"The National Lung Screening Trial: Overview and Study Design","volume":"258","author":"Aberle","year":"2010","journal-title":"Radiology"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Silva, F., Pereira, T., Morgado, J., Cunha, A., and Oliveira, H.P. (2021, January 1\u20135). The Impact of Interstitial Diseases Patterns on Lung CT Segmentation. Proceedings of the 2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), Guadalajara, Mexico.","DOI":"10.1109\/EMBC46164.2021.9630354"},{"key":"ref_23","unstructured":"Kingma, D.P., and Welling, M. (2013). Auto-Encoding Variational Bayes. arXiv."},{"key":"ref_24","unstructured":"Doersch, C. (2016). Tutorial on Variational Autoencoders. arXiv."},{"key":"ref_25","unstructured":"Radford, A., Metz, L., and Chintala, S. (2015). Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. arXiv."},{"key":"ref_26","first-page":"448","article-title":"Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift","volume":"Volume 37","author":"Bach","year":"2015","journal-title":"Proceedings of the 32nd International Conference on Machine Learning"},{"key":"ref_27","unstructured":"Lee, D., Sugiyama, M., Luxburg, U., Guyon, I., and Garnett, R. (2016). Improved Techniques for Training GANs. 
Advances in Neural Information Processing Systems, Curran Associates, Inc."},{"key":"ref_28","first-page":"1929","article-title":"Dropout: A Simple Way to Prevent Neural Networks from Overfitting","volume":"15","author":"Srivastava","year":"2014","journal-title":"J. Mach. Learn. Res."},{"key":"ref_29","unstructured":"Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N., and Weinberger, K. (2014, January 8\u201313). Generative Adversarial Nets. Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada."},{"key":"ref_30","unstructured":"Larsen, A.B.L., S\u00f8nderby, S.K., and Winther, O. (2015, January 20\u201322). Autoencoding beyond pixels using a learned similarity metric. Proceedings of the 33rd International Conference on Machine Learning, New York, NY, USA."},{"key":"ref_31","unstructured":"Cover, T.M., and Thomas, J.A. (2006). Elements of Information Theory, Wiley-Interscience. [2nd ed.]."},{"key":"ref_32","unstructured":"Higgins, I., Matthey, L., Pal, A., Burgess, C.P., Glorot, X., Botvinick, M.M., Mohamed, S., and Lerchner, A. (2017, January 24\u201326). beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework. Proceedings of the ICLR, Toulon, France."},{"key":"ref_33","first-page":"3272","article-title":"Good semi-supervised learning that requires a bad gan","volume":"30","author":"Dai","year":"2017","journal-title":"Adv. Neural Inf. Process. 
Syst."},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"58667","DOI":"10.1109\/ACCESS.2021.3070701","article-title":"EGFR Assessment in Lung Cancer CT Images: Analysis of Local and Holistic Regions of Interest Using Deep Unsupervised Transfer Learning","volume":"9","author":"Silva","year":"2021","journal-title":"IEEE Access"}],"container-title":["Mathematics"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2227-7390\/10\/22\/4225\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T01:16:51Z","timestamp":1760145411000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2227-7390\/10\/22\/4225"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,11,12]]},"references-count":34,"journal-issue":{"issue":"22","published-online":{"date-parts":[[2022,11]]}},"alternative-id":["math10224225"],"URL":"https:\/\/doi.org\/10.3390\/math10224225","relation":{},"ISSN":["2227-7390"],"issn-type":[{"value":"2227-7390","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,11,12]]}}}