{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,3,24]],"date-time":"2026-03-24T17:01:22Z","timestamp":1774371682099,"version":"3.50.1"},"reference-count":34,"publisher":"MDPI AG","issue":"6","license":[{"start":{"date-parts":[[2023,5,29]],"date-time":"2023-05-29T00:00:00Z","timestamp":1685318400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada (NSERC)","doi-asserted-by":"publisher","award":["ALLRP 576612-22"],"award-info":[{"award-number":["ALLRP 576612-22"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100000038","name":"Natural Sciences and Engineering Research Council of Canada (NSERC)","doi-asserted-by":"publisher","award":["1R03CA253212-01"],"award-info":[{"award-number":["1R03CA253212-01"]}],"id":[{"id":"10.13039\/501100000038","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health (NIH)","doi-asserted-by":"publisher","award":["ALLRP 576612-22"],"award-info":[{"award-number":["ALLRP 576612-22"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/100000002","name":"National Institutes of Health (NIH)","doi-asserted-by":"publisher","award":["1R03CA253212-01"],"award-info":[{"award-number":["1R03CA253212-01"]}],"id":[{"id":"10.13039\/100000002","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Flexible laryngoscopy is commonly performed by otolaryngologists to detect laryngeal diseases and to recognize potentially malignant lesions. 
Recently, researchers have introduced machine learning techniques to facilitate automated diagnosis using laryngeal images and achieved promising results. The diagnostic performance can be improved when patients\u2019 demographic information is incorporated into models. However, the manual entry of patient data is time-consuming for clinicians. In this study, we made the first endeavor to employ deep learning models to predict patient demographic information to improve the detector model\u2019s performance. The overall accuracy for gender, smoking history, and age was 85.5%, 65.2%, and 75.9%, respectively. We also created a new laryngoscopic image set for the machine learning study and benchmarked the performance of eight classical deep learning models based on CNNs and Transformers. The results can be integrated into current learning models to improve their performance by incorporating the patient\u2019s demographic information.<\/jats:p>","DOI":"10.3390\/jimaging9060109","type":"journal-article","created":{"date-parts":[[2023,5,29]],"date-time":"2023-05-29T03:39:36Z","timestamp":1685331576000},"page":"109","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":9,"title":["Gender, Smoking History, and Age Prediction from Laryngeal Images"],"prefix":"10.3390","volume":"9","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6171-3176","authenticated-orcid":false,"given":"Tianxiao","family":"Zhang","sequence":"first","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA"}]},{"given":"Andr\u00e9s M.","family":"Bur","sequence":"additional","affiliation":[{"name":"Department of Otolaryngology\u2014Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA"}]},{"given":"Shannon","family":"Kraft","sequence":"additional","affiliation":[{"name":"Department of Otolaryngology\u2014Head and Neck Surgery, University of 
Kansas Medical Center, Kansas City, KS 66160, USA"}]},{"given":"Hannah","family":"Kavookjian","sequence":"additional","affiliation":[{"name":"Department of Otolaryngology\u2014Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA"}]},{"given":"Bryan","family":"Renslo","sequence":"additional","affiliation":[{"name":"Department of Otolaryngology\u2014Head and Neck Surgery, University of Kansas Medical Center, Kansas City, KS 66160, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-9690-0067","authenticated-orcid":false,"given":"Xiangyu","family":"Chen","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA"}]},{"given":"Bo","family":"Luo","sequence":"additional","affiliation":[{"name":"Department of Electrical Engineering and Computer Science, University of Kansas, Lawrence, KS 66045, USA"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3182-104X","authenticated-orcid":false,"given":"Guanghui","family":"Wang","sequence":"additional","affiliation":[{"name":"Department of Computer Science, Toronto Metropolitan University, Toronto, ON M5B 2K3, Canada"}]}],"member":"1968","published-online":{"date-parts":[[2023,5,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1001\/archotol.1985.00800110067004","article-title":"The Role of Endoscopy in Evaluating Patients With Head and Neck Cancer: A Multi-Institutional Prospective Study","volume":"111","author":"Leipzig","year":"1985","journal-title":"Arch. Otolaryngol. Neck Surg."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"1037","DOI":"10.1002\/lio2.656","article-title":"Tumor detection with transoral use of flexible endoscopy for unknown primary head and neck cancer","volume":"6","author":"Ebisumoto","year":"2021","journal-title":"Laryngoscope Investig. 
Otolaryngol."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"92","DOI":"10.1016\/j.ebiom.2019.08.075","article-title":"Computer-aided diagnosis of laryngeal cancer via deep learning based on laryngoscopic images","volume":"48","author":"Xiong","year":"2019","journal-title":"EBioMedicine"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"060503","DOI":"10.1117\/1.JBO.22.6.060503","article-title":"Deep convolutional neural networks for classifying head and neck cancer using hyperspectral imaging","volume":"22","author":"Halicek","year":"2017","journal-title":"J. Biomed. Opt."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"1798","DOI":"10.1002\/lary.29960","article-title":"Deep Learning Applied to White Light and Narrow Band Imaging Videolaryngoscopy: Toward Real-Time Laryngeal Cancer Detection","volume":"132","author":"Azam","year":"2022","journal-title":"Laryngoscope"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"7497","DOI":"10.1038\/s41598-018-25842-6","article-title":"Automatic anatomical classification of esophagogastroduodenoscopy images using deep convolutional neural networks","volume":"8","author":"Takiyama","year":"2018","journal-title":"Sci. Rep."},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"13914","DOI":"10.1038\/s41598-022-18217-5","article-title":"Hierarchical dynamic convolutional neural network for laryngeal disease classification","volume":"12","author":"Wang","year":"2022","journal-title":"Sci. 
Rep."},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"E686","DOI":"10.1002\/lary.28539","article-title":"Automatic recognition of laryngoscopic images using a deep-learning technique","volume":"130","author":"Ren","year":"2020","journal-title":"Laryngoscope"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"319","DOI":"10.1007\/s10162-022-00846-2","article-title":"Harnessing the Power of Artificial Intelligence in Otolaryngology and the Communication Sciences","volume":"23","author":"Wilson","year":"2022","journal-title":"J. Assoc. Res. Otolaryngol."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Li, K., Fathan, M.I., Patel, K., Zhang, T., Zhong, C., Bansal, A., Rastogi, A., Wang, J.S., and Wang, G. (2021). Colonoscopy polyp detection and classification: Dataset creation and comparative evaluations. PLoS ONE, 16.","DOI":"10.1371\/journal.pone.0255809"},{"key":"ref_11","unstructured":"Patel, K.B., Li, F., and Wang, G. (2022, January 2). FuzzyNet: A Fuzzy Attention Module for Polyp Segmentation. Proceedings of the NeurIPS\u201922 Workshop on All Things Attention: Bridging Different Perspectives on Attention, New Orleans, LA, USA."},{"key":"ref_12","doi-asserted-by":"crossref","unstructured":"Patel, K., Bur, A.M., and Wang, G. (2021, January 26\u201328). Enhanced u-net: A feature enhancement network for polyp segmentation. Proceedings of the 2021 18th Conference on Robots and Vision (CRV), Burnaby, BC, Canada.","DOI":"10.1109\/CRV52889.2021.00032"},{"key":"ref_13","doi-asserted-by":"crossref","unstructured":"Patel, K., Li, K., Tao, K., Wang, Q., Bansal, A., Rastogi, A., and Wang, G. (2020). A comparative study on polyp classification using convolutional neural networks. 
PLoS ONE, 15.","DOI":"10.1371\/journal.pone.0236452"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"238","DOI":"10.1007\/s12559-023-10118-7","article-title":"CT radiomic features and clinical biomarkers for predicting coronary artery disease","volume":"15","author":"Militello","year":"2023","journal-title":"Cogn. Comput."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1109\/TMI.2020.3035253","article-title":"CA-Net: Comprehensive attention convolutional neural networks for explainable medical image segmentation","volume":"40","author":"Gu","year":"2020","journal-title":"IEEE Trans. Med. Imaging"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"102470","DOI":"10.1016\/j.media.2022.102470","article-title":"Explainable artificial intelligence (XAI) in deep learning-based medical image analysis","volume":"79","author":"Kuijf","year":"2022","journal-title":"Med. Image Anal."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Prinzi, F., Orlando, A., Gaglio, S., Midiri, M., and Vitabile, S. (2022, January 1\u20133). ML-Based Radiomics Analysis for Breast Cancer Classification in DCE-MRI. Proceedings of the Applied Intelligence and Informatics: Second International Conference, AII 2022, Reggio Calabria, Italy.","DOI":"10.1007\/978-3-031-24801-6_11"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"He, K., Zhang, X., Ren, S., and Sun, J. (2016, January 27\u201330). Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.90"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Huang, G., Liu, Z., Van Der Maaten, L., and Weinberger, K.Q. (2017, January 21\u201326). Densely connected convolutional networks. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.243"},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., and Chen, L.C. (2018, January 18\u201322). Mobilenetv2: Inverted residuals and linear bottlenecks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00474"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Ma, N., Zhang, X., Zheng, H.T., and Sun, J. (2018, January 8\u201314). Shufflenet v2: Practical guidelines for efficient cnn architecture design. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01264-9_8"},{"key":"ref_22","unstructured":"Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., and Gelly, S. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"211","DOI":"10.1007\/s11263-015-0816-y","article-title":"Imagenet large scale visual recognition challenge","volume":"115","author":"Russakovsky","year":"2015","journal-title":"Int. J. Comput. Vis."},{"key":"ref_24","unstructured":"Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., and Adam, H. (2017). Mobilenets: Efficient convolutional neural networks for mobile vision applications. arXiv."},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Zhang, X., Zhou, X., Lin, M., and Sun, J. (2018, January 18\u201322). Shufflenet: An extremely efficient convolutional neural network for mobile devices. 
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00716"},{"key":"ref_26","unstructured":"Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, \u0141., and Polosukhin, I. (2017). Attention is all you need. arXiv."},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Chen, X., Hu, Q., Li, K., Zhong, C., and Wang, G. (2023, January 2\u20137). Accumulated Trivial Attention Matters in Vision Transformers on Small Datasets. Proceedings of the IEEE\/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA.","DOI":"10.1109\/WACV56688.2023.00397"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., and Guo, B. (2021, January 11\u201317). Swin transformer: Hierarchical vision transformer using shifted windows. Proceedings of the IEEE\/CVF International Conference on Computer Vision, Montreal, QC, Canada.","DOI":"10.1109\/ICCV48922.2021.00986"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., and Zagoruyko, S. (2020, January 23\u201328). End-to-end object detection with transformers. Proceedings of the European Conference on Computer Vision, Glasgow, UK.","DOI":"10.1007\/978-3-030-58452-8_13"},{"key":"ref_30","unstructured":"Ma, W., Zhang, T., and Wang, G. (2022). Miti-detr: Object detection based on transformers with mitigatory self-attention convergence. arXiv."},{"key":"ref_31","unstructured":"Kingma, D.P., and Ba, J. (2014). Adam: A method for stochastic optimization. arXiv."},{"key":"ref_32","unstructured":"Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., and Antiga, L. (2019). Pytorch: An imperative style, high-performance deep learning library. 
arXiv."},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Zhou, B., Khosla, A., Lapedriza, A., Oliva, A., and Torralba, A. (2016, January 27\u201330). Learning deep features for discriminative localization. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA.","DOI":"10.1109\/CVPR.2016.319"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1709","DOI":"10.1109\/TETC.2020.3018312","article-title":"A novel bio-inspired approach for high-performance management in service-oriented networks","volume":"9","author":"Conti","year":"2020","journal-title":"IEEE Trans. Emerg. Top. Comput."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/9\/6\/109\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T19:44:05Z","timestamp":1760125445000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/9\/6\/109"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,29]]},"references-count":34,"journal-issue":{"issue":"6","published-online":{"date-parts":[[2023,6]]}},"alternative-id":["jimaging9060109"],"URL":"https:\/\/doi.org\/10.3390\/jimaging9060109","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,5,29]]}}}