{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,17]],"date-time":"2026-04-17T08:12:38Z","timestamp":1776413558434,"version":"3.51.2"},"reference-count":56,"publisher":"Association for Computing Machinery (ACM)","issue":"1","license":[{"start":{"date-parts":[[2021,10,15]],"date-time":"2021-10-15T00:00:00Z","timestamp":1634256000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Internet Technol."],"published-print":{"date-parts":[[2022,2,28]]},"abstract":"<jats:p>Edge Analytics and Artificial Intelligence are important features of the current smart connected living community. In a society where people, homes, cities, and workplaces are simultaneously connected through various devices, primarily through mobile devices, a considerable amount of data is exchanged, and the processing and storage of these data are laborious and difficult tasks. Edge Analytics allows the collection and analysis of such data on mobile devices, such as smartphones and tablets, without involving any cloud-centred architecture that cannot guarantee real-time responsiveness. Meanwhile, Artificial Intelligence techniques can constitute a valid instrument to process data, limiting the computation time, and optimising decisional processes and predictions in several sectors, such as healthcare. Within this field, in this article, an approach able to evaluate the voice quality condition is proposed. A fully automatic algorithm, based on Deep Learning, classifies a voice as healthy or pathological by analysing spectrogram images extracted by means of the recording of vowel \/a\/, in compliance with the traditional medical protocol. 
A light Convolutional Neural Network is embedded in a mobile health application in order to provide an instrument capable of assessing voice disorders in a fast, easy, and portable way. Thus, a straightforward mobile device becomes a screening tool useful for the early diagnosis, monitoring, and treatment of voice disorders. The proposed approach has been tested on a broad set of voice samples, not limited to the most common voice diseases but including all the pathologies present in three different databases achieving F1-scores, over the testing set, equal to 80%, 90%, and 73%. Although the proposed network consists of a reduced number of layers, the results are very competitive compared to those of other \u201ccutting edge\u201d approaches constructed using more complex neural networks, and compared to the classic deep neural networks, for example, VGG-16 and ResNet-50.<\/jats:p>","DOI":"10.1145\/3433993","type":"journal-article","created":{"date-parts":[[2021,10,16]],"date-time":"2021-10-16T00:44:52Z","timestamp":1634345092000},"page":"1-16","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":22,"title":["A Deep Learning Approach for Voice Disorder Detection for Smart Connected Living Environments"],"prefix":"10.1145","volume":"22","author":[{"given":"Laura","family":"Verde","sequence":"first","affiliation":[{"name":"Institute of High-Performance Computing and Networking (ICAR)\u2014National Research Council of Italy (CNR), Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nadia","family":"Brancati","sequence":"additional","affiliation":[{"name":"Institute of High-Performance Computing and Networking (ICAR)\u2014National Research Council of Italy (CNR), Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Giuseppe","family":"De Pietro","sequence":"additional","affiliation":[{"name":"Institute of High-Performance Computing and Networking 
(ICAR)\u2014National Research Council of Italy (CNR), Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Maria","family":"Frucci","sequence":"additional","affiliation":[{"name":"Institute of High-Performance Computing and Networking (ICAR)\u2014National Research Council of Italy (CNR), Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-7856-8761","authenticated-orcid":false,"given":"Giovanna","family":"Sannino","sequence":"additional","affiliation":[{"name":"Institute of High-Performance Computing and Networking (ICAR)\u2014National Research Council of Italy (CNR), Naples, Italy"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2021,10,15]]},"reference":[{"key":"e_1_2_1_1_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2017.2696056"},{"key":"e_1_2_1_2_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICRAMET.2017.8253139"},{"key":"e_1_2_1_3_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2856238"},{"key":"e_1_2_1_4_1","first-page":"92","article-title":"Intelligent pathological voice detection","volume":"5","author":"Ali Akbar","year":"2018","unstructured":"Akbar Ali and Sanjay Ganar. 2018. Intelligent pathological voice detection. International Journal of Innovative Research in Technology 5, 5 (2018), 92\u201395.","journal-title":"International Journal of Innovative Research in Technology"},{"key":"e_1_2_1_5_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.patrec.2019.04.005"},{"key":"e_1_2_1_6_1","doi-asserted-by":"publisher","DOI":"10.18576\/amis\/100324"},{"key":"e_1_2_1_7_1","volume-title":"Fifth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications. 
ISCA","author":"Amir Ofer","year":"2007","unstructured":"Ofer Amir, Michael Wolf, and Noam Amir. 2007. A clinical comparison between MDVP and Praat softwares: Is there a difference? In Fifth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications. ISCA, Firenze University Press, 37\u201340."},{"key":"e_1_2_1_8_1","doi-asserted-by":"publisher","DOI":"10.5555\/3281970.3288254"},{"key":"e_1_2_1_9_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.anl.2014.11.001"},{"key":"e_1_2_1_10_1","volume-title":"Retrieved","author":"Boersma Paul","year":"2009","unstructured":"Paul Boersma and David Weenink. 2009. Praat: Doing phonetics by computer (Version 5.1.05) [Computer program]. Retrieved August 30, 2020 from https:\/\/www.fon.hum.uva.nl\/praat\/."},{"key":"e_1_2_1_11_1","doi-asserted-by":"publisher","DOI":"10.1109\/51.603651"},{"key":"e_1_2_1_12_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.compeleceng.2018.04.008"},{"key":"e_1_2_1_13_1","volume-title":"PhysioNet.","author":"Cesari Ugo","year":"2018","unstructured":"Ugo Cesari, Giuseppe De Pietro, Elio Marciano, Ciro Niri, Giovanna Sannino, and Laura Verde. 2018. VOICED (VOice ICar fEDerico II) Database. PhysioNet. 
January 30, 2020 https:\/\/physionet.org\/physiobank\/database\/voiced\/."},{"key":"e_1_2_1_14_1","volume-title":"Deep neural network for automatic classification of pathological voice signals. Journal of Voice","author":"Chen Lili","year":"2020","unstructured":"Lili Chen and Junjiang Chen. 2020. Deep neural network for automatic classification of pathological voice signals. Journal of Voice (2020)."},{"key":"e_1_2_1_15_1","volume-title":"Smart supervision of cardiomyopathy based on fuzzy Harris hawks optimizer and wearable sensing data optimization: A new model","author":"Ding Weiping","year":"2020","unstructured":"Weiping Ding, Mohamed Abdel-Basset, Khalid A. Eldrandaly, Laila Abdel-Fatah, and Victor Hugo C. de Albuquerque. 2020. Smart supervision of cardiomyopathy based on fuzzy Harris hawks optimizer and wearable sensing data optimization: A new model. IEEE Transactions on Cybernetics (2020), 1\u201315."},{"key":"e_1_2_1_16_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSAC.2020.3020598"},{"key":"e_1_2_1_17_1","unstructured":"Massachusetts Eye and Ear Infirmary. 1994. 
Elemetrics Disordered Voice Database (Version 1.03)."},{"key":"e_1_2_1_18_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvoice.2018.02.003"},{"key":"e_1_2_1_19_1","doi-asserted-by":"publisher","DOI":"10.1055\/s-2005-861450"},{"key":"e_1_2_1_20_1","first-page":"627","article-title":"Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation","volume":"4","author":"Hajian-Tilaki Karimollah","year":"2013","unstructured":"Karimollah Hajian-Tilaki. 2013. Receiver operating characteristic (ROC) curve analysis for medical diagnostic test evaluation. Caspian Journal of Internal Medicine 4, 2 (2013), 627.","journal-title":"Caspian Journal of Internal Medicine"},{"key":"e_1_2_1_21_1","doi-asserted-by":"publisher","DOI":"10.1109\/MNET.2019.1800235"},{"key":"e_1_2_1_22_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00521-018-3464-7"},{"key":"e_1_2_1_23_1","doi-asserted-by":"publisher","DOI":"10.1109\/CVPR.2016.90"},{"key":"e_1_2_1_24_1","doi-asserted-by":"publisher","DOI":"10.1109\/TASL.2009.2016734"},{"key":"e_1_2_1_25_1","doi-asserted-by":"publisher","DOI":"10.1109\/JSYST.2015.2470644"},{"key":"e_1_2_1_26_1","doi-asserted-by":"publisher","DOI":"10.1145\/3241056"},{"key":"e_1_2_1_27_1","doi-asserted-by":"publisher","DOI":"10.1007\/s00530-017-0561-x"},{"key":"e_1_2_1_28_1","doi-asserted-by":"publisher","DOI":"10.1109\/MNET.011.2000458"},{"key":"e_1_2_1_29_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2020.2985280"},{"key":"e_1_2_1_30_1","doi-asserted-by":"publisher","DOI":"10.1145\/2647868.2654889"},{"key":"e_1_2_1_31_1","doi-asserted-by":"publisher","DOI":"10.5555\/2999134.2999257"},{"key":"e_1_2_1_32_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvoice.2017.11.011"},{"key":"e_1_2_1_33_1","doi-asserted-by":"publisher","DOI":"10.1109\/5.
726791"},{"key":"e_1_2_1_34_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.media.2017.07.005"},{"key":"e_1_2_1_35_1","doi-asserted-by":"publisher","DOI":"10.1016\/S2589-7500(19)30123-2"},{"key":"e_1_2_1_36_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.jvoice.2016.08.015"},{"key":"e_1_2_1_37_1","first-page":"13","article-title":"La valutazione soggettiva ed oggettiva della disfonia. Il protocollo SIFEL","volume":"24","author":"Ricci Maccarini A.","year":"2002","unstructured":"A. Ricci Maccarini and E. Lucchini. 2002. La valutazione soggettiva ed oggettiva della disfonia. Il protocollo SIFEL. Acta Phoniatrica Latina 24, 1\/2 (2002), 13\u201342.","journal-title":"Acta Phoniatrica Latina"},{"key":"e_1_2_1_38_1","doi-asserted-by":"publisher","DOI":"10.1155\/2017\/8783751"},{"key":"e_1_2_1_39_1","doi-asserted-by":"publisher","DOI":"10.3390\/app10113723"},{"key":"e_1_2_1_40_1","doi-asserted-by":"publisher","DOI":"10.1109\/MCOM.2018.1700790"},{"key":"e_1_2_1_41_1","volume-title":"Javier Del Ser, and Victor Hugo C. de Albuquerque","author":"Muhammad Khan","year":"2020","unstructured":"Khan Muhammad, Salman Khan, Javier Del Ser, and Victor Hugo C. de Albuquerque. 2020. Deep learning for multigrade brain tumor classification in smart healthcare systems: A prospective survey. 
IEEE Transactions on Neural Networks and Learning Systems (2020), 1\u20138."},{"key":"e_1_2_1_42_1","volume-title":"Ninth International Conference on Language Resources and Evaluation (LREC'14)","author":"Orozco-Arroyave Juan Rafael","year":"2014","unstructured":"Juan Rafael Orozco-Arroyave, Juli\u00e1n David Arias-Londo\u00f1o, Jes\u00fas Francisco Vargas-Bonilla, Mar\u00eda Claudia Gonzalez-R\u00e1tiva, and Elmar N\u00f6th. 2014. New Spanish speech corpus database for the analysis of people suffering from Parkinson's disease. In Ninth International Conference on Language Resources and Evaluation (LREC'14). European Language Resources Association (ELRA), 342\u2013347."},{"key":"e_1_2_1_43_1","first-page":"143","article-title":"A German database of patterns of pathological vocal fold vibration","volume":"3","author":"P\u00fctzer Manfred","year":"1997","unstructured":"Manfred P\u00fctzer and Jacques Koreman. 1997. A German database of patterns of pathological vocal fold vibration. Phonus 3 (1997), 143\u2013153.","journal-title":"Phonus"},{"key":"e_1_2_1_44_1","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP.2019.8682391"},{"key":"e_1_2_1_45_1","first-page":"238","article-title":"Improved algorithm for pathological and normal voices identification","volume":"7","author":"Sabir Brahim","year":"2017","unstructured":"Brahim Sabir, Fatima Rouda, Yassine Khazri, Bouzekri Touri, and Mohamed Moussetad. 
2017. Improved algorithm for pathological and normal voices identification. International Journal of Electrical and Computer Engineering 7, 1 (2017), 238.","journal-title":"International Journal of Electrical and Computer Engineering"},{"key":"e_1_2_1_46_1","doi-asserted-by":"publisher","DOI":"10.1109\/TII.2018.2832081"},{"key":"e_1_2_1_47_1","doi-asserted-by":"publisher","DOI":"10.1016\/j.inffus.2019.06.004"},{"key":"e_1_2_1_48_1","unstructured":"R. T. Sataloff, K. M. Kost, and S. E. Linville. 2005. The effects of age on the voice. In Vocal Health and Pedagogy - Science Assessment and Treatment (3rd Edition), R. T. Sataloff (Ed.). Plural Publishing Inc., San Diego, 319\u2013338."},{"key":"e_1_2_1_49_1","volume-title":"The prevalence and impact of voice problems in nonprofessional voice users: Preliminary findings. Journal of Voice","author":"Sheyona Valson","year":"2020","unstructured":"Valson Sheyona and Usha Devadas. 2020. The prevalence and impact of voice problems in nonprofessional voice users: Preliminary findings. Journal of Voice (2020)."},{"key":"e_1_2_1_50_1","volume-title":"3rd International Conference on Learning Representations (ICLR'15)","author":"Simonyan Karen","year":"2015","unstructured":"Karen Simonyan and Andrew Zisserman. 2015. Very deep convolutional networks for large-scale image recognition. 
In 3rd International Conference on Learning Representations (ICLR'15). 1\u201314."},{"key":"e_1_2_1_51_1","doi-asserted-by":"publisher","DOI":"10.1016\/S0892-1997(97)80069-0"},{"key":"e_1_2_1_52_1","volume-title":"Watts","author":"Thijs Zo\u00eb","year":"2020","unstructured":"Zo\u00eb Thijs, Kristie Knickerbocker, and Christopher R. Watts. 2020. Epidemiological patterns and treatment outcomes in a private practice community voice clinic. Journal of Voice (2020)."},{"key":"e_1_2_1_53_1","volume-title":"Irish Machine Vision and Image Processing Conference (IMVIP'19)","author":"Nam Trinh","year":"2019","unstructured":"Trinh Nam and Darragh O'Brien. 2019. Pathological speech classification using a convolutional neural network. In Irish Machine Vision and Image Processing Conference (IMVIP'19). 
Technological University Dublin, Dublin, Ireland, 72\u201375."},{"key":"e_1_2_1_54_1","doi-asserted-by":"publisher","DOI":"10.1109\/ACCESS.2018.2816338"},{"key":"e_1_2_1_55_1","doi-asserted-by":"publisher","DOI":"10.1109\/EMBC.2018.8513222"},{"key":"e_1_2_1_56_1","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2018-1351"}],"container-title":["ACM Transactions on Internet Technology"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3433993","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3433993","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T21:28:11Z","timestamp":1750195691000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3433993"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2021,10,15]]},"references-count":56,"journal-issue":{"issue":"1","published-print":{"date-parts":[[2022,2,28]]}},"alternative-id":["10.1145\/3433993"],"URL":"https:\/\/doi.org\/10.1145\/3433993","relation":{},"ISSN":["1533-5399","1557-6051"],"issn-type":[{"value":"1533-5399","type":"print"},{"value":"1557-6051","type":"electronic"}],"subject":[],"published":{"date-parts":[[2021,10,15]]},"assertion":[{"value":"2020-08-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2020-11-01","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2021-10-15","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}