{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,6,2]],"date-time":"2026-06-02T09:32:18Z","timestamp":1780392738242,"version":"3.54.1"},"reference-count":23,"publisher":"World Scientific Pub Co Pte Ltd","issue":"01","funder":[{"DOI":"10.13039\/501100001459","name":"Ministry of Education - Singapore","doi-asserted-by":"publisher","award":["R-R12-A405-0009"],"award-info":[{"award-number":["R-R12-A405-0009"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001459","name":"Ministry of Education - Singapore","doi-asserted-by":"publisher","award":["R-IE2-A405-00006"],"award-info":[{"award-number":["R-IE2-A405-00006"]}],"id":[{"id":"10.13039\/501100001459","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Int. J. As. Lang. Proc."],"published-print":{"date-parts":[[2025,3]]},"abstract":"<jats:p> This paper presents a novel approach to address the scarcity of labeled data in speech de-identification, a critical task for protecting personal privacy. By leveraging a large language model, we propose a fully automated data augmentation strategy that generates synthetic speech text data enriched with diverse personally identifiable information (PII) entities. This augmented dataset is then used to train the speech de-identification models, significantly improving their performance on spoken language. To further enhance de-identification accuracy, we explore both pipeline and end-to-end models. While the pipeline approach sequentially applies speech recognition and named entity recognition, the end-to-end model jointly learns these tasks. Our experimental results demonstrate the effectiveness of our data augmentation strategy and the superiority of the end-to-end model in improving PII detection accuracy and robustness. 
<\/jats:p>","DOI":"10.1142\/s2717554524500140","type":"journal-article","created":{"date-parts":[[2024,12,31]],"date-time":"2024-12-31T05:25:36Z","timestamp":1735622736000},"source":"Crossref","is-referenced-by-count":2,"title":["Leveraging Large Language Models for Speech De-Identification"],"prefix":"10.1142","volume":"35","author":[{"ORCID":"https:\/\/orcid.org\/0009-0007-7650-0756","authenticated-orcid":false,"given":"Priyanshu","family":"Dhingra","sequence":"first","affiliation":[{"name":"Rajiv Gandhi Institute of Petroleum Technology, Mubarakpur Mukhatiya, Uttar Pradesh 229305, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0009-0002-9032-0739","authenticated-orcid":false,"given":"Satyam","family":"Agrawal","sequence":"additional","affiliation":[{"name":"National Institute of Technology Karnataka, Mangaluru, Karnataka 575025, India"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-0948-0568","authenticated-orcid":false,"given":"Chandra Sekar","family":"Veerappan","sequence":"additional","affiliation":[{"name":"Singapore Institute of Technology, Singapore 828608, Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6257-7399","authenticated-orcid":false,"given":"Eng Siong","family":"Chng","sequence":"additional","affiliation":[{"name":"Nanyang Technological University, Singapore 639798, Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"ORCID":"https:\/\/orcid.org\/0000-0003-3410-8354","authenticated-orcid":false,"given":"Rong","family":"Tong","sequence":"additional","affiliation":[{"name":"Singapore Institute of Technology, Singapore 828608, 
Singapore"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"219","published-online":{"date-parts":[[2025,1,27]]},"reference":[{"key":"S2717554524500140BIB003","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2022-10484"},{"key":"S2717554524500140BIB004","doi-asserted-by":"publisher","DOI":"10.1093\/jamia\/ocw156"},{"key":"S2717554524500140BIB005","doi-asserted-by":"publisher","DOI":"10.1016\/j.jbi.2017.05.023"},{"key":"S2717554524500140BIB006","doi-asserted-by":"publisher","DOI":"10.1109\/TKDE.2020.2981314"},{"key":"S2717554524500140BIB008","doi-asserted-by":"publisher","DOI":"10.1109\/ICCV51070.2023.00475"},{"key":"S2717554524500140BIB009","doi-asserted-by":"publisher","DOI":"10.1145\/3375627.3375849"},{"key":"S2717554524500140BIB010","doi-asserted-by":"publisher","DOI":"10.1145\/3411764.3445122"},{"key":"S2717554524500140BIB012","doi-asserted-by":"publisher","DOI":"10.18653\/v1\/2020.emnlp-main.488"},{"key":"S2717554524500140BIB013","doi-asserted-by":"publisher","DOI":"10.1016\/j.asoc.2022.109803"},{"key":"S2717554524500140BIB015","doi-asserted-by":"publisher","DOI":"10.1145\/3589334.3645678"},{"key":"S2717554524500140BIB016","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-023-10055-8"},{"key":"S2717554524500140BIB017","first-page":"110","volume-title":"Proc. 61st Annu. Meeting of the Association for Computational Linguistics","volume":"2","author":"Cai J.","year":"2023"},{"key":"S2717554524500140BIB018","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2015-711"},{"key":"S2717554524500140BIB019","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP39728.2021.9414483"},{"key":"S2717554524500140BIB020","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2018-1751"},{"key":"S2717554524500140BIB023","first-page":"197","volume-title":"Proc. 2019 Conf. 
North American Chapter of the Association for Computational Linguistics: Human Language Technologies","volume":"2","author":"Cohn I.","year":"2019"},{"key":"S2717554524500140BIB024","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2011-547"},{"key":"S2717554524500140BIB025","first-page":"1746","volume-title":"Proc. 61st Annu. Meeting of the Association for Computational Linguistics","volume":"1","author":"Szyma\u0144ski P.","year":"2023"},{"key":"S2717554524500140BIB027","doi-asserted-by":"publisher","DOI":"10.1109\/SLT.2018.8639513"},{"key":"S2717554524500140BIB028","doi-asserted-by":"publisher","DOI":"10.1109\/ICASSP43922.2022.9746955"},{"key":"S2717554524500140BIB029","doi-asserted-by":"publisher","DOI":"10.1109\/IALP63756.2024.10661176"},{"key":"S2717554524500140BIB030","doi-asserted-by":"publisher","DOI":"10.1109\/ICAICTA63815.2024.10762997"},{"key":"S2717554524500140BIB031","doi-asserted-by":"publisher","DOI":"10.21437\/Interspeech.2019-1525"}],"container-title":["International Journal of Asian Language Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.worldscientific.com\/doi\/pdf\/10.1142\/S2717554524500140","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,2,17]],"date-time":"2025-02-17T08:32:26Z","timestamp":1739781146000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.worldscientific.com\/doi\/10.1142\/S2717554524500140"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,1,27]]},"references-count":23,"journal-issue":{"issue":"01","published-print":{"date-parts":[[2025,3]]}},"alternative-id":["10.1142\/S2717554524500140"],"URL":"https:\/\/doi.org\/10.1142\/s2717554524500140","relation":{},"ISSN":["2717-5545","2424-791X"],"issn-type":[{"value":"2717-5545","type":"print"},{"value":"2424-791X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,1,27]]},"article-number":"2450014"}}