{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,7]],"date-time":"2025-11-07T09:47:09Z","timestamp":1762508829806,"version":"3.41.0"},"reference-count":37,"publisher":"Association for Computing Machinery (ACM)","issue":"5","license":[{"start":{"date-parts":[[2023,5,8]],"date-time":"2023-05-08T00:00:00Z","timestamp":1683504000000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/www.acm.org\/publications\/policies\/copyright_policy#Background"}],"content-domain":{"domain":["dl.acm.org"],"crossmark-restriction":true},"short-container-title":["ACM Trans. Asian Low-Resour. Lang. Inf. Process."],"published-print":{"date-parts":[[2023,5,31]]},"abstract":"<jats:p>There is a need to prevent the use of modulated voice signals to conduct criminal activities. Voice signal change detection based on convolutional neural networks is proposed. We use three commonly used voice processing software (Audacity, CoolEdit, and RTISI) to change tones in voice libraries. The research further raises each voice by five semitones and are recorded at different levels (+4, +5, +6, +7, and +8, respectively). Simultaneously, every voice is lowered by five halftones, represented as \u20134, \u20135, \u20136, \u20137, and \u20138, respectively. The convolution neural network corresponding to network b-3 is determined as the final classifier in this article through experiments. The average accuracy A1 of its three categories has reached more than 97%, the detection accuracy A2 of electronic tone sandhi speech has reached more than 97%, and the false alarm rate of the original speech is less than 1.9%. The outcomes obtained shows that the detection algorithm in this article is effective, and it has good generalization ability.<\/jats:p>","DOI":"10.1145\/3545569","type":"journal-article","created":{"date-parts":[[2022,9,19]],"date-time":"2022-09-19T12:22:15Z","timestamp":1663590135000},"page":"1-13","update-policy":"https:\/\/doi.org\/10.1145\/crossmark-policy","source":"Crossref","is-referenced-by-count":2,"title":["Generation of Voice Signal Tone Sandhi and Melody Based on Convolutional Neural Network"],"prefix":"10.1145","volume":"22","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-4604-9796","authenticated-orcid":false,"given":"Wei","family":"Jiang","sequence":"first","affiliation":[{"name":"Department of Music, Shandong University of Science and Technology, Qingdao Shandong, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-6425-0787","authenticated-orcid":false,"given":"Mengqi","family":"Li","sequence":"additional","affiliation":[{"name":"Department of Music, Shandong University of Science and Technology, Qingdao Shandong, China"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5106-7609","authenticated-orcid":false,"given":"Mohammad","family":"Shabaz","sequence":"additional","affiliation":[{"name":"Model Institute of Engineering and Technology, Jammu, J&amp;K, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4990-5252","authenticated-orcid":false,"given":"Ashutosh","family":"Sharma","sequence":"additional","affiliation":[{"name":"School of Computer Science, University of Petroleum and Energy Studies, Dehradun, India"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0001-5913-5979","authenticated-orcid":false,"given":"Mohd Anul","family":"Haq","sequence":"additional","affiliation":[{"name":"Department of Computer Science, College of Computer Science and Information Science, Majmaah University, Saudi Arabia"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"320","published-online":{"date-parts":[[2023,5,8]]},"reference":[{"key":"e_1_3_1_2_2","doi-asserted-by":"publisher","DOI":"10.1504\/ijcat.2020.110415"},{"key":"e_1_3_1_3_2","doi-asserted-by":"publisher","DOI":"10.1037\/dev0000781"},{"key":"e_1_3_1_4_2","first-page":"1","article-title":"Non-intrusive speech intelligibility prediction using convolutional neural networks","volume":"99","author":"Andersen A. H.","year":"2018","unstructured":"A. H. Andersen, J. M. D. Haan, Z. H. Tan, and J. Jensen. 2018. Non-intrusive speech intelligibility prediction using convolutional neural networks. IEEE\/ACM Trans. Aud. Speech Lang. Process. 99 (2018), 1\u20131.","journal-title":"IEEE\/ACM Trans. Aud. Speech Lang. Process."},{"issue":"4","key":"e_1_3_1_5_2","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1145\/3490031","article-title":"Dual discriminator GAN: Restoring ancient Yi characters","volume":"21","author":"Chen S.","year":"2022","unstructured":"S. Chen, Y. Yang, X. Liu, and S. Zhu. 2022. Dual discriminator GAN: Restoring ancient Yi characters. Trans. As. Low-Resour. Lang. Inf. Process. 21, 4 (2022), 1\u201323.","journal-title":"Trans. As. Low-Resour. Lang. Inf. Process."},{"key":"e_1_3_1_6_2","doi-asserted-by":"publisher","DOI":"10.1145\/3491065"},{"key":"e_1_3_1_7_2","doi-asserted-by":"publisher","DOI":"10.1145\/3510451"},{"key":"e_1_3_1_8_2","doi-asserted-by":"publisher","DOI":"10.1145\/3501399"},{"key":"e_1_3_1_9_2","doi-asserted-by":"publisher","DOI":"10.1145\/3501398"},{"key":"e_1_3_1_10_2","doi-asserted-by":"publisher","DOI":"10.5120\/19389-9145"},{"key":"e_1_3_1_11_2","doi-asserted-by":"publisher","DOI":"10.1075\/ijchl.3.1.01yan"},{"issue":"4","key":"e_1_3_1_12_2","first-page":"1","article-title":"Performance analysis of neural network, nmf and statistical approaches for speech enhancement","volume":"23","author":"Kandagatla R. K.","year":"2020","unstructured":"R. K. Kandagatla and V. S. Potluri. 2020. Performance analysis of neural network, nmf and statistical approaches for speech enhancement. Int. J. Speech Technol. 23, 4 (2020), 1\u201321.","journal-title":"Int. J. Speech Technol."},{"key":"e_1_3_1_13_2","doi-asserted-by":"publisher","DOI":"10.2299\/jsp.24.179"},{"key":"e_1_3_1_14_2","doi-asserted-by":"crossref","first-page":"47","DOI":"10.21778\/2218-5453-2019-4-47-52","article-title":"Speech recognition based on convolution neural networks","volume":"4","author":"Belorutsky R. Y.","year":"2019","unstructured":"R. Y. Belorutsky and S. V. Zhitnik. 2019. Speech recognition based on convolution neural networks. Iss. Radio Electr. 4 (2019), 47\u201352.","journal-title":"Iss. Radio Electr"},{"key":"e_1_3_1_15_2","doi-asserted-by":"publisher","DOI":"10.3390\/e21050479"},{"key":"e_1_3_1_16_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11265-017-1293-z"},{"key":"e_1_3_1_17_2","doi-asserted-by":"publisher","DOI":"10.1007\/s11277-019-06902-0"},{"issue":"1","key":"e_1_3_1_18_2","doi-asserted-by":"crossref","first-page":"90","DOI":"10.1016\/j.knosys.2018.07.033","article-title":"Audio classification using attention-augmented convolutional neural network","volume":"161","author":"Wu Y.","year":"2018","unstructured":"Y. Wu, H. Mao, and Z. Yi. 2018. Audio classification using attention-augmented convolutional neural network. Knowl.-Bas. Syst. 161, 1 (December 2018), 90\u2013100.","journal-title":"Knowl.-Bas. Syst"},{"key":"e_1_3_1_19_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.neucom.2015.02.085"},{"key":"e_1_3_1_20_2","doi-asserted-by":"publisher","DOI":"10.1250\/ast.39.163"},{"key":"e_1_3_1_21_2","doi-asserted-by":"publisher","DOI":"10.3390\/computers9020046"},{"issue":"11","key":"e_1_3_1_22_2","first-page":"55","article-title":"Mobile-based human emotion recognition is based on speech and heart rate","volume":"25","author":"Alshaibani H.","year":"2019","unstructured":"H. Alshaibani and H. M. Swady. 2019. Mobile-based human emotion recognition is based on speech and heart rate. Univ. Baghd. Eng. J. 25, 11 (2019), 55\u201366.","journal-title":"Univ. Baghd. Eng. J."},{"issue":"2","key":"e_1_3_1_23_2","first-page":"1","article-title":"Improving the decoding efficiency of deep neural network acoustic models by cluster-based senone selection","volume":"90","author":"Liu J. H.","year":"2017","unstructured":"J. H. Liu, Z. H. Ling, S. Wei, G. P. Hu, and L. R. Dai. 2017. Improving the decoding efficiency of deep neural network acoustic models by cluster-based senone selection. J. Sign. Process. Syst. 90, 2 (2017), 1\u201313.","journal-title":"J. Sign. Process. Syst."},{"issue":"11","key":"e_1_3_1_24_2","first-page":"151","article-title":"Spectral analysis and feature extraction of speech signal in dysphonia patients","volume":"113","author":"Shamila S.","year":"2017","unstructured":"S. Shamila, U. Snekhalatha, and D. Balakrishnan. 2017. Spectral analysis and feature extraction of speech signal in dysphonia patients. Int. J. Pure Appl. Math. 113, 11 (2017), 151\u2013160.","journal-title":"Int. J. Pure Appl. Math."},{"key":"e_1_3_1_25_2","doi-asserted-by":"publisher","DOI":"10.1016\/j.specom.2020.10.003"},{"key":"e_1_3_1_26_2","doi-asserted-by":"publisher","DOI":"10.1007\/s00034-016-0388-2"},{"key":"e_1_3_1_27_2","doi-asserted-by":"publisher","DOI":"10.21608\/ejle.2017.59392"},{"key":"e_1_3_1_28_2","doi-asserted-by":"publisher","DOI":"10.4236\/jsip.2015.62006"},{"key":"e_1_3_1_29_2","doi-asserted-by":"publisher","DOI":"10.1002\/ima.22510"},{"key":"e_1_3_1_30_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10278-021-00418-5"},{"key":"e_1_3_1_31_2","first-page":"1","article-title":"Radar signal intra-pulse modulation recognition based on convolutional denoising autoencoder and deep convolutional neural network","volume":"99","author":"Qu Z.","year":"2019","unstructured":"Z. Qu, W. Wang, C. Hou, and C. Hou. 2019. Radar signal intra-pulse modulation recognition based on convolutional denoising autoencoder and deep convolutional neural network. IEEE Access 99 (2019), 1\u20131.","journal-title":"IEEE Access"},{"issue":"8","key":"e_1_3_1_32_2","doi-asserted-by":"crossref","first-page":"862","DOI":"10.1049\/iet-rsn.2017.0547","article-title":"Radar emitter classification based on unidimensional convolutional neural network","volume":"12","author":"Sun J.","year":"2018","unstructured":"J. Sun, G. Xu, W. Ren, and Z. Yan. 2018. Radar emitter classification based on unidimensional convolutional neural network. Radar Sonar Navig. IET 12, 8 (2018), 862\u2013867.","journal-title":"Radar Sonar Navig. IET"},{"key":"e_1_3_1_33_2","doi-asserted-by":"publisher","DOI":"10.1364\/OE.25.017150"},{"issue":"3","key":"e_1_3_1_34_2","doi-asserted-by":"crossref","first-page":"1401","DOI":"10.1121\/1.4908240","article-title":"Effects of manipulating the signal-to-noise envelope power ratio on speech intelligibility","volume":"137","author":"JRgensen S.","year":"2015","unstructured":"S. JRgensen, R. Decorsi\u00e8re, and T. Dau. 2015. Effects of manipulating the signal-to-noise envelope power ratio on speech intelligibility. J. Acoust. Soc. Am. 137, 3 (2015), 1401.","journal-title":"J. Acoust. Soc. Am."},{"key":"e_1_3_1_35_2","doi-asserted-by":"publisher","DOI":"10.1007\/s10772-014-9246-4"},{"issue":"5","key":"e_1_3_1_36_2","first-page":"837","article-title":"Speech and emotional recognition method based on improving convolutional neural networks","volume":"36","author":"Zeng R. H.","year":"2018","unstructured":"R. H. Zeng and S. Q. Zhang. 2018. Speech and emotional recognition method based on improving convolutional neural networks. J. Appl. Sci. 36, 5 (2018), 837\u2013844.","journal-title":"J. Appl. Sci"},{"key":"e_1_3_1_37_2","first-page":"1","article-title":"Modulation classification based on signal constellation diagrams and deep learning","volume":"99","author":"Peng S.","year":"2018","unstructured":"S. Peng, H. Jiang, H. Wang, H. Alwageed, and Y. D. Yao. 2018. Modulation classification based on signal constellation diagrams and deep learning. IEEE Trans. Neural Netw. Learn. Syst. 99 (2018), 1\u201310.","journal-title":"IEEE Trans. Neural Netw. Learn. Syst"},{"key":"e_1_3_1_38_2","doi-asserted-by":"crossref","first-page":"6188","DOI":"10.1049\/joe.2019.0203","article-title":"Modulation classification based on denoising autoencoder and convolutional neural network with gnu radio","volume":"19","author":"Wang J.","year":"2019","unstructured":"J. Wang, W. Wang, F. Luo, and S. Wei. 2019. Modulation classification based on denoising autoencoder and convolutional neural network with gnu radio. J. Eng. 19 (2019), 6188\u20136191.","journal-title":"J. Eng"}],"container-title":["ACM Transactions on Asian and Low-Resource Language Information Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545569","content-type":"unspecified","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3545569","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,6,17]],"date-time":"2025-06-17T19:02:45Z","timestamp":1750186965000},"score":1,"resource":{"primary":{"URL":"https:\/\/dl.acm.org\/doi\/10.1145\/3545569"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,5,8]]},"references-count":37,"journal-issue":{"issue":"5","published-print":{"date-parts":[[2023,5,31]]}},"alternative-id":["10.1145\/3545569"],"URL":"https:\/\/doi.org\/10.1145\/3545569","relation":{},"ISSN":["2375-4699","2375-4702"],"issn-type":[{"type":"print","value":"2375-4699"},{"type":"electronic","value":"2375-4702"}],"subject":[],"published":{"date-parts":[[2023,5,8]]},"assertion":[{"value":"2022-03-01","order":0,"name":"received","label":"Received","group":{"name":"publication_history","label":"Publication History"}},{"value":"2022-06-06","order":1,"name":"accepted","label":"Accepted","group":{"name":"publication_history","label":"Publication History"}},{"value":"2023-05-08","order":2,"name":"published","label":"Published","group":{"name":"publication_history","label":"Publication History"}}]}}