{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,2,6]],"date-time":"2026-02-06T22:45:54Z","timestamp":1770417954841,"version":"3.49.0"},"reference-count":28,"publisher":"Springer Science and Business Media LLC","issue":"1","license":[{"start":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T00:00:00Z","timestamp":1660348800000},"content-version":"tdm","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"},{"start":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T00:00:00Z","timestamp":1660348800000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 61901165"],"award-info":[{"award-number":["No. 61901165"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["No. 61501199"],"award-info":[{"award-number":["No. 61501199"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":["link.springer.com"],"crossmark-restriction":false},"short-container-title":["EURASIP J. Adv. Signal Process."],"abstract":"<jats:title>Abstract<\/jats:title><jats:p>Digital audio tampering detection can be used to verify the authenticity of digital audio. However, most current methods use standard electronic network frequency (ENF) databases for visual comparison analysis of ENF continuity of digital audio or perform feature extraction for classification by machine learning methods. ENF databases are usually tricky to obtain, visual methods have weak feature representation, and machine learning methods have more information loss in features, resulting in low detection accuracy. 
This paper proposes a method that fuses shallow and deep features to make full use of ENF information, exploiting the complementary nature of features at different levels to describe more accurately the inconsistencies that tampering operations introduce into raw digital audio. First, the audio signal is band-pass filtered to obtain the ENF component, and the discrete Fourier transform (DFT) and Hilbert transform are applied to obtain the phase and instantaneous frequency of the ENF component. Second, three representations are derived: the mean value of the sequence variation is used as the shallow feature; the feature matrix obtained by framing and reshaping the ENF sequence is used as the input of a convolutional neural network (CNN); and fitted-coefficient features are obtained by curve fitting. Third, the CNN extracts the local details of the ENF from the feature matrix, and a deep neural network (DNN) extracts the global information of the ENF from the fitted-coefficient features. Together, the global and local information form the deep features of the ENF. The shallow and deep features are then fused using an attention mechanism that gives greater weight to features useful for classification and suppresses invalid features. Finally, the fused features are downscaled and fitted by a DNN containing two fully connected layers, and classification is performed using a Softmax layer to detect tampered audio. The method achieves 97.03% accuracy on three classic databases: Carioca 1, Carioca 2, and New Spanish. In addition, it achieves an accuracy of 88.31% on the newly constructed database GAUDI-DI. 
Experimental results show that the proposed method is superior to the state-of-the-art method.<\/jats:p>","DOI":"10.1186\/s13634-022-00900-4","type":"journal-article","created":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T06:02:45Z","timestamp":1660370565000},"update-policy":"https:\/\/doi.org\/10.1007\/springer_crossmark_policy","source":"Crossref","is-referenced-by-count":24,"title":["Shallow and deep feature fusion for digital audio tampering detection"],"prefix":"10.1186","volume":"2022","author":[{"ORCID":"https:\/\/orcid.org\/0000-0001-6960-509X","authenticated-orcid":false,"given":"Zhifeng","family":"Wang","sequence":"first","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Yao","family":"Yang","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Chunyan","family":"Zeng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shuai","family":"Kong","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Shixiong","family":"Feng","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]},{"given":"Nan","family":"Zhao","sequence":"additional","affiliation":[],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"297","published-online":{"date-parts":[[2022,8,13]]},"reference":[{"key":"900_CR1","doi-asserted-by":"publisher","first-page":"62719","DOI":"10.1109\/access.2021.3073786","volume":"9","author":"MA Qamhan","year":"2021","unstructured":"M.A. Qamhan, H. Altaheri, A.H. Meftah, G. Muhammad, Y.A. Alotaibi, Digital audio forensics: Microphone and environment classification using deep learning. IEEE Access 9, 62719\u201362733 (2021). 
https:\/\/doi.org\/10.1109\/access.2021.3073786","journal-title":"IEEE Access"},{"issue":"4","key":"900_CR2","doi-asserted-by":"publisher","first-page":"413","DOI":"10.1108\/ijwis-06-2020-0038","volume":"16","author":"C Zeng","year":"2020","unstructured":"C. Zeng, D. Zhu, Z. Wang, Z. Wang, N. Zhao, L. He, An end-to-end deep source recording device identification system for web media forensics. Int. J. Web Inf. Syst. 16(4), 413\u2013425 (2020). https:\/\/doi.org\/10.1108\/ijwis-06-2020-0038","journal-title":"Int. J. Web Inf. Syst."},{"key":"900_CR3","doi-asserted-by":"publisher","first-page":"236","DOI":"10.1109\/tifs.2020.3009579","volume":"16","author":"G Hua","year":"2021","unstructured":"G. Hua, H. Liao, Q. Wang, H. Zhang, D. Ye, Detection of electric network frequency in audio recordings\u2014from theory to practical detectors. IEEE Trans. Inf. Forensics Secur. 16, 236\u2013248 (2021). https:\/\/doi.org\/10.1109\/tifs.2020.3009579","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR4","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1186\/s13634-021-00763-1","volume":"41","author":"C Zeng","year":"2021","unstructured":"C. Zeng, W.Z. Zhu\u00a0D, Spatial and temporal learning representation for end-to-end recording device identification. EURASIP J. Adv. Signal Process. 41, 1\u201319 (2021). https:\/\/doi.org\/10.1186\/s13634-021-00763-1","journal-title":"EURASIP J. Adv. Signal Process."},{"issue":"11","key":"900_CR5","doi-asserted-by":"publisher","first-page":"1827","DOI":"10.1109\/tifs.2013.2280888","volume":"8","author":"H Malik","year":"2013","unstructured":"H. Malik, Acoustic environment identification and its applications to audio forensics. IEEE Trans. Inf. Forensics Secur. 8(11), 1827\u20131837 (2013). https:\/\/doi.org\/10.1109\/tifs.2013.2280888","journal-title":"IEEE Trans. Inf. 
Forensics Secur."},{"issue":"11","key":"900_CR6","doi-asserted-by":"publisher","first-page":"1746","DOI":"10.1109\/tifs.2013.2278843","volume":"8","author":"H Zhao","year":"2013","unstructured":"H. Zhao, H. Malik, Audio recording location identification using acoustic environment signature. IEEE Trans. Inf. Forensics Secur. 8(11), 1746\u20131759 (2013). https:\/\/doi.org\/10.1109\/tifs.2013.2278843","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR7","doi-asserted-by":"crossref","unstructured":"C. Zeng, D. Zhu, Z. Wang, Y. Yang, Deep and shallow feature fusion and recognition of recording devices based on attention mechanism, in Advances in Intelligent Networking and Collaborative Systems (Springer, Cham, 2020), pp. 372\u2013381","DOI":"10.1007\/978-3-030-57796-4_36"},{"key":"900_CR8","doi-asserted-by":"crossref","unstructured":"L. Cuccovillo, S. Mann, M. Tagliasacchi, P. Aichroth, Audio tampering detection via microphone classification, in 15th International Workshop on Multimedia Signal Processing (2013), pp. 177\u2013182","DOI":"10.1109\/MMSP.2013.6659284"},{"key":"900_CR9","doi-asserted-by":"crossref","unstructured":"X. Meng, C. Li, L. Tian, Detecting audio splicing forgery algorithm based on local noise level estimation, in 5th International Conference on Systems and Informatics (2018), pp. 861\u2013865","DOI":"10.1109\/ICSAI.2018.8599318"},{"issue":"1","key":"900_CR10","doi-asserted-by":"publisher","first-page":"1009","DOI":"10.1007\/s11042-016-4277-2","volume":"77","author":"M Zakariah","year":"2017","unstructured":"M. Zakariah, M.K. Khan, H. Malik, Digital multimedia audio forensics: past, present and future. Multimed. Tools Appl. 77(1), 1009\u20131040 (2017). https:\/\/doi.org\/10.1007\/s11042-016-4277-2","journal-title":"Multimed. Tools Appl."},{"issue":"9","key":"900_CR11","doi-asserted-by":"publisher","first-page":"2441","DOI":"10.1109\/tifs.2019.2900935","volume":"14","author":"Q Yan","year":"2019","unstructured":"Q. Yan, R. Yang, J. 
Huang, Detection of speech smoothing on very short clips. IEEE Trans. Inf. Forensics Secur. 14(9), 2441\u20132453 (2019). https:\/\/doi.org\/10.1109\/tifs.2019.2900935","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"issue":"9","key":"900_CR12","doi-asserted-by":"publisher","first-page":"2331","DOI":"10.1109\/tifs.2019.2895965","volume":"14","author":"Q Yan","year":"2019","unstructured":"Q. Yan, R. Yang, J. Huang, Robust copy\u2013move detection of speech recording using similarities of pitch and formant. IEEE Trans. Inf. Forensics Secur. 14(9), 2331\u20132341 (2019). https:\/\/doi.org\/10.1109\/tifs.2019.2895965","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"issue":"4","key":"900_CR13","doi-asserted-by":"publisher","first-page":"2303","DOI":"10.1007\/s11042-014-2406-3","volume":"75","author":"J Chen","year":"2014","unstructured":"J. Chen, S. Xiang, H. Huang, W. Liu, Detecting and locating digital audio forgeries based on singularity analysis with wavelet packet. Multimed. Tools Appl. 75(4), 2303\u20132325 (2014). https:\/\/doi.org\/10.1007\/s11042-014-2406-3","journal-title":"Multimed. Tools Appl."},{"key":"900_CR14","doi-asserted-by":"publisher","first-page":"63","DOI":"10.1016\/j.dsp.2016.07.015","volume":"60","author":"X Lin","year":"2017","unstructured":"X. Lin, X. Kang, Exposing speech tampering via spectral phase analysis. Digital Signal Process. 60, 63\u201374 (2017). https:\/\/doi.org\/10.1016\/j.dsp.2016.07.015","journal-title":"Digital Signal Process."},{"key":"900_CR15","first-page":"37","volume":"43","author":"Z Xie","year":"2018","unstructured":"Z. Xie, Z. Wei, X. Liu, Y. Xue, Y. Yeung, Copy-move detection of digital audio based on multi-feature decision. J. Inf. Secur. Appl. 43, 37\u201346 (2018)","journal-title":"J. Inf. Secur. Appl."},{"key":"900_CR16","doi-asserted-by":"publisher","unstructured":"Z. Wang, J. Wang, C. Zeng, Q. Min, Y. Tian, M. 
Zuo, Digital audio tampering detection based on ENF consistency, in International Conference on Wavelet Analysis and Pattern Recognition (2018), pp. 209\u2013214. https:\/\/doi.org\/10.1109\/icwapr.2018.8521378","DOI":"10.1109\/icwapr.2018.8521378"},{"issue":"2","key":"900_CR17","doi-asserted-by":"publisher","first-page":"277","DOI":"10.1109\/tifs.2018.2837645","volume":"14","author":"A Hajj-Ahmad","year":"2019","unstructured":"A. Hajj-Ahmad, C.-W. Wong, S. Gambino, Q. Zhu, M. Yu, M. Wu, Factors affecting ENF capture in audio. IEEE Trans. Inf. Forensics Secur. 14(2), 277\u2013288 (2019). https:\/\/doi.org\/10.1109\/tifs.2018.2837645","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR18","doi-asserted-by":"publisher","first-page":"1417","DOI":"10.1109\/TIFS.2013.2272217","volume":"8","author":"R Garg","year":"2013","unstructured":"R. Garg, A. Varna, A. Hajj-Ahmad, M. Wu, \u201cseeing\u2019\u2019 enf: Power-signature-based timestamp for digital multimedia via optical sensing and signal processing. IEEE Trans. Inf. Forensics Secur. 8, 1417\u20131432 (2013)","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR19","doi-asserted-by":"publisher","first-page":"1868","DOI":"10.1109\/TIFS.2019.2952264","volume":"15","author":"G Hua","year":"2020","unstructured":"G. Hua, H. Zhang, ENF signal enhancement in audio recordings. IEEE Trans. Inf. Forensics Secur. 15, 1868\u20131878 (2020). https:\/\/doi.org\/10.1109\/TIFS.2019.2952264","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR20","doi-asserted-by":"publisher","first-page":"3874","DOI":"10.1109\/TIFS.2021.3099697","volume":"16","author":"G Hua","year":"2021","unstructured":"G. Hua, H. Liao, H. Zhang, D. Ye, J. Ma, Robust enf estimation based on harmonic enhancement and maximum weight clique. IEEE Trans. Inf. Forensics Secur. 16, 3874\u20133887 (2021). https:\/\/doi.org\/10.1109\/TIFS.2021.3099697","journal-title":"IEEE Trans. Inf. 
Forensics Secur."},{"issue":"5","key":"900_CR21","doi-asserted-by":"publisher","first-page":"1003","DOI":"10.1109\/tifs.2016.2516824","volume":"11","author":"G Hua","year":"2016","unstructured":"G. Hua, Y. Zhang, J. Goh, V.L.L. Thing, Audio authentication by exploring the absolute-error-map of ENF signals. IEEE Trans. Inf. Forensics Secur. 11(5), 1003\u20131016 (2016). https:\/\/doi.org\/10.1109\/tifs.2016.2516824","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR22","doi-asserted-by":"publisher","first-page":"2314","DOI":"10.1109\/TIFS.2014.2363524","volume":"9","author":"PA Esquef","year":"2014","unstructured":"P.A. Esquef, J. Apolinario, L. Biscainho, Edit detection in speech recordings via instantaneous electric network frequency variations. IEEE Trans. Inf. Forensics Secur. 9, 2314\u20132326 (2014)","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR23","doi-asserted-by":"publisher","first-page":"534","DOI":"10.1109\/TIFS.2010.2051270","volume":"5","author":"D Nicolalde","year":"2010","unstructured":"D. Nicolalde, J. Apolinario, L. Biscainho, Audio authenticity: detecting enf discontinuity with high precision phase analysis. IEEE Trans. Inf. Forensics Secur. 5, 534\u2013543 (2010)","journal-title":"IEEE Trans. Inf. Forensics Secur."},{"key":"900_CR24","doi-asserted-by":"publisher","unstructured":"D.P. Nicolalde, J.A. Apolinario, Evaluating digital audio authenticity with spectral distances and ENF phase change, in IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE, Taipei, 2009), pp. 1417\u20131420. https:\/\/doi.org\/10.1109\/icassp.2009.4959859","DOI":"10.1109\/icassp.2009.4959859"},{"key":"900_CR25","doi-asserted-by":"publisher","unstructured":"L. Wang, H. Liang, X. Lin, X. Kang, Revealing the processing history of pitch-shifted voice using CNNs, in IEEE International Workshop on Information Forensics and Security (WIFS) (IEEE, Hong Kong, 2018), pp. 1\u20137. 
https:\/\/doi.org\/10.1109\/wifs.2018.8630783","DOI":"10.1109\/wifs.2018.8630783"},{"key":"900_CR26","doi-asserted-by":"publisher","first-page":"1","DOI":"10.1109\/TMM.2016.2571999","volume":"18","author":"X Lin","year":"2016","unstructured":"X. Lin, J. Liu, X. Kang, Audio recapture detection with convolutional neural networks. IEEE Trans. Multimed. 18, 1\u201315 (2016)","journal-title":"IEEE Trans. Multimed."},{"key":"900_CR27","doi-asserted-by":"crossref","unstructured":"S. Jadhav, R. Patole, P. Rege, Audio splicing detection using convolutional neural network, in 10th International Conference on Computing, Communication and Networking Technologies (ICCCNT) (2019), pp. 1\u20135","DOI":"10.1109\/ICCCNT45670.2019.8944345"},{"key":"900_CR28","unstructured":"A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A.N. Gomez, \u0141. Kaiser, I. Polosukhin, Attention is all you need, in Advances in Neural Information Processing Systems, vol. 30 (2017), pp. 1\u201311"}],"container-title":["EURASIP Journal on Advances in Signal 
Processing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13634-022-00900-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/article\/10.1186\/s13634-022-00900-4\/fulltext.html","content-type":"text\/html","content-version":"vor","intended-application":"text-mining"},{"URL":"https:\/\/link.springer.com\/content\/pdf\/10.1186\/s13634-022-00900-4.pdf","content-type":"application\/pdf","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2022,8,13]],"date-time":"2022-08-13T06:17:34Z","timestamp":1660371454000},"score":1,"resource":{"primary":{"URL":"https:\/\/asp-eurasipjournals.springeropen.com\/articles\/10.1186\/s13634-022-00900-4"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2022,8,13]]},"references-count":28,"journal-issue":{"issue":"1","published-online":{"date-parts":[[2022,12]]}},"alternative-id":["900"],"URL":"https:\/\/doi.org\/10.1186\/s13634-022-00900-4","relation":{},"ISSN":["1687-6180"],"issn-type":[{"value":"1687-6180","type":"electronic"}],"subject":[],"published":{"date-parts":[[2022,8,13]]},"assertion":[{"value":"30 September 2021","order":1,"name":"received","label":"Received","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"28 July 2022","order":2,"name":"accepted","label":"Accepted","group":{"name":"ArticleHistory","label":"Article History"}},{"value":"13 August 2022","order":3,"name":"first_online","label":"First Online","group":{"name":"ArticleHistory","label":"Article History"}},{"order":1,"name":"Ethics","group":{"name":"EthicsHeading","label":"Declarations"}},{"value":"The authors declare that they have no competing interests.","order":2,"name":"Ethics","group":{"name":"EthicsHeading","label":"Competing interests"}}],"article-number":"69"}}