{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,7,16]],"date-time":"2026-07-16T14:49:31Z","timestamp":1784213371599,"version":"3.55.0"},"reference-count":37,"publisher":"MDPI AG","issue":"16","license":[{"start":{"date-parts":[[2019,8,13]],"date-time":"2019-08-13T00:00:00Z","timestamp":1565654400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["41876110"],"award-info":[{"award-number":["41876110"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012226","name":"Fundamental Research Funds for the Central Universities","doi-asserted-by":"publisher","award":["3072019CFT0602"],"award-info":[{"award-number":["3072019CFT0602"]}],"id":[{"id":"10.13039\/501100012226","id-type":"DOI","asserted-by":"publisher"}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>A method with a combination of multi-dimensional fusion features and a modified deep neural network (MFF-MDNN) is proposed to recognize underwater acoustic targets in this paper. Specifically, due to the complex and changeable underwater environment, it is difficult to describe underwater acoustic signals with a single feature. The Gammatone frequency cepstral coefficient (GFCC) and modified empirical mode decomposition (MEMD) are developed to extract multi-dimensional features in this paper. Moreover, to ensure the same time dimension, a dimension reduction method is proposed to obtain multi-dimensional fusion features in the original underwater acoustic signals. Then, to reduce redundant features and further improve recognition accuracy, the Gaussian mixture model (GMM) is used to modify the structure of a deep neural network (DNN). 
Finally, the proposed underwater acoustic target recognition method can obtain an accuracy of 94.3% under a maximum of 800 iterations when the dataset has underwater background noise with weak targets. Compared with other methods, the recognition results demonstrate that the proposed method has higher accuracy and strong adaptability.<\/jats:p>","DOI":"10.3390\/rs11161888","type":"journal-article","created":{"date-parts":[[2019,8,13]],"date-time":"2019-08-13T04:31:21Z","timestamp":1565670681000},"page":"1888","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":87,"title":["Underwater Acoustic Target Recognition: A Combination of Multi-Dimensional Fusion Features and Modified Deep Neural Network"],"prefix":"10.3390","volume":"11","author":[{"given":"Xingmei","family":"Wang","sequence":"first","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Anhua","family":"Liu","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Yu","family":"Zhang","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Institute of Technology, Harbin 518000, China"}],"role":[{"vocabulary":"crossref","role":"author"}]},{"given":"Fuzhao","family":"Xue","sequence":"additional","affiliation":[{"name":"College of Computer Science and Technology, Harbin Engineering University, Harbin 150001, China"}],"role":[{"vocabulary":"crossref","role":"author"}]}],"member":"1968","published-online":{"date-parts":[[2019,8,13]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Yang, H., Shen, S., Yao, X., Sheng, M., and Wang, C. (2018). 
Competitive Deep-Belief Networks for Underwater Acoustic Target Recognition. Sensors, 18.","DOI":"10.3390\/s18040952"},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"145","DOI":"10.1016\/j.apacoust.2018.11.003","article-title":"Underwater sonar image classification using adaptive weights convolutional neural network","volume":"146","author":"Wang","year":"2018","journal-title":"Appl. Acoust."},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"2552","DOI":"10.1007\/978-981-10-6571-2_310","article-title":"Target Recognition Based on 3-D Sparse Underwater Sonar Sensor Network","volume":"463","author":"Liang","year":"2019","journal-title":"Lect. Notes Electr. Eng."},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"52","DOI":"10.1016\/j.apacoust.2018.03.026","article-title":"Extraction and classification of acoustic scattering from underwater target based on Wigner-Ville distribution","volume":"138","author":"Wu","year":"2018","journal-title":"Appl. Acoust."},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"17","DOI":"10.1016\/j.apacoust.2016.11.016","article-title":"Marine mammal sound classification based on a parallel recognition model and octave analysis","volume":"119","year":"2017","journal-title":"Appl. Acoust."},{"key":"ref_6","doi-asserted-by":"crossref","unstructured":"Can, G., Akbas, C.E., and Cetin, A.E. (2016, January 27\u201328). Recognition of vessel acoustic signatures using non-linear teager energy based features. 
Proceedings of the 2016 International Workshop on Computational Intelligence for Multimedia Understanding (IWCIM), Reggio Calabria, Italy.","DOI":"10.1109\/IWCIM.2016.7801190"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"291","DOI":"10.1017\/S0263574713001306","article-title":"AUV behavior recognition using behavior histograms, HMMs, and CRFs","volume":"32","author":"Novitzky","year":"2014","journal-title":"Robot"},{"key":"ref_8","doi-asserted-by":"crossref","first-page":"981","DOI":"10.1016\/S0031-3203(00)00046-7","article-title":"Reconstruction and segmentation of underwater acoustic images combining confidence information in MRF models","volume":"34","author":"Murino","year":"2001","journal-title":"Pattern Recognit."},{"key":"ref_9","unstructured":"Tegowski, J., Koza, R., Pawliczka, I., Skora, K., Trzcinska, K., and Zdroik, J. (2016, January 10\u201314). Statistical, Spectral and Wavelet Features of the Ambient Noise Detected in the Southern Baltic Sea. Proceedings of the 23rd International Congress on Sound and Vibration: From Ancient to Modern Acoustics, Int Inst Acoustics & Vibration, Auburn, Al, USA."},{"key":"ref_10","doi-asserted-by":"crossref","first-page":"3371","DOI":"10.1121\/1.4876439","article-title":"Using Gaussian mixture models to detect and classify dolphin whistles and pulses","volume":"135","author":"Parada","year":"2014","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_11","unstructured":"Lim, T., Bae, K., Hwang, C., and Lee, H. (2007, January 12\u201315). Classification of underwater transient signals using MFCC feature vector. 
Proceedings of the 9th International Symposium on Signal Processing and Its Applications (ISSPA), Sharjah, United Arab."},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"772","DOI":"10.1093\/ietfec\/e91-a.3.772","article-title":"Underwater Transient Signal Classification Using Binary Pattern Image of MFCC and Neural Network","volume":"E91A","author":"Lim","year":"2008","journal-title":"IEICE Trans. Fundam. Electron. Commun. Comput. Sci."},{"key":"ref_13","unstructured":"Jankowski, C., Quatieri, T., and Reynolds, D. (1995, January 9\u201312). Measuring fine structure in speech: Application to speaker identification. Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Detroit, MI, USA."},{"key":"ref_14","doi-asserted-by":"crossref","unstructured":"Guo, Y., and Gas, B. (2009, January 10\u201315). Underwater transient and non transient signals classification using predictive neural networks. Proceedings of the 2009 IEEE\/RSJ International Conference on Intelligent Robots and Systems, St. Louis, MO, USA.","DOI":"10.1109\/IROS.2009.5354031"},{"key":"ref_15","doi-asserted-by":"crossref","unstructured":"Wang, W., Li, S., Yang, J., Liu, Z., and Zhou, W. (2016, January 9\u201311). Feature Extraction of Underwater Target in Auditory Sensation Area Based on MFCC. Proceedings of the 2016 IEEE\/OES China Ocean Acoustics Symposium (COA), Harbin, China.","DOI":"10.1109\/COA.2016.7535736"},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"39","DOI":"10.1016\/j.specom.2016.12.004","article-title":"Empirical mode decomposition for adaptive AM-FM analysis of speech: A review","volume":"88","author":"Sharma","year":"2017","journal-title":"Speech Commun."},{"key":"ref_17","doi-asserted-by":"crossref","unstructured":"Holambe, R.S., and Deshpande, M.S. (2012). 
Advances in Non-Linear Modeling for Speech Processing, Springer Science and Business Media.","DOI":"10.1007\/978-1-4614-1505-3"},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Lian, Z., Xu, K., Wan, J., and Li, G. (2017, January 25\u201326). Underwater acoustic target classification based on modified GFCC features. Proceedings of the IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, China.","DOI":"10.1109\/IAEAC.2017.8054017"},{"key":"ref_19","first-page":"200","article-title":"Speaker recognition algorithm based on Gammatone filter bank","volume":"51","author":"Mao","year":"2015","journal-title":"Comput. Eng. Appl."},{"key":"ref_20","doi-asserted-by":"crossref","first-page":"1097","DOI":"10.1109\/TASL.2008.2001109","article-title":"Speaker Identification Using Instantaneous Frequencies","volume":"16","author":"Grimaldi","year":"2008","journal-title":"IEEE Trans. Audio Speech Lang. Process."},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Hasan, T., and Hansen, J.H. (2011, January 27\u201331). Robust speaker recognition in non-stationary room environments based on empirical mode decomposition. Proceedings of the Interspeech, Florence, Italy.","DOI":"10.21437\/Interspeech.2011-150"},{"key":"ref_22","doi-asserted-by":"crossref","unstructured":"Zeng, X., and Wang, S. (2014, January 5\u20138). Underwater sound classification based on Gammatone filter bank and Hilbert-Huang transform. Proceedings of the 2014 IEEE International Conference on Signal Processing, Communications and Computing (ICSPCC), Guilin, China.","DOI":"10.1109\/ICSPCC.2014.6986287"},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Deshpande, M.S., and Holambe, R.S. (2009, January 16\u201318). Speaker Identification Based on Robust AM-FM Features. 
Proceedings of the Second International Conference on Emerging Trends in Engineering & Technology, Nagpur, India.","DOI":"10.1109\/ICETET.2009.209"},{"key":"ref_24","first-page":"19","article-title":"Am-fm based robust speaker identification in babble noise","volume":"6","author":"Deshpande","year":"2011","journal-title":"Environments"},{"key":"ref_25","doi-asserted-by":"crossref","first-page":"903","DOI":"10.1098\/rspa.1998.0193","article-title":"The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis","volume":"454","author":"Huang","year":"1998","journal-title":"Proc. R. Soc. A Math. Phys. Eng. Sci."},{"key":"ref_26","doi-asserted-by":"crossref","first-page":"2317","DOI":"10.1098\/rspa.2003.1123","article-title":"A confidence limit for the empirical mode decomposition and Hilbert spectral analysis","volume":"459","author":"Wu","year":"2003","journal-title":"Proc. R. Soc. A Math. Phys. Eng. Sci."},{"key":"ref_27","first-page":"1926","article-title":"Speech formant frequency estimation based on Hilbert\u2013Huang transform","volume":"40","author":"Huang","year":"2006","journal-title":"J. Zhejiang Univ. Eng. Sci."},{"key":"ref_28","doi-asserted-by":"crossref","first-page":"26","DOI":"10.1016\/j.dsp.2016.07.012","article-title":"A better decomposition of speech obtained using modified Empirical Mode Decomposition","volume":"58","author":"Sharma","year":"2016","journal-title":"Digit. Signal Process."},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Huang, N.E., and Shen, S.S.P. (2005). Hilbert-Huang Transform and Its Applications, World Scientific.","DOI":"10.1142\/9789812703347"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"792","DOI":"10.1016\/j.sigpro.2005.06.011","article-title":"Speech pitch determination based on Hilbert-Huang transform","volume":"86","author":"Huang","year":"2006","journal-title":"Signal Process."},{"key":"ref_31","unstructured":"Hayakawa, S., and Itakura, F. 
(1994, January 19\u201322). Text-dependent speaker recognition using the information in the higher frequency band. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Adelaide, Australia."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Kotari, V., and Chang, K.C. (2011). Fusion and Gaussian Mixture Based Classifiers for SONAR data. Signal Processing, Sensor Fusion, and Target Recognition XX, International Society for Optics and Photonics.","DOI":"10.1117\/12.883697"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Liu, S., and Sim, K.C. (2014, January 4\u20139). On combining DNN and GMM with unsupervised speaker adaptation for robust automatic speech recognition. Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy.","DOI":"10.1109\/ICASSP.2014.6853585"},{"key":"ref_34","doi-asserted-by":"crossref","first-page":"1633","DOI":"10.1007\/s11063-017-9755-7","article-title":"An Improved Deep Clustering Model for Underwater Acoustical Targets","volume":"48","author":"Wang","year":"2018","journal-title":"Neural Process. Lett."},{"key":"ref_35","doi-asserted-by":"crossref","first-page":"EL196","DOI":"10.1121\/1.5054911","article-title":"Automatic classification of grouper species by their sounds using deep neural networks","volume":"144","author":"Ibrahim","year":"2018","journal-title":"J. Acoust. Soc. Am."},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Li, J., Dai, W., Metze, F., Qu, S., and Das, S. (2017, January 5\u20139). A comparison of Deep Learning methods for environmental sound detection. Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA.","DOI":"10.1109\/ICASSP.2017.7952131"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Dumpala, S.H., and Kopparapu, S.K. (2017, January 14\u201319). 
Improved speaker recognition system for stressed speech using deep neural networks. Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA.","DOI":"10.1109\/IJCNN.2017.7965997"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/16\/1888\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,11]],"date-time":"2025-10-11T13:10:38Z","timestamp":1760188238000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/11\/16\/1888"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2019,8,13]]},"references-count":37,"journal-issue":{"issue":"16","published-online":{"date-parts":[[2019,8]]}},"alternative-id":["rs11161888"],"URL":"https:\/\/doi.org\/10.3390\/rs11161888","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2019,8,13]]}}}