{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,3]],"date-time":"2026-04-03T17:31:56Z","timestamp":1775237516866,"version":"3.50.1"},"reference-count":45,"publisher":"MDPI AG","issue":"19","license":[{"start":{"date-parts":[[2023,9,29]],"date-time":"2023-09-29T00:00:00Z","timestamp":1695945600000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"funder":[{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["52171322"],"award-info":[{"award-number":["52171322"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["2020YFB1313200"],"award-info":[{"award-number":["2020YFB1313200"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["D5000210944"],"award-info":[{"award-number":["D5000210944"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100001809","name":"National Natural Science Foundation of China","doi-asserted-by":"publisher","award":["PF2023066"],"award-info":[{"award-number":["PF2023066"]}],"id":[{"id":"10.13039\/501100001809","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program","doi-asserted-by":"publisher","award":["52171322"],"award-info":[{"award-number":["52171322"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development 
Program","doi-asserted-by":"publisher","award":["2020YFB1313200"],"award-info":[{"award-number":["2020YFB1313200"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program","doi-asserted-by":"publisher","award":["D5000210944"],"award-info":[{"award-number":["D5000210944"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"DOI":"10.13039\/501100012166","name":"National Key Research and Development Program","doi-asserted-by":"publisher","award":["PF2023066"],"award-info":[{"award-number":["PF2023066"]}],"id":[{"id":"10.13039\/501100012166","id-type":"DOI","asserted-by":"publisher"}]},{"name":"Fundamental Research Funds for the Central Universities","award":["52171322"],"award-info":[{"award-number":["52171322"]}]},{"name":"Fundamental Research Funds for the Central Universities","award":["2020YFB1313200"],"award-info":[{"award-number":["2020YFB1313200"]}]},{"name":"Fundamental Research Funds for the Central Universities","award":["D5000210944"],"award-info":[{"award-number":["D5000210944"]}]},{"name":"Fundamental Research Funds for the Central Universities","award":["PF2023066"],"award-info":[{"award-number":["PF2023066"]}]},{"name":"Graduate Innovation Fund","award":["52171322"],"award-info":[{"award-number":["52171322"]}]},{"name":"Graduate Innovation Fund","award":["2020YFB1313200"],"award-info":[{"award-number":["2020YFB1313200"]}]},{"name":"Graduate Innovation Fund","award":["D5000210944"],"award-info":[{"award-number":["D5000210944"]}]},{"name":"Graduate Innovation Fund","award":["PF2023066"],"award-info":[{"award-number":["PF2023066"]}]}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["Remote Sensing"],"abstract":"<jats:p>Underwater target detection technology plays a crucial role in the autonomous exploration of underwater vehicles. 
In recent years, significant progress has been made in the field of target detection through the application of artificial intelligence technology. Effectively applying AI techniques to underwater target detection is a highly promising area of research. However, the difficulty and high cost of underwater acoustic data collection have led to a severe lack of data, greatly restricting the development of deep-learning-based target detection methods. The present study is the first to utilize diffusion models for generating underwater acoustic data, thereby effectively addressing the issue of poor detection performance arising from the scarcity of underwater acoustic data. Firstly, we place iron cylinders and cones underwater (simulating small preset targets such as mines). Subsequently, we employ an autonomous underwater vehicle (AUV) equipped with side-scan sonar (SSS) to obtain underwater target data. The collected target data are augmented using the denoising diffusion probabilistic model (DDPM). Finally, the augmented data are used to train an improved YOLOv7 model, and its detection performance is evaluated on a test set. The results demonstrate the effectiveness of the proposed method in generating similar data and overcoming the challenge of limited training sample data. Compared to models trained solely on the original data, the model trained with augmented data shows a mean average precision (mAP) improvement of approximately 30% across various mainstream detection networks. 
Additionally, compared to the original model, the improved YOLOv7 model proposed in this study exhibits a 2% increase in mAP on the underwater dataset.<\/jats:p>","DOI":"10.3390\/rs15194772","type":"journal-article","created":{"date-parts":[[2023,10,2]],"date-time":"2023-10-02T04:28:08Z","timestamp":1696220888000},"page":"4772","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":14,"title":["Small-Sample Underwater Target Detection: A Joint Approach Utilizing Diffusion and YOLOv7 Model"],"prefix":"10.3390","volume":"15","author":[{"given":"Chensheng","family":"Cheng","sequence":"first","affiliation":[{"name":"School of Marine Science and Technology, Northwestern Polytechnical University, Xi\u2019an 710072, China"}]},{"given":"Xujia","family":"Hou","sequence":"additional","affiliation":[{"name":"School of Marine Science and Technology, Northwestern Polytechnical University, Xi\u2019an 710072, China"}]},{"given":"Xin","family":"Wen","sequence":"additional","affiliation":[{"name":"School of Marine Science and Technology, Northwestern Polytechnical University, Xi\u2019an 710072, China"}]},{"given":"Weidong","family":"Liu","sequence":"additional","affiliation":[{"name":"School of Marine Science and Technology, Northwestern Polytechnical University, Xi\u2019an 710072, China"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-1774-727X","authenticated-orcid":false,"given":"Feihu","family":"Zhang","sequence":"additional","affiliation":[{"name":"School of Marine Science and Technology, Northwestern Polytechnical University, Xi\u2019an 710072, China"}]}],"member":"1968","published-online":{"date-parts":[[2023,9,29]]},"reference":[{"key":"ref_1","doi-asserted-by":"crossref","unstructured":"Li, J., Chen, L., Shen, J., Xiao, X., Liu, X., Sun, X., Wang, X., and Li, D. (2023). Improved Neural Network with Spatial Pyramid Pooling and Online Datasets Preprocessing for Underwater Target Detection Based on Side Scan Sonar Imagery. 
Remote Sens., 15.","DOI":"10.3390\/rs15020440"},{"key":"ref_2","doi-asserted-by":"crossref","unstructured":"Wu, M., Wang, Q., Rigall, E., Li, K., Zhu, W., He, B., and Yan, T. (2019). ECNet: Efficient convolutional networks for side scan sonar image segmentation. Sensors, 19.","DOI":"10.3390\/s19092009"},{"key":"ref_3","doi-asserted-by":"crossref","unstructured":"Yu, Y., Zhao, J., Gong, Q., Huang, C., Zheng, G., and Ma, J. (2021). Real-time underwater maritime object detection in side-scan sonar images based on transformer-YOLOv5. Remote Sens., 13.","DOI":"10.3390\/rs13183555"},{"key":"ref_4","doi-asserted-by":"crossref","unstructured":"Szymak, P., Piskur, P., and Naus, K. (2020). The effectiveness of using a pretrained deep learning neural networks for object classification in underwater video. Remote Sens., 12.","DOI":"10.3390\/rs12183020"},{"key":"ref_5","doi-asserted-by":"crossref","first-page":"103630","DOI":"10.1016\/j.apor.2023.103630","article-title":"Real-time underwater target detection for AUV using side scan sonar images based on deep learning","volume":"138","author":"Li","year":"2023","journal-title":"Appl. Ocean Res."},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"l5604413","DOI":"10.1109\/TGRS.2023.3248605","article-title":"Underwater Forward-Looking Sonar Images Target Detection via Speckle Reduction and Scene Prior","volume":"61","author":"Long","year":"2023","journal-title":"IEEE Trans. Geosci. Remote. Sens."},{"key":"ref_7","unstructured":"Doersch, C. (2016). Tutorial on variational autoencoders. arXiv."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Kingma, D.P., and Welling, M. (2019). An introduction to variational autoencoders. 
Foundations and Trends\u00ae in Machine Learning, Now Publishers.","DOI":"10.1561\/9781680836233"},{"key":"ref_9","first-page":"30","article-title":"Variational walkback: Learning a transition operator as a stochastic recurrent net","volume":"Volume 30","author":"Alias","year":"2017","journal-title":"Advances in Neural Information Processing Systems, Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4\u20139 December 2017"},{"key":"ref_10","unstructured":"Kim, T., and Bengio, Y. (2016). Deep directed generative models with energy-based probability estimation. arXiv."},{"key":"ref_11","doi-asserted-by":"crossref","first-page":"139","DOI":"10.1145\/3422622","article-title":"Generative adversarial networks","volume":"63","author":"Goodfellow","year":"2020","journal-title":"Commun. ACM"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"53","DOI":"10.1109\/MSP.2017.2765202","article-title":"Generative adversarial networks: An overview","volume":"35","author":"Creswell","year":"2018","journal-title":"IEEE Signal Process. Mag."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"3313","DOI":"10.1109\/TKDE.2021.3130191","article-title":"A review on generative adversarial networks: Algorithms, theory, and applications","volume":"35","author":"Gui","year":"2021","journal-title":"IEEE Trans. Knowl. Data Eng."},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"3964","DOI":"10.1109\/TPAMI.2020.2992934","article-title":"Normalizing flows: An introduction and review of current methods","volume":"43","author":"Kobyzev","year":"2020","journal-title":"IEEE Trans. Pattern Anal. Mach. Intell."},{"key":"ref_15","first-page":"2617","article-title":"Normalizing flows for probabilistic modeling and inference","volume":"22","author":"Papamakarios","year":"2021","journal-title":"J. Mach. Learn. 
Res."},{"key":"ref_16","doi-asserted-by":"crossref","unstructured":"Saharia, C., Chan, W., Chang, H., Lee, C., Ho, J., Salimans, T., Fleet, D., and Norouzi, M. (2022, January 6\u201310). Palette: Image-to-image diffusion models. Proceedings of the ACM SIGGRAPH 2022 Conference, Los Angeles, CA, USA.","DOI":"10.1145\/3528233.3530757"},{"key":"ref_17","unstructured":"Ho, J., Jain, A., and Abbeel, P. (2020, January 6\u201312). Denoising diffusion probabilistic models. Advances in Neural Information Processing Systems: 34th Annual Conference on Neural Information Processing Systems (NeurIPS 2020), Virtual Conference."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Chen, Y., Liang, H., and Pang, S. (2022). Study on small samples active sonar target recognition based on deep learning. J. Mar. Sci. Eng., 10.","DOI":"10.3390\/jmse10081144"},{"key":"ref_19","doi-asserted-by":"crossref","first-page":"2819","DOI":"10.1049\/iet-ipr.2019.1735","article-title":"Underwater sonar image classification using generative adversarial network and convolutional neural network","volume":"14","author":"Xu","year":"2020","journal-title":"IET Image Process."},{"key":"ref_20","doi-asserted-by":"crossref","unstructured":"Wang, Z., Guo, Q., Lei, M., Guo, S., and Ye, X. (2021, January 26\u201328). High-Quality Sonar Image Generation Algorithm Based on Generative Adversarial Networks. Proceedings of the 2021 40th Chinese Control Conference (CCC), IEEE, Shanghai, China.","DOI":"10.23919\/CCC52363.2021.9550195"},{"key":"ref_21","doi-asserted-by":"crossref","unstructured":"Jegorova, M., Karjalainen, A.I., Vazquez, J., and Hospedales, T. (August, January 31). Full-scale continuous synthetic sonar data generation with markov conditional generative adversarial networks. 
Proceedings of the 2020 IEEE International Conference on Robotics and Automation (ICRA), Paris, France.","DOI":"10.1109\/ICRA40945.2020.9197353"},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1505","DOI":"10.1109\/LGRS.2020.3005679","article-title":"Side-scan sonar image synthesis based on generative adversarial network for images in multiple frequencies","volume":"18","author":"Jiang","year":"2020","journal-title":"IEEE Geosci. Remote. Sens. Lett."},{"key":"ref_23","doi-asserted-by":"crossref","unstructured":"Lee, E.h., Park, B., Jeon, M.H., Jang, H., Kim, A., and Lee, S. (2022). Data augmentation using image translation for underwater sonar image segmentation. PLoS ONE, 17.","DOI":"10.1371\/journal.pone.0272602"},{"key":"ref_24","doi-asserted-by":"crossref","first-page":"242","DOI":"10.1080\/01691864.2021.1873845","article-title":"Cyclegan-based realistic image dataset generation for forward-looking sonar","volume":"35","author":"Liu","year":"2021","journal-title":"Adv. Robot."},{"key":"ref_25","first-page":"1274260","article-title":"Spectral Normalized CycleGAN with Application in Semisupervised Semantic Segmentation of Sonar Images","volume":"2022","author":"Zhang","year":"2022","journal-title":"Comput. Intell. Neurosci."},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Karjalainen, A.I., Mitchell, R., and Vazquez, J. (2019, January 9\u201310). Training and validation of automatic target recognition systems using generative adversarial networks. Proceedings of the 2019 Sensor Signal Processing for Defence Conference (SSPD), Brighton, UK.","DOI":"10.1109\/SSPD.2019.8751666"},{"key":"ref_27","unstructured":"Ho, J., Salimans, T., Gritsenko, A., Chan, W., Norouzi, M., and Fleet, D.J. (2022). Video diffusion models. arXiv."},{"key":"ref_28","unstructured":"Batzolis, G., Stanczuk, J., Sch\u00f6nlieb, C.B., and Etmann, C. (2021). Conditional image generation with score-based diffusion models. 
arXiv."},{"key":"ref_29","unstructured":"Chen, T., Zhang, R., and Hinton, G. (2022). Analog bits: Generating discrete data using diffusion models with self-conditioning. arXiv."},{"key":"ref_30","unstructured":"Alcaraz, J.M.L., and Strodthoff, N. (2022). Diffusion-based time series imputation and forecasting with structured state space models. arXiv."},{"key":"ref_31","unstructured":"Liu, J., Li, C., Ren, Y., Chen, F., and Zhao, Z. (March, January 22). Diffsinger: Singing voice synthesis via shallow diffusion mechanism. Proceedings of the AAAI Conference on Artificial Intelligence, Virtual Conference."},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Koizumi, Y., Zen, H., Yatabe, K., Chen, N., and Bacchiani, M. (2022). SpecGrad: Diffusion probabilistic model based neural vocoder with adaptive noise spectral shaping. arXiv.","DOI":"10.21437\/Interspeech.2022-301"},{"key":"ref_33","unstructured":"Cao, H., Tan, C., Gao, Z., Chen, G., Heng, P.A., and Li, S.Z. (2022). A survey on generative diffusion model. arXiv."},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Luo, S., Su, Y., Peng, X., Wang, S., Peng, J., and Ma, J. (2022). Antigen-specific antibody design and optimization with diffusion-based generative models. bioRxiv.","DOI":"10.1101\/2022.07.10.499510"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Wang, C.Y., Bochkovskiy, A., and Liao, H.Y.M. (2022). YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors. arXiv.","DOI":"10.1109\/CVPR52729.2023.00721"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Liu, K., Sun, Q., Sun, D., Peng, L., Yang, M., and Wang, N. (2023). Underwater target detection based on improved YOLOv7. J. Mar. Sci. Eng., 11.","DOI":"10.3390\/jmse11030677"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Chen, X., Yuan, M., Yang, Q., Yao, H., and Wang, H. (2023). Underwater-YCC: Underwater Target Detection Optimization Algorithm Based on YOLOv7. 
J. Mar. Sci. Eng., 11.","DOI":"10.3390\/jmse11050995"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Wang, Q., Wu, B., Zhu, P., Li, P., Zuo, W., and Hu, Q. (2020, January 13\u201319). ECA-Net: Efficient channel attention for deep convolutional neural networks. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA.","DOI":"10.1109\/CVPR42600.2020.01155"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Hou, Q., Zhou, D., and Feng, J. (2021, January 20\u201325). Coordinate attention for efficient mobile network design. Proceedings of the IEEE\/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA.","DOI":"10.1109\/CVPR46437.2021.01350"},{"key":"ref_40","unstructured":"Park, J., Woo, S., Lee, J.Y., and Kweon, I.S. (2018). Bam: Bottleneck attention module. arXiv."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Hu, J., Shen, L., and Sun, G. (2018, January 18\u201323). Squeeze-and-excitation networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA.","DOI":"10.1109\/CVPR.2018.00745"},{"key":"ref_42","doi-asserted-by":"crossref","unstructured":"Woo, S., Park, J., Lee, J.Y., and Kweon, I.S. (2018, January 8\u201314). Cbam: Convolutional block attention module. Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany.","DOI":"10.1007\/978-3-030-01234-2_1"},{"key":"ref_43","doi-asserted-by":"crossref","first-page":"80804","DOI":"10.1109\/ACCESS.2022.3195901","article-title":"Solar cell surface defect detection based on improved YOLO v5","volume":"10","author":"Zhang","year":"2022","journal-title":"IEEE Access"},{"key":"ref_44","unstructured":"Sitaula, C., KC, S., and Aryal, J. (2023). Enhanced Multi-level Features for Very High Resolution Remote Sensing Scene Classification. 
arXiv."},{"key":"ref_45","doi-asserted-by":"crossref","unstructured":"Zhang, Z., Yan, Z., Jing, J., Gu, H., and Li, H. (2023). Generating Paired Seismic Training Data with Cycle-Consistent Adversarial Networks. Remote Sens., 15.","DOI":"10.3390\/rs15010265"}],"container-title":["Remote Sensing"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/19\/4772\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,10]],"date-time":"2025-10-10T21:02:24Z","timestamp":1760130144000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2072-4292\/15\/19\/4772"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2023,9,29]]},"references-count":45,"journal-issue":{"issue":"19","published-online":{"date-parts":[[2023,10]]}},"alternative-id":["rs15194772"],"URL":"https:\/\/doi.org\/10.3390\/rs15194772","relation":{},"ISSN":["2072-4292"],"issn-type":[{"value":"2072-4292","type":"electronic"}],"subject":[],"published":{"date-parts":[[2023,9,29]]}}}