{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2026,4,8]],"date-time":"2026-04-08T16:25:48Z","timestamp":1775665548651,"version":"3.50.1"},"reference-count":54,"publisher":"MDPI AG","issue":"8","license":[{"start":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T00:00:00Z","timestamp":1755734400000},"content-version":"vor","delay-in-days":0,"URL":"https:\/\/creativecommons.org\/licenses\/by\/4.0\/"}],"content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":["J. Imaging"],"abstract":"<jats:p>Gunshot sound classification plays a crucial role in public safety, forensic investigations, and intelligent surveillance systems. This study evaluates the performance of deep learning models in classifying firearm sounds by analyzing twelve time\u2013frequency spectrogram representations, including Mel, Bark, MFCC, CQT, Cochleagram, STFT, FFT, Reassigned, Chroma, Spectral Contrast, and Wavelet. The dataset consists of 2148 gunshot recordings from four firearm types, collected in a semi-controlled outdoor environment under multi-orientation conditions. To leverage advanced computer vision techniques, all spectrograms were converted into RGB images using perceptually informed colormaps. This enabled the application of image processing approaches and fine-tuning of pre-trained Convolutional Neural Networks (CNNs) originally developed for natural image classification. Six CNN architectures\u2014ResNet18, ResNet50, ResNet101, GoogLeNet, Inception-v3, and InceptionResNetV2\u2014were trained on these spectrogram images. Experimental results indicate that CQT, Cochleagram, and Mel spectrograms consistently achieved high classification accuracy, exceeding 94% when paired with deep CNNs such as ResNet101 and InceptionResNetV2. These findings demonstrate that transforming time\u2013frequency features into RGB images not only facilitates the use of image-based processing but also allows deep models to capture rich spectral\u2013temporal patterns, providing a robust framework for accurate firearm sound classification.<\/jats:p>","DOI":"10.3390\/jimaging11080281","type":"journal-article","created":{"date-parts":[[2025,8,21]],"date-time":"2025-08-21T15:19:02Z","timestamp":1755789542000},"page":"281","update-policy":"https:\/\/doi.org\/10.3390\/mdpi_crossmark_policy","source":"Crossref","is-referenced-by-count":5,"title":["Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations"],"prefix":"10.3390","volume":"11","author":[{"ORCID":"https:\/\/orcid.org\/0000-0002-8969-2984","authenticated-orcid":false,"given":"Pafan","family":"Doungpaisan","sequence":"first","affiliation":[{"name":"Faculty of Industrial Technology and Management, King Mongkut\u2019s University of Technology North Bangkok, Bangkok 10800, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]},{"ORCID":"https:\/\/orcid.org\/0000-0002-4252-078X","authenticated-orcid":false,"given":"Peerapol","family":"Khunarsa","sequence":"additional","affiliation":[{"name":"Faculty of Science and Technology, Uttaradit Rajabhat University, Uttaradit 53000, Thailand"}],"role":[{"role":"author","vocabulary":"crossref"}]}],"member":"1968","published-online":{"date-parts":[[2025,8,21]]},"reference":[{"key":"ref_1","unstructured":"The Global Burden of Disease 2016 Injury Collaborators (2018). Global Mortality from Firearms, 1990\u20132016. JAMA, 320, 792\u2013814."},{"key":"ref_2","doi-asserted-by":"crossref","first-page":"120","DOI":"10.1186\/s12992-021-00771-8","article-title":"Firearm Violence: A Neglected \u2018Global Health\u2019 Issue","volume":"17","author":"Werbick","year":"2021","journal-title":"Glob. Health"},{"key":"ref_3","doi-asserted-by":"crossref","first-page":"103115","DOI":"10.1109\/ACCESS.2023.3316695","article-title":"Preventing Crimes Through Gunshots Recognition Using Novel Feature Engineering and Meta-Learning Approach","volume":"11","author":"Raza","year":"2023","journal-title":"IEEE Access"},{"key":"ref_4","doi-asserted-by":"crossref","first-page":"2783","DOI":"10.1016\/S0140-6736(24)01123-1","article-title":"Gun Violence: A Global Problem in Need of Local Solutions","volume":"403","author":"Zadey","year":"2024","journal-title":"Lancet"},{"key":"ref_5","doi-asserted-by":"crossref","unstructured":"Priya, M., Shendre, S., and Pati, P. (2024, January 21\u201323). Enhanced Gunshot Sound Detection Using AlexNet and XGBoost from Fourier Spectrograms. Proceedings of the 4th International Conference on Intelligent Technologies (CONIT), Bangalore, India.","DOI":"10.1109\/CONIT61985.2024.10626951"},{"key":"ref_6","doi-asserted-by":"crossref","first-page":"58940","DOI":"10.1109\/ACCESS.2024.3392649","article-title":"Enhancing Gun Detection with Transfer Learning and YAMNet Audio Classification","volume":"12","author":"Valliappan","year":"2024","journal-title":"IEEE Access"},{"key":"ref_7","doi-asserted-by":"crossref","first-page":"279","DOI":"10.1111\/1460-6984.12783","article-title":"Deep Learning in Automatic Detection of Dysphonia: Comparing Acoustic Features and Developing a Generalizable Framework","volume":"58","author":"Chen","year":"2022","journal-title":"Int. J. Lang. Commun. Disord."},{"key":"ref_8","doi-asserted-by":"crossref","unstructured":"Tibrewal, N., Leeuwis, N., and Alimardani, M. (2022). Classification of Motor Imagery EEG Using Deep Learning Increases Performance in Inefficient BCI Users. PLoS ONE, 17.","DOI":"10.1371\/journal.pone.0268880"},{"key":"ref_9","doi-asserted-by":"crossref","first-page":"5891","DOI":"10.1109\/JIOT.2024.3489963","article-title":"Deep-Transfer-Learning-Based Intelligent Gunshot Detection and Firearm Recognition Using Tri-Axial Acceleration","volume":"12","author":"Chen","year":"2025","journal-title":"IEEE Internet Things J."},{"key":"ref_10","doi-asserted-by":"crossref","unstructured":"Zhang, Y., Hu, X., Ma, J., Wang, X., Luo, H., Wu, Z., Zhang, S., Shi, D., Yu, Y., and Qiu, X. (2021). Clinical Applicable AI System Based on Deep Learning Algorithm for Differentiation of Pulmonary Infectious Disease. Front. Med., 8.","DOI":"10.3389\/fmed.2021.753055"},{"key":"ref_11","doi-asserted-by":"crossref","unstructured":"Leal, J., Rowe, S., Stearns, V., Connolly, R., Vaklavas, C., Liu, M., Storniolo, A., Wahl, R., Pomper, M., and Solnes, L. (2022). Automated Lesion Detection of Breast Cancer in [18F] FDG PET\/CT Using a Novel AI-Based Workflow. Front. Oncol., 12.","DOI":"10.3389\/fonc.2022.1007874"},{"key":"ref_12","doi-asserted-by":"crossref","first-page":"581","DOI":"10.1080\/17453674.2017.1344459","article-title":"Artificial Intelligence for Analyzing Orthopedic Trauma Radiographs","volume":"88","author":"Olczak","year":"2017","journal-title":"Acta Orthop."},{"key":"ref_13","doi-asserted-by":"crossref","first-page":"109091","DOI":"10.1016\/j.dib.2023.109091","article-title":"A Multi-Firearm, Multi-Orientation Audio Dataset of Gunshots","volume":"48","author":"Kabealo","year":"2023","journal-title":"Data Brief"},{"key":"ref_14","doi-asserted-by":"crossref","first-page":"907","DOI":"10.1049\/iet-spr.2017.0167","article-title":"Hard Component Detection of Transient Noise and Its Removal Using Empirical Mode Decomposition and Wavelet-Based Predictive Filter","volume":"12","author":"Tanwar","year":"2018","journal-title":"IET Signal Process."},{"key":"ref_15","doi-asserted-by":"crossref","first-page":"589","DOI":"10.1007\/s11760-020-01780-7","article-title":"Leaf Image Analysis-Based Crop Diseases Classification","volume":"15","author":"Kurmi","year":"2021","journal-title":"Signal Image Video Process."},{"key":"ref_16","doi-asserted-by":"crossref","first-page":"981","DOI":"10.1007\/s11045-022-00820-4","article-title":"Deep CNN Model for Crops\u2019 Diseases Detection Using Leaf Images","volume":"33","author":"Kurmi","year":"2022","journal-title":"Multidimens. Syst. Signal Process."},{"key":"ref_17","first-page":"13873","article-title":"Learning Temporal Resolution in Spectrogram for Audio Classification","volume":"38","author":"Liu","year":"2022","journal-title":"Proc. Aaai Conf. Artif. Intell."},{"key":"ref_18","doi-asserted-by":"crossref","unstructured":"Shariff, K., Haron, M., Ali, M., Yassin, I., Azami, M., and Kechik, M. (2024, January 20\u201321). Comparison of Spectrograms for Classification of Vehicles from Traffic Audio. Proceedings of the 2024 IEEE Symposium on Wireless Technology & Applications (ISWTA), Kuala Lumpur, Malaysia.","DOI":"10.1109\/ISWTA62130.2024.10651848"},{"key":"ref_19","doi-asserted-by":"crossref","unstructured":"Raghuwanshi, P., and Kaushik, R. (2024, January 8\u201310). Insect Classification Using Mel-CSTFT: A Fusion of Mel Spectrogram and Chroma STFT Features. Proceedings of the 2024 First International Conference on Electronics, Communication and Signal Processing (ICECSP), New Delhi, India.","DOI":"10.1109\/ICECSP61809.2024.10698728"},{"key":"ref_20","unstructured":"Wolf-Monheim, F. (2024). Spectral and Rhythm Features for Audio Classification with Deep Convolutional Neural Networks. arXiv."},{"key":"ref_21","unstructured":"Nirmal, M.R., and Shajee Mohan, B.S. (2020, January 17\u201319). Music Genre Classification Using Spectrograms. Proceedings of the 2020 International Conference on Power, Instrumentation, Control and Computing (PICC), Thrissur, India."},{"key":"ref_22","doi-asserted-by":"crossref","first-page":"1549","DOI":"10.1109\/TASLP.2020.2993152","article-title":"Speech\/Music Classification Using Features from Spectral Peaks","volume":"28","author":"Bhattacharjee","year":"2020","journal-title":"IEEE\/ACM Trans. Audio Speech, Lang. Process."},{"key":"ref_23","doi-asserted-by":"crossref","first-page":"106620","DOI":"10.1109\/ACCESS.2023.3318015","article-title":"A Survey of Audio Classification Using Deep Learning","volume":"11","author":"Zaman","year":"2023","journal-title":"IEEE Access"},{"key":"ref_24","doi-asserted-by":"crossref","unstructured":"Massoudi, M., Verma, S., and Jain, R. (2021, January 20\u201322). Urban Sound Classification Using CNN. Proceedings of the 2021 6th International Conference on Inventive Computation Technologies (ICICT), Coimbatore, India.","DOI":"10.1109\/ICICT50816.2021.9358621"},{"key":"ref_25","doi-asserted-by":"crossref","unstructured":"Krishna, B.V., Devi, G.D., Sumathy, V., and Manikandan, J. (2023, January 17\u201318). An Improved Music Genre Classification Using Convolutional Neural Network and Spectrograms. Proceedings of the 2023 International Conference on System, Computation, Automation and Networking (ICSCAN), Puducherry, India.","DOI":"10.1109\/ICSCAN58655.2023.10395616"},{"key":"ref_26","doi-asserted-by":"crossref","unstructured":"Pham, L., Lam, P., Nguyen, T., Nguyen, H., and Schindler, A. (October, January 30). Deepfake Audio Detection Using Spectrogram-Based Feature and Ensemble of Deep Learning Models. Proceedings of the 2024 IEEE 5th International Symposium on the Internet of Sounds (IS2), Erlangen, Germany.","DOI":"10.1109\/IS262782.2024.10704095"},{"key":"ref_27","doi-asserted-by":"crossref","unstructured":"Ananda, A., Ngan, K., Karaba\u011f, C., Ter-Sarkisov, A., Alonso, E., and Reyes-Aldasoro, C. (2021). Classification and Visualisation of Normal and Abnormal Radiographs: A Comparison between Eleven Convolutional Neural Network Architectures. Sensors, 21.","DOI":"10.1101\/2021.06.16.21259014"},{"key":"ref_28","doi-asserted-by":"crossref","unstructured":"Choudhury, B., Rajakumar, K., Badhale, A., Roy, A., Sahoo, R., and Margret, I. (2024, January 28\u201329). Comparative Analysis of Advanced Models for Satellite-Based Aircraft Identification. Proceedings of the 2024 International Conference on Smart Systems for Electrical, Electronics, Communication and Computer Engineering (ICSSEECC), Coimbatore, India.","DOI":"10.1109\/ICSSEECC61126.2024.10649458"},{"key":"ref_29","doi-asserted-by":"crossref","unstructured":"Wong, K., and Lin, L. (2022, January 27\u201328). A Comparison of Six Convolutional Neural Networks for Weapon Categorization. Proceedings of the 2022 International Conference on Electrical Engineering and Informatics (ICELTICs), Banda Aceh, Indonesia.","DOI":"10.1109\/ICELTICs56128.2022.9932092"},{"key":"ref_30","doi-asserted-by":"crossref","first-page":"107520","DOI":"10.1016\/j.apacoust.2020.107520","article-title":"A New Pyramidal Concatenated CNN Approach for Environmental Sound Classification","volume":"170","author":"Demir","year":"2020","journal-title":"Appl. Acoust."},{"key":"ref_31","doi-asserted-by":"crossref","first-page":"78236","DOI":"10.1109\/ACCESS.2020.2989673","article-title":"Automated Firearm Classification from Bullet Markings Using Deep Learning","volume":"8","author":"Pisantanaroj","year":"2020","journal-title":"IEEE Access"},{"key":"ref_32","doi-asserted-by":"crossref","unstructured":"Nalla, R., Varela, M., and Oispuu, M. (2021, January 23\u201325). Evaluation of Image Classification Networks on Impulse Sound Classification Task. Proceedings of the 2021 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), Karlsruhe, Germany.","DOI":"10.1109\/MFI52462.2021.9591202"},{"key":"ref_33","doi-asserted-by":"crossref","unstructured":"Nathala, S., Yakkati, R., Dayal, A., Manikandan, M., Zhou, J., and Cenkeramaddi, L. (2024, January 5\u20138). Vessel Type Classification Utilizing Underwater Acoustic Data and Deep Learning. Proceedings of the 2024 IEEE 19th Conference on Industrial Electronics and Applications (ICIEA), Kristiansand, Norway.","DOI":"10.1109\/ICIEA61579.2024.10665252"},{"key":"ref_34","doi-asserted-by":"crossref","unstructured":"Mahdianpari, M., Salehi, B., Rezaee, M., Mohammadimanesh, F., and Zhang, Y. (2018). Very Deep Convolutional Neural Networks for Complex Land Cover Mapping Using Multispectral Remote Sensing Imagery. Remote Sens., 10.","DOI":"10.3390\/rs10071119"},{"key":"ref_35","doi-asserted-by":"crossref","unstructured":"Alaqil, R., Alsuhaibani, J., Alhumaidi, B., Alnasser, R., Alotaibi, R., and Benhidour, H. (2020, January 3\u20135). Automatic Gun Detection from Images Using Faster R-CNN. Proceedings of the 2020 First International Conference of Smart Systems and Emerging Technologies (SMARTTECH), Riyadh, Saudi Arabia.","DOI":"10.1109\/SMART-TECH49988.2020.00045"},{"key":"ref_36","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., and Rabinovich, A. (2015, January 7\u201312). Going Deeper with Convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA.","DOI":"10.1109\/CVPR.2015.7298594"},{"key":"ref_37","doi-asserted-by":"crossref","unstructured":"Ahn, Y., Hwang, J., Jung, Y., Jeong, T., and Shin, J. (2021). Automated Mesiodens Classification System Using Deep Learning on Panoramic Radiographs of Children. Diagnostics, 11.","DOI":"10.3390\/diagnostics11081477"},{"key":"ref_38","doi-asserted-by":"crossref","unstructured":"Kim, Y., Park, J., Chang, M., Ryu, J., Lim, W., and Jung, S. (2021). Influence of the Depth of the Convolutional Neural Networks on an Artificial Intelligence Model for Diagnosis of Orthognathic Surgery. J. Pers. Med., 11.","DOI":"10.3390\/jpm11050356"},{"key":"ref_39","doi-asserted-by":"crossref","unstructured":"Kim, J., Nam, N., Shim, J., Jung, Y., Cho, B., and Hwang, J. (2020). Transfer Learning via Deep Neural Networks for Implant Fixture System Classification Using Periapical Radiographs. J. Clin. Med., 9.","DOI":"10.3390\/jcm9041117"},{"key":"ref_40","doi-asserted-by":"crossref","first-page":"e230095","DOI":"10.1148\/ryai.230095","article-title":"Deep Learning-Based Identification of Brain MRI Sequences Using a Model Trained on Large Multicentric Study Cohorts","volume":"6","author":"Mahmutoglu","year":"2023","journal-title":"Radiol. Artif. Intell."},{"key":"ref_41","doi-asserted-by":"crossref","unstructured":"Szegedy, C., Ioffe, S., Vanhoucke, V., and Alemi, A. (2017, January 4\u20139). Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), San Francisco, CA, USA.","DOI":"10.1609\/aaai.v31i1.11231"},{"key":"ref_42","doi-asserted-by":"crossref","first-page":"213","DOI":"10.1177\/0161734620932609","article-title":"Classification of Breast Masses on Ultrasound Shear Wave Elastography Using Convolutional Neural Networks","volume":"42","author":"Fujioka","year":"2020","journal-title":"Ultrason. Imaging"},{"key":"ref_43","doi-asserted-by":"crossref","unstructured":"Schiele, S., Arndt, T., Martin, B., Miller, S., Bauer, S., Banner, B., Brendel, E., Schenkirsch, G., Anthuber, M., and Huss, R. (2021). Deep Learning Prediction of Metastasis in Locally Advanced Colon Cancer Using Binary Histologic Tumor Images. Cancers, 13.","DOI":"10.3390\/cancers13092074"},{"key":"ref_44","doi-asserted-by":"crossref","first-page":"1","DOI":"10.1016\/j.mri.2020.10.003","article-title":"Deep-Learning Approach with Convolutional Neural Network for Classification of Maximum Intensity Projections of Dynamic Contrast-Enhanced Breast Magnetic Resonance Imaging","volume":"75","author":"Fujioka","year":"2020","journal-title":"Magn. Reson. Imaging"},{"key":"ref_45","doi-asserted-by":"crossref","first-page":"307","DOI":"10.21037\/atm.2019.06.29","article-title":"Deep Convolutional Neural Network Inception-v3 Model for Differential Diagnosing of Lymph Node in Cytological Images: A Pilot Study","volume":"7","author":"Guan","year":"2019","journal-title":"Ann. Transl. Med."},{"key":"ref_46","doi-asserted-by":"crossref","unstructured":"Su, S., Li, L., Wang, Y., and Li, Y. (2023). Stroke Risk Prediction by Color Doppler Ultrasound of Carotid Artery-Based Deep Learning Using Inception V3 and VGG-16. Front. Neurol., 14.","DOI":"10.3389\/fneur.2023.1111906"},{"key":"ref_47","first-page":"e936830","article-title":"A Novel Deep Learning Model to Distinguish Malignant Versus Benign Solid Lung Nodules","volume":"28","author":"Wang","year":"2022","journal-title":"Med. Sci. Monit. Int. Med. J. Exp. Clin. Res."},{"key":"ref_48","doi-asserted-by":"crossref","first-page":"3276","DOI":"10.21037\/qims-21-1089","article-title":"Adversarial Training for Prostate Cancer Classification Using Magnetic Resonance Imaging","volume":"12","author":"Hu","year":"2021","journal-title":"Quant. Imaging Med. Surg."},{"key":"ref_49","doi-asserted-by":"crossref","first-page":"699","DOI":"10.1016\/j.ijom.2021.09.001","article-title":"Performance of Deep Convolutional Neural Network for Classification and Detection of Oral Potentially Malignant Disorders in Photographic Images","volume":"51","author":"Warin","year":"2021","journal-title":"Int. J. Oral Maxillofac. Surg."},{"key":"ref_50","doi-asserted-by":"crossref","unstructured":"Liu, K., Qin, S., Ning, J., Xin, P., Wang, Q., Chen, Y., Zhao, W., Zhang, E., and Lang, N. (2023). Prediction of Primary Tumor Sites in Spinal Metastases Using a ResNet-50 Convolutional Neural Network Based on MRI. Cancers, 15.","DOI":"10.3390\/cancers15112974"},{"key":"ref_51","doi-asserted-by":"crossref","unstructured":"Lin, T., Dollar, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017, January 21\u201326). Feature Pyramid Networks for Object Detection. Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA.","DOI":"10.1109\/CVPR.2017.106"},{"key":"ref_52","doi-asserted-by":"crossref","first-page":"3676","DOI":"10.1109\/TIP.2018.2825107","article-title":"TextBoxes++: A Single-Shot Oriented Scene Text Detector","volume":"27","author":"Liao","year":"2018","journal-title":"IEEE Trans. Image Process."},{"key":"ref_53","doi-asserted-by":"crossref","first-page":"7820","DOI":"10.1109\/TCSVT.2024.3376773","article-title":"Asymptotic Feature Pyramid Network for Labeling Pixels and Regions","volume":"34","author":"Yang","year":"2024","journal-title":"IEEE Trans. Circuits Syst. Video Technol."},{"key":"ref_54","doi-asserted-by":"crossref","first-page":"2864","DOI":"10.1109\/TIP.2022.3141844","article-title":"CM-Net: Concentric Mask Based Arbitrary-Shaped Text Detection","volume":"31","author":"Yang","year":"2022","journal-title":"IEEE Trans. Image Process."}],"container-title":["Journal of Imaging"],"original-title":[],"language":"en","link":[{"URL":"https:\/\/www.mdpi.com\/2313-433X\/11\/8\/281\/pdf","content-type":"unspecified","content-version":"vor","intended-application":"similarity-checking"}],"deposited":{"date-parts":[[2025,10,9]],"date-time":"2025-10-09T18:32:38Z","timestamp":1760034758000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.mdpi.com\/2313-433X\/11\/8\/281"}},"subtitle":[],"short-title":[],"issued":{"date-parts":[[2025,8,21]]},"references-count":54,"journal-issue":{"issue":"8","published-online":{"date-parts":[[2025,8]]}},"alternative-id":["jimaging11080281"],"URL":"https:\/\/doi.org\/10.3390\/jimaging11080281","relation":{},"ISSN":["2313-433X"],"issn-type":[{"value":"2313-433X","type":"electronic"}],"subject":[],"published":{"date-parts":[[2025,8,21]]}}}