Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations
| Published in: | Journal of Imaging vol. 11, no. 8 (2025), p. 281-307 |
|---|---|
| Main Author: | Pafan, Doungpaisan |
| Other Authors: | Peerapol, Khunarsa |
| Published: | MDPI AG |
| Subjects: | |
| Online Access: | Citation/Abstract; Full Text + Graphics; Full Text - PDF |
MARC
| LEADER | 00000nab a2200000uu 4500 | ||
|---|---|---|---|
| 001 | 3244042760 | ||
| 003 | UK-CbPIL | ||
| 022 | |a 2313-433X | ||
| 024 | 7 | |a 10.3390/jimaging11080281 |2 doi | |
| 035 | |a 3244042760 | ||
| 045 | 2 | |b d20250101 |b d20251231 | |
| 100 | 1 | |a Pafan, Doungpaisan |u Faculty of Industrial Technology and Management, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand; pafan.d@itm.kmutnb.ac.th | |
| 245 | 1 | |a Deep Spectrogram Learning for Gunshot Classification: A Comparative Study of CNN Architectures and Time-Frequency Representations | |
| 260 | |b MDPI AG |c 2025 | ||
| 513 | |a Journal Article | ||
| 520 | 3 | |a Gunshot sound classification plays a crucial role in public safety, forensic investigations, and intelligent surveillance systems. This study evaluates the performance of deep learning models in classifying firearm sounds by analyzing twelve time–frequency spectrogram representations, including Mel, Bark, MFCC, CQT, Cochleagram, STFT, FFT, Reassigned, Chroma, Spectral Contrast, and Wavelet. The dataset consists of 2148 gunshot recordings from four firearm types, collected in a semi-controlled outdoor environment under multi-orientation conditions. To leverage advanced computer vision techniques, all spectrograms were converted into RGB images using perceptually informed colormaps. This enabled the application of image processing approaches and fine-tuning of pre-trained Convolutional Neural Networks (CNNs) originally developed for natural image classification. Six CNN architectures—ResNet18, ResNet50, ResNet101, GoogLeNet, Inception-v3, and InceptionResNetV2—were trained on these spectrogram images. Experimental results indicate that CQT, Cochleagram, and Mel spectrograms consistently achieved high classification accuracy, exceeding 94% when paired with deep CNNs such as ResNet101 and InceptionResNetV2. These findings demonstrate that transforming time–frequency features into RGB images not only facilitates the use of image-based processing but also allows deep models to capture rich spectral–temporal patterns, providing a robust framework for accurate firearm sound classification. | |
| 651 | 4 | |a United States--US | |
| 653 | |a Accuracy | ||
| 653 | |a Datasets | ||
| 653 | |a Deep learning | ||
| 653 | |a Wavelet transforms | ||
| 653 | |a Law enforcement | ||
| 653 | |a Color imagery | ||
| 653 | |a Racial profiling | ||
| 653 | |a Artificial neural networks | ||
| 653 | |a Audio recordings | ||
| 653 | |a Public safety | ||
| 653 | |a Small arms | ||
| 653 | |a Computer vision | ||
| 653 | |a Automation | ||
| 653 | |a Machine learning | ||
| 653 | |a Image processing | ||
| 653 | |a Representations | ||
| 653 | |a Pattern recognition | ||
| 653 | |a Sound | ||
| 653 | |a Comparative studies | ||
| 653 | |a Artificial intelligence | ||
| 653 | |a Fourier transforms | ||
| 653 | |a Spectrograms | ||
| 653 | |a Time-frequency analysis | ||
| 653 | |a Neural networks | ||
| 653 | |a Classification | ||
| 653 | |a Image classification | ||
| 653 | |a Surveillance | ||
| 653 | |a Gun violence | ||
| 653 | |a Acoustics | ||
| 653 | |a Murders & murder attempts | ||
| 653 | |a Surveillance systems | ||
| 700 | 1 | |a Peerapol, Khunarsa |u Faculty of Science and Technology, Uttaradit Rajabhat University, Uttaradit 53000, Thailand | |
| 773 | 0 | |t Journal of Imaging |g vol. 11, no. 8 (2025), p. 281-307 | |
| 786 | 0 | |d ProQuest |t Advanced Technologies & Aerospace Database | |
| 856 | 4 | 1 | |3 Citation/Abstract |u https://www.proquest.com/docview/3244042760/abstract/embedded/6A8EOT78XXH2IG52?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text + Graphics |u https://www.proquest.com/docview/3244042760/fulltextwithgraphics/embedded/6A8EOT78XXH2IG52?source=fedsrch |
| 856 | 4 | 0 | |3 Full Text - PDF |u https://www.proquest.com/docview/3244042760/fulltextPDF/embedded/6A8EOT78XXH2IG52?source=fedsrch |
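
The abstract in field 520 describes converting time-frequency representations into RGB images through a colormap before passing them to image CNNs. The sketch below illustrates that general idea only; it is not the authors' pipeline. The frame length, hop size, 80 dB dynamic range, and the three-anchor colormap (colors approximating points on viridis, linearly interpolated) are all assumptions, and the function names are hypothetical. It uses a naive DFT from the standard library so that it runs without extra packages.

```python
import cmath
import math

# Assumed colormap: three anchors approximating viridis, linearly interpolated.
ANCHORS = [(0.0, (68, 1, 84)), (0.5, (33, 145, 140)), (1.0, (253, 231, 37))]


def stft_magnitude(signal, frame_len=64, hop=32):
    """Naive short-time DFT: returns a list of frames, each with frame_len//2 + 1 magnitudes."""
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len) for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        for k in range(frame_len // 2 + 1):
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def to_unit_db(spec, floor_db=-80.0):
    """Convert magnitudes to dB relative to the peak, clip at floor_db, scale to [0, 1]."""
    peak = max(max(row) for row in spec) or 1.0
    scaled = []
    for row in spec:
        out_row = []
        for value in row:
            db = 20.0 * math.log10(max(value / peak, 1e-10))
            db = max(db, floor_db)
            out_row.append((db - floor_db) / -floor_db)
        scaled.append(out_row)
    return scaled


def colormap(x):
    """Map a value in [0, 1] to an (R, G, B) tuple of 8-bit integers."""
    x = min(max(x, 0.0), 1.0)
    for (x0, c0), (x1, c1) in zip(ANCHORS, ANCHORS[1:]):
        if x <= x1:
            t = (x - x0) / (x1 - x0)
            return tuple(round(a + t * (b - a)) for a, b in zip(c0, c1))
    return ANCHORS[-1][1]


def spectrogram_to_rgb(signal):
    """Return an image as rows x columns of RGB tuples; row 0 is the highest frequency."""
    unit = to_unit_db(stft_magnitude(signal))
    n_bins = len(unit[0])
    return [
        [colormap(unit[t][n_bins - 1 - row]) for t in range(len(unit))]
        for row in range(n_bins)
    ]


# Synthetic test tone centred on DFT bin 8 (a stand-in for a real recording).
signal = [math.sin(2 * math.pi * 8 * n / 64) for n in range(512)]
image = spectrogram_to_rgb(signal)  # 33 frequency rows x 15 time columns
```

In practice the resulting image would be resized to the input resolution a pretrained CNN expects before fine-tuning; that step, and the specific colormaps and spectrogram settings used in the study, are not given in this record.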