Di Nardo, F.; De Marco, R.; Veli, D. L.; Screpanti, L.; Castagna, B.; Lucchetti, A.; Scaradozzi, D. (2025). Towards Automated Dolphin Vocalization Recognition: A Preliminary CNN-Based Study. In Proceedings of the 33rd Mediterranean Conference on Control and Automation (MED 2025), Farah Hotel, March 2025, pp. 280-285. [10.1109/MED64031.2025.11073240]
Towards Automated Dolphin Vocalization Recognition: A Preliminary CNN-Based Study
Di Nardo F. (first author); Screpanti L.; Castagna B.; Scaradozzi D. (last author)
2025-01-01
Abstract
Interactions between dolphins and fishing activities pose economic and ecological challenges, requiring improved monitoring techniques. This study presents a preliminary investigation into the classification of bottlenose dolphin (Tursiops truncatus) vocalizations using convolutional neural networks (CNNs), applied to a dataset of underwater acoustic recordings. The proposed approach classified four main vocalization types (whistles, echolocation clicks, burst-pulse sounds, and feeding buzzes) while distinguishing them from background noise. The dataset, collected at the Oltremare marine park in Riccione, Italy, was processed through spectrogram analysis, with targeted filtering applied to enhance signal characteristics and suppress unwanted noise. The CNN model, trained using 10-fold cross-validation, achieved an average classification accuracy exceeding 95%, with precision, recall, and F1-score close to 90%. The results demonstrate the feasibility of deep learning-based classification but also highlight the need for large and diverse datasets, particularly for identifying feeding buzzes. This preliminary study may contribute to the development of an autonomous, AI-based monitoring system for detecting dolphin presence in marine environments. Future work will focus on expanding the dataset with recordings from varied environments and optimizing preprocessing techniques to improve robustness in real-world conditions.
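The record does not include code, but the front end of the pipeline described in the abstract (spectrogram extraction from an underwater recording, followed by noise filtering, before CNN classification over five classes) can be sketched roughly as below. The STFT parameters, the synthetic chirp standing in for a whistle, and the 5% amplitude threshold are illustrative assumptions, not values taken from the study.

```python
import numpy as np

# Five classes named in the abstract: four vocalization types plus background noise
CLASSES = ["whistle", "click", "burst_pulse", "feeding_buzz", "noise"]

def spectrogram(signal, win=256, hop=128):
    """Magnitude spectrogram via a simple Hann-windowed STFT (illustrative parameters)."""
    window = np.hanning(win)
    frames = [signal[i:i + win] * window
              for i in range(0, len(signal) - win + 1, hop)]
    # Shape: (frequency_bins, time_frames), the 2-D image a CNN would consume
    return np.abs(np.fft.rfft(np.asarray(frames), axis=1)).T

# Synthetic whistle-like chirp as a stand-in for a hydrophone recording
fs = 48_000
t = np.arange(0, 0.5, 1 / fs)
chirp = np.sin(2 * np.pi * (8_000 + 4_000 * t) * t)

spec = spectrogram(chirp)
# Crude noise suppression: zero out bins below a fraction of the peak (assumed threshold)
denoised = np.where(spec > 0.05 * spec.max(), spec, 0.0)
print(spec.shape)
```

The denoised spectrograms would then be fed to a small image-classification CNN and evaluated with 10-fold cross-validation, as the abstract describes; the actual network architecture and filtering method used in the paper are not specified in this record.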


