Emergency Siren Recognition in Urban Scenarios: Synthetic Dataset and Deep Learning Models / Cantarini, M.; Serafini, L.; Gabrielli, L.; Principi, E.; Squartini, S. - ELECTRONIC. - 12463:(2020), pp. 207-220. (Paper presented at the 16th International Conference on Intelligent Computing, ICIC 2020, held in Italy in 2020) [10.1007/978-3-030-60799-9_18].
Emergency Siren Recognition in Urban Scenarios: Synthetic Dataset and Deep Learning Models
Cantarini M.; Serafini L.; Gabrielli L.; Principi E.; Squartini S.
2020-01-01
Abstract
Emergency Siren Recognition (ESR) is an important issue for automotive safety. We are interested in the early recognition of ambulance sirens in urban scenarios, where noise is produced by a wide variety of sources and impedes drivers' perception of alarm sounds. In this paper, we propose a deep convolutional neural network based on the U-Net encoding path for the ESR task. To overcome the problem of audio acquisition, we implemented an algorithm that generates a synthetic dataset reproducing the sound of a siren in multiple urban traffic contexts. The network identifies the presence of the alerting sound from spectrogram-like features. Our experimental evaluations demonstrate that the ESR approach achieves excellent performance in both mono-scenario and multi-scenario settings at very low SNRs, including in conditions unseen during training, thanks to a large amount of training data.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.