Emergency Siren Recognition (ESR) is an important issue for automotive safety. We focus on the early recognition of ambulance sirens in urban scenarios, where noise from a wide variety of sources hinders drivers' perception of alarm sounds. In this paper, we propose a deep convolutional neural network based on the U-Net encoding path for the ESR task. To overcome the problem of audio acquisition, we implement an algorithm that generates a synthetic dataset reproducing the sound of a siren in multiple urban traffic contexts. The network identifies the presence of the alerting sound from spectrogram-like features. Our experimental evaluations show that the ESR approach achieves excellent performance in both mono-scenario and multi-scenario settings at very low SNRs, including conditions unseen during training, thanks to the large amount of training data.
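As an illustration of what generating siren-in-traffic examples at a given SNR can involve, the sketch below scales a noise clip so that the siren-to-noise power ratio matches a target value in dB. It is a minimal sketch under stated assumptions, not the dataset-generation algorithm of the paper: the function name `mix_at_snr`, the two-tone placeholder siren, the white-noise stand-in for traffic, and the 16 kHz sample rate are all hypothetical choices made for the example.

```python
import numpy as np


def mix_at_snr(siren, noise, snr_db):
    """Add `noise` to `siren` after scaling the noise to a target SNR in dB.

    Hypothetical helper for illustration only. Returns the mixture and the
    scaled noise so that the achieved SNR can be checked afterwards.
    """
    siren = np.asarray(siren, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if siren.shape != noise.shape:
        raise ValueError("siren and noise must have the same shape")

    siren_power = np.mean(siren ** 2)
    noise_power = np.mean(noise ** 2)
    if siren_power == 0.0 or noise_power == 0.0:
        raise ValueError("siren and noise must both have non-zero power")

    # SNR_dB = 10 * log10(P_siren / (gain^2 * P_noise)), solved for gain.
    gain = np.sqrt(siren_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return siren + scaled_noise, scaled_noise


# Assumed setup: one second of audio at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate

# Placeholder siren: alternate between two tones every 0.25 s.
tone_hz = np.where((t // 0.25) % 2 == 0, 650.0, 900.0)
siren = np.sin(2.0 * np.pi * tone_hz * t)

# Stand-in for urban traffic noise: white Gaussian noise with a fixed seed.
rng = np.random.default_rng(0)
noise = rng.standard_normal(sample_rate)

mixture, scaled_noise = mix_at_snr(siren, noise, snr_db=-10.0)
achieved_snr_db = 10.0 * np.log10(np.mean(siren ** 2) / np.mean(scaled_noise ** 2))
```

In a real pipeline the placeholder signals would be replaced by siren and traffic recordings, and the mixture would then be converted to the spectrogram-like features that the network takes as input.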
Emergency Siren Recognition in Urban Scenarios: Synthetic Dataset and Deep Learning Models / Cantarini, M.; Serafini, L.; Gabrielli, L.; Principi, E.; Squartini, S. - ELECTRONIC. - 12463 (2020), pp. 207-220. (Paper presented at the 16th International Conference on Intelligent Computing, ICIC 2020, held in ita in 2020) [10.1007/978-3-030-60799-9_18].