Graph-based Representation of Audio signals for Sound Event Classification
Aironi, C; Cornell, S; Principi, E; Squartini, S
2021-01-01
Abstract
In recent years, interest in graph representation and learning techniques has risen considerably, especially where data has an intrinsically graph-like structure: social networks, molecular lattices, or semantic interactions, to name a few. In this paper, we propose a novel way to represent an audio signal by deriving a graph-based representation from its spectrogram, which can then be used by established graph deep neural network techniques. We evaluate this approach on a Sound Event Classification task using the widely used ESC and UrbanSound8K datasets and compare it with a Convolutional Neural Network (CNN) based method. We show that the proposed graph-based approach is extremely compact and, when used in conjunction with learned CNN features, allows for a significant increase in classification accuracy over the baseline with more than 50 times fewer parameters than the original CNN method. This suggests that the proposed graph-based features can offer additional discriminative information on top of learned CNN features.