In recent years there has been a considerable rise in interest in graph representation and learning techniques, especially in cases where the data intrinsically has a graph-like structure: social networks, molecular lattices, or semantic interactions, to name a few. In this paper, we propose a novel way to represent an audio signal from its spectrogram by deriving a graph-based representation that can then be used with already established graph deep neural network techniques. We evaluate this approach on a Sound Event Classification task using the widely used ESC and UrbanSound8K datasets and compare it with a Convolutional Neural Network (CNN) based method. We show that the proposed graph-based approach is extremely compact and, when used in conjunction with learned CNN features, allows for a significant increase in classification accuracy over the baseline with more than 50 times fewer parameters than the original CNN method. This suggests that the proposed graph-based features can offer additional discriminative information on top of learned CNN features.

Graph-based Representation of Audio signals for Sound Event Classification / Aironi, C; Cornell, S; Principi, E; Squartini, S. - (2021), pp. 566-570. (Paper presented at the EUSIPCO 2021 conference) [10.23919/EUSIPCO54536.2021.9616143].

Graph-based Representation of Audio signals for Sound Event Classification

Aironi, C;Cornell, S;Principi, E;Squartini, S
2021-01-01

Year: 2021
ISBN: 978-9-0827-9706-0

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/310930