Acoustic novelty detection with adversarial autoencoders

Principi, Emanuele; Vesperini, Fabio; Squartini, Stefano; Piazza, Francesco

doi:10.1109/IJCNN.2017.7966273

Novelty detection is the task of recognising events that differ from a model of normality. This paper proposes an acoustic novelty detector based on neural networks trained with an adversarial training strategy. The proposed approach first extracts Log-Mel spectral features from the input signal. An autoencoder network, trained on a corpus of 'normal' acoustic signals, is then employed to detect whether a segment contains an abnormal event. A novelty is detected if the Euclidean distance between the input and the output of the autoencoder exceeds a certain threshold. The innovative contribution of the proposed approach resides in the training procedure of the autoencoder network: instead of using the conventional training procedure that minimises only the mean squared error loss function, here we adopt an adversarial strategy, where a discriminator network is trained to distinguish between the output of the autoencoder and data sampled from the training corpus. The autoencoder is then also trained using the binary cross-entropy loss calculated at the output of the discriminator network. The performance of the algorithm has been assessed on a corpus derived from the PASCAL CHiME dataset. The results showed that the proposed approach provides a relative performance improvement equal to 0.26% compared to the standard autoencoder. The significance of the improvement has been evaluated with a one-tailed z-test and was found to be significant with p < 0.001. The presented approach thus showed promising results on this task, and it could be extended as a general training strategy for autoencoders if confirmed by additional experiments.
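The detection rule and the two loss terms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `reconstruct` callable standing in for a trained autoencoder, the `adv_weight` parameter, the summation of the two loss terms, and the use of target label 1 for the adversarial term are all assumptions made for illustration, since the abstract does not specify how the losses are combined.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def euclidean_distance(x: Vector, y: Vector) -> float:
    """Euclidean distance between an input frame and its reconstruction."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def is_novel(frame: Vector,
             reconstruct: Callable[[Vector], Vector],
             threshold: float) -> bool:
    """Flag a frame as novel when its reconstruction error exceeds the threshold.

    `reconstruct` is a hypothetical stand-in for the trained autoencoder.
    """
    return euclidean_distance(frame, reconstruct(frame)) > threshold


def mse(x: Vector, x_hat: Vector) -> float:
    """Mean squared error between input and reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def binary_cross_entropy(p: float, target: float, eps: float = 1e-12) -> float:
    """Binary cross-entropy for one discriminator output p in (0, 1)."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def autoencoder_loss(x: Vector, x_hat: Vector, d_of_x_hat: float,
                     adv_weight: float = 1.0) -> float:
    """Reconstruction loss plus an adversarial term (assumed combination).

    Assumption: the autoencoder is rewarded when the discriminator scores its
    output as real (target 1), and the two terms are summed with `adv_weight`.
    """
    return mse(x, x_hat) + adv_weight * binary_cross_entropy(d_of_x_hat, 1.0)
```

With a perfect reconstruction, `is_novel` returns `False` for any positive threshold, and `autoencoder_loss` decreases as the discriminator output for the reconstruction approaches 1.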

Acoustic novelty detection with adversarial autoencoders / Principi, Emanuele; Vesperini, Fabio; Squartini, Stefano; Piazza, Francesco. - Electronic. - 2017-:(2017), pp. 3324-3330. (2017 International Joint Conference on Neural Networks, IJCNN 2017, USA, 2017) [10.1109/IJCNN.2017.7966273].