The SPLICE algorithm has been recently proposed in the literature to address the robustness issue in Automatic Speech Recognition (ASR). Several variants have been also proposed to improve some drawbacks of the original technique. In this presentation an innovative efficient solution is discussed: it is based on SNR estimation in the frequency or mel domain and investigates the possibility of using different noise types for GMM training in order to maximize the generalization capabilities of the tool and therefore the recognition performances in presence of unknown noise sources. Computer simulations, conducted on the AURORA2 database, seem to confirm the effectiveness of the idea: the proposed approach yields similar accuracy performances w.r.t. the reference one, even employing a simpler mismatch compensation paradigm which does not need any a-priori knowledge on the noises used in the training phase.

Efficient SNR driven SPLICE implementation for robust speech recognition / Squartini, Stefano; Principi, Emanuele; Cifani, S.; Rotili, R.; Piazza, Francesco. - LNCS voulme n. 6800:(2011), pp. 70-80. [10.1007/978-3-642-25775-9_6]

Efficient SNR driven SPLICE implementation for robust speech recognition

SQUARTINI, Stefano;PRINCIPI, EMANUELE;PIAZZA, Francesco
2011-01-01

Abstract

The SPLICE algorithm has been recently proposed in the literature to address the robustness issue in Automatic Speech Recognition (ASR). Several variants have been also proposed to improve some drawbacks of the original technique. In this presentation an innovative efficient solution is discussed: it is based on SNR estimation in the frequency or mel domain and investigates the possibility of using different noise types for GMM training in order to maximize the generalization capabilities of the tool and therefore the recognition performances in presence of unknown noise sources. Computer simulations, conducted on the AURORA2 database, seem to confirm the effectiveness of the idea: the proposed approach yields similar accuracy performances w.r.t. the reference one, even employing a simpler mismatch compensation paradigm which does not need any a-priori knowledge on the noises used in the training phase.
2011
Analysis of Verbal and Nonverbal Communication and Enactment: The Processing Issue
9783642257742
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11566/58353
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo

Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? 0
social impact