This paper proposes innovative multi-channel Bayesian estimators in the feature domain for robust speech recognition. Both minimum mean-squared-error (MMSE) and maximum a posteriori (MAP) criteria are explored: the resulting algorithms extend their multi-channel frequency-domain counterparts and generalize the single-channel feature-domain MMSE solution that recently appeared in the literature. Computer simulations conducted on a modified AURORA2 database show the efficacy of the frequency-domain multi-channel estimators when used as a pre-processing stage of a speech recognition engine, and show that the proposed multi-channel MAP approach outperforms single-channel estimators by at least 3% on average.
Robust Speech Recognition Using Feature-Domain Multi-Channel Bayesian Estimators / Principi, E.; Rotili, R.; Cifani, S.; Marinelli, Lorenzo; Squartini, S.; Piazza, F. - (2010).
Robust Speech Recognition Using Feature-Domain Multi-Channel Bayesian Estimators
Principi, E.; Rotili, R.; Cifani, S.; Squartini, S.; Piazza, F.
2010-01-01