Multichannel Cepstral Domain Feature Warping for Robust Speech Recognition / Squartini, S.; Fagiani, M.; Principi, E.; Piazza, F. - Volume 226 (2011), pp. 284-292. [10.3233/978-1-60750-692-8-284]

Multichannel Cepstral Domain Feature Warping for Robust Speech Recognition

S. SQUARTINI;M. FAGIANI;E. PRINCIPI;F. PIAZZA
2011-01-01

Abstract

Speech-based human-machine interfaces have been attracting increasing interest from both the scientific community and the technology market. One of the key tasks in these architectures is speech recognition, for which a certain degree of understanding has already been reached in the literature. Several efforts have specifically addressed acoustic non-idealities, i.e. noise and reverberation. Feature warping in the cepstral domain is one of the best-performing approaches: different techniques have been proposed so far, and some of them are based on the histogram-equalization concept, which has been shown to be effective. Histogram equalization applies a suitable transformation to the acoustically degraded cepstral coefficients in order to restore the statistical properties of clean speech. The quantile-based implementation is taken here as the reference. The main objective of this work is to exploit the multichannel acoustic information provided by a microphone array in order to improve the statistical modeling capabilities of the algorithm and thereby its speech enhancement effect. Computer simulations based on the AURORA 2 database have been carried out, and the results obtained confirm the effectiveness of the idea.
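To make the histogram-equalization idea concrete, the following is a minimal sketch of a quantile-based warping of a single cepstral coefficient track. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function name `quantile_warp`, the `stats_source` parameter, the piecewise-linear mapping between quantiles, and the pooling of frames across channels are all hypothetical choices made here for clarity.

```python
import numpy as np


def quantile_warp(noisy, clean_quantiles, stats_source=None):
    """Warp one cepstral coefficient track towards reference statistics.

    noisy           : 1-D array, coefficient values over time (degraded speech)
    clean_quantiles : 1-D array, evenly spaced quantiles (0..1) of clean speech
    stats_source    : optional array of frames used to estimate the degraded
                      quantiles, e.g. frames pooled from several microphone
                      channels (an assumption made for this sketch)
    """
    noisy = np.asarray(noisy, dtype=float)
    clean_quantiles = np.asarray(clean_quantiles, dtype=float)
    source = noisy if stats_source is None else np.asarray(stats_source, dtype=float).ravel()

    # Estimate the degraded-speech quantiles at the same probabilities
    # as the clean reference quantiles.
    probs = np.linspace(0.0, 1.0, clean_quantiles.size)
    noisy_quantiles = np.quantile(source, probs)

    # Piecewise-linear mapping from degraded quantiles to clean quantiles.
    # Values outside the estimated range are clipped to the end quantiles.
    # Note: np.interp assumes increasing x-coordinates, so heavily tied data
    # would need extra care.
    return np.interp(noisy, noisy_quantiles, clean_quantiles)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(0.0, 1.0, 2000)          # stand-in for a clean coefficient
    clean_q = np.quantile(clean, np.linspace(0.0, 1.0, 11))

    # Two hypothetical channels observing a shifted, scaled, noisy version.
    ch1 = 0.5 * clean + 2.0 + rng.normal(0.0, 0.05, clean.size)
    ch2 = 0.5 * clean + 2.0 + rng.normal(0.0, 0.05, clean.size)

    single = quantile_warp(ch1, clean_q)
    pooled = quantile_warp(ch1, clean_q, stats_source=np.stack([ch1, ch2]))
    print(single.mean(), pooled.mean())
```

Pooling frames from several channels simply gives the quantile estimator more samples to work with; whether and how the paper combines the channels is not stated in the abstract, so this part should be read as one possible interpretation.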
Year: 2011
ISBN: 978-160750691-1
Use this identifier to cite or link to this document: https://hdl.handle.net/11566/47905