This paper presents, on the basis of a rigorous mathematical formulation, a multicomponent sinusoidal model that allows an asymptotically exact reconstruction of nonstationary speech signals, regardless of their duration and without any limitation in the modeling of voiced, unvoiced, and transitional segments. The approach applies the Hilbert transform to obtain an amplitude signal, from which an AM component is extracted by filtering; the residue is then processed iteratively in the same way. This yields a multicomponent AM-FM model in which the number of components (iterations) may be chosen arbitrarily. Additionally, the instantaneous frequencies of these components can be calculated with a given accuracy by segmentation of the phase signals. The approach has been validated on both synthetic signals and natural speech. Several comparisons show that it almost always outperforms current best practices and does not need the complex filter optimizations required by other techniques.
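
The iteration described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the Butterworth low-pass filter, its 50 Hz cutoff, the function names `am_fm_decompose` and `segment_frequencies`, and the per-segment least-squares line fit used to estimate frequency are all choices made for this example. The abstract does not specify the filter design or the segmentation procedure.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt


def am_fm_decompose(x, fs, n_components=3, cutoff_hz=50.0, order=4):
    """Iteratively split x into components a_k(t) * cos(phi_k(t)).

    Returns (amplitudes, phases, residue); amplitudes and phases have
    shape (n_components, len(x)).
    """
    # Assumed filter: zero-phase Butterworth low-pass on the envelope.
    sos = butter(order, cutoff_hz / (fs / 2.0), btype="low", output="sos")
    residue = np.asarray(x, dtype=float).copy()
    amplitudes, phases = [], []
    for _ in range(n_components):
        analytic = hilbert(residue)              # analytic signal of the residue
        envelope = np.abs(analytic)              # amplitude signal
        phase = np.unwrap(np.angle(analytic))    # phase signal
        amplitude = sosfiltfilt(sos, envelope)   # AM component extracted by filtering
        residue = residue - amplitude * np.cos(phase)  # left for the next iteration
        amplitudes.append(amplitude)
        phases.append(phase)
    return np.array(amplitudes), np.array(phases), residue


def segment_frequencies(phase, fs, segment_len=400):
    """Estimate one frequency (Hz) per segment from the slope of a line
    fitted to the unwrapped phase (an assumed, simplified segmentation)."""
    n_segments = len(phase) // segment_len
    t = np.arange(segment_len) / fs
    freqs = np.empty(n_segments)
    for i in range(n_segments):
        seg = phase[i * segment_len:(i + 1) * segment_len]
        slope = np.polyfit(t, seg, 1)[0]         # rad/s
        freqs[i] = slope / (2.0 * np.pi)
    return freqs


if __name__ == "__main__":
    fs = 8000
    t = np.arange(fs) / fs
    # Synthetic test signal: a slowly modulated 200 Hz tone plus a 900 Hz tone.
    x = (1 + 0.5 * np.cos(2 * np.pi * 3 * t)) * np.cos(2 * np.pi * 200 * t)
    x = x + 0.3 * np.cos(2 * np.pi * 900 * t)
    amps, phases, res = am_fm_decompose(x, fs)
    print("residue energy / signal energy:", np.sum(res**2) / np.sum(x**2))
```

By construction, summing the components `amps * np.cos(phases)` and adding the final residue returns the input signal. How quickly the residue shrinks with more iterations depends on the signal and on the assumed filter.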

Multicomponent AM-FM representations: An asymptotically exact approach / Gianfelici, Francesco; Biagetti, Giorgio; Crippa, Paolo; Turchetti, Claudio. - In: IEEE TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING. - ISSN 1558-7916. - 15:3(2007), pp. 823-837. [10.1109/TASL.2006.889744]

Multicomponent AM-FM representations: An asymptotically exact approach

GIANFELICI, Francesco; BIAGETTI, Giorgio; CRIPPA, Paolo; TURCHETTI, Claudio
2007-01-01

Use this identifier to cite or link to this document: https://hdl.handle.net/11566/53078