In natural speech production, the vocal tract continuously changes in shape and therefore operates as a time-varying filter. In this paper, the time-dependent (autoregressive, PARCOR or Log Area Ratio) coefficients of an AR model are approximated by a linear decomposition on a basis of orthogonal (Legendre, Fourier and prolate spheroidal) functions of time, covering an entire speech segment (of syllabic size). These techniques are validated by synthesis experiments. Listening tests have demonstrated that spectral encoding using non-stationary models allows an average 50% reduction in the number of parameters to be transmitted per second of speech, as compared to spectral estimators operating on a frame-by-frame basis. Applications to speech coding, synthesis and recognition are discussed.
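As an illustration of the basis-expansion idea, the following is a minimal sketch, not the paper's implementation. It assumes the convention x[n] = -sum_i a_i[n] x[n-i] + e[n], expands each autoregressive coefficient as a_i[n] = sum_k c_ik f_k[n] on Legendre polynomials only, and estimates the weights c_ik by ordinary least squares. The abstract does not state the estimation method, so the least-squares fit is an assumption, and the names `legendre_basis` and `fit_tvar` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre


def legendre_basis(n_samples, n_basis):
    """Return Legendre polynomials P_0..P_{n_basis-1} sampled over one segment."""
    # Map the segment onto [-1, 1], the interval on which the polynomials are orthogonal.
    t = np.linspace(-1.0, 1.0, n_samples)
    # legvander gives one column per polynomial degree: shape (n_samples, n_basis).
    return legendre.legvander(t, n_basis - 1)


def fit_tvar(x, order, n_basis):
    """Least-squares fit of a time-varying AR model (assumed estimator).

    Returns the expansion weights c[i, k], shape (order, n_basis), and the
    coefficient trajectories a_i[n], shape (len(x), order).
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    basis = legendre_basis(n, n_basis)
    rows = np.arange(order, n)  # samples that have a full set of past values

    # The regressor for (lag i, basis function k) is f_k[n] * x[n - i].
    columns = [
        basis[rows, k] * x[rows - i]
        for i in range(1, order + 1)
        for k in range(n_basis)
    ]
    design = np.stack(columns, axis=1)

    # Assumed sign convention: x[n] = -sum_i a_i[n] x[n-i] + e[n].
    weights, *_ = np.linalg.lstsq(design, -x[rows], rcond=None)
    coeffs = weights.reshape(order, n_basis)
    return coeffs, basis @ coeffs.T
```

In this sketch a whole segment is described by `order * n_basis` weights rather than by one coefficient set per frame, which is the kind of parameter saving the abstract refers to. The sketch makes no claim about how much is saved in practice.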