A record of voiced speech is LPC-analysed. It is also partitioned into a sequence of short signals, each containing a single glottal pulse. Each short signal is taken to be the convolution of a component varying from one short signal to the next and an invariant component, corrupted by a significant but not overwhelming contamination, i.e. noise plus all other imperfections. The invariant component, which is initially estimated by shift-and-add processing, is the multiple convolution of the invariant responses of the recording apparatus, the speaker's lips and vocal tract (plus nasal tract and soft palate) and the speaker's average glottal excitation. This initial estimate, which is characteristic of the glottal excitation, is iteratively refined by a computational procedure which makes use of the LPC coefficients. The procedure, which checks its own numerical convergence, is illustrated by presenting results for six different speakers and for a single speaker under varying conditions.
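As an illustration of the shift-and-add step only, the following is a minimal sketch and not the authors' implementation. It assumes each short signal is aligned on its largest-magnitude sample, that samples outside a signal count as zero, and that the aligned sum is divided by the number of signals; the abstract does not state these details. The name `shift_and_add` is hypothetical.

```python
def shift_and_add(segments):
    """Align each segment on its largest-magnitude sample, then average.

    Assumptions (not stated in the abstract): peak-magnitude alignment,
    zero padding outside each segment, division by the segment count.
    """
    if not segments or any(len(s) == 0 for s in segments):
        raise ValueError("need at least one non-empty segment")

    # Index of the largest-magnitude sample in each segment.
    peaks = [max(range(len(s)), key=lambda i: abs(s[i])) for s in segments]

    # Room needed before and after the common peak position.
    before = max(peaks)
    after = max(len(s) - p for s, p in zip(segments, peaks))

    total = [0.0] * (before + after)
    for s, p in zip(segments, peaks):
        offset = before - p  # shift so that all peaks coincide
        for i, value in enumerate(s):
            total[offset + i] += value

    n = len(segments)
    return [value / n for value in total]


# Three copies of the same pulse at different positions.
pulses = [[0, 1, 5, 1, 0], [1, 5, 1, 0, 0], [0, 0, 1, 5, 1]]
estimate = shift_and_add(pulses)
```

With real speech the short signals differ from one another, so the average retains mainly what they share once aligned. This gives only an initial estimate; the iterative refinement using the LPC coefficients is not sketched here.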
Published in: Physical Science, Measurement and Instrumentation, Management and Education - Reviews, IEE Proceedings A (Volume: 134, Issue: 10)
Date of Publication: December 1987