Acoustical Assessment of Voice Disorder With Continuous Speech Using ASR Posterior Features | IEEE Journals & Magazine | IEEE Xplore

Abstract:

Traditionally, acoustical assessment of voice disorders relies on simple and homogeneous speech samples such as sustained vowels. Continuous speech is believed to be more representative of the daily function of the voice and is preferable in clinical practice. This paper describes an attempt at automating voice assessment with continuous speech utterances. The proposed system uses a novel type of feature derived from the phone posterior probabilities output by a deep neural network based automatic speech recognition (ASR) system. These ASR-based voice features are designed to quantify the mismatch between disordered voice and normal voice. Prediction of voice disorder severity is first carried out at the utterance level; the prediction scores for the individual utterances from a subject are then combined to give an overall assessment of the subject. With a low-dimensional ASR-based feature vector, the utterance-level prediction accuracy is comparable to that obtained with conventional features of much higher dimension. By jointly using the ASR features and conventional voice features, a subject-level prediction accuracy of over 80% on three severity classes can be achieved. Subjects with mild disorder and those with severe disorder could be perfectly distinguished by the proposed method.
Page(s): 1047 - 1059
Date of Publication: 17 March 2019
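
The two-stage prediction described in the abstract (utterance-level scores combined into one subject-level decision) can be illustrated with a minimal sketch. The abstract does not state how the scores are combined, so the averaging rule below is an assumption chosen for illustration, not the paper's method. The function name, the input layout, and the middle class label "moderate" are likewise hypothetical; the abstract only names mild and severe among the three classes.

```python
from collections import defaultdict

# Assumed class labels: the abstract names "mild" and "severe" among three
# severity classes; "moderate" is a placeholder for the remaining class.
SEVERITY_CLASSES = ("mild", "moderate", "severe")


def subject_level_prediction(utterance_scores):
    """Combine per-utterance class scores into one label per subject.

    utterance_scores: iterable of (subject_id, scores) pairs, where scores
    holds one value per class in SEVERITY_CLASSES order.
    Assumption: scores are combined by simple averaging over a subject's
    utterances, and the class with the highest mean score is chosen.
    """
    totals = {}
    counts = defaultdict(int)
    for subject_id, scores in utterance_scores:
        if len(scores) != len(SEVERITY_CLASSES):
            raise ValueError("expected one score per severity class")
        running = totals.setdefault(subject_id, [0.0] * len(SEVERITY_CLASSES))
        for i, score in enumerate(scores):
            running[i] += score
        counts[subject_id] += 1

    predictions = {}
    for subject_id, running in totals.items():
        means = [total / counts[subject_id] for total in running]
        best = max(range(len(means)), key=means.__getitem__)
        predictions[subject_id] = SEVERITY_CLASSES[best]
    return predictions


# Made-up scores for two hypothetical subjects.
example = [
    ("s1", [0.6, 0.3, 0.1]),
    ("s1", [0.2, 0.5, 0.3]),
    ("s1", [0.7, 0.2, 0.1]),
    ("s2", [0.1, 0.2, 0.7]),
]
print(subject_level_prediction(example))
```

Averaging lets a single atypical utterance (the second one for "s1" here) be outvoted by the subject's other utterances; majority voting over per-utterance labels would be another plausible combination rule.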