Developing a children's Filipino speech corpus for application in automatic detection of reading miscues and disfluencies


Abstract:

Recognizing the potential of current speech processing technology to improve children's literacy, researchers in the past few years have devoted their efforts to developing reading miscue detectors (RMDs) and automated reading tutors (ARTs). A primary challenge in developing speech technologies for children, however, may be the lack of a dedicated children's speech corpus for system design and testing. In the past few years, children's speech corpora have been developed for languages such as English, Dutch, Mandarin Chinese, Italian, German and Swedish. Because Filipino has features and an orthography distinct from those of other languages, this study focuses on the development of a children's Filipino speech corpus (CFSC). In this paper, we present the CFSC design, reading text, data collection procedure and speech transcription method. We also performed an initial analysis of the reading miscues and disfluencies found in the CFSC. The results of the miscue analysis suggest possible ways of modeling the reading miscues and possible methods for detecting them, among them acoustic model likelihood calculation and analysis of duration-based prosodic features. The CFSC presented in this study will be used to develop an RMD and an ART for Filipino.
Date of Conference: 19-22 November 2012
Date Added to IEEE Xplore: 17 January 2013
Conference Location: Cebu, Philippines
