2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE)

28-30 Oct. 2015

  • [Front cover]

    Publication Year: 2015, Page(s): c1
  • Table of contents

    Publication Year: 2015, Page(s): 1 - 2
  • Message of the O-COCOSDA Convener

    Publication Year: 2015, Page(s): 1
  • Message from the conference chair

    Publication Year: 2015, Page(s): 1
  • Message from the technical program committee chair

    Publication Year: 2015, Page(s): 1
  • Organizing committee

    Publication Year: 2015, Page(s): 1
  • Scientific committee

    Publication Year: 2015, Page(s): 1 - 2
  • Organizers and exhibitors

    Publication Year: 2015, Page(s): 1
  • Conference program

    Publication Year: 2015, Page(s): 1
  • Technical program

    Publication Year: 2015, Page(s): 1 - 6
  • Keynote speech 1: Artificial intelligence needs a language cognitive revolution

    Publication Year: 2015, Page(s): 1
  • Keynote speech 2: Ontology-supported special corpus of illocution, emotion and prosody

    Publication Year: 2015, Page(s): 1 - 2
  • Keynote speech 3: Toward simultaneous, natural and multimodal speech-to-speech translation

    Publication Year: 2015, Page(s): 1 - 2
  • Keynote speech 4: Extraction of linguistic and paralinguistic information from audio-visual data

    Publication Year: 2015, Page(s): 1 - 2
  • Noise-robust and stress-free visualization of pronunciation diversity of World Englishes using a learner's self-centered viewpoint

    Publication Year: 2015, Page(s): 1 - 6
    Cited by: Papers (1)

    The term “World Englishes” describes the current, real state of English, one of whose main characteristics is a large diversity of pronunciation, called accents. We have developed two techniques: individual-based clustering of this diversity [1, 2] and educationally effective visualization of it [3]. Accent clustering requires a technique to quantify the accent g...
  • Elicit spoken-style data from social media through a style classifier

    Publication Year: 2015, Page(s): 7 - 12

    We explore the use of social media data to reduce the effort in developing a conversational speech corpus. The LOTUS-SOC corpus is created by recording Twitter messages through a mobile application. In the first phase, which took around one month, 172 hours of speech from 208 speakers were recorded and ready for use without the need for speech segmentation and transcription. In terms of language s...
  • Stress annotated Urdu speech corpus to build female voice for TTS

    Publication Year: 2015, Page(s): 13 - 20

    This research describes the stress annotation process for a two-hour Urdu speech corpus containing 18,640 words and 28,866 syllables, used to build a natural voice for a text-to-speech (TTS) system. For the stress annotation of the speech corpus, two algorithms, phonological and acoustic stress marking, have been tested against perceptual stress marking. Urdu phonological stress m...
  • Toward improving estimation accuracy of emotion dimensions in bilingual scenario based on three-layered model

    Publication Year: 2015, Page(s): 21 - 26

    This paper proposes a newly revised three-layered model to improve the estimation of emotion dimensions (valence, activation) in a bilingual scenario, using knowledge of commonalities and differences in human perception across multiple languages. Most previous speech emotion recognition systems worked only on a single language. However, to construct a generalized emotion recognition system which be ...
  • Development of new speech corpus for elderly Japanese speech recognition

    Publication Year: 2015, Page(s): 27 - 31

    We have constructed a new speech data corpus, using the utterances of 100 elderly Japanese people, to improve speech recognition accuracy of the speech of older people. Humanoid robots are being developed for use in elder care nursing homes. Interaction with such robots is expected to help maintain the cognitive abilities of nursing home residents, as well as providing them with companionship. In ...
  • Corpus design and development of an annotated speech database for Punjabi

    Publication Year: 2015, Page(s): 32 - 37

    Punjabi is an important Indo-Aryan language spoken in India and in some other countries, especially Pakistan. It is a tonal language, and its phonetic and phonological aspects have not been studied much. The present paper reports the development of a phonemically annotated speech database of the Malwai dialect of Punjabi. A phonetically rich text database of 1500 words and 300 sentences from a corpus of...
  • Chinese Traditional Opera database for Music Genre Recognition

    Publication Year: 2015, Page(s): 38 - 41

    This paper introduces a database of Traditional Chinese Opera for Music Genre Recognition, an area that remains a gap in our knowledge. The database contains songs from the 14 most popular kinds of Chinese Opera, including Sichuan Opera and Beijing Opera. Each opera is then annotated as music, song, or speech, which are identified by acoustic waveform and spectral analysis. T...
  • A study on adaptation of speaking rate-dependent hierarchical prosodic model for Chinese dialect TTS

    Publication Year: 2015, Page(s): 42 - 46

    This paper presents a new approach to developing a speaking rate (SR)-dependent hierarchical prosodic model (SR-HPM) for use in an SR-controlled TTS system for the Taiwanese (Min-Nan) language, a resource-limited Chinese dialect. The main challenge is to overcome the difficulty of building the SR-HPM directly from a Taiwanese database with sparse coverage of linguistic context, prosody and SR. By using the ...
  • Contrastive study of focus phonetic realization between Jinan dialect and Taiyuan dialect

    Publication Year: 2015, Page(s): 47 - 52

    Focus is usually considered to bear a communicative function in discourse, and each language has its own ways of realizing it. This paper compares focus realization in the Jinan and Taiyuan dialects. It aims to investigate the similarities and differences in focus realization by examining the variations of mean F0, duration and intensity in both focused and unfocused conditions between t...
  • On finding word-level break-type formation rules for Mandarin read speech

    Publication Year: 2015, Page(s): 53 - 57

    This paper presents a study exploring word-level break-type formation rules for Mandarin read speech. A 4-layer hierarchical structure with seven break types is adopted to represent the prosody of an utterance. The work is based on the break-type tags labeled on a large read-speech database by the previously proposed prosody labeling and modeling (PLM) algorithm. Occurrence frequencies of seven br...
  • The recognition of neutral tone across acoustic cues

    Publication Year: 2015, Page(s): 58 - 63

    In Standard Chinese, both F0 and duration are important acoustic cues for neutral tone perception. The present study focuses on the acoustic cues that contribute to neutral tone perception by examining the interplay between acoustic cues and other factors, including lexical status and the underlying tones. Manipulations were conducted according to different acoustic cues, obtaining thre...