Syntactic and sub-lexical features for Turkish discriminative language models


Abstract:

This paper investigates syntactic and sub-lexical features in Turkish discriminative language models (DLMs). A DLM is a feature-based language model that reranks automatic speech recognition (ASR) output using discriminatively trained feature parameters. Syntactic information is incorporated into the DLM as part-of-speech (PoS) tag n-gram features and head-to-head dependency relations. Sub-lexical units are first used as language modeling units in the baseline recognizer; sub-lexical features are then used to rerank the sub-lexical hypotheses. We also explore features on sub-lexical units, analogous to the syntactic features, to reveal the implicit morpho-syntactic information these units convey. We find that DLM yields more improvement for sub-lexical units than for words. Basic sub-lexical n-gram features result in a 0.6% reduction over the baseline, and morpho-syntactic features yield an additional 0.4% reduction on the test set.
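
The reranking idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the training algorithm, so a plain structured perceptron is assumed here, and the hypothesis fields (`words`, `tags`, `base_score`, `errors`), the toy English n-best list, and all function names are hypothetical. Only word and PoS tag n-gram features are shown; dependency and sub-lexical features are omitted.

```python
from collections import Counter

def ngram_features(tokens, n_max=2, prefix="w"):
    """Count all n-grams up to order n_max, with sentence boundary markers."""
    feats = Counter()
    padded = ["<s>"] + list(tokens) + ["</s>"]
    for n in range(1, n_max + 1):
        for i in range(len(padded) - n + 1):
            feats[(prefix, tuple(padded[i:i + n]))] += 1
    return feats

def features(hyp):
    """Word n-grams plus PoS tag n-grams for one hypothesis."""
    feats = ngram_features(hyp["words"], prefix="w")
    feats.update(ngram_features(hyp["tags"], prefix="pos"))  # adds counts
    return feats

def score(weights, hyp, base_weight=1.0):
    """Baseline recognizer score plus the weighted feature sum."""
    return base_weight * hyp["base_score"] + sum(
        weights.get(f, 0.0) * v for f, v in features(hyp).items()
    )

def rerank(weights, nbest):
    """Return the highest-scoring hypothesis in the n-best list."""
    return max(nbest, key=lambda h: score(weights, h))

def train_perceptron(nbest_lists, epochs=5):
    """Assumed training scheme: move weights toward the lowest-error
    (oracle) hypothesis and away from the current top-ranked one."""
    weights = {}
    for _ in range(epochs):
        for nbest in nbest_lists:
            oracle = min(nbest, key=lambda h: h["errors"])
            best = rerank(weights, nbest)
            if best is not oracle:
                for f, v in features(oracle).items():
                    weights[f] = weights.get(f, 0.0) + v
                for f, v in features(best).items():
                    weights[f] = weights.get(f, 0.0) - v
    return weights

# Toy n-best list (invented data): the baseline score prefers the wrong one.
NBEST = [
    {"words": ["the", "cat", "sat"], "tags": ["DET", "NOUN", "VERB"],
     "base_score": -1.2, "errors": 0},
    {"words": ["the", "cat", "sad"], "tags": ["DET", "NOUN", "ADJ"],
     "base_score": -1.0, "errors": 1},
]

if __name__ == "__main__":
    w = train_perceptron([NBEST])
    print(rerank({}, NBEST)["words"])  # baseline choice
    print(rerank(w, NBEST)["words"])   # choice after reranking
```

With empty weights the ranking follows the baseline score alone; after training, features that distinguish the oracle hypothesis carry positive weight and can override the baseline ordering.
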
Date of Conference: 14-19 March 2010
Date Added to IEEE Xplore: 28 June 2010
Conference Location: Dallas, TX, USA
