Asian Language Processing (IALP), 2011 International Conference on

Date 15-17 Nov. 2011

  • [Front cover]

    Publication Year: 2011, Page(s): C1
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2011, Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2011, Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2011, Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2011, Page(s):v - ix
    Freely Available from IEEE
  • Message from the General Chair

    Publication Year: 2011, Page(s): x
    Freely Available from IEEE
  • Message from the Program Chairs

    Publication Year: 2011, Page(s): xi
    Freely Available from IEEE
  • Message from the Local Organizing Chair

    Publication Year: 2011, Page(s): xii
    Freely Available from IEEE
  • Conference Committees

    Publication Year: 2011, Page(s): xiii
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2011, Page(s):xiv - xv
    Freely Available from IEEE
  • Invited Talks

    Publication Year: 2011, Page(s):xvi - xx

    More than 6000 living languages are spoken in the world today, and the majority of them are concentrated in Asia. Every language has its own specific acoustic as well as linguistic characteristics that require special modeling techniques. This talk presents our recent experiences in building automatic speech recognition (ASR) systems for the Indonesian, Thai and Chinese languages. For I...
  • A Simplified-Traditional Chinese Character Conversion Model Based on Log-Linear Models

    Publication Year: 2011, Page(s):3 - 6

    With the growth of exchange activities among the four cross-strait regions, the problem of correctly converting between Traditional Chinese (TC) and Simplified Chinese (SC) becomes more and more important. Numerous one-to-many mappings and term usage differences make it more difficult to convert from SC to TC. This paper proposes a novel simplified-traditional Chinese character conversion model based...
  • Improving Chinese Dependency Parsing with Self-Disambiguating Patterns

    Publication Year: 2011, Page(s):7 - 10

    To solve the data sparseness problem in dependency parsing, most previous studies used features constructed from large-scale auto-parsed data. Unlike previous work, we propose a new approach to improve dependency parsing with context-free dependency triples (CDT) extracted by using self-disambiguating patterns (SDP). The use of SDP makes it possible to avoid the dependency on a baseline parser and...
  • Joint Decoding for Chinese Word Segmentation and POS Tagging Using Character-Based and Word-Based Discriminative Models

    Publication Year: 2011, Page(s):11 - 14

    For the Chinese word segmentation and POS tagging problem, both character-based and word-based discriminative approaches can be used. Experiments show that these two approaches produce different errors and can complement each other. In this paper, we propose a joint decoding model based on both character-based and word-based models using a multi-beam search algorithm. Experimental results show that the jo...
  • Natural Language Grammar Induction of Indonesian Language Corpora Using Genetic Algorithm

    Publication Year: 2011, Page(s):15 - 18
    Cited by:  Papers (2)

    Grammar induction is a machine learning process for learning grammar from corpora. This paper discusses the process of grammar induction for Indonesian language corpora using a genetic algorithm. The grammar production rules are modeled in the form of chromosomes. The fitness function counts how many sentences can be parsed. The data used are Indonesian fairy tales such as "...
  • Error-Driven Adaptive Language Modeling for Chinese Pinyin-to-Character Conversion

    Publication Year: 2011, Page(s):19 - 22

    The performance of Chinese Pinyin-to-Character conversion is severely affected when the characteristics of the training and conversion data differ. As natural language is highly variable and uncertain, it is impossible to build a complete and general language model to suit all tasks. The traditional adaptive MAP models mix the task-independent data with task-dependent data using a mixture coef...
  • Theoretical Framework of Mongolian Word Segmentation Specification for Information Processing

    Publication Year: 2011, Page(s):23 - 25

    The establishment of a Contemporary Mongolian word segmentation specification for information processing is of great significance for the standardization of information processing, the compatibility of different systems, the sharing of corpora, grammatical analysis, and POS tagging. The present paper studies the framework of Mongolian word segmentation including guidelines, formulating principles, st...
  • Research on the Uyghur Information Database for Information Processing

    Publication Year: 2011, Page(s):26 - 29

    Although "grammatical rule + dictionary" is the traditional pattern for natural language processing, it can be hard to explain the combination of words in language. If all word combinations are entered into a database, the grammar and the information system would be simplified. The necessity, the methods and the principles of establishing the phrase information database of the Uyghur language will...
  • Sentence Boundary Detection in Colloquial Arabic Text: A Preliminary Result

    Publication Year: 2011, Page(s):30 - 32

    Recently, natural language processing tasks have been more frequently conducted over online content. This poses a special problem for applications over the Arabic language. Online Arabic content is usually written in informal colloquial Arabic, which is characterized as ill-structured and lacks specific linguistic standardization. In this paper, we investigate a preliminary step to conduct successful NLP...
  • A Study of the Classification and Arrangement Rule of Uygur Morphemes for Information Processing

    Publication Year: 2011, Page(s):33 - 36

    In the processing of a modern Uygur corpus, it is necessary to make a word character mark study of the word level within the modern Uygur language data. Since the classification of morphemes is to serve the mark of word character, the article classifies Uygur morphemes by their functions and lists all their classifications and arrangement rules.
  • Graph-Based Language Model of Long-Distance Dependency

    Publication Year: 2011, Page(s):37 - 40

    In natural language processing and related fields, the classic text representation methods seldom consider the role of word order and long-distance dependency in texts for semantic representation. In this paper, we discuss the current situation and problems of statistical language models, especially the Head-driven statistical language model and Head-driven Phrase Structure ...
  • BASRAH: Arabic Verses Meters Identification System

    Publication Year: 2011, Page(s):41 - 44
    Cited by:  Papers (1)

    In this paper, we present BASRAH, a system that automatically identifies the meter of Arabic verse, which is an operation that requires a certain level of human expertise. BASRAH uses the numerical prosody method, which depends on verse coding that is derived from the general concept of al-Khalil's feet through using the two primary units (cord=2 and peg=3). BASRAH has proved to be an efficient to...
  • WordNet Editor to Refine Indonesian Language Lexical Database

    Publication Year: 2011, Page(s):47 - 50

    This paper describes an approach for editing the Indonesian Language Lexical Database, especially the noun category and its relations. The purpose of this editor is to refine the Indonesian Lexical Database that was developed in our previous research. The editor's visualization uses a graph library with some modifications and additions. Furthermore, this editor will be web-based so that everyone can p...
  • Two Ontological Approaches to Building an Intergrated Semantic Network for Yami ka-Verbs

    Publication Year: 2011, Page(s):51 - 54

    This paper describes a proposed ontological language processing system for integrating two semantic sets for a group of important verbs with the prefix ka- in Yami, an Austronesian language in Taiwan. The two semantic sets represent two different classification approaches. One approach follows the concepts and rules of WordNet and the other uses the metaphors in Yami indigenous knowledge. The ontolo...
  • Issues with the Unergative/Unaccusative Classification of the Intransitive Verbs

    Publication Year: 2011, Page(s):55 - 58

    The paper abandons a strict two-way sub-classification of intransitive verbs into unaccusative and unergative for Hindi and proposes a distribution plotting of the same in a diffusion chart. The diagnostic tests that Bhatt (2003) applied to Hindi data are ranked for their efficiency in attributing the correct sub-class to verbs. The diffusion chart shows that a tripartite classification handles the ...