2010 International Conference on Asian Language Processing

28-30 Dec. 2010

  • [Front cover]

    Publication Year: 2010, Page(s): C1
    Freely Available from IEEE
  • [Title page i]

    Publication Year: 2010, Page(s): i
    Freely Available from IEEE
  • [Title page iii]

    Publication Year: 2010, Page(s): iii
    Freely Available from IEEE
  • [Copyright notice]

    Publication Year: 2010, Page(s): iv
    Freely Available from IEEE
  • Table of contents

    Publication Year: 2010, Page(s): v - x
    Freely Available from IEEE
  • Message from General Chairs

    Publication Year: 2010, Page(s): xi - xii
    Freely Available from IEEE
  • Message from Program Chairs

    Publication Year: 2010, Page(s): xiii
    Freely Available from IEEE
  • Conference Committees

    Publication Year: 2010, Page(s): xiv
    Freely Available from IEEE
  • Program Committee

    Publication Year: 2010, Page(s): xv - xvi
    Freely Available from IEEE
  • Organizers and Sponsors

    Publication Year: 2010, Page(s): xvii - xviii
    Freely Available from IEEE
  • Invited talks

    Publication Year: 2010, Page(s): xix

    Summary form only given. In this talk, the speaker will measure the reduction in ambiguity that can be gained by using translated text to constrain meanings. Instead of using the translation itself to determine senses, they use a shared hierarchy of word senses: WordNet. Experiments with aligned Chinese, English and Japanese text show a substantial reduction in ambiguity for each language.
  • A Survey on Rendering Traditional Mongolian Script

    Publication Year: 2010, Page(s): 3 - 6

    This paper discusses rendering issues for a script with complex text layout: traditional Mongolian. The traditional Mongolian script has been standardized in Unicode. We analyzed existing OpenType fonts and their rendering schemes for traditional Mongolian script. We found some errors and discovered grammatical rules that are not documented in international standards. None of the existing Open Ty...
  • A Combination of Statistical and Rule-Based Approach for Mongolian Lexical Analysis

    Publication Year: 2010, Page(s): 7 - 10
    Cited by: Papers (1)

    Mongolian lexical analysis is the first step in Mongolian information processing tasks such as Chinese-Mongolian machine translation. In this paper, we introduce a combined statistical and rule-based approach that solves Mongolian word segmentation and POS tagging in a single pass. In this method, we use a tree frame as the basic statistical model and then combine the model with some rules to improve the lexical...
  • A Letter Tagging Approach to Uyghur Tokenization

    Publication Year: 2010, Page(s): 11 - 14
    Cited by: Papers (1)

    In this paper, we present a letter tagging approach (LTA) to Uyghur tokenization. Experiments show that the label bias problem (arising from rich and complex suffixes) can be resolved using LTA combined with CRFs, making it more effective than previous work; the accuracy of word tokenization reaches 93.3%. In future our tokenization research will be very useful to other Altaic languages information p...
  • Development of Analysis Rules for Bangla Root and Primary Suffix for Universal Networking Language

    Publication Year: 2010, Page(s): 15 - 18

    This paper describes a method for the development of Bangla Enconversion within the framework of the Universal Networking Language (UNL). We also discuss some issues and problems related to the UNL representation that affect the quality of generation. Additionally, lingware engineering is introduced as a technique to enhance quality and increase development efficiency. In this paper a...
  • A Suffix-Based Noun and Verb Classifier for an Inflectional Language

    Publication Year: 2010, Page(s): 19 - 22

    Nouns and verbs pose the major challenge in part-of-speech tagging exercises. In this paper we present a suffix-based noun and verb classifier for Assamese, an inflectional, relatively free word order Indic language. We used a tiny dictionary of frequent words to increase the accuracy. We obtained an F-score of around 85%.
  • Behavior of Word 'kaa' in Urdu Language

    Publication Year: 2010, Page(s): 23 - 26

    This paper discusses the behavior of 'kaa' and suggests the selection of Part of Speech (POS) on the basis of linguistic evidence. It also suggests some tests that can be used for correct classification of 'kaa'. The selection of correct POS is important for computational processing, including parsing, generation, and identification of grammatical relations.
  • Methods to Divide Uygur Morphemes and Treatments for Exceptions

    Publication Year: 2010, Page(s): 27 - 30

    Motivated by the need to establish a modern Uygur morpheme database, the paper studies the principles and methods for defining Uygur morphemes and focuses on some special conditions including syllabic of morphemes, dual-part words, morpheme clusters and compound morphemes. It is a foundational study for establishing a Uygur morpheme database.
  • Rules for Morphological Analysis of Bangla Verbs for Universal Networking Language

    Publication Year: 2010, Page(s): 31 - 34
    Cited by: Papers (3)

    The Universal Networking Language (UNL) deals with communication across nations with different languages and involves many related disciplines such as linguistics, epistemology, and computer science. It helps to overcome the language barrier among people of different nations to solve problems emerging from current globalization trends and geopolitical interdependence. Morphological...
  • Discussion on Collation of Tibetan Syllable

    Publication Year: 2010, Page(s): 35 - 38

    Based on the general syllable structure, a syllable's component letters should be expanded, in order, into the series of basic consonant, prefix consonant, head consonant... and the second suffix consonant. If there is no letter in a particular position of a syllable, a special character, whose collation element is less than that of any Tibetan letter, should be used in the corresponding position of the...
  • A Dictionary Mechanism for Chinese Word Segmentation Based on the Finite Automata

    Publication Year: 2010, Page(s): 39 - 42
    Cited by: Papers (1)

    The dictionary mechanism is the basis of Chinese word segmentation, and its quality directly affects the speed and efficiency of Chinese word segmentation. Existing dictionary mechanisms suffer from shortcomings such as wasted space, low efficiency, and difficult maintenance, so establishing an effective mechanism is an urgent problem for Chinese word segmentation. In this paper, the i...
  • Development of Templates for Dictionary Entries of Bangla Roots and Primary Suffixes for Universal Networking Language

    Publication Year: 2010, Page(s): 43 - 46

    The Universal Networking Language (UNL) is a worldwide generalized form of human interactive language in a machine-independent digital platform for defining, recapitulating, amending, storing and disseminating knowledge or information among people of different affiliations. The theoretical and applied research associated with this interdisciplinary endeavor facilitates in a number of practical appl...
  • A Study on "Worry" Separable Words & Its Separable Slots

    Publication Year: 2010, Page(s): 47 - 50

    This paper comprehensively investigates the usage of "Worry" separable words in different separable slots in modern Chinese, using the theory of combining "semantic meaning" and "grammar". It analyzes the inner structure of "Worry" separable words, explores the rules of selectional restriction in different separable slots, and shows the law and characteristic of their usage in different s...
  • Improving Dependency Parsing Using Punctuation

    Publication Year: 2010, Page(s): 53 - 56

    The high-order graph-based dependency parsing model achieves state-of-the-art accuracy by incorporating rich feature representations. However, its parsing efficiency and accuracy degrade dramatically as the input sentence gets longer. This paper presents a novel two-stage method to improve high-order graph-based parsing, which uses punctuation, such as commas and semicolons, to segment the inpu...
  • A Tree Probability Generation Using VB-EM for Thai PGLR Parser

    Publication Year: 2010, Page(s): 57 - 60

    In this paper, we applied the VB-EM algorithm to generate probabilities of constituent combinations for a PGLR parser. Three linguistic features were calculated: simple PCFG, head-outward dependency and head-emission. The probabilities were used in the parsing process to find the most probable output tree. From our experiment, the parsing result from a combination of all features for first path an...