Abstract:
High-quality word representations have been very successful in recent years at improving performance across a variety of NLP tasks. These word representations map each word in the vocabulary to a real-valued vector in Euclidean space. Besides high performance on specific tasks, learned word representations have been shown to capture linear relationships among words. The recently introduced skip-gram model improved unsupervised learning of word embeddings that contain rich syntactic and semantic word relations, both in terms of accuracy and speed. Word embeddings, which have been used frequently for English, had not yet been applied to Turkish. In this paper, we apply the skip-gram model to a large Turkish text corpus and measure the quality of the resulting embeddings quantitatively with the "question" sets that we generated. The learned word embeddings and the question sets are publicly available at our website.
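The skip-gram model mentioned above learns two vectors per word (an input and an output vector) by predicting context words within a window around each center word, typically trained with negative sampling. The following is a minimal, illustrative sketch of that training loop in NumPy; the paper itself uses the original word2vec skip-gram tool on a large Turkish corpus, and the toy corpus, hyperparameters, and helper names here are assumptions for demonstration only.

```python
import numpy as np

# Toy skip-gram with negative sampling (illustrative sketch, not the
# paper's actual training setup or corpus).
rng = np.random.default_rng(0)

corpus = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(corpus))
w2i = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 16                        # vocab size, embedding dim
W_in = rng.normal(scale=0.1, size=(V, D))    # input (word) vectors
W_out = rng.normal(scale=0.1, size=(V, D))   # output (context) vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr, window, negatives = 0.05, 2, 3
for epoch in range(200):
    for pos, word in enumerate(corpus):
        center = w2i[word]
        lo, hi = max(0, pos - window), min(len(corpus), pos + window + 1)
        for ctx_pos in range(lo, hi):
            if ctx_pos == pos:
                continue
            # One positive (center, context) pair plus random negatives.
            targets = [(w2i[corpus[ctx_pos]], 1.0)]
            targets += [(int(rng.integers(V)), 0.0) for _ in range(negatives)]
            v = W_in[center]
            grad_v = np.zeros(D)
            for t, label in targets:
                g = (sigmoid(v @ W_out[t]) - label) * lr
                grad_v += g * W_out[t]
                W_out[t] -= g * v
            W_in[center] -= grad_v

def most_similar(word):
    """Nearest neighbour of `word` by cosine similarity of input vectors."""
    v = W_in[w2i[word]]
    sims = W_in @ v / (np.linalg.norm(W_in, axis=1) * np.linalg.norm(v) + 1e-9)
    sims[w2i[word]] = -np.inf
    return vocab[int(np.argmax(sims))]

print(most_similar("quick"))
```

The "question" sets used for evaluation test analogies of the form a : b :: c : ?, answered by finding the vocabulary word whose vector is closest to vec(b) - vec(a) + vec(c); on a toy corpus like this no meaningful analogies emerge, but the mechanics are the same.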
Date of Conference: 23-25 April 2014
Date Added to IEEE Xplore: 12 June 2014
Electronic ISBN: 978-1-4799-4874-1
Print ISSN: 2165-0608