This talk introduces the language-resource-related activities of the National Institute for Japanese Language and Linguistics (NINJAL). Since the late 1990s, the former National Language Research Institute (NLRI) has played a central role in the development of Japanese language resources by constructing corpora such as the Corpus of Spontaneous Japanese (CSJ) and the Taiyo Corpus. In 2006, the language resource group of NLRI started a Japanese corpus compilation initiative named KOTONOHA and set about constructing the 100-million-word Balanced Corpus of Contemporary Written Japanese (BCCWJ). The activities of NLRI were inherited by the NINJAL Center for Corpus Development, reestablished in 2009. With the construction of the BCCWJ successfully completed in August 2011, the NINJAL center has set about two new exploratory projects: a historical corpus project and a 10-billion-word ultra-large-scale Web-based corpus project. Before the NLRI-NINJAL activities are presented, language resource development at Japanese institutions other than NINJAL will be introduced briefly. At the end, an application of the CSJ to the study of phonetics will be demonstrated.
Date of Conference: 26-28 Oct. 2011