Summary form only given. Asian-language texts in 16-bit character encodings are difficult to compress with conventional text compression schemes that sample 8 bits at a time. Word-based text compression has recently been studied as a way to compress Japanese and Chinese texts individually. To compress large numbers of small Japanese documents, such as those found in groupware and e-mail, we applied a semi-adaptive word-based method to Japanese text, as presented at DCC '98. To extend this work to multilingual text compression, we also applied a static word-based method to both Japanese and Chinese texts and evaluated its compression characteristics and performance by computer simulation.
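The abstract does not describe the coder itself, so the following is only a minimal sketch of what a static word-based scheme could look like, not the authors' method. It assumes a fixed word dictionary shared by encoder and decoder, greedy longest-match segmentation (Japanese and Chinese text has no spaces between words), and an escape code for characters missing from the dictionary. The names `ESCAPE`, `build_static_dictionary`, `encode`, and `decode` are hypothetical, and the integer codes would still need an entropy coder behind them to yield actual compression.

```python
from typing import Dict, List

# Hypothetical code reserved for characters that are not in the dictionary.
ESCAPE = 0


def build_static_dictionary(words: List[str]) -> Dict[str, int]:
    """Assign each word a fixed code, starting at 1 (0 is the escape code)."""
    return {word: code for code, word in enumerate(words, start=1)}


def encode(text: str, dictionary: Dict[str, int]) -> List[int]:
    """Greedy longest-match tokenization; unknown characters are escaped."""
    max_len = max((len(w) for w in dictionary), default=1)
    codes: List[int] = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shorter ones.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                codes.append(dictionary[candidate])
                i += length
                break
        else:
            # No dictionary entry: emit the escape code, then the raw code point.
            codes.extend([ESCAPE, ord(text[i])])
            i += 1
    return codes


def decode(codes: List[int], dictionary: Dict[str, int]) -> str:
    """Invert encode() using the same static dictionary."""
    reverse = {code: word for word, code in dictionary.items()}
    out: List[str] = []
    i = 0
    while i < len(codes):
        if codes[i] == ESCAPE:
            out.append(chr(codes[i + 1]))
            i += 2
        else:
            out.append(reverse[codes[i]])
            i += 1
    return "".join(out)


# Illustrative word list only; a real dictionary would be built from a corpus.
dictionary = build_static_dictionary(["日本語", "日本", "圧縮", "文書"])
text = "日本語の文書圧縮"
codes = encode(text, dictionary)
restored = decode(codes, dictionary)
```

Because the dictionary is static, nothing about it has to be stored with each document, which is one plausible reason such a scheme suits many small documents; the same functions work unchanged with a Chinese word list.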
Published in:
Data Compression Conference, 1999. Proceedings. DCC '99
Date of Conference: 29-31 Mar 1999