A Method of Construction of the Chinese and English Bilingual Translation Corpus Based on Web Data Mining

2 Author(s)
Liu Dong-Fei ; Coll. of Comput. Sci. & Technol., Wuhan Univ. of Technol., Wuhan, China ; Zhou Xing

This paper introduces a method for constructing a Chinese-English bilingual translation corpus based on web data mining. A web spider collects a large volume of page data; a series of purification and analysis steps then identifies bilingual web pages; finally, the DOM structures of the two page texts are analyzed to extract Chinese-English parallel translation pairs, which are saved to a database. Because the corpus is accumulated automatically by machine, the process is more efficient, and because the translated content comes from the internet, the source material is rich and relatively accurate. The resulting corpus can provide good reference data for translation software.
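The abstract does not give implementation details, so the following is only a minimal sketch of the DOM-based alignment step under stated assumptions: the two pages are assumed to be already identified as a Chinese/English pair with mirrored markup, text blocks are paired when they sit at the same tag path in the same order, and the names `BlockExtractor`, `align_pages`, `save_pairs`, and the `corpus` table are hypothetical, not taken from the paper. It uses only the Python standard library.

```python
import sqlite3
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}  # tags with no end tag


class BlockExtractor(HTMLParser):
    """Collect (tag_path, text) for every non-empty text node."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, outermost first
        self.blocks = []  # list of (tag_path, text) in document order

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = " ".join(data.split())
        if text and not {"script", "style"} & set(self.stack):
            self.blocks.append(("/".join(self.stack), text))


def extract_blocks(html):
    parser = BlockExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


def has_chinese(text):
    # CJK Unified Ideographs basic block; a simplification for this sketch.
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def align_pages(zh_html, en_html):
    """Pair text blocks that occupy the same DOM position in both pages."""
    pairs = []
    for (zh_path, zh_text), (en_path, en_text) in zip(
        extract_blocks(zh_html), extract_blocks(en_html)
    ):
        if zh_path == en_path and has_chinese(zh_text) and not has_chinese(en_text):
            pairs.append((zh_text, en_text))
    return pairs


def save_pairs(conn, pairs):
    # Hypothetical schema: one row per aligned Chinese/English block.
    conn.execute("CREATE TABLE IF NOT EXISTS corpus (zh TEXT, en TEXT)")
    conn.executemany("INSERT INTO corpus VALUES (?, ?)", pairs)
    conn.commit()


# Toy input standing in for a crawled bilingual page pair.
zh_page = "<html><body><h1>你好世界</h1><p>这是一个例子。</p></body></html>"
en_page = "<html><body><h1>Hello world</h1><p>This is an example.</p></body></html>"

pairs = align_pages(zh_page, en_page)
conn = sqlite3.connect(":memory:")
save_pairs(conn, pairs)
row_count = conn.execute("SELECT COUNT(*) FROM corpus").fetchone()[0]
print(pairs, row_count)
```

Pairing by identical tag path is a deliberately strict choice: it drops blocks whose structure differs between the two pages rather than risk misaligned pairs. A real pipeline would likely need looser matching plus the crawling and page-identification stages, which are omitted here.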

Published in:

Asia-Pacific Conference on Information Processing, 2009 (APCIP 2009), Volume 1

Date of Conference:

18-19 July 2009