On Refining Real-Time Multilingual News Event Extraction through Deployment of Cross-Lingual Information Fusion Techniques

3 Author(s)
Piskorski, J. ; Inst. of Comput. Sci., Warsaw, Poland ; Belayeva, J. ; Atkinson, M.

Nowadays, many influential security-related facts are reported multiple times by different sources and in different languages. Consequently, in recent years, research on event extraction technology has shifted from classical single-document extraction toward cross-document information aggregation and fact validation. However, relatively little work has been reported on cross-lingual information fusion in this area. This paper presents the results of preliminary experiments on deploying cross-lingual information fusion techniques to refine the results of a large-scale multilingual news event extraction system. The first technique fuses the responses of the monolingual event extraction systems, whereas the second uses state-of-the-art machine translation to convert all news articles reporting on a given event into one common language and then applies the corresponding monolingual event extraction system to the translated articles. An evaluation of these techniques on a news article corpus whose articles refer to 523 real-world crisis-related events (violent events, man-made and natural disasters) revealed that the descriptions of circa 10% of the events could be refined by fusing the event descriptions returned by the monolingual event extraction systems. The overall gain in recall and precision over the best monolingual system was 6.4% and 4.8%, respectively. The second approach, based on machine translation, turned out to perform significantly worse than both the former technique and the best monolingual system (English).

Published in:

Intelligence and Security Informatics Conference (EISIC), 2011 European

Date of Conference:

12-14 Sept. 2011