Nowadays, many influential security-related facts are reported multiple times by different sources and in different languages. Consequently, research on event extraction has shifted in recent years from classical single-document extraction toward cross-document information aggregation and fact validation. However, relatively little work has been reported on cross-lingual information fusion in this area. This paper presents the results of preliminary experiments on deploying cross-lingual information fusion techniques to refine the results of a large-scale multilingual news event extraction system. The first technique fuses the responses of the mono-lingual event extraction systems, whereas the second uses state-of-the-art machine translation to convert all news articles reporting on a given event into one common language and then applies the corresponding mono-lingual event extraction system to the translated articles. An evaluation of these techniques on a news article corpus whose articles refer to 523 real-world crisis-related events (violent events, man-made and natural disasters) revealed that the descriptions of approximately 10% of the events could be refined by fusing the event descriptions returned by the mono-lingual event extraction systems. The overall gains in recall and precision over the best mono-lingual system were 6.4% and 4.8%, respectively. The second approach, based on machine translation, turned out to perform significantly worse than both the former technique and the best mono-lingual system (English).
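To make the first technique more concrete, the following is a minimal sketch of how event descriptions returned by several mono-lingual systems might be fused. The abstract does not state the actual fusion rule, so everything here is an assumption for illustration: the function name `fuse_event_descriptions`, the slot names (`type`, `dead`, `injured`, `location`), and the rule itself (median for numeric slots, most frequent value otherwise) are hypothetical.

```python
from collections import Counter
from statistics import median


def fuse_event_descriptions(descriptions):
    """Fuse per-language event descriptions (dicts of slot -> value) into one.

    Hypothetical rule, not taken from the paper: numeric slots take the
    median of the reported values; all other slots take the most frequent
    value, with ties going to the value seen first.
    """
    # Collect slot names in first-seen order across all descriptions.
    slots = []
    for description in descriptions:
        for slot in description:
            if slot not in slots:
                slots.append(slot)

    fused = {}
    for slot in slots:
        # Keep only the systems that actually filled this slot.
        values = [d[slot] for d in descriptions if d.get(slot) is not None]
        if not values:
            continue
        is_numeric = all(
            isinstance(v, (int, float)) and not isinstance(v, bool)
            for v in values
        )
        if is_numeric:
            fused[slot] = median(values)
        else:
            fused[slot] = Counter(values).most_common(1)[0][0]
    return fused


# Hypothetical outputs of three mono-lingual systems for the same event.
english = {"type": "armed conflict", "dead": 12, "location": "Kabul"}
italian = {"type": "armed conflict", "dead": 12}
french = {"type": "terrorist attack", "dead": 10, "injured": 30}

fused = fuse_event_descriptions([english, italian, french])
print(fused)
```

In this sketch, a slot filled by only one system (such as `location` or `injured`) is carried into the fused description, which is one way a fused result could improve recall over any single mono-lingual system, while voting on conflicting values is one way it could improve precision.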