Opinion Summarization in Bengali: A Theme Network Model

2 Author(s)
Amitava Das ; Dept. of Comput. Sci. & Eng., Jadavpur Univ., Kolkata, India ; Sivaji Bandyopadhyay

A theme network is a semantic network of document-specific themes. So far, Natural Language Processing (NLP) research has largely favored topic-based summarizers, which are unable to capture the thematic semantic affinity of a text. For example, a news article containing the concepts "gun," "convenience store," "demand money" and "make getaway" might suggest the topics "robbery" and "crime". This paper describes the development of an opinion summarization system that works on a Bengali news corpus. The system identifies the sentiment information in each document, aggregates it, and presents the summary information as text. The system follows a topic-sentiment model for sentiment identification and aggregation. The topic-sentiment model is designed as discourse-level theme identification, and topic-sentiment aggregation is achieved by theme clustering (k-means) and a document-level Theme Relational Graph representation. The document-level Theme Relational Graph is finally used for candidate summary sentence selection by standard PageRank algorithms used in Information Retrieval (IR). As Bengali is a resource-constrained language, the paper also describes the building of an annotated gold-standard corpus and the acquisition of linguistic tools for extracting lexico-syntactic, syntactic and discourse-level features. The reported accuracy of the theme detection technique is 83.60% (precision), 76.44% (recall) and 79.85% (F-measure). The summarization system has been evaluated with a precision of 72.15%, a recall of 67.32% and an F-measure of 69.65%.
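The graph-based selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the edge rule (two sentences are linked when they share a theme word), the damping factor of 0.85, the function names and the toy theme sets are all assumptions made for illustration, and the sketch covers only the PageRank step, not the k-means clustering.

```python
from typing import Dict, List, Set


def build_theme_graph(sentence_themes: List[Set[str]]) -> Dict[int, Set[int]]:
    """Link two sentences when they share a theme word (assumed edge rule)."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(sentence_themes))}
    for i in range(len(sentence_themes)):
        for j in range(i + 1, len(sentence_themes)):
            if sentence_themes[i] & sentence_themes[j]:
                graph[i].add(j)
                graph[j].add(i)
    return graph


def pagerank(graph: Dict[int, Set[int]], damping: float = 0.85,
             iterations: int = 50) -> Dict[int, float]:
    """Power-iteration PageRank over an undirected graph of adjacency sets."""
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by isolated nodes is spread evenly so scores still sum to 1.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        new_rank = {}
        for node in nodes:
            incoming = sum(rank[nb] / len(graph[nb]) for nb in graph[node])
            new_rank[node] = (1 - damping) / n + damping * (incoming + dangling / n)
        rank = new_rank
    return rank


def select_summary(sentence_themes: List[Set[str]], top_k: int = 2) -> List[int]:
    """Return the indices of the top_k highest-ranked sentences, in document order."""
    scores = pagerank(build_theme_graph(sentence_themes))
    ranked = sorted(scores, key=lambda i: (-scores[i], i))
    return sorted(ranked[:top_k])


# Hypothetical theme sets, one per sentence, reusing the abstract's example concepts.
example_themes = [
    {"gun", "robbery"},
    {"convenience store", "robbery"},
    {"make getaway"},
    {"gun", "convenience store", "crime"},
]
example_scores = pagerank(build_theme_graph(example_themes))
example_summary = select_summary(example_themes, top_k=2)
```

In this toy input the third sentence shares no theme with the others, so it ends up isolated in the graph and ranks below the three connected sentences. A real system would likely weight edges by theme similarity rather than use a binary shared-word test.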

Published in:

Social Computing (SocialCom), 2010 IEEE Second International Conference on

Date of Conference:

20-22 Aug. 2010