Hidden Markov Models and Text Classifiers for Information Extraction on Semi-Structured Texts

5 Author(s)
Barros, F.A. (Center of Inf., Fed. Univ. of Pernambuco, Recife); Silva, E.F.A.; Prudencio, R.B.C.; Filho, V.M.

Information extraction (IE) aims to extract from textual documents only the fragments that correspond to the data fields required by the user. In this paper, we present new experiments evaluating a hybrid machine learning approach for IE that combines text classifiers and hidden Markov models (HMMs). In this approach, a text classifier generates an initial output, which is then refined by an HMM that takes into account dependencies in the order of the data to be extracted. The approach was evaluated on the task of extracting information from bibliographic references. Experiments performed on a corpus of 6000 references showed an improvement in performance over the benchmark IE approaches adopted in previous work.
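The refinement step described in the abstract can be illustrated with a small sketch. The abstract does not specify the model, so everything below is an assumption made for illustration: the field names, the hand-set probabilities, the choice to treat the classifier's predicted label as the HMM observation, and the use of Viterbi decoding to pick the most likely sequence of true labels. It is not the authors' implementation.

```python
import math

# Hypothetical fields of a bibliographic reference (not taken from the paper).
STATES = ["author", "title", "venue", "year"]

# Illustrative, hand-set probabilities; a real system would estimate them from data.
START = {"author": 0.85, "title": 0.05, "venue": 0.05, "year": 0.05}
TRANS = {  # TRANS[a][b] = P(next field is b | current field is a)
    "author": {"author": 0.60, "title": 0.30, "venue": 0.05, "year": 0.05},
    "title":  {"author": 0.02, "title": 0.60, "venue": 0.33, "year": 0.05},
    "venue":  {"author": 0.02, "title": 0.03, "venue": 0.60, "year": 0.35},
    "year":   {"author": 0.05, "title": 0.05, "venue": 0.10, "year": 0.80},
}


def emission(state, observed):
    """P(classifier outputs `observed` | true field is `state`).

    Assumes the classifier is right 70% of the time and spreads its
    errors evenly over the three other labels (0.7 + 3 * 0.1 = 1).
    """
    return 0.7 if state == observed else 0.1


def viterbi(observed):
    """Return the most likely true label sequence for the classifier output."""
    if not observed:
        return []
    # best[s] = (log-probability of the best path ending in s, that path)
    best = {
        s: (math.log(START[s]) + math.log(emission(s, observed[0])), [s])
        for s in STATES
    }
    for obs in observed[1:]:
        nxt = {}
        for s in STATES:
            # Pick the best predecessor state for s.
            score, path = max(
                ((best[p][0] + math.log(TRANS[p][s]), best[p][1]) for p in STATES),
                key=lambda cand: cand[0],
            )
            nxt[s] = (score + math.log(emission(s, obs)), path + [s])
        best = nxt
    return max(best.values(), key=lambda cand: cand[0])[1]


# Classifier output for eight fragments; the third label is an isolated error.
noisy = ["author", "author", "venue", "author", "title", "title", "venue", "year"]
refined = viterbi(noisy)
print(refined)
```

With these illustrative numbers, the isolated "venue" label in third position is relabeled "author", because an author, venue, author ordering is very unlikely under the transition table, while a classifier output that already follows the expected field order is left unchanged.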

Published in:

Hybrid Intelligent Systems, 2008. HIS '08. Eighth International Conference on

Date of Conference:

10-12 Sept. 2008