The quantification of unstructured information and its use in predictive modeling

8 Author(s)
P. Dumrong ; Dept. of Syst. & Inf. Eng., Virginia Univ., Charlottesville, VA, USA ; J. Gould ; G. Lee ; L. Nicholson

Managing text-based information is crucial when trying to extract valuable information from documents. Assigning a numerical value to text-based (unstructured) information is one way to extract that value. This research studied the quantification of unstructured text and its forecasting power. To examine how unstructured information can feed predictive models, the Beige Books were used to investigate and predict changes in the U.S. economy. The Beige Books describe current economic conditions and discuss fluctuations in real gross domestic product (GDP). To quantify the text-based unstructured information, the direct scoring algorithm (DSA) was proposed. It uses the keywords in a document and their subjectively determined numerical weights to score individual sentences. Statistical analyses were then conducted to identify which sections of the Beige Books contributed the most significant information to the prediction of GDP. Using the significant sections, a linear regression model was constructed to predict future GDP growth. The adjusted R² values of the DSA model were compared with those obtained when the same documents were scored by an economic expert. The comparison demonstrated that the DSA model using the Beige Book contributed significantly to the prediction of GDP, and that it explained a similar amount of variance to the scores created by the economic expert. A comparison between a structured predictive model and the DSA model was also conducted to further demonstrate the significance of text-based information.
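The abstract does not give the DSA's keyword list, weights, or aggregation rule, so the following is only a minimal sketch of keyword-weighted sentence scoring in that spirit. The weights in `KEYWORD_WEIGHTS`, the naive sentence splitter, and the choice to average sentence scores into a document score are all assumptions made for illustration, not the authors' method. The `adjusted_r2` helper applies the standard adjusted R² formula used to compare regression models with different numbers of predictors.

```python
import re

# Hypothetical keyword weights chosen for illustration only; the paper's
# actual keywords and subjectively determined weights are not given here.
KEYWORD_WEIGHTS = {
    "strong": 2.0,
    "growth": 1.0,
    "improved": 1.0,
    "weak": -2.0,
    "decline": -1.0,
    "slowed": -1.0,
}


def split_sentences(text):
    """Naively split text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentence(sentence, weights=KEYWORD_WEIGHTS):
    """Sum the weights of all known keywords appearing in one sentence."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return sum(weights.get(token, 0.0) for token in tokens)


def score_document(text, weights=KEYWORD_WEIGHTS):
    """Average the sentence scores (an assumed aggregation rule)."""
    scores = [score_sentence(s, weights) for s in split_sentences(text)]
    return sum(scores) / len(scores) if scores else 0.0


def adjusted_r2(r2, n_obs, n_predictors):
    """Standard adjusted R^2 for a model with an intercept."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)
```

In a pipeline of this kind, each document (or document section) would get one score per period, and those scores would serve as predictors in a linear regression against GDP growth.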

Published in:

Systems and Information Engineering Design Symposium, 2003 IEEE

Date of Conference:

24-25 April 2003