Most readily available tools (basic search engines, possibly a news or information service, and perhaps agents and Web crawlers) are inadequate for many information retrieval tasks and downright dangerous for others. These tools either return too much useless material or miss important material. Even when they find useful information, the data is still in a text form that makes it difficult to build displays or diagrams. Using the data in data mining or in standard database operations, such as sorting and counting, can also be difficult. An emerging technology called information extraction (IE) is beginning to change all that, and you might already be using some very basic IE tools without knowing it. Companies are increasingly applying IE behind the scenes to improve information and knowledge management applications such as text search, text categorization, data mining, and visualization (Rao, 2003). IE has also begun playing a key role in fields such as national security, law enforcement, insurance, and biomedical research, which have highly critical information and knowledge needs. In these fields, IE's powerful capabilities are necessary to save lives or substantial investments of time and money. IE views language up close, considering grammar and vocabulary, and tries to determine the details of "who did what to whom" from a piece of text. In its most in-depth applications, IE is domain focused: it does not try to define all the events or relationships present in a piece of text, but focuses only on items of particular interest to the user organization.
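To make the "who did what to whom" idea concrete, here is a toy sketch in Python. It is not how production IE systems work: it swaps real grammatical analysis for a single regular expression, and the domain (three business verbs), the `Event` type, the `extract_events` function, and the sample text are all hypothetical, chosen only for illustration. What it does show is the domain focus described above and why structured output makes sorting and counting straightforward.

```python
import re
from collections import Counter
from typing import NamedTuple


class Event(NamedTuple):
    """One extracted "who did what to whom" record (hypothetical schema)."""
    who: str
    did: str
    whom: str


# Assumption: a capitalized word sequence stands in for a named entity, and
# only three verbs are of interest to this imaginary user organization.
NAME = r"[A-Z]\w+(?: [A-Z]\w+)*"
PATTERN = re.compile(
    rf"(?P<who>{NAME}) (?P<did>acquired|sued|hired) (?P<whom>{NAME})"
)


def extract_events(text: str) -> list[Event]:
    """Return only the events matching the domain pattern; ignore everything else."""
    return [Event(m["who"], m["did"], m["whom"]) for m in PATTERN.finditer(text)]


text = (
    "Acme Corp acquired Widget Works last year. "
    "The weather was mild. "
    "Later, Globex sued Acme Corp over a patent, and Acme Corp hired Jane Doe."
)

events = extract_events(text)

# Once the text is structured, database-style operations become easy.
by_action = sorted(events, key=lambda e: e.did)   # sort events by verb
actor_counts = Counter(e.who for e in events)     # count events per actor

for event in by_action:
    print(event)
print(actor_counts.most_common())
```

The sentence about the weather yields no record, which mirrors the point that a domain-focused extractor ignores events outside its area of interest. A realistic system would replace the regular expression with linguistic analysis of grammar and vocabulary.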