The popularity of the Internet and the large number of documents now available in electronic form have motivated the search for hidden knowledge in text collections. In this paper, we present an information extraction approach that combines NLP techniques with soft matching rules; the extracted information is stored in the form of scenario templates. Once the information is extracted, we use a genetic algorithm to classify it according to its relevance, and we compare the genetic algorithm's results with those of traditional classification techniques. The system is tested on data collected from NSF research abstracts and on abstracts from two different domains of www.computer.org. We find that recall improves after the soft matching rules are applied. The genetic algorithm is an effective classifier and remains quite competitive with the C4.8 method even as the concept increases in complexity.