We report a keyphrase identification program (KIP) that learns from sample human-assigned keyphrases to identify additional new keyphrases. KIP first populates its database with manually identified keyphrases; each keyphrase is preprocessed and assigned an initial weight. It then extracts noun phrases from documents. Each noun phrase is assigned a score that depends on the weights of the words it contains; those whose score exceeds a threshold are selected as keyphrases. Newly learned keyphrases are inserted into the database and the weights are updated, which triggers a new iteration of keyphrase identification. The process stops when no new keyphrases were identified during the previous iteration. In our evaluation, the base KIP system achieved an average recall of 0.7 and a precision of 0.44. The augmented KIP with learning functions produced new keyphrases that the base system had not identified.
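The iterative procedure described above can be illustrated with a minimal sketch. It is not the KIP implementation: the abstract does not give the scoring formula, the weight-update rule, or the threshold value, so the mean-of-word-weights score, the `initial_weight`, `learned_weight`, and `threshold` parameters, and the function names `score_phrase` and `run_kip` are all assumptions made for illustration. Noun-phrase extraction is also assumed to have happened already, so candidates are passed in as plain strings.

```python
def score_phrase(phrase, weights):
    """Score a noun phrase as the mean weight of its words (assumed formula)."""
    words = phrase.lower().split()
    if not words:
        return 0.0
    return sum(weights.get(w, 0.0) for w in words) / len(words)


def run_kip(seed_keyphrases, candidates, threshold=0.3,
            initial_weight=1.0, learned_weight=0.8):
    """Iteratively learn keyphrases until an iteration finds nothing new.

    seed_keyphrases: manually identified keyphrases (the initial database).
    candidates: noun phrases already extracted from the documents.
    All numeric defaults are hypothetical, not values from the paper.
    """
    database = {p.lower() for p in seed_keyphrases}

    # Every word of a manually identified keyphrase gets the initial weight.
    weights = {}
    for phrase in database:
        for word in phrase.split():
            weights[word] = initial_weight

    learned = []
    while True:
        # Select unseen candidates whose score exceeds the threshold.
        new_phrases = []
        for cand in candidates:
            cand = cand.lower()
            if cand in database or cand in new_phrases:
                continue
            if score_phrase(cand, weights) > threshold:
                new_phrases.append(cand)

        # Stop when the iteration identifies no new keyphrases.
        if not new_phrases:
            break

        # Insert learned keyphrases and update weights (assumed rule:
        # previously unseen words receive the lower learned weight).
        for phrase in new_phrases:
            database.add(phrase)
            learned.append(phrase)
            for word in phrase.split():
                weights.setdefault(word, learned_weight)

    return learned


seeds = ["information retrieval", "digital library"]
noun_phrases = ["retrieval system", "digital information",
                "system design", "user interface"]
print(run_kip(seeds, noun_phrases))
# → ['retrieval system', 'digital information', 'system design']
```

In this toy run, "system design" is picked up only in the second iteration, after "system" has gained a weight from the newly learned "retrieval system"; this is the kind of keyphrase the augmented system can find and the base system cannot. Because each iteration must add at least one phrase from a finite candidate list, the loop always terminates.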