TMAC: An automated text mining tool for construction of an annotated corpus to support protein-protein interaction information extraction

2 Author(s)
Azzem, R.A.A. ; Dept. of Electr. Eng., El Fayoum Univ., Fayoum, Egypt ; Seoud, A.

Extracting protein-protein interactions (PPIs) from the biomedical literature is an important topic in protein science. Annotated corpora are essential for developing and evaluating PPI extraction systems, so a text mining tool that annotates a corpus with protein names and interaction events is needed to identify interactions among proteins. In this paper we present a Java package called the TMAC system. TMAC tags protein names and interaction events in biomedical literature using a combination of carefully designed rules and a dictionary of protein names. TMAC also normalizes the protein mentions and interaction events it finds by supplying the appropriate database reference. TMAC is divided into two modules. The first is the named entity identification and normalization module. The second is the interaction event tagger, which identifies words that signal the occurrence of an interaction. TMAC achieved an average of 85.2% precision and 76.7% recall for protein identification, and an average of 88.2% precision and 71.8% recall for protein-protein interaction event identification. TMAC is a flexible system: it can be used as a standalone application or incorporated into the workflow of a more general text mining system.
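
To make the two-module idea concrete, here is a minimal Java sketch of dictionary lookup with normalization, a trigger-word tagger, and the precision and recall measures quoted above. It is an illustration under stated assumptions, not TMAC's actual code: the class name `TaggerSketch`, the dictionary entries, the placeholder references (`REF:0001`, `REF:0002`), and the trigger words are all hypothetical, and the abstract does not describe the real rules or dictionary.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class TaggerSketch {
    // Hypothetical dictionary: lower-cased protein name -> placeholder database reference.
    static final Map<String, String> PROTEIN_DICT = new LinkedHashMap<>();
    // Hypothetical trigger words that suggest an interaction event.
    static final Set<String> TRIGGERS =
            new HashSet<>(Arrays.asList("binds", "interacts", "phosphorylates"));

    static {
        PROTEIN_DICT.put("brca1", "REF:0001");
        PROTEIN_DICT.put("rad51", "REF:0002");
    }

    // Split a sentence into tokens made of letters, digits, and hyphens.
    static List<String> tokenize(String sentence) {
        List<String> tokens = new ArrayList<>();
        for (String t : sentence.split("[^A-Za-z0-9-]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Module 1 (sketch): tag dictionary matches and attach their reference.
    static List<String> tagProteins(String sentence) {
        List<String> tags = new ArrayList<>();
        for (String token : tokenize(sentence)) {
            String ref = PROTEIN_DICT.get(token.toLowerCase(Locale.ROOT));
            if (ref != null) {
                tags.add(token + "=" + ref);
            }
        }
        return tags;
    }

    // Module 2 (sketch): tag words that signal an interaction.
    static List<String> tagTriggers(String sentence) {
        List<String> hits = new ArrayList<>();
        for (String token : tokenize(sentence)) {
            if (TRIGGERS.contains(token.toLowerCase(Locale.ROOT))) {
                hits.add(token);
            }
        }
        return hits;
    }

    // Precision = TP / (TP + FP); returns 0 when nothing was predicted.
    static double precision(int truePositives, int falsePositives) {
        int predicted = truePositives + falsePositives;
        return predicted == 0 ? 0.0 : (double) truePositives / predicted;
    }

    // Recall = TP / (TP + FN); returns 0 when there is nothing to find.
    static double recall(int truePositives, int falseNegatives) {
        int actual = truePositives + falseNegatives;
        return actual == 0 ? 0.0 : (double) truePositives / actual;
    }

    public static void main(String[] args) {
        String sentence = "BRCA1 interacts with RAD51 in vitro.";
        System.out.println(tagProteins(sentence));
        System.out.println(tagTriggers(sentence));
    }
}
```

A real system would need multi-word names, spelling variants, and context rules to disambiguate matches; exact token lookup is used here only to keep the sketch short.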

Published in:

Computer Technology and Development (ICCTD), 2010 2nd International Conference on

Date of Conference:

2-4 Nov. 2010