Extracting protein-protein interactions (PPIs) from the biomedical literature is an important topic in protein science. Annotated corpora are essential to the development and evaluation of PPI extraction systems, so there is a need for a text mining tool that can annotate any corpus with protein names and interaction events in order to identify interactions among proteins. In this paper we present a Java package called the TMAC system. TMAC tags protein names and interaction events in biomedical literature using a combination of carefully designed rules and a dictionary of protein names. TMAC normalizes the protein mentions and interaction events it finds by linking each to the appropriate database reference. TMAC is divided into two modules. The first is the named entity identification and normalization module. The second is the interaction event tagger, which identifies the words that signal the occurrence of an interaction. TMAC achieved an average of 85.2% precision and 76.7% recall for protein identification, and an average of 88.2% precision and 71.8% recall for protein-protein interaction event identification. TMAC is a flexible system: it can be used as a standalone application or incorporated into the workflow of a more general text mining system.
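The two-module design described above can be illustrated with a minimal Java sketch. It is not TMAC's actual code or API: the class name `PpiTaggerSketch`, the method names, the two dictionary entries with their placeholder references, and the list of interaction words are all assumptions made for illustration. It shows only the general idea of dictionary lookup with normalization, followed by keyword-based tagging of interaction words.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only; all names and data here are hypothetical, not TMAC's.
final class PpiTaggerSketch {

    // Hypothetical protein dictionary: surface form -> placeholder database reference.
    static final Map<String, String> PROTEIN_DICT = new LinkedHashMap<>();
    static {
        PROTEIN_DICT.put("BRCA1", "EXAMPLE:0001");
        PROTEIN_DICT.put("RAD51", "EXAMPLE:0002");
    }

    // Hypothetical interaction trigger words (lower case).
    static final Set<String> TRIGGERS = Set.of("interacts", "binds", "phosphorylates");

    // Alphanumeric tokens, optionally joined by internal hyphens.
    static final Pattern TOKEN = Pattern.compile("[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*");

    static List<String> tokens(String sentence) {
        List<String> out = new ArrayList<>();
        Matcher m = TOKEN.matcher(sentence);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    // Module 1 (sketch): return the tokens that exactly match a dictionary entry.
    static List<String> tagProteins(String sentence) {
        List<String> found = new ArrayList<>();
        for (String t : tokens(sentence)) {
            if (PROTEIN_DICT.containsKey(t)) {
                found.add(t);
            }
        }
        return found;
    }

    // Normalization (sketch): map a tagged name to its reference, or null if unknown.
    static String normalize(String proteinName) {
        return PROTEIN_DICT.get(proteinName);
    }

    // Module 2 (sketch): return the lower-cased tokens that are known trigger words.
    static List<String> tagInteractionWords(String sentence) {
        List<String> found = new ArrayList<>();
        for (String t : tokens(sentence)) {
            String lower = t.toLowerCase(Locale.ROOT);
            if (TRIGGERS.contains(lower)) {
                found.add(lower);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String sentence = "BRCA1 interacts with RAD51 in vivo.";
        for (String p : tagProteins(sentence)) {
            System.out.println(p + " -> " + normalize(p));
        }
        System.out.println("interaction words: " + tagInteractionWords(sentence));
    }
}
```

A real system of the kind described would need far more than exact matching, for example handling of name variants and the hand-written rules the abstract mentions; the sketch leaves these out.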