As research on the structural analysis of proteins has progressed, many studies have addressed the extraction of protein interaction information from the literature. Machine learning is a useful approach to extracting this information automatically. Linguistic features obtained directly from the literature are generally used for learning, but a non-linguistic feature, such as the atomic distance calculated from protein structure data, is often very effective for learning and classification. In this study, we call this type of feature a “key feature”. In the machine learning approach, it is important to prepare enough training instances to train the classifier, but doing so is often costly. In such a situation, transfer learning is a promising approach. However, it is difficult to apply a simple transfer learning algorithm to a task in which the key feature cannot be prepared in the source domain. In this study, we propose a new transfer learning method called STEK (Selective Transfer learning based on Effectiveness of a Key feature). This method focuses on the effectiveness of the key feature and divides the set of instances into two subsets: one to which transfer learning is applied and one for which transfer learning is avoided. The proposed method, combined with the InstPrune algorithm, showed consistently high precision, recall, and F-measure on average.
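The division of instances into two subsets can be illustrated with a minimal sketch. The abstract does not state how the effectiveness of the key feature is measured, nor which subset receives transfer learning, so the following rests on assumptions: the key feature is treated as effective when its value falls outside a hypothetical ambiguous band, and instances for which it is effective are kept away from transfer learning. The names `Instance`, `split_by_key_feature`, and `outside_band`, as well as the band limits, are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Instance:
    linguistic: List[float]  # features extracted from the text
    key_feature: float       # e.g. an atomic distance (hypothetical values)


def split_by_key_feature(
    instances: List[Instance],
    key_is_effective: Callable[[Instance], bool],
) -> Tuple[List[Instance], List[Instance]]:
    """Partition instances into (transfer, no_transfer).

    Assumption: when the key feature is effective for an instance,
    transfer learning is avoided; otherwise it is applied.
    """
    transfer: List[Instance] = []
    no_transfer: List[Instance] = []
    for inst in instances:
        if key_is_effective(inst):
            no_transfer.append(inst)
        else:
            transfer.append(inst)
    return transfer, no_transfer


def outside_band(low: float, high: float) -> Callable[[Instance], bool]:
    """Hypothetical criterion: the key feature is informative only
    when it lies outside the ambiguous interval [low, high]."""
    return lambda inst: inst.key_feature < low or inst.key_feature > high


# Usage with made-up values.
data = [
    Instance(linguistic=[0.1, 0.7], key_feature=2.0),
    Instance(linguistic=[0.4, 0.2], key_feature=5.0),
    Instance(linguistic=[0.9, 0.3], key_feature=9.0),
]
transfer_set, no_transfer_set = split_by_key_feature(data, outside_band(4.0, 6.0))
```

In this sketch, only the instance whose key feature lies inside the band is routed to the transfer learning subset. A classifier trained with source-domain data would then handle that subset, while the remaining instances would be classified using target-domain data that includes the key feature.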