Effective integration of heterogeneous databases and information sources has been studied as one of the most pressing challenges in various fields, such as corporate data management and the life sciences. In this paper, we present an information integration system for drug discovery that employs grid technology. Drug discovery involves a number of stages, including disease modeling, target protein identification, lead compound identification, and clinical trials. Each stage requires its own specific type of data, much of which is stored in dedicated databases. However, those databases are operated independently, and their schemas and data descriptions diverge widely because they are based on different data domains and research backgrounds. Furthermore, the amount of data in these databases has increased very rapidly owing to the completion of the Human Genome Project. To cope with these issues, we use grid technology to integrate the various information sources and develop metadata for describing relationships between heterogeneous data that belong to different domains (for example, protein-compound interactions). We demonstrate the effectiveness of our system by evaluating the results of several example queries and by measuring processing times against geographically separated databases.