Gene chips, which identify the individual genes in a given biological sample, are typical of the next generation of biological data sources. These data are currently stored in a database format defined by the Genetic Analysis Technology Consortium (GATC). To interpret the chip data, we also need information about the genes themselves, as found in the Human Genome Database (HGDB). These two databases were conceived at different times to serve different purposes, and their designs differ significantly. Extracting information simultaneously from multiple databases has proved to be a very difficult problem. We have developed a system that intelligently directs a single client query against a federation of databases. Our solution uses software standards common in the field today (XML, CORBA, and Java), but these standards by themselves are not sufficient. We have developed a new component called the Class Mapper, a software layer unique to each database. Each Class Mapper represents its database as an object-oriented schema consistent with the schema level of the federation. A Federation Platform reads the query, the Class Mappers execute the query across their respective databases, and the Federation Platform returns results to the client.
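
The query flow described above (one client query, one Class Mapper per database, results merged by the Federation Platform) can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the names `GeneRecord`, `ClassMapper`, `ChipDataMapper`, `GenomeDataMapper`, `FederationPlatform`, and `FederationSketch` are hypothetical, the in-memory collections stand in for the real GATC-format and HGDB databases, the sample rows are placeholder values, and the XML and CORBA layers are omitted entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Record in the shared, federation-level schema (field names are hypothetical). */
final class GeneRecord {
    final String source;     // which database produced the record
    final String geneId;     // key shared across the federation
    final String detail;     // database-specific payload, flattened to text

    GeneRecord(String source, String geneId, String detail) {
        this.source = source;
        this.geneId = geneId;
        this.detail = detail;
    }
}

/** One Class Mapper per database: hides the native layout behind the shared schema. */
interface ClassMapper {
    List<GeneRecord> execute(String geneId);
}

/** Stand-in for a chip-data store; each row is {geneId, intensity}. */
final class ChipDataMapper implements ClassMapper {
    private final List<String[]> rows;

    ChipDataMapper(List<String[]> rows) { this.rows = rows; }

    @Override
    public List<GeneRecord> execute(String geneId) {
        List<GeneRecord> out = new ArrayList<>();
        for (String[] row : rows) {
            if (row[0].equals(geneId)) {
                out.add(new GeneRecord("chip", row[0], "intensity=" + row[1]));
            }
        }
        return out;
    }
}

/** Stand-in for a gene-annotation store: geneId -> location string. */
final class GenomeDataMapper implements ClassMapper {
    private final Map<String, String> locationById;

    GenomeDataMapper(Map<String, String> locationById) { this.locationById = locationById; }

    @Override
    public List<GeneRecord> execute(String geneId) {
        List<GeneRecord> out = new ArrayList<>();
        String location = locationById.get(geneId);
        if (location != null) {
            out.add(new GeneRecord("genome", geneId, "location=" + location));
        }
        return out;
    }
}

/** Reads one query, fans it out to every Class Mapper, and merges the results. */
final class FederationPlatform {
    private final List<ClassMapper> mappers;

    FederationPlatform(List<ClassMapper> mappers) { this.mappers = mappers; }

    List<GeneRecord> query(String geneId) {
        List<GeneRecord> results = new ArrayList<>();
        for (ClassMapper mapper : mappers) {
            results.addAll(mapper.execute(geneId));
        }
        return results;
    }
}

class FederationSketch {
    /** Builds a two-database federation from placeholder data. */
    static FederationPlatform demoPlatform() {
        List<String[]> chipRows = new ArrayList<>();
        chipRows.add(new String[] {"GENE_A", "812.5"});
        chipRows.add(new String[] {"GENE_B", "97.0"});

        Map<String, String> locations = new HashMap<>();
        locations.put("GENE_A", "chr1:1000-2000");

        List<ClassMapper> mappers = new ArrayList<>();
        mappers.add(new ChipDataMapper(chipRows));
        mappers.add(new GenomeDataMapper(locations));
        return new FederationPlatform(mappers);
    }

    public static void main(String[] args) {
        for (GeneRecord r : demoPlatform().query("GENE_A")) {
            System.out.println(r.source + " " + r.geneId + " " + r.detail);
        }
    }
}
```

In this sketch the client only ever sees `GeneRecord` objects, so adding a third database would mean writing one more `ClassMapper` and leaving the platform and the client unchanged. That is one plausible reading of "a software layer unique to each database"; the actual system may divide the work differently.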