Scientists face an ever-increasing challenge in investigating biological systems with high-throughput experimental methods such as mass spectrometry and gene arrays, because of the scale and complexity of the data and the need to integrate results broadly with other heterogeneous types of information. Many analyses require merging the experimental results with datasets returned from public databases, such as those hosted by the National Center for Biotechnology Information (NCBI) and the Kyoto Encyclopedia of Genes and Genomes (KEGG), and from protein interaction databases such as the Biomolecular Interaction Network Database (BIND). Because data sources such as these are constantly evolving, the researcher faces hurdles in manually gathering, integrating, and managing the data as cohesive datasets. To overcome these technical problems, we have been building a three-tier software system that includes a client-side graphical user interface for rich interaction with the data, an application server that hides the technical details of data collection, integration, and management tasks from the researcher, and a flexible database schema that efficiently manages mixed data source content. The software is being developed in Java for portability and with open-source technology so that it can eventually be freely distributed. This problem-solving environment, called the Computational Cell Environment (CCE), is designed to provide scalable and agile connectivity to diverse data stores and eventually to provide data retrieval, management, and analysis through all aspects of biological study.
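As an illustration only, the middle tier described above might be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not CCE's actual design: the names `DataSourceAdapter`, `StubAdapter`, `IntegrationService`, and `annotate` are hypothetical, the stub adapter stands in for a real client of a public database, and the values are placeholders rather than real query results.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical adapter interface: one implementation per external data source.
interface DataSourceAdapter {
    String sourceName();
    Map<String, String> fetch(String geneId);
}

// Stub adapter returning a fixed record; a real adapter would query the remote source.
final class StubAdapter implements DataSourceAdapter {
    private final String name;
    private final String key;
    private final String value;

    StubAdapter(String name, String key, String value) {
        this.name = name;
        this.key = key;
        this.value = value;
    }

    public String sourceName() { return name; }

    public Map<String, String> fetch(String geneId) {
        return Collections.singletonMap(key, value);
    }
}

// Hypothetical application-server component: merges experimental results with
// annotations from every registered source, so the client never talks to the
// sources directly.
final class IntegrationService {
    private final List<DataSourceAdapter> adapters = new ArrayList<>();

    void register(DataSourceAdapter adapter) {
        adapters.add(adapter);
    }

    Map<String, String> annotate(String geneId, Map<String, String> experimental) {
        Map<String, String> merged = new LinkedHashMap<>(experimental);
        for (DataSourceAdapter adapter : adapters) {
            for (Map.Entry<String, String> e : adapter.fetch(geneId).entrySet()) {
                // Prefix each field with its source so that provenance is kept.
                merged.put(adapter.sourceName() + "." + e.getKey(), e.getValue());
            }
        }
        return merged;
    }
}

class CceSketchDemo {
    public static void main(String[] args) {
        IntegrationService service = new IntegrationService();
        service.register(new StubAdapter("KEGG", "pathway", "placeholder-pathway-id"));
        Map<String, String> row =
            service.annotate("gene-1", Collections.singletonMap("abundance", "2.4"));
        System.out.println(row);
    }
}
```

In such a layout, a change at one evolving data source would be confined to that source's adapter, while the client and the merged record format stay unchanged.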
Published in:
Computational Systems Bioinformatics Conference, 2005. Workshops and Poster Abstracts. IEEE
Date of Conference: 8-11 Aug. 2005