Data management and analysis for high-throughput DNA sequencing projects

10 Author(s)
A. R. Kerlavage (Dept. of Bioinf., Inst. for Genomic Res., Gaithersburg, MD, USA); W. FitzHugh; A. Gladek; J. Kelley

Rapid advances in molecular biology have begun to shift many of the bottlenecks in genome research from the laboratory to the data analysis facility. This shift has happened so quickly that software development is always catching up with the flow of data. Because such large-scale processes were not anticipated, the analysis infrastructure has not been fully established. Furthermore, most systems that have been built were designed by the biologists who collected the data. More recently, computer scientists, mathematicians, and engineers have taken an interest in the problem, which has had a positive effect by creating a tight synergy between informatics and biology. Several principles shaped the design of the system developed at The Institute for Genomic Research (TIGR). Each of the sample preparation, sequencing, and analysis steps had to be managed, scheduled, and tracked. This information had to be made readily available to those who needed it to carry out their tasks. The different skill levels of the users had to be taken into account. The degree of human intervention at each step had to be evaluated and built into the design. A mixed processing environment of Macintosh and Unix platforms had to be integrated. Most importantly, the system had to save time, reduce error, and ensure uniformity of the analysis and quality of the results. In the authors' experience, the tools they have built work well because of their early decisions about which systems to use for development. The authors settled on a robust relational database management system (Sybase) and a portable development environment (C, C++).

Published in:

IEEE Engineering in Medicine and Biology Magazine (Volume: 14, Issue: 6)