This paper presents the design and implementation of a distributed bioinformatics computing system for the analysis of DNA sequences. The system could be used for disease detection, criminal forensic analysis, and protein analysis. Several distributed algorithms are developed for searching for and identifying triplet repeat patterns in a given DNA sequence. The search algorithm computes the number of occurrences of a given pattern in a given gene sequence. A distributed subsequence identification algorithm was designed and implemented to detect repeating patterns. Sequential and distributed implementations of these algorithms were executed with different triplet repeat search patterns and genetic sequences, using DNA sequences that ranged in length from very small to very large. The performance of the distributed algorithms is compared with that of the sequential approach.
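
As an illustration of the search task described above, the following is a minimal sketch of counting a triplet pattern sequentially and over partitioned chunks. It is not the paper's implementation: the function names (`count_occurrences`, `split_with_overlap`, `count_distributed`), the chunking scheme, and the use of a local thread pool as a stand-in for distributed worker nodes are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def count_occurrences(sequence: str, pattern: str) -> int:
    """Sequential baseline: count occurrences of pattern, overlaps included."""
    if not pattern:
        raise ValueError("pattern must be non-empty")
    count = 0
    start = sequence.find(pattern)
    while start != -1:
        count += 1
        # Advance by one position so overlapping matches are also counted.
        start = sequence.find(pattern, start + 1)
    return count


def split_with_overlap(sequence: str, pattern_len: int, n_chunks: int) -> list[str]:
    """Split the sequence into chunks for the workers (hypothetical scheme).

    Each chunk extends pattern_len - 1 characters into the next one, so a
    match that straddles a chunk boundary is seen by exactly one worker.
    """
    chunk_size = max(1, -(-len(sequence) // n_chunks))  # ceiling division
    overlap = pattern_len - 1
    return [
        sequence[i:i + chunk_size + overlap]
        for i in range(0, len(sequence), chunk_size)
    ]


def count_distributed(sequence: str, pattern: str, n_workers: int = 4) -> int:
    """Count matches per chunk in parallel and sum the partial counts.

    Threads stand in for distributed nodes here (an assumption for the sketch).
    """
    chunks = split_with_overlap(sequence, len(pattern), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial_counts = pool.map(lambda c: count_occurrences(c, pattern), chunks)
        return sum(partial_counts)
```

For example, `count_distributed("CAGCAGCAGTTCAGCAG", "CAG", n_workers=4)` returns 5, the same value as `count_occurrences` on the whole sequence. The overlap of `len(pattern) - 1` characters is what keeps the partitioned count equal to the sequential one without double counting.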