We analyzed the amino acid-coupling sequence patterns deduced from 74 mesophilic and 15 thermophilic genomes. An amino acid-coupling sequence pattern is defined as any two types of amino acids separated by one or more amino acids. We found that the distributions of amino acid-coupling sequence patterns differ significantly between thermophilic and mesophilic proteomes. For example, patterns that tend to form local salt bridges (such as KXnE) are usually preferred by thermophiles; patterns containing glutamate and valine are usually favored by thermophiles, but only some of them are statistically significant; most patterns containing cysteine appear to occur more often in mesophiles, but most of them are statistically insignificant, except for patterns like CXnP or CXnC, which are favored by thermophiles. Although previous studies based on global amino acid compositions indicate that glutamate is one of the amino acids most favored by thermophiles, we found that EXnT, EXnH and EXnQ are statistically significant patterns favored by mesophiles. We also identified sequence patterns that can effectively distinguish between thermophilic and mesophilic genomes. By combining the statistically significant amino acid-coupling sequence patterns (those with the lowest p-values), we found a good linear relationship between these sequence patterns and the optimal growth temperatures of the genomes.
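
To make the pattern definition concrete, the following is a minimal sketch of how patterns of the form AXnB might be counted in a protein sequence. It is an illustration and not the procedure used in the study: the function names, the treatment of patterns as ordered pairs, the upper limit on the gap length (`max_gap`), and the normalization by the number of pairs at each gap length are all assumptions made here.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def count_coupling_patterns(sequence, max_gap=3):
    """Count ordered patterns (a, n, b): residue a, then n residues, then b.

    Hypothetical helper; max_gap is an illustrative cutoff, not a value
    taken from the study.
    """
    counts = Counter()
    seq = sequence.upper()
    for n in range(1, max_gap + 1):
        step = n + 1  # distance between the two coupled positions
        for i in range(len(seq) - step):
            a, b = seq[i], seq[i + step]
            # Skip ambiguous or non-standard residue codes such as X or B.
            if a in AMINO_ACIDS and b in AMINO_ACIDS:
                counts[(a, n, b)] += 1
    return counts


def pattern_frequencies(sequences, max_gap=3):
    """Pool counts over a proteome and normalize within each gap length."""
    totals = Counter()
    for seq in sequences:
        totals.update(count_coupling_patterns(seq, max_gap))
    per_gap = Counter()
    for (_, n, _), c in totals.items():
        per_gap[n] += c
    return {key: c / per_gap[key[1]] for key, c in totals.items()}


if __name__ == "__main__":
    toy_proteome = ["KAEKAAE", "MKVEELLK"]  # toy sequences, not real data
    freqs = pattern_frequencies(toy_proteome)
    for n in (1, 2, 3):
        print(f"KX{n}E frequency:", freqs.get(("K", n, "E"), 0.0))
```

Per-proteome frequencies computed in this way could then be compared between the thermophilic and mesophilic groups with a statistical test of the reader's choice; the test used in the study is not stated in this abstract.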