Here we apply the graph-theoretic concept of betweenness centrality to a class of protein repeats such as Armadillo (ARM) and HEAT. The betweenness of a node measures how often that node lies on the shortest paths between all pairs of nodes i, j in the network, and thus quantifies the node's contribution to connectivity across the network. Such repeats are not easily detectable at the sequence level because of low conservation between the individual repeated units; HEAT repeats, for example, are known to share less than 13% identity. Their identification at the structure level typically involves structure-structure self-comparison, which can be computationally intensive. Our analysis of a set of proteins from the ARM and HEAT repeat families shows that the repeating units within the repeat regions exhibit similar connectivity patterns. Because it is generally accepted that, in many networks, a node of higher degree is more likely to lie on many of the shortest paths, computing vertex betweenness provides a simple and elegant approach for identifying tandem structural repeats in proteins.
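To make the betweenness measure concrete, the following is a minimal sketch of how vertex betweenness can be computed on an unweighted, undirected graph using Brandes' algorithm. It is an illustration only, not the implementation used in this work: the function name `betweenness`, the adjacency-dictionary input, and the five-node path graph are assumptions introduced here, and how the protein structure is converted into a network (for example, which residue contacts become edges) is not specified in this section.

```python
from collections import deque


def betweenness(adj):
    """Vertex betweenness for an unweighted, undirected graph (Brandes' algorithm).

    adj: dict mapping each node to an iterable of its neighbours
         (hypothetical input format chosen for this sketch).
    Returns a dict node -> number of shortest paths between other node
    pairs that pass through it, with ties shared fractionally.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []                         # nodes in order of increasing distance
        preds = {v: [] for v in adj}       # predecessors on shortest paths
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:            # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair (i, j) is visited twice.
    return {v: b / 2 for v, b in bc.items()}


# Toy example: a linear chain of five nodes, 0-1-2-3-4 (not a real protein).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
scores = betweenness(path)
print(scores)  # {0: 0.0, 1: 3.0, 2: 4.0, 3: 3.0, 4: 0.0}
```

In this toy chain the central node has the highest score because it lies on the shortest path for four of the node pairs, while the end nodes lie on none. For a repeat protein, one would plot such a per-residue betweenness profile along the sequence and look for a recurring pattern across the repeating units.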