This paper describes a new approach for locating signals, such as promoter sequences, in nucleic acid sequences. Transcription factor (TF) binding to its DNA target site is a fundamental regulatory interaction. The most common model of TF binding specificity is the position weight matrix (PWM), which assumes that binding positions are independent of one another. In many cases, however, this simplifying assumption does not hold. We present a chi-square (χ²) distance model, a novel probabilistic method for modeling TF-DNA interactions that represents TF binding specificities by the χ² distance between the profiles of component vectors. Simulation results show that the proposed approach identifies TF binding sites significantly better than the PWM model.
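To make the contrast between the two models concrete, the following is a minimal sketch, not the paper's actual formulation. It assumes that a "profile" is the vector of base-pair frequencies over every pair of positions, and that the distance is the common symmetric form, the sum of (p - q)² / (p + q) over components. The function names (`pwm_frequencies`, `pwm_score`, `pair_profile`, `chi2_distance`) and the toy binding sites are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations, product

ALPHABET = "ACGT"


def pwm_frequencies(sites, pseudocount=1.0):
    """Per-position base frequencies; each column is estimated independently."""
    length = len(sites[0])
    freqs = []
    for pos in range(length):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        freqs.append({base: counts[base] / total for base in ALPHABET})
    return freqs


def pwm_score(seq, freqs, background=0.25):
    """Log-odds PWM score: a sum of independent per-position terms."""
    return sum(math.log2(freqs[i][base] / background) for i, base in enumerate(seq))


def pair_profile(seqs):
    """ASSUMED profile: frequency of each base pair at each pair of positions."""
    length = len(seqs[0])
    profile = []
    for i, j in combinations(range(length), 2):
        for a, b in product(ALPHABET, repeat=2):
            hits = sum(1 for s in seqs if s[i] == a and s[j] == b)
            profile.append(hits / len(seqs))
    return profile


def chi2_distance(p, q):
    """ASSUMED symmetric chi-square distance; components with p + q == 0 are skipped."""
    return sum((x - y) ** 2 / (x + y) for x, y in zip(p, q) if x + y > 0)


# Hypothetical toy sites in which the first and last positions co-vary (A..T or T..A).
sites = ["ACGT", "ACGT", "TCGA", "TCGA"]
freqs = pwm_frequencies(sites)
site_profile = pair_profile(sites)

for candidate in ("ACGT", "ACGA"):
    score = pwm_score(candidate, freqs)
    dist = chi2_distance(pair_profile([candidate]), site_profile)
    print(candidate, round(score, 3), round(dist, 3))
```

In this toy setting the PWM gives both candidates the same score, because each column is scored on its own, whereas the pairwise profile places ACGA, which breaks the co-variation between the first and last positions, farther from the known sites than ACGT. A smaller distance here means a closer match to the known sites.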