In this paper, we bring to bear new tools for analyzing non-stationary signals that have emerged in the statistics and signal processing communities over the past few years. We use these methods to shed new light on, and help resolve, two issues: (i) whether long-range correlations exist in DNA sequences and (ii) whether they are present in both coding and non-coding segments or only in the latter. It turns out that the statistical differences between coding and non-coding segments are much more subtle than previously thought on the basis of stationary analysis. In particular, both coding and non-coding sequences exhibit long-range correlations, as indicated by a 1/f^β(n) evolutionary (i.e., time-dependent) spectrum. However, we use an index of randomness, which we derive from the Hilbert-Huang Transform, to demonstrate that coding sequences, although not random as previously suspected, are often "more random" (i.e., more white) than non-coding sequences. Moreover, the study of the evolution of the rate of change of these time-dependent parameters in homologous gene families shows a sudden jump around the rat, which might be related to the well-known supercharged evolution of this rodent.
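To make the notion of a position-dependent exponent β(n) concrete, the following is a minimal sketch that estimates β on sliding windows by a log-log fit to the periodogram. This is a simpler stand-in, not the evolutionary-spectrum or Hilbert-Huang analysis used in the paper. The function names, the window parameters, and the purine/pyrimidine numeric mapping are all assumptions introduced for illustration.

```python
import numpy as np

def dna_to_signal(seq):
    # Hypothetical numeric mapping (an assumption, not taken from the paper):
    # purines (A, G) map to +1, pyrimidines (C, T) map to -1.
    return np.array([1.0 if base in "AG" else -1.0 for base in seq.upper()])

def spectral_exponent(x):
    """Estimate beta in S(f) ~ 1/f**beta from a log-log fit to the periodogram."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the DC component
    power = np.abs(np.fft.rfft(x)) ** 2    # raw periodogram (unnormalized)
    freqs = np.fft.rfftfreq(x.size)        # frequencies in cycles per sample
    keep = (freqs > 0) & (power > 0)       # drop f = 0 and any zero-power bins
    slope, _ = np.polyfit(np.log(freqs[keep]), np.log(power[keep]), 1)
    return -slope                          # S(f) ~ f**(-beta), so beta = -slope

def windowed_exponent(x, window, step):
    """Estimate beta(n) on sliding windows starting at n = 0, step, 2*step, ..."""
    starts = range(0, len(x) - window + 1, step)
    return np.array([spectral_exponent(x[s:s + window]) for s in starts])

# Sanity check on synthetic data: white noise should give beta near 0,
# while its cumulative sum (a random walk) should give beta near 2.
rng = np.random.default_rng(0)
white = rng.standard_normal(4096)
walk = np.cumsum(white)
beta_white = spectral_exponent(white)
beta_walk = spectral_exponent(walk)
betas = windowed_exponent(walk, window=1024, step=512)
```

In this reading, a value of β close to 0 corresponds to a "more white" (more random) segment, and larger β to stronger long-range correlation. The raw periodogram fit is noisy and biased at high frequencies, so it should be taken only as an illustration of the quantity being discussed.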
Date of Conference: 8-10 June 2008