This paper provides a new method for analyzing floating-point roundoff error in digital filters using "finite signal-to-noise" models, whose noise sources have variances proportional to the variance (power) of the signals they corrupt. With this model, a new expression for the output error covariance of floating-point arithmetic is derived for the case of double or extended precision accumulation. The output error covariance shows that the optimal state-space realization for floating-point arithmetic is the same as that for the fixed-point case, with two exceptions: when the filter has poles extremely close to the unit circle, or when the final quantization is to a precision shorter than single precision. An explicit formula is found for determining the minimum number of mantissa bits required for a stable realization.
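As a rough illustration of the finite signal-to-noise idea, the sketch below uses a toy scalar filter x[k+1] = a*x[k] + w[k] + u[k], where the roundoff noise w has variance sigma2 times the variance of x. This is not the paper's formula: the scalar model, the function names, and the assumed relative-error variance 2^(-2B)/3 for B mantissa bits are all hypothetical choices made for the example. Under these assumptions the steady-state variance X satisfies X = a^2 X + sigma2 X + U, which is finite only when a^2 + sigma2 < 1.

```python
import math


def fsn_state_variance(a, sigma2, input_var=1.0):
    """Steady-state variance of x[k+1] = a*x[k] + w[k] + u[k].

    Assumed toy model: Var(w) = sigma2 * Var(x), with u and w white and
    mutually uncorrelated. Returns math.inf when the recursion is not
    mean-square stable.
    """
    margin = 1.0 - a * a - sigma2
    if margin <= 0.0:
        return math.inf
    return input_var / margin


def min_mantissa_bits(a, max_bits=64):
    """Smallest bit count B keeping the toy model mean-square stable.

    Assumption: relative roundoff error is uniform on [-2**-B, 2**-B],
    so its variance is 2**(-2*B) / 3. Returns None if no B up to
    max_bits works (for example when |a| >= 1).
    """
    for bits in range(1, max_bits + 1):
        sigma2 = 2.0 ** (-2 * bits) / 3.0
        if a * a + sigma2 < 1.0:
            return bits
    return None


if __name__ == "__main__":
    for pole in (0.5, 0.9, 0.999):
        print(pole, min_mantissa_bits(pole))
```

In this toy model, moving the pole toward the unit circle shrinks the stability margin 1 - a^2, so more mantissa bits are needed before the signal-dependent noise stops destabilizing the variance recursion.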