We improve the existing achievable rate regions for causal and for zero-delay source coding of stationary Gaussian sources under an average mean squared error distortion measure. To begin with, we find a closed-form expression for the information-theoretic causal rate-distortion function (RDF) under this distortion measure, denoted by $R_c^{it}(D)$, for first-order Gauss-Markov processes. $R_c^{it}(D)$ is a lower bound to the optimal performance theoretically attainable (OPTA) by any causal source code, namely $R_c^{op}(D)$. We show that, for Gaussian sources, the latter can also be upper bounded as $R_c^{op}(D) \le R_c^{it}(D) + 0.5\log_2(2\pi e)$ bits/sample. In order to analyze $R_c^{it}(D)$ for arbitrary zero-mean Gaussian stationary sources, we introduce $\bar{R}_c^{it}(D)$, the information-theoretic causal RDF when the reconstruction error is jointly stationary with the source. Based upon $\bar{R}_c^{it}(D)$, we derive three closed-form upper bounds to the additive rate loss defined as $\bar{R}_c^{it}(D) - R(D)$, where $R(D)$ denotes Shannon's RDF. Two of these bounds are strictly smaller than 0.5 bits/sample at all rates. These bounds differ from one another in their tightness and ease of evaluation; the tighter the bound, the more involved its evaluation. We then show that, for any source spectral density and any positive distortion $D \le \sigma_x^2$, $\bar{R}_c^{it}(D)$ can be realized by an additive white Gaussian noise channel surrounded by a unique set of causal pre-, post-, and feedback filters. We show that finding such filters constitutes a convex optimization problem. In order to solve the latter, we propose an iterative optimization procedure that yields the optimal filters and is guaranteed to converge to $\bar{R}_c^{it}(D)$. Finally, by establishing a connection to feedback quantization, we design a causal and a zero-delay coding scheme which, for Gaussian sources, achieve operational rates lower than $\bar{R}_c^{it}(D) + 0.254$ and $\bar{R}_c^{it}(D) + 1.254$ bits/sample, respectively.
This implies that the OPTA among all zero-delay source codes, denoted by $R_{zd}^{op}(D)$, is upper bounded as $R_{zd}^{op}(D) < \bar{R}_c^{it}(D) + 1.254 < R(D) + 1.754$ bits/sample.
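
For orientation, the bounds stated above can be collected into two chains. This is a summary sketch rather than a result quoted from the paper. It assumes that the causal information-theoretic RDF is no smaller than Shannon's RDF, and that every zero-delay code is also a causal code. The numerical value 2.05 is our own evaluation of $0.5\log_2(2\pi e)$.

\[
R(D) \;\le\; R_c^{it}(D) \;\le\; R_c^{op}(D) \;\le\; R_c^{it}(D) + \tfrac{1}{2}\log_2(2\pi e) \;\approx\; R_c^{it}(D) + 2.05 ,
\]
\[
R_c^{op}(D) \;\le\; R_{zd}^{op}(D) \;<\; \bar{R}_c^{it}(D) + 1.254 \;<\; R(D) + 1.754 .
\]

The last inequality uses the fact that the rate loss $\bar{R}_c^{it}(D) - R(D)$ is strictly smaller than 0.5 bits/sample, so that $1.254 + 0.5 = 1.754$.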