Recent studies have shown that retransmissions can cause heavy-tailed transmission delays even when packet sizes are light-tailed. Moreover, the impact of heavy-tailed delays persists even when packet sizes are upper bounded. The key question we study in this paper is how the use of coding techniques to transmit information, together with different system configurations, affects the distribution of delay. To investigate this problem, we model the underlying channel as a Markov-modulated binary erasure channel, where transmitted bits are either received successfully or erased. Erasure codes are used to encode information prior to transmission, so that successfully receiving a fixed fraction of the bits in the codeword suffices for decoding. We use incremental redundancy codes, where the codeword is divided into codeword trunks and these trunks are transmitted one at a time to provide incremental redundancy to the receiver until the information is recovered. We characterize the distribution of delay under two scenarios: 1) the decoder uses memory to cache all previously successfully received bits, and 2) the decoder does not use memory, so received bits are discarded if the corresponding information cannot be decoded. In both cases, we consider codeword lengths with infinite and finite support. From a theoretical perspective, our results provide a benchmark to quantify the tradeoff between system complexity and the distribution of delay.
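
As an illustration only, the channel and decoder model described above can be sketched as a small simulation. Everything specific here is an assumption made for the sketch and is not taken from the paper: the function name `transmission_delay`, a two-state channel ("good" and "bad") with the given erasure and state-holding probabilities, a decoding attempt after each complete trunk, delay measured in channel uses, and a memoryless decoder that discards its bits once the whole codeword has been sent without success.

```python
import math
import random


def transmission_delay(codeword_len, beta, trunk_len, use_memory, rng,
                       erase_prob=(0.1, 0.9), stay_prob=(0.95, 0.8),
                       max_bits=1_000_000):
    """Return the number of channel uses until decoding, or None if capped.

    All parameters are hypothetical: beta is the fraction of codeword bits
    needed to decode; erase_prob[s] and stay_prob[s] are the erasure and
    state-holding probabilities of channel state s (0 = good, 1 = bad).
    """
    needed = math.ceil(beta * codeword_len)  # bits required to decode
    received = set()  # codeword positions currently held by the decoder
    state = 0         # current channel state
    sent = 0          # channel uses so far
    while sent < max_bits:
        # Send the codeword one trunk at a time.
        for start in range(0, codeword_len, trunk_len):
            for pos in range(start, min(start + trunk_len, codeword_len)):
                # One uniform draw decides erasure, one the next state.
                if rng.random() >= erase_prob[state]:
                    received.add(pos)
                if rng.random() >= stay_prob[state]:
                    state = 1 - state
                sent += 1
            # Assumed: the decoder tries to decode after each full trunk.
            if len(received) >= needed:
                return sent
        if not use_memory:
            # Assumed: without memory, a failed pass discards all bits.
            received.clear()
    return None


if __name__ == "__main__":
    for use_memory in (True, False):
        delays = [transmission_delay(100, 0.7, 20, use_memory, random.Random(seed))
                  for seed in range(2000)]
        finished = sorted(d for d in delays if d is not None)
        print("memory" if use_memory else "no memory",
              "mean:", sum(finished) / len(finished), "max:", finished[-1])
```

Running both configurations over the same seeds exposes them to the same channel realization, so the empirical delay samples can be compared directly. With memory, the decoder always holds a superset of the bits held without memory, so on any given sample path it decodes no later.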