Hierarchical Trellis Based Decoder for Total Variation Sequence Detection (TVSD) in Space-Time Trellis Coded (STTC) Wireless Image/Video Communication

This paper proposes a robust reconstruction scheme for image/video transmission in space-time trellis coding (STTC) based MIMO wireless multimedia sensor networks (WMSN). The information bits of the image/video stream are modulated using STTC prior to transmission over the MIMO wireless channel between the multimedia sensors and the cluster head. At the receiver, a novel total variation sequence detector (TVSD) is developed that employs an anisotropic total variation (TV) regularized cost function to leverage the bounded variation property of the image/video stream, in addition to the coding and diversity advantages of the STTC, for improved image/video recovery. For reconstruction, a novel bi-layered hierarchical trellis based modified Viterbi decoder is developed to decode the optimal bit sequence corresponding to the least TV cost function using appropriately modified node and branch metrics. Simulation results using several test images and video sequences demonstrate the improved reconstruction performance of the proposed scheme, especially in the low SNR regime (0-10 dB), in comparison to the conventional maximum likelihood (ML) decoder, which makes it ideally suited for implementation in practical WMSN.


I. INTRODUCTION
Wireless multimedia sensor networks (WMSN) have gained popularity due to their applicability in a diverse spectrum of applications such as surveillance, law enforcement, industrial monitoring, health care, agriculture and UAVs, among others [1], [2]. The fabrication of multimedia-capable sensor nodes has been made possible by recent advances in the development of low-cost and sophisticated CMOS-based imaging and audio sensors [3]. In contrast to traditional wireless sensor nodes [4], [5] that sense, process and transmit measurements of parameters such as temperature, pressure and humidity, WMSN generate and process multimedia content comprising audio, image and video streams. A typical WMSN is shown in Fig. 1. Given the stringent computational and battery constraints [6]-[8] of the miniature sensor nodes, it is imperative to develop low-complexity image/video recovery techniques to achieve high-quality image/video reconstruction at the base station/fusion center. This is especially important in WMSN since the fading nature of the wireless channel can lead to a significant degradation in the quality of reconstruction [9]. A brief review of the existing research in this context is presented below.

A. REVIEW OF WORKS IN EXISTING LITERATURE
Several techniques have been proposed in the existing literature for reliable image/video reconstruction in wireless sensor networks (WSN). For instance, the authors in [10] proposed a network adaptive image/video transmission scheme for sensor networks based on quality-scalable coding for multipath transmission. The work in [11] proposes a real-time error correction scheme for WSN based on predictive models that exploit the temporal correlation in sensor data. A significant advantage of the scheme proposed therein is that it neither requires additional resources nor increases the computational complexity at the sensor nodes. The multiple description (MD) source coding based approach in [12]-[14] uses a correlating block transform at the application layer, thus introducing controlled redundancy in the transmitted data to enhance the performance of multimedia streaming in wireless networks. However, such approaches are not suitable for WMSN due to the increased bandwidth overhead caused by the redundancy in the bit stream, which can be significant for a WMSN with a large number of transmitting sensor nodes. Traditional schemes such as automatic repeat request (ARQ) [15] that deal with packet losses in sensor networks also lead to a decrease in the effective spectral efficiency. Interestingly, the work in [16] demonstrates that coding schemes over multiple parallel channels, e.g. frequency bands, antennas or time slots, which leverage the diversity properties of the fading channel, are more efficient over Rayleigh fading channels than source coding schemes. The development of multiple-input multiple-output (MIMO) technology, which supports increased data rates for the same transmit power, coupled with its diversity advantages [17]-[19] for increased reliability, has led to a radical improvement in the quality and efficiency of wireless communication.
Its energy and spectral efficiency naturally render MIMO attractive for implementation in WMSN [20], and MIMO wireless systems have been employed for efficient and reliable video transmission in various recent works such as [21]-[23]. Further, STTC [24] is a popular technique that can yield additional coding and diversity benefits in such systems [25], in turn decreasing the error rate of communication. An STTC typically employs a low complexity generator matrix for encoding the information bits [26] at the transmitter, which makes it a good choice for WMSN, and decoding at the receiver can be performed via the efficient Viterbi algorithm [27] based trellis decoder. Hence, STTC has been popular in the literature for robust communication in multi-antenna systems. However, the performance of STTC specifically for image/video transmission can be significantly enhanced by leveraging the bounded variation (BV) property [28] that is a fundamental characteristic of multimedia signals. TV regularization based image/video reconstruction schemes that exploit the BV property have gained popularity for several image/video processing applications, such as noise removal [29], deblurring [30]-[32], interpolation [33], inpainting [34], [35], super resolution (SR) based reconstruction [36]-[39] and structure decomposition [40]. This is due to the edge preserving property [41] of the l1-norm based anisotropic TV regularization term, which can lead to a significant improvement in reconstruction quality [42]. Previous works [43], [44] have described error resilient techniques for image/video transmission in wireless networks that exploit the BV property via a Viterbi algorithm based TVSD technique for reconstruction. Furthermore, this is well suited for sensor networks since it does not increase the processing complexity at the sensors.
This work therefore proposes a MIMO-STTC encoder for multimedia data, followed by a novel joint STTC-TV decoder at the cluster head to improve the quality of image/video recovery. The various contributions of this work are described next.

B. CONTRIBUTIONS OF THIS PAPER
This work develops a framework for STTC-based transmission and detection of image/video frames. At the receiver, a TV regularized cost function is proposed for improved reconstruction of the received frames. Another contribution of this work is a novel hierarchical trellis structure, comprising two layers, which is developed for optimal decoding of the received signal stream. The coded component (CC) layer of the trellis comprises nodes that correspond to the various possible quantization levels and pixel intensities for entropy coded and uncompressed streams, respectively. The other layer, termed the STTC symbol (ST) layer, has nodes corresponding to the encoder states of the conventional STTC trellis. A novel modified Viterbi decoder, with suitably defined branch and state metrics, is developed to obtain the optimal decoded stream for the above hierarchical trellis, which also incorporates an l1-norm based TV regularization term. Simulation results demonstrate the improved STTC-TV reconstruction in comparison to the conventional STTC-ML decoder.

C. ORGANIZATION OF THE PAPER
The outline of the paper is as follows. Section II presents an STTC coded wireless communication framework for image/video stream transmission in a WMSN, followed by a brief description of the conventional STTC-ML image/video decoder. The proposed TV regularization based STTC-TV scheme for robust image/video recovery and the hierarchical trellis based Viterbi decoder for joint STTC and TV decoding are proposed in Section III. Section IV describes the simulation setup for the proposed image/video reconstruction scheme together with the various visual and numerical results, and also compares them with those obtained from conventional recovery schemes. This is followed by the conclusion in Section V.

D. SYMBOLS AND NOTATIONS
The notation used in this paper is as follows. Vectors and matrices are denoted by lowercase boldface letters x and uppercase boldface letters X, respectively. Scalars are denoted by lowercase letters x. E{X} denotes the expectation operator over the random variable X, which computes its stochastic average with respect to its probability density function (PDF). The estimate of the signal vector x is denoted by x̂. R, C and Z denote the sets of real numbers, complex numbers and integers, respectively. An identity matrix of size N × N is represented by I_N. The cardinality of a set A is denoted by |A|. N[µ, C] denotes the multivariate Gaussian distribution with mean µ and covariance matrix C.

II. SPACE-TIME TRELLIS CODING BASED WIRELESS IMAGE/VIDEO TRANSMISSION SYSTEM
Consider a digital video stream consisting of P frames, where the pth frame, denoted by A_p ∈ Z^{R×C} for 1 ≤ p ≤ P, comprises R rows and C columns of pixels. Note that this model also covers image transmission by setting P = 1. The quantity A_p(i, j) represents the intensity level of the pixel in the ith row and the jth column of the pth frame. The frames can potentially be encoded using a pertinent source coding scheme, such as JPEG for images or MPEG for videos [45], [46], to compress the stream in order to meet the limited bandwidth constraint at the sensor node. Towards this end, each frame is divided into blocks of size b × b pixels, resulting in C_s = C/b blocks in each row and R_s = R/b blocks in each column. Typically b = 8 for popular schemes such as JPEG and MPEG, while b = 1 corresponds to the transmission of the uncompressed image/video frame. The (r, l)th block of the video frame, where 1 ≤ r ≤ R_s and 1 ≤ l ≤ C_s, is encoded into Q^p_{r,l} coded components (e.g. DC, AC components). For digital modulation prior to transmission, the kth coded component of the block is subsequently binary coded with B^{p,k}_{r,l} information bits. Let the resulting sequence of B^{p,k}_{r,l} information bits be denoted by the vector x^{p,k}_{r,l} = [x^{p,k}_{r,l}(1), ..., x^{p,k}_{r,l}(B^{p,k}_{r,l})]^T, where each x^{p,k}_{r,l}(i) ∈ {0, 1}. The coded bits corresponding to the kth coded components of all the blocks in the rth row are further concatenated as the vector x^{p,k}_r = [(x^{p,k}_{r,1})^T, ..., (x^{p,k}_{r,C_s})^T]^T. The next section describes the STTC for encoding the information bit vector sequence x^{p,k}_r in a MIMO wireless communication system.
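As an illustrative sketch, the blocking and binary-coding steps above can be written in a few lines of Python; the helper names (`frame_to_blocks`, `to_bits`) are ours for illustration only and are not part of the proposed scheme.

```python
import numpy as np

def to_bits(value, n_bits=8):
    """Binary-code an integer coded component into n_bits bits, MSB first."""
    return [(value >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]

def frame_to_blocks(frame, b=8):
    """Split an R x C frame into an (R/b) x (C/b) grid of b x b blocks."""
    R, C = frame.shape
    assert R % b == 0 and C % b == 0, "frame dimensions must be multiples of b"
    return frame.reshape(R // b, b, C // b, b).swapaxes(1, 2)

frame = np.arange(16, dtype=np.uint8).reshape(4, 4)
blocks = frame_to_blocks(frame, b=2)      # shape (R_s, C_s, b, b) = (2, 2, 2, 2)
bits = to_bits(147)                       # -> [1, 0, 0, 1, 0, 0, 1, 1]
```

For a real 256 × 256 frame with b = 8 this yields the R_s = C_s = 32 grid of blocks described above.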
A. STTC FOR IMAGE/VIDEO TRANSMISSION
Consider a MIMO system with N_t transmit and N_r receive antennas [47] employing an M-ary digital symbol constellation for modulation. For instance, M = 2 and M = 4 represent BPSK and QPSK, respectively. Let m = log2(M) denote the number of bits per symbol. STTC is applied to the information bits using an appropriate generator matrix to obtain the space-time coded symbols that are transmitted over the MIMO wireless channel, as follows. The coded video bit stream x^{p,k}_r is partitioned into blocks of m + s bits, where s denotes the number of memory bits, to obtain the input bit vector u_t on which the encoder acts at time instant t. Hence, the number of encoder memory states is V = 2^s. Let G ∈ R^{(m+s)×N_t} denote the generator matrix of the STTC. Using the procedure given in [48], the coded symbol index vector d_t corresponding to the M-ary constellation at time instant t is generated as d_t = (u_t^T G) mod M ∈ Z_M^{N_t}, where Z_M = {0, ..., M − 1} is the set of possible M-ary symbol indices. The initial encoder state is assumed to be zero, i.e. x^{p,k}_r(t) = 0 for t ≤ 0. Thus, the process above transforms the m + s coded image/video bits at time t into N_t trellis coded symbol indices. The ith element of d_t is mapped to the corresponding M-ary constellation symbol, yielding the transmit symbol vector ξ^{p,k}_{r,t} and the symbol vector sequence Ξ^{p,k}_r = [ξ^{p,k}_{r,1}, ..., ξ^{p,k}_{r,T}], where T is the STTC block length corresponding to the transmission of the information bits in x^{p,k}_r. The MIMO wireless channel model for STTC coded wireless transmission of image/video frames is detailed next.
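The mapping from the (m + s)-bit encoder input to the N_t symbol indices can be sketched as follows. This is a minimal illustration assuming the classic 4-state QPSK delay-diversity generator matrix and a simple shift-register memory update; the actual codes used in this work are those of [48].

```python
import numpy as np

def sttc_encode(bits, G, m, s, M):
    """Sketch of an STTC encoder: at each step the m current input bits are
    stacked with the s memory bits (shift register initialised to zero) and
    d = (u @ G) mod M gives one M-ary symbol index per transmit antenna."""
    memory = [0] * s
    symbols = []
    for t in range(0, len(bits), m):
        cur = bits[t:t + m]
        u = np.array(cur + memory)          # (m+s)-bit encoder input
        d = (u @ G) % M                     # one index per Tx antenna
        symbols.append(d)
        memory = (cur + memory)[:s]         # keep the s most recent bits
    return np.array(symbols)                # shape (T, N_t)

# 4-state QPSK delay-diversity code: antenna 1 sends the current symbol,
# antenna 2 the previous one
G = np.array([[2, 0],
              [1, 0],
              [0, 2],
              [0, 1]])
indices = sttc_encode([1, 0, 0, 1], G, m=2, s=2, M=4)   # -> [[2, 0], [1, 2]]
```

The second output vector [1, 2] shows the delay-diversity behavior: antenna 2 repeats the index 2 that antenna 1 transmitted in the previous instant.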

B. MIMO WIRELESS COMMUNICATION SYSTEM MODEL
The transmit symbol vector ξ^{p,k}_{r,t} generated as shown in (5) is subsequently transmitted over an N_r × N_t MIMO wireless channel. Let y^{p,k}_{r,t} denote the corresponding received symbol vector, which is related to the transmit symbol vector ξ^{p,k}_{r,t} as [49]

y^{p,k}_{r,t} = H^{p,k}_{r,t} ξ^{p,k}_{r,t} + η^{p,k}_{r,t},

where H^{p,k}_{r,t} ∈ C^{N_r×N_t} represents the MIMO wireless channel matrix, with H^{p,k}_{r,t}(i, j) denoting the wireless channel fading coefficient between the jth transmit and the ith receive antenna at time instant t. The quantity η^{p,k}_{r,t} ∈ C^{N_r×1} represents the circularly symmetric complex additive white Gaussian noise vector at the receiver with zero mean and covariance matrix σ²_η I_{N_r}, where σ²_η denotes the noise variance. The conventional trellis based ML decoder for STTC coded image/video transmission over the MIMO wireless channel is described next, followed by the proposed scheme for improved decoding.
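A minimal simulation of this channel model, assuming i.i.d. Rayleigh fading with CN(0, 1) entries and a unit-power QPSK mapping exp(j(πd/2 + π/4)) for the symbol indices (the exact constellation mapping is our assumption here, not taken from the paper), could look like:

```python
import numpy as np

rng = np.random.default_rng(0)
N_t, N_r = 2, 2
snr_db = 10.0
sigma_eta2 = 10 ** (-snr_db / 10)   # noise variance for unit symbol power

# Rayleigh-fading channel matrix with i.i.d. CN(0, 1) entries
H = (rng.standard_normal((N_r, N_t))
     + 1j * rng.standard_normal((N_r, N_t))) / np.sqrt(2)

# unit-power QPSK symbols from the indices d (assumed Gray-less mapping)
d = np.array([2, 0])
xi = np.exp(1j * (np.pi / 2 * d + np.pi / 4))

# received vector: y = H xi + eta
eta = np.sqrt(sigma_eta2 / 2) * (rng.standard_normal(N_r)
                                 + 1j * rng.standard_normal(N_r))
y = H @ xi + eta
```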

Conventionally, the symbol vector sequence estimate Ξ̂^{p,k}_r for the STTC coded video stream x^{p,k}_r is obtained using the Viterbi algorithm based ML decoder [50], which is described as follows. Consider a trellis structure with V = 2^s states and T stages. The ith state in each stage is denoted by s_i, which corresponds to one of the V encoder states of the STTC. Each transition from state s_i to state s_j corresponds to the transmission of m bits of the coded image/video bit stream at time t, and the corresponding STTC encoded symbol vector is determined using the generator matrix G. The Viterbi algorithm at the receiver [51] is employed to decode the maximum-likelihood state sequence through the trellis corresponding to the received signal vector block Y^{p,k}_r. Let the STTC encoded symbol vector corresponding to the transition from state s_i to state s_j be denoted by ξ̃_{i,j}, and the set of valid next states from the memory state s_i for different input symbols be denoted by S_i. Note that the vector ξ̃_{i,j} and the set S_i can be determined using either the generator matrix or the state diagram of the STTC. Example generator matrices and state transitions in a trellis for the STTC used in simulation are shown in Fig. 2 and Fig. 3, respectively. The branch metric corresponding to the state transition from state s_i to state s_j at instant t is defined as

µ^{p,k}_{r,t}(s_i, s_j) = || y^{p,k}_{r,t} − H^{p,k}_{r,t} ξ̃_{i,j} ||².

The initial state of the STTC encoder at t = 1 is assumed to be s_1. The accumulated path metric is initialized as α^{p,k}_{r,1}(s_j) = µ^{p,k}_{r,1}(s_1, s_j), 1 ≤ j ≤ V, and is subsequently computed recursively for t ≥ 2 by minimizing the sum of the accumulated metric at t − 1 and the branch metric at time t as

α^{p,k}_{r,t}(s_j) = min_{i : s_j ∈ S_i} [ α^{p,k}_{r,t−1}(s_i) + µ^{p,k}_{r,t}(s_i, s_j) ].

Let the survivor state index corresponding to state s_j at stage t be denoted by υ^{p,k}_r(j, t), which is initialized as υ^{p,k}_r(j, t) = 1, ∀j, for t = 1.
Further, for stages t ≥ 2, it is determined for each state s_j at stage t of the trellis as

υ^{p,k}_r(j, t) = argmin_{i : s_j ∈ S_i} [ α^{p,k}_{r,t−1}(s_i) + µ^{p,k}_{r,t}(s_i, s_j) ].

The ML decoded sequence corresponds to the trellis path terminating at the state with the minimum accumulated metric at t = T, whose index is determined as

q^{p,k}_r(T) = argmin_j α^{p,k}_{r,T}(s_j).

The decoded state sequence is obtained by back-tracking through the survivor states in the trellis using the survivor state index matrix υ^{p,k}_r as q^{p,k}_r(t) = υ^{p,k}_r(q^{p,k}_r(t + 1), t + 1). The optimal state sequence estimate ŝ^{p,k}_r is in turn obtained from the index sequence q^{p,k}_r as ŝ^{p,k}_r(t) = s_{q^{p,k}_r(t)}. The decoded video bit stream x̂^{p,k}_r for the kth coded component in the rth row and the pth frame is recovered from ŝ^{p,k}_r using the generator matrix G, by repeating the steps in (3) and (4) with x^{p,k}_r replaced by x̂^{p,k}_r. This process is repeated for all the rows of the frame to obtain the frame pixel intensity estimate Â_p, in turn for each of the P frames of the video stream. A significant shortcoming of the conventional method described here is that it does not exploit the bounded variation property [29] of the image/video stream for data detection, which has the potential to substantially reduce the decoding errors at the receiver, thereby leading to a significant improvement in the quality of reconstruction. Motivated by this fact, a novel TV regularization based STTC decoder is proposed in the following section for robust image/video reconstruction in MIMO wireless systems.
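For concreteness, the Viterbi recursion described above, with the squared-error branch metric ||y_t − H_t ξ̃_{i,j}||², can be sketched generically as follows; the trellis description (valid successors and branch symbol vectors) is passed in as plain containers, and the function names are illustrative only.

```python
import numpy as np

def viterbi_ml(Y, H, next_states, branch_symbols):
    """Generic trellis ML decoder. Y[t]: received vector at stage t, H[t]:
    channel matrix at stage t, next_states[i]: valid successor states of
    state i, branch_symbols[(i, j)]: symbol vector for the transition i -> j.
    Returns the minimum-metric state sequence via survivor back-tracking."""
    V, T = len(next_states), len(Y)
    alpha = np.full((T, V), np.inf)          # accumulated path metrics
    surv = np.zeros((T, V), dtype=int)       # survivor state indices

    def mu(t, i, j):                         # squared-error branch metric
        return np.linalg.norm(Y[t] - H[t] @ branch_symbols[(i, j)]) ** 2

    for j in next_states[0]:                 # encoder starts in state 0
        alpha[0, j] = mu(0, 0, j)
    for t in range(1, T):
        for i in range(V):
            if not np.isfinite(alpha[t - 1, i]):
                continue
            for j in next_states[i]:
                cand = alpha[t - 1, i] + mu(t, i, j)
                if cand < alpha[t, j]:
                    alpha[t, j], surv[t, j] = cand, i
    path = [int(np.argmin(alpha[-1]))]       # best terminal state
    for t in range(T - 1, 0, -1):            # back-track through survivors
        path.append(int(surv[t, path[-1]]))
    return path[::-1]
```

Like the decoder in the text, the complexity is linear in the number of stages T and quadratic at most in the number of states V.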

III. PROPOSED STTC-TV DECODER FOR IMAGE/VIDEO RECOVERY
The likelihood function P(y^{p,k}_{r,t} | ξ^{p,k}_{r,t}) is given by the standard Gaussian distribution corresponding to the MIMO wireless channel model in (7). The prior probability density function (PDF) P(x^{p,k}_r) of the image/video stream is modeled using the standard Gibbs distribution [52], which has been shown to be well suited for this purpose, and is expressed as

P(x^{p,k}_r) = (1/σ_z) exp( −U(x^{p,k}_r) ),

where U(x^{p,k}_r) is the prior energy functional and σ_z is a suitably chosen normalization factor. On substituting the likelihood function and the prior PDF in the MAP rule in (16), one obtains the regularized cost function in (18), where the regularization parameter γ controls the weight of the regularization energy functional U(x^{p,k}_r) vis-a-vis the approximation error in the first term inside the sum. The regularization term in the cost function exploits the prior information about the smoothness of the reconstructed image/video stream. The use of an l2-norm based isotropic TV regularization functional was first introduced by Rudin et al. in [29] for noise removal in images. Subsequently, a scheme employing the l1-norm based anisotropic TV norm was developed in [53] for the restoration of color/vector-valued images. The TV regularization cost function therein was shown to be rotationally invariant as well as better suited for preserving edges, thereby leading to superior image recovery. Motivated by these advantages of l1-norm based regularization, the regularization functional is set as U(x^{p,k}_r) = ||x^{p,k}_r||_TV, where the TV norm is defined as

||x^{p,k}_r||_TV = Σ_{l=1}^{C_s} || f_D(x^{p,k}_{r,l}) ||_1.

The operator f_D(·) evaluates the difference vector of x^{p,k}_{r,l} computed with respect to its causally connected neighbors [44] as

f_D(x^{p,k}_{r,l}) = [ f_w(x^{p,k}_{r,l}) − f_w(x^{p,k}_{r,l−1}), f_w(x^{p,k}_{r,l}) − f_w(x^{p,k}_{r−1,l}), γ_f ( f_w(x^{p,k}_{r,l}) − f_w(x^{p−1,k}_{r,l}) ) ]^T,
where the operator f_w(x^{p,k}_{r,l}) evaluates the numerical value corresponding to the coded component x^{p,k}_{r,l}. The quantity γ_f controls the weight of the gradient of the coded components of the frame along the temporal direction relative to the gradients along the spatial directions. On substituting (20) and (19) in (18), the proposed MAP rule based optimization problem in (16) simplifies to the one given in (21), as shown at the bottom of the next page, where U(p, r, l) and U_d(p, r, l) denote the frame, row and column indices of the causal neighbors and causal decoded neighbors, as defined in (22) and (24) at the bottom of the next page, respectively, and δ_a(b) = 1 if a = b and 0 otherwise is the standard Kronecker delta function.
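For intuition, an l1-norm anisotropic TV cost over a pixel-domain stream with causal horizontal, vertical and γ_f-weighted temporal differences (i.e. the b = 1, uncompressed case, with f_w the identity) can be computed as:

```python
import numpy as np

def anisotropic_tv(frames, gamma_f=1.25):
    """l1 anisotropic TV of a P x R x C stream: horizontal and vertical
    differences within each frame plus a gamma_f-weighted difference to the
    causal (previous-frame) temporal neighbour."""
    frames = np.asarray(frames, dtype=float)
    tv = np.abs(np.diff(frames, axis=2)).sum()             # horizontal
    tv += np.abs(np.diff(frames, axis=1)).sum()            # vertical
    tv += gamma_f * np.abs(np.diff(frames, axis=0)).sum()  # temporal
    return tv
```

A smooth stream yields a small cost, which is exactly the prior the MAP detector exploits.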
The solution to the optimization problem in (21) cannot be determined using conventional techniques due to the high computational complexity associated with the large dimensionality of the coded component space, which grows exponentially with the sequence length C_s. Therefore, in order to overcome this challenge of dimensionality and recover the video frame efficiently, a novel modified Viterbi decoder is proposed next, which employs a hierarchical trellis for decoding the sequence x^{p,k}_r. This is well suited for signal recovery in practical scenarios since its complexity is linear in the sequence length C_s, similar to that of the conventional Viterbi decoder [51].

A. HIERARCHICAL TRELLIS BASED OPTIMAL VITERBI DECODER FOR STTC-TV DETECTION
In the proposed hierarchical trellis for TV decoding, the higher layer, termed the CC-layer, corresponds to the sequence of coded components, as shown in Fig. 2. The CC-layer also evaluates the TV cost between the coded components, which arises from the bounded variation property of images/videos. The lower layer, termed the ST-layer, evaluates the ML cost corresponding to the STTC symbols, and the corresponding decoder is similar to a regular STTC decoder. The CC-layer comprises C_s stages with N nodes in each stage. The ith node in this trellis corresponds to a valid coded component bit sequence vector denoted by ν_i ∈ {0, 1}^{B^{p,k}_{r,l}×1}. For instance, ν_i = [10010011]^T corresponds to the pixel intensity value A_p(r, l) = 147 for B^{p,k}_{r,l} = 8. Each node corresponds to the transmission of the STTC coded symbols over a duration of T = B^{p,k}_{r,l}/m time instants, which is further expanded in the ST-layer, as shown in Fig. 3. The ST-layer has V encoder states, with the ith state denoted by s_i, and T stages, similar to the conventional decoder in Section II. Each node in the CC-layer has V initial and V final encoder states, referred to as initial and final subnodes and denoted by s_i for 1 ≤ i ≤ V, which represent the possible encoder states before and after STTC encoding of the corresponding coded component. The subnodes determine the valid branches between nodes at successive stages of the trellis, since encoder state continuity is maintained for STTC encoding of the coded components from one CC-layer stage to the next: the final encoder state after encoding a coded component is identical to the initial encoder state of the coded component at the next CC-layer stage. Therefore, a valid branch between two nodes in the CC-layer exists only between identical subnodes, connecting the same encoder states in the ST-layer.
The modified node and branch metrics derived from the objective function in the optimization problem in (21) corresponding to the hierarchical trellis are described next.
The node metric corresponding to the qth initial subnode of the ith node at the lth CC-layer stage is defined in (23), as shown at the bottom of this page. It comprises the decoding error for the symbol vector sequence corresponding to STTC encoding of ν_i with initial encoder state s_q, together with the TV norm regularization term corresponding to the causal decoded neighbors in the set U_d(p, r, l) defined in (24). The overall decoding procedure is summarized in Algorithm 1: first, the accumulated metrics, survivor subnodes and survivor nodes are computed recursively for all nodes and subnodes at each CC-stage using (28), (29) and (30), respectively; second, the node index τ(C_s) and the subnode index τ̃(C_s) corresponding to the least accumulated cost at CC-stage l = C_s are determined as given in (32); third, the optimal transmit CC bit sequence estimates x̂^{p,k}_{r,l} for the remaining CC-stages l = C_s − 1 down to 1 are recovered recursively from the survivor subnodes ζ̃^{p,k}_{r,l}(i, q) and survivor nodes ζ^{p,k}_{r,l}(i, q) using the steps described in (33).
Here, U_d(p, r, l) = U(p, r, l) \ {(p, r, l − 1)} for all 1 ≤ p ≤ P, 1 ≤ r ≤ R_s and 2 ≤ l ≤ C_s, as given in (24). An illustration of the hierarchical structure of the trellis for each node ν_i in the CC-layer, together with the corresponding trellis in the ST-layer for the transmit vector sequence, is given in Fig. 3. Note that for a V-state STTC, the transmit vector sequence is found using the generator matrix G, the initial state s_q and the information bit sequence ν_i. The initial encoder state for l = 1 is assumed to be s_1, so the node metric at the first CC-layer stage is evaluated only for the initial subnode s_1. The branch metric ρ^{p,k}_{r,l}(i, j) corresponding to a valid branch from node i to node j, for 1 ≤ l ≤ C_s, is defined accordingly from the objective in (21). Further, it can be seen that, corresponding to the ith node, multiple paths emanating from different initial subnodes can terminate at a final subnode s_q. Let the set of indices of these initial subnodes, corresponding to each final subnode s_q at stage i in the ST-layer, be denoted by V^q_i. The accumulated metric corresponding to the node ν_i and final encoder state s_q is found as follows. For l = 1, it is initialized with the node metric of the first CC-layer stage. For stages l ≥ 2, the accumulated metric, the survivor subnode ζ̃^{p,k}_{r,l}(i, q) and the survivor node ζ^{p,k}_{r,l}(i, q) of the qth subnode in the final stage of the ST-layer corresponding to the ith node in the lth CC-layer stage are computed recursively using (28), (29) and (30), as shown at the bottom of the next page, respectively. These metrics and survivors are computed for all the V subnodes of each node in the CC-layer.
Finally, in order to determine the bit sequence estimate x̂^{p,k}_{r,C_s} of the coded component corresponding to the final stage l = C_s of the CC-layer, the node index τ(C_s) and the subnode index τ̃(C_s) corresponding to the least accumulated cost are found as given in (32). The codeword estimates of the coded components corresponding to the remaining CC-layer stages 1 ≤ l ≤ C_s − 1 are determined recursively from the survivor nodes ζ^{p,k}_{r,l} and the survivor subnodes ζ̃^{p,k}_{r,l} as given in (33). A concise summary of the various steps of the procedure for decoding the codewords corresponding to the kth coded component of the blocks in the rth row of the pth frame is given in Algorithm 1. Furthermore, the estimate Â^{d,p}_{r,l} of the (r, l)th coded block is obtained by decoding the estimates x̂^{p,k}_{r,l}, 1 ≤ k ≤ Q^p_{r,l}, of the coded component vector sequence. The estimates of the pixel intensities Â^p_{r,l} ∈ R^{b×b} for the (r, l)th block are subsequently obtained via the inverse transform (IDCT for JPEG, MPEG) of the coded block estimate, as shown in (34) at the bottom of the next page. The image/video frame Â_p is finally reconstructed by assembling the R_s × C_s block estimates Â^p_{r,l}.

IV. SIMULATION RESULTS AND DISCUSSION
Visual and numerical results are presented in this section, followed by a discussion of the performance of the proposed scheme and its comparison with conventional reconstruction schemes. The setup for image transmission and reconstruction is described next.

A. IMAGE TRANSMISSION AND WIRELESS SYSTEM MODEL
In order to validate the performance of the proposed scheme, simulations are performed on several standard test images, such as Lena, House, Jetplane, Walkbridge and Living room, of size 256 × 256 pixels, i.e. R = C = 256, as shown in Fig. 4. The images and videos considered for the simulations are standard test images/videos that have been used in various other works in the existing literature, such as [54]-[56]. A particularly appealing aspect of these frames, which makes them well suited for testing the performance of different algorithms, is that they represent a wide variety of physical conditions, covering indoor as well as outdoor scenarios, with the objects of interest placed at different locations. Moreover, a variety of backgrounds and textures are present in each frame. Thus, these images/videos are considered for comprehensively testing and demonstrating the performance of the proposed scheme under diverse conditions, to ensure reliable performance in real-time implementations. For images, the number of frames is set as P = 1, and both uncoded (b = 1, N = 256) and JPEG coded (b = 8, N = 256) formats are considered. The bits are subsequently STTC coded with the different generator matrices [48] listed in Table 2, and the coded symbols are modulated using QPSK symbols of power P_s. The parameter SNR refers to the average signal to noise power ratio at the receiver, taking the fading into account.
The performance of the proposed scheme is illustrated using three different metrics: peak signal to noise ratio (PSNR), bit error rate (BER) and visual quality. The PSNR is a standard quality assessment metric that has been used in many works on image/video processing, such as [32], [57], [58], and is well suited to studying the performance of image/video reconstruction. Furthermore, the performance of the proposed scheme is also characterized in terms of the BER in Fig. 7, which is once again a well-established measure of the accuracy of symbol detection in wireless communication. Finally, individual reconstructed images and video frames are shown in Fig. 8, Fig. 9, Fig. 10 and Fig. 11. The PSNR is computed as PSNR = 10 log10( 255² / MSE(A, Â) ), where the mean squared error is computed as MSE(A, Â) = (1/(RC)) Σ_{i=1}^{R} Σ_{j=1}^{C} (A(i, j) − Â(i, j))².
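These metrics are straightforward to compute; a minimal PSNR implementation following the definition above is:

```python
import numpy as np

def psnr(A, A_hat, peak=255.0):
    """PSNR = 10 log10(peak^2 / MSE); assumes A != A_hat (nonzero MSE)."""
    A, A_hat = np.asarray(A, dtype=float), np.asarray(A_hat, dtype=float)
    mse = np.mean((A - A_hat) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```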

B. OPTIMAL REGULARIZATION PARAMETER γ
The optimal regularization parameter γ is determined using the minimum mean squared error (MSE) criterion for different channel SNR conditions. The average MSE of image reconstruction for SNRs in the range [0, 10] dB is plotted against γ values in the range [10^{-3}, 10] in Fig. 5, with the corresponding reconstructed images shown in Fig. 6. It can be observed therein that the lowest reconstruction error and the best visual quality of the reconstructed image are indeed achieved at γ = γ*, where γ* = 0.1 is the optimal value of the regularization parameter determined from Fig. 5. For γ < γ*, the reconstructed image is noisy, whereas increasing γ beyond the optimal value degrades the reconstruction performance due to over-smoothing.
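The grid search over γ sketched below illustrates this selection procedure; `decode_fn` is a hypothetical stand-in for the full STTC-TV decoding pipeline, which here we treat as a black box returning a reconstruction for a given γ.

```python
import numpy as np

def select_gamma(decode_fn, A, gammas=np.logspace(-3, 1, 9)):
    """Return the regularisation weight giving minimum reconstruction MSE.
    decode_fn(gamma) is a placeholder for the STTC-TV decoder and is assumed
    to return a reconstructed frame of the same shape as the reference A."""
    mses = [np.mean((np.asarray(A) - decode_fn(g)) ** 2) for g in gammas]
    return float(gammas[int(np.argmin(mses))])
```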

C. IMAGE RECONSTRUCTION PERFORMANCE
For the optimal values of γ, Monte Carlo simulations are carried out to demonstrate the average PSNR performance of the proposed STTC-TV decoder for different channel SNR conditions, and to compare it with that of the conventional STTC-ML decoder. The DC components are recovered using the STTC-TV procedure, whereas the AC coded components are estimated using the conventional STTC-ML technique. The corresponding average PSNR results are shown in Table 1. The average PSNR values for the proposed scheme using different standard STTC generator matrices reported in the literature [48], obtained using the optimal code design criterion, are shown in Table 2. It can be observed from these results that the image/video reconstruction performance of the proposed STTC-TV scheme improves with the number of states of the STTC code; thus, the best performance among the four STTC generator matrices in Table 2 is achieved by the code with the largest number of states. The visual results, shown in Fig. 8 and Fig. 9 for the raw and JPEG formats respectively, demonstrate the improved image recovery in comparison to that of the conventional ML based STTC decoder. For instance, the proposed scheme can be seen to yield PSNR improvements close to 14 dB and 3 dB for raw and JPEG compression respectively, at SNR = 4 dB. The effect of the average SNR on the overall bit error rate (BER) performance of the proposed scheme is shown for image recovery over the SNR range [0, 10] dB in Fig. 7, where the performance at each SNR value is obtained by averaging over several instantiations of the fading wireless channel. The consistent PSNR improvement of approximately 2 dB demonstrates the improved ability of STTC-TV across the various values of average SNR, accounting also for the effects of fading in the WMSN.
The numerical and visual results above demonstrate the improved image/ video reconstruction of the proposed scheme over the conventional STTC-ML scheme arising due to the exploitation of the bounded variation property of the transmitted stream. The BER performance of wireless communication also shows a similar improvement in decoding the transmitted symbols that get severely distorted due to the fading nature of the wireless channel. This boost in performance demonstrates the ability of STTC-TV to combat the adverse effects of fading for various SNR conditions in the WMSN.

D. VIDEO RECONSTRUCTION
Simulations are also performed for video reconstruction using the standard test video sequences Foreman, Coastguard and Akiyo. The parameters for the videos under consideration are set as R = 176, C = 144 and P = 300. The scheme is applied to raw as well as coded video streams, using the low complexity DCT-based video compression technique described by Ouni et al. in [59], referred to herein as Ouni for brevity. The procedure is briefly described as follows. A temporal decomposition is performed on the 3D video frames, transforming them into 2D image components using the popular Accordion representation. Specifically, each 8 × 8 × 8 cube of video pixels is transformed into an image frame of size 8 × 64; the number of such frames generated for the test video sequences above is P̃ = 38. These frames are then compressed using JPEG coding. The proposed scheme is employed at the receiver for the reconstruction of the video frames, with its performance once again assessed using the average PSNR metric. Using a procedure similar to the one described for image reconstruction above, the optimal regularization parameter γ is observed to lie in the range [0.1, 0.5]. The parameter γ_f corresponding to (20) is set as γ_f = 1.25, similar to [32]. The average PSNR values of the proposed scheme and of the conventional reconstruction are shown in Table 3, which clearly demonstrates the enhanced quality of video recovery obtained using the proposed approach. For instance, the PSNR improvement is approximately 15 dB and 4 dB for raw and Ouni compressed video reconstruction respectively, at SNR = 4 dB. In addition to the numerical results, representative frames of all the reconstructed video sequences are shown in Fig. 10 and Fig. 11 for the raw and Ouni compression schemes respectively, for the proposed and the conventional recovery techniques.
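The dimensionality of this temporal decomposition can be illustrated as below; note that the exact column ordering used by the Accordion representation in [59] may differ, so this sketch shows only the 8 × 8 × 8 to 8 × 64 unfolding.

```python
import numpy as np

def accordion_cube(cube):
    """Unfold a T x R x C cube of video pixels (time x rows x cols) into a
    single R x (T*C) image frame by concatenating the temporal slices along
    the column axis. Illustrates only the dimensionality of the mapping; the
    exact column ordering of the Accordion representation may differ."""
    T, R, C = cube.shape
    return cube.transpose(1, 0, 2).reshape(R, T * C)
```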
Numerical values and visual examination of the frames clearly show that the proposed decoder outperforms the conventional STTC-ML decoder. The performance margin is especially pronounced in the low SNR regime thus making the approach well suited for application in WMSN with power constrained sensor nodes.

V. CONCLUSION
An STTC coding based image/video communication framework has been developed for WMSN. A TV regularization based image/video decoder is developed for robust recovery, which employs an l1-norm regularization term to exploit the bounded variation property of the image/video stream in conjunction with the l2-norm based ML decoder for the STTC-MIMO wireless system. A novel STTC-TV decoder based on the Viterbi algorithm is developed to minimize the TV regularized cost function over a bi-layered hierarchical trellis structure for image/video reconstruction, with suitably modified node, branch and accumulated metrics derived from the proposed TV optimization problem. Simulation results, both visual and numerical, demonstrate the improved reconstruction performance of the proposed framework in comparison to the conventional STTC-ML decoder.