Image Compression Techniques in Wireless Sensor Networks: A Survey and Comparison

There is continuous, intensive research on image compression techniques in wireless sensor networks (WSNs). Image compression techniques applied in WSNs in the literature include the discrete cosine transform (DCT), the discrete wavelet transform (DWT), set partitioning in hierarchical trees (SPIHT), and embedded zerotree wavelet (EZW) coding. Research on image compression in WSNs is necessitated by the need to improve the energy efficiency of sensor nodes and the lifetime of WSNs without compromising the quality of the reconstructed data. Several approaches centered around image compression and related factors have been developed to limit the energy consumption of sensor nodes. Most of these approaches do not provide an error-bound mechanism that balances the compression ratio against the distortion of the reconstructed image. Therefore, in this paper, a review and analysis of image compression techniques and approaches in WSNs are conducted. Available image compression approaches in WSNs in the literature are classified according to the image compression technique adopted, and their strengths and weaknesses are highlighted. In addition, a rate-distortion balanced data compression algorithm with an error-bound mechanism, based on an artificial neural network (ANN) in the form of an autoencoder (AE), was coded and simulated in MATLAB before being evaluated and compared to conventional approaches. The experimental results show that the simulated algorithm achieves lower root mean square error (RMSE) and higher coefficient of determination ($R^2$) values at variable compression ratios compared to Principal Component Analysis (PCA), the Discrete Cosine Transform, and the Fast Fourier Transform (FFT) when using the Grand-St-Bernard meteorological dataset. Furthermore, it achieves lower RMSE and higher compression ratios than the Lightweight Temporal Compression (LTC) algorithm at variable error bounds when using the LUCE meteorological dataset. Therefore, the simulated algorithm presents better compression fidelity than the conventional approaches, which lack an error-bound mechanism. Moreover, the analyzed algorithm presents a significant approach to balancing compression ratio and reconstructed data quality through its error-bound mechanism.


I. INTRODUCTION
Wireless Sensor Networks (WSNs) are being deployed in a wide range of potential application scenarios, including precision agriculture, object tracking, pipeline monitoring, underground mining, forest monitoring, industrial applications, military surveillance, medical systems, traffic, and remote control [1]- [3]. Wireless sensor networks promise almost unlimited information access and greater control of our environments. They comprise numerous distributed sensing devices for monitoring and interacting with the physical world, networked so that they cooperate to perform higher-level sensing tasks. WSNs consist of wireless sensors (numbers of nodes) and base stations [4] and are highly resource-constrained, being limited by communication bandwidth, memory, power supply, and processing performance [5]- [8]. Therefore, the key issue in the design of algorithms and protocols for WSNs is energy consumption. Radio communication is the dominant source of energy consumption in WSNs, and this consumption is directly proportional to the number of data bits transmitted within the network [8], [9]. Hence, a measurable reduction of communication energy costs can be achieved by compressing the transmitted bits, thereby increasing the lifetime of the network [8]. WSN topologies include star, tree, and mesh. The different types of WSNs include Terrestrial WSNs, Underground WSNs, Underwater WSNs, Multimedia WSNs, and Mobile WSNs [4]. According to the literature, WSNs can be classified as static and mobile, deterministic and nondeterministic, single-base station and multi-base station, static-base station and mobile-base station, single-hop and multi-hop, self-reconfigurable and non-self-configurable, and homogeneous and heterogeneous. A typical sensor node consists of four main components, as shown in Figure 1: (i) a sensing unit including one or more sensors and analogue-to-digital converters for data acquisition; (ii) a data processor including a microcontroller and a memory for local data processing; (iii) a radio subsystem (RF unit) to transmit the data over a wireless channel to a designated sink; and (iv) a power source [3], [10].
Sensor Nodes - the sensing devices that sense data, such as image data, and that forward and relay messages to other nodes in the network. According to the literature, a wireless sensor node is an essential component of a wireless sensor network, used for sensing, processing, wireless communication, power supply management, and other functions depending on the application. Nodes collaborate with each other to support in-network processing in a way that significantly reduces the amount of network data traffic.
Sinks - information destinations that make data from the sensors available to interested users on the internet.
Mobile Data Collectors - intermediate nodes that are not necessarily sources or destinations. The presence of at least one of them renders the network mobile.
Several attributes determine sensor selection and integration. These include the accuracy of the sensed data, sensitivity, reproducibility, the sensing span of the network, resolution, selectivity, response time, and self-heating, all of which can affect the quality of the sensed data and the performance of the sensor. The physical layer (point-to-point communication) establishes a direct link between sensor nodes; it consists of the transmitter, channel, receiver, source encoding, channel encoding, modulation, demodulation, and signal propagation. MAC protocols can be schedule-based, contention-based, or hybrid. The MAC layer ensures that all nodes share the wireless medium in a distributed way, and MAC protocols manage the utilization of energy on the network. The network layer supports multi-hop communication and encompasses the network topology, routing metrics, routing classification, and routing protocols. In wireless sensor networks, the two main network topologies adopted according to the literature are flat and hierarchical topologies.
Unlike scalar WSNs, Wireless Multimedia Sensor Network (WMSN) nodes are equipped with low-cost cameras that enable them to meet the requirements of most event detection and environmental data collection tasks [3], [11]. The complexity of WMSNs further tightens resource constraints, such as bandwidth and computational power, compared to scalar WSNs [12]- [14].
Several image compression (IC) techniques exist in literature and their choice depends on the type of operating platform. Image compression minimizes redundancies and irrelevant image data for efficient storage and transmission [15] while preserving the visual quality of the reconstructed image [16]. Generally, image compression is done to save storage space and lower bandwidth without compromising the output image quality [16]. Compression techniques can be classified as lossy and lossless [17]. Their applications are dependent on the encoding and decoding time, compression ratio, and energy requirements [16].
Surveys and reviews on image compression in WSNs have been done by different researchers before. However, this work contributes a classification of the image compression approaches available in the WSN literature, together with their strengths and weaknesses, and an analysis and evaluation of a rate-distortion balanced data compression algorithm with an error-bound mechanism against conventional approaches. The rest of the paper is arranged as follows: Section II covers compression in WSNs, including data compression and image compression. In Section III, related works in image compression for WSNs are discussed and classified. Sections IV and V cover an analysis and an evaluation of a rate-distortion balanced data compression algorithm, respectively. Lastly, conclusions and future directions are provided in Section VI.

II. COMPRESSION IN WIRELESS SENSOR NETWORKS
Compression in WSNs can be classified into Sampling Compression [8], [20], Data Compression [16], [18], and Communication Compression [18], [19], as shown in Figure 2 [8].
• Sampling Compression: a reduction of sensing operations while ensuring no loss in network coverage and an acceptable distortion margin [8], [20].
• Data Compression: conversion of an input image stream into a compressed output of smaller size [16] using some form of encoding [8].
• Communication Compression: reduction of the number of transmissions and receptions within the network by reducing the radio on-time of transceivers [8], [21].
The requirements for compression in WSNs can be categorized into generic requirements and application-specific requirements. Generic requirements include communication requirements, computational complexity and memory requirements, redundant sensing, on-route compression, reliability, robustness, and scalability.
Application-specific requirements consist of real-time vs. non-real-time operation, quality-of-service (QoS) awareness, and security. Features of these compression techniques include lossless vs. lossy, distortion vs. accuracy, data aggregation, data correlation, symmetric vs. asymmetric, and non-adaptive vs. adaptive [8], [22]- [26]. Figure 3 summarizes compression in WSNs according to types, requirements, and features.

A. DATA COMPRESSION TECHNIQUES IN WSN
Research on data compression for wireless sensor networks has been extensive, and there are many surveys and reviews of data compression techniques and their applicability [27]. According to the authors in [27], data compression techniques can be grouped into three main categories: data aggregation compression techniques, local data compression techniques, and distributed data compression techniques.
• Data aggregation compression techniques: These techniques have been heavily investigated in the literature [28]. They extract statistical summaries of the sensory data, such as minimums, maximums, and averages [27], and are most useful to applications that require only limited information. There is also a group of more practical distributed source coding techniques, such as Slepian-Wolf coding [29], that perform data compression at the sources [28]. Data aggregation techniques can be classified into Tree-Structured, Chain-Based, Cluster-Based, Sector-Based, and QoS-Based data aggregation compressions [27]. Tree-structured types include the Energy-Aware Distributed Heuristic approach (EADAT) [30], Tree-based Tiny Aggregation (TTA) [27], and the Power Efficient Data gathering and Aggregation Protocol (PEDAP) [27]. The Power Efficient Data Gathering Protocol for Sensor Information Systems (PEGASIS) [31] is a chain-based data aggregation technique. Cluster-based techniques [27] include Low Energy Adaptive Clustering Hierarchy (LEACH) [32] and Hybrid Energy-Efficient Distributed clustering (HEED) [33]. The Semantic/Spatial Correlation-aware Tree (SCT) [34] and Application Independent Data Aggregation (AIDA) [35] are examples of sector-based techniques [27]. QoS-based techniques include AIDA.
• Local data compression techniques: Each sensor node compresses its own readings locally before transmission [27].
• Distributed data compression techniques: They exploit the high spatial similarities of the sensor data in dense networks of fixed sensor nodes [27].

B. IMAGE COMPRESSION IN WIRELESS SENSOR NETWORKS
In WSNs, image compression is described as an application of data compression to digital images, reducing their transmission and/or storage requirements. Several image compression techniques exist in the literature, including the commonly used Joint Photographic Experts Group (JPEG) standard, JPEG2000, and the discrete cosine transform (DCT).

1) IMAGE COMPRESSION OVERVIEW
Most images have correlated neighboring pixels, which results in redundancy [46]. Therefore, image compression, just like data compression, is necessitated by the desire to reduce energy consumption and improve network lifetime in WSNs. The image compression process aims to output quality images with minimal distortion while reducing data redundancies. Different image formats with different compression features exist for different applications; these formats are based on either lossy or lossless compression [47]. Table 1 classifies image formats according to lossy and lossless compression based on the literature review.

a: HOW IMAGE COMPRESSION WORKS
Image compression generally proceeds in three stages:
i. Reduce spatial or temporal redundancy (mapper) - a reversible or irreversible process depending on the algorithm used for compression and decompression, e.g., DFT, DCT, or Run Length Coding.
ii. Reduce the accuracy of the mapper's output (quantizer) - an irreversible process applied in lossy compression and not advisable in lossless compression.
iii. Generate fixed- or variable-length output (symbol encoder).
In addition, Figure 5 illustrates how image compression works.
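To make these stages concrete, the following minimal Python sketch (an illustration only, not any specific standard's pipeline) runs a single image row through a toy mapper (differential coding), a quantizer (a uniform step, assumed here), and a symbol encoder (run-length coding):

```python
import numpy as np

def compress_row(row, step=4):
    """Toy pipeline: mapper -> quantizer -> symbol encoder."""
    # 1) Mapper: differential coding removes spatial redundancy
    #    between neighboring pixels (reversible on its own).
    diffs = np.diff(row.astype(np.int16), prepend=row[0])
    # 2) Quantizer: uniform quantization (the irreversible, lossy step).
    q = np.round(diffs / step).astype(np.int16)
    # 3) Symbol encoder: run-length encode the (mostly zero) symbols.
    symbols, runs = [], []
    for s in q:
        if symbols and symbols[-1] == s:
            runs[-1] += 1
        else:
            symbols.append(int(s))
            runs.append(1)
    return list(zip(symbols, runs))

row = np.array([100, 101, 101, 102, 130, 131, 131, 131], dtype=np.uint8)
print(compress_row(row))  # e.g. [(0, 4), (7, 1), (0, 3)]
```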
b: THE JPEG COMPRESSION SCHEME DESCRIPTION
JPEG encoding algorithm: a lossy data compression method commonly applied to digital images. It employs a transform coding method using the DCT technique, as summarized in Figure 6 (the JPEG compression schematic [24]).

2) IMAGE COMPRESSION TECHNIQUES
These are algorithms used to identify and remove information that is not critical to image perception and then encode the remainder in a compact form. Techniques developed primarily for images fall into two types: lossy and lossless [48].

a: LOSSLESS IMAGE COMPRESSION TECHNIQUES
Lossless image compression techniques achieve a reduction in size with no data loss, but at lower compression ratios, which complicates image transfer over a WSN [16], [49]. The resulting compressed image remains large, implying high sensor-node power consumption and bandwidth usage in resource-constrained applications [16], [50].
As surveyed from the literature, the usage of lossless image compression techniques is limited due to their lack of energy efficiency. Lossless algorithms are typically used for text or programs, where the data contains redundancy and the original data must be identical to the data after compression and decompression. Redundant data is removed during compression and restored during decompression; e.g., ABABAA can be encoded as 2ABAA (the repeated AB stored once with a count) and expanded back to ABABAA. Lossless compression techniques reconstruct the exact data but can reduce its size only to a limited extent: the original data is compressed to a lesser extent, its quality is not degraded, and the channel carries a smaller amount of data. These algorithms depend on a two-stage procedure [16], [51]: decorrelation and entropy coding.
Decorrelation - a process used to remove the spatial redundancy [8] between pixels while preserving the other aspects of the image with low distortion. Decorrelation falls into three main categories: transform-based, prediction-based, and multi-resolution-based techniques [52].
Entropy Coding - used to reduce the data rate of the coefficients resulting from decorrelation [8]. The literature shows that the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT) are widely used in video and image compression. Removal of coding redundancy is based on Statistical Coding and Run Length Coding (RLC). Statistical coding includes Huffman encoding, arithmetic encoding, and LZE encoding.
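As a minimal illustration of the exactness of lossless coding, the following Python sketch implements the classic character-level run-length coding mentioned above; the round trip reconstructs the input bit-for-bit:

```python
def rle_encode(data):
    """Run-length coding: store each symbol once with its run count."""
    out = []
    for ch in data:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out

def rle_decode(pairs):
    """Exact inverse: lossless compression reconstructs the original data."""
    return "".join(ch * n for ch, n in pairs)

assert rle_decode(rle_encode("AAABBBAACC")) == "AAABBBAACC"
```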

b: LOSSY IMAGE COMPRESSION TECHNIQUES
Lossy techniques are used for compressing images, audio files, and video files, where some loss of data is acceptable. These methods are cheaper, require less time and space, and achieve higher compression ratios than lossless image compression techniques. The compressed image normally differs in size from the original and carries some distortion; however, the reconstructed image is normally a close match to the original. Lossy compression removes the non-useful, perceptually undetectable part of the data and decreases the file size to a greater extent: the original data is compressed further, the quality of the restored data degrades, and the channel accommodates more data. Because some data is lost during this type of compression, it is vital to measure the distortion [16], i.e., how close a reconstructed image is to the original. According to the literature, Mean Square Error (MSE) and Peak Signal-to-Noise Ratio (PSNR) are the most widely adopted similarity metrics used to measure the proximity between images. MSE and PSNR for image compression are represented by (1) and (2), respectively, adopted from [16]:
$$\mathrm{MSE} = \frac{1}{N}\sum_{n=1}^{N}(x_n - y_n)^2 \qquad (1)$$
where $x_n$ represents the input data sequence, $y_n$ the compressed data sequence, and $N$ the data sequence length.
$$\mathrm{PSNR} = 10\log_{10}\left(\frac{x_{peak}^2}{\mathrm{MSE}}\right) \qquad (2)$$
where $x_{peak}$ is the peak value of the signal; $x_{peak} = 255$ for 8-bit pixels.
The higher the PSNR, the better the image quality, and a lower MSE indicates that the original and compressed images are closely similar.
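Equations (1) and (2) translate directly into code; the following NumPy sketch (a straightforward rendering, with x_peak = 255 assumed for 8-bit images) can be used to compare an original and a reconstructed image:

```python
import numpy as np

def mse(x, y):
    """Mean squared error between original x and reconstruction y, per (1)."""
    x = np.asarray(x, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64)
    return np.mean((x - y) ** 2)

def psnr(x, y, x_peak=255.0):
    """Peak signal-to-noise ratio in dB, per (2); x_peak = 255 for 8-bit pixels."""
    m = mse(x, y)
    return np.inf if m == 0 else 10.0 * np.log10(x_peak ** 2 / m)
```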
Transform-based and non-transform-based techniques are the two main categories of lossy image compression techniques applied in resource-constrained applications [47], [53]- [56].

i) TRANSFORM-BASED TECHNIQUES
Most video and image compression applications [8] use DWT and DCT techniques as part of the transform-based techniques. Using appropriate basis functions, the original data is transformed into a set of coefficients used to reconstruct the image or signal at the receiver. Generally, a reduced number of nonzero quantized coefficients is enough to recover an approximation of the original image with low distortion [8]. Their easy implementation makes them the most preferred in real-time applications [57].
Discrete Cosine Transform (DCT) - the most widely used transform coding technique, adopted in the JPEG image compression scheme [8], [16] because it is very fast [58]. The original image is first divided into blocks, and each block is coded independently [58]; in JPEG compression these blocks are 8 × 8 pixels [48], [58], [59]. The image is projected onto a collection of cosine components at distinct 2-dimensional frequencies. That is, the transform acts on B × B pixel blocks P, zero-centered, to obtain the B × B DCT block D using (3) and (4) [60]:
$$D(u,v) = C(u)\,C(v)\sum_{x=0}^{B-1}\sum_{y=0}^{B-1} P(x,y)\cos\frac{(2x+1)u\pi}{2B}\cos\frac{(2y+1)v\pi}{2B} \qquad (3)$$
$$C(k) = \begin{cases}\sqrt{1/B}, & k = 0\\ \sqrt{2/B}, & k > 0\end{cases} \qquad (4)$$
Discrete Wavelet Transform (DWT) - adopted in JPEG2000. It represents a signal with good resolution in both frequency and time through base functions called wavelets [16]. Location and frequency information are captured while the wavelets are discretely sampled [58], [61]- [63]. A typical wavelet transform-based compression scheme is illustrated in Figure 7.
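The following Python sketch illustrates the blockwise 2-D DCT step described above, using SciPy's dctn/idctn on zero-centered 8 × 8 blocks. Keeping only the largest-magnitude coefficients per block is a simplified stand-in for a real quantizer, and the keep parameter is an illustrative choice:

```python
import numpy as np
from scipy.fft import dctn, idctn

def block_dct_approx(image, block=8, keep=10):
    """Blockwise 2-D DCT: zero-center each B x B block, transform it,
    zero out all but the `keep` largest-magnitude coefficients (a crude
    stand-in for quantization), and inverse-transform."""
    h, w = image.shape
    out = np.zeros((h, w))
    for i in range(0, h - h % block, block):
        for j in range(0, w - w % block, block):
            P = image[i:i+block, j:j+block].astype(np.float64) - 128.0
            D = dctn(P, norm='ortho')                 # forward DCT, cf. (3)-(4)
            thresh = np.sort(np.abs(D), axis=None)[-keep]
            D[np.abs(D) < thresh] = 0.0               # discard small coefficients
            out[i:i+block, j:j+block] = idctn(D, norm='ortho') + 128.0
    return out

img = (np.arange(64).reshape(8, 8) * 3).astype(np.uint8)  # toy 8x8 "image"
approx = block_dct_approx(img)
```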
Embedded Block Coding with Optimized Truncation (EBCOT): a complex coding system adopted for entropy coding in the JPEG2000 image compression standard. It is a two-tier coding system: the first tier handles context modelling and entropy coding of the quantized coefficients [57], while the second tier controls the targeted compression rate and the output code stream. The EBCOT scheme is illustrated in Figure 8.

Set Partitioning In Hierarchical Tree (SPIHT):
SPIHT achieves high coding performance, close to that of embedded block coding with optimized truncation, while achieving rate scalability [64]- [66]. The computational complexity of the algorithm is low, and it exploits the self-similarity of DWT coefficients across different scales [64], [67], making it one of the best encoders according to the literature. The technique is summarized in Figure 9.

ii) NON-TRANSFORM-BASED TECHNIQUES
Because no transform is used, the computational load associated with computing frequency-domain coefficients is avoided [71]. The quantization process for these lossy techniques is based on a vector quantizer. From the literature review, these image compression algorithms are far less widely adopted than transform-based techniques.
Fractal Compression (FC) - an encoding method that relies on mathematical theorems and is suited to images containing self-similar parts. A fixed-point theorem and the collage theorem are used to build the Iterated Function System (IFS) [71]. It also uses block partitioning in the source encoder.
Vector Quantization (VQ) - in signal processing, VQ models probability density functions by the distribution of prototype vectors, classifying each input vector to its nearest prototype during quantization [71]. Apart from transform and non-transform-based compression techniques, there exist other data compression techniques, either lossy or lossless, such as Distributed Source Coding (DSC), Compressed Sensing (CS), text-based compression, data aggregation, and predictive coding. These techniques are discussed in Section III.
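A minimal sketch of the VQ idea follows: each input vector is replaced by the index of its nearest codeword, and the decoder reconstructs by codebook lookup. The codebook here is random purely for illustration; in practice it would be trained (e.g., with k-means):

```python
import numpy as np

def vq_quantize(vectors, codebook):
    """Map each input vector to the index of its nearest codeword (prototype)."""
    # Squared Euclidean distance from every vector to every codeword.
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Toy example: 2-D pixel-pair vectors quantized against a 4-codeword codebook.
rng = np.random.default_rng(0)
codebook = rng.uniform(0, 255, size=(4, 2))
vectors = rng.uniform(0, 255, size=(10, 2))
indices = vq_quantize(vectors, codebook)   # transmit the small indices
reconstructed = codebook[indices]          # decoder-side lookup
```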

2) IMAGE COMPRESSION DETERMINING REQUIREMENTS
Just as in data compression [52], the adoption of an image compression technique is influenced by the application. For instance, some applications may need visual information of high quality, while others may tolerate lower quality [72]. Some applications may need data in real time, while others do not. Therefore, image compression requirements can be categorized into generic requirements and application-specific requirements, as introduced in Section II.

a: GENERIC REQUIREMENTS
• Redundant sensing: Data redundancy is likely to occur during collection, transmission, and storage because the sensing regions covered by neighboring nodes overlap. Once discovered, this redundancy can be exploited by the compression techniques.
• On-route compression: The standard adopted in data compression is that data is compressed at the source nodes and decompressed at the sink. For on-route compression, data must also be accessible at forwarding nodes so that processing and changes can happen en route.
• Computational complexity and memory requirements: These are mostly centered around hardware requirements that include parallelism to support the compression algorithm for efficiency purposes.
• Reliability: Spatial redundancy is one of the attributes that can be exploited to improve the reliability of image data communication.
• Robustness: Failures of network links and nodes should be anticipated, and the compression techniques should be able to function adequately if such cases arise.
• Scalability: The image compression algorithm in a WSN should be able to scale with the network size [27].

b: APPLICATION-SPECIFIC REQUIREMENTS
• QoS-awareness: Each sensor node has a set of distinct latency and reliability requirements [73] based on the application.
• Security: Security levels on certain WSN applications may create a conflict between data compression and the level of security required. Hence, it is always important to find a balance between the security protocols used and the image compression to be adopted [74].

III. RELATED WORKS AND CLASSIFICATION OF IMAGE COMPRESSION TECHNIQUES
Challenges with WSNs include target coverage and connectivity, data collection, network lifetime, and data compression [75]. Various algorithms have been developed in the literature to overcome these challenges, including data collection algorithms based on chain, tree, cluster, multipath, and hybrid topologies [5]. For the network lifespan problem, the Swap-Level algorithm and the Game Theoretic Energy Balance Routing Protocol (GTEB) have been discussed in the literature. The authors in [5] proposed a distributed image compression scheme that overcomes the energy and computation limitations of individual nodes by sharing the processing of tasks. Two distinct methods addressing image quality and energy consumption were proposed, with the main objective of achieving efficient transmission and compression of images over a resource-constrained multi-hop wireless sensor network. The results showed that the proposed method prolonged the network lifetime at a promising energy consumption rate compared to centralized image compression. However, the authors did not validate their approach on a sensor network testbed, and the impact of link errors associated with WSNs was not taken into consideration.
The authors in [76] focused on finding a compression optimization model with minimal loss within a group of compressed neural network models. Their framework was based on low-rank compression, quantization, low-precision approximation, pruning, and lossless compression. Furthermore, the authors provided a general overview of the Learning-Compression (LC) algorithm under standard assumptions. The authors' experimental results were compared to other companion papers, and the compression frameworks were found to be comparable with existing state-of-the-art techniques, with the advantages of simplicity, convergence guarantees, and generality; hence, a useful addition to neural network toolboxes. However, the approach is general, and the paper can inform further research, as the proposed compression framework presents a significant advance in compression optimization for neural networks.
A variational partial differential equation model was adopted in the implementation of a grey-image compression algorithm as an optimization model [77]. The authors introduced a quadtree for image segmentation and for encoding and transmitting a subset of pixels. At the decoding stage, an image interpolation technique based on variational partial differential equations was used for image reconstruction. The method was found to provide a significant improvement, with high compression ratio and PSNR on larger, less textured images. Compared to Quad, Pixel, Data Manager (QPDM), Error-Dispersed Vector Quantization (EDVQ), and Local Cluster Member (LCM), the proposed algorithm demonstrated better results on image compression coding quality metrics: compression ratio, coding efficiency, average code word length, source entropy, redundancy, and PSNR. Even though the proposed method outperformed the algorithms it was compared with, its PSNR is still low; there is therefore a need to focus on improving PSNR for image compression, as well as on reducing the average phase error to maintain phase and amplitude information. More work on partial differential equations is discussed by the authors in [78].
Different compression optimization approaches and methods exist in the literature. The authors in [79] discussed matrix compression methods such as the Supreme Minimum (SM) and Variable Length Blocks (VLB). The authors in [80] discussed compression optimization implementation approaches in the form of packets versus sessions, dictionary sizes, blocks versus bytes, static versus adaptive compression, and application versus network. Table 2 provides a classification of related works on image compression algorithms in WSNs.

IV. ANALYSIS OF A DATA COMPRESSION ALGORITHM
According to the literature, traditional lossy data or image compression algorithms for WSNs lack an error-bound guarantee mechanism because reconstruction and data decompression are computationally demanding. Even though the error-bound mechanism adds computational demands, it remains vital, as some applications require that the quality of the reconstructed images be measured and guaranteed to stay within an acceptable range. Therefore, the authors of [63] proposed an algorithm focusing on the following areas in data compression:
• An error-bound-guaranteed data compression technique that is low-cost on both compression and decompression, using only sigmoid and linear operations.
• The solution was customized to support both temporal and spatial compression, a key feature that is lacking in most conventional methods.
• Some level of security is obtained for free, because data recovery requires an offline-learned decompression dictionary.

A. NEURAL AUTOENCODERS (AE)
Autoencoders are a deep learning model belonging to the family of artificial neural networks. Artificial neural networks have been widely used in the development of WSN solutions because they capture non-linear data structures and maximize sensing coverage. Autoencoders reduce dimensionality by transforming high-dimensional data into a lower-dimensional but meaningful representation. The key hyperparameters [88] to set before training an autoencoder are:
• Code size: the number of nodes or neurons in the hidden layer of the encoder. Fewer of them results in more compression, and more of them results in less compression [88].
• Number of layers: an autoencoder can be defined with any number of layers in its encoder and decoder.
• Number of nodes per layer: each layer in an autoencoder has one or more nodes; the number of nodes per layer decreases with each subsequent encoder layer and increases back through the decoder [88].
• Loss function: binary cross-entropy or MSE is adopted when training autoencoders, depending on the input data [88].
Even though the target output of an autoencoder always equals its input, the advantage is that the output is derived directly from the input data through activation functions such as the sigmoid. In their paper, the authors adopted autoencoders to address the following key technical challenges associated with data compression in WSNs:
• Learning the non-linear spatio-temporal correlations of WSN data.
• Enabling data compression and decompression at a low cost.
• Enforcing tolerable error-bound margins on data reconstruction.
• Minimizing the energy consumption of the WSN.
Autoencoders are three-layered neural networks mapping an input vector $d \in \mathbb{R}^L$ to a hidden representation $y \in \mathbb{R}^K$ and finally to an output vector $\hat{d} \in \mathbb{R}^L$ approximating the input stream d:
$$y = F(W_{enc}\, d + b_{enc}), \qquad \hat{d} = F(W_{dec}\, y + b_{dec})$$
where $\theta := [W_{enc}, b_{enc}, W_{dec}, b_{dec}]$ are the real-valued parameters to be learned by a suitable training algorithm, and $F(\cdot)$ is the sigmoid activation function. $W_{enc}$ and $b_{enc}$ are the encoding weight matrix and bias, and $W_{dec}$ and $b_{dec}$ are the decoding weight matrix and bias. The weights determine the significance of the input vectors to the network with reference to the expected output data. Figure 10 illustrates an autoencoder as this three-layered neural network. To learn the optimal neural weights θ from the training data D, a cost function for the standard AE was defined using (8):
$$W_{AE}(\theta, D) = \frac{1}{N}\sum_{i=1}^{N}\|\hat{d}_i - d_i\|^2 + \alpha\|W\|^2 \qquad (8)$$
where $\|W\|^2$ is the sum of squares of the entries of the weight matrices, and $\alpha$ is a hyperparameter, selected a priori, controlling the contribution of the weight-decay term. Sparse Autoencoder (SAE): used to enforce sparsity of the data representation in the hidden layer, so that the entries of y are as close to zero as possible. It is represented by (10), where the addition of the Kullback-Leibler (KL) divergence function enables the sparsity; the KL divergence itself is given by (11):
$$W_{SAE}(\theta, D) = W_{AE}(\theta, D) + \beta\sum_{k=1}^{K} KL(\rho\,\|\,\hat{\rho}_k) \qquad (10)$$
$$KL(\rho\,\|\,\hat{\rho}_k) = \rho\log\frac{\rho}{\hat{\rho}_k} + (1-\rho)\log\frac{1-\rho}{1-\hat{\rho}_k} \qquad (11)$$
where $\beta$ is a hyperparameter, $\rho$ the target activation close to zero, and $\hat{\rho}_k$ the average activation of the k-th node in the hidden layer.
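The following NumPy sketch renders the AE mapping and a WAE-style cost in code. It is an illustrative forward pass only: the dimensions (L = 16, K = 4), the weight initialization, and the alpha value are assumptions made here, and the training loop used in [63] (L-BFGS, discussed below) is omitted:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def ae_forward(d, W_enc, b_enc, W_dec, b_dec):
    """Encode an input vector d (length L) into a shorter code y (length K),
    then decode y back into an approximation d_hat of d."""
    y = sigmoid(W_enc @ d + b_enc)        # compressed representation (transmitted)
    d_hat = sigmoid(W_dec @ y + b_dec)    # reconstruction at the receiver
    return y, d_hat

def wae_cost(D, W_enc, b_enc, W_dec, b_dec, alpha=1e-4):
    """Average reconstruction error plus a weight-decay term, in the spirit of (8)."""
    err = np.mean([np.sum((ae_forward(d, W_enc, b_enc, W_dec, b_dec)[1] - d) ** 2)
                   for d in D])
    return err + alpha * (np.sum(W_enc ** 2) + np.sum(W_dec ** 2))

L, K = 16, 4                                         # illustrative 4:1 compression
rng = np.random.default_rng(1)
W_enc, b_enc = rng.normal(0, 0.1, (K, L)), np.zeros(K)
W_dec, b_dec = rng.normal(0, 0.1, (L, K)), np.zeros(L)
D = [rng.uniform(0, 1, L) for _ in range(5)]         # toy normalized sensor vectors
print(wae_cost(D, W_enc, b_enc, W_dec, b_dec))
```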

B. LOSSY COMPRESSION WITH ERROR BOUND GUARANTEE
The proposed algorithm applies an autoencoder to represent the captured data with fewer bits, achieving dimensionality reduction and data compression in WSNs. Collection of compressed data within tolerable error margins is enabled in three main steps: collecting historical data with the sensor nodes, offline modelling and training at the base station (BS), and online spatial or temporal data compression. Figure 11 shows a flowchart derived from the algorithm proposed by the authors.

1) MISSING DATA IMPUTATION
In WSNs, missing data can occur due to unsynchronized sampling of the sensors, interference, and communication failure. To address this problem, a naïve method with low computational demand, given by (12), was used to estimate a missing entry $x_{ij}$:
$$x_{ij} = \mu_j + \frac{1}{|S|}\sum_{k \in S}(x_{ik} - \mu_k) \qquad (12)$$
where j and i are the sensor and time indices, $x_{ij}$ is the missing entry in the aligned data matrix, S is the set of sensors observed at time i, and $\mu_j$ is the mean of the observed readings of sensor j.
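A possible rendering of this imputation in code, under the reading of (12) reconstructed above (the sensor's own mean plus the average deviation-from-mean of the sensors observed at that time step), is:

```python
import numpy as np

def impute_missing(X):
    """Estimate NaN entries of a (time x sensor) matrix following (12):
    column mean of sensor j plus the average deviation of the sensors
    observed at time i."""
    X = np.asarray(X, dtype=np.float64).copy()
    missing = np.isnan(X)
    mu = np.nanmean(X, axis=0)                  # per-sensor means over time
    for i, j in zip(*np.where(missing)):
        obs = ~missing[i]                       # sensor set S observed at time i
        dev = np.mean(X[i, obs] - mu[obs]) if obs.any() else 0.0
        X[i, j] = mu[j] + dev
    return X

X = np.array([[21.0, 22.0, np.nan],
              [21.5, np.nan, 20.5],
              [20.5, 21.0, 19.5]])
print(impute_missing(X))
```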
2) DATA SPHERING
The output vectors of AEs lie between 0 and 1. Because the AE tries to reconstruct the input vector d, the input data had to be normalized before being fed into the AE and denormalized after the compression process, a process named data sphering in the literature. In addition, the AE works with input data vectors that are approximately uniformly distributed close to a unit sphere in $\mathbb{R}^L$. Normalization of the input data set and denormalization of the output data are represented by (13) and (14), respectively, where x is the source data vector, m is the mean value of x, σ is the standard deviation of x − mean(x) over the training data, d denotes the data fed to the AE network, and p is the regeneration of the input data x from the output vector $\hat{d}$ of the AE.
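The following sketch shows one consistent sphering/desphering pair. The 6σ scaling (mapping roughly ±3σ into (0, 1)) is an assumption made here for illustration; the exact affine map used in [63] is the one defined by its equations (13) and (14):

```python
import numpy as np

# Assumed scaling: map readings into (0, 1) via a 3-sigma range. Any fixed,
# invertible affine map shared by transmitter and receiver works the same way.
def sphere(x, m, sigma):
    """Normalize a source vector x before feeding the AE (cf. (13))."""
    return (x - m) / (6.0 * sigma) + 0.5

def desphere(d_hat, m, sigma):
    """Invert the normalization on the AE output (cf. (14))."""
    return (d_hat - 0.5) * 6.0 * sigma + m

x = np.array([21.3, 22.1, 20.8, 23.0])
m, sigma = x.mean(), x.std()
assert np.allclose(desphere(sphere(x, m, sigma), m, sigma), x)  # exact round trip
```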

3) THE ERROR BOUND MECHANISM
The error bound is tuned by considering various factors, including the precision of the sensor used and the requirements of the application. It is the maximum allowable difference between the readings captured by the sensor and those recovered by the receiver from the compressed representation. In the error-bound mechanism, the residual r = x − p is first computed from the reconstruction p of the input x; the entries of r whose magnitude exceeds the error bound are then transmitted using the residual code in (15). L-BFGS is adopted to minimize the cost function $W_{AE}(\theta, D)$ when learning the optimal weights θ of the autoencoder. This computationally intensive process happens once, at the beginning of the network deployment, with the parameters θ and σ distributed to the receivers and transmitters as part of training.
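A minimal sketch of the error-bound mechanism follows: the transmitter compares its local reconstruction p against the true readings x and ships explicit (index, residual) corrections for any entry outside the bound, so the receiver's final output is guaranteed within eps. The exact residual encoding of (15) is not reproduced here:

```python
import numpy as np

def residual_code(x, p, eps):
    """Transmitter side: entries whose reconstruction error exceeds the bound
    eps are sent explicitly as (index, residual) pairs alongside the code."""
    r = x - p                                   # residual of reconstruction p
    idx = np.where(np.abs(r) > eps)[0]
    return list(zip(idx.tolist(), r[idx].tolist()))

def apply_residuals(p, pairs):
    """Receiver side: patch the reconstruction so every entry is within eps."""
    p = p.copy()
    for i, ri in pairs:
        p[i] += ri
    return p

x = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([1.05, 2.5, 2.95, 3.2])
patched = apply_residuals(p, residual_code(x, p, eps=0.1))
assert np.all(np.abs(x - patched) <= 0.1)       # error bound now guaranteed
```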

V. EVALUATION AND DISCUSSIONS OF THE ALGORITHM
A. EVALUATION METRICS ADOPTED
In evaluating the algorithm, the metrics used were the compression ratio (CR), root mean square error (RMSE), and coefficient of determination ($R^2$), represented by (16), (17), and (18):
$$CR = 1 - \frac{B(\hat{x})}{B(x)} \qquad (16)$$
$$RMSE = \sqrt{\frac{1}{N}\sum_{n=1}^{N}(x_n - \hat{x}_n)^2} \qquad (17)$$
$$R^2 = 1 - \frac{\sum_{n}(x_n - \hat{x}_n)^2}{\sum_{n}(x_n - \bar{x})^2} \qquad (18)$$
where $B(x)$ and $B(\hat{x})$ are the numbers of bits used to represent the source and transmitted data, respectively. The root mean square error measures the compression error; an RMSE of zero implies full regeneration of the WSN data without error.
The coefficient of determination gives the fraction of the source data that is regenerated from the compressed data. For example, $R^2 = 0.6$ implies that 60% of the input data x is regenerated in $\hat{x}$; a full reconstruction of the source data is achieved when $R^2 = 1.0$.
In summary, $R^2$ and RMSE measure reconstruction fidelity, while CR measures compression efficiency.
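The three metrics translate into short functions; the CR definition below (fraction of bits saved, so higher is better) follows the reconstruction of (16) above:

```python
import numpy as np

def rmse(x, x_hat):
    """Root mean square error, per (17); 0 means error-free regeneration."""
    return float(np.sqrt(np.mean((np.asarray(x) - np.asarray(x_hat)) ** 2)))

def r_squared(x, x_hat):
    """Coefficient of determination, per (18); 1.0 means full reconstruction."""
    x, x_hat = np.asarray(x, float), np.asarray(x_hat, float)
    ss_res = np.sum((x - x_hat) ** 2)
    ss_tot = np.sum((x - x.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def compression_ratio(bits_source, bits_transmitted):
    """Fraction of bits saved, per (16); higher means stronger compression."""
    return 1.0 - bits_transmitted / bits_source
```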

B. DATASETS
Meteorological datasets from the Grand-St-Bernard and LUCE deployments [90] were used to evaluate the data compression algorithms on RMSE, CR, and $R^2$. The 10-fold cross-validation methodology [91] was adopted to train the autoencoders, with the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) [92] optimization algorithm used to tune the AE's weights during learning. Each dataset was randomized into 10 folds: nine folds were used for training, and the remaining fold was used for testing the algorithm. The datasets contain both temporal and spatial information. Figure 12 shows the RMSE against learning iterations during training of the autoencoder before online testing. As shown in Figure 12, the best training performance of the autoencoder is 0.017646 after 21 iterations, corresponding to an RMSE of 0.132838.

C. BASELINES
Most data compression algorithms in the literature lack an error-bound mechanism. Therefore, to evaluate the algorithm proposed by the authors in [63], the error-bound mechanism was set aside during the first evaluation stage on RMSE, compression ratio, and coefficient of determination. Firstly, the different AE model variants were evaluated to determine how they relate in terms of RMSE at various compression ratios, as demonstrated in Figure 13. Secondly, the algorithms were evaluated on spatial compression using the Grand-St-Bernard datasets without any error-bound mechanism, to accommodate conventional algorithms such as DCT, the Fast Fourier Transform (FFT), CS, and Principal Component Analysis (PCA) [93], [86], which do not have an error-bound mechanism. The relationship between RMSE, CR, and $R^2$ for the five algorithms is illustrated in Figure 14. Lastly, the temporal compression of the algorithm in [63] is compared with the Lightweight Temporal Compression (LTC) algorithm, known in the literature as one of the algorithms with an error-bound mechanism [93]. The LUCE dataset was used for the simulations and experiments in the temporal compression scenario. Although the WAE and SAE variants are useful for classification purposes, they reduce the regeneration performance, as shown in Figure 13. The basic autoencoder, without any anti-overfitting mechanisms, provides the best performance compared to the other two AE variants. This is because there are fewer neurons in the middle (code) layer than in the input layer; overfitting problems emerge when the code layer has more neurons than the input vector, and it is in that scenario that the WAE and SAE variants become more useful than the basic AE.
Figure 14 illustrates the reconstruction fidelity on spatial compression without any error-bound mechanism. The results show that adopting AEs in WSNs improves the RMSE at various compression ratios; the proposed algorithm in [63] thus continues to show promising results compared to the conventional methods of PCA, CS, FFT, and DCT. Moreover, in Figure 15, the proposed algorithm demonstrates promising results, with higher reconstruction fidelity than the other compared algorithms at various compression ratios.

An analysis of the temporal compression was carried out using the LUCE deployment [90]. Two methods were evaluated: the method proposed in [63] and the LTC method in [93]; both provide an error-bound guarantee. Accordingly, in Figure 16, the compression error is measured at various error-bound hyperparameters. The results demonstrate that the proposed algorithm, which adopts the AE model for data compression, performed better than the LTC method. However, the two methods show a similar compression-ratio response at various error bounds, as demonstrated in Figure 17.

VI. CONCLUSION AND FUTURE DIRECTION
This paper reviewed and analyzed image compression techniques and approaches in WSNs. Available image compression approaches in WSNs in the literature were classified according to the image compression technique adopted, together with their strengths and weaknesses. In addition, a rate-distortion balanced data compression algorithm with an error-bound mechanism, based on an artificial neural network (ANN) in the form of an autoencoder (AE), was coded and simulated in MATLAB, then evaluated and compared to conventional approaches. The experimental results show that the simulated algorithm achieves lower root mean square error (RMSE) and higher coefficient of determination ($R^2$) values at variable compression ratios compared to Principal Component Analysis (PCA), the Discrete Cosine Transform, and the Fast Fourier Transform (FFT) when using the Grand-St-Bernard meteorological dataset. It was also found that although several data and image compression algorithms exist in the literature, they lack the error-bound mechanism needed to balance compression ratio against distortion. Therefore, the analyzed algorithm provides a significant approach to data compression that can be applied to image compression to conserve energy and extend network lifetime without compromising the quality of the reconstructed data or image.

ABID YAHYA (Senior Member, IEEE) received the bachelor's degree in electrical and electronic engineering majoring in telecommunication from the University of Engineering and Technology Peshawar, Pakistan, and the M.Sc. and Ph.D. degrees in wireless and mobile systems from Universiti Sains Malaysia, Malaysia. He began his career on an engineering path, which is rare among other researcher executives. Currently, he is working at the Botswana International University of Science and Technology. He is also a Registered Professional Engineer with the Botswana Engineers Registration Board (ERB). He has many research publications to his credit in numerous reputable journals, conference articles, and book chapters. He has received several awards and grants from various funding agencies and supervised several master's and Ph.D. students.