Water Quality Prediction for Smart Aquaculture Using Hybrid Deep Learning Models

Water quality prediction (WQP) plays an essential role in water quality management for aquaculture, helping to make aquaculture production profitable and sustainable. In this work, we propose hybrid deep learning (DL) models that combine a convolutional neural network (CNN) with long short-term memory (LSTM) and gated recurrent unit (GRU) networks for aquaculture WQP. The CNN can effectively extract the aquaculture water quality characteristics, whereas GRU and LSTM can learn long-term dependencies in the time series data. We conduct experiments using two different water quality datasets and present an extensive study on the impact of hyperparameters on the performance of the proposed hybrid DL models. Furthermore, the performance of the hybrid CNN-LSTM and CNN-GRU models is compared with baseline LSTM, GRU and CNN DL models and also with attention-based LSTM and attention-based GRU DL models. The results show that the hybrid CNN-LSTM outperformed all other models in terms of prediction accuracy and computation time.


I. INTRODUCTION
Aquaculture is a vital component for ensuring global food security, as the world's population is predicted to exceed 9.8 billion by 2050 [1]. Fishing from the sea is currently far beyond its limits and will not be in a position to meet the growing demand for food. Aquaculture has progressed steadily to meet the demand for fish as food while protecting marine life from overfishing by maintaining a consistent fish supply [2]. Nowadays, with the application of IoT, artificial intelligence, data analysis, etc., aquaculture systems are upgraded to smart aquaculture systems for improving performance and efficiency [3], [4].
To maintain and manage aquaculture, water quality monitoring and water quality prediction (WQP) are essential [5]. In aquaculture, water quality is the most influential and critical factor that affects production as well as product quality. The vital water quality parameters in aquaculture are salinity, pH, dissolved oxygen (DO), and temperature [6].
The associate editor coordinating the review of this manuscript and approving it for publication was Baozhen Yao.
Water quality is affected by many factors, such as fish density, quality of the feed, feeding interval, climate, and more. The change in water quality will upset the balance of the system with algae bloom, bacterial growth, etc. [7]. This can lead to severe problems, such as trigger stress, lack of food intake, vulnerability to diseases, and increased mortality rate of fish [8]. Hence, if we forecast the trend in water quality changes, we can employ safeguards in advance to avoid imbalances in the ecosystem and also ensure suitable conditions for optimum growth of the fish. Therefore, accurate WQP can drastically improve productivity to make aquaculture more profitable and sustainable.
Aquaculture water quality is affected by meteorological conditions and by complex interdependencies between different water quality parameters. Hence the changes in water quality parameters exhibit non-linear characteristics, which results in low prediction accuracy. The time-series data shows periodic variations depending on the seasons and climatic conditions. Time series data is used for analysis and prediction in various fields such as the stock market, medicine, energy consumption, weather forecasting, and solar radiation [9]-[14]. The major approaches for WQP include classical prediction mechanisms such as the autoregressive moving average (ARMA) model, autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA), seasonal autoregressive integrated moving average with exogenous factors (SARIMAX), Holt-Winters exponential smoothing (HWES), the Markov model, the Grey model, and support vector regression (SVR) [15]. However, these models are poorly suited to aquaculture WQP because they assume a linear relationship in the water quality parameter data. They are less accurate and take a long prediction time, making them unsuitable for predicting the non-linear aquaculture water quality parameters [16], [17].
Deep learning (DL) models such as long short-term memory (LSTM), gated recurrent unit (GRU), and convolutional neural network (CNN) have the flexibility to capture the non-linear nature of aquaculture water quality data [18]. These DL models have overcome the limitations of recurrent neural networks (RNNs) and achieved great success in various applications. Recent advancements in artificial intelligence have made DL one of the most proficient methods, and DL is widely applied in fields such as image processing, speech recognition, and text prediction [19]-[21]. DL makes time series prediction more precise and efficient in terms of training time and required processing power. In addition, DL models take into account the non-linear characteristics of water quality changes. LSTM is the most widely used DL method for WQP due to its remarkable performance in time series prediction [22].
However, LSTM and GRU models are not very capable of maintaining long-term memory for time series prediction, especially for long sequences [23]-[25]. In time series forecasting, the predicted value at the current time step is influenced mainly by past observations. In some cases, the observations with a significant influence may have appeared many time steps before the current one. Hence the neural network must be able to hold the learned long-term dependencies in memory for a prolonged period. The ability of LSTM models to capture long-term dependency information from historical observations is still considered a critical performance bottleneck, as indicated in recent studies [26], [27]. The work [23] has theoretically proven that the standard LSTM does not have long memory from a statistical perspective. Hence, building a time-series prediction model capable of capturing and remembering complex dependencies is a crucial problem that needs to be addressed.
In [25], [28], the authors have discussed how incorporating an attention module can improve the prediction accuracy of LSTM. Different works have combined the attention model to LSTM to capture the relevant data from long time series data, helping to improve its prediction performance. LSTM network with an attention mechanism has an adaptive decay rate of long-term memory. This decay rate is much lower than the polynomial or exponential decay rate. In the work [29], the reading of the geo-sensor is predicted for a few hours by using multi-level attention-based RNN by feeding data from multiple sensors and meteorological data along with spatial information of these sensors. This work uses the attention mechanism to model the dynamic spatio-temporal relations within the sensors. A fusion module is designed to input the effects of external factors from different domains. The authors have tested their model on a water quality and air quality dataset.
The hybrid CNN-LSTM models work similarly, helping the LSTM model capture the relevant data. In hybrid models, a CNN with max pooling is used to reduce the length of the input sequence, which is then fed into the LSTM and GRU models. The hybrid models are not affected by exponential decay, as the CNN with max pooling passes on the key features and helps the LSTM forecast with better accuracy. Thus the hybrid models can achieve higher prediction accuracy. Hybrid models aim to be neither overfit nor underfit, with strong generalization ability, and to require less training time for fitting. The convolutional layer with max pooling enhances the capacity of the LSTM and GRU models to learn and store the critical features in the non-linear water quality parameters. Thus the convolution layer with max pooling reduces exponential decay in LSTM and GRU by passing selective information to the LSTM/GRU layer [30], [31]. CNNs are faster by design since their computations can happen in parallel, while RNNs must be processed sequentially since each step depends on the previous ones. In [32], the authors introduced Quasi-Recurrent Neural Networks, which use some CNN components to imitate RNNs while speeding them up. CNN reduces complexity by concentrating on the key features. The convolution layers reduce the size of the tensors, and max pooling reduces it further, which reduces the training time of the hybrid models.
CNN and its hybrid variations are used in many research works for time series prediction. DL models have been extensively used in many fields of time series prediction, including finance (stock price, cryptocurrency, and precious metal prediction) [33], [34]. The authors of [35] have used the CNN-LSTM model for gold price prediction. They have proposed two simple hybrid models and compared the results with other DL models. The authors also compared LSTM models with the hybrid models and observed that the hybrid models perform better. In [36], the authors have used multiple layers of CNN along with LSTM to predict blood glucose (BG) levels. This hybrid model has shown superior performance in BG prediction. In another work, the authors have used Bi-LSTM with CNN for air quality prediction [37]. They have done an extensive study with two datasets and shown results proving the model's capability for accurate air quality prediction. There have been multiple works on hybrid DL models based on CNN and LSTM for the prediction of air quality (PM2.5) [38], [39].

A. MAIN CONTRIBUTIONS OF THE WORK
In this work, we propose to combine RNN and CNN, making a new hybrid model for WQP that has the advantages of both models. RNN has a vanishing gradient problem when training with large data, which affects learning from large datasets [40]. This problem is solved by introducing two specialised variants, LSTM and GRU. LSTM and GRU can learn long-term dependencies from training data, while CNN is good at feature extraction [41]. We propose hybrid DL models that combine the advantages of CNN with LSTM and GRU for aquaculture WQP. The significant contributions of this research work are as follows:
1) Three years of data (January 2016 to December 2018) is collected from aquaculture ponds located in Kerala under ADAK. The data is pre-processed to remove abnormalities that affect the prediction accuracy. The linear interpolation (LIN) and smoothing methods are used to fill the missing data and correct abnormal data, respectively.
2) We have proposed hybrid DL models, CNN-LSTM and CNN-GRU, that combine the advantages of CNN with both LSTM and GRU for aquaculture WQP. The proposed hybrid models are trained and tested with the data collected from ADAK.
3) We have also trained and tested the proposed CNN-LSTM and CNN-GRU models with another water quality parameter dataset provided in [16]. This data is collected from the marine aquaculture base in Xincun Town, LingShui County, Hainan Province, China.
4) The performance of the hybrid CNN-LSTM and CNN-GRU models is compared with the baseline DL models LSTM, GRU, and CNN and with attention-based LSTM and attention-based GRU DL models. Results show that the hybrid models significantly improve accuracy compared to the baseline models for aquaculture WQP. Also, the hybrid models outperform all the DL models in terms of computation time.
To the best of the authors' knowledge, this is the first research work to propose and analyse the performance of hybrid CNN-LSTM and CNN-GRU models to predict water quality parameters for aquaculture.

B. PAPER ORGANIZATION
The remainder of the paper is arranged as follows. Section II describes the materials and methods and explains in detail the data that we have collected. Section III explains the proposed hybrid CNN-LSTM and CNN-GRU models. Section IV explains in detail the experiments that we have done to analyse and compare the performance of the proposed hybrid DL models with the baseline DL models for different hyperparameter ($h_p$) variations. Section V further analyses the performance of all the DL models for a selected set of $h_p$ and gives the final results comparing the models. Finally, Section VI summarises the results and concludes the work.

II. MATERIALS AND METHODS
A. ACQUISITION OF DATA
The aquaculture water quality parameter data used in this work is collected from an aquaculture farm under ADAK at Kollam, Kerala, India. We have collected water quality parameter data for three years, from January 2016 to December 2018. Water quality parameters are collected from the pond on a daily basis. The trends in variations of water quality parameters of the aquaculture pond are studied. Fig. 1 shows the annual trends of salinity, pH, DO, and temperature. From the plots, it is clear that all water quality parameters are non-stationary in nature. Table 1 .68 °C. The mean DO is at its minimum in 2018, when the mean temperature is also highest, which indicates the relation between DO and temperature: when temperature increases, DO decreases. In all three years, the average salinity value comes down in June-July, coinciding with the monsoon season in Kerala.

B. CORRELATION ANALYSIS
Correlation analysis is performed on the data from January 2016 to December 2018 to analyse the relationship between the water quality parameters (salinity, pH, DO and temperature). Pearson's correlation coefficient shows the correlation between the variables. A correlation coefficient of more than 0.5 (in absolute value) indicates a direct correlation and a significant interdependence between the parameters. A value between 0.2 and 0.5 demonstrates a correlation with interdependence to some extent. A value of less than 0.1 suggests a poor correlation and an insignificant interdependence between the parameters. The data is standardised to eliminate dimensional influence, and then the Pearson correlation coefficient matrix is computed. Table 2 shows the resulting correlation coefficient matrix for salinity, pH, DO, and temperature. The findings in Table 2 show that temperature and salinity have a significant influence on DO. Moreover, pH does not have any notable impact on the other water quality parameters monitored here.
Temperature and DO show the highest correlation, whereas temperature and pH show the lowest. The pH has an insignificant correlation with salinity, DO and temperature. DO has its highest correlation with temperature, followed by salinity, and an insignificant correlation with pH. Similarly, temperature has a high correlation with DO, a moderate correlation with salinity, and an insignificant correlation with pH.
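The standardise-then-correlate procedure described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the function names and the sample readings are hypothetical and are not taken from the ADAK dataset or from the authors' code.

```python
import math

def standardise(values):
    """Return z-scores (zero mean, unit population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]

def pearson(x, y):
    """Pearson correlation coefficient as the mean product of z-scores."""
    zx, zy = standardise(x), standardise(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)

def correlation_matrix(columns):
    """columns maps a parameter name to a list of readings of equal length."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up readings for illustration only (not from the ADAK dataset).
sample = {
    "temperature": [24.1, 24.8, 25.3, 26.0, 26.4],
    "DO": [5.45, 5.38, 5.33, 5.24, 5.20],
}
matrix = correlation_matrix(sample)
print(round(matrix["temperature"]["DO"], 3))  # strongly negative for this toy data
```

With real data, the same matrix would normally be computed over all four parameters at once; a negative sign here simply reflects that DO falls as temperature rises.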

C. DATA PRE-PROCESSING
Aquaculture water quality parameters data measured from the ponds may have anomalies such as missing data or abnormal data. This can be due to problems with sensors or possible errors while storing the data. These variations will lead to excessive deviation of predicted values from actual monitored values. In order to improve the accuracy of prediction, we must provide predictive models with clean, reliable and succinct data. Unavailable data can no longer be retrieved, and missing data can only be computed as accurately as possible. Throughout this work, the missing part of the water quality parameters data is first filled using LIN algorithm [42], [43], and then this data is used to forecast the water quality parameters. As illustrated in Fig. 1, the water quality parameters show continuous seasonal variation and also exhibit correlation with time.
Any data recorded at definite time intervals is time-series data. The water quality parameter dataset is therefore also time-series data: an ordered set of measured values of the water quality parameters at regular intervals. Water quality parameters are measured at a specified time every day.
The time series of the ith water quality parameter is written as

$$TS_{i,n} = \{y_{i,1}, y_{i,2}, \ldots, y_{i,n}\} \quad (1)$$

In (1), $TS_{i,n}$ is defined as the n-length time series of the ith water quality parameter, where $y_{i,v}$ is the value measured at time $T_v$. The sampling interval for all parameters is one day, i.e., all parameters are measured at the same time on a daily basis. If the value $y_{i,v}$ at $T_v$ is missing, we can obtain its approximate value using the LIN algorithm. The LIN function can be constructed as:

$$Y_{i,v} = y_{i,u} + \frac{y_{i,w} - y_{i,u}}{T_w - T_u}\,(T_v - T_u) \quad (2)$$

When a water quality value is missing at any point, the LIN algorithm first finds the two closest moments with known values, represented as $T_u$ and $T_w$. It then calculates the missing value at the moment $T_v$ from the parameter values $y_{i,u}$ and $y_{i,w}$ at the moments $T_u$ and $T_w$, respectively, based on (2), where $Y_{i,v}$ is the estimate of the missing value $y_{i,v}$.
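The LIN fill described above can be sketched as follows. This is a minimal illustration, assuming missing readings are marked with `None` and that gaps at the very start or end of the series (which have only one known neighbour) are left unfilled; the function name and the sample values are hypothetical.

```python
def fill_missing_linear(times, values):
    """Fill None entries by linear interpolation between the nearest known readings.

    times  : strictly increasing sample times T_1..T_n
    values : readings y_1..y_n, with None marking a missing reading
    """
    filled = list(values)
    known = [k for k, v in enumerate(values) if v is not None]
    for v_idx, value in enumerate(values):
        if value is not None:
            continue
        before = [k for k in known if k < v_idx]
        after = [k for k in known if k > v_idx]
        if not before or not after:
            continue  # no neighbour on one side: leave the gap as it is
        u, w = before[-1], after[0]  # closest known moments T_u and T_w
        slope = (values[w] - values[u]) / (times[w] - times[u])
        filled[v_idx] = values[u] + slope * (times[v_idx] - times[u])
    return filled

# Made-up daily salinity readings with a two-day gap.
days = [1, 2, 3, 4, 5]
salinity = [20.0, None, None, 23.0, 23.5]
print(fill_missing_linear(days, salinity))  # [20.0, 21.0, 22.0, 23.0, 23.5]
```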

D. DESCRIPTIVE STATISTICS
Descriptive statistics of the water quality parameter data used in this work, including minimum, mean, maximum, median, standard deviation, skewness and kurtosis, can be found in Table 3. Standard deviation is highest for salinity (0.7400 ppt) and temperature (0.6522 °C), while DO (0.0759 ml/L) and pH (0.0540) have the lowest standard deviation. The DO concentration ranged between 5.06 and 5.51 ml/L, with a mean of 5.31 ml/L. The minimum and maximum values of temperature are 22.99 and 27.27 °C, respectively (with a mean of 24.89 °C). The skewness and kurtosis for salinity, pH, DO, and temperature are provided in Table 3. The skewness and kurtosis were within −2 to +2, indicating that these water quality parameters follow a normal (Gaussian) distribution.

III. PROPOSED HYBRID CNN-LSTM/GRU MODEL
A. LSTM DL NEURAL NETWORK
The LSTM DL neural network was proposed in 1997 by the authors of [44] to avoid the long-term dependency problem in RNN. In LSTM, a control unit is introduced to store information, unlike the hidden layer in RNN. The hidden state is divided into memory cells $c_t$ and working memory $h_t$. $c_t$ is responsible for retaining sequence features, and the memory of the previous sequence is controlled by the forget gate $f_t$. The portion of the current memory $c_t$ that is exposed is controlled by the output gate $o_t$, and $h_t$ is used as the output. The input gate $i_t$ controls how the previous state $h_{t-1}$ and the current input $x_t$ are written to the memory cells. The LSTM architecture [45] is shown in Fig. 2, and the memory units are defined as follows in (3):

$$\begin{aligned}
f_t &= \sigma(w_f [h_{t-1}, x_t] + b_f) \\
i_t &= \sigma(w_i [h_{t-1}, x_t] + b_i) \\
o_t &= \sigma(w_o [h_{t-1}, x_t] + b_o) \\
\tilde{c}_t &= \tanh(w_c [h_{t-1}, x_t] + b_c) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t \\
h_t &= o_t \odot \tanh(c_t)
\end{aligned} \quad (3)$$

In this architecture, $f_t$, $i_t$, and $o_t$ are the forget, input and output gate layers, respectively, $\tilde{c}_t$ and $c_t$ are the new and final memory cells, $w$ denotes the weight matrices, $b$ the bias vectors, and $\sigma$ is the sigmoid activation function.
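To make the gate interactions concrete, the following is a minimal sketch of a single LSTM step in plain Python. It assumes the standard LSTM formulation and, for readability, a scalar input and a scalar state (real layers use vectors and weight matrices). The parameter layout `p` and the weight values are arbitrary illustrations, not the authors' trained model.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step for a scalar input and scalar state.

    p maps each gate name to (weight on h_prev, weight on x_t, bias).
    """
    def pre(name):
        w_h, w_x, b = p[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))              # forget gate: how much old memory to keep
    i_t = sigmoid(pre("i"))              # input gate: how much new memory to write
    o_t = sigmoid(pre("o"))              # output gate: how much memory to expose
    c_tilde = math.tanh(pre("c"))        # new (candidate) memory
    c_t = f_t * c_prev + i_t * c_tilde   # final memory cell
    h_t = o_t * math.tanh(c_t)           # working memory / output
    return h_t, c_t

# Arbitrary illustrative weights; a trained model would learn these.
params = {name: (0.5, 0.5, 0.0) for name in ("f", "i", "o", "c")}
h, c = 0.0, 0.0
for x in [0.2, 0.4, 0.6]:  # a short, made-up input sequence
    h, c = lstm_step(x, h, c, params)
print(h, c)
```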

B. GRU DL NEURAL NETWORK
The GRU DL neural network was proposed by the authors of [46] in 2014. It is similar to LSTM but requires less computing power. GRU is an improved version of RNN with only two gates, an update gate and a reset gate. There are no additional memory cells to store information; GRU controls information inside the unit. The update gate decides whether to pass the previous output $h_{t-1}$ to the next cell. When the reset gate is set to zero, the unit reads the input sequence and forgets the previously calculated state. As a result, GRU has fewer tensor operations than LSTM and typically runs faster. The GRU architecture [45] is shown in Fig. 3, and the memory units are defined as follows:

$$\begin{aligned}
z_t &= \sigma(w_z [h_{t-1}, x_t]) \\
r_t &= \sigma(w_r [h_{t-1}, x_t]) \\
\tilde{h}_t &= \tanh(w [r_t \odot h_{t-1}, x_t]) \\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}$$

In this architecture, $z_t$ is the update gate and $r_t$ is the reset gate, $\tilde{h}_t$ is the candidate state computed from the processed input, $h_t$ is the updated hidden state, $w$ denotes the weight matrices, and $\sigma$ is the sigmoid activation function.
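A single GRU step can be sketched in the same way. The sketch below assumes the common GRU formulation without bias terms, a scalar input and a scalar state, and the convention in which the update gate weights the candidate state; other references swap the roles of the two terms in the final line. The names and weights are illustrative only.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(x_t, h_prev, p):
    """One GRU step for a scalar input and scalar state.

    p maps "update", "reset" and "candidate" to (weight on h, weight on x_t).
    """
    w_h, w_x = p["update"]
    update_gate = sigmoid(w_h * h_prev + w_x * x_t)
    w_h, w_x = p["reset"]
    reset_gate = sigmoid(w_h * h_prev + w_x * x_t)
    w_h, w_x = p["candidate"]
    # A reset gate near zero makes the candidate ignore the previous state.
    h_tilde = math.tanh(w_h * (reset_gate * h_prev) + w_x * x_t)
    # The update gate blends the previous state with the candidate state.
    return (1.0 - update_gate) * h_prev + update_gate * h_tilde

# Arbitrary illustrative weights; a trained model would learn these.
params = {name: (0.5, 0.5) for name in ("update", "reset", "candidate")}
h = 0.0
for x in [0.2, 0.4, 0.6]:  # a short, made-up input sequence
    h = gru_step(x, h, params)
print(h)
```

Compared with the LSTM step, there is no separate memory cell and one gate fewer, which is where the reduction in tensor operations comes from.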

C. HYBRID CNN LSTM/GRU DL NEURAL NETWORK
The Hybrid CNN-LSTM and CNN-GRU DL neural network structures are shown in Fig. 4 and Fig. 5. The two models are the hybrid of CNN with LSTM and CNN with GRU. The first part of the model is CNN, to which the data is fed, and it extracts the features. There is a dropout layer after the convolution layer (Conv1D) and pooling layer (MaxPooling1D). The second part has the LSTM or GRU, followed by a dense layer to give the output.
In the CNN-LSTM model, we use Conv1D with 32 filters, a kernel size of 3, and ReLU as the activation function. This is followed by a MaxPooling1D layer with a pooling size of 2. The CNN layer extracts the features and, after pooling, its output is fed to the LSTM layer through the flatten layer. The LSTM layer then passes its output through the flatten layer to the dense layer, which produces the final prediction.
The CNN-GRU model uses Conv1D with 64 filters, a kernel size of 5, and ReLU as the activation function. This is followed by a MaxPooling1D layer with a pooling size of 4. The CNN layer extracts the features and, after pooling and a dropout of 0.2, its output is fed to the GRU layer. The GRU layer then passes its output to the dense layer, which produces the final prediction.
The structure of both CNN-LSTM and CNN-GRU is further detailed in Table 4. These models use a learning rate of 0.0008 and Adam for optimisation [47]. Here, MSE is used as the loss function. The CNN, LSTM, and GRU DL neural networks are implemented using Python, Keras and TensorFlow.
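To illustrate how the convolution and pooling stage shortens the sequence handed to the recurrent layer, the sketch below applies one convolution filter with ReLU and then max pooling in plain Python. It is not the Keras implementation used in the paper: it assumes a single filter with a hand-picked kernel (the paper's model learns 32 or 64 filters), no padding, stride 1, and non-overlapping pooling windows, and all names are hypothetical.

```python
def conv1d_relu(sequence, kernel, bias=0.0):
    """Single-filter 1D convolution (no padding, stride 1) followed by ReLU."""
    k = len(kernel)
    out = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        s = sum(w * x for w, x in zip(kernel, window)) + bias
        out.append(max(0.0, s))
    return out

def max_pool1d(sequence, pool_size):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(sequence[i:i + pool_size])
            for i in range(0, len(sequence) - pool_size + 1, pool_size)]

# A made-up input window of 10 time steps.
window = [float(v) for v in range(10)]
features = conv1d_relu(window, kernel=[0.25, 0.5, 0.25])  # kernel size 3
pooled = max_pool1d(features, pool_size=2)                # pooling size 2
print(len(window), len(features), len(pooled))  # 10 8 4
```

With a kernel size of 3 and a pooling size of 2, a window of 10 steps becomes 4 steps, so the recurrent layer unrolls over fewer steps per sample.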

D. MODEL IMPLEMENTATION
The water quality dataset is used to train the DL models to predict each water quality parameter. First, the dataset is divided into two parts: 80% for training and 20% for testing. The training dataset is used to develop the models, while the testing dataset is used to validate and compare the performance of the models developed. All input and output variables are scaled between 0 and 1 via minimum-maximum normalization, using the MinMaxScaler in the scikit-learn preprocessing library in Python [48]. The training and testing datasets are transformed into supervised learning frames with various array manipulation techniques. The time series sequences are converted to input-output pairs using the sliding window procedure. This transformed data is used to train the models for WQP.
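The split, scaling and sliding-window steps described above can be sketched as follows in plain Python. The paper uses scikit-learn's MinMaxScaler; this sketch re-implements the same min-max formula by hand so that it is self-contained. Fitting the scaling range on the training part only is an assumption (the text does not say), and the function names, the `step_size` argument and the sample series are hypothetical.

```python
def min_max_scale(values, lo=None, hi=None):
    """Scale values to [0, 1]; lo/hi default to the range of `values` itself."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi

def sliding_windows(series, window_size, step_size=1):
    """Turn a series into samples: window_size past values -> next step_size values."""
    inputs, targets = [], []
    last_start = len(series) - window_size - step_size
    for start in range(last_start + 1):
        end = start + window_size
        inputs.append(series[start:end])
        targets.append(series[end:end + step_size])
    return inputs, targets

# Made-up daily temperature readings.
series = [24.1, 24.3, 24.8, 25.0, 25.3, 25.1, 24.9, 25.4, 25.8, 26.0]

split = int(len(series) * 0.8)  # 80% training, 20% testing, in time order
train_part, test_part = series[:split], series[split:]

# Assumption: the scaling range is taken from the training part only.
train_scaled, lo, hi = min_max_scale(train_part)
test_scaled, _, _ = min_max_scale(test_part, lo, hi)

x_train, y_train = sliding_windows(train_scaled, window_size=3)
print(len(x_train), len(x_train[0]), len(y_train[0]))  # 5 3 1
```

Setting `step_size` above 1 yields multi-step targets from the same windows.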

IV. EXPERIMENTAL RESULTS AND DISCUSSIONS
This section analyses the performance of the hybrid models CNN-LSTM and CNN-GRU. We use two water quality datasets to conduct experiments and to analyse and evaluate the performance of the proposed models. Furthermore, by comparing the proposed models with other DL models, the prediction performance and effectiveness of the proposed models are validated.

A. DATASETS
1) ADAK WATER QUALITY DATASET
The first dataset used in this research work is the aquaculture water quality data that we have collected from aquaculture farms under ADAK, Kerala, which includes the data of water quality parameters such as salinity, pH, DO and temperature. The preprocessing and cleaning of this data was done as part of this work. We have three years of data with 1096 samples. The data is collected on a daily basis at the same time. This is a medium-size dataset compared to the MAC dataset used in this research work.

2) MAC WATER QUALITY DATASET
The second dataset is taken from another research work and has been made publicly available by its authors [16]. This data is collected from the marine aquaculture base in Xincun Town, LingShui County, Hainan Province, China. It also includes water quality parameters such as salinity, pH, DO, and temperature, and the data is collected every 5 minutes. This dataset has a total of 23200 samples collected over 80 days. This is a large dataset compared to the ADAK dataset.

B. EXPERIMENTAL SETUP
The experimental environment is a Microsoft Azure Virtual Machine with the following specifications: Intel(R) Xeon(R) 8272CL CPU @ 2.60 GHz, 32 GB RAM, Windows 10 (64-bit) operating system, and Visual Studio Code IDE. We have implemented the neural network models using Python 3.9.6, Keras 2.6.0 and TensorFlow 2.6.0.
The hybrid models are compared with CNN, LSTM, and GRU DL models. We use 80% of the data to train the model, and the remaining 20% is utilised to test the prediction accuracy. The most critical part of building a DL model is the tuning and optimisation of the hyperparameters ($h_p$). For the model to be most effective and efficient, we need to tune as many $h_p$ as possible. In this work, we have selected four $h_p$ for tuning: learning rate, epochs, window size and batch size. In each of the models, we have kept the other $h_p$, namely the number of layers and the number of neurons in each layer, fixed. A dropout of 0.2 is applied after the primary layers to avoid over-fitting, and ReLU is applied as the activation function. In addition, the optimiser adopted in all the DL models in this work is Adam.

C. MODEL PERFORMANCE METRICS
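The performance of the prediction models is evaluated using mean absolute error (MAE), mean square error (MSE), root mean squared error (RMSE) and mean absolute percentage error (MAPE), computed by the set of equations given below:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |A_i - Y_i|$$

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (A_i - Y_i)^2$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (A_i - Y_i)^2}$$

$$\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{A_i - Y_i}{A_i} \right|$$

where $A_i$ is the actual value of the ith sample, $Y_i$ is the predicted value of the ith sample, and $n$ is the number of samples.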
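The four metrics can be computed directly from their standard definitions, as in the following plain-Python sketch. MAPE is expressed here as a percentage, which is an assumption about the scaling used, and it is undefined when an actual value is zero. The function names and sample values are illustrative.

```python
import math

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - y) for a, y in zip(actual, predicted)) / len(actual)

def mse(actual, predicted):
    """Mean square error."""
    return sum((a - y) ** 2 for a, y in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root mean squared error, in the same units as the data."""
    return math.sqrt(mse(actual, predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - y) / a) for a, y in zip(actual, predicted)) / len(actual)

# Made-up actual and predicted DO values, for illustration only.
actual = [5.30, 5.25, 5.40]
predicted = [5.28, 5.30, 5.35]
print(mae(actual, predicted), rmse(actual, predicted), mape(actual, predicted))
```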

D. EXPERIMENTS
This work aims to improve the accuracy of aquaculture WQP and reduce the computation time. We have done five sets of experiments. Four sets are done by varying each of the $h_p$, and the fifth experiment is done by changing the step size.
We have plotted learning rate, epochs, window size, and batch size versus RMSE. We have also plotted each hyperparameter versus the computation time. The experiments are done on both datasets (the ADAK dataset and the MAC dataset). This helps to analyse the performance of these DL models on a medium-sized dataset and a large dataset. Although the ADAK water quality dataset covers three years, it has only 1096 data points, as the data is recorded only once a day. In comparison, the MAC water quality dataset has 23200 data points from 80 days, since data is recorded every 5 minutes.

1) EPOCH
The outcome of choosing different numbers of epochs {10, 50, 100, 150, 200, 300, 400, 500} for each DL model is studied. Fig. 6 shows the RMSE vs epochs for the different water quality parameters on the ADAK water quality dataset for the proposed hybrid DL models and the LSTM, GRU and CNN DL models. Here the proposed hybrid DL models CNN-LSTM and CNN-GRU maintain better performance than the baseline DL models LSTM, GRU and CNN. The hybrid models achieve good performance within 50 to 100 epochs, and generalization is attained within 100 epochs. Prediction accuracy is maintained for all four water quality parameters for epochs 10 to 500. In contrast, the baseline DL models take at least 400 epochs to achieve generalization and under-fit the training data for fewer than 400 epochs. The baseline DL models come close to the performance of the proposed models at 500 epochs, so they require more computation resources to achieve similar results. The hybrid models start to overfit slightly after 150 epochs on the ADAK water quality dataset. Still, the accuracy is maintained, indicating the adaptability of the proposed DL models to different datasets.
To further evaluate the performance, we have repeated the experiment with the MAC water quality dataset. Fig. 7 shows the RMSE vs epochs for all four water quality parameters on the MAC dataset for the proposed hybrid DL models, LSTM, GRU and CNN. Here the proposed hybrid CNN-LSTM model maintains better performance than the other DL models for predictions of water quality parameters. On this dataset, our model attained the required accuracy and generalization in 50 epochs. Furthermore, the accuracy loss is minimal as the epochs increase, and the CNN-LSTM model neither over-fits nor under-fits the training data as we increase the epochs from 10 to 500.

2) WINDOW SIZE
The influence of choosing different window sizes {10, 20, 30, 40, 50, 60, 70, 80} for each DL model is analysed. Fig. 8 shows the RMSE vs window size for the different water quality parameters on the ADAK water quality dataset with the proposed hybrid DL models and the baseline DL models. The proposed hybrid DL models perform better than the baseline DL models. As the window size is increased from 10 to 80, the error remains constant for the hybrid DL models. In this experiment, we can see that the baseline models are comparable in performance with the hybrid DL models for some window sizes for each water quality parameter. However, no baseline model performs consistently well for all four water quality parameters. For example, the performance of LSTM is slightly better than CNN-LSTM at window sizes 40 and 80 for pH. However, for the other three water quality parameters, the performance of LSTM is not good. In comparison, the proposed hybrid models consistently show good performance for the four water quality parameters and the window sizes that we experimented with.
To further analyse the performance of these models, we repeat the experiment with the MAC water quality dataset. Fig. 9 shows the RMSE vs window size for all four water quality parameters on the MAC dataset with the proposed hybrid DL models, LSTM, GRU and CNN. We can observe from Fig. 9 that the hybrid CNN-LSTM model performs consistently well compared to the other models.

3) LEARNING RATE
The impact of learning rate {0.01, 0.001, 0.0009, 0.0008, 0.0007, 0.0001} on the performance for each of the DL models is analysed. In Fig. 10, we compare RMSE vs learning rate performance of proposed hybrid DL models with baseline models using the ADAK water quality dataset. The proposed hybrid DL models perform consistently better than baseline DL models. However, the performance of baseline DL models is not consistent for various learning rates and different water quality parameters. Hence using the baseline DL models for WQP is not practical.
To further analyse the performance of these models, we repeat the experiment with the MAC water quality dataset. In Fig. 11, we compare RMSE vs learning rate performance of proposed hybrid DL models with baseline models using the MAC water quality dataset. LSTM, GRU and CNN-GRU show moderate performance at some learning rates. However, the performance of CNN does not come close to the performance of hybrid models at any point. Here, we can observe from Fig. 11 that the hybrid CNN-LSTM model has the best and consistent performance compared to other models.

4) BATCH SIZE
In this experiment, we analyse the performance of each DL model on different batch sizes {16, 32, 64, 128, 256, 512}. Fig. 12 shows the RMSE vs batch size for different water quality parameters on the ADAK water quality dataset for the proposed hybrid DL models with the baseline DL models. Compared to the proposed hybrid DL models, it can be observed that the performance of the baseline DL models is inconsistent and inferior. Furthermore, we can see that the performance of CNN is reducing with the increase of the batch size. Fig. 13 shows the RMSE vs batch size for different water quality parameters using the MAC water quality dataset with the proposed hybrid DL models and the baseline DL models. As this dataset has more data points, we can see from the plots the performance difference of each model. From the results, it can be observed that the proposed hybrid CNN-LSTM models have better performance compared to other baseline models and the hybrid CNN-GRU model. Here also, the baseline DL models are inconsistent and underperform compared to the proposed hybrid DL models.

5) COMPUTATION TIME
In the above four experiments, the computation time was also measured and stored. Fig. 14 and Fig. 15 plot the computation time for each hyperparameter for the ADAK water quality dataset and the MAC water quality dataset, respectively. The computation time required by each DL model is different, but we can observe some common trends from the plots. One of them is that the computation time increases with an increase in epochs and window size, whereas it is reduced with an increase in batch size. We do not see much change in computation time for different learning rates, since we have selected the learning rate within a limited range for both datasets and all models.
In summary, for WQP, the proposed CNN-LSTM model maintains the best performance in various experimental conditions in terms of accuracy and computation time, indicating the adaptability of the proposed CNN-LSTM DL model.

6) MULTI-STEPS PREDICTION
In this experiment, we analyse the performance of each DL model for different step sizes {2, 4, 6, 8, 10, 12}. Fig. 16 shows the RMSE versus step size for various water quality parameters on the ADAK water quality dataset for the proposed hybrid DL models and the baseline DL models. The results show that the proposed hybrid DL models perform much better than the baseline DL models. Furthermore, the performance of CNN-LSTM and CNN-GRU degrades as the step size increases. Fig. 17 shows the RMSE versus step size for different water quality parameters on the MAC water quality dataset for the proposed hybrid DL models and the baseline DL models. The results show that the proposed hybrid CNN-LSTM model performs better than the baseline models and the hybrid CNN-GRU model. For both datasets, the baseline DL models underperform compared to the proposed hybrid DL models.
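A multi-step setup of this kind can be sketched as a sliding-window transformation of the time series. The Python example below assumes direct multi-output forecasting, in which each training sample pairs the previous window_size readings with the next step_size readings; whether the models in this work predict all steps at once or recursively is not stated here, so this is an illustration only. The helper make_windows and the pH readings are hypothetical.

```python
def make_windows(series, window_size, step_size):
    """Pair each run of window_size past values with the next step_size values."""
    inputs, targets = [], []
    last_start = len(series) - window_size - step_size
    for start in range(last_start + 1):
        split = start + window_size
        inputs.append(series[start:split])               # model input window
        targets.append(series[split:split + step_size])  # values to predict
    return inputs, targets


# Made-up pH readings, used only to show the shapes involved.
readings = [7.0, 7.1, 7.2, 7.3, 7.4, 7.5, 7.6, 7.7]
X, y = make_windows(readings, window_size=4, step_size=2)

for window, target in zip(X, y):
    print(window, "->", target)
```

A larger step size leaves fewer usable samples for the same series and asks the model to predict further from the last observed value, which is in line with the error growth reported for larger step sizes.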

V. RESULT ANALYSIS
The objective is to analyse the performance of the hybrid DL models, CNN-LSTM and CNN-GRU, in comparison with the LSTM, GRU, CNN, attention-based LSTM and attention-based GRU DL models in predicting aquaculture water quality parameters (salinity, pH, DO and temperature). In this section, the performance of the proposed hybrid DL models is compared with that of the baseline DL models and the attention-based DL models using a fixed set of hyperparameters on the two datasets.
The hyperparameters are selected by studying the performance of the DL models for various hyperparameter values (epochs, window size, learning rate and batch size) in terms of prediction accuracy and computation time. The performance of each model is studied through various experiments by varying the hyperparameters. The set of hyperparameters that is optimal in terms of prediction accuracy and computation time is then used in the final experiment, which compares the performance of all the DL models.
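One way to organise such a study is to vary a single hyperparameter over its grid while holding the others fixed. The Python sketch below generates the corresponding configurations. It assumes this one-at-a-time protocol, which is not spelled out here, and uses the fixed ADAK settings reported in this section as illustrative base values; the helper one_at_a_time is hypothetical, and only the batch-size grid is shown.

```python
def one_at_a_time(base, grids):
    """Yield (name, config) pairs that change one hyperparameter at a time."""
    for name, values in grids.items():
        for value in values:
            config = dict(base)   # copy so the base settings stay untouched
            config[name] = value
            yield name, config


# Base values assumed from the fixed ADAK configuration (illustrative only).
base = {"epochs": 100, "learning_rate": 0.0008, "batch_size": 32, "window_size": 30}

# Grid taken from the batch-size experiment; other grids could be added here.
grids = {"batch_size": [16, 32, 64, 128, 256, 512]}

configs = list(one_at_a_time(base, grids))
for name, config in configs:
    print(name, config)
```

Each generated configuration would then be trained and scored, for example by RMSE and computation time, before the final set of hyperparameters is chosen.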
A. ADAK WATER QUALITY DATASET
Fig. 18 compares the predicted values with the true data on the ADAK water quality dataset using different DL models. All these models are trained using 80% of the ADAK water quality dataset and tested with the remaining 20% of the data. The predicted values are compared with the true data. Here we use a fixed set of hyperparameters (epochs = 100, learning rate = 0.0008, batch size = 32, window size = 30 and Adam optimizer). The models are evaluated using the performance metrics MAE, MSE, RMSE, and MAPE in the training and testing periods. The computation time required for each of the models is also calculated.
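The evaluation described above can be sketched in a few lines of Python. The example assumes a chronological 80/20 split (the first 80% of the series for training, the rest for testing) and the standard definitions of MAE, MSE, RMSE and MAPE, with MAPE expressed as a percentage; the helper names chronological_split and regression_metrics and the sample values are hypothetical and are not results from this work.

```python
import math


def chronological_split(series, train_fraction=0.8):
    """Split a time series into leading training and trailing test segments."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and MAPE (in percent) for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape}


# Made-up values, used only to exercise the functions.
train, test = chronological_split(list(range(10)))
scores = regression_metrics([2.0, 4.0], [1.0, 5.0])
print(len(train), len(test), scores)
```

MAE and RMSE carry the unit of the predicted parameter, MSE carries its square, and MAPE is dimensionless, which is worth keeping in mind when comparing metrics across water quality parameters.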
For the ADAK water quality dataset, the prediction performance of all models is shown in Table 5. GRU, attention-based LSTM and attention-based GRU models. CNN-LSTM outperforms CNN in prediction accuracy, even though the CNN model requires less computation time than CNN-LSTM. The attention-based models have slightly better prediction accuracy than the hybrid CNN-LSTM model, but the hybrid model outperforms them in training time. Hence, when prediction accuracy and computation time are considered together, we conclude that CNN-LSTM is the best DL model for the ADAK water quality data.
We have also compared the performance of the classical ARIMA model with all the DL models for the ADAK dataset in Table 7. The results show that all the DL models perform better than ARIMA on the ADAK dataset.

B. MAC WATER QUALITY DATASET
Fig. 19 compares the predictions of all DL models with the true data on the MAC water quality dataset. All these models are trained using 80% of the MAC water quality dataset and tested with the remaining 20% of the data. The predictions are compared with the true data. Here we use a fixed set of hyperparameters (epochs = 50, learning rate = 0.0008, batch size = 64, window size = 30 and Adam optimizer). The models are evaluated using MAE, MSE, RMSE, and MAPE in the training and testing periods. The computation time required for each of the models is also calculated.
For the MAC water quality dataset, the prediction performance of all models is shown in Table 6. 0.0216 °C, MSE = 0.0009 °C, RMSE = 0.0303 °C, and MAPE = 0.0008 °C) prediction as well. Also, the computation time required for CNN-LSTM is around 33% of the computation time needed by the LSTM, GRU, attention-based LSTM and attention-based GRU models. The CNN model requires less computation time than CNN-LSTM. The attention-based models have slightly better prediction accuracy than the hybrid CNN-LSTM model, but the hybrid model outperforms them in training time. Hence, when prediction accuracy and computation time are considered together, we conclude that CNN-LSTM is the best DL model for the MAC water quality data.
We have also compared the performance of the classical ARIMA model with all the DL models for the MAC dataset in Table 7. The results show that all the DL models perform better than ARIMA on the MAC dataset.

VI. CONCLUSION
In this work, we have proposed hybrid DL models, CNN-LSTM and CNN-GRU, for aquaculture WQP. The developed hybrid prediction models were trained and tested on two distinct datasets. The first dataset consists of water quality parameters collected from aquaculture ponds located in Kollam, Kerala, under ADAK. The second is the MAC dataset, which was collected from the marine aquaculture base in Xincun Town, LingShui County, Hainan Province, China. We have also extensively analysed the impact of varying the hyperparameters. For the performance comparison and further analysis, the optimal hyperparameters were used. We have compared the performance of the hybrid DL models (CNN-LSTM and CNN-GRU) with baseline DL models (LSTM, GRU, and CNN) and attention-based DL models (attention-based LSTM and attention-based GRU) in terms of MAE, MSE, RMSE, and MAPE. The results show that the CNN-LSTM hybrid model provides a significant improvement in prediction accuracy as well as computation time compared to the baseline DL models. The hybrid models achieve prediction accuracy similar to that of the attention-based models while outperforming them in computation time, offering a realistic solution for predicting water quality parameters in smart aquaculture.