Cloud Affected Solar UV Prediction With Three-Phase Wavelet Hybrid Convolutional Long Short-Term Memory Network Multi-Step Forecast System

Harmful exposure to erythemally-effective ultraviolet radiation (UVR) poses high health risks such as malignant keratinocyte cancers and eye-related diseases. Delivering short-term forecasts of the solar ultraviolet index (UVI) is an effective way to communicate UVR exposure information to the public at risk. This research reports on a novel framework built to forecast UVI, integrating antecedent lagged memory of cloud statistical properties and the solar zenith angle (SZA). To produce the forecasts at multi-step horizons, we design a 3-phase wavelet hybrid convolutional long short-term memory network (W-O-convLSTM) model, validated with Queensland-based datasets in near real-time (i.e., 10-minute, 20-minute, 30-minute and 1-hour forecast horizons). Our approach to optimizing the performance also entails a robust selective filtering method using the BorutaShap algorithm, data decomposition with the stationary wavelet transform and hyperparameter optimization using the Optuna algorithm. We assess the performance of the proposed W-O-convLSTM model alongside the baseline and benchmark models. The results, captured through statistical metrics and visual infographics, show the superior performance of the objective model in short-term UVI forecasting. For instance, at a 10-minute forecast horizon, our objective model yields a relatively high correlation coefficient of ~0.961 in autumn, 0.909 in summer, 0.926 in spring and 0.936 in winter. Overall, the proposed W-O-convLSTM model outperforms its competing counterpart models for all forecast horizons with the lowest absolute forecast error. The robustness of our newly proposed model supports its practical utility in delivering sun-protection behavior recommendations that can mitigate UV-exposure-related public health risk.
We also recommend that future integration of aerosol and ozone effects with cloud cover data can enhance our forecasting framework for wider applications in solar energy or skin health monitoring systems.


I. INTRODUCTION
Solar ultraviolet radiation (UVR) has benefits and risks for people, industry, and the natural terrestrial environment. Exposure to erythemally-effective UVR poses high health risks of skin-based diseases, such as malignant keratinocyte cancers, and eye diseases (pterygium and cataracts) in humans [1], [2]. In the agricultural sector, UVR reduces a plant's photosynthetic rate, CO2 intake and oxygen output, thus hindering its water use efficiency [3]. However, solar radiation is a vital renewable energy resource in the energy sector for harnessing clean energy using solar photovoltaic (PV) technologies. Factors that affect terrestrial UV radiation include the time of day, season, geographical latitude, surface reflection, altitude and cloud cover [4]. While the intensity of solar UVR is largely dependent on the solar zenith angle (SZA), the ground-level UVR is significantly affected by cloud movement. To promote sun protection from such incident UVR, the World Health Organization (WHO), International Commission on Non-Ionizing Radiation Protection (ICNIRP), World Meteorological Organization (WMO) and United Nations Environment Programme (UNEP) developed the global solar ultraviolet index (UVI) for mitigating skin and eye health risks [5]. It is known that under unbroken cloud cover conditions, UVI reduces by 50 to 60%, and even further during precipitation [6]. However, under particular partial cloud cover conditions, scattering can raise ground-based UV levels above the nominal cloud-free surface UV irradiation [7]. Thus, accurate forecasts of cloud-affected UVI are essential in delivering real-time sun-exposure advice to the public at risk of skin and eye-related diseases.
The associate editor coordinating the review of this manuscript and approving it for publication was R. K. Tripathy.
Malignant melanoma cases, which are more prevalent in fair skin types, increase with decreasing latitudes [8]. In Australia, 1726 and 714 deaths were reported in 2019 for cutaneous malignant melanoma and keratinocyte cancer (squamous cell carcinoma and basal cell carcinoma), respectively [9]. A survey in 2011-2014 revealed that among the Australian populations, Queensland recorded the highest person-based incidence of keratinocyte cancer excisions with 2679 per 100,000 [10].
In this paper, we propose a novel deep learning (DL)-based wavelet hybrid convLSTM, advancing an earlier study [7], to forecast short-term UVI with cloud cover effects by integrating cloud segmented statistical properties extracted from whole sky images and SZA. The earlier study [7] neither considered the cloud cover factor nor incorporated deep learning methods and multiple forecast horizons for solar UVI predictions. Considering the seasonal and diurnal variations of SZA, and the cloud movement, in this study we also present forecasts for the four seasons tailored for multiple-step time horizons. Prior to modeling UVI, we utilize an intelligent BorutaShap algorithm to select the most informative input features from the cloud chromatic properties. In addressing the issues of non-stationarity, intermittent or stochastic variations, periodicity, and trends in the predictor variables, we apply a stationary wavelet transform (SWT) to decompose these input signals. To optimize the hyperparameters of the wavelet hybrid convLSTM, we employ a state-of-the-art Optuna (O) algorithm with powerful sampling and pruning efficiency. Hereafter, we designate the proposed 3-phase wavelet hybrid convLSTM model with O optimization as W-O-convLSTM. Thus, the contributions of this paper, which are distinct from an earlier study [7], are summarized as follows:
1) A novel hybrid W-O-convLSTM is proposed to forecast UVI for the first time using antecedent fluctuations in cloud cover condition and SZA at multi-step forecast horizons (i.e., 10-minute, 20-minute, 30-minute and 1-hour).
2) An efficient self-adaptive Python tool is developed to segment cloud chromatic properties using real-time sky images from total sky image repositories.
3) In optimizing the performance of W-O-convLSTM, an intelligent wrapper-based BorutaShap algorithm is designed to select the most relevant features from the cloud segmented statistical properties. Further optimization is achieved through hyperparameter tuning using a state-of-the-art O optimizer.
4) The non-stationarity behavior, periodicity and random fluctuations in the cloud chromatic properties and SZA over the temporal scales are addressed through the application of SWT with high and low frequency decompositions.
5) The efficacy of W-O-convLSTM in forecasting UVI is explored for the four seasons with robust statistical score metrics and visual analysis of all tested data alongside other competing benchmark and baseline models.
The rest of this paper is organized as follows: In Section II, we briefly present the related work and in Section III, we discuss the theoretical overview. Afterward, we provide the methodology detailing the comparative experiments for UVI forecasts in Section IV and then we present the results and discussion in Section V. Finally, in Section VI, we present the concluding remarks and future work.

II. RELATED WORK
While UVI measurement can be achieved using mechanistic surface measurement methods, including the use of a pyranometer or spectroradiometer, its potential for broad application can be constrained by high costs and calibration issues [11]. Previous studies have applied deterministic methods to predict UVI, but such approaches are restricted by assumed fixed or estimated initial conditions [12], [13]. Artificial intelligence (AI) based data-driven and DL algorithms are robust, cost-effective and user-friendly [14] but have not yet been applied to predict short-term UVI by utilizing stochastic cloud cover conditions. Though solar UVI has been forecast with artificial neural networks (ANN) [15], extreme learning machine (ELM) [7], deep belief networks (DBN) [16] and long short-term memory (LSTM) [17], integrating cloud effects can further boost the performance of highly competitive machine learning (ML) and DL methods.
A multiple-input DL convolutional long short-term memory (convLSTM) is currently gaining prominence as a powerful predictive tool. Having the convolutional operation embedded inside the long short-term memory (LSTM) cell, it robustly extracts statistically significant antecedent lagged inputs from the predictive variables whilst the LSTM learns from the sequentially incorporated features for low latency predictions [18], [19]. Recently, convLSTM was applied for flood index forecasts [20] and precipitation forecasts [21], and these studies illustrated the superiority of convLSTM over the benchmarked counterparts. Being an intelligent and versatile predictive model, convLSTM is highly suitable for modeling cloud-affected UVI.
Feature selection approaches are essential components of the model design phase for achieving the optimum performance of a forecast model. The Python-based BorutaShap algorithm eliminates irrelevant and largely redundant features, as shown in a study where it was employed to identify the most informative data series for winning and losing in Belgian professional soccer [22]. Along with selective filtering, the application of robust data decomposition schemes such as SWT efficiently accomplishes dimensionality reduction of the input variables. As a pre-processing tool, SWT was applied by [23] to effectively decompose the input signals into low-frequency and high-frequency components. The non-stationarity in electrocardiogram signal inputs [24] was also exploited using SWT decomposition.

III. THEORETICAL OVERVIEW
This section provides a brief overview of the operational mechanism of convLSTM in designing the proposed hybridized W-O-convLSTM model. Furthermore, we briefly discuss the three major phases of the UVI forecasting framework, which include feature selection by BorutaShap, data decomposition using SWT and hyperparameter optimization by the O algorithm.

A. OPERATIONAL MECHANISMS OF ConvLSTM
ConvLSTM is fundamentally an extension of LSTM networks that encapsulates the convolutional operation to robustly capture the underlying spatial features in large-scale sequential and multi-dimensional datasets [25], [26]. With a time-series predictive framework as in our case, the convolutional operation at each gate (input, forget and output) of the LSTM cell replaces matrix multiplication to suitably extract spatiotemporal patterns in the 2-dimensional inputs [21], [25]. The future state of a cell in convLSTM is determined by its local neighbors' input and past state. While convLSTM retains the strengths of LSTM to capture long short-term memory, it further minimizes the redundancy of the fully connected structure, thus improving the training and prediction efficiency [27]. In the key equations governing the operation of a single convLSTM unit [26], [28], '∗' denotes the convolution operator, '•' denotes the Hadamard product, $h_t$ is the hidden state at sequential time $t$, $C_t$ is the cell state, $S_t$ is the intermediate state, and the convLSTM gates $i_t$, $f_t$, $o_t$ are 3-dimensional tensors whose last two dimensions are spatial (rows and columns). The operational mechanisms of the benchmarked models constructed using CNN [29], SVR [30] and PA [31] are explained elsewhere, as these methods are well established.
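For orientation, one commonly used formulation of a convLSTM unit can be written with the symbols named above. This is a sketch under stated assumptions: the input $X_t$, the kernels $W$, the biases $b$ and the sigmoid $\sigma$ are notation introduced here, peephole terms are omitted, and the exact form used in [26], [28] may differ.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} * X_t + W_{hi} * h_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} * X_t + W_{hf} * h_{t-1} + b_f\right) \\
S_t &= \tanh\left(W_{xs} * X_t + W_{hs} * h_{t-1} + b_s\right) \\
C_t &= f_t \bullet C_{t-1} + i_t \bullet S_t \\
o_t &= \sigma\left(W_{xo} * X_t + W_{ho} * h_{t-1} + b_o\right) \\
h_t &= o_t \bullet \tanh\left(C_t\right)
\end{aligned}
```

Read top to bottom, the gates decide how much of the intermediate state $S_t$ enters the cell state, how much of the previous cell state is kept, and how much of the updated cell state is exposed as $h_t$.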

B. WRAPPER-BASED BORUTASHAP
BorutaShap is a Python-based wrapper method that combines the Boruta feature selection algorithm with Shapley additive explanations (SHAP). It is compatible with any tree-based learner, such as random forest (RF), XGBoost or decision tree (DT), as the base model [22], [32]. To select the most significant features, the Boruta algorithm creates shadow features (exact replicas) of each feature and shuffles the values in the shadow features to remove their correlations with the response variable [33]. Thereafter, it passes the actual and shadow-shuffled features to the tree-based learner to predict the target variable. It then determines the permutation importance or Mean Decrease Accuracy (MDA) for the actual and the shadow-shuffled inputs over all trees ($m_{tree}$), given by the expression [34], [35]:
$$\mathrm{MDA} = \frac{1}{m_{tree}} \sum_{m=1}^{m_{tree}} \frac{\sum_{t \in OOB} I\left(y_t = f(x_t)\right) - \sum_{t \in OOB} I\left(y_t = f(x_t^{n})\right)}{|OOB|} \quad (7)$$
where $x_t$ is the group of predictor variables ($x_t \in R^n$) and $y_t$ is the target variable ($y_t \in R$) for $n$ inputs in the set $T$ (where $t = 1, 2, \ldots, T$), $I(\cdot)$ is the indicator function, OOB refers to the Out-of-Bag samples used to estimate the predictive error, $f(x_t)$ is the predicted value before permuting and $f(x_t^{n})$ is the predicted value after permuting.
By performing a two-sided hypothesis test (t-test) for equality of the actual and shadow features, the algorithm calculates the z-score [32]. The z-score is determined by the expression:
$$z_{score} = \frac{\mathrm{MDA}}{SD} \quad (8)$$
where SD represents the standard deviation of accuracy losses. A threshold is set by the algorithm where the z-score of the actual feature must be greater than the maximum z-score ($z_{max}$) of the randomized shadow features. If the threshold criterion is met, the feature is selected as important. Additionally, comparisons are made between the features and corresponding shadow features in terms of their Shapley importance values (SHAP values), which produces a more consistent result [36].
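The shadow-feature test described above can be illustrated with a minimal standard-library sketch. It does not call the BorutaShap package; the function names and the per-tree accuracy losses are hypothetical values chosen only to show how the z-score in (8) and the $z_{max}$ threshold interact.

```python
import random
import statistics


def make_shadow(column, rng):
    """Shadow feature: a shuffled copy of a real feature column."""
    shadow = list(column)
    rng.shuffle(shadow)
    return shadow


def z_score(accuracy_losses):
    """Mean accuracy loss (MDA) divided by the standard deviation of the losses."""
    return statistics.mean(accuracy_losses) / statistics.stdev(accuracy_losses)


def select_features(real_losses, shadow_losses):
    """Keep real features whose z-score exceeds the largest shadow z-score."""
    z_max = max(z_score(v) for v in shadow_losses.values())
    return sorted(name for name, v in real_losses.items() if z_score(v) > z_max)


# Hypothetical per-tree accuracy losses after permuting each feature.
real_losses = {
    "SZA": [0.30, 0.32, 0.29, 0.31],
    "noise": [0.01, -0.02, 0.00, 0.02],
}
shadow_losses = {
    "shadow_SZA": [0.01, 0.00, 0.02, -0.01],
    "shadow_noise": [0.00, 0.01, -0.01, 0.01],
}
selected = select_features(real_losses, shadow_losses)
```

In this toy case only the feature whose permutation consistently hurts accuracy clears the shadow threshold; the uninformative one behaves like its shuffled copies and is dropped.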

C. STATIONARY WAVELET TRANSFORM (SWT)
SWT is a powerful mathematical tool for dimensionality reduction and data decomposition that handles nonstationary, nonlinear and noisy signals [37]. It is a modified version of the conventional discrete wavelet transform (DWT), designed to handle the signal decimation issues of the DWT [38]. For a given signal, $x(t)$, its wavelet transform can be determined by the expression [37]:
$$W(a, \tau) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\, \psi^{*}\!\left(\frac{t - \tau}{a}\right) dt \quad (9)$$
where '∗' denotes the complex conjugate, $\psi$ is the analyzing wavelet, $a$ is the time dilation, and $\tau$ is the time translation. The DWT of a discrete signal, $x[m]$, is obtained by evaluating this transform at discrete dilations and translations. By performing a DWT decomposition of the signal $x[m]$, the respective sub-signals of detailed components (DC) and approximation components (AC) are acquired [39]. However, due to signal decimation after each level of decomposition, the DWT is not time-invariant, which makes the signal unsuitable for data preprocessing [37]. To overcome this drawback, SWT (an extension of DWT) is employed, as it uses the à trous algorithm to achieve shift-invariance [38]. Because the transform is undecimated, the size of the SWT data is preserved through the low-pass and high-pass filters. Thus, the detailed and approximation coefficients have the same length as the original signal [40]. Using SWT, the decompositions can be computed using the expressions [41]:
$$cA^{SWT}_{n,m} = \sum_{u} cA^{SWT}_{n-1,\,m+2^{n}(u)}\, g(u) \quad (11)$$
$$cD^{SWT}_{n,m} = \sum_{u} cA^{SWT}_{n-1,\,m+2^{n}(u)}\, h(u) \quad (12)$$
where $cA^{SWT}_{n,m}$ is the approximation coefficient of SWT, $cD^{SWT}_{n,m}$ is the detailed coefficient of SWT, $n$ and $m$ are the decomposition level and the position, respectively, $g(u)$ is the low-pass filter and $h(u)$ is the high-pass filter. The Python-based SWT offers several mother wavelets for data decomposition and signal denoising, among which 'haar' and 'db' are widely utilized [23], [24].
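To make the undecimated property concrete, the following standard-library sketch applies a two-level Haar decomposition in the à trous style. It is an illustration rather than the implementation used in this study: the periodic wrap-around at the signal edge, the tap spacing of $2^{level-1}$ and the sign convention are assumptions, and library implementations may order or scale coefficients differently.

```python
import math


def haar_swt(signal, levels=2):
    """Undecimated Haar decomposition with periodic wrap-around.

    Returns (approximation, [detail_1, ..., detail_levels]). Every output has
    the same length as the input because nothing is downsampled.
    """
    approx = [float(v) for v in signal]
    n = len(approx)
    details = []
    for level in range(1, levels + 1):
        step = 2 ** (level - 1)  # gap ("hole") between the two filter taps
        shifted = [approx[(m + step) % n] for m in range(n)]
        detail = [(a - b) / math.sqrt(2) for a, b in zip(approx, shifted)]
        approx = [(a + b) / math.sqrt(2) for a, b in zip(approx, shifted)]
        details.append(detail)
    return approx, details


# Made-up normalized predictor values at 10-minute steps.
signal = [0.2, 0.4, 0.9, 0.7, 0.3, 0.1, 0.5, 0.8]
a2, (d1, d2) = haar_swt(signal, levels=2)
```

Here `a2`, `d1` and `d2` play the roles of the approximation and detailed coefficients, and each keeps the length of `signal`, which is what allows them to be used directly as aligned model inputs.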

D. OPTUNA (O) OPTIMIZER
The O algorithm is a next-generation hyperparameter optimization framework with a define-by-run API that provides the platform to construct the parameter search space dynamically via efficient searching and pruning strategies [42]. In searching for ideal hyperparameter values, O utilizes various samplers such as random, grid, Bayesian, and genetic algorithms [43]. During the process of optimization, the O algorithm achieves an optimal solution by repeatedly calling and evaluating the objective function at different parameter values. The following steps describe the optimization process by the O algorithm [44]:
Step 1: Determine the direction of optimization, type of parameter, range of values and the maximum number of iterations.
Step 2: Enter the loop;
Step 2.1: Uniformly select a population of individuals within the function defining the parameter value range;
Step 2.2: Automatically terminate the unpromising individuals according to the pruning conditions with a pruner;
Step 2.3: Determine the objective function value of the unpruned individuals;
Step 2.4: Repeat the above steps for the loop and exit when the maximum number of iterations is reached.
Step 3: Provide the output as the optimal solution and optimal function value.
The O optimizer is gaining prominence as it provides an optimum combination of hyperparameters at a relatively lower computation cost in comparison with other optimization methods such as exhaustive grid search and random grid search [45].
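The three steps above can be mimicked with a small standard-library loop. This sketch deliberately does not use the Optuna API: it is a toy random search with a simple pruning rule, and the objective, the parameter names (`filters`, `dropout`) and the pruning threshold are all hypothetical.

```python
import random


def objective(params, should_prune):
    """Hypothetical validation error that improves over five epochs."""
    floor = abs(params["filters"] - 100) / 100 + abs(params["dropout"] - 0.1)
    error = None
    for epoch in range(1, 6):
        error = floor + 1.0 / epoch
        if should_prune(epoch, error):  # Step 2.2: stop an unpromising trial
            return None
    return error


def optimize(n_trials=50, seed=0):
    # Step 1: direction is minimization; ranges and trial count are fixed here.
    rng = random.Random(seed)
    best_params, best_value = None, float("inf")
    for _ in range(n_trials):  # Step 2: enter the loop
        params = {  # Step 2.1: sample uniformly within the ranges
            "filters": rng.randint(10, 200),
            "dropout": rng.uniform(0.0, 0.5),
        }

        def should_prune(epoch, error, best=best_value):
            # Assumed rule: prune when far worse than the best result so far.
            return epoch >= 2 and error > 2.0 * best

        value = objective(params, should_prune)  # Step 2.3
        if value is not None and value < best_value:
            best_params, best_value = params, value
    return best_params, best_value  # Step 3: optimal solution and value


best_params, best_value = optimize()
```

A pruning rule of this kind trades a small risk of discarding a late-improving trial for a large saving in evaluations, which is the motivation for pruners in practice.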

IV. METHODOLOGY
In this section, we describe our study location and datasets for the UVI modeling experiments. Thereafter, we discuss the process of segmenting cloud statistical properties from the sky images. Finally, we present the stages involved in designing the proposed W-O-convLSTM model, followed by a discussion on model evaluation using robust statistical score metrics.

A. EXPERIMENTAL SITE AND DATASETS
To validate the W-O-convLSTM model, the study site of the experimental set-up was based at the University of Southern Queensland (USQ) in Toowoomba (Latitude of 27.60° S and Longitude of 153.93° E), Australia, as illustrated in Fig. 1(a). Geographically, the experimental site is located approximately 100 km inland relative to the ocean and experiences limited marine aerosol and anthropogenic effects [7]. Being a subtropical region, Queensland receives a large number of sunshine days annually, which poses a significant impact on the public health sector in terms of UV-exposure-related skin and eye diseases.
We measured the time-series solar spectral irradiance using the Bentham DTM300 Spectroradiometer (Bentham Instruments Inc., UK), mounted on a roof-top at the USQ Toowoomba campus, as shown in Fig. 1(b). Using the measured solar spectral irradiance, the UVI data was calculated based on the International Commission on Illumination (CIE) reference action spectrum for UV-induced erythema on the human skin [5]. As per the CIE guidelines, we first determined the erythemally active UV irradiance (UVE) by integrating the monochromatic UV irradiance ($S(\lambda)$) that is weighted with the CIE spectral action function $CIE(\lambda)$ and bounded within the wavelengths of 280 nm to 400 nm as follows [46]:
$$UVE = \int_{280\,\mathrm{nm}}^{400\,\mathrm{nm}} S(\lambda)\, CIE(\lambda)\, d\lambda$$
Having one unit of UVI equivalent to 25 mW m$^{-2}$ of erythemally effective exposure to UVR, we calculated the UVI from the UVE as follows [46], [6]:
$$UVI = \frac{UVE}{25\ \mathrm{mW\,m^{-2}}}$$
The calculated UVI is a unitless normalized index, for which the values range globally from 0 to 11+. As the UVI increases, the exposure severity and potential for damage to the skin and eye rise. We acquired the UVI datasets at a time resolution of 10 minutes. However, there were instances when these datasets were missing due to power failure or maintenance of the spectroradiometer. The missing datasets were recovered with the UVI calculated using the minimal erythema dose (MED) measurements of a co-located 501 broadband UVR Biometer (Solar Light Co., USA), as shown in Fig. 1(c). To avoid any UVI anomalies measured by two different instruments, the Biometer was initially calibrated to the Bentham spectroradiometer using a time-dependent conversion factor (CF). Consequently, the Biometer-derived UVI was calculated from MED, the minimal erythema dose measured by the Biometer every 5 minutes (300 s), and CF, a conversion factor (different for each season). Considering that one unit of MED is equivalent to 200 J/m² of erythemally weighted UV radiation [47], MED is converted to J/m² by multiplying with CF.
UVI is calculated from the erythemally weighted UV by multiplying the erythemal irradiance in units of W/m² by 40. Thereafter, we extracted the sky images that were captured by a synchronous co-located Total Sky Imager TSI-440 (TSI) (Yankee Environmental Systems Inc., USA), as shown in Fig. 1(d). These sky images were stored in the TSI repository. The records of SZA were also extracted from the TSI at 10-minute intervals. We extracted the UVI, sky images and SZA data series for a complete year (from 01-Mar-2003 to 29-Feb-2004) to obtain the datasets for all 4 seasons. For each day, the datasets were extracted from 7:40 am to 4:10 pm. We segmented the sky images to extract the cloud statistical properties. While we utilized the UVI datasets as the target variable, the cloud statistical properties and SZA datasets were employed as the input features in model building.
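The conversion from spectral irradiance to UVI described in this subsection can be sketched as follows. The piecewise action spectrum uses the widely cited McKinlay-Diffey constants, which are an assumption here (later CIE revisions adjust them slightly), and the flat spectrum and 1 nm sampling are made-up inputs for illustration only.

```python
def cie_erythema_weight(wavelength_nm):
    """Assumed piecewise erythema action spectrum (McKinlay-Diffey form)."""
    if wavelength_nm <= 298:
        return 1.0
    if wavelength_nm <= 328:
        return 10 ** (0.094 * (298 - wavelength_nm))
    return 10 ** (0.015 * (139 - wavelength_nm))


def erythemal_irradiance(wavelengths_nm, irradiance, weight=cie_erythema_weight):
    """Trapezoidal integral of S(lambda) * CIE(lambda) over the sampled band, in W/m^2."""
    weighted = [s * weight(w) for w, s in zip(wavelengths_nm, irradiance)]
    total = 0.0
    for k in range(1, len(wavelengths_nm)):
        width = wavelengths_nm[k] - wavelengths_nm[k - 1]
        total += 0.5 * (weighted[k] + weighted[k - 1]) * width
    return total


def uvi_from_uve(uve_w_per_m2):
    """One UVI unit is 25 mW/m^2, so UVI = 40 * UVE when UVE is in W/m^2."""
    return 40.0 * uve_w_per_m2


wavelengths = list(range(280, 401))    # nm, hypothetical 1 nm sampling
spectrum = [0.001] * len(wavelengths)  # W m^-2 nm^-1, made-up flat spectrum
uvi = uvi_from_uve(erythemal_irradiance(wavelengths, spectrum))
```

The factor of 40 is simply the reciprocal of 0.025 W/m², which ties the two statements in the text (25 mW m⁻² per UVI unit, and multiplication by 40) together.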

B. SEGMENTING CLOUD STATISTICAL PROPERTIES
Cloud statistical properties were essential input predictor variables in designing the W-O-convLSTM model and these variables were segmented from the sky images stored in the TSI repository. The TSI repository saves a suite of files that contain colored sky images in JPEG format, a properties text file and a TSI segmented image in PNG format with cloud and non-cloud parts of the clear sky. The properties text file contains the sun position, SZA and cloud fraction information. We utilized the TSI segmented PNG image and cloud fraction information to validate our segmented sky images through comparisons of blue sky and cloud cover. To segment the sky images from the suite of files, we designed an automated Python tool that reads all the 10-minute sky images in JPEG format and extracts the cloud statistical properties for each image. The image segmentation algorithm, referred to as the Python tool, was designed in the Python (version 3.7.9) environment. The flowchart in Fig. 2 shows the execution process of the proposed automated Python tool to segment the cloud chromatic statistics. The Python-based "glob", "os" and "cv2" libraries were utilized to locate and read the real-time sky image and properties files. Using the "linecache" library, a common line was read from the properties file and if this line was missing, the image was reported as corrupt. Otherwise, the background, camera housing, camera arm and sun-shield captured in the sky image were all masked using the "numpy" library, as shown in Fig. 3. Thereafter, the sky image was split into red (R), green (G) and blue (B) channels, from which the R and B channel arrays were utilized for further analysis by applying previously reported image segmentation techniques [48]. Using the R and B channels, the red-blue ratios (RBR) of the pixels were determined.
RBR has been a commonly applied threshold in segmenting cloud cover and blue sky that maintains a high resolution of the image despite getting downsampled when saved in JPEG format [49]. To increase contrast, the RBR pixel values were scaled to the range 0 to 255 [50], [48]. A calculated threshold ($T$) was applied to binarize and segment the RBR-scaled pixels into black and white. The threshold $T$ was determined as follows:
$$T = TF \times RBR_{max}$$
where $TF$ is a threshold factor of 0.56 [48] (usually between 0 and 1) and $RBR_{max}$ is the maximum RBR. If the pixel values were greater than $T$, they were assigned 255 (white color) to represent the cloud cover; otherwise, they were assigned 0 (black color) to represent the blue sky. The binarized pixel values of cloud cover and blue sky were masked onto the pixels of the red and blue channels to obtain the segmented statistics of the sky image. Finally, the Python-based tool was automated via a for loop to perform the same operations on all the JPEG sky images within the suite. Our segmentation program is an improvement on the previously reported methods [48] and agrees very closely with the segmented TSI PNG image, as illustrated in Fig. 3. Upon comparing our image segmentation with the TSI-based image segmentation in terms of cloud percentages, we achieve a very low cloud [...] Table 1.
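The thresholding step can be sketched on nested lists without any imaging library. This is an illustration, not the authors' tool: the function names, the `+ 1` guard against division by zero and the choice to scale by the maximum ratio are assumptions, and the tiny 2 x 2 image is made up.

```python
def segment_clouds(red, blue, threshold_factor=0.56):
    """Binarize a sky image into cloud (255) and blue sky (0) from red-blue ratios."""
    # Red-blue ratio per pixel; the +1 avoids division by zero (an assumption).
    rbr = [[r / (b + 1) for r, b in zip(r_row, b_row)]
           for r_row, b_row in zip(red, blue)]
    peak = max(max(row) for row in rbr)
    if peak == 0:
        return [[0 for _ in row] for row in rbr]
    # Scale the ratios into the 0..255 range to increase contrast.
    scaled = [[255.0 * v / peak for v in row] for row in rbr]
    rbr_max = max(max(row) for row in scaled)
    threshold = threshold_factor * rbr_max  # T = TF x RBR_max
    return [[255 if v > threshold else 0 for v in row] for row in scaled]


def cloud_fraction(mask):
    """Share of pixels labelled as cloud."""
    pixels = [v for row in mask for v in row]
    return sum(1 for v in pixels if v == 255) / len(pixels)


# Made-up 2 x 2 image: left column is whitish cloud, right column is blue sky.
red = [[200, 50], [180, 40]]
blue = [[200, 200], [190, 210]]
mask = segment_clouds(red, blue)
fraction = cloud_fraction(mask)
```

Cloud pixels scatter red and blue almost equally, so their ratio sits near one, while clear sky is strongly blue and falls well below the threshold.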

C. MISSING DATA RECOVERY
After acquiring the data series, it was noted that there were some missing values in the UVI and cloud statistical properties data. However, the SZA datasets were complete. The cloud statistical properties were incomplete due to some missing and corrupt images from the TSI repository. In the case of the UVI datasets, some incomplete values were observed because the 501 Biometer UVI used to recover the Bentham UVI were missing occasionally. These missing values were duly imputed with the monthly median of the respective variable at the same daily time domain. Among the three commonly used imputation methods of mean, median and listwise deletion, the median imputation approach is more accurate and robust [51].
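A minimal sketch of the imputation rule described above groups observations by month and time of day and fills each gap with the group median. The record layout of `(month, time_of_day, value)` tuples and the function name are hypothetical choices made for this illustration.

```python
from collections import defaultdict
from statistics import median


def impute_monthly_median(records):
    """Fill missing values with the median of the same month and time of day.

    records: list of (month, time_of_day, value) tuples; value is None when missing.
    Gaps with no observed value in their group are left as None.
    """
    groups = defaultdict(list)
    for month, time_of_day, value in records:
        if value is not None:
            groups[(month, time_of_day)].append(value)
    filled = []
    for month, time_of_day, value in records:
        observed = groups.get((month, time_of_day), [])
        if value is None and observed:
            value = median(observed)
        filled.append((month, time_of_day, value))
    return filled


# Made-up UVI readings at 12:00 on different days.
records = [
    (3, "12:00", 6.0),
    (3, "12:00", 8.0),
    (3, "12:00", None),
    (3, "12:00", 7.0),
    (4, "12:00", None),
]
filled = impute_monthly_median(records)
```

Matching on the time of day matters because UVI follows a strong diurnal cycle, so a midday gap should not be filled from morning values.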

D. DEVELOPMENT OF THE PROPOSED PREDICTIVE MODEL
The scope of this research was to develop a wavelet hybrid convLSTM model that entails 3 major phases, which include feature selection by BorutaShap, decomposition of the selected features using SWT and hyperparameter optimization by the O algorithm. This AI-based UVI forecasting model was designed in the Python programming language (version 3.7.9). For hyperparameter optimization using the O algorithm, we used Google Colab with Python, as it provides freely available computing resources that include a graphics processing unit (GPU). Python is highly versatile, as its virtual environment provides the platform for both ML and DL-based data analysis through its prominent packages such as Scikit-learn, Tensorflow and Keras [52], [53], [54]. The schematic diagram in Fig. 4 provides an overview of the stages involved in designing the proposed predictive model. In accordance with the stages illustrated in the schematic diagram, the details of the methods adopted at each stage of the UVI forecasting framework are as follows:
Stage 1: This stage involves an assessment of the cross-correlations ($r_{cross}$) between the 10-minute measured UVI (i.e. $UVI(t)$) and each of the 16 predictor variables (i.e. $X_1(t-n), X_2(t-n), X_3(t-n), \ldots, X_{16}(t-n)$, where $t$ is time and $n$ is the most significant antecedent lag). Statistically, the individual predictors exhibiting the most significant correlation from the lagged combinations were selected to generate UVI forecasts. Table 1 enumerates the $r_{cross}$ values and the inferential statistics of these input variables. Once the significant antecedent lagged inputs of UVI and the 16 attributes were determined, the data series were reshaped for simulating the future UVI over multi-step horizons. We describe these forecast horizons in Table 2, where the 10-minute, 20-minute, 30-minute and hourly ahead forecasts are designated as 10M, 20M, 30M and 60M, respectively.
In reshaping the datasets, a lagged matrix was constructed for each of the four forecast timescales.
Stage 2: This stage describes the application of a wrapper-based BorutaShap algorithm for effective feature selection. After the UVI and 16 attributes were fed into BorutaShap, it selected the pertinent features and captured the significant antecedent memory of UVI behavior to deliver multi-step forecasts. In identifying the most significant input variables, XGBoost was utilized as the base model for screening each of the four forecast horizon data series. During the process of screening, consistency was maintained in identifying the feature importance through the aggregated and sorted SHAP values. The outcome of feature selection revealed that all the 16 predictor variables in each of the four forecast horizon datasets were pertinent and BorutaShap selected them as important features for model building. For instance, Fig. 5(a) presents the outcome of feature selection for the 10-minute forecast horizon datasets using the BorutaShap feature importance plot. The plot marks all the 16 predictors as pertinent. In addition, Fig. 5(b) presents a bee-swarm plot that illustrates the feature importance of these predictors based on their SHAP values.
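The lag screening and reshaping of Stage 1 can be sketched with the standard library alone. The function names are hypothetical, and the mapping of horizons to step counts (1 step for 10M, 2 for 20M, 3 for 30M, 6 for 60M) is an assumption about how the lagged matrix is aligned, since Table 2 defines the actual layout.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def cross_correlation(predictor, target, lag):
    """Correlation between predictor(t - lag) and target(t)."""
    if lag == 0:
        return pearson_r(predictor, target)
    return pearson_r(predictor[:-lag], target[lag:])


def lagged_matrix(predictors, target, lag, horizon_steps=1):
    """Pair each predictor at t - lag with the target horizon_steps - 1 steps after t."""
    rows, labels = [], []
    last = len(target) - (horizon_steps - 1)
    for t in range(lag, last):
        rows.append([series[t - lag] for series in predictors])
        labels.append(target[t + horizon_steps - 1])
    return rows, labels


# Made-up series: one predictor and a UVI-like target at 10-minute steps.
predictor = [1, 2, 3, 4, 5]
target = [10, 20, 30, 40, 50]
rows_10m, labels_10m = lagged_matrix([predictor], target, lag=1, horizon_steps=1)
rows_20m, labels_20m = lagged_matrix([predictor], target, lag=1, horizon_steps=2)
```

Longer horizons shorten the usable sample slightly, because the last few input rows have no target far enough ahead to pair with.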
The criterion in designing the proposed W-O-convLSTM model is to utilize the historical memories and BorutaShap feature selection of the inputs acquired from the diversified characteristics of UVI and cloud statistical properties data series. If the lagged values delay the two samples (i.e. predictors and predictand), by applying this criterion, they can be regarded as statistically independent.
Stage 3: This stage describes the segregation of input datasets into respective seasons, followed by the train-test split. The time-series datasets prepared for each forecast horizon were initially segregated into four different seasons. As detailed in Table 3, the autumn, winter, spring and summer seasons (the last ending on 29-Feb-2004) were assigned 4784, 4784, 4732 and 4732 data points, respectively. Thereafter, each seasonal data series was split into a training set (84.6% to 84.8%), a validation set (10% of training data) and a testing set (15.2% to 15.4%). This split was employed because we utilized 11 weeks of data for training and 2 weeks of data for testing in all four seasons. These datasets were extracted at 10-minute intervals, so we had a sufficient number of data points (4732 to 4784) for each season to develop the proposed model. Some earlier studies have also employed a similar train-test split. For instance, the study by [55] employed a train-test split on monthly-based datasets with a training split of 71.45% to 75.01% and a testing split of 12.59% to 14.39% for four sites. A similar approach for the train-test split was also adopted by [19] and [56]. Subsequently, all the model input datasets as per Table 1 were normalized to the range [0, 1] to improve the efficiency and accuracy during the training and testing phases [7].
Stage 4: This stage employs SWT to address the issues pertaining to non-stationarity and noise in the input data signals.
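The split percentages quoted in Stage 3 can be reproduced with a short calculation. The value of 52 points per day is inferred from the 7:40 am to 4:10 pm window at 10-minute resolution, and taking the validation set as the last 10% of the training block is an assumption made for this sketch.

```python
POINTS_PER_DAY = 52  # 10-minute samples from 7:40 am to 4:10 pm inclusive


def chronological_split(series, test_days=14, val_fraction=0.10):
    """Hold out the last test_days as the test set, then carve validation from training."""
    n_test = test_days * POINTS_PER_DAY
    train_full, test = series[:-n_test], series[-n_test:]
    n_val = round(val_fraction * len(train_full))
    return train_full[:-n_val], train_full[-n_val:], test


def min_max_scale(values, lo, hi):
    """Scale values to [0, 1] using bounds taken from the training data."""
    return [(v - lo) / (hi - lo) for v in values]


season = list(range(4784))  # placeholder for a 92-day seasonal series
train, val, test = chronological_split(season)
test_share = 100 * len(test) / len(season)  # percentage of points held out
```

Splitting chronologically before any scaling or decomposition keeps information from the test period out of the training pipeline, consistent with the leakage concern raised for the SWT step.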
The train-test split of the input datasets was conducted prior to SWT decomposition to prevent the leakage of training data into the testing sets, as this could add bias into the forecast [57]. In decomposing the lagged feature data series, SWT convolved each cloud statistical property and SZA signal through high- and low-pass filters into detailed components (DC) and approximation components (AC) without performing any decimation. Identifying the type of SWT scaling filter and level of decomposition was a critical task in achieving an effective wavelet-coupled model, as no specific method for such selection is confirmed in the literature [30], [39]. In our case, a trial-and-error method was adopted in selecting the best mother wavelet and decomposition level [58]. Among the SWT mother wavelets (Daubechies (db), Haar (haar), Symlets (sym), Coiflets (coif), Biorthogonal (bior), Reverse biorthogonal (rbio) and Gaussian (gaus)) and decomposition levels (2, 3, 4, 5, 6 and 7), optimum performance was achieved in designing the proposed model using the haar wavelet at a decomposition level of 2. The SWT parameter search space and optimum parameters are highlighted in Table 4. Moreover, Fig. 6 illustrates the training phase decomposition of the attribute $CBR_d$ into its detailed coefficients (D1 and D2) and approximation coefficient (A2) at the 10-minute forecast horizon in summer. The other attributes were decomposed in a similar manner for all four forecast timescales. While A2 seems to be in phase with the original undecomposed predictor variables, D1 and D2 replicate greater details of the subtle but significant patterns in the time-series inputs.
Stage 5: In this stage, we discuss the architectural design of the proposed hybridized convLSTM model and hyperparameter optimization using the O algorithm.
The architecture of the deep learning convLSTM model consists of two convLSTM2D layers that robustly extract the complex behavior of antecedent lagged features. With ReLU assigned as the activation function for the two layers, the layers were allocated 100 and 44 filters, respectively. These were the optimal numbers of filters tuned by the O algorithm. A flattening layer was integrated after each convLSTM2D layer. Finally, a dense layer was utilized to generate forecasts of future UVI as output. An improved performance was achieved with O-optimized hyperparameters that include a batch size of 104 and 189 epochs. As regularization to reduce overfitting and to improve the training performance, a dropout rate of 0.1 was applied. To further minimize the issue of overfitting, we adopted a 10-fold cross-validation strategy. Table 4 presents the search space and optimal hyperparameters of the proposed W-O-convLSTM model (also labeled as M1). To comprehensively benchmark the proposed W-O-convLSTM model, we deployed other highly competitive counterparts. These counterparts were non-wavelet-based models that were developed using convLSTM, convolutional neural network (CNN), support vector regression (SVR) and passive-aggressive (PA) models. The hyperparameters of these benchmarked models were also optimized using the O algorithm. We designated these models as O-convLSTM (also labeled as M2), O-CNN (M3), O-SVR (M4) and O-PA (M5), respectively. In an earlier study by [7], UVI was forecast by developing machine learning models using a single predictor input of SZA without considering the cloud cover effects. In this study, we design our deep learning UVI forecasting model (M1) using the attributes of cloud statistical properties (that define the cloud cover conditions) and SZA, and we hypothesize that M1 will yield superior performance in comparison with a deep learning baseline model developed using the predictor input of SZA alone.
We designated the baseline model developed using SZA as W-O-convLSTMsza (also labeled as M6). Though M1 and M6 were fed with different predictor inputs, they were both wavelet hybrid convLSTM models with similar architectural designs. It was important to compare the performance of our objective model (M1) with the baseline model (M6) due to the significant dependence of UVI on SZA. During daylight hours, SZA is defined and is highly correlated with UVI (see, for instance, Table 1 and Table 2).

E. PERFORMANCE EVALUATION OF THE MODEL
To confirm the superiority of the W-O-convLSTM model in UVI forecasting, we evaluated this model against the baseline and benchmarked models. To validate that the use of cloud cover effects could further improve the performance of the objective model, we evaluated our model alongside the baseline model. Additionally, by evaluating our objective model (SWT-based model) alongside the benchmarked models (non-SWT-based models), we validated the superiority of employing SWT over non-SWT model design in forecasting UVI. Here, our focus was to evaluate SWT-based against non-SWT-based models, so other wavelet transforms such as DWT were not evaluated. While DWT is a standard frequency transform, it was not applied to benchmark SWT in our study because the DWT algorithm exhibits significant problems associated with signal decimation. Such decimation effects induce a bias in the model that makes the signal unsuitable for data preprocessing [30], [37]. On the other hand, SWT is a modified version of the conventional DWT that utilizes an à trous algorithm to overcome the issues of signal decimation [38]. This drawback of DWT supports the choice of SWT for data preprocessing, thus eliminating the need for evaluating DWT-based models.
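The undecimated behavior of SWT referred to above can be illustrated with a minimal hand-written sketch of a two-level Haar à trous transform. This is not the authors' implementation: the function names, the periodic boundary handling and the toy series are assumptions made for illustration, and library implementations may use different sign or shift conventions. The sketch shows that every coefficient series keeps the length of its input, and that the train and test segments are decomposed separately, as the text describes.

```python
import math


def haar_swt(signal, level=2):
    """Undecimated (a trous) Haar transform with periodic boundaries.

    Returns (approximation, [D1, ..., D_level]). No downsampling is done,
    so every output series has the same length as the input.
    """
    n = len(signal)
    approx = list(signal)
    details = []
    for j in range(level):
        step = 2 ** j  # the "holes": filter taps sit 2**j samples apart
        shifted = [approx[(i + step) % n] for i in range(n)]
        details.append([(a - b) / math.sqrt(2) for a, b in zip(approx, shifted)])
        approx = [(a + b) / math.sqrt(2) for a, b in zip(approx, shifted)]
    return approx, details


def haar_iswt_step(approx, detail):
    """Recover the previous-level approximation from one (A_j, D_j) pair."""
    return [(a + d) / math.sqrt(2) for a, d in zip(approx, detail)]


# Hypothetical lagged predictor series, split BEFORE decomposition so that
# no test-set sample influences the training coefficients.
series = [0.2, 0.4, 0.9, 1.3, 1.1, 0.7, 0.5, 0.3, 0.6, 1.0, 1.4, 0.8]
train, test = series[:8], series[8:]

a2_train, (d1_train, d2_train) = haar_swt(train, level=2)
a2_test, (d1_test, d2_test) = haar_swt(test, level=2)
```

In this sketch, A2, D1 and D2 would each be fed to the model as separate input series, which is only possible because they stay aligned sample-by-sample with the original signal.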
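A number of robust statistical metrics were applied to rigorously evaluate the hybridized W-O-convLSTM model alongside other competing counterparts in forecasting short-term UVI. For this study, the commonly adopted model score metrics [18] were employed, such as Pearson's correlation coefficient (r), mean absolute error (MAE), root mean square error (RMSE), Willmott's Index (WI), Nash-Sutcliffe efficiency (NSE), Legates-McCabe's index (LM) and the percentage errors RRMSE and RMAE.
In these metrics, $UVI_i^O$ and $UVI_i^F$ denote the observed and forecasted UVI for the $i$th observation, $\overline{UVI}^O$ and $\overline{UVI}^F$ denote the average observed and forecasted UVI, and $N$ is the total number of observations.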
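Because the metric equations did not survive in this copy of the text, the following sketch shows how these scores are commonly computed. It uses the standard textbook forms of r, MAE, RMSE, WI, NSE and LM; that the article's own equations match these forms exactly is an assumption, and the function name and toy data are hypothetical.

```python
import math


def forecast_metrics(obs, fc):
    """Common forecast scores from observed (obs) and forecasted (fc) values."""
    n = len(obs)
    o_bar = sum(obs) / n
    f_bar = sum(fc) / n
    err = [f - o for o, f in zip(obs, fc)]
    sse = sum(e * e for e in err)      # sum of squared errors
    sae = sum(abs(e) for e in err)     # sum of absolute errors
    cov = sum((o - o_bar) * (f - f_bar) for o, f in zip(obs, fc))
    var_o = sum((o - o_bar) ** 2 for o in obs)
    var_f = sum((f - f_bar) ** 2 for f in fc)
    wi_den = sum((abs(f - o_bar) + abs(o - o_bar)) ** 2 for o, f in zip(obs, fc))
    return {
        "r": cov / math.sqrt(var_o * var_f),   # Pearson's correlation
        "MAE": sae / n,                        # mean absolute error
        "RMSE": math.sqrt(sse / n),            # root mean square error
        "WI": 1 - sse / wi_den,                # Willmott's Index
        "NSE": 1 - sse / var_o,                # Nash-Sutcliffe efficiency
        "LM": 1 - sae / sum(abs(o - o_bar) for o in obs),  # Legates-McCabe
    }


# Toy example: a forecast with a constant bias of +0.5.
observed = [1.0, 2.0, 3.0, 4.0]
forecast = [1.5, 2.5, 3.5, 4.5]
scores = forecast_metrics(observed, forecast)
```

The toy case illustrates why several criteria are reported together: a constant bias leaves r at its maximum while NSE and, more strongly, LM are penalized.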
It should be noted that differences in these score metrics may arise by chance rather than reflect a decisive difference in forecast skill. So, to prevent rejection of an equally good parallel model due to stochastically generated performance metrics, we further evaluated the forecast accuracies using a statistical test known as the Diebold-Mariano (DM) test. For details of the DM test, readers may refer to [59].
Table 5 presents the testing phase performance evaluation of the developed models for the seasons of autumn, winter, spring and summer at different forecasting timescales (the optimal performance is highlighted in red). In almost all cases, the proposed hybridized W-O-convLSTM model attains the highest Pearson's correlation coefficient (r), the lowest mean absolute error (MAE) and the lowest root mean square error (RMSE), outperforming the comparative models in forecasting seasonal UVI at the 10M, 20M, 30M and 60M horizons. Overall, the proposed model shows its best performance against the competing counterparts at the autumn-based 10M forecast horizon, with statistical scores of r = 0.961 and MAE = 0.017 (the corresponding RMSE is listed in Table 5).
To completely gauge and understand the W-O-convLSTM model, it was rigorously evaluated with Willmott's Index (WI), Nash-Sutcliffe efficiency (NSE) and the most stringent metric, Legates-McCabe's index (LM). These evaluation statistics are presented in Fig. 7 after aggregating the initial results of the four seasons with averages so that extensive comparative outcomes could be delivered at multiple forecasting timescales. The observed trends in aggregated and non-aggregated statistics were very similar.
In conjunction with the statistical metrics, the percentage errors, such as RRMSE and RMAE, were further employed as alternative score metrics to enable the model comparison during the four different seasons.
For instance, the seasonal performance comparison of the proposed W-O-convLSTM model against its counterparts is presented using radar plots in Fig. 8 at the 10M forecast horizon. The objective model captured the lowest RRMSE and RMAE values with RRMSE = 18.226% and RMAE = 28.426% in autumn, RRMSE = 26.324% and RMAE = 19.318% in winter, RRMSE = 17.697% and RMAE = 18.936% in spring and RRMSE = 17.173% and RMAE = 16.224% in summer. By displaying better performance than the comparative models in all four seasons, our newly designed model is shown to be highly competent for delivering more accurate forecasts of UVI at the 10M forecast horizon. Similar performance was achieved by the objective model at all the other forecast horizons.
A DM test was implemented to compare the forecasting performance of the objective model with its counterparts. The null hypothesis ($H_0$) was set as: the observed differences between the performances of two forecasting models are not significant. $H_0$ was tested against the alternative hypothesis ($H_A$), which was set as: the observed differences between the performances of two forecasting models are significant. By conducting this statistical test at a 5% level of significance, we rejected $H_0$ if |DM| > 1.96. The outcomes of the DM tests are presented in Table 6 for the 10M horizon, where the calculated DM statistics are mostly either greater than 1.96 or less than −1.96. In accordance with these statistics, we conclude that the difference in UVI forecasts from the two predictive models is statistically significant in most cases ($H_0$ is rejected). The test implies that our W-O-convLSTM model mostly shows greater accuracies. The only exceptions are the comparisons of our objective model with O-convLSTM (M2) in spring and summer, where the DM statistics are −0.359 and 0.161, respectively. Possibly due to stochastic interference, the observed differences between the performances of these two forecast models are not significant, so their accuracies are statistically indistinguishable. Otherwise, in most cases, our proposed model delivers superior performance. Similar outcomes of the DM tests were obtained for the other forecast horizons.
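The decision rule above can be sketched as follows. This is a minimal illustration of the DM statistic, assuming a squared-error loss and a sign convention in which a negative value favors the first model; the loss function, sign convention and forecast data used in the article are not stated here, so the function name, its parameters and the toy series below are all hypothetical.

```python
import math


def diebold_mariano(obs, fc_a, fc_b, h=1):
    """DM statistic for two competing forecasts under squared-error loss.

    d_t = e_a(t)^2 - e_b(t)^2, so DM < 0 means model A has the lower loss.
    For an h-step forecast, autocovariances up to lag h-1 enter the variance.
    """
    n = len(obs)
    d = [(o - a) ** 2 - (o - b) ** 2 for o, a, b in zip(obs, fc_a, fc_b)]
    d_bar = sum(d) / n

    def autocov(k):
        return sum((d[t] - d_bar) * (d[t - k] - d_bar) for t in range(k, n)) / n

    long_run_var = autocov(0) + 2 * sum(autocov(k) for k in range(1, h))
    return d_bar / math.sqrt(long_run_var / n)


# Toy data: model A has small errors, model B has larger ones.
observed = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
err_a = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
err_b = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6]
model_a = [o - e for o, e in zip(observed, err_a)]
model_b = [o - e for o, e in zip(observed, err_b)]

dm_ab = diebold_mariano(observed, model_a, model_b)
dm_ba = diebold_mariano(observed, model_b, model_a)
reject_h0 = abs(dm_ab) > 1.96  # 5% significance level, two-sided
```

Swapping the two models only flips the sign of the statistic, which is why Table 6 can contain both large positive and large negative values while leading to the same conclusion.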
To further examine the success of the W-O-convLSTM model in UVI forecasting, the observed and forecasted values were plotted against each other (for the objective model) in Fig. 9, and the absolute forecasted error (for all predictive models) was plotted in Fig. 10, both for the 10M horizon. The scatterplots presented in Fig. 9 display a least squares regression line ($UVI_{for} = m\,UVI_{obs} + c$, where c is the ordinate intercept and m is the gradient) between the observed and forecasted UVI. For an optimally performing model, the $R^2$ value is close to 1, while the m and c values are very close to 1 and 0, respectively [58]. In our case, the W-O-convLSTM model performs very well in all seasons, having the most efficient performance in autumn with $R^2$ = 0.923 and m = 0.918, and it produces forecasts with robust adaptability to seasonal and diurnal variations, particularly under stochastic cloud cover conditions. As shown in Fig. 10, the boxplots of the absolute forecasted error |FE| (i.e. $|FE| = |UVI_{for} - UVI_{obs}|$) explore the precision of the W-O-convLSTM model against the comparative models in terms of the lower quartile, upper quartile, median, maximum, minimum and data outliers. Upon comparison, the boxplots show that the error distribution of the objective model has a smaller spread and lower quartile and median values. Because the designed models did not achieve a correlation coefficient of 1, some outliers were observed in |FE|. Mostly in summer and spring, the designed models could not capture all of the higher variability in cloud type, ozone column and aerosol effects (i.e. dust and smoke). Despite the presence of some outliers in |FE|, the proposed W-O-convLSTM model yielded high values of r (mostly greater than 0.9) in all four seasons and at all the forecast timescales, as indicated in Table 5.
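The scatterplot and boxplot diagnostics described above reduce to a few quantities that can be computed directly. The sketch below fits the least squares line of forecasted on observed UVI, reports m, c and R², and summarizes |FE|. The function names and toy values are hypothetical, and the quartile convention (the default of Python's `statistics.quantiles`) is an assumption that may differ from the plotting software used for Fig. 10.

```python
import statistics


def fit_line(obs, fc):
    """Least squares fit fc = m * obs + c; returns (m, c, R^2)."""
    n = len(obs)
    o_bar = sum(obs) / n
    f_bar = sum(fc) / n
    sxx = sum((o - o_bar) ** 2 for o in obs)
    sxy = sum((o - o_bar) * (f - f_bar) for o, f in zip(obs, fc))
    m = sxy / sxx
    c = f_bar - m * o_bar
    ss_res = sum((f - (m * o + c)) ** 2 for o, f in zip(obs, fc))
    ss_tot = sum((f - f_bar) ** 2 for f in fc)
    return m, c, 1 - ss_res / ss_tot


def abs_error_summary(obs, fc):
    """Boxplot-style summary of |FE| = |forecast - observed|."""
    abs_fe = sorted(abs(f - o) for o, f in zip(obs, fc))
    q1, median, q3 = statistics.quantiles(abs_fe, n=4)
    return {"min": abs_fe[0], "q1": q1, "median": median,
            "q3": q3, "max": abs_fe[-1]}


# Toy observed and forecasted UVI values.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
forecast = [1.1, 1.9, 3.2, 3.8, 5.3]

m, c, r2 = fit_line(observed, forecast)
summary = abs_error_summary(observed, forecast)
```

A model close to the ideal would return m near 1, c near 0 and R² near 1, together with a narrow |FE| summary.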
The newly designed W-O-convLSTM model yielded high correlation coefficients (r values) in UVI forecasting. The high r values were achieved because the predictor inputs displayed a high correlation with the UVI. The study by [60] revealed that there is a high correlation between the monthly average SZA and UVI (≈88% or 0.88). Similarly, in our study, Table 1 shows a high r value of 0.89 between the 10-minute SZA and UVI. Together with SZA, we further integrated cloud statistical properties that were also correlated with UVI to generate UVI forecasts. With features that are highly correlated with the target, the simulations of UVI forecasts in this study yielded high r values. Another similar study by [61] integrated SZA and cloud statistical properties with a CNN-LSTM model to forecast photosynthetic photon flux density (PPFD). The outcomes revealed that the model captured a high r value of 0.92 in generating forecasts of PPFD. Moreover, our study forecasted very short-term UVI at 10M, 20M, 30M and 60M. For a very short-term forecast, there would be a high correlation of the immediate past value with the current value, and a high correlation of the current value with the future value. To further justify the high r values captured by the objective model, we calculated Murphy's skill score (SS). The work of [62] reveals that the derived decompositions of SS yield analytical relationships between the respective skill scores and the coefficient of correlation between the observations and forecasts. Table 7 presents the SS of the W-O-convLSTM model in generating UVI forecasts at multi-step horizons.
Overall, the evaluation outcomes and results exemplified in Tables 5 and 6, as well as in Figs. 7-10, demonstrate the robustness and efficacy of the newly proposed W-O-convLSTM model in generating cloud-affected UVI forecasts with respect to its counterpart models at multi-step timescales.
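The analytical link between Murphy's skill score and r mentioned above can be written out explicitly. The form below assumes that [62] refers to the widely used mean-square-error skill score with the observed mean as the reference forecast; the symbols $s_F$ and $s_O$ (standard deviations of the forecasted and observed UVI) are introduced here for illustration and are not taken from the article.

```latex
% Assumed form: MSE-based skill score with the observed mean as reference.
\begin{align}
SS &= 1 - \frac{\frac{1}{N}\sum_{i=1}^{N}\left(UVI_i^F - UVI_i^O\right)^2}
               {\frac{1}{N}\sum_{i=1}^{N}\left(UVI_i^O - \overline{UVI}^O\right)^2} \\
   &= r^2 - \left(r - \frac{s_F}{s_O}\right)^2
          - \left(\frac{\overline{UVI}^F - \overline{UVI}^O}{s_O}\right)^2
\end{align}
```

Under this form, $r^2$ is an upper bound on SS: the second term penalizes a mismatch between the forecast and observed variability, and the third penalizes overall bias, so a high r translates into a high SS only when both penalties are small.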
It was essential to apply several statistical criteria, as a single indicator may not portray the shortcomings of each predictive model [63]. After exploring the performance against benchmarked and baseline models, the findings reveal that our wavelet-based hybrid W-O-convLSTM captures comparatively larger values of r, WI and NSE, smaller values of MAE and RMSE, lower percentage errors of RRMSE and RMAE and better values of $R^2$, m and c. The superiority of W-O-convLSTM is further elucidated by larger values of the most stringent metric, i.e. LM. In terms of the forecast timescales, more accurate and efficient forecasts of cloud-affected UVI are achieved at the shortest forecast horizon (10M). The stochastic nature of the cloud is best captured on short time scales, as even the slightest position change can vastly change the available UV. Our objective model not only presents the most precise performance, with lower |FE|, but also demonstrates its forecasting adaptability for all four seasons in Queensland. Out of the four seasons, the aforementioned performance metrics indicate that our wavelet-hybridized model generates the best forecasts in autumn and delivers slightly lower performance in winter at all the forecasting timescales. However, this discrepancy is subtle and, relative to the counterpart models, the proposed W-O-convLSTM model still delivers the best forecasting skill for all four seasons. The robustness of our SWT-based objective model over the non-SWT-based benchmarked models is an outcome of exploiting SWT, which successfully addressed the issues of non-stationarity in the cloud statistical properties prior to simulating UVI forecasts.
To validate the influence of cloud movements on UVI, our cloud properties-based W-O-convLSTM model is gauged against the SZA-based W-O-convLSTMsza model (the baseline model developed with a single predictor input of SZA). In accordance with the captured results in Tables 5 and 6 and Figs. 7-10, the objective model displays superior performance over the baseline model, thus affirming the significance of stochastic cloud effects on ground level UVR.
In this study, the development of a multiple-input, multi-step-output W-O-convLSTM model entails many advantages. Firstly, after rigorous evaluation using robust statistical metrics, the model displays superior and enhanced performance in forecasting short-term UVI for Australia. Secondly, the enhancement in simulations of future UVI can serve as a powerful clinical tool to inform the public of more accurate sun-protection times and mitigate skin and eye health risks under different cloud cover conditions. Moreover, our improved image segmentation technique shows potential applicability in modeling UVI with cloud cover conditions for other temperate countries. Our image segmentation techniques may also be applicable in designing robust predictive models to improve solar radiation forecasts. This may benefit the energy sector for solar energy monitoring under cloud-affected skies. Additionally, such cloud segmentation techniques can be integrated into modeling photosynthetic active radiation to facilitate healthy plant growth and benefit the agricultural sector. Despite the excellent performance of the newly proposed W-O-convLSTM model, it exhibits a minor limitation. In model designing, we did not use the aerosol and ozone datasets, as these were not available for our site at a 10-minute time resolution. These are two important atmospheric variables that also affect the ground-based UVI through absorption and scattering processes. However, in our study, we utilized the time-lagged Bentham UVI datasets that already captured some ozone and aerosol effects. For future studies, integrating ozone and aerosol datasets may further improve the UVI forecasting framework.

VI. CONCLUSION
We proposed a novel solar UVI forecasting framework by building a hybrid deep learning and multi-step input system, denoted as the W-O-convLSTM model, integrating antecedent lagged memory of cloud cover properties with SZA. The newly developed model was validated with data extracted for four different seasons at study sites in Queensland, Australia, where solar UV radiation currently poses a serious risk in terms of increasing skin cancer and eye diseases such as pterygium, cataracts, or other eye health ailments. A 3-phase model design approach was employed, which entailed an input selection process with BorutaShap, data decomposition using the SWT and a hyperparameter optimization stage with the Optuna algorithm. We performed a holistic evaluation of the predictive model through statistical metrics and diagnostic plots of predicted and measured UVI to elucidate the superior forecasting skill of the proposed W-O-convLSTM model over its benchmark models. For the forecast horizons of 10 minutes (10M), 20 minutes (20M), half-hourly (30M) and hourly (60M) scales, we noted an accurate performance of the proposed W-O-convLSTM model, which has also captured the stochastic effects of cloud cover. Thus, our newly proposed model is a promising tool for real-life public health applications, such as delivering sun protection behavior recommendations that can help mitigate skin cancer and eye disease risk.
Our study advances an earlier work [7] that used the solar zenith angle as a single input to predict the solar UV index, and it is a pioneering next-stage study in developing an artificial intelligence-based predictive model, particularly by integrating cloud cover conditions. However, in a future study, we may integrate the actual measured values of aerosol and ozone effects together with the solar zenith angle and the cloud cover effects to further enhance the predictive framework for real-time UVI forecasting.

ACKNOWLEDGMENT
The datasets were collected at the University of Southern Queensland (USQ), Toowoomba Campus experimental facility. The authors sincerely acknowledge all expert reviewers whose comments have improved the final article.
RAVINESH C. DEO (Senior Member, IEEE) leads the Advanced Data Analytics Laboratory as a Professor with the University of Southern Queensland, Australia. He is a Clarivate Highly Cited Researcher with publications ranking in top 1% by citations for field and publication year in the Web of Science citation index and is among scientists and social scientists who have demonstrated significant broad influence, reflected in the publication of multiple papers frequently cited by their peers. He leads cross-disciplinary research in deep learning and artificial intelligence, supervising 20+ Ph.D./M.Sc. degrees. He has received Employee Excellence Awards, Elsevier Highly Cited Paper Awards, and Publication Excellence and Teaching Commendations. He has published more than 235 articles, 150 journals, and seven books in Elsevier, Springer, and IGI with 23 book chapters. He has cumulative citations that exceed 8,700 with an H-index of 51.
NATHAN DOWNS is an Associate Professor with the School of Sciences, University of Southern Queensland, Australia, where he is actively involved in the quantification of personal solar ultraviolet exposure through the use of smartphone technologies, assessing occupational exposure risks in global population groups, and the measurement of spectral radiation during extreme aerosol events. His research interests include photobiology, atmospherics, and skin cancer epidemiology.
DAMIEN IGOE is an Adjunct Lecturer with the School of Sciences, University of Southern Queensland, Australia. He is an experienced Teacher in science and mathematics. His research interests include the development of methods to measure solar ultraviolet radiation, particularly using smartphone technology, urban air quality, data analysis and modeling, and STEM education.

ALFIO V. PARISI is a Physicist and an Honorary Professor with the School of Sciences, University of Southern Queensland, Australia. His research interests include the development of techniques in solar ultraviolet (UV) dosimetry, radiometry, and spectroradiometry, including the development of new methods for solar measurement applied to provide improved characterization of the solar UV environment, and measurement of solar exposures to plants.
JEFFREY SOAR came to academic research from a long and distinguished career in industry including as a Chief Information Officer in government agencies in Australia and New Zealand. He is the Personal Chair of Human-Centered Technology with the School of Business, University of Southern Queensland. His research interests include AI, e-business, e-health, technology and development, and social and organizational change.