Groundwater Level Prediction Model Using Correlation and Difference Mechanisms Based on Boreholes Data for Sustainable Hydraulic Resource Management

Drilling data for groundwater extraction undergo changes over time due to variations in hydrogeological and weather conditions. Whenever a change in drilling operations must be deployed, drilling companies monitor the time-series drilling data to make sure it does not introduce any changes or new errors. Therefore, a solution is needed to predict groundwater levels (GWL) and detect changes in boreholes data to improve drilling efficiency. The proposed study presents an ensemble GWL prediction (E-GWLP) model that combines boosting and bagging models through a stacking technique to predict GWL for enhanced hydraulic resource management and planning. The proposed research study consists of two modules: descriptive analysis of boreholes data, and a GWL prediction model using a stacking-based ensemble. First, descriptive analysis techniques, such as correlation analysis and difference mechanisms, are applied to the boreholes log data to extract its underlying characteristics, which is critical for enhancing hydraulic resource management. Second, an ensemble prediction model is developed from multiple hydrological patterns using robust machine learning (ML) techniques to predict GWL for improved drilling efficiency and water resource management. The architecture of the proposed ensemble model involves three boosting algorithms as base models (level-0) and a bagging algorithm as a meta-model (level-1) that combines the base models' predictions. The base models consist of the following boosting algorithms: eXtreme Gradient Boosting (XGBoost), AdaBoost, and Gradient Boosting (GB). The meta-model uses Random Forest (RF) as the bagging algorithm, referred to as the level-1 model. Furthermore, different evaluation metrics are used, including mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE), and R2 score.
The performance of the proposed E-GWLP model is compared with existing ensemble and baseline models. The experimental results reveal that the proposed model performs accurately, with MAE, MSE, and RMSE of 0.340, 0.564, and 0.751, respectively. The MAPE and R2 score of the proposed approach are 12.658 and 0.976, respectively, which signifies the importance of our work. Moreover, the experimental results suggest that the E-GWLP model is suitable for sustainable water resource management and improves reservoir engineering.

groundwater resources results in a rapid increase in water supply crises [1], [2]. Groundwater is a scarce, unseen water resource stored in natural reservoirs in soil or rocks beneath the earth's surface [3]. Groundwater plays a vital role in fulfilling the requirements of industrial development and economic growth, and in providing safe water to living beings worldwide [4], [5]. However, in recent years it has been decreasing due to improper extraction and overexploitation of groundwater resources [6]. Drilling is widely used to extract groundwater resources to fulfill the needs of living beings. Increased groundwater demand and its exploitation have intensified the drilling process for groundwater extraction. Drilling and extraction of groundwater may lead to a decline in groundwater resources, increased borehole depth, and higher drilling costs [7]. The drilling process for reaching the groundwater level carries significant risks and complexities concerning the economy, environment, and sustainability [8].
Drilling boreholes to reach the GWL is a complicated process that consumes a massive budget due to dynamic variations in hydrogeological characteristics. Factors influencing the cost of the drilling process include the type of soil, land layer, borehole depth, intended use, machinery, skilled workforce, and materials needed [9]. Hence, drilling depth prediction is crucial for improving the overall drilling process, holistic management of hydraulic resources, city development, underground safety, risk assessment, etc. However, GWL prediction is a complex and dynamic process due to variations in hydrogeological properties. Unfortunately, none of the existing work has achieved reliable prediction accuracy because of the complex parameters influencing borehole depths [10]. Proper utilization of time-series analysis of boreholes log data and mathematical tools can help predict GWL and thereby enhance the efficiency of future boreholes.
Groundwater plays a vital role in the irrigation and food production of a country [11]. Groundwater usage has grown enormously during the past few decades; one of the primary reasons is the advancement in drilling technologies [12]. Increased water usage has raised the demand for drilling for groundwater. Due to rapid climatic and geological changes, the prediction of groundwater-related aspects has become difficult. Therefore, time-series analysis of ground-level data and prediction of future trends in land subsidence are immensely beneficial for achieving sustainability and efficient use of resources. Time-series analysis of groundwater level data aids the detection of trends, patterns, and behaviors for identifying declining water levels, and time-series modeling provides a better-fitting model than other groundwater level data models.
Noisy and varying time-series boreholes data make it challenging to search for and locate differences and dissimilarities in time-series data at a large scale. Efficient systems and techniques to handle the huge amount of available data and improve the drilling process are lacking [13]. Time-series boreholes data possess high dimensionality, resulting in slower access times and high computational complexity. The key issues are fast search over real time-series data sets and the fact that string matching and direct indexing cannot be applied precisely to time series; therefore, distance functions and algorithms much faster than a simple linear scan are employed. Furthermore, applying analysis techniques to the original borehole time-series data becomes computationally expensive in terms of time and storage. Difference functions are undoubtedly significant for time-series modeling and prediction, because it is not practical to apply machine learning techniques to raw, unpreprocessed time-series data. Therefore, a higher-level representation of the data is needed for efficient computation and extraction of higher-order features. A vast number of methods exist for generating differences between time-series data, including the Discrete Fourier Transform (DFT) [14], Discrete Wavelet Transform (DWT) [15], piecewise aggregate approximation [16], and the 1-lag difference algorithm [17], to name a few.
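As a concrete illustration of the 1-lag difference algorithm mentioned above, the following is a minimal sketch; the function and variable names are our own illustrations, not taken from the cited work:

```python
import numpy as np

def lag1_difference(series):
    """Return the lag-1 (first-order) difference of a time series.

    Differencing replaces each value with its change from the previous
    step, which removes trend and exposes local changes in the data.
    """
    series = np.asarray(series, dtype=float)
    return series[1:] - series[:-1]

# A series with a steady upward trend differences to a constant.
trend = [10.0, 12.0, 14.0, 16.0]
print(lag1_difference(trend))  # -> [2. 2. 2.]
```

The differenced series is one element shorter than the input, which is why differencing is usually applied before, not after, windowing the data.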
The drilling of boreholes generates a vast amount of boreholes log data. There are various sources from which to acquire boreholes data, including drilling activity breakdown, soil colors, land layers, geology and casing information, bottom-hole assembly, and bit information [18]. Essential features of borehole time-series data are high dimensionality and dynamicity. The speed at which boreholes data grow is not matched by the corresponding development of data interpretation and analysis techniques [19]. At present, the drilling industry faces a major challenge in finding ways to tackle such huge volumes of boreholes data for analysis and modeling. The ability to measure the differences between instances is crucial to various data mining applications. Time series are complex data objects found in many applications such as the stock market, hydrogeology, medicine, and telecommunications. The enormous increase in data-generating and data-collecting devices has resulted in the construction of time-series databases. Time-series data analysis and evaluation techniques are highly demanded by data scientists for comparing values, trends, patterns, and periodicity.
With the development of robust time-series models, it is now quite possible to develop efficient ML models using time-series boreholes data. In recent years, technological advancements in ML have brought breakthrough changes in efficient data processing and data mining solutions such as XGBoost, artificial neural networks (ANN), deep neural networks (DNN), and support vector regression (SVR). All these powerful techniques have improved prediction performance on complex time-series data. ML techniques have been widely utilized in many areas, such as regression [20], classification [21], [22], pattern mining [23], [24], and decision-making systems [25], [26], to name a few. ML-based approaches tend to produce more robust predictions than conventional methods due to their ability to limit uncertainty over input variables with various nonlinear dependencies, generating accurate and reliable predictions. Therefore, in this research study, we employ ML-based ensemble and conventional techniques to predict GWL for sustainable water resource management.
The notable contributions of our proposed work are:
• Employing data and predictive analytics to predict GWL for sustainable water resource management.
• Integrating boosting and bagging models using a stacking technique to develop an ensemble prediction model for GWL, facilitating hydraulic management for sustainable groundwater resources.
• Utilizing different descriptive analyses to investigate boreholes time-series data and extract underlying hydrogeological patterns. The descriptive analysis includes boreholes log data analysis based on borehole depth, and analysis of boreholes data according to soil color patterns, rock unit, and stratum layer, to name a few.
• Computing different hydrogeological parameters from the historical boreholes log data: total borehole depth, total number of days spent on each borehole, core soil color, core rock layer, and core stratum layer.
• Presenting a detailed comparative study to signify the significance of the E-GWLP model compared to the existing baseline models.
The rest of the paper is organized as follows. Section II presents a detailed review of the existing GWL prediction models; Section III describes the proposed methodology of the E-GWLP model; Section IV describes the boreholes log data; Section V presents the data preprocessing, descriptive data analysis, and features extraction modules; Section VI presents the proposed difference mechanism to detect change in time-series data; Section VII discusses the implementation and experimental environment; Section VIII presents the prediction results and analysis; and Section IX concludes the proposed E-GWLP model.

II. LITERATURE REVIEW
In this section, a detailed survey is conducted to highlight the strengths and weaknesses of the existing GWL prediction models. GWL prediction is considered a challenging task due to improper extraction, dynamic variations in hydrogeological properties, and over-exploitation [6]. Recently, different ML and mathematical models have been suggested by researchers to predict GWL [27], [28]. Existing prediction models have been developed to match the complexities and required estimation accuracy of GWL arising from different hydrogeological and structural properties [27], [29]. In the last few years, most research studies have used soft-computing techniques for GWL prediction [27]. These soft-computing techniques include ANN [30], support vector machines (SVM) [31], and adaptive neuro-fuzzy inference systems (ANFIS) [32].
The aforementioned soft-computing techniques have been widely used to predict hydrological parameters due to multiple factors, such as low computational complexity, high precision, and fast training and inference times, to name a few [33]. For instance, in [34], the authors developed a hybrid prediction model based on ANN and wavelet theory to predict GWL in Canada; they modeled fluctuations in GWL based on monthly recorded temperature. In [35], the authors developed and compared a feed-forward ANN with a conventional regression model for estimating GWL at 1-hour intervals. In [36], ANN and ANFIS models were developed to simulate and predict GWL in Iran. The authors considered the following three parameters as inputs to train and test the ANN and ANFIS models: irrigation return flow, prediction rate, and pumping rate. The results revealed that the ANFIS model performed more accurately than the ANN. Another study, presented in [37], applied ANN and SVM techniques to predict GWL based on boreholes data acquired from 5 stations in the Republic of Korea; the results indicated that the SVM model was more precise and accurate than the conventional ANN model. Furthermore, a study presented in [38] utilized ANN and SVM to predict water table depth.
In the last few years, other ensemble and conventional ML models have also been developed to predict GWL for sustainable water resource management [39], [40]. In [39], the authors presented an ensemble model based on KNN and RF for three-months-ahead prediction of the groundwater table based on seasonal changes. In [40], the authors proposed an enhanced RF prediction model based on combinations of random features to forecast GWL using two features: temperature (Celsius) and precipitation (millimeters). The authors reported an R2 score of 0.8223 for long-term forecasting with the enhanced RF, which still leaves room for improvement. The RF model can efficiently handle both small and large datasets [41]. It is a robust ML model that generalizes well, overcoming overfitting issues in hydrological modeling applications [42]. The authors of [40] developed their enhanced RF model to forecast GWL in data-scarce regions. A detailed comparative study is presented in [43] to explain the wide range of RF applications in hydrogeology. Another study, presented in [44], implemented RF using a geographic information system (GIS) based on potential mapping for predicting groundwater level; the authors developed potential maps that can be applied to underground resource exploration. In [45], the authors developed an RF-based classification model to predict the layer from which to extract underground water samples; the model was built on the main ion composition of the underground water samples. Efficient modeling of boreholes log data is vital for sustainable hydraulic resource development and management. In [46], a prediction model was developed based on an RF model to predict water level variations of a lake for sustainable development; the experimental results of the RF model were compared with existing ML models: ANN, SVM, and linear regression (LR).
Likewise, statistical techniques have been employed to predict GWL based on time-series data. These methods have been proposed for evaluating temporal trends in groundwater, ranging from regression analysis to complex parametric and non-parametric techniques. One drawback of using a simple regression model is its inability to handle non-linear patterns [47]. Frequently used time-series prediction models include the autoregressive integrated moving average (ARIMA), regression analysis, and exponential smoothing. In [47], the authors proposed a non-parametric approach (Mann-Kendall) for the analysis of trends in groundwater level. Another study used a geostatistical approach to predict spatial and temporal groundwater variation using ARIMA and a sequential Gaussian simulation method [48]. In [49], the authors employed time-series modeling to forecast fluctuations in groundwater levels; likewise, groundwater levels were predicted using integrated time series, ARIMA, and Holt-Winters exponential smoothing (HWES), with experimental results showing superior performance for the HWES approach. For trend analysis, a new approach called innovative trend analysis (ITA), based on a statistical method, has been used by many researchers; ITA performs a comparative analysis of time-series data without considering statistical assumptions [50]. Furthermore, in [51], a novel method was proposed for the identification of trends and their magnitudes in groundwater levels under temperate climatic conditions, for efficient management of scarce water resources.
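To make the non-parametric trend idea concrete, a minimal sketch of the Mann-Kendall statistic is given below. This is the textbook form without tie correction, not the implementation used in [47]; the sample water-level values are invented for illustration:

```python
import math

def mann_kendall(series):
    """Mann-Kendall trend test statistic for a time series.

    S sums the signs of all pairwise later-minus-earlier differences;
    a strongly negative S indicates a declining trend (e.g. falling
    groundwater levels). Z is the normal approximation, valid for
    roughly n > 10 when there are no ties.
    """
    n = len(series)
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z

# A monotonically declining water level gives the most negative S.
levels = [20.0, 19.5, 19.1, 18.6, 18.0, 17.2, 16.9, 16.1, 15.8, 15.0, 14.4]
s, z = mann_kendall(levels)
print(s, round(z, 2))  # S = -55, i.e. every pair is decreasing
```

Since |Z| exceeds 1.96 here, the declining trend would be judged significant at the 5% level under the normal approximation.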
A time series is a sequence of random variables across time stamps, upon which tools and mathematical models are applied to achieve a desired goal. Time-series analysis has been frequently reported in the literature for prediction, with varying complexities and accuracies [52], [53]. Prediction of a time series involves forecasting future data points based on historical data such that the error is minimized. Finding differences between time-series datasets is an integral component of the development process: comparison of data enables us to locate differences, make our analysis more comprehensive, and identify the variables that caused the difference [54]. The basic goal of difference algorithms is to deliver an efficient strategy for generating differences. Due to external events, the borehole time-series data is subject to interruption, and a difference arises between the pre- and post-intervention stages, which may be temporary or permanent.
A plethora of prediction and difference mechanisms are available in the literature to predict and compare time series. Differencing data can also be done using various test types, parametric and nonparametric. Distribution-free tests, in which no information about the distribution of the population is assumed, fall under the category of nonparametric tests and use qualitative data, e.g., the Wilcoxon, Mann-Whitney, and Kruskal-Wallis tests. In the parametric case, a normal distribution is assumed, e.g., the t-test and ANOVA [9]. Several difference/dissimilarity measures have been employed in various studies for the comparison of time-series data. The following are some statistical methods for finding differences between time-series data: the t-test [55], which deals with parametric data and compares two time series; the virtual classifier [56], for interpreting change that occurs across two consecutive windows; rank preservation [57], for comparing two matrices by taking column-wise correlation; CUSUM, also called the cumulative sum test [58], for the detection of change points in a time series; Spearman correlation [59], for measuring the association between two data groups; and the ANOVA test [60], for comparing three or more paired data groupings. To the best of our knowledge, many existing prediction models were developed based on the conventional ANN algorithm to predict GWL, and some were implemented based on SVM and ANN to forecast GWL. However, these models still do not achieve accurate prediction results due to variations in hydrogeological patterns. This study aims to develop an ensemble model by integrating boosting and bagging models using a stacking combinator to predict GWL for sustainable hydraulic planning and management.
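Of the change-detection methods listed above, CUSUM is the simplest to sketch. The following is a simplified two-sided CUSUM over successive differences, with illustrative parameter names and data; it is a sketch of the general idea, not the exact formulation of [58]:

```python
def cusum_change_points(series, threshold, drift=0.0):
    """Two-sided CUSUM detector for abrupt shifts in a time series.

    Accumulates positive and negative deviations separately; an alarm
    is raised when either cumulative sum exceeds `threshold`. `drift`
    desensitises the detector to slow wander. Returns alarm indices.
    """
    g_pos, g_neg = 0.0, 0.0
    alarms = []
    prev = series[0]
    for i, x in enumerate(series[1:], start=1):
        diff = x - prev
        prev = x
        g_pos = max(0.0, g_pos + diff - drift)
        g_neg = max(0.0, g_neg - diff - drift)
        if g_pos > threshold or g_neg > threshold:
            alarms.append(i)
            g_pos, g_neg = 0.0, 0.0  # restart after each alarm
    return alarms

# A level series with an abrupt jump at index 5 triggers one alarm.
data = [5.0, 5.1, 4.9, 5.0, 5.1, 9.0, 9.1, 8.9, 9.0]
print(cusum_change_points(data, threshold=2.0))  # -> [5]
```

The threshold trades false alarms against detection delay; in practice it would be tuned against the noise level of the borehole measurements.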
Furthermore, descriptive data analysis techniques are utilized to analyze the hydrogeological patterns of time-series boreholes data acquired from Jeju National University (JNU), Republic of Korea. Moreover, different hydrological and time-series patterns are extracted from real boreholes data to evaluate and compare the proposed model with baseline ensemble and ML models. Therefore, to the best of the authors' knowledge, this is the first attempt to integrate boosting and bagging models into a robust E-GWLP model based on hidden hydrological characteristics for sustainable water management.

III. METHODS
This section presents a detailed methodology of the proposed E-GWLP model. The proposed E-GWLP model aims to utilize sophisticated and robust ML ensemble approaches to improve hydraulic resource management.

A. PROPOSED MODEL OVERVIEW
An overview of the proposed E-GWLP model is described here. Fig. 1 exhibits a block diagram providing a detailed overview of our proposed method and describing its functional flow, which consists of various steps. In step 1, the raw boreholes-log data is passed to the preprocessing module.
Step 2 is the preprocessing module, which preprocesses the raw data by removing irrelevant features, handling missing values, and applying label encoding, to increase the efficiency of the boreholes-log data. Next, in step 3, the preprocessed data is passed to the features engineering module to construct new features from the existing preprocessed feature set. Data analysis is an integral module in data mining for investigating the underlying characteristics of historical data; therefore, in step 4, the data analysis module performs different types of analysis, including time-series analysis, statistical analysis, etc. Feature selection is an important process for reducing a large feature space by eliminating the least contributing features without losing accuracy or efficiency; step 5 selects the most promising features from the base and derived features. In step 6, the data splitting module divides the reduced-feature data into training and testing samples.
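The preprocessing operations of step 2 (dropping irrelevant features, filling missing values, label encoding) can be sketched with pandas. The column names below are illustrative stand-ins, not the actual JNU schema:

```python
import pandas as pd

# Toy borehole-log rows; column names are illustrative, not the JNU schema.
df = pd.DataFrame({
    "borehole_id": ["B1", "B1", "B2", "B2"],
    "soil_color": ["Tan", "Light Brown", "Tan", None],
    "top_depth": [0.0, 3.5, 0.0, 4.0],
    "bottom_depth": [3.5, 8.0, 4.0, 9.5],
    "remarks": ["ok", "ok", "ok", "ok"],  # irrelevant free-text feature
})

# Drop features that do not help prediction.
df = df.drop(columns=["remarks"])

# Handle missing values (mode fill for a categorical attribute).
df["soil_color"] = df["soil_color"].fillna(df["soil_color"].mode()[0])

# Label-encode categorical attributes as integer codes.
df["soil_color_code"] = df["soil_color"].astype("category").cat.codes

print(df[["soil_color", "soil_color_code"]])
```

Mode filling and categorical codes are the simplest reasonable choices here; the paper does not specify its exact imputation and encoding scheme.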

B. PROPOSED ARCHITECTURE OF E-GWLP MODEL
This subsection depicts the main architecture of the E-GWLP model. Fig. 2 introduces the layered architecture of the proposed model for predicting GWL to improve hydraulic resource management for future groundwater extraction. The layered architecture consists of 5 layers. The first layer presents the time-series boreholes data acquired from JNU, Republic of Korea; the boreholes log dataset consists of attributes including borehole ID, altitude, soil color patterns, rock units, strata codes, etc. The second layer presents data preprocessing and analysis of the boreholes data. The acquired boreholes log data is not in a reliable format; therefore, cleaning is required to convert the unprocessed data into a meaningful form for data mining (DM), and data preprocessing is applied to remove irrelevant attributes and outliers from the acquired data. Data analysis then takes the preprocessed data as input to investigate trends in the historical boreholes log data; different hydrogeological and time-interval analyses are conducted to analyze the underlying characteristics of the preprocessed boreholes log data, which can be helpful for the future drilling process. In the third layer, difference mechanisms are developed based on the lag-1 difference and unsupervised difference algorithms to detect seasonality changes in the time-series observations. The fourth layer presents the proposed ensemble prediction model using level-0 and level-1 models based on stacking to predict GWL; one of the primary objectives of our work is to integrate boosting and bagging models using stacking to build an ensemble model for predicting GWL. Furthermore, different conventional ensemble and baseline ML models are also developed. Lastly, different prediction error metrics are implemented to measure the prediction error of the E-GWLP model.
The prediction error of the proposed E-GWLP model is also compared with state-of-the-art and traditional hybrid models to signify the importance of the proposed work.
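The prediction error metrics used throughout (MAE, MSE, RMSE, MAPE, R2) follow directly from their standard definitions; a NumPy sketch, with function and key names of our own choosing:

```python
import numpy as np

def prediction_errors(y_true, y_pred):
    """Compute MAE, MSE, RMSE, MAPE (%), and R2 from two arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mae = np.mean(np.abs(err))                      # mean absolute error
    mse = np.mean(err ** 2)                         # mean square error
    rmse = np.sqrt(mse)                             # root mean square error
    mape = np.mean(np.abs(err / y_true)) * 100.0    # percentage error
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}

# Perfect predictions give zero error and R2 = 1.
print(prediction_errors([1.0, 2.0, 4.0], [1.0, 2.0, 4.0]))
```

Note that MAPE is undefined when a true value is zero, which matters if GWL readings of 0 m occur in the data.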

C. FLOW DIAGRAM OF THE PROPOSED E-GWLP MODEL
In this subsection, the detailed flow of our proposed E-GWLP model is exhibited in Fig. 3. The functional flow of our proposed method consists of the following steps: collection of boreholes log data, preprocessing of the collected data, descriptive analysis of the boreholes log data, extraction of hydrogeological features, normalization of decision features, application of difference mechanisms, development of the ensemble model, and performance evaluation. The boreholes data contains 9,287 samples for boreholes in different regions of the Republic of Korea.
The dataset includes 12 input features: borehole log ID, altitude, geographic coordinates X and Y, starting (top) depth, ending (bottom) depth, thickness, standard Korean layer name, starting and ending drilling dates, and groundwater level. The acquired dataset contains irrelevant data and outliers; therefore, data preprocessing techniques are used to clean and filter out trivial features and improve the consistency of the dataset. Next, the preprocessed data are passed to the data analysis and features extraction modules. The preprocessed data are analyzed with different data analysis techniques to highlight the trends and patterns of the historical time-series boreholes data. Furthermore, different hydrogeological and time-interval features are computed from the preprocessed boreholes log data: days spent on each borehole drilling, total drilling depth, and soil color with maximum borehole depth, to name a few. Next, a data normalization technique is applied to scale feature values down to the uniform range [0, 1]. Correlation and difference mechanisms are applied to evaluate the linear relationships of the decision variables and identify changes in the time-series observations. In the next step, an ensemble model is developed by combining boosting and bagging models using stacking to predict GWL. The proposed ensemble model is formed by integrating two kinds of models: base and meta models. The base models are three boosting models: XGBoost, AdaBoost, and GB. The meta-model is an RF model, a bagging algorithm that learns from the base model predictions: the prediction outputs of the base models are fed to the meta-model as inputs. The stacking method is used as a combinator to combine the base and meta models to draw a conclusion. The prediction results of the proposed ensemble model are evaluated using different evaluation measures: MAE, MSE, RMSE, and MAPE, to name a few.
Furthermore, the prediction results of our E-GWLP model are compared with baseline approaches to signify the usefulness and relevance of the proposed research study.
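The level-0/level-1 wiring described above can be sketched with scikit-learn. Because the xgboost package may not be installed, a second GradientBoostingRegressor stands in for XGBoost here, and the data is synthetic; this is an illustration of the stacking structure, not the paper's exact configuration or hyperparameters:

```python
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)
X = rng.uniform(size=(300, 4))                   # stand-in hydrogeological features
y = 30 * X[:, 0] + 10 * X[:, 1] ** 2 + rng.normal(0, 0.5, 300)  # stand-in GWL

# Scale features to [0, 1], as described in the functional flow.
X = MinMaxScaler().fit_transform(X)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Level-0 boosting base models; a second GradientBoostingRegressor
# replaces XGBoost for portability.
base_models = [
    ("gb", GradientBoostingRegressor(random_state=0)),
    ("ada", AdaBoostRegressor(random_state=0)),
    ("gb2", GradientBoostingRegressor(n_estimators=200, random_state=0)),
]

# Level-1 meta-model: Random Forest as the bagging learner that
# combines the base models' out-of-fold predictions.
ensemble = StackingRegressor(
    estimators=base_models,
    final_estimator=RandomForestRegressor(random_state=0),
)
ensemble.fit(X_tr, y_tr)
r2 = ensemble.score(X_te, y_te)
print(round(r2, 3))
```

StackingRegressor trains the meta-model on cross-validated base predictions, which is what prevents the level-1 model from simply memorizing base-model overfit.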

IV. TIME-SERIES BOREHOLES DATA PRESENTATION
This section presents the boreholes log data provided by JNU, Republic of Korea. The considered boreholes log data consists of 9,287 samples covering 1,987 unique boreholes. The collected data includes features such as borehole log ID, geographic coordinates, starting depth of the thickness layer, ending depth of the thickness layer, rock unit, soil color patterns, groundwater level, etc. Groundwater level represents the depth under the earth's surface that is permeated with water. Soil color represents the color patterns of soil under the ground. A stratum layer is a layer of sedimentary rock formed under the ground surface. The land layer represents the rock unit under the ground surface; it can be classified as igneous or sedimentary rock. A detailed summary of the boreholes log data is presented in Table 1.

V. DATA PREPROCESSING, DESCRIPTIVE ANALYSIS AND FEATURES EXTRACTION
This section describes the collection of boreholes log data, the cleaning of boreholes log data, and the descriptive analysis used to investigate the underlying characteristics of the drilling process.

A. PREPROCESSING OF DRILLING DEPTH DATA
Data preprocessing is a vital and challenging task in DM to clean and prepare data for modeling. The data preprocessing model aims to reduce the dataset size, determine the relations between data attributes, normalize data for uniformity, and remove noise and outliers, to name a few. It also helps increase the consistency of the dataset and reduce computational and storage costs. Unclean data significantly affects data-driven methods and leads to poor results; therefore, the raw data must be cleaned to find outliers and missing-value attributes. In this study, several steps are carried out to convert the raw data into a reliable format.

B. DESCRIPTIVE ANALYSIS OF DRILLING DEPTH DATA
Data analysis is a systematic process of applying statistical and logical methods to unearth hidden characteristics of the prepared dataset; it aims to discover hidden patterns and useful information from a massive amount of data to draw conclusions. Therefore, descriptive data analysis techniques are applied to the preprocessed drilling depth data to track the historical characteristics of underground water. Different descriptive analyses are performed to discover hidden patterns and characteristics in the drilling depth data, which is essential for sustainable water resource management. Fig. 4 depicts the drilling data based on starting drilling depth frequency; starting depth frequency is plotted along the y-axis against borehole code on the x-axis. It can be observed that the starting drilling depth fluctuates between 0 and 70 meters: the minimum starting drilling depth is 0 meters, whereas the maximum starting drilling depth across the drilling locations is 70 meters.
Similarly, Fig. 5 examines the boreholes log data based on ending (bottom) drilling depth frequency. The ending drilling depth fluctuates between 0 and 75 meters across the drilling locations at time t; the x-axis represents the drilling locations, and the y-axis represents the drilling depth for location x at time t. A major rise in ending depth frequency can be seen for borehole codes between 0 and 2,000, while the remaining borehole codes, between 4,000 and 8,000 and above, show comparatively fewer fluctuations. Fig. 6 presents a comparative analysis of the starting and ending drilling depth frequencies, with drilling depth frequency plotted along the y-axis against borehole codes on the x-axis. The starting drilling depth frequency fluctuates between 0 and 70 meters, whereas the ending drilling depth frequency fluctuates between 0 and 75 meters. The decline in groundwater affects the drilling depth frequency, which is evident from the starting and ending drilling depths.
In Fig. 7, the average and maximum borehole depths are analyzed for each unique soil color pattern. The analysis shows that the average and maximum borehole depths vary due to the structure of the rock types. Among the listed soil colors, ''Tan'' has the largest average borehole depth at 20.58 meters, while ''Light Brown'' has the smallest at 15.32 meters. Similarly, the maximum-depth analysis shows that the soil color pattern ''Partridge'' has the maximum borehole depth of 74.28 meters, which indicates that the drilling process there is more difficult than for the other soil colors.
Similarly, Fig. 8 analyzes the boreholes data by land layer, according to the average and maximum borehole depths reached while drilling to the water levels. The analysis results reveal that the land layer ''Gyeongam'' has the maximum average borehole depth of 74.28 meters, and the ''Sedimentary'' layer has the minimum average borehole depth of 15.92 meters. Likewise, the landfill layer has the maximum borehole depth of 74.28 meters, which shows that the drilling process took a large amount of time to drill beneath the earth's surface to reach the water levels. Hence, drilling through ordinary and soft rock units is easier and more time-saving than through the other land layers to reach the GWL.

C. FEATURES EXTRACTION
Features extraction is a vital process for constructing new features from the existing data features; it also reduces dimensionality. Feature extraction techniques aim to enhance model accuracy, overcome overfitting issues, speed up model training, and reduce computational complexity. In this study, several new features are computed from the existing data attributes: total borehole drilling depth, days spent on each borehole drilling, core soil color, core stratum layer, and core land layer. Borehole depth is defined as the sum of the thicknesses (T) of the land layers for each borehole log. Thickness is determined by taking the difference between the top (starting) and bottom (ending) drilling depths of each land layer, as shown in equations 1 and 2.
The total borehole depth (TB_depth) is calculated as the sum of the thickness instances for each borehole log location i, as shown in equation 3. Fig. 9 analyzes the boreholes log data based on the drilling depth of each borehole log and the days spent drilling to reach the groundwater level. The analysis investigated the relationship between total borehole depth and drilling time for each unique borehole location. According to the results, drilling time is usually minimal even where borehole depth is maximal, which indicates that the drilling process to reach the GWL is comparatively easy in the selected regions. Next, a temporal feature is calculated to analyze the total number of days spent on each borehole to reach the GWL. The total number of days is defined as the count of thickness instances of the rock units for each drilling location. Equation 5 computes the time duration (TD) for each drilling location.
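The thickness, total-depth, and time-duration features described above can be sketched in pandas. The column names and sample values here are illustrative stand-ins, not the paper's actual schema:

```python
import pandas as pd

# Hypothetical borehole-log records; schema and values are assumptions.
logs = pd.DataFrame({
    "borehole_id":  ["BH1", "BH1", "BH1", "BH2", "BH2"],
    "top_depth":    [0.0, 4.0, 10.0, 0.0, 6.0],
    "bottom_depth": [4.0, 10.0, 18.0, 6.0, 15.0],
})

# Eqs. (1)-(2): layer thickness = bottom (ending) depth - top (starting) depth
logs["thickness"] = logs["bottom_depth"] - logs["top_depth"]

# Eq. (3): total borehole depth = sum of layer thicknesses per borehole
tb_depth = logs.groupby("borehole_id")["thickness"].sum()

# Eq. (5): time duration = count of thickness instances per borehole
td_days = logs.groupby("borehole_id")["thickness"].count()
```

Each grouped aggregate yields one value per unique borehole location, matching the per-location definitions in the text.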

TD_days = Count(Thickness Instances)    (5)
Furthermore, box-plot analysis is widely used to report the five-number summary: the minimum, the lower quartile, the median, the upper quartile, and the maximum. Fig. 10 shows a box-plot analysis of GWL by time-interval group (in days). The relationship between TD_days and GWL varies because of the different structures of the rock units. For example, where 5 to 6 days are spent, the attained GWL ranges between 0.35 m and 22.2 m. Data outliers, i.e., points distant from the bulk of the data samples, are also identified; points visualized outside the box-plot whiskers are defined as outliers. In the case of 11 to 13 days, GWL varies between 2.8 m and 7.09 m. Moreover, harder rock layers ultimately reduce the GWL reached and increase the time spent.

Fig. 11 analyzes the drilling data based on TB_depth and GWL according to the days spent at each drilling location. The relationship among these three attributes varies with the rock-layer structure. The analysis shows that the days spent per drilling location range from 1 to 13 to reach the GWL. Similarly, GWL fluctuates between 0.17 m and 45.5 m for water extraction in the study area, and the drilling depth of the borehole logs reaches up to 74.2 m. On average, 5 days are spent at each drilling location to access the GWL. Furthermore, the drilling depth of the boreholes and the GWL vary because of the different rock and soil patterns, which also influences the time each borehole takes to drill.
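The five-number summary and whisker-based outlier rule behind the box-plot analysis can be reproduced with NumPy. The GWL values below are made-up illustrations, and the 1.5 × IQR whisker rule is the common Tukey convention, which the paper does not state explicitly:

```python
import numpy as np

# Illustrative GWL samples (meters) for one time-interval group
gwl = np.array([0.35, 2.8, 5.1, 7.09, 9.4, 12.0, 22.2])

# Five-number summary: min, lower quartile, median, upper quartile, max
q1, med, q3 = np.percentile(gwl, [25, 50, 75])
five_num = (gwl.min(), q1, med, q3, gwl.max())

# Tukey whiskers (assumed convention): points beyond 1.5*IQR are outliers
iqr = q3 - q1
lower_whisker = q1 - 1.5 * iqr
upper_whisker = q3 + 1.5 * iqr
outliers = gwl[(gwl < lower_whisker) | (gwl > upper_whisker)]
```

Points outside the whiskers correspond to the samples plotted outside the box-plot whiskers in Fig. 10.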
The next feature is the core soil color, which is extracted based on the maximum total borehole depth. Drilling each borehole log passes through different soil color patterns. Algorithm 1 presents the detailed flow for extracting the core soil color of each borehole, taking the boreholes data and the set of unique boreholes as input. Since the drilling process for each borehole traverses several soil colors, unique soil colors are first extracted for each borehole. Second, the total drilling depth is calculated for each unique soil color. Finally, the soil color with the maximum drilling depth is denoted as the core soil pattern of the i-th drilling location. The extraction flow of the core land layer and core stratum layer for each borehole is given in Algorithm 2. The drilling process of a borehole passes through several land layers and stratum layers to reach the GWL; therefore, the core land and stratum layers need to be identified from the maximum borehole depths. Hence, the core land layer is defined as the land layer with the maximum drilling depth for the i-th borehole, and the core stratum layer is defined as the stratum layer with the maximum drilling frequency for the i-th borehole. Accordingly, for each unique borehole, a drilling frequency is computed for each unique land layer, and the land layer with the maximum drilling frequency is selected as the core land layer. Similarly, a drilling frequency is computed across the stratum layers to find the core stratum layer with the maximum borehole depth.
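A minimal sketch of the Algorithm 1 idea — per borehole, pick the soil color accounting for the greatest drilled depth — assuming a hypothetical column layout:

```python
import pandas as pd

# Illustrative borehole-log records; the schema is assumed, not the paper's.
logs = pd.DataFrame({
    "borehole_id": ["BH1", "BH1", "BH1", "BH1"],
    "soil_color":  ["Tan", "Tan", "Light Brown", "Partridge"],
    "thickness":   [5.0, 7.0, 3.0, 6.0],
})

# Total drilled depth per (borehole, soil color); the color with the
# maximum depth becomes that borehole's core soil color.
core_color = (logs.groupby(["borehole_id", "soil_color"], as_index=False)["thickness"]
                  .sum()
                  .sort_values("thickness", ascending=False)
                  .drop_duplicates("borehole_id")
                  .set_index("borehole_id")["soil_color"])
```

The same groupby-and-take-maximum pattern applies to the core land layer (by depth) and core stratum layer (by frequency) of Algorithm 2.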

D. FEATURES NORMALIZATION AND SELECTION
This subsection describes feature normalization and selection. Data normalization is an important process to scale feature values down into a specified range, for instance [0, 1]. It is an effective way to transform data onto a common scale, avoiding biases among data features and improving model learning; normalization is required here because the ranges of the feature values differ. Different feature normalization techniques exist, for instance min-max normalization, z-score, and clipping. This research study uses min-max normalization to scale the feature values into a common range so that each feature is considered equally. The next step is to select the most promising features to reduce the high dimensionality of the dataset and improve model performance without losing information. Commonly used feature selection techniques include correlation analysis, information gain, and principal component analysis, to name a few. This work uses correlation analysis as a benchmark technique to compute the correlation index of every input feature with respect to the target feature and selects those features with a correlation index of 0.30 or greater. The correlation heatmap in Fig. 12 analyzes the linear relationship between the input features and the target feature. It can be observed that the altitude and temporal difference features are negatively correlated with the target feature; therefore, both features are removed from the feature space to reduce computation and storage cost. Fig. 13 presents a correlation heatmap of the selected features to analyze the linear relationship between the independent variables (soil color, total depth, stratum layer, time taken, and land layer) and the dependent variable (groundwater level).
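The two steps — min-max scaling and correlation-threshold selection — can be sketched in pandas. The toy table below mirrors the paper's feature names but uses made-up values:

```python
import pandas as pd

# Toy feature table; values are illustrative, not the paper's data.
df = pd.DataFrame({
    "total_depth": [10.0, 20.0, 30.0, 40.0],
    "altitude":    [120.0, 110.0, 100.0, 90.0],
    "gwl":         [5.0, 9.0, 14.0, 20.0],    # target feature
})

# Min-max normalization of the input features into [0, 1]
features = df.drop(columns="gwl")
normalized = (features - features.min()) / (features.max() - features.min())

# Correlation-based selection: keep inputs whose Pearson correlation with
# the target is 0.30 or greater; negatively correlated features (such as
# altitude here) are dropped, as in the paper.
corr = df.corr()["gwl"].drop("gwl")
selected = corr[corr >= 0.30].index.tolist()
```

After scaling, every retained column spans exactly [0, 1], which keeps features with large raw ranges (e.g., depth in meters) from dominating model learning.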

VI. PROPOSED DIFFERENCE MECHANISM
This section presents the difference model and its implementation on the drilling dataset, including the following functions: an input function, a comparison function, and a search function. Furthermore, it suggests a logical/mathematical model for enhancing data search results. Time-series data can be transformed using a technique called differencing to eliminate temporal dependence. Before modelling time-series data, trend and seasonality factors might need to be removed; to achieve this, differencing is utilized as an effective data transformation method for constructing stationary time series. For statistical modelling techniques, a time series should be stationary for ease of modelling, as non-stationary time-series data possess trends and seasonality that vary with time. Likewise, statistical measures such as the mean and variance change over time, which leads to drift in the concept the model is trying to learn.
Various differencing methods are utilized to transform a time-series dataset. Differencing is an effective way to eliminate the temporal dependence that exists in a time series, particularly the components related to trend and seasonality. Moreover, it can remove variations in the series by stabilizing its mean, which ultimately lessens the impact of trends in the data. It works by computing the difference between the current and previous data sample values. Difference measures also include methods that compare two time-series objects and output a value encoding how dissimilar they are; this distance is a quantitative measurement of dissimilarity, specifying how far two instances are from each other. Fig. 14 presents the difference between consecutive starting borehole depth samples using the lag-1 difference; the average starting borehole depth rate varies between 3 and 25 meters. Similarly, Fig. 15 presents the average temporal difference of the ending borehole depth for each borehole location. The ending depth of a borehole location indicates the bottom of the rock unit in the thickness computation; the difference between the top and bottom drilling depths is defined as the thickness, which fluctuates due to different hydrogeological patterns and climatic changes. The average ending borehole depth rate varies between 3 and 45 meters.
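The lag-1 difference used for Figs. 14 and 15 is a one-liner in pandas; the depth values below are hypothetical:

```python
import pandas as pd

# Hypothetical starting-depth series for consecutive borehole samples
start_depth = pd.Series([3.0, 8.0, 15.0, 14.0, 25.0])

# Lag-1 differencing: d_t = x_t - x_{t-1}. The first value has no
# predecessor and becomes NaN. This removes a linear trend and
# stabilizes the mean of the series.
diff1 = start_depth.diff()
```

For a series with seasonality of period m, `diff(m)` would subtract the value one season back instead of the immediately preceding one.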

VII. IMPLEMENTATION AND EXPERIMENTAL SETUP
In this section, the experimental setup of the proposed E-GWLP model is presented. In this work, we used Python as the core language to implement the model and conduct a series of experiments, utilizing core libraries such as Sklearn, Seaborn, Matplotlib, Pandas, and Numpy, to name a few. Furthermore, MS Excel and MySQL are used to store the raw and processed boreholes data. Moreover, an Intel Core i9-10900 CPU with 32 GB RAM is used to perform the experiments. Table 2 summarizes the experimental setup of the E-GWLP model. Figure 16 depicts the implementation process of the proposed E-GWLP model, which uses Python as the core backend programming language to perform the different experiments, including data and predictive analysis. The sklearn library provides various utilities, such as the transformation of categorical values into continuous values, the division of the prepared boreholes data into training and testing sample sets, and the training and testing of the ML-based regression models. A min-max scaler maps the feature values into the range [0, 1] to overcome the learning issues of ML models. The prepared dataset is divided into training and testing samples with a 70-30 split ratio: 70% of the borehole samples are used for building the ML models, and the remaining 30% are utilized for evaluation. Furthermore, different evaluation measurements are considered to evaluate the error of each regression model.
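The level-0/level-1 stacking architecture and the 70-30 split can be sketched with scikit-learn's `StackingRegressor` on synthetic data. XGBoost belongs to the paper's actual base-model set but is replaced here by sklearn's built-in boosting regressors to keep the sketch dependency-light, so this is an illustration of the technique, not the paper's exact configuration:

```python
from sklearn.datasets import make_regression  # synthetic stand-in data
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the prepared borehole features
X, y = make_regression(n_samples=200, n_features=5, noise=0.1, random_state=0)
X = MinMaxScaler().fit_transform(X)  # scale features into [0, 1]

# 70-30 train/test split, as in the paper
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)

# Level-0 boosting base models; level-1 RF (bagging) meta-model that
# combines the base models' predictions
ensemble = StackingRegressor(
    estimators=[("gb", GradientBoostingRegressor(random_state=0)),
                ("ada", AdaBoostRegressor(random_state=0))],
    final_estimator=RandomForestRegressor(random_state=0),
)
ensemble.fit(X_tr, y_tr)
r2 = ensemble.score(X_te, y_te)
```

`StackingRegressor` trains the meta-model on out-of-fold predictions of the base models, which is what distinguishes stacking from the aggregated-mean hybrids discussed later.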

VIII. IMPLEMENTATION RESULTS AND ANALYSIS
This section provides a detailed overview of the results yielded by the experiments, along with a detailed performance analysis for GWL prediction. Two types of experimental result analyses are performed. First, the prediction results of our E-GWLP model are compared with traditional ensemble models to highlight the significance of the proposed work. Second, the experimental results of our model are compared with baseline regression models. Fig. 17 depicts a comparison of the implemented regression models for predicting GWL by analyzing the observed and estimated GWL; the analysis verifies that the proposed ensemble-based framework outperformed the conventional methods. Fig. 17a presents the actual versus predicted GWL of the CatBoost model; the gap between the actual and forecasted values is justifiable compared with AdaBoost and GB. Similarly, Fig. 17b depicts the prediction error of the AdaBoost model, where the gap between actual and forecasted values is comparatively larger than those achieved by the CatBoost and GB models. Fig. 17c showcases a comparison of the actual and predicted values of the GB model, whose prediction error is clearly higher than that of the CatBoost model. Furthermore, Fig. 17d makes it evident that the prediction error of the XGBoost model is low compared with the CatBoost, AdaBoost, and GB models. Fig. 17e shows a comparative analysis of the actual and predicted GWL acquired by RF, which produced a slightly higher error than the XGBoost and CatBoost models. Finally, Fig. 17f visualizes the actual and predicted GWL of the proposed ensemble model; the prediction error of the proposed ensemble model is clearly lower than that of the counterpart solutions, including GB, CatBoost, and AdaBoost.
This verifies the proposition of the study that the proposed ensemble prediction model yields superior performance by achieving a low prediction error and can be considered a sustainable solution for enhancing future borehole efficiency and reservoir engineering. Furthermore, Fig. 18 visualizes the actual and predicted values achieved by the proposed ensemble model alongside the baseline regression models. Fig. 18a shows that the conventional ANN model produced a relatively high prediction error compared with the proposed ensemble model. Similarly, Fig. 18b presents the prediction results of the baseline SVR model; it is evident from the comparison that our model achieved a lower error percentage than the ANN and LR. Moreover, the LR model also produced a high prediction error compared with the ANN and SVR models, as shown in Fig. 18c. Similarly, Figs. 18d and 18e analyze the prediction error on unseen data samples using the L1 and L2 models; their prediction errors are higher than those of the ANN and SVR models. The prediction errors of the conventional models on unseen samples are significantly high, so these models cannot be considered for predicting GWL for future boreholes. The comparative review reveals that the prediction results of the conventional statistical models are not acceptable for sustainable water resource management. Hence, it can be concluded that our proposed ensemble model achieved satisfactory results and outperformed the conventional regression models, bringing substantial improvements to GWL prediction performance. The experimental findings prove that the proposed ensemble model is suitable for predicting GWL to enhance the efficiency of future water boreholes.
Furthermore, Fig. 19 shows a comparison of the proposed framework with hybrid prediction models. The analysis reveals that the prediction errors of the KNN-RF and XGB-RF models are slightly higher than that of our proposed E-GWLP model. Hence, our proposed ensemble model produced more accurate results than the aggregated-mean-based hybrid prediction framework.
Feature importance is an important process to investigate the significance of the prepared data features [61]. It refers to assigning importance scores to the feature variables based on their usefulness for predicting the output variable. Feature importance can be used for dimensionality reduction by selecting only the promising features from the given feature space; faster training, complexity reduction, and easier interpretation are some of its advantages. Furthermore, it is an efficient way to find the contribution of each feature during the model learning phase and to eliminate the least-contributing features from the feature space, producing a generalized and accurate decision model. Therefore, the most-contributing features in the prepared dataset need to be identified. Figure 20 compares feature importance across the conventional ensemble models. XGBoost indicates that the temporal difference feature contributes the most among the proposed features, whereas the AdaBoost, RF, and GB models assign the highest score to the altitude feature, meaning that altitude contributed more than the other listed features.
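Tree ensembles in scikit-learn expose these scores through the `feature_importances_` attribute. The sketch below uses synthetic data with a deliberately uninformative column; the feature names are illustrative, not the paper's exact feature set:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 300

# Two informative features and one pure-noise feature (hypothetical names)
total_depth = rng.uniform(5, 75, n)
time_taken = rng.uniform(1, 13, n)
noise = rng.normal(size=n)
y = 0.5 * total_depth + 2.0 * time_taken + rng.normal(scale=0.5, size=n)

X = np.column_stack([total_depth, time_taken, noise])
model = RandomForestRegressor(random_state=0).fit(X, y)

# Impurity-based importance scores; they sum to 1 across features
importances = dict(zip(["total_depth", "time_taken", "noise"],
                       model.feature_importances_))
```

The uninformative column receives a near-zero score, which is the signal used to prune weakly contributing features from the feature space.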
The proposed study employed various statistical formulations for measuring the forecasting error of the conventional ensemble models and the baseline ML models. The performance analysis metrics include the widely used MAE, MSE, RMSE, normalized RMSE (NRMSE), MAPE, and R2 score. MAE and MSE are common performance evaluation measures for continuous variables [53], [62]. MAE measures the difference between the actual and estimated values by averaging the absolute differences over the entire dataset, providing the average error magnitude; it is formulated as shown in equation 6. MSE measures the differences between the estimated and actual values (residuals) and yields a value that depicts how closely the fitted line lies to the data points; each difference is squared so that negative values become positive. The smaller the mean square error, the closer the fit and the better the model performance. It is obtained by averaging the squared differences, as shown in equation 7.
RMSE is defined as the square root of the MSE. It measures the average vertical distance from the fitted line to the data points. The formula for calculating RMSE is provided in equation 8.
The R2 score, or coefficient of determination, is a statistical measure involving the observed and predicted values that evaluates how well the regression model performs. An R2 score approaching 1 indicates good performance of the regression model. The R2 score is computed using equation 9.
MAPE is another statistical measure of a regression model's accuracy in terms of the differences between the observed and predicted values. It is defined as the average of the absolute percentage errors of the regression model; a low MAPE indicates a high accuracy of the prediction model. Equation 10 gives the basic formula for MAPE.
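All five metrics can be computed directly from their definitions (equations 6-10); the toy values below are illustrative:

```python
import numpy as np

y_true = np.array([2.0, 4.0, 6.0, 8.0])   # observed GWL (toy values)
y_pred = np.array([2.5, 3.5, 6.5, 7.5])   # estimated GWL (toy values)

mae = np.mean(np.abs(y_true - y_pred))                       # Eq. (6)
mse = np.mean((y_true - y_pred) ** 2)                        # Eq. (7)
rmse = np.sqrt(mse)                                          # Eq. (8)
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot                                   # Eq. (9)
mape = 100.0 * np.mean(np.abs((y_true - y_pred) / y_true))   # Eq. (10)
```

These closed-form definitions match what `sklearn.metrics` computes, but writing them out makes the term-by-term correspondence with equations 6-10 explicit.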

MAPE = (100% / n) Σ_{k=1}^{n} |(y_observed − ŷ_estimated) / y_observed|    (10)

Table 3 presents the performance evaluation of the proposed model, along with a comparative analysis with the counterpart conventional learning models, including CatBoost, AdaBoost, GB, XGBoost, and RF. Furthermore, the proposed model's performance is compared with the developed integrated models KNN-RF and XGB-RF. The experimental findings make it evident that the validation and testing performance of the proposed model is superior to all the other standalone and ensemble models. The validation performance analysis also proves the model's strength, as it achieves a lower MAPE value and a higher R2 score than the baseline models. In the validation analysis, our proposed ensemble prediction model attained a MAPE of 13.473 and an R2 score of 0.945. Similarly, the testing results proved that the ensemble prediction framework proposed in this study produces accurate results compared with the counterpart solutions: the testing performance of the proposed model reported a MAPE of 12.658 and an R2 score of 0.976. Furthermore, the MAE, MSE, and RMSE produced by the proposed solution are 0.340, 0.564, and 0.751, respectively. These findings prove the efficiency and robustness of our ensemble model: the forecasting error on unseen samples using the proposed E-GWLP model is significantly lower than that of the conventional bagging and boosting models. Hence, based on the performance analysis results, our E-GWLP model performs significantly better at predicting GWL than the conventional models and state-of-the-art techniques. Furthermore, Fig. 22 and Fig. 23 present an evaluation analysis involving the proposed solution and the conventional regression-based solution approaches.
The MAE, MSE, and NRMSE error metrics are considered to analyze the prediction error of the proposed ensemble and the traditional regression algorithms. The MAE, RMSE, and NRMSE values of the E-GWLP approach are 0.340, 0.751, and 0.018, respectively. The analysis revealed that the conventional ANN and linear regression (LR) models produce higher error rates than the baseline models, which include SVR, lasso (L1), and ridge (L2). These results indicate how well our proposed E-GWLP model generalizes to the data and produces accurate prediction results.
Moreover, Table 5 shows a comparative analysis of our E-GWLP model and existing state-of-the-art models, taking several important parameters into account. The comparative analysis indicates that the existing model used a traditional approach, combining KNN and RF through an aggregated mean to form an ensemble model, whereas our proposed model is developed using the stacking technique to forecast GWL. Furthermore, the existing model used temperature, precipitation, and solar radiation as input features to predict GWL; in contrast, our proposed ensemble model uses hydrogeological and time-interval features to forecast GWL and improve hydraulic management. It can also be observed that the baseline model used a sliding-window-based approach to validate the trained models, whereas our work used the k-fold validation method to avoid overfitting. Moreover, our proposed model produced an R2 score of 0.976, whereas the R2 score of the existing model is 0.939. Hence, our proposed E-GWLP model is a more reliable solution than the existing prediction model for effectively improving hydraulic resource management.
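The k-fold validation step can be sketched with scikit-learn; k = 5 is an assumption here, since the fold count is not stated above, and the data are synthetic:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the prepared borehole features
rng = np.random.default_rng(0)
X = rng.uniform(size=(120, 4))
y = X @ np.array([1.0, 2.0, 0.5, 1.5]) + rng.normal(scale=0.05, size=120)

# k-fold cross-validation: every sample is held out exactly once,
# giving a more robust estimate than a single train/test split
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(GradientBoostingRegressor(random_state=0), X, y,
                         cv=cv, scoring="r2")
mean_r2 = scores.mean()
```

Because each fold serves as the validation set once, k-fold scores expose overfitting that a single fixed split (or a sliding window chosen by the baseline work) might hide.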

IX. CONCLUSION
Groundwater level prediction has gained high significance due to variations in hydrogeological properties. The proposed ensemble prediction model was presented as an integrated prediction model based on boosting and bagging models, using boreholes-log data to predict GWL for sustainable water resource planning and management. The proposed research study consists of two core modules: a data analytics module and a predictive analytics module. The data analytics module processes and investigates the boreholes data to discover hidden hydrogeological characteristics and improve the efficiency of future boreholes. Therefore, different data analysis techniques were employed, such as statistical and time-series analyses of the borehole data based on soil colors, land layers, and stratum layers, to name a few. Differencing and correlation mechanisms were also utilized to find the differences between consecutive borehole depths and to analyze the linear relationships among them. Furthermore, different hydrogeological and time-interval features were extracted from the prepared boreholes log data. Secondly, the predictive analytics module develops an ensemble prediction model by integrating multiple boosting and bagging models using the extracted hydrogeological and temporal features to predict GWL. The ultimate goal of the proposed ensemble prediction model was to predict GWL in order to facilitate drilling management for sustainable water resource management. Furthermore, the prediction errors of the implemented models were evaluated using different error metrics. The MAE, MSE, RMSE, and NRMSE values of the proposed E-GWLP model are 0.340, 0.564, 0.751, and 0.018, respectively, which indicates that our E-GWLP model predicts GWL more accurately than the conventional ensemble and baseline regression models.
The prediction error of the proposed ensemble model in terms of MAPE on unseen samples is 12.658, which signifies that the E-GWLP model performed quite well compared with the baseline models. In contrast, the MAPE of KNN-RF and XGB-RF is 34.048 and 28.444, respectively, which indicates that the traditional hybrid models produce relatively high prediction errors compared with our proposed model. Furthermore, the evaluation results of the E-GWLP model were compared with six conventional ML models: ANN, SVR, DT, LR, L1, and L2. The analysis of the traditional regression models shows that the LR model performed the worst among the baselines. The experimental results revealed that the proposed E-GWLP model accurately predicts GWL and outperformed the conventional regression models. These results will be used for the planning and management of sustainable water resources, as well as to improve reservoir engineering and the efficiency of future boreholes.