Extremely Randomized Trees Regressor Scheme for Mobile Network Coverage Prediction and REM Construction

In mobile communications network planning (and designing any radio system), coverage prediction helps network operators optimize cellular networks to improve customer experience. Accordingly, several path-loss models have been proposed that depend on many conditions, such as suitable selection of the terrain for each model, the height of the receiver and transmitter above ground and the distance between them, and the presence of obstacles. This may increase the prediction error between actual and estimated values, which change according to the propagation model selected. To overcome these problems, we propose a novel approach to mobile coverage prediction based on an extremely randomized trees regressor (ERTR) algorithm. In addition, we construct a radio environment map (REM) over a Google Earth digital map to improve visualization of the results and to easily detect coverage holes and traffic hotspots. For this purpose, we utilize a dataset with real measurements collected from Victoria Island and Ikoyi in Lagos, Nigeria. For performance evaluation, we use k-fold cross-validation based on four error metrics: relative error, root mean squared error, mean absolute error, and ${R^{2}}$ score. The proposed ERTR scheme achieves the best performance in terms of accuracy and computational load in predicting the reference signal received power and the received signal strength indicator value. We prove this with extensive simulation analysis and by comparing the error metrics of the proposed ERTR approach with an existing method widely used to perform coverage prediction, called ordinary kriging. We also compared seven machine learning regression algorithms, namely, random forest, a bagging regressor, support vector regression, k-nearest neighbors, a deep neural network, Gaussian process regression, and the decision tree.


I. INTRODUCTION
Currently, mobile communications (MC) provides a flexible infrastructure subject to the challenges of increasing demand for mobile data. For instance, fifth-generation (5G) technology is capable of accessing and sharing information in scenarios with high data rates and extremely low latency, in which the transmission environment effects increase the vulnerabilities of the signal itself, especially in 5G millimeter wave networks [1], [2]. As a result, more antennas must be installed closer to user nodes, exceeding the number of antennas needed [1], [3], [4], [5]. Accordingly, coverage prediction plays a key role in the resource management of MC, which entails better network planning, design, and implementation, plus optimization improvements. In addition, a radio environment map (REM) is considered by regulatory agencies as a helpful tool for informed decision-making, and by network operators as a means to ease the detection of coverage holes and traffic hotspots.
The associate editor coordinating the review of this manuscript and approving it for publication was Aasia Khanum.
Overall, several path-loss models have been proposed that depend on many conditions, such as suitable terrain selection for each model, the height of the receiver and transmitter above the ground and the distance between them, the presence of obstacles, and so on [6]. These factors may increase the prediction error between actual and estimated values, which varies depending on the propagation model selected [7]. For instance, in [8], the authors proposed a propagation model called COST-231-Walfisch-Ikegami that utilized a geographic information system tool for field strength prediction in cellular mobile communications. Although the authors highlighted the benefits of geographic information system tools for analyzing spatial databases and generating essential spatial parameters for field strength prediction, the proposed COST-231-Walfisch-Ikegami model is mainly useful for isotropic antennas. In the literature on REM construction, ordinary kriging (OK) has been widely used as a spatial interpolation technique based on geostatistics [1], [2], [9], [10]. OK estimates unknown data points according to the spatial correlation between measured data and the relative positional relationships between all sample points [2]. For instance, in [1], the authors constructed a REM for an indoor propagation environment based on interpolation methods. The results showed that OK outperformed baseline schemes such as inverse distance weight [3] and k-nearest neighbors (KNN) in terms of root mean squared error (RMSE) and correlation coefficients. Although OK can achieve high accuracy, its main disadvantage is computational cost, which rises steeply with the number of measurement points [3], [10] (cubically, according to the analysis in Section IV-B). Moreover, a heuristic-based approach has been proposed in [11] for coverage prediction for indoor environments based on the indoor dominant path model. Since it still relies on a path loss model, its extension to outdoor scenarios can be difficult to implement.
Although the MC wireless transmission environment is complex, conventional mobile network planning techniques based on propagation models are inflexible and are subject to specifications such as antenna height, frequency, and environmental conditions [6]. Therefore, in recent breakthroughs, machine learning (ML)-based schemes have emerged as innovative prediction techniques capable of dealing with mobile network operational complexities, and they can provide high accuracy [12], [13]. For instance, in [14], the authors made path loss predictions in an urban environment in Beijing, China, by applying an artificial neural network (ANN), support vector regression (SVR), and random forest (RF) models. The reported RMSE ranged between 4 dB and 5 dB. Similarly, in [15], the authors utilized SVR and RF to predict the path loss of a 5G network in Lisbon, Portugal. RMSE was evaluated using 10-fold cross-validation, and the obtained results varied between 6 dB and 7 dB. Moreover, an ANN and Gaussian process regression (GPR) were applied in suburban environments in South Korea, giving RMSE values between 8 dB and 9 dB [16]. On the other hand, ML models based on an ANN, RF, and SVR were applied in rural environments in Greece to make path-loss-based predictions, achieving an average RMSE of 4.2 dB [17]. In [18], the authors compared the coverage prediction performance between ANN schemes, multi-layer perceptron (MLP) with two hidden layers, and KNN for cellular networks based on the signal to interference ratio metric. The results showed that ANN with Gaussian kernels and the MLP technique obtained the best performance. To the best of our knowledge, none of the research described above considered an extremely randomized trees regressor (ERTR) for coverage prediction or REM construction.
In recent research, reference signal received power (RSRP) was considered the target label in MC since the RSRP parameter represents the network signal level at the user node location [19] in fourth-generation (4G) Long Term Evolution (LTE) and 5G New Radio (NR) networks. In [20], the authors applied an RF model to predict RSRP in multiple environments located in China. The results showed an RMSE of 6.11 dB under 10-fold cross-validation. Meanwhile, in [21], the authors analyzed several ML models such as linear regression (LR), the ANN, SVR, GPR, regression trees (RT), and RF. For this purpose, the authors used 18,048 samples from 4G LTE collected in Putrajaya, Malaysia. The authors stated that, according to 10-fold cross-validation, GPR achieved the best performance at 5.64 dB, followed by RF at 6.18 dB.
Motivated by the benefits provided by ensemble learning techniques, which obtain high accuracy for indoor [22] and outdoor [7], [21] environments regardless of propagation models, in this paper we propose a novel ML regression approach based on an ensemble learning algorithm (namely ERTR) to perform coverage prediction and design the REM for MC. Our goal is to predict RSRP and the received signal strength indicator (RSSI) values in a dense urban area located on Victoria Island, Lagos, Nigeria [23]. In addition, we utilized 5-fold cross-validation to evaluate the performance of the proposed ERTR approach, the baseline ML models, and OK, by comparing different error metrics. This paper opens the door to constructing ERTR-based REM designs for MC environments that can be extended for coverage analysis in various outdoor and indoor propagation scenarios. It is worth highlighting that this is the first work that investigates ERTR for coverage prediction in MC according to RSRP and RSSI values. The main contributions of this paper can be summarized as follows.
• First, a novel, ensemble learning approach is proposed, called ERTR, for coverage prediction of MC systems by utilizing RSSI values, RSRP, and global positioning system (GPS) coordinates. For this purpose, we utilize a dataset of actual measurements collected from Victoria Island and Ikoyi in Lagos, Nigeria [23].
• Second, we construct the REM by using MATLAB to improve the visualization of coverage prediction. For this purpose, we created a 100 × 100 grid of geographic data points in the area of interest to plot the results over a 2D map and a Google Earth digital map.
• Third, in addition to the proposed scheme, we assess the performance of seven ML regression algorithms: RF, a bagging regressor, SVR, KNN, a deep neural network (DNN), GPR, and the decision tree (DT). Additionally, we include a widely used benchmark algorithm called OK for coverage prediction. To compare the proposed ERTR algorithm with the baseline schemes, a 5-fold cross-validation technique is employed, measuring the relative error, mean absolute error (MAE), root mean square error (RMSE), and coefficient of determination ($R^{2}$ score). Through extensive simulations, we validate that the proposed ERTR algorithm outperforms the baseline schemes, offering the highest accuracy while maintaining a low computational load.
• Fourth, to validate the superiority of the proposed ERTR in terms of complexity, we provide a computational complexity analysis of the proposed ERTR and the baseline algorithms: RF, Bagging, and OK.
The rest of the paper is structured as follows. The dataset is described in Section II. In Section III, we present the coverage prediction methodology, including an overview, a model evaluation, and the ERTR scheme. In Section IV, we provide the numerical results, the computational complexity analysis, and the graphical results. Finally, conclusions are described in Section V.
VOLUME 11, 2023

II. DATASET DESCRIPTION
In this paper, we utilize the publicly available dataset described in [23], which is composed of key performance indicator parameters such as RSRP, RSSI, logging time, and GPS coordinates. The dataset contains 42,498 instances of each parameter. The measurement campaign was carried out in dense urban environments around Victoria Island and Ikoyi in Lagos, Nigeria, as shown in Figure 1. The dataset was collected with a 4G LTE test modem mounted on a computer housed in a test vehicle driving at 30 km/h. Note that user equipment periodically measures RSRP to perform cell selection/reselection and handover processes in 4G LTE, as well as in 5G NR networks [21]. Therefore, the proposed segmental-approach-based prediction model can easily be adapted to 5G NR network parameters in the future. Formally, we denote the features and labels of the dataset with $D = (m_i, r_i)$, where $m_i \in \mathbb{R}^{2}$, $i \in \{1, 2, \ldots, M\}$, in which $M$ is the number of instances and $m_i$ contains the longitude and latitude coordinates. Meanwhile, $r_i \in \mathbb{R}$ represents the target label given by the RSRP value, expressed in decibel-milliwatts (dBm). In this paper, we also consider the analysis of the RSSI as a target value. Similar to RSRP, RSSI is the signal strength received by the user equipment, but RSSI measurements include the main signals, co-channel non-serving signals, adjacent-channel interference, and thermal noise on the specified frequency band [24]. Therefore, the features of the dataset correspond to the longitude and latitude coordinates, and the target values are the RSRP and RSSI values. By measuring the RSRP and RSSI in several positions determined by longitude and latitude coordinates, it is possible to estimate the signal strength and gain insights into the signal's propagation characteristics. REMs are constructed by measuring the RSRP and RSSI at various points in a given area; these measurements are then used as the dataset to build the proposed ERTR model. This model can estimate the RSRP and RSSI at any location within the coverage area.

III. COVERAGE PREDICTION METHODOLOGY

A. OVERVIEW
Our objective is to construct an ML regression framework to predict the outdoor propagation coverage of MC networks from measured data points. First, to design the deployment model, we start by developing the training stage, which is programmed in Python. In this sense, the input dataset is divided into two subsets: the training dataset and the validation (or testing) dataset for hyperparameter tuning of the model. Hence, we adjusted the parameters according to the best results obtained in the evaluation procedure based on 5-fold cross-validation of relative error, MAE, RMSE, and $R^{2}$ score metrics. Accordingly, the trained model was ready to be used in the deployment stage. Next, we meshed the target area by creating a grid of 100 × 100 points based on the minimum and maximum geographic coordinates of the dataset so that the grid covered the entire area of interest. After that, we performed feature normalization based on Z-score and then defined the ML regression-based method to be applied to the prediction task. Consequently, the coverage prediction given by the $r_i$ values was obtained for each of the points on the grid by utilizing the model previously trained with the points of the dataset. Afterward, the predicted values of the ML-based framework, along with the longitude and latitude coordinate points of the grid, were exported to a MATLAB file. Then, we loaded that file into MATLAB to build the coverage map. For this purpose, we converted the GPS location measurements into the Universal Transverse Mercator (UTM) coordinate system to build a mesh where the new points were predicted. As a result, the REM with the predicted data points was obtained as a pseudocolor plot, which is drawn as a 2D map by applying the pcolor function. To further improve the visualization, the REM was plotted over a map from Google Earth by using the ge_imagesc function. Finally, we included a color bar to identify the relationship between our data and the colors displayed in every chart. Figure 2 illustrates the aforementioned procedure.
Figure 3 explains one iteration of the 5-fold cross-validation [25] used to evaluate the ML-based model. In each iteration, the dataset was divided into 80% for training and 20% for testing. The values predicted by the model were compared with real values from the test data to calculate the relative error, MAE, RMSE, and $R^{2}$ score, as defined in Section IV.
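The deployment steps described above (meshing the area, Z-score normalization, and prediction on the grid) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the coordinates and RSRP values are synthetic stand-ins for the measured dataset, the helper name `build_grid` is hypothetical, and scikit-learn's `ExtraTreesRegressor` is assumed as the ERTR implementation.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def build_grid(coords, n=100):
    """Return an (n*n, 2) array of lon/lat points spanning the bounding box of coords."""
    lon = np.linspace(coords[:, 0].min(), coords[:, 0].max(), n)
    lat = np.linspace(coords[:, 1].min(), coords[:, 1].max(), n)
    lon_g, lat_g = np.meshgrid(lon, lat)  # rows vary in latitude, columns in longitude
    return np.column_stack([lon_g.ravel(), lat_g.ravel()])


# Synthetic stand-in for the measured (longitude, latitude) -> RSRP samples (assumption).
rng = np.random.default_rng(0)
coords = rng.uniform([3.40, 6.42], [3.46, 6.46], size=(500, 2))
rsrp = -90.0 + 10.0 * np.sin(60.0 * coords[:, 0]) + rng.normal(0.0, 1.0, 500)

# Z-score normalization of the features followed by the tree ensemble.
model = make_pipeline(
    StandardScaler(),
    ExtraTreesRegressor(n_estimators=200, max_depth=50, random_state=0),
)
model.fit(coords, rsrp)

# Predict on the 100 x 100 mesh; the result can then be exported for plotting.
grid = build_grid(coords, n=100)
rem = model.predict(grid).reshape(100, 100)
print(rem.shape, float(rem.min()), float(rem.max()))
```

Wrapping the scaler and the regressor in one pipeline ensures the grid points are normalized with the statistics learned from the training points, which is one reasonable reading of the procedure.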

B. ERTR-BASED FRAMEWORK FOR COVERAGE PREDICTION AND REM CONSTRUCTION
In this paper, we investigate an ERTR scheme to predict the coverage of outdoor propagation for 4G MC given by the numerical values of $r_i$. The ERTR algorithm is a supervised ensemble learning model [12] that combines the predictions of various individual trees, in which the whole training dataset is used to create each DT. Figure 4 illustrates the structure of a DT where new instances perform top-down learning to make predictions. For example, every new instance begins in the root node, moves along the branches, and goes through child nodes until it reaches a leaf node [26]. Two rules differentiate the ERTR algorithm from similar ensemble techniques like RF. First, ERTR chooses a random subset of features for each tree from all available features. Second, the split procedure of ERTR relies on a random selection of a splitting value for each of the selected features.
In detail, the features of the training dataset are given by $\mathcal{M} = \{m_1, m_2, \ldots, m_M\}$, where $M$ is the number of instances in the training dataset; the samples, $m_i = \{x_1, x_2, \ldots, x_N\}$, constitute an $N$-dimensional vector, and $x_j$ denotes the $j$-th feature, in which $j \in \{1, 2, \ldots, N\}$. In each DT created by the ERTR algorithm, $S_c$ represents the subset of instances in the training dataset at child node $c$. Therefore, at each node $c$, the best split is determined from $S_c$ and a random subgroup of features, as in Algorithm 1. Next, $S_c$ at $c$ is divided into two subsets: $S_c^{\text{right}}$ includes samples that satisfy the two rules of the extra-tree algorithm, and $S_c^{\text{left}}$ includes the rest of the training instances. Furthermore, we use mean square error (MSE) [7] to evaluate the quality of a split, i.e., the best division is selected in accordance with the lowest MSE. The process is repeated in each child node until reaching the minimum number of samples for the split, $v_{\min}$. On the other hand, during the testing procedure, a test sample goes through each DT and traverses each child node. During the process, the test sample uses the best split to go to the right or left child node until reaching a leaf node. The prediction for the test sample is given by the leaf node for each DT, and the final prediction of the ERTR algorithm is defined as the average of the predictions of the $F$ decision trees.

Algorithm 1
...
6: For each feature $r$ in the subgroup do:
7: Calculate the maximum value, $x_r^{\max}$, and the minimum value, $x_r^{\min}$, of the feature $r$ in the subset $S_c$.
8: ...
9: Choose $x_r < x_r^{c}$ as a candidate split.
10: End for
11: ... MSE($x_R^{c}$)
12: Output: best split rule $x_* < x_*^{c}$ at the child node $c$.
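The node-splitting idea behind Algorithm 1 can be illustrated with a short Python sketch. It assumes, in line with the description in Section IV-C, that the cut point for each candidate feature is drawn uniformly between the minimum and maximum of that feature at the node, and that the candidate with the lowest MSE is kept. The function names `mse_of_split` and `random_best_split` are hypothetical, and the data are synthetic.

```python
import numpy as np


def mse_of_split(y_below, y_above):
    """MSE of a split when each side predicts the mean of its own targets."""
    n = len(y_below) + len(y_above)
    sse = ((y_below - y_below.mean()) ** 2).sum() + ((y_above - y_above.mean()) ** 2).sum()
    return sse / n


def random_best_split(X, y, n_candidate_features, rng):
    """Draw one random cut per candidate feature and return the lowest-MSE (feature, cut, mse)."""
    features = rng.choice(X.shape[1], size=n_candidate_features, replace=False)
    best = None
    for r in features:
        x_min, x_max = X[:, r].min(), X[:, r].max()
        if x_min == x_max:
            continue  # constant feature at this node, nothing to split
        cut = rng.uniform(x_min, x_max)  # random threshold instead of an exhaustive search
        below = X[:, r] < cut
        if not below.any() or below.all():
            continue  # degenerate split, skip it
        score = mse_of_split(y[below], y[~below])
        if best is None or score < best[2]:
            best = (int(r), float(cut), float(score))
    return best


# Toy node: two features (e.g., normalized longitude and latitude), step-like target.
rng = np.random.default_rng(1)
X = rng.uniform(size=(200, 2))
y = np.where(X[:, 0] < 0.5, -80.0, -100.0)
best = random_best_split(X, y, n_candidate_features=2, rng=rng)
print(best)
```

Because only one threshold per feature is evaluated, each node costs far fewer MSE evaluations than an exhaustive search over all observed feature values, which is consistent with the lower training time discussed later in the paper.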

IV. NUMERICAL RESULTS
In this section, we present simulation results from MC coverage prediction based on RSRP and RSSI, as well as REM construction over dense urban environments around Victoria Island and Ikoyi in Lagos, Nigeria [23]. First, we present the performance evaluation from 5-fold cross-validation of the proposed ERTR algorithm and the additional baseline ML algorithms: RF [7], the bagging regressor [27], [28], [29], KNN, the DNN, GPR [28], the DT, and SVR with a radial basis function (RBF) kernel [30]. Moreover, in our comparative approaches, we include OK, an interpolation technique for cellular coverage prediction [2], [31]. For the OK algorithm, we used the module previously developed in Python, named PyKrige [32]. Second, we present graphic results from REM construction on a 2D map and the Google Earth digital map.

A. EVALUATION WITH 5-FOLD CROSS VALIDATION
In this subsection, we assess the performance of the ML regression schemes and OK by applying 5-fold cross-validation [25] to obtain the following error metrics: relative error, MAE, RMSE, and the $R^{2}$ score. This procedure was described in Section III-A, and the results are the average of several repetitions of 5-fold cross-validation. Specifically, the relative error is the absolute error between the predicted value, $\hat{r}_i$, and the real measure, $r_i$, divided by the real measure. Therefore, it provides insight into how well the model performs across a range of values, and a lower relative error indicates better performance. Regarding MAE, it measures the average absolute difference between the predicted and actual values, indicating how close the predictions are to the actual values [33]. Meanwhile, RMSE is the square root of the average of the squares of the differences between the actual value and the estimated value. The equations for relative error, MAE, and RMSE are expressed in (1), (2), and (3), respectively:
$$\text{Relative error} = \frac{1}{m}\sum_{i=1}^{m}\left|\frac{\hat{r}_i - r_i}{r_i}\right| \quad (1)$$
$$\text{MAE} = \frac{1}{m}\sum_{i=1}^{m}\left|\hat{r}_i - r_i\right| \quad (2)$$
$$\text{RMSE} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(r_i - \hat{r}_i\right)^{2}} \quad (3)$$
where $m$ is the number of samples.
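As a worked illustration of the four error metrics named above, the following Python sketch computes them with NumPy on three made-up RSRP values. The function name `regression_metrics` is hypothetical, the relative error is assumed to be averaged over the samples, and the $R^{2}$ score uses the usual one-minus-ratio form that the text goes on to discuss.

```python
import numpy as np


def regression_metrics(r_true, r_pred):
    """Relative error, MAE, RMSE, and R^2 score for real values r_true and predictions r_pred."""
    r_true = np.asarray(r_true, dtype=float)
    r_pred = np.asarray(r_pred, dtype=float)
    err = r_pred - r_true
    ss_res = np.sum(err ** 2)                          # sum of squared residuals
    ss_tot = np.sum((r_true - r_true.mean()) ** 2)     # spread of the targets around their mean
    return {
        "relative_error": float(np.mean(np.abs(err / r_true))),
        "mae": float(np.mean(np.abs(err))),
        "rmse": float(np.sqrt(np.mean(err ** 2))),
        "r2": float(1.0 - ss_res / ss_tot),
    }


# Made-up example: real RSRP values (dBm) and model predictions.
metrics = regression_metrics([-80.0, -90.0, -100.0], [-82.0, -90.0, -97.0])
print(metrics)
```

For these three samples the errors are −2, 0, and 3 dB, so the MAE is 5/3 ≈ 1.67 dB, the RMSE is √(13/3) ≈ 2.08 dB, and the $R^{2}$ score is 1 − 13/200 = 0.935. The MAE and RMSE carry the unit of the target (dB), whereas the relative error and $R^{2}$ score are dimensionless.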
Note that the lower the value of the aforementioned metrics, the better the performance, unlike the $R^{2}$ score, where a higher value is better. The $R^{2}$ score can be expressed as follows:
$$R^{2} = 1 - \frac{\sum_{i=1}^{m}\left(r_i - \hat{r}_i\right)^{2}}{\sum_{i=1}^{m}\left(r_i - \bar{r}\right)^{2}}$$
where the numerator of the second term is the error given by the summation of squares of the residual prediction errors, while the denominator represents the variance, where $\bar{r}$ is the average target value [34], [35]. The main idea of the $R^{2}$ score is to measure the proportion of the variance in the dependent variable (i.e., the target variable being predicted) that is predictable from the independent variables (i.e., the features used for prediction) in a regression model. The score has an upper bound of 1, which represents a perfectly accurate prediction, but there is no lower bound, implying that predictions can be extremely inaccurate. If the score is around 0, the model can be considered as good as guessing around the mean, $\bar{r}$. In summary, the MAE measures the average absolute difference between predicted and actual values, while the relative error measures the error as a fraction of the actual value. Thus, MAE is scale-dependent, while relative error provides a scale-independent measure of accuracy. Regarding the $R^{2}$ score, it measures the amount of variability in the target variable that is explained by the model. Including all these metrics in the analysis provides a more comprehensive evaluation of the model's performance, as each metric captures a different aspect of the model's accuracy and fit to the data.
Accordingly, Figure 5 and Figure 6 show the number of RSRP training samples versus relative error and RMSE, respectively. From Figure 5 and Figure 6, we observe that as the number of training samples increased, the performance of the investigated algorithms improved. Therefore, we used 36,000 samples for the training procedure of the models.
Moreover, we observe that the worst performance was obtained by the DNN, SVR, and GPR. On the other hand, we can see that the lowest relative error and RMSE values were achieved by the ERTR ensemble learning algorithm, followed by RF and the bagging regressor. Therefore, in Figure 7 and Figure 8, we analyze the performance of these ensemble learning algorithms in terms of training time and RMSE, respectively, when varying the number of trees. In addition, from Figure 5 and Figure 6, we can see that OK achieved performance close to RF and the bagging regressor. Thus, the proposed ERTR performed best compared to the benchmark schemes. This is because ERTR improves the reduction of bias and variance by utilizing two main strategies: it uses the entire dataset to build each tree, and it randomizes the selection of the node split. This differs from RF, which uses bootstrap replicas and selects the optimum split. The bagging regressor trains each regressor model on random subsets of the original training set drawn with replacement and then aggregates the individual predictions by averaging them to give a final prediction. Note that the parameter values of each algorithm in the simulation results were set based on the best results obtained through hyperparameter tuning and several experiments. For instance, with the OK algorithm, we set the number of closest points to 2, and we selected power, loop, and bool as the variogram model, backend, and mask, respectively.
For the DT, we set the maximum depth to 50 and the minimum samples split parameter to 2. To set the hyperparameters for the DNN model, we evaluated the suitable number of neurons, hidden layers, and learning rate through extensive simulation results and hyperparameter tuning performed on the training data. In this sense, utilizing too many neurons and hidden layers can lead to overfitting, where the network becomes overly specialized to the training data and performs poorly on new, unseen data. On the other hand, using too few neurons and hidden layers may result in underfitting, where the network fails to capture relevant patterns in the data. Similarly, the learning rate determines the step size at which the network adjusts its weights during the training process. A larger learning rate allows for faster convergence but can lead to overshooting the optimal weights and potentially unstable training. By contrast, a smaller learning rate can improve stability but might result in slower convergence or in getting trapped in local optima. Therefore, we selected an appropriate learning rate through experimentation to find a balance between convergence speed and stability. Accordingly, for the DNN, we used four hidden layers. The number of neurons per hidden layer was 100, 50, 100, and 50. We used the ReLU activation function and Adam as the solver, with the learning rate set to 0.0001 and the maximum number of iterations at 300. To establish the best values for the ERTR, RF, and bagging regressor parameters, we analyzed the results obtained by these algorithms with different numbers of regressor trees, samples, and required training times. In this sense, Figure 7 and Figure 8 show the performance of these algorithms in terms of training time and RMSE based on the number of regressor trees. Specifically, Figure 7 shows the number of trees utilized by ERTR, RF, and the bagging regressor versus the training time to obtain the two target values (RSRP and RSSI).
Here, we observe that the training time increases as more regressor trees are used. By contrast, Figure 8 shows that beyond 180 regressor trees, the RMSE for both RSRP and RSSI did not vary appreciably. Therefore, for our purposes, we utilized 200 regressor trees for ERTR, RF, and the bagging regressor. Moreover, we set the maximum tree depth for ERTR and RF equal to 50. In addition, from Figure 7 and Figure 8, it is noteworthy that ERTR outperformed RF and the bagging regressor in both training time and RMSE. These results validate the efficiency of the proposed ERTR algorithm, which achieved the lowest error with less computational time. Figure 9 shows the training time versus the number of RSRP samples. In Figure 9, we include the performance of ERTR, RF, the bagging regressor, and OK. It is worth noting that the OK algorithm has been widely utilized for coverage prediction in MC environments [2], [14]. Overall, from Figure 9, we can see that as the number of samples increased, the training time increased. However, we can also see from Figure 9 that the ensemble learning methods required less computational time, whereas OK required the longest training time. Consequently, from Figure 7 and Figure 9, we verify that ERTR needed the shortest training time, which results in a computational load reduction. We used a PC with an AMD Ryzen 9 5900X CPU and 48 GB of main memory. Figure 10 and Figure 11 show the number of RSSI training samples versus relative error and RMSE, respectively. Similar to Figure 5 and Figure 6, from Figure 10 and Figure 11, we can observe that as the number of training samples increased, the performance of the investigated algorithms was enhanced. Moreover, the DNN, SVR, and GPR schemes had a higher relative error and RMSE than the OK method and the ensemble learning algorithms (RF, bagging regressor, and ERTR).
However, it is remarkable from Figure 10 and Figure 11 that the proposed ERTR outperformed the baseline schemes by achieving the lowest relative error and RMSE, respectively. Table 1 and Table 2 compare regression performance using RMSE, MAE, and $R^{2}$ score for RF, the bagging regressor, SVR, KNN, GPR, the DNN, DT, ERTR, and OK. Recall that the parameters of each algorithm in the simulation results are set according to the best results through hyperparameter tuning and several experiments. The results in Table 1 are based on RSRP target values, whereas those in Table 2 are from RSSI target values. From Table 1 and Table 2, we can see that ERTR obtained the lowest error measurements. Thus, ERTR outperformed the other ML techniques and OK. Moreover, it is remarkable that SVR and GPR presented the worst error measurements, followed by the DNN.
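The comparison of the three tree ensembles under 5-fold cross-validation can be reproduced in outline with scikit-learn, as sketched below. This is an illustrative setup under assumptions rather than the authors' script: the data are synthetic, the helper names `build_models` and `evaluate` are hypothetical, and only the tree count (200) and maximum depth (50 for ERTR and RF) are taken from the text. The demonstration call uses 50 trees solely to keep the run short.

```python
import numpy as np
from sklearn.ensemble import BaggingRegressor, ExtraTreesRegressor, RandomForestRegressor
from sklearn.model_selection import KFold, cross_validate
from sklearn.tree import DecisionTreeRegressor


def build_models(n_trees=200, max_depth=50, seed=0):
    """Tree ensembles configured with the tree count and depth reported in the text."""
    return {
        "ERTR": ExtraTreesRegressor(n_estimators=n_trees, max_depth=max_depth, random_state=seed),
        "RF": RandomForestRegressor(n_estimators=n_trees, max_depth=max_depth, random_state=seed),
        "Bagging": BaggingRegressor(DecisionTreeRegressor(), n_estimators=n_trees, random_state=seed),
    }


def evaluate(models, X, y, n_splits=5, seed=0):
    """Average RMSE, MAE, R^2, and fit time over a shuffled k-fold split."""
    cv = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scoring = {"rmse": "neg_root_mean_squared_error", "mae": "neg_mean_absolute_error", "r2": "r2"}
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {
            "rmse": float(-scores["test_rmse"].mean()),  # scorers are negated errors
            "mae": float(-scores["test_mae"].mean()),
            "r2": float(scores["test_r2"].mean()),
            "fit_time_s": float(scores["fit_time"].mean()),
        }
    return results


# Synthetic smooth field standing in for normalized coordinates -> RSRP (assumption).
rng = np.random.default_rng(0)
X = rng.uniform(size=(400, 2))
y = -90.0 + 10.0 * np.sin(6.0 * X[:, 0]) + 5.0 * np.cos(4.0 * X[:, 1]) + rng.normal(0.0, 1.0, 400)

results = evaluate(build_models(n_trees=50), X, y)
for name, row in results.items():
    print(name, row)
```

On synthetic data the ranking of the three models need not match the paper's tables; the sketch is only meant to show how the reported metrics and training times can be collected in one pass.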

B. COMPUTATIONAL COMPLEXITY ANALYSIS
In this subsection, we analyze the computational complexity of the proposed ERTR method and the comparative schemes: RF, Bagging, and OK. The computational complexity of the proposed ERTR depends on the number of regression trees, the number of features, the number of samples, and the maximum depth of trees. Specifically, the computational complexity of ERTR can be approximated by $O(F \cdot N \cdot M \cdot t_d)$, where $F$ is the number of trees, $N$ is the number of features, $M$ is the number of training samples, and $t_d$ is the maximum tree depth. In our simulations, we set the maximum tree depth to $t_d = 50$ and included all available features when selecting the best split, i.e., $R = N = 2$. As shown in Figure 7, the training time exhibits linear growth since the computational complexity is directly proportional to the number of trees, $F$, while the number of samples, $M$, and the number of features, $N$, remain fixed.
Similarly, in the case of RF, the computational complexity is given by $O(F \cdot N \cdot M \cdot t_d)$ [36]. However, ERTR achieves lower computational time compared to RF because it makes use of a random threshold to split the data at each node, without searching for the best possible threshold as in RF. In the case of Bagging, the computational complexity is given by $O(E_B \cdot L_B)$ [36], where $E_B$ is the number of base regressors, and $L_B$ is the computational complexity of training a single base regressor. In our simulations, we use the decision tree regressor as the base estimator, which has a complexity of $L_B = O(N \cdot M \cdot t_d)$. On the other hand, the computational complexity of training for OK is given by $O(M^{3})$ [37], which leads to the cubic growth of the training time as observed in Figure 9.
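To give a feel for the difference between the two growth rates, the following Python sketch evaluates the leading terms with the settings stated in the text (200 trees, 2 features, 36,000 training samples, maximum depth 50). The function names are hypothetical, and the numbers are only the arguments of the big-O expressions: constant factors are ignored, so they are not runtime predictions.

```python
def tree_ensemble_term(F, N, M, t_d):
    """Leading term F * N * M * t_d of the tree-ensemble training complexity."""
    return F * N * M * t_d


def kriging_term(M):
    """Leading term M^3 of the ordinary kriging training complexity."""
    return M ** 3


F, N, M, t_d = 200, 2, 36_000, 50  # values reported in the text

ensemble = tree_ensemble_term(F, N, M, t_d)
kriging = kriging_term(M)
print(ensemble)            # 720000000
print(kriging)             # 46656000000000
print(kriging / ensemble)  # 64800.0

# Doubling the number of samples doubles the ensemble term but multiplies the kriging term by 8.
print(tree_ensemble_term(F, N, 2 * M, t_d) / ensemble)  # 2.0
print(kriging_term(2 * M) / kriging)                    # 8.0
```

The linear dependence on $M$ for the tree ensembles versus the cubic dependence for OK is what separates the curves as the sample count grows.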

C. GRAPHICAL RESULTS OF REM CONSTRUCTION
In this subsection, we present a grid of 100 × 100 points covering the area of interest from the dataset [23] with RSSI and RSRP predicted values from the trained regressor algorithm processed as described in Section III-A. The threshold values for RSSI are defined as follows: values higher than −70 dBm are considered excellent signal strength reception; values from −70 dBm to −85 dBm are considered good reception; −90 dBm to −100 dBm is considered fair reception, and less than −100 dBm is poor. Meanwhile, the threshold values for RSRP are defined as follows: values higher than −80 dBm are considered excellent signal strength reception; values from −80 dBm to −90 dBm are considered good reception; −90 dBm to −100 dBm is considered fair reception, and values less than −100 dBm are poor. Figure 12 and Figure 13, respectively, show the MC coverage map predictions for RSRP and RSSI target values on a 2D map by applying the proposed ERTR algorithm and the RF and bagging regressor baseline schemes. From Figure 12 and Figure 13, we observe that RF and the bagging regressor presented abrupt changes of color on the 2D map, which makes it difficult to identify critical points where the signal is decreasing. This makes coverage prediction unreliable. By contrast, the 2D coverage maps obtained by ERTR tend to better generalize the prediction points, since we can appreciate how the signal strength is degrading without abrupt changes. In this sense, ERTR can better detect the quality of signal reception, where we can see areas with good reception and those with shadow areas. Note that the quality of REMs constructed using machine learning techniques depends on several factors, including the quality and quantity of the data used for training, the chosen algorithm for constructing the maps, and the complexity of the environment being mapped. 
As a result, it is essential to carefully collect and preprocess data, as well as thoroughly test and validate the radio maps to ensure their quality and reliability. In this regard, the quality of the REMs for ensemble methods based on decision tree regressors primarily depends on how the algorithm selects the split rule in each child node. In the case of RF, the set of possible split points at the child node $c$ is chosen from the feature values of the samples in the subset of the training dataset, $S_c$. Consequently, only specific values are available for choosing the threshold of the best-split rule at each node. Conversely, in the case of ERTR, the split value is randomly selected within a range based on the subset of the training dataset at node $c$, as described in lines 7 to 9 of Algorithm 1. This randomness leads to a smoother representation of the final REMs. Figure 14 shows graphical results for RSSI and RSRP values on a Google Earth map after following the procedure described in Section III-A by applying the ERTR algorithm. From Figure 14, we can identify coverage predictions according to the geographic coordinates, which allows us to improve the signal reception quality by installing relay nodes at those points where quality is poor or by adjusting transmission parameters such as antenna height and tilt angle [21]. This mechanism may help operators to improve network planning in MC systems or any radio system.
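The RSRP reception classes defined at the start of this subsection can be applied to a predicted grid to flag candidate coverage holes, as in the Python sketch below. The names `classify_rsrp` and `RSRP_LABELS` are hypothetical, the small grid is made up, and the handling of values that fall exactly on a threshold (−80 dBm and −90 dBm counted as good, −100 dBm as fair) is an assumption, since the text does not say which class owns each boundary.

```python
import numpy as np

RSRP_LABELS = np.array(["poor", "fair", "good", "excellent"])


def classify_rsrp(rsrp_dbm):
    """Map RSRP values in dBm to reception classes using the -100/-90/-80 dBm thresholds."""
    x = np.asarray(rsrp_dbm, dtype=float)
    # Each satisfied threshold moves the value one class up (boundary handling is an assumption).
    codes = (x >= -100.0).astype(int) + (x >= -90.0).astype(int) + (x > -80.0).astype(int)
    return RSRP_LABELS[codes]


# Made-up 2 x 3 patch of predicted RSRP values (dBm).
rem_patch = np.array([[-75.0, -80.0, -85.0],
                      [-95.0, -100.0, -104.5]])
classes = classify_rsrp(rem_patch)
print(classes)

# Grid cells in the "poor" class are candidate coverage holes.
hole_mask = classes == "poor"
print(float(hole_mask.mean()))
```

Applied to the full 100 × 100 prediction grid, the same mask gives the share of the area with poor reception and the grid indices where relay nodes or antenna adjustments could be considered.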

V. CONCLUSION
In this paper, we proposed a novel MC coverage prediction approach based on a supervised ML regression algorithm. In particular, we designed an ERTR-based scheme to predict coverage through RSRP and RSSI values in an outdoor-to-outdoor propagation environment. For this purpose, we used real measurements carried out using the 4G LTE frequency band in dense urban environments around Victoria Island and Ikoyi in Lagos, Nigeria. Furthermore, we investigated the performance of the OK method, which has been utilized for coverage prediction tasks. In addition, the following ML regression techniques were considered as comparative approaches: RF, the bagging regressor, SVR, KNN, a DNN, GPR, and the DT. It is noteworthy that the ensemble learning techniques (ERTR, RF, and the bagging regressor) achieved higher performance than the other ML schemes. Moreover, it is worth highlighting that OK obtained error metrics close to those of RF and the bagging regressor. However, OK incurs high computational costs that worsen as the number of samples increases. Through numerical results, we showed that the proposed ERTR outperformed the benchmark schemes in terms of computational cost, relative error, RMSE, MAE, and $R^{2}$ score. Furthermore, we constructed a 2D REM and plotted it over a Google Earth map, which provided better visualization of signal quality according to the geographic coordinates and can help network operators to improve the planning of an MC network.