Radio Map Extrapolation Using Compensated Empirical CDF Under Interference-Limited Observations

The radio map has attracted attention as a promising tool for accurately estimating radio propagation and determining communication parameters. Conventional radio maps are mainly used in wireless systems with a single target transmitter. However, in dense transmitter environments, such as cellular systems, interference-limited observations may become dominant, and thus, the received signal power from a target transmitter may be calculated to be higher than the actual value because of survivorship bias. As a result, the coverage area may be overestimated and communication parameters inappropriately determined. In this paper, we propose a radio map extrapolation method for multiple-transmitter environments. The proposed method compensates for the empirical cumulative distribution function (CDF) of the target power by taking into account the number of missing data in each mesh. The median target power was extrapolated based on the compensated empirical CDF. The emulation results using datasets over the 3.5 GHz band show that the proposed method can improve the mean error of the target power from 7 [dB] to 8 [dB], compared with the multiple imputation (MI) method, a non-compensated radio map, and kriging-based spatial interpolation.


I. INTRODUCTION
Radio maps have attracted attention as a promising tool for accurately estimating radio propagation and determining communication parameters for various applications such as a spectrum sharing and localization [1]- [6]. Radio maps are charts that display the average received signal power in meshes, where a mesh is defined by a square space based on geographical coordinates. To construct radio maps, multiple mobile terminals demodulate massive received signal power data in each location, and each terminal reports observed datasets to the database server. The database server then creates a radio map by calculating the average received signal power in each mesh. The calculated average value may be approximately equal to the true average value because of the The associate editor coordinating the review of this manuscript and approving it for publication was Chan Hwang See. law of large numbers. This construction method is known as crowdsourcing [7]- [11]. Through crowdsourcing, the radio map can be efficiently constructed because massive measurement data can be obtained for each location in a short time.
The assumption of common radio maps discussed in [12]- [14] is that the number of transmitters is one. These methods enable precise estimation of television white spaces to improve the quality of spectrum sharing. Meanwhile, the radio map can be used in multiple-transmitter environments, such as systems with multiple access points [15]- [18] and in private networks [19], [20]. It is possible to generate radio maps by observing the received signal power from the transmitter along with each transmitter ID. However, if the number of transmitters operating in the same band is large, the received signal power from the target transmitter (hereafter, referred to as target power) may be missing owing to the severe interference-limited observation. In particu-lar, low signal-to-interference ratio (SIR) data are missing even in measurements with high signal-to-noise ratio (SNR). Here, let us define the interference-limited observations as ''signal power measurements in which the availability of the target received power is determined by whether signalto-interference-plus-noise ratio (SINR) satisfies a certain quality.'' In such wireless systems, the average target power is calculated to be higher than the actual value. This phenomenon is known as the survivorship bias [21]. For example, Achtzehn et al. [22] estimated the error of the received signal power to be up to 4 [dB] higher in a multiple-transmitter environment than in a single-transmitter environment in an urban area. In such an environment with survivorship bias, radio maps overestimate the coverage area; that is, the communication parameters may be inappropriately determined. Spatial interpolation may be used to predict the missing data; however, the interpolated value may not be accurate. This is because the absent data occurs outside of the known data range. These motivations allow us to extrapolate the missing target power in dense transmitter environments to create an accurate radio map.
In related works on radio map construction in multiple transmitters, references [15]- [18] have discussed the process for indoor environments. However, because the aggregate interference signal power is negligible owing to the walls and obstacles in indoor environments, severe interference-limited observations may not occur. Although Bouali et al. [23] utilize a radio map in multiple-transmitter environments, this work assumes that each transmitter uses different frequency bands. Hence, this method cannot be utilized for multiple transmitters operating in the same band.
This paper proposes a radio map extrapolation method based on the compensation of the empirical cumulative distribution function (CDF) of the target power. In the proposed method, mobile terminals first observe the target power and each terminal reports the obtained data to the database server. After compensating for the empirical CDF by considering the number of missing data in each mesh, the database server extrapolates the median target power. The emulation results using datasets over the 3.5 GHz band show that the proposed method can more precisely extrapolate the median target power than using a non-compensated radio map and with the MI-and kriging-based methods without compensation for the median target power.
The main contributions of this study are summarized as follows.
• We comprehensively investigated compensation methods for missing data, including parametric approaches (such as using truncated normal distribution, lefttruncated exponential distribution, and Tobit model), an interpolation approach (kriging), and an extrapolation approach (MI method). The missing data were modeled as certain distributions in the parametric and extrapolation approaches. Furthermore, the traditional interpolation approach does not extrapolate the missing data for interference-limited observations. These observations motivated us to propose a novel extrapolation method using the compensated empirical CDF for interferencelimited observations.
• The emulation results revealed that the estimation accuracy of the average target power could be improved by a factor of approximately 1-8 [dB] by using the proposed method compared with that obtained using the conventional methods. The remainder of this paper is organized as follows. Sect. II describes related works for interference-limited and noiselimited observations. Sect. III explains the system model, and Sect. IV describes the proposed method. Sect. V shows the conventional compensation methods to estimate the missing data and Sect. VI describes the emulation setups. After Sect. VII explains the emulation results, we conclude this paper in Sect. VIII.

II. RELATED WORKS A. THEORETICAL ANALYSIS FOR INTERFERENCE-LIMITED COMMUNICATION SYSTEMS
Related to interference-limited observations, Effects of interference on wireless systems have been analyzed in references [24]- [28] in terms of SIR outage probability performance. They have theoretically derived a probability density function (PDF) and a CDF of the SIR for typical fading models, such as generalized-k fading [24], Nakagami-m fading [25], and Weibull-Gamma fading [27]. The theoretical outage performances were verified with computer simulations. Based on the related discussions, we propose considering the outage performance of signal measurement, influenced by the aggregate interference, for the radio map extrapolation.

B. COMPENSATION FOR NOISE-LIMITED OBSERVATIONS
The truncated normal distribution [29], [30], left-truncated exponential distribution [31], and Tobit model [32] are used for the compensation of missing data in various wireless systems. Each method models the PDF of the missing data as a specific distribution. Then, the parameters of the PDF are derived by calculating the log-likelihood function. However, these works assume the target power to be missing if it is less than the noise floor in noise-limited observations. In interference-limited observations, the threshold at which the target power is missing may not be known because of the probabilistic fluctuation of the SIR. Thus, these methods do not allow us to accurately extrapolate the radio map for interference-limited observations. Compared with the related methods, the proposed method can improve the problem using an empirical distribution and the number of missing data.

III. SYSTEM MODEL
The system model is described in this section. In the following, we explain the measurement model, the missing criterion of the target power, and the mesh definitions in the interference-limited observation. VOLUME 10, 2022 A. MEASUREMENT MODEL AND STATISTICAL PROCESSING The system model is illustrated in Fig. 1 and f , respectively. Here, s is the index of an interfering transmitter (s = 1, 2, · · · , S).
Mobile terminals observe the target power for each location in multiple transmitters and report the obtained data to the database server. The database server divides the communication area into M meshes (m = 0, · · · , M − 1) to average small-scale fading. Then, the average target power is calculated for each mesh, where m is the mesh index. The mesh is defined as a square geographic space defined by the latitude and longitude. The calculated average values are stored as a radio map of the target transmitter in the database server. In addition, the database server estimates the measurement-based path loss model using the obtained target power data. This model is explained in Sect. VII-B.
However, because the constructed statistical data are inaccurate owing to survivorship bias, we extrapolate the missing target power in each mesh. To construct the statistical data, this study assumes that the locations of the target transmitter and mobile terminals, the transmission interval of the target transmitter, and the reception time of each mobile terminal are known.
This paper assumes that the location and reception time of each mobile terminal can be obtained by applicationlevel implementation, such as Android API [7], based on Global Positioning System (GPS) or Global Navigation Satellite System (GNSS). Further, Wi-Fi Round-Trip Time (Wi-Fi RTT) 1 would help estimate this information for indoor applications. Recent Android OS also provides the API for Wi-Fi RTT. Since Wi-Fi RTT is based on Fine-Time-Measurement (FTM), both time/location can be obtained.
Further, if the transmitter locations are not known, localization techniques based on the received signal power values are the optional choice (e.g., [33]). Finally, the transmission interval information also can be estimated from the received signal power values. In reference [34], the authors model ON/OFF activities of multiple transmitters using the hidden Markov model (HMM). They had shown that the ON/OFF statements (i.e., transmission intervals) could be estimated using HMM and time-series received signal power measurements. This method will help us if such information is not given as a prior.

B. MISSING CRITERION BASED ON INSTANTANEOUS SINR
This paper defines missing data as a target signal power value that failed to be obtained and calls the signal measurements in the presence of this phenomenon as interference-limited 1 https://developer.android.com/guide/topics/connectivity/wifi-rtt observations. In interference-limited observation, the transmitted signals from the target transmitter and S interfering transmitters can be considered as the target power and aggregate interference power, respectively. Thus, we model the missing criterion of the target power as the SINR. Under this condition, the target power and ID of the target transmitter are recorded for the storage of each mobile terminal if the following criterion is satisfied: where γ th [dB] is the SINR threshold. Additionally, γ g,m [dB] is the instantaneous SINR for the g-th instantaneous target power in the m-th mesh. The definition is given by γ g,m = P g,m − 10log 10 10 where P g,m [dBm] is the g-th instantaneous target power in the m-th mesh. Note that P g,m does not include interference power. g is the data index (g = 0, 1, · · · , G − 1) and G is the number of target power data before the interference occurs. N 0 [dBm] is the noise floor of the mobile terminal. I sum,m [dBm] is the aggregate interference power in the m-th mesh, as explained in Sect. VI-B. It is assumed that the number of missing data in each mesh can be calculated based on the transmission interval of the target transmitter and the reception time of each terminal. After the observation, the mobile terminals report the obtained data to the database server. Even if (1) is not satisfied, the latitude and longitude in each reception location are obtained from a global positioning system module implemented in the mobile terminal.
In the interference-limited observation, I sum,m may be larger than N 0 because many interfering transmitters exist in a small area. Thus, the instantaneous SINR can be approximated as follows: where γ g,m [dB] is the approximated SINR (or SIR) for the g-th instantaneous target power in the m-th mesh. This approximation is introduced to simplify the extrapolation process. Note that the emulation evaluation (in Sect. VII) considers Eq. (3) to verify the validity of this approximation.

C. MESH DEFINITIONS
As described in Sect. III-B, the missing data for the target power is probabilistically determined according to the instantaneous SINR. In this environment, the mesh types are considered as follows: a). Non-missing mesh: All instantaneous target power data in the mesh can be obtained without the missing data. Extrapolation was not necessary for this mesh. b). Semi-missing mesh: We define that n loss,k target power data in the k-th mesh are missing owing to aggregate interference. This mesh is named as the semi-missing mesh. Here, n loss,k is the number of missing data in the k-th semi-missing mesh. Meanwhile, it is assumed that n k instantaneous target power data is accumulated in the k-th semi-missing mesh. The target power vector is defined as is the u-th instantaneous target power and u is the data index of the target power (u = 0, 1, · · · , n k − 1). c). Complete-missing mesh: This paper defines that all n loss,c target power data are missing in the c-th mesh.
The mesh is referred to as the complete-missing mesh.
Here, n loss,c is the number of missing data in the c-th complete-missing mesh. The extrapolation method is described in Sect. IV-B. As described in Sect. III-B, the number of missing data can be calculated from the transmission interval of the target transmitter and the reception time of each terminal. Thus, we assume that n loss,k and n loss,c are known.

IV. PROPOSED METHOD
We propose two extrapolation methods for extrapolating the missing data based on the compensation of the empirical CDF. The extrapolation procedures for the semi-missing mesh and complete-missing mesh are described.

A. EXTRAPOLATION FOR SEMI-MISSING MESH
If (3) is satisfied, the target power with a small SINR may be missing [N 0 , p min,k ], where p min,k [dBm] is the minimum instantaneous target power in the k-th semi-missing mesh. The histogram of this range is compensated for by considering n loss,k . The extrapolation procedures are as follows: . Because p min,k is usually the floatingpoint type, it is divisible by w and less than p min,k . II). The database server compensates the histogram for each bin by increasing the number of data by 1 from p min,k . When the number of additional data become n loss,k , the compensation of the histogram is completed. Thus, n loss,k is satisfied by the following constraint: where R i is the number of additional data points in the i-th bin. If the lower limit value of the class that  increases the amount of data becomes N 0 and the number of additional data points does not become n loss,k , the database server compensates the histogram from p min,k again until the above constraint is satisfied. An overview of the extrapolation, where B k = 4 is illustrated in Fig. 2. The white rectangle is an observed data that is not missing. The gray rectangle denotes the data points increased by compensation. If the number of gray rectangles is n loss,k , the compensation ends. The compensated target power vector for the k-th semi-missing mesh is defined as P semi,k = (P 0 , P 1 , · · · , P n k −1+n loss,k ). III). The empirical CDF of P semi,k is calculated using the following equation, F k (P) = 1 n k + n loss,k − 1 n k +n loss,k −1 u=0 χ P (P u ), (5) whereF k (P) is the empirical CDF of the k-th semimissing mesh. P [dBm] denotes the instantaneous target power. Moreover, χ P (P u ) is an indicator function given by IV). The database server extrapolates the median target power of the k-th semi-missing mesh P med,k [dBm] based on the median ofF k (P) and registers it. The database server performs the above calculation in all semi-missing meshes.

B. EXTRAPOLATION FOR COMPLETE-MISSING MESH
This subsection presents an extrapolation method for the c-th complete-missing mesh. In the complete-missing mesh, VOLUME 10, 2022 no instantaneous target power data can be used to find p min,k . Thus, the database server performs preprocessing as follows: i). The logarithmic distance log 10 (d c ) between the target transmitter and the c-th complete-missing mesh is calculated. ii). The database server extracts the non-missing meshes and semi-missing meshes that have the almost same distance as log 10 (d c ). If the D-th decimal place or higher is equivalent between log 10 (d c ) and a mesh, the non-missing mesh and semi-missing mesh are extracted. iii). Finally, the minimum target power is obtained from the extracted non-missing and semi-missing meshes. The power is defined as p min,c [dBm] for the c-th completemissing mesh. Then, the extrapolation is performed as follows: A). The database server performs the same procedures shown in Sect. IV-A I). and II). with p min,c and n loss,c . The compensated target power vector for the c-th complete missing mesh is defined as P complete,c = (P 0 , P 1 , · · · , P n loss,c −1 ). B). The empirical CDF of P complete,c is calculated as follows,F whereF c (P) is the empirical CDF of P complete,c . C). The database server extrapolates the median target power of the c-th complete-missing mesh P med,c [dBm] based on theF c (P) and registers it. Fig. 3 represents the overview of the extrapolation for the complete-missing mesh. In this figure, the c-th mesh is a complete-missing mesh. The a-th and b-th meshes are non-missing and semi-missing, respectively. The three meshes are at approximately the same distance log 10 (d c ). Thus, p min,c is obtained from the a-th and the b-th meshes.

V. CONVENTIONAL COMPENSATION METHODS
This section describes the conventional compensation methods: an interpolation method and an extrapolation method.

A. SPATIAL INTERPOLATION
Kriging is a well-known method for radio map compensation [35]. This method interpolates the unknown received signal power for the c-th complete-missing mesh as follows: whereP c [dBm] is the interpolated average received signal power in the c-th complete-missing mesh.P j [dBm] is the measured average value in the j-th mesh, and J is the number of meshes with known data points. To perform kriging, it is necessary to derive ω j , which is the weight considering the spatial-covariance structure of the random field. Kriging methods are subdivided into several types according ω j . We use ordinary kriging method because it does not require prior knowledge of the expectation of the received signal power. Ordinary kriging calculates the weights so that the variance of the estimation error is minimized under the weight constraint as follows: where σ 2 is the variance of the estimation error, and P c [dBm] is the true average received signal power in the c-th complete-missing mesh. Using the Lagrange multiplier method, the objective function ψ(ω j , µ) can be written as: where µ is the Lagrange multiplier. Here, σ 2 can be represented using the semivariogram ξ as follows: where d l,j [m] is the distance between the l-th mesh and the j-th mesh. Because the shadowing correlation empirically follows the exponential decay model [36], we used the exponential semivariogram model for ξ . The definition is given by where θ 2 n , θ 2 s , and θ r are the nugget, sill, and range, respectively. These parameters can be obtained by nonlinearly fitting the measured data into (12). Finally, we can express the simultaneous equations by calculating the partial derivatives in (11) as follows: By solving the simultaneous equations, the weights that minimize σ 2 can be derived.

B. SPATIAL EXTRAPOLATION
The alternative compensation method is an extrapolation method, which compensates for missing data outside of the known data. This situation matches the missing phenomenon in the coverage edge under severe interference-limited observations. Extrapolation is divided into two types: single imputation (SI) method [37] and MI method [38]. The SI method imputes a representative value of the known data to the missing points. For example, a mean imputation [39], median imputation [40], and hot-deck imputation [41] have been proposed. However, these methods may not extrapolate accurately if the statistical characteristics differ greatly between the known and missing data. Meanwhile, the MI method models a posterior probability distribution of the missing data and obtains Q datasets that follow the modeled distribution. After creating Q representative values from Q datasets, the database server unifies these values to a single value by statistical processing, such as averaging and linear regression. Finally, the obtained value is imputed to the missing points. The MI method can roughly extrapolate the missing data by appropriately modeling the posterior probability distribution. From this perspective, the MI method is superior to the SI method [38]. However, in a realistic environment, the complicated radio propagation may not allow modeling the posterior probability distribution of the missing data appropriately; that is, the extrapolation accuracy of the MI method is low. Table 1 lists advantages, disadvantages, and performance indications in the related methods.

C. SUMMARY OF COMPENSATION METHODS
First, the proposed method can accurately extrapolate the missing data based on measured datasets and the number of missing data without specific assumption for the PDF of interference. In contrast, the requirement of the number of missing data will be the disadvantage of the proposed method. Fig. 10 in our paper reveals that the proposed method is affected by the class width w because the histogram is inaccurately compensated in large or small w. Therefore, we give w as the performance indication in the proposed method.
Kriging can interpolate the missing data by considering the spatial correlation of the measurement datasets based on the optimized weighted averaging. This method can construct an optimal radio map under the interpolation task. However, since this method is a non-parametric interpolation technique, its accuracy is degraded significantly in the extrapolation task. Because kriging optimizes weighting factors based on the theoretical variogram model, the interpolation accuracy depends on variogram model selection. Thus, we give the estimation accuracy of the variogram as the performance indication of kriging.
Next, the MI method extrapolates the missing data by obtaining random values from an existing probability distribution, such as the multivariate normal distribution. In contrast to the proposal, this method does not require information about the number of missing data. However, because the MI method does not utilize the empirical distribution of measured data, it can not extrapolate the missing data by considering the site-specific fluctuation of the received signal power. The extrapolation accuracy depends on whether the actual distribution of the missing data matches the selected probability distribution. Thus, we give the probability distribution selection as the performance indication.
Finally, the truncated distribution-based method models the probability distributions of the missing data as a parametric distribution. Because these methods do not use the number of missing data and empirical distribution, its simplicity can be the advantage. However, as well as the MI method, the accuracy of this method would be degraded if the assumed and actual probability distribution show differences. The extrapolation accuracy is determined according to the fitness of the missing data to the truncated distributions. Thus, we give the fitness of the missing data to the truncated distributions as the performance indication.

VI. EMULATION SETUPS
We evaluated an emulation-based performance using actual datasets measured in an urban area. This evaluation used the dataset obtained over the 3.5 GHz band as the target power. In the following, an overview of the used datasets is provided. For the interference-limited observation, multiple interfering transmitters were virtually deployed around the measurement area, that is, the target power consists of the actually measured datasets, and the interference power was generated over the computer simulation. The details of the target/interference conditions are summarized below.
All methods and emulation programs are implemented based on Python 3 without any commercial simulators. For more detail, our programs were based on Python 3.8.10, numpy 1.22.1, and scipy 1.7.3 with Ubuntu 20.04 LTS. Using the measurement datasets, we emulated the target signals, and the interference signals were generated based on numerical simulations.

A. 3.5GHz BAND DATASETS
This measurement campaign was conducted in Kudanshita Chiyoda-ku, Tokyo, Japan, a typical urban environment. The spectrum analyzer was installed on the cart, and the receiver traveled at 4 [km/h]. The target power, reception time, and reception position were recorded at 0.15 [s] intervals. In this experiment, the receiver location information was obtained by a GPS capability in the spectrum analyzer (Anritsu, MS2712E). After the measurement, the receiver location data was classified into 10-m meshes. Note that the measurement environment provided sufficient visibility of the sky, and the experiments were taken while visually checking the results. Therefore, there was at least enough accuracy to fit the observations into a target 10-m mesh. The transmission and reception heights above the ground level were 17.5 [m] and 1.5 [m], respectively. The measurement parameters are listed in Table 2. The antenna type of the target transmitter is omnidirectional in the horizontal plane. The receiver has an omnidirectional antenna with circular polarization. Further, the gains of the transmission and reception antennas are 2.4 [dBi] and 0.2 [dBi], respectively. The number of measured instantaneous data points was 100423. After the observation, instantaneous received signal power values were classified into 10-m meshes, and an average value was calculated for each mesh.

B. RADIO PROPAGATION MODEL FOR INTERFERING TRANSMITTER
The radio propagation model based on the log-distance path loss for the interfering transmitter is given by where I s,m [dBm] is the instantaneous interference power in the m-th mesh from the s-th interfering transmitter. α is the   is the link distance between the s-th interfering transmitter and the m-th mesh. Additionally, L 0 (d 0 ) [dB] is the free-space path loss, and its function is defined as where λ [m] is the wavelength. Next, the aggregate interference power is defined as, Is,m 10 .
The locations of the interfering transmitters are shown in Fig. 5. The orange square and frame denote the location and coverage of the target transmitter, respectively. The seven yellow squares denote the locations of the interfering transmitters.
The emulation parameters for the interfering transmitters are listed in Table 3. We selected these power levels to verify the performance of radio map extrapolation methods in the presence of interference-limited observations; in particular, aiming to improve a system in which interference from others cannot be ignored, such as secondary systems in dynamic spectrum sharing or licensed shared access systems. Based on this motivation, we set the transmission power at the interference side as 40 dBm so that the ratio of the complete-missing mesh at S = 7 is approximately 50 %. Note that, in terms of SIR, the same power difference between the measurement target and the interfering transmitters shows similar characteristics (e.g., 0 dBm vs. 11 dBm). Our discussions would be applicable to smaller systems, including dense Wi-Fi systems. Further, we assume that interfering transmitters send continuous wave signals using isotropic antennas (0 [dBi]).

VII. EMULATION RESULTS
Next, we explain the examples of radio maps, the measurement-based path loss model, and its accuracy. In the following, D = 3 is utilized. Thus, if 3-th decimal place or higher is equivalent between log 10 (d c ) and the logarithmic distance of non-missing meshes and semi-missing meshes, the non-missing meshes and semi-missing meshes are extracted.

A. EXAMPLE OF RADIO MAPS
A constructed radio map is shown as an example in Fig. 6. The mesh size is 10 [m], γ th = 7 [dB], S = 4, and w = 0.5 [dB]. Fig. 6(a) shows a true map without interference that corresponds to the average target power described in Sect. VI-A. The interfered map based on (1) is shown in Fig. 6(b); the target power data are missing in many meshes. Figs. 6(c), (d), (e), and (f) are the compensated maps using each method.
These maps reveal that the proposed method can more accurately extrapolate missing data compared to other methods. However, in the severe interference area surrounded by the black dotted frame, the extrapolated median value is lower than the true value. In the extrapolation, the non-missing and semi-missing meshes may be extracted from the southern area since these meshes have the almost same distance as the complete-missing meshes. The target power is relatively low in the southern area; thus, the low p min,c is used and the compensation accuracy of the empirical CDF is degraded. Additionally, several complete-missing meshes cannot be extrapolated in the coverage edge because there are few meshes that satisfy the conditions explained in Sect. IV-B-ii).
Meanwhile, the MI method randomly obtains a target power from the multivariate normal distribution in each complete-missing mesh. From Figs. 6(a) and (d), it can be found that the MI method cannot accurately extrapolate the target power in many meshes. Because the MI method randomly extrapolates the missing data without considering n loss,k , p min,k , n loss,c , and p min,c , the difference between the true and extrapolated maps is large.
Finally, Figs. 6(e) and (f) show the kriging-based compensated maps for J = 8. Fig. 6(e) is derived by applying the kriging to the complete-missing meshes in Fig. 6(b). Meanwhile, we create Fig. 6(f) by compensating the empirical CDFs in the semi-missing meshes using the proposed method before the kriging is applied. Note that the extrapolation cannot be performed in the non-colored meshes because there are no known target power data around the complete-missing mesh. These results show that kriging without median compensation cannot accurately extrapolate the target power. Although kriging with median compensation is slightly superior to Fig. 6(e), the extrapolation accuracy is low in the coverage edge because of the severe missing in the target power data.

B. MEASUREMENT-BASED PATH LOSS MODEL
This subsection explains the measurement-based path loss model. The path loss model is derived using a least-squares method, and its function is defined aŝ   the target transmitter and the m-th mesh. The database server first creates a scatter diagram, where the horizontal axis is log 10 (d m ) and vertical axis the average target power. Then, β and η are estimated using the least-squares method. Fig. 7 is an example of the path loss estimation. A circle denotes the average target power in each mesh, and each solid line is the estimated path loss model in each method. Note that γ th = 7 [dB], S = 4, and w = 0.5 [dB]. The black solid line is the true path loss model that corresponds to no interference. The purple solid line is the estimated path loss model from the interfered map shown in Fig. 6(b). We can confirm that the proposed method can estimate the path loss model more accurately compared with the other models, which inaccurately estimate the path loss model compared to the true model owing to the missing of lower SINR data.

C. ESTIMATION ACCURACY
Finally, the estimation accuracy is derived using a signed mean error e [dB], and its definition is given by whereP true,i [dBm] is the true average target power in the i-th mesh.P dB,i [dBm] is the median path loss value expressed in (17). N eva is the number of 10m meshes used in the evaluation. After 100423 instantaneous data points were divided into three groups, the average e was calculated based on the cross-validation. Since instantaneous data was obtained from moving observations, the number of observation points and  data samples varies from mesh to mesh. 2 Table 4 shows the average number of instantaneous values included in a 10m mesh; the number of data in a mesh was roughly 32. The signed mean error is used instead of the root mean squared error because it is necessary to evaluate whether the missing data containing a lower SINR are accurately extrapolated.
In the MI method, a target power is randomly obtained from the multivariate normal distribution in each complete-missing mesh and the parameters β and η are estimated. This operation is performed 20 times. Then, β and η are averaged and median path loss is calculated using the averaged parameters.
The average e is shown in Fig. 8 for S = 4 and w = 0.5 [dB]. These results reveal that the proposed method enables us to precisely predict the median path loss value compared to the other methods. Additionally, kriging with compensation can estimate the radio environment with a certain degree of accuracy which is lesser than that of the proposed method. In the others, the average e is inferior to the proposed method and kriging with compensation because of the inaccurate estimation of β and η.
Next, Fig. 9 presents the average e versus the number of interfering transmitters S where, γ th = 5 [dB] and w = 0.5 [dB]. It can be seen that the proposed method is more accurate than the others. The average e is degraded after S = 5 owing to the severe aggregate interference power. Note that these results correspond to the performance evaluation when the interference power increases in the fixed number of interfering transmitters. All error performances tend to decrease monotonically as S increases. However, we can confirm that the proposed method is superior to the other methods for various interference situations.
Finally, the estimation accuracy versus the class width w is shown in Fig. 10 for S = 4 and γ th = 7 [dB]. These results revealed that the average e increases with an increase in w. In the proposed method, the empirical CDF is repeatedly compensated from p min,k to N 0 . Thus, if w is large, the lower SINR data are excessively compensated. Consequently, η increases; that is, the estimation accuracy is degraded. Furthermore, if a smaller w is utilized, the estimation accuracy decreases. This is because the lower SINR data cannot be accurately compensated for. This figure shows that w = 0.4 was the best parameter in this evaluation, and the error can be within ±1 dB if with w = [0.3, 0.5]. To stably estimate the median path loss, we recommend choosing from w = [0.3, 0.5] in practice.

VIII. CONCLUSION
We investigated radio map extrapolation methods for interference-limited observation. Conventional methods have drawbacks, such as modeling of the specific distribution, where the compensation of missing data is not considered. Motivated by these problems, we propose a novel extrapolation method for the radio map. In the proposed method, the empirical CDF is compensated by considering the number of missing data, which is an advantage compared to the other methods. The emulation results revealed that the proposed method could improve the mean error performance from 7 [dB] to 8 [dB] than related methods.