Introduction
As an increasing number of customer services rely on location to satisfy the needs of both users and network operators, Localization-as-a-Service (LaaS) is becoming increasingly vital for 5G and 6G networks [1]. LaaS is critical in enabling new location-based services such as autonomous robots and vehicles [2], smart education [3] or e-Health [4]. The 3GPP has set a target of achieving high localization accuracy for 5G networks, aiming for submeter accuracy in certain cases such as autonomous driving, where location accuracy below 10 cm is envisioned [5], and an accuracy of below 3 meters in most cases (both indoors and outdoors) [6]. Combining context-aware data from the Internet of Things (IoT) collected through WiFi networks with 5G information can enhance the accuracy, reliability, and scalability of localization services [7], [8].
Accurate location estimation has become increasingly important in recent years, and Global Navigation Satellite Systems (GNSS) are a common approach for achieving high accuracy in outdoor environments. However, signal blocking, attenuation, and multipath effects make GNSS ineffective indoors, where many applications are being developed. To address this, supplementary technologies like 5G, WiFi, or Ultra Wide Band (UWB) are often used to determine location [9], [10]. When energy constraints on user devices make it necessary to conserve battery and computation, the network can estimate the User Equipment (UE) location in a non-cooperative manner from data collected in the network infrastructure [11]. Some applications, such as beam management or automatic configuration of network parameters, may instead require terminals to transmit their location using specific protocols [12], which can be complex and energy-intensive. Network-based location [13] is an alternative that avoids this reporting, since the estimate relies only on data already collected by the network infrastructure.
Cellular networks like Long Term Evolution (LTE) are commonly used to locate users when GNSS is unavailable [14]. The most common approaches are location by proximity, ranging-based methods, Angle of Arrival (AoA) and fingerprinting. Location by proximity is the simplest method: it assumes that the location of the gNodeB (gNB) is the location of the UE, and it is used when high accuracy is not required [15]. Ranging-based methods, such as trilateration, use ranges obtained through measurements such as the Received Signal Strength Indicator (RSSI) or Time of Flight (ToF) [9] and can be very accurate if the ranges are precise. The location is determined by estimating the intersection of four spheres (or three circles in 2D location). Nevertheless, range estimates are not normally accurate, so the circles or hyperbolas used in the trilateration process may not intersect at a single point. To resolve this uncertainty, techniques such as Least Squares (LS) or Weighted Least Squares (WLS) are used [16]. AoA measures the angle at which the signal reaches the UE from the gNB. Multiple Input Multiple Output (MIMO) systems are capable of transmitting with beamforming, which can be used to implement the AoA approach [17]. Indoor environments pose reliability challenges for both range-based models and AoA because these models are susceptible to signal blocking and reflections. Although the received power might not follow a predetermined propagation model, it is observed to remain constant over time when the environmental conditions remain relatively stable. For instance, if the power measured in close proximity to a WiFi Access Point (AP) is unusually low because of an obstacle such as a wall, this power level will remain unchanged as long as the obstruction remains stationary.
As a result, each point in space is associated with a set of paired values comprising reference point (RP) identifiers and unvarying received power levels. This principle underlies the concept of fingerprinting: these paired values form a distinctive signature, commonly referred to as a fingerprint, which uniquely identifies each point in space [18].
Fingerprinting exhibits several primary drawbacks. It notably demonstrates high sensitivity to disparities between training and testing conditions arising from dynamic propagation attributes such as temperature, humidity, and obstacles [19], [20], [21]. Additionally, it mandates an extensive preliminary map construction phase, which necessitates thoroughness [22]. This is imperative because unrecorded data points remain unusable for positioning during the operational phase. Lastly, the integrity of the radio map is compromised due to device heterogeneity stemming from variations in orientation and chip sensitivity [23].
The long and comprehensive map-building phase can be alleviated by constructing maps through crowdsourcing data from various UEs or sensors. This method may sacrifice precision, but it is cost-effective. The system can use the measurements provided by these sources to create or update the radio map or the models of the localization system [24], [25]. Other studies have explored the reconstruction of incomplete maps. The authors of [26] addressed this issue by leveraging the linear nature of signal propagation. Their objective was to create new data by considering the context of the existing map, allowing for the application of techniques like fingerprinting. This approach is constrained by the granularity of the radio map division, which directly impacts the final accuracy of the system: a finer division increases precision, but it also increases the minimum number of required data points. Another avenue is the use of Deep Learning techniques to recover missing data points in incomplete maps. However, this approach has been observed to recover only up to 50% of the missing data [27].
Supplementary techniques such as combining ranging with AoA can enhance the final location estimation of a UE, resulting in a higher degree of accuracy [9], [28]. The fusion of multiple technologies helps to increase the density of RPs in the scenario, providing more information for the final estimation stage. This reduces the cost of infrastructure or expands the coverage area [29].
The contributions of this paper are listed as follows:
Implementation of fingerprinting and model-based algorithms utilizing real 5G and WiFi data.
Evaluation of the performance of positioning systems when fusing different technologies employing map- and model-based methodologies.
Examination of the behavior of the algorithms under different percentages of missing reference points during the training phase.
Proposal of model-based techniques for incomplete maps with more than 50% of missing reference points, while minimizing the degradation of the localization performance.
The rest of the paper is organized as follows. Section II explains both the fingerprinting algorithm and various DTR-based techniques. Section III provides an overview of the benefits of fusing technologies. Section IV describes the experimental setup and the scenario. Section V analyzes the results obtained from the data collection campaign and from the different location methods across several experiments. Finally, Section VI presents the conclusions of this work.
The acronyms used in this paper are listed in Table 1.
Location Techniques
In indoor environments, techniques like trilateration can be challenging because signal blocking and reflections can result in significant errors. ToF-based ranging estimation can reduce these errors, but it can be expensive due to its hardware requirements [9]. However, indoor environments typically offer multiple radio signals that can be measured and reported without hardware modifications. In cellular networks, UEs are required to measure all visible base stations and report this information to the serving base station to determine the best cell [30]. In stable indoor environments, the received power tends to remain constant, making radio map techniques particularly useful. This Section provides an overview of various techniques suitable for these types of scenarios.
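To make the sensitivity of trilateration to range errors concrete, the following minimal Python sketch solves a linearized (weighted) least-squares problem in 2D. It is only an illustration: the anchor coordinates, the 1 m bias, and the function name trilaterate are assumptions and are not taken from the measurement campaign.

```python
import math

def trilaterate(anchors, ranges, weights=None):
    """Linearized (weighted) least-squares 2D position from three or more anchors."""
    (x0, y0), d0 = anchors[0], ranges[0]
    if weights is None:
        weights = [1.0] * (len(anchors) - 1)  # plain LS when no weights are given
    # Subtracting the circle equation of anchor 0 from the others yields linear
    # rows ax * x + ay * y = b; accumulate the normal equations of that system.
    sxx = sxy = syy = bx = by = 0.0
    for (xi, yi), di, w in zip(anchors[1:], ranges[1:], weights):
        ax, ay = 2.0 * (xi - x0), 2.0 * (yi - y0)
        b = d0**2 - di**2 + xi**2 + yi**2 - x0**2 - y0**2
        sxx += w * ax * ax
        sxy += w * ax * ay
        syy += w * ay * ay
        bx += w * ax * b
        by += w * ay * b
    det = sxx * syy - sxy * sxy  # zero when all anchors are collinear
    return ((syy * bx - sxy * by) / det, (sxx * by - sxy * bx) / det)

# Hypothetical anchor positions in meters and a hypothetical true position.
anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.0, 3.0)]
truth = (1.0, 2.0)
exact = [math.hypot(truth[0] - x, truth[1] - y) for x, y in anchors]
clean = trilaterate(anchors, exact)

# Lengthen one range by 1 m, as a blocked direct path might do.
biased = list(exact)
biased[1] += 1.0
shifted = trilaterate(anchors, biased)
shift = math.hypot(shifted[0] - truth[0], shifted[1] - truth[1])
print(clean, shifted, round(shift, 2))
```

With exact ranges the estimate coincides with the true position, whereas a single biased range moves it by a large fraction of a meter in this toy geometry.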
A. Fingerprinting
Classical fingerprinting is a localization technique that involves generating a unique fingerprint of wireless signal strength and other characteristics of a particular location. This fingerprint can be later used to identify the location of a device. The process of creating a fingerprint involves measuring wireless signal characteristics at various points within an area, such as a building or campus. In static environments where changes are minimal, the received power at a specific point in space remains relatively constant over time. As a result, each point
Fingerprinting involves two distinct phases, as illustrated in Figure 1. The first phase, known as the offline or training phase, involves creating a radio map by assigning a unique fingerprint to each point on a regular grid. In the second phase, called the online or exploitation phase, the terminal measures the surrounding gNBs and generates a new vector
The level of accuracy in the fingerprinting method depends on several factors such as the size of the grid used during the training phase, the size of the input vector, the variance of the measured power for each component, and the accuracy of the UE’s measurements. If the input vector has fewer than
WiFi and cellular networks are commonly associated with fingerprinting due to the high density of stations in office and residential areas [18], [31]. Fingerprinting can provide high accuracy with a reduced infrastructure investment, but it requires the creation of a radio map with a complex training phase. To maintain location precision, the maps need to be updated when there are changes in the environment. Furthermore, the maps must be comprehensive, meaning that all points on the grid must be systematically measured in order to properly locate users at any point.
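The online phase can be sketched in a few lines of Python. This is a minimal sketch that assumes a nearest-neighbor match using the Euclidean distance in signal space; the distance metric, the function name locate, and the RSSI values are illustrative assumptions rather than the exact formulation used in this work.

```python
import math

def locate(radio_map, measurement):
    """Return the grid point whose stored fingerprint is closest to the measurement."""
    def signal_distance(fingerprint):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(fingerprint, measurement)))
    return min(radio_map, key=lambda position: signal_distance(radio_map[position]))

# Offline phase: one fingerprint (RSSI in dBm per station) per grid point.
# The values below are invented for illustration.
radio_map = {
    (0.0, 0.0): [-50.0, -70.0, -80.0],
    (0.8, 0.0): [-55.0, -65.0, -78.0],
    (0.0, 0.8): [-60.0, -72.0, -70.0],
}

# Online phase: the UE reports a new vector and is assigned to the closest grid point.
estimate = locate(radio_map, [-56.0, -66.0, -77.0])
print(estimate)
```

Because the estimate is always one of the stored grid points, any point that was not measured during the offline phase can never be returned, which is why the map must be comprehensive.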
B. DTR-Based Location
Fingerprinting has a major drawback in that it requires complete information maps about the environment. To address this issue, a commonly used approach is to employ ML algorithms to create an environment model [32], [33], [34], which can then be utilized for estimating the position during the exploitation phase. ML algorithms generate a comprehensive model of the scenario through the information provided in the training phase with certain RPs. Consequently, even without conducting measurements across the entirety of the scenario during the training phase, the ML model enables a localization service encompassing the entire designated area [35], [36]. In the context of a grid scenario when maps are incomplete, this study utilized DTRs, which are recognized for their simplicity and computational efficiency [37], to estimate the position. In addition, DTR-based methods were chosen over other ML algorithms because they offer the best precision for indoor localization [38].
By creating a set of hierarchical comparison rules that are applied sequentially, DTRs model the behavior of the localization system. The resulting path over a tree is determined by the outcome of each rule (branch), leading to a final node (leaf) that decides the output of the regressor as illustrated in Figure 2.
The DTR learning process comprises two phases: training and testing. In the training phase, 80% of the available samples are randomly selected to form the training dataset
In this work, we study different DTR-based algorithms, chosen for their high accuracy and low complexity: Random Forest (RF) and two Adaboost-based training algorithms, Decision Tree Adaboost (DTA) and Linear Tree Adaboost (LTA). In its final prediction of positions, DTA combines the outputs of different Weak Learners (WLs) by using a decision rule and taking their average [39]. LTA, on the other hand, creates an interpolation function from the different outputs of a set of decision rules [40], [41].
1) Random Forests
RFs are an ML technique that employs a collection of individual models (known as base models) to generate a final prediction. This ensemble method is versatile and can be applied to various ML tasks such as classification, regression, or localization. RFs are especially useful for localization tasks because they can effectively aggregate the predictions of multiple decision trees to determine the location of a device [42], [43].
RFs employ the bootstrapping method to generate decision trees: a random subset of the training data is selected to create a single decision tree. This process is repeated several times, leading to a large number of decision trees trained on various subsets of the data. To generate the final prediction of the localization process, the predictions of all the decision trees in the forest are averaged, as depicted in Figure 3.
The implementation of RFs is relatively straightforward, and their use of decision trees makes them computationally efficient. Additionally, RFs are resistant to data noise, since averaging all the location outputs mitigates the impact of any individual decision tree that might produce an inaccurate estimation.
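The bootstrapping and averaging steps described above can be sketched in plain Python. This is a simplified illustration and not the implementation used in this work: the tree depth, the synthetic grid, the log-distance RSSI model, and all function names are assumptions, and the random feature subsampling of a full RF is omitted for brevity.

```python
import math
import random

def mean_position(Y):
    return tuple(sum(c) / len(Y) for c in zip(*Y))

def sse(Y):
    """Sum of squared deviations of the positions in Y from their mean."""
    m = mean_position(Y)
    return sum((y[k] - m[k]) ** 2 for y in Y for k in range(len(m)))

def fit_tree(X, Y, depth=0, max_depth=4):
    """Small regression tree; each leaf stores the mean position of its samples."""
    if depth >= max_depth or len(X) < 2:
        return mean_position(Y)
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            cost = sse([Y[i] for i in left]) + sse([Y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None:  # no valid split, e.g. all samples are identical
        return mean_position(Y)
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": fit_tree([X[i] for i in left], [Y[i] for i in left], depth + 1, max_depth),
        "right": fit_tree([X[i] for i in right], [Y[i] for i in right], depth + 1, max_depth),
    }

def predict_tree(node, x):
    while isinstance(node, dict):
        node = node["left"] if x[node["feature"]] <= node["threshold"] else node["right"]
    return node

def fit_forest(X, Y, n_trees=50, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]  # bootstrap: draw with replacement
        forest.append(fit_tree([X[i] for i in idx], [Y[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Average the positions predicted by all trees."""
    return mean_position([predict_tree(tree, x) for tree in forest])

# Synthetic 5 x 5 grid (0.8 m spacing) with an assumed log-distance RSSI model.
stations = [(0.0, 0.0), (3.2, 0.0), (0.0, 3.2)]
positions = [(0.8 * i, 0.8 * j) for i in range(5) for j in range(5)]
rssi = [[-40.0 - 20.0 * math.log10(math.hypot(px - sx, py - sy) + 1.0)
         for sx, sy in stations] for px, py in positions]

forest = fit_forest(rssi, positions)
estimate = predict_forest(forest, rssi[12])
print(estimate, positions[12])
```

Since every leaf is a mean of training positions and the forest averages the leaves, the output is a continuous position inside the surveyed area rather than a grid point.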
Algorithm 1 explains with pseudocode the structure and formulation of the RF algorithm [44]. Given
During the testing phase of the RF algorithm, the input vector is
2) Adaptive Boosting (Adaboost)
Adaboost leverages the predictions of multiple individual models, known as Weak Learners (WLs), to arrive at a final prediction [45]. The WLs are generated through a process called boosting, which involves iteratively training the model on new subsets of the data, with each round emphasizing the data points that were incorrectly classified in the previous iteration. Figure 4 illustrates the method of combining the predictions of all the WLs in the ensemble to make the final prediction. Two Adaboost-based training algorithms are studied in this work: DTA and LTA. In the DTA method, the positions from various WLs associated with a decision rule are averaged in the final prediction [46], while in the LTA method, an interpolation function is developed between the different outputs within a set of decision rules [40], [41].
Adaboost is capable of adapting and learning from changes in data over time, making it crucial in dynamic environments where wireless characteristics are prone to variation. Although it achieves high accuracy, especially in LTA, a significant disadvantage of Adaboost is its reliance on extensive computational processing for the final estimation.
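The reweighting idea behind boosting can be illustrated with a short Python sketch. It assumes an AdaBoost.R2-style scheme (linear loss, weighted-median combination) with one-feature regression stumps as WLs and a single coordinate as target; these choices, the data, and the function names are assumptions for illustration and may differ from the formulation in [45] and from the averaging used by DTA.

```python
import math

def fit_stump(xs, ys, w):
    """Weighted regression stump on one feature: (threshold, left value, right value)."""
    def wmean(idx):
        return sum(w[i] * ys[i] for i in idx) / sum(w[i] for i in idx)
    best = None
    for t in sorted(set(xs))[:-1]:  # dropping the largest value keeps both sides non-empty
        left = [i for i, x in enumerate(xs) if x <= t]
        right = [i for i, x in enumerate(xs) if x > t]
        lv, rv = wmean(left), wmean(right)
        cost = sum(w[i] * (ys[i] - (lv if xs[i] <= t else rv)) ** 2 for i in range(len(xs)))
        if best is None or cost < best[0]:
            best = (cost, t, lv, rv)
    if best is None:  # all inputs identical: fall back to a constant learner
        m = wmean(range(len(xs)))
        return (xs[0], m, m)
    return best[1:]

def stump_predict(stump, x):
    t, lv, rv = stump
    return lv if x <= t else rv

def adaboost_r2(xs, ys, rounds=10):
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []  # pairs of (stump, confidence)
    for _ in range(rounds):
        stump = fit_stump(xs, ys, w)
        err = [abs(ys[i] - stump_predict(stump, xs[i])) for i in range(n)]
        max_err = max(err)
        if max_err == 0.0:  # perfect learner: keep it and stop
            ensemble.append((stump, 1.0))
            break
        loss = [e / max_err for e in err]
        avg = sum(w[i] * loss[i] for i in range(n))
        if avg >= 0.5:  # learner too weak: keep it only if it is the first one
            if not ensemble:
                ensemble.append((stump, 1.0))
            break
        beta = avg / (1.0 - avg)
        ensemble.append((stump, math.log(1.0 / beta)))
        # Well-predicted samples (small loss) are down-weighted the most,
        # so the next round concentrates on the poorly predicted ones.
        w = [w[i] * beta ** (1.0 - loss[i]) for i in range(n)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, x):
    """Weighted median of the WL outputs."""
    preds = sorted((stump_predict(s, x), a) for s, a in ensemble)
    half = 0.5 * sum(a for _, a in preds)
    acc = 0.0
    for value, a in preds:
        acc += a
        if acc >= half:
            return value

# Invented data: RSSI of one station (dBm) against one coordinate (m).
xs = [-40.0, -44.0, -48.0, -52.0, -56.0, -60.0, -64.0, -68.0]
ys = [0.0, 0.1, 0.0, 0.1, 3.0, 3.1, 3.0, 5.0]
ensemble = adaboost_r2(xs, ys)
estimate = predict(ensemble, -58.0)
print(len(ensemble), estimate)
```

Each additional WL requires another pass over the reweighted training set, which hints at why boosting-based estimators cost more computation than a single tree.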
The structure and formulation of the Adaboost regressor training, as described in [45], is explained through the pseudocode in Algorithm 2. As in RF algorithm, given
The estimator uses
Fusion of Wi-Fi/5G Technologies
As the demand for connectivity increases, the number of radio technologies available at a specific point in space has increased over time. This is especially true for indoor environments, where the demand for broadband is higher. Thus, technologies such as WiFi and cellular networks are commonly present in most indoor scenarios.
The fusion of 5G and WiFi can also enhance the user experience by providing seamless connectivity [47]. This is particularly important in indoor environments where users frequently move between different rooms and areas, each with varying signal strengths and qualities. With the integration of 5G and WiFi, the system can dynamically switch between the two technologies depending on the location and signal strength, ensuring a consistent and reliable connection.
Moreover, the fusion of these two technologies can also improve network efficiency and reduce costs [48]. With the increasing demand for high-speed connectivity, network operators are under pressure to provide faster and more reliable services. By utilizing both 5G and WiFi technologies, operators can optimize the use of available resources, thereby reducing network congestion and improving overall network performance. This can result in lower costs for both the network operator and the end-user [49].
In terms of localization, 5G and WiFi are two technologies that can be utilized to increase the coverage area, enhance the accuracy of the final location estimate through fusion in trilateration [29], or create denser areas for radio map creation. Furthermore, since both services are managed independently, they can act as backup options for each other in case one fails. Additionally, both technologies can offer unique services, such as wide spectrum service in case of 5G [50] or precise timestamp in trilateration for WiFi [51].
In this work, the fusion of 5G and WiFi is direct for all the localization algorithms: the system takes the data of both technologies as input. Fusion enables the system to expand the number of APs available for the radio map creation. Having a higher number of APs in the radio map allows the method to compare the context of the UE more thoroughly for the final location estimation. Moreover, a denser radio map reduces the impact of losing a single gNB or AP.
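A minimal Python sketch of this direct fusion is given below. The station identifiers, the RSSI values, and the -120 dBm placeholder for stations that were not heard are assumptions introduced for illustration; the text does not specify how missing readings are encoded.

```python
FLOOR_DBM = -120.0  # assumed placeholder for a station that was not detected

def fuse(nr_rssi, wifi_rssi, station_order):
    """Build one input vector from 5G and WiFi RSSI readings, in a fixed station order."""
    merged = {**nr_rssi, **wifi_rssi}
    return [merged.get(station, FLOOR_DBM) for station in station_order]

# Hypothetical identifiers for the three gNBs and the three WiFi APs.
stations = ["gNB1", "gNB2", "gNB3", "AP1", "AP2", "AP3"]
nr_sample = {"gNB1": -71.0, "gNB3": -85.0}  # gNB2 not detected in this sample
wifi_sample = {"AP1": -48.0, "AP2": -63.0, "AP3": -59.0}

vector = fuse(nr_sample, wifi_sample, stations)
print(vector)
```

The fused vector has one component per station of either technology, so the same fingerprinting or DTR-based estimator can consume 5G-only, WiFi-only, or fused inputs simply by changing the station list.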
Experimental Setup
This Section presents the configuration used to obtain real 5G and WiFi data at the University of Malaga. The 5G network belongs to the University of Malaga and contains three indoor base stations, which have been configured to reduce interference with commercial networks. The base stations are located at two different heights (2.5 m and 3.5 m), and a map of the scenario is shown in Figure 5. The three WiFi APs are Google WiFi mesh routers placed on shelves at a height of 2 meters in order to ensure the visibility of all APs in the entire map. Measurements were taken at ground truth points represented by orange dots and green dots. The scenario includes three laboratories and one hall with metallic elements that can cause signal blocking, attenuation, and multipath effects. The 5G gNBs are placed on the ceiling to provide good visibility and transmit at a power of 20 dBm at a frequency of 3774.990 MHz. Measurements were taken systematically over a grid of points marked on the floor, as illustrated in Figure 5. Samples were taken 0.8 meters apart to cover the entire accessible area of the scenario.
The target UE is a Motorola Edge 20 running Android 11. An application was developed to capture the RSSI of the serving and neighbor cells together with the WiFi AP information. The captured data is sent to a server over 5G, where the measurement samples are saved in a MySQL database for further processing. The application also allows the user to indicate the ground truth and send it along with the measurements.
Results
In this section, we present the localization results obtained from three different experiments. All experiments used a dataset of over 500 samples. The data was randomly split into a training set and a testing set (represented in Figure 5 as orange and green dots, respectively), with 20% of the measuring points allocated for testing. Following the Monte Carlo method, this process was repeated a thousand times, with the training and testing points randomly chosen on each iteration, to produce statistically reliable results. Figure 5 represents one example of this process.
A. Evaluation of Different Methods
This experiment evaluated the performance of four localization techniques: fingerprinting, DTA, LTA, and RF. The DTA and LTA methods were trained with 50 WLs as suggested in [52], and the number of trees in RF was set to 50 for a fair comparison with Adaboost. In this experiment, the methods were evaluated using 5G technology only. For fingerprinting, the training data was used to construct a radio map; for the rest of the methods, it was used to construct the trees or the WLs. The testing data was used to measure the precision of the different localization methods, with the results represented by the Cumulative Distribution Function (CDF) of the horizontal error in Figure 6. The 95th percentile (horizontal pink line) has been selected as the basis for the location accuracy standard [53].
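The evaluation protocol (repeated random 80/20 splits, horizontal error, and 95th percentile) can be sketched in Python as follows. The synthetic grid, the log-distance RSSI model, the nearest-fingerprint estimator, and the reduced number of iterations are assumptions chosen to keep the example small; they do not reproduce the measured dataset.

```python
import math
import random

def horizontal_error(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = -(-pct * len(ordered) // 100)  # ceil(pct * n / 100) in integer arithmetic
    return ordered[rank - 1]

def monte_carlo(samples, fit, iterations=1000, test_pct=20, seed=0):
    """Pool the test errors of repeated random train/test splits."""
    rng = random.Random(seed)
    n_test = len(samples) * test_pct // 100
    errors = []
    for _ in range(iterations):
        shuffled = samples[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        predict = fit(train)
        errors += [horizontal_error(predict(rssi), pos) for rssi, pos in test]
    return errors

def fit_nearest(train):
    """Fingerprinting-like estimator: return the position of the closest stored vector."""
    def predict(rssi):
        return min(train, key=lambda s: sum((a - b) ** 2 for a, b in zip(s[0], rssi)))[1]
    return predict

# Synthetic samples on a 5 x 5 grid with 0.8 m spacing (assumed propagation model).
stations = [(0.0, 0.0), (3.2, 0.0), (0.0, 3.2)]
samples = []
for i in range(5):
    for j in range(5):
        pos = (0.8 * i, 0.8 * j)
        rssi = [-40.0 - 20.0 * math.log10(math.hypot(pos[0] - sx, pos[1] - sy) + 1.0)
                for sx, sy in stations]
        samples.append((rssi, pos))

errors = monte_carlo(samples, fit_nearest, iterations=20)
p95 = percentile(errors, 95)
print(len(errors), round(p95, 2))
```

With one sample per grid point, the nearest-fingerprint estimator can only return training positions, so every test error is at least one grid step; this is the discrete, staircase-like error behavior expected from lattice-based estimators.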
Fingerprinting (red) estimates the position of the UE on the radio map by identifying the closest point. The radio map is divided into a lattice, and fingerprinting determines the location of the UE within this lattice. DTA (blue) calculates the average of the outputs of the different WLs. When the radio map is regular, the averaged result is always a cell of the lattice. Notably, all measurements are acquired at the center of these cells. Thus, both fingerprinting and DTA provide a lattice-based location, which translates into a discrete error and a staggered CDF. RF (yellow) performs better than DTA, as it provides good performance on large and complex datasets and averages the linear estimation of the different trees. In contrast, DTA averages the final estimation based on WLs. LTA (green) is the top performer because it generates an interpolation function of the different WL outputs, which significantly improves the final localization accuracy at the expense of computational efficiency.
Figure 6 clearly shows that ML methods significantly reduce errors compared to the fingerprinting technique. The accuracy of fingerprinting relies heavily on the radio map, as it compares the stored fingerprints directly with the signals received from the UE. RF and LTA enhance the final location estimation compared to DTA because the final position is derived from a linear function. While LTA provides higher precision in positioning, its long computation time makes it unfeasible for real-time applications. RF offers a trade-off between accuracy and computational efficiency, making it suitable for real-time applications.
B. Evaluation of Fusing Technologies
In this experiment, the behavior of the different algorithms is evaluated in three cases: 5G in isolation, WiFi in isolation, and the fusion of both. The goal was to determine whether fusing different technologies enhances the precision of the localization system for fingerprinting and DTR-based methods.
Although 5G promises high-precision positioning through the multi-RTT protocol, to the best of the authors' knowledge, this protocol has not been implemented in any commercial device yet. WiFi, on the other hand, already has the 802.11mc protocol, which offers accurate ranging estimation through the RTT protocol and can achieve meter-level accuracy [9], [51]. This protocol is implemented in many smartphones, but only a limited number of APs have adopted it [54]. Until the RTT protocol is integrated into 5G, using ML techniques and fingerprinting along with RSSI measurements can be highly beneficial. Additionally, capturing RSSI measurements does not increase the energy consumption of the terminal or demand special hardware [9].
Figure 7 represents the different cases of fingerprinting, DTA, LTA and RF with only 5G NR (solid line), only WiFi (dashed line) and the fusion of 5G and WiFi (dotted line). As can be observed, fusion improves the performance of the system in all cases. Combining different technologies increases the number of APs in the scenario, which improves the final estimation because more information about the environment is available. Thus, the more complete the radio map, the better the localization resolution. Among the techniques, fingerprinting yields the greatest improvement, as it is the most dependent on the radio map. Although LTA has a slightly better overall performance, RF still provides a comparable level of accuracy and allows real-time location-based services.
Figure 7. Cumulative Distribution Function of the error of 5G, WiFi and fusion for different methods.
Table 2 presents the performance of the different algorithms for 5G, WiFi and fusion data, characterized by metrics including the mean (
C. Robustness of the Methods with Different Levels of Incomplete Maps
In this experiment, the robustness of the different methods is evaluated with varying degrees of missing RPs in incomplete maps. The fusion of 5G and WiFi is used as the input data because it has been shown to improve the localization results in all cases. The goal was to determine the robustness of the techniques by examining how well they performed as the percentage of missing data in the radio map increased. This experiment consists of reducing the number of training points. To do this, the percentage of testing points was kept constant at 20% while the percentage of discarded data varied from 0% to 60%, as shown in Figure 8.
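The construction of an incomplete map can be sketched in Python as follows. The sketch assumes that the discarded percentage refers to the whole set of measuring points, so that 60% of discarded data leaves 20% for training; this interpretation, the function name, and the seed are assumptions for illustration.

```python
import random

def split_incomplete(points, test_pct=20, discard_pct=0, seed=None):
    """Randomly split points into training, testing and discarded subsets."""
    rng = random.Random(seed)
    shuffled = list(points)
    rng.shuffle(shuffled)
    n_test = len(shuffled) * test_pct // 100
    n_discard = len(shuffled) * discard_pct // 100
    test = shuffled[:n_test]
    discarded = shuffled[n_test:n_test + n_discard]
    train = shuffled[n_test + n_discard:]  # what remains builds the map or the model
    return train, test, discarded

points = list(range(100))  # stand-in identifiers for 100 measuring points
train, test, discarded = split_incomplete(points, discard_pct=40, seed=1)
print(len(train), len(test), len(discarded))
```

Keeping the testing share fixed while the discarded share grows ensures that all methods are scored on the same number of points, so that any change in error can be attributed to the shrinking training set.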
The results of the experiment are represented by the CDFs of the horizontal error for different percentages of discarded data (0%, 20%, 40% and 60%). As can be observed in Figure 9, except for DTA with 60% of discarded data, all the DTR-based algorithms outperform fingerprinting without discarded data in terms of user localization accuracy. Therefore, DTR-based algorithms make it possible to achieve higher levels of accuracy even with maps that contain fewer data points.
Figure 9. Cumulative Distribution Function of the error of the different techniques with different percentages of discarded data.
Table 3 provides a brief summary of the different methods.
Conclusion
Indoor positioning has become an increasingly important technology in recent years, as it enables a wide range of applications such as indoor navigation, asset tracking, and location-based services. However, the traditional method of using radio maps for indoor positioning has several significant drawbacks. One of the most significant is the complex training process. Creating a radio map involves collecting and analyzing a large amount of data from a given indoor environment in order to build a map of the radio signal strength at each location in the area. This process can be time-consuming and expensive, which limits its applicability in scenarios where a large area must be covered. Another major issue is that the fingerprints of the indoor environment can change over time due to changes in the scenario. These changes affect the accuracy of the radio map, which then requires frequent updates to remain effective. This retraining can be very costly, both in terms of time and resources.
In this work, we have implemented and compared fingerprinting, DTA, LTA and RF techniques with real 5G and WiFi data. First, DTR-based methods noticeably improve the localization performance compared with regular fingerprinting. Notably, LTA and RF performed better than DTA and fingerprinting because the final location is based on the interpolation between points.
Fusion of technologies has also proven to improve the performance of the system. By combining 5G and WiFi, the number of APs in the scenario increases. This implies that, during both the training phase and the operational phase, the localization algorithms are provided with more information about the environment, which in turn improves the final estimation. Furthermore, fusion has the potential to enhance connectivity, extend coverage, optimize resources for location-based services and minimize the expenses associated with deployment and infrastructure.
Regarding the robustness of the different methods, LTA and RF maintain a stable error even when the percentage of missing RPs in incomplete maps becomes significant, up to 40% of missing data. This robustness makes it possible to cover larger areas, minimize the need for frequent retraining, or decrease the number of data points required on each map. Depending on the service being offered, DTR-based models, specifically LTA and RF, can be highly valuable tools for indoor positioning. In these experiments, LTA yields better results than RF, but it is not suitable for real-time applications. RF, in contrast, provides a balance between accuracy and computational efficiency, making it ideal for real-time services. Although RF cannot adapt to environmental changes, DTA readjusts to them, albeit with decreased localization accuracy. As a result, DTR-based models have proven more applicable than fingerprinting. However, there is no single method that is universally best for location-based services, as the best choice depends on the specific application and scenario.