LoRaWAN: Lost for Localization?

Nowadays, a flexible localization solution for various devices for workplace safety is one of the most demanding research questions. Notably, it is expected to provide an acceptable level of precision in different types of environments empowered by wearable technology and Internet-of-Things (IoT) devices. Existing leading localization technologies are adapted to certain conditions: for example, Wi-Fi, Bluetooth low energy (BLE), and ultra-wideband (UWB) are used for indoor areas, and various global navigation satellite system (GNSS)-based ones are used outdoors. This work focuses on investigating the long-range wide-area network (LoRaWAN) (868-MHz band) as a potential candidate to bridge this gap, being one of the most reliable and recognized communication technologies for the Industrial IoT (IIoT). In the past, the research community was highly critical of the applicability of LoRaWAN for localization, but this view has changed considerably over the past two years. The purpose of this work is to assess the feasibility of LoRaWAN as a localization solution for work safety applications in the industrial scenario from different angles. The work is based on two measurement campaigns conducted at the Brno University of Technology (BUT), Brno, Czech Republic, and the University Politechnica of Bucharest (UPB), Bucharest, Romania. The campaigns cover both indoor and outdoor scenarios and reveal the practical limitations of positioning in standalone and $k$-nearest neighbors ($k$-NN) powered localization systems. According to the results, LoRaWAN-based localization with a relatively dense gateway (GW) deployment achieves meter-level accuracy, which may be suitable for the localization of workers.

of the top research questions when talking about industries, especially hazardous and hard-to-reach worksites with heterogeneous environment [1], [2]. Existing flagship solutions applied for both indoor [e.g., Wi-Fi, Bluetooth low energy (BLE), ultra-wideband (UWB)], and outdoor [e.g., Global Navigation Satellite System (GNSS)] localization are not flexible enough for such cases since they are focused primarily on one environment [3], [4].
At the same time, numerous real-life scenarios that demand both indoor and outdoor location detection, for example, construction and logistics worksites [5], [6], [7], would benefit from a flexible solution that reduces deployment and compatibility costs. Thus, a hybrid localization technology that meets the requirements of wearable devices (power consumption, dimensions, computational complexity, etc.) and at the same time handles different types of environments is an active research area.
Long-range wide-area network (LoRaWAN) technology, a verified communication solution for the IoT world that is widely deployed in many industrial scenarios, could be a potential candidate to fill this research gap [8], [9]. Naturally, it has several characteristics that are attractive for the flexible, outdoor-indoor localization of wearable devices.
Sub-GHz band: The LoRaWAN technology under consideration operates at a lower frequency than the current leaders in indoor localization, such as Wi-Fi, BLE, and UWB. First, this decreases the probability of interference (compared to technologies operating in the 2.4-GHz band) and, second, it provides better penetration ability, that is, higher resilience to the scatter-rich indoor environment [10], [11]. The better penetration ability plays an important role in localization in underground/hazardous environments.
LoRa modulation: Chirp Spread Spectrum (CSS) modulation entails a higher resistance to multipath propagation [12]. In conjunction with the better penetration ability, this makes LoRaWAN technology more stable in the indoor environment.
Low energy consumption: For wearable devices, energy consumption is one of the most crucial points. LoRaWAN is a low-power solution developed specifically for IoT devices, whereas two of the most popular solutions for outdoor and indoor environments, GNSS and Wi-Fi, respectively, are not the best choice for wearable devices from this perspective.
Moreover, LoRaWAN technology is considered a cost-effective solution: due to the low cost of the module, the infrastructure scales easily, which allows communication across large worksites such as construction sites.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
More and more papers are now being published on the use of LoRaWAN as a leading communication technology for wearable applications that help keep workers safe [13], [14]. Notably, the research community does not postulate that a localization system would work solely based on LoRaWAN, but rather that it would be supplemented by LoRaWAN when necessary, since the state of the art shows that the localization accuracy obtained with LoRaWAN is quite low (to be discussed in Section II). If it were possible to improve precision using supplementary data preprocessing and Machine Learning (ML) algorithms, we would have a low-cost, low-power, wide-range solution ensuring high positioning accuracy for the IIoT.
This work investigates the following questions.
RQ1: What is the LoRaWAN localization accuracy for outdoor/indoor above-ground/underground scenarios?
RQ2: Which localization approach is more suitable for estimating the LoRaWAN-based accuracy?
RQ3: How does LoRaWAN-based localization accuracy depend on the number of gateways (GWs) used?
RQ4: Is it possible to use LoRaWAN for localization performed by industrial wearables?
One of the core contributions of this work is two open-access datasets collected during two measurement campaigns conducted in Brno, Czech Republic, and Bucharest, Romania, to address the above questions [15]. This article presents LoRaWAN datasets for different environments, calculates the localization accuracy, and reviews its dependency on the amount of equipment and the ML algorithms used; it does not aim to discuss the possibility of increasing precision by exploiting the specifics of the LoRaWAN PHY layer.
The rest of the article is organized as follows. Section II highlights the current state of affairs in the field under study and specifies the relevance of this work. Section III describes the measurement campaign procedure: equipment used, order, characteristics, and possible complications. Section IV describes the localization approach used to estimate the localization accuracy in this work. Next, Sections V and VI provide detailed results of the measurement campaigns conducted at the Brno University of Technology (BUT), Brno, Czech Republic, and the University Politechnica of Bucharest (UPB), Bucharest, Romania, respectively: parameters, statistical analysis, and mean errors. Furthermore, Section VII summarizes the dependence of localization accuracy on the number of GWs. Finally, Section VIII discusses the questions stated for this article and also contains the conclusion and future work.

II. RELATED WORK
LoRaWAN technology is a proven communication solution for the IoT world. It is extensively used for smart city applications [16], [17], smart homes [18], and different monitoring tasks such as health, agriculture, traffic, and well-being [19]. Recently, LoRa was even named as a possible solution for future lunar communications [20]. However, its application as a localization solution is still in its infancy: the literature offers only a few works [21], [22] and open datasets [23], [24] in this research direction.
One of the most well-known works, by Aernouts et al. [23], investigated an outdoor fingerprinting dataset collected using LoRaWAN in Antwerp with 68 GWs. The mean error achieved is 398.4 m using k-nearest neighbors (k-NN) with k = 11. The thesis work [25] applied to this dataset an approach based on Artificial Neural Networks (ANNs) with a Multilayer Perceptron (MLP) architecture. The obtained results showed a mean error of 381.8 m. Published in the same year, the paper [26] tested different types of fingerprinting methods on this dataset and improved this result by 41 m using a similar neural network approach. One of the most recent works [27] presents a comprehensive review of Received Signal Strength (RSS)-based localization approaches that can be used in the case of LoRaWAN and compares range-based and fingerprint-based localization for the Aernouts dataset, reporting mean errors of 700 and 340 m, respectively. Another recent work [28] achieved a mean error of 322.6 m with the same dataset by using not just RSS but also timestamps to prepare the data before applying the k-NN algorithm with a Random Forest Regressor (RFR).
Moreover, some works performed measurement campaigns to create fingerprinting datasets for estimating LoRaWAN-based localization accuracy in relatively small outdoor regions [21], [29]. Despite incremental improvements in precision from article to article, the reported accuracy remains low. For example, one of the first works [30] explored TDOA multilateration for an area of 2 × 3 km with four GWs and achieved an accuracy of around 100 m. Choi et al. [29] conducted measurements in an open 340 × 340 m outdoor area using four GWs. They applied the fingerprinting approach to interpolated RSS Indicator (RSSI) maps and obtained a smallest mean error of 24.1 m. The literature offers fewer published works when it comes to the indoor environment. This could be explained by the fact that one of the main advantages of LoRaWAN is long-range coverage, which automatically associates it with the outdoor environment. The initial idea behind the technology was to use fewer GWs to cover large zones. To perform indoor localization, however, we need to forgo this benefit and use several GWs for relatively small areas.
However, some researchers see the potential of LoRaWAN-based localization in the indoor environment and actively explore this field, presenting promising results.
One of the most well-known papers [31] presents a comparative study of several technologies, namely Wi-Fi, BLE, ZigBee, and LoRaWAN, for indoor localization. The experiment included three GWs and one measurement point (MP); trilateration was chosen as the localization approach. LoRaWAN showed a 2.7-m mean error for a 5-m distance between the GWs, slightly less than other scenarios. However, the authors pointed out other undeniable advantages of LoRaWAN already mentioned here, long-range coverage and better penetration ability, which can bring it forward in real scenarios. The work [32] investigated LoRaWAN-based localization for both indoor and outdoor environments using RSSI fingerprinting (metric: Euclidean distance). The best reported accuracy for the indoor environment is 4.55 m. Zhu et al. [33] presented a mean localization error of sub-10 m for an open-space indoor area of 50 × 100 m by building a less vulnerable fingerprinting map based on extreme RSSI (ERSSI) and proposing Boundary Autocorrelation (BAE) for comparing online data with the stored data. One of the most recent works [12] reviews LoRaWAN for smart home localization and declares a precision of 1.6 m in the Line-of-Sight (LoS) case and 3.1 m in the non-LoS (NLoS) case.
The promising results presented in Table I show that the application of LoRaWAN technology as a localization solution is a very hot topic for research activities. They give a good basis for this work, which focuses more on the multienvironment and the question of the possibility to perform seamless localization using LoRaWAN. The current paper considers the investigation of LoRaWAN-based localization for industrial wearables for work safety that envisages a nonideal changing scenario. The main contribution of this work lies in testing the flexibility of technology to switch between different types of environments: indoor, above-ground/underground, and outdoor. In addition, this work explores the dependency of localization accuracy on the number of GWs used in the measurement campaign and attempts to set the optimal number of GWs for particular scenarios.

III. MEASUREMENT CAMPAIGN PROCEDURE AND LIMITATIONS
To address the questions stated in the introduction and to have more consistent results, two similar measurement campaigns were organized in BUT and UPB. Several scenarios were explored to cover the most typical environments for the localization task: indoor above-ground (iAG), outdoor, and indoor underground (iUG). Data were collected using different spreading factors (SFs). The considered scenarios and their parameters are presented in Table II. We would like to highlight that by "environment," this article means indoor above-ground, indoor underground, and outdoor. "Scenario" corresponds to different parameters of the measurement campaign with respect to GW setup, measurement place, and the system of coordinates. In both campaigns, we used the following equipment.
LoRaWAN GW LG308: Guidance and configuration rules can be found in [37]. Due to specific circumstances defined by the environment and the availability of the GWs, we used from six to nine units in different scenarios. For ease of identification, we assign an odd serial number to each of them.
Routers: The routers enable connection of the GWs. Similar to the number of GWs, routers vary from case to case.
Field test devices (FTDs): Two devices capable of transmitting and receiving data using the LoRaWAN protocol and equipped with a display that allows information and parameters to be checked instantly [38].
The measurement procedure comprises two stages as follows.
1. Offline: This stage includes all the required preparations that are supposed to be done before the actual practical activity. The first step is creating the measurement map. In both cases, it was decided to establish a separate coordinate system to exclude the problem of identifying real positions in an indoor environment. The maps are supposed to include the locations of the MPs and the equipment (for localization methods based on the locations of the GWs). When creating a measurement map, the following questions should be considered.
a) Spacing: A compromise between labor and time consumption, on the one hand, and localization accuracy, on the other (the higher the signal map density, the higher the accuracy that can be expected).
b) Equipment: The GWs utilized in these measurement campaigns require a connection to the Internet. Consequently, when planning the campaign, one needs to consider the availability of access points (APs) and sockets near the proposed equipment deployment. Thus, one should decide how many GWs to use (a compromise between availability/installability and precision: the more GWs, the higher the accuracy that can be expected) and how to distribute them (symmetry avoidance, possibilities to plug in and to connect to the Internet, and remoteness from the measurement area).
Furthermore, to ensure that the measurements are taken in the proper places defined in the measurement map and to ensure reproducibility, it is necessary to mark the locations of the MPs on the floor/ground with placemarks. Next comes the equipment setup (the guidance can be found in [37]): after ensuring the connection of the GWs to the Internet, it is necessary to check the connection of the equipment (GWs and FTDs) to the server (The Things Stack (TTS) [39]) and to provide for the transfer of the data to the place where it can be stored.
TTS supports many integrations (e.g., Message Queuing Telemetry Transport (MQTT) and storage integration), which allow a database to be organized for subsequent data storage. We stored the data in a unique request bin using the Webhooks integration.
2. Online: This stage comprises the actual data gathering: at each point, the person conducting the measurement uses a test device to send uplink (UL) messages to be received by the predeployed GWs. In all cases, we sent three UL messages at each MP to obtain more reliable results.
The preparation and implementation of the campaigns described in this article were carried out in accordance with the steps presented above. Depending on the environmental conditions and the time available, the amount of equipment involved and the SFs studied may vary.

A. Localization Methods Based on the GW Locations
Trilateration: It is a basic localization approach that finds the location based on the relation between the RSSI of the received signal and the distance d it traveled during propagation [31], [40], [41]. In addition to requiring the coordinates of the GWs, this method is also limited by their number: more than two GWs are needed.
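As an illustration only, the RSSI-to-distance step and the position estimate can be sketched as follows. The sketch assumes a log-distance path-loss model and a linearized least-squares solver, neither of which is specified in the text; the reference RSSI at 1 m (`rssi_at_1m`), the GW layout, and all function names are hypothetical, and the gradient of 4 is simply the value quoted later for the BUT indoor scenario.

```python
import numpy as np


def rssi_to_distance(rssi_dbm, rssi_at_1m=-40.0, gradient=4.0):
    """Invert the assumed model RSSI(d) = RSSI(1 m) - 10 * n * log10(d)."""
    rssi_dbm = np.asarray(rssi_dbm, dtype=float)
    return 10.0 ** ((rssi_at_1m - rssi_dbm) / (10.0 * gradient))


def trilaterate(gw_xy, distances):
    """Least-squares position from at least three GW positions and ranges."""
    gw_xy = np.asarray(gw_xy, dtype=float)
    d = np.asarray(distances, dtype=float)
    if gw_xy.shape[0] < 3:
        raise ValueError("trilateration needs at least three GWs")
    # Subtracting the last GW's circle equation from the others removes
    # the quadratic terms and leaves a linear system A @ [x, y] = b.
    ref, d_ref = gw_xy[-1], d[-1]
    A = 2.0 * (ref - gw_xy[:-1])
    b = (d[:-1] ** 2 - d_ref ** 2
         - np.sum(gw_xy[:-1] ** 2, axis=1) + np.sum(ref ** 2))
    position, *_ = np.linalg.lstsq(A, b, rcond=None)
    return position


# Hypothetical, noise-free example: four GWs and one MP at (3, 4).
gws = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_xy = np.array([3.0, 4.0])
true_d = np.linalg.norm(gws - true_xy, axis=1)
rssi = -40.0 - 10.0 * 4.0 * np.log10(true_d)
estimate = trilaterate(gws, rssi_to_distance(rssi))
```

With real indoor RSSI, the inverted distances are strongly distorted by reflections and obstacles, which is consistent with the tens-of-meters errors reported for trilateration in Section V.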
Weighted centroid algorithm (WCA): The essence of the algorithm is to calculate the center of gravity of the figure formed by the GWs that received the uplink message, based on RSSI weights that determine the significance of each GW. In this work, WCA is applied according to the formulas that can be found in [42].
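A minimal sketch of the idea is given below. The exact weighting used in this work follows [42] and is not reproduced here, so the choice of linear received power (dBm converted to mW) as the weight is an assumption made purely for illustration, as are the function name and the GW layout.

```python
import numpy as np


def weighted_centroid(gw_xy, rssi_dbm):
    """RSSI-weighted center of gravity of the GWs that received the message."""
    gw_xy = np.asarray(gw_xy, dtype=float)
    rssi_dbm = np.asarray(rssi_dbm, dtype=float)
    # Assumed weighting: received power in mW, so stronger GWs dominate.
    weights = 10.0 ** (rssi_dbm / 10.0)
    return weights @ gw_xy / weights.sum()


# Hypothetical layout with three GWs.
gws = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
equal = weighted_centroid(gws, [-80.0, -80.0, -80.0])    # plain centroid
pulled = weighted_centroid(gws, [-60.0, -90.0, -90.0])   # drawn toward GW 0
```

By construction the estimate always lies inside the convex hull of the receiving GWs, which limits the accuracy for MPs near or beyond the edge of the deployment.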

B. Machine-Learning Algorithms
ML tasks can be divided into classification and regression types. We treat our task as a regression task since we are trying to predict a unique position using features, that is, the RSSI rows received by the various GWs. To investigate the possibility of increasing the accuracy of LoRaWAN-based localization by employing ML, we selected, based on the literature review, the regression algorithms most commonly used for localization.
k-Nearest Neighbors Algorithm: k-NN is one of the simplest ML algorithms; it estimates the location of the MP based on the coordinates of the k closest points [23], [43], [44]. To estimate proximity, this work uses the Euclidean distance.

k-Nearest Neighbors Algorithm With Weighted Centroid (k-NN-W): The closer the neighbor location is to the estimated point, the bigger its weight [45].
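The two k-NN variants can be sketched with scikit-learn as shown below. This is an illustrative setup rather than the configuration used in this work: the fingerprints are synthetic (log-distance model with Gaussian noise), the GW layout is hypothetical, and inverse-distance weighting (`weights="distance"`) is assumed as a stand-in for the weighting of [45].

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic fingerprints: a 1-m grid of MPs and four hypothetical GWs.
rng = np.random.default_rng(0)
gws = np.array([[-1.0, -1.0], [11.0, -1.0], [-1.0, 21.0], [11.0, 21.0]])
xs, ys = np.meshgrid(np.arange(0.0, 11.0), np.arange(0.0, 21.0))
mps = np.column_stack([xs.ravel(), ys.ravel()])
dist = np.linalg.norm(mps[:, None, :] - gws[None, :, :], axis=2)
rssi = -40.0 - 40.0 * np.log10(dist) + rng.normal(0.0, 0.5, dist.shape)

# 80%/20% split of (RSSI row, coordinates) pairs.
X_train, X_test, y_train, y_test = train_test_split(
    rssi, mps, test_size=0.2, random_state=0)


def mean_error(weights, k=2):
    """Mean Euclidean position error of a k-NN regressor on the test MPs."""
    model = KNeighborsRegressor(n_neighbors=k, weights=weights,
                                metric="euclidean")
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    return float(np.mean(np.linalg.norm(pred - y_test, axis=1)))


err_knn = mean_error("uniform")     # plain k-NN: neighbors averaged equally
err_knn_w = mean_error("distance")  # k-NN-W: closer fingerprints weigh more
```

Fitting both coordinates in one multi-output regressor gives the same k-NN prediction as fitting one model per coordinate, because the neighbors depend only on the RSSI features.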
Decision Tree Regression: It envisages splitting a dataset into several classes (target values) based on different conditions. When the target value is discrete, we are dealing with a classification problem; when it is continuous, with a regression problem [44], [46].
Random Forest Regression: It is based on the use of an ensemble of decision trees built using randomly selected training samples (bagging). The result is determined by calculating the mean value of all individual predictions [44], [47].
Linear Regression (LR): It tries to find a linear function that describes the training data in the best way [9], [44], [47].
Support Vector Regression (SVR): It tries to find a tube that best describes the training data while balancing model complexity and prediction error [48]. It is considered a superior approach to LR since it can handle nonlinearity through different kernels [47], [49], [50].
The LoRaWAN-based localization accuracy was estimated using the Python libraries sklearn, keras, and pandas. By default, the data were divided into training and test samples in the proportion of 80%-20%, respectively. For all cases, the models for longitude and latitude were trained separately. Before analysis, the data were preprocessed: outliers and missing values were replaced with a value lower than the receiver sensitivity (−140 dBm). To exclude overfitting due to the redundancy of the data caused by multiple messages sent from each MP, several strategies were tested: selecting a random reading out of three for each location, selecting the reading with the highest average RSSI, and averaging the readings.
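The preprocessing described above can be sketched with pandas as follows. The column names (`mp_id`, `x`, `y`, `gw_7`, `gw_9`) and the toy readings are hypothetical, only missing values are handled because the outlier criterion is not stated in the text, and filling before averaging is an assumption about the order of operations.

```python
import numpy as np
import pandas as pd

FLOOR_DBM = -140.0  # placeholder below the receiver sensitivity (from the text)

# Hypothetical raw table: three readings per MP, NaN where a GW missed the UL.
raw = pd.DataFrame({
    "mp_id": [1, 1, 1, 2, 2, 2],
    "x": [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
    "y": [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    "gw_7": [-80.0, -82.0, np.nan, -90.0, -88.0, -91.0],
    "gw_9": [-95.0, np.nan, -97.0, -85.0, -84.0, np.nan],
})
gw_cols = ["gw_7", "gw_9"]

# Step 1: replace missing RSSI values with the floor value.
clean = raw.copy()
clean[gw_cols] = clean[gw_cols].fillna(FLOOR_DBM)

# Strategy A: keep, per MP, the reading with the highest mean RSSI.
clean["mean_rssi"] = clean[gw_cols].mean(axis=1)
best = clean.loc[clean.groupby("mp_id")["mean_rssi"].idxmax()]
best = best.drop(columns="mean_rssi")

# Strategy B: average the three readings of each MP.
averaged = clean.groupby("mp_id")[["x", "y"] + gw_cols].mean()

# Strategy C: pick one random reading per MP (seeded for repeatability).
random_pick = clean.groupby("mp_id").sample(n=1, random_state=0)
```

Each strategy leaves exactly one row per MP, so the subsequent 80%-20% split cannot place readings of the same location in both the training and the test set.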
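The localization accuracy for the different algorithms and scenarios is compared based on the Root Mean Square Error (RMSE)
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[(x_i-\hat{x}_i)^2+(y_i-\hat{y}_i)^2\right]}$$
where $x$ and $y$ are the real coordinates, $\hat{x}$ and $\hat{y}$ are the estimated coordinates, and $N$ is the number of MPs to be estimated.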
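A minimal sketch of this metric is shown below, assuming the RMSE is taken over the two-dimensional Euclidean position error of the N estimated MPs; the function name and the sample coordinates are illustrative.

```python
import numpy as np


def rmse(true_xy, est_xy):
    """Root mean square of the 2-D position errors over all estimated MPs."""
    true_xy = np.asarray(true_xy, dtype=float)
    est_xy = np.asarray(est_xy, dtype=float)
    squared_errors = np.sum((true_xy - est_xy) ** 2, axis=1)
    return float(np.sqrt(squared_errors.mean()))


# Three MPs, one of which is estimated 5 m away from its true position.
true_xy = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
est_xy = [[3.0, 4.0], [1.0, 1.0], [2.0, 2.0]]
value = rmse(true_xy, est_xy)
```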

V. MEASUREMENT CAMPAIGN IN BUT
The measurement campaign was conducted in two locations: the fifth floor of the BUT building and the parking lot in front of the same university. The environment for the first measurement campaign is represented in Fig. 1. In all scenarios for this measurement campaign, the seven GWs have unique identifiers: 7, 9, 11, 13, 15, 17, and 19.

A. Aboveground Indoor Localization in BUT
The first part of the first measurement campaign was conducted on the fifth floor of the BUT building. The building has seven floors with internal walls (150-mm thick) made of concrete. The coordinate system created for this case is depicted in Fig. 2   horizontal (75 × 1.8/3.2 m). All corridors have iron benches along the walls; the horizontal corridor contains stairs and lifts (separated by doors). The total number of MPs is 203, with 1 m between them. Using different SFs (7, 9, 10, and 12), we sent three UL messages from each point received by seven GWs distributed on the floor and placed close to the walls so as not to obstruct the passage of people.
Summary information on the collected dataset is presented in Table III. The total number of sent UL packets for each SF is 203 × 3 = 609. Consequently, the theoretical maximum number of packets that all GWs could receive for each SF is 609 × 7 = 4263. Technically, a larger SF means higher sensitivity. Thus, with SF12, we should observe the largest number of received packets, and with SF7, the smallest. However, SF9 and SF10 deviate from this trend, which can be explained by the fact that the measurements using those SFs were carried out in the mornings, when there are more people in the building than in the evening. The maximum number of packets that each GW could receive for all SFs is 609 × 4 = 2436. As expected, the GWs located at the ends of the corridors (GWs 7 and 9) received the fewest packets, and the GWs in the center (GWs 11-15) received the most.
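The packet budget above is simple arithmetic and can be checked with a few lines of Python. The helper names are hypothetical, and the reception ratio is shown only as a formula because the received counts of Table III are not reproduced here.

```python
def expected_packets(n_mps, msgs_per_mp, n_gws, n_sfs):
    """Theoretical packet counts for one scenario."""
    sent_per_sf = n_mps * msgs_per_mp
    return {
        "sent_per_sf": sent_per_sf,                  # UL packets sent per SF
        "max_received_per_sf": sent_per_sf * n_gws,  # all GWs, one SF
        "max_received_per_gw": sent_per_sf * n_sfs,  # one GW, all SFs
    }


def reception_ratio(received, maximum):
    """Share of the theoretically receivable packets that actually arrived."""
    return received / maximum


# BUT indoor above-ground scenario: 203 MPs, 3 messages, 7 GWs, 4 SFs.
but_indoor = expected_packets(n_mps=203, msgs_per_mp=3, n_gws=7, n_sfs=4)
```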
The RSSI distributions for each SF are presented in Fig. 3. The mean RSSI values for all SFs oscillate around −86 dBm.
The mean errors are presented in Table IV. Here and below, the work presents results averaged over the different SFs; that is, this work does not investigate the dependency of the localization accuracy on the SF. In indoor conditions with many obstacles, researchers use trilateration as the simplest localization approach to estimate the accuracy when the signal is refracted and reflected multiple times. With a distance-power gradient of 4, corresponding to the environment under consideration, we obtained a mean error of 32.3 m. WCA reduced the error to 26.7 m. Using different ML algorithms, we obtained minimum mean errors of 2.49 and 2.43 m for k = 2 using k-NN and k-NN-W, respectively.
In this work, we explored several ways to reduce redundancy caused by the multiple uplink messages from the same location to avoid overfitting the models: selection of the random reading out of three, averaging, and selection of the reading with the highest RSSI. For all methods, the best strategy to eliminate redundancy of readings turned out to be the last one, selecting the corresponding reading with the highest mean RSSI for each location.
To conclude, localization methods based on the anchors (trilateration and WCA) produce mean errors of tens of meters, while the k-NN algorithm reduces the error to the meter level. Thus, the following experiments exclude the first two algorithms from the analysis. The best results were obtained for ML methods based on the reading with the highest average RSSI (approximately 1 m better accuracy). Therefore, we will use this strategy to remove redundant data from the dataset.

B. Underground Indoor Localization in BUT
The next part of the measurement campaign took place in the parking lot in front of the BUT (see Fig. 1). It has an above-ground floor and several underground floors, which represent an analog of the underground environment (see Fig. 4, where 1 denotes the above-ground floor, and 2 and 3 denote the two underground floors). The mean errors are presented in Table V.
In this case, the measurements were taken on underground floors 2 and 3 of the parking lot. The coordinate system for this experiment is represented in Fig. 4. During the week, the measurement campaign was carried out on two underground parking levels at 2.8-m height during low workload hours. The top underground floor (2 in Fig. 4) had approximately 74% of its places occupied, and the bottom floor (3) had 10%. There are 49 MPs in the center of each floor, spaced 2.5 m apart. As before, we sent three messages from each MP. The total number of UL packets sent for each SF for each floor is 49 × 3 = 147. Consequently, the maximum number of packets that all GWs could receive for each SF is 147 × 6 = 882. The maximum number of packets each GW could receive for all SFs is 147 × 2 = 294.
The RSSI distributions for each SF are presented in Fig. 5. As expected, the difference in the average RSSI levels obtained using SF7 and SF12 becomes less noticeable at shorter distances.
The mean errors for the case when the GWs were installed locally are shown in Table VI for both underground floors. On average, the best results were obtained with SVR (5-7 m), followed by k-NN, which turned out to be the best algorithm for localization estimation in the previous case.

VI. MEASUREMENT CAMPAIGN IN UPB
A similar measurement campaign (MC) in UPB was planned for outdoor and indoor environments to reach more consistent conclusions. The environment for this MC is shown in Fig. 6. Building A has eight floors and Building B has four, and the walls are made of concrete. For this MC, the number of GWs was increased to nine to explore the dependency of the localization accuracy on the number of GWs more explicitly. The distribution of the GWs across the considered area is presented in Fig. 6 and Table VII. The identification numbers assigned to the GWs are 1, 3, 5, 7, 9, 11, 13, 17, and 19. The placement of the GWs is almost the same for both scenarios, indoor and outdoor, except for GW 5, which was located in another building during the outdoor measurements.
The coordinate systems for the indoor and outdoor scenarios for this MC are the same (see Fig. 7). The measurement map covers a rectangular area of 5 × 31 m and contains 155 MPs separated by 1 m.
In addition, it was decided to change the procedure of the k-NN algorithms for this measurement campaign: when dividing the dataset into training and testing subsets, the boundary MPs (blue dots in Fig. 7) were excluded from the testing set. The reason is that, in the case of small datasets, the border MPs experience large errors since they have fewer neighbors, which distorts the picture of the mean localization error. The data for the MPs at the borders were taken separately from the MPs inside the rectangular area (orange dots) to make the subsequent division of the data for k-NN easier.

A. Indoor (Localization in the Office, UPB)
The indoor measurements were collected on the ground floor in the spacious hall without considerable obstacles except for columns. Summary information on the collected dataset is represented in Table VIII. The total number of the sent UL packets for each SF is 155 × 3 = 465. Consequently, the possible number of packets that all GWs could receive for each SF is 465 × 9 = 4185. According to the statistical analysis results, the average probability of receiving the message equals 92.6%, comparable to the similar scenario conducted in BUT (94%; see Table III).
The distributions of RSSI for each SF are presented in Fig. 8. Regarding the RSSI levels, the results were similar to those obtained for the indoor scenario in BUT (see Fig. 5). Although the cases had different initial parameters (see Table II), this gives us more grounds to compare them. Table IX contains the accuracy assessment for this case. The obtained mean error turned out to be approximately 2.7 m, confirming the results obtained for the other indoor scenario in the previous campaign in BUT (see Table IV). It should be noted that although such an accuracy is a promising result for LoRaWAN-based localization and is comparable with Wi-Fi performance, it remains relatively low (compared, e.g., to UWB).

B. Outdoor (Localization in Front of the University, UPB)
The coordinate system for the outdoor scenario is the same as in the case of the indoor scenario (contains the same number and the same disposition of MPs, see Fig. 7), but the considered area is located in front of the university.
According to the results of the statistical analysis, the probability of receiving the message equals 91% on average in this case (see Table X). It is only 1.6 percentage points lower than in the indoor scenario. The distributions of RSSI for each SF are presented in Fig. 9. Although the mean level of RSSI remains approximately the same (from −89 to −92 dBm), one can notice that the form of the distribution turned out to be more normalized than in the previous case.
Table XI contains the mean localization errors for this case. The best result for this case was achieved for k = 5 and equals 4.50 m, that is, the accuracy is about 1.7 times lower than in the indoor case. To the best of the authors' knowledge, this accuracy level is one of the highest achieved with LoRaWAN-based technology for outdoor datasets so far. However, it remains quite low compared to the levels that could be achieved by GNSS (10 cm) [4]. At the same time, as stated before, the target was not to reach the highest localization accuracy in each environment separately using LoRaWAN but to test whether this technology can achieve an accuracy that could be considered suitable in a mixed type of environment.

VII. INVESTIGATION OF THE OPTIMAL NUMBER OF GWS
Besides estimating the localization accuracy based on all the collected data, it is worthwhile to investigate its dependence on the number of GWs used. Based on the datasets collected in UPB and the more reliable BUT scenarios (indoor localization in the building, indoor underground localization with GWs deployed locally), we conducted an experiment to identify the best possible choice of GWs to obtain the least mean localization error. The algorithm was as follows.
1) Try all possible combinations of three GWs.
2) Identify the best combination, that is, the one that gives the least mean localization error.
3) Gradually increase the number of GWs, adding at each step the best option from the remaining set.
The results of the experiment are presented in Tables XII and XIII. With the best strategy for choosing the order of the GWs, there is no need to use all the available equipment to achieve the lowest mean localization error: in the majority of the cases, the optimum number of GWs (GWs opt) is less than the available number of GWs (GWs max). Note that for both indoor datasets, which showed similar results in terms of accuracy, the minimum mean localization error was achieved with the same number of GWs, namely 7.
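The three steps of this selection procedure can be sketched as an exhaustive search over GW triples followed by greedy forward selection. This is a sketch under stated assumptions: the data are synthetic, the GW layout is hypothetical, k-NN with k = 2 is used as the error estimator, and the error is measured on a held-out 20% split (the text does not state which subset was used for the selection).

```python
from itertools import combinations

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def mean_error(X_train, X_test, y_train, y_test, gw_idx, k=2):
    """Mean position error of k-NN using only the RSSI columns in gw_idx."""
    cols = list(gw_idx)
    model = KNeighborsRegressor(n_neighbors=k).fit(X_train[:, cols], y_train)
    pred = model.predict(X_test[:, cols])
    return float(np.mean(np.linalg.norm(pred - y_test, axis=1)))


def select_gateways(X_train, X_test, y_train, y_test, start_size=3):
    """Best starting triple (steps 1-2), then greedy additions (step 3)."""
    def score(idx):
        return mean_error(X_train, X_test, y_train, y_test, idx)

    n_gws = X_train.shape[1]
    chosen = list(min(combinations(range(n_gws), start_size), key=score))
    history = [(tuple(chosen), score(chosen))]
    remaining = set(range(n_gws)) - set(chosen)
    while remaining:
        best = min(remaining, key=lambda g: score(chosen + [g]))
        chosen.append(best)
        remaining.remove(best)
        history.append((tuple(chosen), score(chosen)))
    return history


# Synthetic fingerprints for six hypothetical GWs around a 1-m MP grid.
rng = np.random.default_rng(1)
gws = np.array([[-1, -1], [11, -1], [-1, 21], [11, 21], [5, -2], [5, 22]],
               dtype=float)
xs, ys = np.meshgrid(np.arange(0.0, 11.0), np.arange(0.0, 21.0))
mps = np.column_stack([xs.ravel(), ys.ravel()])
dist = np.linalg.norm(mps[:, None, :] - gws[None, :, :], axis=2)
rssi = -40.0 - 40.0 * np.log10(dist) + rng.normal(0.0, 0.5, dist.shape)

split = train_test_split(rssi, mps, test_size=0.2, random_state=0)
history = select_gateways(*split)
# GWs opt corresponds to the history entry with the smallest mean error.
gws_opt, err_opt = min(history, key=lambda item: item[1])
```

Greedy growth keeps the number of evaluated subsets small compared with testing every possible subset, at the cost of not guaranteeing the globally best set for each size.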
The complete picture of the dependency of the accuracy on the number of GWs requires reviewing the average expected results. Therefore, this article also studies the mean localization error for a random choice of GWs, averaged over $10^5$ iterations. The results for both experiments are presented in Fig. 10.
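The averaging over random GW choices can be sketched as a small Monte Carlo loop. The data and GW layout below are synthetic and hypothetical, k-NN with k = 2 is assumed as the estimator, and only 20 iterations per GW count are used to keep the example fast, far fewer than the number of iterations used in this work.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor


def random_subset_error(X_train, X_test, y_train, y_test,
                        n_gws, n_iter, rng, k=2):
    """Mean k-NN position error averaged over random subsets of n_gws GWs."""
    errors = []
    for _ in range(n_iter):
        cols = rng.choice(X_train.shape[1], size=n_gws, replace=False)
        model = KNeighborsRegressor(n_neighbors=k)
        model.fit(X_train[:, cols], y_train)
        pred = model.predict(X_test[:, cols])
        errors.append(np.mean(np.linalg.norm(pred - y_test, axis=1)))
    return float(np.mean(errors))


# Synthetic fingerprints for six hypothetical GWs around a 1-m MP grid.
rng = np.random.default_rng(2)
gws = np.array([[-1, -1], [11, -1], [-1, 21], [11, 21], [5, -2], [5, 22]],
               dtype=float)
xs, ys = np.meshgrid(np.arange(0.0, 11.0), np.arange(0.0, 21.0))
mps = np.column_stack([xs.ravel(), ys.ravel()])
dist = np.linalg.norm(mps[:, None, :] - gws[None, :, :], axis=2)
rssi = -40.0 - 40.0 * np.log10(dist) + rng.normal(0.0, 0.5, dist.shape)
X_train, X_test, y_train, y_test = train_test_split(
    rssi, mps, test_size=0.2, random_state=0)

# One averaged error per GW count, i.e., one point of an error-vs-GWs curve.
curve = {
    n: random_subset_error(X_train, X_test, y_train, y_test,
                           n_gws=n, n_iter=20, rng=rng)
    for n in range(3, 7)
}
```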
As expected, the localization accuracy diminishes as the number of GWs decreases. The scenario most dependent on the number of GWs turned out to be the underground bottom floor: the loss of each GW increases the mean localization error by 16% on average. The curves for the remaining scenarios follow the same tendency: adding the first 2-3 GWs increases the accuracy by 11.25% on average, and then, after the point approximately corresponding to GWs opt, the growth slows down to an average of 5%-6%.

VIII. MAIN CHALLENGES AND DISCUSSION
This work investigates LoRaWAN as a localization technology and its potential to be utilized in industrial worksites, based on two measurement campaigns conducted in BUT and UPB. The considered scenarios cover the most typical localization cases, which allows general conclusions to be drawn.
Analyzing the measurement procedure, it should be noted that the UG environment is more complex for planning a
Summing up all aspects of the conducted measurement campaigns, we can distinguish the following groups of limitations and errors that are present in this work at the stage of data collection and processing.
1) Limitations related to the equipment.
2) Errors that might occur during the transmission of the data to the storage (e.g., an outage of the RequestBin).
3) Errors related to the environment (e.g., the number of available sockets, marking inaccuracies caused by uneven ground, etc.).
4) Assumptions related to the approaches. The main method considered in this work is k-NN fingerprinting, which is applied to randomized readings. Thus, a different run will produce a slightly different result. The results in this work are presented for one and the same random sequence of data.
5) Human errors (e.g., errors that might occur during the direct gathering of the data or the manual export of the data from the RequestBin to an Excel file, etc.).
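The effect of the randomized readings on k-NN fingerprinting can be illustrated with a short sketch in which a fixed seed reproduces the same random sequence on every run. The 80/20 split, the seed value, `k=3`, and the inverse-distance weighting used for the weighted variant are assumptions for illustration; the readings are assumed to be `(rssi_vector, (x, y))` pairs.

```python
import random
from math import dist
from statistics import fmean


def split_readings(readings, train_share=0.8, seed=42):
    """Shuffle with a fixed seed so that every run sees the same sequence."""
    shuffled = list(readings)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_share)
    return shuffled[:cut], shuffled[cut:]


def knn_estimate(train, rssi, k=3, weighted=False, eps=1e-6):
    """Estimate (x, y) from the k nearest fingerprints in RSSI space."""
    ranked = sorted((dist(rssi, fp_rssi), xy) for fp_rssi, xy in train)[:k]
    # Assumption: the weighted variant uses inverse-distance weights,
    # while plain k-NN gives every neighbor the same weight.
    weights = [1.0 / (d + eps) if weighted else 1.0 for d, _ in ranked]
    total = sum(weights)
    x = sum(w * xy[0] for w, (_, xy) in zip(weights, ranked)) / total
    y = sum(w * xy[1] for w, (_, xy) in zip(weights, ranked)) / total
    return x, y


def mean_error(train, test, **kwargs):
    """Mean Euclidean distance (m) between estimated and true positions."""
    return fmean(
        [dist(knn_estimate(train, rssi, **kwargs), xy) for rssi, xy in test]
    )
```

Calling `split_readings` with a different `seed` changes which readings end up in the training set, which is the source of the run-to-run variation mentioned in item 4.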
The first question posed in this work is the average localization accuracy of LoRaWAN in outdoor, indoor above-ground, and underground environments. The summary data for all considered scenarios are given in Table XIV. Both indoor above-ground scenarios, at BUT and UPB, give almost the same accuracy, around 2.5-2.8 m (k-NN and k-NN-W approaches). Given that various works report an accuracy of up to 5-6 m for both Wi-Fi [51], [52] and BLE [53], [54], which are currently the most widespread solutions for indoor localization, we can conclude that the obtained results are quite promising for LoRaWAN. For the underground scenario, the mean localization errors are 5 and 6.6 m for the top and bottom floors of the parking lot, respectively. Finally, LoRaWAN in the outdoor scenario gives a mean localization error of 4 m (SVR, linear kernel).
The results of the measurement campaigns revealed that the approaches based on the location of the GWs (trilateration and WCA) are not suitable for localization purposes, as they provide an accuracy of tens of meters. At the same time, ML algorithms reduce the error to the order of meters. A comparison of the mean localization errors provided by the considered ML algorithms for the main scenarios is presented in Fig. 11. According to the results, in most cases, k-NN and k-NN-W show the best overall performance. In contrast, the lowest accuracy is shown by DTR, the most complex algorithm in terms of the number of parameters to tune.
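To make the GW-location-based family concrete, the sketch below assumes that WCA denotes a weighted-centroid estimate, in which the position is a weighted average of the GW coordinates. Using the linear received power as the weight is an assumption made here for illustration, and the function name is hypothetical.

```python
def weighted_centroid(gw_xy, rssi_dbm):
    """Weighted centroid of the GW positions.

    Assumption: weights are the received power in mW, converted from dBm.
    """
    weights = [10 ** (r / 10.0) for r in rssi_dbm]
    total = sum(weights)
    x = sum(w * gx for w, (gx, _) in zip(weights, gw_xy)) / total
    y = sum(w * gy for w, (_, gy) in zip(weights, gw_xy)) / total
    return x, y
```

Because the weights are positive, such an estimate always lies inside the polygon spanned by the GWs and is pulled toward the GW with the strongest signal, which may partly explain the coarse accuracy of this family when the GWs are sparse.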
Initially, this work proceeds from the assumption that the more GWs, the higher the accuracy. However, the strategy of using many GWs cannot be considered typical, since the initial idea of the technology is to cover large territories with few GWs. Depending on the features of the environments under investigation, we operated 6 to 9 GWs. To examine the dependency of interest, this work investigates the best possible and average cases for different numbers of GWs on the collected datasets.
In the study of the first scenario (the best possible case), it was found that the lowest mean localization error is reached when the number of GWs is less than the maximum. The optimal number of GWs depends on the environment: for the two indoor datasets, it equals 7, and for the noisier scenarios with fewer LoS links, it is generally lower, 3-5 (see Table XII).
The study of the second scenario (the average case), which reviews the expected accuracy for different numbers of GWs averaged over $10^{5}$ iterations, confirms the assumption that the more GWs, the higher the accuracy. However, from some point on, the gain in accuracy per added GW drops from 11.25% to 5%-6% and no longer justifies deploying additional GWs. This point turned out to be equal or close to the optimal number of GWs identified in the previous step.
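The point at which additional GWs stop paying off can be located from an error curve with a few lines of arithmetic. The sketch below assumes a hypothetical mapping from the number of GWs to the mean localization error and a 6% threshold taken from the upper end of the 5%-6% range quoted above; both the function names and the threshold are illustrative choices.

```python
def relative_gains(curve):
    """Relative error reduction obtained by each added GW.

    `curve` maps the number of GWs to the mean localization error (m).
    """
    sizes = sorted(curve)
    return {
        n: (curve[prev] - curve[n]) / curve[prev]
        for prev, n in zip(sizes, sizes[1:])
    }


def saturation_point(curve, min_gain=0.06):
    """Largest GW count whose addition still improves the error by `min_gain`."""
    gains = relative_gains(curve)
    useful = [n for n, gain in gains.items() if gain >= min_gain]
    # If no step reaches the threshold, the smallest deployment is returned.
    return max(useful) if useful else min(curve)
```

The threshold is a deployment decision rather than a property of the data: a stricter `min_gain` moves the saturation point toward fewer GWs.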
The collected datasets cover a small area, but the perimeter formed by the GWs is usually several times larger than the surveyed area (except for the underground datasets). For example, the areas formed by the GWs for both indoor datasets are approximately the same, about 4500 m$^{2}$, while the area of the outdoor dataset was expanded to 12 000 m$^{2}$ by changing the position of one GW. Thus, given the particular number of GWs, a similar density of the signal map, and a similar type of environment, we can expect the declared accuracy at least within those perimeters.

IX. CONCLUSION
The results obtained during these campaigns indicate that the LoRaWAN-based localization accuracy is comparable with that of the most widespread solutions, Wi-Fi and BLE. However, it still falls significantly short of the flagship indoor and outdoor localization technologies. Realistically, regardless of the algorithm applied to increase the accuracy, LoRaWAN technology, as currently available on the market, is unlikely to surpass the leading technologies in indoor or outdoor environments taken separately. On the other hand, owing to its flexibility, LoRaWAN could be a good solution for mixed environments, potentially including some underground worksites, and thus become one more step toward solving the problem of seamless localization.