Offsite Evaluation of Localization Systems: Criteria, Systems, and Results From IPIN 2021 and 2022 Competitions

Indoor positioning is a thriving research area, which is slowly gaining market momentum. Its applications are mostly customized, ad hoc installations; ubiquitous applications analogous to Global Navigation Satellite System for outdoors are not available because of the lack of generic platforms, widely accepted standards and interoperability protocols. In this context, the indoor positioning and indoor navigation (IPIN) competition is the only long-term, technically sound initiative to monitor the state of the art of real systems by measuring their performance in a realistic environment. Most competing systems are pedestrian-oriented and based on the use of smartphones, but several competing tracks were set up, enabling comparison of an array of technologies. The two IPIN competitions described here include only off-site tracks. In contrast with on-site tracks where competitors bring their systems on-site—which were impossible to organize during 2021 and 2022—in off-site tracks competitors download prerecorded data from multiple sensors and process them using the EvaalAPI, a real-time, web-based emulation interface. As usual with IPIN competitions, tracks were compliant with the EvAAL framework, ensuring consistency of the measurement procedure and reliability of results. The main contribution of this work is to show a compilation of possible indoor positioning scenarios and different indoor positioning solutions to the same problem.


I. INTRODUCTION
THE purpose of the IPIN competition is to create an environment where methods and algorithms for indoor localization can be tested under controlled conditions that are as realistic as possible [1]. Localization systems described in papers are next to impossible to compare in a meaningful way, for a variety of reasons. First, each research group almost always develops and tests its system in its own laboratory or a nearby facility. Second, systems are often complex, and their descriptions may omit relevant parameters or implementation details, making them impossible to reproduce. Third, and no less important, the infrastructure required to support positioning may not be fully replicable in a different location.
Benchmarking based on public competition is one way out of these problems: research groups are invited to showcase their systems in a way that makes it possible to compare them with other systems on an even ground. This article presents the outcomes of the 2021 and 2022 editions of the IPIN competition. Its main points are as follows.
1) The introduction of the EvaalAPI, an open-source API that allows off-site tracks to simulate the demanding conditions of an on-site track.
2) New tracks that introduce challenging new scenarios for indoor positioning, such as Track 8: fifth-generation (5G) in open plan office.
3) New environments and challenges introduced to existing tracks.
4) A description of state-of-the-art solutions.

II. EVAAL FRAMEWORK
The EvAAL framework is a set of criteria defining how an indoor positioning competition should be set up. It was defined in 2014 based on the original EvAAL competitions [6].
During a competition, a number of teams compete according to a set of rules, which define a track. A competition may include a number of tracks, each centred on different types of devices and each with its own rules. Tracks can be on-site, with teams gathering in a physical place to run their working systems, or off-site, with no physical gathering and no physical devices involved from the participating teams. For each track, competitors run one or more trials during which the performance of their systems is measured. In on-site tracks, an actor walks along a predefined path while carrying or wearing the competing system, which continuously estimates and records its position along the path. Reference points are marked along the path, and position estimation errors are measured for each reference point.
Competitors have the opportunity to survey the environment, running testing trials on their own, before running (usually) two scoring trials on which the competition score is computed and the final competition ranking is established.
The same applies to off-site tracks, the main difference being that competitors do not collect any data, either for surveying the environment or for participating in the competition. Data collection is delegated to track chairs. All competitors receive the same surveyed data and participate in the competition with the same information.
In short, the EvAAL framework considers the following four core criteria: 1) natural movement of an actor; 2) realistic environment; 3) realistic measurement resolution; 4) third quartile of point Euclidean error; it also considers four extended criteria.
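As an illustration of the fourth core criterion, the following Python sketch computes the third quartile of the point Euclidean errors over a set of reference points. The coordinates are made up for the example, and the linear-interpolation convention used for the quartile is an assumption: the text does not state which convention the competition applies.

```python
import math


def point_errors(estimates, ground_truth):
    """Euclidean error between each estimate and its reference point."""
    return [math.dist(e, g) for e, g in zip(estimates, ground_truth)]


def third_quartile(values):
    """75th percentile, linearly interpolated between order statistics
    (assumed convention, not taken from the competition rules)."""
    ordered = sorted(values)
    pos = 0.75 * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


# Hypothetical estimates and reference points, in metres.
estimates = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0), (2.0, 0.0), (0.0, 3.0)]
truth = [(0.0, 1.0), (1.0, 3.0), (5.0, 2.0), (2.0, 4.0), (0.0, 8.0)]

errors = point_errors(estimates, truth)
score = third_quartile(errors)
print(errors, score)
```

Using a quartile rather than the mean keeps a few very large errors from dominating the score, while still penalizing systems that are inaccurate at a quarter of the reference points.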

A. Integrating Multiple Diverse Tracks
An IPIN competition is in fact composed of several independent competitions, called tracks. Each track has its own rules and purpose. Competitors can participate in one or more tracks. Tracks adhere to the EvAAL framework, though to different degrees. Tracks can be either on-site or off-site as follows.
1) On-site tracks are run in real time, with trials consisting of a real device collecting sensor data while carried by a real person (an actor) walking along a predefined path previously unknown to competitors. The device, which runs software written by competitors, continuously estimates and records the current position. Estimates are collected at the end of each trial and handed to track chairs, who then compare them to a ground truth unknown to competitors. In an on-site environment, competitors are free to explore the site themselves and make surveys to tune their systems and discover the specifics of the competition area. Usually, this happens the day before the competition proper. During the competition proper, the score is computed on the best of two trials run on the same path.
2) Off-site tracks are run on recorded data rather than on data collected in real time. Track chairs collect sensor data at a given location and then provide them to competitors. Competitors run their software on the sensor data, estimate positions, and then hand those estimates to track chairs to be compared with the ground truth. In an off-site competition, competitors are provided with training trial data and/or testing trial datasets to check that their system indeed works and to tune it for as long as they need. Then, they are provided with (usually) two scoring trial datasets, on which the score is finally computed.
As an example, the flagship track of the IPIN competition has been the on-site Smartphone track, track 1 in the years 2014-2019. The rules of this track have specified that competitors must implement their solution as an app running on a commercial off-the-shelf smartphone, without communicating with the outside world. Only sensors embedded into the smartphone have been allowed, and the use of external devices has been excluded.

B. Similar Competitions
As often happens, many people had the same idea around the same time. In 2011, the first EvAAL competition was held in Valencia (ES) as part of the UniversAAL project (FP7-ICT-2009-4), with two more editions in 2012 and 2013.
The Microsoft indoor localization competition was also born in 2011, in association with the International Conference on Information Processing in Sensor Networks [7], [8], [9]. The aim of the Microsoft competition was to bring together teams from around the world using as many different approaches as possible, with few constraints. Measurements were taken with the competing systems standing still at a number of reference points, and no attempt at realistic movement or environment was made. The competition was held yearly until 2017.
The PerfLoc Prize Competition was run in 2018 by NIST (U.S.), while the Positioning Algorithm Competition was run in 2019 by the IEEE Communication Theory Workshop. Both competitions were centred on RF systems [5]. The first responder smart tracking competition is a large and ambitious effort funded (again) by NIST with $8M, most of which will be awarded to competitors. It was launched in 2022 and is planned to be completed by the end of 2023.

C. Previous IPIN Competitions
IPIN competitions started in 2014 in Busan, with a single on-site track based on smartphones. The first off-site track was run in 2015. No on-site tracks were held in 2020-2022 because of travel restrictions.
Table I summarizes the location, number of tracks, and number of competitors participating in on-site and off-site tracks during the history of the IPIN competition.

III. INNOVATION IN 2021 AND 2022 EDITIONS
The lessons learned after organizing the off-site edition of the IPIN Competition 2020 [4] were considered in the 2021 and 2022 editions, which both brought significant innovation as follows: 1) the introduction of the EvaalAPI in off-site tracks; 2) considering new challenges in existing tracks; 3) the birth of several new tracks introducing new localization technologies; 4) novel solutions for the different tracks.

A. EvaalAPI
The EvaalAPI interface is used to run off-site tracks. It was introduced in 2021 on an experimental basis and established in 2022 as the means for downloading testing trials and providing position estimates.
The purpose of the EvaalAPI interface (see Section IV) is to make the results of off-site tracks closer to those of on-site tracks by removing some distortions that became apparent in 2015, when the first off-site track was introduced. Distortions include, for instance, correcting positions after the fact and smoothing the scoring trajectory with future information.

B. New Challenges in Existing Tracks
This is a short summary of new challenges introduced to existing tracks in 2021 and 2022. Each is described in deeper detail in the following sections.
Track 3: Smartphone, introduced in 2015, exploits the sensors of a smartphone. In 2015, the competition was based on static Wi-Fi fingerprinting. In 2016, the user's motion and other sensors were introduced. In 2018, the first very large scenario, a shopping mall, was introduced. In 2021, device diversity and user diversity were introduced to the track.
Track 4: Foot-mounted inertial measurement unit (IMU), introduced in 2018, exploits data gathered by multisensor equipment mounted on the foot. The PERSY sensor used in 2018-2020 [3], [4], [5] was replaced by ULISS in 2021 [10], as the latter is able to deliver 3-D inertial and magnetic, pressure, and GNSS data.
Track 7: CIR, introduced in 2020, has an actor move around a warehouse-like environment wearing a tag that regularly transmits UWB signals, while CIR readings are gathered by anchors positioned around the area. In 2021, a second scenario without training was provided, where clutter elements from the first scenario were moved within the environment, allowing the assessment of adaptability to changes in the environment. In 2022, training and evaluation data were collected by different agents and the EvaalAPI was adopted.

C. New Localization Technologies
This is a short summary of new localization technologies introduced in 2021 and 2022. Each is described in deeper detail in the following sections. Camera:

D. Novel Solutions for the Competition Tracks
The indoor positioning community was challenged with several tracks in 2021 and 2022. A total of 26 (2021) and 29 (2022) teams submitted a proposal to participate in the competition, but only 13 (2021) and 26 (2022) went on to submit their outputs (see Table I).
A short description of each proposed indoor localization solution is available on the EvAAL website [11]. All teams from the 2021 and 2022 editions were invited to submit an extended, detailed description to be reported in this manuscript. Sections V-X include descriptions from those teams that accepted the invitation.

IV. EVAALAPI
From 2015 to 2020, sensor data recorded by track chairs were timestamped and stored in a file, which was then sent to competitors. Competitors would send back their results some days later, by a common deadline. Over time, we observed that competitors were increasingly treating the challenge as an optimization problem rather than as the emulation of an on-site trial. Specifically, we observed three main ways in which optimization differs from emulation of on-site behavior, and usually provides more accurate results.
One difference is that on-site trials are causal, meaning that estimates provided by competitors are necessarily based only on past sensor readings. This can make a big difference in estimation accuracy, which is important because real localization systems are indeed causal.
Another difference is that on-site trials are one-shot, meaning that if something goes wrong in a trial, it cannot simply be retried. A real localization system cannot ask the user to go back and try again if it detects inconsistencies in its estimation results.
The last notable difference is that on-site trials are run in real time, meaning that each estimate is timestamped when it is provided, as in a real localization system, which uses position information as soon as it is available in order to provide a smooth experience to users.
The introduction of the EvaalAPI interface in 2021 forced off-site tracks into causal, one-shot, and real-time behavior, making them more similar to on-site tracks. In the following, the detailed working of EvaalAPI is discussed, and some comparisons are made for the off-site tracks that switched to EvaalAPI.

A. EvaalAPI Concept
To make off-site track behavior more similar to that of on-site tracks, the first step is to force causal behavior by requiring the competing system to provide position estimates as it reads data from multiple sensors. This is obtained by defining an API for providing sensor data to the competing system and getting position estimates from it. The API is implemented as a web service. The competing system runs a loop where it repeatedly reads data from multiple sensors and provides a position estimate. The EvaalAPI server reads sensor data from a file, provided by track chairs, where each row is timestamped and contains data from one or more sensors. The server writes the position estimates received from competitors to a file, which is subsequently used by track chairs to compute the score of the trial.
1) Forcing Causality - The Nextdata Loop: The EvaalAPI server waits for a Nextdata (Horizon, Position) command [an HTTP request] from the competing system, which acts as a client; it answers the request with timestamped sensor data, which it reads from the data file. Figs. 1 and 2 illustrate the server loop. In each iteration of the main loop, the client sends a Nextdata command. Each command carries a position estimate, which the client has computed from the sensor data received in response to previous Nextdata commands, and a requested time horizon, indicating how much sensor data the client expects as an answer to the request. This interface forces causal behavior because the competing system can base its position estimates only on past (in virtual time) sensor data; it can exploit no forward knowledge.
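The Nextdata loop can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: `MockEvaalServer`, `nextdata`, and `run_trial` are hypothetical names, the server is an in-memory stand-in rather than the real web service, and the estimator is a deliberately trivial running mean. The real EvaalAPI endpoints and data formats are not reproduced here.

```python
class MockEvaalServer:
    """In-memory stand-in for the server side; NOT the real EvaalAPI."""

    def __init__(self, rows):
        self.rows = sorted(rows)      # (timestamp, reading) pairs
        self.virtual_time = 0.0
        self.cursor = 0
        self.estimates = []           # (virtual time, position) log

    def nextdata(self, horizon, position):
        # Log the estimate against the current virtual time, then
        # release only the rows that fall inside the requested horizon.
        self.estimates.append((self.virtual_time, position))
        self.virtual_time += horizon
        start = self.cursor
        while (self.cursor < len(self.rows)
               and self.rows[self.cursor][0] <= self.virtual_time):
            self.cursor += 1
        return self.rows[start:self.cursor]

    def finished(self):
        return self.cursor >= len(self.rows)


def run_trial(server, horizon=0.5):
    """Client loop: every estimate depends only on data already received."""
    position = (0.0, 0.0)             # hypothetical initial guess
    seen = []
    while not server.finished():
        seen.extend(server.nextdata(horizon, position))
        if seen:
            # Toy estimator: mean of all readings received so far.
            xs = [reading[0] for _, reading in seen]
            ys = [reading[1] for _, reading in seen]
            position = (sum(xs) / len(xs), sum(ys) / len(ys))
    server.nextdata(0.0, position)    # hand in the last estimate
    return server.estimates


rows = [(0.1, (1.0, 1.0)), (0.6, (3.0, 1.0)),
        (1.1, (5.0, 1.0)), (1.6, (7.0, 1.0))]
log = run_trial(MockEvaalServer(rows))
print(log)
```

In this sketch the estimate logged at virtual time t can only depend on rows whose timestamp is at most t, which is the causality property described above.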
2) Forcing One-Shot - Nonreloadable Trials: In order to force one-shot behavior, each scoring trial can be run only once, i.e., it is nonreloadable. Each track provides a number of testing trials, i.e., reloadable ones, that can be used at will by competitors to tune their system. In addition, it provides a few (usually two) scoring trials, which can be run only once and on which the score is computed; the best one is used for ranking the competitors.
3) Forcing Real-Time - Managing Timeout: In order to force real-time behavior, the virtual time is linked to the wall time. Virtual time is relative to the time stamped on each line of the sensor data file and to the horizon used in each Nextdata (Horizon, Position) request.
EvaalAPI forces real-time behavior by slowing down virtual time with respect to wall time by a slowdown factor V (V ≥ 1), to account for network delays, transmission bottlenecks and server response time. In practice, EvaalAPI implements a leaky bucket with a rate defined by the slowdown factor and a threshold useful for compensating occasional brief networking disruptions; a timeout occurs when the bucket empties.
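One possible reading of this leaky-bucket timeout is sketched below in Python. It is an assumption-laden illustration, not the server's actual code: the class name `TimeoutBucket`, the choice to cap the bucket at the threshold, and the refill rule (a horizon of h virtual seconds earns h·V wall seconds) are all hypothetical. Wall-clock times are passed in explicitly so the behavior is reproducible.

```python
class TimeoutBucket:
    """Hypothetical leaky bucket holding wall-clock slack, in seconds."""

    def __init__(self, slowdown, threshold, start):
        if slowdown < 1:
            raise ValueError("slowdown factor V must be >= 1")
        self.slowdown = slowdown
        self.threshold = threshold    # slack tolerated for brief disruptions
        self.level = threshold
        self.last = start
        self.timed_out = False

    def request(self, horizon, now):
        """Register a Nextdata request made at wall time `now`.

        Returns False once the bucket has emptied (timeout).
        """
        # Wall time elapsed since the previous request drains the bucket.
        self.level -= now - self.last
        self.last = now
        if self.level <= 0:
            self.timed_out = True
        if self.timed_out:
            return False
        # `horizon` seconds of virtual time are worth horizon * V wall seconds.
        self.level = min(self.threshold, self.level + horizon * self.slowdown)
        return True


bucket = TimeoutBucket(slowdown=2.0, threshold=5.0, start=0.0)
on_time = [bucket.request(0.5, t) for t in (0.0, 1.0, 2.0)]  # steady client
after_stall = bucket.request(0.5, 6.0)   # a 4 s stall is absorbed by the slack
slack_left = bucket.level
second_stall = bucket.request(0.5, 9.0)  # a further 3 s stall empties the bucket
print(on_time, after_stall, slack_left, second_stall)
```

Under these assumptions a client that keeps pace never loses slack, a single short disruption is tolerated, and repeated or long stalls end the trial.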

B. EvaalAPI Implementation
Source code, including a demo program written in Python, is available and licensed under a GNU Affero General Public License [12], which allows anyone to use, modify and redistribute it freely.

C. EvaalAPI Experience
Results from the competition have shown that EvaalAPI has made a difference. This is clearest when looking at the results from tracks 1 and 3, given in Table II, which summarizes the best scores. Tracks 1 and 3 are the oldest and most stable ones. They are based on the same technology and are run in similar environments, that is, using sensors from a smartphone in big office environments. Notably, in 2019 tracks 1 and 3 shared the same location and even some of the reference points.
During the five on-site smartphone competitions, winning teams always obtained scores in the range from approximately 4 to 9 m, which is also what happened in the first three years of the off-site competition. In 2018, scores from the off-site track started to diverge, becoming much better than those of the on-site one. This was even clearer in 2019, when the preparation phase (choosing the path and taking measurements) was done by the same people in the same area for both the off-site and the on-site tracks.
In 2021, with EvaalAPI, the off-site track results returned to realistic numbers. In 2022, results were poor, apparently because the competition was more difficult than in 2021. In 2023, tracks 1 and 3 shared the same environment and the same sensors: the results, yet to be published, again show a realistic alignment between them.

V. TRACK 2: CAMERA (2022)
This section describes track 2, which was based on camera (computer vision) and took place only in 2022.

A. Track Description
The widespread availability of smartphones and their combination of sensing, computation, and communication capabilities make them an attractive platform for indoor localization. The preferred localization approaches are influenced by factors such as infrastructure availability, the size and type of the target indoor site, people's movement characteristics, and the desired frequency, latency, and accuracy of the localization result.
Image-based localization does not require the presence of specific infrastructure, can handle relatively large sites, and can provide orientation along with position estimation. Although it is possible to obtain centimeter-level accuracy for room or apartment-sized sites, in practical applications, achieving and maintaining similar accuracy in large public areas remains challenging. Among the relevant issues contributing to this challenge are variability in visual appearance over time, irregular motion patterns and the presence of dynamic objects.
The aim of the track 2 competition is to test image-based indoor localization for pedestrians. The target site was two floors of an office building with a test area of about 50 m × 50 m per floor. Using a smartphone, we collected image and sensor data but focused on using only image data for localization, mainly due to the off-site setting of track 2.
The training data covered only part of the site, to simulate the requirements of simplified collection procedures. The scoring trials' data featured a reduced frame rate and larger motion variability, including stopping, sitting/standing and meandering. The reduced frame rate was motivated partly by a general preference for solutions with lower power consumption and partly by practical time constraints for scoring trials in settings with limited internet connection speed.

B. Environment and Measurement Setup
Track 2 used three floors of an office building. Data from the third floor (site 1) were used for training and were provided to competitors in advance. Data from the other two floors (site 2) were collected along the trajectories shown in Fig. 3 and divided into training (plotted in green) and testing (plotted in yellow and red) sets. The training data were provided to the competitors on the day before the scoring trials. Two trajectories from the testing data (red trajectories) were used for the scoring trials on the competition day.
To obtain ground truth labels, a backpack setup with a LiDAR (HESAI Pandar-QT) was used to scan the site and build a localization map. Later, images (640 × 480, 30 fps) and sensor data were collected using a smartphone (SM-N986N). Training data were collected sequentially by several subjects walking along the hallway in a closed loop, holding the smartphone in their right hand in front of the body, with the rear camera facing forward. The recorded pose (longitude, latitude, floor, orientation) was the pose of the subject, and a reasonable effort was made to keep a steady offset of the smartphone relative to the body.
For training data, 43 524 images were recorded along four closed trajectories per floor, combining the inner and outer sides of the hallway with clockwise and counterclockwise directions. For testing data, 5244 images were recorded along seven closed trajectories, keeping each trajectory on a single floor. Unlike the training data, the testing data trajectories included larger variations in walking speed (with stopping and sitting) and direction. Two test data trajectories were selected for scoring trials on the competition day. The image frame rate was reduced

C. Description of Competitors (Camera)
1) Team CamLoc: In this competition, the team built a visual localization system based on an image retrieval algorithm and visual odometry (VO). The system architecture is shown in Fig. 4. The overall process can be divided into the following steps.
a) Step 1, build image descriptor database: To localize the smartphone with an image retrieval algorithm, an image descriptor database must be built in advance. The Patch-NetVLAD [13] model was used to extract a descriptor vector for each image in the training dataset, and the vectors were saved to files for the subsequent image retrieval step.
b) Step 2, image retrieval: During the test, the Patch-NetVLAD model was used to extract the descriptor of the query image online. The similarity between the query image and the images in the database is then calculated. The approach finds the most similar image in the database and uses its ground truth as the pose of the query image. To speed up the similarity calculation, one keyframe is selected from the image descriptor database for every 20 images. Since image retrieval is a nonincremental localization algorithm, it is used to predict the starting point and to relocalize, which reduces the cumulative error of the subsequent VO step.
c) Step 3, VO: Retrieving a similar database image for every test image is time-consuming and generalizes poorly. Therefore, a frame-by-frame monocular VO is implemented to locate the smartphone in a faster and more robust way, which enables the system to track the smartphone even in an unknown environment. To obtain high-quality feature points and their matches, the SuperPoint [14] and SuperGlue [15] models were used in Team CamLoc's system. For the monocular VO to work effectively, the pose must be aligned and the scale factor estimated. The poses predicted by VO are in the camera coordinate system, so they were transformed from the VO coordinate system to the LLA coordinate system with the help of the ENU coordinate system. Since VO is an incremental localization algorithm, cumulative error is inevitable. To limit it, the system relocalizes with image retrieval every N frames; in practice, N is set to 30. As for the scale factor, the predictions and the ground truth on the training dataset were aligned with the Umeyama algorithm to obtain the estimated scale parameter.
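The retrieval and relocalization logic described in Steps 2 and 3 can be illustrated with a small Python sketch. It does not use Patch-NetVLAD: the two-dimensional descriptors, the cosine similarity measure, the poses, and all function names are assumptions made for the example. Only the keyframe stride of 20 and the relocalization period of 30 frames come from the team's description.

```python
import math


def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def select_keyframes(database, stride=20):
    """Keep one database entry out of every `stride` images."""
    return database[::stride]


def retrieve_pose(query, keyframes):
    """Return the ground-truth pose of the most similar keyframe."""
    best = max(keyframes, key=lambda item: cosine(query, item[0]))
    return best[1]


def should_relocalize(frame_index, every=30):
    """Relocalize by retrieval on the first frame and every `every` frames."""
    return frame_index % every == 0


# Toy database: (descriptor, pose) pairs along a straight corridor.
database = [([math.cos(0.02 * i), math.sin(0.02 * i)], (i / 10, 0.0))
            for i in range(60)]
keyframes = select_keyframes(database)

query = [math.cos(0.45), math.sin(0.45)]   # closest to database image 20
pose = retrieve_pose(query, keyframes)
print(len(keyframes), pose)
```

Restricting the search to keyframes trades some spatial resolution for speed; in the system described above, the frames between two relocalizations are tracked by VO instead.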

VI. TRACK 3: SMARTPHONE (2021 AND 2022)
This section describes track 3, which was based on the use of smartphones and took place in 2021 and 2022.

A. Track Description
The objective of track 3 is to evaluate the performance of different integrated navigation solutions based on regular smartphone sensor fusion (WiFi, Bluetooth, and inertial, among others) in an off-site context. As in the 2016-2020 editions [2], [3], [4], [5], [16], a similar data collection strategy and evaluation procedure were followed.
All data for track 3 were collected with the Android app "GetSensorData" [17], [18], which records and stores all data coming from the sensors available in the smartphone into a single text file, i.e., into a logfile. As usual, the dataset is split into three independent subsets, namely, training, validation, and evaluation, using ML terminology. A novelty introduced in 2021 was evaluation through the EvaalAPI, with those subsets renamed training trials, testing trials and scoring trials, respectively.
The first set is for calibration purposes and covers most of the evaluation area. It contains several simple, short, single-floor tracks with key points at relevant positions, including the start, the end and the turns of each track. In the training trials, the trajectory between two key points is almost straight.
The second set is for validation. It contains data useful for competitors to evaluate their systems, with long trials covering multiple floors and well-known locations for the key points. Generally, testing trials include only a few key points, and the user's movement is not restricted to straight lines between two consecutive key points. In addition, new areas might be explored. Testing trials allow competitors to evaluate the accuracy of their solutions as many times as they wish, giving them an assessment of the maturity of their solution.
The third and last set is devoted to evaluation purposes, allowing competitors to have an independent external evaluation without ground truth data, and contains three multifloor very
The main difference introduced in 2021 with respect to the previous editions was the use of several devices and users instead of collecting all data with the same device, i.e., track 3 chairs challenged the competitors with device diversity. In 2021, training and validation data were collected with five smartphones, which are detailed in Table III. The table illustrates the range of sensors considered in the competition. For evaluation in 2021, two scoring trials collected with a Samsung S7 by two different users and one scoring trial collected with a Samsung A5 2017 were provided.
This device diversity feature was kept in 2022, but with a different and larger subset of phones: as shown in Table III, only a few phones were used in both editions. For evaluation in 2022, three scoring trials were provided, collected with the smartphones BQ Aquaris X5 Plus, Samsung A31 5G, and Samsung A5 2017, respectively.
Given the large amount of data and the diversity of smartphones, a reasonable common sampling frequency of 100 Hz was set in "GetSensorData" for all sensors in both editions, 2021 and 2022.
In addition to the logfiles, georeferenced floor plans are provided to competitors. Those floor plans may be useful at the sensor fusion level, allowing competitors to check whether the estimated positions are coherent with the environment.

B. Environment and Measurement Setup (2021)
For the 2021 edition, competition data came from the facilities of the University of Extremadura (Badajoz, ES). The collection lasted four days and was restricted to the external car park area, the ground floor and the basement. More than 30 BLE beacons were deployed in part of the environment to support indoor positioning, and their locations were provided to competitors.
The indoor area included an auditorium, which covers a large area with a soft floor transition as shown in Fig. 5.

C. Environment and Measurement Setup (2022)
In 2022, competition data came from the facilities of the University of Minho (Guimarães, PT). The collection lasted four days and was restricted to the School of Engineering, a three-storey building, and its surroundings. This time, no additional infrastructure was deployed to support indoor positioning. In addition, the scoring trials were collected one month after the training and testing trials.
The indoor area is a three-storey variable-height building and it includes a large open patio as shown in Fig. 6.

D. Description of Competitors (Smartphone)
1) Team Leviathan: The proposed system consists of the following four components. 1) PDR system based on step detection and stride length estimation. 2) ESKF incorporating IMU measurements, headings and PDR output. 3) Floor detection and initialization based on Wi-Fi fingerprint and barometer. 4) PF that utilizes the floor plan information. The flow chart of the proposed system is shown in Fig. 7.
a) Pedestrian Dead Reckoning: The PDR algorithm estimates the pedestrian's step count, stride length and heading. To do so, it exploits accelerometer, gyroscope, and magnetometer data. To remove high-frequency noise, a low-pass filter is applied to the norm of the acceleration. Team Leviathan first employs an FFT to convert the acceleration data from the time domain into the frequency domain, to better represent the periodic component of the signal [26]. Then, detection is used to identify the steps. Unlike traditional stride length estimation, Leviathan's approach is more adaptive in the sense that it is formulated as a function of the peak frequency [27]. The heading of the pedestrian is initialized and updated by using the Madgwick method, which combines magnetic field and angular velocity [28].
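The frequency-domain view of walking can be illustrated with a short Python sketch that finds the dominant step frequency of an acceleration-norm signal. It is not Team Leviathan's implementation: a direct discrete Fourier transform is used in place of an FFT to stay within the standard library, the 0.5-4 Hz search band is an assumed walking-cadence range, and the signal is synthetic. The stride-length function of [27] is not reproduced.

```python
import cmath
import math


def dominant_frequency(samples, rate_hz, f_min=0.5, f_max=4.0):
    """Frequency (Hz) of the strongest DFT bin inside [f_min, f_max]."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]    # remove gravity / DC offset
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        freq = k * rate_hz / n
        if not (f_min <= freq <= f_max):
            continue                          # outside the assumed band
        coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                    for i, c in enumerate(centered))
        power = abs(coeff) ** 2
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq


RATE_HZ = 50.0
# Synthetic acceleration norm: gravity, a 2 Hz gait component, 15 Hz noise.
signal = [9.81
          + 1.5 * math.sin(2 * math.pi * 2.0 * i / RATE_HZ)
          + 0.3 * math.sin(2 * math.pi * 15.0 * i / RATE_HZ)
          for i in range(200)]

step_frequency = dominant_frequency(signal, RATE_HZ)
print(step_frequency)
```

A peak frequency obtained this way could then feed a stride-length model that depends on cadence, which is the kind of adaptivity the team describes.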
b) Error-State Kalman Filter: In indoor localization, the types of sensors used may differ. An EKF is a well-known way to fuse different kinds of observations. Compared with the EKF, the ESKF applies the estimation to the error state, which is numerically small. Thus, the error introduced by the linearization process is smaller, leading to a more accurate result. In the prediction step, gyroscope and accelerometer data are used to estimate the current pose. The magnetometer and the PDR result are used to correct the pose estimation. Specifically, the PDR module measures the displacement, and the magnetometer measures the current heading. The observation error is assumed to follow a Gaussian distribution when correcting the pose estimation.

c) Floor Detection and Initial Position and Pose Estimation: The Wi-Fi RSS fingerprint and the barometer are used for floor detection, initial position estimation, and position correction. RSS is susceptible to various environmental factors, e.g., concrete walls, moving humans, temperature and humidity [29]. Compared with PDR, Wi-Fi localization is less accurate. Thus, Wi-Fi RSS is mainly used for position correction and floor detection. A radio map is built from the training data, and the first few Wi-Fi detections are used to locate the initial pose. The variance of the barometer readings is used to detect a floor change by setting a threshold. The current pose is continuously matched against the Wi-Fi location and floor information. Once a significant mismatch is detected, the system resets. Data from the accelerometer and the magnetometer are used to extend positions to pose estimates by adopting the Madgwick filter.
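The barometer-based floor-change test can be sketched as a sliding-window variance compared against a threshold. In the Python example below, the window length, the threshold value, and the pressure trace are all hypothetical; the description above only states that the variance of the barometer is thresholded.

```python
from statistics import pvariance


def floor_change_flags(pressure_hpa, window=5, threshold=0.01):
    """Flag samples whose trailing window has a variance above threshold.

    `window` and `threshold` (in hPa squared) are illustrative values.
    """
    flags = []
    for i in range(len(pressure_hpa)):
        chunk = pressure_hpa[max(0, i - window + 1): i + 1]
        full = len(chunk) == window
        flags.append(full and pvariance(chunk) > threshold)
    return flags


# Synthetic trace: steady on one floor, a climb, then steady again.
lower_floor = [1013.25] * 8
stairs = [1013.25 - 0.12 * k for k in range(1, 6)]
upper_floor = [1012.65] * 8
trace = lower_floor + stairs + upper_floor

flags = floor_change_flags(trace)
print(flags)
```

The variance stays near zero while the pedestrian remains on one floor and rises while the pressure is changing, so the flags mark the transition rather than the floor itself; the floor identity would still come from another source, such as the Wi-Fi fingerprint.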
d) Particle Filter: The PF takes the trajectory estimated by the ESKF and compares it with the floor plan to regulate the distribution of particles. The floor plan images are first stored as an obstacle probability map. The floor estimation component first identifies the floor ID and an initial position guess. Each point x on the 2-D plane can be assigned a Gaussian probability distribution, where Ω_obs is the set of obstacles, x* is the location of the nearest obstacle, and d(·) is the Euclidean distance between two points.
During each update, each particle infers its position and the associated probability of hitting an obstacle. In the resampling stage, particles in unreachable areas are removed. Moreover, to account for history information, a particle is removed if its accumulated penalty within the time window exceeds a predefined threshold. New particles are generated following the distribution of valid particles. The final position is calculated as the weighted average of the particle positions. In case of system failure, i.e., when all particles hit obstacles or a wrong floor ID is reported by the floor detector, random particles are generated on the specific floor until the system converges.
2) Team imec-WAVES 2021 and 2022: Team imec-WAVES' systems for the 2021 and 2022 competitions are similar and consist of six modules. Each module has the same functionality in both versions, but some modules are implemented differently. Fig. 8 shows how these six modules and their components interact. The following paragraphs briefly describe each module. The description applies to both years unless specified otherwise.
a) EvaalAPI interface: This interface starts the EvaalAPI trial and requests the next stream of smartphone data in blocks of 0.5 s. It parses and structures the received data and passes it to the PDR module, which employs a step detection algorithm (see Section VI-D2b). If the PDR module detects one or more steps, the interface waits for the path estimation algorithm to provide a new position. If no step is detected, the interface reuses the previous position estimate. The position is sent back to the EvaalAPI server, and a new block of data is received.

b) Pedestrian Dead Reckoning:
The proposed PDR algorithm is based on [30]. It uses attitude and heading reference system (AHRS) data, that is, pitch and roll, to transform the accelerometer, gyroscope, and magnetometer data from the local (smartphone) reference frame to the horizontal plane. Step detection is performed by detecting peaks in the vertical acceleration. The heading is estimated by fusing the horizontal gyroscope and magnetometer data. The adaptive step length estimation is based on the Weinberg model and reproduced from [31]. The fusion algorithm works best when the gyroscope and magnetometer headings are initially aligned. The alignment, as well as the gyroscope calibration, is done near the start, when the smartphone is held still for several seconds. This time interval is detected by thresholding the acceleration variance.
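Peak-based step detection and the Weinberg step length model can be sketched as follows. This is only an illustration under stated assumptions: the peak threshold `min_peak`, the minimum spacing `min_gap`, and the Weinberg constant `k` are placeholder values, and the adaptive tuning of [31] is not reproduced.

```python
import math


def detect_steps(acc_vertical, min_peak=1.0, min_gap=3):
    """Indices of local maxima above min_peak, at least min_gap samples apart."""
    peaks = []
    for i in range(1, len(acc_vertical) - 1):
        a = acc_vertical[i]
        is_peak = (a > acc_vertical[i - 1]
                   and a >= acc_vertical[i + 1]
                   and a > min_peak)
        if is_peak and (not peaks or i - peaks[-1] >= min_gap):
            peaks.append(i)
    return peaks


def weinberg_step_length(acc_window, k=0.5):
    """Weinberg model: L = k * (a_max - a_min) ** (1/4)."""
    return k * (max(acc_window) - min(acc_window)) ** 0.25


# Synthetic vertical acceleration (gravity removed): one step every 8 samples.
acc = [2.0 * math.sin(2.0 * math.pi * i / 8.0) for i in range(40)]
steps = detect_steps(acc)
lengths = [weinberg_step_length(acc[max(0, s - 4):s + 5]) for s in steps]
print(steps, lengths)
```

In practice the constant `k` depends on the user and the device placement, which is why an adaptive estimate is used in the system described above.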
c) Graph database of the environment: Each year, floor plan images of the building are provided. Team imec-WAVES manually drew the walls over the floor plan using the graphical interface of the WHIPP tool [32], resulting in a line segment for each wall, and also drew over the contours of elevator shafts and staircases, resulting in a polygon for each staircase or elevator shaft. For outdoor trajectories, Google Earth was used to draw the boundaries of the pavement directly surrounding the building. The line segments and polygons are used directly by the path estimation algorithm in the 2021 system.
However, calculating intersections and solving the point-in-polygon problem ad hoc is time-consuming and limits the number of usable particles under EvaalAPI's timeout constraint. Therefore, in 2022, a 3-D graph was generated from the set of line segments of each floor. Each graph node is a possible location. Two nodes are connected by an edge if the edge does not intersect a wall and its length is shorter than a specified maximum human step length. Using the contours, each node knows whether it is a stair, elevator, or floor node and whether it is an indoor or outdoor node. The height of each floor is estimated by converting the pressure data [33] from the training log files, and a 3-D graph is then created. Finally, the floors are connected by drawing the staircases with a custom tool. The result is shown in Fig. 9.
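The edge rule for one floor of the graph (no wall crossing, length below a maximum step length) can be sketched as below. The function names and the 2-D, single-floor simplification are assumptions for illustration; node typing (stair, elevator, indoor, outdoor) and the 3-D stacking of floors are left out.

```python
from math import hypot


def _orient(p, q, r):
    """Signed area of triangle pqr (positive if counter-clockwise)."""
    return (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])


def segments_cross(a, b, c, d):
    """True if segment ab properly crosses segment cd (touching is ignored)."""
    return (_orient(a, b, c) * _orient(a, b, d) < 0
            and _orient(c, d, a) * _orient(c, d, b) < 0)


def build_graph(nodes, walls, max_step=1.0):
    """Adjacency lists: i and j are linked if they are closer than max_step
    and the straight edge between them crosses no wall segment."""
    edges = {i: [] for i in range(len(nodes))}
    for i, p in enumerate(nodes):
        for j in range(i + 1, len(nodes)):
            q = nodes[j]
            if hypot(q[0] - p[0], q[1] - p[1]) >= max_step:
                continue
            if any(segments_cross(p, q, w0, w1) for w0, w1 in walls):
                continue
            edges[i].append(j)
            edges[j].append(i)
    return edges


nodes = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]
walls = [((0.75, -1.0), (0.75, 1.0))]  # a wall between the last two nodes
graph = build_graph(nodes, walls, max_step=0.6)
print(graph)
```

Precomputing the adjacency once moves the intersection tests out of the real-time loop, which is the motivation given above for replacing the ad hoc geometry checks of 2021.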

d) RSS fingerprinting:
The training data include ground truth positions at turning points. In 2021, the team used a PDR-based method to construct RSS radio maps, known from the previous competition [3]. Since this method does not work well for staircases, the Dijkstra algorithm is used with the 3-D graph in 2022 to create a path on the staircase and match this path with the PDR output.
Furthermore, several BLE beacon locations were provided in 2021. Therefore, RSS prediction [32] is used to create radio maps, which cover most of the building, in addition to the empirical radio maps covering only the training trajectories.
A WKNN fingerprint matching algorithm is used to estimate the user position. The Euclidean distance metric is used to match RSS vectors in the validation/evaluation data with RSS vectors in the radio map. RSS normalization is used to avoid device mismatch [34]. The weighted centroid of the k best matches is selected as the estimated position.
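A minimal WKNN matcher along these lines might look as follows. The inverse-distance weighting and the small constant `eps` are assumptions (the section does not state the weighting rule), and the RSS normalization of [34] is not reproduced here.

```python
from math import sqrt


def rss_distance(a, b):
    """Euclidean distance between two RSS vectors of equal length."""
    return sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))


def wknn(radio_map, rss, k=3, eps=1e-6):
    """radio_map: list of ((x, y), rss_vector) entries.
    Returns the weighted centroid of the k closest fingerprints,
    using inverse-distance weights (an assumed weighting rule)."""
    nearest = sorted(radio_map, key=lambda e: rss_distance(e[1], rss))[:k]
    weights = [1.0 / (rss_distance(vec, rss) + eps) for _, vec in nearest]
    total = sum(weights)
    x = sum(w * pos[0] for w, (pos, _) in zip(weights, nearest)) / total
    y = sum(w * pos[1] for w, (pos, _) in zip(weights, nearest)) / total
    return (x, y)


radio_map = [
    ((0.0, 0.0), [-50.0, -70.0]),
    ((10.0, 0.0), [-70.0, -50.0]),
]
print(wknn(radio_map, [-60.0, -60.0], k=2))
```

With a query that is equally far from both fingerprints, the estimate falls midway between the two reference positions.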
e) Floor (transition) detection: Floor levels and transitions are detected by fusing data from the pressure sensor, accelerometer, and RSS fingerprinting in the Viterbi-based algorithm described in [35]. In both years, there was an outdoor environment consisting of two levels. Therefore, the team regards these outdoor levels as separate floors and extends its existing algorithm with a simple GNSS-based outdoor detector. If the GNSS coordinate lies outside the building, a large cost is added to all indoor states, and vice versa.
f) Path estimation: After each step detection, a new 3-D position is estimated by fusing information from PDR, floor (transition) detection, RSS fingerprinting, and a 3-D graph of the environment (2022) or floor plan information directly (2021). Path estimation in 2021 was performed by a PF, described in detail in [36]. It includes a reset mechanism that removes almost all particles when the PF gets stuck. New particles are sampled randomly in the neighbourhood of the previous sample mean. Also, if there are no particles on the currently estimated floor (including outdoor levels), and the floor has not changed during the last five steps, all particles are randomly resampled on that floor. The same applies to stair detections.
A new path estimation method was used in 2022, which combines the PF from 2021 with the Viterbi-based tracking from [37]. After step detection, each of the N particles searches all of its K reachable nodes and spawns a particle at each node. These N × K new particles inherit the cost of their predecessors and receive a new cost based on how well the length and angle of the edge between the new and previous node match the detected step. There are additional costs based on RSS fingerprinting, floor detection, and floor transition detection for indoor positions, similar to the PF in [36]. The particles are then resampled based on their cost, and the latest position is estimated by a clustering algorithm based on [36], in which the particle with the lowest cost is the first particle of the cluster. Lastly, in both systems, if the estimated floor is outdoors, each particle is also evaluated based on its Euclidean distance to the available GNSS coordinates.
3) Team X-LAB: The system proposed by X-lab includes the following three main stages: 1) the 2-D position estimation based on the PDR algorithm; 2) the Wi-Fi, Bluetooth, and geomagnetic information fusion and 2-D position correction; 3) the initial position estimation and floor decision making. Six types of sensor data are used throughout the pedestrian positioning process. Fig. 10 shows the framework of the proposed system. Each stage is explained in the following paragraphs.
a) PDR: The PDR algorithm consists of step detection, step length estimation, heading estimation, and movement mode recognition. Peak detection is used to detect steps, and the Weinberg method is used to estimate step length. The heading of the pedestrian is calculated from gyroscope data. The heading is corrected by a KF algorithm using geomagnetic data to solve the error accumulation problem. In Team X-Lab's system, a support vector machine is used to identify four movement modes: normal walking, walking upstairs, walking downstairs, and other movements. Some statistical characteristics (e.g., mean, max, min, and derivative) of the accelerometer and gyroscope data are extracted as features. The result of the movement mode recognition influences the selection of parameters in the other three components.
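The gyroscope/magnetometer heading fusion can be illustrated with a scalar Kalman filter. This is a sketch under simplifying assumptions, not the team's filter: the state is the heading only, and the noise variances `q` and `r` and the class name `HeadingKF` are hypothetical.

```python
from math import pi


def wrap(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + pi) % (2.0 * pi) - pi


class HeadingKF:
    """Scalar Kalman filter: gyroscope rate drives the prediction,
    the magnetometer heading is the measurement."""

    def __init__(self, heading=0.0, var=1.0, q=1e-4, r=0.05):
        self.heading, self.var, self.q, self.r = heading, var, q, r

    def predict(self, gyro_z, dt):
        self.heading = wrap(self.heading + gyro_z * dt)
        self.var += self.q  # integration of gyro noise inflates uncertainty

    def correct(self, mag_heading):
        gain = self.var / (self.var + self.r)
        innovation = wrap(mag_heading - self.heading)  # shortest angular difference
        self.heading = wrap(self.heading + gain * innovation)
        self.var *= (1.0 - gain)


kf = HeadingKF(heading=0.0)
for gyro_z, mag in [(0.10, 0.02), (0.10, 0.04), (0.10, 0.05)]:
    kf.predict(gyro_z, dt=0.1)
    kf.correct(mag)
print(kf.heading, kf.var)
```

Wrapping the innovation keeps the correction on the shortest arc, so a magnetometer reading just past the ±π boundary does not cause a full-turn jump.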

b) Information fusion and 2-D position correction:
There are two phases in this stage: feature map construction in the preparation phase and 2-D position correction in the real-time positioning phase.
In the preparation phase, the Wi-Fi, Bluetooth, and geomagnetic fingerprint maps were built. Only a small amount of reference position information is provided in the training data. Therefore, the pedestrian position between two reference points is calculated with the PDR algorithm of the previous stage to obtain more fingerprint information. To improve the accuracy of fingerprint localization and the efficiency of the localization algorithm, PCA is used to eliminate fingerprint information that is not helpful for localization.
In the real-time positioning phase, particle filtering is used to correct the pedestrian position. First, the particles are initialized and the position of the PDR estimate is sampled. Then, the weight of each particle is calculated from the feature maps and the digital map. Finally, the resampling process is used to obtain the corrected pedestrian positions. The digital map was created from a map of the buildings provided by the organizers. It contains passable and impassable areas.
c) Initial position estimation and floor decision making: The initial position is provided by fingerprint matching, specifically Wi-Fi, Bluetooth, and geomagnetic information matching. The barometer data are used to calculate the relative height, which is combined with the result of the movement mode recognition to determine the floor.
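The conversion from barometer readings to a relative height and a floor index can be sketched as follows. The international barometric formula is standard, but the uniform floor height of 4 m and the function names are assumptions for this example, and the fusion with movement mode recognition is omitted.

```python
def pressure_to_height(p_hpa, p_ref_hpa):
    """Height in metres relative to the level where p_ref_hpa was measured,
    using the international barometric formula."""
    return 44330.0 * (1.0 - (p_hpa / p_ref_hpa) ** (1.0 / 5.255))


def floor_from_height(rel_height_m, floor_height_m=4.0, start_floor=0):
    """Map a relative height to a floor index, assuming equal floor heights."""
    return start_floor + round(rel_height_m / floor_height_m)


p_start = 1013.25            # pressure at the initial position (hPa)
p_now = 1012.77              # about 0.48 hPa lower, i.e. roughly 4 m higher
height = pressure_to_height(p_now, p_start)
print(height, floor_from_height(height))
```

Using the pressure at the initial position as reference cancels the slowly varying weather component over the duration of a trial, which is why only the relative height is used.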

VII. TRACK 4: FOOT-MOUNTED IMU
This section describes track 4, which was based on the use of foot-mounted IMU and took place in 2021 and 2022.

A. Track Description
The objective of track 4 is to estimate the position from data gathered by multisensor equipment mounted on the foot of an actor walking a predefined path. Since track 4 is focused on the real use cases of pedestrian navigation, scale and scenario were considered.
Scale stands for the size of the experimentation area: we organized IPIN competitions in a large shopping mall around Nantes in 2021 and in the train station of the same town in 2022. This allowed us to realize trials longer than 30 min and longer than 1 km over areas several hundreds of meters wide and long.
Track 4 gives importance to how representative a scenario is. This means that we design our data collection scenario by thinking about the real movement of a specific citizen in a specific place: this year, a tourist visiting an automotive museum. For example, navigating through a crowd inside a wide area is quite common for citizens, and even if this is not the easiest context for positioning systems and algorithms, it deserves to be tackled.
The target environments exhibited features such as multiple floor levels, stairs, escalators, and lifts, all of which are commonly found in public places. This is a perfect mix of a citizen's daily-life scenario and scientific challenges for a high-level competition.
Sensor data were gathered through the ULISS sensor device [10], which delivers 3-D inertial data (accelerometer, gyroscope), 3-D magnetic data, pressure data, and GNSS data. Fig. 11(a) and (b) shows, respectively, an overview of ULISS and its embedded sensors. Table IV presents the technical specifications of all the sensors embedded in the ULISS. More detail can be found in the 2021 call for competition [38].
In compliance with the EvAAL framework, in track 4, the accuracy score is computed as the 75th percentile of the sample error. As in track 3, the sample error is defined as the sum of the 2-D horizontal error plus 15 m per floor misdetection (absolute difference between the current floor and the estimated floor). The 2-D horizontal error is the Euclidean distance between the estimated horizontal position computed by the competitor and the ground truth position of the reference points (85 points in 2021 and 90 in 2022).
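The scoring rule above can be written down directly. The sketch below assumes positions are given as `(x, y, floor)` tuples in metres and uses the nearest-rank definition of the percentile; the exact percentile interpolation used by the organizers is not stated in this section, so that choice is an assumption.

```python
from math import ceil, hypot


def sample_error(estimate, truth, floor_penalty_m=15.0):
    """2-D horizontal error plus 15 m per floor of misdetection.
    estimate and truth are (x, y, floor) tuples."""
    horizontal = hypot(estimate[0] - truth[0], estimate[1] - truth[1])
    return horizontal + floor_penalty_m * abs(estimate[2] - truth[2])


def percentile_75(errors):
    """Nearest-rank 75th percentile (assumed percentile definition)."""
    ordered = sorted(errors)
    rank = ceil(0.75 * len(ordered))
    return ordered[rank - 1]


estimates = [(3.0, 4.0, 1), (1.0, 0.0, 0), (0.0, 2.0, 0), (0.0, 0.0, 0)]
truths = [(0.0, 0.0, 0)] * 4
errors = [sample_error(e, t) for e, t in zip(estimates, truths)]
print(errors, percentile_75(errors))
```

The first estimate is 5 m off horizontally and one floor off, so its sample error is 20 m; a single floor mistake therefore weighs as much as a large horizontal error.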

B. Environment and Measurement Setup 2021
In 2021, the competition trajectory was inside the Atlantis shopping mall in Nantes (one of the biggest in the west of France). Some of the different areas of the Atlantis shopping mall used are illustrated in Fig. 13. See the 2021 awards presentation at [39] for additional details.

C. Environment and Measurement Setup 2022
In 2022, the competition data was collected on 12th July in the railway station of Nantes.
A full-body wearable device developed by Xsens called Awinda was used to provide ground truth. Originally designed for motion capture, it was able to generate a satisfactory ground truth after intensive post-processing. The advantage of using Awinda for ground truth is that we can obtain ground truth without having to walk on predefined waypoints on the ground as in previous years. The main disadvantage is that both systems (ground truth and competition sensors) need to work perfectly at the same time. Any problem or bug in either system means that a new data collection is needed.
Each of the two scoring trials includes a walk of about 1.5 km in 25 min. Approximately 95% of the walk was done indoors. Only active walking over four different floors, without elevators or escalators, was used. This is one of the limitations introduced with the new ground truth system. Fig. 14 illustrates a bird's eye view of the ground truth for the first scoring trial. Some of the different places¹ of Nantes railway station are shown in Fig. 15. Additional detail can be found in the Call for Competition relative to track 4 [40] and the awards presentation [41].

D. Description of Competitors (Foot-Mounted IMU)
1) Team X-lab: The solution presented by X-lab consists of three steps plus a correction model.
a) Step 1: The IPIN2022 competition evaluates the algorithm in real time, so the motion mode of the tester is unknown, and the appropriate threshold for the zero-velocity detector varies between modes. No single threshold can be set in the algorithm to fully exploit the advantages of ZUPT. The first step is therefore an adaptive zero-velocity detector that depends on the tester's motion mode. The team's algorithm is based on the fact that the more violent the tester's motion, the larger the threshold of the zero-velocity detector should be, so that still phases that are as short as possible can be detected. Because missed detections are more easily forgiven than false detections, the adaptive zero-velocity detector uses a relatively small threshold.
b) Step 2: The biggest shortcoming of ZUPT is the severe heading drift and the lack of a reliable heading observation. The heading drift can be corrected to some extent by correcting the pitch and roll angle errors through the coupling between the three IMU attitude axes. The pitch and roll error observations are built from the difference between the pitch and roll angles calculated by the INS and the horizontal attitude estimated from the accelerometer output during the still phase. At the same time, the difference between the currently estimated heading angle and the average heading angle of the previous five sampling points in the still phase is used as a heading angle error observation to improve heading estimation.
c) Step 3: Since height error is considered in the IPIN2022 competition, expressed as floor error, the difference between the height calculated from the barometer at the current moment and that at the initial time is used to obtain the relative height, which assists in determining floor changes.
¹ SIMON BÉNÉTEAU / MAGENTA FILMS, for the drone view.
d) Correction model: The escalator and elevator scenario was introduced in the IPIN2021 competition. A moving platform correction model was proposed in the system developed by Team X-lab to solve the zero-velocity false detections that occur when the tester stands still on an escalator or in an elevator. The motion characteristics of escalators and elevators are also taken into account to constrain the velocity error and improve the positioning accuracy in the proposed moving platform correction model [42].
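The idea behind the adaptive zero-velocity detector of Step 1 (a threshold that grows with the intensity of the motion) can be sketched as follows. The intensity measure, the linear threshold rule, and all constants are assumptions made for illustration; the team's actual detector is not specified at this level of detail.

```python
from statistics import pstdev


def adaptive_zero_velocity(gyro_norm, acc_norm, window=20, base=0.3, gain=0.1):
    """Flag samples as 'still' when the gyroscope norm falls below a threshold
    that increases with recent motion intensity (std of the acceleration norm)."""
    flags = []
    for i in range(len(gyro_norm)):
        start = max(0, i - window + 1)
        intensity = pstdev(acc_norm[start:i + 1])
        threshold = base + gain * intensity
        flags.append(gyro_norm[i] < threshold)
    return flags


gyro = [0.1, 0.1, 5.0, 0.35]       # rad/s
acc = [9.8, 9.8, 30.0, 9.8]        # m/s^2, third sample is a foot strike
print(adaptive_zero_velocity(gyro, acc))
```

The last sample (0.35 rad/s) is above the base threshold of 0.3 but is still accepted as a still phase because the preceding violent motion raised the threshold, which is the behaviour described in Step 1.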

VIII. TRACK 6: SMARTPHONE ON VEHICLE (2022)
This section describes track 6, which was based on the use of smartphones attached to a vehicle and took place in 2022.

A. Track Description
The goal of this track is to evaluate the performance of different integrated navigation solutions based on a vehicle-mounted smartphone, which includes GNSS, accelerometer, gyroscope, and magnetometer, among other sensors. The test route includes an outdoor scenario with an unobstructed satellite view, one with a partially obstructed view, and an indoor scenario without a satellite view. The outdoor scenario with an unobstructed view accounts for 40% of the total test route and is not considered for computing the evaluation score.
The test route of track 6 (see Fig. 17) includes an outdoor scenario with an unobstructed satellite view, an attenuation scenario with a partially obstructed view, and an indoor scenario without a satellite view. In the test process (see Fig. 16), there were several long interruptions of the GNSS signal and an irregular test route was adopted. Besides the navigation measurements derived from the sensors installed in the smartphone, there was no external aid information and no prior knowledge of the test route. The competitors could only rely on smartphone data to calculate the vehicle position.

B. Environment and Measurement Setup
In this off-site track, all data for testing and scoring were provided by the organizers before the IPIN conference. The teams in the competition can calibrate their algorithmic models with several databases that contain readings from the sensors of the smartphone mounted on the vehicle and some ground truth positions. Then, each team competes using additional database files, for which the positions have to be estimated without knowing the ground truth. Moreover, to prevent the use of map-matching methods, an irregular driving route was chosen.
The raw multisensor data, which include all the signals captured by the smartphone in real time in the vehicle scenario, were recorded using a Huawei Mate 20 smartphone. The smartphone was attached to the car dashboard throughout the test process to record the motion measurements.
The test area of track 6 is a typical urban road environment. A single test process takes about 1 h and the test route has four phases: a static initial alignment phase (about 5 min), an open environment phase (about 20 min), an obstructed environment phase where the GNSS signal is weakened or blocked by the nearby buildings or trees (about 25 min, during which the GNSS positioning results will often be disrupted), and a no GNSS signal phase (underground parking lot, about 10 min, no GNSS positioning results). The test vehicle drives in different ways, such as going straight, turning, reversing, and parking.
Fig. 18. Overview of adopted system.
During the data collection phase, the same vehicle, smartphone, and smartphone installation method are used for all the testing and scoring trials. The smartphone is securely fixed to the vehicle body as shown in Fig. 16. The coordinate systems of the vehicle body and of the sensors differ, so the mounting angles are required.

C. Description of Competitors (Smartphone on Vehicle)
1) Team WHU-GD: As shown in Fig. 18, the system uses a graph-optimization method to fuse GNSS, IMU, and magnetometer information. The GNSS provides absolute position constraints. Moreover, the magnetic field heading helps reduce heading drift in the long term. However, the GNSS signal is often disrupted in field testing, and the raw GNSS measurements are not accessible. Therefore, enhancing the relative positioning capability based on the IMU with motion model constraints is crucial. The motion model includes ZUPT and NHC.
This approach estimates the following: 1) the installation parameters $R_{vb}$ and $l_{b}$, which are the rotation and translation between the vehicle and body frames; 2) $m_{n}$, which is the magnetic field vector in the navigation frame; 3) $S_{i}$, which is the system state of keyframe $i$. Keyframes are chosen at a fixed time interval of 1 s.
The state $S_{i}$ comprises $t_{nb,i}$, $R_{nb,i}$, and $v_{nb,i}$, which represent the position, rotation, and velocity of the body frame (aligned with the IMU) in the navigation frame, together with $b_{a,i}$, $b_{g,i}$, and $b_{m,i}$, which represent the biases of the accelerometer, gyroscope, and magnetometer, respectively.
The structure of the graph optimization consists of the factors shown in Fig. 19 and described as follows.
a) Preintegration Factor: This factor establishes the relative pose and velocity constraint between two consecutive keyframes. Because of the low accuracy of the built-in IMU in the mobile phone, a simplified INS mechanization is used to enhance efficiency without compromising accuracy. Specifically, the impact of the angular rate and sculling effect due to the Earth's rotation and motion speed is neglected.
b) Zero-Velocity Factor: ZUPT is used to reduce the accumulation of velocity errors. Because the velocity error of a consumer-grade IMU grows rapidly, ZUPT is essential to improve the accuracy of the relative position when the GNSS signal is unavailable. This factor gives a prior probability to the velocity when the zero-velocity state is identified. The raw output of the IMU over a short period (2 s in this approach) is used for stationary detection.
c) Nonholonomic Factor: The nonholonomic factor is based on the assumption that the vertical and lateral velocities in the vehicle frame are nearly zero. Because the mobile phone installation in the field testing is uncertain, the installation parameters need to be estimated online. The initial value is estimated when the velocity is high enough. The installation parameters are then updated online during the testing to account for changes in the installation.
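The nonholonomic constraint can be illustrated by the residual it penalizes: the lateral and vertical velocity components after rotating the body-frame velocity into the vehicle frame. The sketch below assumes a yaw-only mounting rotation for brevity (the system above estimates a full rotation and a translation), and the function names are hypothetical.

```python
from math import cos, pi, sin


def to_vehicle_frame(v_body, mount_yaw_rad):
    """Rotate a body-frame velocity (x, y, z) into the vehicle frame,
    assuming the phone is only rotated about the vertical axis."""
    c, s = cos(mount_yaw_rad), sin(mount_yaw_rad)
    x, y, z = v_body
    return (c * x - s * y, s * x + c * y, z)


def nhc_residual(v_body, mount_yaw_rad):
    """Lateral and vertical velocity in the vehicle frame; both should be
    close to zero for a car that neither slides sideways nor jumps."""
    _, lateral, vertical = to_vehicle_frame(v_body, mount_yaw_rad)
    return (lateral, vertical)


# Vehicle drives forward at 10 m/s; the phone is mounted rotated by 90 degrees,
# so the motion appears along the body y-axis.
v_body = (0.0, -10.0, 0.0)
print(nhc_residual(v_body, pi / 2.0))   # correct mounting angle
print(nhc_residual(v_body, 0.0))        # wrong mounting angle
```

With the wrong mounting angle the full forward speed shows up as lateral velocity, which is why the installation parameters must be estimated before the constraint is useful.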
d) Magnetic Factor: The magnetic field vector constrains the absolute heading during the testing. It is important to note that the magnetic field vector is very noisy in some places, such as inside buildings or near electronic devices. In this context, the χ²-test is adopted to eliminate the effect of these noise sources.
e) GNSS Factor: The GNSS position is the only source that can provide absolute position information in this competition. Because the GNSS signal is prone to interruption or non-Gaussian noise, the χ²-test is used to determine whether the current GNSS position is valid. Taking advantage of the flexibility of the graph-optimization architecture, the relative distance of the GNSS positions between two keyframes is used first, and the absolute position constraint is applied after several seconds. This strategy helps increase the reliability of the keyframe position, which is used in the χ²-test.
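A χ²-test on a 2-D GNSS position can be sketched as a gate on the squared Mahalanobis distance between the GNSS fix and the predicted keyframe position. The isotropic covariance and the 95% confidence level are assumptions for this example; the section does not state the covariance model or the significance level used by the team.

```python
# 95% quantile of the chi-square distribution with 2 degrees of freedom.
CHI2_95_2DOF = 5.991


def gnss_is_valid(gnss_xy, predicted_xy, sigma_m, threshold=CHI2_95_2DOF):
    """Accept the GNSS fix if its squared Mahalanobis distance to the predicted
    position is below the chi-square threshold (covariance = sigma_m^2 * I)."""
    dx = gnss_xy[0] - predicted_xy[0]
    dy = gnss_xy[1] - predicted_xy[1]
    d2 = (dx * dx + dy * dy) / (sigma_m ** 2)
    return d2 <= threshold


print(gnss_is_valid((1.0, 1.0), (0.0, 0.0), sigma_m=1.0))   # consistent fix
print(gnss_is_valid((10.0, 0.0), (0.0, 0.0), sigma_m=2.0))  # outlier
```

The gate is only as reliable as the predicted position it compares against, which is consistent with the strategy above of first applying relative GNSS constraints to stabilize the keyframe position.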
f) General notes: This system adopts a sliding window strategy to ensure the system can run in real-time.
The marginalization strategy is used to preserve the information from deleted keyframes.
The parameters of this approach are tuned based on the given training dataset.
[...] as Wi-Fi or Bluetooth. However, with modern UWB technology, signals can be transmitted at higher bandwidths, enabling a much higher spatial resolution from which complex propagation conditions can be extracted, such as absorption, reflection, diffraction, and scattering [43]. While UWB is progressively being integrated into consumer devices but is not yet widespread, current progress in development and standardization makes it likely that it will be ubiquitous in the near future. This allows for low-cost ad hoc positioning.
To leverage the benefits of the high spatial resolution, we can make use of the CI. For sufficiently high bandwidths, the CI roughly corresponds to the complex-valued CIR. Many algorithms have been investigated that exploit the CIR to extract spatial information in order to enhance the positioning performance. These include ToF error mitigation [44], which uses the CIR to estimate an environment-specific ToF error; fingerprinting [45], which exploits the raw CIR as location-specific information; and C-SLAM [46], which exploits the multipath components included in the CIR. Therefore, besides ToF estimates, we provide the raw CIRs, which allow the localization accuracy to be enhanced.
The challenge is divided into two parts. In the first part, the data used for training and testing originate from the same environment setup. In the second part, we made some changes to the environment setup (i.e., we moved mobile metallic objects) in order to assess the robustness of the algorithms to environmental changes. For the second scenario, we do not provide training data but only test data. The trajectories we use for testing in the second scenario stay within a similar area as the one used in the first scenario.
In 2022, we investigated the generalization to a different agent for collecting data. In a typical industrial application setting, data points are easily collected and labelled by automated guided vehicles, but the tracking targets can be other agents, such as persons. The different agents have various influences on the signal, due to, e.g., the shadowing of a person or the reflections of a robot. Also, the movement patterns and the height of the radio unit are different, which might also influence the performance. The majority of the provided data for validation and training were collected by a mobile robot, while the evaluation is based on the tracking of a worker in an industrial setting.
In both years, the ground truth of the transmitter positions was collected using a millimeter-accurate Qualisys motion tracking system. The data were collected and synchronized by an NTP server and preprocessed (corrupted data points were removed, and the RF and positioning reference data were synchronized).

B. Environment and Measurement Setup (2021)
The environment consists of an area of ≈ 300 m², partially enclosed by reflecting walls (consisting of the walls of the measurement hall, including metal gates and artificially added reflector/absorber wall elements) and various metal objects that are typical of industrial indoor environments, e.g., industrial vehicles or metal shelves.
2) The second scenario presents a modification of the first scenario. In this setup, clutter elements within the environment (e.g., forklift, van, etc.) were moved, which led to a slightly different propagation scenario. The goal of this scenario is to test whether the models submitted by the competitors overfit the previous environmental setup and fail to generalize well to changes in the environment. Therefore, the training dataset does not include any trajectory collected within this modified scenario, which is only covered by trajectories within the test set.

C. Environment and Measurement Setup (2022)
The environment consists of a warehouse area of ≈ 1200 m² partially enclosed by reflecting walls (consisting of the walls of the warehouse, including metal gates). The environment contains various metal objects (e.g., industrial vehicles or metal shelves). Fig. 21 shows a picture of a part of the warehouse. Receiving anchors are placed around the recording area at about 1.5 m height. The transmitter device is carried by the mobile agent/tracking target and regularly transmits UWB signals received by the anchors. In the data collection phase, it is attached to a mobile robot. In the evaluation phase, it is carried as a handheld by a human/worker. An exemplary and representative evaluation/experiment dataset for adjusting models was provided.

D. Description of Competitors (CIR in Warehouse)
1) Team imec-WAVES (2021): The core of imec-WAVES' localization solution is 1) distance estimation between each tag and anchor, 2) range correction through a regression model, and 3) a PF for localization. Fig. 22 shows the individual steps of the system. A custom ranging algorithm provides range estimates for each captured CIR. A first pass of a PF provides an initial approximation of the user's location throughout time, from which their motion trajectory is calculated. Erroneous range estimates are identified by examining time-range plots for each tag-anchor pair. At this stage, a set of predictors serves as input for the regression model. The regression outputs a scalar distance correction that is applied to the original range estimate. In training, this reduced the MAE from 24 down to 3 cm (excluding uncorrectable estimates). A second PF utilizes the updated range estimates to obtain a better location estimate. Finally, a smoothing step in conjunction with position interpolation predicts the location at each requested timestamp.
c) Tracking: In this competition, tracking relied on the PF algorithm, which generally achieves better accuracy than Kalman-framework filters [47]. Only the 2-D coordinates $P_{MU}$ of the moving user (MU) are considered, which are updated via the constant velocity motion model of [48], where $v_{MU}$ denotes the 2-D velocity of the MU, $\Delta t$ is the time difference between timestamps $t$ and $t + 1$, and $n_{v}$ represents the Gaussian velocity errors. In total, 1000 particles were utilized to generate the proposed likelihood of the MU locations within the targeted area. The weights were updated via the PDF of the ToF estimation errors. However, in the case of NLOS propagation, the ToF estimates may have large offsets due to incorrect threshold decisions. To better quantify the ToF statistical error, its histogram was fitted to three widely used distributions, namely, the Gaussian, Laplace, and t location-scale distributions [49]. Fig. 23 shows the histogram fitting for these three distributions and their goodness-of-fit via QQ plots. Because it handles heavier tails, the t location-scale distribution yields the straightest line in the QQ plots, which indicates that the ToF estimation errors are best described by a t location-scale distribution.
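The weight update with a t location-scale error model can be sketched as follows. The density formula is the standard t location-scale PDF; the parameter values (`mu`, `sigma`, `nu`), the single-anchor update, and the function names are assumptions for illustration rather than the fitted values used in the competition.

```python
from math import gamma, hypot, pi, sqrt


def t_location_scale_pdf(x, mu, sigma, nu):
    """PDF of the t location-scale distribution with location mu,
    scale sigma, and nu degrees of freedom."""
    z = (x - mu) / sigma
    coeff = gamma((nu + 1.0) / 2.0) / (sigma * sqrt(nu * pi) * gamma(nu / 2.0))
    return coeff * (1.0 + z * z / nu) ** (-(nu + 1.0) / 2.0)


def update_weights(particles, weights, anchor, measured_range,
                   mu=0.0, sigma=0.1, nu=3.0):
    """Multiply each particle weight by the likelihood of the range error
    (measured minus predicted range), then normalize."""
    updated = []
    for (x, y), w in zip(particles, weights):
        predicted = hypot(x - anchor[0], y - anchor[1])
        updated.append(w * t_location_scale_pdf(measured_range - predicted,
                                                mu, sigma, nu))
    total = sum(updated)
    return [w / total for w in updated]


particles = [(1.0, 0.0), (2.0, 0.0)]
weights = update_weights(particles, [0.5, 0.5], anchor=(0.0, 0.0),
                         measured_range=1.0)
print(weights)
```

Compared with a Gaussian likelihood, the heavier tails keep a small but non-negligible weight on particles that disagree with an outlying NLOS range, so a single bad measurement does not collapse the particle set.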
The same implementation and parameters are used in both instances of the PF in the systems developed by team imec-WAVES.
2) Team imec-WAVES (2022): In 2022, the core of the localization solution remains similar to the approach used by Team imec-WAVES during the IPIN 2021 competition. The main difference is that the EvaalAPI interface was not used in 2021, whereas the solution presented by imec-WAVES in 2022 had to run in real time.
It consists of the following.
1) Distance estimation between each tag and anchor.
2) Range correction through a regression model.
3) Two instances of a PF for localization.
Fig. 24 shows the individual steps of the system. A custom ranging algorithm provides range estimates for each captured CIR. A first pass of a PF provides an initial approximation of the user's location throughout time, from which their motion trajectory is calculated. A set of predictors serves as input for the regression model. The regression outputs a scalar distance correction that is applied to the original range estimate. A second, final PF utilizes the updated range estimates to obtain a better location estimate. The location estimate at each requested timestamp is calculated by using the prediction step of the final PF.
a) Ranging algorithm: A threshold-based algorithm is used to calculate the range or ToF. The threshold is calculated from the noise, which is determined heuristically after partitioning the CIR into a noise region, a region of interest that contains the first path component, and a region with only multipath components and noise. Algorithm parameters are obtained through optimization on the training data. A fixed bias correction term is obtained from the median ranging error of each anchor's training data range estimates. The term is subtracted immediately after the ranging step. Negative range results are dropped. Previous range estimates are not used.
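A threshold-based first-path detector of this kind can be sketched as follows. This is only an illustration under stated assumptions: the noise region is taken to be the first n_noise bins, the threshold is the noise mean plus k noise standard deviations, and bin index 0 is assumed to coincide with the transmission time. The names n_noise, k, bin_s, and bias_m are hypothetical parameters, not the team's actual ones.

```python
import statistics

C = 299_792_458.0  # speed of light in m/s


def first_path_range(cir, n_noise, k, bin_s, bias_m=0.0):
    """Return a debiased range in meters, or None if no valid first path is found.

    cir     : CIR magnitudes, one value per bin
    n_noise : number of leading bins treated as the noise region (assumption)
    k       : threshold factor applied to the noise standard deviation (assumption)
    bin_s   : bin duration in seconds
    bias_m  : fixed per-anchor bias correction in meters
    """
    noise = cir[:n_noise]
    threshold = statistics.mean(noise) + k * statistics.pstdev(noise)
    for i in range(n_noise, len(cir)):
        if cir[i] > threshold:
            rng_m = C * i * bin_s - bias_m
            # Negative ranges are dropped, as in the description above.
            return rng_m if rng_m >= 0.0 else None
    return None
```

Each call uses only the current CIR, which mirrors the statement that previous range estimates are not used.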
b) Regression model: The correction model uses Gaussian process regression with a constant basis function and a Matérn 5/2 kernel. Training is performed with fivefold cross-validation to reduce overfitting. Predictors are calculated from the following.
1) The original CIR measurement (anchor number, first path bin power, and maximum bin power).
2) The location estimate (AoD from the anchor, compatibility with distance estimate, X- and Y-coordinates).
3) The motion trajectory (velocity, direction, and AoA on the tag).
Due to the real-time character of the data in 2022, trajectory estimation was much harder. In fact, predictors used in 2021, such as turning rate, proved too unreliable this time. Furthermore, the regression model could only be trained with data that contained global timestamps, as these timestamps are needed to calculate the motion trajectory. As a result, the largest available set of training data could not be used to train this model.
c) Tracking: A PF was used to track the agent. Only the X- and Y-coordinates of the MU are considered. The particles are updated through a constant velocity motion model, using nonadditive process noise. The first PF contains 700 particles, with a measurement noise of 0.17 m and a process noise of 10 m/s². The second PF contains 2000 particles, with a measurement noise of 0.02 m and a process noise of 15 m/s². The measurement likelihood function used to update the particles relies on a PDF fitted to the range estimation errors from the training data. Fig. 25 shows the QQ plot of each distribution fit. For the first filter, the PDF is fitted using a stable distribution on the debiased estimates. In the second filter, a t location-scale distribution is fitted to the ranges after correction by the regression model.
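For reference, the Matérn 5/2 covariance named for the regression model in b) has the closed form k(r) = σ_f² (1 + √5 r/ℓ + 5r²/(3ℓ²)) exp(−√5 r/ℓ), where r is the Euclidean distance between two predictor vectors. The sketch below evaluates it; the length scale ℓ and signal standard deviation σ_f are placeholder hyperparameters, since the values used by the team are not given.

```python
import math


def matern52(x1, x2, length_scale=1.0, sigma_f=1.0):
    """Matern 5/2 covariance between two predictor vectors of equal length."""
    r = math.dist(x1, x2) / length_scale
    s = math.sqrt(5.0) * r
    # s * s / 3 equals 5 r^2 / (3 l^2) after scaling by the length scale.
    return sigma_f ** 2 * (1.0 + s + s * s / 3.0) * math.exp(-s)
```

The kernel gives twice-differentiable sample functions, which makes it less smooth than a squared-exponential kernel and often a better match for measured correction curves.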
3) Team SPSC: Team SPSC used a two-step method similar to the algorithm presented in [50], [51]. First, a snapshot-based parametric channel estimation and detection algorithm extracts the delays and corresponding amplitudes of multipath signal components from the received baseband signal. Second, a sequential estimation algorithm estimates the state of the mobile agent by using the delays and amplitudes as measurements. More specifically, the sequential estimation algorithm jointly performs probabilistic data association and estimation of the mobile agent state together with all relevant model parameters, employing the SPA on a factor graph. It adapts online the time-varying component SNR as well as the detection probability of the LOS component. The concept of probabilistic data association, together with the adaptation of the LOS detection probability, enables the algorithm to solve the nonlinear positioning problem and mitigate NLOS situations, while still offering an execution time on the order of milliseconds. In the following, the probabilistic system model of Team SPSC's algorithm is briefly discussed.
a) Channel Estimation and Detection Algorithm: The channel estimation and detection algorithm presented in [51, Supplementary Material] was applied to the baseband signal vector at each time n and for each anchor j independently. It provides a measurement vector $z_n^{(j)}$ containing $M_n^{(j)}$ measurements. Each measurement $z_{n,m}^{(j)}$ contains a distance measurement $d_{n,m}^{(j)}$ and a normalized amplitude measurement $\hat{u}_{n,m}^{(j)}$.

b) SPA-based Sequential Estimation Algorithm:
The components of the measurement vector $z_n^{(j)}$ are subject to data association uncertainty, i.e., it is not known which measurement originates from the LOS, from multipath, or from a false alarm. Based on the concept of probabilistic data association, an association variable is defined as

$$a_n^{(j)} = \begin{cases} m, & \text{if } z_{n,m}^{(j)} \text{ is the LOS measurement in } z_n^{(j)} \\ 0, & \text{if there is no LOS measurement in } z_n^{(j)}. \end{cases} \qquad (4)$$

It differentiates between the conditional likelihood functions for LOS and NLOS measurements, which, for the distance measurements $d_{n,m}^{(j)}$, are given as a Gaussian PDF with a mean value that is geometrically related to the agent position $p_n$ and a uniform PDF, respectively. The system jointly performs sequential estimation of the amplitude states $u_n^{(j)}$, which are assumed to be independent per anchor. The corresponding conditional amplitude likelihood functions are given as a Rician PDF and a Rayleigh PDF for LOS and NLOS measurements, respectively. The LOS detection probability, which occurs as part of the data association prior and represents the probability that there is a LOS component per time step and anchor, is modeled as the product $p_D(u_n^{(j)})\, q_n^{(j)}$ of the amplitude-related detection probability $p_D(u_n^{(j)})$ and a prior LOS probability $q_n^{(j)}$. The latter is modeled discretely, as a first-order Markov process. The likelihood functions, together with the prior PDF of the data association variable, define the joint pseudolikelihood function $g_z$ of the measurements $z_n^{(j)}$.
The agent state is described by the state vector $x_n = [p_n^T \; v_n^T]^T$, which is composed of the 2-D agent position $p_n$ and velocity $v_n$. The agent motion, i.e., the state transition PDF $\Upsilon(x_n \mid x_{n-1})$, is modeled by a linear, constant velocity, and stochastic acceleration model with the standard deviation set to 1/3 of the mean step width of the mobile agent. The state transition PDF of the normalized amplitudes $\Phi(u_n^{(j)} \mid u_{n-1}^{(j)})$ is modeled as a Gaussian distribution with the standard deviation set to 5% of the last amplitude estimate. The elements of the first-order Markov transition matrix $\Psi(q_n^{(j)} = \omega_i \mid q_{n-1}^{(j)} = \omega_k)$, as well as the initial distributions, i.e., $f(x_0)$ and $p(q_0)$, were initialized heuristically as described in [50].
By applying Bayes' rule as well as some commonly used independence assumptions, the factorized joint posterior PDF is computed, which is visually represented by the factor graph shown in Fig. 26.
The agent state is estimated as the minimum mean-squared error estimate given as

$$\hat{x}_n^{\mathrm{MMSE}} = \int x_n \, f(x_n \mid z_{1:n}) \, \mathrm{d}x_n. \qquad (5)$$

In order to obtain (5), the marginal posterior PDF is calculated by performing message passing on the factor graph in Fig. 26 using the SPA rules. Since the integrals involved in the calculations of the messages cannot be obtained analytically, a sequential particle-based implementation is used.
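In a particle-based implementation, the minimum mean-squared error estimate reduces to a weighted mean of the particles, and the LOS and NLOS distance likelihoods are a Gaussian and a uniform PDF, as described above. The following sketch illustrates these three pieces only; it does not implement the SPA message passing. The parameters sigma and d_max (the support of the uniform PDF) are illustrative assumptions.

```python
import math


def los_likelihood(d_meas, agent, anchor, sigma):
    """Gaussian PDF of a distance measurement whose mean is the agent-anchor distance."""
    mean = math.dist(agent, anchor)
    z = (d_meas - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def nlos_likelihood(d_meas, d_max):
    """Uniform PDF on [0, d_max] for measurements not originating from the LOS."""
    return 1.0 / d_max if 0.0 <= d_meas <= d_max else 0.0


def mmse_estimate(particles, weights):
    """Weighted particle mean, the sample approximation of the MMSE estimate."""
    total = sum(weights)
    dim = len(particles[0])
    return tuple(
        sum(w * p[i] for p, w in zip(particles, weights)) / total
        for i in range(dim)
    )
```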

X. TRACK 8: 5G IN OPEN-PLAN OFFICE (2022)
This section describes track 8, which was based on 5G in an open-plan office and took place in 2022.

A. Track Description
Track 8 was dedicated to 5G positioning based on UL-TDOA, which is widely adopted in 5G products. The 2022 edition was the first time such technology was part of an IPIN competition.
The Huawei 5G system is deployed in an indoor office in the Huawei-Chengdu building. The area is about 15 m × 15 m with a ceiling height of 3.2 m. There are working tables, chairs, and partition panels in the room with heights in the 0.5-1.5 m range. Four pRRUs with known locations are mounted on the ceiling, see Fig. 27. The UE is a Huawei Mate 30 Pro terminal. The UE transmits in bursts of 80 ms. The pRRUs detect the received SRS and calculate the positioning measurements, such as RTOA and RSRP. The UE was fixed onto a trolley at a constant height of 1.2 m and moved at a speed of 0.2-0.5 m/s within the reachable area (highlighted in green). Along the walking route, the UE signals to some TRPs might be blocked by tables, partition panels, and shelves. The tables and panels are made of plywood (2-4 cm thick), and the shelves are made of sheet metal. Hence, there may exist a mixture of LOS, near LOS, and NLOS conditions.
The coordinates of the pRRUs' locations are given in Table V. The ground truth coordinates of pRRU3 are actually (12.48 m, 9.75 m); that is, its x and y coordinates are exchanged with each other in the table. This was done to introduce a coordinate error, which might happen in practice.

TABLE V COORDINATES OF THE PRRUS
1) Competition Area: For IPIN 2022, due to movement restrictions, the data set of track 8 was collected from only one indoor office instead of two independent indoor scenarios as planned. Fortunately, the measured office has diverse furniture and facilities, which create diverse channels in different locations, such as strong LOS, near LOS, and NLOS. Four datasets were measured in the office, and their routes differ in time. Competitors were encouraged to develop self-localization for UL-TDOA, including TAE estimation and pRRU selection, possibly using artificial intelligence.

B. Description of Competitors (5G in Open-Plan Office)
1) Team Mobile: The approach used by Team Mobile is mainly based on ML. Fig. 28 shows the pipeline of Team Mobile's proposed solution. First, simulated data are generated based on data statistics for the purpose of data augmentation, the training data set is built, and then CatBoost [52] is used to complete the end-to-end positioning task.
The data provided by Track 8 include the timestamps and eight features, which are the RTOA and RSRP estimated by the four pRRUs. Note that there are timing errors among the receivers in the pRRUs, called TAEs. The unknown TAEs greatly affect the accuracy of the TOAs and the position estimation results, and they also vary among the different datasets. In contrast, RSRP is more stable because the data in all data sets are collected in the same environment. For this reason, the solution provided by Team Mobile identifies the RSRP as the key feature.
To fully capture the RSRP feature, an ML-based approach is used to derive the user position. However, the labeled dataset provided by track 8 is too small. Thus, an interpolation method is applied to augment the labeled data. Then, the relationship between the RSRP received by each pRRU and the distance in the real data is quantified. The relationship between the RSRP difference and the real distance difference between pRRUs is also evaluated. As shown in Fig. 29, both are negatively correlated.
Based on these relationships, the system can generate many trajectories in a simulated experimental environment and obtain the corresponding simulated RSRP. The team can also verify the effect of the simulation method on the real dataset. For the interpolated real trajectory in the Testing_B data set, the RSRP generated by the simulation algorithm has a distribution similar to that of the real RSRP, as shown in Fig. 30.

b) Machine Learning (ML)-based Position Estimation:
When there is enough data to form the training data set, CatBoost is used to complete the end-to-end position estimation task. CatBoost is a supervised learning algorithm based on gradient boosting that offers excellent performance while reducing overfitting and the time spent on tuning. The CatBoost model takes the true coordinate $[x_t, y_t]$ as the label and the RSRPs, together with the differences of the RSRPs received by two different pRRUs, as the input vector $\mathrm{Input}_t$, which can be represented as shown in (7), where N is the number of pRRUs and $R_t^{i-1}$ is the RSRP received by the ith pRRU at time t.
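As an illustration of such an input vector, the sketch below concatenates the N RSRP values with all pairwise RSRP differences, giving N + N(N−1)/2 features (10 for the four pRRUs of track 8). The ordering of the features and the use of every unordered pair are assumptions, and the function name build_input is hypothetical.

```python
from itertools import combinations


def build_input(rsrp):
    """Build a feature vector from the RSRP values of N pRRUs at one timestamp.

    Returns the N raw RSRPs followed by the differences rsrp[i] - rsrp[j]
    for every pair i < j (assumed ordering).
    """
    diffs = [a - b for a, b in combinations(rsrp, 2)]
    return list(rsrp) + diffs
```

One such vector per timestamp, paired with the label $[x_t, y_t]$, would form a training sample for the gradient-boosting regressor.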
c) Position Estimation Correction: Using CatBoost, a position can be estimated from the RSRP. However, the RSRP has a large random noise that makes the estimation unstable. To achieve a more stable estimation, the system uses additional information to optimize the position estimate. For example, users cannot pass through obstacles (e.g., furniture) because of space constraints that can be inferred from the reachable area and the trajectory in the data set. Furthermore, due to time constraints, it is unlikely that two adjacent estimated positions are far apart. Therefore, abnormal position estimates are fixed based on the reachable area, historical position information, and motion direction.
d) Smoothing and Resampling Trajectory: With these data processing methods, a reliable and stable position estimate can be reached. However, a normal trajectory should be smooth and continuous. Therefore, a KF is used to smooth the trajectory in real time. Then, the relationship between recent time and the estimated position is fitted and resampled to obtain the results required by the competition.
2) Team TX8: In track 8 of the IPIN 2022 competition, two sets of data are given to calibrate the algorithmic models. In each dataset, the TOA and RSRP measurements are provided. As already identified by Team Mobile, high-precision positioning requires estimating the TAEs accurately, i.e., obtaining the unknown timing errors among the receivers in the TRPs. In addition, since the indoor environment is very complex, there exists a mixture of LOS paths, weak LOS paths, and NLOS paths, causing many outliers in the TOA and RSRP measurements. To achieve high-precision positioning, the outliers must be handled reasonably. For the TAE estimation, the RSRP measurements are employed. Generally, the RSRP can be expressed through a path loss model, where σ is the RSRP measurement, d is the distance between the terminal and the TRP, A and η are the model parameters, which can be determined by using the datasets, and ε is the measurement noise. In this manuscript, a Bayesian filter is employed to estimate the TAEs and the location of the terminal simultaneously. The state space model for the TAEs and location estimation can be written as in [53], with the state transition matrix

$$F = \begin{bmatrix} 1 & T & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & T & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}.$$

The measurement model uses the difference of the TOA measurements of TRPi and TRP0 and the RSRP measurement $\rho_{i,k}$ of TRPi at time k; $x_{\mathrm{TRP}i}$ and $y_{\mathrm{TRP}i}$ are the horizontal coordinates of TRPi, $A_i$ and $\eta_i$ are the model parameters of TRPi, and $v_k$ is the measurement noise at time k.
Considering the outliers in the measurements, $v_k$ is modeled as a heavy-tailed non-Gaussian noise. Since the MCC-based EKF can handle heavy-tailed non-Gaussian noise by using a robust cost function [54], the MCC-based EKF is used to estimate the TAEs and the location of the terminal based on the state space model proposed above. The workflow of the developed algorithm is shown in Fig. 31.
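The building blocks named above can be sketched under explicit assumptions. The path loss model is assumed here to be the common log-distance form, RSRP = A − 10 η log10(d) + ε, which matches the listed symbols but is not confirmed by the text. The 7 × 7 transition matrix follows the entries that survive in the text, read as a state [x, v_x, y, v_y, TAE_1, TAE_2, TAE_3] with sampling interval T; this state ordering is also an assumption. The last function shows the Gaussian-kernel weight typically used by correntropy-based filters to downweight outliers; it is not the full MCC-based EKF.

```python
import math


def rsrp_model(d, A, eta):
    """Noise-free RSRP at distance d under an assumed log-distance path loss model."""
    return A - 10.0 * eta * math.log10(d)


def distance_from_rsrp(rsrp, A, eta):
    """Invert the assumed path loss model to obtain a distance estimate."""
    return 10.0 ** ((A - rsrp) / (10.0 * eta))


def transition_matrix(T):
    """7x7 matrix: constant velocity in x and y, constant TAEs (assumed state order)."""
    F = [[1.0 if i == j else 0.0 for j in range(7)] for i in range(7)]
    F[0][1] = T  # x += T * vx
    F[2][3] = T  # y += T * vy
    return F


def propagate(F, x):
    """Apply the state transition to a state vector x."""
    return [sum(f * v for f, v in zip(row, x)) for row in F]


def mcc_weight(residual, kernel_width):
    """Gaussian-kernel weight: close to 1 for small residuals, near 0 for outliers."""
    return math.exp(-residual ** 2 / (2.0 * kernel_width ** 2))
```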
3) Team DYS-BUPT: In recent years, the DYS-BUPT team has been committed to 5G indoor positioning research to meet the positioning needs of different indoor scenes. In track 8, the DYS-BUPT solution for the indoor office scene consists of three main parts. The system block diagram is shown in Fig. 32.
a) Neural network regression position: In the data enhancement part, in order to increase the amount of data for model training and improve the generalization ability of the model, neural networks are used to fit the wireless channel propagation model, so as to systematically generate more training samples to expand the training dataset, as shown in Fig. 33.
In the later position solving part, considering that the data are RSRP values collected along a continuous motion, a single-point matching method may cause a large positioning error. The position of the dynamic target is constrained by space and time, so a recurrent neural network is adopted. Unlike traditional fingerprint positioning, this network does not rely only on the fingerprint of a single point for each location estimate. Instead, it takes into account the correlation of the RSRP measurements along the continuous trajectory, considers the space and time constraints of the motion trajectory on top of single-point matching, and relates the time and location information of the RSRP.
In the data solution part, the position at the previous time is fused with the data collected at the current time through Kalman filtering to obtain the Kalman location estimate at time k. At the same time, the system obtains the LLOP estimate at time k by solving the LLOP algorithm, and combines the two positioning methods with empirical weighting to obtain the positioning estimate of Module B.
c) Numerical weighted fusion and trajectory correction: In module C, the location estimates obtained from modules A and B are empirically weighted to achieve better results. Then, taking the actual environment into account, estimates falling in areas that cannot be reached by pedestrians, such as points outside the room, are corrected to obtain more reliable results.

XI. RESULTS
In this section, we report the overall scores for each track and its competitors in the 2021 and 2022 editions of the IPIN Competition. Table VI presents the results for the 2021 edition, while Table VII presents the results for the 2022 edition. In both editions, as usual in the IPIN competition, the reported results correspond to the third quartile of the error metric, which is the 2-D positioning error plus a floor penalty of 15 m. Each track defined a cutoff threshold for prize eligibility, i.e., teams with an error larger than the cutoff were not awarded a prize. Given the large errors obtained by some teams in 2021, very large errors were reported differently in 2022. For the 2022 edition, errors larger than three times the cutoff value are represented by > 3 × 15 in tracks 3 and 4, and by > 3 × 40 in track 6. Table VIII gives the main techniques used in the systems described within this manuscript. Inertial techniques (PDR or PDR with ZUPT) are used by all teams in tracks 3, 4, and 6. Fingerprinting is also used in smartphone-based positioning, using only Wi-Fi, BLE, or the combination of both signals and the magnetic field. Floor estimation exploits barometer information. As far as algorithms are concerned, a PF is used in Tracks 3 and 7, where it is combined with map information. However, some teams also used environmental information without combining it with PFs, see the description of teams X-LAB and WHU-GD.
Another important element from the table is that some teams participated in more than one track: X-LAB and imec-WAVES, although they use different techniques in different Tracks.
Thus, we see from the table that the tracks correspond to different solutions and there is little overlap among them. The largest overlap is between tracks 3 and 4; however, they are physically very different: track 3 is based on smartphones, which can exploit the combination of data provided by low-accuracy built-in sensors of different nature, whereas track 4 is based solely on a higher-quality IMU, which enables the integration of better inertial information in the navigation algorithms.
KF and its variants are used in tracks 3, 7, and 8.

A. Track 2
Two teams participated in Track 2 competition: team1 was SZUSCRI from Shenzhen University and Smart City Research Institute; team2 was CamLoc from Beijing University of Posts and Telecommunications.
During scoring trials, the competitors connected to the testing server, received testing images, and returned pose estimates. Each subsequent image was sent only after the pose estimate of the previous image had been received. The competitors had no indication of which images were to be used as reference points.
At the end of the scoring trial, the reference points were used to calculate position errors as a sum of two terms: a position error calculated from the Euclidean (horizontal) distance between the estimated position and the corresponding ground truth, and a floor error penalty of 15 m. The third quartile error of the best scoring trial for each team was 3.2 m for team SZUSCRI and 2.1 m for team CamLoc (see Table VII, Track 2: Camera). The trajectory of the scoring trial for the winner CamLoc is shown in Fig. 35.
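The score computation described above can be sketched as follows. Two points are assumptions of this sketch rather than statements of the text: the 15 m penalty is applied once per floor of difference between the estimated and true floor, and the third quartile is obtained by linear interpolation between order statistics. The function names are illustrative.

```python
import math


def point_error(est_xy, true_xy, est_floor, true_floor, penalty=15.0):
    """Horizontal Euclidean error plus a penalty per floor of error (assumed per floor)."""
    return math.dist(est_xy, true_xy) + penalty * abs(est_floor - true_floor)


def third_quartile(errors):
    """75th percentile with linear interpolation between sorted samples (assumed convention)."""
    s = sorted(errors)
    pos = 0.75 * (len(s) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)
```

Using the third quartile rather than the mean makes the score insensitive to the largest quarter of the errors, while still penalizing systems that are accurate only half of the time.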

B. Track 3
In 2021, a total of 16 teams registered in the competition, but only four of them were able to submit results with the EvaalAPI. This is a significant drop in participation compared with previous editions, where 11 (2020), 12 (2019), and 13 (2018) teams submitted results. Moreover, the best team scored an error of 4.4 m while the runner-up's error was 7.9 m, in line with what is expected for smartphone-based solutions according to previous on-site competitions [5]. The lower participation and the scores in 2021 reinforced the idea that the EvaalAPI was necessary to stress real-world on-site evaluation features in track 3. Fig. 36 shows the trajectory of the winner in 2021 (top plot) and a postprocessed trajectory as in previous editions (bottom plot).
In 2022, a total of ten teams registered in the competition, but only seven of them were able to start the procedure to submit results with the EvaalAPI. This means that interested teams made an effort to adopt the EvaalAPI for evaluation. The number of teams providing reliable results was in line with the previous edition, but only two teams participated in both editions. In this case, the best team scored an error of 30.1 m while the runner-up's error was 39.8 m, both above the 15 m cutoff of track 3. Therefore, none of the participating teams was eligible for the award, as even the lowest error fell short of the expectations for a smartphone-based positioning solution. Fig. 37 shows the trajectory of the best-performing trial in 2022, where we can observe very large positioning errors in several parts of the trajectory. The results of the other three participating teams are not shown, as their errors were more than three times the 15 m cutoff.
In both editions, the use of an advanced PF and/or KF is a core element for dealing with smartphone data, including Wi-Fi fingerprinting and internal measurements. This requires properly representing the information contained in the provided floorplans. 3-D graphs are used by only one team, and they seem a promising solution. In addition, the PF and KF need good strategies for setting the initial position and orientation as well as for correcting locations in case of severe deviation.
In the 2022 edition, where the number of different devices used was higher and there was no BLE infrastructure supporting indoor localization, the competitors scored a positioning error much worse than usual. This even happened to the system developed by the imec-WAVES team, which has participated in the competition for many years. This highlights the relevance of having multiple heterogeneous environments to test every single solution, as a promising indoor positioning system may not fit all environments.
Analyzing the best-performing trials (see Figs. 36 and 37), we can observe that the outputs provided by the competitors are more realistic than those provided in previous editions with off-line evaluation. First, trajectories no longer appear as perfectly drawn straight lines as in previous editions (see [4] and Fig. 36), since sensor noise is visible in the trajectories as zigzag movement, drifts, or messy trajectories in a challenging walking style. While short-term displacements can be captured (see the text IPIN between points 35-39 in 2021 and the text T3 between points 58-59 in 2022), noise and drift remain. Second, the trajectory cannot be fixed a posteriori, so a large error in the initial location can result in a large positioning error over the whole trajectory, as happened in 2022. Third, integration with other sources to fix the location in real time, such as map matching, may be more challenging, and filters like PF and KF are computationally demanding. These three elements can be seen in the simulated phone call of around 1 min performed in 2021 between points 25 and 26. The trajectory of the best trial is messy during the phone call, it is not fixed either in the short or in the long term, and it traverses some walls, whereas in a postprocessed trajectory, the path between those two key points is drawn as a straight line.

C. Track 4
In 2021, a total of three teams registered for track 4, of which only two were able to deliver results via the EvaalAPI platform. This was the first year that the EvaalAPI platform was deployed for track 4. As shown in Fig. 38, there is a large gap in the final results before and after the introduction of the EvaalAPI platform. The best score before introducing the EvaalAPI platform, when full CSV files were shared with competitors in a postprocessing mode, was 0.5 m. However, it increased to 62 m in 2021 with the EvaalAPI in a quasi-real-time mode. Although this can be partly explained by the competitors' lack of familiarity with the new platform, the main reason is the causality constraint, which does not provide access to future information and makes forward-backward adjustment impossible. Compared with other tracks, track 4 is particularly affected by this constraint due to the error accumulation of the inertial sensors. Not many absolute "resets" can be performed in track 4, especially in GNSS-denied environments.
In 2022, a total of five teams chose to compete in track 4. Four teams were able to output quasi-real-time results via the EvaalAPI. This is better than the previous edition, both in terms of the number of registered teams and the number of teams able to produce results. As we can see in Fig. 38, the winner achieved an accuracy of about 77 m. Even if it seems worse than in the previous editions, a real improvement was achieved considering the complexity of this year's trajectory, displayed in Fig. 39. We can clearly see that, compared with the ground truth pattern (in green), the competitor's estimated trajectory suffers from a continuous rotation drift, which is typical of dead reckoning algorithms. The figure shows the best 2-D trace since the introduction of the EvaalAPI. Its final score of 76.9 m, versus 61.9 m for the winner of 2021, is explained by the poor quality of the floor estimation, which led to 15 m of penalties.

D. Track 7
In 2021, a total of seven teams provided results for the challenge (see Table IX). Different approaches were investigated by the competitors: three teams relied on LOS-error mitigation approaches, three relied on PFs, and one on a C-SLAM approach. Due to the environment changes, the PF-based approaches deteriorate heavily in performance from Test 1 to Test 2, as the models are fitted to the specific environments. The EMI-based approaches do not exhibit this problem, as the environmental conditions stay similar. The same holds for the C-SLAM approach. Teams ISCAS and Waves shared first place with almost identical 75th error percentiles of 0.0896 and 0.0891 m (the error differences are within the accuracy limits of the reference system), while team SPSC took third place with an average score of 0.1594 m and a very consistent accuracy for both environments, which indicates good generalization properties. In general, the competition is advantageous for EMI approaches. In fact, the high number of anchors provides enough redundancy that relying on LOS connections yields sufficient spatial information for accurate tracking. As a consequence, no additional multipath information, which is exploited by the other approaches, is required.
In 2022, a total of four teams provided results for the challenge, as detailed in Table X. Team ISCAS achieved the highest performance with a score of 0.20 m, while team IMEC was very close with a score of 0.21 m, so that both teams share first place. Teams WHU and CUMT achieved lower accuracy, with scores of 0.51 m and 1.06 m, respectively. The results show that changing the agent between data acquisition and inference is feasible, as all competitors achieved reasonable accuracies. Teams ISCAS and IMEC have shown that very high performance can be achieved, which indicates that data-driven algorithms can generalize well to different agents. The deterioration of the results with respect to the 2021 challenge can be explained by the different agents and by the quasi-real-time processing of the data imposed by the newly introduced EvaalAPI.
Table XI summarizes their advantages and disadvantages.

A. Lessons Learned by Competitors
2) Track 3-Smartphone:
a) Team imec-WAVES: Learned that calculating wall intersections ad hoc for each particle of their PF is very time-consuming. Because of this and the limited time to upload a new position, they could not use many particles. This made it hard for the PF to recover from mistakes, e.g., turning into the wrong room. However, the reset mechanism worked well, so they still obtained good results. The 3-D graph solved this problem and allowed them to use more particles. In 2022, some new difficulties were introduced: some smartphones did not have a magnetometer and/or barometer, no beacon locations were provided, and the quality of the RSS fingerprints was worse than in 2021. The SmartPDR algorithm relies on the magnetometer, and no fallback PDR algorithm was available; thus, one of the three trials failed immediately. Because of the possibility of a missing barometer and the poor RSS fingerprinting performance, the team did not put confidence in (i.e., gave low weights to) the floor (transition) detection and the RSS fingerprint matching algorithms. This was a mistake, because the path estimation algorithm did not respond to correct floor transition detections and therefore stayed mostly on the same floor.
b) Team X-LAB: Identified that the phone used for collecting the data should be fully considered in the fingerprint location process. They also found that it is especially important to prioritize the correctness of each module in the positioning system rather than improving the accuracy of a particular module. An incorrect module will greatly reduce the overall positioning accuracy, just like the barrel effect.
3) Track 6-Smartphone on Vehicle:
a) Team WHU-GD: Identified that the GNSS raw measurements are important to help determine the GNSS signal quality and that systems that fuse the GNSS raw measurements and the IMU have the potential to provide better accuracy and robustness.

4) Track 7-CIR in Warehouse:
a) Team imec-WAVES: Listed their lessons learned as follows.
1) Our approach is less suited to agent generalization than to environment generalization, which was at the heart of the 2021 IPIN T7 competition. Training data was therefore rather limited and probably contributed to the reduced precision in comparison to last year's result.
2) Attempts to characterize error and ranging reliability based on x and y position were not fruitful.
3) While successful, the constant velocity PF fails to capture the intricacies of the agent, human or robotic. The inclusion of higher order terms, a more dynamic approach to the PF motion model, or a mixture of models will be investigated in the future.
4) The lack of access to future data points for a given timestamp creates problems for reliable motion and angle estimation in the trajectory, as the initial PF is quite noisy. Some predictors lost their usefulness because of this and were therefore omitted.
5) Limited precision of bin timestamps reduced ranging precision; the output format should be more carefully checked by organizers.
6) The increased fairness and straightforwardness of the online submission platform outweigh the technical hurdles that had to be overcome.
b) Team SPSC: Proposed a model-based Bayesian method that does not use the provided training data for learning; only parts of the training data were used to calibrate the anchor positions. In the results of track 7 of the IPIN 2021 challenge, their proposed algorithm compares well to the proposed ML-based approaches, consistently showing robust behavior for all data sets and being only slightly outperformed in terms of accuracy. An important aspect of the methods presented in [50] and [51] is the nonuniform NLOS likelihood, which allows the information contained in multipath to be fused in a soft, Bayesian manner [55], i.e., it constrains the PF-based position estimate using information contained in the NLOS measurements. However, when there is a LOS connection to at least two anchors and these anchors provide enough directional diversity for accurate positioning, the NLOS information has virtually no effect (especially when the mirror sources due to walls in the environment setup are far away). A comparison showed that changing the NLOS likelihood from the proposed nonuniform distribution to a uniform distribution led to no significant difference in the resulting performance metrics. The scenario investigated in track 7 of the IPIN 2021 challenge comprised data of seven anchors. The LOS to every anchor was obstructed over large parts of the trajectory. Due to the large number of anchors, there were only partially obstructed LOS situations, i.e., the LOS to all anchors was never obstructed simultaneously over significant time intervals. Therefore, unlike [50] and [51], the method described in Section IX-D3 does not involve a nonuniform NLOS likelihood. The team assumed that ML-based methods that learn a nonlinear regression model mapping to the agent position $p_n$ can, in turn, exploit small differences in the imprint of the provided radio signal data, leading to the observed gain in estimation accuracy.

5) Track 8-5G in Open-Plan Office: a) Team Mobile: Proposed an end-to-end solution based on an ML approach. Moreover, they used statistical knowledge to augment the data and generate simulated data so that the model could be trained effectively. They believe that the CatBoost model would perform better with more real data in the training set. However, the current solution does not integrate TOA information well, and the trained CatBoost model is unable to cope with an unknown environment.
b) Team TX8: Identified that, to achieve high-precision positioning in Track 8, the TAE must be estimated accurately and the outliers caused by weak LOS paths and NLOS paths must be handled properly. They used the RSRP measurements and a path loss model for the TAE estimation. To deal with the outliers in the measurements, an MCC-based EKF was developed. In addition, this team identified that position and velocity constraints can be used to further reduce the positioning error.

B. Lessons Learned by Track Chairs
1) Track 3-Smartphone: Three major lessons arose from organising the competition in 2021 and 2022 and from the integration with EvaalAPI. First, with the integration of real-time assessment, the data sampling rate should be reduced, as processing sensors with sampling frequencies of 200 Hz or higher may not allow real-time processing on the competition side. Second, device diversity should be handled carefully, as some devices lack the sensors required for positioning or may perform below expectations. Finally, knowing the location of the anchors seems decisive for avoiding very large errors: while the locations of the BLE beacons were provided in 2021, no information about the infrastructure was provided in 2022, which led to unexpectedly high positioning errors.
2) Track 6-Smartphone on Vehicle: In track 6, four teams registered and three teams submitted their final results. Two final scores were under 40 m, with the best one at 14.7 m. The key to success appears to be the well-executed combination of vehicle motion constraint information and magnetometer observations, including IMU preintegration, ZUPT, NHC, magnetic heading, and graph optimization. Considering the long interruptions of the GNSS signal in the test data, the most important point is to maintain the vehicle heading accurately.
Considering that more and more mobile phones support differential positioning, differential positioning results may be provided to improve positioning accuracy. In addition, changing the posture of the mobile phone during the test can be considered, as this is a typical case in a real scenario.
3) Track 7-CIR in Warehouse: The presented data are advantageous for EMI approaches because of the high number of available anchors.To allow for a fairer comparison with other approaches, future competitions may feature a lower number of anchors.

C. Lessons for Competition Organizers
Based on our experience in organizing wide-scale competitions in interesting scenarios, we can provide some insight that can be useful to scholars and organizers of future competitions.
Good practices for organizing such competitions include gathering information about the site, taking the time to carefully prepare the path, and choosing a system capable of producing the ground truth.
The first two points involve significant time and effort and should be planned well in advance together with trips for on-site surveys and measurements.
The third point, that is, producing the ground truth, is a key enabler for any competition. Measuring reference point positions by hand and using a 3-D scan are the two methods used by Track chairs. The advantages of 3-D scans are accuracy, which in 2018 we estimated at between 2 and 10 cm [3], [56], and one-shot postprocessing. The disadvantages are being tied to the chosen site if you do not have the skill or the equipment to do the survey yourself, and the risk that your ground truth database will sooner or later be known or learnt by future competitors.

D. Future Direction of the IPIN Competition
Off-site tracks have become more realistic with the introduction of EvaalAPI, an interactive, real-time, causal interface, which is intended to emulate a real environment similar to that of on-site tracks.
The challenge ahead is to make the results of on-site and off-site tracks comparable on a regular basis. Ideally, this would involve choosing the same environment for analogous on-site and off-site tracks, as was done in 2019 for smartphone-based tracks 1 (on-site) and 3 (off-site). At that time, EvaalAPI was not available, but the same was done again in the 2023 edition, for which results are yet to be published.

XIII. OPEN CHALLENGES AND FUTURE WORK
The IPIN competition has grown in the past years, up to hosting six off-site tracks in 2022. In the last three years, since 2020, it has only hosted off-site tracks, but has started again with on-site tracks in 2023.
The main challenge is to keep it interesting for organizers to dedicate their time to it and for competitors to participate.These two objectives are partially conflicting.
Most organizers are from academia and are not funded for their involvement in the competition. Their main interest lies in discussing and experimenting with new ideas, and generally advancing the state of the art in the field. This can be done by adding new tracks, possibly rotating among them from year to year, or by updating the challenges provided by old Tracks.
On the other hand, advancements should not be leaps forward, as this may erode the interest of competitors, which is linked to being able to test and show off their methods and algorithms. For new competitors, this involves being able to learn from past competitions. For recurring competitors, this involves a stable API.
Judging from the continuous and appreciative involvement of organizers, competitors, and the IPIN Conference management, IPIN competitions have so far managed to meet both objectives of novelty and stability, in a delicate balance.

ACKNOWLEDGMENT
This document presents the experiences of the IPIN competition for the 2021 and 2022 editions. All authors listed in this article have contributed to managing the EvaalAPI, organizing the off-site competition tracks, or developing the positioning solutions (competitors).
Chairmanship: Francesco Potortì chaired and organized the competition, managed the evaal.aaloa.org website containing all the data, and developed/managed the EvaalAPI interface software. Antonino Crivello helped with overall management and managed the paper.
Track 2-Camera: The team organizing track 2 is composed of Soyeon Lee, B. Vladimirov, and Sangjoon Park.

Track 8-5G in open-plan office: The team organizing track 8 is composed of Yi Wang and Shaobo Wang.
Team DYS-BUPT (track 8, 2022) is composed of Kai Luo, Ziyao Ma, Yanbiao Gao, Jiaxing Chang, Hailong Ren, and Wenfang Guo.

Fig. 3. Data collected at track 2 evaluation site 2-floor 1 (top) and floor 2 (bottom). From the collected testing data trajectories, two (red) were used for scoring trials and the rest (yellow) were kept in reserve.

to 3 fps and reference points were selected for error evaluation. Scoring trial 1 had a length of 170 m, with 735 images and 80 reference points. Scoring trial 2 had a length of 136 m, with 750 images and 64 reference points.
long tracks. In contrast to the systematic data collection done in the previous trials, the scoring trials include realistic movements (e.g., simulating a user who was messaging or taking a phone call) and stops. Only three unlabelled scoring trials were provided to competitors in each edition. The accuracy score of a scoring trial corresponds to the 75th percentile of the sample error, in compliance with the EvAAL framework. This error is the 2-D positioning error plus a penalty of 15 m times the absolute difference between the actual and the estimated floors. The team's score corresponds to the best (lowest) score among the three scoring trials.
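The scoring rule described above can be sketched as follows. This is a minimal illustration rather than the organizers' evaluation code: the function names, the `(x, y, floor)` tuple layout, and the linear-interpolation percentile are assumptions made here, since the text does not specify how the percentile is interpolated.

```python
import math

FLOOR_PENALTY_M = 15.0  # penalty per floor of error, as stated in the text


def sample_error(est, ref):
    """Penalised error in metres; est and ref are (x, y, floor) tuples (assumed layout)."""
    horizontal = math.hypot(est[0] - ref[0], est[1] - ref[1])
    return horizontal + FLOOR_PENALTY_M * abs(est[2] - ref[2])


def percentile(values, q):
    """q-th percentile using linear interpolation between ranks (an assumption)."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("at least one sample is required")
    rank = (len(ordered) - 1) * q / 100.0
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def trial_score(estimates, references):
    """Accuracy score of one scoring trial: 75th percentile of the sample errors."""
    errors = [sample_error(e, r) for e, r in zip(estimates, references)]
    return percentile(errors, 75)


def team_score(trial_scores):
    """A team's score is its best (lowest) trial score."""
    return min(trial_scores)
```

With this rule, a single floor misdetection costs as much as a 15 m horizontal error, so a trial with many wrong-floor estimates is dominated by the penalty term.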

Fig. 5. Floor plan of track 3 (2021) environment and its auditorium (located on the bottom-right corner in floor 1).

Fig. 7. Flow chart of the indoor localization solution proposed by team Leviathan.

Fig. 8. Flow graph of the systems of team imec-WAVES for track 3 in 2021 and 2022.

IX. TRACK 7: CIR IN WAREHOUSE
This section describes track 7, which was based on the use of CIR in warehouses and took place in 2021 and 2022.
A. Track Description
RF positioning in cluttered indoor environments is challenging. As signals travel through the environment along different paths, it is difficult to determine the correct ToF of the transmitted signals. Traditionally, fingerprinting-based solutions have been used to estimate a rough position from narrow-band signals, such
Fig. 20 (left-hand side) schematically sketches the environment, while the real-world environment is shown on the right-hand side. The receiving anchors are placed around the recording area at about 1.5 m height. The transmitter device is carried by a human/worker and regularly transmits UWB signals received by the anchors. The data are recorded using a platform based on the Decawave DW1000 UWB chip with a centre frequency of 4 GHz and 499.2 MHz bandwidth. This challenge contains the following two scenarios.
1) For the first scenario, a training dataset with ground truth positional information is provided; the models submitted by the competitors are evaluated on a test set (a few trajectories) that originates from the same measurement campaign, i.e., training and test datasets were recorded on the same environmental setup. Both training and test datasets contain complete trajectories, while the trajectories of the test dataset are shorter. The test set does not contain ground truth position labels.

Fig. 21. Image of the environment. The mobile robot can be seen on the right.

Fig. 23. (a) Normalized histogram fitting of the ToF estimation errors and (b) the QQ plots.

Fig. 24. Flowchart of the imec-WAVES localization approach. Each rectangular box represents a step in the process. Training the regression model for range error correction is done separately.

Fig. 25. QQ plot of the measurement likelihood functions that are used in the first and second PF. The first filter incorporates a function based on a Stable distribution fitted to unbiased ranging error. The second filter makes use of a t location-scale distribution fitted to the ranging error after it is corrected by the regression model.

Fig. 26. Factor graph representing the factorization of the joint posterior PDF and the messages according to the SPA. See [50] for further details.

Fig. 29. (a) Relationship between the RSRP received by pRRU0 and the real distance. (b) Relationship between the difference of RSRP and the real distance difference of pRRU0 and pRRU1.

Fig. 30. Distribution of the real RSRP received by pRRU0 and the simulated RSRP of pRRU0.
$x_k$ and $y_k$ are the horizontal coordinates of the terminal at time $k$, $v_{x,k}$ and $v_{y,k}$ are the horizontal velocities of the terminal at time $k$, $b_{1,k}$, $b_{2,k}$, and $b_{3,k}$ are the TAEs of the TRPs at time $k$, $w_{k-1}$ is the process noise at time $k-1$, $T$ is the sampling interval,
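The symbols defined here suggest a seven-element state vector holding position, velocity, and the three TAEs. The state equation itself is not reproduced in this excerpt, so the following is only a sketch under two assumptions that are not confirmed by the text: a constant-velocity motion model, and TAEs that evolve as a random walk driven by the process noise. The function names and the state ordering are likewise hypothetical.

```python
STATE_DIM = 7  # assumed ordering: [x, y, v_x, v_y, b_1, b_2, b_3]


def transition_matrix(T):
    """Assumed constant-velocity transition matrix for sampling interval T."""
    F = [[1.0 if i == j else 0.0 for j in range(STATE_DIM)] for i in range(STATE_DIM)]
    F[0][2] = T  # x_k = x_{k-1} + T * v_{x,k-1}
    F[1][3] = T  # y_k = y_{k-1} + T * v_{y,k-1}
    return F     # velocities and TAEs are carried over unchanged


def predict(state, T, noise=None):
    """One prediction step: s_k = F s_{k-1} + w_{k-1} (noise is optional here)."""
    F = transition_matrix(T)
    nxt = [sum(F[i][j] * state[j] for j in range(STATE_DIM)) for i in range(STATE_DIM)]
    if noise is not None:
        nxt = [a + b for a, b in zip(nxt, noise)]
    return nxt
```

In a filter built on such a model, this step would be followed by a measurement update; that part depends on the team's measurement model and is not sketched here.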

Fig. 35.Trajectory plot of track 2 winner's best scoring trial.The estimated trajectory (est trj) is evaluated against the ground truth trajectory (gt trj) at predefined reference points (ref pts) based on horizontal Euclidean distances to the corresponding estimated points (est pts) with an additional penalty for incorrectly estimated floor (marked with floor error).

Fig. 38.Evolution of the two first scores of track 4 over the last five years.

Track 3-Smartphone: The team organizing track 3 is composed of Joaquín Torres-Sospedra, Antoni Perez-Navarro, and Antonio R. Jiménez. Fernando J. Alvarez, Fernando J. Aranda, and Felipe Parralejo supported data collection for the 2021 edition. Adriano Moreira, Cristiano Pendão, and Ivo Silva supported data collection for the 2022 edition. Leviathan Team (Track 3, 2022) is composed of Han Wang and Hengyi Liang. imec-WAVES Team (Track 3, 2022) is composed of Cedric De Cock and David Plets. X-LAB Team (Track 3, 2022) is composed of Yan Cui, Zhi Xiong, Xiaodong Li, and Yiming Ding.

Fang Zhao (Member, IEEE) received the B.S. degree in computers and applications from the School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China, in 1990, and the M.S. and Ph.D. degrees in computer science and technology from the Beijing University of Posts and Telecommunications, Beijing, China, in 2004 and 2009, respectively. She is currently a Professor with the School of Software Engineering, Beijing University of Posts and Telecommunications. Her research interests include mobile computing, location-based services, and computer networks.

Yue Zhuge received the B.S. degree in computer science and technology from the Wuhan University of Technology, Wuhan, China, in 2021. She is currently working toward the M.S. degree in computer application technology with the University of Chinese Academy of Sciences, Beijing, China. Her research interests include computer vision and simultaneous localization and mapping.

Haiyong Luo (Member, IEEE) received the B.S. degree in information engineering from the Department of Electronics and Information Engineering, Huazhong University of Science and Technology, Wuhan, China, in 1989, the M.S. degree in communication and information systems from the School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing, China, in 2002, and the Ph.D. degree in computer science from the University of Chinese Academy of Sciences, Beijing, in 2008. He is currently an Associate Professor with the Institute of Computing Technology, Chinese Academy of Sciences, Beijing. His main research interests are location-based services, pervasive computing, mobile computing, and Internet of Things.

Antoni Perez-Navarro (Member, IEEE) received the bachelor's and Ph.D. degrees in physics from the Universitat Autónoma de Barcelona, Bellaterra, Spain, in 1995 and 2000, respectively. Between 2017 and 2020, he held the position of Deputy Director of Research with the eLearn Center, Universitat Oberta de Catalunya (UOC), and he has been a Lecturer with the Computer Science, Multimedia, and Telecommunication Department (EIMT Department) since 2005. He is also a Member of the eHealthLab research group. He is currently the Director of the Technological Observatory with the EIMT department. Apart from his activities at UOC, he has worked since 2007 with Escola Universitària Salesiana de Sarrià (EUSS). His teaching activities span physics and GIS in telecommunication engineering, computer science, multimedia, and industrial engineering. He has authored or coauthored several papers in international journals on all these topics and acts as a reviewer for several journals. His main research interests are indoor positioning, prevention of diseases via smartphones, and e-Learning. Dr. Perez-Navarro is part of the Technical Program Committee of IPIN and is one of the Chairs of IPIN 2021.

Antonio Ramón Jiménez was born in Santander, Spain, in 1968. He received the degree in physics and computer science and the Ph.D. degree in physics from the Universidad Complutense de Madrid, Madrid, Spain, in 1991 and 1998, respectively. Since 1993, he has been with the Centre for Automation and Robotics, Spanish Council for Scientific Research, Madrid, where he holds a research position. He has authored more than 100 articles in journals and conference proceedings. His current research interests include local positioning solutions for indoor/global positioning system-denied localization and navigation of persons and robots, signal processing, Bayesian estimation, and inertial-ultrasonic-RFID sensor fusion. Dr. Ruiz is a Reviewer for many international journals and projects in the field.

Han Wang (Member, IEEE) received the B.E. and Ph.D. degrees in electrical and electronics engineering from Nanyang Technological University, Singapore, in 2016 and 2021, respectively. In 2021, he joined Huawei Technologies Company Ltd., Shenzhen, China, as a Research Assistant, where he is currently a Researcher. His research interests include simultaneous localization and mapping, pedestrian dead reckoning, and computer vision.

Hengyi Liang (Member, IEEE) received the B.S. degree in microelectronics from the University of Electronic Science and Technology of China, Chengdu, China, in 2016, and the Ph.D. degree in computer engineering from Northwestern University, Evanston, IL, USA, in 2021. He is currently a Researcher with Huawei Technologies Company Ltd. His research interests include indoor positioning, cyber-physical systems, and connected and autonomous vehicles.

Cedric De Cock received the M.S. degree in electronics and ICT Engineering Technology from Ghent University, Ghent, Belgium, in 2020. In 2020, he became a Member of the imec-WAVES Group, Department of Information Technology, Ghent University. His research interests include IMU-enabled indoor positioning and Bayesian filtering algorithms.

David Plets (Member, IEEE) has been a Member of the imec-WAVES Group, Department of Information Technology, Ghent University, Ghent, Belgium, since 2006. He is currently an Associate Professor with Ghent University. His current research interests include localization techniques and the IoT, for both industry- and health-related applications, and he is also involved in the optimization of wireless communication and broadcast networks.

TABLE I
BASIC STATISTICS FOR ALL PAST IPIN COMPETITIONS

TABLE II
BEST SCORES IN METRES (FIRST AND SECOND PLACE) FOR ON-SITE AND OFF-SITE SMARTPHONE TRACK BEFORE AND AFTER EVAALAPI WAS INTRODUCED IN 2021

TABLE III
SMARTPHONES USED IN TRACK 3 (2021 AND 2022) DETAILING THE COMMERCIAL NAME, THE MANUFACTURER CODE (IF ANY), THE ANDROID VERSION, THE SENSORS USED, AND THE EDITION WHEN THE SMARTPHONE WAS USED

TABLE IV
TECHNICAL SPECIFICATIONS OF THE EMBEDDED ULISS SENSORS

TABLE IX
75TH PERCENTILE OF THE ABSOLUTE ERROR IN m FOR THE 2021 TRACK 7 COMPETITION