Cooperative Saliency-Based Pothole Detection and AR Rendering for Increased Situational Awareness

Autonomous vehicles are expected to operate safely in real-life road conditions in the coming years. Nevertheless, unanticipated events, such as the presence of unexpected objects within the range of the road, can put safety at risk. Advances in sensing and communication technologies and the Internet of Things may facilitate the recognition of hazardous situations and information exchange in a cooperative driving scheme, providing new opportunities for increasing collaborative situational awareness. Safe and unobtrusive visualization of the obtained information may nowadays be enabled through the adoption of novel Augmented Reality (AR) interfaces in the form of windshields. Motivated by these technological opportunities, we propose a saliency-based, distributed, cooperative detection and rendering scheme for increasing the driver's situational awareness through (i) automated negative obstacle (pothole) detection, (ii) AR visualization and (iii) information sharing (upcoming potential dangers) with other connected vehicles or road infrastructure. An extensive evaluation study using a variety of real datasets for pothole detection showed that the proposed method provides favorable results and features compared to other recent and relevant approaches.


I. INTRODUCTION
INFORMATION-CENTRIC technologies have started to play a central role in the automotive industry, boosting new research trends in semi- or fully Automated Driving Systems (ADS). Autonomous vehicles, ranging from level 3 to level 5 of autonomy [1], are expected to operate safely in real-life road conditions, but the reality is that obstacles like potholes, bumps, and other unexpected objects are not uncommon in an everyday driving context. For this reason, the detection and identification of obstacles are imperative for reliable operation of autonomous vehicles [2]. Moreover, driver inattentiveness plays a major role in driving safety and is a culprit of road accidents around the world [3], [4]; thus, a lot of work has been devoted to the quantification of the abstract mechanics of human situational awareness [5].
Gerasimos Arvanitis is with the Department of Electrical and Computer Engineering, University of Patras, 26504 Patras, Greece, and also with AviSense.ai Technologies, 26500 Patras, Greece (e-mail: arvanitis@ece.upatras.gr).
Digital Object Identifier 10.1109/TITS.2023.3327494
Enhancing situational awareness is especially critical in the case of semi-autonomous cars, where the operator may be distracted by secondary activities, e.g., looking at the phone or reading a book. If the driver has to take over control, it is important to minimize the required reaction time. This can be achieved by monitoring the environment and presenting the crucial information to the driver, thus keeping him/her aware of potentially hazardous situations. Inherent challenges include the need for unobtrusive information display that avoids the effects of tunnel vision, which could lead to critical information actually being overlooked [6].
The problem of road pothole detection is commonly targeted using imaging (camera) data and computer vision techniques [7], [8], [9]. Although image-based techniques have achieved great success, one common drawback is that they are sensitive to motion blur, changes in lighting, and even shadows [10]. Also, most techniques do not account for other passing vehicles [11]. This can make them unreliable in real use cases, which is a major weakness in problems involving human safety. In light of all this, the use of a 3D LiDAR (Light Detection and Ranging) sensor could provide more robust sensing capabilities for the analysis of potholes, in the same way that it is used to increase the accuracy of road boundary detection [12], [13], [14].
On the other hand, a limitation of the LiDAR sensor is that, due to refraction and reflection, water appears as a black hole in the imagery calculated from LiDAR data [15], imposing additional challenges in the detection of potholes filled with water.
The purpose of this work is to increase the driver's situational awareness through automated cooperative obstacle detection, visualization and information sharing with other connected vehicles in a V2X (vehicle-to-everything) setting. To address the above issues, we developed a point cloud processing system that takes road environment data as input and classifies them into safe and potentially hazardous regions by identifying obstacles lying within the range of the road. We selected LiDAR as the sensing modality for the surrounding environment due to its ability to retrieve depth information and its large range, which make it suitable for driving environments. For more robust estimation, LiDAR data are fused with information on driving patterns, such as the steering angle of the wheels. For implementation and evaluation, we utilized the open-source CARLA simulator [16], including a multi-agent system of vehicles, and we augmented it with our obstacle detection and tracking component. In this simulated environment, information sharing between agents is enabled, so that vehicles are notified about upcoming obstacles even when there is no direct line-of-sight. Our method is capable of detecting static obstacles, both negative (e.g., potholes) and positive (e.g., speed bumps or hazardous objects), within the range of the road. However, according to the literature, the detection of negative obstacles poses more challenges, which have not been handled efficiently by the available methods so far [8], and is more specific to driving scenarios, in contrast to general object detection, which has been extensively studied in mobile robot navigation research.
To avoid information visualization clutter, we propose the use of AR for visualizing critical information in the driver's field of view. AR rendering is based on classical perspective projection: for each point of the point cloud, the pixel coordinates in the image space of the AR interface are calculated through projection, and a color is assigned indicating the object class. Interfaces that can be used for in-vehicle visualization include an AR headset, a Head-Up Display (HUD) [17], [18], or even the car's windshield with a transparent display.
The contributions of the proposed approach can be summarized as follows.
• Development of an obstacle detection module that takes into account the extraction of saliency maps from point clouds.
• Generation of data for randomized multi-ego connected vehicles in cooperative driving scenarios.
• Creation of realistic synthetic data of potholes that can be entered in the town maps of the CARLA simulator for the design of lifelike driving situations.
• AR visualization for point cloud projection registered on the scene images.
• Development of public and open access libraries with code for the aforementioned components (footnotes 1 to 4).
The rest of this paper is organized as follows. First, we present previous works in related domains in Section II, and then describe in detail the proposed methodology in Sections III and IV. Section V follows with experimental results in comparison with other state-of-the-art methods, while Section VI draws the conclusions and outlines directions for future work.

II. PREVIOUS WORK
In the following, we provide an overview of methodologies tackling the main challenges of the presented approach.
1) Negative Obstacle Detection: A major element that adds unpredictability to path planning for self-driving cars is obstacles in the road. Negative obstacles can appear in the form of objects beyond the surface of the road, or cracks and holes in paved areas. There has been major work on obstacle detection, ranging from real-time implementations [7], to offline schemes that act as automated informants to the authorities responsible for maintenance [19], to efficient unsupervised techniques for pothole detection [20]. Most of the existing works implement a broad spectrum of computer vision and/or machine learning techniques to analyze imaging information [7]. The methods differ mainly in the features and classifiers used for obstacle representation and recognition. With respect to performance, a direct comparison of methods is not feasible because most works are evaluated on their own (simulated) data. In fact, there is a lack of, or restricted access to, a common benchmark dataset with potholes and obstacles that can be used for comparison.
Asad et al. [21] explored the potential of deep learning models (YOLO family and SSD-mobilenetv2) for real-time pothole detection leading towards the deployment on edge devices.However, they used only images, and despite the good detection results, the visualization of the information was considered as disturbing for real-case implementation.
Jenkins and Young [22] discussed the design of a system to alert motorcyclists to different hazard categories, including roadway hazards (e.g., potholes, other vehicles or pedestrians, roadway debris, uneven surfaces), in a manner that facilitates direct perception and action to appropriately respond to such hazards and reduce the risk of accidents and injuries. However, no real-time data processing is performed, and the analysis is based on data that have already been stored in the cloud.
Heo et al. [23] proposed a 2D pothole risk assessment standard to visually indicate risk signals to the driver by comparing the size of the pothole detected using the developed model with the size of the tire contact patch area. The authors conclude that if a risk assessment method could be used in real-time, it may not only be useful for road maintenance but also for detecting large potholes that are not recognizable by drivers in driving situations.
Other methods focus on road crack detection from high resolution cameras on smartphones. Since such data are more easily available, those methods can bypass the extraction of hand-crafted features and utilize deep architectures, such as convolutional neural networks [24], [25], [26], [27]. However, in the case of dense traffic situations and poor lighting conditions, techniques utilizing images from a smartphone camera are less effective.
In contrast to computer vision techniques, which exploit texture information from images, 3D point cloud processing techniques exploit the object's geometrical properties [15], [28], [29]. Bosurgi et al. [29] identify potholes in road sections by estimating area, perimeter and depth information from 3D data of pavement surfaces. Chen et al. [15] propose a framework for obstacle detection using the pitch and rotation angles of a LiDAR sensor to create a 2D image-like plane onto which the unordered set of points (from the point cloud) is projected. From this "LiDAR-imagery" a 2D histogram is extracted and used to find the road plane. If an adequate part of the road in front of the vehicle is flat, those points form a straight line in the histogram representation, and anything above the line can be classified as a positive obstacle (points higher than the road plane), while points below the line can be classified as a negative obstacle (points lower than the road plane). Moreover, since water bodies cannot be detected by LiDAR due to refraction and reflection, the authors propose a technique to detect potholes filled with water by scanning the image for large areas of missing data.
Gu et al. [28] improve the aforementioned method by projecting the points onto the camera plane and interpolating the depth values of the projected points to obtain a depth image. They use horizontal and vertical histograms to coarsely detect the road area and to refine it, respectively. Although they describe their method as sensor fusion between the monocular camera and LiDAR, they do not utilize the color values of the camera images. Both works [15] and [28] use the KITTI dataset as a benchmark and achieve results comparable to machine learning methods.
Other techniques for pothole detection may include laser scanning, ground penetrating radar, ultrasonic sensors, as well as multi-sensor fusion, especially concerning fusion with imaging information. An extensive review of such techniques falls beyond the scope of this article. However, the interested reader is referred to the survey in [30].
2) Point Cloud Saliency: One of the main challenges in techniques utilizing point clouds is the inherent noise and the increased computational cost due to the unordered data structure of point clouds. To address such challenges, saliency map extraction has been proposed as a powerful step in point cloud processing to reduce noise and data dimensionality, leading to more robust solutions and computational efficiency [31], [32]. Yet, the use of local saliency in pothole detection has not been sufficiently examined. Saliency maps were constructed from point clouds obtained from Mobile Laser Scanning (MLS) in [33] for road crack detection. MLS point clouds contain spatial information (i.e., Euclidean coordinates) and intensity information, and thus the extracted features could leverage both height and intensity information. Feature saliency was estimated by calculating the distances from the normal of each point to the principal normal of the input point clouds. In a similar setting, Wang et al. [34] extracted saliency maps in MLS point clouds by projecting the distance of each point's normal vector to the point cloud's dominant normal vector into a hyperbolic tangent function space.
3) Cooperative Driving: While significant advances have been made for single-agent perception, many applications require multiple sensing agents and cross-agent communication for more accurate results. Objects captured by a single agent's sensor devices may be heavily occluded or far away from the sensors' view, resulting in sparse observations. Failing to detect and predict the accurate position or moving intention of these occluded or "hard-to-see" objects might have harmful consequences in safety-critical situations, especially if the available reaction time is very short [35]. The development of multi-agent solutions can lead to collaborative perception and, through information sharing, may improve driving performance and experience, providing endless possibilities for safe driving.
Recently, cooperative autonomous driving has been considered as a possible solution to improve the performance and safety of autonomous vehicles [36]. Cooperative perception for 3D object detection can be performed via early or late fusion of information, i.e., combination of multiple sensing points of view or fusion of object detection results, respectively. Both fusion approaches can extend the perception of the sensing system; however, only the early fusion approach can actually exploit complementary information. A major challenge that arises regarding cooperative perception is how to effectively merge sensor data received from different vehicles to obtain a precise and comprehensive perception outcome. Additionally, despite the attention that cooperative driving has attracted recently, the absence of a suitable open dataset for benchmarking algorithms has made it difficult to develop and assess cooperative perception technologies.
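The distinction between early and late fusion can be illustrated with a minimal sketch. It assumes that the relative pose between the two vehicles is known as a 4x4 homogeneous transform and that detections are represented only by their 3D centers; the function names and the `merge_radius` parameter are hypothetical and are not taken from the cited works.

```python
import numpy as np

def transform_points(points, T):
    """Apply a 4x4 homogeneous transform T to an (n, 3) array of points."""
    pts = np.asarray(points, dtype=float).reshape(-1, 3)
    return pts @ T[:3, :3].T + T[:3, 3]

def early_fusion(cloud_a, cloud_b, T_b_to_a):
    """Early fusion: merge raw point clouds in the frame of vehicle A."""
    a = np.asarray(cloud_a, dtype=float).reshape(-1, 3)
    return np.vstack([a, transform_points(cloud_b, T_b_to_a)])

def late_fusion(dets_a, dets_b, T_b_to_a, merge_radius=1.0):
    """Late fusion: merge detection centers, dropping near-duplicates.

    A detection of vehicle B is kept only if it lies farther than
    merge_radius (assumed value, in meters) from every detection of A.
    """
    a = np.asarray(dets_a, dtype=float).reshape(-1, 3)
    b = transform_points(dets_b, T_b_to_a)
    if len(a) == 0 or len(b) == 0:
        return np.vstack([a, b])
    dists = np.linalg.norm(b[:, None, :] - a[None, :, :], axis=2)
    is_new = dists.min(axis=1) > merge_radius
    return np.vstack([a, b[is_new]])
```

In the early variant a detector would run once on the merged cloud, so points seen only by the other vehicle can contribute to a detection; in the late variant each vehicle has already run its own detector and only the results are reconciled.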
Xu et al. [37] presented the first open dataset and used it to benchmark fusion strategies for V2V (vehicle-to-vehicle) perception. They also plan to extend the dataset with more tasks as well as sensor suites and investigate more multimodal sensor fusion methods in the V2V and V2I (vehicle-to-infrastructure) settings. Arnold et al. [36] proposed a system that produces a perception of complex road segments (e.g., complex T-junctions and roundabouts) using a network of roadside infrastructure sensors with fixed positions. Chen et al. [38] studied raw-data level cooperative perception for enhancing the detection ability of self-driving systems. They fuse the sensor data collected from different positions and angles of connected vehicles, relying on LiDAR 3D point clouds. Liu et al. [39] addressed the collaborative perception problem, where one agent is required to perform a perception task and can communicate and share information with other agents on the same task.
Chen et al. [40] proposed a point cloud feature-based cooperative perception framework for connected autonomous vehicles to increase object detection precision. The features are selected to be rich enough for the training process and, at the same time, to have an intrinsically small size to achieve real-time edge computing. Guo et al. [41] proposed a cooperative fusion method to combine spatial feature maps for achieving a higher 3D object detection performance. Yuan et al. [42] proposed a 3D keypoint feature fusion scheme for cooperative driving detection to remedy the problem of low bounding box localization accuracy. Fang et al. [43] presented an iterated split covariance intersection filter-based cooperative localization strategy with a decentralized framework. In addition, they adopted a point cloud registration method to obtain the relative pose estimation using mutually shared information from neighboring vehicles. Kim and Liu [44] presented the concept of cooperative autonomous driving using mirror neuron-inspired intention awareness and cooperative perception, providing information on the upcoming traffic situations ahead, even beyond line-of-sight and field-of-view.
4) Situational Awareness and AR Infotainment: In the case of semi-autonomous vehicles, where the operator/driver may be asked to take manual control of the car at any moment, it is of great importance [45] to implement notification paradigms that direct the operator's, possibly reduced, attention to the event that triggered the take-over request [46], [47]. Recently, the automotive industry started to invest funds and effort into AR technology and its integration with In-Vehicle Information Systems (IVIS) for intuitive and non-intrusive information display to the driver.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The design of AR in-vehicle systems for infotainment is a challenging task. Rao et al. [48] performed an analysis of design methods on different use cases, aiming to identify the difficulties in implementation aspects. Despite the vast number of requirements for these systems to work reliably, such as latency, bandwidth, weather conditions, etc., they concluded that the integration of augmented reality in vehicles will help drivers navigate their environment better, and thus will be more widely adopted.
While the IVIS found in many modern vehicles, with touch Liquid-Crystal Displays (LCDs) and voice commands, may seem to offer most of the utilities of an AR infotainment system, they may actually be distracting to the driver. Strayer et al. [49] showed in a recent study that some IVIS require a high cognitive demand or complex command sequences to be handled, and this can in turn lower the awareness of the operator. This is exacerbated by the fact that most IVIS are placed on the dashboard, so operating them usually requires the driver to avert his/her gaze (even momentarily) from the road. In contrast, AR HUDs render information on top of the environment, and thus the driver does not need to split focus across multiple locations.
The distraction potential of AR HUDs was assessed by Kim and Gabbard [50]. An AR-enabled windshield was used in a simulated environment with a real-life driving video feed to test various methods of pedestrian visualization. The gaze behavior and cognitive processes were measured, and it was found that the visual and cognitive distraction potential of AR depends on the perceptual forms of the graphical elements presented on the displays. Specifically, in some cases visualizations, e.g., in the form of a "virtual transparent shadow" indicating the pedestrian's anticipated path, improved the driver's attention without degrading awareness of other objects or scene elements. On the other hand, the use of bounding boxes localizing pedestrians was shown to have negative effects, because this approach either (visually) overloaded the scene or degraded the driver's attention to other (not highlighted but possibly critical) scene elements. These outcomes indicate that, while the potential of AR for improving situational awareness is tangible, a lot of attention must be paid to the AR design so that it does not end up cluttering the view and obstructing the driver's attention.
Kettle and Lee [51] reviewed AR visualizations for in-vehicle vehicle-driver communication, regarding factors like the display mode (e.g., windshield, simulated, HUD, etc.), the display design (e.g., bounding boxes, warning symbols, arrows, etc.) and the display information (e.g., hazard detection, pedestrian detection, vehicle detection, road signs, etc.). They concluded that there are many benefits of implementing AR interfaces, and that such interfaces have the potential to improve driving performance through braking and takeover responses. The research on augmented reality displays on windshields for improving driver awareness also extends to fully Autonomous Vehicles (AV). Such informative human-machine interfaces may help to form a mental model of the vehicle's sensory and planning system, thereby enhancing trust in AV, which is currently quite low in the general public [52], [53], [54]. Lindemann et al. [55] conducted a user study in urban environments for evaluating the situational awareness of the driver in various scenarios. They found that their explanatory windshield display had positive results and improved the operator's trust. Yontem et al. [56] also designed an AR windshield interface targeting future vehicles. Their main focus was also to increase driver awareness by presenting graphical cues in a non-intrusive way, based on a human-centric design and taking into account human peripheral vision.
While the above methods provide essential feedback on the assessment of such interfaces' design, a significant limitation is that most studies were based on basic or non-interactive simulations, with the steering wheel and pedals not influencing the simulated environment, thus restricting the feeling of immersion during the evaluation study. A more realistic, experimental study on the benefits of AR for driver behavior was performed by Kim et al. [57] outdoors in a parking lot. It focused on pedestrian collision warning based on visual depth cues delivered in a conformal manner through a monocular display seated above the dashboard, or a volumetric display providing binocular disparity. A limitation of this study, which we address through our AR visualization component (subsection IV-A), is the limited field of view of the display used in the experiments, potentially creating a tunneling effect in human vision.

III. OBSTACLE DETECTION
This section presents the proposed methodology for obstacle detection and is followed by Section IV on visualization and communication aspects. The main components of the methodology are illustrated in the schematic diagram in Fig. 1 and can be summarized in the following steps:
• Extraction of saliency map: A saliency value is estimated for every point of the point cloud scene based on its local geometry, as well as the local geometry of its neighboring points.
• Scene segmentation: The estimated saliency map is then used as a feature to segment the point cloud into areas characterizing (i) the safe area of the road, (ii) be-aware or dangerous areas within the range of the road, and (iii) areas out of the range of the road.
• Static object recognition: Static objects (i.e., potholes and bumps) can be identified and their point coordinates are [...]
In this work, we assume the existence of two or more vehicles (referred to as ego1 and ego2 vehicles in this paper) that are moving on the same map of a town but not necessarily at the same time, i.e., they are in spatial proximity but possibly not in temporal proximity. Fig. 2 presents an example of two registered point clouds, as received by the LiDAR devices of the ego1 and ego2 vehicles, showing also their starting points (indicated by arrows). Note that all the following analysis is applied to each vehicle separately.

A. Notations
Before presenting details on the individual steps, we provide here the necessary definitions and notations. The input data constitute a sequence of point clouds $\mathbf{P}_i$, $i = 1, \ldots, l$, that represents a set of $l$ consecutive frames acquired by a LiDAR device. Each point cloud $\mathbf{P}_i$ consists of $m_i$ vertices $\mathbf{v}$, where the value of $m_i$ may differ from frame to frame. The $j$-th vertex ($\mathbf{v}_j$) of a point cloud $\mathbf{P}_i$ is represented by its Cartesian coordinates, denoted $\mathbf{v}_j = [x_j, y_j, z_j]^T$, $\forall j = 1, \cdots, m_i$, where the index $i$ of the point cloud is omitted for simplification. Thus, all the vertices can be represented as a matrix
$$\mathbf{V} = [\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{m_i}] \in \mathbb{R}^{3 \times m_i} \tag{1}$$
Let us also denote by $k_j$ the set of the indices of the $k$ nearest neighbors of point $j$. For a face $f$ defined by three vertices $(\mathbf{v}_{j1}, \mathbf{v}_{j2}, \mathbf{v}_{j3})$, the outward unit face normal $\mathbf{n}_f$ is calculated by the following equation:
$$\mathbf{n}_f = \frac{(\mathbf{v}_{j2} - \mathbf{v}_{j1}) \times (\mathbf{v}_{j3} - \mathbf{v}_{j1})}{\left\| (\mathbf{v}_{j2} - \mathbf{v}_{j1}) \times (\mathbf{v}_{j3} - \mathbf{v}_{j1}) \right\|_2} \tag{2}$$
The point normal $\mathbf{n}_j$, representing the normal of each point separately, is calculated as:
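The point normal can be defined in several ways. One common choice, assumed here for illustration and not confirmed by the source, is the normalized average of the unit normals of the faces incident to the vertex, with $\Psi_j$ (a hypothetical symbol) denoting the set of those faces:

```latex
\mathbf{n}_j = \frac{\sum_{f \in \Psi_j} \mathbf{n}_f}
                    {\left\| \sum_{f \in \Psi_j} \mathbf{n}_f \right\|_2}
```

For raw point clouds without connectivity, a frequently used alternative is to take the eigenvector associated with the smallest eigenvalue of the covariance matrix of the $k$ nearest neighbors of the point.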

B. Saliency Map Estimation of the Point Cloud Scene
The purpose of this step is to calculate a metric of saliency for each vertex of a point cloud. Assuming point clouds without context information, saliency characterizes the geometric properties in a local neighborhood of points, i.e., high saliency values represent more perceptually prominent vertices, which usually correspond to sharp corners (high-frequency spatial information). Conversely, the geometrically least important points are those that lie in flat areas.
For the estimation of the saliency map, we implemented and modified the fusion technique presented in [32]. Instead of using guided normals of centroids, as in the original version [32], we now utilize normals for the points. This was done to accelerate computations. Since the number of faces is usually approximately twice the number of vertices, the point normals are almost half the number of the centroid normals. For the sake of completeness, we present here our approach for the estimation of the saliency map of a point cloud scene, utilizing point normals.
Our fusion technique combines geometric saliency ($s^{(1)}$) with spectral saliency ($s^{(2)}$) features. The unique characteristics of each of these saliency features make the methodology more robust to point clouds acquired under real conditions, which are potentially affected by noise and outliers. The method processes each frame independently without examining past temporal information. Thus, as the methodology is applied to each point cloud in the sequence independently, for simplicity we omit the index $i$ (indicating the frame number) from now on in the equations.
For a point cloud $\mathbf{P}$ with $m$ vertices, a matrix $\mathbf{E} \in \mathbb{R}^{3m \times (k+1)}$ is constructed which includes in the first column the $m$ point normals ($\mathbf{n}_j = [n_{jx}, n_{jy}, n_{jz}]^T$) of each vertex $j$, $j = 1, \cdots, m$, respectively, and in the subsequent $k$ columns the point normals of the $k$ nearest neighbors of vertex $j$ (i.e., $\mathbf{n}_{j_\kappa}$ with $j_\kappa \in k_j$). The salient features extracted by this approach capture global information since the matrix $\mathbf{E}$ is constructed using the point normals of the whole scene.
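The construction of this matrix can be sketched as follows. This is a minimal illustration that assumes the point normals are already available as an (m, 3) array; the brute-force neighbor search and the function names are assumptions chosen for brevity (a k-d tree would normally replace the quadratic search for full LiDAR frames).

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of every point (brute force)."""
    pts = np.asarray(points, dtype=float)
    d2 = ((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]

def build_normal_matrix(normals, neighbors):
    """Stack point normals into the 3m x (k+1) matrix described in the text.

    Rows 3j..3j+2 hold the x, y, z components for vertex j; column 0 is the
    normal of the vertex itself, columns 1..k those of its neighbors.
    """
    normals = np.asarray(normals, dtype=float)
    m, k = neighbors.shape
    # (m, k+1, 3): own normal followed by the neighbor normals
    stacked = np.concatenate([normals[:, None, :], normals[neighbors]], axis=1)
    # (m, 3, k+1) -> (3m, k+1)
    return stacked.transpose(0, 2, 1).reshape(3 * m, k + 1)
```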
In order to exploit the geometrical coherence between neighboring normals, we apply Robust Principal Component Analysis (RPCA) to decompose the matrix $\mathbf{E}$ into a low-rank matrix $\mathbf{L} \in \mathbb{R}^{3m \times (k+1)}$ and a sparse matrix $\mathbf{S} \in \mathbb{R}^{3m \times (k+1)}$, as described in Appendix A. The matrix $\mathbf{L}$ consists of the low-rank values $\tilde{\mathbf{n}}$ of the point normals $\mathbf{n}$, while the matrix $\mathbf{S}$ consists of the corresponding sparse values, represented as $\dot{\mathbf{n}}$. The values of $\mathbf{S}$ are zero (or, more precisely, nearly zero) if the row (representing a neighboring patch of points) corresponds to point normals with very similar values, i.e., the vertex lies in a flat area, and very large if the row corresponds to point normals with large dissimilarity (i.e., the vertex lies in a very sharp corner). The fact that most of the local patches $k_j$ of a 3D surface are piecewise flat justifies treating $\mathbf{S}$ as a sparse matrix.
In other words, sparsity of the matrix is assumed because piecewise flat areas are the most dominant geometrical pattern in a 3D surface.
1) Estimation of the Geometrical Saliency (global approach): As the similarity of normals between neighboring points is a measure of geometrical coherence of the local neighborhood, we estimate the sparsity of the dissimilarity of normals and use it as a feature for geometrical saliency, $s^{(1)}$. Low values of the sparse matrix indicate that the normals of the point and its neighbors are similar (low-rank). This means that if all points in a neighborhood have similar geometrical characteristics, the respective patch represents a flat area. Conversely, high dissimilarity indicates that the surface has an irregular shape. For a point $\mathbf{v}_j$, the geometric saliency feature, $s^{(1)}_j$, is estimated from the values of the first column of the sparse matrix $\mathbf{S}$ according to:
$$s^{(1)}_j = \sqrt{\dot{n}_{jx}^2 + \dot{n}_{jy}^2 + \dot{n}_{jz}^2} \tag{4}$$
where $\dot{n}_{jx}$ denotes the scalar value of the $x$ coordinate, located in the $[3 \cdot (j - 1) + 1]$-th row of the 1st column of the matrix $\mathbf{S}$.
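The low-rank plus sparse decomposition and the resulting geometric saliency can be sketched as below. The solver is a generic principal component pursuit implemented with an inexact augmented Lagrangian scheme; it is an assumption standing in for the procedure of the appendix, which is not reproduced here. The sketch also assumes that the geometric saliency of a vertex is the l2-norm of its three sparse components in the first column; all function names and default parameters are hypothetical.

```python
import numpy as np

def soft_threshold(X, tau):
    """Element-wise shrinkage operator."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def rpca(E, lam=None, tol=1e-7, max_iter=500):
    """Decompose E into low-rank L and sparse S with E ~= L + S."""
    E = np.asarray(E, dtype=float)
    norm_E = np.linalg.norm(E)
    if norm_E == 0.0:
        return np.zeros_like(E), np.zeros_like(E)
    if lam is None:
        lam = 1.0 / np.sqrt(max(E.shape))  # common default weight
    spectral = np.linalg.norm(E, 2)
    Y = E / max(spectral, np.abs(E).max() / lam)  # dual variable
    mu, mu_max, rho = 1.25 / spectral, 1.25e7 / spectral, 1.5
    L = np.zeros_like(E)
    S = np.zeros_like(E)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding
        U, sig, Vt = np.linalg.svd(E - S + Y / mu, full_matrices=False)
        L = (U * np.maximum(sig - 1.0 / mu, 0.0)) @ Vt
        # Sparse update: element-wise shrinkage
        S = soft_threshold(E - L + Y / mu, lam / mu)
        residual = E - L - S
        Y = Y + mu * residual
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(residual) <= tol * norm_E:
            break
    return L, S

def geometric_saliency(S):
    """Per-vertex l2-norm of the x, y, z entries in the first column of S."""
    first_col = np.asarray(S, dtype=float)[:, 0].reshape(-1, 3)
    return np.linalg.norm(first_col, axis=1)
```

Each iteration requires a singular value decomposition of a 3m x (k+1) matrix, so the cost is dominated by the number of points rather than by the neighborhood size k.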
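2) Estimation of the Spectral Saliency (local approach): For the estimation of the spectral-based saliency, $s^{(2)}_j$, for a vertex $j$ of the point cloud, we use the submatrix $\mathbf{E}_j \in \mathbb{R}^{3 \times (k+1)}$, which includes the 3 corresponding rows of the matrix $\mathbf{E}$:
$$\mathbf{E}_j = [\mathbf{n}_j, \mathbf{n}_{j_1}, \ldots, \mathbf{n}_{j_k}] \tag{5}$$
In other words, each submatrix $\mathbf{E}_j$, which is a subset of the global matrix $\mathbf{E}$, consists of the point normals of a local neighborhood of the vertex $\mathbf{v}_j$. Then, for each one of these local matrices $\mathbf{E}_j$, the covariance matrix $\mathbf{R}_j \in \mathbb{R}^{3 \times 3}$ is calculated:
$$\mathbf{R}_j = \mathbf{E}_j \mathbf{E}_j^T \tag{6}$$
Next, the calculated matrix $\mathbf{R}_j$ is decomposed into a matrix $\mathbf{U}$ consisting of the eigenvectors and a diagonal matrix $\mathbf{\Lambda} = \mathrm{diag}(\lambda_{j1}, \lambda_{j2}, \lambda_{j3})$ consisting of the corresponding eigenvalues, i.e., $[\mathbf{U}, \mathbf{\Lambda}] = \mathrm{eig}(\mathbf{R}_j)$, where $\mathrm{eig}(\cdot)$ represents the eigendecomposition operation.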
Finally, the spectral saliency of each vertex is calculated by the inverse $l_2$-norm of the corresponding eigenvalues:
$$s^{(2)}_j = \frac{1}{\sqrt{\lambda_{j1}^2 + \lambda_{j2}^2 + \lambda_{j3}^2}} \tag{7}$$
Eq. (7) indicates that large values of the term $\sqrt{\lambda_{j1}^2 + \lambda_{j2}^2 + \lambda_{j3}^2}$ correspond to small saliency features, implying that the vertex lies in a flat area, while small values of the eigenvalues' norm correspond to large saliency, characterizing the specific vertex as a discriminative point. This can be easily justified by the fact that a point normal lying on a flat area is represented by one dominant eigenvector, the corresponding eigenvalue of which has a very large value (especially considering that it is squared). On the other hand, the point normal of a vertex lying on a corner is represented by three eigenvectors that correspond to eigenvalues with small and almost equal amplitude, as shown in Fig. 3.
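The spectral saliency computation described above can be sketched in a few lines. The sketch assumes that the stacked normal matrix has the 3m x (k+1) layout described earlier and that the covariance is the unnormalized product of each local block with its transpose; the function name and the small `eps` guard against division by zero are assumptions.

```python
import numpy as np

def spectral_saliency(E, eps=1e-12):
    """Inverse l2-norm of the eigenvalues of each local 3x3 covariance."""
    E = np.asarray(E, dtype=float)
    m = E.shape[0] // 3
    blocks = E.reshape(m, 3, -1)                 # local 3 x (k+1) blocks
    R = blocks @ blocks.transpose(0, 2, 1)       # (m, 3, 3), symmetric
    eigvals = np.linalg.eigvalsh(R)              # (m, 3) eigenvalues
    return 1.0 / (np.linalg.norm(eigvals, axis=1) + eps)

# A flat patch (identical normals) versus a corner-like patch
flat = np.tile(np.array([[0.0], [0.0], [1.0]]), (1, 4))
corner = np.array([[1.0, 0.0, 0.0, 1.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0, 0.0]])
saliency = spectral_saliency(np.vstack([flat, corner]))
```

For the flat patch the covariance has a single non-zero eigenvalue equal to 4, giving a saliency of 0.25, whereas the corner-like patch spreads its energy over three eigenvalues and receives a higher value.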
3) Normalization and Fusion of Local and Global Saliency: Finally, we linearly scale the values of the geometric ($s^{(1)}$) and spectral ($s^{(2)}$) saliency to the range $[0, 1]$ and combine them through weighted averaging according to:
$$s_j = \frac{w_1 \bar{s}^{(1)}_j + w_2 \bar{s}^{(2)}_j}{w_1 + w_2} \tag{8}$$
where $\bar{s}^{(1)}$ and $\bar{s}^{(2)}$ denote the normalized geometric and spectral saliency features, and $w_1$ and $w_2$ the corresponding weights. We note here that we used equal weights ($w_1 = w_2 = 1$) in all of our experiments; however, the weights can be tuned to emphasize the local or global saliency descriptors, respectively.
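The normalization and fusion step can be sketched as follows, assuming min-max scaling for the linear normalization and a weighted average that divides by the sum of the weights; the function names are hypothetical.

```python
import numpy as np

def minmax_scale(values):
    """Linearly scale values to [0, 1]; a constant input maps to zeros."""
    x = np.asarray(values, dtype=float)
    span = x.max() - x.min()
    if span == 0.0:
        return np.zeros_like(x)
    return (x - x.min()) / span

def fuse_saliency(s_geometric, s_spectral, w1=1.0, w2=1.0):
    """Weighted average of the normalized geometric and spectral saliency."""
    s1 = minmax_scale(s_geometric)
    s2 = minmax_scale(s_spectral)
    return (w1 * s1 + w2 * s2) / (w1 + w2)

fused = fuse_saliency([0.0, 5.0, 10.0], [2.0, 2.0, 4.0])
```

With equal weights the example yields 0, 0.25 and 1 for the three vertices; increasing one weight shifts the result toward the corresponding descriptor while keeping it in the unit range.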
The proposed method has been shown to be robust [31], [32], even for complex surfaces with different geometrical characteristics and patterns, since it exploits spectral properties (i.e., sensitivity to the variation of neighboring normals) and geometrical characteristics (i.e., sparsity of intense prominent spatial features). An example of the visualization of the saliency map, as applied to the point cloud of the scene shown in Fig. 4 (a), is presented in Fig. 4 (b).

C. Scene Segmentation for the Identification of On-Road Obstacles
The saliency map of each frame is used to categorize different regions of the scene. For illustration purposes, the regions are visualized in different colors:
• Blue: The safe area of the road beyond the view of the driver.
• Yellow: Be-aware areas representing negative obstacles.
• Cyan: Hazardous areas in the range of the road representing positive obstacles.
• Purple: Dangerous areas outside of the range of the road.
• Red: Recognized obstacles in the range of the road (e.g., potholes).
To determine the vehicle's moving direction, we use steering data received from the internal sensors of the vehicle. The direction of the vehicle specifies which part of the scene in the field of view is in front of the vehicle and is used as a parameter, in addition to the saliency mapping, for the segmentation of the point cloud. The most critical regions are the ones that lie within the limits of the road. A segmentation example is illustrated in Fig. 4 (c).

D. Data Simulations
For the evaluation of our methodology, we created a rich dataset using CARLA, an open-source autonomous driving simulator [16]. CARLA is based on a server-client system, in which the server is responsible for running the simulations, including the calculation of physics, weather conditions, collision detection and sensor readings. It relies on the OpenDRIVE specification [58] for defining junctions, traffic lights, etc., which is also used for simulating independent agents, such as other cars and pedestrians. This makes CARLA well suited for creating complex scenarios and realistic driving conditions for our tests.
The server running the simulations is powered by Unreal Engine. Clients can connect and request changes to almost any element in the world, which is essential for the creation of scenarios. They also receive sensor data and manage input to the vehicle controlled by the user. CARLA supports a wide range of sensor suites with extensive configurability of their intrinsic parameters. In our work, we use a LiDAR sensor on top of the vehicle and a monocular RGB camera, placed in the front part of the car, for simulated data collection. By placing these sensors in an autonomous car and initiating its navigation in the virtual environment, we were able to create a very large dataset for evaluating our algorithms. In the future, we plan to assess the effectiveness of the AR visualization, with respect to reaction time and awareness increase, in a real environment with a driver manually controlling a vehicle.
Contributions in CARLA simulator: Due to the lack of benchmark point cloud datasets representing real road scenes with obstacles (potholes and bumps), we used the CARLA simulator to create obstacle-free environment data, into which we subsequently introduced simulated obstacles. Specifically, we designed obstacles as curved point cloud surfaces using the open-source software Blender and used them to substitute parts of the road. To avoid modeling the obstacles by hand, we followed an automated procedure to generate a large number of different obstacles based on several parameters, such as depth, ellipticity and size. An example of a frame in the CARLA simulator with a simulated pothole is presented in Fig. 4 (a) (texture) and in Fig. 4 (b) (geometry).
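To give an idea of how such a parametric generation could look, the following Python sketch samples a curved (paraboloid) surface inside an elliptical footprint. It is only an illustration: the paraboloid profile, the definition of ellipticity as the ratio between the minor and major semi-axes, and the function name make_pothole are assumptions, and the sketch does not reproduce the Blender-based procedure used in this work.

```python
import numpy as np

def make_pothole(depth=0.08, size=0.6, ellipticity=0.7, spacing=0.02):
    """Return an (n, 3) point cloud of a bowl-shaped negative obstacle.

    depth:       maximum depth below the road plane (metres)
    size:        length of the major axis of the footprint (metres)
    ellipticity: assumed ratio minor/major semi-axis, in (0, 1]
    spacing:     grid spacing of the sampled points (metres)
    """
    a = size / 2.0                 # major semi-axis
    b = a * ellipticity            # minor semi-axis
    xs = np.arange(-a, a + spacing / 2.0, spacing)
    ys = np.arange(-b, b + spacing / 2.0, spacing)
    X, Y = np.meshgrid(xs, ys)
    r2 = (X / a) ** 2 + (Y / b) ** 2
    inside = r2 <= 1.0             # keep only points within the ellipse
    Z = -depth * (1.0 - r2[inside])  # deepest at the centre, zero at the rim
    return np.column_stack((X[inside], Y[inside], Z))

# Example: a small family of obstacles with randomly drawn parameters.
rng = np.random.default_rng(0)
potholes = [
    make_pothole(depth=rng.uniform(0.03, 0.12),
                 size=rng.uniform(0.3, 1.0),
                 ellipticity=rng.uniform(0.5, 1.0))
    for _ in range(5)
]
```

The resulting point sets could then replace the road points that fall inside the same footprint, which mirrors the substitution step described above.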

IV. INTERFACES AND COMMUNICATION
Context-awareness is a critical factor for successful take-over requests, and a lot of effort has been devoted to determining the type of stimulus (e.g., visual, auditory, vibrotactile) [59] and the required time-window [60], [61], [62]. In the case of partial or conditional driving automation, our framework could be used to prepare the driver to quickly take control of the vehicle, if requested. To ensure that the driver is able to take over control of the vehicle swiftly and efficiently, we developed a notification system that presents relevant information about the condition of the environment. Our notification system is based on non-intrusive visual cues to prevent tunnel vision, alerting the driver to potential risks and also directing the driver's attention to the objects of interest that sparked the take-over request. In that way, in addition to assisting the human operator during manual driving, the system can, in times of automated driving, draw the attention of the operator to possible external hazards and prepare the operator to resume control. The visualization technique presented in this section is designed as an AR windshield interface, although this is not restrictive, i.e., the method can be implemented in any AR interface.

A. AR Visualization
The visualization of obstacles is performed by projection. Assuming that the positions of the AR interface and the LiDAR relative to the world are known, we construct a transformation matrix to map the points of the point cloud from the LiDAR coordinate system to the AR interface's coordinate system. The transformation between two different coordinate systems is typically performed by serially applying a scaling, a rotation and then a translation. Since both coordinate systems are orthonormal, the scaling can be omitted. Also, by taking advantage of the rigid body nature of the vehicle on which the LiDAR and the AR interface are mounted, we omit the rotation matrix given that, without loss of generality, we can assume that the two coordinate systems are aligned. Under these assumptions, the LiDAR coordinates are transformed into the AR interface's coordinates by a simple translation.
For projecting the points of the point cloud to the AR interface, we assume a simple pinhole camera model. If the AR interface is, for example, an AR windshield, then the windshield represents the image plane and the head of the driver the principal point with coordinates $(x_0, y_0)$. That way, the focal distance $f = (f_x, f_y)$ represents the distance from the driver to the image plane. With the dimensions of the image plane (windshield), and specifically the aspect ratio, known, the frustum is fully defined and the projection can be made from a point in 3D windshield coordinates $(x, y, z)$ to pixels $(u, v)$ on the image plane using the following equation: $u = f_x \, x / z + x_0, \quad v = f_y \, y / z + y_0$.
An undesirable property is the sparsity of the projected pixels, attributed to the sparsity of the point cloud. To overcome this limitation, we use an iterative nearest neighbour algorithm in the image space to fill the gaps between projected points. The result of this process is shown in Figs. 4 (d)-(i). More specifically, Fig. 4 (d) and Fig. 4 (g) illustrate the segmented point cloud projected to the AR interface of ego1 and ego2, respectively. Note that all information is rendered for the sake of completeness. In real-world cases only the necessary information (e.g., arrows or recognised potholes) will be rendered so as to avoid clutter. Fig. 4 (e) shows the perspective projection of the points to the AR interface and image filling for ego1. In Fig. 4 (f) and Fig. 4 (i), the pothole recognition and visualization is depicted for the vehicles ego1 and ego2, while a warning about an upcoming pothole (retrieved from the database) before reaching the field of view of ego2 is presented in Fig. 4 (h).
We would like to clarify here that, for the evaluation of our methodology and for demonstration purposes, the previous figures project and illustrate on the 2D display device all the information from the scene segmentation. However, in real driving scenarios only the most relevant information of the scene (e.g., dangerous objects, potholes) would be highlighted and displayed, so as to reduce any unnecessary information that may bother or confuse the driver.
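The projection pipeline of this subsection (translation, pinhole projection and gap filling) can be sketched in Python as follows. The sketch assumes aligned axes between the LiDAR and the AR frame, the standard pinhole relations $u = f_x x / z + x_0$ and $v = f_y y / z + y_0$, and a simple 4-neighbour propagation as one possible reading of the iterative nearest neighbour filling. All function names and the label image representation are illustrative assumptions.

```python
import numpy as np

def lidar_to_ar(points, lidar_pos, ar_pos):
    """Translate (n, 3) points from the LiDAR frame to the AR frame.

    Both sensor positions are given in world coordinates; the axes of the
    two frames are assumed to be aligned, so no rotation is applied.
    """
    return points + (np.asarray(lidar_pos) - np.asarray(ar_pos))

def project(points, f, c):
    """Pinhole projection of (n, 3) points; returns (m, 2) pixels and a mask.

    f = (fx, fy) is the focal distance and c = (x0, y0) the principal point.
    Points with z <= 0 lie behind the image plane and are discarded.
    """
    x, y, z = points.T
    keep = z > 0
    u = f[0] * x[keep] / z[keep] + c[0]
    v = f[1] * y[keep] / z[keep] + c[1]
    return np.column_stack((u, v)), keep

def fill_gaps(label_img, iterations=3, empty=-1):
    """Propagate labels into empty pixels from their 4-connected neighbours."""
    img = label_img.copy()
    h, w = img.shape
    for _ in range(iterations):
        padded = np.pad(img, 1, constant_values=empty)
        out = img.copy()
        for dy, dx in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            neigh = padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
            take = (out == empty) & (neigh != empty)
            out[take] = neigh[take]
        img = out
    return img
```

Each iteration of fill_gaps grows the labelled regions by one pixel, so the number of iterations would be chosen according to the expected spacing between projected points.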

B. Information Storage and Vehicle Communication Rules
One of the advantages of autonomous vehicles is their ability to communicate with each other, forming a cyber-physical system of systems. Many new opportunities arise from the ability of systems to share information, one of which is the transmission of objects or landmarks of interest that were previously observed by an agent to other agents of the system who could benefit from such information. In particular, our work focuses on information sharing among vehicles about encountered obstacles, such as potholes and bumps, through a centralized server. When a vehicle identifies an unexpected (i.e., unregistered) obstacle, the vehicle sends a request to the server and, after further inspection, the new potential obstacle is either discarded or added to the database. Vehicles may also send information regarding already known obstacles when they come across them. Such information includes the Global Positioning System (GPS) location, dimensions and geometrical characteristics, in case the obstacle needs updating in the database, e.g., it has increased in size or has been fixed. Through this communication system, a driver can be warned about potential hazards that are not yet in the field of view or are obstructed by other objects, which can improve the driver's performance and decision-making abilities. We should clarify that our work does not focus on communication protocols and defence mechanisms against potential network attacks, but rather defines a solid framework describing the roles of each node and the information flow.
By using the LiDAR-based obstacle detection method described in Section III, the vehicle transmits to a central server, via a communication component, the points belonging to the obstacle, segmented from the point cloud scene. The information is coupled with a timestamp and the GPS location of the vehicle at that instant. The server then transmits this information to any vehicle in the vicinity of the obstacle, alerting autonomous vehicles or human operators about potential hazards from a large distance and thus helping to alleviate the inability of the LiDAR sensor to identify obstacles from such a range. In the case of a driver, we also use the AR interface of the vehicle to display, in a non-distracting manner, the location and nature of the potentially upcoming obstacle. Potholes can change shape over time, most commonly due to deterioration of the surrounding pavement and erosion caused by environmental effects or, in the opposite case, due to pothole repair. Thus, periodic updates are necessary for the long-term reliability of the pothole visualization component. As the objects in the server database need to be periodically evaluated and updated in the case of changes, we assign a shape- and geometry-based descriptor to each obstacle, so that it is characterized by a unique representative signature. Thus, every vehicle encountering the obstacle at close range calculates the descriptor of the obstacle's area. The new descriptor is then transmitted to the server and is used to confirm whether the information is up-to-date. In the case of a difference in the descriptor's value, an algorithm running on the server decides between keeping the old descriptor, updating it with the new one, or marking the obstacle as removed and deleting the entry from the database.
More specifically, we implement a simple system that, when a pothole is detected, initiates a database search to determine whether the pothole is new or already exists and needs to be updated. Since potholes are static and thus change only in shape, the similarity check is based only on the bounding box of the re-identified pothole. When the overlap of the bounding boxes is less than a threshold, the previous object is replaced by the new one. In our experiments we used a threshold of 15% change in the area in either direction to avoid frequent unnecessary updates, while also retaining the required precision in representation. Similarly, the algorithm checks for significant changes in the bounding box dimensions. A flowchart showcasing the information update and communication pipeline between two vehicles is shown in Fig. 5.
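A possible server-side realization of this update rule is sketched below in Python. It is a simplified reading of the description rather than the authors' implementation: obstacles are represented by axis-aligned 2D bounding boxes, re-identification is reduced to an intersection test, and the 15% rule is interpreted as a relative change of the bounding box area in either direction. The names BBox and register, as well as the in-memory dictionary standing in for the database, are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class BBox:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def area(self) -> float:
        return max(self.xmax - self.xmin, 0.0) * max(self.ymax - self.ymin, 0.0)

def intersects(a: BBox, b: BBox) -> bool:
    """True if the two boxes share a region of non-zero area."""
    return (a.xmin < b.xmax and b.xmin < a.xmax and
            a.ymin < b.ymax and b.ymin < a.ymax)

def relative_area_change(stored: BBox, observed: BBox) -> float:
    """Relative growth or shrinkage of the area (stored area assumed > 0)."""
    return abs(observed.area() - stored.area()) / stored.area()

def register(db: dict, observed: BBox, threshold: float = 0.15) -> str:
    """Add a new obstacle, or keep/replace the matching database entry."""
    for key, stored in db.items():
        if intersects(stored, observed):
            if relative_area_change(stored, observed) > threshold:
                db[key] = observed       # significant reshape: replace entry
                return "updated"
            return "kept"                # small change: avoid a needless update
    db[max(db, default=-1) + 1] = observed
    return "added"
```

A check on the individual bounding box dimensions, or the removal of repaired potholes, could be added in the same loop; both are omitted here for brevity.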

V. EXPERIMENTAL ANALYSIS
In this section, we present and discuss the experimental analysis in detail and evaluate our proposed framework.

A. Experimental Setup, Datasets and Metrics
The experiments were carried out on an Intel Core i7-4790HQ CPU @ 3.60GHz PC with 16 GB of RAM. The core algorithms are written in MATLAB and C++. The evaluation of the methodology was performed using (i) a synthetic dataset of potholes that we have created and (ii) 3D point clouds of potholes from real datasets with known models (used as ground truth), which have also been used for the evaluation of other methods [9], [63], [64], [65].
The pothole detection algorithms are compared in terms of the pixel-level (for image-based methods) and point-level (for point clouds) precision, recall, accuracy and F-score. The performance metrics can also be expressed as shown in Table I, where Real Pothole (RP) represents the recall or, in other words, the percentage of vertices correctly annotated as pothole, Real Road (RR) represents the percentage of vertices correctly annotated as road, Not real Pothole (NP) represents the percentage of vertices wrongly annotated as pothole and Not real Road (NR) represents the percentage of vertices wrongly annotated as road.
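For reference, the point-level metrics can be computed from binary per-vertex labels as in the following Python sketch, which uses precision = TP/(TP+FP), recall = TP/(TP+FN), accuracy = (TP+TN)/(TP+TN+FP+FN) and F-score = 2·precision·recall/(precision+recall). The mapping of RP, RR, NP and NR to a row-normalized confusion matrix (percentages with respect to the actual pothole and road vertices, respectively) is an assumption about how Table I is organized, and the function name is illustrative.

```python
import numpy as np

def detection_metrics(pred, truth):
    """Metrics for binary labels (True/1 = pothole, False/0 = road).

    Assumes that both classes occur in the ground truth and that at least
    one vertex is predicted as pothole (otherwise a division by zero occurs).
    """
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = int(np.sum(pred & truth))     # pothole vertices found
    fp = int(np.sum(pred & ~truth))    # road vertices marked as pothole
    tn = int(np.sum(~pred & ~truth))   # road vertices found
    fn = int(np.sum(~pred & truth))    # pothole vertices marked as road
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision, "recall": recall,
        "accuracy": accuracy, "f_score": f_score,
        # Assumed row-normalized percentages in the spirit of Table I.
        "RP": 100.0 * tp / (tp + fn), "NR": 100.0 * fn / (tp + fn),
        "RR": 100.0 * tn / (tn + fp), "NP": 100.0 * fp / (tn + fp),
    }
```

Under this assumed normalization, RP and NR sum to 100%, as do RR and NP.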
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

B. Results
A quantitative comparison of our approach to other methods is hindered by the absence of datasets containing labeled real potholes in point cloud format. Furthermore, the existing LiDAR-based methods [68], [69], [70] in the literature are still in their early stages, often using 2D LiDAR data, and they do not provide any open datasets for quantitative comparison. Therefore, the only way to compare our pothole detection method was with image-based approaches that provide the corresponding point clouds for each pothole image. Despite the distinction in detection techniques between point clouds and RGB images, we maintained consistency by using the same models and metrics to ensure a fair comparison.
For the evaluation of our method, two publicly available datasets [9], [63] providing point clouds of real potholes were utilized. Fig. 6 visualizes results of our pothole detection method for the dataset of real potholes [63] under different density resolutions (Fig. 6 (a)-(d)). Points in red represent the vertices belonging to the pothole, while points in blue represent vertices belonging to the road, both for the ground truth and the estimated point clouds. Two dense models (Fig. 6 (a)) are utilized, as presented in rows 1-3 and 4-6, respectively. To investigate the performance of our approach in more realistic conditions, we increasingly downsampled the original point cloud (Fig. 6 (b)-(d)) to evaluate the robustness of our detection algorithm. The corresponding number of vertices for the two models (original and downsampled) is shown above each model. The heatmap (rows 2 and 5) illustrates the combined geometric and spectral saliency per vertex (as estimated from Eq. 8). Higher saliency values are depicted in deep red, while lower saliency values are depicted in deep blue.
Due to the sensitive nature of the specific application, which involves the safety of drivers (via information visualization for situational awareness), we prefer our algorithm to yield a small percentage of NP rather than even a small value of NR (please refer to Table I). Wrongly identifying a small area of the road around an actual pothole as a pothole is not as critical in our application as the opposite, namely failing to present, or only partially presenting, a potentially dangerous object (e.g., pothole, ramp). The detailed results with all evaluation metrics are shown in Table II for each of the thirteen 3D models of the point cloud dataset, and under different point cloud density resolutions. The results of this table show that our method is robust even for very low point cloud density. This is an important observation, since the output of the LiDAR device has a low density resolution pattern. Fig. 7 visualizes some examples of the pothole detection algorithm applied to another dataset [9]. The first column of this figure illustrates the RGB image presenting real road potholes. In the second column (Fig. 7-(b)), the corresponding point cloud with the relative texture is presented. The geometry represented by the 3D coordinates of the point cloud (without any color information) is presented in Fig. 7-(c). Fig. 7-(d
Table III provides a qualitative comparison of our method versus other approaches in the literature. However, it should be mentioned that the results are not directly comparable because the other methods use only the visual information of the RGB images, while our method uses only the geometrical information of the corresponding point cloud.
The experimental analysis showed that our methodology compares favorably to image-based pothole detection methodologies. This may be attributed to the use of LiDAR sensing, which provides depth information and has a large range, making it suitable for driving environments. In contrast, image-based methods rely on visual cues, such as changes in color or texture, which can be ambiguous and difficult to interpret. Moreover, the depth of a pothole cannot be accurately estimated from 2D information alone. LiDAR sensors can provide 3D information about the road surface, enabling more accurate detection and localization of potholes, even when the potholes are partially obscured by other objects or difficult to identify in images. Additionally, LiDAR sensors are less affected by changes in lighting conditions, such as shadows and reflections, which can impact the performance of image-based methods. They also have a longer range than cameras, enabling obstacle detection at greater distances and thus providing drivers with earlier warning of potential hazards, thereby improving overall safety. The point cloud processing system analyzes the environmental data and classifies regions as safe or potentially hazardous based on the presence of obstacles. A flexible visualization system was implemented to prioritize the display of critical information in the driver's field of view and to adjust the properties of the rendered information accordingly. This ensures that the driver is alerted to potential hazards in a non-intrusive way, without being overloaded with unnecessary information.

C. Visualization and AR Rendering
The detected potholes are rendered on an AR display in the driver's field of view, which can be a head-up display (HUD) or a windshield display. The rendering of potholes can be overlaid on the real-world view, using a variety of visual cues to indicate their severity. For example, potholes that are more salient could be rendered in larger sizes, brighter colours, or with animated effects, while less dangerous potholes could be rendered in smaller sizes or subtler colours.
To ensure the safe and effective conveying of information about potholes, the AR visualization system should be designed to be non-intrusive and appropriate for the driving context. We have implemented a flexible visualization system (Figs. 8-9) that allows the selection of a number of properties for the obstacles to be rendered (size, colour, animation and visualization type), so as to maximize the driver's ability to notice the presence of hazardous obstacles while minimizing distraction. Next, we first present some general recommendations on the choice of optimal properties for the visualization system, and then we present our preliminary research on the evaluation of users' personal preferences for the customization of visualization properties in a simulated driving environment.
To present information in a non-intrusive way, it is important to strike a balance between providing enough information to enhance situational awareness and avoiding information overload that may distract the driver. Some strategies for presenting AR information in a non-intrusive manner are: (a) Prioritize information [71] that is critical for safe driving, such as upcoming obstacles or hazards. (b) The size and placement of information can affect how noticeable it is to the driver [72]. Important information should be presented in a larger size and placed in a location where it can be easily seen without distracting the driver from the road. (c) Colours can affect how noticeable and attention-grabbing information is [72]. High-contrast colours, such as red, can be effective for highlighting critical information, while more muted colours can be used for less important information. (d) Animation can be attention-grabbing, but it can also be distracting. Animated information should be used sparingly and only for critical information that requires immediate attention [73]. (e) Drivers have different preferences for how they want information presented to them [74]. Providing customization options for the size, colour, and placement of information can help drivers tailor the information to their preferences and reduce distraction. Feedback from drivers can help identify areas where the presentation of information can be improved and help strike the right balance between providing enough information and not distracting the driver.
• Size: The size of the rendered obstacles should be proportional to their saliency. More important obstacles, such as potholes, should be rendered larger than less important ones. However, it is important to avoid rendering the obstacles so large that they become intrusive or distracting. The size of the rendered obstacles should be designed to quickly grab the driver's attention and provide sufficient information for the driver to take appropriate action.
• Color: The use of color can also help convey the saliency of obstacles. Brighter colors, such as red or yellow, can be used to indicate high saliency, while less important obstacles can be rendered with subtler colors. The use of contrasting colors can also help the driver quickly differentiate between obstacles and the surrounding environment.
• Animation: Animation can be used to provide additional visual cues that indicate the saliency of obstacles. For example, obstacles that are in the driver's immediate path could be rendered with a flashing or pulsating effect, while less important obstacles could be rendered with subtler animation.
• Visualization type: The AR content may be provided through different visualization techniques (Fig. 8):
- Wedge 3D. This method is based on rendering objects in the form of a pyramid-like scheme. The height of the pyramid is proportional to the distance between the user's vehicle reference and the object.
- Arrow. This visualization method involves a stick and an arrow tip. The arrow's direction (in 3D) follows the object's position, while the length of the stick is proportional to the corresponding distance.
- 3D Minimap. Three layers of concentric spheres are used to provide an estimation of the object's distance from the vehicle's position.
- Radar. The objects are represented as small squares in a radar-like area.
- Meshes Sphere. Similar to the previous method; however, the occluded objects are represented by spheres.
- Occluded Meshes As is. The occluded information is presented at the appropriate distance by transparently rendering the object's silhouette.
In addition to the visual symbolic warnings, alternative modalities can be used as well. For example, when a vehicle is approaching a pothole that has been detected by another vehicle, it could receive a text or sound warning to slow down or take immediate action.
1) Rule-Based Visualization and Warnings: In terms of visualizing and representing the outcomes of the proposed method through AR rendering, we suggest using a colour-coded system to indicate the severity of the detected potholes. For example, green could represent a small or shallow pothole, red could indicate a larger or deeper pothole that poses a greater risk to the driver, while orange could represent an intermediate situation (Fig. 9). The size and shape of the rendered pothole could also be adjusted to reflect its severity and distance from the driver. To ensure that the information is presented in a non-intrusive way, we suggest limiting the amount of information displayed on the AR-HUD to only the most critical or relevant obstacles. This can be achieved through intelligent filtering and prioritization algorithms that take into account factors such as the driver's speed and direction of travel, as well as the proximity and severity of detected obstacles (the latter expressed, for example, through indices like the Pavement Condition Index (PCI) [75] and the International Roughness Index (IRI) [76]). By combining the IRI with the depth and size of potholes, a more comprehensive measure of the severity of the road surface condition can be obtained. By providing only the most relevant information, we can help reduce the risk of information overload and ensure that the driver's attention remains focused on the road ahead.
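A rule-based scheme of this kind could be prototyped as in the Python sketch below. All numerical thresholds (depth and diameter limits, the time horizon and the maximum number of displayed items) are hypothetical placeholders chosen only for illustration; they are not values proposed or evaluated in this work, and indices such as PCI or IRI are not modelled.

```python
def severity_colour(depth_m, diameter_m,
                    shallow=0.03, deep=0.08, small=0.3, large=0.6):
    """Map pothole depth and diameter (metres) to a colour code.

    The thresholds are hypothetical placeholders, not calibrated values.
    """
    if depth_m >= deep or diameter_m >= large:
        return "red"
    if depth_m >= shallow or diameter_m >= small:
        return "orange"
    return "green"

def select_for_display(obstacles, speed_mps, horizon_s=5.0, max_items=3):
    """Keep obstacles reachable within the time horizon, most severe first.

    obstacles: list of dicts with keys 'depth', 'diameter' and 'distance'
    speed_mps: current vehicle speed in metres per second
    """
    rank = {"red": 0, "orange": 1, "green": 2}
    reachable = [o for o in obstacles
                 if o["distance"] <= speed_mps * horizon_s]
    ordered = sorted(
        reachable,
        key=lambda o: (rank[severity_colour(o["depth"], o["diameter"])],
                       o["distance"]),
    )
    return ordered[:max_items]
```

Here the speed-dependent horizon acts as a simple proximity filter, so that at higher speeds obstacles further ahead become eligible for display.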
2) User Evaluation Study: In order to design the visualization scheme following user-centered principles, a user evaluation study was performed to assess personal preferences while driving in a simulated environment. A steering wheelchair was implemented for the driving simulation, and a VR display device with a Leap Motion sensor was used to provide the AR information (Fig. 8). Out of all properties, we focused on the customization of the visualization type, which is considered more critical and less subjective than other factors like size, color, and animation.
A total of 12 adult participants (4 female, 8 male) took part in the experiment. Most of them were either employees or students at the University of Patras. The age range of the participants was 23 to 45 years, with an average of 27.5 years. The experimental process consisted of two parts. In the first part, the participants familiarized themselves with the simulator and the driving process. They were not given any specific instructions or time constraints, and were free to drive in any direction and for any duration they felt comfortable with. During this phase, the simulated vehicles adhered to driving rules and moved safely on the road. The participants had the opportunity to explore the simulator environment using all the provided visualization methods. In the second part, the participants were asked to follow a predetermined route that included various hazardous situations based on predefined scenarios. The different visualization methods available were utilized to enhance the drivers' situational awareness. Spatial indicators were used to guide them along the correct path. Finally, a questionnaire was administered at the end of the experimental process, which included general demographic questions (gender, age, education, etc.), inquiries about technology usage frequency, and questions pertaining to the evaluation of the visualization techniques. Specifically, the participants evaluated whether the information provided through the visualization techniques met their personal expectations, increased their trust and acceptance, and whether it was understandable and non-distracting.
The most important outcomes of the evaluation study are summarized next. All participants agreed that utilizing the proposed visualization system would enhance their driving safety and awareness of critical upcoming events. They also found the tool particularly useful when navigating unfamiliar mixed traffic environments. Furthermore, a significant majority of participants (83.3%) expressed a positive inclination towards using the proposed visualization system to alleviate their nervousness while driving in unknown areas. Regarding the application's efficacy in promoting safe and secure driving, all participants were in favor (66.7% strongly agree, 33.3% somewhat agree) of utilizing it. Nearly all participants (11 out of 12) demonstrated awareness of VR/AR training tools, and all of them (100%) expressed interest in utilizing VR/AR technology for training or learning purposes. It should be mentioned, however, that the characteristics of the group of participants might not reflect those of the average population, since the majority of participants (75%) had prior experience using an AR device, tool, or application, and all of them (100%) had previously used a VR device, tool, or application.
With regard to the visualization type, participants had diverse preferences. The most popular method was the presentation of occluded objects as transparent meshes, displaying the objects' silhouettes in their true form (83.3% rated it from 4 to 5 on a scale of 1 to 5). On the other hand, the least popular method was the use of minimaps (50% rated it below 3 on a scale of 1 to 5).

VI. CONCLUSION
In this work, we proposed a cooperative obstacle detection and rendering scheme that utilizes LiDAR data and driving patterns to identify obstacles within the road range. Our system allows for information sharing between connected vehicles, enabling drivers to be notified about upcoming potholes even when there is no direct line-of-sight. This cooperative driving scheme increases situational awareness and reduces the risk of accidents caused by unexpected obstacles. Our method is based on the analysis of point clouds, which is hindered by the lack of benchmark datasets obtained from LiDAR devices. To overcome this problem, we created our own synthetic dataset and added it to the maps of the CARLA simulator, thereby creating realistic driving environments. The comparison of our method with other state-of-the-art approaches, regarding the accuracy of pothole detection in real datasets, has shown its effectiveness, providing very promising outcomes. Our proposed approach can be extended to cover a wider range of road hazards beyond potholes, such as debris or uneven road surfaces. By utilizing the same LiDAR sensor technology, we can detect these hazards and provide similar AR visualizations to drivers. Moreover, we plan to investigate the integration of other sensing modalities, such as RGB-D cameras, which could provide additional visual information to improve the accuracy of obstacle detection and enhance the situational awareness of drivers. In addition, our methodology can be further improved by incorporating machine learning algorithms to enhance the accuracy and efficiency of obstacle detection and classification. We plan to explore the use of deep learning models, which have shown promising results in various computer vision tasks, to enhance our point cloud processing system. Lastly, we envision that our proposed approach could be applied beyond personal vehicles, such as in autonomous vehicles and public transportation systems. By leveraging V2X communication, our cooperative obstacle detection and rendering scheme could provide a safer driving experience for all road users.

APPENDIX A ROBUST PRINCIPAL COMPONENT ANALYSIS (RPCA)
RPCA is a powerful mathematical tool that has been used in many scientific domains in order to decompose an observed measurement $E$ into a low-rank matrix $L$, representing the where $\|L\|_*$ is the nuclear norm of the matrix $L$ (i.e., $\sum_i \sigma_i(L)$, the sum of the singular values of $L$).
Many approaches have been proposed over the years, presenting excellent results. However, despite the effectiveness that some works [77], [78] have presented in the past, the execution times of the proposed algorithms need improvement. This convex problem can be solved using a very fast approach, as described in [79], according to: In each iteration $(t)$, Eq. (11) is updated with rank $= K$. If $u_K / \sum_{i=1}^{K} u_i > \epsilon$, where $u$ denotes the singular values and $\epsilon$ is a small threshold, then the rank is increased by one (i.e., $K = K + 1$) and Eq. (12) is updated too. To update Eq. (11), a partial $\mathrm{SVD}(E - S^{(t)})$ is estimated keeping $K$ components.
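As an illustration of the iteration described above, the following Python sketch alternates between a rank-K approximation of $E - S^{(t)}$ and an update of the sparse part. Since the update equations are not reproduced here, the sketch makes several assumptions: the problem is taken to be the standard principal component pursuit formulation, $\min_{L,S} \|L\|_* + \lambda \|S\|_1$ subject to $L + S = E$; the sparse update is taken to be element-wise soft-thresholding; the default $\lambda = 1/\sqrt{\max(m, n)}$ is a common choice in the RPCA literature; and a full SVD is truncated instead of computing a true partial SVD. The function names are illustrative.

```python
import numpy as np

def soft_threshold(X, tau):
    """Element-wise shrinkage operator."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

def fast_rpca(E, lam=None, eps=0.01, iters=20):
    """Alternating low-rank / sparse decomposition with adaptive rank.

    E:    (m, n) observation matrix
    lam:  sparsity weight (assumed default: 1 / sqrt(max(m, n)))
    eps:  threshold on u_K / sum(u_1..u_K) that triggers a rank increase
    """
    m, n = E.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))
    S = np.zeros_like(E)
    K = 1
    for _ in range(iters):
        # Low-rank update: keep the K leading components of E - S.
        U, u, Vt = np.linalg.svd(E - S, full_matrices=False)
        L = (U[:, :K] * u[:K]) @ Vt[:K]
        # Rank adaptation: grow K while the K-th singular value is significant.
        if K < len(u) and u[K - 1] / u[:K].sum() > eps:
            K += 1
        # Sparse update (assumed): shrink the residual.
        S = soft_threshold(E - L, lam)
    return L, S
```

For large matrices, the truncated full SVD would be replaced by a routine that computes only the K leading singular triplets, which is where the speed-up of a partial SVD comes from.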

Manuscript received 20 July 2022; revised 26 May 2023; accepted 20 October 2023. Date of publication 7 November 2023; date of current version 13 May 2024. This work was supported by the European Union's Horizon 2020 Research and Innovation Program under Grant 101092875-DIDYMOS-XR: Digital DynaMic and responsible twinS for XR. The Associate Editor for this article was R. Arghandeh. (Corresponding author: Gerasimos Arvanitis.)
stored and then used for the AR-based visualization and communication to other nearby vehicles.

Fig. 4. (a) Image from the camera of the vehicle, the texture of a pothole is also apparent, (b) Extracted saliency map of the same road scene, (c) Segmentation of the point cloud scene based on the saliency map, (d) Example of segmentation of the point cloud projected to the AR interface (in the view of ego1), (e) Perspective projection of the point cloud vertices to the AR interface and image filling, (f) Pothole recognition (highlighted in red color) and AR visualization of the corresponding information (in the view of ego1), (g) AR projection of the point cloud vertices to the scene image that depicts the starting point of view of the ego2 vehicle, (h) Early warning of upcoming pothole to inform ego2, (i) Pothole recognition and visualization (in the view of ego2).
(for point clouds) $\text{precision} = TP/(TP + FP)$, $\text{recall} = TP/(TP + FN)$, $\text{accuracy} = (TP + TN)/(TP + TN + FP + FN)$ and $F\text{-score} = 2 \cdot (\text{precision} \cdot \text{recall})/(\text{precision} + \text{recall})$, where $TP$, $FP$, $TN$, $FN$ represent the number of True-Positive, False-Positive, True-Negative and False-Negative pixels, respectively. The positive class includes all vertices belonging to the pothole (P) and the negative class all vertices belonging to the road (R).
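The four metrics above can be computed from per-vertex labels as in the following minimal sketch. The function names `confusion_counts` and `detection_metrics` are hypothetical, labels are assumed to be booleans (`True` for pothole, `False` for road), and returning 0.0 when a denominator is zero is a convention chosen here, not one stated in the text.

```python
def confusion_counts(predicted, ground_truth):
    """Count TP, FP, TN, FN for boolean per-vertex labels (True = pothole)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    tp = fp = tn = fn = 0
    for p, g in zip(predicted, ground_truth):
        if p and g:
            tp += 1          # pothole vertex correctly detected
        elif p and not g:
            fp += 1          # road vertex wrongly marked as pothole
        elif not p and not g:
            tn += 1          # road vertex correctly left unmarked
        else:
            fn += 1          # pothole vertex that was missed
    return tp, fp, tn, fn


def detection_metrics(tp, fp, tn, fn):
    """Precision, recall, accuracy and F-score as fractions in [0, 1]."""
    def ratio(num, den):
        # Assumed convention: an undefined ratio is reported as 0.0.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    accuracy = ratio(tp + tn, tp + tn + fp + fn)
    f_score = ratio(2 * precision * recall, precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_score": f_score,
    }
```

Multiplying the returned fractions by 100 gives percentages, the unit used in the table titles.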

Fig. 6. Pothole detection in point cloud data of real potholes [63]. Two dense models are visualized: model1 (rows 1-3) and model2 (rows 4-6). For each model, the three rows illustrate (i) the ground truth, (ii) the heatmap visualizing the saliency map of the pothole and (iii) the estimated point cloud, respectively. The columns show results with decreasing density resolutions (with respect to the original model): (a) original model, (b) ∼50% of the vertices, (c) ∼10% of the vertices, (d) ∼5% of the vertices.

Fig. 7. Pothole detection on real data [9].(a) RGB images of potholes, (b) corresponding point cloud of potholes with texture, (c) point cloud of potholes, (d) ground truth binary mask of potholes, (e) estimated binary mask of potholes, (f) enlarged details of the ground truth point cloud, (g) enlarged details of the estimated point cloud.

Fig. 9. (i-iii) Different severity prioritization of potholes based on their size and volume, (a) image of the camera without any visual cue, (b) projected point cloud representing the identified pothole, (c) overlay visualization of the pothole.

TABLE I
EVALUATION METRICS FOR POTHOLE DETECTION (IN PERCENTAGE %)

TABLE II
POTHOLE DETECTION ACCURACY (IN PERCENTAGE %) FOR DIFFERENT DENSITY RESOLUTIONS OF THE POINT CLOUD MODELS

TABLE III
COMPARISON OF THE POTHOLE DETECTION ACCURACY AMONG DIFFERENT STATE-OF-THE-ART APPROACHES