High-Definition Maps: Comprehensive Survey, Challenges, and Future Perspectives

In cooperative, connected, and automated mobility (CCAM), the more automated vehicles can perceive, model, and analyze the surrounding environment, the more they become aware and capable of understanding, making decisions, as well as safely and efficiently executing complex driving scenarios. High-definition (HD) maps represent the road environment with unprecedented centimetre-level precision with lane-level semantic information, making them a core component in smart mobility systems, and a key enabler for CCAM technology. These maps provide automated vehicles with a strong prior to understand the surrounding environment. An HD map is also considered as a hidden or virtual sensor, since it aggregates knowledge (mapping) from physical sensors, i.e., LiDAR, camera, GPS and IMU to build a model of the road environment. Maps for automated vehicles are quickly evolving towards a holistic representation of the digital infrastructure of smart cities to include not only road geometry and semantic information, but also live perception of road participants, updates on weather conditions, work zones and accidents. Deployment of autonomous vehicles at a large scale necessitates building and maintaining these maps by a large fleet of vehicles which work cooperatively to continuously keep maps up-to-date for autonomous vehicles in the fleet to function properly. This article provides an extensive review of the various applications of these maps in highly autonomous driving (AD) systems. We review the state-of-the-art of the different approaches and algorithms to build and maintain HD maps. Furthermore, we discuss and synthesise data, communication and infrastructure requirements for the distribution of HD maps. Finally, we review the current challenges and discuss future research directions for the next generation of digital mapping systems.


B. DIGITAL MAPS
The advent of modern satellite systems and imagery technology has revolutionized the creation of accurate and detailed digital representations of the world, giving rise to what we now call digital maps, such as Google Maps, OpenStreetMap, Apple Maps, Garmin, and Mapbox. Digital maps encode road structures and basic semantic information as well as points of interest (POI). Several methodologies and techniques exist to extract and recognize geographic features needed to build these maps from satellite images [2]. Digital maps are now an essential tool in our daily life, especially when integrated with GPS. Indeed, such integration has been a core component in building a huge number of digital services, most importantly for navigation and routing. These maps have been mainly developed to help humans and are now available in the most recent vehicles to assist human drivers. However, their accuracy, precision and update frequency fall short of AD requirements [3], [4], [5], under which the vehicle needs a sufficient degree of positional precision as well as detailed lane-level information.

C. ENHANCED DIGITAL MAPS
Digital maps have been significantly improved to meet the requirements of advanced driver assistance system (ADAS) functions such as lane-keeping assist [6] and adaptive cruise control (ACC) [7]. Typical features included in these Enhanced Digital Maps are speed limits, road curvature and gradient, lane information, as well as traffic signs and traffic lights [8]. Enhanced digital maps are also called ADAS maps and are currently an integral part of most modern vehicles to enable ADAS functions. Although enhanced digital maps introduced lane-level information, their geometric precision and the level of semantic details limit their applicability at higher levels of autonomy. In AD systems, the vehicle is required to be localized with high precision with respect to its environment [9], [10], understand the current situation [11] and plan collision-free trajectories [12]. To reach this level of autonomy, automated vehicles are required to have access to maps not only with centimetre-level positional accuracy and lane-level geometric information but also a 3D model of the environment, as well as all static and dynamic features of the road environment.

D. HIGH DEFINITION MAPS
The above-mentioned requirements gave rise to what are now called high-definition maps, or simply HD maps. Figure 1 highlights the evolution of maps, their features and usage as well as the information they contain and their level of precision and detail. HD maps were born at the strategic research planning workshop organized by a small group of researchers at Mercedes-Benz in Stuttgart in 2010 [13]. The Bertha Drive Project marked the first successful use of HD maps in various functions of an AD system [14]. In this project, fully AD experiments along the Bertha Benz memorial route were conducted using HD maps developed by HERE, one of the project partners [5], [13]. One of the key outcomes from the Mercedes-Benz planning workshop is likely the requirement for a highly detailed and accurate map, which can serve as an additional sensor to enhance the vehicle's understanding and perception of its surroundings. An HD map is sometimes referred to as the hidden or the virtual sensor, since it provides autonomous vehicles with a strong prior to understand the surrounding environment, even far beyond what onboard physical vehicle sensors can provide [4], [15]. It is even considered as the most intelligent sensor in AD [16]. Furthermore, maps can offer an unlimited range and therefore, they could improve decisions and situation awareness, especially in occluded zones [5], [17]. While most physical sensors used in autonomous vehicles are vulnerable to environmental conditions, especially cameras and LiDARs, maps would not fail if kept accurate, consistent and up-to-date [18]. HD maps are believed to enable the next generation of automated vehicles, and there is wide agreement that these maps will be central to the digital transformation of intelligent infrastructures, as well as strongly contribute to more sustainable mobility solutions. HD maps have become a crucial component to power vehicles dealing with complex driving scenarios.
They are used to improve vehicle localization by matching map data with collected sensor data in real-time [10]. Furthermore, they play an important role in improving the accuracy and reliability of perception in automated vehicles, as they include information on the various features found in road environments [19]. This can help the vehicle accurately recognize and classify these features. As HD maps contain rich lane-level information, they can also be used to support the calculation of efficient, feasible and safe routes and itineraries. Information about the capacity and capabilities of roads, such as the number of lanes, the speed limit, and the presence of turn lanes, can be used to calculate routes that are suitable for the specific vehicle and its capabilities [20], [21]. Additionally, HD maps can provide a detailed and accurate representation of the road environment, including the location and shape of roads, intersections, landmarks, POIs and many other features that allow one to model the structure of the environment, understand the driving context and thus anticipate risks and potential hazards [22]. Furthermore, HD maps can also be used to predict the likely paths and movements of other road users, such as pedestrians and other vehicles [23], [24]. These predictions are possible thanks to the detailed geometric representations and the rich semantics in HD maps. Moreover, HD maps can support the planning of feasible and collision-free trajectories that respect traffic rules [12]. Although several surveys cover the different AD functions, the functions that depend on HD maps are only sparsely covered in the state-of-the-art. Part of this paper extensively reviews the previous works in each of the above-mentioned use cases.

E. SCALABLE MAPPING: OVERVIEW OF CHALLENGES
During the last decade, there have been tremendous research and development efforts both from academia and industry to push the limits towards affordable, self-maintained and scalable HD maps. However, there are various unsettled challenges in building HD maps at scale [25]. These challenges prevent HD maps from attaining their full potential and ultimate goal in autonomous mobility. They fall into one of the following categories.

1) DATA COLLECTION
Data collection for an HD map can be a time-consuming and labour-intensive process. It typically involves using a combination of sensors, such as GPS, IMU, LiDARs, and cameras, to gather detailed information about the environment.

2) DATA COMMUNICATION
Data communication involves the transfer of mapping data from where they are collected to where they are processed to build an HD map, and finally to where they are consumed, e.g., by an autonomous vehicle. Mapping vehicles generate large quantities of data from different sensors that need to be processed to build and update maps. Handling these data in real-time from a large number of mapping vehicles is indeed a challenge.

3) DATA PROCESSING
Data processing is the step that creates an HD map by extracting the elements and features needed to build it [26]. This can be a very complex task, especially for large maps, as it involves aggregating and aligning data from multiple sources and ensuring that the map is accurate and up-to-date. When HD maps are created at scale with a large number of mapping vehicles involved in the mapping process, precise temporal synchronization must be guaranteed to avoid data misalignment [27]. Synchronization using the pulse-per-second (PPS) signal generated by GPS tends to be the most common approach to have all onboard vehicle sensors synchronized [28].
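To illustrate why a shared time base matters for data alignment, the following minimal sketch pairs measurements from two sensors by nearest timestamp. It assumes that both streams are already stamped against a common clock (for instance one disciplined by the GPS PPS signal); the function name `match_nearest`, the sample timestamps and the 5 ms tolerance are illustrative assumptions and are not taken from the reviewed works.

```python
from bisect import bisect_left

def match_nearest(reference_stamps, other_stamps, tolerance_s=0.005):
    """Pair each reference timestamp with the closest timestamp of another sensor.

    Both lists hold seconds on a common clock and are sorted in ascending order.
    Returns, for each reference stamp, the index of the match in other_stamps,
    or None when the closest candidate is further away than tolerance_s.
    """
    matches = []
    for t in reference_stamps:
        i = bisect_left(other_stamps, t)
        # Only the neighbours on either side of the insertion point can be closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_stamps)]
        best = min(candidates, key=lambda j: abs(other_stamps[j] - t), default=None)
        if best is not None and abs(other_stamps[best] - t) <= tolerance_s:
            matches.append(best)
        else:
            matches.append(None)
    return matches

# Hypothetical timestamps: a 10 Hz LiDAR and a camera with one late frame.
lidar_stamps = [0.000, 0.100, 0.200, 0.300]
camera_stamps = [0.002, 0.098, 0.250, 0.301]
pairs = match_nearest(lidar_stamps, camera_stamps)
print(pairs)
```

In this sketch the third LiDAR sweep is left unpaired, since no camera frame lies within the tolerance; a real pipeline might instead interpolate or discard such measurements.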

4) MAP MAINTENANCE
Map maintenance is the process of continuously keeping the HD map up-to-date according to the changes in the road environment, such as construction sites, road blockages, and modifications of road connections. Since the road environment is highly dynamic and undergoes changes, this process requires frequent data collection and processing efforts.

5) DATA SECURITY AND PRIVACY
Data security and privacy are crucial for HD maps, as they often contain sensitive information, such as the locations of buildings and infrastructure. Ensuring that this data is protected and not misused is a significant challenge.

6) MAPPING COST
Mapping cost is an important factor in the process of creating HD maps. Building maps at large scale necessitates using a large number of mapping vehicles, each equipped with an expensive suite of mapping devices with high-precision sensors. This cost becomes significant when mapping large areas. HD mapping using consumer-grade sensors is possible, but it comes at the cost of more sophisticated mapping algorithms.

F. CONTRIBUTIONS
This paper provides an in-depth overview of HD maps including a unified model of their layered architecture. Further, the paper highlights the importance of HD maps in modular AD systems and provides a synthesis of how they are used in the various AD core functions. Given the aforementioned challenges of mapping data collection, communication, processing, security, and costs, this paper extensively reviews the previous works on building and maintaining HD maps, including cost-effective solutions as well as the communication and mapping data requirements from generation to distribution. Additionally, the paper discusses the current challenges in each of the above areas for building and maintaining HD maps. Finally, we shed some light on the future and next generation of HD maps for mobility.
The main contributions of this work can be summarized as follows:
• A free-standing overview of HD maps as a background for the broader community of intelligent transportation systems.
• A detailed review of the state-of-the-art of the uses of HD maps in the various core functions of AD systems.
• A comprehensive review of the different approaches, methods and algorithms to maintain the different layers of HD maps and keep them up-to-date.
• Discussion on key challenges and future perspectives of HD maps in CCAM and beyond.

G. ORGANIZATION
The objective of this survey paper is to provide a detailed and extensive review of recent research works in HD maps. In Section II, we synthesize and analyze relevant survey papers in the state-of-the-art of AD and HD maps. We further discuss how the present survey is positioned among previous works and its contributions. In Section III, we provide an overview of HD maps as well as a description of the different layers and the information contained in each. In Section IV, we provide an extensive review of applications of HD maps in AD systems. For each component in AD systems, previous works are classified based on two criteria: (a) which map data are used (layer) and (b) what these data are used for. In Section V, we extensively review different approaches and algorithms to build HD maps. Section VI is dedicated to maintenance of HD maps, where we review the different methods to keep HD maps up-to-date. Section VII discusses the different communication infrastructures and protocols needed for HD maps. In Section VIII, we conclude the paper by discussing the current challenges of building, maintaining, and distributing HD maps at scale.

II. RELATED SURVEY WORK
Although there are numerous review papers covering various AD topics, the vast majority of them do not address the subject of HD maps in detail. Yurtsever et al. [29] presented a comprehensive survey of AD systems focusing on emerging technologies, the common practices from high-level system architectures to the different methodologies and typical core AD functions. In [29], besides the review of the various AD core components, namely localization, perception, planning and control, the mapping part has been briefly discussed. Further, mapping is often presented as an integral part of localization, e.g., in the scope of Simultaneous Localization and Mapping (SLAM) and not for HD maps in AD. Moreover, the various core components of AD systems have been the subject of discussions in several survey papers, for localization [9], [10], perception [30], [31], scene understanding [11], motion prediction [24], [32], [33], motion planning and control [12]. Mapping for autonomous vehicles has been discussed very often in the context of SLAM [34], [35], [36]. However, HD maps as a subject are far more complex and more comprehensive compared to classical mapping in robotics. Furthermore, the insight of most of these papers is oriented towards building maps for robot navigation in unstructured environments. On the other hand, the environment of autonomous vehicles is highly structured and subject to traffic rules. HD maps enable autonomous vehicles to understand and navigate the road environment while respecting these rules. The present paper aims to provide a comprehensive and systematic review of the state-of-the-art of HD maps compared to available surveys that cover either one aspect of HD maps or provide a very general overview. For instance, Puente et al. [28] reviewed the different technologies and platforms used in data collection for HD maps, e.g., Mobile Mapping Systems (MMS). Elhashash et al. 
[37] reviewed the different sensors used to build MMS and discussed their utility and applications. Ma et al. [38] went a step further and discussed the different methodologies and algorithms used to extract road features from the point clouds generated by MMS. Their work, however, focused on geometric road features. Similarly, Zheng et al. [39] presented an overview of the different methods used to extract lane-level road geometry as well as a mathematical model used to represent extracted features. More recently, a comprehensive survey of the generation algorithms for the various elements and layers of HD maps and their formats has been presented in [26]. While the current generation algorithms for HD maps may have limitations and fall short of desired performance and accuracy [26], there are extensive and rapidly advancing research endeavours focusing on building HD maps, particularly utilizing deep learning techniques. Reliable algorithms for building HD maps are considered the main part of the challenge. Building HD maps at scale involves various aspects, such as data collection from several vehicles, data processing by building algorithms, aggregation in cloud servers and distribution to autonomous vehicles in standardized formats via suitable communication protocols [40]. Motivated by the fact that the road environment is highly dynamic and undergoes changes frequently, the review of Boubakri et al. [41] has focused only on the techniques of updating HD maps. The multidisciplinary character of HD maps motivated several researchers to present an overview [5], tutorial [3] and high-level review papers [4], [15]. The present survey paper differs from the above-mentioned reviews in three main points. (1) First, we provide a thorough overview of HD maps and review their different formats. Furthermore, we adopt a generic definition of the different static and dynamic layers in HD maps. 
The elements in these layers constitute the basis of the taxonomy used in the rest of the paper to classify previous works in HD maps. (2) Second, we provide a comprehensive review of the use cases of HD maps in the different core components of AD systems, e.g., in localization, perception, routing, motion prediction and motion planning. For each of these functions, we synthesize how HD maps are used to improve its functionality. We systematically classify these works based on the HD map layer. (3) Finally, we review in detail the recent research papers focusing on building and updating HD maps. More precisely, we synthesize these works and provide a taxonomy of them based both on the sensor data used and on the features generated and their corresponding map layer.

III. HD MAPS: AN OVERVIEW
Early HD maps were only extensions of Enhanced Digital Maps used in ADAS, and they were referred to as prior maps [14], [43]. The term HD maps is quite recent but is now widely accepted in the CCAM industry, including Tier I automotive companies, map providers, and OEMs. HD maps encapsulate all necessary information for automated vehicles to understand the driving environment with very high precision [5]. While generally there is a consensus that HD maps are a core enabler for CCAM, there are no clear guidelines or a standard of what information constitutes an HD map, and how it is represented [44], [45], [46]. Nevertheless, available HD maps in the market share common features. Centimetre-level positional accuracy and the availability of lane-level geometric and semantic information are the essential features found in most HD maps [3], [5], [46]. At its most basic level, an HD map can simply be a set of points and line segments with accurate positions representing road signs, lane markings, lane borders, and lane dividers [26]. Today's HD maps are becoming more complex due to the requirements of AD systems, where data from different sources constitute several layers of information about the driving environment [47]. Breaking down an HD map into multiple layers allows for a more structured data representation of the road environment. This facilitates accessibility by the different components of an AD system, which require that the environment is modelled at different levels of detail. Furthermore, a layered data representation makes it easy to build, store, retrieve and maintain the map. The HD map used in the Mercedes-Benz Bertha Drive research project [13] defined three layers [14]. A two-layered HD map has been used in the BMW AG experiments, where the first layer was dedicated to geometric and semantic lane information and the second to road/lane markings used for localization [48]. 
Similar to the map used in Bertha Drive, TomTom and HERE also adopted a three-layered data structure for their HD maps [49], [50]. The data in these three-layered models represent, in one form or another, the lane geometry, the road connection network and a few semantics, but in different standard formats [5]. The HD Live Map of HERE is composed of three layers, namely, the Road Model, HD Lane Model, and HD Localization Model [49]. The Road Model defines road-level topology and geometric features as well as country-specific road classification. As the name implies, the HD Lane Model provides highly precise lane-level features such as lane driving direction, lane type, lane boundary, and lane marking types. These data allow automated vehicles to plan more comfortable local trajectories. The HD Localization Model is composed of object-level semantic features such as traffic signs, traffic lights, and other road features. This layer helps the vehicle to accurately estimate its position using object location. Examples of these hierarchical layers are given in Fig. 2. Similarly, the TomTom HD map is composed of a three-layer model, namely Navigation Data, Planning Data, and RoadDNA [5]. The latter is used for localization [51].
There are several ways to represent and store the data of HD maps. Among existing formats used to represent HD maps are OpenStreetMap [52], Lanelet2 [53], [54], OpenDrive [55], Navigation Data Standard (NDS) [56], Geographic Data Files (GDF) [57], GeoJSON [58] and ESRI shapefiles [59]. More details about different formats of map data for AD can be found in [4], [45], [46]. Software tools are available for conversions between most of these formats [60], [61], [62], [63]. Using HD maps to aid automated vehicles in improving their localization with respect to the environment has been one of the early motivations for creating such a geographic database [10]. The principle of map-based localization is to match observations obtained by the perception system with features included in the HD map. These features are either geometric (e.g., lane markings) or semantic map elements (traffic signs, road signs, and POI) [14]. Localization using geometric and semantic features tends to be challenging, especially in zones in which these features are sparse [64]. An alternative approach for localization is to match dense raw sensor data with a 3D spatial representation of the road environment (e.g., a point cloud map) [65], [66]. Although localization based on a dense map layer and raw sensor data can achieve better pose estimation results [10], storage and processing requirements tend to be one of the limitations of this approach. One solution is to use compact 2D/3D representations such as 3D occupancy grids or voxels. Having a prebuilt dense 3D spatial layer of the road environment in HD maps becomes crucial for highly AD systems, since the accuracy and robustness of localization determine the reliability of the whole system (cf. Section IV for more details). This motivates map providers to include a layer with such a representation in their map hierarchy. TomTom supports their RoadDNA layer with a 2D raster image that is converted from a 3D point map [51]. 
In addition to its vector map in Lanelet2 format, Autoware, the open-source AD software, uses a separate 3D point cloud map [67], mainly for localization using the Normal Distribution Transform (NDT) method [68]. Likewise, Apollo, the open-source AD software of Baidu, uses a prebuilt point cloud map in its localization system [69] and another vector map in OpenDrive format [70]. HD maps are now an integral component of most AD simulation tools to account for more realistic scenarios [71], [72]. Furthermore, new releases of several AD datasets come with an HD map (Waymo open dataset [73], [74], Argo AI Argoverse I [75] and Argoverse II [76], Motional nuScenes [77], Lyft L5 [23]).
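The compact voxel representation mentioned above as a way to reduce the storage cost of dense point cloud layers can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular map provider: the function name `voxelize`, the 0.5 m voxel size and the sample points are hypothetical, and each occupied voxel simply keeps the centroid of the points that fall into it.

```python
from collections import defaultdict

def voxelize(points, voxel_size=0.5):
    """Reduce a 3D point cloud to one centroid per occupied voxel.

    points: iterable of (x, y, z) tuples in metres.
    Returns a dict mapping integer voxel indices (i, j, k) to the centroid
    of all points that fall into that voxel.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # Floor division assigns each point to a cube of side voxel_size.
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        buckets[key].append((x, y, z))
    return {
        key: tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for key, pts in buckets.items()
    }

# Hypothetical toy cloud: two nearby points collapse into one voxel.
cloud = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (1.1, 0.0, 0.0)]
voxels = voxelize(cloud)
print(len(cloud), "points ->", len(voxels), "voxels")
```

The voxel size trades storage against geometric detail; the keys of the returned dictionary alone already form a sparse 3D occupancy grid.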
As discussed above, there are several ways to represent map information used in AD systems including lane-level details, such as lane boundaries, lane marking types, traffic direction, crosswalks, driveable area polygons, and intersection annotations. Although the driving environment is highly dynamic, the data represented in these three layers are static. A holistic representation of the environment shall also include real-time traffic information about observed speed, weather conditions, congestion zones, blocked road zones (constructions), etc. This section tries to provide a global overview of the information stored in these layers in a unified manner. Although most HD map providers have their own definition and formats, and there is no unique standard yet for HD maps, we categorize the information contained in HD maps into six distinct layers as described in Figure 2.

A. BASE MAP LAYER
The base map layer is the foundation of an HD map and is considered as a reference layer on which all other layers are built. It contains a highly accurate 3D geospatial representation of the environment, such as the location and shape of roads, buildings, and other structures. A 3D geospatial model of the road environment is becoming an important source of information for autonomous vehicles. It is now common for an HD map to contain a 3D representation of the environment. The base map layer is typically created using point clouds from LiDARs and/or images from one or more cameras, sometimes assisted with GPS/IMU. This suite of sensors constitutes an MMS that makes it possible to create highly accurate and detailed 3D point clouds representing the environment. Road and lane geometric and semantic features are extracted from this layer to build other layers in HD maps. Since this layer contains a dense data representation of the environment, it plays an essential role in the precise localization of autonomous vehicles. Several techniques for point cloud registration allow estimating the vehicle pose by matching raw sensor data against a point cloud from this layer. Building and updating this layer is challenging in terms of data processing and communication requirements [78].
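The pose estimation step underlying point cloud registration can be illustrated with a deliberately simplified sketch. It assumes a planar (2D) problem and that the correspondences between scan points and map points are already known, which is the closed-form alignment step that iterative methods repeat after re-estimating correspondences; it is not an implementation of NDT or of any registration method reviewed in this paper. The names `estimate_pose_2d` and `transform` and the sample data are hypothetical.

```python
import math

def estimate_pose_2d(sensor_pts, map_pts):
    """Least-squares rigid alignment of matched 2D points.

    Finds (theta, tx, ty) such that rotating each sensor point by theta and
    translating it by (tx, ty) best fits the corresponding map point.
    """
    n = len(sensor_pts)
    sx = sum(p[0] for p in sensor_pts) / n
    sy = sum(p[1] for p in sensor_pts) / n
    mx = sum(q[0] for q in map_pts) / n
    my = sum(q[1] for q in map_pts) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(sensor_pts, map_pts):
        ax, ay = px - sx, py - sy   # centred sensor point
        bx, by = qx - mx, qy - my   # centred map point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # The translation maps the rotated sensor centroid onto the map centroid.
    return theta, mx - (c * sx - s * sy), my - (s * sx + c * sy)

def transform(points, theta, tx, ty):
    """Apply a 2D rigid transform to a list of points."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

# Hypothetical scan in the vehicle frame and its noise-free image in the map frame.
scan = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0), (3.0, 1.0)]
map_points = transform(scan, 0.3, 2.0, -1.0)
theta, tx, ty = estimate_pose_2d(scan, map_points)
print(round(theta, 6), round(tx, 6), round(ty, 6))
```

With noise-free correspondences the known transform is recovered up to floating-point error; real registration has to cope with unknown correspondences, noise and three-dimensional motion.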

B. GEOMETRIC MAP LAYER
Despite its precise and dense representation of the environment, the base map layer's ability to support understanding of the environment is limited, due to the lack of meaningful features in its representation. The geometric layer in the HD map provides detailed information about the geometry of the road environment, including the location and shape of roads, lanes, curbs, and other features. The geometric layer typically includes information about road width, the number of lanes, the centerline of each lane, the borders of lanes in each road, and the elevation of the road surface. It also includes information about the precise location and shape of curbs, sidewalks, pedestrian crossings, and both vertical and horizontal traffic signs. Each of these features is represented in terms of basic geometric primitives, i.e., points, lines, multilines, and polygons. For example, the location of a vertical traffic sign could be represented by a point. A lane centerline or border can be represented by a set of line segments connected to one another, i.e., a multiline. Similarly, a pedestrian crossing can be represented by a polygon. Geometric features of this layer are created by processing data of the base map layer. Building the geometric layer from the base map data typically involves several processing steps, including road segmentation and the extraction of lane information, road signs, poles, traffic signs, curbs, barriers and road surface features. This layer provides a highly accurate, lane-level geometric representation of the road features. Geometric features in an HD map are essential for various AD core components, most importantly for precise motion predictions of dynamic road participants, as well as for safe planning of geometrically feasible trajectories.
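The geometric primitives described above can be sketched as simple data types. The class names, identifiers and coordinates below are hypothetical and do not follow any specific HD map format; planar coordinates in metres are assumed.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]

@dataclass(frozen=True)
class PointFeature:
    """A point primitive, e.g. the location of a vertical traffic sign."""
    feature_id: str
    position: Coord

@dataclass
class MultilineFeature:
    """Connected line segments, e.g. a lane centerline or border."""
    feature_id: str
    vertices: List[Coord]

    def length(self) -> float:
        return sum(math.dist(a, b) for a, b in zip(self.vertices, self.vertices[1:]))

@dataclass
class PolygonFeature:
    """A closed area, e.g. a pedestrian crossing."""
    feature_id: str
    vertices: List[Coord]

    def contains(self, x: float, y: float) -> bool:
        # Ray casting: count how many polygon edges a horizontal ray crosses.
        inside = False
        n = len(self.vertices)
        for i in range(n):
            x1, y1 = self.vertices[i]
            x2, y2 = self.vertices[(i + 1) % n]
            if (y1 > y) != (y2 > y):
                x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
                if x < x_cross:
                    inside = not inside
        return inside

sign = PointFeature("pt_17", (12.4, 88.1))
centerline = MultilineFeature("ln_3", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
crossing = PolygonFeature("pg_5", [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (0.0, 2.0)])
print(centerline.length(), crossing.contains(1.0, 1.0), crossing.contains(5.0, 1.0))
```

Such primitives carry no meaning on their own; that is the role of the semantic map layer discussed next.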

C. SEMANTIC MAP LAYER
The semantic map layer defines the significance of road features provided by the geometric map layer. The data in this layer provide a context as well as meaning to the features represented in the map. For example, the semantic map layer in an HD map contains information such as the type of road (e.g., highway, residential roads), and lanes (e.g., change possible, to left or right), their numbers, the direction of traffic, and whether a lane is for turning or for parking. It also includes information on speed limits, lane boundaries, intersections, crosswalks, traffic signs, traffic lights, parking spaces, bus stops and many other features that are important to build the contextual representation of the environment. The semantic map layer allows the autonomous vehicle to build a detailed situational representation of its environment and understand the traffic rules, and thus be able to make correct and safe decisions in different traffic scenarios. In simple terms, the semantic map layer assigns semantic labels to road features and objects defined in the geometric map. For example, a point in the geometric layer is nothing but an ordered set of coordinates in the map coordinate reference system. Only the semantic layer defines whether this point corresponds to a traffic light, yield or stop sign. HD maps are known to contain rich semantic information. The semantic layer also associates metadata to road features such as road curvature, recommended driving speed, and a unique identifier of each semantic feature. Indeed, semantically rich HD maps enable autonomous vehicles to better understand the driving situation, and therefore to make complex decisions in sophisticated scenarios. Nevertheless, building a reliable and high-fidelity semantic map of the road environment is not a straightforward process. Several processing steps are required, including but not limited to scene segmentation, object detection, classification, pose estimation and mapping. 
With recent advances in computer vision, deep learning, sensor fusion and semantic SLAM algorithms, building accurate semantic maps has become possible.
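The relationship between the geometric and semantic layers, in which the semantic layer attaches labels and metadata to geometric elements through a unique identifier, can be illustrated with a minimal sketch. All identifiers, attribute names and values below are hypothetical and are not drawn from any HD map standard.

```python
# Geometric layer: identifiers mapped to bare coordinates (no meaning attached).
geometric_layer = {
    "pt_17": (12.4, 88.1),
    "pt_18": (40.0, 91.5),
    "ln_3": [(0.0, 0.0), (50.0, 0.0)],
}

# Semantic layer: the same identifiers mapped to labels and metadata.
semantic_layer = {
    "pt_17": {"type": "traffic_light"},
    "pt_18": {"type": "stop_sign"},
    "ln_3": {"type": "lane_centerline", "speed_limit_kmh": 50, "road_class": "residential"},
}

def features_of_type(kind):
    """Return the identifiers of all features carrying the given semantic label."""
    return sorted(fid for fid, attrs in semantic_layer.items() if attrs["type"] == kind)

def describe(feature_id):
    """Join geometry and semantics of one feature through its identifier."""
    return {"geometry": geometric_layer[feature_id], **semantic_layer[feature_id]}

print(features_of_type("stop_sign"))
print(describe("ln_3"))
```

Keeping the two layers separate but linked by identifier means that, in this sketch, a changed speed limit touches only the semantic entry while the geometry stays untouched.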

D. ROAD CONNECTIVITY LAYER
The road connectivity layer describes the topology of the road network and how the various geometric elements are connected. Contrary to standard digital maps, which contain only road-level information and road-level connectivity, HD maps contain lane-level geometric and semantic information. Connectivity between roads thus becomes complex, as it defines the connection between two or more groups of lanes. More precisely, this layer provides the layout and connectivity of roads including lane borders and centre lines as well as intersections. Lane-level connectivity information is necessary to plan legal transitions between roads and lanes as well as manoeuvres that are permitted at each intersection, which is crucial for the path planning of autonomous vehicles. In simple terms, this layer defines how the primitives constituting the geometric layer are connected with each other. These connections are established by defining sequential pairs of geometric and semantic elements. Assigning a unique identifier to each geometric and semantic element makes it possible to represent this information using a graph data structure, where each element is represented by an edge and their connection as a node. The graph structure allows for fast querying and searching of the map and planning routes efficiently.
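The graph representation described above, with map elements as edges and their connections as nodes, can be sketched with a small routing example. The lane identifiers, node names and lengths are hypothetical, and Dijkstra's algorithm is used here only as one common choice for route search; the text does not prescribe a specific algorithm.

```python
import heapq
from collections import defaultdict

# Each lane element is a directed edge: lane_id -> (from_node, to_node, length in metres).
lanes = {
    "lane_a": ("n1", "n2", 120.0),
    "lane_b": ("n2", "n3", 80.0),
    "lane_c": ("n1", "n3", 250.0),
    "lane_d": ("n3", "n4", 60.0),
}

def shortest_route(lanes, start, goal):
    """Dijkstra search over the lane graph.

    Returns (total_length, [lane_id, ...]) or None if the goal is unreachable.
    """
    adjacency = defaultdict(list)
    for lane_id, (src, dst, length) in lanes.items():
        adjacency[src].append((dst, length, lane_id))

    queue = [(0.0, start, [])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, length, lane_id in adjacency[node]:
            if nxt not in visited:
                heapq.heappush(queue, (cost + length, nxt, path + [lane_id]))
    return None

route = shortest_route(lanes, "n1", "n4")
print(route)
```

Because the edges are directed, a transition that is not listed (for example driving lane_d backwards) is simply absent from the graph, which is how permitted manoeuvres can be encoded in such a structure.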

E. PRIORS MAP LAYER
This layer is also known as the experimental map layer since it represents and learns information from past experiences. It concerns the geometric and semantic elements in the map whose states change over time. Learning the status of traffic flow and accident zones from the data of a fleet of vehicles allows for more efficient and predictive driving behaviour. This layer also acquires and learns information that helps to predict the behaviour of human drivers and the dynamic states of traffic lights at intersections. It also accommodates temporal road settings, such as parking orders, their occupancy and time schedules. For example, roadside parking rules in some cities change on certain weekdays. The probability of occupancy and the timing rule that governs a given parking place can be derived, in the priors map layer, from the sensor readings of different fleet vehicles that drive by that place. Learning and predicting the driving behaviour of road agents could be challenging due to sociocultural differences between different societies. Modelling these behaviours from experience is crucial for universal and scalable AD systems.
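The parking example above can be made concrete with a minimal sketch that estimates the occupancy probability of a parking place per weekday and hour from fleet observations. The class name `OccupancyPrior`, the keying by weekday and hour, and the add-one (Laplace) smoothing are assumptions made for illustration; the reviewed works may use quite different models.

```python
from collections import defaultdict

class OccupancyPrior:
    """Frequency-based prior of parking occupancy learned from fleet observations."""

    def __init__(self):
        # (spot_id, weekday, hour) -> [occupied_count, total_count]
        self._counts = defaultdict(lambda: [0, 0])

    def observe(self, spot_id, weekday, hour, occupied):
        """Record one drive-by observation of a parking place."""
        counts = self._counts[(spot_id, weekday, hour)]
        counts[0] += int(occupied)
        counts[1] += 1

    def probability(self, spot_id, weekday, hour):
        """Smoothed occupancy estimate; 0.5 when the slot was never observed."""
        occupied, total = self._counts.get((spot_id, weekday, hour), (0, 0))
        return (occupied + 1) / (total + 2)

prior = OccupancyPrior()
# Hypothetical observations: four vehicles pass spot "p42" on Mondays (0) at 9 h.
for seen_occupied in (True, True, True, False):
    prior.observe("p42", weekday=0, hour=9, occupied=seen_occupied)

print(prior.probability("p42", 0, 9))   # informed by four observations
print(prior.probability("p42", 6, 9))   # never observed on Sundays
```

The smoothing keeps the estimate away from 0 or 1 when only a few vehicles have passed a place, which matters for sparsely observed streets.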

F. REAL-TIME MAP DATA
The real-time layer in an HD map is a dynamic layer that provides real-time information about the environment, such as traffic conditions, road closures, and other events that may impact the navigation of the autonomous vehicle. This layer is typically created by combining data from various sources, such as cameras, sensors, and other connected devices, which are mounted on the vehicle or located on the roadside. The data is collected in real-time and is used to update the HD map, either through crowdsourcing by participating vehicles or from intelligent infrastructures using specific communication networks. The real-time layer can include information such as the location and speed of other vehicles, the location and status of traffic signals, and the presence of construction areas or other obstacles and blockages on the road. This information is crucial for autonomous vehicles to make safe and efficient driving decisions in real-time to optimize traffic flow and reduce congestion. Furthermore, the real-time layer can be used to improve the accuracy and completeness of the HD map, by providing up-to-date information about the environment that may not be captured by the sensors used to create the map. In simple terms, the real-time layer in HD maps provides a dynamic, up-to-date representation of the environment. Live updates of an HD map for dynamic elements are challenging and require sophisticated intelligent communication infrastructure and cooperation between multiple actors. Data transmission between Intelligent Transportation Systems (ITS), HD map providers and vehicles must be reliable and meet certain requirements which are covered later in this survey.

IV. HD MAPS IN AD SYSTEM ARCHITECTURES
HD maps provide the AD system with a detailed and precise representation of the road environment [15]. These maps contain lane-level geometric, topological and semantic information necessary for safe and efficient navigation of autonomous vehicles [14]. Using HD maps in autonomous vehicles allows them to better understand their surroundings, plan their routes, and make more accurate driving decisions, thus ensuring the safety of passengers and other road users. The ultra-precise map data are now an integral part of most core components in AD systems [14]. This section discusses the importance and uses of HD maps in AD systems. It begins by briefly describing the architecture and standard components of a typical modern AD system and how it works. Figure 5 shows these standard components and highlights those relying on HD maps. The rest of this section provides an extensive review of the state-of-the-art on those AD components that rely on HD maps. A keyword graph constructed by analysing Google Scholar papers, using the names of AD components as search keywords, indicates a strong dependency of core AD components on HD maps.

A. AD SYSTEM ARCHITECTURE
Automated vehicles are complex cyber-physical systems in which different components have to work together to achieve the overall driving tasks in a robust, reliable and safe way. While there does not exist a unique architecture of AD systems [29], we rely in this work on a common and generic architecture that helps us understand how HD maps are used to improve the different functions of AD systems. Like any robotic system, an autonomous vehicle can be considered a cognitive agent with three main elements: (1) a sensing element, (2) a cognitive element and (3) an action element. Splitting up these elements into an industry-level AD system results in several components, as depicted in Figure 5. The sensing component in modern AD system architectures typically includes different sensors such as IMU, GPS, camera, LiDAR, and radar. A subset of these sensors allows the vehicle to know its position with respect to the environment, i.e., for localization, and the remaining sensors are used for perceiving the environment itself. Reading and preprocessing raw sensor data and making it available to the rest of the AD system is the role of the sensing component. In its simplest form, the sensing component is composed of a set of sensor drivers that read raw sensor data in real time. The localization component is one of the most critical for the whole AD system to function reliably. Its role is to precisely estimate the position of the vehicle [9]. Errors in localization propagate to the rest of the AD processing pipeline. Localization is essentially a state estimator that fuses raw sensor data from the sensing component. Additionally, the availability of a map makes it possible to improve localization and make it more robust, especially in zones in which some of the sensors fail or have degraded performance [10]. In Section IV-B, we review map-based localization techniques.
Further, we discuss how the base, geometric and semantic data from an HD map are matched with raw sensor data, mainly how LiDAR point clouds and camera images are used to better estimate vehicle pose. The role of the perception component is to generate an intermediate-level representation of the current state of the environment, including information about obstacles and road agents [80], [81]. This representation also includes details about lanes (their position, borders, markings, and types), traffic signs, traffic lights, and drivable areas. Computer vision and deep learning techniques are extensively used for segmentation, clustering and classification tasks. Furthermore, object-level fusion is also an essential part of this component. The output of the perception is a list of tracked objects as well as a semantic segmentation of the image used for scene understanding. The geometric and semantic information from an HD map can also be used to improve object detection and fusion. Accurate perception is crucial for safety, as perception errors can affect the quality of information used throughout the AD system. Therefore, using redundant sources of sensor data can enhance confidence in the accuracy of perception, thereby improving overall system robustness. We discuss later in this section, how the information provided by an HD map can contribute to the confidence, accuracy and overall robustness of the perception component. The scene understanding component serves as a bridge between the abstract mid-level state representation of the environment given by the perception component and the high-level cognitive components in the AD system [82]. This component aims to provide a higher-level contextual understanding of the driving scene by building upon the data provided by both the HD map and the perception component [81]. 
Later in this section, we discuss how these two sources of information are fused to build a scene representation for understanding the driving context. Another component in the AD pipeline that relies on HD maps is the motion prediction component. It builds on the high-level spatio-temporal representation of the environment provided by scene understanding to predict the behaviour of road agents surrounding the vehicle [33]. The role of HD maps in motion prediction is to provide prior trajectories of each road agent in the scene. Motion prediction is a highly multi-modal problem in which HD maps play a key role, as discussed in detail in this section. The motion planning component aims to calculate a feasible, collision-free and safe trajectory of the autonomous vehicle [12]. This is achieved by optimizing a global shortest path obtained by a routing algorithm running on HD map data as well as the predicted trajectories of road agents. Motion planning also includes a behavioural planning function that relies on the state of the current scenario defined by detected objects and the HD map. The control component receives a planned trajectory and computes control commands for the steering, brake and acceleration actuation systems [12]. The control component does not explicitly rely on map data; thus it is not considered in this survey. Finally, a special component serves all other components by handling requests to provide map data, as shown in Figure 5. HD map data are often stored in databases that are queried by map servers (local or cloud) in response to routing, tiling and update requests from a map client in the vehicle. As the routing element necessitates special algorithmic treatment, it is considered in our survey of applications of HD maps in AD systems.

B. LOCALISATION
The localisation component in AD systems aims to estimate the position and orientation of the vehicle with respect to a global reference coordinate system. Its critical role is to continuously maintain the high accuracy and robustness of the estimate needed by the subsequent components in the system [112]. The precision of localization algorithms determines the reliability of the entire AD system. The robustness of localization under inclement weather conditions is a key requirement of modern AD systems, as degraded estimation performance may lead to severe consequences and potential damage. The significant research efforts on localization during the last two decades have yielded remarkable performance, and at the same time have led to a wide range of assorted approaches. In order to guarantee normal operating conditions and achieve global system safety, an autonomous vehicle is required to be localized within 10 cm precision [84], [113]. The vast literature on localization techniques for autonomous vehicles can be divided into two main categories. The first category assumes prior knowledge about the vehicle environment, i.e., a map, hence is referred to as map-based localization [10], [83], [95], [106], [114]. The other category assumes no prior knowledge about the environment and aims at building this knowledge while simultaneously estimating the vehicle location, e.g., SLAM-based approaches [34], [35]. This section categorizes and analyses the localization techniques based on the different data provided by HD maps. SLAM-based approaches have a map-building element; thus they are discussed among the algorithms used to construct HD maps in Section V. As a rich and precise representation of the environment, HD maps are considered one of the most suitable prior maps for localization [115]. We review localization techniques by analysing and discussing "which" and "how" map data are used to localize autonomous vehicles with respect to their environment.
Localization in its abstract form is a pose estimation problem that basically amounts to the fusion of onboard observations from different sensing modalities with map data. How sensors are fused with map data can be categorised into three main approaches. The first approach tries to associate map features with onboard observations of perception sensors. This association basically amounts to solving a geometric problem, with the solution being the position of the autonomous vehicle. The solution is usually obtained by solving an optimization problem that optimizes the poses from pairwise relative observations and map elements, e.g., using pose graph optimization (PGO) [104] or iterative closest point (ICP) [106]. This is referred to as the geometric approach, and in some other works as the map matching approach [90], [97]. The second approach handles the problem of map-based localization using probabilistic techniques, in the sense that probability distributions over observations and map data are combined to obtain an accurate belief of the vehicle pose. Reference [86] matched curbs and lane markings with features detected by a camera. The resulting residual error from geometric matching is sent as an observation to a Kalman filter (KF) for vehicle pose estimation; the KF is thus used for smoothing rather than for probabilistic localization. GPS was used for initialization only.
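To illustrate the geometric approach, the sketch below estimates a 2D vehicle pose from landmarks observed in the vehicle frame and their already-associated counterparts in the map frame, using the closed-form least-squares rigid alignment. This corresponds to a single alignment step of the kind used inside ICP, under the simplifying assumption that correspondences are known; it is not the method of any specific cited work, and the function name `estimate_pose_2d` is hypothetical.

```python
import math


def estimate_pose_2d(observed, mapped):
    """Least-squares rigid alignment of matched 2D points.

    observed: landmark positions in the vehicle frame.
    mapped:   positions of the same landmarks in the map frame.
    Returns (x, y, yaw) such that mapped ~= R(yaw) * observed + (x, y),
    i.e. the vehicle pose expressed in the map frame.
    """
    n = len(observed)
    ox = sum(p[0] for p in observed) / n
    oy = sum(p[1] for p in observed) / n
    mx = sum(q[0] for q in mapped) / n
    my = sum(q[1] for q in mapped) / n

    cross = 0.0
    dot = 0.0
    for (px, py), (qx, qy) in zip(observed, mapped):
        ax, ay = px - ox, py - oy      # centred observation
        bx, by = qx - mx, qy - my      # centred map landmark
        cross += ax * by - ay * bx
        dot += ax * bx + ay * by

    yaw = math.atan2(cross, dot)
    c, s = math.cos(yaw), math.sin(yaw)
    x = mx - (c * ox - s * oy)
    y = my - (s * ox + c * oy)
    return x, y, yaw


# Synthetic check: generate map landmarks from a known pose.
true_x, true_y, true_yaw = 10.0, -5.0, 0.3
observed = [(1.0, 0.0), (0.0, 2.0), (-3.0, 1.0)]
c, s = math.cos(true_yaw), math.sin(true_yaw)
mapped = [(c * px - s * py + true_x, s * px + c * py + true_y)
          for px, py in observed]
est_x, est_y, est_yaw = estimate_pose_2d(observed, mapped)
```

In practice the observations are noisy and correspondences uncertain, which is why such a step is typically iterated with re-association (as in ICP) or embedded in a filter.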

C. PERCEPTION
The perception component in an AD system is often linked with processing raw camera images and LiDAR point clouds for the detection and tracking not only of static objects (e.g., traffic signs and road markings), but also of dynamic agents, e.g., surrounding vehicles, pedestrians, and cyclists [30], [80], [81]. Perception is one of the critical core functions of an AD system. Ensuring its reliability and real-time performance is crucial to ensure collision-free navigation [116], [117]. Fusing perception data with the detailed and precise geometric and semantic information included in the various layers of HD maps can improve perception by focusing on the most relevant Regions of Interest (ROI) [81], [117]. More precisely, the geometry of an HD map makes it possible to define an ROI to filter point clouds, leaving only the points of particular interest to the perception function, thus simplifying object detectors and improving their computational efficiency [119]. For instance, removing points corresponding to buildings avoids unnecessary computations in object detection. Reliable object detection for AD systems remains an open challenge, mainly in occluded zones and beyond the reach of onboard sensors [80], [120]. Although the primary use of HD maps is to improve vehicle localization, they can still provide useful information to boost the performance and confidence of dynamic object detection [19]. Recently there has been an interest in using HD maps to improve the perception of autonomous vehicles [117]. Fadadu et al. [121] used a local rasterized image of an HD map as an input to a deep learning architecture in parallel with the raw camera image and LiDAR point cloud for map-aware object detection. Yang et al. [118] and Carrillo and Waslander [122] developed deep learning frameworks for 3D object detection that leverage HD maps to improve the performance and robustness of state-of-the-art 3D object detectors.
On the other hand, detecting static objects, e.g., traffic signs and road markings, is of interest for building the geometry and semantic features of an HD map. The precision of these detections contributes to the overall quality of the HD map, and consequently to the AD functions. For instance, errors in the positions of detected traffic signs and road landmarks make it difficult to match them against their counterparts in the HD map under strict matching thresholds. Map geometry and semantics allow defining scene representation models that facilitate recognising the most relevant obstacles to decision-making while ignoring those without impact on the current situation [22]. Matching detected objects against an HD map makes it possible to identify relevant objects for decision-making. Perception at this higher level is referred to as situation understanding.
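The ROI filtering described earlier in this subsection can be sketched as follows: a drivable-area polygon, assumed here to come from the HD map geometry and to be expressed in the same frame as the point cloud, is used to discard LiDAR points that fall outside it. The function names and the rectangular road polygon are hypothetical; the point-in-polygon test is the standard ray-casting method.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the 2D polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal ray starting at (x, y)?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def filter_points_by_roi(points, roi_polygon):
    """Keep only (x, y, z) points whose ground projection is in the ROI."""
    return [p for p in points if point_in_polygon(p[0], p[1], roi_polygon)]


# Hypothetical 50 m x 8 m road segment taken from the map geometry.
road_roi = [(0.0, -4.0), (50.0, -4.0), (50.0, 4.0), (0.0, 4.0)]
cloud = [
    (10.0, 0.0, 0.2),    # on the road
    (20.0, 10.0, 3.0),   # building facade beside the road
    (60.0, 0.0, 0.1),    # beyond the mapped segment
    (5.0, -3.5, 0.0),    # on the road, near the border
]
roi_points = filter_points_by_roi(cloud, road_roi)
```

Only the retained points would then be passed to the object detector, which is where the computational saving mentioned above comes from.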

D. SCENE UNDERSTANDING
Understanding the driving context is imperative to make correct and safe decisions by autonomous vehicles. One of the early motivations of HD maps is to provide autonomous vehicles with precise and detailed information to help understand their environment. This information enables the AD system to understand the current driving situation and interpret all entities constituting the scene. Geometry and semantics contained in the map make it possible to build compact data models and representations of the environment systematically, thus enabling the vehicle to deal with complex driving scenarios [123]. More precisely, the scene understanding component in an AD system, supported by an HD map's geometric and semantic information, could consistently provide a meaningful context of perception [82]. Beyond the raw object detection, scene understanding aims at extracting and estimating safety critical information and making it available to subsequent processing stages [11]. As discussed earlier in this section, raw perception mostly deals with object detection, tracking, and fusion, without considering the context of the object. Projecting raw perception objects onto the map layers allows building a comprehensive layout of the driving scene. This layout sometimes is referred to as the world model [22]. The main benefit of having such a layout is that it enables matching a perception object with the semantic features in the map, thus obtaining a more enriched perception, e.g., a pedestrian on a crossing and a car in the same lane [22]. Encoding static and dynamic information of the environment in a unified world model facilitates the subsequent AD tasks, mainly motion prediction and planning. Furthermore, HD maps facilitate the estimation of the drivable area taking into consideration the adjacent driving lanes. 
In summary, HD maps provide information to facilitate scene modelling and understanding, e.g., by providing complementary information on sidewalks, pedestrian crossings and drivable paths [99]. They further include information about local traffic regulations, including speed limits and priority rules [5]. The higher the precision of the map geometry and the richer its semantics, the better the AD system will be at interpreting and interacting with complex scenarios. Scene representation with unreliable perception and outdated HD maps may potentially lead to misinterpretations of the context.
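The enrichment of perception objects with map semantics (e.g., a pedestrian on a crossing, a car in a given lane) can be sketched as a simple spatial association. The following fragment is an illustrative assumption rather than a world-model implementation from the literature: map features are simplified to axis-aligned rectangles, and the names `MapArea`, `PerceivedObject` and `build_world_model` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MapArea:
    """Semantic map feature, simplified to an axis-aligned rectangle."""
    feature_id: str
    kind: str          # e.g. "lane" or "crosswalk"
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        return (self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max)


@dataclass
class PerceivedObject:
    label: str
    x: float
    y: float


def build_world_model(objects, areas):
    """Attach to each object the map features that contain its position."""
    model = []
    for obj in objects:
        context = [(a.kind, a.feature_id)
                   for a in areas if a.contains(obj.x, obj.y)]
        model.append((obj, context))
    return model


areas = [
    MapArea("lane_12", "lane", 0.0, -1.75, 100.0, 1.75),
    MapArea("cw_3", "crosswalk", 40.0, -5.0, 44.0, 5.0),
]
objects = [
    PerceivedObject("pedestrian", 42.0, 0.5),
    PerceivedObject("car", 20.0, 0.0),
    PerceivedObject("cyclist", 10.0, 6.0),
]
world_model = build_world_model(objects, areas)
```

Here the pedestrian is associated with both the lane and the crosswalk, the car with the lane only, and the cyclist with no mapped feature, which is the kind of context that downstream prediction and planning can exploit.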

E. ROUTING
Road-level digital maps assist human drivers in navigating. The route calculations in these maps cannot go beyond using road-level connectivity, since these maps do not include lane-level details. Accurate and optimal driving routes are necessary for time and energy saving, as well as contributing to global vehicle safety. Efficient and low-cost drive route calculation must consider a lane-level model of the environment [20]. Furthermore, in a highly dynamic environment, details about the status of the traffic and lane occupancy are essential to adapt the route dynamically as the autonomous vehicle navigates in the environment [124]. Considering the detailed and accurate lane-level information of the HD map static layers together with the priors and real-time layers, an efficient dynamic route calculation is possible [125]. For a routing subsystem in an autonomous vehicle to be able to calculate a drivable path from the current position to a set destination, a fresh and up-to-date map must be made available to the system from the HD map server, as depicted in Figure 5. Alternatively, as in digital maps, route calculations could be offered as a service. Upon sending its accurate position to the HD map server, an optimal route could be calculated and fed back to the vehicle for supporting the other core components of the system. Over the last few years, these routing services have come to take into account real-time traffic conditions and energy factors (e.g., the most energy-efficient route). For autonomous vehicles, additional factors can be taken into account, such as routes avoiding complex urban environments that are difficult to navigate for ADS, or routes with good network coverage to guarantee continuous connectivity for online services, including the real-time HD map service [126].
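A lane-level dynamic route calculation of the kind described above can be sketched with Dijkstra's algorithm, where the static cost of a lane is its traversal time and a real-time delay is added per lane. This is a generic illustration, not the algorithm of [125]; the data layout (a dictionary mapping each lane to its length, speed limit and successors) and all lane identifiers are assumptions.

```python
import heapq


def plan_route(lanes, start, goal, delay=None):
    """Dijkstra over a lane-level graph.

    lanes: lane_id -> (length_m, speed_limit_mps, [successor lane ids])
    delay: optional lane_id -> extra travel time in seconds, standing in
           for information from the real-time layer.
    Returns (total_time_s, [lane ids]) or None if the goal is unreachable.
    """
    delay = delay or {}

    def lane_cost(lane_id):
        length, speed, _ = lanes[lane_id]
        return length / speed + delay.get(lane_id, 0.0)

    best = {start: lane_cost(start)}
    heap = [(best[start], start, [start])]
    while heap:
        cost, lane, path = heapq.heappop(heap)
        if lane == goal:
            return cost, path
        if cost > best.get(lane, float("inf")):
            continue  # stale queue entry
        for nxt in lanes[lane][2]:
            new_cost = cost + lane_cost(nxt)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(heap, (new_cost, nxt, path + [nxt]))
    return None


lanes = {
    "A": (100.0, 10.0, ["B", "C"]),
    "B": (200.0, 10.0, ["D"]),
    "C": (300.0, 10.0, ["D"]),
    "D": (100.0, 10.0, []),
}
static_route = plan_route(lanes, "A", "D")
# A reported congestion on lane B adds 30 s and changes the best route.
dynamic_route = plan_route(lanes, "A", "D", delay={"B": 30.0})
```

The same structure accommodates other cost terms mentioned in the text, such as energy use or network coverage, by changing the per-lane cost function.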

F. MOTION PLANNING
The role of motion planning in an AD system is to generate feasible, safe, collision-free and energy-efficient trajectories. The motion planning task typically incorporates trajectory generation and behaviour planning [12]. Behaviour planning is a high-level decision-making function that decides transitions between the different driving states, e.g., lane change, in-lane vehicle following, decelerating to stop, etc. To make these transitions safely, a local map tile and vehicle perception are needed by the behaviour planner to build a transition model of the vehicle environment. Unlike navigation in mobile robots, the road environment is highly structured [127] and all road users have to respect traffic rules. Generated trajectories for AD are strictly required to ensure that traffic rules are respected and motion is within drivable road areas. Different approaches to motion planning for autonomous vehicles exist, and they all rely in some way on the geometric and semantic information provided by HD maps to respect traffic rules [21], [128], [129], [130]. In sample-based motion planning approaches, the lane geometry of the HD map is used to limit the search space by rejecting candidate trajectories that are not feasible [131]. In optimization-based motion planning, map geometry is used to define a set of constraints to confine the solution to a feasible road region [128], [130]. Recently, there has been an increasing interest in end-to-end frameworks in AD. A single deep-learning architecture could replace all components of a sophisticated motion planner while still generating safe and collision-free trajectories in real-world driving scenarios. An example is the end-to-end neural motion planner developed by Zeng et al. [132]. The proposed deep learning pipeline is composed of two stages. The first takes as input point clouds from LiDAR as well as a local map and outputs an intermediate map-aware representation of 3D perception.
The second stage samples and optimizes over this representation all physically possible trajectories. The trajectory of minimum learned cost is chosen as the system output.
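The use of lane geometry to prune the search space in sample-based planning can be sketched as follows. Candidate trajectories are expressed as (station, lateral offset) points relative to the lane centre line, those that leave the lane are rejected, and the cheapest remaining candidate is selected. The linear lateral profile and the hand-written quadratic cost are stand-ins chosen for brevity; they are not the learned cost of [132] nor the sampler of [131], and all function names and numeric values are hypothetical.

```python
def sample_candidates(target_offsets, horizon_m=30.0, steps=6):
    """One candidate per target offset, with a linear lateral ramp."""
    candidates = {}
    for target in target_offsets:
        candidates[target] = [
            (horizon_m * k / steps, target * k / steps)
            for k in range(steps + 1)
        ]
    return candidates


def is_feasible(trajectory, half_lane_width, half_vehicle_width):
    """Reject trajectories whose footprint would cross a lane border."""
    limit = half_lane_width - half_vehicle_width
    return all(abs(d) <= limit for _, d in trajectory)


def cost(trajectory, preferred_offset):
    """Stand-in cost: squared deviation from a preferred lateral offset."""
    return sum((d - preferred_offset) ** 2 for _, d in trajectory)


candidates = sample_candidates([-2.0, -0.5, 0.0, 0.5, 2.0])
# Lane half-width (from the map) 1.75 m, vehicle half-width 0.9 m.
feasible = {t: traj for t, traj in candidates.items()
            if is_feasible(traj, 1.75, 0.9)}
best_target = min(feasible,
                  key=lambda t: cost(feasible[t], preferred_offset=1.5))
```

Although the cost alone would favour the candidate ending at 2.0 m, the map constraint removes it, so the planner settles on the closest feasible alternative.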

G. MOTION PREDICTION
The driving environment is highly dynamic and involves different road participants, such as pedestrians, vehicles and cyclists. Predicting future motions and behaviours of these road participants is imperative for autonomous vehicles to build a context-aware representation of their interactive environment, thus anticipating potentially dangerous situations [24], [32], [33]. From an abstract point of view, these traffic participants can be considered as a complex multi-agent system. Indeed, the development of reliable solutions to motion and behaviour prediction of road agents will enhance the safety of autonomous vehicles and their capability to adopt human-like behaviour in real-world traffic conditions. The authors in [150] review tracking, prediction and decision-making. Predicting the behaviour of these agents is crucial for AD systems [75], [151], mainly for risk assessment [24], [29], and safe and comfortable motion planning [12], [130]. Motion prediction refers to estimating the future behaviour of road agents given their current states and a model of the environment in which they navigate. The problem of predicting the future motions of road participants has been addressed by various research works. An overview of motion prediction can be found in [152]. A survey of early methods of motion prediction of intelligent vehicles has been conducted in [24]. Early methods of predicting the intention of road participants are based on modelling the motion of the agent. One way to predict the intention of a road participant is to model its motion using kinematic and dynamic models. The state evolution of these models allows us to know the future state or the trajectory of the agent [153]. This approach does not use information from the surrounding environment. As a result, it fails at long-term predictions.
One limitation of common motion prediction approaches lies in their inability to perform long-term predictions, owing to factors such as model simplicity and the limited availability of measurements and context. This issue can be handled by using the data available from HD maps, where lane information is available. Using an HD map makes it possible to associate each actor with one or more lanes as given by the geometric layer of the HD map. All possible trajectories of an actor can then be generated based on the lane connectivity and the current state of the vehicle. While this method, in contrast to previous works, performs well for long-term predictions, its predictions in common driving scenarios are prone to errors in the map and in the vehicle position. Moreover, it cannot predict unusual behaviour of an actor. State-of-the-art methods for motion prediction using HD maps can be classified into two main approaches, as depicted in Table 2. The first approach uses a raster of an HD map as an input of the motion prediction architecture [145]. This raster is often formed by projecting the geometric and semantic elements of the map into a common plane, e.g., to be aligned with other sensing modalities such as images and point clouds from sensors. Although this approach allows leveraging powerful methods from CNNs, one of its limitations is the difficulty of modelling certain spatio-temporal features, which are essential for motion prediction. On the other hand, the second approach directly uses map elements in their vector format, which facilitates the modelling of agents and other dynamic features in the map [146], [147], [154].
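The lane-based generation of candidate futures described above can be sketched as an enumeration of lane sequences reachable within a prediction horizon. The sketch assumes, for simplicity, that the actor is at the start of its current lane and that each lane is described only by its length and successors; the function name `lane_hypotheses` and the intersection layout are hypothetical.

```python
def lane_hypotheses(lanes, current_lane, horizon_m):
    """Enumerate lane sequences an actor could follow within a horizon.

    lanes: lane_id -> (length_m, [successor lane ids])
    Returns a list of lane-id sequences starting at current_lane.
    """
    results = []

    def expand(path, remaining):
        length, successors = lanes[path[-1]]
        remaining -= length
        # Do not revisit lanes already on the path (avoids cycles).
        next_lanes = [n for n in successors if n not in path]
        if remaining <= 0 or not next_lanes:
            results.append(path)
            return
        for nxt in next_lanes:
            expand(path + [nxt], remaining)

    expand([current_lane], horizon_m)
    return results


# Hypothetical intersection: the incoming lane allows going straight
# or turning left.
lanes = {
    "in": (30.0, ["straight", "left"]),
    "straight": (40.0, ["out_1"]),
    "left": (25.0, ["out_2"]),
    "out_1": (100.0, []),
    "out_2": (100.0, []),
}
hypotheses = lane_hypotheses(lanes, "in", horizon_m=60.0)
```

Each returned sequence corresponds to one mode of the multi-modal prediction problem; a full predictor would additionally assign a probability and a time-parameterised trajectory to each mode.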

H. THIRD-PARTY APPLICATIONS
HD maps can provide accurate and reliable ground truth data that can be used as a reference for calibrating sensor outputs [155]. For example, LiDAR can be calibrated and perfectly aligned with an IMU using the highly-precise coordinates of geometric elements of an HD map. By comparing the sensor measurements with the HD map data, any errors or discrepancies can be identified and corrected, leading to improved calibration of the sensors. Furthermore, HD maps can be used for online (self) calibration. The availability of an HD map, raw sensor data in real-time, and algorithms to perform comparison makes it possible to compute the error between sensor measurements and the ground truth. Thus it allows for continuous correction of calibration errors of the sensors in real-time. This enables the AD system to be more robust and reliable to changing environmental conditions as well as sensor performance variations. Online calibration can result in more accurate and robust sensor calibration compared to offline calibration methods. More recently, HD maps can also be used to boost road annotations for creating large datasets for traffic landmark detection [156].

V. BUILDING HD MAPS

A. MOBILE MAPPING SYSTEMS
Building HD maps is a sophisticated procedure in which several steps are involved. The first step is to send specialized vehicles equipped with a suite of high-precision and well-calibrated sensors to survey and collect data about the environment. Data collection vehicles for mapping are likely to be equipped with a high-precision GNSS receiver that is connected to, or implements, correction services such as RTK (Real-Time Kinematic), with positioning accuracy down to a few centimetres. GNSS positioning measurements are often fused with the measurements of a high-performance IMU (Inertial Measurement Unit) and wheel odometry. Several commercial products exist that combine both GNSS and IMU in one unit as an INS (Inertial Navigation System). Mapping vehicles are also equipped with one or more high-resolution LiDARs and cameras to collect raw 3D/2D data of the road environment. There are two ways to set up a data collection vehicle for mapping. The first is to buy the above-mentioned sensors, choose a suitable configuration and mount them on a vehicle. Although this approach offers the flexibility to define the sensor configuration beforehand, calibrating several different sensors to the precision required for mapping is non-trivial and time-consuming, especially with cameras [157]. Alternatively, several manufacturers provide the whole suite of sensors in one package, referred to as a mobile mapping system (MMS) [28], [37]. Examples of commercially available MMS are shown in Figure 6. More information about commercially available MMS, their specifications and applications can be found in [158]. Further details about MMS, including the technology and the sensors used, can be found in [28], [37], [159], [160].
Although MMSs are easy to install and calibrate, they offer less flexibility in defining the sensor configuration, e.g., where each sensor is positioned and oriented with respect to the body of the vehicle. An MMS generates highly detailed and precise geo-referenced 3D point clouds that need to be stitched together to create a 3D representation of the environment.

VI. MAINTENANCE OF HD MAPS
Having an up-to-date HD map is crucial for the various AD core components to function correctly. Errors in HD maps could lead to severe damage due to inappropriate decisions taken by the system. Erroneous decisions could be avoided through frequent updates by the mapping vehicles. The road environment is highly dynamic and is likely to undergo frequent changes due to new infrastructure constructions, road maintenance, and lane extensions. Mapping vehicles must be able to detect changes in the environment and send them to update the map. The map update procedure involves complex processing steps, including handling data from multiple sources and sensors at different scales, identifying the deviation between the stored map and the newly collected data from the environment, and finally integrating these deviations to update the different layers of the map. Several methods and approaches have been developed in the literature to capture HD map changes and update the maps accordingly [161]. In the following, we review the different approaches and methods to detect changes in HD maps and how this information is applied to update the maps. Our review of previous works on maintaining HD maps is organised according to which layer is maintained by each state-of-the-art method, as summarized in Table 3.

A. MAP CHANGE DETECTION
Change detection in HD maps refers to the process of identifying changes in the environment, such as new constructions, road closures, etc. This is followed by updating the layers of the map accordingly. HD maps undergo changes regularly and having a map that can be trusted by autonomous vehicles is crucial to guarantee navigation safety [18], [167]. Change detection is typically achieved through the use of various sensors, such as cameras, LiDAR, and radar, combined with computer vision algorithms and machine learning techniques. Change detection algorithms have found their way to many applications, even before HD maps. Remote sensing is one of the early applications of change detection and update of maps [177]. It has also been applied successfully to urban monitoring, forest changes, crisis monitoring, 3D geographic information updating, construction progress monitoring, and resource surveying [120]. At the most basic level of these applications, the problem amounts to comparing raw sensor data, mostly 3D point clouds [178], 2D images [179] or both [180]. In 3D point clouds, change detection can be divided into three main categories, namely point-based, object-based and voxel-based change detections.
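Of the three categories named above, voxel-based change detection is the simplest to illustrate. The sketch below, a generic illustration rather than a method from a cited work, discretises the stored map point cloud and a newly acquired one into occupied voxels and reports the voxels that appeared or disappeared. It assumes both clouds are already registered in the same frame; the function names and the voxel size are hypothetical.

```python
import math


def voxelize(points, voxel_size):
    """Return the set of integer voxel indices occupied by the points."""
    return {
        tuple(math.floor(coord / voxel_size) for coord in point)
        for point in points
    }


def detect_changes(map_points, new_points, voxel_size=0.5):
    """Compare occupied voxels of the stored map and of a new scan."""
    old_voxels = voxelize(map_points, voxel_size)
    new_voxels = voxelize(new_points, voxel_size)
    return {
        "appeared": new_voxels - old_voxels,
        "disappeared": old_voxels - new_voxels,
    }


map_points = [(0.1, 0.1, 0.1), (1.2, 0.1, 0.1)]
new_points = [(0.2, 0.3, 0.1), (3.1, 0.0, 0.0)]
changes = detect_changes(map_points, new_points)
```

A practical system would additionally have to account for occlusion and sensor noise, since a voxel that is merely unobserved in the new scan is not evidence that the structure has been removed.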
Although off-the-shelf methods from remote sensing could still be adapted to change detection in HD maps, their applicability is limited to detecting changes in the base layer, which is typically represented as a point cloud map. An HD map, however, is a complex layered architecture with geometric, semantic and topological information, for which change detection is challenging. In this context, there are two methodologies to update an HD map. The first is to update the base map layer only and then use it to regenerate the geometric, semantic and road connectivity layers. The second is to directly detect changes and update each layer individually, avoiding unnecessary computations to regenerate the other layers in the map. In the following, we briefly review the recent layer-specific change detection works.
Regardless of the methodology used, previous works in change detection can be categorized as either probabilistic, geometric or deep learning approaches. Kim et al. proposed a probabilistic change detection algorithm based on probability and evidence theories to update the base layer from crowdsourced LiDAR data [78]. For change detection in the geometric layer, Pannen et al. [162] proposed a method that uses the probability distribution of a particle filter (PF) to define various metrics that quantify changes between detected lane markings and boundaries and their counterparts in the map. These metrics are then evaluated using weak and AdaBoost classifiers to quantify geometric map changes through thresholding. Although the approach in [162] has shown promising results in detecting changes in lane markings and road edges, one limitation of this approach lies in the detection of minor changes in road geometry, mainly due to sensor sparsity and noise. Another probabilistic approach is the work of Welte et al. [163], in which a Kalman smoothing technique has been used to detect positional errors in semantic features in HD maps. The method has been applied to detect inaccurate positions of road signs. As an alternative to Kalman or particle filter-based techniques, Jo et al. [170] have used Dempster-Shafer's theory of beliefs to infer the existence of map features. Klejnowski et al. [171] have used a similar approach for change detection of traffic signs. While the vast majority of change detection techniques in the literature have focused on change detection of base, geometric, and semantic layers, only a few works have detected changes in road connectivity. Yang et al. [169] have used fuzzy logic to match GPS traces with the lane-level road network. The fuzzy membership degree between GPS data and lane segments is used to quantify matching and consequently change detection.
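To illustrate how belief theory can be used to infer the existence of a map feature, the sketch below applies Dempster's rule of combination to two independent observations over the frame {exists, absent}, with the remaining mass assigned to "unknown". This is the textbook rule on a two-hypothesis frame, not the specific formulation of [170] or [171]; the function name and the mass values are hypothetical.

```python
def combine(m1, m2):
    """Dempster's rule of combination on the frame {exists, absent}.

    Each mass function is a tuple (exists, absent, unknown) summing to 1,
    where 'unknown' is the mass on the whole frame.
    """
    e1, a1, u1 = m1
    e2, a2, u2 = m2
    conflict = e1 * a2 + a1 * e2
    if conflict >= 1.0:
        raise ValueError("total conflict: the sources cannot be combined")
    norm = 1.0 - conflict
    exists = (e1 * e2 + e1 * u2 + u1 * e2) / norm
    absent = (a1 * a2 + a1 * u2 + u1 * a2) / norm
    unknown = (u1 * u2) / norm
    return exists, absent, unknown


# Two vehicles report the same traffic sign with moderate confidence.
report_1 = (0.6, 0.1, 0.3)
report_2 = (0.7, 0.1, 0.2)
fused = combine(report_1, report_2)
```

Combining the two reports raises the belief that the feature exists above either individual report, while the mass left on "unknown" shrinks; a map update could be triggered once the fused belief crosses a chosen threshold.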
In a framework of HD map verification against certain geometric errors, Pauls et al. [166] proposed a geometric approach to detect changes in road markings by grouping road features via spatial-semantic clustering. Ordering these groups and projecting them into 1D space yields a 1D signal that quantifies changes in road markings. An improved version has been presented by the authors in [167], where boosted classification trees are used to ensure the consistency of each feature group as a robust alternative to the maximum margin classifier used in [166]. Both [166] and [167] have been evaluated on the road evaluation dataset the authors presented in [18]. Another geometric framework for change detection has been presented in [168]. The main idea in this framework is to vectorize road features from a semantically segmented point cloud via Euclidean clustering. Then, these features are geometrically matched with their counterparts in the HD map. Other works have focused on change detection of very specific features, such as the width of the leftmost lane [172]. While most of the discussed approaches to change detection of geometric features have focused on road and lane markings, Zinoune et al. [164] developed an approach based on graphical pattern recognition using a Bayesian classifier to detect missing roundabouts in a map.
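The geometric matching step described above can be sketched as follows. This is a simplified illustration rather than the method of [168]: each feature is reduced to a 2D centroid, matching is a greedy nearest-neighbour assignment, and the gating distance `gate` and tolerance `tol` are assumed values.

```python
import math


def detect_changes(map_pts, detected_pts, gate=2.0, tol=0.3):
    """Greedy nearest-neighbour matching between map and detected features.

    gate: maximum distance (m) for two features to be treated as the same.
    tol:  offset (m) above which a matched feature is reported as moved.
    Both thresholds are illustrative assumptions.
    """
    # All candidate pairs, closest first.
    pairs = sorted(
        (math.dist(m, d), i, j)
        for i, m in enumerate(map_pts)
        for j, d in enumerate(detected_pts)
    )
    used_map, used_det, moved = set(), set(), []
    for dist, i, j in pairs:
        if dist > gate:
            break  # remaining pairs are even farther apart
        if i in used_map or j in used_det:
            continue
        used_map.add(i)
        used_det.add(j)
        if dist > tol:
            moved.append((i, j))
    removed = [i for i in range(len(map_pts)) if i not in used_map]
    added = [j for j in range(len(detected_pts)) if j not in used_det]
    return {"moved": moved, "removed": removed, "added": added}


# Hypothetical feature centroids in a local metric frame.
map_features = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
detections = [(0.1, 0.0), (10.8, 0.0), (35.0, 0.0)]
changes = detect_changes(map_features, detections)
```

Here the first feature is confirmed, the second is reported as moved, the third map feature is reported as removed, and the third detection is reported as a new feature.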
Recently, there has been interest in using deep learning to detect changes in the different features of HD maps. Plachetka et al. [173] developed a deep neural network (DNN)-based pipeline to detect deviations of certain geometric and semantic features in HD maps using LiDAR point clouds. The main concept in their approach is to encode both map elements and the LiDAR point cloud into a common feature map; both are then fed into a DNN architecture to verify, falsify, or detect missing map elements. The proposed approach has been successfully applied to detect changes in traffic signs, traffic lights, and pole-like objects and validated on the 3DHD CityScenes dataset [181]. As an alternative to LiDAR point clouds, Heo et al. [174] proposed an encoder-decoder deep learning framework that takes as inputs RGB camera images and an HD map raster formed by projecting the map geometric elements onto the same image plane, i.e., using the camera intrinsic and extrinsic calibration parameters. The two modalities are then fed into an adversarial learning block in order to reduce the discrepancy between them. This is followed by a deep metric learning block to measure the similarity between the outputs of the adversarial learning block. The output of this framework is a pixel-wise probability of change. This framework has been successfully applied to detect changes in lane geometry as well as lane markings. However, the algorithm fails to recognize partially visible objects. He et al. [175] presented Diff-Net as a feature-based change detection framework that leverages deep learning object-detection algorithms. The main idea in Diff-Net lies in inferring map changes through parallel feature difference calculation. Similar to [174], the change detection framework in Diff-Net receives an RGB image as well as a rasterized image formed by projecting HD map features onto the same image plane.
The applicability of Diff-Net is, however, limited to change detection in vertical semantic features, e.g., traffic signs and lights. In contrast to the above image-based deep learning methods, which detect changes in the image plane, the framework presented in [176] detects changes in crosswalks using Mask R-CNN for instance segmentation and ResNet-50 combined with a Feature Pyramid Network (FPN) for feature extraction in bird's-eye view (BEV). Most of these approaches are applied to very limited and simplified use cases. Change detection in HD maps is still in its infancy, and universal change detectors for HD maps are still missing.
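Both [174] and [175] rely on a raster obtained by projecting map elements onto the camera image plane. A minimal sketch of that projection step is given below, assuming an ideal pinhole camera without lens distortion. The function names, the binary raster format and the calibration values are illustrative assumptions and are not taken from the cited works.

```python
def project_point(p_world, R, t, fx, fy, cx, cy):
    """Project a 3D map point into pixel coordinates with a pinhole model.

    R (3x3 nested lists) and t (3-vector) are the extrinsics that map world
    coordinates to camera coordinates; fx, fy, cx, cy are the intrinsics.
    The camera looks along +z. Returns None for points behind the camera.
    """
    x, y, z = (
        sum(R[r][c] * p_world[c] for c in range(3)) + t[r] for r in range(3)
    )
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def rasterize(points, R, t, fx, fy, cx, cy, width, height):
    """Mark the pixels hit by projected map points in a binary raster."""
    raster = [[0] * width for _ in range(height)]
    for p in points:
        uv = project_point(p, R, t, fx, fy, cx, cy)
        if uv is None:
            continue
        u, v = int(round(uv[0])), int(round(uv[1]))
        if 0 <= u < width and 0 <= v < height:
            raster[v][u] = 1
    return raster


# Hypothetical calibration: camera at the world origin, axes aligned.
R_id = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t_zero = [0.0, 0.0, 0.0]
map_points = [(0.0, 0.0, 10.0), (1.0, 0.5, 10.0), (0.0, 0.0, -5.0)]
raster = rasterize(map_points, R_id, t_zero, 100.0, 100.0, 32.0, 24.0, 64, 48)
```

In practice, map elements such as lane boundaries are polylines rather than isolated points, so the projected vertices would additionally be connected by line drawing before the raster is passed to a network.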

B. MAP DATA UPDATE
The second stage in the maintenance of an HD map is to update the map elements based on the outcomes of change detection. In simple terms, map update amounts to a probabilistic data fusion problem. Continuously monitoring changes in the ever-changing environment in near real-time and fusing different data modalities, both in time and space and from different sources, to update several layers is indeed a challenging task. In their survey paper, Cadena et al. [182] identified the distributed process of updating and maintaining HD maps created and used by large fleets of autonomous vehicles as a compelling subject for future research. In this direction, Kim et al. [183], [184] proposed a solution to keep a new feature map layer [161] up to date from crowdsourced point cloud data. This new feature map forms a basis to build the different semantic and geometric features of an HD map. Pannen et al. [165], [185] proposed a problem partitioning approach for the maintenance of HD maps from crowdsourced data. However, the applicability of this work is limited to the update of lane geometry, more precisely to lane markings and road boundaries. In an attempt to update the base map layer from real-time crowdsourced point cloud data, Kim et al. [78] developed and validated a maintenance framework to update a point cloud map using evidence theory [186] and a pose graph SLAM approach for localization. Results on change detection and update of HD maps in real time remain very limited due to the need for a reliable low-latency communication infrastructure to handle this complex process.
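As a simple instance of map update seen as probabilistic data fusion, the sketch below fuses a Gaussian estimate of one landmark coordinate with successive crowdsourced observations using inverse-variance weighting (a scalar Kalman update). This is an illustrative assumption and not the evidence-theoretic method of [78]; the function name `fuse` and all numeric values are hypothetical.

```python
def fuse(mean, var, obs, obs_var):
    """Fuse a Gaussian map estimate with one Gaussian observation.

    mean, var:    current map estimate of a coordinate and its variance.
    obs, obs_var: a new observation of the same coordinate and its variance.
    Returns the updated (mean, variance).
    """
    gain = var / (var + obs_var)          # weight given to the observation
    new_mean = mean + gain * (obs - mean)
    new_var = (1.0 - gain) * var          # uncertainty shrinks after fusion
    return new_mean, new_var


# Hypothetical example: a sign stored at x = 10.0 m (variance 1.0 m^2) is
# observed by three vehicles, each with variance 0.25 m^2.
estimate = (10.0, 1.0)
for observation in (10.4, 10.6, 10.5):
    estimate = fuse(estimate[0], estimate[1], observation, 0.25)
```

Each observation pulls the stored position towards the measurements while reducing its variance, so the map converges as more vehicles report the same feature.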

VII. DATA AND COMMUNICATION INFRASTRUCTURE FOR HD MAPS
Building and maintaining HD maps at scale is a matter of data exchange between multiple stakeholders, e.g., government as the owner of the ITS roadside infrastructure, map providers and vehicles, as depicted in Figure 7. Collection, building, maintenance and distribution of map data require a reliable communication and distributed computing infrastructure [187]. This section discusses the data and communication infrastructure needed to scale the creation, maintenance and distribution of HD maps.
The first connected concepts were proposed in 2012 with the introduction of Local Dynamic Maps (LDM), which define four layers of static and dynamic information stored locally [188]. Onboard sensors, Vehicle-to-Vehicle (V2V) and Vehicle-to-Infrastructure (V2I) data are used to collect dynamic information and make it available locally to the driver and ADAS. Next, concepts that update the local map with global information retrieved via the cellular network were proposed [47]. More specifically, techniques that take into account the network coverage and performance are used to come up with optimised schedules for data transmission [189]. A first national deployment was tested in Japan in 2018 using the four-layer LDM principle [190]. More recently, sophisticated centralized dynamic mapping systems have been proposed and validated on a city scale [191]. In the same vein, a cross-border system involving multiple network operators has been proposed and experimentally validated [192].
Crowdsourcing approaches to share local perception data that rely on different processing and uploading strategies constitute the vast majority of studies that have been published over the past couple of years [47], [192], [193], [194], [195], [196], [197]. Some propose theoretical concepts [193], while others validate their approach through simulations [47], [195], [196] and experiments [192], [194], [197]. The uploading strategies generally focus on three abstraction layers: (1) local processing [188], (2) uploading and processing on the network edge [195], [196], [198], (3) uploading and processing in the cloud [189]. This abstraction allows the connected mapping platforms to be grouped into four categories: (1) centralized, i.e., cloud systems, (2) decentralized, i.e., edge systems, (3) distributed systems, i.e., self-organized local systems using direct communications such as V2V, and (4) hybrid approaches [80], [197], [199]. Furthermore, the dynamic nature of the data needed for the different map layers impacts the choice of communication technology used. For instance, the static layers that do not change frequently can rely on delay-tolerant and slow communication technologies, such as 2G/3G cellular networks. The more dynamic the data, the more reliable and available the network needs to be. Transient dynamic data such as weather and traffic conditions can tolerate seconds or minutes of delay. However, highly dynamic data, such as the state of traffic signals or the presence of nearby vulnerable road users (VRU) like pedestrians or cyclists, require a specialised network technology (e.g., C-V2X or ITS-G5 [200]) and a dedicated offloading and processing architecture to meet the requirements of safety applications [199]. Standardisation efforts are ongoing to facilitate the dissemination of highly dynamic data so that it can be used as an input for the corresponding layers of the HD map [201].
However, the format and integration of this type of data remain a challenge on their own and will be covered in more detail in the next section.
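The relation between data dynamics and communication technology described above can be summarised as a simple lookup. The sketch below is only an illustration: the class and function names are hypothetical, and the numeric delay thresholds are assumptions, since the text only states that transient dynamic data tolerate seconds or minutes of delay.

```python
from enum import Enum


class Dynamics(Enum):
    STATIC = "static"
    TRANSIENT = "transient dynamic"
    HIGHLY_DYNAMIC = "highly dynamic"


# Link classes named in the text; the one-to-one mapping is a simplification.
LINK = {
    Dynamics.STATIC: "delay-tolerant cellular (e.g., 2G/3G)",
    Dynamics.TRANSIENT: "cellular with seconds-to-minutes latency",
    Dynamics.HIGHLY_DYNAMIC: "specialised V2X (e.g., C-V2X or ITS-G5)",
}


def classify(tolerated_delay_s: float) -> Dynamics:
    """Classify map data by tolerated delay (thresholds are assumptions)."""
    if tolerated_delay_s < 1.0:
        return Dynamics.HIGHLY_DYNAMIC
    if tolerated_delay_s <= 600.0:
        return Dynamics.TRANSIENT
    return Dynamics.STATIC


def select_link(tolerated_delay_s: float) -> str:
    """Return the link class suited to data with the given delay tolerance."""
    return LINK[classify(tolerated_delay_s)]


# Hypothetical examples: a traffic signal state, a weather update and a
# lane geometry update.
signal_link = select_link(0.1)
weather_link = select_link(30.0)
geometry_link = select_link(86400.0)
```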

VIII. CHALLENGES AND FUTURE PERSPECTIVES
Despite notable advancements in CCAM over the past decade, achieving complete autonomy in vehicles is still an unresolved challenge. For autonomous vehicles to be deployed on a large scale, scalable solutions for HD maps are essential. In this section, we shed light on the various challenges that need to be addressed to reach the full potential of HD maps in CCAM [25]. Undoubtedly, the availability of cost-effective and flexible solutions for building, maintaining, and distributing map data among stakeholders will greatly enhance the scalability of CCAM in future generations of smart cities. We also discuss future perspectives and applications of HD maps.

A. CHALLENGES
1) STANDARDIZATION AND DATA REPRESENTATION
The concept of HD maps has become widely accepted as a key enabling technology for CCAM. Nevertheless, there is no common agreement on how mapping data are represented, how many layers are needed, what mapping data have to be stored in each layer and in which data format. Defining a common standard for HD maps is difficult due to their complexity and the large amount of data and information they contain, which makes it challenging to create a standard that is both comprehensive and easy to understand, store, maintain, update and distribute effectively. A common standard for mapping data would provide more data compatibility and facilitate access to data while reducing the costs of development and integration. Furthermore, it would improve the quality, consistency, and privacy of data, consequently improving road safety for all participants, including automated vehicles. Recently, there have been a few initiatives to define a common standard. The NDS aims at defining worldwide standards for HD map data in automotive ecosystems [44], [56]. There are more than 44 members in the NDS consortium, including automotive manufacturers, OEMs, and map solution providers. Nevertheless, the NDS standard has not yet been adopted by most of the leading companies that shape the AD industry today.

2) SCALABILITY
Scalable HD map solutions are imperative to the mass deployment of autonomous vehicles. Building city-, region- and nation-wide HD maps and keeping them updated remains a big challenge, especially when dealing with the different standards, traffic rules and regulations used to represent geometric road features as well as traffic signage. These standards differ from one region to another. Mapping algorithms have to be universal and able to work in different regions and countries. Mapping is supposed to be a continuous process of data collection and processing, in order to heal zones that have changed. This process becomes challenging in large geographical areas, where a huge number of vehicles have to be part of the mapping process. The mapping cost directly depends on how large the zone to be mapped is and on the number of vehicles needed to serve it. Mapping vehicles are very expensive, as discussed earlier in this paper. Furthermore, using individual vehicles equipped with consumer-grade sensors requires sophisticated algorithms that are not yet mature. Additionally, the communication and distributed computing infrastructure needed to handle this use case is the subject of ongoing research and studies [193], [197], [202].

3) NETWORKING AND COMPUTING INFRASTRUCTURE
Handling and processing large amounts of data, as in the case of building and updating scalable HD maps, requires a reliable networking and computing infrastructure that works in harmony and in near real-time [192]. With the advent of 5G/6G cellular communications, Internet of Things (IoT) and edge computing architectures, many opportunities for vehicular communications have become available in general [203], and solutions that handle building HD maps are becoming a commercially viable option [40]. These communication and computing infrastructures are designed to handle such data-hungry applications and meet their latency and bandwidth requirements. Large-scale crowdsourced mapping with a massive number of connected vehicles will be one of the principal applications of these infrastructures [197].

4) LIMITATIONS OF MAPPING ALGORITHMS
Despite the tremendous research and development efforts expended on automating the process of building HD maps, recent research outcomes clearly reveal that mapping algorithms used to extract HD map features and build road and lane topology are still limited to simple features [204]. Current state-of-the-art algorithms can detect simple geometric features but fail to deal with high-curvature features, e.g., roundabouts. Furthermore, most of these methods require several post-processing steps to get the features into a suitable vector format. Mapping semantic features is still limited to very few and easily detectable traffic signs. Only a few recent works have started to address building lane topology to construct simplified road/lane connection networks. A universal mapping pipeline would make it possible to build a fully-featured HD map containing geometric, semantic and topological information. Building such a pipeline remains a challenge.

5) MAP DATA OWNERSHIP, PRIVACY, INTEGRITY AND DISTRIBUTION
The future of building and maintaining HD maps lies in automating and distributing the process across millions of individual vehicles. Collecting, processing and storing large amounts of distributed data from the environment raises several concerns about data ownership, privacy, integrity and distribution. Raw mapping data are generated in vehicles, aggregated with other sources of data from public authorities, then processed and distributed by map providers. Map data ownership, from collection to distribution, needs to be addressed in large-scale HD mapping. Furthermore, preserving the privacy of individuals and vehicles is crucial and must be considered in the mapping process. Mapping data may include sensitive user information such as precise locations of vehicles as well as a precise description of their environment. The integrity of HD map data must be ensured in order to avoid incorrect and fatal decisions, especially if used by autonomous vehicles. Building accurate and trustworthy HD maps is still an ongoing research question. Commercially available HD maps often undergo manual checks and verification by humans. Generating accurate and reliable HD map data from multiple sources of data, e.g., via crowdsourcing, poses several technical issues that are yet to be solved. The ownership, privacy and integrity of scalable HD maps have recently started to attract the attention of researchers. In this regard, blockchains have been proposed as a promising solution for ensuring data integrity due to their distributed and secure nature [205], [206]. Building and updating scalable HD maps while preserving the traceability, privacy and integrity of data is a natural application of blockchains. This technology is expected to play a central role in building and distributing the next generation of HD maps.

B. FUTURE PERSPECTIVES 1) PHOTOREALISM
Precise localization has been one of the key motivations to introduce HD maps to autonomous vehicles. The existence of dense and, at the same time, compact representations of the road environment is fundamental for HD maps, especially for localization. There has always been a compromise between the density of information included in an HD map and the computational effort needed to process it. Recent progress in neural 3D scene representations makes it possible to reconstruct photorealistic 3D scenes in a very compact representation [207], [208], [209], [210], [211], [212]. Representing the base map layer using neural radiance fields (NeRF) allows benefiting from both a compact and a photorealistic representation of this layer. This technology is likely to bring maps for autonomous vehicles into a new era.

2) APPLICATIONS BEYOND AUTONOMOUS VEHICLES
HD maps are mainly developed to help autonomous vehicles understand and safely navigate the environment. Thanks to the detailed and precise representation of the environment they provide, HD maps can also be used to improve the quality of various services offered by classical digital maps. Furthermore, HD maps can play an important role in digital assistive technologies for people with disabilities. The mobility and safety of visually impaired persons could be significantly improved if they are equipped with suitable sensors and have access to a highly precise, detailed and semantically rich representation of the environment. If precisely localized, a digital assistive device will be able to interpret and understand the environment and therefore generate vocal navigation messages for safe navigation. The real-time status of traffic lights and other traffic information for pedestrians in HD maps is relevant to enhancing the functionality of these devices. Presently, most HD map providers only offer maps representing the vehicle environment. Routes of participants other than vehicles, e.g., sidewalks for pedestrians and cycling tracks, are still missing in the HD maps of today. Building and updating HD maps for all participants will pave the road towards a broad range of autonomous and non-autonomous navigation as well as several useful digital services.

3) TOWARDS DIGITAL TWINS
The environmental digital twin is a holistic digital representation of the environment including all of its physical and functional characteristics [213], [214]. A city-scale digital twin is an emerging concept in CCAM that aims at building a data-driven model that combines data from various sources, such as IoT sensors, connected vehicles, buildings, intelligent infrastructure and transportation networks, to help create a comprehensive, real-time model of the city [215], and thus improve road services [216]. This concept generalises HD maps from a digital model for connected and autonomous vehicles to a holistic digital model that helps all entities in a society. Digital twins can even be used to model the behaviour of the different entities in the environment at a micro-scale level of detail [217]. An HD map will be a single module of a digital twin [218] that supports different functions and services for connected and autonomous vehicles in our smart cities [219], [220], [221], [222]. Just as HD maps can be used to simulate complex driving scenarios, digital twins will be used to simulate and analyze complex city-scale scenarios for these vehicles [223]. Digital twins will allow studying, analysing and simulating the impact of new development projects or the effects of changes in traffic patterns, and can help city planners and decision-makers to analyze and optimize the performance of the city by predicting future scenarios and identifying opportunities for improvement. Building a city-scale digital twin is indeed a big challenge that requires a large amount of data, and it can be a complex and time-consuming process. Cross-validation, integrity and trustworthiness of large amounts of distributed data remain a challenge in creating digital twins [224], [225].
Crowdsourced mapping of roads by vehicles will be replaced by a unified process of simultaneous outdoor and indoor mapping using the large amounts of data available from heterogeneous connected sensors.

IX. CONCLUSION
HD maps continue to be a rapidly evolving aspect of real-world CCAM applications, driving innovation and progress within the field. Despite significant research and development efforts on the applications of HD maps in AD systems and on the algorithms and infrastructures to build and maintain HD maps, there is very little literature that summarizes these works and provides a standpoint on them. This paper extensively reviewed the previous works on building and maintaining HD maps, including cost-effective solutions as well as the communication and mapping data requirements from generation to distribution. Furthermore, the paper discussed the current challenges in each of the above areas for building and maintaining HD maps. More precisely, we provided a free-standing overview of HD maps as a background for the broader community of intelligent transportation systems. We also discussed and analyzed the state-of-the-art of using HD maps for the various core functions in AD systems. Furthermore, we extensively discussed and reviewed the different approaches, methods and algorithms to build the different layers of HD maps and keep them up-to-date. Finally, we shed some light on the prospective developments of HD maps for the next generation of mobility applications.