Mapless LiDAR Navigation Control of Wheeled Mobile Robots Based on Deep Imitation Learning

This paper addresses the problems related to the mapless navigation control of wheeled mobile robots based on deep learning technology. The traditional navigation control framework is based on a global map of the environment, and its navigation performance depends on the quality of the global map. In this paper, we proposes a mapless Light Detection and Ranging (LiDAR) navigation control method for wheeled mobile robots based on deep imitation learning. The proposed method is a data-driven control method that directly uses LiDAR sensors and relative target position for mobile robot navigation control. A deep convolutional neural network (CNN) model is proposed to predict motion control commands of the mobile robot without the requirement of the global map to achieve navigation control of the mobile robot in unknown environments. While collecting the training dataset, we manipulated the mobile robot to avoid obstacles through manual control and recorded the raw data of the LiDAR sensor, the relative target position, and the corresponding motion control commands. Next, we applied a data augmentation method on the recorded samples to increase the number of training samples in the dataset. In the network model design, the proposed CNN model consists of a LiDAR CNN module to extract LiDAR features and a motion prediction module to predict the motion behavior of the robot. In the model training phase, the proposed CNN model learns the mapping between the input sensor data and the desired motion behavior through end-to-end imitation learning. Experimental results show that the proposed mapless LiDAR navigation control method can safely navigate the mobile robot in four unseen environments with an average success rate of 75%. Therefore, the proposed mapless LiDAR navigation control system is effective for robot navigation control in an unknown environment without the global map.


I. INTRODUCTION
Navigation control is one of the core functions of autonomous mobile robots; it enables the robots to successfully complete the task of motion control and obstacle avoidance in the working environment. Figure 1(a) shows the architecture of the traditional map-based mobile robot navigation control system [1], which is a layered architecture composed of mapping, planning, and tracking control systems. The mapping system is used to construct a global map of the unknown environment. The planning system includes a global planner and a local planner to provide the optimal motion trajectory for the mobile robot to perform navigation tasks such as target approaching or obstacle avoidance. Both global The associate editor coordinating the review of this manuscript and approving it for publication was Yiming Tang . planner and local planner need to use the global map, sensor information, and the absolute target position to compute the optimal path trajectory and the motion control commands, respectively. According to [2], each system of the traditional map-based navigation framework represents a challenging research topic, and their integration often leads to poor computing performance due to large computational errors. In addition, the navigation performance of the traditional map-based navigation framework depends on the quality of the global map, which is very sensitive to sensor noise. This requirement may limit the ability of the navigation system to manage unknown or dynamic environments [2].
To simplify the architecture of the traditional map-based navigation control system, the development of mapless LiDAR navigation technology for wheeled mobile robots has received more and more attention in recent years. VOLUME 9, 2021 This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ Figure 1(b) shows the system architecture of the existing deep learning-based mapless LiDAR navigation control system, which employs the neural network model to predict the motion control commands for the robot to approach the target while avoiding obstacles. Compared with traditional map-based navigation technology, the existing deep learning-based mapless navigation technology has two major advantages: (1) It releases the requirement of global map information. This advantage makes the performance of the navigation system independent of the quality of the global map, but it depends on the diversity of the training dataset.
(2) It directly uses the sensor information and the relative target position to predict the corresponding motion control commands. This advantage simplifies the complexity of the navigation system by using the neural network model. However, a potential flaw of this simplification is that the user may need to retrain the entire neural network model if the hardware specification of the mobile platform is changed.
Several mapless navigation control methods have been proposed in the literature. Based on the types of sensors, we divide the existing mapless navigation control methods into two approaches, namely the vision-based approach and the LiDAR-based approach. The vision-based approach uses RGB cameras to capture environmental images and a deep CNN model to learn the mapping between the observed images and the corresponding motion control commands, which can directly drive the robot to perform the navigation control in an unknown environment. For example, Kanayama et al. [3] proposed a two-mode mapless visual navigation system for mobile robots based on deep CNN models. The proposed navigation system is composed of two modules, namely the approach module and the positioning module. The approach module detects the landmarks in the environment to determine the velocity which directs the robot to move close to the target. The positioning module consists of a deep CNN model to calculate the moving distance from the current position to the target position. Experimental results show that this visual mapless navigation system is able to navigate the robot in the first-time-seen environments. Similar research can also be found in the references [4] and [5].
On the other hand, the LiDAR-based approach uses LiDAR sensors to obtain point cloud information of the surrounding environment and a deep CNN model to learn the mapping between the observed point clouds and the corresponding navigation control commands without the global map information. Recently, some mapless LiDAR navigation control methods have been published in the literature. Figure 2 shows the existing mapless LiDAR navigation control methods, which can be divided into two categories. The first category is based on data clustering technology to deal with scene segmentation problems. The second category is based on deep learning technology, which can be further divided into training methods based on reinforcement learning [6] and imitation learning [7]. The former can be used to solve FIGURE 1. System architecture of (a) the traditional map-based navigation control system, which is a layered architecture composed of mapping, planning, and tracking control systems, and (b) the existing deep learning-based mapless navigation control system. the problem of motion behavior classification and motion behavior analysis; the latter is useful in dealing with the problem of motion behavior cloning.
The purpose of the mapless navigation method based on point cloud segmentation processing is to use the geometric surface features of the LiDAR point cloud to segment the road area in the scene and to plan the motion trajectory of the area. For example, Ort et al. proposed a mapless driving framework for autonomous vehicle navigation, which combines sparse topological maps for global navigation with a sensor-based perception system for local navigation [8]. The sensor-based perception system uses a 3D laser scanner mounted on the vehicle for planning safe trajectories in the local frame based on a road segmentation algorithm, which aims to obtain the road boundary points in the sensor swath of the vehicle. Experimental results show that this framework allows the vehicle to navigate road networks without detailed prior maps in a rural unknown environment. Similar research can also be found in the reference [9].
Due to the rise of deep learning technology in recent years, its applications have become more diverse. With deep learning technology, mapless navigation control methods can achieve more robust navigation results. The main purpose of the reinforcement learning method is to determine the best action based on environmental feedback to maximize the overall reward. The classification problem of navigation control commands is to classify the output commands of the navigation system into several fixed categories according to the environmental conditions. The motion behavior classification method based on reinforcement learning can be applied to this classification problem to train the optimal classifier that can predict the correct motion command category based on the sensing information. For example, Qiang et al. [10] proposed a model-free mapless navigation method for mobile robots based on reinforcement learning. They designed an end-to-end navigation model that uses LiDAR sensor data as input and one of the five possible motion actions as output. After training the model through reinforcement learning, the mobile robot can safely reach the navigation target in an unknown environment without prior maps. References [11] and [12] also proposed similar research on this topic. A good review of reinforcement learning-based navigation techniques compared to traditional map-based navigation methods can be found in [2].
The motion behavior analysis method based on reinforcement learning is a regression analysis between the output motion control commands of the navigation system and the input environment perception and target position data. For example, Tai et al. [13] proposed a learning-based mapless motion planner, which takes the laser scan data and the relative target position as input and the continuous navigation commands as output. They applied an asynchronous deep reinforcement learning method to train an end-to-end mapless motion planner in the virtual environment. Experimental results show that the trained planner can be directly applied in unseen virtual and real environments. Pfeiffer et al. [14] proposed a target-driven mapless navigation policy based on a combination of imitation and reinforcement learning. They designed a neural network model, which uses the sensor measurements and the relative target position as input and the required control commands as output. The model was trained in a sequential fashion by using the result from imitation learning to initialize the reinforcement learning method. Their results show that target-driven demonstrations through imitation learning significantly improve the exploration during reinforcement learning.
Imitation learning means learning from the demonstration so that the network model can imitate the behavior of the demonstration. In the training process, the end-to-end learning method is used to make the neural network model learn the demonstration behavior given by the expert strategy, so that the predictive behavior of the model can become similar to the behavior of the expert strategy. Hence, the purpose of the navigation control method based on imitation learning is to learn the best mapping between the output motion action of the navigation controller and the input LiDAR sensing data and the relative target position. For example, Khaksar et al. [15] proposed a mapless neuro-fuzzy motion planner learned from a virtual experience model, which generates enough training data to train the adaptive neuro-fuzzy inference system. Pfeiffer et al. [16] proposed a data-driven motion planning approach based on a feature-based maximum entropy model, which is trained to predict the joint navigation behavior of heterogeneous groups of agents based on the demonstration of human-human interactions. In [17], Pfeiffer et al. proposed a target-oriented end-to-end navigation model that can learn the complex mapping from raw 2D-laser range findings and a target position to the required motion commands for a robotic platform. Hamandi et al. [18] proposed a human-aware navigation model in which the robot mimics humans to navigate safely in the crowds. The proposed navigation model also processes LiDAR scans as input to navigate the robot to the target position. Hence, the main advantage of imitation learning is that it is very useful when an expert system is available to provide the desired behavior for the network model to learn the optimal policy by imitating the behavior of the expert. However, the efficiency of imitation learning depends on the diversity of behaviors of the expert system to cover various situations. Therefore, to find ways to increase the behavior diversity of the expert system is an important issue in the application of imitation learning.
In this paper, we propose a mapless LiDAR navigation control system based on an end-to-end CNN model, which is trained through imitation learning. The proposed navigation system directly uses the LiDAR sensor data and the relative target position as input, and outputs the motion control commands required by the mobile robot. In the training phase, we use a regression method to train the model end-to-end to learn the mapping between input data and output commands and apply the trained model to navigation control. Moreover, we also propose a novel data augmentation method to increase the diversity of training samples to improve the efficiency of imitation learning. The contributions of this paper are summarized in the following three points: (1) A new CNN-based navigation control model is proposed, which can obtain better training results by increasing the dimension of input LiDAR data.
(2) We propose a data augmentation method that can increase the diversity of training samples so that the trained model can operate in an unknown environment.
(3) By collecting training data in the real environment and using the proposed data augmentation method to increase the size of the training dataset, the trained model can obtain good navigation control performance in the real environment.
Experimental results show that the proposed LiDAR navigation control system based on end-to-end imitation learning achieves good navigation control performance of the wheeled mobile robot. In a known environment, the proposed navigation control system can safely control the mobile robot to avoid obstacles and reach the target position. When operating in an unknown environment, the proposed navigation control system can reach the target position with an average success rate of 75%. Therefore, the proposed mapless navigation control system is practically applicable in real environments.
The rest of this paper is organized as follows. Section II describes the architecture of the proposed mapless navigation control system. Section III introduces the LiDAR signal CNN model and the motion prediction neural network model proposed in this paper. Section IV presents the collected training dataset and the proposed data augmentation method. Section V reports the experimental results to evaluate the navigation control performance of the proposed method in different environments. Section VI concludes this work and provides recommendations for future work. Figure 3 shows the architecture of the proposed mapless LiDAR navigation control system based on end-to-end imitation learning, in which the wheeled mobile robot is equipped with a 3D laser scanner to capture the point cloud of the surrounding environment. When the 3D point cloud of the surrounding environment, has been received, the proposed mapless navigation control system performs point cloud preprocessing on the 3D point cloud information to obtain the LiDAR sensing data. In addition, the robot motion information is obtained from the odometry of the mobile platform. Next, we use the Laser_scan_matcher [19] provided in the Robot Operating System (ROS) to obtain the robot position information by fusing the LiDAR point cloud and the robot motion information. When the user manually gives the target position information, the proposed navigation control system performs/initiates a coordinate conversion process on the given target position information to obtain the target position information relative to the current robot position.

II. SYSTEM ARCHITECTURE
As shown in Figure 3, the proposed navigation motion controller comprises of a navigation control CNN model and an end-to-end imitation learning algorithm. When performing navigation tasks, the proposed navigation control CNN model takes the extracted LiDAR sensing data and the relative target position as input, and outputs linear and angular velocity commands to the mobile platform. To learn the navigation control behavior provided by the expert strategy, we applied an end-to-end imitation learning algorithm to train the proposed navigation control CNN model offline. The detailed training process is described in Section III. Figure 4 shows the architecture of the proposed navigation control neural network model, which is divided into two models. The first one is a LiDAR signal CNN model that includes a feature extraction module for extracting features of LiDAR sensing data, and a residual network module for generating key point cloud feature maps. The second one is a motion prediction neural network model for motion command prediction. In the design of this model, we preprocess the relative target position through a fully connected (FC) layer and input the result into the data fusion module together with the point cloud feature maps. Finally, we use the motion prediction module to predict motion control commands. Section III also introduces the detailed design of the proposed navigation control CNN model for motion command prediction.
Remark 1: In the proposed navigation system, the robot position is obtained from the Laser_scan_matcher node in ROS, which is a frame-to-frame laser odometry based on  laser scan match between consecutive laser scans. More specifically, the laser odometry used in this node applies a point-to-line iterative closest point algorithm [20] on two consecutive laser scans to estimate the motion of the robot while updating the robot position relative to the starting position. The Laser_scan_matcher node can operate with wheel odometry to speed up the scan match process by providing a guess for the current position of the sensor. Therefore, in an indoor environment suitable for scan match processing, this node can effectively solve the problem of accumulated odometry errors. However, in an open wide environment, this node may reduce the accuracy of robot localization due to the increase in the scan match error. Interested readers can refer to [19] and [20] for more detailed information.

III. THE PROPOSED NAVIGATION CONTROL CNN MODEL
In this section, we introduce the design of the proposed navigation control CNN model for mapless navigation of wheeled mobile robots.
A. LIDAR SIGNAL CNN MODEL Figure 5 shows the detailed architecture of the proposed LiDAR signal CNN model, which is inspired from the design presented in [17]. The authors in [17] proposed a CNN model based on the ResNet architecture [21] to process laser scans obtained from a 2D LiDAR sensor, which provides a vector of distance values of the surrounding environment. In contrast, the proposed navigation control system uses a 3D LiDAR sensor to capture a distance matrix of the surrounding environment and converts this distance matrix into a 3D point cloud matrix of the regional environment. Next, a point cloud preprocessing module is adopted to reduce the dimension of the 3D point cloud matrix that outputs a 1080-by-3 point cloud matrix, which is defined as the LiDAR sensing data. This data is the input data of the proposed CNN model. As shown in Figure 5, the proposed CNN model is divided into a feature extraction module and a residual network (ResNet) module. The former is built by using two convolutional blocks and one maximum pooling (MaxPool) layer. Each convolutional block is composed of a 2D convolutional layer with ReLU [22] activation function and batch normalization (BN) layer for data normalization. The latter is designed and implemented based on the ResNet architecture to reduce the complexity of training and increase the speed of learning. Finally, we use an average pooling (AvgPool) layer to output the point cloud feature maps to complete the point cloud feature extraction process. Figure 6 shows the processing steps of the proposed point cloud preprocessing, which is divided into two steps, namely point cloud extraction, and point cloud library (PCL) filtering. We used ROS to connect the LiDAR sensor and read laser scans from the LiDAR through RJ-45 network cable. In the point cloud extraction step, after receiving the laser scans from the LiDAR sensor node in ROS, we convert the received data into 3D point clouds, and then convert these point clouds into the PCL file format [23]. In the PCL filtering step, VOLUME 9, 2021  we perform two filtering processes. The first one uses the PassThrough filter to set the size of a 3D cuboid to filter the point clouds outside the cuboid. The purpose of this process is to filter out the point clouds that are far away from the robot to retain the more important point cloud data. However, after processing by the PassThrough filter, the number of remaining point clouds is not fixed. Therefore, the second one uses the RandomSample filter to filter the remaining point clouds and output a fixed number of point clouds, which is the input data of the LiDAR signal CNN model. Figure 7 shows the architecture of the proposed motion prediction neural network model, which is modified from the neural network model presented in [17]. The proposed neural network model is built on five FC layers with dimensions (210, 1024, 1024, 512, 2). The first FC layer uses the linear activation function to increase the dimension of the input relative target position, which is calculated based on the coordinate conversion of the given target position with respect to the current robot location. After concatenating the high-dimensional target position information and the point cloud feature vector to form a hybrid feature vector, two FC layers with the ReLU activation function are used to perform nonlinear transformation and dimensionality reduction operations on the hybrid feature vector. The last two FC layers use the linear activation function to perform the mapping between the low-dimensional hybrid feature vector and the output linear and angular velocity commands.

B. MOTION PREDICTION NEURAL NETWORK MODEL
Let L and L f denote the LiDAR sensing data and the corresponding point cloud feature maps, respectively. Assume that the LiDAR signal CNN model is represented by a nonlinear mapping function π f . Then, the relationship between L and L f can be expressed as L f = π f (L), where the dimension of L is fixed to 1080×3. Let C d = (X d , Y d , θ d ) denote the desired relative target position given by the user, where (X d , Y d ) is the plane coordinate of the desired relative target point, and θ d is the desired relative orientation angle; C M = (v, ω) the predicted motion control command, where v is the linear velocity command, and ω is the angular velocity command. Assume that the proposed motion prediction neural network model is formulated by a nonlinear mapping function π ϕ . Then, the relationship between the input data L f , C d and the output command C M can be described by the following formula where the input data C d is obtained from the user-defined target position (X T , Y T , θ T ) and the current robot position (X R , Y R , θ R ). Note that the current robot position can be obtained from a mobile robot localization algorithm, such as the Laser_scan_matcher method used in the proposed navigation control system. Then, the desired relative target position C d can be calculated by the following coordinate conversion formula where the current robot position (X R , Y R , θ R ) is continuously updated during the navigation task. Note that the physical meaning of the coordinate conversion formula (2) is the transformation of the target position from the world frame to the robot base frame. Therefore, when the desired relative target position converges to zero, it indicates that the robot is already at the desired target position.  In the model training phase, we first constructed a largescale training dataset. Let train = {(L i , C di ,Ĉ Mi )}| i=1∼N train and test = {(L j , C dj ,Ĉ Mj )}| j=1∼N test denote, respectively, the constructed training and testing dataset, where L i is the i-th LiDAR sensing data, C di the i-th desired relative target position, andĈ Mi the i-th desired motion control commands. In this study, we collected a total of 250,000 training samples, of which 200,000 samples were used as the training dataset (N train = 200, 000), and the other 50,000 samples were used as the testing dataset (N test = 50, 000). During the model training process, we randomly divided the training dataset into several batches to train the proposed navigation control CNN model. We used the Mean-Absolute-Error (MAE) as the loss function for training where batch = {(L k , C dk ,Ĉ Mk )}| k=1∼N batch ⊂ train is a batch of the training dataset, and N batch N train is the batch size. Based on the MAE loss function (3), we optimized both the LiDAR signal CNN model and the motion prediction neural network model such that π φ ,π f = arg min π φ ,π f L MAE ( batch , π φ , π f ).
In the model testing phase, we evaluated the performance of the optimized navigation control CNN model based on the MAE loss function L MAE ( test ,π φ ,π f ) calculated from the testing dataset test . We observed from the experimental results that the MAE loss value of the optimized navigation control CNN model tested on the testing dataset is L MAE ( test ,π φ ,π f ) = 0.1137.

C. INFLUENCE OF DIFFERENT DIMENSIONS OF THE INPUT FC LAYER OF THE RELATIVE TARGET POSITION ON MODEL TRAINING
In this section, we study the influence of the different dimensions of the input FC layer of the relative target position on the training of the proposed navigation control CNN model. Figure 8 shows the evolution of MAE with respect to different dimensions of this FC layer from 3 to 300. It can be seen from Figure 8 that the use of a 3-dimensional FC layer results in the worst performance of model training. In contrast, when the dimension of the input FC layer is increased to 210, the best performance of model training can be obtained. Therefore, we use the 210-dimensional input FC layer as the default setting of the proposed motion prediction neural network model.

Remark 2:
The difference between the proposed CNN model and the original model in reference [17] is twofold. First, the original model is only suitable for 2D LiDAR, whereas the proposed model is expanded to be used for 3D LiDAR by integrating a point cloud preprocessing module. Second, the original model does not have an input FC layer for the relative target position. On the contrary, the proposed model adds an input FC layer to improve the learning accuracy of the CNN model, as shown in Figure 8. Hence, the navigation performance of the proposed model can be improved.

IV. PREPARATION OF TRAINING DATASET
This section presents the collection of the training dataset and the proposed data augmentation methods to increase the number of training samples.

A. COLLECTION OF THE TRAINING DATASET
In this study, we first collected a preliminary training dataset by manually controlling a wheeled mobile robot (Pioneer P3-DX [24]) equipped with a 3D LiDAR sensor   (Velodyne VLP-16 [25]). The robot navigated in an experimental field. Figure 9 shows the experimental field for the preliminary training dataset collection. A notebook computer was placed on the mobile robot to collect LiDAR sensor information, robot motion control commands, and the relative target positions.
While collecting the training data, we placed the robot in the experimental field and manually controlled the robot to navigate among six specific locations (the red dots from A to F), as shown in Figure 9. In the manual navigation control task, we recorded the surrounding point cloud information, the relative target positions, and all the control commands of the navigation process as expert strategies. The starting point of the navigation task starts from point A, passes through point B to point F in order (sequence), and finally returns to point A. When performing navigation control, the motion trajectory of the robot is maintained at the midpoint of the movable space. If the robot encounters an obstacle, we controlled the robot to perform obstacle avoidance actions.
In order to control the motion of the mobile robot, we used the ROS system to connect the robot through the RS232 interface. In this work, we defined four control behaviors to navigate the mobile robot in the experimental field. Figure 10 shows these four behaviors and the corresponding velocity control commands of the wheeled mobile robot. These commands are defined as the output ground truth in the training dataset.

B. THE PROPOSED DATA AUGMENTATION METHOD
After obtaining the preliminary training dataset recorded in the real environment, we performed a data augmentation processing on the recorded training samples to form a large-scale training dataset. First, we applied multiple affine transformations, such as rotating 90 degrees clockwise, flipping 180 degrees horizontally, flipping 180 degrees vertically, etc., to the recorded LiDAR point cloud data to obtain multiple sets of new data. Figure 11 shows some examples of affine transformations used in the proposed data augmentation method. Note that the output ground truth of each augmented new data is the same as that of the original data.
Second, we propose a new data augmentation method based on a data merging process to increase the training samples by merging multiple consecutive training data into a new training data. Figure 12 illustrates the concept of the proposed data augmentation method. Let C dij denote the relative position from target point i to target point j. In Figure 12, there are three target points to guide the mobile robot to avoid an obstacle in the corridor. When we sequentially recorded the training samples of the robot navigation from C d01 to C d23 , we perform the data merging process on the recorded data to generate new training samples between two different target points offline, such as the new robot navigation path C d02 and C d13 . By doing so, we can efficiently increase the navigation distance of the robot in the training sample so that the diversity of the augmented training dataset can be improved. Table 1 records the changes in the number of training samples before and after using the proposed data augmentation method. At the beginning, we recorded 15,625 training samples in the preliminary training dataset through manual navigation control of the mobile robot. After applying multiple affine transformations to the preliminary training samples, the number of the training samples increased to 125,000 samples. Finally, a large-scale training dataset with 250,000 samples was created by applying the data merging process on the augmented 125,000 samples.

V. EXPERIMENTAL RESULTS
We trained the proposed navigation control CNN model using Tensorflow Keras framework running on a desktop computer equipped with an Nvidia RTX 2080Ti GPU, and then implemented it on a laptop to control a wheeled mobile robot equipped with a 3D LiDAR sensor. Figure 13 shows the wheeled mobile robot and the 3D LiDAR sensor used in the experiment. We used the ROS framework to control the VOLUME 9, 2021  mobile robot and the LiDAR sensor to test and evaluate the performance of the proposed navigation control method. Figure 14 shows four unseen environments used to evaluate the performance of mobile robot navigation control based on the proposed navigation control CNN model. These unseen environments are considered as common scenes in the indoor environments. For example, UE1 and UE2 are indoor corridor scenes, and UE3 and UE4 are indoor hall scenes. For each unseen environment, we performed five times of the mapless navigation control on the mobile robot and recorded the desired target position and the final robot position to evaluate the success rate of the navigation control task in an unseen environment. In the experiment, we define the condition of successful navigation as the error distance within 0.4m between the desired target position and the final robot position. Table 2 lists the results of five experiments of the mapless navigation control of the mobile robot in the UE1. As shown in Table 2, after five experiments, the robot successfully navigated to the desired target position four times. Therefore, the average success rate of the mapless navigation control task in the UE1 reaches 80%. In addition, in the four successful navigation experiments, the longest navigation distance was about 10.21 meters, and the maximum error distance was 0.19 meters. Figure 15 shows the image sequence of the mobile robot performing mapless navigation in the first experiment. First, we placed the robot at the starting point shown in Figure 15 and set the desired target position to (5.0m, -2.5m). From Point (2) to Point (4), the robot autonomously approached the desired target position with a smooth motion trajectory. At Point (5), the robot autonomously performed a right-turn motion to correct its orientation angle to face the desired target position. Finally, the robot successfully reached the desired target position at Point (6) with an error distance of about 0.09m. Table 3 lists the results of the five experiments of the mapless navigation control of the mobile robot in the UE2, which is an unseen corridor environment. After five experiments, the robot also successfully navigated to the desired target position four times. Therefore, the average success rate of the mapless navigation control task in the UE2 is 80%. In the four successful navigation experiments, the longest navigation  [17] for the four unseen environments for comparison. distance was about 13.01 meters, and the maximum error distance was 0.13 meters. Figure 16 presents the image sequence of the first navigation experiment in the UE2. In this experiment, we tested whether the robot could complete the mapless navigation control task in the unseen corridor environment. At the beginning of the experiment, we placed the robot at the starting point and set the desired target position to the position where the robot needs to turn twice in the corridor. From Point (1) to Point (3), the robot successfully completed the first autonomous steering action with smooth motion trajectory. From Point (3) to Point (6), the robot completed the second autonomous steering action and successfully reached the desired target position of Point (6) with an error distance of about 0.13m. Table 4 records the results of the five experiments of the mapless navigation control of the mobile robot in the UE3, which is an unseen wide open environment. After five experiments, the robot successfully reached the desired target position only three times. Therefore, the average success rate of the mapless navigation control task in the UE3 is 60%. In the three successful navigation experiments, the longest navigation distance was about 5.05 meters, and the maximum error distance was 0.19 meters. Figure 17 shows the image sequence of the first navigation experiment in the UE3. In this experiment, we tested the performance of the proposed mapless navigation control system in the unseen open wide environment. At the beginning of the experiment, we placed the robot at Point (1) shown in Figure 17 and set the desired target position to (3.0m, 1.0m). From Point (2) to Point (5), the robot approached the desired target position with a smooth motion trajectory. At Point (6), the robot successfully reached the desired target position with an error distance of about 0.14m. During the experiments in the UE3, we found that the navigation performance of the proposed mapless navigation control system in an unseen open wide environment may be reduced due to the increased difficulty of the robot localization.  Table 5 records the results of the five experiments of the mapless navigation control of the mobile robot in the UE4, which is an unseen wide-open environment with an architectural structure conducive to robot localization. After five experiments, the robot successfully reached the desired target position four times. Therefore, the average success rate of the mapless navigation control task in the UE4 is 80%. In the four successful navigation experiments, the longest navigation distance was 8.28 meters, and the maximum error distance was 0.33 meters. Figure 18 shows the image sequence of the first navigation experiment in the UE4. In this experiment, the mobile robot performed navigation tasks near a cylindrical structure in the unseen open wide environment. At the beginning of the experiment, we placed the robot at Point (1) shown in Figure 18 and set the desired target position to (5.0m, 2.0m). From Point (2) to Point (6), the robot tried to avoid the obstacle while approaching the desired target position with a smooth motion trajectory. At Point (7), the robot successfully reached the desired target position with an error distance of about 0.29m. The experimental results in the UE4 show that by improving the reliability of the robot localization, the navigation performance of the proposed mapless navigation control system in an unseen open wide environment can be improved. For a video clip of the experimental results, kindly refer to the online webpage in [26].

E. COMPARISON OF RESULTS WITH SOME EXISTING METHODS
In this section, we first compare the performance of the proposed method with the CNN model introduced in the literature [17], which is highly relevant to this work. We implemented the CNN model in [17] in Tensorflow Keras and trained the model using the proposed large-scale training dataset. After training the model, we tested its navigation performance in the same four unseen environments, each of which contains five experiments. Table 6 records the experimental results of twenty experiments conducted for the purpose of the comparison using the method proposed in [17]. It can be seen from Table 6 that the comparison method performs well in the first two unseen environments with an average success rate of 80%. However, in UE3 and UE4, the average success rate of the comparison method reduced to 40%. This problem is also mentioned in the literature [17] that one limitation of the comparison method is wide open spaces with a lot of glass and clutter around, which is like the last two unseen environments in our proposed work. Therefore, the overall average success rate of the comparison method [17] only reaches 60%, which is lower than the success rate of the proposed method which is 75%.
We next compare the performance of the proposed method with other state-of-the-art methods. Table 7 presents comparison results between the proposed method and all comparison methods, in which the data of all comparison methods are obtained directly from the literature [14] and [27]. In Table 7, the learning types are divided into three types: imitation learning (IL), reinforcement learning (RL), and reinforced imitation learning (R-IL). In addition, the average success rate results of all comparison methods were calculated in virtual environments. The c1000 method is a neural network model trained by IL training with 1000 trajectories in a complex map. The CPO method uses constrained policy optimization (CPO) to train the neural network model during RL training. The c1000-CPO method uses the c1000 model as the pre-trained model and uses the CPO to update the model during RL training. It can be seen from Table 7 that the proposed method is superior to the c1000 and CPO methods and provides comparable performance to the c100-CPO method.
On the other hand, SPN (support point-based navigation) and SPN-v2 are two recently proposed LiDAR navigation network models that can be used with different range sensors and different installation positions. The SPN {360|10|5|0} method represents the SPN model that uses a LiDAR configuration with 360-degree field of view, 10-degree angular resolution, a maximum detection range of 5 meters, and a distance of 0 meter between the center point of the range sensor and the mobile robot. It can be seen from Table 7 that the proposed method provides comparable performance to the two SPN models configured with low-resolution LiDAR sensors.

VI. CONCLUSION AND FUTURE WORK
In this paper, we propose a mapless LiDAR navigation control method for a wheeled mobile robot based on deep imitation learning to solve the mapless navigation control problem in unknown environments. The proposed navigation control method is a data-driven control method based on a deep CNN model, which consists of a LiDAR signal CNN model for generating point cloud feature maps and a motion prediction neural network model for predicting motion velocity commands. In addition, a data augmentation method is proposed to increase the number of training samples in the preliminary training dataset recorded in the real environment. After the data augmentation process, a large-scale training dataset with 250,000 samples was generated in this study.
In this experiment, four unseen environments are used to evaluate the navigation performance of the proposed mapless navigation control method. Experimental results show that the proposed mapless navigation control method has been tested twenty times in the four unseen environments, and the average success rate reached 75%. Therefore, the proposed mapless navigation control method is effective for robot navigation control in an unknown environment without the global map. We also compared the proposed method with some existing methods, and the experimental results show that the proposed method outperforms the two existing IL-based methods and provides comparable performance to the R-IL-based method in terms of the overall average success rate. Moreover, compared with the recently proposed SPN-based method, the proposed method also provides comparable performance to the SPN model equipped with a low-resolution LiDAR sensor.
In future work, we will try to increase the data diversity of the training database so that the trained navigation network model can have better navigation performance in unknown and complex environments. In addition, we also consider using reinforcement learning to improve the robustness of the navigation network model and achieve better mapless navigation performance.