A Review on Challenges of Autonomous Mobile Robot and Sensor Fusion Methods

Autonomous mobile robots have become more prominent in recent times because of their relevance and applications in the world today. Their ability to navigate an environment without physical or electro-mechanical guidance devices makes them promising and useful. Autonomous mobile robots are emerging in sectors such as companies, industries, hospitals, institutions, agriculture and homes, where they improve services and daily activities. With advances in technology, demand for mobile robots has increased because of the tasks they perform and the services they render, such as carrying heavy objects, monitoring, and search and rescue missions. Researchers have carried out various studies on the importance of mobile robots, their applications and their challenges. This survey paper reviews the current literature and the challenges that mobile robots face. It also presents a comprehensive study of devices/sensors and the prevalent sensor fusion techniques developed to tackle issues such as localization, estimation and navigation in mobile robots, organised according to relevance, strengths and weaknesses. The study therefore gives direction for further investigation into methods for dealing with the discrepancies that autonomous mobile robots face.


I. INTRODUCTION
An autonomous mobile robot is a system that operates in an unpredictable and partially unknown environment. This means the robot must be able to navigate without disruption and avoid any obstacle placed within its confinement of movement. An Autonomous Mobile Robot (AMR) requires little or no human intervention for its movement and is designed to follow a predefined path, whether in an indoor or outdoor environment. For indoor navigation, the mobile robot relies on a floor plan, sonar sensing, an Inertial Measurement Unit (IMU), etc. The first autonomous navigation systems were based on planar sensors such as laser range finders, which allowed robots to navigate without human supervision. For an autonomous mobile robot to perform its task, it must have a range of environmental sensors. These sensors are either mounted on the robot or serve as external sensors positioned somewhere in the environment. The number of different types of sensors mounted on the mobile robot to perform complex tasks such as estimation and localization makes the design of the overall system very demanding [1]-[3]. The basics of mobile robotics consist of locomotion, perception and navigation.
The associate editor coordinating the review of this manuscript and approving it for publication was Leo Chen.

A. LOCOMOTION
The locomotion system is an important aspect of mobile robot design. It depends not only on the medium in which the robot moves but also on other factors such as manoeuvrability, controllability, terrain conditions, efficiency and stability [4]. The design of a mobile robot depends on the service to be rendered; a mobile robot can therefore be designed to walk, run, jump, fly, etc. Depending on their requirements, robots are categorised as stationary or mobile, the latter operating on land, in water or in air. Mobile robots, especially autonomous ones, are in high demand because of their ability and capacity to perform tasks that may be difficult for humans. Examples of such mobile robots are wheeled, legged, walking or hybrid designs. Legged, wheeled and articulated bodies are the main ways in which mobile robots locomote [5]. Wheeled robots are suited to the ground, whether soft or hard, while legged and articulated bodies require a certain number of degrees of freedom, which brings greater mechanical complexity [6].
The wheel has been by far the most popular locomotion mechanism in mobile robotics and in vehicles in general. The advantages of wheels are efficiency and simplicity. Wheeled robots are easier and cheaper to design, build and program than their counterparts. Their control is less complex, and they cause minimal wear and tear on the surface on which they move. Another advantage is that they have no balancing issues because they remain in consistent contact with the surface. The shortcoming of wheels is that they are not well suited to navigating over obstacles such as stony terrain or uneven surfaces [4]. To design and develop the locomotion system, the terrain type for the mobile robot must be identified. The types of terrain are: Uneven, Level Ground, Stair Up, Stair Down and Nontraversable [5]. Another factor to consider when designing a mobile robot is stability. Stability is not usually a great problem in wheeled robots, because they are designed such that all the wheels are always in contact with the ground. A four-wheeled robot is more stable than one with three, two or one wheel because the Center of Gravity (COG) is located within the space between the wheels.
In recent times, mobile robots have been designed to operate in two or more modes to improve performance. In [7], the author proposed a mechanism structure for a mobile robot that gains adaptability by using hybrid modes with active wheels. On rough terrain the robot locomotes in leg mode, while on smooth terrain it uses wheeled locomotion by roller-skating on passive wheels. The challenge is that active wheels are usually heavy and bulky because they require driving actuators, steering and braking devices. Installing active wheels therefore adds to the weight of a vehicle that is already hefty, limiting the versatility of the leg mechanism. To improve the localization of a mobile robot irrespective of the terrain, a suitable technique has to be deployed. Dead reckoning has already been extended to the case of a mobile robot moving on uneven terrain. It provides positioning information for mobile robots by directly computing parameters such as position, velocity and orientation [8].

B. PERCEPTION
It is very important for an autonomous mobile robot to acquire information from its environment, sense objects around itself and determine its relative position. Perception is an imperative aspect of mobile robot study. If a mobile robot is unable to observe its environment correctly and efficiently, performing tasks such as accurately estimating the position of an object may be an issue [9]. To achieve this, information is perceived through sensors and other related devices [10]. Sensors make it possible to perform robot localization autonomously. They are also used for data collection, object identification, mapping and representation. Sensors used for data collection are categorised along two major axes: proprioceptive/exteroceptive sensors and active/passive sensors. Proprioceptive sensors measure values internal to the system (robot), e.g. battery level, wheel position, joint angle, motor speed, etc. These sensors can be encoders, potentiometers, gyroscopes, compasses, etc. Exteroceptive sensors are used to extract information from the environment or from objects. Sonar sensors, Infrared (IR) sensitive sensors and ultrasonic distance sensors are some examples of exteroceptive sensors. Active sensors emit their own energy into the environment and then measure the environmental response. They often achieve good performance because they can manage their interactions with the environment. However, an active sensor may suffer from interference between its signal and the environment [11]. Examples of active sensors include sonar sensors and radars. Passive sensors, in contrast, receive energy from the environment to make observations; examples include cameras such as Charge Coupled Device (CCD) or Complementary Metal Oxide Semiconductor (CMOS) cameras, temperature sensors and touch sensors. Which of these sensors is most applicable depends on the specific requirements and objectives of the autonomous mobile robot design. Table 1 gives the types of sensors used by an autonomous mobile robot.

C. NAVIGATION
Navigation is a fundamental problem in robotics and other important technologies. For a mobile robot to navigate autonomously, it has to know where it is at present, where the destination is and how it can reach the destination [12]. Navigation ability is the most important aspect in the design of a mobile robot. The objective is for the robot to navigate from one location to another in either a known or an uncontrolled environment. Most of the time, the mobile robot cannot take the direct route from its starting point to its end point, which means that motion planning techniques must be incorporated. The robot must therefore depend on other aspects, such as perception (valuable data acquired by the mobile robot through its sensors), localization (the position and configuration to be determined by the robot), cognition (the decisions made by the mobile robot on how to achieve its goals) and motion control (the robot must estimate the input forces on its actuators to accomplish the anticipated trajectory).
Another area to consider in robotics is the use of computer vision to aid navigation and localization. In computer vision, object recognition and feature matching are significant tasks for accurate positioning. Object recognition has long been adopted in mobile robots to detect or identify objects present in an image. The technique can be used either to determine the coordinates of the detected object or to calculate a position relative to a proposed object identified in an image. Feature matching, or image matching, on the other hand establishes correspondences between two images of the same scene/object.
Examples of features associated between the images are points, edges or lines, and these features are often called keypoint features [13], [14]. To perform object recognition and feature matching, several algorithms have been adopted; some of these algorithms are mentioned and discussed later in the paper.
VOLUME 8, 2020
TABLE 1. Classification of sensor system [11].
Mobile robots attract more and more attention because of their growing applications in areas such as security surveillance, home monitoring for health and entertainment, and research and education [15]-[17]. Surveillance robots are now being installed in homes for domestic use. They are simple and easy to deploy, and they connect to a home Wi-Fi network or smart environment to monitor and report activities in the environment. They have been developed further to engage in house cleaning and in positioning objects where and when required. Recently, home robots have been used by elderly people in situations where an emergency arises. These robots have therefore helped to promote technology that detects and reacts to events that demand an immediate response [18]. Another area where mobile robots are trending is education. Educational robotics is primarily focused on creating robots that help users develop more practical, didactic and cognitive skills. This approach is also intended to stimulate interest in research and science through a set of activities designed to strengthen specific areas of knowledge and skills. The introduction of mobile robots has increased not only at tertiary level and in scientific research institutions but also at lower levels such as secondary and primary schools [19]. This has improved people's knowledge of mobile robots worldwide.
Furthermore, mobile robots are gaining interest in the mining industry [20]. The use of mobile robots has increased the efficiency and safety of miners. The robots assist in tracking people, robots and machines, as well as in monitoring environmental conditions in mines. The mobile robotic platform is coupled with a set of range finders, thermal imaging sensors and acoustic systems, all of which operate with neural networks. The robots navigate into different environments and identify potential risk areas before the workers go in. Figure 1 shows some applications of mobile robots, which are not limited to the areas mentioned. Other applications include firefighting, agriculture, museum and library guides, planetary exploration, patrolling, reconnaissance and petrochemical applications, in both domestic and industrial settings [4].
The remaining sections of this paper are organised as follows. Section 2 presents the challenges that mobile robots face. Section 3 covers the sensors and techniques used to determine the position of mobile robots and to improve accuracy. Section 4 discusses the different methods used for object recognition and feature matching. Section 5 presents related work on sensor fusion techniques, and Section 6 presents the classification of sensor fusion algorithms. Section 7 highlights the importance of sensor fusion techniques, while Section 8 discusses areas where researchers can further investigate the issues challenging mobile robot navigation and localization in both known and unknown environments. Section 9 concludes the paper.

II. CHALLENGES OF AUTONOMOUS MOBILE ROBOT
Autonomous mobile robots have proven to be indispensable systems as a result of increasing demand for diverse applications. Despite their potential and prospects, they are yet to attain optimal performance because of the inherent challenges they face. These challenges (see Figure 2) have drawn growing research interest in recent times. Some of the main challenges are navigation and path planning, localization and obstacle avoidance.

A. NAVIGATION AND PATH PLANNING
As stated in Section I, autonomous navigation of a mobile robot is an open issue in the field of robotics. The navigation problem is mainly divided into two categories: local and global navigation. The local and global navigation problems differ in terms of distances, scales, obstacle avoidance and the inability to observe the goal state. For local navigation, an occupancy grid map is used to determine the navigation direction, while for global navigation, a landmark approach based on a topological map is used. The latter has a compact representation of the environment and does not depend on geometric accuracy. The limitation of this approach is that it is degraded by sensor noise. Mobile robot navigation systems depend on the level of abstraction of the environment representation. To determine the position and orientation of the mobile robot accurately, it is imperative that the environment be modelled in a simple and understandable structure. The three main techniques for representing the environment are geometric, topological and semantic [21].

1) GEOMETRIC
This representation describes the robot's environment by parameterizing primitive geometric objects such as curves, lines and points. The geometric representation of the environment is closer to the sensor and actuator world and is the best suited to local navigation. In [22], the author proposed a Principal Components Analysis (PCA)-Bayesian based method with a grid map representation to compress images and reduce computational resources. PCA was also used to reduce dimensionality and model the parameters of the environment by treating the pixels of an image as feature vectors of the data set [23]. In [24], the Markov localization method was proposed; it provides accuracy and multimodality for representing diverse kinds of probability distributions but requires significant processing for updates, and is hence impractical for large environments.

2) TOPOLOGICAL
This representation defines reference elements of the environment according to the distinct relations between them. A conventional method for modelling the robot's environment is to discretize the environmental model using a topological representation of the belief state, where each likely pose of the mobile robot is connected to a node in a topological map [25]. In [26], the proposed approach uses visual features extracted from a pair of stereo images as landmarks; new landmarks are fused into the map and transient landmarks are removed from the map over time. The topological representation uses graphs to model the environment and is used in large navigation tasks.

3) SEMANTIC
The current trend in robotics is to move away from representation models that are closest to the robot's hardware, such as geometric models, towards models closer to the reasoning of the humans with whom the robot will interact. The aim is to relate the way robots represent the environment to the way humans do. Robots that are provided with semantic models of the environments in which they operate have greater decision autonomy and become more robust and more efficient [27].
An integrated approach for efficient online 3D semantic map building of urban environments, followed by the extraction of qualitative spatial relationships between the different objects, was presented in [28]; this enables efficient task planning. Semantic information constitutes a better solution for interaction with humans [29]. The semantic representation is the most abstract representation model and adds concepts such as the utilities or meanings of the environment elements to the map representation. Semantic navigation is a navigation system that uses semantic information to build a model that includes the conceptual and physical representation of objects and places, the utilities of the objects, and the semantic relations among objects and places. This model allows the robot to manage the environment and to make queries about it in order to plan navigation tasks [21]. An environmental model requires an improved representation to achieve successful results and better accuracy and to reduce the computational cost [30]. For this to happen, the environment must be well represented, and a simple technique must be adopted and incorporated into the robot's representation of its environment [31].
Safe and efficient mobile robot navigation requires an efficient path planning technique, since the quality of the generated path strongly affects robotic applications [32]-[34]. In an environment with several obstacles, finding a collision-free path from the initial point to the final point becomes an issue, and the shortness and simplicity of the route are important criteria affecting the optimality of the selected route. Considering the length of the path travelled by the robot, its energy consumption and its execution time, an algorithm that finds the shortest possible route [35] is most appropriate. Basically, there are two types of environment: static and dynamic. Path planning in a dynamic environment is divided into global and local path planning [33]. The global navigation strategy deals with a completely known environment, while the local navigation strategy deals with unknown and partially known environments. Figure 3 shows the breakdown of path planning categories. Quite a number of studies have investigated path planning in dynamic environments. The authors in [37] proposed a new method to decide the optimum route of a mobile robot in an unknown dynamic environment; they used the Ant Colony Optimization (ACO) algorithm to decide the optimal rule table of the fuzzy system. Other related algorithms are Bacterial Foraging Optimization (BFO) [33] and Probabilistic Cell Decomposition (PCD) [38].
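As a minimal illustration of shortest-route planning in a completely known, static environment, the sketch below runs a breadth-first search over a small occupancy grid. It is not one of the algorithms cited above; the function name `shortest_path`, the 4-connected motion model and the convention that 0 marks a free cell and 1 an obstacle are assumptions made for this example.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected occupancy grid.

    grid  : list of rows, 0 = free cell, 1 = obstacle (assumed convention)
    start : (row, col) of the robot
    goal  : (row, col) of the target
    Returns the list of cells from start to goal, or None if no path exists.
    """
    rows, cols = len(grid), len(grid[0])
    parent = {start: None}          # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parents to rebuild the route.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# A wall in the middle row forces a detour through the right-hand column.
example_grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
route = shortest_path(example_grid, (0, 0), (2, 0))
```

Because every move has the same cost, breadth-first search returns a route with the fewest grid steps; weighted terrain or energy costs would call for a different search such as Dijkstra's algorithm.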
A new mathematical method based on the concepts of 3D geometry has been proposed to generate the route of the mobile robot; the robot decides its path in real time to avoid randomly moving obstacles [39]. Other intelligent algorithms that researchers have studied for mobile robot navigation in diverse environments are the Differential Evolution (DE) algorithm [40], [41], the Harmony Search (HS) algorithm [42], the Bat Algorithm (BA) [43] and Invasive Weed Optimization (IWO) [44].

B. LOCALIZATION
Localization is another fundamental issue in mobile robotics that requires attention. The challenging part of localization is estimating the robot's position and orientation, information that can be acquired from sensors and other systems. To tackle the issue of localization, a good technique should deal with errors, degrading factors, improper measurements and estimations. The techniques are divided into two categories [45]-[48]: relative and absolute localization.

1) RELATIVE LOCALIZATION TECHNIQUES
This method estimates the position and orientation of the mobile robot by integrating the information provided by different sensors, usually encoders or inertial sensors. The integration starts from the initial position and is updated continuously over time. Because errors accumulate during integration, relative positioning alone can be used only for a short period of time.
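The integration step of relative localization can be sketched for a differential-drive robot with wheel encoders. This is a minimal example under stated assumptions: the function name `integrate_odometry`, the two-wheel kinematic model and the midpoint-heading approximation are choices made here for illustration and are not taken from the cited works.

```python
import math


def integrate_odometry(pose, d_left, d_right, wheel_base):
    """Advance a planar pose (x, y, theta) by one encoder reading.

    d_left, d_right : distance travelled by each wheel since the last update (m)
    wheel_base      : distance between the two wheels (m)
    """
    x, y, theta = pose
    d_center = (d_left + d_right) / 2.0        # distance moved by the robot center
    d_theta = (d_right - d_left) / wheel_base  # change in heading (rad)
    # Use the heading at the middle of the step to reduce discretization error.
    x += d_center * math.cos(theta + d_theta / 2.0)
    y += d_center * math.sin(theta + d_theta / 2.0)
    # Wrap the heading into [-pi, pi).
    theta = (theta + d_theta + math.pi) % (2.0 * math.pi) - math.pi
    return (x, y, theta)


# Starting from the origin, apply a few (hypothetical) encoder increments.
pose = (0.0, 0.0, 0.0)
for d_l, d_r in [(0.10, 0.10), (0.10, 0.12), (0.10, 0.12)]:
    pose = integrate_odometry(pose, d_l, d_r, wheel_base=0.30)
```

Each update adds the measurement error of that step to the pose, which is why the estimate drifts unless it is corrected by an absolute measurement.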

2) ABSOLUTE LOCALIZATION TECHNIQUES
This method permits the mobile robot to determine its location directly from the environment of the mobile system. The numerous methods usually depend on navigation beacons, active or passive landmarks, map matching or satellite-based signals such as the Global Positioning System (GPS). For absolute localization, error growth is alleviated whenever measurements are available. The position of the robot is determined externally, and its accuracy is usually independent of time and location. In other words, integration of noisy data is not required, and thus error does not accumulate with time or distance travelled. The limitation is that one cannot keep track of the robot over short distances.
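As one hedged example of beacon-based absolute localization, the sketch below recovers a 2D position from range measurements to three beacons at known locations (trilateration). The function name `trilaterate`, the noise-free ranges and the restriction to exactly three beacons are assumptions for illustration; practical systems typically use more beacons and a least-squares fit.

```python
import math


def trilaterate(beacons, ranges):
    """Solve for (x, y) given three beacon positions and measured ranges.

    Subtracting the first range equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved here by
    Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = beacons
    r1, r2, r3 = ranges
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("beacons are collinear; position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y


# Hypothetical beacon layout and a robot located at (1, 2).
beacons = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
true_pos = (1.0, 2.0)
ranges = [math.dist(true_pos, b) for b in beacons]
estimate = trilaterate(beacons, ranges)
```

Each fix is computed from the current measurements only, so the error does not build up over time as it does when increments are integrated.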

C. OBSTACLE AVOIDANCE
Obstacle avoidance is a vital task in the field of robotics because the mobile robot must get to its destination without being obstructed by an obstacle or colliding with anything on its path. A collision-free algorithm is therefore a prerequisite for an autonomous mobile robot, since it offers a safe trajectory and proves convergence [49]. Some of the main algorithms that can be used for obstacle avoidance are discussed in this section.
The Bug algorithm [50] is one of the earliest algorithms. It makes the robot travel around the entire circumference of the obstacle encountered and decide on the most appropriate point from which to leave towards the goal. The robot then moves to this best leaving position and afterwards moves towards the goal. The benefit of this algorithm is that it is easy to determine whether the target is unreachable. However, the algorithm takes time to achieve its goal.
Another algorithm is the Vector Field Histogram (VFH) [51], which addresses the shortcomings of the Virtual Force Field (VFF) algorithm [52]. VFH allows the detection of unknown obstacles and avoids collisions while simultaneously steering the mobile robot towards the target. This algorithm employs a two-stage data reduction process to compute the desired control command for the robot. This ensures accurate computation of the robot's path to the target, but it consumes more resources such as memory, processor time and power.
The hybrid navigation algorithm with roaming trails (HNA) [53] is able to deal very efficiently with environments where the robot encounters obstacles during motion. During navigation the robot can deviate from its path to avoid obstacles on the basis of reactive navigation strategies, but it is never permitted to exit from the area. Since the robot is controlled to move within a convex area that includes the location of the target node, in the presence of static obstacles it is guaranteed to reach the target by following a straight line. In some cases, the mobile robot has to either avoid the obstacles or simply stop in front of them. A method similar to HNA is the New Hybrid Navigation Algorithm (NHNA) [54]. This algorithm uses the D-H bug (Distance Histogram bug) algorithm to avoid obstacles. It enables the robot to rotate freely at an angle of less than 90 degrees to avoid an obstacle. If a rotation of 90 degrees or more is required to avoid an obstacle, it acts as the bug-2 algorithm [50] and starts moving to the destination when the path is clear of obstacles. In short, a collision-free algorithm is a requirement for an autonomous mobile robot, since it provides a safe trajectory.
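The idea of reducing range data to a polar histogram and steering towards the target through a free direction can be sketched as follows. This is a heavily simplified illustration inspired by histogram-based methods such as VFH, not the algorithm of [51]: the function name `pick_heading`, the single range value per angular sector and the fixed `safe_distance` threshold are all assumptions.

```python
import math


def pick_heading(sector_ranges, target_bearing, safe_distance=1.0):
    """Choose a steering direction from a coarse polar range histogram.

    sector_ranges  : closest obstacle distance per sector; sector i is
                     centered at bearing i * (2*pi / n) in the robot frame
    target_bearing : bearing of the target in the robot frame (rad)
    safe_distance  : sectors with a closer obstacle are treated as blocked
    Returns the bearing of the free sector closest to the target direction,
    or None if every sector is blocked.
    """
    n = len(sector_ranges)
    width = 2.0 * math.pi / n
    best_bearing, best_error = None, None
    for i, distance in enumerate(sector_ranges):
        if distance < safe_distance:
            continue  # blocked sector
        bearing = i * width
        # Smallest absolute angular difference to the target direction.
        error = abs((bearing - target_bearing + math.pi) % (2.0 * math.pi) - math.pi)
        if best_error is None or error < best_error:
            best_bearing, best_error = bearing, error
    return best_bearing


# Eight 45-degree sectors; an obstacle 0.5 m ahead blocks sector 0.
ranges = [0.5, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0, 5.0]
heading = pick_heading(ranges, target_bearing=0.1)
```

A `None` result corresponds to the case mentioned in the text where the robot simply stops in front of the obstacle.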
In conclusion, the challenges faced by mobile robots must be tackled to ensure effective performance. Navigation is one of the most important aspects to consider for a mobile robot because it requires planning algorithms and appropriate information about the robot's location, which together guide the robot along its predefined path. Trajectory planning is as important as navigation: it determines the path the robot must follow in order to reach its destination. A path must therefore be planned so as to avoid collisions and obstacles. Different algorithms are considered for obstacle avoidance depending on the goal to be achieved. Finally, the robot must know its position and direction at all times. In this regard, an effective localization technique and reliable sensors are required to gather precise information.

III. SENSORS AND TECHNIQUES IN MOBILE ROBOT POSITIONING
To ensure accuracy in localization, sensors and an effective positioning system have to be considered. Object positioning [55], robotics and Augmented Reality (AR) tracking [56] have been of interest in the recent literature. This section discusses the existing technologies that aim at determining a mobile robot's position within its environment.

A. INERTIAL SENSORS
Inertial sensors are commonly packaged as an Inertial Measurement Unit (IMU), which is a combination of accelerometers, gyroscopes and sometimes magnetometers. These sensors have become ubiquitous because many devices and systems depend on them for a large number of applications. They rely on measurements of acceleration, heading and angular rates, which can be acquired without an external reference. Each of these sensors is deployed in robots, mobile devices and navigation systems [57]-[59]. The main benefit of these sensors is that they allow the position and orientation of a device and/or object to be calculated.

1) ACCELEROMETER
An accelerometer measures linear acceleration, which is the rate of change of the velocity of an object. The measurement is expressed in meters per second squared (m/s²) or in units of gravity (g). Accelerometers are useful for sensing vibration in systems or for orientation applications [60]. Velocity is obtained by integrating the acceleration once, and position by integrating twice. Using a standalone sensor such as an accelerometer can be simple and low in cost, as stated by the author in [61], but the linearly increasing error does not allow a high level of accuracy. The use of an accelerometer alone may not be suitable because it suffers from extensive noise and accumulated drift. It can be complemented with a gyroscope.
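The effect of accumulated drift can be made concrete with a short numerical sketch. It integrates accelerometer samples once for velocity and twice for position using simple Euler steps; the function name `integrate_acceleration`, the constant 0.01 m/s² bias and the 100 Hz sample rate are assumptions chosen for illustration.

```python
def integrate_acceleration(samples, dt):
    """Integrate acceleration samples (m/s^2) taken every dt seconds.

    Returns (velocity, position) after the last sample, starting from rest.
    """
    velocity = 0.0
    position = 0.0
    for acceleration in samples:
        velocity += acceleration * dt   # first integration: velocity
        position += velocity * dt       # second integration: position
    return velocity, position


# A stationary sensor that reports a small constant bias for 100 seconds.
bias = 0.01            # m/s^2, assumed for this example
dt = 0.01              # s, i.e. 100 Hz
samples = [bias] * 10000
velocity_error, position_error = integrate_acceleration(samples, dt)
```

Under this constant-bias assumption the velocity error grows in proportion to elapsed time and the position error in proportion to its square, so even a small bias leaves the robot's estimated position tens of meters off after 100 s.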

2) GYROSCOPE
A gyroscope measures angular velocity in degrees per second (°/s) or Revolutions Per Second (RPS); integrating once gives the rotation angle. Gyroscopes are small and inexpensive, yet they run at a high rate, which allows them to track fast and abrupt movements. Another advantage of the gyroscope is that it is not affected by illumination or visual occlusion [55]. However, its performance is degraded by the accumulation of measurement errors over long periods. Consequently, fusing the accelerometer and the gyroscope is appropriate for determining the pose of an object, since each makes up for the weakness of the other.
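One common and simple way to fuse the two sensors is a complementary filter, sketched below for a single tilt angle. The text does not prescribe this particular method; the function name `complementary_filter`, the blending weight `alpha = 0.98` and the axis convention used to derive the tilt from the accelerometer are assumptions.

```python
import math


def complementary_filter(angle, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """Update a tilt estimate (rad) from one gyroscope and accelerometer sample.

    gyro_rate        : angular velocity about the tilt axis (rad/s)
    accel_x, accel_z : accelerometer readings along two body axes (any
                       common unit); their ratio gives the gravity direction
    alpha            : weight of the integrated gyroscope estimate
    """
    # Short-term estimate: integrate the gyroscope (responsive but drifts).
    gyro_angle = angle + gyro_rate * dt
    # Long-term reference: gravity direction from the accelerometer
    # (drift-free but noisy during motion).
    accel_angle = math.atan2(accel_x, accel_z)
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle


# Stationary sensor tilted by 0.2 rad, with a biased gyroscope (0.01 rad/s).
angle = 0.0
for _ in range(2000):
    angle = complementary_filter(
        angle, gyro_rate=0.01,
        accel_x=math.sin(0.2), accel_z=math.cos(0.2), dt=0.01,
    )
```

The gyroscope term follows fast movements, while the small accelerometer weight continually pulls the estimate back towards the gravity reference, which bounds the drift that pure integration would accumulate.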

3) MAGNETOMETER
A magnetometer is another sensor, used to calculate the heading angle by sensing the Earth's magnetic field. Magnetometers are combined with other technologies for pose estimation [62]. However, a magnetometer may not be so useful for indoor positioning because metallic objects within the environment can affect the measurements [55]. Other methods that can be used for indoor localization include infrared, Wi-Fi, Ultra-Wideband (UWB), Bluetooth, Wireless Local Area Network (WLAN), fingerprinting, etc. [63]-[66]. However, these methods have their inadequacies; it is therefore necessary to combine two or more schemes to attain accurate results.
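As a minimal sketch of how a heading angle can be derived from a magnetometer, the function below computes the direction of the horizontal field component in the sensor frame. It assumes the sensor is held level and is free of local disturbances; the function name `field_angle_degrees` is hypothetical, and converting this angle into a compass heading depends on the axis convention of the particular sensor and on the local magnetic declination, neither of which is modelled here.

```python
import math


def field_angle_degrees(mag_x, mag_y):
    """Angle of the horizontal magnetic field vector in the sensor frame.

    Measured from the sensor x-axis towards the y-axis, in [0, 360).
    Assumes a level sensor; tilt compensation is not included.
    """
    return math.degrees(math.atan2(mag_y, mag_x)) % 360.0


# Field aligned with the sensor y-axis.
angle = field_angle_degrees(0.0, 25.0)
```

Nearby metallic objects distort `mag_x` and `mag_y` directly, which is why the resulting angle becomes unreliable indoors.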

B. MONOCULAR VISION POSITIONING SYSTEM
Monocular vision positioning uses a single camera to estimate the pose of a mobile device or static object. Another type of vision positioning system is binocular vision. Binocular stereo vision uses two cameras to estimate the location of a mobile robot. Although it offers better accuracy, it is more expensive and more complex to compute [67]. Monocular vision, on the other hand, is simple to set up and low in cost. The information that the camera captures from the environment can be in the form of an image or a video. This information is then processed to estimate the position and orientation of the robot at each instant. This involves a spatial relationship between the captured 2D image and the 3D points in the scene. According to Navab [68], the use of markers in augmented reality (AR) is very efficient in the environment. It increases robustness and reduces computational requirements. However, there are cases where markers placed in the area need re-calibration from time to time. The use of scene features for tracking in place of markers is therefore reasonable, especially when certain parts of the workplace do not change over time. Placing fiducial markers [47] is one way to help a robot navigate through its environment. In new environments, markers often need to be determined by the robot itself, using sensor data collected by the IMU, sonar, laser and camera. The markers' locations are known but the robot's position is unknown, and this is a challenge for tracking a mobile robot. From the sensor readings, the robot must be able to infer its most likely position in the environment. Monocular vision (one camera) provides a good solution in terms of scalability and accuracy. It is low in cost because only one camera is required, and it demands less computation than stereo vision with its high complexity. With the aid of other sensors such as an ultrasonic sensor or a barometric altimeter, monocular vision can also provide the scale and depth information of the image frames.
The pose of the mobile robot with respect to the camera is calculated based on the pinhole camera model. The monocular vision positioning system [69] can be used to estimate the 3D camera pose from the 2D image plane [70]. The relationship between a point in the world frame and its projection in the image plane can be expressed as:
$$\lambda p = M P \qquad (1)$$
where $\lambda$ is a scale factor, $p = [u, v, 1]^T$ and $P = [X_w, Y_w, Z_w, 1]^T$ are the homogeneous coordinates of the image point and the world point, and $M$ is a $3 \times 4$ projection matrix. Equation (1) can further be expressed as:
$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{bmatrix} \begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix} \qquad (2)$$
The projection matrix depends on both the intrinsic and the extrinsic camera parameters. The intrinsic parameters comprise five parameters: the focal length $f$ (expressed in pixels along each of the two image axes), the principal point $(u_0, v_0)$ and the skew coefficient between the x and y axes, which is often zero.
The extrinsic parameters $R$ and $T$ define the position of the camera center and the camera's heading in world coordinates. Camera calibration is the process of obtaining the intrinsic and extrinsic parameters. The projection of a world point in the image is therefore expressed as:
$$\lambda p = K \, [R \mid T] \, P \qquad (3)$$
where $K$ is the $3 \times 3$ matrix of intrinsic parameters, $T$ is the position of the origin of the world coordinate system, and $R$ is the rotation matrix.
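The pinhole projection described above can be sketched in a few lines. The function name `project_point` and the parameter names `fx`, `fy` (focal length in pixels along each image axis) and `skew` are assumptions made for this example; `R`, `T`, `u0` and `v0` follow the notation of the text, with `T` taken as the world origin expressed in the camera frame.

```python
def project_point(point_world, R, T, fx, fy, u0, v0, skew=0.0):
    """Project a 3D world point to pixel coordinates with a pinhole model.

    point_world : (Xw, Yw, Zw)
    R           : 3x3 rotation matrix as nested lists (world to camera)
    T           : translation (world origin expressed in the camera frame)
    fx, fy      : focal length in pixels along the image x and y axes
    u0, v0      : principal point in pixels
    """
    # Extrinsic step: express the point in the camera frame, Pc = R * Pw + T.
    xc, yc, zc = (
        sum(R[i][j] * point_world[j] for j in range(3)) + T[i]
        for i in range(3)
    )
    if zc <= 0:
        raise ValueError("point is behind the camera")
    # Intrinsic step: the depth zc plays the role of the scale factor lambda.
    u = (fx * xc + skew * yc) / zc + u0
    v = (fy * yc) / zc + v0
    return u, v


# Camera aligned with the world frame, point 4 m in front of it.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
pixel = project_point((1.0, 2.0, 4.0), identity, (0.0, 0.0, 0.0),
                      fx=800.0, fy=800.0, u0=320.0, v0=240.0)
```

Pose estimation solves the inverse problem: given several such pixel observations of points with known world coordinates, it recovers `R` and `T`.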

C. LANDMARKS
A landmark is feature information recognised through the perception of the robot's sensors. For an autonomous robot, identifying landmarks quickly and accurately plays an important role in localization and navigation. Research areas for landmark-based robot navigation systems include landmark selection, landmark design, landmark detection, landmark navigation, environmental characterization and path planning. Generally, landmarks are classified into two types: markerless (also known as natural landmarks) and marker-based (also known as artificial landmarks) [71].
Artificial Landmark: Artificial landmarks are specially designed objects or markers placed in an environment that can be detected by laser, infrared, sonar and vision sensors. The uniqueness of the marker is important: with features designed for quick recognition and high reliability, these landmarks can be identified accurately under various visual conditions [71], [72]. Localization based on artificial landmarks is used more widely than other methods because artificial landmarks are easy to detect and allow high speed and precision. An artificial landmark can be any object, static or mobile, of any size, shape, feature or color, as long as it is placed in the environment for the purpose of robot localization. The author in [73] uses a sticker and an LED array as artificial landmarks. These markers are easier to detect and describe because the details of the objects used are known in advance. These methods are used because of their simplicity and easy setup. However, they cannot be adopted in an extensive environment where a large number of markers would have to be deployed.
Natural Landmark: Natural landmarks are objects or features that are part of the environment and have a function other than robot navigation. Examples of natural landmarks are corridors, edges, doors, walls, ceiling lights and lines. The choice of features is vital because it determines the complexity of feature description, detection and matching [55]. Although natural landmarks have little influence on the environment, they are rarely used in practical applications because of their low stability and poor adaptability. Visual features are divided into three categories: point features, line features and block features. Among the three, point features are the easiest to extract, are relatively stable and contain abundant information [74]. Several works have used natural landmarks to extract features that aid robot localization, using Scale-Invariant Feature Transform (SIFT) features [75] and Speeded-Up Robust Features (SURF) [76], [77]. Figure 4 shows an example of natural landmarks extracted using the SURF algorithm.

IV. OBJECT RECOGNITION AND FEATURE MATCHING
This section presents methods for object recognition and feature matching. Object recognition under uncontrolled, real-world conditions is of vital importance in robotics. It is an essential attribute for building object-based representations of the environment and for the manipulation of objects. Different scale-invariant detectors and descriptors are currently being adopted because of their robustness to affine transformations when detecting, recognizing and classifying objects. Some of these methods are Oriented FAST and Rotated BRIEF (ORB), Binary Robust Invariant Scalable Keypoints (BRISK), Difference of Gaussians (DoG), FERNS [78], SIFT [13] and SURF [76]. More details of these methods can be found in [79]. Object detection and recognition can be done using computer vision, whereby an object is detected in an image or video sequence. The recognised object is used as a reference to determine the pose of a mobile device. Object detection can be categorised into three approaches: appearance based, color based and feature based. All these methods have their advantages and limitations [80].
In appearance-based methods, objects are recognised based on changes in color, size and shape. The techniques used include edge matching, divide-and-conquer search, greyscale matching and gradient matching. Color-based techniques use the Red, Green and Blue (RGB) features to represent and match images; they provide cogent information for object recognition. Feature-based techniques find the interest points of an object in one image and match them to find the object in another image of a similar scene. The features extracted are surfaces, patches, corners and linear edges. The methods used to extract features are interpretation trees, hypothesize and test, pose consistency, geometric hashing, SIFT and SURF.
Finding correspondences is a difficult image processing problem in which two tasks have to be solved [81]. The first task consists of detecting the points of interest, or features, in the image. Features are distinct elements in the images, such as corners, blobs and edges. The most widely used detection algorithm is the Harris corner detector [82], which is based on the eigenvalues of the second moment matrix. Other detectors are correlation based, such as the Kanade-Lucas-Tomasi tracker [83] and the Laplace detector [84]. The second task is feature matching, for which the two most popular methods for computing the geometric transformation are the Hough transform and the Random Sample Consensus (RANSAC) algorithm [76], [79], [85]. They can estimate parameters with a high degree of accuracy even when a substantial number of outliers are present in the data set.

A. SPEEDED-UP ROBUST FEATURES (SURF)
SURF was first introduced by Bay et al. [76]. SURF outperforms the previously proposed SIFT scheme with respect to repeatability (the reliability of a detector in finding the same physical interest points under different viewing conditions), distinctiveness and robustness, yet can be computed much faster. This is achieved by relying on integral images for image convolutions; by building on the strengths of the leading existing detectors and descriptors (specifically, a Hessian matrix-based measure for the detector and a distribution-based descriptor); and by simplifying these methods to the essential. This leads to a combination of novel detection, description and matching steps. SURF is used to detect key points and to generate their descriptors, which are then used to find corresponding features in the image. SURF detects interest points (such as blobs) using the Hessian matrix because of its high accuracy (see equations 5 and 6). Its feature vector is based on the Haar wavelet responses around the features of interest [80]. SURF is scale- and rotation-invariant, which means that SURF can find key points even when the size and rotation of an image vary.
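Given a point $\mathbf{x} = (x, y)$ in an image $I$, the Hessian matrix $H(\mathbf{x}, \sigma)$ in $\mathbf{x}$ at scale $\sigma$ is defined as

$$H(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma) \\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}$$

where $L_{xx}(\mathbf{x}, \sigma)$ is the convolution of the Gaussian second-order derivative $\frac{\partial^2}{\partial x^2} g(\sigma)$ with the image $I$ at point $\mathbf{x}$, and similarly for $L_{xy}(\mathbf{x}, \sigma)$ and $L_{yy}(\mathbf{x}, \sigma)$.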
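To illustrate why the determinant of the Hessian responds to blob-like structures, the sketch below evaluates it on a small synthetic image. It is a deliberate simplification and not the SURF detector itself: the second derivatives are plain central finite differences rather than Gaussian derivatives approximated by box filters on an integral image, and only one scale is considered. The function names `hessian_determinant` and `strongest_blob` and the synthetic image are assumptions made for this example.

```python
import math

def hessian_determinant(img, x, y):
    """Determinant of the Hessian at pixel (x, y) using central finite differences."""
    l_xx = img[y][x + 1] - 2.0 * img[y][x] + img[y][x - 1]
    l_yy = img[y + 1][x] - 2.0 * img[y][x] + img[y - 1][x]
    l_xy = (img[y + 1][x + 1] - img[y + 1][x - 1]
            - img[y - 1][x + 1] + img[y - 1][x - 1]) / 4.0
    return l_xx * l_yy - l_xy ** 2

def strongest_blob(img):
    """Return (x, y) of the interior pixel with the largest Hessian determinant."""
    h, w = len(img), len(img[0])
    candidates = [(hessian_determinant(img, x, y), x, y)
                  for y in range(1, h - 1) for x in range(1, w - 1)]
    _, bx, by = max(candidates)
    return bx, by

# Synthetic 11x11 image containing one Gaussian blob centred at (x=6, y=4).
size, cx, cy, s = 11, 6, 4, 2.0
image = [[math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * s * s))
          for x in range(size)] for y in range(size)]
print(strongest_blob(image))  # (6, 4)
```

The response peaks at the blob centre because both second derivatives are strongly negative there while the mixed derivative vanishes, so their product dominates the determinant.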

B. RANDOM SAMPLE CONSENSUS (RANSAC)
RANSAC is a robust estimator used for feature matching, and it works well with SURF when matching detected objects in images. It was first published by Fischler and Bolles [85] in 1981 and is widely used in computer vision. It addresses correspondence problems such as estimation of the fundamental matrix relating a pair of cameras, homography estimation, motion estimation and image registration [86]-[91]. It is an iterative method for estimating the parameters of a mathematical model from a set of observed data that contains outliers. The standard RANSAC algorithm is presented as follows. Assume a set of correspondences $(x_i, {}^{w}X_i)$ between 2D image points and 3D scene points, some of which are wrong. RANSAC starts from the smallest possible set of correspondences and proceeds iteratively to enlarge this set with consistent data:
- Draw a minimal number of randomly selected correspondences $S_k$ (random sample).
- Compute the pose from this minimal set of point correspondences using, for example, POSIT or DLT.
- Determine the number $C_k$ of points from the whole set of correspondences that are consistent with the estimated parameters within a predefined tolerance. If $C_k > C^*$, retain the randomly selected set $S_k$ as the best one: $S^* = S_k$ and $C^* = C_k$.
- Repeat the first three steps.
The correspondences that belong to the consensus obtained from $S^*$ are the inliers, and the rest are the outliers. The number of iterations that ensures, with probability $p$, that at least one sample containing only inliers is drawn can be calculated. Let $p$ be the probability that the RANSAC algorithm selects only inliers from the input data set in at least one iteration. The number of iterations $N$ is given by [92]-[94]:

$$N = \frac{\log(1 - p)}{\log(1 - w^{n})}$$

where $w$ is the proportion of inliers and $n$ is the size of the minimal subset from which the model parameters are estimated.
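The loop described above can be sketched in a few lines of Python. To keep the example self-contained, the pose model (POSIT or DLT) is swapped for a much simpler model, a 2D line fitted from a minimal sample of two points; the sample, score and retain structure is the same. The iteration count uses the standard relation N = log(1 - p) / log(1 - w^n). All function names, the tolerance and the test data are assumptions made for illustration.

```python
import math
import random

def ransac_iterations(p, w, n):
    """Samples needed so that, with probability p, at least one sample of
    size n contains only inliers (w is the assumed inlier proportion)."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** n))

def fit_line(p1, p2):
    """Line through two points as (a, b, c) with a*x + b*y + c = 0 and unit normal."""
    (x1, y1), (x2, y2) = p1, p2
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None  # degenerate sample: both points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def ransac_line(points, tol=0.1, p=0.99, w=0.5, seed=0):
    """Return the line with the largest consensus set and that set."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(ransac_iterations(p, w, n=2)):
        model = fit_line(*rng.sample(points, 2))      # minimal random sample S_k
        if model is None:
            continue
        a, b, c = model
        inliers = [pt for pt in points
                   if abs(a * pt[0] + b * pt[1] + c) <= tol]
        if len(inliers) > len(best_inliers):          # C_k > C*: keep the best set
            best_model, best_inliers = model, inliers
    return best_model, best_inliers

print(ransac_iterations(0.99, 0.5, 2))  # 17

# Made-up data: 20 points on the line y = x plus 5 gross outliers.
data = [(float(i), float(i)) for i in range(20)]
data += [(0.0, 5.0), (2.0, 9.0), (7.0, 1.0), (12.0, 3.0), (15.0, 30.0)]
line, consensus = ransac_line(data)
print(len(consensus), "inliers out of", len(data))
```

In a pose estimation setting, `fit_line` would be replaced by a minimal pose solver and the point-to-line distance by the reprojection error.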
Steps to detect and identify object in a scene: -

V. SENSOR FUSION TECHNIQUES
Several definitions of sensor fusion are given in the literature. Sensor fusion, or data fusion, as defined by the Joint Directors of Laboratories (JDL) workshop [95], is a multi-level procedure dealing with the association, correlation and integration of data and information from single and multiple sources to attain refined position and identity estimates and complete, timely assessments of situations, threats and their significance. Hall and Llinas [96] also presented the following well-known definition of data fusion: "data fusion techniques combine data from multiple sensors and related information from associated databases to achieve improved accuracy and more specific inferences than could be achieved by the use of a single sensor alone". The authors in [97] and [98] defined sensor fusion as the cooperative use of information provided by multiple sensors to aid in performing a function, while several other authors [99]-[101] defined data fusion algorithms as the combination of data from multiple sources in order to enhance the performance of a mobile robot. Regardless of the definition given, sensor fusion is the integration of information from multiple sources to improve accuracy and quality of content, also with the aim of reducing cost. The technique finds wide application in many areas of robotics, such as object recognition, environment mapping and localization. Fusion techniques are therefore regarded as the most appropriate methods for tracking objects and determining their locations. The advantages of sensor fusion are reduction in uncertainty, increase in accuracy and reduction of cost. Various researchers therefore suggest that, to attain a required level of accuracy, integrating more than one sensor is most suitable because the inadequacy of one sensor can be complemented by another. For example, images captured by a camera have been used to correct the errors of inertial sensors [102], [103]. The data fusion technique deployed is influenced by the objective of the application, in which it aids in building a more accurate world model so that the robot can navigate and behave more successfully. The three fundamental ways of combining sensor data are competitive, complementary and cooperative [99], [104].

A. COMPETITIVE
The sensors are configured competitively to produce independent measurements of the same property, i.e., diverse kinds of sensors are used to measure the same environmental characteristic. This means that data from different sensors can be fused, or measurements from a single sensor taken at different times can be fused. A special case of competitive sensor fusion is fault tolerance. Fault tolerance requires an exact specification of the service and the failure modes of the system. This configuration reduces the risk of an incorrect indication caused by one of the sensors. Most importantly, it can increase the reliability, accuracy or confidence of the data measured by the sensors. This technique can also provide robustness to a system by combining redundant information [105], [106]. However, a robust system provides a degraded level of service in the presence of faults, and this graceful degradation is weaker than the achievement of fault tolerance. The method performs better in terms of resource needs and works well with heterogeneous data sources. A competitive sensor configuration is also called a redundant configuration. An example of competitive fusion is the reduction of noise by combining two overlaying camera images.
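One common way to combine redundant measurements of the same quantity is inverse-variance weighting, sketched below. This particular rule is chosen for illustration and is not attributed to the cited works; it assumes independent, unbiased sensors with known noise variances. The function name `fuse_redundant` and the numbers are hypothetical.

```python
def fuse_redundant(measurements):
    """Fuse (value, variance) pairs that measure the same quantity.

    Each measurement is weighted by the inverse of its variance, so the
    more certain sensor contributes more to the fused value.
    """
    weights = [1.0 / var for _, var in measurements]
    total = sum(weights)
    value = sum(w * z for w, (z, _) in zip(weights, measurements)) / total
    return value, 1.0 / total  # fused value and its (smaller) variance

# Two hypothetical range sensors observing the same distance.
print(fuse_redundant([(10.0, 4.0), (12.0, 4.0)]))  # (11.0, 2.0)
```

The fused variance is lower than that of either sensor, which is the reduction in uncertainty that a competitive configuration aims for.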

B. COMPLEMENTARY
In this type of sensor configuration, the sensors do not depend on each other directly but complement one another with different measurements. This resolves the incompleteness of sensor data. This type is the most common for localization. An example is when vision compensates for the accumulated errors of an IMU. Another example of a complementary configuration is the use of several cameras, each observing a different area of the mobile robot's surroundings, to build up a picture of the environment. Generally, fusing complementary data is simple, since the data from independent sensors can be appended to each other; the disadvantage is that under certain conditions the sensors may be ineffective, such as when a camera is used in poor visibility [107].

C. COOPERATIVE
This method uses the information made available by two independent sensors to derive data that would not be obtainable from either sensor alone. An example of a cooperative sensor configuration is stereoscopic vision: by combining two-dimensional images from two cameras at slightly different viewpoints, a three-dimensional image of the observed scene is derived. According to [107], cooperative sensor configurations are the most difficult to design because the resulting data are sensitive to imprecision in every participating sensor. Thus, in contrast to competitive fusion, cooperative sensor fusion generally decreases accuracy and reliability.
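The stereoscopic example can be made concrete with the standard depth-from-disparity relation Z = f B / d for a rectified camera pair. The sketch below is an illustration under that assumption; the function name `stereo_depth`, the focal length, the baseline and the disparities are made-up values.

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Neither camera alone yields depth; the pair together does.
print(stereo_depth(700.0, 0.125, 35.0))  # 2.5

# A one-pixel matching error in either image changes the derived depth,
# which illustrates the sensitivity of cooperative fusion to each sensor.
print(stereo_depth(700.0, 0.125, 34.0) - stereo_depth(700.0, 0.125, 35.0))
```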
In conclusion, competitive fusion increases the robustness of perception, while cooperative and complementary fusion provide extended and more complete views. The method used at the fusion level depends on the availability of components. Furthermore, these three types of sensor fusion are not mutually exclusive, and many applications implement aspects of more than one of them.

VI. CLASSIFICATION OF SENSOR FUSION ALGORITHMS
Sensor fusion algorithms are required to translate diverse sensory inputs into reliable estimates and environment models that can be used by other navigation subsystems. The methods usually implement iterative algorithms to deal with linear and non-linear models. To localize a robot, many sensors have been adopted and many fusion methods developed. These algorithms are sets of mathematical equations that provide efficient computational means to estimate the state of a process. Table 2 shows work based on the classification of sensor fusion methods. The sensor fusion algorithms are categorised as follows [108]:

A. STATE ESTIMATION METHOD
State estimation methods are used to determine the state of a system that is continuously changing, given some observations or measurements. State estimation is a common step in data fusion algorithms, since the observations of the target may come from different sensors or sources and the final goal is to obtain a global target state from the observations. Table 3 shows related studies based primarily on state estimation methods. The two major methods discussed are the Kalman filter and the particle filter.

1) KALMAN FILTER
The Kalman filter (KF) is an efficient estimator used in various fields to estimate the unknown state of a system. Several applications have been developed using the Kalman filter, including navigation, localization and object tracking. One example uses a vision camera to perform real-time image processing for robot tracking. The Kalman filter is used to estimate the positions and velocities of vehicles or other moving objects and to track such objects under visible conditions.
The Kalman filter is an algorithm that estimates the state of a discrete-time controlled process described by a linear stochastic equation. It combines the state from the previous time step with the current measurement to calculate the estimate of the current state. Kalman filters are well-known techniques in the theory of stochastic dynamic systems, which can be used to improve the estimates of unknown quantities [109]. The Kalman filter is one of the most useful and common estimation techniques and is easy to implement on linear systems. The prediction equations of the Kalman filter are given as follows [110]:

$$\hat{x}_k^- = F \hat{x}_{k-1} + B u_k$$

$$P_k^- = F P_{k-1} F^T + Q$$

The vector $\hat{x}_k^-$ is the predicted estimate of the system state $x_k$, and $P_k^-$ is the predicted covariance matrix. $F$ is the matrix that describes the dynamics of the system, $B$ is the control matrix applied to the control input $u_k$, and $Q$ is the process noise covariance.
The Kalman filter then involves a second stage that corrects the predicted estimate using an external measurement. This update is given by the equations below:

$$\hat{x}_k = \hat{x}_k^- + K_k (z_k - H \hat{x}_k^-)$$

$$P_k = (I - K_k H) P_k^-$$

where

$$K_k = P_k^- H^T (H P_k^- H^T + R)^{-1}$$

In the above equations, $z_k$ is the measurement vector, which is a reading from the sensors, $H$ is the transformation matrix, $R$ is the covariance matrix of the measurement noise and $k$ is the time step. The Kalman gain $K_k$ describes the amount of update needed at each recursive estimation and can be seen as a weighting factor that considers the relationship between the accuracy of the predicted estimate and the measurement noise. For linear systems with Gaussian noise, the KF is an optimal estimator that can be used to analyze the statistical behavior of the measured values. In most real-time problems the system is not linear, so the extended Kalman filter, which linearizes the system, is used. The main benefit of the Kalman filter is its computational efficiency, but it can represent only unimodal distributions, so Kalman filters work best when the uncertainty is not too high. The Extended Kalman Filter (EKF) is another sensor fusion method based on the Kalman filter; it is one of the most effective probabilistic solutions for estimating the robot pose from sensor information.
Comparing the Kalman filter with the EKF, the author of [111] shows that the EKF algorithm is among the best methods, giving better performance and results in robot localization. Another derivative of the KF besides the EKF is the Unscented Kalman Filter (UKF). According to the literature, the UKF delivers better data fusion results than the Kalman filter or the EKF [112].
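The prediction and update stages of the linear Kalman filter can be sketched for the scalar case, where every matrix reduces to a number. The code follows the standard textbook form of the filter, with F, B, H, Q and R playing the roles described in this section; the function names, default values and measurements are assumptions made for illustration and do not reproduce any specific cited implementation.

```python
def kf_predict(x, P, F=1.0, B=0.0, u=0.0, Q=0.0):
    """Prediction stage: propagate the state estimate and its variance."""
    x_pred = F * x + B * u
    P_pred = F * P * F + Q
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H=1.0, R=1.0):
    """Update stage: correct the prediction with the measurement z."""
    K = P_pred * H / (H * P_pred * H + R)   # Kalman gain
    x = x_pred + K * (z - H * x_pred)       # weight the innovation by K
    P = (1.0 - K * H) * P_pred
    return x, P, K

# Hypothetical example: a stationary robot whose position is measured
# by a noisy range sensor (made-up readings scattered around 2.0).
x, P = 0.0, 1.0
for z in [2.1, 1.9, 2.05, 1.95]:
    x, P = kf_predict(x, P, Q=0.001)
    x, P, K = kf_update(x, P, z, R=0.25)
    print(f"z={z:.2f}  x={x:.3f}  P={P:.4f}  K={K:.3f}")
```

As the variance P shrinks, the gain K also shrinks, so later measurements move the estimate less than early ones.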

2) PARTICLE FILTER
The Particle Filter (PF), with its ability to approximate Probability Density Functions (PDFs) of any form, has received substantial attention among researchers. The PF is a Sequential Monte Carlo (SMC) technique for solving the state estimation problem, using the so-called Sequential Importance Sampling (SIS) algorithm and including a resampling step at each time instant. The method builds the posterior density function using several random samples called particles. Particles are propagated over time through a combination of sampling and resampling steps. At each iteration, the resampling step is employed to reject some particles, increasing the significance of regions with higher posterior probability. The particle filter algorithm comprises the following steps [97], [113]-[117]:
- Particle generation: Generate $N$ initial particles $\{x_1(0), x_2(0), x_3(0), \ldots, x_N(0)\}$ according to the initial PDF $p(x(0))$.
- Prediction: For each particle $x_i(k)$, propagate the particle to $x_i(k + 1)$ according to the transition PDF $p(x(k + 1) \mid x(k))$. Here, random noise is added to each particle to simulate the effect of noise.
- Sampling: For each particle $x_i(k + 1)$, generate a weight $w_i(k + 1)$ from the likelihood of the current measurement given that particle.
- Normalization and rejection sampling: The weights of the particles are normalized. Particles with low weight are removed and particles with high weight are replicated, such that each particle has the same weight.
The PF is considered an alternative, for real-time applications, to the traditional model-based Kalman filter implementations. With its advantages of accuracy and stability, the PF is currently being considered in the fields of traffic control (car or people video monitoring), the military (radar tracking, air-to-ground passive tracking), and mobile robot positioning and self-localization.
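The particle filter steps (generation, prediction, weighting, normalization and resampling) can be sketched for a one-dimensional robot position. This is a minimal illustration and not a cited implementation: the motion model (a known step plus Gaussian noise), the Gaussian measurement likelihood, the multinomial resampling and all parameter values are assumptions made for the example.

```python
import math
import random

def gaussian_likelihood(z, x, std):
    """Unnormalised Gaussian likelihood of measurement z given state x."""
    return math.exp(-0.5 * ((z - x) / std) ** 2)

def init_particles(n, low, high, rng):
    """Particle generation: draw n particles from a uniform initial PDF."""
    return [rng.uniform(low, high) for _ in range(n)]

def particle_filter_step(particles, z, rng, motion=1.0, motion_std=0.2, meas_std=0.5):
    # Prediction: propagate each particle and add random noise.
    predicted = [x + motion + rng.gauss(0.0, motion_std) for x in particles]
    # Weighting: likelihood of the measurement for each particle.
    weights = [gaussian_likelihood(z, x, meas_std) for x in predicted]
    total = sum(weights)
    if total == 0.0:
        return predicted  # no particle explains z; keep the prediction
    # Normalization, then resampling in proportion to the weights, so that
    # all surviving particles carry equal weight afterwards.
    weights = [w / total for w in weights]
    return rng.choices(predicted, weights=weights, k=len(predicted))

def estimate(particles):
    """Point estimate: mean of the equally weighted particles."""
    return sum(particles) / len(particles)

rng = random.Random(42)
particles = init_particles(500, 0.0, 10.0, rng)
true_x = 2.0
for _ in range(10):
    true_x += 1.0                          # the robot moves one unit per step
    z = true_x + rng.gauss(0.0, 0.5)       # simulated noisy measurement
    particles = particle_filter_step(particles, z, rng)
print(round(estimate(particles), 2), "vs true position", true_x)
```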

B. DECISION FUSION METHOD
Decision fusion is a form of data fusion that combines the decisions of many classifiers into a common decision about the activity that occurred. The fusion method reduces the level of uncertainty by maximizing a measure of evidence [118]. These techniques frequently use symbolic information, and the fusion process requires reasoning that accounts for uncertainties and constraints. The two decision fusion methods discussed here are the Bayesian approach and the Dempster-Shafer approach.

1) BAYESIAN APPROACH
The Bayesian approach is a basic method for dealing with conditional probability; more precisely, it relates the conditional probabilities of two or more events. It is used in practice to describe more complex relationships [119]. The method provides a theoretical framework for dealing with uncertainty using an underlying graphical structure. It is ideal for taking an event that has happened and predicting the likelihood that any one of several possible known causes was the contributing factor. The Bayesian method can be presented mathematically as [113]:

$$P(C \mid D) = \frac{P(D \mid C)\, P(C)}{P(D)} \qquad (12)$$

where $P(C)$ is the probability of event C without any effect of any other event, $P(D)$ is the probability of event D without any effect of any other event, and $P(D \mid C)$ is the probability of event D given that event C is true. The resulting conditional probability $P(C \mid D)$ lies in the range [0, 1] and indicates how likely event C is given that D has occurred. The Bayesian method is computationally simple, has higher probabilities of a correct decision, and provides point estimates and the posterior PDF [120]. However, it has the following drawbacks: difficulty in describing the uncertainty of decisions; complexity when there are multiple potential hypotheses and a substantial number of events that depend on conditions; and difficulty in establishing the values of the prior probabilities. The Bayesian method is applicable to image fusion where no prior knowledge is available. It is also applied in robot learning by imitation: the approach enables robots to learn internal models of their environment through self-experience and to employ the models for human intent recognition and for skill acquisition from human observation.
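As a small worked example of Bayes' rule, the sketch below computes the posterior probability of each possible cause C of an observation D, obtaining P(D) by summing P(D | C) P(C) over all causes. The function name `posterior_over_causes`, the landmark classes and the probabilities are hypothetical values chosen for illustration.

```python
def posterior_over_causes(priors, likelihoods):
    """Return P(C | D) for every cause C.

    priors[c]      is P(C = c)
    likelihoods[c] is P(D | C = c) for the observed evidence D
    """
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    p_d = sum(joint.values())  # P(D), by the law of total probability
    if p_d == 0.0:
        raise ValueError("evidence has zero probability under every cause")
    return {c: j / p_d for c, j in joint.items()}

# Hypothetical case: a detector fires (D); was the landmark a door or a wall?
priors = {"door": 0.3, "wall": 0.7}
likelihoods = {"door": 0.8, "wall": 0.1}
post = posterior_over_causes(priors, likelihoods)
print({c: round(v, 3) for c, v in post.items()})
```

Although a wall is the more likely landmark a priori, the evidence shifts most of the probability to the door because the detector rarely fires on walls.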

2) DEMPSTER-SHAFER
Dempster-Shafer (DS) theory has become well known, and its application extends to pattern recognition methods that are widely used in signal processing and recognition. Regarded as an uncertainty method, it adapts well to unknown and uncertain problems. It also provides a formula for fusing evidence from different sources. Dempster-Shafer theory has been considered for a variety of perceptual activities including sensor fusion, scene interpretation, object target recognition and object verification. In [109], DS theory was successfully used in building an occupancy map to improve reliability. The DS approach is more robust to perturbations such as noise and imprecise prior information [120]. The method is based on the concept of combining information from different sources such as sensors. It uses belief and plausibility values to represent the evidence and the corresponding uncertainty [121], [122]. The method uses 'belief' rather than probability: a belief function is used to represent the uncertainty of a hypothesis [123]. The hypothesis is represented by a probability mass function $m$. The amount of belief in a hypothesis $A$ is denoted by a belief function [124]:

$$Bel(A) = \sum_{B \subseteq A} m(B) \qquad (13)$$

Equation (13) is the sum of the mass probabilities assigned to all subsets of $A$ by $m$. When two or more pieces of evidence are available, they are integrated using the combination rule below:

$$m(A) = (m_1 \oplus m_2)(A) = \frac{\sum_{B \cap C = A} m_1(B)\, m_2(C)}{1 - k}, \qquad k = \sum_{B \cap C = \emptyset} m_1(B)\, m_2(C)$$

where $1 - k$ is a normalization factor in which $k$ is the total of all non-zero values assigned to the null set hypothesis $\emptyset$. The class of a feature can be decided using a maximum belief decision rule, which assigns a feature to a class $A$ if the total amount of belief supporting $A$ is more than that supporting its negation:

$$Bel(A) > Bel(\bar{A})$$
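The belief function and Dempster's combination rule can be sketched with mass functions stored as dictionaries keyed by `frozenset` hypotheses. The occupancy-grid style frame {occupied, free} and the mass values below are made up for illustration, and the function names are assumptions rather than an API from the cited works.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule: fuse two mass functions defined on the same frame."""
    combined, conflict = {}, 0.0
    for (b, mb), (c, mc) in product(m1.items(), m2.items()):
        inter = b & c
        if inter:
            combined[inter] = combined.get(inter, 0.0) + mb * mc
        else:
            conflict += mb * mc            # mass falling on the empty set (k)
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    return {a: v / (1.0 - conflict) for a, v in combined.items()}

def belief(m, a):
    """Bel(A): sum of the masses of all subsets of A."""
    return sum(v for b, v in m.items() if b <= a)

def plausibility(m, a):
    """Pl(A): sum of the masses of all sets that intersect A."""
    return sum(v for b, v in m.items() if b & a)

OCC, FREE = frozenset({"occupied"}), frozenset({"free"})
EITHER = OCC | FREE                        # ignorance: mass on the whole frame

# Hypothetical evidence about one grid cell from two sensors.
sonar = {OCC: 0.6, FREE: 0.1, EITHER: 0.3}
laser = {OCC: 0.5, FREE: 0.2, EITHER: 0.3}
fused = combine(sonar, laser)

print(round(belief(fused, OCC), 3), round(plausibility(fused, OCC), 3))
# Maximum belief decision rule: choose "occupied" if Bel(occupied) > Bel(free).
print("occupied" if belief(fused, OCC) > belief(fused, FREE) else "free")
```

The gap between belief and plausibility for a hypothesis is the mass that remains uncommitted, which is how the approach represents uncertainty separately from evidence against the hypothesis.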

VII. IMPORTANCE OF SENSOR FUSION TECHNIQUES
Techniques that employ sensor fusion have several advantages over single-sensor systems. Combined information reduces the set of uncertain interpretations of the measured value. The expected benefits of sensor fusion techniques are as follows [104]:
Reduction in Uncertainty: Data provided by sensors is sometimes subject to some level of uncertainty and discrepancy. Multi-sensor data fusion techniques reduce the uncertainty by combining data from numerous sources [125]. It is therefore imperative to compensate for the shortcomings of one sensor by fusing its data with those of other sensors using data fusion algorithms. The authors in [126] were able to minimize uncertainty in robot localization based on the EKF and PF. The measurement from the kinetic sensor was used to correct the error accumulated by odometry in order to estimate the pose of the mobile robot.
Increase in Accuracy and Reliability: Integrating multiple sensor sources enables the system to provide inherent information even in case of partial failure.
Extended Spatial and Temporal Coverage: An area covered by one sensor may not be covered by another, so the sensors complement each other in coverage and measurement. An example is the combination of an inertial sensor, such as an accelerometer or gyroscope, with vision. The coverage of a camera as a vision sensor cannot be compared with that of an accelerometer, which only takes measurements along the navigation route.
Improved Resolution: The resolution of the value obtained by fusing multiple independent measurements is better than that of a single sensor measurement.
Reduced System Complexity: In a system where sensor data is preprocessed by fusion algorithms, the input to the controlling application can be standardized independently of the sensor types employed. This simplifies application implementation and allows the number and type of sensors to be modified without altering the application software.

VIII. FUTURE RESEARCH AREAS
Navigation and localization of a mobile robot in an arbitrary environment is a challenge because of the complexity and diversity of the environments, methods and sensors involved. It is therefore necessary to continue research on new systems and methods aimed at solving specific sensor fusion problems for robot navigation and localization. Despite the related work in the literature, several directions call for further investigation.
3D Indoor Environmental Modelling: 3D models of indoor environments are significant in many applications, but they usually exist only for newly constructed buildings [127]. For robot navigation, 3D models of the indoor operating environment are required to ensure safe movement. The model is also expected to be used by robots for recognition and localization. In developing a method for 3D modelling, simplicity and accuracy must be considered first. A 3D model can convey more useful information than the 2D maps used in many applications. For example, an indoor environment in which additional features are present still poses unresolved modelling problems. This kind of environment requires more sophisticated models in order to capture the characteristics of the environment. Several methods have been adopted for modelling the environment. The authors of [132] proposed a method of obtaining 3D models using a mobile robot with a laser scanner and a panoramic camera, while Thrun et al. [133] proposed a multi-planar model built from dense range data and image data using an improved Expectation-Maximization (EM) algorithm. Some authors have worked on generating precise 3D models using a sufficient amount of data and elaborate statistical and geometrical estimation techniques. Environment models are required for localization and object recognition/detection. Currently, 3D models are usually obtained by hand-guided scanning, which is a very hard and time-consuming task for the human operator. Therefore, a robotic system that obtains 3D models of the environment is highly beneficial [134].
Landmarks and Feature Extraction: Localization methods using vision are active research areas, especially in studies related to the identification of objects and the position estimation of the recognized objects [135]. Another aspect to look into is the change in appearance of target objects over time; this research area has also gained much attention in the literature but is still limited by the lack of robust detection algorithms.
Distinct Object: To improve localization for a mobile robot in a structured or unstructured environment, it is suggested that distinct or specific objects be detected. Despite the work done, this is still an open problem.
Topological Modelling and Localization: Several traditional localization approaches attempt to determine the position and the direction of the robot geometrically; new approaches should be considered and compared. Recent approaches look for methods to build topological models once features and landmarks are detected, and for topological estimation of the robot's state.
Perception Planning and World Modelling: Motion planning and path planning can also cause uncertainty in a mobile robot. How to handle a situation in which the robot accidentally takes another route and loses its path requires attention. Therefore, new techniques are suggested for determining the motion plan of a mobile robot. A model of the environment should also be built so that the robot can plan its motion and operate safely.

IX. CONCLUSION
Through the deployment of autonomous mobile robots, businesses are increasing flexibility and diversifying applications. The new technologies have improved and eased human life, reducing people's exposure to environmental dangers and hazards to a minimum.
In this paper, we have provided a background and identified the challenges of autonomous mobile robots. Problems such as navigation and localization limit the performance of the robot. Therefore, some techniques for tackling these challenges have been presented in this paper. These techniques use sensors mounted on the mobile robot for effective performance. Using a single sensor to determine the pose of an object may not be reliable and accurate; therefore, the use of multiple sensors is encouraged. The objective is to integrate multiple data sources to produce more consistent, accurate and useful information.
Methods used to extract information from the environment using computer vision were also discussed. These methods are categorized into artificial and natural landmarks. They are used to detect and identify objects and match them with the training image. The strengths and weaknesses of these approaches were also presented. After exploring the concepts and benefits, as well as existing methodologies, sensors were categorized by how they relate to one another, which is called sensor configuration: cooperative, complementary and competitive. The most commonly used sensor configuration for autonomous mobile robots is complementary.
The benefits of using sensor fusion algorithms were also identified in this paper. Finally, the paper highlighted some research areas that can be investigated in further work.
Steps to detect and identify an object in a scene:
- Input training image
- Convert the image to grayscale
- Get rid of lens distortions from images
- Initialise match object
- Detect feature points using SURF
- Check the image pixels
- Extract feature descriptor
- Use RANSAC algorithm to match query image with training image
- If inliers > threshold then:
  - Compute the homography transform box
  - Draw the box on the object and display

TABLE 2. Related works of different sensor fusion algorithms.

TABLE 3. Related works of state estimate sensor fusion algorithms.