Method of robot episode cognition based on hippocampus mechanism

This paper proposes a method that realizes a robot's self-learning of an unknown environment using an episode cognitive model inspired by the hippocampus. The episode cognitive map constructed through self-learning is suitable for robot navigation in an unknown environment, which addresses the problem of perception robustness in complex and dynamic environments. The proposed model, called the hippocampus episode cognitive network (HEC), is based on the physiological functions of the hippocampal CA1 and CA3 areas and the dentate gyrus, combined with an adaptive resonance theory (ART) network. It extracts new events through the incremental generation of cognitive neurons and encodes the events into episodes through spatio-temporal connections. The episode nodes can be connected to generate an episode cognitive map. This method realizes the learning, storage, and updating of information for autonomous mobile robots in unknown environments. On the basis of the episode cognitive map, the path trajectory can be predicted through the playback of episode neurons. Experimental results on a mobile robot show that this method can effectively improve the robot's adaptability for positioning and mapping in a complex and dynamic environment.


I. INTRODUCTION
It is a challenging task for robots to navigate autonomously in a complex and dynamic unknown environment [1] [2]. Robots need the abilities of perceiving the environment, learning from it, self-positioning, path planning, and navigation. A cognitive map [3] [4] is a representation that records the relationships between environmental landmarks. It is the agent's subjective experiential memory of environmental information and the basis for positioning and navigation. Inspired by the navigation capabilities of mammals (such as rats), research on how mammals construct maps, position themselves, and navigate has gained great interest in the field of robotics [5] [6] [7]. Compared with robots, mammals adapt well to complex environments and tasks, acquire knowledge, and retrieve previous experiences to complete new work. This critical behavior is promoted by experience and relies on learning. In the same way, robots need the ability to learn and remember experience [8]. The learned experience can be integrated into an episode cognitive map, allowing the robot to perceive and navigate freely in an environment. Seeking inspiration from biology to achieve bionic environmental adaptability and cognitive abilities is an important direction for research on intelligent navigation robots.
This paper proposes a continuously learning, self-adaptive episode cognitive network model, which is used to realize episode recognition and memory of environmental information by mobile robots. The biology of episode cognition is inspired by the hippocampus of mammals [9] [10]; episodic memory is a collection of experiences that occurred at specific times and places in the past. At present, most realizations of memory take an engineering perspective and learn specific actions, as in [11] [12] [13], and their versatility and adaptability are weak. Jockel et al. [14] proposed an episodic robot memory that uses images, appearance, and behavior to improve action planning based on past experience. It provides a fixed episodic memory for the robot and retains a series of experienced observations, behaviors, and rewards, but it cannot continue learning. Stachowicz and Kruijff [15] used an indexed data structure to store episodic memory to provide knowledge for a robot cognitive model and conducted many long-term simulation experiments. Kelley [16] implemented a memory store that allows robots to construct events based on images, retaining knowledge from previous experience. Park et al. [17] proposed an integrated adaptive neural model of episodic memory and task memory for robots performing service tasks. The above research on bionic cognition is based on the expression of symbolic information, which is inconsistent with the biological neural processing and expression of perceptual information. Therefore, it has many limitations in the fusion of perceptual information and in extensibility.
The Hippocampus Episode Cognition model (HEC) proposed in this paper is inspired by the biological basis of hippocampal neurons. It uses the adaptive resonance theory (ART) network as its basic unit module, combined with our previous research on the spatial cognition model of the hippocampal structure, and extracts and encodes sensing and motion information in the space and time dimensions to form episode cognition. The episode cognitive model based on HEC can effectively organize the robot's experience, including the internal state of the robot and the cognitive memory of the external environment.
Physiological studies have shown that the mammalian hippocampus converges spatial information from the entorhinal cortex and visual information from the visual cortex to form episodic memory in the CA1 area of the hippocampus. In addition, research has found a series of neural cells with spatial cognitive functions in the hippocampus and its accessory structures, such as head-direction cells, grid cells, and place cells. In previous studies, we used a continuous attractor model to construct a hippocampal spatial cell model, which simulates the cognitive process and the neural expression of space. The HEC model studied in this paper combines the neural expression of spatial cognition with visual perception information to form a bionic robot episode cognitive memory.
The hippocampus's cognitive memory of environmental episodes is realized through the learning of neuronal synaptic connections. In the field of neural learning, Hebbian theory describes the principle of synaptic plasticity: the continuous and repeated stimulation of postsynaptic neurons by presynaptic neurons leads to an increase in the efficiency of synaptic transmission. The ART network is a self-learning model of neural synapses based on Hebbian theory. It is an efficient, unsupervised learning algorithm that has been used to explain how place cells learn [18]. ART solves the stability-plasticity problem; it explains how the brain inherits learned knowledge [19] and how it quickly learns to classify and remember information in the real world.
FIGURE 1: Neural connections within the hippocampus
The main contribution of this paper is to propose a set of bionic episode cognitive models and robot episode cognitive methods, which can simulate the biological episodic memory mechanism and construct an episode cognitive map for robot navigation in uncertain environments. According to the episode cognitive map, the robot can predict the optimal behavior strategy by activating the episode neurons, adapting to complex, dynamic, unknown environments and navigation tasks.

II. HEC EPISODE COGNITION MODEL
Current research in brain science and neurobiology has reached a clearer understanding of the internal neural pathways and functional mechanisms of the hippocampus of mammals (such as rats). As shown in Figure 1, perceptual information is projected from the cerebral cortex to the hippocampus through neural pathway 1. The fibers in this pathway form synaptic connections with the apical dendrites of the dentate granule cells and CA3 pyramidal cells. The granule cells in the dentate gyrus project through the mossy fibers (2) to the pyramidal cells in CA3. The pyramidal cells in CA3 project to the pyramidal cells of CA1 through the Schaffer collaterals (3), and the CA1 pyramidal cells are connected to the subiculum through pathway 4. The dentate gyrus performs pattern separation through competitive learning to produce a sparse representation. The dentate granule cells are connected to CA3 through a small number of mossy fibers, which produces a random pattern-separation effect that separates the pattern represented by CA3 firing and distinguishes it from other patterns. CA1 records the information from CA3 and establishes associative learning to the neocortex, whose information can be retrieved during recall. Related research on hippocampal subregional analysis and hippocampal knockout experiments supports this theory [20]. In the field of bionic cognition, computational theories that use the hippocampal structure as the mechanism of an episode cognition system have been described extensively. The CA3 area is a functional area for perceptual fusion and comprehensive episode cognition.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication.
An episode cognition model that mimics this area needs to provide the following basic functions: (1) support for rapid association of any spatial location, perception, and behavior; (2) recall of the whole episodic memory from any of its parts; (3) time correlation in episode cognition to realize temporal-sequence memory. Neuroscience research has shown that stimulation can change the firing pattern of neurons in the CA1 area of the hippocampus, and this change is related to the change of the episode. Statistical methods show that neurons activate in a low-dimensional space to form an obvious clustered coding pattern. Anatomical studies have shown that there are spatial cognition cells such as head-direction cells, grid cells, and place cells in the hippocampus, which can integrate the animal's own movement cues to form spatial cognition. This spatial cognition information is finally projected to the CA3 area of the hippocampus and combined with other perceptual information to form a comprehensive biological cognition of the environment.
Based on the above physiological research foundation, using nerve-cell coding as the information expression method and drawing on our previous research on the cognitive memory model [21], we propose the HEC robot episode cognitive model, which mimics the episode cognitive mechanism of the hippocampal structure. It is composed of four layers: the input layer, event layer, episode layer, and map layer, as shown in Figure 2. The input layer stacks the perceptual information from the environment to form an activation-value vector representing the environment's characteristics. The activation pattern of the input layer is projected to the event layer, where event neurons can be selectively activated through event recognition. This process continuously learns the activation patterns of incoming events by updating the connection weights between the input layer and the event layer. In the environment perception task of mobile robots, events are mainly identified by environmental perceptual features. In our model, the environmental feature information comes from head-direction cells, grid cells, place cells, and visual landmark cells.
The event neurons have a short-term memory function and store the event activation pattern in the event layer itself. The information of the input layer causes the neurons of the event layer to be activated, and the activation value of each event neuron gradually decays with the passage of time, so that an activation sequence with time information is obtained. These activation sequences are then transmitted to the episode layer, and a corresponding episode neuron is activated as the result of episode recognition.
Once an episode is successfully identified, the entire episode can be retrieved, from the event layer down to the input layer. Since the episode cognitive map connects the episode nodes together, the robot can automatically find other nodes according to the connections between them. The algorithmic details of the learning, mapping, and retrieval of the proposed HEC model are explained below.

A. NEURAL CODING OF PERCEPTION INFORMATION
1) Neural code of head-direction cell
When the head of a mammal (such as a rat) faces a specific direction, the corresponding head-direction cell produces its maximum firing; when the head direction deviates from that direction, the firing gradually weakens [22]. We build a ring attractor model as shown in Figure 7(a) and encode a specific head orientation through the population firing pattern. When the rat's head orientation is θ_t and the phase angle of the i-th head-direction cell in the attractor model is θ_i, each head-direction cell can be set to produce a firing rate that peaks at θ_t = θ_i and decays as the heading deviates from θ_i.

2) Neural code of grid cell
In 2005, Hafting et al. [23] changed the size and shape of the test chamber and found grid cells in the rat's hippocampal formation that fire strongly at specific locations in space. When the rat moves in a two-dimensional space, grid cells generate repetitive and regular firing at specific locations; this spatial range is called the grid field of the grid cell. The hexagonal firing fields formed by the multiple activation areas tile the entire environment that the rat passes through. The grid cells can be represented by the neural plate model shown in Figure 7(b): N_g × N_g grid cells are uniformly arranged on the neural plate, and the firing rate of each grid cell changes periodically with spatial position. The firing period is the same across grid cells, but the phase changes uniformly. In this way, when the rat is at a certain spatial position, the neural plate firing shows a grid-like pattern, and when the rat moves, the grid pattern moves in the direction of displacement accordingly.
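As an illustration, the head-direction tuning described above can be sketched in Python. The von Mises bump shape, the cell count, and the parameter values here are our assumptions for the sketch, since the exact firing-rate equation is not reproduced in this text:

```python
import math

def head_direction_rates(theta_t, n_cells=36, kappa=4.0, r_max=1.0):
    """Firing rate of each head-direction cell for heading theta_t (radians).

    Each cell i has a preferred phase theta_i spread uniformly around the
    ring attractor; firing is maximal when the heading matches the preferred
    direction and weakens as the heading deviates (von Mises tuning).
    """
    rates = []
    for i in range(n_cells):
        theta_i = 2.0 * math.pi * i / n_cells      # preferred direction of cell i
        # von Mises bump: equals r_max at theta_t == theta_i
        r = r_max * math.exp(kappa * (math.cos(theta_t - theta_i) - 1.0))
        rates.append(r)
    return rates
```

The population vector returned here is what the model would feed into the head-direction channel of the input layer.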
The periodic firing characteristics of grid cells can be obtained in one-dimensional space with the von Mises function, which is a periodic extension of the Gaussian function:

Ω_j(x) = n_max exp(κ [cos(2π(x − c_j)/λ) − 1])

where λ is the grid spacing, c_j is the phase of each grid cell, κ is the gain coefficient, n_max is the maximum firing rate, and Ω_j(x) is the firing rate of grid cell j. This is in effect a stripe on the neural plate pattern, and the grid pattern can be considered as the superposition of three stripe patterns at an angle of 60 degrees. Therefore, the grid cell model in two-dimensional space can be expressed as

Ω_j(x) = (n_max / 3) Σ_{l=1..3} exp(κ [cos(2π k_l · (x − c_j)/λ) − 1])

where x is the spatial displacement and k_l is the direction of the wave vector of each stripe.
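The superposition of three von-Mises-shaped stripes can be sketched directly; the normalisation by 3 and the parameter values are illustrative assumptions:

```python
import math

def grid_rate(x, y, spacing=1.0, phase=(0.0, 0.0), kappa=2.0, r_max=1.0):
    """Grid-cell firing rate at position (x, y).

    Superposes three periodic stripe patterns whose wave vectors are 60
    degrees apart, each shaped by a von Mises function (a periodic analogue
    of a Gaussian). Names follow the text: spacing is the grid spacing
    lambda, phase is the cell's spatial phase c_j, kappa the gain
    coefficient, r_max the maximum rate n_max.
    """
    dx, dy = x - phase[0], y - phase[1]
    total = 0.0
    for l in range(3):
        ang = l * math.pi / 3.0                    # 0, 60 and 120 degrees
        k = (math.cos(ang), math.sin(ang))         # unit wave vector k_l
        proj = k[0] * dx + k[1] * dy               # displacement along stripe l
        total += math.exp(kappa * (math.cos(2.0 * math.pi * proj / spacing) - 1.0))
    return r_max * total / 3.0
```

The rate is maximal (equal to r_max) exactly where all three stripes peak simultaneously, which reproduces the hexagonal grid field.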

3) Neural code of place cell
The place cell is a kind of firing cell that is selective to spatial location: only when the rat is at a specific position in space does the cell fire, while other positions produce no firing. We arrange the place cells on the neural plate as shown in Figure 3c, and the peak firing position of each place cell corresponds to its phase on the neural plate. Therefore, the place cell population presents a single-peak pattern that maps the current position of the rat. We use the place cell mathematical model proposed by O'Keefe et al. [24] to calculate the firing rate of each place cell, as shown in equation (3):

R_i^pc(r) = exp(−|r − r_i^o|^2 / δ^2)    (3)

where R_i^pc(r) is the firing rate of place cell i at position r, r = (x, y) represents the current position coordinates of the rat in the environment, r_i^o is the position coordinate of the firing-field center of place cell i, and δ^2 is the adjustment coefficient of the place cell firing field.

Biological visual information is perceived by retinal neurons; landmark information is extracted by the visual cortex and then input into the hippocampus to merge with other perceptual information. A similar process is used in our cognitive model. We collect pixel information with a camera (corresponding to biological retinal stimulation) and process it through a visual cognition model (corresponding to the visual cortex) to obtain statistical information about visual landmarks, which is then input into our bionic episode cognitive model. The landmark cognition algorithm is similar to the event encoding algorithm in the following section and is not elaborated here. The block diagram of the algorithm model is shown in Figure 4. The first step is to uniformly set grid points in the picture obtained by the camera, and then use each grid point as the center of a field of view to obtain small pictures as the units of landmark recognition.
The second step is to input the unit pictures into the feature layer to extract visual features. We use the SIFT feature descriptor as the visual feature, and each unit picture yields a 128-dimensional feature vector. In the third step, the feature vector is input to the landmark layer for pattern matching and weight learning; if there is no match, a new landmark neuron is added. Finally, the statistics of the landmark numbers over all picture units are used as the visual coding information and serve as the input to the visual channel of the episode cognition model.
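The landmark coding pipeline above (match each patch descriptor against learned landmark classes, recruit a new class when nothing matches, then output the count statistics) can be sketched as follows. Cosine similarity with a fixed threshold stands in for the ART resonance test here, which is a simplification of the model's actual mechanism:

```python
import math

def encode_landmarks(patch_features, prototypes, threshold=0.9):
    """Classify patch feature vectors into landmark classes and return a
    count histogram (the visual-channel input vector).

    patch_features: feature vectors for the image patches (stand-ins for
    the 128-D SIFT descriptors); prototypes: the learned landmark vectors,
    grown in place when no prototype matches.
    """
    def cos_sim(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    counts = {}
    for f in patch_features:
        sims = [cos_sim(f, p) for p in prototypes]
        if sims and max(sims) >= threshold:
            j = sims.index(max(sims))              # best-matching landmark class
        else:
            prototypes.append(list(f))             # recruit a new landmark neuron
            j = len(prototypes) - 1
        counts[j] = counts.get(j, 0) + 1
    return counts
```

The returned histogram plays the role of the "statistics of landmark numbers" fed to the visual channel.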

B. EVENT ENCODING AND RETRIEVAL
Let the number of head-direction cells be N_d, the number of grid cells N_g, the number of place cells N_p, and the number of landmark cells N_l; the feature vector of the input information of each channel is shown in Figure 5.
The connection from the input layer to the event layer is composed of multiple ART networks: each perceptual channel corresponds to an ART network, and each network learns one characteristic attribute of the event through the weight connections between the input layer and the event layer. A neuron in the event layer represents an event activated by the robot. Table 2 lists the symbols used in the description of the algorithm: the number of input channels of the input layer; x^k, the activity vector of the input layer for channel k; y, the activity vector of the event layer, where n is the number of neurons; w_j^k, the weight vector of neuron j for channel k; and y_j, the activation of neuron j in the event layer. The event layer cognitive learning process follows the method in [21] and includes the following four main steps.

1) Activation of event neurons
When a new activation vector arrives from the input layer at the event layer, the activation value of every neuron in the event layer is calculated to find the neuron that matches the input pattern. The activation value T_j of each neuron j in the event layer is calculated as follows:

T_j = Σ_k |x^k ∧ w_j^k| / (α + |w_j^k|)    (1)

where α > 0 is the choice parameter. The ∧ operation is defined by (p ∧ q)_i = min(p_i, q_i), and the norm | · | is defined by |p| = Σ_i p_i.

2) Competition of event neurons
This process selects the neuron with the largest activation value in the event layer. The selected neuron J has the largest T_j value:

J = arg max_j {T_j}    (2)

Once neuron J is selected, by the winner-takes-all rule, the activation value y_j = 1 for j = J and y_j = 0 otherwise.

3) Matching of event neurons
This process is also called match evaluation. It checks whether the matching value of the winning neuron J for each channel k meets the vigilance threshold:

m_J^k = |x^k ∧ w_J^k| / |x^k| ≥ ρ^k    (3)

If any channel k does not meet its vigilance ρ^k, a mismatch occurs and the winning neuron is deactivated in the competition. Eqs. (1) and (2) are then used to select the next-best neuron until one neuron satisfies Eq. (3). If no neuron in the event layer passes the vigilance test, a new event neuron is added to respond to the current event.

4) Learning of event neurons
If the selected neuron satisfies Eq. (3), the synaptic weight w_J^k is updated:

w_J^k(new) = (1 − β) w_J^k(old) + β (x^k ∧ w_J^k(old))    (4)

where β ∈ (0, 1] is the learning rate. Then the activation value y_J is set to one, and y_J is used as the input of the episode layer to perform episode cognition in the space-time dimension.
The HEC cognition model is a framework that can learn and classify multiple perceptual pattern inputs to produce specific cognitive outputs. An important feature is that cognitive learning and recognition happen at the same time. The network can continuously learn new perceptual information by adjusting the weight vectors; when none of the existing neurons match, a new one is added. Therefore, it can gradually learn new knowledge while retaining the knowledge learned before.
The cognitive process for an event is a bottom-up activation of the input vector, while recall retrieval is achieved by top-down tracing. The algorithm for event learning and retrieval is shown in Algorithm 1.
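The four steps of event-layer learning (activation, competition, vigilance matching, and weight learning) can be sketched for a single channel as follows. This is a minimal fuzzy-ART-style simplification: the parameter values are illustrative and the multi-channel fusion of the full model is omitted:

```python
def fuzzy_and(p, q):
    """(p ^ q)_i = min(p_i, q_i)."""
    return [min(a, b) for a, b in zip(p, q)]

def norm1(p):
    """|p| = sum_i p_i."""
    return sum(p)

class EventLayer:
    """Single-channel sketch of event-neuron activation, competition,
    vigilance matching, and learning. alpha is the choice parameter,
    rho the vigilance threshold, beta the learning rate."""

    def __init__(self, alpha=0.01, rho=0.8, beta=1.0):
        self.alpha, self.rho, self.beta = alpha, rho, beta
        self.w = []                                # one weight vector per event neuron

    def present(self, x):
        # 1) activation: choice function T_j = |x ^ w_j| / (alpha + |w_j|)
        T = [norm1(fuzzy_and(x, w)) / (self.alpha + norm1(w)) for w in self.w]
        # 2) competition: try winners in decreasing order of T
        for j in sorted(range(len(T)), key=lambda j: -T[j]):
            m = norm1(fuzzy_and(x, self.w[j])) / norm1(x)
            if m >= self.rho:                      # 3) vigilance (match) test
                old = self.w[j]                    # 4) learning: move weight toward x ^ w
                self.w[j] = [self.beta * a + (1 - self.beta) * b
                             for a, b in zip(fuzzy_and(x, old), old)]
                return j
        self.w.append(list(x))                     # no resonance: recruit a new neuron
        return len(self.w) - 1
```

Presenting the same pattern twice resonates with the same neuron, while a sufficiently different pattern recruits a new one, which is the incremental-learning behavior the model relies on.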

C. EPISODE LEARNING AND RETRIEVAL
The advantage of episode cognition is that it can encode the temporal relationships between input patterns (event sequences). In HEC, the newly activated event neuron has the largest activation value y_J = 1, and each previously selected neuron is attenuated in each learning iteration, y_j^new = y_j^old (1 − τ), where τ ∈ (0, 1) represents the decay speed of the event activation value. Therefore, an event activation vector with a temporal order is generated in the event layer; this vector is input to the episode layer, and episode neurons are activated by the various sequential patterns. Algorithm 2 presents the episode learning and coding process. When the event sequence y changes in the event layer, the activation value of each neuron in the episode layer is calculated by the activation function. If a similar episode satisfies the vigilance test (m_s > ρ_s), the match is successful; if no existing neuron in the episode layer matches, a new episode neuron is created.
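The decay rule above turns the event layer into an order-encoding trace, which a short sketch makes concrete (τ = 0.2 is an illustrative value):

```python
def update_event_trace(y, winner, tau=0.2):
    """Decay previous event activations and set the newly won neuron to 1.

    y maps event-neuron index -> activation; tau in (0, 1) is the decay
    speed, y_new = y_old * (1 - tau). Earlier events keep smaller values,
    so the vector y encodes the order in which events occurred.
    """
    for j in y:
        y[j] *= (1.0 - tau)
    y[winner] = 1.0
    return y

# a three-event sequence leaves a graded, order-encoding trace
trace = {}
for event in [0, 1, 2]:
    update_event_trace(trace, event, tau=0.2)
```

After the loop, the most recent event has activation 1.0 and earlier events hold geometrically smaller values, which is exactly the sequential pattern the episode layer classifies.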
Different from the activation of event neurons, we create a new activation function that not only reflects the consistency between a neuron's weights and the input pattern but also evaluates the directional similarity between the vectors, to improve episode recognition accuracy:

T_s = (|y ∧ w_s| / (α + |w_s|)) · (y · w_s) / (‖y‖ ‖w_s‖)

where w_s is the weight vector of episode neuron s and the second factor is the cosine similarity between y and w_s. When episode learning is completed, an episode can be identified and recalled from environmental cues, and the entire environmental episode can be regenerated by prediction from those cues. An important feature of the HEC model is that activation and matching are strongly robust, which is very practical when environmental information is deficient or disturbed. For example, reducing the vigilance parameter ρ_s of the episode layer improves tolerance to noisy or incomplete information, allowing retrieval of an episode from less distinctive environmental characteristics, such as a sub-sequence of the episode, during continuous retrieval. When a neuron in the episode layer is activated, its activation pattern is regenerated in the event layer through resonance search. As long as no resonance-matching event neuron is found (match below ρ^k), the next event is received and another neuron in the event layer is activated, so that the missing environmental characteristics can be supplemented.
Algorithm 3 presents the process of episode recognition and prediction. When an event occurs, if an episode neuron S is activated and satisfies the vigilance test, its complete modality in the event layer is regenerated through top-down calculation. However, to reproduce complete episode information, the information in the input layer must also be reproduced in sequence. To facilitate the calculation, a compensation vector is used in place of the predicted event activation vector. Suppose the sequence pattern of the event layer predicted from the episode layer is y^p, and its compensation vector is ȳ^p, with ȳ^p_i = 1 − y^p_i for each event neuron i. For a given vector ȳ^p, the event neuron with the largest activation value is selected and the corresponding input-layer information is predicted. Subsequently, the activation value of the currently selected event neuron is suppressed to zero. Then the next event neuron with the largest activation value is selected, and the process is repeated until all event neurons are inhibited. The events of the episode retrieved in this way are reproduced in order.

Algorithm 3: Episode recognition and prediction
1 FOR EACH incoming event
2   Select a resonance neuron J in the event layer based on the corresponding event
3   Set the activation of neuron J to 1
4   FOR EACH previously selected neuron i in the event layer, decay its activation by τ
5   Select a resonance neuron S in the episode layer based on the activation vector y of the event layer
6   IF S can be found THEN
7     Predict the events of the episode:
8     Set the compensation vector ȳ^p, where ȳ^p_i = 1 − y^p_i
9     FOR EACH neuron j with the largest activation in the event layer
10      Read out the weights of neuron j
11      Predict the input-layer information, then suppress neuron j

Algorithm 2: Episode learning and coding
1 FOR EACH subsequent event in the episode
2   Select a resonance neuron J in the event layer based on the input in the input layer
3   Let the activation of neuron J be 1
4   FOR EACH previously selected neuron j in the event layer
5     Decay its activation by τ, or set it to 0 when fully decayed
6   Given the activation vector y formed in the event layer after the presentation of the episode
7   Activate every neuron s in the episode layer by the choice function
8   Select the neuron S with the largest activation
9   WHILE the match function fails the vigilance test
10    Deselect and reset S
11    Select another neuron S
12  IF no matching (resonance) S can be found in the episode layer THEN recruit a new uncommitted neuron in the episode layer
13  Learn y as a novel episode

D. CONSTRUCTION OF EPISODE COGNITIVE MAP
Existing episodic cognitive networks focus on event coding and memory management, with no connections between episodes in the episode layer, so they are not suitable for robot path planning.
To overcome this problem, we extend the HEC architecture to create a cognitive map that connects the episode nodes on top of the episode layer. The construction of the episode cognitive map depends on the generation of episode neurons in the episode layer: whenever a new neuron is generated, a new node is added to the episode map. Each node M_i contains a specific episode neuron e_i, a hippocampal spatial-cell firing pattern p_i, and visual landmark information v_i. The edges between nodes represent the spatio-temporal relationships between nodes; in a sense, this map connects the episode neurons in the episode layer. Therefore, a node in the map can be defined as a triple:

M_i = (e_i, p_i, v_i)

A node link L_ij marks the connection from the previously activated node M_i to the most recently activated node M_j and encodes the change in spatial position obtained from the bionic cognitive model. With the episode map, path planning can be achieved through a top-down retrieval process. We use tree search to select the connecting nodes (path) from the current position of the robot to the target point. Each node of the episode cognitive map represents a specific neuron in the episode layer, so a path from the episode cognitive map is an episode trajectory. After selecting the episode trajectory, HEC determines the resonance event neurons of the event layer according to the current episode. Once a resonance event neuron is selected, the top-down connections recall the events of the sequence in the event layer and retrieve the attributes of each event. The retrieved event attributes include the robot's spatial perception and visual perception, which are used for robot navigation and positioning. The whole process is shown in Figure 4.
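The path-planning search over the episode map can be sketched as a graph search over the node links L_ij. The text says "tree search"; breadth-first search is one concrete choice we assume here for the sketch:

```python
from collections import deque

def shortest_episode_path(links, start, goal):
    """Breadth-first search over episode-map links.

    links: dict mapping a node to the list of nodes it connects to
    (the episode-map edges L_ij); returns the node sequence from start to
    goal, i.e. the episode trajectory used for top-down retrieval, or
    None if the goal is unreachable.
    """
    frontier = deque([[start]])
    visited = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in links.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                frontier.append(path + [nxt])
    return None
```

Each node on the returned path corresponds to an episode neuron, whose events are then recalled top-down for navigation.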

III. EXPERIMENTS AND RESULTS
The proposed models and methods are suitable for learning, positioning, and mapping in real scenes that include dynamic pedestrian traffic and local environmental changes. We use a mobile robot built in our laboratory. As shown in Figure 7, it is equipped with a USB camera, an encoder, and inertial navigation to measure mileage and direction; to facilitate comparative experiments, it also carries a three-dimensional lidar. The images obtained from the camera are cut into blocks with a fixed stride as landmarks, the SIFT [25] descriptor vector is extracted for each landmark as its feature, and the landmarks are classified and coded through cognitive learning. The code composition of the landmarks in each original image is then counted as the input value of the visual channel. Taking the robot's starting position as the coordinate origin, the encoder collects the displacement between adjacent sampling moments as the step displacement. The grid cell pattern learned by the robot in the previous sampling step is driven by this step displacement to shift its phase accordingly, and the resulting grid cell pattern is used as the grid-cell input at the current moment. The place cell pattern is the unimodal firing pattern obtained by decoding the grid cells. The direction information obtained from inertial navigation is encoded into the ring attractor model. The resulting head-direction cell, grid cell, place cell, and visual landmark information serve as the four perception channels of the mobile robot. The HEC receives the normalized feature vectors of the four channels as the input of each learning iteration. The model parameters are set as shown in Table 1. We control the robot to move autonomously in office areas and corridors.
The perceived internal motion information and external visual information are processed into feature vectors for the input layer. The learning algorithm continuously classifies similar events according to these feature vectors; in the event layer they are represented as event neurons. The time series of activated event neurons is used as the input feature vector of the episode layer to activate episode neurons, and the learning algorithm continuously groups similar episodes based on the event neurons to form a series of episode neurons. Each episode neuron is connected as a node to store memory, and finally these learning experiences form an episode cognitive map. The detailed experimental process and results are as follows.
VOLUME 4, 2021

A. PREPROCESSING OF ENVIRONMENTAL PERCEPTION INFORMATION
According to the perceptual information encoding method described above, the numbers of the various types of nerve cells are set as: 360 head-direction cells, 32 × 32 grid cells, 32 × 32 place cells, and 5000 visual landmark cells. The direction information of the moving chassis is encoded into the firing rate of each head-direction cell. The mileage information is converted into the displacement between adjacent sampling steps; this displacement drives the grid cell pattern to obtain the current grid firing pattern, and the firing rates of the grid cells then realize the neural coding. The place cell information comes from the position decoding of the grid cells; it is used as a separate sensing channel because it has a one-to-one correspondence with physical space and can serve as the spatial mapping of episode neurons to form the episode cognitive map. Figure 7 shows the neural coding of the environmental perception information of the four channels. The neural firing information of each channel is expanded into a vector, and each element of the vector is used as an input attribute value of the channel. Figure 9 shows the episode recognition process and results of the robot system based on the HEC model. The robot roamed autonomously in the experimental area, which included passing pedestrians and changes in the local environment, and the relevant perceptual information was learned as a series of episodic memories. Figure 9(a) shows the robot's learning trajectory: the robot circulated in the office and learned 200 sampling points in total. As shown in Figure 9(b), at the 1st to 94th sampling points the robot went back and forth. To verify the anti-interference performance of the episode cognition model, after the first cognition learning, the camera was blocked several times during a second cognition learning run on the same path. The event and episode recognition process is shown in Figure 10.
Figure 10(a) shows the activation of event neurons and episode neurons during the first cognition learning; the number of neurons gradually increases. Figure 10(b) shows that after being disturbed (occluded), the perception information cannot match the previously recognized events, and new events are generated at steps 2, 10, and 22. Episode recognition fails at step 2 because the event-sequence information there is limited, but the perceptual information at all other time steps accurately identifies the existing episode neurons. Therefore, the proposed episode recognition model is robust to interference.
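The behavior in Figure 10(b), where unmatched perception recruits a new event neuron, follows the ART-style vigilance mechanism the model is built on. A minimal sketch, under our own assumptions: we use cosine similarity and a vigilance value of 0.75, whereas the paper's exact match rule and parameters may differ.

```python
import numpy as np

def recognize_or_create(x, events, vigilance=0.75):
    """ART-style event recognition sketch: match input vector x against
    stored event prototypes; if the best match clears the vigilance
    threshold, resonate (slowly update that prototype), otherwise
    recruit a new event neuron for the novel/occluded input."""
    best_i, best_s = -1, -1.0
    for i, w in enumerate(events):
        s = float(np.dot(x, w) / (np.linalg.norm(x) * np.linalg.norm(w) + 1e-12))
        if s > best_s:
            best_i, best_s = i, s
    if best_s >= vigilance:
        events[best_i] = 0.9 * events[best_i] + 0.1 * x  # resonance update
        return best_i, False
    events.append(x.copy())  # mismatch: incrementally generate a new neuron
    return len(events) - 1, True

events = [np.array([1.0, 0.0, 0.0])]
idx, is_new = recognize_or_create(np.array([0.0, 1.0, 0.0]), events)
print(idx, is_new)  # 1 True -- the mismatched input recruited a new neuron
```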

B. CONSTRUCTION OF EPISODE COGNITIVE MAP
Next, we performed repeated experiments in a controllable environment. We controlled the robot to go back and forth in the office area for cognition learning five times, and then to walk along the original route four more times under four environmental conditions: static, slightly changed, moderately changed, and greatly changed. We used three indicators to evaluate the results: average similarity, event-neuron matching rate, and average error. The average similarity is the mean matching degree between the current environment information and the episode neurons in the previous cognition memory; the event-neuron matching rate is the proportion of correct responses of the model to the event neurons in memory under the current environment information; and the average error is the RMS error between the current actual position and the position of the activated event neuron. The final results are shown in Table 3. They show that as the environment changes more dynamically, the cognition model adds many event neurons and episode neurons to adapt to the new episodes, and the accuracy of cognition decreases, but the impact is relatively small. In the corridor with weak texture information and in the office with dynamic changes, HEC shows excellent positioning performance.
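The three indicators above can be computed straightforwardly from per-step logs. A minimal sketch with toy values; the function name and inputs are our illustration, not the paper's evaluation code.

```python
import numpy as np

def evaluate_run(similarities, matched, actual_xy, predicted_xy):
    """Compute the three indicators used in the experiments:
    average similarity, event-neuron matching rate, and RMS position
    error between actual positions and activated-neuron positions."""
    avg_similarity = float(np.mean(similarities))
    match_rate = float(np.mean(matched))  # fraction of correct responses
    err = np.linalg.norm(np.asarray(actual_xy, dtype=float)
                         - np.asarray(predicted_xy, dtype=float), axis=1)
    rms_error = float(np.sqrt(np.mean(err ** 2)))
    return avg_similarity, match_rate, rms_error

avg_sim, rate, rms = evaluate_run(
    similarities=[0.9, 0.8, 0.7],          # per-step match degree
    matched=[1, 1, 0],                     # 1 = correct neuron response
    actual_xy=[(0, 0), (1, 0), (2, 0)],
    predicted_xy=[(0, 0.1), (1, 0), (2, 0.2)])
print(round(avg_sim, 3), round(rate, 3), round(rms, 3))
```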

C. ROBOT EPISODE PREDICTION
As mentioned earlier, each node in the episode cognitive map represents a specific episode neuron in the episode layer. We therefore conducted robot relocalization experiments to evaluate the performance of episode prediction. After receiving a series of input environment information, the robot first activates a specific episode node in the episode cognitive map. The robot's position is then predicted from the memory information mapped by this episode node, and the Euclidean distance between the predicted position and the robot's current actual position is calculated. If this distance is less than a set threshold, the robot is considered to be successfully localized. In the experiment, we set the threshold to 0.1 meters.
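The success criterion above reduces to a single distance check. A minimal sketch (names are ours) using the 0.1 m threshold stated in the text:

```python
import numpy as np

def relocalization_success(predicted_xy, actual_xy, threshold_m=0.1):
    """An activated episode node maps back to a remembered position;
    relocalization succeeds when the Euclidean distance between that
    predicted position and the robot's true position is below the
    threshold (0.1 m in the experiment)."""
    d = float(np.linalg.norm(np.asarray(predicted_xy, dtype=float)
                             - np.asarray(actual_xy, dtype=float)))
    return d < threshold_m

print(relocalization_success((1.0, 2.0), (1.05, 2.02)))  # True: error ~5.4 cm
```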
In the cognitive learning process, the robot was controlled to traverse randomly 10 times between a fixed starting point and an ending point to learn sufficient environmental information. During each run, the environmental conditions were not controlled, so as to remain close to a real scene; the robot therefore faced different conditions on each circuit, such as the opening and closing of doors, the movement of pedestrians, and changes in lighting. To verify the advantages of the proposed episode recognition model, a comparative experiment was carried out with a 3D laser SLAM (simultaneous localization and mapping) model, and the robot relocalization accuracy of HEC was compared with that of [26]. The positioning accuracy comparison is shown in Table 4. The experimental results show that the HEC model has an obvious robustness advantage, which is even more pronounced in weak-texture environments such as the corridor.

IV. DISCUSSION
We verified the adaptability of HEC's perception in robot episode cognition learning without pre-defined environmental knowledge. HEC can classify the perceptual features of each channel as event neurons and encode the spatio-temporal relationships of event neurons as episode neurons. The learning and coding process requires no human intervention, which makes it possible to work in a natural environment. HEC can adapt to changes in the environment by updating neurons, so it is suitable for long-term operation. Unlike SLAM and other methods, robot positioning here not only compares perceptual information but also matches the spatio-temporal relationships of events stored in the event neurons. This method can therefore disambiguate places with similar perceptual information and overcome the problem of perceptual aliasing.
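The disambiguation idea above can be sketched in a few lines. This is our illustration of the principle, not the paper's matching rule: among candidate events whose perception matches equally well, prefer the one whose recorded predecessor events agree with the recently observed sequence.

```python
def disambiguate(candidates, recent_events, predecessors):
    """Resolve perceptual aliasing between event neurons that all match
    the current perception, by scoring each candidate against the
    spatio-temporal context: how many recently observed events appear
    among that candidate's recorded predecessors."""
    def seq_score(e):
        preds = predecessors.get(e, [])
        return sum(1 for r in recent_events if r in preds)
    return max(candidates, key=seq_score)

# Two perceptually aliased places (events 5 and 9); the recent event
# history [7, 8] matches the context recorded for event 9.
predecessors = {5: [1, 2], 9: [7, 8]}
print(disambiguate([5, 9], [7, 8], predecessors))  # 9
```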
For robot path planning, because the episode cognitive map connects all the episode nodes together, the robot can search for the shortest path among these connected nodes and then navigate along it. This method has a limitation: since the planned path relies only on the robot's previous traversal experience, it may not be the optimal navigation path (a shortcut), similar to a human following a previously traversed route to reach a target location. Advanced mammals, by contrast, can cross unknown areas by vector navigation toward the target; we will study this in depth in follow-up research.
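The shortest-path search over connected episode nodes can be sketched with standard Dijkstra; the graph encoding below is our assumption (adjacency lists of `(neighbor, distance)` pairs), and the key property from the text is that the route is restricted to previously experienced transitions.

```python
import heapq

def shortest_episode_path(graph, start, goal):
    """Dijkstra over an episode cognitive map: nodes are episode
    neurons, edges are previously traversed connections weighted by
    distance. Returns the node sequence, or None if unreachable."""
    dist, prev = {start: 0.0}, {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if goal not in dist:
        return None
    path, n = [goal], goal
    while n != start:
        n = prev[n]
        path.append(n)
    return path[::-1]

g = {"A": [("B", 1.0), ("C", 4.0)], "B": [("C", 1.0)], "C": []}
print(shortest_episode_path(g, "A", "C"))  # ['A', 'B', 'C']
```

Note that the planner can only recombine experienced edges, which is exactly the limitation discussed above: it cannot produce a shortcut across a region the robot has never traversed.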

V. CONCLUSION
In this article, we propose a method to build a cognitive map for robot navigation in a complex environment by mimicking the episodic memory mechanism of the hippocampus. It can learn and recognize the episodes experienced by the robot and predict the current state and planned behaviors. Combined with the biological basis of the neural expression of perceptual information, the mapping relationship from spatial and visual information to episode cognitive neurons is established, and a set of episode cognition models and methods is proposed. The learning method realizes learning by changing the synaptic weight connections of the neural network, giving it a self-learning ability that imitates nerve synapses, and it can realize real-time storage, integration, and updating of robot memory. In the future, the activation of episode neurons can be used to predict behavior sequences and realize robot navigation based on the episode cognitive map. Experimental results show that the robot can achieve robust environmental cognition and prediction in dynamic and complex environments.