Ontology Learning of New Concepts Combining Textual Knowledge, Visual Analysis, and User Interaction

The continuous emergence of new technologies has made service robots an impending reality. When interacting with humans, robots must adapt to changing environments. Hence, service robots at home need learning capabilities to acquire new knowledge and merge it with their own. In this study, we developed a system for learning the ontologies of new concepts, combining textual knowledge, visual analysis, and user interaction. This system provides the robot with an essential feature for adapting to the home environment. We focus on the learning of new ontological concepts oriented toward service robot applications. We propose combining textual knowledge, visual analysis, and user interaction to determine the correct placement of the new concepts in the ontology structure. We aim to enable the robot to extend its ontological knowledge as needed. We conducted a set of experiments to show the applicability of the presented method and the advantage of conceptualizing objects in ontological knowledge. The experiments consisted of two parts: concept learning experiments and experiments with an integrated robot system. In the former, the robot had to conceptualize a set of new objects in its ontological knowledge; in the latter, the robot was asked to search for and find the newly learned objects.


I. INTRODUCTION
The continuous emergence of new technologies has contributed to the impending reality of service robots. Service robots are used in areas such as hospitality with great success in supporting customers and employees. A particular case is service robots at home, which closely assist humans in performing household chores. These types of robots require standard capabilities such as navigation and vision, as well as personalized means of interaction according to the robot's purpose. An essential part of developing such robots is knowledge management, which contains information regarding the scenes and the robot; this knowledge is generally integrated beforehand. However, robots must adapt to changing environments when interacting with people. Hence, service robots at home need learning capabilities to acquire new knowledge and merge it with their own.
The associate editor coordinating the review of this manuscript and approving it for publication was Pasquale De Meo.
The use of ontological knowledge to represent concepts has proven beneficial in many robot applications. It gives a broad conceptualization of the meaning of objects and allows for the connection of related concepts, making knowledge extendable. Ontology learning has been achieved in different domains using linguistic and statistical techniques and inductive logic programming. These techniques include data preprocessing, relation extraction, and term extraction [1].
Expanding robot knowledge has been achieved using multiple techniques, including the use of language and perceptual information [2]-[4]. However, a broad and clear conceptualization of objects is not contemplated. Different systems that include a hierarchy of concepts have been proposed [5], [6]. However, the robots' autonomous learning is not considered.
In this study, we develop a system for ontology learning of new concepts combining textual knowledge, visual analysis, and user interaction. In this system, a robot is provided with an essential feature to adapt inside a home environment. We focus on the learning of new ontological concepts oriented to service robot applications. We propose combining textual knowledge, visual analysis, and user interaction to determine the correct placement of these new concepts in the ontology structure. We aim to enable the robot to extend its ontological knowledge as needed.
VOLUME 9, 2021. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
The contributions of this work are as follows: (a) textual knowledge acquisition for word meaning identification and image data collection, (b) visual analysis of images for concept description selection, (c) user interaction to support meaning selection, and (d) ontological knowledge update with the conceptualization of new objects.
Our previous work [7] presented an early version of this system, which used a basic visual analysis. It demonstrated the importance of selecting the proper meaning of a new concept during the learning process. In this work, we incorporate neural networks into the visual analysis process and add human-robot interaction to support meaning selection. In addition, we conducted further experiments, including experiments with a complete service robot application scenario.
The rest of this study is organized as follows. Section II describes related work. Section III explains the problem definition and the environment layout. Section III-A briefly describes the ontology learning components included in this system. Sections IV and V explain the word meaning identification and visual analysis processes, respectively. Section VI describes concept description selection. Section VII explains the ontology update process. Section VIII shows the experiments in learning a new concept. The study ends with the conclusions and discussion in Section IX.

II. RELATED WORK
Ontology learning techniques have been explored in different research domains such as tourism [8], [9], medicine [10], [11], and computer science [12], [13]. In addition, researchers have developed tools for ontology development tasks, such as the ROBOT tool [14]. This tool helps to check logical errors and standard quality control; it eases the creation, maintenance, and release of ontologies, currently working only with Open Biomedical Ontologies [15].
In the field of robotics, an attempt to create an ontology standard for autonomous robotics was presented in [5]. The authors demonstrated some advantages of using ontologies for a system, such as semantic interoperability. The knowledge representation focuses on notions such as behavior, function, goal, and task. Although the ontology includes object manipulation behaviors, it is unclear how objects are described. In addition, it is not mentioned if robots can extend their knowledge autonomously.
An approach to expand the knowledge of a conversational agent during runtime was presented in [2]. It presents a technique for automatic knowledge extraction, and it was tested with a social humanoid robot in a residential care home setup. The insertion of concepts is accomplished using verbal interaction, and it includes cultural knowledge about the user. Before the robot inserts a new concept, it asks whether the new concept belongs to a previously mapped class. However, it will continue asking in the same way for each class of the ontology in descending order, which might lead to a rather prolonged confirmation process if the mapped class is too general. Furthermore, the correct insertion of concepts depends on the ability of a user to describe the new concept so that the system can find a matching or related class; this can limit meaning conceptualization.
A different approach for learning concepts is found in [3], where an integrated cognitive architecture was realized to learn concepts, actions, and language. The authors integrated multiple probabilistic generative models to accomplish this. Experiments proved that a robot could learn based on its experiences while interacting with objects and receiving feedback from a user. However, they do not consider any hierarchy or relation between concepts related to objects learned by the robot. In addition, the model does not consider reasoning on the newly acquired concepts.
Learning word meanings has been achieved using statistical models such as probabilistic latent semantic analysis and latent Dirichlet allocation [16]. The approach, further extended in [4], used multimodal information such as visual, auditory, and haptic information, and it can learn object concepts incrementally. However, the categorization of objects is based on their visual features, so the semantic relationships between the object concepts are not present.
Another approach for learning visual concepts was proposed in [6]. It collects training videos and uses human annotation for data collection. The vision system learns concepts that are used to extract related concepts with ontology knowledge. The system searches for more videos using those extracted concepts and repeats the process. It shows the importance of having a concept hierarchy. However, it is not a fully autonomous learning process since it relies on human annotation data.
In [17], an incremental object learning system was described. The system uses a few sample images to learn new objects and their categories visually. However, the category hierarchy of concepts is not considered in the learning process.
In the present work, we aim to allow a robot to extend its ontological knowledge of object concepts. The conceptualization of a new object is accomplished using textual knowledge. In addition, the selection of the accurate meaning describing a new concept is supported by visual analysis and human-robot interaction. The textual knowledge contains information about the semantic relations of concepts, providing accurate descriptions of new objects. Furthermore, the representation of object concepts inside the ontological knowledge gives the advantage of reasoning on concepts and a more extended language description.

III. PROBLEM DEFINITION AND OUR APPROACH
A robot with general ontological knowledge is required to learn a new object, which entails learning the new word concept to correctly create the new instance and the corresponding classes inside its predefined knowledge. Although the robot could use methods such as consulting online resources or actively asking a user about the new object to extend its knowledge, we want the robot to learn new concepts as needed, with as little burden on the user as possible.
For this purpose, we achieve the learning of a new concept using three methods: (1) using textual knowledge to identify the possible meanings of a new word; (2) performing meaning selection by analyzing the visual characteristics of related concepts against the new object; and (3) creating user interactions to support meaning selection.
The proposed scenario to demonstrate the usability and feasibility of learning new concepts through textual knowledge and visual analysis is shown in Fig. 1. A service robot with vision capabilities and ontological knowledge is requested to learn a new object. First, the robot receives the name and the image of the new object. Then, the new object must be conceptualized to include it in the current knowledge.

A. ONTOLOGY LEARNING COMPONENTS
Our approach for ontology learning of new concepts consists of four modules. These modules are responsible for (1) textual knowledge acquisition and semantic relation extraction, (2) image collection and analysis, (3) concept description selection, and (4) ontology updating. The modules are connected consecutively, finalizing with the object conceptualization process. These components enable the robot to learn new objects by conceptualizing them. The robot only needs the name of the new object and its image to start this process. It can also verbally interact with a user to obtain confirmation before conceptualizing a new object.
An overview of the ontology learning components is shown in Fig. 2. The concept learning process starts when a user shows the robot an image of the new object and names it; this information is sent to the Word Meaning Identification module. It acquires textual knowledge about the new object to extract its semantic relations as a concept. Then, all senses, which refer to the possible meanings of the new word concept, are sent, along with their semantic relations, to the Visual Analysis module.
The Visual Analysis module performs online image queries using the hypernyms contained in the semantic relations received. Next, it analyzes the images downloaded to find features similar to the image of the new object, and a similarity score is assigned to each sense. Senses with their respective similarity scores are sent to the Concept Description Selection module. In this module, the similarity scores are examined to choose a sense that best describes the new object concept to learn. When the robot finds more than one high similarity score, it resorts to interacting with the user to confirm the correct concept description. Once the robot knows which concept description best represents the new object, it sends it to the Knowledge Management module to update the ontology.
In the following subsections, we explain the word meaning identification process, the visual analysis performed, and the ontology update with new concepts.

IV. WORD MEANING IDENTIFICATION
A. TEXTUAL KNOWLEDGE ACQUISITION
An important factor for inserting a new concept in the ontology hierarchy is determining the correct correlated concepts and the corresponding categories. The ontological knowledge contains a collection of object concepts associated with each other by properties such as the ''subclass of'' property. It also contains instances of real objects belonging to some of those class concepts. To extend this knowledge, the robot needs to acquire textual knowledge containing semantic relations associated with the new concept. Therefore, we decided to employ online processing resources to acquire the required semantic relations.
The textual knowledge is based on WordNet, a well-known lexical database [18]. To access the WordNet database, we use GATE, the general architecture for text engineering [19], which can add WordNet as a processing resource. First, the robot queries GATE with the new word concept to retrieve only the senses with a noun POS tag. Each sense carries a possible meaning of the new word concept as a noun and its semantic relations with associated concepts. Then, a list of hypernyms for each sense is obtained. This list represents the superclasses of the new word concept according to each sense and will be used to create the new concept in the ontology.
An example of a query for the word ''pen'' is shown in Fig. 3. The robot receives five different senses for the same word: writing implement, enclosed area, portable enclosure, correctional institution, and female swan. In Fig. 3 (bottom), the retrieved hypernyms for the first two senses are shown. After the robot obtains the list of senses and their semantic relations, it must identify which sense best describes the new word concept to add the corresponding semantic relation in the ontology.
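As a concrete illustration of the sense records the robot works with, the ''pen'' example can be represented as a simple data structure. This is a minimal Python sketch: the glosses are paraphrased and the hypernym chains are abbreviated for illustration; in practice these would be retrieved from WordNet (e.g., through GATE, as described above).

```python
# Illustrative sense records for the noun "pen", mirroring the Fig. 3 example.
# Glosses and hypernym chains are paraphrased/abbreviated, not exact WordNet output.
PEN_SENSES = [
    {"gloss": "a writing implement with a point from which ink flows",
     "hypernyms": ["writing implement", "implement", "instrumentality"]},
    {"gloss": "an enclosed area for confining livestock",
     "hypernyms": ["enclosed area", "area", "structure"]},
    {"gloss": "a portable enclosure in which babies may be left to play",
     "hypernyms": ["portable enclosure", "enclosure", "area"]},
    {"gloss": "a correctional institution for those convicted of major crimes",
     "hypernyms": ["correctional institution", "institution", "establishment"]},
    {"gloss": "a female swan",
     "hypernyms": ["swan", "aquatic bird", "bird"]},
]

def noun_senses(word, lexicon):
    """Return the list of noun senses recorded for `word` in `lexicon`."""
    return lexicon.get(word, [])

lexicon = {"pen": PEN_SENSES}
senses = noun_senses("pen", lexicon)
```

Each sense pairs a gloss with the hypernym chain that later drives both the image queries and the ontology update.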

B. SEMANTIC RELATION SELECTION
The selection of the correct semantic relation is a crucial step in ontology learning, since it establishes the connection of a new concept with existing ones. In addition, correctly associated new concepts allow the ontology to make inferences over them, such as deducing characteristics inherited from potential superclasses.
[Fig. 1 caption [7]: The robot receives the name of a new object, along with its visual image. On the right side, textual knowledge and visual analysis are used to select the new hierarchy of classes to include in the robot's current knowledge.]
The new semantic relation to be added to the ontological knowledge must comply with the following guidelines:
• The sense of the chosen semantic relation must correspond to the correct meaning of the new word concept.
• The new semantic relation will be linked to the closest or an equivalent concept in the ontology hierarchy.
• The concept classes of the new word concept will be added sequentially following the semantic relation.
• The insertion of new classes will stop when a class already exists in the ontology. Otherwise, a maximum of four new classes will be created, since the higher a concept sits in the hierarchy, the more general it becomes.
Semantic relations acquired from WordNet might contain general concepts that can be found in the ontological knowledge, such as artifact or physical entity (Fig. 3, bottom). However, this does not ensure that those concepts accurately describe the new concept to be learned. Analyzing the visual characteristics of the new object and user interaction aid in determining the correct semantic relation.

V. VISUAL ANALYSIS
We assume that, in the concept learning scenario, the robot cannot recognize the new object with its current object identification module. Hence, the robot cannot determine the name or category of the object, which would otherwise fix the position of the new semantic relation in the ontological knowledge.
Therefore, we propose supporting the selection of semantic relations by analyzing the visual characteristics of the new object and comparing it with objects belonging to the possible semantic fields of the new object based on its name. This visual analysis process aims to find the correct semantic field of the new object by studying the similarities with other objects from a set of potential semantic relations. The process starts with an online image query to collect images of hypernyms included in each potential semantic relation, followed by feature analysis, as explained in the following subsections.

A. ONLINE IMAGE QUERY
In the first part of the visual analysis process, the robot makes an online image query to collect image samples of object categories associated semantically. Potential categories are chosen from the list of potential semantic relations obtained previously in the word meaning identification process.
A query is formed using the new word concept and a hypernym, e.g., for the first sense of the concept washer, whose first three hypernyms are worker, person, and organism, the first query generated would be ''washer worker.'' This method of formulating queries helps the online image query return good results according to the expected meaning. For example, Fig. 4 shows image results for querying the hypernyms of the second sense of the concept washer in different ways. According to WordNet, the second sense of the word washer refers to a ''seal consisting of a flat disk placed to prevent leakage.'' The difference in image results depends on the query. In the case of (a), (b), and (c), when the query contains only a hypernym, image results do not fully represent the meaning required. This situation can confuse the robot's understanding since objects in the resulting images can be rather general and variable, not necessarily representing the expected meaning.
In contrast, adding the word washer to the query makes it more specific, and resulting images show objects closer to the meaning required (Figs. 4 (d-f)). Therefore, the robot can have a better approximation of objects belonging to the required sense.
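The query formulation step described above can be sketched as follows. This is a minimal illustration: the function name is ours, and the hypernym chain shown for the second sense of washer is an assumption, not WordNet output.

```python
def image_queries(word, senses, hypernyms_per_sense=3):
    """Build one image-search query per hypernym, pairing the new word with
    each hypernym (e.g., "washer worker") so that results match the
    intended sense rather than the hypernym alone."""
    return [[f"{word} {h}" for h in sense["hypernyms"][:hypernyms_per_sense]]
            for sense in senses]

# Hypernym chains for two senses of "washer"; the second chain is assumed.
senses = [
    {"hypernyms": ["worker", "person", "organism"]},   # sense 1: one who washes
    {"hypernyms": ["seal", "fastener", "restraint"]},  # sense 2: flat-disk seal
]
queries = image_queries("washer", senses)
```

The first generated query is then ''washer worker'', matching the example in the text.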
Following the query generation method, the system creates online image queries for the first three hypernyms per sense to download sample images (Fig. 5). These images thus represent examples of objects from each semantic relation. The importance of correctly identifying the sense that best describes the new object cannot be overstated: choosing the wrong sense would assign a completely different meaning to the new object and create incorrect concepts and an incorrect hierarchy in the ontological knowledge. Once the online image query has collected sample images for the potential semantic relations of the new object, the next step is to determine the correct meaning of the new object by analyzing image features.

B. IMAGE ANALYSIS
The second part of the visual analysis process consists of comparing the new object's image with the images collected for each potential semantic relation. Here, the robot is expected to find that the images similar to the new object image belong to only one sense.
We propose using a pretrained artificial neural network to extract image features of the new object and the downloaded collection of images. Then, it is possible to calculate a similarity value between them using the extracted features.
In this study, we use a deep convolutional neural network (CNN), namely the ResNet-152 version 2 architecture [20], pretrained on the ImageNet dataset [21]. First, we remove the network's last layer and extract a 2048-dimensional feature vector per image from the layer preceding the final fully connected layer. These features are computed for the new object image and for the downloaded images of all the potential semantic relations.
Subsequently, the cosine similarity is computed between the features of the new object image and those of each image from the semantic relations. Then, the average similarity for each semantic relation, which represents a sense, is calculated. Hence, a similarity value is assigned to each sense (Fig. 6).
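This scoring step can be sketched as follows. The sketch uses toy 2-D vectors in place of the 2048-dimensional CNN features; the function names are ours.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def sense_scores(new_feature, features_per_sense):
    """Average cosine similarity between the new object's feature vector and
    the feature vectors of the images downloaded for each sense."""
    return [sum(cosine(new_feature, f) for f in feats) / len(feats)
            for feats in features_per_sense]

# Toy example: the new object matches the first sense more closely.
new_obj = (1.0, 0.0)
scores = sense_scores(new_obj, [[(1.0, 0.0), (0.0, 1.0)],  # sense 1
                                [(0.0, 1.0)]])             # sense 2
```

Each entry of `scores` is the similarity value assigned to one sense, as in Fig. 6.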

VI. CONCEPT DESCRIPTION SELECTION
Before the robot can add the new concept to its ontological knowledge, the final step is deciding which concept description best represents the new object shown. Thus, to recap, the first part of the concept learning process is the word meaning identification, which acquires the meanings and the semantic fields of the new word concept. The second part is image analysis, which collects image samples for each semantic field and finds similarities between them and the new object to learn. Finally, the robot needs to choose the best concept description for the new object. The selection of the concept description is based on the results of the image analysis process, which outputs a list of similarity values of the new object for each sense. Ideally, the sense with the highest similarity score would be the best to describe the new object concept. However, the similarity is significantly affected by the variation of results during the online image query, as shown in Fig. 5.
There are some situations when image results may contain unrelated images, even with the descriptive query. That is the case of Fig. 7, where querying ''pitcher containerful'' retrieves mixed results. Therefore, it is necessary to emphasize that having additional images for each semantic relation contributes to a more accurate differentiation.
The senses of a word can refer to multiple different meanings for the same word concept. Hence, the robot needs to be sure that the correct concepts will be added to its ontological knowledge. There are three main cases to consider regarding the image similarity results:
1) The sense with the highest similarity score is the correct one, and no other sense has a close similarity score.
2) The sense with the highest similarity score is the correct one, and a second sense has a close similarity score.
3) The sense with the highest similarity score is the incorrect one.
As previously mentioned, the first is the ideal case, since the new object would be adequately conceptualized. However, in the second and third cases, an additional method is required to confirm that the correct sense is selected. Therefore, we propose assisting the concept description selection using human-robot interaction, as explained in the following subsection.
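The handling of these cases can be sketched as a small decision rule: accept the top-scoring sense outright, or fall back to asking the user when the runner-up is close. This is an illustrative sketch; the margin threshold is an assumption, not a value from this work.

```python
def select_sense(scores, margin=0.05):
    """Return ("accept", best, None) when the top score is clearly ahead,
    or ("ask_user", best, second) when the runner-up is within `margin`,
    meaning the robot should ask the user to disambiguate."""
    ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
    best = ranked[0]
    if len(ranked) > 1 and scores[best] - scores[ranked[1]] < margin:
        return ("ask_user", best, ranked[1])
    return ("accept", best, None)
```

Case 3 (the highest score is for the wrong sense) cannot be detected from the scores alone, which is exactly why the user's confirmation is needed.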

B. USER INTERACTION FOR CONCEPT DESCRIPTION SELECTION
In this concept learning process, the robot may find itself in a situation where it requires the user's approval for the series of concepts that it is about to learn. Furthermore, as discussed in the previous section, human-robot interaction can be considerably helpful in a service robot environment.
We propose using human-robot interaction to provide final assistance in the concept description selection process, which has the following benefits:
• It helps the robot confirm that it is conceptualizing the new object correctly.
• It helps the robot decide which definition of the new object is accurate when the image analysis results are not confident.
• It assures the user that the robot is learning the new object correctly.
• Teaching a new object to the robot becomes more interactive for the user.
We previously discussed the main scenarios that the robot may encounter after obtaining image similarities: when the highest score is for either the correct or the incorrect sense, and when the second-highest score is close to the first. Depending on which of these scenarios the robot encounters, it interacts with the user throughout the concept learning process.
The robot creates three types of dialog sentences: (1) requesting the image and the name of the new object, (2) reporting that it saved the new object together with a one-word description of it, and (3) letting the user choose between two concepts that possibly define the new object. While the first and second dialog sentences are always used to start and finish the concept learning process, the third is used when the two highest similarity scores are very close; the robot can then ask which one it should save.
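The three dialog types can be sketched as simple template functions. This is an illustrative sketch: the exact sentence wordings (apart from the alternative-question format, which mirrors the ''Is it a spanner or a twist?'' example from the experiments) are hypothetical.

```python
def request_object():
    """Dialog type 1: ask the user for the new object and its name."""
    return "Please show me the new object and tell me its name."

def report_saved(name, description):
    """Dialog type 2: report the saved object with a short description."""
    return f"I saved the new object {name} as a {description}."

def disambiguation_question(sense_a, sense_b):
    """Dialog type 3: alternative question built from the first hypernym of
    each of the two competing senses."""
    return f"Is it a {sense_a['hypernyms'][0]} or a {sense_b['hypernyms'][0]}?"

q = disambiguation_question({"hypernyms": ["spanner"]}, {"hypernyms": ["twist"]})
```

The user's answer to the third question settles which sense the robot conceptualizes.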
The user interaction helps the robot with the final selection of the concept description. It is worth mentioning that limited interaction is preferable since spending much time teaching one object could be exhausting for the user.

VII. ONTOLOGY UPDATE
The final step in the concept learning process is to update the ontological knowledge with information about the new object. At this point, the correct semantic relation has been chosen according to the guidelines explained in subsection IV-B.
The ontology update process starts by creating a new instance of the object. Then, the first concept class created corresponds to the exact name of the object that is being taught; this means that if the new object is a ''pen,'' the first concept class would be Pen. Next, the following three hypernyms of the semantic relation will be added sequentially based on the hierarchy.
Finally, the creation of new classes stops when (1) the concept already exists in the ontological knowledge or when (2) four class concepts have been created. Figs. 8 and 9 show an example of these two cases, respectively.
One problem arises in the second case, when four classes are created without finding an existing class. In this case, the four classes will not have any preceding class and would be forcibly linked to the root ontology concept. A new concept class linked to the root ontology will not connect with any other concept unless explicitly specified.
In the proposed concept learning scenario, the robot has to conceptualize new tangible objects available in the current environment. Therefore, the problem mentioned above is overcome by linking the last class created to the HumanScaleObject class, which best describes the possible objects the robot can learn (Fig. 9).
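The update procedure described above can be sketched as follows. This is an illustrative sketch: it represents the ontology as a dictionary mapping each class name to its superclass name, and the function and class names are our assumptions.

```python
def update_ontology(ontology, new_word, hypernyms, max_new=4,
                    fallback="HumanScaleObject"):
    """Insert the new concept class and walk up the hypernym chain, creating
    at most `max_new` classes. Stop at the first class that already exists;
    if none is found, link the last created class under `fallback`.
    `ontology` maps each class name to its superclass name (None for roots)."""
    chain = [new_word.capitalize()] + list(hypernyms)
    created, prev = [], None
    for cls in chain:
        if cls in ontology:              # reached an already-known class: link and stop
            if prev is not None:
                ontology[prev] = cls
            return created
        if len(created) == max_new:      # creation cap reached without an ancestor
            break
        ontology[cls] = None             # superclass filled in on the next step
        if prev is not None:
            ontology[prev] = cls
        created.append(cls)
        prev = cls
    ontology[prev] = fallback            # no known ancestor: use the fallback class
    return created

# Case 1 (Fig. 8): an existing class is found, so insertion stops there.
onto = {"HumanScaleObject": None, "WritingImplement": "HumanScaleObject"}
created = update_ontology(onto, "pen", ["WritingImplement", "Implement"])

# Case 2 (Fig. 9): no existing class within four creations; fallback is used.
onto2 = {"HumanScaleObject": None}
created2 = update_ontology(onto2, "durian",
                           ["Fruit", "Produce", "Food", "Solid", "Matter"])
```

After the first call, Pen hangs under the existing WritingImplement class; after the second, the four new classes end at Food, which is linked to HumanScaleObject.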

VIII. EXPERIMENTS IN LEARNING A NEW CONCEPT
We conducted a set of experiments to show the applicability of the presented method for ontology learning of new concepts. Some essential factors are considered in these experiments:
• Ontological knowledge enables the robot to progressively and accurately expand its knowledge. It also allows the utilization of more linguistic variations for the same referent.
• Textual knowledge contributes to the understanding of the meanings of a new concept.
• Visual analysis significantly supports the robot in the selection of the meaning of the new object concept.
• Human-robot interaction creates a more natural learning process and assists the robot in the correct conceptualization of objects.

A. EXPERIMENT SETUP
The experiments consisted of two parts: concept learning experiments and experiments with an integrated robot system. In the concept learning experiments (Section VIII-B), the robot was asked to learn new objects as follows (Fig. 10):
1) The user showed an image of the new object and named it.
2) The robot conceptualized the new object using textual knowledge and visual analysis.
3) If more interaction with the user was required, the robot initiated it.
The concept learning experiments were divided into two tasks. In the first task, the robot learned one object, and the results were compared with a baseline method. In the second task, the robot learned eight different objects with specific characteristics chosen to challenge the robot.
After the robot had finished learning the new objects of the concept learning experiment, we conducted the integrated system experiments (Section VIII-C), where the robot was asked to search and find the new objects learned.

B. CONCEPT LEARNING EXPERIMENT
The first experiment consisted of the robot learning only one object concept, and it did not include interaction with the user. This experiment is a continuation of experiments performed in [7]. This experiment aimed to show the importance of visual information to assist in selecting the semantic relation corresponding to the new object concept.
In the first experiment, we taught the robot the concept of a ''pen.'' The image and name of the new object were provided to the robot. The robot started the learning process by acquiring the senses of the word ''pen.'' Then, it made an online image query for each hypernym of each sense. With this, the robot created sets of sample images for each hypernym of the semantic relations. Sample images from the online image query are shown in Table 1.
As a baseline, we used color histograms in the visual analysis process for feature extraction. Hence, color histograms for all sets of images were created. Subsequently, the robot computed the cosine similarity between the histograms of each image searched online and the new object image. Finally, the average similarity was calculated for the set of images of each semantic relation.
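The baseline can be sketched as follows: a coarse RGB histogram serves as the feature vector, and cosine similarity compares histograms. This is a minimal illustration with synthetic single-color ''images''; the bin count and the pixel-list image representation are our assumptions.

```python
import math

def color_histogram(pixels, bins=4):
    """Flattened per-channel RGB histogram of an image given as a list of
    (r, g, b) tuples with values in 0-255."""
    hist = [0] * (bins * 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[r // step] += 1            # red channel bins
        hist[bins + g // step] += 1     # green channel bins
        hist[2 * bins + b // step] += 1 # blue channel bins
    return hist

def cosine(u, v):
    """Cosine similarity between two histograms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) *
                  math.sqrt(sum(b * b for b in v)))

# Synthetic 8-pixel "images": one mostly red, one mostly blue.
red = [(250, 5, 5)] * 8
blue = [(5, 5, 250)] * 8
sim_same = cosine(color_histogram(red), color_histogram(red))
sim_diff = cosine(color_histogram(red), color_histogram(blue))
```

As in the proposed method, the per-image similarities are then averaged over each semantic relation's image set.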
The similarity score was assigned to each of the senses. The sense with the highest similarity score was selected to describe the new concept in the ontology. Table 2 shows the similarity scores calculated for each sense of the object ''pen'' using the baseline and our method. Sense 1 has the highest similarity score in both methods; therefore, the robot conceptualized it in the ontology.
The robot added the new semantic relations starting from the first hypernym of the chosen sense. The semantic relations were created on the basis of the sense description only. The new concepts are linked to a general concept in the ontology when the ontological knowledge and the sense have no concepts in common.
Based on these experiments, we confirmed the importance of selecting the best sense describing the new object in the concept learning process. In addition, the correct placement of the new concepts in the ontology hierarchy is crucial for ensuring that the overall ontological knowledge is consistent.
The second experiment consisted of showing the robot eight different objects, which it had to conceptualize. In this experiment, the robot used the proposed method for image analysis and user interaction. The objects were chosen to demonstrate the challenges of a concept learning scenario involving real objects and concepts.
In the visual analysis process, the robot used the pretrained network to extract image features. This time, the set of objects included the following types:
• The image results of the different senses are visually similar.
• Images of the correct sense are significantly different.
• Images of the correct sense appear in the results of another sense.
• Very few images of the correct sense appear in the results.
• Images of the correct sense appear in the results of all senses.
• The image results are visually different for all senses.
• The correct semantic relation contains common concepts with the ontology.
• The correct semantic relation does not contain any common concepts with the ontology.
Fig. 11 shows the set of objects used for this second concept learning experiment: a drill, durian, nail, pitcher, sponge, trunk, washer, and wrench. Each of these objects corresponds to at least one of the types listed above.
During the experiment, the robot received the image of each object and its name sequentially. Next, it performed the word meaning identification process, acquiring the corresponding senses and semantic relations, as explained in Section IV. Then, it continued with the visual analysis process. In the online image query method, the robot created two datasets: a small dataset of 10 images and a large dataset of 50 images. Next, it extracted features using the pretrained network and computed the similarity scores for each semantic relation, as explained in Section V. Table 3 illustrates the results of the similarity scores assigned to each sense of the set of objects for the small and large datasets. The senses marked in blue represent the true meaning of the new object concepts to learn. In addition, the highest similarity score for each sense per new object is marked in bold; this corresponds to the sense chosen by the robot.
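The scoring step can be illustrated with a small sketch. The exact similarity measure belongs to Section V of the paper; here, the averaged cosine similarity between the new object's feature vector and the features of each sense's retrieved images is an assumption used for illustration:

```python
import numpy as np

def sense_scores(query_feat, sense_feats):
    """Score each sense by the mean cosine similarity between the new
    object's feature vector and the feature vectors extracted (by the
    pretrained network) from the images retrieved for that sense."""
    q = np.asarray(query_feat, dtype=float)
    q = q / np.linalg.norm(q)
    scores = {}
    for sense, feats in sense_feats.items():
        f = np.asarray(feats, dtype=float)
        f = f / np.linalg.norm(f, axis=1, keepdims=True)  # row-wise unit norm
        scores[sense] = float(np.mean(f @ q))
    return scores
```

The sense with the highest score is kept; averaging over more retrieved images per sense (as in the 50-image dataset) makes the score less sensitive to individual off-topic image results.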
Based on these results, we can see that the robot correctly chose the desired sense for all objects in both datasets. However, the similarity scores were not as high as expected, owing to the variety of object types included in this experiment.
The chosen senses with low similarity scores belong to the objects drill and nail. The similarity score was significantly affected by the results of the online image query, in which the images did not fully represent the expected meaning. In both cases, images similar to the new object appear only in the results of one hypernym of the chosen sense, as shown in Fig. 12.
In the case of the objects durian, sponge, and wrench, the two highest similarity scores are close for the small dataset. These results were caused by images of the correct sense appearing in the results of other senses, which increased the scores of the incorrect senses. However, we believe that having several samples for each sense according to its hypernyms adds more object variations, which helps in the sense selection. This is evident in the results for sponge and wrench in the large dataset, where the two highest similarity scores are farther apart than the scores for the same objects in the small dataset. A challenging case was observed for the object wrench in the small dataset. In these results, images similar to the target object were found for all senses (Fig. 13). This case corresponds to the third situation mentioned in Section VI, where the second-highest similarity score is very close to the highest score. Therefore, the robot used the text-based interaction to confirm the meaning of this new object, wrench. It created an alternative question using the first hypernyms of the two candidate senses: ''Is it a spanner or a twist?'' With this question, the user helped the robot confirm the correct meaning. The robot then conceptualized the new object successfully.
Sponge is another object in the small dataset with two close similarity scores. However, the robot did not prompt a question for this concept because of the threshold value. Setting a higher threshold might cause unnecessary interaction with the user; therefore, an appropriate threshold calculation must be investigated in the future.
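The decision of when to fall back on text-based interaction can be sketched as below. The threshold value and function names are assumptions for illustration; the question format follows the wrench example above:

```python
def ambiguous_pair(scores, threshold=0.05):
    """Return the two top-ranked senses when their scores are within
    `threshold` of each other, otherwise None (threshold is assumed)."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    (best, v1), (second, v2) = ranked[0], ranked[1]
    return (best, second) if (v1 - v2) < threshold else None

def alternative_question(hypernym_a, hypernym_b):
    """Phrase the alternative question from the first hypernyms of the
    two candidate senses, as in ''Is it a spanner or a twist?''"""
    return f"Is it a {hypernym_a} or a {hypernym_b}?"
```

A too-generous threshold triggers questions even for cases such as sponge, where the visual analysis alone already picks the right sense; a too-strict one risks silently conceptualizing the wrong sense.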

C. INTEGRATED SYSTEM EXPERIMENT
The last experiment consisted of the robot searching for and finding the new objects in the current environment once it had completed the concept learning process. The user challenged the robot's ability to deduce the requested object and infer its location according to its knowledge.
In this last experiment, we demonstrated the advantage of having ontological knowledge when conceptualizing new objects, such as referring to the new object differently according to its newly connected classes and making inferences about the new objects using possibly inherited attributes.
The robot test included the following specifications to examine the integrated system after concept learning:
• The objects are requested by their names or the names of their classes.
• When class names are used, they are as follows: a directly stated class name, an inherited class name, and a possibly linked class.
• Object concepts that are expected to inherit attributes, such as possible locations, are requested.
• The integrated ontology-based knowledge management system with verbal interaction and concept learning was evaluated by its success in solving the object search experiment.
To provide the robot with the skills to complete this challenge, we joined the concept learning components proposed in this paper (see Fig. 2) and the ontology-based knowledge management system for home service robots presented in [22]. The integrated system consists of command analysis, verbal interaction, task planning, and execution modules, with knowledge management connected to the concept learning components.
We conducted the experiments in the simulated environment used in [22], which includes a four-room house with static furniture and graspable objects commonly found in home settings. We placed boxes at different locations inside the living room, bedroom, lobby, and kitchen, as shown in Fig. 14. Each box had one of the images of the new objects that were used during the learning process.
The new objects were taught to the robot sequentially during the concept learning. Then, the robot was given different kinds of commands to find the newly acquired objects.
The conceptualization of the new concepts was tested using the following natural language commands, which evaluated the correct association of the concepts and the inheritance of features in the ontology:
1) Find the wrench.
2) Find the durian.
3) Find the sponge on the kitchen cabinet.
4) Find the nail in the bedroom.
5) Find the seal (washer).
6) Find the vessel (pitcher).
7) Find the luggage on the bed (trunk).
8) Find the tool in the lobby (drill).
The first four commands request the new objects by their names. The first two (''Find the wrench'' and ''Find the durian'') do not include a location to search for the object. The third command (''Find the sponge on the kitchen cabinet'') includes the name of the furniture. The fourth command (''Find the nail in the bedroom'') includes the room's name.
The last four commands request the new objects using the name of an upper class, and two of them do not include a location to search for the objects. In the fifth command (''Find the seal''), the class name used refers to the object ''washer,'' and in the sixth command (''Find the vessel''), it refers to the object ''pitcher.'' The seventh command (''Find the luggage on the bed'') includes the furniture on which the object is located, and its class name refers to the object ''trunk.'' Finally, the eighth command (''Find the tool in the lobby'') includes the room's name, and its class name refers to the ''drill.'' In addition, the second and sixth commands (''Find the durian'' and ''Find the vessel'') involve objects that are expected to inherit, from the ontological knowledge, a location where they are commonly found. A particular case is considered in the eighth command (''Find the tool in the lobby''), where more than one object belonging to the required class exists in the environment. A summary of the characteristics of each command can be found in Table 4.
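The way a command's noun is resolved against the learned hierarchy can be sketched as follows. The dictionary representation (class mapped to superclass, instance mapped to class) and the helper names are assumptions, not the system's actual knowledge representation:

```python
def ancestors(cls, parents):
    """Walk upward through the class hierarchy (parents: class -> superclass)."""
    chain = []
    while cls in parents:
        cls = parents[cls]
        chain.append(cls)
    return chain

def resolve_request(name, parents, instances):
    """Return the object instances whose class is `name` itself or
    inherits from `name` (instances: instance -> class)."""
    return [obj for obj, cls in instances.items()
            if cls == name or name in ancestors(cls, parents)]
```

With such a lookup, ''Find the vessel'' resolves to the pitcher instance through the class hierarchy, while ''Find the tool'' can match several instances, which is the situation that triggers a clarification question in the eighth command.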
During the experiment, the robot successfully completed all the commands using the integrated system. The robot used its ontological knowledge and interacted with the user when necessary. We analyzed the robot's process for each command to evaluate the use of its ontological knowledge after the learning process; the results can be found in Table 5. The robot found object instances matching the description when the objects were requested either by their names or by their classes. Furthermore, the robot found a suitable location instance to search for the object when its ontological knowledge had an inherited location or when the command provided one.
A particular case was observed when the commands included any object belonging to the Tool class, namely ''wrench,'' ''nail,'' ''washer,'' and ''drill.'' This class does not include an inherited location that could be inferred from the robot's knowledge; therefore, the robot could not find a suitable location. However, since the integrated system includes human-robot interaction, the robot could ask the user for an object's location. Another case occurred in the last command (''Find the tool in the lobby''), where the robot found more than one object that could belong to the Tool class, and it asked the user for the specific object.
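The fallback behavior described above can be sketched as a lookup over the inheritance chain; the names and the dictionary representation are illustrative assumptions:

```python
def find_search_location(obj_cls, parents, known_locations, ask_user):
    """Return an inherited location if any class along the hierarchy has
    one (known_locations: class -> location); otherwise ask the user,
    which is the expected service-robot behavior."""
    cls = obj_cls
    while cls is not None:
        if cls in known_locations:
            return known_locations[cls]
        cls = parents.get(cls)  # move up to the superclass, or None at the root
    return ask_user(obj_cls)
```

For the durian, a location such as the kitchen can be inherited through its fruit-related superclasses; for the Tool-class objects, no class on the chain carries a location, so the interaction path is taken.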
We can conclude that the new objects learned by the robot were accurately conceptualized in the ontological knowledge, considering that the robot could find the possible instances of the requested objects. Furthermore, the new concepts learned for each new object were linked to the proper hierarchy, since the robot found inherited properties such as the location.

TABLE 5. Experimental results using the integrated system. The robot found the instances of the requested objects using its knowledge or by asking the user. The locations were acquired using its knowledge, from the command, or by asking the user.
In cases where the ontological knowledge was insufficient, the system used human-robot interaction to acquire the information, which is the expected behavior of a service robot.

IX. CONCLUSION AND FUTURE WORK
This study described the ontology learning of new concepts combining textual knowledge, visual analysis, and user interaction for service robot applications. We proposed analyzing the semantic relations and visual features of a new object concept to determine its correct conceptualization in an ontology. In addition, we employed human-robot interaction to assist the robot when necessary.
We tested the proposed concept learning method in a scenario where the robot had to conceptualize new objects given only the image and the name of each object. In addition, we demonstrated some challenges to consider in concept learning scenarios. Future work includes improving the input query formulation for the online image query process to obtain image results that more accurately match each sense. This improvement could be achieved by further analyzing the meanings and semantic relations of a concept, since more information defining each sense can be obtained.
Furthermore, the interaction with users during the word meaning selection process could be improved by including more variations of the questions formulated by the robot, such as inquiring about the new object and mentioning its possible applications. Improved verbal interaction would also be helpful during the ontology update process, as there is a possibility of concept inconsistency. Ontology inconsistencies occur when a concept is not adequately defined, for example, when equivalent classes, disjoint classes, or other properties are missing. Hence, a more extensive verbal interaction would be necessary.