Remote Sensing Image Interpretation With Semantic Graph-Based Methods: A Survey

With the significant improvements in Earth observation (EO) technologies, remote sensing (RS) data exhibit the typical characteristics of Big Data. Propelled by the powerful feature extraction capabilities of intelligent algorithms, RS image interpretation has drawn remarkable attention and achieved progress. However, the semantic relationship and domain knowledge hidden in massive RS images have not been fully exploited. To the best of our knowledge, a comprehensive review of recent achievements regarding semantic graph-based methods for comprehension and interpretation of RS images is still lacking. Specifically, this article discusses the main challenges of RS image interpretation and presents a systematic survey of typical semantic graph-based methodologies for RS knowledge representation and understanding, including the Ontology Model, Geo-Information Tupu, and Semantic Knowledge Graph. Furthermore, we categorize and summarize how the existing technologies address different challenges in RS image interpretation based on semantic graph-based methods, which indicates that the semantic information about potential relationships and prior knowledge of variant RS targets are central to the solution. In addition, a case study of RS geological interpretation based on the semantic knowledge graph is demonstrated to show the promising capability of intelligent RS image interpretation. Finally, the future directions are discussed for further research.

The rate at which Earth observation (EO) data are being generated by remote sensing (RS) platforms exceeds the rate at which these data can be explored and analyzed [1]. Recent advances in RS and computer technologies have enabled the explosive growth of RS data [2]. The RS data captured by these platforms offer great potential for understanding numerous natural phenomena and have attracted considerable interest in government projects, commercial applications, and academic fields. However, the unprecedented proliferation of RS Big Data poses significant challenges for their management, processing, and interpretation [3].
EO data clearly show the characteristics of Big Data [4], including large Volume (characterized by their increasing scale and bulk) [5], wide Variety (embodied in multisource, multitemporal, and multiresolution data) [6], high Velocity (including the rapid growth rate of data generation and the efficiency of data processing) [7], uncertain Veracity (represented as the inconsistency, incompleteness, ambiguity, and deviation in the RS model) [8], and high Value (reflected in the immense value contained in massive geospatial information) [9]. This gives particular urgency to the full utilization of ever-increasing RS images for intelligent EO [10], so a comprehensive understanding of large-scale and complex RS images is extremely important.
The large Volume, wide Variety, and high Velocity of RS Big Data pose challenges for conventional data handling methods and technologies, while the uncertain Veracity and high Value of RS Big Data present opportunities for novel ideologies and new techniques. Large-scale data integration and management technologies [11] have solved some of the Volume and Variety challenges in RS Big Data [12], but these techniques cannot process the data efficiently or analyze the associations among data deeply enough to fully explore and discover their value [13]. Moreover, high-performance computing architectures and hardware facilities [14], [15] may cope with data Velocity [16], but they address neither the Value nor the issues of uncertainty and incompleteness in RS Big Data [17].
Most of the above methods rely on the statistical analysis of data elements to deal with Big Data. However, semantic analysis, including data mining and knowledge reasoning [18], is difficult to achieve with these methods, and the potential knowledge implied in massive RS images is still underutilized [19]. The value of RS Big Data lies in the valuable knowledge behind the RS data generated from the multitemporal [20], multilevel [21], and multifaceted [22] reflection of the earth's surface. Therefore, how to mine the relevant knowledge efficiently and intelligently from RS Big Data remains a topic of interest. Specifically, efforts to address the Veracity and achieve the Value include knowledge discovery and value exploration based on the semantic model and intelligent algorithms.
The RS image has some particular features [8], such as nonrepeatability, high correlation, multidimensionality, and spatial-temporality. These unique characteristics result in decisional complexity and indeterminate compatibility in RS data analysis, so handling RS Big Data depends not only on data management and processing technologies but also on intelligent analysis and knowledge inference techniques. Because the rich details in the RS image pose challenges to the interpretation of image contents, the information extraction and analysis of RS images often rely heavily on visual interpretation and manual treatment by domain experts. This limitation highlights the increasing and urgent need for automated and intelligent interpretation of the rapidly expanding volume of RS images.
However, a thorough survey of semantic graph-based methods for RS image interpretation is still lacking. This motivates us to deeply analyze the main challenges faced for RS image interpretation, systematically review the semantic graph-based knowledge representation methods, summarize the semantic graph-based techniques for RS image semantic analysis and comprehension, and discuss the future directions of RS image interpretation based on the semantic knowledge graph. According to the research on semantic graph-based method in coping with RS image interpretation, the overall structure of the survey is illustrated in Fig. 1.
The rest of this article is organized as follows. Section II analyzes the current challenges of RS image interpretation, and Section III summarizes the typical methodologies for RS image interpretation and knowledge representation. Section IV then systematically reviews the techniques and applications of semantic graph-based methods for RS image analysis. Section V presents a preliminary case study of geological RS interpretation to show the potential value of the semantic knowledge graph. Section VI discusses the promising future directions of RS image interpretation with the semantic knowledge graph. Finally, Section VII concludes this article.

II. MAIN CHALLENGES OF RS IMAGE INTERPRETATION
Image interpretation can be conceptualized as a visual problem-solving activity and is a form of high-level image understanding, which explores the relationships between various objects in the image and recognizes the image content based on domain knowledge and experts' experience [23]. RS image interpretation is the procedure of semantic recognition and information extraction from RS images. With the rapid development of EO technology, the increasing volume and the ever-richer detail of RS images pose great challenges to the interpretation of semantic content. These challenges motivate an increasing and stringent demand for automatic and intelligent interpretation of the rapidly growing volume of RS images. From the perspective of knowledge engineering, the difficulties and challenges of RS image interpretation can be summarized as follows.
1) The formalization and representation of prior knowledge from domain experts. The first challenge lies in the representation and the integration of the domain knowledge at different levels during the semantic interpretation process of RS images. Background knowledge (in the form of logical theories) is indispensable for RS image interpretation, and contextual knowledge is generally related to the scene, objects, and the relationships between these objects in RS images. The crucial problem, defined as the semantic gap, lies in a lack of concordance between the low-level image features and the high-level semantic meanings [24]. Integrating prior and contextual knowledge into the semantic interpretation process allows connecting different content levels to narrow the semantic gap, but analysts may find it more difficult to verbalize the knowledge that they use to make judgments than to describe the visual cues. Benefiting from the increasing availability of EO data and various knowledge bases, the developed methods have reported promising performance in the construction of large-scale knowledge bases for the interpretation of RS image content [25]. However, very few empirical studies have been performed to describe the interpretation process, and most of what is presented in the RS literature has come from personal experience.
2) The annotation and extraction of multilevel semantics from contextual RS images. The second challenge lies in the semantic extraction and classification of different levels of contexts, ranging from pixel-level spectral character extraction, object-level structural component analysis, and scene-level content recognition to the global semantic understanding of RS images. The analysis of RS image content can be performed at different levels of granularity, which has to tackle the difficulty of characterizing complex contents and the high human labor cost of preparing a large number of training examples with high-quality pixel-level labels in the fully supervised annotation method [26]. Moreover, due to individuals' different cognitions of spatial information in their respective fields, their observation and description of the same geographical phenomenon may focus on different aspects of the object, which leads to different views and semantic heterogeneity [27]. Up to this point, the explored approaches to integrating semantics with images mostly relate to feature segmentation with semantic labels, object recognition with semantic concepts, relationship detection with semantic analysis, and scene understanding with semantic interpretation. However, the comprehensive understanding of RS images needs a more effective method based on the combination of numeric and symbolic features, which can provide a hybrid combination of the semantic graph model and the statistical analysis algorithm.
3) The discovery and utilization of various relationships from knowledge derivation. The third challenge lies in the comprehensive integration and utilization of complex semantic relationships (including different qualities of uncertainty, relevance, reliability, and completeness) for RS image understanding and interpretation. Knowledge-based systems have proved effective for complex object recognition and image analysis, especially for understanding the varied relationships among different EO data [28]. The content characterization and semantic understanding of RS images include both the identification of objects in the image and the discovery of relationships among them. Moreover, knowledge discovery and inference can support associative retrieval and latent relation mining, and the semantic graph method [29] shows great value and potential, since it highlights semantic data enhancement and reveals things and their relationships in the form of semantic graphs. However, given the complexity and heterogeneity of relationships in RS images, research concerning the reasoning processes of expert image interpreters is relatively sparse because it is difficult to explicate the implicit mental processes that become ingrained through the development of expertise.

III. SEMANTIC GRAPH-BASED METHODOLOGIES FOR RS KNOWLEDGE REPRESENTATION
In the era of Big Data, researchers face the cruel irony of having plenty of data, but insufficient knowledge [30]. The exploitation of EO data is generally recognized, but due to the lack of inherent semantics, such data cannot be transformed into information directly. Generally, EO data need some interpretation and rely on a comprehensive knowledge base of value-chain analysis [31]. Therefore, knowledge representation and inference of Big Data are crucial issues in the research field and a promising solution for Big Data analysis.
In most cases, information mining and knowledge discovery employ semantic representation and analysis techniques based on domain-specific knowledge. In the data analysis and knowledge mining field, many efforts have investigated ontologies, vocabularies, and schemas that cover different aspects of the domain [32]. Ontology provides a foundation for the unambiguous, logically consistent, and formal representation of domain knowledge. The advantages of ontology for knowledge representation have been discussed in many data-driven and knowledge-driven applications [33]. A knowledge base similar to Ontology in the Semantic Web should be recognized as a shallow semantic graph, describing semantics based on a semantic network of concepts (terms) and semantic links [34]. In contrast, knowledge discovery methods typified by the artificial neural network, as well as the extended convolutional neural network (CNN) [35], recurrent neural network [36], and evolutionary neural network [37], could be regarded as deep (potential or implicit) semantic mining [38] based on neural networks [39].
The semantic graph is a well-established data structure, which has been used to model relationships among different entities within a domain. This section makes the investigation on semantic graph-based methodology for RS Big Data and summarizes the typical methods and models for RS information analysis and knowledge representation, including the ontology model, Geo-Information Tupu, and semantic knowledge graph.

A. Ontology Model Based on Conceptualization
EO and RS both take Geography as their theoretical foundation. Geography is a practical and functional science, which attempts to comprehend the specific reality of the Earth. However, Ontology is an ideal and abstract methodology, which concentrates on the meanings of concepts but hardly at all on the particular circumstances of reality [40]. Geography is intended to represent the physical structure of the actual world, while Ontology focuses on the conceptual structure of the epistemic world [41]. Therefore, geographic ontology (Geo-ontology) is an explicit and formal specification of the shared conceptual model in the geographic field [42], as shown in Fig. 2.
Geo-ontology includes the meanings of both philosophical ontology (a branch of philosophy that focuses on the natural organization of reality) and IS-ontology (a formal vocabulary that is an explicit specification of conceptualization) [43]. Philosophical ontology is often used to build the structure of geospatial concepts, terms, and relations, especially the spatiotemporal objects, granularity, and mereotopology. However, IS-ontology is often used in geospatial information sharing, integration, and interoperation.
Geo-ontology can be recognized as the semantic cognition of geospatial information, the extension of geographical representation, and the interoperability of geospatial information with different scales and uncertainty [44]. Based on the ontology model, domain information can be organized into a reasonable semantic system from the perspective of knowledge cognition and representation, which contributes to efficient information processing and sharing. Ontology is a real paradigm shift in geosciences and it helps to solve many problems in spatial information sharing [45], geographic image interpretation [46], geo-resource semantic interoperation [47], and so on. Moreover, it improves our understanding of geographic space and geosciences knowledge. However, an ontology knowledge base is difficult to construct and maintain, and its rigorous logical relationships make it hard to scale to large-scale knowledge management. In practice, researchers adopt a more flexible and efficient graph-based method to implement domain knowledge description.

B. Geo-Information Tupu Based on Cartography
Geo-Information Tupu is the combination of map and spectrum. The map mainly refers to the form of spatial information map (refers to the Holographic Map especially), which contains the basic properties of EO targets, and has the fundamental characteristics of visualization, geographical orientation, and geometric accuracy. The spectrum is a systematic arrangement of many similar objects or phenomena (reflects the principle of Geography generally), which is established systematically according to the characteristics of spectrum distribution and spatial-temporal resolution. Geo-Information Tupu reflects and reveals the characteristics of the spatial structure of things and phenomena as well as the law of dynamic change of time and space.
Geo-Information Tupu employs the thinking mode of the spectrum, which further develops the capability of quantitative analysis and simulation calculation of geological data [48]. In Geo-sciences, Geo-Information Tupu is an effective method to understand and represent complex geographical phenomena, and also it can promote the summary, expression, and forecasting of geographical trends, as shown in Fig. 3. Among variant applications of Geo-Information Tupu, the generalized symmetric structure Tupu and the hierarchical structure Tupu are the most adopted forms [49]. Geo-Information Tupu integrates the simplicity of the landscape comprehensive map and the abstraction of the mathematical model, which is the integration and unification of cognition, method, and dynamic map. It is a methodology to display the spatial morphological structure and temporal and spatial change law of the earth system. Geo-Information Tupu can not only invert the past geographical things but also predict some geographical phenomena in the future. In practice, some implementations have been carried out on various explorations, such as urban form [50], land utilization [51], landscape pattern [52], and topography [53]. Geo-Information Tupu has dual advantages of the comprehensive map and mathematical model, and it contributes to the inversion of the past, the evaluation of the current, and the prediction of the future [54]. However, due to its dependence on geographical maps, Geo-Information Tupu still has limitations in data computability, fusion, and interaction. The efficient organization of EO data and representation of RS knowledge need to be investigated further from the spatial-temporal relation dimension.

C. Semantic Knowledge Graph Based on Semantic Network
The semantic knowledge graph is a novel knowledge representation and discovery model, which can dynamically describe and discover various relationships among any combination of entities in a specific domain. It achieves this by materializing nodes and edges in an intuitive graphical representation of a specific knowledge domain [55]. Compared with the two methods above, this type of data organization and knowledge representation can effectively combine the advantages of expressive capability and efficient computability. The design and construction of semantic knowledge graphs in geosciences can enable a highly compressed graph representation (geospatial objects can be organized together based on their spatial-temporal relationship and semantic linkage), which contributes to describing and traversing any existing relationship within a knowledge domain [56], as shown in Fig. 4. Geospatial data have two important roles: serving as the geospatial reference and providing a source of geographic knowledge. However, mining or extracting knowledge from geospatial data has not been well explored, and few geospatial knowledge services are provided to common users. More efforts need to be devoted to moving from traditional geospatial data/information services to knowledge services [57]. The Geographic Knowledge Graph links geographical entities based on geographic features, spectrum characteristics, and spatial-temporal relationships for semantic retrieval of geographic information instead of traditional gazetteers. These capabilities can ensure semantic interoperability and are conducive to the semantic integration and linkage of multisource EO data [58].
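The node-and-edge organization described above can be sketched with a small in-memory triple store. This is only an illustrative sketch, not the data model of any surveyed system: the class name `TripleGraph`, the entity names (`Lake_A`, `Basin_B`, and so on), and the relation names (`located_in`, `adjacent_to`, `observed_by`) are hypothetical and chosen purely for demonstration.

```python
from collections import defaultdict


class TripleGraph:
    """Tiny in-memory knowledge graph of (subject, relation, object) triples."""

    def __init__(self):
        # subject -> set of (relation, object) pairs
        self._out = defaultdict(set)

    def add(self, subj, rel, obj):
        self._out[subj].add((rel, obj))

    def neighbors(self, subj, rel=None):
        """Objects linked from subj, optionally restricted to one relation."""
        pairs = self._out.get(subj, ())
        return sorted(o for r, o in pairs if rel is None or r == rel)

    def reachable(self, start, rel):
        """Entities reachable from start by following rel edges transitively."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.neighbors(node, rel):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return sorted(seen)


# Hypothetical geographic entities and relations (assumptions for illustration).
kg = TripleGraph()
kg.add("Lake_A", "located_in", "Basin_B")
kg.add("Basin_B", "located_in", "Province_C")
kg.add("Lake_A", "adjacent_to", "Wetland_D")
kg.add("Lake_A", "observed_by", "Scene_001")

# Traversal answers a question no single stored triple states directly.
print(kg.reachable("Lake_A", "located_in"))
```

Following `located_in` transitively links the lake to the province even though that fact is never stored, which is a simple instance of the relationship traversal and semantic linkage that the paragraph attributes to knowledge graphs.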
Naturally, geographic knowledge representation is an expression of human cognition of the actual world, which is of great significance to the storage and processing of geospatial information. Especially in coping with Big Data, a well-established geographic knowledge representation can benefit various geographic information systems, because formalization is the foundation of geographic data analysis, processing, and visualization. This kind of semantic knowledge graph method has various applications, such as enhancing semantic retrieval by linking search queries to correlative domain concepts and terms [59], discovering trending topics based on time series data [60], building a content-based recommendation engine [61], and performing semantic search interpretation and retrieval expansion [62]. However, some outstanding issues still need to be investigated and explored further, such as the automated construction of domain-specific knowledge graphs, the efficient retrieval of large-scale knowledge graphs, and the effective application of valuable knowledge graphs.
In summary, this section presents three typical semantic graph-based methodologies for RS knowledge representation from different aspects and mechanisms. Each method offers specific advantages and is widely applied in various situations. From the comparison in Table I, we can see that the Ontology model is suitable for the conceptualized representation of specific domain knowledge, and the Geo-Information Tupu is good at expressing all kinds of geographical entities and phenomena on spatial-temporal coordinates, while the semantic knowledge graph has better semantic retrieval and knowledge discovery capabilities across varied semantic relationships, as illustrated in Fig. 5. The semantic knowledge graph is more promising for the comprehensive representation of all kinds of spatial, temporal, spectral, and radiometric relations in RS images, and thus has better potential for exploring all types of latent and valuable principles in RS Big Data.
Ontology is a traditional and mature knowledge organization method in the knowledge engineering field, which mainly relies on the experience of domain experts to build a knowledge system of terms and concepts. Moreover, Geo-Information Tupu is a domain-specific knowledge cognitive method in the Geosciences field, which organizes all kinds of EO data based on the space-time coordinate system. Furthermore, the knowledge graph is a powerful and promising knowledge representation method in the Big Data era, which is the large-scale expansion and extension of Ontology and focuses on semantic retrieval and knowledge inference based on graph mining and statistical analysis. In an objective sense, the Knowledge Graph is not an entirely new technology but a novel and comprehensive solution of Big Data. In addition, the significance and value of the Knowledge Graph lie not only in the method of knowledge representation but also in the further ability to discover knowledge based on semantic relationships.

IV. SEMANTIC GRAPH-BASED TECHNIQUES AND APPLICATIONS FOR RS IMAGE ANALYSIS
The diversity and high dimensionality of contemporary RS data are capable of yielding insights and intelligence not possible in previous decades. The volume of RS data continues to multiply exponentially due to the launch of new EO platforms with multispectral or hyper-spectral sensors. Besides its huge volume, RS data also has significant heterogeneity because of cognitive diversity and technological differences [63]. Heterogeneity in RS Big Data creates the problem of knowledge insufficiency because it limits semantic comprehension among heterogeneous data and between different organizations.
Semantic technology is adopted to describe the connection between data by using shared knowledge descriptions with explicit meaning expressed in the semantic system [64]. This type of description allows one to traverse large-scale data conveniently and effectively within or outside organizations. Such a technique also contributes to the multisource integration of varied and heterogeneous information [65]. The semantic relation graph can support data understanding and interpretation, information semantic integration and interoperability, and collaborative management of data services and shared platforms [66].
According to the implementation of the semantic graph method in the RS field, the semantic graph-based techniques can be divided into four aspects due to different levels of semantic representation and understanding, as illustrated in Fig. 6. This section focuses on the semantic graph-based techniques for RS data analysis and image interpretation in four applications, including semantic graph-based RS image annotation, image classification, scene understanding, and image interpretation.

A. Semantic Graph-Based RS Image Annotation
In the case of RS image analysis and retrieval, semantic annotation can narrow the semantic gap between visual feature properties and semantic meaning interpretation. However, most semantic annotation methods represent the image as a series of keywords or visual terms and hardly consider the spatial distribution of the regions or any prior knowledge of the objects in the scene. In contrast, the graph-based method enables RS image annotation that combines prior knowledge of context and spatial relations with low-level visual features to provide more faithful representations of images, as shown in Fig. 7.
Semantic annotation plays a foundational role in the comprehension and application of RS images and has received increasing interest. Based on domain ontology, Mohamed Farah et al. [67] presented a graph-based method for RS image semantic annotation, which tried to simultaneously process all available information of the image and develop an annotation procedure to generate graphs for representing objects and their spatial relations in the studied scene. Besides, in order to give a formalized and reasonable representation of the relationships between different regions and related labels in the RS image, Khitem Amiri et al. [68] proposed a semantic annotation approach based on region adjacency graphs to produce a concept graph, which could represent the objects in the scene by using spatial and spectrum attributes. However, due to the limitation of conceptualized knowledge, these methods relied on prior knowledge, and their performances are not ideal.
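As a rough illustration of the region adjacency graph idea mentioned above, the following sketch derives adjacency edges from a small segmentation label grid and turns them into concept-level triples. It is not the procedure of [67] or [68]; the function name `region_adjacency`, the toy label grid, and the concept names are assumptions made for this example, and only 4-connected adjacency is considered.

```python
def region_adjacency(labels):
    """Return the set of unordered region-id pairs sharing a 4-connected border."""
    edges = set()
    rows, cols = len(labels), len(labels[0])
    for r in range(rows):
        for c in range(cols):
            a = labels[r][c]
            # Look right and down only, so each neighboring pixel pair is visited once.
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    b = labels[rr][cc]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    return edges


# Toy segmentation result: each cell holds a region id (hypothetical data).
segments = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 3, 2],
]

# Hypothetical mapping from region ids to semantic concepts.
concepts = {1: "water", 2: "forest", 3: "road"}

# Concept graph: one "adjacent_to" triple per pair of touching regions.
concept_graph = sorted(
    (concepts[a], "adjacent_to", concepts[b])
    for a, b in region_adjacency(segments)
)
for triple in concept_graph:
    print(triple)
```

A fuller annotation pipeline would attach spectral and geometric attributes to each node and use richer spatial relations than plain adjacency; the sketch only shows how a segmentation map becomes a graph of labeled regions.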
Furthermore, some more intelligent and efficient methods have been investigated based on deep feature learning. Xiwen Yao et al. [69] proposed a unified annotation framework by combining discriminative high-level feature learning and weakly supervised feature transferring, which tried to transfer the learned high-level features into the semantic annotation. Moreover, Panpan Zhu et al. [70] presented an end-to-end deep learning framework of multilabel annotation for object-level RS images, which used multiple labels as supervised information for annotation and focused on the similarity relations between different images grouped by semantic concepts at the scene level. However, because of the influence of training set in deep learning, these works ignored the label relationships between the scene level and the object level and failed to model the label relationships between the intra-level and inter-level semantic concepts of RS images.
Semantic annotation of images is achieved by adding labels in image descriptions to describe its content and reflect human understanding of images as much as possible, which has improved search capabilities, especially for ambiguous concepts. Recent research aims to enhance the annotation results based on prior knowledge and includes some solutions based on semantic graph methods. The domain knowledge graph based on Ontology can give intuitive representation and efficient acquisition of the relationships between different regions and related labels in the images [71], which can help to enrich and discover the semantics of the target image. As a great promoter of semantic annotation, the semantic graph-based annotation of RS images still has some potential improvements in the deeper annotation of images, the spatiotemporal efficiency of annotation, the large-scale annotation of hyper-spectral images, etc.

B. Semantic Graph-Based RS Image Classification
Classification or segmentation of RS images is a long-term research topic in the RS field, which has received more and more attention in recent years. The traditional pixel-wise classification does not treat the data as two-dimensional (2-D) images but as a series of unordered spectral signals, so the spatial correlation between pixels is simply not reflected. The fusion of spatial and spectral information is a hot topic in RS image classification research. In object detection and classification, using the context between categories/objects has been shown to improve accuracy because similar things often appear together in their natural environments, as shown in Fig. 8.
In the past several years, many research works have been investigated in RS image classification, while current research has focused on the transformation from the traditional 2-D pixel matrix to a more meaningful multidimensional feature space. Considering the converting of complex RS images into a graphical representation, Gilbert Rotich et al. [72] used semantic relationships among multiple regions of interest to obtain a semantic network labeled with the highest semantic consistency in a given image. Moreover, based on the feature recognition capability of CNNs, Xin Wang et al. [73] proposed an enhanced feature pyramid network to extract multilevel and multiscale feature maps and employed a deep semantic embedding to obtain more reliable features for RS scene classification. Furthermore, aiming at the solution of hyperspectral image classification, in the review about advanced spectral classifiers for hyperspectral images by Pedram Ghamisi et al. [74], some deep models have been proposed for hyperspectral data feature extraction and classification, which can progressively lead to more invariant and abstract features at higher layers. However, these works have ignored the comprehensive utilization of feature models with prior knowledge, which resulted in limited improvement of image classification.
Aware of the above problem, and combining the feature transformation model with the semantic analysis method, Kenneth Marino et al. [75] investigated the use of structured prior knowledge in the form of knowledge graphs as extra information to improve image classification, and tried to incorporate potentially large knowledge graphs into an end-to-end learning system in a way that remains computationally feasible for large graphs. Besides, Song Ouyang et al. [76] put forward a novel deep semantic segmentation network, which used object-level modeling to reduce pixel-level noise and the prior knowledge of spatial relations to improve the performance and robustness of the classifier. From the above, we can see that the prior knowledge of varied relationships is essential for high-precision and interpretable semantic segmentation of the RS image, and that a more effective and deeper fusion of domain knowledge with the feature model is still the key problem.
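The general idea of letting prior knowledge of relationships influence a classifier's decision can be illustrated with a deliberately simple rescoring step. This is not the network of [75] or [76]; it is a hand-written linear blend, and the function name `rescore`, the compatibility table, the blending weight, and all numeric values are hypothetical.

```python
def rescore(scores, neighbor_labels, compat, weight=0.5):
    """Blend classifier scores with a prior derived from neighboring objects' labels.

    scores          -- dict: candidate class -> classifier confidence
    neighbor_labels -- labels of already-recognized objects near the target
    compat          -- dict: (class, neighbor label) -> compatibility in [0, 1]
    weight          -- share of the final score given to the prior
    """
    adjusted = {}
    for cls, s in scores.items():
        if neighbor_labels:
            prior = sum(compat.get((cls, n), 0.0) for n in neighbor_labels)
            prior /= len(neighbor_labels)
        else:
            prior = 0.0
        adjusted[cls] = (1 - weight) * s + weight * prior
    total = sum(adjusted.values())
    # Normalize so the adjusted scores sum to one when possible.
    return {c: v / total for c, v in adjusted.items()} if total else adjusted


# Hypothetical classifier output for one ambiguous object.
scores = {"ship": 0.45, "car": 0.55}

# Hypothetical prior knowledge: how plausible each class is next to each context label.
compat = {
    ("ship", "water"): 0.9,
    ("ship", "harbor"): 0.8,
    ("car", "water"): 0.05,
    ("car", "harbor"): 0.3,
}

result = rescore(scores, ["water", "harbor"], compat)
print(max(result, key=result.get))
```

In this toy case the appearance-only scores slightly favor `car`, while the context prior shifts the decision to `ship`. Learned approaches replace the fixed compatibility table and the linear blend with trainable components, which is where the fusion problem noted above arises.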
RS image semantic segmentation is a fundamental task of geographic information interpretation and also the basis of other RS research and applications. Although it has received considerable attention in the last decade, high-resolution RS image semantic segmentation is still challenging because the complexity and heterogeneity of RS image structure may lead to inter-class similarity and intra-class variability. Therefore, further research on RS image classification with semantic graph-based methods needs to focus on the highly efficient use of spatial relationship information and domain prior knowledge, as well as the effective acquisition of expert knowledge for the automatic and intelligent interpretation of RS images.

C. Semantic Graph-Based RS Scene Understanding
Due to the semantic gap between the lower-level property and higher-level information of RS data, scene understanding is a challenging issue in the analysis of RS images. Current scene understanding methods mostly ignore the semantic relationships among variant spatial components. On account of the diversity of objects, the variability of properties, and the complexity of spatial layouts in RS images, scene understanding could be more effective if it could focus on recognizing objects and describing their relationships. Scene understanding provides the necessary semantic interpretation by semantic scene graphs, as shown in Fig. 9.
Most of the existing scene classification methods in RS image analysis can classify the scene preliminarily but ignore the detailed components and their spatial relations in RS images. To overcome this deficiency, Kuldeep R Kurte et al. [77] designed a resource description and information mining framework based on a spatial semantic graph, which explicitly modeled the topological and spatial relationships between different image areas. Besides, based on the performance advantages of the deep learning model in image classification, Gong Cheng et al. [78] proposed an effective method to learn discriminative CNN models for RS image scene classification, which trained the model and distinguished the scene classes based on within-class diversity and between-class similarity. However, in order to fully understand the components and their relations in the scene, Yanfei Zhong et al. [79] presented a bottom-up scene understanding framework based on a context relationship model of multiple spatial objects, which combined the symbiotic and positional relationships at the object level. Moreover, focusing on the contextual relations in messy indoor scenes, Wentong Liao et al. [80] proposed a framework for the automatic generation of semantic scene graphs, which deduced reasonable support relations based on physical constraints and prior knowledge of spatial relations among variant objects. Furthermore, considering the combination of deep learning and knowledge-based approaches, Abhishek V Potnis et al. [81] proposed a semantics-driven RS image understanding framework for describing the comprehensive spatial-contextual scene to enhance situation awareness, which used deep learning for multiclass segmentation and deductive reasoning for discovering implicit knowledge. These works all recognized the necessity of prior knowledge and tried to formally transform RS scenes into scene knowledge graphs.
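The idea of turning recognized objects into a scene graph can be sketched in a few lines. The example below is a minimal sketch and not the implementation of any of the cited frameworks: it assumes that objects have already been detected with axis-aligned bounding boxes, and it derives only two illustrative relations ("contains" and "overlaps"). The class `SceneObject`, the relation names, and the sample objects are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class SceneObject:
    name: str
    label: str
    bbox: tuple  # (xmin, ymin, xmax, ymax) in pixel coordinates


def contains(a, b):
    """True if the bounding box of a fully encloses the bounding box of b."""
    ax0, ay0, ax1, ay1 = a.bbox
    bx0, by0, bx1, by1 = b.bbox
    return ax0 <= bx0 and ay0 <= by0 and ax1 >= bx1 and ay1 >= by1


def overlaps(a, b):
    """True if the two bounding boxes share some interior area."""
    ax0, ay0, ax1, ay1 = a.bbox
    bx0, by0, bx1, by1 = b.bbox
    return ax0 < bx1 and bx0 < ax1 and ay0 < by1 and by0 < ay1


def build_scene_graph(objects):
    """Return (subject, relation, object) triples for all object pairs."""
    triples = []
    for a, b in permutations(objects, 2):
        if contains(a, b):
            triples.append((a.name, "contains", b.name))
        elif overlaps(a, b) and not contains(b, a) and a.name < b.name:
            # "overlaps" is symmetric, so it is stored once per pair.
            triples.append((a.name, "overlaps", b.name))
    return triples


# Hypothetical detections in a harbor scene.
objects = [
    SceneObject("harbor_1", "harbor", (0, 0, 100, 100)),
    SceneObject("ship_1", "ship", (10, 10, 30, 20)),
    SceneObject("pier_1", "pier", (25, 15, 60, 25)),
]
scene_graph = build_scene_graph(objects)
```

The resulting triples form the edges of a small scene graph. Richer relations (direction, distance, adjacency, or temporal order) would be added in the same way, each as a further predicate over object pairs.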
Analyzing and understanding the RS scene based on semantic relations is a promising research issue, which can contribute to the discovery and reasoning of spatial-temporal relationships among different objects in RS images. The comprehensive, explainable, and contextual interpretation of an RS scene relies on the representation of a generic RS scene in the form of knowledge graphs by defining concepts related to the scene's lineage and land-use/land-cover regions and the spatial relationships between them. Therefore, there are still some problems that need further investigation in RS scene understanding based on the semantic graph-based method, such as the effective organization of variant and multilayer nodes (especially the temporal components), the efficient extraction of multidimensional relationships (such as the spatial relation, time factor, frequency spectrum, etc.), and the intelligent mining of potential associations.

D. Semantic Graph-Based RS Image Interpretation
Semantic image interpretation can be defined as the semantic extraction and inference process for deriving high-level knowledge from an observed image [82]. In RS imagery, semantic image interpretation has been an active research topic and has played an important role in the development of RS applications. In general, the interpretation of RS images relies on expert experience and domain knowledge in the RS field, and needs support from semantic annotation, image classification, and scene understanding of the RS image. Therefore, how to comprehensively utilize these semantic analysis approaches for RS image parsing and understanding is the key to RS image interpretation. Many approaches have been proposed to extract and analyze semantics from RS images, which can be generally distinguished according to the granularity of the semantics being interpreted.
At the spectral pixel level, the interpretation focuses on attributing image pixels to their corresponding semantic categories. In research on 3-D modeling that uses the sensor trajectory as a valuable source for the semantic labeling of indoor mobile laser scanner point clouds, Shayan Nikoohemat et al. [83] presented an adjacency-graph-based method for detecting and labeling spatial structures in indoor scenes, and applied a voxel-based method for labeling the navigable space and separating it from obstacles. Moreover, according to the spectral characteristics of pixels in RS images, Hongyan Zhang et al. [84] pointed out that the spectrum detection of land surface information based on RS imagery should be supported by map analysis in geography, and that Geo-Information Tupu could combine the comprehensive thinking of geography with graphic representation. These investigations have thus recognized the limitation of pixel-level interpretation and tried to convert the spectral characteristics of pixels into a graphic description of objects in the spatial data.
At the spatial-temporal object level, the interpretation focuses on the semantic description of variant objects and their spatial-temporal relations in the RS image. For the purpose of locating features of interest in spatial data relative to object trajectories, David Nikolaus Perkins et al. [85] proposed a geospatial-temporal semantic graph to represent the trajectories from RS and geolocation data, which used a graph search algorithm to identify features of interest by comparing search query parameters with the nodes and edges in semantic graphs. Furthermore, inspired by Geo-Information Tupu, Jiancheng Luo et al. [86] systematically presented a theory and computational methodology for the spatial-spectral cognition of RS data and tried to build an intelligent and integrated model for RS information interpretation. However, object-level semantic interpretation still suffers from shallow semantics and neglected context, which impedes the comprehensive understanding and interpretation of the RS image.
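The comparison of query parameters with graph nodes and edges can be illustrated with a toy example. The following is a minimal sketch under simplifying assumptions and not the algorithm of [85]: the graph is a plain list of time-stamped edges, and a query is a combination of a relation name, a target node type, and a time window. All node names, relation names, and timestamps are hypothetical.

```python
# Nodes carry a semantic type; edges carry a relation and a timestamp (hours).
nodes = {
    "vehicle_7": {"type": "vehicle"},
    "depot_A": {"type": "building"},
    "bridge_3": {"type": "bridge"},
}
edges = [
    ("vehicle_7", "stops_at", "depot_A", 8),
    ("vehicle_7", "passes", "bridge_3", 9),
    ("vehicle_7", "stops_at", "depot_A", 17),
]


def search(nodes, edges, relation, target_type, t_start, t_end):
    """Return edges whose relation, target node type, and time match the query."""
    return [
        (source, rel, target, ts)
        for (source, rel, target, ts) in edges
        if rel == relation
        and nodes[target]["type"] == target_type
        and t_start <= ts <= t_end
    ]


# Query: which trajectories stop at a building between 06:00 and 12:00?
morning_stops = search(nodes, edges, "stops_at", "building", 6, 12)
```

A linear scan is sufficient for this illustration. For large graphs, an index over relations and time intervals would be needed to keep such queries efficient.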
At the semantic scene level, the semantic scene interpretation focuses on categorizing scene images into a discrete set of meaningful classes according to the contents in the RS image. In order to give a visual description of the full content of the image scene in a given time, Bitao Jiang et al. [87] put forward a Knowledge Graph construction framework for RS image interpretation, which tried to perform the intelligent retrieval and intelligent reasoning of RS data. Following the idea that the interpretation may take place at several levels from the simple recognition of objects to the inference of site conditions, Fethi Ghazouani et al. [88] highlighted the importance of ontologies exploitation for encoding the domain knowledge and guiding the semantic scene interpretation, which characterized the structure of the information element and its components required for the interpretation process. As a consequence, the final step may be the global interpretation level. The global semantic interpretation can utilize the integration of contextual, spectral, spatial, temporal, and radiometric relations in the RS image synthetically. It focuses on the change detection and interpretation that can affect different states of objects and the relations between these changes.
As we know, RS image interpretation is a complex task that does not depend solely on the data itself, because the semantics are not explicitly contained in the image. The annotation of RS images can only achieve a lexical understanding of images by marking labels with keywords, but it provides a necessary foundation for image processing. Likewise, image classification and scene recognition can achieve shallow semantic recognition and understanding of RS images, which reflects the routine methods of image processing. Moreover, the interpretation of RS images needs further discovery of deep semantics and exploration of the correlation laws of things, which represents the mechanistic knowledge of RS image application. Nevertheless, it must be acknowledged that the deep semantic understanding of RS images needs support from image classification, scene recognition, and image annotation, as illustrated in Fig. 10. The interpretation of RS images requires comprehensive utilization of various semantic information (such as scene identification, spatial-temporal relationships, domain principles, etc.) to explore the domain knowledge further.
Due to the complicated relationships and implicit knowledge involved in RS image interpretation, the semantic knowledge graph currently seems a more promising method to tackle this challenge. The crucial issue lies in how to represent variant relations in hyperspaces effectively and how to discover potential associations intelligently and efficiently. However, because of the immense quantity of associated information and the enormous diversity of geospatial knowledge graphs, there are many challenges to the applicability and mass adoption of such helpful structured knowledge [89]. To our knowledge, there are few available semantic knowledge graphs for the intelligent interpretation of RS images, and the practice and implementation of the knowledge graph in RS image interpretation is also rare.
In summary, the interpretation of RS images is a complex and comprehensive procedure, which needs the utilization and integration of semantic annotation, image classification, and scene understanding. RS image annotation helps to describe and search RS images by representing relationships between the spatial regions and semantic labels, but it falls short in capturing complex relationships among target objects without using prior knowledge. RS scene understanding aims to comprehend scene context based on the spatial layout of target objects in the RS image, but it lacks an in-depth recognition of various features and their relationships. Moreover, RS data classification focuses on the recognition and segmentation of different scenarios and ground objects in RS images, but it still lacks an adequate understanding of the multifarious relationships among ground objects and scenes in the image. In contrast, RS interpretation based on the semantic knowledge graph attempts to extract and represent all types of relationships among spectral and spatiotemporal characteristics from the cognitive perspective of domain experts, which can capture and reflect a higher-level semantic understanding of the RS image. In addition, the Knowledge Graph is well suited to knowledge inference and semantic computation. Therefore, the intelligent interpretation of RS images based on knowledge graphs has great academic value and practical significance.

V. CASE STUDY
With the dramatic increase of RS images, the rich details pose challenges to the semantic interpretation of RS images, which aims at the efficient extraction and recognition of semantic content. It focuses on the application of human knowledge and experience to obtain useful spatial-temporal and thematic information about the objects in RS images. Furthermore, the information analysis, content extraction, and interpretation of RS images rely heavily on the visual interpretation and manual work of domain experts. There are different levels of abstraction between the visual interpretation of spectral information and the semantic interpretation of image pixels. This problem is known as the semantic gap and is rooted in the lack of concordance between low-level data and high-level information [90]. As discussed above, the semantic knowledge graph is regarded as an innovative approach for knowledge representation and inference, which can map RS images into a semantic network to discover explicit and implicit relationships between variant features of the RS image.
Geological RS is a typical application of RS image interpretation, which can be considered as the study of geological survey and resource investigation with the electromagnetic spectrum based on the comprehensive utilization of RS technologies [91]. Especially with the launch of hyperspectral instruments to observe the Earth, the RS image is full of variant information and potential relationships beyond the capability of traditional RS image interpretation methods [92]. These variant and heterogeneous relationships need a semantic-level methodology to be analyzed and discovered from the perspective of semantic networks and domain knowledge. This section presents a case study of geological RS image interpretation based on the semantic knowledge graph.
Historically, multispectral imagery has been adopted to produce colorful photographs for visual interpretation of lithologic units and geological structures [93]. RS image interpretation relies on two types of marks in the image: direct interpretation marks and indirect interpretation marks. A direct interpretation mark is the direct reflection of an attribute of geological objects or phenomena in the RS image, such as shape, size, gray level and color, shadow, image structure, and texture. An indirect interpretation mark refers to an attribute of geological objects or phenomena that is reflected in the image by other objects connected with them, such as drainage patterns, vegetation distribution, soil characteristics, and human activities.
Direct interpretation marks are easy to obtain and recognize by feature extraction from RS images, whereas indirect interpretation marks need to be inferred and determined by domain experts based on comprehensive reasoning and geographical correlation analysis. Expert experience, which reflects the potential or implicit correlations between different geological objects or phenomena, is of great value to geological RS interpretation and can be abstracted and represented intuitively by the Knowledge Graph.
This case study takes geomorphic interpretation based on RS images as an example to show the application of the semantic knowledge graph. The ontology structure of geological interpretation marks illustrated in Figs. 11 and 12 presents the landform classification of geomorphic interpretation. The identification of different landforms depends on the relevant RS image interpretation marks. Direct interpretation marks have explicit relations with variant landforms based on RS image spectral analysis to identify the types of topographic form (such as ridge, valley, basin, and hill), but they are generally insufficient for complex scenes. Geomorphology is also determined by certain geological bases (such as lithology and tectonics) and influenced by relevant natural and geographical conditions (such as climate and hydrology). Indirect interpretation marks have more potential capability, but they rely on the long-term accumulation of the experience and knowledge of experts in the field. This tacit knowledge of indirect interpretation experience needs to be discovered and mined from historical documents and interpretation reports.
Based on the RS geological interpretation reports of the Ganguoshan area (Landsat TM images) in Tibet (sourced from the China University of Geosciences), three fluvial landforms can be extracted based on the spectral characters (direct interpretation mark), and the geomorphic origins (indirect interpretation mark) can be obtained from the context, as illustrated in Fig. 13. Moreover, based on the RS geological interpretation report of the Pamir area (GF-1 PMS images) in Tajikistan (sourced from the Institute of Geological Survey of Qinghai Province in China) [94], three geologic structures can be obtained with relevant RS interpretation marks, as illustrated in Fig. 14.
The RS geological interpretation report can be adopted as a knowledge source to construct the knowledge graph of RS geological interpretation. The RS image feature information (reflecting the direct interpretation marks) and potential correlations (reflecting the indirect interpretation marks) of various geological objects/phenomena can be extracted based on the context of the report. With the continuous expansion of geological landforms (knowledge nodes) and interpretation relations (directed edges), a large-scale semantic network (Knowledge Graph) could be constructed gradually, as illustrated in Fig. 15.
Based on the relationships between geological units and interpretation marks in Fig. 15, variant explicit or implicit semantic relations could be searched or discovered, which reflect the domain knowledge and interpretation principles. To be more specific, the retrieval could be used to search explicit relations between geological units and interpretation marks, a reasoner could be adopted to discover implicit relations among different interpretation marks, and some statistical algorithms could be applied to summarize the regularities of RS geological interpretation based on knowledge graphs.
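The three operations described above (retrieval, reasoning, and statistics) can be sketched on a toy knowledge graph stored as triples. This is a minimal sketch under stated assumptions: the triples are invented for illustration and are not taken from the interpretation reports or from Fig. 15, the predicate names (`is_a`, `has_direct_mark`, `has_indirect_mark`) are hypothetical, and the "reasoner" is a single hand-written inheritance rule rather than a full ontology reasoner.

```python
from collections import Counter

# Hypothetical triples: (subject, predicate, object).
triples = {
    ("alluvial_fan", "is_a", "fluvial_landform"),
    ("river_terrace", "is_a", "fluvial_landform"),
    ("fluvial_landform", "has_indirect_mark", "drainage_pattern"),
    ("alluvial_fan", "has_direct_mark", "fan_shape"),
    ("river_terrace", "has_direct_mark", "stepped_texture"),
}


def retrieve(triples, subject=None, predicate=None, obj=None):
    """Retrieval: return explicit triples matching the given pattern."""
    return sorted(
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def infer_inherited_marks(triples):
    """Reasoning: a landform inherits the interpretation marks of its parent class."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived
        changed = False
        for (child, p1, parent) in list(inferred):
            if p1 != "is_a":
                continue
            for (s, p2, mark) in list(inferred):
                if s == parent and p2.endswith("_mark"):
                    new_triple = (child, p2, mark)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


explicit = retrieve(triples, subject="alluvial_fan", predicate="has_direct_mark")
closure = infer_inherited_marks(triples)
implicit = sorted(closure - triples)
# Statistics: how many geological units each interpretation mark is linked to.
mark_counts = Counter(o for (_, p, o) in closure if p.endswith("_mark"))
```

Here `explicit` holds the stated relation between the alluvial fan and its direct mark, `implicit` holds the two relations derived by the rule, and `mark_counts` gives a simple frequency summary over the expanded graph. A practical system would more likely rely on an RDF store and a standard reasoner.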
In summary, based on a large number of RS geological interpretation reports, a semantic knowledge graph can be obtained, and the knowledge-filled semantic relationships among geological objects/phenomena and interpretation marks can capture the expert experience of geological interpretation. Various graph-based analysis approaches based on the graph characteristics (such as type of nodes, degree of edges, and subgraph isomorphism) could be adopted to discover valuable information and potential knowledge. The representation of specific relations and the mining of potential associations can provide an important reference for the selection of corresponding RS images in various RS geological analyses. In addition, it can help to discover the recognition rules of different geological phenomena by various ground objects in RS geological interpretation, and promote the intelligence and efficiency of RS geological interpretation.

VI. FUTURE DIRECTIONS
RS image interpretation is an important and challenging problem for EO cognition and application, which has aroused extensive research attention. Despite the dramatic progress in the past several years, there still exists a giant gap between the current understanding level of machines and the human-level performance. By investigating the semantic graph-based knowledge representation methods and current technologies for the RS image interpretation, this article discusses several promising future directions for the RS image interpretation based on the semantic knowledge graph.
1) Multisource knowledge acquisition and fusion in the RS imagery field. Multiple knowledge sources, from RS data resources and interpretation reports to encyclopedic knowledge bases, need to be employed to obtain sufficient and comprehensive knowledge. Therefore, the integration and disambiguation of knowledge from different sources is the crucial issue to be considered and resolved.
2) Comprehensive extraction and utilization of multiple relations in RS imagery. On account of the rich relationships in RS images, both the spectral characteristics and the spatiotemporal relations need to be extracted and explored as fully as possible, which can give a complete representation of the RS image interpretation principles. Consequently, the representation and inference of semantic relations from variant feature dimensions need to be investigated.
3) Systematic categorization and representation of RS image interpretation marks. Besides the traditional attributes of RS imagery, the interpretation marks play important roles in connecting target features with image attributes. However, a clear understanding and formal expression of the semantic relevance among different interpretation marks is still unsolved and full of challenges, especially for the implicit indirect marks of RS image interpretation, which reflect the experience and knowledge of interpretation experts.
4) Quality evaluation and knowledge evolution of the semantic knowledge graph. The effectiveness and performance of the knowledge graph rely on its richness of knowledge and the density of relationships. Moreover, with the continuous enrichment of interpretation experience, the information in the knowledge graph also needs to be updated and evolved synchronously. Therefore, a thorough evaluation of knowledge graph quality is indispensable before its application, and an effective evolutionary strategy for the knowledge graph is necessary during its life cycle.
5) Effective integration and implementation with existing RS application systems. The knowledge graph is a type of large-scale knowledge base based on semantic networks, and it may be a great support for the semantic understanding and recognition of RS images. Consequently, the integration and interaction mechanism with existing systems is a particular difficulty that needs to be addressed, which determines its technical significance and applied value.
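For the quality evaluation mentioned in direction 4), two very simple structural indicators can be computed directly from the triples of a knowledge graph. The sketch below is only one possible reading of "richness" and "density" and is an assumption of this illustration: richness is approximated by the number of distinct nodes and relation types, and density by the ratio of existing directed links to all possible directed links between distinct nodes. The function name `graph_quality` and the sample triples are hypothetical.

```python
def graph_quality(triples):
    """Compute simple structural indicators for a triple-based knowledge graph."""
    nodes = {s for s, _, _ in triples} | {o for _, _, o in triples}
    # Count each directed node pair once, ignoring self-loops.
    links = {(s, o) for s, _, o in triples if s != o}
    n = len(nodes)
    density = len(links) / (n * (n - 1)) if n > 1 else 0.0
    return {
        "nodes": n,
        "relation_types": len({p for _, p, _ in triples}),
        "density": density,
    }


# Hypothetical graph with three nodes and two relation types.
sample = [
    ("fault", "indicated_by", "linear_valley"),
    ("linear_valley", "located_in", "basin"),
    ("fault", "indicated_by", "basin"),
]
quality = graph_quality(sample)
```

Such indicators only describe the structure of the graph. Assessing the correctness of the stored knowledge would additionally require comparison with expert judgment or reference data.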

VII. CONCLUSION
The semantic graph-based method plays a crucial role in RS data analysis and image interpretation. From the perspective of the relational graph, potential relationships among different entities can be represented formally, and some new characteristics might be revealed by graphical analysis and mining. The Ontology Model, Geo-Information Tupu, and the semantic knowledge graph are typical methodologies for RS data mining and knowledge representation, and the semantic knowledge graph may have more promising capabilities for variant relationship description and knowledge inference. In practice, there exist several technologies and applications for RS image analysis based on the semantic graph-based method. The interpretation of RS images relies on the integration of semantic annotation (preliminary definition and classification of the RS image), image classification (deep understanding and classification of RS image meaning), and scene understanding (further understanding and refinement of RS image content). For RS Big Data, the semantic knowledge graph offers promise for refining existing geographic knowledge, improving RS interpretation accuracy, and promoting the progress of geosciences. Therefore, semantic knowledge graph-based RS image interpretation has become a research hotspot with considerable value and promising prospects.