Theoretical Perspective of Multi-Dividing Ontology Learning Trick in Two-Sample Setting

The multi-dividing ontology learning framework has proven highly efficient for tree-structured ontology learning. In this work, we consider a special setting of this framework in which the ontology sample set for each rate is divided into two groups. This setting can be regarded as the classic two-sample learning problem associated with the multi-dividing ontology framework. We focus on the theoretical analysis of the multi-dividing two-sample ontology learning algorithm: its ontology objective function is proposed, and generalization bounds in this setting are obtained by means of the $U$-statistics technique. The theoretical results are of potential guiding significance for ontology engineering applications.

The concept of ontology originally belongs to western philosophy, where it refers to the expression and summary of objective existence at the logical level. Ontology was introduced into artificial intelligence in the 1980s and, after two decades of development, has been widely recognized in this century, defined as a clear formal specification of a shared conceptual model. Ontology can precisely describe the complex conceptual relationships in a given field. Humans and machines, and machines among themselves, can communicate and share data information because the ontology's definition of domain information is unanimously recognized. Beyond the study of ontology in philosophy, computer science and other disciplines have also actively researched ontology theory and applied it to their fields (see Gao et al. [1] and [2]).
Artificial intelligence has its own definition of ontology; for example, one standard stipulates that the concept can be described as follows: ''An ontology can define concepts in a specific field and give clear terms, describe the relationships between them in detail, and its vocabulary extension rules can describe and express new content based on the defined terms.'' The main research problem of ontology in engineering is how to construct ontologies, which involves the use of ontology principles. (The associate editor coordinating the review of this manuscript and approving it for publication was Fabrizio Marozzo.) Although experts in various fields define ontologies differently and from different angles, researchers agree that ontologies can clearly define the information concepts of a field, and that the use of ontologies in specific fields makes each subject accessible [3]. This is the essential connotation of the ontology concept. In library and information science, ontology is defined as follows: ''An ontology can use a domain-specific vocabulary to describe a specific fact and infer the deep meaning of that vocabulary''; it is also defined as being able to represent domain conceptual information from a specific perspective [4].
Ontology has been studied and applied in various engineering settings. Skalle and Aamodt [5] showed how knowledge-modeling and drilling-ontology tricks can be employed to predict downhole failures during drilling. Sobral et al. [6] proposed ontology-based modelling to support the integration and visualization of data from ITS. Al-Sayed et al. [7] presented a comprehensive cloud ontology named CloudFNF. Tebes et al. [8] identified and synthesized the available primary studies on conceptualized software testing ontologies. Pradeep and Sundar [9] suggested retrieving information with the design of a QAOC architecture. Hema and Kuppusamy [10] raised a trust-based privacy-preservation model for service handling by means of ontology service ranking. Messaoudi et al. [11] presented a review of medical ontologies. Mantovani et al. [12] introduced an ontology-based trick for the interpretation and encoding of map data. Abeysinghe et al. [13] developed an SSIF by leveraging a novel term algebra on top of a sequence-based representation of Gene Ontology concepts.
Kossmann et al. [14] presented an ontology-based federative trick for managing the inherent complexity of CM in the context of SoS.
In this article, we do not consider the philosophical category of ontology but regard it as a structured conceptual model. ''Structured'' means that the data in the ontology are not single, isolated records: the data are mutually related, and a graph can be used to represent the data structure. Vertices represent concepts, and edges between vertices represent direct connections or relationships between concepts. Hence, the entire ontology is represented by a graph. In addition, after all the information related to a concept is expressed numerically, a multi-dimensional vector encapsulates the representation; that is, each vertex is a fixed $p$-dimensional vector, and a learning model can then be used to learn various ontology graphs [15].
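As a toy illustration of this graph-with-vectors view, an ontology might be encoded as below; all names, the dimension $p$, and the numbers are invented for the sketch and are not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class OntologyGraph:
    """Hypothetical sketch: vertices are concepts carrying fixed p-dim vectors."""
    p: int                                        # dimension of each vertex vector
    vectors: dict = field(default_factory=dict)   # concept name -> list[float]
    edges: set = field(default_factory=set)       # direct links between concepts

    def add_vertex(self, name, vec):
        # Every vertex must use the same fixed dimension p.
        assert len(vec) == self.p, "vertex vector must be p-dimensional"
        self.vectors[name] = list(vec)

    def add_edge(self, u, v):
        # Store the edge as an unordered pair of concept names.
        self.edges.add(frozenset((u, v)))

g = OntologyGraph(p=3)
g.add_vertex("disease", [0.9, 0.1, 0.0])
g.add_vertex("infection", [0.8, 0.2, 0.1])
g.add_edge("disease", "infection")
```

A learning model would then operate on the vertex vectors while the edge set supplies the tree or graph structure.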

II. SETTING OF ONTOLOGY ALGORITHM

A. ONTOLOGY LEARNING ALGORITHM
Set $G = (V, E)$ as an ontology graph whose vertex set corresponds to concepts and whose edge set reveals the pairs of directly related concepts. Suppose $Sim: V^2 \to \mathbb{R}^+ \cup \{0\}$ is the similarity function; for convenience, its values are always unitized to the interval $[0, 1]$. Let $v_1, v_2 \in V(G)$ be two different vertices. $Sim(v_1, v_2) = 1$ indicates that the concepts corresponding to $v_1$ and $v_2$ have the same meaning; conversely, $Sim(v_1, v_2) = 0$ reveals no relationship between the two concepts. A threshold parameter $M \in [0, 1]$ is determined in light of field experts; then, for a given vertex $v$, the set $\{v' \mid Sim(v, v') \ge M\}$ is returned to the user as the similar vertices. Throughout this article, $n$ denotes the sample capacity, i.e., the number of ontology samples.
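The threshold query above can be sketched in a few lines; the concept scores and the particular similarity function are invented for illustration only.

```python
# Illustrative sketch: given a similarity function sim with values in [0, 1]
# and a threshold M chosen by domain experts, return the vertices similar to v.
def similar_vertices(v, vertices, sim, M):
    """All v' != v with sim(v, v') >= M."""
    return [u for u in vertices if u != v and sim(v, u) >= M]

# Toy similarity on one-dimensional "concept scores" (assumed data).
scores = {"gene": 0.9, "protein": 0.8, "car": 0.1}
sim = lambda a, b: 1.0 - abs(scores[a] - scores[b])   # unitized to [0, 1]

print(similar_vertices("gene", scores, sim, M=0.8))   # ['protein']
```

In practice $Sim$ would be induced by the learned ontology function $f$ rather than fixed by hand.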
Let $S = \{v_i\}_{i=1}^{n}$ be the ontology sample set with $n$ ontology vertices drawn independently and identically distributed according to an unknown distribution $\mathcal{D}$ (written $v_i \sim \mathcal{D}$ for $i \in \{1, \cdots, n\}$), let $f: V \to \mathbb{R}$ be an ontology function which maps each ontology vertex to a real number (in this setting, the similarity between ontology vertices $v_1$ and $v_2$ is determined in terms of $|f(v_1) - f(v_2)|$: we desire a small value for a highly similar pair $(v_1, v_2)$ and, on the contrary, a large value for a dissimilar pair), and let $l(f, v)$ be the ontology loss function. The expected risk of the ontology learning model can be formulated by
$$R(f) = \mathbb{E}_{v \sim \mathcal{D}}[l(f, v)].$$
Unfortunately, we cannot directly calculate $R(f)$ since $\mathcal{D}$ is unknown. Instead, the ontology empirical framework is applied in the specific ontology learning process, denoted as
$$\widehat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} l(f, v_i).$$
For supervised ontology learning, assume that ontology samples are denoted by $(v_i, y_i)$, where $y_i \in Y$ is the label of $v_i$. For a given $f: V \to \mathbb{R}$ and ontology loss $l(f, v_i, y_i)$, the expected ontology risk is denoted as
$$R(f) = \mathbb{E}_{(v, y) \sim \mathcal{D}}[l(f, v, y)],$$
and the corresponding empirical ontology risk can be modified as
$$\widehat{R}(f) = \frac{1}{n} \sum_{i=1}^{n} l(f, v_i, y_i).$$
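A minimal sketch of the empirical risk computation, with a toy ontology function and a toy squared loss (both are assumptions for illustration, not the paper's data):

```python
# Empirical risk: average the ontology loss l(f, v) over the sample S,
# since the expected risk over the unknown distribution D is unobservable.
def empirical_risk(f, loss, sample):
    return sum(loss(f, v) for v in sample) / len(sample)

# Toy ontology function and squared loss against a target score (assumed).
f = lambda v: v["score"]
loss = lambda f, v: (f(v) - v["target"]) ** 2
S = [{"score": 0.5, "target": 0.25}, {"score": 0.2, "target": 0.2}]

print(empirical_risk(f, loss, S))   # 0.03125
```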

B. MULTI-DIVIDING ONTOLOGY ALGORITHM BY MAXIMIZING AUC MEASURE
The multi-dividing ontology learning framework has attracted scholarly attention in the recent decade since it fits ontology graphs with tree structure. In this special ontology learning setting, the ontology vertex set is divided into $k$ rates corresponding to the $k$ branches under the top vertex. The rate values of the different branches are always obtained from domain experts, and $f(v_a) > f(v_b)$ is desired whenever $v_a$ and $v_b$ belong to branches $a$ and $b$ respectively, where $a, b$ are positive integers with $1 \le a < b \le k$. Specifically, the learner is given a set of ontology samples $S = S_1 \cup \cdots \cup S_k$, and the ontology function $f: V \to \mathbb{R}$ is learned in terms of $S$ so that it assigns the $S_a$ vertices larger values than the $S_b$ vertices whenever $a < b$. Let $\mathcal{D}_a$ be the conditional distribution for $a \in \{1, \cdots, k\}$, and let the sample capacity be $n = \sum_{i=1}^{k} n_i$ with $n_i = |S_i|$. The expected multi-dividing ontology risk of the ontology function $f: V \to \mathbb{R}$ is formulated by
$$R(f) = \sum_{1 \le a < b \le k} \mathbb{E}_{v_a \sim \mathcal{D}_a,\, v_b \sim \mathcal{D}_b}\big[I(f(v_a) \le f(v_b))\big].$$
The transformed expression for the expected ontology risk can be denoted as
$$R(f) = \sum_{1 \le a < b \le k} \int\!\!\int I(f(v_a) \le f(v_b)) \, d\mathcal{D}_a(v_a) \, d\mathcal{D}_b(v_b).$$
The first $R(f)$ expression is in expectation form, while the second is in integral form, and the two are equivalent. Furthermore, the multi-dividing empirical error is formulated as
$$\widehat{R}_S(f) = \sum_{1 \le a < b \le k} \frac{1}{n_a n_b} \sum_{i=1}^{n_a} \sum_{j=1}^{n_b} I(f(v_i^a) \le f(v_j^b)).$$
Hence, the desired ontology function is deduced from the ontology learning framework as $f^* = \arg\min_f \widehat{R}_S(f)$, where $\widehat{R}_S(f)$ can be simply written as $\widehat{R}(f)$. From another angle, the idea of multi-dividing ontology learning can be explained in terms of maximizing the AUC (Area Under the ROC (Receiver Operating Characteristic) Curve) criterion. Consider $k = 2$, or imagine a binary classification problem: the ROC curve is based on a series of different binary classification thresholds (cut-off values or decision thresholds), with the true positive rate (sensitivity) as the vertical coordinate and the false positive rate as the horizontal coordinate. The AUC is the area enclosed by the ROC curve and the coordinate axis. Clearly, the value of this area cannot exceed 1.
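The empirical multi-dividing error above can be sketched directly: for every pair of rates $a < b$, count the fraction of cross pairs that $f$ misranks. The rates and scores below are toy assumptions.

```python
from itertools import combinations

# Empirical multi-dividing error: f should score rate-a vertices above
# rate-b vertices whenever a < b; misranked (or tied) cross pairs count as errors.
def multi_dividing_error(f, rates):
    """rates: list of vertex lists S_1, ..., S_k ordered by rate."""
    total = 0.0
    for Sa, Sb in combinations(rates, 2):          # all pairs with a < b
        bad = sum(1 for va in Sa for vb in Sb if f(va) <= f(vb))
        total += bad / (len(Sa) * len(Sb))
    return total

f = lambda v: v                                    # toy ontology function
S1, S2, S3 = [0.9, 0.8], [0.5, 0.95], [0.1]        # toy rate samples
print(multi_dividing_error(f, [S1, S2, S3]))       # 0.5 (two S1-S2 pairs misranked)
```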
Because the ROC curve generally lies above the line $y = x$, the value of the AUC ranges between $1/2$ and $1$. The closer the AUC is to 1, the higher the authenticity of the detection trick; when it equals $1/2$, the authenticity is lowest and the trick has no application value. In the multi-dividing ontology setting, there are $k$ classes corresponding to the $k$ rates, and we consider pairwise comparisons. Hence, in this case, the AUC criterion in the multi-dividing setting can be formulated as the accumulation over each pair $(a, b)$ with $a, b \in \{1, \cdots, k\}$ and $a < b$. Let $H_{f,a}(t) = \mathbb{P}_{v \sim \mathcal{D}_a}\{f(v) > t\}$ for $t \in \mathbb{R}$, and let $I(\cdot)$ be the binary function whose value is 1 if its argument is true and 0 otherwise. For convenience, in what follows we write $H_{f,a}$ and $\widehat{H}_{f,a}$ instead of $H_{f,a}(t)$ and $\widehat{H}_{f,a}(t)$ in the AUC expressions. The expected AUC framework in the multi-dividing setting, associated with $H_{f,a}$, is denoted by
$$\mathrm{AUC}(f) = \sum_{1 \le a < b \le k} \mathbb{E}_{v_a \sim \mathcal{D}_a,\, v_b \sim \mathcal{D}_b}\big[I(f(v_a) > f(v_b))\big], \quad (4)$$
and the corresponding empirical multi-dividing ontology framework under the AUC criterion is
$$\widehat{\mathrm{AUC}}(f) = \sum_{1 \le a < b \le k} \frac{1}{n_a n_b} \sum_{i=1}^{n_a} \sum_{j=1}^{n_b} I(f(v_i^a) > f(v_j^b)).$$
If we restrict to a specific pair $(a, b)$, the AUC criterion can be expressed as
$$\mathrm{AUC}(H_{f,a}, H_{f,b}) = \mathbb{P}\{f(v_a) > f(v_b)\}.$$
Hence, we admit
$$\mathrm{AUC}(f) = \sum_{1 \le a < b \le k} \mathrm{AUC}(H_{f,a}, H_{f,b}).$$
The aim of this article is to propose a statistical characterization of the multi-dividing ontology learning algorithm in the two-sample setting, where each $S_a$ with $a \in \{1, \cdots, k\}$ is divided into two groups $S_a^0$ and $S_a^1$. In the classic two-sample learning problem, the first data set is used to obtain a function on its function space, and the second ontology data set serves to compute a pseudo-two-sample test statistic from the given ontology data. In our multi-dividing ontology setting, the two-sample ontology data can be understood in a similar way: for example, the first group of ontology data is used for training and the second group for testing, etc. More on the two-sample learning problem in different settings and applications can be found in Ma and Wong [16], Tang et al. [17], Chen et al. [18], Kim et al. [19], Rabin et al. [20], and Emura and Hsu [21].
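Restricted to one pair of rates $(a, b)$, the empirical AUC is simply the fraction of correctly ordered cross pairs. The sketch below counts ties as $1/2$, which is a common convention rather than something stated in the text; the scores are toy assumptions.

```python
# Empirical pairwise AUC for rates a and b: fraction of cross pairs that
# f orders correctly (rate-a above rate-b), ties counted as 1/2.
def pairwise_auc(f, Sa, Sb):
    num = sum(1.0 if f(va) > f(vb) else 0.5 if f(va) == f(vb) else 0.0
              for va in Sa for vb in Sb)
    return num / (len(Sa) * len(Sb))

f = lambda v: v
print(pairwise_auc(f, [0.9, 0.6], [0.5, 0.7]))   # 0.75: three of four pairs correct
```

Summing `pairwise_auc` over all pairs $a < b$ recovers the empirical multi-dividing AUC framework.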
The two-sample ontology setting can be explained from another angle. We require ontology learning algorithms to have generalization capability, i.e., ontology functions deduced from one ontology sample set should apply well to other ontology data sets of the same type. In other words, for the same type of ontology data, the ontology functions obtained from different ontology sample sets should have similar characteristics and should not differ greatly. In statistical learning theory, this can be understood as follows: two ontology functions obtained from different ontology samples of the same type of ontology data are very close in the ontology function space and have similar statistical characteristics.
The main result regarding generalization bounds in this setting is presented in the next section. The proof relies on techniques of $U$-statistics and their applications, which can be found in Fuchs et al. [22], Bouzebda and Nemouchi [23], Fuglsby et al. [24], Privault and Serafin [25], Bachmann and Reitzner [26], and Garg and Dewan [27]; we skip those details here.

III. THEORETICAL ANALYSIS IN TWO-SAMPLE SETTING
For $a \in \{1, \cdots, k\}$, we denote by $z \in \{0, 1\}$ the notation of the two groups; i.e., the ontology sub-sample set $S_a$ in the new multi-dividing ontology setting is divided into two sets $S_a^z$: $S_a^0$ and $S_a^1$ (denote $n_a^0 = |S_a^0|$ and $n_a^1 = |S_a^1|$). To simplify the symbols, we use $H_{f,a}^z$ and $\widehat{H}_{f,a}^z$ in place of $H_{f,a}^z(t)$ and $\widehat{H}_{f,a}^z(t)$. Hence, the AUC framework in the multi-dividing ontology setting is reformulated, for $z \in \{0, 1\}$, as
$$\mathrm{AUC}^z(f) = \sum_{1 \le a < b \le k} \mathrm{AUC}(H_{f,a}^z, H_{f,b}^z).$$
Similarly, if we restrict to a specific combination $(a, b)$ with $a, b \in \{1, \cdots, k\}$ and $a < b$, then
$$\mathrm{AUC}(H_{f,a}^z, H_{f,b}^z) = \mathbb{P}\{f(v_a^z) > f(v_b^z)\}.$$
In the two-sample ontology setting, the optimal ontology function is denoted by $f_\lambda^*$, obtained by maximizing the ontology objective function $R_\lambda$, where $\lambda > 0$ is an offset variable. The corresponding empirical ontology version with positive offset parameter $\lambda$ is denoted $\widehat{R}_\lambda$, and we express its maximizer as $f_{\lambda, S}$.
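Since the exact form of the $\lambda$-offset objective $R_\lambda$ is not reproduced above, the sketch below combines the two group-wise empirical AUCs in an assumed form, $\mathrm{AUC}^1 - \lambda\,|\mathrm{AUC}^0 - \mathrm{AUC}^1|$, purely to illustrate how the two groups and the offset parameter might interact; every name and number here is hypothetical.

```python
# Hypothetical sketch only: this combined objective is an assumption for
# illustration, not the paper's actual R_lambda.
def group_auc(f, Sa, Sb):
    # Fraction of cross pairs ordered correctly within one group.
    num = sum(1.0 for va in Sa for vb in Sb if f(va) > f(vb))
    return num / (len(Sa) * len(Sb))

def two_sample_objective(f, S0, S1, lam):
    """S0 = (S0_a, S0_b): group-0 samples for rates a and b; S1 likewise."""
    auc0 = group_auc(f, *S0)
    auc1 = group_auc(f, *S1)
    # Assumed form: reward group-1 AUC, penalize disagreement between groups.
    return auc1 - lam * abs(auc0 - auc1)

f = lambda v: v
S0 = ([0.9, 0.8], [0.2, 0.3])   # group 0: rate-a and rate-b vertices (toy)
S1 = ([0.7], [0.4, 0.6])        # group 1 (toy)
print(two_sample_objective(f, S0, S1, lam=0.5))   # 1.0: both groups fully ordered
```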
We say that an ontology function space $\mathcal{F}$ of ontology functions $f: V \to \mathbb{R}$ is VC-major if the major sets of the elements of $\mathcal{F}$ form a VC-class of sets in $V$. Specifically, $\mathcal{F}$ is a VC-major class if and only if $\{\{v \in V \mid f(v) > t\} \mid t \in \mathbb{R}, f \in \mathcal{F}\}$ is a VC-class of sets.
Our main conclusion reveals the learning rate of the multi-dividing ontology problem in the two-sample setting.

Proof of Theorem 1. Note that
In light of the triangle inequality, we acquire the corresponding decomposition. For $a, b \in \{1, \cdots, k\}$ with $a < b$ and $z \in \{0, 1\}$, define the quantities $q_z^{a,b}$; thus, we obtain $q_0^{a,b} = 1 - q_1^{a,b}$.
Denote the corresponding $U$-statistics; by means of $U$-statistic tricks, we get the associated decomposition. Furthermore, by the characteristics of $U$-statistics, the bound holds with probability at least $1 - \delta$, where $C$ is a universal constant. The details of the proof and the related statement (10) can be found in Bousquet et al. [28] and Clémencon et al. [29].
It follows from the Hoeffding inequality that the corresponding concentration bound holds. Note that
$$\frac{n_a n_b}{(n_a+n_b)^2} - p_{a,b}(1-p_{a,b}) = (1-2p_{a,b})\left(\frac{n_a}{n_a+n_b} - p_{a,b}\right) - \left(\frac{n_a}{n_a+n_b} - p_{a,b}\right)^2.$$
By setting $\Upsilon_{a,b}(\delta)$ accordingly, and in view of the above, the following inequality is established with probability at least $1 - \delta$. Now, we consider the $z = 0$ term. Since the decomposition above holds, and moreover it is not difficult to verify the analogous fact for the empirical counterpart, in light of the same trick as before we acquire the corresponding bound. Via calculation and simplification, we confirm the stated estimate for each pair $(a, b)$. According to the Hoeffding inequality again, we deduce that, with probability at least $1 - \delta$, the following two inequalities hold simultaneously. Setting $\Upsilon_{a,b}(\delta)$ with the factor $\frac{1}{n_a+n_b-1}$ again, the following holds with probability at least $1 - \delta$. When it comes to the $z = 1$ term, we have the similar conclusion that the corresponding bound holds with probability at least $1 - \delta$. Finally, putting (12), (16) and (17) together, applying the assumption $\min_{y \in \{1, \cdots, k\},\, z \in \{0,1\}} \mathbb{P}\{Y = y, Z = z\} \ge \varepsilon$, and noting that $\min\{p_{a,b}, 1 - p_{a,b}\} \ge 2\varepsilon$, we get that the claimed bound holds with probability at least $1 - \delta$. Hence we obtain the desired conclusion.
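The proof leans repeatedly on the Hoeffding inequality. The following small simulation (toy parameters, not from the paper) checks empirically that the deviation frequency of a Bernoulli sample mean respects the bound $2\exp(-2nt^2)$:

```python
import math
import random

# Toy check of Hoeffding's inequality for the mean of n Bernoulli(p) draws:
#   P(|mean - p| >= t) <= 2 * exp(-2 * n * t^2).
def hoeffding_check(p, n, t, trials=2000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < p for _ in range(n)) / n
        if abs(mean - p) >= t:
            hits += 1
    empirical = hits / trials
    bound = 2 * math.exp(-2 * n * t * t)
    return empirical, bound

emp, bound = hoeffding_check(p=0.5, n=100, t=0.1)
print(emp <= bound)   # observed deviation frequency stays below the bound
```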

IV. CONCLUSION AND DISCUSSION
In this article, a novel multi-dividing ontology learning setting is proposed in which each rate's ontology sub-sample is divided into two groups; the corresponding objective function is given, and the generalization bound in this specific ontology setting is determined by means of the $U$-statistics technique and presented in Theorem 1.
In real ontology engineering applications, $R_\lambda$ is difficult to maximize since the truth function $I$ is a discrete, non-differentiable function. In statistical learning, one common trick is to replace the truth function with the logistic function $\sigma: x \mapsto \frac{1}{1+e^{-x}}$. In this way, the surrogate relaxation of the multi-dividing ontology AUC criterion $\mathrm{AUC}(H_{f,a}, H_{f,b})$ is denoted by $\widetilde{\mathrm{AUC}}(H_{f,a}, H_{f,b})$. When it comes to the two-sample multi-dividing ontology setting, for $z \in \{0, 1\}$, the corresponding ontology objective we expect to maximize is the analogous surrogate $\widetilde{R}_\lambda$. However, we do not know the exact statistical characteristics of this surrogate relaxation of the multi-dividing ontology criterion under the two-sample assumption, and it therefore deserves to be studied in the future. So far, thousands of ontologies have been defined according to specific needs, distributed across various fields of natural and social science. Due to different ontology applications, and even different application backgrounds of the same ontology, ontology data can differ greatly, which leads to different ontology sample divisions in the two-sample setting. In other words, each ontology application requires its specific problems to be analyzed in detail; unified parameters and standards cannot be applied directly. Our work remains at the theoretical stage. The specific application of the multi-dividing two-sample ontology algorithm in particular ontology application fields requires further research in future works.
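The logistic surrogate of the pairwise AUC criterion can be sketched as follows; the averaging form is a standard relaxation, assumed here rather than taken from the text, and the scores are toy values.

```python
import math

# Surrogate relaxation: replace the indicator I(f(va) > f(vb)) with the
# logistic function sigma(f(va) - f(vb)), making the criterion differentiable.
sigma = lambda x: 1.0 / (1.0 + math.exp(-x))

def surrogate_auc(f, Sa, Sb):
    # Average sigma over all cross pairs between rates a and b (assumed form).
    num = sum(sigma(f(va) - f(vb)) for va in Sa for vb in Sb)
    return num / (len(Sa) * len(Sb))

f = lambda v: v
val = surrogate_auc(f, [2.0, 1.0], [0.0, -1.0])   # toy rate samples
print(round(val, 3))
```

Because `sigma` is smooth, `surrogate_auc` can be maximized by gradient methods, unlike the discrete indicator criterion.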