Similarity-Based Reasoning With Order-Sorted Feature Logic

Order-sorted feature (OSF) logic is a knowledge representation and reasoning language based on sorts—symbols that denote concepts ordered in a subsumption relation—and features—symbols that denote functional attributes. Reasoning with OSF logic is based on the unification of OSF terms, record-like structures that denote classes of objects and that are themselves ordered in a subsumption relation. OSF term unification aims to combine the constraints expressed by two terms in a consistent way, and it takes into account the subsumption relation between sort symbols, providing an efficient calculus of type subsumption. This article presents an approach to define approximate reasoning with OSF logic by extending its language with a similarity relation on sorts. In order for the OSF term unification algorithm to take into account this similarity and its interaction with the subsumption relation, we propose to combine the two relations into a single fuzzy subsumption relation. The advantage is that the same unification rules of OSF logic can then be applied to this fuzzy setting. We conclude by discussing potential applications of OSF logic extended with a sort similarity relation.


I. INTRODUCTION
Approximate reasoning based on fuzzy relations, in particular similarity relations, has been researched extensively in fuzzy logic programming. Early work includes Ying's logic for approximate reasoning [1] and the first papers on similarity-based logic programming [2], [3], [4], [5]. One motivation behind the similarity-based approaches is to model a form of reasoning that may be referred to as reasoning by analogy or similarity [5]. For instance, this may be achieved by relaxing the equality constraint on two functor symbols to a flexible constraint of similarity when unifying two first-order terms (FOTs). For example, if the functors thriller and horror are assumed to denote similar concepts, then the term thriller(X) can unify with the term horror("Psycho") to some extent (degree), thus leading to approximate (not exact but similar) solutions to a query posed to a knowledge base. This kind of relaxed unification is generally referred to as weak unification (e.g., [5]).
This research line has been extended in several ways, including approaches that support multiple similarity relations [6], proximity relations [7], [8], [9], or related operations, such as matching and antiunification [10], [11]. Moreover, weak unification has been implemented in fuzzy logic programming systems, such as FASILL [12] and Bousi∼Prolog [13], [14], which has been employed in several applications, such as text classification [15], [16], linguistic feedback in computer games [17], decision making [18], [19], and knowledge discovery [20]. Aït-Kaci and Pasi [21] presented a procedure for weak unification that, besides tolerating different (but similar) functor symbols, also allows the unification of FOTs with a different number and possibly a different order of arguments. This work has been generalized to proximity relations [22], and a possible incorporation in Bousi∼Prolog has been proposed [23].
The work by Aït-Kaci and Pasi [21] was preliminary toward the definition of similarity-based reasoning with order-sorted feature (OSF) logic, a knowledge representation and reasoning language developed by Aït-Kaci and Podelski [24], which has found applications in constraint logic programming and computational linguistics (e.g., [25]). The interest in defining a similarity-based extension of OSF logic is that previous work has demonstrated the efficiency of this language for knowledge representation and reasoning. For instance, one advantage of OSF logic is that its unification algorithm takes into account a subsumption (is-a) ordering between sorts, which enables a single unification step to potentially replace several resolution steps, possibly leading to more efficient computations [26], [27]. Moreover, it has been shown that, at the time of the experiments, the CEDAR Semantic Web reasoner based on OSF logic was consistently among the best reasoners in terms of concept classification, and several orders of magnitude more efficient in terms of terminological reasoning [28], [29]. Introducing a similarity relation to augment the flexibility of OSF logic, while preserving its efficiency, has the potential to significantly enhance the effectiveness and applicability of this framework, particularly in domains such as the Semantic Web, where efficiency is of paramount concern.
In this article, we show how to make the OSF term unification algorithm more flexible by considering a similarity relation between sorts besides a sort subsumption ordering. Rather than devising ad hoc unification rules that deal with the similarity relation and its interaction with the sort subsumption ordering, we propose to combine the two relations into a single fuzzy subsumption relation. Intuitively, this is achieved by applying the following informal inference, inspired by the similarity-based approaches to logic programming (e.g., [5]):
If the sort s₀ is subsumed by the sort s₁ and s₁ is similar to the sort s₂ with degree α, then s₀ is subsumed by s₂ with degree α.
For example, if slashers are horror movies, and horror movies are similar to thrillers with degree 0.5, then we can conclude that slashers are also thrillers with subsumption degree 0.5. As a consequence, queries aimed at retrieving thrillers from a knowledge base may also retrieve instances of slasher (associated with an approximation degree), thus improving the flexibility of the retrieval process. Intuitively, the fuzzy subsumption relation encodes the information of both the crisp subsumption and the similarity. This procedure shifts the setting to that of OSF logic with a fuzzy sort subsumption relation, or fuzzy OSF logic, whose semantics has already been developed in [30]. The advantage is that in fuzzy OSF logic it is possible to apply the same unification algorithm of (crisp) OSF logic, with essentially the same computational cost, thereby retaining the efficiency inherent to this logical framework.
The rest of this article is organized as follows. OSF logic [24] and fuzzy OSF logic [30] are briefly presented in Section II.
In Section III, we argue why a similarity-based unification procedure for OSF terms should take into account the interaction between the sort subsumption and the sort similarity relations, and why combining them into a single fuzzy subsumption relation is an effective way to deal with this issue.
Section IV-A focuses on the issue of how to formally define a fuzzy subsumption relation (which should be a fuzzy partial order, or even a fuzzy lattice) starting from a (crisp) subsumption and a fuzzy similarity relation. We start by combining the two relations into a fuzzy subsumption preorder, which may, however, contain fuzzy subsumption cycles. This issue is discussed in Section IV-B, where we present a construction of a fuzzy partial order from a fuzzy preorder that generalizes a well-known result from order theory. In Section IV-C, we also propose a definition of a completion of a fuzzy partial order into a fuzzy lattice, thus finalizing the transformation of a subsumption relation and a similarity into a fuzzy subsumption relation. In other words, Section IV ensures that the process of combining a crisp subsumption and a fuzzy similarity according to the intuition outlined above is sound, i.e., it indeed leads to a fuzzy lattice, as required by the unification rules of fuzzy OSF logic.
In Section V, we discuss two potential applications of similarity-based reasoning with OSF logic: 1) an extension of the CEDAR reasoner [28], [29], which relies on a sort similarity relation to approximately answer queries posed to a knowledge base and 2) a fuzzy logic programming language based on OSF terms, which leverages a similarity relation between sorts, comparable to a similarity-based extension of the language LOGIN [26]. Finally, Section VI concludes this article.

A. OSF Logic
OSF logic is a knowledge representation and reasoning language that originates in Aït-Kaci's [31] work and, similarly to description logics (DLs), it was initially meant as a formalization of Ron Brachman's structured inheritance networks [32], [33]. OSF logic and related formalisms (e.g., the logic of typed feature structures [25] or feature logic [34]) have been applied in computational linguistics [25] and implemented in constraint logic programming languages, such as LIFE [24], LOGIN [26], and CIL [35] and, more recently, in the CEDAR Semantic Web reasoner [28], [29].
Due to their common origin, OSF logic [24] and DLs [32] share a few similarities. Both formalisms are subsets of first-order logic designed to simplify its language in order to achieve computational tractability, while still providing enough expressive power for effective knowledge representation and reasoning. DLs and OSF logic are both based on set-denoting symbols (concepts and sorts, respectively) and symbols for expressing attributes: relational roles for DLs, and functional features for OSF logic. One of the most significant distinguishing aspects is that the semantics of OSF logic is based on the closed world assumption: for instance, if two sorts do not share a common subsort, then they are assumed to be disjoint. The relationship between the two languages has been explored thoroughly in [33], [36], [37], and [38].
As its name suggests, OSF logic is based on sorts, symbols that represent sets of objects, such as person or movie, and that are ordered by a subsumption relation, and features, symbols that denote functional attributes, such as title or directed_by. These symbols are specified by an OSF signature, which is defined as follows.
Definition II.1 (OSF signature [24]): An OSF signature is a tuple (S, F, ⪯) such that the following holds.
1) S is a finite set of sort symbols (or sorts).
2) F is a finite set of feature symbols (or features).
3) (S, ⪯) is a finite lattice with least element ⊥ and greatest element ⊤. The greatest lower bound (GLB) of two sorts s and s′ is denoted by s ⋏ s′.
Sorts and features can be used to construct record-like structures called OSF terms that can describe more complex classes of objects.
Definition II.2 (OSF terms [24]): Let V be a countably infinite set of variables (or coreference tags, or simply tags). Let X ∈ V, s ∈ S and f₁, …, fₙ ∈ F. An OSF term is defined recursively as follows.
1) A sorted variable X : s is an OSF term.
2) If t₁, …, tₙ are OSF terms, then an attributed sorted variable t = X : s(f₁ → t₁, …, fₙ → tₙ) is an OSF term. The set of variables occurring in t is denoted as Tags(t). The variable X is called the root tag of t and is denoted Root(t).

Example II.3 (OSF terms):
The following OSF term denotes the class of movies that are written and directed by the same person:
X : movie(directed_by → Y : person, written_by → Y).
The variable Y is used as a coreference tag, i.e., it specifies that the values of the features directed_by and written_by should be the same. The variables in an OSF term are often omitted unless they are necessary to express this property. Any sort s can be seen as an OSF term X : s in which the variable is left implicit.
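The recursive shape of Definition II.2 can be illustrated with a small data structure. The following is only a minimal sketch: the class name `OSFTerm`, the helper `tags`, and the use of the string "top" for the greatest sort are assumptions made for illustration and do not come from [24].

```python
from dataclasses import dataclass, field


@dataclass
class OSFTerm:
    """An attributed sorted variable X : s(f1 -> t1, ..., fn -> tn)."""
    tag: str     # root variable (coreference tag)
    sort: str    # root sort
    # List of (feature, subterm) pairs; a list (not a dict) because the
    # definition does not forbid the same feature from occurring twice.
    features: list = field(default_factory=list)


def tags(term):
    """Return Tags(t), the set of variables occurring in the term."""
    result = {term.tag}
    for _, subterm in term.features:
        result |= tags(subterm)
    return result


# Movies written and directed by the same person: the tag Y is shared.
# "top" is a placeholder name for the greatest sort.
same_author = OSFTerm("X", "movie", [
    ("directed_by", OSFTerm("Y", "person")),
    ("written_by", OSFTerm("Y", "top")),
])
```

Here `tags(same_author)` collects the two tags X and Y, and `same_author.tag` plays the role of Root(t).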
The definition of OSF terms given above does not rule out the presence of redundant or even contradictory information (e.g., consider the OSF term s(f → s₀, f → s₁), which is contradictory if s₀ ⋏ s₁ = ⊥). OSF terms that are well behaved in this regard are called normal OSF terms or ψ-terms [24], and they are also denoted ψ, ψᵢ, and so on. Normal OSF terms are ordered in a subsumption lattice, which extends the one on sort symbols [24].
Reasoning with OSF logic is based on a unification procedure, which aims to combine the constraints expressed by two OSF terms in a consistent way [24]. This is achieved by first translating the two terms into OSF clauses, an equivalent representation for OSF terms, which is defined next.
Definition II.4 (OSF constraints and clauses [24]): An OSF constraint is an expression of the form X : s, X ≐ X′, or X.f ≐ X′. An OSF clause φ is a conjunction of OSF constraints. The set of variables occurring in φ is denoted as Tags(φ), while φ[X/Y] is the OSF clause obtained by replacing all occurrences of Y with X.
Informally, the constraint X : s means that the value assigned to X is of sort s; X ≐ X′ means that the same value is assigned to the variables X and X′; while X.f ≐ X′ means that applying the feature f to the value assigned to X returns the value assigned to X′. Every OSF term t can be translated into an equivalent OSF clause φ(t) [24].
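The translation of a term into a clause can be sketched as a simple traversal. This is an illustrative sketch rather than the procedure of [24]: the tuple encoding of terms and constraints and the function name `to_clause` are assumptions, and "top" again stands for the greatest sort.

```python
def to_clause(term):
    """Flatten an OSF term into a list of constraints.

    A term is encoded as (tag, sort, [(feature, subterm), ...]).
    Constraints are encoded as ("sort", X, s) for X : s and
    ("feat", X, f, Y) for X.f = Y.
    """
    tag, sort, features = term
    constraints = [("sort", tag, sort)]
    for feature, subterm in features:
        constraints.append(("feat", tag, feature, subterm[0]))
        constraints.extend(to_clause(subterm))
    return constraints


# Movies written and directed by the same person (shared tag Y).
term = ("X", "movie", [
    ("directed_by", ("Y", "person", [])),
    ("written_by", ("Y", "top", [])),
])
clause = to_clause(term)
```

The resulting list contains one sort constraint per tag occurrence and one feature constraint per attribute; the trivial constraint on "top" carries no information and could be dropped.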
Example II.5 (OSF clause): The OSF clause syntax for the OSF term of Example II.3 is X : movie & X.directed_by ≐ Y & Y : person & X.written_by ≐ Y.
Fig. 1 shows the OSF constraint normalization rules [24] needed to unify two OSF terms. Each rule expresses that, whenever the (optional) condition in square brackets holds, the expression above the line can be simplified into the one below.
Proposition II.6 (OSF term unification [24]): Two OSF terms ψ₁ and ψ₂ can be unified by nondeterministically applying any applicable constraint normalization rule to the clause φ(ψ₁) & φ(ψ₂) & Root(ψ₁) ≐ Root(ψ₂) until none applies. The resulting clause can be translated back into an OSF term ψ, called the unifier of ψ₁ and ψ₂ (the term X : ⊥ if the unification fails). The term ψ is the GLB of ψ₁ and ψ₂ and is denoted ψ₁ ⋏ ψ₂.
Example II.7 (OSF term unification): Consider the subsumption relation of Fig. 4(a) and the terms t₁ = movie(directed_by → person, genre → horror) and t₂ = movie(genre → slasher). Their unifier is the term t₃ = movie(directed_by → person, genre → slasher), which combines the constraints of t₁ and t₂ in a consistent way. Note that t₃ is subsumed by both t₁ and t₂, and it is in fact their GLB. In particular, this is because t₃ has the same root sort as the two other terms, and its values for the features directed_by and genre are subsumed by those of t₁ and t₂.
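The normalization procedure can be approximated in a few lines with a union-find structure over tags. The sketch below is a simplified illustration, not the rule system of [24]: the sort lattice `UP` is an assumed fragment (only slasher below horror, and horror and thriller having no common subsort other than ⊥, are taken from the text), and the constraint encoding and function names are hypothetical.

```python
# Assumed lattice fragment: each sort maps to the sorts that subsume it.
UP = {
    "top": {"top"},
    "movie": {"movie", "top"},
    "person": {"person", "top"},
    "horror": {"horror", "top"},
    "thriller": {"thriller", "top"},
    "slasher": {"slasher", "horror", "top"},
}
UP["bottom"] = set(UP) | {"bottom"}


def glb(a, b):
    """Greatest lower bound of two sorts (unique because UP is a lattice)."""
    lower = [x for x in UP if a in UP[x] and b in UP[x]]
    return next(x for x in lower if all(x in UP[y] for y in lower))


def unify(clause1, clause2, root1, root2):
    """Normalize clause1 & clause2 & root1 = root2; return None on failure."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            x = parent[x]
        return x

    sort_of, feat_of = {}, {}
    todo = clause1 + clause2 + [("eq", root1, root2)]
    while todo:
        kind, *args = todo.pop()
        if kind == "sort":
            x = find(args[0])
            # Two sorts on one tag are replaced by their GLB.
            s = glb(sort_of.get(x, "top"), args[1])
            if s == "bottom":
                return None  # inconsistent clause
            sort_of[x] = s
        elif kind == "feat":
            x, f, y = find(args[0]), args[1], args[2]
            if (x, f) in feat_of:
                # Features are functional: both values must be equal.
                todo.append(("eq", feat_of[(x, f)], y))
            else:
                feat_of[(x, f)] = y
        else:
            x, y = find(args[0]), find(args[1])
            if x != y:
                # Merge tag y into x and re-process the constraints on y.
                parent[y] = x
                if y in sort_of:
                    todo.append(("sort", x, sort_of.pop(y)))
                for key in [k for k in feat_of if k[0] == y]:
                    todo.append(("feat", x, key[1], feat_of.pop(key)))
    return sort_of, {k: find(v) for k, v in feat_of.items()}


# t1 = movie(directed_by -> person, genre -> horror), t2 = movie(genre -> slasher)
c1 = [("sort", "X1", "movie"), ("feat", "X1", "directed_by", "Y1"),
      ("sort", "Y1", "person"), ("feat", "X1", "genre", "G1"),
      ("sort", "G1", "horror")]
c2 = [("sort", "X2", "movie"), ("feat", "X2", "genre", "G2"),
      ("sort", "G2", "slasher")]
result = unify(c1, c2, "X1", "X2")
```

On this input the sketch yields a movie whose directed_by value is a person and whose genre value is a slasher, in line with the unifier described in Example II.7; which tag names survive as class representatives depends on the processing order.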
An advantage of OSF term unification is that it takes into account a subsumption relation on sorts. This may cause a single unification step to potentially replace several resolution steps, as the following example from [26] shows.
Example II.8 (Prolog resolution and OSF unification): Consider the Prolog program of Fig. 2(a). The query ?- sₙ(X), prop(X) will require n resolution steps before matching X = a. The OSF version of the same program is depicted in Fig. 2(b). The program involves declarations of shape s < s' for the sort subsumption relation, and of shape {a} < s for the instances of the sort symbols. Constants, such as a, are treated as singleton sorts (sorts that denote a single element), and thus as OSF terms themselves. The symbol prop is a predicate symbol, which takes OSF terms as arguments. Since a is subsumed by sₙ, the query prop(X : sₙ) (which aims to retrieve individuals X of sort sₙ that satisfy the property prop) now succeeds in a single unification step rather than n resolution steps.
The reader is referred to [24] for more details about OSF logic and its semantics.

B. Fuzzy OSF Logic
Fuzzy OSF logic [30] generalizes OSF logic by considering a fuzzy subsumption relation between sorts and by interpreting sorts as fuzzy sets. More precisely, the set S of sorts is ordered by a fuzzy lattice • : S × S → [0, 1], and the meaning of a sort s ∈ S in an interpretation I = (Δ^I, ·^I) is a fuzzy set s^I : Δ^I → [0, 1], where Δ^I is the domain or universe of the interpretation. We rely on the minimum as a t-norm (denoted ∧) and on the maximum as a t-conorm (denoted ∨).
While in the crisp setting the subsumption s ⪯ s′ denotes the inclusion of the sets denoted by s and s′, we consider a fuzzy subsumption relation as a way to model a weaker notion of inclusion, which requires that every instance of s^I must also be an instance of s′^I, but possibly with a lower membership degree. Thus, we define a fuzzy subsumption relation as a fuzzy lattice • : S² → [0, 1] that associates a subsumption degree β with each pair of sorts s, s′ ∈ S and has the following semantics:
if •(s, s′) = β, then for all d ∈ Δ^I: s′^I(d) ≥ s^I(d) ∧ β.
That is, any object d that is an instance of s^I with degree β′ must also be an instance of s′^I with a degree that is greater than or equal to the minimum of β and β′. Note that Zadeh's inclusion of fuzzy sets [39] is a special case of this constraint where β = 1. An example of a fuzzy sort subsumption relation is given in Fig. 4(c).
The interpretation of sorts as fuzzy sets can be extended to OSF terms, so that the denotation of an OSF term t in an interpretation I is likewise a fuzzy subset of Δ^I. Analogously, the fuzzy sort subsumption relation can be extended to a fuzzy subsumption between OSF terms, which is a fuzzy lattice on (equivalence classes of) OSF terms [30].
Like their crisp counterparts, fuzzy OSF logic and fuzzy DLs (e.g., [40] and [41]) share a few common aspects. For instance, fuzzy DL concepts are also interpreted as fuzzy subsets of a domain. On the other hand, fuzzy DLs interpret roles as fuzzy relations, while fuzzy OSF logic maintains a crisp interpretation for feature symbols. Moreover, the notion of fuzzy subsumption defined above for fuzzy OSF logic differs significantly from that of graded subsumption in fuzzy DLs, which relies on the fuzzy implication operator.
With respect to computational complexity, the GLB of two OSF terms in the fuzzy subsumption lattice can be determined via their unification, through a constraint normalization procedure, which is essentially the same as the one for crisp OSF logic [30]. The only difference involves the rule Sort Intersection, which in the fuzzy setting relies on the computation of the GLB of two sorts in a fuzzy lattice rather than a crisp one. Because of Proposition A.15, however, this computation can be reduced to the crisp setting, so that unifying two terms ψ₁ and ψ₂ in fuzzy OSF logic has the same complexity as in crisp OSF logic, which is O(m G(m)), where m = |Tags(ψ₁) ∪ Tags(ψ₂)|, and the growth rate of the function G is of the order of an inverse of the Ackermann function (G(m) ≤ 5 for all practical purposes) [26]. In fuzzy OSF logic, the unifier of two OSF terms is also associated with a unification degree, whose computation requires O(m(|S| + e)) time, where m = |Tags(ψ₁) ∪ Tags(ψ₂)|, and e is the number of edges in the graph representation of the fuzzy sort subsumption relation [30]. This computation depends on determining the subsumption degree •(s, s′) between pairs of sorts. This operation can be performed with an approach that is analogous to solving the shortest paths problem in a directed acyclic graph, and that can be optimized through the application of graph encoding techniques [42].
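The path-based computation of a subsumption degree can be sketched as a memoized traversal of the sort graph. This is only an illustration under stated assumptions: it assumes that degrees compose by minimum along a path and by maximum over alternative paths, which matches the min/max operators used in this section but is not spelled out in the text, and the graph `EDGES` (including the sort "genre") is hypothetical.

```python
from functools import lru_cache

# Hypothetical DAG: child sort -> {parent sort: degree of the edge}.
EDGES = {
    "slasher": {"horror": 1.0, "thriller": 0.5},
    "horror": {"genre": 1.0},
    "thriller": {"genre": 1.0},
}


def subsumption_degree(s, t):
    """Best (max over paths) of the weakest (min over edges) link from s to t."""

    @lru_cache(maxsize=None)
    def best(x):
        if x == t:
            return 1.0
        return max(
            (min(w, best(y)) for y, w in EDGES.get(x, {}).items()),
            default=0.0,
        )

    return best(s)
```

Because each sort is visited once and each edge is examined once, a single query runs in time linear in the number of sorts plus edges, which is consistent with the |S| + e factor quoted above.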
Example II.9 (Fuzzy OSF term unification): Consider the fuzzy subsumption relation • represented in Fig. 4(c). By employing the same constraint normalization rules of crisp OSF logic (with the only difference being that the GLB in the rule Sort Intersection is computed in a fuzzy lattice) it is possible, for example, to unify the terms t₁ = movie(genre → horror) and t₂ = movie(genre → thriller), resulting in the unifier movie(genre → slasher). The unifier is subsumed by t₁ with degree 1 (since •(slasher, horror) = 1), and by t₂ with degree 0.5 (since •(slasher, thriller) = 0.5). We thus say that the unification degree of t₁ and t₂ is 0.5.
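At the level of sorts, the computation in Example II.9 can be sketched as follows. Only the two degrees •(slasher, horror) = 1 and •(slasher, thriller) = 0.5 come from the text; the remaining sorts, the function names, and the choice to compute the GLB on the support of the fuzzy relation (pairs with nonzero degree) are assumptions made for illustration.

```python
# Degrees quoted in Example II.9; everything else is an assumed fragment.
DEGREE = {("slasher", "horror"): 1.0, ("slasher", "thriller"): 0.5}
SORTS = ["top", "horror", "thriller", "slasher", "bottom"]


def sub(s, t):
    """Fuzzy subsumption degree of s under t."""
    if s == t or t == "top" or s == "bottom":
        return 1.0
    return DEGREE.get((s, t), 0.0)


def fuzzy_glb(a, b):
    """GLB computed on the support of the fuzzy relation (assumption)."""
    lower = [x for x in SORTS if sub(x, a) > 0 and sub(x, b) > 0]
    return next(x for x in lower if all(sub(y, x) > 0 for y in lower))


def unification_degree(a, b):
    """Return the GLB of two sorts and the weaker of its two subsumption degrees."""
    g = fuzzy_glb(a, b)
    return g, min(sub(g, a), sub(g, b))
```

Calling `unification_degree("horror", "thriller")` gives the sort slasher together with the degree 0.5, the minimum of the two degrees mentioned in the example.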

III. SIMILARITY-BASED OSF LOGIC, INFORMALLY
In a standard setting, two FOTs can be unified if there is a substitution that makes them equal. A mismatch of functor symbols, such as f and g, in the terms t₁ ≝ f(X, a) and t₂ ≝ g(b, Y) causes their unification to fail. With similarity-based unification (e.g., [5]), it is possible to weakly unify the terms t₁ and t₂ provided that f ∼ g with degree α ∈ (0, 1]. In this case, the two terms weakly unify with approximation degree α. A similarity can also be considered when resolving clauses, leading to similarity-based selective linear definite (SLD) resolution [5].
Example III.1 (Similarity-based SLD resolution): Consider, for example, the Prolog program of Fig. 3 and assume that horror and thriller are similar to degree 0.5. Then, the query ?- thriller(X) will return, besides X = "Memento", also the solution X = "Psycho" with approximation degree 0.5. This is due to the fact that, thanks to the similarity relation, the query will resolve with the first clause of the program, leading to an approximate solution. Intuitively, since "Psycho" is a horror movie as a consequence of the first rule, and horror movies are similar to thrillers, then, to some degree, "Psycho" is also a thriller.
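The effect described in Example III.1 can be mimicked with a toy lookup over unary facts. This is not an SLD resolution engine: the two facts are an assumed reading of the program of Fig. 3 (which is not reproduced here), and the names `FACTS`, `sim`, and `query` are hypothetical.

```python
# Assumed content of the program: one horror fact and one thriller fact.
FACTS = [("horror", "Psycho"), ("thriller", "Memento")]
SIMILARITY = {frozenset({"horror", "thriller"}): 0.5}


def sim(p, q):
    """Similarity degree between two predicate symbols."""
    if p == q:
        return 1.0
    return SIMILARITY.get(frozenset({p, q}), 0.0)


def query(predicate):
    """Return each answer with its best approximation degree."""
    answers = {}
    for fact_predicate, argument in FACTS:
        degree = sim(predicate, fact_predicate)
        if degree > 0:
            answers[argument] = max(degree, answers.get(argument, 0.0))
    return answers
```

With these assumptions, `query("thriller")` returns "Memento" with degree 1.0 and "Psycho" with degree 0.5, mirroring the exact and approximate solutions of the example.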
As seen in Section II-A, the unification of two OSF terms aims to combine the constraints expressed by the two terms in a consistent way into a single term. This procedure takes the sort subsumption relation into account, so that it is possible, for instance, to unify the term t₁ ≝ s(f₁ → s₁) with a term t₂ rooted at a sort s′ such that s ⋏ s′ ≠ ⊥.
Our goal is to define a more flexible, or weaker, unification procedure for OSF terms that also takes into account a similarity relation ∼ : S × S → [0, 1] on sort symbols. Using this additional information it could be possible, for example, to unify the terms t₁ and t₂ even if s ⋏ s′ = ⊥.
Example III.2 (Weak OSF term unification: similar sorts): Consider for example the set of sorts S, the subsumption relation ⪯ ⊆ S² and the similarity ∼ : S² → [0, 1] specified in Fig. 4(a). In addition, consider the sort "Psycho", which only subsumes ⊥ and is only subsumed by ⊤, and the OSF terms ψ₁ ≝ horror(title → "Psycho") and ψ₂ ≝ thriller(title → X : ⊤). Clearly, the two terms cannot be unified since horror ⋏ thriller = ⊥, but they could be weakly unified if we additionally consider that horror and thriller are similar to degree 0.5. Intuitively, as ψ₁ and ψ₂ denote movies of similar genres (and all of the other constraints can be combined in a consistent way), they could be weakly unified.
Besides matching similar sorts, weak OSF term unification could also consider the interaction between the subsumption and the similarity relations, as the following example shows.
Example III.3 (Weak OSF term unification: subsort of a similar sort): Continuing from Example III.2, consider the term ψ₃ ≝ slasher(title → "Psycho"), which unifies with ψ₁ since slasher ⪯ horror. Moreover, as horror ∼ thriller with degree 0.5, ψ₃ should also weakly unify with ψ₂. In other words, since the genre of ψ₃ is subsumed by a genre (horror) that is similar to the genre of ψ₂ (and all of the other constraints can also be combined in a consistent way), the two terms could be weakly unified, achieving a result analogous to Example III.1.
In addition, similarly to how a single OSF term unification step can replace several resolution steps (as in Example II.8), similarity-based OSF term unification should be capable of replacing several similarity-based resolution steps.
Example III.4 (Similarity-based resolution and weak OSF term unification): Consider the logic program of Fig. 5(a) and assume that ∼ is a similarity relation such that, for each 1 ≤ i ≤ n, s₂ᵢ₋₁ ∼ s₂ᵢ with degree αᵢ. Then, the query ?- s₂ₙ₊₁(X), prop(X) will require several similarity-based SLD resolution steps before returning the solution X = a. Now, consider the OSF version of the same program of Fig. 5(b) and the query prop(X : s₂ₙ₊₁). Because a is subsumed by s₁, which is similar to s₂, which is subsumed by s₃, and so on, up to s₂ₙ₊₁, the query should succeed with a single unification of the terms X : s₂ₙ₊₁ and a.
These examples show that a similarity-based notion of OSF term unification should consider the interaction between the subsumption and similarity relations on sorts, besides allowing sorts that are equal, similar, or with a GLB different from ⊥.
In order to achieve this, we propose to combine the (crisp) subsumption ⪯ and the similarity ∼ into a single fuzzy subsumption relation •. The advantage is that it would then be possible to employ the usual unification algorithm of OSF terms [24], [30], taking into account the fuzzy subsumption •, which incorporates the information of both ⪯ and ∼ at the same time. Intuitively, the combination of ⪯ and ∼ is inspired by the following informal inference:
If the sort s₀ is subsumed by the sort s₁ and s₁ is similar to the sort s₂ with degree α, then s₀ is subsumed by s₂ with degree α.
For example, considering the setting of Example III.3, since slasher ⪯ horror and horror ∼ thriller with degree 0.5 (see Fig. 4(a)), we could conclude that •(slasher, thriller) = 0.5 (see Fig. 4(c)). Thus, it would be possible to unify the terms ψ₁ and ψ₂, since, according to Fig. 4(c), the GLB of horror and thriller in the fuzzy subsumption • is slasher. The unification would result in the term slasher(title → "Psycho"), which is subsumed by ψ₁ with degree 1, and by ψ₂ with degree 0.5 (and thus the overall unification degree is 0.5). In a similar manner, ψ₂ and ψ₃ can be unified with degree 0.5. Analogously, in the context of Example III.4, since a ⪯ s₁ ∼ s₂ ⪯ s₃ ∼ ⋯ ∼ s₂ₙ ⪯ s₂ₙ₊₁, where the i-th similarity step has degree αᵢ, by repeatedly applying the same informal inference we obtain •(a, s₂ₙ₊₁) = α = min{αᵢ | 1 ≤ i ≤ n}, thus enabling the weak unification of a with X : s₂ₙ₊₁ with degree α.
The advantage of this approach is that it allows the similarity relation to be seamlessly integrated into the unification rules of fuzzy OSF logic, taking the interaction between the similarity and the crisp subsumption into account, potentially allowing a single weak unification to replace several similarity-based SLD resolution steps, all the while maintaining the same computational complexity of crisp OSF logic for the computation of the unifier of two OSF terms. The only additional cost is the construction of the fuzzy subsumption, which is only required once, before the queries are processed.
The following section is devoted to the formal validation of this procedure. Specifically, it demonstrates that the fuzzy subsumption defined by combining a crisp subsumption and a similarity leads indeed to a fuzzy lattice, as required by the unification rules of fuzzy OSF logic.

A. Combining a Subsumption and a Similarity
As a first step toward our formal definition of a fuzzy subsumption relation that combines a (crisp) subsumption relation and a sort similarity, we define a fuzzy preorder on sort symbols, which intuitively arises from the iterated application of the informal inference discussed above.
Definition IV.1 (Similarity-subsumption chain and induced fuzzy relation): Let (S, ⪯) be a partial order, let ∼ : S² → [0, 1] be a similarity relation, and let s, s′ ∈ S. A similarity-subsumption chain of strength α from s to s′ is a sequence of sorts that leads from s to s′ by alternating subsumption (⪯) and similarity (∼) steps, where the similarity steps have degrees α₀, …, αₙ and α = min{αᵢ | 0 ≤ i ≤ n}. The induced fuzzy relation on S assigns to each pair (s, s′) the greatest α such that there is a similarity-subsumption chain of strength α from s to s′.
Alternatively, it is possible to define this relation as the transitive closure of the composition of ⪯ and ∼.
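The closure-based formulation lends itself to a short computational sketch. The code below assumes max-min composition and a max-min transitive closure (computed in Floyd-Warshall style), which is the usual reading with minimum as t-norm and maximum as t-conorm; the function name `combine` and the three-sort example are illustrative, with only slasher below horror and horror similar to thriller with degree 0.5 taken from the text.

```python
def combine(sorts, subsumed, similarity):
    """Max-min transitive closure of the composition of a crisp order and a similarity.

    `subsumed` holds the strict pairs (s, t) with s below t (assumed to be
    transitively closed); `similarity` maps unordered pairs to degrees.
    """

    def leq(s, t):
        return 1.0 if s == t or (s, t) in subsumed else 0.0

    def sim(s, t):
        return 1.0 if s == t else similarity.get(frozenset((s, t)), 0.0)

    # Composition: first a subsumption step, then a similarity step.
    r = {
        (s, t): max(min(leq(s, u), sim(u, t)) for u in sorts)
        for s in sorts
        for t in sorts
    }
    # Transitive closure in the (max, min) algebra.
    for k in sorts:
        for i in sorts:
            for j in sorts:
                r[i, j] = max(r[i, j], min(r[i, k], r[k, j]))
    return r


SORTS = ["slasher", "horror", "thriller"]
relation = combine(
    SORTS,
    subsumed={("slasher", "horror")},
    similarity={frozenset(("horror", "thriller")): 0.5},
)
```

In this small example the resulting relation gives slasher a degree of 0.5 under thriller, and it also contains the symmetric pair of degrees between horror and thriller that comes directly from the similarity.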

B. Dealing With Cycles
It is clear from Example IV.3 that the fuzzy relation of Definition IV.1 is in general not a fuzzy lattice, and not even a fuzzy partial order, as it is not antisymmetric. One reason is that the support of ∼ is contained in the support of this relation and ∼ is symmetric; the relation will thus contain symmetric links (such as the equal, nonzero degrees assigned to (thriller, horror) and (horror, thriller) in Example IV.3) that are due directly to the similarity relation. A solution in this case simply consists in deleting such similarity links, which is also justified by the fact that these are not needed anymore for the purpose of unifying two terms (for example, the fact ∼(thriller, horror) > 0 has already been used to define a nonzero degree for (slasher, thriller), which is enough, for instance, to make the terms ψ₂ and ψ₃ of Examples III.
While cases such as the latter one could be blamed on an improper modeling of the subsumption and the similarity relations, we propose a solution that consists in the definition of a fuzzy partial order on equivalence classes of sorts starting from a fuzzy preorder on sorts. This construction generalizes to fuzzy set theory the well-known order-theoretic transformation of a preorder (X, P) into a partial order on the quotient of X modulo the equivalence relation x ≈ y ⇔ (x, y) ∈ P and (y, x) ∈ P.
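The crisp order-theoretic construction mentioned above can be sketched directly. The code covers only the classical (non-fuzzy) case that the fuzzy construction generalizes; the function name `quotient` and the three-element preorder are hypothetical.

```python
def quotient(elements, preorder):
    """Turn a preorder into a partial order on equivalence classes.

    `preorder` is a set of pairs, assumed reflexive and transitive.
    Two elements are equivalent when each is below the other.
    """
    classes = []
    for x in elements:
        cls = frozenset(
            y for y in elements if (x, y) in preorder and (y, x) in preorder
        )
        if cls not in classes:
            classes.append(cls)
    # By transitivity, any representative gives the same answer.
    order = {
        (c, d)
        for c in classes
        for d in classes
        if (next(iter(c)), next(iter(d))) in preorder
    }
    return classes, order


ELEMENTS = ["a", "b", "c"]
PREORDER = {(x, x) for x in ELEMENTS} | {
    ("a", "c"), ("c", "a"),  # a cycle: a and c collapse into one class
    ("a", "b"), ("c", "b"),
}
classes, order = quotient(ELEMENTS, PREORDER)
```

Here the cycle between a and c is collapsed into the single class {a, c}, which lies below the class {b}; the resulting relation on classes is antisymmetric.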
Note that the same symbol • is used for the preorder on S and the fuzzy relation on S/≈, since its meaning is clear from context.
Example IV.10 (Fuzzy partial order • on S/≈): Continuing from Example IV.7, Fig. 6(e) depicts the fuzzy partial order • on the equivalence classes of S obtained from the fuzzy preorder • on S of Fig. 6(d), as defined in Definition IV.9. Then, for example, the similarity-based unification of the OSF terms b(f → ⊤) and c(f → ⊤) could be performed by considering the fuzzy subsumption on the equivalence classes of the sorts, resulting in the unifier t₁ = {a, c}(f → ⊤), where {a, c} is a disjunctive sort, and t₁ is a disjunctive OSF term [24]. Alternatively, the construction of Definition IV.9 could be applied to the fuzzy preorder represented in Fig. 6(b), leading to the partial order of Fig. 6(c). In this case, the similarity-based unification of the same two terms would result in t₂ = {a, b, c, d}(f → ⊤).
Proposition IV.11 (Fuzzy partial order • on S/≈): The fuzzy relation • of Definition IV.9 is a fuzzy partial order on S/≈.

C. From a Fuzzy Poset to a Fuzzy Lattice
In (crisp) OSF logic, the sort subsumption relation is by definition required to be a finite lattice. In practice, however, it is enough for (S, ⪯) to be a partial order and to implicitly work on a completion of (S, ⪯) consisting of a lattice of sets of sorts, where singletons are treated as normal sorts, while sets of two or more sorts are treated as disjunctive sorts [24]. Formally, the lattice is defined on the antichains of the sort subsumption partial order and, from an implementation standpoint, it does not need to be explicitly constructed [43].
Analogously, the fuzzy sort subsumption relation of fuzzy OSF logic is required to be a fuzzy lattice [30]. In this section, we present a generalization to fuzzy set theory of the completion of a partial order into a lattice, i.e., we show how to construct a fuzzy lattice on the antichains of a fuzzy poset.
Definition IV.12 (Antichains in a fuzzy poset): Let (S, •) be a fuzzy poset. Two elements s, s′ ∈ S are said to be incomparable, denoted as s ∥ s′, if •(s, s′) = 0 and •(s′, s) = 0. An antichain is a subset C ⊆ S such that, for all s, s′ ∈ C, s ∥ s′ if s ≠ s′. The set of all antichains of (S, •) is denoted Antichains(S).
Definition IV.13 (Fuzzy antichain ordering): A fuzzy partial order • on S can be extended to a fuzzy relation • on Antichains(S) by letting, for all C, C′ ∈ Antichains(S): •(C, C′) = min over s ∈ C of max over s′ ∈ C′ of •(s, s′). The same symbol • is used for the partial order on S and the fuzzy relation on Antichains(S), since its meaning is always clear from context. Before showing that • on Antichains(S) is a fuzzy lattice, we need the following definitions.
Definition IV.14 (Maximal elements and down sets): Let (S, •) be a fuzzy poset and S′ ⊆ S. The set of maximal elements of S′ is defined as the set of all s ∈ S′ such that •(s, s′) = 0 for every s′ ∈ S′ with s′ ≠ s. The (fuzzy) down set of S′ is defined as the set {s ∈ S | ∃s′ ∈ S′, •(s, s′) > 0}.
Note that, for any S ⊆ S, S • is an antichain.Proposition IV.15 (Fuzzy antichain lattice): Let (S, •) be a fuzzy poset.Then, (Antichains(S), •) from Definition IV.13 is a fuzzy lattice, where the GLB of C, C ∈ Antichains(S) is given by Example IV. 16 (GLBs in • ): Consider the subsumption and the similarity ∼ represented in Fig. 7(a), and the fuzzy partial order • obtained from their combination (according to Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.Definition IV.5) represented in Fig. 7(c).In this case, • is not a fuzzy lattice, since h and t do not have a GLB.Fig. 7(d) shows the completion of • as defined in Definition IV.13, in which the GLB of {h} and {t} is {s, p}.We can thus use this fuzzy order for the unification, e.g., of the OSF terms t 1 = m(f → h) and t 2 = m(f → t), which would result in the disjunctive OSF term m(f → {s, p}), which is subsumed by t 1 and t 2 with degree 0.5.
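The antichain completion can be illustrated on a four-sort fragment shaped like Example IV.16. Several points below are assumptions rather than statements of the source: the individual degrees in `DEGREE` (Fig. 7 is not reproduced here), the min-max form of the degree between antichains, and the computation of the GLB as the maximal elements of the intersection of the down sets.

```python
# Assumed degrees: s is below h and p is below t in the crisp order, and the
# cross links of degree 0.5 are taken to come from a similarity between h and t.
DEGREE = {("s", "h"): 1.0, ("s", "t"): 0.5, ("p", "t"): 1.0, ("p", "h"): 0.5}
SORTS = ["h", "t", "s", "p"]


def sub(x, y):
    """Fuzzy subsumption degree of x under y."""
    return 1.0 if x == y else DEGREE.get((x, y), 0.0)


def down_set(sorts):
    """All sorts with a nonzero degree under some element of `sorts`."""
    return {x for x in SORTS if any(sub(x, y) > 0 for y in sorts)}


def maximal(sorts):
    """Elements that are not below any other element of `sorts`."""
    return {x for x in sorts if not any(x != y and sub(x, y) > 0 for y in sorts)}


def antichain_glb(c1, c2):
    """Maximal elements of the intersection of the two down sets."""
    return maximal(down_set(c1) & down_set(c2))


def antichain_degree(c1, c2):
    """Every element of c1 must be covered by its best match in c2."""
    return min((max(sub(x, y) for y in c2) for x in c1), default=1.0)


glb_ht = antichain_glb({"h"}, {"t"})
```

Under these assumptions the GLB of {h} and {t} is {s, p}, and this antichain lies below both {h} and {t} with degree 0.5, in agreement with the degrees reported in the example.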
Similarly to the crisp setting, the completion of a fuzzy partial order • : S² → [0, 1] does not need to be computed explicitly in order to find the GLB of two sorts in Antichains(S), as the same encoding and decoding strategies of [43] and [44] can be employed in the fuzzy case.

V. POTENTIAL APPLICATIONS
A. Similarity-Based CEDAR
CEDAR [28], [29] is a Semantic Web reasoner based on OSF logic and OSF term unification, whose implementation relies on techniques that exploit the specificity of concept taxonomies, particularly the fact that subsumption orderings are central to all ontologies. The main capabilities of CEDAR are concept classification, Boolean query answering (i.e., answering queries formed by sort symbols and Boolean connectives), and answering queries represented as OSF terms (CEDAR features an extended OSF term syntax that also supports a few DL constructs [29]).
Before being executed, a query (expressed as an OSF term in which the variables of interest are marked by a question mark) is first optimized according to both the knowledge expressed in the given ontology and the OSF constraint normalization rules. The resulting query is then translated into a SPARQL query and executed by a SPARQL engine. This optimization can make the query more efficient to execute by reducing the instance retrieval search space. Moreover, it ensures the consistency of the input query against the ontology, so that no answer is provided if the query is inconsistent. The retrieval is further optimized thanks to a custom RDF triple indexing scheme based on OSF sort and attribute information [29]. The results of [28] and [29] show that, at the time of the experiments, CEDAR was consistently among the best Semantic Web reasoners in terms of concept classification, and several […]
[…] [29]) and the query ?X : person(works_at → ⊤), which aims to retrieve people that work somewhere. In addition, assume that the feature works_at applies to objects of sort researcher and points to objects of sort university. Then, the CEDAR query optimization step [29] would transform the given query into the query ?X : researcher(works_at → university). The reason is that, since X is of sort person and the feature works_at only applies to researchers, by the rule Sort Intersection the variable X must be of sort person ⋏ researcher = researcher. An analogous reasoning applies to the value of the feature works_at. The resulting query, which aims to retrieve researchers rather than people and universities rather than entities of any sort, benefits from a much smaller search space and is thus more efficient to execute.
On the other hand, CEDAR would find the query ?X : teacher(works_at → ⊤) to be inconsistent with the knowledge expressed in Fig. 8 (according to the subsumption relation, the sorts teacher and researcher are disjoint: teacher ⋏ researcher = ⊥), thus preventing the query from being translated into SPARQL and executed.
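The two outcomes discussed above can be illustrated with a small sketch of the Sort Intersection step. This is not CEDAR's implementation: the taxonomy fragment, the feature declaration, and the function names are assumptions made for illustration, and the GLB routine is only valid for a toy taxonomy in which incomparable sorts share no subsort.

```python
# Hypothetical fragment of a taxonomy (assumed, Fig. 8 is not reproduced here):
# each sort maps to its direct supersorts; "top" and "bottom" stand for the
# greatest and least sorts.
PARENTS = {
    "person": {"top"}, "university": {"top"},
    "researcher": {"person"}, "teacher": {"person"},
}
# Assumed feature declaration: works_at applies to researchers and yields universities.
FEATURES = {"works_at": ("researcher", "university")}

def ancestors(sort):
    """All supersorts of `sort`, including `sort` itself."""
    seen, stack = {sort}, [sort]
    while stack:
        for parent in PARENTS.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def glb(s1, s2):
    """GLB in this toy taxonomy: the lower of two comparable sorts, else bottom."""
    if s2 in ancestors(s1):
        return s1
    if s1 in ancestors(s2):
        return s2
    return "bottom"

def normalize(root_sort, feature, value_sort="top"):
    """Intersect the root sort with the feature's domain and the value with its range."""
    domain, codomain = FEATURES[feature]
    new_root, new_value = glb(root_sort, domain), glb(value_sort, codomain)
    if "bottom" in (new_root, new_value):
        return None  # the query is inconsistent with the taxonomy
    return new_root, feature, new_value

print(normalize("person", "works_at"))   # ('researcher', 'works_at', 'university')
print(normalize("teacher", "works_at"))  # None
```

The first call mirrors the rewriting of ?X : person(works_at → ⊤) into ?X : researcher(works_at → university), while the second mirrors the rejection of the query rooted at teacher.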
While the consistency check is useful for preventing unnecessary computations, there may be scenarios where offering approximate answers to inconsistent queries becomes desirable.For instance, users often do not possess full knowledge of the extensive ontologies they query, making the inconsistency of a query, such as the second one in Example V.1, potentially surprising.Furthermore, the inconsistency of this query is due to the closed world assumption of OSF logic, which states that, since teacher ⋏ researcher = ⊥, the two sorts must denote disjoint classes.However, it may be desirable to relax this consistency requirement in contexts where this assumption does not hold, such as settings where domain knowledge can evolve or be incomplete.
With these considerations in mind, similarity-based reasoning with OSF logic could be implemented in an extension of CEDAR in order to relax the consistency requirement and provide approximate answers even when the input query is inconsistent with respect to the ontology.This would be achieved, for example, by enriching the subsumption relation of the given ontology with a similarity, which would then be taken into account during the query optimization phase, making query answering with CEDAR more flexible.
Example V.2 (Similarity-based CEDAR query normalization): Continuing from Example V.1, in a similarity-based extension of CEDAR we could consider the similarity relation ∼ of Fig. 8, which specifies that researcher and teacher are similar. The constraint normalization rules used in the query optimization phase could then be executed over the fuzzy subsumption • obtained by combining the subsumption relation and ∼ as described in the previous section, where professor becomes the GLB of researcher and teacher. The query ?X : teacher(works_at → ⊤), which was previously inconsistent, can now be simplified to ?X : professor(works_at → university). In this case, the answers to this query are associated with the approximation (satisfaction) degree 0.5. Alternatively, instead of providing approximate answers, the new query could be suggested to the user as a replacement for the inconsistent query.
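A possible way to compute the GLB and the degree mentioned in Example V.2 is sketched below. The fuzzy subsumption degrees are assumed values standing in for the combination of the taxonomy and the similarity of Fig. 8, the least sort ⊥ is omitted so that a missing GLB signals inconsistency, and taking the smaller of the two subsumption degrees as the approximation degree is an assumption of this sketch.

```python
# Hypothetical fuzzy subsumption degrees (assumed values); pairs that are not
# listed have degree 0, and every sort is subsumed by itself with degree 1.
SORTS = ["person", "researcher", "teacher", "professor"]
ORDER = {
    ("researcher", "person"): 1.0, ("teacher", "person"): 1.0,
    ("professor", "person"): 1.0, ("professor", "researcher"): 1.0,
    ("professor", "teacher"): 0.5,
}

def degree(x, y):
    """Degree to which sort x is subsumed by sort y."""
    return 1.0 if x == y else ORDER.get((x, y), 0.0)

def lower_bounds(s1, s2):
    """Sorts subsumed to a positive degree by both s1 and s2."""
    return [s for s in SORTS if degree(s, s1) > 0 and degree(s, s2) > 0]

def fuzzy_glb(s1, s2):
    """Return (glb, degree), or None when the two sorts have no GLB.

    The GLB is a lower bound that subsumes every other lower bound; the
    reported degree is the smaller of its two subsumption degrees.
    """
    bounds = lower_bounds(s1, s2)
    for candidate in bounds:
        if all(degree(other, candidate) > 0 for other in bounds):
            return candidate, min(degree(candidate, s1), degree(candidate, s2))
    return None

print(fuzzy_glb("teacher", "researcher"))  # ('professor', 0.5)
print(fuzzy_glb("person", "researcher"))   # ('researcher', 1.0)
```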

B. Logic Programming With Similarity-Based OSF
Another possible application of our approach is the implementation of a logic programming language based on SLD resolution and OSF term unification (like LOGIN [26]), which additionally allows the specification of a sort similarity relation, similarly to how Bousi∼Prolog [13], [14] allows the specification of a similarity (or proximity) relation between functors, constants and predicates.⁸ The addition of a similarity relation allows the retrieval of approximate solutions to a query, improving the flexibility of query answering.
There are several potential advantages of using OSF logic in this context.First of all, as discussed in previous sections, the unification algorithm for OSF terms takes into account a (fuzzy) sort subsumption relation, which can lead to more efficient computations [26], [27].For instance, Example II.8 shows how a single OSF term unification can replace several resolution steps, and analogously Example III.4 discusses how a single weak OSF term unification can replace several similarity-based resolution steps.Another advantage is the flexibility provided by OSF terms, which lack a fixed arity and can, thus, easily represent partial information, and are, moreover, simpler to interpret thanks to their use of features rather than positions to specify arguments [26].
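The role of features and of the fuzzy sort ordering in unification can be illustrated with a deliberately simplified sketch. It assumes variable-free OSF terms represented as pairs (sort, features), ignores coreference between variables, and reads fuzzy GLBs from a small precomputed table; the table entries and all names are hypothetical.

```python
# Assumed precomputed fuzzy GLBs: unordered sort pair -> (GLB sort, degree).
GLB_TABLE = {
    frozenset({"movie", "thriller"}): ("thriller", 1.0),
    frozenset({"thriller", "psycho"}): ("psycho", 0.5),
    frozenset({"person", "director"}): ("director", 1.0),
}

def sort_glb(s1, s2):
    """Look up the GLB of two sorts; unknown pairs are treated as disjoint."""
    if s1 == s2:
        return s1, 1.0
    return GLB_TABLE.get(frozenset({s1, s2}), ("bottom", 0.0))

def unify(t1, t2):
    """Unify two variable-free OSF terms given as (sort, {feature: subterm}).

    Returns (term, degree), or None if the sorts of the terms (or of two
    subterms under a shared feature) have no GLB other than bottom.
    """
    (s1, f1), (s2, f2) = t1, t2
    sort, deg = sort_glb(s1, s2)
    if sort == "bottom":
        return None
    features = dict(f1)
    for name, sub in f2.items():
        if name in features:
            result = unify(features[name], sub)
            if result is None:
                return None
            features[name], sub_deg = result
            deg = min(deg, sub_deg)
        else:
            features[name] = sub  # partial information is simply merged
    return (sort, features), deg

query = ("thriller", {"director": ("person", {})})
fact = ("psycho", {"title": ('"Psycho"', {})})
print(unify(query, fact))
```

Because arguments are identified by feature names rather than positions, the two terms can be merged even though each mentions a feature that the other lacks.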
The following example illustrates a similarity-based OSF logic program, showcasing the flexibility of our approach in retrieving approximate solutions to queries.The computation of a solution to a given query relies on the weak unification of OSF terms, and SLD resolution on predicate symbols.Predicate symbols in this context are distinct from the sorts and features of OSF terms.They work exactly as in Prolog, with the only difference that they accept OSF terms rather than FOTs as arguments.
Example V.3 (Logic programming with OSF logic and a sort similarity): Consider the program of Fig. 9. A sort subsumption relation is specified by declarations of shape s < s', while an expression such as {c} < s means that the constant c is an instance of the sort s. Constants, such as c, are treated as singleton sorts (sorts that denote a single element), and thus as OSF terms themselves. A similarity declaration of shape x ∼ y = α specifies that the sorts x and y are similar to degree α; in this program, the sorts horror and thriller are similar to degree 0.5. A few facts with the predicate director_of, whose arguments are OSF terms, are then specified and, finally, a rule involving the predicates director_of and likes states that alinda likes thriller movies. The query ?-likes(alinda,Y : movie) is first reduced by resolution to the goal director_of(X : person, Y : thriller), which is then resolved against the facts of the program, returning the following solutions.
1) A solution binding Y to memento(title -> "Memento"), through the unification of thriller and memento(title -> "Memento").
2) A solution binding Y to psycho with approximation degree 0.5, through the weak unification of thriller and psycho (psycho is a horror, which is similar to thriller).
3) A solution binding Y to halloween(year -> 1979) with approximation degree 0.5, through the weak unification of thriller and halloween(year -> 1979) (halloween is a horror, which is similar to thriller).
Behind the scenes, the computations would involve the unification of OSF terms over the fuzzy subsumption • obtained by combining, before the execution of the queries, the subsumption relation and the similarity ∼ specified by the program. Note that in this example a similarity was only specified between sort symbols in order to perform weak OSF term unification, but it could be possible to further extend our approach by also considering a similarity between predicate symbols in order to perform similarity-based SLD resolution. This would be analogous, for instance, to Bousi∼Prolog, which supports a similarity relation on predicate symbols (in order to perform similarity-based SLD resolution) and on functor symbols (in order to perform weak FOT unification).
We conclude with an example that showcases another advantage of our approach with respect to other similarity-based logic programming languages, such as Bousi∼Prolog, in particular when dealing with cycles in the specification of the subsumption and similarity relations.
Example V.4 (Cycles in Bousi∼Prolog and similarity-based OSF logic): Consider the logic program enriched with a similarity relation of Fig. 10(a), which follows the syntax of Bousi∼Prolog [13], [14]. The program starts by describing the similarity and the subsumption relations of Fig. 6(a) through declarations of shape x ∼ y = α and two Horn rules, followed by a few facts that specify the instances of the predicates a, b, c, and d. Loading this program in Bousi∼Prolog and querying a(X) leads to an infinite loop (the program of Fig. 10(a) was tested on Bousi∼Prolog 3.6.1). The reason is that, since a and d are similar, the query a(X) resolves with the clause d(X) :- c(X) by similarity-based SLD resolution, and the goal becomes c(X). Then, since b and c are similar, the new goal resolves with the clause b(X) :- a(X) by similarity-based SLD resolution, resulting in the goal a(X) again, so that the cycle starts over.
Now consider the similarity-based OSF logic program of Fig. 10(b), which describes the same setting. The subsumption and similarity defined in the program would be combined into a fuzzy subsumption relation before the execution of the query a(X), leading to the fuzzy partial order of Fig. 6(c), or the fuzzy partial order • of Fig. 6(e) (this could depend on an additional parameter passed to the program). In the first case, the answers to the query would include the instances of a, b, c, and d (i.e., alice, bob, carol, and david), while in the second case they would consist of the instances of a and c (i.e., alice and carol).
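The reason why precomputing the combined relation avoids the loop can be seen in a small sketch. It assumes that the combined relation is obtained as the max-min transitive closure over all subsumption and similarity links (a reading based on the chains of Example IV.3), that each sort has the single instance named in the example, and that the similarity degrees are 0.6 and 0.4; these degrees are hypothetical, since the values used in Fig. 6 are not reproduced here. Only the first of the two cases discussed above is illustrated.

```python
SORTS = ["a", "b", "c", "d"]
ALPHA, BETA = 0.6, 0.4  # hypothetical similarity degrees for a ~ d and b ~ c

def combined_links():
    """Subsumption links (degree 1) together with the symmetric similarity links."""
    rel = {(x, y): 1.0 if x == y else 0.0 for x in SORTS for y in SORTS}
    rel[("a", "b")] = rel[("c", "d")] = 1.0    # a is below b, c is below d
    rel[("a", "d")] = rel[("d", "a")] = ALPHA  # a ~ d
    rel[("b", "c")] = rel[("c", "b")] = BETA   # b ~ c
    return rel

def max_min_closure(rel):
    """Floyd-Warshall-style max-min closure; it terminates on any finite set of sorts."""
    closure = dict(rel)
    for k in SORTS:
        for i in SORTS:
            for j in SORTS:
                via_k = min(closure[(i, k)], closure[(k, j)])
                closure[(i, j)] = max(closure[(i, j)], via_k)
    return closure

# Assumed facts: one instance per sort, as in the example.
INSTANCES = {"a": "alice", "b": "bob", "c": "carol", "d": "david"}

closure = max_min_closure(combined_links())
answers = {INSTANCES[s]: closure[(s, "a")] for s in SORTS if closure[(s, "a")] > 0}
print(answers)  # {'alice': 1.0, 'bob': 0.4, 'carol': 0.6, 'david': 0.6}
```

The cycle between the sorts is resolved once, in three nested loops over a finite set, so answering a(X) afterwards only requires a lookup and cannot diverge.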

VI. CONCLUSION
We have presented an approach to define approximate reasoning with OSF logic by extending its language with a similarity relation on sort symbols. By combining this similarity relation with the usual sort subsumption relation, it is possible to define a fuzzy subsumption relation that intuitively combines the information of both relations and their interaction. The fuzzy subsumption is then taken into account when unifying two OSF terms, using the same rules of (crisp) OSF logic.
As discussed in Section V, similarity-based OSF logic could be applied, for example, in the implementation of a fuzzy logic programming language or a similarity-based extension of the CEDAR reasoner [28], [29], which would be capable of returning approximate answers to queries.

APPENDIX A FUZZY SET THEORY
We recall the basic definitions of fuzzy set theory [39] to fix the notation.Whenever possible we use the same notation for fuzzy sets and orders as the one used for crisp sets and orders, but with the addition of a dot (•) to avoid ambiguity, e.g., the symbol for the intersection of two crisp sets is ∩, while the symbol for the intersection of two fuzzy sets is ⩀.We adopt the minimum as a t-norm (denoted ∧) and the maximum as a t-conorm (denoted ∨).
Definition A.1 (Fuzzy subset): A fuzzy subset F of a (crisp) set X is determined by its membership function μ_F : X → [0, 1].
Definition A.2 (Intersection and union of fuzzy subsets): The intersection ⩀𝓕 of a set 𝓕 of fuzzy subsets of a set X is defined by letting μ_⩀𝓕(x) […]
Remark A.4 (Fuzzy set notation): Throughout this article, the membership function of a fuzzy subset F of X is simply written […]
Definition A.5 (Fuzzy binary relation): A fuzzy binary relation R on a set X is a fuzzy subset of X × X, i.e., it is a function R : X × X → [0, 1].
Definition A.6 (Fuzzy preorder): A fuzzy binary relation R on a set X is a fuzzy preorder if it satisfies
∀x ∈ X, R(x, x) = 1, (fuzzy reflexivity)
∀x, y, z ∈ X, R(x, z) ≥ R(x, y) ∧ R(y, z). (max-min transitivity)
Definition A.7 (Fuzzy partial order): A fuzzy binary relation R on a set X is a fuzzy partial order if it satisfies (fuzzy reflexivity) and (max-min transitivity) and
∀x, y ∈ X, if R(x, y) > 0 and R(y, x) > 0, then x = y.
We adopt the definitions of lower bounds, GLBs, and fuzzy lattice from [48], [49], and [50].
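The conditions of Definitions A.6 and A.7 can be checked mechanically on a finite set. The following sketch assumes a relation stored as a dictionary from pairs to degrees (absent pairs have degree 0) and uses the minimum for ∧, as adopted above; the function names and the sample relation are illustrative only.

```python
from itertools import product

def deg(rel, x, y):
    """Degree R(x, y); pairs that are not stored have degree 0."""
    return rel.get((x, y), 0.0)

def is_reflexive(rel, elems):
    """Fuzzy reflexivity: R(x, x) = 1 for every x."""
    return all(deg(rel, x, x) == 1.0 for x in elems)

def is_max_min_transitive(rel, elems):
    """Max-min transitivity: R(x, z) >= min(R(x, y), R(y, z)) for all x, y, z."""
    return all(
        deg(rel, x, z) >= min(deg(rel, x, y), deg(rel, y, z))
        for x, y, z in product(elems, repeat=3)
    )

def is_antisymmetric(rel, elems):
    """Antisymmetry: R(x, y) > 0 and R(y, x) > 0 imply x = y."""
    return all(
        x == y or deg(rel, x, y) == 0 or deg(rel, y, x) == 0
        for x, y in product(elems, repeat=2)
    )

def is_fuzzy_preorder(rel, elems):
    return is_reflexive(rel, elems) and is_max_min_transitive(rel, elems)

def is_fuzzy_partial_order(rel, elems):
    return is_fuzzy_preorder(rel, elems) and is_antisymmetric(rel, elems)

ELEMS = ["x", "y", "z"]
R = {("x", "x"): 1.0, ("y", "y"): 1.0, ("z", "z"): 1.0,
     ("x", "y"): 0.8, ("y", "z"): 0.5, ("x", "z"): 0.5}
print(is_fuzzy_partial_order(R, ELEMS))  # True
```

Lowering R(x, z) to 0.3 in this sample would violate max-min transitivity, since min(R(x, y), R(y, z)) = 0.5.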
The following proposition states that the support (S, |•|) of a fuzzy lattice (S, •) is a lattice. In addition, if s is the GLB of s₀ and s₁ in (S, •), then s is also the GLB of s₀ and s₁ in (S, |•|), and vice versa. This property ensures that the computation of GLBs in a fuzzy lattice can be reduced to the crisp setting. For instance, it would be possible to employ the techniques of [43] and [44] on the weighted graph representation of a fuzzy lattice simply by ignoring the edge weights, preserving the same computational complexity.

APPENDIX B PROOFS OF RESULTS FROM SECTION IV
Proposition IV.2 (Equivalent definition of ): Let (S, ) be a partial order, let ∼ be a similarity relation on S, and let be as in Definition IV.1.Then, = ( ∼) ⊕ .
Proof: Let s, s ∈ S, and, for brevity, let R def = ∼.The fact that (s, s ) = R ⊕ (s, s ) is given by the following equivalences: (s, s ) = α ⇔ 1 α is the maximum such that there exists a sequence of sorts The equivalences are given by the following: 1) Definition IV.1; 2) R def =( ∼); 3) Definition A.8 and S is finite; 4) Definition A.9 and S is finite.Proposition IV.4 (Similarity-subsumption preorder): The fuzzy relation of Definition IV.1 is a fuzzy preorder.
Proof: Reflexivity holds since s ∼ 1 s for all s ∈ S.
Proof: Reflexivity is given by , and transitivity by the transitive closure.
Proposition IV.11 (Fuzzy partial order • on S /≈ ): The fuzzy relation • of Definition IV.9 is a fuzzy partial order on S /≈ .
Proof: First of all, note that the following holds by Definition IV.13: This fact will be used throughout the proof.

t₂ def= s′(f₂ → s₂) even if their root sorts s and s′ are different, as long as s ⋏ s′ ≠ ⊥. Moreover, as shown in Example II.8, this feature can allow a single unification step to replace several resolution steps, which can lead to more efficient computations.
let ∼ be a similarity relation on S, and let be as in Definition IV.1.Then, = ( ∼) ⊕ .Example IV.3 (Similarity-subsumption chain): Fig. 4(b) represents the fuzzy preorder obtained by combining the subsumption relation and the similarity ∼ of Fig. 4(a).The fuzzy subsumption edges added to are represented in green.Proposition IV.4 (Similarity-subsumption preorder): The fuzzy relation of Definition IV.1 is a fuzzy preorder.
2 and III.3 weakly unifiable).According to this intuition, we define the fuzzy relation • as follows.Definition IV.5 (Fuzzy preorder • ): Let ⊆ S 2 be a partial order, ∼ : S 2 → [0, 1] be a similarity relation and be as in Definition IV.1.The fuzzy relation • is defined by letting• def =(( .− ∼) ⊍ ) ⊕, where the difference .− is defined by letting, for two fuzzy sets F andG μ F .−G (x) def = 0 if μ G (x) > 0 μ F (x) otherwise.Note that the fuzzy union of ( .−∼) with is necessary in case | ∼ | ∩ = ∅,and the transitive closure is needed to ensure the transitivity of • in case transitive links are deleted by taking ( .− ∼).Example IV.6 (Fuzzy preorder • ): Fig. 4(c) represents the fuzzy relation • obtained by combining the subsumption and the similarity ∼-resulting in the relation , as represented in Fig. 4(b)-and then deleting the similarity links as per Definition IV.5.The result in this case is a fuzzy lattice that can be employed, for instance, for the unification of the terms ψ 2 and ψ 3 of Examples III.2 and III.3.Deleting symmetric links may work for simple cases, such as the one of Fig. 4, but in general the relation • may still not be antisymmetric, as the next example shows.Example IV.7 (Antisymmetry and • ): Consider Fig. 6(a), (b), and (d), where S = {a, b, c, d}, a b, c d, a ∼ d = α > 0,

Fig. 9. Logic program with OSF terms and a similarity relation.

(strong fuzzy antisymmetry) The pair (X, R) is called a fuzzy partially ordered set (fuzzy poset).
Definition A.8 (Composition of fuzzy binary relations): The (max-min) composition of two fuzzy binary relations R and Q on a finite set X is the fuzzy binary relation R ∘ Q defined by the membership function (R ∘ Q)(x, z) def= ⋁_{y∈X} (R(x, y) ∧ Q(y, z)). The n-ary composition of a fuzzy binary relation R with itself is defined by letting R^1 def= R and R^n def= R ∘ R^(n−1) for n > 1.
Definition A.9 (Reflexive and transitive closure of a fuzzy binary relation): The transitive closure of a fuzzy binary relation R is defined as R^⊕ def= ⊍_{m≥1} R^m. The reflexive and transitive closure R^* of a fuzzy binary relation R is obtained by letting R^*(x, y) def= 1 if x = y and R^*(x, y) def= R^⊕(x, y) otherwise.
Definition A.10 (Similarity relation): A fuzzy binary relation ∼ on a set X is a similarity if it satisfies (fuzzy reflexivity) and (max-min transitivity) and
∀x, y ∈ X, ∼(x, y) = ∼(y, x). (fuzzy symmetry)
Remark A.11 (Infix notation): If R is a fuzzy binary relation, then x R_α y stands for R(x, y) = α, and x R y stands for R(x, y) > 0.
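Definitions A.8 and A.9 translate directly into code on a finite set. The sketch below assumes relations stored as dictionaries from pairs to degrees (absent pairs have degree 0) and relies on the fact that, on a set with n elements, the powers beyond R^n add nothing to the fuzzy union, so the closure can be computed with n − 1 compositions. The function names and the sample relation are illustrative.

```python
def compose(r, q, elems):
    """Max-min composition: (R o Q)(x, z) = max over y of min(R(x, y), Q(y, z))."""
    return {
        (x, z): max(min(r.get((x, y), 0.0), q.get((y, z), 0.0)) for y in elems)
        for x in elems for z in elems
    }

def transitive_closure(r, elems):
    """Fuzzy union of the powers R^1, R^2, ..., R^n, where n = len(elems)."""
    closure = {(x, z): r.get((x, z), 0.0) for x in elems for z in elems}
    power = dict(closure)
    for _ in range(len(elems) - 1):
        power = compose(r, power, elems)  # R^(m+1) = R o R^m
        closure = {pair: max(closure[pair], power[pair]) for pair in closure}
    return closure

def reflexive_transitive_closure(r, elems):
    """Set the diagonal to 1 and use the transitive closure elsewhere."""
    closure = transitive_closure(r, elems)
    return {(x, z): 1.0 if x == z else closure[(x, z)] for x in elems for z in elems}

ELEMS = ["a", "b", "c"]
R = {("a", "b"): 0.7, ("b", "c"): 0.4}
print(transitive_closure(R, ELEMS)[("a", "c")])            # 0.4
print(reflexive_transitive_closure(R, ELEMS)[("a", "a")])  # 1.0
```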

Proposition A.15 (Fuzzy and crisp lattices): Let (S, •) be a fuzzy lattice. Then, (S, |•|) is a (crisp) lattice on S. Moreover, if ⋏ is the GLB operation for (S, |•|), then ⋏̇S′ = ⋏S′ for every subset S′ ⊆ S.
Proof: Let ⪯ def= |•| and (S, •) be a fuzzy lattice. It follows that ⪯ is a reflexive, antisymmetric, and transitive binary relation on S.
1) (Reflexivity) Since •(s, s) = 1 for all s ∈ S, then s ⪯ s for all s ∈ S.
2) (Antisymmetry) If s ⪯ s′ and s′ ⪯ s, then •(s, s′) > 0 and •(s′, s) > 0, and by the antisymmetry of • it follows that s = s′.
3) (Transitivity) If s₁ ⪯ s₂ and s₂ ⪯ s₃, then •(s₁, s₂) > 0 and •(s₂, s₃) > 0, so that by the transitivity of • it follows that •(s₁, s₃) ≥ min(•(s₁, s₂), •(s₂, s₃)) > 0, and thus, s₁ ⪯ s₃.
Let S′^ℓ def= {s ∈ S | s ⪯ s′, ∀s′ ∈ S′} denote the set of lower bounds of S′ ⊆ S in (S, ⪯), and note that S′^ℓ = S′^ℓ̇ for any S′ ⊆ S. Let S′ ⊆ S. Then, s = ⋏̇S′ ⇔ (i) s ∈ S′^ℓ̇ and (ii) ∀s′ ∈ S′^ℓ̇, s′ • s ⇔ (i) s ∈ S′^ℓ and (ii) ∀s′ ∈ S′^ℓ, s′ ⪯ s ⇔ s = ⋏S′.

[…] c₂ ∈ C₀ such that •(c₁, c₂) > 0, implying •(c₀, c₂) > 0. Because C₀ is an antichain, then c₀ = c₂, which implies c₁ = c₀ = c₂ by fuzzy antisymmetry of (S, •), so that c₀ ∈ C₁. The other direction is proved similarly.
Now, let C₀, C₁ ∈ Antichains(S), and C def= ⌈C₀↓• ∩ C₁↓•⌉•. Then, C is the GLB of C₀ and C₁ in (Antichains(S), •).
1) To prove that C • C₀ it suffices to show that for any c ∈ C there is some c₀ ∈ C₀ such that •(c, c₀) > 0. This immediately follows from the fact that if c ∈ C, then in particular c ∈ C₀↓•. The fact that C • C₁ is proved similarly.
2) Suppose C′ ∈ Antichains(S) is such that (v) C′ • C₀ and (vi) C′ • C₁. To show that C′ • C, it is sufficient to show that for any c′ ∈ C′ there is some c ∈ C such that •(c′, c) > 0. Hence, let c′ ∈ C′ be arbitrary. By (v) and (vi) there are some c₀ ∈ C₀ and c₁ ∈ C₁ such that c′ • c₀ and c′ • c₁. Then, c′ ∈ C₀↓• ∩ C₁↓•. Then, by definition of C there is some c ∈ C (possibly equal to c′) such that c′ • c.
Finally, suppose that s is the GLB of s₀, s₁ in (S, •). Then, {s₀} ⋏̇ {s₁} = ⌈s₀↓• ∩ s₁↓•⌉• = {s}.
Definition A.12 (Lower bounds in a fuzzy poset): Let (S, •) be a fuzzy poset and S′ ⊆ S. The set of lower bounds of S′ is defined as S′^ℓ̇ def= {s ∈ S | ∀s′ ∈ S′, s • s′}.
Definition A.13 (Fuzzy GLB): Let (S, •) be a fuzzy poset and S′ ⊆ S. The GLB of S′ is the unique s ∈ S′^ℓ̇ such that, for all s′ ∈ S′^ℓ̇, s′ • s. If the GLB of S′ exists, it is denoted ⋏̇S′, or simply s ⋏̇ s′ in case S′ = {s, s′}.
Definition A.14 (Fuzzy lattice and bounded lattice): A fuzzy poset (S, •) is a fuzzy lattice if every pair of elements has a GLB. A fuzzy lattice (S, •) is bounded if there are elements ⊥, ⊤ ∈ S such that, for all s ∈ S, •(⊥, s) = 1 and •(s, ⊤) = 1.