Constrained Interval Type-2 Fuzzy Sets

In many contexts, type-2 fuzzy sets (T2 FS) are obtained from a type-1 fuzzy set to which we wish to add uncertainty. However, in the current type-2 representation, there is no restriction on the shape of the footprint of uncertainty and the embedded sets (ESs) that can be considered acceptable. This leads, usually, to the loss of the semantic relationship between the T2 FS and the concept it models. As a consequence, the interpretability of some of the ESs and the explainability of the uncertainty measures obtained from them can decrease. To overcome these issues, constrained type-2 (CT2) fuzzy sets have been proposed. However, no formal definitions for some of their key components [e.g., acceptable ESs (AESs)] and constrained operations have been given. In this article, we provide some theoretical underpinning for the definition of CT2 sets, their inferencing and defuzzification method. To conclude, the constrained inference framework is presented, applied to two real-world cases and briefly compared to the standard interval type-2 inference and defuzzification method.


I. INTRODUCTION
TYPE-2 (T2) fuzzy sets (T2 FS) were introduced by Zadeh [1] in 1975 as an extension of type-1 (T1) fuzzy sets (T1 FS) so that it would be possible to model the uncertainty of membership functions (MFs). However, their use remained rare due to the significant increase in complexity of algorithms, until Mendel introduced many advances that made their practical use possible. T2, and particularly interval T2 (IT2) sets, have been successfully applied in many areas such as control [2], [3], classification and regression [4], and many other contexts.
T2 fuzzy systems have required additional representations, definitions, and algorithms, including those needed to build complete rule-based inferencing systems. One of these is the concept of the footprint of uncertainty (FOU), introduced by Mendel and John [5], which represents the existence of nonzero secondary membership values as a 2-D shaded area. Other novel methodologies have been suggested for type-reduction and defuzzification, largely based on algorithms for centroid defuzzification of T2 sets proposed initially by Karnik and Mendel [6], and subsequently enhanced by Wu and Mendel [7], and others. The original Karnik-Mendel (KM), modern enhanced Karnik-Mendel (EKM), and other similar recent algorithms are based on the concept of the embedded set (ES). Intuitively, an ES is a path along the surface of a T2 set, and it has been proven [8] that any T2 FS can be represented as the union of all its ESs (representation theorem). The process of finding the centroid of a T2 set then depends on finding the ES with the leftmost centroid and the ES with the rightmost centroid.
Although the current T2 and IT2 frameworks have been shown to have many advantages over T1 approaches, particularly in their ability to exhibit greater performance in most situations, we believe there are drawbacks. Two properties, which we believe decrease the overall interpretability of T2 systems, are: 1) there is currently no agreed mechanism to derive the FOU, particularly in the situation in which a concept being modeled by a T1 set has uncertainty added to form a T2 set representing the same concept; and 2) ESs may have any shape, including ones which bear no relationship to the concept being modeled.
To overcome these issues, constrained T2 (CT2) fuzzy sets (CT2 FS) have been proposed [9], [10]. The idea behind them is to address the two limitations above by: 1) providing an explicit method for generating the boundaries of the FOU that keeps a shape coherency [9] throughout the generation of the T2 set, based on an underlying concept modeled by a T1 set; and 2) restricting the acceptable ESs (AESs) that may be used to only a subset of all the ESs, in order to process only those shapes that may be considered meaningful in that specific context. Although the concept of CT2 FS has already been formulated [9], [10], some key components are currently lacking formal definitions such as the AESs, constrained inference, and centroid defuzzification.
In this article, we will provide some theoretical underpinning for this new constrained representation, focusing specifically on constrained IT2 (CIT2) fuzzy sets. In addition to formal definitions, a full inferencing and defuzzification framework is then proposed for the creation of CIT2 Mamdani-style fuzzy inference systems. Next, we compare and contrast the CIT2 approach with the recent framework introduced by Wu et al. [11], [12] for creating "well-shaped" T2 sets. Finally, a practical application will be shown and compared with the conventional IT2 representation in terms of interpretability and explainability of the outputs, performances, and running times. Specifically, a genetic architecture will be described for the automatic generation of CIT2 fuzzy systems, which will be tested on two real-world datasets. Although interpretability is itself a difficult and complex concept to define, and is somewhat subjective in nature, nevertheless we use worked examples and the practical applications to illustrate ways in which interpretability is enhanced. Throughout, we stress that the proposed CIT2 approach, which may be used in contexts in which explainability and interpretability are considered important, is an alternative to other approaches including the conventional T2 approach.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see http://creativecommons.org/licenses/by/4.0/

II. PRELIMINARY DEFINITIONS
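In this section, we provide formal definitions of the fuzzy concepts that will be used throughout this article (definitions taken or rephrased from [8], [9], [13], [14]).
Definition 1: A T1 FS A is a set of ordered pairs
$$A = \{(x, \mu_A(x)) \mid x \in X,\ \mu_A(x) \in [0,1]\} \tag{1}$$
with X being the universe of discourse (UOD).
Definition 2: A T2 FS Ã is a set of ordered pairs
$$\tilde{A} = \{((x,u), \mu_{\tilde{A}}(x,u)) \mid x \in X,\ u \in [0,1],\ \mu_{\tilde{A}}(x,u) \in [0,1]\} \tag{2}$$
in which X is the UOD.
Definition 3: An IT2 FS Ã is a T2 FS whose secondary grades take only the values 0 and 1
$$\tilde{A} = \{((x,u), \mu_{\tilde{A}}(x,u)) \mid x \in X,\ u \in [0,1],\ \mu_{\tilde{A}}(x,u) \in \{0,1\}\} \tag{3}$$
in which X is the UOD.
Definition 4: Given a T2 FS Ã, its FOU is the set of points (x, u) for which μ_Ã(x, u) > 0
$$\mathrm{FOU}(\tilde{A}) = \{(x,u) \mid x \in X,\ u \in [0,1],\ \mu_{\tilde{A}}(x,u) > 0\} \tag{4}$$
Definition 5: Given a T2 FS Ã and a value x ∈ X, we define the set of pairs J_x as
$$J_x = \{(x,u) \mid u \in [0,1],\ \mu_{\tilde{A}}(x,u) > 0\} \tag{5}$$
Definition 6: Given a T2 FS Ã and a value x ∈ X, a secondary MF is a function μ_Ã(x) such that
$$\mu_{\tilde{A}(x)} : J_x \to [0,1], \quad (x,u) \mapsto \mu_{\tilde{A}}(x,u) \tag{6}$$
The domain of a secondary MF is called the primary membership of x.
Definition 7: A T2 ES, denoted Ã_E, is a path along the T2 set it belongs to. It contains only one primary degree u_x for each x, with its associated secondary grade v_x
$$\tilde{A}_E = \{((x,u_x), v_x) \mid x \in X,\ (x,u_x) \in J_x,\ v_x = \mu_{\tilde{A}}(x,u_x)\} \tag{7}$$
Definition 8: A T1 ES, denoted A_E, represents a projection of a T2 ES, i.e., its secondary degree has been dropped. Therefore, it contains one primary degree u_x for each x
$$A_E = \{(x,u_x) \mid x \in X,\ (x,u_x) \in J_x\} \tag{8}$$
Fig. 1. In red, one of the ESs of the IT2 fuzzy set in grey (picture from [9]).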

III. MOTIVATION
In the literature, there are three main approaches to determine the upper and lower bounds of the FOUs of T2 FSs when starting from already existing T1 MFs modeling the same concept. The first one identifies the two boundary MFs by taking the parameters of the existing T1 MFs and adding some uncertainty to them [15]-[20]. For example, in the case of a T1 Gaussian with mean m and variance v, the upper and lower bounds of the FOU could be the Gaussians with mean m and variances v + k and v − k, respectively, with k being a positive real number smaller than v. The wider Gaussian lies above the narrower one everywhere, so it acts as the upper bound.
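The parameter-blurring construction can be illustrated with a minimal Python sketch. The function names and the numeric values (a "medium height" Gaussian centered at 170 with variance 25) are assumptions made for illustration only; the sketch relies on the fact that, for a fixed mean, the Gaussian with the larger variance is never below the one with the smaller variance.

```python
import math


def gaussian(x, mean, variance):
    """Membership degree of x in a T1 Gaussian MF."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance))


def fou_bounds_by_variance(x, mean, variance, k):
    """Lower and upper membership at x when the variance is blurred by +/- k."""
    if not 0 < k < variance:
        raise ValueError("k must satisfy 0 < k < variance")
    lower = gaussian(x, mean, variance - k)  # narrower curve: lower bound
    upper = gaussian(x, mean, variance + k)  # wider curve: upper bound
    return lower, upper


# Illustrative values only (not taken from the article).
lower, upper = fou_bounds_by_variance(x=180.0, mean=170.0, variance=25.0, k=9.0)
print(lower, upper)
```

At x equal to the mean, both bounds are 1, so this kind of FOU collapses to a single point at the peak.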
A different method defines the FOU as the area covered by the translation along the x-axis of the starting T1 MF by a factor c and −c, c ∈ R [21]-[23]. The result is a symmetrical blurring around the starting T1 MF. An example of an FOU obtained with this approach with a T1 Gaussian can be seen in Fig. 1.
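A rough numerical sketch of the translation approach follows. It approximates the FOU bounds at a point x by evaluating the starting MF under a finite grid of shifts in [−c, c]; the helper names, the grid resolution, and the example values are assumptions, not part of the cited methods.

```python
import math


def gaussian(x, mean, variance):
    """Membership degree of x in a T1 Gaussian MF."""
    return math.exp(-((x - mean) ** 2) / (2.0 * variance))


def translation_fou(mf, x, c, steps=200):
    """Approximate lower/upper FOU bounds at x for shifts k in [-c, c]."""
    shifts = [-c + 2.0 * c * i / steps for i in range(steps + 1)]
    degrees = [mf(x - k) for k in shifts]  # mf shifted by k is mf(x - k)
    return min(degrees), max(degrees)


def starting_mf(x):
    # Illustrative T1 Gaussian: mean 170, variance 25.
    return gaussian(x, mean=170.0, variance=25.0)


lower, upper = translation_fou(starting_mf, x=170.0, c=5.0)
print(lower, upper)
```

Unlike the variance-based construction, the upper bound here has a plateau of width 2c around the original peak, and the lower bound at the peak drops below 1.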
Another approach has also been proposed. It models the FOU so that it embeds all the T1 MFs obtainable from observations [24] or from the modeling of the same concept under different circumstances [25].
All those methods have in common the fact that they identify some T1 shapes as "meaningful" in their context and, then, use them to build the FOUs. However, when some fuzzy operators such as the KM type-reduction algorithm [6] are used, all the ESs are processed, regardless of their shape. As a consequence, ESs that could hardly represent the concept they are modeling will likely determine the endpoints of the defuzzified centroid. Since those ESs have a low interpretability due to their shape, the explainability of the output and, consequently, of the fuzzy system or set that generates it, decreases. This matters because, in recent years, building explainable intelligent systems has become increasingly important [26], [27].
We will now use the following example to support our claims. Suppose that we decide to model the concept of medium height using a T1 Gaussian MF, as shown in Fig. 2. We will call this set the T1 generator set (GS). If we want to build an IT2 FS from that, one of the possible approaches would be to ask different people to place the mean of the Gaussian on the x-axis, after its variance value had been previously determined (similar approaches can be found in [28] and [29]).
It is likely that we would obtain something similar to what is shown in Fig. 3, since the concept of medium height would vary slightly from person to person. Now we can use this collection of T1 MFs to determine the FOU of our IT2 FS.
As in [25], we will embed those sets in our FOU. To do so, we will use the translation method mentioned above, i.e., we will define our FOU as the area covered by the shifting of the GS from the leftmost to the rightmost Gaussian to embed. The result of this operation is shown in Fig. 4.
If we use the standard IT2 representation, the ESs within the FOU will have arbitrary shapes. That makes even the ES shown in Fig. 5 acceptable. In this particular context, it is clear that a T1 ES like that has very little relation with the concept of medium height. In fact, no observation of the participants' opinions during the experiment led to such a shape. Furthermore, this representation affects the centroid value and its explainability. The set shown in Fig. 6 has been obtained with the process described in our thought experiment above.
If one uses the KM procedure [6] to type-reduce it, the algorithm will find the two ESs that give us the left and right endpoints of the centroid. For the IT2 FS in Fig. 6, the results are shown in Fig. 7.
These sets do not seem to fit our case very well. That is because, to obtain the type-reduced value, the algorithm chose two ESs that did not represent any of the observations made during the experiment; additionally, those shapes could hardly represent the concept of medium height that is being considered.
System output defuzzification represents another useful example to see how the standard IT2 representation affects the interpretability and explainability of fuzzy systems. Consider, for example, the fuzzy output set shown in Fig. 8, and its associated left and right endpoints shown in Figs. 9 and 10, respectively. In Fig. 9, the ESs of the left endpoint derived using the constrained centroid [see Fig. 9(a)] and the KM procedure [see Fig. 9(b)] are compared. Similarly, Fig. 10 compares those of the right endpoint. The ESs used for the constrained centroid preserve the same level of interpretability as T1 system outputs, in that the shapes of the GSs are clearly identifiable and so are the firing strengths that generated them. As a consequence, it is possible to get an intuitive idea of the sets that lead to the endpoints. In addition, knowing which rules (and, therefore, which inputs and antecedents) generated the ESs from which the endpoints are obtained explains how and why the final output of the system has been obtained. In the KM case, on the other hand, the shape coherency with the original shape is partly lost and the firing strengths are not as clear as in the CIT2 case.
Intuitively, the standard T2 definition gives too much "mathematical freedom" in some contexts, posing no restrictions on the shape of the FOU and of the ESs, especially when modeling T2 MFs from an underlying concept represented as a T1 FS with uncertainty. For these reasons, CIT2 FSs were proposed, in which both the FOU and the ESs considered as acceptable have a shape that is "meaningful" for the context in which they are used.
The specific sense of "meaningfulness" can vary. The intuitive idea is that the shape of the MFs should be reasonable for the semantic meaning they carry. For example, in the case of the concept of medium height, only a MF that monotonically increases up to a plateau and then monotonically decreases would be "meaningful". That is simply because any MF without these properties would result in a counter-intuitive set for the representation of the medium height concept.
In other contexts, meaningful shapes can be obtained as a result of experimental observations, data analysis or experts' knowledge. The topic has been discussed in detail in [30], in which a possible mathematical definition for the concept of meaningfulness in the context of fuzzy sets has been given.

IV. CONSTRAINED INTERVAL TYPE-2 FUZZY SETS
Although we assert that the main concepts of CT2 FSs can be extended to all T2 FSs, the rest of this article will only focus on IT2 fuzzy sets and their constrained representation (CIT2). The motivations behind this decision will be discussed later in this article. Also, we assume that the UOD we are working with is a connected subset of R.
The idea behind CIT2 FSs is to generate a T2 FS starting from a T1 FS modeling the same semantic concept. This T1 FS is called T1 GS (e.g., the T1 FS in Fig. 2 is the T1 GS for our thought experiment in Section III). To obtain the CIT2 FS, we add uncertainty on the location of the T1 GS on the x-axis. We do that by using a set of offsets that intuitively represent all the possible valid locations of our T1 GS. We call this set of offsets the displacement set (DS).
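Definition 9: A DS, denoted D, is a closed set of real numbers such that [...] When the DS is a continuous interval, it can be expressed as a closed interval of real numbers.
With a DS plus a T1 GS, we can define the T1 FSs that will represent the AESs of the CIT2 FS we are modeling.
Definition 10: A collection of T1 AESs (CAES) is a set of T1 FSs obtained from the shifting of a T1 GS G. Formally, given a UOD X, a DS D, and a T1 GS G, each AES S in a CAES can be expressed as
$$S = \{(x, \mu_S(x)) \mid x \in X,\ \mu_S(x) = \mu_G(x - k)\}, \quad k \in D \tag{11}$$
Given a CAES, we can generate a CIT2 FS.
Definition 11: A CIT2 FS Ȃ is defined as follows:
$$\overset{\frown}{A} = \{((x,u),1) \mid x \in X,\ u \in [0,1],\ \exists S \in \mathrm{CAES}_{\overset{\frown}{A}} : \mu_S(x) = u\} \tag{12}$$
with CAES_Ȃ being the CAES from which we obtain Ȃ. In this case, J_x can be rewritten as follows:
$$J_x = \{(x,u) \mid u \in [0,1],\ \exists S \in \mathrm{CAES}_{\overset{\frown}{A}} : \mu_S(x) = u\} \tag{13}$$
and Ȃ can also be written as
$$\overset{\frown}{A} = \{((x,u),1) \mid x \in X,\ (x,u) \in J_x\} \tag{14}$$
It is important to note that CIT2 FSs represent a subset of IT2 FSs since they impose additional constraints on their mathematical definition, just like IT2 FSs represent a subset of the more general T2 FSs.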
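The construction of a CAES from a GS and a DS can be sketched in a few lines of Python. This is a minimal illustration assuming a finite (discrete) DS and a triangular GS; the names `triangular`, `build_caes`, and `fou_bounds` are hypothetical helpers introduced here, not functions from the article.

```python
def triangular(a, b, c):
    """Return a triangular T1 MF with feet a, c and peak b."""
    def mf(x):
        if a < x <= b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mf


def build_caes(generator, displacement_set):
    """One AES per offset k: the generator set shifted by k along the x-axis."""
    return [lambda x, k=k: generator(x - k) for k in sorted(displacement_set)]


def fou_bounds(caes, x):
    """Lower and upper membership at x, taken over all AESs."""
    degrees = [aes(x) for aes in caes]
    return min(degrees), max(degrees)


gs = triangular(2.0, 4.0, 6.0)            # illustrative generator set
caes = build_caes(gs, [-1.0, 0.0, 1.0])   # illustrative discrete DS
print(fou_bounds(caes, 4.0))              # bounds of the FOU at x = 4
```

Storing each AES as a closure over its offset k keeps the CAES compact: only the GS and the offsets need to be kept, which is the property later exploited when AESs are sampled rather than enumerated.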
In order to prove an important property, we need to build a 3-D version of the sets in our CAES. Since they are T1 FSs, building their 3-D representation is straightforward. Given a T1 set A, its 3-D representation Ã (i.e., its representation as a T2 FS) is defined as follows:
$$\tilde{A} = \{((x, \mu_A(x)), 1) \mid x \in X\} \tag{15}$$
By applying (15) to all the sets in a given CAES, we obtain a collection of IT2 AESs.
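Definition 12: A collection of acceptable IT2 ESs ($\widetilde{\mathrm{CAES}}$) of a CIT2 set Ȃ, denoted $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$, is a set of CIT2 ESs described as follows:
$$\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}} = \{\tilde{S} \mid S \in \mathrm{CAES}_{\overset{\frown}{A}}\}$$
Each of the sets S̃ can also be described as
$$\tilde{S} = \{((x, \mu_S(x)), 1) \mid x \in X\} \tag{18}$$
The sets in $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$ are actual T2 ESs of Ȃ, since they satisfy Definition 7.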
Although all the definitions up to this point could be easily extended to the general CT2 case, the conversion of T1 MFs to AESs of a general T2 FS would not be so trivial. That is because the secondary membership degree of each of the pairs (x, μ_S(x)) could not be easily determined, since it could be any value between 0 and 1. The conversion to AESs of an IT2 FS, instead, is straightforward and is shown in Definition 12. A possible solution for the general case has been proposed in [9], in which a similarity function is used on each AES S and the GS to determine μ_S̃(x, μ_S(x)), ∀x. However, the use of this and other possible approaches, together with the interpretability of 3-D ESs, will be analyzed in future work.
Definition 12 is very important since it allows us to introduce the CIT2 representation theorem.
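Theorem 1: Given a CIT2 set Ȃ and its $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$, Ȃ can be expressed as the crisp set union of all the IT2 sets S̃ in $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$.
Proof: We simply show that it is possible to write the union of all the S̃ in $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$ as (14), by rewriting S̃ as in (18)
$$\bigcup_{\tilde{S} \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}} \tilde{S} = \bigcup_{S \in \mathrm{CAES}_{\overset{\frown}{A}}} \{((x, \mu_S(x)), 1) \mid x \in X\} = \{((x,u),1) \mid x \in X,\ (x,u) \in J_x\} = \overset{\frown}{A}$$
Theorem 1 allows us to define CIT2 operations by only working with AESs. For example, the union of two sets Ȃ and B̑ is defined as follows.
Corollary 1: Given two CIT2 sets Ȃ and B̑, their union is the union of the T2 ESs S̃ in $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$ and $\widetilde{\mathrm{CAES}}_{\overset{\frown}{B}}$ (see footnote 1)
$$\overset{\frown}{A} \cup \overset{\frown}{B} = \int_{\tilde{S}_A \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}} \int_{\tilde{S}_B \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{B}}} \tilde{S}_A \cup \tilde{S}_B \tag{20}$$
Footnote 1: (20) involves integral and union signs, where the integral sign is shorthand for lots of union signs. The union sign indicates the union between members of a set, whereas the integral sign represents the union of the sets themselves.
Intuitively, we are considering all the combinations of all the AESs of the two CIT2 sets involved in the operation. The unions between the AESs of Ȃ and B̑ generate the AESs of the FS obtained from the union of Ȃ and B̑.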
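The combinatorial nature of the CIT2 union can be made concrete with a short Python sketch. It assumes finite CAESs, represents each AES as a plain function, and uses the maximum as the T1 union; all helper names and numeric values are hypothetical.

```python
from itertools import product


def triangular(a, b, c):
    """Return a triangular T1 MF with feet a, c and peak b."""
    def mf(x):
        if a < x <= b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mf


def shifted(generator, offsets):
    """CAES obtained by shifting the generator set by each offset."""
    return [lambda x, k=k: generator(x - k) for k in offsets]


def cit2_union(caes_a, caes_b):
    """CAES of the union: one AES per pair, each the T1 union (max) of the pair."""
    return [
        lambda x, sa=sa, sb=sb: max(sa(x), sb(x))
        for sa, sb in product(caes_a, caes_b)
    ]


caes_a = shifted(triangular(0.0, 2.0, 4.0), [-0.5, 0.0, 0.5])
caes_b = shifted(triangular(3.0, 5.0, 7.0), [0.0, 1.0])
union = cit2_union(caes_a, caes_b)
print(len(union))  # 3 * 2 pairs of AESs
```

The size of the resulting CAES is the product of the sizes of the operands, which is the source of the complexity growth discussed for the exhaustive inference procedure.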
Analogously, we can derive the CIT2 intersection and complement. Also, the upper and lower MFs of the FOU of a CIT2 FS can be expressed in terms of the AESs.
Definition 13: Given a CIT2 FS Ȃ, we define its upper MF and lower MF as follows:
$$\overline{\mu}_{\overset{\frown}{A}}(x) = \sup_{S \in \mathrm{CAES}_{\overset{\frown}{A}}} \mu_S(x), \qquad \underline{\mu}_{\overset{\frown}{A}}(x) = \inf_{S \in \mathrm{CAES}_{\overset{\frown}{A}}} \mu_S(x)$$
Although IT2 and CIT2 operations may seem similar, they are conceptually different. In the IT2 case, the only goal of operations such as the union and intersection is to generate the new upper- and lower-bound MFs and, therefore, the FOU. In the CIT2 case, that is not enough. In fact, the key point of CIT2 operators is the generation of a new CAES that determines which ESs are considered acceptable and, therefore, which ESs will be considered by other CIT2 fuzzy operators (such as the centroid). This property is necessary to maintain the concept of interpretability (as semantic relation) described so far in this article.
Since every CIT2 set can be expressed as the union of the AESs in its $\widetilde{\mathrm{CAES}}$, we can use this property to define the constrained centroid, denoted as C(Ȃ)
$$C(\overset{\frown}{A}) = \bigcup_{\tilde{S} \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}} \mathrm{centroid}(\tilde{S}) \tag{25}$$
That is, the constrained centroid is the union of all the centroids of the sets in $\widetilde{\mathrm{CAES}}_{\overset{\frown}{A}}$. It is analogous to the IT2 centroid, which is the union of the centroids of all the ESs [13]. The difference is that in the CIT2 case, we only take into account the AESs, which are a subset of all the ESs examined in the standard IT2 approach. Consequently, the constrained centroid is always contained in (or equal to) the standard IT2 centroid.
When a CIT2 FS is not the result of a CIT2 fuzzy operator but is generated from a T1 GS with a continuous DS, the CIT2 centroid has an interesting mathematical property. In fact, in that case, the centroid can be rewritten as the following interval:
$$C(\overset{\frown}{A}) = [\mathrm{centroid}(\tilde{A}_L),\ \mathrm{centroid}(\tilde{A}_R)] \tag{26}$$
with Ã_L and Ã_R being the leftmost and rightmost AESs of Ȃ. The proof for that equation is straightforward: since all the AESs of a CIT2 FS generated from a GS share the same shape, the AES obtained from the leftmost shift will trivially have the lowest centroid value and will, therefore, determine the left endpoint of the centroid; analogously, the right endpoint is generated by the rightmost AES. However, (26) may not hold anymore after the application of a set theory operation. Intuitively, that is because (26) can be used only when all the sets in the CAES have the same shape. An example of a case in which (26) cannot be used is given by the CIT2 FS in Fig. 12. Its AESs [e.g., Figs. 9(a) and 10(a)] are obtained as the aggregation of three triangular MFs "truncated" (i.e., inferred) at different heights. In that case, determining which "truncation values" generate the AESs with the lowest and highest centroid value is nontrivial, as will be also discussed in Section V-A.
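The endpoint property can be checked numerically with a small Python sketch. It assumes a discretized universe, a triangular GS, and a finite sample of offsets from the DS; the helper names and values are illustrative assumptions. Because every AES is the same shape shifted by k, its centroid is the GS centroid plus k, so the extreme offsets give the extreme centroids.

```python
def triangular(a, b, c):
    """Return a triangular T1 MF with feet a, c and peak b."""
    def mf(x):
        if a < x <= b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mf


def centroid(mf, xs):
    """Discrete T1 centroid of mf over the sample points xs."""
    return sum(x * mf(x) for x in xs) / sum(mf(x) for x in xs)


def constrained_centroid(generator, offsets, xs):
    """Exhaustive version: centroid of every AES, then the extremes."""
    centroids = [centroid(lambda x, k=k: generator(x - k), xs) for k in offsets]
    return min(centroids), max(centroids)


xs = [i * 0.01 for i in range(1001)]       # universe [0, 10]
gs = triangular(3.0, 5.0, 7.0)             # illustrative generator set
ds = [-1.0, -0.5, 0.0, 0.5, 1.0]           # offsets sampled from the DS

left, right = constrained_centroid(gs, ds, xs)
# Shortcut: only the leftmost and rightmost AESs are evaluated.
left_shortcut = centroid(lambda x: gs(x - min(ds)), xs)
right_shortcut = centroid(lambda x: gs(x - max(ds)), xs)
print((left, right), (left_shortcut, right_shortcut))
```

The shortcut needs two centroid evaluations regardless of the size of the DS, whereas the exhaustive version needs one per AES.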
Finally, the FOU [see (4)] of a CIT2 FSȂ can be rewritten using only the AESs.
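Definition 14: The FOU of a CIT2 FS Ȃ can be defined as
$$\mathrm{FOU}(\overset{\frown}{A}) = \bigcup_{S \in \mathrm{CAES}_{\overset{\frown}{A}}} \{(x, \mu_S(x)) \mid x \in X\}$$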

V. INFERENCING WITH CIT2 SETS
Now that we have a formal definition of CIT2 FSs and all their components, we can use them to build fuzzy rules and fuzzy systems. For CIT2 fuzzy systems to be usable, however, we need to define the procedure to carry out the Mamdani inference with singleton fuzzification.
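Consider the following CIT2 fuzzy rule, i.e., a fuzzy rule in which all the sets involved are CIT2 FSs
$$\text{IF } x_1 \text{ IS } \overset{\frown}{A}_1 \text{ AND } \cdots \text{ AND } x_n \text{ IS } \overset{\frown}{A}_n \text{ THEN } y \text{ IS } \overset{\frown}{C}$$
Using Theorem 1, we can rewrite this as
$$\text{IF } x_1 \text{ IS } \bigcup_{\tilde{S}_1 \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{A}_1}} \tilde{S}_1 \text{ AND } \cdots \text{ AND } x_n \text{ IS } \bigcup_{\tilde{S}_n \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{A}_n}} \tilde{S}_n \text{ THEN } y \text{ IS } \bigcup_{\tilde{S}_C \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{C}}} \tilde{S}_C$$
Since all the sets in the $\widetilde{\mathrm{CAES}}$ are 3-D representations of T1 sets [see (15)], we can use T1 mathematics to operate with them. After the singleton fuzzification of the input, the antecedent operation is straightforward. For example, for the fuzzified input x_1 in the rule mentioned above, we obtain
$$\{\mu_{S_1}(x'_1) \mid S_1 \in \mathrm{CAES}_{\overset{\frown}{A}_1}\}$$
where x′_1 is a specific value of x_1. The antecedent composition is, therefore, given by the following formula:
$$\{\mu_{S_1}(x'_1) \star \cdots \star \mu_{S_n}(x'_n) \mid S_i \in \mathrm{CAES}_{\overset{\frown}{A}_i},\ i = 1, \dots, n\}$$
with ⋆ being a T-norm. The antecedent composition, as described so far, returns a set of real numbers. Each of these values can then be used to apply the implication method (i.e., any T-norm) to each of the AESs C in CAES_C̑, producing the CAES_C̑* of the fuzzy CIT2 output C̑*. In the rest of this article, we assume that the minimum operator is used for the implication method and informally refer to this operation as truncation.
To defuzzify C̑*, we implemented a procedure that is based on the result shown in (25). Our CIT2 centroid is a pair (l, u), where
$$l = \inf_{\tilde{S} \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{C}^{*}}} \mathrm{centroid}(\tilde{S}), \qquad u = \sup_{\tilde{S} \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{C}^{*}}} \mathrm{centroid}(\tilde{S})$$
remembering from (25) that
$$C(\overset{\frown}{C}^{*}) = \bigcup_{\tilde{S} \in \widetilde{\mathrm{CAES}}_{\overset{\frown}{C}^{*}}} \mathrm{centroid}(\tilde{S})$$
Since each of the IT2 sets in the $\widetilde{\mathrm{CAES}}$ of C̑* is just a 3-D representation of a T1 set, we can defuzzify the equivalent T1 sets in CAES_C̑* instead, by using the standard T1 centroid defuzzification method. Therefore, the pair (l, u) provides us with a lower (l) and an upper (u) bound for the set of centroids in (25). This approach is conceptually similar to the KM [6] procedure, in the sense that both return a pair composed of the upper and the lower bounds of a set of centroids (which, in the case of the KM approach, is the set of the centroids of all the ESs of the IT2 FS).
Algorithm 1
 1: procedure [...]
 2:   for each rule R_i ∈ S do
 3:     for each permutation P of the AES of the CIT2 antecedents in R_i do
 4:       firing_strengths.add(P.evaluateAntecedents());  ▷ the FSs in P are T1 AES
 5:     end for
 6:     for each consequent C ∈ R_i do
 7:       for each AES E ∈ C do
 8:         for each c ∈ firing_strengths do
 9:           CIT2_result_R_i.add_AES(implicate(E, c));  ▷ add a new AES to the rule i output
10:         end for
11:       end for
12:     end for
13:   end for
14:   CIT2_output = {∅}  ▷ CIT2 FS representing the output of the system
15:   for each rule R_i ∈ fuzzy system S do
16:     CIT2_output = CIT2_union(CIT2_output, CIT2_result_R_i);  ▷ union of rule outputs
17:   end for
18:   left_value = inf_{E ∈ CIT2_output} (centroid(E));  ▷ lowest AES centroid value
19:   right_value = sup_{E ∈ CIT2_output} (centroid(E));  ▷ highest AES centroid value
20:   return (left_value, right_value);
21: end procedure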
The whole inference process where the CIT2 FSs involved have a finite number of AES is described in pseudo-code in Algorithm 1.
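As a rough illustration of the exhaustive procedure, the following Python sketch enumerates every combination of AESs for a small rule base. It is not the authors' implementation: the data layout (each rule as a pair of antecedent CAESs and a consequent CAES), the helper names, the discretized output universe, and the decision to skip aggregated sets with zero area are all assumptions. The minimum is used as T-norm and implication, and the maximum as aggregation.

```python
from itertools import product


def triangular(a, b, c):
    """Return a triangular T1 MF with feet a, c and peak b."""
    def mf(x):
        if a < x <= b:
            return (x - a) / (b - a)
        if b < x < c:
            return (c - x) / (c - b)
        return 0.0
    return mf


def shifted(generator, offsets):
    """Finite CAES: the generator set shifted by each offset."""
    return [lambda x, k=k: generator(x - k) for k in offsets]


def centroid(mf, ys):
    """Discrete T1 centroid; None when the set is empty on ys (assumption)."""
    den = sum(mf(y) for y in ys)
    return sum(y * mf(y) for y in ys) / den if den > 0 else None


def cit2_infer(rules, inputs, ys):
    """rules: list of (list of antecedent CAESs, consequent CAES)."""
    rule_outputs = []
    for antecedents, consequent in rules:
        # One firing strength per combination of antecedent AESs (min T-norm).
        strengths = [
            min(aes(x) for aes, x in zip(combo, inputs))
            for combo in product(*antecedents)
        ]
        # Truncate every consequent AES at every firing strength.
        rule_outputs.append([
            lambda y, e=e, c=c: min(e(y), c)
            for e in consequent for c in strengths
        ])
    # Union of rule outputs: one aggregated AES per combination across rules.
    centroids = []
    for combo in product(*rule_outputs):
        value = centroid(lambda y, combo=combo: max(s(y) for s in combo), ys)
        if value is not None:
            centroids.append(value)
    return (min(centroids), max(centroids)) if centroids else None


ys = [i * 0.1 for i in range(101)]                      # output universe [0, 10]
rule = ([shifted(triangular(0.0, 2.0, 4.0), [0.0])],    # one antecedent
        shifted(triangular(2.0, 4.0, 6.0), [-1.0, 0.0, 1.0]))
left, right = cit2_infer([rule], [2.0], ys)
print(left, right)
```

With a single rule fired at strength 1, the output AESs are simply the three shifted consequents, so the endpoints are the centroids of the leftmost and rightmost shifts. The nested `product` calls make the cost grow very quickly with the number of rules and AESs.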

A. Result of CIT2 Operators
It is interesting to see how the result of CIT2 operators on CIT2 FSs may result in an FS in which it may not be possible to identify a T1 GS G in the CAES from which we can obtain the remaining AESs by shifting G. That is because there is no guarantee that all the sets obtained as the result of the implication operator, for example, will have the same shape.
However, the shape of the T1 GS is not totally lost after the application of CIT2 fuzzy operators. Fig. 12 shows some of the AESs of the inference output of a CIT2 fuzzy rule of the form IF x_1 IS Ȃ THEN y IS C̑, where all the CIT2 FSs involved have a discrete DS (i.e., a finite number of AESs). It is possible to see that although the sets forming the CAES of the output do not share exactly the same shape, they all come from the same GS (i.e., a triangular T1 FS) truncated at different heights during the inference process (the consequent CIT2 FS C̑ before the inference can be found in Fig. 11).
Intuitively, these AESs are meaningful even if they have different shapes because they represent actual T1 inference results that are obtainable from T1 inference by picking one of the AESs from each of the antecedent and consequent CIT2 FSs in our fuzzy rule. The fact that each of the AESs is obtained from a shifted GS truncated at a given height is extremely important to build interpretable and explainable CIT2 systems. In fact, when one of those AESs is selected, its interpretability is guaranteed by the semantic connection with the concept it is modeling, since it has the same shape as the GS, while its truncation height is directly related to the firing strength of the rule(s) that generated it. Therefore, it is possible to give an explanation for how this AES has been generated by showing the rules (and, therefore, the inputs) that contributed to its creation.
Fig. 11. Consequent CIT2 in the rule generating the output set shown in Fig. 12.
The theoretical issue of having AES with different shapes was already pointed out in [10] and has now been addressed in [30], where the CIT2 fuzzy output has been formally defined as a CIT2 FS thanks to a different definition of the CAES (see Definition 10) based on the concept of mathematical constraint satisfaction.
The analysis of this new definition, however, goes beyond the scope of this article since, as already stated in [30], it does not affect any of the CIT2 operations but, in this context, just fills a theory gap.

B. On the Interpretability and Explainability of CIT2 Sets and Systems
As shown in Section IV, the CIT2 FOU is a set of points, exactly like the FOU of a standard IT2 FS. If one considers the shape of a CIT2 FS alone, it is clear that its interpretability depends only on the shape of its FOU (and its boundaries) and not on the specific set of ES that are embedded into it. However, some T2 uncertainty measures do make use of these ESs and it is in these cases that CIT2 are able to provide a clear advantage over IT2 FS, allowing for the creation of explainable CIT2 FS and systems. Specifically, each of the AES that can be selected by the above-mentioned fuzzy operators in the CIT2 case has been created so that it is able to carry meaningful information. This is done both by keeping a semantic relation with the concept it is modeling (i.e., by keeping the same shape as the GS) and by conveying, in the case of rule-base systems, information on the rule that generated it and its firing strength. In other words, it is possible to build CIT2 fuzzy systems that not only are able to solve, for example, classification problems, but that are also able to explain, in terms of the input space, how each endpoint of the interval centroid has been obtained. With a standard IT2 system, this property is lost simply because in the defuzzification process, the ES that produce the endpoints do not carry any meaningful information on which rules played a role in their generation and why. Therefore, in IT2 systems, an explanation in terms of the input space cannot be provided for the centroid but only for the boundaries of the FOU of the fuzzy output of the system. The ability of CIT2 fuzzy systems to explain also the endpoints of the centroid, on the other hand, clearly represents a novelty and a progress for T2 FSs in the increasingly popular explainable artificial intelligence (XAI, [26]) field.

C. Efficiency
The main goal of Algorithm 1 is to provide a procedure to compute the inferencing and defuzzification processes described in this section. For now, the optimization of computational complexity has not been our focus. It is clear that the proposed algorithm is slower than the current IT2 inferencing and defuzzification methods. That is because after the evaluation of the whole rule base, the output is a set of AESs (line 16 in Algorithm 1) that can be quite large: each rule can produce (line 9) a number of implication sets that, in the worst case, is equal to the number of permutations of the AESs of the antecedents, multiplied by the cardinality of the DS of the consequent. Additionally, at line 16, we generate the unions of all the possible permutations of the AESs of the CIT2 sets resulting from the single rules. This union generates a number of AESs that grows exponentially in both the number of rules and the number of antecedents per rule, being $O((k^{n+1})^m)$, where m is the number of rules, n the number of antecedents per rule, and k the number of AESs of each of the CIT2 sets involved.
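To give a feeling for this growth, the worst-case count can be worked out directly: each rule yields k^n firing strengths times k consequent AESs, i.e., k^(n+1) truncated sets, and the union across m rules takes one set from each rule. The function name and the example values below are illustrative assumptions.

```python
def output_aes_count(k, n, m):
    """Worst-case number of output AESs for m rules, n antecedents, k AESs per set."""
    per_rule = (k ** n) * k   # firing strengths times consequent AESs
    return per_rule ** m      # one truncated set chosen from every rule


# Even a tiny system grows quickly (illustrative values).
print(output_aes_count(k=5, n=2, m=3))    # 125 ** 3
print(output_aes_count(k=10, n=2, m=5))   # 1000 ** 5
```

With only 10 AESs per set, 2 antecedents, and 5 rules, the exhaustive procedure would already have to defuzzify 10^15 aggregated sets.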
Since this approach enumerates all the AESs to find the final defuzzified output, it is analogous to the exhaustive defuzzification method rather than to the KM one. In fact, the strength of the KM procedure is that it quickly identifies the ESs to be used for the left and right centroid values. On the contrary, in Algorithm 1, the AESs that give the left and right centroid values are found using a brute-force approach, first building all the AESs of the total rule-base evaluation (line 16) and, then, finding among them the two that will give us the left and right centroid values (lines 18 and 19). For use in real-world problems, this approach is impractical because of its prohibitive computational complexity. For this reason, the alternative, much faster and more practical defuzzification Algorithm 2 is proposed in Section VII. This algorithm is then used within the genetic framework described in Section VIII, in which it is applied to two well-known real-world datasets and compared to the KM procedure.

VI. COMPARISON WITH A DIFFERENT CONSTRAINED APPROACH
In this section, the constrained representation presented in this article will be compared to a different approach (that here will be called W-CIT2) proposed by Wu et al. [11], [12]. They start from the observation that ESs have been used to obtain theoretical results such as the definition of uncertainty measures and are processed regardless of their shape.
However, the authors point out that in many fuzzy logic applications, the MFs that are used are convex and normal. Consequently, they propose a constrained representation theorem that allows the definition of the FOU of well-shaped (see [12] for details) IT2 FSs by using only convex and normal ESs. They claim that this definition is more general than the one that only considers ES with the same shape and does not require any expert knowledge or data analysis to determine which shapes are meaningful in a given context. Using this new theorem, many constrained uncertainty measures (such as centroid, entropy, and cardinality) are defined mathematically. In addition to that, the authors show how the convexity and normality constraints can be simply added to the KM algorithm to find the constrained centroid value of a well-shaped IT2 FS. Finally, the authors also state that this approach cannot be used in Mamdani systems since their outputs can be nonwell shaped.
The main difference between the representation theorem proposed in this article and the W-CIT2 one is in the definition of the ESs that are considered acceptable. Although it is true that the W-CIT2 theorem allows the presence of multiple shapes among the ESs, normality and convexity may be neither sufficient nor necessary to obtain shapes that are meaningful, as also discussed in [30]. Those two properties alone still do not guarantee that there will be a meaningful connection between an ES and the concept it models. To support this claim, a comparison is provided between the ESs that determine the endpoints of the W-CIT2, CIT2, and IT2 centroid with the KM procedure (see Fig. 13). The set to defuzzify has been obtained starting from a triangular T1 MF as a GS, using the approach described in this article to build the FOU around it. The comparison shows how the ESs used by the KM approach [see Fig. 13(a)] are both non-normal and nonconvex. In addition, they could hardly represent any word or label. As a result, the meaningfulness and interpretability of the centroid value returned as an output decreases. On the other hand, the KM algorithm can be applied to any IT2 FS, regardless of the approach used to obtain its FOU. The ESs used by the W-CIT2 approach, instead, are both normal and convex. However, also in this case, the relation between the original T1 triangular shape (i.e., the one that has been used as a GS) and the ESs is lost. Again, these sets would hardly model the same concept (e.g., medium height) from which we obtained the GS. The ESs used by the CIT2 approach, instead, keep the same level of interpretability as the GS, as they share its shape. The only difference between them is their location on the x-axis. From this experiment, we can conclude that normality and convexity alone may not be sufficient to guarantee the meaningfulness of an FS.
In addition to that, the fact that W-CIT2 FSs are not usable in Mamdani systems represents a significant limitation that can be overcome by the CIT2 definition provided in this article, as shown in Section V.
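To make the two W-CIT2 constraints concrete, the following minimal Python sketch checks normality and convexity on a discretized ES. It is an illustration only: the function names, the tolerance, and the sample membership grades are assumptions introduced here and are not taken from [12] or from this article. It shows that a set can satisfy both constraints while no longer sharing the triangular shape of the GS.

```python
def is_normal(mu, tol=1e-9):
    """A sampled set is normal if its largest membership grade is 1."""
    return abs(max(mu) - 1.0) <= tol


def is_convex(mu, tol=1e-9):
    """A sampled set is convex if its grades never rise again after falling."""
    peak = mu.index(max(mu))
    rising = all(mu[i] <= mu[i + 1] + tol for i in range(peak))
    falling = all(mu[i] + tol >= mu[i + 1] for i in range(peak, len(mu) - 1))
    return rising and falling


# Hypothetical membership grades sampled on the same five domain points.
triangular_gs = [0.0, 0.5, 1.0, 0.5, 0.0]  # the generator shape
plateau_es = [0.0, 1.0, 1.0, 1.0, 0.0]     # normal and convex, but not triangular
wavy_es = [0.2, 0.8, 0.3, 0.8, 0.2]        # neither normal nor convex

for name, mu in [("triangular_gs", triangular_gs),
                 ("plateau_es", plateau_es),
                 ("wavy_es", wavy_es)]:
    print(name, is_normal(mu), is_convex(mu))
```

Both the triangular and the plateau-shaped sets pass the two checks, although only the first preserves the shape of the GS.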

VII. SAMPLING APPROACH FOR THE CIT2 CENTROID
As already discussed in Section V, the evaluation of the CIT2 centroid as described in Algorithm 1 is prohibitive due to the astronomical number of AESs that are examined to determine the defuzzified value. Therefore, although the algorithm proposed earlier in this article is theoretically correct for the computation of the CIT2 centroid, it is not usable in practice for real-world problems. Conceptually, the problem is very similar to the one that is faced when exhaustive defuzzification is applied to T2 FSs. In that context, many approximation algorithms have been proposed to overcome the computational complexity of the exhaustive defuzzification. One of them is the sampling method [31]. The intuitive idea is that each of the ESs in a T2 FS makes only a minimal contribution to the final result; therefore, generating a random sample of the ESs is a good and efficient way to obtain an approximation of the actual centroid value, as shown in [32]. In this case, we apply the same concept to sample a fixed number of AESs to determine the constrained centroid. A sample is obtained by replacing each CIT2 FS in the rule base with one of its AESs chosen randomly (rather than replacing each set with all its AESs, as in Algorithm 1).
Conceptually, the following steps are used to produce a single sampled AES of the CIT2 fuzzy output.
1) For each set Ã involved in the FLS:
a) generate a random number k within its DS;
b) use k to shift the GS of Ã along the x-axis, obtaining E, an AES of Ã; recalling that, given a function f(x), its translation by k along the x-axis can be written as f′(x) = f(x − k), this step can be done in constant time without the need to store all the AESs to choose one randomly;
c) loop through all the rules and replace Ã with E.
2) Once all the CIT2 FSs have been replaced with a random AES, a T1 rule base is generated.
3) The fuzzy inferenced result of the rule base represents a sampled AES.
The output interpretability offered by CIT2 FLSs is given by the process used to produce the AES. In fact, each of them represents a T1 fuzzy output and as such keeps all the interpretability properties that belong to the outputs of T1 FLSs: the shapes of the consequent sets involved in the rules are clearly identifiable together with the firing strengths used for the inference operator [e.g., see Fig. 9(a), 10(a), and 13(a)]. These properties also make possible a direct connection between the endpoints of the interval centroid and the rules that were used in its generation.
The pseudo-code (mainly written following OOP conventions) of the sampling method is described in Algorithm 2.
Other than the reduction in the computational cost, the other main advantage of this approach is its applicability to systems in which the CIT2 FSs involved have a continuous DS, i.e., the number of AESs per CIT2 FS is infinite. In fact, Algorithm 1 only works with a discrete number of AESs and may, therefore, require an additional discretization step. With the sampling approach, each CIT2 FS involved in the rule can be easily substituted with one of its AES by shifting its GS by a random value in the DS during the conversion step (mentioned above) of the CIT2 rule into a T1 one.
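The sampling steps described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions and not a reproduction of Algorithm 2: the function names, the dictionary-based rule base, the Mamdani operators (min implication, max aggregation), the discretized output grid, and the example sets are all hypothetical. The sketch also assumes that the end-points of the constrained centroid are approximated by the smallest and largest centroid among the sampled AESs.

```python
import random


def triangle(a, b, c):
    """Triangular T1 membership function with support (a, c) and peak at b."""
    def mf(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mf


def shifted(mf, k):
    """Translate mf by k along the x-axis, i.e. x -> mf(x - k)."""
    return lambda x: mf(x - k)


def sampled_centroid(sets, rules, inputs, y_grid, n_samples=50, seed=0):
    """Approximate the constrained centroid interval from n_samples sampled AESs.

    sets:  name -> (generator MF, (ds_low, ds_high))
    rules: list of (antecedent set names, one per input; consequent set name)
    """
    rng = random.Random(seed)
    centroids = []
    for _ in range(n_samples):
        # Step 1: one random AES per CIT2 set, reused in every rule that uses it.
        aes = {name: shifted(gs, rng.uniform(lo, hi))
               for name, (gs, (lo, hi)) in sets.items()}
        # Step 2: the rule base is now T1; compute each rule's firing strength.
        fired = [(min(aes[a](x) for a, x in zip(ants, inputs)), aes[cons])
                 for ants, cons in rules]
        # Step 3: T1 Mamdani output on the grid and its centroid.
        mu = [max(min(f, cons(y)) for f, cons in fired) for y in y_grid]
        area = sum(mu)
        if area > 0.0:
            centroids.append(sum(y * m for y, m in zip(y_grid, mu)) / area)
    if not centroids:
        raise ValueError("no rule fired for any sampled AES")
    return min(centroids), max(centroids)


# Hypothetical single-input system with two rules and two output classes.
sets = {
    "low": (triangle(0.0, 2.0, 4.0), (-0.2, 0.2)),
    "high": (triangle(3.0, 5.0, 7.0), (-0.2, 0.2)),
    "class0": (triangle(-1.0, 0.0, 1.0), (-0.1, 0.1)),
    "class1": (triangle(0.0, 1.0, 2.0), (-0.1, 0.1)),
}
rules = [(["low"], "class0"), (["high"], "class1")]
y_grid = [i / 100.0 for i in range(-150, 251)]
left, right = sampled_centroid(sets, rules, [3.5], y_grid)
print(left, right)
```

Because each displacement is drawn directly from the continuous DS, no discretization of the AESs is needed, and the AES that produced each end-point could be kept alongside its centroid to trace the result back to the rules.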

VIII. PRACTICAL APPLICATION
In this section, a framework for the automatic learning of CIT2 fuzzy systems will be described and applied to two real-world classification problems. The aim is not to compare this learning method to other approaches proposed in the literature in terms of performance, but rather to present a possible way of generating CIT2 fuzzy systems and show a practical application of these new fuzzy sets and of the inference framework described so far.
Classification problems have been chosen because they represent one of the contexts in which interpretability and especially explainability play a crucial role. In many applications, in fact, knowing both the output (the interval centroid) and how it has been obtained (i.e., which rules and which inputs determined the ES that produced the endpoints) is of great value and it is the main reason for the emergence of the new XAI field.

A. Genetic CIT2 Fuzzy Systems
Genetic algorithms have been widely used for the automatic generation and optimization of fuzzy systems [33] since they allow for the creation of both the rule base and the MFs without the need for any expert knowledge. Although these systems are obtained through machine learning techniques, they can maintain the typical interpretability of fuzzy logic systems as long as they contain a reasonably small number of rules and it is possible to give a linguistic label to the MFs involved [34]. The genetic approach proposed for the generation of CIT2 fuzzy systems is based on the architecture described in [35]. Each of the input variables of the system is partitioned into three triangular MFs. The center of each triangular GS for the antecedent CIT2 FSs is determined using the well-known fuzzy C-Means (FCM) clustering algorithm [36] on each input variable. The end-points of the triangles are the centers of the previous and next clusters, if they exist, or the closest end-point of the UOD increased by 10% of the UOD size, so that every point in the UOD belongs to at least one of the MFs with a membership value greater than 0. The continuous DS is an interval [−c, c], c > 0, with 2c = 5% of the distance between the start and end-points of each triangular GS. The output variable is partitioned with a number of CIT2 FSs equal to the number of classes in the problem. Each of them is given an integer index from 0 to the number of classes involved. The index represents the peak of their triangular GS, while the start and end-points of the triangles are obtained, respectively, by subtracting 1 from and adding 1 to their peak points. The DS for all the CIT2 MFs partitioning the output is an interval [−c, c], c > 0, with 2c = 10% of the UOD. Once the MFs are determined, there is a first evolutionary stage to generate the rule base of the system. During this process, the MFs are not changed.
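The construction of the antecedent partition described above can be sketched as follows. This is a minimal Python illustration under assumptions: the function name and the dictionary layout are hypothetical, the cluster centers are given directly rather than computed with FCM, and "increased by 10% of the UOD size" is read as extending the outermost triangles beyond the UOD by that amount on either side.

```python
def build_partition(centers, uod_min, uod_max, margin=0.10, ds_ratio=0.05):
    """Build triangular generator sets and displacement sets from cluster centers.

    Each entry holds the GS as (start, peak, end) and the DS as (-c, c),
    where 2c = ds_ratio * (end - start).
    """
    centers = sorted(centers)
    pad = margin * (uod_max - uod_min)  # extension beyond the UOD end-points
    partition = []
    for i, peak in enumerate(centers):
        # Neighbouring cluster centers act as end-points where they exist.
        start = centers[i - 1] if i > 0 else uod_min - pad
        end = centers[i + 1] if i < len(centers) - 1 else uod_max + pad
        c = ds_ratio * (end - start) / 2.0
        partition.append({"gs": (start, peak, end), "ds": (-c, c)})
    return partition


# Hypothetical cluster centers for one input variable with UOD [0, 8].
for mf in build_partition([1.0, 4.0, 6.0], uod_min=0.0, uod_max=8.0):
    print(mf)
```

With these illustrative values, the middle set has GS (1.0, 4.0, 6.0) and a DS of total width 5% of 5.0, that is [−0.125, 0.125].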
The number of rules is fixed (as shown in [35], redundant rules can be eliminated with an additional stage) and each chromosome codes an entire rule base. With n input variables, each rule is coded with a set of n + 1 integers. Each gene p_i represents the index of the MF to use for the ith antecedent or for the consequent, if i = n + 1. A value of −1 for p_i, i ≤ n, indicates that the ith input must not be included in the rule to which p_i belongs. A sequence of encoded rules represents a rule base. At the end of the first stage, the fittest chromosome is returned. The rule base encoded by this chromosome is passed to the second stage of the learning process, with the goal of optimizing the MFs involved in the system. Each triangular CIT2 MF is encoded with four real numbers: three modeling the GS (starting point, center, and ending point of the triangle) and one representing the size of the DS as a percentage of the UOD. Thanks to the way CIT2 MFs are built starting from a T1 GS, the encoding of CIT2 MFs only requires one additional parameter with respect to their T1 counterpart. That is because the upper and lower MFs bounding the FOU of the set are determined from the T1 generator shape and the DS. Standard IT2 representations, instead, may require up to twice the number of parameters of their T1 counterpart to fully represent the FOU and its bounds. The optimized rule base obtained at the end of the second stage is then returned as the final output of the learning process. The architecture is summarized in Fig. 14. For more information on the tuning and learning process, please refer to [35].
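The two encodings described above can be illustrated with a short Python sketch. It is a hedged example rather than the implementation used in [35]: the function names are hypothetical, inputs and MFs are indexed from 0 (whereas the text indexes genes from 1), and the DS gene is assumed to be a fraction of the UOD size (e.g., 0.10 for 10%).

```python
def decode_rule_base(chromosome, n_inputs):
    """Split a flat list of integers into rules of n_inputs + 1 genes each.

    Returns a list of (antecedents, consequent) pairs, where antecedents maps
    an input index to an MF index and inputs coded as -1 are left out.
    """
    step = n_inputs + 1
    if len(chromosome) % step != 0:
        raise ValueError("chromosome length must be a multiple of n_inputs + 1")
    rules = []
    for i in range(0, len(chromosome), step):
        genes = chromosome[i:i + step]
        antecedents = {j: g for j, g in enumerate(genes[:-1]) if g != -1}
        rules.append((antecedents, genes[-1]))
    return rules


def decode_mf(genes, uod_size):
    """Turn (start, center, end, ds_fraction) into a GS and a DS interval."""
    start, center, end, ds_fraction = genes
    c = ds_fraction * uod_size / 2.0  # DS = [-c, c] with 2c = ds_fraction * UOD
    return (start, center, end), (-c, c)


# Hypothetical chromosome coding two rules for a system with four inputs.
rule_base = decode_rule_base([0, -1, 2, 1, 0, -1, -1, 1, 1, 2], n_inputs=4)
print(rule_base)
print(decode_mf((1.0, 4.0, 6.0, 0.10), uod_size=8.0))
```

In this example, the first rule ignores the second input and the second rule uses only the last two inputs, which shows how the −1 value shortens a rule without changing the chromosome length.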

B. Application on Real Datasets
The genetic architecture described above has been tested on two real-world classification problems using two well-known datasets: iris [37] and new-thyroid [38]. The tenfold cross-validation method has been used to evaluate the CIT2 fuzzy systems; both datasets, including the train and test partitions of each cross-validation iteration, are publicly available on the KEEL website [39]. In both stages, a single-point crossover has been used and the fitness function has been defined as the accuracy value of the rule base encoded in the chromosome. A more detailed list of the parameters used in the optimization can be found in Table I. The optimization has been carried out twice, once using the CIT2 sampling method with 50 samples to defuzzify the output and once using the KM iterative procedure implemented in the Java library Juzzy [40]. The architecture has been implemented in Java using multithreaded computation on an i7-7600U CPU. The average results of both approaches and their running times for the ten runs are reported in Table II. It can be seen that the execution time of the CIT2 systems, featuring Algorithm 2, is higher than that of the IT2 systems. However, these execution times represent approximately 10^7 individual defuzzification operations throughout the optimization process, i.e., each individual CIT2 defuzzification using Algorithm 2 takes around 1.5 ms using multithreading to generate the samples. Although not as efficient as current IT2 defuzzification algorithms, this is clearly usable in real-world applications, particularly decision problems.
As can be seen, the two approaches give similar results and perform well on both of the datasets analyzed. Therefore, to determine if and under what conditions one of the two defuzzification methods gives superior results, more experiments are required, with a larger number of datasets and a statistical evaluation of their performance. To demonstrate the superior interpretability and explainability of the CIT2 approach, in Fig. 15, the ESs used to determine the right end-point of the constrained 1) and "standard IT2" 2) centroid generated by the KM procedure are shown. Those ESs have been obtained as the result of the defuzzification of the output of a CIT2 FLS generated through the learning framework described in this section. As discussed in Section III, the AES selected by the CIT2 approach provides a clearer understanding of the final system output, giving an intuitive idea of how the centroid value is obtained since, just like any T1 fuzzy output, it is still possible to identify the shapes of the consequent MFs and see the respective firing levels. Additionally, the firing strengths can be traced back to the rules and the inputs that generated the endpoints. The ability to produce explanations for each of the system outputs, together with the interpretable rule-based structure (characteristic of any FLS), makes CIT2 FLSs a valid alternative to IT2 for the development of FLSs in the XAI field.
Currently, running time seems to be the main drawback of this approach. In fact, in both tests, the IT2 approach with the KM procedure has proven to be roughly 3.5-4 times faster than the CIT2 one. In future work, we plan to develop new and faster defuzzification methods to address this issue.

IX. CONCLUSION
In this article, we have fully formalized CIT2 FSs, showing how they can be obtained starting from a T1 FS with uncertainty on its exact location on the x-axis. The main idea behind CIT2 FSs is to produce a representation that considers only the ESs that have a meaningful shape for a given concept; these ESs, called AESs, can then be used to define the FOU of a CIT2 FS and the CIT2 fuzzy operators. The use of AESs rather than their unconstrained version guarantees that CIT2 operators will only process ESs with a meaningful shape, increasing the interpretability of their output (as discussed in Sections III and VIII.B).
Formal definitions of CIT2 FSs and AESs have been provided, together with the formulation of a new constrained representation theorem (see Theorem 1). This allowed us to define all the main CIT2 operators, including the centroid defuzzification, by working only with "meaningful" ESs. Finally, a full inference framework has been presented for a CIT2 fuzzy system together with a defuzzification procedure. As a test case, a genetic architecture for the generation of CIT2 fuzzy systems has been described and applied to two real-world datasets. The preliminary results presented here show that the performance of the CIT2 approach is comparable to that of the IT2 one, with the CIT2 system outputs presenting a higher level of interpretability. On the other hand, CIT2 systems have been shown to be slower, taking approximately four times as long as their IT2 counterparts to complete the learning process.

X. FUTURE WORK
Now that formal definitions have been given for CIT2 inferencing and defuzzification procedures, a considerable amount of research work needs to be done to fully understand their potential for practical applications. We speculate that the properties of CIT2 FSs, such as the fact that all the AESs share the same shape, can be exploited to obtain a faster inference and defuzzification process. Understanding how the number of AESs used in Algorithms 1 and 2 affects the centroid value and its convergence is also of significant importance and will be examined in future work. Additionally, it would be useful to have a more thorough comparison between IT2 and CIT2, using more real-world cases from different scenarios. Understanding in which cases and why the results of IT2 and CIT2 systems differ would help designers choose between the two approaches in their context. This is one of our main priorities for the near future. We also expect that the differences in the interpretability of the ESs of the CIT2 and IT2 representations require a more formal comparison, as this is a subject that is present in many research works but has never been deeply studied and formalized. While this article is only focused on CIT2 rule-based systems, it is necessary to evaluate the meaningfulness and interpretability of these new sets more thoroughly, in different applications. For example, we believe that CIT2 FSs would be very suitable for the fuzzy linguistic and the computing with words approaches [41]. However, to be applied in those contexts, similarity measures for CIT2 FSs need to be defined. Finally, the CIT2 definitions can be extended to include general T2 FSs, in order to have general constrained T2 FSs, overcoming the problems discussed in Section IV.