Uncertainty Measurement for Fuzzy Set-Valued Data

Uncertainty measurement (UM) can offer a new perspective for data analysis. A fuzzy set-valued information system (FSVIS) is an information system (IS) whose information values are fuzzy sets. This article investigates UM for fuzzy set-valued data based on Chebyshev distance. First, the distance between information values is defined in a given subsystem. After that, the tolerance relation induced by this subsystem is obtained by means of this distance. Moreover, the information structure of this subsystem is proposed. Next, the uncertainty of a FSVIS is measured. Finally, to show the feasibility of the proposed measures, effectiveness analysis is carried out from a statistical view. The obtained outcomes may be helpful for comprehending the nature of uncertainty in a FSVIS.


I. INTRODUCTION
A. RESEARCH BACKGROUND AND RELATED WORKS
Uncertainty can be found everywhere. Both human intelligence and artificial intelligence depend on the handling of uncertainty. It can be said that intelligence is mainly reflected in the capability of solving uncertain problems. Furthermore, uncertainty is at the core of research on intelligent problems. Consequently, uncertainty measurement (UM) is a significant issue in many fields of research.
Rough set theory, brought forward by Pawlak [23], [25], [26], is a significant approach for managing imprecision, vagueness, and especially uncertainty. This theory is developed around the concept of an information system (IS). Because it processes uncertain data well, it has become a powerful tool for managing the uncertainty of an IS. Recently, this theory has attracted the attention of many researchers, and most of its applications are connected with ISs, such as sequential covering [5], feature selection [8], [15], rule extraction [16] and so on.
Fuzzy set theory, put forward by Zadeh [34], can describe fuzziness in precise mathematical language. Fuzzy set theory and rough set theory are two methods for studying incomplete knowledge and uncertain problems in ISs. Each has its own advantages and characteristics, and the two can be combined to study some specific problems. Fuzzy set theory has been successfully applied in various fields. Its application in the field of target recognition is mainly reflected in the design of classifiers, which can help a machine make accurate judgments in a fuzzy environment and adapt well to the special case of target deformation. Moreover, fuzzy set theory improves the flexibility and practicability of processing fuzzy or uncertain information.
The associate editor coordinating the review of this manuscript and approving it for publication was Gang Li.
UM is an important foundation for describing the classification ability of a system and improving classification accuracy in rough set theory. To evaluate the uncertainty of a system, Shannon [27] introduced the concept of entropy. Entropy and its variants can be extended to ISs and rough sets, and many scholars have made excellent contributions. For example, from the standpoint of granulation, Yao [32] brought forward a granularity measure; Wierman [28] raised measures of uncertainty and granularity in rough set theory; Bianucci and Cattaneo [1] and Bianucci et al. [2] brought up entropy and co-entropy approaches for UMs of coverings; Liang and Shi [19] and Liang et al. [20] studied information entropy, rough entropy and information granulation in complete and incomplete ISs; Xu et al. [31] presented rough entropy of rough sets in ordered information systems; Düntsch and Gediga [11] applied Shannon's entropy to the measurement of decision rules in rough set theory; Dai and Tian [12] researched UM for a set-valued IS; Zhang et al. [35] explored uncertainty measures in a fully fuzzy IS; Beaubouef et al. [4] came up with a method for measuring the uncertainty of rough sets; Li et al. [17], [21], [22] studied information structures and UM in covering and fuzzy relation ISs.
VOLUME 8, 2020

B. MOTIVATION AND INSPIRATION
Uncertainty is everywhere, and ISs are no exception. A fuzzy set-valued information system (FSVIS) is an IS with fuzzy set-valued data, that is, an IS whose information values are fuzzy sets. A FSVIS also has uncertainty. The study of UM for an IS is becoming increasingly significant. However, the problem of UM for a FSVIS has not been studied, so this article focuses on it. Up to now, information granulation and information entropy have been the two main methods to measure the uncertainty of an IS. The uncertainty measures are therefore studied from these two perspectives. The general work is displayed in FIGURE 1.

C. COMPARISON AND DISCUSSION
Below, we compare this paper with related work so that its innovation can be seen more clearly.
1) Xie et al. [30] investigated information structures and UM in an incomplete probabilistic set-valued information system (IPSVIS). First, according to Bhattacharyya distance, they proposed the distance between objects in every subsystem. Then, they obtained the tolerance relation on an object set by using this distance. Next, they introduced information structures in an IPSVIS. Finally, as an application of information structures, they measured the uncertainty of an IPSVIS. To evaluate the performance of the proposed measures, they gave effectiveness analysis from the angle of statistics.
2) Dai et al. [13] considered UM for an incomplete interval-valued information system (IIVIS) based on α-weak similarity. First, they defined the maximum and minimum similarity degrees in an IIVIS, and the α-weak similarity relation. Second, they introduced accuracy, roughness and approximation accuracy to evaluate the uncertainty of an IIVIS. Furthermore, they showed the effectiveness of the constructed measures by experimental analysis.
3) Chen et al. [9] investigated UM for neighborhood rough sets. First, they introduced the neighborhood granule. Then, they measured neighborhood granules by introducing neighborhood accuracy, information quantity, neighborhood entropy and information granularity in a neighborhood information system (NIS). Finally, they showed that the proposed measures are better than neighborhood accuracy by theoretical analysis and experimental results.
4) This paper investigates UM for a FSVIS. Due to the particularity of a FSVIS, rough set theory and fuzzy set theory need to be combined to handle it. The main work of this paper is as follows. First, we give the similarity degree between information values in a given subsystem. Then, we introduce the tolerance relation induced by this subsystem by using the similarity degree. Next, we propose the information structure of this subsystem. Additionally, we define θ-information granulation, θ-information amount, θ-rough entropy and θ-information entropy to measure the uncertainty of a FSVIS.

D. STRUCTURE AND ORGANIZATION
The remaining sections of this article are organized as follows. Section 2 reviews the essential notions of binary relations, fuzzy sets and FSVISs. Section 3 brings in distances between information values in a given subsystem of a FSVIS. Section 4 gives the tolerance relation induced by this subsystem. Section 5 proposes information structures in a FSVIS. Section 6 defines some useful tools for measuring the uncertainty of a FSVIS. Section 7 carries out effectiveness analysis from a statistical view. Section 8 discusses and concludes this article.
In addition, R is said to be a tolerance relation on U if it is reflexive and symmetric.
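The two conditions above can be checked mechanically. The following is a minimal sketch, assuming a binary relation on U is stored as a set of ordered pairs; the function name and the toy data are hypothetical and only illustrate the definition.

```python
def is_tolerance_relation(universe, relation):
    """Return True if `relation` (a set of ordered pairs) is reflexive and symmetric on `universe`."""
    reflexive = all((u, u) in relation for u in universe)
    symmetric = all((v, u) in relation for (u, v) in relation)
    return reflexive and symmetric


# Hypothetical toy universe and relation (not taken from the paper).
U = ["u1", "u2", "u3"]
R = {("u1", "u1"), ("u2", "u2"), ("u3", "u3"), ("u1", "u2"), ("u2", "u1")}

print(is_tolerance_relation(U, R))                   # True
print(is_tolerance_relation(U, R | {("u2", "u3")}))  # False, because (u3, u2) is missing
```

Note that transitivity is not required, so a tolerance relation need not be an equivalence relation.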

B. FUZZY SETS
Fuzzy sets are an extension of ordinary sets. Let X be a finite set. A fuzzy subset P in X is defined as a function assigning to each element x of X a value P(x) ∈ [0, 1] and P(x) is said to be the membership degree of x to the fuzzy set P.
Throughout this paper, I denotes [0, 1] and I^X denotes the family of all fuzzy sets in X.

C. FSVISS
In this part, the concept of a FSVIS is introduced.
Definition 1 [24]: Suppose that U is a finite set of objects and A is a finite set of attributes. Then the ordered pair (U, A) is referred to as an information system (IS), if every attribute a ∈ A determines a function a : U → V_a, where V_a = {a(u) : u ∈ U}.
Let (U, A) be an IS. Given B ⊆ A. Then an equivalence relation on U can be defined as
ind(B) = {(u, v) ∈ U × U : a(u) = a(v) for all a ∈ B}.
Definition 2: Let (U, A) be an IS. Then (U, A) is said to be a set-valued information system (SVIS), if for any u ∈ U and a ∈ A, a(u) is an ordinary set.
Let (U, A) be a SVIS. Given θ ∈ [0, 1] and B ⊆ A. Then a tolerance relation on U can be defined as
Information values of a SVIS are ordinary sets. We can also consider an IS in which information values are fuzzy sets. So we propose the definition of a FSVIS below.

Definition 3: Let (U, A) be an IS. Given that X is a finite set. Then the pair (U, A) is known as a FSVIS, if for any u ∈ U and a ∈ A, a(u) ∈ I^X.
In order to better express the meaning of the definition, a FSVIS (U, A) is denoted by (U, X, A).
If B ⊆ A, then (U, X, B) is referred to as the subsystem of (U, X, A).
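As an illustration of Definition 3, a FSVIS can be stored as a table whose entries are fuzzy sets in X. The sketch below is one possible representation, not the authors' implementation: the nested-dictionary layout, the function names and all membership degrees are assumptions made for this example.

```python
# Hypothetical toy FSVIS (U, X, A); all names and membership degrees are invented.
X = ("x1", "x2", "x3")
fsvis = {
    "u1": {"a1": {"x1": 0.2, "x2": 0.7, "x3": 1.0},
           "a2": {"x1": 0.5, "x2": 0.5, "x3": 0.0}},
    "u2": {"a1": {"x1": 0.3, "x2": 0.4, "x3": 1.0},
           "a2": {"x1": 0.9, "x2": 0.5, "x3": 0.1}},
}


def is_fuzzy_set(value, domain):
    """A fuzzy set in `domain` assigns every element a membership degree in [0, 1]."""
    return set(value) == set(domain) and all(0.0 <= m <= 1.0 for m in value.values())


def is_fsvis(table, domain):
    """Check that every information value a(u) is a fuzzy set in X."""
    return all(is_fuzzy_set(value, domain)
               for row in table.values() for value in row.values())


def subsystem(table, attributes):
    """Restrict the table to an attribute subset B, giving the subsystem (U, X, B)."""
    return {u: {a: row[a] for a in attributes} for u, row in table.items()}


print(is_fsvis(fsvis, X))               # True
print(subsystem(fsvis, ["a1"])["u1"])   # {'a1': {'x1': 0.2, 'x2': 0.7, 'x3': 1.0}}
```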

III. DISTANCES BETWEEN INFORMATION VALUES IN A FSVIS
To construct the distance between information values in a FSVIS, a novel distance function should be presented.
Definition 5 [7]: Suppose A, B ∈ I^X with X = {x_1, x_2, · · · , x_l}. Then the Chebyshev distance between A and B is defined as
d(A, B) = max_{1 ≤ k ≤ l} |A(x_k) − B(x_k)|.
Definition 6: Let (U, X, A) be a FSVIS with X = {x_1, x_2, · · · , x_l}. Given u, v ∈ U and a ∈ A. Then the distance between a(u) and a(v) is defined as
Then d_a is called the distance matrix of a in a FSVIS.
Example 7 (Continued From Example 4): By Definition 5, we have d_{a_1} to d_{a_8} as shown on the next page.
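To make the distance matrix concrete, the following minimal sketch computes the Chebyshev distance (the largest absolute difference of membership degrees) between fuzzy sets and assembles the matrix for one attribute. It assumes that the distance of Definition 6 is the Chebyshev distance applied to a(u) and a(v), and that each fuzzy set is stored as a membership vector over x_1, ..., x_l; the function names and the sample values are hypothetical.

```python
def chebyshev(p, q):
    """Chebyshev distance between two fuzzy sets given as equal-length membership vectors."""
    if len(p) != len(q):
        raise ValueError("both fuzzy sets must be defined on the same finite set X")
    return max((abs(a - b) for a, b in zip(p, q)), default=0.0)


def distance_matrix(column):
    """Pairwise distances between the information values a(u_1), ..., a(u_n) of one attribute."""
    return [[chebyshev(p, q) for q in column] for p in column]


# Hypothetical values of one attribute a on three objects, with l = 3.
a_values = [
    [0.2, 0.7, 1.0],  # a(u1)
    [0.3, 0.4, 1.0],  # a(u2)
    [0.2, 0.7, 0.5],  # a(u3)
]

d_a = distance_matrix(a_values)
for row in d_a:
    print([round(x, 2) for x in row])
# [0.0, 0.3, 0.5]
# [0.3, 0.0, 0.5]
# [0.5, 0.5, 0.0]
```

The matrix is symmetric with a zero diagonal, which is what later makes the induced relation reflexive and symmetric.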

IV. TOLERANCE RELATIONS IN A FSVIS
Proof: It is apparent.
This proposition shows that when θ is constant, the larger the subset of A is, the smaller the corresponding tolerance relation is. For a fixed subset, however, the tolerance relation grows as the value of θ increases.
An algorithm for computing the tolerance class is designed as below.
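A minimal sketch of such a computation is given here. It is not the authors' algorithm: it assumes that two objects u and v are related whenever the Chebyshev distance between a(u) and a(v) is at most θ for every attribute a ∈ B, which is consistent with the monotonicity remarks above. The function names and the toy table are hypothetical.

```python
def chebyshev(p, q):
    """Largest absolute difference between two equal-length membership vectors."""
    return max(abs(a - b) for a, b in zip(p, q))


def tolerance_classes(table, attributes, theta):
    """Map each object u to the list of objects v with d(a(u), a(v)) <= theta for all a in B.

    ASSUMPTION: the tolerance relation is the threshold relation described in the lead-in.
    `table` maps object -> attribute -> membership vector.
    """
    objects = list(table)
    return {
        u: [v for v in objects
            if all(chebyshev(table[u][a], table[v][a]) <= theta for a in attributes)]
        for u in objects
    }


# Hypothetical toy table with two attributes and l = 2.
table = {
    "u1": {"a1": [0.2, 0.7], "a2": [0.5, 0.5]},
    "u2": {"a1": [0.3, 0.6], "a2": [0.9, 0.5]},
    "u3": {"a1": [0.8, 0.1], "a2": [0.5, 0.4]},
}

print(tolerance_classes(table, ["a1"], 0.2))
# {'u1': ['u1', 'u2'], 'u2': ['u1', 'u2'], 'u3': ['u3']}
print(tolerance_classes(table, ["a1", "a2"], 0.2))
# {'u1': ['u1'], 'u2': ['u2'], 'u3': ['u3']}
print(tolerance_classes(table, ["a1", "a2"], 0.7))
# {'u1': ['u1', 'u2', 'u3'], 'u2': ['u1', 'u2', 'u3'], 'u3': ['u1', 'u2', 'u3']}
```

In this toy run, adding an attribute shrinks the classes and raising θ enlarges them. The double loop over objects costs O(n² · |B| · l) time, which is adequate for small tables.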

V. INFORMATION STRUCTURES IN A FSVIS
In this section, information structures in a FSVIS are considered.
can be referred to as the fuzzy neighborhood or the information granule of the point u_i.
(1) S^{θ_2}(C) is said to depend on S^{θ_1}(B), written as S^{θ_1}(B) ⪯ S^{θ_2}(C);
(2) S^{θ_2}(C) is said to depend strictly on S^{θ_1}(B), if S^{θ_1}(B) ⪯ S^{θ_2}(C) and S^{θ_1}(B) ≠ S^{θ_2}(C), written as S^{θ_1}(B) ≺ S^{θ_2}(C).

VI. MEASURING UNCERTAINTY OF A FSVIS
In this part, UM for a FSVIS is presented and the performance of the presented measures is analyzed.
This proposition expresses that θ-information granulation increases as available information gets coarser, and vice versa. Consequently, it can be concluded that the uncertainty of a FSVIS can be evaluated by the θ-information granulation displayed in Definition 20.
Proof: This is obtained by Theorem 18 and Proposition 22.
This proposition expresses that θ-information granulation is monotonic.
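The monotonic behavior can be illustrated numerically. The sketch below assumes a form of information granulation that is common in the rough-set literature, G = (1/n²) Σ_i |T(u_i)|, where T(u_i) is the tolerance class of u_i and n = |U|. This formula is an assumption made for illustration and may differ from Definition 20; the function name and the class data are hypothetical.

```python
def information_granulation(classes):
    """ASSUMED form: G = (1 / n^2) * sum of the tolerance class sizes, with n = |U|."""
    n = len(classes)
    return sum(len(c) for c in classes.values()) / (n * n)


# Hypothetical tolerance classes on U = {u1, u2, u3}, from finest to coarsest.
finest = {"u1": ["u1"], "u2": ["u2"], "u3": ["u3"]}
middle = {"u1": ["u1", "u2"], "u2": ["u1", "u2"], "u3": ["u3"]}
coarsest = {u: ["u1", "u2", "u3"] for u in ("u1", "u2", "u3")}

for name, classes in [("finest", finest), ("middle", middle), ("coarsest", coarsest)]:
    print(name, round(information_granulation(classes), 3))
# finest 0.333
# middle 0.556
# coarsest 1.0
```

Under this assumed form the value ranges from 1/n for the finest structure to 1 for the coarsest one, so it grows as the classes become coarser.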

C. ENTROPY MEASURE FOR A FSVIS
Rough entropy, introduced by Yao, is called co-entropy in [3]. θ-rough entropy for a FSVIS is proposed below.
Definition 28: Let (U, A) be a FSVIS. Given θ ∈ [0, 1] and B ⊆ A. θ-rough entropy of (U, B) is defined as
Then ∀ i, and ∃ j, .
This proposition clarifies that the θ-rough entropy value gets bigger as the available information becomes more uncertain. Therefore, it can be concluded that the θ-rough entropy put forward in Definition 28 can evaluate the uncertainty of a FSVIS.
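As a numerical illustration of this behavior, the sketch below uses a form of rough entropy that is common in the rough-set literature, E_r = −Σ_i (1/n) log2(1/|T(u_i)|), where T(u_i) is the tolerance class of u_i and n = |U|. Both the formula and the base-2 logarithm are assumptions made for this example and may differ from Definition 28; the function name and the class data are hypothetical.

```python
from math import log2


def rough_entropy(classes):
    """ASSUMED form: E_r = -(1/n) * sum(log2(1 / |T(u_i)|)) = (1/n) * sum(log2 |T(u_i)|)."""
    n = len(classes)
    return sum(log2(len(c)) for c in classes.values()) / n


# Hypothetical tolerance classes on U = {u1, u2, u3}, from finest to coarsest.
finest = {"u1": ["u1"], "u2": ["u2"], "u3": ["u3"]}
middle = {"u1": ["u1", "u2"], "u2": ["u1", "u2"], "u3": ["u3"]}
coarsest = {u: ["u1", "u2", "u3"] for u in ("u1", "u2", "u3")}

for name, classes in [("finest", finest), ("middle", middle), ("coarsest", coarsest)]:
    print(name, round(rough_entropy(classes), 3))
# finest 0.0
# middle 0.667
# coarsest 1.585
```

Under this assumed form the value is 0 when every class is a singleton and reaches log2(n) when every object is related to every other object.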
Proposition 31: Let (U, A) be a FSVIS.
(1) It can be seen that θ-information granulation and θ-rough entropy both increase monotonically as the value of θ increases. From this, it can be concluded that the uncertainty of the eight subsystems increases as the θ value increases.
(2) It can be concluded that the changes of θ-information granulation and θ-rough entropy are closely related to θ, while the changes of θ-information amount and θ-information entropy are not closely related to θ (see FIGURE 2−9).
Example 38 (Continued From Example 37): Pick θ = 0.1, · · · , 0.9 and B_i (i = 1, 2, · · · , 8). The following results can be obtained.
Considering θ-information granulation and θ-rough entropy, the results show that the larger the subsystem, the smaller the measured value.
(2) If we pick θ = 0.2 and consider θ-information amount and θ-information entropy, the results show that the larger the subsystem, the smaller the measured value (see FIGURE 11).
For i = 1, 2, · · · , 8, denote
This means that the dispersion degrees of E^θ and H^θ are relatively small, while the dispersion degree of G^θ is the largest.
From FIGURE 2−9 and these standard deviation coefficients, the following results can be obtained.
(1) If only monotonicity is needed, then G^θ, E^θ, E_r^θ and H^θ can all measure the uncertainty of a FSVIS well;
(2) if only the dispersion degree is taken into account, then G^θ and E_r^θ measure the uncertainty of a FSVIS relatively better, while E^θ and H^θ measure it worse.
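The dispersion comparison above rests on the standard deviation coefficient (coefficient of variation), that is, the standard deviation divided by the mean. The following minimal sketch shows how it can be computed; it assumes the population standard deviation, and the two value lists are hypothetical numbers, not results from the paper.

```python
from statistics import mean, pstdev


def cv(values):
    """Standard deviation coefficient: population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("the coefficient is undefined when the mean is zero")
    return pstdev(values) / m


# Hypothetical measure values over nine values of theta (NOT taken from the paper).
spread_out = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90]
nearly_flat = [0.90, 0.91, 0.92, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98]

print(cv(spread_out) > cv(nearly_flat))  # True: the first list is more dispersed
```

Because the coefficient is scale-free, it allows measures with different value ranges to be compared on the same footing.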
The correlation between X and Y can be obtained according to TABLE 4.
The compared results of Pearson's correlation coefficients of θ-measure sets are displayed in TABLE 13. From TABLE 14, the following conclusion is demonstrated (see TABLE 15).
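For reference, Pearson's correlation coefficient between two equally long sequences of measure values can be computed as in the minimal sketch below. The function name and the sample sequences are hypothetical and do not reproduce the values in the tables.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient r between two equally long numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of the same length, at least 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("r is undefined for a constant sequence")
    return cov / (sx * sy)


# Hypothetical measure values (NOT taken from the paper).
g = [0.2, 0.3, 0.5, 0.8]
h = [0.8, 0.7, 0.5, 0.2]   # decreases exactly as g increases

print(round(pearson(g, g), 6))  # 1.0
print(round(pearson(g, h), 6))  # -1.0
```

A value of r close to 1 indicates a strong positive linear relationship between two measure sets, a value close to −1 a strong negative one, and a value near 0 little linear relationship.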

VIII. CONCLUSION
This article has investigated UM for FSVISs. Information structures in a FSVIS have been defined. Four kinds of tools have been presented to measure the uncertainty of a FSVIS by using its information structures. Effectiveness analysis has been conducted in terms of dispersion and correlation to show the feasibility of the presented measures. In the near future, we will consider three-way decisions based on a FSVIS.