Measures of Uncertainty for an Incomplete Set-Valued Information System With the Optimal Selection of Subsystems: Gaussian Kernel Method

A set-valued information system (SVIS) with missing values is known as an incomplete set-valued information system (ISVIS). This article studies uncertainty measurement for an ISVIS and the optimal selection of subsystems by means of the Gaussian kernel. First, the distance between two information values on each attribute in an ISVIS is put forward. Second, the fuzzy $T_{cos}$-equivalence relation induced by a given subsystem is proposed based on the Gaussian kernel. Next, several tools are used to measure the uncertainty of an ISVIS. Moreover, effectiveness analysis is carried out from a statistical point of view. In the end, the optimal selection of subsystems based on $\delta$-information granulation and $\delta$-information amount is given. These results will help us comprehend the nature of uncertainty in an ISVIS.


I. INTRODUCTION
Rough set theory, as a mathematical tool for dealing with inaccuracy and uncertainty in data analysis, has been successfully applied to many fields [17]-[21], [25], [26]. From a philosophical point of view, rough set theory is established on the assumption that each object in the universe is connected with some information, expressed by means of some attributes used for object description [18]. Accordingly, an information system (IS) is a database that represents relationships between objects and attributes. If the information values of each object in an IS are sets, then the IS is called a set-valued information system (SVIS). Several scholars have studied SVISs. For instance, Yao [30] presented a set model for SVISs with upper and lower approximations and studied generalized decision logic; on the basis of the knowledge induction process, Leung et al. [10] discussed a rough set approach for selecting decision rules with minimum feature sets in SVISs; Qian et al. [22] proposed a dominance relation for SVISs. (The associate editor coordinating the review of this manuscript and approving it for publication was Farhana Jabeen Jabeen.)
Uncertainty is caused by the limited resolution and incomplete description of data. Measures of uncertainty have gradually become a significant research topic and have attracted considerable attention. Aiming at the uncertainty of an IS, Shannon [24] introduced the concept of entropy and used it to quantify uncertainty. Later, Liang et al. [11] studied information granules and entropy theory in ISs; Liang et al. [12] investigated several kinds of entropy in incomplete ISs; Dai et al. [2] considered entropy measures in SVISs; Qian et al. [23] considered fuzzy information entropy and granularity; Xu et al. [28] investigated rough entropy in ordered ISs; Dai et al. [4] proposed an extended conditional entropy in interval-valued decision systems; Dai et al. [3] put forward the θ-rough degree in interval-valued ISs on the foundation of θ-similarity entropy; Dai et al. [5] explored entropy and granularity measures in SVISs; Huang et al. [8] investigated uncertainty measures for intuitionistic fuzzy approximation spaces; Huang et al. [9] gave uncertainty measures in interval-valued intuitionistic fuzzy ISs; Xie et al. [27] proposed a new method to measure the uncertainty of interval-valued ISs; Zhang et al. [37] measured the uncertainty of fully fuzzy ISs; Li et al. [13], [14] considered uncertainty measurements in fuzzy relation ISs and covering ISs.
An incomplete set-valued information system (ISVIS) is a SVIS with missing values. An ISVIS itself carries uncertainty, and how to measure this uncertainty is a crucial research topic; this article studies the issue. The similarity degree between two information values on a given attribute in an ISVIS is constructed, and the distance between two objects is given. The fuzzy $T_{cos}$-equivalence relation induced by a given subsystem of an ISVIS is obtained by means of the Gaussian kernel. The uncertainty of an ISVIS is then measured. Effectiveness analysis is carried out from the angle of statistics. Based on these measures, the optimal selection of subsystems is given. The work process of the article is shown in FIGURE 1.
The rest of the article is arranged as follows. In the second section, we review some basic concepts about fuzzy sets and fuzzy relations. In the third section, we construct the distance between two information values on a given attribute in an ISVIS and give the distance between two objects in an ISVIS. In the fourth section, we study the fuzzy $T_{cos}$-equivalence relation by means of the Gaussian kernel. In the fifth section, we investigate relationships between two ISVISs and present the inclusion degree of ISVISs. In the sixth section, we measure the uncertainty of a given ISVIS. In the seventh section, we carry out effectiveness analysis from three aspects. In the eighth section, we obtain the optimal selection of subsystems based on the proposed measures. In the ninth section, we summarize the article.

II. PRELIMINARIES
In this section, we briefly recall some concepts about fuzzy sets, fuzzy relations and ISVISs.
Throughout this article, U denotes a non-empty finite set and I denotes the unit interval [0, 1].

A. FUZZY SETS AND FUZZY RELATIONS
If F is a mapping F : U → I, then F is called a fuzzy set on U.
In this article, $I^U$ denotes the family of all fuzzy sets on U, and $\bar{a}$ denotes the constant fuzzy set on U with value a for each a ∈ I. If R is a fuzzy set on U × U, then R is called a fuzzy relation on U. $I^{U \times U}$ denotes the family of all fuzzy relations on U.
Given F ∈ $I^U$ and u ∈ U, F(u) indicates the degree to which u belongs to F. Similarly, given R ∈ $I^{U \times U}$ and u, v ∈ U, R(u, v) indicates the degree to which (u, v) belongs to R; thus R(u, v) can be regarded as the degree of similarity between u and v. For $U = \{u_1, \dots, u_n\}$, R is usually represented by the matrix $M(R) = (R(u_i, u_j))_{n \times n}$. Suppose R ∈ $I^{U \times U}$ and u ∈ U. Then the fuzzy set $[u]_R$ is defined by $[u]_R(v) = R(u, v)$ for each v ∈ U; $[u]_R$ can be viewed as the fuzzy neighborhood of the point u on U under R.
Definition 1 [16]: A function T : I² → I is called a t-norm if it satisfies, for all a, b, c ∈ I: (1) T(a, b) = T(b, a) (commutativity); (2) T(a, T(b, c)) = T(T(a, b), c) (associativity); (3) T(a, b) ≤ T(a, c) whenever b ≤ c (monotonicity); (4) T(a, 1) = a (boundary condition).
Definition 2 [38]: Let T be a t-norm and suppose R ∈ $I^{U \times U}$. Then R is a T-fuzzy equivalence relation on U if it satisfies: (1) R(u, u) = 1 for all u ∈ U (reflexivity); (2) R(u, v) = R(v, u) for all u, v ∈ U (symmetry); (3) T(R(u, v), R(v, w)) ≤ R(u, w) for all u, v, w ∈ U (T-transitivity). In particular, the t-norm $T_{cos}$ is given by $T_{cos}(a, b) = \max\{ab - \sqrt{1 - a^2}\sqrt{1 - b^2},\, 0\}$.
Proposition 3 [15]: Assume that f : U × U → I satisfies f(u, u) = 1 for all u ∈ U. Then for all u, v, w ∈ U,
Corollary 4: Given R ∈ $I^{U \times U}$. If R is reflexive, then R is $T_{cos}$-transitive.
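As a quick numerical illustration of the boundary condition and commutativity of a t-norm, the sketch below evaluates $T_{cos}$, assuming the standard form $T_{cos}(a, b) = \max\{ab - \sqrt{1-a^2}\sqrt{1-b^2}, 0\}$ from the Gaussian-kernel fuzzy rough set literature (the formula is supplied here for illustration, since the source does not print it):

```python
import math

def t_cos(a: float, b: float) -> float:
    """T_cos t-norm (assumed standard form): max{ab - sqrt(1-a^2)*sqrt(1-b^2), 0}."""
    return max(a * b - math.sqrt(1 - a * a) * math.sqrt(1 - b * b), 0.0)

# Boundary condition T(a, 1) = a:
assert abs(t_cos(0.7, 1.0) - 0.7) < 1e-12
# Commutativity T(a, b) = T(b, a):
assert t_cos(0.4, 0.9) == t_cos(0.9, 0.4)
```

Note that $T_{cos}$ truncates at 0 when the two arguments are both small, which is why it pairs naturally with similarity values close to 1.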

B. ISVISs
Definition 5 [18]: Let U be a finite object set and A a finite attribute set. Then the pair (U, A) is called an information system (IS) if each attribute a ∈ A determines an information function a : U → V_a, where V_a = {a(u) : u ∈ U} is the set of information function values of the attribute a.
Let (U, A) be an IS and P ⊆ A. Then an equivalence relation on U can be defined as
$ind(P) = \{(u, v) \in U \times U : \forall\, a \in P,\ a(u) = a(v)\}$.
Definition 6 [18]: Assume that (U, A) is an IS. Then (U, A) is said to be an incomplete information system (IIS) if there exist u ∈ U and a ∈ A such that a(u) is missing.
If (U, A) is an IIS and P ⊆ A, then a tolerance relation on U can be defined as
$\{(u, v) \in U \times U : \forall\, a \in P,\ a(u) = a(v) \text{ or } a(u) = * \text{ or } a(v) = *\}$,
where * denotes a missing value.
Definition 7 [29]: Suppose that (U , A) is an IS. Then (U , A) is referred to as a set-valued information system (SVIS), if for any a ∈ A and u ∈ U , a(u) is a set.
If (U, A) is a SVIS, P ⊆ A and θ ∈ [0, 1], then a tolerance relation on U can be defined as
$\{(u, v) \in U \times U : \forall\, a \in P,\ s(a(u), a(v)) \geq \theta\}$,
where $s(a(u), a(v)) = \frac{|a(u) \cap a(v)|}{|a(u) \cup a(v)|}$ is the similarity degree between a(u) and a(v).
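The similarity degree can be sketched directly; the function below assumes the $|a(u) \cap a(v)| / |a(u) \cup a(v)|$ (Jaccard) reading of the formula:

```python
def similarity(x: set, y: set) -> float:
    """Similarity degree between two set values: |x ∩ y| / |x ∪ y| (Jaccard index)."""
    if not x and not y:
        return 1.0  # convention: two empty set values are fully similar
    return len(x & y) / len(x | y)

# a(u) = {English, French} and a(v) = {French, German} share 1 of 3 values
assert similarity({"English", "French"}, {"French", "German"}) == 1 / 3
```

With threshold θ = 0.3, for instance, these two objects would be related on this attribute, since 1/3 ≥ 0.3.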
Definition 8 [29]: Given that (U, A) is an IS, (U, A) is called an incomplete set-valued information system (ISVIS) if (U, A) is both incomplete and set-valued.
If P ⊆ A, then (U , P) is referred to as the subsystem of (U , A).

III. DISTANCE BETWEEN TWO OBJECTS IN AN ISVIS
Definition 10: Let (U, A) be an ISVIS. Then, for all u, v ∈ U and a ∈ A, the distance between a(u) and a(v) is defined as follows.
According to the above definition, the distance between two objects in an ISVIS is defined as follows.
Definition 11: Suppose that (U , A) is an ISVIS. Given P ⊆ A. ∀ u, v ∈ U , the distance between u and v in the subsystem (U , P) is defined as
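Since the exact formulas of Definitions 10 and 11 are not reproduced above, the sketch below only illustrates the general shape of such a distance under explicit assumptions: the per-attribute distance is taken as 1 minus Jaccard similarity, a missing value (*) is treated as distance 0, and the object distance is the root mean square over the attributes of P. All three choices are hypothetical stand-ins, not the article's definitions.

```python
import math

MISSING = None  # stands in for the missing value "*"

def attr_distance(x, y):
    """Hypothetical per-attribute distance: 1 - Jaccard similarity;
    a pair involving a missing value is treated as indiscernible (distance 0)."""
    if x is MISSING or y is MISSING:
        return 0.0
    return 1.0 - len(x & y) / len(x | y)

def object_distance(u, v, P):
    """Hypothetical distance between objects u and v over the attribute subset P:
    root mean square of the per-attribute distances."""
    return math.sqrt(sum(attr_distance(u[a], v[a]) ** 2 for a in P) / len(P))

u = {"a1": {1, 2}, "a2": MISSING}
v = {"a1": {2, 3}, "a2": {0}}
d = object_distance(u, v, ["a1", "a2"])
assert 0.0 <= d <= 1.0
```

Any distance of this shape is 0 on identical objects and bounded by 1, which is what the Gaussian kernel construction in the next section requires.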

IV. FUZZY $T_{cos}$-EQUIVALENCE RELATION BASED ON GAUSSIAN KERNEL IN AN ISVIS
In this section, the fuzzy $T_{cos}$-equivalence relation induced by a given subsystem of an ISVIS is given by means of the Gaussian kernel. The Gaussian kernel $G(u, v) = \exp\left(-\frac{\|u - v\|^2}{2\delta^2}\right)$ is used to compute the similarity between two objects u and v, where $\|u - v\|$ is the Euclidean distance between u and v and δ is a threshold; in this article, we pick δ ∈ (0, 1]. Obviously, G(u, v) satisfies G(u, u) = 1, G(u, v) = G(v, u) and 0 < G(u, v) ≤ 1. The relation $R_P^G(\delta)$ is obtained by applying the Gaussian kernel to the distance between u and v in (U, P) given in Definition 11, and the matrix $(R_P^G(\delta)(u_i, u_j))_{n \times n}$ is called the Gaussian kernel matrix of the subsystem (U, P) with respect to δ. For any u ∈ U, the fuzzy set $[u]_{R_P^G(\delta)}$ is defined by $[u]_{R_P^G(\delta)}(v) = R_P^G(\delta)(u, v)$ for each v ∈ U.

Algorithm 1 The $T_{cos}$-Equivalence Relation
Input: an ISVIS (U, A), a subset P ⊆ A and δ ∈ (0, 1]. Output: $R_P^G(\delta)$.
For each pair of objects u, v ∈ U, compute the distance between u and v in (U, P) (Definition 11) and apply the Gaussian kernel to obtain $R_P^G(\delta)(u, v)$.
$[u]_{R_P^G(\delta)}$ can be viewed as the fuzzy neighborhood of the point u on U with respect to δ in the subsystem (U, P). $R_A^G(\delta)$ is the $T_{cos}$-equivalence relation induced by the system (U, A) with respect to δ.
Given P ⊆ A and δ ∈ (0, 1], Algorithm 1 computes the $T_{cos}$-equivalence relation $R_P^G(\delta)$ of the subsystem (U, P).
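The core step of Algorithm 1, applying the Gaussian kernel to pairwise object distances, can be sketched as follows; the distance function passed in is a placeholder for the article's Definition 11 (here a hypothetical fraction-of-differing-attributes toy):

```python
import math

def gaussian_relation(objects, P, distance, delta):
    """Gaussian-kernel relation matrix: R[i][j] = exp(-d(u_i, u_j)^2 / (2 delta^2))."""
    n = len(objects)
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d = distance(objects[i], objects[j], P)
            R[i][j] = math.exp(-(d * d) / (2 * delta * delta))
    return R

def toy_distance(u, v, P):
    """Hypothetical stand-in for Definition 11: fraction of attributes in P that differ."""
    return sum(u[a] != v[a] for a in P) / len(P)

objects = [{"a": 1, "b": 2}, {"a": 1, "b": 3}, {"a": 1, "b": 2}]
R = gaussian_relation(objects, ["a", "b"], toy_distance, delta=0.5)

# Reflexivity and symmetry, as required of a fuzzy T_cos-equivalence relation
assert R[0][0] == 1.0 and R[1][1] == 1.0
assert R[0][1] == R[1][0]
```

Each row of R is the fuzzy neighborhood $[u_i]_{R_P^G(\delta)}$; shrinking δ makes the neighborhoods sharper.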
(2) Suppose (U, P) ⊑_δ (U, Q). Then the required inequality follows from Definition 19. From the above, we know that D_δ is an inclusion degree.
The theorem below shows that the inclusion degree is able to quantify relationships between subsystems.

B. ENTROPY MEASUREMENTS FOR AN ISVIS
Definition 30: Let (U, A) be an ISVIS. Given P ⊆ A and δ ∈ (0, 1]. Then the δ-rough entropy of (U, P) is defined as

The results of these experiments are shown in FIGURE 2. From FIGURE 2, the following can be observed: 1) $(E_r)_\delta$ and $H_\delta$ are more sensitive than $G_\delta$; 2) $(E_r)_\delta$ and $H_\delta$ are more sensitive than $E_\delta$; 3) $G_\delta$ and $E_\delta$ behave almost the same. Thus, δ-rough entropy and δ-information entropy are more suitable than δ-information amount and δ-information granulation for measuring the uncertainty of an ISVIS.

VII. EFFECTIVENESS ANALYSIS
In this section, effectiveness analysis is put forward from three aspects.

A. DISPERSION ANALYSIS
Assume that X = {x₁, …, x_n} is a data set. Its arithmetic average value $\bar{x}$, standard deviation σ(X) and standard deviation coefficient CV(X) are defined as follows:
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$, $\quad \sigma(X) = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$, $\quad CV(X) = \frac{\sigma(X)}{\bar{x}}$.
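A minimal sketch of the three statistics (using the population form of the standard deviation, an assumption since the source formulas are not shown):

```python
import math

def mean(X):
    return sum(X) / len(X)

def std(X):
    """Population standard deviation sigma(X)."""
    m = mean(X)
    return math.sqrt(sum((x - m) ** 2 for x in X) / len(X))

def cv(X):
    """Standard deviation coefficient CV(X) = sigma(X) / mean(X)."""
    return std(X) / mean(X)

X = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
assert mean(X) == 5.0
assert std(X) == 2.0
assert cv(X) == 0.4
```

A smaller CV means a more concentrated data set, which is how FIGURE 3 compares the dispersion of the four measures.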

Example 41 (Continued From Example 40): Denote the data sets of the four measures by $X_G$, $X_{E_r}$, $X_E$ and $X_H$.
Then CV(X_G) = 0.3487 and CV(X_{E_r}) = 0.4963. The results are shown in FIGURE 3, so the dispersion degree of $G_\delta$ reaches the minimum.
VOLUME 8, 2020
From FIGUREs 2 and 3, the following results can be obtained: (1) $(E_r)_\delta$ and $H_\delta$ have better performance for measuring the uncertainty of an ISVIS if only monotonicity is considered; (2) $(E_r)_\delta$ has better performance if both monotonicity and dispersion degree are considered.

B. ASSOCIATION ANALYSIS
In statistics, the Pearson correlation coefficient is a measure of the strength of the linear correlation between two data sets.
Suppose that X = {x₁, x₂, …, x_n} and Y = {y₁, y₂, …, y_n} are two data sets. The Pearson correlation coefficient between X and Y, denoted by r(X, Y), is defined as
$r(X, Y) = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i - \bar{y})^2}}$.
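The standard Pearson formula can be coded directly:

```python
import math

def pearson(X, Y):
    """Pearson correlation coefficient r(X, Y) between two equal-length data sets."""
    n = len(X)
    mx, my = sum(X) / n, sum(Y) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(X, Y))
    sx = math.sqrt(sum((x - mx) ** 2 for x in X))
    sy = math.sqrt(sum((y - my) ** 2 for y in Y))
    return cov / (sx * sy)

assert abs(pearson([1, 2, 3], [2, 4, 6]) - 1.0) < 1e-12   # perfect positive correlation
assert abs(pearson([1, 2, 3], [6, 4, 2]) + 1.0) < 1e-12   # perfect negative correlation
```

Values near ±1 indicate a strong linear relationship between two uncertainty measures; values near 0 indicate no linear relationship.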

C. FRIEDMAN TEST AND BONFERRONI-DUNN TEST
To further explore whether the performances of the uncertainty measurements on the six subsystems are significantly different, the Friedman test [6] and the Bonferroni-Dunn test [1] are used in this subsection.
The Friedman test is a statistical test that uses the ranks of algorithms. The Friedman statistic is defined as
$\chi_F^2 = \frac{12N}{k(k+1)}\left(\sum_{i=1}^{k} r_i^2 - \frac{k(k+1)^2}{4}\right)$,
where k is the number of algorithms, N is the number of data sets and r_i is the average ranking of the i-th algorithm. When k and N are large enough, the Friedman statistic follows the chi-square distribution with k − 1 degrees of freedom. However, this Friedman test is too conservative and is usually replaced by the statistic
$F_F = \frac{(N-1)\chi_F^2}{N(k-1) - \chi_F^2}$.
The statistic F_F follows the Fisher distribution with k − 1 and (k − 1)(N − 1) degrees of freedom. If the value of F_F is larger than the critical value F_α(k − 1, (k − 1)(N − 1)), the null hypothesis is rejected under the Friedman test. Then the Bonferroni-Dunn test can be used to further explore which algorithm is better in statistical terms. If the difference between the average ranks of two algorithms exceeds the critical distance CD_α, then the performance of the two algorithms is significantly different. The critical distance is defined as
$CD_\alpha = q_\alpha \sqrt{\frac{k(k+1)}{6N}}$,
where q_α is a critical value calculated by the qtukey function in R and α is the significance level. In our experiments, the value of F_F exceeds the critical value F_{0.05}(3, 15); this means that at the significance level α = 0.05 there is evidence to reject the null hypothesis, i.e., the four uncertainty measurements are different in statistical significance.
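The two statistics and the critical distance can be sketched as follows; with k = 4 measures over N = 6 subsystems and q_{0.05} = 2.569, this reproduces the CD_α ≈ 1.915 used later in the article.

```python
import math

def friedman(avg_ranks, N):
    """Friedman chi-square statistic and its F-distributed refinement F_F.
    avg_ranks: average rank r_i of each of the k algorithms over N data sets."""
    k = len(avg_ranks)
    chi2 = 12 * N / (k * (k + 1)) * (sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4)
    f_f = (N - 1) * chi2 / (N * (k - 1) - chi2)
    return chi2, f_f

def critical_distance(q_alpha, k, N):
    """Bonferroni-Dunn critical distance CD_alpha = q_alpha * sqrt(k(k+1) / (6N))."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * N))

cd = critical_distance(2.569, 4, 6)
assert abs(cd - 1.915) < 1e-3
```

Two measures whose average ranks differ by more than CD_α are declared significantly different.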

Example 43 (Continued From Example 40): We have
(3) To further show the significant differences among the four measurements, the Bonferroni-Dunn test is introduced. For α = 0.05, we can calculate the corresponding critical distance $CD_\alpha = 2.569 \times \sqrt{\frac{4 \times (4+1)}{6 \times 6}} = 1.915$. FIGURE 4 shows: 1) the performance of $E_\delta$ is statistically different from the performance of $(E_r)_\delta$;
2) a) There is no significant difference among G δ , E δ and H δ ; b) There is no significant difference between H δ and E δ .

VIII. OPTIMAL SELECTION OF SUBSYSTEMS BASED ON UNCERTAINTY MEASURES
In the sections above, we used relationships between two ISVISs to study uncertainty measures, which naturally raises two questions: when does an uncertainty measure reach its optimal value (i.e., the maximum or minimum value), and how do we determine the corresponding subsystem (which we call the optimal subsystem)? In this section, the optimal selection of subsystems based on δ-information granulation and δ-information amount is obtained. (1) If there exists B₁ ⊆ A such that $G_\delta(B_1) = \max\{G_\delta(B) : B \subseteq A\}$, then (U, B₁) is called a maximum subsystem in (U, A) based on δ-information granulation. (2) If there exists B₂ ⊆ A such that $G_\delta(B_2) = \min\{G_\delta(B) : B \subseteq A\}$, then (U, B₂) is called a minimum subsystem in (U, A) based on δ-information granulation.
The maximum subsystem and minimum subsystem in (U , A) are collectively called the optimal subsystems based on δ-information granulation. Thus (U , {a 0 }) is a maximum subsystem in (U , A) based on δ-information granulation.
(1) If there exists B₁ ⊆ A such that $E_\delta(B_1) = \max\{E_\delta(B) : B \subseteq A\}$, then (U, B₁) is called a maximum subsystem in (U, A) based on δ-information amount. (2) If there exists B₂ ⊆ A such that $E_\delta(B_2) = \min\{E_\delta(B) : B \subseteq A\}$, then (U, B₂) is called a minimum subsystem in (U, A) based on δ-information amount.
The maximum subsystem and minimum subsystem in (U , A) based on δ-information amount are collectively called the optimal subsystems based on δ-information amount. Thus, (U , {a 1 }) is a minimum subsystem in (U , A) based on δ-information amount, (U , {a 1 , a 3 , a 5 , a 6 }) is a maximum subsystem in (U , A) based on δ-information amount.
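The optimal subsystems can, in principle, be found by exhaustive search over all non-empty attribute subsets (exponential in |A|, so only feasible for small attribute sets); the measure below is a hypothetical stand-in for $G_\delta$ or $E_\delta$.

```python
from itertools import combinations

def optimal_subsystems(A, measure):
    """Return the attribute subsets attaining the maximum and minimum of `measure`,
    searched exhaustively over all non-empty subsets of A."""
    subsets = [frozenset(c) for r in range(1, len(A) + 1) for c in combinations(A, r)]
    values = {B: measure(B) for B in subsets}
    best_max = max(subsets, key=values.get)
    best_min = min(subsets, key=values.get)
    return best_max, best_min

# Hypothetical measure that shrinks as attributes are added (granulation-like behavior)
toy_measure = lambda B: 1.0 / len(B)
mx, mn = optimal_subsystems(["a1", "a2", "a3"], toy_measure)
assert len(mx) == 1   # a singleton attains the maximum
assert len(mn) == 3   # the full attribute set attains the minimum
```

When a measure is monotone with respect to subset inclusion, as the article's results suggest for δ-information granulation, the search collapses: the extremes are attained at singletons and at the full attribute set.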

IX. CONCLUSION
This article has measured the uncertainty of an ISVIS by means of Gaussian kernel and given the optimal selection of subsystems. Relationships between ISVISs have been investigated. Four tools of measuring the uncertainty of an ISVIS have been proposed. Effectiveness analysis about the proposed measures has been done from the angle of statistics. Based on δ-information granulation and δ-information amount, the optimal selection of subsystems has been given. In the future, we will examine applications of the proposed measures for an ISVIS.