New Measures of Uncertainty for Interval-Valued Data With Application to Attribute Reduction

Uncertainty measurement (UM) gives a brand-new perspective on attribute reduction in an information system (IS). Interval-valued data is a vital kind of data in rough set theory (RST). Rough set models based on tolerance relations can be used to deal with interval-valued data; however, such tolerance relations have deficiencies when they are used in fuzzy rough computation. This paper studies new UMs for an interval-valued information system (IVIS) and considers its attribute reduction. Firstly, a novel fuzzy symmetry relation on the object set of an IVIS is established based on “The similarity between information values that is fed back to the attribute set”. Secondly, $\lambda$-information granules are obtained on the basis of this fuzzy symmetry relation. Then, four UMs for an IVIS are investigated. Next, numerical experiments and statistical tests are used to evaluate the performance of the proposed UMs. Moreover, attribute reduction in an IVIS is studied and the relevant algorithms are proposed. Finally, clustering analysis on the reduced IVIS is conducted. Experimental results indicate that the proposed algorithms are effective according to evaluation indicators of clustering performance. This paper provides a novel viewpoint for the establishment of fuzzy symmetry relations and attribute reduction algorithms.


I. INTRODUCTION
Rough set theory (RST), proposed by Pawlak [23], is a powerful tool to handle uncertainty. RST depends only on the data itself; in other words, it does not need any additional information, so its results are more objective and reliable. Nevertheless, two extended rough set models have been raised for the sake of overcoming the disadvantage of too-strict equivalence conditions. One is to introduce weaker relations, such as tolerance relations, similarity relations, dominance relations, or reflexive relations [34]; the other is to apply fuzzy set theory (FST) to RST [14].
An information system (IS) in the light of RST was also raised by Pawlak. Uncertainty exists nearly everywhere, so uncertainty measurement (UM) has become a significant issue in a range of fields. The study of various UMs in ISs helps us understand the nature of information. UMs can be used for attribute reduction, rule acquisition, pattern recognition and clustering analysis. Some researchers have made explorations in this respect and achieved a great deal of excellent research results. For instance, Düntsch et al. [11] investigated the measurement of decision rules on the basis of Shannon entropy in RST; Li et al. [20] gave UM methods based on fuzzy relations in an IS; Zeng et al. [42] researched UMs in a hybrid IS with the help of the Gaussian kernel; Li et al. [17] provided a method to measure the uncertainty of a fully fuzzy IS by means of the Gaussian kernel; Yang et al. [38] proposed UM methods for multi-source fuzzy ISs.
(The associate editor coordinating the review of this manuscript and approving it for publication was Wei Wei.)
(VOLUME 10, 2022. This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/)
Attribute reduction is a common technology in machine learning. Many datasets typically have a range of redundant attributes. Attribute reduction aims to eliminate extraneous and superfluous attributes from the original attributes of the data and to select the most effective attribute subset, thereby reducing the dimension of the data. In recent years, many attribute reduction methods have emerged one after another. For example, Zeng et al. [40] provided an FST method for incremental attribute reduction in a hybrid IS; Cornelis et al. [5] gave a general definition of fuzzy attribute reduction; Li et al. [16] studied UMs in view of description ability for attribute reduction in an IS; Chen et al. [4] proposed an attribute reduction method for heterogeneous data on the basis of the notion of fuzzy kernel alignment; Li et al. [18] advanced attribute reduction for heterogeneous data by means of information entropy; Wang et al. [31] put forward attribute reduction on account of local conditional entropy; Singh et al. [28] explored attribute reduction in an IVIS by employing an approach based on fuzzy similarity; Wang et al. [32] carried out attribute reduction by virtue of neighborhood self-information; Liu et al. [15] raised a familiar form of attribute reduction for an IS; Akram et al. [1] investigated parameter reduction under interval-valued m-polar fuzzy soft information; Ali et al. [2] researched attribute reduction in a bipolar fuzzy relation decision system. FST, proposed by Zadeh [39], describes fuzziness in precise mathematical language. FST and RST are two models for studying inaccurate, uncertain and vague information. They have their own advantages and features, and they combine to form a model called fuzzy rough sets (FRSs) [7]. Wang et al.
[29] put forward a fitting model for attribute reduction with FRSs; Chen et al. [6] raised a new attribute reduction algorithm on account of FRSs; Cornelis et al. [5] gave an extended model of attribute reduction in view of fuzzy tolerance relations within the context of FRSs; Yuan et al. [37] researched unsupervised attribute reduction for hybrid data by means of FRSs. Moreover, Wang et al. [34] proposed an integrated qualitative group decision-making method for assessing health-care waste treatment technologies based on linguistic terms with weakened hedges.
An interval-valued information system (IVIS) expresses an IS whose information values are interval-valued numbers. Owing to its rich semantic explanations and flexibility, the IVIS has attracted the attention of some scholars. Xie et al. [35] considered information structures of an IVIS and gave a new UM to measure the uncertainty of the system. Dai et al. [8] studied UMs for an IVIS. Zhang et al. [41] proposed incremental updating of rough approximations in an IVIS under attribute generalization. Sakai et al. [27] presented a rule generation prototype system for incomplete information systems in the sense of Lipski that is able to process an IVIS.
Wang et al. [34] researched attribute reduction for categorical data based on FRSs. But they did not consider UMs and attribute reduction in an IVIS.
Usually, a tolerance relation on the object set of an IVIS is established according to ''The similarity between information values that is fed back to the object set''. Rough set models based on this tolerance relation are employed to dispose of interval-valued data. However, these kinds of tolerance relations have deficiencies when they are used in fuzzy rough computation. This paper introduces a novel fuzzy symmetry relation on the object set of an IVIS, established based on ''The similarity between information values that is fed back to the attribute set''. A fuzzy rough set model based on this fuzzy symmetry relation is used to deal with interval-valued data. On this basis, this paper studies UM for interval-valued data with application to attribute reduction. The work process of this paper is displayed in FIGURE 1.

A. STRUCTURE AND ORGANIZATION
The remainder of the article is organized as follows. In Section 2, some related concepts of fuzzy relations, interval-valued numbers and IVISs are recalled. In Section 3, UMs of an IVIS are raised. In Section 4, some numerical experiments are designed to analyze the effectiveness of the proposed measures. In Section 5, an application to attribute reduction in an IVIS is given, and clustering analysis on the reduced IVIS is conducted. In Section 6, this paper is summarized.

II. PRELIMINARIES
In this article, O = {o 1 , o 2 , . . . , o n } and AT signify two nonempty finite sets: the object set and the attribute set. A fuzzy set S on O is a mapping S : O → [0, 1], and S(x) means the membership degree of x to S. The cardinality of S is denoted |S| = $\sum_{i=1}^{n}$ S(o i ).
If R is a fuzzy relation on O, then R may be represented by its relation matrix M (R) = (R(o i , o j )) n×n . If M (R) is the fuzzy identity matrix, we denote R = △; if R(o i , o j ) = 1 for all i, j ≤ n, then R is the fuzzy universal relation, and we denote R = ω.

If, for every a ∈ AT and every o ∈ O, a(o) is an interval-valued number, then (O, AT ) is called an interval-valued information system (IVIS).
If A ⊆ AT , then (O, A) is known as the subsystem of (O, AT ).

Example 5: TABLE 1 depicts an IVIS.
Usually, the condition ''∀ a ∈ A, q(a(o), a(o′)) ≥ λ'' is fed back to the object set of an IVIS. Naturally, we may consider that ''∀ a ∈ A, q(a(o), a(o′)) ≥ λ'' is fed back to the attribute set of an IVIS. For this purpose, inspired by the paper [34], we introduce the following definition.
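To illustrate the λ-similarity idea, a common choice for the similarity degree q between two interval-valued numbers is the ratio of their intersection length to their union length; this is only a plausible instance, and the exact form of q used in the paper may differ. A minimal Python sketch:

```python
def interval_similarity(x, y):
    """Similarity degree between interval-valued numbers x = (xL, xU) and
    y = (yL, yU), taken here as intersection length over union length.
    This overlap-ratio form is an illustrative assumption, not necessarily
    the paper's q."""
    xL, xU = x
    yL, yU = y
    inter = max(0.0, min(xU, yU) - max(xL, yL))
    union = max(xU, yU) - min(xL, yL)
    if union == 0.0:  # both intervals degenerate to the same point
        return 1.0
    return inter / union

# lambda-cut: objects o, o' are lambda-similar on attribute a when
# interval_similarity(a(o), a(o')) >= lam.
```

Under this choice, the relation is symmetric by construction, which matches the fuzzy symmetry relation the paper builds.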

III. MEASURING UNCERTAINTY OF AN IVIS
This part puts forward some tools for measuring uncertainty of an IVIS.

A. GRANULATION MEASURE FOR AN IVIS
Definition 9 specifies the fuzzy information granulation of (O, A) for an IVIS (O, AT ) and A ⊆ AT . Example 10: TABLE 1 illustrates an IVIS.
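Since the formula of Definition 9 is not reproduced here, the sketch below assumes a standard form of fuzzy information granulation, G(A) = (1/n²) Σᵢ Σⱼ R_A(o i , o j ), i.e. the normalized total membership of the information granules; the paper's exact normalization may differ.

```python
def fuzzy_granulation(R):
    """Assumed standard granulation of an n x n fuzzy relation matrix R
    (list of lists of memberships in [0, 1]): the mean granule size,
    G = (1/n^2) * sum of all entries."""
    n = len(R)
    return sum(map(sum, R)) / (n * n)

# Two extreme relations on a 3-object universe:
identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
universal = [[1.0] * 3 for _ in range(3)]
```

On the identity relation the granulation is minimal (1/n), and on the universal relation it is maximal (1), which is the usual boundary behavior of such measures.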

Proposition 11 gives the basic properties of fuzzy information granulation for an IVIS; its proof follows from Definition 9.

Definition 16 specifies the fuzzy rough entropy of (O, A) for an IVIS (O, AT ) and A ⊆ AT . Example 23 (Continued From Example 10) illustrates its computation on TABLE 1.

IV. EXPERIMENTAL EVALUATION
This part designs some experiments and analyzes the effectiveness of the proposed measures.

A. DATA SETS AND EXPERIMENTAL CONTENTS
Six data sets from the UCI Machine Learning Repository are picked (see TABLE 2). They are all of real-number type and can be converted to interval-valued data: the interval-valued number a′(x) converted from the information value a(x) of the object x under the attribute a is obtained by the formula a′(x) = [a(x) − ξ σ, a(x) + ξ σ ], where σ is the standard deviation of a over O. Usually, we pick ξ = 5.
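The real-to-interval conversion just described can be sketched per attribute column (the population standard deviation is assumed here; the text does not specify sample versus population):

```python
import statistics

def to_intervals(column, xi=5.0):
    """Convert one real-valued attribute column to interval-valued numbers
    via a'(x) = [a(x) - xi*sigma, a(x) + xi*sigma], where sigma is the
    column's standard deviation (population form assumed) and xi defaults
    to 5 as in the text."""
    sigma = statistics.pstdev(column)
    return [(v - xi * sigma, v + xi * sigma) for v in column]
```

Each resulting interval is centered on the original value, so the midpoint recovers the real-valued datum exactly.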

B. EXPERIMENTAL RESULTS
From FIGUREs 2a-2f, we obtain the following experimental results: G λ and E λ r are both monotonically increasing, and E λ and H λ are both monotonically decreasing, as the cardinality of the attribute subset increases. Thus, G λ, E λ, E λ r and H λ can all be applied to measure the uncertainty of an IVIS.

C. DISPERSION ANALYSIS
CV (S) is called the standard deviation coefficient of S; it is used to analyze the effectiveness of the proposed measures. Continuing the above experiment, the CV-values of the four measure sets are shown in FIGURE 3. From FIGURE 3, we can conclude that the CV-values of G λ and E λ r are much higher than those of E λ and H λ. For these six data sets, the CV-value of E λ is obviously the smallest, which indicates that the dispersion degree of E λ is minimum; that is to say, E λ has much better performance for measuring the uncertainty of an IVIS.
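The standard deviation coefficient used here is the usual coefficient of variation, CV(S) = σ(S)/mean(S); a smaller CV indicates less dispersion. A minimal sketch (population standard deviation assumed):

```python
import statistics

def cv(values):
    """Coefficient of variation (standard deviation coefficient): ratio of
    the standard deviation of the measure values to their mean."""
    return statistics.pstdev(values) / statistics.mean(values)
```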

D. CORRELATION ANALYSIS
Define pcc(S, T ) as the Pearson correlation coefficient between S and T :

$pcc(S, T) = \frac{\operatorname{cov}(S, T)}{\sigma(S)\,\sigma(T)}.$

It reflects the degree of linear correlation between S and T .
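The coefficient can be computed directly from two equally long sequences of measure values:

```python
import math

def pcc(S, T):
    """Pearson correlation coefficient between two sequences of measure
    values S and T of equal length."""
    n = len(S)
    ms, mt = sum(S) / n, sum(T) / n
    cov = sum((s - ms) * (t - mt) for s, t in zip(S, T))
    ds = math.sqrt(sum((s - ms) ** 2 for s in S))
    dt = math.sqrt(sum((t - mt) ** 2 for t in T))
    return cov / (ds * dt)
```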
The correlation between S and T can be judged according to TABLE 3. Continuing the above experiment, the pcc-values between each pair of measure sets are shown in TABLEs 4-9. From TABLEs 4-9, the correlation degrees between the four measures on the six data sets are obtained. Obviously, the six tables are the same: the correlation between the four measures is consistent across the six data sets.

E. FRIEDMAN TEST AND NEMENYI TEST
To further assess the performance of the proposed four uncertainty measures, the Friedman test and the Nemenyi test are presented in this part. The Friedman test is a statistical test on the basis of the ranking method. Its statistic is defined as

$\chi_F^2 = \frac{12N}{k(k+1)}\left(\sum_{i=1}^{k} r_i^2 - \frac{k(k+1)^2}{4}\right),$

where N and k are the numbers of data sets and algorithms, respectively, and r i expresses the average ranking of the i-th algorithm. Nevertheless, the Friedman statistic is too conservative, so it is substituted by the following statistic:

$F_F = \frac{(N-1)\chi_F^2}{N(k-1) - \chi_F^2}.$

The statistic F F follows a Fisher distribution with k − 1 and (k − 1)(N − 1) degrees of freedom.
The Nemenyi test is able to further explore which algorithm is better in statistical terms. The critical distance CD α is defined as

$CD_\alpha = q_\alpha \sqrt{\frac{k(k+1)}{6N}},$

where q α is the critical tabulated value and α is the significance level of the test. If the difference between the average ranks of two algorithms exceeds CD α , then the performance of the two algorithms is significantly different. Continued from Subsection 4.2, we have G λ (Ir) = 0.5058, E λ (Ir) = 0.0238, H λ (Ir) = 0.1808. The four UMs can be seen as four algorithms; we demonstrate the statistical significance by using the Friedman test and the Nemenyi test.
We list the ranking of CV-values of the four measure sets on six datasets in TABLE 16.
The Friedman test is used on the four measurements over the six data sets. Thus, F F follows the Fisher distribution with 3 and 15 degrees of freedom. The critical value F 0.05 (3, 15) of the Fisher distribution is 3.2874, while F F = 35.9091. Obviously, F F is much bigger than 3.2874. This means that at the significance level α = 0.05, the performances of the four measurements differ in statistical significance.
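The two test statistics and the critical distance can be computed as follows (q 0.05 ≈ 2.569 for k = 4 is the usual Nemenyi tabulated value; the average ranks in the test below are illustrative, not the paper's):

```python
import math

def friedman(avg_ranks, N):
    """Friedman statistic chi2_F and its F-distributed refinement F_F for
    k algorithms whose average ranks over N data sets are avg_ranks."""
    k = len(avg_ranks)
    chi2 = (12.0 * N / (k * (k + 1))) * (
        sum(r * r for r in avg_ranks) - k * (k + 1) ** 2 / 4.0)
    FF = (N - 1) * chi2 / (N * (k - 1) - chi2)
    return chi2, FF

def nemenyi_cd(k, N, q_alpha):
    """Nemenyi critical distance CD_alpha = q_alpha * sqrt(k(k+1)/(6N))."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * N))
```

With k = 4 and N = 6 this reproduces the stated degrees of freedom k − 1 = 3 and (k − 1)(N − 1) = 15.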
The following results can be obtained from FIGURE 4. 1) Comparing the performance of the four measurements statistically, E λ is better than H λ; in the same way, H λ is better than E λ r, and E λ r is better than G λ. 2) There is no significant difference between the performance of E λ and H λ, H λ and E λ r, or E λ r and G λ, respectively; but E λ is significantly different from G λ and E λ r.

F. PAIRED t-TEST
The paired t-test may be regarded as an extension of the one-sample t-test, except that the observations change from a group of independent, normally distributed objects to pairs of matched objects.
If the differences d i between the two members of each pair are independent of each other and drawn from a normal distribution, then the statistic below can be employed to determine whether the expectation of d i is 0:

$T = \frac{\bar{d}}{s_d / \sqrt{n}},$

where $\bar{d}$ is the mean of the d i , s d is their sample standard deviation, and n is the number of paired objects. Under the null hypothesis, the statistic T follows the t-distribution with df = n − 1 degrees of freedom. For a given significance level α, one obtains the corresponding rejection domain; if the value of T computed from the objects falls into the rejection domain, the two objects are significantly different.
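The statistic can be sketched directly (sample standard deviation of the differences, df = n − 1):

```python
import math

def paired_t(x, y):
    """Paired t statistic T = dbar / (s_d / sqrt(n)) for two measurement
    sequences x, y observed on the same n data sets."""
    n = len(x)
    d = [a - b for a, b in zip(x, y)]
    dbar = sum(d) / n
    s_d = math.sqrt(sum((di - dbar) ** 2 for di in d) / (n - 1))
    return dbar / (s_d / math.sqrt(n))
```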
Since a smaller CV-value means a better measurement, we conduct paired t-tests on the CV-values. We treat the CV-values of G λ and E λ as a pair of paired objects. Likewise, we treat the CV-values of E λ and H λ, E λ and E λ r, H λ and E λ r, H λ and G λ, and E λ r and G λ as five further pairs of paired objects, respectively. The tests are under the assumption that each pair of objects comes from normally distributed populations with unknown but equal variances. Picking α = 0.05, the test results are shown in TABLE 17.
If $\bar{d} < 0$ and p-value < 0.05, the CV-values of the first measurement are significantly smaller than those of the second; thus, the performance of the first measurement is significantly better than that of the second. From TABLE 17, we obtain the following results: a) the performance of E λ is significantly better than that of G λ, H λ and E λ r; b) the performance of G λ is the worst of the four measures.

G. THE INFLUENCE OF PARAMETER λ ON THE FOUR MEASURES
The parameter λ is utilized to adjust the fuzzy relation, so it has a significant influence on UM. We select an attribute subset P = {a 1 , a 2 , a 3 } for the Ecoli and Parkinsons data sets and adjust the value of the parameter from 0.1 to 0.9 with a step of 0.05. Then, we calculate the values of G λ, E λ, H λ and E λ r. The results are shown in FIGUREs 5a-5d and FIGUREs 6a-6d.
From FIGUREs 5a-5d and FIGUREs 6a-6d, we obtain the following conclusions.
G λ and E λ r are both monotonically decreasing as the parameter λ increases; on the contrary, E λ and H λ are both monotonically increasing as λ grows. However, they are not strictly monotonic.
V. APPLICATION TO ATTRIBUTE REDUCTION
Below, we only use Algorithm 1 to get the reduct of each IVIS.
The parameter λ has a certain impact on Algorithm 1. We first discuss its value: for the Parkinsons data set, we adjust λ from 0.1 to 0.9 with a step of 0.05. After setting λ, Algorithm 1 is run 20 times, and the mean and variance of the reduced subset size are recorded. The results are shown in FIGURE 7.
From FIGURE 7, it can be seen that the reduced subset size is decreasing with the increase of λ, but it is not strictly decreasing, and its variance is also relatively large. To balance the reduction rate and the stability, it is recommended that λ is between 0.2 and 0.6 in practical application. In the following experiments of this paper, λ is set at 0.4.
These reduction results are shown in the corresponding tables. In order to verify the performance of the algorithm, based on the work of Rousseeuw et al. [25], we use the k-medoids clustering algorithm to cluster the data sets before and after reduction. The following distance between objects is used to deal with interval-valued data.
Definition 31 [8] specifies this distance. It is worth noting that the number of clusters is set to the number of classes of the data set in all experiments. In order to intuitively display the clustering results, we take the midpoint of each interval to get a real value and then reduce the dimension with PCA. The results are shown in FIGUREs 8a, 8b, 9a, 9b, 10a, 10b, 11a and 11b. The silhouette coefficient values of the four data sets before and after reduction are shown in FIGURE 12. From FIGUREs 8a-11b, it can be seen that on these four data sets the reduced data is more concentrated within each class and more dispersed among the classes, indicating that the quality of the reduced data is better. FIGURE 12 shows that the silhouette coefficient of each reduced data set is larger, indicating that the reduced data set has a better clustering effect.
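Since the formula of Definition 31 is not reproduced here, the sketch below uses a common interval distance from the literature, averaging the squared endpoint gaps per attribute; the exact form in [8] may differ. The midpoint reduction used before PCA is also sketched:

```python
import math

def interval_distance(u, v):
    """Distance between two interval-valued objects u, v given as lists of
    (lower, upper) pairs, one pair per attribute. Assumed form: per
    attribute, the mean of the squared lower-endpoint and upper-endpoint
    gaps, summed and square-rooted (Definition 31 may differ)."""
    return math.sqrt(sum(
        ((uL - vL) ** 2 + (uU - vU) ** 2) / 2.0
        for (uL, uU), (vL, vU) in zip(u, v)))

def midpoints(obj):
    """Collapse an interval-valued object to real values (interval
    midpoints), as done before the PCA visualization."""
    return [(lo + hi) / 2.0 for lo, hi in obj]
```

These two helpers are what a k-medoids run on interval-valued data needs: the distance drives the clustering, and the midpoints feed the 2-D PCA plots.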

VI. CONCLUSION
Considering ''The similarity between information values that is fed back to the attribute set'', a new fuzzy symmetry relation on the object set of an IVIS has been constructed by introducing a variable parameter to control the similarity between information values. The advantage of this fuzzy symmetry relation is that it facilitates fuzzy computation; from it, fuzzy information granules have been obtained. Four UMs have been investigated by means of fuzzy information granules. The effectiveness of the investigated UMs has been verified by statistical analysis, whose purpose is to select the two better measures for constructing attribute reduction algorithms. Two attribute reduction algorithms based on the selected UMs have been proposed. The experimental results of cluster analysis demonstrate the effectiveness of the proposed algorithms according to the evaluation indicators of clustering performance. In the future, we will apply the proposed measures to attribute reduction on large-scale gene data and consider the possible application of our method to health-care waste treatment technology assessment problems.

ACKNOWLEDGMENT
The author would like to thank the editors and the anonymous reviewers for their valuable comments and suggestions which have helped immensely in improving the quality of this paper.