Attribute Reduction Methods Based on Pythagorean Fuzzy Covering Information Systems

By introducing covering rough sets to the Pythagorean fuzzy environment, we construct a new rough set model called the Pythagorean fuzzy λ-covering rough set. Based on this rough set model, we adopt the discernibility matrix method to obtain its attribute reduction. First, we give the definitions of Pythagorean fuzzy λ-coverings and λ-neighborhoods and then establish a Pythagorean fuzzy λ-covering rough set model. Second, from the perspective of decision systems, Pythagorean fuzzy λ-covering decision systems are divided into two categories: consistent Pythagorean fuzzy λ-covering decision systems and inconsistent Pythagorean fuzzy λ-covering decision systems. We further investigate the attribute reductions in the two systems and some equivalent conditions of the reductions and then design the reduction algorithms by using the discernibility matrix. Finally, numerical examples are provided to demonstrate the effectiveness of the proposed design methods. In addition, we reveal the superiority of the Pythagorean fuzzy λ-covering rough set in attribute reduction by numerical experiments.


I. INTRODUCTION
Recently, Yager [32] proposed a new mathematical method for dealing with uncertain information, called the Pythagorean fuzzy set (PFS). Compared with the intuitionistic fuzzy set, the PFS is more comprehensive and scientific with respect to modeling uncertainty because it only requires that the square sum of its membership and nonmembership degrees does not exceed 1. Since its emergence, the field of PFSs has attracted increasing attention from researchers [1]-[3], [5], [17], [18], [25], [33]-[35], [38], [39]. This new mathematical model not only promotes the development of fuzzy mathematics but also provides improved solutions for many practical problems, such as career orientation selection [44], internet stock investment [19] and software development investment [20].
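The defining constraint of a PFS — the squared membership and nonmembership summing to at most 1 — is exactly what widens its scope beyond the intuitionistic case. A minimal sketch of our own (not from the paper) makes the difference concrete:

```python
# Sketch (illustrative, not from the paper): when is a membership/nonmembership
# pair (mu, nu) admissible under the intuitionistic fuzzy constraint
# (mu + nu <= 1) versus the Pythagorean fuzzy constraint (mu^2 + nu^2 <= 1)?

def is_intuitionistic(mu: float, nu: float) -> bool:
    """Valid intuitionistic fuzzy pair: mu + nu <= 1."""
    return 0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu + nu <= 1.0

def is_pythagorean(mu: float, nu: float) -> bool:
    """Valid Pythagorean fuzzy pair: mu^2 + nu^2 <= 1."""
    return 0.0 <= mu <= 1.0 and 0.0 <= nu <= 1.0 and mu**2 + nu**2 <= 1.0

# (0.7, 0.6) is Pythagorean-admissible but not intuitionistic:
# 0.7 + 0.6 = 1.3 > 1, while 0.49 + 0.36 = 0.85 <= 1.
print(is_intuitionistic(0.7, 0.6))  # False
print(is_pythagorean(0.7, 0.6))     # True
```

Pairs such as (0.7, 0.6) are precisely the evaluations an intuitionistic fuzzy set cannot represent but a PFS can.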
Currently, the world has entered the era of big data. Faced with such massive databases, determining how to find useful data is particularly important. Under such circumstances, rough set theory, proposed by Pawlak [21], provides an effective tool to solve this problem. The purpose of rough set theory is to analyze and infer data, extract effective data and discover hidden rules from it. To date, rough set theory has been successfully applied in many areas, such as clinical medical diagnosis, pattern recognition and classification, machine learning, data mining and image processing. In decision tables, redundant attributes are deleted through knowledge reduction. After reduction, the decision tables are more conducive to the analysis of the problem by researchers. On the other hand, it is well known that classical rough set theory has some limitations in practical application; for example, it is only applicable to data that can generate equivalence classes. However, in practice, the data studied in many cases cannot always generate equivalence classes. In this context, classical rough set theory has been extended by researchers to more general cases, and many new rough set models have been generated, such as fuzzy rough sets, intuitionistic fuzzy rough sets and covering rough sets. In recent years, Zhang [43] introduced a Pythagorean fuzzy rough set (PFRS) and considered its application but ignored its reduction. (The associate editor coordinating the review of this manuscript and approving it for publication was Hualong Yu.)
Reduction, as an increasingly important component of rough set research, has attracted much attention [4], [11], [22], [27], [41], [42]. Different reduction methods are suitable for different systems. So far, Pawlak's [22] classical attribute reduction has been widely used in many studies. Zhang, Miao et al. [41] used this approach for three-way attribute reductions. Information entropy reduction was applied to three-way class-specific attribute reductions [42]. Meanwhile, the reduction method based on the discernibility matrix is widely used in decision systems. Chen et al. [11] proposed a discernibility matrix and designed a reduction method for covering rough sets. Wang et al. [27] improved the discernibility matrix and proposed a new reduction method, which greatly reduced the complexity of the original discernibility matrix.
From the above research, we find that the research on covering rough sets has made great progress. However, there are few studies on the combination of covering rough sets and PFSs, or on the corresponding reduction, which arouses our interest. We attempt to construct a PF λ-covering rough set and further investigate its reduction. Concretely, the following issues must be considered: (1) how to establish the PF λ-covering rough set and (2) which reduction method should be chosen. Meanwhile, as mentioned above, Zhang [43] constructed a PFRS by integrating rough sets with PFSs but did not consider the PFRS under the covering environment or its reduction. Huang et al. [12] conducted a cross study of intuitionistic fuzzy sets and covering rough sets, and they considered the reduction of multi-granularity intuitionistic fuzzy covering rough sets. As is known, the research scope of intuitionistic fuzzy sets is relatively narrow. In actual uncertainty problems, we will encounter situations in which the sum of membership and nonmembership is greater than 1. Pythagorean fuzzy sets broaden the research scope of intuitionistic fuzzy sets. In view of this situation, we attempt to establish a PFRS under the covering environment by integrating covering rough sets with Pythagorean fuzzy sets and explore the reduction of the novel rough set in decision systems. The proposed PF λ-covering rough set will provide new and better solutions to many uncertain problems. In addition, this novel theory can be used to cope with practical problems that the existing models cannot solve, such as medical diagnosis, evaluation of credit card applicants and coal exploration. On the other hand, in order to achieve better reduction, we consider the decision system divided into two categories: the consistent decision system and the inconsistent decision system.
Meanwhile, we introduce the discernibility matrix and discernibility function to design corresponding reduction algorithms based on the consistent decision system and the inconsistent decision system.
The remainder of this paper is organized as follows. In Section 2, we review some basic concepts of covering and PFS. Section 3 introduces the definitions of PF λ-covering and PF λ-neighborhood, and then establishes a PF λ-covering rough set model. Section 4 proposes the consistent PF λ-covering decision systems and develops a reduction algorithm based on the consistent decision systems. In addition, we use an example to illustrate the reduction process visually. In Section 5, we investigate inconsistent PF λ-covering decision systems and their corresponding reduction method. Section 6 presents experimental analysis and compares the experimental results with those of classical rough sets and covering rough sets. Finally, we summarize our work in the last section.

II. PRELIMINARIES

A. COVERING AND FUZZY λ-COVERING
In this section, we review some concepts of covering and fuzzy λ-covering.
Definition 1 [46]: Let U be a nonempty finite universal set, and let C = {K_j | K_j ⊆ U, j = 1, 2, ..., n} be a family of subsets of U. We call C a covering of U if every K_j in C is nonempty and ∪_{j=1}^n K_j = U. The pair (U, C) is called a covering approximation space.
Definition 2 [47]: Let U be a universe, and let C = {K_1, K_2, ..., K_n} be a covering of U. For any x ∈ U, the neighborhood N_C(x) of x with regard to (U, C) is defined as N_C(x) = ∩{K ∈ C : x ∈ K}. Based on Definition 2, the neighborhood exhibits the following properties: (1) x ∈ N_C(x); (2) if y ∈ N_C(x), then N_C(y) ⊆ N_C(x).
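The neighborhood of Definition 2 can be sketched in a few lines; the universe and covering below are illustrative choices of ours, not data from the paper:

```python
# Sketch of Definition 2: given a covering C of U, the neighborhood of x is
# the intersection of all blocks of C that contain x,
#   N_C(x) = intersection of {K in C : x in K}.
# U and C below are illustrative, not taken from the paper.

U = {1, 2, 3, 4, 5}
C = [{1, 2, 3}, {2, 3, 4}, {3, 4, 5}, {1, 5}]

def neighborhood(x, cover):
    """Intersect every block of the covering that contains x."""
    blocks = [K for K in cover if x in K]
    result = set(U)
    for K in blocks:
        result &= K
    return result

print(neighborhood(3, C))  # blocks {1,2,3}, {2,3,4}, {3,4,5} intersect to {3}
```

One can also check property (2) above on this data: 3 lies in `neighborhood(2, C)`, and indeed `neighborhood(3, C)` is contained in `neighborhood(2, C)`.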
Definition 3 [15]: Let U be a universe of discourse and F(U) be the fuzzy power set of U. For λ ∈ (0, 1], C = {C_j : j = 1, 2, ..., m} ⊆ F(U) is called a fuzzy λ-covering of U if (∪_{j=1}^m C_j)(x) ≥ λ for every x ∈ U. We call (U, C) a fuzzy covering approximation space.
Definition 4 [15]: Let (U, C) be a fuzzy covering approximation space, where C = {C_j : j = 1, 2, ..., m} is a fuzzy λ-covering of U and λ ∈ (0, 1]. For x ∈ U, the fuzzy λ-neighborhood of x is defined as Ñ_x^λ = ∩{C_j ∈ C : C_j(x) ≥ λ}.

Based on the fuzzy λ-neighborhood in Definition 4, Ma [15] also gave the concept of the λ-neighborhood of x ∈ U as follows.

Definition 5 [15]: Let (U, C) be a fuzzy covering approximation space, where C = {C_j : j = 1, 2, ..., m} is a fuzzy λ-covering of U and λ ∈ (0, 1]. For x ∈ U, the λ-neighborhood of x is defined as N_x^λ = {y ∈ U : Ñ_x^λ(y) ≥ λ}.

B. PYTHAGOREAN FUZZY SETS

Definition 6 [32]: Let U be a universe of discourse. A Pythagorean fuzzy set (PFS) A on U is given by A = {⟨x, µ_A(x), ν_A(x)⟩ : x ∈ U}, where µ_A(x), ν_A(x) ∈ [0, 1] respectively represent the degree of membership and the degree of nonmembership of x to A, with the condition µ_A²(x) + ν_A²(x) ≤ 1.
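The fuzzy λ-covering machinery of Definitions 3-5 admits a direct computational reading; the membership table below is an illustrative assumption of ours, not one of the paper's tables:

```python
# Sketch of Definitions 3-5 (fuzzy λ-covering, fuzzy λ-neighborhood and crisp
# λ-neighborhood in the sense of Ma [15]); the membership table is
# illustrative, not from the paper.

lam = 0.5
U = ["x1", "x2", "x3"]
# Each fuzzy set C_j maps elements of U to membership degrees.
C = [
    {"x1": 0.7, "x2": 0.4, "x3": 0.6},
    {"x1": 0.3, "x2": 0.8, "x3": 0.5},
]

def is_fuzzy_lambda_covering(cover, lam):
    """C is a fuzzy λ-covering if the union's membership is >= λ everywhere."""
    return all(max(Cj[x] for Cj in cover) >= lam for x in U)

def fuzzy_neighborhood(x, cover, lam):
    """Fuzzy λ-neighborhood: pointwise min over {C_j : C_j(x) >= λ}."""
    selected = [Cj for Cj in cover if Cj[x] >= lam]
    return {y: min(Cj[y] for Cj in selected) for y in U}

def crisp_neighborhood(x, cover, lam):
    """Crisp λ-neighborhood: points whose fuzzy-neighborhood degree is >= λ."""
    nbr = fuzzy_neighborhood(x, cover, lam)
    return {y for y in U if nbr[y] >= lam}

print(is_fuzzy_lambda_covering(C, lam))          # True: maxima are 0.7, 0.8, 0.6
print(sorted(crisp_neighborhood("x1", C, lam)))  # only C_1 qualifies: ['x1', 'x3']
```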

III. PYTHAGOREAN FUZZY λ-COVERING ROUGH SETS
Huang et al. [12] proposed the concepts of the intuitionistic fuzzy β-covering and the intuitionistic fuzzy β-neighborhood and established an intuitionistic fuzzy β-covering rough set model on the basis of the intuitionistic fuzzy β-neighborhood. In this section, we generalize this model to the PF environment and establish a PF λ-covering rough set.

In what follows, motivated by the idea of Ref. [15], we further extend the fuzzy λ-covering to the PF environment and introduce the concepts of the PF λ-covering and the PF λ-neighborhood.
Definition 8: Let U be a finite universe of discourse and PF(U) be the Pythagorean fuzzy power set of U. For a PF value λ = ⟨a, b⟩ with 0 < a ≤ 1, 0 ≤ b < 1 and a² + b² ≤ 1, C = {C_1, C_2, ..., C_m} ⊆ PF(U) is called a PF λ-covering of U if (∪_{j=1}^m C_j)(x) ≥ λ for every x ∈ U. We call (U, C) a PF λ-covering approximation space.

Definition 9: Let U be a universe of discourse and C = {C_1, C_2, ..., C_m} be a PF λ-covering of U. For x ∈ U, the PF λ-neighborhood of x is defined as Ñ_x^λ = ∩{C_j ∈ C : C_j(x) ≥ λ}. (U, N_C) is called a PF λ-neighborhood approximation space induced by the PF λ-covering approximation space (U, C).

Let Ñ_C^λ = {Ñ_x^λ : x ∈ U}; then, Ñ_C^λ is called a PF λ-neighborhood system.

Definition 10: Let U be a universe and C = {C_1, C_2, ..., C_m} be a PF λ-covering of U. For a PF value λ = ⟨a, b⟩, we define the Pythagorean λ-neighborhood N_x^λ of x ∈ U as N_x^λ = {y ∈ U : Ñ_x^λ(y) ≥ λ}. Given C_x = {C_j : C_j ∈ C, C_j(x) ≥ λ}, the following conclusions are straightforward.
Theorem 1: For any x, y ∈ U, one has y ∈ N_x^λ if and only if C_x ⊆ C_y.

Definition 11: Let (U, C) be a PF λ-covering approximation space. For every X ⊆ U, the lower and upper approximations of X with respect to C are defined as C(X) = {x ∈ U : N_x^λ ⊆ X} and C̄(X) = {x ∈ U : N_x^λ ∩ X ≠ ∅}, and α_C(X) = |C(X)| / |C̄(X)| is called the approximate precision of set X.
The positive region, boundary region and negative region of X with regard to C are determined by the following: pos_C(X) = C(X), bnd_C(X) = C̄(X) − C(X) and neg_C(X) = U − C̄(X). From the above definition, we can directly obtain the properties of the upper and lower approximations.
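Putting the PF λ-covering constructions above together, the following sketch computes crisp λ-neighborhoods, the approximations, the three regions and the approximate precision. The PF membership pairs are illustrative assumptions of ours, not the paper's tables:

```python
# Sketch of the PF λ-covering rough set: crisp λ-neighborhoods, lower/upper
# approximations, the three regions and the approximate precision.
# The PF membership table is illustrative, not from the paper.
from fractions import Fraction

a, b = 0.6, 0.5          # λ = (a, b): C_j(x) >= λ means mu >= a and nu <= b
U = ["x1", "x2", "x3", "x4"]
# Each PF set C_j maps x to a (membership, nonmembership) pair.
C = [
    {"x1": (0.7, 0.4), "x2": (0.6, 0.3), "x3": (0.4, 0.6), "x4": (0.8, 0.2)},
    {"x1": (0.5, 0.6), "x2": (0.7, 0.4), "x3": (0.9, 0.1), "x4": (0.6, 0.5)},
]

def neighborhood(x):
    """N_x^λ: points y satisfying C_j(y) >= λ for every C_j with C_j(x) >= λ."""
    selected = [Cj for Cj in C if Cj[x][0] >= a and Cj[x][1] <= b]
    return {y for y in U
            if all(Cj[y][0] >= a and Cj[y][1] <= b for Cj in selected)}

def lower(X):
    return {x for x in U if neighborhood(x) <= X}

def upper(X):
    return {x for x in U if neighborhood(x) & X}

X = {"x2", "x4"}
lo, up = lower(X), upper(X)
pos, neg, bnd = lo, set(U) - up, up - lo
precision = Fraction(len(lo), len(up))
print(sorted(lo), sorted(up), precision)  # ['x2', 'x4'] ['x1', 'x2', 'x3', 'x4'] 1/2
```

Here the precision 1/2 reflects that x1 and x3 fall in the boundary region of X.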
Theorem 2: Let (U, C) be a PF λ-covering approximation space. For every X, Y ⊆ U, the lower and upper approximations with respect to C satisfy the following properties: (1) C(X) ⊆ X ⊆ C̄(X); (2) if X ⊆ Y, then C(X) ⊆ C(Y) and C̄(X) ⊆ C̄(Y); (3) C(X ∩ Y) = C(X) ∩ C(Y); (4) C̄(X ∪ Y) = C̄(X) ∪ C̄(Y); (5) C(U − X) = U − C̄(X). Next, we will show the relationships among the covering rough set, the intuitionistic fuzzy λ-covering rough set and the Pythagorean fuzzy λ-covering rough set by numerical examples.
Example 1: Consider the fuzzy covering C of U given in Table 1. We can see that C is a fuzzy λ-covering of U for 0 < λ ≤ 0.8.

Let X = {x_3, x_5}. According to Definition 4.2 in [15], we can calculate the upper and lower approximations of X. We conclude that the approximate accuracy of the fuzzy λ-covering rough set for X is 1/3, which is relatively small. Now, we reconsider the above example by using the PF λ-covering rough set.
Example 2: Based on Table 1, we construct a PF λ-covering of U , shown as Table 2.
It is given that λ = (0.6, 0.5). From Definition 11, we can obtain the upper and lower approximations of X and conclude that the approximate accuracy of the PF λ-covering rough set for X is 1.

Remark 1: In general, the smaller the approximate accuracy is, the less knowledge information is known for set X. From the above examples, it can be seen that since the approximate accuracy for X increases from 1/3 to 1 under the two different methods, we can obtain more knowledge information by using the PF λ-covering rough set rather than the fuzzy λ-covering rough set. The reason is that the fuzzy λ-covering rough set proposed by Ma [15] only considers the degree of membership and does not take into account the effects of other factors, so the available knowledge information is insufficient in the decision-making process based on the fuzzy λ-covering rough set. Unlike the fuzzy λ-covering rough set, the PF λ-covering rough set considers not only the degree of membership but also the degree of nonmembership, requiring only that the square sum of membership and nonmembership does not exceed 1. From the above discussion, the available knowledge information in the PF λ-covering rough set is more comprehensive than that in the fuzzy λ-covering rough set.
In what follows, the relationship between intuitionistic fuzzy λ-covering rough sets and Pythagorean fuzzy λ-covering rough sets will be established.

Example 3: Based on Table 1, we can construct an intuitionistic fuzzy λ-covering, shown in Table 3.
It is given that λ = (0.6, 0.3). By Definition 3.2 in [12], we can find the intuitionistic fuzzy λ-neighborhood of each x_i (i = 1, 2, ..., 5). We can then express it as a crisp set and obtain the upper and lower approximations of X. We conclude that the approximate accuracy of the intuitionistic fuzzy λ-covering rough set for X is 2/3.

Remark 2: From the above examples, it can be seen that the approximate accuracy of the PF λ-covering rough set for X is larger than that of the intuitionistic fuzzy λ-covering rough set, which means that the available knowledge information in the PF λ-covering rough set is more comprehensive than that in the intuitionistic fuzzy λ-covering rough set. The reason is that although the intuitionistic fuzzy λ-covering rough set involves the degrees of membership and nonmembership, the range of admissible membership and nonmembership degrees for the intuitionistic fuzzy λ-covering rough set is much smaller than that of the PF λ-covering rough set. Therefore, the decision results based on the PF λ-covering rough set method exhibit greater reliability and superior persuasion.
Overall, the PF λ-covering rough set is the generalization of the fuzzy λ-covering rough sets and intuitionistic fuzzy λ-covering rough sets. Compared with the existing models, the advantages of the PF λ-covering rough set are as follows: the comprehensiveness of knowledge information, the enhanced approximation accuracy and the wider range of applications. Therefore, the PF λ-covering rough set is a more reasonable and practical method for the process of decision-making than the two other methods employed under different complex environments.

IV. ATTRIBUTE REDUCTION OF CONSISTENT PF λ-COVERING DECISION SYSTEMS
In this section, we shall consider the attribute reduction of consistent PF λ-covering decision systems.
Definition 12: Let U be a universe and C = {C_j | C_j ∈ PF(U), j = 1, 2, ..., m} be a PF λ-covering of U. It is given that D is a decision attribute set and U/D is a decision partition on U with respect to (w.r.t.) D. If for every x ∈ U there exists [x]_D ∈ U/D such that N_x^λ ⊆ [x]_D, then (U, C, D) is called a consistent PF λ-covering decision system (CPFCDS); otherwise, it is called an inconsistent PF λ-covering decision system (IPFCDS).

The decision attribute is an important component of rough set theory. The universe U can be divided into several equivalence classes [x]_D by the decision attribute D. The equivalence class [x]_D is closely related to the attribute reduction of a CPFCDS.

Theorem 4: It is given that (U, C, D) is a CPFCDS. If C_j ∈ C is necessary in C, then there exists a pair (x_1, x_2) such that the original relationship between N_{x_1}^λ and N_{x_2}^λ changes after deleting C_j.

Theorem 5: It is given that (U, C, D) is a CPFCDS.
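The consistency condition of Definition 12 reduces to a containment check per point; a minimal sketch, with illustrative neighborhoods and a decision partition of our own rather than data from the paper:

```python
# Sketch of the consistency test in Definition 12: the system is consistent
# (a CPFCDS) when every λ-neighborhood lies inside the decision class of its
# point. Neighborhoods and partition below are illustrative, not from the paper.

# Precomputed crisp λ-neighborhoods N_x^λ (e.g. obtained via Definition 10).
neighborhoods = {
    "x1": {"x1", "x2"},
    "x2": {"x2"},
    "x3": {"x3", "x4"},
    "x4": {"x3", "x4"},
}
# Decision partition U/D.
partition = [{"x1", "x2"}, {"x3", "x4"}]

def decision_class(x):
    """[x]_D: the unique block of U/D containing x."""
    return next(block for block in partition if x in block)

def is_consistent(neighborhoods):
    """CPFCDS test: N_x^λ is a subset of [x]_D for every x."""
    return all(N <= decision_class(x) for x, N in neighborhoods.items())

print(is_consistent(neighborhoods))  # True for this partition
```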
If Q ⊆ C, then Q is a reduction of C iff, for all x_1, x_2 ∈ U satisfying d(N_{x_1}^λ) ≠ d(N_{x_2}^λ), the original relation between the λ-neighborhoods w.r.t. C is the same as the relation w.r.t. Q, and no proper subset of Q preserves this relation.

Proof: (i) The necessary condition is obvious.
This is a contradiction.

Skowron and Rauszer [23] defined discernibility matrices and discernibility functions in information systems. Next, motivated by the idea of Ref. [23], we define the discernibility matrix and discernibility function in the consistent PF λ-covering decision system. We also design an algorithm for computing relative reductions.
Proof: (U, C, D) is a CPFCDS ⇔ pos_C(D) = U ⇔ for every x ∈ U, N_x^λ ⊆ [x]_D, and thus one has [x_i]_D = [x_j]_D ⇔ m_ij = ∅ for i, j ≤ n.
Theorem 7: ..., which means that there exist some C_j ∈ Q such that their original relation w.r.t. C is the same as the relation w.r.t. Q. Therefore, C ≤ U/D.

Thus, there is a C_k ∈ Q such that C_k(x_i) ⊄ C_k(x_j) and C_k(x_i) ⊅ C_k(x_j), and thus Q ∩ m_ij ≠ ∅.

When we remove C_s from C, we know that the original relation w.r.t. C changes. Thus, C_s satisfies C_s(x_i) ⊄ C_s(x_j) and C_s(x_i) ⊅ C_s(x_j). Hence, from Definition 14 and m_ij = {C_s}, it follows that core_D(C) ⊆ {C_s ∈ C : m_ij = {C_s}}.
Corollary 1: Supposing Q ⊆ C, Q is a relative reduction of C iff it satisfies Q ∩ m_ij ≠ ∅ for every nonempty m_ij, 1 ≤ i, j ≤ n.
Next, we will introduce the discernibility function f (U , C , D) of (U , C , D), where f (U , C , D) is a Boolean function of m Boolean variables C 1 , C 2 , . . . , C m corresponding to the coverings C 1 , C 2 , . . . , C m , respectively. By introducing the discernibility function into CPFCDS, we explore all of the relative reductions.
Theorem 8: If f(U, C, D) = ∨_{r=1}^{l} (∧ Q_r) is the reduced disjunctive normal form obtained from f(U, C, D) by using the multiplication and absorption laws as much as possible, and every element in Q_r appears only once, then the set {Q_r : 1 ≤ r ≤ l} is the set of all reductions of (U, C, D), i.e., red(U, C, D) = {Q_r : 1 ≤ r ≤ l}.

Proof: For every m_ij, we have Q_k ∩ m_ij ≠ ∅; otherwise f(U, C, D) ≠ ∨_{r=1}^{l} (∧ Q_r), which is a contradiction. Thus, every Q_k intersects every nonempty m_ij, which means that Q_k is a reduction of (U, C, D). Assuming there is a C_{j_0} such that ∧ r(U, C, D) ≤ C_{j_0}, i.e., C_{j_0} ∉ Q, we again obtain a contradiction. Hence, red(U, C, D) = {Q_r : 1 ≤ r ≤ l}.
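Computationally, the discernibility-function characterization above says the relative reductions are exactly the minimal subsets that intersect every nonempty matrix entry — the minimal hitting sets recovered when the conjunctive form of f is expanded into reduced DNF. A brute-force sketch with illustrative entries of our own, not the paper's matrix (2):

```python
# Sketch of the discernibility-function approach: red(U, C, D) consists of the
# minimal subsets Q with Q ∩ m_ij nonempty for every nonempty entry m_ij.
# The attributes and matrix entries below are illustrative, not from the paper.
from itertools import combinations

attributes = ["C1", "C2", "C3"]
# Nonempty entries m_ij of the discernibility matrix.
entries = [{"C1", "C2"}, {"C2", "C3"}, {"C1", "C3"}]

def reductions(attributes, entries):
    """All minimal hitting sets of the matrix entries."""
    hits = []
    for r in range(1, len(attributes) + 1):
        for Q in combinations(attributes, r):
            Qs = set(Q)
            if all(Qs & m for m in entries):
                # keep only minimal hitting sets (no previously found subset)
                if not any(h < Qs for h in hits):
                    hits.append(Qs)
    return hits

print(len(reductions(attributes, entries)))  # 3: each two-element subset hits all entries
```

Expanding f = (C1 ∨ C2) ∧ (C2 ∨ C3) ∧ (C1 ∨ C3) with the multiplication and absorption laws yields the same three conjuncts, matching the hitting-set view.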
Next, we use Algorithm 1 to represent the attribute reduction process in the consistent PF λ-covering decision system.
In what follows, an example will be provided to illustrate the effectiveness of this reduction method.
Example 4: If λ = (0.6, 0.5) is given, then (U, C, D) in Table 4 is a CPFCDS. Based on Definition 10, one has the λ-neighborhoods N_x^λ of each x ∈ U. The discernibility matrix of (U, C, D) is described as (2), shown at the bottom of this page, together with the discernibility function f(U, C, D).

V. ATTRIBUTE REDUCTION OF INCONSISTENT PF λ-COVERING DECISION SYSTEMS
In solving practical problems, we often encounter decision systems that are not consistent. It is therefore necessary to design a reduction algorithm for an IPFCDS. From the definition of an IPFCDS, the preceding discernibility matrix is no longer suitable. In this section, we establish a new discernibility matrix, and by using discernibility functions, we develop a new reduction algorithm. In an IPFCDS, the universe U is classified into different partitions by the decision attribute D. If pos_C(D) = ∪{C(X) : X ∈ U/D}, then pos_C(D) is a nonempty set. For any x_i ∈ U, supposing that N_{x_i}^λ ..., one has pos_Q(D) = pos_C(D).
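The positive region pos_C(D) = ∪{C(X) : X ∈ U/D} driving the inconsistent case can be sketched as follows; the neighborhoods and partition are illustrative assumptions of ours, not from the paper:

```python
# Sketch of the positive region used in the inconsistent case:
#   pos_C(D) = union of the lower approximations C(X) over X in U/D,
# with C(X) = {x : N_x^λ subset of X}. Data below is illustrative.

neighborhoods = {
    "x1": {"x1", "x2"},
    "x2": {"x2", "x3"},   # this neighborhood crosses the decision boundary
    "x3": {"x3", "x4"},
    "x4": {"x4"},
}
partition = [{"x1", "x2"}, {"x3", "x4"}]  # U/D

def lower(X):
    """Lower approximation: points whose neighborhood lies inside X."""
    return {x for x, N in neighborhoods.items() if N <= X}

def positive_region(partition):
    pos = set()
    for X in partition:
        pos |= lower(X)
    return pos

print(sorted(positive_region(partition)))  # ['x1', 'x3', 'x4']; x2 is excluded
```

Since pos_C(D) ≠ U here, the sketched system is inconsistent, which is exactly the situation Algorithm 2 targets.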
Theorem 9: Letting (U, C, D) be an IPFCDS, one has the following results: (1) for any ...; (2) for any Q ⊆ C, pos_Q(D) = pos_C(D) iff for every X ∈ U/D, Q(X) = C(X); (3) ... Proof: It can be directly proven by Definition 16.

Theorem 10: It is given that (U, C, D) is an IPFCDS. If C_j is necessary in C, i.e., pos_{C − {C_j}}(D) ≠ pos_C(D), then there exists a pair x_i, x_j ∈ U satisfying one of the following conditions, and their original relationship w.r.t. C changes after deleting C_j. On the other hand, from x_i, x_j ∈ Q(x_i), it follows that Q(x_j) ⊆ Q(x_i). Consequently, the original relationship w.r.t. C changes after deleting C_j.
Theorem 11: It is given that (U, C, D) is an IPFCDS. For ... That is a contradiction.
If pos_Q(D) ≠ pos_C(D), then by Theorem 10 there exist x_i, x_j ∈ U satisfying N_{x_i}^λ ⊆ pos_C(D) and N_{x_j}^λ ⊄ pos_C(D) such that their original relation w.r.t. C is not equivalent to the relation w.r.t. Q. That is a contradiction.
The proof is similar to that of Theorem 7.

Corollary 2: Supposing that Q ⊆ C, Q is a relative reduction of C iff it satisfies Q ∩ m_ij ≠ ∅ for any m_ij ≠ ∅, 1 ≤ i, j ≤ n. Proof: The proof is similar to that of Theorem 13.

Similarly, we use Algorithm 2 to represent the attribute reduction process in the inconsistent PF λ-covering decision system.

Algorithm 2 Attribute Reduction of Inconsistent PF λ-Covering Decision Systems
Input: An inconsistent PF λ-covering decision system (U, C, D).
Output: An attribute reduction r(C) ∈ red(C) of the inconsistent PF λ-covering decision system.

Now, we provide an example to illustrate the effectiveness of the reduction method.
Example 5: Setting λ = (0.6, 0.5), Table 5 gives an IPFCDS. From Table 5, we obtain the PF λ-neighborhoods N_x^λ. The discernibility matrix of (U, C, D) is described as (4), shown at the bottom of the next page, from which red(U, C, D) = { ..., { ..., C_6 } } and core(U, C, D) = ∅.

VI. COMPARATIVE ANALYSIS
In this section, the validity of the above reduction algorithms will be verified by experiments on several data sets from the UCI Repository of machine learning databases [6]. The data sets that we select for the experiments are listed in Table 6, in which all the conditional attributes are numerical. Furthermore, we compare the experimental results of the proposed reduction algorithm with those of three reduction algorithms based on the classical rough set and two reduction algorithms based on the covering rough set.
It is well known that the classical rough set model can only handle discrete data. First, we use three methods, namely, equal-width (EW), equal-frequency (EF) and fuzzy c-means clustering (FCM) [14], to transform the numerical attributes into discrete attributes. Then, we obtain the experimental results of these three methods by using the algorithm in [13]. There are two reduction methods for covering rough sets: one is the discernibility matrix; the other is the related family. The experimental results of these two methods are obtained from the algorithms in [11] and [37], respectively. Since these data sets cannot be directly used to conduct the experiments, we convert the data into PF values by using the Gaussian membership function, in which the square sum of each attribute value's membership degree and nonmembership degree cannot exceed 1. Based on the PF λ-covering rough set method, we must specify one parameter: the size of covering δ, with a step size of 0.05. The classification accuracy and the number of selected features under 10-fold cross-validation are shown in Figs. 1-4. In the experiment, we use a radial basis function (RBF) kernel to verify the selected features in the support vector machine (SVM) learning algorithm.
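The conversion step can be sketched as follows; the paper does not spell out its exact formulas, so the Gaussian membership and the nonmembership rule below (chosen so that µ² + ν² ≤ 1 always holds) are illustrative assumptions, not the paper's procedure:

```python
# Sketch of one way to turn a numerical attribute into PF values with a
# Gaussian membership function. The nonmembership formula is an assumption
# for illustration, chosen so that mu^2 + nu^2 <= 1 is guaranteed.
import math

def to_pf_value(x, mean, sigma):
    """Map a numeric attribute value to a (membership, nonmembership) PF pair."""
    mu = math.exp(-((x - mean) ** 2) / (2 * sigma ** 2))  # Gaussian membership
    nu = math.sqrt(1.0 - mu ** 2) * 0.9                   # keeps mu^2 + nu^2 <= 1
    return mu, nu

values = [4.8, 5.1, 6.3, 7.0]          # one numerical attribute column
mean = sum(values) / len(values)
sigma = 1.0
pairs = [to_pf_value(v, mean, sigma) for v in values]
assert all(mu * mu + nu * nu <= 1.0 for mu, nu in pairs)
```

With this choice, µ² + ν² = 0.81 + 0.19µ² ≤ 1, so every generated pair is a valid PF value regardless of the input.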
It can be observed that these graphs exhibit the same characteristics. In the interval [0, 0.2], the classification accuracy and the number of selected features display an upward trend; in the interval [0.2, 0.5], the two curves gradually become stable; and in the interval [0.5, 1], the two indicators first decline rapidly and then gradually become stable. This result shows that [0.2, 0.5] is the most suitable covering size in the reduction process. Meanwhile, from Figs. 1-4, it can be seen that when the classification accuracy decreases, the number of selected features also decreases. That is, we can find subsets of the attribute set that are much smaller than the generated reduction yet produce similar classification accuracy.
The comparisons between our proposed method and the five other methods are shown in Tables 7 and 8. From Tables 7 and 8, it is observed that the PF λ-covering rough set method can achieve higher classification accuracy and more selected features than the three methods based on the classical rough set and the two methods based on the covering rough set. As a result, we conclude that the proposed reduction method based on the PF λ-covering rough set is better than the other five methods. In the classical rough set, the discretization of data can result in a loss of information, so the retained information may be incomplete. On the other hand, the covering rough set model represents attributes by coverings. Although the covering rough set reduces the loss of information during data processing compared with the classical rough set, there is still room for improvement. In this context, the PF λ-covering rough set method offers further improvement.

VII. CONCLUSION
Pythagorean fuzzy sets have made great progress in both academic research and practical application. In this study, we introduce a PF λ-covering rough set model and investigate its reduction methods in different situations. In the decision system, according to the decision rules, the PF λ-covering decision system is divided into a consistent PF λ-covering decision system and an inconsistent PF λ-covering decision system; for these two PF λ-covering decision systems, we design two appropriate reduction algorithms by introducing the discernibility matrix and the discernibility function, respectively. Finally, numerical examples are provided to demonstrate the effectiveness of the proposed design methods. In addition, we conduct an experimental analysis revealing that the PF λ-covering rough set is superior to both the classical rough set and the covering rough set with respect to data processing.
The new model provides a new method for dealing with uncertainty and incomplete information. In the future, we shall investigate the PF λ-covering rough set from several aspects, such as information measure, information entropy reduction and information fusion.