Accelerating Update of Approximations Under a Dominance Relation

In real-time applications, an information system often evolves over time, and dynamic information processing is one of the main challenges in the related research fields. Rough set theory is an excellent information processing tool that has been applied to dynamic information processing in recent years. The dominance-based rough set approach is an outstanding generalization of rough set theory that can process the ordered information arising in multi-criteria decision analysis and multi-criteria sorting problems. In this article, we investigate new strategies to further improve the performance of updating approximations in the dominance-based rough set approach. We introduce an original concept, the coarse boundary of an approximation, which can be used to prune more unnecessary computation when updating approximations. We then propose a new incremental approach for updating approximations under a dominance relation and design the corresponding algorithm. A numeric illustration shows the feasibility of the approach, and extensive experimental evaluation shows that the approach outperforms its counterpart in computational time.


I. INTRODUCTION
In 1982, the Polish scientist Zdzisław Pawlak proposed the original rough set theory (RST) [1]. The theory has two essential characteristics. The first is the granulation of the universe by an indiscernibility relation, which produces a family of information granules. The second is the approximation of rough concepts (rough sets) by their lower and upper approximations in order to obtain certain and uncertain knowledge. In recent years, RST has proved to be one of the most effective information processing tools and has been applied widely in different fields [2], such as fault diagnosis, image processing, knowledge discovery, feature selection and intelligent control systems [3]–[6].
(The associate editor coordinating the review of this manuscript and approving it for publication was Chi-Hua Chen.)

However, the original RST introduced by Pawlak cannot effectively process information with preference-ordered attributes. Greco et al. introduced the dominance-based rough set approach (DRSA) to solve multi-criteria decision analysis and multi-criteria sorting problems [7], in which a dominance relation is used to granulate the universe instead of the indiscernibility relation of RST, forming two families of basic information granules, i.e., dominating sets and dominated sets. In DRSA, the concepts approximated are the upward and downward unions of decision classes, which are characterized by their corresponding lower and upper approximations, respectively [7]. In order to deal with errors, missing values in the domains of condition attributes and strong inconsistency of the given data, Inuiguchi et al. proposed a variable-precision dominance-based rough set approach (VP-DRSA) [8], following the idea of the variable precision rough set approach (VP-RSA) introduced by Ziarko [9]. Considering the combination of ''do not care'' and unknown attribute values in incomplete information systems with the preference-ordered attribute domains of ordered information systems, Yang et al. proposed a similarity dominance relation by combining a similarity relation and a dominance relation, and then developed a generalization of DRSA for incomplete ordered information processing [10]. Qian et al. observed that some attribute values in many real-time information systems may be set-valued to characterize uncertain and missing information, summarized two types of set-valued information systems, and introduced a dominance relation for each of the two types [11]. Du et al.
proposed a characteristic-based dominance relation within the framework of DRSA to process information with lost or absent attribute values [12]. DRSA is a useful generalization of RST that can be used to analyze ordered data. Hence it has been applied widely in many fields related to multi-criteria decision analysis, e.g., periodic prediction, group decision making, multi-criteria web mining, business indicator analysis, public security, territorial sustainable policy making, text processing and so on [13]–[18].
In fact, information from the real world evolves over time, which is often reflected in the object set, the attribute set and the attribute values of an information system. The evolution of these three aspects of an information system has attracted much attention from scholars in the rough set community [19]–[35]. Much research has been done based on DRSA. For example, Jerzy et al. realized that concept descriptions are incrementally refined when new facts become available, and introduced an approach for incrementally inducing decision rules and selecting the most interesting representatives from the final set of rules under a dominance relation [19]. For multi-criteria decision analysis problems in a dynamic data environment, Greco et al. proposed an incremental strategy for updating decision rules [20]. Jia et al. proposed an incremental algorithm, INRIDDM, based on DRSA to update decision rule sets when a new object is added into the universe [21]. Computing the lower and upper approximations of concepts is a key step in applying RST to data mining and knowledge discovery, and the computational time this takes directly determines whether RST can be used to solve real-time problems with high timeliness requirements. In recent years, many approaches have been introduced for dynamically updating the lower and upper approximations of the upward and downward unions of decision classes in DRSA in a dynamic data environment. For example, Li et al. analyzed the variations in the computation of the approximations of DRSA when the information system varies, explored the dynamic mechanism of updating the approximations of DRSA by incremental learning, and proposed incremental approaches for updating the approximations of DRSA when the object set varies [23]. Luo et al. proposed an incremental approach for updating the approximations of the set-valued dominance-based rough set approach when the set of objects varies [25]. Chen et al.
proposed an incremental approach for updating the approximations of DRSA when attribute values are refined or coarsened in an incomplete information system [22]. Wang et al. proposed an incremental algorithm that can efficiently update the approximations of DRSA when objects and attributes increase simultaneously [28]. Huang et al. introduced a composite dominance-based rough set model and redefined the lower and upper approximations of the upward and downward unions of decision classes.
They proposed a matrix-based incremental approach for updating approximations in composite ordered decision systems while the attribute set varies over time [33]. Guo et al. analyzed the variations in equivalence classes, decision classes, conditional probability, internal grade and external grade while the set of objects evolves over time, and explored updating mechanisms for the concept approximations of two types of double-quantitative decision-theoretic rough set models with the incremental learning technique [34].
Although there are many excellent approaches for incrementally updating the approximations of rough sets, they can hardly be applied directly to updating approximations under a dominance relation when the object set evolves over time, except for the one presented in [23] and its generalization in [28]. The incremental approach in [23] cannot directly deal with the insertion or deletion of multiple objects at the same time, and if the insertion or deletion of multiple objects is processed by accumulating single objects, redundant calculation is caused. The aim of this paper is to improve the performance of updating approximations under a dominance relation when some objects are added. An original concept, the coarse boundary, is introduced with respect to the lower and upper approximations of the downward and upward unions of decision classes in DRSA. With the coarse boundaries of the approximations, a newly available object can be assigned directly into some approximations according to its available information description if no variations happen in the corresponding coarse boundaries; in this case, some unnecessary computation in updating the approximations may be avoided. In the other case, where some variations happen in the coarse boundaries when an object becomes available, the unnecessary computation in updating the dominance relation may be pruned by lessening the scan range. Thus we obtain an alternative incremental approach that can be used to update approximations under a dominance relation. A numeric example illustrates the feasibility of the approach, and a corresponding algorithm is developed to evaluate its performance on several different data sets downloaded from UCI [36].
The rest of this article is organized as follows. Section II reviews the definitions of information system, dominance relation, information granules and approximations in DRSA. In Section III, the definitions and some properties of the coarse boundaries of the lower and upper approximations of downward and upward unions of decision classes are introduced. In Section IV, we investigate the variations that may happen to the coarse boundaries in a dynamic data environment. In Section V, an alternative incremental approach is introduced. An algorithm for updating the approximations of DRSA is developed and analyzed in Section VI. Section VII employs a numerical example to validate the feasibility of the approach. The experimental evaluations are presented in Section VIII. The paper ends with conclusions and further research work in Section IX.
VOLUME 8, 2020

II. DOMINANCE-BASED ROUGH SETS APPROXIMATION
In this section, we briefly review some basic notations, concepts and terminologies of DRSA on the basis of references [7], [37], [38].
Definition 1: An information system can be presented formally as a four-tuple S = (U, A, V, f), in which
• U, called the universe, is a non-empty finite set of objects;
• A is a set of attributes, which can be divided into a condition attribute set C and a decision attribute set {d};
• V is regarded as the domain of all attributes;
• f : U × A → V is an information function with f(x, a) ∈ V for every x ∈ U and a ∈ A.
∀a ∈ C, let ⪰_a denote a preference relation on the universe U with respect to the criterion a. ∀x, y ∈ U, x ⪰_a y means that x is at least as good as y with respect to the criterion a. Let P ⊆ C. If x ⪰_a y for all a ∈ P, then x dominates y with respect to P, denoted by xD_P y. D_P is a dominance relation on the universe U with respect to the set of criteria P, which can be presented formally as

D_P = {(x, y) ∈ U × U | x ⪰_a y, ∀a ∈ P}.

In DRSA, the basic information granules are called knowledge granules, which are obtained by granulating the universe with a dominance relation. The universe U can be granulated by the dominance relation D_P into two families of knowledge granules under the framework of DRSA, which can be presented as follows.
• D_P^+(x) = {y ∈ U | yD_P x} is the set of objects dominating the object x with respect to P, called the P-dominating set of x;
• D_P^-(x) = {y ∈ U | xD_P y} is the set of objects dominated by the object x with respect to P, called the P-dominated set of x.
The concepts (rough sets) approximated in DRSA are the downward and upward unions of decision classes. The universe U can be partitioned by the decision attribute d into a family of decision classes Cl_1, Cl_2, …, Cl_m. For any decision class Cl_n, its upward union and downward union are defined respectively as follows:

Cl_n^≥ = ⋃_{n′≥n} Cl_{n′},  (1a)
Cl_n^≤ = ⋃_{n′≤n} Cl_{n′}.  (1b)
In equations (1a) and (1b), n, n′ ∈ {1, 2, …, m}. Cl_n^≥ is the upward union of the decision class Cl_n: if x ∈ Cl_n^≥, then x belongs at least to the class Cl_n. Cl_n^≤ is the downward union of the decision class Cl_n: if x ∈ Cl_n^≤, then x belongs at most to the class Cl_n. ∀P ⊆ C, the lower and upper approximations of Cl_n^≥ with respect to P are defined respectively as:

P̲(Cl_n^≥) = {x ∈ U | D_P^+(x) ⊆ Cl_n^≥},
P̄(Cl_n^≥) = {x ∈ U | D_P^-(x) ∩ Cl_n^≥ ≠ ∅}.

Analogously, the lower and upper approximations of Cl_n^≤ with respect to P are defined respectively as:

P̲(Cl_n^≤) = {x ∈ U | D_P^-(x) ⊆ Cl_n^≤},
P̄(Cl_n^≤) = {x ∈ U | D_P^+(x) ∩ Cl_n^≤ ≠ ∅}.

In RST, the lower and upper approximations of a rough set partition the universe into three regions: the positive, boundary and negative regions. The lower and upper approximations of the concept Cl_n^≥ partition the universe U into three regions as follows:

POS_P(Cl_n^≥) = P̲(Cl_n^≥),
BND_P(Cl_n^≥) = P̄(Cl_n^≥) \ P̲(Cl_n^≥),
NEG_P(Cl_n^≥) = U \ P̄(Cl_n^≥).
Analogously, the universe U is divided by the lower and upper approximations of the concept Cl_n^≤ into three regions as follows:

POS_P(Cl_n^≤) = P̲(Cl_n^≤),
BND_P(Cl_n^≤) = P̄(Cl_n^≤) \ P̲(Cl_n^≤),
NEG_P(Cl_n^≤) = U \ P̄(Cl_n^≤).
In addition, if n > 1, then the following complementarity properties hold:

Cl_n^≥ = U \ Cl_{n-1}^≤,
P̲(Cl_n^≥) = U \ P̄(Cl_{n-1}^≤),
P̄(Cl_n^≥) = U \ P̲(Cl_{n-1}^≤).
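As a concrete illustration of the notions reviewed in this section, the following sketch computes the knowledge granules, the approximations of an upward union and the induced regions on a small toy table. The data, object ids and Python rendering are our own illustration (the paper's algorithms are coded in GNU Octave), not the paper's example.

```python
# Sketch: P-dominating/P-dominated sets and the approximations of an upward
# union Cl_n^>= on a hypothetical ordered table.

def dominates(x, y):
    """x D_P y: x is at least as good as y on every criterion in P."""
    return all(xi >= yi for xi, yi in zip(x, y))

U = {1: (3, 3), 2: (2, 2), 3: (2, 2), 4: (1, 1)}   # criterion values over P
d = {1: 3, 2: 1, 3: 2, 4: 1}                        # decision class of each object

# Knowledge granules.
dom_plus = {i: {j for j in U if dominates(U[j], U[i])} for i in U}   # D_P^+(x)
dom_minus = {i: {j for j in U if dominates(U[i], U[j])} for i in U}  # D_P^-(x)

# Upward union Cl_2^>= and its lower and upper approximations.
n = 2
cl_up = {i for i in U if d[i] >= n}
lower = {i for i in U if dom_plus[i] <= cl_up}       # D_P^+(x) subset of Cl_n^>=
upper = {i for i in U if dom_minus[i] & cl_up}       # D_P^-(x) meets Cl_n^>=

# The three regions induced on U.
positive, boundary, negative = lower, upper - lower, set(U) - upper
print(positive, boundary, negative)  # {1} {2, 3} {4}
```

Objects 2 and 3 share the same criterion values but different decision classes, so they fall into the boundary region; this is exactly the inconsistency that distinguishes the lower from the upper approximation.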

III. COARSE BOUNDARIES
Following our previous work [23], we attempt to explore new strategies to reduce more computation than before. Here we introduce the concepts of the coarse boundaries of the lower and upper approximations of upward unions and downward unions of decision classes, respectively.

Definition 2: ∀a_i ∈ P and ∀x ∈ P̲(Cl_n^≥), let ȧ_i^n denote the minimal value of the information function f(x, a_i) over P̲(Cl_n^≥), i.e., ȧ_i^n = min{f(x, a_i) | x ∈ P̲(Cl_n^≥)}. The coarse lower boundary of P̲(Cl_n^≥) is the vector formed by ȧ_i^n for all a_i ∈ P, denoted by

B_P(P̲(Cl_n^≥)) = (ȧ_1^n, ȧ_2^n, …, ȧ_{|P|}^n).  (4)

Similarly, ∀x ∈ P̄(Cl_n^≥), let ȧ_i^n denote the minimal value of the information function f(x, a_i) over P̄(Cl_n^≥), i.e., ȧ_i^n = min{f(x, a_i) | x ∈ P̄(Cl_n^≥)}. The coarse lower boundary of P̄(Cl_n^≥) is the vector formed by ȧ_i^n for all a_i ∈ P, denoted by

B_P(P̄(Cl_n^≥)) = (ȧ_1^n, ȧ_2^n, …, ȧ_{|P|}^n).  (5)

In equations (4) and (5), |P| denotes the cardinality of the criterion set P. The coarse lower boundary of the lower or upper approximation of an upward union of decision classes may be seen as the description of the available information of a virtual or real object that is dominated by all objects in the corresponding approximation. Thus we may regard B_P(P̲(Cl_n^≥)) as the available information description of an object x′. Whether x′ is a virtual or real object, all objects assigned into P̲(Cl_n^≥) dominate x′. If x′ is a real object, then P̲(Cl_n^≥) ⊆ D_P^+(x′). Based on the three regions into which the universe U is divided by the lower and upper approximations of an upward union of decision classes, the following equations can be obtained.
Property 1: For the coarse lower boundaries of the lower and upper approximations of an upward union of decision classes, we have

B_P(P̲(Cl_n^≥)) ⪰ B_P(P̄(Cl_n^≥)),  (6)

where ⪰ denotes the componentwise order on boundary vectors. Since P̲(Cl_n^≥) ⊆ P̄(Cl_n^≥), the componentwise minimum taken over P̄(Cl_n^≥) cannot exceed that taken over P̲(Cl_n^≥) on any criterion, so equation (6) holds.
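The coarse lower boundary of Definition 2 amounts to a componentwise minimum over the objects of an approximation. A minimal sketch, with invented data and a fixed ordering of the criteria in P:

```python
# Coarse lower boundary of an approximation of an upward union: the vector
# of componentwise minima of criterion values over the objects in the set.

def coarse_lower_boundary(U, approx):
    """B_P of the given approximation; U maps object ids to criterion tuples."""
    return tuple(min(col) for col in zip(*(U[i] for i in approx)))

U = {1: (3, 3), 2: (2, 3), 3: (2, 2)}
lower_approx = {1, 2}        # hypothetical lower approximation
upper_approx = {1, 2, 3}     # hypothetical upper approximation
print(coarse_lower_boundary(U, lower_approx))  # (2, 3)
print(coarse_lower_boundary(U, upper_approx))  # (2, 2)
```

Because the lower approximation is contained in the upper one, its boundary vector dominates the upper approximation's componentwise, which is the content of equation (6).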
Next, the definition of the coarse upper boundaries of the lower and upper approximations of a downward union of decision classes is introduced.
Definition 3: ∀a_i ∈ P and ∀x ∈ P̲(Cl_n^≤), let ā_i^n denote the maximal value of the information function f(x, a_i) over P̲(Cl_n^≤), i.e., ā_i^n = max{f(x, a_i) | x ∈ P̲(Cl_n^≤)}. The coarse upper boundary of P̲(Cl_n^≤) is the vector formed by ā_i^n for all a_i ∈ P, denoted by

B_P(P̲(Cl_n^≤)) = (ā_1^n, ā_2^n, …, ā_{|P|}^n).  (7)

Similarly, ∀x ∈ P̄(Cl_n^≤), let ā_i^n denote the maximal value of the information function f(x, a_i) over P̄(Cl_n^≤), i.e., ā_i^n = max{f(x, a_i) | x ∈ P̄(Cl_n^≤)}. The coarse upper boundary of P̄(Cl_n^≤) is the vector formed by ā_i^n for all a_i ∈ P, denoted by

B_P(P̄(Cl_n^≤)) = (ā_1^n, ā_2^n, …, ā_{|P|}^n).  (8)
The coarse upper boundary of the lower or upper approximation of a downward union of decision classes may also be seen as the description of the available information of a virtual or real object that dominates all objects in the corresponding approximation. For example, B_P(P̲(Cl_n^≤)) may be regarded as the available information description of a virtual or real object x′ that dominates all objects assigned into P̲(Cl_n^≤). Based on the three regions divided by the lower and upper approximations of a downward union of decision classes, we have the following.

Property 2: B_P(P̄(Cl_n^≤)) ⪰ B_P(P̲(Cl_n^≤)).
Proof: Property 3 follows directly from Properties 1 and 2, and the proof is omitted.
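Dually to the coarse lower boundary, the coarse upper boundary of Definition 3 is a componentwise maximum. A sketch on invented data:

```python
# Coarse upper boundary of an approximation of a downward union: the vector
# of componentwise maxima of criterion values over the objects in the set.

def coarse_upper_boundary(U, approx):
    """Componentwise maxima of f(x, a_i) over the given approximation."""
    return tuple(max(col) for col in zip(*(U[i] for i in approx)))

U = {1: (1, 2), 2: (2, 1), 3: (1, 1)}
lower_approx = {3}           # hypothetical lower approximation
upper_approx = {1, 2, 3}     # hypothetical upper approximation
print(coarse_upper_boundary(U, lower_approx))  # (1, 1)
print(coarse_upper_boundary(U, upper_approx))  # (2, 2)
```

Since the lower approximation is a subset of the upper one, its componentwise maxima never exceed those of the upper approximation, matching Property 2.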

IV. COARSE BOUNDARIES EVOLVE OVER TIME
Some variations happen in the steps of computing approximations when a new object is added into the universe [23]. Here we focus on the variations that happen to the coarse boundaries of the approximations when a new object becomes available. The case where several new objects become available simultaneously may be regarded as an accumulation of single objects becoming available.
Let U be the original universe at the initial time of a dynamic course. Within the dynamic course, a new object x′ becomes available, which means that the original universe U is outdated. We then need to discover new knowledge on the universe U′ = U ∪ {x′} instead of the knowledge obtained from U, to ensure the timeliness of useful knowledge. The coarse boundary of an approximation can be seen as the available information description of an object, whether virtual or real. To discuss whether variations happen to the coarse boundaries when a new object is added into the universe, the available information description of the object is introduced as follows.
∀a_i ∈ P, the available information of x′ with respect to P can be described as

I(x′) = (f(x′, a_1), f(x′, a_2), …, f(x′, a_|P|)).  (9)

As for the coarse boundaries when a new object becomes available, we discuss unchanged and changed coarse boundaries respectively. In the rest of the paper, let Cl′_n = Cl_n ∪ {x′}.
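Equation (9) is simply the tuple of criterion values of the new object over P, read in a fixed attribute order. A trivial sketch (the attribute names are hypothetical):

```python
# The available information description I(x') of equation (9): the tuple of
# criterion values of the new object restricted to the criteria in P.

def description(x_values, P):
    """I(x') over P; x_values maps attribute name -> value (invented names)."""
    return tuple(x_values[a] for a in P)

P = ("a1", "a2")
x_new = {"a1": 3, "a2": 2, "d": 2}   # hypothetical new object (d is the decision)
print(description(x_new, P))  # (3, 2)
```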

A. UNCHANGED COARSE BOUNDARIES
If f(x′, d) ≥ d_n, then x′ must be assigned into Cl_n^≥. The coarse lower boundaries remain unchanged if the conditions of the following proposition are satisfied.
As for the coarse upper boundaries that remain unchanged, we obtain the following proposition.
Proof: The proof is similar to the proof of Proposition 1.
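The unchanged-boundary tests of this subsection reduce to componentwise comparisons between I(x′) and the stored boundary vectors. The following is only a hedged sketch of that comparison: the paper's propositions additionally fix the decision-class condition (e.g. f(x′, d) ≥ d_n), and the sketch assumes the membership of the existing objects is unaffected.

```python
# If I(x') is componentwise >= a coarse lower boundary, adding x' to the
# corresponding approximation cannot lower any minimum, so that boundary is
# unchanged; dually for a coarse upper boundary with <=. (Sketch only.)

def lower_boundary_unchanged(boundary, x_desc):
    """True if adding x' cannot lower any componentwise minimum."""
    return all(v >= b for v, b in zip(x_desc, boundary))

def upper_boundary_unchanged(boundary, x_desc):
    """True if adding x' cannot raise any componentwise maximum."""
    return all(v <= b for v, b in zip(x_desc, boundary))

print(lower_boundary_unchanged((2, 2), (3, 2)))  # True
print(lower_boundary_unchanged((2, 2), (1, 3)))  # False
print(upper_boundary_unchanged((3, 3), (2, 3)))  # True
```

When the test succeeds, no component of the boundary vector needs to be recomputed, which is what makes the fast path of Section V possible.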

B. CHANGED COARSE BOUNDARIES
It is inevitable in a dynamic data environment that coarse boundaries change when some objects become available. Here we discuss the changing trends of the coarse boundaries after an object is added into the universe.
Proposition 5: The updated boundary satisfies B_P(P(Cl_n^≥))′ ⪯ B_P(P(Cl_n^≥)) if the following items are satisfied.
The proof is similar to the proof of Proposition 5.
Proposition 8: If f(x′, d) ≤ d_n and B_P(P(Cl_n^≤)) ⋡ I(x′), then the updated boundary satisfies B_P(P(Cl_n^≤))′ ⪰ B_P(P(Cl_n^≤)). Proof: The proof is similar to the proof of Proposition 6.
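The direction of change stated in Propositions 5–8 can be realized with an O(|P|) componentwise refresh. This sketch makes a simplifying assumption: x′ actually joins the approximation and no existing member leaves it (otherwise the boundary must be recomputed over the whole approximation).

```python
# Hedged sketch: O(|P|) refresh of coarse boundaries after x' joins an
# approximation, assuming no existing member leaves the approximation.

def refresh_coarse_lower(boundary, x_desc):
    """Coarse lower boundary (componentwise minima) after adding x'."""
    return tuple(min(b, v) for b, v in zip(boundary, x_desc))

def refresh_coarse_upper(boundary, x_desc):
    """Coarse upper boundary (componentwise maxima) after adding x'."""
    return tuple(max(b, v) for b, v in zip(boundary, x_desc))

print(refresh_coarse_lower((2, 2), (1, 3)))  # (1, 2)
print(refresh_coarse_upper((2, 2), (1, 3)))  # (2, 3)
```

Under this assumption, a coarse lower boundary can only move downward and a coarse upper boundary can only move upward, which is the monotone trend the propositions describe.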

V. APPROXIMATIONS UPDATING
After some new objects become available, the knowledge discovered from the original information system may be outdated. In order to maintain the timeliness and effectiveness of the knowledge, we need to recompute the approximations of the rough set model applied in our problem solving. In many cases, the newly available information makes only a small part of the approximations outdated. In [23], an incremental approach for updating approximations was introduced. However, that approach must begin by computing the dominating and dominated sets of the new object over the whole universe, and it also needs to update all approximations related to the dominating and dominated sets of the new object. Obviously, unnecessary computation can still be pruned from that approach. Thus we investigate a streamlined strategy for updating the approximations of DRSA in this section.
When a new object becomes available, the approximations may be updated by adding it according to the following theorems.
The proof of Theorem 3 is similar to the proof of Theorem 1 and is omitted.
The proof of Theorem 4 is similar to the proof of Theorem 2 and is omitted.
In the cases satisfying Theorems 1, 2, 3 and 4, the approximations may be updated by comparing the available information description of the added object with the corresponding coarse boundaries. In the other cases, where these theorems do not apply at all, we may improve the incremental approach introduced in [23] to update the approximations.
According to the concept of coarse boundary, we can obtain the following remark easily.
Remark 2: Let n > 1. Remark 2 may be used to reduce the scan range in computing the dominating or dominated sets of the object x′.

The procedure of updating approximations based on their corresponding coarse boundaries is shown in Figure 1. When a new object becomes available, we first decide whether the coarse boundaries of the approximations need to be updated according to the object's available information description. Next, for the coarse boundaries that need not be updated, the corresponding approximations may be updated by Theorems 1, 2, 3 and 4. Otherwise, the dominating and dominated sets of the new object are computed according to Remark 2, and then the incremental approach of [23] is used to update the approximations related to the new object. Finally, we obtain the updated approximations and the procedure ends.
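One way to realize a scan-range reduction is to compare I(x′) against precomputed summary vectors of blocks of objects before any per-object scan: if x′ dominates the componentwise maximum of a block, every object of the block belongs to D_P^-(x′) at once. This is a hedged illustration of the idea behind Remark 2 under our own block-summary assumption, not the paper's exact pruning condition.

```python
# Sketch of boundary-based pruning when computing D_P^-(x'): whole blocks
# whose componentwise maximum is dominated by x' are accepted without a scan.

def dominates(x, y):
    return all(xi >= yi for xi, yi in zip(x, y))

def dominated_set_pruned(blocks, x_desc):
    """D_P^-(x') over a universe partitioned into blocks.

    blocks: list of (max_vector, members) pairs, where members is a list of
    (object id, criterion tuple) and max_vector is the componentwise maximum
    of the block (a hypothetical precomputed summary).
    """
    result = set()
    for max_vec, members in blocks:
        if dominates(x_desc, max_vec):
            # x' dominates the whole block at once: no per-object scan.
            result |= {i for i, _ in members}
        else:
            # Fall back to scanning this block object by object.
            result |= {i for i, v in members if dominates(x_desc, v)}
    return result

blocks = [((2, 2), [(1, (2, 2)), (2, (1, 2))]),
          ((4, 4), [(3, (4, 4)), (4, (3, 1))])]
print(dominated_set_pruned(blocks, (3, 3)))  # {1, 2, 4}
```

A symmetric test against componentwise minima yields D_P^+(x′); either way, only the blocks whose summary fails the comparison need a per-object scan.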

VI. ALGORITHM
According to the strategies of updating approximations with their corresponding coarse boundaries, we develop a new incremental algorithm. The main idea of the algorithm is to reduce more of the computation in updating approximations than the counterpart introduced in [23]. By using the coarse boundaries, some approximations may be updated by adding the new object directly. For the cases where Theorems 1, 2, 3 and 4 do not apply, the scan range in the computation of the dominating and dominated sets may be reduced by using the corresponding coarse boundaries. Combining these strategies, Algorithm 1 is designed for updating approximations in DRSA when a new object is added.
In Algorithm 1, steps 3–11 and 13–15 update the lower and upper approximations of an upward union of decision classes in the cases satisfying Theorems 1 and 2, respectively. The time complexity of these steps is O(|P|). By reducing the scan range in the computation of the dominating and dominated sets, the function fun-IncAlg1 further prunes unnecessary computation. Thus, even though the time complexity of fun-IncAlg1 is the same as that of the corresponding steps in the incremental algorithm designed in [23], it performs better to a certain extent.
From the analysis above, the time complexity of Algorithm 1 is O(|V_d|(|P| + |U|)). Algorithm 1 may prune more unnecessary computation in updating approximations than its counterpart, Algorithm 2 developed in [23].

VII. A NUMERIC ILLUSTRATION
Example 1: Table 1 presents an ordered information system. According to the condition attributes a_1 and a_2, we can draw a scatter chart as Figure 2.
In Figure 2, the red, blue and black points represent the objects whose decision attribute values are 1, 2 and 3, respectively.
Let P = {a_1, a_2}. The lower and upper approximations of the upward and downward unions of decision classes can then be calculated, respectively; the results are listed as follows. The coarse boundaries of the approximations above are also listed as follows.
It is not difficult to find that B_P(P(Cl_3^≥)) needs to be updated, while the rest of the coarse boundaries do not change after the object x′ is added. The updated coarse boundaries are listed as follows.

VIII. EXPERIMENTAL EVALUATIONS
In order to evaluate the performance of Algorithm 1, some experiments are designed and executed in this section. These experiments compare the computational time for updating the approximations of DRSA taken by Algorithm 1 and by its counterpart, the incremental algorithm designed in [23]; we call them BAlgorithm and IAlgorithm, respectively. The experimental platform is a personal computer with GNU/Linux Debian 9 (Stretch), an Intel(R) Core(TM) i5 CPU and 4 GB of memory. BAlgorithm and IAlgorithm are coded in GNU Octave, version 4.0.3. The datasets are downloaded from the machine learning data repository of the University of California at Irvine [36] and include Abalone, Car Evaluation, Ionosphere and Sonar. Their basic information is listed in Table 2.
To satisfy the experimental requirements, we select some attributes from these datasets to act as the condition criteria and the decision criterion, respectively. For example, for the dataset Abalone, we select Length, Diameter, Height, Whole weight, Shucked weight, Viscera weight and Shell weight as the condition criteria, and Rings as the decision criterion. Similar operations were performed on the other datasets.
In order to obtain more convincing experimental results, we scaled Car Evaluation two times, and Ionosphere and Sonar five times, respectively. We divided each of the four experimental datasets into five subsets of the same size, called Dataset 1, …, Dataset 5. For each experimental dataset, we united Dataset 1, Dataset 2, Dataset 3 and Dataset 4 as the training set to compute the approximations and the corresponding coarse boundaries. We randomly selected 1, 10, 20, 30, 40 and 50 objects from Dataset 5 as the new objects to be added, respectively. IAlgorithm and BAlgorithm were then employed to update the approximations of DRSA when the new objects were added. Figure 4 shows the trends of the computational time taken by IAlgorithm and BAlgorithm on the four experimental datasets. In each of the four sub-figures, the horizontal axis gives the number of objects added and the vertical axis gives the computational time. From Figure 4, one can see that the computational time taken by both IAlgorithm and BAlgorithm increases monotonically with the number of objects added. As shown in each sub-figure, BAlgorithm is always faster than IAlgorithm. Moreover, the difference between the computational times of IAlgorithm and BAlgorithm grows as the number of objects added increases.
The experimental results show that BAlgorithm outperforms IAlgorithm in updating approximations in a dynamic data environment. Hence, BAlgorithm is more effective than IAlgorithm for updating the approximations of DRSA when some objects become available.

IX. CONCLUSION AND FUTURE WORK
How to obtain knowledge effectively and timely from dynamic data is a hot issue in data mining research, which has received much attention in recent years. Ordered data are often used to represent people's preferences in daily life, and the analysis of ordered data is one of the main tasks of multi-criteria decision making and multi-criteria sorting. DRSA is an effective tool for ordered data mining. Computing approximations is one of the main steps in applying RST to data mining and knowledge discovery, and is the starting point for attribute reduction and rule extraction. In this article, an alternative incremental approach for updating the approximations of DRSA when some new objects become available is proposed. The main innovation is the concept of coarse boundaries of the approximations of DRSA, by which we can further reduce unnecessary computation in updating approximations. From the experimental evaluation and the numeric illustration, we can draw the following conclusions: (1) the approach is feasible for updating the approximations of DRSA; (2) the approach outperforms its counterpart; (3) the approach can be applied to updating the approximations of DRSA when multiple objects are added. Because the incremental algorithm cannot be used directly for massive data processing, our future work is to design a corresponding parallel algorithm for updating the approximations of DRSA on cluster and multi-core platforms.
SHAOYONG LI received the Ph.D. degree in computer science and technology from Southwest Jiaotong University, Chengdu, China, in 2014. He is currently an Associate Professor with the Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China. His research interests include data mining, parallel computing, and rough set theory.
LIN YU received the B.S. degree in computer science and technology from the Huali College, Guangdong University of Technology, Guangzhou, China, in 2019. He is currently pursuing the M.S. degree in computer science and technology with the Faculty of Intelligent Manufacturing, Wuyi University, Jiangmen, China. His research interests include data mining and rough set theory.