Applicable Metamorphic Testing for Erasable-Itemset Mining

Erasable-itemset mining is often used in production planning to identify itemsets that, if removed, would have little effect on production profits. Much recent attention has therefore focused on increasing mining efficiency. Yet guaranteeing mining correctness is also an important issue: in real applications, wrong mining results might lead a company to make inappropriate decisions. In this paper, we therefore use metamorphic testing, a lightweight software-testing strategy, to check the mining results by discovering proper metamorphic relations. The core idea comprises five metamorphic relations derived from two aspects. In the first aspect, we alter the maximum threshold without changing the input data; in the second aspect, we modify the product database and the material items while fixing the maximum threshold. In the experiments, 11 datasets and 76 mutants generated by μJava were used to examine the effects of the database parameters, namely the average number of materials per product, the number of different material items, the number of products, and the maximum threshold. The experimental results show that the metamorphic relations deliver good assessment performance in terms of effectiveness and efficiency, especially the fourth metamorphic relation.


I. INTRODUCTION
With the flourishing development of computer science, database volumes are becoming larger and larger. Facing such large volumes of data, people seek to obtain helpful information from these databases for decision support. Thus, data mining techniques such as Apriori and FP-tree have been proposed to mine useful knowledge for decision-makers. According to the knowledge mined, decision-makers can make the right decisions to meet rapidly fluctuating commercial patterns. To this end, Agrawal et al. proposed data mining [2] [3], which is widely used in fields such as bioinformatics and predictive analysis. There are many applications, especially in industrial manufacturing, such as product production planning. The product production planning problem is to find those itemsets (that is, sets of raw materials of the products) that would not significantly impact product profit if removed. To address this problem, Deng et al. proposed an erasable-itemset mining algorithm called META in 2009 [8].
For the owner of a manufacturing company, hiring a group of professionals just to mine the erasable itemsets is not a good idea because the labor cost is so high.
Additionally, with the development of cloud computing, more and more companies have begun outsourcing data mining tasks to third-party services. With such outsourcing, the returned itemsets or the mining program provided by the third-party service could lose some important information during transfer. With this in mind, the correctness and integrity of the returned itemsets gain importance: if the returned itemsets are not accurate, the company could suffer an enormous loss. In general, the huge number of returned erasable itemsets and the diversity of mining logic could incur high costs in both money and human resources. Therefore, verifying mining correctness and maintaining mining programs have become important research topics.
To address the issues above, a software-testing technique called metamorphic testing (MT) was proposed by Chen et al. in 1998 [5]. This approach defines metamorphic relations (MRs), each of which includes an input relation and an output relation. A test case can be converted into another test case through the input relation. If the outputs that the program produces for the two cases satisfy the output relation, then the program is regarded as highly reliable. In contrast, if the outputs of the two cases do not satisfy the output relation, the program evidently contains a logic error.
In summary, metamorphic relations are very helpful for reducing the cost of validating mining algorithms. In real applications, reducing the manufacturing cost by discovering redundant materials (erasable itemsets) is an important issue for a manufacturing company. If the erasable itemsets can be filtered out successfully by an erasable-itemset mining algorithm, the manufacturing cost will decrease effectively. For this purpose, metamorphic relations play a critical role in assuring the correctness of the program. If the metamorphic relations can be defined accurately, the manual cost of checking the mining algorithm can be saved. The main contributions of this paper are threefold.
(1) From the technical viewpoint, although several studies have addressed metamorphic testing and frequent-itemset-mining testing, no metamorphic relations have been proposed for erasable-itemset mining.
(2) From the practical viewpoint, the proposed metamorphic relations can be applied in real manufacturing applications. With the proposed methods, the manufacturing cost can be reduced significantly.
(3) From the robustness viewpoint, the proposed metamorphic relations were evaluated in a number of experiments. The experimental results reveal that the proposed metamorphic relations are very promising in terms of effectiveness and efficiency.
The remainder of this paper is structured as follows. The literature is briefly reviewed in Section 2. In Section 3, the proposed method for erasable-itemset mining assessment is presented in detail. The experimental analysis is described in Section 4. Finally, the conclusions and future work are given in Section 5.

II. Literature Review
The goal of this paper is to conduct a reliable assessment of erasable-itemset mining. It builds on two major ideas, namely metamorphic testing and erasable-itemset mining, which are briefly reviewed in this section.

A. Metamorphic Testing
Software testing is a way to assure and verify software quality, but it faces a fundamental problem: the oracle problem, that is, the difficulty of verifying the results of a software test. In 1998, Chen et al. proposed metamorphic testing (MT), a new approach for generating the next test case [6]. This technique effectively alleviates the oracle problem and has triggered a growing body of related work. Figure 1 shows the yearly numbers of MT-related publications from 2002 to 2019 in the IEEE society. Figure 1 indicates that MT has recently become a popular topic: the number of papers grew rapidly after the first International Workshop on Metamorphic Testing, organized by Kanewala et al. in 2016.
Chen et al. provided a comprehensive review of MT in 2018 [7], which discussed the current challenges and presented directions for new research in the field of MT.
To explain MT in detail, a sine function is taken as an example:
$$\mathrm{MR}: \{\, \mathrm{IR}\ (x_2 = \pi - x_1) \Rightarrow \mathrm{OR}\ (\sin x_1 = \sin x_2) \,\} \tag{1}$$
In Eq. (1), MR, IR and OR indicate the metamorphic relation, the input relation and the output relation, respectively. According to the input relation in the MR, MT uses the existing test case, known as the source input ($x_1$), to generate the subsequent test case, known as the follow-up input ($x_2$). Then, the program is executed with the source input ($x_1$) and the follow-up input ($x_2 = \pi - x_1$), and the results are obtained as the source output (the value of $\sin x_1$) and the follow-up output (the value of $\sin x_2$), respectively. Next, it is determined whether the source output and the follow-up output satisfy the output relation in the same MR. If the outputs of the two cases satisfy the output relation, the program is regarded as highly correct, although implicit errors might still exist. In contrast, if the outputs of the two cases do not satisfy the output relation, the program is sure to have errors. The most critical part of the MT procedure is designing reliable MRs. For that, Segura et al. proposed four aspects of a good MR design [19].
Over the past few years, MT has been applied in various fields by designing domain-specific MRs, for example in the IoT (Internet of Things) and in data mining. In IoT, Azimian et al. adopted MT to search for high-consuming applications on portable devices, such as smartphones [1]. In bioinformatics, Chen et al. used MT to test a range of bioinformatics programs [6]. In machine learning, Xie et al. proposed METTLE to evaluate and validate unsupervised learning systems [20]. In deep learning, Zhou and Sun used metamorphic testing to detect fatal errors in self-driving cars' onboard systems [23]. In data analytics software, Jarman et al. applied metamorphic testing to Adobe's data analytics software for time-series data [13]. In data mining, Zhang et al. used MT to verify the integrity of frequent itemsets and frequent-itemset mining programs [22]. Inspired by Zhang et al., we apply this concept to erasable-itemset mining. In [22], Zhang et al. designed ten MRs; in this paper, we follow their idea and design five new MRs for erasable-itemset mining.
[Figure 2: The META algorithm.]
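The sine relation in Eq. (1) can be checked mechanically. The following is a minimal Python sketch, not part of the original study: the helper `check_sine_mr`, the tolerance, and the deliberately faulty `buggy_sin` are hypothetical names and values introduced only for illustration.

```python
import math
import random

def check_sine_mr(sin_impl, x1, tol=1e-9):
    """Return True if sin_impl satisfies sin(x1) == sin(pi - x1) within tol."""
    x2 = math.pi - x1                      # input relation: build the follow-up input
    source_output = sin_impl(x1)
    follow_up_output = sin_impl(x2)
    # output relation: the two outputs must agree (up to rounding error)
    return math.isclose(source_output, follow_up_output, abs_tol=tol)

def buggy_sin(x):
    # Hypothetical faulty implementation: a truncated Taylor series,
    # accurate only near zero, so it breaks the symmetry around pi/2.
    return x - x**3 / 6

rng = random.Random(0)                     # fixed seed for repeatability
sources = [rng.uniform(0.0, math.pi) for _ in range(100)]

print(all(check_sine_mr(math.sin, x) for x in sources))   # True: no violation found
print(all(check_sine_mr(buggy_sin, x) for x in sources))  # False: the MR is violated
```

No expected value of sin x is needed anywhere in the check, which is how MT sidesteps the oracle problem; a passing run, however, does not prove the implementation correct.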

B. Erasable-itemset Mining
In 2009, erasable-itemset mining was introduced by Deng et al. to facilitate production planning [8]. They proposed an Apriori-based algorithm called META, which generates candidate patterns level by level. Figure 2 shows the META algorithm. In addition to META, several algorithms for erasable-itemset mining have been proposed in later years. In 2010, Deng and Xu proposed the VME algorithm [9], which used PID_List, a new data representation, to improve the mining efficiency. In 2012, Deng and Xu proposed another erasable-itemset mining algorithm called MERIT [10], which adopted a tree structure to mine the erasable itemsets rapidly. After that, in 2013, a revised version of MERIT called MERIT+ was presented by Le et al. [14], which used a pruning strategy to speed up the mining of MERIT. The following year, Le and Vo proposed MEI, a divide-and-conquer algorithm based on the concept of PID_List [15]. In 2017, Hong et al. proposed the concept of incremental erasable-itemset mining [12] to handle dynamic databases. Hong et al. then proposed a tree-based quasi-erasable-itemset mining approach to increase the efficiency [11]. Lee et al. then proposed an efficient algorithm for incremental erasable pattern stream mining with a special data structure and a pruning strategy [16]. Yun et al. presented an efficient erasable pattern mining algorithm based on sliding windows [21]. Through the proposed data structure and a divide-and-conquer approach, the erasable patterns can be mined from the data stream successfully and efficiently. In 2020, two methods were developed for erasable-pattern mining. One was proposed by Baek et al., which mined erasable patterns from dynamic data streams on the basis of a damped window model [4]. The other was proposed by Nam et al., which adopted the anti-monotone property to mine weighted erasable itemsets based on the static overestimated feature of itemset profits [18].
Example 1. To explain the above-mentioned erasable-itemset mining, an illustrative example is given before the details of the proposed approach are presented. Table 1 shows a product database consisting of five products and five materials a-e. Assume the maximum threshold r is set at 40%. The procedure of the algorithm is as follows. First, the total profit value of DB, denoted as T, is 200 + 200 + 100 + 400 + 100 = 1000, so the maximum threshold value is 1000 × 0.4 = 400. The gain values of the candidate erasable 1-itemsets, materials a to e appearing in the product database, are calculated. Here, take material b as an example. It appears in P1, P4, and P5 in the product database. Thus, the related gain value is 200 + 400 + 100 = 700. The gain values of the other materials are calculated in the same way, as shown in Table 2. Take a candidate erasable 2-itemset {c, d} as an example. Because products P1, P2, P3, and P5 contain at least one of c and d, the sum of the profits of these products is the gain value of the candidate erasable 2-itemset {c, d}, which is 600. Because this gain value is larger than the maximum threshold value (400), the candidate erasable 2-itemset {c, d} is not an erasable 2-itemset. The other candidate erasable 2-itemsets {c, e} and {d, e} are processed in the same way. Then, no candidate erasable 3-itemsets are generated. Finally, all erasable itemsets obtained are collected as the output EI, as shown in Table 3.
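The gain computation and level-wise search of Example 1 can be sketched in Python as follows. Because Table 1 is not reproduced here, the product database `DB` below is a hypothetical one, chosen only so that it agrees with the figures quoted in the example (total profit 1000, gain of b equal to 700, gain of {c, d} equal to 600). The names `gain` and `mine_erasable` and the integer-percentage threshold are likewise assumptions of this sketch; it follows the Apriori-style idea described above but is not the META implementation of [8].

```python
from itertools import combinations

# Hypothetical product database as (materials, profit) pairs.
# The material lists are assumptions; only the profits and the quoted
# gain values come from Example 1.
DB = [
    ({"a", "b", "c"}, 200),  # P1
    ({"a", "d"}, 200),       # P2
    ({"c", "e"}, 100),       # P3
    ({"a", "b"}, 400),       # P4
    ({"b", "d", "e"}, 100),  # P5
]

def gain(itemset, db):
    """Sum the profits of all products that contain at least one item of itemset."""
    return sum(profit for materials, profit in db if materials & itemset)

def mine_erasable(db, r_percent):
    """Level-wise search for itemsets whose gain is at most r_percent % of the total profit."""
    total = sum(profit for _, profit in db)

    def is_erasable(itemset):
        # Integer arithmetic avoids floating-point rounding at the threshold.
        return gain(itemset, db) * 100 <= r_percent * total

    items = sorted(set().union(*(materials for materials, _ in db)))
    level = {frozenset([i]) for i in items if is_erasable({i})}
    erasable = set()
    k = 1
    while level:
        erasable |= level
        k += 1
        # Join erasable (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Keep a candidate only if all of its (k-1)-subsets are erasable
        # (anti-monotone pruning) and its own gain passes the threshold.
        level = {
            c for c in candidates
            if all(frozenset(s) in erasable for s in combinations(c, k - 1))
            and is_erasable(c)
        }
    return erasable

print(gain({"b"}, DB))        # 700, as in Example 1
print(gain({"c", "d"}, DB))   # 600, as in Example 1
print(sorted("".join(sorted(s)) for s in mine_erasable(DB, 40)))
```

On this hypothetical database with r = 40%, the sketch returns {c}, {d}, {e}, {c, e} and {d, e}; since the database is assumed, this output need not coincide with Table 3.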

A. Overview
The goal of this paper is to define MRs to validate erasable-itemset mining programs. To reach this goal, we propose five MRs from two aspects based on the input data of the mining problem. In the first aspect, we tune the maximum threshold without changing the product database and the set of materials. In the second aspect, conversely, the threshold is fixed while the product database and the set of materials are changed. This lightweight method based on metamorphic testing substantially reduces company costs because it is automatic and easy to maintain. To evaluate the effectiveness of the proposed MRs, numerous experiments were conducted to simulate malicious attacks by mutating the erasable-itemset mining program (META), and the mutation score was used to validate the performance of the proposed MRs.

B. Preliminary
A metamorphic relation (MR) consists of an input relation and an output relation. At the beginning of the metamorphic testing procedure, an existing test case called the source input is given. The input relation of a specific MR is used to generate a follow-up input from the source input, and the output relation is used to verify the relationship between the source output and the follow-up output. The source output is the result produced by the mining program for the source input, and the follow-up output is the result produced by the same mining program for the follow-up input. In the following, the symbols used in the proposed MRs are introduced. According to the concept of MT, these symbols are divided into those for the source input, the follow-up input, the source output, and the follow-up output.

Table 4. Symbols of the source and follow-up inputs.
Source    Follow-up    Description
D         D'           The product database
I         I'           The set of materials
r         r'           The maximum threshold

1) SOURCE/FOLLOW-UP INPUTS
The source input refers to the existing test case, which consists of the product database, the set of materials, and the maximum threshold. The follow-up input is obtained by converting the source input through a specific input relation; it also consists of a product database, a set of materials, and a maximum threshold. The symbols of the source and follow-up inputs used in the MRs are listed in Table 4.

2) SOURCE/FOLLOW-UP OUTPUTS
The source output refers to the results when the programmer executes the mining program with the specific source input. The follow-up output refers to the results when the programmer executes the same mining program with the follow-up input. The symbols of the source and follow-up outputs to be used in the MRs are listed in Table 5.

C. Metamorphic Relations
In this section, the five proposed MRs are presented from two perspectives. In the first perspective, the maximum threshold is tuned while the other input data are kept constant. In the second perspective, the product database and the set of materials are changed while the maximum threshold is kept constant. To better explain the design logic, each MR is accompanied by an illustrative example that continues the example given in Section 2.

1) RELATIONS WITH VARYING MAXIMUM THRESHOLDS
(A) THE FIRST METAMORPHIC RELATION (MR1)
In MR1, we consider the impact on the number of erasable itemsets when decreasing the maximum threshold r. The first MR is defined as follows:
Input relation: I' = I, D' = D and r' < r.
Output relation: |E'| ≤ |E| and E' ⊆ E.
In the above MR, the follow-up product database D' is identical to the source product database D, and the set of follow-up materials I' is identical to the set of source materials I. The follow-up threshold, which is the follow-up maximum threshold r' multiplied by the total profit of the follow-up product database, decreases as r' decreases. Thus, the gain values of some source erasable itemsets may become larger than the follow-up threshold, and these itemsets are removed from the set of source erasable itemsets E. The remaining source erasable itemsets form the set of follow-up erasable itemsets E'. Besides, the gain values of the source nonerasable itemsets are still larger than the follow-up threshold, so they remain nonerasable and form part of the set of follow-up nonerasable itemsets NE'. We therefore conclude that every follow-up erasable itemset is also a source erasable itemset, and that the number of follow-up erasable itemsets |E'| is smaller than or equal to that of the source erasable itemsets |E|. Thus, |E'| ≤ |E| and E' ⊆ E.
Example 2. Here, an example is given to illustrate MR1. The follow-up product database D' and the set of follow-up materials I' are kept unchanged, as shown in Table 1. The source maximum threshold r is 40%, and we reduce it to 30% as the follow-up maximum threshold r'. Thus, the follow-up maximum threshold value is 1000 × 0.3 = 300. Then, we execute the mining program with the follow-up input data. The set of follow-up candidate erasable 1-itemsets CE1' is the same as the set of source candidate erasable 1-itemsets CE1 shown in Table 2. The follow-up candidate erasable 1-itemsets whose gain values do not exceed the follow-up maximum threshold value are retained. From Table 2, we know that {c}, {d} and {e} are the follow-up erasable 1-itemsets. Next, the set of follow-up candidate erasable 2-itemsets CE2' is obtained from the set of follow-up erasable 1-itemsets E1'. In this example, we have three follow-up candidate erasable 2-itemsets: {c, d}, {d, e}, and {c, e}. The gain values of the three candidate erasable 2-itemsets are calculated and compared with the maximum threshold value of 300. Then, no follow-up candidate erasable 3-itemsets are generated. Finally, Table 6 shows the follow-up erasable itemsets. Comparing Tables 3 and 6 shows that the number of follow-up erasable itemsets is smaller than that of the source erasable itemsets.

(B) THE SECOND METAMORPHIC RELATION (MR2)
In MR2, the impact on the number of erasable itemsets is considered while increasing the maximum threshold r. The second MR is defined as follows:
Input relation: I' = I, D' = D and r < r'.
Output relation: |E| ≤ |E'| and E ⊆ E'.
In the above MR, the follow-up product database D' and the set of follow-up materials I' are identical to the source product database D and the set of source materials I, respectively. The follow-up threshold, derived in the same way as in MR1, increases as the follow-up maximum threshold r' increases. Thus, the gain values of some source nonerasable itemsets may become smaller than or equal to the follow-up threshold, and these itemsets are removed from the set of source nonerasable itemsets NE. The remaining source nonerasable itemsets form the set of follow-up nonerasable itemsets NE'. Because the gain values of the source erasable itemsets are still no larger than the follow-up threshold value, they remain erasable. Thus, the set of follow-up erasable itemsets E' consists of the source erasable itemsets plus those itemsets removed from the set of source nonerasable itemsets NE. We therefore conclude that the number of follow-up erasable itemsets |E'| is larger than or equal to the number of source erasable itemsets |E|. Thus, |E| ≤ |E'| and E ⊆ E'.
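Both threshold-varying relations can be checked without knowing the expected mining result. The sketch below is illustrative only: `DB` is a hypothetical database consistent with the figures quoted in the examples (Table 1 is not reproduced here), `erasable_itemsets` is a brute-force stand-in for the program under test that derives the material set I from the database, and `mr1_holds` and `mr2_holds` are assumed helper names.

```python
from itertools import combinations

# Hypothetical product database (materials, profit); an assumption of this sketch.
DB = [
    ({"a", "b", "c"}, 200),  # P1
    ({"a", "d"}, 200),       # P2
    ({"c", "e"}, 100),       # P3
    ({"a", "b"}, 400),       # P4
    ({"b", "d", "e"}, 100),  # P5
]

def erasable_itemsets(db, r_percent):
    """Brute-force stand-in for the mining program under test.

    Returns every non-empty itemset whose gain is at most r_percent %
    of the total profit of db.
    """
    total = sum(profit for _, profit in db)
    items = sorted(set().union(*(materials for materials, _ in db)))
    found = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            gain = sum(profit for materials, profit in db if materials & itemset)
            if gain * 100 <= r_percent * total:
                found.add(itemset)
    return found

def mr1_holds(miner, db, r, r_lower):
    """MR1: with r' < r, E' must be a subset of E (which implies |E'| <= |E|)."""
    assert r_lower < r
    return miner(db, r_lower) <= miner(db, r)

def mr2_holds(miner, db, r, r_higher):
    """MR2: with r < r', E must be a subset of E' (which implies |E| <= |E'|)."""
    assert r < r_higher
    return miner(db, r) <= miner(db, r_higher)

print(mr1_holds(erasable_itemsets, DB, 40, 30))               # True
print(mr2_holds(erasable_itemsets, DB, 40, 60))               # True
print([len(erasable_itemsets(DB, r)) for r in (30, 40, 60)])  # [3, 5, 7]
```

Any other mining program with the same call signature can be passed as `miner`; a `False` result would indicate a violated relation and hence a fault.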
Example 3. Based on the above examples, given 60% as the follow-up maximum threshold r', the follow-up maximum threshold value is 1000 × 0.6 = 600. Then, as shown in Table 7

2) RELATIONS WITH VARYING MATERIALS AND THE DATABASE D
(A) THE THIRD METAMORPHIC RELATION (MR3)
In MR3, we consider the impact on the number of erasable itemsets when removing an erasable 1-itemset from the product database and the set of materials. The third MR is defined as follows:
Input relation: I' is constructed by removing an erasable 1-itemset from I. D' is constructed by removing the same erasable 1-itemset from D. Then, r' = r.
In the above MR, the set of follow-up materials I' is constructed by removing an erasable 1-itemset, which we assume is {e}, from the set of source materials I. The follow-up product database D' is constructed by removing e from the source product database D. The follow-up maximum threshold r' is equal to the source maximum threshold r, which means that the follow-up threshold is unchanged. Because item e is removed from the source product database D, when we execute the mining program with the follow-up input containing D', I', and r', the number of follow-up erasable 1-itemsets |E1'| is the number of source erasable 1-itemsets |E1| minus 1. Thus, |E1'| = |E1| - 1. Because the number of follow-up erasable 1-itemsets |E1'| decreases, the number of follow-up erasable 2-itemsets |E2'| cannot exceed that of the source erasable 2-itemsets |E2|. Thus, |E2'| ≤ |E2|. In the same way, we infer that the number of follow-up erasable k-itemsets |Ek'| is smaller than or equal to the number of source erasable k-itemsets |Ek|, that is, |Ek'| ≤ |Ek|. Finally, we conclude that the number of follow-up erasable itemsets |E'| is smaller than the number of source erasable itemsets |E|.
Example 4. The follow-up product database D' is constructed by removing a source erasable 1-itemset {e} from the source product database D, while the profit value of each product is unchanged. From Table 3, we have all source erasable 1-itemsets. Assume that we remove item e from the source product database to form the follow-up product database D', as shown in Table 8. Item e is also removed from the set of source materials to form the set of follow-up materials I'. The maximum threshold value, which is 400, stays constant. Finally, {c} and {d} are the follow-up erasable 1-itemsets, and no follow-up erasable 2-itemset is generated in this case. By comparing this result with Table 3, we can conclude that the number of follow-up erasable itemsets is smaller than the number of source erasable itemsets.
(B) THE FOURTH METAMORPHIC RELATION (MR4)
In MR4, we consider the impact on the number of erasable itemsets when adding a new material. The fourth MR is defined as follows:
Input relation: I' is constructed by adding a new material to I. D' is constructed by adding the same material to the materials of each product. Then, r' = r.
In the above MR, the set of follow-up materials I' is constructed by adding a new material, which we assume is g, to the set of source materials I. The follow-up product database D' is constructed by adding material g to each product in the source product database D. The follow-up maximum threshold r' is equal to the source maximum threshold r. Suppose the set of source erasable 1-itemsets E1 is {{i1}, {i2}, {i3}, ..., {im}}. Because material g is added to each product, its gain value is equal to the total profit of the follow-up product database D'. Thus, its gain value is larger than the follow-up threshold value whenever the maximum threshold is below 100%, and {g} belongs to the set of follow-up nonerasable 1-itemsets NE1'. Consequently, the set of follow-up erasable 1-itemsets E1' is identical to the set of source erasable 1-itemsets E1. Because E1' = E1, we infer that the set of follow-up erasable 2-itemsets E2' is also identical to the set of source erasable 2-itemsets E2. Finally, we conclude that the number of source erasable itemsets |E| is equal to the number of follow-up erasable itemsets |E'|.

(C) THE FIFTH METAMORPHIC RELATION (MR5)
In MR5, we consider the impact on the number of erasable itemsets when adding a new product with a new material to the product database. The fifth MR is defined as follows:
Input relation: I' is constructed by adding a new material to I. D' is constructed by adding a new product Pn+1 to D, where the only material of Pn+1 is the new material and its profit is zero. Then, r' = r.
In the above MR, the set of follow-up materials I' is constructed by adding a new material h to the set of source materials I. The follow-up product database D' is constructed by adding a new product Pn+1 to the source product database D, where its material is h and the profit of this product is zero. Because the profit of Pn+1 is zero, the gain value of h is also zero. Thus, {h} must belong to the set of follow-up erasable 1-itemsets, and the number of follow-up erasable 1-itemsets |E1'| increases accordingly. Then, the set of follow-up erasable 2-itemsets E2' is generated from the set of follow-up erasable 1-itemsets E1'. Because |E1'| increases, the number of follow-up erasable 2-itemsets |E2'| might increase, too. Finally, we conclude that the number of source erasable itemsets |E| does not exceed the number of follow-up erasable itemsets |E'|, i.e., |E| ≤ |E'|.
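The three data-varying relations can be sketched in the same spirit. The code below is an illustration under stated assumptions rather than the procedure used in the experiments: `DB` is a hypothetical database consistent with the quoted example figures, `erasable_itemsets` is a brute-force stand-in for the program under test, and the count-based checks in `mr3_holds`, `mr4_holds` and `mr5_holds` are taken from the concluding sentence of each explanation above.

```python
from itertools import combinations

# Hypothetical product database (materials, profit); an assumption of this sketch.
DB = [
    ({"a", "b", "c"}, 200),  # P1
    ({"a", "d"}, 200),       # P2
    ({"c", "e"}, 100),       # P3
    ({"a", "b"}, 400),       # P4
    ({"b", "d", "e"}, 100),  # P5
]

def erasable_itemsets(db, r_percent):
    """Brute-force stand-in for the mining program under test."""
    total = sum(profit for _, profit in db)
    items = sorted(set().union(*(materials for materials, _ in db)))
    found = set()
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            itemset = frozenset(combo)
            gain = sum(profit for materials, profit in db if materials & itemset)
            if gain * 100 <= r_percent * total:
                found.add(itemset)
    return found

def mr3_holds(miner, db, r, item):
    """MR3: removing an erasable 1-itemset must reduce the number of erasable itemsets."""
    source = miner(db, r)
    assert frozenset([item]) in source, "MR3 requires an erasable 1-itemset"
    follow_up_db = [(materials - {item}, profit) for materials, profit in db]
    return len(miner(follow_up_db, r)) < len(source)

def mr4_holds(miner, db, r, new_item):
    """MR4: adding a new material to every product must leave |E| unchanged (r < 100%)."""
    follow_up_db = [(materials | {new_item}, profit) for materials, profit in db]
    return len(miner(follow_up_db, r)) == len(miner(db, r))

def mr5_holds(miner, db, r, new_item):
    """MR5: adding a zero-profit product with a new material must not reduce |E|."""
    follow_up_db = db + [({new_item}, 0)]
    return len(miner(db, r)) <= len(miner(follow_up_db, r))

print(mr3_holds(erasable_itemsets, DB, 40, "e"))  # True
print(mr4_holds(erasable_itemsets, DB, 40, "g"))  # True
print(mr5_holds(erasable_itemsets, DB, 40, "h"))  # True
```

With r = 40% and item e removed, the stand-in miner returns only {c} and {d} on this hypothetical database, which agrees with the outcome described in Example 4.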
Example 6. The follow-up product database D' is constructed by adding a new product h to the source product database and the related profit is zero. Table 10

A. Experimental Settings
The experiments rely on two major sources, namely mutated programs and transaction data. The goal of the mutated programs is to simulate implementation errors in a mining program. For this purpose, we used the µJava tool, a mutation system for Java programs [17]. In the experiments, 76 faulty programs, called mutants, were generated for the META algorithm, each of which contained only one fault. An example is shown in Figure 3. The correct program accumulates the sum from 0 to 9, where Line 3 is j = j + 1. After µJava is applied, Line 3 is altered to j = j - 1. For the experimental transaction data, the IBM Synthetic Data Generator was used to generate synthetic datasets [24]. The parameters for creating the synthetic datasets are shown in Table 11. To evaluate the performance of the proposed metamorphic relations (MRs), we adopted the mutation score metric. The mutation score is the ratio of detected mutants to the total number of mutants for a specific MR. If a mutant program violates an MR, the mutant is said to be killed. For example, if k out of t mutants are detected by an MR, then the mutation score of this MR is k/t. The erasable-itemset mining programs in this paper were implemented in Java and run on a server with an Intel Core i7-9700 3 GHz CPU and 8 GB of RAM.
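The mutation score k/t can be illustrated with a toy computation. The sketch below is not a reproduction of the experiments: the three hand-written mutants stand in for µJava-generated ones, `DB` is a hypothetical database, and `make_miner`, `mr1_holds`, `mr4_holds` and `mutation_score` are assumed names. The numbers it prints describe only this toy setting.

```python
import operator
from itertools import combinations

# Hypothetical product database (materials, profit); an assumption of this sketch.
DB = [
    ({"a", "b", "c"}, 200),
    ({"a", "d"}, 200),
    ({"c", "e"}, 100),
    ({"a", "b"}, 400),
    ({"b", "d", "e"}, 100),
]

def make_miner(compare=operator.le,
               touches=lambda materials, itemset: bool(materials & itemset)):
    """Build a brute-force miner; swapping compare or touches injects one fault."""
    def miner(db, r_percent):
        total = sum(profit for _, profit in db)
        items = sorted(set().union(*(materials for materials, _ in db)))
        found = set()
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                itemset = frozenset(combo)
                gain = sum(p for m, p in db if touches(m, itemset))
                if compare(gain * 100, r_percent * total):
                    found.add(itemset)
        return found
    return miner

# Hand-written single-fault mutants (stand-ins for tool-generated mutants).
MUTANTS = {
    "<= replaced by >": make_miner(compare=operator.gt),
    "<= replaced by <": make_miner(compare=operator.lt),
    "any-match replaced by all-match": make_miner(touches=lambda m, s: s <= m),
}

def mr1_holds(miner, db, r=40, r_lower=30):
    """MR1: lowering the threshold must yield a subset of the source result."""
    return miner(db, r_lower) <= miner(db, r)

def mr4_holds(miner, db, r=40, new_item="g"):
    """MR4: adding a new material to every product must keep |E| unchanged."""
    follow_up_db = [(m | {new_item}, p) for m, p in db]
    return len(miner(follow_up_db, r)) == len(miner(db, r))

def mutation_score(mr_holds, mutants, db):
    """Return (k, t): k mutants killed by the MR out of t mutants in total."""
    killed = [name for name, miner in mutants.items() if not mr_holds(miner, db)]
    return len(killed), len(mutants)

print(mutation_score(mr1_holds, MUTANTS, DB))  # (1, 3)
print(mutation_score(mr4_holds, MUTANTS, DB))  # (2, 3)
```

In this toy setting, any mutant that still compares some gain-like quantity against the threshold with `<=` or `<` satisfies MR1 by construction, so a threshold-varying relation alone cannot kill it.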

B. Experimental Results
The main purpose of the evaluations is to assess the performance of the proposed MRs. The evaluations consider the impact of the average number of items in each product, the number of different items in the datasets, the number of products in the datasets, and the maximum threshold.

1) INFLUENCES OF AVERAGE NUMBERS OF MATERIALS PER PRODUCT
To observe the influence of the average number of materials per product on the proposed MRs, we altered parameter T (the average number of materials per product) to generate different datasets. In our experiments, T is set to 20, 40, 60, and 80, where N=1000, D=10000, and the maximum threshold is set to 0.04%. Figure 4 shows the resulting mutation scores for each MR. In this figure, the performances of MR1, MR4 and MR5 decrease as the average number of materials increases. In contrast, when the average number of materials becomes large, the performance of MR3 increases and that of MR2 is stable. The reason is that the addition operation is harder to detect than the removal operation. Figure 5 shows the time to execute all mutants for every MR. As the average number of materials increases, the execution time increases accordingly. In summary, with respect to transaction length, mutants are more difficult to detect with the threshold-varying MRs than with the data-varying MRs, and the execution times of all MRs are very close.

2) INFLUENCES OF NUMBERS OF UNIQUE MATERIALS
To observe the influence of the number of different items in the datasets on our MRs, we altered parameter N (the number of different materials in the datasets) to generate datasets. In this experiment, N is set to 3000 (3K), 5000 (5K), 7000 (7K) and 9000 (9K), where T=5, D=10000 and the maximum threshold is set to 0.04%. Figure 6 shows the mutation score of each MR, from which two observations can be made. First, MR1 and MR2 perform the same. Second, the performances of MR3, MR4 and MR5 are stable and higher than those of MR1 and MR2. Figure 7 shows the time taken to execute all mutants for every MR. The execution time of all MRs increases linearly as the number of unique materials in the datasets increases. Yet MR2 needs much more time than the others when the number of unique items is large. This is because MR2 raises the maximum threshold, which sharply increases the cost of searching for itemsets.

3) INFLUENCES OF NUMBERS OF PRODUCTS
In addition to the above parameters, this evaluation clarifies the impact of different numbers of products. Referring to Table 11, D in this evaluation was set to 1000 (1K), 5500 (5.5K) and 10000 (10K), where N=1000, T=10 and the maximum threshold was again 0.04%. Figure 8 shows the performances, from which several observations can be made. First, MR1 and MR2 perform the same and much worse than the others. Second, MR1, MR2 and MR4 are not influenced by the number of products in our datasets. In contrast, the performances of MR5 and MR3 change slightly. Figure 9 shows the execution time. It can be observed that the execution time of all MRs grows significantly as the number of products rises.

4) EXPERIMENTAL SUMMARY
Based on the above experimental evaluations, the results are summarized in the following.
(1) To assess the effectiveness of the developed MRs, the mutation score is used in the experiments. This measure gives the rate at which biased programs are successfully detected. The experimental results show that MR4 performed better than the other MRs. In the future, we will keep testing the five proposed MRs on real data.
(2) The mutation scores above show that mutants are not easy to detect with the threshold-varying MRs. In contrast, they are easy to detect with the data-varying MRs, especially MR4. The potential explanation is that MR4 adds a new material to each product, which is a clearer change than those made by the other MRs. In detail, the correct results of a mining program rely heavily on the coding logic and the thresholds. Therefore, a threshold change may recover the results lost through biased coding. This is the potential reason why mutants are not easy to detect with the threshold-varying MRs in the experimental cases.
(3) From the execution-time viewpoint, heavy data changes result in high execution time. Product length in particular matters, because much time is needed to compose and decompose the products. This problem often occurs in Apriori-based algorithms. Moreover, the comparison of MR1 and MR2 in Figure 7 shows that the threshold is sensitive to the number of unique materials. This is because the mutated threshold is lower than the source threshold for MR1. Further, more candidates are pruned, which speeds up the mining, when the number of unique materials is larger.
(4) Overall, MR1, MR2 and MR4 perform more stably than MR3 and MR5. However, mutants are much harder to detect with MR1 and MR2 than with MR4. This is because the results are insensitive to changes in the programming logic and the threshold. Besides, lower thresholds in erasable-itemset mining generate fewer results. However, the biased logic might recover the lost results, which makes the biased programs hard to detect. For example, if the itemset {A, D} is eliminated by a threshold, the biased logic may still generate it. As a result, the biased program is not detected in this case. This problem will be studied in our future work.

V. Conclusions
Recently, cost reduction has been a hot topic for manufacturing companies. To this end, the mining of erasable materials has been proposed as a solution. However, much manual effort is needed to verify the correctness of mining programs. To remedy this, in this paper we use metamorphic testing, a lightweight method, to verify the correctness of erasable-itemset mining. In the proposed method, five metamorphic relations (MRs) are defined according to the properties of erasable-itemset mining. Each MR is evaluated experimentally using the mutation score. These MRs provide manufacturing companies with a solution to effectively verify the correctness of erasable itemsets.
Although this paper provides metamorphic testing for erasable-itemset mining, a number of tasks remain for future work. First, the threshold-varying problem will be studied to achieve more robust performance. Second, in addition to checking mining correctness, logical errors will be located by level-wise metamorphic testing. Third, collaborations with real companies will be conducted using real datasets.