A Multi-Core Approach to Efficiently Mining High-Utility Itemsets in Dynamic Profit Databases

Analyzing customer transactions to discover high-utility itemsets is a popular task, which consists of finding the sets of items that are purchased together and yield a high profit. However, many studies assume that transactional data is static, while in real life it changes over time. For example, the unit profits of items may vary from one week to another because sale prices and production costs may change. Many algorithms for mining high-utility itemsets (HUIs) ignore this important property and thus are inapplicable or generate inaccurate results on real data. To address this issue, this paper proposes a novel algorithm named Multi-Core HUI Miner (MCH-Miner). It adapts techniques introduced in the iMEFIM algorithm to run on a parallel multi-core architecture to efficiently mine HUIs in dynamic transaction databases. An empirical evaluation shows that in most cases, MCH-Miner is significantly faster than iMEFIM, and that the cost of database scans is reduced.


I. INTRODUCTION
A key problem in the field of data mining is frequent itemset mining (FIM), which was introduced in 1994 by Agrawal and Srikant [1]. It consists of identifying frequently occurring sets of items in a transaction database, that is, frequent itemsets (FIs). FIM algorithms rely on the support framework to discover the itemsets whose occurrence frequencies are no less than a minimum support threshold. To efficiently solve the problem, a property called the downward closure property was proposed [1]. This property helps reduce the search space by pruning unpromising candidates early. However, FIM treats all items of a transaction database equally, that is, as having the same importance. This assumption is a major drawback of FIM because in many real-world applications, the items of a transaction database may have different degrees of importance. For instance, bread and beverages are purchased much more often every day than cars. As a result, FIM algorithms are more likely to discover the former itemset than the latter, even though the latter, while it seldom appears, yields a much higher profit.
To address the above drawback of FIM, the task of high-utility itemset mining (HUIM) was proposed. The purpose of HUIM is to reveal itemsets that are useful or profitable (have a high utility) in a transaction database [2]. HUIM can be considered an extension of FIM, and it has been the subject of many recent studies [3]–[7]. Traditional HUIM algorithms assume that each item has its own fixed utility value and a quantity value when appearing in a transaction. In practical applications, one can set these values based on one's preferences and needs; the utility may represent information such as the weight, cost or profit of an item. An HUIM algorithm operates on databases with utility information to discover itemsets having a utility that is no less than a user-specified threshold, named the minimum utility (minutil). Itemsets meeting this constraint are called high-utility itemsets (HUIs). These itemsets are useful for many real-world applications such as click stream analysis, user behavior analysis and cross marketing.

Discovering high-utility itemsets is a challenging combinatorial problem. In FIM, the support measure is anti-monotonic and satisfies the downward closure property, which is useful for reducing the search space. However, since the utility measure in HUIM is neither anti-monotonic nor monotonic, this property does not hold [7]. Thus, the search space of HUIM is huge, which leads to long execution times and high memory usage.

Table 2 presents an example transaction database with five transactions, denoted as T1, T2, ..., T5. Each transaction stores information about a set of purchased items. For instance, consider transaction T4. It indicates that a customer has purchased items a, b, d and e, and that the purchase quantities of these items are 5, 2, 1 and 2, respectively. The database also stores the unit profits of the items appearing in each transaction. In the case of T4, the unit profits of a, b, d and e are 1, 2, 5 and 4, respectively. From this information, the utility of an item in a transaction is calculated as the product of its quantity and its unit profit. Furthermore, by summing up the utility values of every item of an itemset over all transactions that contain it, we obtain the utility of that itemset for the whole database. For instance, consider the itemset {bc}, which appears in transactions T1, T2 and T5. Its utility is 18, whereas its support (occurrence frequency) is 3. Similarly, the utility of itemset {de} is 22 and its support is 2. If the minutil threshold is set to 20, then {de} is an HUI. This example clearly shows that the itemset with the higher occurrence frequency (here, {bc}) is not always the one that yields the highest profit (utility).
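To make this computation concrete, the following is a minimal sketch in Java of how the utility and support of an itemset are obtained under this fixed-profit model. The data is a toy example: only the second transaction corresponds to T4 of Table 2; the first is hypothetical.

```java
import java.util.*;

// Minimal sketch of utility computation under a fixed-profit model.
// Quantities are per transaction; unit profits are fixed per item.
public class UtilityExample {
    public static void main(String[] args) {
        // Fixed unit profits, e.g. pr(a)=1, pr(b)=2, pr(d)=5, pr(e)=4 (as in T4 of Table 2).
        Map<Character, Integer> profit = Map.of('a', 1, 'b', 2, 'd', 5, 'e', 4);

        // Each transaction maps an item to its purchase quantity.
        // The second entry is T4 from the running example; the first is hypothetical.
        List<Map<Character, Integer>> db = List.of(
            Map.of('d', 1, 'e', 2),                 // hypothetical transaction
            Map.of('a', 5, 'b', 2, 'd', 1, 'e', 2)  // T4
        );

        Set<Character> itemset = Set.of('d', 'e');
        int utility = 0, support = 0;
        for (Map<Character, Integer> t : db) {
            if (!t.keySet().containsAll(itemset)) continue; // itemset must be contained
            support++;
            for (char i : itemset) utility += t.get(i) * profit.get(i); // u(i,T) = q x pr
        }
        System.out.println("u = " + utility + ", support = " + support);
    }
}
```

The two nested loops are the core of the computation; a real HUIM implementation only differs in how the database is loaded and in how candidate itemsets are enumerated and pruned.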
FIM discovers FIs, that is, itemsets having a support no less than a minimum support threshold (minsup). Similarly, in HUIM, the discovered HUIs are the itemsets having a utility no less than minutil. Since the introduction of HUIM, several algorithms have been presented to discover the complete set of HUIs, such as Two-Phase [8], IHUP [9], UP-Growth [10] and EFIM [11].
However, all the algorithms mentioned above assume that the unit profit of each item is fixed. Take the database shown in Table 2 as an example. Item d, which appears in T1, T2, T3 and T4, has a fixed unit profit of 5 in all these transactions, and the same is true for the other items. This assumption is impractical for real-world databases and applications, where the unit profit of items may vary over time for various reasons. For instance, a change in supply costs, promotions, variable transportation fees or taxes may influence the unit profits of items. As a result, when traditional HUI mining algorithms are applied to real-world databases containing dynamic profit values, they are unable to return accurate results or are inapplicable. To provide an efficient solution to this problem, Nguyen et al. have recently proposed a framework [12] that overcomes this unrealistic assumption and makes HUI mining possible in real-world databases with dynamic profit values. This new database type is called a dynamic profit database. The authors also proposed a compact format to store the transactions along with all their utility information [12]. Using this framework, all currently available HUIM algorithms can be applied to dynamic profit databases and generate accurate results. Table 3 presents an example of a dynamic profit database, which is an extended version of the database given in Table 2, designed to store dynamic item profit values. For instance, consider item a. Its unit profit values are 1.0, 1.1, 1.1 and 1.2 when appearing in T2, T3, T4 and T5, respectively. Along with the dynamic utility framework, an efficient algorithm was designed by Nguyen et al. to mine HUIs from this new database type. The algorithm, named iMEFIM [12], extends the EFIM algorithm, combining the new framework with all the efficient techniques presented in EFIM. Furthermore, a novel data structure named the P-set is used to significantly reduce the cost of database scans.
The rest of this paper is organized as follows. Section II briefly reviews relevant related work on HUIM. Section III describes the problem of mining HUIs in dynamic profit databases and presents important definitions. Section IV discusses parallelism and presents an extension of the iMEFIM algorithm that further boosts its performance using parallel computing. Section V describes the evaluation of the extended algorithm by comparing its performance with that of the original algorithm. Finally, a conclusion is drawn and future work is discussed in Section VI. The abbreviations frequently used in this manuscript are presented in Table 1.

II. RELATED WORK
As mentioned above, HUIM is more challenging than FIM because the utility measure does not satisfy the downward closure property, as stressed by Yao and Hamilton [7], [13]. Thus, this property cannot be used to prune candidate patterns. To reduce the search space of the HUIM problem, efficient pruning strategies and utility-based upper bounds are required [14]. The first HUIM algorithm based on this idea is UMining [7]. However, this algorithm is incomplete, i.e., it may not discover the complete set of HUIs in a transaction database. To address this issue, the Transaction Weighted Utilization (TWU), a safe utility-based upper bound, was proposed and adopted in several algorithms [8], [9], [15]. It is used to efficiently and safely prune candidates. Since the TWU is anti-monotonic, it satisfies the downward closure property. Based on this property, Liu et al. designed the TWU downward closure model (TWDC) [8] to prune unpromising candidate patterns from the search space early, and hence reduce the time needed to discover HUIs. Combining the Apriori algorithm with the TWDC model, Liu et al. introduced an algorithm named Two-Phase [8]. As the name implies, the algorithm requires two separate phases to mine HUIs. First, it obtains a list of candidates having a TWU no less than the minutil threshold. Then, a database scan is performed to determine the exact utility value of each candidate, and only the patterns having utility values greater than or equal to the minutil threshold are returned. Thus, Two-Phase returns the complete set of HUIs, but it can still have very long runtimes.

By adopting the TWDC model, many algorithms were then introduced to increase the effectiveness of the HUIM process by lowering the number of candidates generated when using the TWU [16]–[18]. Based on the concept of the pattern growth tree introduced by Han et al. [19], algorithms such as IHUP [9] and UP-Growth [10] were proposed. These algorithms all mine HUIs in two phases: first they generate candidate patterns, and then they scan the database to compute the exact utility of each candidate and filter out those having a low utility. Moreover, such algorithms suffer from a scalability issue, since the number of generated candidates is huge when they are applied to large databases. The HUI-Miner algorithm, introduced by Liu and Qu [3], overcomes this limitation by mining HUIs in one phase, without candidate generation, and requires only one database scan. To achieve this, the authors presented an efficient structure named the utility-list. Although that structure can be employed to efficiently prune candidates, the join operation on these lists is computationally expensive and not suitable for large databases. Fournier-Viger et al. then proposed an extended version of the HUI-Miner algorithm, named FHM [20]. This algorithm employs an Estimated Utility Co-occurrence Structure (EUCS) and a pruning strategy called EUCP to greatly reduce the mining time and the number of database scans. Also extending HUI-Miner, Krishnamoorthy proposed an algorithm named HUP-Miner [21], which efficiently prunes unpromising candidates by extending the utility-list structure into a partitioned utility-list structure. It also utilizes a look-ahead strategy to minimize the time needed to construct utility lists. The goal of all the aforementioned HUIM approaches is to increase the performance of the mining process by pruning candidates.
A list-based algorithm named DHUPL (Damped High Utility Pattern mining based List) was presented to mine HUIs in data streams [22]. DHUPL prunes patterns using a damped window model that considers newly arrived data as more important than older data [22]. Also, to handle dynamic databases where transactions are frequently deleted, Yun et al. adapted a pre-large method to mine HUIs efficiently while reducing the number of database scans needed to update the results [23]. The SPHUI-Miner algorithm [24], proposed by Bai et al., introduced an efficient and compact data format named HUI-TRPL to reduce memory consumption. The authors also presented two novel data structures, a selective database projection list and a Tail-Count list, to prune the search space and reduce the time needed to scan the database. Recently, Zida et al. introduced the EFIM algorithm [11], which is considered the state-of-the-art algorithm for mining HUIs. The authors presented the sub-tree utility and the local utility, two novel and tighter upper bounds, to efficiently prune generated candidates. Furthermore, two new techniques were presented to deal with long transactions and partially identical transactions, called High-utility Database Projection (HDP) and High-utility Transaction Merging (HTM). By employing these efficient techniques and strategies, EFIM was shown to outperform all previously introduced algorithms. However, EFIM performs a complete database projection for each promising itemset, which requires a long execution time and raises a scalability issue. To take into account the dynamic aspect of real-world transaction databases and mine dynamic profit databases, Nguyen et al. proposed a new utility framework [12]. Furthermore, a novel structure named the P-set was introduced to significantly reduce the cost of database scans. Based on these techniques, the authors extended the EFIM algorithm into a more efficient algorithm, named iMEFIM [12]. Evaluation results have shown that iMEFIM has superior performance and scalability compared to previous methods. The concept of HUIs was also extended in many algorithms. The CPHUI-List algorithm was proposed to efficiently mine closed potential HUIs [25] in uncertain databases without generating candidates. Gan et al. proposed the HUOPM algorithm (high-utility occupancy pattern mining) [26], which takes into account user preferences such as frequency, utility and occupancy when mining HUIs. To prune the search space, several data structures were introduced, such as a novel frequency-utility tree, a utility-occupancy list and a frequency-utility table, along with global and partial downward closure properties. For data obtained from IoT devices, which may contain positive and negative unit utilities as well as uncertainty, Gan et al. proposed an algorithm named HUPNU [27] to discover high-utility patterns having these characteristics. The discovered patterns can be used in many applications, such as intrusion detection, risk prediction and decision-making.
Many efficient data structures have been proposed to improve the performance of the mining process. To speed up the mining of association rules regardless of the algorithm used, Luna et al. presented a novel data structure [28]. This approach is designed to be employed by a wide range of existing methods without changing their original process. Experimental results have shown that the introduced structure improves runtimes by orders of magnitude and significantly reduces memory requirements. Vo et al. adopted a lattice-based approach [29], [30] to mine high-utility association rules [31] and non-redundant high-utility association rules [32].
As presented above, many algorithms have been designed to mine HUIs efficiently. However, these are mainly sequential algorithms, and many of them are time-consuming when applied to specific types of databases, such as high-density databases and databases with long transactions. One approach to enhance the performance of pattern mining is to apply parallel computing models such as distributed, GPU and multi-core computing. With the increased popularity and availability of microprocessors that allow the simultaneous execution of multiple tasks to improve performance [33], many studies have introduced parallel methods for FIM, such as pSPADE [34], Par-CSP [35] and Par-ClosP [36]. Based on their success, it can be expected that adapting HUIM algorithms to run on a multi-core architecture would also greatly boost their performance. Yu et al. introduced an effective load balancing strategy for FIM on a parallel architecture [37], as did Li et al. [33], Negrevergne et al. [38], Schlegel et al. [39] and Negrevergne et al. [40]. Parallel mining has also been applied to correlated pattern mining [41], class association rule mining [42], sequential pattern mining [43], sub-graph mining [44], and frequent sequential pattern mining [45]. However, these techniques have rarely been applied to HUIM. Chen and An introduced a parallelized version of the HUI-Miner algorithm, named PHUI-Miner [46]. The algorithm partitions the search space into sub-spaces that can be explored in parallel, each assigned to a cluster node. This approach increases the overall runtime performance of the mining process. Nguyen et al. extended the EFIM algorithm using multi-core parallelism and proposed the pEFIM algorithm [47].

III. PRELIMINARIES AND PROBLEM STATEMENT

A. PROBLEM STATEMENT
Let there be a dynamic profit database D and a user-specified minimum utility threshold (minutil). The problem of mining high-utility itemsets in D is to efficiently discover all the itemsets in D having a utility that is no less than minutil.

B. PRELIMINARIES
A transaction database D is a set of transactions, denoted as $D = \{T_1, T_2, \ldots, T_n\}$. Let I be the finite set of all distinct items contained in D, and let X be an itemset ($X \subseteq I$). A transaction $T_c$ ($T_c \subseteq I$, $T_c \in D$) has a unique transaction identifier (TID) c. Consider a single item i ($i \in I$) contained in a transaction $T_c$. The quantity of i in $T_c$ is denoted as $q(i, T_c)$. Furthermore, the unit profit of i in D is denoted as $pr(i)$. Many traditional approaches for HUIM assume that unit profits remain the same for all transactions of a database D.
Definition 1: The utility of an item i in a transaction $T_c$, denoted as $u(i, T_c)$, is calculated as $u(i, T_c) = q(i, T_c) \times pr(i)$ [11].

Definition 2: The transaction utility of a transaction $T_c$, denoted as $u(T_c)$, is calculated as $u(T_c) = \sum_{i \in T_c} u(i, T_c)$ [11].

Definition 3: Let X be an itemset contained in a transaction $T_c$. The utility of X in $T_c$, denoted as $u(X, T_c)$, is computed as $u(X, T_c) = \sum_{i \in X} u(i, T_c)$ [11].

Definition 4: The utility of X in the database D, denoted as $u(X)$, is defined as $u(X) = \sum_{X \subseteq T_c \wedge T_c \in D} u(X, T_c)$ [11].

Definition 5: An itemset X is an HUI if and only if $u(X) \geq minutil$; otherwise, X is a low-utility itemset.
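To see what HUIM algorithms actually compute, Definitions 1, 3 and 4 can be combined into a single double sum:

$$u(X) \;=\; \sum_{T_c \in D \,\wedge\, X \subseteq T_c} \;\sum_{i \in X} q(i, T_c) \times pr(i).$$

For the running example, this yields $u(\{b,c\}) = 18$ over T1, T2 and T5, and $u(\{d,e\}) = 22$ over its two containing transactions, matching the values given in the introduction.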
A problem with the above definition of an item's utility is that it assumes that unit profits are fixed, which does not reflect the true nature of real-world databases. To address this issue, Nguyen et al. have introduced a new framework for handling dynamic profit databases [12]. With this new framework, the utility of a single item i appearing in a transaction $T_c$ is defined as follows.
Definition 6: The utility of an item i in a transaction $T_c$ is defined as $u(i, T_c) = q(i, T_c) \times pr(i, T_c)$, where $pr(i, T_c)$ represents the unit profit of item i when appearing in transaction $T_c$.
By employing this new framework, all existing HUIM algorithms can now be applied to this new type of database, where unit profits may vary. For instance, consider item {a} and the database of Table 3. The item {a} appears in T2, T3, T4 and T5, so its utility is calculated as $u(\{a\}) = q(a, T_2) \times 1.0 + q(a, T_3) \times 1.1 + q(a, T_4) \times 1.1 + q(a, T_5) \times 1.2$. To minimize the number of database scans needed for mining HUIs, Nguyen et al. proposed the concept of the P-set and used it to limit the number of scanned transactions when considering an itemset X [12].
Definition 7: The P-set of an itemset X, denoted as $P\text{-}set(X)$, is defined as $P\text{-}set(X) = \{c \mid X \subseteq T_c\}$. For instance, for the database depicted in Table 3, $P\text{-}set(\{a\}) = \{2, 3, 4, 5\}$, since item a appears in transactions T2, T3, T4 and T5. Furthermore, Nguyen et al. also proposed a more compact representation of the dynamic database D [12], which is shown in Table 4. This new representation helps reduce the required storage space and the time needed to calculate the utility of single items. During the HUIM process, the TWDC model is used to safely reduce the search space by pruning unpromising candidates. This model was widely applied in previous studies [8], [9], [15]–[18]. The TWU of an itemset X is defined as $TWU(X) = \sum_{X \subseteq T_c \wedge T_c \in D} u(T_c)$ [8]; since $u(X) \leq TWU(X)$, any itemset whose TWU is lower than minutil can be safely pruned together with all its supersets.
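Below is a minimal sketch, in Java, of how per-transaction profits and a P-set index work together: the utility of an itemset is computed by scanning only the transactions whose TIDs appear in its P-set, rather than the whole database. The data layout is an assumption made for illustration; the actual compact format of [12] is the one shown in Table 4.

```java
import java.util.*;

// Sketch: dynamic profits (pr depends on the transaction) plus a P-set index.
// The data layout is assumed for illustration; [12] uses a more compact format.
public class PSetSketch {
    // For each transaction: item -> {quantity, unit profit in that transaction}.
    record Entry(int qty, double profit) {}
    static List<Map<Character, Entry>> db = List.of(
        Map.of('a', new Entry(2, 1.0), 'b', new Entry(1, 2.0)),   // hypothetical T
        Map.of('a', new Entry(3, 1.1), 'd', new Entry(1, 5.0)));  // hypothetical T

    // P-set(X) = set of TIDs (indices into db) of transactions containing X.
    static Set<Integer> pset(Set<Character> x) {
        Set<Integer> tids = new HashSet<>();
        for (int c = 0; c < db.size(); c++)
            if (db.get(c).keySet().containsAll(x)) tids.add(c);
        return tids;
    }

    // u(X) computed by scanning only candidate TIDs (e.g. the P-set of a subset
    // of X when X is being extended); transactions not containing X are skipped.
    static double utility(Set<Character> x, Set<Integer> candidateTids) {
        double u = 0;
        for (int c : candidateTids) {
            Map<Character, Entry> t = db.get(c);
            if (!t.keySet().containsAll(x)) continue;
            for (char i : x) u += t.get(i).qty() * t.get(i).profit(); // q(i,Tc) * pr(i,Tc)
        }
        return u;
    }

    public static void main(String[] args) {
        Set<Character> x = Set.of('a');
        Set<Integer> p = pset(x);
        System.out.println("P-set=" + p + ", u=" + utility(x, p));
        // Key property: P-set(X ∪ {z}) ⊆ P-set(X), so when extending X only the
        // TIDs already in P-set(X) need to be rechecked, not the whole database.
    }
}
```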
Definition 10: Let $\prec$ be a total order on items (e.g., the ascending order of TWU values). The set of all single items that can be used to extend an itemset X, denoted as $E(X)$, is defined as $E(X) = \{r \mid r \in I \wedge r \succ x, \forall x \in X\}$ [11].
However, the TWU is considered not tight enough to prune many candidates, especially in large databases [11]. Thus, Zida et al. proposed two new and tighter upper bounds on the utility measure with respect to an item $z \in E(X)$, namely the sub-tree utility and the local utility of an itemset X, denoted as $su(X, z)$ and $lu(X, z)$, respectively [11]. Based on these two bounds, the definitions of primary and secondary items were also proposed by Zida et al. [11].
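For reference, and paraphrasing [11] (the precise statements are in the original paper), these bounds can be written as follows, where $re(X, T_c) = \sum_{i \in T_c \wedge i \succ x, \forall x \in X} u(i, T_c)$ denotes the remaining utility of X in $T_c$:

$$lu(X, z) = \sum_{T_c \supseteq X \cup \{z\}} \big[\, u(X, T_c) + re(X, T_c) \,\big],$$

$$su(X, z) = \sum_{T_c \supseteq X \cup \{z\}} \Big[\, u(X, T_c) + u(z, T_c) + \sum_{i \in T_c \wedge i \succ z} u(i, T_c) \Big].$$

An item $z \in E(X)$ is secondary if $lu(X, z) \geq minutil$ and primary if $su(X, z) \geq minutil$; only primary items are used to extend X, while secondary items are kept in the projected transactions.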

IV. THE PROPOSED ALGORITHM

A. THE MCH-MINER ALGORITHM
According to Zaki [48], applying parallelism to a data mining process has several benefits but also brings several challenges. A single processor suffers from bottlenecks caused by limited processing power, memory, etc., while multiple processors can overcome these limits by executing tasks in parallel. Multiple processors are now widely available in the form of multi-core processors. Furthermore, many databases are split into several parts and stored on distributed servers; serial data mining algorithms executed on a single-core processor are unsuitable for this situation. However, adapting current data mining methods for parallel execution brings new challenges such as load balancing, communication minimization and synchronization.
Zaki has pointed out three well-known approaches to support parallelism: task parallelism, data parallelism, and hybrid task/data parallelism. The task parallelism model partitions the search space of the algorithm into sub-spaces and assigns each sub-space to a separate processor; all processors operate on the same data. There are two task parallelism approaches: (i) divide and conquer, which divides the search space and assigns each sub-space to a processor that handles it; and (ii) the task queue strategy, which dynamically assigns a small portion of the search space to a processor whenever it becomes available. Data parallelism assigns parts of a database to several processors, which concurrently process them using the same algorithm executed on a distributed computing platform. The hybrid task/data parallelism approach performs a pipeline of tasks, in which each task may use data parallelism and the output of one task is used as the input of the next one.

This paper focuses on task parallelism to improve the performance of the HUI mining process. A novel parallel algorithm, called MCH-Miner, is designed. The aim is to put to use the current generation of multi-core processors to speed up the pattern mining process. The algorithm adopts a divide-and-conquer strategy, which was deemed the most suitable for this problem. Consider the four items a, b, c and d from the running example. Figure 1 illustrates the search space of the sequential HUIM algorithm iMEFIM as a set-enumeration tree; it contains 15 non-empty itemsets. The proposed MCH-Miner algorithm for efficiently mining HUIs in a transaction database is presented next. MCH-Miner extends the iMEFIM algorithm to explore the search space in parallel. The inputs of the algorithm are a transaction database D and a minimum utility threshold, and the output is the set of all HUIs in D. To employ a task parallelism approach, MCH-Miner partitions the search space into sub-spaces, which are then processed as parallel tasks, as illustrated in Figure 2. A sub-space contains a 1-itemset (a single item) and its extensions.
Initially, the algorithm calculates P(X), S(X) and P-set(X), considering X as ∅ (P(X) and S(X) denote the primary and secondary items of X, respectively). Then, for each 1-itemset found in P(X), the algorithm treats the item as a separate task to be concurrently explored using a depth-first search (DFS). As seen in Figure 2, all sub-spaces are non-overlapping, since iMEFIM is a DFS-based algorithm. Furthermore, each task maintains its own copies of P(X), S(X) and P-set(X); thus, MCH-Miner is thread-safe. The pseudo-code is given in Algorithm 1. A major change in MCH-Miner compared to iMEFIM is in lines #9 to #12, where every call of the Search function for the sub-space of an item i is executed in parallel. For each sub-space, the Search function is invoked to recursively explore that space using a DFS. The function extends the itemset being processed and determines whether it is an HUI or not. If the itemset is an HUI, it is added to a private list of discovered HUIs for that sub-space. Each task allocates its own list to store the HUIs discovered during the mining process. After a task has completely processed a sub-space, it terminates and returns this list. The main algorithm then aggregates all the returned lists to output the complete set of HUIs in D, as sketched after this subsection.

Many parallel algorithms apply load balancing. It consists of assigning approximately equal amounts of work (tasks) to each process (e.g., processor) to ensure that all processes are kept busy. This can be viewed as a problem of minimizing idle time. Load balancing is important for parallel algorithms since it can improve the overall performance when many activities can be processed concurrently. Activities may be assigned to separate processes on different machines, separate processes on the same machine, or separate threads of the same process.
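The following is a minimal sketch of this task-parallel scheme, written in Java for illustration; the names (search, ParallelSearchSketch, the item list) are assumptions, not the authors' code. It shows one task per promising 1-itemset, a private result list per task, and the aggregation of the returned lists by the main thread.

```java
import java.util.*;
import java.util.concurrent.*;

// Sketch of MCH-Miner's task-parallel scheme: one task per promising 1-itemset,
// each exploring its disjoint sub-space depth-first with its own private state.
public class ParallelSearchSketch {

    // Placeholder for the recursive DFS over one sub-space (the real Search
    // function would prune with the su/lu upper bounds and use P-sets).
    static List<String> search(char item) {
        List<String> localHuis = new ArrayList<>(); // private list: no shared writes
        // ... recursively extend {item} and add every HUI found to localHuis ...
        return localHuis;
    }

    public static void main(String[] args) throws Exception {
        List<Character> primaryItems = List.of('a', 'b', 'c', 'd'); // P(∅), hypothetical
        ExecutorService pool =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

        // One task per sub-space; tasks are independent because the sub-spaces
        // are disjoint and each task keeps its own copies of P(X), S(X), P-set(X).
        List<Future<List<String>>> futures = new ArrayList<>();
        for (char i : primaryItems) {
            futures.add(pool.submit(() -> search(i)));
        }

        // Aggregate the private result lists into the complete set of HUIs.
        List<String> huis = new ArrayList<>();
        for (Future<List<String>> f : futures) {
            huis.addAll(f.get());
        }
        pool.shutdown();
        System.out.println(huis.size() + " HUIs found");
    }
}
```

Because each task writes only to its own list, no locking is needed during the search; synchronization happens once, at aggregation time.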
For the proposed MCH-Miner algorithm, which adopts the multi-core parallel computing model, tasks are executed on a single multi-core processor. Hence, load balancing is quite simple, since all the cores share the same memory space (a shared-memory system) and the tasks to be executed are separate, with disjoint search spaces. These tasks are put into a task pool whose size equals the number of available processor cores. The tasks are then concurrently executed by the processor cores, using the first-come, first-served (FCFS) scheduling strategy, until all tasks have been completed.
In a multi-core implementation of an algorithm, it is good practice to create a task pool whose size equals the number of logical/physical processors available in the system and to execute tasks from within the pool. This avoids exhausting system resources: |P(X)| can become very large when scanning large databases, and executing all the tasks associated with each item i ∈ P(X) at once, without a task pool, would greatly degrade the algorithm's performance.
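Concretely, the bounded pool used in the sketch above can be created as follows (an illustrative fragment, continuing the previous sketch): spawning one thread per item is unbounded, whereas a fixed-size pool queues the excess tasks.

```java
// Unbounded: one thread per item in P(X) may exhaust resources on large databases.
// for (char i : primaryItems) new Thread(() -> search(i)).start();   // avoid this

// Bounded: the pool size matches the number of logical processors; at most that
// many sub-space tasks run at once, the rest wait in the queue and run FCFS.
int cores = Runtime.getRuntime().availableProcessors();
ExecutorService pool = Executors.newFixedThreadPool(cores);
```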

V. EXPERIMENTAL EVALUATION
Experiments were performed to evaluate the performance of the proposed MCH-Miner algorithm. They were designed to compare the performance of iMEFIM with that of its parallel version, MCH-Miner, in terms of execution time and scalability, on several dynamic profit databases. The evaluation was conducted on a personal computer equipped with a dual-core (four logical processors) Intel Core i5 processor at 2.7 GHz. The experiments were carried out using standard databases commonly used in HUIM studies to benchmark the performance of HUIM algorithms: Chainstore, Retail, Foodmart, Kosarak, Mushroom, Accidents, Connect and Chess. They can be obtained from the SPMF open-source data mining library [49] at the following URL: https://bit.ly/2xg2Nzr. For each database, the quantity of each item appearing in each transaction was randomly generated in the [1, 10] interval, while each item's unit profit was randomly generated in the [2, 50] interval; the Retail and Chainstore databases already contain real utility values. To make each database contain dynamic profit values, the utility of each item in each transaction was varied by 2% to 10% compared to the original value. Table 5 presents the characteristics of the databases used in the experiments.
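The fragment below sketches this preparation step. It reflects our reading of the procedure; the text does not specify random seeds, rounding, or the direction of the variation, so those choices are illustrative.

```java
import java.util.Random;

// Sketch of the test-data preparation described above (illustrative only; the
// paper does not specify random seeds, rounding, or the variation's direction).
public class DynamicProfitGen {
    public static void main(String[] args) {
        Random rnd = new Random(42);
        int quantity = 1 + rnd.nextInt(10);                // quantity in [1, 10]
        double baseProfit = 2 + rnd.nextInt(49);           // base unit profit in [2, 50]
        double variation = 0.02 + 0.08 * rnd.nextDouble(); // 2% .. 10%
        double sign = rnd.nextBoolean() ? 1 : -1;          // vary up or down (assumed)
        double dynamicProfit = baseProfit * (1 + sign * variation);
        System.out.printf("q=%d, pr=%.2f%n", quantity, dynamicProfit);
    }
}
```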

A. RUNTIME
These experiments compared the performance of iMEFIM with that of its parallelized version, MCH-Miner. For each database, we varied the minutil threshold and recorded the runtime of each algorithm. Results are presented in Figure 3. The following databases were used: Chainstore, Retail, Foodmart, Kosarak, Chess, Mushroom, Accidents, and Connect, as shown in Figure 3a to Figure 3h, respectively. The experiments were run using 4 logical processor cores. It was found that the parallel MCH-Miner algorithm provides a large performance improvement over its sequential version, the iMEFIM algorithm. For high-density databases such as Accidents, Chess, Mushroom, and Connect, MCH-Miner was up to three times faster than iMEFIM. For large and sparse databases such as Chainstore and Kosarak, the execution time of MCH-Miner was also reduced by up to three times compared to iMEFIM. This can be explained by the fact that iMEFIM already handles HUI mining on sparse databases efficiently, so the speedup gained from multi-core processing is not as high as on high-density databases. Overall, it was observed that the performance of MCH-Miner remains better than that of iMEFIM as the minutil threshold is lowered on each database. Thus, it is concluded that the MCH-Miner algorithm outperforms iMEFIM on all the test databases, especially on high-density databases.

B. SCALABILITY
To further assess the performance of the MCH-Miner algorithm, scalability tests were done on two sparse databases (Chainstore and Retail) and two high-density databases (Mushroom and Accidents). These datasets were replicated to 3 to 5 times their original transaction counts. The suffix of each database name denotes the expanded size: for example, Chainstore3x indicates that the transactions of the Chainstore database were replicated 3 times, increasing their number from 1,112,949 to 3,338,847. Specifically, Chainstore, Retail and Accidents had their transactions multiplied 3 times, and Mushroom 5 times; hence they are named Chainstore3x, Retail3x, Accidents3x and Mushroom5x. In this test, we fixed the minutil threshold on each test database and varied the number of transactions from 25% to 100% of their size. The runtimes of both algorithms on the test databases were recorded and are presented in Figures 4a to 4d. In all tests, MCH-Miner has better scalability with respect to the number of transactions than iMEFIM when using 4 logical processor cores. The scalability of MCH-Miner is almost linear as the number of transactions increases in each database; in the case of the Chainstore database, it is linear. iMEFIM was previously shown to have the best scalability among existing methods [12], and MCH-Miner, being based on iMEFIM, inherits this characteristic. Overall, MCH-Miner was shown to have better scalability than iMEFIM on all the test databases.

VI. CONCLUSIONS AND FUTURE WORK
In this study, we extended the state-of-the-art iMEFIM algorithm for mining HUIs by combining all its techniques and strategies with the multi-core parallel computing model. That model was adapted to exploit the full computing power of the current generation of processors. The result is a new, high-performance algorithm, named MCH-Miner, that mines HUIs in a dynamic profit database in parallel. Extensive experiments were done on eight benchmark datasets. Results have shown that the proposed parallel algorithm is up to three times faster than the sequential iMEFIM algorithm on the test databases when using four processor cores, and that it has better scalability. In the future, we intend to extend the proposed MCH-Miner algorithm using a distributed computing platform such as Apache Hadoop or Spark, to mine HUIs in large-scale databases.