An N-List-Based Approach for Mining Frequent Inter-Transaction Patterns

Mining frequent inter-transaction patterns (ITPs) from large databases is both useful and of interest. Since frequent inter-transaction patterns (FITPs) are discovered across transactions in a transaction database (TD), the number of patterns is very large, and the mining time and memory usage are therefore very high. Although several algorithms have been proposed for mining FITPs, they still require long runtimes and high memory usage. Recent research shows that N-list-based approaches are very efficient for mining frequent patterns (FPs). Therefore, in this paper, we propose an N-list-based algorithm, called NL-ITP-Miner, to mine FITPs. The proposed algorithm exploits the advantages of the N-list structure to build the IT-PPC-tree. While building the IT-PPC-tree, NL-ITP-Miner applies our proposed theorems to eliminate infrequent inter-transaction 1-patterns and thus reduce the search space. NL-ITP-Miner scans the database once to find frequent inter-transaction (FIT) 1-patterns for constructing the IT-PPC-tree; it then traverses this tree to generate the frequent 1-patterns and FIT 1-patterns with their respective N-lists. We also propose effective pruning strategies that help NL-ITP-Miner reduce the search space significantly and generate FITPs more quickly. Experiments show that NL-ITP-Miner outperforms the state-of-the-art algorithms for mining FITPs in terms of runtime and memory usage.


I. INTRODUCTION
The Internet of Things (IoT) has been widely applied in many areas of life, such as healthcare, e-learning, smart cities, smart homes, banking, and other areas. Many IoT applications in businesses and organizations have brought great benefits for management and operations. With the rapid development of IoT, a large number of transactions are being accumulated and the amount of related data is becoming massive in information systems. Therefore, efficient mining methods need to be developed for decision-making in the context of the swift development of IoT. Many researchers have thus focused on investigating efficient data-mining solutions for IoT systems [1], [2].
Moreover, in order for people to understand the past and predict the future from such huge data obtained from IoT applications, it is necessary to develop and apply effective tools to extract useful knowledge, which serve the needs of decision-making and prediction applications. One of the useful tools is data mining, with frequent pattern mining being one of the important approaches. Since Agrawal et al. introduced the Apriori algorithm [3], many different pattern-mining algorithms have been proposed, such as methods based on tree-projection (FP-Growth [4], FP-Growth* [5]), IT-tree (Eclat [6], dEclat [7]), CHARM [8], dCHARM [8], Index-BittableFI [9], DBV-FI [10], DCI_PLUS [11], and DBV-Miner [10].
The associate editor coordinating the review of this manuscript and approving it for publication was Chun-Wei Tsai.
In recent years, many methods have also been proposed to mine various types of frequent patterns, such as NFWI [12] for frequent weighted itemset mining, HMiner-Closed [13], MEFIM and iMEFIM [14], dHAUIM [15], HAUP-growth [16] for mining high-utility patterns, and MPFPS BFS and MPFPS DFS [17] for mining periodic patterns in multiple sequences. Deng et al. proposed the N-List structure and PrePost algorithm [18] to mine frequent patterns efficiently. Many efficient methods for mining frequent patterns based on the N-List have since been developed, such as PrePost+ [19], negFIN [20], DiffNodeSets [21], NAFCP [22], NSFI [23], and INLA-MFP [24]. These proved that using the N-List structure for mining frequent patterns is very effective in terms of runtime and memory usage, because it is compact and computationally efficient. The structure can quickly calculate the support of patterns in linear time. However, most of the methods for mining frequent patterns only focus on mining frequent items happening within the same transactions in the transaction database, i.e., intra-transaction patterns. Therefore, such frequent patterns can only be used to predict rules like R1 but cannot be used to predict rules like R2.

R1: Whenever there is a visit to Tommy's Facebook page, there is also a visit to Bobby's Facebook page and Anna's Facebook page.

R2: If there is a visit to Tommy's Facebook page, then there will be a visit to Bobby's Facebook page 3 minutes later and a visit to Anna's Facebook page 5 minutes later.
Rule R2 is formed from frequent items not only within the same transactions, but also across several different transactions in the database. The frequent patterns mined in such a context are called Frequent Inter-Transaction Patterns (FITP).
So far, several methods have been proposed for mining FITPs, of which ITP-Miner [25] is considered to be the best, outperforming previous algorithms in terms of runtime and memory usage. In 2017 and 2019, Nguyen et al. proposed two efficient algorithms, DITP-Miner [26] and FCITP-Miner [27], to exploit FITPs and closed inter-transaction patterns, respectively. However, there are still many issues with FITP mining that need to be addressed to make it more efficient in terms of runtime and memory usage.
In this article, we propose an efficient algorithm for mining FITPs named NL-ITP-Miner, in which the main contributions are as follows. The proposed method generates inter-transaction 1-patterns consisting of intra-transaction 1-patterns and inter-transaction 1-patterns, and proposes theorems to prune infrequent inter-transaction patterns at the 1-pattern level early and quickly determine the support of inter-transaction patterns. NL-ITP-Miner is proposed based on these theorems and adopts the advantages of the N-List structure. Finally, the effectiveness of the NL-ITP-Miner algorithm is shown by the experimental results obtained from real databases.
The remainder of this article is structured as follows. The next section reviews some of the previous works related to this project. The basic concepts, proposed algorithm NL-ITP-Miner, including several subroutines and illustrations, and experimental results and some discussions are presented in the following sections. The conclusion and directions for future work are then given in the last section.

II. RELATED WORKS
A. MINING INTER-TRANSACTION PATTERNS
Applications of ITP mining to predicting the movements of stock prices were presented by Lu et al. [28], while its application to the study of meteorological data was proposed by Li et al. [29]. Several other algorithms based on Apriori have also been proposed to mine FITPs, such as E/EH-Apriori [30] and FITI [31].
Li et al. [32] expanded the scope of the mined association rules from traditional one-dimensional intra-transaction association rules to multi-dimensional inter-transaction association rules. Lee et al. recently presented two algorithms, ITP-Miner [25] and ICMiner [33], to mine FITPs and frequent closed inter-transaction patterns (FCITPs), respectively. ITP-Miner, based on the IT-Tree [6] and depth-first search (DFS) traversal, mines FITPs. ICMiner, based on the IT-Tree and CHARM [8] properties, mines all frequent closed inter-transaction patterns. Wang et al. proposed the PITP-Miner [34] algorithm, which relies on tree projection [4] to mine the entire set of FITPs in a database. In addition, FITP mining has been applied to mine profit rules from stock databases by Hsieh et al., with approaches such as PRMiner [35], JCMiner and ATMiner [36]. The authors then presented ITR-Miner and NRITR [37] to mine non-redundant inter-transaction association rules. Nguyen et al. recently proposed two methods, DITP-Miner [26] and FCITP-Miner [27], to effectively mine FITPs and frequent closed ITPs.

B. N-LIST STRUCTURE
In 2012, Deng et al. proposed the PrePost algorithm [18], which efficiently mines all frequent patterns from a large TD. The authors also proposed a new structure called the N-list [18], together with the PPC-tree [18], a hybrid tree structure formed from the IT-Tree [6] and FP-Tree [4] structures. The PPC-tree stores information about 1-patterns to calculate their frequency so that infrequent patterns can be pruned quickly. The PrePost algorithm, which uses N-lists to mine FPs (frequent patterns), outperforms dEclat [7] and Eclat_goethals [36], [37]. Using the N-list structure in mining FPs has two advantages: the storage space of an N-list is smaller than that of tidsets [6] or diffsets [7] in the vertical structures, and the supports of patterns and the N-lists of k-patterns are calculated in a simple manner, by traversing N-lists and intersecting them with linear complexity.
Thus, the mining time and memory usage of FP mining can be significantly reduced by using N-list-based approaches.
VOLUME 8, 2020
Several publications on the N-list and on similar structures have appeared in recent years for mining frequent patterns. Deng et al. suggested the Nodeset [38] structure, in which every node stores only one Pre or Post value of the PPC-tree. The FIN algorithm [38] takes advantage of the Nodeset structure for mining FPs, and it is very effective in comparison with the PrePost algorithm. In addition, the PrePost+ algorithm [19] by Deng and Lv relies on both the N-list structure and a new pruning strategy, children-parent equivalence, for mining FPs. PrePost+ [19] is more efficient than both the PrePost and FIN algorithms. Moreover, the NC_set [39] structure derived from the PPC-tree has also been effectively used to solve the problem of mining erasable patterns. Another interesting use of the N-list is an effective method for mining top-rank-k FPs, as proposed by Deng [40] and Huynh et al. [41]. Vo et al. also used the N-list structure in searching for closed patterns and FPs [23], [24]. In 2017, these authors proposed using the N-list structure in the INLA-MFP algorithm [24] to mine maximal patterns, and they later used the N-list structure in the NFWI [12] algorithm to obtain frequent weighted patterns. The negFIN algorithm [20], based on a structure similar to the N-list, was also proposed to quickly mine FPs.
Among the current algorithms for mining FITPs, we have to mention ITP-Miner, PITP-Miner, and DITP-Miner, in which ITP-Miner uses tidsets to store the information of ITPs and performs DFS (Depth First Search) traversing to generate all FITPs in TD. Despite significant improvements, the algorithm still has limitations. Using intersections between tidsets to create new tidsets and calculate the support of new patterns can lead to very high computational costs in the case of large databases with dense items in transactions. The PITP-Miner algorithm invokes multiple recursions to scan projected databases to determine the ITPs' supports and form projected databases in the next levels resulting in consuming numerous resources regarding memory and runtime. The DITP-Miner algorithm uses diffset to store the ITPs' information to calculate the support of ITPs. DITP-Miner only works well in the case of databases with sparse items. In general, the main limitations of current methods of exploiting FITPs are multiple database scans and high computational cost for determining the support of ITPs, especially for databases with a large number of transactions and dense items. Therefore, the existing methods consume too many resources regarding mining time and memory usage.
In this research we thus present a new N-list-based approach named NL-ITP-Miner so that FITPs can be mined efficiently. In addition, experiments are conducted to compare the runtime and memory usage of NL-ITP-Miner with those of ITP-Miner and DITP-Miner to prove the effectiveness of the proposed algorithm.

III. BASIC CONCEPTS
A. MINING FREQUENT INTER-TRANSACTION PATTERNS
This section presents some basic definitions related to ITP mining that describe the process of mining FITPs in a TD.
Definition 1: Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of distinct items, and $T = \{t_1, t_2, \ldots, t_n\}$ be a set of transaction identifiers (tidset). A transaction database $D$ contains a set of transactions in which each transaction consists of a transaction identifier $tid$ and a pattern with distinct items. The transactions of $D$ can be described in the form $\langle tid, T_{tid} \rangle$, where $tid \in T$, $T_{tid} \subseteq I$, and $T_{tid}$ is the pattern occurring at transaction $tid$.
An example transaction database D with the number of transactions |D| = 6 and set of items I = {a,b, c, d} is shown in Table 2. The example transaction database D is used throughout this paper with M = 50% and L = 1. The value of M is defined such that M is a minimum threshold of support value set by the user. The values of L and support are defined in definitions 2 and 8, respectively.
Definition 2 [33]: Let $\langle u, T_u \rangle$ and $\langle v, T_v \rangle$ be two transactions in $D$. The relative distance between $u$ and $v$ is defined as $u - v$, where $u > v$, and $v$ is called the reference point. With respect to $v$, an item $i_k$ at $u$ is called an extended item and denoted as $i_k(u - v)$, where $(u - v)$ is called the Span of the extended item. Similarly, with respect to $v$ (or the transaction at $v$), a transaction $T_u$ at $u$ is called an extended transaction and denoted as $T_u(u - v)$. Therefore, an extended transaction consists of a set of extended items described as $T_u(u - v) = \{i_1(u - v), \ldots, i_s(u - v)\}$, where $s$ is the number of items in $T_u$.
Example 2: In Table 2, the extended transaction of the transaction at tid = 5 regarding the transaction at tid = 1 is {a(4), b(4), d(4)}.
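The construction in Definition 2 can be illustrated with a minimal Python sketch. The function name `extended_transaction` and the tuple encoding `(item, Span)` are illustrative assumptions rather than notation from the paper; the input reproduces Example 2, in which the transaction at tid = 5 contains a, b, and d.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumed encoding: an extended item i(k) is the pair (item name, Span).
ExtendedItem = Tuple[str, int]


def extended_transaction(u: int, items: Iterable[str], v: int) -> FrozenSet[ExtendedItem]:
    """Return T_u(u - v): every item of the transaction at tid u, tagged with Span u - v."""
    if u < v:
        raise ValueError("the reference point v must not come after u")
    span = u - v
    return frozenset((item, span) for item in items)


# Example 2: transaction at tid = 5 is {a, b, d}; the reference point is tid = 1.
t5_from_1 = extended_transaction(5, ["a", "b", "d"], 1)
print(sorted(t5_from_1))
```

The sketch also accepts u = v, which yields extended items with Span 0 such as a(0); this is a convenience of the sketch, chosen so that items at the reference point itself can be encoded in the same way.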
Definition 3 [33]: Let $x_i(\phi_i)$ and $x_j(\phi_j)$ be two extended items.
Definition 4 [33]: An inter-transaction pattern is defined as a set of extended items,
Definition 5 [33]: is also an inter-transaction pattern, where $\phi_1$ is the reference point.
Example 5: Using the example database D in Table 2 with L = 1, we have six mega-transactions as follows
Definition 7 [33]:
Definition 8 [33]: Given a database D, M, and a pattern X, let $M_X$ be the set of mega-transactions in D containing X. The support of X, support(X), is defined as $|M_X|$. If support(X) ≥ M, we can say that X is a FIT pattern.
Example 8: We consider D with M = 50%, L = 1, and X = {b(0), a(1)}. Since the first mega-transaction formed by the first and second transactions, is increased by 1. Also, the mega-transaction formed by the fourth and fifth transactions, (1), contains X. Thus, support(X) is increased by 1. We have support(X) = 1 + 1 = 2. Next, the mega-transaction formed by the fifth and sixth transactions,
Definition 9 (maxpoint): Given D, let x be an item occurring at the transactions with the set of transaction identifiers tidset = $\{T_1, \ldots, T_k\}$. The maxpoint of item x is defined as the order of the last element $T_k$ of tidset, and maxpoint(x) is also support(x).
Example 9: In D shown in Table 2, item a occurs in D at the transactions <1, 2, 3, 4, 5, 6> in turn. According to Definition 9, maxpoint(a) = support(a) = 6. Sim-
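To make the support count of Definition 8 concrete, the following Python sketch builds one mega-transaction per reference point and counts those that contain a pattern. It rests on two stated assumptions: the mega-transaction at reference point v is taken to be the union of the extended items of the transactions at v, v + 1, ..., v + L (the usual sliding-window reading), and `HYPOTHETICAL_DB` is an invented database, not Table 2. The function names are illustrative as well.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

ExtendedItem = Tuple[str, int]  # assumed encoding: (item name, Span)


def mega_transactions(db: Dict[int, Set[str]], max_span: int) -> Dict[int, FrozenSet[ExtendedItem]]:
    """Build one mega-transaction per reference tid v (assumed sliding-window definition)."""
    result = {}
    for v in db:
        window = set()
        for span in range(max_span + 1):
            # Transactions beyond the end of the database contribute nothing.
            for item in db.get(v + span, ()):
                window.add((item, span))
        result[v] = frozenset(window)
    return result


def support(pattern: Iterable[ExtendedItem], megas: Dict[int, FrozenSet[ExtendedItem]]) -> int:
    """Definition 8: the number of mega-transactions that contain the pattern."""
    wanted = frozenset(pattern)
    return sum(1 for mega in megas.values() if wanted <= mega)


# Hypothetical six-transaction database (NOT the paper's Table 2).
HYPOTHETICAL_DB = {
    1: {"a", "b"}, 2: {"a", "c"}, 3: {"a"},
    4: {"a", "b"}, 5: {"a", "b", "d"}, 6: {"a", "d"},
}
megas = mega_transactions(HYPOTHETICAL_DB, max_span=1)
sup = support({("b", 0), ("a", 1)}, megas)
min_count = 0.5 * len(HYPOTHETICAL_DB)  # M = 50% expressed as a count
print(len(megas), sup, sup >= min_count)
```

With L = 1 the sketch produces six mega-transactions for six transactions, and on this hypothetical data the pattern {b(0), a(1)} is contained in the mega-transactions whose reference points are 1, 4, and 5.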

B. N-LIST STRUCTURE
In this section, we present the preliminaries and develop theorems on the N-List structure so that it can help solve the problem of FIT pattern mining.
Theorem 1 [18]: A PP-code $Cod_i$ is an ancestor of another PP-code $Cod_j$ if and only if $Cod_i.Pre \le Cod_j.Pre$ and $Cod_i.Post \ge Cod_j.Post$.
Definition 11 [18] (The N-list of an item): The N-list associated with an extended item $B$, denoted by $NL(B)$, is the set of PP-codes associated with the nodes in the IT-PPC-tree whose Name and Span are equal to the Name of $B$ and the Span of $B$, respectively. Thus $NL(B)$ is equal to
$$NL(B) = \bigcup_{\{N_i \in R \,\mid\, N_i.Name = B.Name \,\wedge\, N_i.Span = B.Span\}} C_i \quad (1)$$
where $C_i$ is the PP-code associated with $N_i$ in the IT-PPC-tree.
$$support(B) = \sum_{C_i \in NL(B)} C_i.support \quad (2)$$
Example 13: We have $NL(B) = \{\langle 2, 1, 1 \rangle, \langle 5, 3, 1 \rangle, \langle 8, 5, 1 \rangle, \langle 10, 7, 2 \rangle\}$ in Example 9. According to Theorem 2, support(B) = 1 + 1 + 1 + 2 = 5.

Definition 12 (The N-list of a 2-pattern):
Let $PX$ and $PY$ be two 1-patterns in the same equivalence class $[P]$ [6] in which $X$ is before $Y$ according to the $I_1$ ordering. $NL(PX)$ and $NL(PY)$ are two N-lists linked with $PX$ and $PY$, respectively. The N-list linked with $PXY$ is determined as follows:
1. For each PP-code $\lambda_i \in NL(PX)$ and $\lambda_j \in NL(PY)$, if $\lambda_i$ is an ancestor of $\lambda_j$ and $\lambda_i.Span = \lambda_j.Span = 0$, the algorithm will add $\langle \lambda_i.pre, \lambda_i.post, \lambda_j.support \rangle$ to $NL(PXY)$.
2. For each PP-code $\lambda_i \in NL(PX)$ and $\lambda_j \in NL(PY)$, if $\lambda_i > \lambda_j$ (according to Definition 3 and Definition 4) and $\lambda_i.Span > \lambda_j.Span = 0$, the algorithm will add $\langle \lambda_i.pre, \lambda_i.post, \lambda_j.support \rangle$ to $NL(PXY)$.
Definition 13 [18] (The N-list of a k-pattern): Let $PX$ and $PY$ be two (k-1)-patterns in the same equivalence class $[P]$ in which $X$ is before $Y$ according to the $I_{k-1}$ ordering. $NL(PX)$ and $NL(PY)$ are two N-lists linked with $PX$ and $PY$, respectively. The N-list linked with $PXY$ is determined as follows:
1. For each PP-code $\lambda_i \in NL(PX)$ and $\lambda_j \in NL(PY)$, if $\lambda_i$ is an ancestor of $\lambda_j$, the algorithm will add $\langle \lambda_i.pre, \lambda_i.post, \lambda_j.support \rangle$ to $NL(PXY)$.
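The ancestor test of Theorem 1, the support sum illustrated in Example 13, and the pairwise rule of Definition 13 can be sketched in Python as follows. This is a direct, quadratic reading of the definition for illustration only, not the authors' MERGING_N-IT_list procedure, and it ignores the Span conditions of Definition 12. Summing entries that share the same (pre, post) pair follows the PrePost convention and is an assumption here; the names `PPCode`, `is_ancestor`, `nlist_support`, and `intersect_naive`, as well as the two input N-lists, are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple


class PPCode(NamedTuple):
    pre: int
    post: int
    support: int


def is_ancestor(a: PPCode, b: PPCode) -> bool:
    """Theorem 1: a is an ancestor of b iff a.pre <= b.pre and a.post >= b.post."""
    return a.pre <= b.pre and a.post >= b.post


def nlist_support(nlist: List[PPCode]) -> int:
    """Support of a pattern: the sum of the supports stored in its N-list."""
    return sum(code.support for code in nlist)


def intersect_naive(nl_px: List[PPCode], nl_py: List[PPCode]) -> List[PPCode]:
    """Definition 13, read literally: test every pair of PP-codes."""
    merged: Dict[Tuple[int, int], int] = {}
    for x in nl_px:
        for y in nl_py:
            if is_ancestor(x, y):
                # Keep the ancestor's (pre, post) and accumulate the descendant's support.
                key = (x.pre, x.post)
                merged[key] = merged.get(key, 0) + y.support
    return [PPCode(pre, post, sup) for (pre, post), sup in sorted(merged.items())]


# N-list taken from Example 13.
nl_b = [PPCode(2, 1, 1), PPCode(5, 3, 1), PPCode(8, 5, 1), PPCode(10, 7, 2)]
print(nlist_support(nl_b))

# Hypothetical N-lists of two patterns PX and PY.
nl_px = [PPCode(1, 3, 4), PPCode(5, 5, 2)]
nl_py = [PPCode(2, 0, 1), PPCode(4, 1, 3), PPCode(6, 4, 2)]
nl_pxy = intersect_naive(nl_px, nl_py)
print(nl_pxy, nlist_support(nl_pxy))
```

In the hypothetical input, the first PP-code of PX is an ancestor of the first two PP-codes of PY, and the second is an ancestor of the third, so the resulting N-list has two entries.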

IV. PROPOSED NL-ITP-MINER ALGORITHM
A. IT-PPC-TREE CONSTRUCTION
In this section, we continue to propose theorems and adopt the N-List structure to construct the IT-PPC-tree.
Definition 14: An IT-PPC-tree is a tree structure denoted by R. Each node in the tree consists of six values, Name, support, childnodes, Pre, Post and Span, in which Name is the name of the 1-pattern, support is the frequency of the 1-pattern, childnodes is the set of child nodes of the node, Pre and Post are the order numbers of the node when the IT-PPC-tree is traversed in pre-order and post-order, respectively, and Span is the relative distance between the transaction containing the 1-pattern of the node and the reference point. The IT-PPC-tree construction algorithm can be seen in Figure 6.
We can say that x(0) is the ancestor of x(k) and support(x(0)) ≥ support(x(k)).
Proof: Based on Definition 3, we have x(0) < x(k). Therefore, the position of transactions containing x(0) always stands before the position of transactions containing x(k) in the database D. Let tidset(x(0)) = (according to Definition 2), thus, |tidset(x(0))| ≥ |tidset(x(k))| ⇒ support(x(0)) ≥ support(x(k)). When the IT-PPC-tree is constructed, the items in D are sorted in descending order of support. Therefore, the items with high support will be inserted into the tree at the nodes which are closer to the root, R. Therefore, item x(0) is the ancestor of item x(k). Theorem 3 is proven.
Example 16: With regard to the IT-PPC-tree shown in Figure 8, it can be seen that support(a(0)) = 6 > support(a(1)) = 5 and a(0) always appears before a(1) in the tree.

Example 17: In
To illustrate the working process of the IT-PPC-tree construction algorithm shown in Figure 6, the example database D shown in Tables 2 and 3 is used (after deleting infrequent items and sorting frequent items according to their support) with M = 3 and L = 1. First, the IT-PPC-tree construction algorithm scans the database to find frequent 1-patterns and their supports, which are also their maxpoints (line 1). Therefore, the maxpoints of items a, b, c, and d are 6, 3, 2, and 4, respectively. In the same way, we have item d(1), and it is also inserted into the IT-PPC-tree. The tree for the mega-transaction of the first two transactions of D is shown in Figure 1.
Likewise, the tree for the mega-transaction of the second and third transactions is also shown in Figure 1. The next megatransactions are processed similarly and the resulting trees are shown in Figures 2, 3, 4, and 5. The dotted rectangles in Figures 2 and 3 indicate that these nodes have been pruned by Theorem 4, which is one of the pruning techniques proposed in this study.
Line 19 traverses the IT-PPC-tree to generate Pre- and Post-order values. Figure 8 shows the IT-PPC-tree generated from the example database D. Each node in the IT-PPC-tree has five values: the name of the item, its Span, its support, and its Pre- and Post-values. For example, the node {a(1), 1} (3, 1) in Figure 8, read from left to right, has node.Name = a, node.Span = 1, node.support = 1, node.Pre = 3, and node.Post = 1.
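The node layout and the Pre/Post numbering described above can be sketched in Python as a prefix tree over extended items. This is a simplified illustration under stated assumptions: the input mega-transactions are hypothetical and already filtered and sorted, the root receives Pre = 0 as in the PrePost convention, and the pruning of Theorem 4 is not modelled. The names `Node`, `insert`, `assign_codes`, and `collect_nlists` are not from the paper.

```python
from typing import Dict, List, Optional, Tuple

ExtendedItem = Tuple[str, int]  # assumed encoding: (Name, Span)


class Node:
    def __init__(self, name: Optional[str] = None, span: int = 0):
        self.name = name
        self.span = span
        self.support = 0
        self.children: Dict[ExtendedItem, "Node"] = {}
        self.pre = -1
        self.post = -1


def insert(root: Node, items: List[ExtendedItem]) -> None:
    """Insert one sorted mega-transaction as a path, incrementing supports along it."""
    node = root
    for name, span in items:
        child = node.children.get((name, span))
        if child is None:
            child = Node(name, span)
            node.children[(name, span)] = child
        child.support += 1
        node = child


def assign_codes(root: Node) -> None:
    """Number the nodes in pre-order and post-order with a single depth-first pass."""
    counters = {"pre": 0, "post": 0}

    def visit(node: Node) -> None:
        node.pre = counters["pre"]
        counters["pre"] += 1
        for child in node.children.values():
            visit(child)
        node.post = counters["post"]
        counters["post"] += 1

    visit(root)


def collect_nlists(root: Node) -> Dict[ExtendedItem, List[Tuple[int, int, int]]]:
    """Group (Pre, Post, support) triples by (Name, Span), in ascending Pre order."""
    nlists: Dict[ExtendedItem, List[Tuple[int, int, int]]] = {}

    def visit(node: Node) -> None:
        if node.name is not None:
            nlists.setdefault((node.name, node.span), []).append((node.pre, node.post, node.support))
        for child in node.children.values():
            visit(child)

    visit(root)
    return nlists


# Hypothetical, already sorted mega-transactions (not the paper's example database).
root = Node()
for mega in ([("a", 0), ("b", 0), ("a", 1)], [("a", 0), ("a", 1)], [("a", 0), ("b", 0)]):
    insert(root, mega)
assign_codes(root)
nlists = collect_nlists(root)
print(nlists)
```

Because the triples are collected during a pre-order walk, every N-list comes out sorted by Pre, which is the ordering that linear-time N-list intersection relies on.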

B. NL-ITP-MINER ALGORITHM
In this section we present the core of the proposed algorithm and related procedures. At the end of this section, we also describe the performance of the NL-ITP-Miner algorithm in detail and illustrate the algorithm with an example.
First, a hash table, H1, is initialized (line 1). The NL-ITP-Miner algorithm constructs the IT-PPC-tree (line 2) and calls the procedure GENERATE_N-IT-list to generate the N-lists from this tree (line 3). In GENERATE_N-IT-list, we use another hash table in which the keys are the Span values and the values are the lists of FIT 1-patterns with the same Span value, so that we can create a list of lists of FIT 1-patterns. The pseudocode of the procedure GENERATE_N-IT-list is shown in Figure 7. Then, the algorithm selects all FIT 1-patterns with Span = 0 from H1 and sorts them in descending order of support (lines 4 and 5). According to Theorem 3, Definition 12, and Definition 13, the algorithm is divided into two parts (lines 6 to 31) to generate inter-transaction patterns whose supports are checked by the function MERGING_N-IT_list in Figure 11 (line 10 or line 23 of Figure 13). The inter-transaction patterns are added to the list of inter-transaction 2-patterns, FIsNext (line 16 or line 29), if their supports meet M. The algorithm then checks whether the list FIsNext is null. If so, NL-ITP-Miner returns to line 6 and continues to process the next branch with the above steps. If FIsNext is not null, it calls the procedure DFS_ITPs with the FIsNext parameter (line 34). The procedure DFS_ITPs in Figure 12 runs recursively until no further FIT pattern is generated; the algorithm then returns to line 6 and processes the next branch in the same way. In this manner, DFS_ITPs extends the search tree to discover all FIT k-patterns. The algorithm terminates when all tree nodes have been traversed. In Figures 9 and 10, the dotted rectangles indicate nodes that have been pruned because they do not meet the threshold M. All FITPs found are shown in Figure 10.
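The Span-keyed hash table and the selection of the Span = 0 seed patterns described above might look as follows in Python. This is only a sketch of the bookkeeping, not the GENERATE_N-IT-list procedure of Figure 7. The supports of a(0), b(0), c(0), d(0), and a(1) are the values quoted in the text; those of b(1) and d(1) are hypothetical, as are the names `group_by_span` and `span_zero_seeds`.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

ExtendedItem = Tuple[str, int]  # assumed encoding: (Name, Span)


def group_by_span(supports: Dict[ExtendedItem, int], min_count: int) -> Dict[int, List[ExtendedItem]]:
    """Hash table keyed by Span; each value lists the frequent 1-patterns with that Span."""
    by_span: Dict[int, List[ExtendedItem]] = defaultdict(list)
    for pattern, sup in supports.items():
        if sup >= min_count:
            by_span[pattern[1]].append(pattern)
    return dict(by_span)


def span_zero_seeds(by_span: Dict[int, List[ExtendedItem]], supports: Dict[ExtendedItem, int]) -> List[ExtendedItem]:
    """Span = 0 patterns in descending order of support (ties broken by name)."""
    return sorted(by_span.get(0, []), key=lambda p: (-supports[p], p[0]))


supports = {
    ("a", 0): 6, ("b", 0): 3, ("c", 0): 2, ("d", 0): 4,
    ("a", 1): 5,
    ("b", 1): 2, ("d", 1): 3,  # hypothetical values
}
by_span = group_by_span(supports, min_count=3)
seeds = span_zero_seeds(by_span, supports)
print(by_span)
print(seeds)
```

Starting each branch of the search from a Span = 0 pattern reflects the role of the reference point: every inter-transaction pattern contains at least one extended item with Span 0.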
In the proposed algorithm, NL-ITP-Miner, we adopted the compact N-list structure to store the information needed to quickly determine the support of FITPs at low computational cost, with linear complexity. In addition, we also proposed a pruning strategy for effectively pruning infrequent patterns at the 1-pattern level. Using the N-list structure and effective pruning strategies helps to substantially reduce mining time and memory usage during the mining process. The other algorithms for mining FITPs or FCITPs, such as ITP-Miner, PITP-Miner, DITP-Miner, and ICMiner, use other structures (tidsets, diffsets, projected databases) during the mining process and need a lot of resources to calculate the supports of ITPs. This is especially true of PITP-Miner, whose projected databases result in multiple database scans. For big databases containing a large number of transactions and items, multiple database scans are not feasible. Therefore, the proposed algorithm is more efficient than the above algorithms for mining FITPs in terms of mining time and memory usage.

C. AN ILLUSTRATION EXAMPLE
The algorithm NL-ITP-Miner is also illustrated by the example database D, M = 50%, and L = 1.
We utilize a hash table, H1, to contain all FIT 1-patterns. The algorithm checks whether the set of FITPs of FIsNext is empty or not. If so, NL-ITP-Miner jumps to the next step. The progress of the proposed algorithm with the example database is shown in Figures 9 and 10. NL-ITP-Miner also performs the DFS_ITPs function in a recursive manner to generate FITPs at the next levels. The final result of the NL-ITP-Miner algorithm for mining all FITPs from D with M = 50% and L = 1 is shown in Figure 10.

A. CHARACTERISTICS OF EXPERIMENTAL DATABASES
The algorithms used in the experiments were written in Visual C# 2019 and tested on a computer with the following specifications: Intel(R) Core(TM) i7-8565U CPU @ 1.80 GHz, 20 GB RAM, running Windows 10. Experimental datasets were obtained from the Frequent Itemset Mining Dataset Repository (http://fimi.ua.ac.be/data/) and their characteristics are presented in Table 5. We compare the proposed algorithm with the DITP-Miner and ITP-Miner algorithms regarding mining time and memory usage. Throughout the experimental evaluations, we varied the parameters M and L in order to accurately assess the effectiveness of the algorithms used in the tests. Moreover, scalability tests were also conducted to assess the performance of the proposed algorithm on large databases.
1-patterns and all the FIT 1-patterns at the 1-pattern level, the supports of patterns are then quickly determined based on the N-List structure. Therefore, it is faster than DITP-Miner and ITP-Miner in most cases with dense databases, as shown in Figures 14-15 with the Chess and Connect databases, respectively. However, on sparse databases with a large number of transactions, such as T40I10D100K, T10I4D100K, Retail, and Chainstore, the proposed algorithm runs only slightly faster than ITP-Miner and DITP-Miner. As M decreases and L increases, the number of frequent patterns increases, and NL-ITP-Miner runs significantly faster than ITP-Miner and DITP-Miner, as shown in Figures 16-19, which are the runtime experiments with the Chainstore, Retail, T10I4D100K, and T40I10D100K databases, respectively.
In addition, NL-ITP-Miner applies the N-list structure to accelerate the calculation of the support when combining patterns to find all FITPs. The N-list structure contains very compact information, namely the Pre, Post and support values. While combining two tidsets in ITP-Miner or two diffsets in DITP-Miner is often a process with exponential complexity, the combination of two N-lists in NL-ITP-Miner to generate another N-list is performed in a simpler manner with a time complexity of O(m + n), where m and n are the lengths of the first and second N-lists, respectively. Thanks to the above advantages, in conjunction with the application of Theorem 3 and Theorem 4 in the mining process, and given the experimental results, NL-ITP-Miner can be considered the best algorithm for mining FITPs regarding runtime.
NL-ITP-Miner avoids generating infrequent inter-transaction patterns at the 1-item level and thus reduces the search space. Therefore, NL-ITP-Miner generally requires less memory than DITP-Miner and ITP-Miner. From these experimental results, it is easy to observe that NL-ITP-Miner is the best algorithm for mining FITPs regarding memory usage.
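The O(m + n) bound quoted above can be illustrated with the two-pointer merge used by PrePost-style algorithms. The Python sketch below assumes that both N-lists are sorted by ascending Pre and that no two PP-codes within one list are in an ancestor relation; it covers only the plain ancestor rule of Definition 13 and is not the authors' MERGING_N-IT_list, which also handles Span conditions. The function name and the input data are hypothetical.

```python
from typing import List, Tuple

PPCode = Tuple[int, int, int]  # (Pre, Post, support)


def intersect_linear(nl_x: List[PPCode], nl_y: List[PPCode]) -> List[PPCode]:
    """Two-pointer N-list intersection; each step advances i or j, so O(m + n)."""
    out: List[PPCode] = []
    i = j = 0
    while i < len(nl_x) and j < len(nl_y):
        x_pre, x_post, _ = nl_x[i]
        y_pre, y_post, y_sup = nl_y[j]
        if x_pre < y_pre:
            if x_post > y_post:
                # nl_x[i] is an ancestor of nl_y[j]: keep x's codes, add y's support.
                if out and out[-1][0] == x_pre and out[-1][1] == x_post:
                    out[-1] = (x_pre, x_post, out[-1][2] + y_sup)
                else:
                    out.append((x_pre, x_post, y_sup))
                j += 1
            else:
                # The subtree of nl_x[i] ends before nl_y[j]; no later y can fall under it.
                i += 1
        else:
            # nl_y[j] precedes nl_x[i] in pre-order, so no remaining x is its ancestor.
            j += 1
    return out


# Hypothetical N-lists, sorted by Pre.
nl_x = [(1, 3, 4), (5, 5, 2)]
nl_y = [(2, 0, 1), (4, 1, 3), (6, 4, 2)]
nl_xy = intersect_linear(nl_x, nl_y)
print(nl_xy, sum(code[2] for code in nl_xy))
```

Since all matches for a given ancestor are found consecutively, entries with the same (Pre, Post) pair can be merged on the fly without a second pass.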

D. SCALABILITY
The scalability tests are performed by varying the numbers of transactions from the Chainstore database, the largest
is significantly better than ITP-Miner and DITP-Miner in terms of runtime and memory usage, especially when we increase the threshold L (maxSpan) and decrease the threshold M (minSup).
In the case of sparse datasets (T10I4D100K, T40I10D100K, Chainstore, and Retail) with a small number of items in each transaction, the runtime and memory usage
DITP-Miner in most cases in terms of runtime and memory usage.

VI. CONCLUSION AND FUTURE WORK
In this article, we have presented an efficient N-list-based approach for mining FITPs from large TDs. We have adopted the N-list structure to build the IT-PPC-tree and thereby address the problem of mining FITPs. In addition, we have proposed theorems and pruning techniques to substantially reduce the search space and quickly compute the support of ITPs. The effectiveness of the NL-ITP-Miner algorithm has been demonstrated through experiments on several databases commonly used in data mining studies. The results indicate that the proposed algorithm performs better than the previous ones, ITP-Miner and DITP-Miner, requiring less execution time and less memory.
In future work, we will focus on efficient approaches for mining FIT weighted patterns and inter-transaction weighted closed patterns, mining FITPs in incremental databases, mining FIT weighted patterns in incremental databases, and mining inter-transaction high-utility patterns.
NGOC-THANH NGUYEN (Senior Member, IEEE) is currently a Full Professor with the Wroclaw University of Science and Technology and the Head of the Applied Informatics Department, Faculty of Computer Science and Management. His scientific research interests include collective intelligence, knowledge integration methods, inconsistent knowledge processing, and multi-agent systems. He has edited more than 30 special issues of international journals, 52 books, and 35 conference proceedings. He is the author or a coauthor of five monographs and more than 350 journal and conference papers. He serves as a member of the Council of Scientific Excellence of Poland, a member of the Committee on Informatics of the Polish Academy of Sciences, and an Expert of the National Center of Research and Development and the European Commission in evaluating research projects in several programs such as the Marie Sklodowska-Curie Individual Fellowships, FET, and EUREKA. He has given 22 plenary and keynote speeches at international conferences, and more than 40 invited lectures in many countries. In 2009, he was granted the title of Distinguished Scientist of ACM. He was also a Distinguished Visitor of the IEEE and a Distinguished Speaker of ACM. He has been the General Chair or the Program Chair of more than 40 international conferences. He also serves as the Chair of the IEEE SMC Technical Committee on Computational Collective Intelligence. He serves as the Editor-in-Chief for the International Journal of Information and Telecommunication (Taylor and Francis), Transactions on Computational Collective Intelligence (Springer), and the Vietnam Journal of Computer Science (World Scientific). He is also the Associate Editor-in-Chief of several prestigious international journals, among others, the Journal of Intelligent and Fuzzy Systems and Applied Intelligence.
TRINH D. D. NGUYEN received the B.Sc. and M.Sc. degrees in information technology and computer science from Ho Chi Minh Open University, Military Technology Academy, in 2001 and 2017, respectively. He is currently pursuing the Ph.D. degree with the University of Information Technology, VNU-HCM, Vietnam. He is a Researcher with Duy Tan University. His research interests include association rules, classification, high-utility pattern mining, frequent pattern mining, machine learning, and distributed computing.