Text mining concerns the discovery of knowledge from unstructured textual data. One important task is the discovery of rules that relate specific words and phrases. Textual entries in many database fields exhibit minor variations that may prevent mining algorithms from discovering important patterns. These variations can arise from typographical errors, misspellings, and abbreviations, as well as from other sources such as ambiguity. Ambiguity may be due to the derivation feature, which is very common in the Arabic language. This paper introduces a new system developed to discover soft-matching association rules using a similarity measure based on the derivation feature of the Arabic language. In addition, it presents the features of using the Frequent Closed Itemsets (FCI) concept, rather than Frequent Itemsets (FI), in mining the association rules.
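To make the distinction between FI and FCI concrete, the following is a minimal brute-force sketch and not the system described in this paper. It assumes the standard definition of a closed itemset (a frequent itemset with no proper superset of equal support). The function names `frequent_itemsets` and `closed_itemsets`, the toy transactions, and the support threshold are all hypothetical, chosen only for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that appears in
    at least `min_support` transactions (exhaustive search, toy data only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with the same support.
    A superset with equal support is itself frequent, so checking against
    the frequent itemsets is sufficient."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


# Hypothetical transactions; each letter stands in for a word or phrase.
transactions = [
    frozenset({"a", "b", "c"}),
    frozenset({"a", "b"}),
    frozenset({"a", "b", "d"}),
    frozenset({"c", "d"}),
]

fi = frequent_itemsets(transactions, min_support=2)
fci = closed_itemsets(fi)

print(len(fi), "frequent itemsets")
print(len(fci), "frequent closed itemsets")
```

In this toy data, `{a}` and `{b}` are frequent but not closed, because `{a, b}` occurs in exactly the same transactions. The closed set is therefore smaller while still preserving the support of every frequent itemset.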