Linguistic summarization (LS) is a data mining or knowledge discovery approach for extracting patterns from databases. Many authors have used this technique to generate summaries like “Most senior workers have high salary,” which can be used to better understand and communicate about data; however, few have used it to generate IF-THEN rules like “IF X is large and Y is medium, THEN Z is small,” which not only facilitate understanding and communication of data but can also be used in decision-making. In this paper, an LS approach to generate IF-THEN rules for causal databases is proposed. Both type-1 and interval type-2 fuzzy sets are considered. Five quality measures are defined: the degrees of truth, sufficient coverage, reliability, outlier, and simplicity. Among them, the degree of reliability is especially valuable for finding the most reliable and representative rules, and the degree of outlier can be used to identify outlier rules and data for closer investigation. An improved parallel coordinates approach for visualizing the IF-THEN rules is also proposed. Experiments on two datasets demonstrate our LS and rule visualization approaches. Finally, the relationships between our LS approach and the Wang-Mendel (WM) method, perceptual reasoning, and granular computing are pointed out.
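To make the notion of a degree of truth concrete, the following minimal sketch evaluates the summary “Most senior workers have high salary” with type-1 fuzzy sets, using the classic Zadeh-style ratio of fuzzy cardinalities. It is an illustration only: the abstract does not state the paper's exact definitions, and the membership functions, their breakpoints, and the names `ramp_up`, `senior`, `high_salary`, `most`, and `truth_degree` are assumptions made here.

```python
def ramp_up(x, lo, hi):
    """Piecewise-linear membership: 0 at or below lo, 1 at or above hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)


# Hypothetical type-1 fuzzy sets; the breakpoints are illustrative assumptions.
def senior(age):
    return ramp_up(age, 40, 55)


def high_salary(salary_k):
    # Salary expressed in thousands.
    return ramp_up(salary_k, 60, 100)


def most(ratio):
    # Fuzzy quantifier "most" over a proportion in [0, 1].
    return ramp_up(ratio, 0.3, 0.8)


def truth_degree(records):
    """Truth of "Most senior workers have high salary".

    records is an iterable of (age, salary_k) pairs. The summary's truth is
    the quantifier's membership in the ratio of the fuzzy cardinality of
    "senior AND high salary" (AND taken as min) to that of "senior".
    """
    records = list(records)
    numerator = sum(min(senior(a), high_salary(s)) for a, s in records)
    denominator = sum(senior(a) for a, _ in records)
    if denominator == 0:
        return 0.0  # no senior workers, so the summary has no support
    return most(numerator / denominator)
```

With two fully senior workers of whom only one has a fully high salary, the ratio is 0.5, which the assumed “most” quantifier maps to a truth degree of 0.4. The other quality measures named in the abstract would need the paper's own definitions and are not sketched here.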