Data sets containing a combination of categorical and continuous variables (mixed data sets) are difficult to analyse because no generalized similarity measure exists for categorical variables. Quantifying the categorical variables makes it possible to analyse this type of data using techniques designed for numerical data. This paper presents a quantification process for categorical variables in mixed data sets that incorporates information on relationships among the continuous variables and draws on the domain knowledge of a user. An interactive visualization environment, with parallel coordinates as the visual interface, lets the user control the quantification process and analyse the result. The efficiency of the approach is demonstrated using two mixed data sets.
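
To make the idea of quantification concrete, the following is a minimal sketch and not the method of the paper. It assumes a deliberately simple rule: each category receives the mean standardized value of the continuous variables over the rows that carry it, and an optional user-supplied mapping overrides the computed values as a stand-in for domain knowledge. The function name `quantify_categories` and its parameters are hypothetical.

```python
from statistics import mean, pstdev


def quantify_categories(categories, continuous_columns, user_values=None):
    """Map each category label to a number (illustrative assumption only).

    categories: list of labels, one per row.
    continuous_columns: list of columns, each a list of floats with one
        entry per row.
    user_values: optional dict of label -> number that overrides the
        computed value (a stand-in for user domain knowledge).
    """
    # Standardize each continuous column so no variable dominates by scale.
    z_columns = []
    for col in continuous_columns:
        mu, sigma = mean(col), pstdev(col)
        z_columns.append([(x - mu) / sigma if sigma > 0 else 0.0 for x in col])

    # One score per row: the average of its standardized values.
    row_scores = [mean(vals) for vals in zip(*z_columns)]

    # One value per category: the average score of its rows.
    by_category = {}
    for label, score in zip(categories, row_scores):
        by_category.setdefault(label, []).append(score)
    values = {label: mean(scores) for label, scores in by_category.items()}

    # User-supplied values take precedence over computed ones.
    if user_values:
        values.update(user_values)
    return values


# Toy data: category "a" co-occurs with low values, "b" with high values.
cats = ["a", "a", "b", "b"]
height = [1.0, 1.0, 3.0, 3.0]
print(quantify_categories(cats, [height]))
```

Once the categories are mapped to numbers in this way, the categorical variable can be drawn as an ordinary numerical axis, for example in a parallel coordinates plot, with the spacing between categories reflecting how they relate to the continuous variables.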