Comparing Clusterings Using Bertin's Idea

Authors: Pilhofer, A. (Univ. of Augsburg, Augsburg, Germany); Gribov, A.; Unwin, A.

Classifying a set of objects into clusters can be done in numerous ways, producing different results. These results can be compared visually using contingency tables [27], mosaicplots [13], fluctuation diagrams [15], tableplots [20], (modified) parallel coordinates plots [28], Parallel Sets plots [18] or circos diagrams [19]. Unfortunately, the interpretability of all these graphical displays decreases rapidly as the numbers of categories and clusterings grow. In his famous book A Semiology of Graphics [5], Bertin writes “the discovery of an ordered concept appears as the ultimate point in logical simplification since it permits reducing to a single instant the assimilation of series which previously required many instants of study”. In more everyday language: with good orderings you can immediately see results that might take a lot of effort to find with other orderings. This is also related to the idea of effect ordering [12], that data should be organised to reflect the effect you want to observe. This paper presents an efficient algorithm, based on Bertin's idea and concepts related to Kendall's τ [17], which finds informative joint orders for two or more nominal classification variables. We also show how these orderings improve the various displays and how groups of corresponding categories can be detected using a top-down partitioning algorithm. Different clusterings based on data on the environmental performance of cars sold in Germany are used for illustration. All presented methods are available in the R package extracat, which is used to compute the optimized orderings for the example dataset.
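
To make the idea of a joint ordering concrete, the following is a minimal Python sketch. It is not the paper's efficient algorithm and not the extracat API. It assumes a simple criterion in the spirit of Kendall's τ (concordant minus discordant pairs of observations in a two-way contingency table) and searches all row and column permutations by brute force, which is only feasible for very small tables. The function names `concordance` and `best_joint_order` and the example counts are hypothetical.

```python
from itertools import permutations


def concordance(table):
    """Concordant minus discordant observation pairs in a two-way table.

    A pair of observations is concordant when one lies in a strictly later
    row AND a strictly later column than the other, and discordant when it
    lies in a later row but an earlier column. (Assumed criterion.)
    """
    rows, cols = len(table), len(table[0])
    score = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for l in range(cols):
                    if l > j:
                        score += table[i][j] * table[k][l]
                    elif l < j:
                        score -= table[i][j] * table[k][l]
    return score


def best_joint_order(table):
    """Brute-force search for the row/column orders maximising the score."""
    rows, cols = len(table), len(table[0])
    best = None
    for row_perm in permutations(range(rows)):
        for col_perm in permutations(range(cols)):
            reordered = [[table[r][c] for c in col_perm] for r in row_perm]
            score = concordance(reordered)
            if best is None or score > best[0]:
                best = (score, row_perm, col_perm)
    return best


# Hypothetical cross-tabulation of two clusterings of the same objects.
table = [
    [0, 10, 1],
    [9, 0, 0],
    [1, 0, 12],
]
score, row_order, col_order = best_joint_order(table)
reordered = [[table[r][c] for c in col_order] for r in row_order]
for row in reordered:
    print(row)
```

With a good joint order the large counts gather along the diagonal, so corresponding clusters in the two clusterings can be read off directly. The number of permutations grows factorially, which is why a practical method needs something far more efficient than this exhaustive search.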

Published in:

IEEE Transactions on Visualization and Computer Graphics (Volume: 18, Issue: 12)