Parallelizing an Information Theoretic Co-clustering Algorithm Using a Cloud Middleware

5 Author(s)
Venkatram Ramanathan (Dept. of Comput. Sci. & Eng., Ohio State Univ., Columbus, OH, USA); Wenjing Ma; Vignesh T. Ravi; Tantan Liu

Emerging cloud environments are well suited to storing and analyzing large datasets because they provide on-demand access to resources. However, developing high-performance implementations of data analysis tasks remains a challenging problem. In our prior work, we developed a middleware called FREERIDE (FRamework for Rapid Implementation of Data mining Engines). FREERIDE is based on the observation that the processing structure of a large number of data mining algorithms involves generalized reductions. It offers a high-level interface and implements both distributed-memory and shared-memory parallelization. In this paper, we consider a challenging new data mining algorithm, information theoretic co-clustering, and parallelize it using the FREERIDE middleware. We show how the main processing loops of row clustering and column clustering in the co-clustering algorithm can essentially be fit into a generalized reduction structure. We achieve good parallel efficiency, with a speedup of nearly 21 on 32 cores.
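To make the term "generalized reduction" concrete, the following is a minimal Python sketch of the pattern. It is not FREERIDE's interface and not the paper's implementation. Each worker folds its chunk of rows into a private reduction object, and the private objects are then merged with an associative, commutative operation. The reduction object assumed here is a k x l matrix of co-cluster sums, and the names `local_reduction`, `merge`, and `coclustered_sums` are hypothetical. Threads are used only to show the structure; in CPython they would not give a real speedup for this loop.

```python
from concurrent.futures import ThreadPoolExecutor


def local_reduction(matrix, row_ids, row_cluster, col_cluster, k, l):
    """Fold one chunk of rows into a private k x l reduction object."""
    acc = [[0.0] * l for _ in range(k)]
    for i in row_ids:
        r = row_cluster[i]
        for j, value in enumerate(matrix[i]):
            # Order-independent update: this is what makes the loop a reduction.
            acc[r][col_cluster[j]] += value
    return acc


def merge(a, b):
    """Global combination step: element-wise sum of two reduction objects."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def coclustered_sums(matrix, row_cluster, col_cluster, k, l, workers=2):
    """Sum matrix entries per (row cluster, column cluster) pair, chunk by chunk."""
    n = len(matrix)
    size = max(1, (n + workers - 1) // workers)
    chunks = [range(s, min(s + size, n)) for s in range(0, n, size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(
            lambda ids: local_reduction(matrix, ids, row_cluster, col_cluster, k, l),
            chunks))
    total = [[0.0] * l for _ in range(k)]
    for part in partials:
        total = merge(total, part)
    return total


if __name__ == "__main__":
    data = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
    print(coclustered_sums(data, [0, 1, 0, 1], [0, 0, 1], k=2, l=2))
```

Because the merge is a plain element-wise sum, the result does not depend on how rows are split across workers, which is the property that lets one high-level description serve both shared-memory and distributed-memory execution.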

Published in:

2010 IEEE International Conference on Data Mining Workshops

Date of Conference:

13 Dec. 2010