A major challenge in creating chip multiprocessors is designing the on-chip memory and communication resources to efficiently support parallel workloads. A variety of cache organizations, data management techniques, and hardware optimizations that take advantage of specific data characteristics have been developed to improve application performance. The success of these approaches depends on applications exhibiting the presumed data characteristics. Data mining applications are a growing class of applications that discover patterns in large sets of collected data. Because these applications tend to be highly parallelizable, they represent an important workload for chip multiprocessors. However, the memory-intensive nature of these applications means that they will stress these chips' memory and communication resources. In this paper, we examine the data usage characteristics of a set of parallel data mining applications to determine how well existing chip multiprocessor approaches apply to them. We show that these characteristics vary both across and within the applications, making some techniques more applicable than others. We also discuss software approaches that could either provide information to the hardware or assist the hardware in dynamically discovering the data characteristics needed to deploy these techniques.
Date of Conference: 4-6 Oct. 2009