I. Introduction
Graph problems arise in many applications, such as data analysis and computational science and engineering. Graph algorithms serve both as useful tools in their own right for solving real problems of interest and as good benchmarks for kernel performance on emerging architectures. They are characterized by irregular data access, dynamic data structures, and non-traditional parallelization patterns with synchronization bottlenecks dictated by the structure of the graph or by the algorithm. In this paper, we consider an archetypal graph problem: graph coloring. Graph coloring is often used to find independent tasks that can be executed simultaneously. As minimizing the number of colors is NP-hard, we consider fast graph coloring heuristics that run in a reasonable amount of time. Parallel graph coloring presents an interesting challenge for algorithm developers, both in terms of the performance of the graph kernel itself and in terms of the impact of the coloring on real problems. While the former relies on new algorithms and implementations that improve the performance of the graph algorithm, the latter relies on the quality of the solution produced by the parallel graph algorithm, which for graph coloring is typically measured by the number of colors used. We are concerned with both of these aspects of parallel graph coloring.
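
To make the connection between coloring and independent tasks concrete, the following is a minimal sketch of a sequential first-fit greedy heuristic. It is a standard textbook technique and is given only for illustration; it is not the parallel algorithm studied in this paper. The function names `greedy_coloring` and `color_classes`, the adjacency-list representation, and the example graph are all assumptions made for this sketch.

```python
def greedy_coloring(adj):
    """First-fit greedy heuristic (illustrative sketch, not this paper's algorithm).

    adj maps each vertex to a list of its neighbors. Vertices are visited in
    dictionary order, and each one receives the smallest color not already
    used by a colored neighbor.
    """
    colors = {}
    for v in adj:
        used = {colors[u] for u in adj[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors


def color_classes(colors):
    """Group vertices by color; each group is a set of mutually independent tasks."""
    classes = {}
    for v, c in colors.items():
        classes.setdefault(c, []).append(v)
    return classes


# Hypothetical example: a cycle on four vertices, 0-1-2-3-0.
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
coloring = greedy_coloring(graph)
classes = color_classes(coloring)
```

No two vertices in the same color class share an edge, so the tasks in one class can be executed simultaneously. The number of classes bounds the number of sequential phases, which is why the number of colors serves as the quality measure.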