Skip to Main Content
Automatic parallelization of sequential programs combined with tuning is an alternative to manual parallelization. This method has the potential to substantially increase productivity and is thus of critical importance for exploiting the increased computational power of today's multicores. A key difficulty is that parallelizing compilers are generally unable to estimate the performance impact of an optimization on a whole program or a program section at compile time; hence, the ultimate performance decision today rests with the developer. Building an autotuning system to remedy this situation is not a trivial task. This work presents a portable empirical autotuning system that operates at program-section granularity and partitions the compiler options into groups that can be tuned independently. To our knowledge, this is the first approach delivering an au-toparallelization system that ensures performance improvements for nearly all programs, eliminating the users' need to experiment with such tools to strive for highest application performance.