Skip to Main Content
Todays computer systems increasingly comprise het-erogenous computing elements like multi-core processors, graphics processing units, and specialized co-processors, which allow parallel processing. Programming applications to utilize such systems is a complex process and needs good knowledge about the hardware architecture. Automatic and transparent use of these resources is a major concern of domain specific software developers and users. We present a new approach of using shared library interposing to replace libraries in binary applications with highly optimized accelerated versions. A plugin-based framework was developed, which allows interposing shared library calls, delegating them to accelerator specific libraries and adapting them to the library specific interface. Accelerator specific plugins can be added with a high degree of automatism. First steps were taken to develop a fast and intelligent selection component, choosing the best possible accelerator for a shared library call. It was shown, that such a framework may be efficiently used to apply shared library interposing to transparently speedup existing applications. The BLAS library for linear algebra was used as an example to develop plugins for an acceleratable library. Runtimes of BLAS functions were measured on different architectures and expose significant differences depending on the used implementation and hardware, showing the potentially high speedups of the approach.