In this paper we present thread flux (TFlux), a complete system that supports the data-driven multithreading (DDM) model of execution. TFlux virtualizes the details of the underlying system and therefore offers the same programming model regardless of the architecture. To achieve this goal, TFlux provides runtime support built on top of a commodity operating system. Thread scheduling is performed by the thread synchronization unit (TSU), which can be implemented as either a hardware or a software module. In addition, TFlux includes a preprocessor that, together with a set of simple compiler directives, allows the user to develop DDM programs easily. The preprocessor automatically produces the TFlux code, which can be compiled with any commodity C compiler, thereby producing code for any ISA. TFlux has been validated on three platforms: a Simics-based multicore system with a hardware TSU module (TFluxHard), a commodity 8-core Intel Core2 QuadCore-based system with a software TSU module (TFluxSoft), and a Cell/BE system with a software TSU module (TFluxCell). The experimental results show that the performance achieved is close to linear speedup: on average 21x for the 27-node TFluxHard, and 4.4x for the 6-node TFluxSoft and TFluxCell. Most importantly, the observed speedup is stable across the different platforms, allowing the benefits of DDM to be exploited on different commodity systems.
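The data-driven scheduling role that the abstract assigns to the TSU can be illustrated with a small sketch. This is not the TFlux API or its directive syntax: the class `SoftwareTSU`, its methods, and the thread names are hypothetical, and the sketch assumes the usual DDM convention that each thread carries a ready count of pending producers and becomes runnable when that count reaches zero. It also runs everything on a single worker, whereas a real TSU dispatches ready threads to multiple cores.

```python
from collections import deque


class SoftwareTSU:
    """Toy, single-worker stand-in for a software TSU (hypothetical, not TFlux code)."""

    def __init__(self):
        self.body = {}          # thread name -> callable taking the produced data
        self.ready_count = {}   # thread name -> number of producers still pending
        self.consumers = {}     # thread name -> threads that wait on its output
        self.ready = deque()    # threads whose inputs are all available

    def add_thread(self, name, body, producers=()):
        """Register a thread together with the threads whose results it needs."""
        self.body[name] = body
        self.ready_count[name] = len(producers)
        self.consumers.setdefault(name, [])
        for producer in producers:
            self.consumers.setdefault(producer, []).append(name)

    def run(self):
        """Run threads in data-driven order; return (results, execution order)."""
        data, order = {}, []
        # Threads with no producers are ready immediately.
        self.ready.extend(n for n, c in self.ready_count.items() if c == 0)
        while self.ready:
            name = self.ready.popleft()
            data[name] = self.body[name](data)
            order.append(name)
            # Completion of a producer decrements each consumer's ready count.
            for consumer in self.consumers[name]:
                self.ready_count[consumer] -= 1
                if self.ready_count[consumer] == 0:
                    self.ready.append(consumer)
        return data, order


# Hypothetical dependency graph: two loads feed two squares, which feed a sum.
tsu = SoftwareTSU()
tsu.add_thread("load_a", lambda d: 3)
tsu.add_thread("load_b", lambda d: 4)
tsu.add_thread("square_a", lambda d: d["load_a"] ** 2, producers=["load_a"])
tsu.add_thread("square_b", lambda d: d["load_b"] ** 2, producers=["load_b"])
tsu.add_thread("sum", lambda d: d["square_a"] + d["square_b"],
               producers=["square_a", "square_b"])
data, order = tsu.run()
print(order, data["sum"])
```

In this sketch no thread is ever scheduled before its inputs exist, so the thread bodies need no locks or barriers of their own; ordering follows from data availability alone.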
Date of Conference: 9-12 Sept. 2008