With the increasing complexity of platforms, coupled with server sprawl in data centers, power consumption is reaching unsustainable limits. Researchers have addressed data centers' performance-per-watt management at different levels of the hierarchy, from server clusters to servers to individual components within the server platform. This paper addresses performance-per-watt maximization of memory subsystems in a data center. Traditional memory power management techniques rely on profiling the utilization of memory modules and transitioning them to a low-power mode when they are sufficiently idle. However, fully interleaved memory presents an interesting research challenge because striping data across memory modules leaves individual modules too little idle time to warrant transitions to low-power states. In this paper, we present a novel technique for performance-per-watt maximization of interleaved memory by dynamically reconfiguring (expanding or contracting) the degree of interleaving to adapt to the incoming workload. The reconfigured memory hosts the application's working set on a smaller set of modules in a manner that exploits the platform's memory hierarchy architecture. This creates the opportunity for the remaining memory modules to transition to low-power states and remain in those states for as long as the performance remains within given acceptable thresholds. The memory power expenditure is minimized subject to application memory requirements and end-to-end memory access delay constraints. This is formulated as a performance-per-watt maximization problem and solved using an analytical memory power and performance model. Our technique has been validated on a real server using the SPECjbb benchmark and on a trace-driven memory simulator using SPECjbb and gcc memory traces. On the server, our technique is shown to give about 48.8 percent (26.7 kJ) energy savings, compared to 4.5 percent measured for traditional techniques.
The maximum improvement in performance-per-watt was measured at 88.48 percent. The simulator showed 89.7 percent improvement in performance-per-watt compared to the best performing traditional technique.
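The constrained selection described in the abstract (minimize memory power subject to the working set fitting and a delay bound) can be illustrated with a minimal sketch. This is not the paper's analytical model: the delay function, the per-module power figures, and all names (`MemoryConfig`, `estimated_delay_ns`, `choose_interleaving`) are hypothetical and chosen only to show the shape of the problem.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class MemoryConfig:
    total_modules: int       # modules installed in the platform
    module_capacity_mb: int  # capacity of each module
    active_power_w: float    # assumed power of an active module
    low_power_w: float       # assumed power of a module in a low-power state


def estimated_delay_ns(active_modules: int,
                       base_delay_ns: float = 50.0,
                       contention_ns: float = 200.0) -> float:
    # Toy assumption: spreading accesses over fewer modules
    # increases per-access delay through contention.
    return base_delay_ns + contention_ns / active_modules


def choose_interleaving(cfg: MemoryConfig,
                        working_set_mb: int,
                        max_delay_ns: float) -> Optional[Tuple[int, float]]:
    """Return (active_modules, total_power_w) for the smallest feasible
    degree of interleaving, or None if no degree satisfies both constraints."""
    for n in range(1, cfg.total_modules + 1):
        fits = n * cfg.module_capacity_mb >= working_set_mb
        fast_enough = estimated_delay_ns(n) <= max_delay_ns
        if fits and fast_enough:
            idle = cfg.total_modules - n
            power = n * cfg.active_power_w + idle * cfg.low_power_w
            return n, power
    return None


cfg = MemoryConfig(total_modules=8, module_capacity_mb=1024,
                   active_power_w=5.0, low_power_w=1.0)
print(choose_interleaving(cfg, working_set_mb=3000, max_delay_ns=100.0))
```

Because total power grows with the number of active modules whenever an active module draws more than an idle one, the first feasible degree found by the loop is also the lowest-power one under these assumptions. Real platforms typically restrict the interleaving degree to specific values, which this sketch ignores.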
Parallel and Distributed Systems, IEEE Transactions on (Volume: 20, Issue: 7)
Date of Publication: July 2009