Skip to Main Content
Focusing on data reliability, we propose a control theory centric approach designed to improve transient error resilience in shared caches of emerging multicores while satisfying performance goals. The proposed scheme takes, as input, two quality of service (QoS) specifications: performance QoS and reliability QoS. The first of these indicates the minimum workload-wide cache (L2) hit rate value acceptable, whereas the second one captures the reliability bound on an application basis, with the help of a metric called the Reads-with-Replica (RwR). We present an extensive experimental evaluation of the proposed scheme on various workloads formed using the applications from the SPEC2006 benchmark suite. The proposed scheme is able to satisfy, in most of the tested cases, both performance and reliability QoS targets, by successfully modulating the total size of the data replication area and partitioning of this area among the co-runner applications. The collected results also show that our scheme achieves consistent improvements under different values of the major simulation parameters.