With the scaling of process technology, transistor and interconnect reliability has emerged as a growing concern for modern microprocessors. Traditional solutions for reliable operation rely on double or triple modular redundancy. However, chip multiprocessors (CMPs) provide a unique opportunity for low-cost data path verification. A recent paper presents a fault recovery scheme based on outsourcing instructions from cores identified as faulty to fault-free cores capable of executing them. Communication between the cores is managed via an inter-core queue (ICQ). However, no mechanism for identifying faulty cores was presented. In this paper, we extend this research to enable self-test of data path execution in a multicore processor. Specifically, whenever instructions are retired on a (local) core, they are also dispatched via the ICQ to a nearby (remote) core, which re-executes them for verification. The results obtained from the local and remote cores are compared. If a fault is detected, the instruction may be re-executed on both the local and remote cores to distinguish between hard and soft faults. In this study, we present results on the frequency of coverage and on the latency between an instruction's first execution and its verification. We also report the performance impact of execution verification on the remote core. Results indicate that the proposed scheme is capable of remotely verifying ~80% of integer ALU instructions and >98% of other instruction types, with a very small performance impact of just ~1% on the tester core and less than 1% area overhead.
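The verification flow described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the `Core` class, the fault hooks, the `verify` function, and the rule that a mismatch which disappears on re-execution counts as soft while one that persists counts as hard are all hypothetical choices made for illustration. The ICQ is modeled as a simple FIFO.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical three-operation ALU, used only for illustration.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "and": lambda a, b: a & b,
}


@dataclass(frozen=True)
class Instr:
    op: str
    a: int
    b: int


class Core:
    """Toy core model. An optional fault hook may corrupt each result."""

    def __init__(self, name: str, fault: Optional[Callable[[int], int]] = None):
        self.name = name
        self.fault = fault

    def execute(self, instr: Instr) -> int:
        result = OPS[instr.op](instr.a, instr.b)
        return self.fault(result) if self.fault else result


def make_transient_fault() -> Callable[[int], int]:
    """Assumed soft-fault model: corrupts bit 0 of the first result only."""
    fired = False

    def fault(result: int) -> int:
        nonlocal fired
        if not fired:
            fired = True
            return result ^ 1
        return result

    return fault


def persistent_fault(result: int) -> int:
    """Assumed hard-fault model: corrupts bit 0 of every result."""
    return result ^ 1


def verify(instr: Instr, local: Core, remote: Core, icq: deque) -> str:
    """Execute locally, re-execute remotely via the ICQ, and compare."""
    local_result = local.execute(instr)
    icq.append((instr, local_result))  # dispatched when retired locally

    queued_instr, queued_result = icq.popleft()  # remote core drains the ICQ
    if remote.execute(queued_instr) == queued_result:
        return "ok"

    # Mismatch: re-execute on both cores (assumed classification rule).
    if local.execute(queued_instr) == remote.execute(queued_instr):
        return "soft"  # disagreement did not recur
    return "hard"      # disagreement persists


icq = deque()
instr = Instr("add", 2, 2)
remote = Core("remote")
print(verify(instr, Core("local"), remote, icq))
print(verify(instr, Core("local", make_transient_fault()), remote, icq))
print(verify(instr, Core("local", persistent_fault), remote, icq))
```

Note that a two-way comparison of this kind only reveals that the cores disagree. Deciding which of the two cores is at fault would need additional information that this sketch does not model.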