Systematic Methodology for the Quantitative Analysis of Pipeline-Register Reliability
Decades of aggressive technology scaling have made soft errors a pressing challenge for modern computing systems. Sequential elements (registers) in the processor pipeline that are struck by charge-carrying particles can suffer bit flips, or soft errors, which may translate into system failures. Next to the processor cache, the pipeline registers (PRs), the registers between two pipeline stages, account for more than 50% of soft-error failures in the system. In this paper, for the first time, we apply architecturally correct execution models, which quantitatively define the vulnerability (or exposure to soft errors) of microarchitectural components, and extend them to define the vulnerability of PRs: PR vulnerability (PRV). We develop gemV-Pipe, a simulation toolset for the systematic, accurate, and quantitative estimation and analysis of PRV. Our detailed ISA-aware analysis in gemV-Pipe reveals interesting facts about the data-access behavior of PRs: 1) the vulnerability of a PR is not proportional to its size; 2) the PR bits used by one instruction may not be used (and are thus not vulnerable) by another, which makes PRV extremely instruction-dependent; and 3) PR bits can be classified by the function of the data they store into instruction, control, and data bits, each of which differs in its instruction-specific behavior and vulnerability. Applying these insights, we perform design space exploration on selectively hardening the PR bits, and demonstrate that a 75% improvement in reliability can be achieved with less than 15% power overhead.
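To make the instruction dependence of PRV concrete, the following minimal sketch counts vulnerable bit-cycles for a single pipeline register. It is an illustration under stated assumptions, not the gemV-Pipe implementation: the field layout `PR_FIELDS`, the instruction classes in `USED_FIELDS`, the `Occupancy` record, and the function `pr_vulnerability` are all hypothetical names chosen for this example. A bit is treated as vulnerable only during cycles in which the resident instruction actually uses the field containing it.

```python
from dataclasses import dataclass

# Hypothetical field layout of one pipeline register: field name -> width in bits.
# These fields and widths are illustrative assumptions, not taken from the paper.
PR_FIELDS = {"opcode": 8, "ctrl": 6, "operand_a": 32, "operand_b": 32, "imm": 16}

# Hypothetical map from instruction class to the PR fields that class uses.
USED_FIELDS = {
    "alu_reg": {"opcode", "ctrl", "operand_a", "operand_b"},
    "alu_imm": {"opcode", "ctrl", "operand_a", "imm"},
    "nop": set(),  # a pipeline bubble uses no bits
}


@dataclass
class Occupancy:
    """One instruction residing in the pipeline register for some cycles."""
    instr_class: str
    cycles: int


def pr_vulnerability(trace):
    """Return (vulnerable bit-cycles, vulnerable fraction of all bit-cycles)."""
    total_bits = sum(PR_FIELDS.values())
    vulnerable = 0
    total = 0
    for occ in trace:
        # Only bits in fields used by this instruction class count as vulnerable.
        used_bits = sum(PR_FIELDS[f] for f in USED_FIELDS[occ.instr_class])
        vulnerable += used_bits * occ.cycles
        total += total_bits * occ.cycles
    return vulnerable, (vulnerable / total if total else 0.0)


if __name__ == "__main__":
    trace = [Occupancy("alu_reg", 1), Occupancy("alu_imm", 2), Occupancy("nop", 1)]
    bit_cycles, fraction = pr_vulnerability(trace)
    print(bit_cycles, round(fraction, 3))
```

In this toy model the same 94-bit register contributes 78 vulnerable bits per cycle for a register-register instruction, 62 for an immediate instruction, and none for a bubble, which is why size alone does not determine vulnerability.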