Reliability-Aware Runtime Power Management for Many-Core Systems in the Dark Silicon Era
Power management of networked many-core systems with runtime application mapping becomes more challenging in the dark silicon era. It necessitates considering network characteristics at runtime to achieve better performance while honoring the peak power upper bound. On the other hand, power management has a direct effect on chip temperature, which is the main driver of the aging effects. Therefore, alongside performance fulfillment, the controlling mechanism must also consider the current cores’ reliability in its actuator manipulation to enhance the overall system lifetime in the long term. In this paper, we propose a multiobjective dynamic power management technique that uses current power consumption and other network characteristics including the reliability of the cores as the feedback while utilizing fine-grained voltage and frequency scaling and per-core power gating as the actuators. In addition, disturbance rejecter and reliability balancer are designed to help the controller to better smooth power consumption in the short term and reliability in the long term, respectively. Simulations of dynamic workloads and mixed criticality application profiles show that our method not only is effective in honoring the power budget while considerably boosting the system throughput, but also increases the overall system lifetime by minimizing aging effects by means of power consumption balancing.