Power and Reliability Management of SoCs
Today's embedded systems integrate multiple IP cores for processing, communication, and sensing on a single die as systems-on-chip (SoCs). Aggressive transistor scaling, decreased voltage margins and increased processor power and temperature have made reliability assessment a much more significant issue. Although reliability of devices and interconnect has been broadly studied, in this work, we study a tradeoff between reliability and power consumption for component-based SoC designs. We specifically focus on hard error rates as they cause a device to permanently stop operating. We also present a joint reliability and power management optimization problem whose solution is an optimal management policy. When careful joint policy optimization is performed, we obtain a significant improvement in energy consumption (40%) in tandem with meeting a reliability constraint for all SoC operating temperatures.