TY - GEN
T1 - Voltage-stacked GPUs
T2 - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
AU - Zou, An
AU - Leng, Jingwen
AU - He, Xin
AU - Zu, Yazhou
AU - Gill, Christopher D.
AU - Janapa Reddi, Vijay
AU - Zhang, Xuan
N1 - Publisher Copyright:
© 2018 IEEE.
PY - 2018/12/12
Y1 - 2018/12/12
N2 - More than 20% of the available energy is lost in 'the last centimeter' from the PCB board to the microprocessor chip due to inherent inefficiencies of power delivery subsystems (PDSs) in today's computing systems. By series-stacking multiple voltage domains to eliminate explicit voltage conversion and reduce loss along the power delivery path, voltage stacking (VS) is a novel configuration that can improve power delivery efficiency (PDE). However, VS suffers from aggravated levels of supply noise caused by current imbalance between the stacking layers, preventing its practical adoption in mainstream computing systems. Throughput-centric manycore architectures such as GPUS intrinsically exhibit more balanced workloads, yet suffer from lower PDE, making them ideal platforms to implement voltage stacking. In this paper, we present a cross-layer approach to practical voltage stacking implementation in GPUS. It combines circuit-level voltage regulation using distributed charge-recycling integrated voltage regulators (CR-IVRs) with architecture-level voltage smoothing guided by control theory. Our proposed voltage-stacked GPUS can eliminate 61.5% of total PDS energy loss and achieve 92.3% system-level power delivery efficiency, a 12.3% improvement over the conventional single-layer based PDS. Compared to the circuit-only solution, the cross-layer approach significantly reduces the implementation cost of voltage stacking (88% reduction in area overhead) without compromising supply reliability under worst-case scenarios and across a wide range of real-world benchmarks. In addition, we demonstrate that the cross-layer solution not only complements on-chip CR-IVRs to transparently manage current imbalance and restore stable layer voltages, but also serves as a seamless interface to accommodate higher-level power optimization techniques, traditionally thought to be incompatible with a VS configuration.
AB - More than 20% of the available energy is lost in 'the last centimeter' from the PCB board to the microprocessor chip due to inherent inefficiencies of power delivery subsystems (PDSs) in today's computing systems. By series-stacking multiple voltage domains to eliminate explicit voltage conversion and reduce loss along the power delivery path, voltage stacking (VS) is a novel configuration that can improve power delivery efficiency (PDE). However, VS suffers from aggravated levels of supply noise caused by current imbalance between the stacking layers, preventing its practical adoption in mainstream computing systems. Throughput-centric manycore architectures such as GPUS intrinsically exhibit more balanced workloads, yet suffer from lower PDE, making them ideal platforms to implement voltage stacking. In this paper, we present a cross-layer approach to practical voltage stacking implementation in GPUS. It combines circuit-level voltage regulation using distributed charge-recycling integrated voltage regulators (CR-IVRs) with architecture-level voltage smoothing guided by control theory. Our proposed voltage-stacked GPUS can eliminate 61.5% of total PDS energy loss and achieve 92.3% system-level power delivery efficiency, a 12.3% improvement over the conventional single-layer based PDS. Compared to the circuit-only solution, the cross-layer approach significantly reduces the implementation cost of voltage stacking (88% reduction in area overhead) without compromising supply reliability under worst-case scenarios and across a wide range of real-world benchmarks. In addition, we demonstrate that the cross-layer solution not only complements on-chip CR-IVRs to transparently manage current imbalance and restore stable layer voltages, but also serves as a seamless interface to accommodate higher-level power optimization techniques, traditionally thought to be incompatible with a VS configuration.
KW - Architecture Support
KW - Charge Recycle
KW - GPU
KW - Instruction Issue Width
KW - Integrated Voltage Regulator
KW - Multi Story
KW - Power Delivery Efficiency
KW - Power Delivery System
KW - Power Management
KW - Voltage Stack
UR - https://www.scopus.com/pages/publications/85060008011
U2 - 10.1109/MICRO.2018.00039
DO - 10.1109/MICRO.2018.00039
M3 - Conference contribution
AN - SCOPUS:85060008011
T3 - Proceedings of the Annual International Symposium on Microarchitecture, MICRO
SP - 390
EP - 402
BT - Proceedings - 51st Annual IEEE/ACM International Symposium on Microarchitecture, MICRO 2018
PB - IEEE Computer Society
Y2 - 20 October 2018 through 24 October 2018
ER -