TY - GEN
T1 - Performance Walls in Machine Learning and Neuromorphic Systems
AU - Chakrabartty, Shantanu
AU - Cauwenberghs, Gert
N1 - Funding Information:
This work was supported by the National Science Foundation under research grant FET-2208770.
Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - At the fundamental level, an energy imbalance exists between training and inference in machine learning (ML) systems. While inference involves recall using a fixed or learned set of parameters that can be energy-optimized using compression and sparsification techniques, training involves searching over the entire set of parameters and hence requires repeated memorization, caching, pruning, and annealing. In this paper, we introduce three 'performance walls' that determine training energy efficiency, namely, the memory-wall, the update-wall, and the consolidation-wall. While emerging compute-in-memory ML architectures can address the memory-wall bottleneck (i.e., the energy dissipated due to repeated memory access), the approach is agnostic to the energy dissipated due to the number and precision required for the training updates (the update-wall) and to the energy dissipated when transferring information between short-term and long-term memories (the consolidation-wall). To overcome these performance walls, we propose a learning-in-memory (LIM) paradigm that endows ML system memories with metaplasticity and with thermodynamic properties that match the physics and energetics of learning.
AB - At the fundamental level, an energy imbalance exists between training and inference in machine learning (ML) systems. While inference involves recall using a fixed or learned set of parameters that can be energy-optimized using compression and sparsification techniques, training involves searching over the entire set of parameters and hence requires repeated memorization, caching, pruning, and annealing. In this paper, we introduce three 'performance walls' that determine training energy efficiency, namely, the memory-wall, the update-wall, and the consolidation-wall. While emerging compute-in-memory ML architectures can address the memory-wall bottleneck (i.e., the energy dissipated due to repeated memory access), the approach is agnostic to the energy dissipated due to the number and precision required for the training updates (the update-wall) and to the energy dissipated when transferring information between short-term and long-term memories (the consolidation-wall). To overcome these performance walls, we propose a learning-in-memory (LIM) paradigm that endows ML system memories with metaplasticity and with thermodynamic properties that match the physics and energetics of learning.
KW - Energy Efficiency
KW - Machine Learning
KW - Memory
KW - Neuromorphic Systems
KW - Thermodynamics
KW - Training
UR - http://www.scopus.com/inward/record.url?scp=85167704247&partnerID=8YFLogxK
U2 - 10.1109/ISCAS46773.2023.10181597
DO - 10.1109/ISCAS46773.2023.10181597
M3 - Conference contribution
AN - SCOPUS:85167704247
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
BT - ISCAS 2023 - 56th IEEE International Symposium on Circuits and Systems, Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 56th IEEE International Symposium on Circuits and Systems, ISCAS 2023
Y2 - 21 May 2023 through 25 May 2023
ER -