TY - GEN
T1 - Performance/area efficiency in chip multiprocessors with micro-caches
AU - Becchi, Michela
AU - Franklin, Mark A.
AU - Crowley, Patrick J.
PY - 2007
Y1 - 2007
N2 - This paper proposes the use of very small instruction caches, called micro-caches (μ-caches), consisting of tens to hundreds of bytes, at the bottom of the instruction delivery hierarchy in chip-multiprocessors (CMP). Multi-core architectures place a novel emphasis on the performance/area efficiency of processor cores, and we note that traditional instruction cache sizes reflect an emphasis on hit-rate performance rather than efficiency. In brief, μ-caches reduce the area footprint of individual cores, thus allowing additional cores to fit within a given die area. We use commercial design tools and a commercial processor core to evaluate this tradeoff in the context of high-performance networking, where CMP architectures have had their greatest commercial impact to date. Our results suggest that the use of μ-caches can yield a 25% improvement in efficiency relative to traditional hierarchies. In our evaluation, we consider a range of architectural options (cluster organization, non-blocking caches, cache parameters) and justify our conclusions while accounting for the errors inherent in die area estimates.
AB - This paper proposes the use of very small instruction caches, called micro-caches (μ-caches), consisting of tens to hundreds of bytes, at the bottom of the instruction delivery hierarchy in chip-multiprocessors (CMP). Multi-core architectures place a novel emphasis on the performance/area efficiency of processor cores, and we note that traditional instruction cache sizes reflect an emphasis on hit-rate performance rather than efficiency. In brief, μ-caches reduce the area footprint of individual cores, thus allowing additional cores to fit within a given die area. We use commercial design tools and a commercial processor core to evaluate this tradeoff in the context of high-performance networking, where CMP architectures have had their greatest commercial impact to date. Our results suggest that the use of μ-caches can yield a 25% improvement in efficiency relative to traditional hierarchies. In our evaluation, we consider a range of architectural options (cluster organization, non-blocking caches, cache parameters) and justify our conclusions while accounting for the errors inherent in die area estimates.
KW - Cache hierarchies
KW - Chip multiprocessor
KW - Networking workload
UR - https://www.scopus.com/pages/publications/35348812608
U2 - 10.1145/1242531.1242567
DO - 10.1145/1242531.1242567
M3 - Conference contribution
AN - SCOPUS:35348812608
SN - 1595936831
SN - 9781595936837
T3 - 2007 Computing Frontiers, Conference Proceedings
SP - 247
EP - 258
BT - 2007 Computing Frontiers, Conference Proceedings
T2 - 4th Conference On Computing Frontiers 2007
Y2 - 7 May 2007 through 9 May 2007
ER -