January 25, 2008

Predictor Virtualization and RegionTracker: Two Ways of Exploiting Multi-Megabyte Caches

Andreas Moshovos, Department of Electrical and Computer Engineering, University of Toronto, Canada

Abstract: Modern on-chip caches continue to grow in size, with multi-megabyte caches being the norm today. Two techniques that revisit the design and role of on-chip caches will be presented: Predictor Virtualization and RegionTracker.

Predictor Virtualization can drastically reduce the cost of improving the accuracy and effectiveness of predictor-based techniques. Instead of demanding large, dedicated resources, Predictor Virtualization spills predictor metadata into the memory hierarchy. It will be shown that a virtualized state-of-the-art memory prefetcher needs about 1KB of dedicated resources compared to the 64KB needed without virtualization. A number of additional opportunities facilitated by Predictor Virtualization will be discussed.

RegionTracker is a framework for implementing coarse-grain optimizations in the on-chip memory hierarchy. As several recent works have demonstrated, coarse-grain information and management can improve performance and power in the on-chip memory hierarchy. RegionTracker aims at eliminating the area and complexity costs associated with existing ways of implementing such coarse-grain optimizations. To demonstrate the potential of RegionTracker, it will be shown that it improves the effectiveness of a broadcast elimination technique "for free", that is, without requiring any additional resources compared to a conventional block-based cache of the same capacity. Implementing Stealth Prefetching over RegionTracker will also be discussed.

Predictor Virtualization is joint work with Stephen Somogyi/CMU and Babak Falsafi/EPFL and will appear in this year's ASPLOS. RegionTracker has appeared in MICRO 2007.

About the speaker: Andreas Moshovos is an Associate Professor in the Electrical and Computer Engineering Department of the University of Toronto. He received his Ph.D. in Computer Science from the University of Wisconsin-Madison. In addition to the University of Toronto, he has taught computer design at Northwestern University, the University of Athens, and the Hellenic Open University. His research is on performance, power, and complexity improvements for processors and has produced techniques that have been implemented in commercial designs.