The document proposes optimizing DRAM caches for latency rather than hit rate. It summarizes previous DRAM cache designs, such as the Loh-Hill Cache, that treated the DRAM cache much like an SRAM cache. That approach led to high latency and low bandwidth utilization.
The document introduces the Alloy Cache design, which avoids tag serialization to reduce latency. It also proposes a Memory Access Predictor that chooses between parallel and serial access models, aiming for low latency without excess bandwidth consumption. Simulation results show that the Alloy Cache with the predictor outperforms previous designs such as SRAM-tag caches. The design achieves these benefits with simple structures tailored to DRAM cache constraints.
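The trade-off between the two access models can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the latency values, the names `MissPredictor` and `access`, and the single saturating counter are hypothetical and are not taken from the document, which does not specify the predictor's internals. Serial access probes the cache first and goes to memory only on a miss; parallel access issues both at once, which hides the cache latency on a miss but spends memory bandwidth even on a hit.

```python
from dataclasses import dataclass
from typing import Tuple

# Hypothetical latencies in cycles, chosen only for illustration.
CACHE_LATENCY = 40    # one DRAM cache access (tag and data arrive together)
MEMORY_LATENCY = 100  # one off-chip memory access


@dataclass
class MissPredictor:
    """Saturating counter that guesses whether the next access will miss.

    This is an assumed, simplified predictor, not the document's exact design.
    """
    counter: int = 0
    max_value: int = 7  # assumed 3-bit counter

    def predicts_miss(self) -> bool:
        # Predict a miss once the counter is in its upper half.
        return self.counter > self.max_value // 2

    def update(self, was_hit: bool) -> None:
        # Move toward 0 on hits and toward max_value on misses.
        if was_hit:
            self.counter = max(0, self.counter - 1)
        else:
            self.counter = min(self.max_value, self.counter + 1)


def access(was_hit: bool, predictor: MissPredictor) -> Tuple[int, int]:
    """Return (latency_in_cycles, memory_accesses_issued) for one request."""
    use_parallel = predictor.predicts_miss()
    predictor.update(was_hit)

    if use_parallel:
        # Parallel access: cache and memory are probed at the same time,
        # so memory bandwidth is spent even when the cache hits.
        latency = CACHE_LATENCY if was_hit else max(CACHE_LATENCY, MEMORY_LATENCY)
        return latency, 1

    # Serial access: memory is accessed only after the cache reports a miss.
    if was_hit:
        return CACHE_LATENCY, 0
    return CACHE_LATENCY + MEMORY_LATENCY, 1
```

Under these assumed numbers, a correctly predicted miss costs 100 cycles instead of 140, while a mispredicted hit still completes in 40 cycles but issues one unnecessary memory access.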