The document proposes optimizing DRAM caches for latency rather than for hit rate. It reviews prior work such as the Loh-Hill cache, which organized the DRAM cache much like a conventional SRAM cache, with a highly associative structure and a tag lookup serialized before the data access; this resulted in high hit latency and poor use of the DRAM cache's bandwidth.
The document then introduces the Alloy Cache, a direct-mapped design that avoids tag serialization by storing each tag alongside its data line in the same DRAM row, so a single access returns both together. It also proposes a Memory Access Predictor that chooses, per request, between serial access (probe the cache, then go to memory only on a miss) and parallel access (start the memory request alongside the cache probe) to minimize effective latency; sketches of both ideas appear below. Simulation results show that the Alloy Cache combined with the predictor outperforms earlier designs, including an SRAM-Tags configuration. The document concludes that DRAM caches should be architected for latency first rather than hit rate, given constraints that differ fundamentally from those of SRAM caches.
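The following is a minimal sketch of the alloyed tag-and-data idea: a direct-mapped cache whose entries hold the tag and the data line as one unit, so a hit is resolved with a single lookup rather than a tag read followed by a data read. The class and method names, the storage of the full line address as the tag, and the sizes mentioned in comments are illustrative assumptions, not details taken from the paper.

```python
# Sketch of an alloyed tag-and-data (TAD) organization: tag and line data
# live side by side, so one DRAM burst returns both and no separate
# tag-serialized lookup is needed. Assumed/illustrative: names, and using
# the full line address as the tag.

class AlloyCache:
    def __init__(self, num_sets):
        self.num_sets = num_sets
        # Direct-mapped: one TAD slot per set, holding (tag, data) or None.
        self.tads = [None] * num_sets

    def _index(self, line_addr):
        return line_addr % self.num_sets

    def access(self, line_addr):
        """A single access returns tag and data together; on a tag match
        the request completes with no further cache-side latency."""
        slot = self.tads[self._index(line_addr)]
        if slot is not None and slot[0] == line_addr:
            return ("hit", slot[1])
        return ("miss", None)  # caller falls back to off-chip memory

    def fill(self, line_addr, data):
        """On a miss, install tag and data as one alloyed unit."""
        self.tads[self._index(line_addr)] = (line_addr, data)
```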
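Below is a minimal sketch of one plausible Memory Access Predictor: a small table of 2-bit saturating counters, indexed here by a hash of the requesting instruction's address, that steers each request toward serial or parallel access. The table size, the PC-based indexing, and the counter width are assumptions for illustration; the paper's actual predictor organization may differ.

```python
# Sketch of a memory access predictor choosing between serial access
# (probe the DRAM cache first, go to memory only on a miss) and parallel
# access (issue the memory request alongside the cache probe).
# Assumed/illustrative: 256-entry table, PC-hash indexing, 2-bit counters.

class MemoryAccessPredictor:
    def __init__(self, entries=256):
        self.entries = entries
        # 2-bit counters: 0-1 predict "hit" (serial), 2-3 predict "miss" (parallel).
        self.counters = [1] * entries

    def _index(self, pc):
        return hash(pc) % self.entries

    def predict_parallel(self, pc):
        """True: expected miss, start memory access in parallel.
        False: expected hit, probe the cache serially first."""
        return self.counters[self._index(pc)] >= 2

    def update(self, pc, was_miss):
        """Train toward parallel access on misses, serial access on hits."""
        i = self._index(pc)
        if was_miss:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

The asymmetry in mispredictions motivates the predictor: predicting a miss on what turns out to be a hit only wastes memory bandwidth on a redundant request, whereas predicting a hit on an actual miss adds the full cache-probe latency in front of the memory access.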