This document summarizes a research paper proposing GreenHDFS, an energy-efficient variant of HDFS that classifies data and places it in hot or cold zones so that power can be managed per zone. The authors analyzed file access patterns and lifespans in a large Yahoo! HDFS cluster and found that: 1) access patterns and lifespans varied significantly across directories; 2) 60% of the data was cold or unused but had to be retained for regulatory or historical purposes; and 3) 95-98% of files stayed hot for less than 3 days, although one directory had longer lifespans. GreenHDFS aims to create idle periods long enough for servers to be powered down, while maintaining performance.
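The hot/cold placement idea can be illustrated with a minimal sketch. It is not the paper's actual policy: the `FileRecord` type, the `classify` function, the example paths, and the 3-day threshold (borrowed loosely from the hot-lifespan finding above) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical record: a file path and the time it was last read."""
    path: str
    last_access: datetime


def classify(files, now, cold_after=timedelta(days=3)):
    """Split files into hot and cold zones by time since last access.

    The cold_after threshold is an assumed, tunable parameter, not a
    value prescribed by GreenHDFS.
    """
    zones = {"hot": [], "cold": []}
    for f in files:
        # A file untouched for at least cold_after is treated as cold.
        zone = "cold" if now - f.last_access >= cold_after else "hot"
        zones[zone].append(f.path)
    return zones


if __name__ == "__main__":
    now = datetime(2024, 1, 10)
    files = [
        FileRecord("/data/recent.log", datetime(2024, 1, 9)),
        FileRecord("/data/archive.log", datetime(2024, 1, 1)),
    ]
    print(classify(files, now))
```

Because the analysis found that lifespans differ across directories, a real policy would likely need a per-directory threshold rather than the single global value used here.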