HDFS (Hadoop Distributed File System) is a distributed file system designed to store very large data sets reliably across clusters of commodity hardware. It has three main components: the NameNode, the Secondary NameNode, and the DataNodes. The NameNode manages the file system namespace (the directory tree and the mapping of files to blocks) and regulates client access to files. The Secondary NameNode periodically merges the NameNode's edit log into a new namespace checkpoint; it is not a hot standby. DataNodes store the actual blocks and serve read and write requests from clients. HDFS achieves reliable storage by replicating each block across several DataNodes and by using DataNode heartbeats and block reports to detect hardware failures and re-replicate lost blocks. This design makes HDFS scalable, fault-tolerant, and well suited to applications that process large data sets.
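
To make the client-side view of this architecture concrete, here is a minimal sketch of writing and reading a file through Hadoop's Java FileSystem API. The NameNode address (hdfs://namenode:9000), the path /data/example.txt, and the replication factor of 3 are illustrative assumptions, not values taken from this overview.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public class HdfsExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the NameNode, which resolves file names to block locations.
        // The host and port here are assumptions for illustration.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");

        try (FileSystem fs = FileSystem.get(conf)) {
            Path file = new Path("/data/example.txt");

            // On write, the client asks the NameNode for target DataNodes and streams
            // each block to them; here each block is replicated to 3 DataNodes.
            try (FSDataOutputStream out = fs.create(file, (short) 3)) {
                out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
            }

            // On read, the client fetches block locations from the NameNode and then
            // reads the data directly from a DataNode holding a replica.
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
                System.out.println(reader.readLine());
            }
        }
    }
}
```

Note how the NameNode is involved only in metadata operations (locating blocks, choosing DataNodes), while the data itself flows between the client and the DataNodes; this separation is what lets HDFS scale to very large data sets.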