This document provides an overview of EMC's ViPR HDFS Data Service. Key points include:
1) ViPR HDFS allows users to leverage existing storage infrastructure as an HDFS data repository or "data lake" without needing dedicated analytics clusters.
2) It addresses limitations of off-the-shelf HDFS and brings HDFS capabilities to existing storage hardware, enabling HDFS, object, and file-based scenarios from a single platform.
3) ViPR HDFS provides an HDFS-compatible interface but replaces the HDFS name nodes, which removes them as a single point of failure, and uses ViPR's object storage engine to scale.
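
The single-point-of-failure claim in the last point can be illustrated with a toy model. The sketch below is not ViPR's design, which this overview does not describe. It assumes a hypothetical metadata service that maps file paths to block locations, and compares a file system that keeps this map on one name node with one that copies it to several nodes behind the same interface. All class and function names (MetadataNode, SingleNameNodeFS, ReplicatedMetadataFS, survives_one_failure) are invented for illustration.

```python
class NodeDown(Exception):
    """Raised when no reachable metadata node can answer a request."""


class MetadataNode:
    """Hypothetical metadata server: maps a file path to a block location."""

    def __init__(self, name):
        self.name = name
        self.alive = True
        self.table = {}

    def put(self, path, location):
        if not self.alive:
            raise NodeDown(self.name)
        self.table[path] = location

    def get(self, path):
        if not self.alive:
            raise NodeDown(self.name)
        return self.table[path]


class SingleNameNodeFS:
    """All metadata lives on one node, so losing that node blocks every lookup."""

    def __init__(self):
        self.name_node = MetadataNode("namenode")

    def create(self, path, location):
        self.name_node.put(path, location)

    def locate(self, path):
        return self.name_node.get(path)


class ReplicatedMetadataFS:
    """Same create/locate interface, but metadata is copied to several nodes."""

    def __init__(self, node_count=3):
        self.nodes = [MetadataNode(f"node{i}") for i in range(node_count)]

    def create(self, path, location):
        live = [n for n in self.nodes if n.alive]
        if not live:
            raise NodeDown("all nodes")
        for node in live:
            node.put(path, location)

    def locate(self, path):
        # Answer from the first live node that holds an entry for the path.
        for node in self.nodes:
            if node.alive and path in node.table:
                return node.get(path)
        raise NodeDown("no live node holds " + path)


def survives_one_failure(fs, kill):
    """Create a file, fail one node via kill(fs), then try to locate the file."""
    fs.create("/data/log.txt", "block-1")
    kill(fs)
    try:
        return fs.locate("/data/log.txt") == "block-1"
    except NodeDown:
        return False


if __name__ == "__main__":
    single = SingleNameNodeFS()
    replicated = ReplicatedMetadataFS()
    print(survives_one_failure(single, lambda fs: setattr(fs.name_node, "alive", False)))
    print(survives_one_failure(replicated, lambda fs: setattr(fs.nodes[0], "alive", False)))
```

Because both classes expose the same create and locate calls, client code does not change between them. This mirrors, in miniature, the idea of keeping an HDFS-compatible interface while changing what sits behind it.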