Pig is a platform for analyzing large datasets that operates on Hadoop. It uses its own Pig Latin language to express data flows that the Pig engine executes in parallel across a Hadoop cluster. Pig Latin scripts typically involve loading data from HDFS, transforming it through operations like filtering, grouping, joining, and applying user-defined functions. The results are then stored back in HDFS. Key features include its data model with scalar and complex types, use of schemas to optimize queries, interactive Grunt shell, built-in and user-defined functions, and macro capabilities to package reusable logic.