This document provides an introduction to Apache Pig. Key points:
- Pig is a system for processing large data sets, including unstructured data, on Hadoop, using HDFS for storage and MapReduce for execution. Programs are written in a high-level data flow language called Pig Latin.
- Pig aims to increase programmer productivity by abstracting away low-level MapReduce jobs and providing a procedural language for expressing parallel data flows.
- Pig components include the Pig engine for parsing, optimizing, and executing queries, and the Grunt shell for running interactive commands.
- The document then covers Pig data types, input/output, relational operations, user-defined functions, and new features in Pig version 0.10.0.
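
To make the "data flow" idea concrete, here is a minimal sketch in plain Python that mimics the steps of a typical Pig Latin word-count script. It is not Pig itself and runs on a single machine; the function name `word_count` and the sample lines are hypothetical, and the Pig Latin statements in the comments are only an illustration of how each step would usually be written. Splitting on whitespace is a simplification of what Pig's `TOKENIZE` does.

```python
from collections import defaultdict


def word_count(lines):
    """Mimic a Pig Latin word-count data flow, one named step per relation."""
    # Pig Latin: words = FOREACH lines GENERATE FLATTEN(TOKENIZE(line)) AS word;
    # Simplification: split on whitespace only.
    words = [word for line in lines for word in line.split()]

    # Pig Latin: grouped = GROUP words BY word;
    grouped = defaultdict(list)
    for word in words:
        grouped[word].append(word)

    # Pig Latin: counts = FOREACH grouped GENERATE group, COUNT(words);
    counts = {group: len(bag) for group, bag in grouped.items()}
    return counts


# Hypothetical input standing in for: lines = LOAD 'input.txt' AS (line:chararray);
sample_lines = [
    "pig latin is a data flow language",
    "pig runs on hadoop",
]
counts = word_count(sample_lines)
```

Each intermediate name (`words`, `grouped`, `counts`) plays the role of a Pig relation: the script describes a sequence of transformations, and in Pig the engine decides how to turn that sequence into MapReduce jobs.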