This document discusses optimizing data processing with Apache Pig. It describes Pig as a high-level language for analyzing large datasets. It covers several optimization techniques for Pig, including pushing filters closer to the data source, partition pruning, compressing intermediate files, and controlling multi-query execution. Cost-based optimizations, such as the choice of aggregation algorithm and join strategy, are also discussed. Keeping data sorted and using columnar storage formats can further improve performance. Future work includes optimizing queries using statistics and sampling.