This document discusses Twitter's use of open source software for large-scale data processing. Twitter collects terabytes of data every day and processes tens of petabytes daily across thousands of servers. It relies on open source projects such as Hadoop, Storm, and ZooKeeper for data collection, real-time and batch processing, service coordination, and metrics. Twitter engineers actively contribute to many open source projects and release some internally developed tools to the open source community.