This document discusses how Etsy uses ephemeral Hadoop clusters in the cloud to process and analyze its large volumes of data. Etsy moves data from databases and logs to S3, then uses Cascading to run Hadoop jobs that join, group, and otherwise transform that data. It also uses Hadoop streaming to run MATLAB scripts for more complex analysis, and has built a system called Barnum to coordinate jobs and return results. This approach lets Etsy scale processing from zero to thousands of nodes as needed, in a cost-effective and isolated manner.
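
To make the Hadoop streaming step more concrete, here is a minimal sketch of the contract a streaming script follows: the mapper reads raw lines and emits tab-separated key/value pairs, the framework sorts them by key, and the reducer aggregates each run of identical keys. The summary says Etsy runs MATLAB scripts this way; this sketch uses Python purely for illustration. The record layout (`user_id<TAB>event_type`) and the names `mapper`, `reducer`, and `run_local` are assumptions, not details from the source.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'event_type<TAB>1' pair per input log line.

    Assumes a hypothetical layout: user_id<TAB>event_type.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed records
        yield f"{fields[1]}\t1"


def reducer(sorted_lines):
    """Sum the counts for each key; input must already be sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in sorted_lines)
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{key}\t{sum(int(value) for _, value in group)}"


def run_local(lines):
    """Simulate map -> sort -> reduce in a single process, for testing only."""
    return list(reducer(sorted(mapper(lines))))


# Hypothetical sample records, including one malformed line.
sample = ["u1\tview\n", "u2\tpurchase\n", "u1\tview\n", "bad\n"]
for result in run_local(sample):
    print(result)
```

On a real cluster the mapper and reducer would be separate scripts reading from standard input and writing to standard output, with Hadoop performing the sort between them. `run_local` only mimics that flow so the logic can be checked without a cluster.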