This document provides an overview of H2O, an open-source machine learning platform for distributed, in-memory analytics on large datasets. It explains how H2O works, including how it uses a map-reduce style of computation to parallelize machine learning algorithms across multiple nodes. The document demonstrates starting an 8-node H2O cluster on Amazon EC2 and importing a 23 GB dataset in under a minute, which it presents as significantly faster than other tools. It also summarizes how H2O's distributed fork-join framework executes tasks across nodes and how those tasks share data through H2O's distributed data structures.
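
To make the map-reduce idea in the summary above concrete, here is a minimal sketch in plain Python. It does not use H2O's API: the function names (`split_into_chunks`, `map_chunk`, `reduce_partials`, `distributed_mean`) are hypothetical, and threads on a single machine stand in for cluster nodes. It only illustrates the general pattern of computing a partial result per chunk of data and then combining the partials, here to compute a mean.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def split_into_chunks(values, n_chunks):
    """Split a column into contiguous chunks (stand-ins for per-node data)."""
    size = max(1, -(-len(values) // n_chunks))  # ceiling division, at least 1
    return [values[i:i + size] for i in range(0, len(values), size)]


def map_chunk(chunk):
    """Map step: compute a partial (sum, count) over one chunk only."""
    return (sum(chunk), len(chunk))


def reduce_partials(a, b):
    """Reduce step: merge two partial (sum, count) results into one."""
    return (a[0] + b[0], a[1] + b[1])


def distributed_mean(values, n_workers=8):
    """Compute a mean by mapping over chunks in parallel, then reducing.

    Assumption: worker threads play the role of nodes; a real cluster
    would keep each chunk in the memory of the node that owns it.
    """
    chunks = split_into_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    total, count = reduce(reduce_partials, partials, (0, 0))
    return total / count if count else float("nan")
```

For example, `distributed_mean(list(range(1, 101)))` returns `50.5`. Each map step touches only its own chunk and the reduce step is associative, so the partial results can be merged in any grouping. That property is what lets this style of computation spread across many workers.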