This document describes IBM's reference architecture for data and AI. It offers guidance on designing systems that apply AI to large volumes of data, and covers strategies for collecting, storing, processing, and analyzing data at scale with technologies such as Apache Spark, Hadoop, and containers. It is intended to help organizations build systems that extract insights from their data.