This document describes how to use Spark Mbuto to build machine learning pipelines for image classification and retrieval. It gives an overview of the classification and retrieval problems and the logic behind the pipeline for each. Key pipeline steps include feature extraction, building a codebook (dictionary), and training and evaluating classifiers. The classification models discussed include k-nearest neighbors (KNN) and neural networks. Image pipelines in Spark Mbuto are built as a sequence of Spark jobs that share a context.
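The pipeline steps named above (feature extraction, codebook building, classifier training and evaluation) can be illustrated with a minimal sketch. It uses plain NumPy rather than Spark Mbuto, whose API is not shown in this summary. Every function name here (`fake_image`, `build_codebook`, `encode`, `knn_predict`) is hypothetical, the "descriptors" are synthetic random vectors standing in for a real feature extractor, and k-means is assumed as the codebook method, a common choice that the summary does not confirm.

```python
import numpy as np

rng = np.random.default_rng(42)

def fake_image(label, n_desc=30, dim=8):
    # Hypothetical stand-in for feature extraction: each "image" yields
    # n_desc local descriptors drawn around a class-dependent mean.
    return rng.normal(loc=5.0 * label, scale=0.5, size=(n_desc, dim))

def build_codebook(descriptors, k, iters=10, seed=0):
    # Plain k-means (Lloyd's algorithm); the k centroids are the codewords.
    init = np.random.default_rng(seed).choice(len(descriptors), size=k, replace=False)
    centroids = descriptors[init].copy()
    for _ in range(iters):
        # Squared distance from every descriptor to every centroid: shape (n, k).
        d = ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            members = descriptors[assign == j]
            if len(members):  # leave a centroid unchanged if its cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids

def encode(image_descriptors, codebook):
    # Represent an image as a normalized histogram of its nearest codewords.
    d = ((image_descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def knn_predict(train_x, train_y, query, k=3):
    # Majority vote among the k training histograms closest to the query.
    dist = np.linalg.norm(train_x - query, axis=1)
    nearest = train_y[np.argsort(dist)[:k]]
    return int(np.bincount(nearest).argmax())

# 1. Feature extraction (synthetic here).
train_labels = np.array([0, 1] * 10)
test_labels = np.array([0, 1] * 5)
train_images = [fake_image(y) for y in train_labels]
test_images = [fake_image(y) for y in test_labels]

# 2. Codebook from the pooled training descriptors.
codebook = build_codebook(np.vstack(train_images), k=4)

# 3. Encode every image against the codebook.
train_x = np.array([encode(img, codebook) for img in train_images])
test_x = np.array([encode(img, codebook) for img in test_images])

# 4. Classify with KNN and evaluate on the held-out images.
predictions = np.array([knn_predict(train_x, train_labels, q) for q in test_x])
accuracy = float((predictions == test_labels).mean())
print(f"accuracy: {accuracy:.2f}")
```

The same encoded histograms could also serve retrieval: ranking `train_x` by distance to a query histogram, as `knn_predict` does before voting, returns the most similar images instead of a label. In a Spark setting each numbered stage would plausibly map to its own job, but that mapping is an assumption of this sketch.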