This document provides an overview and agenda for a training session on using Apache Spark for predictive analytics. Topics covered include what Spark is, how to use Spark on IBM Cloud, basic programming in Scala and Python, Spark Streaming, machine learning with MLlib, and graph processing with GraphX. Use cases presented include customer behavior analytics, predictive maintenance using IoT data, and network performance optimization. The hands-on labs cover introductory notebooks, sentiment analysis on Twitter data, and analyzing Apache HTTP response codes in log data. The motivation for local development versus cloud deployment is also addressed.