This document summarizes the development of an image processing API and ML system built to process each image in under 2 seconds while providing visibility, persistence, scalability, and extensibility. It describes splitting the API and the ML component into separate services, reducing processing time from 4 seconds to 1 second through changes such as adopting Kafka, scaling the system, and troubleshooting performance spikes and problems with message brokers and databases. Lessons learned include the importance of distinguishing CPU-bound from IO-bound tasks and the challenges of Go versus Python. The overall goal of processing each request in under 1 second was eventually achieved.
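
The CPU-bound versus IO-bound distinction mentioned above can be illustrated with a minimal Python sketch. It is not taken from the system described here: `fetch_image`, `run_inference`, and `process_batch` are hypothetical names, the sleep stands in for a network or disk wait, and the arithmetic stands in for model inference. The idea is only that the two kinds of work are given separate executors so each can be sized and chosen independently.

```python
import time
from concurrent.futures import Executor, ThreadPoolExecutor


def fetch_image(image_id: int) -> bytes:
    """IO-bound stand-in (hypothetical): the sleep simulates waiting on network or disk."""
    time.sleep(0.01)
    return bytes([image_id % 256]) * 4


def run_inference(pixels: bytes) -> int:
    """CPU-bound stand-in (hypothetical): pure computation with no waiting."""
    return sum(b * b for b in pixels)


def process_batch(image_ids, io_pool: Executor, cpu_pool: Executor) -> list:
    """Run the IO-bound stage and the CPU-bound stage on separate executors."""
    # IO-bound stage: worker threads overlap the time spent waiting.
    images = list(io_pool.map(fetch_image, image_ids))
    # CPU-bound stage: the caller chooses the executor; in CPython a process
    # pool is the usual choice because threads share one interpreter lock.
    return list(cpu_pool.map(run_inference, images))
```

In a setup like this, `io_pool` would typically be a `ThreadPoolExecutor` and `cpu_pool` a `concurrent.futures.ProcessPoolExecutor`, created under an `if __name__ == "__main__":` guard. Passing the executors in as parameters keeps that choice outside the processing logic, so it can be changed after measuring where the time actually goes.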