StreamSets can process data using Apache Spark in three ways:
1) The Spark Evaluator stage runs user-provided Spark code against each batch of records in a pipeline, returning transformed records or errors (see the first sketch after this list).
2) A Cluster Pipeline can leverage Apache Spark's Direct Kafka DStream to partition data from Kafka topics across worker pipelines running on a cluster (second sketch below).
3) The Spark Executor stage can kick off a Spark application when it receives an event, allowing tasks such as updating a machine-learning model to run on streaming data with Spark (third sketch below).
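To give a feel for the first case, here is a minimal sketch of the kind of per-batch logic the Spark Evaluator runs. In practice the evaluator hands your code StreamSets Record objects through its own transformer interface; the `Record` case class, field names, and validation rule below are purely illustrative stand-ins, not the StreamSets API.

```scala
import org.apache.spark.rdd.RDD

// Stand-in for a pipeline record; the real Spark Evaluator passes StreamSets Record objects.
case class Record(id: String, payload: Map[String, String])

// Per-batch logic of the kind the Spark Evaluator invokes: transform each record,
// returning successfully processed records and failed records separately.
def transformBatch(batch: RDD[Record]): (RDD[Record], RDD[(Record, String)]) = {
  val validated = batch.map { rec =>
    if (rec.payload.contains("amount"))
      Right(rec.copy(payload = rec.payload + ("validated" -> "true")))
    else
      Left((rec, "missing required field 'amount'"))
  }
  val results = validated.collect { case Right(r) => r }
  val errors  = validated.collect { case Left(e)  => e }
  (results, errors)
}
```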
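For the second case, the sketch below shows the underlying Spark Streaming API that a Direct Kafka DStream is built on, using the spark-streaming-kafka-0-10 integration. The broker address, consumer group, and topic name are placeholders; StreamSets generates and manages this plumbing for you in a Cluster Pipeline, so this is only meant to show why the data naturally partitions across workers.

```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010.{ConsumerStrategies, KafkaUtils, LocationStrategies}

val conf = new SparkConf().setAppName("direct-kafka-sketch")
val ssc  = new StreamingContext(conf, Seconds(5))

// Placeholder connection details.
val kafkaParams = Map[String, Object](
  "bootstrap.servers"  -> "kafka:9092",
  "key.deserializer"   -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id"           -> "example-group"
)

// The direct stream creates one Spark partition per Kafka partition,
// which is what lets the work spread across worker pipelines on the cluster.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("example-topic"), kafkaParams)
)

stream.foreachRDD { rdd =>
  // Each partition of this RDD corresponds to a Kafka partition.
  rdd.foreachPartition(records => records.foreach(r => println(r.value())))
}

ssc.start()
ssc.awaitTermination()
```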
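Finally, for the third case, the Spark Executor stage is configured in the pipeline rather than written as code, but the effect is similar to launching an application with Spark's own SparkLauncher when an event arrives. The jar path, main class, and event argument below are hypothetical, chosen only to illustrate the "event triggers a Spark application" pattern.

```scala
import org.apache.spark.launcher.SparkLauncher

// Hypothetical jar and main class; in a real pipeline the Spark Executor stage
// is configured with these values instead of coding the launch by hand.
def onEvent(eventDataPath: String): Unit = {
  val handle = new SparkLauncher()
    .setAppResource("/opt/jobs/update-model.jar")   // assumed application jar
    .setMainClass("com.example.UpdateModel")        // assumed entry point
    .setMaster("yarn")
    .addAppArgs(eventDataPath)                      // pass the event's data location to the job
    .startApplication()
  // The returned handle can be polled or given a listener to track the application's state.
}
```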