15th Athens Big Data Meetup - 1st Talk - Running Spark On Mesos

Running Spark on Mesos
Christos Sidiropoulos, Lead DevOps Engineer, Encode

Agenda
● About
● Alternatives
● Mesos architecture
● DC/OS
● Spark installation/configuration
● Submitting Spark applications
● Monitoring Spark applications
● Viewing the logs

About
● Advanced Security Analytics and Response Orchestration
● Early compromise detection.
● Capture and analyze traffic logs.

Alternatives
● Standalone
○ Easy to deploy (scripts are bundled with the Spark distribution).
○ Can easily run on localhost for development.
○ Master-Worker setup.
○ HA supported via ZooKeeper.
○ Web UI for monitoring cluster and job statistics.
● Hadoop YARN
○ Harder to bring up.
○ Combination of the ResourceManager, NodeManager, Application Master & Container.
○ HA supported via ZooKeeper.
○ ResourceManager/NodeManager UI.
● Kubernetes
○ Still experimental as of Spark 2.4.0.
○ Easy to get up and running if you are already familiar with k8s.
● Nomad
○ Good if you are into experimenting with HashiCorp products.

A few things about Mesos
● Applies the same principles as the Linux kernel, only at a different level of abstraction.
● Dynamic resource sharing and isolation (CPU, RAM, …).
● Turn your data center into one very large computer (global resource manager).
● Dominant Resource Fairness.
● Scales to tens of thousands of nodes.
● Packages and commercial support through Mesosphere.
● Even YARN can run on Mesos (Apache Myriad).
● Three main components
○ Mesos Master
○ Mesos Agent
○ Mesos Framework
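
To make the Dominant Resource Fairness bullet above concrete, here is a minimal sketch of the idea: repeatedly give one more task to the framework whose dominant share (its largest fraction of any single resource) is currently smallest. This is a simplified illustration, not how the Mesos allocator is implemented; the function name `drf_allocate`, the two-resource cluster (9 CPUs, 18 GB) and the per-task demands are assumptions chosen for the example.

```python
from typing import Dict, List


def drf_allocate(capacity: List[float],
                 demands: Dict[str, List[float]]) -> Dict[str, int]:
    """Simplified progressive-filling DRF: returns tasks granted per framework."""
    used = [0.0] * len(capacity)
    tasks = {name: 0 for name in demands}

    def dominant_share(name: str) -> float:
        # Largest fraction of any one resource currently held by this framework.
        return max(tasks[name] * d / c
                   for d, c in zip(demands[name], capacity))

    active = set(demands)
    while active:
        # Pick the framework with the smallest dominant share
        # (ties broken alphabetically to keep the result deterministic).
        name = min(sorted(active), key=dominant_share)
        demand = demands[name]
        if all(u + d <= c for u, d, c in zip(used, demand, capacity)):
            used = [u + d for u, d in zip(used, demand)]
            tasks[name] += 1
        else:
            # This framework's next task no longer fits; stop offering to it.
            active.discard(name)
    return tasks


# Hypothetical cluster: 9 CPUs and 18 GB of RAM.
# Framework A runs tasks of <1 CPU, 4 GB>, framework B tasks of <3 CPU, 1 GB>.
allocation = drf_allocate([9, 18], {"A": [1, 4], "B": [3, 1]})
print(allocation)
```

With these numbers, A is memory-dominant and B is CPU-dominant, and both end up with an equal dominant share of two thirds.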

DC/OS
● Distributed operating system based on the Apache Mesos distributed systems kernel.
● A Cluster Manager.
● A Container Platform.
● An Operating System.
● Great documentation.
● Easy to spin up.
● A great catalog with packages (Universe).

Bring it up
● CloudFormation
● Terraform/Ansible
● Manual
https://github.com/dcos-labs/ansible-dcos
os = "centos_7.4"
state = "none"
dcos_version = "1.11.4"
#
num_of_masters = "1"
num_of_private_agents = "5"
num_of_public_agents = "1"
num_of_spark_spot_agents = "0"
num_of_spark_dev_agents = "1"
num_of_private_spark_agents = "3"
#
aws_region = "eu-west-1"
aws_bootstrap_instance_type = "t3.large"
aws_master_instance_type = "t3.2xlarge"
aws_agent_instance_type = "t3.xlarge"
aws_spark_spot_agent_instance_type = "r3.2xlarge"
aws_spark_dev_agent_instance_type = "t3.2xlarge"
aws_spark_agent_instance_type = "m5.4xlarge"
aws_public_agent_instance_type = "t3.large"
ssh_key_name = "csidi"
ssh_spark_agents_key_name = "ansible"
ssh_spark_agents_private_key_filename = "/home/ansible/.ssh/id_rsa"

Spark Installation
● dcos package install spark (et voila!)
● Alternatively we can use the Web UI.

Spark on Mesos
● Client Mode
○ A Spark Mesos framework is launched directly on the client machine and waits for the driver output.
● Cluster mode
○ The driver is launched in the cluster and the client can find the results of the driver in the Mesos Web UI.
● Mesos run modes:
○ Fine-grained mode (deprecated).
○ Coarse-grained mode: each Spark executor runs as a single Mesos task, so executors have a constant size throughout their lifetime.

Submitting Spark applications
● dcos spark CLI
○ dcos spark run --submit-args="--class org.apache.spark.examples.SparkPi https://downloads.mesosphere.com/spark/assets/spark-examples_2.11-2.0.1.jar 30"
● spark-submit (from inside the cluster)
○ /opt/spark/dist/bin/spark-submit --deploy-mode cluster --master mesos://spark-dispatcher.marathon.l4lb.thisdcos.directory:7077 --class org.apache.spark.examples.SparkPi https://downloads.mesosphere.com/spark/assets/spark-examples_2.11-2.0.1.jar 30

Notable configuration options when submitting an application
● spark.mesos.executor.docker.image
● spark.mesos.uris
● spark.mesos.role
● spark.executor.memory
● spark.executor.cores
● spark.cores.max (number of executors = spark.cores.max / spark.executor.cores)
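
The executor arithmetic above can be sketched as follows. This is a rough illustration only: the helper names `executor_count` and `conf_flags` and all the values are hypothetical. The result is best read as an upper bound, since the number of executors actually launched also depends on the resource offers Mesos makes (memory included).

```python
def executor_count(cores_max: int, executor_cores: int) -> int:
    """Upper bound on executors: spark.cores.max / spark.executor.cores."""
    if cores_max <= 0 or executor_cores <= 0:
        raise ValueError("both values must be positive")
    # Assumption: a partial executor is never launched, so round down.
    return cores_max // executor_cores


def conf_flags(conf: dict) -> list:
    """Render a settings dict as repeated `--conf key=value` arguments."""
    flags = []
    for key, value in conf.items():
        flags += ["--conf", f"{key}={value}"]
    return flags


# Hypothetical sizing: 16 cores in total, 4 cores and 8g per executor.
conf = {
    "spark.cores.max": 16,
    "spark.executor.cores": 4,
    "spark.executor.memory": "8g",
}
print(executor_count(conf["spark.cores.max"], conf["spark.executor.cores"]))
print(" ".join(conf_flags(conf)))
```

The flags produced by `conf_flags` are in the form accepted by spark-submit, so they could be placed inside the `--submit-args` string or on a spark-submit command line.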

Viewing the logs
● Mesos sandbox
● dcos spark log
dcos spark log driver-20181126153522-0001 --file="stderr" --lines_count=4
18/11/26 16:05:36 INFO ShutdownHookManager: Deleting directory /tmp/spark-19fad8b1-b162-44c4-a6ad-3cf3d9f3f004
18/11/26 16:05:36 INFO ShutdownHookManager: Deleting directory /tmp/spark-19fad8b1-b162-44c4-a6ad-3cf3d9f3f004/pyspark-b93f92bc-bf9e-40b7-8ccd-658d18c7eade
I1126 16:05:37.522994 7645 executor.cpp:675] Container exited with status 137
W1126 16:05:37.522994 7644 logging.cpp:93] RAW: Received signal SIGTERM from process 2589 of user 0; exiting
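
When reading output like the above, it helps to decode the exit status. By the usual shell and container convention, a status above 128 means the process was terminated by signal number (status - 128), so 137 corresponds to signal 9. The sketch below assumes a POSIX system; the helper name `describe_exit_status` is hypothetical, and the log alone does not say why the signal was sent.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the 128 + signal-number convention."""
    if status > 128:
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number not defined on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with code {status}"


print(describe_exit_status(137))
print(describe_exit_status(0))
```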

Viewing the logs
● filebeat
filebeat.prospectors:
- input_type: log
  paths:
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stdout*
    - /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/stderr*
    - /var/log/mesos/*.log
    - /var/log/dcos/dcos.log
  exclude_files: ["stdout.logrotate.state", "stdout.logrotate.conf", "stderr.logrotate.state", "stderr.logrotate.conf"]
  tail_files: true
output.elasticsearch:
  hosts: ["http://elasticsearch.marathon.l4lb.thisdcos.directory:9200"]
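
The sandbox paths in the configuration above encode which agent, framework and executor produced each log file. As a rough sketch of how those IDs could be recovered downstream (for example to tag log events), the regular expression below mirrors the glob layout. The function name `parse_sandbox_path` and the IDs in the sample path are made up for illustration.

```python
import re
from typing import Dict, Optional

# Mirrors: /var/lib/mesos/slave/slaves/*/frameworks/*/executors/*/runs/latest/std{out,err}*
SANDBOX_RE = re.compile(
    r"^/var/lib/mesos/slave/slaves/(?P<agent_id>[^/]+)"
    r"/frameworks/(?P<framework_id>[^/]+)"
    r"/executors/(?P<executor_id>[^/]+)"
    r"/runs/latest/(?P<stream>stdout|stderr)"
)


def parse_sandbox_path(path: str) -> Optional[Dict[str, str]]:
    """Return the IDs embedded in a Mesos sandbox log path, or None."""
    match = SANDBOX_RE.match(path)
    return match.groupdict() if match else None


# Hypothetical path; real agent and framework IDs are long generated strings.
sample = ("/var/lib/mesos/slave/slaves/agent-1/frameworks/fw-1"
          "/executors/driver-20181126153522-0001/runs/latest/stderr")
print(parse_sandbox_path(sample))
print(parse_sandbox_path("/var/log/mesos/mesos-agent.log"))
```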

Monitoring of a Spark Job
● Graphite
● Grafana

Monitoring Mesos nodes
● Prometheus/Grafana
● TICK

Future Work / Sum up
● Scaling
● Dynamic resource allocation
● Multi-tenancy
