Spark as Service in cloud on Yarn
Hadoop Meetup
bharatb@qubole.com, rgupta@qubole.com
May 15, 2015
Agenda
• Spark on Yarn
• Autoscaling Spark Apps and Cluster
management
• Hive Integration with Spark
• Persistent History Server
Spark on Yarn
Hadoop1
Disadvantages of Hadoop 1
• Limited to MapReduce (MR) only
• Separate map and reduce slots => underutilization
• The JobTracker (JT) is heavily loaded: it handles job scheduling, monitoring and resource allocation.
Yarn Overview
Advantages of Spark on Yarn
• General-purpose cluster for running multiple workflows. The AM can have custom logic for scheduling
• The AM can ask for more containers when required and give up containers when they are free
• This becomes even better when Yarn clusters can autoscale
• Features such as spot nodes become available, which bring additional challenges
Advantages of Spark on Yarn
• Qubole Yarn clusters can upscale and downscale
based on load and support spot instances.
Autoscaling Spark Applications
Spark Provisioning: Problems
• A Spark application starts with a fixed number of resources and holds on to them for as long as it is alive
• Sometimes it is difficult to estimate the resources required by a job, since the AM is long running
• This becomes limiting, especially when Yarn clusters can autoscale.
Dynamic Provisioning
• Speed up Spark commands by using free resources in the Yarn cluster, and release resources back to the RM when they are free.
Spark on Yarn basics
[Diagrams: Driver, AM, Executor-1 ... Executor-n]
• Cluster Mode: Driver and AM run in the same JVM, inside a Yarn container
• Client Mode: Driver and AM run in separate JVMs
• Driver and AM talk using (Akka) actors, which handles both cases
Dynamic Provisioning: Problem Statement
• Two parts:
– The Spark AM has no way to ask for additional containers or to give up free containers
– Automating the process of requesting and releasing containers. Cached data in containers makes this difficult
Dynamic Provisioning: Part1
Dynamic Provisioning: Part1
• Implementation of two new APIs:
// Request 5 extra executors
sc.requestExecutors(5)
// Kill executors with IDs 1, 15, and 16
sc.killExecutors(Seq("1", "15", "16"))
requestExecutors
[Diagram: Driver; AM with Reporter Thread; executors E1, E2 ... En]
• The AM has a reporter thread that keeps a count of the number of executors
• The reporter thread was already used to restart executors that died
• The driver increments this count when sc.requestExecutors is called, and the reporter thread then requests the missing executors.
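The interplay between the driver and the reporter thread can be pictured as a small reconciliation loop. What follows is a toy Python model under stated assumptions, not Spark's implementation: the class `AllocatorModel` and all of its method names are hypothetical, and container allocation is reduced to adding an ID to a set.

```python
class AllocatorModel:
    """Toy model of the AM-side executor bookkeeping (hypothetical, not Spark code)."""

    def __init__(self, initial_executors):
        self.target = initial_executors  # number of executors the AM should keep alive
        self.running = set()             # IDs of executors currently alive
        self._next_id = 1

    def request_executors(self, n):
        # Driver side: a call like sc.requestExecutors(n) only raises the target count.
        self.target += n

    def executor_died(self, executor_id):
        # The target is unchanged, so the next reporter tick replaces the executor.
        self.running.discard(executor_id)

    def reporter_tick(self):
        # Reporter thread: launch executors until the running count matches the target.
        launched = []
        while len(self.running) < self.target:
            executor_id = str(self._next_id)
            self._next_id += 1
            self.running.add(executor_id)
            launched.append(executor_id)
        return launched


am = AllocatorModel(initial_executors=2)
first = am.reporter_tick()    # starts the two initial executors
am.executor_died("1")         # one executor is lost
am.request_executors(5)       # the driver asks for five more
second = am.reporter_tick()   # replaces the dead executor and adds the five new ones
print(first, len(second), len(am.running))
```

The point of the sketch is that restarting dead executors and serving new requests can share one mechanism: both are just a gap between the target count and the running count.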
removeExecutors
• To kill executors, one must specify exactly which executors need to be killed
• The driver maintains a list of all executors, which can be obtained by:
sc.executorStorageStatuses.foreach(x => println(x.blockManagerId.executorId))
• What is cached in each executor is also available using:
sc.executorStorageStatuses.foreach(x => println(s"memUsed = ${x.memUsed} diskUsed = ${x.diskUsed}"))
Removing Executors Tradeoffs
• The BlockManager in each executor can hold cached RDDs, shuffle data and broadcast data
• Killing an executor that holds shuffle data will require the stage to be rerun.
• To avoid this, use the external shuffle service introduced in Spark 1.2
Dynamic Provisioning: Part2
Upscaling Heuristics
• Request as many executors as there are pending tasks
• Request executors in rounds while there are pending tasks, doubling the number of executors added in each round, bounded by an upper limit
• Request executors by estimating the workload
• Introduced --max-executors as an extra parameter
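The round-based heuristic above can be sketched in a few lines of Python. This is an illustration under simplifying assumptions rather than the code from the pull requests: the function names are hypothetical, the backlog of pending tasks is held constant across rounds, and one executor is assumed to serve one pending task.

```python
def executors_to_request(pending_tasks, current_executors, last_round_added, max_executors):
    """Return how many executors to add this round (hypothetical helper)."""
    if pending_tasks <= 0:
        return 0
    # Double the previous round's request, starting from one executor.
    wanted = max(1, last_round_added * 2)
    # Never ask for more executors than there are pending tasks ...
    wanted = min(wanted, pending_tasks)
    # ... and never go beyond the --max-executors bound.
    headroom = max(0, max_executors - current_executors)
    return min(wanted, headroom)


def ramp_up(pending_tasks, start_executors, max_executors):
    """Simulate successive rounds while the backlog stays constant."""
    current, last_added, history = start_executors, 0, []
    while True:
        added = executors_to_request(pending_tasks, current, last_added, max_executors)
        if added == 0:
            break
        current += added
        last_added = added
        history.append(added)
    return history, current


history, final = ramp_up(pending_tasks=100, start_executors=2, max_executors=20)
print(history, final)
```

Starting small and doubling keeps a short burst of tasks from grabbing the whole cluster, while a sustained backlog still reaches the upper limit in a handful of rounds.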
Downscaling Heuristics
• Remove executors when they are idle
• Remove executors if they have been idle for X seconds
• Executors holding shuffle data or broadcast data cannot be removed.
• --num-executors acts as the minimum number of executors
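A minimal Python sketch of these downscaling rules, assuming the per-executor idle time and data flags are already known. The `ExecutorInfo` record and the function `executors_to_remove` are hypothetical names introduced for illustration; they are not part of Spark.

```python
from dataclasses import dataclass


@dataclass
class ExecutorInfo:
    executor_id: str
    idle_seconds: float
    has_shuffle_data: bool = False
    has_broadcast_data: bool = False


def executors_to_remove(executors, idle_timeout_secs, min_executors):
    """Pick idle executors to release while keeping at least min_executors."""
    removable = [
        e for e in executors
        if e.idle_seconds >= idle_timeout_secs
        and not e.has_shuffle_data
        and not e.has_broadcast_data
    ]
    # Release the longest-idle executors first.
    removable.sort(key=lambda e: e.idle_seconds, reverse=True)
    # min_executors plays the role of --num-executors: never drop below it.
    max_removals = max(0, len(executors) - min_executors)
    return [e.executor_id for e in removable[:max_removals]]


executors = [
    ExecutorInfo("1", idle_seconds=120),
    ExecutorInfo("2", idle_seconds=300, has_shuffle_data=True),
    ExecutorInfo("3", idle_seconds=90),
    ExecutorInfo("4", idle_seconds=5),
]
print(executors_to_remove(executors, idle_timeout_secs=60, min_executors=3))
print(executors_to_remove(executors, idle_timeout_secs=60, min_executors=2))
```

Executor 2 has been idle the longest but is never selected because it holds shuffle data; the returned IDs are the kind of list one would pass to sc.killExecutors.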
Scope
• Kill executors on spot nodes first
• Flag to avoid killing executors that hold shuffle data
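One way to picture these two ideas is as an ordering over kill candidates plus a protective flag. The Python sketch below is hypothetical: the dictionary keys (`on_spot_node`, `has_shuffle_data`) and both function names are assumptions made for illustration, not fields or APIs that Spark exposes.

```python
def kill_order(executors):
    """Order kill candidates: spot-node executors first, shuffle-free ones before the rest."""
    # False sorts before True, so the spot flag is negated to put spot nodes first.
    return sorted(executors, key=lambda e: (not e["on_spot_node"], e["has_shuffle_data"]))


def pick_victims(executors, count, protect_shuffle=True):
    """Choose up to `count` executor IDs to kill, honouring the shuffle-data flag."""
    candidates = [
        e for e in executors
        if not (protect_shuffle and e["has_shuffle_data"])
    ]
    return [e["id"] for e in kill_order(candidates)[:count]]


executors = [
    {"id": "1", "on_spot_node": False, "has_shuffle_data": False},
    {"id": "2", "on_spot_node": True, "has_shuffle_data": True},
    {"id": "3", "on_spot_node": True, "has_shuffle_data": False},
    {"id": "4", "on_spot_node": False, "has_shuffle_data": True},
]
print(pick_victims(executors, count=2))
print(pick_victims(executors, count=3, protect_shuffle=False))
```

With the flag on, executors holding shuffle data are never offered as candidates; with it off, they are only chosen after shuffle-free executors of the same node type.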
Where is the code?
• https://github.com/apache/spark/pull/2840
• https://github.com/apache/spark/pull/2746
Spark Hive Integration
What is involved?
• Spark programs should be able to access the Hive metastore
• Other Qubole services (Hive, Presto, Pig, etc.) can be producers or consumers of data and metadata
Using SparkSQL - Command UI
Using SparkSQL - Results
Using SparkSQL - Notebook
• SQL, Python, Scala code can be input
Using SparkSQL - REST api - scala
curl --silent -X POST \
-H "X-AUTH-TOKEN: $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '{
"program" : "val s = new org.apache.spark.sql.hive.HiveContext(sc); s.sql(\"show tables\").collect.foreach(println)",
"language" : "scala",
"command_type" : "SparkCommand"
}' \
https://api.qubole.net/api/latest/commands
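Embedding Scala source in a JSON string by hand is error-prone because the inner quotes must be escaped. As a hedged alternative, the same request can be assembled in Python so that `json.dumps` does the escaping. The endpoint, headers and field names are taken from the curl example above; the helper name `build_spark_command_request` and the token value are placeholders, and the request is only built here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.qubole.net/api/latest/commands"


def build_spark_command_request(program, language, auth_token):
    """Build (but do not send) the POST request for a SparkCommand."""
    payload = {
        "program": program,
        "language": language,
        "command_type": "SparkCommand",
    }
    # json.dumps escapes the quotes and newlines inside the program text.
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "X-AUTH-TOKEN": auth_token,
        "Content-Type": "application/json",
        "Accept": "application/json",
    }
    return urllib.request.Request(API_URL, data=body, headers=headers, method="POST")


scala_program = (
    "val s = new org.apache.spark.sql.hive.HiveContext(sc);\n"
    's.sql("show tables").collect.foreach(println)'
)
request = build_spark_command_request(scala_program, "scala", auth_token="PLACEHOLDER_TOKEN")
print(request.data.decode("utf-8"))
# To actually submit the command: urllib.request.urlopen(request)
```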
Using SparkSQL - REST api - sql
curl --silent -X POST \
-H "X-AUTH-TOKEN: $AUTH_TOKEN" \
-H "Content-Type: application/json" \
-H "Accept: application/json" \
-d '{
"program" : "show tables",
"language" : "sql",
"command_type" : "SparkCommand"
}' \
https://api.qubole.net/api/latest/commands
NOT RELEASED YET
Using SparkSQL - qds-sdk-py / java
from qds_sdk.commands import SparkCommand

# Read the PySpark program to submit
with open("test_spark.py") as f:
    code = f.read()

# Submit the program as a Spark command and fetch its results
cmd = SparkCommand.run(language="python", label="spark", program=code)
results = cmd.get_results()
Using SparkSQL - Cluster config
Spark UI container info
Basic cluster organization
• DB instance in the Qubole account
• SSH tunnel from the master to the metastore DB
• Metastore server running on the master on port 10000
• On master and slave nodes, hive-site.xml contains:
hive.metastore.uris=thrift://master_ip:10000
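In hive-site.xml itself the setting is written as a property element rather than a key=value line. The Python sketch below generates that fragment; the function name `hive_site_xml` and the address `10.0.0.1` are placeholders, and a real hive-site.xml would normally carry further properties.

```python
import xml.etree.ElementTree as ET


def hive_site_xml(master_ip, port=10000):
    """Render a minimal hive-site.xml pointing at the metastore on the master."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    ET.SubElement(prop, "value").text = f"thrift://{master_ip}:{port}"
    return ET.tostring(configuration, encoding="unicode")


xml_text = hive_site_xml("10.0.0.1")
print(xml_text)
```

Every node gets the same file, so all Spark executors and other services resolve table metadata through the single metastore server on the master.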
Hosted metastore
Problems
• Yarn overhead should be about 20% (based on TPC-H)
• Parquet needs a higher PermGen size
• Cached tables use the actual table
• ALTER TABLE ... RECOVER PARTITIONS is not supported
• VPC clusters have slow access to the metastore
• SchemaRDD is gone (replaced by DataFrame in Spark 1.3), so old jars don't run
• Hive jars are needed on the system classpath
Future/Near future
• Run with Qubole’s hive codebase
• Metastore caching
• Benchmarking
Future/Near future
• Persistent History Server
• Fast access to spark AM running in customer
cluster
Thank You
