BUILDING A REST JOB SERVER

FOR INTERACTIVE SPARK
AS A SERVICE
Romain Rigaux - Cloudera
Erick Tryzelaar - Cloudera
WHY?
NOTEBOOKS

EASY ACCESS FROM ANYWHERE

SHARE SPARK CONTEXTS AND RDDs

BUILD APPS

SPARK MAGIC

…
WHY SPARK

AS A SERVICE?
MARRIED WITH FULL HADOOP ECOSYSTEM
WHY SPARK

IN HUE?
HISTORY

V1: OOZIE
THE GOOD
• It works
• Code snippet
THE BAD
• Submit through Oozie
• Shell action
• Very slow
• Batch
[Diagram: workflow.xml, snippet.py, stdout]
HISTORY

V2: SPARK IGNITER
THE GOOD
• It works better
THE BAD
• Compiler Jar
• Batch only, no shell
• No Python, R
• Security
• Single point of failure
[Diagram: Implement, Compile, Upload; Scala, jar, Batch, json output; Ooyala]
HISTORY

V3: NOTEBOOK
THE GOOD
• Like spark-submit / spark shells
• Scala / Python / R shells
• Jar / Python batch jobs
• Notebook UI
• YARN
THE BAD
• Beta?
[Diagram: Livy, code snippet, batch]
GENERAL ARCHITECTURE
[Diagram: Livy, exposing an API, in front of several Spark instances on YARN]
LIVY SPARK SERVER
• REST web server in Scala for Spark submissions
• Interactive shell sessions or batch jobs
• Backends: Scala, Java, Python, R
• No dependency on Hue
• Open source: https://github.com/cloudera/hue/tree/master/apps/spark/java
• Read about it: http://gethue.com/spark/
LIVY WEB SERVER ARCHITECTURE
• Standard web service: wrapper around spark-submit / Spark shells
• YARN mode: Spark drivers run inside the cluster (supports crashes)
• No need to inherit any interface or compile code
• Extended to work with additional backends
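To make the "wrapper around spark-submit" idea concrete, here is a minimal sketch of how a web service might turn a batch request into a spark-submit command line. The function name and its parameters are hypothetical and this is not Livy's actual code; only the spark-submit flags themselves (--master, --deploy-mode, --class) are real.

```python
def build_spark_submit_command(app, main_class=None, args=(),
                               master="yarn", deploy_mode="cluster"):
    """Build the argv list for a spark-submit call (hypothetical helper)."""
    cmd = ["spark-submit", "--master", master, "--deploy-mode", deploy_mode]
    if main_class:
        # --class is only needed for JVM applications packaged as a jar
        cmd += ["--class", main_class]
    cmd.append(app)    # path to the jar or Python file
    cmd.extend(args)   # arguments passed through to the application
    return cmd
```

The resulting list could be handed to subprocess.Popen; in cluster deploy mode the Spark driver then runs inside YARN rather than in the web server process.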
LOCAL “DEV” MODE / YARN MODE
LOCAL MODE
[Diagram, built up over numbered steps 1 to 5: Spark Clients, Livy Server (Scalatra, Session Manager, Session), Spark Interpreter, Spark Context]
YARN-CLUSTER MODE
PRODUCTION SCALABLE
[Diagram, built up over numbered steps 1 to 7: Spark Client, Livy Server (Scalatra, Session Manager, Session), YARN Master, and YARN Nodes hosting the Spark Interpreter, the Spark Context, and the Spark Workers]
SESSION CREATION AND EXECUTION
% curl -XPOST localhost:8998/sessions \
    -d '{"kind": "spark"}'
{
  "id": 0,
  "kind": "spark",
  "log": [...],
  "state": "idle"
}
% curl -XPOST localhost:8998/sessions/0/statements -d '{"code": "1+1"}'
{
  "id": 0,
  "output": {
    "data": { "text/plain": "res0: Int = 2" },
    "execution_count": 0,
    "status": "ok"
  },
  "state": "available"
}
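The two curl calls above can also be prepared from Python. This is a minimal sketch using only the standard library; the helper names are hypothetical, the host and port are copied from the curl example, and the Content-Type header is an assumption about what the server expects.

```python
import json
from urllib.request import Request

LIVY_URL = "http://localhost:8998"  # same host and port as the curl example


def create_session_request(kind="spark", base_url=LIVY_URL):
    """Build (but do not send) the POST /sessions request."""
    body = json.dumps({"kind": kind}).encode("utf-8")
    return Request(f"{base_url}/sessions", data=body, method="POST",
                   headers={"Content-Type": "application/json"})  # assumed header


def run_statement_request(session_id, code, base_url=LIVY_URL):
    """Build (but do not send) the POST /sessions/<id>/statements request."""
    body = json.dumps({"code": code}).encode("utf-8")
    return Request(f"{base_url}/sessions/{session_id}/statements",
                   data=body, method="POST",
                   headers={"Content-Type": "application/json"})  # assumed header
```

Passing either request to urllib.request.urlopen would send it to a running server and return the JSON responses shown above.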
BATCH OR INTERACTIVE
[Diagram: Jar, Py → /batches; Scala, Python, R → /sessions; Livy in front of several Spark instances on YARN]
SHELL OR BATCH?
[Diagram: Spark Client, Livy Server (Scalatra, Session Manager, Session), YARN Master, a YARN Node with the Spark Interpreter and Spark Context, and YARN Nodes with Spark Workers]
SHELL
[Diagram: same layout, with pyspark next to the Spark Context]
BATCH
[Diagram: same layout, with spark-submit next to the Spark Context]
LIVY INTERPRETERS
Scala, Python, R…
REMEMBER?
[Diagram: Spark Client, Livy Server (Scalatra, Session Manager, Session), YARN Master, a YARN Node with the Spark Context and Spark Interpreter, and YARN Nodes with Spark Workers]
INTERPRETERS
• Pipe stdin/stdout to a running shell
• Execute the code / send to Spark workers
• Perform magic operations
• One interpreter per language
• “Swappable” with other kernels (python, spark..)
[Diagram: the Livy Server sends println(1 + 1) to the Interpreter, which runs "> println(1 + 1)" and returns 2]
INTERPRETER FLOW
[Diagram, built up step by step: "> 1 + 1" is entered; the Livy Server receives {"code": "1+1"} and sends 1+1 to the Interpreter; the result 2 comes back through Magic; the Livy Server replies with the JSON below and 2 is displayed]
{
  "data": {
    "application/json": "2"
  }
}
INTERPRETER FLOW CHART
• Receive lines
• Split into chunks
• Chunks left? No: send output to server. Yes: continue
• Magic chunk? Yes: Magic! No: execute chunk
• Success? Yes: back to "Chunks left?". No: send error to server
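The flow chart can be sketched as a small loop. This is an illustration only, not Livy's interpreter: it assumes that a magic chunk is a single line starting with "%", it runs plain Python with exec instead of driving a Spark shell, and all function names are hypothetical.

```python
def split_into_chunks(lines):
    """Group consecutive code lines; each %magic line becomes its own chunk."""
    chunks, current = [], []
    for line in lines:
        if line.startswith("%"):
            if current:
                chunks.append(("code", "\n".join(current)))
                current = []
            chunks.append(("magic", line[1:].strip()))
        elif line.strip():
            current.append(line)
    if current:
        chunks.append(("code", "\n".join(current)))
    return chunks


def run_magic(chunk, namespace):
    """Handle a magic chunk such as 'json counts' (only %json in this sketch)."""
    name, _, arg = chunk.partition(" ")
    if name == "json":
        return {"application/json": namespace[arg.strip()]}
    raise ValueError(f"unknown magic: {name}")


def interpret(lines, namespace=None):
    """Receive lines, split into chunks, run each one, stop at the first error."""
    namespace = {} if namespace is None else namespace
    output = {"text/plain": ""}
    for kind, chunk in split_into_chunks(lines):
        try:
            if kind == "magic":
                output = run_magic(chunk, namespace)
            else:
                exec(chunk, namespace)
        except Exception as exc:
            return {"status": "error", "evalue": str(exc)}  # send error to server
    return {"status": "ok", "data": output}  # no chunks left: send output
```

For example, interpret(["counts = [1, 2]", "%json counts"]) runs the assignment as a code chunk and then returns the value of counts under the "application/json" key.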
Example	of	parsing
INTERPRETER MAGIC
• table	
• json	
• plotting	
• ...
NO MAGIC
> 1 + 1
Interpreter
1+1
sparkIMain.interpret("1+1")
{
  "id": 0,
  "output": {
    "application/json": 2
  }
}
JSON MAGIC
[('', 506610), ('the', 23407), ('I', 19540)... ]
val lines = sc.textFile("shakespeare.txt");
val counts = lines.
  flatMap(line => line.split(" ")).
  map(word => (word, 1)).
  reduceByKey(_ + _).
  sortBy(-_._2).
  map { case (w, c) =>
    Map("word" -> w, "count" -> c)
  }
%json counts
Interpreter
> counts
sparkIMain.valueOfTerm("counts").toJson()
{
  "id": 0,
  "output": {
    "application/json": [
      { "count": 506610, "word": "" },
      { "count": 23407, "word": "the" },
      { "count": 19540, "word": "I" },
      ...
    ]
  ...
}
TABLE MAGIC
val lines = sc.textFile("shakespeare.txt");
val counts = lines.
  flatMap(line => line.split(" ")).
  map(word => (word, 1)).
  reduceByKey(_ + _).
  sortBy(-_._2).
  map { case (w, c) =>
    Map("word" -> w, "count" -> c)
  }
%table counts
Interpreter
> counts
sparkIMain.valueOfTerm("counts").guessHeaders().toList()
"application/vnd.livy.table.v1+json": {
  "headers": [
    { "name": "count", "type": "BIGINT_TYPE" },
    { "name": "name", "type": "STRING_TYPE" }
  ],
  "data": [
    [ 23407, "the" ],
    [ 19540, "I" ],
    [ 18358, "and" ],
    ...
  ]
}
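The header guessing behind the table magic can be illustrated in a few lines. This is a hedged sketch, not Livy's implementation: the function names are hypothetical, the rows are plain Python dicts rather than a Spark RDD, and only the two type names shown above are produced, with everything that is not an integer treated as a string.

```python
def guess_type(value):
    """Map a Python value to one of the type names shown in the slide."""
    # bool is a subclass of int in Python, so exclude it explicitly
    if isinstance(value, int) and not isinstance(value, bool):
        return "BIGINT_TYPE"
    return "STRING_TYPE"


def table_magic(rows):
    """Turn a list of dicts into a headers-plus-data table payload."""
    names = sorted(rows[0]) if rows else []
    # Header types are guessed from the first row only
    headers = [{"name": n, "type": guess_type(rows[0][n])} for n in names]
    data = [[row[n] for n in names] for row in rows]
    return {"application/vnd.livy.table.v1+json": {"headers": headers,
                                                   "data": data}}
```

Guessing from the first row keeps the sketch short; a more careful version would have to scan more rows to cope with mixed or missing values.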
PLOT MAGIC
...
barplot(sorted_data$count, names.arg=sorted_data$value,
  main="Resource hits", las=2,
  col=colfunc(nrow(sorted_data)),
  ylim=c(0,300))
Interpreter
 > png('/tmp/..')
 > barplot
 > dev.off()
sparkIMain.interpret("png('/tmp/plot.png') barplot dev.off()")
File('/tmp/plot.png').read().toBase64()
{
  "data": {
    "image/png": "iVBORw0KGgoAAAANSUhEU
     ...
    }
  ...
}
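The last step of the plot magic, reading the rendered file and returning it as base64, might look like the following sketch. The function name is hypothetical and the output shape simply mirrors the "image/png" payload above.

```python
import base64


def png_to_output(path):
    """Read a PNG file and wrap its base64 text in an image/png payload."""
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("ascii")
    return {"data": {"image/png": encoded}}
```

Every PNG file starts with the same 8-byte signature, which is why the encoded string in the slide begins with "iVBORw0KGgo".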
PLUGGABLE INTERPRETERS
• Pluggable backends
• Livy's Spark backends
  – Scala
  – pyspark
  – R
• IPython / Jupyter support coming soon
JUPYTER BACKEND
• Re-using it
• Generic framework for interpreters
• 51 kernels
SPARK AS A SERVICE
REMEMBER AGAIN?
[Diagram: Spark Client, Livy Server (Scalatra, Session Manager, Session), YARN Master, a YARN Node with the Spark Context and Spark Interpreter, and YARN Nodes with Spark Workers]
MULTI USERS
[Diagram: three Spark Clients, Livy Server (Scalatra, Session Manager, Session), and three YARN Nodes, each with its own Spark Context and Spark Interpreter]
SHARED CONTEXTS?
[Diagram: three Spark Clients, Livy Server, and one YARN Node with a single Spark Context and Spark Interpreter]
SHARED RDD?
[Diagram: same layout, with one RDD]
SHARED RDDS?
[Diagram: same layout, with three RDDs]
SECURE IT?
[Diagram: same layout, with three RDDs]
SPARK AS SERVICE
[Diagram: three Spark Clients, Livy Server, Spark]
SHARING RDDS
[Diagram, built up step by step: a PySpark shell holding an RDD, a Shell, and a Python Shell]
PySpark shell:
r = sc.parallelize([])
srdd = ShareableRdd(r)
RDD:
{'ak': 'Alaska'}
{'ca': 'California'}
Shell:
curl -XPOST /sessions/0/statements {
   'code': srdd.get('ak')
}
Python Shell:
states = SharedRdd('host/sessions/0', 'srdd')
states.get('ak')
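One way to picture the client side of a shared RDD is a small proxy that turns each lookup into a statement sent to the session that owns the RDD. This is a sketch under assumptions and not the code from the demo: the class below only borrows the SharedRdd name from the slide, and the post callable is a hypothetical stand-in for an HTTP POST so that the example runs without a server.

```python
import json


class SharedRdd:
    """Hypothetical proxy for an RDD living in a remote interactive session."""

    def __init__(self, session_url, name, post):
        self.session_url = session_url.rstrip("/")
        self.name = name  # variable name of the shareable RDD in the session
        self.post = post  # callable(url, body) that performs the HTTP POST

    def get(self, key):
        # Build the code string that the remote session will evaluate
        code = f"{self.name}.get({key!r})"
        return self.post(f"{self.session_url}/statements",
                         json.dumps({"code": code}))
```

With this shape, states.get('ak') sends the same body as the curl call on the slide, {"code": "srdd.get('ak')"}, to the session's statements endpoint.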
DEMO
TIME

https://github.com/romainr/hadoop-tutorials-examples/tree/master/notebook/shared_rdd
SECURITY
• SSL support
• Persistent sessions
• Kerberos
SPARK MAGIC
• From Microsoft
• Python magics for working with remote Spark clusters
• Open source: https://github.com/jupyter-incubator/sparkmagic
FUTURE
• Move to ext repo?
• Security
• IPython/Jupyter backends and file format
• Shared named RDD / contexts?
• Share data
• Spark specific, language generic, both?
• Leverage Hue 4
https://issues.cloudera.org/browse/HUE-2990
LIVY’S CHEAT SHEET
• Open source: https://github.com/cloudera/hue/tree/master/apps/spark/java
• Read about it: http://gethue.com/spark/
• Scala, Java, Python, R
• Type introspection for visualization
• YARN-cluster or local modes
• Code snippets / compiled
• REST API
• Pluggable backends
• Magic keywords
• Failure resilient
• Security
THANK YOU!

TWITTER
@gethue
USER GROUP
hue-user@
WEBSITE
http://gethue.com
LEARN
http://learn.gethue.com
