Fast and Reliable Apache Spark SQL Releases

DataWorks Summit Barcelona
March 21st, 2019

About us
NICOLAS POGGI
Databricks, Performance Engineer
• Spark benchmarking
Barcelona Supercomputing - Microsoft Research Centre
• Lead researcher, ALOJA project
• New architectures for Big Data
BarcelonaTech (UPC), PhD in Computer Architecture
• Autonomic resource manager for the cloud
• Web customer modeling
BOGDAN GHIT
Databricks, Software Engineer
• Spark performance
IBM T.J. Watson Research Center
• Research intern on big data
• Bid advisor for cloud spot markets
Delft University of Technology, PhD in Computer Science
• Resource management in datacenters
• Performance of Spark, Hadoop

Databricks ecosystem
Diagram components: Developers, Tools, DBR, Cluster Manager, Infrastructure, Customers

Databricks runtime (DBR) releases
Our goal is to make releases automatic and frequent
Release lifecycle: Beta, Full Support, Marked for deprecation, Deprecated
Timeline: Feb’18, Aug’18, Nov’18, Apr’19, Jul’19, Oct’19, Feb’20
Releases shown: DBR 4.3 (Spark 2.3), DBR 5.0 (Spark 2.4), DBR 5.3-LTS* (Spark 2.4), DBR 6.0* (Spark 3.0)
* Dates and the LTS tag of new releases are subject to change

Apache Spark contributions
Hundreds of commits monthly to the Apache Spark project
Chart: number of commits
At this pace of development, mistakes are bound to happen

Where do these contributions go?
Scope of the testing
Developers invest significant engineering effort in testing
Query
Input data
Configuration
Over 200 built-in functions

Yet another brick in the wall
Unit testing is not enough to guarantee correctness and performance
• Unit testing
• Integration
• E2E
• Micro benchmarks
• Plan stability
• Fuzz testing
• Macro benchmarks
• Stress testing
• Customer workloads
• Failure testing

Continuous integration pipeline
Dev
- Merge
- Build
Test
- Correctness
- Performance
Analyze
- Rules
- Policies
Flow: Dev -> new artifacts -> Test -> metrics -> Analyze -> alerts -> Dev

Classification and alerting
Stages: Events -> Re-test -> Alert -> Classify -> Root-cause
Event types: Failure, Regression
Tracks: Correctness, Performance
Triage checks:
- Impact
- Scope
- Correlation
- Confirm?
Root-cause actions:
- Minimize
- Drill-down
- Profile
- Compare
- Validate
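The re-test and confirm steps can be illustrated with a small sketch. This is not the pipeline described in the talk: the 5% threshold, the use of medians, and the names `is_regression` and `triage` are assumptions chosen for illustration.

```python
import statistics

def is_regression(baseline_runs, candidate_runs, threshold=0.05):
    """True if the candidate median is slower than the baseline median
    by more than `threshold` (hypothetical 5% default)."""
    base = statistics.median(baseline_runs)
    cand = statistics.median(candidate_runs)
    return (cand - base) / base > threshold

def triage(baseline_runs, first_runs, retest, threshold=0.05):
    """Raise an alert only if a suspected regression reproduces on a re-test.
    `retest` is a callable that returns fresh timings for the candidate."""
    if not is_regression(baseline_runs, first_runs, threshold):
        return "ok"
    if is_regression(baseline_runs, retest(), threshold):
        return "alert"
    return "noise"
```

Re-running before alerting trades a little latency for fewer false alarms caused by noisy runs.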

Correctness

How SQLite is tested
Anomaly testing
Out-of-memory testing
IO-error testing
Crash testing
Compound failure tests
Fuzz testing
SQL Fuzz
Malformed database files
Boundary value tests
How SQLite Is Tested: https://www.sqlite.org/testing.html

Spark SQL behind the scenes
Inputs: SQL AST, DataFrame
Pipeline: Unresolved Logical Plan -> Analysis (uses the Catalog) -> Logical Plan -> Logical Optimization -> Optimized Logical Plan -> Physical Planning -> Physical Plans -> Cost Model -> Selected Physical Plan -> Code Generation -> RDDs
SQL operators can be represented as trees
Phases of transformation prepare the trees for execution
Rules can be applied once or until a fixed point is reached
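The difference between applying a rule once and applying it to a fixed point can be shown on a toy expression tree. This is a minimal sketch, not Catalyst code: the `Col` and `Coalesce` classes, the `flatten_once` rule, and the `run_to_fixed_point` helper are all hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple, Union

@dataclass(frozen=True)
class Col:
    name: str

@dataclass(frozen=True)
class Coalesce:
    args: Tuple["Expr", ...]

Expr = Union[Col, Coalesce]

def flatten_once(expr):
    """Toy rule: lift the arguments of directly nested Coalesce children
    one level up. Deeper nesting needs further passes."""
    if not isinstance(expr, Coalesce):
        return expr
    flat = []
    for arg in expr.args:
        if isinstance(arg, Coalesce):
            flat.extend(arg.args)
        else:
            flat.append(arg)
    return Coalesce(tuple(flat))

def run_to_fixed_point(expr, rules, max_iterations=100):
    """Apply all rules repeatedly until the tree stops changing."""
    for _ in range(max_iterations):
        new_expr = expr
        for rule in rules:
            new_expr = rule(new_expr)
        if new_expr == expr:
            return expr
        expr = new_expr
    raise RuntimeError("rules did not converge")

nested = Coalesce((Coalesce((Coalesce((Col("a"), Col("b"))), Col("c"))), Col("d")))
once = flatten_once(nested)                          # still contains a nested Coalesce
fixed = run_to_fixed_point(nested, [flatten_once])   # fully flat
```

The iteration cap guards against rule sets that keep rewriting each other's output.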

Random query generation
Diagram: Query profile, Model, translator, Spark query, Postgres query
The same generated query runs on both engines and the results are compared (Spark vs Postgres)
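Comparing the two engines comes down to checking that both return the same rows. A minimal sketch of such a check follows, assuming row order is not significant and that floats may differ slightly between engines. The function names and the two-digit rounding are assumptions, not the framework's actual comparison logic.

```python
import math

def normalize(value, ndigits=2):
    """Round floats so tiny representation differences do not count."""
    if isinstance(value, float):
        return "NaN" if math.isnan(value) else round(value, ndigits)
    return value

def first_mismatch(rows_a, rows_b, ndigits=2):
    """Compare two result sets ignoring row order.
    Returns None if they match, otherwise a tuple describing the difference."""
    def norm_row(row):
        return tuple(normalize(v, ndigits) for v in row)

    def sort_key(row):
        # repr() makes None and mixed types sortable.
        return tuple(repr(v) for v in row)

    a = sorted((norm_row(r) for r in rows_a), key=sort_key)
    b = sorted((norm_row(r) for r in rows_b), key=sort_key)
    if len(a) != len(b):
        return ("row_count", len(a), len(b))
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (va, vb) in enumerate(zip(row_a, row_b)):
            if va != vb:
                return (i, j, va, vb)
    return None
```

A query whose ORDER BY fully determines the output would need an order-sensitive comparison instead.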

DDL and datagen
• Random number of tables
• Random number of columns
• Random number of rows
• Random partition columns
• Choose a data type: BigInt, Boolean, Timestamp, Decimal, Float, Integer, SmallInt, String
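A random schema generator along these lines could look as follows. This is a sketch under assumptions: the size limits, the `DECIMAL(10,2)` precision, the `USING parquet` clause, and all function names are illustrative choices, not the talk's actual datagen.

```python
import random

# Type list follows the slide; the DECIMAL precision is an assumption.
DATA_TYPES = ["BIGINT", "BOOLEAN", "TIMESTAMP", "DECIMAL(10,2)",
              "FLOAT", "INT", "SMALLINT", "STRING"]

def random_table(rng, index, max_columns=5, max_rows=5):
    """Pick a random number of columns, a type per column, a row count,
    and optionally one partition column."""
    n_cols = rng.randint(1, max_columns)
    columns = []
    for i in range(1, n_cols + 1):
        sql_type = rng.choice(DATA_TYPES)
        base = sql_type.split("(")[0].lower()
        columns.append((f"{base}_col_{i}", sql_type))
    # Keep at least one non-partition column.
    n_part = rng.randint(0, min(1, n_cols - 1))
    partition_by = [name for name, _ in rng.sample(columns, n_part)]
    return {"name": f"table_{index}", "columns": columns,
            "partition_by": partition_by, "rows": rng.randint(1, max_rows)}

def random_schema(rng, max_tables=3):
    """Generate a random number of tables."""
    n_tables = rng.randint(1, max_tables)
    return [random_table(rng, i) for i in range(1, n_tables + 1)]

def to_ddl(table):
    """Render Spark-style DDL for one generated table."""
    cols = ", ".join(f"{name} {sql_type}" for name, sql_type in table["columns"])
    ddl = f"CREATE TABLE {table['name']} ({cols}) USING parquet"
    if table["partition_by"]:
        ddl += f" PARTITIONED BY ({', '.join(table['partition_by'])})"
    return ddl
```

Passing a seeded `random.Random` makes a failing schema reproducible.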

Recursive query model
Node kinds: Query, Clause, Expression
Nodes: SQL Query, WITH, FROM, UNION, SELECT, JOIN, WHERE, GROUP BY, ORDER BY, Functions, Constant, Table, Column, Alias
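The recursion in the model (a query contains clauses, clauses contain expressions, and a FROM clause may contain another query) can be sketched with a tiny generator. The table and column names, the 50% probabilities, and the restriction to SELECT, FROM, WHERE, and COALESCE are assumptions made to keep the example short.

```python
import random

TABLES = ["table_1", "table_2"]          # hypothetical table names
COLUMNS = ["int_col_1", "int_col_2"]     # every table and subquery exposes these

def gen_expression(rng, depth):
    """Expression: a column, a constant, or a function over sub-expressions."""
    kinds = ["column", "constant"] + (["function"] if depth > 0 else [])
    kind = rng.choice(kinds)
    if kind == "column":
        return rng.choice(COLUMNS)
    if kind == "constant":
        return str(rng.randint(0, 100))
    left = gen_expression(rng, depth - 1)
    right = gen_expression(rng, depth - 1)
    return f"COALESCE({left}, {right})"

def gen_query(rng, depth):
    """Query: SELECT and FROM are mandatory, WHERE is optional.
    FROM is either a base table or, recursively, another query."""
    select_list = ", ".join(f"{gen_expression(rng, 2)} AS {c}" for c in COLUMNS)
    if depth > 0 and rng.random() < 0.5:
        source = f"({gen_query(rng, depth - 1)}) AS sub_{depth}"
    else:
        source = rng.choice(TABLES)
    sql = f"SELECT {select_list} FROM {source}"
    if rng.random() < 0.5:
        sql += f" WHERE {gen_expression(rng, 1)} > {gen_expression(rng, 1)}"
    return sql

print(gen_query(random.Random(7), depth=2))
```

Aliasing every select list to the same column names keeps outer queries valid no matter which source the recursion picks.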

Probabilistic query profile
Independent weights
• Optional query clauses
Inter-dependent weights
• Join types
• Select functions
Clauses shown: ORDER BY, UNION, GROUP BY, WHERE
Weights shown: 10%, 10%, 50%, 10%
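The two kinds of weights can be sketched as follows: independent weights are separate coin flips per optional clause, while inter-dependent weights pick exactly one option from a group in proportion to its weight. The `PROFILE` values and names below are hypothetical and do not reproduce the mapping of percentages on the slide.

```python
import random

# Hypothetical profile; the numbers are illustrative only.
PROFILE = {
    "optional_clauses": {"WHERE": 0.5, "GROUP BY": 0.1, "ORDER BY": 0.1, "UNION": 0.1},
    "join_types": {"INNER": 6, "LEFT": 3, "CROSS": 1},
}

def pick_clauses(rng, profile):
    """Independent weights: each optional clause is included on its own coin flip."""
    return [clause for clause, p in profile["optional_clauses"].items()
            if rng.random() < p]

def pick_join_type(rng, profile):
    """Inter-dependent weights: exactly one join type is chosen,
    with probability proportional to its weight."""
    names = list(profile["join_types"])
    weights = list(profile["join_types"].values())
    return rng.choices(names, weights=weights, k=1)[0]
```

Tuning the profile shifts which parts of the engine the generated queries exercise.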

Coalesce flattening (1/4)
SELECT COALESCE(t2.smallint_col_3, t1.smallint_col_3, t2.smallint_col_3) AS int_col,
IF(NULL, VARIANCE(COALESCE(t2.smallint_col_3, t1.smallint_col_3, t2.smallint_col_3)),
COALESCE(t2.smallint_col_3, t1.smallint_col_3, t2.smallint_col_3)) AS int_col_1,
STDDEV(t2.double_col_2) AS float_col,
COALESCE(MIN((t1.smallint_col_3) - (COALESCE(t2.smallint_col_3, t1.smallint_col_3,
t2.smallint_col_3))), COALESCE(t2.smallint_col_3, t1.smallint_col_3, t2.smallint_col_3),
COALESCE(t2.smallint_col_3, t1.smallint_col_3, t2.smallint_col_3)) AS int_col_2
FROM table_4 t1
INNER JOIN table_4 t2 ON (t2.timestamp_col_7) = (t1.timestamp_col_7)
WHERE (t1.smallint_col_3) IN (CAST('0.04' AS DECIMAL(10,10)), t1.smallint_col_3)
GROUP BY COALESCE(t2.smallint_col_3, t1.smallint_col_3, t2.smallint_col_3)
Small dataset: 2 tables of 5x5 size
Failure found within 10 randomly generated queries
Error: Operation is in ERROR_STATE

Coalesce flattening (2/4)
Aggregate: GROUP BY COALESCE(foo.id, foo.val)
Project: COALESCE(COALESCE(foo.id, foo.val), 88)
Join: foo.ts = bar.ts
FILTER: foo.id IN (CAST('0.04' AS DECIMAL(10, 10)), foo.id)
SCAN foo
SCAN bar

Coalesce flattening (3/4)
Aggregate
Project
Join: foo.ts = bar.ts
FILTER: foo.id IN (CAST('0.04' AS DECIMAL(10, 10)), foo.id)
SCAN t1
SCAN t2
COALESCE(foo.id, foo.val)

Coalesce flattening (4/4)
Aggregate
Project
SCAN foo
Minimized query:
SELECT
FROM foo
GROUP BY
COALESCE(foo.id, foo.val)
Analyzing the error
● The optimizer flattens the nested coalesce calls
● The SELECT clause doesn’t contain the GROUP BY expression
● Possibly a problem with any GROUP BY expression that can be optimized
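Shrinking a failing plan from Aggregate, Project, Join, Filter, and two scans down to Aggregate, Project, and one scan is a reduction step that can be automated. The sketch below uses a simple greedy strategy: drop one part at a time and keep the smaller query whenever the failure still reproduces. It is an illustration, not the tool used in the talk; `minimize` and the `still_fails` predicate are hypothetical.

```python
def minimize(parts, still_fails):
    """Greedy reduction: repeatedly drop any single part whose removal
    keeps the failure reproducible, until no part can be dropped."""
    parts = list(parts)
    changed = True
    while changed:
        changed = False
        for i in range(len(parts)):
            candidate = parts[:i] + parts[i + 1:]
            if candidate and still_fails(candidate):
                parts = candidate
                changed = True
                break
    return parts

# Hypothetical stand-in for "run the query and check for the same error":
# here the failure needs only the aggregate, the projection, and one scan.
def still_fails(parts):
    return {"AGGREGATE", "PROJECT", "SCAN foo"} <= set(parts)

plan = ["AGGREGATE", "PROJECT", "JOIN", "FILTER", "SCAN foo", "SCAN bar"]
minimal = minimize(plan, still_fails)
```

In practice the predicate would rebuild a valid query from the remaining parts and execute it, which is the expensive step.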

Lead function (1/3)
SELECT (t1.decimal0803_col_3) / (t1.decimal0803_col_3) AS decimal_col,
CAST(696 AS STRING) AS char_col, t1.decimal0803_col_3,
(COALESCE(CAST('0.02' AS DECIMAL(10,10)),
CAST('0.47' AS DECIMAL(10,10)),
CAST('-0.53' AS DECIMAL(10,10)))) +
(LEAD(-65, 4) OVER (ORDER BY (t1.decimal0803_col_3) / (t1.decimal0803_col_3),
CAST(696 AS STRING))) AS decimal_col_1,
CAST(-349 AS STRING) AS char_col_1
FROM table_16 t1
WHERE (943) > (889)
Error: Column 4 in row 10 does not match:
[1.0, 696, -871.81, <<-64.98>>, -349] SPARK row
[1.0, 696, -871.81, <<None>>, -349] POSTGRESQL row

Lead function (2/3)
Project: COALESCE(expr) + LEAD(-65, 4) OVER (ORDER BY expr)
FILTER: WHERE expr
SCAN foo

Lead function (3/3)
Project: COALESCE(expr) + LEAD(-65, 4) OVER (ORDER BY expr)
FILTER: WHERE expr
SCAN foo
Analyzing the error
● Using constant input values breaks the behaviour of the LEAD function
● SPARK-16633: https://github.com/apache/spark/pull/14284
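The expected result can be worked out from the standard semantics of LEAD: for each row it returns the value `offset` rows ahead in the ordered partition, or the default (NULL) when no such row exists. The sketch below is a reference model in plain Python, assuming a single partition of 10 rows. The helper names `lead` and `sql_add` are hypothetical.

```python
from decimal import Decimal

def lead(values, offset=1, default=None):
    """LEAD over an already ordered partition: the value `offset` rows ahead,
    or `default` when that row does not exist."""
    n = len(values)
    return [values[i + offset] if i + offset < n else default for i in range(n)]

def sql_add(a, b):
    """SQL addition: NULL (None) if either operand is NULL."""
    return None if a is None or b is None else a + b

# LEAD(-65, 4) over 10 rows: the last 4 rows have no row 4 positions ahead.
leads = lead([-65] * 10, offset=4)
row_10 = sql_add(Decimal("0.02"), leads[9])   # None, as in the PostgreSQL row
wrong = sql_add(Decimal("0.02"), -65)         # Decimal("-64.98"), as in the Spark row
```

Under these assumptions the PostgreSQL value is the expected one, and the Spark value matches what results if the constant is returned even where no lead row exists.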

Query operator coverage analysis
In 15 minutes (500 queries), we reach the maximum coverage of the framework
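Coverage saturation of this kind can be tracked by recording which operators each generated query uses and plotting the cumulative share seen so far. A minimal sketch, with a hypothetical operator universe and hypothetical per-query operator sets:

```python
def coverage_curve(queries, all_operators):
    """Cumulative fraction of operators exercised after each query.
    `queries` is a sequence of operator sets, one per generated query."""
    seen = set()
    curve = []
    for operators in queries:
        seen |= set(operators) & all_operators
        curve.append(len(seen) / len(all_operators))
    return curve

# Hypothetical example: coverage flattens once new queries add nothing new.
universe = {"JOIN", "WHERE", "GROUP BY", "UNION"}
curve = coverage_curve(
    [{"WHERE"}, {"WHERE", "JOIN"}, {"GROUP BY"}, {"JOIN"}],
    universe,
)
```

A curve that stays flat below 100% points at operators the generator cannot produce, rather than at a lack of runtime.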

Performance

Benchmarking tools
• We use the public spark-sql-perf library for TPC workloads
  • Provides datagen and import scripts
  • local, cluster, S3
  • Dashboards for analyzing results
• The Spark micro benchmarks
• And the async-profiler
  • to produce flamegraphs
https://github.com/databricks/spark-sql-perf
CPU Flame Graph (source: http://www.brendangregg.com/flamegraphs.html)

Per query drill-down: q67
First, scope and validate
• in 2.4-master (dev) compared to 2.3 in DBR 4.3 (prod)
Query 67: 18% regression, from 320s to 390s
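The size of the regression depends on which run is taken as the base. A short sketch of the arithmetic for the two timings quoted above follows. The function names are hypothetical, and which base the 18% figure uses is an assumption.

```python
def relative_change(baseline_s, current_s):
    """Slowdown expressed relative to the baseline run."""
    return (current_s - baseline_s) / baseline_s

def share_of_current(baseline_s, current_s):
    """The same difference expressed relative to the slower, current run."""
    return (current_s - baseline_s) / current_s

vs_baseline = relative_change(320, 390)   # 70 / 320
vs_current = share_of_current(320, 390)   # 70 / 390
```

The 70s difference is about 21.9% of the 320s baseline and about 17.9% of the 390s run, so the quoted 18% appears to be measured against the newer timing.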

Q67 executor profile for Spark 2.4-master

Side-by-side 2.3 vs 2.4: find the differences
Spark 2.3 Spark 2.4

Flame graph diff (zoomed): red = slower, white = new
• unsafe/Platform.copyMemory()
• unsafe/BytesToBytesMap.safeLookup
• New: hash/Murmur3_x86_32.hashUTF8String()
• Murmur3_x86_32.hashUnsafeBytesBlock()
Look for hints:
- Memory management
- Hashing
- unsafe
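A comparison like this can also be done numerically on the profiler output. The sketch below assumes profiles in the folded-stack text format used by flame graph tooling (one line per stack, frames separated by `;`, followed by a sample count). The sample counts and the helper names are made up for illustration.

```python
from collections import Counter

def frame_shares(folded_text):
    """Share of samples in which each frame appears (inclusive time)."""
    totals = Counter()
    total = 0
    for line in folded_text.strip().splitlines():
        stack, _, count = line.rpartition(" ")
        samples = int(count)
        total += samples
        for frame in set(stack.split(";")):   # count a frame once per stack
            totals[frame] += samples
    return {frame: n / total for frame, n in totals.items()}

def diff_profiles(before, after):
    """Rows of (frame, change in share, is_new), largest increase first."""
    a, b = frame_shares(before), frame_shares(after)
    rows = [(f, b.get(f, 0.0) - a.get(f, 0.0), f not in a) for f in set(a) | set(b)]
    return sorted(rows, key=lambda row: (-row[1], row[0]))

# Hypothetical sample counts for two profiles of the same query.
before = "main;lookup;hashUnsafeBytes 40\nmain;copyMemory 60"
after = "main;lookup;hashUnsafeBytesBlock 60\nmain;copyMemory 40"
report = diff_profiles(before, after)
```

Frames that are new or whose share grew the most are the first candidates to inspect.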

Root-causing
1.) Microbenchmark for UTF8String
2.) Results:
• Spark 2.3: hashUnsafeBytes() -> 40µs
• Spark 2.4: hashUnsafeBytesBlock() -> 140µs
• also slower UTF8String.getBytes()
3.) Git bisect
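Git bisect is a binary search over the commit history for the first commit at which a check starts failing. The idea can be sketched in a few lines, assuming a linear history with a known good start and a known bad end. The commit names and the 100µs cut-off are hypothetical, with timings modeled on the microbenchmark results above.

```python
def first_bad_commit(commits, is_bad):
    """Binary search for the first bad commit.
    `commits` is ordered oldest to newest; commits[0] is known good
    and commits[-1] is known bad."""
    lo, hi = 0, len(commits) - 1   # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]

# Hypothetical history: the microbenchmark jumps from 40µs to 140µs at c6.
commits = [f"c{i}" for i in range(10)]
timing_us = {c: (40 if i < 6 else 140) for i, c in enumerate(commits)}
culprit = first_bad_commit(commits, lambda c: timing_us[c] > 100)
```

With a script that builds a commit and runs the microbenchmark, `git bisect run` can drive the same search automatically.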

It is a journey to get a release out
DBR and Spark testing and performance are a continuous effort
• Over a month of effort to turn the performance regression into an improvement
TPC-DS 2.4-master vs. 2.3 at SF 1000
Chart labels: 15%, 5%, < 0%

… a journey that pays off quickly
Query times have improved by more than 2x
in the Spark 2.x branch, as measured on the Databricks platform
Note: neither 2.4.1 nor 3.0.0 has been released yet

Conclusion
Spark in production is not just the framework
Unit and integration testing are not enough
We need Spark-specific tools to automate the process
and ensure both correctness and performance

Thanks!
Feedback: {Nico.Poggi, Bogdan.Ghit}@databricks.com
