Predicting Optimal Parallelism for Data Analytics
Rathijit Sen, Vishal Rohra
Agenda
▪ Overview
▪ AutoDOP
▪ AutoToken
▪ TASQ (AutoToken_vNext)
▪ AutoExecutor
▪ Summary
Resource Provisioning in the Cloud
• Focus: Automatically predict Optimal Parallelism for jobs
• Allow flexibility in selecting optimal point for cost-efficient performance
• Enable optimal resource provisioning
• Users: dynamic, fine-grained provisioning for jobs
• Providers: provisioning of cluster capacities
How many resources does a job actually need?
General Approach
• Predict job run time or peak parallelism:
Peak Parallelism = f(query characteristics) [at lowest run time]
Run time = f(query characteristics, parallelism)
• Query characteristics: compile/optimization-time properties and estimates
• Learn f using Machine Learning models on past executions
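To make this concrete, here is a minimal sketch of learning f as a regression model over past executions; the feature names, the toy data, and the Random Forest choice are illustrative assumptions, not the actual production pipelines:

```python
# Illustrative sketch: learn run time = f(query characteristics, parallelism)
# from past executions. Features and values below are made up.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Each row: compile/optimization-time features plus the parallelism
# used for that past execution.
X_train = np.array([
    [1e6, 12, 8],    # [estimated_cardinality, num_operators, parallelism]
    [1e6, 12, 32],
    [5e7, 40, 8],
    [5e7, 40, 64],
])
y_train = np.array([120.0, 45.0, 900.0, 210.0])  # observed run times (s)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Score a new query at several candidate parallelism levels and
# pick the one with the lowest predicted run time.
features = [2e6, 15]
candidates = [1, 2, 4, 8, 16, 32, 64]
preds = model.predict(np.array([features + [p] for p in candidates]))
best = candidates[int(np.argmin(preds))]
print(f"predicted-optimal parallelism: {best}")
```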
Case Studies
Performance Characteristic Curve (PCC): run time as a function of parallelism

Study                    Platform    Num Nodes  Prediction
AutoDOP                  SQL Server  Single     Run Time
AutoToken                Cosmos      Multiple   Peak Parallelism
AutoToken_vNext / TASQ   Cosmos      Multiple   Run Time, PCC (Strictly Monotonic)
AutoExecutor             Spark       Multiple   PCC (Monotonic)
AutoDOP
Zhiwei Fan, Rathijit Sen, Paris Koutris, Aws Albarghouthi, “Automated Tuning of Query Degree of Parallelism via Machine Learning”, aiDM@SIGMOD, 2020
Zhiwei Fan, Rathijit Sen, Paris Koutris, Aws Albarghouthi, “A Comparative Exploration of ML Techniques for Tuning Query Degree of Parallelism”, arXiv, 2020
Context
• Platform: SQL Server, single node
• Degree Of Parallelism (DOP)
• Maximum number of threads that can be active at any time for query execution
• Per-query selection
• Impact of DOP for running a query:
• Query Performance and Cost
• Resource Utilization of Multicore Servers
• Resource Provisioning in Cloud-Computing Platforms
Dependence on query characteristics
[Charts: run time vs. DOP for TPC-DS1000 example queries; panels: well-parallelizable queries, other queries]
Dependence on data size (scale factor)
• The average and median shift towards larger DOP values as the scale factor/dataset size increases
• More variation in TPC-DS compared to TPC-H due to the larger variety of query templates in TPC-DS
• No workload has a single per-query optimal DOP value
Approach
• Goal: predict optimal DOP
• ML model type: Regression, not Classification
• More flexibility in choosing optimal point for cost vs performance tradeoffs
ML Model: Random Forest (among others)
• Inputs: query plan operators; number of tuples (cardinality) and other compile/optimization-time estimates; DOP
• Output: predicted run time
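The regression formulation makes the cost/performance tradeoff explicit: instead of a single predicted label, one can scan candidate DOPs and accept the cheapest one within a tolerance of the predicted best. A hypothetical helper along those lines (the DOP grid and the 10% slack are assumptions):

```python
# Illustrative: why a run-time regressor (rather than a classifier that
# outputs one "best DOP") allows cost vs. performance tradeoffs.
# `model` is a trained regressor as sketched earlier.
import numpy as np

def pick_dop(model, query_features, dops=(1, 2, 4, 8, 16, 32, 64, 80),
             slack=0.10):
    """Return the smallest DOP whose predicted run time is within
    `slack` (here 10%) of the best predicted run time."""
    preds = model.predict(np.array([list(query_features) + [d] for d in dops]))
    best = preds.min()
    for d, t in zip(dops, preds):
        if t <= best * (1.0 + slack):
            return d  # cheapest DOP with near-optimal predicted performance
```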
Example results
• AutoDOP is closer to optimal (oracle selection) than static DOP selection policies
• ML: each query at predicted-optimal DOP
given by ML model
• Query-Optimal: each query at Optimal DOP
(oracle selection)
• Workload-Optimal: all queries at optimal
DOP for overall workload (oracle selection)
• 40: each query at DOP 40
• 80: each query at DOP 80
• Speedup over DOP 64 (default DOP)
TPC-DS1000 Queries (subset)
[Chart: speedup (0–1.4) on Test 1 and Test 2 for the ML, Query-Optimal, Workload-Optimal, DOP-40, and DOP-80 policies]
Case Studies
Performance Characteristic Curve (PCC): run time as a function of parallelism

Study                    Platform    Num Nodes  Prediction
AutoDOP                  SQL Server  Single     Run Time
AutoToken                Cosmos      Multiple   Peak Parallelism
AutoToken_vNext / TASQ   Cosmos      Multiple   Run Time, PCC (Strictly Monotonic)
AutoExecutor             Spark       Multiple   PCC (Monotonic)
AutoToken
Rathijit Sen, Alekh Jindal, Hiren Patel, Shi Qiao, “AutoToken: Predicting Peak Parallelism for Big Data Analytics at Microsoft”, VLDB, 2020
Context
• Platform: Exabyte-scale Big Data analytics platform for SCOPE queries
• Token: unit of resource allocation
• Per-job allocation
• Guaranteed and spare tokens
• Impact of number of tokens for running a job:
• Query performance and cost
• Resource utilization and provisioning
Peak Parallelism / Peak Resource Provisioning
• How many guaranteed tokens to request for the job?
• Depends on peak parallelism
• More tokens: unnecessary wait time, unused guaranteed tokens
• Fewer tokens: loss of performance or predictability
• Possible options:
• Default value
• User guesstimate
• Default VC percentage
Approach
• Automatically eliminate over-allocations for recurring jobs
• Ideally, no performance impact
• Use ML models to learn peak tokens from past behavior
• Simple models per job group (signature); a minimal sketch follows the diagram
[Diagram: resource skyline of a job; AutoToken shrinks the gap between the default allocation and the ideal allocation, removing over-allocation]
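A minimal sketch of the idea, assuming recurring jobs are keyed by a signature; the conservative max-of-past-peaks predictor is an illustrative stand-in for AutoToken's actual per-group models:

```python
# Illustrative per-job-group peak-token prediction for recurring jobs.
from collections import defaultdict

history = defaultdict(list)  # signature -> observed peak tokens per run

def record_run(signature: str, peak_tokens: int) -> None:
    history[signature].append(peak_tokens)

def predict_peak(signature: str, default: int = 100) -> int:
    past = history[signature]
    if not past:
        return default  # unseen job group: fall back to default allocation
    # Conservative choice: never below any previously observed peak,
    # so ideally there is no performance impact.
    return max(past)
```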
Results
• Overall prediction accuracy:
• Median error: 0
• 90th percentile error: ≤ 50%
• Coverage: 10.7%–28.1%
• #Jobs:
• Total: approx. 8.8M
• 0.8–2.4M training
• 162–528K testing
[Chart: cumulative percentage of jobs vs. the ratio of requested tokens to actual peak]
Resource Allocation Policies
[Diagram: resource skylines under a peak allocation (AutoToken, only recurring jobs) vs. a tight allocation (TASQ)]
TASQ
Anish Pimpley, Shuo Li, Anubha Srivastava, Vishal Rohra, Yi Zhu, Soundarajan Srinivasan, Alekh Jindal, Hiren Patel, Shi Qiao, Rathijit Sen, “Optimal Resource Allocation for Serverless Queries”, [Under Submission]
Why Tight Allocation
• Cost savings
• With negligible change in performance:
• 50% of the jobs can request fewer tokens
• 20% require less than 50% of requested tokens
• With a 5% performance loss:
• 92% of the jobs can request fewer tokens
• 30% require less than 50% of requested tokens
• Reduced job wait times
• Wider resource availability
TASQ’s Approach
Given compile-time features of a job => predict a tight allocation
Observation
• Optimal allocation means different things for different users: it is a function of cost and time
• Predicting the relationship between tokens and runtime is more valuable than predicting a single tight allocation
• The relationship between tokens and runtime is an exponentially decaying curve, referred to as the performance characteristic curve (PCC)
• The model therefore outputs the parameters (a, b) of the PCC
Challenge: Limited Trend Data
• Historical workloads were executed with a single token count
• To predict the PCC, we need data for multiple token counts
Solution: Data Augmentation
• Area-Preserving Allocation Simulator (AREPAS)
• Based on past skylines, generate skylines for multiple token counts using the simulator (see the sketch after this list)
• Assumptions:
• Total computation stays constant
• Total token-seconds used stay constant
• Area under the skyline stays constant
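A simplified sketch in the spirit of AREPAS, assuming a per-second token-usage skyline: replaying the same total token-seconds of work under a new token cap yields a simulated run time at that cap. The actual simulator is more elaborate.

```python
# Simplified area-preserving simulation: same area under the skyline,
# different token cap.
def simulate_runtime(skyline, new_cap):
    """skyline: tokens used in each second of the original run.
    Replays the same total token-seconds under `new_cap` and
    returns the simulated run time in seconds."""
    carry = 0.0   # work (token-seconds) deferred by the tighter cap
    seconds = 0
    for demand in skyline:
        work = demand + carry
        used = min(work, new_cap)
        carry = work - used
        seconds += 1
    while carry > 1e-9:          # drain leftover work at the cap
        carry -= min(carry, new_cap)
        seconds += 1
    return seconds

original = [10, 50, 80, 80, 40, 10]     # peak 80 tokens, 270 token-seconds
print(simulate_runtime(original, 40))   # tighter cap -> longer simulated run
```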
Modeling the Runtime vs Token relationship
• Need for a monotonically non-increasing curve
• User expectation: more resources → faster runtime
• The ‘elbow’ region of the curve usually emerges before parallelism overhead sets in
• How do you enforce that in modeling?
• Expect a power-law curve:
Runtime t(n) = f(n: TokenAllocation) = b · n^(−a), where a, b > 0
• Predict: scalar parameters ‘a’ and ‘b’ (a fitting sketch follows)
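For illustration, (a, b) can be recovered from observed (token count, run time) samples with a log-log linear fit; TASQ itself predicts a and b directly from compile-time features. The sample values below are made up:

```python
# Illustrative recovery of (a, b) for t(n) = b * n**(-a) via a
# log-log linear fit over made-up samples.
import numpy as np

tokens   = np.array([10, 20, 40, 80, 160])
runtimes = np.array([950.0, 510.0, 280.0, 160.0, 95.0])

# log t = log b - a * log n, i.e. a line in log-log space
slope, intercept = np.polyfit(np.log(tokens), np.log(runtimes), 1)
a, b = -slope, float(np.exp(intercept))

def t(n):
    return b * n ** (-a)

print(f"a={a:.3f}, b={b:.1f}, predicted t(100)={t(100):.0f}s")
```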
Results
• XGBoost models are not designed to enforce monotonicity
• NN and GNN perform better in trend prediction
• NN has comparable performance with lower training time

Model       Pattern (Non-Increase)   MAE (Curve Params)   Median AE (Run-Time)
XGBoost SS  32%                      NA                   53%
XGBoost PL  93%                      0.202                52%
NN          100%                     0.163                39%
GNN         100%                     0.168                33%
User Interface
• Workflow
• Submit the job script
• Graph generated at compile time
• Two options
• Visualize the Runtime vs Token Predictions
• Get an optimal token count (one possible heuristic is sketched after this slide)
• Advantages
• Informed decision
• For all jobs
• Before job execution
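One way the "optimal token count" option could derive its answer from the predicted PCC is to stop where the marginal speedup per extra token falls below a threshold; this elbow heuristic and the 1% threshold are assumptions for illustration, not TASQ's actual policy:

```python
# Hypothetical elbow heuristic over the predicted PCC t(n) = b * n**(-a):
# grow the token count until one more token no longer buys a meaningful
# speedup (rel_gain = 1% is an arbitrary assumption).
def optimal_tokens(a, b, max_tokens=1000, rel_gain=0.01):
    n, t = 1, b
    while n < max_tokens:
        t_next = b * (n + 1) ** (-a)
        if (t - t_next) / t < rel_gain:   # diminishing returns: stop here
            return n
        n, t = n + 1, t_next
    return max_tokens

print(optimal_tokens(a=0.8, b=1000.0))   # ~80 tokens for this curve
```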
Case Studies
Performance Characteristic Curve (PCC): run time as a function of parallelism

Study                    Platform    Num Nodes  Prediction
AutoDOP                  SQL Server  Single     Run Time
AutoToken                Cosmos      Multiple   Peak Parallelism
AutoToken_vNext / TASQ   Cosmos      Multiple   Run Time, PCC (Strictly Monotonic)
AutoExecutor             Spark       Multiple   PCC (Monotonic)
AutoExecutor
Rathijit Sen, Abhishek Roy, Alekh Jindal, Rui Fang, Jeff Zheng, Xiaolei Liu, Ruiping Li, “AutoExecutor: Predictive Parallelism for Spark SQL Queries”, [Under Submission]
Context
• Platform: Spark, Azure Synapse
• Executors: processes on worker nodes
• Each executor can use a certain number of cores and amount of memory
• Impact of number of executors for running a query:
• Query performance and cost
• Resource utilization and provisioning
Modeling Approach
• Reuse and extend TASQ PCC model
• Power-law curve with a lower bound
• Run time t(n) with executor count n:
t(n) = max(b · n^(−a), m)
• a, b, m: parameters
ML Model: Random Forest (among others)
• Inputs: count of operators, input cardinality, avg. row length, …
• Output: PCC model parameters, i.e., the power-law segment t(n) = b · n^(−a) and the floor t(n) = m (a small sketch follows)
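A minimal sketch of this bounded power-law PCC, with made-up parameter values (in the real system a, b, m come from the trained model):

```python
# Bounded power-law PCC sketch: the curve follows b * n**(-a) until it
# hits the floor m, after which extra executors buy no further speedup.
def t(n, a=0.7, b=600.0, m=40.0):
    return max(b * n ** (-a), m)

for n in (1, 2, 4, 8, 16, 32, 64):
    print(n, round(t(n), 1))   # flattens at m = 40.0 for large n
```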
Example predictions
• Sparklens: predict after one execution of the query
• AutoExecutor: predict before execution of the query
Error distributions (different templates, SF=100)
• Most prediction errors at small number of executors
S: Sparklens; AE: AutoExecutor
F1..F10: ten-fold cross-validation (80% of queries in the training set, 20% in the test set)
System Architecture
[Diagram: system architecture. A telemetry pipeline collects anonymized plans, metrics, and executor events into a workload table; feature extraction, workload analysis, and model training run over Peregrine events; the AutoExecutor extensions consume the trained PCC model]
Summary
Automatic selection of optimal parallelism
• Capability and Approach:
• Enable selection of optimal operating point with respect to optimization objective
• ML models to predict run time/peak parallelism using query characteristics
• Challenges:
• Modeling PCC characteristics
• AutoDOP: Point-wise
• TASQ: Point-wise, Power-law function
• AutoExecutor: Power-law + constant function
• Collecting training data
• TASQ: AREPAS
• AutoExecutor: Sparklens
Could we have other models for PCC?
How would you simulate for other parameter changes?
Feedback
Your feedback is important to us.
Don’t forget to rate and review the sessions.