Talk #2: Optimize Training and Tuning for Deep Learning
SigOpt Talk Series: Tuning for Systematic Trading
Tobias Andreasen — Machine Learning Engineer
Tuesday, April 21, 2020
Abstract

SigOpt provides an extensive set of advanced features that help you, the expert, save time while increasing model performance through experimentation. Today we continue this talk series by discussing how to best utilize your infrastructure, reduce experiment time, and accelerate training for deep learning models.
Motivation

1. Overview of SigOpt
2. Recap on Bayesian optimization
3. How to continuously and efficiently utilize your project’s allotted compute infrastructure
4. How to tune models with expensive training costs
1. Overview of SigOpt
Accelerate and amplify the impact of modelers everywhere
Solution: Experiment, optimize and analyze at scale

Experiment Insights: track, analyze and reproduce any model to improve the productivity of your modeling.
Optimization Engine: automate hyperparameter tuning to maximize the performance and impact of your models.
Enterprise Platform: standardize experimentation across any combination of library, infrastructure, model or task; available on-premise, hybrid, or multi-cloud.
SigOpt Features

Experiment Insights: reproducibility; intuitive web dashboards; cross-team permissions and collaboration; advanced experiment visualizations; usage insights; parameter importance analysis.

Optimization Engine: multimetric optimization; continuous, categorical, or integer parameters; constraints and failure regions; up to 10k observations and 100 parameters; multitask optimization and high parallelism; training monitor and automated early stopping.

Enterprise Platform: infrastructure agnostic; REST API; parallel resource scheduler; black-box interface that tunes without accessing any data; client libraries for Python, Java, R, and MATLAB.
2. Recap on Bayesian Optimization
Black-Box Optimization

[Diagram: behind your firewall, training data feeds an AI/ML/DL or simulation model, and a model evaluation or backtest against testing data produces an objective metric. The metric is reported to SigOpt over the REST API, which returns new configurations (parameters or hyperparameters), yielding better results. SigOpt's components in the loop: Experiment Insights (track, organize, analyze and reproduce any model), Enterprise Platform (built to fit any stack and scale with your needs), and Optimization Engine (explore and exploit with a variety of techniques).]
Sequential Model-Based Optimization (SMBO)

[Figures: a graphical depiction of the iterative process. Each iteration builds a statistical model of the objective from the observations collected so far, then chooses the next point to evaluate by maximizing the acquisition function; the new observation is folded back into the model and the cycle repeats. A minimal sketch of this loop appears below.]
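To make the loop concrete, here is a minimal, self-contained SMBO sketch in Python. It uses a Gaussian-process surrogate from scikit-learn and the expected-improvement acquisition function; the toy 1D objective, candidate grid, and iteration count are assumptions for illustration only, not SigOpt's engine.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Toy 1D function to minimize; stands in for a real training or backtest metric.
    return np.sin(3 * x) + 0.5 * x

def expected_improvement(X_cand, gp, y_best):
    # EI for minimization: expected amount by which each candidate beats y_best.
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(0)
candidates = np.linspace(0.0, 5.0, 500).reshape(-1, 1)
X = list(rng.uniform(0.0, 5.0, size=(3, 1)))  # a few random initial observations
y = [objective(x[0]) for x in X]

for _ in range(15):
    # 1) Build a statistical model from all observations so far.
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(np.array(X), np.array(y))
    # 2) Choose the next point by maximizing the acquisition function.
    ei = expected_improvement(candidates, gp, min(y))
    x_next = candidates[np.argmax(ei)]
    # 3) Evaluate it and fold the observation back into the model.
    X.append(x_next)
    y.append(objective(x_next[0]))

print("best value found:", min(y))
```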
SigOpt Blog Posts: Intuition Behind Bayesian Optimization
Some Relevant Blog Posts
● Intuition Behind Covariance Kernels
● Approximation of Data
● Likelihood for Gaussian Processes
● Profile Likelihood vs. Kriging Variance
● Intuition behind Gaussian Processes
● Dealing with Troublesome Metrics
To find more blog posts, visit:
https://sigopt.com/blog/
3. How to continuously and efficiently utilize your project’s allotted compute infrastructure
Continuously and efficiently utilize infrastructure

Utilize compute via asynchronous parallel optimization. SigOpt natively handles parallel function evaluation with the primary goal of minimizing overall wall-clock time. Parallelism also provides:
• Faster time-to-results — minimized overall wall-clock time
• Full resource utilization — asynchronous parallel optimization
• Scaling with infrastructure — optimization across the number of available compute resources
This is essential for increasing research productivity: it lowers time-to-results and scales with the available infrastructure. (A minimal worker-loop sketch follows the diagrams below.)
[Diagrams: the same black-box optimization loop, now driven by workers. Each worker behind your firewall trains and evaluates the model, reports the objective metric to SigOpt over the REST API, and receives a new configuration to try. The picture scales from a single worker up to many workers (Worker #1, Worker #2, ..., Worker #100) pulling suggestions in parallel.]
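As a concrete illustration, here is a minimal sketch of this worker loop using the classic SigOpt Python client (sigopt-python). The token, experiment name, parameter ranges, budget, and evaluate_model are placeholder assumptions; parallel_bandwidth declares how many suggestions will be open at once, and every worker simply runs the same loop while the API coordinates them.

```python
from sigopt import Connection

conn = Connection(client_token="YOUR_API_TOKEN")  # placeholder token

# Create the experiment once; parallel_bandwidth = number of concurrent workers.
experiment = conn.experiments().create(
    name="CNN sentiment tuning",  # hypothetical experiment name
    parameters=[
        dict(name="log_learning_rate", type="double", bounds=dict(min=-5, max=-1)),
        dict(name="batch_size", type="int", bounds=dict(min=16, max=256)),
    ],
    metrics=[dict(name="validation_accuracy", objective="maximize")],
    observation_budget=100,
    parallel_bandwidth=10,
)

def evaluate_model(assignments):
    # Placeholder: train with these hyperparameters and return the metric value.
    raise NotImplementedError

def run_worker(experiment_id):
    # Each of the N workers runs this identical loop against the same experiment.
    experiment = conn.experiments(experiment_id).fetch()
    while experiment.progress.observation_count < experiment.observation_budget:
        suggestion = conn.experiments(experiment_id).suggestions().create()
        value = evaluate_model(suggestion.assignments)
        conn.experiments(experiment_id).observations().create(
            suggestion=suggestion.id,
            value=value,
        )
        experiment = conn.experiments(experiment_id).fetch()
```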
Parallel function evaluations: find the best set of suggestions

Parallel function evaluations are a way of efficiently maximizing a function while using all available compute resources [Ginsbourger et al., 2008; Garcia-Barcos et al., 2019]:
• Choosing points by jointly maximizing criteria over the entire set of open resources
• Asynchronously evaluating over a collection of points
• Fixing points which are currently being evaluated while sampling new ones (see the sketch below)

[Figure: jointly optimizing multiple next points to sample, shown for 1D and 2D acquisition functions.]
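One simple way to realize the third bullet is the "constant liar" heuristic from the batch Bayesian optimization literature (one of the strategies surveyed around Ginsbourger et al., 2008); SigOpt's actual joint criterion is proprietary, so treat this as an illustrative stand-in. It reuses the expected_improvement helper from the SMBO sketch above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def constant_liar_batch(X_obs, y_obs, candidates, batch_size):
    # Pick `batch_size` points to evaluate in parallel (minimization).
    # Pending points are "fixed" by pretending they already returned the best
    # value seen so far (the lie), so successive picks spread out rather than
    # piling onto the same optimum of the acquisition function.
    X, y = list(X_obs), list(y_obs)
    lie = min(y)
    batch = []
    for _ in range(batch_size):
        gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
        gp.fit(np.array(X), np.array(y))
        ei = expected_improvement(candidates, gp, min(y))  # helper from the SMBO sketch
        x_next = candidates[np.argmax(ei)]
        batch.append(x_next)
        X.append(x_next)  # fix the in-flight point...
        y.append(lie)     # ...with its lied-about outcome
    return batch
```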
Parallel optimization: different parallel bandwidth leads to different search

[Figure: the same statistical model annotated with the next point(s) to evaluate for parallel bandwidth = 1 through 5. Parallel bandwidth represents the number of available compute resources.]

More exploration, more exploitation: faster wall clock.
Use Case: Fast CNN Tuning with AWS GPU Instances

● Category: NLP
● Task: Sentiment Analysis
● Model: CNN
● Data: Rotten Tomatoes Movie Reviews
● Analysis: Predicting Positive vs. Negative Sentiment
● Result: 400x speedup

Learn more: https://aws.amazon.com/blogs/machine-learning/fast-cnn-tuning-with-aws-gpu-instances-and-sigopt/
4. How to tune models with expensive training costs
Expensive Training Cost

How to efficiently minimize the time to optimize any function: SigOpt’s multitask feature is an efficient way for modelers to tune models with expensive training costs, with the benefits of:
• Faster time-to-market — the ability to bring expensive models into production faster
• Reduction in infrastructure cost — intelligently leverage infrastructure while reducing cost
Through novel research, SigOpt helps the user lower the overall time-to-market while reducing the overall compute budget. (A sketch of creating a multitask experiment follows below.)
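Here is a minimal sketch of creating a multitask experiment with the classic SigOpt Python client; the experiment name, parameters, budget, and the specific task costs are illustrative assumptions. Each task is a named fraction of the full training cost, and the full-cost task has cost 1.0.

```python
from sigopt import Connection

conn = Connection(client_token="YOUR_API_TOKEN")  # placeholder token

experiment = conn.experiments().create(
    name="Image classifier, multitask",  # hypothetical experiment name
    parameters=[
        dict(name="log_learning_rate", type="double", bounds=dict(min=-5, max=-1)),
        dict(name="depth", type="int", bounds=dict(min=2, max=8)),
    ],
    metrics=[dict(name="accuracy", objective="maximize")],
    observation_budget=120,
    tasks=[
        dict(name="cheapest", cost=0.1),   # e.g. one tenth of the epochs
        dict(name="cheap", cost=0.3),
        dict(name="expensive", cost=1.0),  # the full training run
    ],
)
```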
[Diagrams: the black-box optimization loop as before (training data, model, evaluation or backtest, objective metric, REST API, new configurations, better results), repeated to emphasize that every iteration now carries an expensive training cost.]
Using cheap or free information to speed learning

SigOpt allows the user to define lower-cost functions in order to quickly optimize expensive functions:
• Cheaper-cost functions can be flexible (fewer epochs, subsampled data, other custom features)
• Use cheaper tasks earlier in the tuning process to explore
• Inform more expensive tasks later by exploiting what we learn
• In the process, reduce the full time required to tune an expensive model
(A worker-loop sketch that acts on the suggested task follows below.)
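Continuing the worker loop from the parallelism sketch, each suggestion of a multitask experiment carries a task, which the worker maps to a cheaper or full-cost training run. The epoch counts and the evaluate_model trainer are illustrative assumptions, and the task names are the ones defined in the creation sketch above.

```python
# Inside the worker loop of a multitask experiment (names from the sketches above).
suggestion = conn.experiments(experiment.id).suggestions().create()

# Map the suggested task to a training budget: cheap tasks explore early,
# expensive tasks exploit what the cheaper runs have taught the optimizer.
epochs_by_task = {"cheapest": 5, "cheap": 15, "expensive": 50}  # illustrative
epochs = epochs_by_task[suggestion.task.name]

value = evaluate_model(suggestion.assignments, epochs=epochs)  # placeholder trainer

conn.experiments(experiment.id).observations().create(
    suggestion=suggestion.id,
    value=value,
)
```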
Using cheap or free information to speed learning

We can build better models using inaccurate data to help point the actual optimization in the right direction at less cost:
• Using a warm start through multi-task learning logic [Swersky et al., 2014]
• Combining good anytime performance with active learning [Klein et al., 2018]
• Accepting data from multiple sources without priors [Poloczek et al., 2017]
Use Case: Image Classification on a Budget

● Category: Computer Vision
● Task: Image Classification
● Model: CNN
● Data: Stanford Cars Dataset
● Analysis: Architecture Comparison
● Result: 2.4% accuracy gain with a much shallower model

Learn more: https://mlconf.com/blog/insights-for-building-high-performing-image-classification-models/
Next Talk: Efficient Approaches to Training
Automated Early Stopping, Convergence Monitoring
Register for this talk: https://tuning.sigopt.com/tuning-for-systematic-trading
Questions?
Tobias Andreasen | tobias@sigopt.com
For more information visit: https://sigopt.com/research/
5. Next talk: How should one think about convergence? (and other approaches to efficient model training techniques)
Future talks

• Convergence
• Implementation
• Infrastructure
• Implementation
• Use cases
• Experiment transfor
• Metric Management
• Parameter Importance

Visit us at OpML ’20
Think about convergence

The best model is found through convergence.