Problem Description Bayesian Optimization apsis and its Architecture Project Organisation Performance Evaluation
apsis
Automated Hyperparameter Optimization Using Bayesian Optimization
Frederik Diehl Andreas Jauch
March 10, 2015
State March 10, 2015
AGENDA
Problem Description
Bayesian Optimization
apsis and its Architecture
Project Organisation
Performance Evaluation
MOTIVATION
Why hyperparameter optimization, and why automate it?
hyperparameter tuning often yields large performance gains
tuning is "more of an art than a science"
reproducibility of published results
automatic methods might outperform humans
make ML algorithms accessible to non-expert users
ML PROCESS OVERVIEW
FORMAL PROBLEM DESCRIPTION
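$\lambda$ : hyperparameter vector, $\lambda = (\lambda^{(1)}, \ldots, \lambda^{(n)})$
$L(X, f)$ : loss function evaluated for model $f$ on dataset $X$
$A_\lambda(X)$ : learning algorithm with hyperparameter vector $\lambda$, trained on dataset $X$
$X_{\text{train}}$ : training data, $X_{\text{valid}}$ : validation data, $X_{\text{test}}$ : test data
$\Psi(\lambda)$ : hyperparameter response function/surface
Hyperparameter Optimization Problem
$$\hat{\lambda} \approx \operatorname*{argmin}_{\lambda \in \Lambda} \underbrace{\operatorname*{mean}_{x_i \in X_{\text{valid}}} L\big(x_i, A_\lambda(X_{\text{train}})\big)}_{\Psi(\lambda)} \tag{1}$$
$$= \operatorname*{argmin}_{\lambda \in \Lambda} \Psi(\lambda) \tag{2}$$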
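A minimal sketch of the response function may make the definition concrete. Everything here is an illustrative assumption rather than apsis code: the learning algorithm is a one-dimensional ridge regression, λ is its regularization strength, the loss is squared error, and Λ is replaced by a small finite candidate list.

```python
import random

def fit_ridge_1d(train, lam):
    """A_lambda(X_train): fit y ~ w * x with L2 penalty lam (closed form)."""
    sxy = sum(x * y for x, y in train)
    sxx = sum(x * x for x, _ in train)
    return sxy / (sxx + lam)

def response(lam, train, valid):
    """Psi(lambda): mean squared loss of the trained model on the validation set."""
    w = fit_ridge_1d(train, lam)
    return sum((y - w * x) ** 2 for x, y in valid) / len(valid)

# Synthetic data (assumption): y = 2x plus Gaussian noise.
rng = random.Random(0)
inputs = [rng.uniform(-1.0, 1.0) for _ in range(40)]
data = [(x, 2.0 * x + rng.gauss(0.0, 0.5)) for x in inputs]
train, valid = data[:30], data[30:]

# A finite stand-in for the search space Lambda.
candidates = [0.0, 0.01, 0.1, 1.0, 10.0]
scores = {lam: response(lam, train, valid) for lam in candidates}
lam_hat = min(scores, key=scores.get)  # argmin over the candidates
```

Each call to `response` retrains the model, which is why every evaluation of Ψ counts as expensive in realistic settings.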
KEY PROBLEM PROPERTIES
unknown, probably non-convex response surface Ψ
no derivative-based optimization possible
every evaluation of Ψ is expensive
evaluation time of Ψ depends on individual value of λ
low effective dimensionality of Ψ
which dimensions are important is dataset dependent
tree-structured configuration space (not addressed in apsis yet)
STATE OF THE ART
optimization still manual in many projects
grid search is the most common method
often the only method provided by ML frameworks
random search
Bayesian optimization
code by Jasper Snoek et al. (Harvard/Toronto)
Whetlab: Bayesian optimization in the cloud
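The difference between grid search and random search can be sketched in a few lines. The function names and the toy response surface below are hypothetical; the surface depends only on its first coordinate, to mimic the low effective dimensionality mentioned earlier.

```python
import itertools
import random

def grid_search(objective, bounds, points_per_dim):
    """Evaluate every point of a regular grid and return the best one."""
    axes = [[lo + (hi - lo) * i / (points_per_dim - 1) for i in range(points_per_dim)]
            for lo, hi in bounds]
    return min(itertools.product(*axes), key=objective)

def random_search(objective, bounds, n_samples, seed=0):
    """Evaluate uniformly sampled points and return the best one."""
    rng = random.Random(seed)
    samples = [tuple(rng.uniform(lo, hi) for lo, hi in bounds) for _ in range(n_samples)]
    return min(samples, key=objective)

def toy_response(lam):
    # Hypothetical surface: only the first hyperparameter matters.
    return (lam[0] - 0.37) ** 2

bounds = [(0.0, 1.0), (0.0, 1.0)]
best_grid = grid_search(toy_response, bounds, points_per_dim=4)   # 16 evaluations
best_random = random_search(toy_response, bounds, n_samples=16)   # 16 evaluations
```

With the same budget of 16 evaluations, the grid probes only 4 distinct values of the coordinate that matters, while random search probes 16.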
BAYESIAN OPTIMIZATION
approximate Ψ(λ) by a surrogate function M(λ) = y
surrogate function cheaper to evaluate than Ψ
interpret model to find minimization candidates for Ψ
evaluate Ψ for promising candidates
BAYESIAN OPTIMIZATION FUNDAMENTALS
We need two design choices
Surrogate Modelling Function - Gaussian Processes
universal approximation
very flexible and have many useful properties
closed under sampling
Acquisition Function
Probability of Improvement
Expected Improvement
ACQUISITION FUNCTION u
measures the expected utility of evaluating the objective function at a point $\lambda_{\text{next}}$
exploitation vs. exploration trade-off
(picture adapted from [1])
OPTIMIZATION - SUCCESSIVELY UPDATING THE GP
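1. Find the maximizer of the acquisition function: $\lambda_{\text{next}} = \operatorname*{argmax}_{\lambda} u(\lambda)$
2. Evaluate $\Psi$ at $\lambda_{\text{next}}$
3. Update the GP with the new observation
(picture adapted from [1])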
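The loop can be sketched end to end in plain Python. This is an illustration under stated assumptions, not the apsis implementation: a one-dimensional search space [0, 1], a zero-mean GP with a fixed squared exponential kernel, Expected Improvement as the acquisition function, and a dense candidate grid in place of a real acquisition optimizer. All function names are hypothetical.

```python
import math
import random

def kernel(a, b, length=0.2):
    """Squared exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i, bi in enumerate(b):
        x.append((bi - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def gp_fit(xs, ys, noise=1e-4):
    """Factor the kernel matrix once; returns L and w = L^-1 y."""
    K = [[kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    return L, solve_lower(L, ys)

def gp_predict(xs, L, w, x_new):
    """Posterior mean and standard deviation at x_new."""
    v = solve_lower(L, [kernel(a, x_new) for a in xs])
    mean = sum(vi * wi for vi, wi in zip(v, w))
    var = max(kernel(x_new, x_new) - sum(vi * vi for vi in v), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(mean, std, best):
    """Closed-form EI for minimization."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return std * (z * cdf + pdf)

def bayes_opt(objective, n_init=3, n_iter=10, seed=0):
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_init)]
    ys = [objective(x) for x in xs]
    grid = [i / 200 for i in range(201)]
    for _ in range(n_iter):
        L, w = gp_fit(xs, ys)            # 3. update the GP with all observations
        best = min(ys)
        # 1. maximize the acquisition function over the candidate grid
        x_next = max(grid, key=lambda x: expected_improvement(*gp_predict(xs, L, w, x), best))
        # 2. evaluate the expensive objective at the chosen point
        xs.append(x_next)
        ys.append(objective(x_next))
    i_best = min(range(len(ys)), key=ys.__getitem__)
    return xs[i_best], ys[i_best]

x_best, y_best = bayes_opt(lambda x: (x - 0.3) ** 2)
```

Only the `objective` calls are assumed to be expensive; the GP fit and the grid scan are the cheap surrogate work done between evaluations.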
FITTING THE GP TO THE PROBLEM
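by tuning the covariance!
Squared Exponential Kernel
$$K_{SE}(\lambda, \lambda') = \exp\left(-\frac{1}{2l^2} \cdot \sum_{d=1}^{\dim(D)} (\lambda_d - \lambda'_d)^2\right)$$
use Automatic Relevance Determination (ARD)
$$K_{SE}(\lambda, \lambda') = \theta_0 \cdot \exp\left(-\frac{1}{2} \cdot \sum_{d=1}^{\dim(D)} \frac{1}{\theta_d^2} (\lambda_d - \lambda'_d)^2\right)$$
with ARD vector $\theta$
$$\theta = (\underbrace{\theta_0}_{\text{bias}}, \underbrace{\theta_1, \ldots, \theta_d}_{\text{dimension weights}})$$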
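Both kernels translate directly into code. The sketch below uses hypothetical function names and plain tuples for the hyperparameter vectors; it also shows the effect ARD is meant to capture: a very large θ_d makes dimension d almost irrelevant to the covariance.

```python
import math

def k_se(a, b, length):
    """Squared exponential kernel with one shared length scale l."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length ** 2))

def k_se_ard(a, b, theta0, theta):
    """ARD variant: theta0 scales the kernel, theta[d] is the length scale of dimension d."""
    scaled = sum(((x - y) / t) ** 2 for x, y, t in zip(a, b, theta))
    return theta0 * math.exp(-0.5 * scaled)

p, q = (0.1, 0.2), (0.4, 0.9)
same = k_se_ard(p, q, 1.0, (0.5, 0.5))       # equals k_se(p, q, 0.5)
ignore_dim2 = k_se_ard(p, q, 1.0, (0.5, 1e6))  # second dimension barely contributes
```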
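WHAT DO ACQUISITION FUNCTIONS LOOK LIKE?
Expected Improvement (EI)
$$u_{EI}(\lambda) = \int_{-\infty}^{\infty} \max(\Psi(\lambda^*) - y, 0) \cdot p_M(y \mid \lambda) \, dy$$
closed form solution for GPs available
$$u_{EI}(\lambda \mid M_t) = \sigma(\lambda) \cdot \Big( z(\lambda) \cdot \Phi(z(\lambda)) + \phi(z(\lambda)) \Big), \qquad z(\lambda) = \frac{\Psi(\lambda^*) - \mu(\lambda)}{\sigma(\lambda)}$$
$\Phi$ and $\phi$ are the standard normal CDF and PDF; $\Psi(\lambda^*)$ is the best value observed so far
gradient analytically derived in apsis for more effective optimization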
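As a sanity check, the closed form can be compared against a direct numerical integration of the EI definition. The sketch below assumes a Gaussian predictive distribution with mean `mu` and standard deviation `sigma` at the query point; the function names and the example numbers are illustrative only.

```python
import math

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ei_closed_form(mu, sigma, best):
    """sigma * (z * Phi(z) + phi(z)) with z = (best - mu) / sigma."""
    z = (best - mu) / sigma
    return sigma * (z * normal_cdf(z) + normal_pdf(z))

def ei_numeric(mu, sigma, best, n=20000):
    """Midpoint rule for the integral of max(best - y, 0) * N(y; mu, sigma^2) dy."""
    lo, hi = mu - 8.0 * sigma, mu + 8.0 * sigma  # tails beyond 8 sigma are negligible
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        y = lo + (i + 0.5) * h
        density = normal_pdf((y - mu) / sigma) / sigma
        total += max(best - y, 0.0) * density * h
    return total

closed = ei_closed_form(mu=0.2, sigma=0.5, best=0.0)
numeric = ei_numeric(mu=0.2, sigma=0.5, best=0.0)
```

The two values should agree to several decimal places, and EI stays positive even when the predicted mean is worse than the incumbent because of the uncertainty term.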
THE apsis TOOLKIT
Automated hyperparameter optimization framework for
random search
Bayesian optimization
as an open source framework featuring
a flexible architecture, ready to be extended with more optimizers
ready for use with scikit-learn and Theano
implemented in Python
PROJECT OBJECTIVES
open source implementation of state-of-the-art research in Bayesian optimization
extensible project to encourage collaboration with other researchers
easy integration with existing machine learning frameworks
multi-core support
apsis ARCHITECTURE OVERVIEW
apsis CORE MODEL COMPONENTS
Parameter Definitions
define the meta information for each hyperparameter
Candidates
represent a specific hyperparameter vector
hold its function value, if available
Experiments
represent an optimization object
keep track of finished and unfinished Candidates
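One way to picture how these three components relate is the following data-model sketch. The class and method names are hypothetical and chosen for illustration; they are not the actual apsis classes or signatures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class NumericParamDef:
    """Meta information for one hyperparameter (hypothetical structure)."""
    name: str
    low: float
    high: float

@dataclass
class Candidate:
    """One hyperparameter vector plus its function value, once evaluated."""
    params: Dict[str, float]
    result: Optional[float] = None

@dataclass
class Experiment:
    """One optimization task; tracks unfinished and finished candidates."""
    name: str
    param_defs: List[NumericParamDef]
    pending: List[Candidate] = field(default_factory=list)
    finished: List[Candidate] = field(default_factory=list)

    def add_pending(self, cand: Candidate) -> None:
        self.pending.append(cand)

    def add_finished(self, cand: Candidate, result: float) -> None:
        if cand in self.pending:
            self.pending.remove(cand)
        cand.result = result
        self.finished.append(cand)

    def best(self) -> Optional[Candidate]:
        """Finished candidate with the lowest result (minimization assumed)."""
        return min(self.finished, key=lambda c: c.result) if self.finished else None

exp = Experiment("demo", [NumericParamDef("learning_rate", 0.0, 1.0)])
c1, c2 = Candidate({"learning_rate": 0.1}), Candidate({"learning_rate": 0.5})
exp.add_pending(c1)
exp.add_pending(c2)
exp.add_finished(c1, 0.42)
```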
USING apsis - EXPERIMENT ASSISTANTS
single experiment interaction interface
provides plots and result bookkeeping
USING apsis - LAB ASSISTANTS
multiple experiments to compare different optimization techniques
cross validation
EXTENSIVE EXPERIMENT TRACKING
automated plot writing
automated results writing
write out information at every step
AUTOMATED PLOTTING
plot function evaluations and best results
plot confidence bars when using cross validation
write out plots at every step
PARAMETER REPRESENTATION IN apsis
different representation by parameter type
various nominal and numeric types
NUMERIC PARAMETERS IN apsis
Warping Mechanism
parameters are warped into the [0, 1] interval
optimization core can assume a uniform and equal distribution in [0, 1] space
warping can be user-defined
Provided warpings for
normalization of arbitrary intervals [a, b] into [0, 1] space
asymptotic parameters, e.g. a learning rate asymptotic at 0
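The two kinds of warping can be sketched as pairs of forward and inverse maps. The interval normalization follows directly from the description above. The logarithmic map for asymptotic parameters is an assumption made for illustration, since the slides do not give the formula apsis uses; all function names are hypothetical.

```python
import math

def warp_interval(x, a, b):
    """Map x in [a, b] linearly onto [0, 1]."""
    return (x - a) / (b - a)

def unwarp_interval(u, a, b):
    """Inverse of warp_interval."""
    return a + u * (b - a)

def warp_log(x, lo, hi):
    """Assumed asymptotic warp: equal steps in [0, 1] are equal ratios in x (0 < lo < hi)."""
    return (math.log(x) - math.log(lo)) / (math.log(hi) - math.log(lo))

def unwarp_log(u, lo, hi):
    """Inverse of warp_log."""
    return math.exp(math.log(lo) + u * (math.log(hi) - math.log(lo)))

mid_linear = warp_interval(5.0, 0.0, 10.0)   # halfway through [0, 10]
mid_log = warp_log(1e-3, 1e-5, 1e-1)         # halfway between 1e-5 and 1e-1 on a log scale
```

Under the log warp, a learning rate near 0 gets as much resolution in [0, 1] space as one near the upper bound, which is the point of treating it as asymptotic.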
NOMINAL PARAMETERS IN apsis
generally supported in apsis
not supported in Bayesian optimization
GP kernels are based on distance metrics between parameters
interesting topic for further research
no publications on this topic so far
Whetlab claims to handle them well but does not say how
EXPECTED IMPROVEMENT OPTIMIZATION
gradient analytically derived (see our paper for the derivation)
1000 steps of random search for initialization
several iterative optimization methods integrated
L-BFGS-B: bounded limited-memory quasi-Newton method
BFGS: quasi-Newton method
Nelder-Mead
inexact Newton with conjugate gradient solver
...
DEALING WITH GP HYPERPARAMETERS
the GP surrogate model introduces new hyperparameters
these are not themselves the subject of optimization ⇒ hyper-hyperparameters
optimization by the maximum likelihood method
integrating over these parameters in the acquisition function using Hybrid Monte Carlo sampling
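The maximum likelihood option can be illustrated by scoring candidate length scales with the GP log marginal likelihood and keeping the best one. This is a simplified sketch: it uses a coarse grid over a single length scale instead of a gradient-based optimizer, a one-dimensional toy dataset, and hypothetical function names. The Hybrid Monte Carlo alternative is not shown.

```python
import math

def se_kernel(a, b, length):
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(max(s, 1e-12)) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i, bi in enumerate(b):
        x.append((bi - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def log_marginal_likelihood(xs, ys, length, noise=1e-4):
    """log p(y | X, length) for a zero-mean GP with SE kernel."""
    n = len(xs)
    K = [[se_kernel(a, b, length) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    L = cholesky(K)
    w = solve_lower(L, ys)  # w.w equals y^T K^-1 y
    data_fit = -0.5 * sum(wi * wi for wi in w)
    complexity = -sum(math.log(L[i][i]) for i in range(n))  # -0.5 * log det K
    return data_fit + complexity - 0.5 * n * math.log(2.0 * math.pi)

# Toy observations (assumption): one period of a sine wave.
xs = [i / 9 for i in range(10)]
ys = [math.sin(2.0 * math.pi * x) for x in xs]

lengths = [0.01, 0.03, 0.1, 0.3, 1.0]
best_length = max(lengths, key=lambda l: log_marginal_likelihood(xs, ys, l))
```

The likelihood trades data fit against model complexity, so neither the shortest nor the longest length scale is favored automatically.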
apsis PROJECT SET UP
open source project from the beginning
MIT License
active issue tracking
PEP 8 code style convention
fully automated Sphinx documentation build on every commit
90% test coverage
clear commit messages
apsis GITHUB REPOSITORY
Check out http://github.com/FrederikDiehl/apsis
ISSUE TRACKING AND DISCUSSION
UNIT TESTS
90% overall test coverage
100% in most core components
GOOD CODE DOCUMENTATION
almost 50:50 ratio of code to documentation
documented according to the Sphinx standard
FULLY AUTOMATED DOCUMENTATION BUILD
builds on every commit
Visit http://apsis.readthedocs.org
BRANIN-HOO OPTIMIZATION
(figure panels: Best Value by Optimizer; Values by Optimizer)
random search finds a better end result, but Bayesian optimization is more stable
performance similar to that reported in other Bayesian optimization literature
no other group publishes a comparison to random search
OPTIMIZING ARTIFICIAL NOISE FUNCTION
One-dimensional noise function with several smoothing variances
OPTIMIZING ARTIFICIAL NOISE FUNCTION
Minimization result on 3D noise by smoothing factor
BREZE MNIST NEURAL NETWORK
Neural Network on MNIST using uniform parameters
BREZE MNIST NEURAL NETWORK (2)
Neural Network on MNIST using asymptotic parameters for learning rate and learning rate decay
TAKING THE PROJECT TO THE NEXT LEVEL - PROGRAM
implement full multi-core support
implement a REST web service to offer interoperability with any language
improve integration of matplotlib
TAKING THE PROJECT TO THE NEXT LEVEL - BAYESIAN OPTIMIZATION
deal with nominal parameters
try replacing GPs with Student-t processes
try to take the tree-structured configuration space into account
account for evaluation cost depending on the hyperparameter setting
implement the freeze-thaw optimization idea [3]
automated learning of input warping [2]
Thank You!
REFERENCES I
[1] Eric Brochu, Vlad M. Cora, and Nando de Freitas. A Tutorial on Bayesian Optimization of Expensive Cost Functions, with Application to Active User Modeling and Hierarchical Reinforcement Learning. IEEE Transactions on Reliability, abs/1012.2, 2010.
[2] Jasper Snoek, Kevin Swersky, Richard S. Zemel, and Ryan P. Adams. Input warping for Bayesian optimization of non-stationary functions. arXiv preprint arXiv:1402.0929, 2014.
[3] Kevin Swersky, Jasper Snoek, and Ryan Prescott Adams. Freeze-thaw Bayesian optimization. arXiv preprint arXiv:1406.3896, 2014.