Stacked Ensembles in H2O
Erin LeDell Ph.D.

Machine Learning Scientist
February 2017
Agenda
• Who/What is H2O?
• Ensemble Learning Overview
• Stacking / Super Learner
• Why Stacking?
• Grid Search & Stacking
• Stacking with Third-party Algos
• AutoML and Stacking
H2O.ai
H2O.ai, the Company
• Founded in 2012
• Stanford & Purdue Math & Systems Engineers
• Headquarters: Mountain View, California, USA
H2O, the Platform
• Open Source Software (Apache 2.0 Licensed)
• R, Python, Scala, Java and Web Interfaces
• Distributed algorithms that scale to “Big Data”
Scientific Advisory Council
Dr. Trevor Hastie
• John A. Overdeck Professor of Mathematics, Stanford University
• PhD in Statistics, Stanford University
• Co-author, The Elements of Statistical Learning: Data Mining, Inference, and Prediction
• Co-author with John Chambers, Statistical Models in S
• Co-author, Generalized Additive Models
Dr. Robert Tibshirani
• Professor of Statistics and Health Research and Policy, Stanford University
• PhD in Statistics, Stanford University
• Co-author, The Elements of Statistical Learning: Data Mining, Inference, and Prediction
• Author, Regression Shrinkage and Selection via the Lasso
• Co-author, An Introduction to the Bootstrap
Dr. Stephen Boyd
• Professor of Electrical Engineering and Computer Science, Stanford University
• PhD in Electrical Engineering and Computer Science, UC Berkeley
• Co-author, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers
• Co-author, Linear Matrix Inequalities in System and Control Theory
• Co-author, Convex Optimization
H2O Distributed Computing
H2O Cluster
• Multi-node cluster with shared memory model.
• All computations in memory.
• Each node sees only some rows of the data.
• No limit on cluster size.
H2O Frame
• Distributed data frames (collection of vectors).
• Columns are distributed (across nodes) arrays.
• Works just like R’s data.frame or Python Pandas DataFrame
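The row-partitioning idea can be illustrated with a toy sketch in plain Python. This is not H2O's implementation: the function names (`split_rows`, `node_partial`, `distributed_mean`) and the data are hypothetical, and the "nodes" are simulated with list slices. Each simulated node sees only its own rows of a column, computes a partial result, and the partials are combined at the end.

```python
def split_rows(column, n_nodes):
    """Partition one column into contiguous row chunks (at most one per node)."""
    size = -(-len(column) // n_nodes)  # ceiling division
    return [column[i:i + size] for i in range(0, len(column), size)]


def node_partial(chunk):
    """Work done locally on a node: it only ever touches its own rows."""
    return (sum(chunk), len(chunk))


def distributed_mean(column, n_nodes):
    """Combine per-node partial sums and counts into a global mean."""
    partials = [node_partial(c) for c in split_rows(column, n_nodes)]
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


# Hypothetical column of ten values spread over three simulated nodes.
ages = [23, 35, 41, 29, 52, 38, 47, 31, 26, 44]
chunks = split_rows(ages, 3)
print([len(c) for c in chunks])
print(distributed_mean(ages, 3))
```

The result matches the single-machine mean because sums and counts can be combined in any order, which is what makes this kind of computation easy to distribute.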
Introduction to Stacking
Ensemble Learning
“In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained by any of the constituent algorithms.”
Source: Wikipedia
Common Types of Ensemble Methods
Bagging
• Reduces variance and increases accuracy
• Robust against outliers or noisy data
• Often used with Decision Trees (e.g. Random Forest)
Boosting
• Also reduces variance and increases accuracy
• Not robust against outliers or noisy data
• Flexible: can be used with any loss function
Stacking
• Used to ensemble a diverse group of strong learners
• Involves training a second-level machine learning algorithm called a “metalearner” to learn the optimal combination of the base learners
Stacking (aka Super Learner Algorithm)
• Start with design matrix, X, and response, y (the “level-zero” data)
• Specify L base learners (with model params)
• Specify a metalearner (just another algorithm)
• Perform k-fold CV on each of the L learners
Stacking (aka Super Learner Algorithm)
• Collect the predicted values from k-fold CV that was performed on each of the L base learners
• Column-bind these prediction vectors together to form a new design matrix, Z (the “level-one” data)
• Train the metalearner using Z, y
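The steps on the two slides above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not H2O's implementation: the three base learners (`fit_mean`, `fit_line`, `fit_1nn`), the toy data, and the choice of ordinary least squares as the metalearner are all hypothetical choices made for the example.

```python
import math


def fit_mean(xs, ys):
    """Base learner 1: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Base learner 2: simple linear regression on one feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def fit_1nn(xs, ys):
    """Base learner 3: 1-nearest-neighbour regression."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def cv_predictions(fit, xs, ys, k):
    """k-fold CV: each row is predicted by a model trained without it."""
    n = len(xs)
    preds = [0.0] * n
    for f in range(k):
        train = [i for i in range(n) if i % k != f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(f, n, k):
            preds[i] = model(xs[i])
    return preds


def least_squares(Z, y):
    """Metalearner (an assumption here): solve (Z'Z) w = Z'y by elimination."""
    L = len(Z[0])
    A = [[sum(r[a] * r[b] for r in Z) for b in range(L)] for a in range(L)]
    b = [sum(r[a] * yi for r, yi in zip(Z, y)) for a in range(L)]
    for col in range(L):
        piv = max(range(col, L), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, L):
            f = A[r][col] / A[col][col]
            A[r] = [x - f * p for x, p in zip(A[r], A[col])]
            b[r] -= f * b[col]
    w = [0.0] * L
    for r in reversed(range(L)):
        w[r] = (b[r] - sum(A[r][c] * w[c] for c in range(r + 1, L))) / A[r][r]
    return w


def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


# Level-zero data: one feature x and a response y (toy values).
xs = [i / 10 for i in range(40)]
ys = [math.sin(x) + 0.5 * x for x in xs]

base_fits = [fit_mean, fit_line, fit_1nn]               # the L base learners
cv_cols = [cv_predictions(f, xs, ys, k=5) for f in base_fits]
Z = [list(row) for row in zip(*cv_cols)]                # level-one data, n x L
weights = least_squares(Z, ys)                          # train the metalearner on Z, y
base_models = [f(xs, ys) for f in base_fits]            # refit base learners on all rows


def ensemble_predict(x):
    """Combine the base learners' predictions with the metalearner weights."""
    return sum(w * m(x) for w, m in zip(weights, base_models))


base_cv_mse = [mse(col, ys) for col in cv_cols]
stacked_mse = mse([sum(w * z for w, z in zip(weights, row)) for row in Z], ys)
print(base_cv_mse, stacked_mse)
```

Note that `stacked_mse` here is the metalearner's error on the level-one data it was trained on, so it cannot be worse than the best single column of Z; an honest estimate of ensemble performance would need a separate holdout set.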
Stacking vs. Parameter Tuning/Search
• A common task in machine learning is to perform model selection by specifying a number of models with different parameters.
• An example of this is Grid Search or Random Search.
• The first phase of the Super Learner algorithm is computationally equivalent to performing model selection via cross-validation.
• The latter phase of the Super Learner algorithm (the metalearning step) is just training another single model (no CV).
• With Stacking, your computation does not go to waste!
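The point about reusing computation can be illustrated with a small sketch. The cross-validated predictions from a hyperparameter grid are computed once; model selection keeps only the best column, while stacking feeds the same columns to a metalearner. Everything here is an assumption made for illustration: the k-nearest-neighbour grid, the toy data, and the deliberately tiny metalearner, which blends only the two best grid models using a closed-form weight.

```python
import math
import random


def knn_fit(k):
    """Return a fit function for a k-nearest-neighbour regressor on 1-D inputs."""
    def fit(xs, ys):
        pairs = list(zip(xs, ys))

        def predict(x):
            near = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
            return sum(y for _, y in near) / len(near)
        return predict
    return fit


def cv_predictions(fit, xs, ys, n_folds=5):
    """Out-of-fold predictions: row i is predicted by a model that never saw it."""
    n = len(xs)
    preds = [0.0] * n
    for f in range(n_folds):
        train = [i for i in range(n) if i % n_folds != f]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(f, n, n_folds):
            preds[i] = model(xs[i])
    return preds


def mse(pred, y):
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


# Toy noisy data (seeded so the run is repeatable).
rng = random.Random(0)
xs = [i / 10 for i in range(60)]
ys = [math.sin(x) + rng.gauss(0, 0.3) for x in xs]

# Phase 1 (shared): cross-validate every model in the grid.
grid = [1, 3, 5, 9]
cv = {k: cv_predictions(knn_fit(k), xs, ys) for k in grid}
cv_mse = {k: mse(cv[k], ys) for k in grid}

# Model selection stops here and keeps a single winner.
ranked = sorted(grid, key=cv_mse.get)
best_k, second_k = ranked[0], ranked[1]

# Stacking reuses the same CV columns as level-one data.
# Metalearner: alpha * a + (1 - alpha) * b with the least-squares optimal alpha.
a, b = cv[best_k], cv[second_k]
den = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
num = sum((yi - bi) * (ai - bi) for yi, ai, bi in zip(ys, a, b))
alpha = num / den if den > 0 else 1.0
blend = [alpha * ai + (1 - alpha) * bi for ai, bi in zip(a, b)]
blend_mse = mse(blend, ys)

print(best_k, cv_mse[best_k], blend_mse)
```

No extra cross-validation is run for the blend: the only additional work beyond the grid search is fitting one weight, which mirrors the "just training another single model" step on the slide.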
Why Stacked Ensembles?
How to Win Kaggle
https://www.kaggle.com/c/GiveMeSomeCredit/leaderboard/private
https://www.kaggle.com/c/GiveMeSomeCredit/forums/t/1166/congratulations-to-the-winners/7229#post7229
https://www.kaggle.com/c/GiveMeSomeCredit/forums/t/1166/congratulations-to-the-winners/7230#post7230
h2oEnsemble R package & Stacked Ensemble in h2o
Evolution of H2O Ensemble
• h2oEnsemble R package in 2015
• Ensemble logic ported to Java in late 2016
• Stacked Ensemble method in h2o in early 2017
• R & Python APIs supported
• In progress: Custom metalearners
• In progress: MOJO for production use
Stacking with Random Grids
H2O Cartesian Grid Search
H2O Random Grid Search
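The difference between the two search strategies can be sketched without any H2O dependency. A Cartesian grid trains every combination of the listed hyperparameter values, while a random grid samples a fixed number of combinations from the same space. The hyperparameter names and values, and the `max_models` and `seed` arguments of the helper below, are illustrative assumptions for this sketch and not a reproduction of the code shown on the slides.

```python
import itertools
import random

# Hypothetical search space for a tree-based model.
hyper_params = {
    "max_depth": [3, 5, 7, 9],
    "learn_rate": [0.01, 0.05, 0.1],
    "sample_rate": [0.6, 0.8, 1.0],
}


def cartesian_grid(space):
    """Every combination of the listed values (size = product of list lengths)."""
    names = list(space)
    return [dict(zip(names, combo)) for combo in itertools.product(*space.values())]


def random_grid(space, max_models, seed):
    """A random subset of the Cartesian grid, sampled without replacement."""
    rng = random.Random(seed)
    combos = cartesian_grid(space)
    return rng.sample(combos, min(max_models, len(combos)))


full = cartesian_grid(hyper_params)
subset = random_grid(hyper_params, max_models=10, seed=1)

print(len(full), len(subset))
for params in subset[:3]:
    print(params)
```

Each sampled combination would then be trained as one base model, and the resulting set of models is what a stacked ensemble combines.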
Stacking with Random Grids (h2o R)
Stacking with Random Grids (h2o Python)
H2O Stacking Resources
H2O Stacked Ensembles docs & code demo:
http://tinyurl.com/h2o-stacked-ensembles
h2oEnsemble R package homepage on Github:
http://tinyurl.com/github-h2o-ensemble
Third-Party Integrations
Ensemble H2O with Anything
A powerful combo: H2O + XGBoost
• XGBoost will be available in the next major release of H2O, so you can use it with the Stacked Ensemble method
• https://github.com/h2oai/h2o-3/pull/699
Third party stacking with H2O:
• SuperLearner, subsemble, mlr & caret R packages support stacking with H2O for small/medium data
AutoML
H2O AutoML
Public code coming soon!
• AutoML stands for “Automatic Machine Learning”
• The idea here is to remove most (or all) of the parameters from the algorithm, as well as automatically generate derived features that will aid in learning.
• Single algorithms are tuned automatically using a carefully constructed random grid search.
• Optionally, a Stacked Ensemble can be constructed.
H2O Resources
• H2O Online Training: http://learn.h2o.ai
• H2O Tutorials: https://github.com/h2oai/h2o-tutorials
• H2O Meetup Materials: https://github.com/h2oai/h2o-meetups
• H2O Video Presentations: https://www.youtube.com/user/0xdata
• H2O Community Events & Meetups: https://h2o.ai/events
Thank you!
@ledell on Github, Twitter
erin@h2o.ai
http://www.stat.berkeley.edu/~ledell
