To Production and Beyond
Spencer Aiello
The Problem
• Goal:
o Move from prototype to production
• Roadblock:
o Your prototyping environment cages your:
• Feature preprocessing
• Models
• Ideas
The Problem
• Even if your code is beautiful:
o You cannot drag-n-drop it into a new environment.
o Translation may be difficult, and humans make mistakes.
A Solution
H2O gives you wings:
• Export Preprocessing
• Export Models
H2OAssembly
o Build Rich Feature Preprocessing Assembly Lines
• Clean, reduce, and expand datasets by composing any of the hundreds of primitives available in H2O
• Build hygienic processing assembly lines that can be applied to new batches of data
• Export your feature preprocessing steps as a plain old Java object (POJO) and apply them to streaming tuples
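The assembly-line idea above can be sketched in plain Python. This is a minimal illustration only and does not use the H2O API: the `Assembly` class, the step helpers, and the column names (`int_rate`, `loan_amnt`, `annual_inc`, `loan_to_income`) are all hypothetical, chosen to show how clean, reduce, and expand steps might compose and then be re-applied to a new batch.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]
Step = Callable[[List[Row]], List[Row]]


def select_columns(names: List[str]) -> Step:
    """Reduce: keep only the named columns."""
    def step(rows: List[Row]) -> List[Row]:
        return [{k: r[k] for k in names} for r in rows]
    return step


def strip_percent(column: str) -> Step:
    """Clean: turn a string such as '13.5%' into the float 13.5."""
    def step(rows: List[Row]) -> List[Row]:
        return [{**r, column: float(str(r[column]).rstrip("%"))} for r in rows]
    return step


def add_ratio(new: str, num: str, den: str) -> Step:
    """Expand: derive a new column as the ratio of two existing ones."""
    def step(rows: List[Row]) -> List[Row]:
        return [{**r, new: r[num] / r[den]} for r in rows]
    return step


class Assembly:
    """Hypothetical stand-in for an assembly line: an ordered list of named steps."""

    def __init__(self, steps: List[Tuple[str, Step]]):
        self.steps = steps

    def apply(self, rows: List[Row]) -> List[Row]:
        # Run every step in order; each step returns new rows and leaves its input untouched.
        for _name, step in self.steps:
            rows = step(rows)
        return rows


# Define the steps once, then apply the same assembly to any new batch.
assembly = Assembly([
    ("reduce", select_columns(["int_rate", "loan_amnt", "annual_inc"])),
    ("clean", strip_percent("int_rate")),
    ("expand", add_ratio("loan_to_income", "loan_amnt", "annual_inc")),
])

new_batch = [
    {"id": 1, "int_rate": "13.5%", "loan_amnt": 1000.0, "annual_inc": 4000.0},
]
processed = assembly.apply(new_batch)
```

Because each step is a pure function of its input rows, the same `assembly` object can be reused on later batches without carrying hidden state from the prototype session.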
H2OAssembly
[Slides: H2OAssembly code examples in Python and Java]
Live Demo
• Lending Club Data: Predict Interest Rate
o Four-part dataset of loan data
o 500K rows, 52 columns
o Preprocess 5 columns within a 16-step assembly
o Build a simple GBM to predict interest rate
o Export everything into a Storm topology
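The last step of the demo, scoring records as they arrive in a stream, can be sketched as follows. This is a rough illustration under stated assumptions rather than the demo's code: it uses neither Storm nor H2O, `score_stream` is a hypothetical helper, `preprocess` stands in for the exported preprocessing steps, and `predict` is a made-up linear formula standing in for the exported GBM.

```python
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, float]


def score_stream(
    tuples: Iterable[Record],
    preprocess: Callable[[Record], Record],
    predict: Callable[[Record], float],
) -> Iterator[float]:
    """Score one tuple at a time, as a stream-processing stage would."""
    for raw in tuples:
        yield predict(preprocess(raw))


def preprocess(raw: Record) -> Record:
    # Hypothetical stand-in for the exported preprocessing steps.
    return {"loan_to_income": raw["loan_amnt"] / raw["annual_inc"]}


def predict(features: Record) -> float:
    # Hypothetical stand-in for the exported model (NOT a real GBM).
    return 5.0 + 10.0 * features["loan_to_income"]


incoming = [
    {"loan_amnt": 1000.0, "annual_inc": 4000.0},
    {"loan_amnt": 2000.0, "annual_inc": 4000.0},
]
predictions = list(score_stream(incoming, preprocess, predict))
```

Keeping preprocessing and prediction as two plain callables mirrors the slide's point: both pieces are exported from the prototype and applied unchanged in production, so nothing has to be translated by hand.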
Live Demo
Storm Topology

H2O World - Python Pipelines - Spencer Aiello
