Basic H2O for Python
Eric C. Eckstrand
Agenda
1. Getting H2O & Documentation
2. Basic Architecture
3. Loading Data
4. Data Exploration & Munging
5. Model Building
6. Model Saving & Loading
Getting H2O & Docs
1. pip install h2o
2. http://h2o.ai/download/
a. Bleeding Edge (link)
b. Install in Python (tab)
c. pip install http://h2o-release.s3.amazonaws.com/h2o/master/3066/Python/h2o-3.1.0.3066-py2.py3-none-any.whl
3. build h2o (https://github.com/h2oai/h2o-3#4-building-h2o-3)
a. pip install h2o-py/dist/h2o-3.1.0.99999-py2.py3-none-any.whl
4. http://docs.h2o.ai/ -> H2O 3.0 -> Python Users (link) -> Python docs (link)
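After any of the install routes above, a quick check that the package is visible to the interpreter can save some confusion. A minimal sketch using only the standard library; the helper name installed_version is made up for illustration:

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "h2o") -> Optional[str]:
    """Return the installed version of `package`, or None if pip does not know it."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version()
    print("h2o is not installed" if version is None else f"h2o {version}")
```

A None result only means pip metadata for the package was not found in this environment; it says nothing about whether Java or an H2O JVM is available.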
Basic Architecture
local machine
Python
>>> import h2o
>>> h2o.init()
H2O JVM
ip=localhost, port=54321
Basic Architecture
local machine
Python
>>> import h2o
>>> h2o.init(ip="172.16.2.181", port=54321)
H2O JVM
ip=172.16.2.181, port=54321
remote machine
Basic Architecture
local machine
Python
>>> import h2o
>>> h2o.init(ip="172.16.2.181", port=54321)
H2O cluster: 5 x H2O JVM
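In every layout above, the Python side is only a client that talks to an H2O JVM at some ip and port. Before calling h2o.init against a remote address, it can help to confirm that something is listening there. A minimal sketch with the standard library; the helper name h2o_port_open is hypothetical, and an open port does not prove that the listener is actually H2O:

```python
import socket


def h2o_port_open(ip: str = "localhost", port: int = 54321, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections at ip:port."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unresolvable host.
        return False


if __name__ == "__main__":
    print(h2o_port_open())                        # local JVM, default port
    print(h2o_port_open("172.16.2.181", 54321))   # remote JVM from the slides
```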
Load Data into H2O JVM
1. Iris dataset
a. 150 rows x 5 columns
b. Sepal Width, Sepal Length, Petal Width, Petal Length, and Species (Virginica, Setosa, Versicolor)
2. Methods
a. h2o.upload_file
b. h2o.import_frame
c. h2o.H2OFrame
Load Data into H2O JVM
my laptop: /Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv
Python
>>> import h2o
>>> h2o.init()
>>> iris_H2OFrame = h2o.upload_file("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
>>> iris_H2OFrame = h2o.import_frame("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
H2O JVM
ip=localhost, port=54321
Load Data into H2O JVM
my laptop: /Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv
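Python
>>> import h2o
>>> h2o.init(ip="172.16.2.181", port=54321)
>>> # upload_file: the path is on the Python client (my laptop)
>>> iris_H2OFrame = h2o.upload_file("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
>>> # import_frame: the path is on the H2O server (server room)
>>> iris_H2OFrame = h2o.import_frame("/home/eric/iris.csv")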
H2O JVM
ip=172.16.2.181, port=54321
server room: /home/eric/iris.csv
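The rule of thumb on this slide: a file on the Python machine is uploaded, a file on the H2O machine is imported. A small sketch of that choice follows. The helper names loader_name and load_frame are hypothetical, and the sketch assumes a recent h2o release, where the server-side loader is named h2o.import_file rather than the h2o.import_frame shown in the slides:

```python
def loader_name(file_location: str) -> str:
    """Map where the CSV lives to the h2o function that can read it."""
    if file_location == "client":
        return "upload_file"   # file is pushed from the Python machine
    if file_location == "server":
        return "import_file"   # file is read by the H2O JVM from its own disk
    raise ValueError("file_location must be 'client' or 'server'")


def load_frame(path: str, file_location: str):
    """Load `path` into the connected H2O cluster (call h2o.init() first)."""
    import h2o  # imported lazily so loader_name works without h2o installed
    return getattr(h2o, loader_name(file_location))(path)


if __name__ == "__main__":
    print(loader_name("client"))   # laptop path
    print(loader_name("server"))   # server-room path
```

With a local JVM both locations are the same machine, which is why either call works on the previous slide.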
Exploration & Munging
my laptop: /Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv
Python
>>> import h2o
>>> h2o.init()
>>> iris_H2OFrame = h2o.upload_file("/Users/ece/0xdata/h2o-dev/smalldata/iris/iris.csv")
H2O JVM
ip=localhost, port=54321
Frame: 150 x 5
Exploration & Munging
1. show, dim, nrow, ncol, head, tail, col_names, setNames
2. indexing
3. summary statistics
a. mean, median, min, max, sd
4. categorical columns
a. levels
5. cut, group_by
6. ndarray <-> DataFrame <-> H2OFrame
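To make the list above concrete, here is what dim, col_names, the summary statistics, and levels compute, written in plain Python on three Iris rows (one per species). This is a stand-in for illustration, not the H2OFrame API, and the header names are assumptions:

```python
import csv
import io
import statistics

# Three rows from the classic Iris data; header names are illustrative.
SAMPLE = """sepal_len,sepal_wid,petal_len,petal_wid,species
5.1,3.5,1.4,0.2,setosa
7.0,3.2,4.7,1.4,versicolor
6.3,3.3,6.0,2.5,virginica
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
col_names = list(rows[0].keys())
dim = (len(rows), len(col_names))              # (nrow, ncol)

sepal_len = [float(r["sepal_len"]) for r in rows]
summary = {
    "mean": statistics.mean(sepal_len),
    "median": statistics.median(sepal_len),
    "min": min(sepal_len),
    "max": max(sepal_len),
    "sd": statistics.stdev(sepal_len),         # sample standard deviation
}
levels = sorted({r["species"] for r in rows})  # distinct categories

print(dim, col_names)
print(summary)
print(levels)
```

On an H2OFrame the same questions are answered inside the JVM, so the data does not have to fit in the Python process.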
Model-Building
1. H2O K-means
a. h2o_model = h2o.kmeans(x=iris_H2OFrame[:,0:4], k=3)
b. h2o_model.centers()
2. Scikit Learn
a. from sklearn.cluster import KMeans
b. sk_model = KMeans(n_clusters=3)
c. sk_model.fit(iris_DataFrame.iloc[:,0:4])
d. sk_model.cluster_centers_
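Both calls above return cluster centers. As a rough picture of what is being computed, here is plain Lloyd's algorithm in standard-library Python. It is a teaching sketch on made-up toy points with hand-picked starting centers, not the implementation used by H2O or scikit-learn, which differ in initialization and other details:

```python
from math import dist


def kmeans(points, centers, iterations=10):
    """Plain Lloyd's algorithm: assign each point to its nearest center,
    then move each center to the mean of its assigned points."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            clusters[nearest].append(p)
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments are stable, so the centers have converged
        centers = new_centers
    return centers


# Toy 2-D points forming three obvious groups (k=3, as in the slides).
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0), (20.0, 0.0), (22.0, 0.0)]
centers = kmeans(points, centers=[points[0], points[2], points[4]])
print(centers)
```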
Model Saving & Loading
1. path = h2o.save_model(h2o_model, "/Users/ece/")
2. saved_model = h2o.load_model(str(path))
3. saved_model.centers()
Questions?
