Tom Kraljevic / Venkatesh Yadav
H2O.ai
Lessons From
Driverless AI Going
to Production
Outline
• Driverless AI software distributions and supported environments
• Hardware Recommendations
• End-to-end steps from hardware uncrating to machine learning pipeline creation
• Data Sources
• Automating Driverless AI training
• Productionizing Driverless AI pipelines
• Top customer questions
Driverless AI Software Distributions and
Supported Environments
• Cloud marketplace BYOL offerings
  • Amazon AWS AMI
  • Microsoft Azure Marketplace
  • Google Cloud Platform
  • Nimbix, Paperspace
  • IBM Cloud Private
  • NVIDIA DGX Registry
• Install on your own
  • Cloud (for experimenting or for serious use)
  • Servers (for serious use)
  • Desktop/Laptop (for experimenting with small data)
Cloud - Amazon AWS AMI
Cloud - Microsoft Azure Marketplace
Cloud - Google Cloud Platform
Cloud - Nimbix
Cloud - IBM Cloud Private
NVIDIA DGX Registry
Install on Your Own
• RPM package
• DEB package
• Docker image
RPM
Supported CPU | Supported OS | Supported CUDA | Supported GPU
IBM Power P8 | RHEL 7 | CUDA 8.0, CUDA 9.0 (CUDA 9.2 soon...) | Kepler, Pascal, Volta
IBM Power P9 | RHEL 7 | CUDA 9.0 (CUDA 9.2 soon...) | Volta
x86_64 | RHEL 7, SLES 12 | CUDA 8.0, CUDA 9.0 (CUDA 9.2 soon...) | Kepler, Pascal, Volta
DEB
Supported CPU | Supported OS | Supported CUDA | Supported GPU
IBM Power P8 | Ubuntu 16.04 | CUDA 8.0, CUDA 9.0 (CUDA 9.2 soon...) | Kepler, Pascal, Volta
IBM Power P9 | (Ubuntu GPU support not yet available...) | (Ubuntu GPU support not yet available...) | (Ubuntu GPU support not yet available...)
x86_64 | Ubuntu 16.04 | CUDA 8.0, CUDA 9.0 (CUDA 9.2 soon...) | Kepler, Pascal, Volta
x86_64 | Ubuntu 16.04 on Windows (via WSL) | none | none
Docker Image
Supported CPU | Supported Host OS | Supported Container CUDA | Supported GPU
IBM Power P8 | Ubuntu 16.04 | CUDA 8.0, CUDA 9.0 | Kepler, Pascal, Volta
IBM Power P8 | RHEL 7 | Soon... | Soon...
IBM Power P9 | (Ubuntu GPU support not yet available...) | (Ubuntu GPU support not yet available...) | (Ubuntu GPU support not yet available...)
IBM Power P9 | RHEL 7 | Soon... | Soon...
x86_64 | Ubuntu 16.04 | CUDA 8.0, CUDA 9.0 | Kepler, Pascal, Volta
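One way to make the three support tables queryable is to transcribe them into a dictionary. The sketch below is only an illustration: the names `SUPPORT`, `lookup`, and `supports_gpu` are hypothetical, combinations the slides mark as "Soon..." or "not yet available" are simply left out, and it assumes the x86_64 RPM row applies equally to RHEL 7 and SLES 12.

```python
# Support matrix transcribed from the RPM, DEB, and Docker Image tables.
# Rows marked "Soon..." or "not yet available" are intentionally omitted.
ALL_GPUS = ["Kepler", "Pascal", "Volta"]

SUPPORT = {
    ("rpm", "IBM Power P8", "RHEL 7"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
    ("rpm", "IBM Power P9", "RHEL 7"): {"cuda": ["9.0"], "gpu": ["Volta"]},
    # Assumption: the single x86_64 row covers both listed operating systems.
    ("rpm", "x86_64", "RHEL 7"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
    ("rpm", "x86_64", "SLES 12"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
    ("deb", "IBM Power P8", "Ubuntu 16.04"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
    ("deb", "x86_64", "Ubuntu 16.04"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
    # WSL is listed with no CUDA and no GPU support.
    ("deb", "x86_64", "Ubuntu 16.04 on Windows (via WSL)"): {"cuda": [], "gpu": []},
    ("docker", "IBM Power P8", "Ubuntu 16.04"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
    ("docker", "x86_64", "Ubuntu 16.04"): {"cuda": ["8.0", "9.0"], "gpu": ALL_GPUS},
}


def lookup(package, cpu, os_name):
    """Return the CUDA/GPU entry for a combination, or None if it is not listed as available."""
    return SUPPORT.get((package, cpu, os_name))


def supports_gpu(package, cpu, os_name, gpu):
    """True if the given GPU generation is listed for this package/CPU/OS combination."""
    entry = lookup(package, cpu, os_name)
    return entry is not None and gpu in entry["gpu"]
```

For example, `supports_gpu("rpm", "IBM Power P9", "RHEL 7", "Pascal")` is false because the P9 RPM row lists Volta only.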
Hardware Recommendations
• IBM Power
  • P8 with 4 (or more) Pascal/Volta GPUs (“Minsky”)
    • Lots of CPU cores (100 +)
    • Lots of CPU memory (256 GB +)
    • Fast storage (SSD/NVMe)
  • P9 with 4 (or more) Volta GPUs (“Newell”)
    • Lots of CPU cores (one of my test systems has 160 cores)
    • Lots of CPU memory (256 GB +)
    • Fast storage (SSD/NVMe)
• x86_64
  • 2 or more Xeon sockets
  • 4 or more Pascal / Volta GPUs
  • Lots of CPU memory (256 GB +)
  • Fast storage (SSD/NVMe)
• Insights
  • Don’t skimp on CPU cores and memory; whenever the GPUs aren’t doing the work, CPU and memory are the bottleneck
  • Fast storage makes a big difference for Docker-based environments
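The recommendations above can be turned into a quick pre-install check. This is a minimal sketch, not an official sizing tool: the function names are hypothetical, the 100-core default comes from the P8 bullet only (the x86_64 bullets give no core count), and GPU count must be supplied by the caller because it is not detected here.

```python
import os

# Thresholds taken from the slide; treat them as rough guidance.
MIN_CORES = 100       # "Lots of CPU cores (100 +)", stated for IBM Power P8
MIN_MEMORY_GB = 256   # "Lots of CPU memory (256 GB +)"
MIN_GPUS = 4          # "4 (or more)" GPUs


def check_host(cores, memory_gb, gpus, min_cores=MIN_CORES):
    """Return a list of shortfalls against the recommendations; empty means none found."""
    problems = []
    if cores < min_cores:
        problems.append(f"CPU cores: {cores} < {min_cores}")
    if memory_gb < MIN_MEMORY_GB:
        problems.append(f"CPU memory: {memory_gb:.0f} GB < {MIN_MEMORY_GB} GB")
    if gpus < MIN_GPUS:
        problems.append(f"GPUs: {gpus} < {MIN_GPUS}")
    return problems


def detect_cores_and_memory():
    """Best-effort detection of logical cores and physical memory on Linux."""
    cores = os.cpu_count() or 0
    memory_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cores, memory_bytes / 1024 ** 3
```

On a Linux host, `check_host(*detect_cores_and_memory(), gpus=4)` reports which of the recommendations the machine misses.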
End-to-End Uncrating to Creating –
Bringing DAI to a new IBM P9 System
• Enable Red Hat Linux subscription
• Install GPU drivers
• Install CUDA 9.0
• Grow the disk volume mounted at ‘/’
• Open firewall port 12345
• Download Driverless AI
• Install Driverless AI
• Use Driverless AI from your web browser
End-to-End Uncrating to Creating –
Bringing DAI to a new IBM P9 System
• [ Enable Red Hat Linux subscription ]
• [ (Optional) Enable SELinux if you want it ]
• yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
• yum install dkms
• yum groupinstall "Development Tools"
• Needed to build GPU drivers
• wget http://us.download.nvidia.com/tesla/396.26/nvidia-driver-local-repo-rhel7-396.26-1.0-1.ppc64le.rpm
• yum localinstall nvidia-driver*.rpm
• wget https://developer.download.nvidia.com/compute/cuda/repos/rhel7/ppc64le/cuda-repo-rhel7-9.2.88-1.ppc64le.rpm
• yum localinstall cuda-repo*.rpm
• yum install cuda-9-0.ppc64le
• systemctl enable nvidia-persistenced
• cp /lib/udev/rules.d/40-redhat.rules /etc/udev/rules.d
• sed -i '/SUBSYSTEM=="memory", ACTION=="add"/d' /etc/udev/rules.d/40-redhat.rules
• Needed for nvidia-smi to not say “Unknown error”
• reboot
• [ Grow size of the disk volume mounted at ‘/’ (default was really tiny) ]
• firewall-cmd --zone=public --add-port=12345/tcp --permanent
• wget http://.../dai-rpm.dai
• yum localinstall dai.rpm
• systemctl start dai
• http://dai-host:12345
• [ Import dataset ]
• [ Run an experiment (the “Predict” menu item) ]
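The command sequence above could be captured as a dry-run script so it can be reviewed before anything is executed. This is a sketch under assumptions: the helper names are hypothetical, the bracketed manual steps (subscription, SELinux, disk resize) are left out, the shell globs are replaced by the file names taken from the download URLs, and the Driverless AI package URL is a caller-supplied parameter because the slide elides it.

```python
import posixpath
import shlex
import subprocess
from urllib.parse import urlparse

EPEL_URL = "https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm"
NVIDIA_REPO_URL = ("http://us.download.nvidia.com/tesla/396.26/"
                   "nvidia-driver-local-repo-rhel7-396.26-1.0-1.ppc64le.rpm")
CUDA_REPO_URL = ("https://developer.download.nvidia.com/compute/cuda/repos/"
                 "rhel7/ppc64le/cuda-repo-rhel7-9.2.88-1.ppc64le.rpm")
UDEV_RULES = "/etc/udev/rules.d/40-redhat.rules"


def _filename(url):
    """File name that wget would save for this URL."""
    return posixpath.basename(urlparse(url).path)


def steps_before_reboot():
    """Driver and CUDA setup, ending with the reboot from the slide."""
    return [
        ["yum", "install", EPEL_URL],
        ["yum", "install", "dkms"],
        ["yum", "groupinstall", "Development Tools"],
        ["wget", NVIDIA_REPO_URL],
        ["yum", "localinstall", _filename(NVIDIA_REPO_URL)],
        ["wget", CUDA_REPO_URL],
        ["yum", "localinstall", _filename(CUDA_REPO_URL)],
        ["yum", "install", "cuda-9-0.ppc64le"],
        ["systemctl", "enable", "nvidia-persistenced"],
        ["cp", "/lib/udev/rules.d/40-redhat.rules", "/etc/udev/rules.d"],
        ["sed", "-i", '/SUBSYSTEM=="memory", ACTION=="add"/d', UDEV_RULES],
        ["reboot"],
    ]


def steps_after_reboot(dai_rpm_url, port=12345):
    """Firewall, download, install, and start; dai_rpm_url is supplied by the caller."""
    return [
        ["firewall-cmd", "--zone=public", f"--add-port={port}/tcp", "--permanent"],
        ["wget", dai_rpm_url],
        ["yum", "localinstall", _filename(dai_rpm_url)],
        ["systemctl", "start", "dai"],
    ]


def run(steps, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    rendered = [shlex.join(step) for step in steps]
    for argv, line in zip(steps, rendered):
        print(line)
        if not dry_run:
            subprocess.run(argv, check=True)
    return rendered
```

Keeping the steps as argument lists avoids shell quoting problems in the `sed` expression, and splitting at the reboot makes it explicit that the second half has to be started again after the machine comes back up.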
Data Sources
• File Formats
  • csv, tsv, txt, dat, tgz, gz, bz2, zip, xz, xls, xlsx, nff, feather, bin, arff, parquet
• Connectors
  • Local filesystem
  • HDFS
  • S3
  • Google Cloud Storage
  • Google BigQuery
  • (in development) Minio
  • (in development) Snowflake
  • Adding these on a first-come-first-served basis...
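Before pointing Driverless AI at a directory of files, it can help to filter by the extensions listed above. The sketch below is an illustrative pre-check only: `is_supported` is a hypothetical helper, and it looks at the final extension alone, which is an assumption about how compressed files such as `train.csv.gz` should be treated.

```python
from pathlib import PurePosixPath

# File formats listed on the Data Sources slide.
SUPPORTED_EXTENSIONS = {
    "csv", "tsv", "txt", "dat", "tgz", "gz", "bz2", "zip", "xz",
    "xls", "xlsx", "nff", "feather", "bin", "arff", "parquet",
}


def is_supported(path):
    """True if the file's final extension is one of the listed formats."""
    suffix = PurePosixPath(path).suffix.lower().lstrip(".")
    return suffix in SUPPORTED_EXTENSIONS


def split_by_support(paths):
    """Partition paths into (supported, unsupported) lists, preserving order."""
    supported = [p for p in paths if is_supported(p)]
    unsupported = [p for p in paths if not is_supported(p)]
    return supported, unsupported
```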
Automating Driverless AI Training (Python)
from h2oai_client import Client, ModelParameters, InterpretParameters

# Connect to a running Driverless AI instance
address = 'http://ip_where_driverless_is_running:12345'
username = 'username'
password = 'password'
h2oai = Client(address = address, username = username, password = password)

# Import the training and test datasets
train_path = '/data/Kaggle/CreditCard/CreditCard-train.csv'
test_path = '/data/Kaggle/CreditCard/CreditCard-test.csv'
train = h2oai.create_dataset_sync(train_path)
test = h2oai.create_dataset_sync(test_path)

# Ask Driverless AI for suggested experiment settings
target = "default payment next month"
params = h2oai.get_experiment_tuning_suggestion(dataset_key = train.key,
                                                target_col = target,
                                                is_classification = True,
                                                is_time_series = False)

# Run the experiment and download the test-set predictions
experiment = h2oai.start_experiment_sync(params)
h2oai.download(src_path=experiment.test_predictions_path, dest_dir=".")
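To run the same sequence for several datasets, the calls can be wrapped in a function that takes the client as an argument. This is a sketch, not the client library's recommended pattern: `run_experiment` and `run_many` are hypothetical names, and the only client methods used are the four that appear in the slide, called with the same arguments the slide uses.

```python
def run_experiment(client, train_path, target, dest_dir=".",
                   is_classification=True, is_time_series=False):
    """Import one dataset, train with the suggested settings, and download predictions.

    `client` is expected to be an h2oai_client.Client (or any object exposing
    the same four methods used below).
    """
    train = client.create_dataset_sync(train_path)
    params = client.get_experiment_tuning_suggestion(
        dataset_key=train.key,
        target_col=target,
        is_classification=is_classification,
        is_time_series=is_time_series,
    )
    experiment = client.start_experiment_sync(params)
    client.download(src_path=experiment.test_predictions_path, dest_dir=dest_dir)
    return experiment


def run_many(client, jobs, dest_dir="."):
    """Run experiments one after another for (train_path, target) pairs."""
    return [run_experiment(client, path, target, dest_dir=dest_dir)
            for path, target in jobs]
```

Passing the client in, rather than creating it inside the function, keeps credentials out of the helper and lets the loop be exercised against a stub before it is pointed at a real server.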
Productionizing Driverless AI Pipelines
• Driverless AI MOJO pipeline (+ model) artifact
  • Small/lightweight footprint
  • Low latency
  • Designed for real-time applications (predicting one row at a time)
  • Java implementation
  • MOJOs for both the feature-engineered pipeline and MLI (to get reason codes in production)
• Driverless AI Python pipeline (+ model) artifact
  • Heavy footprint
  • Usable for batch applications
  • Used as a reference implementation for MOJO testing
  • Will usually have new features first
Driverless AI Java MOJO Code Example
import java.io.IOException;

import ai.h2o.mojos.runtime.MojoPipeline;
import ai.h2o.mojos.runtime.frame.MojoFrame;
import ai.h2o.mojos.runtime.frame.MojoFrameBuilder;
import ai.h2o.mojos.runtime.frame.MojoRowBuilder;
import ai.h2o.mojos.runtime.utils.SimpleCSV;

public class Main {
    public static void main(String[] args) throws IOException {
        // Load the MOJO pipeline (feature engineering + model)
        MojoPipeline model = MojoPipeline.loadFrom("pipeline.mojo");

        // Get and fill the input columns for a single row
        MojoFrameBuilder frameBuilder = model.getInputFrameBuilder();
        MojoRowBuilder rowBuilder = frameBuilder.getMojoRowBuilder();
        rowBuilder.setValue("AGE", "68");
        rowBuilder.setValue("RACE", "2");
        rowBuilder.setValue("DCAPS", "2");
        rowBuilder.setValue("VOL", "0");
        rowBuilder.setValue("GLEASON", "6");
        frameBuilder.addRow(rowBuilder);

        // Create a frame which can be transformed by the MOJO pipeline
        MojoFrame iframe = frameBuilder.toMojoFrame();

        // Transform the input frame with the MOJO pipeline
        MojoFrame oframe = model.transform(iframe);

        // Output the prediction as CSV
        SimpleCSV outCsv = SimpleCSV.read(oframe);
        outCsv.write(System.out);
    }
}
Top Customer Questions - Installation
• Can Driverless AI run on CPU-only machines?
• Can Driverless AI be installed natively from an RPM or DEB package, without Docker?
• Can Driverless AI be integrated with Active Directory/LDAP for authentication and authorization?
• Can Driverless AI be secured with SSL?
• Can I run multiple instances of Driverless AI on one GPU server?
• Can I divide GPU resources among Driverless AI instances?
• Can Driverless AI run on my Windows 7 laptop?
• Can Driverless AI run in an air-gapped environment?
Top Customer Questions - Deployment
• Can the model (& pipeline) be deployed as a Docker container?
• Can the model (& pipeline) be deployed as a microservice in Kubernetes?
• Does Driverless AI support one-click model (& pipeline) deployment?
• How do I scale a Driverless AI MOJO model (& pipeline) in production?
• What are the different deployment patterns for a Driverless AI MOJO model (& pipeline)?
