Machine Learning:
From Lab to
Production with
Kubeflow
With Special Guests Tensorflow and an
Apache Spark teaser
Agenda
● About Us
● Background
● Problem: Building a model is easy; serving models in prod is hard.
● What is Kubeflow?
● Kubeflow’s Design and Core Components
● …
Holden About Me Slides
● Preferred pronouns are she/her
● Developer Advocate at Google
● Apache Spark PMC / ASF member + contributor on lots of other projects
● previously IBM, Alpine, Databricks, Google, Foursquare & Amazon
● co-author of Learning Spark & High Performance Spark
● Twitter: @holdenkarau
● Slide share http://www.slideshare.net/hkarau
● Code review livestreams: https://www.twitch.tv/holdenkarau /
https://www.youtube.com/user/holdenkarau
● Talk Videos http://bit.ly/holdenSparkVideos
● Talk feedback: http://bit.ly/holdenTalkFeedback
● Organizing data track @ IT Next AMS - CFP Open!
Trevor Grant
● From Chicago
● Preferred pronouns he/him
● Various odd jobs around IBM (data scientist/evangelist/janitor)
● PMC Apache Mahout, Streams, Community Development
● Apache Roadshow Chicago [1]
● IoT Track at Apache Con North America [2]
● Blog: rawkintrevo.org
● Twitter: @rawkintrevo
● Github: rawkintrevo
● Shameless self promotion: rawkintrevo.org/shameless-self-promotion/
[1] https://www.apachecon.com/chiroadshow19/
[2] https://www.apachecon.com/acna19/index.html
Kubeflow Salesman:
**Slaps roof of Kubeflow**
THIS BAD BOY CAN FIT SO MANY
BUZZWORDS IN IT
Background
Things we thought you might know.
History of Predictive Analytics
Photo: Numerology Sign
Photo: Akash Kataruka
Photo: Hans Splinter
What is Statistics?
Machine Learning?
A.I. (Artificial Intelligence)
Photo: Andreas Kretschmer
Model Training
Photo: Helen Harrop
Model Serving
From don’t know shit to know your shit*
versus # of GPUs required
*Holden’s gut feelings after coffee
[Chart axes: "Need GPUs" and "Knowledge of Shit"]
What is Kubernetes?
Kubeflow: Dev on Laptop Deploy in Cloud
Deploy to Production
So what is Kubeflow?
What is Kubeflow?
Photo: VIK hotels group
Components Buffet
argo
automation
chainer-job
core
credentials-pod-preset
katib
mpi-job
mxnet-job
openmpi
pachyderm
pytorch-job
Seldon
spark
tf-serving
Photo: Paul Harrison
The (many) kinds of models you can train
● All your favourite Python libraries* (in Jupyter)
○ Different options to parallelize, with more coming (for now MPI or Beam-ish)
● PyTorch
● TensorFlow (along with hyperparameter tuning with Katib)
● MXNet
● etc.
Add-ons:
● H2O: CI is failing but you know, it's Wednesday
● And more!
But don't forget about data prep friends!
● For now options are a little limited, but other tools like Apache Spark are
in the works
○ https://github.com/kubeflow/kubeflow/pull/1467
● Now with Spark!*
● Pachyderm
● pandas?
● shell scripts?
● TensorFlow Transform in local mode (Python 2 only…)
● Yeah I guess we did forget about our dataprep friends
*Where now == what's in master as of Feb 14th 2019
Model persistence/deployment/CI
● You need to save your model somewhere
● Your favourite cloud storage provider goes here
○ Why invest in model management when I can make a directory?
● ModelDB
● Weave Flux
● Pachyderm
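To make the "I can make a directory" option concrete, here is a minimal sketch of versioned model saving on a plain filesystem. The function names, the `v<N>` layout, and the use of `pickle` are all assumptions for illustration; none of this is a Kubeflow, ModelDB, or Pachyderm API, and a mounted cloud bucket would stand in for the local path.

```python
import json
import pickle
import tempfile
from pathlib import Path


def _versions(model_dir):
    """Return existing version directories named v<N>, e.g. v1, v2."""
    return [p for p in model_dir.glob("v*") if p.name[1:].isdigit()]


def save_model(model, root, name, metadata):
    """Save `model` under root/name/v<N>/ and return the new directory."""
    model_dir = Path(root) / name
    model_dir.mkdir(parents=True, exist_ok=True)
    # Next version is one more than the highest existing version number.
    latest = max((int(p.name[1:]) for p in _versions(model_dir)), default=0)
    version_dir = model_dir / f"v{latest + 1}"
    version_dir.mkdir()
    with open(version_dir / "model.pkl", "wb") as f:
        pickle.dump(model, f)
    # Keep metrics/params next to the artifact so versions can be compared.
    with open(version_dir / "metadata.json", "w") as f:
        json.dump(metadata, f)
    return version_dir


def load_latest(root, name):
    """Load the model stored in the highest-numbered version directory."""
    latest = max(_versions(Path(root) / name), key=lambda p: int(p.name[1:]))
    with open(latest / "model.pkl", "rb") as f:
        return pickle.load(f)
```

This gets you version numbers and not much else, which is roughly the point of the slide: lineage, access control, and rollbacks are what the dedicated tools add.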
Model Serving
Python Flask
TensorFlow Model Server
OpenVINO
NVIDIA Inference Server
Seldon Core
● Routers for A/B tests and multi-armed bandits.
● Supports lots of libraries (Python/Spark/R/etc)
● Monitoring / Security
If you can put it in a Docker container, you can use it.
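For a feel of what a multi-armed bandit router does, here is a minimal epsilon-greedy sketch. It is not Seldon Core's API: the class, its method names, and the reward signal are assumptions made up for illustration.

```python
import random


class EpsilonGreedyRouter:
    """Pick which model serves each request.

    With probability `epsilon` a random model is chosen (explore);
    otherwise the model with the best mean reward so far (exploit).
    """

    def __init__(self, models, epsilon=0.1, seed=None):
        self.models = list(models)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = {m: 0 for m in self.models}
        self.totals = {m: 0.0 for m in self.models}

    def mean_reward(self, model):
        """Average reward seen for `model`, or 0.0 if it has no feedback."""
        n = self.counts[model]
        return self.totals[model] / n if n else 0.0

    def route(self):
        """Return the name of the model that should serve the next request."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.models)
        return max(self.models, key=self.mean_reward)

    def feedback(self, model, reward):
        """Record a reward (e.g. 1.0 for a click, 0.0 otherwise)."""
        self.counts[model] += 1
        self.totals[model] += reward
```

A fixed A/B test is the same idea with the traffic split held constant; the bandit shifts traffic toward the better model as feedback arrives.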
So you want to use this?
What’s Next?!
Step away from keyboard
Think about type(s) of model
Look at the components directory and see what fits tool-wise
Don’t know? Choose Jupyter and deal with the details live
Can’t find it?
^^ New Cat
Content!!!
^._.^
What about just tensorflow?*
ks registry add kubeflow github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow
ks pkg install kubeflow/core@${VERSION}
ks pkg install kubeflow/tf-serving@${VERSION}
ks pkg install kubeflow/tf-job@${VERSION}
Getting the chef's recommended pairing:
kfctl.sh init my_awesome_project --platform {none, gcp, minikube}
cd my_awesome_project
kfctl.sh generate platform && kfctl.sh apply platform
kfctl.sh generate k8s && kfctl.sh apply k8s
# Add spark
cd ks_app && ks pkg install kubeflow/spark
Photo: Douglas O'Brien
Connect to the Kubeflow Web UI
kubectl port-forward svc/ambassador -n kubeflow 8080:80 &
# Or use IAP, but that's… another story
The chef's recommended pairing is:
● JupyterHub
● TF Job & TF Serving
● PyTorch
● Katib (hyperparameter tuning)
● Ambassador (makes it easier to access the UIs)
● Pipelines (Argo + Magic)
Photo: chicoblue
Click-to-deploy: get started hella fast* on GCP
https://deploy.kubeflow.cloud
Click-to-deploy continued
ro Hasegawa
What are those pipelines?
“Kubeflow Pipelines is a platform for building and deploying portable,
scalable machine learning (ML) workflows based on Docker containers.” -
kubeflow.org
Directed Acyclic Graph (DAG) of “pipeline components” (read “docker
containers”) each performing a function.
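The DAG idea can be sketched in a few lines of plain Python. This is not the Kubeflow Pipelines SDK: here each "component" is just a callable standing in for a container, and `run_pipeline`, `steps`, and `depends_on` are names invented for this illustration.

```python
from collections import deque


def run_pipeline(steps, depends_on):
    """Run `steps` (name -> callable) in dependency order.

    `depends_on` maps a step name to the steps that must finish first.
    Each callable receives a dict of its upstream results.
    """
    remaining = {name: set(depends_on.get(name, ())) for name in steps}
    # Steps with no dependencies can start immediately.
    ready = deque(sorted(n for n, deps in remaining.items() if not deps))
    results = {}
    while ready:
        name = ready.popleft()
        upstream = {d: results[d] for d in depends_on.get(name, ())}
        results[name] = steps[name](upstream)
        # Unblock any step that was only waiting on the one just finished.
        for other, deps in remaining.items():
            if name in deps:
                deps.remove(name)
                if not deps:
                    ready.append(other)
    if len(results) != len(steps):
        raise ValueError("cycle or missing dependency: not a runnable DAG")
    return results


steps = {
    "prep": lambda up: [1, 2, 3],
    "train": lambda up: sum(up["prep"]),
    "serve": lambda up: f"model={up['train']}",
}
depends_on = {"train": ["prep"], "serve": ["train"]}
results = run_pipeline(steps, depends_on)
```

The "acyclic" part matters: if two steps wait on each other, nothing ever becomes ready, which is why the sketch raises instead of hanging.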
Building that pipeline?
Running that pipeline
Serving that job (not the only way)
When you don't know anything or know a
lot about ML: Hyperparameter Tuning
● Katib
○ Does not depend on a specific ML tool (e.g. not just TF)
○ Supports a few different search algorithms
○ e.g. "What should I set my L1 regularization to? Idk let's ask the computer"
● Great way to accidentally overfit too!* (*if you're not careful)
● As respective cloud providers, we are happy to rent you a lot of
resources
● Seriously, mention our names in the sales call. We're both going for
promo (and that shit is hard)
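"Let's ask the computer" can be as simple as random search. The sketch below is a stand-in to show the shape of the problem, not Katib's API or necessarily one of its algorithms; `random_search`, `space`, and the toy objective are assumptions for illustration.

```python
import random


def random_search(objective, space, trials=20, seed=0):
    """Sample `trials` configurations and return the best (params, score).

    `space` maps a parameter name to a (low, high) range sampled uniformly.
    `objective` returns a score where lower is better, e.g. validation loss.
    """
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(trials):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_objective(params):
    """Toy stand-in for 'train a model and measure validation loss'."""
    return (params["l1"] - 0.3) ** 2


best_params, best_score = random_search(
    toy_objective, {"l1": (0.0, 1.0)}, trials=200, seed=1
)
```

On the overfitting warning: if the objective is scored on the same data you report results on, enough trials will find a configuration that looks good by luck, so keep a separate held-out set for the final number.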
Katib Screenshot
But what about [special foo-baz-inator] or
[special-yak-shaving-tool]?
Write a Dockerfile and build an image, use FROM so you’re not starting from
scratch.
FROM gcr.io/kubeflow-images-public/tensorflow-1.6.0-notebook-cpu
RUN pip install py-special-yak-shaving-tool
Then set it as a param for your training/serving job as needed:
ks param set tfjob-v1alpha2 image "my-special-image-goes-here"
Now your Fortran lives forever!
Live streamed demos (recorded on
YouTube)
● Kubeflow intro
https://codelabs.developers.google.com/codelabs/kubeflow-introduction/index.html & streamed http://bit.ly/kfIntroStream
● Kubeflow E2E with GitHub issue summarization
https://codelabs.developers.google.com/codelabs/cloud-kubeflow-e2e-gis/ & streamed http://bit.ly/kfGHStream
● You can tell they were live streamed by how poorly they went; I promise no
video editing has occurred.
Limited & Optional: Workshop Demo
● We are doing a workshop @ Strata SF and we'd love to offer you an
exciting opportunity to try a self-guided version free of charge and
provide us feedback. We have some demo accts.
● What you might learn:
○ Installing Kubeflow
○ Setting up a project
○ Deploying that project to GCP / Azure / IBM
○ Monkeying around with a project and still having it work
● Please come and talk to us after. Holden is wearing a shark dress.
● We'll be around to answer your questions
Photo: fionasjournal
Why you shouldn't use
this?
Downsides to Kubeflow
● Lots of overhead versus doing it locally
● Active development (look it's 0.4)
● 3 talks on Kubeflow can give you 3 different toolsets
Questions? Half-baked Demo?

Introducing Kubeflow (w. Special Guests Tensorflow and Apache Spark)
