Serverless Compute Platforms
on Kubernetes:
Beyond Web Applications
Alex Glikson
Senior Research Architect, Cloud Platforms
Carnegie Mellon University, Pittsburgh, USA
(IBM Research, Israel)
KubeCon, May 2019
with Ping-Min Lin (Pinterest), Shengjie Luo (VMware), Ke Chang (Facebook), Shichao Nie (Alibaba)
KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019
Outline
● Introduction
○ Serverless
■ Serverless Compute
● FaaS
● Non-FaaS
● Our Use-Cases
○ Interactive Computing
■ Demo
○ Deep Learning
● Conclusions
Serverless
● Many definitions. In a nutshell:
● Avoid management of servers, as a representative example of tasks that:
○ Keep you distracted from developing your *core* business capabilities, and
○ Can be outsourced to someone you trust, for whom this would be *their* core business
● Serverless = Distraction-Free
● Separation of concerns
● Developer experience??
Serverless = Distraction-Free (Examples)
● Object Storage (example: Amazon S3):
○ Core: data organization
○ Distraction: servers, storage, network, high availability, fault tolerance, replication, consistency
● Micro-services (example: Kubernetes+Istio+…):
○ Core: services logic, interfaces
○ Distraction: infra, scaling, LB, HA/FT, API management, routing, service discovery, databases
● Async/Event-driven (example: Lambda, SNS, etc):
○ Core: event-processing logic
○ Distraction: eventing, messaging, queuing, notifications, etc (+infra/scaling/LB/HA/FT/auth/etc)
● …
Serverless Compute Platform (SCP)
● Platform that executes user-provided code (BYOC)
● Often optimized for specific application patterns
○ Often associated with fine-grained elasticity, scaling to zero, etc
● Distraction-free
○ Simplified management
■ Deployment, scaling, metering, monitoring, logging, updates, etc
○ Seamless integration with services that the ‘compute’ interacts with (or depends on)
■ Event sources, data, communication middleware, etc.
SCP: Function as a Service (FaaS)
Platform: General-Purpose FaaS
● Examples: Lambda, Azure Functions, Google Cloud Functions; Kubeless, OpenFaaS, OpenWhisk
● Code: Arbitrary functions
● Application Pattern: (Not too) short-lived, ephemeral functions, triggered by events or requests; high load variability (including periods of idleness); (relatively) low sensitivity to latency
● Management: Fully managed runtime containers; functions and function invocations as first-class citizens
● Integration: Seamless integration with multiple event sources
SCP: Specialized (Embedded) FaaS
Platform: Programmable network edge FaaS
● Examples: PubNub Functions, Lambda@Edge
● Code: Arbitrary functions (programming languages often limited)
● Application Pattern: High-throughput, low-latency packet processing
● Management: Fully managed isolated runtime
● Integration: The hosting platform
Other (Non-FaaS?) SCPs: Serverless ETL
Platform: Serverless ETL
● Examples: AWS Glue
● Code: PySpark, PyShell jobs
● Application Pattern: Data-parallel Spark jobs (periodic or ad-hoc); non-parallel pre/post-processing jobs
● Management: Fully managed Spark cluster; Python runtime
● Integration: Data catalogue
Non-FaaS SCP: Cloud-Native Web Applications
Platform: Cloud-Native Web Applications
● Examples: Knative
● Code: Arbitrary application serving HTTP requests
● Application Pattern: Long-running, scale-out services; linear resource demand per request; often high-throughput, low-latency
● Management: K8s features + code-to-deploy, revisions, canary deployment, etc
● Integration: Service mesh, build, eventing
What Other Application Patterns Could Justify a Specialized SCP?
Platform: ?
● Examples: ?
● Code: ?
● Application Pattern: ?
● Management: ?
● Integration: ?
Interactive Computing
● Example: Data Science using Jupyter Notebook
● Architecture 1: Python + Spark
○ Scale-out Spark jobs
○ Requires Spark programming model
● Architecture 2: “pure” Python
○ Local execution, using non-parallel Python libraries
○ Not designed for scale-out, but can take advantage of scale-up
● Other example: Linux Shell
SCP for Interactive Computing
Platform: Interactive Computing (Jupyter, Shell)
● Code: Python, Bash
● Application Pattern: Iterative invocation of stateful, non-parallel, computation-intensive, ad-hoc tasks, triggered by explicit user interaction
○ Efficient persistence of state across invocations
○ Scale-up rather than scale-out
○ Scale to zero when idle
○ Easily re-programmable (code as payload)
● Management: Provisioning, management, scaling of underlying resources
● Integration: Data sources, auth, etc
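The requirement of persisting state across invocations, with code arriving as payload, can be illustrated with a minimal sketch. It assumes that user state is a plain dictionary of variables and uses the standard-library `pickle` module as a stand-in serializer; `STATE_FILE`, `load_state`, `save_state`, and `run_cell` are hypothetical names chosen for illustration and are not taken from Runbox.

```python
import os
import pickle
import tempfile

# Hypothetical snapshot location; a real platform would keep it on a persistent volume.
STATE_FILE = os.path.join(tempfile.gettempdir(), "scp_state_demo.pkl")


def load_state(path=STATE_FILE):
    """Restore variables saved by a previous invocation (empty dict if none)."""
    if not os.path.exists(path):
        return {}
    with open(path, "rb") as f:
        return pickle.load(f)


def save_state(state, path=STATE_FILE):
    """Snapshot the variables that can be serialized; silently skip the rest."""
    snapshot = {}
    for name, value in state.items():
        if name.startswith("__"):  # e.g. __builtins__ injected by exec()
            continue
        try:
            pickle.dumps(value)
        except Exception:
            continue  # modules, open files, etc. are not persisted
        snapshot[name] = value
    with open(path, "wb") as f:
        pickle.dump(snapshot, f)


def run_cell(code, path=STATE_FILE):
    """Execute one ad-hoc snippet (code as payload) against the restored state."""
    state = load_state(path)
    exec(code, state)
    save_state(state, path)
    return state
```

Because every call restores and then re-saves the state, the compute that runs `run_cell` can be discarded between calls (scale to zero) without losing the user's variables.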
Runbox: Elastic Persistent Execution Environment on K8s
https://github.com/slsvm/runbox
[Figure: Runbox architecture. Labels: UI (e.g., Jupyter, Bash), Runbox Proxy, Notebook Filesystem, Dev Machine, Kubernetes Cluster, Runbox Controller*, Runbox, Pod/RS, Container, Data Volume; operations: Create, Exec, Recycle.]
DEMO – Bash
● https://github.com/slsvm/runbox
Demo screenshots (annotations):
● Runbox environment: Pod, Image, Volume (+deployment, side-car)
● Remote command execution
● Filesystem synchronization
● Persistent over recycling of idle resource (e.g., by Runbox controller)
● Per-command vertical scaling
DEMO – Jupyter
● https://github.com/slsvm/runbox-jupyter (COMING SOON)
Architecture - Jupyter
[Figure: Jupyter architecture. Labels: Jupyter-Browser, Jupyter Server, Runbox Extension, Notebook Filesystem, Dev Machine, Kubernetes Cluster, Runbox Controller*, Runbox, Pod/RS, Container, Data Volume. Numbered steps: 1 start kernel, 2 create, 3 sync, 4 resize, 5 run cell, 6 resize, 7 exec, 8 restore, 9 exec, 10 save, 11 and 12 (labels not recoverable). Other labels: sync, cold, save, GC, up.]
Design Details
● Special Jupyter Kernels, delegating execution to a K8s Pod using `kubectl exec`
○ E.g., scp-python, scp-bash
● State is persisted in a K8s volume attached to the Pod
○ Snapshot/restore in-memory state using `dill` in Python and `set/source` in Bash
○ Also, state is synchronized from/to the local machine via a side-car running unison
● Pod is scaled down (optionally, to zero) when nothing is executed
○ E.g., by scaling the containing ReplicaSet, or using in-place Pod vertical scaling (WIP)
○ Tradeoff between capacity for ‘warm’ containers and latency managed by dedicated controller
● When image changes (e.g., after `apt install`), a new image is committed
○ Using tags for versioning; docker-squash to remove redundant layers
● Magics to control the non-functional properties
○ E.g., resource allocation, whether or not image snapshot is needed, etc
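The delegation and scale-down steps above can be sketched as command builders. This is a minimal illustration, not the Runbox implementation: it only constructs `kubectl exec` and `kubectl scale` argument lists, and the function names, Pod name, and ReplicaSet name are hypothetical.

```python
import shlex


def exec_command(pod, command, namespace="default", container=None):
    """Build the argv for running a shell command inside a Pod via `kubectl exec`."""
    argv = ["kubectl", "exec", "-n", namespace, pod]
    if container:
        argv += ["-c", container]
    # Everything after "--" is handed to the container unchanged.
    argv += ["--", "bash", "-c", command]
    return argv


def scale_command(replicaset, replicas, namespace="default"):
    """Build the argv for resizing the ReplicaSet that contains the Pod."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return ["kubectl", "scale", "-n", namespace,
            f"replicaset/{replicaset}", f"--replicas={replicas}"]


def as_shell(argv):
    """Render an argv list as a copy-pasteable, safely quoted shell line."""
    return " ".join(shlex.quote(a) for a in argv)


if __name__ == "__main__":
    print(as_shell(exec_command("runbox-0", "ls /data")))  # hypothetical Pod name
    print(as_shell(scale_command("runbox", 0)))            # scale to zero when idle
```

A kernel built this way would pass the resulting list to `subprocess.run`; keeping the builders separate from execution makes them easy to test without a cluster.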
Lessons Learned
● Kubernetes originally focused on scale-out workloads, but can also support scale-up
○ New kind of controller?
● Generic support for application-assisted snapshots could be useful
● For use-cases involving ephemeral compute, API for direct access to volumes could be useful
Deep Learning
● Resource-intensive
○ (1) model training, (2) inference
● Frameworks: TensorFlow, Keras, PyTorch, etc.
● ‘Hot’ research area – new algorithms, frameworks, etc
● Example application: Image Classification
○ Given a model + unlabeled example(s), predict label(s)
○ Compute-intensive, scale-out, can leverage GPUs
[Figure: example application domains: transportation, medicine, smart cities, security, consumer, games, e-commerce]
SCP for Deep Learning Inference
Platform: Deep Learning Inference
● Code: Model inference implementation (Python)
● Application Pattern: Long-running, scale-out services; linear resource demand per request; load variance; can benefit from running on GPUs; potentially large “cold-start” latencies
● Management: Same as Knative (build, serving, eventing); load balancing between GPU and CPU resources; minimal ‘cold-start’ latency
● Integration: K8s, Istio, model storage, etc
Our Architecture
[Figure: architecture for deep learning inference. Labels: User, Hybrid Service, GPU-aware Load Balancer, LB, GPU Scheduler, Pool Manager, Standby Pool, GPU Nodes, CPU Nodes, Knative Service 1 to 4 (each with its Pods and scaling).]
Design Details
● Build: Automatically add HTTP interface
○ Augment the provided inference logic with a Django ‘wrapper’, then use Knative build to deploy it
● Load-balancing across GPU-enabled and CPU-only nodes
○ Patch Knative to support GPU resources
○ Based on model properties, indicate in the Knative service template whether a GPU is preferable
○ Two-level scheduling: 1 GPU service and 1 CPU service for each app; fair time-sharing of GPUs
● Maintain a pool of ‘warm’ Pods
○ “Pool” is a ReplicaSet with ‘warm’ (running) Pods
■ Size is adjusted dynamically by the Pool Controller (cluster utilization, estimated demand)
○ Knative scaling logic consumes a warm Pod from the Pool instead of provisioning a new one
■ Pod “migration” is implemented by label manipulation + update of the Istio side-car via API
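The warm-pool idea above can be illustrated with a toy model in which Pods are selected purely by label, so that "migrating" a Pod means rewriting its labels. This is a sketch under stated assumptions: the `Cluster` class, the label keys `role`, `service`, and `model`, and the matching rule (same model) are hypothetical, and the Istio side-car update is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    name: str
    labels: dict = field(default_factory=dict)


class Cluster:
    """Toy stand-in for the API server: Pods are found by label selector only."""

    def __init__(self):
        self.pods = []

    def select(self, **selector):
        return [p for p in self.pods
                if all(p.labels.get(k) == v for k, v in selector.items())]


def scale_up(cluster, service, model):
    """Return (pod, 'warm' | 'cold') for `service`, preferring a warm pool Pod."""
    warm = cluster.select(role="pool", model=model)
    if warm:
        pod = warm[0]
        pod.labels["role"] = "serving"   # no longer matches the pool's selector
        pod.labels["service"] = service  # now matches the service's selector
        return pod, "warm"
    # Pool is empty: fall back to provisioning a new Pod (the cold-start path).
    pod = Pod(f"{service}-{len(cluster.pods)}",
              {"role": "serving", "service": service, "model": model})
    cluster.pods.append(pod)
    return pod, "cold"
```

In a real cluster the pool's ReplicaSet would notice the missing Pod and start a replacement, which is what keeps the pool at its target size.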
Lessons Learned
● Standardized HTTP wrappers can be used to deliver FaaS-like experience
○ Can leverage existing open source FaaS solutions (e.g., OpenWhisk)
● More fine-grained management of GPU resources would be beneficial
○ The overhead of 2-level scheduling is substantial
● For reuse of ‘warm’ Pods, stronger notion of ‘similarity’ between Pods is needed
○ E.g., same model version?
● Even pool of size 1 significantly reduces the chances of cold starts
○ Instead of pools, can we reuse priority classes and make Knative scaling logic adjust priorities?
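The first lesson, wrapping user-provided inference logic in a standardized HTTP interface, can be sketched with the standard library alone. The talk describes a Django wrapper; the WSGI callable below is a substitute chosen so that the example is self-contained, and `make_wsgi_app`, `invoke`, and `classify` are hypothetical names.

```python
import io
import json


def make_wsgi_app(predict):
    """Wrap a plain `predict(payload) -> result` function in a WSGI application."""

    def app(environ, start_response):
        if environ.get("REQUEST_METHOD") != "POST":
            start_response("405 Method Not Allowed", [("Content-Type", "text/plain")])
            return [b"POST only\n"]
        length = int(environ.get("CONTENT_LENGTH") or 0)
        try:
            payload = json.loads(environ["wsgi.input"].read(length) or b"{}")
            body = json.dumps(predict(payload)).encode()
            status = "200 OK"
        except Exception as exc:  # bad input or a failure in the user code
            body = json.dumps({"error": str(exc)}).encode()
            status = "500 Internal Server Error"
        start_response(status, [("Content-Type", "application/json"),
                                ("Content-Length", str(len(body)))])
        return [body]

    return app


def invoke(app, payload):
    """Call the app in-process, the way a WSGI server would, and decode the reply."""
    raw = json.dumps(payload).encode()
    seen = {}
    environ = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(raw)),
               "wsgi.input": io.BytesIO(raw)}
    out = b"".join(app(environ, lambda status, headers: seen.update(status=status)))
    return seen["status"], json.loads(out)


def classify(payload):
    # Hypothetical user-provided inference logic: a trivial rule, not a real model.
    return {"label": "long" if len(payload.get("pixels", [])) > 3 else "short"}


app = make_wsgi_app(classify)
```

The user only writes `classify`; the wrapper supplies the request parsing and response encoding, and the app can be served with `wsgiref.simple_server.make_server("", 8080, app).serve_forever()`.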
Conclusions
● “Serverless” = BYOC + distraction-free
● “Serverless” derives different requirements for different workloads
● No one-size-fits-all!
● Lots of opportunities to deliver ‘serverless’ experience for new workloads!
○ Knative can be enhanced to achieve “serverless” goals for DL inference (KFserving?)
○ SCP for Interactive Computing requires new capabilities on top of Kubernetes
■ https://github.com/slsvm/runbox
Questions? Ideas? Suggestions? Collaboration?
● alex dot glikson at gmail dot com
63

More Related Content

PDF
Building serverless applications with Apache OpenWhisk
PDF
Containers vs serverless - Navigating application deployment options
PDF
Building serverless applications with Apache OpenWhisk and IBM Cloud Functions
PDF
Serverless Architectures in Banking: OpenWhisk on IBM Bluemix at Santander
PDF
Serverless APIs with Apache OpenWhisk
PDF
Workshop: Develop Serverless Applications with IBM Cloud Functions
PDF
Developing Serverless Applications on Kubernetes with Knative
PDF
Kubernetes, Istio and Knative - noteworthy practical experience
Building serverless applications with Apache OpenWhisk
Containers vs serverless - Navigating application deployment options
Building serverless applications with Apache OpenWhisk and IBM Cloud Functions
Serverless Architectures in Banking: OpenWhisk on IBM Bluemix at Santander
Serverless APIs with Apache OpenWhisk
Workshop: Develop Serverless Applications with IBM Cloud Functions
Developing Serverless Applications on Kubernetes with Knative
Kubernetes, Istio and Knative - noteworthy practical experience

Similar to Serverless Compute Platforms on Kubernetes (20)

PDF
Building Cloud-Native Applications with Kubernetes, Helm and Kubeless
PDF
[Capitole du Libre] #serverless -  mettez-le en oeuvre dans votre entreprise...
PPSX
Cloud Architecture - Multi Cloud, Edge, On-Premise
PDF
stackconf 2020 | The path to a Serverless-native era with Kubernetes by Paolo...
PPTX
Kubernetes workshop -_the_basics
PDF
apidays LIVE Hong Kong 2021 - Event-driven APIs & Schema governance for Apach...
PPTX
Cloud computing: highlights
DOCX
AWS Cloud Solutions Architects & Tech Enthusiasts
PPTX
Cloud native Kafka | Sascha Holtbruegge and Margaretha Erber, HiveMQ
PDF
RedisConf18 - Redis in Dev, Test, and Prod with the OpenShift Service Catalog
PPTX
Building Serverless Microservices Using Serverless Framework on the Cloud
PDF
Confluent Operator as Cloud-Native Kafka Operator for Kubernetes
PDF
Streaming Movies brings you Streamlined Applications -- How Adopting Netflix ...
PPTX
Building Cross-Cloud Platform Cognitive Microservices Using Serverless Archit...
PPTX
An introduction to Serverless
PPTX
CNCF Introduction - Feb 2018
PPTX
MongoDB World 2018: Partner Talk - Red Hat: Deploying to Enterprise Kubernetes
PPTX
Migrate a on-prem platform to the public cloud with Java - SpringBoot and PCF
PDF
Kubernetes - Cloud Native Application Orchestration - Catalin Jora
PDF
Software Engineering in the (AWS) Cloud
Building Cloud-Native Applications with Kubernetes, Helm and Kubeless
[Capitole du Libre] #serverless -  mettez-le en oeuvre dans votre entreprise...
Cloud Architecture - Multi Cloud, Edge, On-Premise
stackconf 2020 | The path to a Serverless-native era with Kubernetes by Paolo...
Kubernetes workshop -_the_basics
apidays LIVE Hong Kong 2021 - Event-driven APIs & Schema governance for Apach...
Cloud computing: highlights
AWS Cloud Solutions Architects & Tech Enthusiasts
Cloud native Kafka | Sascha Holtbruegge and Margaretha Erber, HiveMQ
RedisConf18 - Redis in Dev, Test, and Prod with the OpenShift Service Catalog
Building Serverless Microservices Using Serverless Framework on the Cloud
Confluent Operator as Cloud-Native Kafka Operator for Kubernetes
Streaming Movies brings you Streamlined Applications -- How Adopting Netflix ...
Building Cross-Cloud Platform Cognitive Microservices Using Serverless Archit...
An introduction to Serverless
CNCF Introduction - Feb 2018
MongoDB World 2018: Partner Talk - Red Hat: Deploying to Enterprise Kubernetes
Migrate a on-prem platform to the public cloud with Java - SpringBoot and PCF
Kubernetes - Cloud Native Application Orchestration - Catalin Jora
Software Engineering in the (AWS) Cloud
Ad

More from Alex Glikson (9)

PDF
DevOpsDaysTLV24 - Spot Workload Optimization ML.pdf
PPTX
AWS Re:Invented
PPTX
From chroot to Docker to Kubernetes
PDF
Cloud-Native Application and Kubernetes
PDF
Mixing bare-metal and virtualized workloads on OpenStack - 2014
PPTX
Serverless, IoT and OpenWhisk
PDF
Container-Based Platforms and Kubernetes
PDF
Going Serverless with OpenWhisk
PDF
The Serverless Paradigm, OpenWhisk and FIWARE
DevOpsDaysTLV24 - Spot Workload Optimization ML.pdf
AWS Re:Invented
From chroot to Docker to Kubernetes
Cloud-Native Application and Kubernetes
Mixing bare-metal and virtualized workloads on OpenStack - 2014
Serverless, IoT and OpenWhisk
Container-Based Platforms and Kubernetes
Going Serverless with OpenWhisk
The Serverless Paradigm, OpenWhisk and FIWARE
Ad

Recently uploaded (20)

PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
Spectroscopy.pptx food analysis technology
PPTX
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
PDF
Network Security Unit 5.pdf for BCA BBA.
PDF
Approach and Philosophy of On baking technology
PPTX
sap open course for s4hana steps from ECC to s4
PDF
Empathic Computing: Creating Shared Understanding
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
Encapsulation theory and applications.pdf
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
NewMind AI Weekly Chronicles - August'25-Week II
PPTX
Big Data Technologies - Introduction.pptx
PDF
cuic standard and advanced reporting.pdf
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
A comparative analysis of optical character recognition models for extracting...
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Diabetes mellitus diagnosis method based random forest with bat algorithm
Chapter 3 Spatial Domain Image Processing.pdf
Spectroscopy.pptx food analysis technology
ACSFv1EN-58255 AWS Academy Cloud Security Foundations.pptx
Network Security Unit 5.pdf for BCA BBA.
Approach and Philosophy of On baking technology
sap open course for s4hana steps from ECC to s4
Empathic Computing: Creating Shared Understanding
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Electronic commerce courselecture one. Pdf
Encapsulation theory and applications.pdf
Building Integrated photovoltaic BIPV_UPV.pdf
NewMind AI Weekly Chronicles - August'25-Week II
Big Data Technologies - Introduction.pptx
cuic standard and advanced reporting.pdf
The Rise and Fall of 3GPP – Time for a Sabbatical?
Review of recent advances in non-invasive hemoglobin estimation
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
A comparative analysis of optical character recognition models for extracting...

Serverless Compute Platforms on Kubernetes

  • 1. Serverless Compute Platforms on Kubernetes: Beyond Web Applications Alex Glikson Senior Research Architect, Cloud Platforms Carnegie Mellon University, Pittsburgh, USA (IBM Research, Israel) KubeCon, May 2019 with Ping-Min Lin (Pinterest), Shengjie Luo (VMware), Ke Chang (Facebook), Shichao Nie (Alibaba)
  • 2. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Outline ● Introduction ○ Serverless ■ Serverless Compute ● FaaS ● Non-FaaS ● Our Use-Cases ○ Interactive Computing ■ Demo ○ Deep Learning ● Conclusions 2
  • 3. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Serverless ● Many definitions. In a nutshell: ● Avoid management of servers, as a representative example of tasks that: ○ Keep you distracted from developing your *core* business capabilities, and ○ Can be outsourced to someone you trust, for whom this would be *their* core business ● Serverless = Distraction-Free ● Separation of concerns ● Developer experience?? 3
  • 4. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Serverless = Distraction-Free (Examples) ● Object Storage: ○ Core: data organization ○ Distraction: servers, storage, network, high availability, fault tolerance, replication, consistency ● Micro-services: ○ Core: services logic, interfaces ○ Distraction: infra, scaling, LB, HA/FT, API management, routing, service discovery, databases ● Async/Event-driven: ○ Core: event-processing logic ○ Distraction: eventing, messaging, queuing, notifications, etc (+infra/scaling/LB/HA/FT/auth/etc) ● … 4 Example: Amazon S3 Example: Kubernetes+Istio+… Example: Lambda, SNS, etc
  • 5. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Serverless Compute Platform (SCP) ● Platform that executes user-provided code (BYOC) ● Often optimized for specific application patterns ○ Often associated fine-grained elasticity, scaling to zero, etc ● Distraction-free ○ Simplified management ■ Deployment, scaling, metering, monitoring, logging, updates, etc ○ Seamless integration with services that the ‘compute’ interacts with (or depends on) ■ Event sources, data, communication middleware, etc. 5
  • 6. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 6 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Application Pattern Management Integration
  • 7. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 7 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern Management Integration
  • 8. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 8 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern (Not too) short-lived, ephemeral functions, triggered by events or requests; High load variability (including periods of idleness), (relatively) low sensitivity to latency Management Integration
  • 9. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 9 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern (Not too) short-lived, ephemeral functions, triggered by events or requests; High load variability (including periods of idleness), (relatively) low sensitivity to latency Management Fully managed runtime containers; functions & function invocations as first class citizens Integration
  • 10. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 SCP: Function as a Service (FaaS) 10 Platform Property General-Purpose FaaS Examples Lambda, Azure functions, Google Functions; Kubeless, OpenFaaS, OpenWhisk Code Arbitrary functions Application Pattern (Not too) short-lived, ephemeral functions, triggered by events or requests; High load variability (including periods of idleness), (relatively) low sensitivity to latency Management Fully managed runtime containers; functions & function invocations as first class citizens Integration Seamless integration with multiple event sources
  • 15. SCP: Specialized (Embedded) FaaS
    Platform Property | Programmable network edge FaaS
    Examples | PubNub Functions, Lambda@Edge
    Code | Arbitrary functions (programming languages often limited)
    Application Pattern | High-throughput, low-latency packet processing
    Management | Fully managed isolated runtime
    Integration | The hosting platform
  • 20. Other (Non-FaaS?) SCPs: Serverless ETL
    Platform Property | Serverless ETL
    Examples | AWS Glue
    Code | PySpark, PyShell jobs
    Application Pattern | Data-parallel Spark jobs (periodic or ad-hoc); non-parallel pre/post-processing jobs
    Management | Fully managed Spark cluster; Python runtime
    Integration | Data catalogue
  • 26. Non-FaaS SCP: Cloud-Native Web Applications
    Platform Property | Cloud-Native Web Applications
    Examples | Knative
    Code | Arbitrary application serving HTTP requests
    Application Pattern | Long-running, scale-out services; linear resource demand per request; often high-throughput, low-latency
    Management | K8s features + code-to-deploy, revisions, canary deployment, etc.
    Integration | Service mesh, build, eventing
  • 27. What Other Application Patterns Could Justify a Specialized SCP?
    Platform Property | ?
    Examples | ?
    Code | ?
    Application Pattern | ?
    Management | ?
    Integration | ?
  • 28. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Outline ● Introduction ○ Serverless ■ Serverless Compute ● FaaS ● Non-FaaS ● Our Use-Cases ○ Interactive Computing ○ Deep Learning ● Conclusions 28
  • 29. Interactive Computing
    ● Example: Data Science using Jupyter Notebook
    ● Architecture 1: Python + Spark
      ○ Scale-out Spark jobs
      ○ Requires Spark programming model
    ● Architecture 2: “pure” Python
      ○ Local execution, using non-parallel Python libraries
      ○ Not designed for scale-out, but can take advantage of scale-up
    ● Other example: Linux Shell
  • 34. SCP for Interactive Computing
    Property | Interactive Computing (Jupyter, Shell)
    Code | Python, Bash
    Application Pattern | Iterative invocation of stateful, non-parallel, computation-intensive, ad-hoc tasks, triggered by explicit user interaction
    Management | Provisioning, management, scaling of underlying resources
    Integration | Data sources, auth, etc.
    Callouts:
    ● Efficient persistence of state across invocations
    ● Scale-up rather than scale-out
    ● Scale to zero when idle
    ● Easily re-programmable (code as payload)
  • 35. Runbox: Elastic Persistent Execution Environment on K8s
    https://github.com/slsvm/runbox
    [Architecture diagram. Labels: Dev Machine, UI (e.g., Jupyter, Bash), Runbox Proxy, Notebook, Filesystem, Kubernetes Cluster, Runbox, Pod/RS, Container, Data Volume, Runbox Controller*. Operations: Create, Exec, Recycle.]
  • 36. DEMO – Bash
    ● https://github.com/slsvm/runbox
  • 39.
    ● Runbox environment: Pod, Image, Volume (+deployment, side-car)
    ● Remote command execution
    ● Filesystem synchronization
    ● Persistent over recycling of idle resource (e.g., by Runbox controller)
    ● Per-command vertical scaling
  • 41. DEMO – Jupyter
    ● https://github.com/slsvm/runbox-jupyter (COMING SOON)
  • 48. Architecture - Jupyter
    [Architecture diagram. Labels: Dev Machine, Jupyter-Browser, Jupyter Server, Runbox Extension, Notebook, Filesystem, Kubernetes Cluster, Runbox, Pod/RS, Container, Data Volume, Runbox Controller*. Numbered steps: 1 start kernel, 2 create, 3 sync, 4 resize, 5 run cell, 6 resize up, 7 exec, 8 restore, 9 exec, 10 save (labels for steps 11 and 12 not captured). Other labels: sync, cold save, GC.]
  • 49. Design Details
    ● Special Jupyter Kernels, delegating execution to a K8s Pod using `kubectl exec`
      ○ E.g., scp-python, scp-bash
    ● State is persisted in a K8s volume attached to the Pod
      ○ Snapshot/restore in-memory state using `dill` in Python and `set/source` in Bash
      ○ Also, state is synchronized from/to the local machine via a side-car running unison
    ● Pod is scaled down (optionally, to zero) when nothing is executed
      ○ E.g., by scaling the containing ReplicaSet, or using in-place Pod vertical scaling (WIP)
      ○ Tradeoff between capacity for ‘warm’ containers and latency managed by dedicated controller
    ● When image changes (e.g., after `apt install`), a new image is committed
      ○ Using tags for versioning; docker-squash to remove redundant layers
    ● Magics to control the non-functional properties
      ○ E.g., resource allocation, whether or not image snapshot is needed, etc.
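The delegation and scale-down steps above can be illustrated with a minimal Python sketch that only builds the `kubectl` command lines a kernel might issue. This is not the Runbox implementation: the function names, the ReplicaSet name `runbox`, the Pod name `runbox-0`, and the choice of `bash -c` as the entry point are all assumptions made for illustration. Only the `kubectl scale` and `kubectl exec` invocations themselves are standard.

```python
from typing import List, Optional


def scale_argv(replicaset: str, replicas: int, namespace: str = "default") -> List[str]:
    """Command line that scales a ReplicaSet (0 = scale to zero when idle)."""
    return ["kubectl", "scale", f"replicaset/{replicaset}",
            f"--replicas={replicas}", "-n", namespace]


def exec_argv(pod: str, command: str, namespace: str = "default",
              container: Optional[str] = None) -> List[str]:
    """Command line that runs one shell command inside a Pod."""
    argv = ["kubectl", "exec", pod, "-n", namespace]
    if container is not None:
        argv += ["-c", container]
    # Everything after "--" is passed to the container unchanged.
    argv += ["--", "bash", "-c", command]
    return argv


def run_cell_plan(replicaset: str, pod: str, command: str,
                  namespace: str = "default") -> List[List[str]]:
    """Hypothetical per-cell plan: make sure one replica exists, then exec.

    Scaling back down is left to a separate controller. In a real cluster the
    Pod name would have to be looked up after scale-up; it is passed in here
    to keep the sketch free of cluster access.
    """
    return [scale_argv(replicaset, 1, namespace),
            exec_argv(pod, command, namespace)]
```

A caller would hand each list to `subprocess.run`; keeping the builders pure makes them easy to test without a cluster.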
  • 50. Lessons Learned
    ● Kubernetes originally focused on scale-out workloads, but can also support scale-up
      ○ New kind of controller?
    ● Generic support for application-assisted snapshots could be useful
    ● For use-cases involving ephemeral compute, API for direct access to volumes could be useful
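To make "application-assisted snapshot" concrete, here is a minimal Python sketch of saving and restoring interpreter state between invocations. The design described earlier uses `dill`; this sketch substitutes the standard-library `pickle` so that it runs without extra packages, which means functions, lambdas and other objects that `dill` could handle are simply skipped. The function names and the rule of skipping names that start with an underscore are assumptions.

```python
import pickle
import types
from typing import Any, Dict


def snapshot(namespace: Dict[str, Any]) -> bytes:
    """Serialize the picklable user variables of a namespace."""
    saved: Dict[str, Any] = {}
    for name, value in namespace.items():
        # Skip interpreter internals and imported modules.
        if name.startswith("_") or isinstance(value, types.ModuleType):
            continue
        try:
            pickle.dumps(value)
        except Exception:
            # Not serializable with pickle; a dill-based version might keep it.
            continue
        saved[name] = value
    return pickle.dumps(saved)


def restore(blob: bytes, namespace: Dict[str, Any]) -> None:
    """Load a snapshot back into a (possibly fresh) namespace."""
    namespace.update(pickle.loads(blob))
```

In the setting of the talk, the bytes returned by `snapshot` would be written to the volume attached to the Pod before it is recycled, and `restore` would run when the next command arrives.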
  • 51. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Outline ● Introduction ○ Serverless ■ Serverless Compute ● FaaS ● Non-FaaS ● Our Use-Cases ○ Interactive Computing ○ Deep Learning ● Conclusions 51
  • 52. Deep Learning
    ● Resource-intensive
      ○ (1) model training, (2) inference
    ● Frameworks: TensorFlow, Keras, PyTorch, etc.
    ● ‘Hot’ research area – new algorithms, frameworks, etc.
    ● Example application: Image Classification
      ○ Given a model + unlabeled example(s), predict label(s)
      ○ Compute-intensive, scale-out, can leverage GPUs
    [Figure labels: transportation, medicine, smart cities, security, consumer, games, e-commerce]
  • 57. SCP for Deep Learning Inference
    Property | Deep Learning Inference
    Code | Model inference implementation (Python)
    Application Pattern | Long-running, scale-out services; linear resource demand per request; load variance; can benefit from running on GPUs; potentially large “cold-start” latencies
    Management | Same as Knative: build, serving, eventing; load balancing between GPU and CPU resources; minimal ‘cold-start’ latency
    Integration | K8s, Istio, model storage, etc.
  • 58. Our Architecture
    [Architecture diagram. Labels: User, Hybrid Service, GPU-aware Load Balancer (LB), GPU Scheduler, Pool Manager, Standby Pool (Pods); GPU Nodes with Knative Service 1 and Knative Service 2; CPU Nodes with Knative Service 3 and Knative Service 4; each Knative Service shown with its Pods and Pod scaling.]
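The role of the GPU-aware load balancer in front of a hybrid service can be sketched as a small routing policy in Python. The slides do not spell out the policy, so everything here is an assumption: the class names, the per-backend `capacity` limit, and the rule "try the preferred backend first, spill over to the other one, and fall back to the preferred one when both are full".

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """One Knative service of a hybrid service (GPU-backed or CPU-only)."""
    name: str
    capacity: int        # assumed limit on concurrent requests
    in_flight: int = 0

    def has_room(self) -> bool:
        return self.in_flight < self.capacity


@dataclass
class HybridService:
    """A GPU service and a CPU service that serve the same application."""
    gpu: Backend
    cpu: Backend
    gpu_preferred: bool  # e.g. derived from model properties

    def route(self) -> Backend:
        """Pick a backend for one request and count it as in flight."""
        order = [self.gpu, self.cpu] if self.gpu_preferred else [self.cpu, self.gpu]
        for backend in order:
            if backend.has_room():
                backend.in_flight += 1
                return backend
        # Both backends are saturated: send the request to the preferred one.
        order[0].in_flight += 1
        return order[0]

    def done(self, backend: Backend) -> None:
        """Mark one request on the given backend as finished."""
        backend.in_flight -= 1
```

For example, with a GPU backend of capacity 1 and a CPU backend of capacity 2, the first request goes to the GPU and the next two spill over to the CPU.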
  • 59. Design Details
    ● Build: Automatically add HTTP interface
      ○ Augment the provided inference logic with a Django ‘wrapper’, then use Knative build to deploy it
    ● Load-balancing across GPU-enabled and CPU-only nodes
      ○ Patch Knative to support GPU resources
      ○ Based on model properties, indicate in the Knative service template whether a GPU is preferable
      ○ Two-level scheduling: 1 GPU service and 1 CPU service for each app; fair time-sharing of GPUs
    ● Maintain a pool of ‘warm’ Pods
      ○ “Pool” is a ReplicaSet with ‘warm’ (running) Pods
        ■ Size is adjusted dynamically by the Pool Controller (cluster utilization, estimated demand)
      ○ Knative scaling logic consumes a warm Pod from the Pool instead of provisioning a new one
        ■ Pod “migration” is implemented by label manipulation + update of the Istio side-car via API
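The "migration by label manipulation" idea can be shown with a small in-memory Python sketch. In Kubernetes, a ReplicaSet owns the Pods that match its label selector, so relabeling a Pod removes it from the pool's ReplicaSet (which then starts a replacement) and lets another selector pick it up. The label keys and values, the Pod names, and the function names below are hypothetical, and the Istio side-car update mentioned above is left out.

```python
from typing import Dict, List, Optional

Pod = Dict[str, object]  # {"name": str, "labels": Dict[str, str]}

# Hypothetical selector of the ReplicaSet that holds the warm pool.
POOL_SELECTOR = {"role": "standby"}


def matches(pod: Pod, selector: Dict[str, str]) -> bool:
    """True if the Pod carries every label required by the selector."""
    labels = pod["labels"]
    return all(labels.get(key) == value for key, value in selector.items())


def claim_warm_pod(pods: List[Pod], service_labels: Dict[str, str]) -> Optional[Pod]:
    """Move one warm Pod from the pool to a service by rewriting its labels.

    Returns the claimed Pod, or None when the pool is empty and the caller
    has to fall back to a cold start.
    """
    for pod in pods:
        if matches(pod, POOL_SELECTOR):
            # After this assignment the Pod no longer matches POOL_SELECTOR.
            pod["labels"] = dict(service_labels)
            return pod
    return None
```

Against a real cluster the same step would be a label patch on the Pod object; the list of dictionaries stands in for the API server here.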
  • 60. Lessons Learned
    ● Standardized HTTP wrappers can be used to deliver a FaaS-like experience
      ○ Can leverage existing open source FaaS solutions (e.g., OpenWhisk)
    ● More fine-grained management of GPU resources would be beneficial
      ○ The overhead of 2-level scheduling is substantial
    ● For reuse of ‘warm’ Pods, a stronger notion of ‘similarity’ between Pods is needed
      ○ E.g., same model version?
    ● Even a pool of size 1 significantly reduces the chances of cold starts
      ○ Instead of pools, can we reuse priority classes and make Knative scaling logic adjust priorities?
  • 61. KubeCon / CloudNativeCon, Barcelona, May 20-23, 2019 Outline ● Introduction ○ Serverless ■ Serverless Compute ● FaaS ● Non-FaaS ● Our Use-Cases ○ Interactive Computing ○ Deep Learning ● Conclusions 61
  • 62. Conclusions
    ● “Serverless” = BYOC + distraction-free
    ● “Serverless” implies different requirements for different workloads
    ● No one-size-fits-all!
    ● Lots of opportunities to deliver a ‘serverless’ experience for new workloads!
      ○ Knative can be enhanced to achieve “serverless” goals for DL inference (KFServing?)
      ○ SCP for Interactive Computing requires new capabilities on top of Kubernetes
        ■ https://github.com/slsvm/runbox
  • 63. Questions? Ideas? Suggestions? Collaboration?
    ● alex dot glikson at gmail dot com