TU Wien, Vienna, Austria
Distributed Systems Group
https://dsg.tuwien.ac.at
Thomas Rausch @thrauat
Waldemar Hummer
Vinod Muthusamy
Alexander Rashed
Schahram Dustdar
Towards a Serverless Platform for Edge AI
IBM Research AI
HotEdge’19, Renton, WA
2
Drone
With Accelerator
Microsoft Build 2018 // Vision Keynote: https://www.youtube.com/watch?v=rd0Rd8w3FZ0
3
Edge AI Accelerators
Google Edge TPU
NVIDIA Jetson
Intel Neural Compute Stick
Baidu Kunlun
Microsoft Project BrainWave
Huawei Atlas
4
AI Operationalization
Hummer et al., ModelOps: Cloud-based Lifecycle Management for Reliable and Trusted AI. IC2E’19.
Process → Train → Validate → Serve
[Diagram labels: Model, Runtime, Monitoring, Data, Perf.; Object Store, Compute Cluster, Learning Cluster; Read Data, Train Model, Write Model; Data Asset, Trained Model]
ModelOps Platform
5
Serverless Model
[Diagram labels: {} Event (Request), Trigger, Node, Resource, λ (function instances)]
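import tempfile
import boto3
import numpy

def handle(req):
    s3 = boto3.client('s3')
    # download the training data into a temporary file, then rewind to read it
    with tempfile.TemporaryFile() as f:
        s3.download_fileobj('bucket', req['obj'], f)
        f.seek(0)
        data = numpy.load(f)
    m = train_model(data, req['train_params'])
    # serialize() is assumed to return a readable binary file-like object
    s3.upload_fileobj(serialize(m), 'bucket', 'model')
    # ...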
[Diagram labels: λ (functions), Function Scheduler, Cloud Platform]
6
Deviceless Model
[Diagram labels: {} Event (Request), Trigger, λ (functions), Function Scheduler, ??, Edge, Edge Cloud, Cloud Platform]
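import tempfile
import boto3
import numpy

def handle(req):
    s3 = boto3.client('s3')
    # download the training data into a temporary file, then rewind to read it
    with tempfile.TemporaryFile() as f:
        s3.download_fileobj('bucket', req['obj'], f)
        f.seek(0)
        data = numpy.load(f)
    m = train_model(data, req['train_params'])
    # serialize() is assumed to return a readable binary file-like object
    s3.upload_fileobj(serialize(m), 'bucket', 'model')
    # ...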
7
A Serverless Platform for Edge AI
AI Workflow Programming Model
● Data and Models as first-class citizens
● Model Selectors
● Policies
● Gates
Execution Platform
● Deviceless function scheduling
● Policy enactment
● Context awareness
● Data locality awareness
8
# Model selector
@consumes.model(selector={
    'type': 'image_classifier',
    'data_tags': ['machine_x'],
    'accuracy': '>=0.88'
})
def inference(model: Model, request):
    data = request['input']
    # data prep tasks
    prediction = model.estimate(data)
    return prediction

# Policies (shown on the slide as separate annotations)
@policy.deadline('2s')
@policy.fn(node='user_device', capability='gpu')
@policy.data(network=['company_network'], strict=True)

# Data and models as first-class citizens
@consumes.data(selector={'urn': 'mnist:data'}, holdout=0.2)
@produces.model(type='classifier', urn='mnist:model')
def train(data: Data, request) -> Model:
    arr = data.to_ndarray()
    return Model(train_model(arr))

# Gates (shown on the slide as separate annotations)
@gate.bias(attribute='age', predicate='<0.8')
@gate.drift(metric='confidence', predicate='<0.2')
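The slide does not say how a selector such as 'accuracy': '>=0.88' or a gate predicate such as '<0.8' is resolved. The following is a minimal sketch of one possible approach, not the platform's implementation. The registry layout, the metadata keys and the helper names parse_predicate, matches and select_model are all assumptions made for illustration.

```python
import operator
import re

# Comparison operators allowed in predicate strings such as '>=0.88' or '<0.2'.
_OPS = {'>=': operator.ge, '<=': operator.le, '==': operator.eq,
        '>': operator.gt, '<': operator.lt}
_PREDICATE = re.compile(r'^\s*(>=|<=|==|>|<)\s*(-?\d+(?:\.\d+)?)\s*$')


def parse_predicate(text):
    """Turn a string like '>=0.88' into a function value -> bool."""
    match = _PREDICATE.match(text)
    if match is None:
        raise ValueError(f'unsupported predicate: {text!r}')
    compare, threshold = _OPS[match.group(1)], float(match.group(2))
    return lambda value: compare(value, threshold)


def matches(selector, metadata):
    """Check one model's metadata (hypothetical schema) against a selector."""
    for key, wanted in selector.items():
        actual = metadata.get(key)
        if actual is None:
            return False
        if isinstance(wanted, list):
            # every requested tag must be present on the model
            if not set(wanted) <= set(actual):
                return False
        elif isinstance(actual, (int, float)) and isinstance(wanted, str):
            # numeric metadata is compared through a predicate string
            if not parse_predicate(wanted)(actual):
                return False
        elif actual != wanted:
            return False
    return True


def select_model(selector, registry):
    """Return the first registered model that satisfies the selector, or None."""
    return next((m for m in registry if matches(selector, m)), None)


# Hypothetical registry entries, for illustration only.
REGISTRY = [
    {'urn': 'model:a', 'type': 'image_classifier',
     'data_tags': ['machine_x'], 'accuracy': 0.81},
    {'urn': 'model:b', 'type': 'image_classifier',
     'data_tags': ['machine_x', 'machine_y'], 'accuracy': 0.93},
]

SELECTOR = {'type': 'image_classifier',
            'data_tags': ['machine_x'],
            'accuracy': '>=0.88'}

selected = select_model(SELECTOR, REGISTRY)
```

The same parse_predicate helper could back a gate check: a gate declared with predicate '<0.2' would pass while the monitored metric stays below 0.2.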
9
@consumes.model(selector={'urn': 'model:base'})
@consumes.data(batch=100, selector=...)
@produces.model(type='regressor', urn='model:user:{usr}')
@policy.fn(node='local')
@policy.data(network='local', strict=True)
def refine(model: Model, data: Data):
    ndarr = data.to_ndarray()  # data artifact API
    # transfer learning code
    return refined_model
[Diagram labels: Function preprocessor, Scheduler, Network (edge, private), node:{user}, container, Network (cloud), f(x), model u, data, data locality node, model b]
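The diagram names a function preprocessor that sits in front of the scheduler. One way to picture its job, assuming the decorators only record metadata, is sketched below. The decorator names use underscores in place of the slide's dotted names, and the __edge_meta__ attribute, preprocess and the descriptor layout are hypothetical and not taken from the platform.

```python
def _annotate(kind, **params):
    """Build a decorator that records (kind, params) on the function."""
    def decorator(fn):
        meta = getattr(fn, '__edge_meta__', [])
        # decorators are applied bottom-up, so prepend to keep source order
        fn.__edge_meta__ = [(kind, params)] + meta
        return fn
    return decorator


# Hypothetical stand-ins for the slide's decorators.
def consumes_model(**params):
    return _annotate('consumes.model', **params)


def produces_model(**params):
    return _annotate('produces.model', **params)


def policy_fn(**params):
    return _annotate('policy.fn', **params)


def policy_data(**params):
    return _annotate('policy.data', **params)


def preprocess(fn):
    """Collect the recorded annotations into a deployment descriptor."""
    descriptor = {'name': fn.__name__}
    for kind, params in getattr(fn, '__edge_meta__', []):
        descriptor[kind] = params
    return descriptor


@consumes_model(selector={'urn': 'model:base'})
@produces_model(type='regressor', urn='model:user:{usr}')
@policy_fn(node='local')
@policy_data(network='local', strict=True)
def refine(model, data):
    # transfer learning code would go here
    return model


descriptor = preprocess(refine)
```

A scheduler could then read descriptor['policy.fn'] and descriptor['policy.data'] to decide where the function may run, without executing the function body.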
10
Data Locality Tradeoffs
[Diagram labels: Cluster Middleware, Data proximity, Container Image, Edge]
Deploy the container image to the edge? OR send the data to the cloud?
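The tradeoff can be made concrete with a back-of-the-envelope estimate. The sketch below compares only transfer times and deliberately ignores differences in compute speed between edge and cloud. The function names, parameters and the example numbers are assumptions for illustration, not measurements from the talk.

```python
def transfer_seconds(size_mb, bandwidth_mbit_s):
    """Time to move size_mb megabytes over a link of bandwidth_mbit_s Mbit/s."""
    return size_mb * 8 / bandwidth_mbit_s


def cheaper_placement(image_mb, data_mb, downlink_mbit_s, uplink_mbit_s,
                      image_cached=False):
    """Compare pulling the container image to the edge with uploading the data.

    Returns a tuple (placement, seconds) where placement is 'edge' or 'cloud'.
    """
    # a cached image costs nothing to deploy again
    to_edge = 0.0 if image_cached else transfer_seconds(image_mb, downlink_mbit_s)
    to_cloud = transfer_seconds(data_mb, uplink_mbit_s)
    if to_edge <= to_cloud:
        return 'edge', to_edge
    return 'cloud', to_cloud


# Made-up example: 500 MB image, 100 MB of data,
# 100 Mbit/s downlink to the edge, 10 Mbit/s uplink to the cloud.
placement, seconds = cheaper_placement(500, 100, 100, 10)
```

With these made-up numbers the image pull takes 40 s and the data upload 80 s, so the function is placed at the edge. A 2000 MB image flips the decision unless the image is already cached on the edge node.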
11
Skippy
● Built on […] and Kubernetes
● Kubernetes daemon to discover node capabilities
● Custom Python-based Kubernetes scheduler
  ● Adds inter-node proximity and data locality as constraints
  ● Non-monolithic architecture
● Coming to GitHub soon™
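The slides do not show how Skippy weighs proximity and data locality. As a rough illustration of the idea only, the sketch below filters nodes by capability (a hard constraint) and then ranks them by data locality and latency to the data (soft constraints). The Node fields, the weights and the function names are hypothetical, and nothing here uses the Kubernetes API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capabilities: set = field(default_factory=set)
    latency_ms: dict = field(default_factory=dict)   # latency to other nodes
    stored_data: set = field(default_factory=set)    # data URNs held locally


def feasible(node, required_capabilities):
    """Hard constraint: the node must offer every required capability."""
    return required_capabilities <= node.capabilities


def score(node, data_urn, data_node, max_latency_ms=100.0):
    """Soft constraints: data locality first, then proximity to the data."""
    if data_urn in node.stored_data:
        return 1.0
    latency = node.latency_ms.get(data_node, max_latency_ms)
    return max(0.0, 1.0 - latency / max_latency_ms) * 0.5


def place(nodes, required_capabilities, data_urn, data_node):
    """Pick the feasible node with the best score, or None if none is feasible."""
    candidates = [n for n in nodes if feasible(n, required_capabilities)]
    if not candidates:
        return None
    return max(candidates, key=lambda n: score(n, data_urn, data_node))


# Hypothetical cluster, for illustration only.
edge_1 = Node('edge-1', {'gpu'}, {}, {'mnist:data'})
edge_2 = Node('edge-2', set(), {'edge-1': 5.0})
cloud_1 = Node('cloud-1', {'gpu'}, {'edge-1': 60.0})
NODES = [edge_1, edge_2, cloud_1]

chosen = place(NODES, {'gpu'}, 'mnist:data', 'edge-1')
```

In this example the GPU function lands on edge-1 because that node already holds the data. Without edge-1, a function with no capability requirement would go to edge-2, which is closer to the data than cloud-1.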
12
Preprocess, Train, Inference
Scheduler + Simulator: https://git.dsg.tuwien.ac.at/serverless-edge-ai/sched-sim
13
Dipl.-Ing. (MSc), BSc
Thomas Rausch
Research Assistant
TU Wien
Institute of Information Systems Engineering
Argentinierstrasse 8-194-02, Vienna, Austria
T: +43 1 58801-184838
E: trausch@dsg.tuwien.ac.at
https://dsg.tuwien.ac.at/staff/trausch
14
Discussion
● Correct level of abstraction?
● API/SDK features?
● Validation criteria?
● Deviceless model (does it work?)
● Transparent data management
● Scheduler architecture
● Request routing architecture
● Proximity and bandwidth monitoring
● Learning optimal placements
● Model too high-level for scheduler
● “Bring-your-own-device” will fail
Quadrants: (i) Feedback, (ii) Controversial points, (iii) Open issues, (iv) Failure risks
