ONNX MEETS FLINK
The long trudge towards integrating PyTorch, Chainer, CNTK, MXNet and other models in Flink
streaming applications.
The Problem/Motivation
ONNX
Overview
Limitations
End-to-end example with Java Embedded Python
Overview
Goals
Remove the barrier between A.I. "research" and "production."
Enable access to recent state-of-the-art models from major conferences and Python-based frameworks
Specifically, integrate deep learning models written in Python frameworks like PyTorch, CNTK, and Chainer into Flink pipelines for real-time inference on streaming data.
Challenge(s)
Poor Python support in Flink, and conversely poor ONNX support in Java
Converting a model to ONNX itself can be quite arduous
It can be challenging to rewrite pre-processing code in Java
Goals and challenges
International Conference on Learning Representations (ICLR) statistics
2018: 87 papers mentioned PyTorch (compared to 228 that mentioned Tensorflow)
2019: 252 papers mentioned PyTorch (compared to 266 that mentioned Tensorflow). Roughly a 190% increase!
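As a sanity check, the growth figure above is just the fractional change between the two counts:

```python
# ICLR papers mentioning PyTorch, 2018 vs. 2019 (figures from the slide above).
papers_2018 = 87
papers_2019 = 252
increase = (papers_2019 - papers_2018) / papers_2018  # fractional growth
percent_increase = round(increase * 100)              # rounds to 190
```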
The rise of PyTorch
[Chart: PyTorch vs. Tensorflow paper mentions, plus PyTorch-powered frameworks in NLP and other areas, e.g. torchvision, torchcv]
What is ONNX?
Open Neural Network Exchange: a standard format for representing deep learning models
WHY USE ONNX?
Backends that (at least in theory) run in a large number of
environments.
Can export models from a variety of frameworks to a single standard format
Exported models generally smaller (in terms of space) than full
models.
Overview of possible ways to
integrate ONNX models into Flink
Create a micro-service and use in conjunction with Flink
AsyncIO.
Use Java embedded Python (JEP) and run using Caffe2 (or
Tensorflow)
Load model natively into Java/Scala and run with a JVM backend
framework
ONNX frameworks overview
The ONNX Scoreboard measures the operations each backend supports
OPTIONS
ONNX options
Current backends
Caffe2 (Python, C++)
CNTK (C++, C#, Python, Java experimental)
Tensorflow-ONNX (Python) [not analogous to Tensorflow]
VESPA (Java)
Menoh (C++, Java, C#, Ruby, NodeJS)
Menoh in Java
Only 19 of the 116 ops available (so pretty limited for now)
import jp.preferred.menoh.DType;
import jp.preferred.menoh.ModelRunner;
import jp.preferred.menoh.ModelRunnerBuilder;

try (
    ModelRunnerBuilder builder = ModelRunner
        // Load ONNX model data
        .fromOnnxFile("squeezenet.onnx")
        // Define input profile (name, dtype, dims) and output profiles (name, dtype);
        // Menoh calculates the dims of outputs automatically at build time
        .addInputProfile(conv11InName, DType.FLOAT,
            new int[] {batchSize, channelNum, height, width})
        .addOutputProfile(fc6OutName, DType.FLOAT)
        .addOutputProfile(softmaxOutName, DType.FLOAT)
        // Configure backend
        .backendName("mkldnn")
        .backendConfig("");
    ModelRunner runner = builder.build()
) {
    // The builder can be closed explicitly after building the model runner
    builder.close();
    // ... run inference with the runner here
}
WHEN NOT TO USE ONNX?
Export process in many cases is difficult and time consuming!
Backends have limited support for various operations.
For instance, Yolo2 still cannot be run on even the Caffe2 or Tensorflow backend due to missing support for the ImageScaler op.
Some models have to be re-trained before exporting
Flink calls the model's "API" using AsyncIO, just as it would any other API connection
Pros
Use Docker container to capture exact model dependencies (smaller
container than with Flink+Model)
No (extensive) re-writing of code needed
Cons
Have to handle scaling/maintaining a separate service
AsyncIO and Microservice Model
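A minimal sketch of the micro-service half of this pattern, using only the Python standard library. The `/predict` endpoint and the dummy score are hypothetical stand-ins for a real ONNX-backed model server that Flink's AsyncIO operator would call:

```python
# Toy model server: Flink's AsyncIO would POST text here and get JSON back.
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import Request, urlopen

class PredictHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        body = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        # A real service would run the ONNX model here; we echo a fake score.
        result = {"text": body["text"], "score": 0.5}
        payload = json.dumps(result).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), PredictHandler)  # port 0 = any free port
threading.Thread(target=server.serve_forever, daemon=True).start()

def predict(text):
    req = Request(
        "http://127.0.0.1:%d/predict" % server.server_port,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())

response = predict("I love Berlin .")
```

Packaging this behind Docker is what keeps the model's Python dependencies out of the Flink image, at the cost of operating a second service.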
Uses JNI and the Cython API to start up the Python
interpreter inside the JVM
Faster than many alternatives
Can use pretty much any Python library including numpy,
Tensorflow, PyTorch, Keras, etc
Automatically converts Java primitives, Strings, and
jep.NDArrays sent into the Python interpreter into Python
primitives, strings, and numpy.ndarrays
Java Embedded Python (JEP)
Using PyTorch directly with JEP
Easiest solution: use Kubernetes
AIStream JEP Flink Docker container
Setup can be a bit painful
Have to get Python dependencies on all
Flink nodes
Job needs path to Python
An "UnsatisfiedLinkError" is very common
Bootstrap script possible for EMR on AWS
NLP framework written in PyTorch with a state of the art named
entity recognition (NER) model.
from flair.data import Sentence
from flair.models import SequenceTagger
# make a sentence
sentence = Sentence('I love Berlin .')
# load the NER tagger
tagger = SequenceTagger.load('ner')
# run NER over sentence
tagger.predict(sentence)
Easy to train and combine with new methods
The framework handles complex preprocessing, and its models subclass PyTorch modules (therefore exporting to ONNX is not fun)
Named entity recognition on Flink
data stream with Flair
import jep.JepException;
import jep.SharedInterpreter;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class FlairMap extends RichMapFunction<TweetData, String> {
    private transient SharedInterpreter j;

    @Override
    public void open(Configuration c) {
        try {
            // One embedded Python interpreter per task
            j = new SharedInterpreter();
            j.eval("from flair.data import Sentence");
            j.eval("from flair.models import SequenceTagger");
            j.eval("model = SequenceTagger.load('ner')");
        } catch (JepException e) {
            e.printStackTrace();
        }
    }

    @Override
    public String map(TweetData tweet) throws JepException {
        String tweetText = tweet.tweetText.replaceAll("[^A-Za-z0-9]", " ");
        j.set("text", tweetText);
        j.eval("s = Sentence(text)");
        j.eval("model.predict(s)");
        Object result = j.getValue("s.get_spans('ner')");
        return result.toString();
    }
}
Sentiment Analysis with Flair
from flair.models import TextClassifier
from flair.data import Sentence
classifier = TextClassifier.load('en-sentiment')
sentence = Sentence('Twitter is a really good company!')
classifier.predict(sentence)
# print sentence with predicted labels
print('Sentence sentiment is: ' + str(sentence.labels))
@Override
public String map(TweetData tweet) throws JepException {
    String tweetText = tweet.tweetText.replaceAll("[^A-Za-z0-9]", " ");
    j.set("text", tweetText);
    j.eval("s = Sentence(text)");
    j.eval("model.predict(s)");
    Object result = j.getValue("s.labels");
    return result.toString();
}
Consume data from Twitter Source using Flink Twitter Connector
Filter out non-English Tweets
Alternatively could load multilingual NER model(s)
Named Entity Recognition on Tweets (remove non-entities)
Sentiment Analysis on Tweet (entity, label, sentiment)
Convert to Table. Run query
Putting it all together
https://github.com/isaacmg/dl_java_stream
SELECT entity, sentiment, count(entity)
FROM Tweets
GROUP BY entity, sentiment
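The same GROUP BY, sketched in plain Python over in-memory rows (the sample tweets are made up):

```python
# What the Flink SQL query computes, over a toy list of (entity, sentiment) rows.
from collections import Counter

rows = [
    ("Berlin", "POSITIVE"),
    ("Berlin", "POSITIVE"),
    ("Twitter", "NEGATIVE"),
]
# key = (entity, sentiment), value = count(entity) within that group
counts = Counter(rows)
```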
Currently it is easiest to either use JEP or a micro-service + AsyncIO
Saves the time of converting the model to ONNX
No need to re-write code
Promising frameworks in the works like Menoh, VESPA, DL4J, etc. should eventually support ONNX natively but aren't mature enough yet.
Conclusions
Flink Forward San Francisco 2019: Deploying ONNX models on Flink - Isaac Mckillen-Godfried