Open Source 𝜆-Architecture for Deep Learning
Use case
Patrick R Nicolas
Oct. 2020
pnicolasai@yahoo.com
Overview
“… and the wise man said,
thou shall embrace open source”.
21st century proverb
Patrick R. Nicolas - Open Source 𝜆-Architecture for Deep Learning
Overview
Overview
Layers
Open-source components
Overview
The world of data scientists accustomed to Python
scientific libraries has been shaken up by the
emergence of ‘big data’ frameworks such as Apache
Hadoop, Spark and Kafka.
This presentation introduces a variant of the
𝜆-architecture and describes the seamless integration of
various open source components to train, validate and
test deep learning models.
Disclaimer
The concept and architecture are versatile enough to
accommodate a variety of open source and commercial
solutions and services besides the frameworks
prescribed in this presentation.
For instance, deep learning frameworks such as Keras
or TensorFlow are excellent alternatives to PyTorch.
Requirements
A ‘big data’ framework should be able to:
• Process batch and stream data concurrently
• Enforce data immutability
• Recover gracefully from human errors
• Handle hardware failures
• Minimize latency for real-time requests
• Scale to very large data sets
• Optimize the full life cycle of data sets
• Guarantee the quality and integrity of data
Optimizing data life cycle
The need for optimizing the data life cycle: 79% of data
scientists’ time is spent collecting and organizing data.
Source: Quora
Data quality
Guaranteeing data quality and integrity
Accuracy: Correct models and representative data
Completeness: No missing data
Consistency: Applies to both semantics and format
Timeliness: Up-to-date data and notifications
Accessibility: Ease of use and high availability
Validity: Complies with constraints, rules and regulations
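A few of the quality dimensions listed above can be expressed as automated checks. The following is only a minimal sketch in plain Python: the `Record` fields, the value range and the one-day freshness threshold are hypothetical and not taken from the presentation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)  # frozen: a record cannot be modified once created
class Record:
    key: Optional[str]          # hypothetical field names
    value: Optional[float]
    updated_at: datetime


def quality_issues(
    record: Record,
    now: datetime,
    max_age: timedelta = timedelta(days=1),          # assumed freshness rule
    value_range: Tuple[float, float] = (0.0, 100.0),  # assumed validity rule
) -> List[str]:
    """Return the names of the quality dimensions the record violates."""
    issues = []
    # Completeness: no missing data
    if record.key is None or record.value is None:
        issues.append("completeness")
    # Validity: the value complies with the range constraint
    elif not (value_range[0] <= record.value <= value_range[1]):
        issues.append("validity")
    # Timeliness: the record is recent enough
    if now - record.updated_at > max_age:
        issues.append("timeliness")
    return issues
```

Calling `quality_issues(record, now)` returns an empty list for a record that passes all three checks; the frozen dataclass also hints at how data immutability can be enforced at the record level.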
Solution …
The 𝜆-architecture is a large-scale data processing
architecture that balances batch and real-time streamed data.
It is a one-stop shop for various data sources that
balances latency, redundancy, ease of access and
throughput.
It breaks down into 3 layers:
• Speed (streaming, real-time, …)
• Batch (training, analysis, …)
• Serving (query, visualization, …)
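The interplay between the three layers can be illustrated with a toy example. This is a sketch under stated assumptions, not the presentation's implementation: the `(key, increment)` event shape and all function names are hypothetical, and in-memory counters stand in for real batch and streaming engines.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[str, int]  # (key, increment); hypothetical event shape


def batch_view(master: Iterable[Event]) -> Counter:
    """Batch layer: recompute the view from the full, immutable event log."""
    view = Counter()
    for key, increment in master:
        view[key] += increment
    return view


class SpeedView:
    """Speed layer: incrementally absorb events newer than the last batch run."""

    def __init__(self) -> None:
        self.view = Counter()

    def update(self, event: Event) -> None:
        key, increment = event
        self.view[key] += increment


def serve(key: str, batch: Counter, speed: SpeedView) -> int:
    """Serving layer: answer a query by merging the batch and speed views."""
    return batch[key] + speed.view[key]
```

A query for a key therefore sees both the (slow, complete) batch result and the (fast, recent) streamed increments; when the next batch run absorbs the recent events, the speed view can be reset.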
… using open source
𝜆-architecture using open source components?
The task consists of reviewing and evaluating the trove
of available open source libraries to build a robust
architecture that supports the rigor of training and
tuning deep learning models.
The libraries are woven together through a set of
language-agnostic REST APIs to form a coherent pipeline.
… for deep learning
𝜆-architecture for deep learning?
• Python scientific libraries have been the go-to tools
for data scientists to analyze data and build models.
• The PyTorch framework builds on these libraries to
support the design and execution of deep learning
models.
• Apache Spark and Kafka complement these
frameworks for very large data sets and real-time
processing.
Bird’s-eye view
Example open source 𝜆-architecture
Feel overwhelmed?
... Let’s break it down
Layers
Overview
Layers
Open-source components
Batch layer
Batch layer objective: load batches of data to be distributed
and preprocessed to train deep learning models.
Typical use case:
1. Apache Spark loads the training set from Amazon S3
2. The Spark master partitions the training data
3. Spark workers preprocess the data and notify
completion through a Kafka event queue
4. PyTorch updates the model parameters from the
preprocessed training data
5. PyTorch broadcasts the model parameters and quality
metrics through Kafka
6. Apache Hive, powered by Spark, stores model-related
data and metrics
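Steps 2 to 5 of this flow can be mimicked in a few lines of plain Python. This is a deliberately simplified sketch: an in-memory `queue.Queue` stands in for the Kafka event queue, list slicing for Spark partitioning, and a one-parameter gradient descent for a PyTorch training loop. All names and data are hypothetical.

```python
import json
import queue

events = queue.Queue()  # stand-in for a Kafka topic (assumption)


def partition(rows, num_partitions):
    """Step 2: split the rows into partitions, round-robin."""
    return [rows[i::num_partitions] for i in range(num_partitions)]


def preprocess(part, part_id):
    """Step 3: scale the feature, then publish a completion event."""
    cleaned = [(x / 10.0, y) for x, y in part]
    events.put(json.dumps(
        {"type": "preprocessed", "partition": part_id, "rows": len(cleaned)}))
    return cleaned


def train(rows, epochs=100, lr=0.5):
    """Step 4: fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in rows) / len(rows)
        w -= lr * grad
    return w


# Synthetic training set in which y is 3 times the scaled feature
raw = [(float(x), 3.0 * x / 10.0) for x in range(1, 9)]
parts = partition(raw, 2)
cleaned = [row for i, part in enumerate(parts) for row in preprocess(part, i)]
w = train(cleaned)
# Step 5: broadcast the learned parameter
events.put(json.dumps({"type": "model", "w": w}))
```

Decoupling the stages through events is the point of the design: the trainer only needs to know that a partition is ready, not which worker produced it.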
Speed layer
Speed layer objective: process queries to predictive
models with very low latency.
Use case:
1. Kafka routes data streams to the Spark master
2. Spark preprocesses requests and forwards them to the
deep model microservice
3. Flask converts each request into a prediction query for
the PyTorch model
4. The PyTorch model generates a prediction
5. Run-time metrics are broadcast through Kafka
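The request path in steps 2 to 5 can be sketched without any framework. The code below is an illustration only: `handle_request` plays the role a Flask route would play, a fixed linear function stands in for the PyTorch model, a Python list stands in for the Kafka metrics topic, and the weights and JSON layout are hypothetical.

```python
import json
import time

WEIGHTS = [0.5, -0.25]  # hypothetical parameters produced by the batch layer
metrics = []            # stand-in for a Kafka metrics topic (assumption)


def preprocess(payload: dict) -> list:
    """Step 2: validate and normalize the incoming request."""
    features = payload["features"]
    if len(features) != len(WEIGHTS):
        raise ValueError("unexpected number of features")
    return [float(f) for f in features]


def predict(features: list) -> float:
    """Step 4: stand-in for the forward pass of a trained model."""
    return sum(w * f for w, f in zip(WEIGHTS, features))


def handle_request(body: str) -> str:
    """Step 3: parse the JSON request, query the model, build the reply."""
    start = time.perf_counter()
    features = preprocess(json.loads(body))
    response = json.dumps({"prediction": predict(features)})
    # Step 5: record a run-time metric for this request
    metrics.append({"latency_s": time.perf_counter() - start})
    return response
```

For instance, `handle_request('{"features": [4, 4]}')` returns a JSON document with a `prediction` field and appends one latency measurement to `metrics`.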
Serving layer
Serving layer objective: process queries to analyze data
and model performance, and to execute statistical inference.
Use case:
1. The analyst queries the relational database, MySQL, for
the most recent data and statistics using the FineReport UI
(low latency)
2. The analyst asynchronously queries the Hive data
warehouse for archived data and statistics (high latency)
3. Hive processes queries through Spark datasets
4. Spark regularly updates the short-term data in MySQL
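The split between a low-latency store for recent data and a warehouse for archived data can be sketched as a simple query router. In this illustration two in-memory SQLite databases stand in for MySQL and Hive; the table layout, the cutoff date and the function names are assumptions, not part of the presentation.

```python
import sqlite3
from datetime import date

SCHEMA = "CREATE TABLE metrics (day TEXT, model TEXT, accuracy REAL)"

recent = sqlite3.connect(":memory:")   # stand-in for MySQL (short-term data)
recent.execute(SCHEMA)
archive = sqlite3.connect(":memory:")  # stand-in for the Hive warehouse
archive.execute(SCHEMA)

CUTOFF = date(2020, 10, 1)  # hypothetical boundary between recent and archived


def store_for(day: date) -> sqlite3.Connection:
    """Pick the store that owns the given day."""
    return recent if day >= CUTOFF else archive


def save(day: date, model: str, accuracy: float) -> None:
    """Write a metric to the store that owns its day."""
    store_for(day).execute(
        "INSERT INTO metrics VALUES (?, ?, ?)", (day.isoformat(), model, accuracy))


def query(day: date) -> list:
    """Route an analyst query to the recent store or to the archive."""
    cursor = store_for(day).execute(
        "SELECT model, accuracy FROM metrics WHERE day = ?", (day.isoformat(),))
    return cursor.fetchall()
```

The analyst-facing code only calls `query`; which backend answers is decided by the cutoff, mirroring the low-latency versus high-latency paths of steps 1 and 2.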
Overview
Overview
Layers
Open-source components
PyTorch
PyTorch is an optimized tensor library for deep
learning using GPUs and CPUs.
It offers NumPy-like tensor operations and complements
scikit-learn to support the training, evaluation and
commercialization of complex machine learning
models.
https://pytorch.org/tutorials/
Alternatives:
TensorFlow: https://www.tensorflow.org/
Keras: https://keras.io
MXNet: https://mxnet.apache.org
Apache Spark
Apache Spark is an open source cluster computing
framework for fast, large-scale data processing,
including near-real-time streams.
It supports the Scala, Java, Python and R programming
languages and includes streaming, graph and machine
learning libraries.
https://www.scala-lang.org
https://spark.apache.org
Alternative:
PySpark (the Python API for Spark): https://databricks.com/glossary/pyspark
Streaming
Apache Kafka is an open-source distributed event
streaming framework for large-scale, real-time data
processing and analytics.
It captures data from various sources in real time as a
continuous flow and routes it to the appropriate
processor.
https://kafka.apache.org
Alternatives:
Amazon SQS: https://aws.amazon.com/sqs/
RabbitMQ: https://www.rabbitmq.com
Model tuning
Ray Tune is a distributed hyperparameter
tuning framework particularly suited to deep learning
models.
It significantly reduces the cost of optimizing the
configuration of a model. It wraps other
open source optimization libraries.
https://docs.ray.io/en/master/tune/index.html
Alternatives:
Amazon SageMaker: https://aws.amazon.com/sagemaker/
HyperOpt: https://github.com/hyperopt/hyperopt
Optuna: https://optuna.readthedocs.io
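To make the idea of hyperparameter tuning concrete, here is a minimal random search written in plain Python. It does not use the Ray Tune API; the objective function, the search space and the names are all hypothetical, and a real setup would replace `validation_loss` with actual training and validation of a model.

```python
import random


def validation_loss(lr: float, hidden: int) -> float:
    """Hypothetical objective standing in for training and validating a model."""
    return (lr - 0.01) ** 2 * 1e4 + (hidden - 64) ** 2 / 64.0


def random_search(num_trials: int, seed: int = 0):
    """Sample configurations at random and keep the one with the lowest loss."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best = None
    for _ in range(num_trials):
        config = {
            "lr": 10 ** rng.uniform(-4, -1),           # log-uniform learning rate
            "hidden": rng.choice([16, 32, 64, 128]),   # hidden layer size
        }
        loss = validation_loss(**config)
        if best is None or loss < best[0]:
            best = (loss, config)
    return best
```

`random_search(50)` returns a `(loss, config)` pair. Tuning frameworks add what this sketch lacks: running trials in parallel across a cluster, stopping unpromising trials early, and using smarter search strategies than uniform sampling.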
Python REST service
Flask is an easy-to-use Python web framework for
exposing applications through a RESTful interface.
It works with most web and deployment standards,
such as Docker, React.js, Angular, HTML5 and WSGI
containers.
https://palletsprojects.com/p/flask/
Alternatives:
Falcon: https://falcon.readthedocs.io
FastAPI: https://fastapi.tiangolo.com
RDBMS
MySQL is an open source relational database
supporting partitioning, sharding and replication. It can
be extended with real-time analytics (HeatWave)
and enterprise clustering (MySQL Cluster CGE).
https://www.mysql.com
Alternatives:
PostgreSQL: https://www.postgresql.org
HyperSQL: http://www.hsqldb.org
Amazon RDS: http://aws.amazon.com/rds
Data warehouse
Apache Hive is a data warehouse framework that
can leverage Spark to execute large-scale distributed SQL
queries.
It optimizes SQL queries through lazy evaluation of
a directed acyclic execution graph. It integrates with
Spark datasets and HDFS.
https://hive.apache.org
Alternatives:
Vertica: http://www.vertica.com
Amazon Redshift: https://aws.amazon.com/redshift/
Dashboard
FineReport is a business intelligence and
dashboard tool that supports real-time analytics,
reporting and visualization. It accommodates the needs
of business managers and data scientists.
https://www.finereport.com
Alternatives:
Sisense: https://www.sisense.com
Tableau: https://www.tableau.com
Final disclaimer
This presentation is not an endorsement of the various
tools, libraries or frameworks it describes or suggests.
Although the tools listed in the slides are known to work
in the context of the architecture, there are excellent
alternative libraries that may better meet your specific
needs.
Thank you!
Q&A
