© 2015 Pivotal Software, Inc. All rights reserved.
Building a Stock Prediction system with Machine Learning using Geode, Spring XD and Spark MLlib
William Markito
@william_markito
Fred Melo
@fredmelo_br
It's all about DATA
Data Sources
Look for patterns
Prediction
[Diagram: a Machine Learning Model (e.g. Linear Regression) takes price(x), medium avg(x) and relative strength(x) as inputs and predicts medium avg(x+1).]
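The model on this slide can be illustrated with a small, self-contained sketch. It is not the code from the talk (which uses Spark MLlib); it assumes ordinary least squares solved through the normal equations, and the function names and the feature order (price, medium avg, relative strength) are hypothetical choices made for illustration.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gaussian elimination with partial pivoting.

    Assumes the matrix is non-singular (a zero pivot would raise ZeroDivisionError).
    """
    n = len(b)
    # Augmented matrix so that row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_linear_regression(features: Sequence[Sequence[float]],
                          targets: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [intercept, w_1, ..., w_k].
    """
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 models the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(weights: Sequence[float], feature_row: Sequence[float]) -> float:
    """Apply the fitted weights to one (price, medium avg, relative strength) row."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature_row))
```

Each training row would hold the three inputs at time x and the target would be the medium avg at x+1. A production system would typically rely on a library implementation rather than hand-written elimination.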
© Copyright 2014 Pivotal. All rights reserved.
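[Architecture diagram: real data or a Simulator feeds a SpringXD stream (Transform, Enrich, Filter, Split, Sink). Numbered steps 1 to 3 cover Indicators, Machine Learning and Predict, with a Dashboard. Geode regions shown: /Stocks, /TechIndicators, /Predictions.]
SpringXD: Extensible, Open-Source, Fault-Tolerant, Horizontally Scalable, Cloud-Native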
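The Indicators step can be sketched as follows. This is an illustration only: it assumes that "medium avg" is a simple moving average and that "relative strength" is a relative strength index computed with simple averages of gains and losses. The slides do not say which variants the system uses, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def moving_average(prices: Sequence[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out


def relative_strength_index(prices: Sequence[float], window: int) -> List[Optional[float]]:
    """RSI = 100 - 100 / (1 + RS), where RS = average gain / average loss
    over the last `window` price changes (simple averages, an assumption)."""
    out: List[Optional[float]] = [None] * len(prices)
    changes = [later - earlier for earlier, later in zip(prices, prices[1:])]
    for i in range(window, len(prices)):
        recent = changes[i - window:i]  # the `window` changes ending at price i
        avg_gain = sum(c for c in recent if c > 0) / window
        avg_loss = sum(-c for c in recent if c < 0) / window
        if avg_loss == 0:
            out[i] = 100.0  # no losses in the window
        else:
            out[i] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out
```

In the architecture above, values like these would presumably be written to /TechIndicators and then used as inputs for the prediction step.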
Apache Geode (incubating)
Introduction
A distributed, memory-based data management platform for data-oriented apps that need:
High performance, scalability, resiliency and continuous availability
Fast access to critical data sets
Location-aware distributed data processing
Event-driven data architecture
Concepts
Cache
In-memory storage and management for your data
Configurable through XML, Spring, Java API or CLI
Collection of Regions
Concepts
Region
Distributed java.util.Map on steroids (Key/Value)
Consistent API regardless of where or how data is stored
Observable (reactive)
Highly available, redundant across cache Member(s)
Local, Replicated or Partitioned
In-memory or persistent
Redundant
LRU
Overflow
LOCAL
LOCAL_HEAP_LRU
LOCAL_OVERFLOW
LOCAL_PERSISTENT
LOCAL_PERSISTENT_OVERFLOW
PARTITION
PARTITION_HEAP_LRU
PARTITION_OVERFLOW
PARTITION_PERSISTENT
PARTITION_PERSISTENT_OVERFLOW
PARTITION_PROXY
PARTITION_PROXY_REDUNDANT
PARTITION_REDUNDANT
PARTITION_REDUNDANT_HEAP_LRU
PARTITION_REDUNDANT_OVERFLOW
PARTITION_REDUNDANT_PERSISTENT
PARTITION_REDUNDANT_PERSISTENT_OVERFLOW
REPLICATE
REPLICATE_HEAP_LRU
REPLICATE_OVERFLOW
REPLICATE_PERSISTENT
REPLICATE_PERSISTENT_OVERFLOW
REPLICATE_PROXY
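To make the partitioned and redundant options more concrete, here is a conceptual sketch of how keys could be spread over members with redundant copies. It is not Geode's implementation: the CRC32 hash, the round-robin placement and the function names are assumptions for illustration, and the default of 113 buckets reflects my understanding of Geode's default and should be treated as an assumption as well.

```python
import zlib
from typing import Dict, List


def bucket_for(key: str, total_buckets: int) -> int:
    """Map a key to a bucket with a stable hash (CRC32 is an illustrative choice)."""
    return zlib.crc32(key.encode("utf-8")) % total_buckets


def assign_buckets(members: List[str], total_buckets: int,
                   redundant_copies: int) -> Dict[int, List[str]]:
    """Place each bucket on one primary member plus `redundant_copies` other members.

    Round-robin placement is a simplification; a real system balances by load.
    """
    if redundant_copies >= len(members):
        raise ValueError("need more members than redundant copies")
    n = len(members)
    return {
        b: [members[(b + r) % n] for r in range(redundant_copies + 1)]
        for b in range(total_buckets)
    }


def owners_of(key: str, members: List[str], total_buckets: int = 113,
              redundant_copies: int = 1) -> List[str]:
    """Return the members holding the key: the primary first, then the redundant copies."""
    layout = assign_buckets(members, total_buckets, redundant_copies)
    return layout[bucket_for(key, total_buckets)]
```

With redundant_copies=1, in the spirit of PARTITION_REDUNDANT, every entry lives on two different members, so losing one member does not lose data. A replicated region would instead keep every entry on every member.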
Concepts
Member
A process that has a connection to the system
A process that has created a cache
Embeddable within your application
Client
Locator
Server
Concepts
Client cache
A process connected to the Geode server(s)
Can have a local copy of the data
Can be notified about events on the servers
Concepts
Listeners
CacheWriter / CacheListener
AsyncEventListener (queue / batch)
Parallel or Serial
Conflation
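The queue, batch and conflation ideas behind an asynchronous listener can be sketched in a few lines. This is a conceptual model, not the Geode AsyncEventListener API: the class name, the batch-size trigger and the choice to keep a conflated key at its original queue position are all assumptions made for illustration.

```python
from collections import OrderedDict
from typing import Callable, List, Tuple

Event = Tuple[str, float]  # (key, latest value)


class ConflatingQueue:
    """Buffers events and hands them to a listener in batches.

    Conflation: if a key is still waiting in the queue, a newer value for that
    key replaces the older one, so the listener only sees the latest value.
    """

    def __init__(self, batch_size: int, listener: Callable[[List[Event]], None]) -> None:
        self._batch_size = batch_size
        self._listener = listener
        self._pending: "OrderedDict[str, float]" = OrderedDict()

    def offer(self, key: str, value: float) -> None:
        # Overwriting an existing key keeps its position and drops the stale value.
        self._pending[key] = value
        if len(self._pending) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        """Deliver all pending events as one batch, then clear the queue."""
        if not self._pending:
            return
        batch = list(self._pending.items())
        self._pending.clear()
        self._listener(batch)
```

For a stock feed this means that several price updates for the same symbol arriving between two batches reach the listener as a single, most recent update.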
Apache Geode (incubating)
Currently under incubation at the Apache Software Foundation
Contributions and contributors are welcome:
Code and Patches
Bugs, feature requests
Documentation and content
Any form of feedback
Code: new features, bug fixes (patches), writing tests
Documentation: wiki, web site, user guides
Community: join our mailing lists (ask or answer), become a speaker, find and report bugs, test a release candidate or beta
JIRA - https://issues.apache.org/jira/browse/GEODE
GitHub - https://github.com/apache/incubator-geode
Mailing lists:
Development - dev@geode.incubator.apache.org
Users - user@geode.incubator.apache.org
Wiki - cwiki.apache.org/confluence/display/GEODE
StackOverflow - http://stackoverflow.com/questions/tagged/geode+or+gemfire
SpringXD
Introduction
Concepts
A stream is composed of modules. Each module is deployed to a container and its channels are bound to the transport.
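The idea of modules connected through channels can be sketched without Spring XD itself. The code below is a conceptual model only: in-memory queues stand in for the transport, modules run one after another in a single process instead of in containers, and the run_stream function and its signature are hypothetical.

```python
import queue
from typing import Callable, Iterable, List, Optional

_END = object()  # sentinel that marks the end of a channel


def run_stream(source: Iterable[dict],
               processors: List[Callable[[dict], Optional[dict]]],
               sink: Callable[[dict], None]) -> None:
    """Run source -> processors -> sink, passing messages through channels.

    Each processor reads from its input channel and writes to its output
    channel. Returning None from a processor filters the message out.
    """
    channel: "queue.Queue" = queue.Queue()
    for message in source:
        channel.put(message)
    channel.put(_END)

    for process in processors:
        next_channel: "queue.Queue" = queue.Queue()
        while True:
            message = channel.get()
            if message is _END:
                break
            result = process(message)
            if result is not None:
                next_channel.put(result)
        next_channel.put(_END)
        channel = next_channel

    while True:
        message = channel.get()
        if message is _END:
            break
        sink(message)
```

A stock stream in this style could pair a filter that drops invalid quotes with a transformer that enriches each quote, ending in a sink that stores the result. Because modules only share channels, each one could in principle be deployed and scaled separately.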
Apache Zeppelin (incubating)
Introduction
Concepts
Web based REPL
Iterative & Exploratory
Support for Data Ingestion
Concepts
Multi interpreters
Markdown
Shell
Spark
Geode
Python…
Concepts
Sharing through URLs without Reports
Apache Spark
Introduction
Concepts
RDD
Dataframe
Driver
Worker
"An RDD in Spark is simply an immutable distributed collection of objects.
Each RDD is split into multiple partitions, which may be computed on different nodes
of the cluster. RDDs can contain any type of Python, Java, or Scala objects,
including user-defined classes."
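The quoted definition can be illustrated with a toy, single-process model. MiniRDD below is a hypothetical class, not the Spark API: it only mimics the ideas of immutability (every transformation returns a new collection) and partitioning (work is expressed per partition, which is what lets a cluster compute partitions on different nodes).

```python
from functools import reduce
from typing import Callable, List, Sequence


class MiniRDD:
    """Toy stand-in for an RDD: an immutable collection split into partitions."""

    def __init__(self, partitions: Sequence[Sequence[object]]) -> None:
        # Tuples make the stored data immutable.
        self._partitions = tuple(tuple(p) for p in partitions)

    @classmethod
    def parallelize(cls, data: Sequence[object], num_partitions: int) -> "MiniRDD":
        """Split data into contiguous chunks, one per partition."""
        size = -(-len(data) // num_partitions)  # ceiling division
        return cls([data[i * size:(i + 1) * size] for i in range(num_partitions)])

    def num_partitions(self) -> int:
        return len(self._partitions)

    def map(self, fn: Callable) -> "MiniRDD":
        # Each partition is transformed independently of the others.
        return MiniRDD([[fn(x) for x in p] for p in self._partitions])

    def filter(self, predicate: Callable) -> "MiniRDD":
        return MiniRDD([[x for x in p if predicate(x)] for p in self._partitions])

    def reduce(self, fn: Callable):
        # Reduce inside each non-empty partition first, then combine the partial results.
        partials = [reduce(fn, p) for p in self._partitions if p]
        return reduce(fn, partials)

    def collect(self) -> List[object]:
        return [x for p in self._partitions for x in p]
```

Because the reduction is done per partition before the partial results are combined, the function passed to reduce should be associative, which is also what distributed engines generally expect.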
"A dataframe is a distributed collection of rows organized into named columns. An abstraction for selecting, filtering and plotting structured data (pandas), previously known as SchemaRDD."
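A minimal, non-distributed sketch of "rows organized into named columns" is shown below. MiniDataFrame and its select and where methods are hypothetical names chosen for illustration; they are not the Spark DataFrame API and leave out distribution and query optimization.

```python
from typing import Callable, Dict, Sequence, Tuple


class MiniDataFrame:
    """Toy table: a tuple of column names plus rows of matching length."""

    def __init__(self, columns: Sequence[str], rows: Sequence[Sequence[object]]) -> None:
        self.columns: Tuple[str, ...] = tuple(columns)
        self.rows = tuple(tuple(r) for r in rows)
        for row in self.rows:
            if len(row) != len(self.columns):
                raise ValueError("row length does not match the number of columns")

    def select(self, *names: str) -> "MiniDataFrame":
        """Keep only the named columns, in the requested order."""
        indexes = [self.columns.index(n) for n in names]
        return MiniDataFrame(names, [[row[i] for i in indexes] for row in self.rows])

    def where(self, predicate: Callable[[Dict[str, object]], bool]) -> "MiniDataFrame":
        """Keep rows for which the predicate, given a column-name dict, is true."""
        kept = [row for row in self.rows if predicate(dict(zip(self.columns, row)))]
        return MiniDataFrame(self.columns, kept)
```

Addressing data by column name instead of by position is the main difference from the plain objects held in an RDD, and it is what makes selecting and filtering structured data convenient.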
Summary
SpringXD
• Integration
  • Spark, JDBC, Geode
  • HDFS, Twitter, File, Mail…
• Data pipeline orchestration
• Intuitive DSL
• Streaming & Analytics
• Distributed and scalable
Apache Zeppelin
• Web based REPL
• Multiple Interpreters
  • Apache Spark
  • Markdown
  • Flink
  • Python
  • Geode…
• Iterative & Exploratory
Apache Spark
• Fast data processing
• Columnar queries
• RDDs
• Machine Learning
• Analytics & Streaming
Apache Geode
• Fast data store and processing
• In-memory & Persistent
• Highly Consistent
• Transaction processing
• Thousands of concurrent clients
Source Code
http://pivotal-open-source-hub.github.io/StockInference-Spark/