© 2017 The Apache Software Foundation. Apache, Apache Ignite, the Apache feather and the Apache Ignite logo are trademarks of The Apache Software Foundation.
In-Memory Computing Platform
Presenter: Christos Erotocritou
@ChrisErotocritu
Contents
• GridGain & Apache Ignite Project
• Ignite In-Memory Computing Platform
• Introduction to Clustering
• In-Memory Data Grid & SQL Grid
• Compute Grid, Service Grid & Streaming
• Hadoop & Spark Integration
• Q & A
© 2017 GridGain Systems, Inc.
What is Apache Ignite
High-performance distributed in-memory platform
for computing and transacting on large-scale data
sets in near real-time.
Apache Ignite Project - Recap
• 2007: First version of GridGain (compute grid)

• Oct. 2014: GridGain contributes Ignite to ASF
• Aug. 2015: Ignite becomes the second-fastest project to graduate, after Spark
• Today (February 2017) vs. Feb. 2016:
• 88+ contributors (28 more)
• Huge development momentum: an estimated 248 years of effort since the first commit in February 2014, vs. 192 years last Feb. [Openhub]
• 900k+ SLOC and more than 18.5k commits (200k more SLOC and 2.5k more commits)
Customer Use Cases
Automated Trading Systems
Real-time analysis of trading positions & market risk. High-volume transactions, ultra-low latencies.

Financial Services
Fraud detection, risk analysis, insurance rating and modelling.

Online & Mobile Advertising
Real-time decisions, geo-targeting & retail traffic information.

Big Data Analytics
Customer 360 view, real-time analysis of KPIs, up-to-the-second operational BI.

Online Gaming
Real-time back-ends for mobile and massively parallel games.

SaaS Platforms & Apps
High-performance next-generation architectures for Software as a Service application vendors.

Travel & E-Commerce
High-performance next-generation architectures for online hotel booking.
What is GridGain Enterprise Edition?
• A binary build of Apache Ignite™ created by GridGain
• Adds enterprise features for enterprise deployments
• Receives features and bug fixes a few weeks earlier
• Heavily tested
What In-Memory Capabilities are Supported?
‣ HPC
‣ Machine learning
‣ Risk analysis
‣ Grid computing
‣ HA API Services
‣ Scalable Middleware
‣ Web-session clustering
‣ Distributed caching
‣ In-Memory SQL
‣ Real-time Analytics
‣ Big Data
‣ Monitoring tools
‣ Big Data
‣ Realtime Analytics
‣ Batch processing
‣ Distributed In-Memory File System
‣ Node2Node & Topic-based Messaging
‣ Fault Tolerance
‣ Multiple backups
‣ Cluster groups
‣ Auto Rebalancing
‣ Complex event processing
‣ Event driven design
‣ Distributed queues
‣ Atomic variables
‣ Dist. Semaphore
Introduction to Clustering
High-level Architecture
– Distributed Key-Value Store
– Fault Tolerance and Scalability
– SQL Queries (ANSI 99)
– ACID Transactions
– In-Memory Indexes
– RDBMS / NoSQL Integration
– 100% JCache Compliant (JSR 107)
[Diagram: Ignite In-Memory Computing Platform (SQL, Cache, Trans., Compute) spanning three nodes, each with On-Heap and Off-Heap memory, plus a DB]
Definitions and Terminology
An Ignite cluster is a group of Ignite nodes working together to accomplish tasks like distributed compute and caching.
An Ignite node is a single Ignite process running in a JVM.
Many Ignite nodes can live on one physical server or JVM.
Ignite nodes can be Clients or Servers.
[Diagram: two Server/VM/Container boxes, each hosting JVMs that run one or more IGNITE nodes]
Ignite Clients & Servers
Shared-nothing clustering involves multiple identical nodes forming a cluster with no single master or coordinator.
All nodes in a shared-nothing cluster run the exact same process.
Nodes communicate using message passing.
[Diagram: four Server / VM boxes, each running one JVM with an IGNITE node]
Ignite Clients & Servers
An Ignite node can be started as a client or a server.

Server nodes participate in caching and computations. Client nodes can also participate in computations.

Client nodes are used for Ignite API operations from the client side, such as cache operations, transactions, and data streaming.
[Diagram: three CLIENT nodes and four SERVER nodes]
In-Memory Data Grid & SQL Grid
Data Grid: Cache modes & Horizontal Scaling
[Figure, two panels: Partitioned Cache & Near Cache, and Replicated Cache.
Partitioned: four server JVMs, each holding one primary entry and one backup of another entry (JVM 1: primary A, backup D; JVM 2: primary B, backup C; JVM 3: primary D, backup B; JVM 4: primary C, backup A). A remote client JVM keeps a near cache of entries.
Replicated: each of the four JVMs is primary for one entry (A, B, D, C) and holds backups of the other three.
A second build of the slide adds a local client next to the remote client.]
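The difference between the two cache modes can be sketched in a few lines of plain Python. This is not the Ignite API: the node names, the `owners` and `replicated_owners` helpers, and the simple hash-modulo placement are all assumptions made for illustration, and Ignite's real affinity function is more sophisticated.

```python
import zlib

NODES = ["JVM 1", "JVM 2", "JVM 3", "JVM 4"]  # hypothetical node names


def owners(key, nodes=NODES, backups=1):
    """Return [primary, backup, ...] for a key using hash-modulo placement."""
    start = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(backups + 1)]


def replicated_owners(key, nodes=NODES):
    """Replicated mode: one primary, and every other node holds a backup."""
    return owners(key, nodes, backups=len(nodes) - 1)


if __name__ == "__main__":
    for key in "ABCD":
        print(key, "partitioned:", owners(key), "replicated:", replicated_owners(key))
```

With one backup, losing a single node still leaves a copy of every key in this sketch, while the replicated variant trades memory for having every key on every node.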
Data Grid: Off-Heap Memory
• Unlimited Vertical Scale
• Avoid Java Garbage Collection Pauses
• Small On-Heap Footprint
• Configurable eviction policies
• Off-Heap Indexes
• Full RAM Utilisation
• Simple Configuration
Data Grid: DC Replication
• Multiple (up to 32) Data Centres
• Complex Replication Technologies
• Active-Active & Active-Passive
• Smart Conflict Resolution
• Durable Persistent Queues
• Automatic Throttling
• GridGain Enterprise
[Diagram: DC1, DC2 and DC3 linked by active bi-directional replication, transactional or eventually consistent]
Data Grid: External Persistence
• Read-through & Write-through
• Support for Write-behind
• Configurable eviction policies
• DB schema mapping wizard: generates all the XML configuration and Java POJOs
[Diagram: Ignite clients send data to the Ignite cache nodes, which write through to and read through from the DB (external persistent store)]
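The read-through, write-through and write-behind behaviours listed on this slide can be illustrated with a toy cache in plain Python. It is a sketch under stated assumptions, not Ignite code: the `ThroughCache` class, its `flush` method and the dict standing in for the external store are all hypothetical.

```python
class ThroughCache:
    """Toy cache in front of a dict that stands in for the external store."""

    def __init__(self, store, write_behind=False):
        self.store = store            # hypothetical external persistent store
        self.mem = {}                 # in-memory entries
        self.write_behind = write_behind
        self.pending = {}             # updates not yet flushed (write-behind only)

    def get(self, key):
        if key not in self.mem and key in self.store:
            self.mem[key] = self.store[key]   # read-through on a cache miss
        return self.mem.get(key)

    def put(self, key, value):
        self.mem[key] = value
        if self.write_behind:
            self.pending[key] = value         # defer the store write
        else:
            self.store[key] = value           # write-through: synchronous write

    def flush(self):
        """Write all pending updates to the store in one batch."""
        self.store.update(self.pending)
        self.pending.clear()
```

Write-through keeps the store in step with every `put`, while write-behind batches updates and accepts a window in which the store lags the cache.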
Data Grid: Cache APIs
• Predicate-based Scan Queries
• Text Queries based on Lucene indexing
• Query configuration using annotations, Spring XML or simple Java code
• SQL Queries: Automatic Group By, Aggregations, Sorting, Cross-Cache Joins, Unions
• Memcached (PHP, Java, Python, Ruby)
• HTTP REST API
• JDBC & ODBC
Data Grid: SQL Support (ANSI 99)
• ANSI-99 SQL
• In-Memory Indexes (On- and Off-Heap)
• Automatic Group By, Aggregations, Sorting
• Cross-Cache Joins, Unions
• Uses the local H2 engine
Data Grid: Transactions
• Fully ACID
• Support for Transactional & Atomic cache modes
• Cross-cache transactions
• Optimistic and Pessimistic concurrency modes with multiple isolation levels
• Deadlock protection
• JTA Integration
Distributed Java Structures
• Distributed Map (cache)
• Distributed Set
• Distributed Queue
• CountDownLatch
• AtomicLong
• AtomicSequence
• AtomicReference
• Distributed ExecutorService
Continuous Queries
• Execute a query and get notified on data changes captured in the filter
• Remote filter to evaluate the event and local listener to receive the notification
• Guarantees exactly-once delivery of an event
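The remote-filter plus local-listener pattern can be sketched as follows. This is plain Python rather than the Ignite continuous query API; the `ObservableCache` class and its method names are hypothetical, and everything runs in one process, so the "remote" filter is only remote in spirit.

```python
class ObservableCache:
    """Toy cache that notifies registered continuous queries on every update."""

    def __init__(self):
        self.data = {}
        self.queries = []   # (remote_filter, local_listener) pairs

    def continuous_query(self, remote_filter, local_listener):
        # Initial query: report entries that already match the filter.
        for key, value in self.data.items():
            if remote_filter(key, value):
                local_listener(key, value)
        self.queries.append((remote_filter, local_listener))

    def put(self, key, value):
        self.data[key] = value
        for remote_filter, local_listener in self.queries:
            if remote_filter(key, value):      # evaluated where the data lives
                local_listener(key, value)     # called once per matching update


cache = ObservableCache()
cache.put("p0", 200)
seen = []
cache.continuous_query(lambda k, v: v > 100, lambda k, v: seen.append((k, v)))
cache.put("p1", 500)   # matches the filter, so the listener fires
cache.put("p2", 10)    # filtered out, no notification
```

Filtering before notifying is the point of the design: only matching updates need to travel from the node holding the data to the listener.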
Event Processing using Ignite
• Create chains of event processors & transform an object through various states
• Synchronous or asynchronous execution of remote filters & listeners with thread control
[Diagram: Payment Validator, Payment Verifier and Payment Processor chained over an Ignite Cache]
In-Memory Compute Grid & the rest
Streaming and CEP
• Branching Pipelines
• Sliding Windows for CEP/Continuous Query
• JMS, Kafka, MQTT, Flume, Camel data streamer integrations
[Diagram: an event window holding Event 1 to Event 4, with an incoming event entering and an evicted event leaving]
[Diagram: Ignite clients push data through Ignite streamers into the Ignite cache nodes, which also serve SQL]
1. Process Streamed Data in Parallel on all Nodes
2. Process SQL Queries in Parallel on all Nodes
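The event window in the diagram, where each incoming event may push the oldest one out, can be sketched with a count-based sliding window. The `SlidingWindow` class below is a hypothetical illustration in plain Python; in Ignite the equivalent behaviour would come from cache eviction or expiry configuration, which is not shown here.

```python
from collections import deque


class SlidingWindow:
    """Keep the last `size` events and report which event, if any, was evicted."""

    def __init__(self, size):
        self.size = size
        self.events = deque()

    def add(self, event):
        self.events.append(event)
        if len(self.events) > self.size:
            return self.events.popleft()   # oldest event leaves the window
        return None

    def average(self):
        """Aggregate over the events currently inside the window."""
        return sum(self.events) / len(self.events) if self.events else 0.0


window = SlidingWindow(3)
evicted = [window.add(x) for x in [1, 2, 3, 4]]
```

Queries over the window only ever see the most recent events, which keeps memory use bounded for an unbounded stream.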
Client-Server vs. Affinity Colocation
[Diagram, two panels. Client-server: a client node, a processing node, and two data nodes holding Data 1 and Data 2. Affinity colocation: a client node and two processing nodes, with Job 1 running next to Data 1 and Job 2 next to Data 2. The numbered arrows correspond to the steps below.]
Client-Server:
1. Initial Request
2. Fetch data from remote nodes
3. Process entire data-set
4. Return to client
Affinity Colocation:
1. Initial Request
2. Co-locating processing with data
3. Return partial result
4. Reduce & return to client
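The two step lists can be compared with a small plain-Python sketch that counts how many values cross the network in each approach. The `DATA_NODES` layout and both function names are assumptions for illustration, and no real networking or Ignite API is involved.

```python
DATA_NODES = {            # hypothetical partitions held by two data nodes
    "node1": [3, 8, 5],
    "node2": [7, 1, 9, 4],
}


def client_server_sum(nodes):
    """Fetch every value to one processing node, then process the whole set."""
    fetched = [v for values in nodes.values() for v in values]
    return sum(fetched), len(fetched)          # (result, values moved)


def colocated_sum(nodes):
    """Run the job next to each partition and move only the partial results."""
    partials = [sum(values) for values in nodes.values()]
    return sum(partials), len(partials)        # (result, values moved)
```

Both functions return the same total for `DATA_NODES`, but the colocated version moves one partial result per node instead of every value, which is the saving the slide is pointing at.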
In-Memory Compute Grid
• Direct API for MapReduce
• Cron-like Task Scheduling
• State Checkpoints
• Load Balancing
  • Round-robin
  • Random & weighted
• Automatic Failover
• Per-node Shared State
• Zero Deployment
  • Distributed class loading
[Diagram: computation C is split into C1, C2 and C3 (C = C1+C2+C3); the partial results R1, R2 and R3 are combined into R (R = R1+R2+R3), completing in T/3]
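The split, execute and reduce flow in the diagram (C = C1+C2+C3, R = R1+R2+R3) can be sketched in plain Python, together with round-robin job placement. The `split` and `map_reduce` helpers and the node names are hypothetical, and the jobs run sequentially here, so the T/3 speed-up of a real grid is not reproduced.

```python
from itertools import cycle


def split(items, parts):
    """Split work C into `parts` jobs C1..Cn by striding over the items."""
    return [items[i::parts] for i in range(parts)]


def map_reduce(items, nodes, job, reducer, parts=3):
    jobs = split(items, parts)
    # Round-robin load balancing: job i goes to node i modulo the node count.
    assignment = list(zip(cycle(nodes), jobs))
    partials = [job(chunk) for _node, chunk in assignment]   # R1..Rn
    return reducer(partials), [node for node, _chunk in assignment]


words = "in memory computing with a compute grid".split()
total, placed = map_reduce(words, ["n1", "n2"], lambda ws: sum(map(len, ws)), sum)
```

Here the job counts characters per chunk of words and the reducer adds the partial counts, with three jobs spread over two hypothetical nodes.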
In-Memory Service Grid
• Resilience - Build an in-memory resilient service layer between your client application and the grid
• Shielding - Only expose application APIs and not direct grid APIs
• Continuations - Call services internally via compute tasks to create service chains
[Diagram: Services A, B, C and D deployed on the Ignite In-Memory Computing Platform (SQL, Cache, Trans., Compute) across three nodes with On-Heap and Off-Heap memory, with three DBs]
Hadoop & Spark Integration
IGFS: In-Memory File System
• Ignite In-Memory File System (IGFS)
– Hadoop-compliant
– Easy to Install
– On-Heap and Off-Heap
– Caching Layer for HDFS
– Write-through and Read-through HDFS
– Performance Boost
[Diagram labels: MR, HIVE, PIG; In-Memory MapReduce; IGFS; HDFS; YARN; Any Hadoop Distro]
Hadoop Accelerator: Map Reduce
• In-Memory Performance
• Zero Code Change
• Use existing MR code
• Use existing Hive queries
• No Name Node
• No Network Noise
• In-Process Data Colocation
• Eager Push Scheduling
[Diagram: a user application reaches the cluster through either the Hadoop client or the Ignite client. Hadoop path: Hadoop Jobtracker and Name Node, Hadoop Tasktrackers, Hadoop Data Nodes (HDFS). Ignite path: Ignite Data Nodes (IGFS).]
Spark & Ignite Integration
• IgniteRDD
– Share RDD across jobs on the host
– Share RDD across jobs in the application
– Share RDD globally
• Faster SQL
– In-Memory Indexes
– SQL on top of Shared RDD
[Diagram: a Spark application spanning three servers, each running a Spark Worker with Spark jobs and an Ignite node; the Ignite nodes provide In-Memory Shared RDDs / IGFS. Also labelled: Yarn, Mesos, Docker, HDFS.]
Cloud Deployment
• Docker
• Amazon AWS
• Azure Marketplace
• Google Cloud
• Apache JClouds
• Mesos
• YARN
• Apache Karaf (OSGi)
Thank You!
www.gridgain.com
@gridgain
#gridgain
Thank you for joining us. Follow the conversation.
Author: Christos Erotocritou

OSDC 2017 - Christos Erotocritou - Apache ignite in-memory data fabric
