Getting to know the Grid
Syed M Shaaf
Red Hat
Goto; conference Aarhus 2013 | Syed M Shaaf
Quick introduction
Solutions Architect at Red Hat Nordics
Red Hat JBoss middleware
@sshaaf @RedHatNordics
http://www.redhat.com
One Scenario
[Diagram: web/app servers, DB/storage, integration servers, management/monitoring]
Another Scenario
[Diagram: web servers, grid servers, DB/storage, integration servers, management/monitoring; caption: Data Replication and Cache]
What is JBoss Data Grid?
● Schema-less key/value store
● Compatible with applications written in any language, using any framework
● Easy access through APIs
● Consistent hash-based distribution
● Self-healing
● No single point of failure
● Durability (persistence)
● Memory management (eviction, expiration)
● XA transactions
JBoss Data Grid and JSR
● JSR-107: Temporary caching API
● JSR-347: Data grids
– Development led by Red Hat
● JSR-346: CDI 1.1
– Programming model for data grids
● JSR-317: JPA 2
– Data grids serve as a caching layer for the database via JPA 2
And then it's a matter of scaling...
Clustering subsystems
• JGroups - toolkit for the underlying communication between nodes. Configured with 2 communication stacks: UDP (default) and TCP (for environments without multicast)
• Infinispan - data caching and object replication; comes with 4 preconfigured caches:
– cluster - Replication of objects in an HA cluster
– web - Session replication
– sfsb - Replication of stateful session beans
– hibernate - 2nd level entity caching for JPA/Hibernate
• mod_cluster - software load balancer that spreads requests among two or more nodes
Clustering architecture
Cluster architecture
[Diagram: two EAP instances, each layering HTTP session clustering on Infinispan on JGroups, with replication between the instances]
mode=replication
All the data is stored on all cluster nodes
Writes are sent to all nodes
– Every node updates its local cache
Reads are always local
New nodes acquire the initial state from the oldest node
Clients can access any node for reading or writing
Scalability is limited by cluster size and data size
10 nodes with 100MB state each: every node needs 1GB
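The sizing rule above is plain arithmetic, and a short sketch makes it easy to try other cluster sizes. The function names are invented for this illustration, and the distribution-mode figure assumes entries are spread evenly across the cluster with a fixed number of copies per entry.

```python
def replicated_heap_mb(nodes: int, state_per_node_mb: int) -> int:
    """Memory each node needs when every node stores all the data."""
    return nodes * state_per_node_mb


def distributed_heap_mb(nodes: int, state_per_node_mb: int, owners: int) -> float:
    """Approximate memory per node when each entry lives on `owners` nodes.

    Assumption: entries are spread evenly across the cluster.
    """
    total_mb = nodes * state_per_node_mb
    return total_mb * owners / nodes


if __name__ == "__main__":
    print(replicated_heap_mb(10, 100))      # 1000 MB, about 1 GB on every node
    print(distributed_heap_mb(10, 100, 2))  # 200.0 MB per node with 2 owners
```

In replication mode the per-node requirement grows with every node that brings state, which is why this mode suits small datasets.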
mode=replication; action=rw
[Diagram: mod_cluster in front of three nodes; every node holds the full key/value table (K1, K2, K3); reads and writes (rw) go to any node and writes are replicated to all]
mode=distribution
Data is only stored on N cluster nodes (say N=2)
A consistent hash on a key “id” determines the 2 servers for “id”
– Example: cluster is {A,B,C,D,E,F}
– Hash(“id”) = 8; 8 MOD 6 = 2
– Primary owner = B, backup owner = C
Crash of B, new view is {A,C,D,E,F}
– 8 MOD 5 = 3
– Primary owner = D, backup owner = E
– C needs to transfer “id” to D and E and remove it locally
Knowing the key, we always find the right server(s)
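The owner arithmetic in this example can be reproduced in a few lines. This is a sketch of the slide's simplified modulo scheme only, not of how Infinispan computes owners; the function name `owners` and the 1-based position rule are assumptions read off the example (position 2 in {A,B,C,D,E,F} is B).

```python
from typing import List, Sequence


def owners(hash_value: int, view: Sequence[str], num_owners: int = 2) -> List[str]:
    """Pick the primary and backup owners for a key.

    Follows the slide's arithmetic: hash MOD cluster size gives a
    1-based position in the view; backups are the next nodes in order.
    """
    position = hash_value % len(view)   # e.g. 8 MOD 6 = 2
    first = (position - 1) % len(view)  # convert to a 0-based index
    return [view[(first + i) % len(view)] for i in range(num_owners)]


if __name__ == "__main__":
    print(owners(8, ["A", "B", "C", "D", "E", "F"]))  # ['B', 'C']
    print(owners(8, ["A", "C", "D", "E", "F"]))       # ['D', 'E']
```

A plain modulo keeps the numbers easy to follow, but it remaps many keys whenever the view changes; ring-based consistent hashing limits that movement.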
mode=distribution; action=w
[Diagram: mod_cluster in front of three nodes; the first node holds K1, the second holds K1 and K2, the third holds K2; a write is replicated only to the owners of the key]
Cross Site replication
[Diagram: three sites (Bergen, Trondheim, Oslo), each with a cache manager hosting Cache A and Cache B; the sites are linked through JGroups [RELAY]]
Data access is important?
Client and server
Multiple access protocols
Protocol    Format   Client type        Smart?   Load balance and failover
REST        text     any                no       external
Memcached   text     any                no       pre-defined
Hot Rod     binary   Java, C#, Python   yes      auto/dynamic
Advanced functionality
Eviction, expiration, and passivation
● Expiration – defined per entry or cache
● Eviction – FIFO, LRU, unordered, LIRS, none
● Passivation
Step   Action                      Keys in memory   Keys on disk
1      Insert K1                   K1               n/a
2      Insert K2                   K1, K2           n/a
3      Eviction thread evicts K1   K2               K1
4      Read K1                     K1, K2           n/a
5      Eviction thread evicts K2   K1               K2
6      Remove K2                   K1               n/a
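The six steps in the table can be replayed with a toy model. The class below is a hypothetical illustration, not the Infinispan API: two dictionaries stand in for memory and the on-disk store, and eviction is triggered by hand rather than by an eviction thread.

```python
class PassivatingCache:
    """Toy cache: evicted entries move to a store, reads bring them back."""

    def __init__(self):
        self.memory = {}
        self.disk = {}

    def put(self, key, value):
        self.memory[key] = value
        self.disk.pop(key, None)

    def evict(self, key):
        # Passivation: write the entry to the store as it leaves memory.
        if key in self.memory:
            self.disk[key] = self.memory.pop(key)

    def get(self, key):
        # Activation: load from the store and drop the on-disk copy.
        if key not in self.memory and key in self.disk:
            self.memory[key] = self.disk.pop(key)
        return self.memory.get(key)

    def remove(self, key):
        self.memory.pop(key, None)
        self.disk.pop(key, None)


if __name__ == "__main__":
    cache = PassivatingCache()
    cache.put("K1", "v1")   # step 1
    cache.put("K2", "v2")   # step 2
    cache.evict("K1")       # step 3
    cache.get("K1")         # step 4
    cache.evict("K2")       # step 5
    cache.remove("K2")      # step 6
    print(sorted(cache.memory), sorted(cache.disk))  # ['K1'] []
```

With passivation an entry lives either in memory or on disk, never both, which is why step 4 clears the disk column.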
Advanced functionality
Why use consistent hashing?
● Cost-effective, speed benefits
● Deterministic location of keys
● Sufficient copies for fault tolerance and durability, without an overabundance of copies
[Diagram: hash ring marked at key 0 and key 500, shared by Node A, Node B, and Node C; key 372 holds value “p”]
Advanced functionality
Consistent hashing
Hash ring
● Cost-effective, speed benefits
● Deterministic location of keys
● Sufficient copies for fault tolerance and durability without an overabundance of copies
Node A
● Stores values for keys 815 through 1000 and 0 through 330
● Wraps around
Value “m”
● Stored in key 743
● Based on key value, located on Node C
Value “p”
● Stored in key 372
● Based on key value, located on Node B
[Diagram: hash ring marked at key 0 and key 500; Node A key range [815,330], Node B key range [331,642], Node C key range [643,814]; key 743 holds value “m”, key 372 holds value “p”]
Advanced functionality
Consistent hashing
● Event: Node B goes offline
● Node A
– Now stores keys 815 through 642 (wrapping around)
● Node C - unchanged
● Value “m” - unchanged
● Value “p”
– Stored in key 335
– Now located on Node A
[Diagram: hash ring marked at key 0, key 500, and key 1000; Node A key range [815,642], Node B (offline) key range [331,642], Node C key range [643,814]; key 843 holds value “m”, key 335 holds value “p”]
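The ring lookup, and what happens when Node B goes offline, can be sketched with a sorted list of range starts. The `HashRing` class is hypothetical and only mirrors the key ranges shown in the slides (A starts at 815, B at 331, C at 643); it ignores backup copies.

```python
from bisect import bisect_right


class HashRing:
    """Each node owns the keys from its start key up to the next node's start."""

    def __init__(self, starts):
        # starts maps node name -> first key of its range, e.g. {"A": 815}
        self._tokens = sorted((start, node) for node, start in starts.items())

    def owner(self, key):
        starts = [start for start, _ in self._tokens]
        # Index of the last start <= key; -1 wraps around to the highest start.
        index = bisect_right(starts, key) - 1
        return self._tokens[index][1]

    def remove(self, node):
        self._tokens = [(s, n) for s, n in self._tokens if n != node]


if __name__ == "__main__":
    ring = HashRing({"A": 815, "B": 331, "C": 643})
    print(ring.owner(743), ring.owner(335))  # C B
    ring.remove("B")
    print(ring.owner(743), ring.owner(335))  # C A
```

Only the keys in B's old range change owner; everything on Node C stays where it was.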
Advanced functionality
Consistent hashing – Virtual nodes
● Addresses irregularities in node distribution
● Location of entry determined algorithmically
● Allocates multiple blocks throughout the hash space when a node joins or leaves the grid
[Diagram: hash ring marked at key 0, key 500, and key 1000, split into multiple blocks per node; key 843 holds value “m”, key 335 holds value “p”]
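A rough sketch of virtual nodes: each physical node is hashed several times so that it owns many small blocks of the ring instead of one large one. Everything here is an assumption made for illustration, including the 0-1000 key space taken from the diagrams, the SHA-256 placement of blocks, and the class and function names.

```python
import hashlib
from bisect import bisect_right
from collections import Counter

RING_SIZE = 1000  # assumed, mirrors the key space in the diagrams


def ring_position(label: str) -> int:
    """Place a virtual node on the ring by hashing its label."""
    digest = hashlib.sha256(label.encode("utf-8")).hexdigest()
    return int(digest, 16) % RING_SIZE


class VirtualNodeRing:
    def __init__(self, nodes, vnodes_per_node=8):
        # Every physical node contributes several blocks to the ring.
        self._tokens = sorted(
            (ring_position(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )

    def owner(self, key: int) -> str:
        starts = [start for start, _ in self._tokens]
        return self._tokens[bisect_right(starts, key) - 1][1]

    def without(self, node):
        """Return a copy of the ring with one physical node removed."""
        ring = VirtualNodeRing([], 0)
        ring._tokens = [(s, n) for s, n in self._tokens if n != node]
        return ring


if __name__ == "__main__":
    ring = VirtualNodeRing(["A", "B", "C"])
    print(Counter(ring.owner(k) for k in range(RING_SIZE)))
    smaller = ring.without("B")
    print(Counter(smaller.owner(k) for k in range(RING_SIZE)))
```

When a node leaves, its blocks are absorbed by whichever neighbours precede them, so the load is shared out instead of landing on a single node.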
Conceptual architecture
JBoss Data Grid conceptual architecture
Client / server
[Diagram: on the client, the user app calls the cache API, backed by an L1 cache; on the server, a cache manager hosts several caches, and cache loaders/stores connect them to persistent stores]
Conceptual architecture
Cache API and L1 cache
User application
● End-user interface (e.g. web application, Java server application)
Cache API
● Uses memcached, Hot Rod, or REST APIs
L1 near cache
● Stores remote cache entries after they are initially accessed
● For fast retrieval and to prevent unnecessary remote fetch operations
[Diagram: client containing the user app, cache API, and L1 cache]
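The near-cache idea can be sketched as a thin wrapper around a remote lookup. `NearCache` and its `remote_get` parameter are hypothetical names; a dictionary lookup stands in for the network call, and how stale entries get invalidated is left outside the sketch.

```python
class NearCache:
    """Keeps a local copy of entries fetched from a remote cache."""

    def __init__(self, remote_get):
        self._remote_get = remote_get  # callable standing in for a remote fetch
        self._local = {}
        self.remote_calls = 0

    def get(self, key):
        if key in self._local:
            return self._local[key]    # served locally, no remote round trip
        self.remote_calls += 1
        value = self._remote_get(key)
        if value is not None:
            self._local[key] = value   # remember it for the next read
        return value

    def invalidate(self, key):
        # Drop the local copy, e.g. after the remote value has changed.
        self._local.pop(key, None)


if __name__ == "__main__":
    remote = {"K1": "v1"}
    near = NearCache(remote.get)
    near.get("K1")
    near.get("K1")
    print(near.remote_calls)  # 1
```

The second read never leaves the client, which is the saving the L1 cache is meant to provide.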
Conceptual architecture
Cache and cache manager
Cache manager
● Primary mechanism to retrieve a cache instance
Cache
● Houses cache instances
Flexible setup
● One cache manager per process
● Multiple caches per cache manager
● One interface per cache
[Diagram: a cache manager hosting four caches]
Conceptual architecture
Cache and cache manager
Cache configuration
● Locking policy
● Transactions
● Eviction policy
● Expiration policy
● Persistence mechanism
● Backups
● L1 cache policy
Cache manager configuration
● Name / Alias / JNDI
● Start-up policy
● Transport policies
● Caches
Conceptual architecture
Cache store, cache loader, and persistent store
Cache loader
● Read-only interface – locate and retrieve data
Cache store
● Cache loader with write capabilities
Persistent store
● Permanent store for cache instances and entries (e.g. relational database)
[Diagram: cache loaders/stores connecting caches to persistent stores]
Conceptual architecture
The cache store
● Write-behind or write-through behavior
● A cache has one or more cache stores
● Cache stores can be chained
● Can be loaded or purged on start
● Open and supported API for custom stores
● File, JDBC, remote
[Diagram: cache loaders/stores connecting caches to persistent stores]
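The difference between write-through and write-behind comes down to when the persistent store sees the write. The two classes below are a hypothetical sketch: a dictionary plays the persistent store, and `flush()` is called by hand where a real grid would typically drain the queue in the background.

```python
from collections import deque


class WriteThroughCache:
    def __init__(self, store):
        self.memory = {}
        self.store = store

    def put(self, key, value):
        self.memory[key] = value
        self.store[key] = value             # store is updated before put() returns


class WriteBehindCache:
    def __init__(self, store):
        self.memory = {}
        self.store = store
        self._pending = deque()

    def put(self, key, value):
        self.memory[key] = value
        self._pending.append((key, value))  # store is updated later

    def flush(self):
        # Drain queued writes in the order they were made.
        while self._pending:
            key, value = self._pending.popleft()
            self.store[key] = value


if __name__ == "__main__":
    database = {}
    behind = WriteBehindCache(database)
    behind.put("K1", "v1")
    print(database)  # {}
    behind.flush()
    print(database)  # {'K1': 'v1'}
```

Write-behind keeps writes fast for the caller at the cost of a window in which the store lags behind the cache.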
JBoss Data Grid: Use cases
Use case - Local cache
Boost application performance
A more sophisticated HashMap
● Memory management
● Persistence
● Eviction, expiration
● Eliminate OOM
● Warm-start, preload
● Transaction capable (JTA)
● Monitor-able (JMX)
● Events and notifications
● Plugs into many frameworks to boost performance
[Diagram: application with Cache A and Cache B in front of a database]
Ideal for:
● Single processes
● Data unique to a process
● Unshared data
Use case – Data grid
Achieve massive elastic big data scale
● Distributed, horizontally scalable, unlimited storage
● Move processing to data with map and reduce
● Low-latency, fast performance
● Eliminate single point of failure
● Built on the Red Hat-led JSR-347 (data grids) standard
● Multiple access protocols
● Compatible with applications written in any language, any framework
[Diagram: Applications A and B, each with embedded caches, alongside standalone servers A, B, and C holding caches; database optional]
Use case - Replicated cache
Ultimate failover protection
● Instant reads, linear performance scalability
● Network overhead scales linearly
● Limited to a single JVM heap size
● Replicates the same key/value pairs and updates across the cluster
[Diagram: Applications A, A’, and B, each holding Cache A and Cache B, in front of a database]
Ideal for:
● Small, fixed datasets
● Scenarios requiring extremely high fault tolerance
Use case – Data grid
Achieve massive elastic big data scale
Ideal for:
● Massive distributed datasets like those from global, decentralized locations
● Elastic datasets that experience large fluctuations, periodicity, or unpredictability
● Transferring transaction loads away from local cache and traditional databases
JBoss Data Grid:
Deployment and use patterns
Deployment
Library mode
● “Bring your own” container
● Within one JVM:
– Multiple caches
– One node / cache
– Multiple caches / application
● ‘Cache hit’ is in memory
● Memory management
● Transactions, monitoring, events, and notifications
[Diagram: one JVM containing the user applications and several caches]
Deployment
Client / Server stand-alone mode
● “Remote” clients
● Within one service JVM:
– Multiple caches
– One node / cache
– Multiple caches / application
● Cache hit is not in local memory
● Compatibility - language agnostic
● Separate app and storage life cycles
[Diagram: user applications connecting to Data Grid instances, each running in its own JVM and hosting caches]
Usage patterns
Side cache
● Application manages cache
[Diagram: application talking to both the cache and the database]
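In the side-cache pattern the application itself decides when to consult the cache and when to go to the database. The helper names below are made up for this sketch, and plain dictionaries stand in for both the cache and the database.

```python
def lookup(key, cache, database):
    """Read path: try the cache first, fall back to the database."""
    value = cache.get(key)
    if value is None:
        value = database.get(key)
        if value is not None:
            cache[key] = value   # the application populates the cache itself
    return value


def save(key, value, cache, database):
    """Write path: update the database, then drop the stale cache entry."""
    database[key] = value
    cache.pop(key, None)


if __name__ == "__main__":
    cache, database = {}, {"K1": "v1"}
    print(lookup("K1", cache, database))  # v1
    save("K1", "v2", cache, database)
    print(lookup("K1", cache, database))  # v2
```

Dropping the entry on write is one common choice; updating it in place would also work but needs more care when several writers compete.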
Usage patterns
Inline cache - Application speaks only to cache
Read:
1) App requests data (K1)
2) Cache loader retrieves from persistent store (K1)
Write:
1) App writes data (K2)
2) Cache writes to persistent store (K2)
[Diagram: read path from the application through the cache and its loader to the persistent store (K1); write path from the application through the cache and its store to the persistent store (K2)]
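The two flows above can be sketched with a cache that owns the loader and the store, so the application never touches the persistent store directly. `InlineCache` and its `load` and `store` callables are hypothetical names for this illustration.

```python
class InlineCache:
    """The application talks only to this cache."""

    def __init__(self, load, store):
        self._load = load     # cache loader: key -> value, or None if absent
        self._store = store   # cache store: (key, value) -> None
        self._memory = {}

    def get(self, key):
        if key not in self._memory:
            value = self._load(key)       # miss: the loader fetches the entry
            if value is None:
                return None
            self._memory[key] = value
        return self._memory[key]

    def put(self, key, value):
        self._memory[key] = value
        self._store(key, value)           # the cache writes to the persistent store


if __name__ == "__main__":
    persistent = {"K1": "v1"}
    cache = InlineCache(persistent.get, persistent.__setitem__)
    print(cache.get("K1"))    # v1
    cache.put("K2", "v2")
    print(persistent["K2"])   # v2
```

Because the loader and store sit behind the cache, the persistence technology can change without touching application code.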
Searching/Indexing
[Diagram: server with a cache manager hosting Cache A and Cache B; App A uses Hibernate Search; App B gets indexed data]
Map/Reduce
[Diagram: 1. Map: a mapper (M) runs on each of three nodes against its local key/value table (K1, K2, K3); 2. Reduce: reducers (R) combine the mapper results]
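A word count is the usual way to show the two phases. The sketch below is an in-process illustration, not the grid's map/reduce API: each dictionary stands for the entries held on one node, and the assumption that each node holds a different key is made only to keep the totals easy to check.

```python
from collections import Counter
from functools import reduce


def map_phase(partition):
    """Runs next to the data: count the words held by one node."""
    counts = Counter()
    for text in partition.values():
        counts.update(text.split())
    return counts


def reduce_phase(partials):
    """Merge the per-node results into one answer."""
    return reduce(lambda left, right: left + right, partials, Counter())


if __name__ == "__main__":
    nodes = [
        {"K1": "grid cache"},
        {"K2": "cache store"},
        {"K3": "grid grid"},
    ]
    totals = reduce_phase(map_phase(node) for node in nodes)
    print(totals["grid"], totals["cache"], totals["store"])  # 3 2 1
```

Only the small per-node counters travel between nodes; the values themselves stay where they are stored.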
One Scenario
[Diagram: web servers, grid servers, DB/storage, integration servers, management/monitoring; caption: Data Replication and Cache]
References
● http://www.redhat.com
● http://access.redhat.com
● http://www.openshift.com
● http://www.jboss.org/infinispan
● http://www.jboss.org/jgroups