Building Apps with Distributed In-Memory Computing using Apache Geode (incubating)
Nitin Lamba (@nlamba9)
William Markito (@william_markito)
Agenda
Introduction (Nitin)
• WHAT? Overview & history
• WHY? Relevance & Differentiators
• HOW? Features & Basic Concepts
• SEE! Quick start
Hands-on (William)
• LEARN: Advanced Concepts - Persistence, f(x), PDX, …
• SHOW: Demos (Docker, PDX)
Resources
Q & A
Introduction
Nitin
From GEM to GEODE…
What is GEODE?
A distributed, memory-based data management platform for data-oriented apps that need:
• high performance, scalability, resiliency and continuous availability
• fast access to critical data sets
• location-aware distributed data processing
• event-driven data architecture
High-level Architecture
Powerful app development kit
• APIs: Java & REST
• Adapters: Redis, Lucene*, Spark*, …
Multiple persistence options
• Filesystem, RDBMS or HDFS*
• Sync: read-through, write-through
• Async: write-behind
Durable <K,V> cache/store
• Data replicated or partitioned
• Redundant storage in memory or on disk
• Flexible data retention policies
[Figure: a peer-to-peer distributed system with a Locator and four Servers, plus a REST endpoint]
* Experimental and awaiting community feedback
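The synchronous and asynchronous persistence options above can be pictured with a small model. This is only a sketch in plain Java: the class `ReadThroughWriteBehindSketch` and its methods are hypothetical names, the backing store is an ordinary map standing in for an RDBMS or file, and nothing here is taken from the Geode API.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical sketch: none of these names come from the Geode API.
final class ReadThroughWriteBehindSketch {
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> backingStore; // stands in for an RDBMS or file
    private final Queue<String> dirtyKeys = new ArrayDeque<>();

    ReadThroughWriteBehindSketch(Map<String, String> backingStore) {
        this.backingStore = backingStore;
    }

    // Read-through: a cache miss is loaded synchronously from the backing store.
    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = backingStore.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }

    // Write-behind: the cache is updated now, the store only when flush() runs.
    void put(String key, String value) {
        cache.put(key, value);
        dirtyKeys.add(key);
    }

    // Drains the queue of changed keys into the backing store.
    void flush() {
        String key;
        while ((key = dirtyKeys.poll()) != null) {
            backingStore.put(key, cache.get(key));
        }
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("a", "1");
        ReadThroughWriteBehindSketch region = new ReadThroughWriteBehindSketch(store);
        System.out.println(region.get("a"));        // loaded through the cache
        region.put("b", "2");
        System.out.println(store.containsKey("b")); // store not updated yet
        region.flush();
        System.out.println(store.get("b"));
    }
}
```

A write-through variant would simply update the backing store inside `put` instead of queueing the key.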
Incubating but ROCK solid…
• 1000+ systems in production (real customers)
• Cutting edge use cases
Timeline: <2000, 2004, 2008, 2012, 2016
Early drivers
• Data volumes
• Margins/transactions
• IT maintenance costs
• Elasticity needs
Real-time needs
• Real-time response
• Time-to-market needs
• Flexible data models
• Persistent + in-memory
Global data
• Visibility across data centers
• Fast ingest
• Device to enterprise
• Uptime (always on)
Open Source!
• Apache incubation
• GemFire > Geode
• M1 release
• 1st Geode Summit
Customers and use cases: financial services, US DoD, trade clearing, travel portal, online gambling, telcos, manufacturing, auto insurance, payroll processing, rail systems
…with both SCALE and SPEED, …
• 40K transactions per second
• 3TB data in-memory
• 17B records in-memory
• 120K concurrent users
… and impacting a LOT of people!
• China Railway Corporation: 19%
• Indian Railways: 17%
• Together: 36% of the world population
Built for PERFORMANCE…
[Chart: YCSB workloads, operations per second (0 to 800,000), Cassandra vs. Geode, for A reads, A updates, B reads, B updates, C reads, D inserts, D reads, F reads and F updates]
…and horizontal, consistent SCALABILITY!
Horizontal scaling for reads, consistent latency and CPU
[Chart: speedup, latency (ms) and CPU % against the number of server hosts (2, 4, 6, 8, 10)]
• Scaled from 256 clients and 2 servers to 1280 clients and 10 servers
• Partitioned region with redundancy and 1K data size
What makes it go FAST?
• Minimize copying
• Minimize contention points
• Run user code in-process
• Partitioning & parallelism
• Avoid disk seeks
• Automated benchmarks
Let’s talk about a few (basic) CONCEPTS…
• Cache
• Region
• Member
• Client Cache
• Persistence
• Functions
• Events & Listeners
• High Availability
• Serialization
What is a CACHE?
• In-memory storage and management for your data
• Configurable through XML, Java API or CLI
• A collection of Regions
[Figure: a Cache inside a JVM containing three Regions]
What is a REGION?
• Distributed java.util.Map on steroids (Key/Value)
• Consistent API regardless of where or how data is stored
• Observable (reactive)
• Highly available, redundant on cache Member(s)
[Figure: a Region (java.util.Map) inside a Cache in a JVM, holding K01 -> May and K02 -> Tim]
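To make the "observable map" idea concrete, here is a minimal sketch in plain Java. `ObservableRegionSketch` is a hypothetical stand-in, not Geode's Region class: it only shows a key/value map that notifies registered listeners on every put, using the K01/K02 entries from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

// Hypothetical stand-in for a region; it is NOT Geode's Region class.
final class ObservableRegionSketch<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final List<BiConsumer<K, V>> listeners = new ArrayList<>();

    // Registers a callback that is invoked after every put.
    void addListener(BiConsumer<K, V> listener) {
        listeners.add(listener);
    }

    // Stores the entry, then notifies each listener with the new key and value.
    V put(K key, V value) {
        V previous = entries.put(key, value);
        for (BiConsumer<K, V> listener : listeners) {
            listener.accept(key, value);
        }
        return previous;
    }

    V get(K key) {
        return entries.get(key);
    }

    public static void main(String[] args) {
        ObservableRegionSketch<String, String> region = new ObservableRegionSketch<>();
        region.addListener((k, v) -> System.out.println("updated " + k + " -> " + v));
        region.put("K01", "May");
        region.put("K02", "Tim");
        System.out.println(region.get("K01"));
    }
}
```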
Region: Types & Options
• Local, Replicated or Partitioned
• In-memory or persistent
• Redundant
• LRU
• Overflow
Region shortcuts:
LOCAL
LOCAL_HEAP_LRU
LOCAL_OVERFLOW
LOCAL_PERSISTENT
LOCAL_PERSISTENT_OVERFLOW
PARTITION
PARTITION_HEAP_LRU
PARTITION_OVERFLOW
PARTITION_PERSISTENT
PARTITION_PERSISTENT_OVERFLOW
PARTITION_PROXY
PARTITION_PROXY_REDUNDANT
PARTITION_REDUNDANT
PARTITION_REDUNDANT_HEAP_LRU
PARTITION_REDUNDANT_OVERFLOW
PARTITION_REDUNDANT_PERSISTENT
PARTITION_REDUNDANT_PERSISTENT_OVERFLOW
REPLICATE
REPLICATE_HEAP_LRU
REPLICATE_OVERFLOW
REPLICATE_PERSISTENT
REPLICATE_PERSISTENT_OVERFLOW
REPLICATE_PROXY
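The LRU and Overflow options can be illustrated with a small model. The sketch below is an assumption-laden simplification, not Geode's eviction code: `LruOverflowSketch` is a hypothetical class, the limit is an entry count rather than heap usage, and a second in-memory map stands in for the disk.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of LRU eviction with overflow; not Geode's implementation.
final class LruOverflowSketch<K, V> {
    private final Map<K, V> overflow = new HashMap<>(); // stands in for disk
    private final LinkedHashMap<K, V> memory;

    LruOverflowSketch(final int maxInMemory) {
        // accessOrder = true keeps the least recently used entry first.
        this.memory = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                if (size() > maxInMemory) {
                    // Overflow: move the evicted entry to "disk" instead of dropping it.
                    overflow.put(eldest.getKey(), eldest.getValue());
                    return true;
                }
                return false;
            }
        };
    }

    void put(K key, V value) {
        overflow.remove(key); // a fresh value supersedes any overflowed copy
        memory.put(key, value);
    }

    V get(K key) {
        V value = memory.get(key); // counts as an access for LRU ordering
        if (value == null && overflow.containsKey(key)) {
            value = overflow.remove(key); // fault the entry back into memory
            memory.put(key, value);
        }
        return value;
    }

    boolean inMemory(K key) {
        return memory.containsKey(key);
    }

    public static void main(String[] args) {
        LruOverflowSketch<String, String> region = new LruOverflowSketch<>(2);
        region.put("K01", "May");
        region.put("K02", "Tim");
        region.get("K01");        // K02 is now the least recently used entry
        region.put("K03", "Ann"); // exceeds the limit, so K02 overflows
        System.out.println(region.inMemory("K02"));
        System.out.println(region.get("K02"));
    }
}
```

An eviction-only region would drop the eldest entry instead of moving it to the overflow map.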
Persistent Regions
• Durability
• Write-ahead log (WAL) for efficient writing
• Consistent recovery
• Compaction
[Figure: Member 1 handles "Put k4->v7" by appending "Modify k4->v7" to Oplog3.crf; Oplog2.crf holds Modify k1->v5, Create k6->v6, Create k2->v2 and Create k4->v4. Each of Server 1 … Server N hosts the region in its own cache.]
What is a MEMBER?
• A process that has a connection to the system
• A process that has created a cache
• Embeddable within your application
[Figure: Client, Locator, Server]
What is a CLIENT CACHE?
• A process connected to the Geode server(s)
• Can have a local copy of the data
• Can run OQL queries on local data
• Can be notified about events on the servers
[Figure: an Application with a Client Cache connected to a GemFire Server hosting three Regions]
How to START? Easy as 1, 2, 3!
1. Clone & Build
git clone https://github.com/apache/incubator-geode
cd incubator-geode
./gradlew build -Dskip.tests=true
2. Start Services
cd gemfire-assembly/build/install/apache-geode
./bin/gfsh
gfsh> start locator --name=locator
gfsh> start server --name=server
3. Create & Monitor Region
gfsh> create region --name=myRegion --type=REPLICATE
gfsh> start [pulse | jconsole]
Hands On
William
More (advanced) CONCEPTS…
• Cache
• Region
• Member
• Client Cache
• Persistence
• Functions
• Events & Listeners
• High Availability
• Serialization
Persistence - Shared Nothing
[Figure, animated over several slides: Server 1, Server 2 and Server 3 hold buckets B1, B2 and B3; each bucket has a primary and a secondary copy.]
• Server 1 waits for others when it starts
• Fetches missed operations on restart
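The restart behaviour described above (a returning server fetching the operations it missed) can be modelled as follows. This is a deliberately simplified sketch, assuming each bucket copy keeps an ordered log with sequence numbers; `BucketCopySketch`, `Op` and `catchUpFrom` are hypothetical names, and the real recovery protocol is more involved than this.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of one copy of a bucket; not Geode's recovery protocol.
final class BucketCopySketch {
    // One logged operation: sequence number, key and value.
    static final class Op {
        final long seq;
        final String key;
        final String value;

        Op(long seq, String key, String value) {
            this.seq = seq;
            this.key = key;
            this.value = value;
        }
    }

    private final List<Op> log = new ArrayList<>(); // stands in for this member's own disk files
    private final Map<String, String> data = new HashMap<>();

    long lastSeq() {
        return log.isEmpty() ? 0L : log.get(log.size() - 1).seq;
    }

    void apply(Op op) {
        log.add(op);
        data.put(op.key, op.value);
    }

    // Operations held by this copy that a peer at sequence 'seq' has not seen.
    List<Op> opsAfter(long seq) {
        List<Op> missed = new ArrayList<>();
        for (Op op : log) {
            if (op.seq > seq) {
                missed.add(op);
            }
        }
        return missed;
    }

    // On restart: pull whatever was missed from a copy that stayed online.
    void catchUpFrom(BucketCopySketch peer) {
        for (Op op : peer.opsAfter(lastSeq())) {
            apply(op);
        }
    }

    String get(String key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        BucketCopySketch primary = new BucketCopySketch();
        BucketCopySketch secondary = new BucketCopySketch();
        Op first = new Op(1, "k1", "v1");
        primary.apply(first);
        secondary.apply(first);
        primary.apply(new Op(2, "k2", "v2")); // secondary is offline and misses this
        secondary.catchUpFrom(primary);
        System.out.println(secondary.get("k2"));
    }
}
```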
Persistence - Operational Logs
[Figure: Member 1 handles "Put k6->v6" by appending to the operation log.
Oplog1.crf: Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4
Oplog2.crf: Modify k1->v5, Create k6->v6]
Persistence - Operational Logs: Compaction
[Figure: Member 1 handles "Put k6->v6" by appending to the operation log.
Oplog1.crf: Create k1->v1, Create k2->v2, Modify k1->v3, Create k4->v4
Oplog2.crf: Modify k1->v5, Create k6->v6
Compaction copies the live data forward from the older log into the newer one.]
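A rough sketch of "copy live data forward", using the records from the figure. The names `OplogCompactionSketch` and `LogRecord` are hypothetical, the logs are in-memory lists rather than .crf files, and a record counts as live when no later record exists for the same key; that liveness rule is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of oplog compaction; the lists stand in for .crf files.
final class OplogCompactionSketch {
    static final class LogRecord {
        final String key;
        final String value;

        LogRecord(String key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Records in oldLog whose key has no later record in oldLog or in newerLog.
    static List<LogRecord> liveRecords(List<LogRecord> oldLog, List<LogRecord> newerLog) {
        Map<String, LogRecord> latest = new LinkedHashMap<>();
        for (LogRecord r : oldLog) {
            latest.put(r.key, r); // a later record for the same key replaces the earlier one
        }
        for (LogRecord r : newerLog) {
            latest.remove(r.key); // superseded by the newer log
        }
        return new ArrayList<>(latest.values());
    }

    // Copies live records forward, after which the old log can be discarded.
    static void compact(List<LogRecord> oldLog, List<LogRecord> currentLog) {
        List<LogRecord> live = liveRecords(oldLog, currentLog);
        currentLog.addAll(live);
        oldLog.clear();
    }

    public static void main(String[] args) {
        List<LogRecord> oplog1 = new ArrayList<>();
        oplog1.add(new LogRecord("k1", "v1"));
        oplog1.add(new LogRecord("k2", "v2"));
        oplog1.add(new LogRecord("k1", "v3"));
        oplog1.add(new LogRecord("k4", "v4"));
        List<LogRecord> oplog2 = new ArrayList<>();
        oplog2.add(new LogRecord("k1", "v5"));
        oplog2.add(new LogRecord("k6", "v6"));

        compact(oplog1, oplog2);
        for (LogRecord r : oplog2) {
            System.out.println(r.key + "->" + r.value);
        }
    }
}
```

With the figure's data, only k2 and k4 survive from the older log, because k1 was modified again in the newer one.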
Functions
• Used for distributed concurrent processing (Map/Reduce, stored procedure)
• Highly available
• Data oriented
• Member oriented
[Figure: a client submits function f1; the servers hold functions f1, f2, … fn and execute them.]
33
Server Server
FunctionService.onRegion.withFilter.execute
ResultCollector.getResult
Server Distributed System
execute
Server
Server
6
1
result
execute
execute
result
result
2
5
3
4
3 4
Server
Partitioned Region
Data Store - X
Partitioned Region
Data Store - Y
Partitioned Region
Data Store - Z
Partitioned Region
Data Accessor
Partitioned Region
Data Accessor
filter = Keys X, Y
Client Region
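The data-oriented execution in the figure can be approximated in plain Java. This sketch does not use FunctionService or ResultCollector: `FunctionRoutingSketch` and `executeOnRegion` are hypothetical, keys are placed on "servers" by a simple hash (an assumption, not Geode's bucket placement), and the function runs once per server that hosts at least one filter key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;

// Hypothetical model of filter-based function routing; not the Geode API.
final class FunctionRoutingSketch {
    // Each map is the slice of the partitioned region hosted by one server.
    private final List<Map<String, Integer>> servers = new ArrayList<>();

    FunctionRoutingSketch(int serverCount) {
        for (int i = 0; i < serverCount; i++) {
            servers.add(new HashMap<>());
        }
    }

    // Assumed placement rule: hash of the key modulo the number of servers.
    private int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), servers.size());
    }

    void put(String key, int value) {
        servers.get(ownerOf(key)).put(key, value);
    }

    // Runs fn once per server hosting a filter key, on that server's local
    // subset of the filtered data, and collects one result per server.
    <R> List<R> executeOnRegion(Set<String> filter, Function<Map<String, Integer>, R> fn) {
        Map<Integer, Map<String, Integer>> localData = new TreeMap<>();
        for (String key : filter) {
            int owner = ownerOf(key);
            Integer value = servers.get(owner).get(key);
            if (value != null) {
                localData.computeIfAbsent(owner, o -> new HashMap<>()).put(key, value);
            }
        }
        List<R> results = new ArrayList<>();
        for (Map<String, Integer> local : localData.values()) {
            results.add(fn.apply(local));
        }
        return results;
    }

    public static void main(String[] args) {
        FunctionRoutingSketch region = new FunctionRoutingSketch(3);
        region.put("X", 10);
        region.put("Y", 20);
        region.put("Z", 30);
        Set<String> filter = new HashSet<>(Arrays.asList("X", "Y"));
        List<Integer> partialSums = region.executeOnRegion(
                filter, local -> local.values().stream().mapToInt(Integer::intValue).sum());
        System.out.println(partialSums);
    }
}
```

The caller combines the per-server partial results, which is the role the slide assigns to the result collector.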
Events & Notifications
• Register Interest
  • Individual keys OR RegEx for keys
  • Updates local copy
  • Examples:
    • region.registerInterest("key-1");
    • region1.registerInterestRegex("[a-z]+");
• Continuous Query
  • Receive notification when the query condition is met on the server
  • Example:
    • SELECT * FROM /tradeOrder t WHERE t.price > 100.00
Can be DURABLE
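As an illustration of regex-based interest and of a continuous-query condition, the sketch below filters keys and prices in plain Java. `InterestSketch` is hypothetical and only mimics the idea; in particular, matching the whole key against the pattern is an assumption of this sketch, not a statement about how the server evaluates the expression.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical model of client-side interest registration; not the Geode API.
final class InterestSketch {
    private final List<Pattern> interests = new ArrayList<>();

    // Remembers a key pattern the client wants updates for.
    void registerInterestRegex(String regex) {
        interests.add(Pattern.compile(regex));
    }

    // True when the whole key matches at least one registered pattern.
    boolean isInterested(String key) {
        for (Pattern p : interests) {
            if (p.matcher(key).matches()) {
                return true;
            }
        }
        return false;
    }

    // Continuous-query idea: a condition re-evaluated on each update,
    // here the slide's "t.price > 100.00".
    static boolean matchesTradeQuery(double price) {
        return price > 100.00;
    }

    public static void main(String[] args) {
        InterestSketch client = new InterestSketch();
        client.registerInterestRegex("[a-z]+");
        System.out.println(client.isInterested("orders"));
        System.out.println(client.isInterested("key-1"));
        System.out.println(matchesTradeQuery(250.0));
    }
}
```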
Listeners
• CacheWriter / CacheListener
• AsyncEventListener (queue / batch)
• Parallel or Serial
• Conflation
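Queueing, batching and conflation can be sketched together. In this simplified model, assumed for illustration only, conflation means that a newer event for a key replaces the event still waiting in the queue, so a slow listener sees only the latest value. `ConflatingQueueSketch` is a hypothetical class and does not reflect the AsyncEventListener interface.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a conflating event queue; not the Geode API.
final class ConflatingQueueSketch {
    // Insertion-ordered, one pending event per key.
    private final Map<String, String> pending = new LinkedHashMap<>();

    // Conflation: a newer value replaces the value still queued for the same key.
    void enqueue(String key, String value) {
        pending.put(key, value);
    }

    // Removes up to maxSize events and hands them over as one batch.
    List<String> drainBatch(int maxSize) {
        List<String> batch = new ArrayList<>();
        Iterator<Map.Entry<String, String>> it = pending.entrySet().iterator();
        while (it.hasNext() && batch.size() < maxSize) {
            Map.Entry<String, String> event = it.next();
            batch.add(event.getKey() + "=" + event.getValue());
            it.remove();
        }
        return batch;
    }

    public static void main(String[] args) {
        ConflatingQueueSketch queue = new ConflatingQueueSketch();
        queue.enqueue("k1", "v1");
        queue.enqueue("k2", "v2");
        queue.enqueue("k1", "v3"); // replaces the queued k1=v1
        System.out.println(queue.drainBatch(10));
    }
}
```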
High Availability
Fixed or Flexible schema?
Fixed: id | name | age | pet_id
or flexible:
{
  id   : 1,
  name : "Fred",
  age  : 42,
  pet  : {
    name : "Barney",
    type : "dino"
  }
}
Portable Data eXchange (PDX)
• C#, C++, Java, JSON
• No IDL, no schemas, no hand-coding
• Schema evolution (forward and backward compatible)
* domain object classes not required
|            header            |       data       |
| pdx | length | dsid | typeid | fields | offsets |
Efficient for queries
{
  id   : 1,
  name : "Fred",
  age  : 42,
  pet  : {
    name : "Barney",
    type : "dino"
  }
}
SELECT p.name FROM /Person p WHERE p.pet.type = "dino"
Single-field deserialization
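Why an offsets table enables single-field deserialization can be shown with a toy binary layout. The format below is invented for this sketch and is not the real PDX wire format: a field count, one offset per field, then length-prefixed UTF-8 values. `FieldOffsetSketch` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy layout (NOT the real PDX format):
// [fieldCount:int][offset_0:int]...[offset_n-1:int][length:int, utf8 bytes]...
final class FieldOffsetSketch {
    static byte[] write(String... fields) {
        byte[][] encoded = new byte[fields.length][];
        int dataSize = 0;
        for (int i = 0; i < fields.length; i++) {
            encoded[i] = fields[i].getBytes(StandardCharsets.UTF_8);
            dataSize += 4 + encoded[i].length;
        }
        int headerSize = 4 + 4 * fields.length;
        ByteBuffer buf = ByteBuffer.allocate(headerSize + dataSize);
        buf.putInt(fields.length);
        int offset = headerSize;
        for (byte[] e : encoded) { // offsets table
            buf.putInt(offset);
            offset += 4 + e.length;
        }
        for (byte[] e : encoded) { // length-prefixed field data
            buf.putInt(e.length);
            buf.put(e);
        }
        return buf.array();
    }

    // Reads one field by jumping to its offset; no other field is decoded.
    static String readField(byte[] bytes, int index) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int offset = buf.getInt(4 + 4 * index);
        int length = buf.getInt(offset);
        return new String(bytes, offset + 4, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Field order assumed here: id, name, age, pet.type
        byte[] person = write("1", "Fred", "42", "dino");
        System.out.println(readField(person, 3));
    }
}
```

A query such as the one above only needs the last field, so it can skip decoding id, name and age.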
But HOW to serialize data?
Benchmark: https://github.com/eishay/jvm-serializers
Schema Evolution
[Figure: Member A and Member B hold distributed type definitions v1 and v2, used by Application #1 and Application #2]
• v2 objects preserve data from missing fields
• v1 objects use default values to fill in new fields
PDX provides forwards and backwards compatibility, no code required
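The two compatibility rules above can be acted out with field maps. This is a hand-written sketch, whereas the slide says PDX needs no such code: `PersonV1`, `PersonV2` and the `age` field added in v2 are hypothetical examples. An application on the old version keeps fields it does not know and writes them back, and an application on the new version falls back to a default when a field is absent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of schema evolution over field maps; not PDX itself.
final class SchemaEvolutionSketch {
    // Older application version: knows only "id" and "name".
    static final class PersonV1 {
        final int id;
        final String name;
        final Map<String, Object> unread; // fields this version does not know

        PersonV1(int id, String name, Map<String, Object> unread) {
            this.id = id;
            this.name = name;
            this.unread = unread;
        }

        static PersonV1 fromFields(Map<String, Object> fields) {
            Map<String, Object> rest = new HashMap<>(fields);
            int id = (Integer) rest.remove("id");
            String name = (String) rest.remove("name");
            return new PersonV1(id, name, rest);
        }

        // Writes known fields plus the preserved unknown ones.
        Map<String, Object> toFields() {
            Map<String, Object> out = new HashMap<>(unread);
            out.put("id", id);
            out.put("name", name);
            return out;
        }
    }

    // Newer application version: adds "age".
    static final class PersonV2 {
        final int id;
        final String name;
        final int age;

        PersonV2(int id, String name, int age) {
            this.id = id;
            this.name = name;
            this.age = age;
        }

        static PersonV2 fromFields(Map<String, Object> fields) {
            Object age = fields.get("age");
            // A missing field gets the default value 0.
            return new PersonV2((Integer) fields.get("id"), (String) fields.get("name"),
                    age == null ? 0 : (Integer) age);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> writtenByV2 = new HashMap<>();
        writtenByV2.put("id", 1);
        writtenByV2.put("name", "Fred");
        writtenByV2.put("age", 42);
        Map<String, Object> rewrittenByV1 = PersonV1.fromFields(writtenByV2).toFields();
        System.out.println(rewrittenByV1.get("age")); // preserved although v1 never read it
    }
}
```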
Demo (Docker, PDX, …)
How to CONTRIBUTE?
Code
• New features
• Bug fixes
• Writing tests
Documentation
• Wiki
• Web site
• User guide
Community
• Join the mailing list
• Ask or answer
• Join our HipChat
• Become a speaker
• Finding bugs
• Testing an RC/Beta
Where to BEGIN?
Website: http://geode.incubator.apache.org/
JIRA: https://issues.apache.org/jira/browse/GEODE
Wiki: cwiki.apache.org/confluence/display/GEODE
GitHub: https://github.com/apache/incubator-geode
Mailing lists: mail-archives.apache.org/mod_mbox/incubator-geode-dev/
Thank you!
http://geode.incubator.apache.org
https://github.com/Pivotal-Open-Source-Hub
