How to make your DBMS
1000x faster
19/12/2017 – City College
Presentation’s Overview
Part 1 – Theory
• What is the problem with RDBMS and NoSQL solutions
• What is Apache Ignite
• Main Features of Apache Ignite
• Use cases and Integrations
• Supported platforms
Part 2 – Examples from Apache Ignite’s repository
Scaling relational Databases is hard
• RDBMS mainly scale up / vertically (bigger/faster
machines)
• Limited scalability compared with Big Data requirements
• Shared-everything approach
• Same data files are available to all nodes (instances)
• Distributed locks required
• Distributed network search is required when the data
of an instance is not yet persisted into the file system.
In these cases a network search is required for recently
committed values. This approach is not scalable.
Source: http://www.marklogic.com/blog/relational-databases-scale/
NoSQL databases as a solution
• Provide horizontal scalability with the use of shared
nothing architectures and partitioning.
• But the following functionalities are not easily supported
in most NoSQL platforms:
• Joins
• Set operations (union/intersect/minus)
• Transactions
• Full ANSI SQL support
• Constraints as we know from the RDBMS
What is Apache Ignite
Apache Ignite is the in-memory computing platform
that is durable, strongly consistent, and highly available
with powerful SQL, key-value and processing APIs.
• Durable Memory
• Ignite Persistence
• ACID Compliance
• Complete SQL Support
• Key-Value
• Collocated Processing
• Scalability and Durability
What can you do with Apache Ignite?
Apache Ignite is an in-memory computing platform that
enables you to dramatically accelerate and scale out your
existing data-intensive applications without ripping and
replacing your existing databases. It can reduce query
times by 1,000x versus disk-based systems. You can scale
out by adding new nodes to your cluster, which can handle
hundreds of terabytes of data from multiple databases.
What can you do with Apache Ignite? (cont.)
You can modernize your existing data-intensive
architecture by inserting Apache Ignite between your
existing application and database layers. Apache Ignite
integrates seamlessly with RDBMS, NoSQL and Apache®
Hadoop™ databases. It features a unified API which
supports SQL, C++, .NET, PHP, MapReduce,
Java/Scala/Groovy, and Node.js protocols for the
application layer. Your Apache Ignite cluster, applications,
and databases can run on premise, in a hybrid
environment, or on a cloud platform such as AWS® or
Microsoft Azure.
Nikita Ivanov
Founder and CTO at GridGain systems
“You can buy a 10-blade server that has a terabyte of RAM for less than
$25,000 (~year 2015). RAM does push up the initial price but because RAM’s
lower power and cooling costs, and no moving parts to break, analysts say that
the TCO (Total Cost of Ownership) for using RAM instead of rotating or solid-
state storage as primary storage breaks even in about three years. And that's
just looking at TCO, not including the delivered value from getting much faster
processing performance.”
Source: https://www.linux.com/news/gridgain-memory-data-fabric-becomes-apache-ignite
Source: https://gist.github.com/jboner/2841832
Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
Disk seek                           10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from disk    20,000,000   ns   20,000 us   20 ms  80x memory, 20X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms
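The ratios in the right-hand column of the table can be re-derived from the 1 MB sequential read figures. The following is a small arithmetic sketch only; the class and method names are made up for illustration and the constants are copied from the table above.

```java
// Re-derives the "x times slower" ratios from the latency table (values in nanoseconds).
final class LatencyRatios {
    static final long MEMORY_1MB_NS = 250_000L;    // read 1 MB sequentially from memory
    static final long SSD_1MB_NS    = 1_000_000L;  // read 1 MB sequentially from SSD
    static final long DISK_1MB_NS   = 20_000_000L; // read 1 MB sequentially from disk

    /** How many times slower a medium is than the baseline it is compared with. */
    static long slowdown(long mediumNs, long baselineNs) {
        return mediumNs / baselineNs;
    }

    public static void main(String[] args) {
        System.out.println("SSD vs memory:  " + slowdown(SSD_1MB_NS, MEMORY_1MB_NS) + "x");
        System.out.println("Disk vs memory: " + slowdown(DISK_1MB_NS, MEMORY_1MB_NS) + "x");
        System.out.println("Disk vs SSD:    " + slowdown(DISK_1MB_NS, SSD_1MB_NS) + "x");
    }
}
```

These are order-of-magnitude figures for a single machine; they say nothing about network hops inside a cluster.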
Apache Ignite Overview
In-Memory Database (IMDB)
Apache Ignite can be used as a distributed and
horizontally scalable in-memory database
(IMDB) that supports ACID transactions and can
be used with SQL, key-value, compute, machine
learning and other data processing APIs.
One of the distinguishing characteristics of
Ignite SQL is its support for distributed SQL
JOINs, which work in both collocated and
non-collocated fashion.
When collocated, the JOINs are executed on the local data available on each node
without having to move large data sets across the network. This collocated
approach provides the best scalability and performance in distributed clusters.
More information about the IMDB can be found here.
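To illustrate why a collocated JOIN needs no network traffic, here is a minimal plain-Java sketch. It does not use the Ignite API: the `Node`, `Person` and `nodeFor` names, the sample data and the modulo placement rule are all assumptions made for this example. Both tables are placed by the same affinity key (the organization id), so every person sits on the same simulated node as its organization and each node can join its own data.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CollocatedJoinSketch {
    /** Hypothetical row type: a person that references an organization. */
    static final class Person {
        final String name;
        final int orgId;
        Person(String name, int orgId) { this.name = name; this.orgId = orgId; }
    }

    /** One simulated cluster node holding its share of both tables. */
    static final class Node {
        final Map<Integer, String> orgs = new HashMap<>(); // orgId -> organization name
        final List<Person> persons = new ArrayList<>();
    }

    /** Placement rule (an assumption): the same affinity key always maps to the same node. */
    static int nodeFor(int affinityKey, int nodeCount) {
        return Math.floorMod(affinityKey, nodeCount);
    }

    /** Distributes organizations and persons by orgId, so related rows are collocated. */
    static Node[] buildCluster(int nodeCount) {
        Node[] nodes = new Node[nodeCount];
        for (int i = 0; i < nodeCount; i++) nodes[i] = new Node();
        String[] orgNames = {"Apache", "GridGain", "City College"};
        for (int orgId = 0; orgId < orgNames.length; orgId++) {
            nodes[nodeFor(orgId, nodeCount)].orgs.put(orgId, orgNames[orgId]);
        }
        Person[] people = {
            new Person("Ann", 0), new Person("Bob", 1), new Person("Eve", 2), new Person("Dan", 1)
        };
        for (Person p : people) {
            nodes[nodeFor(p.orgId, nodeCount)].persons.add(p);
        }
        return nodes;
    }

    /** Each node joins only its local rows; the partial results are then concatenated. */
    static List<String> localJoin(Node[] nodes) {
        List<String> rows = new ArrayList<>();
        for (Node node : nodes) {
            for (Person p : node.persons) {
                String org = node.orgs.get(p.orgId); // never looks at another node
                if (org != null) rows.add(p.name + " works at " + org);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (String row : localJoin(buildCluster(3))) System.out.println(row);
    }
}
```

If persons were placed by their own id instead, `node.orgs.get` would miss for rows whose organization lives elsewhere, which is the case a non-collocated JOIN has to handle by moving data between nodes.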
Quiz 1
1. How many times is RAM faster than an SSD disk for
1MB sequential read?
1. 4
2. 10
3. 8
2. How many times is RAM faster than a typical HDD
for 1MB sequential read?
1. 80
2. 100
3. 1000
Quiz 2
1. Does Apache Ignite support transactions?
1. Yes
2. No
2. Does Apache Ignite support distributed SQL joins?
1. Yes
2. No
Distributed SQL Database
Apache Ignite is an ANSI-99 compliant,
horizontally scalable and
fault-tolerant distributed SQL database.
The distribution is provided either by
partitioning the data across cluster nodes
or by full replication, depending on the
use case.
You can interact with Ignite as you would
with any other SQL storage, using standard
JDBC or ODBC connectivity. Ignite also
provides native SQL APIs for Java, .NET
and C++ developers for better
performance.
More information about the Distributed SQL DBs can be found here.
Distributed Collocated SQL Query
In-Memory Data Grid (IMDG) – Key-Value store
The data grid has been built from
the ground up to linearly scale to
hundreds of nodes with strong
semantics for data locality and
affinity data routing to reduce
redundant data noise.
It can be viewed as a distributed partitioned hash map with every cluster
node owning a portion of the overall data. This way the more cluster nodes
we add, the more data we can cache.
More information about the IMDG can be found here.
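The "distributed partitioned hash map" view can be sketched in plain Java as follows. This is an illustration under simplifying assumptions, not Ignite code: the class name is hypothetical, each "node" is just a local `HashMap`, and ownership is decided by a hash modulo the node count, whereas Ignite's real affinity function is more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionedMapSketch<K, V> {
    /** One map per simulated node; each node owns a portion of the overall data. */
    private final List<Map<K, V>> nodes = new ArrayList<>();

    PartitionedMapSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) nodes.add(new HashMap<>());
    }

    /** Picks the owning node for a key (simplified: hash modulo node count). */
    private Map<K, V> ownerOf(K key) {
        return nodes.get(Math.floorMod(key.hashCode(), nodes.size()));
    }

    void put(K key, V value) { ownerOf(key).put(key, value); }

    V get(K key) { return ownerOf(key).get(key); }

    /** Number of entries stored on one node. */
    int entriesOnNode(int nodeIndex) { return nodes.get(nodeIndex).size(); }

    public static void main(String[] args) {
        PartitionedMapSketch<Integer, String> grid = new PartitionedMapSketch<>(4);
        for (int i = 0; i < 100; i++) grid.put(i, "v" + i);
        for (int n = 0; n < 4; n++) {
            System.out.println("node " + n + " holds " + grid.entriesOnNode(n) + " entries");
        }
    }
}
```

Because every node holds only its own share, adding nodes adds capacity, which is the point the slide makes.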
Compute Grid
Distributed computations are performed in
parallel fashion to gain high performance,
low latency, and linear scalability.
Ignite compute grid provides a set of
simple APIs that allow users to distribute
computations and data processing across
multiple computers in the cluster.
Distributed parallel processing is based on
the ability to take any computation and
execute it on any set of cluster nodes and
return the results back.
More information about the Compute grid can be found here.
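The fan-out and gather pattern behind a compute grid can be approximated on one machine with a thread pool. The sketch below is an analogy, not the Ignite compute API: worker threads stand in for cluster nodes, and the class and method names are invented for this example. A sentence is split into words, each word becomes an independent job, and the partial results are summed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ComputeFanOutSketch {
    /** Counts the characters of all words in parallel and returns the total. */
    static int totalLength(String sentence, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers); // threads play the role of nodes
        try {
            List<Callable<Integer>> jobs = new ArrayList<>();
            for (String word : sentence.split(" ")) {
                jobs.add(word::length); // one independent job per word
            }
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(jobs)) {
                total += result.get(); // gather the partial results
            }
            return total;
        } catch (Exception e) {
            throw new IllegalStateException("job execution failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(totalLength("Count characters using compute grid", 4));
    }
}
```

In a real cluster the jobs would also have to be serialized and shipped to other machines, which this sketch leaves out.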
Service Grid
Service Grid allows for deployments
of arbitrary user-defined services on
the cluster. You can implement and
deploy any service, such as custom
counters, ID generators, hierarchical
maps, etc.
• Continuous availability of deployed services regardless of topology changes or crashes.
• Automatically deploy any number of distributed service instances in the cluster.
• Automatically deploy singletons, including cluster-singleton, node-singleton, or key-affinity-singleton.
• Automatically deploy distributed services on node start-up by specifying them in the configuration.
• Undeploy any of the deployed services.
• Get information about service deployment topology within the cluster.
• Create service proxy for accessing remotely deployed distributed services.
More information about the Service grid can be found here.
Distributed Data Structures
Ignite allows for most of the data structures
from the java.util.concurrent framework to be
used in a distributed fashion.
Ignite gives you the capability to take a data
structure you are familiar with and use it in a
clustered fashion.
For example, you can take
java.util.concurrent.BlockingDeque and add
something to it on one node and poll it from
another node.
Or have a distributed primary key generator
which would guarantee uniqueness on all
nodes.
More information about the Distributed Data Structures can be found here.
• Queue and Set
• Atomic Types
• CountDownLatch
• ID Generator
• Semaphore
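One common way to build an ID generator that stays unique across nodes is to let each node reserve a block of values from a shared counter and hand them out locally. The sketch below shows that idea in plain Java; it is an assumption-laden illustration rather than Ignite's implementation. The shared `AtomicLong` stands in for cluster-wide state, and the class name and `reserveSize` parameter are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

final class ReservingIdGenerator {
    private final AtomicLong global; // stand-in for the counter shared by the whole cluster
    private final int reserveSize;   // how many ids a node reserves at once
    private long next;               // next id to hand out locally
    private long limit;              // exclusive end of the locally reserved block

    ReservingIdGenerator(AtomicLong global, int reserveSize) {
        this.global = global;
        this.reserveSize = reserveSize;
    }

    /** Returns an id that no other generator sharing the same counter will return. */
    synchronized long nextId() {
        if (next == limit) {
            // Local block exhausted (or never reserved): claim a fresh block atomically.
            next = global.getAndAdd(reserveSize);
            limit = next + reserveSize;
        }
        return next++;
    }

    public static void main(String[] args) {
        AtomicLong shared = new AtomicLong();
        ReservingIdGenerator nodeA = new ReservingIdGenerator(shared, 10);
        ReservingIdGenerator nodeB = new ReservingIdGenerator(shared, 10);
        Set<Long> seen = new HashSet<>();
        for (int i = 0; i < 25; i++) {
            seen.add(nodeA.nextId());
            seen.add(nodeB.nextId());
        }
        System.out.println("unique ids: " + seen.size());
    }
}
```

The trade-off is that ids are unique but not strictly ordered across nodes, and unused ids in a reserved block are lost if a node stops.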
Quiz 3
1. Is Apache Ignite able to scale up and down by simply
adding/removing nodes from the cluster?
1. Yes
2. No
2. Does Apache Ignite have the concept of servers and
clients?
1. Yes
2. No
3. Is it possible to manage an Apache Ignite cluster
remotely?
1. Yes
2. No
Data Streamers
1. Client nodes inject finite or
continuous streams of data into
Ignite caches using Ignite Data
Streamers.
2. Data is automatically partitioned
between Ignite data nodes, and each
node gets an equal amount of data.
3. Streamed data can be concurrently
processed directly on the Ignite data
nodes in collocated fashion.
4. Clients can also perform concurrent
SQL queries on the streamed data.
More information about the Data Streamers can be found here.
Integration with major streaming technologies
Apache Ignite integrates with major streaming technologies and frameworks
in order to bring even more advanced streaming capabilities to Ignite-based
architectures:
1. Kafka Streamer
2. Camel Streamer
3. JMS Streamer
4. MQTT Streamer
5. Storm Streamer
6. Flink streamer
7. Twitter Streamer
8. Flume Streamer
9. Zero MQ
10. Rocket MQ
More information about Integrating Ignite with Data
Streamers can be found here.
Messaging & Events
Exchange custom messages between nodes across the cluster.
Ignite distributed messaging allows for topic based cluster-wide
communication between all nodes.
Messages with a specified message topic can be distributed to all nodes,
or to a subgroup of nodes, that have subscribed to that topic.
Ignite messaging is based on the publish-subscribe paradigm, where publishers
and subscribers are connected together by a common topic.
When one of the nodes sends a message A for topic T, it is published on all
nodes that have subscribed to T.
More information about Messaging & Events can be found here.
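The topic-based publish-subscribe model described above can be sketched with a small in-process bus. This is not the Ignite messaging API: the `TopicBusSketch` class is hypothetical, listeners are plain callbacks, and delivery is a local method call rather than a network message.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

final class TopicBusSketch {
    /** Listeners grouped by the topic they subscribed to. */
    private final Map<String, List<Consumer<String>>> subscribers = new HashMap<>();

    /** Registers a listener (standing in for a node) for one topic. */
    void subscribe(String topic, Consumer<String> listener) {
        subscribers.computeIfAbsent(topic, t -> new ArrayList<>()).add(listener);
    }

    /** Delivers the message to every subscriber of the topic; returns how many received it. */
    int publish(String topic, String message) {
        List<Consumer<String>> listeners =
            subscribers.getOrDefault(topic, Collections.<Consumer<String>>emptyList());
        for (Consumer<String> listener : listeners) {
            listener.accept(message);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        TopicBusSketch bus = new TopicBusSketch();
        bus.subscribe("T", msg -> System.out.println("node 1 got " + msg));
        bus.subscribe("T", msg -> System.out.println("node 2 got " + msg));
        bus.subscribe("U", msg -> System.out.println("node 3 got " + msg));
        bus.publish("T", "A"); // only the two subscribers of T are notified
    }
}
```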
Sliding Windows
More information about Sliding Windows can be found here.
Sliding windows are configured as Ignite cache eviction policies,
and can be:
• Time-based sliding windows
• FIFO sliding windows
• LRU sliding windows
• Querying sliding windows
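A size-bounded FIFO or LRU window behaves like a map that drops its eldest entry once a limit is exceeded. The sketch below shows that eviction behaviour with the JDK's `LinkedHashMap`; it is an analogy for the eviction-policy idea, not Ignite configuration, and the class name is made up for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class SlidingWindowSketch<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    /**
     * @param maxSize window size in entries
     * @param lru     true for an LRU window (reads refresh an entry), false for FIFO
     */
    SlidingWindowSketch(int maxSize, boolean lru) {
        super(16, 0.75f, lru); // third argument switches LinkedHashMap to access order
        this.maxSize = maxSize;
    }

    /** Called by LinkedHashMap after each insert; evicts once the window is over capacity. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        SlidingWindowSketch<Integer, String> fifo = new SlidingWindowSketch<>(3, false);
        for (int i = 1; i <= 5; i++) fifo.put(i, "event-" + i);
        System.out.println("FIFO window keys: " + fifo.keySet());
    }
}
```

A time-based window would apply the same hook but compare entry timestamps against a maximum age instead of counting entries.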
Web Session clustering
More information about Web Session Clustering can be found here.
Ignite In-Memory Data Fabric is capable of
caching web sessions of all Java Servlet
containers that follow Java Servlet 3.0
Specification, including Apache Tomcat,
Eclipse Jetty, Oracle WebLogic, and others.
• No need for sticky sessions provided by
the Load Balancer.
Hibernate L2 Cache
More information about Hibernate L2 cache can be found here.
Ignite In-Memory Data Fabric can be used as
Hibernate Second-Level cache (or L2 cache),
which can significantly speed up the
persistence layer of your application.
Hibernate is a well-known and widely used
framework for Object-Relational Mapping
(ORM). While interacting closely with an SQL
database, it performs caching of retrieved
data to minimize expensive database requests.
Spring Caching
More information about Spring Cache can be found here.
Ignite is shipped with SpringCacheManager - an implementation of Spring
Cache Abstraction. It provides an annotation-based way to enable caching
for Java methods so that the result of a method execution is stored in the
Ignite cache. Later, if the same method is called with the same set of
parameter values, the result will be retrieved from the cache instead of
actually executing the method.
Example:
private JdbcTemplate jdbc;

// The result is stored in the "averageSalary" cache, keyed by organizationId.
@Cacheable("averageSalary")
public long averageSalary(int organizationId) {
    String sql = "SELECT AVG(e.salary) FROM Employee e WHERE e.organizationId = ?";
    return jdbc.queryForObject(sql, Long.class, organizationId);
}
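What the annotation buys you can be shown without Spring or Ignite. The sketch below is a hand-written equivalent of "return the cached result if the same argument was seen before"; the class name, the in-memory map and the made-up salary formula are assumptions for illustration, and the real cache would live in the Ignite cluster rather than in a local map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class CacheableSketch {
    private final Map<Integer, Long> cache = new ConcurrentHashMap<>();
    private final AtomicInteger databaseCalls = new AtomicInteger();

    /** Stand-in for the expensive SQL query; the returned value is invented. */
    private long queryAverageSalary(int organizationId) {
        databaseCalls.incrementAndGet();
        return 1_000L * organizationId;
    }

    /** Runs the query only on a cache miss; later calls with the same id hit the cache. */
    long averageSalary(int organizationId) {
        return cache.computeIfAbsent(organizationId, this::queryAverageSalary);
    }

    /** How many times the underlying query actually ran. */
    int databaseCallCount() {
        return databaseCalls.get();
    }

    public static void main(String[] args) {
        CacheableSketch service = new CacheableSketch();
        service.averageSalary(7);
        service.averageSalary(7); // served from the cache
        System.out.println("database calls: " + service.databaseCallCount());
    }
}
```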
Spring Data
More information about Spring Data can be found here.
Spring Data Framework provides a unified and
widely used API that allows abstracting an
underlying data storage from the application
layer.
Spring Data helps you avoid lock-in to a specific
database vendor, making it easy to switch from
one database to another with minimal effort.
Apache Ignite implements the Spring Data
CrudRepository interface that not only supports
basic CRUD operations but also provides access
to the Apache Ignite SQL Grid via the unified
Spring Data API.
@RepositoryConfig(cacheName = "PersonCache")
public interface PersonRepository extends IgniteRepository<Person, Long> {
    /**
     * Gets all the persons with the given first name.
     * @param name Person first name.
     * @return A list of Persons with the given first name.
     */
    public List<Person> findByFirstName(String name);

    /**
     * Returns the top Person with the specified surname.
     * @param name Person surname.
     * @return Person that satisfies the query.
     */
    public Cache.Entry<Long, Person> findTopByLastNameLike(String name);
}
Apache Spark
More information about Ignite for Spark can be found here.
Apache Ignite provides an implementation of
the Spark RDD (Resilient Distributed Dataset)
abstraction which makes it easy to share
state in memory across Spark jobs. The main
difference between a native Spark RDD and
IgniteRDD is that IgniteRDD provides a
shared in-memory view on data across
different Spark jobs, workers, or
applications, while a native Spark RDD cannot
be seen by other Spark jobs or applications.
Other integrations
More information about integrations can be found here.
Apache Ignite integrates with:
• Hadoop
• Apache Cassandra
• PHP PDO – Data Objects
• MyBatis L2 Cache
• OSGi
Ignite Native Persistence
Ignite native persistence is a distributed,
ACID, and SQL-compliant disk store that
transparently integrates with Ignite's durable
memory. Ignite persistence is optional and can
be turned on and off. When turned off Ignite
becomes a pure in-memory store.
With the native persistence enabled, Ignite always stores a superset of data
on disk, and as much as possible in RAM. For example, if there are 100 entries
and RAM has the capacity to store only 20, then all 100 will be stored on disk
and only 20 will be cached in RAM for better performance.
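The 100-on-disk, 20-in-RAM example can be mimicked with two maps. The sketch below is a toy model under stated assumptions: a `HashMap` stands in for the disk store, the RAM tier is an entry-level LRU map, and the class name is hypothetical. Ignite's durable memory manages pages rather than individual entries, so this only illustrates the superset relationship.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class DiskSupersetSketch {
    private final Map<Integer, String> disk = new HashMap<>(); // stand-in for the disk store
    private final Map<Integer, String> ram;                    // bounded in-memory tier

    DiskSupersetSketch(final int ramCapacity) {
        // Access-ordered map that evicts the least recently used entry when over capacity.
        ram = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > ramCapacity;
            }
        };
    }

    /** Every write reaches the disk tier; RAM keeps as much as fits. */
    void put(int key, String value) {
        disk.put(key, value);
        ram.put(key, value);
    }

    /** Reads hit RAM first and fall back to disk, pulling the entry back into RAM. */
    String get(int key) {
        String value = ram.get(key);
        if (value == null) {
            value = disk.get(key);
            if (value != null) ram.put(key, value);
        }
        return value;
    }

    int onDisk() { return disk.size(); }

    int inRam() { return ram.size(); }

    public static void main(String[] args) {
        DiskSupersetSketch store = new DiskSupersetSketch(20);
        for (int i = 0; i < 100; i++) store.put(i, "v" + i);
        System.out.println(store.onDisk() + " entries on disk, " + store.inRam() + " in RAM");
    }
}
```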
More information about the Ignite Native Persistence can be found here.
3rd Party Persistence
The JCache specification comes with APIs for
javax.cache.integration.CacheLoader and
javax.cache.integration.CacheWriter which are used for read-through
and write-through from and to an underlying persistent storage
respectively (e.g. an RDBMS database like Oracle or MySQL, or NoSQL
database like MongoDB or Couchbase).
It supports:
• Read/Write Through
• Write-Behind
More information about the 3rd Party Persistence can be found here.
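The difference between read-through, write-through and write-behind can be sketched with a cache sitting in front of a map that plays the role of the database. This is not the JCache or Ignite API: the `CacheStoreSketch` class, its `flush` method and the boolean switch are assumptions made for this illustration, and a real write-behind store would flush on a timer or when a buffer fills.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

final class CacheStoreSketch {
    private final Map<String, String> database;                      // stand-in for the RDBMS
    private final Map<String, String> cache = new HashMap<>();
    private final Map<String, String> pending = new LinkedHashMap<>(); // queued write-behind updates
    private final boolean writeBehind;

    CacheStoreSketch(Map<String, String> database, boolean writeBehind) {
        this.database = database;
        this.writeBehind = writeBehind;
    }

    /** Read-through: a cache miss loads the value from the database. */
    String get(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = database.get(key);
            if (value != null) cache.put(key, value);
        }
        return value;
    }

    /** Write-through updates the database immediately; write-behind only queues the update. */
    void put(String key, String value) {
        cache.put(key, value);
        if (writeBehind) {
            pending.put(key, value);
        } else {
            database.put(key, value);
        }
    }

    /** Applies all queued updates to the database and returns how many were written. */
    int flush() {
        int written = pending.size();
        database.putAll(pending);
        pending.clear();
        return written;
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        CacheStoreSketch cache = new CacheStoreSketch(db, true);
        cache.put("k", "v");
        System.out.println("in database before flush: " + db.containsKey("k"));
        cache.flush();
        System.out.println("in database after flush:  " + db.containsKey("k"));
    }
}
```

Write-behind trades durability for throughput: updates that are still queued are lost if the node fails before they are flushed.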
Supported platforms & protocols
• Java
• .NET
• C++
• REST API
• Memcached
• Redis
• PHP
More information about the Platforms & Protocols can be found here.
Apache Ignite has a rich set of APIs that
are covered throughout the
documentation.
The APIs are implemented in the form of
native libraries that support major
languages such as Java, .NET and C++, as
well as a variety of protocols like REST,
Memcached, and Redis.
Automatic RDBMS Integration
More information about the Automatic RDBMS Integration can be found here.
SqlLine with version 2.3.0
More information about the sqlline tool can be found here.
Typical deployment for Apache Ignite
More information can be found here.
1. One or more applications connect to the Apache Ignite cluster in
order to manipulate the data in memory.
2. The application never communicates directly with the database.
3. Apache Ignite is responsible for synchronising the data with the database.
Legacy systems?
More information can be found here.
1. Existing legacy systems updating a database.
2. New systems that rely on Apache Ignite with 3rd Party
Persistence enabled.
3. How do we guarantee that stale data won’t reside on the Ignite
cluster for a long time and will be updated as soon as the
database receives updates from the legacy application?
Legacy systems. Possible solution 1.
More information can be found here.
Connect the legacy system to the Ignite cluster directly.
Pros:
1. Simple solution.
2. No added costs.
Cons:
1. Development is required in order
to make the transition from the
existing DB to Apache Ignite.
2. Complex PL/SQL stored
procedures need rewriting.
3. Many legacy applications.
Legacy systems. Possible solution 2 (Push).
Custom logic on the third party database that would propagate the
committed changes back to the Apache Ignite cluster.
Pros:
1. The data are replicated on time.
Cons:
1. Development cost.
Legacy systems. Possible solution 3 (GridGain
and Oracle GoldenGate Integrator).
Use a GridGain cluster and Oracle GoldenGate Integrator.
Pros:
1. No need to develop complex code.
Cons:
1. License costs ($).
More information can be found here.
Examples from Apache Ignite’s GitHub repo
1. Start up a cluster
2. Run the JdbcExample/modified and show the console online
3. CacheTransactionExample
4. CacheQueryExample
5. CacheDataStreamerExample
6. CacheContinuousQueryExample (Show partitions from the web console)
7. CacheAffinityExample (java 8)
8. ComputeClosureExample (java 8)
9. IgniteAtomicSequenceExample
10. MessagingExample
11. PersistenceStoreExample
Similar products/solutions
Hazelcast
Oracle Coherence
Pivotal GemFire
Terracotta
Gigaspaces
Redis
Detailed comparisons between GridGain and the
previous products can be found here.
Thank you!
Useful Resources
• https://github.com/apache/ignite
• https://www.gridgain.com/resources/documentation
• https://github.com/srecon/ignite-book-code-samples

Apache ignite v1.3

  • 1. æHow to make your DBMS 1000x faster 19/12/2017 –City College
  • 2. Presentation’s Overview Part 1 – Theory • What is the problem with RDBMS and NoSQL solutions • What is Apache Ignite • Main Features of Apache Ignite • Use cases and Integrations • Supported platforms Part 2 – Examples from Apache Ignite’s repository
  • 3. Scaling relational Databases is hard • RDBMS mainly scale up / vertically (bigger/faster machines) • Limited scalability compared with Big Data requirements • Shared all approach • Same data files are available to all nodes (instances) • Distributed locks required • Distributed network search is required when the data of an instance is not yet persisted into the file system. In these cases a network search is required for recent committed values. This approach is not scalable. Source: http://www.marklogic.com/blog/relational-databases-scale/
  • 4. NoSQL databases as a solution • Provide horizontal scalability with the use of shared nothing architectures and partitioning. • But the following functionalities are not easily supported in most NoSQL platforms: • Joins • Set operations (union/interest/minus) • Transactions • Full ANSI SQL support • Constraints as we know from the RDBMS
  • 5. What is Apache Ignite Apache Ignite is the in-memory computing platform that is durable, strongly consistent, and highly available with powerful SQL, key-value and processing APIs  Durable Memory  Ignite Persistence  ACID Compliance  Complete SQL Support  Key-Value  Collocated Processing  Scalability and Durability
  • 6. What you can do with Apache Ignite? Apache Ignite, is an in-memory computing platform that enables you to dramatically accelerate and scale out your existing data-intensive applications without ripping and replacing your existing databases. It can reduce query times by 1,000x versus disk-based systems. You can scale out by adding new nodes to your cluster, which can handle hundreds of terabytes of data from multiple databases.
  • 7. What you can do with Apache Ignite? (cont.) You can modernize your existing data-intensive architecture by inserting Apache Ignite between your existing application and database layers. Apache Ignite integrates seamlessly with RDBMS, NoSQL and Apache® Hadoop™ databases. It features a unified API which supports SQL, C++, .NET, PHP, MapReduce, JAVA/Scala/Groovy, and Node.js protocols for the application layer. Your Apache Ignite cluster, applications, and databases can run on premise, in a hybrid environment, or on a cloud platform such as AWS® or Microsoft Azure.
  • 8. Nikita Ivanov Founder and CTO at GridGain systems “You can buy a 10-blade server that has a terabyte of RAM for less than $25,000 (~year 2015). RAM does push up the initial price but because RAM’s lower power and cooling costs, and no moving parts to break, analysts say that the TCO (Total Cost of Ownership) for using RAM instead of rotating or solid- state storage as primary storage breaks even in about three years. And that's just looking at TCO, not including the delivered value from getting much faster processing performance.” Source: https://www.linux.com/news/gridgain-memory-data-fabric-becomes-apache- ignite
  • 9. Source: https://gist.github.com/jboner/2841832 Latency Comparison Numbers -------------------------- L1 cache reference 0.5 ns Branch mispredict 5 ns L2 cache reference 7 ns 14x L1 cache Mutex lock/unlock 25 ns Main memory reference 100 ns 20x L2 cache, 200x L1 cache Compress 1K bytes with Zippy 3,000 ns 3 us Send 1K bytes over 1 Gbps network 10,000 ns 10 us Read 4K randomly from SSD* 150,000 ns 150 us ~1GB/sec SSD Read 1 MB sequentially from memory 250,000 ns 250 us Round trip within same datacenter 500,000 ns 500 us Read 1 MB sequentially from SSD* 1,000,000 ns 1,000 us 1 ms ~1GB/sec SSD, 4X memory Disk seek 10,000,000 ns 10,000 us 10 ms 20x datacenter roundtrip Read 1 MB sequentially from disk 20,000,000 ns 20,000 us 20 ms 80x memory, 20X SSD Send packet CA->Netherlands->CA 150,000,000 ns 150,000 us 150 ms
  • 11. In-Memory Database (IMDB) Apache Ignite can be used as a distributed and horizontally scalable in-memory database (IMDB) that supports ACID transactions and can be used with SQL, key-value, compute, machine learning and other data processing APIs. One of the distinguishing characteristics of Ignite SQL is the support for distributed SQL JOINs, which works in both, collocated and non- collocated fashions. When collocated, the JOINs are executed on the local data available on each node without having to move large data sets across the network. Such collocated approach provides the best scalability and performance in distributed clusters. More information about the IMDB can be found here.
  • 12. Quiz 1 1. How many times is RAM faster than an SSD disk for 1MB sequential read? 1. 4 2. 10 3. 8 2. How many times is RAM faster than a typical HDD for 1MB sequential read? 1. 80 2. 100 3. 1000
  • 13. Quiz 2 1. Apache Ignite supports transactions? 1. Yes 2. No 2. Apache Ignite supports distributed SQL joins? 1. Yes 2. No
  • 14. Distributed SQL Database Apache Ignite is fully complaint with ANSI- 99 compliant, horizontally scalable and fault-tolerant distributed SQL database. The distribution is provided either by partitioning the data across cluster nodes or by full replication, depending on the use case. You can interact with Ignite as you would with any other SQL storage, using standard JDBC or ODBC connectivity. Ignite also provides native SQL APIs for Java, .NET and C++ developers for better performance. More information about the Distributed SQl DBs can be found here. Distributed Collocated SQL Query
  • 15. In-Memory Data Grid (IMDG) – Key-Value store The data grid has been built from the ground up to linearly scale to hundreds of nodes with strong semantics for data locality and affinity data routing to reduce redundant data noise. It can be viewed as a distributed partitioned hash map with every cluster node owning a portion of the overall data. This way the more cluster nodes we add, the more data we can cache. More information about the IMDG can be found here.
  • 16. Compute Grid Distributed computations are performed in parallel fashion to gain high performance, low latency, and linear scalability. Ignite compute grid provides a set of simple APIs that allow users distribute computations and data processing across multiple computers in the cluster. Distributed parallel processing is based on the ability to take any computation and execute it on any set of cluster nodes and return the results back. More information about the Compute grid can be found here.
  • 17. • Continuous availability of deployed services regardless of topology changes or crashes. • Automatically deploy any number of distributed service instances in the cluster. • Automatically deploy singletons, including cluster-singleton, node-singleton, or key-affinity- singleton. • Automatically deploy distributed services on node start-up by specifying them in the configuration. • Undeploy any of the deployed services. • Get information about service deployment topology within the cluster. • Create service proxy for accessing remotely deployed distributed services Service Grid Service Grid allows for deployments of arbitrary user-defined services on the cluster. You can implement and deploy any service, such as custom counters, ID generators, hierarchical maps, etc. More information about the Service grid can be found here.
  • 18. Distributed Data Structures Ignite allows for most of the data structures from java.util.concurrent framework to be used in a distributed fashion. Ignite gives you the capability to take a data structure you are familiar with and use it in a clustered fashion. For example, you can take java.util.concurrent.BlockingDeque and add something to it on one node and poll it from another node. Or have a distributed primary key generator which would guarantee uniqueness on all nodes. More information about the Distributed data Strucutres can be found here. • Queue and Set • Atomic Types • CountDownLatch • ID Generator • Semaphore
  • 19. Quiz 3 1. Is Apache Ignite able to scale out and in by simply adding/removing nodes from the cluster? 1. Yes 2. No 2. Does Apache Ignite have the concept of servers and clients? 1. Yes 2. No 3. Is it possible to manage an Apache Ignite cluster remotely? 1. Yes 2. No
  • 20. Data Streamers 1. Client nodes inject finite or continuous streams of data into Ignite caches using Ignite Data Streamers. 2. Data is automatically partitioned between Ignite data nodes, and each node gets an equal amount of data. 3. Streamed data can be concurrently processed directly on the Ignite data nodes in a collocated fashion. 4. Clients can also perform concurrent SQL queries on the streamed data. More information about the Data Streamers can be found here.
  • 21. Integration with major streaming technologies Apache Ignite integrates with major streaming technologies and frameworks in order to bring even more advanced streaming capabilities to Ignite-based architectures: 1. Kafka Streamer 2. Camel Streamer 3. JMS Streamer 4. MQTT Streamer 5. Storm Streamer 6. Flink Streamer 7. Twitter Streamer 8. Flume Streamer 9. ZeroMQ 10. RocketMQ More information about Integrating Ignite with Data Streamers can be found here.
  • 22. Messaging & Events Exchange custom messages between nodes across the cluster. Ignite distributed messaging allows for topic-based cluster-wide communication between all nodes. Messages with a specified message topic can be distributed to all nodes, or to a subgroup of nodes, that have subscribed to that topic. Ignite messaging is based on the publish-subscribe paradigm, where publishers and subscribers are connected by a common topic. When one of the nodes sends a message A for topic T, it is published on all nodes that have subscribed to T. More information about Messaging & Events can be found here.
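The publish-subscribe flow described above (message A sent for topic T reaches every subscriber of T) can be modelled in a few lines of plain Java. This in-process sketch is not the Ignite messaging API; the class `TopicBusSketch` and its methods are hypothetical, and each listener simply plays the role of a subscribed node.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** In-process model of topic-based messaging (hypothetical, not the Ignite API). */
public class TopicBusSketch {
    // Topic name -> listeners that subscribed to it.
    private final Map<String, List<Consumer<Object>>> subscribers = new ConcurrentHashMap<>();

    /** Registers a listener (a simulated node) for the given topic. */
    public void subscribe(String topic, Consumer<Object> listener) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>()).add(listener);
    }

    /** Delivers the message to every subscriber of the topic and returns the delivery count. */
    public int publish(String topic, Object message) {
        List<Consumer<Object>> listeners =
            subscribers.getOrDefault(topic, Collections.emptyList());
        for (Consumer<Object> listener : listeners) {
            listener.accept(message);
        }
        return listeners.size();
    }

    public static void main(String[] args) {
        TopicBusSketch bus = new TopicBusSketch();
        List<Object> node1 = new ArrayList<>();
        List<Object> node2 = new ArrayList<>();
        List<Object> node3 = new ArrayList<>();

        bus.subscribe("T", node1::add);
        bus.subscribe("T", node2::add);
        bus.subscribe("U", node3::add); // subscribed to a different topic

        int delivered = bus.publish("T", "A");
        System.out.println("delivered to " + delivered + " subscribers");
        System.out.println("node3 received: " + node3);
    }
}
```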
  • 23. Sliding Windows More information about Sliding Windows can be found here. Sliding windows are configured as Ignite cache eviction policies, and can be: • Time-based sliding windows • FIFO sliding windows • LRU sliding windows • Querying sliding windows
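Since sliding windows are expressed as eviction policies, a size-bounded map is a reasonable mental model. The sketch below uses the standard `LinkedHashMap` hook `removeEldestEntry` to show FIFO and LRU behaviour; it is a local illustration only, not Ignite's eviction policy classes, and the class name `SlidingWindowSketch` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Size-bounded window with FIFO or LRU eviction (illustrative, not the Ignite API). */
public class SlidingWindowSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxSize;

    /**
     * @param maxSize maximum number of entries kept in the window
     * @param lru     true for least-recently-used order, false for insertion (FIFO) order
     */
    public SlidingWindowSketch(int maxSize, boolean lru) {
        super(16, 0.75f, lru);
        this.maxSize = maxSize;
    }

    /** Called by LinkedHashMap after each insert; evicts the eldest entry once the window is full. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize;
    }

    public static void main(String[] args) {
        SlidingWindowSketch<Integer, String> fifo = new SlidingWindowSketch<>(3, false);
        for (int i = 1; i <= 5; i++) {
            fifo.put(i, "event-" + i);
        }
        System.out.println("FIFO window keys: " + fifo.keySet());

        SlidingWindowSketch<Integer, String> lru = new SlidingWindowSketch<>(3, true);
        lru.put(1, "a");
        lru.put(2, "b");
        lru.put(3, "c");
        lru.get(1);      // touching key 1 makes key 2 the least recently used
        lru.put(4, "d"); // evicts key 2
        System.out.println("LRU window keys: " + lru.keySet());
    }
}
```

A time-based window would follow the same shape, with the eviction test comparing entry timestamps against a cutoff instead of checking the size.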
  • 24. Web Session clustering More information about Web Session Clustering can be found here. Ignite In-Memory Data Fabric is capable of caching web sessions of all Java Servlet containers that follow the Java Servlet 3.0 Specification, including Apache Tomcat, Eclipse Jetty, Oracle WebLogic, and others. • No need for sticky sessions provided by the Load Balancer.
  • 25. Hibernate L2 Cache More information about Hibernate L2 cache can be found here. Ignite In-Memory Data Fabric can be used as a Hibernate Second-Level cache (or L2 cache), which can significantly speed up the persistence layer of your application. Hibernate is a well-known and widely used framework for Object-Relational Mapping (ORM). While interacting closely with an SQL database, it caches retrieved data to minimize expensive database requests.
  • 26. Spring Caching More information about Spring Cache can be found here. Ignite is shipped with SpringCacheManager - an implementation of Spring Cache Abstraction. It provides an annotation-based way to enable caching for Java methods so that the result of a method execution is stored in the Ignite cache. Later, if the same method is called with the same set of parameter values, the result will be retrieved from the cache instead of actually executing the method. Example:
      private JdbcTemplate jdbc;

      @Cacheable("averageSalary")
      public long averageSalary(int organizationId) {
          String sql = "SELECT AVG(e.salary) "
              + "FROM Employee e "
              + "WHERE e.organizationId = ?";
          return jdbc.queryForObject(sql, Long.class, organizationId);
      }
  • 27. Spring Data More information about Spring Data can be found here. Spring Data Framework provides a unified and widely used API that abstracts the underlying data storage from the application layer. Spring Data helps you avoid being locked into a specific database vendor, making it easy to switch from one database to another with minimal effort. Apache Ignite implements the Spring Data CrudRepository interface, which not only supports basic CRUD operations but also provides access to the Apache Ignite SQL Grid via the unified Spring Data API.
      @RepositoryConfig(cacheName = "PersonCache")
      public interface PersonRepository extends IgniteRepository<Person, Long> {
          /**
           * Gets all the persons with the given name.
           * @param name Person name.
           * @return A list of Persons with the given first name.
           */
          public List<Person> findByFirstName(String name);

          /**
           * Returns top Person with the specified surname.
           * @param name Person surname.
           * @return Person that satisfies the query.
           */
          public Cache.Entry<Long, Person> findTopByLastNameLike(String name);
      }
  • 28. Apache Spark More information about Ignite for Spark can be found here. Apache Ignite provides an implementation of the Spark RDD (Resilient Distributed Dataset) abstraction, which makes it easy to share state in memory across Spark jobs. The main difference between a native Spark RDD and IgniteRDD is that IgniteRDD provides a shared in-memory view on data across different Spark jobs, workers, or applications, while a native Spark RDD cannot be seen by other Spark jobs or applications.
  • 29. Other integrations More information about integrations can be found here. Apache Ignite integrates with: • Hadoop • Apache Cassandra • PHP PDO – Data Objects • MyBatis L2 Cache • OSGi
  • 30. Ignite Native Persistence Ignite native persistence is a distributed, ACID, and SQL-compliant disk store that transparently integrates with Ignite's durable memory. Ignite persistence is optional and can be turned on and off. When turned off, Ignite becomes a pure in-memory store. With native persistence enabled, Ignite always stores a superset of data on disk, and as much as possible in RAM. For example, if there are 100 entries and RAM has the capacity to store only 20, then all 100 will be stored on disk and only 20 will be cached in RAM for better performance. More information about the Ignite Native Persistence can be found here.
  • 31. 3rd Party Persistence The JCache specification comes with APIs for javax.cache.integration.CacheLoader and javax.cache.integration.CacheWriter, which are used for read-through from and write-through to an underlying persistent storage, respectively (e.g. an RDBMS like Oracle or MySQL, or a NoSQL database like MongoDB or Couchbase). It supports: • Read/Write Through • Write-Behind More information about the 3rd Party Persistence can be found here.
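Read-through and write-through can be sketched without any cache library. In the minimal model below, a loader function is consulted on a cache miss and a writer is invoked synchronously on every put; a plain `HashMap` plays the role of the external database. The class `ReadWriteThroughSketch` and its members are hypothetical and do not implement the JCache interfaces; write-behind (asynchronous, batched writes) is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

/** Minimal read-through / write-through cache model (illustrative, not JCache or Ignite). */
public class ReadWriteThroughSketch<K, V> {
    private final Map<K, V> memory = new HashMap<>();
    private final Function<K, V> loader;   // plays the role of a cache loader
    private final BiConsumer<K, V> writer; // plays the role of a cache writer
    private int loads;                     // number of times the store was read

    public ReadWriteThroughSketch(Function<K, V> loader, BiConsumer<K, V> writer) {
        this.loader = loader;
        this.writer = writer;
    }

    /** Read-through: on a miss, load from the underlying store and keep the value in memory. */
    public V get(K key) {
        V value = memory.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            if (value != null) {
                memory.put(key, value);
            }
        }
        return value;
    }

    /** Write-through: update the underlying store first, then the in-memory copy. */
    public void put(K key, V value) {
        writer.accept(key, value);
        memory.put(key, value);
    }

    public int loads() {
        return loads;
    }

    public static void main(String[] args) {
        Map<Integer, String> database = new HashMap<>(); // stand-in for an RDBMS table
        database.put(1, "Alice");

        ReadWriteThroughSketch<Integer, String> cache =
            new ReadWriteThroughSketch<Integer, String>(database::get, database::put);

        System.out.println(cache.get(1)); // first read goes to the store
        System.out.println(cache.get(1)); // second read is served from memory
        cache.put(2, "Bob");              // written to the store as well
        System.out.println("store reads: " + cache.loads() + ", store has key 2: " + database.containsKey(2));
    }
}
```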
  • 32. Supported platforms & protocols • Java • .NET • C++ • REST API • Memcached • Redis • PHP More information about the Platforms & Protocols can be found here. Apache Ignite has a rich set of APIs that are covered throughout the documentation. The APIs are implemented in the form of native libraries that support major languages such as Java, .NET and C++, as well as a variety of protocols like REST, Memcached, and Redis.
  • 33. Automatic RDBMS Integration More information about the Automatic RDBMS Integration can be found here.
  • 34. SqlLine with version 2.3.0 More information about the sqlline tool can be found here.
  • 35. Typical deployment for Apache Ignite More information can be found here. 1. One or more applications connect to the Apache Ignite cluster in order to manipulate the data in memory. 2. The application never communicates directly with the database. 3. Apache Ignite is responsible for synchronising the data.
  • 36. Legacy systems? More information can be found here. 1. Existing legacy systems update a database. 2. New systems rely on Apache Ignite with 3rd Party Persistence enabled. 3. How do we guarantee that stale data won’t reside on the Ignite cluster for a long time and will be updated as soon as the database receives updates from the legacy application?
  • 37. Legacy systems. Possible solution 1. Connect the legacy system to the Ignite cluster directly. 1. Development is required in order to make the transition from the existing DB to Apache Ignite. 2. Complex PL/SQL stored procedures need to be rewritten. 3. Many legacy applications. 1. Simple solution. 2. No added costs. More information can be found here.
  • 38. Legacy systems. Possible solution 2 (Push). Custom logic on the third party database that would propagate the committed changes back to the Apache Ignite cluster. 1. Development cost. 1. The data are replicated on time.
  • 39. Legacy systems. Possible solution 3 (GridGain and Oracle GoldenGate Integrator). Use a GridGain cluster and Oracle GoldenGate Integrator. 1. License costs ($). 1. No need to develop complex code. More information can be found here.
  • 40. Examples from Apache Ignite’s GitHub repo 1. Start up a cluster 2. Run the JdbcExample/modified and show the console online 3. CacheTransactionExample 4. CacheQueryExample 5. CacheDataStreamerExample 6. CacheContinuousQueryExample (Show partitions from the web console) 7. CacheAffinityExample (java 8) 8. ComputeClosureExample (java 8) 9. IgniteAtomicSequenceExample 10. MessagingExample 11. PersistenceStoreExample
  • 41. Similar products/solutions • Hazelcast • Oracle Coherence • Pivotal GemFire • Terracotta • GigaSpaces • Redis Detailed comparisons between GridGain and the previous products can be found here.
  • 42. Thank you! Useful Resources • https://github.com/apache/ignite • https://www.gridgain.com/resources/documentation • https://github.com/srecon/ignite-book-code-samples

Editor's Notes

  • #35:
      sqlline.bat --color=true --verbose=true -u jdbc:ignite:thin://127.0.0.1/
      create table city(id long primary key, name varchar) with "template=replicated";
      create table person (id long, name varchar, city_id long, primary key(id, city_id)) with "backups=1, affinityKey=city_id";
      create index idx_city_name on city(name);
      create index idx_person_name on Person(name);
      !tables
      insert into city (id, name) values(1, 'Forest Hill');
      insert into city (id, name) values(2, 'Denver');
      insert into city (id, name) values(3, 'St. Petersburg');
      insert into person (id, name, city_id) values (1, 'John Doe', 3);
      insert into person (id, name, city_id) values (2, 'Rob Chen', 2);
      insert into person (id, name, city_id) values (3, 'Mary Davis', 1);
      insert into person (id, name, city_id) values (4, 'Richard Miles', 2);
      select p.name, c.name from Person p, city c where p.city_id = c.id and c.name='Denver';