Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon in
making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole discretion of Oracle.
Copyright @ 2018 Oracle and/or its affiliates. All rights reserved.
about.me/lefred
Who am I ?
Frédéric Descamps
@lefred
MySQL Evangelist
Managing MySQL since 3.23
devops believer
living in Belgium 🇧🇪
https://lefred.be
Group Replication: heart of MySQL InnoDB Cluster
MySQL Group Replication
but what is it ?!?
GR is a plugin for MySQL, made by MySQL and packaged with MySQL
GR is an implementation of Replicated Database State Machine theory
Paxos-based protocol (our implementation is close to Mencius)
GR allows writing on all group members (cluster nodes) simultaneously while
retaining consistency
GR implements conflict detection and resolution
GR allows automatic distributed recovery
Supported on all MySQL platforms!
Linux, Windows, Solaris, OSX, FreeBSD
terminology
Write vs Writeset
Let's illustrate a table:
Now let's make a change
and at commit time:
Writesets
Contain the hashes of the primary keys of the changed rows and, in some cases, the hashes of
foreign keys or other dependencies that need to be captured (e.g. non-NULL unique keys).
Writesets are gathered during transaction execution.
Writes
Also called write values, these are the actual changes. Write values are also gathered
during transaction execution.
Writeset - examples
+-------+-----------+------+-----+---------+-------+
| Field | Type      | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| id    | binary(1) | NO   | PRI | NULL    |       |
| name  | binary(2) | YES  |     | NULL    |       |
+-------+-----------+------+-----+---------+-------+
mysql> insert into t2 values (1,2);
pke: PRIMARY | test | t2 | 1 | 1 hash: 11853456929268668462
mysql> update t2 set name=3 where id=1;
pke: PRIMARY | test | t2 | 1 | 1 hash: 10002085147685770725
pke: PRIMARY | test | t2 | 1 | 1 hash: 10002085147685770725
Writeset - examples (2)
+-------+-----------+------+-----+---------+-------+
| Field | Type      | Null | Key | Default | Extra |
+-------+-----------+------+-----+---------+-------+
| id    | binary(1) | NO   | PRI | NULL    |       |
| name  | binary(2) | YES  | UNI | NULL    |       |
| name2 | binary(1) | YES  |     | NULL    |       |
+-------+-----------+------+-----+---------+-------+
mysql> insert into t3 values (1,2,3);
pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853
pke: name | test | t3 | 2 hash: 11034644986657565827
mysql> update t3 set name=3 where id=1;
[after image]
pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853
pke: name | test | t3 | 3 hash: 18082071075512932388
[before image]
pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853
pke: name | test | t3 | 2 hash: 11034644986657565827
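The way a writeset is collected from the row images can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `pke_hash` and `writeset` are hypothetical helper names, the key string layout is simplified, and BLAKE2b truncated to 64 bits stands in for the server's real hash function, so the values will not match the hashes printed above.

```python
import hashlib

def pke_hash(index, schema, table, *key_values):
    """Hash one key entry (pke) to a 64-bit integer.

    Assumption: BLAKE2b truncated to 8 bytes is a stand-in for the
    server's hash function; the pke string layout is simplified too.
    """
    pke = "|".join([index, schema, table, *map(str, key_values)])
    digest = hashlib.blake2b(pke.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def writeset(schema, table, unique_keys, before=None, after=None):
    """Collect the hashes of every non-NULL unique key in both row images."""
    hashes = set()
    for image in (before, after):
        if image is None:  # INSERT has no before image, DELETE has no after image
            continue
        for index, columns in unique_keys.items():
            values = [image[c] for c in columns]
            if any(v is None for v in values):  # NULL unique keys are skipped
                continue
            hashes.add(pke_hash(index, schema, table, *values))
    return hashes

# Table t3 from the example: primary key on id, unique key on name.
T3_KEYS = {"PRIMARY": ["id"], "name": ["name"]}

insert_ws = writeset("test", "t3", T3_KEYS,
                     after={"id": 1, "name": 2, "name2": 3})
update_ws = writeset("test", "t3", T3_KEYS,
                     before={"id": 1, "name": 2, "name2": 3},
                     after={"id": 1, "name": 3, "name2": 3})
```

The update produces three distinct hashes (the unchanged primary key, the old `name` and the new `name`), which matches the before and after images listed above once the repeated primary key entry is counted a single time.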
GR is nice, but how does it work ?
it's just ...
... no, in fact writeset replication is synchronous, and then certification and applying of
the changes are local to each node and asynchronous.
Not that easy to understand... right? As a picture is worth a thousand words, let's illustrate
this...
MySQL Group Replication
MySQL Group Replication (full transaction)
MySQL Group Communication System (GCS)
MySQL Xcom protocol
Replicated Database State Machine
Paxos based protocol
its task: deliver messages across the distributed system:
atomically
in Total Order
MySQL Group Replication receives the Ordered 'tickets' from this GCS subsystem.
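The guarantee described here (every member receives the same messages in the same order) can be illustrated with a toy model. Note the loud assumption: a single in-process `Sequencer` handing out tickets is only a stand-in for the guarantee; it is not how a Paxos-based protocol such as XCom reaches agreement. `Sequencer` and `Member` are hypothetical names.

```python
class Sequencer:
    """Stand-in for the GCS layer: one global ticket per broadcast message."""

    def __init__(self):
        self.log = []

    def broadcast(self, sender, payload):
        ticket = len(self.log) + 1
        self.log.append((ticket, sender, payload))
        return ticket


class Member:
    """A group member that consumes the shared log in ticket order."""

    def __init__(self, name):
        self.name = name
        self.delivered = []

    def deliver_all(self, log):
        # Pick up where this member stopped, so order is never changed.
        for entry in log[len(self.delivered):]:
            self.delivered.append(entry)


gcs = Sequencer()
members = [Member(n) for n in ("node1", "node2", "node3")]

# Three members send transactions concurrently; the GCS layer orders them.
gcs.broadcast("node1", "trx A")
gcs.broadcast("node3", "trx B")
gcs.broadcast("node2", "trx C")

for m in members:
    m.deliver_all(gcs.log)
```

Whatever the sender, each member ends up with an identical, identically ordered list of tickets, which is what lets later steps run locally on each node without further communication.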
Total Order
GTID generation
How does Group Replication handle GTIDs ?
There are two ways of generating GTIDs:
AUTOMATIC: the transaction is assigned an automatically generated ID at commit
time. Where regular replication uses the source server's UUID, Group Replication
uses the group name.
ASSIGNED: the user manually assigns a GTID to the transaction through SET GTID_NEXT.
This is common to any replication format, and the ID is assigned before
the transaction starts.
Group Replication : Total Order Delivery - GTID
Group Replication : GTID
The previous example was not totally in sync with reality. In fact, a writer allocates a
block of GTIDs, and when we have multiple writers (multi-primary mode) each writer uses
GTID sequence numbers from its own allocated block.
The size of the block is defined by
group_replication_gtid_assignment_block_size (default: 1M)
Example:
Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355
New write on another node:
Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354
Let's write on the third node:
Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354:2000354
And writing back on the first one:
Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-356:1000354:2000354
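The block behaviour in this example can be reproduced with a small simulation. It is a sketch under assumptions, not the server's allocation code: `GtidAllocator` and `format_gtid_set` are hypothetical names, blocks are reserved on a member's first write, and the first free sequence number (354) is chosen only so that the numbers line up with the example above.

```python
class GtidAllocator:
    """Hands each writer its own block of GTID sequence numbers."""

    def __init__(self, block_size, first_free):
        self.block_size = block_size
        self.next_block_start = first_free
        self.blocks = {}  # member -> (next number to use, end of block, exclusive)

    def next_gno(self, member):
        nxt, end = self.blocks.get(member, (0, 0))
        if nxt >= end:  # no block yet, or block exhausted: reserve a new one
            nxt = self.next_block_start
            end = nxt + self.block_size
            self.next_block_start = end
        self.blocks[member] = (nxt + 1, end)
        return nxt


def format_gtid_set(uuid, gnos):
    """Render sequence numbers as uuid:interval[:interval...]."""
    intervals = []
    for g in sorted(gnos):
        if intervals and g == intervals[-1][1] + 1:
            intervals[-1][1] = g  # extend the current interval
        else:
            intervals.append([g, g])
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in intervals]
    return uuid + ":" + ":".join(parts)


GROUP = "0b5c746d-d552-11e8-bef0-08002718d305"
executed = set(range(1, 354))  # assumption: 1-353 were executed before blocks were handed out
alloc = GtidAllocator(block_size=1_000_000, first_free=354)

for writer in ("node1", "node1", "node2", "node3", "node1"):
    executed.add(alloc.next_gno(writer))

final_set = format_gtid_set(GROUP, executed)
```

With these inputs `final_set` ends in `1-356:1000354:2000354`: the first node keeps filling its own block while the two other writers each start a new block one block size further on.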
done !
Return from Commit
Group Replication: return from commit
Asynchronous Replication:
Group Replication: return from commit (2)
Semi-Sync Replication:
Group Replication: return from commit (3)
Group Replication:
Does this mean we can have a distant node and
always let it ack later ?
NO!
Because the system has to wait for the noop (single skip message) from the "distant"
node, where latency is higher.
The size of the GCS consensus message window can be read and set with UDFs:
group_replication_get_write_concurrency(), group_replication_set_write_concurrency()
mysql> select group_replication_get_write_concurrency();
+-------------------------------------------+
| group_replication_get_write_concurrency() |
+-------------------------------------------+
|                                        10 |
+-------------------------------------------+
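Why a distant node cannot simply acknowledge later can be shown with a toy model of Mencius-style slot ownership. Everything here is an assumption made for illustration: slots are owned round-robin, every slot is proposed at time 0, and the latencies are invented; it does not describe XCom's actual implementation. `delivery_times` is a hypothetical helper.

```python
from itertools import accumulate

def delivery_times(decide_times):
    """A slot is delivered only once it and every earlier slot are decided."""
    return list(accumulate(decide_times, max))

# Time (ms) until a slot owned by each member is decided. node3 is the
# distant member; it has nothing to send, so its slots are filled by noops.
latency_ms = {"node1": 1, "node2": 1, "node3": 50}

# Round-robin slot ownership across the three members.
owners = ["node1", "node2", "node3", "node1", "node2", "node3"]

decided = [latency_ms[o] for o in owners]
delivered = delivery_times(decided)
```

In this model the slots of the two local nodes that come after the distant node's slot are decided after 1 ms but only delivered after 50 ms, because total order forbids skipping over the undecided slot.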
Event Horizon
GCS Write Consensus Concurrency
group replication write concurrency
conflict
Optimistic Locking
Group Replication : Optimistic Locking
Group Replication uses optimistic locking
during a transaction, local (InnoDB) locking happens
optimistically assumes there will be no conflicts across nodes
(no communication between nodes necessary)
cluster-wide conflict resolution happens only at COMMIT, during certification
Let's first have a look at traditional locking to compare.
Traditional locking
Optimistic Locking
The system returns error 149 as certification failed:
ERROR 1180 (HY000): Got error 149 during COMMIT
Such conflicts happen only when using a multi-primary group!
(Not totally true in MySQL < 8.0.13 when a failover happens.)
Drawbacks of optimistic locking
having a first-committer-wins system means conflicts will more likely happen when
writing on multiple members with:
large transactions
long running transactions
hotspot records
can the transaction be committed ?
Certification
Certification
Certification is the process that only needs to answer one question:
can the write (transaction) be committed ?
based on transactions that are yet to be applied
such conflicts must come from other members/nodes
happens on every member/node and is deterministic
results are not reported to the group (does not require a new communication step)
pass: commit/queue to apply
fail: rollback/drop the transaction
serialized by the total order in GCS/XCom + GTID
cost is based on trx size (# rows & # keys)
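The rules above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the plugin's code: `Transaction`, `Certifier`, `snapshot_version` and the integer position in the total order are names invented for this illustration, and row hashes are shown as plain strings.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    name: str
    writeset: frozenset      # hashes of the keys changed (plain strings here)
    snapshot_version: int    # last total-order position seen when the trx executed


@dataclass
class Certifier:
    """Hypothetical per-member certifier; every member runs the same logic."""
    last_writer: dict = field(default_factory=dict)  # row hash -> position of last certified writer
    position: int = 0                                # delivery position in the total order

    def certify(self, trx: Transaction) -> bool:
        # Transactions are delivered in the same total order on every member.
        self.position += 1
        for row_hash in trx.writeset:
            # Conflict: the row was changed by a transaction that this
            # transaction had not yet seen when it executed.
            if self.last_writer.get(row_hash, 0) > trx.snapshot_version:
                return False      # fail: rollback (origin) / drop (other members)
        for row_hash in trx.writeset:
            self.last_writer[row_hash] = self.position
        return True               # pass: commit (origin) / queue to apply (other members)


def run_member(ordered_trxs):
    certifier = Certifier()
    return [certifier.certify(trx) for trx in ordered_trxs]


# t1 and t2 executed concurrently on the same snapshot and both touch row "t1:2".
t1 = Transaction("t1", frozenset({"t1:2", "t1:3"}), snapshot_version=0)
t2 = Transaction("t2", frozenset({"t1:2"}), snapshot_version=0)
t3 = Transaction("t3", frozenset({"t1:9"}), snapshot_version=0)

# The same input order gives the same verdicts on each member,
# so no extra message is needed to agree on the result.
member_a = run_member([t1, t2, t3])
member_b = run_member([t1, t2, t3])
print(member_a, member_a == member_b)
```

In this model the cost of `certify` grows with the number of keys in the writeset, which matches the last point above.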
Houston we have a problem !
Flow Control
In Group Replication, every member sends statistics about its queues (applier queue and
certification queue) to the other members. Every node then decides whether to slow down
when it sees that one node has reached the threshold for one of its queues.
So when group_replication_flow_control_mode is set to QUOTA on the
node seeing that one of the other members of the cluster is lagging behind (threshold
reached), it throttles its write operations to a quota that is calculated from
the number of transactions applied in the last second, and then reduced further
by subtracting the “over the quota” messages from the last period.
This means that the decision to throttle is NOT made on the node being slow: the node
writing a transaction checks its own flow control threshold values and compares them to the
statistics from the other nodes to decide whether to throttle.
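A minimal sketch of that decision, assuming each member's statistics are available as simple records. `MemberStats`, `lagging_members` and `write_quota` are hypothetical names, taking the minimum over the lagging members is an assumption, and the other flow control variables are ignored here.

```python
from dataclasses import dataclass

# Thresholds for the applier and certification queues (25000 each in this sketch).
APPLIER_THRESHOLD = 25000
CERTIFIER_THRESHOLD = 25000


@dataclass
class MemberStats:
    """Statistics one member shares with the group each period (hypothetical shape)."""
    member: str
    applier_queue: int        # transactions waiting to be applied
    certifier_queue: int      # transactions waiting to be certified
    applied_last_period: int  # transactions applied during the last period


def lagging_members(stats):
    """Members whose applier or certification queue is over its threshold."""
    return [s for s in stats
            if s.applier_queue > APPLIER_THRESHOLD
            or s.certifier_queue > CERTIFIER_THRESHOLD]


def write_quota(stats, over_quota_last_period=0):
    """Quota for the next period on the writing member, or None for no throttling.

    Simplified: start from what the slowest lagging member applied in the
    last period, then subtract what was sent over the quota in that period.
    """
    lagging = lagging_members(stats)
    if not lagging:
        return None
    base = min(s.applied_last_period for s in lagging)
    return max(base - over_quota_last_period, 0)


stats = [
    MemberStats("node1", applier_queue=10, certifier_queue=0, applied_last_period=900),
    MemberStats("node2", applier_queue=30000, certifier_queue=0, applied_last_period=400),
    MemberStats("node3", applier_queue=50, certifier_queue=0, applied_last_period=850),
]
# node2 is over the applier threshold, so the writer throttles itself.
print(write_quota(stats, over_quota_last_period=50))
```

Note that `write_quota` runs on the writer, using the statistics it received, which is the point made in the last paragraph above.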
Flow Control - on writer
[Figure: flow control on the writer, labelled "> quota"]
Flow Control - on all members
[Figure: flow control on all members]
Flow Control - configuration variables
As of MySQL 8.0.13:
+-----------------------------------------------------+-------+
| Variable_name                                       | Value |
+-----------------------------------------------------+-------+
| group_replication_flow_control_applier_threshold    | 25000 |
| group_replication_flow_control_certifier_threshold  | 25000 |
| group_replication_flow_control_hold_percent         | 10    |
| group_replication_flow_control_max_quota            | 0     |
| group_replication_flow_control_member_quota_percent | 0     |
| group_replication_flow_control_min_quota            | 0     |
| group_replication_flow_control_min_recovery_quota   | 0     |
| group_replication_flow_control_mode                 | QUOTA |
| group_replication_flow_control_period               | 1     |
| group_replication_flow_control_release_percent      | 50    |
+-----------------------------------------------------+-------+
transaction's lifecycle in Group Replication
Summary
begin;
update table1 set c = 999 where id = 2;
update table1 set b = "eee" where id = 3;
commit;
[Diagram, built up step by step:]
client blocks on commit...
writesets + gtid_event + write values are sent to the group
certify (on each of the three members)
commit finalized on the originating member: + GTID, bin log
on the other members: + GTID, relay log
on the other members: bin log
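The steps of the diagram can be replayed with a toy model. Everything here is an assumption made for illustration: the `Member` class, the `commit` function, the three node names and the GTID string are hypothetical, certification always passes because there is no concurrent transaction, and the apply step runs inline although it is asynchronous on a real member.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical model of one group member and its two logs."""
    name: str
    relay_log: list = field(default_factory=list)
    binlog: list = field(default_factory=list)

    def certify(self, writeset):
        # No concurrent transaction in this walk-through, so nothing conflicts.
        return True


def commit(origin, group, writeset, write_values, gtid):
    """Trace one transaction from COMMIT on `origin` to the binlog of every member."""
    trace = [f"{origin.name}: client blocks on commit"]
    trace.append(f"{origin.name}: send writesets + gtid_event + write values")
    # Every member, origin included, certifies the same message locally.
    verdicts = [member.certify(writeset) for member in group]
    trace += [f"{member.name}: certify" for member in group]
    if not all(verdicts):
        trace.append(f"{origin.name}: rollback")
        return trace
    # Origin: the commit is finalized and written to its binary log.
    origin.binlog.append((gtid, write_values))
    trace.append(f"{origin.name}: commit finalized, + GTID, bin log")
    others = [member for member in group if member is not origin]
    for member in others:
        member.relay_log.append((gtid, write_values))
        trace.append(f"{member.name}: + GTID, relay log")
    for member in others:
        # Applying is asynchronous on a real member; done inline here.
        member.binlog.append(member.relay_log.pop(0))
        trace.append(f"{member.name}: applied, bin log")
    return trace


group = [Member("node1"), Member("node2"), Member("node3")]
trace = commit(
    origin=group[0],
    group=group,
    writeset={"table1:2", "table1:3"},      # keys of the two updated rows
    write_values=["c = 999", 'b = "eee"'],
    gtid="group-uuid:1",                    # placeholder, not a real GTID
)
print("\n".join(trace))
```

Printing the trace gives one line per step, in the same order as the diagram builds them.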
Thank you !
Any Questions ?
Percona Live Europe 2018 MySQL Group Replication... the magic explained

  • 3.   Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purpose only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied up in making purchasing decisions. The development, release and timing of any features or functionality described for Oracle´s product remains at the sole discretion of Oracle. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 3 / 168
  • 4. about.me/lefred Who am I ? Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 4 / 168
  • 5. Frédéric Descamps @lefred MySQL Evangelist Managing MySQL since 3.23 devops believer living in Belgium 🇧🇪 https://lefred.be   Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 5 / 168
  • 6. Group Replication: heart of MySQL InnoDB Cluster Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 6 / 168
  • 7. Group Replication: heart of MySQL InnoDB Cluster Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 7 / 168
  • 8. MySQL Group Replication but what is it ?!? Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 8 / 168
  • 9. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 9 / 168
  • 10. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 10 / 168
  • 11. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos based protocol (our implementation is close to Mencius) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 11 / 168
  • 12. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos based protocol (our implementation is close to Mencius) GR allows to write on all Group Members (cluster nodes) simultaneously while retaining consistency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 12 / 168
  • 13. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos based protocol (our implementation is close to Mencius) GR allows to write on all Group Members (cluster nodes) simultaneously while retaining consistency GR implements conflict detection and resolution Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 13 / 168
  • 14. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos based protocol (our implementation is close to Mencius) GR allows to write on all Group Members (cluster nodes) simultaneously while retaining consistency GR implements conflict detection and resolution GR allows automatic distributed recovery Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 14 / 168
  • 15. MySQL Group Replication but what is it ?!? GR is a plugin for MySQL, made by MySQL and packaged with MySQL GR is an implementation of Replicated Database State Machine theory Paxos based protocol (our implementation is close to Mencius) GR allows to write on all Group Members (cluster nodes) simultaneously while retaining consistency GR implements conflict detection and resolution GR allows automatic distributed recovery Supported on all MySQL platforms !! Linux, Windows, Solaris, OSX, FreeBSD Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 15 / 168
  • 16. terminology Write vs Writeset Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 16 / 168
  • 17. Let's illustrate a table: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 17 / 168
  • 18. Now let's make a change Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 18 / 168
  • 19. and at commit time: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 19 / 168
  • 20. Writesets Contain the hash for the rows PKs that are changed and in some cases the hashes of foreign keys or others dependencies that need to be captured (e.g. non NULL UKs). Writesets are gathered during transaction execution. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 20 / 168
  • 21. Writesets Contain the hash for the rows PKs that are changed and in some cases the hashes of foreign keys or others dependencies that need to be captured (e.g. non NULL UKs). Writesets are gathered during transaction execution. Writes Called also write values, refers to the actual changes. Write values are also gathered during transaction execution. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 21 / 168
  • 22. Writeset - examples +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 22 / 168
  • 23. Writeset - examples +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t2 values (1,2); Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 23 / 168
  • 24. Writeset - examples +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t2 values (1,2); pke: PRIMARY | test | t2 | 1 | 1 hash: 11853456929268668462 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 24 / 168
  • 25. Writeset - examples +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t2 values (1,2); pke: PRIMARY | test | t2 | 1 | 1 hash: 11853456929268668462 mysql> update t2 set name=3 where id=1; Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 25 / 168
  • 26. Writeset - examples +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t2 values (1,2); pke: PRIMARY | test | t2 | 1 | 1 hash: 11853456929268668462 mysql> update t2 set name=3 where id=1; pke: PRIMARY | test | t2 | 1 | 1 hash: 10002085147685770725 pke: PRIMARY | test | t2 | 1 | 1 hash: 10002085147685770725 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 26 / 168
  • 27. Writeset - examples (2) +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | UNI | NULL | | | name2 | binary(1) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 27 / 168
  • 28. Writeset - examples (2) +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | UNI | NULL | | | name2 | binary(1) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t3 values (1,2,3); Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 28 / 168
  • 29. Writeset - examples (2) +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | UNI | NULL | | | name2 | binary(1) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t3 values (1,2,3); pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853 pke: name | test |t3 | 2 hash: 11034644986657565827 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 29 / 168
  • 30. Writeset - examples (2) +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | UNI | NULL | | | name2 | binary(1) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t3 values (1,2,3); pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853 pke: name | test |t3 | 2 hash: 11034644986657565827 mysql> update t3 set name=3 where id=1; Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 30 / 168
  • 31. Writeset - examples (2) +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | UNI | NULL | | | name2 | binary(1) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t3 values (1,2,3); pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853 pke: name | test |t3 | 2 hash: 11034644986657565827 mysql> update t3 set name=3 where id=1; pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853 pke: name | test | t3 | 3 hash: 18082071075512932388 pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853 pke: name | test | t3 | 2 hash: 11034644986657565827 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 31 / 168
  • 32. Writeset - examples (2) +-------+-----------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +-------+-----------+------+-----+---------+-------+ | id | binary(1) | NO | PRI | NULL | | | name | binary(2) | YES | UNI | NULL | | | name2 | binary(1) | YES | | NULL | | +-------+-----------+------+-----+---------+-------+ mysql> insert into t3 values (1,2,3); pke: PRIMARY | test |t3 | 1 | 1 hash: 79134815725924853 pke: name | test |t3 | 2 hash: 11034644986657565827 mysql> update t3 set name=3 where id=1; pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853 pke: name | test | t3 | 3 hash: 18082071075512932388 pke: PRIMARY | test | t3 | 1 | 1 hash: 79134815725924853 pke: name | test | t3 | 2 hash: 11034644986657565827 [after image] [before image] Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 32 / 168
  • 33. GR is nice, but how does it work ? Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 33 / 168
  • 34. GR is nice, but how does it work ? it´s just ... Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 34 / 168
  • 35. GR is nice, but how does it work ? it´s just ... Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 35 / 168
  • 36. GR is nice, but how does it work ? it´s just ... ... no, in fact the writesets replication is synchronous and then certification and apply of the changes are local to each nodes and asynchronous. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 36 / 168
  • 37. GR is nice, but how does it work ? it´s just ... ... no, in fact the writesets replication is synchronous and then certification and apply of the changes are local to each nodes and asynchronous. not that easy to understand... right ? As a picture is worth a 1000 words, let´s illustrate this... Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 37 / 168
  • 38. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 38 / 168
  • 39. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 39 / 168
  • 40. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 40 / 168
  • 41. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 41 / 168
  • 42. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 42 / 168
  • 43. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 43 / 168
  • 44. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 44 / 168
  • 45. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 45 / 168
  • 46. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 46 / 168
  • 47. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 47 / 168
  • 48. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 48 / 168
  • 49. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 49 / 168
  • 50. MySQL Group Replication Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 50 / 168
  • 51. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 51 / 168
  • 52. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 52 / 168
  • 53. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 53 / 168
  • 54. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 54 / 168
  • 55. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 55 / 168
  • 56. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 56 / 168
  • 57. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 57 / 168
  • 58. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 58 / 168
  • 59. MySQL Group Replication (full transaction) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 59 / 168
  • 60. MySQL Group Communication System (GCS) MySQL Xcom protocol Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 60 / 168
  • 61. MySQL Group Communication System (GCS) MySQL Xcom protocol Replicated Database State Machine Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 61 / 168
  • 62. MySQL Group Communication System (GCS) MySQL Xcom protocol Replicated Database State Machine Paxos based protocol Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 62 / 168
  • 63. MySQL Group Communication System (GCS) MySQL Xcom protocol Replicated Database State Machine Paxos based protocol its task: deliver messages across the distributed system: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 63 / 168
  • 64. MySQL Group Communication System (GCS) MySQL Xcom protocol Replicated Database State Machine Paxos based protocol its task: deliver messages across the distributed system: atomically Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 64 / 168
  • 65. MySQL Group Communication System (GCS) MySQL Xcom protocol Replicated Database State Machine Paxos based protocol its task: deliver messages across the distributed system: atomically in Total Order Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 65 / 168
  • 66. MySQL Group Communication System (GCS) MySQL Xcom protocol Replicated Database State Machine Paxos based protocol its task: deliver messages across the distributed system: atomically in Total Order MySQL Group Replication receives the Ordered 'tickets' from this GCS subsystem. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 66 / 168
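A minimal sketch of what "in Total Order" buys the upper layer, assuming the consensus round has already attached a ticket number to every message. The `Replica` class and its fields are hypothetical names for illustration; this is not XCom or Paxos, and it does not model atomic (all-or-nothing) delivery. It only shows that members receiving the same tickets in different network orders still deliver the same sequence.

```python
class Replica:
    """Delivers messages strictly in ticket order, whatever order they arrive in."""

    def __init__(self):
        self.next_ticket = 1   # next ticket this replica is allowed to deliver
        self.pending = {}      # ticket -> message, received but not yet deliverable
        self.delivered = []    # messages handed to the upper layer, in order

    def receive(self, ticket, message):
        self.pending[ticket] = message
        # Deliver every message whose predecessors have all been delivered.
        while self.next_ticket in self.pending:
            self.delivered.append(self.pending.pop(self.next_ticket))
            self.next_ticket += 1


# Tickets are assumed to come out of the consensus layer.
ordered = [(1, "trx A"), (2, "trx B"), (3, "trx C")]

r1, r2 = Replica(), Replica()
for ticket, msg in ordered:             # arrives in order on r1
    r1.receive(ticket, msg)
for ticket, msg in reversed(ordered):   # arrives reversed on r2
    r2.receive(ticket, msg)

print(r1.delivered == r2.delivered)
```

Because every member hands the same sequence to Group Replication, any deterministic decision taken on that sequence gives the same result everywhere.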
  • 67. Total Order GTID generation Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 67 / 168
  • 68. How does Group Replication handle GTIDs ? There are two ways of generating GTIDs: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 68 / 168
  • 69. How does Group Replication handle GTIDs ? There are two ways of generating GTIDs: AUTOMATIC: the transaction is assigned an automatically generated ID during commit. Where regular replication uses the source server UUID, Group Replication uses the group name. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 69 / 168
  • 70. How does Group Replication handle GTIDs ? There are two ways of generating GTIDs: AUTOMATIC: the transaction is assigned an automatically generated ID during commit. Where regular replication uses the source server UUID, Group Replication uses the group name. ASSIGNED: the user manually assigns a GTID to the transaction through SET GTID_NEXT. This is common to any replication format, and the ID is assigned before the transaction starts. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 70 / 168
  • 71. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 71 / 168
  • 72. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 72 / 168
  • 73. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 73 / 168
  • 74. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 74 / 168
  • 75. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 75 / 168
  • 76. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 76 / 168
  • 77. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 77 / 168
  • 78. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 78 / 168
  • 79. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 79 / 168
  • 80. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 80 / 168
  • 81. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 81 / 168
  • 82. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 82 / 168
  • 83. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 83 / 168
  • 84. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 84 / 168
  • 85. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 85 / 168
  • 86. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 86 / 168
  • 87. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 87 / 168
  • 88. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 88 / 168
  • 89. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 89 / 168
  • 90. Group Replication : Total Order Delivery - GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 90 / 168
  • 91. Group Replication : GTID The previous example was not totally in sync with reality. In fact, a writer allocates a block of GTIDs, and when we have multiple writers (multi-primary mode) each writer uses GTID sequence numbers from its own allocated block. The size of the block is defined by group_replication_gtid_assignment_block_size (default: 1000000). Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 91 / 168
  • 92. Group Replication : GTID Example: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 92 / 168
  • 93. Group Replication : GTID Example: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355 New write on another node: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 93 / 168
  • 94. Group Replication : GTID Example: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355 New write on another node: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354 Let's write on the third node: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354:2000354 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 94 / 168
  • 95. Group Replication : GTID Example: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355 New write on another node: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354 Let's write on the third node: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-355:1000354:2000354 And writing back on the first one: Executed_Gtid_Set: 0b5c746d-d552-11e8-bef0-08002718d305:1-356:1000354:2000354 Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 95 / 168
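The block behaviour in this example can be imitated with a short sketch. It assumes a simplified allocator that hands each writer the next free block of 1,000,000 sequence numbers; `BlockAllocator`, `Member` and `format_gtid_set` are hypothetical names, and the block boundaries the server actually picks can differ (the slide shows 1000354, not 1000001). The point is only why the executed set ends up with gaps.

```python
BLOCK_SIZE = 1_000_000  # default of group_replication_gtid_assignment_block_size


class BlockAllocator:
    """Hands out non-overlapping blocks of GTID sequence numbers (simplified)."""

    def __init__(self, first_free=1, block_size=BLOCK_SIZE):
        self.next_start = first_free
        self.block_size = block_size

    def reserve(self):
        start = self.next_start
        self.next_start += self.block_size
        return start, start + self.block_size - 1


class Member:
    """A writer that consumes sequence numbers from its own block."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.next_gno = None
        self.block_end = None

    def next_gtid(self):
        # Reserve a new block on first use or when the current one is exhausted.
        if self.next_gno is None or self.next_gno > self.block_end:
            self.next_gno, self.block_end = self.allocator.reserve()
        gno = self.next_gno
        self.next_gno += 1
        return gno


def format_gtid_set(uuid, gnos):
    """Collapse sequence numbers into the uuid:a-b:c notation."""
    intervals = []
    for gno in sorted(set(gnos)):
        if intervals and gno == intervals[-1][1] + 1:
            intervals[-1][1] = gno
        else:
            intervals.append([gno, gno])
    parts = [str(a) if a == b else f"{a}-{b}" for a, b in intervals]
    return uuid + ":" + ":".join(parts)


group = "0b5c746d-d552-11e8-bef0-08002718d305"
alloc = BlockAllocator()
n1, n2, n3 = Member(alloc), Member(alloc), Member(alloc)

executed = [n1.next_gtid(), n1.next_gtid()]   # two writes on the first node
executed.append(n2.next_gtid())               # one write on another node
executed.append(n3.next_gtid())               # one write on the third node
executed.append(n1.next_gtid())               # back on the first node
print(format_gtid_set(group, executed))
```

As in the slide, the write that goes back to the first node extends the first interval instead of continuing after the other members' numbers.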
  • 96. done ! Return from Commit Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 96 / 168
  • 97. Group Replication: return from commit Asynchronous Replication: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 97 / 168
  • 98. Group Replication: return from commit (2) Semi-Sync Replication: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 98 / 168
  • 99. Group Replication: return from commit (3) Group Replication: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 99 / 168
  • 100. Does this mean we can have a distant node and always let it ack later ? Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 100 / 168
  • 101. Does this mean we can have a distant node and always let it ack later ? NO! Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 101 / 168
  • 102. Does this mean we can have a distant node and always let it ack later ? NO! Because the system has to wait for the noop (single skip message) from the “distant” node, where latency is higher. The size of the GCS consensus message window can be read and set with the UDFs group_replication_get_write_concurrency() and group_replication_set_write_concurrency():
    mysql> select group_replication_get_write_concurrency();
    +-------------------------------------------+
    | group_replication_get_write_concurrency() |
    +-------------------------------------------+
    |                                        10 |
    +-------------------------------------------+
    Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 102 / 168
  • 103. Event Horizon GCS Write Consensus Concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 103 / 168
  • 104. Event Horizon GCS Write Consensus Concurrency group replication write concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 104 / 168
  • 105. Event Horizon GCS Write Consensus Concurrency group replication write concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 105 / 168
  • 106. Event Horizon GCS Write Consensus Concurrency group replication write concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 106 / 168
  • 107. Event Horizon GCS Write Consensus Concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 107 / 168
  • 108. Event Horizon GCS Write Consensus Concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 108 / 168
  • 109. Event Horizon GCS Write Consensus Concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 109 / 168
  • 110. Event Horizon GCS Write Consensus Concurrency Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 110 / 168
  • 111. conflict Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 111 / 168
  • 112. Group Replication : Optimistic Locking Group Replication uses optimistic locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 112 / 168
  • 113. Group Replication : Optimistic Locking Group Replication uses optimistic locking during a transaction, local (InnoDB) locking happens Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 113 / 168
  • 114. Group Replication : Optimistic Locking Group Replication uses optimistic locking during a transaction, local (InnoDB) locking happens optimistically assumes there will be no conflicts across nodes (no communication between nodes necessary) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 114 / 168
  • 115. Group Replication : Optimistic Locking Group Replication uses optimistic locking during a transaction, local (InnoDB) locking happens optimistically assumes there will be no conflicts across nodes (no communication between nodes necessary) cluster-wide conflict resolution happens only at COMMIT, during certification Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 115 / 168
  • 116. Group Replication : Optimistic Locking Group Replication uses optimistic locking during a transaction, local (InnoDB) locking happens optimistically assumes there will be no conflicts across nodes (no communication between nodes necessary) cluster-wide conflict resolution happens only at COMMIT, during certification Let's first look at traditional locking to compare. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 116 / 168
  • 117. Traditional locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 117 / 168
  • 118. Traditional locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 118 / 168
  • 119. Traditional locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 119 / 168
  • 120. Traditional locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 120 / 168
  • 121. Traditional locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 121 / 168
  • 122. Traditional locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 122 / 168
  • 123. Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 123 / 168
  • 124. Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 124 / 168
  • 125. Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 125 / 168
  • 126. Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 126 / 168
  • 127. Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 127 / 168
  • 128. Optimistic Locking Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 128 / 168
  • 129. Optimistic Locking The system returns error 149 as certification failed: ERROR 1180 (HY000): Got error 149 during COMMIT Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 129 / 168
  • 130. Such conflicts happen only when using a multi-primary group ! (not totally true in MySQL < 8.0.13 when failover happens) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 130 / 168
  • 131. Drawbacks of optimistic locking having a first-committer-wins system means conflicts are more likely when writing on multiple members with: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 131 / 168
  • 132. Drawbacks of optimistic locking having a first-committer-wins system means conflicts are more likely when writing on multiple members with: large transactions Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 132 / 168
  • 133. Drawbacks of optimistic locking having a first-committer-wins system means conflicts are more likely when writing on multiple members with: large transactions, long-running transactions Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 133 / 168
  • 134. Drawbacks of optimistic locking having a first-committer-wins system means conflicts are more likely when writing on multiple members with: large transactions, long-running transactions, hotspot records Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 134 / 168
  • 135. can the transaction be committed ? Certification Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 135 / 168
  • 136. Certification Certification is the process that only needs to answer a single question: Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 136 / 168
  • 137. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 137 / 168
  • 138. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 138 / 168
  • 139. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 139 / 168
  • 140. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes; happens on every member/node and is deterministic Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 140 / 168
  • 141. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes; happens on every member/node and is deterministic; results are not reported to the group (does not require a new communication step) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 141 / 168
  • 142. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes; happens on every member/node and is deterministic; results are not reported to the group (does not require a new communication step); pass: commit/queue to apply Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 142 / 168
  • 143. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes; happens on every member/node and is deterministic; results are not reported to the group (does not require a new communication step); pass: commit/queue to apply; fail: rollback/drop the transaction Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 143 / 168
  • 144. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes; happens on every member/node and is deterministic; results are not reported to the group (does not require a new communication step); pass: commit/queue to apply; fail: rollback/drop the transaction; serialized by the total order in GCS/XCOM + GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 144 / 168
  • 145. Certification Certification is the process that only needs to answer a single question: can the write (transaction) be committed ? based on transactions that are yet to be applied; such conflicts must come from other members/nodes; happens on every member/node and is deterministic; results are not reported to the group (does not require a new communication step); pass: commit/queue to apply; fail: rollback/drop the transaction; serialized by the total order in GCS/XCOM + GTID; cost is based on trx size (# rows & # keys) Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 145 / 168
  • 146. Certification Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 146 / 168
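The certification rules listed above can be sketched as a small first-committer-wins check. This is a simplified model, not the server's certifier: it assumes each transaction arrives, already totally ordered, with a write set of row keys and a `snapshot` number telling which certified transactions its originating member had seen. `Certifier`, `certify` and the key strings are hypothetical names for illustration.

```python
class Certifier:
    """Deterministic first-committer-wins check over a totally ordered stream."""

    def __init__(self):
        self.last_writer = {}  # row key -> sequence number of the last certified writer
        self.sequence = 0      # number of transactions certified so far

    def certify(self, snapshot, writeset):
        """Return True (commit / queue to apply) or False (rollback / drop)."""
        for key in writeset:
            # Conflict: the row was changed by a transaction this one never saw.
            if self.last_writer.get(key, 0) > snapshot:
                return False
        self.sequence += 1
        for key in writeset:
            self.last_writer[key] = self.sequence
        return True


def run(stream):
    """Certify a whole ordered stream on a fresh member and return the verdicts."""
    certifier = Certifier()
    return [certifier.certify(snapshot, ws) for snapshot, ws in stream]


# Two members update the same row concurrently: both saw snapshot 0.
stream = [
    (0, {"table1:id=2"}),   # ordered first: passes
    (0, {"table1:id=2"}),   # ordered second: conflicts with the first
    (1, {"table1:id=2"}),   # executed after seeing the first: passes
    (0, {"table1:id=3"}),   # different row: passes
]

member_a = run(stream)
member_b = run(stream)
print(member_a, member_a == member_b)
```

Every member feeds the same ordered stream into the same deterministic function, which is why the verdict never has to be sent back to the group.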
  • 147. Houston we have a problem ! Flow Control Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 147 / 168
  • 148. Flow Control In Group Replication, every member sends statistics about its queues (applier queue and certification queue) to the other members. Every node then decides whether to slow down when it sees that one node has reached the threshold for one of its queues. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 148 / 168
  • 149. Flow Control In Group Replication, every member sends statistics about its queues (applier queue and certification queue) to the other members. Every node then decides whether to slow down when it sees that one node has reached the threshold for one of its queues. So when group_replication_flow_control_mode is set to QUOTA on a node that sees another member of the cluster lagging behind (threshold reached), it throttles its write operations to a quota calculated from the number of transactions applied in the last second, then reduced further by subtracting the “over the quota” messages from the last period. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 149 / 168
  • 150. Flow Control In Group Replication, every member sends statistics about its queues (applier queue and certification queue) to the other members. Every node then decides whether to slow down when it sees that one node has reached the threshold for one of its queues. So when group_replication_flow_control_mode is set to QUOTA on a node that sees another member of the cluster lagging behind (threshold reached), it throttles its write operations to a quota calculated from the number of transactions applied in the last second, then reduced further by subtracting the “over the quota” messages from the last period. This means that the threshold is NOT decided on the slow node: the node writing a transaction checks its own flow control threshold values and compares them to the statistics from the other nodes to decide whether to throttle. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 150 / 168
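The writer-side decision described above can be sketched as follows. This only follows the slide's wording and is not the server's algorithm: the real computation also involves settings such as hold percent and min/max quota, which are ignored here. The function names, the shape of `member_stats`, and the choice of a strict "greater than" comparison are assumptions.

```python
APPLIER_THRESHOLD = 25000     # group_replication_flow_control_applier_threshold
CERTIFIER_THRESHOLD = 25000   # group_replication_flow_control_certifier_threshold


def someone_is_lagging(member_stats):
    """True if any member reports a queue above the writer's own thresholds."""
    return any(
        s["applier_queue"] > APPLIER_THRESHOLD
        or s["certifier_queue"] > CERTIFIER_THRESHOLD
        for s in member_stats.values()
    )


def next_quota(member_stats, applied_last_period, sent_last_period, previous_quota):
    """Return how many transactions the writer may send next period.

    None means no throttling. All parameters are hypothetical inputs that the
    writer is assumed to know from the statistics exchanged by the members.
    """
    if not someone_is_lagging(member_stats):
        return None
    # Start from what was applied in the last period ...
    quota = applied_last_period
    # ... and subtract whatever was sent over the previous quota.
    if previous_quota is not None:
        quota -= max(0, sent_last_period - previous_quota)
    return max(1, quota)  # never drop to zero in this sketch


healthy = {"node2": {"applier_queue": 120, "certifier_queue": 0}}
lagging = {"node2": {"applier_queue": 30000, "certifier_queue": 0}}

print(next_quota(healthy, 1000, 1000, None))
print(next_quota(lagging, 1000, 1000, None))
print(next_quota(lagging, 900, 1200, 1000))
```

Note that the thresholds consulted are the ones configured on the writer, while the queue sizes are the ones reported by the other members.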
  • 151. Flow Control - on writer >quota Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 151 / 168
  • 152. Flow Control - on all members Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 152 / 168
  • 153. Flow Control - configuration variables As in MySQL 8.0.13:
    +-----------------------------------------------------+-------+
    | Variable_name                                       | Value |
    +-----------------------------------------------------+-------+
    | group_replication_flow_control_applier_threshold    | 25000 |
    | group_replication_flow_control_certifier_threshold  | 25000 |
    | group_replication_flow_control_hold_percent         | 10    |
    | group_replication_flow_control_max_quota            | 0     |
    | group_replication_flow_control_member_quota_percent | 0     |
    | group_replication_flow_control_min_quota            | 0     |
    | group_replication_flow_control_min_recovery_quota   | 0     |
    | group_replication_flow_control_mode                 | QUOTA |
    | group_replication_flow_control_period               | 1     |
    | group_replication_flow_control_release_percent      | 50    |
    +-----------------------------------------------------+-------+
    Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 153 / 168
  • 154. Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 154 / 168
  • 155. transaction's lifecycle in Group Replication Summary Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 155 / 168
  • 156. begin; Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 156 / 168
  • 157. begin; update table1 set c = 999 where id = 2; Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 157 / 168
  • 158. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 158 / 168
  • 159. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; client blocks on commit... Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 159 / 168
  • 160. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; client blocks on commit... writesets + gtid_event + write values Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 160 / 168
  • 161. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; client blocks on commit... writesets + gtid_event + write values certify Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 161 / 168
  • 162. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; client blocks on commit... writesets + gtid_event + write values certify certify Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 162 / 168
  • 163. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; client blocks on commit... writesets + gtid_event + write values certify certify certify Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 163 / 168
  • 164. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; commit finalized writesets + gtid_event + write values certify certify certify + GTID bin log Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 164 / 168
  • 165. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; commit finalized writesets + gtid_event + write values certify certify certify + GTID bin log + GTID Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 165 / 168
  • 166. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; commit finalized writesets + gtid_event + write values certify certify certify + GTID bin log + GTID + GTID relay log relay log Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 166 / 168
  • 167. begin; update table1 set c = 999 where id = 2; update table1 set b = "eee" where id = 3; commit; commit finalized writesets + gtid_event + write values certify certify certify + GTID bin log + GTID + GTID relay log relay log bin log bin log Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 167 / 168
  • 168. Thank you ! Any Questions ? Copyright @ 2018 Oracle and/or its affiliates. All rights reserved. 168 / 168