How to migrate to sharding with Spider
Kentoku SHIBA
Agenda
1. What is SPIDER?
2. Why SPIDER? What can SPIDER do for you?
3. How to migrate to sharding using Replication
4. How to migrate to sharding using Trigger
5. How to migrate to sharding using Spider function
6. How to migrate to sharding using Vertical Partitioning Storage Engine
What is Spider
What is the Spider Storage Engine?
Spider is a sharding and proxying solution. The Spider Storage Engine is a
plugin for MariaDB/MySQL. Spider tables federate MariaDB/MySQL/OracleDB
tables on other servers so that they can be used as if they were on the
local server. Spider can also shard a database by using the table
partitioning feature.
What is the Spider Storage Engine?
[Diagram: applications (AP) connect to several SPIDER nodes (MariaDB/MySQL). The data nodes behind them are MariaDB (tbl_a), MySQL (tbl_b) and OracleDB (tbl_c). Flow: 1. Request, 2. Execute SQL, 3. Distributed SQL, 4. Response.]
All databases can be used as ONE database through Spider.
What is the Spider Storage Engine?
Spider has been bundled with MariaDB since 10.0, and all Spider patches
for MariaDB are applied as of 10.3.
Why Spider? What can Spider do for you?
For federation
You can attach tables from other servers, or from the local server, by using Spider.
For sharding
You can spread huge tables and heavy traffic across multiple servers by using Spider.
Why Spider? What can Spider do for you?
Cross-shard join
You can join all tables by using Spider, even if the tables are on different servers.
Join operation with a simple sharding solution (without Spider)
[Diagram: applications (AP) connect directly to DB1 and DB2, which hold tbl_a1, tbl_a2, tbl_b1 and tbl_b2. Flow: 1. Request, 2. Execute SQL with JOIN, 3. Response.]
A join operation requires that all joined tables are on the same server.
Join operation with Spider
[Diagram: applications (AP) connect to a SPIDER node (MariaDB/MySQL) in front of DB1 and DB2, which hold tbl_a1, tbl_a2, tbl_b1 and tbl_b2. Flow: 1. Request, 2. Execute SQL with JOIN, 3. Response.]
You can JOIN all tables, even if the tables are on different servers.
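As a rough illustration of a cross-shard join, the sketch below fetches rows from two stand-in data nodes and joins them at the proxy. This is a toy model in Python, not Spider's implementation; the table placement, the column layout and the function name `cross_shard_join` are assumptions made for the example.

```python
# Hypothetical in-memory stand-ins for two data nodes (not a real Spider API).
DB1 = {"tbl_a1": [(1, "x"), (2, "y")]}          # rows of (id, a_val)
DB2 = {"tbl_b2": [(1, 10), (2, 20), (3, 30)]}   # rows of (id, b_val)


def cross_shard_join(left_rows, right_rows):
    """Hash join on the first column, performed on the proxy node."""
    index = {}
    for key, val in right_rows:
        index.setdefault(key, []).append(val)
    return [(key, a, b) for key, a in left_rows for b in index.get(key, [])]


# The application sees one joined result although the tables live on two servers.
joined = cross_shard_join(DB1["tbl_a1"], DB2["tbl_b2"])
print(joined)
```

The application issues a single JOIN; gathering rows from the two servers is the proxy's job.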
Why Spider? What can Spider do for you?
Join push down
When possible, Spider executes the JOIN operation directly on the data node.
JOIN push down
[Diagram: applications (AP) connect to a SPIDER node (MariaDB/MySQL) in front of DB1 and DB2, which hold tbl_a, tbl_b, tbl_c and tbl_d. Flow: 1. Request, 2. Execute SQL with JOIN, 3. Response.]
If all tables are on the same data node, Spider executes the JOIN operation directly on that data node.
JOIN push down
In a simple JOIN pushdown test, simple join operations were two times faster.
When the query also includes aggregate functions, the aggregation is executed
on the data node as well, so the amount of data transferred is greatly
reduced and the query becomes much faster still.
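To make the data-transfer argument concrete, here is a small Python model that counts how many rows cross the network with and without pushdown. It is only a sketch: the sample tables, the query shape (a JOIN with SUM grouped by one column) and all function names are assumptions, and it does not measure Spider itself.

```python
# Hypothetical single data node holding both joined tables.
TBL_A = [(1, "x"), (2, "x"), (3, "y")]        # rows of (id, grp)
TBL_B = [(1, 10), (1, 5), (2, 20), (3, 30)]   # rows of (id, amount)


def join_and_sum(tbl_a, tbl_b):
    """JOIN on id, then SUM(amount) GROUP BY grp."""
    grp_of = dict(tbl_a)
    totals = {}
    for row_id, amount in tbl_b:
        if row_id in grp_of:
            grp = grp_of[row_id]
            totals[grp] = totals.get(grp, 0) + amount
    return totals


def without_pushdown():
    # The proxy pulls every row of both tables, then joins and aggregates.
    transferred = len(TBL_A) + len(TBL_B)
    return join_and_sum(TBL_A, TBL_B), transferred


def with_pushdown():
    # The data node runs JOIN and SUM; only the grouped rows are transferred.
    result = join_and_sum(TBL_A, TBL_B)
    return result, len(result)


print(without_pushdown())
print(with_pushdown())
```

Both paths return the same totals; only the number of transferred rows differs, and the gap grows with table size.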
How to migrate to sharding with Spider
using Replication
Initial Structure
There is 1 MariaDB server without Spider.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for migrating)
Create tbl_a on DB3 and DB4.
Then create the Spider table on DB2.
[Diagram: DB1 holds tbl_a. DB2 holds the Spider table tbl_a, which routes rows with col_a%2=0 to tbl_a on DB3 and rows with col_a%2=1 to tbl_a on DB4.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
connection '
  table "tbl_a",
  user "user",
  password "pass"
'
partition by list(
  mod(col_a, 2)) (
  partition pt1 values in(0)
    comment 'host "DB3"',
  partition pt2 values in(1)
    comment 'host "DB4"'
);
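The routing implied by the list partitioning can be sketched in a few lines of Python. This is only a model of which data node a row lands on; the function name `data_node_for` is hypothetical, and the sketch assumes non-negative keys, because SQL `mod()` and Python `%` disagree for negative numbers.

```python
# Partition value -> data node host, following the partition comments
# (pt1 values in(0) -> DB3, pt2 values in(1) -> DB4).
SHARDS = {0: "DB3", 1: "DB4"}


def data_node_for(col_a: int) -> str:
    """Return the host that stores the row with this primary key."""
    if col_a < 0:
        # Assumption of this sketch: keys are non-negative.
        raise ValueError("negative keys are outside this sketch")
    return SHARDS[col_a % 2]


for key in (4, 7, 10):
    print(key, "->", data_node_for(key))
```

Even keys go to DB3 and odd keys go to DB4, so clients only ever need to talk to DB2.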
Step 2
Copy the table data from DB1 to DB2.
(Use mysqldump with the "--master-data=1" or "--master-data=2" option.)
[Diagram: DB1 (tbl_a), DB2 (Spider table tbl_a), DB3 (tbl_a, col_a%2=0), DB4 (tbl_a, col_a%2=1).]
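One possible way to assemble the dump command is sketched below. It only builds the argument list and does not run anything. The host and database names are placeholders, and adding `--single-transaction` and `--no-create-info` is an assumption of this sketch (the latter so that the dump does not recreate tbl_a as an InnoDB table over the Spider table already defined on DB2); the slides only mention `--master-data`.

```python
def mysqldump_command(host, database, table, master_data=2):
    """Build a mysqldump argument list for the initial copy (sketch only)."""
    if master_data not in (1, 2):
        raise ValueError("master_data must be 1 or 2")
    return [
        "mysqldump",
        f"--host={host}",
        "--single-transaction",           # assumption: consistent InnoDB snapshot
        "--no-create-info",               # assumption: keep the Spider definition on DB2
        f"--master-data={master_data}",   # record the binlog position in the dump
        database,
        table,
    ]


# "DB1" and "test" are placeholder names for this example.
cmd = mysqldump_command("DB1", "test", "tbl_a")
print(" ".join(cmd))
```

With `--master-data=2` the binary log coordinates are written as a comment in the dump; with `1` they are written as an executable statement.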
Step 3
Start replication from DB1 to DB2.
Wait until the replication delay is resolved.
[Diagram: DB1 (tbl_a) replicates to DB2 (Spider table tbl_a), which routes to DB3 (tbl_a, col_a%2=0) and DB4 (tbl_a, col_a%2=1).]
Step 4
Stop client access to DB1.
Wait until the replication delay is resolved.
Switch client access from DB1 to DB2.
[Diagram: DB1 (tbl_a) replicates to DB2 (Spider table tbl_a), which routes to DB3 (tbl_a, col_a%2=0) and DB4 (tbl_a, col_a%2=1).]
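The cutover condition in this step can be written as a small check. The sketch assumes the caller has already fetched one row of `SHOW SLAVE STATUS` from DB2 into a dict; the function name `ready_to_switch` is hypothetical.

```python
def ready_to_switch(writes_stopped: bool, slave_status: dict) -> bool:
    """True when clients are stopped on DB1 and DB2 has caught up."""
    return (
        writes_stopped
        and slave_status.get("Slave_IO_Running") == "Yes"
        and slave_status.get("Slave_SQL_Running") == "Yes"
        and slave_status.get("Seconds_Behind_Master") == 0
    )


caught_up = {
    "Slave_IO_Running": "Yes",
    "Slave_SQL_Running": "Yes",
    "Seconds_Behind_Master": 0,
}
print(ready_to_switch(True, caught_up))
```

Checking the lag only after writes have stopped matters: a zero lag measured while clients still write to DB1 says nothing about the next second.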
Finish
Stop replication on DB2.
Remove DB1.
[Diagram: DB2 (Spider table tbl_a) routes to DB3 (tbl_a, col_a%2=0) and DB4 (tbl_a, col_a%2=1).]
Pros and Cons of the Replication way
Pros
1. No need to manage lock size for copying.
2. Supports tables without a primary key.
Cons
1. Need to stop writing.
How to migrate to sharding with Spider
using Trigger
Initial Structure
There is 1 MariaDB server without Spider.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for migrating)
Create tbl_a on DB2 and DB3.
Then create the Spider table tbl_a2 on DB1.
[Diagram: DB1 holds tbl_a (InnoDB) and the Spider table tbl_a2, which routes rows with col_a%2=0 to tbl_a on DB2 and rows with col_a%2=1 to tbl_a on DB3.]
Create table tbl_a2 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
connection '
  table "tbl_a",
  user "user",
  password "pass"
'
partition by list(
  mod(col_a, 2)) (
  partition pt1 values in(0)
    comment 'host "DB2"',
  partition pt2 values in(1)
    comment 'host "DB3"'
);
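For more than two data nodes, the same kind of definition can be generated rather than typed. The helper below is a hypothetical sketch that only builds the DDL text, following the pattern on this slide; it does not connect to any server, and the column list is fixed to the example table.

```python
def spider_table_ddl(name, remote_table, hosts, user="user", password="pass"):
    """Return CREATE TABLE text for a Spider table sharded by mod(col_a, N)."""
    parts = ",\n".join(
        f"  partition pt{i + 1} values in ({i}) comment 'host \"{host}\"'"
        for i, host in enumerate(hosts)
    )
    return (
        f"create table {name} (\n"
        "  col_a int,\n"
        "  col_b int,\n"
        "  primary key(col_a)\n"
        ") engine = Spider\n"
        f"connection 'table \"{remote_table}\", user \"{user}\", "
        f"password \"{password}\"'\n"
        f"partition by list (mod(col_a, {len(hosts)})) (\n{parts}\n);"
    )


ddl = spider_table_ddl("tbl_a2", "tbl_a", ["DB2", "DB3"])
print(ddl)
```

Passing a longer host list yields one partition per host, with the modulus equal to the number of hosts.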
Step 2
Create triggers on DB1.
(They copy inserts, updates and deletes. If you use TRUNCATE on tbl_a, use another method, because TRUNCATE does not fire triggers.)
[Diagram: DB1 holds tbl_a (InnoDB) and the Spider table tbl_a2, which routes to tbl_a on DB2 (col_a%2=0) and DB3 (col_a%2=1).]
delimiter |
create trigger tbl_a_i after insert
  on tbl_a for each row
  insert into tbl_a2 (col_a, col_b) values
    (new.col_a, new.col_b);
|
create trigger tbl_a_u after update
  on tbl_a for each row
  update tbl_a2 set col_a = new.col_a,
    col_b = new.col_b
  where col_a = old.col_a;
|
create trigger tbl_a_d after delete
  on tbl_a for each row
  delete from tbl_a2 where col_a = old.col_a;
|
delimiter ;
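The mirroring behaviour of these three triggers can be tried out locally. The sketch below uses Python's built-in SQLite as a stand-in, so the trigger syntax differs slightly from MariaDB (SQLite needs BEGIN ... END) and tbl_a2 is a plain local table rather than a Spider table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table tbl_a  (col_a integer primary key, col_b integer);
create table tbl_a2 (col_a integer primary key, col_b integer);

create trigger tbl_a_i after insert on tbl_a for each row
begin
  insert into tbl_a2 (col_a, col_b) values (new.col_a, new.col_b);
end;

create trigger tbl_a_u after update on tbl_a for each row
begin
  update tbl_a2 set col_a = new.col_a, col_b = new.col_b
  where col_a = old.col_a;
end;

create trigger tbl_a_d after delete on tbl_a for each row
begin
  delete from tbl_a2 where col_a = old.col_a;
end;
""")

# Every change to tbl_a is replayed on tbl_a2 by the triggers.
conn.execute("insert into tbl_a values (1, 10)")
conn.execute("insert into tbl_a values (2, 20)")
conn.execute("update tbl_a set col_b = 11 where col_a = 1")
conn.execute("delete from tbl_a where col_a = 2")

mirror = conn.execute("select col_a, col_b from tbl_a2 order by col_a").fetchall()
print(mirror)
```

Rows that existed before the triggers were created are not mirrored by this mechanism; they still need a separate copy.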
Step 3
Copy the data from tbl_a to tbl_a2 with INSERT ... SELECT.
(Take care over how long tbl_a and tbl_a2 stay locked.)
[Diagram: DB1 holds tbl_a (InnoDB) and the Spider table tbl_a2, which routes to tbl_a on DB2 (col_a%2=0) and DB3 (col_a%2=1).]
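One common way to keep lock time short is to copy in primary-key ranges, committing after each chunk. The sketch below shows the idea with SQLite as a stand-in; the chunking strategy, the function name `copy_in_chunks` and the use of `insert or ignore` (to skip rows that a trigger has already mirrored) are assumptions of this example, not something the slides prescribe.

```python
import sqlite3


def copy_in_chunks(conn, chunk_size=2):
    """Copy tbl_a into tbl_a2 in key order; return the number of chunks."""
    last_key, chunks = None, 0
    while True:
        rows = conn.execute(
            "select col_a, col_b from tbl_a "
            "where (? is null or col_a > ?) order by col_a limit ?",
            (last_key, last_key, chunk_size),
        ).fetchall()
        if not rows:
            return chunks
        with conn:  # one short transaction per chunk
            conn.executemany(
                "insert or ignore into tbl_a2 (col_a, col_b) values (?, ?)",
                rows,
            )
        last_key = rows[-1][0]
        chunks += 1


conn = sqlite3.connect(":memory:")
conn.executescript("""
create table tbl_a  (col_a integer primary key, col_b integer);
create table tbl_a2 (col_a integer primary key, col_b integer);
insert into tbl_a values (1, 10), (2, 20), (3, 30), (4, 40), (5, 50);
insert into tbl_a2 values (3, 30);  -- pretend a trigger already mirrored this row
""")

chunks = copy_in_chunks(conn)
copied = conn.execute("select col_a, col_b from tbl_a2 order by col_a").fetchall()
print(chunks, copied)
```

A smaller chunk size shortens each lock at the cost of more round trips.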
Step 4
Rename tbl_a2 to tbl_a.
[Diagram: on DB1 the original InnoDB table becomes tbl_a3 and the Spider table becomes tbl_a; it routes to tbl_a on DB2 (col_a%2=0) and DB3 (col_a%2=1).]
Rename table tbl_a to tbl_a3,
  tbl_a2 to tbl_a;
Finish
Drop table tbl_a3.
[Diagram: DB1 holds only the Spider table tbl_a, which routes to tbl_a on DB2 (col_a%2=0) and DB3 (col_a%2=1).]
Pros and Cons of the Trigger way
Pros
1. No need to stop services.
2. Easy to copy. (Simple command)
Cons
1. Cannot support TRUNCATE.
2. Need to manage lock size when copying.
3. Cannot support tables without a primary key.
How to migrate to sharding with Spider
using Spider function
Initial Structure
There is 1 MariaDB server without Spider.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Step 1 (for migrating)
Create tbl_a on DB2 and DB3.
Then create the Spider table tbl_a2 on DB1.
[Diagram: DB1 holds tbl_a (InnoDB) and the Spider table tbl_a2, which routes rows with col_a%2=0 to tbl_a on DB2 and rows with col_a%2=1 to tbl_a on DB3.]
Create table tbl_a2 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
connection '
  table "tbl_a",
  user "user",
  password "pass"
'
partition by list(
  mod(col_a, 2)) (
  partition pt1 values in(0)
    comment 'host "DB2"',
  partition pt2 values in(1)
    comment 'host "DB3"'
);
Step 2
Create tables tbl_a3 and tbl_a4 on DB1.
[Diagram: DB1 holds tbl_a (InnoDB), tbl_a2 (Spider, routing to tbl_a on DB2 and DB3), tbl_a3 (InnoDB) and tbl_a4 (Spider, linked to tbl_a3 and tbl_a2).]
Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = Spider
connection '
  host "localhost",
  table "tbl_a3 tbl_a2",
  lst "0 2",
  user "user",
  password "pass"
';
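Step 3
Rename tables on DB1.
[Diagram: after the rename, DB1 holds tbl_a (the Spider table linked to tbl_a3 and tbl_a2), tbl_a3 (the original InnoDB data), tbl_a5 (the empty InnoDB table) and tbl_a2 (Spider, routing to tbl_a on DB2 with col_a%2=0 and DB3 with col_a%2=1).]
Rename table tbl_a3 to tbl_a5,
  tbl_a to tbl_a3, tbl_a4 to tbl_a;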
Step 4
Copy data on DB1.
[Diagram: DB1 holds tbl_a (Spider, linked to tbl_a3 and tbl_a2), tbl_a3 (original data), tbl_a5 (empty) and tbl_a2 (Spider, routing to tbl_a on DB2 and DB3).]
Select
  spider_copy_tables('tbl_a', '', '');
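A toy model may help show why the copy can run while the service stays up. It assumes that the second link, created with lst "0 2", receives every write but is not used for reads until it has been filled; that reading of the link status, the class `TwoLinkTable` and its methods are assumptions of this sketch, not Spider's actual code.

```python
class TwoLinkTable:
    """Toy model of a Spider table with two links (not the real engine)."""

    def __init__(self, existing):
        # Link 0 stands for tbl_a3 (original data), link 1 for tbl_a2 (shards).
        self.links = [dict(existing), {}]
        self.recovering = [False, True]   # assumption: meaning of lst "0 2"

    def write(self, key, value):
        for link in self.links:           # writes go to every link
            link[key] = value

    def read(self, key):
        for link, recovering in zip(self.links, self.recovering):
            if not recovering:            # reads skip a link that is still filling
                return link.get(key)
        return None

    def copy_tables(self):
        # Stand-in for the copy step: fill link 1 from link 0.
        for key, value in self.links[0].items():
            self.links[1][key] = value


table = TwoLinkTable(existing={1: 10, 2: 20})
table.write(3, 30)                 # a client write during the migration
before_copy = dict(table.links[1])
table.copy_tables()
print(before_copy, table.links[1])
```

New writes reach the sharded side immediately, and the copy step fills in the rows that predate the two-link table.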
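Step 5
Rename tables on DB1.
[Diagram: after the rename, DB1 holds tbl_a (the Spider table routing to tbl_a on DB2 with col_a%2=0 and DB3 with col_a%2=1), tbl_a7 (the two-link Spider table), tbl_a3 (the original InnoDB data) and tbl_a2 (the empty InnoDB table).]
Rename table tbl_a2 to tbl_a6,
  tbl_a5 to tbl_a2, tbl_a to tbl_a7,
  tbl_a6 to tbl_a;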
Finish
Drop tables tbl_a2, tbl_a3 and tbl_a7.
[Diagram: DB1 holds only the Spider table tbl_a, which routes to tbl_a on DB2 (col_a%2=0) and DB3 (col_a%2=1).]
Pros and Cons of the Spider function way
Pros
1. No need to stop services.
2. Easy to copy. (Simple command. Lock size is managed by Spider)
Cons
1. Cannot support tables without a primary key.
How to migrate to sharding with Spider
using Vertical Partitioning Storage Engine
Initial Structure
There is 1 MariaDB server without Spider.
[Diagram: DB1 holds tbl_a.]
Create table tbl_a (
  col_a int,
  col_b int,
  primary key(col_a),
  key idx2(col_b)
) engine = InnoDB;
Step 1 (for migrating)
Create tbl_pk on DB2 and DB3.
Then create the Spider table tbl_pk on DB1.
[Diagram: DB1 holds tbl_a (InnoDB) and the Spider table tbl_pk, which routes rows with col_a%2=0 to tbl_pk on DB2 and rows with col_a%2=1 to tbl_pk on DB3.]
Create table tbl_pk (
  col_a int,
  primary key(col_a)
) engine = Spider
connection '
  table "tbl_pk",
  user "user",
  password "pass"
'
partition by list(
  mod(col_a, 2)) (
  partition pt1 values in(0)
    comment 'host "DB2"',
  partition pt2 values in(1)
    comment 'host "DB3"'
);
Step 2
Create tbl_a2 on DB4 and DB5.
Then create the Spider table tbl_a2 on DB1.
[Diagram: DB1 holds tbl_a (InnoDB), the Spider table tbl_pk (routing to tbl_pk on DB2 with col_a%2=0 and DB3 with col_a%2=1) and the Spider table tbl_a2 (routing to tbl_a2 on DB4 with col_b%2=0 and DB5 with col_b%2=1).]
Create table tbl_a2 (
  col_a int,
  col_b int,
  key idx1(col_a),
  key idx2(col_b)
) engine = Spider
connection '
  table "tbl_a2",
  user "user",
  password "pass"
'
partition by list(
  mod(col_b, 2)) (
  partition pt1 values in(0)
    comment 'host "DB4"',
  partition pt2 values in(1)
    comment 'host "DB5"'
);
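The effect of the two Spider tables can be pictured as each logical row being stored twice: once keyed by col_a and once keyed by col_b. The Python sketch below models that placement only; the function name `vp_placement` is hypothetical, and non-negative values are assumed because SQL `mod()` and Python `%` differ for negative numbers.

```python
def vp_placement(col_a: int, col_b: int) -> dict:
    """Return where the two pieces of one logical row are stored."""
    if col_a < 0 or col_b < 0:
        # Assumption of this sketch: values are non-negative.
        raise ValueError("negative values are outside this sketch")
    return {
        # tbl_pk keeps only the primary key, sharded by col_a.
        "tbl_pk": {"host": ("DB2", "DB3")[col_a % 2], "row": (col_a,)},
        # tbl_a2 keeps both columns, sharded by the non-unique col_b.
        "tbl_a2": {"host": ("DB4", "DB5")[col_b % 2], "row": (col_a, col_b)},
    }


print(vp_placement(3, 8))
```

Lookups by col_b can then be answered from DB4 or DB5 alone, while primary key values are still tracked on DB2 and DB3.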
Step 3
Create tables tbl_a3 and tbl_a4 on DB1.
[Diagram: DB1 holds tbl_a (InnoDB), tbl_a3 (VP), tbl_a4 (InnoDB), tbl_pk (Spider, routing to tbl_pk on DB2 with col_a%2=0 and DB3 with col_a%2=1) and tbl_a2 (Spider, routing to tbl_a2 on DB4 with col_b%2=0 and DB5 with col_b%2=1).]
Create table tbl_a3 (
  col_a int,
  col_b int,
  primary key(col_a),
  key idx2(col_b)
) engine = VP
comment '
  ctm "1",
  ist "1",
  pcm "1",
  tnl "tbl_a4 tbl_pk tbl_a2"
';
Create table tbl_a4 (
  col_a int,
  col_b int,
  primary key(col_a)
) engine = InnoDB;
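Step 4
Rename tables on DB1.
[Diagram: after the rename, DB1 holds tbl_a (the VP table), tbl_a4 (the original InnoDB data), tbl_a5 (the empty InnoDB table), tbl_pk (Spider, routing to DB2 and DB3 by col_a%2) and tbl_a2 (Spider, routing to DB4 and DB5 by col_b%2).]
Rename table tbl_a4 to tbl_a5,
  tbl_a to tbl_a4, tbl_a3 to tbl_a;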
Step 5
Copy data from tbl_a4 to tbl_pk and tbl_a2 on DB1.
[Diagram: DB1 holds tbl_a (VP), tbl_a4 (original data), tbl_a5 (empty), tbl_pk (Spider, routing to DB2 and DB3 by col_a%2) and tbl_a2 (Spider, routing to DB4 and DB5 by col_b%2).]
Select vp_copy_tables('tbl_a',
  'tbl_a4', 'tbl_pk tbl_a2');
Step 6
Alter table tbl_a on DB1.
[Diagram: DB1 holds tbl_a (VP), tbl_a4, tbl_a5, tbl_pk (Spider, routing to DB2 and DB3 by col_a%2) and tbl_a2 (Spider, routing to DB4 and DB5 by col_b%2).]
Alter table tbl_a
comment '
  pcm "1",
  tnl "tbl_pk tbl_a2"
';
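Finish
Drop tables tbl_a4 and tbl_a5 on DB1.
[Diagram: DB1 holds tbl_a (VP), tbl_pk (Spider, routing to tbl_pk on DB2 with col_a%2=0 and DB3 with col_a%2=1) and tbl_a2 (Spider, routing to tbl_a2 on DB4 with col_b%2=0 and DB5 with col_b%2=1).]
Drop table tbl_a4, tbl_a5;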
Pros and Cons of the VP way
Pros
1. No need to stop services.
2. Supports splitting by non-unique columns.
3. Easy to copy. (Simple command. Lock size is managed by VP)
Cons
1. The VP storage engine is required.
2. Cannot support tables without a primary key.
Thank you for
taking your
time!!
