MySQL Scale-Out by application partitioning

Oli Sennhauser
Rebenweg 6
CH-8610 Uster
Switzerland
oli.sennhauser@bluewin.ch



Introduction
Eventually every database system hits its limits. Especially on the Internet, where millions of users can theoretically access your database simultaneously, your IO system will eventually become a bottleneck.

Conventional solutions
In general, as a first step, MySQL Replication is used to scale out in such a situation. MySQL Replication scales very well when you have a high read/write (r/w) ratio. The higher the better.

[Figure: several web servers and several application servers in front of one MySQL Replication master (write) and several MySQL Replication slaves (read-only)]

But even such a MySQL Replication system (let us call it “MySQL Replication cluster” [4] rather than “MySQL Cluster” in this paper) hits its limits when you have a huge amount of (write) access.
Because database systems access the disk randomly, it is not the throughput of your IO system that is relevant but the number of IO operations per second (random seeks). You can scale this in a very limited way by adding more disks to your IO system, but here too you eventually hit a limit (price).

Scale-out possibilities
So we have to think about other possibilities to scale out. One possibility would be to use MySQL Cluster. This solution can be very fast because it is not IO bound. But it has other limits, such as the amount of available RAM and joins that do not perform too well. If these limitations do not apply, MySQL Cluster would be a good and performant solution.
Another promising but more complex solution with nearly no scale-out limits is application partitioning. If and when you get into the top-1000 rank on Alexa [1], you have to think about such solutions.

Application partitioning
What does “application partitioning” mean? Application partitioning means the following:

“Application partitioning distributes application processing across all system resources...”

There are 2 different kinds of application partitioning: horizontal and vertical application partitioning.

Horizontal application partitioning
Horizontal application partitioning is also known as Multi-Tier-Computing [2], which means splitting the system into the database back end, the application server (middle tier), the web server, and the client doing the display. Nowadays this is common sense and good practice.
But with horizontal application partitioning you still have not avoided the IO bottleneck on the database back end.
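
Why the database back end of a MySQL Replication cluster stays the bottleneck can be sketched with a rough load model. It assumes, as in the master (write) / slave (read-only) setup described earlier, that reads go only to the slaves and that every slave has to apply every write coming from the master. The function name and the example numbers are hypothetical and only serve the illustration.

```python
def per_node_load(reads_per_s: float, writes_per_s: float, n_slaves: int) -> dict:
    """Rough load per node in one MySQL Replication cluster.

    Assumptions (not measurements): reads are spread evenly over the
    slaves, and every slave replays every write from the master.
    """
    if n_slaves < 1:
        raise ValueError("at least one slave is required")
    return {
        "master": writes_per_s,
        "each_slave": writes_per_s + reads_per_s / n_slaves,
    }


if __name__ == "__main__":
    # Adding slaves lowers the read share per slave, but the write
    # share stays the same on every node.
    for n in (1, 3, 9):
        print(n, per_node_load(reads_per_s=9000, writes_per_s=1000, n_slaves=n))
```

In this model the load on each slave approaches the write rate as slaves are added, but never drops below it, which is why a write-heavy system cannot be scaled by adding more slaves.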
[Figure: horizontal application partitioning turns a client plus a monolithic system (containing the database back end, application logic and presentation logic) into four tiers: web client, web server, application server / middle tier, and MySQL database back end]

Vertical application partitioning
With vertical application partitioning you can scale out your system to a nearly unlimited degree. The more loosely coupled your vertical application partitions are, the better the whole system scales [3].

But what does vertical application partitioning mean? For example, suppose you have an online contact website with 1 million users. Some of them, let's say 20%, are actively searching for contacts with other people. Each of these actively searching users makes 10 contact requests per day. This gives approximately 2 million changes per day in the back end (23 changes per second). In general, one contact request results in more than one change in the database, and people do this contact search mainly during peak hours (1/3 of the day). This can easily result in several hundred changes per second on the database during peak time. But your I/O system is roughly limited by this formula:

    250 I/Os per second per disk * #disks = #I/Os per second

When you are using MySQL Replication, some caching mechanisms (MySQL query cache, block buffer cache, file system cache, battery-buffered disk cache, etc.) can help, and when you follow the concept of “relaxation of constraints” you can increase this amount of I/O by some factors. You can handle these 1 million users on an optimized MySQL Replication cluster (when you have tuned it properly).
But what happens when you want to scale by a factor of 10 or even 100? With 10 million users your system definitely hits its limits. How do we scale here?
In this case we can only scale if we split up one MySQL Replication cluster into several pieces. This splitting can be done, for example, by user (user_id).

[Figure: WS and AS on top of a single master M with slave S, next to WS and AS on top of four masters M1, M2, M3, M4 with slaves S1, S2, S3, S4]

The splitting should be done by the entity with the smallest possible interaction. Otherwise a lot of synchronization work has to be done between the database nodes concerned. It should also be considered that some data can or must be kept redundant (general static information, for example geographic information). This can also be done by a separate replication tree:

[Figure: WS and AS on top of masters M1, M2, M3 with slaves S1, S2, S3, plus a separate replication tree with master M' and slave S']

The disadvantage of this solution is that you have to (keep) open at least 1 connection from each application server (AS) to each master (Mn) and slave (Sn) of each MySQL Replication cluster. So the limitation of this system is roughly:

    #Conn./Server : #AS Conn. = #Replication clusters

    1000 : 50 = 20

When this limit too has been hit, a much more sophisticated solution with distribution of the users in the AS and WS tier has to be considered:

[Figure: Internet on top of three WS, three AS, masters M1, M2, M3 and slaves S1, S2, S3]

But in this concept something like an “asynchronous inter MySQL Replication cluster” protocol has to be established.

How to partition an entity
An entity can be split up in several different ways:

Partition by RANGE
Users are distributed to their MySQL Replication cluster, for example by their user_id. For every 1 million users you have to provide a new MySQL Replication cluster:

[Figure: AS routing to one cluster each for user_id < 1'000'000, user_id < 2'000'000, user_id < 3'000'000, ...]
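
The range scheme can be sketched in a few lines of Python. This is a minimal illustration, assuming that user_ids are non-negative integers starting at 0 and that clusters are numbered from 0. The names `USERS_PER_CLUSTER`, `cluster_for_user` and `clusters_needed` are hypothetical and do not come from any MySQL API.

```python
USERS_PER_CLUSTER = 1_000_000  # range width from the text: 1 million users per cluster


def cluster_for_user(user_id: int) -> int:
    """Return the index of the MySQL Replication cluster holding this user.

    Integer division (DIV) maps user_id < 1'000'000 to cluster 0,
    user_id < 2'000'000 to cluster 1, and so on.
    """
    if user_id < 0:
        raise ValueError("user_id must not be negative")
    return user_id // USERS_PER_CLUSTER


def clusters_needed(n_users: int) -> int:
    """Number of clusters required for n_users (rounded up)."""
    return -(-n_users // USERS_PER_CLUSTER)


if __name__ == "__main__":
    print(cluster_for_user(999_999))     # 0
    print(cluster_for_user(1_000_000))   # 1
    print(clusters_needed(2_500_001))    # 3
```

Because the mapping depends only on the user_id, growth means adding one more cluster for the next range. No existing user changes cluster.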
Advantages:
•   No redistribution of users is needed during growth. You only have to add a new MySQL Replication cluster.
•   Slightly improves locality of reference [5].
•   Easy to understand.
•   Easy to locate data.
•   Likelihood of hot-spots is low.
•   Simple distribution logic can be implemented.

Disadvantages:
•   On the “old” MySQL Replication clusters it is likely that you get less and less activity. So you either have to waste hardware resources (which is not too serious, because these machines are depreciated after some years and “only” consume some power and space in your IT center) or you have to migrate users from the oldest MySQL Replication clusters once in a while (which causes a lot of traffic and probably some downtime on the 2 machines involved).
•   Resource balancing causes a lot of migration work.
•   When resource balancing is done, simple distribution logic does not apply anymore. Then a lookup mechanism is needed.

Partition by a certain CHARACTERISTIC
Users are distributed by certain characteristics, for example last name, birth date or country.

[Figure: AS routing to one cluster each for last_name BETWEEN 'A' AND 'I', last_name BETWEEN 'J' AND 'R', last_name BETWEEN 'S' AND 'Z']

Advantages:
•   Easy to understand.
•   Easy to locate data.

Disadvantages:
•   You can get “hot-spots”, for example on the server with the last names starting with “S” or with some countries like US, JP, D etc., and get unused resources on servers with, for example, birth date February 29th, last names with “X” and “Y” or countries like the Principality of Liechtenstein, Monaco or Andorra. This can make a redistribution of data necessary.
•   This can be avoided by merging some of the values into one MySQL Replication cluster, but then some look-up table must exist.
•   Resource balancing is difficult.

Partitioning by HASH/MODULO
An entity can also be split up by some other functions like MODULO. The MySQL Replication cluster is determined by either:

    Cluster = user_id MOD #Clusters

or

    Cluster = HASH(last_name) MOD #Clusters

Splitting up by DIV is already discussed in “Partition by RANGE”.

Advantages of HASH:
•   Random distribution, thus no hot-spots.

Disadvantages of HASH:
•   For rebalancing the whole system must be migrated!
•   Hot-spots can happen if done wrong, for example HASH(country) MOD #Clusters.

Advantages of MOD:
•   Deterministic distribution, target cluster is easily visible “by hand”.

Disadvantages of MOD:
•   For rebalancing the whole system must be migrated!

Partition by LOAD (with lookup table)
A dynamic way to partition users is measuring the load of each MySQL Replication cluster (somehow) and distributing new users accordingly (similar to a load balancer). For this, a more or less static lookup table is needed to determine for every user on which MySQL Replication cluster the user is located.

[Figure: AS with a lookup table in front of four clusters at 50%, 90%, 60% and 20% load]

Advantages:
•   A new MySQL Replication cluster is automatically loaded more until it reaches saturation.
•   No data redistribution is needed.

Disadvantages:
•   When old users are not removed, peaks can happen on the old systems after some posting.

Literature
[1] Alexa top 500 ranking: http://www.alexa.com
[2] Multi-Tier-Computing: http://en.wikipedia.org/wiki/Three-tier_%28computing%29
[3] Loose coupling: http://en.wikipedia.org/wiki/Loose_coupling
[4] Cluster: http://en.wikipedia.org/wiki/Computer_cluster
[5] Locality of reference: http://en.wikipedia.org/wiki/Locality_of_reference
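
As an illustration of the two formulas from “Partitioning by HASH/MODULO”, the following Python sketch computes the target cluster and counts how many users change cluster when one cluster is added. The function names are hypothetical, and CRC32 is only an assumed stand-in for HASH, since the paper does not name a hash function. A checksum from `zlib` is used because Python's built-in `hash()` for strings differs between runs.

```python
import zlib


def cluster_by_mod(user_id: int, n_clusters: int) -> int:
    """Cluster = user_id MOD #Clusters."""
    return user_id % n_clusters


def cluster_by_hash(last_name: str, n_clusters: int) -> int:
    """Cluster = HASH(last_name) MOD #Clusters, with CRC32 as assumed HASH."""
    return zlib.crc32(last_name.encode("utf-8")) % n_clusters


def moved_users(n_users: int, old_n: int, new_n: int) -> int:
    """Count users whose MOD cluster changes when #Clusters changes."""
    return sum(
        1
        for user_id in range(n_users)
        if cluster_by_mod(user_id, old_n) != cluster_by_mod(user_id, new_n)
    )


if __name__ == "__main__":
    print(cluster_by_mod(1_234_567, 4))   # 3
    print(cluster_by_hash("Sennhauser", 4))
    # Going from 4 to 5 clusters relocates 80000 of 100000 users.
    print(moved_users(100_000, 4, 5))
```

The last call shows the rebalancing disadvantage in numbers: only user_ids with the same remainder for 4 and for 5 stay where they are, so most of the system has to be migrated.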
