Breda Development Meetup 2016-06-08 - High Availability

High Availability
Breda Development Meetup
Bas Peters - June 8, 2016

Uptime
Percentage target    Max downtime per year
90%                  36.5 days
99%                  3.65 days
99.5%                1.83 days
99.9%                8.76 hours
99.99%               52.56 minutes
99.999%              5.26 minutes
99.9999%             31.5 seconds
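
The figures in this table follow from multiplying the unavailable fraction by the length of a year. A minimal sketch, assuming a 365-day year; the name `max_downtime_seconds` is illustrative and not part of the talk:

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def max_downtime_seconds(uptime_percent: float) -> float:
    """Return the maximum downtime per year, in seconds, for an uptime target."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return SECONDS_PER_YEAR * (100 - uptime_percent) / 100


# Print the allowed downtime in hours for a few common targets.
for target in (99.0, 99.9, 99.99):
    hours = max_downtime_seconds(target) / 3600
    print(f"{target}% -> {hours:.2f} hours per year")
```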

HA is Redundancy
✓ RAID: Disk crash? Another disk still works!
✓ Virtualization: Physical host crashes? VM available on other physical host!
✓ Clustering: Server crashes? Another server still works!
✓ Power: Power outage? Redundant power supply!
✓ Network: Switch or NIC crashes? 2nd network route available!
✓ Geographical: Datacenter offline? Another DC available to perform work!

Traditional setup
[Diagram: end user, router, server]

Traditional setup - enhanced
[Diagram: end user, router, application server, database server]

Adding redundancy
[Diagram: end user, router, loadbalancer, application server 1, database server]

Enhanced redundancy
[Diagram: end user, router, router (backup), loadbalancer, loadbalancer (backup), database server]

Database redundancy
[Diagram: end user, router, router (backup), loadbalancer, loadbalancer (backup), database server 1, database server 2]

Datacenter redundancy
[Diagram: end user, router, router (backup), loadbalancer, loadbalancer (backup), application server 2, database server 1, database server 2, spread over datacenter 1 and datacenter 2]

States and sessions
o Multiple requests can be served by different backend servers
o Store the session in a database or NoSQL cache
o The loadbalancer can “stick” a user to a single backend server…
o ... but not in all cases!
[Diagram: requests 1, 2 and 3 distributed over app 1, app 2, app 3 and app 4]
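
Storing the session outside the backend servers can be sketched as follows. This is an illustration only: `SharedSessionStore` is a hypothetical in-memory stand-in for the database or NoSQL cache, and `Backend` is a hypothetical application server.

```python
import uuid


class SharedSessionStore:
    """In-memory stand-in for a database or NoSQL cache shared by all backends."""

    def __init__(self):
        self._sessions = {}

    def create(self):
        # Create an empty session and return its identifier.
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {}
        return session_id

    def load(self, session_id):
        return self._sessions[session_id]


class Backend:
    """Hypothetical application server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def handle(self, session_id):
        # The session comes from the shared store, not from local memory.
        session = self.store.load(session_id)
        session["hits"] = session.get("hits", 0) + 1
        return f"{self.name} served request {session['hits']} of this session"


store = SharedSessionStore()
session_id = store.create()
backends = [Backend("app 1", store), Backend("app 2", store)]

# Consecutive requests land on different backends but see the same session.
for request_number in range(3):
    backend = backends[request_number % len(backends)]
    print(backend.handle(session_id))
```

Because no backend holds the session itself, the loadbalancer does not need to stick a user to one server.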

Local storage
o Avoid storing meaningful persistent user content on a local server
o Application level caching is useful as long as it is not destructive
o Synchronization of contents between backend servers is a pain
o Use database for storage where possible
… There are possibilities to share storage amongst backend servers

Shared storage - NAS
o Network Attached Storage
o A NAS handles the complete filesystem
o Relies on protocols like:
   NFS: Network File System
   SMB/CIFS: Windows File Sharing
o Simple to implement
o Redundancy is very hard to achieve; a NAS is often a single point of failure
o Performance is mediocre and bottlenecks can occur

Shared storage - SAN
o Storage Area Network
o A SAN handles only the “block level” part of the filesystem
o Relies on protocols like:
   iSCSI: IP based SCSI
   Fibre Channel: Storage network protocol, typically run over optical fiber
   AoE: ATA over Ethernet
o Hard to implement, expensive
o Redundancy can be achieved to avoid a single point of failure
o Performance and scalability are (reasonably) good

Shared storage – Cluster Filesystem
o Filesystem shared across multiple servers using special software / drivers
o Windows implementation:
   DFS: Windows Distributed File System
o Linux implementations:
   HDFS: Hadoop Distributed File System
   Ceph: Object Storage Platform
   GlusterFS: Red Hat Cluster Filesystem
o Relatively easy to implement
o Redundancy can easily be achieved
o Performance and scalability are (reasonably) good

Database High Availability
o High Availability for an RDBMS (relational database management system) is often the most difficult part of a highly available setup
o Hardware resources and data need to be redundant
o Remember that it isn’t just data, it is constantly changing data
o High Availability means the operation can continue uninterrupted, not that service resumes after restoring a new/backup server

Database HA - Replication
o Asynchronous by default
o One master, many slaves
o No write scale-out possible
o Difficult to recover from a failover situation
o Prone to inconsistency when not used properly

Database HA - Sharding
o Separate data over multiple database back-ends using keyed distribution
o Multi master setup possible
o Excellent scalability
o Redundancy needs to be obtained through a complementary methodology
o Requires more complex application logic
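
Keyed distribution can be sketched with a stable hash of the key. This is a minimal illustration, not a production scheme: the shard names are hypothetical, and a plain modulo remaps most keys when the number of shards changes (consistent hashing is a common alternative).

```python
import hashlib

# Hypothetical shard names; in practice these would be database back-ends.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(key, shards=SHARDS):
    """Map a key to one shard using a hash that is stable across processes."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


# The application must compute the shard before every query,
# which is the extra application logic that sharding requires.
for user_key in ("user:1", "user:2", "user:3"):
    print(user_key, "->", shard_for(user_key))
```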

Database HA – Clustering I
o Synchronous by default
o Multi master setup possible
o Write scale-out possible
o Near-automatic fault recovery
o Requires replication conflicts to be resolved at the code level

Database HA – Clustering II
Clustering for Microsoft SQL (from 2012)
o Always On Availability Groups
o Each node requires WSFC (Windows Server Failover Clustering)
o Asynchronous and synchronous commit mode supported
o Up to 8 “warm” secondary availability replicas can be set up (4 in SQL Server 2012)
o These replicas can be used for read transactions and backups
o Availability group listener to automatically redirect clients to the best available server
o Not a “real” cluster, no master-master replication possible

Database HA – Clustering III
Clustering for MySQL (MariaDB)
o Galera (wsrep) plugin to enable clustering (included in MariaDB 10.1 by default)
o Asynchronous and synchronous commit mode supported
o Multi-master synchronous replication
o Read and write scalability
o Automatic membership control, node joining and dropping
o No listener functionality that redirects clients to available nodes

Clustering – Quorum I
“A quorum is the minimum number of members of a deliberative assembly necessary to conduct the business of that group”
- Wikipedia

Clustering – Quorum II
o Node Majority: Each node that is available and in communication can vote. The cluster functions only with a majority of the votes.
o When a network partition occurs, the nodes in the minority part will go into lockdown to avoid a “split brain” situation
o When a network partition resolves, the minority part will rejoin the active cluster after a state transfer to retrieve the data that was changed in the meantime
o A cluster should contain an odd number of nodes to prevent a total lockdown during a node failure or network partition
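
The Node Majority rule comes down to a strict-majority test. A minimal sketch, assuming one vote per node; the function name `has_quorum` is illustrative:

```python
def has_quorum(reachable_votes, expected_votes):
    """Return True when the reachable nodes hold a strict majority of the votes."""
    if reachable_votes < 0 or expected_votes <= 0:
        raise ValueError("vote counts must be positive")
    return reachable_votes * 2 > expected_votes


# Three-node cluster: two surviving nodes keep the cluster online,
# a single surviving node goes into lockdown.
print(has_quorum(2, 3))
print(has_quorum(1, 3))

# Six-node cluster split evenly: neither half has a majority,
# which is why an odd number of nodes is recommended.
print(has_quorum(3, 6))
```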

Clustering – Scenario 1
o Node A is gracefully stopped
o Other nodes receive “leave” message and quorum is reduced by 1
o Cluster is online
o Node B and C continue to serve requests because they have the majority of votes (2 of 2)

Clustering – Scenario 2
o Node A and B are gracefully stopped
o Node C receives “leave” messages from A and B and quorum is reduced by 2
o Cluster is online
o Node C continues to serve clients since it has the majority of votes in the quorum (1 of 1)

Clustering – Scenario 3
o All nodes are gracefully stopped
o Cluster is offline
o There is a potential problem in starting the cluster again. The most recent (last stopped) node should be used to bootstrap the cluster, otherwise data may be lost

Clustering – Scenario 4
o Node A disappears from the cluster due to unforeseen circumstances
o Node B and C will try to reconnect to A but will eventually remove A from the cluster, maintaining the quorum (3)
o Cluster is online
o Node B and C continue to serve requests because they have the majority of votes (2 of 3)

Clustering – Scenario 5
o Node A and B disappear from the cluster due to unforeseen circumstances
o Node C will try to reconnect to A and B but will eventually remove both from the cluster, maintaining the quorum (3)
o The cluster is offline because Node C cannot acquire a majority of the votes (1 of 3) and will remain in lockdown

Clustering – Scenario 6
o All nodes disappear from the cluster due to unforeseen circumstances
o Cluster is offline (obviously)
o This is a potential problem as the node with the most recent data should be used to bootstrap the cluster again to avoid data loss

Clustering – Scenario 7
o A network split causes Node A, B and C to lose connectivity with Node D, E and F
o Node A, B and C have no majority (3 of 6) and Node D, E and F also have no majority (3 of 6)
o All nodes go into lockdown

Clustering – Multiple Datacenters I
[Diagram: node 1, node 2 and node 3 distributed over DC 1 and DC 2]

Clustering – Multiple Datacenters II
[Diagram: node 1, node 2, node 3 and node 4 distributed over DC 1 and DC 2]

Clustering – Multiple Datacenters III
[Diagram: node 1 in DC 1, node 2 in DC 2, node 3 in DC 3]

Clustering – Multiple Datacenters IV
[Diagram: node 1 to node 6 distributed over DC 1, DC 2 and DC 3]

Health Endpoint Monitoring
o Monitor applications for availability in an HA pool
o Monitor middle-tier services for availability
o Automatic removal of misbehaving endpoints from the pool
o Endpoints that are healthy again after a service interruption are automatically re-added
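
Automatic removal and re-adding of endpoints can be sketched as a pool that re-evaluates every endpoint on each monitoring round. The class `MonitoredPool` and the check callables are hypothetical; a real loadbalancer would issue network requests instead.

```python
class MonitoredPool:
    """Pool that only exposes endpoints whose latest health check passed."""

    def __init__(self, checks):
        # checks maps an endpoint name to a callable returning True when healthy.
        self._checks = checks
        self._healthy = set(checks)

    def run_checks(self):
        for endpoint, check in self._checks.items():
            try:
                healthy = bool(check())
            except Exception:
                healthy = False  # a failing check counts as unhealthy
            if healthy:
                self._healthy.add(endpoint)      # re-add recovered endpoints
            else:
                self._healthy.discard(endpoint)  # remove misbehaving endpoints

    def active(self):
        return sorted(self._healthy)


# Simulated endpoint state, toggled to mimic an outage and a recovery.
status = {"appserver 1": True, "appserver 2": True}
pool = MonitoredPool({name: (lambda n=name: status[n]) for name in status})

status["appserver 2"] = False
pool.run_checks()
print(pool.active())

status["appserver 2"] = True
pool.run_checks()
print(pool.active())
```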

Application Health Check
[Diagram: the loadbalancer sends a status request to the Application Node, which answers 200 (OK) with a response time of 50ms]
Checks on the Application Node:
o Storage available
o Code can be executed
o Database reachable
o Service A running
o Service B running
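
A status endpoint of this kind typically aggregates several checks into one response code. A minimal sketch, assuming that any failed check should yield 503 (the slide only shows the 200 case); `health_status` and the check names are hypothetical:

```python
def health_status(checks):
    """Run all checks and return an HTTP-style status code plus per-check results."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            results[name] = False  # an exception means the check failed
    # Assumption: 503 signals the loadbalancer to take this node out of the pool.
    status_code = 200 if all(results.values()) else 503
    return status_code, results


# Placeholder checks; real ones would touch storage, the database and services.
checks = {
    "storage available": lambda: True,
    "database reachable": lambda: True,
    "service A running": lambda: False,
}
code, details = health_status(checks)
print(code, details)
```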

Database Health Check
[Diagram: the loadbalancer sends a status request to the Database Node, which answers 200 (OK) with a response time of 50ms]
Checks on the Database Node:
o Database running
o Simple query can be executed
o Local database node is healthy cluster node

Monitoring Strategy
[Diagram: Loadbalancer, appserver 1, appserver 2, appserver 3, DB loadbalancers, db node 1, db node 2, db node 3]

Design Patterns for HA environments
o Safeguard performance
o Increase fault tolerance
o Improve consistency

Queue based load leveling pattern I
o Temporal decoupling
o Load leveling
o Load balancing
o Loose coupling
[Diagram: tasks send requests at a variable rate to a message queue; the service processes the messages at a more consistent rate]

Queue based load leveling pattern II
When to use?
o Any type of application or service that is subject to overloading
When not to use?
o Not suitable if a response with minimal latency is expected from the application or service
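
Queue based load leveling can be sketched with the standard library: producers enqueue a burst of work and a single worker drains the queue at its own pace. This is an in-process illustration; a real setup would use a message broker between separate services.

```python
import queue
import threading


def run_worker(tasks, results):
    """Process queued items one at a time until the stop sentinel arrives."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work
            tasks.task_done()
            break
        results.append(item * 2)  # placeholder for the real work
        tasks.task_done()


tasks = queue.Queue(maxsize=100)  # bounded so producers cannot flood memory
results = []
worker = threading.Thread(target=run_worker, args=(tasks, results))
worker.start()

# A burst of requests arrives at once; the worker handles them sequentially.
for request in range(5):
    tasks.put(request)
tasks.put(None)

tasks.join()   # wait until every queued item has been processed
worker.join()
print(results)
```

The caller only learns the result later, which is why the pattern does not fit requests that need a low-latency response.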

Throttling pattern I
o Reject or delay requests to the application when a certain number of requests in a certain amount of time is reached
o Disable or degrade functionality of selected nonessential services so that essential services can run unimpeded with sufficient resources
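
Rejecting requests after a certain number within a certain amount of time can be sketched with a sliding window. The class name `SlidingWindowThrottle` and its parameters are illustrative; the clock is injectable so the behaviour can be shown without waiting.

```python
import time
from collections import deque


class SlidingWindowThrottle:
    """Allow at most max_requests within any window of window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self._max_requests = max_requests
        self._window = window_seconds
        self._clock = clock
        self._timestamps = deque()

    def allow(self):
        now = self._clock()
        # Forget requests that have left the window.
        while self._timestamps and now - self._timestamps[0] >= self._window:
            self._timestamps.popleft()
        if len(self._timestamps) >= self._max_requests:
            return False  # the caller should reject or delay the request
        self._timestamps.append(now)
        return True


# Simulated clock: two requests fit in the window, the third is rejected.
fake_now = [0.0]
throttle = SlidingWindowThrottle(2, 1.0, clock=lambda: fake_now[0])
print(throttle.allow(), throttle.allow(), throttle.allow())

fake_now[0] = 1.0  # one window later the earlier requests have expired
print(throttle.allow())
```

Keeping one throttle instance per tenant is one way to stop a single tenant from monopolizing resources.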

Throttling pattern II
When to use?
o To ensure that a system continues to meet service level agreements
o To prevent a single tenant from monopolizing the resources provided by an application
o To handle bursts in activity
o To help cost-optimize a system by limiting the maximum resource levels needed to keep it functioning

Retry pattern
o Enable the application to handle anticipated, temporary failures
o Transparently retry an operation that has previously failed, in the expectation that the cause of the failure is transient
o Especially useful in micro-service and cloud architectures
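
A retry with exponential backoff is one common way to apply this pattern. A minimal sketch: `TransientError`, `retry` and the delay values are assumptions chosen for illustration, and only errors known to be transient are retried.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are expected to be temporary."""


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation, retrying on TransientError with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == attempts:
                raise  # give up and surface the last failure
            sleep(base_delay * 2 ** (attempt - 1))


# Simulated operation that fails twice before succeeding.
calls = []


def flaky():
    calls.append(1)
    if len(calls) < 3:
        raise TransientError("temporary outage")
    return "ok"


print(retry(flaky, attempts=3, base_delay=0.01))
```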

Deployments
Highly available environments bring additional challenges to software deployments:
o How to perform atomic releases?
o How to roll back a faulty release quickly?
o How to release new software without any downtime?

Basic deployment
[Diagram: loadbalancer, appserver 1, appserver 2, database cluster]
1. replace application code on appserver 1
2. replace application code on appserver 2
3. apply database changes
DONE!

Enhanced deployment
[Diagram: loadbalancer, appserver 1, appserver 2, database cluster]
1. remove appserver 1 from the pool
2. replace application code on appserver 1
3. enable appserver 1 in the pool and disable appserver 2
4. replace application code on appserver 2
5. enable appserver 2 in the pool
6. apply database changes
DONE!

A/B Deployments I
[Diagram: the loadbalancer maps www.live.nl to pool A (appserver 1 - A, appserver 2 - A) and www.shadow.nl to pool B (appserver 1 - B, appserver 2 - B); application server 1 and application server 2 each run webserver A from /deploy/A and webserver B from /deploy/B]

A/B Deployments II
[Diagram: a request for www.live.nl reaches the loadbalancer (“www.live.nl is being served by pool A”) and is handled on the application server by Webserver A, whose code resides at /deploy/A; a request for www.shadow.nl (“www.shadow.nl is being served by pool B”) is handled by Webserver B, whose code resides at /deploy/B]

A/B Deployments III
[Diagram: in the loadbalancer, www.live.nl switches from POOL A → B and www.shadow.nl switches from POOL B → A]
By swapping Pool A with Pool B in the loadbalancer, the complete backends are switched instantaneously.
This enables seamless deployment without downtime.
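
The pool swap can be sketched as exchanging two entries in the loadbalancer's routing table. The `LoadBalancer` class is a hypothetical model; the hostnames and pool names are taken from the slides.

```python
class LoadBalancer:
    """Hypothetical routing table mapping hostnames to backend pools."""

    def __init__(self):
        self.routes = {"www.live.nl": "A", "www.shadow.nl": "B"}

    def pool_for(self, host):
        return self.routes[host]

    def swap(self, first="www.live.nl", second="www.shadow.nl"):
        # Exchange both routes in one assignment so no host is ever left unrouted.
        self.routes[first], self.routes[second] = (
            self.routes[second],
            self.routes[first],
        )


lb = LoadBalancer()
print(lb.pool_for("www.live.nl"), lb.pool_for("www.shadow.nl"))

lb.swap()  # release: live traffic now goes to the pool tested as shadow
print(lb.pool_for("www.live.nl"), lb.pool_for("www.shadow.nl"))
```

Calling `swap()` a second time restores the previous mapping, which doubles as a quick rollback.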

Deployment best practices
o Never introduce backwards-incompatible changes to the database
o Thoroughly test the shadow-live environment as it is the closest to the real live deployment
o Maintain strict release versioning, based on semantic versioning
o Releasing at the end of the day or on a Friday is not recommended

WWW.CMTELECOM.COM
THANKS FOR LISTENING!
