CLOUDCONF 2014
Database: backup and disaster recovery in the Cloud
Walter Dal Mut
@walterdalmut – www.corley.it – walterdalmut.com
DISASTER RECOVERY
Disaster recovery (DR) is about preparing for and recovering from a
disaster.
DISASTER
Any event that has a negative impact on
your business continuity or finances could be termed a disaster.
WHY ARE WE TALKING ABOUT DR?
• Over 70% of businesses involved in a major fire either do not reopen or subsequently fail
within 3 years of the fire. (Source: continuitycentral.com)
• 80% of businesses affected by a major
incident either never re-open or close within 18 months. (Source: Axa)
• 70 percent of companies go out of business after a major data loss. (Source:
continuitycentral.com)
• 80% of businesses suffering a computer disaster, who have no disaster recovery plans, go
out of business. (Source: “A Bridge Too Far”, IBM Business Recovery Service & Cranfield,
1993)
• A recent study from Gartner, Inc., found that 90 percent of companies that experience
data loss go out of business within two years.
• 80 percent of companies without well-conceived data protection and recovery strategies
go out of business within 2 years of a major disaster. (Source: US National Archives and
Records Administration)
RTO – RECOVERY TIME OBJECTIVE
This is the duration of time and the service level to which a business
process must be restored after a disaster
RTO – WHAT DOES IT IMPLY?
• We have a system that records 1,000 transactions per hour
• We take a snapshot of the system at 03:00 am (every day)
• At 10:00 am a disaster event occurs
• We spend 1 hour sorting things out for the backup (off-site retrieval, preparation, etc.)
• The recovery operation takes 4 hours to get back into operation (at minimum
service level)
• 1 + 4 = 5 hours is the RECOVERY TIME OBJECTIVE
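The arithmetic of this example can be written down as a small sketch. The function name and the calendar date are assumptions made only for illustration; the durations and the 10:00 am disaster time come from the bullets above.

```python
from datetime import datetime, timedelta


def recovery_time(preparation: timedelta, restore: timedelta) -> timedelta:
    """Total time needed to be back at minimum service level.

    Assumes the two phases run one after the other, as in the example.
    """
    return preparation + restore


# Disaster at 10:00 am (the date itself is arbitrary).
disaster = datetime(2014, 1, 1, 10, 0)

rt = recovery_time(preparation=timedelta(hours=1), restore=timedelta(hours=4))
back_online = disaster + rt

print(rt)                             # 5:00:00
print(back_online.strftime("%H:%M"))  # 15:00
```

If the business cannot tolerate being offline until 15:00, either the preparation or the restore phase has to get shorter.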
RPO – RECOVERY POINT OBJECTIVE
This describes the acceptable amount of data loss measured in time.
RPO – WHAT DOES IT IMPLY?
• We have a system that records 1,000 transactions per hour
• We take a snapshot of the system at 03:00 am (every day)
• At 10:00 am a disaster event occurs
• In this case we lose around 7,000 transactions:
• 1,000 transactions from 03:00 to 04:00
• 1,000 transactions from 04:00 to 05:00
• …
• But: with one snapshot per day we are accepting up to 24 hours of data loss, i.e. 24,000 transactions (RPO)
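A minimal sketch of the same calculation, assuming a constant transaction rate (the function names are hypothetical):

```python
def transactions_lost(rate_per_hour: int, snapshot_hour: int, disaster_hour: int) -> int:
    """Transactions written after the last snapshot and before the disaster."""
    return rate_per_hour * (disaster_hour - snapshot_hour)


def worst_case_loss(rate_per_hour: int, snapshot_interval_hours: int) -> int:
    """Loss if the disaster hits just before the next snapshot: this is the RPO."""
    return rate_per_hour * snapshot_interval_hours


lost = transactions_lost(rate_per_hour=1000, snapshot_hour=3, disaster_hour=10)
worst = worst_case_loss(rate_per_hour=1000, snapshot_interval_hours=24)

print(lost)   # 7000
print(worst)  # 24000
```

The loss actually suffered depends on when the disaster happens; the RPO is the worst case the snapshot schedule allows.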
DISASTER RECOVERY STRATEGIES
Local tape backup → Online backup → Pilot-Light → Warm Stand-by → And more…
Cost: $ → $$$ → $$$$$$
Recovery time: from days down to seconds
ON-PREMISE & CLOUD
Use cloud resources in order to provide business continuity
Disaster Recovery & Cloud?
• On Demand
• We can allocate and release resources whenever we need them
• Cost Effective
• Pay-as-you-go model: we pay only for the resources that we are actually
using
• Scalable
• We can scale freely and adapt our strategy thanks to autoscaling and
other mechanisms
• Secure
• Control doesn’t mean security
FOCUS ON DATABASES
We will focus on MySQL, but the same ideas apply to your own infrastructure
without any problem.
BACKUP & RESTORE
Take a snapshot of a system and restore it when you need it
Application → Backup → Restore
RTO & RPO?
Things to remember…
RTO
Which resources can affect my RTO?
RESOURCE ALLOCATION
How fast can we set up all the resources (instances, network, etc.)?
DB RESTORE
How long can the database restore take?
RPO
Which resources can affect my RPO?
DB SNAPSHOT
How long do we need to recover all the data from our snapshot?
Backup & Restore – RPO & RTO
Configuration
• Resource allocation: ???
• Restore operation: ???
• DNS: TTL 30 minutes
• Snapshot: every 24 hours
Effects
• RTO – Recovery Time Objective: 30 minutes + ??? + ???
• RPO – Recovery Point Objective: 24 hours
• Downtime per month
• 99.8% availability: 86.23 minutes
• 99.95% availability: 21.56 minutes
COSTS ON S3 (AWS)
$0.085 per GB, durability 99.999999999%
$0.068 per GB, durability 99.99%
$0.010 per GB, durability 99.999999999% (Glacier)
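To get a feeling for these prices, here is a rough sketch of the storage bill for a backup. It assumes the prices are per GB per month and uses a hypothetical 500 GB backup; request and transfer costs are ignored.

```python
# Prices from the slide, assumed to be per GB per month.
PRICE_PER_GB = {
    "durability 99.999999999%": 0.085,
    "durability 99.99%": 0.068,
    "glacier": 0.010,
}


def monthly_storage_cost(size_gb: float, price_per_gb: float) -> float:
    """Storage-only cost in dollars; requests and data transfer are not included."""
    return size_gb * price_per_gb


backup_size_gb = 500  # hypothetical backup size
for tier, price in PRICE_PER_GB.items():
    print(f"{tier}: ${monthly_storage_cost(backup_size_gb, price):.2f} per month")
```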
Pilot light
We keep a small set of resources always running,
which helps us bring up the whole
system when needed
Replication
Basically, pilot light is based on database
replication strategies
For MySQL, async replication is used as
the base strategy
http://www.slideshare.net/corleycloud/mysql-scale-out-cloudparty-2013-milano-talent-garden
ON-PREMISE – WEB APP
READ REPLICA ON A CLOUD PROVIDER
MOVE TO CLOUD ON A DISASTER
RTO & RPO?
Things to remember…
RTO
Which resources can affect my RTO?
RESOURCE ALLOCATION
Launching and configuring new instances typically takes a couple of minutes.
You always have to keep an eye on resources and timings.
DNS PROPAGATION
DNS takes a while to propagate new addresses (Time To Live).
RPO
Which resources can affect my RPO?
DB REPLICATION
Remember that master/slave replication is ASYNC!
This implies replication LAG, and that lag affects your RPO!
MONITOR YOUR INFRASTRUCTURE
Setting an RPO of about 20 minutes implies that your replication LAG
must always stay under 20 minutes!
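A monitoring check for this rule could look like the sketch below. It assumes the lag is read from the Seconds_Behind_Master column that MySQL reports in SHOW SLAVE STATUS; how that value is fetched is left out, and the function name is hypothetical.

```python
from typing import Optional


def lag_within_rpo(seconds_behind_master: Optional[int], rpo_minutes: float) -> bool:
    """Return True if the replica is close enough to the master to honor the RPO.

    seconds_behind_master is the value taken from SHOW SLAVE STATUS.
    MySQL reports NULL (None here) when replication is not running,
    which is treated as a violation.
    """
    if seconds_behind_master is None:
        return False
    return seconds_behind_master < rpo_minutes * 60


print(lag_within_rpo(300, rpo_minutes=20))   # True: 5 minutes behind
print(lag_within_rpo(1500, rpo_minutes=20))  # False: 25 minutes behind
print(lag_within_rpo(None, rpo_minutes=20))  # False: replication stopped
```

Treating a stopped replica as a failure matters: while replication is down the data loss keeps growing even though no lag number is reported.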
Pilot Light – RPO & RTO
Configuration
• Resource allocation: 20 minutes
• DNS: TTL 30 minutes
• Replication LAG: 20 minutes
Effects
• RTO – Recovery Time Objective: 50 minutes
• RPO – Recovery Point Objective: 20 minutes
• Downtime per month
• 99.8% availability: 86.23 minutes
• 99.95% availability: 21.56 minutes
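The 50-minute RTO is the sum of resource allocation and DNS TTL. The sketch below assumes the two steps happen one after the other and compares the result with the monthly downtime budgets listed above; the function names are hypothetical.

```python
def estimated_rto_minutes(resource_allocation: float, dns_ttl: float) -> float:
    """RTO estimate, assuming allocation and DNS propagation are sequential."""
    return resource_allocation + dns_ttl


def fits_downtime_budget(rto_minutes: float, budget_minutes: float) -> bool:
    """True if a single disaster in the month stays within the downtime budget."""
    return rto_minutes <= budget_minutes


pilot_light_rto = estimated_rto_minutes(resource_allocation=20, dns_ttl=30)

print(pilot_light_rto)                               # 50
print(fits_downtime_budget(pilot_light_rto, 86.23))  # True  (99.8% availability)
print(fits_downtime_budget(pilot_light_rto, 21.56))  # False (99.95% availability)
```

Under these assumptions one disaster per month is compatible with 99.8% availability but not with 99.95%.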
COSTS ON AWS
$0.06 per hour → 1 m1.small ≈ $43 per month
$0.05 per GB (EBS)
$0.05 per 1 million I/O requests (EBS)
WARM STANDBY
Extends pilot-light resource allocation and preparation
Warm Stand-by – RPO & RTO
Configuration
• Resource allocation: 5 minutes
• DNS: TTL 30 minutes
• Replication LAG: 20 minutes
Effects
• RTO – Recovery Time Objective: 35 minutes
• RPO – Recovery Point Objective: 20 minutes
• Downtime per month
• 99.8% availability: 86.23 minutes
• 99.95% availability: 21.56 minutes
COSTS ON AWS
$0.06 per hour → 2 m1.small ≈ $86 per month
$0.05 per GB (EBS)
$0.05 per 1 million I/O requests (EBS)
ELB: $20 per month
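Putting the two cost slides side by side gives a rough idea of what the shorter RTO costs. This sketch assumes a 30-day month and counts only instance hours and the ELB; EBS storage and I/O are left out because they depend on the data set. The RTO values (50 and 35 minutes) are the ones from the two configurations above.

```python
HOURS_PER_MONTH = 24 * 30  # assumption: 30-day month


def monthly_compute_cost(instances: int,
                         price_per_hour: float = 0.06,
                         fixed_monthly: float = 0.0) -> float:
    """Instance hours plus fixed monthly items (e.g. the ELB), in dollars."""
    return instances * price_per_hour * HOURS_PER_MONTH + fixed_monthly


pilot_light_cost = monthly_compute_cost(instances=1)
warm_standby_cost = monthly_compute_cost(instances=2, fixed_monthly=20.0)

rto_saved_minutes = 50 - 35  # pilot light RTO minus warm stand-by RTO
extra_cost = warm_standby_cost - pilot_light_cost

print(f"Pilot light:   ${pilot_light_cost:.2f} per month")
print(f"Warm stand-by: ${warm_standby_cost:.2f} per month")
print(f"Extra cost per minute of RTO saved: ${extra_cost / rto_saved_minutes:.2f}")
```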
PILOT LIGHT VS WARM STAND-BY
In our examples, Pilot Light actually looks much more effective than Warm Stand-by.
Or does it?
DEPENDS ON ASSUMPTIONS
We assumed that we don't need to scale out our database and that
scaling it up is enough!
What about resource allocation for new read replicas? How long does it take?
THANKS FOR LISTENING
