Amazon Aurora:
A New Dawn in the World of RDBMS
Kim Schmidt – AWS Consultant
DataLeader.io – AWS Partner & AWS Vendor
Frank La Vigne – frank@dataleader.io
Introduction
 President & CEO of DataLeader.io, http://dataleader.io/
 8 industry certifications (including Microsoft); currently studying for the AWS Certified Solutions Architect – Associate exam
 Won the national Windows 7 Incubation Week 9 months prior to release; NAPW “Woman of the Year”; O’Reilly Media author & trainer
 Email: kim@dataleader.io
 Twitter: @DataLeader
 Blogs: https://awskimschmidt.com/ & https://kimschmidtsbrain.com/
 LinkedIn: https://www.linkedin.com/in/dataleader/
Why Amazon Aurora?
• MySQL- & PostgreSQL-compatible (ANSI SQL)
• Speed & availability of high-end commercial databases, with the simplicity & cost-effectiveness of open-source databases
• Highly secure, at multiple levels
• At least as durable & fault-tolerant as enterprise-class database engines, at 1/10 the cost & with no license needed
• Fully managed
• Built ON AWS, FOR the cloud, “from scratch”
• Integrates with other AWS services
• Infinitely scalable
• Storage, logging, & caching decoupled from the DB engine – SOA!
• Asynchronous scheme for durable state
• Drastically reduced network I/O & packets per second (PPS) on the network
• 6 copies of data across locations (see the quorum sketch after this list)
• Re-engineered thread pooling
• Over 0.5M SELECTs/sec & 100K writes/sec – that’s 30M SELECTs/min & 6M INSERTs/min
• Scales further with up to 15 Read Replicas
• Automatically grows storage up to 64 TB
• THE LOG IS THE DATABASE! Almost instant crash recovery with no data loss
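The “6 copies” and quorum behavior called out above can be illustrated with a toy simulation. This is a conceptual sketch only, not AWS code; the node count and quorum sizes come from the talk, and the ack probability is an arbitrary stand-in for node/network health:

```python
import random

NODES = 6          # 2 storage nodes in each of 3 Availability Zones
WRITE_QUORUM = 4   # a write is durable once any 4 of the 6 nodes ack
READ_QUORUM = 3    # reads need 3 of 6, so read & write quorums always overlap

def write_is_durable(ack_probability=0.9):
    """Send one redo record to all 6 nodes; it is durable at 4 acks."""
    acks = sum(random.random() < ack_probability for _ in range(NODES))
    return acks >= WRITE_QUORUM

trials = 10_000
durable = sum(write_is_durable() for _ in range(trials))
print(f"{durable}/{trials} writes reached quorum without stalling")
```

Because a write never waits for all 6 nodes, a slow or failed node delays nothing; gossip replication later fills in the holes on the laggards.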
What? No Maintenance?
Managed by AWS:
• High availability
• Scalability
• OS installs
• OS patching
• Power
• HVAC
• Network
• Racking & stacking
• Maintenance
• DB software installs
• DB software patches
• DB backups
Managed by you:
• App optimization
Fully Managed Means This is YOU!
Amazon Aurora
Storage Cluster Volume Diagram
What Why How When Where, Say What???
3 Significant Architectural Advantages:
 Storage: an independent, fault-tolerant, self-healing SERVICE across data centers
 Network IOPS reduced by writing only redo logs to storage
 Backup & redo recovery are continuous & asynchronous, with compute & memory spread across a large distributed fleet of Aurora instances
Amazon Aurora Architecture 101
(Architecture diagram: the database engine runs in the Customer VPC; storage, logging (& caching) plus continual backups run as a service in the Storage VPC; Amazon RDS control-plane services run in the RDS VPC.)
Scaling Up
One Amazon Aurora instance can scale from 1 vCPU & 2 GB of memory (the new small size) up to 32 vCPUs & 244 GB of memory – just a change of instance class, as sketched below.
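In API terms, that scale-up is a single call. A minimal boto3 sketch, assuming credentials/region are configured and an instance named aurora-demo-instance already exists (the identifier is a placeholder):

```python
import boto3

rds = boto3.client("rds")

# Scaling up = changing the instance class. ApplyImmediately skips the
# maintenance window at the cost of a brief availability impact.
rds.modify_db_instance(
    DBInstanceIdentifier="aurora-demo-instance",
    DBInstanceClass="db.r3.8xlarge",   # the 32 vCPU / 244 GB top end
    ApplyImmediately=True,
)
```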
Scaling Out
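Scaling out is likewise one call per replica: in Aurora, creating a DB instance inside an existing cluster adds it as a reader against the shared cluster volume. A minimal boto3 sketch with hypothetical identifiers:

```python
import boto3

rds = boto3.client("rds")

rds.create_db_instance(
    DBInstanceIdentifier="aurora-demo-replica-1",
    DBClusterIdentifier="aurora-demo-cluster",  # join the existing cluster
    DBInstanceClass="db.r3.large",
    Engine="aurora",
    AvailabilityZone="us-west-2b",  # spread replicas across AZs
)
```

Repeat with different identifiers (and AZs) for up to 15 replicas.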
MySQL with Standby I/O
(Diagram: MySQL write streams – redo log, binlog, data, double-write buffer, & FRM files.)
I/O Traffic in Amazon Aurora Database
(Diagram: the same write streams – redo log, binlog, data, double-write buffer, & FRM files.)
I/O Traffic in Amazon Aurora Storage Node
Throughput, Availability, & Durability
Storage Node Availability / Durability:
 A quorum system for reads/writes that’s latency-tolerant & doesn’t stall writes
 Peer-to-peer gossip replication to fill in the holes
 Continuous scrubbing of data blocks
 Amazon Aurora’s backup capability enables point-in-time recovery of your database instance, to any second during your established retention period, up to 35 days (see the restore sketch after this list)
 Backups are automatic, incremental, & continuous, have no impact on database performance, & have 99.999999999% durability
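A point-in-time restore from those continuous backups is itself one API call; it materializes a brand-new cluster rather than overwriting the source. A minimal boto3 sketch with hypothetical identifiers and an arbitrary restore time:

```python
import boto3
from datetime import datetime, timezone

rds = boto3.client("rds")

rds.restore_db_cluster_to_point_in_time(
    DBClusterIdentifier="aurora-demo-restored",       # new cluster to create
    SourceDBClusterIdentifier="aurora-demo-cluster",  # cluster being restored
    RestoreToTime=datetime(2017, 6, 9, 12, 30, 0, tzinfo=timezone.utc),
)
# The restored cluster still needs compute: add instances with
# create_db_instance before connecting to it.
```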
Amazon Aurora’s
Asynchronous Group Commits
Amazon Aurora’s Adaptive Thread Pool
Always-Warm Cache
THE LOG IS THE DATABASE!
DEMO: Launching an Amazon Aurora Cluster
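For anyone following along outside the recorded demo, the console steps map roughly onto the boto3 calls below: create the cluster (the shared storage volume), add a writer instance (the compute), then read back the endpoint to paste into a SQL client. This is a sketch; every identifier and the password are placeholders:

```python
import boto3

rds = boto3.client("rds")

# 1. The cluster: Aurora's 6-way-replicated storage volume plus endpoints.
rds.create_db_cluster(
    DBClusterIdentifier="aurora-demo-cluster",
    Engine="aurora",
    MasterUsername="admin",
    MasterUserPassword="change-me-please",
    BackupRetentionPeriod=35,   # the maximum point-in-time-recovery window
)

# 2. A primary (writer) instance inside the cluster.
rds.create_db_instance(
    DBInstanceIdentifier="aurora-demo-writer",
    DBClusterIdentifier="aurora-demo-cluster",
    DBInstanceClass="db.r3.large",
    Engine="aurora",
)

# 3. The cluster endpoint goes into the SQL client's "Host or IP Address".
cluster = rds.describe_db_clusters(
    DBClusterIdentifier="aurora-demo-cluster"
)["DBClusters"][0]
print(cluster["Endpoint"])
```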
AWS Database Migration Service
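A minimal boto3 sketch of starting a DMS task, assuming the source/target endpoints and a replication instance have already been created in DMS (every ARN below is a placeholder). The full-load-and-cdc migration type is what keeps the source live: it replicates ongoing changes after the initial copy:

```python
import boto3, json

dms = boto3.client("dms")

dms.create_replication_task(
    ReplicationTaskIdentifier="sqlserver-to-aurora",
    SourceEndpointArn="arn:aws:dms:us-west-2:123456789012:endpoint:SRC",
    TargetEndpointArn="arn:aws:dms:us-west-2:123456789012:endpoint:TGT",
    ReplicationInstanceArn="arn:aws:dms:us-west-2:123456789012:rep:INST",
    MigrationType="full-load-and-cdc",   # initial copy + ongoing replication
    TableMappings=json.dumps({
        "rules": [{
            "rule-type": "selection", "rule-id": "1", "rule-name": "1",
            "object-locator": {"schema-name": "dbo", "table-name": "%"},
            "rule-action": "include",
        }]
    }),
)
```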
AWS Schema Conversion Tool for
Heterogeneous Data Migration
(Screenshot: the AWS Schema Conversion Tool generating a data migration assessment report.)
Migrating Heterogeneous
Databases to Amazon Aurora
1. Source DB schema
2. Action items: what can’t be converted automatically
3. Status of the current target DB schema
4. Details of the chosen source schema element
5. Target schema details for the chosen schema element
Migrating Heterogeneous
Databases to Amazon Aurora
http://bit.ly/heteroMigration
Summary
At the end of this session, you should have learned:
 Amazon Aurora has an SOA in which the database engine is decoupled from storage, logging, & caching
 By writing only redo log records to storage, network IOPS are drastically reduced
 Backup is asynchronous, continual, & incremental, happening in the background
 Recovery is near-instantaneous (warm cache, no checkpointing), occurs in the background, & happens without data loss
Please Support Our Sponsors
SQL Saturday is made possible with the generous support of these sponsors.
You can support them by opting-in and visiting them in the sponsor area.
EMAIL ME!: kim@dataleader.io
TWEET ME!: @DataLeader
CONNECT WITH ME!: https://www.linkedin.com/in/dataleader/
Editor's Notes
  • #2: CLICK Laptop Sticker
  • #4: * Question attendees. KEEP CLICKING THROUGH FOR ANIMATIONS. Aurora has been AWS’s fastest-growing service since it was released in July 2015, surpassing Redshift, which pioneered the cloud-based data warehouse. I’m not a SQL Server expert & I’m also not an expert on Amazon Aurora; some of you might know more about Aurora than I do. But I’m here to tell you a story that happened last fall…. CLICK. Story incl YouTube & Aurora Team & Abdul Sait. In a nutshell, here’s what Amazon Aurora is: CLICK, CLICK, CLICK UNTIL “THE LOG IS THE DATABASE”. It achieves consensus on durable state across many storage nodes with an asynchronous scheme, avoiding chatty recovery protocols.
  • #5: Managed services: provision only what you need on-demand; pay only for what’s used.
  • #6: CLICK
  • #7: A cluster = 1+ instances & a cluster volume that manages data for the instances. CLICK. Here, 1 instance, so it’s both r/w, pointed to by the blue arrow. CLICK. Cluster volume = all-SSD virtual DB storage volume spanning 3 AZs in whatever region; recommended to place instances in different AZs for automatic failover. Has 2 storage nodes in each AZ, totaling 6 nodes for high availability, even if you only have 1 instance. CLICK. Each node has individual segments that have their own redo logs. CLICK. In this diagram, you have a primary instance & 2 Read Replicas distributed across 3 AZs for durability, availability, & scalability. The blue arrow here points to the cluster volume backing up to S3. Notice the 2-way arrows between the storage nodes in each AZ: this is the peer-to-peer gossip we’ll be talking about later.
  • #8: 3 significant architectural advantages: CLICK. Storage is an independent, fault-tolerant, & self-healing SERVICE across many data centers, protecting the DB from performance variance & from transient or permanent failures at either the network or storage tier. CLICK. By writing only redo log records to storage, network IOPS drop dramatically. Once that bottleneck was addressed, the Aurora team was able to aggressively optimize numerous points of contention & obtain significant throughput improvements. They moved the complex & critical functionality of backup & redo recovery from one-time, expensive, multi-phase operations in the database engine to continuous, asynchronous operations amortized (in time & memory) across a large distributed fleet of instances = near-instant crash recovery w/o checkpointing & inexpensive backups that don’t interfere with the foreground process. LET ME EXPLAIN
  • #9: Decoupled SOA architecture. Logging & storage moved to their own service (like EBS); storage deployed on a cluster of EC2 SSD VMs; caching kept outside the DB process so it remains warm in case of a DB restart. CLICK. Automatic, continual, incremental backups to Amazon S3 (11 9’s) at no extra charge (*Dropbox uses S3!). CLICK. Amazon RDS = the agent that monitors cluster health & determines if it needs to fail over or if an instance needs to be replaced. CLICK. Amazon DynamoDB (NoSQL) = persistent storage of cluster & volume configuration, volume metadata, & a detailed description of the data backed up to S3. CLICK. For orchestrating long-running operations (i.e., restore), Amazon Simple Workflow Service (SWF) is used. CLICK. Amazon Route 53 helps maintain proactive, automated, early detection of real & potential problems before end users are impacted (re-routes to replicas or creates a new instance). CLICK. For security, communication is isolated between the database, apps, & storage with VPCs.
  • #10: Scaling up = buy a bigger database host. Scaling out = sharding, with additional administration costs. The minimum storage capacity for an Amazon Aurora cluster is 10 GB. Based on your usage, your storage will automatically grow in 10 GB increments up to 64 TB with no impact on database performance, and no need to provision this storage increase in advance. If you want to scale up immediately, you can scale the compute resources allocated to your instance in the AWS Console: the associated memory and CPU resources are modified by changing your instance class. You can scale from an instance with 2 vCPUs and 15 GB of memory to an instance with 32 vCPUs and 244 GB of memory. Scales up to millions of transactions per minute; if you need more than that, you can add up to 15 Read Replicas. CLICK. You can MODIFY a running instance to scale up. You can check the “Apply Immediately” checkbox; if you don’t, the change happens during your next chosen maintenance window (availability impact for a few minutes).
  • #11: Scaling out by creating up to 15 AURORA Read Replicas spread across 3 Availability Zones to further scale read capacity, reduce latency, increase availability & durability. You can do this live from the console
  • #12: Here you see a representation of what a MySQL database running on Amazon RDS would look like with Standby. #1: The primary instance writes data against EBS; then at #2 it’s mirrored to another EBS volume for EBS durability & availability; at #3 the same write operation is issued to the standby, where again at #4 & #5 the EBS volumes get the data writes. DRBD = DISTRIBUTED REPLICATED BLOCK DEVICE – LINUX DISTRIBUTED REPLICATED STORAGE. Looking at the observations, if nothing else, the I/O at #1, #3 & #5 is sequential & synchronous. Note a performance test was done: they got 780 thousand write transactions in 30 minutes, and about 7.4 thousand read/write IO operations per million transactions. This was only tested on the primary instance, because the standby database is part of the Amazon RDS service and not really relevant to what’s happening on the primary database instance. (END)
  • #13: Let’s now look at I/O traffic in an Amazon Aurora DATABASE. One difference is that the only thing the primary instance sends to the storage layer that spans the multiple AZs is redo log records. It collects them up, aka “boxcarring”: it combines them together & sends them in regular chunks to the storage nodes. The redo log records are small, & batching them up before sending really helps reduce the network I/O being sent across the wire. This allows the Aurora cluster to be a lot more TOLERANT of what’s going on from a network standpoint: because it’s sending small amounts of data to start with, any little hiccups that might occur are less impactful to what’s happening from the Aurora cluster standpoint. Once the log records get to the storage layer, Aurora does 6x more writes than you’d see in MySQL, because it’s writing to all 6 nodes – but it ends up being 9x less network traffic compared to what you’d see in MySQL (PRIMARY INSTANCE ONLY). Same benchmarking test: able to write 28 MILLION transactions, 35x more than what was achieved with MySQL. That works out to about 950K IO operations per 1M transactions – this has 6x amplification because of the 6 storage nodes. If you break that down, it’s about 158K read/write operations per storage node. That’s about 7.7x less than what it took from a MySQL standpoint.
  • #14: Let’s now look at I/O traffic on an Amazon Aurora storage node. CLICK. The primary instance sends log records to the storage node. At #1, the storage node puts them into an in-memory queue. CLICK. Then at #2, it pulls them off that queue, persists the data & writes it to disk. At that point the data is durable on the storage node & it acknowledges back – notice the acronym “ACK” – to the primary instance, & at that point all interactions with the primary instance are DONE! That is the critical path from an Aurora standpoint. Everything else is done asynchronously & can happen independent of communication with the primary instance. CLICK. At #3, once the storage node has its log records, it starts organizing them, because things can show up out of sequence or sometimes not show up at all. It figures out “do I have everything?”, “am I missing any writes?”, “is anything out of sequence?” CLICK. At #4, all 6 storage nodes begin the peer-to-peer gossip network that helps all 6 nodes talk to each other & resolve conflicts where maybe 1 of the nodes is missing data. The nodes sort all that out & exchange data so that all the nodes have the same data & all missing or conflicting data is resolved. CLICK. At #5, once that’s done, they coalesce the log records into new data block versions. CLICK. At #6, periodically, asynchronously & very frequently, those storage nodes back up the log & block data to Amazon S3, implying this is the new storage used for our database. CLICK. At #7, storage nodes also go through garbage collection, looking for old log records & data blocks that have been replaced & getting rid of those. CLICK. At #8, finally, they do data scrubbing: re-reading data blocks independent of requests from the database instance & verifying checksums against those blocks to ensure they’re still good data blocks that haven’t been corrupted through normal disk usage & aren’t dirty now. If they do find a bad data block, they leverage the peer-to-peer network again & heal themselves across that network so everything is fine again. Some things to note: the input queue is 46x less than MySQL’s I/O (unamplified, per node). They get the foreground latency path out of the way in the first 2 steps, making everything else asynchronous. The storage tier is multi-tenant, so there are going to be patterns of high & low usage during the day on that storage tier; the team worked to take advantage of the low points to get a lot of these asynchronous jobs done, so there’s no negative impact on customers but all the work still gets done in a decent amount of time.
  • #15: CLICK Amazon Aurora implements a technique called “Quorum” that enforces consistent operation in a distributed system. It’s used as a replica control method and a way to ensure transaction atomicity in the presence of network partitioning without stalling writes. CLICK Peer-to-Peer “gossip replication” fills in holes. The gossip protocol discovers and shares location & state information about the other nodes in the cluster, and is persisted locally by each node. This has been likened to gossip around a water cooler where by the end of the day everyone has heard the latest “gossip” at least once, if not many times CLICK Amazon Aurora has continuous scrubbing of data blocks. Data scrubbing is an error correction technique that uses a background task to periodically inspect main memory or storage for errors, then correct detected errors using redundant data in the form of checksums. This reduces the likelihood that single correctable errors will accumulate CLICK Amazon Aurora’s backup capability enables point-in-time recovery of your database instance, to any second during your established retention period, up to the last 5 minutes. Your automatic retention period can be anywhere up to 35 days, as you can see in this screenshot. Automated backups are stored in Amazon S3 with no impact on database performance
  • #16: Another architectural change in Amazon Aurora is the way commits happen. Traditionally, commits work like this: somebody does a write, & those writes are collected in a buffer; after some time has passed, the buffer gets flushed & written to disk. The problem with this is that whoever was the first writer gets a latency penalty: if not enough writes happen, they have to wait for the timeout flush. In Aurora, as soon as the first write happens, I/O operations start. Every write gets its own I/O; it’s not waiting for anything else. These writes are collected in a buffer, & a background job collects them at some point & sends them off to the storage nodes. They’re considered durable when 4 out of the 6 storage nodes acknowledge: YES, I’ve got the data & I’ve committed it to disk. Then they look at the last log record number – the log sequence number, or LSN – CLICK, in this case #47, and the system asks “who below this number needs an acknowledgement?” It acknowledges back to all the numbers below it that the write was successful, & at that point the database is considered durable up to this last record number, and it advances from that point. There’s a lot more complexity to how Aurora masters asynchronous processing than simply LSN lookup.
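The ack-up-to-the-durable-LSN idea from this note can be sketched in a few lines. A toy model, not Aurora source code: commit callbacks are buffered by LSN, and when the 4/6 quorum confirms some LSN, every waiter at or below it is released in one batch:

```python
class GroupCommit:
    """Toy model: buffer commit callbacks by LSN, release them in batches."""
    def __init__(self):
        self.next_lsn = 0
        self.pending = {}              # lsn -> callback awaiting durability

    def submit(self, callback):
        self.next_lsn += 1             # every write gets its own LSN & IO
        self.pending[self.next_lsn] = callback
        return self.next_lsn

    def on_quorum_ack(self, durable_lsn):
        # 4 of 6 storage nodes confirmed through durable_lsn:
        # acknowledge every commit at or below that point.
        for lsn in sorted(l for l in self.pending if l <= durable_lsn):
            self.pending.pop(lsn)(lsn)

gc = GroupCommit()
for _ in range(3):
    gc.submit(lambda lsn: print(f"commit {lsn} acknowledged as durable"))
gc.on_quorum_ack(durable_lsn=2)        # acks commits 1 & 2; 3 keeps waiting
```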
  • #17: Another architectural change in Amazon Aurora is re-engineered thread pooling. Today with MySQL, every connection gets a thread, and as more connections happen and the database is more heavily used, this can be a problem, leading to performance challenges. TIME. (Some of this is solved with MySQL Enterprise Edition, where thread groups are used: if it sees a long-running connection, it’ll add another thread to accommodate that. But it’s kind of a work-around that requires careful stall-threshold tuning to add threads in the right places, so there aren’t too many threads, which would burden the database, or too few, which would delay getting things processed.) In Amazon Aurora, from a threading standpoint, everything connects via epoll(). CLICK. Behind epoll() is a latch-free task queue with a bunch of threads that aren’t doing any work & are available for work. Because these are independent, threads are able to scale up or down dynamically depending on the number of pending transactions or connections coming in to epoll(). What’s really cool is that it’s “aware” of when a transaction is awaiting a commit: while awaiting that commit, the thread can be repurposed to go do other work, with another thread hanging around to wait for the acknowledgement of the commits when they happen. This helps threading get the most out of what’s already allocated. It means Amazon Aurora can gracefully handle 5 thousand+ concurrent client sessions on the largest Amazon Aurora instance, r3.8xl.
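The latch-free-task-queue idea reduces to: connections submit work to a shared queue and a small worker pool drains it, so thousands of sessions don’t require thousands of OS threads. A toy illustration, not Aurora’s implementation:

```python
import queue, threading

tasks = queue.Queue()

def worker():
    while True:
        job = tasks.get()
        if job is None:        # shutdown sentinel
            break
        job()                  # run one unit of work, then grab the next
        tasks.task_done()

pool = [threading.Thread(target=worker, daemon=True) for _ in range(4)]
for t in pool:
    t.start()

# 5,000 "sessions" worth of work flows through just 4 threads.
for i in range(5000):
    tasks.put(lambda i=i: None)
tasks.join()
```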
  • #18: You can also cache the writes in memory &, when the buffer fills up, flush it out asynchronously. When you cache the writes in memory, you end up with great, consistent write performance because you’re writing to memory & not to storage. This also means backups are continuous and incremental: each new log segment is copied to backup storage as it’s completed. Last bullet point: Instant Crash Recovery + Survivable Cache = Quick & Easy Recovery from DB Failures. There’s also what’s referred to as “Multi-Version Concurrency Control”, which has to do with the fact that data is only appended, not updated. What that means is that when a client requests a read, the read looks up the bit of data via the pointers in the index, so it gets the most current data at the time the read was requested. Someone could change the data while the client is reading it, but that’s OK, because that client got the most up-to-date version. If the client then wants to write a value back out, all you need to do on the write is look to see if the pointer has changed from the value when it was read; if the answer is yes, you have a concurrency problem. In Amazon Aurora, you can handle that optimistically, as opposed to pessimistically, where you’d have to lock that data down during the transaction. With optimistic concurrency, reads get a copy of the data in the exact state it existed in when the transaction started. This doesn’t mean you’re going to overwrite what someone else has written; it just means you can choose to do this or choose to re-read it & re-write it. This solves a lot of scaling issues with relational databases’ heavy use of locks, which can be burdensome. (END)
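The optimistic check described here boils down to compare-the-version-then-write. A toy sketch (illustration only, not Aurora internals):

```python
class VersionedRow:
    """A row whose version bumps on every write (append-style update)."""
    def __init__(self, value):
        self.value, self.version = value, 0

    def read(self):
        return self.value, self.version

    def write(self, new_value, expected_version):
        if self.version != expected_version:
            return False       # conflict: caller re-reads & retries
        self.value, self.version = new_value, self.version + 1
        return True

row = VersionedRow("a")
_, v = row.read()
print(row.write("b", v))       # True: nobody wrote in between
print(row.write("c", v))       # False: the version has moved on
```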
  • #19: WHAT THIS LOG-BASED STORAGE REALLY MEANS IS THAT THE DB FILE ITSELF IS THE WRITE-AHEAD LOG. It’s your replay log: as you write, you append these blocks to storage, which is what write-ahead does. That means your I/O is reduced, because you don’t have to do 2 writes anymore – write to the write-ahead log and then, secondly, do what you said you were going to do in the write-ahead log, which is how databases work today. You also get almost instantaneous recovery from failures: because the database file is the write-ahead file, if the database fails you just restart it, start reading from where you left off, update the pointers, & you’re up via your replay logs as you’re reading from the database.
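A toy sketch of “the database file is the write-ahead log”: the only write path is appending redo records, and pages are materialized by replaying the log – which is also exactly what restart/recovery does (illustration only):

```python
log = []                            # the append-only redo log IS the store

def write(page_id, data):
    log.append((page_id, data))     # one IO: append the redo record

def materialize():
    """Replay the log; recovery and normal reads are the same operation."""
    pages = {}
    for page_id, data in log:
        pages[page_id] = data       # later records supersede earlier ones
    return pages

write(1, "v1"); write(2, "x"); write(1, "v2")
print(materialize())                # {1: 'v2', 2: 'x'}
```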
  • #20: FOR BEN: ON THE NEXT SLIDE/VIDEO #20 CLICK IT, IT’S A VIDEO-ONLY DEMO WITH ANNOTATIONS WHERE I’D STOP TO EXPLAIN. EVERYONE LOVED THIS FORMAT, BECAUSE I CUT OUT ALL THE CLICKS & STUFF THAT TAKES UP TIME
  • #21: CLICK SLIDE; IT’S A VIDEO-ONLY DEMO WITH ANNOTATIONS WHERE I’D STOP TO EXPLAIN. EVERYONE LOVED THIS FORMAT, BECAUSE I CUT OUT ALL THE CLICKS & STUFF THAT TAKES UP TIME. Demo outline: launch options; regions (& AZs); Aurora DB engine; DB instance class is how you scale up – watch “Details”; Multi-AZ deployment; VPC: must deploy in a VPC with at least 1 subnet in at least 2 AZs (RDS auto-provisions a new instance in an AZ that has a VPC subnet upon a failover; also provides options to load-balance across AZs if one of the AZs becomes temporarily unavailable); VPC security group: rules for inbound access – by default, no access; IAM or DB authorization: IAM is more granular & safer; encryption: KMS; failover priority; backup retention period; enhanced monitoring (metrics from the DB are free via CloudWatch; enhanced = an agent on the instance, useful to see how different processes or threads on the DB use the CPU, etc.); maintenance window; launch & stop: review window; RDS dashboard, Instances tab: new instance creating; primary cluster with a Read Replica; Instance Actions: Modify = scale up; Instance Actions: Create Read Replica = scale out; Alarms & Recent Events tab; Configuration Details tab -> subnets & security group; DB cluster details & cluster endpoint; SQL client: paste the endpoint into “Host or IP Address”; test the connection; create a table; insert data into the table -> better to use Load from S3; SELECT * ??? – where’s the data coming from? * Another query to see how the cast from Better Call Saul rates me on my presentation; back in the RDS dashboard; CloudWatch metrics.
  • #22: AWS Database Migration Service helps you migrate databases to AWS easily and securely. The source database remains fully operational during the migration, minimizing downtime to applications that rely on the database. The AWS Database Migration Service can migrate your data to and from most widely used commercial and open-source databases. CLICK. With just a few clicks, the migration to Amazon Aurora starts. CLICK. While your original DB stays live. CLICK. You can even replicate back to your original DB.
  • #23: For heterogeneous migrations, the AWS Schema Conversion Tool automatically converts the source database schema and a majority of the custom code to a format compatible with the target database. The custom code that the tool converts includes views, stored procedures, and functions. Any code that the tool cannot convert automatically is clearly marked so that you can convert it yourself. CLICK. The SCT creates an assessment report; upon completion, the assessment report view opens, showing the Summary tab. The Summary tab displays the summary information from the database migration assessment report: it shows items that were converted automatically, and items that were not. AWS SCT ANALYZES YOUR APP, EXTRACTS THE SQL CODE & CREATES A LOCAL VERSION OF THE CONVERTED SQL FOR YOU TO REVIEW & EDIT. THE TOOL DOESN’T CHANGE THE CODE IN YOUR APP UNTIL YOU’RE READY.
  • #24: The assessment report view also includes an Action Items tab. This tab contains a list of items that can’t be converted automatically to the database engine of your target Amazon RDS DB instance. If you select an action item from the list, AWS SCT highlights the item from your schema that the action item applies to. The report also contains recommendations for how to manually convert the schema item. CLICK. Sample report: the turquoise highlight reads “Of the total 179 database storage objects in the source database, we were able to identify 169 (94%) database storage objects that can be converted automatically or with minimal changes to MySQL”. The second line states “10 (6%) database storage objects required 58 medium & 10 significant user actions to complete the conversion”. Simple – actions that can be completed in less than 1 hour. Medium – actions that are more complex and can be completed in 1 to 4 hours. Significant – actions that are very complex and take more than 4 hours to complete.
  • #25: You can migrate your DB for as little as $3/TB. AWS DMS supports, as a source, on-premises and Amazon EC2 instance databases for Microsoft SQL Server versions 2005, 2008, 2008R2, 2012, and 2014. The Enterprise, Standard, Workgroup, and Developer editions are supported. The Web and Express editions are not supported.
  • #26: Add benefits, re-evaluate