© 2016 NetApp, Inc. All rights reserved. --- NETAPP CONFIDENTIAL ---1
WE’VE GOT ALL YOUR
OPENSTACK
STORAGE COVERED.
Ed Balduf - ed.balduf@solidfire.com, @madskier5
Alex Meade - alex.meade@netapp.com, @mralexmeade
Cinder: How Stuff works
Live Migration and Replication
Live Migration
▪ Block Migration
▪ Disk must be copied between Compute nodes
▪ Shared Storage
▪ Compute nodes share instance storage
▪ Volume-based
▪ Instance information is stored on Cinder backend
Guest OS on VM has no indication it changed compute nodes
False documentation
Live Migration and Storage Compatibility
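Migration Type | Local Storage | Cinder Volumes | Shared Storage
Block Migration
Live Migration
Block Migration with read-only devices
Live Migration with read-only devices
(The per-cell compatibility marks were graphics on the slide and are not recoverable here.)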
The Config Drive
▪ 2 ways to inject configuration information into a VM
▪ MetaData service
▪ Config Drive
▪ The Config Drive is a R/O storage device
▪ See previous slide
▪ Nova force_config_drive option may be used to force a config drive
▪ Do not use this option.
▪ Or use shared storage for the config drive
▪ Users can specify one if they want
Live Migration Flow with Block storage
1. Pre-Migration
▪ Check Memory, CPU and Disk resources
2. Reservation
▪ Mount Disks as needed
▪ Calls Cinder initialize_connection() again.
3. Pre-Copy
4. Stop and Copy
5. Commitment
6. Clean-up
▪ Unmount disks as necessary.
See the great presentation from Vancouver: https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/dive-into-vm-live-migration
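The six phases above can be sketched as an ordered state machine. This is only an illustration of the ordering on the slide; the `Phase` enum, `next_phase` and `run_migration` are hypothetical names and are not part of Nova or Cinder.

```python
from enum import Enum


class Phase(Enum):
    """Live migration phases in the order listed on the slide."""
    PRE_MIGRATION = 1   # check memory, CPU and disk resources
    RESERVATION = 2     # mount disks; Cinder initialize_connection() is called again
    PRE_COPY = 3        # copy memory to the destination
    STOP_AND_COPY = 4   # copy dirty memory and CPU state
    COMMITMENT = 5      # the VM runs on the destination
    CLEAN_UP = 6        # unmount disks as necessary


def next_phase(phase):
    """Return the phase that follows `phase`, or None after clean-up."""
    members = list(Phase)
    index = members.index(phase)
    return members[index + 1] if index + 1 < len(members) else None


def run_migration():
    """Walk the phases in order and return their names."""
    visited = []
    phase = Phase.PRE_MIGRATION
    while phase is not None:
        visited.append(phase.name)
        phase = next_phase(phase)
    return visited
```

Calling `run_migration()` returns the phase names in slide order, which makes it easy to see that the second Cinder call happens in Reservation, before any memory is copied.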
Nova & Cinder
Hypervisor
Pre-Migration
▪ Storage, reached over the storage protocol
▪ Compute A: VM A [Running]
▪ Compute B: (empty)
Reservation
▪ Storage
▪ Compute A: VM A [Running]
▪ Compute B: VM A [Reserved]
Pre-Copy
▪ Storage
▪ Compute A: VM A [Running]
▪ Compute B: VM A [Paused]
▪ Copy memory
Stop and Copy
▪ Storage
▪ Compute A: VM A [Paused]
▪ Compute B: VM A [Paused]
▪ Copy dirty memory and CPU state
**NOTE** The maximum time in this phase is equal to the live_migration_downtime Nova config option, which defaults to 500 milliseconds.
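A minimal sketch of reading that option from a nova.conf-style file with Python's standard `configparser`. The sample text and the placement of the option under a `[libvirt]` section are assumptions for illustration; check the Nova configuration reference for your release. `max_downtime_ms` is a hypothetical helper name.

```python
import configparser

# Hypothetical nova.conf fragment; the [libvirt] section is an assumption.
SAMPLE_NOVA_CONF = """
[libvirt]
live_migration_downtime = 500
"""

DEFAULT_DOWNTIME_MS = 500  # default value stated on the slide


def max_downtime_ms(conf_text):
    """Return the configured maximum stop-and-copy time in milliseconds."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # Fall back to the slide's default when the section or option is missing.
    return parser.getint(
        "libvirt", "live_migration_downtime", fallback=DEFAULT_DOWNTIME_MS
    )
```

If the option is absent, the helper returns the 500 ms default quoted above instead of raising.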
Commitment
▪ Storage
▪ Compute A: (empty)
▪ Compute B: VM A [Running]
Clean-up
▪ Storage
▪ Compute A: (empty)
▪ Compute B: VM A [Running]
Demo
Gotchas
▪ Error reporting is non-existent
▪ If authentication is misconfigured or a firewall blocks the libvirt port, the migration fails silently.
▪ Mitaka is better about doing up-front storage checks in the API.
▪ Cinder User Messages (coming in Newton)
▪ Ex: cinder message-show 07ce25a6-3af4-4f05-9169-bf540eea9e22
+------------------+--------------------------------------------------------+
| Property         | Value                                                  |
+------------------+--------------------------------------------------------+
| created_at       | 2016-04-13T21:21:50.000000                             |
| event_id         | MULTIPLE_ATTACHMENT_ERROR                              |
| guaranteed_until | 2016-05-13T21:21:50.000000                             |
| id               | 07ce25a6-3af4-4f05-9169-bf540eea9e22                   |
| message_level    | ERROR                                                  |
| request_id       | req-03110a48-3769-419b-b40b-e200ddf2c378               |
| resource_type    | VOLUME                                                 |
| resource_uuid    | 450a62fd-f809-4226-96a2-75593a4ad558                   |
| user_message     | Could not map target LUN to multiple initiators.       |
+------------------+--------------------------------------------------------+
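When this kind of output is captured by a script, the two-column table can be turned into a dictionary. The sketch below is a plain text parser for the Property/Value layout shown above; it does not call the Cinder client, and `parse_property_table` and `SAMPLE_OUTPUT` are hypothetical names. It assumes values never contain a `|` character.

```python
def parse_property_table(text):
    """Parse a two-column '| Property | Value |' CLI table into a dict."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # skip the header row and anything malformed
        result[cells[0]] = cells[1]
    return result


# Shortened copy of the output shown on the slide.
SAMPLE_OUTPUT = """
+---------------+--------------------------------------------------+
| Property      | Value                                            |
+---------------+--------------------------------------------------+
| event_id      | MULTIPLE_ATTACHMENT_ERROR                        |
| message_level | ERROR                                            |
| resource_type | VOLUME                                           |
| user_message  | Could not map target LUN to multiple initiators. |
+---------------+--------------------------------------------------+
"""
```

A caller can then branch on `parse_property_table(SAMPLE_OUTPUT)["message_level"]` and show `user_message` to the operator.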
Live Migration Resources
▪ Live Migration Configuration
▪ The current OpenStack documentation is now fantastic at describing this:
▪ http://docs.openstack.org/admin-guide/compute-configuring-migrations.html
▪ Blogs:
▪ Remy van Elst - Kilo Release - 6/13/2015
▪ https://raymii.org/s/articles/Openstack_-_(Manually)_migrating_(KVM)_Nova_Compute_Virtual_Machines.html#Configure_(live)_migration
▪ John Griffith - Juno Release - 12/8/2014
▪ http://j-griffith.github.io/2014/12/08/openstack-live-migration-with-cinder-backed-instances/
▪ Kimi Zhang - Grizzly Release - 8/26/2013
▪ https://kimizhang.wordpress.com/2013/08/26/openstack-vm-live-migration/
▪ Sébastien Han - Essex Release - 7/12/2012
▪ http://www.sebastien-han.fr/blog/2012/07/12/openstack-block-migration/
▪ Video:
▪ https://www.openstack.org/summit/vancouver-2015/summit-videos/presentation/dive-into-vm-live-migration
Replication in Cinder.
Why are we up here again?
Replication in the cloud with multiple vendor backends is HARD!
We’re on design #4
Early designs - Vendor centric. No knowledge in the cloud or applications.
Official V1 - Juno. IBM only.
Official V2 - Liberty. No drivers released.
Official V2.1 (aka Cheesecake) - Mitaka mid-cycle
Game plan for Cheesecake
Simplified use case:
Disaster Recovery only.
Admin disaster recovery only.
Fail everything which is replicated to the DR site.
Non-replicated volumes are ‘Offline’
Before Cinder learned about replication
Vendor specific volume type extra specs - indication of replication state of the backend
Examples:
▪ SolidFire example from Essex (sf:replication:all-of-the-replication-infos)
▪ mvip: IPaddr, api_port: portNum, login: loginToMvip, password: secretPassword
▪ not in tree: https://github.com/j-griffith/nova/blob/essex-sf-replication/nova/volume/san.py
▪ NetApp example (netapp_mirrored)
OpenStack is completely unaware. If failover occurs, the admin must re-configure OpenStack.
Use Case for Cheesecake!
Straightforward DR
Non-automated failover of replicated volumes.
When a disaster is declared...
API for the Cloud Administrator to call to cause failover.
DR storage system is not seen or managed in OpenStack until failover
Non-replicated volumes are “Offline”
There is no split decision.
DR storage unit becomes the backend.
No failback (your primary is on fire, remember!)
No concept of a managed secondary!
Terms this time around
Fail-over
Switch over to the secondary array.
Volumes which are replicated will be there.
Volumes not replicated will not be available.
Attached volumes will need to be re-attached manually.
Freeze
Do not allow any resource create/delete actions
snapshot-create, xxx-delete, resize, retype etc should return an InvalidCommand error
I/O is still allowed, but this is an admin freeze
The idea is to keep things stable for recovery (if possible)
Unfreeze
Allow resource create/delete commands.
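The three terms above can be modelled with a small toy class. This is a sketch of the described behaviour only, not Cinder code: `ToyBackend`, its methods, and the local `InvalidCommand` exception are hypothetical names chosen to mirror the wording on the slide.

```python
class InvalidCommand(Exception):
    """Raised when a create/delete style action is attempted while frozen."""


class ToyBackend:
    """Illustrative stand-in for a replicated backend; not Cinder code."""

    def __init__(self, volumes):
        # volumes maps a volume name to True when that volume is replicated
        self.volumes = dict(volumes)
        self.frozen = False
        self.failed_over = False

    def failover(self):
        """Switch over to the secondary array."""
        self.failed_over = True

    def freeze(self):
        """Block resource create/delete actions; I/O is not modelled here."""
        self.frozen = True

    def unfreeze(self):
        """Allow resource create/delete actions again."""
        self.frozen = False

    def available(self):
        """After fail-over, only replicated volumes are available."""
        if not self.failed_over:
            return sorted(self.volumes)
        return sorted(name for name, replicated in self.volumes.items() if replicated)

    def create_snapshot(self, name):
        """A create action: rejected while frozen or for unavailable volumes."""
        if self.frozen:
            raise InvalidCommand("backend is frozen")
        if name not in self.available():
            raise KeyError(name)
        return f"snap-of-{name}"
```

For example, a backend built with `{"db": True, "scratch": False}` lists only `db` after `failover()`, and `create_snapshot("db")` raises `InvalidCommand` until `unfreeze()` is called.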
Old Terms (no longer used)
Terms:
promote
Reenable
enabled/disabled
Status:
disabled
inactive
active
active-stopped
error
Tasks (Admin)
replication enable
replication disable
replication failover
list replication targets
How it works and what it does:
Driver must report:
replication_enabled = True
in its capabilities.
[solidfire-1]
volume_driver = cinder.volume.drivers.solidfire.SolidFireDriver
san_ip = 172.27.1.50
san_login = admin
san_password = solidfire
volume_backend_name = solidfire
sf_account_prefix = balduf-master
replication_device = backend_id:172.27.1.191,mvip:172.27.1.191,login:admin,password:admin
(Note: No trailing comma allowed in replication_device KV pair list)
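To make the key:value format and the trailing-comma rule concrete, here is a small sketch that splits a `replication_device` value into a dictionary. It is not how Cinder parses the option internally; `parse_replication_device` is a hypothetical helper, and the sample value is the one from the slide.

```python
def parse_replication_device(value):
    """Split 'k1:v1,k2:v2' into a dict, rejecting empty items."""
    device = {}
    for item in value.split(","):
        if not item.strip():
            # A trailing (or doubled) comma produces an empty item.
            raise ValueError("empty item; a trailing comma is not allowed")
        key, sep, val = item.partition(":")
        if not sep or not key.strip():
            raise ValueError(f"expected key:value, got {item!r}")
        device[key.strip()] = val.strip()
    return device


# Value copied from the configuration shown above.
SLIDE_VALUE = "backend_id:172.27.1.191,mvip:172.27.1.191,login:admin,password:admin"
```

`parse_replication_device(SLIDE_VALUE)` yields the four keys `backend_id`, `mvip`, `login` and `password`; appending a comma to the value raises `ValueError`.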
Volume extra specs
Keywords:
replication : enabled/disabled
All others are vendor specific:
Example: HP
type: sync/periodic
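A minimal sketch of how a script might separate the common keyword from the vendor-specific ones. It uses the keyword exactly as written on the slide (`replication` with `enabled`/`disabled`); the exact key and value syntax accepted by a given driver is an assumption to verify against that driver's documentation. Both function names are hypothetical.

```python
def wants_replication(extra_specs):
    """True when the extra specs ask for replication (slide keyword form)."""
    value = extra_specs.get("replication", "disabled")
    return value.strip().lower() == "enabled"


def vendor_specs(extra_specs):
    """Return every extra spec other than the common replication keyword."""
    return {key: val for key, val in extra_specs.items() if key != "replication"}
```

With `{"replication": "enabled", "type": "sync"}`, `wants_replication` returns `True` and `vendor_specs` returns `{"type": "sync"}`.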
Drivers Supporting Replication
Available in Mitaka:
SolidFire (out of tree)
Dell
EMC
HP
Huawei
Storwize
IBM
Pure
In process, coming in Newton:
NetApp Data ONTAP & E-series.
Fail-back or lack thereof
▪ If there really is a disaster and ‘A’ is burned to a crisp, there is no fail-back!
▪ But how do we make ‘B’ the new master?
▪ And someday buy ‘C’ and replicate to it?
▪ Fix the database
$ mysql -u root
MariaDB [(none)]> use cinder
MariaDB [cinder]> select id,host,disabled,disabled_reason,replication_status,frozen,active_backend_id from services;
+----+----------------------------------------------+----------+-----------------+--------------------+--------+-------------------+
| id | host                                         | disabled | disabled_reason | replication_status | frozen | active_backend_id |
+----+----------------------------------------------+----------+-----------------+--------------------+--------+-------------------+
|  1 | devstack-master.pm.solidfire.net             |        0 | NULL            | not-capable        |      0 | NULL              |
|  2 | devstack-master.pm.solidfire.net             |        0 | NULL            | not-capable        |      0 | NULL              |
|  3 | devstack-master.pm.solidfire.net@solidfire-1 |        1 | NULL            | failed-over        |      0 | 172.27.50.191     |
|  4 | devstack-master.pm.solidfire.net@lvmdriver-1 |        0 | NULL            | disabled           |      0 | NULL              |
+----+----------------------------------------------+----------+-----------------+--------------------+--------+-------------------+
4 rows in set (0.00 sec)
MariaDB [cinder]> update services set disabled=0,disabled_reason=NULL,replication_status='disabled',active_backend_id=NULL where id=3;
▪ Go to page #1
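The database fix above can be rehearsed safely against a throwaway table before touching a real deployment. The sketch below uses Python's built-in `sqlite3` as a stand-in for MariaDB, with a cut-down `services` table holding only the columns from the query above; `make_demo_db` and `reset_failed_over_service` are hypothetical helpers, and the extra `replication_status = 'failed-over'` guard in the WHERE clause is an addition that is not on the slide.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory table with the failed-over row from the slide."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE services ("
        "id INTEGER PRIMARY KEY, host TEXT, disabled INTEGER, "
        "disabled_reason TEXT, replication_status TEXT, "
        "frozen INTEGER, active_backend_id TEXT)"
    )
    conn.execute(
        "INSERT INTO services VALUES (?, ?, ?, ?, ?, ?, ?)",
        (3, "devstack-master.pm.solidfire.net@solidfire-1",
         1, None, "failed-over", 0, "172.27.50.191"),
    )
    conn.commit()
    return conn


def reset_failed_over_service(conn, service_id):
    """Apply the slide's UPDATE; returns the number of rows changed."""
    cursor = conn.execute(
        "UPDATE services SET disabled = 0, disabled_reason = NULL, "
        "replication_status = 'disabled', active_backend_id = NULL "
        # Guard added for this sketch: only touch rows that are failed over.
        "WHERE id = ? AND replication_status = 'failed-over'",
        (service_id,),
    )
    conn.commit()
    return cursor.rowcount
```

Running `reset_failed_over_service(make_demo_db(), 3)` changes one row; a second call on the same connection changes none, because the row is no longer marked `failed-over`.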
Demo
Tiramisu (next) Newton
Design cycle here in Austin.
‘Goal’ is some control by the tenant
What if the tenant doesn’t want to wait for Admin?
What if the tenant has a disaster somewhere else in their application?
‘Goal’ to deal with vendor/tenant grouping constructs for replication
May become a separate effort

Cinder Live Migration and Replication - OpenStack Summit Austin
