Nutanix Metro-Availability
Christian Johannsen, Senior SE Nutanix
Nutanix – Technology Review
3
Nutanix Virtual Computing Platform
4
Convergence: 21
Data: 15
Metadata: 8
Cloud: 4
VM Mobility: 3
Control Plane: 10
VDI: 2
MapReduce: 2
Security: 1
Support: 1
Analytics: 1
‱ Shared-nothing storage controller for virtualization environments.
‱ Method for networking converged shared-nothing storage for high availability.
‱ I/O and storage for a virtualization environment with multiple hypervisor types.
‱ Performing hot-swap of a storage device in a converged architecture.
Key Patents
Top Categories
Web-scale Foundation Platform: 22 patents filed
Powerful Control Plane: 10 patents filed
Scale-out Data Plane: 15 patents filed
47 Patents
Patent Distribution
Nutanix Patent Portfolio
5
Nutanix Distributed File System (NDFS)
[Diagram: Virtual Machines/Virtual Disks served by a per-node Virtual Storage Controller, with NDFS pooling Flash and HDD across nodes]
Enterprise Storage
‱ Data Locality
‱ Tiering and Caching
‱ Compression
‱ Deduplication
‱ Shadow Clones
‱ Snapshots and Clones
Data Protection
‱ Converged Backups
‱ Integrated DR
‱ Cloud Connect
‱ Metro Availability
‱ 3rd party Backup
Solutions
Resiliency
‱ Tunable Redundancy
‱ Data Path Redundancy
‱ Data Integrity Checks
‱ Availability Domains
Security
‱ Data at Rest Encryption
‱ Nutanix Security DL
‱ Cluster Shield
‱ Two-factor Auth
Nutanix – Data Protection
7
Stay covered for Critical Workloads
Nutanix Offers        RPO (time between backups)   RTO (maximum tolerable outage)
Time Stream           Minutes                      Minutes
Cloud Connect         Hours                        Hours
Metro Availability    Near-zero                    Minutes
Remote Replication    Minutes                      Minutes
(Chart axis spans minor incidents to major incidents.)
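The table distinguishes RPO (how much data you can lose, set by the time between backups) and RTO (how long you can be down). A minimal sketch of checking an SLA against these offerings; the minute figures are illustrative orders of magnitude, not official Nutanix numbers:

```python
# Illustrative RPO/RTO figures per protection scheme (minutes); the exact
# values are assumptions for the sketch, only the relative ordering matters.
SCHEMES = {
    "time_stream":        (60, 30),    # (rpo_min, rto_min)
    "cloud_connect":      (240, 240),
    "metro_availability": (0, 5),      # synchronous replication: near-zero RPO
    "remote_replication": (60, 30),
}

def worst_case_loss_minutes(snapshot_interval_min: int) -> int:
    """RPO: a failure just before the next snapshot loses up to one interval."""
    return snapshot_interval_min

def meets_sla(scheme: str, max_loss_min: int, max_outage_min: int) -> bool:
    """True if the scheme's RPO and RTO fit within the tolerated loss/outage."""
    rpo, rto = SCHEMES[scheme]
    return rpo <= max_loss_min and rto <= max_outage_min

print(meets_sla("metro_availability", 0, 10))  # True
print(meets_sla("cloud_connect", 0, 10))       # False
```

Only Metro Availability satisfies a zero-data-loss requirement, which is the point of the "near-zero RPO" row.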
8
Time Stream
Set retention policy for local
and remote snapshots
Set snapshot schedule for a
protection domain
Time-based backup (storage snapshots) with local and remote retention
 Snapshots work alongside the integrated replication
 Application-consistent snapshots possible
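Time Stream pairs a snapshot schedule with separate local and remote retention for a protection domain. A minimal sketch of that retention behavior, with hypothetical names (this is not the Nutanix API):

```python
# Sketch of Time Stream-style retention: keep the newest N local and M remote
# snapshots for a protection domain; older ones expire automatically.
from collections import deque

class ProtectionDomain:
    def __init__(self, name: str, local_retention: int, remote_retention: int):
        self.name = name
        self.local = deque(maxlen=local_retention)    # oldest fall off the end
        self.remote = deque(maxlen=remote_retention)

    def take_snapshot(self, ts: int, replicate: bool = True) -> None:
        self.local.append(ts)
        if replicate:                 # integrated replication to the remote site
            self.remote.append(ts)

pd = ProtectionDomain("sql-prod", local_retention=24, remote_retention=7)
for hour in range(48):                             # hourly schedule, two days
    pd.take_snapshot(hour, replicate=(hour % 24 == 0))  # daily remote copy
print(len(pd.local), len(pd.remote))               # 24 2
```

Local retention keeps only the last 24 hourly snapshots while the remote site holds its own, sparser copies.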
9
Nutanix Cloud Connect
Datacenter Cloud
Backup and recovery of VMs from a Nutanix cluster to the public cloud
 VM-centric (VMCaliber) and WAN-optimized
 Fully integrated management experience with Prism
 Quick restore and state recovery
10
Async DR
VM-centric workflows
 Granular, VM-based snapshots and policies rather than LUN-based
 Space-efficient sub-block-level snapshots (redirect-on-write)
 N-way master-master model for more than one site
 VM- and application-level crash consistency
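The snapshots above are redirect-on-write: taking a snapshot freezes the current block map (metadata only), and subsequent writes go to new extents instead of copying old data first. A minimal sketch of the idea, with hypothetical names:

```python
# Redirect-on-write sketch: a snapshot is a frozen copy of the block map;
# new writes allocate new extents, so (unlike copy-on-write) no
# read-modify-copy of the original block is needed.

class VDisk:
    def __init__(self):
        self.blocks = {}       # logical block address -> current data/extent
        self.snapshots = []    # list of frozen block maps

    def write(self, lba: int, data: str) -> None:
        self.blocks[lba] = data            # redirect: old extent stays untouched

    def snapshot(self) -> None:
        self.snapshots.append(dict(self.blocks))   # metadata-only freeze

vd = VDisk()
vd.write(0, "v1")
vd.snapshot()
vd.write(0, "v2")                          # snapshot still sees "v1"
print(vd.snapshots[0][0], vd.blocks[0])    # v1 v2
```

Because only the map is copied, snapshot creation cost is independent of the vdisk's data size.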
11
Introducing Nutanix Metro Availability
Geographically distributed high availability covering the entire infrastructure stack
 Covers entire infrastructure stack
 Leverage existing network
 Deploy in minutes through Nutanix Prism with minimal change management
 Mix and match models to workloads
Customer
Network
12
 Network
 ≀ 5 ms RTT
 < 400 km between the two sites
 Bandwidth depends on the data change rate
 Recommended: redundant physical networks between sites
 General
 2 Nutanix clusters, one at each site
 Mixing hardware models allowed
 Hypervisor
 ESXi in NOS 4.1
 Hyper-V/KVM in the future (Q1 CY 2015)
Requirements
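The network requirements above lend themselves to a simple pre-deployment check: latency, distance, and whether the inter-site link can sustain the data change rate. A hedged sketch; the thresholds come from the slide, the helper name is hypothetical:

```python
# Feasibility check for the Metro Availability prerequisites on the slide:
# RTT <= 5 ms, < 400 km between sites, and bandwidth >= data change rate.

def metro_feasible(rtt_ms: float, distance_km: float,
                   link_mbps: float, change_rate_mbps: float) -> list:
    """Return a list of violated requirements; an empty list means feasible."""
    issues = []
    if rtt_ms > 5:
        issues.append("round-trip time exceeds 5 ms")
    if distance_km >= 400:
        issues.append("sites 400 km or more apart")
    if link_mbps < change_rate_mbps:
        issues.append("link cannot sustain the data change rate")
    return issues

print(metro_feasible(3.2, 120, 1000, 400))   # []
```

With synchronous replication, every millisecond of RTT is added to each write acknowledgment, which is why the latency bound is the hard requirement rather than raw distance.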
13
Architecture
Synchronous storage replication
 Datastore stretched across both Nutanix clusters within a single hypervisor cluster (vMotion, HA)
 Works in conjunction with existing data management features: compression, deduplication, and tiering
 Standby containers are unavailable for direct virtual machine traffic (first release)
14
Nutanix I/O Path
1. OpLog acts as a write buffer (random writes)
2. Data is replicated synchronously to other nodes
3. Sequentially drained to the Extent Store
4. ILM (Information Lifecycle Management) chooses the right target tier for the data
5. Deduplicated read cache (Content Cache) spans memory and SSD
6. VMs accessing the same data share just one (deduplicated) copy
7. If data is not in the Content Cache, it is promoted per ILM
8. Extensible platform for future I/O patterns
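Steps 1–3 of the I/O path can be sketched as a write buffer that absorbs random writes and later drains them sequentially to the persistent tier. Purely illustrative structure, not the actual Stargate implementation:

```python
# OpLog sketch: random writes land in a fast, SSD-backed buffer and are
# acknowledged; a background drain flushes them in order to the Extent Store.

class Node:
    def __init__(self):
        self.oplog = []          # write buffer (absorbs random writes)
        self.extent_store = []   # persistent tier (sequential layout)

    def write(self, lba: int) -> None:
        # In the real path the ack follows buffering plus RF replication
        # to peer nodes; replication is omitted in this sketch.
        self.oplog.append(lba)

    def drain(self) -> None:
        """Sequentialize: flush buffered writes in order to the Extent Store."""
        self.extent_store.extend(sorted(self.oplog))
        self.oplog.clear()

n = Node()
for lba in (7, 2, 9):            # random write pattern
    n.write(lba)
n.drain()
print(n.oplog, n.extent_store)   # [] [2, 7, 9]
```

Turning random writes into a sequential drain is what lets the Extent Store sit on capacity-oriented media without paying the random-write penalty.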
15
1. Write IO
2a. Written to local OpLog (RF) and replicated synchronously to the remote OpLog
2b. Local replication within the remote OpLog (RF)
3a. Write IO ack in local OpLog (RF)
3b. Write IO ack in remote OpLog (RF)
3c. Write IO ack from remote OpLog
4. Write IO ack from local OpLog to the hypervisor
Write Anatomy
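The key property of the steps above is ordering: the hypervisor's ack (step 4) is released only after both the local OpLog and the remote OpLog hold redundant copies. A minimal sketch of that flow, with hypothetical names:

```python
# Synchronous write sketch: the hypervisor sees "ack" only after the data
# exists (with its RF copy) in both the local and the remote OpLog.
# Illustrative only; not the actual Nutanix write path.

def replicate_rf(oplog: list, data: str) -> bool:
    """Store the block and its RF copy; ack once the copy exists (steps 3a/3b)."""
    oplog.append(data)
    return True

def metro_write(local_oplog: list, remote_oplog: list, data: str) -> str:
    replicate_rf(local_oplog, data)    # 2a: local OpLog (RF)
    replicate_rf(remote_oplog, data)   # 2a/2b: synchronous remote replication (RF)
    # 3c: remote ack received -> 4: only now acknowledge to the hypervisor
    return "ack"

local, remote = [], []
status = metro_write(local, remote, "blk-42")
print(status, local == remote)   # ack True
```

Holding the ack until the remote copy exists is exactly what gives Metro Availability its near-zero RPO, at the cost of adding the inter-site RTT to each write.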
16
1. Write IO
2. Write IO forwarded to the Active Container
3a. Written to local OpLog (RF) and replicated synchronously to the remote OpLog
3b. Local replication within the remote OpLog (RF)
4a. Write IO ack in local OpLog (RF)
4b. Write IO ack in remote OpLog (RF)
4c. Write IO ack from remote OpLog
5. Write IO ack from local OpLog to the remote OpLog
6. Write IO ack from local OpLog to the hypervisor
Write Anatomy (vMotion, Recovery)
17
1. Read request
2. Read request forwarded to the Active Container
3. Data returned from the Active Container
4. Data sent to the VM
Read Anatomy (vMotion, Recovery)
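When a VM runs on the standby site (after vMotion or during recovery), its IO is forwarded to the Active Container, as the two anatomies above show. A minimal forwarding sketch with hypothetical names:

```python
# Standby-side access sketch: in the first release standby containers do not
# serve VM traffic directly, so requests are forwarded to the active side.

class Container:
    def __init__(self, active: bool, data: dict = None, peer: "Container" = None):
        self.active = active
        self.data = data or {}
        self.peer = peer          # the container on the other site

    def read(self, lba: int) -> str:
        if not self.active:       # standby: forward across the stretched datastore
            return self.peer.read(lba)
        return self.data[lba]     # active: serve locally

active = Container(active=True, data={0: "blk"})
standby = Container(active=False, peer=active)
print(standby.read(0))   # blk
```

The forwarding hop is why reads from the standby side pay the inter-site RTT until the container roles are switched.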
18
Scenarios
19
Scenarios
Network failure between sites
Manual or Automatic (seconds)
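When the inter-site link fails, replication can be broken and the standby container promoted either manually or automatically within seconds, as the slide notes. A hedged sketch of that decision; the timeout value is an illustrative assumption, not a documented default:

```python
# Failure-handling sketch for a broken inter-site link. In "automatic" mode
# the system breaks replication after a short grace period; in "manual" mode
# an administrator decides. The 10 s threshold here is an assumed example.

def handle_link_failure(outage_s: float, mode: str,
                        auto_break_after_s: float = 10) -> str:
    if mode == "automatic" and outage_s >= auto_break_after_s:
        return "break replication: promote standby container"
    if mode == "manual":
        return "await administrator decision"
    # automatic mode, but still within the grace period:
    return "keep waiting: writes stalled until link returns"

print(handle_link_failure(12, "automatic"))
# break replication: promote standby container
```

The grace period matters because, until replication is broken, synchronous writes cannot complete; too short a timeout breaks replication on transient blips, too long a timeout stalls the workload.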
20
Scenarios
Site Failure
Demo Time!
https://drive.google.com/a/nutanix.com/file/d/0B3sqKkY-Et4deF9Db2NPdlYzMmM/view
Thank You
Editor's Notes
  • #4: The secret to this radical change is in the patented Nutanix Distributed File System. What you see here is a diagram representing a typical Nutanix cluster made up of nodes, which are nothing but standard x86 servers with direct-attached SSDs and HDDs. Unlike traditional infrastructure with a finite number of storage controllers, each node added to the Nutanix cluster incorporates a storage controller VM, ensuring no bottlenecks in the architecture as you scale out. In doing so, you completely avoid forklift upgrades and irregularities in performance as new users are added, and you reduce footprint significantly. Lastly, each Nutanix node has a built-in hypervisor of choice, whether vSphere, Hyper-V or KVM. This ensures the deployment is well provisioned for future enhancements such as integration with public clouds.
  • #5: Convergence: 16; Data: 9; Metadata, MapReduce: 8; Cloud, VM Mobility: 6; Control Plane: 3; VDI: 2; Analytics, Security, Support: 3 (US Patent 8601473)
  • #8: Purpose: Nutanix delivers the power of web-scale infrastructure to enterprise customers as a turnkey solution. Key points: Nutanix brings the simplicity, agility and rapid scale that web-scale technologies deliver, but as a turnkey enterprise solution. Customers can run their diverse application workloads without having to build custom applications, and they don't have to learn how to use Cassandra, MapReduce, etc.; the Nutanix solution does all that under the hood. Talk about "controlled disruption": Nutanix is building the bridge for enterprise IT to embrace web-scale IT without completely overhauling the way they do things.
  • #9: De-duplication of data on disk: an administrator can enable disk dedupe at the container level and/or the vdisk level to reclaim capacity across the cluster. The feature is available to new as well as existing customers after they upgrade their clusters to NOS 4.0; it is disabled by default and has to be explicitly enabled. Once enabled, NDFS deduplicates data in chunks of 4 KB blocks (the block size is configurable, but it works optimally at 4 KB). Upon a write IO request, NDFS calculates and stores a SHA-1 fingerprint in metadata; data is not deduped at this point. The system performs dedupe when a subsequent read occurs for that data: the Curator process scans the data resident on the disks and compares the SHA-1 fingerprints of the 4 KB blocks (calculated at write time and stored along with the metadata). If the fingerprint of a new block matches an existing block, it updates the metadata to point to the existing block and releases the newly created block.
Block awareness/RF is applied at a level below dedupe, so the system keeps only 2 copies (or 3 in the case of RF3) of a unique block, spread across the cluster.
Dedupe comes in two flavors: inline and post-process (async). With inline dedupe there is a performance penalty, since it competes for CVM resources (CPU, memory) that are servicing user IO at the same time; with async/post-process dedupe the penalty is minimal. Inline dedupe needs to be turned on for ingest, and the SE/user should turn it OFF after ingest to avoid a performance impact.
Compression and dedupe can both be enabled from the UI, but when both are enabled on a container, compression wins in the backend and dedupe is disabled; mixing the two is not recommended, as they use different block sizes. Dedupe works with snapshots, backups and quick clones; it is not recommended to use dedupe with shadow clones.
Dedupe may yield significant savings for some workloads (e.g. VDI) but not for others (e.g. server virtualization), so keep the expected workload in mind when enabling dedupe for a container/vdisk.
  • #11: Same notes as #9 (disk de-duplication).
  • #13: The requirements are enough bandwidth to handle the data change rate and a round-trip time of ≀ 5 ms. A redundant network link is highly recommended. Keep in mind that the two clusters used to replicate data do not have to have exactly the same hardware.