Novel Multi-region Clusters
Cassandra Deployments Split Between Heterogeneous Data Centres with NAT & DNS-SD
#CassandraSummit
Adam Zegelin
Co-founder & VP of Engineering
www.instaclustr.com
adam@instaclustr.com

@adamzegelin
Instaclustr
• Instaclustr provides Cassandra-as-a-service in the cloud

(Currently only on AWS — Google Cloud in private beta)
• We currently manage 50+ Cassandra nodes for various customers
• We often get requests to do cool things, and we try to make them happen!
Multi-DC @ Instaclustr
• Cloud ⇄ cloud, “classic” internet-facing data centre ⇄ cloud
• Works out-of-the-box today.
• Requires per-node public IP
• Private network clusters ⇄ Cloud clusters
• Easy if your private network allocates per-node public IP addresses
• VPNs
• Something else?
• Overview of multi-region/data centre clusters
• What is supported out-of-the-box
• Alternative solutions
• Supporting technology overview (NAT/PAT and DNS-SD)
• Implementation
Single Node
• What you get from running apt-get install cassandra and /usr/bin/cassandra
• Fragile (no redundancy)
• Dev/test/sandbox only
(Diagram: a single C* node)
Multi-node, Single Data Centre
• Two or more servers running Cassandra within one DC
• Replication of data (redundancy)
• Increased capacity (storage + throughput)
• Baseline for production clusters
(Diagram: three C* nodes in one data centre)
Multi-node, Multi-DC
• Cassandra running in two or more data centres
• Global deployments
• Data near your customers (reduced latency)
• Supported out-of-the-box
(Diagram: nine C* nodes in three groups of three)
Snitches
• Understands data centres and racks
• Implementation may automatically determine node DC and rack

(Ec2MultiRegionSnitch uses the AWS instance metadata service, GossipingPropertyFileSnitch loads a .properties file)
• Node DC and rack are advertised via Gossip
• Determines node proximity (estimated link latency)
• Cluster may use a combination of Snitch implementations
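A property-file snitch takes its data centre and rack from static configuration rather than from a cloud metadata service. The sketch below shows roughly what that amounts to, assuming a file with `dc` and `rack` keys (the keys used by `cassandra-rackdc.properties`). The class name `RackDcProperties` is hypothetical and is not part of Cassandra.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Illustrative only: holds the dc/rack pair a property-file snitch would advertise. */
final class RackDcProperties {
    final String dc;
    final String rack;

    private RackDcProperties(String dc, String rack) {
        this.dc = dc;
        this.rack = rack;
    }

    /** Parses .properties text that must contain both a "dc" and a "rack" key. */
    static RackDcProperties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String dc = props.getProperty("dc");
        String rack = props.getProperty("rack");
        if (dc == null || rack == null) {
            throw new IllegalArgumentException("both dc and rack must be set");
        }
        return new RackDcProperties(dc.trim(), rack.trim());
    }
}
```

For example, `RackDcProperties.parse("dc=US_EAST_1\nrack=us-east-1a\n")` yields the pair that would then be gossiped to the rest of the cluster.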
Data Centres
• Collection of Racks
• Each can hold a complete replica of the data
• Geographically separate
• Possibly high-latency interconnects

(e.g. East Coast US → Sydney, ~300ms round-trip)
Racks
• Collection of nodes
• May fail as a single unit
• Modelled on the traditional DC rack/cage

(n servers running off a UPS)
☁
• Amazon Web Services

(use Ec2MultiRegionSnitch)
• Data Centre ≡ AWS Region

(e.g. US_EAST_1, AP_SOUTHEAST_2)
• Rack ≡ Availability Zone

(e.g. us-east-1a, ap-southeast-2b)
• Google Cloud Platform

(no out-of-the-box auto-configuring snitch; use GossipingPropertyFileSnitch, or roll your own!)
• Data Centre ≡ GCP Region

(e.g. US, Europe)
• Rack ≡ Zone

(e.g. us-central1-a, europe-west1-a)
Data Centre Aware
• Cassandra is data centre aware
• Only fetch data from a remote DC if absolutely required

(remote data is more “expensive”)
• Clients can be made data centre aware
• If your app knows its DC, the client will talk to the closest DC
Cluster cluster = Cluster.builder()
    .addContactPoint(…)
    .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("US_EAST_1"))
    .build();
Multi DC Support
• Per-node public (internet-facing) IP address
• Optionally, per-node private IP address
• Per-node public address is used for inter-data centre connectivity
• Per-node private address is used for intra-data centre connectivity
Multi DC Support
• Cloud ⇄ cloud, traditional ⇄ cloud, traditional ⇄ traditional
• Easy to set up per-node public and private addresses
• Private network clusters ⇄ Cloud clusters
• Private networks: 𝑛 public addresses, shared by 𝑥 private addresses. Not 1 ↔ 1

(where often 𝑥 > 𝑛)
• Done via Network Address Translation
IPv4 Address Space Exhaustion
Source: http://www.potaroo.net/tools/ipv4/
Multi-DC Support
• IPv4
• Address exhaustion
• Over time, will become more expensive to purchase addresses
• Wasteful

(being a good internet citizen)
Alternatives
• IPv6
• Java supports it ∴ Cassandra probably supports it

(untested by us)
• Global IPv6 adoption is ~4%

(according to Google — google.com/intl/en/ipv6/statistics.html)
• IPv6/IPv4 hybrid

(Teredo, 6over4, et al.)
• AWS EC2 does not support IPv6. End of story.

(Elastic Load Balancer does support IPv6)
Alternatives
• VPNs
• tinc, OpenVPN, etc.
• All private address space — no dual addressing
• Requires multiple links — between every DC and per client
• Address space overlaps between multiple VPNs
• Connectivity to multiple clusters an issue

(for multi-cluster apps, centralised monitoring, etc.)
Data Centres    Links
3               3
5               10
7               21
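The link counts in the table match a full mesh, where every data centre has a direct link to every other one, giving n(n-1)/2 links for n data centres. A minimal sketch of that arithmetic follows; the class and method names are illustrative, and per-client links are not counted.

```java
/** Illustrative helper for the VPN full-mesh arithmetic. */
final class VpnMesh {
    /** Number of point-to-point links needed to connect every pair of data centres: n(n-1)/2. */
    static long fullMeshLinks(int dataCentres) {
        if (dataCentres < 0) {
            throw new IllegalArgumentException("dataCentres must be non-negative");
        }
        return (long) dataCentres * (dataCentres - 1) / 2;
    }
}
```

The count grows quadratically, which is why a per-pair VPN becomes harder to manage as data centres are added.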
Alternatives
• Network Address Translation (NAT)

(aka IP Masquerading or Port Address Translation (PAT))
• Deployed on most private networks
• Connectivity between private network clusters ⇄ Cloud clusters
• Supports client connectivity to multiple clusters
NAT Basics
• Re-maps IP address spaces

(e.g. Public 96.31.81.80 ↔ Private 192.168.*.*)
• 𝑛 public addresses, shared by 𝑥 private addresses. Not 1 ↔ 1

(where often 𝑛 = 1, 𝑥 > 𝑛)
• Port Address Translation
• Private port ↔ Public port
• Outbound connections only without port forwarding or NAT traversal
• Per-DC gateway device performs NAT and port forwarding
NAT with Inbound Connections
• Static port forwarding

(configured on the gateway)
• Automatic port forwarding — UPnP, NAT-PMP/PCP

(configured by the application, e.g. Cassandra)
• NAT Traversal — STUN, ICE, etc.
NAT + C∗
Situation: 𝑛 Cassandra nodes, 1 public address per data centre
• Port forward different public ports for each node
• Advertise assigned ports
• Modify Cassandra and client applications to connect to advertised ports
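With one public address per data centre, the gateway has to give each node its own public port. The sketch below models that bookkeeping as a small table keyed by private IP address. It is an assumption-laden illustration: `PortForwardTable` is a hypothetical name, ports are handed out sequentially for predictability, and a real gateway would also program its forwarding rules.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative port-forwarding table for one NAT gateway (one public address, many nodes). */
final class PortForwardTable {
    private final Map<String, Integer> publicPortByPrivateIp = new LinkedHashMap<>();
    private int nextPort;

    PortForwardTable(int firstPublicPort) {
        this.nextPort = firstPublicPort;
    }

    /** Returns the public port forwarded to the node, allocating one on first use. */
    synchronized int publicPortFor(String privateIp) {
        Integer existing = publicPortByPrivateIp.get(privateIp);
        if (existing != null) {
            return existing;
        }
        if (nextPort > 65535) {
            throw new IllegalStateException("no public ports left");
        }
        int port = nextPort++;
        publicPortByPrivateIp.put(privateIp, port);
        return port;
    }
}
```

Each node keeps the same public port on repeated calls, so the value can be advertised once and cached by peers and clients.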
Advertising Port Mappings
• Extend Cassandra Gossip
• Include port numbers in node address announcements
• Allow seed node addresses to include port numbers
• Allow multiple nodes to have identical public & private addresses

(only port numbers differ per DC)
• How to bootstrap? SIP?
• Cassandra must be aware of the allocated ports in order to advertise
• Hard if C* is not directly responsible for the port mapping

(e.g. static port forwarding)
• Too many modifications to internals
Advertising Port Mappings
• DNS-SD — dns-sd.org

(aka Bonjour/Zeroconf)
• Reads — works with existing DNS implementations

(it’s just a DNS query)
• Even inside restrictive networks, DNS usually works
• Combination of DNS TXT, SRV and PTR records.
• Updates
• via DNS Update & TSIG — supported by bind
• via API — e.g. for AWS Route 53
Advertising Port Mappings
• DNS-SD cont’d.
• SRV records contain hostname and port

(i.e., hostname of the NAT gateway and public C* port)
• TXT records contain key=value pairs

(useful for additional connection & config details)
• Modify C* connection code to lookup foreign node port from DNS
• Modify client driver connection code to lookup ports from DNS
• Can be queried & updated out-of-band

(updated by the NAT device or central management server which knows which ports were mapped)
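Because a DNS-SD read is an ordinary DNS query, the SRV lookup can be done from Java with the JDK's built-in JNDI DNS provider. The sketch below is one possible way to do it, not the code used in the talk: `SrvLookup` and `Target` are hypothetical names, only the first SRV record is used, and `lookup` needs working DNS at run time.

```java
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.InitialDirContext;

/** Illustrative SRV lookup using the JDK's JNDI DNS provider. */
final class SrvLookup {
    /** Host and port taken from one SRV record. */
    static final class Target {
        final String host;
        final int port;

        Target(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Parses SRV record data of the form "priority weight port target". */
    static Target parseSrv(String rdata) {
        String[] fields = rdata.trim().split("\\s+");
        if (fields.length != 4) {
            throw new IllegalArgumentException("unexpected SRV data: " + rdata);
        }
        return new Target(fields[3], Integer.parseInt(fields[2]));
    }

    /** Queries DNS for the first SRV record of a service instance name. */
    static Target lookup(String serviceInstanceName) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        InitialDirContext ctx = new InitialDirContext(env);
        try {
            Attribute srv = ctx.getAttributes(serviceInstanceName, new String[] {"SRV"}).get("SRV");
            if (srv == null) {
                throw new NamingException("no SRV record for " + serviceInstanceName);
            }
            return parseSrv(srv.get().toString());
        } finally {
            ctx.close();
        }
    }
}
```

The returned host is the NAT gateway's hostname and the port is the node's public port, which is all a peer or client needs in order to connect.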
Advertised Details
• Each cluster is its own browse domain
• Each NAT gateway device has an A record in the browse domain
• Each DNS-SD service is named based on the private IP address
• Requires unique private IP addresses across data centres
• SRV port is the C* Thrift port
• Additional ports are advertised via TXT
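The naming convention above (service instance named after the private IP address, inside the cluster's browse domain) can be sketched as a small helper. The `_cassandra._tcp` service type and the dash-separated label follow the `dns-sd` output shown in this deck; the class name `ServiceNames` is hypothetical and only IPv4 addresses are considered.

```java
/** Illustrative helper that derives a DNS-SD service instance name from a private IPv4 address. */
final class ServiceNames {
    /** Builds "<ip-with-dashes>._cassandra._tcp.<browseDomain>." for a node. */
    static String instanceName(String privateIpv4, String browseDomain) {
        String label = privateIpv4.replace('.', '-');
        // Make the name fully qualified so resolvers do not append a search domain.
        String domain = browseDomain.endsWith(".") ? browseDomain : browseDomain + ".";
        return label + "._cassandra._tcp." + domain;
    }
}
```

Because the label is derived from the private address alone, two nodes with the same private IP in different data centres would collide, which is why unique private addresses are required.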
Configuration
• Cassandra is configured to only use private addresses
• On cluster creation
• Establish a new DNS-SD browse domain
• Create A records for each gateway device
• NAT gateway device is notified when a new C* node is started
• Allocates random public ports for C* and configures Port Forwarding
• Updates DNS-SD
• New SRV and TXT record
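The DNS-SD update for a new node boils down to three records: a PTR from the service type to the instance, an SRV with the gateway host and public port, and a TXT with extra key=value details. The sketch below only formats those records as zone-file style lines. Pushing them to a server (DNS Update with TSIG, or a provider API) is left out, and `DnsSdRecords` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative formatter for the DNS-SD records that describe one Cassandra node. */
final class DnsSdRecords {
    /** Returns zone-file style PTR, SRV and TXT lines; browseDomain and gatewayHost end with a dot. */
    static List<String> forNode(String instanceLabel, String browseDomain,
                                String gatewayHost, int publicPort, String... txtPairs) {
        String service = "_cassandra._tcp." + browseDomain;
        String instance = instanceLabel + "." + service;

        StringBuilder txt = new StringBuilder();
        for (String pair : txtPairs) {
            if (txt.length() > 0) {
                txt.append(' ');
            }
            txt.append('"').append(pair).append('"');
        }
        if (txt.length() == 0) {
            txt.append("\"\""); // DNS-SD expects a TXT record even when there is nothing to say
        }

        List<String> records = new ArrayList<>();
        records.add(service + " PTR " + instance);
        records.add(instance + " SRV 0 0 " + publicPort + " " + gatewayHost);
        records.add(instance + " TXT " + txt);
        return records;
    }
}
```

Priority and weight are fixed at 0 here because each instance has exactly one target, its data centre's gateway.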
$ dns-sd -B _cassandra._tcp 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au.
Browsing for _cassandra._tcp
A/R Flags if Domain Service Type Instance Name
Add 3 0 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. _cassandra._tcp. 192-168-2-4
Add 3 0 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. _cassandra._tcp. 192-168-1-2
Add 3 0 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. _cassandra._tcp. 192-168-2-3
Add 3 0 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. _cassandra._tcp. 192-168-2-2
Add 3 0 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. _cassandra._tcp. 192-168-1-4
Add 2 0 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. _cassandra._tcp. 192-168-1-3
$ dns-sd -L 192-168-1-4 _cassandra._tcp 1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au.
Lookup 192-168-1-4._cassandra._tcp.1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au.
192-168-1-4._cassandra._tcp.1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au. can be reached at aws-us-east1-gateway.1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au.:1236 (interface 0)
version=2.0.7
cqlport=1237
$ nslookup aws-us-east1-gateway.1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au.
Non-authoritative answer:
Name: aws-us-east1-gateway.1da53f83-e635-11e3-96eb-2ec9d09504f5.clusters.instaclustr.com.au
Address: 54.209.123.195
Output of dns-sd

(Can also use avahi-browse, dig, or any other DNS query tool)
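The TXT strings in the lookup output (`version=2.0.7`, `cqlport=1237`) are plain key=value pairs, so reading the CQL port on the client side is a small parsing step. The sketch below assumes the strings have already been fetched from DNS; `TxtRecord` is a hypothetical name, and the case-insensitive, first-occurrence-wins handling follows the DNS-SD convention for TXT keys.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative parser for DNS-SD TXT key=value strings. */
final class TxtRecord {
    /** Parses key=value strings; a string without '=' is stored with an empty value. */
    static Map<String, String> parse(String... strings) {
        Map<String, String> attrs = new LinkedHashMap<>();
        for (String s : strings) {
            int eq = s.indexOf('=');
            String key = eq < 0 ? s : s.substring(0, eq);
            String value = eq < 0 ? "" : s.substring(eq + 1);
            // Keys are case-insensitive; keep the first occurrence of each key.
            attrs.putIfAbsent(key.toLowerCase(Locale.ROOT), value);
        }
        return attrs;
    }

    /** Returns the advertised CQL port, or -1 when no "cqlport" key is present. */
    static int cqlPort(Map<String, String> attrs) {
        String value = attrs.get("cqlport");
        return value == null ? -1 : Integer.parseInt(value);
    }
}
```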
Java Driver Modifications
• The driver's AddressTranslater is usually a no-op

(the default is IdentityTranslater)
• Modify translate() to perform a DNS-SD lookup.
• The address parameter is a node private IP address.
• Locate a service with a name = private IP address to determine public IP/port.
public interface AddressTranslater {
    public InetSocketAddress translate(InetSocketAddress address);
}
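A translate() that consults DNS-SD could look roughly like the sketch below. To stay self-contained it does not implement the driver's AddressTranslater interface, although the method has the same shape. The lookup is passed in as a function from instance label to public address, which is a hypothetical seam standing in for the real DNS query.

```java
import java.net.InetSocketAddress;
import java.util.function.Function;

/** Illustrative translater: maps a node's private address to its advertised public address. */
final class DnsSdAddressTranslater {
    private final Function<String, InetSocketAddress> lookup;

    /** lookup takes an instance label such as "192-168-1-4" and returns null when nothing is advertised. */
    DnsSdAddressTranslater(Function<String, InetSocketAddress> lookup) {
        this.lookup = lookup;
    }

    /** Same shape as AddressTranslater.translate(); falls back to the input when no mapping exists. */
    public InetSocketAddress translate(InetSocketAddress address) {
        if (address.getAddress() == null) {
            return address; // unresolved input, nothing to derive a label from
        }
        String privateIp = address.getAddress().getHostAddress();
        String instanceLabel = privateIp.replace('.', '-');
        InetSocketAddress mapped = lookup.apply(instanceLabel);
        return mapped != null ? mapped : address;
    }
}
```

Falling back to the original address keeps the identity behaviour for nodes that are reachable directly, for example clients running inside the same private network.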
Modifying Cassandra
• OutboundTcpConnectionPool is responsible for managing Socket connections.
• Modify newSocket() to perform a DNS-SD lookup.
• The endpoint parameter is a node private IP address.
• Locate a service with a name = private IP address to determine public IP/port
public class OutboundTcpConnectionPool
{
    ⋮
    public static Socket newSocket(InetAddress endpoint) throws IOException {…}
    ⋮
}
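On the Cassandra side, the change to newSocket() amounts to choosing a different address to dial before opening the socket. The sketch below is a standalone illustration rather than a patch to OutboundTcpConnectionPool: `NatAwareSockets`, the `lookup` function, the `defaultPort` parameter and the connect timeout are all assumptions introduced here.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.function.Function;

/** Illustrative socket factory that dials the advertised public mapping of a peer. */
final class NatAwareSockets {
    /** Picks the address to dial: the advertised mapping if present, else endpoint:defaultPort. */
    static InetSocketAddress dialAddress(InetAddress endpoint, int defaultPort,
                                         Function<InetAddress, InetSocketAddress> lookup) {
        InetSocketAddress mapped = lookup.apply(endpoint);
        return mapped != null ? mapped : new InetSocketAddress(endpoint, defaultPort);
    }

    /** Opens a TCP connection to the peer via its gateway, closing the socket on failure. */
    static Socket newSocket(InetAddress endpoint, int defaultPort,
                            Function<InetAddress, InetSocketAddress> lookup,
                            int connectTimeoutMillis) throws IOException {
        InetSocketAddress target = dialAddress(endpoint, defaultPort, lookup);
        if (target.isUnresolved()) {
            // Resolve the gateway hostname (the A record in the browse domain).
            target = new InetSocketAddress(target.getHostString(), target.getPort());
        }
        Socket socket = new Socket();
        try {
            socket.connect(target, connectTimeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }
}
```

Nodes in the same data centre would normally have no mapping applied and keep using the private address, so only cross-data-centre traffic goes through the gateways.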
(Diagram: two data centres, each with three C* nodes behind a NAT Gateway; a DNS (+ DNS-SD) Server (Route 53, self-hosted, etc.); and a Client Application)
Thanks!
Questions?
adam@instaclustr.com
