2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace - Doug Jones and Kevin Lynch, Microservices Practitioner Virtual Summit 2017
Part 1: Building the Pillars of Microservices
Part 2: Containerization and Orchestration (Kubernetes)
AGENDA
Part 1: Building the Pillars
01 The Journey to Microservices
02 Building the Pillars of Microservices
Microservices Journey: A Story of Growth
2013: small (< 50 engineers)
build product & grow customer base
whatever works
2014: medium (< 100 engineers)
we have a lot of customers now!
whatever works doesn't work anymore
2016: large (100+ engineers)
architect for scalability and reliability
organizational structures
?: XL (200+ engineers)
Challenges with a Monolith
● Reliability
● Performance
● Engineering agility/speed, cross-team coupling
● Engineering time spent firefighting rather than building new functionality
What were the increasingly difficult challenges with a monolith?
https://www.squarespace.com/?gclid=<unique-id>
Challenges with a Monolith
Story of an Outage...During the Super Bowl
Challenges with a Monolith
● Monitoring typically starts at the edges
○ Think requests in, DB queries out, etc
● What about the guts of the app? How much visibility do you have there?
● How long does it take you to recover from an issue? Find the cause and fix the issue?
Challenges with Monitoring/Finding Faults
The Journey to Microservices
● Define Pillars: ideas we consider necessary for successful production microservices
● Implement these pillars as part of our platform
● Reduce boilerplate and reinventing-the-wheel syndrome
● Service authors get these for free and can focus on their application domain
Design a Platform for Production, Remove Challenges
Pillars
Microservice Framework
HTTP API
Service Discovery
Software Load Balancer
Observability
Async Client
Fault Tolerance
https://engineering.squarespace.com/blog/2017/the-pillars-of-squarespace-services
Platform Features
Service Discovery
API Documentation
Structured Logging
Metrics & Dashboards
Distributed Tracing
Contextual Information
Alert Definitions
Standardized Deployments
Healthchecks
Dynamic Configuration
Client-Side Load Balancing
Latency & Fault Tolerance
Client-Side Caching
HTTP Request Builders
Code Generation
Service Dashboard
Traffic Visualization
Server Platform
Client Platform
Tooling
Platform Features
Service Discovery
API Documentation
Structured Logging
Metrics & Dashboards
Distributed Tracing
Contextual Information
Alert Definitions
Standardized Deployments
Healthchecks
Dynamic Configuration
Client-Side Load Balancing
Latency & Fault Tolerance
Client-Side Caching
HTTP Request Builders
Code Generation
Service Dashboard
Traffic Visualization
Async/Reactive
Alert Management
Log Aggregation
Building the Pillars of Microservices
● HTTP + JSON
○ Industry standard. Tons of tools.
● Solid open source Java API server platforms
○ Started with Dropwizard
○ Now on Spring Boot (configured to use Jetty and Jersey 2)
Pillar: HTTP APIs
Building the Pillars of Microservices
● Swagger (OpenAPI Specification)
● Code generation
○ Swagger spec → models, server API, client
Even Easier APIs
Swagger Path Example
paths:
  /currency-info:
    put:
      tags:
        - CurrencyInfo
      description: "Creates a new {@link CurrencyInfo} resource."
      summary: Create a new currency info
      operationId: save
      parameters:
        - name: info
          in: body
          schema:
            $ref: '#/definitions/CurrencyInfo'
      responses:
        200:
          description: ok
          schema:
            $ref: '#/definitions/CurrencyInfo'
Interactive API Documentation
Building the Pillars of Microservices
● Services announce themselves, publishing their name and host/port information
● Started with a simple announcement payload and found that was enough
● Healthchecks to mark services down
Pillar: Service Discovery
Building the Pillars of Microservices
● First: Zookeeper
○ Complicated clients (no HTTP API)
○ Must build discovery on Zookeeper primitives
○ Strong consistency is unnecessary
○ Client heartbeats can’t be expanded upon
○ No great way to support multiple data centers
Service Discovery Systems
● Now: Consul
○ First class discovery support
○ Built in multi-data center support
○ Simple HTTP API
○ Configurable healthchecks
○ Key/value store
■ We use it for dynamic config and leader election
Building the Pillars of Microservices
Service Discovery Systems
Multi DC with Consul
[Diagram: DC1 and DC2 each run three Consul servers that form a consistent set; services announce to Consul; the two DCs are joined by WAN gossip; a primary DB replicates to a replica DB.]
Multi DC with Consul
[Diagram: DC1 and DC2, each with three Consul servers and a service; the primary DB replicates to the replica DB; a service query with ?dc="DC2" is answered through remote DC forwarding.]
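The announce and query flows on the Consul slides can be sketched with Consul's HTTP API. This is a minimal illustration, not Squarespace's client: it assumes a local agent on Consul's default HTTP port 8500, and the service name, address, port, and `/healthcheck` path are hypothetical. It only builds the request body for `PUT /v1/agent/service/register` and the URL for a health query, where the `dc` parameter asks the local agent to forward the query to another datacenter.

```java
import java.net.URI;

final class ConsulSketch {
    // Builds the JSON body for Consul's PUT /v1/agent/service/register endpoint.
    // Assumes the values contain no characters that need JSON escaping.
    static String registrationBody(String name, String address, int port, String healthPath) {
        String checkUrl = "http://" + address + ":" + port + healthPath;
        return "{"
            + "\"Name\":\"" + name + "\","
            + "\"Address\":\"" + address + "\","
            + "\"Port\":" + port + ","
            + "\"Check\":{\"HTTP\":\"" + checkUrl + "\",\"Interval\":\"10s\"}"
            + "}";
    }

    // Builds a query URL for instances passing their health checks.
    // A non-null dc asks the local agent to forward the query to that datacenter.
    static URI healthyInstancesUri(String agentBase, String service, String dc) {
        String query = "passing" + (dc == null ? "" : "&dc=" + dc);
        return URI.create(agentBase + "/v1/health/service/" + service + "?" + query);
    }

    public static void main(String[] args) {
        // Hypothetical service coordinates, for illustration only.
        System.out.println(registrationBody("taxation", "10.0.0.1", 8080, "/healthcheck"));
        System.out.println(healthyInstancesUri("http://localhost:8500", "taxation", "dc2"));
    }
}
```

Sending these requests is left out on purpose, so the sketch runs without a Consul agent.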
Building the Pillars of Microservices
● Avoid middleware/extra configuration
● Customizable logic
● Connection pooling
● System awareness to increase fault tolerance
● Builds on Netflix Ribbon OSS
Pillar: Software Load Balancers
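To make the idea of a software (client-side) load balancer concrete, here is a small sketch that rotates over discovered instances and skips the ones a health predicate rejects. It does not use Netflix Ribbon; the class name, the instance strings, and the predicate are hypothetical stand-ins for whatever the discovery system and health data provide.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

final class RoundRobinBalancer {
    private final List<String> instances;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinBalancer(List<String> instances) {
        this.instances = List.copyOf(instances);
    }

    // Returns the next instance accepted by the health predicate,
    // or empty if every instance is rejected (or the list is empty).
    Optional<String> choose(Predicate<String> healthy) {
        for (int attempt = 0; attempt < instances.size(); attempt++) {
            int index = Math.floorMod(next.getAndIncrement(), instances.size());
            String candidate = instances.get(index);
            if (healthy.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RoundRobinBalancer balancer =
            new RoundRobinBalancer(List.of("10.0.0.1:8080", "10.0.0.2:8080"));
        System.out.println(balancer.choose(instance -> true));
    }
}
```

Because the selection logic lives in the client, it can be customized per service without extra middleware, which is the point the slide makes.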
Building the Pillars of Microservices
● Metrics
● Dashboards
● Distributed Tracing
● Structured Logging
● Healthchecks
● Alerts
Pillar: Observability
Metrics & Dashboards
Distributed Tracing
Structured Logging
tail -f /data/logs/taxation-access.log
2017-03-22 07:24:45:026 GMT
thread=jetty-846
contextId=JaOLrH2O
contextSourceType=billing
clientVersion=taxation-client-3.1
level=INFO
class=AccessLog
ip=10.100.101.205
method=GET
uri=/api/1/taxation/rates
queryString=
httpVersion=HTTP/1.1
responseCode=200
responseTimeMs=39
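A log line in the key=value shape shown above could be produced along these lines. This is only a sketch: the class and method names are hypothetical, and a real service would more likely get this from its logging framework's layout than from hand-written formatting.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

final class StructuredLogLine {
    // Joins fields as key=value pairs in insertion order, separated by single spaces,
    // and prefixes the result with the timestamp.
    static String format(String timestamp, Map<String, String> fields) {
        String pairs = fields.entrySet().stream()
            .map(entry -> entry.getKey() + "=" + entry.getValue())
            .collect(Collectors.joining(" "));
        return timestamp + " " + pairs;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the fields in the order they were added.
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("contextId", "JaOLrH2O");
        fields.put("contextSourceType", "billing");
        fields.put("level", "INFO");
        fields.put("responseCode", "200");
        System.out.println(format("2017-03-22 07:24:45:026 GMT", fields));
    }
}
```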
Contextual Information
[Diagram: the Billing Service calls the Taxation Service through client v3.1; the request carries the Context Id (JaOLrH2O), Client Version, and Client Source Type.]
Building the Pillars of Microservices
● Addresses the Fanout problem, improved latency
● Reactive: RxJava with RxNetty
● Allows greater composition and reuse. Avoid “callback hell”
Pillar: Async Client
Fanout Depicted
[Diagram: a client in the application container calls Service A and Service Z; Service A fans out to Services B, C, and D.]
Sync Execution
[Diagram: the same fanout with the calls numbered 1 to 5, each starting only after the previous one finishes.]
Total Latency = A + B + C + D + Z
Async Execution
[Diagram: the same fanout with the calls to Service A and Service Z both numbered 1 and the calls to Services B, C, and D all numbered 2, so calls at the same step overlap.]
Total Latency = max(A, Z)
A = max(B, C, D) + A’s latency
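The two latency formulas can be checked with a little arithmetic, and the overlap itself can be sketched with the JDK's `CompletableFuture`. Note the swap: the talk uses RxJava with RxNetty, while this sketch sticks to the standard library so it runs without dependencies. The latency numbers in `main` are made up for illustration.

```java
import java.util.concurrent.CompletableFuture;

final class FanoutSketch {
    // Sequential model: every call waits for the previous one to finish.
    static long syncLatency(long aOwn, long b, long c, long d, long z) {
        return aOwn + b + c + d + z;
    }

    // Concurrent model: B, C and D overlap inside A, and A overlaps with Z.
    static long asyncLatency(long aOwn, long b, long c, long d, long z) {
        long aTotal = Math.max(b, Math.max(c, d)) + aOwn;
        return Math.max(aTotal, z);
    }

    // Combines two independent calls; the result is ready once both have completed.
    static String fetchBoth(CompletableFuture<String> a, CompletableFuture<String> z) {
        return a.thenCombine(z, (fromA, fromZ) -> fromA + "+" + fromZ).join();
    }

    public static void main(String[] args) {
        // Both calls are started before either result is awaited.
        CompletableFuture<String> a = CompletableFuture.supplyAsync(() -> "A");
        CompletableFuture<String> z = CompletableFuture.supplyAsync(() -> "Z");
        System.out.println(fetchBoth(a, z));

        // Hypothetical latencies in milliseconds: A's own work 20, B 30, C 10, D 40, Z 50.
        System.out.println(syncLatency(20, 30, 10, 40, 50));  // 150
        System.out.println(asyncLatency(20, 30, 10, 40, 50)); // 60
    }
}
```

With these numbers the sequential path costs 150 ms while the concurrent path costs 60 ms, because the slowest branch (D plus A's own work) dominates.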
Building the Pillars of Microservices
● Circuit breakers
● Retry logic
○ Much easier to implement w/ RxJava
● Timeouts
● Fallbacks (cached or static values)
● Netflix Hystrix
Pillar: Fault Tolerance
Fault Tolerance
[Diagram: a user request enters the application container, which holds a Service A Client, a Service B Client, and a Service C Client, each calling its own service.]
Fault Tolerance
[Diagram: the same application container, with a dedicated thread pool per client: Service A Client 10 threads, Service B Client 5 threads, Service C Client 5 threads.]
Fail fast, fail silent, or fallback
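The per-client thread pools, timeouts, and fallbacks pictured above can be approximated with `java.util.concurrent`. This is a sketch of the idea only: it does not use Netflix Hystrix, the class name is hypothetical, and it omits circuit breaking and retries. One known simplification is that a fixed thread pool queues excess work without bound, whereas a production bulkhead would typically reject it.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class BulkheadClient implements AutoCloseable {
    private final ExecutorService pool;
    private final long timeoutMs;

    // Each downstream service gets its own client, and therefore its own pool.
    BulkheadClient(int threads, long timeoutMs) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.timeoutMs = timeoutMs;
    }

    // Runs the call on this client's pool; returns the fallback on timeout or failure.
    <T> T call(Callable<T> remoteCall, T fallback) {
        Future<T> future = pool.submit(remoteCall);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            future.cancel(true);
            return fallback;
        } catch (Exception e) {
            future.cancel(true); // interrupt the worker so the thread is freed
            return fallback;
        }
    }

    @Override
    public void close() {
        pool.shutdownNow();
    }
}
```

A caller would create one instance per downstream service, for example `new BulkheadClient(5, 500)`, and pass a cached or static value as the fallback, so a slow Service B can exhaust only its own five threads.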
Pillars
Microservice Framework
HTTP API
Service Discovery
Software Load Balancer
Observability
Async Client
Fault Tolerance
https://engineering.squarespace.com/blog/2017/the-pillars-of-squarespace-services
Building the Pillars of Microservices
● Entirely Async Systems
○ Async servers, Streaming, gRPC, Netty
● Distributed task management
○ Serverless computing
● Easier/better alert definition and management
● Better tooling to create and deploy services
Future Work
Part 2: Containerization & Kubernetes Orchestration
01 The problem with static infrastructure
02 Kubernetes in a datacenter?
03 Challenges
Containerization & Kubernetes Orchestration
● Engineering org grows...
● More services…
● More infrastructure to spin up…
● Ops becomes a blocker...
Stuck in a loop
Containerization & Kubernetes Orchestration
● Difficult to find resources
● Slow to provision and scale
● Already have discovery!
● Metrics system must support short-lived metrics
● Alerts are usually per instance
Static infrastructure and microservices do not mix!
Traditional Provisioning Process
● Pick ESX with available resources
● Pick IP
● Register host to Cobbler
● Register DNS entry
● Create new VM on ESX
● PXE boot VM and install OS and base configuration
● Install system dependencies (LDAP, NTP, CollectD, Sensu…)
● Install app dependencies (Java, FluentD/Filebeat, Consul, Mongo-S…)
● Install the app
● App registers with discovery system and begins receiving traffic
Kubernetes Provisioning Process
● kubectl apply -f app.yaml
Containerization & Kubernetes Orchestration
● Provisioning/Scaling: Kubernetes
● Monitoring: Prometheus
● Alerting: AlertManager
● Discovery: Consul + Kubernetes
● Decentralization
So how do we make this magic work?
Kubernetes in a datacenter?
Kubernetes Architecture
Spine and Leaf Layer 3 Clos Topology
● Each leaf switch represents a Top-of-Rack switch (ToR)
● All work is performed at the leaf switch
● Each leaf switch is a separate Layer 3 domain
● Each leaf is a separate BGP domain (ASN)
● No Spanning Tree Protocol issues seen in L2 networks (convergence time, loops)
[Diagram: two spine switches above four leaf switches]
Spine and Leaf Layer 3 Clos Topology
● Simple to understand
● Easy to scale
● Predictable and consistent latency (hops = 2)
● Allows for Anycast IPs
[Diagram: two spine switches above four leaf switches]
Calico Networking
● No network overlay required
● Communicates directly with existing L3 mesh network
● BGP Peering with Top of Rack switch
● Calico supports Kubernetes NetworkPolicy firewall rules
Monitoring
● Graphite does not scale well with ephemeral instances
● Easy to have combinatoric explosion of metrics
Traditional Monitoring & Alerting
● Application and system alerts are tightly coupled
● Difficult to create alerts on SLAs
● Difficult to route alerts
Kubernetes Monitoring & Alerting
Microservice Pod Definition
resources:
  requests:
    cpu: 2
    memory: 4Gi
  limits:
    cpu: 2
    memory: 4Gi
[Diagram: Microservice Pod containing the Java microservice alongside fluentd and consul]
Challenges
Microservice Pod Definition
resources:
  requests:
    cpu: 2
    memory: 4Gi
  limits:
    cpu: 2
    memory: 4Gi
● Kubernetes assumes no other processes are consuming significant resources
● Completely Fair Scheduler (CFS)
○ Schedules a task based on CPU Shares
○ Throttles a task once it hits CPU Quota
Microservice Pod Definition
resources:
  requests:
    cpu: 2
    memory: 4Gi
  limits:
    cpu: 2
    memory: 4Gi
● Shares = CPU Request * 1024
● Total Kubernetes Shares = # Cores * 1024
● Quota = CPU Limit * 100ms
● Period = 100ms
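Applying these formulas to the pod definition above (`cpu: 2` for both request and limit) gives 2048 shares and a quota of 200 ms of CPU time per 100 ms period. The sketch below just encodes that arithmetic; the class and method names are hypothetical, and values are expressed in microseconds because that is the unit of the cgroup `cpu.cfs_quota_us` and `cpu.cfs_period_us` files.

```java
final class CfsLimits {
    static final long PERIOD_US = 100_000;   // 100ms period, as on the slide
    static final long SHARES_PER_CORE = 1024;

    // CPU shares derived from the pod's CPU request (in cores).
    static long shares(double cpuRequest) {
        return Math.round(cpuRequest * SHARES_PER_CORE);
    }

    // CFS quota in microseconds derived from the pod's CPU limit (in cores).
    static long quotaUs(double cpuLimit) {
        return Math.round(cpuLimit * PERIOD_US);
    }

    public static void main(String[] args) {
        System.out.println(shares(2));   // 2048
        System.out.println(quotaUs(2));  // 200000
    }
}
```

A quota larger than the period is normal: it means the container's threads may together consume more than one core's worth of time in each period.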
Java in a Container
● JVM is able to detect # of cores via sysconf(_SC_NPROCESSORS_ONLN)
● Scales tasks relative to this
Java in a Container
● Provide a base container that calculates the container’s resources!
● Detect # of “cores” assigned
○ /sys/fs/cgroup/cpu/cpu.cfs_quota_us divided by /sys/fs/cgroup/cpu/cpu.cfs_period_us
● Automatically tune the JVM:
○ -XX:ParallelGCThreads=${core_limit}
○ -XX:ConcGCThreads=${core_limit}
○ -Djava.util.concurrent.ForkJoinPool.common.parallelism=${core_limit}
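The core-limit calculation described above might look like the following. This is an illustration rather than the base container's actual logic, which the slides do not show (it is probably a startup script). Rounding up, capping at the host core count, and falling back to the host count when no quota is set are all assumptions of this sketch; the file paths are the cgroup v1 paths from the slide.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

final class ContainerCores {
    // Converts CFS quota/period into a whole number of cores.
    // Falls back to hostCores when no quota is set (the quota file holds -1 in that case).
    static int coreLimit(long quotaUs, long periodUs, int hostCores) {
        if (quotaUs <= 0 || periodUs <= 0) {
            return hostCores;
        }
        long cores = (quotaUs + periodUs - 1) / periodUs; // round up (assumption)
        return (int) Math.min(cores, hostCores);          // never exceed the host (assumption)
    }

    // Reads a single number from a file, or returns defaultValue if that is not possible.
    static long readLong(Path file, long defaultValue) {
        try {
            return Long.parseLong(Files.readString(file).trim());
        } catch (IOException | NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        Path cpu = Paths.get("/sys/fs/cgroup/cpu");
        long quota = readLong(cpu.resolve("cpu.cfs_quota_us"), -1);
        long period = readLong(cpu.resolve("cpu.cfs_period_us"), -1);
        int cores = coreLimit(quota, period, Runtime.getRuntime().availableProcessors());
        // Prints the JVM flags listed on the slide with the detected value filled in.
        System.out.println("-XX:ParallelGCThreads=" + cores
            + " -XX:ConcGCThreads=" + cores
            + " -Djava.util.concurrent.ForkJoinPool.common.parallelism=" + cores);
    }
}
```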
Java in a Container
● Many libraries rely on Runtime.getRuntime().availableProcessors()
○ Jetty
○ ForkJoinPool
○ GC Threads
○ That mystery dependency...
Java in a Container
● Use Linux preloading to override availableProcessors()
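#include <stdlib.h>
#include <unistd.h>

/* Replaces the JVM's processor-count function when loaded via LD_PRELOAD. */
int JVM_ActiveProcessorCount(void) {
    /* Core limit is read from the environment. */
    char* val = getenv("CONTAINER_CORE_LIMIT");
    /* Fall back to the number of online cores when the variable is unset. */
    return val != NULL ? atoi(val) : sysconf(_SC_NPROCESSORS_ONLN);
}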
https://engineering.squarespace.com/blog/2017/understanding-linux-container-scheduling
Communication With External Services
● Environment-specific services should not be encoded in the application
● Single deployment for all environments and datacenters
● Federation API expects the same deployment
● Not all applications are using Consul
Communication With External Services
apiVersion: v1
kind: Service
metadata:
  name: kafka
  namespace: elk
spec:
  type: ClusterIP
  clusterIP: None
  sessionAffinity: None
  ports:
    - port: 9092
      protocol: TCP
      targetPort: 9092
---
apiVersion: v1
kind: Endpoints
metadata:
  name: kafka
  namespace: elk
subsets:
  - addresses:
      - ip: 10.120.201.33
      - ip: 10.120.201.34
      - ip: 10.120.201.35
      ...
    ports:
      - port: 9092
        protocol: TCP
So what’s left?
Future Work: Enforce Squarespace Standards
● Custom Admission Controller requires all services, deployments, etc. meet certain standards
○ Resource requests/limits
○ Owner annotations
○ Service labels
Future Work: Updating Common Dependencies
● Custom Initializers
○ Inject container dependencies into deployments (consul, fluentd)
○ Configure Prometheus instances for each namespace
● Trigger rescheduling of pods when dependencies need updating
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: location
  namespace: core-services
  annotations:
    initializer.squarespace.net/consul: "true"
QUESTIONS
Thank you!
squarespace.com/careers
Doug Jones
@dougfjones
Kevin Lynch
@kevml


2017 Microservices Practitioner Virtual Summit: Microservices at Squarespace - Doug Jones and Kevin Lynch, Microservices Practitioner Virtual Summit 2017

  • 2. Part 1: Building the Pillars of Microservices Part 2: Containerization and Orchestration (Kubernetes) AGENDA
  • 3. Part 1: Building the Pillars 01 The Journey to Microservices 02 Building the Pillars of Microservices
  • 5. Microservices Journey: A Story of Growth 2013: small (< 50 engineers) build product & grow customer base whatever works 2014: medium (< 100 engineers) we have a lot of customers now! whatever works doesn't work anymore 2016: large (100+ engineers) architect for scalability and reliability organizational structures ?: XL (200+ engineers)
  • 6. Challenges with a Monolith ● Reliability ● Performance ● Engineering agility/speed, cross-team coupling ● Engineering time spent fire fighting rather than building new functionality What were the increasingly difficult challenges with a monolith?
  • 7. https://www.squarespace.com/?gclid=<unique-id> Challenges with a Monolith Story of an Outage...During the Super Bowl
  • 8. Challenges with a Monolith ● Monitoring typically starts at the edges ○ Think requests in, DB queries out, etc ● What about the guts of the app? How much visibility do you have there? ● How long does it take you to recover from an issue? Find the cause and fix the issue? Challenges with Monitoring/Finding Faults
  • 9. The Journey to Microservices ● Define Pillars: ideas we consider necessary for successful production microservices ● Implement these pillars as part of our platform ● Reduce boilerplate and reinventing the wheel syndrome ● Service authors get these for free and can focus on their application domain Design a Platform for Production, Remove Challenges
  • 10. Pillars Microservice Framework HTTP API Service Discovery Software Load Balancer Observability Async Client Fault Tolerance https://engineering.squarespace.com/blog/2017/the-pill ars-of-squarespace-services
  • 11. Platform Features Service Discovery API Documentation Structured Logging Metrics & Dashboards Distributed Tracing Contextual Information Alert Definitions Standardized Deployments Healthchecks Dynamic Configuration Client-Side Load Balancing Latency & Fault Tolerance Client-Side Caching HTTP Request Builders Code Generation Service Dashboard Traffic Visualization Server Platform Client Platform ToolingTooling
  • 12. Platform Features Service Discovery API Documentation Structured Logging Metrics & Dashboards Distributed Tracing Contextual Information Alert Definitions Standardized Deployments Healthchecks Dynamic Configuration Client-Side Load Balancing Latency & Fault Tolerance Client-Side Caching HTTP Request Builders Code Generation Service Dashboard Traffic Visualization Async/Reactive Alert Management Log Aggregation
  • 13. Building the Pillars of Microservices ● HTTP + JSON ○ Industry standard. Tons of tools. ● Solid open source Java API server platforms ○ Started with Dropwizard ○ now on Spring Boot (configured to use Jetty and Jersey 2) Pillar: HTTP APIs
  • 14. Building the Pillars of Microservices ● Swagger (OpenAPI Specification) ● Code generation ○ Swagger spec → models, server API, client Even Easier APIs
  • 15. Swagger Path Example paths: /currency-info: put: tags: - CurrencyInfo description: "Creates a new {@link CurrencyInfo} resource." summary: Create a new currency info operationId: save parameters: - name: info in: body schema: $ref: '#/definitions/CurrencyInfo' responses: 200: description: ok schema: $ref: '#/definitions/CurrencyInfo'
  • 17. Building the Pillars of Microservices ● Services announce themselves, publishing their name and host/port information ● Started with a simple announcement payload and found that was enough ● Healthchecks to mark services down Pillar: Service Discovery
  • 18. Building the Pillars of Microservices ● First: Zookeeper ○ Complicated clients (no HTTP API) ○ Must build discovery on Zookeeper primitives ○ Strong consistency is unnecessary ○ Client heartbeats can’t be expanded upon ○ No great way to support multiple data centers Service Discovery Systems
  • 19. ● Now: Consul ○ First class discovery support ○ Built in multi-data center support ○ Simple HTTP API ○ Configurable healthchecks ○ key/value store ■ We use for dynamic config and leader election Building the Pillars of Microservices Service Discovery Systems
  • 20. DC2DC1 Multi DC with Consul ConsulConsulConsul ConsulConsulConsul Service Announce Service Announce Primary DB Replica DB Replicate WAN Gossip Consistent Set
  • 21. DC2DC1 Multi DC with Consul ConsulConsulConsul ConsulConsulConsul Service Service Primary DB Replica DB Replicate Service Query ?dc=”DC2” Remote DC forwarding
  • 22. Building the Pillars of Microservices ● Avoid middleware/extra configuration ● Customizable logic ● Connection pooling ● System awareness to increase fault tolerance ● Builds on Netflix Ribbon OSS Pillar: Software Load Balancers
  • 23. Building the Pillars of Microservices ● Metrics ● Dashboards ● Distributed Tracing ● Structured Logging ● Healthchecks ● Alerts Pillar: Observability
  • 26. Structured Logging tail -f /data/logs/taxation-access.log 2017-03-22 07:24:45:026 GMT thread=jetty-846 contextId=JaOLrH2O contextSourceType=billing clientVersion=taxation-client-3.1 level=INFO class=AccessLog ip=10.100.101.205 method=GET uri=/api/1/taxation/rates queryString= httpVersion=HTTP/1.1 responseCode=200 responseTimeMs=39
  • 27. Contextual Information Client v3.1 Taxation Service Billing Service Context IdClient Version Client Source Type JaOLrH2O
  • 28. Building the Pillars of Microservices ● Addresses the Fanout problem, improved latency ● Reactive: RxJava with RxNetty ● Allows greater composition and reuse. Avoid “callback hell” Pillar: Async Client
  • 29. Fanout Depicted Client Service A Service Z Application Container Service B Service C Service D
  • 30. Sync Execution Client Service A Service Z Application Container Service B Service C Service D 1 2 3 4 5 Total Latency = A + B + C + D + Z
  • 31. Async Execution Client Service A Service Z Application Container Service B Service C Service D 1 2 2 2 1 Total Latency = max(A, Z) A = max(B, C, D) + A’s latency
  • 32. Building the Pillars of Microservices ● Circuit breakers ● Retry logic ○ Much easier to implement w/ RxJava ● Timeouts ● Fallbacks (cached or static values) ● Netflix Hystrix Pillar: Fault Tolerance
  • 33. Fault Tolerance Service B Service A Service C Service A Client Service B Client Service C Client User Request Application Container
  • 34. Fault Tolerance Service B Service A Service C Service A Client 10 Threads Service B Client 5 Threads Service C Client 5 Threads User Request Fail fast, fail silent, or fallback Application Container
  • 35. Pillars Microservice Framework HTTP API Service Discovery Software Load Balancer Observability Async Client Fault Tolerance https://engineering.squarespace.com/blog/2017/the-pill ars-of-squarespace-services
  • 36. Building the Pillars of Microservices ● Entirely Async Systems ○ Async servers, Streaming, gRPC, Netty ● Distributed task management ○ Serverless computing ● Easier/better alert definition and management ● Better tooling to create and deploy services Future Work
  • 37. Part 2: Containerization & Kubernetes Orchestration 01 The problem with static infrastructure 02 Kubernetes in a datacenter? 03 Challenges
  • 38. Containerization & Kubernetes Orchestration ● Engineering org grows... ● More services… ● More infrastructure to spin up… ● Ops becomes a blocker... Stuck in a loop
  • 39. Containerization & Kubernetes Orchestration ● Difficult to find resources ● Slow to provision and scale ● Already have discovery! ● Metrics system must support short lived metrics ● Alerts are usually per instance Static infrastructure and microservices do not mix!
  • 40. Traditional Provisioning Process ● Pick ESX with available resources ● Pick IP ● Register host to Cobbler ● Register DNS entry ● Create new VM on ESX ● PXE boot VM and install OS and base configuration ● Install system dependencies (LDAP, NTP, CollectD, Sensu…) ● Install app dependencies (Java, FluentD/Filebeat, Consul, Mongo-S…) ● Install the app ● App registers with discovery system and begins receiving traffic
  • 41. Kubernetes Provisioning Process ● kubectl apply -f app.yaml
  • 42. Containerization & Kubernetes Orchestration ● Provisioning/Scaling: Kubernetes ● Monitoring: Prometheus ● Alerting: AlertManager ● Discovery: Consul + Kubernetes ● Decentralization So how do we make this magic work?
  • 43. Kubernetes in a datacenter?
  • 45. Spine and Leaf Layer 3 Clos Topology ● Each leaf switch represents a Top-of-Rack switch (ToR) ● All work is performed at the leaf switch ● Each leaf switch is a separate Layer 3 domain ● Each leaf is a separate BGP domain (ASN) ● No Spanning Tree Protocol issues seen in L2 networks (convergence time, loops) [Diagram: four leaf switches and two spine switches]
  • 46. Spine and Leaf Layer 3 Clos Topology ● Simple to understand ● Easy to scale ● Predictable and consistent latency (hops = 2) ● Allows for Anycast IPs [Diagram: four leaf switches and two spine switches]
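The "hops = 2" claim can be checked on a toy model of the topology. The sketch below assumes the usual Clos wiring, where every leaf links to every spine and leaves do not link to each other, and counts each link as one hop; the switch names and helper functions are illustrative only.

```python
from collections import deque
from itertools import combinations


def build_clos(num_spines, num_leaves):
    """Return an adjacency map in which every leaf links to every spine."""
    spines = [f"spine{i}" for i in range(num_spines)]
    leaves = [f"leaf{i}" for i in range(num_leaves)]
    graph = {node: set() for node in spines + leaves}
    for leaf in leaves:
        for spine in spines:
            graph[leaf].add(spine)
            graph[spine].add(leaf)
    return graph, leaves


def hops(graph, src, dst):
    """Breadth-first search: number of links on the shortest path."""
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Two spines and four leaves, as drawn on the slide.
graph, leaves = build_clos(num_spines=2, num_leaves=4)
hop_counts = {hops(graph, a, b) for a, b in combinations(leaves, 2)}
print(hop_counts)  # {2}
```

Every leaf-to-leaf path is leaf, spine, leaf, which is why latency stays consistent as more leaves are added.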
  • 47. Calico Networking ● No network overlay required ● Communicates directly with existing L3 mesh network ● BGP Peering with Top of Rack switch ● Calico supports Kubernetes NetworkPolicy firewall rules
  • 49. Traditional Monitoring & Alerting ● Graphite does not scale well with ephemeral instances ● Easy to have a combinatorial explosion of metrics ● Application and system alerts are tightly coupled ● Difficult to create alerts on SLAs ● Difficult to route alerts
  • 53. Microservice Pod Definition resources: {requests: {cpu: 2, memory: 4Gi}, limits: {cpu: 2, memory: 4Gi}} [Diagram: Microservice Pod containing three containers: Java Microservice, fluentd, consul]
  • 55. Microservice Pod Definition resources: {requests: {cpu: 2, memory: 4Gi}, limits: {cpu: 2, memory: 4Gi}} ● Kubernetes assumes no other processes are consuming significant resources ● Completely Fair Scheduler (CFS) ○ Schedules a task based on CPU Shares ○ Throttles a task once it hits CPU Quota
  • 56. Microservice Pod Definition resources: {requests: {cpu: 2, memory: 4Gi}, limits: {cpu: 2, memory: 4Gi}} ● Shares = CPU Request * 1024 ● Total Kubernetes Shares = # Cores * 1024 ● Quota = CPU Limit * 100ms ● Period = 100ms
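As a worked example of the formulas on this slide, the sketch below computes the CFS values for the pod definition shown (2 CPUs requested, 2 CPUs limit). The function name and the 16-core node are assumptions made for illustration; only the formulas come from the slide.

```python
CFS_PERIOD_US = 100_000   # Period = 100ms, expressed in microseconds
SHARES_PER_CPU = 1024     # Shares = CPU Request * 1024


def cfs_settings(cpu_request, cpu_limit, node_cores):
    """Translate Kubernetes CPU request/limit into CFS shares and quota."""
    shares = int(cpu_request * SHARES_PER_CPU)
    quota_us = int(cpu_limit * CFS_PERIOD_US)      # Quota = CPU Limit * 100ms
    total_shares = node_cores * SHARES_PER_CPU     # Total = # Cores * 1024
    return {
        "cpu.shares": shares,
        "cpu.cfs_quota_us": quota_us,
        "cpu.cfs_period_us": CFS_PERIOD_US,
        "fraction_of_node": shares / total_shares,
    }


# The pod from the slide on a hypothetical 16-core node.
print(cfs_settings(cpu_request=2, cpu_limit=2, node_cores=16))
```

With these inputs the pod gets 2048 shares and a 200000 microsecond quota per 100000 microsecond period, so it is throttled once it has used two cores' worth of CPU time within a period.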
  • 57. Java in a Container ● JVM detects the # of cores via sysconf(_SC_NPROCESSORS_ONLN), which reports the host's online cores rather than the container's CPU limit ● Scales tasks relative to this
  • 58. Java in a Container ● Provide a base container that calculates the container’s resources! ● Detect # of “cores” assigned ○ /sys/fs/cgroup/cpu/cpu.cfs_quota_us divided by /sys/fs/cgroup/cpu/cpu.cfs_period_us ● Automatically tune the JVM: ○ -XX:ParallelGCThreads=${core_limit} ○ -XX:ConcGCThreads=${core_limit} ○ -Djava.util.concurrent.ForkJoinPool.common.parallelism=${core_limit}
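The detection step above could look something like the following sketch. It reads the two cgroup v1 files named on the slide and emits the three JVM flags listed there. The function names, the round-up behavior, and the fallback to the host core count when no quota is set are assumptions for illustration, not details from the talk.

```python
import math
import os


def detect_core_limit(cgroup_dir="/sys/fs/cgroup/cpu"):
    """Return quota / period from cgroup v1, or the host core count if unlimited."""
    try:
        with open(os.path.join(cgroup_dir, "cpu.cfs_quota_us")) as f:
            quota = int(f.read().strip())
        with open(os.path.join(cgroup_dir, "cpu.cfs_period_us")) as f:
            period = int(f.read().strip())
    except (OSError, ValueError):
        return os.cpu_count() or 1
    if quota <= 0 or period <= 0:
        # A quota of -1 means no CPU limit is set for this cgroup.
        return os.cpu_count() or 1
    # Assumption: round fractional limits up so the JVM gets at least one thread.
    return max(1, math.ceil(quota / period))


def jvm_flags(core_limit):
    """Build the tuning flags listed on the slide for a given core limit."""
    return [
        f"-XX:ParallelGCThreads={core_limit}",
        f"-XX:ConcGCThreads={core_limit}",
        f"-Djava.util.concurrent.ForkJoinPool.common.parallelism={core_limit}",
    ]


print(" ".join(jvm_flags(detect_core_limit())))
```

A container entrypoint could run this once at startup and pass the resulting flags to the `java` command.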
  • 59. Java in a Container ● Many libraries rely on Runtime.getRuntime().availableProcessors() ○ Jetty ○ ForkJoinPool ○ GC Threads ○ That mystery dependency...
  • 60. Java in a Container ● Use Linux preloading to override availableProcessors()
      #include <stdlib.h>
      #include <unistd.h>

      /* Preloaded so this definition is used for JVM_ActiveProcessorCount. */
      int JVM_ActiveProcessorCount(void) {
          char* val = getenv("CONTAINER_CORE_LIMIT");
          /* Fall back to the host's online core count when the variable is unset. */
          return val != NULL ? atoi(val) : sysconf(_SC_NPROCESSORS_ONLN);
      }
    https://engineering.squarespace.com/blog/2017/understanding-linux-container-scheduling
  • 61. Communication With External Services ● Environment-specific services should not be encoded in the application ● Single deployment for all environments and datacenters ● Federation API expects the same deployment ● Not all applications are using Consul
  • 63. Communication With External Services
      apiVersion: v1
      kind: Service
      metadata:
        name: kafka
        namespace: elk
      spec:
        type: ClusterIP
        clusterIP: None
        sessionAffinity: None
        ports:
        - port: 9092
          protocol: TCP
          targetPort: 9092
      ---
      apiVersion: v1
      kind: Endpoints
      metadata:
        name: kafka
        namespace: elk
      subsets:
      - addresses:
        - ip: 10.120.201.33
        - ip: 10.120.201.34
        - ip: 10.120.201.35
        ...
        ports:
        - port: 9092
          protocol: TCP
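One way to keep a single application deployment across environments is to generate the headless Service and its Endpoints from a per-environment address list. The sketch below builds both objects as plain dictionaries and prints them as a JSON `List` that `kubectl apply -f` can read. The `external_service` helper is hypothetical; the names, port, and IPs are the ones shown on the slide.

```python
import json


def external_service(name, namespace, port, ips):
    """Build a selector-less headless Service plus matching Endpoints."""
    service = {
        "apiVersion": "v1",
        "kind": "Service",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "ClusterIP",
            "clusterIP": "None",  # headless: DNS returns the endpoint IPs directly
            "sessionAffinity": "None",
            "ports": [{"port": port, "protocol": "TCP", "targetPort": port}],
        },
    }
    endpoints = {
        "apiVersion": "v1",
        "kind": "Endpoints",
        # Must share the Service's name and namespace to be associated with it.
        "metadata": {"name": name, "namespace": namespace},
        "subsets": [
            {
                "addresses": [{"ip": ip} for ip in ips],
                "ports": [{"port": port, "protocol": "TCP"}],
            }
        ],
    }
    return service, endpoints


service, endpoints = external_service(
    "kafka", "elk", 9092, ["10.120.201.33", "10.120.201.34", "10.120.201.35"]
)
manifest = {"apiVersion": "v1", "kind": "List", "items": [service, endpoints]}
print(json.dumps(manifest, indent=2))
```

The application only ever refers to the service name `kafka` in namespace `elk`; swapping the address list per datacenter changes where traffic goes without touching the application's configuration.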
  • 65. Future Work: Enforce Squarespace Standards ● Custom Admission Controller requires all services, deployments, etc. meet certain standards ○ Resource requests/limits ○ Owner annotations ○ Service labels
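The kind of check such an admission controller might perform can be sketched as a plain validation function. This is only the decision logic: a real admission controller would receive the object inside an admission request and return an allow/deny response. The annotation key `example.com/owner` and the required label `app` are placeholders, since the slide does not name the actual keys.

```python
def validate_deployment(manifest,
                        owner_annotation="example.com/owner",  # placeholder key
                        required_labels=("app",)):             # placeholder label
    """Return a list of standards violations; an empty list means 'admit'."""
    errors = []
    meta = manifest.get("metadata") or {}
    annotations = meta.get("annotations") or {}
    labels = meta.get("labels") or {}

    if owner_annotation not in annotations:
        errors.append(f"missing owner annotation {owner_annotation}")
    for label in required_labels:
        if label not in labels:
            errors.append(f"missing label {label}")

    pod_spec = ((manifest.get("spec") or {}).get("template") or {}).get("spec") or {}
    for container in pod_spec.get("containers") or []:
        resources = container.get("resources") or {}
        for kind in ("requests", "limits"):
            for resource in ("cpu", "memory"):
                if resource not in (resources.get(kind) or {}):
                    errors.append(
                        f"container {container.get('name')} missing "
                        f"resources.{kind}.{resource}"
                    )
    return errors


deployment = {
    "metadata": {"name": "location", "labels": {"app": "location"}},
    "spec": {"template": {"spec": {"containers": [
        {"name": "location", "resources": {"requests": {"cpu": 2, "memory": "4Gi"}}},
    ]}}},
}
for problem in validate_deployment(deployment):
    print(problem)
```

Rejecting objects at admission time means the standards are enforced once, centrally, instead of relying on each service author to remember them.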
  • 66. Future Work: Updating Common Dependencies ● Custom Initializers ○ Inject container dependencies into deployments (consul, fluentd) ○ Configure Prometheus instances for each namespace ● Trigger rescheduling of pods when dependencies need updating
      apiVersion: extensions/v1beta1
      kind: Deployment
      metadata:
        name: location
        namespace: core-services
        annotations:
          initializer.squarespace.net/consul: "true"
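The injection step of such an initializer can be sketched as a function that adds a sidecar container when the annotation from the slide is present. Only the annotation key comes from the slide; the sidecar definition, its image name, and the `inject_sidecars` helper are hypothetical.

```python
import copy

CONSUL_ANNOTATION = "initializer.squarespace.net/consul"  # key shown on the slide
# Hypothetical sidecar definition; the real image and settings are not in the talk.
CONSUL_SIDECAR = {"name": "consul", "image": "example.com/consul:latest"}


def inject_sidecars(deployment):
    """Return a copy of the deployment with the consul sidecar added if requested."""
    annotations = (deployment.get("metadata") or {}).get("annotations") or {}
    if annotations.get(CONSUL_ANNOTATION) != "true":
        return deployment
    patched = copy.deepcopy(deployment)
    pod_spec = (
        patched.setdefault("spec", {})
        .setdefault("template", {})
        .setdefault("spec", {})
    )
    containers = pod_spec.setdefault("containers", [])
    # Idempotent: do not add the sidecar twice if the object is processed again.
    if not any(c.get("name") == CONSUL_SIDECAR["name"] for c in containers):
        containers.append(dict(CONSUL_SIDECAR))
    return patched


deployment = {
    "apiVersion": "extensions/v1beta1",
    "kind": "Deployment",
    "metadata": {
        "name": "location",
        "namespace": "core-services",
        "annotations": {CONSUL_ANNOTATION: "true"},
    },
    "spec": {"template": {"spec": {"containers": [{"name": "location"}]}}},
}
patched = inject_sidecars(deployment)
print([c["name"] for c in patched["spec"]["template"]["spec"]["containers"]])
```

Keeping the sidecar definition in one place is what makes the second bullet possible: updating the shared definition and rescheduling pods rolls the dependency out everywhere.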