what happens when you type
en.wikipedia.org
SREcon19 Dublin
@kosiaris • @manjiki
effie mouzeli • alexandros kosiaris
About
CC BY-SA 4.0 Niccolò Caranti
Did you know...
● … the Wikipedia infrastructure is run by the Wikimedia
Foundation, an American nonprofit charitable
organisation?
● … and we are ~370 people?
● … and we have no affiliation with Wikileaks?
● … all content is managed by volunteers?
● … we support 304 languages?
● … Wikipedia is 18 years old?
● … Wikipedia hosts some really, really weird articles?
● … Wikipedia can’t be read in Turkey (since 2017) or China (since 2019)?
Wikimedia Projects
Wikimedia Infrastructure
✺ Open source software
✺ 2 Primary Data Centres
✺ 3 Caching Points of Presence
✺ ~17 billion pageviews per month*
✺ ~300k new editors per month
✺ ~1300 bare metal servers
* it’s complicated
Site Reliability Engineering
✺ Datacenter Operations
✺ Data Persistence
✺ Infrastructure Foundations
✺ Service Operations
✺ Traffic
The SRE team is a globally distributed
team of 26 people responsible for
developing and maintaining Wikimedia's
production systems.
The Foundation has SREs in other
teams as well!
Application Layer
CC BY-SA 2.0 Arthur Dunn
MediaWiki
✺ Our core application
✺ PHP, Apache, MySQL. Yes.*
✴ PHP 7.2 since Sept 2019
✺ Wiki web pages - app servers
cluster
✺ API cluster
✺ Jobrunners/Videoscalers cluster
MediaWiki is free, server-based
software licensed under the GNU GPL.
It is a powerful, scalable,
feature-rich wiki
implementation that uses PHP to
process and display data stored in a
database such as MySQL.
* it’s complicated
Application Layer Caches
From a Monolith to Microservices
[Diagrams: 2014 and 2019]
✺ Elasticity
✺ Hardware fault mitigation
✺ Deployments
✺ Migration is not easy, and is still
ongoing
Microservices!
✺ Thumbor
✺ Mathoid
✺ ORES
✺ Mobile Content Service (MCS)
✺ And many more
Thumbor is used for image scaling.
Mathoid renders LaTeX and returns JSON
with PNG, SVG or MathML renderings of the
formula.
ORES scores edits using machine learning
(an anti-vandalism effort).
MCS modifies page content on the fly,
tailoring it for mobile.
Kubernetes
Public Domain
✺ Bare metal
✺ Calico as a CNI plugin
✺ Helm for deployments
✺ 2 clusters + 1 staging one
✺ Docker as the container runtime engine (CRE)
We have been running it successfully for
the last 2 years! Currently, 11 services run on
it, and a pipeline is in the works.
It powers all mathematical formulas on
Wikipedia!
Message Queueing
CC BY 2.0 bootbearwdc
Message Queueing
✺ Yes, we use Apache Kafka
✺ We are sending events like:
✳ wikitext templates refresh
✳ edge caches purging
✳ cross wiki links
✳ create new thumbnails
✳ re-encoding videos to open source
formats
Apache Kafka: stream processing
platform for real-time data feeds
One message queue to rule them all;
started as a service for Analytics only.
Now, it is our de facto solution.
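To make the idea of one shared event log concrete, here is a minimal in-memory sketch of append-only topics read by offset. It is not Kafka's API and not how the events are actually produced; the class, the topic names and the event fields are all hypothetical and only loosely modelled on the event types listed above.

```python
from collections import defaultdict


class InMemoryLog:
    """Toy stand-in for a Kafka-like log: append-only topics read by offset."""

    def __init__(self):
        self._topics = defaultdict(list)

    def produce(self, topic, event):
        """Append an event to a topic and return its offset."""
        self._topics[topic].append(event)
        return len(self._topics[topic]) - 1

    def consume(self, topic, offset=0):
        """Return all events in a topic from the given offset onwards."""
        return self._topics[topic][offset:]


log = InMemoryLog()
# Hypothetical topic names and payloads, for illustration only.
log.produce("cache-purge", {"url": "https://en.wikipedia.org/wiki/Dublin"})
log.produce("thumbnail-create", {"file": "Example.jpg", "width": 320})
log.produce("cache-purge", {"url": "https://en.wikipedia.org/wiki/Kafka"})

# A consumer tracks its own offset, so several consumers can read
# the same topic independently and at their own pace.
purger_offset = 0
purges = log.consume("cache-purge", purger_offset)
purger_offset += len(purges)
```

Because events stay in the log after being read, a second consumer starting at offset 0 would see the same purge events again, which is what lets one queue serve several unrelated use cases.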
Databases
CC BY 2.0 RageZ
MariaDB*
✺ Database clusters are divided into
sections
✺ Sections have masters and
replicas*
✺ MediaWiki reads from replicas
and writes to master
✺ Clusters:
✳ Wikitext (compressed)
✳ Metadata
✳ Parsercache
MariaDB: fork of MySQL, migrated from
MySQL in 2013*
Have a go at
https://quarry.wmflabs.org
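The read/write split described above can be sketched roughly as follows. This is an illustrative model, not MediaWiki's load balancer: the `Section` class, the host names and the rule for classifying a statement as a read are all assumptions.

```python
import random


class Section:
    """Hypothetical model of one database section: one master, several replicas."""

    READ_VERBS = {"SELECT", "SHOW"}

    def __init__(self, master, replicas):
        self.master = master
        self.replicas = list(replicas)

    def host_for(self, sql):
        """Route reads to a randomly chosen replica and everything else to the master."""
        verb = sql.lstrip().split(None, 1)[0].upper()
        if verb in self.READ_VERBS and self.replicas:
            return random.choice(self.replicas)
        return self.master


# Hypothetical host names.
section = Section("db-master.example", ["db-replica1.example", "db-replica2.example"])
read_host = section.host_for("SELECT page_title FROM page LIMIT 1")
write_host = section.host_for("UPDATE page SET page_touched = NOW() WHERE page_id = 1")
```

A real setup also has to account for replication lag (a read right after a write may not see it on a replica), which this sketch ignores.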
* it’s complicated
MariaDB
✺ Online schema migrations*
✺ Cross DC replication
✺ TLS across all DBs
✺ Snapshots and local dumps for
backups
✺ ~570 TB total data
✺ ~150 DB servers
✺ ~350k queries per second (qps)
✺ ~70 TB of RAM
* it’s complicated
Elasticsearch
You guessed it right, we use it for search.
That box at the top right.
It is run by a team surprisingly called
Search Platform!
Storage
CC BY-NC 2.0 Gail Thomas
Swift
✺ All our media are stored on Swift
✺ It has frontends
… and backends
✺ 1 billion objects
✺ ~390 TB of media!
OpenStack Object Storage: a scalable
storage system that stores and retrieves data
via HTTP
Traffic
Public Domain
Network
✺ We have our own content delivery
network
✺ We direct traffic to a location on
demand (via GeoDNS)
✳ Pooling/Depooling DCs
✳ 10 min TTL
✺ LVS as a Layer 3/4 Linux
loadbalancer*
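The pooling/depooling idea can be sketched as a lookup over per-region preference lists. This is only an illustration of the concept, not gdnsd's configuration or behaviour: the region names, site names and preference order are hypothetical, and only the 10 minute TTL comes from the slide.

```python
# Hypothetical regions, sites and preference order.
PREFERENCES = {
    "europe": ["pop-eu", "dc-east", "dc-west"],
    "asia": ["pop-asia", "dc-west", "dc-east"],
    "americas": ["dc-east", "dc-west", "pop-eu"],
}
TTL_SECONDS = 600  # the 10 min TTL mentioned above


def resolve(region, pooled):
    """Return (site, ttl) for the first pooled site in the region's preference list."""
    for site in PREFERENCES[region]:
        if site in pooled:
            return site, TTL_SECONDS
    raise LookupError("no pooled site available")


pooled = {"pop-eu", "pop-asia", "dc-east", "dc-west"}
before = resolve("europe", pooled)
pooled.discard("pop-eu")  # depool the European site
after = resolve("europe", pooled)
```

Since resolvers may cache an answer for the full TTL, traffic drains from a depooled site gradually over roughly that window rather than instantly.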
gdnsd: the GeoDNS server, written and maintained
by one of us
peering: interconnection with other
networks on the internet
Linux Virtual Server (LVS): an advanced L3/L4
load balancing solution for Linux; supports
consistent hashing
pybal: LVS manager, developed in-house
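For readers unfamiliar with consistent hashing, the following is a generic hash-ring sketch. It is not the algorithm LVS implements and not how pybal configures it; the backend names, client addresses and the number of virtual nodes are assumptions chosen for illustration.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a stable 64-bit integer."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Ring-based consistent hashing with virtual nodes per backend."""

    def __init__(self, backends, vnodes=100):
        self._ring = sorted(
            (_hash(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def backend_for(self, key):
        """Return the backend owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical backends and client addresses (documentation IP range).
clients = [f"198.51.100.{i}" for i in range(200)]
ring3 = HashRing(["cache1", "cache2", "cache3"])
ring2 = HashRing(["cache1", "cache2"])  # cache3 removed
moved = sum(ring3.backend_for(c) != ring2.backend_for(c) for c in clients)
```

The property that matters for a cache tier is that removing `cache3` only remaps the clients that were on `cache3`; everyone else keeps hitting the same backend, so their cached objects stay warm.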
* it’s complicated
LVS-DR
CDN
Nginx-: Highly performant HTTP
webserver/proxy with excellent TLS
support
Varnish: Reverse HTTP caching proxy
✺ Nginx- for TLS termination
✺ Varnish frontend
✳ in memory
✺ Varnish backend
✳ local stores
✺ Varnish text
✳ HTML, CSS, JS etc
✺ Varnish upload
✳ media, media, media
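The frontend/backend layering can be pictured as a two-tier lookup: check memory first, then the local store, then the application. The sketch below is a toy model under that assumption, not Varnish's behaviour or configuration; the class name, the dictionaries standing in for the two tiers and the `origin` callable are hypothetical, and TTLs and eviction are left out.

```python
class TwoTierCache:
    """Toy frontend (memory) + backend (local store) cache in front of an origin."""

    def __init__(self, origin):
        self.frontend = {}  # stands in for the in-memory layer
        self.backend = {}   # stands in for the local-store layer
        self.origin = origin

    def get(self, url):
        """Return (body, served_by), filling the caches on the way back."""
        if url in self.frontend:
            return self.frontend[url], "frontend"
        if url in self.backend:
            body = self.backend[url]
            self.frontend[url] = body
            return body, "backend"
        body = self.origin(url)
        self.backend[url] = body
        self.frontend[url] = body
        return body, "origin"

    def purge(self, url):
        """Drop a URL from both tiers, e.g. after an edit."""
        self.frontend.pop(url, None)
        self.backend.pop(url, None)


cache = TwoTierCache(origin=lambda url: "rendered:" + url)
first = cache.get("/wiki/Dublin")   # miss in both tiers, fetched from origin
second = cache.get("/wiki/Dublin")  # now served from the frontend
```

The `purge` method corresponds to the "edge caches purging" events mentioned in the message queueing section: after an edit, the stale copy has to leave both tiers.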
CDN (coming soon)
Apache Traffic Server: Reverse and
forward proxy with excellent caching
support
ACME-chief: handles the whole process of
issuing and renewing Let’s Encrypt
certificates (dns-01)
✺ ATS TLS
✳ in memory
✺ ATS backend
✳ local store (SSDs)
✺ ATS text
✳ HTML, CSS, JS etc
✺ ATS upload
✳ media, media, media
✺ ACME-chief
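As background on the dns-01 challenge type mentioned above: the ACME protocol (RFC 8555) has the client prove control of a domain by publishing a TXT record whose value is derived from a server-issued token and the account key's thumbprint. The sketch below shows only that derivation; it is not ACME-chief's code, and the token and thumbprint values are made-up placeholders.

```python
import base64
import hashlib


def dns01_txt_value(token, account_key_thumbprint):
    """Compute the TXT record value for an ACME dns-01 challenge (RFC 8555, section 8.4)."""
    # Key authorization = token, a dot, and the base64url JWK thumbprint.
    key_authorization = f"{token}.{account_key_thumbprint}"
    digest = hashlib.sha256(key_authorization.encode("ascii")).digest()
    # base64url without padding, as the protocol requires.
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def dns01_record_name(domain):
    """Name of the TXT record the validation server will query."""
    return f"_acme-challenge.{domain}"


# Placeholder inputs; real values come from the ACME server and the account key.
name = dns01_record_name("en.wikipedia.org")
value = dns01_txt_value("example-token", "example-thumbprint")
```

Because validation happens over DNS rather than HTTP, a central service can obtain certificates without any of the edge servers having to answer the challenge themselves.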
what happens when you type
en.wikipedia.org
CC BY 3.0 WikiReader
Read (cached)
Read (uncached)
Edit - Media Upload
Managing to Manage
GETTY IMAGES
✺ Infrastructure as code
✺ Configuration management
✺ Kubernetes
✺ Testing/CI/CD
✺ Orchestration tooling
Puppet: configuration management
system for servers/services
... ~50k lines of Puppet code
... ~100k lines of Ruby/ERB
Cumin: in-house automation and
orchestration tool
In a Nutshell
CC BY 2.0 Peter Trimming
Want to sell encyclopedias?
https://jobs.wikimedia.org
https://grafana.wikimedia.org/
https://github.com/wikimedia/operations-puppet
https://phabricator.wikimedia.org/
https://wikitech.wikimedia.org/
SREcon19 Dublin
@kosiaris • @manjiki
