Combining Ceph with Kubernetes
19 April 2018
John Spray
Principal Software Engineer, Office of the CTO
<john.spray@redhat.com>
2
About me
commit ac30e6cee2b2d3815438f1a392a951d511bddfd4
Author: John Spray <john.spray@redhat.com>
Date: Thu Jun 30 14:05:02 2016 +0100
mgr: create ceph-mgr service
Signed-off-by: John Spray <john.spray@redhat.com>
3
Ceph operations today
● RPM packages (all daemons on server same version)
● Physical services configured by an external orchestrator:
● Ansible, Salt, etc.
● Logical entities configured via Ceph itself (pools,
filesystems, auth):
● CLI, mgr module interface, restful module
● Separate workflow from the physical deployment
● Plus some external monitoring to make sure your services
stay up
4
Pain points
● All those elements combine to create a high surface area
between users and the software.
● Lots of human decision making, opportunities for mistakes
● In practice, deployments are often kept relatively static after the
initial decision making is done.
Can new container environments enable something better?
5
Glorious Container Future
● Unicorns for everyone!
● Ice cream for breakfast!
● Every Ceph cluster comes with a free Pony!
● Sunny and warm every day!
6
The real container future
● Kubernetes is a tool that implements the basic operations that
we need for the management of cluster services
● Deploy builds (in container format)
● Detect devices, start container in specific location (OSD)
● Schedule/place groups of services (MDS, RGW)
● If we were writing a Ceph management server/agent today, it
would look much like Kubernetes: so let’s just use Kubernetes!
Kubernetes gives us the primitives, we still have to do the
business logic and UI
7
Why Kubernetes?
● Widely adopted (Red Hat OpenShift, Google Compute
Engine, Amazon EKS, etc.)
● CLI/REST driven (extensible API)
● Lightweight design
Rook
9
Rook
● Simplified, container-native way of consuming Ceph
● Built for Kubernetes, extending the Kubernetes API
● CNCF inception project
http://rook.io/
http://github.com/rook/
10
Rook components
● Image: Ceph and Rook binaries in one artifact
● ‘agent’ handles mounting volumes
● Hide complexity of client version, kernel version variations
● ‘operator’ watches objects in etcd, manipulates Ceph in
response
● Create a “Filesystem” object, the Rook operator does the corresponding
“ceph fs new” (the watch-and-reconcile pattern is sketched below)
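For illustration only: Rook’s real operator is written in Go, but the watch-and-reconcile idea can be sketched in a few lines of Python using the kubernetes client. The CRD group/version/plural and pool names below are assumptions for the sketch, not taken from a specific Rook release.

# Sketch of the operator pattern: watch Filesystem objects, react by
# configuring Ceph itself. Not Rook's actual code (which is Go).
import subprocess
from kubernetes import client, config, watch

config.load_incluster_config()              # running inside the cluster
api = client.CustomObjectsApi()

stream = watch.Watch().stream(
    api.list_namespaced_custom_object,
    group="ceph.rook.io", version="v1",     # assumed CRD coordinates
    namespace="rook", plural="filesystems")

for event in stream:
    fs = event["object"]
    name = fs["metadata"]["name"]
    if event["type"] == "ADDED":
        # Assumes the metadata/data pools already exist.
        subprocess.check_call(
            ["ceph", "fs", "new", name,
             name + "-metadata", name + "-data"])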
11
Rook example
$ kubectl create -f rook-cluster.yaml
$ kubectl -n rook get pod
NAME READY STATUS
rook-api-1511082791-7qs0m 1/1 Running
rook-ceph-mgr0-1279756402-wc4vt 1/1 Running
rook-ceph-mon0-jflt5 1/1 Running
rook-ceph-mon1-wkc8p 1/1 Running
rook-ceph-mon2-p31dj 1/1 Running
rook-ceph-osd-0h6nb 1/1 Running
12
Rook user interface
● Rook objects are created via the extensible Kubernetes API
service (Custom Resource Definitions)
● aka: kubectl + yaml files
● This style is consistent with the Kubernetes ecosystem, but
could benefit from a friendlier layer on top
● “point and click” is desirable for many users (& vendors)
● declarative configuration is not always a good fit for storage: deleting a pool
should require a confirmation button!
Combining Rook with ceph-mgr
14
“Just give me the storage”
● Rook’s simplified model is suitable for people who do not want to
pay any attention to how Ceph is configured: they just want to
see a volume attached to their container.
● However: people buying hardware (or paying for cloud) often
care a lot about how the storage cluster is configured.
● Lifecycle: start out not caring about details, but care more and
more as time goes on, eventually want to get into the details and
optimize use of resources.
15
What is ceph-mgr?
● Component of RADOS: a sibling of the mon and OSD
daemons. C++ code using same auth/networking stack.
● Mandatory component: includes key functionality
● Host for Python modules that do monitoring/management
● Relatively simple in itself: the fun parts are the Python
modules (a minimal module skeleton is sketched below).
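To give a feel for the module interface, a minimal ceph-mgr module is just a Python class. This is a hedged skeleton based on the Luminous/Mimic-era MgrModule API; exact details may differ between releases.

# Lives under pybind/mgr/<name>/module.py and is turned on with
# `ceph mgr module enable <name>`.
import threading
from mgr_module import MgrModule

class Module(MgrModule):
    """Toy module: periodically log how many OSDs the cluster has."""

    def __init__(self, *args, **kwargs):
        super(Module, self).__init__(*args, **kwargs)
        self._shutdown = threading.Event()

    def serve(self):
        # serve() is the module's long-running thread.
        while not self._shutdown.is_set():
            osd_map = self.get("osd_map")   # cluster state exposed by the mgr
            self.log.info("cluster has %d OSDs", len(osd_map["osds"]))
            self._shutdown.wait(60)

    def shutdown(self):
        self._shutdown.set()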
16
dashboard module
● Mimic (13.2.x) release includes an extended management
web UI based on openATTIC
● Would like Kubernetes integration, so that we can create
containers from the dashboard too:
● The “Create Filesystem” button starts MDS cluster
● A “Create OSD” button that starts OSDs
→ Call out to Rook from ceph-mgr
(and to other orchestrators too)
17
Why not build Rook-like functionality into mgr?
1. Upgrades! An out-of-Ceph component that knows how to
orchestrate a Ceph upgrade, while other Ceph services may be
offline (aka “who manages the manager?”)
2. Commonality between simplifed pure-Rook systems and
fully-featured containerized Ceph clusters.
3. Retain Rook’s client mounting/volume capabilities: we are
publishing info about the Ceph cluster into K8s so that Rook can
take care of the volume management.
18
How can we re-use the Rook operator?
How can we share Rook’s code for running containers, without
limiting ourselves to their Ceph feature subset?
→ Modify Rook to make the non-container parts of CRD
objects optional (e.g. pools on a Filesystem)
→ ceph-mgr creates a cut-down Filesystem object to get
MDS containers created (sketched below)
→ migration path from pure-Rook systems to general
purpose Ceph clusters
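A hedged sketch of that “cut-down Filesystem object” step as ceph-mgr might perform it via the Python kubernetes client. The apiVersion and spec fields are illustrative assumptions; the real Rook schema differs in detail.

# ceph-mgr asks Rook for MDS containers only: no pool spec, because
# ceph-mgr manages the pools itself.
from kubernetes import client, config

config.load_incluster_config()
api = client.CustomObjectsApi()

filesystem = {
    "apiVersion": "ceph.rook.io/v1",        # assumed group/version
    "kind": "Filesystem",
    "metadata": {"name": "myfs", "namespace": "rook"},
    "spec": {
        # Container part only: how many MDS daemons to run.
        "metadataServer": {"activeCount": 1, "activeStandby": True},
        # Pool definitions deliberately omitted.
    },
}

api.create_namespaced_custom_object(
    group="ceph.rook.io", version="v1", namespace="rook",
    plural="filesystems", body=filesystem)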
19
Two ways to consume containerized Ceph
[Diagram: the Rook user and ceph-mgr paths share the same Rook operator, K8s and Ceph image.
Rook user path: kubectl, limited feature set.
ceph-mgr path: full control, point+click.
Migration (if desired) from the Rook path to the ceph-mgr path.]
20
What doesn’t Kubernetes do for us?
● Installing itself (obviously)
● Configuring the underlying networks
● Bootstrapping Rook
→ External setup tools will continue to have a role in the non-Ceph-
specific tasks
21
Status/tasks
● Getting Rook to consume the upstream Ceph container image,
instead of its own custom-built single-binary image.
● Patching Rook operator to enable doing just the container parts
● Patching Rook to enable injecting config+key to manage an
existing cluster
● Connecting ceph-mgr backend to drive Rook via the K8s API
● Exposing K8s-enabled workflows in the dashboard UI
→ Goal: one click Filesystem creation
(...and one click {everything_else} too)
Other enabling work
23
Background
● Recall: external orchestrators are handling physical deployment
of services, but most logical management is still direct to Ceph
● Or is it? Increasingly, orchestrators mix physically deploying
Ceph services with logical configuration:
● Rook creates volumes as CephFS filesystems, but this means creating
underlying pools. How does it know how to configure them?
● Same for anything deploying RGW
● Rook also exposes some health/monitoring of the Ceph cluster, but is this in
terms a non-Ceph-expert can understand?
● We must continue to make managing Ceph easier and, where
possible, remove the need for intervention.
24
Placement group merging
Experimental for Mimic
● Historically, pg_num could be increased but not decreased
● Sometimes problematic, when e.g. physically shrinking a cluster,
or if bad pg_nums were chosen.
● Bigger problem: prevented automatic pg_num selection,
because mistakes could not be reversed.
● Implementation is not simple, and doing it still has an IO cost,
but the option will be there → now we can autoselect pg_num!
25
Automatic pg_num selection
Experimental for Mimic
● Hard (impossible?) to do perfectly
● Pretty easy to do useful common cases (a toy version is sketched after this list):
● Select initial pg_nums according to expected space use
● Increase pg_nums if actual space use has gone ~2x over ideal PG capacity
● Decrease pg_num for underused pools if another pool needs to increase
its own
● Not an optimiser! But probably going to do the job as well as
most humans are doing it today.
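A toy version of such a heuristic, assuming each pool declares its expected share of cluster capacity and we target roughly 100 PGs per OSD. Numbers and names are illustrative, not the actual mgr implementation.

def next_power_of_two(n):
    p = 1
    while p < n:
        p *= 2
    return p

def initial_pg_num(expected_share, osd_count, replica_size,
                   target_pgs_per_osd=100):
    """Pick a starting pg_num from a pool's expected share of the cluster."""
    pg_budget = osd_count * target_pgs_per_osd / replica_size
    return max(32, next_power_of_two(int(pg_budget * expected_share)))

def should_split(pool_bytes, pg_num, ideal_pg_bytes):
    """Grow pg_num once PGs are ~2x larger than the ideal PG capacity."""
    return (pool_bytes / pg_num) > 2 * ideal_pg_bytes

# 100 OSDs, 3x replication, pool expected to hold 25% of the data:
print(initial_pg_num(0.25, osd_count=100, replica_size=3))   # -> 1024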
26
Automatic pg_num selection (continued)
Experimental for Mimic
● Prompting users for expected capacity makes sense for data
pools, but not for metadata pools:
● Combine data and metadata pool creation into one command
● Wrap pools into new “poolset” structure describing policy
● Auto-construct poolsets for existing deployments, but don’t auto-adjust
unless explicitly enabled
ceph poolset create cephfs my_filesystem 100GB
27
Progress bars
Experimental for Mimic
● Health reporting was improved in Luminous, but in many cases it
is still too low level.
● Especially placement groups:
● hard to distinguish between real problems and normal rebalancing
● Once we start auto-picking pg_num, users won’t know what a PG is until
they see them in the health status
● Introduce a `progress` module to synthesize a high-level view from
PG state: “56% recovered from failure of OSD 123” (toy sketch below)
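The underlying calculation is conceptually simple. This is only a toy sketch of the idea: the real `progress` module tracks recovery events over time rather than a single snapshot of PG counts.

def recovery_progress(pg_states, failed_osd):
    """pg_states: mapping of PG state name -> number of PGs in that state."""
    total = sum(pg_states.values())
    if total == 0:
        return "no placement groups"
    healthy = pg_states.get("active+clean", 0)
    percent = 100 * healthy // total
    return "%d%% recovered from failure of OSD %d" % (percent, failed_osd)

print(recovery_progress(
    {"active+clean": 56, "active+recovering": 30, "active+degraded": 14},
    failed_osd=123))
# -> "56% recovered from failure of OSD 123"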
28
Wrap up
● All these improvements reduce the cognitive load on the ordinary user.
● Do not need to know what an MDS is: ask Rook for a filesystem, and get one.
● Do not need to know what a placement group is
● Do not need to know magic commands: look at the dashboard
● Actions that no longer require human thought can now be tied
into automated workflows: fulfil the promise of software-defined
storage.
Q&A
