Rudder / Experiences
RUDDER FOR EVERYONE?
OSDC 2017
FLORIAN HEIGL
Why I'm here
Dayjob: Freelance Sysadmin-Consultant
I like fixing things and processes
Nightjob: Fix a lot of things. Rant a lot.
Rudder Ambassador
OpenNebula community champion
Why I'm here
Liked bleeding edge, using Ansible since 2011
(10-20-800-100-30 nodes)
Some other tools, before that, too.
Not really happy.
Too many cases of: "If our solution doesn't fit, you've got the wrong problem(*)"
And then I tried Rudder...
(I might have a backup slide on that)
OSDC 2017 | Experiences with Rudder, is it really for everyone? by Florian Heigl
Rudder from where
• Rudder project went public in 2011
• Basic idea: "drift assessment"
• Which parts of my fleet are drifting away?
• How do we best steer all of it back on course?
• This is how you avoid crashes!
• Project started by 3 long-term CM consultants
• Built on real requirements of many people
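To make "drift assessment" concrete, here is a minimal sketch in Python. It is not Rudder code: the function names and the flat key/value model of a node's state are assumptions made for illustration only. It compares a desired state against what each node reports and lists the nodes that have drifted.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: a node's state is a flat dict of setting -> value.
State = Dict[str, str]
Drift = Dict[str, Tuple[str, Optional[str]]]


def find_drift(desired: State, actual: State) -> Drift:
    """Return the settings whose actual value differs from the desired one."""
    drift: Drift = {}
    for key, want in desired.items():
        have = actual.get(key)  # None means the setting is missing entirely
        if have != want:
            drift[key] = (want, have)
    return drift


def drifting_nodes(desired: State, fleet: Dict[str, State]) -> Dict[str, Drift]:
    """Map each drifting node to its (wanted, found) differences."""
    report: Dict[str, Drift] = {}
    for node, state in fleet.items():
        drift = find_drift(desired, state)
        if drift:
            report[node] = drift
    return report
```

Nodes that match the desired state do not appear in the report at all, so an empty result means the whole fleet is on course.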
Rudder to where
Rudder claims to be config management for the masses
How does it fare?
Rudder to where
Is it really easier to use?
For whom is it easier to use?
Rudder to where
What changes if you use it – short term
• Convenience level is extreme since everything is automatic
• Base OS rebuilds get quite reproducible
• Need to think very cross-OS, which helps abstract what you really wanted
• Expect you'll want to rebuild to improve on this
• Track what you're adjusting
Rudder to where
What changes if you use it – medium-long term
• Very hands-off – satisfaction can't come from one-off runs anymore, but from running a tight ship all the time
• CMDB housekeeping – ghost ships are trouble
• Continuously maintained systems get more defensible
Rudder to where
Were there undesired results?
So far, none
Rudder to where
Are there unexpected benefits?
• Naming conventions (tiny but powerful)
• Architectur-e-ing
• THE AGENT
  1. An agent means no lock-out
  2. Things can just fix themselves
Rudder to where
UX
What is easier now?
You don't even need to do most things (dynamic groups)
Having Metrics
Detecting 'weirdness'
Self-Fixing (Not more than glitches in the Matrix)
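The idea behind dynamic groups can be sketched as follows. This is an illustration only: the inventory layout, the fact names, and `group_members` are assumptions, not Rudder's query model. Membership is computed from inventory facts each time, so a new node lands in the right groups without anyone assigning it.

```python
from typing import Dict, List

# Hypothetical inventory: node name -> facts reported by the agent.
Inventory = Dict[str, Dict[str, str]]


def group_members(criteria: Dict[str, str], inventory: Inventory) -> List[str]:
    """Return the nodes whose facts match every criterion of the group."""
    return sorted(
        name
        for name, facts in inventory.items()
        if all(facts.get(key) == value for key, value in criteria.items())
    )
```

Because nothing is stored per node, adding a host to the inventory is enough for it to pick up every rule attached to the groups it matches.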
Rudder to where
What is still hard
• Bending a tool to your will is tricky if you try things you're not (yet) supposed to. Glue is sticky & might not come out right the 1st time
• Auto-acceptance
• What's hard everywhere else: clusters
UX
Some question marks & dreams remain
Policy maintenance over years
(will start JSON-Diffing now)
High-end rollout clockworks
We need to build our Docker support (it's easy)
UX
Who benefits most?
Devs?
Ops?
Managers?
UX
"I didn't imagine it could be this intuitive"
-- junior project manager after about 15 minutes of introduction to Rudder
UX
Having a Web Interface can help
• Visible documentation
• Conformity
• Differing skill levels
• Large teams
• Having a design
• Building bridges
2011-2014
2014-2015
2016-2017
Performance
• "Monitoring should not negatively impact performance" (Oracle, 1986)
• CPU usage?
• Disk thrashing?
• Run times?
Performance
• Gets faster on (almost) each version
• 4.1 is ... fast
1. Good performance -> add features
2. Features -> perf cost
3. Cry about it -> tuning
4. Tuning -> faster than 1.
Performance
• GUI was performing OK up to 1000 nodes
• Many rewrites, much tuning
• 30x faster now
• Smooth, loads 2000 nodes in 10s via Wifi + SSH tunnel :)
Performance
• What if you don't manage 1000s of nodes?
• What if your smallest server type has less than 512G RAM?
• Can you run the server on something normal?
Performance
• Master: 4GB is a good starting point, 8GB is nicer
• Master: JVM + PostgreSQL + LDAP want RAM
• I combine w/ ElasticSearch + Logstash => 16GB RAM
• Don't combine on AWS t2.* instances. Never.
Performance
• Agent: Needs a little disk space, almost no RAM, a bit of CPU (@5min)
• Agent: Syslog traffic is bursty, but can be limited to "relevant" info
• Relay (Hub): a single 2 core / 2GB Xen VM could handle 2000 nodes
• Relay (Hub): Can likely be put on anything down to Avoton level
Cool things: OpenSCAP
• Yes, we got that...
1. Automated OVAL fetch
2. Central Validation (OVAL = downloaded XML processed as root!)
3. Automatic Deployment
4. Autoscheduled, time-spread daily Runs
5. Automatic result collection
6. Results integrated in UI (Rudder plugin)
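Step 4 (time-spread daily runs) can be illustrated with a small sketch. This is an assumption about one common way to spread load, not a description of how Rudder schedules scans: each host derives a stable minute-of-day offset from a hash of its own name, so the fleet does not hit the server at the same moment. The function name and the 24-hour default window are hypothetical.

```python
import hashlib


def splay_minutes(hostname: str, window_minutes: int = 24 * 60) -> int:
    """Return a stable per-host offset (in minutes) inside the window."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    digest = hashlib.sha256(hostname.encode("utf-8")).digest()
    # Use the first 8 bytes as an integer and fold it into the window.
    return int.from_bytes(digest[:8], "big") % window_minutes
```

The offset depends only on the hostname, so a node keeps the same slot across reboots and no central scheduler has to hand out times.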
Cool things: Agent
Just to get that clear...
• Completely AUTONOMOUS
• Owns & decides to run policy
• Works without master/relays
• Will likely keep policy intact forever
• ...till Cthulhu awakes at the end of time
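What "works without master/relays" means in practice can be sketched like this. It is a simplified illustration, not the agent's real logic: `run_agent_once` and its three callables are hypothetical names. The agent tries to fetch fresh policy, and if the server is unreachable it keeps enforcing the last copy it has.

```python
from typing import Any, Callable, Tuple


def run_agent_once(
    fetch_policy: Callable[[], Any],
    cached_policy: Any,
    apply_policy: Callable[[Any], None],
) -> Tuple[str, Any]:
    """Apply fresh policy if reachable, otherwise the cached copy."""
    try:
        policy = fetch_policy()
        source = "server"
    except ConnectionError:
        # Server or relay is down: keep enforcing what we already have.
        policy = cached_policy
        source = "cache"
    apply_policy(policy)
    return source, policy
```

The important design point is that the run never depends on the network: an outage delays updates, but it does not stop enforcement.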
Cool things: A skeleton
• Trivial, but can help everyone
1. Centrally manage /etc/skel
2. Create /home/$user/.ssh
3. touch authorized_keys
4. Separate root skel (.vimrc, .inputrc, ...)
• /etc/skel is non-invasive luxury defaults
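Steps 2 and 3 amount to something like the following sketch. It is plain Python rather than a Rudder technique, and `ensure_ssh_skeleton` is a hypothetical name. The permissions (0700 for the directory, 0600 for the file) are the usual ones sshd expects, and the function is safe to run repeatedly.

```python
import os
from pathlib import Path


def ensure_ssh_skeleton(home: Path) -> Path:
    """Create ~/.ssh and an empty authorized_keys with strict permissions."""
    ssh_dir = home / ".ssh"
    ssh_dir.mkdir(mode=0o700, exist_ok=True)
    os.chmod(ssh_dir, 0o700)  # enforce even if the directory already existed

    keys = ssh_dir / "authorized_keys"
    keys.touch(mode=0o600, exist_ok=True)  # never truncates existing keys
    os.chmod(keys, 0o600)
    return keys
```

Running it a second time changes nothing except correcting permissions that drifted, which is the same converge-on-every-run behaviour the agent provides.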
Cool things: Autopatching
• Started autopatching systems where I'm allowed to
• yum hooks (post-install triggers)
• Used to restart endangered OpenSSL-based services
• Need some yum excludes
• Just avoid half-assed desktop things like firewalld
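One way to find the "endangered" services after an OpenSSL update is sketched below. This is an assumption about how such a post-install hook could work, not the hook from the talk: on Linux, a process that still maps a replaced library shows it in /proc/<pid>/maps with a "(deleted)" suffix. Both function names are hypothetical.

```python
import re
from pathlib import Path
from typing import List

# Matches a mapped libssl/libcrypto shared object that was replaced on disk.
_STALE_LIB = re.compile(r"/\S*lib(?:ssl|crypto)[^/\s]*\.so\S*\s+\(deleted\)\s*$")


def uses_stale_openssl(maps_text: str) -> bool:
    """True if the maps listing references a deleted OpenSSL library."""
    return any(_STALE_LIB.search(line) for line in maps_text.splitlines())


def pids_needing_restart(proc_root: str = "/proc") -> List[int]:
    """Scan a /proc-like tree and return PIDs still using the old library."""
    pids = []
    for entry in Path(proc_root).glob("[0-9]*"):
        if not entry.name.isdigit():
            continue
        try:
            text = (entry / "maps").read_text()
        except OSError:
            continue  # process exited or we lack permission
        if uses_stale_openssl(text):
            pids.append(int(entry.name))
    return sorted(pids)
```

Mapping the PIDs back to service names and restarting them is left out on purpose, since that part depends on the init system in use.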
Cool things: Monitoring
• Systems are clean enough to alert
1. Automated agent config incl. SSH keys
2. Automated Lynis (baseline security scanner) rollout
3. Automated daily security scoring
4. Scores reported to Nagios & alerted
5. Rudder compliance also in Nagios
6. Missing OS patches also in Nagios
7. Put in Service Group/BI Rule "Compliance"
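Reporting a score to Nagios (steps 4 and 5) follows the standard plugin convention: exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN, plus one line of output with optional performance data after a pipe. The sketch below assumes a compliance percentage is already available; the thresholds and the `check_compliance` name are hypothetical and not taken from the talk.

```python
from typing import Optional, Tuple

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_compliance(
    percent: Optional[float], warn: float = 95.0, crit: float = 80.0
) -> Tuple[int, str]:
    """Turn a compliance percentage into a Nagios exit code and status line."""
    if percent is None:
        return UNKNOWN, "UNKNOWN - no compliance data"
    if percent < crit:
        code, label = CRITICAL, "CRITICAL"
    elif percent < warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # "95:" is the Nagios range syntax for "alert when below 95".
    perfdata = f"compliance={percent:.1f}%;{warn:g}:;{crit:g}:;0;100"
    return code, f"{label} - compliance {percent:.1f}% | {perfdata}"
```

A wrapper script would print the message and pass the code to `sys.exit`, which is all Nagios needs to raise or clear the alert.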
Cool things: Application setup
• Yes, you can do that...
1. Trigger via Node Properties (can be from CMDB, AWS Tags, ...)
2. Set up application stack
3. Initialize "safe" applications (ES, Redis, ...)
4. Don't initialize "unsafe" applications (PostgreSQL)
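The decision in steps 1 to 4 can be sketched as a small function. The property key `application`, the set of safe applications, and `plan_application` are assumptions for illustration; in Rudder the properties would come from the node itself or an external source such as a CMDB.

```python
from typing import Dict

# Hypothetical allow-list: apps that can be initialized without risking data.
SAFE_TO_INITIALIZE = {"elasticsearch", "redis"}


def plan_application(properties: Dict[str, str]) -> Dict[str, object]:
    """Decide what to install and whether to initialize it automatically."""
    app = properties.get("application")
    if app is None:
        return {"application": None, "install": False, "initialize": False}
    return {
        "application": app,
        "install": True,
        # Stateful apps like PostgreSQL are installed but left for a human.
        "initialize": app in SAFE_TO_INITIALIZE,
    }
```

Keeping initialization behind an explicit allow-list means an unknown application is installed but never bootstrapped by accident.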
Cool things: Audit mode
• Fleet Control killer feature
1. Decide: Enforce or Report Compliance Deltas
   a. Per Node
   b. Per Setting
   c. Per Rule
2. Query via API
3. Think, Plan, Conquer
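The enforce-or-report decision can be sketched as below. The precedence used here (audit wins as soon as any configured level asks for it) is a deliberately cautious simplification and an assumption, not Rudder's documented override rules; the function names are hypothetical.

```python
from typing import Optional

AUDIT, ENFORCE = "audit", "enforce"


def effective_mode(
    global_mode: str,
    node_mode: Optional[str] = None,
    rule_mode: Optional[str] = None,
    setting_mode: Optional[str] = None,
) -> str:
    """Combine the configured levels; any 'audit' makes the result audit."""
    levels = [m for m in (global_mode, node_mode, rule_mode, setting_mode) if m is not None]
    for mode in levels:
        if mode not in (AUDIT, ENFORCE):
            raise ValueError(f"unknown policy mode: {mode!r}")
    return AUDIT if AUDIT in levels else ENFORCE


def handle_delta(mode: str, setting: str, want: str, have: str) -> str:
    """Describe what happens to one compliance delta in the given mode."""
    if mode == AUDIT:
        return f"report: {setting} is {have!r}, expected {want!r}"
    return f"fix: set {setting} to {want!r}"
```

In audit mode nothing on the node changes, which is what makes it possible to measure a fleet first and plan the enforcement afterwards.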
Cool things: Relay API
• Instant Policy runs anywhere
1. Safe: Relays can only trigger the run
2. Fast
3. Scalable
Cool things: sharefile
• Instant file copies everywhere
1. N:N copy between nodes
2. Centrally managed
3. Quite fast: can drop a JRE on 60 nodes in 5 minutes
4. Might not be the recommended use case :)
5. Effect?
Cool things: Ansible inventory
• Let's make a faster Ansible!
1. Use Rudder's automagic groups, avoid gathers & complex grouping
2. Use Ansible for deployment of unsafe applications
3. One-shot character
4. But build Rules so Rudder can fix
• Also plugins for: Rundeck, Cobbler, Centreon & some more?
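The inventory idea can be sketched as a conversion from group membership to the JSON that an Ansible dynamic inventory script prints for `--list`. How the groups are obtained from Rudder is left out; the input dict and `to_ansible_inventory` are assumptions for illustration. The `_meta.hostvars` key is part of Ansible's inventory-script format and saves it from calling the script once per host.

```python
import json
from typing import Dict, Iterable, Optional


def to_ansible_inventory(
    groups: Dict[str, Iterable[str]],
    hostvars: Optional[Dict[str, dict]] = None,
) -> dict:
    """Build the structure an inventory script returns for --list."""
    inventory: dict = {
        name: {"hosts": sorted(hosts)} for name, hosts in groups.items()
    }
    inventory["_meta"] = {"hostvars": hostvars or {}}
    return inventory


if __name__ == "__main__":
    # Hypothetical group data; a real script would query Rudder here.
    example = {"centos_web": ["web1", "web3"], "databases": ["db1"]}
    print(json.dumps(to_ansible_inventory(example), indent=2))
```

Since the groups are already computed centrally, the playbook can target them directly and skip fact gathering for grouping purposes.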
Cool things: ARM Agent
• Very fresh, but not raw! Debian/Ubuntu
• Tested:
?!!!!
ARMHF AARCH64 Thunder X2
Roadmap
• Right now development is too fast to follow (for me)
• Both minors and majors can introduce shiny things
• Majors: API changes, heavy-lifting features
Closing
This was my experience, I am happy with Rudder
• Pretty stable
• Darn fast
• Always there to save me
You could
• Check out www.rudder-project.org
• Test it and give feedback
• Vagrant Box: rudder-vagrant @ GitHub
Why things need to get better
Dev Cycle
