rudder.io
What uses for observing operations of
Configuration Management?
Nicolas CHARLES
nicolas@rudder.io - @nico_charles 1
Are we really looking at logs?
2
I’m sure everyone here does, but...
Does no error and no change in the logs mean success?
3
Aren’t we missing something?
Getting and understanding the info is complex
4
Operators, Managers, Experts, APIs have different needs
Frustration if we need a third party to get data
We mistrust what we don’t understand
Getting and understanding the info is complex
Putting errors into perspective
Errors can be expected
Errors in production can have catastrophic consequences
Errors in a Vagrant VM are much less critical
Getting and understanding the info is complex
Strong reliance on Expert(s)
SPOF
Fatigue
Knowing the exact infrastructure state
Monitoring vs. observability
Observability adoption
Databases
Built-in facilities
Tooling ecosystem to extract knowledge
Observability adoption
Software
Legacy: embedded agents (often proprietary solutions)
New developments:
Best practices
Open standards
Architectural bricks
These concepts are core to Rudder
Everyone and everything can be an actor in configuration management
These concepts are core to Rudder
Technique
A set of operations & configurations to reach a state
With variables for configuration
Created by experts
These concepts are core to Rudder
Directive
Technique + Parameters
Defines how services must be managed
Driven by business needs, managed by admins or APIs
These concepts are core to Rudder
Rule
The application of Directive(s) to Group(s)
Defines the targets of the Directive(s)
Higher-level view of services, managed by admins or APIs
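The three concepts compose: a Directive binds parameter values to a Technique, and a Rule applies Directives to Groups. A minimal Python sketch of that composition follows. The class names, field names and helper methods are illustrative assumptions, not Rudder's actual data model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Technique:
    """Operations and configurations to reach a state, with variables."""
    name: str
    variables: list[str]


@dataclass
class Directive:
    """A Technique plus the parameter values that instantiate it."""
    name: str
    technique: Technique
    parameters: dict[str, str]

    def missing_parameters(self) -> list[str]:
        # Variables declared by the Technique that have no value yet.
        return [v for v in self.technique.variables if v not in self.parameters]


@dataclass
class Rule:
    """The application of Directives to Groups (groups are plain names here)."""
    name: str
    directives: list[Directive]
    groups: list[str]

    def targets(self) -> list[tuple[str, str]]:
        # Every (directive, group) pair the Rule covers.
        return [(d.name, g) for d in self.directives for g in self.groups]


ssh = Technique("ssh_hardening", ["port", "permit_root_login"])
baseline = Directive("SSH baseline", ssh, {"port": "22", "permit_root_login": "no"})
rule = Rule("Security baseline", [baseline], ["all-linux-nodes", "dmz"])
```

Keeping the three levels separate is what lets experts, admins and APIs each work on their own layer.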
Each can focus on what is relevant
15
Operators
Security Experts
Each can focus on what is relevant
16
Managers
APIs
"rules": [
  {
    "id": "32377fd7-02fd-43d0-aab7-28460a91347b",
    "name": "Security rules - baseline",
    "compliance": 100,
    "mode": "full-compliance",
    "complianceDetails": {
      "successAlreadyOK": 87.47,
      "successNotApplicable": 12.53
    },
    "directives": [
      {
        "id": "c16e3a90-b9d7-427d-83c1-d80e33124e4c",
        "name": "CIS Benchmark 2.1.6 - rsh",
        "compliance": 100.0,
        "complianceDetails": {
          "successAlreadyOK": 100.00
        }
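As an illustration of how an API consumer might use such a payload, here is a small Python sketch. It assumes the excerpt above is wrapped in an outer object and that its brackets are closed as shown; the slide truncates the payload, so the closing part is a guess made only so the sample parses. The function name `below_threshold` is hypothetical.

```python
import json

# Sample payload shaped like the slide excerpt. The closing brackets are an
# assumption: the slide cuts the response off after the first directive.
payload = json.loads("""
{
  "rules": [
    {
      "id": "32377fd7-02fd-43d0-aab7-28460a91347b",
      "name": "Security rules - baseline",
      "compliance": 100,
      "mode": "full-compliance",
      "complianceDetails": {"successAlreadyOK": 87.47, "successNotApplicable": 12.53},
      "directives": [
        {
          "id": "c16e3a90-b9d7-427d-83c1-d80e33124e4c",
          "name": "CIS Benchmark 2.1.6 - rsh",
          "compliance": 100.0,
          "complianceDetails": {"successAlreadyOK": 100.00}
        }
      ]
    }
  ]
}
""")


def below_threshold(rules, threshold=100.0):
    """Return (rule name, directive name, compliance) for directives under threshold."""
    return [
        (rule["name"], directive["name"], directive["compliance"])
        for rule in rules
        for directive in rule.get("directives", [])
        if directive["compliance"] < threshold
    ]


for rule in payload["rules"]:
    print(rule["name"], rule["compliance"], rule["complianceDetails"])
print(below_threshold(payload["rules"]))
```

With the sample data, every directive is fully compliant, so the list of directives below the threshold is empty.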
What is this compliance?
(Diagram, built up over slides 17 to 21)
PARAM
RULE
● Id
● Groups + Directives
DIRECTIVE
● Id
● Components
GROUP
● Id
RUDDER config (global)
● Policy Mode
● Schedule
NODE
● Properties
● Policy Mode
● Schedule
Environmental context
Node configuration
● Id : . . .
● Generated : . . .
Files
Change request
Historization
Event logs
What is this compliance?
(Diagram, built up over slides 22 to 26)
Node configuration
● Id : . . .
● Generated : . . .
Files
Get Policy
Store expected reports
Expected reports
● node id
● config id
● timestamp
● end of validity
Config
● Id
● For Rule R, Directive D1, Component C
Metadata
● Integrity
● Signature
RUN
● Reports
● ...
METADATA
● node id
● config id
● run timestamp
● Signature
Send configuration reports
Run reports
Compliance
Historization
Compliance historized
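The comparison between expected reports and run reports can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout, the `missing` and `unexpectedConfigId` labels, and the choice to score a run with the wrong config id as 0% are all hypothetical, not Rudder's implementation. The `success...` status names follow the JSON excerpt shown earlier in the deck.

```python
from collections import Counter


def compliance(expected, run):
    """Compare one run against the expected reports stored for that node.

    expected: {"config_id": str, "components": set of (rule, directive, component)}
    run:      {"config_id": str, "reports": {(rule, directive, component): status}}
    Returns (percentage of successful components, count per status).
    """
    if run["config_id"] != expected["config_id"]:
        # Assumption: a run made with another configuration cannot be matched.
        return 0.0, {"unexpectedConfigId": len(expected["components"])}
    # An expected component with no report in the run counts as "missing".
    statuses = Counter(
        run["reports"].get(key, "missing") for key in expected["components"]
    )
    total = sum(statuses.values())
    ok = sum(n for status, n in statuses.items() if status.startswith("success"))
    percent = round(100.0 * ok / total, 2) if total else 0.0
    return percent, dict(statuses)


expected = {
    "config_id": "cfg-42",
    "components": {("R", "D1", "C"), ("R", "D1", "C2"), ("R", "D2", "C"), ("R", "D2", "C2")},
}
run = {
    "config_id": "cfg-42",
    "reports": {
        ("R", "D1", "C"): "successAlreadyOK",
        ("R", "D1", "C2"): "successNotApplicable",
        ("R", "D2", "C"): "error",
    },
}
print(compliance(expected, run))
```

Because the expected reports are stored per node and per config id, the same comparison can be replayed later on historized data.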
Make information available
27
A lot of information from inside Rudder, usable in Rudder context
Details of each run (timestamped info)
Policy generation details
Serialization of configurations
Inventories
...
Causality and dependencies of events
28
Why would we need it?
● We have logs
● We have experts
Causality and dependencies of events
30
Diagnosing infrastructure issues is hard
● Many systems
● Dependencies across systems
● Many actors involved
An issue on one component can impact hundreds of systems
We need to separate the causes from the symptoms
Causality and dependencies of events
31
Monitoring can only correlate
Causes and precedence help root-cause analysis
Causality and dependencies of events
32
How can we do that?!
Event sourcing & Tracing
33
Events happen on the whole infrastructure
Describe and analyze across systems
Order events
Contextualize
Event sourcing & Tracing
34
Terminology (Dapper & OpenTracing)
Trace: Description of a “transaction” as it moves through systems
Span: Named and timed operation, piece of workflow (+ tags and logs)
Span context: Trace information that accompanies the transaction
Event sourcing & Tracing
35
What’s in a span?
Operation name
Start & end timestamps
Tags: Set of key:value
Logs: Set of key:value
SpanContext
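The fields listed above map naturally onto a small data structure. The following Python sketch is an illustration only; the class layout and the `start_span` helper are assumptions and do not reproduce the API of any OpenTracing client library.

```python
from __future__ import annotations

import time
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpanContext:
    """Trace information that accompanies the transaction."""
    trace_id: str
    span_id: str


@dataclass
class Span:
    operation_name: str
    context: SpanContext
    parent_id: str | None = None
    start: float = field(default_factory=time.time)
    end: float | None = None
    tags: dict[str, str] = field(default_factory=dict)
    logs: list[tuple[float, dict[str, str]]] = field(default_factory=list)

    def log(self, **fields: str) -> None:
        # Logs are key:value sets attached to a point in time within the span.
        self.logs.append((time.time(), fields))

    def finish(self) -> None:
        self.end = time.time()


def start_span(operation_name: str, parent: Span | None = None, **tags: str) -> Span:
    """Start a root span, or a child span sharing its parent's trace id."""
    trace_id = parent.context.trace_id if parent else uuid.uuid4().hex
    context = SpanContext(trace_id, uuid.uuid4().hex)
    parent_id = parent.context.span_id if parent else None
    return Span(operation_name, context, parent_id, tags=dict(tags))


root = start_span("agent run", node_id="node-1")
child = start_span("check sshd config", parent=root)
child.log(event="file compared")
child.finish()
root.finish()
```

The shared trace id is what later allows spans recorded on different systems to be reassembled into one trace.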
Event sourcing & Tracing
36
Temporal relationships between Spans in a single Trace
https://www.jaegertracing.io/docs/1.9/architecture/
Event sourcing & Tracing
37
What would be the traces?
Defining the infrastructure state is a trace
Each change before validation is a span
Validation through a change request closes the trace
Computing the node configurations is a trace
Computing targets, overrides and generating files are spans
The trace closes with the serialization of the node configurations in the database
Each run on a node is a trace
Each configuration check is a span
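To make the last mapping concrete (a run is a trace, each configuration check is a span), here is a hedged Python sketch. The `Tracer` class, the `agent_run` function and the check names are hypothetical; a real deployment would more likely rely on an existing tracing client.

```python
import time
import uuid
from contextlib import contextmanager


class Tracer:
    """Toy in-memory tracer: nested `span` blocks form one trace."""

    def __init__(self):
        self.finished = []   # completed spans, in order of completion
        self._stack = []     # currently open spans, innermost last

    @contextmanager
    def span(self, name, **tags):
        parent = self._stack[-1] if self._stack else None
        record = {
            "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
            "span_id": uuid.uuid4().hex,
            "parent_id": parent["span_id"] if parent else None,
            "name": name,
            "tags": tags,
            "start": time.time(),
        }
        self._stack.append(record)
        try:
            yield record
        finally:
            record["end"] = time.time()
            self._stack.pop()
            self.finished.append(record)


def agent_run(tracer, node_id, config_id, checks):
    """One run is one trace; each configuration check is a child span."""
    with tracer.span("agent run", node_id=node_id, config_id=config_id):
        for name, check in checks:
            with tracer.span(name) as span:
                span["tags"]["status"] = check()


tracer = Tracer()
agent_run(
    tracer,
    "node-1",
    "cfg-42",
    [("check sshd config", lambda: "successAlreadyOK"),
     ("check ntp service", lambda: "error")],
)
for record in tracer.finished:
    print(record["name"], record["tags"])
```

Child spans finish before their parent, so the run's own span is the last one recorded.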
Event sourcing & Tracing
38
(Diagram: the configuration workflow annotated with traces and spans)
RULE
● Id
DIRECTIVE
● Id
GROUP
● Id
Environmental context
Node configuration
● Id : . . .
● Generated : . . .
● Commit id
Files
Change request
Historization
Get config
Store expected reports
Expected reports (node id, config id, timestamp)
Config
● For Rule R, Directive D1, Component C
Metadata
● Integrity
● CommitId
● Signature
RUN
● Reports
● ...
METADATA
● node id
● config id
● run timestamp
● Signature
Send configuration reports
Run reports
Events
Commit Id
Defining state: Trace + Spans
Trace
Run: Trace
Each step: span
Message bus
Event sourcing & Tracing
39
(Diagram: node configuration, run reports and traces on a message bus)
Node configuration
● Id : . . .
● Generated : . . .
● Commit id
Files
Get config
Store expected reports
Expected reports (node id, config id, timestamp)
Config
● For Rule R, Directive D1, Component C
Metadata
● Integrity
● CommitId
● Signature
RUN
METADATA
● node id
● config id
● run timestamp
● Signature
Send configuration reports
Run reports
Trace
Run: Trace
Each step: span
Message bus
Compliance
CMDB Hooks
Monitoring
Event sourcing & Tracing
40
Store Traces & Events:
● Integrate with systems in place
● Many tools are compatible with OpenTracing
Correlate with non-observable systems
Closing thoughts
41
With Rudder, information is centralized and made available in a
relevant way for all actors/things
Closing thoughts
42
How can you get more out of your
configuration management?
Closing thoughts
43
What can we do with these billions of events?
Closing thoughts
44
What can we do with these billions of events?
Reactive approach
Query, search and analyze traces in case of problems
Closing thoughts
45
What can we do with these billions of events?
Proactive approach
Process mining: Machine Learning on these events
Detect unusual behaviours
Outliers
Inconsistencies across systems
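As a very small illustration of the proactive approach, the sketch below flags nodes whose run duration deviates strongly from the rest of the fleet. It uses a plain statistical rule (the modified z-score, based on the median absolute deviation) rather than machine learning or process mining. The function name, the sample data and the 3.5 threshold are assumptions chosen for the example.

```python
from statistics import median


def outliers(durations, threshold=3.5):
    """Return node ids whose run duration is an outlier.

    durations: dict mapping node id -> run duration in seconds (non-empty).
    Uses the modified z-score: 0.6745 * |x - median| / MAD.
    """
    values = list(durations.values())
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median: any other value stands out.
        return sorted(node for node, v in durations.items() if v != med)
    return sorted(
        node
        for node, v in durations.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


durations = {
    "node-1": 12.0,
    "node-2": 11.5,
    "node-3": 12.4,
    "node-4": 11.9,
    "node-5": 95.0,
}
print(outliers(durations))
```

The same rule could be applied to any numeric attribute extracted from spans, such as the number of repaired components per run.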
Closing thoughts
46
Mark Burgess
Founder of Configuration Management
http://markburgess.org/anomalies.html
rudder.io
Questions?
Nicolas CHARLES
nicolas@rudder.io - @nico_charles 47
Security?
48
Events, traces and logs hold critical data
Within a single system, security can be built-in
AuthN/AuthZ
For distributed systems, it’s much harder
Who can see what?
Who defines and enforces the authorizations?
Tags on events for authorizations
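One possible reading of tags on events for authorizations is a filter applied before events are shown to a viewer. The sketch below is an assumption-laden illustration: the `auth_tags` field, the tag names and the deny-by-default rule for untagged events are all hypothetical.

```python
def visible_events(events, granted_tags):
    """Keep events whose authorization tags are all granted to the viewer.

    Events without any tag are hidden (deny by default).
    """
    granted = set(granted_tags)
    return [
        event
        for event in events
        if event["auth_tags"] and set(event["auth_tags"]) <= granted
    ]


events = [
    {"name": "policy generated", "auth_tags": ["ops"]},
    {"name": "password rotated", "auth_tags": ["ops", "security"]},
    {"name": "untagged event", "auth_tags": []},
]
print([e["name"] for e in visible_events(events, ["ops"])])
```

This does not answer who defines and enforces the authorizations; it only shows that tags give an enforcement point something to evaluate.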
Security?
49
Events, traces and logs hold critical data
Encrypting information vs. partial visibility?
Event sourcing & Tracing
51
Temporal relationships between Spans in a single Trace
––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time
 [Span A···················································]
   [Span B··············································]
      [Span D··········································]
    [Span C········································]
         [Span E·······]        [Span F··] [Span G··] [Span H··]
https://opentracing.io/specification/
Event sourcing & Tracing
52
Every component needs to know the context
● Carry the Span Context along with each event
Add some information to each event
● Save on logging thanks to context
Send these traces on a message bus
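Carrying the span context with each event can be sketched as follows. A standard-library `queue.Queue` stands in for a real message bus, and the message layout and the function names `new_context`, `publish` and `child_context` are assumptions made for the example.

```python
import json
import queue
import uuid


def new_context(trace_id=None, parent_span_id=None):
    """Create a span context; reuse the trace id when continuing a trace."""
    return {
        "trace_id": trace_id or uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex,
        "parent_span_id": parent_span_id,
    }


def publish(bus, context, name, **fields):
    """Send an event on the bus together with its span context."""
    bus.put(json.dumps({"context": context, "event": name, "fields": fields}))


def child_context(message):
    """On the receiving side, continue the trace carried by the message."""
    parent = json.loads(message)["context"]
    return new_context(parent["trace_id"], parent["span_id"])


bus = queue.Queue()  # stand-in for a real message bus
root = new_context()
publish(bus, root, "policy generated", config_id="cfg-42")

message = bus.get_nowait()
child = child_context(message)
publish(bus, child, "agent run started", node_id="node-1")
```

Since the context travels with the event, the receiver only needs to add what is new, which is where the saving on logging comes from.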

