Private, Public, Hybrid: Unique Challenges with OpenStack
Gathering Info → CMDB
Ryszard Chojnacki, an OPS CMDB blueprint worker
27 October, 2015
Covering what?
• Define a scenario for collecting data
• Show actual payload headers used in one application
• Set the stage for a CMDB blueprint direction:
ETL vs. Federation approaches
Extract, Transform & Load
Send data as appropriate
• Allows for
– Complex and low-cost queries
– Can be built to accommodate loss
– History: what changed last week
Federation
Access the sources of information in “real time”
• Allows for
– What is the situation NOW!
– Works well for OpenStack APIs, depending on use-case
Set the scene
Imagine this Scenario
You have hardware, data and applications spread over multiple
locations – how can you aggregate metadata into one place?
Local source:
provisioning as example
{
  "fqdn": "compute-0001.env1.adomain.com",
  "serial": "USE1234567",
  "os_vendor": "Ubuntu",
  "os_release": "12.04",
  "role": "compute-hypervisor"
}
Suppose provisioning systems are created such that there is one for each environment
• Each system has a limited scope
• Each system must be uniquely identifiable to permit data aggregation
Global source:
asset management
Payload type rack_info:
{
  "rack_id": "r00099",
  "datacenter": "Frankfurt",
  "tile": "0404"
}
Payload type rack_contents:
{
  "in_rack": "r00099",
  "serial": "USE1234567",
  "u": 22,
  "pid": "qy799a"
}
Suppose that there is a single asset management tool that covers all environments
• Scope is global
• Unique ID still employed
• The example has more than one type of data:
• Each rack – rack_info
• Each asset – rack_contents
Snapshot Header
Message formats
payload
{
  "payload": { . . . }
}
Separate logically, by encapsulating
data into a payload document
For example, put here:
• Provisioning data
• Rack data
• Asset data
Message formats
version
{
  "version": {
    "major": 1
    // provider extension possible here; minor, tiny, sha1, …
  },
  "payload": { . . . }
}
“Schema” version for the payload
• The same major version indicates no incompatible changes to the schema
• Where compatible, the snapshot → live process will occur
Note: Documents don’t have schemas as such, but there must be some required, plus optional, key/value pairs so that consumers of the data can rely on it programmatically
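As a sketch, a consumer might gate the snapshot → live step on the major version like this (Python; `is_compatible` and `SUPPORTED_MAJOR` are hypothetical names, not from the deck):

```python
SUPPORTED_MAJOR = 1  # the major schema version this consumer understands

def is_compatible(message: dict) -> bool:
    """True when the message's major schema version matches ours.

    Only "major" is compared; provider extensions (minor, tiny,
    sha1, ...) are ignored for compatibility purposes.
    """
    return message.get("version", {}).get("major") == SUPPORTED_MAJOR
```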
Message formats
when was that?
{
  "batch_ts": <epoch_time>,
  "record_ts": <epoch_time>,
  "batch_isodate": "2015-01-16 16:07:21.503680", . . .
  "version": {
    "major": 1
  },
  "payload": { . . . }
}
Useful for understanding how old the data I’m seeing is
• Batch timestamp must be constant for all records in the same batch
• Record timestamp is when the record was exported / the message was created – may be the same as batch
• Updated, if available, is when the data was last changed in the source system
Note the human-readable _isodate forms, which are not used in processing
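A minimal sketch of producing the paired machine/human timestamps (Python; `stamp_batch` is a hypothetical helper):

```python
from datetime import datetime, timezone

def stamp_batch() -> dict:
    """Produce one batch's paired timestamps: the epoch form drives
    processing; the _isodate form is for human readers only."""
    now = datetime.now(timezone.utc)
    return {
        "batch_ts": now.timestamp(),                             # epoch seconds, used in processing
        "batch_isodate": now.strftime("%Y-%m-%d %H:%M:%S.%f"),   # informational
    }

header = stamp_batch()
```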
Message formats
{
  "source": {
    "system": "rackreader.adomain.net",
    "type": "rack_info",
    "location": "env"
  },
  "batch_ts": <epoch_time>,
  "record_ts": <epoch_time>,
  "import_ts": <epoch_time>,
  "version": {
    "major": 1
  },
  "payload": { . . . }
}
Provides where the data came from and the type of data
• System: usually the FQDN of the source system
• Location: the scope of the system
• Type: describes the payload content, and is tied to the schema
Message formats
{
  "record_id": "r00099-rackreader.adomain.net",
  "msg_type": "snapshot",
  "source": {
    "system": "rackreader.adomain.net",
    "type": "rack_info",
    "location": "env"
  },
  "batch_ts": <epoch_time>,
  "record_ts": <epoch_time>,
  "import_ts": <epoch_time>,
  "version": {
    "major": 1
  },
  "payload": { . . . }
}
Mark the content with a unique ID for that record, and with how to process it
• A combination of an identifier in the source system plus an FQDN makes for a globally unique value
• This value is the primary key for all data operations on the record
• This is a “snapshot”; how that is processed is described shortly
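Assembling a full snapshot message could look like this (Python sketch; `make_snapshot_message` is a hypothetical helper, field names follow the format above):

```python
import time

def make_snapshot_message(source_id: str, system_fqdn: str,
                          payload_type: str, location: str,
                          batch_ts: float, payload: dict) -> dict:
    """Build a snapshot message. record_id combines the source-system
    identifier with the system FQDN, which makes it globally unique
    and usable as the primary key."""
    return {
        "record_id": f"{source_id}-{system_fqdn}",
        "msg_type": "snapshot",
        "source": {
            "system": system_fqdn,
            "type": payload_type,
            "location": location,
        },
        "batch_ts": batch_ts,       # constant for every record in the batch
        "record_ts": time.time(),   # when this message was created
        "version": {"major": 1},
        "payload": payload,
    }
```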
Implementation
Philosophy employed
• Operating at large scale → expect to have issues
– Small % error × a big number = some degree of loss
• Tolerant of loss
• Considerate of resources
• Wanted history
– Need easy access to the very latest
• Need a flexible [document] schema – this is JSON
– Provider/Agent is the owner of the schema for its data
– Need a way to converge; communities of practice
Snapshot versus event
based updates
Example
• Snapshot updates every 8h
– Larger data set but not very frequent
• Live updates as they occur
– Tiny amounts of data, sent as changes happen
• Result
– Minimal network utilization
– Small overhead on source
• Use the combination that best
suits the need
We run 2 collections
• Snapshot – has history
• Live – has only the latest
Snapshots update Live
Message type overview
Snapshot
• Snapshot
– Defines a Snapshot record
• Batch record count
– Defines how many items are in a batch
– Live is updated only if the sizes match
– Required to know what to delete
Live
• Overwrite
– Overwrites a complete doc in Live for a single record
• Delete
– Deletes a single record from Live
– Never affects Snapshot
Message formats
snapshot_size
{
  "msg_type": "snapshot_size",
  "source": {
    "system": "rackreader.adomain.net",
    "type": "rack_info",
    "location": "env"
  },
  "size": 3,
  "batch_ts": <epoch_time>
}
If the consumer has received the number of messages indicated, then the update of Live is possible
Any records received are always placed in the snapshot collection
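An in-memory sketch of the consumer side (Python; `apply_batch` and the store names are hypothetical — a real deployment would sit behind RabbitMQ and a document store):

```python
def apply_batch(snapshot_msgs: list, size_msg: dict,
                snapshot_store: list, live_store: dict) -> bool:
    """Apply one batch of snapshot messages to the two collections.

    Snapshot records are always kept (history keeps growing); Live is
    refreshed only when the received count matches the snapshot_size
    message, so a partially lost batch never corrupts Live.
    """
    snapshot_store.extend(snapshot_msgs)          # snapshot always grows
    if len(snapshot_msgs) != size_msg["size"]:
        return False                              # incomplete batch: leave Live alone
    batch_ids = {m["record_id"] for m in snapshot_msgs}
    for msg in snapshot_msgs:
        live_store[msg["record_id"]] = msg        # upsert the latest version
    # records from the same source that are absent from this batch no longer exist
    stale = [rid for rid, m in live_store.items()
             if m["source"] == size_msg["source"] and rid not in batch_ids]
    for rid in stale:
        del live_store[rid]
    return True
```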
Message formats
overwrite
{
  "msg_type": "overwrite",
  "record_id": "r00099-rackreader.adomain.net",
  "source": {
    "system": "rackreader.adomain.net",
    "type": "rack_contents",
    "location": "env"
  },
  "version": {
    "major": 1
  },
  "record_ts": <epoch_time>,
  "payload": { . . . }
}
• Separate the header info from
the payload data
Message formats
delete
{
  "msg_type": "delete",
  "record_id": "r00099-rackreader.adomain.net",
  "source": {
    "system": "rackreader.adomain.net",
    "type": "rack_contents",
    "location": "env"
  },
  "version": {
    "major": 1
  },
  "record_ts": <epoch_time>
}
• Separate the header info from
the payload data
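The live-only message types might be routed like this (Python sketch; `handle_live_message` is a hypothetical name):

```python
def handle_live_message(msg: dict, live_store: dict) -> None:
    """Route a live (event-based) message into the Live collection.

    Overwrite replaces the whole document for one record; delete
    removes it. Neither touches the snapshot collection, so history
    is preserved there.
    """
    rid = msg["record_id"]
    if msg["msg_type"] == "overwrite":
        live_store[rid] = msg          # replace the complete doc
    elif msg["msg_type"] == "delete":
        live_store.pop(rid, None)      # idempotent: ok if already gone
```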
Direction
Noteworthy
• If we lose an event we catch up in the batch update
• If we lose a batch, data is just 1 batch cycle stale
• Several companies have arrived at this position
• Records are fairly small
– RabbitMQ friendly
– Easy to search in your data store
CMDB blueprint
• Set the stage for a CMDB blueprint direction:
– Collect
– Store
– Query
• Focus on the Collection framework
• Community of Practice
– Share common stuff; hopefully an ever-expanding domain
– Permit ad-hoc sources, for what you have now
Thank you!
Message processing
There are 2 collections of data: snapshot and live
• Snapshot always keeps growing
• Live only has 1 entry per record
• A live update to “B” goes straight into Live, and is later updated again by the snapshot
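The two-collection behaviour can be sketched in a few lines (Python; all names are illustrative, not from the deck):

```python
snapshot: list = []   # full history, always grows
live: dict = {}       # exactly one entry per record_id

def live_update(msg: dict) -> None:
    """An event-based update goes straight into Live."""
    live[msg["record_id"]] = msg

def snapshot_update(msg: dict) -> None:
    """A snapshot record is appended to history and also refreshes Live."""
    snapshot.append(msg)
    live[msg["record_id"]] = msg

live_update({"record_id": "B", "payload": {"u": 22}})      # straight into Live
snapshot_update({"record_id": "B", "payload": {"u": 22}})  # later confirmed by the snapshot
snapshot_update({"record_id": "B", "payload": {"u": 23}})  # next cycle: history grows, Live keeps the latest
```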