Copyright © 2014 Splunk Inc. 
Sustainable Logging: SUCCEEDING WITH SPLUNK
2 
Paul Gilowey 
Foundation Technology Specialist 
paul.gilowey@santam.co.za 
@paulcgt 
Sustainable Logging: 
SUCCEEDING WITH SPLUNK 
Words and thoughts expressed herein are my own, and not those of Santam.
3 
www.dan-dare.org
4 
My technology background
5 
The evolution that led to Splunk
6 
In the beginning there was ONE. 
depotwallpaper.com
7 
Then things got really complex.
8
9
10 
In 2012, a new project
11 
A big decision 
It’s time to say goodbye…
12 
Highly distributed and integrated
13 
A brand new world 
Claims 
Finance 
Docs 
B2B 
Portal 
Legacy 
Reverse Proxies
Load-balancers 
IDM 
Integration 
ESM 
Virtualisation 
New Policy Administration 
MDM
14 
James Wheeler 
souvenirpixels.com 
Too many logs to monitor
15 
capetownstockphotos.com 
So little time to trace problems
16 
Not only in production 
https://www.flickr.com/photos/wsdot/
17 
On a tight timeline
18 
https://www.flickr.com/photos/usnavy/ 
December 2013 Production and Non-Production 20GB
19 
Now what? 
So we’re collecting log events.
20 
Developers like doing things the old way
21 
tail -f ./catalina.out
22 
We like this. It’s comforting.
23 
Effecting change
24 
CTO’s Office 
Splunk users (dev, ops, etc.) 
Choosing your champion
25 
Your champion should…
•have influence across departments
•act as product owner
•be fanatical
•be hands-on
•have a development background
•be an architect
Dave Keeshan - https://www.flickr.com/photos/spudmurphy/
26 
Tips to help your champion
27 
Help developers troubleshoot (even in dev) 
Ed Yourdon https://www.flickr.com/photos/yourdon/
28 
Change how developers think about log events
29 
Police lazy logging 
[INFO ] Got here 
[INFO ] finished loop 420 
[INFO ] JDE… 
[INFO ] >>>>>>>>AAAAAAAA 
[INFO ] BBBBBBBBBBBBBBB 
[ERROR] It failed!!!!!!
30 
Ops might as well be blindfolded. 
https://www.flickr.com/photos/foxtongue
31 
Do you really want to be called at 2am?
32 
Demonstrate thoughtful logging 
[DEBUG] TxId=328, Counting invoice line items… 
[INFO ] TxId=328, Invoice LineItemsTotal=420 
[DEBUG] TxId=328, Calling remote service JDE… 
[TRACE] TxId=328, JDE Request: {"TxID":"328", "Items":[{"desc":"Motor Vehicle","prem":305.24},…
[WARN ] TxId=328, Timed out while calling remote service JDE… target system may be down. Will retry in 30s.
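The events above all lead with the same transaction ID and name their values. A minimal sketch of how a developer might produce events in that shape with Python's standard `logging` module follows. The helper `format_event`, the logger name `invoice`, and the function `count_line_items` are hypothetical names chosen for illustration, not part of the original talk.

```python
import logging

def format_event(tx_id, message, **fields):
    """Lead with the transaction ID, then the message, then key=value pairs."""
    pairs = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"TxId={tx_id}, {message}" + (f" {pairs}" if pairs else "")

# Emit every level; the level name is padded so columns line up for short names.
logging.basicConfig(format="[%(levelname)-5s] %(message)s", level=logging.DEBUG)
log = logging.getLogger("invoice")  # hypothetical logger name

def count_line_items(tx_id, items):
    """Log what is about to happen at DEBUG and the named result at INFO."""
    log.debug(format_event(tx_id, "Counting invoice line items..."))
    total = len(items)
    log.info(format_event(tx_id, "Invoice", LineItemsTotal=total))
    return total

if __name__ == "__main__":
    count_line_items(328, ["Motor Vehicle", "Household Contents"])
```

Carrying the transaction ID in every event is what lets a single search stitch one request together across components.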
33 
Show the benefit of structured log events
[INFO] Purchase complete - total=42 currency=ZAR language=en_ZA priority=13
“Purchase complete” priority<4 | stats sum(total) as currencyTotal by currency | table currency, currencyTotal
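To make the benefit concrete, here is a rough Python sketch of what the search on this slide does: pull out the key=value pairs, keep purchases with priority below 4, and sum total by currency. The sample events and the function names are assumptions made for illustration. Splunk extracts key=value pairs for you at search time, so this only mimics the idea.

```python
import re
from collections import defaultdict

# Matches key=value pairs such as total=42 or currency=ZAR.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")

def extract_fields(event):
    """Return the key=value pairs found in one log event as a dict of strings."""
    return dict(KV_PATTERN.findall(event))

def total_by_currency(events, max_priority=4):
    """Sum 'total' per currency for purchase events with priority < max_priority."""
    totals = defaultdict(float)
    for event in events:
        if "Purchase complete" not in event:
            continue
        fields = extract_fields(event)
        try:
            priority = int(fields["priority"])
            amount = float(fields["total"])
            currency = fields["currency"]
        except (KeyError, ValueError):
            continue  # skip events that lack the expected fields
        if priority < max_priority:
            totals[currency] += amount
    return dict(totals)

# Hypothetical sample events, shaped like the one on the slide.
events = [
    "[INFO] Purchase complete - total=42 currency=ZAR language=en_ZA priority=3",
    "[INFO] Purchase complete - total=8 currency=ZAR language=en_ZA priority=1",
    "[INFO] Purchase complete - total=10 currency=USD language=en_US priority=13",
]

if __name__ == "__main__":
    print(total_by_currency(events))
```

None of this works on a free-text message such as "Got here", which is the point of the slide.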
34
11 Sep 2014 15:05:27,960 [Thread-428] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver - btid=77320d33-5f8c-4178-b13e-c594816463d8, cmpid=za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver, uid=System, za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver.processStatusMessage : Status [STATUS_PROCESSING_COMPLETED = 6], will act on [STATUS_FINISHED = 1], for now only GENERATE_DIGITAL_DOCUMENT.
11 Sep 2014 15:05:36,272 [Thread-428] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communication.outboundcommunicationmanager.RunnableReceiver - btid=e76665e2-e876-455a-a087-aeb5ba97d5a8, cmpid=za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver, uid=System, za.co.santam.communication.outboundcommunicationmanager.RunnableStatusReceiver.processMessages : Blocking(2000) read storage until message arrives...
11 Sep 2014 15:05:36,472 [Thread-427] [DEBUG] [stm.amx.communication.outboundcommunicationmanager] za.co.santam.communication.outboundcommunicationmanager.RunnableReceiver - btid=e76665e2-e876-455a-a087-aeb5ba97d5a8, cmpid=za.co.santam.communication.outboundcommunicationmanager.RunnableStorageReceiver, uid=System, za.co.santam.communication.outboundcommunicationmanager.RunnableStorageReceiver.processMessages : message received.
11 Sep 2014 15:05:36,475 [Thread-427] [TRACE] [com.tibco.amx.platform] com.tibco.governance.amxagent.msginterceptor.component.AMXGovMsgInterceptorComponent - Target URI : urn:amx:env2/stm.amx.communication.outboundcommunicationmanager/StatusReceiver_1.2.0.v2014-09-10-1604#reference(StatusReceiver_ContentManagerProxyAsync_v4_Int).
Change this…
35 
… into this.
36 
Formalise stacktrace logging policy 
Function call -> 
Function call -> 
Function call -> 
Function call 
<- Log stacktrace 
<- Log stacktrace 
<- Log stacktrace 
<- Log stacktrace
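The diagram shows the same stack trace being logged at every level as the failure unwinds. The slide does not spell out the policy itself, so the sketch below assumes one common choice: inner layers add context and re-raise, and only the outermost boundary logs the trace. All function names, the logger name `claims`, and the policy ID are hypothetical.

```python
import logging

log = logging.getLogger("claims")  # hypothetical logger name

def load_policy(policy_id):
    """Innermost call: fails, but does not log."""
    raise KeyError(f"policy {policy_id} not found")

def price_claim(policy_id):
    """Middle layer: adds context and re-raises instead of logging the trace."""
    try:
        return load_policy(policy_id)
    except KeyError as exc:
        raise RuntimeError(f"could not price claim for policy {policy_id}") from exc

def handle_request(tx_id, policy_id):
    """Outermost boundary: logs the chained stack trace exactly once."""
    try:
        price_claim(policy_id)
        return True
    except RuntimeError:
        # log.exception records at ERROR level and appends the traceback,
        # including the original KeyError via exception chaining.
        log.exception("TxId=%s, Claim pricing failed", tx_id)
        return False

if __name__ == "__main__":
    logging.basicConfig(format="[%(levelname)-5s] %(message)s")
    handle_request(328, "P-1")
```

Under this assumed policy, one failure produces one ERROR event with one trace, rather than four copies of it.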
37 
Avoid filtering events. 
[DEBUG] TxId=328, Real important debug statement. 
[INFO ] TxId=328, This would have been useful to see... 
[DEBUG] TxId=328, Useful when we really need it. 
[TRACE] TxId=328, Oh man, I need this event so bad. 
[DEBUG] TxId=328, Flippin’ important debug message. 
[INFO ] TxId=328, This would have been useful to see... 
[WARN ] TxId=328, Why am I logging at all?
38 
Avoid filtering events. 
[WARN ] TxId=328, Real important debug statement. 
[WARN ] TxId=328, This would have been useful to see... 
[WARN ] TxId=328, Useful when we really need it. 
[WARN ] TxId=328, Oh man, I need this event so bad. 
[WARN ] TxId=328, Flippin’ important debug message. 
[WARN ] TxId=328, Cummon, I *really* wanna see this! 
[WARN ] TxId=328, Why am I logging at all?
39 
tail -f ./catalina.out
40 
Why developer buy-in matters
41 
“A fool with a tool is still a fool.” Grady Booch
42 
•Laughable deadlines 
•Long days, longer nights 
•Management pressure
43 
If we log excessively…
44 
Bob B. Brown - https://www.flickr.com/photos/beleaveme
45 
tail -f ./catalina.out
46 
Nope, no fires today, folks. 
Robert du Bois https://www.flickr.com/photos/lordisgood
47 
No value, no money. 
Neubie - https://www.flickr.com/photos/neubie/
48 
Shelfware. 
Robert Couse-Baker https://www.flickr.com/photos/29233640@N07/
49 
8 steps to successful implementation
50 
Start small (but plan to grow big) 
Pewstruck.com - https://www.flickr.com/photos/canoodlepets/ 
1
51 
Start with a clean slate
2
52 
Learn 
Implement 
Stabilise 
Spread the word 
Refine 
Take a smart approach
3
53
Dashboards are pretty, alerts are king
Reactive becomes proactive
Register defects (ERROR = defect)
Filter, don’t flood mailboxes
Build alerts and set policy
4
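"Filter, don’t flood mailboxes" can be pictured as throttling: once an alert has fired, repeats of the same alert are suppressed for a while. Splunk alerts offer throttling of their own; the sketch below only illustrates the idea. The class `AlertThrottle`, its parameters, and the alert keys are hypothetical.

```python
class AlertThrottle:
    """Suppress repeats of the same alert inside a fixed time window."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._last_sent = {}  # alert key -> time the alert was last sent

    def should_send(self, alert_key, now):
        """Return True if this alert should be mailed at time `now` (in seconds)."""
        last = self._last_sent.get(alert_key)
        if last is not None and now - last < self.window_seconds:
            return False  # same alert fired recently: stay quiet
        self._last_sent[alert_key] = now
        return True

if __name__ == "__main__":
    throttle = AlertThrottle(window_seconds=300)
    for second in (0, 100, 300):
        print(second, throttle.should_send("JDE timeout", now=second))
```

Keying on the alert name rather than the raw event text is a design choice here: two timeouts with different transaction IDs still count as the same alert.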
54
Get a feel for the pain
Make sure filtering is working
Police false positives
Receive all alerts yourself
5
55
Mine their data yourself
–Find what’s difficult to show
–Build dashboards to showcase their solutions
Broaden their minds – complement traditional BI by using log events
Help managers look good
6
56 
“Not too hot, not too cold, just right!” 
“Meh – too sloooow…” 
“Too expensive!” 
Apply the Goldilocks Principle 
7
57 
Monitor licence usage by source or source type 
index=_internal source=*metrics.log 
group="per_sourcetype_thruput" 
| stats sum(kb) as KB by series 
| where KB > 20000 
8
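The search above totals the kb value per series (here, per source type) from Splunk's internal metrics.log and keeps only the heavy hitters. As a rough illustration of the same aggregation outside Splunk, the Python sketch below does it over a few made-up lines. The sample lines only approximate the layout of metrics.log, and the function name and threshold parameter are assumptions.

```python
import re
from collections import defaultdict

SERIES = re.compile(r'series="([^"]+)"')
KB = re.compile(r"\bkb=([0-9.]+)")  # matches kb=..., not kbps=...

def kb_by_series(lines, threshold_kb=20000):
    """Sum kb per series for per_sourcetype_thruput lines; keep totals above the threshold."""
    totals = defaultdict(float)
    for line in lines:
        if "group=per_sourcetype_thruput" not in line:
            continue
        series = SERIES.search(line)
        kb = KB.search(line)
        if series and kb:
            totals[series.group(1)] += float(kb.group(1))
    return {name: total for name, total in totals.items() if total > threshold_kb}

# Hypothetical sample lines, loosely modelled on metrics.log.
lines = [
    'INFO Metrics - group=per_sourcetype_thruput, series="catalina", kbps=1.2, kb=15000.0',
    'INFO Metrics - group=per_sourcetype_thruput, series="catalina", kbps=0.8, kb=9000.5',
    'INFO Metrics - group=per_sourcetype_thruput, series="access_log", kbps=0.1, kb=500.0',
    'INFO Metrics - group=per_host_thruput, series="web01", kbps=9.9, kb=99999.0',
]

if __name__ == "__main__":
    print(kb_by_series(lines))
```

A source type that suddenly crosses the threshold is usually the first sign that someone has turned on excessive logging.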
58 
Wrapping up
59 
To be successful:
Encourage thoughtful logging
Promote good logging practices
Police bad behaviour
Be intimately involved
Adopt a helpful attitude
Make sure you show value
Thanks for listening! 
Paul Gilowey 
Foundation Technology Specialist 
paul.gilowey@santam.co.za 
@paulcgt


Sustainable Logging – SplunkLive! 2014
