Copyright © 2016 Splunk Inc.
Power of Splunk
Search Processing Language (SPL™)
Stephen Luedtke
Sr. Technical Marketing Mgr
Safe Harbor Statement
During the course of this presentation, we may make forward looking statements regarding future events
or the expected performance of the company. We caution you that such statements reflect our current
expectations and estimates based on factors currently known to us and that actual events or results could
differ materially. For important factors that may cause actual results to differ from those contained in our
forward-looking statements, please review our filings with the SEC. The forward-looking statements
made in this presentation are being made as of the time and date of its live presentation. If reviewed
after its live presentation, this presentation may not contain current or accurate information. We do not
assume any obligation to update any forward looking statements we may make. In addition, any
information about our roadmap outlines our general product direction and is subject to change at any
time without notice. It is for informational purposes only and shall not be incorporated into any contract
or other commitment. Splunk undertakes no obligation either to develop the features or functionality
described or to include any such feature or functionality in a future release.
Agenda
● Overview & Anatomy of a Search
– Quick refresher on search language and structure
● SPL Commands and Examples
– Searching, charting, converging, mapping,
transactions, anomalies, exploring
● Custom Commands
– Extend the capabilities of SPL
● Q&A
SPL Overview
● More than 140 search commands
● Syntax was originally based upon the Unix pipeline and SQL
and is optimized for time series data
● The scope of SPL includes data searching, filtering, modification, manipulation,
enrichment, insertion and deletion
● Includes anomaly detection and machine learning
Why Create a New Query Language?
● Flexibility and effectiveness on small and big data
● Late-binding schema
● More/better methods of correlation
● Not just analyze, but visualize
SPL Basic Structure
search and filter | munge | report | cleanup
sourcetype=access*
| eval KB=bytes/1024
| stats sum(KB) dc(clientip)
| rename sum(KB) AS "Total KB" dc(clientip) AS "Unique Customers"
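For readers who think in general-purpose code, the four stages of the basic structure (search and filter, munge, report, cleanup) can be approximated in plain Python. This is only a sketch under stated assumptions: the sample events and the function name `run_pipeline` are hypothetical, and the code illustrates the flow of data through the pipe, not how Splunk actually executes a search.

```python
from fnmatch import fnmatch

# Hypothetical sample events; field names follow the example search.
EVENTS = [
    {"sourcetype": "access_combined", "clientip": "10.0.0.1", "bytes": 2048},
    {"sourcetype": "access_combined", "clientip": "10.0.0.2", "bytes": 1024},
    {"sourcetype": "access_combined", "clientip": "10.0.0.1", "bytes": 512},
    {"sourcetype": "syslog", "clientip": "10.0.0.9", "bytes": 4096},
]


def run_pipeline(events):
    # search and filter: sourcetype=access*
    matched = [e for e in events if fnmatch(e["sourcetype"], "access*")]
    # munge: eval KB=bytes/1024
    munged = [{**e, "KB": e["bytes"] / 1024} for e in matched]
    # report: stats sum(KB) dc(clientip)
    report = {
        "sum(KB)": sum(e["KB"] for e in munged),
        "dc(clientip)": len({e["clientip"] for e in munged}),
    }
    # cleanup: rename sum(KB) AS "Total KB" dc(clientip) AS "Unique Customers"
    return {
        "Total KB": report["sum(KB)"],
        "Unique Customers": report["dc(clientip)"],
    }


result = run_pipeline(EVENTS)
print(result)  # {'Total KB': 3.5, 'Unique Customers': 2}
```

Each stage consumes the output of the previous one, which is the same left-to-right reading order as the pipe characters in the search.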
SPL Examples
SPL Examples and Recipes
● Find the needle in the haystack
● Charting statistics and predicting values
● Enriching and converging data sources
● Map geographic data in real time
● Identifying anomalies
● Transactions
● Data exploration & finding relationships between fields
● Custom Commands
Eval – Just Getting Started!
Splunk Search Quick Reference Guide
Stats, Timechart, Eventstats, Streamstats
Stats/Timechart – But Wait, There’s More!
Splunk Search Quick Reference Guide
Converging Data Sources
Index Untapped Data: Any Source, Type, Volume
● Data sources: Online Services, Web Services, Servers, Security, GPS Location, Storage, Desktops, Networks, Packaged Applications, Custom Applications, Messaging, Telecoms, Online Shopping Cart, Web Clickstreams, Databases, Energy Meters, Call Detail Records, Smartphones and Devices, RFID
● Environments: On-Premises, Private Cloud, Public Cloud
Ask Any Question
● Application Delivery
● Security, Compliance and Fraud
● IT Operations
● Business Analytics
● Industrial Data and the Internet of Things
Data Exploration
| analyzefields
| anomalies
| arules
| associate
| cluster
| contingency
| correlate
| fieldsummary
Machine Learning Toolkit and Showcase
Examples
● Predict Numeric Fields
● Predict Categorical Fields
● Detect Numerical Outliers
● Detect Categorical Outliers
● Forecast Time Series
● Cluster Events
Custom Commands
● What is a Custom Command?
– “| haversine origin="47.62,-122.34" outputField=dist lat lon”
● Why do we use Custom Commands?
– Run other/external algorithms on your Splunk data
– Save time munging data (see Timewrap!)
– Because you can!
● Create your own or download as Apps
– Haversine (Distance between two GPS coords)
– Timewrap (Enhanced Time overlay)
– Levenshtein (Fuzzy string compare)
– Base64 (Encode/Decode)
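To give a feel for the kind of external algorithm a custom command can wrap, here is a minimal sketch of the Levenshtein (edit) distance behind a fuzzy string compare. It is the textbook dynamic-programming version, not the code of the Levenshtein app; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

A small distance relative to the string length suggests two values are probably the same thing spelled differently, which is what makes the measure useful for fuzzy matching.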
Custom Commands – Haversine
● Download and install the Haversine app
● Read the documentation, then use it in SPL!
Examples
sourcetype=access*
| iplocation clientip
| search City=A*
| haversine origin="47.62,-122.34" units=mi outputField=dist lat lon
| table clientip, City, dist, lat, lon
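The distance that ends up in the dist field can be sketched outside Splunk with the standard haversine (great-circle) formula. This is a minimal sketch, not the app's source code: the function name `haversine_miles` and the Earth radius constant are assumptions made here, and the app may use a slightly different radius.

```python
import math

EARTH_RADIUS_MI = 3958.8  # approximate mean Earth radius in miles (assumption)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


ORIGIN = (47.62, -122.34)  # same origin as in the search above

# Distance from the origin to itself is zero.
print(haversine_miles(*ORIGIN, *ORIGIN))  # 0.0
```

In the search, lat and lon come from iplocation for each event, so the command effectively evaluates this function once per result row against the fixed origin.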
For More Information
● Additional information can be found in:
– Power Of SPL App!
– Search Manual
– Blogs
– Answers
– Exploring Splunk
Q & A
Thank you!

Editor's Notes

  • #2: This presentation has some animations and content to help tell stories as you go. Feel free to change ANY of this to your own liking! I would definitely practice your flow once or twice before a presentation. There is A LOT of content to get through in 1 hour. The slides with search examples can be unhidden if needed.
    Here is what you need for this presentation. You should have the following installed:
    – PowerOfSPL App - https://splunkbase.splunk.com/app/3353/
    – Custom Cluster Map Visualization - https://splunkbase.splunk.com/app/3122/
    – Clustered Single Value Map Visualization - https://splunkbase.splunk.com/app/3124/
    – Geo Heatmap Custom Visualization - https://splunkbase.splunk.com/app/3217/
    – Timewrap Custom Command (NOTE: this command is now included in core) - https://splunkbase.splunk.com/app/1645/
    – Haversine Custom Command - https://splunkbase.splunk.com/app/936/
    – Levenshtein Custom Command - https://splunkbase.splunk.com/app/1898/
    Optional:
    – Splunk Search Reference Guide handouts
    – Mini buttercups or other prizes to give out for answering questions during the presentation
    – The Shake! demo can be used for interactivity on some of these search examples if you want; it definitely adds some flair to the presentation
  • #3: Safe Harbor Statement
  • #4: *charting – To spice things up, we are going to make the session a little interactive: you will send data to Splunk in real time from your phones, and we’ll test out some of the Splunk commands on that real-time data.
    Disclaimer (what this class is vs. what it is not): This class is meant to showcase examples of the Splunk Search Processing Language. We’ll go through the basic steps of how to use a few of the commands, but for the most part it is a demo. You can learn much more in depth by enrolling in the Basic and Advanced Search and Reporting classes or by reading the docs online. Don’t worry: I’ll provide references for anything you see, and the examples will be available for download after the session.
    Opening tell for each agenda item (what is it, and why is it important?):
    – Anatomy of a Search: First we’ll do a quick refresher on the anatomy of a search and why it’s useful. It’s important to understand the basic flow of the language and also its benefits.
    – Examples of SPL: Next we’ll show how both basic and more advanced search commands can be used to answer real-world questions and build operational intelligence. In fact, we’ll break down a few of the searches in the Operational Intelligence demo you saw on the main stage. Additionally, we’ll look at how SPL can help you explore new and complex data. In my opinion, this is an often overlooked and really powerful benefit of SPL.
    – Custom Commands: Lastly, I’ll show how to extend the Splunk search language using custom commands. This is also exciting because the community has already made so many additions.
    – Q&A: And of course we’ll finish with some Q&A.
    Time (total 60 min): Overview: 5 min; Examples of SPL: 35 min; Custom Commands: 10 min; Q&A: 10 min
  • #6: We call them search commands, but they really do so much more, and that’s what I hope to get across to you today. “The Splunk search language has more than 140 commands, is very expressive, and can perform a wide variety of tasks ranging from filtering data, to munging or modifying it, to reporting.” “The syntax was …” “Why? Because SQL is good for certain tasks and the Unix pipeline is amazing!” This is great, BUT… WHY WOULD WE WANT TO CREATE A NEW LANGUAGE, AND WHY DO YOU CARE?
  • #7: <Engage the audience here. Before showing the bullet points, ask: “Why do you think we would want to create a new language?”> <Also feel free to change the pictures or flow of this slide.> Have buttercups to throw out if anyone answers correctly?
    Today we require the ability to quickly search and correlate through large amounts of data, sometimes in an unstructured or semi-structured way. Conventional query languages (such as SQL or MDX) simply do not provide the flexibility required for the effective searching of big data, and not only big data but STREAMING data. (SQL can be great at joining a bunch of small tables together, but really large joins on datasets can be a problem, whereas Hadoop can be great with larger data sets but is sometimes inefficient when it comes to many small files or datasets.)
    Machine data is different:
    – It is voluminous, unstructured time series data with no predefined schema
    – It is generated by all IT systems, from applications and servers to networks and RFIDs
    – It is non-standard data, characterized by unpredictable and changing formats
    Traditional approaches are just not engineered for managing this high-volume, high-velocity, and highly diverse form of data. Splunk’s NoSQL query approach does not involve or impose any predefined schema. This enables the increased flexibility mentioned above, as there are:
    – No limits on the formats of data
    – No limits on where you can collect it from
    – No limits on the questions that you can ask of it
    – No limits on scale
    Methods of correlation enabled by SPL:
    – Time & GeoLocation: Identify relationships based on time and geographic location
    – Transactions: Track a series of events as a single transaction
    – Subsearches: Results of one search as input into other searches
    – Lookups: Enhance, enrich, validate or add context to event data
    – SQL-like joins between different data sets
    In addition to flexible searching and correlation, the same language is used to rapidly construct reports, dashboards, trendlines and other visualizations. This is useful because you can understand and leverage your data without the cost associated with formally structuring or modeling the data first. (With Hadoop or SQL you run a job or query to generate results, but then you need to integrate more software to actually visualize it!) “OK.. Let’s move on..”
  • #11: “Let’s take a closer look at the syntax; notice the Unix pipeline.” “The structure of SPL creates an easy way to stitch a variety of commands together to answer almost any question you may ask of your data.” “Search and Filter” - The search and filter piece lets you use fields or keywords to reduce the data set. It’s an important but often overlooked part of the search because of its performance implications. “Munge” - The munge step is powerful because you can “re-shape” data on the fly. In this example we create a new field called KB from an existing field, “bytes”. “Report” - Once we’ve shaped and massaged the data, we have an abundant set of reporting commands for visualizing results through charts and tables, or even sending them to a third-party application in whatever format it requires. “Cleanup” - Lastly, there are some cleanup options to help you create better labels and add or remove fields. Again, stitching commands together makes advanced commands easier to use and understand, gives better flow, etc. Additionally, the implicit join on time and automatic granularity help reduce complexity compared to what you would have to do in SQL, Excel or other tools. “Let’s look at some more in-depth examples.”
  • #13: “In this next section we’ll take a more in-depth look at some search examples and recipes. It would be impossible for us to go over every command and use case, so the goal is to show a few different commands that can help solve most problems and generate quick time to value in the following areas.”
  • #15: “There are tons of eval functions to help you shape or manipulate your data the way you want it.” Optional: <Click on the image to show and scroll through the online quick reference guide>
  • #16: Note how the search assistant shows the number of both exact and similar matched terms before you even click search. This can be very useful when exploring and previewing your data sets without having to run searches over and over again to find a result.
  • #17: Additionally we can further filter our data set down to a specific host.
  • #18: Lastly, we can combine filters and keyword searches very easily. “This is pretty basic, but the key here is that SPL makes it incredibly easy and flexible to filter your searches down and reduce your data set to exactly what you’re looking for.”
  • #19: Remember munging, or re-shaping our data on the fly? Talk about eval and its importance. sourcetype=access* | eval KB=bytes/1024
  • #20: sourcetype=access* | eval http_response = if(status == 200, "OK", "Error")
  • #21: sourcetype=access* | eval connection = clientip.":".port
  • #23: There are 3 commands that are the basis of calculating statistics and visualizing results. Essentially, chart is just stats visualized, and timechart is stats by _time visualized. These SPL commands are extremely powerful and easy to use. “Let’s go through some examples. To make it more interesting, we’ll pull apart some searches and visualizations from one of the demos you saw on stage.” <Go to IT Ops Visibility, click on Storage indicator> 1. Use Read/Write OPs by instance for STATS, bonus w/ sparkline 2. Use Read/Write OPs for TIMECHART
  • #24: “Again, don’t forget about the quick reference guide. There are many more statistical functions you can use with these commands on your data.”
  • #28: Show difference between stats and timechart (adds _time buckets, visualize, etc.) Why is this awesome? We can do all of the same statistical calculations over time with almost any level of granularity. For example… <change timepicker from 60min to 15min, add span=1s to search and zoom in> Add below? Due to the implicit time dimension, it’s very easy to use timechart to visualize disparate data sets with varying time frequencies. SQL vs Timechart actual comparison?
  • #29: Walk through trendline basic options
  • #30: Walk through predict basic options “The timechart command plus other SPL commands make it very easy to visualize your data any way you want.”
  • #34: Context is everything when it comes to building successful operational intelligence. When you are stuck analyzing events from a single data source at a time, you might be missing out on rich contextual information or new insights that other data sources can provide. Let’s take a quick look at a few powerful SPL commands that can help make this happen.
  • #44: Walk through the dashboard which points out several uses of anomalydetection
  • #46: sourcetype=access* | transaction JSESSIONID
  • #47: sourcetype=access* | transaction JSESSIONID | stats min(duration) max(duration) avg(duration)
  • #48: NOTE: Many transactions can be re-created using stats. Transaction is easy, but stats is far more efficient and is a mappable command (more of the work is distributed to the indexers). sourcetype=access* | stats min(_time) AS earliest max(_time) AS latest by JSESSIONID | eval duration=latest-earliest | stats min(duration) max(duration) avg(duration)
  • #50: Feel free to change this and use your own story! “My interpretation of Data Exploration when it comes to Splunk is the process of characterizing and researching the behavior of both existing and new data sources.” “For example, you may have an existing data source you are already used to, but there could still be some unknown value in it in terms of patterns, relationships between fields and rare events that could point you to new insights or help with predictive analytics. This capability gives you confidence to explore new data sources as well, because you can quickly look for replacements and nuggets that stick out or help classify data. A friend once asked me to look at some biomedical data with DNA information. The vocabulary and field definitions were way above me, but I was able to quickly understand patterns and relationships with Splunk and provide them value instantaneously. With Splunk you literally become afraid of no data!” Let’s look at a few quick examples.
  • #51: “The cluster command is used to find common and/or rare events within your data” <Show simple table search first and point out # of events, then run cluster and sort on cluster count to show common vs rare events> * | table _raw _time * | cluster showcount=t t=.1 | table _raw cluster_count | sort - cluster_count
  • #52: Fieldsummary gives you a quick breakdown of your numerical fields, such as count, min, max, stdev, etc. It also shows example values from the events. I used maxvals to limit the number of samples shown per field. sourcetype=access_combined | fields - date* source* time* | fieldsummary maxvals=5
  • #53: “The correlate command is used to find co-occurrence between fields. Basically, it produces a matrix showing that ‘Field1 exists 80% of the time when Field2 exists’.” sourcetype=access_combined | fields - date* source* time* | correlate “This can be useful both for making sure your field extractions are correct (if you expect a field to exist 100% of the time when another field exists) and for helping you identify potential patterns and trends between different fields.”
  • #54: “The contingency command is used to look for relationships between two fields. Basically, for these two fields, how many different value combinations there are, what they are, and which are most common.” sourcetype=access_combined | contingency uri status
  • #55: The analyzefields command is extremely useful not only for finding meaningful fields in your data, but also for determining which fields to use in linear or logistic regression algorithms in the machine learning app. sourcetype=access_combined | analyzefields classfield=status
  • #56: If you want to learn more about Data Science, Exploration and Machine Learning, download the Machine Learning App! You’ll use new SPL commands like “fit” and “apply” to train models on data in Splunk. New SPL commands: fit, apply, summary, listmodels, and deletemodel * Predict Numeric Fields (Linear Regression): e.g. predict median house values. * Predict Categorical Fields (Logistic Regression): e.g. predict customer churn. * Detect Numeric Outliers (distribution statistics): e.g. detect outliers in IT Ops data. * Detect Categorical Outliers (probabilistic measures): e.g. detect outliers in diabetes patient records. * Forecast Time Series: e.g. forecast data center growth and capacity planning. * Cluster Events (K-means, DBSCAN, Spectral Clustering, BIRCH).
  • #58: Depending on the remaining time, you can show one or more custom command examples. “We’ve gone over a variety of Splunk search commands, but what happens when we can’t find a command that fits our needs, OR want to use a complex algorithm someone has already written, OR even want to create our own? Enter Custom Commands.” Additional text: Splunk's search language includes a wide variety of commands that you can use to get what you want out of your data and even to display the results in different ways. You have commands to correlate events and calculate statistics on your results, evaluate fields and reorder results, reformat and enrich your data, build charts, and more. Still, Splunk enables you to expand the search language, either by customizing these commands to better meet your needs or by writing your own search commands for custom processing or calculations.
  • #59: Let’s see Haversine in action. <Pull up search>
  • #60: *Note: the origin coordinates in this Haversine example are currently set to “Seattle”.
  • #61: References: Make sure to mention that this app is now available for download!
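
For presenters who want to explain what the eval searches in the notes for slides #19 to #21 compute, the following is a minimal Python sketch of the same three derived fields. It only mirrors the arithmetic and string logic on a plain dictionary; it is not how Splunk evaluates SPL, and the function name `apply_evals` and the sample event are hypothetical.

```python
def apply_evals(event):
    """Return a copy of the event with the three derived fields added."""
    out = dict(event)
    # eval KB=bytes/1024
    out["KB"] = event["bytes"] / 1024
    # eval http_response = if(status == 200, "OK", "Error")
    out["http_response"] = "OK" if event["status"] == 200 else "Error"
    # eval connection = clientip.":".port
    out["connection"] = f'{event["clientip"]}:{event["port"]}'
    return out


# Hypothetical access-log event, for illustration only.
sample = {"bytes": 2048, "status": 404, "clientip": "10.0.0.5", "port": 8080}
print(apply_evals(sample))
```

The original event is left untouched and the new fields are added to a copy, which loosely matches how eval adds fields to results at search time without changing the indexed data.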
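
The stats-based rewrite in the note for slide #48 can be illustrated outside Splunk. The Python sketch below assumes events are dictionaries with `JSESSIONID` and a numeric `_time`, keeps only the earliest and latest timestamp per session, and then summarizes the durations. The helper names and sample events are hypothetical.

```python
from statistics import mean


def session_durations(events):
    """Map each JSESSIONID to (latest - earliest) _time, in seconds."""
    bounds = {}
    for event in events:
        sid, t = event["JSESSIONID"], event["_time"]
        earliest, latest = bounds.get(sid, (t, t))
        # Only two numbers are kept per session, never the events themselves.
        bounds[sid] = (min(earliest, t), max(latest, t))
    return {sid: latest - earliest for sid, (earliest, latest) in bounds.items()}


def summarize(durations):
    """Equivalent of: stats min(duration) max(duration) avg(duration)."""
    values = list(durations.values())
    return {"min": min(values), "max": max(values), "avg": mean(values)}


# Hypothetical events, for illustration only.
events = [
    {"JSESSIONID": "A", "_time": 100},
    {"JSESSIONID": "B", "_time": 105},
    {"JSESSIONID": "A", "_time": 130},
    {"JSESSIONID": "B", "_time": 115},
    {"JSESSIONID": "C", "_time": 200},
]
durations = session_durations(events)
print(durations)
print(summarize(durations))
```

Because each session needs only a running minimum and maximum, partial results from different chunks of data can be merged cheaply, which is one plausible reading of why the note calls stats mappable.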
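
The Haversine custom command from slides #59 and #60 is not reproduced in these notes, so the following is only a generic sketch of the great-circle distance calculation that a command of that name would presumably perform. The Earth radius, the approximate Seattle coordinates and the function name `haversine_km` are assumptions made for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0        # mean Earth radius (assumed constant)
SEATTLE = (47.6062, -122.3321)  # approximate (lat, lon), assumed origin


def haversine_km(origin, point):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, point)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Hypothetical destination: approximate coordinates for Portland, Oregon.
portland = (45.5152, -122.6784)
print(round(haversine_km(SEATTLE, portland), 1))
```

In a real custom command, a function like this would be applied to the latitude and longitude fields of each event, with the origin passed in as an argument instead of being hard-coded.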