Copyright © 2015 Splunk Inc.
Take Splunk to the Next Level:
Architecture
2
Legal Notices
During the course of this presentation, we may make forward-looking statements regarding future
events or the expected performance of the company. We caution you that such statements reflect our
current expectations and estimates based on factors currently known to us and that actual events or
results could differ materially. For important factors that may cause actual results to differ from those
contained in our forward-looking statements, please review our filings with the SEC. The
forward-looking statements made in this presentation are being made as of the time and date of its live
presentation. If reviewed after its live presentation, this presentation may not contain current or
accurate information. We do not assume any obligation to update any forward-looking statements
we may make. In addition, any information about our roadmap outlines our general product direction
and is subject to change at any time without notice. It is for informational purposes only and shall
not be incorporated into any contract or other commitment. Splunk undertakes no obligation either
to develop the features or functionality described or to include any such feature or functionality in a
future release.
3
Splunk at the Next Level
Time to move beyond initial Splunk environment
• More use cases – how to tackle?
• More data – how do we scale?
• Splunk is mission critical == HA
• Global deployments
• Improving Splunk user experience
4
Growing your Splunk Deployment
Many customers start with a single use case…
• Ex: Monitor the web servers
• Help ensure up-time & response times
• Track usage, errors
• Provides business value
5
Growing your Splunk Deployment
Value statement for each overall service
Your services exist in a larger context than just one app, or one tier.
What is the value of the service as a whole?
What are CIO commitments for the service?
• The organization’s web site is one of the most critical parts of the business.
• Performance of the overall environment must be maintained at all times.
• Failures in any portion of the web site must be quickly identified, and
notifications sent to the appropriate parties.
• Dependencies on external processes must be monitored as well.
6
Growing your Splunk Deployment
The larger context
• Failure in one system cascades
• Map dependencies, estimate costs
• Use Splunk to track all dependencies.
• What happens when it is down?
Dependencies often include:
• Networking dependencies
• Shared storage
• Databases, middleware, custom apps
• Virtualization layer
7
Scales to Hundreds of TBs/Day
Enterprise-Class Scale, Resilience and Interoperability
Send data from thousands of servers using any combination of Splunk Forwarders
Auto load-balanced forwarding to Splunk Indexers
Offload search load to Splunk Search Heads
Visibility Across Datacenters
Distributed search unifies the view
across locations
Role-based access controls how far a given
user's search will span
Locations shown: New York, Tokyo, London, Cloud
9
Product Roles
Searching and Reporting (Search Head)
Indexing and Search Services (Indexer)
Data Collection and Forwarding (Forwarder)
Indexer Cluster Master, SHC Deployer
Distributed Management / Deployment Server
License Master, Distributed Mgmt Console
Data sources: Databases, Networks, Servers, Virtual Machines, Smartphones and Devices, Custom Applications, Security, Web Servers, Sensors
Forwarders
11
Example Forwarding Tier
12
Splunk Universal Forwarder
Why use the UF over other methods?
Collect syslog / event log / custom application logs
Collect configuration files, registry settings
Collect data NOT in log files: scripted inputs on current state
Collect wire data – Splunk Stream
Faster, Lower overhead than “agentless” polling
Centrally administered
… and
13
Forwarder Load Balancing
Have UF balance across multiple indexers
Load Balance
– Multiple hosts in outputs
– DNS round robin
– No external load balancer needed!
Geography-based routing
Optional SSL encryption
Compressed 10 to 1
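A rough way to picture forwarder load balancing is a forwarder that re-picks its target indexer at a fixed interval. The sketch below is only a simulation in Python, not Splunk code: the 30-second switch interval, the random choice, the data rate, and the indexer names are all assumptions made for illustration.

```python
import random
from collections import Counter


def simulate_auto_lb(indexers, duration_s, switch_every_s=30, kb_per_s=256, seed=0):
    """Return the KB sent to each indexer over `duration_s` seconds.

    Assumption of this sketch: the forwarder picks one indexer at random
    and keeps sending to it until the next switch interval.
    """
    rng = random.Random(seed)
    sent = Counter({name: 0 for name in indexers})
    for start in range(0, duration_s, switch_every_s):
        # The last interval may be shorter than the switch interval.
        interval = min(switch_every_s, duration_s - start)
        target = rng.choice(indexers)
        sent[target] += kb_per_s * interval
    return dict(sent)


if __name__ == "__main__":
    totals = simulate_auto_lb(["idx1", "idx2", "idx3"], duration_s=3600)
    for name, kb in sorted(totals.items()):
        print(f"{name}: {kb / 1024:.1f} MB")
```

Over a long enough window the totals even out across indexers, which is why no separate load balancer sits in front of the indexing tier in this design.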
14
Deployment Server
Central management of Splunk Forwarders
Deployment Server manages Apps, Configs
Select one or more classes for each host
Class defines apps & configs
Works by phone-home
Notes:
DS does not push forwarder binaries
Use Cluster Master to manage indexers in cluster, not DS
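The class-to-host mapping can be sketched as a lookup: each class lists host-name patterns and the apps it delivers, and a host receives the union of apps from every class it matches. This is a Python illustration of the idea only; the class names, patterns, and app names are hypothetical and do not reflect the Deployment Server's actual configuration format.

```python
from fnmatch import fnmatchcase

# Hypothetical server classes: host-name patterns -> apps to deliver.
SERVER_CLASSES = {
    "all_forwarders": {"patterns": ["*"], "apps": ["outputs_base"]},
    "web_servers": {"patterns": ["web-*"], "apps": ["apache_inputs"]},
    "db_servers": {"patterns": ["db-*"], "apps": ["db_inputs"]},
}


def apps_for_host(host, server_classes=SERVER_CLASSES):
    """Return the sorted apps a host receives from every class it matches."""
    apps = set()
    for server_class in server_classes.values():
        if any(fnmatchcase(host, pattern) for pattern in server_class["patterns"]):
            apps.update(server_class["apps"])
    return sorted(apps)


if __name__ == "__main__":
    for host in ("web-01", "db-07", "mail-02"):
        print(host, "->", apps_for_host(host))
```

A host that matches several classes simply gets all of their apps, which mirrors "select one or more classes for each host".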
15
Forwarding Tier Design Best Practices
• Use a Syslog Server for Syslog data
• Deployment server (on a VM) for central management
• Let AutoLB distribute data across available indexers
• May need to increase UF throughput setting for high velocity sources
– Enable forceTimebasedAutoLB (for more even distribution)
– maxKBps (to adjust throttling)
Questions?
Indexers
17
Indexers
Dedicated indexers serve three primary roles:
• Data Storage
– Processing and parsing at index-time
– Indexing
• Data Management
– Hot / warm / cold data rotation
– Aging and removal
• Data Retrieval
– Perform search upon request, return data to search heads
18
Scaling - Indexers
Sizing for index performance
Indexers are usually storage-bound
Indexers: 150 to 250 GB per day, each. (With reference HW.)
Ref HW: 12 cores (2 GHz+), 12 GB RAM, 800+ IOPS
Optimal HW (normal disk): 16 CPU cores, 48 GB RAM
Optimal HW (SSD): 24 CPU cores, 132 GB RAM
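As a first-pass estimate, the 150 to 250 GB per day per indexer range turns into an indexer count by dividing daily volume by per-indexer capacity and rounding up. A minimal Python sketch; the 1,000 GB/day volume in the usage example is a made-up figure, and search load is ignored.

```python
import math


def indexers_needed(daily_gb, per_indexer_gb=200):
    """Return how many indexers are needed for a daily ingest volume.

    `per_indexer_gb` is the daily volume one reference-hardware indexer
    is assumed to handle (the slide gives 150 to 250 GB per day).
    """
    if daily_gb < 0 or per_indexer_gb <= 0:
        raise ValueError("volumes must be positive")
    return max(1, math.ceil(daily_gb / per_indexer_gb))


if __name__ == "__main__":
    daily_gb = 1000  # hypothetical daily ingest
    for capacity in (150, 200, 250):
        print(f"{capacity} GB/day per indexer -> {indexers_needed(daily_gb, capacity)} indexers")
```

Running the estimate at both ends of the range gives a bracket rather than a single number, which is usually the more honest answer at planning time.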
Questions?
19
Tiered Storage
• Splunk supports tiered storage
• Hot / Warm buckets – put on fastest disk
• Size Hot/Warm for normal saved search durations. (7d, 30d)
• Use slower / cheaper storage (NAS?) for long term access
• Optional: Use Frozen to roll data to Glacier, Hadoop, etc.
20
SSD Advantage
http://blogs.splunk.com/2012/05/10/quantifying-the-benefits-of-splunk-with-ssds/
• Low cost random seeks
• Writes are not that much faster – no great improvement with Indexing
• Significant improvements with Sparse/needle-haystack searches
• Dense searches become CPU bound
• Searches run faster allowing for more completed searches/min
• Use enterprise-grade SSDs, not consumer-grade.
21
Scaling - Storage
Manual storage calculation
Raw data rate → net compression of ~50% on disk.
Simple: rate * compression * retention / #indexers
Hot / warm requirements
– 200 GB / day * 50% * 30 days = 3TB per indexer
Cold storage requirements
– 200 GB / day * 50% * 335 days = 33.5TB per indexer
Clustering
– Changes storage story completely
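The manual calculation above is simple enough to script. The sketch below reproduces the slide's own figures (200 GB/day, 50% compression, 30 days hot/warm, 335 days cold); the function name and the default arguments are illustrative choices, and clustering overhead is not included.

```python
def storage_per_indexer_gb(daily_gb, retention_days, compression=0.5, indexers=1):
    """rate * compression * retention / #indexers, in GB."""
    if indexers < 1:
        raise ValueError("need at least one indexer")
    return daily_gb * compression * retention_days / indexers


if __name__ == "__main__":
    hot_warm = storage_per_indexer_gb(200, 30)   # 30 days on fast disk
    cold = storage_per_indexer_gb(200, 335)      # remaining 335 days of a year
    print(f"Hot/warm: {hot_warm / 1000:.1f} TB")
    print(f"Cold:     {cold / 1000:.1f} TB")
    print(f"Total:    {(hot_warm + cold) / 1000:.1f} TB")
```

Here 200 GB/day is treated as the rate arriving at a single indexer; pass the total daily volume and the indexer count instead to spread the same retention across a tier.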
22
Scaling - Storage
One example of good local storage
A well configured indexer using local storage might look like:
• SSDs in RAID 5, sized for 14 days of storage
• SATA drives in RAID 5, sized for 6 months of storage
SSDs: RAID 5 provides decent performance
Spinning disks:
• Hot/Warm, RAID 1+0, 800 IOPS or faster
• Cold – RAID 5 with proper block / stripe sizing
23
Scaling - Storage
Sizing Calculator: http://splunk-sizing.appspot.com/
Indexer Clustering
25
Delivers Mission-Critical Availability
• Data replication – maintain searchability even if servers go down
• Multi-site capable – maintain searchability even if a site goes down
• Search Affinity – optimized searches by fetching from the closest/fastest location
Diagram: clustering with replication between the Portland datacenter and the New York datacenter
26
Indexer Clustering
High-Availability, Out of the Box
Splunk indexer clustering
Active-Active = better performance
Specific terms:
– Master Node / Master Cluster Node
– Peer Node
– Search Factor
– Replication Factor
Additional details: Splunk Docs, Distributed Deployment Manual
27
Cross-site Clustering
Search Affinity by location
“Search locally”, “Store Globally”
DR scenarios
28
How Clustering Affects Sizing
• Increased storage:
– 15% of raw usage for every replica copy
– 35% MORE to make that searchable
• Increased processing
– Incoming data to indexer is streamed to indexing peers to satisfy required
number of copies
• More hosts
– Need “replication factor” + 2 (search head, cluster master)
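The storage percentages above can be combined into one estimate. This Python sketch assumes the figures apply per copy: about 15% of raw volume for each replicated copy, plus about 35% of raw volume for each copy that is also searchable. That reading, and the function names, are assumptions of the sketch rather than something the slide spells out.

```python
RAW_COPY_FRACTION = 0.15    # per replica copy (figure from the slide)
SEARCHABLE_FRACTION = 0.35  # extra per searchable copy (figure from the slide)


def clustered_storage_gb(daily_gb, retention_days, replication_factor, search_factor):
    """Estimate total cluster-wide disk usage in GB."""
    if not 1 <= search_factor <= replication_factor:
        raise ValueError("search factor must be between 1 and the replication factor")
    per_day = daily_gb * (
        RAW_COPY_FRACTION * replication_factor
        + SEARCHABLE_FRACTION * search_factor
    )
    return per_day * retention_days


def minimum_hosts(replication_factor):
    """Peers equal to the replication factor, plus a search head and a cluster master."""
    return replication_factor + 2


if __name__ == "__main__":
    total = clustered_storage_gb(200, 30, replication_factor=3, search_factor=2)
    print(f"Cluster storage for 30 days: {total:,.0f} GB")
    print(f"Minimum hosts: {minimum_hosts(3)}")
```

With a replication factor and search factor of 1, the estimate falls back to the unclustered 50% rule, which is a useful sanity check on the inputs.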
29
Scaling - Storage
Sizing Calculator: http://splunk-sizing.appspot.com/
30
Downsides of Indexer Clustering
• Increased Storage
• Cluster master is required – use a VM.
• Increased bandwidth
Questions?
Search Heads
32
Scaling the Search Heads
Splunk Search is critical, too!
Scaling your search heads
Scale to handle # of concurrent queries
Dedicated Search heads for certain apps, scheduled alerts
Remember – Search heads virtualize well!
Search Head Clustering
34
SHP vs SHC
Search Head Pooling (SHP)
• Available since v4.2
• Sharing configurations through NFS
• Single point of failure
• Performance issues
Search Head Clustering (SHC)
• No shared storage requirement
• Replication using local storage
• Commodity hardware
• OSes: Linux or Solaris
35
Search Head Clustering
1. Group search heads into a cluster
2. A captain gets elected dynamically
3. User-created reports and dashboards are automatically replicated to other search heads
36
Search Head Clustering
37
Search Tier Design Best Practices
• Minimum 3 nodes required
• ES will still require a Separate Search Head or dedicated SHC
• Use LDAP/AD/SSO for user Authentication
• Load Balancer configured for sticky sessions
• Must use deployer to push apps to search heads
• Confirm your applications’ support for SHC!
Questions?
38
Search Head Clustering
Use “Captain” instead of “Master” to avoid confusion with indexer clustering
Minimum 3 nodes required.
The cluster makes certain key decisions by *majority* (consensus)
In a multi-site setup, place more nodes in the main datacenter
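The majority rule explains both the three-node minimum and the advice to weight the main datacenter. The sketch below uses the standard quorum definition (more than half of all members); treating that as the cluster's exact rule is an assumption, and the site sizes are hypothetical.

```python
def majority(total_members):
    """Smallest number of members that is more than half of the cluster."""
    return total_members // 2 + 1


def tolerated_failures(total_members):
    """Members that can be lost while a majority is still reachable."""
    return total_members - majority(total_members)


def site_has_majority(site_members, total_members):
    """True if one site alone can still form a majority after a site split."""
    return site_members >= majority(total_members)


if __name__ == "__main__":
    for size in (2, 3, 5):
        print(f"{size} members: majority={majority(size)}, "
              f"tolerates {tolerated_failures(size)} failure(s)")
    # Hypothetical 5-member cluster: 3 in the main datacenter, 2 in the DR site.
    print("Main site alone:", site_has_majority(3, 5))
    print("DR site alone:  ", site_has_majority(2, 5))
```

Two members tolerate no failures, while three tolerate one, which is why three is the practical floor. In the split-site case only the larger site can keep making majority decisions.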
The Final Stretch
40
Load Balancer
Search Head Cluster, Deployer
Clustered Peer Node + Cluster master
Deployment server
Universal Forwarders on Servers
Syslog, NetFlow data
HFs for scheduled polling via API
41
Hybrid Approach for rollout
• Add the existing Splunk instance as a search peer until the data retention period has expired
• Disable scheduled searches on the old instance
• Migrate any Summary Index data to new Indexers
42
Distributed Management Console
Manage Splunk 6.2 environments
Replaces Deployment Monitor App
Incorporates SOS app prior to 6.2
43
Cloud & Hybrid
Scale without waiting for hardware
44
Suggested Reading
• Distributed Deployment Manual
– http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Distributedoverview
• Highlights
– Reference hardware specs
– How searches affect performance
  – Dense / Rare / Sparse
– App considerations
– Summary table
45
Top 5 things to Remember
• Indexers: Storage requirements, IOPS, RAID config
• Indexer clustering: HA, DR, and site affinity!
• SHC: Minimum buy-in for a SHC is 3
• When in doubt – add another Indexer
• Excellent VM candidates:
– Master Cluster Node (Indexer clustering)
– Deployer (Search head clustering)
– Deployment Server (Central Forwarder management)
– License Master
– Distributed Management Console
46
Prizes in Exchange for Your Survey Feedback!
Text Splunk to 878787
OR
Scan this QR Code
Then stop by our reg desk for a free gift and a chance to win a $100 AMEX gift card
Thank You

More Related Content

PPTX
Scale Splunk
PPTX
Taking Splunk to the Next Level – Architecture
PPTX
Taking Splunk to the Next Level - Architecture
PPTX
Taking Splunk to the Next Level - Technical
PPTX
Taking Splunk to the Next Level - Architecture Breakout Session
PPTX
Taking Splunk to the Next Level - Architecture Breakout Session
PPTX
SplunkLive! Atlanta Mar 2013 - University of Alabama at Birmingham
PPTX
Taking Splunk to the Next Level – Architecture
Scale Splunk
Taking Splunk to the Next Level – Architecture
Taking Splunk to the Next Level - Architecture
Taking Splunk to the Next Level - Technical
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
SplunkLive! Atlanta Mar 2013 - University of Alabama at Birmingham
Taking Splunk to the Next Level – Architecture

What's hot (20)

PPTX
Taking Splunk to the Next Level - Architecture Breakout Session
PDF
SplunkLive Melbourne Scaling and best practice for Splunk on premise and in t...
PDF
Hive spark-s3acommitter-hbase-nfs
PPTX
Building Efficient Pipelines in Apache Spark
PPTX
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
PDF
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
ODP
Get involved with the Apache Software Foundation
PDF
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
PDF
Spark on Mesos
PDF
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
PDF
Inside Solr 5 - Bangalore Solr/Lucene Meetup
PDF
Hive on spark berlin buzzwords
PDF
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
PDF
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
PDF
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Introduction to Kafka - Je...
PPTX
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
PDF
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
PDF
Reactive Streams, Linking Reactive Application To Spark Streaming
PDF
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
PDF
Low latency high throughput streaming using Apache Apex and Apache Kudu
Taking Splunk to the Next Level - Architecture Breakout Session
SplunkLive Melbourne Scaling and best practice for Splunk on premise and in t...
Hive spark-s3acommitter-hbase-nfs
Building Efficient Pipelines in Apache Spark
Building Effective Near-Real-Time Analytics with Spark Streaming and Kudu
Connect Code to Resource Consumption to Scale Your Production Spark Applicati...
Get involved with the Apache Software Foundation
Running Spark Inside Containers with Haohai Ma and Khalid Ahmed
Spark on Mesos
Improving Apache Spark by Taking Advantage of Disaggregated Architecture
Inside Solr 5 - Bangalore Solr/Lucene Meetup
Hive on spark berlin buzzwords
HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.
DatEngConf SF16 - Apache Kudu: Fast Analytics on Fast Data
Big Data Day LA 2016/ Hadoop/ Spark/ Kafka track - Introduction to Kafka - Je...
April 2016 HUG: CaffeOnSpark: Distributed Deep Learning on Spark Clusters
Webinar Slides: High Noon at AWS — Amazon RDS vs. Tungsten Clustering with My...
Reactive Streams, Linking Reactive Application To Spark Streaming
High Performance Enterprise Data Processing with Apache Spark with Sandeep Va...
Low latency high throughput streaming using Apache Apex and Apache Kudu
Ad

Similar to Taking Splunk to the Next Level - Architecture Breakout Session (20)

PPTX
Taking Splunk to the Next Level - Architecture Breakout Session
PPTX
Taking Splunk to the Next Level - Architecture Breakout Session
PPTX
Taking Splunk to the Next Level - Architecture
PDF
Deploying Splunk. Arquitetura e dimensionamento do Splunk
PPTX
Getting Started with Splunk
PPTX
Getting Started with Splunk Breakout Session
PPTX
Getting Started with Splunk Breakout Session
PDF
SplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
PPTX
Getting Started with Splunk Enterprises
PDF
Getting Started with Splunk Enterprise
PDF
Splunk Cloud
PDF
Splunk Cloud
PDF
Splunk Cloud
PDF
Splunk Cloud
PPTX
Best Practices for a CoE
PDF
SplunkLive Sydney Scaling and best practice for Splunk on premise and in the ...
PDF
SplunkLive Melbourne Scaling and best practice for Splunk on premise and in t...
PDF
SplunkLive Sydney Scaling and best practice for Splunk on premise and in the ...
PDF
Splunk Data Onboarding Overview - Splunk Data Collection Architecture
PPTX
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture Breakout Session
Taking Splunk to the Next Level - Architecture
Deploying Splunk. Arquitetura e dimensionamento do Splunk
Getting Started with Splunk
Getting Started with Splunk Breakout Session
Getting Started with Splunk Breakout Session
SplunkLive! Amsterdam 2015 Breakout - Getting Started with Splunk
Getting Started with Splunk Enterprises
Getting Started with Splunk Enterprise
Splunk Cloud
Splunk Cloud
Splunk Cloud
Splunk Cloud
Best Practices for a CoE
SplunkLive Sydney Scaling and best practice for Splunk on premise and in the ...
SplunkLive Melbourne Scaling and best practice for Splunk on premise and in t...
SplunkLive Sydney Scaling and best practice for Splunk on premise and in the ...
Splunk Data Onboarding Overview - Splunk Data Collection Architecture
SplunkLive! Washington DC May 2013 - Splunk Enterprise 5
Ad

More from Splunk (20)

PDF
Splunk Leadership Forum Wien - 20.05.2025
PDF
Splunk Security Update | Public Sector Summit Germany 2025
PDF
Building Resilience with Energy Management for the Public Sector
PDF
IT-Lagebild: Observability for Resilience (SVA)
PDF
Nach dem SOC-Aufbau ist vor der Automatisierung (OFD Baden-Württemberg)
PDF
Monitoring einer Sicheren Inter-Netzwerk Architektur (SINA)
PDF
Praktische Erfahrungen mit dem Attack Analyser (gematik)
PDF
Cisco XDR & Splunk SIEM - stronger together (DATAGROUP Cyber Security)
PDF
Security - Mit Sicherheit zum Erfolg (Telekom)
PDF
One Cisco - Splunk Public Sector Summit Germany April 2025
PDF
.conf Go 2023 - Data analysis as a routine
PDF
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
PDF
.conf Go 2023 - Navegando la normativa SOX (Telefónica)
PDF
.conf Go 2023 - Raiffeisen Bank International
PDF
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
PDF
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
PDF
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
PDF
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
PDF
.conf go 2023 - De NOC a CSIRT (Cellnex)
PDF
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)
Splunk Leadership Forum Wien - 20.05.2025
Splunk Security Update | Public Sector Summit Germany 2025
Building Resilience with Energy Management for the Public Sector
IT-Lagebild: Observability for Resilience (SVA)
Nach dem SOC-Aufbau ist vor der Automatisierung (OFD Baden-Württemberg)
Monitoring einer Sicheren Inter-Netzwerk Architektur (SINA)
Praktische Erfahrungen mit dem Attack Analyser (gematik)
Cisco XDR & Splunk SIEM - stronger together (DATAGROUP Cyber Security)
Security - Mit Sicherheit zum Erfolg (Telekom)
One Cisco - Splunk Public Sector Summit Germany April 2025
.conf Go 2023 - Data analysis as a routine
.conf Go 2023 - How KPN drives Customer Satisfaction on IPTV
.conf Go 2023 - Navegando la normativa SOX (Telefónica)
.conf Go 2023 - Raiffeisen Bank International
.conf Go 2023 - På liv og død Om sikkerhetsarbeid i Norsk helsenett
.conf Go 2023 - Many roads lead to Rome - this was our journey (Julius Bär)
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu...
.conf go 2023 - Cyber Resilienz – Herausforderungen und Ansatz für Energiever...
.conf go 2023 - De NOC a CSIRT (Cellnex)
conf go 2023 - El camino hacia la ciberseguridad (ABANCA)

Recently uploaded (20)

PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Approach and Philosophy of On baking technology
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
CIFDAQ's Market Insight: SEC Turns Pro Crypto
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Encapsulation_ Review paper, used for researhc scholars
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PPTX
MYSQL Presentation for SQL database connectivity
PDF
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PPT
Teaching material agriculture food technology
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
Big Data Technologies - Introduction.pptx
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PPTX
Cloud computing and distributed systems.
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
NewMind AI Weekly Chronicles - August'25 Week I
Approach and Philosophy of On baking technology
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
CIFDAQ's Market Insight: SEC Turns Pro Crypto
Advanced methodologies resolving dimensionality complications for autism neur...
Encapsulation_ Review paper, used for researhc scholars
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
MYSQL Presentation for SQL database connectivity
7 ChatGPT Prompts to Help You Define Your Ideal Customer Profile.pdf
Spectral efficient network and resource selection model in 5G networks
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Teaching material agriculture food technology
Chapter 3 Spatial Domain Image Processing.pdf
Big Data Technologies - Introduction.pptx
Understanding_Digital_Forensics_Presentation.pptx
Building Integrated photovoltaic BIPV_UPV.pdf
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Cloud computing and distributed systems.
The Rise and Fall of 3GPP – Time for a Sabbatical?

Taking Splunk to the Next Level - Architecture Breakout Session

  • 1. Copyright © 2015 Splunk Inc. Take Splunk to the Next Level: Architecture
  • 2. 2 Legal Notices During the course of this presentation, we may make forward-looking statements regarding future events or the expected performance of the company. We caution you that such statements reflect our current expectations and estimates based on factors currently known to us and that actual events or results could differ materially. For important factors that may cause actual results to differ from those contained in our forward-looking statements, please review our filings with the SEC. The forward- looking statements made in this presentation are being made as of the time and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or accurate information. We do not assume any obligation to update any forward-looking statements we may make. In addition, any information about our roadmap outlines our general product direction and is subject to change at any time without notice. It is for informational purposes only and shall not be incorporated into any contract or other commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include any such feature or functionality in a future release. 2
  • 3. 3 Splunk at the Next Level Time to move beyond initial Splunk environment • More use cases – how to tackle? • More data – how do we scale? • Splunk is mission critical == HA • Global deployments • Improving Splunk user experience Screenshot here
  • 4. 4 Growing your Splunk Deployment Many customers start with a single use case… • Ex: Monitor the web servers • Help ensure up-time & response times • Track usage, errors • Provides business value
  • 5. 5 Growing your Splunk Deployment Value statement for each overall service Your services exist in a larger context than just one app, or one tier. What is the value of the service as a whole? What are CIO commitments for the service? • The organization’s web site is one of the most critical parts of the business. • Performance of the overall environment must be maintained at all times. • Failures in any portion of the web site must be quickly identified, send notification to the appropriate parties. • Dependencies on external processes must be monitored as well.
  • 6. 6 Growing your Splunk Deployment The larger context • Failure in one system cascades • Map dependencies, estimate costs • Use Splunk to track all dependencies. • What happens when it is down? Dependencies often include: • Networking dependencies • Shared storage • Databases, middleware, custom apps • Virtualization layer Screenshot here
  • 7. 7 Scales to Hundreds of TBs/Day Enterprise-Class Scale, Resilience and Interoperability Send data from thousands of servers using any combination of Splunk Forwarders Auto load-balanced forwarding to Splunk Indexers Offload search load to Splunk Search Heads
  • 8. Visibility Across Datacenters Distributed search unifies the view across locations Role-based access controls how far a given user's search will span New York Tokyo London Cloud
  • 9. 9 Product Roles Searching and Reporting (Search Head) Indexing and Search Services (Indexer) Data Collection and Forwarding (Forwarder) Indexer Cluster Master, SHC Deployer Distributed Management / Deployment Server License Master, Distributed Mgmt Console Databases Networks Servers Virtual Machines Smart phones and Devices Custom Applications Security WebServer Sensors
  • 10. Copyright © 2015 Splunk Inc. Forwarders
  • 12. 12 Splunk Universal Forwarder Why use the UF over other methods? Collect syslog / event log / custom application logs Collect configuration files, registry settings Collect data NOT in log files: scripted inputs on current state Collect wire data – Splunk Stream Faster, Lower overhead than “agentless” polling Centrally administered … and
  • 13. 13 Forwarder Load Balancing Have UF balance across multiple indexers Load Balance – Multiple hosts in outputs – DNS round robin – LB not needed! Geography-based routing Optional SSL encryption Compressed 10 to 1
  • 14. 14 Deployment Server Central management of Splunk Forwarders Deployment Server manages Apps, Configs Select one or more classes for each host Class defines apps & configs Works by phone-home Notes: DS does not push forwarder binaries Use Cluster Master to manage indexers in cluster, not DS
  • 15. 15 Forwarding Tier Design Best Practices 15 • Use a Syslog Server for Syslog data • Deployment server (on a VM) for central management • Let AutoLB distribute data across available indexers • May need to increase UF throughput setting for high velocity sources – Enable forceTimebasedAutoLB (for more even distribution) – maxKBps (to adjust throttling) Questions?
  • 16. Copyright © 2015 Splunk Inc. Indexers
  • 17. 17 Indexers Dedicated indexers serve three primary roles: Data Storage Processing and parsing at index-time Indexing Data Management Hot / warm / cold data rotation Aging and removal Data Retrieval Perform search upon request, return data to search heads
  • 18. 18 Scaling - Indexers Sizing for index performance Indexers are usually storage-bound Indexers: 150 to 250 GB per day, each. (With reference HW.) Ref HW: 12 cores (2 GHz+), 12 GB RAM, 800+ IOPs Optimal HW (normal disk): 16 CPU cores, 48 GB RAM Optimal HW (SSD): 24 CPU cores, 132 GB RAM Questions?
  • 19. 19 Tiered Storage • Splunk supports tiered storage • Hot / Warm buckets – put on fastest disk • Size Hot/Warm for normal saved search durations. (7d, 30d) • Use slower / cheaper storage (NAS?) for long term access • Optional: Use Frozen to roll data to glacier, Hadoop, etc.
  • 20. 20 SSD Advantage http://blogs.splunk.com/2012/05/10/quantifying-the-benefits-of- splunk-with-ssds/ • Low cost random seeks • Writes are not that much faster – no great improvement with Indexing • Significant improvements with Sparse/needle-haystack searches • Dense searches become CPU bound • Searches run faster allowing for more completed searches/min • Use Enterprise-grade SSDs, not commercial-grade.
  • 21. 21 Scaling - Storage Manual storage calculation Raw data rate  net compression of ~ 50% on disk. Simple: rate * compression * retention / #indexers Hot / warm requirements – 200 GB / day * 50% * 30 days = 3TB per indexer Cold storage requirements – 200 GB / day * 50% * 335 days = 33.5TB per indexer Clustering – Changes storage story completely
  • 22. 22 Scaling - Storage One example of good local storage A well configured indexer using local storage might look like: • SSDs in RAID 5, sized for 14 days of storage • SATA drives in RAID 5, sized for 6 months of storage SSDs: RAID 5 provides decent performance Spinning disks: • Hot/Warm, RAID 1+0, 800 IOPS or faster • Cold – RAID 5 with proper block / stripe sizing
  • 23. 23 Scaling - Storage Sizing Calculator: http://splunk-sizing.appspot.com/
  • 24. Copyright © 2015 Splunk Inc. Indexer Clustering
  • 25. 25 Delivers Mission-Critical Availability • Data replication – maintain searchability even if servers go down • Multi-site capable – maintain searchability even if a site goes down • Search Affinity – optimized searches by fetching from the closest/fastest location REPLICATION Portland Datacenter New York Datacenter Clustering
  • 26. 26 Indexer Clustering High-Availability, Out of the Box Splunk indexer clustering Active-Active= better performance Specific terms: – Master Node / Master Cluster Node – Peer Node – Search Factor – Replication Factor Additional details: Splunk Docs, Distributed Deployment Manual
  • 27. 27 Cross-site Clustering Search Affinity by location “Search locally”, “Store Globally” DR scenarios
  • 28. 28 How Clustering Affects Sizing • Increased storage: – 15% of raw usage for every replica copy – 35% MORE to make that searchable • Increased processing – Incoming data to indexer is streamed to indexing peers to satisfy required number of copies • More hosts – Need “replication factor” + 2 (search head, cluster master) 2
  • 29. 29 Scaling - Storage Sizing Calculator: http://splunk-sizing.appspot.com/
  • 30. 30 Downsides of Indexer Clustering • Increased Storage • Cluster master is required – use a VM. • Increased bandwidth Questions? 3
  • 31. Copyright © 2015 Splunk Inc. Search Heads
  • 32. 32 Scaling the Search Heads Splunk Search is critical, too! Scaling your search heads Scale to handle # of concurrent queries Dedicated Search heads for certain apps, scheduled alerts Remember – Search heads virtualize well!
  • 33. Copyright © 2015 Splunk Inc. Search Head Clustering
  • 34. 34 SHP vs SHC Search Head Clustering Seach Head Pooling • Available since v4.2 • Sharing configurations through NFS • Single point of failure • Performance issues • No shared storage requirement • Replication using local storage • Commodity hardware • OSes: Linux or Solaris NFS
  • 35. 35 Search Head Clustering 1. Group search heads into a cluster 2. A captain gets elected dynamically 3. User created reports/dashboards automatically replicated to other search heads
  • 37. 37 Search Tier Design Best Practices 37 • Minimum 3 nodes required • ES will still require a Separate Search Head or dedicated SHC • Use LDAP/AD/SSO for user Authentication • Load Balancer configured for sticky sessions • Must use deployer to push apps to search heads • Confirm your applications’ support for SHC! Questions?
  • 38. 38 Search Head Clustering Use “Captain” instead of “Master” to avoid confusion with Index- Clustering Minimum 3 nodes required. Cluster takes certain key decisions based on *majority* (consensus) In multi-site setup have more nodes in main datacenter
  • 39. Copyright © 2015 Splunk Inc. The Final Stretch
  • 40. 40 Load Balancer Search Head Cluster, Deployer Clustered Peer Node + Cluster master Deployment server Universal Forwarders on Servers Syslog, NetFlow data HFs for scheduled polling via API 40
  • 41. 41 Hybrid Approach for rollout 41 • Add the existing Splunk instance as a search peer until the data retention period has expired • Disable scheduled searches on the old instance • Migrate any Summary Index data to new Indexers
  • 42. 42 Distributed Management Console Manage Splunk 6.2 environments Replaces Deployment Monitor App Incorporates SOS app prior to 6.2
  • 43. 43 Cloud & Hybrid Scale without waiting for hardware
  • 44. 44 Suggested Reading • Distributed Deployment Manual – http://docs.splunk.com/Documentation/Splunk/latest/Deploy/Distributedoverv iew • Highlights – Reference hardware specs – How searches affect performance  Dense / Rare / Sparse – App considerations – Summary table 4
  • 45. 45 Top 5 things to Remember • Indexers: Storage requirements, IOPS, RAID config • Indexer clustering: HA, DR, and site affinity! • SHC: Minimum buy-in for an SHC is 3 nodes • When in doubt – add another Indexer • Excellent VM candidates: – Cluster Master node (Indexer clustering) – Deployer (Search head clustering) – Deployment Server (Central Forwarder management) – License Master – Distributed Management Console
  • 46. 46 Prizes in Exchange for Your Survey Feedback! Text Splunk to 878787 OR Scan this QR Code. Then stop by our reg desk for a free gift and a chance to win a $100 AMEX gift card

Editor's Notes

  • #8: By allowing Splunk Enterprise to be split into multiple roles, any portion of Splunk can be scaled as needed. Customers are using Splunk to index hundreds of TBs a day and search over petabytes of data. Splunk can take a single search and query as many indexers as are needed to complete the job, allowing you to use inexpensive commodity hardware in massively parallel clusters. Besides achieving massive scale, splitting the roles enables users to meet location and data segmentation requirements.
  • #9: Searches can be distributed from a single search head to any number of indexers. These indexers can all be local for massive parallelization for Big Data problems, or spread across a global enterprise to help you keep data wherever makes the most sense for your network, availability, and security requirements. Splunk Enterprise can be deployed on premises, in the cloud, or a combination of both. There is also an Amazon Machine Image available, or, if you don’t want to host or administer Splunk, it can be managed as a service by our experts using “Splunk Cloud”.
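The scatter-gather idea in this note can be sketched in a few lines. This is a toy illustration only, not how Splunk implements distributed search: each "indexer" is just an in-memory list of events, and the names `search_indexer` and `distributed_search` and the event fields `host` and `raw` are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def search_indexer(events, term):
    """'Map' step: count matching events per host on one indexer."""
    return Counter(e["host"] for e in events if term in e["raw"])


def distributed_search(indexers, term):
    """'Search head' step: fan the search out, then merge partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(indexers))) as pool:
        partials = list(pool.map(lambda ev: search_indexer(ev, term), indexers))
    total = Counter()
    for partial in partials:
        total.update(partial)  # add per-host counts together
    return total


if __name__ == "__main__":
    # Two hypothetical indexers, each holding a slice of the data.
    indexers = [
        [{"host": "web1", "raw": "error 500"}, {"host": "web2", "raw": "ok"}],
        [{"host": "web1", "raw": "error 404"}],
    ]
    print(distributed_search(indexers, "error"))
```

Because each indexer only scans its own slice of the data and returns a small partial result, adding indexers spreads the work without changing the merge logic on the search head.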
  • #10: These are multiple logical roles; a Splunk instance can perform one or more of them. The search head is what most users interact with. It is the web server and app-interpreting engine that provides the primary, web-based user interface. Since most of the data interpretation happens as needed at search time, the role of the search head is to translate user and app requests into actionable searches for its indexer(s) and display the results. The Splunk web UI is highly customizable, either through our own view and app system, or by embedding Splunk searches in your own web apps or our API. Additional search heads can be deployed to scale with user or search load.
The core of the Splunk infrastructure is indexing. An indexer does two things: it accepts and processes new data, adding it to the index and compressing it on disk, and it services search requests, looking through the data it has via its indices and returning the appropriate results to the searcher over a secure, compressed communication channel. Indexers scale out almost limitlessly and with almost no degradation in overall performance, allowing Splunk to scale from single-instance small deployments to truly massive Big Data challenges.
The Splunk forwarder is an optional component that can be installed to forward data from servers, desktops, mainframes, and even ARM-based devices. There are two types of forwarders: the full Splunk distribution or a dedicated “Universal Forwarder”. The full Splunk distribution can be configured to filter data before transmitting, execute scripts locally, or run SplunkWeb. This gives you several options depending on the footprint size your endpoints can tolerate. The universal forwarder is an ultra-lightweight agent designed to collect data in the smallest possible footprint. Both flavors of forwarder come with automatic load balancing, SSL encryption and data compression, and the ability to route data to multiple Splunk instances or third-party systems.
The Cluster Master coordinates which indexers have copies of which buckets, to ensure the required number of replicated and searchable copies of each bucket is met. All clustered indexers check in with the Master to report their status and the status of each of their replicated indexes and buckets. The Master also manages the apps and configurations on clustered indexers. We will talk more about buckets later.
The Deployment Server can be used to manage your Splunk forwarders, for centrally managed data collection. More on this to come.
Listed for completeness are the license master and DMC roles; these typically coexist with other roles such as the Deployment Server.
  • #22: This slide shows the way we used to calculate data storage requirements.
  • #24: This app makes it far easier to size the environment’s storage requirements. And it includes clustering configurations, which we’ll talk about in a sec.
  • #26: Splunk’s clustering technology allows you to choose how many raw copies and searchable copies of your data you would like to keep. It also allows you to choose which indexers you want to store the copies on. This capability allows servers or even datacenters to go down without losing the ability to access the data. In addition, the search affinity capability allows users to fetch data from the closest or fastest location where there is a copy of the data. This saves not only search time but also bandwidth, by eliminating the need to use the WAN when there is a local copy.
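The search affinity idea in this note, preferring a local copy and falling back to a remote one, can be sketched as follows. This is a toy model rather than Splunk's implementation; the `(site, peer)` tuples, the site names, and the function `pick_searchable_copy` are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

# (site, peer) pair naming where a searchable copy of a bucket lives.
Copy = Tuple[str, str]


def pick_searchable_copy(copies: Sequence[Copy], local_site: str) -> Optional[Copy]:
    """Prefer a copy at the searcher's own site; otherwise use any copy."""
    for copy in copies:
        if copy[0] == local_site:
            return copy  # local copy found, no WAN traffic needed
    # No local copy: fall back to the first remote copy, if any exists.
    return copies[0] if copies else None


if __name__ == "__main__":
    copies = [("site1", "idx-a"), ("site2", "idx-c")]
    print(pick_searchable_copy(copies, "site2"))  # served locally
    print(pick_searchable_copy(copies, "site3"))  # falls back to a remote copy
```

The fallback branch is what keeps data accessible when a whole site is down, at the cost of crossing the WAN.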
  • #29: A default 3/2 cluster (replication factor 3, search factor 2) uses 3 × 0.15 + 2 × 0.35 = 1.15, or 115% of license usage, in storage for that redundancy. Processing: a little more CPU and more network. This is much better in current versions: the indexed data (tsidx, etc.) is streamed to the replica peer rather than forcing the peer to re-index.
  • #30: With search factor and replication factor in the mix, what had been simple without clustering becomes more challenging. Demo sizing calculator if time allows. Hot/Warm vs. Cold, in different RAID configurations. Sample indexes.conf is generated too.
  • #31: As discussed – default parameters require *more than* original log size
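The 3/2 arithmetic from note #29 generalizes to a small estimator. This is a rough sketch, assuming the 15% (compressed raw data) and 35% (index files) ratios quoted in that note; real ratios vary with the data, and the function names, the 100 GB/day volume, and the 90-day retention are hypothetical.

```python
def storage_multiplier(replication_factor=3, search_factor=2,
                       rawdata_ratio=0.15, index_ratio=0.35):
    """Disk used per unit of indexed data, as a fraction of original size.

    Every replicated copy stores the compressed raw data; only searchable
    copies also store the index files.
    """
    return replication_factor * rawdata_ratio + search_factor * index_ratio


def clustered_storage_gb(daily_gb, retention_days, **factors):
    """Estimated total disk (GB) across the cluster for a retention period."""
    return daily_gb * retention_days * storage_multiplier(**factors)


if __name__ == "__main__":
    # Hypothetical deployment: 100 GB/day kept for 90 days.
    print(round(storage_multiplier(), 2))                       # default 3/2 cluster
    print(round(storage_multiplier(1, 1), 2))                   # single copy, no clustering
    print(round(clustered_storage_gb(100, 90)))                 # GB across the cluster
```

With a single copy the multiplier is about half the original log size, while the default 3/2 cluster needs more than the original size, which is the point of this note.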
  • #36: Uniform user experience among pooled search heads; no single point of failure; search job failure aware; does not require external storage such as NFS.
  • #37: Note the Deployer on this image. Deployer virtualizes very well. The deployer pushes apps & configurations to the search head cluster members.
  • #41: Putting it all together
  • #42: What’s the best way to roll out some of these features? It depends on customer environment. But one common method is shown here.