1© Cloudera, Inc. All rights reserved.
Supercharge Splunk with Cloudera
1,000,000,000,000+
[ events per day ]
Challenges with Splunk Today
Explosion of Data
• Splunk cannot cost-effectively scale to the volume and variety of modern data
Limited Enterprise Visibility
• Only a partial view of the enterprise limits analytics and slows decisions
Limited Analytic Processing
• Difficult to deploy custom advanced machine learning capabilities
[Figure: three panels. Axes Data Access (1%, 50%, 100%) and Data Volume (1 TB, 1 PB, 10 PB); user, network, and endpoint data over time, covering archived and emerging data; a rule of the form IF (X) AND (Y) THEN (Z).]
Advantages of Cloudera over Splunk
Cloud-Native & On-Premise
Go Beyond Splunk’s SPL
• Share enriched data across multiple analytic processing engines
• Simple search, SQL, Python, R, Scala
Data Flexibility
• Faster, more agile, full-fidelity data acquisition
• Data portability: Open data model and open storage
Cost-Effective Scalability
• Elastic scale on-prem or in the cloud
• Cloud-native pay-per-use and transience
• Proven at big data scale
Hybrid
• Runs across multi-clouds & on-prem
• Multi-storage over S3, HDFS, Kudu, Isilon, etc.
Optimizing Splunk with Cloudera
[Architecture diagram: Splunk alongside the Cloudera Data Hub (on premise or cloud). Components shown:]
• Sources: Servers, Threat Intelligence, Network, User, Endpoint
• Applications: Packaged, Open Source, Custom
• Analytic Processing (Spark, Impala, Solr)
• Apache Spot Open Data Models (HDFS, HBase, Kudu)
• Ingestion (Kafka, Flume, StreamSets)
• Data and Analytic Management: Management, Governance, Security (Cloudera Manager, Cloudera Navigator)
Support multiple workloads with community-defined Open Data Models
[Diagram: diverse data sources (endpoint, user, network) brought to single access.]
Source: Momentum Partners Cybersecurity Snapshot April 2016
Many applications on one shared data set and architecture
Packaged
• Visualization & machine learning applications can share a common data set & infrastructure
Open Source
• The Spot community is developing machine learning (e.g. network threat detection)
Custom
• Build custom applications & analytics using Cloudera without having to buy new infrastructure
When to Use Cloudera vs Splunk
Cloudera
Best for:
• Self-service exploratory analytics
• Machine learning
• Long term archive
• Custom data streams
Benefits:
• Faster performance over large amounts of data
• Complete analytic flexibility (search, SQL, statistical, and machine learning)
• Cost-effective scale
• Open data models and open data storage
Splunk
Best for:
• Workflow management
• Using pre-packaged rules
• Hot data management
• Preconfigured connectors
Benefits:
• Optimized for specific use cases
• Existing rules and connectors built out
• Quickly query hot data for simple questions
• Proprietary data format and data storage optimized for their applications
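To make "complete analytic flexibility" a little more concrete, here is a minimal sketch of asking a search-style question and a SQL aggregate over the same stored events. It uses Python's built-in sqlite3 purely as a local stand-in for a SQL engine such as Impala; the table name, columns, and sample rows are hypothetical and not taken from the deck.

```python
import sqlite3

# In-memory database used only as a stand-in for a shared event store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (ts TEXT, src_ip TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("2024-01-01T10:00:00", "10.0.0.5", "login_failed"),
        ("2024-01-01T10:01:00", "10.0.0.5", "login_failed"),
        ("2024-01-01T10:02:00", "10.0.0.9", "login_ok"),
    ],
)

# Search-style question: which events mention a failure?
search_hits = conn.execute(
    "SELECT ts, src_ip FROM events WHERE action LIKE ?", ("%failed%",)
).fetchall()

# SQL-style question: how many failed logins per source address?
failed_by_ip = conn.execute(
    "SELECT src_ip, COUNT(*) FROM events "
    "WHERE action = 'login_failed' GROUP BY src_ip ORDER BY src_ip"
).fetchall()

print(search_hits)
print(failed_by_ip)
```

The point of the sketch is only that both questions run against one copy of the data; engine choice and schema would differ in a real deployment.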
Two starter use cases for Splunk Optimization with Cloudera
Getting Started
Potential Goals
Demonstrate Splunk optimization
✓ Install and configure Cloudera clusters (cloud or on prem)
✓ Install and configure Apache Spot Open Data Models
✓ Build ingest adapters for Splunk to Apache Spot
✓ Build visualization dashboard that delivers some subset of optics currently defined in Splunk
Establish Cloudera data hub
Provide analytic foundations
✓ Build IT and cybersecurity analytics platform on the Apache Spot Open Data Model (ODM)
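As an illustration of what an ingest adapter from Splunk to a common data model might do, the sketch below renames Splunk-style event fields to a shared schema and keeps unmapped fields in a catch-all attribute. The target field names (`event_time`, `src_ip`, `dst_ip`, `user_name`, `additional_attrs`) are hypothetical placeholders, not the actual Apache Spot ODM column names, and the function is an assumption rather than part of any Spot or Splunk API.

```python
# Hypothetical mapping from Splunk-style field names to common-schema names.
FIELD_MAP = {
    "_time": "event_time",
    "src": "src_ip",
    "dest": "dst_ip",
    "user": "user_name",
}


def to_common_schema(splunk_event):
    """Rename known fields; keep everything else under 'additional_attrs'."""
    record = {}
    extras = {}
    for key, value in splunk_event.items():
        target = FIELD_MAP.get(key)
        if target is not None:
            record[target] = value
        else:
            extras[key] = value
    record["additional_attrs"] = extras
    return record


splunk_event = {
    "_time": "2024-01-01T10:00:00Z",
    "src": "10.0.0.5",
    "dest": "10.0.0.9",
    "user": "jdoe",
    "sourcetype": "firewall",
}
record = to_common_schema(splunk_event)
print(record)
```

Keeping unmapped fields rather than dropping them is one way to preserve full-fidelity data while the mapping is still being refined.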
2 potential starting places…
1. Splunk Cost Tuning
2. Context Enrichment and Increased Visibility
Splunk Cost Tuning
• Identify where the enterprise wants to optimize cost: ingest/indexing or storage
• Offload event data from the heavy forwarder to reduce long-term storage and ingest/indexing costs
• Keep enough data in Splunk to power dashboards, with long-term analytics in Cloudera. Enable flexible analytics:
  • Search
  • SQL
  • Machine Learning (Python, Scala, R)
[Architecture diagram: Splunk Heavy Forwarder and Splunk Storage alongside the Cloudera Data Hub (on premise or cloud). Components shown:]
• Sources: Threat Intelligence, Network, User, Endpoint
• Applications: Packaged, Open Source, Custom
• Analytic Processing (Spark, Impala, Solr)
• Apache Spot Open Data Models (HDFS, HBase, Kudu)
• Ingestion (Kafka, Flume, StreamSets)
• Data and Analytic Management: Management, Governance, Security (Cloudera Manager, Cloudera Navigator)
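The placement decision behind cost tuning can be sketched as a small routing rule: every event goes to the long-term store, and only recent events that feed a dashboard are also kept in Splunk. This is a minimal sketch under stated assumptions; the 30-day window, the sourcetype names, and the destination labels are hypothetical, and real routing would be configured in the forwarder and ingestion layer rather than in application code.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical window for "hot" data that stays in Splunk.
HOT_WINDOW = timedelta(days=30)


def route_event(event, now, dashboard_sourcetypes):
    """Return the list of destinations for one event.

    Every event is sent to the long-term store ("cloudera"). Only recent
    events whose sourcetype feeds a dashboard are also sent to "splunk".
    """
    destinations = ["cloudera"]
    is_recent = now - event["timestamp"] <= HOT_WINDOW
    if is_recent and event["sourcetype"] in dashboard_sourcetypes:
        destinations.append("splunk")
    return destinations


now = datetime(2024, 1, 31, tzinfo=timezone.utc)
events = [
    {"timestamp": now - timedelta(days=1), "sourcetype": "firewall"},
    {"timestamp": now - timedelta(days=1), "sourcetype": "dns"},
    {"timestamp": now - timedelta(days=90), "sourcetype": "firewall"},
]
routes = [route_event(e, now, {"firewall"}) for e in events]
print(routes)
```

Tightening the window or the dashboard sourcetype set reduces what Splunk indexes and stores, which is where the cost lever in the bullets above sits.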
Context Enrichment and Increased Visibility
• Load events and context sources into EDH, landing them in Apache Spot’s Open Data Model
• Enrich and enhance events with additional context in the ODM
• Keep enough data in Splunk to power dashboards, with long-term analytics in Cloudera. Enable flexible analytics:
  • Search
  • SQL
  • Machine Learning (Python, Scala, R)
[Architecture diagram: Splunk Heavy Forwarder, Splunk Indexer, and Splunk Storage alongside the Cloudera Data Hub (on premise or cloud). Components shown:]
• Sources: Threat Intelligence, Network, User, Endpoint
• Applications: Packaged, Open Source, Custom
• Apache Spot Algorithms
• Analytic Processing (Spark, Impala, Solr)
• Apache Spot Open Data Models (HDFS, HBase, Kudu)
• Ingestion (Kafka, Flume, StreamSets)
• Data and Analytic Management: Management, Governance, Security (Cloudera Manager, Cloudera Navigator)
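Enriching events with additional context can be pictured as a lookup join between an event and context tables. The sketch below is illustrative only: the context tables, keys (`src_ip`, `user`), and attribute names are hypothetical, and at scale this would more likely be a join in Spark or Impala than a Python dictionary lookup.

```python
def enrich(event, asset_context, user_context):
    """Return a copy of the event with asset and user context attached.

    Missing context is represented by an empty dict, so downstream code
    can rely on both keys being present.
    """
    enriched = dict(event)
    enriched["asset"] = asset_context.get(event.get("src_ip"), {})
    enriched["user_info"] = user_context.get(event.get("user"), {})
    return enriched


# Hypothetical context sources keyed by IP address and user name.
asset_context = {"10.0.0.5": {"hostname": "hr-laptop-12", "criticality": "high"}}
user_context = {"jdoe": {"department": "HR"}}

event = {"src_ip": "10.0.0.5", "user": "jdoe", "action": "login_failed"}
enriched = enrich(event, asset_context, user_context)
print(enriched)
```

The original event is left unmodified, which keeps the raw record available alongside the enriched one.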
Q&A
Thank You
