Copyright © 2013 Splunk Inc.
Big Data at the
Speed of Business
Isaac Mosquera
Director of Mobile, ShareThis
Clint Sharp
Principal Big Data Product Manager, Splunk
What We’ll Talk About
• Our quest for visibility
• Analyzing at scale
• Splunk and Big Data
• Where do you start?
• Q&A
About Splunk
Company (NASDAQ: SPLK)
Founded 2004, first software release in 2006
HQ: San Francisco
Business Model / Products
Industry-leading machine data platform
On-premise, in the cloud and SaaS
5,600+ Customers
63 of the Fortune 100
Largest license: 100 Terabytes per day
#1 Big Data Innovator*
* Fast Company's Most Innovative Companies Issue (March 2013)
About ShareThis and Socialize
ShareThis makes the world more
connected, trusted and valuable through sharing
Powers the social web, touching the lives
of 95 percent of U.S.
Acquires Socialize, which makes mobile
and social more engaging
Socialize integrated into thousands of
iOS and Android Apps
Installed on 80M+ devices
Evaluating 20 Billion
Ad Impressions Monthly
Final Architecture
• Socialize Bidder
• Cache Cluster (3 × Memcache)
• Splunk (Search Head, 3 × Indexer)
• S3 Snapshots
• RDBMS (Generated Reports)
So, What is
Splunk?
Expanding Universe of Data Sources
Machine-generated Data | Business Application Data | Human-generated Data
Highly Structured Arbitrarily Structured
2012-12-05 07:04:44
Id=00Q000000Rd910EAJ City=New York
Country=US CreatedDate=“2012-12-05
07:06:44” Email.jdoe@gmail.com
Email_Opt_In_c Customer_Street
_Address_c=“123 Main St.”
purchased_product_id=
product_i BD-01 twitter_username
john_t_doe
Industry Leading Platform for Machine Data
Any Machine Data
Operational Intelligence
• HA Indexes and Storage
• Commodity Servers
• Developer Platform
• Custom dashboards
• Monitor and alert
• Ad hoc search
• Report and analyze
Analyzing Heterogeneous Data
Universal Index
• No data normalization
• Automatically handles timestamps
• Parsers not required
• Index every term & pattern “blindly”
Schema-on-the-fly
• No attempt to “understand” up front
• Structure applied at search-time
• No brittle schema to work around
• Automatically find transactions, patterns and trends
Flexibility and Fast Time to Value
• Normalization as it’s needed
• Faster implementation
• Easy search language
• Multiple views into the same data
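The idea of applying structure at search time, rather than at ingest, can be sketched in a few lines of Python. This is only an illustration of the concept, not how Splunk is implemented: the sample events, the `extract_fields` and `search` helpers, and the key=value pattern are all assumptions made for the example.

```python
import re

# Raw events are kept exactly as they arrived; nothing is parsed at ingest time.
# (Hypothetical sample data, loosely modeled on the key=value sample slide.)
RAW_EVENTS = [
    '2012-12-05 07:04:44 Id=00Q1 City="New York" Country=US',
    '2012-12-05 07:06:44 Id=00Q2 Customer_Street_Address_c="123 Main St." Country=US',
    '2012-12-05 07:09:10 status=500 path=/api/share',
]

# Matches key=value and key="quoted value" pairs.
KV_PATTERN = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def extract_fields(raw):
    """Derive a field dictionary from one raw event, on demand."""
    fields = {}
    for key, quoted, bare in KV_PATTERN.findall(raw):
        fields[key] = quoted if quoted else bare
    return fields


def search(events, **criteria):
    """Apply structure at query time and keep events matching all criteria."""
    results = []
    for raw in events:
        fields = extract_fields(raw)
        if all(fields.get(k) == v for k, v in criteria.items()):
            results.append(fields)
    return results


if __name__ == "__main__":
    for hit in search(RAW_EVENTS, Country="US"):
        print(hit)
```

Because the raw text is never rewritten, a different extraction pattern can be applied later to the same events, which is the sense in which there is no brittle schema to work around.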
Gain Critical Insights … in Real-time
Order ID
Customer’s Tweet
Time Waiting On Hold
Product ID
Company’s Name
Sources
Twitter
Care IVR
Middleware
Error
Order Processing
Order ID
Customer ID
Twitter ID
Customer ID
Customer ID
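A rough sketch of what correlating these sources by shared identifiers might look like, assuming each source emits events as dictionaries. The event contents, field names and the `correlate_by_customer` helper are hypothetical; only the identifier types (Order ID, Customer ID, Twitter ID) come from the slide.

```python
from collections import defaultdict

# Hypothetical events from the four sources named on the slide.
EVENTS = [
    {"source": "order_processing", "order_id": "O-1", "customer_id": "C-9", "product_id": "BD-01"},
    {"source": "middleware", "order_id": "O-1", "error": "timeout"},
    {"source": "care_ivr", "customer_id": "C-9", "seconds_on_hold": 540},
    {"source": "twitter", "twitter_id": "john_t_doe", "customer_id": "C-9", "tweet": "Still on hold"},
]


def correlate_by_customer(events):
    """Group events from all sources under the customer they belong to.

    Events that carry only an order ID (such as the middleware error) are
    linked to a customer through any event that carries both IDs.
    """
    order_to_customer = {
        e["order_id"]: e["customer_id"]
        for e in events
        if "order_id" in e and "customer_id" in e
    }
    timeline = defaultdict(list)
    for e in events:
        customer = e.get("customer_id") or order_to_customer.get(e.get("order_id"))
        if customer is not None:
            timeline[customer].append(e["source"])
    return dict(timeline)


if __name__ == "__main__":
    print(correlate_by_customer(EVENTS))
```

The middleware error has no customer ID of its own; it joins the customer's timeline only because the order-processing event links the order to the customer.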
Deep Visibility and Insight for IT and Business
IT Operations Management
Web Intelligence
Business Analytics
Application Management
Security and Compliance
Industrial Data / Internet of Things
Over 5,600 organizations using Splunk across IT and business users
Driving Insights
from Big Data
Hadoop
The ShareThis Insights Platform
On Father’s Day:
“What were the most shared topics?”
“What types of beer do people drink?”
Pipeline stages: API, ETL, Pre-aggregation, Analytics, ?
Finding the Optimal Approach
Hadoop and MapReduce are great for complex data science on data
at rest – the previous architecture took 9 months with a team of
engineers, data architects, etc.
The Splunk platform delivers real-time, interactive analysis –
we can build many of the same insights within 1 hour
What should be the core focus or competency of your team?
Conclusion: find the optimal approach for the business
What About
Ad Hoc Analysis?
PR Insights Example
What was the situation? (e.g. fast moving business, needed
real-time insights)
What was the PR team struggling with? Difficult to find useful
data to build interesting use-cases
What did they want? They wanted a flexible real-time reporting
environment to extract insights useful for the market
How did my team help? Delivered a single dashboard with
real-time data on sharing behaviors across our network
PR Insights Dashboard
Let’s not forget
The low-hanging fruit
Operational Analytics for an Online World
website
API Notification
Google (GCM)
Feedback
Processor
Apple (APNS)
? !
Notifications Systems
Driving Superior Customer Experience
How many 500 errors
have I had over time?
Look for anomalies
and spikes!
Zero in directly
on the customer!
Online Device Notifications
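The two questions on this slide (how many 500 errors over time, and where the spikes are) can be sketched as a per-minute count plus a simple threshold. This is an illustrative sketch only: the event format, the `errors_per_minute` and `spikes` helpers, and the "more than three times the median" rule are assumptions, not something taken from the talk.

```python
from collections import Counter
from statistics import median

# Hypothetical (timestamp, HTTP status) pairs from the notification API.
EVENTS = [
    ("2013-06-01 07:00:05", 500),
    ("2013-06-01 07:00:40", 200),
    ("2013-06-01 07:01:12", 500),
    ("2013-06-01 07:02:01", 500),
    ("2013-06-01 07:02:03", 500),
    ("2013-06-01 07:02:20", 500),
    ("2013-06-01 07:02:41", 500),
    ("2013-06-01 07:02:59", 500),
]


def errors_per_minute(events):
    """Count HTTP 500 responses per minute (the 'YYYY-MM-DD HH:MM' prefix)."""
    counts = Counter()
    for timestamp, status in events:
        if status == 500:
            counts[timestamp[:16]] += 1
    return counts


def spikes(counts, factor=3):
    """Return minutes whose error count exceeds `factor` times the median."""
    if not counts:
        return []
    baseline = median(counts.values())
    return sorted(minute for minute, n in counts.items() if n > factor * baseline)


if __name__ == "__main__":
    per_minute = errors_per_minute(EVENTS)
    print(dict(per_minute))
    print(spikes(per_minute))
```

A median baseline is used here because a single bad minute does not shift it much; any other anomaly rule could be swapped in.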
One More Thing …
New product from Splunk
delivers interactive data
exploration, analysis and
visualizations for Hadoop
Announcing Hunk Beta
Splunk Analytics for Hadoop
Derive Actionable Insights from Raw Data
Hadoop
Storage
1. Point Splunk at Hadoop Cluster
2. Immediately start exploring, analyzing and visualizing raw data in Hadoop
Explore Analyze Visualize Dashboards Share
Learn More
splunk.com/bigdata
Questions?

Splunk/Socialize at Hadoop Summit


Editor's Notes

  • #4: Splunk | $186 million | Turns machine data into valuable insights. Splunk now has more than 600 employees worldwide, with headquarters in San Francisco and 14 offices around the world. Since first shipping its software in 2006, Splunk now has over 4,400 customers in 80+ countries. These organizations are using Splunk software to improve service levels, reduce operations costs, mitigate security risks, enable compliance, enhance DevOps collaboration and create new product and service offerings. Please always refer to latest company data found here: http://www.splunk.com/company.
  • #17: Talk specifically about how Splunk supports: Volume (scalable real-time architecture); Velocity (horizontal scalability); Variety (universal forwarding and indexing for highly diverse data from thousands of heterogeneous sources); Variability (late-binding schema for maximum search-time analysis).
  • #19: When we look more closely at the data we see that it contains critical information: customer ID, order ID, time waiting on hold, Twitter ID, what was tweeted. What’s important is first of all the ability to actually see across all these disparate data sources, and then to correlate related events across them. If you can correlate and visualize related events across these disparate sources, you can build a picture of activity, behavior and experience. And what if you can do all of this in real-time? You can respond more quickly to events that matter. You can extrapolate this example to a wide range of use cases: security and fraud, transaction monitoring and analysis, web analytics, IT operations and so on.
  • #20: Splunk turns raw machine data into new visibility, insights and analytics for IT and business professionals. Intelligence from operational data can help organizations meaningfully improve performance in a wide range of areas, e.g. meet service levels, reduce costs, mitigate security risks, maintain compliance and gain insights. It also provides analysis of the real-time activity and behavior of products, users, services and servers. Example users of Splunk today include: customer support, operations teams, sysadmins, app developers, security analysts, auditors, IT execs, web/biz analysts, and LOB owners/execs.
  • #28: API -> Notification Server -> either Apple or Google. At some later time, they respond with whether there were any real problems. With Splunk I can look at each individual piece as part of the whole and see how the message traversed the system. Without Splunk, I would not know how to do it.
  • #30: This is why we are announcing a new product from Splunk. It’s in Beta, it’s called “Hunk” and it’s SPLUNK ANALYTICS FOR HADOOP. This is a NEW PRODUCT from Splunk that delivers INTERACTIVE DATA EXPLORATION, ANALYSIS and VISUALIZATIONS FOR HADOOP.
  • #31: Because it’s based on proven Splunk technology, deployed at thousands of organizations, we’ve naturally made it easy to deploy. Simply point it at your Hadoop cluster and start interacting with and analyzing data immediately.