Sumo Logic Confidential
Data Collection
June 2016
How-To Webinar
Welcome.
To give everyone a chance to successfully connect, we’ll start at 10:05 AM Pacific.
At the completion of this webinar, you will be able to:
• Design a Sumo Logic deployment that fits your organization
• Install Collectors
• Create your Data Sources
• Understand Local File Configuration Management
High-Level Data Flow
Sumo Logic Data Flow
1. Data Collection: Collectors, Sources
2. Search & Analyze: Operators, Charts
3. Visualize & Monitor: Alerts, Dashboards
Enterprise Logs are Everywhere
• Custom App Code
• Server / OS
• Virtual
• Databases
• Network
• Open Source
• Middleware
• Content Delivery
• IaaS, PaaS
• SaaS
• Security
Designing Your Deployment
• Sumo Logic Data Collection is infinitely flexible.
• Design a Sumo Logic deployment that's right for your organization.
• Installed versus Hosted Collectors.
Collectors and Sources
• Host A, Collector A: Apache Access, Apache Error
• Host B, Collector B: Apache Access, Apache Error
• Host C, Collector C: IIS Logs, IIS W3C Logs
Collectors
Collector and Deployment Options
• Hosted Collectors: Cloud Data Collection
• Installed Collectors: Centralized Data Collection, Local Data Collection
Cloud Data Collection
Most data is generated in the cloud and by cloud services, and is collected via Sumo Logic's Cloud Integrations.
Source Types
• S3 Bucket: any data written to S3 buckets via AWS, Lambda scripts, custom apps
• HTTPS: Akamai, Log Appender Libraries, etc.
• Google: Google API
Typical Scenarios
AWS-only customers. While it's possible to rely on Cloud Data Collection entirely, this is not typical; these source types are normally just one part of the overall collection strategy.
Benefits/Drawbacks
+ No software installation
- S3 latency issues
- HTTPS POST caching needed
Local Data Collection
The Sumo Logic Collector is installed on all target hosts and, where possible, sends log data produced on those hosts directly to the Sumo Logic backend over an HTTPS connection.
Source Types
• Local Files: operating systems, middleware, custom apps, etc.
• Windows Events: local Windows events
• Docker: logs and stats
• Syslog (dedicated Collector): network devices, Snare, etc.
• Script (dedicated Collector): cloud APIs, database content, binary data
Typical Scenarios
Customers with large numbers of (similar) servers, using orchestration/automation, collecting mostly OS and application logs
- On-premises datacenters
- Cloud instances
Benefits/Drawbacks
+ No hardware requirement
+ Automation (Chef/Puppet/scripting)
- Outbound internet access required
- Resource usage on target
Collector Deployment – Local Collectors
Centralized Data Collection
The Sumo Logic Collector is installed on a set of dedicated machines. These Collectors gather log data from the target hosts via various remote mechanisms and forward it to the Sumo Logic backend. This can be done either by using the Sumo Logic Syslog Source type or by running syslog servers (syslog-ng, rsyslog) that write to files, and collecting from those files.
Source Types
• Syslog: operating systems, middleware, custom applications, etc.
• Windows Events: remote Windows events
• Script: cloud APIs, database content, binary data
Typical Scenarios
Customers with mostly Windows environments or existing logging infrastructure (syslog/logstash)
- On-premises datacenters
Benefits/Drawbacks
+ No outbound internet access needed on target hosts
+ Leverage existing logging infrastructure
- Scale
- Dedicated hardware
- Complexity (failover, syslog rules)
Collector Deployment – Centralized Collector
Deployment Options Summary
Local
Benefits:
• Direct access to source logs
• Ease of troubleshooting
• No additional HW requirements
Drawbacks:
• More complex management
• Resource usage on target host
• Need for outbound internet access
Centralized
Benefits:
• Fewer collectors and sources
• Simplified management
• Target hosts don’t need outbound internet access
Drawbacks:
• Need for dedicated hardware
• More complex setup (users, permissions)
• Harder to troubleshoot
• Requires careful planning in order to scale
Hosted
Benefits:
• Agentless
• Build it into your infrastructure (S3)
• Direct HTTP POST
Drawbacks:
• Requires local script to POST or curl messages
Resources:
• Design Your Deployment
• Best Practices: Local and Centralized Data Collection
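The speaker notes for the Collector options slide give two rules of thumb for when one Installed Collector is not enough: more than 500 files coming into a single Collector, or more than 15,000 events per second. A minimal sketch of that check follows. The `CollectorLoad` structure and function name are hypothetical and exist only for illustration; the two thresholds are the only values taken from the notes.

```python
from dataclasses import dataclass
from typing import List

# Rules of thumb quoted in the speaker notes; not hard product limits.
MAX_FILES_PER_COLLECTOR = 500
MAX_EVENTS_PER_SECOND = 15_000


@dataclass
class CollectorLoad:
    """Expected load on one Installed Collector (hypothetical structure)."""
    files: int
    events_per_second: int


def reasons_to_add_collector(load: CollectorLoad) -> List[str]:
    """Return the rules of thumb that this load exceeds, if any."""
    reasons = []
    if load.files > MAX_FILES_PER_COLLECTOR:
        reasons.append(f"{load.files} files exceeds {MAX_FILES_PER_COLLECTOR}")
    if load.events_per_second > MAX_EVENTS_PER_SECOND:
        reasons.append(
            f"{load.events_per_second} events/s exceeds {MAX_EVENTS_PER_SECOND}"
        )
    return reasons


if __name__ == "__main__":
    planned = CollectorLoad(files=620, events_per_second=4_000)
    for reason in reasons_to_add_collector(planned):
        print("Consider another Collector:", reason)
```

An empty result only means neither threshold is exceeded; hardware limits and geographic separation, also mentioned in the notes, still need a manual review.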
Sources
Defining a Source
A single Collector can have multiple Sources.
Key fields to define when configuring any Source type:
• Name
• Description
• Historical Data
• Source Host
• Source Category
• File path
– Excluding syslog
• Timestamp Parsing
Source Specific: Remote File
Required for remote collection:
• Listening port
• Remote login credentials
– Username and password
– Local SSH
• Absolute file path
Source Specific: Syslog
Required for Syslog collection:
• Protocol
• Listening port
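To make the protocol and listening port settings concrete, here is a small sketch of a sender that formats a BSD-style syslog line and ships it over UDP. The host, port 1514, hostname and tag are hypothetical placeholders; use whatever protocol and port the Syslog Source is actually configured with.

```python
import socket
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")
FACILITY_LOCAL0 = 16   # standard syslog facility code
SEVERITY_INFO = 6      # standard syslog severity code


def format_syslog(message: str, hostname: str, tag: str, when: datetime,
                  facility: int = FACILITY_LOCAL0,
                  severity: int = SEVERITY_INFO) -> bytes:
    """Build '<PRI>Mmm dd hh:mm:ss host tag: message' as UTF-8 bytes."""
    priority = facility * 8 + severity
    # The day of month is space-padded to two characters in this format.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{priority}>{timestamp} {hostname} {tag}: {message}".encode("utf-8")


def send_udp(payload: bytes, host: str, port: int) -> None:
    """Send one datagram to the Collector's listening port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))


if __name__ == "__main__":
    line = format_syslog("service started", "web01", "myapp", datetime.now())
    print(line.decode("utf-8"))
    # send_udp(line, "collector.example.invalid", 1514)  # hypothetical target
```

UDP is used here only because it needs no connection handling; if the Source is set to TCP, the sender has to open a stream socket instead.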
Source Specific: Windows Event Collection
Required for Windows Event Collection:
• Remote specific:
– Remote host name(s)
– Windows Domain
– Username / password
• Windows Event Type
Source Specific: Windows Performance Collection
Required for Windows Performance Collection:
• Remote specific:
– Remote host name(s)
– Windows Domain
– Username / password
• Frequency
• Perfmon Queries
Source Specific: Script
Required for script based collection:
• Execution frequency
• Command type
• Path to script
• Script to execute
• Working directory
Source Specific: HTTP
Required for HTTP Source:
• How to treat incoming POST requests
After Configuration:
• Use the Source's URL to send POST messages to the Collector
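As a rough illustration of sending POST messages to an HTTP Source, the sketch below builds a plain-text POST with one log message per line. The URL is a made-up placeholder, not a real endpoint format to rely on; paste the URL generated for your own Source. Treating each line as a separate message is an assumption about how the Source is configured.

```python
import urllib.request
from typing import List

# Hypothetical placeholder; replace with the URL shown for your HTTP Source.
HTTP_SOURCE_URL = "https://collector.example.invalid/receiver/UNIQUE_TOKEN"


def build_post(url: str, messages: List[str]) -> urllib.request.Request:
    """Build a POST request whose body holds one log message per line."""
    body = "\n".join(messages).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    req = build_post(HTTP_SOURCE_URL, ["app started", "user=alice action=login"])
    print(req.get_method(), len(req.data), "bytes")
    # send(req)  # uncomment once HTTP_SOURCE_URL points at a real HTTP Source
```

Batching several messages into one request keeps the number of connections down, which matters when the sender is a cron job or a short-lived script.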
Source Specific: Amazon S3 and AWS sources
Required for Amazon S3:
• IAM credentials
– Access Key ID
– Secret Access Key
• Bucket name
• Path expression
• Scan interval
Configuration: Filtering Source Data
• Regular expressions are used to create rules to filter data sent from a Source.
• The filters affect only data sent to Sumo Logic; logs on your end remain intact.
• Filter Types
– Exclude Filter (Black List)
– Include Filter (White List)
– Hash Filter (e.g. replace a credit card number with a unique, randomly generated code)
– Mask Filter (e.g. mask each character with #)
• Note
– Exclude filters override all other filter types for a specific value
– Mask and hash filters are applied after exclusion and inclusion filters
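The ordering described above (exclude wins, then include, then mask and hash) can be sketched locally as follows. This is only an illustration of the rule order, not the Collector's implementation: the regular expressions are hypothetical, and a truncated SHA-256 digest stands in for the generated code that a hash filter produces.

```python
import hashlib
import re
from typing import Callable, Optional

# Hypothetical rules, for illustration only.
EXCLUDE = [re.compile(r"DEBUG")]           # drop any message containing DEBUG
INCLUDE = []                               # empty list means "include everything"
MASK = [re.compile(r"password=(\S+)")]     # mask the captured group with '#'
HASH = [re.compile(r"card=(\d{16})")]      # replace the captured group with a digest


def _replace_group(pattern, line: str, fn: Callable[[str], str]) -> str:
    """Rewrite only capture group 1 of every match, leaving the rest intact."""
    def substitute(match):
        whole = match.group(0)
        start = match.start(1) - match.start()
        end = match.end(1) - match.start()
        return whole[:start] + fn(match.group(1)) + whole[end:]
    return pattern.sub(substitute, line)


def apply_filters(line: str) -> Optional[str]:
    """Return the message as it would be sent, or None if it is filtered out."""
    # Exclude rules override everything else.
    if any(p.search(line) for p in EXCLUDE):
        return None
    # With include rules present, a message must match at least one of them.
    if INCLUDE and not any(p.search(line) for p in INCLUDE):
        return None
    # Mask and hash rules run only on messages that survived the steps above.
    for p in MASK:
        line = _replace_group(p, line, lambda value: "#" * len(value))
    for p in HASH:
        line = _replace_group(
            p, line, lambda value: hashlib.sha256(value.encode()).hexdigest()[:16]
        )
    return line


if __name__ == "__main__":
    for sample in ("DEBUG password=abc", "INFO password=abc"):
        print(repr(sample), "->", repr(apply_filters(sample)))
```

Running rules like these against a sample of real log lines before configuring the Source is a cheap way to catch an exclude pattern that is broader than intended.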
Configuration: Filtering Files (Blacklisting)
• Blacklist a file or set of files that shouldn’t be ingested
Metadata
Metadata Fields
Tags added to your messages when data is collected:
• _collector: Name of the collector this data came from
• _source: Name of the source this data came through
• _sourceHost: Hostname of the server this data came from
• _sourceName: Name of the log file (including path)
• _sourceCategory: Category designation of source data
Example: Host A, Collector A, with Sources Apache Access and Apache Error
Metadata Field Usage
• Host A, Collector A: Apache Access (_sourceCategory = WS/Apache/Access), Apache Error (_sourceCategory = WS/Apache/Error)
• Host B, Collector B: Apache Access (_sourceCategory = WS/Apache/Access), Apache Error (_sourceCategory = WS/Apache/Error)
• Host C, Collector C: IIS Logs (_sourceCategory = WS/IIS), IIS W3C Logs (_sourceCategory = WS/IIS/W3C)
Sample Searches for _sourceCategory:
_sourceCategory = WS/Apache/Access
_sourceCategory = WS/Apache/*
_sourceCategory = WS/*
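To see how the three sample scopes narrow or widen a search, the sketch below matches them against the categories from the slide using shell-style wildcards. It is a local approximation with Python's `fnmatch`, not the Sumo Logic search engine, and it assumes `*` matches across `/` separators.

```python
from fnmatch import fnmatchcase
from typing import List

# Source categories used on the slide.
CATEGORIES = [
    "WS/Apache/Access",
    "WS/Apache/Error",
    "WS/IIS",
    "WS/IIS/W3C",
]


def matching_categories(scope: str) -> List[str]:
    """Return the categories selected by a wildcard scope such as 'WS/Apache/*'."""
    return [category for category in CATEGORIES if fnmatchcase(category, scope)]


if __name__ == "__main__":
    for scope in ("WS/Apache/Access", "WS/Apache/*", "WS/*"):
        print(scope, "->", matching_categories(scope))
```

The exact scope selects one category, `WS/Apache/*` selects both Apache categories, and `WS/*` selects all four, which is why a consistent hierarchy makes search scoping simple.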
Source Category Best Practices
• Recommended nomenclature for Source Categories
Component1/Component2/Component3…
• From least descriptive to most descriptive
Networking/Firewall/Cisco/FWSM
Networking/Firewall/Cisco/ASA
Networking/Firewall/PAN/PA7050
Networking/Router/Cisco/2821
• Note: Not all types of logs need to have the same number of levels.
• Benefits
– Simple search scoping by using wildcards anywhere in the string
– Simple, intuitive and self-maintaining partitions/index
– Simple and self-maintaining RBAC rules
• Blog Post: Good SourceCategory, Bad SourceCategory
Automation
Automating Deployments
• Silent installation
– Use sumo.conf
– Provide name, credentials and source file parameter for initial setup only
• Local Configuration Collector Management
– Manage configuration locally using a JSON file with Chef/Puppet
– Available for both new and existing collectors
• Collector Management API
– Define an initial Source configuration for your Collectors using a JSON file
– Retrieve and update Collector Configuration from an HTTP endpoint
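As an illustration of the JSON file approach, the sketch below generates a source definition file that a tool such as Chef or Puppet could template per host. The key names (`api.version`, `sources`, `sourceType`, `pathExpression`, `category`) are believed to match Sumo Logic's JSON source format but are an assumption here and should be checked against the current documentation; the source names and file paths are hypothetical.

```python
import json
from typing import Dict, List


def local_file_source(name: str, path_expression: str, category: str) -> Dict[str, str]:
    """Describe one Local File Source (key names are assumptions, see above)."""
    return {
        "sourceType": "LocalFile",
        "name": name,
        "pathExpression": path_expression,
        "category": category,
    }


def sources_document(sources: List[Dict[str, str]]) -> Dict[str, object]:
    """Wrap Source definitions in the top-level document structure."""
    return {"api.version": "v1", "sources": sources}


if __name__ == "__main__":
    document = sources_document([
        # Hypothetical paths; categories follow the WS/Apache/... scheme.
        local_file_source("Apache Access", "/var/log/apache2/access.log", "WS/Apache/Access"),
        local_file_source("Apache Error", "/var/log/apache2/error.log", "WS/Apache/Error"),
    ])
    print(json.dumps(document, indent=2))
```

Generating the file from code rather than editing it by hand keeps Source Categories consistent across hosts.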
Installed Collector Deployment Tips
• Install using Collector Guidelines/Requirements
• Access Keys
– Used for collector registration and API
– ID/Key Pair instead of user/pass
• Especially important when storing credentials on disk
• Collector Logs
– Logs in: $SUMO_HOME/logs
– Current Log: $SUMO_HOME/logs/collector.log
– Check for Out of Memory Errors
– Increase memory if needed as described on Support Site Post
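A quick way to check the current Collector log for memory problems is to scan it for the error text. The sketch below assumes the log reports them with the string `OutOfMemoryError`; that search string is an assumption, so adjust it to whatever your Collector's log actually contains.

```python
import os
from pathlib import Path
from typing import List


def find_matching_lines(log_path: Path, needle: str = "OutOfMemoryError") -> List[str]:
    """Return every line of the log that contains the search string."""
    with log_path.open("r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if needle in line]


if __name__ == "__main__":
    sumo_home = os.environ.get("SUMO_HOME")
    if not sumo_home:
        print("SUMO_HOME is not set; nothing to check")
    else:
        log_file = Path(sumo_home) / "logs" / "collector.log"
        if log_file.exists():
            hits = find_matching_lines(log_file)
            print(f"{len(hits)} matching line(s) in {log_file}")
        else:
            print(f"{log_file} not found")
```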
Questions?
Additional Resources
Search Video Library and Documentation
Search/Post to Community Forums
Search, post, respond
Submit/vote for feature requests
Submit Tips & Tricks
Open a Support Case
Sumo Logic Services
Customer Success, Professional Services,
Training
Thank You!
April 2016

"How to" Webinar: Sending Data to Sumo Logic


Editor's Notes

  • #2: Welcome everyone. My name is…. I’m joined by Maisie and Ryan, who are part of our Engineering team that works on Data Collection. They will be fielding questions during the Q&A session at the end of this webinar. Housekeeping items: Everyone is on mute to avoid distractions. If you want to ask a question, please do so using the GTW question panel. This webinar will be recorded and shared with all of you, along with the slides.
  • #3: Please note that this webinar is specifically for users with Admin privileges who have access to install and manage Collectors. At the completion of this webinar, you will be able to…
  • #5: Sumo Logic Data Flow is broken into 3 main areas: Data Collection through configurable Collectors and Sources. Collectors collect, compress, cache and encrypt the data for secure transfer. Search and Analyze – Users can run searches and correlate events in real-time across the entire application stack. We will be spending most of our time in this area during this webinar, as this is most likely what you will first be doing as a new user. Visualize and Monitor- Users have the ability to create custom dashboards to help you easily monitor your data in real-time. Custom alerts notify you when specific events are identified across your stack. I will cover Data Collection at a high-level, and cover the next 2 areas through a demo.
  • #6: What data can we ingest? We can ingest data from just about any source you can imagine, structured or unstructured. Here are just a few of the devices, applications and frameworks you may be using, all of which produce log data that Sumo Logic can analyze. The left-hand side can represent your technology stack, from custom application code all the way down to your network devices.
  • #7: Sumo Logic Installed and Hosted Collectors are infinitely flexible. Design a Sumo Logic deployment that's right for your organization. <Review slide citing some examples>
  • #8: At a high level, customers collect and send data to Sumo Logic through the use of Collectors and Sources. We’ll cover Collectors first and then dive into Sources. This is a great example of what we see at a typical customer. This customer is sending web server log files to the Sumo Logic service. Host A and Host B are each sending a couple of log files through a locally installed Sumo Logic Collector. Host C, which is sending IIS log files, uses a Hosted Collector, where a local script sends data to an HTTP endpoint (running curl and POST commands).
  • #11: Hosted Collectors: Hosted Collectors allow for seamless collection from Amazon S3 buckets and HTTP Sources. Hosted Collectors don't require installation or activation, nor do they have physical requirements, since they're hosted in AWS. Because there are no performance issues to consider, you can configure as many S3 and HTTP Sources as you'd like for a single Hosted Collector. Installed Collectors: Sumo Logic Installed Collectors are lightweight and efficient. You can choose to install a small number of Collectors to minimize maintenance or just because you want to keep your topology simple (Centralized). Alternatively, you can choose to install many Collectors on many machines (Local) to distribute the bandwidth impact across your network. Installed Collectors are deployed in your environment, either on a local machine, a machine in your organization, or even an Amazon Machine Image (AMI). Installed Collectors require a software download and installation. Upgrades to Collector software are released regularly. A few things to consider: Consider having an Installed Collector on a dedicated machine if: you are running a very high-bandwidth network with high logging levels; you want a central collection point for many Sources. Consider having more than one Installed Collector if: you expect the combined number of files coming into one Collector to exceed 500; your hardware has memory or CPU limitations; you expect combined logging traffic for one Collector to be higher than 15,000 events per second; your network clusters or regions are geographically separated; you prefer to install many Collectors, for example, one per machine to collect local files. IMPORTANT: For system requirement details, see Installed Collector Requirements.
  • #13: The Sumo Logic Collector is installed on all target hosts and, where possible, sends the log data produced on those hosts directly to the Sumo Logic backend via an HTTPS connection.
  • #14: The Sumo Logic Collector is installed on all target hosts and, where possible, sends the log data produced on those hosts directly to the Sumo Logic backend via an HTTPS connection.
  • #15: The Sumo Logic Collector is installed on a set of dedicated machines. These collect log data from the target hosts via various remote mechanisms and forward the data to the Sumo Logic backend. This can be accomplished either by using the Sumo Logic syslog Source type, or by running syslog servers (syslog-ng, rsyslog) that write to file and collecting from there.
  • #16: The Sumo Logic Collector is installed on a set of dedicated machines. These collect log data from the target hosts via various remote mechanisms and forward the data to the Sumo Logic backend. This can be accomplished either by using the Sumo Logic syslog Source type, or by running syslog servers (syslog-ng, rsyslog) that write to file and collecting from there.
  • #17: In most cases our customers will employ a mix of the above options to account for different limitations on both the log types and source types. For example, network devices only broadcast syslog, so even if you generally employ the local file collection paradigm, you still need some syslog infrastructure to collect these logs. The same is true for any Cloud API logs (e.g. Okta Event, etc.) you may want to collect via script. Another example: an AWS-only customer will most likely still choose to install Collectors on all their EC2 instances (local collection) and collect AWS audit logs (CloudTrail, ELB, etc.) via the S3 integration. Which strategy you choose does not depend primarily on where the data lives, but on the following: sensitivity in terms of outbound internet access; technical abilities in your team (setting up centralized infrastructure requires knowledge, hardware and a need for monitoring/scaling/fault tolerance); whether or not there is a logging infrastructure already in place. At a high level, we only recommend the Centralized method if the following are true: - You absolutely cannot live with the internet access requirements - You have an existing infrastructure (syslog/logstash) - Your data volume or your number of target hosts is pretty large
  • #19: At a high level, customers collect and send data to Sumo Logic through the use of Collectors and Sources. We’ll cover Collectors first and then dive into Sources. This is a great example of what we see at a typical customer. This customer is sending web server log files to the Sumo Logic service. Host A and Host B are each sending a couple of log files through a locally installed Sumo Logic Collector. Host C, which is sending IIS log files, uses a Hosted Collector, where a local script sends data to an HTTP endpoint (running curl and POST commands). Hosted Collectors are also able to load data from AWS S3 buckets.
  • #20: Name: something that is relevant to the data you are collecting. Description: reference to understand the Source. Source Category: custom label that you can easily use to search data gathered by this Source. Timestamp. Host. File Path or Source (Name). Source-specific config: Local/Remote Path, Script/File/Windows.
  • #27: Name: something that is relevant to the data you are collecting. Description: reference to understand the Source. Source Category: custom label that you can easily use to search data gathered by this Source. Timestamp. Host. Hosted: S3: the path expression allows you to identify which objects to upload from S3. You can use a wildcard in the path expression to capture more files; an exact file name will only pick up files that match that name.
  • #31: Great, data is ingested into the Sumo Logic service, but something else is also happening in the background. Every single message ingested gets tagged with metadata that makes it much easier to search for related messages. This table shows the 5 main tags (review them all). In particular, I want to point out the Source Category metadata field, as choosing the right naming convention can make a big impact on your searching capabilities and performance.
  • #32: This example highlights the importance of defining the proper Source Category. Notice I’ve added the desired SourceCategory for each Source. = WS/Apache/Access searches across Apache Access logs on both Host A and Host B. = WS/Apache/* searches across all Apache sources on both Host A and Host B. = WS/* searches across all web servers across all hosts.
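
The Source Category patterns in note #32 can be sketched in a few lines of Python. This is only an illustration of how a hierarchical naming convention lets one wildcard select progressively wider sets of Sources; it is not how the Sumo Logic service evaluates searches. The `WS/Apache/Error` and `WS/IIS` labels, the `SOURCES` table and the `matching_sources` helper are assumptions made for the example, and `fnmatchcase` is a stand-in for the service's own wildcard handling.

```python
from fnmatch import fnmatchcase

# (host, source name) -> Source Category.
# Only WS/Apache/Access appears in the notes; the Error and IIS labels
# are assumed here to complete the hierarchy.
SOURCES = {
    ("Host A", "Apache Access"): "WS/Apache/Access",
    ("Host A", "Apache Error"): "WS/Apache/Error",
    ("Host B", "Apache Access"): "WS/Apache/Access",
    ("Host B", "Apache Error"): "WS/Apache/Error",
    ("Host C", "IIS Logs"): "WS/IIS",
}


def matching_sources(pattern):
    """Return the (host, source) pairs whose category matches the pattern."""
    return sorted(
        key for key, category in SOURCES.items()
        if fnmatchcase(category, pattern)
    )


if __name__ == "__main__":
    for pattern in ("WS/Apache/Access", "WS/Apache/*", "WS/*"):
        print(pattern, "->", matching_sources(pattern))
```

Because the category reads from general to specific (web server, product, log type), widening a search is a matter of replacing the trailing segments with a wildcard rather than listing every Source.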
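
Notes #8 and #19 describe Host C using a local script to POST log data to a Hosted Collector's HTTP endpoint. A minimal sketch of such a script, using only the Python standard library in place of curl, might look like the following. The URL is a hypothetical placeholder (a real HTTP Source has its own unique address), and the `build_upload_request` and `send` helpers and the plain-text content type are assumptions for illustration.

```python
import urllib.request

# Hypothetical placeholder; substitute the URL of your own HTTP Source.
HTTP_SOURCE_URL = "https://collector.example.com/receiver/v1/http/PLACEHOLDER_TOKEN"


def build_upload_request(log_lines, url=HTTP_SOURCE_URL):
    """Build a POST request carrying one log message per line."""
    body = "\n".join(log_lines).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "text/plain"},
    )


def send(request, timeout=10):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    request = build_upload_request(["first log line", "second log line"])
    # Only inspect the request here; call send(request) once the URL is real.
    print(request.get_method(), request.full_url, len(request.data), "bytes")
```

Building the request separately from sending it keeps the script easy to check without network access.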
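
The sizing guidance in note #11 (more than 500 files, more than 15,000 events per second, constrained hardware, geographically separated regions) can be written down as a simple checklist. The sketch below encodes only the thresholds stated in that note; the `CollectorLoad` type and `reasons_to_split` function are hypothetical names, and the result is a planning aid, not a substitute for the Installed Collector Requirements.

```python
from dataclasses import dataclass

# Thresholds quoted in note #11.
MAX_FILES_PER_COLLECTOR = 500
MAX_EVENTS_PER_SECOND = 15_000


@dataclass
class CollectorLoad:
    """Expected load on a single Installed Collector (illustrative type)."""
    file_count: int
    events_per_second: int
    constrained_hardware: bool = False
    spans_regions: bool = False


def reasons_to_split(load):
    """List the reasons from note #11 for using more than one Collector."""
    reasons = []
    if load.file_count > MAX_FILES_PER_COLLECTOR:
        reasons.append("more than 500 files for one Collector")
    if load.events_per_second > MAX_EVENTS_PER_SECOND:
        reasons.append("more than 15,000 events per second")
    if load.constrained_hardware:
        reasons.append("memory or CPU limitations")
    if load.spans_regions:
        reasons.append("geographically separated clusters or regions")
    return reasons


if __name__ == "__main__":
    load = CollectorLoad(file_count=800, events_per_second=4_000)
    print(reasons_to_split(load))
```

An empty list means none of the stated criteria apply, in which case a single Installed Collector may be enough.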