Sumo Logic Confidential
Setting Up Sumo Logic
Data Collection and System Optimization
Mario Sanchez
Director of Technical Content and Training
April 2017
Welcome. To give everyone a chance to successfully connect, we’ll start at 10:05 AM Pacific. Note that you are currently muted.
At the completion of this webinar, you will be able to:
• Deploy a data collection strategy that best fits your environment
• Implement best practices around data collection
• Develop a robust naming convention for your metadata
• Use optimization tools to enhance search performance
What is Sumo Logic?
Continuous Intelligence
DEVOPS
• Streamline continuous delivery
• Monitor KPIs and metrics
• Accelerate troubleshooting
IT INFRASTRUCTURE AND OPERATIONS
• Monitor all workloads
• Troubleshoot and increase uptime
• Simplify, modernize, and save costs
COMPLIANCE AND SECURITY
• Automate and demonstrate compliance
• Audit all systems
• Think beyond rules
Sumo Logic Cloud Analytics Service
Enterprise Logs are Everywhere
Custom App Code, Server / OS, Virtual, Databases, Network, Open Source, Middleware, Content Delivery, IaaS, PaaS, SaaS, Security
High-Level Data Flow
Sumo Logic Data Flow
1. Data Collection (Collectors, Sources)
2. Search & Analyze (Operators)
3. Visualize & Monitor (Dashboards, Alerts)
Additional diagram label: Detect
Data Collection Strategy
Designing Your Deployment
• Sumo Logic Data Collection is infinitely flexible.
• Design a Sumo Logic deployment that's right for your organization.
• Installed versus Hosted Collectors.
Collector and Deployment Options
Best Practices on Designing Your Deployment
• Hosted Collectors: Cloud Data Collection
• Installed Collectors: Centralized Data Collection, Local Data Collection
Local Data Collection
The Sumo Logic Collector is installed on every target host and, where possible, sends the log data produced on that host directly to the Sumo Logic backend over an HTTPS connection.
Source Types
• Local Files: operating systems, middleware, custom apps, etc.
• Windows Events: local Windows events
• Docker: logs and stats
• Syslog (dedicated Collector): network devices, Snare, etc.
• Script (dedicated Collector): cloud APIs, database content, binary data
Typical Scenarios
Customers with a large number of similar servers that use orchestration/automation and produce mostly OS and application logs
- On-premises datacenters
- Cloud instances
Benefits/Drawbacks
+ No hardware requirement
+ Automation (Chef/Puppet/scripting)
- Outbound Internet access required
- Resource usage on target
Centralized Data Collection
The Sumo Logic Collector is installed on a set of dedicated machines. These Collectors gather log data from the target hosts via various remote mechanisms and forward it to the Sumo Logic backend. This can be accomplished either by using the Sumo Logic Syslog Source type or by running syslog servers (syslog-ng, rsyslog) that write to file, and collecting from those files.
Source Types
• Syslog: operating systems, middleware, custom applications, etc.
• Windows Events: remote Windows events
• Script: cloud APIs, database content, binary data
Typical Scenarios
Customers with mostly Windows environments or an existing logging infrastructure (syslog/Logstash)
- On-premises datacenters
Benefits/Drawbacks
+ No outbound Internet access
+ Leverage existing logging infrastructure
- Scale
- Dedicated hardware
- Complexity (failover, syslog rules)
Cloud Data Collection
Most data is generated in the cloud and by cloud services, and is collected via Sumo Logic's cloud integrations.
Source Types
• S3 Bucket: any data written to S3 buckets (AWS Audit or other)
• HTTPS: Lambda scripts, Akamai, OneLogin, log appender libraries, etc.
• Google / O365: Google API and O365 API
Typical Scenarios
Customers using cloud infrastructure. While it's possible to rely on Cloud Data Collection entirely, this is not typical; these source types are normally just one part of the overall collection strategy.
Benefits/Drawbacks
+ No software installation
- S3 latency issues
- HTTPS POST caching needed
Metadata Design
What is Metadata?
Metadata tags are associated with each log message that is collected. Values are set through Collector and Source configuration.
Tag: Description
_collector: Name of the Collector (defaults to hostname)
_source: Name of the Source this data came through
_sourceHost: Hostname of the server (defaults to hostname)
_sourceName: Name and path of the log file
_sourceCategory: Can be freely configured; the main metadata tag
Source Category Best Practices
Recommended nomenclature for Source Categories
Component1/Component2/Component3…
Order the components from least descriptive to most descriptive.
* Note: Not all types of logs need to have the same number of levels.
Best Practices: Good Source Category, Bad Source Category
Prod/MyApp1/Apache/Access
Prod/MyApp1/Apache/Error
Prod/MyApp1/CloudTrail
Dev/MyApp1/Apache/Access
Dev/MyApp1/Apache/Error
Dev/MyApp1/CloudTrail
Prod/MyApp2/Nginx/Access
Prod/MyApp2/Tomcat/Access
Prod/MyApp2/Tomcat/Catalina/Out
Prod/MyApp2/MySQL/SlowQueries
Dev/MyApp2/Nginx/Access
Dev/MyApp2/Tomcat/Access
Dev/MyApp2/Tomcat/Catalina/Out
Dev/MyApp2/MySQL/SlowQueries
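The Component1/Component2/Component3 convention above can be enforced with a small helper wherever Source configurations are generated. This is a minimal sketch: the function name `build_source_category` and the alphanumeric-only rule for components are assumptions made for illustration, not documented Sumo Logic restrictions.

```python
import re

# Assumption for this sketch: components are plain alphanumeric words.
# This is a house rule chosen for illustration, not a Sumo Logic requirement.
_COMPONENT = re.compile(r"[A-Za-z0-9]+")


def build_source_category(*components: str) -> str:
    """Join components, ordered from least to most descriptive, with '/'."""
    if not components:
        raise ValueError("at least one component is required")
    for part in components:
        if not _COMPONENT.fullmatch(part):
            raise ValueError(f"invalid component: {part!r}")
    return "/".join(components)


# Not every log type needs the same number of levels.
print(build_source_category("Prod", "MyApp1", "Apache", "Access"))
print(build_source_category("Prod", "MyApp1", "CloudTrail"))
```

Generating the value in one place keeps the ordering of components consistent across teams, which is what makes the wildcard scoping on later slides work.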
Metadata: Source Category Best Practices and Benefits
Simple Search Scoping
_sourceCategory=Prod/MyApp1/Apache* (All Apache Logs for Prod)
_sourceCategory=*/MyApp1/Apache* (All Apache Logs for all environments)
Simple, Intuitive and Self-maintaining Partitions/Indexes
_sourceCategory=Prod/MyApp1*
_sourceCategory=Prod/MyApp2*
Note: The first component, or the first and second components together, are used for partitioning.
Simple and Self-maintaining RBAC Roles
_sourceCategory=Prod/MyApp1*
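To see why a hierarchical naming convention makes scoping simple, the wildcard expressions on this slide can be approximated locally with Python's `fnmatch` module. This is only a sketch of the idea: `fnmatchcase` is case-sensitive shell-style matching, and Sumo Logic's own matching rules may differ. The `scope` helper and the sample list are hypothetical.

```python
from fnmatch import fnmatchcase

# Sample source categories taken from the naming examples in this deck.
CATEGORIES = [
    "Prod/MyApp1/Apache/Access",
    "Prod/MyApp1/Apache/Error",
    "Prod/MyApp1/CloudTrail",
    "Dev/MyApp1/Apache/Access",
    "Prod/MyApp2/Nginx/Access",
]


def scope(pattern, categories=CATEGORIES):
    """Return the categories that a wildcard pattern would select."""
    return [c for c in categories if fnmatchcase(c, pattern)]


print(scope("Prod/MyApp1/Apache*"))  # Apache logs for Prod
print(scope("*/MyApp1/Apache*"))     # Apache logs for all environments
print(scope("Prod/MyApp1*"))         # everything for MyApp1 in Prod
```

Because the environment and application come first, one trailing wildcard is enough to select a whole subtree, and new log types added under `Prod/MyApp1/` are picked up without changing the expression.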
Metadata: Source Category Best Practices
Common components (and any combination of):
• Environment (Prod/UAT/DEV)
• Application Name
• Geographic Information (East vs West datacenter, office location, etc.)
• AWS Region
• Business Unit
The highest-level components should group the data the way it is most often searched together:
Prod/Web/Apache/Access
Dev/Web/Apache/Access
Prod/DB/MySQL/Error
Dev/DB/MySQL/Error
Web/Apache/Access/Prod
Web/Apache/Access/Dev
DB/MySQL/Error/Prod
DB/MySQL/Error/Dev
Optimization Tools
Partitions
About Partitions
Indexes for subsets of your data. Segregate your data into smaller, logical chunks that are mostly searched in isolation from other Partitions.
Best Practices
• No overlap between Partitions
• Fewer than 20 Partitions
• Each Partition ideally holds between 1% and 30% of total volume
• Group data that is most often searched together
Examples:
_sourceCategory=Prod/MyApp1*
_sourceCategory=Prod/MyApp2*
or
_sourceCategory=Prod/*
_sourceCategory=Dev/*
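The Partition best practices above can be checked against a proposed design before it is configured. The sketch below is an offline planning aid under stated assumptions: the `check_partitions` function, its inputs, and the use of `fnmatchcase` to approximate the routing expressions are all hypothetical, and the volume figures would come from your own ingest reports.

```python
from fnmatch import fnmatchcase

MAX_PARTITIONS = 20                 # slide: fewer than 20 Partitions
MIN_SHARE, MAX_SHARE = 0.01, 0.30   # slide: ideally 1% to 30% of total volume


def check_partitions(partitions, volume_by_category):
    """Return a list of warnings for a proposed Partition design.

    partitions: partition name -> _sourceCategory wildcard pattern
    volume_by_category: source category -> ingested volume (any consistent unit)
    """
    total = sum(volume_by_category.values())
    if total <= 0:
        raise ValueError("total volume must be positive")

    warnings = []
    if len(partitions) >= MAX_PARTITIONS:
        warnings.append(f"{len(partitions)} partitions; keep it under {MAX_PARTITIONS}")

    share = {name: 0.0 for name in partitions}
    for category, volume in volume_by_category.items():
        owners = [n for n, p in partitions.items() if fnmatchcase(category, p)]
        if len(owners) > 1:  # the same data would land in two Partitions
            warnings.append(f"{category} overlaps partitions: {', '.join(owners)}")
        for name in owners:
            share[name] += volume / total

    for name, fraction in share.items():
        if not MIN_SHARE <= fraction <= MAX_SHARE:
            warnings.append(f"{name} holds {fraction:.1%} of volume (target 1%-30%)")
    return warnings


volumes = {
    "Prod/MyApp1/Apache/Access": 20,
    "Prod/MyApp2/Nginx/Access": 25,
    "Dev/MyApp1/Apache/Access": 55,
}
print(check_partitions({"prod_app1": "Prod/MyApp1*", "prod_app2": "Prod/MyApp2*"}, volumes))
```

With these sample volumes the two-Partition design passes, while a design that mixes `Prod/*` with `Prod/MyApp1*` would be flagged for overlap.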
Field Extraction Rules
Apply parse logic to a dataset at ingest time, as opposed to at search time.
Benefits
• Better performance
• Standardized field names
• Simplified searches
Best Practices
• Build simple, specific rules
• Test parse and other operations thoroughly (use nodrop and isEmpty for testing)
Limitations
• 50 rules/200 fields (will be removed soon)
• Not all operators are supported
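The idea behind parsing at ingest time can be illustrated outside of Sumo Logic with a plain regular expression. This is an analogy rather than the service's implementation: the `ACCESS_RULE` pattern, the field names, and the `apply_rule` helper are hypothetical. Keeping a message whose fields come back empty mirrors what testing with nodrop and isEmpty is meant to surface: lines the rule fails to parse.

```python
import re

# Hypothetical rule for Apache-style access lines; pattern and field names are illustrative.
ACCESS_RULE = re.compile(
    r'^(?P<src_ip>\S+) .* "(?P<method>[A-Z]+) (?P<url>\S+) [^"]*" (?P<status_code>\d{3}) '
)


def apply_rule(message: str, rule: re.Pattern) -> dict:
    """Parse once at ingest; keep the message even when the rule does not match."""
    match = rule.search(message)
    if match:
        fields = match.groupdict()
    else:
        # Unparsed messages are kept with empty fields so they can be found and fixed.
        fields = {name: "" for name in rule.groupindex}
    return {"_raw": message, **fields}


good = apply_rule(
    '10.0.0.1 - - [10/Apr/2017:10:05:00 -0700] "GET /index.html HTTP/1.1" 200 512',
    ACCESS_RULE,
)
bad = apply_rule("unexpected log line", ACCESS_RULE)
print(good["status_code"], good["url"])
print(bad["status_code"] == "")  # empty field signals a message the rule missed
```

Once the fields are stored alongside each message, every search can filter on `status_code` directly instead of repeating the parse expression, which is where the standardized names and simpler searches come from.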
Scheduled Views
Copies of subsets of data, similar to a materialized view in a relational database.
Use Cases
• Pre-aggregated data (e.g., for long-term trends)
• Finding the needle in the haystack
Best Practices
• We recommend a selectivity of > 1:10000
How They Work
• The view is updated by the service about once a minute
• Allows for backfilling
• Search the view using _view=[viewname]
• Data does count against ingest volume
Review: Search Optimization Tools
What I want to do is… (options: Partition, Scheduled View, Field Extraction)
• Run queries against a certain set of data
  - Partition: choose if the amount of data is between 1% and 30%
  - Scheduled View: choose if the amount of data you’d like to segregate is 1% or less
  - Field Extraction: choose if you want to pre-extract fields that you search against frequently
• Extract fields from logs and make them available to all users: Field Extraction ✔
• Use data to identify long-term trends: Scheduled View ✔
• Segregate data by metadata: Partition ✔
• Pre-computed or aggregate data ready to query: Scheduled View ✔
• Use RBAC to deny or grant access to the data: ✔ ✔
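The volume thresholds in the first row of the review table can be written down as a simple rule of thumb. This is a sketch only: the function name `suggest_tool` is hypothetical, and because the table lists 1% under both options, the sketch arbitrarily assigns exactly 1% to Scheduled View.

```python
from typing import Optional


def suggest_tool(share_of_total: float) -> Optional[str]:
    """Suggest a tool for isolating a subset of data, given its share of total volume.

    share_of_total is a fraction, e.g. 0.05 for 5%.
    Returns None when the share falls outside the ranges named in the table.
    """
    if not 0 < share_of_total <= 1:
        raise ValueError("share_of_total must be in (0, 1]")
    if share_of_total <= 0.01:   # table: 1% or less
        return "Scheduled View"
    if share_of_total <= 0.30:   # table: between 1% and 30%
        return "Partition"
    return None                  # table gives no recommendation above 30%


for share in (0.005, 0.20, 0.50):
    print(share, suggest_tool(share))
```

Field Extraction Rules are deliberately left out of this helper: the table recommends them based on how often fields are searched, not on data volume.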
In Summary, you can…
• Ingest any type of logs (structured and unstructured)
• Select a deployment option that best fits your sources
• Develop a robust naming convention for your metadata
• Take advantage of Optimization Tools
Call to Action:
• Set up the deployment option (or hybrid option) that best fits your environment
• Ensure you have a robust _sourceCategory naming convention
• At the very least, set up Field Extraction Rules for your popular data sources
Questions?
• Consume Training: sumologic.com/training
• Read Documentation: help.sumologic.com
• Search/Post to Community: community.sumologic.com, slack.sumologic.com
• Open a Support Case: support.sumologic.com
• Log a Feature Request: sumologic.ideas.aha.io/ideas
Thank you!



Editor's Notes

  • #2: Welcome, everyone, to the "Setting Up Sumo Logic" webinar. My name is … and I am … Before we get started, let’s cover some housekeeping items. To avoid distractions, all participants are muted. However, if you want to ask a question, feel free to use the GoToWebinar Question panel; we will have a Q&A session at the end. To preempt the most common question: yes, this session will be recorded, and I will share the slides and a recording of this webinar with everyone. OK, let’s get started.
  • #4: Sumo Logic helps you gain insights into the growing pool of data within your complex environment.
  • #5: Most of you are using the Sumo Logic service for at least one of the following three use cases. For DevOps: Sumo Logic allows DevOps teams to monitor KPIs to deliver quality software, spending less time troubleshooting and more time developing code. For IT Ops: extract valuable information such as latencies, performance metrics, trends, and any critical events tied to core systems. For Compliance and Security: Sumo Logic helps organizations simplify and automate compliance and security monitoring across their entire stack, using predictive analytics.
  • #6: What data can we ingest? We can ingest data from just about any source you can imagine, structured or unstructured. Here are just a few of the devices, applications, and frameworks you may be using, all of which produce log data that Sumo Logic can ingest and analyze. The left-hand side can represent your technology stack, from custom application code all the way down to your network devices. The right-hand side can represent your infrastructure.
  • #8: The Sumo Logic data flow is broken into three main areas. Data Collection: data is collected through configurable Collectors and Sources; Collectors collect, compress, cache, and encrypt the data for secure transfer. Search and Analyze: users can run searches and correlate events in real time across the entire application stack. We will be spending most of our time in this area during this webinar, as this is most likely what you will first be doing as a new user. Visualize and Monitor: users can create custom dashboards to easily monitor data in real time, and custom alerts notify you when specific events are identified across your stack. I will cover Data Collection at a high level, and cover the next two areas through a demo.
  • #10: Sumo Logic Installed and Hosted Collectors are infinitely flexible. Design a Sumo Logic deployment that's right for your organization. <Review slide citing some examples>
  • #11: Hosted Collectors
    Hosted Collectors allow for seamless collection from Amazon S3 buckets and HTTP Sources. They don't require installation or activation, and they have no physical requirements, since they're hosted in AWS. Because there are no performance issues to consider, you can configure as many S3 and HTTP Sources as you'd like for a single Hosted Collector.
    Installed Collectors
    Sumo Logic Installed Collectors are lightweight and efficient. You can choose to install a small number of Collectors to minimize maintenance or just because you want to keep your topology simple (Centralized). Alternatively, you can choose to install many Collectors on many machines (Local) to distribute the bandwidth impact across your network. Installed Collectors are deployed in your environment: on a local machine, a machine in your organization, or even an Amazon Machine Image (AMI). Installed Collectors require a software download and installation. Upgrades to Collector software are released regularly.
    A few things to consider
    Consider having an Installed Collector on a dedicated machine if:
    - You are running a very high-bandwidth network with high logging levels.
    - You want a central collection point for many Sources.
    Consider having more than one Installed Collector if:
    - You expect the combined number of files coming into one Collector to exceed 500.
    - Your hardware has memory or CPU limitations.
    - You expect combined logging traffic for one Collector to be higher than 15,000 events per second.
    - Your network clusters or regions are geographically separated.
    - You prefer to install many Collectors, for example, one per machine to collect local files.
    IMPORTANT: For system requirement details, see Installed Collector Requirements.
  • #12: The Sumo Logic Collector is installed on every target host and, where possible, sends the log data produced on that host directly to the Sumo Logic backend over an HTTPS connection.
  • #13: The Sumo Logic Collector is installed on a set of dedicated machines. These Collectors gather log data from the target hosts via various remote mechanisms and forward it to the Sumo Logic backend. This can be accomplished either by using the Sumo Logic Syslog Source type or by running syslog servers (syslog-ng, rsyslog) that write to file, and collecting from those files.
  • #16: Great, data is ingested into the Sumo Logic service, but something else is also happening in the background. Every single message ingested gets tagged with metadata that makes it much easier to search for related messages. This table shows the five main tags (review them all). In particular, I want to point out the _sourceCategory metadata field, as choosing the right naming convention can make a big impact on your search capabilities and performance.
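
The guidance in note #11 on when to run more than one Installed Collector can be captured as a small planning check. This is a sketch under stated assumptions: the `CollectorPlan` type and `reasons_to_add_collectors` function are hypothetical, and only the 500-file and 15,000 events-per-second figures come from the note itself.

```python
from dataclasses import dataclass
from typing import List

MAX_FILES_PER_COLLECTOR = 500    # note #11: combined number of files
MAX_EVENTS_PER_SECOND = 15_000   # note #11: combined logging traffic


@dataclass
class CollectorPlan:
    """Expected load on a single Installed Collector (hypothetical planning input)."""
    files: int
    events_per_second: int
    separated_regions: bool = False      # network clusters/regions geographically separated
    constrained_hardware: bool = False   # memory or CPU limitations


def reasons_to_add_collectors(plan: CollectorPlan) -> List[str]:
    """List which considerations from the note apply to this plan."""
    reasons = []
    if plan.files > MAX_FILES_PER_COLLECTOR:
        reasons.append(f"{plan.files} files exceeds {MAX_FILES_PER_COLLECTOR}")
    if plan.events_per_second > MAX_EVENTS_PER_SECOND:
        reasons.append(f"{plan.events_per_second} events/s exceeds {MAX_EVENTS_PER_SECOND}")
    if plan.separated_regions:
        reasons.append("regions are geographically separated")
    if plan.constrained_hardware:
        reasons.append("hardware has memory or CPU limitations")
    return reasons


print(reasons_to_add_collectors(CollectorPlan(files=650, events_per_second=9_000)))
```

An empty list means none of the listed considerations apply; it does not replace checking the Installed Collector requirements the note refers to.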