syslog-ng: from raw data to Big Data
SCaLE 14x, Los Angeles
Peter Czanik / Balabit
2
About me
■ Peter Czanik from Hungary
■ Community manager at BalaBit: syslog-ng upstream
■ Doing syslog-ng packaging, support and advocacy
■ BalaBit is an IT security company with development HQ in Budapest,
Hungary
■ Over 200 employees: the majority are engineers
3
syslog-ng
■ Logging: recording events, like this one:
□ Jan 14 11:38:48 linux-0jbu sshd[7716]: Accepted publickey for root from 127.0.0.1 port 48806 ssh2
■ syslog-ng: enhanced logging daemon, with a focus on central log
collection.
□ Not only syslog
□ Processing and filtering messages
□ Storing to a central location or forwarding to a wide variety of destinations
4
C-3PO (Star Wars)
5
syslog-ng and Big Data
syslog-ng can facilitate the data pipeline to Big Data in many ways:
■ Data collector
■ Data processor
■ Data filtering
6
syslog-ng: data collector
Collect system and application logs together: contextual data for either side
■ A wide variety of platform specific sources:
□ /dev/log & Co
□ Journal, Sun streams
■ Receive syslog messages over the network
□ Legacy or RFC5424, UDP/TCP/TLS
■ Logs or any kind of data from applications:
□ Through files, sockets, pipes, etc.
□ Application output
7
syslog-ng: processing
Process messages close to the source: easier filtering, lower load on the
consumer side
■ Classify, normalize and structure logs with built-in parsers:
□ CSV parser, DB parser (PatternDB), JSON parser
■ Rewrite messages:
□ For example, anonymization
■ Reformat messages using templates:
□ The destination might need a specific format (ISO date, JSON, etc.)
■ Enrich data:
□ GeoIP, additional fields based on message content
8
syslog-ng: data filtering
Main uses:
■ Message routing (login events to SIEM, SMTP logs to a separate file, etc.)
■ Throw away surplus logs (don't store debug-level messages in SQL)
Many possibilities:
■ Based on message content, parameters or macros
■ Using comparisons, wildcards, regular expressions and functions
■ Combining all of these with boolean operators
9
syslog-ng “Big Data” destinations
■ Distributed file systems:
□ Hadoop
■ NoSQL databases:
□ MongoDB
□ Elasticsearch
■ Messaging systems:
□ Kafka
10
Free-form log messages
■ Most log messages are: date + hostname + text
Mar 11 13:37:56 linux-6965 sshd[4547]: Accepted keyboard-interactive/pam for root from 127.0.0.1 port 46048 ssh2
■ Text = English sentence with some variable parts
■ Easy to read by a human
■ Difficult to process with scripts
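A minimal Python sketch of why scripts struggle with free-form messages: the header (date, host, program, PID) splits cleanly, but everything of interest stays buried in the text part. The regular expression and the `split_syslog_line` helper are illustrative assumptions, not part of syslog-ng.

```python
import re

# Legacy syslog line: timestamp, host, program[pid]: free-form text.
SYSLOG_LINE = re.compile(
    r"^(?P<date>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<program>[^\[:\s]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def split_syslog_line(line):
    """Return the header fields plus the free-form text, or None if no match."""
    match = SYSLOG_LINE.match(line)
    return match.groupdict() if match else None


line = ("Mar 11 13:37:56 linux-6965 sshd[4547]: Accepted keyboard-interactive/pam "
        "for root from 127.0.0.1 port 46048 ssh2")
fields = split_syslog_line(line)
print(fields["host"], fields["program"], fields["pid"])
# The user name and source address are still hidden inside the sentence:
print(fields["message"])
```

Getting at the user or the IP address needs a second, message-specific pattern for every kind of log line, which is the problem structured logging addresses.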
11
Solution: structured logging
■ Events represented as name-value pairs
■ Example: an ssh login:
□ source_ip=192.168.123.45
□ app=sshd
□ user=root
■ syslog-ng: name-value pairs inside
□ Date, facility, priority, program name, pid, etc.
■ Parsers in syslog-ng can turn unstructured and some structured data (CSV,
JSON) into name-value pairs
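For input that is already structured, such as JSON, the conversion to name-value pairs is mostly mechanical. The Python sketch below illustrates the idea only; the `json_to_pairs` helper and the dotted naming of nested keys are assumptions made for this example, and the sample message is a shortened, hypothetical one.

```python
import json


def json_to_pairs(text):
    """Flatten a JSON object into a flat dict of name-value pairs."""
    pairs = {}

    def walk(value, name):
        if isinstance(value, dict):
            for key, child in value.items():
                # Nested objects become dotted names (an assumption of this sketch).
                walk(child, f"{name}.{key}" if name else key)
        else:
            pairs[name] = value

    walk(json.loads(text), "")
    return pairs


raw = ('{"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234",'
       '"HOST":"localhost","FACILITY":"auth","DATE":"Jul 22 12:56:47"}')
for name, value in json_to_pairs(raw).items():
    print(f"{name}={value}")
```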
12
JSON parser
■ Turns JSON based log messages into name-value pairs
■ {"PROGRAM":"prg00000","PRIORITY":"info","PID":"1234","MESSAGE":"seq: 0000000000, thread: 0000, runid: 1374490607, stamp: 2013-07-22T12:56:47 MESSAGE... ","HOST":"localhost","FACILITY":"auth","DATE":"Jul 22 12:56:47"}
13
csv parser
■ csv-parser: parses columnar data into fields
# Split Apache access log lines into named columns
parser p_apache {
    csv-parser(
        columns("APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
                "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
                "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
                "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME")
        flags(escape-double-char,strip-whitespace)
        delimiters(" ")
        quote-pairs('""[]')
    );
};
# Use a parsed column in the file name, with "nouser" as the fallback
destination d_file { file("/var/log/messages-${APACHE.USER_NAME:-nouser}"); };
log { source(s_local); parser(p_apache); destination(d_file); };
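To make the roles of delimiters() and quote-pairs() concrete, here is a rough Python imitation of the splitting step. It is a sketch of the idea only, not the csv-parser implementation: `split_columns` is a hypothetical helper, the sample log line is made up, and escaping (the escape-double-char flag) is not handled.

```python
COLUMNS = [
    "APACHE.CLIENT_IP", "APACHE.IDENT_NAME", "APACHE.USER_NAME",
    "APACHE.TIMESTAMP", "APACHE.REQUEST_URL", "APACHE.REQUEST_STATUS",
    "APACHE.CONTENT_LENGTH", "APACHE.REFERER", "APACHE.USER_AGENT",
    "APACHE.PROCESS_TIME", "APACHE.SERVER_NAME",
]


def split_columns(line, delimiter=" ", quote_pairs='""[]'):
    """Split on the delimiter, keeping text inside a quote pair together."""
    # Map every opening character to its closing character: " -> ", [ -> ]
    closers = {quote_pairs[i]: quote_pairs[i + 1]
               for i in range(0, len(quote_pairs), 2)}
    fields, current, closing = [], [], None
    for char in line:
        if closing is not None:
            if char == closing:
                closing = None          # end of the quoted section
            else:
                current.append(char)    # delimiters inside quotes are kept
        elif char in closers:
            closing = closers[char]     # start of a quoted section
        elif char == delimiter:
            fields.append("".join(current))
            current = []
        else:
            current.append(char)
    fields.append("".join(current))
    return fields


# Hypothetical access log line with eleven columns.
sample = ('192.168.1.1 - frank [10/Oct/2015:13:55:36 +0100] '
          '"GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0" 0 example.com')
record = dict(zip(COLUMNS, split_columns(sample)))
print(record["APACHE.USER_NAME"], record["APACHE.REQUEST_URL"])
```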
14
PatternDB parser
■ PatternDB message parser:
□ Can extract useful information from unstructured messages into name-value
pairs
□ Add status fields based on message text
□ Message classification (like LogCheck)
■ Needs XML describing log messages
■ Example: an ssh login failure:
□ user=root, source_ip=192.168.123.45, action=login, status=failure
□ classified as “violation”
15
Anonymizing messages
■ Many regulations about what can be logged
□ PCI-DSS: credit card numbers
□ Europe: IP addresses, user names
■ Locating sensitive information:
□ Regular expressions: slow, but also work on unknown logs
□ PatternDB: fast, but works only on known log messages
■ Anonymizing:
□ Overwrite it with a constant
□ Overwrite it with a hash of the original
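Both anonymization strategies can be sketched in a few lines of Python. This illustrates the idea rather than syslog-ng's rewrite rules: the IPv4 regular expression, the salt and the 16-character truncation are assumptions made for the example.

```python
import hashlib
import re

# Simple IPv4 matcher; good enough for a demonstration.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def mask_constant(message, replacement="0.0.0.0"):
    """Overwrite every IP address with the same constant."""
    return IPV4.sub(replacement, message)


def mask_hash(message, salt="example-salt"):
    """Overwrite every IP address with a salted hash of the original."""
    def digest(match):
        data = (salt + match.group(0)).encode("utf-8")
        return hashlib.sha256(data).hexdigest()[:16]
    return IPV4.sub(digest, message)


text = "Accepted publickey for root from 127.0.0.1 port 48806 ssh2"
print(mask_constant(text))
print(mask_hash(text))
```

A constant removes the information completely, while a hash keeps messages from the same address correlatable. The salt should be kept secret, because the IPv4 address space is small enough to brute-force an unsalted hash.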
16
Language bindings in syslog-ng
■ The primary language of syslog-ng is C:
□ High performance: processes far more events per second (EPS) than interpreted languages
■ Not everything is implemented in C
■ Rapid prototyping is easier in interpreted languages
■ Python & Java destinations in syslog-ng, Lua & Perl in incubator
□ Embedded interpreter
□ Message or full range of name-value pairs can be passed
□ Proper error handling
17
Java based “Big Data” destinations
■ Most of “Big Data” is written in Java
■ C and Python clients exist, but Java is official and maintained together
with the server component
■ More effort to get started:
□ JARs and build tools (Gradle) are not yet packaged in distributions
□ libjvm.so needs to be added to LD_LIBRARY_PATH
■ https://czanik.blogs.balabit.com/2015/08/getting-started-with-syslog-ng-3-7-1-and-elasticsearch-hadoop-kafka/
18
Configuration
■ “Don't Panic”
■ Simple and logical, even if it looks difficult at first
■ Pipeline model:
□ Many different building blocks (sources, destinations, filters, parsers, etc.)
□ Connected using “log” statements into a pipeline
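The pipeline model can be pictured with a short Python analogy. It is only a mental model: `log_path`, `drop_debug` and `add_length` are hypothetical names, and real syslog-ng log paths offer much more (flags, branching, flow control).

```python
def log_path(source, stages, destination):
    """Pull messages from a source, run them through each stage, deliver the rest."""
    for message in source:
        for stage in stages:
            message = stage(message)
            if message is None:         # a filter dropped the message
                break
        else:
            destination(message)        # reached only if no stage dropped it


def drop_debug(message):
    """Filter: returning None discards the message."""
    return None if message["LEVEL"] == "debug" else message


def add_length(message):
    """Parser-like stage: adds a new name-value pair."""
    return {**message, "LENGTH": len(message["MESSAGE"])}


stored = []
log_path(
    source=[{"LEVEL": "debug", "MESSAGE": "noise"},
            {"LEVEL": "info", "MESSAGE": "login ok"}],
    stages=[drop_debug, add_length],
    destination=stored.append,
)
print(stored)
```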
19
syslog-ng.conf: global options
@version:3.7
@include "scl.conf"
# this is a comment :)
options {
    flush_lines (0);
    # [...]
    keep_hostname (yes);
};
20
syslog-ng.conf: sources
source s_sys {
    system();
    internal();
};
source s_net {
    udp(ip(0.0.0.0) port(514));
};
21
syslog-ng.conf: destinations
destination d_mesg { file("/var/log/messages"); };
destination d_es {
    elasticsearch(
        index("syslog-ng_${YEAR}.${MONTH}.${DAY}")
        type("test")
        cluster("syslog-ng")
        template("$(format-json --scope rfc3164 --scope nv-pairs --exclude R_DATE --key ISODATE)\n");
    );
};
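What this destination produces can be approximated in Python: one index per day, and one JSON document per message terminated by a newline. This is a sketch under assumptions, not the behavior of format-json itself: the `index_name` and `render_document` helpers are hypothetical and the scope options are reduced to a simple exclude list.

```python
import json
from datetime import datetime


def index_name(stamp):
    """Daily index name, mirroring syslog-ng_${YEAR}.${MONTH}.${DAY}."""
    return "syslog-ng_{:%Y.%m.%d}".format(stamp)


def render_document(pairs, exclude=("R_DATE",)):
    """Serialize name-value pairs as one JSON line, skipping excluded keys."""
    kept = {name: value for name, value in pairs.items() if name not in exclude}
    return json.dumps(kept, sort_keys=True) + "\n"


stamp = datetime(2016, 1, 14, 11, 38, 48)
pairs = {
    "HOST": "linux-0jbu",
    "PROGRAM": "sshd",
    "ISODATE": stamp.isoformat(),
    "R_DATE": "Jan 14 11:38:48",
}
print(index_name(stamp))
print(render_document(pairs), end="")
```

Daily indexes keep each index small and make it easy to expire old data by dropping whole indexes.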
22
syslog-ng.conf: filters, parsers
filter f_nodebug { level(info..emerg); };
filter f_messages {
    level(info..emerg) and
    not (facility(mail)
         or facility(authpriv)
         or facility(cron));
};
parser pattern_db {
    db-parser(file("/opt/syslog-ng/etc/patterndb.xml"));
};
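The logic of the two filters, restated as Python predicates for readers less familiar with the filter syntax. The `LEVEL` and `FACILITY` keys and the `level_between` helper are assumptions made for this sketch; the ordering of severity names follows the standard syslog levels.

```python
# Standard syslog severities, most severe first.
LEVELS = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]


def level_between(message, low, high):
    """True if the message level lies in the inclusive range low..high."""
    a, b = LEVELS.index(low), LEVELS.index(high)
    return min(a, b) <= LEVELS.index(message["LEVEL"]) <= max(a, b)


def f_nodebug(message):
    """level(info..emerg): everything except debug."""
    return level_between(message, "info", "emerg")


def f_messages(message):
    """Same level range, minus the mail, authpriv and cron facilities."""
    return f_nodebug(message) and message["FACILITY"] not in ("mail", "authpriv", "cron")


print(f_messages({"LEVEL": "info", "FACILITY": "daemon"}))
print(f_messages({"LEVEL": "err", "FACILITY": "mail"}))
print(f_nodebug({"LEVEL": "debug", "FACILITY": "daemon"}))
```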
23
syslog-ng.conf: logpath
log { source(s_sys); filter(f_messages); destination(d_mesg); };
log {
    source(s_net);
    source(s_sys);
    filter(f_nodebug);
    parser(pattern_db);
    destination(d_es);
    flags(flow-control);
};
24
Patterndb & ElasticSearch & Kibana
25
Kafka
■ Publish-subscribe messaging
■ Data backbone for data driven organizations
□ LinkedIn
□ Spotify
■ Kafka destination is already in syslog-ng
□ Source is planned
26
syslog-ng benefits for Big Data
■ High-performance, reliable log collection
■ Simplified architecture
□ Single application for both syslog and application data
■ Easier to use data
□ Parsed and presented in a ready to use format
■ Lower load on destinations
□ Efficient message filtering and routing
27
Joining the community
■ syslog-ng: http://syslog-ng.org/
■ Source on GitHub: https://github.com/balabit/syslog-ng
■ Mailing list: https://lists.balabit.hu/pipermail/syslog-ng/
■ IRC: #syslog-ng on freenode
■ University students:
□ open trainee positions!
□ syslog-ng universe
28
Questions?
■ Questions?
□ My blog: http://czanik.blogs.balabit.com/
□ My e-mail: peter.czanik@balabit.com
29
End
30
Sample XML
<?xml version='1.0' encoding='UTF-8'?>
<patterndb version='3' pub_date='2010-07-13'>
  <ruleset name='opensshd' id='2448293e-6d1c-412c-a418-a80025639511'>
    <pattern>sshd</pattern>
    <rules>
      <rule provider="patterndb" id="4dd5a329-da83-4876-a431-ddcb59c2858c" class="system">
        <patterns>
          <pattern>Accepted @ESTRING:usracct.authmethod: @for @ESTRING:usracct.username: @from @ESTRING:usracct.device: @port @ESTRING:: @@ANYSTRING:usracct.service@</pattern>
        </patterns>
        <examples>
          <example>
            <test_message program="sshd">Accepted password for bazsi from 127.0.0.1 port 48650 ssh2</test_message>
            <test_values>
              <test_value name="usracct.username">bazsi</test_value>
              <test_value name="usracct.authmethod">password</test_value>
              <test_value name="usracct.device">127.0.0.1</test_value>
              <test_value name="usracct.service">ssh2</test_value>
            </test_values>
          </example>
        </examples>
        <values>
          <value name="usracct.type">login</value>
          <value name="usracct.sessionid">$PID</value>
          <value name="usracct.application">$PROGRAM</value>
          <value name="secevt.verdict">ACCEPT</value>
        </values>
      </rule>
    </rules>
  </ruleset>
</patterndb>
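To see what the pattern in this rule extracts, the Python sketch below translates the two parser types used here, ESTRING (text up to an ending string) and ANYSTRING (the rest of the message), into a regular expression. This is for illustration only: db-parser does not work through regular expressions, the `compile_pattern` and `match_message` helpers are hypothetical, and other parser types and literal `@@` escapes are not supported.

```python
import re

# Matches parser references such as @ESTRING:name:end@ or @ANYSTRING:name@
PARSER_REF = re.compile(r"@(?P<type>[A-Z]+):(?P<name>[^:@]*)(?::(?P<arg>[^@]*))?@")


def compile_pattern(pattern):
    """Translate a pattern into a regex plus the list of capture names."""
    regex, names, position = "", [], 0
    for ref in PARSER_REF.finditer(pattern):
        regex += re.escape(pattern[position:ref.start()])   # literal text
        kind, name, arg = ref.group("type", "name", "arg")
        if kind == "ESTRING":
            regex += "(.*?)" + re.escape(arg or "")          # up to the end string
        elif kind == "ANYSTRING":
            regex += "(.*)"                                  # everything that is left
        else:
            raise ValueError(f"unsupported parser in this sketch: {kind}")
        names.append(name)
        position = ref.end()
    regex += re.escape(pattern[position:])
    return re.compile("^" + regex + "$"), names


def match_message(pattern, message):
    """Return the named values extracted from message, or None if no match."""
    compiled, names = compile_pattern(pattern)
    found = compiled.match(message)
    if found is None:
        return None
    # Unnamed parsers (empty name) match text but store nothing.
    return {name: value for name, value in zip(names, found.groups()) if name}


PATTERN = ("Accepted @ESTRING:usracct.authmethod: @for @ESTRING:usracct.username: @"
           "from @ESTRING:usracct.device: @port @ESTRING:: @@ANYSTRING:usracct.service@")
print(match_message(PATTERN, "Accepted password for bazsi from 127.0.0.1 port 48650 ssh2"))
```

The extracted values agree with the test_values listed in the rule's example section.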

SCaLE 2016 - syslog-ng: From Raw Data to Big Data
