Visualizing HPCC Systems Log Data Using ELK
Rodrigo Pastrana & Miguel Vazquez
2018 HPCC Systems® Community Day, October 9, 2018
Innovation and Reinvention Driving Transformation
Who are we?
• Rodrigo Pastrana, Architect, HPCC Systems
• Miguel Vazquez, Consulting SWE, HPCC Systems
Why visualize HPCC Systems log data?
• Component logs contain a wealth of raw information
• This data can be used for debugging, profiling, billing, accounting, analysis, etc.
• These tasks are difficult to do with raw log data
• Visualizing this data can help you find the needle in the haystack
A visualization is worth a thousand (or more) log entries…
Discussion Topics
• What is ELK (and why)
• HPCC Systems Log Details
• ELK Topology and other Considerations
• Sample ELK Topology
• Sample ESP Transaction Info Processing
• ELK Component Configuration
• Demo Visualization of ESP Transaction data
Elasticsearch, Logstash, Kibana (and Beats) – Elastic Stack
• Powerful, flexible open-source stack for log analytics from Elastic
• Arguably the de facto standard for log processing and analytics
Components
• Beats: Lightweight, single-purpose log data shippers
• Logstash: Ingests data from a number of different sources, then parses, filters, mutates, and stashes it
• Elasticsearch: Search and analytics engine; acts as the data store
• Kibana: Visualization front end
HPCC Systems Component Logs
• Important to understand HPCC Systems log basics!
• Logs are generated for the major HPCC Systems components
  • ESP, ROXIE, THOR, DALI, Sasha, DafileServ, DFUServer, ECLAgent, etc.
• Log files are time-stamped and rolled daily
  • A link to the running (current) log is provided
• Default location: /var/log/HPCCSystems/<componentname>
• Default log message format (can be edited in configuration):
  • SequenceNumber, Date-TimeStamp, ProcessId, ThreadId, QuotedMessage
  • QuotedMessage: contains the actual log message
  • The other fields together uniquely identify a log message instance
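The default column layout listed above can be illustrated with a small parser. This is only a sketch: the sample line is hypothetical (not taken from a real cluster), the timestamp layout is assumed from the Filebeat regex used later in this deck, and the names `LOG_LINE`, `LogEntry`, and `parse_line` are made up for the example.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed default layout: sequence, timestamp, process id, thread id, quoted message.
LOG_LINE = re.compile(
    r'^(?P<sequence>[A-Z0-9-]{8})\s+'
    r'(?P<timestamp>\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}\.\d{3})\s+'
    r'(?P<pid>\d+)\s+(?P<tid>\d+)\s+'
    r'"(?P<message>.*)"\s*$',
    re.DOTALL,  # a quoted message may span several lines
)


class LogEntry(NamedTuple):
    sequence: str
    timestamp: datetime
    pid: int
    tid: int
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one log entry into its five columns, or return None if it does not fit."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    return LogEntry(
        sequence=m["sequence"],
        timestamp=datetime.strptime(m["timestamp"], "%Y-%m-%d %H:%M:%S.%f"),
        pid=int(m["pid"]),
        tid=int(m["tid"]),
        message=m["message"],
    )


# Hypothetical sample entry, for illustration only.
SAMPLE = '0000002A 2018-10-09 14:03:27.512 12345 12367 "TxSummary[activeReqs=4;total=45ms;]"'
entry = parse_line(SAMPLE)
```

Lines that do not follow the assumed layout (for example, a customized log format) simply yield `None` here.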
ELK Implementation - Design Considerations
• Several considerations apply when setting up an ELK system to process log data
• What are we doing with the data?
  • Troubleshooting? Monitoring? Analytics? Reporting?
• Type and volume of data
• Data sensitivity
• Resources required
• Security
• Compliance
• Disaster recovery
• Data backup and others
Our approach (to illustrate the point)
• We will target ESP transactions and ROXIE completed queries
  • ESP logs conveniently provide transaction summary log entries
  • ROXIE provides similar query-complete summary log entries
  • This focuses on the most important information and minimizes the amount of data processed through ELK
  • It also keeps us from exposing sensitive data (PII, SPII, or business data)
• We will target a remote ELK cluster
  • Processing a small subset of the logs
  • No sensitive data!
  • Minimal resource contention with HPCC Systems components
Overall Topology
[Figure: topology diagram; node labels include ROXIE, ROXIE, ESP, …]
Metricbeat - Monitor cluster system health (optional)
Metricbeat - Monitor node system health (optional)
Set up Filebeat to forward HPCC Systems log entries
• Filebeat is primarily responsible for tailing files, filtering, and multi-line stitching
• Filebeat is configured in filebeat.yml
• Declare a Filebeat “prospector” for each log message type to be forwarded
• Set the prospector to target the running component log file
  • For an HPCC Systems ESP component labeled “myesp”:
  • /var/log/HPCCSystems/myesp/esp.log
• Define a custom field “component” set to the component type (ESP, ROXIE, etc.)
  • The “component” field is subsequently used by Logstash
• Declare regex patterns to include or exclude message types
  • include_lines: standard component log columns + “TxSummary …”
• Declare regex patterns to handle multi-line TxSummary messages as a single entity
Filebeat ESP prospector (filebeat.yml)
filebeat.prospectors:
- type: log
  enabled: true
  # Targeting the current log file (default ‘myesp’ ESP log location)
  paths:
    - /var/log/HPCCSystems/myesp/esp.log
  # Custom field declaration
  fields:
    component: esp
  fields_under_root: true
  encoding: plain
  # Regex describes a default log entry whose quoted message starts with "TxSummary"
  include_lines: ['([A-Z-0-9]{8})\s+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3})\s+(\d+)\s+(\d+)\s+("TxSummary)']
  # Rule treats multi-line messages as a single entity
  multiline.pattern: '([A-Z-0-9]{8})\s+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3})\s+(\d+)\s+(\d+)\s+(")'
  multiline.negate: true
  multiline.match: after
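To make the interplay of the multiline settings and include_lines more concrete, here is a rough Python emulation of what the prospector does with incoming lines. It is a sketch rather than Filebeat's actual implementation: the function names `stitch` and `select` and the sample lines are hypothetical, and it assumes that multiline stitching happens before include_lines filtering.

```python
import re
from typing import Iterable, List

# Same column prefix as the prospector's regexes: sequence, timestamp, pid, tid.
HEADER = r'([A-Z-0-9]{8})\s+(\d{4}-\d{2}-\d{2}\s\d{2}:\d{2}:\d{2}.\d{3})\s+(\d+)\s+(\d+)\s+'
ENTRY_START = re.compile(HEADER + r'(")')          # plays the role of multiline.pattern
TX_SUMMARY = re.compile(HEADER + r'("TxSummary)')  # plays the role of include_lines


def stitch(lines: Iterable[str]) -> List[str]:
    """negate: true + match: after -> a line that does NOT match the pattern
    is appended to the preceding line that did match."""
    events: List[str] = []
    for line in lines:
        if ENTRY_START.search(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


def select(events: Iterable[str]) -> List[str]:
    """Keep only the stitched events that look like TxSummary entries."""
    return [event for event in events if TX_SUMMARY.search(event)]


# Hypothetical raw log lines; the first entry is split across two physical lines.
RAW = [
    '00000001 2018-10-09 14:03:27.512 12345 12367 "TxSummary[activeReqs=4;rcv=1ms;',
    'total=45ms;]"',
    '00000002 2018-10-09 14:03:28.001 12345 12367 "Some other message"',
]
events = stitch(RAW)
forwarded = select(events)
```

With these inputs the two physical lines of the TxSummary entry end up as one event, and the unrelated message is dropped before anything is forwarded.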
filebeat.yml details continued…
• Prospector definitions for other components are very similar
  • Target the appropriate component log file
  • Create the custom “component” field and populate it with the appropriate label
  • Use regex to include or exclude entry types of interest
• Configure Filebeat to forward the filtered log entries to a remote Logstash instance
  • Logstash is assumed to listen for these messages on a particular address and port (X.Y.Z.W:5044 in this example):
#----------------------------- Logstash output --------------------
output.logstash:
  # The Logstash hosts
  hosts: ["X.Y.Z.W:5044"]
Process HPCC Systems log messages through Logstash
• Capture HPCC Systems log messages
• Parse
• Structure
• Filter
• Mutate
• Create and populate Elasticsearch indexes
Logstash input setup (Logstash.conf)
• Set up the Logstash input mechanism
• Many options (stdin, s3, redis, pipe, file, log4j, kafka, etc.)
• Let’s enable input for Filebeat messages via port 5044
input {
  beats {
    port => 5044
  }
}
Set up Logstash to process ESP transactions (Logstash.conf)
filter {
  # Rule for ESP-based messages
  if [component] == "esp"
  {
    # Capture messages with the known log format: the expected sequence of
    # space-delimited fields, each written as %{fieldtype:fieldname}
    grok { match => { "message" => "%{BASE16NUM:sequence} %{TIMESTAMP_ISO8601:logtimestamp} +%{INT:processID} +%{INT:threadID} %{QUOTEDSTRING:logmessage}" } }
    # Parse the string field "logmessage" into key-value pairs
    kv { source => "logmessage" field_split => "[;" value_split => "=" }
    # (the filter block continues on the next slide)
Example "logmessage" value:
"TxSummary[activeReqs=4;rcv=1ms;user=ausr@127.0.0.1;req=POST wssmc.ACTIVITY v1.2;total=45ms;]"
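The grok and kv steps can be approximated in plain Python to show which fields come out of an ESP transaction entry. This is a simplified sketch, not the Logstash implementation: the regex only approximates the named grok patterns, the `kv` function mimics the field_split and value_split settings in a naive way, and the column prefix of the sample line is hypothetical (only the quoted TxSummary message comes from the slide).

```python
import re
from typing import Dict, Optional

# Rough stand-in for the grok expression (BASE16NUM, TIMESTAMP_ISO8601, INT, QUOTEDSTRING).
GROK = re.compile(
    r'(?P<sequence>[0-9A-Fa-f]+) '
    r'(?P<logtimestamp>\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?)'
    r' +(?P<processID>\d+) +(?P<threadID>\d+) '
    r'(?P<logmessage>"[^"]*")'
)


def kv(text: str, field_split: str = "[;", value_split: str = "=") -> Dict[str, str]:
    """Split text on any field_split character, then split each piece once on value_split."""
    splitter = "[" + re.escape(field_split) + "]"
    pairs: Dict[str, str] = {}
    for token in re.split(splitter, text):
        key, sep, value = token.partition(value_split)
        if sep and key:
            pairs[key.strip()] = value
    return pairs


def parse_esp(line: str) -> Optional[Dict[str, str]]:
    """Return the grok fields plus the key-value pairs found in logmessage."""
    m = GROK.match(line)
    if m is None:
        return None
    event = m.groupdict()
    event.update(kv(event["logmessage"]))
    return event


# Hypothetical column prefix followed by the TxSummary message shown on the slide.
LINE = ('0000002A 2018-10-09 14:03:27.512 12345 12367 '
        '"TxSummary[activeReqs=4;rcv=1ms;user=ausr@127.0.0.1;'
        'req=POST wssmc.ACTIVITY v1.2;total=45ms;]"')
event = parse_esp(LINE)
```

Pieces without a `=` (the leading `"TxSummary` and the trailing `]"`) produce no field, so only the five transaction attributes are added to the event.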
Set up Logstash to process ESP transactions (continued…) (Logstash.conf)
    mutate {
      # Important to filter out noise
      remove_field => [ "message", "@version" ]
      # Assign meaningful field names
      rename => {
        "total" => "TotalTrxmS"
        "rcv" => "TimeReceived"
      }
      # Assign field types for aggregation purposes
      convert => {
        "threadID" => "integer"
        "processID" => "integer"
      }
    }
  }
  else if [component] == "roxie" { … }  # create similar rules for other log message types
}
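The effect of the mutate step on a single event can be sketched as a plain dictionary transformation. This is an illustration only: the function name `mutate` and the sample event are hypothetical, and the order of operations here is chosen for readability and is not claimed to match the order Logstash applies internally.

```python
from typing import Any, Dict


def mutate(event: Dict[str, Any]) -> Dict[str, Any]:
    """Apply the remove_field, rename, and convert settings from the slide to one event."""
    out = dict(event)  # leave the caller's event untouched

    # remove_field: drop the raw line and bookkeeping fields
    for name in ("message", "@version"):
        out.pop(name, None)

    # rename: give the kv-extracted fields more meaningful names
    for old, new in {"total": "TotalTrxmS", "rcv": "TimeReceived"}.items():
        if old in out:
            out[new] = out.pop(old)

    # convert: grok captures are strings, so make the ids integers
    for name in ("threadID", "processID"):
        if name in out:
            out[name] = int(out[name])

    return out


# Hypothetical event as it might look after the grok and kv steps.
before = {
    "message": "raw line",
    "@version": "1",
    "processID": "12345",
    "threadID": "12367",
    "total": "45ms",
    "rcv": "1ms",
}
after = mutate(before)
```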
Set up Logstash to process ESP transactions (output to ES) (Logstash.conf)
# Define the Logstash output rules
output {
  if [component] == "esp" {
    # Forward processed messages to ES
    elasticsearch {
      hosts => ["yourelasticsearchaddress:9200"]
      # Important to establish an appropriate indexing mechanism
      index => "esp-log-%{+YYYY.MM.dd}"
    }
  }
  else if [component] == "roxie" { … }  # similar rules for ROXIE-based messages
}
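The index setting above produces one index per day. As a sketch of how that name is derived, the helper below formats a timestamp the way the `%{+YYYY.MM.dd}` reference is understood to work, that is, from the event's `@timestamp` expressed in UTC. The function name `daily_index` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def daily_index(prefix: str, event_time: datetime) -> str:
    """Build a per-day index name such as esp-log-2018.10.09 from an aware timestamp."""
    utc = event_time.astimezone(timezone.utc)
    return f"{prefix}-{utc:%Y.%m.%d}"


# An event stamped in UTC lands in that day's index.
name_utc = daily_index("esp-log", datetime(2018, 10, 9, 14, 3, 27, tzinfo=timezone.utc))

# An event late in the evening at UTC-5 already belongs to the next UTC day.
eastern = timezone(timedelta(hours=-5))
name_late = daily_index("esp-log", datetime(2018, 10, 9, 23, 30, tzinfo=eastern))
```

Daily indexes keep each index small and make it straightforward to drop or archive old data by date.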
Confirm Elasticsearch indexes are created
http://yourkibanaip:5601/app/kibana#/dev_tools/console
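One way to confirm the indexes exist is the Elasticsearch `_cat/indices` API, which can be called from the Kibana console (for example `GET _cat/indices/esp-log-*?v`) or over HTTP. The sketch below only builds the request URL and reads index names out of a JSON response; it makes no network call, the helper names are hypothetical, and the sample response is invented for illustration rather than captured from a real cluster.

```python
import json
from typing import List
from urllib.parse import quote


def cat_indices_url(es_host: str, pattern: str) -> str:
    """URL that lists the indexes matching a pattern, in JSON form."""
    return f"http://{es_host}/_cat/indices/{quote(pattern, safe='*')}?format=json"


def index_names(cat_response: str) -> List[str]:
    """Extract the sorted index names from a _cat/indices JSON response body."""
    return sorted(row["index"] for row in json.loads(cat_response))


url = cat_indices_url("yourelasticsearchaddress:9200", "esp-log-*")

# Hypothetical response body, shown only to illustrate the shape of the data.
SAMPLE_RESPONSE = """[
  {"health": "yellow", "status": "open", "index": "esp-log-2018.10.09", "docs.count": "42"},
  {"health": "yellow", "status": "open", "index": "esp-log-2018.10.08", "docs.count": "17"}
]"""
names = index_names(SAMPLE_RESPONSE)
```

An empty list would suggest that no ESP events have reached Elasticsearch yet, in which case the Filebeat and Logstash logs are the next place to look.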
Let’s create some visualizations
Discover your newly created log events
Some of Kibana’s visualization toolset
Visualization creation
• Visualize > > Select a visualization type > Enter some metrics > Save
Deeper dive into what our users are doing
Visualizations using a search query
Let’s tie it all together with a Dashboard
• Dashboard > > > Save
Available in 7.0
• Ability to embed into the ECL Watch U/I
Questions?
Useful Links
• https://hpccsystems.com/blog/ELK_visualizations
• https://www.elastic.co/guide/index.html
• https://github.com/rpastrana/hpcc-elk
• http://cdn.hpccsystems.com/releases/CE-Candidate-7.0.0/docs/EN_US/HPCCSystemAdministratorsGuide_EN_US-7.0.0-rc2.pdf#page=26
Contact Us
• Rodrigo.Pastrana@lexisnexisrisk.com
• Miguel.Vazquez@lexisnexisrisk.com
Thank you
