Is Your Data in an Exotic Format Stored in Kafka?
Let's write a Telegraf Plugin! The Case of Avro.
16 June, 2020
Webinar
Emanuele Falzone, Ph.D. Student @ Politecnico di Milano
Emanuele Falzone
Ph.D. Student @ Politecnico di Milano
❤ Free and Open Source Software
Contacts
- emanuele.falzone@polimi.it
- emanuelefalzone.com
- github.com/emanuele-falzone
Outline
- Introduction
- Data Lifecycle
- Use Cases
  - Customer Relationship Management System
    - Apache Avro
    - Telegraf
  - Bank transactions
- Conclusions
Disclaimer
This webinar assumes a basic knowledge of Kafka and InfluxDB.
Emanuele Falzone - InfluxData Webinar - 16 June, 2020 3
Data Lifecycle
ingestion, storage, analysis, visualization
Data Lifecycle
ingestion, storage, analysis, visualization
custom application
Architecture
[Diagram: custom application, without telegraf vs. with telegraf]
Use Case
Customer Relationship Management System
- Available for different platforms: web, iOS, Android
- Different club statuses: bronze, silver, gold, platinum
- Customers continuously provide ratings
Code available at https://github.com/emanuele-falzone/influxdata-webinar-crm-ratings-use-case
Customer Relationship Management
ratings + customer data -> RATINGS-WITH-CUSTOMER-DATA (avro)
{
"rating_id": 5313,
"user_id": 3,
"stars": 5,
"route_id": 6975,
"rating_time": 1519304105213,
"channel": "web",
"message": "awesome!"
}
{
"id": 3,
"first_name": "Merilyn",
"last_name": "Doughartie",
"email": "mdoughartie1@dedecms.com",
"gender": "Female",
"club_status": "platinium",
"comments": "none"
}
{
"stars": 5,
"rating_time": 1519304105213,
"channel": "web",
"gender": "Female",
"club_status": "platinium"
}
Apache Avro
- binary format
- self-describing, with the schema embedded in the data itself
01000101 10110011
int 42
- given the schema, you can automatically generate the encoder/decoder
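To make "binary format" concrete, here is a minimal sketch of how the Avro specification encodes `int` and `long` values: zigzag coding followed by a variable-length encoding of 7 bits per byte. The bit pattern on the slide reads as illustrative; under these rules the value 42 takes a single byte, `0x54`. The function names are assumptions made for this example and are not taken from the webinar code.

```python
from typing import Tuple


def zigzag(n: int) -> int:
    """Map a signed 64-bit integer to an unsigned one (0, -1, 1, -2 -> 0, 1, 2, 3)."""
    return (n << 1) ^ (n >> 63)


def encode_long(n: int) -> bytes:
    """Encode an Avro int/long: zigzag, then 7 bits per byte, lowest group first."""
    value = zigzag(n)
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # high bit set: more bytes follow
        value >>= 7
    out.append(value)
    return bytes(out)


def decode_long(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one Avro int/long starting at pos; return (value, next position)."""
    raw = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (raw >> 1) ^ -(raw & 1), pos  # undo the zigzag mapping


print(encode_long(42).hex())  # 54
```

Small magnitudes, positive or negative, stay short on the wire, which is why the encoded bytes cannot be interpreted without knowing from the schema which type comes next.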
Apache Avro
{
"type": "record",
"name": "KsqlDataSourceSchema",
"namespace": "io.confluent.ksql.avro_schemas",
"fields": [
{
"name": "RATING_TIME",
"type": [
"null",
"long"
],
"default": null
},
{
"name": "STARS",
"type": [
"null",
"int"
],
"default": null
},
. . .
]
}
{
"RATING_TIME": {
"long": 1591557033850
},
"CHANNEL": {
"string": "iOS-test"
},
"STARS": {
"int": 2
},
"CLUB_STATUS": {
"string": "gold"
},
"GENDER": {
"string": "Female"
}
}
schema
value
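The value above uses Avro's JSON encoding, in which every non-null union value is wrapped in an object keyed by its branch type, for example `{"long": 1591557033850}`. The sketch below unwraps such a record into a flat dictionary. It guesses the branch from the shape of the value, which is an assumption made to keep the example short; a real decoder would consult the schema. The names `unwrap_union` and `flatten_record` are hypothetical.

```python
import json

# Primitive branch names that may appear as the single key of a wrapped union value.
PRIMITIVE_BRANCHES = {"boolean", "int", "long", "float", "double", "string", "bytes"}


def unwrap_union(value):
    """Return the inner value of {"<type>": inner}; leave anything else untouched."""
    if isinstance(value, dict) and len(value) == 1:
        (branch, inner), = value.items()
        if branch in PRIMITIVE_BRANCHES:
            return inner
    return value


def flatten_record(record: dict) -> dict:
    """Unwrap every top-level field of a JSON-encoded Avro record."""
    return {name: unwrap_union(value) for name, value in record.items()}


value = json.loads("""
{
  "RATING_TIME": {"long": 1591557033850},
  "CHANNEL": {"string": "iOS-test"},
  "STARS": {"int": 2},
  "CLUB_STATUS": {"string": "gold"},
  "GENDER": {"string": "Female"}
}
""")
print(flatten_record(value))
```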
{
"type": "record",
"name": "KsqlDataSourceSchema",
"namespace": "io.confluent.ksql.avro_schemas",
"fields": [
{
"name": "TIMESTAMP",
"type": [
"null",
"long"
],
"default": null
},
{
"name": "STARS",
"type": [
"null",
"int"
],
"default": null
},
. . .
]
}
{
"TIMESTAMP": {
"long": 1591557033850
},
"RATING_ID": {
"long": 12
},
"CHANNEL": {
"string": "iOS-test"
},
"STARS": {
"int": 2
},
"MESSAGE": {
"string": "meh"
},
"CLUB_STATUS": {
"string": "gold"
},
. . .
}
schema
value
encode: schema + value -> 0xA0 0xB1 0x47 0x06 0x90 . . .
decode: schema + 0xA0 0xB1 0x47 0x06 0x90 . . . -> value
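The encode/decode step can be sketched for a small subset of Avro: a record whose fields are all unions of `null` with `int`, `long`, or `string`. Following the Avro specification, each field is written in schema order as the union branch index followed by the value, and a string is its byte length followed by UTF-8 bytes. The schema below is a shortened, assumed version of the one on the slide (the `CHANNEL` field type is a guess), and the bytes it produces are not the illustrative ones shown above.

```python
from typing import Any, Dict, Tuple

SCHEMA = {
    "type": "record",
    "name": "KsqlDataSourceSchema",
    "fields": [
        {"name": "TIMESTAMP", "type": ["null", "long"], "default": None},
        {"name": "STARS", "type": ["null", "int"], "default": None},
        {"name": "CHANNEL", "type": ["null", "string"], "default": None},
    ],
}


def write_long(n: int) -> bytes:
    """Zigzag + variable-length encoding used for Avro int and long."""
    value = (n << 1) ^ (n >> 63)
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def read_long(data: bytes, pos: int) -> Tuple[int, int]:
    """Inverse of write_long; returns (value, next position)."""
    raw, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        raw |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return (raw >> 1) ^ -(raw & 1), pos
        shift += 7


def encode(schema: Dict[str, Any], record: Dict[str, Any]) -> bytes:
    out = bytearray()
    for field in schema["fields"]:
        branches = field["type"]
        value = record.get(field["name"])
        if value is None:
            out += write_long(branches.index("null"))
            continue
        index = next(i for i, b in enumerate(branches) if b != "null")
        out += write_long(index)  # which union branch follows
        if branches[index] in ("int", "long"):
            out += write_long(value)
        elif branches[index] == "string":
            raw = value.encode("utf-8")
            out += write_long(len(raw)) + raw
        else:
            raise ValueError("unsupported type: %s" % branches[index])
    return bytes(out)


def decode(schema: Dict[str, Any], data: bytes) -> Dict[str, Any]:
    record, pos = {}, 0
    for field in schema["fields"]:
        index, pos = read_long(data, pos)
        branch = field["type"][index]
        if branch == "null":
            value = None
        elif branch in ("int", "long"):
            value, pos = read_long(data, pos)
        elif branch == "string":
            length, pos = read_long(data, pos)
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError("unsupported type: %s" % branch)
        record[field["name"]] = value
    return record


rating = {"TIMESTAMP": 1591557033850, "STARS": 2, "CHANNEL": "iOS-test"}
assert decode(SCHEMA, encode(SCHEMA, rating)) == rating
```

The encoded bytes carry no field names or type tags, so decoding only works when the reader holds the same schema as the writer.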
Apache Avro messages in Kafka
n bytes: Schema | m bytes: Binary Message
0x20 0x45 0x37 0x74 0x61 . . . | 0xA0 0xB1 0x47 0x06 0x90 . . .
0x20 0x45 0x37 0x74 0x61 . . . | 0xA1 0xE1 0x60 0x00 0x00 . . .
0x20 0x45 0x37 0x74 0x61 . . . | 0xA3 0xB5 0x00 0x32 0x54 . . .
Schema registry
id: 1
schema:
{
"type": "record",
"name": "KsqlDataSourceSchema",
"namespace": "io.confluent.ksql.avro_schemas",
"fields": [
{
"name": "TIMESTAMP",
"type": [
"null",
"long"
],
"default": null
},
{
"name": "STARS",
"type": [
"null",
"int"
],
"default": null
},
. . .
]
}
GET /schemas/ids/%d (returns application/json)
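A minimal sketch of the lookup, assuming the Confluent Schema Registry response shape in which the schema arrives as a JSON-encoded string under the `schema` key. The HTTP call is injected as a `fetch` callable so the example runs without a network; `SchemaCache` and the helper names are hypothetical and do not come from the webinar code.

```python
import json
from typing import Callable, Dict


def schema_url(registry: str, schema_id: int) -> str:
    """Build the lookup URL from the registry base URL and a schema ID."""
    return "%s/schemas/ids/%d" % (registry.rstrip("/"), schema_id)


def parse_schema_response(body: str) -> dict:
    """Unwrap {"schema": "<schema as a JSON string>"} into the schema itself."""
    envelope = json.loads(body)
    return json.loads(envelope["schema"])


class SchemaCache:
    """Fetch each schema ID at most once; schemas are immutable once registered."""

    def __init__(self, registry: str, fetch: Callable[[str], str]):
        self._registry = registry
        self._fetch = fetch  # callable: URL -> response body
        self._schemas: Dict[int, dict] = {}

    def get(self, schema_id: int) -> dict:
        if schema_id not in self._schemas:
            body = self._fetch(schema_url(self._registry, schema_id))
            self._schemas[schema_id] = parse_schema_response(body)
        return self._schemas[schema_id]
```

Caching by ID matters for a consumer because the same schema ID typically appears on every message of a topic.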
Kafka + schema registry
1 byte: Magic Byte | 4 bytes: Schema ID | m bytes: Binary Message
0x00 | 0x00 0x00 0x00 0x01 | 0xA0 0xB1 0x47 0x06 0x90 . . .
0x00 | 0x00 0x00 0x00 0x01 | 0xA1 0xE1 0x60 0x00 0x00 . . .
0x00 | 0x00 0x00 0x00 0x01 | 0xA3 0xB5 0x00 0x32 0x54 . . .
Schema ID -> schema registry
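The framing on this slide can be split with a few lines of code: one magic byte (zero), a 4-byte big-endian schema ID, then the Avro payload. This is a sketch under that layout; the function name `split_message` is an assumption for this example.

```python
import struct
from typing import Tuple

MAGIC_BYTE = 0
HEADER = struct.Struct(">bI")  # 1 byte magic + 4-byte big-endian schema ID


def split_message(message: bytes) -> Tuple[int, bytes]:
    """Return (schema_id, avro_payload) for a registry-framed Kafka message."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than the 5-byte header")
    magic, schema_id = HEADER.unpack_from(message)
    if magic != MAGIC_BYTE:
        raise ValueError("unexpected magic byte: %d" % magic)
    return schema_id, message[HEADER.size:]


schema_id, payload = split_message(
    bytes([0x00, 0x00, 0x00, 0x00, 0x01, 0xA0, 0xB1, 0x47, 0x06, 0x90])
)
print(schema_id, payload.hex())  # 1 a0b1470690
```

Compared with shipping the whole schema in every message, the header costs a fixed 5 bytes.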
Architecture
[Diagram: kafka and schema registry]
Telegraf
- the most used tool to feed data to InfluxDB
- written in Go
- Open Source!
- plugin architecture
  - more than 200 plugins
  - documentation and examples
- easy to deploy
  - docker container with config file
Plugin architecture
[Diagram, inside the telegraf boundary: input plugins -> bytes/string -> parser plugins -> metric]
Telegraf metric
Telegraf metrics are closely based on the InfluxDB data model and contain four main components:
- Measurement name: Description and namespace for the metric.
- Tags: Key/value string pairs, usually used to identify the metric.
- Fields: Key/value pairs that are typed and usually contain the metric data.
- Timestamp: Date and time associated with the fields.
This metric type exists only in memory and must be converted to a concrete representation in order to be transmitted or viewed.
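As an illustration of the four components and of one "concrete representation", the sketch below models a metric as a small data class and renders it as InfluxDB line protocol. It is a simplification: escaping of spaces and commas in names and tags is omitted, and the timestamp is written as given, so the write precision has to match its unit. `Metric` here is a hypothetical Python stand-in, not Telegraf's Go type.

```python
from dataclasses import dataclass, field
from typing import Dict, Union

FieldValue = Union[int, float, bool, str]


@dataclass
class Metric:
    name: str                                   # measurement name
    tags: Dict[str, str] = field(default_factory=dict)
    fields: Dict[str, FieldValue] = field(default_factory=dict)
    timestamp: int = 0


def format_field(value: FieldValue) -> str:
    """Render a typed field value in line protocol syntax."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return "%di" % value     # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    return '"%s"' % value.replace("\\", "\\\\").replace('"', '\\"')


def to_line_protocol(metric: Metric) -> str:
    tags = "".join(",%s=%s" % (k, v) for k, v in sorted(metric.tags.items()))
    values = ",".join(
        "%s=%s" % (k, format_field(v)) for k, v in sorted(metric.fields.items())
    )
    return "%s%s %s %d" % (metric.name, tags, values, metric.timestamp)


rating = Metric(
    name="ratings",
    tags={"CHANNEL": "iOS-test", "CLUB_STATUS": "gold", "GENDER": "Female"},
    fields={"STARS": 2},
    timestamp=1591557033850,
)
print(to_line_protocol(rating))
# ratings,CHANNEL=iOS-test,CLUB_STATUS=gold,GENDER=Female STARS=2i 1591557033850
```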
Plugin architecture
[Diagram, inside the telegraf boundary: input plugins -> parser plugins -> metric -> serializer plugins -> output plugins]
Note: there are also processor and aggregator plugins, but we are not using them.
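The data flow in the diagram can be sketched as four pluggable callables chained together. This is a conceptual illustration only: Telegraf's real plugin interfaces are Go interfaces with more responsibilities (configuration, batching, error handling), and every name below is hypothetical.

```python
import json
from typing import Callable, Dict, Iterable, List

Metric = Dict[str, object]


def run_once(
    read_input: Callable[[], Iterable[bytes]],   # input plugin: yields raw bytes
    parse: Callable[[bytes], List[Metric]],      # parser plugin: bytes -> metrics
    serialize: Callable[[Metric], str],          # serializer plugin: metric -> text
    write_output: Callable[[str], None],         # output plugin: ships the text
) -> None:
    """Push every raw message through parser, serializer, and output."""
    for raw in read_input():
        for metric in parse(raw):
            write_output(serialize(metric))


def json_parser(raw: bytes) -> List[Metric]:
    """Toy parser: treat a JSON object as the fields of a single metric."""
    return [{"name": "example", "tags": {}, "fields": json.loads(raw), "timestamp": 0}]


sent: List[str] = []
run_once(
    read_input=lambda: [b'{"value": 1}'],
    parse=json_parser,
    serialize=lambda metric: json.dumps(metric, sort_keys=True),
    write_output=sent.append,
)
print(sent)
```

Swapping `json_parser` for another callable changes the accepted format without touching the input or output side, which is the property the parser plugin slot provides.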
Plugins
input plugins:
file, kafka, influx, mqtt, http, socket, event-hub,
kinesis, cloud-pubsub, mysql, mongodb,
couchdb, postgresql, . . .
parser plugins:
json, csv, form-urlencoded, influx, wavefront, . . .
serializer plugins = parser plugins
output plugins = input plugins
telegraf.conf
# Input plugin
[[inputs.file]]
  files = ["/tmp/sample.csv"]
  # Parser plugin
  data_format = "csv"
  csv_header_row_count = 1
  csv_delimiter = ","
  csv_measurement_column = "measurement"
  csv_timestamp_column = "time"
# Output plugin
[[outputs.influxdb_v2]]
  urls = ["http://influxdb:9999"]
  token = "d2VsY29tZQ=="
  organization = "polimi"
  bucket = "bucket"
  # Serializer plugin
  data_format = "influx"
Plugin architecture
[Diagram, inside the telegraf boundary: kafka consumer input -> avro parser (missing!!!) -> metric -> influx serializer -> influxdb-v2 output]
From Apache Avro to Telegraf metric
{
. . .
"fields": [
{
"name": "RATING_TIME",
. . .
},
{
"name": "CHANNEL",
. . .
},
{
"name": "STARS",
. . .
},
{
"name": "CLUB_STATUS",
. . .
},
{
"name": "GENDER",
. . .
}
]
}
RATING_TIME -> TIMESTAMP
CHANNEL -> TAG
STARS -> FIELD
CLUB_STATUS -> TAG
GENDER -> TAG
MEASUREMENT = ratings
telegraf.conf
# Input plugin
[[inputs.kafka_consumer]]
  brokers = ["kafka:29092"]
  topics = ["RATINGS_WITH_CUSTOMER_DATA"]
  # Parser plugin
  data_format = "avro"
  avro_measurement = "ratings"
  avro_tags = ["CHANNEL", "CLUB_STATUS", "GENDER"]
  avro_fields = ["STARS"]
  avro_timestamp = "RATING_TIME"
  avro_timestamp_format = "unix_ms"
  avro_schema_registry = "http://schema-registry:8081"
# Output plugin
[[outputs.influxdb_v2]]
  urls = ["http://influxdb:9999"]
  token = "d2VsY29tZQ=="
  organization = "polimi"
  bucket = "bucket"
  # Serializer plugin
  data_format = "influx"
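How the parser options above might drive the mapping can be sketched as follows, assuming the Avro payload has already been decoded into a flat dictionary. The parameters mirror the `avro_*` options from the configuration; the function itself, the error handling, and the timestamp formats other than `unix_ms` are assumptions made for this example rather than the plugin's actual behavior.

```python
from typing import Any, Dict, List

# Multipliers that bring each supported unit to nanoseconds (assumed set).
TO_NANOSECONDS = {"unix": 10**9, "unix_ms": 10**6, "unix_us": 10**3, "unix_ns": 1}


def record_to_metric(
    record: Dict[str, Any],
    measurement: str,
    tags: List[str],
    fields: List[str],
    timestamp: str,
    timestamp_format: str = "unix_ms",
) -> Dict[str, Any]:
    """Select tags, fields, and the timestamp from a decoded Avro record."""
    if timestamp_format not in TO_NANOSECONDS:
        raise ValueError("unsupported timestamp format: %s" % timestamp_format)
    missing = [key for key in tags + fields + [timestamp] if record.get(key) is None]
    if missing:
        raise KeyError("record lacks required keys: %s" % ", ".join(missing))
    return {
        "name": measurement,
        "tags": {key: str(record[key]) for key in tags},   # tags are always strings
        "fields": {key: record[key] for key in fields},    # fields keep their type
        "timestamp_ns": int(record[timestamp]) * TO_NANOSECONDS[timestamp_format],
    }


decoded = {
    "RATING_TIME": 1591557033850,
    "CHANNEL": "iOS-test",
    "STARS": 2,
    "CLUB_STATUS": "gold",
    "GENDER": "Female",
}
metric = record_to_metric(
    decoded,
    measurement="ratings",
    tags=["CHANNEL", "CLUB_STATUS", "GENDER"],
    fields=["STARS"],
    timestamp="RATING_TIME",
    timestamp_format="unix_ms",
)
print(metric)
```

Record keys that are not listed as a tag, a field, or the timestamp are simply dropped in this sketch.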
How can I make Telegraf understand Avro messages?
Code
- dependencies: Gopkg.lock, Gopkg.toml
- parser logic: plugins/parsers/avro/parser.go, plugins/parsers/avro/schema_registry.go
- tests: plugins/parsers/avro/parser_test.go
- parser init: plugins/parsers/registry.go
- telegraf.conf: internal/config/config.go
- documentation: plugins/parsers/avro/README.md
Code available at https://github.com/emanuele-falzone/telegraf/tree/avro/plugins/parsers/avro
Talk is cheap.
Show me the code.
~ Linus Torvalds ~
Open Source Contribution
Can you help John?
John works in a bank and wants to visualize the average amount of transactions over time using InfluxDB.
Here is a sample transaction: transaction,sender=Alice,receiver=Bob amount=28.00 1591644525345
However, messages are sent to Kafka encrypted with the public key of the bank.
Extend Telegraf to decrypt every transaction and upload the corresponding measurement to InfluxDB.
Code available at https://github.com/emanuele-falzone/influxdata-webinar-bank-transactions-use-case
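One possible shape for the exercise is sketched below: decrypt each message, then parse the plaintext, which in the sample is InfluxDB line protocol. Decryption is injected as a callable because it depends on the bank's private key and on a crypto library the slides do not name; the stand-in used in the usage lines merely reverses bytes and is not encryption. The parser is simplified (no escaping, float fields only), and all names are hypothetical.

```python
from typing import Callable, Dict, Tuple

ParsedLine = Tuple[str, Dict[str, str], Dict[str, float], int]


def parse_line(line: str) -> ParsedLine:
    """Parse 'name,tag=v,... field=v,... timestamp' without escape handling."""
    head, field_part, timestamp = line.strip().split(" ")
    name, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = {}
    for pair in field_part.split(","):
        key, value = pair.split("=", 1)
        fields[key] = float(value)
    return name, tags, fields, int(timestamp)


def handle_message(ciphertext: bytes, decrypt: Callable[[bytes], bytes]) -> ParsedLine:
    """Decrypt one Kafka message and parse the resulting line."""
    return parse_line(decrypt(ciphertext).decode("utf-8"))


# Stand-in for real decryption, used only to exercise the flow.
def fake_decrypt(data: bytes) -> bytes:
    return data[::-1]


sample = b"transaction,sender=Alice,receiver=Bob amount=28.00 1591644525345"
print(handle_message(sample[::-1], fake_decrypt))
```

Computing the average over time is then left to InfluxDB once the `amount` field is stored.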
Conclusion
- InfluxDB and Telegraf in the Data Lifecycle
- Use Cases
  - Customer Relationship Management System
    - Apache Avro
  - Bank Transactions
    - Encryption
Questions?
Contacts
- emanuele.falzone@polimi.it
- emanuelefalzone.com
- github.com/emanuele-falzone
We look forward to bringing together our community of
developers in this new format to learn, interact, and share
tips and use cases.
23-24 June, 2020
Virtual Experience
www.influxdays.com

InfluxData Webinar 16 June, 2020 - How to Create a Telegraf Parser Plugin for Data Stored in Kafka
