From the course: Explore Time Series Data: Ingest and Collect with Telegraf and InfluxDB

What is Telegraf and why use it?

- Telegraf is InfluxData's open source data collection agent. In this chapter, I'll give you an overview of what Telegraf is and what makes it the preferred data collection solution for many users of time series data. Data collection is the first step, and often the biggest obstacle, when using a database, because you can't start querying and visualizing your data until it's written into your database. Telegraf aims to make this process as easy as possible, addressing many of the challenges users encounter while collecting time series data at high volumes or in other complex data production scenarios.

Push and pull mechanisms are the two main ways to get time series data from your data sources into your database. With a push mechanism, data read from your data sources is periodically sent out to your database. With a pull mechanism, a central collector that sits alongside your database periodically requests metrics from the data sources and pulls them in. This is often done by writing scripts or by using whatever scrapers a database platform provides. Instead of coding and troubleshooting long, complex scripts, Telegraf is a no-code solution out of the box. It does require a basic understanding of how to use your terminal and how to set up configuration files, but you don't need to know any programming language to use it as is.

In our previous lesson on the benefits of data collection at the edge, I reviewed ingesting time series data from cargo ships. Sensors on a ship are often in the middle of the ocean, where the data connection is poor and the data cannot stream consistently into your database. In this situation you may have to worry about losing data due to the lack of network connectivity, or about bursts of data overwhelming your system once connectivity returns. Telegraf includes features to handle situations like this: a robust scheduler, an in-memory metric buffer, and full streaming support to make sure you don't lose data during ingestion.

Scripting the scraping of data from a few of these ships can be doable, but what happens when you start collecting data from thousands of ships that are sending millions of metrics a second, at different rates and often at unexpected times? Telegraf is built for this kind of high-speed ingestion, able to ingest data down to the microsecond level from thousands of devices.

Another problem that often arises is that information in its rawest form, straight from the source, can be messy. Telegraf offers flexible processing options to turn the messiest data into consistent, clean data as it comes into your database, on whatever interval you choose. You may also find that you don't need every metric from your ships, yet without filtering you would end up writing all the data from your devices into your database, even data you never use. Telegraf lets you easily drop metrics you don't want, or even route certain metrics to one database and the rest to another.

On top of all of that, Telegraf is a plugin-driven agent that collects metrics. It is created and maintained by InfluxData, the creators of InfluxDB. It is 100% open source, and that is a major aspect of Telegraf, from its large community to its customizability. And lastly, Telegraf and all of its internal plugins are written in Go. Telegraf is a single binary that, when used out of the box, doesn't require any external dependencies.
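To make that configuration-file workflow concrete, here is a minimal sketch of what a Telegraf configuration file can look like. It is not taken from the course: the interval and buffer values are only illustrative, and the URL, token, organization, and bucket names are placeholders for a hypothetical local InfluxDB 2.x setup.

    # telegraf.conf -- minimal sketch with placeholder values

    [agent]
      ## How often the scheduler collects metrics from the input plugins
      interval = "10s"
      ## How often buffered metrics are flushed to the output plugins
      flush_interval = "10s"
      ## Maximum number of metrics held in the in-memory buffer,
      ## for example while the network connection is unavailable
      metric_buffer_limit = 10000

    ## Collect basic host metrics -- no code required, just configuration
    [[inputs.cpu]]
      percpu = true
      totalcpu = true

    [[inputs.mem]]

    ## Write everything to InfluxDB (placeholder URL, token, org, and bucket)
    [[outputs.influxdb_v2]]
      urls = ["http://localhost:8086"]
      token = "${INFLUX_TOKEN}"
      organization = "example-org"
      bucket = "example-bucket"

You would then start the agent by pointing it at the file, for example with telegraf --config telegraf.conf.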
The binary has a minimal memory footprint, something the InfluxData team is always conscious of, as we want users to be able to run it on even the smallest IoT devices. Telegraf can collect data arriving in many different ways, but it is optimized for streaming data. It is also optimized for writing data into InfluxDB, although it can send data to a variety of other destinations as well.
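Picking up the processing and routing features mentioned earlier, the sketch below shows, under assumed names, how a configuration might clean incoming data and send some metrics to one database and the rest to another. The MQTT broker address, topics, measurement patterns, tags, and bucket names are hypothetical placeholders, not settings from the course.

    ## Hypothetical ship-sensor input; broker address and topics are placeholders
    [[inputs.mqtt_consumer]]
      servers = ["tcp://broker.example.com:1883"]
      topics = ["ships/+/sensors"]
      data_format = "json"

    ## Clean up messy incoming data: trim whitespace from a tag value
    [[processors.strings]]
      [[processors.strings.trim]]
        tag = "ship_id"

    ## Route metrics whose names match engine_* to one database...
    [[outputs.influxdb_v2]]
      urls = ["http://engineering-db.example.com:8086"]
      token = "${INFLUX_TOKEN}"
      organization = "example-org"
      bucket = "engine-metrics"
      namepass = ["engine_*"]

    ## ...and drop those metrics from what is sent to the second database
    [[outputs.influxdb_v2]]
      urls = ["http://ops-db.example.com:8086"]
      token = "${INFLUX_TOKEN}"
      organization = "example-org"
      bucket = "ship-metrics"
      namedrop = ["engine_*"]

The namepass and namedrop filters shown here match on metric names; similar tag-based filters can be used when the routing decision depends on where the data came from.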
