IFQL and the future of
InfluxData
Paul Dix

Founder & CTO

@pauldix

paul@influxdata.com
Evolution of a query
language…
REST API
SQL-ish
Vaguely Familiar
select percentile(90, value) from cpu
where time > now() - 1d and
"host" = 'serverA'
group by time(10m)
0.8 -> 0.9
Breaking API change, addition of tags
Functional or SQL?
Afraid to switch…
InfluxData Platform Future and Vision
Difficult to improve & change
It’s not SQL!
Kapacitor
Fall of 2015
Kapacitor’s TICKscript
stream
|from()
.database('telegraf')
.measurement('cpu')
.groupBy(*)
|window()
.period(5m)
.every(5m)
.align()
|mean('usage_idle')
.as('usage_idle')
|influxDBOut()
.database('telegraf')
.retentionPolicy('autogen')
.measurement('mean_cpu_idle')
.precision('s')
Hard to debug
Steep learning curve
Not Recomposable
Second Language
Rethinking Everything
Kapacitor is Background
Processing
Stream or Batch
InfluxDB is batch interactive
IFQL and unified API
Building towards 2.0
Project Goals
Photo by Glen Carrie on Unsplash
One Language to Unite!
Feature Velocity
Decouple storage from
compute
Iterate & deploy
more frequently
Scale
independently
Workload
Isolation
Decouple language from
engine
{
"operations": [
{
"id": "select0",
"kind": "select",
"spec": {
"database": "foo",
"hosts": null
}
},
{
"id": "where1",
"kind": "where",
"spec": {
"expression": {
"root": {
"type": "binary",
"operator": "and",
"left": {
"type": "binary",
"operator": "and",
"left": {
"type": "binary",
"operator": "==",
"left": {
"type": "reference",
"name": "_measurement",
"kind": "tag"
},
"right": {
"type": "stringLiteral",
"value": "cpu"
}
},
Query represented as DAG in JSON
A Data Language
Design Philosophy
UI for Many
because no one wants to actually write a query
Readability
over terseness
Flexible
add to language easily
Testable
new functions and user queries
Easy to Contribute
inspiration from Telegraf
Code Sharing & Reuse
no code > code
A few examples
// get the last value written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> last()
// get the last value written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> last()
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:53:00.000000000Z usage_system cpu server0 east 60.6284
Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:53:00.000000000Z usage_user cpu server0 east 39.3716
// get the last minute of data from a specific
// measurement & field & host
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
// get the last minute of data from a specific
// measurement & field & host
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:01:45.677502014Z, 2018-02-12T16:02:45.677502014Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T16:01:50.000000000Z usage_user cpu server0 east 50.549
2018-02-12T16:02:00.000000000Z usage_user cpu server0 east 35.4458
2018-02-12T16:02:10.000000000Z usage_user cpu server0 east 30.0493
2018-02-12T16:02:20.000000000Z usage_user cpu server0 east 44.3378
2018-02-12T16:02:30.000000000Z usage_user cpu server0 east 11.1584
2018-02-12T16:02:40.000000000Z usage_user cpu server0 east 46.712
// get the mean in 15m intervals of last hour
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu")
|> range(start:-1h)
|> window(every:15m)
|> mean()
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:28:41.128654848Z usage_user cpu server0 east 50.72841444444444
2018-02-12T15:43:41.128654848Z usage_user cpu server0 east 51.19163333333333
2018-02-12T15:13:41.128654848Z usage_user cpu server0 east 45.5091088235294
2018-02-12T15:58:41.128654848Z usage_user cpu server0 east 49.65145555555555
2018-02-12T16:05:06.708945484Z usage_user cpu server0 east 46.41292368421052
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T15:28:41.128654848Z usage_system cpu server0 east 49.27158555555556
2018-02-12T15:58:41.128654848Z usage_system cpu server0 east 50.34854444444444
2018-02-12T16:05:06.708945484Z usage_system cpu server0 east 53.58707631578949
2018-02-12T15:13:41.128654848Z usage_system cpu server0 east 54.49089117647058
2018-02-12T15:43:41.128654848Z usage_system cpu server0 east 48.808366666666664
Elements of IFQL
Functional
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
Functional
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
built in functions
Functional
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
anonymous functions
Functional
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
pipe forward operator
Named Parameters
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
named parameters only!
Readability
Flexibility
Functions have inputs &
outputs
Testability
Builder
Inputs
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
no input
Outputs
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
output is entire db
Outputs
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
pipe that output to filter
Filter function input
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
anonymous filter function
input is a single record
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Filter function input
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
A record looks like a flat object
or row in a table
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Record Properties
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
tag key
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Record Properties
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start:-1m)
same as before
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
starts with _
reserved for system
attributes
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
works the other way, too
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r._measurement == "cpu" and
r._field == "usage_user")
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
_measurement and _field
present for all InfluxDB data
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Special Properties
_value exists in all series
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["_value"] > 50.0)
|> range(start:-1m)
|> max()
{"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
Filter function output
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
filter function output
is a boolean to determine if record is in set
Filter Operators
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
!=
=~
!~
in
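A sketch of the match operators in a filter (the regex literal syntax here is illustrative; the shipped language may differ):

// match a group of hosts by regex, excluding one region
from(db:"mydb")
|> filter(fn: (r) => r["host"] =~ /server[0-9]+/ and
r["region"] != "west")
|> range(start:-1m)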
Filter Boolean Logic
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => (r["host"] == "server0" or
r["host"] == "server1") and
r["_measurement"] == "cpu")
|> range(start:-1m)
parens for precedence
Function with explicit return
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => {return r["host"] == "server0"})
|> range(start:-1m)
long hand function definition
Outputs
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
filter output
is set of data matching filter function
Outputs
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
piped to range
which further filters by a time range
Outputs
// get the last minute of data written for anything from a given host
from(db:"mydb")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1m)
range output is the final query result
Function Isolation
(but the planner may do otherwise)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
range and filter switched
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
results the same
Result: _result
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T17:52:02.322301856Z, 2018-02-12T17:53:02.322301856Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
2018-02-12T17:53:02.322301856Z usage_user cpu server0 east 97.3174
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
is this the same as the top two?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
moving max to here
changes semantics
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
here it operates on
only the last minute of data
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
here it operates on
data for all time
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Does order matter?
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
then that result
is filtered down to
the last minute
(which will likely be empty)
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
|> range(start:-1m)
Planner Optimizes
maintains query semantics
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
this is more efficient
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user")
|> max()
query DAG different
plan DAG same as one on left
Optimization
from(db:"mydb")
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["_value"] > 22.0)
|> range(start:-1m)
|> max()
from(db:"mydb")
|> range(start:-1m)
|> filter(fn: (r) =>
r["host"] == "server0" and
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["_value"] > 22.0)
|> max()
this does a full table scan
Variables & Closures
db = "mydb"
measurement = "cpu"
from(db:db)
|> filter(fn: (r) => r._measurement == measurement and
r.host == "server0")
|> last()
Variables & Closures
db = "mydb"
measurement = "cpu"
from(db:db)
|> filter(fn: (r) => r._measurement == measurement and
r.host == "server0")
|> last()
anonymous filter function
closure over surrounding context
User Defined Functions
db = "mydb"
measurement = "cpu"
fn = (r) => r._measurement == measurement and
r.host == "server0"
from(db:db)
|> filter(fn: fn)
|> last()
assign function to variable fn
User Defined Functions
from(db:"mydb")
|> filter(fn: (r) =>
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["host"] == "server0")
|> range(start:-1h)
User Defined Functions
from(db:"mydb")
|> filter(fn: (r) =>
r["_measurement"] == "cpu" and
r["_field"] == "usage_user" and
r["host"] == "server0")
|> range(start:-1h)
get rid of some common boilerplate?
User Defined Functions
select = (db, m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
User Defined Functions
select = (db, m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(db: "mydb", m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
User Defined Functions
select = (db, m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
throws error
error calling function "select": missing required keyword argument "db"
Default Arguments
select = (db="mydb", m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
Default Arguments
select = (db="mydb", m, f) => {
return from(db:db)
|> filter(fn: (r) => r._measurement == m and r._field == f)
}
select(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r["host"] == "server0")
|> range(start:-1h)
Multiple Results to Client
data = from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_user")
|> range(start: -4h)
|> window(every: 5m)
data |> min() |> yield(name: "min")
data |> max() |> yield(name: "max")
data |> mean() |> yield(name: "mean")
Multiple Results to Client
data = from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_user")
|> range(start: -4h)
|> window(every: 5m)
data |> min() |> yield(name: "min")
data |> max() |> yield(name: "max")
data |> mean() |> yield(name: "mean")
Result: min
Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:55:55.487457216Z, 2018-02-12T20:55:55.487457216Z)
_time _field _measurement host region _value
------------------------------ --------------- --------------- --------------- --------------- ----------------------
the yield name becomes the result name
User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
takes a table
from a pipe forward
by default
User Defined Pipe Forwardable Functions
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
from(db:"mydb")
|> mf(m: "cpu", f: "usage_user")
|> filter(fn: (r) => r.host == "server0")
|> last()
calling it, then chaining
Passing as Argument
mf = (m, f, table=<-) => {
return table
|> filter(fn: (r) => r._measurement == m and
r._field == f)
}
sending the from as argument
mf(m: "cpu", f: "usage_user", table: from(db:"mydb"))
|> filter(fn: (r) => r.host == "server0")
|> last()
Passing as Argument
mf = (m, f, table=<-) =>
filter(fn: (r) => r._measurement == m and r._field == f,
table: table)
rewrite the function to use argument
mf(m: "cpu", f: "usage_user", table: from(db:"mydb"))
|> filter(fn: (r) => r.host == "server0")
|> last()
Any pipe forward function can use arguments
min(table:
range(start: -1h, table:
filter(fn: (r) => r.host == "server0", table:
from(db: "mydb"))))
Make you a Lisp
Easy to add Functions
like plugins in Telegraf
code file
test file
package functions
import (
"fmt"
"github.com/influxdata/ifql/ifql"
"github.com/influxdata/ifql/query"
"github.com/influxdata/ifql/query/execute"
"github.com/influxdata/ifql/query/plan"
)
const CountKind = "count"
type CountOpSpec struct {
}
func init() {
ifql.RegisterFunction(CountKind, createCountOpSpec)
query.RegisterOpSpec(CountKind, newCountOp)
plan.RegisterProcedureSpec(CountKind, newCountProcedure, CountKind)
execute.RegisterTransformation(CountKind, createCountTransformation)
}
func createCountOpSpec(args map[string]ifql.Value, ctx ifql.Context) (query.OperationSpec, error) {
if len(args) != 0 {
return nil, fmt.Errorf(`count function requires no arguments`)
}
return new(CountOpSpec), nil
}
func newCountOp() query.OperationSpec {
return new(CountOpSpec)
}
func (s *CountOpSpec) Kind() query.OperationKind {
return CountKind
}
type CountProcedureSpec struct {
}
func newCountProcedure(query.OperationSpec) (plan.ProcedureSpec, error) {
return new(CountProcedureSpec), nil
}
func (s *CountProcedureSpec) Kind() plan.ProcedureKind {
return CountKind
}
func (s *CountProcedureSpec) Copy() plan.ProcedureSpec {
return new(CountProcedureSpec)
}
func (s *CountProcedureSpec) PushDownRule() plan.PushDownRule {
return plan.PushDownRule{
Root: SelectKind,
Through: nil,
}
}
func (s *CountProcedureSpec) PushDown(root *plan.Procedure, dup func() *plan.Procedure) {
selectSpec := root.Spec.(*SelectProcedureSpec)
if selectSpec.AggregateSet {
root = dup()
selectSpec = root.Spec.(*SelectProcedureSpec)
selectSpec.AggregateSet = false
selectSpec.AggregateType = ""
return
}
selectSpec.AggregateSet = true
selectSpec.AggregateType = CountKind
}
type CountAgg struct {
count int64
}
func createCountTransformation(id execute.DatasetID, mode execute.AccumulationMode, spec plan.ProcedureSpec, ctx execute.Context) (execute.Transformation, execute.Dataset, error) {
t, d := execute.NewAggregateTransformationAndDataset(id, mode, ctx.Bounds(), new(CountAgg))
return t, d, nil
}
func (a *CountAgg) DoBool(vs []bool) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoUInt(vs []uint64) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoInt(vs []int64) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoFloat(vs []float64) {
a.count += int64(len(vs))
}
func (a *CountAgg) DoString(vs []string) {
a.count += int64(len(vs))
}
func (a *CountAgg) Type() execute.DataType {
return execute.TInt
}
func (a *CountAgg) ValueInt() int64 {
return a.count
}
Defines parser, validation,
execution
Imports and Namespaces
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
// square the value
|> map(fn: (r) => r._value * r._value)
shortcut for this?
Imports and Namespaces
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
// square the value
|> map(fn: (r) => r._value * r._value)
square = (table=<-) => {
return table |> map(fn: (r) => r._value * r._value)
}
Imports and Namespaces
import "github.com/pauldix/ifqlmath"
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> ifqlmath.square()
Imports and Namespaces
import "github.com/pauldix/ifqlmath"
from(db:"mydb")
|> filter(fn: (r) => r.host == "server0")
|> range(start: -1h)
|> ifqlmath.square()
namespace
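What the imported package might contain (hypothetical contents of github.com/pauldix/ifqlmath):

// square pipes a table through map, squaring each value
square = (table=<-) =>
table |> map(fn: (r) => r._value * r._value)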
MOAR EXAMPLES!
Math across measurements
foo = from(db: "mydb")
|> filter(fn: (r) => r._measurement == "foo")
|> range(start: -1h)
bar = from(db: "mydb")
|> filter(fn: (r) => r._measurement == "bar")
|> range(start: -1h)
join(
tables: {foo:foo, bar:bar},
fn: (t) => t.foo._value + t.bar._value)
|> yield(name: "foobar")
Having Query
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start:-1h)
|> window(every:10m)
|> mean()
// this is the having part
|> filter(fn: (r) => r._value > 90)
Grouping
// group - average utilization across regions
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_system")
|> range(start: -1h)
|> group(by: ["region"])
|> window(every:10m)
|> mean()
Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> tagValues(key: "host")
Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> group(by: ["measurement"], keep: ["host"])
|> distinct(column: "host")
Get Metadata
tagValues = (table=<-) =>
table
|> group(by: ["measurement"], keep: ["host"])
|> distinct(column: "host")
Get Metadata
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu")
|> range(start: -48h, stop: -47h)
|> tagValues(key: "host")
|> count()
Functions Implemented as IFQL
// _sortLimit is a helper function, which sorts
// and limits a table.
_sortLimit = (n, desc, cols=["_value"], table=<-) =>
table
|> sort(cols:cols, desc:desc)
|> limit(n:n)
// top sorts a table by cols and keeps only the top n records.
top = (n, cols=["_value"], table=<-) =>
_sortLimit(table:table, n:n, cols:cols, desc:true)
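A usage sketch for the top helper above (using only the names and defaults defined in the snippet):

// top 5 user-CPU readings from the last hour
from(db:"mydb")
|> filter(fn: (r) => r._measurement == "cpu" and
r._field == "usage_user")
|> range(start:-1h)
|> top(n:5)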
Project Status and Timeline
API 2.0 Work
Lock down query request/response format
Apache Arrow
We’re contributing the Go
implementation!
https://github.com/influxdata/arrow
Finalize Language
(a few months or so)
Ship with Enterprise 1.6
(summertime)
Hack & workshop day
tomorrow!
Ask the registration desk today
Thank you!
Paul Dix

paul@influxdata.com

@pauldix

Steinkamp, Clifford [InfluxData] | Closing Thoughts | InfluxDays 2022
Steinkamp, Clifford [InfluxData] | Welcome to InfluxDays 2022 - Day 2 | Influ...
Steinkamp, Clifford [InfluxData] | Closing Thoughts Day 1 | InfluxDays 2022

Recently uploaded (20)

PPTX
Digital Literacy And Online Safety on internet
PDF
An introduction to the IFRS (ISSB) Stndards.pdf
PPTX
artificial intelligence overview of it and more
PDF
Cloud-Scale Log Monitoring _ Datadog.pdf
PDF
The New Creative Director: How AI Tools for Social Media Content Creation Are...
PDF
Decoding a Decade: 10 Years of Applied CTI Discipline
PPTX
presentation_pfe-universite-molay-seltan.pptx
PDF
Unit-1 introduction to cyber security discuss about how to secure a system
PDF
Vigrab.top – Online Tool for Downloading and Converting Social Media Videos a...
PPTX
introduction about ICD -10 & ICD-11 ppt.pptx
DOCX
Unit-3 cyber security network security of internet system
PPTX
Job_Card_System_Styled_lorem_ipsum_.pptx
PPTX
Introduction to Information and Communication Technology
PPTX
Introuction about ICD -10 and ICD-11 PPT.pptx
PPTX
Slides PPTX World Game (s) Eco Economic Epochs.pptx
PDF
Testing WebRTC applications at scale.pdf
PPTX
June-4-Sermon-Powerpoint.pptx USE THIS FOR YOUR MOTIVATION
PPTX
522797556-Unit-2-Temperature-measurement-1-1.pptx
PPTX
CHE NAA, , b,mn,mblblblbljb jb jlb ,j , ,C PPT.pptx
PDF
Paper PDF World Game (s) Great Redesign.pdf
Digital Literacy And Online Safety on internet
An introduction to the IFRS (ISSB) Stndards.pdf
artificial intelligence overview of it and more
Cloud-Scale Log Monitoring _ Datadog.pdf
The New Creative Director: How AI Tools for Social Media Content Creation Are...
Decoding a Decade: 10 Years of Applied CTI Discipline
presentation_pfe-universite-molay-seltan.pptx
Unit-1 introduction to cyber security discuss about how to secure a system
Vigrab.top – Online Tool for Downloading and Converting Social Media Videos a...
introduction about ICD -10 & ICD-11 ppt.pptx
Unit-3 cyber security network security of internet system
Job_Card_System_Styled_lorem_ipsum_.pptx
Introduction to Information and Communication Technology
Introuction about ICD -10 and ICD-11 PPT.pptx
Slides PPTX World Game (s) Eco Economic Epochs.pptx
Testing WebRTC applications at scale.pdf
June-4-Sermon-Powerpoint.pptx USE THIS FOR YOUR MOTIVATION
522797556-Unit-2-Temperature-measurement-1-1.pptx
CHE NAA, , b,mn,mblblblbljb jb jlb ,j , ,C PPT.pptx
Paper PDF World Game (s) Great Redesign.pdf

InfluxData Platform Future and Vision

  • 1. IFQL and the future of InfluxData Paul Dix Founder & CTO @pauldix paul@influxdata.com
  • 2. Evolution of a query language…
  • 5. Vaguely Familiar select percentile(90, value) from cpu where time > now() - 1d and "host" = 'serverA' group by time(10m)
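The functional language introduced later in this deck expresses the same query as a pipeline. The sketch below is illustrative only: the function name `percentile` and the exact parameter syntax are assumptions, not the finalized IFQL API.

```flux
// hypothetical IFQL/Flux rendering of the InfluxQL query above
from(db: "mydb")
  |> filter(fn: (r) => r._measurement == "cpu" and r.host == "serverA")
  |> range(start: -1d)
  |> window(every: 10m)
  |> percentile(p: 90.0)
```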
  • 6. 0.8 -> 0.9 Breaking API change, addition of tags
  • 26. InfluxDB is batch interactive
  • 27. IFQL and unified API Building towards 2.0
  • 28. Project Goals Photo by Glen Carrie on Unsplash
  • 29. One Language to Unite!
  • 32. Iterate & deploy more frequently
  • 37. { "operations": [ { "id": "select0", "kind": "select", "spec": { "database": "foo", "hosts": null } }, { "id": "where1", "kind": "where", "spec": { "expression": { "root": { "type": "binary", "operator": "and", "left": { "type": "binary", "operator": "and", "left": { "type": "binary", "operator": "==", "left": { "type": "reference", "name": "_measurement", "kind": "tag" }, "right": { "type": "stringLiteral", "value": "cpu" } }, Query represented as DAG in JSON
  • 41. UI for Many because no one wants to actually write a query
  • 46. Code Sharing & Reuse no code > code
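To make "no code > code" concrete: common query logic can be bound to a name once and reused everywhere, a pattern the user-defined-function slides develop later. `lastFromHost` is a hypothetical helper name, not part of any standard library.

```flux
// hypothetical reusable helper built from primitives shown later in the deck
lastFromHost = (db, host) =>
    from(db: db)
      |> filter(fn: (r) => r.host == host)
      |> last()

lastFromHost(db: "mydb", host: "server0")
```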
  • 48. // get the last value written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> last()
  • 49. // get the last value written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> last() Result: _result Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:53:00.000000000Z usage_system cpu server0 east 60.6284 Block: keys: [_field, _measurement, host, region] bounds: [1677-09-21T00:12:43.145224192Z, 2018-02-12T15:53:04.361902250Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:53:00.000000000Z usage_user cpu server0 east 39.3716
  • 50. // get the last minute of data from a specific // measurement & field & host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m)
  • 51. // get the last minute of data from a specific // measurement & field & host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:01:45.677502014Z, 2018-02-12T16:02:45.677502014Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T16:01:50.000000000Z usage_user cpu server0 east 50.549 2018-02-12T16:02:00.000000000Z usage_user cpu server0 east 35.4458 2018-02-12T16:02:10.000000000Z usage_user cpu server0 east 30.0493 2018-02-12T16:02:20.000000000Z usage_user cpu server0 east 44.3378 2018-02-12T16:02:30.000000000Z usage_user cpu server0 east 11.1584 2018-02-12T16:02:40.000000000Z usage_user cpu server0 east 46.712
  • 52. // get the mean in 10m intervals of last hour from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu") |> range(start:-1h) |> window(every:15m) |> mean() Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:28:41.128654848Z usage_user cpu server0 east 50.72841444444444 2018-02-12T15:43:41.128654848Z usage_user cpu server0 east 51.19163333333333 2018-02-12T15:13:41.128654848Z usage_user cpu server0 east 45.5091088235294 2018-02-12T15:58:41.128654848Z usage_user cpu server0 east 49.65145555555555 2018-02-12T16:05:06.708945484Z usage_user cpu server0 east 46.41292368421052 Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T15:05:06.708945484Z, 2018-02-12T16:05:06.708945484Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T15:28:41.128654848Z usage_system cpu server0 east 49.27158555555556 2018-02-12T15:58:41.128654848Z usage_system cpu server0 east 50.34854444444444 2018-02-12T16:05:06.708945484Z usage_system cpu server0 east 53.58707631578949 2018-02-12T15:13:41.128654848Z usage_system cpu server0 east 54.49089117647058 2018-02-12T15:43:41.128654848Z usage_system cpu server0 east 48.808366666666664
  • 54. Functional // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m)
  • 55. Functional // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) built in functions
  • 56. Functional // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) anonymous functions
  • 57. Functional // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) pipe forward operator
  • 58. Named Parameters // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) named parameters only!
  • 64. Inputs // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) no input
  • 65. Outputs // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) output is entire db
  • 66. Outputs // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) pipe that output to filter
  • 67. Filter function input // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) anonymous filter function input is a single record {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 68. Filter function input // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) A record looks like a flat object or row in a table {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 69. Record Properties // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) tag key {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 70. Record Properties // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start:-1m) same as before {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 71. Special Properties starts with _ reserved for system attributes from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 72. Special Properties works other way from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r._measurement == "cpu" and r._field == "usage_user") |> range(start:-1m) |> max() {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 73. Special Properties _measurement and _field present for all InfluxDB data from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 74. Special Properties _value exists in all series from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["_value"] > 50.0) |> range(start:-1m) |> max() {"_measurement":"cpu", "_field":"usage_user", "host":"server0", "region":"west", "_value":23.2}
  • 75. Filter function output // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) filter function output is a boolean to determine if record is in set
  • 76. Filter Operators // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) != =~ !~ in
  • 77. Filter Boolean Logic // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => (r["host"] == "server0" or r["host"] == "server1") and r["_measurement"] == "cpu") |> range(start:-1m) parens for precedence
  • 78. Function with explicit return // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => {return r["host"] == "server0"}) |> range(start:-1m) long hand function definition
  • 79. Outputs // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) filter output is set of data matching filter function
  • 80. Outputs // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) piped to range which further filters by a time range
  • 81. Outputs // get the last minute of data written for anything from a given host from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1m) range output is the final query result
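A sketch of the comparison operators listed on this slide in context. The regex-literal syntax mirrors the style of the rest of the deck; treat the exact forms (and the `in` operator, which is not shown) as illustrative rather than final.

```flux
from(db: "mydb")
  |> filter(fn: (r) =>
      r.host != "server1" and       // not equal
      r._measurement =~ /cpu/ and   // regex match
      r.region !~ /^west$/)         // negated regex match
  |> range(start: -1m)
```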
  • 82. Function Isolation (but the planner may do otherwise)
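One way to read "function isolation": each stage is a pure function of the table it receives, so a pipeline can be split into named intermediate results without changing its meaning, even though the planner is free to fuse or reorder the physical steps. A minimal sketch:

```flux
// each function sees only its input table; splitting the chain
// into named values does not change the result
hosts  = from(db: "mydb") |> filter(fn: (r) => r.host == "server0")
recent = hosts |> range(start: -1m)
```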
  • 83. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max()
  • 84. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() range and filter switched
  • 85. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() results the same Result: _result Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T17:52:02.322301856Z, 2018-02-12T17:53:02.322301856Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- 2018-02-12T17:53:02.322301856Z usage_user cpu server0 east 97.3174
  • 86. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() is this the same as the top two? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 87. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() moving max to here changes semantics from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 88. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() here it operates on only the last minute of data from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 89. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() here it operates on data for all time from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 90. Does order matter? from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() then that result is filtered down to the last minute (which will likely be empty) from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() |> range(start:-1m)
  • 92. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max()
  • 93. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() this is more efficient
  • 94. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user") |> max() query DAG different plan DAG same as one on left
  • 95. Optimization from(db:"mydb") |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["_value"] > 22.0) |> range(start:-1m) |> max() from(db:"mydb") |> range(start:-1m) |> filter(fn: (r) => r["host"] == "server0" and r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["_value"] > 22.0) |> max() this does a full table scan
  • 96. Variables & Closures db = "mydb" measurement = "cpu" from(db:db) |> filter(fn: (r) => r._measurement == measurement and r.host == "server0") |> last()
  • 97. Variables & Closures db = "mydb" measurement = "cpu" from(db:db) |> filter(fn: (r) => r._measurement == measurement and r.host == "server0") |> last() anonymous filter function closure over surrounding context
  • 98. User Defined Functions db = "mydb" measurement = "cpu" fn = (r) => r._measurement == measurement and r.host == "server0" from(db:db) |> filter(fn: fn) |> last() assign function to variable fn
  • 99. User Defined Functions from(db:"mydb") |> filter(fn: (r) => r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["host"] == "server0") |> range(start:-1h)
  • 100. User Defined Functions from(db:"mydb") |> filter(fn: (r) => r["_measurement"] == "cpu" and r["_field"] == "usage_user" and r["host"] == "server0") |> range(start:-1h) get rid of some common boilerplate?
  • 101. User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) }
  • 102. User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(db: "mydb", m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)
  • 103. User Defined Functions select = (db, m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h) throws error error calling function "select": missing required keyword argument "db"
  • 104. Default Arguments select = (db="mydb", m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)
  • 105. Default Arguments select = (db="mydb", m, f) => { return from(db:db) |> filter(fn: (r) => r._measurement == m and r._field == f) } select(m: "cpu", f: "usage_user") |> filter(fn: (r) => r["host"] == "server0") |> range(start:-1h)
  • 106. Multiple Results to Client data = from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user") |> range(start: -4h) |> window(every: 5m) data |> min() |> yield(name: "min") data |> max() |> yield(name: "max") data |> mean() |> yield(name: "mean")
  • 107. Multiple Results to Client data = from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_user") |> range(start: -4h) |> window(every: 5m) data |> min() |> yield(name: "min") data |> max() |> yield(name: "max") data |> mean() |> yield(name: "mean") Result: min Block: keys: [_field, _measurement, host, region] bounds: [2018-02-12T16:55:55.487457216Z, 2018-02-12T20:55:55.487457216Z) _time _field _measurement host region _value ------------------------------ --------------- --------------- --------------- --------------- ---------------------- name
  • 108. User Defined Pipe Forwardable Functions mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } from(db:"mydb") |> mf(m: "cpu", f: "usage_user") |> filter(fn: (r) => r.host == "server0") |> last()
  • 109. User Defined Pipe Forwardable Functions mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } from(db:"mydb") |> mf(m: "cpu", f: "usage_user") |> filter(fn: (r) => r.host == "server0") |> last() takes a table from a pipe forward by default
  • 110. User Defined Pipe Forwardable Functions mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } from(db:"mydb") |> mf(m: "cpu", f: "usage_user") |> filter(fn: (r) => r.host == "server0") |> last() calling it, then chaining
  • 111. Passing as Argument mf = (m, f, table=<-) => { return table |> filter(fn: (r) => r._measurement == m and r._field == f) } sending the from as argument mf(m: "cpu", f: "usage_user", table: from(db:"mydb")) |> filter(fn: (r) => r.host == "server0") |> last()
  • 112. Passing as Argument mf = (m, f, table=<-) => filter(fn: (r) => r._measurement == m and r._field == f, table: table) rewrite the function to use argument mf(m: "cpu", f: "usage_user", table: from(db:"mydb")) |> filter(fn: (r) => r.host == "server0") |> last()
  • 113. Any pipe forward function can use arguments min(table: range(start: -1h, table: filter(fn: (r) => r.host == "server0", table: from(db: "mydb"))))
  • 114. Make you a Lisp
  • 115. Easy to add Functions like plugins in Telegraf
  • 118. package functions import ( "fmt" "github.com/influxdata/ifql/ifql" "github.com/influxdata/ifql/query" "github.com/influxdata/ifql/query/execute" "github.com/influxdata/ifql/query/plan" ) const CountKind = "count" type CountOpSpec struct { } func init() { ifql.RegisterFunction(CountKind, createCountOpSpec) query.RegisterOpSpec(CountKind, newCountOp) plan.RegisterProcedureSpec(CountKind, newCountProcedure, CountKind) execute.RegisterTransformation(CountKind, createCountTransformation) } func createCountOpSpec(args map[string]ifql.Value, ctx ifql.Context) (query.OperationSpec, error) { if len(args) != 0 { return nil, fmt.Errorf(`count function requires no arguments`) } return new(CountOpSpec), nil } func newCountOp() query.OperationSpec { return new(CountOpSpec) } func (s *CountOpSpec) Kind() query.OperationKind { return CountKind }
  • 119. type CountProcedureSpec struct { } func newCountProcedure(query.OperationSpec) (plan.ProcedureSpec, error) { return new(CountProcedureSpec), nil } func (s *CountProcedureSpec) Kind() plan.ProcedureKind { return CountKind } func (s *CountProcedureSpec) Copy() plan.ProcedureSpec { return new(CountProcedureSpec) } func (s *CountProcedureSpec) PushDownRule() plan.PushDownRule { return plan.PushDownRule{ Root: SelectKind, Through: nil, } } func (s *CountProcedureSpec) PushDown(root *plan.Procedure, dup func() *plan.Procedure) { selectSpec := root.Spec.(*SelectProcedureSpec) if selectSpec.AggregateSet { root = dup() selectSpec = root.Spec.(*SelectProcedureSpec) selectSpec.AggregateSet = false selectSpec.AggregateType = "" return } selectSpec.AggregateSet = true selectSpec.AggregateType = CountKind }
  • 120. type CountAgg struct { count int64 } func createCountTransformation(id execute.DatasetID, mode execute.AccumulationMode, spec plan.ProcedureSpec, ctx execute.Context (execute.Transformation, execute.Dataset, error) { t, d := execute.NewAggregateTransformationAndDataset(id, mode, ctx.Bounds(), new(CountAgg)) return t, d, nil } func (a *CountAgg) DoBool(vs []bool) { a.count += int64(len(vs)) } func (a *CountAgg) DoUInt(vs []uint64) { a.count += int64(len(vs)) } func (a *CountAgg) DoInt(vs []int64) { a.count += int64(len(vs)) } func (a *CountAgg) DoFloat(vs []float64) { a.count += int64(len(vs)) } func (a *CountAgg) DoString(vs []string) { a.count += int64(len(vs)) } func (a *CountAgg) Type() execute.DataType { return execute.TInt } func (a *CountAgg) ValueInt() int64 { return a.count }
  • 122. Imports and Namespaces from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) // square the value |> map(fn: (r) => r._value * r._value) shortcut for this?
  • 123. Imports and Namespaces from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) // square the value |> map(fn: (r) => r._value * r._value) square = (table=<-) => { return table |> map(fn: (r) => r._value * r._value) }
  • 124. Imports and Namespaces import "github.com/pauldix/ifqlmath" from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) |> ifqlmath.square()
  • 125. Imports and Namespaces import "github.com/pauldix/ifqlmath" from(db:"mydb") |> filter(fn: (r) => r.host == "server0") |> range(start: -1h) |> ifqlmath.square() namespace
  • 127. Math across measurements foo = from(db: "mydb") |> filter(fn: (r) => r._measurement == "foo") |> range(start: -1h) bar = from(db: "mydb") |> filter(fn: (r) => r._measurement == "bar") |> range(start: -1h) join( tables: {foo:foo, bar:bar}, fn: (t) => t.foo._value + t.bar._value) |> yield(name: "foobar")
  • 128. Having Query from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start:-1h) |> window(every:10m) |> mean() // this is the having part |> filter(fn: (r) => r._value > 90)
  • 129. Grouping // group - average utilization across regions from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu" and r._field == "usage_system") |> range(start: -1h) |> group(by: ["region"]) |> window(every:10m) |> mean()
  • 130. Get Metadata from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start: -48h, stop: -47h) |> tagValues(key: "host")
  • 131. Get Metadata from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start: -48h, stop: -47h) |> group(by: ["measurement"], keep: ["host"]) |> distinct(column: "host")
  • 132. Get Metadata tagValues = (table=<-) => table |> group(by: ["measurement"], keep: ["host"]) |> distinct(column: "host")
  • 133. Get Metadata from(db:"mydb") |> filter(fn: (r) => r._measurement == "cpu") |> range(start: -48h, stop: -47h) |> tagValues(key: "host") |> count()
  • 134. Functions Implemented as IFQL // _sortLimit is a helper function, which sorts // and limits a table. _sortLimit = (n, desc, cols=["_value"], table=<-) => table |> sort(cols:cols, desc:desc) |> limit(n:n) // top sorts a table by cols and keeps only the top n records. top = (n, cols=["_value"], table=<-) => _sortLimit(table:table, n:n, cols:cols, desc:true)
  • 135. Project Status and Timeline
  • 136. API 2.0 Work Lock down query request/response format
  • 138. We’re contributing the Go implementation! https://github.com/influxdata/arrow
  • 139. Finalize Language (a few months or so)
  • 140. Ship with Enterprise 1.6 (summertime)
  • 141. Hack & workshop day tomorrow! Ask the registration desk today