The other day
iex(1)> defmodule RepeatN do
...(1)> def repeat_n(_function, 0) do
...(1)> # noop
...(1)> end
...(1)> def repeat_n(function, 1) do
...(1)> function.()
...(1)> end
...(1)> def repeat_n(function, count) do
...(1)> function.()
...(1)> repeat_n(function, count - 1)
...(1)> end
...(1)> end
{:module, RepeatN, ...}
iex(2)> :timer.tc fn -> RepeatN.repeat_n(fn -> 0 end, 100) end
{210, 0}
iex(3)> list = Enum.to_list(1..100)
[...]
iex(4)> :timer.tc fn -> Enum.each(list, fn(_) -> 0 end) end
{165, :ok}
iex(5)> :timer.tc fn -> Enum.each(list, fn(_) -> 0 end) end
{170, :ok}
iex(6)> :timer.tc fn -> Enum.each(list, fn(_) -> 0 end) end
{184, :ok}
Success!
The End?
How fast is it really? Benchmarking in practice
How many atrocities have we just committed?
Atrocities
● Way too few samples (see the sketch after this list)
● Realistic data/multiple inputs?
● No warmup
● Non-production environment
● Does creating the list matter?
● Is repeating really the bottleneck?
● Repeatability?
● Setup information
● Running on battery
● Lots of applications running
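One of these deserves a closer look right away: a single :timer.tc call is just one sample of a noisy measurement. A minimal sketch (my own illustration, not from the slides) that collects many samples and shows how far they spread:

measure = fn(function, n) ->
  # :timer.tc/1 returns {run_time_in_microseconds, result}
  for _ <- 1..n do
    {time, _result} = :timer.tc(function)
    time
  end
end

samples = measure.(fn -> Enum.each(Enum.to_list(1..100), fn(_) -> 0 end) end, 1_000)
IO.inspect {Enum.min(samples), Enum.max(samples)}, label: "min/max in ÎŒs"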
n = 10_000
range = 1..n
list = Enum.to_list range
fun = fn -> 0 end
Benchee.run %{
"Enum.each" =>
fn -> Enum.each(list, fn(_) -> fun.() end) end,
"List comprehension" =>
fn -> for _ <- list, do: fun.() end,
"Recursion" =>
fn -> RepeatN.repeat_n(fun, n) end
}
Name ips average deviation median
Recursion 6.83 K 146.41 ÎŒs ±15.76% 139.00 ÎŒs
Enum.each 4.39 K 227.86 ÎŒs ±8.05% 224.00 ÎŒs
List comprehension 3.13 K 319.22 ÎŒs ±16.20% 323.00 ÎŒs
Comparison:
Recursion 6.83 K
Enum.each 4.39 K - 1.56x slower
List comprehension 3.13 K - 2.18x slower
How fast is it really?
Benchmarking in Practice
Tobias Pfeiffer
@PragTob
pragtob.info
How fast is it really? Benchmarking in practice
Concept vs Tool Usage
Ruby?
Profiling vs. Benchmarking
Flame Graph
[flame graph: eflame:apply1/3 → Elixir.Life:run_loop/3 → Elixir.Life.Board:map/2 → Elixir.Enum:-map/2-lc$^0/1-0-/2]
http://learningelixir.joekain.com/profiling-elixir-2/
What to benchmark?
● Runtime?
● Memory?
● Throughput?
● Custom?
What to measure?
The famous post
What to measure?
● Runtime!
● Memory?
● Throughput?
● Custom?
But, why?
What's fastest?
How long will this take?
Enum.sort/1 performance
Name ips average deviation median
10k 595.62 1.68 ms ±8.77% 1.61 ms
100k 43.29 23.10 ms ±13.21% 21.50 ms
1M 3.26 306.53 ms ±9.82% 291.05 ms
5M 0.53 1899.00 ms ±7.94% 1834.97 ms
Comparison:
10k 595.62
100k 43.29 - 13.76x slower
1M 3.26 - 182.58x slower
5M 0.53 - 1131.09x slower
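These numbers come from running Enum.sort/1 on lists of different sizes. A minimal sketch (not necessarily the exact script used for the talk) of how such a comparison can be set up:

list_10k = 1..10_000 |> Enum.to_list |> Enum.shuffle
list_100k = 1..100_000 |> Enum.to_list |> Enum.shuffle
list_1m = 1..1_000_000 |> Enum.to_list |> Enum.shuffle
list_5m = 1..5_000_000 |> Enum.to_list |> Enum.shuffle

Benchee.run %{
  "10k" => fn -> Enum.sort(list_10k) end,
  "100k" => fn -> Enum.sort(list_100k) end,
  "1M" => fn -> Enum.sort(list_1m) end,
  "5M" => fn -> Enum.sort(list_5m) end
}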
Enum.sort performance
Did we make it faster?
“Isn’t that the root of all evil?”
“Programming Bumper Sticker”
More likely, not reading the sources is the source of all evil
“We should forget about small efficiencies, say
about 97% of the time: premature optimization
is the root of all evil.”
Donald Knuth, 1974
(Computing Surveys, Vol 6, No 4, December 1974)
“Yet we should not pass up our opportunities in
that critical 3%.
A good programmer (
) will be wise to look
carefully at the critical code but only after that
code has been identified.”
Donald Knuth, 1974
(Computing Surveys, Vol 6, No 4, December 1974)
The very next sentence
80 / 20
What is critical?
Application Monitoring
“In established engineering disciplines a 12 %
improvement, easily obtained, is never
considered marginal; and I believe the same
viewpoint should prevail in software engineering.”
Donald Knuth, 1974
(Computing Surveys, Vol 6, No 4, December 1974)
Prior Paragraph
“It is often a mistake to make a priori
judgments about what parts of a program are
really critical, since the universal experience of
programmers who have been using measurement
tools has been that their intuitive guesses fail.”
Donald Knuth, 1974
( Computing Surveys, Vol 6, No 4, December 1974 )
What's the fastest way to
sort a list of numbers
largest to smallest?
list = 1..10_000 |> Enum.to_list |> Enum.shuffle
Benchee.run %{
"sort(fun)" =>
fn -> Enum.sort(list, &(&1 > &2)) end,
"sort |> reverse" =>
fn -> list |> Enum.sort |> Enum.reverse end,
"sort_by(-value)" =>
fn -> Enum.sort_by(list, fn(val) -> -val end) end
}
Name ips average deviation median
sort |> reverse 596.54 1.68 ms ±6.83% 1.65 ms
sort(fun) 238.88 4.19 ms ±5.53% 4.14 ms
sort_by(-value) 146.86 6.81 ms ±8.68% 6.59 ms
Comparison:
sort |> reverse 596.54
sort(fun) 238.88 - 2.50x slower
sort_by(-value) 146.86 - 4.06x slower
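All three variants return the same descending list for a list of distinct numbers, so only their run time differs. A quick sanity check (my addition, not from the slides):

list = 1..10_000 |> Enum.to_list |> Enum.shuffle
descending = list |> Enum.sort |> Enum.reverse
^descending = Enum.sort(list, &(&1 > &2))
^descending = Enum.sort_by(list, fn(val) -> -val end)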
Different types of benchmarks
Feature
Integration
Unit
Testing Pyramid
Application
Macro
Micro
Benchmarking Pyramid
How fast is it really? Benchmarking in practice
Micro Macro Application
● Setup Complexity
● Execution Time
● Confidence of Real Impact
● Components involved
● Chance of Interference
Golden Middle
Good Benchmarking
What are you benchmarking for?
Overly specific benchmarks &
exaggerated results
● Elixir 1.3.4
● Erlang 19.1
● i5-7200U – 2 x 2.5GHz (Up to 3.10GHz)
● 8GB RAM
● Linux Mint 18 - 64 bit (Ubuntu 16.04 base)
● Linux Kernel 4.4.0-51
System Specification
Interference-free Environment
[info] GET /
[debug] Processing by Rumbl.PageController.index/2
Parameters: %{}
Pipelines: [:browser]
[info] Sent 200 in 46ms
[info] GET /sessions/new
[debug] Processing by Rumbl.SessionController.new/2
Parameters: %{}
Pipelines: [:browser]
[info] Sent 200 in 5ms
[info] GET /users/new
[debug] Processing by Rumbl.UserController.new/2
Parameters: %{}
Pipelines: [:browser]
[info] Sent 200 in 7ms
[info] POST /users
[debug] Processing by Rumbl.UserController.create/2
Parameters: %{"_csrf_token" =>
"NUEUdRMNAiBfIHEeNwZkfA05PgAOJgAAf0ACXJqCjl7YojW+trdjdg==", "_utf8" => " ", "user" =>✓
%{"name" => "asdasd", "password" => "[FILTERED]", "username" => "Homer"}}
Pipelines: [:browser]
[debug] QUERY OK db=0.1ms
begin []
[debug] QUERY OK db=0.9ms
INSERT INTO "users" ("name","password_hash","username","inserted_at","updated_at") VALUES
($1,$2,$3,$4,$5) RETURNING "id" ["asdasd",
"$2b$12$.qY/kpo0Dec7vMK1ClJoC.Lw77c3oGllX7uieZILMlFh2hFpJ3F.C", "Homer", {{2016, 12, 2},
{14, 10, 28, 0}}, {{2016, 12, 2}, {14, 10, 28, 0}}]
Logging & Friends
Garbage Collection
How fast is it really? Benchmarking in practice
Zoom in
Correct & Meaningful Setup
Warmup
Inputs matter!
Malformed inputs
Where are your inputs
n = 10_000
fun = fn -> 0 end
Benchee.run %{
"Enum.each" => fn ->
Enum.each(Enum.to_list(1..n), fn(_) -> fun.() end)
end,
"List comprehension" => fn ->
for _ <- Enum.to_list(1..n), do: fun.()
end,
"Recursion" => fn -> RepeatN.repeat_n(fun, n) end
}
Executed every time
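The fix is to do the setup once, outside the measured functions, as the first version of this benchmark did, so only the repetition itself is measured:

n = 10_000
list = Enum.to_list(1..n) # built once, before any measurement starts
fun = fn -> 0 end
Benchee.run %{
  "Enum.each" =>
    fn -> Enum.each(list, fn(_) -> fun.() end) end,
  "List comprehension" =>
    fn -> for _ <- list, do: fun.() end,
  "Recursion" =>
    fn -> RepeatN.repeat_n(fun, n) end
}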
defmodule MyMap do
def map_tco(list, function) do
Enum.reverse do_map_tco([], list, function)
end
defp do_map_tco(acc, [], _function) do
acc
end
defp do_map_tco(acc, [head | tail], func) do
do_map_tco([func.(head) | acc], tail, func)
end
def map_body([], _func), do: []
def map_body([head | tail], func) do
[func.(head) | map_body(tail, func)]
end
end
TCO
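Both functions map a function over a list: map_tco builds the result in an accumulator and reverses it at the end, while map_body builds it directly on the call stack. A quick usage example (my addition):

iex> MyMap.map_tco([1, 2, 3], fn(i) -> i * 2 end)
[2, 4, 6]
iex> MyMap.map_body([1, 2, 3], fn(i) -> i * 2 end)
[2, 4, 6]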
alias Benchee.Formatters.{Console, HTML}
map_fun = fn(i) -> i + 1 end
inputs = %{
"Small (10 Thousand)" => Enum.to_list(1..10_000),
"Middle (100 Thousand)" => Enum.to_list(1..100_000),
"Big (1 Million)" => Enum.to_list(1..1_000_000),
"Bigger (5 Million)" => Enum.to_list(1..5_000_000)
}
Benchee.run %{
"tail-recursive" =>
fn(list) -> MyMap.map_tco(list, map_fun) end,
"stdlib map" =>
fn(list) -> Enum.map(list, map_fun) end,
"body-recursive" =>
fn(list) -> MyMap.map_body(list, map_fun) end
}, time: 20, warmup: 10, inputs: inputs,
formatters: [&Console.output/1, &HTML.output/1],
html: [file: "bench/output/tco_small_sample.html"]
TCO
##### With input Small (10 Thousand) #####
Comparison:
body-recursive 5.12 K
stdlib map 5.07 K - 1.01x slower
tail-recursive 4.38 K - 1.17x slower
##### With input Middle (100 Thousand) #####
Comparison:
body-recursive 491.16
stdlib map 488.45 - 1.01x slower
tail-recursive 399.08 - 1.23x slower
##### With input Big (1 Million) #####
Comparison:
tail-recursive 35.36
body-recursive 25.69 - 1.38x slower
stdlib map 24.85 - 1.42x slower
##### With input Bigger (5 Million) #####
Comparison:
tail-recursive 6.93
body-recursive 4.92 - 1.41x slower
stdlib map 4.87 - 1.42x slower
TCO
Excursion into Statistics
average = total_time / iterations
Average
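In code this is just the sum of all measured run times divided by how many there were (made-up sample values, purely illustrative):

samples = [100, 120, 140, 200] # run times in ÎŒs
average = Enum.sum(samples) / length(samples)
# => 140.0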
Why not just take the average?
defp standard_deviation(samples, average, iterations) do
total_variance = Enum.reduce samples, 0, fn(sample, total) ->
total + :math.pow((sample - average), 2)
end
variance = total_variance / iterations
:math.sqrt variance
end
Standard Deviation
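A small self-contained version of the same computation (hypothetical module name, made-up sample values) shows what the number expresses: how far the individual run times scatter around the average.

defmodule Stats do
  # Population standard deviation: the square root of the average
  # squared distance of each sample from the mean.
  def standard_deviation(samples) do
    n = length(samples)
    average = Enum.sum(samples) / n
    total_variance = Enum.reduce samples, 0, fn(sample, total) ->
      total + :math.pow(sample - average, 2)
    end
    :math.sqrt(total_variance / n)
  end
end

Stats.standard_deviation([100, 120, 140, 200]) # ≈ 37.4 (wide spread)
Stats.standard_deviation([138, 139, 140, 143]) # ≈ 1.9 (narrow spread)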
Spread of Values
Raw Run Times
Histogram
Outliers
Low Standard Deviation
Standard Deviation
defp compute_median(run_times, iterations) do
sorted = Enum.sort(run_times)
middle = div(iterations, 2)
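# Note: Integer.is_odd/1 is a macro, so the surrounding module needs `require Integer`;
# to_float/1 is a small private helper (not shown here) that turns the integer run time into a float.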
if Integer.is_odd(iterations) do
sorted |> Enum.at(middle) |> to_float
else
(Enum.at(sorted, middle) +
Enum.at(sorted, middle - 1)) / 2
end
end
Median
Average
Median
Minimum & Maximum
Boxplot
Surprise findings
alias Benchee.Formatters.{Console, HTML}
map_fun = fn(i) -> [i, i * i] end
inputs = %{
"Small" => Enum.to_list(1..200),
"Medium" => Enum.to_list(1..1000),
"Bigger" => Enum.to_list(1..10_000)
}
Benchee.run(%{
"flat_map" =>
fn(list) -> Enum.flat_map(list, map_fun) end,
"map.flatten" => fn(list) ->
list
|> Enum.map(map_fun)
|> List.flatten
end
}, inputs: inputs,
formatters: [&Console.output/1, &HTML.output/1],
html: [file: "bench/output/flat_map.html"])
flat_map
##### With input Medium #####
Name ips average deviation median
map.flatten 15.51 K 64.48 ÎŒs ±17.66% 63.00 ÎŒs
flat_map 8.95 K 111.76 ÎŒs ±7.18% 112.00 ÎŒs
Comparison:
map.flatten 15.51 K
flat_map 8.95 K - 1.73x slower
flat_map
How fast is it really? Benchmarking in practice
base_map = (0..50)
|> Enum.zip(300..350)
|> Enum.into(%{})
# deep maps with 6 top level conflicts
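# (some_deep_map and some_deep_map_2 stand for nested maps defined
# elsewhere in the original benchmark script and not shown on the slide)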
orig = Map.merge base_map, some_deep_map
new = Map.merge base_map, some_deep_map_2
simple = fn(_key, _base, override) -> override end
Benchee.run %{
"Map.merge/2" => fn -> Map.merge orig, new end,
"Map.merge/3" =>
fn -> Map.merge orig, new, simple end,
}, formatters: [&Benchee.Formatters.Console.output/1,
&Benchee.Formatters.HTML.output/1],
html: %{file: "bench/output/merge_3.html"}
merge/2 vs merge/3
merge/2 vs merge/3
Is the merge/3 variant about 

– as fast as merge/2? (±20%)
– 2x slower than merge/2
– 5x slower than merge/2
– 10x slower than merge/2
– 20x slower than merge/2
Name ips average deviation median
Map.merge/2 1.64 M 0.61 ÎŒs ±11.12% 0.61 ÎŒs
Map.merge/3 0.0921 M 10.86 ÎŒs ±72.22% 10.00 ÎŒs
Comparison:
Map.merge/2 1.64 M
Map.merge/3 0.0921 M - 17.85x slower
merge/2 vs merge/3
defmodule MyMap do
def map_tco(list, function) do
Enum.reverse do_map_tco([], list, function)
end
defp do_map_tco(acc, [], _function) do
acc
end
defp do_map_tco(acc, [head | tail], function) do
do_map_tco([function.(head) | acc], tail, function)
end
def map_tco_arg_order(list, function) do
Enum.reverse do_map_tco_arg_order(list, function, [])
end
defp do_map_tco_arg_order([], _function, acc) do
acc
end
defp do_map_tco_arg_order([head | tail], func, acc) do
do_map_tco_arg_order(tail, func, [func.(head) | acc])
end
end
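Both variants return the same result; only the order of arguments in the generated code differs. A quick check (my addition):

fun = fn(i) -> i + 1 end
true = MyMap.map_tco([1, 2, 3], fun) == MyMap.map_tco_arg_order([1, 2, 3], fun)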
Does argument order make a difference?
##### With input Middle (100 Thousand) #####
Name ips average deviation median
stdlib map 490.02 2.04 ms ±7.76% 2.07 ms
body-recursive 467.51 2.14 ms ±7.34% 2.17 ms
tail-rec arg-order 439.04 2.28 ms ±17.96% 2.25 ms
tail-recursive 402.56 2.48 ms ±16.00% 2.46 ms
Comparison:
stdlib map 490.02
body-recursive 467.51 - 1.05x slower
tail-rec arg-order 439.04 - 1.12x slower
tail-recursive 402.56 - 1.22x slower
##### With input Big (1 Million) #####
Name ips average deviation median
tail-rec arg-order 39.76 25.15 ms ±10.14% 24.33 ms
tail-recursive 36.58 27.34 ms ±9.38% 26.41 ms
stdlib map 25.70 38.91 ms ±3.05% 38.58 ms
body-recursive 25.04 39.94 ms ±3.04% 39.64 ms
Comparison:
tail-rec arg-order 39.76
tail-recursive 36.58 - 1.09x slower
stdlib map 25.70 - 1.55x slower
body-recursive 25.04 - 1.59x slower
How fast is it really? Benchmarking in practice
But
 it cannot be!
“The order of arguments will likely matter when
we generate the branching code. The order of
arguments will specially matter if performing
binary matching.”
José Valim, 2016
(Comment Section of my blog!)
A wild José appears!
config
|> Benchee.init
|> Benchee.System.system
|> Benchee.benchmark("job", fn -> magic end)
|> Benchee.measure
|> Benchee.statistics
|> Benchee.Formatters.Console.output
|> Benchee.Formatters.HTML.output
A transformation of inputs
How fast is it really? Benchmarking in practice
Always do your own benchmarks!
alias Benchee.Formatters.{Console, HTML}
map_fun = fn(i) -> [i, i * i] end
inputs = %{
"Small" => Enum.to_list(1..200),
"Medium" => Enum.to_list(1..1000),
"Bigger" => Enum.to_list(1..10_000)
}
Benchee.run(%{
"flat_map" =>
fn(list) -> Enum.flat_map(list, map_fun) end,
"map.flatten" => fn(list) ->
list
|> Enum.map(map_fun)
|> List.flatten
end
}, inputs: inputs,
formatters: [&Console.output/1, &HTML.output/1],
html: [file: "bench/output/flat_map.html"])
Remember?
● Elixir 1.4.0-rc.0
● Erlang 19.1
● i5-7200U – 2 x 2.5GHz (Up to 3.10GHz)
● 8GB RAM
● Linux Mint 18 - 64 bit (Ubuntu 16.04 base)
● Linux Kernel 4.4.0-51
Mhm Upgrades
Erlang/OTP 19 [erts-8.1] [source] [64-bit] [smp:4:4] [async-
threads:10] [hipe] [kernel-poll:false]
Elixir 1.4.0-rc.0
Benchmark suite executing with the following configuration:
warmup: 2.0s
time: 5.0s
parallel: 1
inputs: Bigger, Medium, Small
Estimated total run time: 42.0s
Benchmarking with input Bigger:
Benchmarking flat_map...
Benchmarking map.flatten...
Benchmarking with input Medium:
Benchmarking flat_map...
Benchmarking map.flatten...
Benchmarking with input Small:
Benchmarking flat_map...
Benchmarking map.flatten...
flat_map
##### With input Bigger #####
Name ips average deviation median
flat_map 1.76 K 569.47 ÎŒs ±26.95% 512.00 ÎŒs
map.flatten 1.02 K 982.57 ÎŒs ±25.06% 901.00 ÎŒs
Comparison:
flat_map 1.76 K
map.flatten 1.02 K - 1.73x slower
##### With input Medium #####
Name ips average deviation median
flat_map 21.39 K 46.76 ÎŒs ±19.24% 48.00 ÎŒs
map.flatten 14.99 K 66.71 ÎŒs ±18.13% 65.00 ÎŒs
Comparison:
flat_map 21.39 K
map.flatten 14.99 K - 1.43x slower
##### With input Small #####
Name ips average deviation median
flat_map 118.66 K 8.43 ÎŒs ±180.99% 8.00 ÎŒs
map.flatten 79.25 K 12.62 ÎŒs ±97.97% 12.00 ÎŒs
Comparison:
flat_map 118.66 K
map.flatten 79.25 K - 1.50x slower
The tables have turned!
> 2x faster
How fast is it really? Benchmarking in practice
How did that happen?
How fast is it really? Benchmarking in practice
18 minutes later...
How fast is it really? Benchmarking in practice
Enjoy Benchmarking! ❀
Tobias Pfeiffer
@PragTob
pragtob.info
github.com/PragTob/benchee
