Optimizing Performance in
Real-World Problems
Nikola Peric
nikola.peric03@gmail.com
February 2016
About me
◉ Full-stack developer at market research firm, Synqrinus
http://www.synqrinus.com/
◉ Synqrinus conducts online surveys and focus groups so
a lot of my work has to do with automating the data
analysis from those sources
The Basics
Wax on, wax off
0
When and why to optimize for
performance
◉ Performance, for most scenarios, is a
secondary need
◉ It arises after the initial application is built
◉ Optimization allows for your users to be more
efficient, effective, or have a better experience
0
First step
◉ Is to not use any of the functions and techniques I’m
going to talk about
◉ It’s about reducing redundant calls in your code
◉ It’s about cleaning up and optimizing your initial code to
begin with
◉ For many everyday cases, this alone will be enough
0
Tools of the trade
◉ Benchmark, benchmark, benchmark
◉ Any change made with performance in mind should be
measured
◉ A more advanced alternative to simply running time
across multiple iterations is the Criterium library
◉ https://github.com/hugoduncan/criterium
0
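As a minimal sketch of the two approaches (the Criterium call is shown commented out, since it assumes the library is on the classpath):

```clojure
;; Built-in `time` gives one noisy wall-clock measurement:
(time (reduce + (range 100000)))
;; "Elapsed time: ... msecs"

;; Criterium warms up the JIT and reports statistics across many runs.
;; Assumes [criterium "0.4.x"] is on the classpath:
;; (require '[criterium.core :refer [quick-bench bench]])
;; (quick-bench (reduce + (range 100000)))
```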
Memoization
Think memory, but without the r
1
What is memoization?
◉memoize wraps a function with a basic cache in the
form of an atom
◉Think of it as “remembering” the output to a given
input
◉The parameters passed through to a given function
are treated as the keys to the map stored in the atom
◉When the function is called with the same parameters,
there is no recalculation necessary and the result is
simply looked up
1
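A small sketch of that lookup behaviour, using a call counter to show the cache hit (the names here are illustrative, not from the talk):

```clojure
(def call-count (atom 0))

(defn slow-square
  "Stands in for a computationally intensive function."
  [x]
  (swap! call-count inc) ;; counts how often the real work runs
  (* x x))

(def fast-square (memoize slow-square))

(fast-square 5) ;; => 25, computed and cached under the key (5)
(fast-square 5) ;; => 25, looked up; slow-square is not called again
@call-count     ;; => 1
```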
When should I use
memoization?
Do use
◉ if you are sending the same
parameters as inputs to
computationally intensive
functions
◉ if the function calls are
referentially transparent (i.e. the
output alone is sufficient)
Do not use
◉ if you expect the output to
change over time
◉ if there are side effects you
expect to run within the function
◉ if your outputs/inputs are
sufficiently large that they
would cost a sizable amount of
memory
1
Problem time!
Background
◉ With some of the data we work with there is a map that
requires retrieval and formatting from the database
before we can work with it
◉ Oftentimes when one project is being analyzed, the
same map of data has to get formatted repeatedly
◉ This seemed like a perfect opportunity to use
memoization
1
Problem time!
Before
(defn format-syn-datamap
[datamap]
(->> datamap
(map #(into {}
{(keyword (:id %))
(:map %)}))
(apply merge)))
(defn formatted-datamap
[datamap-id]
(format-syn-datamap (db/get-datamap datamap-id)))
12.7 ms
Criterium bench execution time means
1
Problem time!
Before After
(defn format-syn-datamap
[datamap]
(->> datamap
(map #(into {}
{(keyword (:id %))
(:map %)}))
(apply merge)))
(defn formatted-datamap
[datamap-id]
(format-syn-datamap (db/get-datamap datamap-id)))
(def formatted-datamap
(memoize
(fn [datamap-id]
(format-syn-datamap
(db/get-datamap datamap-id)))))
12.7 ms 95.9 ns
>100,000x faster
Criterium bench execution time means
fun fact: 1ms=1,000,000ns
(differs based on actual scenario)
1
core.memoize
◉ If you find yourself using memoize you’ll notice that
there are some features that would be nice to have,
such as…
◉ … clearing the cache
◉ … limiting the size of the cache (e.g. to speed up access
for commonly accessed results, or recently accessed)
◉ For this, and more, there’s core.memoize
◉ https://github.com/clojure/core.memoize
1
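To see what core.memoize adds over plain memoize, here is a hand-rolled sketch of a memoized function whose cache can be cleared on demand; core.memoize's memo-clear! and its LRU/TTL caches are the production-ready versions of this idea (all names below are illustrative):

```clojure
(defn clearable-memoize
  "Like clojure.core/memoize, but also returns a function
   that empties the cache."
  [f]
  (let [cache (atom {})]
    {:fn (fn [& args]
           (if-let [e (find @cache args)]
             (val e)
             (let [result (apply f args)]
               (swap! cache assoc args result)
               result)))
     :clear! (fn [] (reset! cache {}))}))

(let [{cached :fn clear! :clear!} (clearable-memoize inc)]
  (cached 1)  ;; computed, then cached
  (cached 1)  ;; cache hit
  (clear!)    ;; cache emptied
  (cached 1)) ;; recomputed
```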
Parallelization (with
pmap)
This is one of the things Clojure’s good for, right?
2
What is parallelization
and pmap?
◉Parallelization is running multiple calculations at the same
time (across multiple threads)
◉pmap is basically a “parallelized map”
◉Note: pmap is lazy! Simply calling pmap won’t cause any
work to begin
◉What pmap tries to do is wrap each element of the coll(s)
you are mapping as a future, and then attempt to deref and
synchronize based on the number of threads available
◉Sounds confusing, right? A simpler way to imagine it would
be: (doall (map #(future (f %)) coll))
2
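A sketch of that trade-off with a deliberately CPU-heavy stand-in function (the names and sizes are illustrative):

```clojure
(defn heavy
  "CPU-heavy stand-in: sum of squares below n."
  [n]
  (reduce + (map #(* % %) (range n))))

(def xs [200000 200000 200000 200000])

;; Same results; pmap farms each call out to a future and keeps
;; roughly (+ 2 available-processors) of them in flight at once:
(= (doall (map heavy xs))
   (doall (pmap heavy xs))) ;; => true
```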
When should I use pmap?
Do use
◉ if the function that is being
mapped is a computationally
heavy function
◉ if we’re talking about CPU
intensive tasks
Do not use
◉ if the time saved from running
the function in parallel will be
lost from coordination of the
items in the collection
◉ if you don’t want to max the
CPU
Also note
◉ There are so many other ways to apply parallel processing in Clojure!
We’ll talk about one more later, but if performance is important to you,
you will want to read more about it
◉ Useful functions: future, delay, promise
2
Problem time!
Background
◉ We have a collection of maps as the raw data
(thousands of items in the coll)
◉ We want to run a computationally intensive function and
use the outputs to generate a new map (calc-fn)
◉ We also want to map this process multiple times, once
for each variable we wish to calculate
◉ Note: for the sake of this example, some elements of the
following fn have been simplified
2
Problem time!2
Before
27.4 ns
Criterium bench execution time means
(defn row-calc [data weight-var vars calc-fn conditions]
(map
(fn [v]
(map #(let [[value size] (calc-fn v %1 nil weight-var)]
{:value value
:size size
:conditions %2})
data conditions)) vars))
Note: depending on complexity of arguments,
calc-fn may be very computationally intensive, or
not much at all. I chose a very basic set of
arguments for this benchmark
Problem time!2
Before After
27.4 ns 22.1 ns
>1.2x faster (~20%)
Criterium bench execution time means
(defn row-calc [data weight-var vars calc-fn conditions]
(map
(fn [v]
(map #(let [[value size] (calc-fn v %1 nil weight-var)]
{:value value
:size size
:conditions %2})
data conditions)) vars))
(defn row-calc [data weight-var vars calc-fn conditions]
(map
(fn [v]
(pmap #(let [[value size] (calc-fn v %1 nil weight-var)]
{:value value
:size size
:conditions %2})
data conditions)) vars))
(differs based on actual scenario)
Reducers
More parallelization… and more!
3
What are reducers?
◉While pmap gives you a parallel map, if you want a parallel
reduce, there are reducers!
◉core.reducers offers parallelization for common functions
such as map, filter, mapcat, flatten*
◉Imagine a scenario where you are applying a map over a filter
◉What if you could compute these not sequentially, but in
parallel, i.e. reduce through your collection(s) only once?
◉That’s the power of reducers
3
*caveat – some functions in core.reducers do not support parallelization (e.g. take, take-while, drop)
How do I use core.reducers?
◉Reference clojure.core.reducers namespace (we will be
aliasing the namespace as “r” from here on)
◉Create a reducer from one of the following: r/map, r/mapcat,
r/filter, r/remove, r/flatten, r/take-while, r/take,
r/drop
◉Apply the reduction with one of the following functions:
r/reduce, r/fold, r/foldcat, into, reduce
3
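Putting those three steps together (reference the namespace, build a reducer, apply the reduction), the pipeline below is a minimal illustration:

```clojure
(require '[clojure.core.reducers :as r])

;; Step 2: r/filter and r/map build a recipe -- no work happens yet
(def evens-doubled
  (->> (vec (range 10))
       (r/filter even?)
       (r/map #(* 2 %))))

;; Step 3: realize it; into uses reduce under the hood
(into [] evens-doubled) ;; => [0 4 8 12 16]
```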
What is fold?
◉ fold is a parallelized reduce/combine form of reduce
◉It is used in the form
(r/fold reducing-fn reducer)
◉ reducing-fn must be associative
◉ reducing-fn must be a monoid (i.e. give its identity even
when 0 arguments are passed)
◉ fold does all this by chunking your collection into smaller
parts, and then reducing and combining them back together
all while maintaining order
◉Essentially it’s reduce on steroids
3
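A minimal fold, using + as the reducing fn (it is associative and returns its identity, 0, when called with no arguments); note that fold only runs in parallel over vectors and maps, and falls back to a sequential reduce on other collections:

```clojure
(require '[clojure.core.reducers :as r])

;; Chunks the vector (default chunk size 512), reduces each chunk
;; with +, then combines the partial sums back together with +:
(r/fold + (vec (range 1001))) ;; => 500500
```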
When should I use reducers?
Do use
◉ if you want easy to use
parallelism for commonly used
functions such as map or filter
◉ if you have a large amount of
data to apply computations to
(see fold)
◉ if you want a parallel reduce
Do not use
◉ if you don’t care for parallelism
and really just wanted
composed functions that iterate
through all items once (in which
case, see transducers
http://clojure.org/reference/transducers)
◉ if you don’t want to max the
CPU (for most core.reducers
features)
3
Problem time!
Background
◉ We want to map through a large collection of maps and
select a single value from each map
◉ Then from the result sequence we sum up the values
◉ This is an excellent test of fold’s parallel
partitioning/reducing, and r/map’s parallelism
3
Problem time!3
Before
136.3 μs
Criterium bench execution time means
(defn weighted-total [data weight-var]
(reduce + (map weight-var data)))
Problem time!3
Before After
136.3 μs 29.4 μs
>4.6x faster
Criterium bench execution time means
(defn weighted-total [data weight-var]
(reduce + (map weight-var data)))
(defn weighted-total [data weight-var]
(r/fold + (r/map weight-var data)))
Closing Thoughts4
Stop
◉ Does the business value created from
pursuing additional optimization outweigh the
investment?
◉ If no, stop
◉ If yes, continue
4
Finding areas for optimization
◉ Oftentimes there are multiple areas that can require
attention
◉ Possible elements to look for include…
◉ map/filter/any manipulation of collections
◉ Calculations that are known to be computationally
expensive (parallelize or memoize if reasonable)
4
Summary
◉ Benchmark, benchmark, benchmark
◉ Sometimes a perceived optimization can cost you time
under certain scenarios
◉ Optimize only when reasonable to do so
◉ There are trade-offs to optimization
◉ Happy efficiency hunting!
4
Any questions?
You can reach me at
◉ nikola.peric03@gmail.com
Thank you!
