Parallel Extensions
Chester Hartin
Code Review 11/30/12
History of parallelization
 Definition: a form of computation in which many
  calculations are carried out simultaneously,
  operating on the principle that large problems
  can be divided into smaller ones, which are then
  solved concurrently (“in parallel”)

 Developers have tried to improve performance
  by parallelizing problems, even before true
  multicore systems existed

 How is this different from multithreading?
  Multithreading is one way to achieve parallelism;
   threads only run truly in parallel when there is
   more than one core to run them on
Real-life parallelization
 Consider that we have some eggs to boil (data
  to process)

 Before the early 2000s we only had 1 pot (core)
  and more eggs than fit in it at once. We could
  still boil >1 egg at a time, since a pot holds
  >1 egg, but the rest had to wait. After the
  early 2000s we had 2 pots, and thus could boil
  twice as many eggs at once.
Pots vs Boil time
Given: We have 10 eggs to boil and each egg requires 8
minutes in order to be ready to eat. Each pot holds up to
5 eggs.
Number of pots                Boil Time
1                             16 min
2                             8 min
3                             8 min
4                             8 min

Something interesting occurs… Adding more than 2 pots
does nothing to decrease the overall time.
Time efficiency
[Chart: Time to boil 100 eggs]
What does this mean?
 Past a certain point it doesn’t matter how many
  cores we use. Once every batch of eggs has its
  own pot, this problem will not speed up by
  adding more cores.
 Our equations are pretty simple:
  Pots Needed = Number of Eggs/Eggs Per Pot
    20 = 100/5
  Time = Ceiling(Pots Needed/Pots) x Boiling Time
    160 = Ceiling(20/1) x 8

 In Computer Terms:
  ExecutionTime = Ceiling(Number of
   threads/Cores) x ThreadExecutionTime
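
The equations above can be sketched in C# roughly as follows. This is only an illustration, not code from the talk: the class and method names (EggMath, BatchesNeeded, BoilTime) are made up here, and the slide's "Pots Needed" is called "batches" to keep it apart from the number of pots we actually own.

```csharp
using System;

public static class EggMath
{
    // "Pots Needed" on the slide: how many pot-loads the eggs take up.
    // Integer ceiling division, i.e. Ceiling(eggs / eggsPerPot).
    public static int BatchesNeeded(int eggs, int eggsPerPot) =>
        (eggs + eggsPerPot - 1) / eggsPerPot;

    // Time = Ceiling(batches / pots) x boiling time.
    public static int BoilTime(int eggs, int eggsPerPot, int pots, int boilMinutes)
    {
        int batches = BatchesNeeded(eggs, eggsPerPot);
        int rounds = (batches + pots - 1) / pots; // Ceiling(batches / pots)
        return rounds * boilMinutes;
    }

    public static void Main()
    {
        // Reproduces the "Pots vs Boil time" table: 10 eggs, 5 per pot, 8 min each.
        for (int pots = 1; pots <= 4; pots++)
            Console.WriteLine($"{pots} pot(s): {BoilTime(10, 5, pots, 8)} min");
        // Prints 16, 8, 8 and 8 minutes for 1 to 4 pots.
    }
}
```

With 10 eggs there are only 2 batches, so the ceiling term bottoms out at 1 once we have 2 pots, which is why the 3rd and 4th pot change nothing.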
Caveat
 In the egg example we assume…
  Thread execution time is constant (never happens)
  Each core executes one thread at a time
   and does not continue w/ the next thread in the
   queue until it’s finished – i.e. a quad core
   processor executes 4 threads at once, giving the
   same result as the egg boiling w/ 4 pots
Short attention span LINQ
 Warning: I only really know basic LINQ (slowly
  integrating it into the Real Feeds Project where I
  can use it)

 LINQ = Language Integrated Query

 Something something query – gotcha. Looks like
  SQL in reverse (we know SQL, right?)

 Layman's terms – LINQ works against collections
  of data (any data really that has an enumerator)
  to get all or a subset of the data
Simple LINQ
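// Requires: using System.Collections.Generic; using System.Linq;
var ages = new List<int> { 25, 21, 18, 65 };

// Query syntax: sort the ages from lowest to highest
var agesInOrder = from age in ages
                  orderby age ascending
                  select age;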
Parallelization and LINQ
 Let’s vary our boil time
       Egg                   Batch   Boil Time (min)
       1                     1       8
       2                     1       4
       3                     1       8
       4                     1       8
       5                     1       4
       6                     2       4
       7                     2       4
       8                     2       8
       9                     2       6
Ex 2
 Take a collection of egg boil times

 Iterate over the collection and look at 5 items at
  a time
  Find the longest cooking time among the eggs in
   the current batch

 Simulate the boiling time with Thread.Sleep

 Stop looking for eggs when there are <5 eggs in
  the current batch
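
A rough sketch of what these steps could look like as a plain loop. The Boil() code from the talk is not part of this transcript, so everything here is an assumption: the names (EggBoiler, PotSize, MsPerMinute), the simulate flag, and the scale of 100 ms per egg-minute, which is guessed from the ~1600ms figure quoted for the first run.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class EggBoiler
{
    const int PotSize = 5;        // eggs looked at per batch
    const int MsPerMinute = 100;  // assumed scale: 1 boil minute = 100 ms

    // Returns the total simulated boil time in minutes.
    // Pass simulate: false to skip the Thread.Sleep calls.
    public static int Boil(IReadOnlyList<int> boilMinutes, bool simulate = true)
    {
        int totalMinutes = 0;
        for (int start = 0; start < boilMinutes.Count; start += PotSize)
        {
            // Look at (up to) 5 eggs at a time.
            var batch = boilMinutes.Skip(start).Take(PotSize).ToList();

            // The batch is done when its slowest egg is done.
            int longest = batch.Max();
            if (simulate) Thread.Sleep(longest * MsPerMinute);
            totalMinutes += longest;

            // Fewer than 5 eggs means this was the last batch.
            if (batch.Count < PotSize) break;
        }
        return totalMinutes;
    }

    public static void Main()
    {
        // Boil times from the "Parallelization and LINQ" table.
        var eggs = new List<int> { 8, 4, 8, 8, 4, 4, 4, 8, 6 };
        Console.WriteLine(Boil(eggs)); // 16 (8 for batch 1 + 8 for batch 2)
    }
}
```

At the assumed scale, 16 minutes comes out at about 1600 ms of sleeping.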
Ex 2 (cont)
 First run – ~1600ms

 2nd run – ~1200ms. Why?
  Put 5 eggs in the pot
  After 4 min, remove 2nd from last egg
  After 8 min, remove remaining egg
  Add next batch that contains 1 egg
  After 4 min, remove the egg from the pot
LINQ to the rescue
 Any for / foreach loop can potentially be
  converted into LINQ

 Compare Boil() code v1 & v2

 Note :
  Optimized (1600ms vs 1200ms)
  Nicer to read
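
The two Boil() versions being compared are not included in this transcript. As a stand-in, here is one way a batch-of-5 loop might be written as a single LINQ query. The names (EggBoilerLinq, TotalBoilMinutes) are hypothetical, and this sketch only shows the loop-to-LINQ conversion: it computes the same total a loop would, so it does not reproduce the 1600ms vs 1200ms difference.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EggBoilerLinq
{
    // Groups the eggs into batches of potSize by position, takes the
    // longest boil time in each batch, and adds those up.
    public static int TotalBoilMinutes(IEnumerable<int> boilMinutes, int potSize = 5) =>
        boilMinutes
            .Select((minutes, index) => new { minutes, index })
            .GroupBy(egg => egg.index / potSize)
            .Sum(batch => batch.Max(egg => egg.minutes));

    public static void Main()
    {
        var eggs = new[] { 8, 4, 8, 8, 4, 4, 4, 8, 6 };
        Console.WriteLine(TotalBoilMinutes(eggs)); // 16
    }
}
```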
Parallelize It!
 We have 2 options
  Parallel extensions
  Parallel LINQ
Parallel Extensions
 Introduced in .NET 4.0

 Has 2 important methods that we’ll focus on
    Parallel.For
      Parallel.For(0, eggs.Length, i => {});
    Parallel.ForEach
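
A minimal sketch of both methods, assuming the eggs are an int[] of boil times. BoilOne is a made-up stand-in for real work and the 10 ms scale is arbitrary. Parallel.For hands the body an index, so each iteration can write to its own array slot; Parallel.ForEach hands it the item, so results go into a thread-safe collection instead.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelEggs
{
    // Hypothetical unit of work: "boil" one egg, then hand it back.
    static int BoilOne(int minutes)
    {
        Thread.Sleep(minutes * 10); // arbitrary scale for the demo
        return minutes;
    }

    // Index-based: every iteration writes to a different slot,
    // so no locking is needed and the order is preserved.
    public static int[] BoilWithFor(int[] eggs)
    {
        var done = new int[eggs.Length];
        Parallel.For(0, eggs.Length, i => { done[i] = BoilOne(eggs[i]); });
        return done;
    }

    // Item-based: no index available, so collect into a ConcurrentBag.
    // The bag does not keep the original order.
    public static int BoilWithForEach(int[] eggs)
    {
        var done = new ConcurrentBag<int>();
        Parallel.ForEach(eggs, egg => done.Add(BoilOne(egg)));
        return done.Sum();
    }

    public static void Main()
    {
        var eggs = new[] { 8, 4, 8, 8, 4 };
        Console.WriteLine(string.Join(", ", BoilWithFor(eggs)));
        Console.WriteLine(BoilWithForEach(eggs));
    }
}
```

Both calls block until every iteration has finished, so the results are complete by the time the methods return.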
Parallel LINQ
 Say we have a list of web requests that we need
  to make

 Each call takes a certain amount of time & we
  want to run them in parallel

 In previous examples we’ve relied on an index,
  but suppose we can’t
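
One possible shape for this with Parallel LINQ, sketched under some assumptions: FakeRequest is a hypothetical stand-in that sleeps instead of touching the network, and the example.com URLs are placeholders. AsParallel() turns the query into a parallel one without needing an index, and AsOrdered() asks PLINQ to return results in source order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

public static class ParallelRequests
{
    // Hypothetical stand-in for a web request: waits a bit, then
    // returns the URL length as a fake "response size".
    static int FakeRequest(string url)
    {
        Thread.Sleep(50);
        return url.Length;
    }

    // Runs the requests in parallel; no index needed.
    public static List<int> FetchAll(IEnumerable<string> urls) =>
        urls.AsParallel()
            .AsOrdered() // keep results in the same order as the input
            .Select(url => FakeRequest(url))
            .ToList();

    public static void Main()
    {
        var urls = new[]
        {
            "https://example.com/a",
            "https://example.com/bb",
            "https://example.com/ccc",
        };
        foreach (var size in FetchAll(urls))
            Console.WriteLine(size);
    }
}
```

Dropping AsOrdered() lets PLINQ hand back results in whatever order they finish, which can be cheaper when order doesn't matter.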
Example 3

