Lessons from the Parallel Universe
November 2015
Itamar Tayer
Otiot.org
Itamar.tayer@cooladata.com
GPARS
@CoolaData
LinkedIn.com/company/CoolaData
www.facebook.com/Cooladata
plus.google.com/+CooladataGplus/posts
AGENDA
Try Groovy!
Parallel programming crisis
What’s wrong with threads?
Enter GPars
Data parallelism & Fork/Join
Actors
Agents
Data flow
Try Groovy
• Almost zero learning curve for Java developers
• Seamless integration with Java classes and libraries
• Lots of good libraries:
  • Grails
  • Gradle
  • GORM
  • Spock
  • JSON
  • And much more…
Parallel Computing Crisis
• Computational power is achieved by increasing the number
of cores on the chip, rather than making each core faster
• More and more models are using a (large) cluster of
machines to solve complex problems
• We need a computing paradigm that allows us to create
reliable parallelism in a natural way
What About 100 Cores?
What’s Wrong with Java Threads?
• “Threads are evil”
• Easy to get wrong, hard to get right
• Hard to reason about the model state
• Bugs (deadlocks, race conditions, etc.) are difficult to discover and track
• Couples application logic with parallelism code
• Threads are a limited, expensive, and low-level resource
• The threading model applies only to local parallelism, not to
distribution
Pointing to the Problem…
class Point {
    private int x, y;
    public void setX(int newVal) {
        x = newVal;
    }
    public void setY(int newVal) {
        y = newVal;
    }
}
// Two threads share one Point p, with no synchronization:
// T1: p.setX(2); p.setY(2)
// T2: p.setX(4); p.setY(4)
Pointing to the Problem…
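class Point {
    private int x, y;
    // Each setter is atomic on its own, but T1 and T2 can still interleave
    // between the two calls, leaving a mixed point such as x=2, y=4.
    public synchronized void setX(int newVal) {
        x = newVal;
    }
    public synchronized void setY(int newVal) {
        y = newVal;
    }
}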
Pointing to the Problem…
class Point {
    private int x, y;
    // Both coordinates are written under a single lock acquisition.
    public synchronized void set(int newX, int newY) {
        x = newX;
        y = newY;
    }
}
Pointing to the Problem…
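class Point {
    private int x, y;
    // null means "leave this coordinate unchanged", so one method covers
    // setting x, y, or both, at the cost of a clumsier API.
    public synchronized void set(Integer newX, Integer newY) {
        if (newX != null) x = newX;
        if (newY != null) y = newY;
    }
}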
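One possible alternative to the nullable-argument workaround, sketched here under assumptions that are not on the slides: keep the point immutable and swap a whole new point through an `AtomicReference`. The names `ImmutablePoint`, `PointHolder`, and `raceTwoWriters` are hypothetical and only serve the illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration, not from the slides: an immutable point
// published through one atomic reference.
final class ImmutablePoint {
    final int x;
    final int y;

    ImmutablePoint(int x, int y) {
        this.x = x;
        this.y = y;
    }
}

final class PointHolder {
    private final AtomicReference<ImmutablePoint> current =
            new AtomicReference<>(new ImmutablePoint(0, 0));

    // Both coordinates become visible together in one reference swap.
    void set(int newX, int newY) {
        current.set(new ImmutablePoint(newX, newY));
    }

    // Changes only x; retried automatically if another thread swapped the point.
    void setX(int newX) {
        current.updateAndGet(p -> new ImmutablePoint(newX, p.y));
    }

    // Readers always get a consistent (x, y) snapshot.
    ImmutablePoint get() {
        return current.get();
    }

    // Mirrors T1 and T2 from the slides: one writes (2,2), the other (4,4).
    static ImmutablePoint raceTwoWriters() {
        PointHolder holder = new PointHolder();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000; i++) holder.set(2, 2); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000; i++) holder.set(4, 4); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return holder.get();
    }

    public static void main(String[] args) {
        ImmutablePoint p = raceTwoWriters();
        System.out.println(p.x + "," + p.y); // either 2,2 or 4,4, never mixed
    }
}
```

Which writer wins is still nondeterministic; the sketch only guarantees that a reader never observes a half-updated point.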
Another Observation
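import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;

public class ValueHolder {
    private List listeners = new LinkedList();
    private int value;
    public interface Listener {
        public void valueChanged(int newValue);
    }
    public void addListener(Listener listener) {
        listeners.add(listener);
    }
    public void setValue(int newValue) {
        value = newValue;
        // No synchronization: a concurrent addListener during this loop
        // can fail with ConcurrentModificationException.
        Iterator i = listeners.iterator();
        while (i.hasNext()) {
            ((Listener) i.next()).valueChanged(newValue);
        }
    }
}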
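The slide does not spell out its observation, so the following is only a guess at one common mitigation: keep the listeners in a `CopyOnWriteArrayList` and hold no lock while notifying them. `SafeValueHolder` and the use of `IntConsumer` as the listener type are assumptions made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.IntConsumer;

// Hypothetical variant of ValueHolder, not from the slides.
final class SafeValueHolder {
    // Iteration runs over a snapshot, so listeners may be added while
    // a notification loop is in progress.
    private final List<IntConsumer> listeners = new CopyOnWriteArrayList<>();
    private volatile int value;

    void addListener(IntConsumer listener) {
        listeners.add(listener);
    }

    void setValue(int newValue) {
        value = newValue;
        // No lock is held here, so a listener that calls back into this
        // holder cannot deadlock on it.
        for (IntConsumer listener : listeners) {
            listener.accept(newValue);
        }
    }

    int getValue() {
        return value;
    }

    public static void main(String[] args) {
        SafeValueHolder holder = new SafeValueHolder();
        List<Integer> seen = new ArrayList<>();
        // This listener registers another listener while being notified.
        holder.addListener(v -> {
            seen.add(v);
            holder.addListener(w -> seen.add(-w));
        });
        holder.setValue(7);
        holder.setValue(8);
        System.out.println(seen); // [7, 8, -8]
    }
}
```

This removes the iteration failure, but it does not order notifications: two threads calling `setValue` at the same time can still deliver their values to listeners in different orders.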
We Need a Better Paradigm
• Describe parallelism problems in an abstract, application-agnostic
model
• Solve concurrency issues at the infrastructure level
• Let developers concentrate on application logic
• Free developers from dealing with concurrency primitives
• Just like memory is handled in Java!
Enter GPars
• The Groovy library for parallelism
• Bundled with Groovy since 1.8
• Offers various parallelism models
• Clean and concise code thanks to Groovy's flexibility
Data Parallelism
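import groovyx.gpars.ParallelEnhancer

def range = 1..100
println Runtime.getRuntime().availableProcessors()
// sequential: every element is printed from the calling thread
range.each { println it + ',' + Thread.currentThread() }
// concurrent: inside asConcurrent, each runs on pool threads
ParallelEnhancer.enhanceInstance(range)
range.asConcurrent {
    range.each { println it + ',' + Thread.currentThread() }
}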
Data Parallelism
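import static groovyx.gpars.GParsPool.withPool

def numbers = [10, 6, 3, 8, 1, 5, 7, 2, 4, 9]
// "Sleep sort": each number sleeps in proportion to its value, then prints itself
def sleeper = { number -> sleep number * 100; print "${number}, " }
withPool(numbers.size()) {
    def sorter = sleeper.asyncFun()
    numbers.each { sorter it }
}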
Actor
• A very common and reliable parallelism model
• The system is built from independent actors that communicate
with each other via messages
• Every actor has a ‘mailbox’ for messages, and messages are
processed one by one
• A scheduler provides threads to actors that need to process
a message
• The program is naturally concurrent; application code does not
need to deal with concurrency
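The mailbox idea can be sketched in plain Java, assuming a blocking queue as the mailbox and one dedicated thread per actor. That is a simplification: the model described above has a scheduler that lends pooled threads to actors. `CounterActor`, its `STOP` marker, and `demo` are hypothetical names.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical minimal actor: one mailbox drained by one thread.
final class CounterActor {
    private static final Object STOP = new Object(); // marker that ends the loop
    private final BlockingQueue<Object> mailbox = new LinkedBlockingQueue<>();
    private int state = 0; // touched only by the actor thread, so no lock is needed
    private final Thread worker = new Thread(this::run);

    CounterActor start() {
        worker.start();
        return this;
    }

    // Any thread may send; messages queue up in arrival order.
    void send(Object message) {
        mailbox.add(message);
    }

    int stopAndGetState() {
        mailbox.add(STOP);
        try {
            worker.join(); // join makes the final state visible to the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return state;
    }

    // Messages are processed strictly one at a time.
    private void run() {
        try {
            while (true) {
                Object message = mailbox.take();
                if (message == STOP) return;
                if (message instanceof String) state += 1;
                else if (message instanceof Integer) state += 2;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Two senders, 100 messages each: 100 * 1 + 100 * 2 = 300.
    static int demo() {
        CounterActor actor = new CounterActor().start();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 100; i++) actor.send("Hello"); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 100; i++) actor.send(12); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return actor.stopAndGetState();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 300
    }
}
```

The senders never touch `state` directly, which is why the counter needs no `synchronized` despite being updated on behalf of two threads.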
Actor Model
Actor Case Study: Online Gaming Model
[Diagram: Zone 01, Zone 02, Zone 03]
Groovy Actor
import groovyx.gpars.actor.DynamicDispatchActor

final class MyDDA extends DynamicDispatchActor {
    def myState = 0
    // The handler is chosen by the runtime type of the message
    void onMessage(String message) {
        myState += 1
        println "Received string ${myState}"
    }
    void onMessage(Integer message) {
        myState += 2
        println "Received integer ${myState}"
    }
}
Using The Actor
def myActor = new MyDDA().start()
// Two threads send to the same actor; the actor handles one message at a time
final Thread t1 = Thread.start {
    while (true) {
        myActor.send "Hello"
        sleep 1000
    }
}
final Thread t2 = Thread.start {
    while (true) {
        myActor << 12
        sleep 1000
    }
}
Actor-Based Concurrency
Agents
• Sometimes sharing memory does make sense
• An agent adds another level of indirection between a reference
and its actual state
• The agent encapsulates the execution
[Diagram labels: Ref X, X1, X2, Agent, X]
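A rough Java sketch of the agent idea, assuming a single-threaded executor as the queue that serializes updates. `SimpleAgent` and `MutablePoint` are hypothetical names, and this is not how any particular library implements agents.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical state object guarded by the agent.
final class MutablePoint {
    int x;
    int y;
}

// Hypothetical agent: callers never touch the state directly, they send
// update functions that one thread applies in submission order.
final class SimpleAgent<T> {
    private final T state;
    private final ExecutorService queue = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "agent");
        t.setDaemon(true); // do not keep the JVM alive for the sketch
        return t;
    });

    SimpleAgent(T initial) {
        this.state = initial;
    }

    // Asynchronous: returns immediately, the update runs later on the agent thread.
    void send(Consumer<T> update) {
        queue.execute(() -> update.accept(state));
    }

    // Queued behind all earlier updates, so the caller sees their combined effect.
    T val() {
        Callable<T> read = () -> state;
        try {
            return queue.submit(read).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        SimpleAgent<MutablePoint> agent = new SimpleAgent<>(new MutablePoint());
        agent.send(p -> { p.x = 2; p.y = 2; });
        agent.send(p -> { p.x = 4; p.y = 4; });
        MutablePoint p = agent.val();
        System.out.println(p.x + "," + p.y); // 4,4
    }
}
```

Because each update function runs to completion before the next one starts, both coordinates always change together, without any lock in the calling code.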
Groovy Agent
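import groovyx.gpars.agent.Agent

class Point {
    int x, y
}
def agent = new Agent(new Point())
// Each closure is queued and applied to the Point one at a time
agent.send { it.x = 2; it.y = 2 }
agent << { it.x = 4; it.y = 4 }
sleep(1000)
def value = agent.val
println "$value.x"
println "$value.y"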
Data Flow
• The dataflow model lets us divide a problem into smaller
parallel computations
• Each subtask is written independently, and the framework takes
care of putting it all together
• Each dataflow variable can be assigned a value only once during its
lifetime. Reading the value blocks until it is ready
• This aligns all the tasks in the right order and creates a
completely deterministic flow
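The same idea can be approximated in plain Java with `CompletableFuture`, used here as a stand-in for a write-once variable. This is a sketch under that assumption, and `DataflowSketch` with its methods is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of single-assignment variables in plain Java.
final class DataflowSketch {
    // z = x + y is wired up before x or y has a value.
    static int compute() {
        CompletableFuture<Integer> x = new CompletableFuture<>();
        CompletableFuture<Integer> y = new CompletableFuture<>();
        CompletableFuture<Integer> z = x.thenCombine(y, Integer::sum);

        // The producers run asynchronously, in no particular order.
        CompletableFuture.runAsync(() -> y.complete(5));
        CompletableFuture.runAsync(() -> x.complete(10));

        return z.join(); // blocks until both inputs have been bound
    }

    // A second write is rejected: complete() returns false and the first value stays.
    static boolean secondWriteAccepted() {
        CompletableFuture<Integer> v = new CompletableFuture<>();
        v.complete(1);
        return v.complete(2);
    }

    public static void main(String[] args) {
        System.out.println("Result: " + compute()); // Result: 15
        System.out.println(secondWriteAccepted());  // false
    }
}
```

The result is 15 regardless of which producer runs first, because the sum only fires once both inputs are bound; that is the determinism the last bullet refers to.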
Data Flow
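import groovyx.gpars.dataflow.DataflowVariable
import static groovyx.gpars.dataflow.Dataflow.task

final def x = new DataflowVariable()
final def y = new DataflowVariable()
final def z = new DataflowVariable()
// The opening brace must stay on the same line as task,
// otherwise Groovy does not pass the closure to it
task {
    z << x.val + y.val   // blocks until both x and y are bound
}
task {
    x << 10   // or get some value from the network
}
task {
    y << 5    // or do a long computation
}
println "Result: ${z.val}"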
Build vs. Buy
CoolaData Offers the Best of Both Worlds
• ETL
• Real-time processing: HBase, Couchbase, Cassandra
• Interactive processing: Exasol, Vertica, Redshift
• Batch processing: Hadoop, Hive
• Real-time processing: Storm, Kinesis
• Data visualization: Excel, Tableau, QlikView
• Structured and unstructured data: HDFS, S3
vs.
• Lower cost of ownership
• Faster time to market
• Future proof: customizable and open
• Stronger analytical power enabling predictive and proactive analytics
• Development requires specific know-how and experience, not core to the business
Low Risk, Quick 2 Results
Grows with your organizational needs
Proven
