Anika Technologies
Your Transforming Partner
Java Concurrency and Performance
Training Contents
☛ Description
☛ Intended Audience
☛ Key Skills
☛ Prerequisites
☛ Instructional Method
☛ Course Contents
Java Concurrency and Performance
Course Contents
☛ Producer Consumer (Basic Hand-Off) (Day 1)
☛ Common Issues with Threads
☛ Java Memory Model (JMM)
☛ Applied Threading Techniques
☛ Building Blocks for Highly Concurrent Design
☛ Highly Concurrent Data Structures - Part 1 (Day 2)
☛ Designing for Concurrency
☛ Sharing Objects
☛ Composing Objects
☛ Canned Synchronizers
☛ Structuring Concurrent Applications
☛ Cancellation and Shutdown (Day 3)
☛ Applying Thread Pools
☛ Liveness, Performance, and Testing
☛ Performance and Scalability
☛ Explicit Locks
☛ Building Custom Synchronizers (Day 4)
☛ Atomic Variables and Nonblocking Synchronization
☛ Fork and Join Framework
☛ Crash Course in Modern Hardware
☛ Designing for Multi-core/Processor Environments
☛ Highly Concurrent Data Structures - Part 2
Description:
■ With the advent of multi-core processors, single-threaded programs are fast becoming
obsolete. Java was built to do many things at once; in computing terms this is called
"concurrency", and it is a major reason why Java is so useful. Most of today's applications
run on multiple cores, and concurrent Java programs with multiple threads are the answer
for effective performance and stability on multi-core hardware. Concurrency is one of the
biggest worries for newcomers to Java programming, but there is no reason to let it deter
you: excellent documentation exists, and each topic in this course is presented pictorially
to make it easier to understand. Java threads have also become easier to work with as the
platform has evolved. Multithreaded programming in Java 6 and 7 requires some building
blocks, and our training expert, drawing on rich training and consulting experience,
illustrates them with case studies based on real applications.
Intended Audience:
■ Programmers who want to learn the foundations of concurrent programming and today's
concurrent programming environments, so that they can develop, now or in the future,
multithreaded applications for multi-core processors and shared-memory multiprocessors.
Key Skills:
■ Working with threads and collections on multi-core and multiprocessor machines.
■ Quickly identifying the root cause of poor performance in your applications.
■ Eliminating conditions that prevent you from finding performance bottlenecks.
■ Using the JDK 5, 6, and 7 features that harness the power of the underlying hardware.
Prerequisites:
■ Basic knowledge of Java (introductory course or equivalent practical experience).
Instructional Method:
■ This is an instructor-led course that combines lecture topics with the practical
application of JEE 5.0 and the underlying technologies. Most concepts are presented
pictorially, and a detailed case study strings together the technologies, patterns and
design.
Java Concurrency and Performance
■ Producer Consumer(Basic Hand-Off) ( Day 1 )
■ Why wait-notify requires synchronization
• notifyAll used as work around
• Structural modification to hidden queue by wait-notify
• locking handling done by OS
• use cases for notify-notifyAll
• Hidden queue
• design issues with synchronization
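As a minimal sketch of the basic hand-off covered above (class and field names here are illustrative, not taken from the course material), a single-slot producer/consumer exchange using wait/notify might look like this:

    // Single-slot hand-off between a producer and a consumer.
    // wait()/notify() must be called while holding the monitor lock, and the
    // condition is re-checked in a loop to guard against spurious wakeups.
    class HandOff<T> {
        private T item;                        // the "hidden queue" of size one

        public synchronized void put(T t) throws InterruptedException {
            while (item != null) wait();       // wait until the slot is empty
            item = t;
            notifyAll();                       // wake any waiting consumer
        }

        public synchronized T take() throws InterruptedException {
            while (item == null) wait();       // wait until the slot is filled
            T t = item;
            item = null;
            notifyAll();                       // wake any waiting producer
            return t;
        }
    }

Re-checking the condition in a while loop while holding the monitor is also why wait/notify require synchronization in the first place.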
■ Common Issues with thread
• Uncaught Exception Handler
• problem with stop
• Dealing with InterruptedStatus
■ Java Memory Model(JMM)
• Real Meaning and effect of synchronization
• Volatile
• Sequential Consistency would disallow common optimizations
• The changes in JMM
• Final
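To illustrate the visibility guarantees these JMM topics refer to, here is a hedged sketch of a volatile stop flag (the class and field names are illustrative):

    // Without volatile, the worker thread may never observe the write made by
    // another thread; volatile guarantees visibility, but not atomicity of
    // compound check-then-act sequences.
    class Worker implements Runnable {
        private volatile boolean stopRequested = false;

        public void requestStop() { stopRequested = true; }

        @Override
        public void run() {
            while (!stopRequested) {
                // do one unit of work
            }
        }
    }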
■ Shortcomings of the original JMM
• Finals not really final
• Prevents effective compiler optimizations
• Processor executes operations out of order
• Compiler is free to reorder certain instructions
• Cache reorders writes
• Old JMM surprising and confusing
■ Instruction Reordering
• What is the limit of reordering
• Programmatic Control
• super-scalar processors
• heavily pipelined processors
• As-if-serial-semantics
• Why is reordering done
■ Cache Coherency
• Write-back Caching explained
• What is cache coherence?
• How does it affect Java programs?
• Software based Cache Coherency
• NUMA(Non uniform memory access)
• Caching explained
• Cache incoherency
■ New JMM and goals of JSR-133
• Simple, intuitive and feasible
• Out-of-thin-air safety
• High performance JVM implementations across architectures
• Minimal impact on existing code
• Initialization safety
• Preserve existing safety guarantees and type-safety
■ Applied Threading techniques
• Safe Construction techniques
• Thread Local Storage
• Thread safety levels
• Unsafe Construction techniques
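One way to picture thread-local storage from the list above is a per-thread SimpleDateFormat, since that class is not thread safe. The helper below is an illustrative sketch, written in the pre-Java-8 style consistent with the Java 6/7 focus of this course:

    import java.text.SimpleDateFormat;

    // Each thread gets its own formatter instance, so the non-thread-safe
    // SimpleDateFormat stays confined to a single thread.
    class DateFormats {
        private static final ThreadLocal<SimpleDateFormat> FORMAT =
            new ThreadLocal<SimpleDateFormat>() {
                @Override protected SimpleDateFormat initialValue() {
                    return new SimpleDateFormat("yyyy-MM-dd");
                }
            };

        static String format(java.util.Date d) {
            return FORMAT.get().format(d);
        }
    }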
■ Building Blocks for Highly Concurrent Design
■ CAS
• Wait-free Queue implementation
• Optimistic Design
• Wait-free Stack implementation
• Hardware based locking
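A minimal sketch of an optimistic, CAS-based counter built on java.util.concurrent.atomic.AtomicInteger (the CasCounter class name is invented for the example):

    import java.util.concurrent.atomic.AtomicInteger;

    // Nonblocking counter: read the current value, compute the next one,
    // and retry if another thread got there first.
    class CasCounter {
        private final AtomicInteger value = new AtomicInteger(0);

        public int increment() {
            while (true) {
                int current = value.get();
                int next = current + 1;
                if (value.compareAndSet(current, next)) {
                    return next;               // CAS succeeded, no lock taken
                }
                // CAS failed: another thread updated the value, so retry
            }
        }
    }

The loop only retries when another thread wins the race, so no thread ever blocks while holding a lock.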
■ ABA problem
• Markable reference
• weakCompareAndSet
• Stamped reference
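As one possible illustration of how a stamped reference sidesteps the ABA problem (class and method names are invented for the example):

    import java.util.concurrent.atomic.AtomicStampedReference;

    // A stamp (version number) changes together with the reference, so a value
    // that goes from A to B and back to A is still detected as modified.
    class StampedTop<T> {
        private final AtomicStampedReference<T> top =
            new AtomicStampedReference<T>(null, 0);

        boolean replace(T expected, T update) {
            int[] stampHolder = new int[1];
            T current = top.get(stampHolder);  // read reference and stamp atomically
            return current == expected
                && top.compareAndSet(expected, update,
                                     stampHolder[0], stampHolder[0] + 1);
        }
    }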
■ Reentrant Lock
• ReentrantReadWriteLock
• ReentrantLock
■ Lock Striping
• Lock Striping on LinkNodes
• Lock Striping on table
■ Identifying scalability bottlenecks in java.util.Collection
• segregating them based on Thread safety levels
■ Lock Implementation
• Multiple user conditions and wait queues
• Lock Polling techniques
• Based on CAS
• Design issues with synchronization
■ Highly Concurrent Data Structures-Part1 ( Day 2 )
■ Weakly Consistent Iterators vs Fail Fast Iterators
■ ConcurrentHashMap
• Structure
• remove/put/resize lock
• Almost immutability
• Using volatile to detect interference
• Read does not block in common code path
■ Designing For Concurrency
• Atomicity
• Confinement
• Immutability
• Visibility
• Almost Immutability
• Restructuring and refactoring
■ Sharing Objects
■ Thread confinement
• Stack confinement
• ThreadLocal
• Unshared objects are safe
• Ad-hoc thread confinement
■ Visibility
• Synchronization and visibility
• Non-atomic 64-bit numeric operations
• Problems that state data can cause
• Volatile vs synchronized
• Single-threaded write safety
• Volatile flushing
• Making fields visible with volatile
• Reason why changes are not visible
■ Immutability
• Definition of immutable
• Immutable is always thread safe
• Immutable containing mutable object
• Final fields
■ Safe publication
• Making objects and their state visible
• Safe publication idioms
• How to share objects safely
• "Effectively" immutable objects
■ Publication and escape
• Publishing objects to alien methods
• Publishing objects as method returns
• Implicit links to outer class
• Ways we might let object escape
• Publishing objects via fields
■ Composing Objects
■ Instance confinement
• Split locks
• Example of fleet management
• Java monitor pattern
• Lock confinement
• Encapsulation
• How instance confinement is good
• State guarded by private fields
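A compact sketch of the Java monitor pattern and instance confinement discussed above (the Counter class is illustrative):

    // Java monitor pattern: mutable state is confined to the object and
    // guarded by a private lock, so callers cannot subvert the
    // synchronization policy.
    class Counter {
        private final Object lock = new Object();  // private lock object
        private long count;                        // guarded by "lock"

        public void increment() {
            synchronized (lock) { count++; }
        }

        public long get() {
            synchronized (lock) { return count; }
        }
    }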
■ Documenting synchronization policies
• Examples from the JDK
• Documentation checklist
• What should be documented
• Synchronization policies
• Interpreting vague documentation
■ Adding functionality to existing thread-safe classes
• Benefits of reuse
• Using composition to add functionality
• Subclassing to add functionality
• Modifying existing code
• Client-side locking
■ Designing a thread-safe class
• Pre-condition
• Thread-safe counter with invariant
• Primitive vs object fields
• Encapsulation
• Post-conditions
• Waiting for pre-condition to become true
■ Delegating thread safety
• Independent fields
• Publishing underlying fields
• Delegating safety to ConcurrentMap
• Invariants and delegation
• Using thread safe components
• Delegation with vehicle tracker
■ Canned Synchronizers
• Semaphore
• Latches
• SynchronousQueue
• Future
• Exchanger
• Synchronous Queue Framework
• Mutex
• Barrier
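To give the flavour of these canned synchronizers, here is a small, illustrative CountDownLatch example (the class name and worker count are arbitrary):

    import java.util.concurrent.CountDownLatch;

    // Wait for N worker threads to finish before proceeding.
    class LatchExample {
        public static void main(String[] args) throws InterruptedException {
            final int workers = 3;
            final CountDownLatch done = new CountDownLatch(workers);
            for (int i = 0; i < workers; i++) {
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            // ... do some work ...
                        } finally {
                            done.countDown();      // signal completion
                        }
                    }
                }).start();
            }
            done.await();                          // block until the count reaches zero
            System.out.println("all workers finished");
        }
    }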
■ Structuring Concurrent Applications
■ Finding exploitable parallelism
• Callable controlling lifecycle
• CompletionService
• Limitations of parallelizing heterogeneous tasks
• Callable and Future
• Time limited tasks
• Example showing page renderer with future
• Sequential vs parallel
• Breaking up a single client request
■ The Executor framework
• Memory leaks with ThreadLocal
• Delayed and periodic tasks
• Thread pool structure
• Motivation for using Executor
• Executor lifecycle, state machine
• Difference between java.util.Timer and ScheduledExecutor
• ThreadPoolExecutor
• Decoupling task submission from execution
• shutdown() vs shutdownNow()
• Executor interface
• Thread pool benefits
• Standard ExecutorService configurations
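A short, illustrative sketch of decoupling task submission from execution: a Callable submitted to an ExecutorService, with the result collected through a Future (pool size and class names are arbitrary):

    import java.util.concurrent.*;

    class ExecutorExample {
        public static void main(String[] args) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(4);
            Future<Integer> result = pool.submit(new Callable<Integer>() {
                public Integer call() {
                    return 6 * 7;                  // some computation
                }
            });
            System.out.println(result.get());      // blocks until the task completes
            pool.shutdown();                       // orderly shutdown: no new tasks accepted
        }
    }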
■ Execution policies
• Various sizing options for number of threads and queue length
• In which order? (FIFO, LIFO, by priority)
• Who will execute it?
■ Executing tasks in threads
• Disadvantage of unbounded thread creation
• Single-threaded vs multi-threaded
• Explicitly creating tasks
• Independence of tasks
• Identifying tasks
• Task boundaries
■ Cancellation and Shutdown ( Day 3 )
■ Stopping a thread-based service
• Graceful shutdown
• ExecutorService shutdown
• Providing lifecycle methods
• Asynchronous logging caveats
• Example: A logging service
• Poison pills
• One-shot execution service
■ Task cancellation
• Cancellation policies
• Using flags to signal cancellation
• Reasons for wanting to cancel a task
• Cooperative vs preemptive cancellation
■ Interruption
• Origins of interruptions
• WAITING state of thread
• How does interrupt work?
• Methods that put thread in WAITING state
• Policies in dealing with InterruptedException
• Thread.interrupted() method
■ Dealing with non-interruptible blocking
• Interrupting locks
• Reactions of IO libraries to interrupts
■ Responding to interruption
• Letting the method throw the exception
• Saving the interrupt for later
• Ignoring the interrupt status
• Restoring the interrupt and exiting
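One of the policies above, restoring the interrupt and exiting, might look like the following sketch (the PollingTask class is illustrative):

    // A task that cannot propagate InterruptedException restores the interrupt
    // status before returning, so code higher up the call stack can see it.
    class PollingTask implements Runnable {
        private final java.util.concurrent.BlockingQueue<String> queue;

        PollingTask(java.util.concurrent.BlockingQueue<String> queue) {
            this.queue = queue;
        }

        @Override
        public void run() {
            try {
                while (true) {
                    process(queue.take());
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // restore the interrupt and exit
            }
        }

        private void process(String s) { /* ... */ }
    }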
■ Interruption policies
• Task vs Thread
• Different meanings of interrupt
• Preserving the interrupt status
■ Example: timed run
• Telling a long run to eventually give up
• Canceling busy jobs
■ Handling abnormal thread termination
• Using UncaughtExceptionHandler
• Dealing with exceptions in Swing
• ThreadGroup for uncaught exceptions
■ JVM shutdown
• Shutdown hooks
• Orderly shutdown
• Daemon threads
• Finalizers
• Abrupt shutdown
■ Applying Thread Pools
■ Configuring ThreadPoolExecutor
• Thread factories
• corePoolSize
• Customizing thread pool executor after construction
• Using default Executors.new* methods
• Managing queued tasks
• maximumPoolSize
• keepAliveTime
• PriorityBlockingQueue
■ Saturation policies
• Discard
• Caller runs
• Abort
• Discard oldest
■ Sizing thread pools
• Examples of various pool sizes
• Determining the maximum allowed threads on your operating system
• CPU-intensive vs IO-intensive task sizing
• Danger of hardcoding worker number
• Problems when pool is too large or small
• Formula for calculating how many threads to use
• Mixing different types of tasks
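The sizing formula usually quoted in this context, attributed to Java Concurrency in Practice (Goetz et al.), is: threads ≈ cores × target utilization × (1 + wait time / compute time). A small illustrative helper (names are ours, not from the course material):

    // Rough sizing heuristic: threads = cores * utilization * (1 + W/C)
    class PoolSizing {
        static int poolSize(double targetUtilization, double waitComputeRatio) {
            int cores = Runtime.getRuntime().availableProcessors();
            return (int) Math.ceil(cores * targetUtilization * (1 + waitComputeRatio));
        }
    }

For example, on 8 cores at full utilization, with tasks that wait as long as they compute (W/C = 1), this suggests roughly 16 threads.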
■ Tasks and Execution Policies
• Long-running tasks
• Homogeneous, independent and thread-agnostic tasks
• Thread starvation deadlock
■ Extending ThreadPoolExecutor
• terminated
• Using hooks for extension
• afterExecute
• beforeExecute
■ Parallelizing recursive algorithms
• Using Fork/Join to execute tasks
• Converting sequential tasks to parallel
■ Liveness, Performance, and Testing
■ Avoiding Liveness Hazards
■ Other liveness hazards
• Poor responsiveness
• Livelock
■ Starvation
• ReadWriteLock in Java 5 vs Java 6
• Detecting thread starvation
■ Avoiding and diagnosing deadlocks
• Adding a sleep to cause deadlocks
• "TryLock" with synchronized
• Using open calls
• Verifying thread deadlocks
• Avoiding multiple locks
• Timed lock attempts
• Stopping deadlock victims
• DeadlockArbitrator
• Deadlock analysis with thread dumps
• Unit testing for lock ordering deadlocks
■ Deadlock
• Thread-starvation deadlocks
• Discovering deadlocks
• Checking whether locks are held
• Resource deadlocks
• The drinking philosophers
• Lock-ordering deadlocks
• Defining a global ordering
• Resolving deadlocks
• Causing a deadlock amongst philosophers
• Deadlock between cooperating objects
• Imposing a natural order
• Dynamic lock order deadlocks
• Defining order on dynamic locks
■ Open calls and alien methods
• Example in Vector
■ Performance and Scalability
■ Thinking about performance
• Mistakes in traditional performance optimizations
• 2-tier vs multi-tier
• Evaluating performance tradeoffs
• Performance vs scalability
• Effects of serial sections and locking
• How fast vs how much
■ Reducing lock contention
• How to monitor CPU utilization
• Performance comparisons
• ReadWriteLock
• Using CopyOnWrite collections
• Immutable objects
• Atomic fields
• Using ConcurrentHashMap
• Narrowing lock scope
• Avoiding "hot fields"
• Hotspot options for lock performance
• Reasons why CPUs might not be loaded
• How to find "hot locks"
• Lock splitting
• Dangers of object pooling
• Safety first!
• Reducing lock granularity
• Exclusive locks
■ Lock striping
• In ConcurrentHashMap
• In ConcurrentLinkedQueue
■ Amdahl's and Little's laws
• Formula for Amdahl's Law
• Problems with Amdahl's law in practice
• Applying Little's Law in practice
• Utilization according to Amdahl
• Maximum useful cores
• How threading relates to Little's Law
• Formula for Little's Law
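For reference, the two laws in their usual form, expressed here as small illustrative helpers (method names are ours):

    class ScalingLaws {
        // Amdahl's Law: with serial fraction F of the work, speedup on N
        // processors is bounded by 1 / (F + (1 - F) / N).
        static double amdahlSpeedup(double serialFraction, int processors) {
            return 1.0 / (serialFraction + (1.0 - serialFraction) / processors);
        }

        // Little's Law: L = lambda * W, i.e. items in the system equal the
        // arrival rate times the average time each item spends in the system.
        static double littlesLaw(double arrivalRatePerSec, double avgTimeInSystemSec) {
            return arrivalRatePerSec * avgTimeInSystemSec;
        }
    }

So a 10% serial fraction caps speedup at 10x no matter how many cores are added, and a stage serving 100 requests/s with 50 ms average latency holds about 5 requests in flight.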
■ Costs introduced by threads
• Context switching
• Locking and unlocking
• Cache invalidation
• Spinning before actual blocking
• Lock elision
• Memory barriers
• Escape analysis and uncontended locks
■ Explicit Locks
■ Lock and ReentrantLock
• Using try-finally
• Memory visibility semantics
• Using try-lock to avoid deadlocks
• tryLock and timed locks
• Interruptible locking
• Non-block-structured locking
• ReentrantLock implementation
• Using the explicit lock
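A sketch of the basic explicit-lock idioms listed above: lock with try/finally, and a timed tryLock to avoid waiting forever (the Account class is illustrative):

    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.locks.ReentrantLock;

    class Account {
        private final ReentrantLock lock = new ReentrantLock();
        private long balance;

        void deposit(long amount) {
            lock.lock();
            try {
                balance += amount;
            } finally {
                lock.unlock();                     // never leak the lock
            }
        }

        boolean tryDeposit(long amount, long timeoutMillis) throws InterruptedException {
            if (!lock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;                      // could not acquire in time
            }
            try {
                balance += amount;
                return true;
            } finally {
                lock.unlock();
            }
        }
    }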
■ Synchronized vs ReentrantLock
• Memory semantics
• Prefer synchronized
• Ease of use
■ Performance considerations
• Heavily contended locks
• Java 5 vs Java 6 performance
• Throughput on contended locks
• Uncontended performance
■ Fairness
• Standard non-fair mechanisms
• Throughput of fair locks
• Round-robin by OS
• Barging
• Fair explicit locks in Java
■ Read-write locks
• ReadWriteLock interface
• Understanding system to avoid starvation
■ ReadWriteLock implementation options
• Release preference
• Downgrading
• Reader barging
• Upgrading
• Reentrancy
■ Building Custom Synchronizers ( Day 4 )
■ Explicit condition objects
• Condition interface
• Timed conditions
• Benefits of explicit condition queues
■ AbstractQueuedSynchronizer (AQS)
• Basis for other synchronizers
■ Managing state dependence
• Exceptions on pre-condition fails
• Structure of blocking state-dependent actions
• Crude blocking by polling and sleeping
• Example using bounded queues
• Single-threaded vs multi-threaded
■ Introducing condition queues
• With intrinsic locks
■ Using condition queues
• Waking up too soon
• Conditional waits
• Condition queue
• Encapsulating condition queues
• State-dependence
• notify() vs notifyAll()
• Condition predicate
• Lock
• Waiting for a specific timeout
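Pulling several of these points together, a hedged sketch of a bounded buffer built on an explicit Lock with two Condition objects (the class is illustrative, loosely modelled on the classic bounded-buffer example):

    import java.util.concurrent.locks.*;

    // The condition predicate is always re-checked in a while loop after await().
    class BoundedBuffer<T> {
        private final Lock lock = new ReentrantLock();
        private final Condition notFull = lock.newCondition();
        private final Condition notEmpty = lock.newCondition();
        private final Object[] items = new Object[16];
        private int head, tail, count;

        public void put(T item) throws InterruptedException {
            lock.lock();
            try {
                while (count == items.length) notFull.await();   // wait until not full
                items[tail] = item;
                tail = (tail + 1) % items.length;
                count++;
                notEmpty.signal();                                // wake one consumer
            } finally {
                lock.unlock();
            }
        }

        @SuppressWarnings("unchecked")
        public T take() throws InterruptedException {
            lock.lock();
            try {
                while (count == 0) notEmpty.await();              // wait until not empty
                T item = (T) items[head];
                items[head] = null;
                head = (head + 1) % items.length;
                count--;
                notFull.signal();                                 // wake one producer
                return item;
            } finally {
                lock.unlock();
            }
        }
    }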
■ Missed signals
• InterruptedException
■ Atomic Variables and Nonblocking Synchronization
■ Hardware support for concurrency
• Using "Unsafe" to access memory directly
• CAS support in the JVM
• Compare-and-Set
• Performance advantage of padding
• Nonblocking counter
• Simulation of CAS
• Managing conflicts with CAS
• Compare-and-Swap (CAS)
• Shared cache lines
• Optimistic locking
■ Atomic variable classes
• Optimistic locking classes
• How do atomics work?
• Atomic array classes
• Performance comparisons: Locks vs atomics
• Cost of atomic spin loops
• Very fast when not too much contention
• Types of atomic classes
■ Disadvantages of locking
• Priority inversion
• Elimination of uncontended intrinsic locks
• Volatile vs locking performance
■ Nonblocking algorithms
• Scalability problems with lock-based algorithms
• Atomic field updaters
• Doing speculative work
• AtomicStampedReference
• Nonblocking stack
• Definition of nonblocking and lock-free
• Highly scalable hash table
• The ABA problem
■ Using sun.misc.Unsafe
• Dangers
• Reasons why we need it
■ Fork and Join Framework
• Fork-join decomposition
• Fork and Join
• ParallelArray
• Divide and conquer
• Hardware shapes programming idiom
• Exposing fine grained parallelism
• Anatomy of Fork and Join
• Limitations
• Work Stealing
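A small, illustrative divide-and-conquer sum using the Java 7 Fork/Join framework (the class name and threshold are arbitrary):

    import java.util.concurrent.ForkJoinPool;
    import java.util.concurrent.RecursiveTask;

    // Split the range until it is small enough, then compute directly.
    class SumTask extends RecursiveTask<Long> {
        private static final int THRESHOLD = 1_000;
        private final long[] data;
        private final int from, to;

        SumTask(long[] data, int from, int to) {
            this.data = data; this.from = from; this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) sum += data[i];
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(data, from, mid);
            SumTask right = new SumTask(data, mid, to);
            left.fork();                               // run the left half asynchronously
            return right.compute() + left.join();      // compute the right half, then join
        }

        public static void main(String[] args) {
            long[] data = new long[1_000_000];
            long total = new ForkJoinPool().invoke(new SumTask(data, 0, data.length));
            System.out.println(total);
        }
    }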
■ Crash course in Modern hardware
• Amdahl's Law
■ Cache
• cache controller
• write
• Direct mapped
• read
• Address mapping in cache
■ Memory Architectures
• NUMA
• UMA
■ Designing for multi-core/processor environment
• Concurrent Stack
• Harsh Realities of parallelism
• Parallel Programming
■ Concurrent Objects
• Sequential Consistency
• Linearizability
• Concurrency and Correctness
• Progress Conditions
• Quiescent Consistency
■ Concurrency Patterns
• Lazy Synchronization
• Lock free Synchronization
• Optimistic Synchronization
• Fine grained Synchronization
■ Priority Queues
• Heap Based Unbounded Priority Queue
• Skiplist based Unbounded priority Queue
• Array Based bounded Priority Queue
• Tree based Bounded Priority Queue
■ Lists
• Coarse Grained Synchronization
• Lazy Synchronization
• Optimistic Synchronization
• Non Blocking Synchronization
• Fine Grained Synchronization
■ Skiplists
■ Spinlocks
• Lock suitable for NUMA systems
■ Concurrent Queues
• Unbounded lock-free Queue
• Bounded Partial Queue
• Unbounded Total Queue
■ Concurrent Hashing
• Open Address Hashing
• Closed Address Hashing
• Lock Free Hashing
■ Highly Concurrent Data Structures-Part2
• CopyOnWriteArray(List/Set)
■ NonBlockingHashMap
• For systems with more than 100 cpus/cores
• State based Reasoning
• all CAS spin loop bounded
• Constant Time key-value mapping
• faster than ConcurrentHashMap
• no locks even during resize
■ Queue interfaces
• Queue
• BlockingQueue
• Deque
• BlockingDeque
■ Queue Implementations
■ ArrayDeque and ArrayBlockingQueue
• WorkStealing using Deques
■ LinkedBlockingQueue
■ LinkedBlockingDeque
■ ConcurrentLinkedQueue
• GC unlinking
• Michael and Scott algorithm
• Tails and heads are allowed to lag
• Support for interior removals
• Relaxed writes
■ ConcurrentLinkedDeque
• Same as ConcurrentLinkedQueue except bidirectional pointers
■ LinkedTransferQueue
• Internal removes handled differently
• Heuristics based spinning/blocking on number of processors
• Behavior differs based on method calls
• Usual ConcurrentLinkedQueue optimizations
• Normal and Dual Queue
■ Skiplist
• Lock free Skiplist
• Sequential Skiplist
• Lock based Concurrent Skiplist
■ ConcurrentSkipListMap(and Set)
• Indexes are allowed to race
• Iteration
• Problems with AtomicMarkableReference
• Probabilistic Data Structure
• Marking and nulling
• Different Way to mark
Mobile: +91 7719882295/ 9730463630
Email: sales@anikatechnologies.com
Website: www.anikatechnologies.com