Multi-threading your way out
Rokas Antanas Balevičius
Why do we have “race conditions”?
[Diagram: memory hierarchy from RAM through L3 and L2 down to two L1 caches, each feeding a register (RAX)]
Out-of-order execution
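// As written:
public void Thread1()
{
    a = 1;
    b = 1;
}
public void Thread2()
{
    while (b == 0) { continue; }
    Assert(a == 1);
}
// As it may effectively run once the two stores are reordered:
public void Thread1()
{
    b = 1;
    a = 1;
}
public void Thread2()
{
    while (b == 0) { continue; }
    Assert(a == 1); // can now fail
}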
Compiler and JIT optimizations
// As written:
int b = 0;
public void Thread1()
{
    b = 1;
}
public void Thread2()
{
    while (b == 0) { continue; }
    Foo();
}
// As the compiler or JIT may rewrite it, reading b only once:
int b = 0;
public void Thread1()
{
    b = 1;
}
public void Thread2()
{
    if (b == 0)
    {
        while (true) { continue; }
    }
    Foo();
}
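One way to rule out the rewrite above is to make the read explicit. The following is a minimal sketch, assuming the flag is an `int` field and that `Volatile.Read` / `Volatile.Write` from `System.Threading` are acceptable; the class name `HoistingDemo` and the 5-second timeout are made up for illustration and do not come from the slides.

```csharp
using System.Threading;

public static class HoistingDemo
{
    // Hypothetical field for this sketch; it plays the role of `b` on the slide.
    private static int _b;

    // Returns true if the spinning thread observed the write and finished.
    public static bool Run()
    {
        _b = 0;
        var waiter = new Thread(() =>
        {
            // Volatile.Read forces a fresh load on every iteration, so the loop
            // cannot be turned into `if (b == 0) while (true) {}`.
            while (Volatile.Read(ref _b) == 0) { }
        });
        waiter.IsBackground = true; // do not keep the process alive if something goes wrong
        waiter.Start();

        Volatile.Write(ref _b, 1);  // publish the change
        return waiter.Join(5000);   // timeout (assumed value) so a bug cannot hang the caller
    }
}
```

Calling `HoistingDemo.Run()` should return `true`, because the waiter is guaranteed to re-read `_b` until it sees the write.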
Preemption
[Diagram: threads T1, T2, T3 ... TN scheduled one after another, each running for a 20 ms time slice]
Limit compiler and JIT optimizations
Limit CPU optimizations
Do things atomically
Part 1: Memory barriers
“I Think”
- Linus Torvalds
“In practice, however, they are informal prose documents”
- Smart people at Cambridge University
[Diagram: CPU1 and CPU2, each with its own L1 cache, connected by a bus]
Everything goes through Cache
Reads are simple and pull based
[Diagram, step 1: CPU1's L1 is empty; CPU2's L1 holds A]
[Diagram, step 2: both L1 caches hold A]
Writes are exclusive
Modern cache is write-back
Cache is coherent (MESI)
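[Diagram, step 1: both L1 caches hold A]
[Diagram, step 2: CPU1 has a new value A+ to write; both L1 caches still hold A]
[Diagram, step 3: CPU1's L1 holds A+; CPU2's L1 still holds A]
[Diagram, step 4: both L1 caches hold A+]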
Everything goes through Cache
Reads are pull based
Writes are exclusive
Cache is coherent (MESI)
Problem: idle on writes
Let's improve performance
public void Thread1()
{
    a = 1;
    b = 1;
}
public void Thread2()
{
    while (b == 0) { continue; }
    Assert(a == 1);
}
[Diagram, step 1: CPU1 L1 (B, A), store buffer empty; CPU2 L1 (A), store buffer empty]
[Diagram, step 2: CPU1 L1 (B, A), store buffer (A+); CPU2 L1 (A), store buffer empty]
[Diagram, step 3: CPU1 L1 (B+, A), store buffer (A+); CPU2 L1 (A), store buffer empty]
[Diagram, step 4: CPU1 L1 (B+, A), store buffer (A+); CPU2 L1 (A, B+), store buffer empty]
Boom!
CPU2 saw the writes in a different order
CPU1 did everything in order
[Diagram: CPU1 L1 (A+, B+), store buffer empty; CPU2 L1 (A+, B+), store buffer empty]
Write Memory Barrier
public void Thread1()
{
    a = 1;
    Thread.MemoryBarrier(); // the store to a must drain before the store to b
    b = 1;
}
public void Thread2()
{
    while (b == 0) { continue; }
    Assert(a == 1);
}
[Diagram, step 1: CPU1 L1 (B, A), store buffer (A+); CPU2 L1 (A), store buffer empty]
[Diagram, step 2: CPU1 L1 (B, A), store buffer (A+, B+); CPU2 L1 (A), store buffer empty]
[Diagram, step 3: CPU1 L1 (A+, B), store buffer (B+); CPU2 L1 (A), store buffer empty]
[Diagram, step 4: CPU1 L1 (A+, B+), store buffer empty; CPU2 L1 (A, B+), store buffer empty]
[Diagram, step 5: CPU1 L1 (A+, B+), store buffer empty; CPU2 L1 (A+, B+), store buffer empty]
Acknowledging invalidates is slow
Let's improve performance again
[Diagram: CPU1 and CPU2 on a shared bus, each with an L1 cache, a store buffer and an invalidate queue, all empty]
[Diagram, step 1: CPU1 L1 (A, B), store buffer (A+, B+), invalidate queue empty; CPU2 L1 (A), store buffer empty, invalidate queue empty]
[Diagram, step 2: CPU1 L1 (A, B), store buffer (A+, B+), invalidate queue empty; CPU2 L1 (A), store buffer empty, invalidate queue (A)]
[Diagram, step 3: CPU1 L1 (A+, B+), store buffer empty, invalidate queue empty; CPU2 L1 (A), store buffer empty, invalidate queue (A)]
[Diagram, step 4: CPU1 L1 (A+, B+), store buffer empty, invalidate queue empty; CPU2 L1 (A, B+), store buffer empty, invalidate queue (A)]
Read Memory Barrier
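public void Thread1()
{
    a = 1;
    Thread.MemoryBarrier(); // write side: a becomes visible before b
    b = 1;
}
public void Thread2()
{
    while (b == 0) { continue; }
    Thread.MemoryBarrier(); // read side: pending invalidates are applied before a is read
    Assert(a == 1);
}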
[Diagram, step 1: CPU1 L1 (A+, B+), store buffer empty, invalidate queue empty; CPU2 L1 (A, B+), store buffer empty, invalidate queue (A)]
[Diagram, step 2: CPU1 L1 (A+, B+), store buffer empty, invalidate queue empty; CPU2 L1 (A, B+), store buffer empty, invalidate queue empty]
[Diagram, step 3: CPU1 L1 (A+, B+), store buffer empty, invalidate queue empty; CPU2 L1 (A+, B+), store buffer empty, invalidate queue empty]
Memory barriers also limit compiler and JIT optimizations
Full Memory Barrier
[Diagram: code lines grouped around two memory barriers, with the permitted reorderings marked OK]
[Diagram: the same code lines and barriers, with the forbidden reorderings marked NO!]
Barrier pairing
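Pairing means the barrier on the writing side only helps if the reading side has a matching one. Below is a minimal runnable sketch of the Thread1/Thread2 example from the earlier slides, with both barriers in place. The class name `BarrierPairingDemo`, the extra barrier inside the wait loop (added so the JIT cannot hoist the read of `_b`), and the 5-second timeout are assumptions of this sketch, not part of the slides.

```csharp
using System.Threading;

public static class BarrierPairingDemo
{
    // Hypothetical fields standing in for `a` and `b` on the slides.
    private static int _a, _b;

    // Returns the value of _a seen by the reader, or -1 if the reader timed out.
    public static int Run()
    {
        _a = 0;
        _b = 0;
        int observedA = -1;

        var reader = new Thread(() =>
        {
            // The fence in the loop body forces _b to be re-read each iteration.
            while (_b == 0) { Thread.MemoryBarrier(); }
            Thread.MemoryBarrier(); // read side: load of _b is ordered before load of _a
            observedA = _a;
        });
        reader.IsBackground = true;
        reader.Start();

        _a = 1;
        Thread.MemoryBarrier();     // write side: store to _a is ordered before store to _b
        _b = 1;

        // Join also makes the reader's write to observedA visible to this thread.
        return reader.Join(5000) ? observedA : -1;
    }
}
```

With both barriers present, `BarrierPairingDemo.Run()` should return 1. Removing either one reintroduces the reordering window shown in the diagrams, although it may be hard to observe on strongly ordered hardware.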
Part 2: Atomic operation
LOCK CMPXCHG
int locker = 0;
// Atomically set locker to 1 only if it is still 0; true means this thread won.
bool acquired = Interlocked.CompareExchange(ref locker, 1, 0) == 0;
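The compare-and-exchange above is enough to build a very small spin lock. The sketch below is one possible way to do that, assuming 0 means free and 1 means held; `CasLockDemo`, `Enter`, `Exit` and `Run` are names invented for this illustration, and production code would normally use a ready-made primitive instead.

```csharp
using System.Threading;

public static class CasLockDemo
{
    private static int _locker;   // 0 = free, 1 = held
    private static int _counter;  // shared state protected by the lock

    private static void Enter()
    {
        var spinner = new SpinWait();
        // CompareExchange returns the previous value, so 0 means we took the lock.
        while (Interlocked.CompareExchange(ref _locker, 1, 0) != 0)
        {
            spinner.SpinOnce();   // back off, eventually yielding the time slice
        }
    }

    private static void Exit()
    {
        Volatile.Write(ref _locker, 0); // release: earlier writes become visible first
    }

    // Runs `threads` workers that each increment the counter under the lock.
    public static int Run(int threads, int incrementsPerThread)
    {
        _locker = 0;
        _counter = 0;
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int n = 0; n < incrementsPerThread; n++)
                {
                    Enter();
                    _counter++;   // plain read-modify-write, safe only inside the lock
                    Exit();
                }
            });
            workers[i].Start();
        }
        foreach (var worker in workers) worker.Join();
        return _counter;
    }
}
```

`CasLockDemo.Run(4, 10000)` should return 40000: even if a thread is preempted between reading and writing `_counter`, no other thread can enter until the lock is released.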
Solves preemption
Time to get back to reality
C# Thread vs OS Thread
High level critical sections
[Diagram: code lines inside a critical section that opens with an atomic operation plus acquire barrier and closes with an atomic operation plus release barrier; permitted reorderings are marked OK]
[Diagram: the same critical section, with some reorderings marked NO! and others marked Kind of ok]
[Diagram: the same critical section with no annotations]
[Diagram: each end of the critical section drawn as an atomic operation with a memory barrier on both sides]
[Diagram: the code lines enclosed between two C# primitives]
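In everyday C# the two ends of such a section are usually supplied by a built-in primitive. The sketch below assumes the `lock` statement (which the compiler expands to `Monitor.Enter` / `Monitor.Exit`) is a fair stand-in for the "C# Primitive" boxes; the slides do not name a specific primitive, and `CriticalSectionDemo` and its fields are hypothetical.

```csharp
using System.Threading;

public static class CriticalSectionDemo
{
    private static readonly object Gate = new object();
    private static int _x, _y;   // invariant: _x == _y whenever Gate is not held

    // Returns how many times the reader saw the invariant broken.
    public static int Run(int iterations)
    {
        _x = 0;
        _y = 0;
        int violations = 0;

        var writer = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                lock (Gate) { _x++; _y++; }   // both updates happen as one unit
            }
        });
        var reader = new Thread(() =>
        {
            for (int i = 0; i < iterations; i++)
            {
                lock (Gate) { if (_x != _y) violations++; }
            }
        });

        writer.Start();
        reader.Start();
        writer.Join();
        reader.Join();
        return violations;
    }
}
```

`CriticalSectionDemo.Run(100000)` should return 0: the reader can never enter between the two increments, and everything written inside one section is visible to the next thread that enters.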
Spinning
Blocking
Non-blocking algorithms
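A common shape for such algorithms is a compare-and-exchange retry loop: read the current state, compute the new state, and publish it only if nobody changed the state in between. The sketch below shows that pattern as a simple lock-free stack. `LockFreeStack<T>` is a hypothetical type written for this illustration; it relies on the garbage collector so that a node is never reused while another thread still holds a reference to it.

```csharp
using System.Threading;

public sealed class LockFreeStack<T>
{
    private sealed class Node
    {
        public readonly T Value;
        public Node Next;
        public Node(T value) { Value = value; }
    }

    private Node _head;

    public void Push(T value)
    {
        var node = new Node(value);
        while (true)
        {
            Node snapshot = Volatile.Read(ref _head);
            node.Next = snapshot;
            // Publish only if the head is still the one we read; otherwise retry.
            if (Interlocked.CompareExchange(ref _head, node, snapshot) == snapshot)
            {
                return;
            }
        }
    }

    public bool TryPop(out T value)
    {
        while (true)
        {
            Node snapshot = Volatile.Read(ref _head);
            if (snapshot == null)
            {
                value = default(T);
                return false;        // stack is empty
            }
            // Swing the head to the next node if nobody changed it in between.
            if (Interlocked.CompareExchange(ref _head, snapshot.Next, snapshot) == snapshot)
            {
                value = snapshot.Value;
                return true;
            }
        }
    }
}
```

No thread ever waits for another to release anything: a failed compare-and-exchange just means some other thread made progress, and the loser retries with the fresh head.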
It is all about Speculations
Usage and problems
Multithread GC
More non-blocking algorithms
Signaling
IO bound scenarios
….
Thank you
