Locks? We don’t need no stinkin’ Locks!

@mikeb2701
http://bad-concurrency.blogspot.com
Image: http://subcirlce.co.uk
Memory Models
Happens-Before
Causality
  "Fear will keep the local systems [instructions] in line."
            - Grand Moff Wilhuff Tarkin
•   Loads are not reordered with other loads.
•   Stores are not reordered with other stores.
•   Stores are not reordered with older loads.
•   In a multiprocessor system, memory ordering obeys causality (memory
    ordering respects transitive visibility).
•   In a multiprocessor system, stores to the same location have a total order.
•   In a multiprocessor system, locked instructions to the same
    location have a total order.
•   Loads and Stores are not reordered with locked instructions.
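The rules above are hardware guarantees; in Java the portable equivalent is the happens-before edge created by a volatile write and the volatile read that observes it. The following is a minimal sketch of that message-passing pattern. The class name `MessagePassingSketch` and its methods are hypothetical and are not taken from the slides.

```java
// Hypothetical demo class (an assumption for illustration, not from the talk).
final class MessagePassingSketch {
    private long payload;            // plain field, published via the flag below
    private volatile boolean ready;  // volatile write/read creates happens-before

    // Writer: plain store first, then the volatile store that publishes it.
    void write(long value) {
        payload = value;
        ready = true;
    }

    // Reader: spin on the volatile load, then read the plain field.
    long read() {
        while (!ready) {
            Thread.yield();
        }
        return payload;
    }

    // Starts a writer thread and returns what the calling thread observed.
    static long run() {
        MessagePassingSketch s = new MessagePassingSketch();
        Thread writer = new Thread(() -> s.write(42L));
        writer.start();
        long seen = s.read();
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because the reader only returns after seeing `ready == true`, the Java Memory Model guarantees it also sees the earlier store to `payload`.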
Non-Blocking Primitives
Unsafe
public class AtomicLong extends Number
                        implements Serializable {

    // ...
    private volatile long value;

    // ...
    /**
      * Sets to the given value.
      *
      * @param newValue the new value
      */
    public final void set(long newValue) {
         value = newValue;
    }

    // ...
}
# {method} 'set' '(J)V' in 'java/util/concurrent/atomic/AtomicLong'
# this:       rsi:rsi   = 'java/util/concurrent/atomic/AtomicLong'
# parm0:      rdx:rdx   = long
#             [sp+0x20] (sp of caller)
  mov    0x8(%rsi),%r10d
  shl    $0x3,%r10
  cmp    %r10,%rax
  jne    0x00007f1f410378a0 ;     {runtime_call}
  xchg   %ax,%ax
  nopl   0x0(%rax,%rax,1)
  xchg   %ax,%ax
  push   %rbp
  sub    $0x10,%rsp
  nop
  mov    %rdx,0x10(%rsi)
  lock addl $0x0,(%rsp)     ;*putfield value
                            ; - j.u.c.a.AtomicLong::set@2 (line 112)
  add    $0x10,%rsp
  pop    %rbp
  test   %eax,0xa40fd06(%rip)         # 0x00007f1f4b471000
                            ;   {poll_return}
public class AtomicLong extends Number
                        implements Serializable {


    // setup to use Unsafe.compareAndSwapLong for updates
    private static final Unsafe unsafe = Unsafe.getUnsafe();
    private static final long valueOffset;

    // ...
    /**
      * Eventually sets to the given value.
      *
      * @param newValue the new value
      * @since 1.6
      */
    public final void lazySet(long newValue) {
         unsafe.putOrderedLong(this, valueOffset, newValue);
    }

    // ...
}
# {method} 'lazySet' '(J)V' in 'java/util/concurrent/atomic/AtomicLong'
# this:       rsi:rsi   = 'java/util/concurrent/atomic/AtomicLong'
# parm0:      rdx:rdx   = long
#             [sp+0x20] (sp of caller)
  mov    0x8(%rsi),%r10d
  shl    $0x3,%r10
  cmp    %r10,%rax
  jne    0x00007f1f410378a0 ;     {runtime_call}
  xchg   %ax,%ax
  nopl   0x0(%rax,%rax,1)
  xchg   %ax,%ax
  push   %rbp
  sub    $0x10,%rsp
  nop
  mov    %rdx,0x10(%rsi)     ;*invokevirtual putOrderedLong
                             ; - AtomicLong::lazySet@8 (line 122)
  add    $0x10,%rsp
  pop    %rbp
  test   %eax,0xa41204b(%rip)         # 0x00007f1f4b471000
                             ;   {poll_return}
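In the two listings above, `set()` pays for a `lock addl` after the store while `lazySet()` is a plain `mov`. One place where the cheaper store is commonly used is a counter with exactly one writer thread. The sketch below assumes that single-writer discipline; `SingleWriterCounterSketch` is a hypothetical name, not a class from the talk.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-writer counter (an assumption for illustration).
final class SingleWriterCounterSketch {
    private final AtomicLong value = new AtomicLong(0);

    // Must only be called from ONE thread: there is no CAS protecting
    // the read-modify-write, only an ordered store.
    void increment() {
        value.lazySet(value.get() + 1);
    }

    // Safe to call from any thread; may briefly lag behind the writer.
    long get() {
        return value.get();
    }

    // Runs one writer thread and returns the value seen after it finishes.
    static long run(int increments) {
        SingleWriterCounterSketch counter = new SingleWriterCounterSketch();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < increments; i++) {
                counter.increment();
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(1000));
    }
}
```

With more than one writer this would lose updates, which is where `compareAndSet` comes in.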
public class AtomicInteger extends Number
                           implements Serializable {

    // setup to use Unsafe.compareAndSwapInt for updates
    private static final Unsafe unsafe = Unsafe.getUnsafe();
    private static final long valueOffset;

    private volatile int value;

    //...

    public final boolean compareAndSet(int expect,
                                       int update) {
        return unsafe.compareAndSwapInt(this, valueOffset,
                                        expect, update);
    }
}
# {method} 'compareAndSet' '(JJ)Z' in 'java/util/concurrent/atomic/AtomicLong'
  # this:       rsi:rsi    = 'java/util/concurrent/atomic/AtomicLong'
  # parm0:      rdx:rdx    = long
  # parm1:      rcx:rcx    = long
  #             [sp+0x20] (sp of caller)
  mov     0x8(%rsi),%r10d
  shl     $0x3,%r10
  cmp     %r10,%rax
  jne     0x00007f6699037a60 ;      {runtime_call}
  xchg    %ax,%ax
  nopl    0x0(%rax,%rax,1)
  xchg    %ax,%ax
  sub     $0x18,%rsp
  mov     %rbp,0x10(%rsp)
  mov     %rdx,%rax
  lock cmpxchg %rcx,0x10(%rsi)
  sete    %r11b
  movzbl %r11b,%r11d ;*invokevirtual compareAndSwapLong
                        ; - j.u.c.a.AtomicLong::compareAndSet@9 (line 149)
  mov     %r11d,%eax
  add     $0x10,%rsp
  pop     %rbp
  test    %eax,0x91df935(%rip)          # 0x00007f66a223e000
                        ;   {poll_return}
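`compareAndSet` is normally used inside a retry loop: read the current value, compute the new one, and try again if another thread got there first. The sketch below shows that loop for a multi-writer counter. `CasLoopSketch` and its methods are hypothetical names chosen for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical multi-writer counter built on a CAS retry loop.
final class CasLoopSketch {
    private final AtomicInteger value = new AtomicInteger(0);

    // Retry until our compareAndSet wins; a failed CAS means another
    // thread changed the value between our get() and our CAS.
    int incrementAndGet() {
        int current;
        int next;
        do {
            current = value.get();
            next = current + 1;
        } while (!value.compareAndSet(current, next));
        return next;
    }

    int get() {
        return value.get();
    }

    // Runs several writer threads and returns the final count.
    static int run(int threads, int perThread) {
        CasLoopSketch counter = new CasLoopSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counter.incrementAndGet();
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10_000));
    }
}
```

No increment is lost because each successful CAS moves the value by exactly one from a value that was current at that instant.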
[Chart: set() vs. compareAndSet vs. lazySet(), cost in nanoseconds/op (y-axis 0 to 9)]
Example - Disruptor Multi-producer




private void publish(Disruptor disruptor, long value) {
    long next = disruptor.next();
    disruptor.setValue(next, value);
    disruptor.publish(next);
}
Example - Disruptor Multi-producer
public long next() {
    long next;
    long current;

    do {
        current = nextSequence.get();
        next = current + 1;
        while (next > (readSequence.get() + size)) {
            LockSupport.parkNanos(1L);
            continue;
        }
    } while (!nextSequence.compareAndSet(current, next));

    return next;
}
Algorithm: Spin - 1



public void publish(long sequence) {
    long sequenceMinusOne = sequence - 1;
    while (cursor.get() != sequenceMinusOne) {
        // Spin
    }

    cursor.lazySet(sequence);
}
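To see the claim step and the "Spin - 1" publish step working together, here is a simplified, self-contained sketch. It is not the Disruptor's real code: `SpinPublishSketch`, `setValue`, `publishedUpTo` and `valueAt` are hypothetical, the array is sized to hold every claimed sequence so no reader gating is needed, and the spin yields (the slide hard-spins) so the demo stays well behaved on machines with few cores.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified sequencer; not the Disruptor's real classes.
final class SpinPublishSketch {
    private final AtomicLong nextSequence = new AtomicLong(-1); // last claimed
    private final AtomicLong cursor = new AtomicLong(-1);       // last published
    private final long[] slots;

    SpinPublishSketch(int capacity) {
        slots = new long[capacity];
    }

    // Claim the next sequence with a CAS retry loop.
    // Assumption: capacity covers every claim, so there is no wrap check.
    long next() {
        long current;
        long next;
        do {
            current = nextSequence.get();
            next = current + 1;
        } while (!nextSequence.compareAndSet(current, next));
        return next;
    }

    void setValue(long sequence, long value) {
        slots[(int) sequence] = value;
    }

    // "Spin - 1": wait until the previous sequence is published, then publish.
    void publish(long sequence) {
        long sequenceMinusOne = sequence - 1;
        while (cursor.get() != sequenceMinusOne) {
            Thread.yield(); // assumption: yield instead of the slide's hard spin
        }
        cursor.lazySet(sequence);
    }

    long publishedUpTo() {
        return cursor.get();
    }

    long valueAt(int index) {
        return slots[index];
    }

    // Runs several producers and returns the sequencer for inspection.
    static SpinPublishSketch run(int producers, int perProducer) {
        SpinPublishSketch s = new SpinPublishSketch(producers * perProducer);
        Thread[] threads = new Thread[producers];
        for (int t = 0; t < producers; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    long seq = s.next();
                    s.setValue(seq, seq * 2);
                    s.publish(seq);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return s;
    }
}
```

Every producer has to wait for its predecessor before it can publish, which serialises the producers at the cursor.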
[Chart: Spin - 1 throughput in million ops/sec (y-axis 0 to 25) against 1 to 8 producer threads]
Algorithm: Co-Op
public void publish(long sequence) {
    int counter = RETRIES;
    while (sequence - cursor.get() > pendingPublication.length()) {
        if (--counter == 0) {
            Thread.yield();
            counter = RETRIES;
        }
    }

    long expectedSequence = sequence - 1;
    pendingPublication.set((int) sequence & pendingMask, sequence);

    if (cursor.get() >= sequence) { return; }

    long nextSequence = sequence;
    while (cursor.compareAndSet(expectedSequence, nextSequence)) {
        expectedSequence = nextSequence;
        nextSequence++;
        if (pendingPublication.get((int) nextSequence & pendingMask) != nextSequence) {
            break;
        }
    }
}
[Chart: Spin - 1 vs. Co-Op throughput in million ops/sec (y-axis 0 to 30) against 1 to 8 producer threads]
Algorithm: Buffer
public long next() {
    long next;
    long current;

    do {
        current = cursor.get();
        next = current + 1;
        while (next > (readSequence.get() + size)) {
            LockSupport.parkNanos(1L);
            continue;
        }
    } while (!cursor.compareAndSet(current, next));

    return next;
}
Algorithm: Buffer


public void publish(long sequence) {
    int publishedValue = (int) (sequence >>> indexShift);
    published.set(indexOf(sequence), publishedValue);
}



// Get Value
int availableValue = (int) (current >>> indexShift);
int index = indexOf(current);
while (published.get(index) != availableValue) {
     // Spin
}
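The slide does not show `indexOf` or `indexShift`. A plausible reading, assuming a power-of-two buffer size, is that the low bits of the sequence pick the slot and the remaining high bits act as a "lap" number stored in that slot. The sketch below works under that assumption; `AvailabilityBufferSketch`, `lapOf` and `isAvailable` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical availability buffer; assumes bufferSize is a power of two.
final class AvailabilityBufferSketch {
    private final int indexMask;
    private final int indexShift;
    private final AtomicIntegerArray published;

    AvailabilityBufferSketch(int bufferSize) {
        if (bufferSize < 1 || Integer.bitCount(bufferSize) != 1) {
            throw new IllegalArgumentException("bufferSize must be a power of two");
        }
        indexMask = bufferSize - 1;
        indexShift = Integer.numberOfTrailingZeros(bufferSize);
        published = new AtomicIntegerArray(bufferSize);
        for (int i = 0; i < bufferSize; i++) {
            published.set(i, -1); // -1 is never a valid lap, so nothing starts available
        }
    }

    // Low bits of the sequence select the slot.
    int indexOf(long sequence) {
        return (int) sequence & indexMask;
    }

    // High bits say how many times the buffer has wrapped.
    int lapOf(long sequence) {
        return (int) (sequence >>> indexShift);
    }

    // Producers publish independently: no waiting on other producers.
    void publish(long sequence) {
        published.set(indexOf(sequence), lapOf(sequence));
    }

    // A reader checks that the slot carries the lap it expects.
    boolean isAvailable(long sequence) {
        return published.get(indexOf(sequence)) == lapOf(sequence);
    }

    public static void main(String[] args) {
        AvailabilityBufferSketch buffer = new AvailabilityBufferSketch(8);
        buffer.publish(10);
        System.out.println(buffer.isAvailable(10));
    }
}
```

With a buffer of 8, sequence 10 lands in slot 2 on lap 1, and sequence 18 reuses slot 2 on lap 2, so the lap value is what tells a reader whether the slot holds the sequence it is waiting for.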
[Chart: Spin - 1 vs. Co-Op vs. Buffer throughput in million ops/sec (y-axis 0 to 70) against 1 to 8 threads]
Stuff that sucks...
Q&A
• https://github.com/mikeb01/jax2012
• http://www.lmax.com/careers
• http://www.infoq.com/presentations/Lock-free-Algorithms
• http://www.youtube.com/watch?v=DCdGlxBbKU4


Editor's Notes

  • #2:
      - Concurrency is taught all wrong.
      - What is non-blocking concurrency?
      - Mechanical Sympathy: locks/mutexes are a completely artificial construct.
      - MTs concurrency course: blocking vs. non-blocking.
      - The tools for non-blocking concurrency are functions of the CPU, so we need to look at CPU architecture first.
  • #3:
      - Causality
      - Why CPUs/compilers reorder
  • #4:
      - The Java Memory Model provides sequential consistency for data-race-free programs.
      - As-if-serial
      - Disallows out-of-thin-air values
      - First mainstream programming language to include a memory model (in C/C++ it was a combination of the CPU and whatever the compiler happens to do).
  • #8:
      - volatile
      - java.util.concurrent.atomic.*
          - Atomic<Long|Integer|Reference>
          - Atomic<Long|Integer|Reference>Array (why use over an array of atomics)
          - Atomic<Long|Integer|Reference>FieldUpdater (can be a bit slow)
  • #9:
      - Fight club
      - If you’re smart enough
  • #26:
      - Thread wake-ups
      - Hard spin
      - Spin with yield
      - PAUSE instruction - please add to Java
      - MONITOR and MWAIT
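The spin options listed in note #26 can be sketched side by side. This is an illustration only, assuming Java 9 or later: `Thread.onSpinWait()` was added in Java 9 as a spin-wait hint, which a JVM may map to an instruction such as PAUSE on x86. `WaitStrategySketch` and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical comparison of busy-wait styles; requires Java 9+ for onSpinWait.
final class WaitStrategySketch {

    // Hard spin: lowest latency, burns a core while waiting.
    static void hardSpin(AtomicBoolean flag) {
        while (!flag.get()) {
            // Spin
        }
    }

    // Spin with yield: offers the core to other runnable threads.
    static void spinWithYield(AtomicBoolean flag) {
        while (!flag.get()) {
            Thread.yield();
        }
    }

    // Spin with hint: tells the runtime this is a spin loop (Java 9+).
    static void spinWithHint(AtomicBoolean flag) {
        while (!flag.get()) {
            Thread.onSpinWait();
        }
    }

    // Sets the flag from another thread and waits with each strategy in turn.
    static boolean run() {
        AtomicBoolean flag = new AtomicBoolean(false);
        Thread setter = new Thread(() -> flag.set(true));
        setter.start();
        spinWithHint(flag);
        spinWithYield(flag);
        hardSpin(flag);
        try {
            setter.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return flag.get();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Which style performs best depends on core count and contention, so it is worth measuring rather than assuming.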