C++ Lock Free Multi Producer And Consumer Circular Queue

Bryan Zimmerman

Expert modern C++(11-26+) engineer who excels at low latency, concurrent, distributed, high performance computing(HPC), cross-platform software. A hardware, FPGA, and RF Engineer. BAZ Innovations, LLC Owner

Published Mar 29, 2025

I. Overview

In this article I want to discuss a multi producer, multi consumer lock free queue. This is differentiated from a single producer, single consumer lock free queue in that multiple writer and reader threads can push and pop from the queue respectively. See Wikipedia Circular Buffer and previous Linked In article C++ Lock Free Circular Queue.

II. Single Producer Push

Let's take a look at the single producer push routine from the previous article where I have taken the liberty of upgrading the memory order from relaxed to acquire where necessary since we are looking to use this for multiple producers.

Why does this not work for a multiple producers? It is pretty easy to see how this would not work even in the case of two producers. If we have two threads trying to push an element at the same time then they both will get the same write index value. Then they will both see a queue not full, and finally they will both construct different elements into the same memory location. Other bad things may happen, but at a minimum we will have data corruption and missing elements which will only compound with the more producers that we have.

III. Multi Producer Approach

So how can we change this to allow multiple producers?

A. Increment Write Index on Entry

If we changed the first operation to be a fetch add as follows:

This would definitely solve the issue of the write index value being the same for multiple producers. However, it introduces a couple other problems. First, if we have two producers and the queue is full, then the first one to enter will see the full queue and return. However, the second one to enter will have a write index which now exceeds the read index, thus allowing a write on a full queue. Second, as soon as we increment the write index, it indicates an element has been added to the queue so a read could take place on an element which has not been properly constructed yet. There are other issues, but this is a no go.

B. Pending Write Atomic Counter

What if we used a pending write atomic counter with something like the following:

This would solve the issue of overrunning the read index with the write index. However we need a way to mark a write index as pending to construct an element to. There is no relation ship between the pending counter and the current write index which would give a unique pending write index. Also, incrementing the write index to mark as pending, results in the same situation in A above where we try to read a not properly constructed element.

There is an even more subtle issue here. Even if we were able to get a valid pending write index; there is no guarantee for the order that the elements are constructed. If we have two producers trying to write to element 0 and element 1 respectively, then element 1 may complete first. But we cannot update the write index to this element since element 0 is not constructed yet. So this idea keeps increasing in complexity with no obvious path forward.

C. Two Stage Commit

What we need is a way to allow the write index to advance to reserve space, but also be able to indicate that an element has been properly constructed. A solution to this which is also used in databases in called a two stage commit. Instead of storing just our data in the queue we now store a structure as follows:

Similar to before, we have a block of memory which we can construct an element in, and we have added an atomic bool to indicate whether this element has been properly constructed or not. So now we can advance the write index to reserve a memory location in order to put an element in and mark when it is finish constructed and valid to be read.

There are other ways to indicate a two stage commit using std::atomic_flag or reserving a bit on the address through alignment.

IV. Multi Producer Push

With the approach outline above we can now create a lock free, multi producer push routine as follows:

We read the current write and read indices and check for a full queue. While this does give the same write index to multiple producers to begin with, we resolve that problem later on. If the queue is not full we try to advance the write index to reserve a spot to put the data into. If we do not succeed then someone else has advanced already, and we will get the new write index and loop again to check for a full queue.

If we were able to reserve a spot then we need to wait for the element to be in a deconstructed state since it is possible that this element is currently being read and we would not want to write into it and corrupt the value. Once not in use, we construct the element in the reserved spot and then mark it as valid. This sequence completes our two stage commit. We also do not consume a possibly moved in value if we are not able to place it in the queue.

V. Multi Consumer Pop

The changes to the pop routine are similar to those for the push routine and we end up with a routine as follows:

We read the current read and write indices and check for an empty queue. Similar to the write it is possible that multiple consumers get the same read index, but that is properly resolved later on. We try to grab an element to read by advance the read index. If we fail, then someone else has already grabbed it and we get the new read index. Then we loop again checking for an empty queue.

If we were able to grab an element then we spin waiting for the element to be properly constructed since it is possible that we are writing to the same element we are trying to read. Once it is properly constructed we read out the data and call the destructor if necessary. Finally we mark the element as no longer used, and return the value.

An examination of this routine shows that they are almost mirror images of each other.

VI. Code

The code for the entire circular queue is given as follows:

VII. Performance

We will get a relative comparison by running both the single producer, single consumer version and the multi producer, multi consumer version through the same testing framework with verification. I will caveat this with a statement that this is not a production level performance test.

The result of the single producer, single consumer queue is as follows:

The results of the multi producer, multi consumer queue where there are always an equal number of reader and writer threads is as follows:

We see here that the performance is relatively the same as the SPSC queue and scales reasonably well to multiple threads.

VIII. Summary

In this article we saw how we can modify our single producer, single consumer lock free queue into a multi producer, multi consumer lock free queue using the concept of a two stage commit.

In a future article we can now use this queue in combination with the lock free stack to produce a non memory leaking lock free stack.

C++ Lock Free Multi Producer And Consumer Circular Queue

Bryan Zimmerman

Expert modern C++(11-26+) engineer who excels at low latency, concurrent, distributed, high performance computing(HPC), cross-platform software. A hardware, FPGA, and RF Engineer. BAZ Innovations, LLC Owner

I. Overview

II. Single Producer Push

III. Multi Producer Approach

A. Increment Write Index on Entry

B. Pending Write Atomic Counter

C. Two Stage Commit

IV. Multi Producer Push

V. Multi Consumer Pop

VI. Code

VII. Performance

VIII. Summary

More articles by this author

Others also viewed

Your weekend reading from Web Directions

Failable initialisers in Swift, and their usage

Eager Loading of Child Entities in EF Core

Understanding a Helm Chart!

Enhancing ICM InfoWorks & SWMM 1D Output: Elegant Ruby Scripts for Statistical Table Generation

The Perfect Progress Bar

The Six Blind Men and the Requirements

Lets talk about the... C - word

String: April Fool -POTD

How Fast Does Your Code Run? A Beginner's Guide

Explore topics

I. Overview

II. Single Producer Push

III. Multi Producer Approach

A. Increment Write Index on Entry

B. Pending Write Atomic Counter

C. Two Stage Commit

IV. Multi Producer Push

V. Multi Consumer Pop

VI. Code

VII. Performance

VIII. Summary

C++ Lock Free Stack Part 2

Apr 5, 2025

C++ Lock Free Stack

Mar 22, 2025

Dynamic Prefix Sums

Mar 2, 2025

C++ Lock-Free Circular Queues

Feb 22, 2025

Intricacies of C++ Callbacks

Jul 16, 2024

The Importance of Variable Scope

Sep 2, 2021

C++ to Punn or not to Punn

Mar 22, 2020

Ruminations on Rule of 5

Feb 1, 2020

C++Object Suicide

Dec 24, 2019

The Abuse of C++ auto

Oct 16, 2017

Others also viewed

Your weekend reading from Web Directions

Failable initialisers in Swift, and their usage

Eager Loading of Child Entities in EF Core

Understanding a Helm Chart!

Enhancing ICM InfoWorks & SWMM 1D Output: Elegant Ruby Scripts for Statistical Table Generation

The Perfect Progress Bar

The Six Blind Men and the Requirements

Lets talk about the... C - word

String: April Fool -POTD

How Fast Does Your Code Run? A Beginner's Guide

Explore topics