Multithreaded queue atomic operations - C++

I'm playing with the std::atomic structures and wrote this lock-free multi-producer multi-consumer queue, which I'm attaching here. The idea for the queue is based on two stacks, a producer stack and a consumer stack, which are essentially linked-list structures. The nodes of the lists hold indexes into an array that holds the actual data, where you would read or write.
The idea is that the nodes for the lists are mutually exclusive, i.e., a pointer to a node can exist only in the producer or the consumer list. A producer would attempt to acquire a node from the producer list, a consumer from the consumer list, and whenever a pointer to a node is acquired by either producer or consumer, it should be out of both lists so that no one else could acquire it. I'm using the std::atomic_compare_exchange functions to spin until a node is popped.
The problem is that there must be something wrong with the logic, or the operations are not atomic in the way I assume them to be, because even with 1 producer and 1 consumer, given enough time, the queue will livelock. What I have noticed is that if you assert that cell != cell->m_next, the assert gets hit! So it's probably something staring me in the face and I just don't see it, so I wonder if someone could pitch in.
Thanks.
#ifndef MTQueue_h
#define MTQueue_h

#include <atomic>

template<typename Data, uint64_t queueSize>
class MTQueue
{
public:
    MTQueue() : m_produceHead(0), m_consumeHead(0)
    {
        for (int i = 0; i < queueSize - 1; ++i)
        {
            m_nodes[i].m_idx = i;
            m_nodes[i].m_next = &m_nodes[i+1];
        }
        m_nodes[queueSize-1].m_idx = queueSize - 1;
        m_nodes[queueSize-1].m_next = NULL;
        m_produceHead = m_nodes;
        m_consumeHead = NULL;
    }

    struct CellNode
    {
        uint64_t m_idx;
        CellNode* m_next;
    };

    bool push(const Data& data)
    {
        if (m_produceHead == NULL)
            return false;

        // Pop the producer list.
        CellNode* cell = m_produceHead;
        while (!std::atomic_compare_exchange_strong(&m_produceHead,
                                                    &cell, cell->m_next))
        {
            cell = m_produceHead;
            if (!cell)
                return false;
        }

        // At this point cell should point to a node that is not in any of the lists
        m_data[cell->m_idx] = data;

        // Push that node as the new head of the consumer list
        cell->m_next = m_consumeHead;
        while (!std::atomic_compare_exchange_strong(&m_consumeHead,
                                                    &cell->m_next, cell))
        {
            cell->m_next = m_consumeHead;
        }
        return true;
    }

    bool pop(Data& data)
    {
        if (m_consumeHead == NULL)
            return false;

        // Pop the consumer list
        CellNode* cell = m_consumeHead;
        while (!std::atomic_compare_exchange_strong(&m_consumeHead,
                                                    &cell, cell->m_next))
        {
            cell = m_consumeHead;
            if (!cell)
                return false;
        }

        // At this point cell should point to a node that is not in any of the lists
        data = m_data[cell->m_idx];

        // Push that node as the new head of the producer list
        cell->m_next = m_produceHead;
        while (!std::atomic_compare_exchange_strong(&m_produceHead,
                                                    &cell->m_next, cell))
        {
            cell->m_next = m_produceHead;
        }
        return true;
    }

private:
    Data m_data[queueSize];
    // The nodes for the two lists
    CellNode m_nodes[queueSize];
    volatile std::atomic<CellNode*> m_produceHead;
    volatile std::atomic<CellNode*> m_consumeHead;
};

#endif

I see a few problems with your queue implementation:
It's not a queue, it's a stack: the most recent item pushed is the first item popped. Not that there's anything wrong with stacks, but it's confusing to call it a queue. In fact it is two lock-free stacks: one stack that is initially populated with the array of nodes, and another stack that stores actual data elements using the first stack as a list of free nodes.
There is a data race on CellNode::m_next in both push and pop (unsurprisingly, since they both do the same thing, i.e., pop a node from one stack and push that node onto the other). Say two threads simultaneously enter e.g. pop and both read the same value from m_consumeHead. Thread 1 races ahead successfully popping and sets data. Then Thread 1 writes the value of m_produceHead into cell->m_next while Thread 2 is simultaneously reading cell->m_next to pass to std::atomic_compare_exchange_strong_explicit. The simultaneous non-atomic read and write of cell->m_next by two threads is by definition a data race.
This is what is known as a "benign" race in the concurrency literature: a stale/invalid value is read, but never gets used. If you are confident that your code will never need to run on an architecture where it could cause fiery explosions you may ignore it, but for strict conformance with the Standard memory model you need to make m_next an atomic and use at least memory_order_relaxed reads to eliminate the data race.
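A minimal sketch of that change, assuming the rest of the structure stays as posted:

#include <atomic>
#include <cstdint>

// Sketch: with the link itself atomic, the concurrent read/write of m_next
// is no longer a data race; relaxed ordering on the link suffices because
// the acquire/release operations on the list heads order the surrounding code.
struct CellNode
{
    uint64_t m_idx;
    std::atomic<CellNode*> m_next;
};

// Inside push/pop the link would then be accessed like this:
//   cell->m_next.store(curHead, std::memory_order_relaxed);
//   CellNode* next = cell->m_next.load(std::memory_order_relaxed);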
ABA. The correctness of your compare-exchange loops is based on the premise that an atomic pointer (e.g., m_produceHead and m_consumeHead) having the same value at both the initial load and the later compare-exchange implies that the pointee object must therefore be unchanged as well. This premise does not hold in any design in which it is possible to recycle an object faster than some thread makes a trip through its compare-exchange loop. Consider this sequence of events:
1. Thread 1 enters pop and reads the value of m_consumeHead and m_consumeHead->m_next, but blocks before calling the compare-exchange.
2. Thread 2 successfully pops that node from m_consumeHead and blocks as well.
3. Thread 3 pushes several nodes onto m_consumeHead.
4. Thread 2 unblocks and pushes the original node onto m_produceHead.
5. Thread 3 pops that node from m_produceHead, and pushes it back onto m_consumeHead.
6. Thread 1 finally unblocks and calls the compare-exchange function, which succeeds since the value of m_consumeHead is the same. It pops the node - which is all well and good - but sets m_consumeHead to the stale m_next value it read back in step 1. All the nodes pushed by Thread 3 in the meantime are leaked.
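One common mitigation, sketched here as a hypothetical variant rather than the posted code: since the nodes live in a fixed array, the head can be stored as an {index, generation} pair in a single 64-bit atomic, with the generation bumped on every successful pop, so a recycled head no longer compares equal:

#include <atomic>
#include <cstdint>

// Hypothetical illustration: a head that carries a generation counter
// alongside the node index, defeating ABA on the head pointer.
struct TaggedHead
{
    uint32_t m_idx;   // index into the node array; UINT32_MAX means "empty"
    uint32_t m_tag;   // bumped on every pop, so recycled heads differ
};

std::atomic<TaggedHead> g_head{ TaggedHead{ UINT32_MAX, 0 } }; // needs lock-free 64-bit CAS

// Pop-loop sketch; next_of(idx) is a hypothetical helper returning the
// successor index of a node:
//
//   TaggedHead cur = g_head.load(std::memory_order_acquire);
//   TaggedHead next;
//   do {
//       if (cur.m_idx == UINT32_MAX) return false;
//       next = TaggedHead{ next_of(cur.m_idx), cur.m_tag + 1 };
//   } while (!g_head.compare_exchange_weak(cur, next,
//                                          std::memory_order_acquire));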

I believe I was able to crack this one. No livelock at 1000000 writes/reads for queue sizes from 2 to 1024 and from 1 producer and 1 consumer up to 100 producers / 100 consumers.
Here's the solution. The trick is not to use cell->m_next directly in the compare-and-swap (the same applies to the producer code, by the way) and to require strict memory ordering rules. This seems to confirm my suspicion that it was compiler reordering of the reads/writes.
Here's the code:
bool push(const Data& data)
{
    CellNode* cell = m_produceHead.load(std::memory_order_acquire);
    if (cell == NULL)
        return false;

    // Pop the producer list; on failure the compare-exchange refreshes
    // 'cell' with the current head.
    while (!std::atomic_compare_exchange_strong_explicit(&m_produceHead,
                                                         &cell,
                                                         cell->m_next,
                                                         std::memory_order_acquire,
                                                         std::memory_order_acquire))
    {
        if (!cell)
            return false;
    }

    m_data[cell->m_idx] = data;

    // Push the node onto the consumer list; the release on success
    // publishes the writes to m_data and m_next.
    CellNode* curHead = m_consumeHead.load(std::memory_order_relaxed);
    cell->m_next = curHead;
    while (!std::atomic_compare_exchange_strong_explicit(&m_consumeHead,
                                                         &curHead,
                                                         cell,
                                                         std::memory_order_release,
                                                         std::memory_order_relaxed))
    {
        cell->m_next = curHead;
    }
    return true;
}

bool pop(Data& data)
{
    CellNode* cell = m_consumeHead.load(std::memory_order_acquire);
    if (cell == NULL)
        return false;

    // Pop the consumer list; the acquire pairs with the release in push,
    // making the producer's write to m_data visible here.
    while (!std::atomic_compare_exchange_strong_explicit(&m_consumeHead,
                                                         &cell,
                                                         cell->m_next,
                                                         std::memory_order_acquire,
                                                         std::memory_order_acquire))
    {
        if (!cell)
            return false;
    }

    data = m_data[cell->m_idx];

    // Return the node to the producer list.
    CellNode* curHead = m_produceHead.load(std::memory_order_relaxed);
    cell->m_next = curHead;
    while (!std::atomic_compare_exchange_strong_explicit(&m_produceHead,
                                                         &curHead,
                                                         cell,
                                                         std::memory_order_release,
                                                         std::memory_order_relaxed))
    {
        cell->m_next = curHead;
    }
    return true;
}
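For illustration, a minimal stress harness along the lines of the test described above (thread counts and queue size chosen arbitrarily; it assumes the push/pop above are members of the MTQueue class from the question):

#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>
// #include "MTQueue.h"   // the header from the question, with the push/pop above

int main()
{
    MTQueue<uint64_t, 1024> queue;
    constexpr int kThreads = 4;                // producers and consumers each
    constexpr uint64_t kItemsPerThread = 1000000;
    std::atomic<uint64_t> popped{0};

    std::vector<std::thread> threads;
    for (int t = 0; t < kThreads; ++t)
    {
        threads.emplace_back([&] {             // producer
            for (uint64_t i = 0; i < kItemsPerThread; ++i)
                while (!queue.push(i)) { }     // spin while the free list is empty
        });
        threads.emplace_back([&] {             // consumer
            uint64_t value;
            for (uint64_t i = 0; i < kItemsPerThread; ++i)
                while (!queue.pop(value)) { }  // spin while the queue is empty
            popped += kItemsPerThread;
        });
    }
    for (auto& th : threads)
        th.join();
    return popped == uint64_t(kThreads) * kItemsPerThread ? 0 : 1;
}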

Related

Deadlock occurring after multiple iterations of queries in threads (multithreading)

I encounter deadlocks while executing the code snippet below as a thread.
void thread_lifecycle(
    Queue<std::tuple<int64_t, int64_t, uint8_t>, QUEUE_SIZE>& query,
    Queue<std::string, QUEUE_SIZE>& output_queue,
    std::vector<Object>& pgs,
    bool* pgs_executed, // Initialized to array of false-values
    std::mutex& pgs_executed_mutex,
    std::atomic<uint32_t>& atomic_pgs_finished
){
    bool iter_bool = false;
    std::tuple<int64_t, int64_t, uint8_t> next_query;
    std::string output = "";
    int64_t lower, upper;

    while(true) {
        // Get next query
        next_query = query.pop_front();
        // Stop condition reached: terminate thread
        if (std::get<2>(next_query) == uint8_t(-1)) break;
        // Set query params
        lower = std::get<0>(next_query);
        upper = std::get<1>(next_query);
        // Scan bool array
        for (uint32_t i = 0; i < pgs.size(); i++){
            // first lock for reading
            pgs_executed_mutex.lock();
            if (pgs_executed[i] == iter_bool) {
                pgs_executed[i] = !pgs_executed[i];
                // Unlock and execute the query
                pgs_executed_mutex.unlock();
                output = pgs.at(i).get_result(lower, upper);
                // If query yielded a result, then add it to the output
                if (output.length() != 0) {
                    output_queue.push_back(output);
                }
                // Inform main thread in case of last result
                if (++atomic_pgs_finished >= pgs.size()) {
                    output_queue.push_back("LAST_RESULT_IDENTIFIER");
                    atomic_pgs_finished.exchange(0);
                }
            } else {
                pgs_executed_mutex.unlock();
                continue;
            }
        }
        // finally flip for next query
        iter_bool = !iter_bool;
    }
}
Explained:
I have a vector of objects containing information which can be queried (similar to a table in a database). Each thread can access the objects, and all of them iterate the vector ONCE to query the objects which have not been queried yet and return results, if any.
On the next query, each thread goes through the vector again, and so on. I use the bool* array to mark the entries which are currently being queried, so that the threads can synchronize and determine which query should be executed next.
If all have been executed, the last thread, possibly holding the last result, also pushes an identifier for the main thread in order to inform it that all objects have been queried.
My Question:
Regarding the bool* as well as atomic_pgs_finished: can there be a scenario in which a deadlock occurs? As far as I can tell there is no deadlock in this snippet; however, executing it and letting it run for a while results in a deadlock.
I am seriously considering that a bit (byte?) has randomly flipped, causing this deadlock (on ECC-RAM), so that 1 or more objects actually were not executed. Is this even possible?
Maybe another implementation could help?
Edit: the implementation of the Queue:
template<class T, size_t MaxQueueSize>
class Queue
{
    std::condition_variable consumer_, producer_;
    std::mutex mutex_;
    using unique_lock = std::unique_lock<std::mutex>;
    std::queue<T> queue_;

public:
    template<class U>
    void push_back(U&& item) {
        unique_lock lock(mutex_);
        while(MaxQueueSize == queue_.size())
            producer_.wait(lock);
        queue_.push(std::forward<U>(item));
        consumer_.notify_one();
    }

    T pop_front() {
        unique_lock lock(mutex_);
        while(queue_.empty())
            consumer_.wait(lock);
        auto full = MaxQueueSize == queue_.size();
        auto item = queue_.front();
        queue_.pop();
        if(full)
            producer_.notify_all();
        return item;
    }
};
Thanks to @Ulrich Eckhardt, @PaulMcKenzie and all the other commenters for the brainstorming! I probably have found the cause of the deadlock. I tried to reduce this example even more and thought about removing atomic_pgs_finished, a variable indicating whether all pgs have been queried. Interestingly, ++atomic_pgs_finished >= pgs.size() returns true not just once but multiple times, so that multiple threads are inside this specific if-clause.
I simply fixed it by using another mutex around this if-clause. Maybe someone can explain why ++atomic_pgs_finished >= pgs.size() is not atomic as a whole and evaluates to true for multiple threads.
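A note on why: the increment itself is atomic, but the combination increment-then-compare-then-reset is not. Every thread receives a distinct counter value from the increment, and with >= every value at or above pgs.size() passes the test until some thread finally executes exchange(0), so several threads can enter the if-clause in one round. A sketch of a single-winner variant of that if-clause (it still assumes, as the original code does, that no thread of the next query increments the counter before the reset; the mutex fix is the more conservative option):

// Sketch: fetch_add returns the previous value, and each thread sees a
// distinct one, so testing for equality selects exactly one winner per round.
if (atomic_pgs_finished.fetch_add(1) + 1 == pgs.size())
{
    // Only the thread that performed the pgs.size()-th increment gets here.
    output_queue.push_back("LAST_RESULT_IDENTIFIER");
    atomic_pgs_finished.store(0);
}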
Below I have updated the code (mostly the same as in the question) with comments, so that it might be more understandable.
void thread_lifecycle(
    Queue<std::tuple<int64_t, int64_t, uint8_t>, QUEUE_SIZE>& query, // The input queue containing queries, in my case triples
    Queue<std::string, QUEUE_SIZE>& output_queue, // The output queue of results
    std::vector<Object>& pgs, // Objects which should be queried
    bool* pgs_executed, // Initialized to an array of false-values
    std::mutex& pgs_executed_mutex, // a mutex protecting pgs_executed
    std::atomic<uint32_t>& atomic_pgs_finished // atomic counter of how many have been executed (to send an end signal)
){
    // Initialize variables
    std::tuple<int64_t, int64_t, uint8_t> next_query;
    std::string output = "";
    int64_t lower, upper;
    // Set the first iteration to false for the very first query.
    // This flips on the second iteration to reuse pgs_executed with true values, and so on...
    bool iter_bool = false;

    // Execute as long as valid queries are received
    while(true) {
        // Get next query
        next_query = query.pop_front();
        // Stop condition reached: terminate thread
        if (std::get<2>(next_query) == uint8_t(-1)) break;
        // "Parse query" to query the objects in pgs
        lower = std::get<0>(next_query);
        upper = std::get<1>(next_query);
        // Now iterate through pgs and pgs_executed (once)
        for (uint32_t i = 0; i < pgs.size(); i++){
            // Lock to read and write pgs_executed
            pgs_executed_mutex.lock();
            if (pgs_executed[i] == iter_bool) {
                pgs_executed[i] = !pgs_executed[i];
                // Unlock since we now execute the query on the object (which was not queried before)
                pgs_executed_mutex.unlock();
                // Query execution
                output = pgs.at(i).get_result(lower, upper);
                // If the query yielded a result, then add it to the output for the main thread to read
                if (output.length() != 0) {
                    output_queue.push_back(output);
                }
                // HERE THE ROOT CAUSE OF THE DEADLOCK HAPPENS
                // Here I would like to inform the main thread that we executed the query on
                // every object in pgs, so that it should no longer wait for other results
                if (++atomic_pgs_finished >= pgs.size()) {
                    // Multiple threads are present in this if-clause at once!
                    // This is not intended and causes a deadlock by push_back-ing
                    // "LAST_RESULT_IDENTIFIER" multiple times, from which the main thread
                    // assumed that a query had finished. The main thread then simply added
                    // the next query while the previous one was not finished, causing
                    // threads to race each other on two queries simultaneously
                    // and not agree on the same iter_bool!
                    output_queue.push_back("LAST_RESULT_IDENTIFIER");
                    atomic_pgs_finished.exchange(0);
                }
                // END: HERE THE ROOT CAUSE OF THE DEADLOCK HAPPENS
            } else {
                // This case happens when the next element in the list was already executed (by another thread);
                // simply unlock pgs_executed and continue with the next element in pgs
                pgs_executed_mutex.unlock();
                continue; // This is unnecessary and could be removed
            }
        }
        // Finally, flip for the next query in order to reuse bool* (which now holds trues if a second query is incoming)
        iter_bool = !iter_bool;
    }
}

A lock-free queue's push operation using compare_exchange_weak

I've come across a lock-free queue that supports multiple producers and a single consumer; the push operation of the queue goes like this:
inline void push(LT* item)
{
    LT *old = pushlist.load(std::memory_order_relaxed);
    do {
        item->next = old;
    } while (!pushlist.compare_exchange_weak(old, item,
                                             std::memory_order_release));
}
pushlist is an atomic pointer variable, and I am thinking about a scenario where two threads are in this code:
Thread 1:
T1# Loads the pointer (won't LOCK# anything)
T2# Enters the loop and hooks the head (old) into its new item
T3# Now in the cmpxchg (LOCK#); it expects the head to still be old, which is true, so the swap will be successful
Thread 2:
T2# Loads the pointer (won't LOCK# anything)
T3# Enters the loop and hooks the head (old) into its new item
T4# Now the cmpxchg expects the head to still be old, which it is not, I guess, because Thread 1 already changed it; old isn't the head anymore, so the operation will be unsuccessful... now, won't it just spin?
I tested it, and it works fine, yet I am not clear on how it does.
auto work = [&p](auto name){
    int i{0};
    linked_item<int> dummy{};
    for(; i < 100 ; ++i){
        p.push(&dummy);
    }
    std::cout << name << " finished" << i << "\n";
};
std::thread t1{work, "t1"}, t2{work, "t2"};
t1.join(); t2.join();
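The detail that makes the retry loop work: on failure, compare_exchange_weak writes the value it actually observed in pushlist back into old, so the next iteration of the do-loop re-links item->next against the fresh head and tries again; a thread only spins while other threads are making progress. A minimal standalone illustration of that contract:

#include <atomic>
#include <cassert>

int main()
{
    std::atomic<int> head{1};
    int expected = 0;   // a deliberately stale guess
    bool ok = head.compare_exchange_weak(expected, 2);
    // The exchange failed, but 'expected' now holds the value that was
    // actually in 'head', ready for the next attempt of a retry loop.
    assert(!ok && expected == 1);
    return 0;
}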

What are the correct memory orders to use when inserting a node at the beginning of a lock free singly linked list?

I have a simple linked list. There is no danger of the ABA problem, I'm happy with the Blocking category, and I don't care whether my list is FIFO, LIFO or randomized, as long as inserting succeeds without making others fail.
The code for that looks something like this:
class Class {
    std::atomic<Node*> m_list;
    ...
};

void Class::add(Node* node)
{
    node->next = m_list.load(std::memory_order_acquire);
    while (!m_list.compare_exchange_weak(node->next, node,
                                         std::memory_order_acq_rel,
                                         std::memory_order_acquire))
        ;
}
where I more or less randomly filled in the memory orders used.
What are the right memory orders to use here?
I've seen people use std::memory_order_relaxed in all places; one guy on SO used that too, but then std::memory_order_release for the success case of compare_exchange_weak. And the genmc project uses memory_order_acquire / twice memory_order_acq_rel in a comparable situation, but I can't get genmc to work for a test case :(.
Using genmc, the excellent tool from Michalis Kokologiannakis, I was able to verify the required memory orders with the following test code. Unfortunately, genmc currently requires C code, but that doesn't matter for figuring out what the memory orders need to be, of course.
// Install https://github.com/MPI-SWS/genmc
//
// Then test with:
//
//   genmc -unroll 5 -- genmc_sll_test.c

// These header files are replaced by genmc (see /usr/local/include/genmc):
#include <pthread.h>
#include <stdlib.h>
#include <stddef.h>
#include <assert.h>
#include <stdatomic.h>
#include <stdio.h>

#define PRODUCER_THREADS 3
#define CONSUMER_THREADS 2

struct Node
{
    struct Node* next;
};

struct Node* const deleted = (struct Node*)0xd31373d;

_Atomic(struct Node*) list;

void* producer_thread(void* node_)
{
    struct Node* node = (struct Node*)node_;
    // Insert node at beginning of the list.
    node->next = atomic_load_explicit(&list, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(&list, &node->next,
                node, memory_order_release, memory_order_relaxed))
        ;
    return NULL;
}

void* consumer_thread(void* param)
{
    // Replace the whole list with an empty list.
    struct Node* head = atomic_exchange_explicit(&list, NULL, memory_order_acquire);
    // Delete each node that was in the list.
    while (head)
    {
        struct Node* orphan = head;
        head = orphan->next;
        // Mark the node as deleted.
        assert(orphan->next != deleted);
        orphan->next = deleted;
    }
    return NULL;
}

pthread_t t[PRODUCER_THREADS + CONSUMER_THREADS];
struct Node n[PRODUCER_THREADS]; // Initially filled with zeroes -->
                                 // none of the Nodes is marked as deleted.

int main()
{
    // Start PRODUCER_THREADS threads that each prepend one node to the list.
    for (int i = 0; i < PRODUCER_THREADS; ++i)
        if (pthread_create(&t[i], NULL, producer_thread, &n[i]))
            abort();
    // Start CONSUMER_THREADS threads that each delete all nodes that were added so far.
    for (int i = 0; i < CONSUMER_THREADS; ++i)
        if (pthread_create(&t[PRODUCER_THREADS + i], NULL, consumer_thread, NULL))
            abort();
    // Wait till all threads finished.
    for (int i = 0; i < PRODUCER_THREADS + CONSUMER_THREADS; ++i)
        if (pthread_join(t[i], NULL))
            abort();

    // Count the number of elements still in the list.
    struct Node* l = list;
    int count = 0;
    while (l)
    {
        ++count;
        l = l->next;
    }
    // Count the number of deleted elements.
    int del_count = 0;
    for (int i = 0; i < PRODUCER_THREADS; ++i)
        if (n[i].next == deleted)
            ++del_count;

    assert(count + del_count == PRODUCER_THREADS);
    //printf("count = %d; deleted = %d\n", count, del_count);
    return 0;
}
The output of which is
$ genmc -unroll 5 -- genmc_sll_test.c
Number of complete executions explored: 6384
Total wall-clock time: 1.26s
Replacing either the memory_order_release or the memory_order_acquire with memory_order_relaxed causes an assertion failure.
In fact, it can be checked that using exclusively memory_order_relaxed when just inserting nodes is sufficient to get them all cleanly into the list (although in a 'random' order: nothing here is sequentially consistent, so the order in which they are added is not necessarily the order in which the threads tried to add them, if such a correlation exists for other reasons).
However, the memory_order_release is required so that when head is read with memory_order_acquire we can be certain that all non-atomic next pointers are visible in the "consumer" thread.
Note there is no ABA problem here because the values used for head and next cannot be "reused" before they are deleted by the consumer_thread function, which is therefore the only place where these nodes are allowed to be deleted, implying that there can be only one consumer thread (this test code does NOT check for the ABA problem, so it also works using 2 CONSUMER_THREADS).
The actual code is a garbage collection mechanism where multiple "producer" threads add pointers to a singly linked list when those can be deleted, but where it is only safe to actually do so in one specific thread (in that case there is thus only one "consumer" thread, which performs this garbage collection at a well-known place in a main loop).

Empty element in array-based bounded buffer

I have a classic producer/consumer problem. The code for the producer is this:
#define BUFFER_SIZE 10

while (true) {
    /* Produce an item */
    while (((in + 1) % BUFFER_SIZE) == out)
        ; /* do nothing -- no free buffers */
    buffer[in] = item;
    in = (in + 1) % BUFFER_SIZE;
}
And the consumer is:
while (true) {
    while (in == out)
        ; // do nothing -- nothing to consume
    // remove an item from the buffer
    item = buffer[out];
    out = (out + 1) % BUFFER_SIZE;
    return item;
}
This works fine, but the problem is that when the first nine elements are filled up, with in=9 and out=0, the producer sits there and does not fill the last (tenth) slot. The same happens when, say, in=4 and out=5. In every case, one slot is left empty and the queue appears to be "full" even though one slot is still free.
I can come up with a few complicated checks, but I need to know if there is a clean solution to filling the whole queue. I have tried incrementing in first and then putting the item in, but that runs into similar problems. (Initializing both in and out with -1 doesn't work either.)
See Wikipedia on this very topic. The simplest solutions appear to be either of:
Always keep one slot empty: if in == out then the buffer is empty; if in == out - 1, buffer is full
Replace out with num_unread_items; perform simple maths to retrieve out from num_unread_items and in (a sketch of this variant follows below)
... but there are other options related to counting the number of read & write operations (either in separate variables or directly in in and out); or tracking whether the last operation was a read or a write in addition to the current in and out, which allows you to disambiguate the buffer-full/buffer-empty cases.
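A minimal sketch of the second option, in the same busy-wait style as the question (count would need atomic or otherwise synchronized access in real code; plain ints are used here only to show the index arithmetic):

#define BUFFER_SIZE 10

int in = 0;     /* next slot the producer writes */
int count = 0;  /* number of unread items; needs synchronized access in real code */

/* Producer: with a count, all BUFFER_SIZE slots are usable. */
while (count == BUFFER_SIZE)
    ;   /* do nothing -- buffer full */
buffer[in] = item;
in = (in + 1) % BUFFER_SIZE;
++count;

/* Consumer: recover 'out' from 'in' and 'count'. */
while (count == 0)
    ;   /* do nothing -- nothing to consume */
item = buffer[(in - count + BUFFER_SIZE) % BUFFER_SIZE];
--count;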
In an OS development class in college, I had an adjunct teacher who claimed it was impossible to have a software-only solution that could use all N elements in the buffer.
I proved him wrong with something I decided to call the race track solution (inspired by the fact that I like to run track).
On a race track, you are not limited to a 400 meter race; a race can consist of more than one lap. What happens if two runners are neck and neck in a race? How do you know whether they are tied, or whether one runner has lapped the other? The answer is simple: in a race, we don't monitor a runner's position on the track; we monitor the distance each runner has traversed. Thus, when two runners are neck and neck, we can disambiguate between a tie and one runner having lapped the other.
So, our algorithm has an N-element array and manages a 2N race. We don't restart the producer's/consumer's counters back to zero until they each finish their respective 2N race.
We don't allow the producer to be more than one lap ahead of the consumer, and we don't allow the consumer to be ahead of the producer.
Actually, we only have to monitor the distance between the producer and the consumer.
The code is as follows:
const int LAP = 400;   // buffer capacity: one "lap" of the track

Item track[LAP];
int consIdx = 0;
int prodIdx = 0;

void consumer()
{   while(true)
    {   // Distance the producer is ahead in the 2-lap race (handles wraparound).
        int diff = (prodIdx - consIdx + 2*LAP) % (2*LAP);
        if(0 < diff)                           // If the consumer isn't tied
        {   Item item = track[consIdx%LAP];    // Consume the slot.
            consIdx = (consIdx + 1) % (2*LAP); // Advance in the 2-lap race.
        }
    }
}

void producer()
{   while(true)
    {   int diff = (prodIdx - consIdx + 2*LAP) % (2*LAP);
        if(diff < LAP)                         // If prod hasn't lapped cons
        {   track[prodIdx%LAP] = Item();       // Advance on the 1-lap track.
            prodIdx = (prodIdx + 1) % (2*LAP); // Advance in the 2-lap race.
        }
    }
}
It's been a while since I originally solved the problem, so this is according to my best recollection. Hopefully I didn't overlook any bugs.
Hope this helps!

How to lock Queue variable address instead of using Critical Section?

I have 2 threads and a global Queue; one thread (t1) pushes data and another (t2) pops data. I wanted to synchronize these operations without wrapping every use of the queue in a function that takes a critical section via the Windows API.
The Queue is global, and I wanted to know how to sync; is it done by locking the address of the Queue?
Is it possible to use the Boost library for the above problem?
One approach is to have two queues instead of one:
The producer thread pushes items to queue A.
When the consumer thread wants to pop items, queue A is swapped with empty queue B.
The producer thread continues pushing items to the fresh queue A.
The consumer, uninterrupted, consumes items off queue B and empties it.
Queue A is swapped with queue B etc.
The only locking/blocking/synchronization happens when the queues are being swapped, which should be a fast operation since it's really a matter of swapping two pointers.
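A minimal sketch of that idea, with hypothetical names and std::mutex standing in for the Windows critical section:

#include <mutex>
#include <queue>

template<class T>
class SwapQueue
{
    std::queue<T> m_in;    // producer pushes here ("queue A")
    std::mutex m_mutex;    // guards m_in only
public:
    void push(const T& item)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_in.push(item);
    }
    // The consumer swaps the shared queue with its own empty one and then
    // drains the returned queue without holding any lock.
    std::queue<T> take_all()
    {
        std::queue<T> out;                  // "queue B", empty
        std::lock_guard<std::mutex> lock(m_mutex);
        m_in.swap(out);                     // the only synchronized step
        return out;
    }
};

The consumer calls take_all() once per batch and consumes the returned queue entirely outside the lock, so the producer is only ever blocked for the duration of the swap.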
I thought you could make a queue meeting those conditions without using any atomics or any thread-safe machinery at all?
Like, if it's just a circular buffer: one thread controls the read pointer, the other controls the write pointer, and neither updates its pointer until it has finished reading or writing. It just works?
The only point of difficulty comes with determining, when read == write, whether the queue is full or empty, but you can overcome this by just having one dummy item always in the queue:
class Queue
{
    volatile Object* buffer;
    int size;
    volatile int readpoint;
    volatile int writepoint;

public:
    void Init(int s)
    {
        size = s;
        buffer = new Object[s];
        readpoint = 0;
        writepoint = 1;
    }

    // thread A will call this
    bool Push(Object p)
    {
        if(writepoint == readpoint)
            return false;              // full
        int wp = writepoint - 1;
        if(wp < 0)
            wp += size;
        buffer[wp] = p;
        int newWritepoint = writepoint + 1;
        if(newWritepoint == size)
            newWritepoint = 0;
        writepoint = newWritepoint;
        return true;
    }

    // thread B will call this
    bool Pop(Object* p)
    {
        int writepointTest = writepoint;
        if(writepointTest < readpoint)
            writepointTest += size;
        if(readpoint + 1 == writepointTest)
            return false;              // empty
        *p = buffer[readpoint];
        int newReadpoint = readpoint + 1;
        if(newReadpoint == size)
            newReadpoint = 0;
        readpoint = newReadpoint;
        return true;
    }
};
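One caveat: in standard C++, volatile does not order memory accesses between threads, so on a weakly ordered machine the reader could observe the advanced writepoint before the written Object. A sketch of the same single-reader/single-writer ring expressed with std::atomic indices and acquire/release ordering:

#include <atomic>
#include <cstddef>

// Sketch of an SPSC ring buffer: the release store of each index publishes
// the preceding buffer write, and the matching acquire load makes it visible.
template<class Object, size_t Size>
class SpscQueue
{
    Object m_buffer[Size];
    std::atomic<size_t> m_read{0};
    std::atomic<size_t> m_write{0};
public:
    bool push(const Object& p)               // called by thread A only
    {
        size_t w = m_write.load(std::memory_order_relaxed);
        size_t next = (w + 1) % Size;
        if (next == m_read.load(std::memory_order_acquire))
            return false;                    // full (one slot kept empty)
        m_buffer[w] = p;
        m_write.store(next, std::memory_order_release);
        return true;
    }
    bool pop(Object& p)                      // called by thread B only
    {
        size_t r = m_read.load(std::memory_order_relaxed);
        if (r == m_write.load(std::memory_order_acquire))
            return false;                    // empty
        p = m_buffer[r];
        m_read.store((r + 1) % Size, std::memory_order_release);
        return true;
    }
};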
Another way to handle this issue is to allocate your queue dynamically and assign it to a pointer. The pointer value is passed off between threads when items have to be dequeued, and you protect this operation with a critical section. This means locking for every push into the queue, but much less contention on the removal of items.
This works well when you have many items between enqueueing and dequeueing, and works less well with few items.
Example (I'm using a given RAII locking class to do the locking). Also note: this is really only safe when only one thread is dequeueing.
queue* my_queue = 0;
queue* pDequeue = 0;
critical_section section;

void enqueue(stuff& item)
{
    locker lock(section);
    if (!my_queue)
    {
        my_queue = new queue;
    }
    my_queue->add(item);
}

item* dequeue()
{
    if (!pDequeue)
    {   // handoff for dequeue work
        locker lock(section);
        pDequeue = my_queue;
        my_queue = 0;
    }
    if (pDequeue)
    {
        item* pItem = pDequeue->pop(); // remove item and return it
        if (!pItem)
        {
            delete pDequeue;
            pDequeue = 0;
        }
        return pItem;
    }
    return 0;
}