My critical section code does not work! Backgrounder::Run is able to modify the MESSAGE_QUEUE g_msgQueue even though LockSection's destructor hasn't been called yet!
Extra code:
typedef std::vector<int> MESSAGE_LIST; // SHARED OBJECT .. MUST LOCK!
class MESSAGE_QUEUE : MESSAGE_LIST{
public:
MESSAGE_LIST * m_pList;
MESSAGE_QUEUE(MESSAGE_LIST* pList){ m_pList = pList; }
~MESSAGE_QUEUE(){ }
/* This class will be shared between threads that means any
* attempt to access it MUST be inside a critical section.
*/
void Add( int messageCode ){ if(m_pList) m_pList->push_back(messageCode); }
int getLast()
{
if(m_pList){
if(m_pList->size() == 1){
Add(0x0);
}
m_pList->pop_back();
return m_pList->back();
}
}
void removeLast()
{
if(m_pList){
m_pList->erase(m_pList->end()-1,m_pList->end());
}
}
};
class Backgrounder{
public:
MESSAGE_QUEUE* m_pMsgQueue;
static void __cdecl Run( void* args){
MESSAGE_QUEUE* s_pMsgQueue = (MESSAGE_QUEUE*)args;
if(s_pMsgQueue->getLast() == 0x45)printf("It's a success!");
else printf("It's a trap!");
}
Backgrounder(MESSAGE_QUEUE* pMsgQueue)
{
m_pMsgQueue = pMsgQueue;
_beginthread(Run,0,(void*)m_pMsgQueue);
}
~Backgrounder(){ }
};
int main(){
MESSAGE_LIST g_List;
CriticalSection crt;
ErrorHandler err;
LockSection lc(&crt,&err); // Does not work, see question #2
MESSAGE_QUEUE g_msgQueue(&g_List);
g_msgQueue.Add(0x45);
printf("%d",g_msgQueue.getLast());
Backgrounder back_thread(&g_msgQueue);
while(!kbhit());
return 0;
}
#ifndef CRITICALSECTION_H
#define CRITICALSECTION_H
#include <windows.h>
#include "ErrorHandler.h"
class CriticalSection{
long m_nLockCount;
long m_nThreadId;
typedef CRITICAL_SECTION cs;
cs m_tCS;
public:
CriticalSection(){
::InitializeCriticalSection(&m_tCS);
m_nLockCount = 0;
m_nThreadId = 0;
}
~CriticalSection(){ ::DeleteCriticalSection(&m_tCS); }
void Enter(){ ::EnterCriticalSection(&m_tCS); }
void Leave(){ ::LeaveCriticalSection(&m_tCS); }
void Try();
};
class LockSection{
CriticalSection* m_pCS;
ErrorHandler * m_pErrorHandler;
bool m_bIsClosed;
public:
LockSection(CriticalSection* pCS,ErrorHandler* pErrorHandler){
m_bIsClosed = false;
m_pCS = pCS;
m_pErrorHandler = pErrorHandler;
// 0x1AE is code prefix for critical section header
if(!m_pCS)m_pErrorHandler->Add(0x1AE1);
if(m_pCS)m_pCS->Enter();
}
~LockSection(){
if(!m_pCS)m_pErrorHandler->Add(0x1AE2);
if(m_pCS && m_bIsClosed == false)m_pCS->Leave();
}
void ForceCSectionClose(){
if(!m_pCS)m_pErrorHandler->Add(0x1AE3);
if(m_pCS){m_pCS->Leave();m_bIsClosed = true;}
}
};
/*
Safe class basic structure;
class SafeObj
{
CriticalSection m_cs;
public:
void SafeMethod()
{
LockSection myLock(&m_cs);
//add code to implement the method ...
}
};
*/
#endif
Two questions in one. I don't know about the first, but the critical section part is easy to explain. The background thread isn't trying to claim the lock and so, of course, is not blocked. You need to make the critical section object crt visible to the thread so that it can lock it.
The way to use this lock class is that each section of code that you want serialised must create a LockSection object and hold on to it until the end of the serialised block:
Thread 1:
{
LockSection lc(&crt,&err);
//operate on shared object from thread 1
}
Thread 2:
{
LockSection lc(&crt,&err);
//operate on shared object from thread 2
}
Note that it has to be the same critical section instance crt that is used in each block of code that is to be serialised.
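For instance, here is a minimal sketch (untested, reusing the question's own classes plus a hypothetical ThreadArgs struct) of how the background thread could take part in the locking:
// Hypothetical glue: hand the thread the shared CriticalSection along with the queue.
struct ThreadArgs {
    MESSAGE_QUEUE* pQueue;
    CriticalSection* pCS;
    ErrorHandler* pErr;
};

static void __cdecl Run(void* args) {
    ThreadArgs* p = (ThreadArgs*)args;
    LockSection lc(p->pCS, p->pErr); // blocks here until main() releases crt
    if (p->pQueue->getLast() == 0x45) printf("It's a success!");
    else printf("It's a trap!");
}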
This code has a number of problems.
First of all, deriving from the standard containers is almost always a poor idea. In this case you're using private inheritance, which reduces the problems but doesn't eliminate them entirely. In any case, you don't seem to put the inheritance to much (any?) use: even though you've derived your MESSAGE_QUEUE from MESSAGE_LIST (which is actually std::vector<int>), you embed a pointer to a MESSAGE_LIST instance in MESSAGE_QUEUE anyway.
Second, if you're going to use a queue to communicate between threads (which I think is generally a good idea) you should make the locking inherent in the queue operations rather than requiring each thread to manage the locking correctly on its own.
Third, a vector isn't a particularly suitable data structure for representing a queue anyway, unless you're going to make it fixed size, and use it roughly like a ring buffer. That's not a bad idea either, but it's quite a bit different from what you've done. If you're going to make the size dynamic, you'd probably be better off starting with a deque instead.
Fourth, the magic numbers in your error handling (0x1AE1, 0x1AE2, etc.) are quite opaque. At the very least, you need to give them meaningful names. The one comment you have does not come anywhere close to making their use clear.
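For instance (the names here are hypothetical, inferred from where each code is raised in LockSection):
// Hypothetical named replacements for the 0x1AE* magic numbers.
enum LockSectionError {
    ERR_NULL_CS_ON_LOCK   = 0x1AE1, // constructor given a null CriticalSection
    ERR_NULL_CS_ON_UNLOCK = 0x1AE2, // destructor found a null CriticalSection
    ERR_NULL_CS_ON_FORCE  = 0x1AE3  // ForceCSectionClose() found a null CriticalSection
};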
Finally, if you're going to go to all the trouble of writing code for a thread-safe queue, you might as well make it generic so it can hold essentially any kind of data you want, instead of dedicating it to one specific type.
Ultimately, your code doesn't seem to save the client much work or trouble over using the Windows functions directly. For the most part, you've just provided the same capabilities under slightly different names.
IMO, a thread-safe queue should handle almost all the work internally, so that client code can use it about like it would any other queue.
// Warning: untested code.
// Assumes: `T::T(T const &) throw()`
//
#include <windows.h>
#include <deque>
#include <process.h> // for _beginthread (used in the example below)
#include <cstdio>

template <class T>
class queue {
std::deque<T> data;
CRITICAL_SECTION cs;
HANDLE semaphore;
public:
queue() {
InitializeCriticalSection(&cs);
semaphore = CreateSemaphore(NULL, 0, 2048, NULL);
}
~queue() {
DeleteCriticalSection(&cs);
CloseHandle(semaphore);
}
void push(T const &item) {
EnterCriticalSection(&cs);
data.push_back(item);
LeaveCriticalSection(&cs);
ReleaseSemaphore(semaphore, 1, NULL);
}
T pop() {
WaitForSingleObject(semaphore, INFINITE);
EnterCriticalSection(&cs);
T item = data.front();
data.pop_front();
LeaveCriticalSection(&cs);
return item;
}
};
HANDLE done;
typedef queue<int> msgQ;
enum commands { quit, print };
void backgrounder(void *qq) {
// I haven't quite puzzled out what your background thread
// was supposed to do, so I've kept it really simple, executing only
// the two commands listed above.
msgQ *q = (msgQ *)qq;
int command;
while (quit != (command = q->pop()))
printf("Print\n");
SetEvent(done);
}
int main() {
msgQ q;
done = CreateEvent(NULL, false, false, NULL);
_beginthread(backgrounder, 0, (void*)&q);
for (int i=0; i<20; i++)
q.push(print);
q.push(quit);
WaitForSingleObject(done, INFINITE);
return 0;
}
Your background thread needs access to the same CriticalSection object and it needs to create LockSection objects to lock it -- the locking is collaborative.
You are trying to return the last element after popping it: getLast() calls pop_back() first and then returns back(), which by then refers to a different element (and is undefined on an empty vector).
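A corrected sketch (untested; the empty-queue behaviour is a guess at the original intent):
int getLast()
{
    if (m_pList && !m_pList->empty()) {
        int last = m_pList->back(); // read the value *before* removing it
        m_pList->pop_back();
        return last;
    }
    return 0x0; // hypothetical sentinel for "nothing queued"
}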
I have some concerns about using unique_ptr in multithreaded code without a mutex. I wrote the simplified code below; please take a look. If I check that the unique_ptr != nullptr, is it thread safe?
class BigClassCreatedOnce
{
public:
std::atomic<bool> var;
// A lot of other stuff
};
The BigClassCreatedOnce instance will be created only once, but I'm not sure whether it's safe to use it between threads.
class MainClass
{
public:
// m_bigClass used all around the class from the Main Thread
MainClass()
: m_bigClass()
, m_thread()
{
m_thread = std::thread([this]() {
while (1)
{
methodToBeCalledFromThread();
std::this_thread::sleep_for(std::chrono::milliseconds(1));
}
});
// other stuff here
m_bigClass.reset(new BigClassCreatedOnce()); // created only once
}
void methodToBeCalledFromThread()
{
if (!m_bigClass) // As I understand this is not safe
{
return;
}
if (m_bigClass->var.load()) // As I understand this is safe
{
// does something
}
}
std::unique_ptr<BigClassCreatedOnce> m_bigClass;
std::thread m_thread;
};
I just put it into an infinite loop to simplify the sample.
int main()
{
MainClass ms;
while (1)
{
std::this_thread::sleep_for(std::chrono::milliseconds(1));
}
}
If I check unique_ptr != nullptr, is it thread safe
No, it is not thread safe. If you have more than one thread and at least one of them writes to the shared data, then you need synchronization. If you do not have it, you have a data race, and it is undefined behavior.
m_bigClass.reset(new BigClassCreatedOnce()); // created only once
and
if (!m_bigClass)
Can both happen at the same time, so it is a data race.
I would also like to point out that
if (m_bigClass->var.load())
Is also not thread safe. The var.load() call is, but the access of m_bigClass is not, so you could have a data race there as well.
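One minimal fix, sketched under the assumption that the pointer never changes after construction, is to finish initializing m_bigClass before the thread is started; thread creation then provides the necessary happens-before edge:
MainClass()
    : m_bigClass(new BigClassCreatedOnce()) // fully constructed before the thread exists
{
    // The thread is created only after m_bigClass is set, so the null
    // check in methodToBeCalledFromThread() is no longer racy (and in
    // fact becomes unnecessary).
    m_thread = std::thread([this]() {
        while (true)
        {
            methodToBeCalledFromThread();
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    });
}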
Edit: @Mike pointed out that my try_lock function in the code below is unsafe, and that accessor creation can produce a race condition as well. The suggestions (from everyone) have convinced me that I'm going down the wrong path.
Original Question
The requirements for locking on an embedded microcontroller are different enough from multithreading that I haven't been able to convert multithreading examples to my embedded applications. Typically I don't have an OS or threads of any kind, just main and whatever interrupt functions are called by the hardware periodically.
It's pretty common that I need to fill a buffer from an interrupt but process it in main. I've created the IrqMutex class below to try to implement this safely. Each user of the buffer is assigned a unique id through IrqMutexAccessor, and each of them can then try_lock() and unlock(). The idea of a blocking lock() function doesn't work from interrupts, because unless you allow the interrupt to complete, no other code can execute, so the unlock() code would never run. I do, however, occasionally use a blocking lock from the main() code.
However, I know that the double-checked lock doesn't work without C++11 memory barriers (which aren't available on many embedded platforms). Honestly, despite reading quite a bit about it, I don't really understand how or why memory-access reordering can cause a problem. I think that the use of volatile sig_atomic_t (possibly combined with the use of unique IDs) makes this different from the double-checked lock. But I'm hoping someone can confirm that the following code is correct, explain why it isn't safe, or offer a better way to accomplish this.
class IrqMutex {
friend class IrqMutexAccessor;
private:
std::sig_atomic_t accessorIdEnum;
volatile std::sig_atomic_t owner;
protected:
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
bool have_lock(std::sig_atomic_t accessorId) {
return (owner == accessorId);
}
bool try_lock(std::sig_atomic_t accessorId) {
// Only try to get a lock, while it isn't already owned.
while (owner == SIG_ATOMIC_MIN) {
// <-- If an interrupt occurs here, both attempts can get a lock at the same time.
// Try to take ownership of this Mutex.
owner = accessorId; // SET
// Double check that we are the owner.
if (owner == accessorId) return true;
// Someone else must have taken ownership between CHECK and SET.
// If they released it after CHECK, we'll loop back and try again.
// Otherwise someone else has a lock and we have failed.
}
// This shouldn't happen unless they called try_lock on something they already owned.
if (owner == accessorId) return true;
// If someone else owns it, we failed.
return false;
}
bool unlock(std::sig_atomic_t accessorId) {
// Double check that the owner called this function (not strictly required)
if (owner == accessorId) {
owner = SIG_ATOMIC_MIN;
return true;
}
// We still return true if the mutex was unlocked anyway.
return (owner == SIG_ATOMIC_MIN);
}
public:
IrqMutex(void) : accessorIdEnum(SIG_ATOMIC_MIN), owner(SIG_ATOMIC_MIN) {}
};
// This class is used to manage our unique accessorId.
class IrqMutexAccessor {
friend class IrqMutex;
private:
IrqMutex& mutex;
const std::sig_atomic_t accessorId;
public:
IrqMutexAccessor(IrqMutex& m) : mutex(m), accessorId(m.nextAccessor()) {}
bool have_lock(void) { return mutex.have_lock(accessorId); }
bool try_lock(void) { return mutex.try_lock(accessorId); }
bool unlock(void) { return mutex.unlock(accessorId); }
};
Because there is one processor and no threading, the mutex serves what I think is a subtly different purpose than usual. There are two main use cases I run into repeatedly.
The interrupt is a Producer and takes ownership of a free buffer and loads it with a packet of data. The interrupt/Producer may keep its ownership lock for a long time spanning multiple interrupt calls. The main function is the Consumer and takes ownership of a full buffer when it is ready to process it. The race condition rarely happens, but if the interrupt/Producer finishes with a packet and needs a new buffer, but they are all full it will try to take the oldest buffer (this is a dropped packet event). If the main/Consumer started to read and process that oldest buffer at exactly the same time they would trample all over each other.
The interrupt is just a quick change or increment of something (like a counter). However, if we want to reset the counter or jump to some new value with a call from the main() code, we don't want to write to the counter while it is changing. Here main actually does a blocking loop to obtain a lock; however, I think it's almost impossible to actually have to wait here for more than two attempts. Once it has a lock, any calls to the counter interrupt will be skipped, but that's generally not a big deal for something like a counter. Then I update the counter value and unlock it so it can start incrementing again.
I realize these two samples are dumbed down a bit, but some version of these patterns occurs in many of the peripherals in every project I work on, and I'd like one piece of reusable code that can safely handle this across various embedded platforms. I included the C tag because all of this is directly convertible to C code, and on some embedded compilers that's all that is available. So I'm trying to find a general method that is guaranteed to work in both C and C++.
struct ExampleCounter {
volatile long long int value;
IrqMutex mutex;
} exampleCounter;
struct ExampleBuffer {
volatile char data[256];
volatile size_t index;
IrqMutex mutex; // One mutex per buffer.
} exampleBuffers[2];
const volatile char * const REGISTER;
// This accessor shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutex(exampleCounter.mutex);
void __irqQuickFunction(void) {
// Obtain a lock, add the data then unlock all within one function call.
if (myMutex.try_lock()) {
exampleCounter.value++;
myMutex.unlock();
} else {
// If we failed to obtain a lock, we skipped this update this one time.
}
}
// These accessors shouldn't be created in an interrupt or a race condition can occur.
static IrqMutexAccessor myMutexes[2] = {
IrqMutexAccessor(exampleBuffers[0].mutex),
IrqMutexAccessor(exampleBuffers[1].mutex)
};
void __irqLongFunction(void) {
static size_t bufferIndex = 0;
// Check if we have a lock.
if (!myMutexes[bufferIndex].have_lock() && !myMutexes[bufferIndex].try_lock()) {
// If we can't get a lock try the other buffer
bufferIndex = (bufferIndex + 1) % 2;
// One buffer should always be available so the next line should always be successful.
if (!myMutexes[bufferIndex].try_lock()) return;
}
// ... at this point we know we have a lock ...
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
exampleBuffers[bufferIndex].data[exampleBuffers[bufferIndex].index++] = c;
// We may keep the lock for multiple function calls until the end of packet.
static const char END_PACKET_SIGNAL = '\0';
if (c == END_PACKET_SIGNAL) {
// Unlock this buffer so it can be read from main.
myMutexes[bufferIndex].unlock();
// Switch to the other buffer for next time.
bufferIndex = (bufferIndex + 1) % 2;
}
}
int main(void) {
while (true) {
// Mutex for counter
static IrqMutexAccessor myCounterMutex(exampleCounter.mutex);
// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
// Skip any updates that occur while we are updating the counter.
while(!myCounterMutex.try_lock()) {
// Wait for the interrupt to release its lock.
}
// Set the counter to a new value.
exampleCounter.value = 500;
// Updates will start again as soon as we unlock it.
myCounterMutex.unlock();
}
// Mutexes for __irqLongFunction.
static IrqMutexAccessor myBufferMutexes[2] = {
IrqMutexAccessor(exampleBuffers[0].mutex),
IrqMutexAccessor(exampleBuffers[1].mutex)
};
// Process buffers from __irqLongFunction.
for (size_t i = 0; i < 2; i++) {
// Obtain a lock so we can read the data.
if (!myBufferMutexes[i].try_lock()) continue;
// Check that the buffer isn't empty.
if (exampleBuffers[i].index == 0) {
myBufferMutexes[i].unlock(); // Don't forget to unlock.
continue;
}
// ... read and do something with the data here ...
exampleBuffers[i].index = 0;
myBufferMutexes[i].unlock();
}
}
}
Also note that I used volatile on any variable that is read or written by the interrupt routine (unless the variable was only accessed from the interrupt, like the static bufferIndex value in __irqLongFunction). I've read that mutexes remove some of the need for volatile in multithreaded code, but I don't think that applies here. Did I use the right amount of volatile? I used it on: ExampleBuffer[].data[256], ExampleBuffer[].index, and ExampleCounter.value.
I apologize for the long answer, but perhaps it is fitting for a long question.
To answer your first question, I would say that your implementation of IrqMutex is not safe. Let me try to explain where I see problems.
Function nextAccessor
std::sig_atomic_t nextAccessor(void) { return ++accessorIdEnum; }
This function has a race condition, because the increment operator is not atomic, even though it operates on a volatile std::sig_atomic_t. It involves three operations: reading the current value of accessorIdEnum, incrementing it, and writing the result back. If two IrqMutexAccessors are created at the same time, it's possible for them both to get the same ID.
Function try_lock
The try_lock function also has a race condition. One thread (e.g. main) could enter the while loop, and then, before taking ownership, another thread (e.g. an interrupt) could also enter the while loop and take ownership of the lock (returning true). Then the first thread can continue, moving on to owner = accessorId, and thus "also" take the lock. So two threads (or your main thread and an interrupt) can try_lock an unowned mutex at the same time and both return true.
Disabling interrupts by RAII
We can achieve some level of simplicity and encapsulation by using RAII for interrupt disabling, for example the following class:
class InterruptLock {
public:
InterruptLock() {
prevInterruptState = currentInterruptState();
disableInterrupts();
}
~InterruptLock() {
restoreInterrupts(prevInterruptState);
}
private:
int prevInterruptState; // Whatever type this should be for the platform
InterruptLock(const InterruptLock&); // Not copy-constructable
};
And I would recommend disabling interrupts to get the atomicity you need within the mutex implementation itself. For example something like:
bool try_lock(std::sig_atomic_t accessorId) {
InterruptLock lock;
if (owner == SIG_ATOMIC_MIN) {
owner = accessorId;
return true;
}
return false;
}
bool unlock(std::sig_atomic_t accessorId) {
InterruptLock lock;
if (owner == accessorId) {
owner = SIG_ATOMIC_MIN;
return true;
}
return false;
}
Depending on your platform, this might look different, but you get the idea.
As you said, this provides a way to abstract away the disabling and enabling of interrupts in general code, and encapsulates it in this one class.
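The same treatment would fix nextAccessor, sketched here under the same assumptions:
std::sig_atomic_t nextAccessor(void) {
    InterruptLock lock; // makes the read-increment-write sequence atomic w.r.t. interrupts
    return ++accessorIdEnum;
}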
Mutexes and Interrupts
Having said how I would consider implementing the mutex class, I would not actually use a mutex class for your use-cases. As you pointed out, mutexes don't really play well with interrupts, because an interrupt can't "block" on trying to acquire a mutex. For this reason, for code that directly exchanges data with an interrupt, I would instead strongly consider just directly disabling interrupts (for a very short time while the main "thread" touches the data).
So your counter might simply look like this:
volatile long long int exampleCounter;
void __irqQuickFunction(void) {
exampleCounter++;
}
...
// Change counter value
if (EVERY_ONCE_IN_A_WHILE) {
InterruptLock lock;
exampleCounter = 500;
}
In my mind, this is easier to read, easier to reason about, and won't "slip" when there's contention (ie miss timer beats).
Regarding the buffer use-case, I would strongly recommend against holding a lock for multiple interrupt cycles. A lock/mutex should be held for just the slightest moment required to "touch" a piece of memory - just long enough to read or write it. Get in, get out.
So this is how the buffering example might look:
struct ExampleBuffer {
char data[256];
} exampleBuffers[2];
ExampleBuffer* volatile bufferAwaitingConsumption = nullptr;
ExampleBuffer* volatile freeBuffer = &exampleBuffers[1];
const volatile char * const REGISTER;
void __irqLongFunction(void) {
static const char END_PACKET_SIGNAL = '\0';
static size_t index = 0;
static ExampleBuffer* receiveBuffer = &exampleBuffers[0];
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
receiveBuffer->data[index++] = c;
// End of packet?
if (c == END_PACKET_SIGNAL) {
// Make the packet available to the consumer
bufferAwaitingConsumption = receiveBuffer;
// Move on to the next buffer
receiveBuffer = freeBuffer;
freeBuffer = nullptr;
index = 0;
}
}
int main(void) {
while (true) {
// Fetch packet from shared variable
ExampleBuffer* packet;
{
InterruptLock lock;
packet = bufferAwaitingConsumption;
bufferAwaitingConsumption = nullptr;
}
if (packet) {
// ... read and do something with the data here ...
// Once we're done with the buffer, we need to release it back to the producer
{
InterruptLock lock;
freeBuffer = packet;
}
}
}
}
This code is arguably easier to reason about, since there are only two memory locations shared between the interrupt and the main loop: one to pass packets from the interrupt to the main loop, and one to pass empty buffers back to the interrupt. We also only touch those variables under "lock", and only for the minimum time needed to "move" the value. (for simplicity I've skipped over the buffer overflow logic when the main loop takes too long to free the buffer).
It's true that in this case one may not even need the locks, since we're just reading and writing a simple value, but the cost of disabling the interrupts is not much, and the risk of making mistakes otherwise is not worth it in my opinion.
Edit
As pointed out in the comments, the above solution was meant to tackle only the multithreading problem, and it omitted overflow checking. Here is a more complete solution, which should be robust under overflow conditions:
const size_t BUFFER_COUNT = 2;
struct ExampleBuffer {
char data[256];
ExampleBuffer* next;
} exampleBuffers[BUFFER_COUNT];
volatile size_t overflowCount = 0;
class BufferList {
public:
BufferList() : first(nullptr), last(nullptr) { }
// Atomic enqueue
void enqueue(ExampleBuffer* buffer) {
InterruptLock lock;
buffer->next = nullptr; // the new node becomes the tail
if (last)
last->next = buffer;
else
first = buffer;
last = buffer;
}
// Atomic dequeue (or returns null)
ExampleBuffer* dequeueOrNull() {
InterruptLock lock;
ExampleBuffer* result = first;
if (first) {
first = first->next;
if (!first)
last = nullptr;
}
return result;
}
private:
ExampleBuffer* first;
ExampleBuffer* last;
} freeBuffers, buffersAwaitingConsumption;
const volatile char * const REGISTER;
void __irqLongFunction(void) {
static const char END_PACKET_SIGNAL = '\0';
static size_t index = 0;
static ExampleBuffer* receiveBuffer = &exampleBuffers[0];
// Recovery from overflow?
if (!receiveBuffer) {
// Try get another free buffer
receiveBuffer = freeBuffers.dequeueOrNull();
// Still no buffer?
if (!receiveBuffer) {
overflowCount++;
return;
}
}
// Get data from the hardware and modify the buffer here.
const char c = *REGISTER;
if (index < sizeof(receiveBuffer->data))
receiveBuffer->data[index++] = c;
// End of packet, or out of space?
if (c == END_PACKET_SIGNAL) {
// Make the packet available to the consumer
buffersAwaitingConsumption.enqueue(receiveBuffer);
// Move on to the next free buffer
receiveBuffer = freeBuffers.dequeueOrNull();
index = 0;
}
}
size_t getAndResetOverflowCount() {
InterruptLock lock;
size_t result = overflowCount;
overflowCount = 0;
return result;
}
int main(void) {
// All buffers are free at the start
for (int i = 0; i < BUFFER_COUNT; i++)
freeBuffers.enqueue(&exampleBuffers[i]);
while (true) {
// Fetch packet from shared variable
ExampleBuffer* packet = buffersAwaitingConsumption.dequeueOrNull();
if (packet) {
// ... read and do something with the data here ...
// Once we're done with the buffer, we need to release it back to the producer
freeBuffers.enqueue(packet);
}
size_t overflowBytes = getAndResetOverflowCount();
if (overflowBytes) {
// ...
}
}
}
The key changes:
If the interrupt runs out of free buffers, it will recover
If the interrupt receives data while it doesn't have a receive buffer, it will communicate that to the main thread via getAndResetOverflowCount
If you keep getting buffer overflows, you can simply increase the buffer count
I've encapsulated the multithreaded access into a queue class implemented as a linked list (BufferList), which supports atomic dequeue and enqueue. The previous example also used queues, but of length 0-1 (either an item is enqueued or it isn't), and so the implementation of the queue was just a single variable. In the case of running out of free buffers, the receive queue could have 2 items, so I upgraded it to a proper queue rather than adding more shared variables.
If the interrupt is the producer and mainline code is the consumer, surely it's as simple as disabling the interrupt for the duration of the consume operation?
That's how I used to do it in my embedded microcontroller days.
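In sketch form (disable_irq, enable_irq, and consume are hypothetical stand-ins for whatever your platform and application provide):
disable_irq();          // the producing interrupt can no longer fire
consume(sharedBuffer);  // read (or copy out) the shared buffer safely
enable_irq();           // the producer resumes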
I need to make my own simple thread-safe shared-pointer class for embedded devices.
I made a counting master pointer and handle as described in Jeff Alger's book (C++ for Real Programmers). These are my sources:
template <class T>
class counting_ptr {
public:
counting_ptr() : m_pointee(new T), m_counter(0) {}
counting_ptr(const counting_ptr<T>& sptr) :m_pointee(new T(*(sptr.m_pointee))), m_counter(0) {}
~counting_ptr() {delete m_pointee;}
counting_ptr<T>& operator=(const counting_ptr<T>& sptr)
{
if (this == &sptr) return *this;
delete m_pointee;
m_pointee = new T(*(sptr.m_pointee));
return *this;
}
void grab() {m_counter++;}
void release()
{
if (m_counter > 0) m_counter--;
if (m_counter <= 0)
delete this;
}
T* operator->() const {return m_pointee;}
private:
T* m_pointee;
int m_counter;
};
template <class T>
class shared_ptr {
private:
counting_ptr<T>* m_pointee;
public:
shared_ptr() : m_pointee(new counting_ptr<T>()) { m_pointee->grab(); }
shared_ptr(counting_ptr<T>* a_pointee) : m_pointee(a_pointee) { m_pointee->grab(); }
shared_ptr(const shared_ptr<T>& a_src) : m_pointee(a_src.m_pointee) {m_pointee->grab(); }
~shared_ptr() { m_pointee->release(); }
shared_ptr<T>& operator=(const shared_ptr<T>& a_src)
{
if (this == &a_src) return *this;
if (m_pointee == a_src.m_pointee) return *this;
m_pointee->release();
m_pointee = a_src.m_pointee;
m_pointee->grab();
return *this;
}
counting_ptr<T>* operator->() const {return m_pointee;}
};
This works fine if it is used in one thread. Suppose I have two threads:
//thread 1
shared_ptr<T> p = some_global_shared_ptr;
//thread 2
some_global_shared_ptr = another_shared_ptr;
In this case I can get undefined behaviour if one of the threads is interrupted between allocating/deallocating memory and changing the counter. Of course, I can enclose shared_ptr::release() in a critical section so that deletion of the pointer is safe. But what can I do about the copy constructor? It is possible that the constructor will be interrupted during the construction of m_pointee by another thread that deletes that very m_pointee.
The only way I can see to make shared_ptr assignment thread-safe is to enclose the assignment (or creation) in a critical section. But this must be done in user code; in other words, the user of the shared_ptr class must take care of safety.
Is it possible to change this implementation somehow to make the shared_ptr class thread safe?
=== EDIT ===
After some investigation (thanks to Jonathan) I realized that my shared_ptr has three unsafe spots:
Non-atomic counter updates
A non-atomic assignment operator (the source object can be deleted during copying)
The shared_ptr copy constructor (very similar to the previous case)
The first two cases can easily be fixed by adding critical sections. But I can't see how to add a critical section to the copy constructor. The copy of a_src.m_pointee is made before any other code in the constructor executes, and it can be deleted before grab is called. As Jonathan said in his comment, this problem is very difficult to fix.
I made this test:
typedef shared_ptr<....> Ptr;
Ptr p1, p2;
//thread 1
while (true)
{
Ptr p;
p2 = p;
}
//thread 2
while (!stop)
{
p1 = p2;
Ptr P(p2);
}
Of course, it crashed. But then I tried std::shared_ptr in VS 2013 C++, and it works!
So it is possible to make a thread-safe copy constructor for shared_ptr. But the STL sources are too difficult for me, and I don't understand how they did the trick. Can anyone explain how it works in the STL?
=== EDIT 2 ===
I am sorry, but the test for std::shared_ptr was done incorrectly. It fails too, exactly as boost::shared_ptr does. Sometimes the copy constructor fails to make a copy because the source was deleted during copying; in that case an empty pointer is created.
This is hard to get right, so I would seriously consider whether you actually need to support concurrent reads and writes of a single object (boost::shared_ptr and std::shared_ptr do not support that unless all accesses are done through the atomic_xxx() functions that are overloaded for shared_ptr, and which typically acquire a lock).
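For reference, a minimal sketch of the atomic_xxx() approach mentioned above (std::atomic_load and std::atomic_store are standard since C++11, though implementations typically take an internal lock):
#include <memory>

std::shared_ptr<int> some_global_shared_ptr = std::make_shared<int>(1);

// Thread 1: take a consistent snapshot of the global.
void reader() {
    std::shared_ptr<int> p = std::atomic_load(&some_global_shared_ptr);
    // p keeps its pointee alive even if the global is replaced concurrently.
}

// Thread 2: replace the global atomically.
void writer() {
    std::atomic_store(&some_global_shared_ptr, std::make_shared<int>(2));
}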
For a start you would need to change shared_ptr<T>::m_pointee to atomic<counting_ptr<T>*> so that you can store a new value in it atomically. counting_ptr<T>::m_counter would need to be atomic<int> so the ref-count updates can be done atomically.
Your assignment operator is a big problem: you would need to at least re-order the operations so that you increase the ref-count first, and avoid time-of-check to time-of-use bugs. Something like this (not even compiled, let alone tested):
shared_ptr<T>& operator=(const shared_ptr<T>& a_src)
{
counting_ptr<T>* new_ptr = a_src.m_pointee.load();
new_ptr->grab();
counting_ptr<T>* old_ptr = m_pointee.exchange(new_ptr);
old_ptr->release();
return *this;
}
This form is safe against self-assignment (it just increases the ref-count then decreases it again if the two objects share the same pointee). It's still not safe against a_src changing while you try to copy it. Consider the case where a_src.m_pointee->m_counter == 1 initially. The current thread could call load() to get the other object's pointer, then a second thread could call release() on that pointer, which would delete it, making the grab() call undefined behaviour because it accesses an object that has been destroyed and memory that has been deallocated. Fixing that requires a pretty major redesign and maybe atomic operations that can operate on two words at once.
Getting this right is possible but is hard and you should really reconsider whether it's necessary, or if the code using it can just avoid modifying objects while other threads are reading them, except while the user has locked a mutex or other form of manual synchronisation.
After some investigation I can conclude that it is impossible to make a thread-safe shared_ptr class where thread safety is understood as follows:
//thread 1
shared_ptr<T> p = some_global_shared_ptr;
//thread 2
some_global_shared_ptr = another_shared_ptr;
This example doesn't guarantee that p in the first thread will point to either the old or the new value of some_global_shared_ptr. In general this example leads to undefined behavior. The only way to make the example safe is to wrap both operations in critical sections or mutexes.
The main problem is caused by the copy constructor of the shared_ptr class. The other problems can be solved using critical sections inside the shared_ptr methods.
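In sketch form, the manual synchronisation this conclusion calls for looks like the following (assuming a mutex dedicated to guarding the global):
std::mutex g_ptr_mutex; // guards every read and write of some_global_shared_ptr

// thread 1: copy under the lock, then use the copy freely
shared_ptr<T> p;
{
    std::lock_guard<std::mutex> guard(g_ptr_mutex);
    p = some_global_shared_ptr;
}

// thread 2: replace under the same lock
{
    std::lock_guard<std::mutex> guard(g_ptr_mutex);
    some_global_shared_ptr = another_shared_ptr;
}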
Just inherit your class from CmyLock and you can make everything thread safe.
I have used this class for many years in all my code, usually combined with the class CmyThread, which creates a thread with a very safe mutex. Maybe my answer is a little late, but the answers above are not good practice.
/** Constructor */
CmyLock::CmyLock()
{
(void) pthread_mutexattr_init( &m_attr);
pthread_mutexattr_settype( &m_attr, PTHREAD_MUTEX_RECURSIVE);
pthread_mutex_init( &m_mutex, &m_attr);
}
/** Lock the thread for other threads. */
void CmyLock::lock()
{
pthread_mutex_lock( &m_mutex);
}
/** Unlock the thread for other threads. */
void CmyLock::unlock()
{
pthread_mutex_unlock( &m_mutex);
}
Here is also the thread class. Please copy the CmyLock and CmyThread classes into your project and tell me whether it works! Although it was made for Linux, Windows and Mac should also be able to run it.
For the include file:
// @brief Class to create a single thread.
class CmyThread : public CmyLock
{
friend void *mythread_interrupt(void *ptr);
public:
CmyThread();
virtual ~CmyThread();
virtual void startWorking() {}
virtual void stopWorking() {}
virtual void work();
virtual void start();
virtual void stop();
bool isStopping() { return m_stopThread; }
bool isRunning() { return m_running && !m_stopThread; }
private:
virtual void run();
private:
bool m_running; ///< Thread is now running.
pthread_t m_thread; ///< Pointer to thread.
bool m_stopThread; ///< Indicate to stop thread.
};
The C++ file:
/** @brief Interrupt handler.
* @param ptr [in] SELF pointer for the instance.
*/
void *mythread_interrupt(void *ptr)
{
CmyThread *irq =
static_cast<CmyThread*> (ptr);
if (irq != NULL)
{
irq->run();
}
return NULL;
}
/** Constructor new thread. */
CmyThread::CmyThread()
: m_running( false)
, m_thread( 0)
, m_stopThread( false)
{
}
/** Start thread. */
void CmyThread::start()
{
m_running =true;
m_stopThread =false;
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
int stack_size =8192*1024;
pthread_attr_setstacksize(&attr, stack_size);
pthread_create(&m_thread, &attr, mythread_interrupt, (void*) this);
}
/** Thread function running. */
void CmyThread::run()
{
startWorking();
while (m_running && m_stopThread==false)
{
work();
}
m_running =false;
stopWorking();
pthread_exit(0);
}
/** Function to override for a thread. */
void CmyThread::work()
{
delay(5000);
}
For example, here is a simplistic example that stores and retrieves data safely:
class a : public CmyLock
{
public:
void set_safe(const char *data)
{
lock();
fileContent = std::make_shared<std::string>(data);
unlock();
}
void get_safe(char *data)
{
lock();
strcpy( data, fileContent->c_str());
unlock();
}
private:
std::shared_ptr<std::string> fileContent;
};
What would be a good/best way to ensure thread safety for callback objects? Specifically, I'm trying to prevent a callback object from being destructed before all the threads are finished with it.
It is easy to write the client code to ensure thread safety, but I'm looking for a way that is a bit more streamlined. For example, using a factory object to generate the callback objects; the trouble then lies in tracking the usage of the callback object.
Below is an example code that I'm trying to improve.
class CHandlerCallback
{
public:
CHandlerCallback(){ ... };
virtual ~CHandlerCallback(){ ... };
virtual void OnBegin(UINT nTotal){ ... };
virtual void OnStep(UINT nIncrmt){ ... };
virtual void OnEnd(UINT nErrCode){ ... };
protected:
...
};
static DWORD WINAPI ThreadProc(LPVOID lpParameter)
{
CHandler* phandler = (CHandler*)lpParameter;
phandler ->ThreadProc();
return 0;
};
class CHandler
{
public:
CHandler(CHandlerCallback * sink = NULL) {
m_pSink = sink;
// Start the server thread. (ThreadProc)
};
~CHandler(){...};
VOID ThreadProc() {
... do stuff
if (m_pSink) m_pSink->OnBegin(..)
while (not exit) {
... do stuff
if (m_pSink) m_pSink->OnStep(..)
... do stuff
}
if (m_pSink) m_pSink->OnEnd(..);
};
private:
CHandlerCallback * m_pSink;
};
class CSpecial1Callback: public CHandlerCallback
{
public:
CSpecial1Callback(){ ... };
virtual ~CSpecial1Callback(){ ... };
virtual void OnStep(UINT nIncrmt){ ... };
};
class CSpecial2Callback: public CHandlerCallback...
Then the code that runs everything in a way similar to the following:
int main() {
CSpecial2Callback* pCallback = new CSpecial2Callback();
CHandler handler(pCallback );
// Right now the client waits for CHandler to finish before deleting
// pCallback
}
Thanks!
If you're using C++11 you can use smart pointers to keep the object around until the last reference to it disappears. See std::shared_ptr. If you're not on C++11 you could use Boost's version. If you don't want to include that library and aren't on C++11, you can resort to keeping an internal count of the threads using the object and destroying it when that count reaches 0. Note that trying to track the counter yourself can be difficult, as you'll need atomic updates to the counter.
shared_ptr<CSpecial2Callback> pCallback(new CSpecial2Callback());
CHandler handler(pCallback); // You'll need to change this to take a shared_ptr
... //Rest of code -- when the last reference to
... //pCallback is used up it will be destroyed.
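The corresponding change to CHandler might look like this sketch (the thread-start code is unchanged):
class CHandler
{
public:
    CHandler(std::shared_ptr<CHandlerCallback> sink = nullptr)
        : m_pSink(sink) {
        // Start the server thread (ThreadProc) as before.
    }
private:
    std::shared_ptr<CHandlerCallback> m_pSink; // shares ownership, so the
                                               // callback outlives the thread
};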
I had a need for a Blocking Queue in C++ with timeout-capable offer(). The queue is intended for multiple producers, one consumer. Back when I was implementing, I didn't find any good existing queues that fit this need, so I coded it myself.
I'm seeing segfaults come out of the take() method on the queue, but they are intermittent. I've been looking over the code for issues but I'm not seeing anything that looks problematic.
I'm wondering if:
there is an existing library that does this reliably that I should use (Boost or header-only preferred), or
anyone sees an obvious flaw in my code that I need to fix.
Here is the header:
class BlockingQueue
{
public:
BlockingQueue(unsigned int capacity) : capacity(capacity) { };
bool offer(const MyType & myType, unsigned int timeoutMillis);
MyType take();
void put(const MyType & myType);
unsigned int getCapacity();
unsigned int getCount();
private:
std::deque<MyType> queue;
unsigned int capacity;
};
And the relevant implementations:
boost::condition_variable cond;
boost::mutex mut;
bool BlockingQueue::offer(const MyType & myType, unsigned int timeoutMillis)
{
Timer timer;
// boost::unique_lock is a scoped lock - its destructor will call unlock().
// So no need for us to make that call here.
boost::unique_lock<boost::mutex> lock(mut);
// We use a while loop here because the monitor may have woken up because
// another producer did a PulseAll. In that case, the queue may not have
// room, so we need to re-check and re-wait if that is the case.
// We use an external stopwatch to stop the madness if we have taken too long.
while (queue.size() >= this->capacity)
{
int monitorTimeout = timeoutMillis - ((unsigned int) timer.getElapsedMilliSeconds());
if (monitorTimeout <= 0)
{
return false;
}
if (!cond.timed_wait(lock, boost::posix_time::milliseconds(monitorTimeout)))
{
return false;
}
}
cond.notify_all();
queue.push_back(myType);
return true;
}
void BlockingQueue::put(const MyType & myType)
{
// boost::unique_lock is a scoped lock - its destructor will call unlock().
// So no need for us to make that call here.
boost::unique_lock<boost::mutex> lock(mut);
// We use a while loop here because the monitor may have woken up because
// another producer did a PulseAll. In that case, the queue may not have
// room, so we need to re-check and re-wait if that is the case.
// We use an external stopwatch to stop the madness if we have taken too long.
while (queue.size() >= this->capacity)
{
cond.wait(lock);
}
cond.notify_all();
queue.push_back(myType);
}
MyType BlockingQueue::take()
{
// boost::unique_lock is a scoped lock - its destructor will call unlock().
// So no need for us to make that call here.
boost::unique_lock<boost::mutex> lock(mut);
while (queue.size() == 0)
{
cond.wait(lock);
}
cond.notify_one();
MyType myType = this->queue.front();
this->queue.pop_front();
return myType;
}
unsigned int BlockingQueue::getCapacity()
{
return this->capacity;
}
unsigned int BlockingQueue::getCount()
{
return this->queue.size();
}
And yes, I didn't implement the class using templates - that is next on the list :)
Any help is greatly appreciated. Threading issues can be really hard to pin down.
-Ben
Why are cond and mut globals? I would expect them to be members of your BlockingQueue object. I don't know what else is touching those things, but there may be an issue there.
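The suggested change, as a small sketch:
class BlockingQueue
{
    // ... public interface as before ...
private:
    std::deque<MyType> queue;
    unsigned int capacity;
    boost::condition_variable cond; // one condition/mutex pair per queue,
    boost::mutex mut;               // instead of a single process-wide pair
};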
I too have implemented a ThreadSafeQueue as part of a larger project:
https://github.com/cdesjardins/QueuePtr/blob/master/include/ThreadSafeQueue.h
It is a similar concept to yours, except that the enqueue (aka offer) functions are non-blocking, because there is basically no max capacity. To enforce a capacity I typically have a pool with N buffers added at system init time, and a queue for message passing at run time. This also eliminates the need for memory allocation at run time, which I consider a good thing (I typically work on embedded applications).
The only difference between a pool and a queue is that a pool gets a bunch of buffers enqueued at system init time. So you have something like this:
ThreadSafeQueue<BufferDataType*> pool;
ThreadSafeQueue<BufferDataType*> queue;
void init()
{
for (int i = 0; i < NUM_BUFS; i++)
{
pool.enqueue(new BufferDataType);
}
}
Then when you want to send a message you do something like the following:
void producerA()
{
BufferDataType *buf;
if (pool.waitDequeue(buf, timeout) == true)
{
initBufWithMyData(buf);
queue.enqueue(buf);
}
}
This way the enqueue function is quick and easy, but if the pool is empty you will block until someone puts a buffer back into the pool. The idea is that some other thread will be blocking on the queue and will return buffers to the pool when they have been processed, as follows:
void consumer()
{
BufferDataType *buf;
if (queue.waitDequeue(buf, timeout) == true)
{
processBufferData(buf);
pool.enqueue(buf);
}
}
Anyway, take a look at it; maybe it will help.
I suppose the problem in your code is modifying the deque from several threads. Look:
you're waiting for a condition from another thread;
then you immediately send a signal to the other threads that the deque is unlocked, just before you want to modify it;
then you modify the deque while the other threads think the deque is already unlocked and start doing the same.
So, try to place all the cond.notify_*() calls after modifying the deque, i.e.:
void BlockingQueue::put(const MyType & myType)
{
boost::unique_lock<boost::mutex> lock(mut);
while (queue.size() >= this->capacity)
{
cond.wait(lock);
}
queue.push_back(myType); // <- modify first
cond.notify_all(); // <- then say to others that deque is free
}
For a better understanding, I suggest reading about pthread_cond_wait().