I am working on a Queue class using threads in C++ (Windows 10 Visual Studio).
In my constructor, I start by setting a boolean error value to false. Immediately after, I attempt to initialise the critical section. If there is a problem, the bool is set to true; otherwise it stays false. I check this value in the main function (not shown) and end the program if the critical section is not initialised (bool is true).
Constructor code:
Queue() {
// ERROR_CRIT is bool: false = no error in initialising crit section, true = there is an error
ERROR_CRIT = false;
// check init
if (!InitializeCriticalSectionAndSpinCount(&CritSec, 0x00000400)) {
ERROR_CRIT = true;
}
totalCustomers = 0;
currentReadIndex = 0;
currentWriteIndex = 0;
customerNum = 0;
};
My question is: which kinds of booleans should default to true, and which to false? I have thought about this many times while writing other programs, and I am not sure which default value to give a bool unless it is obvious. Sometimes it seems fair to start with either value. I feel it would be strange for an error bool to start as true, but starting it as false may also be strange.
In the real world or in a company, would I remove line 3 and add an else branch to the if, with else {ERROR_CRIT = false;}? Would this improve readability? This is probably a matter of preference, but what is usually seen in real-world programming?
Thank you in advance :D
A much better approach is to prevent the queue from existing at all if the critical section cannot be created. This establishes a tighter class invariant for the Queue class, which simplifies its implementation.
This is done by throwing an exception:
Queue() {
// check init
if (!InitializeCriticalSectionAndSpinCount(&CritSec, 0x00000400)) {
throw std::runtime_error("failed to create the critical section");
}
totalCustomers = 0;
currentReadIndex = 0;
currentWriteIndex = 0;
customerNum = 0;
};
This way, you don't ever need to check for ERROR_CRIT, since the Queue existing is enough information to guarantee that the critical section is correctly initialized.
While I agree with the accepted answer in general (and upvoted it), I want to add that the established approach in modern operating systems is that in-process synchronization primitives do not fail for lack of resources or similar reasons.
They can fail only when used improperly, and only if there is a diagnostic for such misuse.
This is confirmed by the InitializeCriticalSectionAndSpinCount documentation:
Return value
This function always succeeds and returns a nonzero value.
Windows Server 2003 and Windows XP: blah blah blah
The reason is that recovering from a failed synchronization primitive is very complex, whereas it is possible to build synchronization primitives that never fail. So on a modern OS the failure-handling code can never execute, and on an older OS it is still unlikely to let the program recover.
C++ std::mutex and the other C++ mutexes align with this and never throw due to lack of memory or resources (std::mutex specifically is even constexpr constructible, although Visual C++ violates this). The new C++20 primitives also do not throw exceptions.
So you can just write:
Queue() noexcept {
InitializeCriticalSectionAndSpinCount(&CritSec, 0x00000400);
totalCustomers = 0;
currentReadIndex = 0;
currentWriteIndex = 0;
customerNum = 0;
};
or, if you're paranoid:
Queue() noexcept {
// check init
if (!InitializeCriticalSectionAndSpinCount(&CritSec, 0x00000400)) {
__fastfail(1); // something terrible happened, cannot recover
}
totalCustomers = 0;
currentReadIndex = 0;
currentWriteIndex = 0;
customerNum = 0;
};
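For comparison, here is a minimal portable sketch of the same idea, assuming std::mutex is acceptable in place of the Win32 CRITICAL_SECTION. The member types and the addCustomer/total helpers are hypothetical, since the original class body is not shown. Because std::mutex needs no explicit initialization call, there is no error flag to set and nothing to check:

```cpp
#include <mutex>

class Queue {
public:
    // No constructor body needed: the mutex and the counters are
    // set up by their default member initializers.
    Queue() = default;

    // Hypothetical helper: records one more customer under the lock.
    void addCustomer() {
        std::lock_guard<std::mutex> guard(mutex_);
        ++totalCustomers;
    }

    // Hypothetical helper: reads the counter under the lock.
    int total() {
        std::lock_guard<std::mutex> guard(mutex_);
        return totalCustomers;
    }

private:
    std::mutex mutex_;        // stands in for CritSec (assumption)
    int totalCustomers = 0;   // member types are assumed to be int
    int currentReadIndex = 0;
    int currentWriteIndex = 0;
    int customerNum = 0;
};
```

With this layout, a Queue object that exists is always usable, which is the same invariant the answers above aim for.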
Recently Eric Niebler tweeted about volatile and thread safety, and somebody replied with a link to the following code from Intel TBB.
void Block::shareOrphaned(intptr_t binTag, unsigned index)
{
MALLOC_ASSERT( binTag, ASSERT_TEXT );
// unreferenced formal parameter warning
tbb::detail::suppress_unused_warning(index);
STAT_increment(getThreadId(), index, freeBlockPublic);
markOrphaned();
if ((intptr_t)nextPrivatizable==binTag) {
// First check passed: the block is not in mailbox yet.
// Need to set publicFreeList to non-zero, so other threads
// will not change nextPrivatizable and it can be zeroed.
if ( !readyToShare() ) {
// another thread freed an object; we need to wait until it finishes.
// There is no need for exponential backoff, as the wait here is not for a lock;
// but need to yield, so the thread we wait has a chance to run.
// TODO: add a pause to also be friendly to hyperthreads
int count = 256;
while( (intptr_t)const_cast<Block* volatile &>(nextPrivatizable)==binTag ) {
if (--count==0) {
do_yield();
count = 256;
}
}
}
}
MALLOC_ASSERT( publicFreeList.load(std::memory_order_relaxed) !=NULL, ASSERT_TEXT );
// now it is safe to change our data
previous = NULL;
// it is caller responsibility to ensure that the list of blocks
// formed by nextPrivatizable pointers is kept consistent if required.
// if only called from thread shutdown code, it does not matter.
(intptr_t&)(nextPrivatizable) = UNUSABLE;
}
as an example of wrong use of volatile (since it guarantees nothing wrt threading).
Is this really a bug?
My first intuition is yes, but then again TBB is not some anon person github project, so I am curious if I am missing something.
github link
I'm trying to implement a protected variable that does not use locks in C++11. I have read a little about optimistic concurrency, but I can't understand how it can be implemented, either in C++ or in any other language.
The way I'm trying to implement the optimistic concurrency is by using a 'last modification id'. The process I'm doing is:
Take a copy of the last modification id.
Modify the protected value.
Compare the local copy of the modification id with the current one.
If the above comparison is true, commit the changes.
The problem I see is that, after comparing the 'last modification ids' (local copy and current one) and before committing the changes, there is no way to ensure that no other thread has modified the value of the protected variable.
Below is an example of the code. Let's suppose that many threads are executing this code and sharing the variable var.
/**
* This struct is intended to implement a protected variable,
* but using optimistic concurrency instead of locks.
*/
struct ProtectedVariable final {
ProtectedVariable() : var(0), lastModificationId(0){ }
int getValue() const {
return var.load();
}
void setValue(int val) {
// This method is not atomic, other thread could change the value
// of val before being able to increment the 'last modification id'.
var.store(val);
lastModificationId.store(lastModificationId.load() + 1);
}
size_t getLastModificationId() const {
return lastModificationId.load();
}
private:
std::atomic<int> var;
std::atomic<size_t> lastModificationId;
};
ProtectedVariable var;
/**
* Suppose this method writes a value in some sort of database.
*/
int commitChanges(int val){
// Now, if nobody has changed the value of 'var', commit its value,
// retry the transaction otherwise.
if(var.getLastModificationId() == currModifId) {
// Here is one of the problems. After comparing the value of both Ids, other
// thread could modify the value of 'var', hence I would be
// performing the commit with a corrupted value.
var.setValue(val);
// Again, the same problem as above.
writeToDatabase(val);
// Return 'ok' in case of everything has gone ok.
return 0;
} else {
// If someone has changed the value of var while we were
// calculating and committing it, return an error.
return -1;
}
}
/**
* This method is intended to be atomic, but without using locks.
*/
void modifyVar(){
// Get the modification id for checking whether or not some
// thread has modified the value of 'var' after committing it.
size_t currModifId = lastModificationId.load();
// Get a local copy of 'var'.
int currVal = var.getValue();
// Perform some operations basing on the current value of
// 'var'.
int newVal = currVal + 1 * 2 / 3;
if(commitChanges(newVal) != 0){
// If someone has changed the value of var while trying to
// calculating and commiting it, retry the transaction.
modifyVar();
}
}
I know that the above code is buggy, but I don't understand how to implement something like the above in a correct way, without bugs.
Optimistic concurrency doesn't mean that you use no locks; it merely means that you don't hold the locks during most of the operation.
The idea is that you split your modification into three parts:
Initialization, like getting the lastModificationId. This part may need locks, but not necessarily.
Actual computation. All expensive or blocking code goes here (including any disk writes or network code). The results are written in such a way that they do not obscure the previous version. A likely way to do this is to store the new values next to the old ones, indexed by the not-yet-committed version.
Atomic commit. This part is locked, and must be short, simple, and non-blocking. A likely way to do this is to just bump the version number, after confirming that no other version was committed in the meantime. No database writes at this stage.
The main assumption here is that the computation part is much more expensive than the commit part. If your modification is trivial and the computation cheap, then you can just use a lock, which is much simpler.
Some example code structured into these 3 parts could look like this:
struct Data {
...
};
...
std::mutex lock;
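// std::atomic (from <atomic>) rather than volatile: volatile gives no
// inter-thread guarantees, and these are read outside the lock.
std::atomic<const Data*> value{nullptr}; // The protected data
std::atomic<int> current_value_version{0};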
...
bool modifyProtectedValue() {
// Initialize.
int version_on_entry = current_value_version;
// Compute the new value, using the current value.
// We don't have any lock here, so it's fine to make heavy
// computations or block on I/O.
Data* new_value = new Data;
compute_new_value(value, new_value);
// Commit or fail.
bool success;
lock.lock();
if (current_value_version == version_on_entry) {
value = new_value;
current_value_version++;
success = true;
} else {
success = false;
}
lock.unlock();
// Roll back in case of failure.
if (!success) {
delete new_value;
}
// Inform caller about success or failure.
return success;
}
// It's cleaner to keep retry logic separately.
bool retryModification(int retries = 5) {
for (int i = 0; i < retries; ++i) {
if (modifyProtectedValue()) {
return true;
}
}
return false;
}
This is a very basic approach, and the rollback in particular is trivial. In a real-world example, re-creating the whole Data object (or its counterpart) would likely be infeasible, so the versioning would have to be done somewhere inside it, and the rollback could be much more complex. But I hope it shows the general idea.
The key here is acquire-release semantics and test-and-increment. Acquire-release semantics are how you enforce an order of operations. Test-and-increment is how you choose which thread wins in case of a race.
Your problem therefore is the .store(lastModificationId+1). You'll need .fetch_add(1). It returns the old value. If that's not the expected value (from before your read), then you lost the race and retry.
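As a rough sketch of the "lose the race and retry" pattern, the helper below swaps in compare_exchange_weak on the value itself rather than fetch_add on a separate modification id. The name optimistic_update is hypothetical. The write succeeds only if the value is still the one that was read, otherwise the new value is recomputed from the fresh one:

```cpp
#include <atomic>

// Applies f to the current value and retries until no other thread
// changed the value between our read and our write.
// Returns the value that was finally stored.
template <class F>
int optimistic_update(std::atomic<int>& var, F f) {
    int expected = var.load(std::memory_order_acquire);
    int desired;
    do {
        // Compute from the value we believe is current.
        desired = f(expected);
        // On failure, compare_exchange_weak reloads `expected`
        // with the value another thread stored, and we loop.
    } while (!var.compare_exchange_weak(expected, desired,
                                        std::memory_order_acq_rel,
                                        std::memory_order_acquire));
    return desired;
}
```

This only covers state that fits in a single atomic object. Anything with side effects, such as the database write in the question, should not run inside the retry loop.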
If I understand your question, you mean to make sure var and lastModificationId are either both changed, or neither is.
Why not use std::atomic<T>, where T is a structure that holds both the int and the size_t?
struct VarWithModificationId {
int var;
size_t lastModificationId;
};
class ProtectedVariable {
private:
std::atomic<VarWithModificationId> protectedVar;
// Add your public setter/getter methods here
// You should be guaranteed that if two threads access protectedVar, they'll each get a 'consistent' view of that variable, but the setter will need to use a lock
};
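A possible way to fill in the setter, sketched under one assumption that differs from the struct above: the id is narrowed to 32 bits so that value and id together fit in 8 bytes, which most 64-bit platforms can compare-and-swap without a lock. The names Versioned, read and try_commit are hypothetical:

```cpp
#include <atomic>
#include <cstdint>

// Value and modification id packed into 8 bytes with no padding
// (assumption: a 32-bit id is wide enough for this use).
struct Versioned {
    std::int32_t  value;
    std::uint32_t version;
};

class ProtectedVariable {
public:
    // Returns a consistent snapshot of value and version.
    Versioned read() const { return state_.load(); }

    // Commits new_value only if nothing was committed since `seen`
    // was read; returns false if another thread got there first.
    bool try_commit(Versioned seen, std::int32_t new_value) {
        Versioned next{new_value, seen.version + 1};
        return state_.compare_exchange_strong(seen, next);
    }

private:
    std::atomic<Versioned> state_{Versioned{0, 0}};
};
```

A caller reads a snapshot, computes the new value, and calls try_commit with that snapshot. A false result means another thread committed in between and the computation should be repeated.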
Optimistic concurrency is used in database engines when different users are expected to access the same data only rarely. It could go like this:
The first user reads the data and its timestamp. The user works on the data for some time, then checks whether the timestamp in the DB has changed since the data was read; if it has not, the user updates the data and the timestamp.
Internally, however, the DB engine uses locks for the update anyway: while holding the lock, it checks whether the timestamp has changed and, if it has not, updates the data. The data is simply locked for a shorter time than with pessimistic concurrency. So you also need to use some kind of locking.
What my function does is iterate through an array of bools and, upon finding an element set to false, set it to true. The function is a method of my memory manager singleton class and returns a pointer to memory. I'm getting an error where my iterator appears to loop all the way around and end up back where it started, which I believe is because multiple threads are calling the function.
void* CNetworkMemoryManager::GetMemory()
{
WaitForSingleObject(hMutexCounter, INFINITE);
if(mCounter >= NetConsts::kNumMemorySlots)
{
mCounter = 0;
}
unsigned int tempCounter = mCounter;
unsigned int start = tempCounter;
while(mUsedSlots[tempCounter])
{
tempCounter++;
if(tempCounter >= NetConsts::kNumMemorySlots)
{
tempCounter = 0;
}
//looped all the way around
if(tempCounter == start)
{
assert(false);
return NULL;
}
}
//return pointer to free space and increment
mCounter = tempCounter + 1;
ReleaseMutex(hMutexCounter);
mUsedSlots[tempCounter] = true;
return mPointers[tempCounter];
}
My error is the assert that goes off in the loop. My question is how do I fix the function and is the error caused by multithreading?
Edit: added a mutex to guard the mCounter variable. No change. Error still occurs.
I can't say if the error is caused by multi threading or not but I can say your code is not thread safe.
You free the lock with
ReleaseMutex(hMutexCounter);
and then access tempCounter and mUsedSlots:
mUsedSlots[tempCounter] = true;
return mPointers[tempCounter];
neither of which are const. This is a data race because you have not correctly serialized access to these variables.
Change this to:
mUsedSlots[tempCounter] = true;
void* const retVal = mPointers[tempCounter];
ReleaseMutex(hMutexCounter);
return retVal;
Then at least your code is thread safe. Whether this solves your problem I can't say; try it out. On machines with multiple cores, very weird things can happen as a result of data races.
As a general best practice, I would suggest looking at C++11 synchronization features such as std::mutex and std::lock_guard. These would have saved you from yourself, because std::lock_guard releases the lock automatically, so you can't forget to release it and, as in this case, you can't inadvertently release it too soon. This would also make your code more portable. If you don't have C++11 yet, use the Boost equivalents.
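A minimal sketch of the slot search written with std::mutex and std::lock_guard. The SlotPool class, its fixed size of 4, and the use of indices instead of pointers are assumptions made to keep the example self-contained. The guard covers the whole function, so the slot is marked as used before the lock is released, and the lock is also released on the "no free slot" path:

```cpp
#include <array>
#include <cstddef>
#include <mutex>

class SlotPool {
public:
    static constexpr std::size_t kNumSlots = 4;

    // Returns the index of a free slot and marks it used,
    // or kNumSlots if every slot is taken.
    std::size_t acquire() {
        std::lock_guard<std::mutex> guard(mutex_);  // released on every return
        for (std::size_t step = 0; step < kNumSlots; ++step) {
            const std::size_t i = (next_ + step) % kNumSlots;
            if (!used_[i]) {
                used_[i] = true;              // still under the lock
                next_ = (i + 1) % kNumSlots;  // start the next search here
                return i;
            }
        }
        return kNumSlots;  // looped all the way around
    }

    // Marks a slot as free again.
    void release(std::size_t i) {
        std::lock_guard<std::mutex> guard(mutex_);
        used_[i] = false;
    }

private:
    std::mutex mutex_;
    std::array<bool, kNumSlots> used_{};  // all false initially
    std::size_t next_ = 0;
};
```

Note that whatever frees a slot has to take the same lock, otherwise the write to the used flag races with the search.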
My code runs fine in debug mode but fails in release mode.
Here's a snippet of my code where it fails:
LOADER->AllocBundle(&m_InitialContent);
while(!m_InitialContent.isReady())
{
this->LoadingScreen();
}
AllocBundle() will load the content contained in m_InitialContent and set its ready status to true when it is done. This is implemented using multithreading.
this->LoadingScreen() should render a loading screen, however at the moment that is not implemented yet so the function has an empty body.
Apparently this might be the cause of the error: If I give the function LoadingScreen() one line of code: std::cout<<"Loading"<<std::endl; then it will run fine.
If I don't, then the code gets stuck at while(!m_InitialContent.isReady()) It never even jumps to the code between the brackets (this->LoadingScreen();). And apparently neither does it update the expression in the while statement because it stays stuck there forever.
Does anyone have any ideas what might be causing this? And if so, what might the problem be?
I'm completely puzzled.
EDIT: Additional code on request
member of ContentLoader: details::ContentBundleAllocator m_CBA;
void ContentLoader::AllocBundle(ContentBundle* pBundle)
{
ASSERT(!(m_CBA.isRunning()), "ContentBundleAllocator is still busy");
m_CBA.Alloc(pBundle, m_SystemInfo.dwNumberOfProcessors);
}
void details::ContentBundleAllocator::Alloc(ContentBundle* pCB, UINT numThreads)
{
m_bIsRunning = true;
m_pCB = pCB;
pCB->m_bIsReady = false;
m_NumRunningThrds = numThreads;
std::pair<UINT,HANDLE> p;
for (UINT i = 0; i < numThreads; ++i)
{
p.second = (HANDLE)_beginthreadex(NULL,
NULL,
&details::ContentBundleAllocator::AllocBundle,
this,
NULL,&p.first);
SetThreadPriority(p.second,THREAD_PRIORITY_HIGHEST);
m_Threads.Insert(p);
}
}
unsigned int __stdcall details::ContentBundleAllocator::AllocBundle(void* param)
{
//PREPARE
ContentBundleAllocator* pCBA = (ContentBundleAllocator*)param;
//LOAD STUFF [collapsed for visibility+]
//EXIT===========================================================================================================
pCBA->m_NumRunningThrds -= 1;
if (pCBA->m_NumRunningThrds == 0)
{
pCBA->m_bIsRunning = false;
pCBA->m_pCB->m_bIsReady = true;
pCBA->Clear();
#ifdef DEBUG
std::tcout << std::endl;
#endif
std::tcout<<_T("exiting allocation...")<<std::endl;
}
std::tcout<<_T("exiting thread...")<<std::endl;
return 0;
}
bool isReady() const {return m_bIsReady;}
When you compile your code in Debug mode, the compiler does a lot of stuff behind the scenes that prevents many mistakes made by the programmer from crashing the application. When you run in Release, all bets are off. If your code is not correct, you're much more likely to crash in Release than in Debug.
A few things to check:
Make sure all variables are properly initialized
Make sure you do not have any deadlocks or race conditions
Make sure you aren't passing around pointers to local objects that have been deallocated
Make sure your strings are properly NULL-terminated
Don't catch exceptions that you're not expecting and then continue running as if nothing had happened.
You are accessing the variable m_bIsReady from different threads without memory barriers. This is wrong, because the value may be cached either by the optimizer or in a processor cache. You have to protect this variable from simultaneous access with a critical section, a mutex, or whatever synchronization primitive is available in your library.
Note that there might be further mistakes, but this one is definitely a mistake, too. As a rule of thumb: each variable which is accessed from different threads has to be protected with a mutex/critical section/whatever.
From a quick look, m_NumRunningThrds doesn't seem to be protected against simultaneous access, so if (pCBA->m_NumRunningThrds == 0) might never be satisfied.
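One way to address both points, sketched with std::atomic instead of a mutex (a substitution, assuming C++11 is available). The LoadState struct and its member names are hypothetical stand-ins for m_NumRunningThrds and m_bIsReady. The atomic load in is_ready cannot be hoisted out of the waiting loop, and fetch_sub guarantees that exactly one worker observes the count reaching zero:

```cpp
#include <atomic>

struct LoadState {
    std::atomic<int>  running{0};    // stands in for m_NumRunningThrds
    std::atomic<bool> ready{false};  // stands in for m_bIsReady

    // Called once before the worker threads are started.
    void start(int workers) {
        ready.store(false);
        running.store(workers);
    }

    // Called by each worker thread when it finishes.
    void worker_done() {
        // fetch_sub returns the previous value, so only the last
        // worker to finish sees 1 and publishes the ready flag.
        if (running.fetch_sub(1) == 1) {
            ready.store(true);
        }
    }

    // Polled by the thread that shows the loading screen.
    bool is_ready() const { return ready.load(); }
};
```

The waiting loop then becomes while (!state.is_ready()) { ... }, and it keeps working even when the loop body is empty.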
I have a huge global array of structures. Some regions of the array are tied to individual threads and those threads can modify their regions of the array without having to use critical sections. But there is one special region of the array which all threads may have access to. The code that accesses these parts of the array needs to carefully use critical sections (each array element has its own critical section) to prevent any possibility of two threads writing to the structure simultaneously.
Now I have a mysterious bug I am trying to chase, it is occurring unpredictably and very infrequently. It seems that one of the structures is being filled with some incorrect number. One obvious explanation is that another thread has accidentally been allowed to set this number when it should be excluded from doing so.
Unfortunately it seems close to impossible to track this bug. The array element in which the bad data appears is different each time. What I would love to be able to do is set some kind of trap for the bug as follows: I would enter a critical section for array element N, then I know that no other thread should be able to touch the data, then (until I exit the critical section) set some kind of flag to a debugging tool saying "if any other thread attempts to change the data here please break and show me the offending patch of source code"... but I suspect no such tool exists... or does it? Or is there some completely different debugging methodology that I should be employing.
How about wrapping your data with a transparent mutexed class? Then you could apply additional lock state checking.
class critical_section;
template < class T >
class element_wrapper
{
public:
element_wrapper(const T& v) : val(v) {}
element_wrapper() {}
const element_wrapper& operator = (const T& v) {
#ifdef _DEBUG_CONCURRENCY
if(!cs->is_locked())
_CrtDbgBreak();
#endif
val = v;
return *this;
}
operator T() { return val; }
critical_section* cs;
private:
T val;
};
As for critical section implementation:
class critical_section
{
public:
critical_section() : locked(FALSE) {
::InitializeCriticalSection(&cs);
}
~critical_section() {
_ASSERT(!locked);
::DeleteCriticalSection(&cs);
}
void lock() {
::EnterCriticalSection(&cs);
locked = TRUE;
}
void unlock() {
locked = FALSE;
::LeaveCriticalSection(&cs);
}
BOOL is_locked() {
return locked;
}
private:
CRITICAL_SECTION cs;
BOOL locked;
};
Actually, instead of custom critical_section::locked flag, one could use ::TryEnterCriticalSection (followed by ::LeaveCriticalSection if it succeeds) to determine if a critical section is owned. Though, the implementation above is almost as good.
So the appropriate usage would be:
typedef std::vector< element_wrapper<int> > cont_t;
void change(cont_t::reference x) { x.lock(); x = 1; x.unlock(); }
int main()
{
cont_t container(10, 0);
std::for_each(container.begin(), container.end(), &change);
}
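The locked flag above records only that some thread holds the critical section, not which one. A portable sketch of a variant that remembers the owning thread, using std::mutex in place of CRITICAL_SECTION (an assumption, as are the names CheckedMutex and held_by_caller), so that a write can check that the calling thread itself holds the lock:

```cpp
#include <atomic>
#include <mutex>
#include <thread>

class CheckedMutex {
public:
    void lock() {
        m_.lock();
        // Record the owner only after the lock is really held.
        owner_.store(std::this_thread::get_id());
    }

    void unlock() {
        // Clear the owner before giving the lock away.
        owner_.store(std::thread::id());
        m_.unlock();
    }

    // True only if the calling thread currently holds the lock.
    bool held_by_caller() const {
        return owner_.load() == std::this_thread::get_id();
    }

private:
    std::mutex m_;
    // A default-constructed id means "no owner".
    std::atomic<std::thread::id> owner_{std::thread::id()};
};
```

In a debug build, the wrapper's operator= could break whenever held_by_caller() is false, which also catches a write made by a second thread while a different thread holds the lock.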
I know two ways to handle such errors:
1) Read the code again and again, looking for possible errors. I can think of two errors that could cause this: unsynchronized access, or writing to an incorrect memory address. Maybe you have more ideas.
2) Logging, logging and logging. Add lots of optional traces (OutputDebugString or a log file) in every critical place, containing enough information: indexes, variable values, etc. It is a good idea to guard this tracing with an #ifdef. Reproduce the bug and try to understand from the log what happens.
Your best (fastest) bet is still to review the mutex code. As you said, it is the obvious explanation, so why not try to find the actual explanation (by logic) instead of gathering additional hints (by coding) that may turn out inconclusive? If the code review doesn't turn up anything useful, you can still take the mutex code and use it for a test run. The first attempt should not be to reproduce the bug in your system but to verify the mutex implementation: create threads (start with 2 and go upwards) that all try to access the same data structure again and again, with a small random delay in each of them so that they jitter around on the timeline. If this test shows the mutex to be buggy and you simply can't identify the bug in the code, then you have fallen victim to some architecture-dependent effect (maybe instruction reordering, multi-core cache incoherency, etc.) and need to find another mutex implementation. If, on the other hand, you find an obvious bug in the mutex, try to exploit it in your real system (instrument your code so that the error appears much more often) so that you can confirm that it really is the cause of your original problem.
I was thinking about this while pedaling to work. One possible way of handling this is to make portions of the memory in question be read-only when it is not actively being accessed and protected via critical section ownership. This is assuming that the problem is caused by a thread writing to the memory when it does not own the appropriate critical section.
There are quite a few limitations to this that may prevent it from working. Most important is the fact that I think you can only set protection on a page-by-page basis (4K, I believe). So that would likely require some very specific changes to your allocation scheme so that you could narrow down the appropriate section to protect. The second problem is that it would not catch the rogue thread writing to the memory if another thread actively owned the critical section. But it would catch it and cause an immediate access violation if the critical section was not owned.
The idea would be to change your EnterCriticalSection calls to:
EnterCriticalSection()
VirtualProtect( … PAGE_READWRITE … );
And change the LeaveCriticalSection calls to:
VirtualProtect( … PAGE_READONLY … );
LeaveCriticalSection()
The following chunk of code shows a call to VirtualProtect:
int main( int argc, char* argv[] )
{
unsigned char *mem;
int i;
DWORD dwOld;
// this assume 4K page size
mem = malloc( 4096 * 10 );
for ( i = 0; i < 10; i++ )
mem[i * 4096] = i;
// set the second page to be readonly. The allocation from malloc is
// not necessarily on a page boundary, but this will definitely be in
// the second page.
printf( "VirtualProtect res = %d\n",
VirtualProtect( mem + 4096,
1, // ends up setting entire page
PAGE_READONLY, &dwOld ));
// can still read it
for ( i = 1; i < 10; i++ )
printf( "%d ", mem[i*4096] );
printf( "\n" );
// Can write to all but the second page
for ( i = 0; i < 10; i++ )
if ( i != 1 ) // avoid second page which we made readonly
mem[i * 4096] = 1;
// this causes an access violation
mem[4096] = 1;
}