strange proplem using two Threads and Boolean - c++

(I hate having to put a title like this. but I just couldn't find anything better)
I have two classes with two threads. first one detects motion between two frames:
void Detector::run(){
isActive = true;
// will run forever
while (isActive){
//code to detect motion for every frame
//.........................................
if(isThereMotion)
{
if(number_of_sequence>0){
theRecorder.setRecording(true);
theRecorder.setup();
// cout << " motion was detected" << endl;
}
number_of_sequence++;
}
else
{
number_of_sequence = 0;
theRecorder.setRecording(false);
// cout << " there was no motion" << endl;
cvWaitKey (DELAY);
}
}
}
second one will record a video when started:
void Recorder::setup(){
if (!hasStarted){
this->start();
}
}
void Recorder::run(){
theVideoWriter.open(filename, CV_FOURCC('X','V','I','D'), 20, Size(1980,1080), true);
if (recording){
while(recording){
//++++++++++++++++++++++++++++++++++++++++++++++++
cout << recording << endl;
hasStarted=true;
webcamRecorder.read(matRecorder); // read a new frame from video
theVideoWriter.write(matRecorder); //writer the frame into the file
}
}
else{
hasStarted=false;
cout << "no recording??" << endl;
changeFilemamePlusOne();
}
hasStarted=false;
cout << "finished recording" << endl;
theVideoWriter.release();
}
The boolean recording gets changed by the function:
void Recorder::setRecording(bool x){
recording = x;
}
The goal is to start the recording once motion was detected while preventing the program from starting the recording twice.
The really strange problem, which honestly doesn't make any sense in my head, is that the code will only work if I cout the boolean recording ( marked with the "++++++"). Else recording never changes to false and the code in the else statment never gets called.
Does anyone have an idea on why this is happening. I'm still just begining with c++ but this problem seems really strange to me..

I suppose your variables isThereMotion and recording are simple class members of type bool.
Concurrent access to these members isn't thread safe by default, and you'll face race conditions, and all kinds of weird behaviors.
I'd recommend to declare these member variables like this (as long you can make use of the latest standard):
class Detector {
// ...
std::atomic<bool> isThereMotion;
};
class Recorder {
// ...
std::atomic<bool> hasStarted;
};
etc.
The reason behind the scenes is, that even reading/writing a simple boolean value splits up into several assembler instructions applied to the CPU, and those may be scheduled off in the middle for a thread execution path change of the process. Using std::atomic<> provides something like a critical section for read/write operations on this variable automatically.
In short: Make everything, that is purposed to be accessed concurrently from different threads, an atomic value, or use an appropriate synchronization mechanism like a std::mutex.
If you can't use the latest c++ standard, you can perhaps workaround using boost::thread to keep your code portable.
NOTE:
As from your comments, your question seems to be specific for the Qt framework, there's a number of mechanisms you can use for synchronization as e.g. the mentioned QMutex.
Why volatile doesn't help in multithreaded environments?
volatile prevents the compiler to optimize away actual read access just by assumptions of values set formerly in a sequential manner. It doesn't prevent threads to be interrupted in actually retrieving or writing values there.
volatile should be used for reading from addresses that can be changed independently of the sequential or threading execution model (e.g. bus addressed peripheral HW registers, where the HW changes values actively, e.g. a FPGA reporting current data throughput at a register inteface).
See more details about this misconception here:
Why is volatile not considered useful in multithreaded C or C++ programming?

You could use a pool of nodes with pointers to frame buffers as part of a linked list fifo messaging system using mutex and semaphore to coordinate the threads. A message for each frame to be recorded would be sent to the recording thread (appended to it's list and a semaphore released), otherwise the node would be returned (appended) back to the main thread's list.
Example code using Windows based synchronization to copy a file. The main thread reads into buffers, the spawned thread writes from buffers it receives. The setup code is lengthy, but the actual messaging functions and the two thread functions are simple and small.
mtcopy.zip

Could be a liveness issue. The compiler could be re-ordering instructions or hoisting isActive out of the loop. Try marking it as volatile.
From MSDN docs:
Objects that are declared as volatile are not used in certain optimizations because their values can change at any time. The system always reads the current value of a volatile object when it is requested, even if a previous instruction asked for a value from the same object. Also, the value of the object is written immediately on assignment.
Simple example:
#include <iostream>
using namespace std;
int main() {
bool active = true;
while(active) {
cout << "Still active" << endl;
}
}
Assemble it:
g++ -S test.cpp -o test1.a
Add volatile to active as in volatile bool active = true
Assemble it again g++ -S test.cpp -o test2.a and look at the difference diff test1.a test2.a
< testb $1, -5(%rbp)
---
> movb -5(%rbp), %al
> testb $1, %al
Notice the first one doesn't even bother to read the value of active before testing it since the loop body never modifies it. The second version does.

Related

How to prevent critical section access with atomics in C++

I have a program where 4 threads are doing random transactions between two "bank accounts", at every 100th transaction per account interest of 5% should be added to the account.
This is an exercise for school I have to do, and in the previous lesson, we used mutexes which seemed very straightforward in locking a section of code.
Now we have to do the same thing but using atomics.
My problem is that currently more than one thread can enter addIntreset function and interest can be added multiple times.
this is part of the code where I check if it is 100th transaction
void transfer(Account& acc, int amount, string& thread)
{
++m_iCounter;
withdraw(acc, amount);
if (m_iCounter % 100 == 0)
{
addInterest(thread);
}
}
All threads go through that function and check for % 100 and sometimes more than one thread slips into addInterest. With mutexes I could restrict access here but how do I do it with atomic?
Maybe I could solve this somehow inside addInterest but if so how?
Inside addInterest is following code (if more than 1 thread slip in interest is added multiple times) :
void addInterest(string& thread)
{
float old = m_iBalance.load();
const float interest = 0.05f;
const float amount = old * interest;
while (!m_iBalance.compare_exchange_weak(old, old + amount))
{
//
}
++m_iInterest;
cout << thread << " interest : Acc" << name << " " << m_iCounter << endl;
}
One problem is that you increment and read counter separately, and because of this addInterest can be called multiple times. To fix this, you should write and read atomically.
const int newCounter = ++m_iCounter;
Adding interest:
If it doesn't matter when you add interest, as long as it happens after incrementing the counter, then your addInterest is probably fine as is.
If you must add interest to the balance you had during the 100th transaction (even if it has already changed when that thread reaches addInterest), you must somehow store the old balance before you increment the counter. The only way I can think of is to synchronize the entire transfer by using atomic flag as a replacement for mutex:
// Member variable
std::atomic_bool flag;
// Before the critical section
bool expected;
do {
expected = false;
} while (!flag.compare_exchange_weak(expected, true));
// Critical section here
// Unlock at the end
flag = false;
Ok 2 possible solution:
the first one is to use the atomic template that is in the header <atomic>, in this case you can make at least the m_iCounter as an atomic variable. but for a better atomicity of the operation all the variables in a critical section have to be atomic.
Further read on Atomic and Atomic template.
the second option is use a sperimental feature of c++ the so called STM (Software Transactional Memory). In this case all the content of the transfer function could be a transaction, or anyway all the thinghs that you have mutexed before.
Further reading Transactional Memory c++

List Iterator from STL library malfunctioning

The following is a snippet of a larger program and is done using Pthreads.
The UpdateFunction reads from a text file. The FunctionMap is just used to output (key,1). Here essentially UpdateFunction and FunctionMap run on different threads.
queue <list<string>::iterator> mapperpool;
void *UpdaterFunction(void* fn) {
std::string *x = static_cast<std::string*>(fn);
string filename = *x;
ifstream file (filename.c_str());
string word;
list <string> letterwords[50];
char alphabet = '0';
bool times = true;
int charno=0;
while(file >> word) {
if(times) {
alphabet = *(word.begin());
times = false;
}
if (alphabet != *(word.begin())) {
alphabet = *(word.begin());
mapperpool.push(letterwords[charno].begin());
letterwords[charno].push_back("xyzzyspoon");
charno++;
}
letterwords[charno].push_back(word);
}
file.close();
cout << "UPDATER DONE!!" << endl;
pthread_exit(NULL);
}
void *FunctionMap(void *i) {
long num = (long)i;
stringstream updaterword;
string toQ;
int charno = 0;
fprintf(stderr, "Print me %ld\n", num);
sleep(1);
while (!mapperpool.empty()) {
list<string>::iterator it = mapperpool.front();
while(*it != "xyzzyspoon") {
cout << "(" << *it << ",1)" << "\n";
cout << *it << "\n";
it++;
}
mapperpool.pop();
}
pthread_exit(NULL);
}
If I add the while(!mapperpool.empty()) in the UpdateFunction then it gives me the perfect output. But when I move it back to the FunctionMap then it gives me a weird out and Segfaults later.
Output when used in UpdateFunction:
Print me 0
course
cap
class
culture
class
cap
course
course
cap
culture
concurrency
.....
[Each word in separate line]
Output when used in FunctionMap (snippet shown above):
Print me 0
UPDATER DONE!!
(course%0+�0#+�0�+�05P+�0����cap%�+�0�+�0,�05�+�0����class5P?�0
����xyzzyspoon%�+�0�+�0(+�0%P,�0,�0�,�05+�0����class%p,�0�,�0-�05�,�0����cap%�,�0�,�0X-�05�,�0����course%-�0 -�0�-�050-�0����course%-�0p-�0�-�05�-�0����cap%�-�0�-�0H.�05�-�0����culture%.�0.�0�.�05 .�0
����concurrency%P.�0`.�0�.�05p.�0����course%�.�0�.�08/�05�.�0����cap%�.�0/�0�/�05/�0Segmentation fault (core dumped)
How do I fix this issue?
list <string> letterwords[50] is local to UpdaterFunction. When UpdaterFunction finishes, all its local variables got destroyed. When FunctionMap inspects iterator, that iterator already points to deleted memory.
When you insert while(!mapperpool.empty()) UpdaterFunction waits for FunctionMap completion and letterwords stays 'alive'.
Here essentially UpdateFunction and FunctionMap run on different threads.
And since they both manipulate the same object (mapperpool) and neither of them uses either pthread_mutex nor std::mutex (C++11), you have a data race. If you have a data race, you have Undefined Behaviour and the program might do whatever it wants. Most likely it will write garbage all over memory until eventually crashing, exactly as you see.
How do I fix this issue?
By locking the mapperpool object.
Why is list not thread-safe?
Well, in vast majority of use-cases, a single list (or any other collection) won't be used by more than one thread. In significant part of the rest the lock will have to extend over more than one operation on the collection, so the client will have to do its own locking anyway. The remaining tiny percentage of cases where locking in the operations themselves would help is not worth adding the overhead for everyone; C++ key design principle is that you only pay for what you use.
The collections are only reentrant, meaning that using different instances in parallel is safe.
Note on pthreads
C++11 introduced threading library that integrates well with the language. Most notably, it uses RAII for locking of std::mutex via std::lock_guard, std::unique_lock and std::shared_lock (for reader-writer locking). Consistently using these can eliminate large class of locking bugs that otherwise take considerable time to debug.
If you can't use C++11 yet (on desktop you can, but some embedded platforms did not get a compiler update yet), you should first consider Boost.Thread as it provides the same benefits.
If you can't use even then, still try to find, or write, a simple RAII wrapper for locking like the C++11/Boost do. The basic wrapper is just a couple of lines, but it will save you a lot of debugging.
Note that C++11 and Boost also have atomic operations library that pthreads sorely miss.

Multithreaded program works only with print statements

I wish I could have thought of a more descriptive title, but that is the case. I've got some code that I'd like to do some image processing with. I also need to get some statistical data from those processed images, and I'd like to do that on a separate thread so my main thread can keep doing image processing.
All that aside, here's my code. It shouldn't really be relevant, other than the fact that my Image class wraps an OpenCV Mat (though I'm not using OMP or anything, as far as I'm aware):
#include <thread>
#include <iostream>
#include <vector>
using namespace std;
//Data struct
struct CenterFindData{
//Some images I'd like to store
Image in, bpass, bpass_thresh, local_max, tmp;
//Some Data (Particle Data = float[8])
vector<ParticleData> data;
//My thread flag
bool goodToGo{ false };
//Constructor
CenterFindData(const Image& m);
};
vector<ParticleData> statistics(CenterFindData& CFD);
void operate(vector<CenterFindData> v_CFD);
..........................................
..........................................
..........................................
void operate(vector<CenterFindData> v_CFD){
//Thread function, gathers statistics on processed images
thread T( [&](){
int nProcessed(0);
for (auto& cfd : v_CFD){
//Chill while the images are still being processed
while (cfd.goodToGo == false){
//This works if I uncomment this print statement
/*cout << "Waiting" << endl;*/
}
cout << "Statistics gathered from " << nProcessed++ << " images" << endl;
//This returns vector<ParticleData>
cfd.data = m_Statistics(cfd);
}
});
//Run some filters on the images before statistics
int nProcessed(0);
for (auto& cfd : v_CFD){
//Preprocess images
RecenterImage(cfd.in);
m_BandPass(cfd);
m_LocalMax(cfd);
RecenterImage(cfd.bpass_thresh);
//Tell thread to do statistics, on to the next
cfd.goodToGo = true;
cout << "Ran filters on " << nProcessed++ << " images" << endl;
}
//Join thread
T.join();
}
I figure that the delay from cout is avoid some race condition I otherwise run into, but what? Because only one thread modified the bool goodToGo, while the other checks it, should that be a thread safe way of "gating" the two functions?
Sorry if anything is unclear, I'm very new to this and seem to make a lot of obvious mistakes WRT multithreaded programming.
Thanks for your help
john
When you have:
while (cfd.goodToGo == false){ }
the compiler doesn't see any reason to "reload" the value of goodToGo (it doesn't know that this value is affected by other threads!). So it reads it once, and then loops forever.
The reason printing something makes a difference is, that the compiler doesn't know what the print function actually will and won't affect, so "just in case", it reloads that value (if the compiler could "see inside" all of the printing code, it could in fact decide that goodToGo is NOT changed by the printing, and not needing to reload - but there are limits to how much time [or some proxy for time, such as "number of levels of calls" or "number of intermediate instructions"] the compiler spends on figuring these sort of thing out [and there may of course be calls to code that the compiler doesn't actually have access to the source code of, such as the system calls to write or similar].
The solution, however, is to use thread-safe mechanisms to update the goodToGo - we could just throw a volatile attribute to the variable, but that will not guarantee that, for example, another processor gets "told" that the value has updated, so could delay the detection of the updated value significantly [or even infinitely under some conditions].
Use std::atomic_bool goodToGo and the store and load functions to access the value inside. That way, you will be guaranteed that the value is updated correctly and "immediately" (as in, a few dozen to hundreds of clock-cycles later) seen by the other thread.
As a side-note, which probably should have been the actual answer: Busy-waiting for threads is a bad idea in general, you should probably look at some thread-primitives to wait for a condition_variable or similar.

Is a race condition possible when only one thread writes to a bool variable in c++?

In the following code example, program execution never ends.
It creates a thread which waits for a global bool to be set to true before terminating. There is only one writer and one reader. I believe that the only situation that allows the loop to continue running is if the bool variable is false.
How is it possible that the bool variable ends up in an inconsistent state with just one writer?
#include <iostream>
#include <pthread.h>
#include <unistd.h>
bool done = false;
void * threadfunc1(void *) {
std::cout << "t1:start" << std::endl;
while(!done);
std::cout << "t1:done" << std::endl;
return NULL;
}
int main()
{
pthread_t threads;
pthread_create(&threads, NULL, threadfunc1, NULL);
sleep(1);
done = true;
std::cout << "done set to true" << std::endl;
pthread_exit(NULL);
return 0;
}
There's a problem in the sense that this statement in threadfunc1():
while(!done);
can be implemented by the compiler as something like:
a_register = done;
label:
if (a_register == 0) goto label;
So updates to done will never be seen.
There is really nothing that prevents the compiler from optimizing the while-loop away. Use atomic or a mutex to access the bool from more than one thread. That is the only supported and correct solution. As you are using posix, a mutex would be the right solution in this case.
And don't use volatile. There is a posix standard that states what has to work and volatile is not a solution that has a guaranty to work.
And there is an othere problem: There is no guaranty that your newly created thread every started to run, before you set the flag to false.
For such simple example volatile is enough. But for vast majority of real world situations it is not. Use conditional variable for this task. They look weird at the first glance but actually they are quite logical. On x86 bool IS atomic to read/write (for ARM, probably, not). Also there is an obstacle with vector: it is NOT a vector of bools, it is a bitfield. To write vector from several threads use vector (or bool arr[SIZE]).
Also you don't join with thread, it is wrong.
Race condition means: when two threads are accessing the same object, and at least one of them is a write.
It means you will have two types of racing, write-write conflict and write-read conflict.
Back to your code, you essentially have two threads, one is the main thread, and another one is the one you created with pthread_create.
One of them is a read: while(!done), and one of them is a write: done = true.
You have race condition for sure.
Is a race condition possible when only one thread writes to a bool variable in c++?
Yes. In your case, the main thread is also a thread (i.e. you have one thread writing and one thread reading).
How is it possible that the bool variable ends up in an inconsistent state with just one writer?
The compiler is (should be) an optimizing compiler. It will probably optimize the reading of the done variable, unless you take care to avoid that (use std::atomic<bool> done instead).
its not guaranteed that the assignment to a bool which is one byte is atomic

C++ - Threads without coordinating mechanism like mutex_Lock

I attended one interview two days back. The interviewed guy was good in C++, but not in multithreading. When he asked me to write a code for multithreading of two threads, where one thread prints 1,3,5,.. and the other prints 2,4,6,.. . But, the output should be 1,2,3,4,5,.... So, I gave the below code(sudo code)
mutex_Lock LOCK;
int last=2;
int last_Value = 0;
void function_Thread_1()
{
while(1)
{
mutex_Lock(&LOCK);
if(last == 2)
{
cout << ++last_Value << endl;
last = 1;
}
mutex_Unlock(&LOCK);
}
}
void function_Thread_2()
{
while(1)
{
mutex_Lock(&LOCK);
if(last == 1)
{
cout << ++last_Value << endl;
last = 2;
}
mutex_Unlock(&LOCK);
}
}
After this, he said "these threads will work correctly even without those locks. Those locks will reduce the efficiency". My point was without the lock there will be a situation where one thread will check for(last == 1 or 2) at the same time the other thread will try to change the value to 2 or 1. So, My conclusion is that it will work without that lock, but that is not a correct/standard way. Now, I want to know who is correct and in which basis?
Without the lock, running the two functions concurrently would be undefined behaviour because there's a data race in the access of last and last_Value Moreover (though not causing UB) the printing would be unpredictable.
With the lock, the program becomes essentially single-threaded, and is probably slower than the naive single-threaded code. But that's just in the nature of the problem (i.e. to produce a serialized sequence of events).
I think the interviewer might have thought about using atomic variables.
Each instantiation and full specialization of the std::atomic template defines an atomic type. Objects of atomic types are the only C++ objects that are free from data races; that is, if one thread writes to an atomic object while another thread reads from it, the behavior is well-defined.
In addition, accesses to atomic objects may establish inter-thread synchronization and order non-atomic memory accesses as specified by std::memory_order.
[Source]
By this I mean the only thing you should change is remove the locks and change the lastvariable to std::atomic<int> last = 2; instead of int last = 2;
This should make it safe to access the last variable concurrently.
Out of curiosity I have edited your code a bit, and ran it on my Windows machine:
#include <iostream>
#include <atomic>
#include <thread>
#include <Windows.h>
std::atomic<int> last=2;
std::atomic<int> last_Value = 0;
std::atomic<bool> running = true;
void function_Thread_1()
{
while(running)
{
if(last == 2)
{
last_Value = last_Value + 1;
std::cout << last_Value << std::endl;
last = 1;
}
}
}
void function_Thread_2()
{
while(running)
{
if(last == 1)
{
last_Value = last_Value + 1;
std::cout << last_Value << std::endl;
last = 2;
}
}
}
int main()
{
std::thread a(function_Thread_1);
std::thread b(function_Thread_2);
while(last_Value != 6){}//we want to print 1 to 6
running = false;//inform threads we are about to stop
a.join();
b.join();//join
while(!GetAsyncKeyState('Q')){}//wait for 'Q' press
return 0;
}
and the output is always:
1
2
3
4
5
6
Ideone refuses to run this code (compilation errors)..
Edit: But here is a working linux version :) (thanks to soon)
The interviewer doesn't know what he is talking about. Without the locks you get races on both last and last_value. The compiler could for example reorder the assignment to last before the print and increment of last_value, which could lead to the other thread executing on stale data. Furthermore you could get interleaved output, meaning things like two numbers not being seperated by a linebreak.
Another thing, which could go wrong is that the compiler might decide not to reload last and (less importantly) last_value each iteration, since it can't (safely) change between those iterations anyways (since data races are illegal by the C++11 standard and aren't acknowledged in previous standards). This means that the code suggested by the interviewer actually has a good chance of creating infinite loops of doing absoulutely doing nothing.
While it is possible to make that code correct without mutices, that absolutely needs atomic operations with appropriate ordering constraints (release-semantics on the assignment to last and acquire on the load of last inside the if statement).
Of course your solution does lower efficiency due to effectivly serializing the whole execution. However since the runtime is almost completely spent inside the streamout operation, which is almost certainly internally synchronized by the use of locks, your solution doesn't lower the efficiency anymore then it already is. Waiting on the lock in your code might actually be faster then busy waiting for it, depending on the availible resources (the nonlocking version using atomics would absolutely tank when executed on a single core machine)