I am new to Boost threading and I am stuck with how output is performed from multiple threads.
I have a simple boost::thread counting down from 9 to 1; the main thread waits and then prints "LiftOff..!!"
#include <iostream>
#include <boost/thread.hpp>
using namespace std;
struct callable {
    void operator()();
};

void callable::operator()() {
    int i = 10;
    while (--i > 0) {
        cout << "#" << i << ", ";
        boost::this_thread::yield();
    }
    cout.flush();
}

int main() {
    callable x;
    boost::thread myThread(x);
    myThread.join();
    cout << "LiftOff..!!" << endl;
    return 0;
}
The problem is that I have to use an explicit "cout.flush()" statement in my thread to display the output. If I don't use flush(), I only get "LiftOff..!!" as the output.
Could someone please advise why I need to use flush() explicitly?
This isn't specifically thread related: cout usually buffers its output and only writes it out when the implementation decides to, so within the thread the output will only appear at an implementation-specific point. By calling flush you force the buffers to be flushed.
This varies across implementations; typically it happens after a certain number of characters or when a newline is sent.
I've found that multiple threads writing to the same stream or file is mostly OK, provided that each output is performed as atomically as possible. It's not something I'd recommend in a production environment though, as it is too unpredictable.
This behaviour seems to depend on the OS-specific implementation of the cout stream. I guess that in your case write operations on cout are buffered in some thread-specific memory in the meantime, and the flush() operation forces them to be printed to the console. I suspect this because endl implies a call to flush(), yet the endl in your main function doesn't make your thread's output appear even after the thread has been joined.
BTW, it would be a good idea to synchronize output to an ostream shared between threads anyway, otherwise you might see the messages intermingled. We do so for our logging classes, which use a background thread to write the logging messages to the associated ostream.
Given the short length of your messages, there's no reason anything should appear without a flush. (Don't forget that std::endl is the equivalent of << '\n' << std::flush.)
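For example, a minimal tweak to the callable from the question (a sketch, nothing more) that flushes inside the loop makes each number appear as soon as it is written:
void callable::operator()() {
    int i = 10;
    while (--i > 0) {
        // flush after each item so it shows up promptly;
        // std::endl would also flush, but adds a newline
        std::cout << "#" << i << ", " << std::flush;
        boost::this_thread::yield();
    }
}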
I get the expected behaviour with and without flush (gcc 4.3.2, Boost 1.47, Linux RH5).
I assume that your Cygwin system chooses to implement several std::cout objects, each with its own associated std::streambuf. This, I assume, is implementation specific.
Since flush or endl only forces its own buffer to be flushed onto the OS-controlled output sequence, the cout object of your thread remains buffered.
Sharing a reference of an ostream between the threads should solve the problem.
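As a rough sketch of that suggestion (the constructor parameter is mine, not from the question), the callable could hold a reference to the stream it writes to:
#include <iostream>
#include <boost/thread.hpp>

// The callable keeps a reference to the stream it should write to, so both
// threads end up using the exact same stream object.
struct callable {
    std::ostream& out;
    explicit callable(std::ostream& o) : out(o) {}
    void operator()() {
        for (int i = 9; i > 0; --i) {
            out << "#" << i << ", ";
            boost::this_thread::yield();
        }
        out.flush();
    }
};

int main() {
    callable x(std::cout);
    boost::thread myThread(x);
    myThread.join();
    std::cout << "LiftOff..!!" << std::endl;
}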
The following is a snippet of a larger program and is done using Pthreads.
UpdaterFunction reads from a text file. FunctionMap is just used to output (key,1). Essentially, UpdaterFunction and FunctionMap run on different threads.
queue <list<string>::iterator> mapperpool;

void *UpdaterFunction(void* fn) {
    std::string *x = static_cast<std::string*>(fn);
    string filename = *x;
    ifstream file (filename.c_str());
    string word;
    list <string> letterwords[50];
    char alphabet = '0';
    bool times = true;
    int charno = 0;
    while (file >> word) {
        if (times) {
            alphabet = *(word.begin());
            times = false;
        }
        if (alphabet != *(word.begin())) {
            alphabet = *(word.begin());
            mapperpool.push(letterwords[charno].begin());
            letterwords[charno].push_back("xyzzyspoon");
            charno++;
        }
        letterwords[charno].push_back(word);
    }
    file.close();
    cout << "UPDATER DONE!!" << endl;
    pthread_exit(NULL);
}
void *FunctionMap(void *i) {
    long num = (long)i;
    stringstream updaterword;
    string toQ;
    int charno = 0;
    fprintf(stderr, "Print me %ld\n", num);
    sleep(1);
    while (!mapperpool.empty()) {
        list<string>::iterator it = mapperpool.front();
        while (*it != "xyzzyspoon") {
            cout << "(" << *it << ",1)" << "\n";
            cout << *it << "\n";
            it++;
        }
        mapperpool.pop();
    }
    pthread_exit(NULL);
}
If I add the while(!mapperpool.empty()) loop to UpdaterFunction it gives me the perfect output. But when I move it back to FunctionMap it gives me weird output and segfaults later.
Output when used in UpdateFunction:
Print me 0
course
cap
class
culture
class
cap
course
course
cap
culture
concurrency
.....
[Each word in separate line]
Output when used in FunctionMap (snippet shown above):
Print me 0
UPDATER DONE!!
(course%0+�0#+�0�+�05P+�0����cap%�+�0�+�0,�05�+�0����class5P?�0
����xyzzyspoon%�+�0�+�0(+�0%P,�0,�0�,�05+�0����class%p,�0�,�0-�05�,�0����cap%�,�0�,�0X-�05�,�0����course%-�0 -�0�-�050-�0����course%-�0p-�0�-�05�-�0����cap%�-�0�-�0H.�05�-�0����culture%.�0.�0�.�05 .�0
����concurrency%P.�0`.�0�.�05p.�0����course%�.�0�.�08/�05�.�0����cap%�.�0/�0�/�05/�0Segmentation fault (core dumped)
How do I fix this issue?
list <string> letterwords[50] is local to UpdaterFunction. When UpdaterFunction finishes, all its local variables get destroyed. When FunctionMap inspects the iterator, that iterator already points to destroyed memory.
When you insert the while(!mapperpool.empty()) loop, UpdaterFunction waits for FunctionMap to finish and letterwords stays alive.
Here essentially UpdaterFunction and FunctionMap run on different threads.
And since they both manipulate the same object (mapperpool) and neither of them uses either a pthread_mutex or a std::mutex (C++11), you have a data race. If you have a data race, you have undefined behaviour and the program might do whatever it wants. Most likely it will write garbage all over memory until eventually crashing, which is exactly what you see.
How do I fix this issue?
By locking the mapperpool object.
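A minimal sketch of that, assuming the globals from the snippet above (the mutex and helper names are made up here). Note this only fixes the race on the queue; the lifetime problem described in the other answer still needs its own fix:
#include <pthread.h>

// Hypothetical mutex guarding every access to the shared mapperpool queue.
pthread_mutex_t mapperpool_mutex = PTHREAD_MUTEX_INITIALIZER;

// In UpdaterFunction, wrap the push:
void push_marker(std::list<std::string>::iterator it) {
    pthread_mutex_lock(&mapperpool_mutex);
    mapperpool.push(it);
    pthread_mutex_unlock(&mapperpool_mutex);
}

// In FunctionMap, take items out under the same lock:
bool try_pop(std::list<std::string>::iterator &out) {
    pthread_mutex_lock(&mapperpool_mutex);
    bool have_item = !mapperpool.empty();
    if (have_item) {
        out = mapperpool.front();
        mapperpool.pop();
    }
    pthread_mutex_unlock(&mapperpool_mutex);
    return have_item;
}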
Why is list not thread-safe?
Well, in the vast majority of use cases a single list (or any other collection) won't be used by more than one thread. In a significant part of the rest, the lock has to extend over more than one operation on the collection, so the client will have to do its own locking anyway. In the remaining tiny percentage of cases, where locking inside the operations themselves would help, it is not worth adding the overhead for everyone; a key C++ design principle is that you only pay for what you use.
The collections are only reentrant, meaning that using different instances in parallel is safe.
Note on pthreads
C++11 introduced a threading library that integrates well with the language. Most notably, it uses RAII for locking a std::mutex via std::lock_guard, std::unique_lock and (since C++14) std::shared_lock for reader-writer locking. Consistently using these eliminates a large class of locking bugs that otherwise take considerable time to debug.
If you can't use C++11 yet (on desktop you can, but some embedded platforms did not get a compiler update yet), you should first consider Boost.Thread as it provides the same benefits.
If you can't use even that, still try to find, or write, a simple RAII wrapper for locking like the ones C++11/Boost provide. The basic wrapper is just a couple of lines, but it will save you a lot of debugging.
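For illustration only, such a wrapper over a pthread mutex could look roughly like this (the class name is mine):
#include <pthread.h>

// Minimal RAII lock guard for a pthread mutex: locks in the constructor,
// unlocks in the destructor, so every return path releases the mutex.
class ScopedLock {
public:
    explicit ScopedLock(pthread_mutex_t& m) : mutex_(m) { pthread_mutex_lock(&mutex_); }
    ~ScopedLock() { pthread_mutex_unlock(&mutex_); }
private:
    ScopedLock(const ScopedLock&);            // non-copyable (pre-C++11 style)
    ScopedLock& operator=(const ScopedLock&);
    pthread_mutex_t& mutex_;
};

// Usage:
// {
//     ScopedLock guard(some_pthread_mutex);
//     // ... touch the shared data ...
// } // unlocked here, even on early return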
Note that C++11 and Boost also have an atomic operations library, which pthreads sorely lacks.
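For completeness, a tiny example of the kind of atomic operation meant here (C++11; the counter is hypothetical):
#include <atomic>

std::atomic<int> words_processed(0);

// Safe to call from many threads at once; no mutex needed for this counter.
void count_word() {
    words_processed.fetch_add(1, std::memory_order_relaxed);
}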
I wish I could have thought of a more descriptive title, but that is the case. I've got some code that I'd like to do some image processing with. I also need to get some statistical data from those processed images, and I'd like to do that on a separate thread so my main thread can keep doing image processing.
All that aside, here's my code. It shouldn't really be relevant, other than the fact that my Image class wraps an OpenCV Mat (though I'm not using OMP or anything, as far as I'm aware):
#include <thread>
#include <iostream>
#include <vector>
using namespace std;
//Data struct
struct CenterFindData{
    //Some images I'd like to store
    Image in, bpass, bpass_thresh, local_max, tmp;
    //Some Data (Particle Data = float[8])
    vector<ParticleData> data;
    //My thread flag
    bool goodToGo{ false };
    //Constructor
    CenterFindData(const Image& m);
};

vector<ParticleData> statistics(CenterFindData& CFD);
void operate(vector<CenterFindData> v_CFD);
..........................................
..........................................
..........................................
void operate(vector<CenterFindData> v_CFD){
    //Thread function, gathers statistics on processed images
    thread T( [&](){
        int nProcessed(0);
        for (auto& cfd : v_CFD){
            //Chill while the images are still being processed
            while (cfd.goodToGo == false){
                //This works if I uncomment this print statement
                /*cout << "Waiting" << endl;*/
            }
            cout << "Statistics gathered from " << nProcessed++ << " images" << endl;
            //This returns vector<ParticleData>
            cfd.data = m_Statistics(cfd);
        }
    });
    //Run some filters on the images before statistics
    int nProcessed(0);
    for (auto& cfd : v_CFD){
        //Preprocess images
        RecenterImage(cfd.in);
        m_BandPass(cfd);
        m_LocalMax(cfd);
        RecenterImage(cfd.bpass_thresh);
        //Tell thread to do statistics, on to the next
        cfd.goodToGo = true;
        cout << "Ran filters on " << nProcessed++ << " images" << endl;
    }
    //Join thread
    T.join();
}
I figure that the delay from cout is avoiding some race condition I otherwise run into, but which one? Because only one thread modifies the bool goodToGo while the other checks it, shouldn't that be a thread-safe way of "gating" the two functions?
Sorry if anything is unclear, I'm very new to this and seem to make a lot of obvious mistakes WRT multithreaded programming.
Thanks for your help
john
When you have:
while (cfd.goodToGo == false){ }
the compiler doesn't see any reason to "reload" the value of goodToGo (it doesn't know that this value is affected by other threads!). So it reads it once, and then loops forever.
The reason printing something makes a difference is that the compiler doesn't know what the print function will and won't affect, so "just in case" it reloads the value. (If the compiler could see inside all of the printing code, it could in fact decide that goodToGo is NOT changed by printing and skip the reload, but there are limits to how much effort the compiler spends figuring this sort of thing out, and there may of course be calls to code whose source the compiler doesn't have access to, such as the system call to write.)
The solution, however, is to use a thread-safe mechanism to update goodToGo. We could just slap a volatile attribute on the variable, but that would not guarantee that, for example, another processor gets "told" that the value has been updated, so detection of the new value could be delayed significantly (or even indefinitely under some conditions).
Use std::atomic_bool goodToGo and the store and load functions to access the value inside. That way, you will be guaranteed that the value is updated correctly and "immediately" (as in, a few dozen to hundreds of clock-cycles later) seen by the other thread.
As a side note, which probably should have been the actual answer: busy-waiting for threads is a bad idea in general; you should probably look at thread primitives that let you wait, such as a condition_variable or similar.
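A rough sketch of that alternative, adapted to the loop above (the mutex and condition-variable names are invented, not from the original code):
#include <condition_variable>
#include <mutex>

std::mutex cfd_mtx;
std::condition_variable cfd_cv;

// Consumer thread: sleep until the producer says this item is ready.
void wait_for_ready(CenterFindData& cfd) {
    std::unique_lock<std::mutex> lk(cfd_mtx);
    cfd_cv.wait(lk, [&] { return cfd.goodToGo; });
}

// Producer side, after the filters have run on cfd:
void mark_ready(CenterFindData& cfd) {
    {
        std::lock_guard<std::mutex> lk(cfd_mtx);
        cfd.goodToGo = true;
    }
    cfd_cv.notify_one();
}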
I've worked a decent amount with threading in C on linux and now I'm trying to do the same but with c++ on Windows, but I'm having trouble with printing to the standard output. In the function the thread carries out I have:
void print_number(void* x){
    int num = *(static_cast<int*> (x));
    std::cout << "The number is " << num << std::endl;
}
wrapped in a loop that creates three threads. The problem is that although everything gets printed, the threads seem to interrupt each other between each of the "<<"'s.
For example, the last time I ran it I got
The number is The number is 2The number is 3
1
I was hoping for each to be on its own line. I'm guessing that each thread is able to write to standard output after another has written a single section between "<<"s. In C this wasn't a problem, because the buffer wasn't flushed until everything I needed to write was there, but I don't think that's the case here. Is this a case where a mutex is needed?
In C++, we would first of all prefer to take the argument as an int*. And then, we can just lock. In C++11:
std::mutex mtx; // somewhere, in case you have other print functions
                // that you want to control

void print_number(int* num) {
    std::unique_lock<std::mutex> lk{mtx}; // RAII. Unlocks when lk goes out of scope
    std::cout << "The number is " << *num << std::endl;
}
If not C++11, there's boost::mutex and boost::mutex::scoped_lock that work the same way and do the same thing.
Your C example worked by accident; printf and the like aren't atomic either.
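For what it's worth, a pre-C++11 sketch of the same thing with Boost might look like this:
#include <boost/thread/mutex.hpp>
#include <iostream>

boost::mutex mtx;

void print_number(int* num) {
    boost::mutex::scoped_lock lk(mtx); // unlocks when lk goes out of scope
    std::cout << "The number is " << *num << std::endl;
}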
This is indeed a case for a mutex. I typically make it a function-local static. E.g.:
void atomic_print(/*args*/) {
    static MyMutex mutex;
    mutex.acquire();
    printf(/*with the args*/);
    mutex.release();
}
(I hate having to use a title like this, but I just couldn't find anything better.)
I have two classes with two threads. The first one detects motion between two frames:
void Detector::run(){
    isActive = true;
    // will run forever
    while (isActive){
        //code to detect motion for every frame
        //.........................................
        if(isThereMotion)
        {
            if(number_of_sequence>0){
                theRecorder.setRecording(true);
                theRecorder.setup();
                // cout << " motion was detected" << endl;
            }
            number_of_sequence++;
        }
        else
        {
            number_of_sequence = 0;
            theRecorder.setRecording(false);
            // cout << " there was no motion" << endl;
            cvWaitKey (DELAY);
        }
    }
}
second one will record a video when started:
void Recorder::setup(){
    if (!hasStarted){
        this->start();
    }
}

void Recorder::run(){
    theVideoWriter.open(filename, CV_FOURCC('X','V','I','D'), 20, Size(1980,1080), true);
    if (recording){
        while(recording){
            //++++++++++++++++++++++++++++++++++++++++++++++++
            cout << recording << endl;
            hasStarted=true;
            webcamRecorder.read(matRecorder); // read a new frame from video
            theVideoWriter.write(matRecorder); // write the frame into the file
        }
    }
    else{
        hasStarted=false;
        cout << "no recording??" << endl;
        changeFilemamePlusOne();
    }
    hasStarted=false;
    cout << "finished recording" << endl;
    theVideoWriter.release();
}
The boolean recording gets changed by the function:
void Recorder::setRecording(bool x){
    recording = x;
}
The goal is to start the recording once motion was detected while preventing the program from starting the recording twice.
The really strange problem, which honestly doesn't make any sense in my head, is that the code only works if I cout the boolean recording (marked with the "++++++"). Otherwise recording never changes to false and the code in the else statement never gets called.
Does anyone have an idea why this is happening? I'm still just beginning with C++, but this problem seems really strange to me.
I suppose your variables isThereMotion and recording are simple class members of type bool.
Concurrent access to these members isn't thread safe by default, and you'll face race conditions, and all kinds of weird behaviors.
I'd recommend declaring these member variables like this (as long as you can make use of the latest standard):
class Detector {
    // ...
    std::atomic<bool> isThereMotion;
};

class Recorder {
    // ...
    std::atomic<bool> hasStarted;
};
etc.
The reason behind the scenes is that even reading/writing a simple boolean value splits up into several assembler instructions on the CPU, and the thread may be switched out in the middle of them. Using std::atomic<> automatically provides something like a critical section for read/write operations on this variable.
In short: make everything that is meant to be accessed concurrently from different threads an atomic value, or use an appropriate synchronization mechanism such as a std::mutex.
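For instance, a sketch of the Recorder flag as an atomic (the rest of the class is omitted here); the existing setter and the while(recording) loop then read and write it safely:
#include <atomic>

class Recorder {
    // ...
    std::atomic<bool> recording;
public:
    Recorder() : recording(false) {}
    void setRecording(bool x) { recording = x; }      // atomic store
    bool isRecording() const  { return recording; }   // atomic load
};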
If you can't use the latest C++ standard yet, you can perhaps work around it using boost::thread to keep your code portable.
NOTE:
Since, judging from your comments, your question is specific to the Qt framework: there are a number of mechanisms you can use for synchronization there, e.g. the mentioned QMutex.
Why doesn't volatile help in multithreaded environments?
volatile prevents the compiler from optimizing away actual read accesses based on assumptions about values set earlier in sequential code. It doesn't prevent threads from being interrupted while actually reading or writing the value.
volatile should be used for reading from addresses that can change independently of the sequential or threading execution model (e.g. bus-addressed peripheral HW registers, where the HW actively changes values, such as an FPGA reporting current data throughput at a register interface).
See more details about this misconception here:
Why is volatile not considered useful in multithreaded C or C++ programming?
You could use a pool of nodes with pointers to frame buffers as part of a linked-list FIFO messaging system, using a mutex and a semaphore to coordinate the threads. A message for each frame to be recorded would be sent to the recording thread (appended to its list and a semaphore released); otherwise the node would be returned (appended) back to the main thread's list.
Example code using Windows based synchronization to copy a file. The main thread reads into buffers, the spawned thread writes from buffers it receives. The setup code is lengthy, but the actual messaging functions and the two thread functions are simple and small.
mtcopy.zip
Could be a liveness issue. The compiler could be re-ordering instructions or hoisting isActive out of the loop. Try marking it as volatile.
From MSDN docs:
Objects that are declared as volatile are not used in certain optimizations because their values can change at any time. The system always reads the current value of a volatile object when it is requested, even if a previous instruction asked for a value from the same object. Also, the value of the object is written immediately on assignment.
Simple example:
#include <iostream>
using namespace std;

int main() {
    bool active = true;
    while (active) {
        cout << "Still active" << endl;
    }
}
Assemble it:
g++ -S test.cpp -o test1.a
Add volatile to active as in volatile bool active = true
Assemble it again g++ -S test.cpp -o test2.a and look at the difference diff test1.a test2.a
< testb $1, -5(%rbp)
---
> movb -5(%rbp), %al
> testb $1, %al
Notice the first one doesn't even bother to read the value of active before testing it since the loop body never modifies it. The second version does.
I understand that to avoid output intermixing access to cout and cerr by multiple threads must be synchronized. In a program that uses both cout and cerr, is it sufficient to lock them separately? or is it still unsafe to write to cout and cerr simultaneously?
Edit clarification: I understand that cout and cerr are "thread safe" in C++11. My question is whether or not a write to cout and a write to cerr by different threads simultaneously can interfere with each other (resulting in interleaved output and such) in the way that two writes to cout can.
If you execute this function:
void f() {
    std::cout << "Hello, " << "world!\n";
}
from multiple threads you'll get a more-or-less random interleaving of the two strings, "Hello, " and "world!\n". That's because there are two function calls, just as if you had written the code like this:
void f() {
    std::cout << "Hello, ";
    std::cout << "world!\n";
}
To prevent that interleaving, you have to add a lock:
std::mutex mtx;

void f() {
    std::lock_guard<std::mutex> lock(mtx);
    std::cout << "Hello, " << "world!\n";
}
That is, the problem of interleaving has nothing to do with cout. It's about the code that uses it: there are two separate function calls inserting text, so unless you prevent multiple threads from executing the same code at the same time, there's a potential for a thread switch between the function calls, which is what gives you the interleaving.
Note that a mutex does not prevent thread switches. In the preceding code snippet, it prevents executing the contents of f() simultaneously from two threads; one of the threads has to wait until the other finishes.
If you're also writing to cerr, you have the same issue, and you'll get interleaved output unless you ensure that you never have two threads making these inserter function calls at the same time, and that means that both functions must use the same mutex:
std::mutex mtx;

void f() {
    std::lock_guard<std::mutex> lock(mtx);
    std::cout << "Hello, " << "world!\n";
}

void g() {
    std::lock_guard<std::mutex> lock(mtx);
    std::cerr << "Hello, " << "world!\n";
}
In C++11, unlike in C++03, the insertion to and extraction from global stream objects (cout, cin, cerr, and clog) are thread-safe. There is no need to provide manual synchronization. It is possible, however, that characters inserted by different threads will interleave unpredictably while being output; similarly, when multiple threads are reading from the standard input, it is unpredictable which thread will read which token.
Thread safety of the global stream objects is active by default, but it can be turned off by invoking the static std::ios_base::sync_with_stdio function and passing false as an argument. In that case, you would have to handle the synchronization manually.
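For example (a sketch), if you turn the synchronization off for speed, the locking becomes your job:
#include <iostream>
#include <mutex>

std::mutex cout_mtx; // hypothetical lock shared by everything that writes to cout

int main() {
    std::ios_base::sync_with_stdio(false); // faster, but drops the thread-safety guarantee
    // ... spawn threads; each of them writes like this:
    {
        std::lock_guard<std::mutex> lock(cout_mtx); // manual synchronization now required
        std::cout << "hello from some thread\n";
    }
}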
It may be unsafe to write to cout and cerr simultaneously !
It depends on whether cerr is tied to cout or not. See std::ios::tie.
"The tied stream is an output stream object which is flushed before
each i/o operation in this stream object."
This means that cout.flush() may get called unintentionally by the thread which writes to cerr.
I spent some time figuring out that this was the reason for randomly missing line endings in cout's output in one of my projects :(
With C++98, cerr should not be tied to cout. But despite the standard, it is tied when using MSVC 2008 (in my experience). With the following code everything works well.
std::ostream *cerr_tied_to = cerr.tie();
if (cerr_tied_to) {
    if (cerr_tied_to == &cout) {
        cerr << "DBG: cerr is tied to cout ! -- untying ..." << endl;
        cerr.tie(0);
    }
}
See also: why cerr flushes the buffer of cout
There are already several answers here. I'll summarize and also address interactions between them.
Typically, std::cout and std::cerr are funneled into a single stream of text, so locking them in common results in the most usable program.
If you ignore the issue, cout and cerr by default alias their stdio counterparts, which are thread-safe as in POSIX, up to the standard I/O functions (C++14 §27.4.1/4, a stronger guarantee than C alone). If you stick to this selection of functions, you get garbage I/O, but not undefined behavior (which is what a language lawyer might associate with "thread safety," irrespective of usefulness).
However, note that while standard formatted I/O functions (such as reading and writing numbers) are thread-safe, the manipulators to change the format (such as std::hex for hexadecimal or std::setw for limiting an input string size) are not. So, one can't generally assume that omitting locks is safe at all.
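So even a small formatted write like the following needs a lock if other threads also touch cout's format state (a sketch; the mutex is mine):
#include <iostream>
#include <mutex>

std::mutex cout_mtx;

void print_hex(unsigned value) {
    std::lock_guard<std::mutex> lock(cout_mtx);
    std::cout << std::hex << value << std::dec << '\n'; // manipulators change shared stream state
}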
If you choose to lock them separately, things are more complicated.
Separate locking
For performance, lock contention may be reduced by locking cout and cerr separately. They're separately buffered (or unbuffered), and they may flush to separate files.
By default, cerr flushes cout before each operation, because they are "tied." This would defeat both separation and locking, so remember to call cerr.tie( nullptr ) before doing anything with it. (The same applies to cin, but not to clog.)
Decoupling from stdio
The standard says that operations on cout and cerr do not introduce races, but that can't be exactly what it means. The stream objects aren't special; their underlying streambuf buffers are.
Moreover, the call std::ios_base::sync_with_stdio is intended to remove the special aspects of the standard streams — to allow them to be buffered as other streams are. Although the standard doesn't mention any impact of sync_with_stdio on data races, a quick look inside the libstdc++ and libc++ (GCC and Clang) std::basic_streambuf classes shows that they do not use atomic variables, so they may create race conditions when used for buffering. (On the other hand, libc++ sync_with_stdio effectively does nothing, so it doesn't matter if you call it.)
If you want extra performance regardless of locking, sync_with_stdio(false) is a good idea. However, after doing so, locking is necessary, along with cerr.tie( nullptr ) if the locks are separate.
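Putting that together, a sketch of the separate-locking setup might look like this (the mutex and function names are mine):
#include <iostream>
#include <mutex>
#include <string>

std::mutex cout_mtx, cerr_mtx; // one lock per stream

void init_streams() {
    std::ios_base::sync_with_stdio(false); // decouple from stdio for extra speed
    std::cerr.tie(nullptr);                // stop cerr from flushing cout
    std::cin.tie(nullptr);                 // likewise for cin, if it is used
}

void write_out(const std::string& s) {
    std::lock_guard<std::mutex> lock(cout_mtx);
    std::cout << s;
}

void write_err(const std::string& s) {
    std::lock_guard<std::mutex> lock(cerr_mtx);
    std::cerr << s;
}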
This may be useful ;)
#include <cstdarg>
#include <cstdio>
#include <mutex>
#include <string>

inline static void log(std::string const &format, ...) {
    static std::mutex locker;
    std::lock_guard<std::mutex> lock(locker); // the guard must be named:
                                              // an unnamed temporary unlocks immediately
    va_list list;
    va_start(list, format);
    vfprintf(stderr, format.c_str(), list);
    va_end(list);
}
I use something like this:
#include <iostream>
#include <mutex>
#include <ostream>

// Wrap a mutex around cerr so multiple threads don't overlap output
// USAGE:
//   LockedLog() << a << b << c;
//
class LockedLog {
public:
    LockedLog() { m_mutex.lock(); }
    ~LockedLog() { *m_ostr << std::endl; m_mutex.unlock(); }

    template <class T>
    LockedLog &operator << (const T &msg)
    {
        *m_ostr << msg;
        return *this;
    }

private:
    static std::ostream *m_ostr;
    static std::mutex m_mutex;
};

std::mutex LockedLog::m_mutex;
std::ostream* LockedLog::m_ostr = &std::cerr;