iostream thread safety, must cout and cerr be locked separately?


I understand that, to avoid intermixed output, access to cout and cerr by multiple threads must be synchronized. In a program that uses both cout and cerr, is it sufficient to lock them separately, or is it still unsafe to write to cout and cerr simultaneously?
Edit clarification: I understand that cout and cerr are "Thread Safe" in C++11. My question is whether a write to cout and a write to cerr by different threads at the same time can interfere with each other (resulting in interleaved output and such) in the way that two writes to cout can.

If you execute this function:
void f() {
std::cout << "Hello, " << "world!\n";
}
from multiple threads you'll get a more-or-less random interleaving of the two strings, "Hello, " and "world!\n". That's because there are two function calls, just as if you had written the code like this:
void f() {
std::cout << "Hello, ";
std::cout << "world!\n";
}
To prevent that interleaving, you have to add a lock:
std::mutex mtx;
void f() {
std::lock_guard<std::mutex> lock(mtx);
std::cout << "Hello, " << "world!\n";
}
That is, the problem of interleaving has nothing to do with cout. It's about the code that uses it: there are two separate function calls inserting text, so unless you prevent multiple threads from executing the same code at the same time, there's a potential for a thread switch between the function calls, which is what gives you the interleaving.
Note that a mutex does not prevent thread switches. In the preceding code snippet, it prevents executing the contents of f() simultaneously from two threads; one of the threads has to wait until the other finishes.
If you're also writing to cerr, you have the same issue, and you'll get interleaved output unless you ensure that you never have two threads making these inserter function calls at the same time, and that means that both functions must use the same mutex:
std::mutex mtx;
void f() {
std::lock_guard<std::mutex> lock(mtx);
std::cout << "Hello, " << "world!\n";
}
void g() {
std::lock_guard<std::mutex> lock(mtx);
std::cerr << "Hello, " << "world!\n";
}

In C++11, unlike in C++03, insertion into and extraction from the global stream objects (cout, cin, cerr, and clog) are thread-safe in the sense that concurrent use does not cause a data race, so no manual synchronization is needed for that purpose. It is possible, however, that characters inserted by different threads will interleave unpredictably while being output; similarly, when multiple threads are reading from the standard input, it is unpredictable which thread will read which token.
This guarantee holds by default, while the streams are synchronized with stdio. It can be turned off by calling the static member function std::ios_base::sync_with_stdio with false as the argument. In that case, you have to handle the synchronization manually.

It may be unsafe to write to cout and cerr simultaneously!
It depends on whether cerr is tied to cout or not. See std::ios::tie.
"The tied stream is an output stream object which is flushed before
each i/o operation in this stream object."
This means that cout.flush() may get called unintentionally by the thread that writes to cerr.
I spent some time figuring out that this was the reason for randomly missing line endings in cout's output in one of my projects :(
In C++98, cerr should not be tied to cout. But despite the standard, it is tied when using MSVC 2008 (in my experience). With the following code, everything works well.
std::ostream *cerr_tied_to = std::cerr.tie();
if (cerr_tied_to == &std::cout) {
std::cerr << "DBG: cerr is tied to cout! -- untying ..." << std::endl;
std::cerr.tie(0);
}
See also: why cerr flushes the buffer of cout

There are already several answers here. I'll summarize and also address interactions between them.
Typically, std::cout and std::cerr are funneled into a single stream of text, so locking them together results in the most usable program.
If you ignore the issue, cout and cerr by default alias their stdio counterparts, which are thread-safe as in POSIX, up to the standard I/O functions (C++14 §27.4.1/4, a stronger guarantee than C alone). If you stick to this selection of functions, you get garbage I/O, but not undefined behavior (which is what a language lawyer might associate with "thread safety," irrespective of usefulness).
However, note that while standard formatted I/O functions (such as reading and writing numbers) are thread-safe, the manipulators to change the format (such as std::hex for hexadecimal or std::setw for limiting an input string size) are not. So, one can't generally assume that omitting locks is safe at all.
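One way to sidestep the manipulator problem, sketched here under the assumption that formatting can be done ahead of the write: do the formatting in a stream that belongs to the calling thread, then insert the finished text into the shared stream under a lock. The names format_hex, print_hex, and io_mutex are hypothetical.

```cpp
#include <ios>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Formats in a local stream, so std::hex never touches a shared stream.
std::string format_hex(unsigned value) {
    std::ostringstream local;
    local << std::hex << value;
    return local.str();
}

std::mutex io_mutex;  // guards writes to the shared stream

// Writes the already-formatted text as one line while holding the lock.
void print_hex(std::ostream& os, unsigned value) {
    const std::string text = format_hex(value);
    std::lock_guard<std::mutex> lock(io_mutex);
    os << text << '\n';
}
```

With this arrangement the shared stream's format flags are never modified, so other threads that print decimal numbers are unaffected.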
If you choose to lock them separately, things are more complicated.
Separate locking
For performance, lock contention may be reduced by locking cout and cerr separately. They're separately buffered (or unbuffered), and they may flush to separate files.
By default, cerr flushes cout before each operation, because they are "tied." This would defeat both separation and locking, so remember to call cerr.tie( nullptr ) before doing anything with it. (The same applies to cin, but not to clog.)
Decoupling from stdio
The standard says that operations on cout and cerr do not introduce races, but that can't be exactly what it means. The stream objects aren't special; their underlying streambuf buffers are.
Moreover, the call std::ios_base::sync_with_stdio is intended to remove the special aspects of the standard streams — to allow them to be buffered as other streams are. Although the standard doesn't mention any impact of sync_with_stdio on data races, a quick look inside the libstdc++ and libc++ (GCC and Clang) std::basic_streambuf classes shows that they do not use atomic variables, so they may create race conditions when used for buffering. (On the other hand, libc++ sync_with_stdio effectively does nothing, so it doesn't matter if you call it.)
If you want extra performance regardless of locking, sync_with_stdio(false) is a good idea. However, after doing so, locking is necessary, along with cerr.tie( nullptr ) if the locks are separate.
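A minimal sketch of the separate-locking setup described in this answer, assuming one mutex per stream; the names cout_mutex, cerr_mutex, untie_cerr, write_out, and write_err are invented for the example.

```cpp
#include <iostream>
#include <mutex>
#include <string>

std::mutex cout_mutex;  // guards std::cout only
std::mutex cerr_mutex;  // guards std::cerr only

// Call once at startup, before any thread writes to std::cerr.
// Afterwards, writing to cerr no longer flushes cout behind its lock.
void untie_cerr() {
    std::cerr.tie(nullptr);
}

void write_out(const std::string& line) {
    std::lock_guard<std::mutex> lock(cout_mutex);
    std::cout << line << '\n';
}

void write_err(const std::string& line) {
    std::lock_guard<std::mutex> lock(cerr_mutex);
    std::cerr << line << '\n';
}
```

If untie_cerr were skipped, a thread inside write_err could flush std::cout while another thread holds cout_mutex, which is the interaction the tie warning is about.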

This may be useful ;)
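// Requires <cstdarg>, <cstdio>, and <mutex>.
// The format is a const char* because va_start on a reference parameter is undefined behavior.
inline void log(const char *format, ...) {
static std::mutex locker;
std::lock_guard<std::mutex> guard(locker); // named object: the lock is held until return
va_list list;
va_start(list, format);
vfprintf(stderr, format, list);
va_end(list);
}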

I use something like this:
// Wrap a mutex around cerr so multiple threads don't overlap output
// USAGE:
// LockedLog() << a << b << c;
//
class LockedLog {
public:
LockedLog() { m_mutex.lock(); }
~LockedLog() { *m_ostr << std::endl; m_mutex.unlock(); }
template <class T>
LockedLog &operator << (const T &msg)
{
*m_ostr << msg;
return *this;
}
private:
static std::ostream *m_ostr;
static std::mutex m_mutex;
};
std::mutex LockedLog::m_mutex;
std::ostream* LockedLog::m_ostr = &std::cerr;

Related

How do I print in a new thread without threads interrupting lines? (particularly c++)

I've worked a decent amount with threading in C on linux and now I'm trying to do the same but with c++ on Windows, but I'm having trouble with printing to the standard output. In the function the thread carries out I have:
void print_number(void* x){
int num = *(static_cast<int*> (x));
std::cout << "The number is " << num << std::endl;
}
wrapped in a loop that creates three threads. The problem is that although everything gets printed, the threads seem to interrupt each other between each of the "<<"'s.
For example, the last time I ran it I got
The number is The number is 2The number is 3
1
When I was hoping for each on a separate line. I'm guessing that each thread is able to write to the standard output after another has written a single section between "<<"s. In C, this wasn't a problem because the buffer wasn't flushed until everything I needed to write was there, but I don't think that's the case now. Is this a case of a need for a mutex?

In C++, we first of all would prefer to take arguments as int*. And then, we can just lock. In C++11:
std::mutex mtx; // somewhere, in case you have other print functions
// that you want to control
void print_number(int* num) {
std::unique_lock<std::mutex> lk{mtx}; // RAII. Unlocks when lk goes out of scope
std::cout << "The number is " << *num << std::endl;
}
If not C++11, there's boost::mutex and boost::mutex::scoped_lock that work the same way and do the same thing.

Your C example worked by accident; printf and the like aren't atomic either.
This is indeed a case for a mutex. I typically declare it as a function-local static. E.g.:
void atomic_print(/*args*/) {
static MyMutex mutex;
mutex.acquire();
printf(/*with the args*/);
mutex.release();
}

Boost Mutex Scoped Lock

I was reading through a Boost Mutex tutorial on drdobbs.com, and found this piece of code:
#include <boost/thread/thread.hpp>
#include <boost/thread/mutex.hpp>
#include <boost/bind.hpp>
#include <iostream>
boost::mutex io_mutex;
void count(int id)
{
for (int i = 0; i < 10; ++i)
{
boost::mutex::scoped_lock
lock(io_mutex);
std::cout << id << ": " <<
i << std::endl;
}
}
int main(int argc, char* argv[])
{
boost::thread thrd1(
boost::bind(&count, 1));
boost::thread thrd2(
boost::bind(&count, 2));
thrd1.join();
thrd2.join();
return 0;
}
Now I understand the point of a Mutex is to prevent two threads from accessing the same resource at the same time, but I don't see the correlation between io_mutex and std::cout. Does this code just lock everything within the scope until the scope is finished?

"Now I understand the point of a Mutex is to prevent two threads from accessing the same resource at the same time, but I don't see the correlation between io_mutex and std::cout."
std::cout is a global object, so you can see that as a shared resource. If you access it concurrently from several threads, those accesses must be synchronized somehow, to avoid data races and undefined behavior.
Perhaps it will be easier for you to notice that concurrent access occurs by considering that:
std::cout << x
Is actually equivalent to:
::operator << (std::cout, x)
Which means you are calling a function that operates on the std::cout object, and you are doing so from different threads at the same time. std::cout must be protected somehow. But that's not the only reason why the scoped_lock is there (keep reading).
"Does this code just lock everything within the scope until the scope is finished?"
Yes, it locks io_mutex until the lock object itself goes out of scope (being a typical RAII wrapper), which happens at the end of each iteration of your for loop.
Why is it needed? Well, although in C++11 individual insertions into cout are guaranteed to be thread-safe, subsequent, separate insertions may be interleaved when several threads are outputting something.
Keep in mind that each insertion through operator << is a separate function call, as if you were doing:
std::cout << id;
std::cout << ": ";
std::cout << i;
std::cout << std::endl;
The fact that operator << returns the stream object allows you to chain the above function calls in a single expression (as you have done in your program), but the fact that you are having several separate function calls still holds.
Now looking at the above snippet, it is more evident that the purpose of this scoped lock is to make sure that each message of the form:
<id> ": " <index> <endl>
Gets printed without its parts being interleaved with parts from other messages.
Also, in C++03 (where insertions into cout are not guaranteed to be thread-safe), the lock will protect the cout object itself from being accessed concurrently.

A mutex has nothing to do with anything else in the program
(except a condition variable), at least at a higher level.
A mutex has two effects: it controls program flow, and prevents
multiple threads from executing the same block of code
simultaneously. It also ensures memory synchronization. The
important issue here, is that mutexes aren't associated with
resources, and don't prevent two threads from accessing the same
resource at the same time. A mutex defines a critical section
of code, which can only be entered by one thread at a time. If
all of the use of a particular resource is done in critical
sections controlled by the same mutex, then the resource is
effectively protected by the mutex. But the relationship is
established by the coder, by ensuring that all use does take
place in the critical sections.
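To make that last point concrete, here is a small sketch (the names counter, counter_mutex, add_many, and count_with_threads are invented for the example). Nothing in the language ties counter_mutex to counter; the counter is protected only because every access to it happens inside a critical section that uses that mutex.

```cpp
#include <mutex>
#include <thread>
#include <vector>

std::mutex counter_mutex;  // "protects" counter only by convention
long counter = 0;

// Every increment happens inside the same critical section.
void add_many(int times) {
    for (int i = 0; i < times; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex);
        ++counter;
    }
}

// Runs add_many on several threads and returns the final count.
long count_with_threads(int nthreads, int times) {
    counter = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < nthreads; ++i) {
        threads.emplace_back(add_many, times);
    }
    for (auto& t : threads) t.join();
    return counter;  // safe to read: all writers have been joined
}
```

If any other function modified counter without taking counter_mutex, the mutex would no longer protect it, even though the code above is unchanged.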

Basic thread locking in C++11

How do I lock my thread so that my output isn't something like this: hello...hello...hheelhllelolo.l..o......
std::size_t const nthreads=5;
std::vector<std::thread> my_threads(nthreads);
for(unsigned i=0;i<nthreads;i++)
{
my_threads[i] = std::thread ([] () {std::cout << "hello...";});
}

The standard says:
Concurrent access to a synchronized (27.5.3.4) standard iostream object’s formatted and unformatted input (27.7.2.1) and output (27.7.3.1) functions or a standard C stream by multiple threads shall not result in a data race (1.10). [Note: Users must still synchronize concurrent use of these objects and streams by multiple threads if they wish to avoid interleaved characters. — end note] — [iostream.objects.overview] 27.4.1 p4
Notice that the requirement not to produce a data race applies only to the standard iostream objects (cout, cin, cerr, clog, wcout, wcin, wcerr, and wclog) and only when they are synchronized (which they are by default and which can be disabled using the sync_with_stdio member function).
Unfortunately I've noticed two phenomena: implementations either provide stricter guarantees than required (e.g., thread synchronization for all stream objects no matter what, giving poor performance) or fewer (e.g., standard stream objects that are synchronized with stdio still produce data races). MSVC seems to lean toward the former while libc++ leans toward the latter.
Anyway, as the note indicates, you have to provide mutual exclusion yourself if you want to avoid interleaved characters. Here's one way to do it:
std::mutex m;
struct lockostream {
std::lock_guard<std::mutex> l;
lockostream() : l(m) {}
};
std::ostream &operator<<(std::ostream &os, lockostream const &l) {
return os;
}
std::cout << lockostream() << "Hello, World!\n";
This way a lock guard is created and lives for the duration of the expression using std::cout. You can templatize the lockostream object to work for any basic_*stream, and even on the address of the stream so that you have a separate mutex for each one.
Of course the standard stream objects are global variables, so you might want to avoid them the same way all global variables should be avoided. They're handy for learning C++ and toy programs, but you might want to arrange something better for real programs.

You have to use the normal locking techniques, as you would with any other shared resource; otherwise the output may interleave (and with unsynchronized streams, concurrent writes are undefined behavior).
std::mutex m;
std::lock_guard<std::mutex> lock(m);
std::cout << "hello hello";
or alternatively you can use printf, which is thread-safe (on POSIX):
printf("hello hello");

BOOST threading : cout behavior

I am new to Boost threading and I am stuck with how output is performed from multiple threads.
I have a simple boost::thread counting down from 9 to 1; the main thread waits and then prints "LiftOff..!!"
#include <iostream>
#include <boost/thread.hpp>
using namespace std;
struct callable {
void operator() ();
};
void callable::operator() () {
int i = 10;
while(--i > 0) {
cout << "#" << i << ", ";
boost::this_thread::yield();
}
cout.flush();
}
int main() {
callable x;
boost::thread myThread(x);
myThread.join();
cout << "LiftOff..!!" << endl;
return 0;
}
The problem is that I have to use an explicit "cout.flush()" statement in my thread to display the output. If I don't use flush(), I only get "LiftOff!!" as the output.
Could someone please advise why I need to use flush() explicitly?

This isn't specifically thread related: cout is usually buffered, and output only appears when the implementation decides to write it, so in the thread the output will only appear on an implementation-specific basis. By calling flush you force the buffers to be flushed.
This will vary across implementations; usually, though, it's after a certain number of characters or when a newline is sent.
I've found that multiple threads writing to the same stream or file is mostly OK, provided that the output is performed as atomically as possible. It's not something that I'd recommend in a production environment, though, as it is too unpredictable.

This behaviour seems to depend on the OS-specific implementation of the cout stream. I guess that write operations on cout are buffered in some thread-specific memory in your case, and that the flush() operation forces them to be printed on the console. I guess this because endl includes a call to flush(), and the endl in your main function doesn't see your changes even after the thread has been joined.
BTW it would be a good idea to synchronize outputs to an ostream shared between threads anyway, otherwise you might see them intermingled. We do so for our logging classes, which use a background thread to write the logging messages to the associated ostream.

Given the short length of your messages, there's no reason anything should appear without a flush. (Don't forget that std::endl is the equivalent of << '\n' << std::flush.)

I get the asked behaviour with and without flush (gcc 4.3.2 boost 1.47 Linux RH5)
I assume that your cygwin system chooses to implement several std::cout objects with associated std::streambuf. This I assume is implementation specific.
Since flush or endl only forces its buffer to flush onto its OS controlled output sequence the cout object of your thread remains buffered.
Sharing a reference of an ostream between the threads should solve the problem.

How to make sure a chained logging statement is atomic?

I have a logging class which has operator<< overloaded. So I can do things like this:
oLogger << "Log this" << " and this" << " and " << 10 << endl;
oLogger << "Something else" << endl;
The logger does this without any problems. But, I want the logger object to be shared among threads. Then, I don't want it printing out something like this:
//LogFILE
Log this and this Something else
and 10
So, I need to lock a whole chain of operator<<s. I am guessing this can be done with RAII, I haven't given it much thought yet. In the meantime, is there any traditional way of getting this done? (Except ending input with a manipulator?)

Slight alternative to Nim's answer:
Create
class LockedLog {
static MutEx mutex; // global mutex for logging
ScopedLock lock; // some scoped locker to hold the mutex
Logger &oLogger; // reference to the log writer itself
public:
LockedLog(Logger &oLogger) : oLogger(oLogger), lock(mutex) {}
template <typename T>
LockedLog &operator<<(const T &value) { oLogger << value; return *this; }
};
And either just do:
LockedLog(oLogger) << "Log this" << " and this " << " and " << 10 << endl;
Or change Logger::operator<< to normal method, call this method in LockedLog::operator<<, add cast-operator to Logger:
operator LockedLog() { return LockedLog(*this); }
and that should add locking to your current code.
Update: That locks across all the calls to operator<< and may even lock around evaluation of their arguments (depends on whether compiler will evaluate left or right argument first and it may choose). To reduce that, one could:
class LockedLog {
static MutEx mutex; // global mutex for logging
std::stringstream buffer; // temporary formatting buffer;
Logger &oLogger; // reference to the log writer itself
public:
LockedLog(Logger &oLogger) : oLogger(oLogger) {} // no lock member here; the lock is taken in the destructor
template <typename T>
LockedLog &operator<<(const T &value) { buffer << value; return *this; }
~LockedLog() { ScopedLock lock(mutex); oLogger << buffer.str() << std::flush; }
};
But the stringstream adds another overhead.

One approach is to use a macro, i.e.
#define LOG(x) \
{\
<acquire scoped lock> \
oLogger << x; \
}
then
LOG("Log this" << " and this" << " and " << 10 << endl);
I've also done it using the manipulator approach that you mention above, however the problem is that you need to have operator<< implemented for all the types (i.e. can't use the standard operators that exist)
EDIT: to reduce the time the lock is held, consider something like this:
#define LOG(x) \
{ \
std::ostringstream str; \
str << x; /* the streaming happens in local scope, no need for lock */ \
oLogger.write(str.str()); /* ensure the write method acquires a lock */ \
}

I've found that the best solution is to write a class buffer so that
buffer(oLogger) << "Log this" << " and this" << " and " << 10 << endl;
creates a temporary buffer object, captures and formats the output and writes it to oLogger in its destructor. This is trivially done by wrapping a stringstream. Because every thread has its own buffers, formatting is independent.
For extra fanciness, buffer::~buffer can use several different mechanisms to prevent thread-unsafe access of oLogger. You assumed that operator<< calls from multiple threads might be interleaved. In fact, it's worse; they can be concurrent. You could get "LSoogm ethhiinsg else". Making sure that only one buffer flushes to oLogger at a time prevents this.
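A minimal sketch of such a buffer class, assuming a plain std::ostream stands in for oLogger (the real Logger type is not shown in the question) and that a single function-local mutex is an acceptable way to serialize the final write:

```cpp
#include <mutex>
#include <ostream>
#include <sstream>

class buffer {
public:
    explicit buffer(std::ostream& sink) : sink_(sink) {}
    buffer(const buffer&) = delete;
    buffer& operator=(const buffer&) = delete;

    // Formatting goes into the private stringstream; no lock is held here.
    template <class T>
    buffer& operator<<(const T& value) {
        text_ << value;
        return *this;
    }

    // Accepts manipulators such as std::endl, which the template cannot deduce.
    buffer& operator<<(std::ostream& (*manip)(std::ostream&)) {
        manip(text_);
        return *this;
    }

    // The finished message is written in one step while holding the lock.
    ~buffer() {
        std::lock_guard<std::mutex> lock(sink_mutex());
        sink_ << text_.str();
        sink_.flush();
    }

private:
    static std::mutex& sink_mutex() {
        static std::mutex m;  // one mutex for all sinks, for simplicity
        return m;
    }
    std::ostream& sink_;
    std::ostringstream text_;
};
```

A statement such as buffer(std::cerr) << "Log this" << " and " << 10 << std::endl; formats everything in the temporary and hands the complete line to the sink when the temporary is destroyed at the end of the statement.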

I would probably use expression templates here.
The main idea is that it's bloody stupid to acquire the lock during the formatting phase, especially since there might be function calls during this formatting.
You need to use two different phases:
format the log
atomically post the log
This can be accomplished with expression templates:
First call to Logger::operator<< yields a LoggerBuffer that embeds a reference to Logger.
Subsequent calls are performed on LoggerBuffer which deals with all the formatting mess
Upon destruction of LoggerBuffer (at the end of the statement), it locks Logger, passes the formatted string, and unlocks (unless you've got a lock-free queue or something)
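The two phases can be sketched roughly as follows. This is a simple proxy-object version rather than a full expression-template implementation, and everything beyond the names Logger and LoggerBuffer (the post method, the std::ostream sink) is an assumption made for the example.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

class Logger;

// Phase 1: collects the formatted text for one statement, without locking.
class LoggerBuffer {
public:
    explicit LoggerBuffer(Logger& logger) : logger_(&logger) {}
    LoggerBuffer(LoggerBuffer&& other)
        : logger_(other.logger_), text_(std::move(other.text_)) {
        other.logger_ = nullptr;  // the moved-from buffer must not post
    }
    ~LoggerBuffer();  // phase 2, defined after Logger

    template <class T>
    LoggerBuffer& operator<<(const T& value) {
        text_ << value;
        return *this;
    }

private:
    Logger* logger_;
    std::ostringstream text_;
};

class Logger {
public:
    explicit Logger(std::ostream& sink) : sink_(sink) {}

    // The first << starts a buffer; later << calls go to that buffer.
    template <class T>
    LoggerBuffer operator<<(const T& value) {
        LoggerBuffer buf(*this);
        buf << value;
        return buf;
    }

    // Phase 2: post one finished message while holding the lock.
    void post(const std::string& message) {
        std::lock_guard<std::mutex> lock(mutex_);
        sink_ << message;
    }

private:
    std::mutex mutex_;
    std::ostream& sink_;
};

LoggerBuffer::~LoggerBuffer() {
    if (logger_) logger_->post(text_.str());
}
```

With this in place, logger << "Log this" << " and " << 10 << '\n'; keeps its familiar form, and the lock is held only for the single write in post.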

As I have to internationalize logs, I prefer things like :
oLogger << myAutosprintf(_("My wonderful %s ! I have %d apples"), name, nbApple);
It is way better for translation :) And it will solve your problem. _() is a shortcut for the translation stuff.
You can use gnu::autosprintf, boost.format (thanks to Jan Huec), or write your own.
my2c
NB: Edited after good remarks (was too fast, thank you for the comments). I have erased the wrong "first part" statement
