How to easily make std::cout thread-safe? - c++

I have a multi-threaded application that uses std::cout heavily for logging without any locking. How can I easily add a locking mechanism to make std::cout thread-safe?
I don't want to search for each occurrence of std::cout and add a line of locking code. That is too tedious.
Any better practice?

I can't be sure this applies to every compiler or standard library version, but in the code base I'm working with, each individual call to std::cout's operator<< is already thread-safe.
I'm assuming that what you're really trying to do is stop std::cout from mixing strings when several threads each chain operator<< multiple times per message.
Strings get garbled because there is an "external" race between the separate operator<< calls, which can lead to things like this:
//Thread 1
std::cout << "the quick brown fox " << "jumped over the lazy dog " << std::endl;
//Thread 2
std::cout << "my mother washes" << " seashells by the sea shore" << std::endl;
//Could just as easily print like this or any other crazy order.
my mother washes the quick brown fox seashells by the sea shore \n
jumped over the lazy dog \n
If that's the case, there is a much simpler answer than making your own thread-safe cout or implementing a lock to use with cout.
Simply compose your string before you pass it to cout.
For example:
//There are other ways, but stringstream uses << just like cout..
std::stringstream msg;
msg << "Error:" << Err_num << ", " << ErrorString( Err_num ) << "\n";
std::cout << msg.str();
This way your strings can't be garbled because they are already fully formed, and it's also better practice to fully form your strings before dispatching them anyway.

Since C++20, you can use std::osyncstream wrapper:
http://en.cppreference.com/w/cpp/io/basic_osyncstream
{
std::osyncstream bout(std::cout); // synchronized wrapper for std::cout
bout << "Hello, ";
bout << "World!";
bout << std::endl; // flush is noted, but not yet performed
bout << "and more!\n";
} // characters are transferred and std::cout is flushed
It provides the guarantee that all output made to the same final
destination buffer (std::cout in the examples above) will be free of
data races and will not be interleaved or garbled in any way, as long
as every write to that final destination buffer is made through
(possibly different) instances of std::basic_osyncstream.
Alternatively, you can use a temporary:
std::osyncstream(std::cout) << "Hello, " << "World!" << '\n';

Note: This answer is pre-C++20 so it does not use std::osyncstream with its separate buffering, but uses a lock instead.
I guess you could implement your own class which wraps cout and associates a mutex with it. The operator << of that new class would do three things:
create a lock for the mutex, possibly blocking other threads
do the output, i.e. do the operator << for the wrapped stream and the passed argument
construct an instance of a different class, passing the lock to that
This different class would keep the lock and delegate operator << to the wrapped stream. The destructor of that second class would eventually destroy the lock and release the mutex.
So any output you write as a single statement, i.e. as a single sequence of << invocations, will be printed atomically as long as all your output goes through that object with the same mutex.
Let's call the two classes synchronized_ostream and locked_ostream. If sync_cout is an instance of synchronized_ostream which wraps std::cout, then the sequence
sync_cout << "Hello, " << name << "!" << std::endl;
would result in the following actions:
synchronized_ostream::operator<< would acquire the lock
synchronized_ostream::operator<< would delegate the printing of "Hello, " to cout
operator<<(std::ostream&, const char*) would print "Hello, "
synchronized_ostream::operator<< would construct a locked_ostream and pass the lock to that
locked_ostream::operator<< would delegate the printing of name to cout
operator<<(std::ostream&, std::string) would print the name
The same delegation to cout happens for the exclamation point and the endline manipulator
The locked_ostream temporary gets destructed, the lock is released

I really like the trick from Nicolás given in this question of creating a temporary object and putting the protection code on the destructor.
/** Thread safe cout class
* Example of use:
* PrintThread{} << "Hello world!" << std::endl;
*/
class PrintThread: public std::ostringstream
{
public:
PrintThread() = default;
~PrintThread()
{
std::lock_guard<std::mutex> guard(_mutexPrint);
std::cout << this->str();
}
private:
static std::mutex _mutexPrint;
};
std::mutex PrintThread::_mutexPrint{};
You can then use it as a regular std::cout, from any thread:
PrintThread{} << "my_val=" << val << std::endl;
The object collects data like a regular ostringstream. As soon as the semicolon is reached, the temporary is destroyed and writes out everything it collected.

Along the lines of the answer suggested by Conchylicultor, but without inheriting from std::ostringstream:
EDIT: Fixed return type for the overloaded operator and added overload for std::endl.
EDIT 1: I have extended this into a simple header-only library for logging / debugging multi-threaded programs.
#include <iostream>
#include <mutex>
#include <thread>
#include <vector>
#include <chrono>
static std::mutex mtx_cout;
// Asynchronous output
struct acout
{
std::unique_lock<std::mutex> lk;
acout()
:
lk(std::unique_lock<std::mutex>(mtx_cout))
{
}
template<typename T>
acout& operator<<(const T& _t)
{
std::cout << _t;
return *this;
}
acout& operator<<(std::ostream& (*fp)(std::ostream&))
{
std::cout << fp;
return *this;
}
};
int main(void)
{
std::vector<std::thread> workers_cout;
std::vector<std::thread> workers_acout;
size_t worker(0);
size_t threads(5);
std::cout << "With std::cout:" << std::endl;
for (size_t i = 0; i < threads; ++i)
{
workers_cout.emplace_back([&]
{
std::cout << "\tThis is worker " << ++worker << " in thread "
<< std::this_thread::get_id() << std::endl;
});
}
for (auto& w : workers_cout)
{
w.join();
}
worker = 0;
std::this_thread::sleep_for(std::chrono::seconds(2));
std::cout << "\nWith acout():" << std::endl;
for (size_t i = 0; i < threads; ++i)
{
workers_acout.emplace_back([&]
{
acout() << "\tThis is worker " << ++worker << " in thread "
<< std::this_thread::get_id() << std::endl;
});
}
for (auto& w : workers_acout)
{
w.join();
}
return 0;
}
Output:
With std::cout:
This is worker 1 in thread 139911511856896
This is worker This is worker 3 in thread 139911495071488
This is worker 4 in thread 139911486678784
2 in thread This is worker 5 in thread 139911503464192139911478286080
With acout():
This is worker 1 in thread 139911478286080
This is worker 2 in thread 139911486678784
This is worker 3 in thread 139911495071488
This is worker 4 in thread 139911503464192
This is worker 5 in thread 139911511856896

For quickly debugging C++11 applications while avoiding interleaved output, I just write small functions like these:
...
#include <mutex>
...
mutex m_screen;
...
void msg(char const * const message);
...
void msg(char const * const message)
{
lock_guard<mutex> guard(m_screen); // unlocks automatically, even if an exception is thrown
cout << message << endl;
}
I use these types of functions for outputs and if numeric values are needed I just use something like this:
void msgInt(char const * const message, int const &value);
...
void msgInt(char const * const message, int const &value)
{
lock_guard<mutex> guard(m_screen); // unlocks automatically, even if an exception is thrown
cout << message << " = " << value << endl;
}
This is easy and works fine for me, but I don't really know if it is technically correct. So I would be glad to hear your opinions.
Well, I didn't read this:
I don't want to search for each occurrence of std::cout and add a line of locking code.
I'm sorry. However I hope it helps somebody.

A feasible solution uses a line-buffer for each thread. You might get interleaved lines, but not interleaved characters. If you attach that to thread-local storage, you also avoid lock contention issues. Then, when a line is full (or on flush, if you want), you write it to stdout. This last operation of course has to use a lock. You stuff all this into a streambuffer, which you put between std::cout and its original streambuffer (a.k.a. Decorator Pattern).
The problem this doesn't solve is things like format flags (e.g. hex/dec/oct for numbers), which can sometimes percolate between threads, because they are attached to the stream. It's nothing bad, assuming you're only logging and not using it for important data. It helps to just not format things specially. If you need hex output for certain numbers, try this:
template<typename integer_type>
std::string hex(integer_type v)
{
/* Notes:
1. using showbase would still not show the 0x for a zero
2. using (v + 0) converts an unsigned char to a type
that is recognized as integer instead of as character */
std::stringstream s;
s << "0x" << std::setfill('0') << std::hex
<< std::setw(2 * sizeof v) << (v + 0);
return s.str();
}
Similar approaches work for other formats as well.

I know it's an old question, but it helped me a lot with my problem. I created a utility class based on the answers in this post and I'd like to share my result.
Assuming C++11 or later, this class provides print and println functions that compose the whole string before writing it to the standard output stream, which avoids interleaved output. They are variadic function templates, so they can print different data types.
You can check its use in a producer-consumer problem on my github: https://github.com/eloiluiz/threadsBar
So, here is my code:
class Console {
private:
Console() = default;
inline static void innerPrint(std::ostream &stream) {}
template<typename Head, typename... Tail>
inline static void innerPrint(std::ostream &stream, Head const head, Tail const ...tail) {
stream << head;
innerPrint(stream, tail...);
}
public:
template<typename Head, typename... Tail>
inline static void print(Head const head, Tail const ...tail) {
// Create a stream buffer
std::stringbuf buffer;
std::ostream stream(&buffer);
// Feed input parameters to the stream object
innerPrint(stream, head, tail...);
// Print to the console with a single insertion (no explicit flush)
std::cout << buffer.str();
}
template<typename Head, typename... Tail>
inline static void println(Head const head, Tail const ...tail) {
print(head, tail..., "\n");
}
};

This is how I manage thread safe operations on std::cout using a custom enum and macros:
enum SynchronisedOutput { IO_Lock, IO_Unlock };
inline std::ostream & operator<<(std::ostream & os, SynchronisedOutput so) {
static std::mutex mutex;
if (IO_Lock == so) mutex.lock();
else if (IO_Unlock == so)
mutex.unlock();
return os;
}
#define sync_os(Os) (Os) << IO_Lock
#define sync_cout sync_os(std::cout)
#define sync_endl '\n' << IO_Unlock
This allows me to write things like:
sync_cout << "Hello, " << name << '!' << sync_endl;
in threads without racing issues.

I had a similar problem to yours. You can use the following class.
This only supports outputting to std::cout, but if you need a general one let me know. In the code below, tsprint creates an inline temporary object of the class ThreadSafePrinter. If you want, you can change tsprint to cout if you have used cout instead of std::cout, so you won't have to replace any instances of cout but I do not recommend such a practice in general. It is much better to use a special output symbol for such debug lines starting from the beginning of the project anyway.
I also like the sync_cout / sync_endl solution from another answer here.
In my solution, each thread keeps inserting into its own thread_local stringstream object and locks the mutex only when a flush is required, which is triggered in the destructor. This is expected to improve efficiency by shortening the time during which the mutex is held. Maybe I can include a mechanism similar to sync_endl from that answer.
class ThreadSafePrinter
{
static mutex m;
static thread_local stringstream ss;
public:
ThreadSafePrinter() = default;
~ThreadSafePrinter()
{
lock_guard<mutex> lg(m);
std::cout << ss.str();
ss.str(""); // empty the buffer; clear() alone only resets the error flags
ss.clear();
}
template<typename T>
ThreadSafePrinter& operator << (const T& c)
{
ss << c;
return *this;
}
// this is the type of std::cout
typedef std::basic_ostream<char, std::char_traits<char> > CoutType;
// this is the function signature of std::endl
typedef CoutType& (*StandardEndLine)(CoutType&);
// define an operator<< to take in std::endl
ThreadSafePrinter& operator<<(StandardEndLine manip)
{
manip(ss);
return *this;
}
};
mutex ThreadSafePrinter::m;
thread_local stringstream ThreadSafePrinter::ss;
#define tsprint ThreadSafePrinter()
int main()
{
tsprint << "asd ";
tsprint << "dfg";
}

In addition to synchronisation this solution provides information about the thread from which the log was written.
DISCLAIMER: It's quite a naive way of syncing the logs, however it might be applicable for some small use cases for debugging.
thread_local int thread_id = -1;
std::atomic<int> thread_count;
struct CurrentThread {
static void init() {
if (thread_id == -1) {
thread_id = thread_count++;
}
}
friend std::ostream &operator<<(std::ostream &os, const CurrentThread &t) {
os << "[Thread-" << thread_id << "] - ";
return os;
}
};
CurrentThread current_thread;
std::mutex io_lock;
#ifdef DEBUG
#define LOG(x) {CurrentThread::init(); std::unique_lock<std::mutex> lk(io_lock); cout << current_thread << x << endl;}
#else
#define LOG(x)
#endif
This can be used like this.
LOG("Waiting for some event");
And it will give log output
[Thread-1] - Entering critical section
[Thread-2] - Waiting on mutex
[Thread-1] - Leaving critical section, unlocking the mutex

Related

using ostream as function argument

I have a basic problem with understanding what ostream is exactly. I know that it's a base class for the output stream, but I can't quite grasp when to use it and why to use it instead of just saying std::cout.
So here I have this example where I have to create a new class named stack with a pop() function (just as in the class already provided by C++).
Here list_node is a struct which consists of two elements: the key (which is an integer) and a pointer to the next node.
Definition of list_node (already given):
struct list_node {
int key;
list_node* next;
// constructor
list_node (int k, list_node* n)
: key (k), next (n) {}
};
and here is the definition of the class (already given as well):
class stack {
public:
void push (int value) {...}
...
private:
list_node* top_node;
};
and here's the part I'm having trouble with:
void print (std::ostream& o) const
{
const list_node* p = top_node;
while (p != 0) {
o << p->key << " "; // 1 5 6
p = p->next;
}
}
I don't understand why they use ostream& o as the function argument. Couldn't they have just taken top_node as the argument, followed its next pointer (next refers to the next list_node), and printed everything with std::cout? Why is it better to do it the way they did?
Why is it better to do it the way they did?
I am not sure of your question, and not sure it is a better way.
Perhaps the intent was for flexibility. Here is an example from my app library:
When I declare a data attribute as an ostream
class T431_t
{
// ...
std::ostream* m_so;
// ...
I can trivially use that attribute to deliver a report to 'where-m_so-points'. In this app, there are several examples of *m_so << ... being used. Here is the primary example.
inline void reportProgress()
{
// ...
*m_so << " m_blk = " << m_blk
<< " m_N = 0x" << std::setfill('0') << std::hex << std::setw(16) << m_N
<< " " << std::dec << std::setfill(' ') << std::setw(3) << durationSec
<< "." << std::dec << std::setfill('0') << std::setw(3) << durationMSec
<< " sec (" << std::dec << std::setfill(' ') << std::setw(14)
<< digiComma(m_N) << ")" << std::endl;
// ...
}
Note that in the class constructor (ctor), there is a default assignment for m_so to std::cout.
T431_t(uint64_t maxDuration = MaxSingleCoreMS) :
// ..
m_so (&std::cout), // ctor init - default
// ..
{
// ...
When the user selects the dual-thread processing option (a command line option that runs the app in about half the time by using both processors of my desktop), the reports can become hard to read if I allow the two independent output streams to intertwine on the user's screen. Thus, in the object instance being run by thread 2, m_so is set to something different.
The following data attribute captures and holds thread 2 output for later streaming to std::cout.
std::stringstream m_ssMagic; // dual threads use separate out streams
Thread 2 is launched and the thread sets its private m_so:
void exec2b () // thread 2 entry
{
m_now = Clock_t::now();
m_so = &m_ssMagic; // m_so points to m_ssMagic
// ...
m_ssMagic << " execDuration = " << m_ssDuration.str()
<< " (b) " << std::endl;
} // exec2b (void)
While thread 1 uses std::cout, and thread 2 uses m_ssMagic, 'main' (thread 0) simply waits for the joins.
The joins coordinate the thread completion, which typically happens at about the same time. Main (thread 0) then writes the m_ssMagic contents to std::cout.
//...
// main thread context:
case 2: // one parameter: 2 threads each runs 1/2 of tests
{ // use two new instances
T431_t t431a(MaxDualCoreMS); // lower test sequence
T431_t t431b(MaxDualCoreMS); // upper test sequence
// 2 additional threads started here
std::thread tA (&T431_t::exec2a, &t431a);
std::thread tB (&T431_t::exec2b, &t431b);
// 2 join's - thread main (0) waits for each to complete
tA.join();
tB.join();
// tA outputs directly to std::cout
// tB captured output to T431_t::m_ssMagic.
// both thread 1 and 2 have completed, so ok to:
std::cout << t431b.ssMagicShow() << std::endl;
retVal = 0;
} break;
To be complete, here is
std::string ssMagicShow() { return (m_ssMagic.str()); }
Summary
I wrote the single thread application first. After getting that working, I searched for a 'simple' way to make use of the second core on my desktop.
As part of my first refactor, I a) added "std::ostream* m_so" initialized to &std::cout, and b) found all uses of std::cout. Most of these I simply replaced with "*m_so". I then c) confirmed that I had not broken the single thread solution. Quite easy, and worked the first try.
Subsequent effort implemented the command line 'dual-thread' option.
I think this approach will apply to my next desktop, when budget allows.
And from an OOP standpoint, this effort works because std::ostream is in the class hierarchy of both std::cout and std::stringstream. Thus
"std::cout is-a std::ostream",
and
"std::stringstream is-a std::ostream".
So m_so can point to an instance of either one, and provide 'ostream access' to either destination through the same interface.
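Applied to the stack from the question, the same idea can be shown in a few lines. This is a small sketch, not the course's actual code: list_node is simplified to a plain aggregate, print is written as a free function, and keys_as_string is a hypothetical helper added for illustration.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Simplified node type (the question's version also has a constructor).
struct list_node {
    int key;
    list_node* next;
};

// Writes the keys to whatever stream the caller supplies: std::cout,
// a file stream, a string stream, ...
void print(const list_node* top, std::ostream& o) {
    for (const list_node* p = top; p != nullptr; p = p->next) {
        o << p->key << " ";
    }
}

// Same print function, different destination: capture the output in a string.
std::string keys_as_string(const list_node* top) {
    std::ostringstream buffer;
    print(top, buffer);
    return buffer.str();
}
```

Calling print(top, std::cout) gives the behaviour the question expected, while keys_as_string reuses the very same function for a unit test or a log file. Had print written to std::cout directly, neither would be possible without duplicating the loop.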

Processing all passed overloads at once

I'm tired of writing debug code on the spot and including <iostream> in every single file. So I wanted to make myself a universal, self-contained and lightweight debug class that I would just include in the header and forget.
I want to use something along the lines of
#include "debug.hpp"
debug DBG;
DBG << "foo and" << " bar";
//Or even better, just include it and do debug() << "foo and" << " bar";
So, I wrote this:
#include <iostream>
#include <string>
#include <chrono>
#include <ctime>
class Debug
{
public:
Debug &operator<<(std::string arg_0)
{
auto tempTime = std::chrono::system_clock::to_time_t(
std::chrono::system_clock::now() );
std::string timeString(ctime(&tempTime)); // must be a std::string for substr() below
timeString = timeString.substr(timeString.find(':') - 2, 8);
std::cout << timeString << " >> " << arg_0 << '\n';
return *this;
}
};
But of course, this doesn't work because, as I've learned, every use of the overloaded operator calls this function (is it still called a function?) separately, producing:
hour:minute:second >> foo and
hour:minute:second >> bar
Is there any way I could pass everything at once after the first overloaded operator appears? Maybe as a stringstream? Also, I won't only be passing strings but anything I need. Will this require me to manually create a separate overload for every single type that I may pass?
P.S: A cross-platform solution is optional, but welcome (currently developing on Linux)
You may return another class to do the job, something like:
class Helper
{
public:
~Helper() { std::cout << "\n"; }
template<typename T>
friend Helper&& operator << (Helper&&h, const T& t) {
std::cout << t;
return std::move(h);
}
};
class Debug
{
public:
template<typename T>
friend Helper operator<<(Debug&, const T& t)
{
auto tempTime = std::chrono::system_clock::to_time_t(
std::chrono::system_clock::now() );
std::string timeString{ctime(&tempTime)}; // must be a std::string for substr() below
timeString = timeString.substr(timeString.find(':') - 2, 8);
std::cout << timeString << " >> " << t;
return Helper{};
}
};
Each time you call operator<<, your code prints the time stamp and \n. And that's the problem. To avoid that, you can print the time stamp in the constructor of Debug, and print \n in the destructor.
class Debug {
public:
Debug() {
auto tempTime = std::chrono::system_clock::to_time_t(
std::chrono::system_clock::now() );
std::string timeString(ctime(&tempTime));
timeString = timeString.substr(timeString.find(':') - 2, 8);
std::cout << timeString;
}
~Debug() {
std::cout << "\n";
}
Debug &operator<<(std::string arg_0) {
std::cout << " >> " << arg_0;
return *this;
}
};
In order to debug types other than string, you make operator<< a template:
template <typename T>
Debug &operator<<(T &&arg_0) {
std::cout << " >> " << std::forward<T>(arg_0);
return *this;
}
I see 2 design problems here:
You are trying to create a stream-like object. That means it doesn't know when the line ends until you send an end-of-line to it. Without this information, it doesn't know when to add the prefix to "your" line and print it. Consider the following two situations:
DBG << "foo and" << " bar";
and
DBG << "foo and";
... (a lot of code) ...
DBG << " bar";
They look exactly the same inside your Debug class, because:
DBG << "foo and" << " bar"; == (DBG.operator<<("foo and")).operator<<(" bar");
And this is the same as:
DBG.operator<<("foo and");
DBG.operator<<(" bar");
So you have to decide how to define the end of the message you want to print (and when do you want to measure the time: At the beginning or at the end of the message?).
When do you want to flush your stream? You have to send std::endl or std::flush to std::cout to flush it. Sending "\n" does not flush std::cout (this is an important difference between std::endl and "\n"). If you do not flush it, the text may be printed several minutes or hours later (it will wait in a buffer). On the other hand, frequent buffer flushing may be a performance killer in an application producing a large amount of text.
Try to define how your stream should behave when you send to it "\n", std::endl and std::flush (std::endl should be converted to "\n"+std::flush).
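One possible set of answers to these two design questions is sketched below: a message ends at '\n', the prefix is added once per completed line, "\n" does not flush, and std::endl and std::flush both flush. The class name LineDebug and the fixed prefix string are assumptions made for illustration (the question uses a timestamp, which could be generated at the same place).

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical line-oriented debug stream: text is held back until a '\n'
// completes the line, then written once with the prefix in front.
class LineDebug {
public:
    LineDebug(std::ostream& out, std::string prefix)
        : out_(out), prefix_(std::move(prefix)) {}

    template <typename T>
    LineDebug& operator<<(const T& value) {
        std::ostringstream piece;  // format the value, then append it
        piece << value;
        pending_ += piece.str();
        emit_complete_lines();     // "\n" ends a line but does not flush
        return *this;
    }

    // std::endl and std::flush arrive here: endl adds '\n', both flush.
    LineDebug& operator<<(std::ostream& (*manip)(std::ostream&)) {
        std::ostringstream piece;
        manip(piece);
        pending_ += piece.str();
        emit_complete_lines();
        out_.flush();
        return *this;
    }

private:
    void emit_complete_lines() {
        std::string::size_type end;
        while ((end = pending_.find('\n')) != std::string::npos) {
            out_ << prefix_ << pending_.substr(0, end + 1);
            pending_.erase(0, end + 1);
        }
    }

    std::ostream& out_;
    std::string prefix_;
    std::string pending_;  // unfinished tail of the current line
};
```

With this definition, DBG << "foo and"; followed much later by DBG << " bar" << std::endl; still yields a single prefixed line, which is the case the two call patterns above cannot otherwise distinguish. Formatting each piece in a temporary stream means flags such as std::hex do not carry over between insertions; that is a simplification of this sketch.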
About other questions:
I would use simple template to "transfer" parameter of operator<<() to std::cout. It would allow to use your class for any type that can be printed by std::cout. To make things simpler you can define the operator<<() outside your class, eg.:
template<typename tParam>
Debug &operator<<(Debug& stream, tParam const & myParam)
{
...
return stream;
}

Logging function which uses operator <<

I'd like to write a function for logging which should be used like this:
log(__FILE__) << "My message containing integer: " << 123 << " and double: " << 1.2;
This should print the following line, add endl and flush immediately:
main.cpp: My message containing integer: 123 and double: 1.2
My (simplified) attempt for the implementation of the function:
class Writer
{
public:
template<typename T>
Writer & operator<<(T t)
{
cout << t << endl;
cout.flush();
return (*this);
}
};
Writer log(const char* fileName)
{
cout << fileName << ": ";
return Writer();
}
int main(int argc, const char *argv[])
{
log(__FILE__) << "My message containing integer: " << 123 << "and double: " << 1.2;
return 0;
}
My problem is that because of L-R associativity of the operator<< the output is:
main.cpp: My message containing integer:
123
and double:
1.2
Is there any way how to implement the function or is my requirement for its usage unrealizable?
Ideally I'd like to use plain C++03 (i.e. no C++11 features, boost and non-standard libraries).
L-R associativity is not related to your problem (if you talk about line breaks). The problem is because you use endl after each write. You don't need it (and if you do that, then you don't need flush, because endl already flushes the output).
The easy solution to your problem:
class Writer
{
public:
template<typename T>
Writer & operator<<(T t)
{
cout << t;
return (*this);
}
~Writer()
{
try {
cout << endl;
}
catch (...) {
// You have to make sure that no
// exception leaves destructor
}
}
};
It is also worth noting that your approach is not really scalable: it cannot be used safely in a multi-threaded environment. Assume that two threads are writing to your log:
Thread 1: log(__FILE__) << "a" << "b" << "c";
Thread 2: log(__FILE__) << "a" << "b" << "c";
Here you can easily get a message "aabbcc\n\n" in your logfile, which is highly undesirable.
In order to avoid that, you can have a static mutex object inside log() function, which you pass into Writer constructor. Then you have to lock it in the constructor and unlock it in the destructor. It will guarantee the synchronization of concurrent writing of different entries.

Synchronizing output to std::cout

I have an application wherein several threads write to std::cout and I was looking for a simple solution to prevent data being sent to std::cout from being garbled up.
For example, if I have 2 threads and both output:
std::cout << "Hello" << ' ' << "from" << ' ' << "thread" << ' ' << n << '\n';
I might see something like:
HelloHello from fromthread 2
thread 1
What I would like to see is:
Hello from thread 2
Hello from thread 1
The order in which the lines are displayed is not very important, as long as they don't get intermixed.
I came up with the following fairly simple implementation:
class syncstream : public std::ostringstream {
public:
using std::ostringstream::ostringstream;
syncstream& operator<<(std::ostream& (*pf)(std::ostream&) ) { pf(*this); return *this; }
syncstream& operator<<(std::ios& (*pf)(std::ios&) ) { pf(*this); return *this; }
syncstream& operator<<(std::ios_base& (*pf)(std::ios_base&)) { pf(*this); return *this; }
template<typename T>
syncstream& operator<<(T&& token) {
static_cast<std::ostringstream&>(*this) << std::forward<T>(token);
return *this;
}
};
inline std::ostream& operator&&(std::ostream& s, const syncstream& g) { return s << g.str(); }
#define synced(stream) stream && syncstream()
Sorry about the macro.
So, now in my threads I can do:
synced(std::cout) << "Hello" << ' ' << "from" << ' ' << "thread" << ' ' << n << '\n';
I wrote the above because of my initial misunderstanding of §27.4.1. But, surprisingly it works very well.
I wrote the following test case:
void proc(int n) {
synced(std::cout) << "Hello" << ' ' << "world" << ' ' << "from" << ' ' << "thread" << ' ' << n << '\n';
}
int main() {
std::vector<std::thread> threads;
for(int n = 0; n < 1000; ++n) threads.push_back(std::thread(std::bind(proc, n)));
for(std::thread& thread: threads) thread.join();
return 0;
}
(full version here) and ran it with both g++ 4.8.3 and clang++ 3.5.1 (with libstdc++ and libc++) on my system.
Testing was done with a script, which runs the test case 1000 times generating 1 million output lines and then parses the output for any garbled lines.
I cannot make it not work (ie, produce garbled lines).
So my question is:
Why does the above implementation work?
With regards to thread safety: it's thread safe in the sense that it
won't cause a data race. But only as long as the target is one of the
standard stream objects (std::cout, etc.), and only as long as they
remain synched with stdio. That's all the standard guarantees. And
even then, you can still end up with interleaved characters.
I've had to deal with this problem a lot in the past. My solution has
always been a wrapper class, with a pointer to the actual
std::ostream, and a template:
template <typename T>
SynchedOutput& operator<<( T const& obj )
{
if ( myStream != nullptr ) {
(*myStream) << obj;
}
return *this;
}
The constructor of SynchedOutput then acquires a mutex lock, and the
destructor frees it, so you can write:
SynchedOutput( stream, mutex ) << ...;
(In my case, I was returning the temporary from a function, and was
doing so before C++11 and its move semantics, so my code was a bit more
complicated; I had to support copy, and keep track of the count of the
copies, so that I could unlock when the last one was destructed. Today,
just implement move semantics, and no copy, if you want to return the
instance from a function.)
The issue here is ensuring that everyone is using the same mutex. One
possibility might be to have the constructor look up the mutex in an
std::map indexed on the address of the stream object. This lookup
requires a global lock, so you can even construct a new mutex if the
stream object doesn't have one. The real issue is ensuring that the
mutex is removed from the map when the stream is destructed.
This appears thread-safe in the sense of not producing garbled lines, provided each output ends with a new line. However, it does change the nature of the stream output, in particular with respect to flushing.
1 synced(std::cerr) will be buffered (into your syncstream), while std::cerr is never buffered.
2 there is no guarantee that
synced(std::cout) << "a=" << 128 << std::endl;
actually flushes the buffer of std::cout, since all std::cout gets is the string "a=128\n".
A stronger interpretation of thread-safe would be that the order of output reflects the order, if any, of output calls. That is if
synced(std::cout) << "a=" << 128 << std::endl;
on thread A is guaranteed (by means of locks for example) to precede the same call on thread B, then the output of A should always precede that of B. I don't think that your code achieves that.

multiple threads writing to std::cout or std::cerr

I have OpenMP threads that write to the console via cout and cerr. This of course is not safe, since output can be interleaved. I could do something like
#pragma omp critical(cerr)
{
cerr << "my variable: " << variable << endl;
}
It would be nicer if I could replace cerr with a thread-safe version, similar to the approach explained in the valgrind DRD manual (http://valgrind.org/docs/manual/drd-manual.html#drd-manual.effective-use) which involves deriving a class from std::ostreambuf. Ideally in the end I would just replace cerr with my own threaded cerr, e.g. simply:
tcerr << "my variable: " << variable << endl;
Such a class could print to the console as soon as it encounters an "endl". I do not mind if lines from different threads are interleaved, but each line should come only from one thread.
I do not really understand how all this streaming in C++ works, it is too complicated. Has anybody such a class or can show me how to create such a class for that purpose?
As others pointed out, in C++11, std::cout is thread-safe.
However if you use it like
std::cout << 1 << 2 << 3;
with different threads, the output can still be interleaved, since every << is a new function call which can be preceded by any function call on another thread.
To avoid interleaving without a #pragma omp critical - which would lock everything - you can do the following:
std::stringstream stream; // #include <sstream> for this
stream << 1 << 2 << 3;
std::cout << stream.str();
The three calls writing 123 to the stream are happening in only one thread to a local, non-shared object, therefore aren't affected by any other threads. Then, there is only one call to the shared output stream std::cout, where the order of items 123 is already fixed, therefore won't get messed up.
You can use an approach similar to a string builder. Create a non-template class that:
offers templated operator<< for insertion into this object
internally builds into a std::ostringstream
dumps the contents on destruction
Rough approach:
class AtomicWriter {
std::ostringstream st;
public:
template <typename T>
AtomicWriter& operator<<(T const& t) {
st << t;
return *this;
}
~AtomicWriter() {
std::string s = st.str();
std::cerr << s;
//fprintf(stderr,"%s", s.c_str());
// write(2,s.c_str(),s.size());
}
};
Use as:
AtomicWriter() << "my variable: " << variable << "\n";
Or in more complex scenarios:
{
AtomicWriter w;
w << "my variables:";
for (auto & v : vars) {
w << ' ' << v;
}
} // now it dumps
You will need to add more overloads if you want manipulators. For the atomic write in the destructor you can use write instead of fprintf or std::cerr, and you can generalize the class so that the destination (std::ostream, file descriptor, or FILE*) is passed to the constructor.
I don't have enough reputation to post a comment, but I wanted to post my addition to the AtomicWriter class to support std::endl and allow for other streams to be used besides std::cout. Here it is:
class AtomicWriter {
std::ostringstream st;
std::ostream &stream;
public:
AtomicWriter(std::ostream &s=std::cout):stream(s) { }
template <typename T>
AtomicWriter& operator<<(T const& t) {
st << t;
return *this;
}
AtomicWriter& operator<<( std::ostream&(*f)(std::ostream&) ) {
st << f;
return *this;
}
~AtomicWriter() { stream << st.str(); }
};
Put the following code in header file atomic_stream_macro.h
#ifndef atomic_stream_macro_h
#define atomic_stream_macro_h
#include <mutex>
/************************************************************************/
/************************************************************************/
extern std::mutex print_mutex;
#define PRINT_MSG(out,msg) \
{ \
std::unique_lock<std::mutex> lock (print_mutex); \
\
out << __FILE__ << "(" << __LINE__ << ")" << ": " \
<< msg << std::endl; \
}
/************************************************************************/
/************************************************************************/
#endif
Now the macro can be used from a file as follows.
#include "atomic_stream_macro.h"
#include <iostream>
int foo (void)
{
PRINT_MSG (std::cout, "Some " << "Text " << "Here ");
return 0; // a non-void function must return a value
}
Finally, in the main.cxx, declare the mutex.
#include <mutex>
std::mutex print_mutex;
int main (void)
{
// launch threads from here
}
You could do it by inheriting std::basic_streambuf, and override the correct functions to make it threadsafe. Then use this class for your stream objects.