How to make sure a chained logging statement is atomic? - c++

I have a logging class which has operator<< overloaded. So I can do things like this:
oLogger << "Log this" << " and this" << " and " << 10 << endl;
oLogger << "Something else" << endl;
The logger does this without any problems. But, I want the logger object to be shared among threads. Then, I don't want it printing out something like this:
//LogFILE
Log this and this Something else
and 10
So, I need to lock a whole chain of operator<< calls. I am guessing this can be done with RAII, but I haven't given it much thought yet. In the meantime, is there any traditional way of getting this done? (Except ending input with a manipulator?)

Slight alternative to Nim's answer:
Create
class LockedLog {
static MutEx mutex; // global mutex for logging
ScopedLock lock; // some scoped locker to hold the mutex
Logger &oLogger; // reference to the log writer itself
public:
LockedLog(Logger &oLogger) : lock(mutex), oLogger(oLogger) {} // initializers in member declaration order
template <typename T>
LockedLog &operator<<(const T &value) { oLogger << value; return *this; }
};
And either just do:
LockedLog(oLogger) << "Log this" << " and this " << " and " << 10 << endl;
Or change Logger::operator<< to a normal method, call this method in LockedLog::operator<<, and add a cast operator to Logger:
operator LockedLog() { return LockedLog(*this); }
and that should add locking to your current code.
Update: That locks across all the calls to operator<< and may even lock around evaluation of their arguments (depends on whether compiler will evaluate left or right argument first and it may choose). To reduce that, one could:
class LockedLog {
static MutEx mutex; // global mutex for logging
std::stringstream buffer; // temporary formatting buffer;
Logger &oLogger; // reference to the log writer itself
public:
LockedLog(Logger &oLogger) : oLogger(oLogger) {} // no lock member here; the destructor takes the lock
template <typename T>
LockedLog &operator<<(const T &value) { buffer << value; return *this; }
~LockedLog() { ScopedLock lock(mutex); oLogger << buffer.str() << std::flush; }
};
But the stringstream adds some overhead of its own.

One approach is to use a macro, i.e.
#define LOG(x) \
{\
<acquire scoped lock> \
oLogger << x; \
}
then
LOG("Log this" << " and this" << " and " << 10 << endl);
I've also done it using the manipulator approach that you mention above, however the problem is that you need to have operator<< implemented for all the types (i.e. can't use the standard operators that exist)
EDIT: to reduce the time the lock is held, consider something like this:
// The streaming happens in local scope, so no lock is needed there.
// oLogger.write() must acquire the lock itself.
// (A comment after a trailing backslash would break the line continuation.)
#define LOG(x) \
{\
std::ostringstream str; \
str << x; \
oLogger.write(str.str()); \
}

I've found that the best solution is to write a class buffer so that
buffer(oLogger) << "Log this" << " and this" << " and " << 10 << endl;
creates a temporary buffer object, captures and formats the output and writes it to oLogger in its destructor. This is trivially done by wrapping a stringstream. Because every thread has its own buffers, formatting is independent.
For extra fanciness, buffer::~buffer can use several different mechanisms to prevent thread-unsafe access of oLogger. You assumed that operator<< calls from multiple threads might be interleaved. In fact, it's worse; they can be concurrent. You could get "LSoogm ethhiinsg else". Making sure that only one buffer flushes to oLogger at a time prevents this.
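A minimal sketch of such a buffer class, under stated assumptions: the sink is a plain std::ostream standing in for oLogger, and the mutex lives in a hypothetical sink_mutex() helper shared by all buffers. Neither detail comes from the answer above; they are only there to make the example compile.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>

// Collects one log statement privately, then writes it in a single locked step.
class buffer {
public:
    explicit buffer(std::ostream &sink) : sink_(sink) {}
    buffer(const buffer &) = delete;
    buffer &operator=(const buffer &) = delete;

    // Formatting goes into the private stringstream; no lock is needed here.
    template <typename T>
    buffer &operator<<(const T &value) { text_ << value; return *this; }

    // Manipulators such as std::endl are function templates, so the generic
    // overload above cannot deduce them; this overload accepts them.
    buffer &operator<<(std::ostream &(*manip)(std::ostream &)) {
        manip(text_);
        return *this;
    }

    // Runs at the end of the full expression: one locked write per statement.
    ~buffer() {
        std::lock_guard<std::mutex> guard(sink_mutex());
        sink_ << text_.str() << std::flush;
    }

private:
    // Assumption: a single mutex shared by every buffer, whatever the sink.
    static std::mutex &sink_mutex() {
        static std::mutex m;
        return m;
    }
    std::ostream &sink_;
    std::ostringstream text_;
};

// Usage sketch: the temporary dies at the semicolon and flushes there.
inline std::string example() {
    std::ostringstream log_file;  // stands in for oLogger
    buffer(log_file) << "Log this" << " and this" << " and " << 10 << std::endl;
    assert(log_file.str() == "Log this and this and 10\n");
    return log_file.str();
}
```

Because the lock is only taken in the destructor, any function calls made while evaluating the arguments run without holding it.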

I would probably use expression templates here.
The main idea is that it's bloody stupid to acquire the lock during the formatting phase, especially since there might be function calls during this formatting.
You need to use two different phases:
format the log
atomically post the log
This can be accomplished with expression templates:
First call to Logger::operator<< yields a LoggerBuffer that embeds a reference to Logger.
Subsequent calls are performed on LoggerBuffer which deals with all the formatting mess
Upon destruction of LoggerBuffer (at the end of the statement), it locks Logger, passes the formatted string, and unlocks (unless you've got a lock-free queue or something)
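The two phases above could be sketched roughly as follows. This is an illustration, not the answerer's code: the post() method, the std::ostream sink and the newline appended per entry are all assumptions, and manipulators such as endl are left out to keep it short.

```cpp
#include <cassert>
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

class Logger;

// Phase one: formats privately. Phase two happens in the destructor.
class LoggerBuffer {
public:
    explicit LoggerBuffer(Logger &logger) : logger_(&logger) {}
    // Moving hands over the duty to post, so exactly one object posts.
    LoggerBuffer(LoggerBuffer &&other)
        : logger_(other.logger_), text_(std::move(other.text_)) {
        other.logger_ = nullptr;
    }
    ~LoggerBuffer();

    template <typename T>
    LoggerBuffer &operator<<(const T &value) { text_ << value; return *this; }

private:
    Logger *logger_;
    std::ostringstream text_;
};

class Logger {
public:
    explicit Logger(std::ostream &out) : out_(out) {}

    // The first << in a chain creates the buffer; later << calls land on it.
    template <typename T>
    LoggerBuffer operator<<(const T &value) {
        LoggerBuffer buf(*this);
        buf << value;
        return buf;
    }

    // Hypothetical posting step: the only place that takes the lock.
    void post(const std::string &entry) {
        std::lock_guard<std::mutex> guard(mutex_);
        out_ << entry << '\n' << std::flush;
    }

private:
    std::mutex mutex_;
    std::ostream &out_;
};

inline LoggerBuffer::~LoggerBuffer() {
    if (logger_) logger_->post(text_.str());  // moved-from buffers post nothing
}

// Usage sketch: existing call sites keep the oLogger << ... syntax.
inline std::string example() {
    std::ostringstream file;
    Logger oLogger(file);
    oLogger << "Log this" << " and " << 10;
    assert(file.str() == "Log this and 10\n");
    return file.str();
}
```

The appeal of this variant is that call sites do not change; the cost is that Logger::operator<< now returns a different type than callers may expect.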

As I have to internationalize logs, I prefer things like :
oLogger << myAutosprintf(_("My wonderful %s ! I have %d apples"), name, nbApple);
It is way better for translation :) And it will solve your problem. _() is a shortcut for the translation stuff.
You can use gnu::autosprintf, boost.format (thanks to Jan Huec), or write your own.
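If you write your own, a minimal sketch could look like the following. The name format_message is hypothetical, and the translation lookup done by _() is deliberately left out. The point is only that the whole message is formatted first, so the logger sees a single operator<< call.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Formats printf-style arguments into a std::string.
// Hypothetical helper; the caller must pass arguments matching the format.
inline std::string format_message(const char *fmt, ...) {
    va_list args;
    va_start(args, fmt);

    // First pass on a copy of the argument list: measure the output.
    va_list copy;
    va_copy(copy, args);
    const int needed = std::vsnprintf(nullptr, 0, fmt, copy);
    va_end(copy);
    if (needed < 0) {  // encoding error reported by vsnprintf
        va_end(args);
        return std::string();
    }

    // Second pass: write into a buffer with room for the terminating '\0'.
    std::vector<char> text(static_cast<std::size_t>(needed) + 1);
    std::vsnprintf(text.data(), text.size(), fmt, args);
    va_end(args);
    return std::string(text.data(), static_cast<std::size_t>(needed));
}

// Usage sketch with the message from the answer above.
inline std::string example() {
    const std::string line =
        format_message("My wonderful %s ! I have %d apples", "tree", 3);
    assert(line == "My wonderful tree ! I have 3 apples");
    return line;
}
```

Whether a single operator<< call is written atomically still depends on the logger itself, so this only removes the interleaving between chained calls.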
my2c
NB: Edited after good remarks (was too fast, thank you for the comments). I have erased the wrong "first part" statement

Related

Synchronize object

I have an object with an extensive API.
What is the best way to synchronize this object? It already exists in legacy code and is used in hundreds of lines of code.
The naive way is to wrap each API call to the object with a std::mutex. Is there an easier or more elegant way to do it?
I have tried the code below, but would like opinions on it or alternative solutions.
Below is a template wrapper class that locks the object automatically during use, i.e. it locks on creation and unlocks upon destruction.
This pattern is very similar to a scoped lock. However, it is useful only for static objects/singletons; it wouldn't work for different instances of a given object.
template <typename T> class Synced
{
static std::mutex _lock;
T& _value;
public:
Synced(T& val) : _value(val)
{
std::cout << "lock" << endl;
_lock.lock();
}
virtual ~Synced()
{
std::cout << "unlock" << endl;
_lock.unlock();
}
T& operator()()
{
return _value;
}
};
template <class T> std::mutex Synced<T>::_lock;
example class to be used with Synced template class
this could be example of a class mentioned above with tens of API's
class Board
{
public:
virtual ~Board() { cout << "Test dtor " << endl; }
void read() { cout << "read" << endl; }
void write() { cout << "write" << endl; }
void capture() { cout << "capture" << endl; }
};
Example of usage with basic calls. The Synced object is a temporary that is not bound to a named variable, so its destructor is called immediately after the semicolon.
int main(int argc, char* argv[])
{
Board b;
Synced<Board>(b)().read(); // b is the Board declared above
cout << " " << endl;
Synced<Board>(b)().write();
cout << " " << endl;
Synced<Board>(b)().capture();
cout << " " << endl;
return 0;
}
Here is the output of the example above:
lock
read
unlock
lock
write
unlock
lock
capture
unlock
Test dtor
I only use mutexes for very small critical sections, a few lines of code maybe, and only if I control all possible error conditions. For a complex API you may end up with the mutex in an unexpected state. I tend to tackle this sort of thing with the reactor pattern. Whether or not that is practical depends on whether or not you can reasonably use serialization / deserialization for this object. If you have to write serialization yourself, then consider things like API stability and complexity. I personally prefer zeromq for this sort of thing when using it is practical; your mileage may vary.
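For comparison with the wrapper in the question, here is a sketch of a variant that keeps one mutex per wrapped object instead of one static mutex per type. It assumes you are free to change how the object is declared, which may not hold for legacy code. The names Synchronized and Access are hypothetical, and Board is cut down to a single counter.

```cpp
#include <cassert>
#include <mutex>
#include <utility>

template <typename T>
class Synchronized {
public:
    Synchronized() = default;
    explicit Synchronized(T value) : value_(std::move(value)) {}

    // Temporary proxy: holds the lock until the end of the full expression.
    class Access {
    public:
        Access(T &value, std::mutex &m) : lock_(m), value_(value) {}
        T *operator->() { return &value_; }
    private:
        std::unique_lock<std::mutex> lock_;
        T &value_;
    };

    // obj->f() first calls this, then Access::operator->, then f().
    Access operator->() { return Access(value_, mutex_); }

private:
    T value_;
    std::mutex mutex_;  // one mutex per wrapped instance
};

// Hypothetical stand-in for the Board class in the question.
struct Board {
    int reads = 0;
    void read() { ++reads; }
};

// Usage sketch: each statement locks, calls, and unlocks.
inline int example() {
    Synchronized<Board> board;
    board->read();
    board->read();
    const int n = board->reads;
    assert(n == 2);
    return n;
}
```

One caveat: using the arrow twice on the same object within a single expression would try to lock the same std::mutex twice, so this only suits simple call sites.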

How to flush a log message and unlock a mutex automatically in C++?

I wrote this logger. It works fine, but there is something I have not been able to do.
uLOG(warning) << "Test log message " << 123 << uLOGE;
uLOG locks a C++11 mutex and starts writing on a file stream.
uLOGE flushes the stream and unlocks the mutex.
I would like to get the same result with this syntax:
uLOG(warning) << "Test log message " << 123;
so I would like the flush and the unlock to be called automatically at the end of the line.
Which is a possible way to do it?
I tried setting the ios::unitbuf flag, but this forces a flush for every << operator, which is not ideal for SSD wear. And it does not unlock the mutex.
I tried defining a temporary object in uLOG whose destructor would flush and unlock, but that forces me to put the log line in its own code block: { uLOG(warning) << 123; }
Reference
You need to redesign your logging framework, so that uLOG is a class that you instantiate, and whose destructor does the work of your uLOGE macro.
Very simple example:
struct uLOG
{
uLOG(std::string const& type)
{
std::cout << "Log: " << type << " - ";
}
template<typename T>
uLOG& operator<<(T const& output)
{
std::cout << output;
return *this;
}
~uLOG()
{
std::cout << " (end of log)" << std::endl;
}
};
// ...
uLOG("warning") << "log message" << 123;
The above, in a suitable program, should print
Log: warning - log message123 (end of log)
This solution should not require the use of braces, so could be used in a single-statement un-braced if or loop.
Your second approach is correct, if implemented correctly it does not require braces. Did you do it with macros? They are not needed here.
uLOG should be a function that returns a temporary object of a Log writer class. As you outlined, it should lock in ctor and flush and unlock in dtor and also have templated operator<< for all types, that just forwards the call to actual log destination.
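A rough sketch of that design, with several assumptions: the writer class is called LogLine here, the level is passed as a string rather than the warning token used in the question, and the destination defaults to std::clog. None of these come from the original logger.

```cpp
#include <cassert>
#include <iostream>
#include <mutex>
#include <sstream>
#include <string>

// Hypothetical writer: locks in the constructor, flushes and unlocks at the end.
class LogLine {
public:
    LogLine(std::ostream &out, std::mutex &m, const char *level)
        : out_(out), lock_(m) {
        out_ << level << ": ";
    }
    // Needed to return by value before C++17; a moved-from object owns no lock.
    LogLine(LogLine &&) = default;

    ~LogLine() {
        // The body runs before lock_ is destroyed, so the flush is still guarded.
        if (lock_.owns_lock()) out_ << std::endl;
    }

    // Forward everything to the real destination.
    template <typename T>
    LogLine &operator<<(const T &value) { out_ << value; return *this; }

private:
    std::ostream &out_;
    std::unique_lock<std::mutex> lock_;
};

// uLOG as a function returning a temporary; no macro and no braces required.
inline LogLine uLOG(const char *level, std::ostream &out = std::clog) {
    static std::mutex m;  // assumption: one mutex for all log lines
    return LogLine(out, m, level);
}

// Usage sketch: the temporary is destroyed at the semicolon.
inline std::string example() {
    std::ostringstream sink;
    uLOG("warning", sink) << "Test log message " << 123;
    assert(sink.str() == "warning: Test log message 123\n");
    return sink.str();
}
```

Unlike the buffering approach, this holds the mutex while the arguments are being formatted, which matches the behaviour described in the question.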

How do I print in a new thread without threads interrupting lines? (particularly c++)

I've worked a decent amount with threading in C on linux and now I'm trying to do the same but with c++ on Windows, but I'm having trouble with printing to the standard output. In the function the thread carries out I have:
void print_number(void* x){
int num = *(static_cast<int*> (x));
std::cout << "The number is " << num << std::endl;
}
wrapped in a loop that creates three threads. The problem is that although everything gets printed, the threads seem to interrupt each other between each of the "<<"'s.
For example, the last time I ran it I got
The number is The number is 2The number is 3
1
When I was hoping for each on a separate line. I'm guessing that each thread is able to write to the standard output after another has written a single section between "<<"s. In C, this wasn't a problem because the buffer wasn't flushed until everything I needed to write was there, but I don't think that's the case now. Is this a case of a need for a mutex?
In C++, we first of all would prefer to take arguments as int*. And then, we can just lock. In C++11:
std::mutex mtx; // somewhere, in case you have other print functions
// that you want to control
void print_number(int* num) {
std::unique_lock<std::mutex> lk{mtx}; // RAII. Unlocks when lk goes out of scope
std::cout << "The number is " << *num << std::endl;
}
If not C++11, there's boost::mutex and boost::mutex::scoped_lock that work the same way and do the same thing.
Your C example worked by accident; printf and the like aren't atomic either.
This is indeed a case for a mutex. I typically declare it as a function-local static. E.g.:
void atomic_print(/*args*/) {
static MyMutex mutex;
mutex.acquire();
printf(/*with the args*/);
mutex.release();
}

iostream thread safety, must cout and cerr be locked separately?

I understand that to avoid output intermixing access to cout and cerr by multiple threads must be synchronized. In a program that uses both cout and cerr, is it sufficient to lock them separately? or is it still unsafe to write to cout and cerr simultaneously?
Edit clarification: I understand that cout and cerr are "Thread Safe" in C++11. My question is whether or not a write to cout and a write to cerr by different threads simultaneously can interfere with each other (resulting in interleaved output and such) in the way that two writes to cout can.
If you execute this function:
void f() {
std::cout << "Hello, " << "world!\n";
}
from multiple threads you'll get a more-or-less random interleaving of the two strings, "Hello, " and "world!\n". That's because there are two function calls, just as if you had written the code like this:
void f() {
std::cout << "Hello, ";
std::cout << "world!\n";
}
To prevent that interleaving, you have to add a lock:
std::mutex mtx;
void f() {
std::lock_guard<std::mutex> lock(mtx);
std::cout << "Hello, " << "world!\n";
}
That is, the problem of interleaving has nothing to do with cout. It's about the code that uses it: there are two separate function calls inserting text, so unless you prevent multiple threads from executing the same code at the same time, there's a potential for a thread switch between the function calls, which is what gives you the interleaving.
Note that a mutex does not prevent thread switches. In the preceding code snippet, it prevents executing the contents of f() simultaneously from two threads; one of the threads has to wait until the other finishes.
If you're also writing to cerr, you have the same issue, and you'll get interleaved output unless you ensure that you never have two threads making these inserter function calls at the same time, and that means that both functions must use the same mutex:
std::mutex mtx;
void f() {
std::lock_guard<std::mutex> lock(mtx);
std::cout << "Hello, " << "world!\n";
}
void g() {
std::lock_guard<std::mutex> lock(mtx);
std::cerr << "Hello, " << "world!\n";
}
In C++11, unlike in C++03, the insertion to and extraction from global stream objects (cout, cin, cerr, and clog) are thread-safe. There is no need to provide manual synchronization. It is possible, however, that characters inserted by different threads will interleave unpredictably while being output; similarly, when multiple threads are reading from the standard input, it is unpredictable which thread will read which token.
Thread-safety of the global stream objects is active by default, but it can be turned off by calling the static member function std::ios_base::sync_with_stdio with false as its argument. In that case, you would have to handle the synchronization manually.
It may be unsafe to write to cout and cerr simultaneously!
It depends on whether cout is tied to cerr or not. See std::ios::tie.
"The tied stream is an output stream object which is flushed before
each i/o operation in this stream object."
This means, that cout.flush() may get called unintentionally by the thread which writes to cerr.
I spent some time to figure out, that this was the reason for randomly missing line endings in cout's output in one of my projects :(
With C++98, cerr should not be tied to cout. But despite the standard, it is tied when using MSVC 2008 (my experience). With the following code, everything works well.
std::ostream *cerr_tied_to = cerr.tie();
if (cerr_tied_to) {
if (cerr_tied_to == &cout) {
cerr << "DBG: cerr is tied to cout ! -- untying ..." << endl;
cerr.tie(0);
}
}
See also: why cerr flushes the buffer of cout
There are already several answers here. I'll summarize and also address interactions between them.
Typically,
std::cout and std::cerr will often be funneled into a single stream of text, so locking them in common results in the most usable program.
If you ignore the issue, cout and cerr by default alias their stdio counterparts, which are thread-safe as in POSIX, up to the standard I/O functions (C++14 §27.4.1/4, a stronger guarantee than C alone). If you stick to this selection of functions, you get garbage I/O, but not undefined behavior (which is what a language lawyer might associate with "thread safety," irrespective of usefulness).
However, note that while standard formatted I/O functions (such as reading and writing numbers) are thread-safe, the manipulators to change the format (such as std::hex for hexadecimal or std::setw for limiting an input string size) are not. So, one can't generally assume that omitting locks is safe at all.
If you choose to lock them separately, things are more complicated.
Separate locking
For performance, lock contention may be reduced by locking cout and cerr separately. They're separately buffered (or unbuffered), and they may flush to separate files.
By default, cerr flushes cout before each operation, because they are "tied." This would defeat both separation and locking, so remember to call cerr.tie( nullptr ) before doing anything with it. (The same applies to cin, but not to clog.)
Decoupling from stdio
The standard says that operations on cout and cerr do not introduce races, but that can't be exactly what it means. The stream objects aren't special; their underlying streambuf buffers are.
Moreover, the call std::ios_base::sync_with_stdio is intended to remove the special aspects of the standard streams — to allow them to be buffered as other streams are. Although the standard doesn't mention any impact of sync_with_stdio on data races, a quick look inside the libstdc++ and libc++ (GCC and Clang) std::basic_streambuf classes shows that they do not use atomic variables, so they may create race conditions when used for buffering. (On the other hand, libc++ sync_with_stdio effectively does nothing, so it doesn't matter if you call it.)
If you want extra performance regardless of locking, sync_with_stdio(false) is a good idea. However, after doing so, locking is necessary, along with cerr.tie( nullptr ) if the locks are separate.
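As a sketch of the separate-locking setup described above: the function names and the two global mutexes below are hypothetical, and prepare_streams() is assumed to run once before any thread writes.

```cpp
#include <cassert>
#include <iostream>
#include <mutex>
#include <string>

std::mutex cout_mutex;  // guards std::cout only
std::mutex cerr_mutex;  // guards std::cerr only

// Call once, before any output and before starting threads.
inline void prepare_streams() {
    std::ios_base::sync_with_stdio(false);  // decouple the streams from stdio
    std::cerr.tie(nullptr);                 // cerr no longer flushes cout
    assert(std::cerr.tie() == nullptr);     // postcondition of the line above
}

// Writers to cout contend only with other writers to cout.
inline void write_out(const std::string &line) {
    std::lock_guard<std::mutex> guard(cout_mutex);
    std::cout << line << '\n';
}

// Writers to cerr contend only with other writers to cerr.
inline void write_err(const std::string &line) {
    std::lock_guard<std::mutex> guard(cerr_mutex);
    std::cerr << line << '\n';
}
```

If both streams end up on the same terminal, the relative order of cout and cerr lines is no longer coordinated; a single shared mutex is the simpler choice in that case.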
This may be useful ;)
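#include <cstdarg>
#include <cstdio>
#include <mutex>
inline static void log(const char *format, ...) {
static std::mutex locker;
std::lock_guard<std::mutex> guard(locker); // a named guard holds the lock until return
va_list list;
va_start(list, format); // the last named parameter must not be a reference
vfprintf(stderr, format, list);
va_end(list);
}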
I use something like this:
// Wrap a mutex around cerr so multiple threads don't overlap output
// USAGE:
// LockedLog() << a << b << c;
//
class LockedLog {
public:
LockedLog() { m_mutex.lock(); }
~LockedLog() { *m_ostr << std::endl; m_mutex.unlock(); }
template <class T>
LockedLog &operator << (const T &msg)
{
*m_ostr << msg;
return *this;
}
private:
static std::ostream *m_ostr;
static std::mutex m_mutex;
};
std::mutex LockedLog::m_mutex;
std::ostream* LockedLog::m_ostr = &std::cerr;

BOOST threading : cout behavior

I am new to Boost threading and I am stuck with how output is performed from multiple threads.
I have a simple boost::thread counting down from 9 to 1; the main thread waits and then prints "LiftOff..!!"
#include <iostream>
#include <boost/thread.hpp>
using namespace std;
struct callable {
void operator() ();
};
void callable::operator() () {
int i = 10;
while(--i > 0) {
cout << "#" << i << ", ";
boost::this_thread::yield();
}
cout.flush();
}
int main() {
callable x;
boost::thread myThread(x);
myThread.join();
cout << "LiftOff..!!" << endl;
return 0;
}
The problem is that I have to use an explicit "cout.flush()" statement in my thread to display the output. If I don't use flush(), I only get "LiftOff!!" as the output.
Could someone please advise why I need to use flush() explicitly?
This isn't specifically thread related, as cout will usually buffer on a per-thread basis and only output when the implementation decides to. So in the thread, the output will only appear on an implementation-specific basis; by calling flush you are forcing the buffers to be flushed.
This will vary across implementations - usually though it's after a certain amount of characters or when a new line is sent.
I've found that multiple threads writing to the same stream or file is mostly OK, provided that the output is performed as atomically as possible. It's not something that I'd recommend in a production environment though, as it is too unpredictable.
This behaviour seems to depend on the OS-specific implementation of the cout stream. I guess that write operations on cout are buffered in some thread-specific memory in your case, and the flush() operation forces them to be printed on the console. I guess this because endl includes a call to flush(), and the endl in your main function doesn't see your changes even after the thread has been joined.
BTW, it would be a good idea to synchronize outputs to an ostream shared between threads anyway, otherwise you might see them intermingled. We do so for our logging classes, which use a background thread to write the logging messages to the associated ostream.
Given the short length of your messages, there's no reason anything should appear without a flush. (Don't forget that std::endl is the equivalent of << '\n' << std::flush.)
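The equivalence mentioned in the parenthesis can be checked with a small sketch. CountingBuf and count_flushes are hypothetical names; the buffer simply counts how often the stream asks it to synchronize.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>

// A string buffer that counts flush requests from the stream.
struct CountingBuf : std::stringbuf {
    int syncs = 0;
    int sync() override {
        ++syncs;
        return std::stringbuf::sync();
    }
};

enum class LineEnd { Endl, NewlineThenFlush, NewlineOnly };

// Writes one short line ending in the chosen way and reports the flush count.
inline int count_flushes(LineEnd how) {
    CountingBuf buf;
    std::ostream out(&buf);
    out << "#1";
    switch (how) {
        case LineEnd::Endl:             out << std::endl;          break;
        case LineEnd::NewlineThenFlush: out << '\n' << std::flush; break;
        case LineEnd::NewlineOnly:      out << '\n';               break;
    }
    return buf.syncs;
}

// Usage sketch: endl and '\n' followed by flush behave the same way.
inline void example() {
    assert(count_flushes(LineEnd::Endl) == 1);
    assert(count_flushes(LineEnd::NewlineThenFlush) == 1);
    assert(count_flushes(LineEnd::NewlineOnly) == 0);
}
```

A plain '\n' leaves the text in the buffer, which is why short messages may not appear until something flushes the stream.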
I get the asked behaviour with and without flush (gcc 4.3.2 boost 1.47 Linux RH5)
I assume that your cygwin system chooses to implement several std::cout objects with associated std::streambuf. This I assume is implementation specific.
Since flush or endl only forces its buffer to flush onto its OS controlled output sequence the cout object of your thread remains buffered.
Sharing a reference of an ostream between the threads should solve the problem.