I have an application in which I perform costly calculations in parallel worker threads. For simplicity, I write results to stdout directly from these threads.
This worked fine until I changed a few things in an attempt to make the code run faster. First, I replaced std::endl with "\n" to avoid flushing after every line. I also added the following lines to the init part of my main program:
std::cin.tie(nullptr);
std::ios_base::sync_with_stdio(false);
The basic structure of the worker thread code looks like this:
while(true) {
// get data from job queue, protected by unique_lock on std::mutex
// process the data
// print results
{
std::lock_guard<std::mutex> lk(outputMutex_);
std::cout << "print many results" << "\n"; // was originally std::endl
}
}
Since this "optimization", the output of the workers occasionally gets interleaved, i.e. the mutex no longer serves its intended purpose.
Why is this happening? My understanding is that there is just a single stdout stream buffer, and that the data arrives in the corresponding buffer in sequence, even if the output is not flushed from this buffer before releasing the mutex. But that does not seem to be the case...
(I realize that maybe it would be nicer to have the output generated in a separate thread, but then I'd need to pass back these results using another queue, which did not seem necessary here)
Update: Maybe my post was not clear enough. I do not care about the sequence of the results. The problem is that (for the example above) instead of this:
print many results
print many results
print many results
I sometimes get:
print many print many results
results
print many results
And the outputMutex_ is a static member that is shared by all worker threads.
You are accessing std::cout from multiple threads. Access to its buffer is protected by the mutex, but the buffer also needs to be flushed before the mutex is released, and that does not always happen automatically.
std::endl flushes cout; '\n' does not.
Alternatively, you can tell cout to flush explicitly with std::flush:
https://en.cppreference.com/w/cpp/io/manip/flush
Try:
while(true) {
// get data from job queue, protected by unique_lock on std::mutex
// process the data
// print results
{
std::lock_guard<std::mutex> lk(outputMutex_);
std::cout << "print many results" << "\n"; // not flushed
std::cout << std::flush; // flushed!
std::cout << "print something else" << std::endl; // flushed!
}
}
More on:
https://stackoverflow.com/a/22026764/13735754
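A minimal sketch of the suggestion above, under a few assumptions: each worker formats its text into a private std::ostringstream, then takes the mutex only for a single write followed by std::flush. The helper name run_workers and its parameters are hypothetical and not part of the original code; the shared stream is passed in so that the same code works with std::cout or with a string stream.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: starts `workers` threads that each emit
// `lines_per_worker` lines to the shared stream `out`.
void run_workers(std::ostream& out, int workers, int lines_per_worker) {
    std::mutex outputMutex;  // shared by all workers
    std::vector<std::thread> pool;
    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&out, &outputMutex, lines_per_worker] {
            for (int i = 0; i < lines_per_worker; ++i) {
                // Format into a private buffer; no lock is needed here.
                std::ostringstream line;
                line << "print many " << "results" << '\n';

                // One write and one flush while holding the lock.
                std::lock_guard<std::mutex> lk(outputMutex);
                out << line.str() << std::flush;
            }
        });
    }
    for (auto& t : pool) t.join();
}
```

Calling run_workers(std::cout, 4, 100) would print 400 lines. Flushing once per block rather than once per line keeps the number of flushes low while still emptying the buffer before the lock is released.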
I'm testing how stream buffering works in C++. As far as I understand buffering, the following code should print "Before loop - " and "After loop" at the same time, despite the delay introduced by the for loop. Instead, the two are printed with the loop delay in between. Can someone explain why? I'm passing cout as the argument.
void testBuffer(ostream& os){
os << "Before loop - ";
for(int i = 0; i < 2000000000; i++){
// waste time
}
os << "After loop " << endl;
}
Buffers are not infinite, and in the case of non-file streams probably not even all that large.
Just because you did not write std::flush doesn't mean there definitely won't be an immediate response from the stream. If the buffer is full, it's still going to flush. It's just that you're not forcing an early flush.
Furthermore you may conceivably see std::cout behaving like std::cerr (which basically disables buffering) in debug modes. I don't know whether any implementation does this.
Moral of the story:
if you need the output immediately, flush;
if you need it later, write it later;
if and only if you don't care either way, do it the way you've done it.
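To illustrate the point that a full buffer is drained even without an explicit flush, here is a small sketch using a made-up stream buffer. TinyBuf and its 8-byte capacity are assumptions chosen for the demonstration; real implementations use much larger buffers and their own policies.

```cpp
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical stream buffer with a tiny fixed-size buffer. The string
// `device_` stands in for the terminal or file: characters only reach it
// when the buffer overflows or when the stream is flushed.
class TinyBuf : public std::streambuf {
public:
    TinyBuf() { setp(buf_, buf_ + sizeof buf_); }
    const std::string& device() const { return device_; }

protected:
    // Called when the buffer is full: drain it, then store the new char.
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }

    // Called by std::flush and std::endl.
    int sync() override {
        drain();
        return 0;
    }

private:
    void drain() {
        device_.append(pbase(), pptr());
        setp(buf_, buf_ + sizeof buf_);
    }

    char buf_[8];
    std::string device_;
};
```

With std::ostream os(&tb) on a TinyBuf tb, writing "abc" leaves tb.device() empty. Writing seven more characters pushes the first eight out, because the buffer filled up. The rest only arrives after os << std::flush.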
std::cout << "Enter two numbers:";
std::cout << std::endl;
This code snippet is followed by two paragraphs and a warning note. I understood the first paragraph, but neither the second one nor the note. The text is as follows:
"The first output operator prints a message to the user. That message
is a string literal, which is a sequence of characters enclosed in
double quotation marks. The text between the quotation marks is
printed to the standard output.
The second operator prints endl,
which is a special value called a manipulator. Writing endl has
the effect of ending the current line and flushing the buffer
associated with that device. Flushing the buffer ensures that all the
output the program has generated so far is actually written to the
output stream, rather than sitting in memory waiting to be written.
Warning: Programmers often add print statements during debugging. Such statements should always flush the stream. Otherwise, if the
program crashes, output may be left in the buffer, leading to
incorrect inferences about where the program crashed."
So I understood neither the part about endl nor the warning that follows. Can anyone please explain this to me as explicitly as possible, and please try to keep it simple.
Imagine you have some code that crashes somewhere, and you don't know where. So you insert some print statements to narrow the problem down:
std::cout << "Before everything\n";
f1();
std::cout << "f1 done, now running f2\n";
f2();
std::cout << "all done\n";
Assuming that the program crashes during the evaluation of either f1() or f2(), you may not see any output, or you may see partial output that is misleading -- e.g. you could see only "Before everything", even though the crash happened in f2(). That's because the output data may be waiting in a buffer and hasn't actually been written to the output device.
The Primer's recommendation is therefore to flush each output, which you can conveniently achieve with endl:
std::cout << "Before everything" << std::endl;
f1();
std::cout << "f1 done, now running f2" << std::endl;
f2();
std::cout << "all done" << std::endl;
An alternative is to write debug output to std::cerr instead, which is not buffered by default (though you can always change the buffering of any ostream object later).
A more realistic use case is when you want to print a progress bar in a loop. Usually, a newline (\n) causes line-based output to be printed anyway, but if you want to print a single character for progress, you may not see it printed at all until after all the work is done unless you flush:
for (int i = 0; i != N; ++i)
{
if (i % 1000 == 0)
{
std::cout << '#'; // progress marker
std::cout.flush();
}
do_work();
}
std::cout << '\n';
Well, simply:
std::cout << "Hello world!";
will print "Hello world!" and stay on the same line. Now if you want to go to a new line, you should use:
std::cout << "\n";
or
std::cout << std::endl;
Now before I explain the difference, you have to know 1 more simple thing: When you issue a print command with the std::cout stream, things are not printed immediately. They are stored in a buffer, and at some point this buffer is flushed, either when the buffer is full, or when you force it to flush.
The first kind, \n, will not flush, but the second kind std::endl, will go to a new line + flush.
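One way to see the difference is to count how often the stream asks its buffer to synchronize. The sketch below assumes a made-up CountingBuf class derived from std::stringbuf; it is only meant for experimenting and is not how std::cout is implemented.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical buffer that counts flush requests. std::flush and
// std::endl end up calling sync(); writing '\n' alone does not.
class CountingBuf : public std::stringbuf {
public:
    int syncs = 0;

protected:
    int sync() override {
        ++syncs;
        return std::stringbuf::sync();
    }
};
```

With CountingBuf cb and std::ostream os(&cb), the statement os << "a\n" leaves cb.syncs at 0, while os << "b" << std::endl raises it to 1. The characters stored in cb.str() are the same newline in both cases.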
Operating systems do buffered IO. That is, when your program outputs something, they don't necessarily put it immediately where it should go (i.e. the disk or the terminal); they might decide to keep the data in an internal memory buffer for a while before performing the actual IO operation on the device.
They do this to optimize performance, because doing the IO in chunks is cheaper than doing it as soon as there are a few bytes to write.
Flushing a buffer means asking the OS to perform the IO operation immediately, without any more waiting. A programmer would do this when (s)he knows that waiting for more data doesn't make sense.
The second note says that endl not only prints a newline, but also hints the cout to flush its buffer.
The 3rd note warns that debugging errors, if buffered and not flushed immediately, might not be seen if the program crashes while the error messages are still in the buffer (not flushed yet).
My program, which is executed from the command line, looks like this (execute command declared somewhere else):
int commandHandler::handleRequest(...)
{
bool cmdresult = execute(output);
if (cmdresult)
{
std::cout << output << std::endl;
}
}
The problem:
If you break the ongoing output to cout with ^C, another call to the program will crash at the output to cout, since "cout is locked but the owner is dead".
How do I prevent this in the easiest way? Are there any methods to check if cout is locked before trying to redirect output to that stream, and in that case, unlock it?
If I do a testprogram like this:
int main(void)
{
std::string output = "Superlongstringwouldbeprinted here... ";
for(int i=0;i<40000;i++)
{
output.append("Superlongstringwouldbeprinted here... ");
}
std::cout << output << std::endl;
}
in the "standard" environment, the output is breakable with ^C and I am able to run the program again with output to std::cout. That is, it seems like the implementation of direction to std::cout is flawed in the real time OS I am writing code for?
If you are using C++11, you can use the
std::mutex::lock()
std::mutex::try_lock()
std::mutex::unlock()
functions.
However, if you are not so lucky, you may have to use platform- or library-dependent code.
On Linux you can use a POSIX mutex.
Another good idea is to use scoped locks.
I am trying to print results inside two nested for loops using std::cout. However, the results are not printed to the console immediately but with a delay (after both loops, or the whole program, have finished).
I do not consider such behavior normal; under Windows, printing works fine. The program does not use threads.
What could be the problem? (Ubuntu 10.10 + NetBeans 6.9).
std::cout is a stream, and it is buffered. You can flush it in several ways:
std::cout.flush();
std::cout << std::flush;
std::cout << std::endl; // same as: std::cout << "\n" << std::flush
johny:
I am flushing the buffer before the loop using std::endl. The problem arises when printing dots representing the percentage of processed data inside the loop.
If you flush the buffer before the loop, that does not affect the output in the loop. You have to flush inside or after the loop to see the output.
If you don't flush your output, it is not guaranteed to be visible outside your program. The fact that it is not showing up in your terminal is just a consequence of the default behavior on Linux, which is to do line buffering when the output is a tty. If you run your program on Linux with its output piped to something else, like
./your_program | cat
then the default buffer will be MUCH larger, most likely at least 4096 bytes, so nothing will be displayed until the big buffer is full. But really, the behaviour is OS-specific unless you flush std::cout yourself.
To flush std::cout, use:
std::cout << std::flush;
Also, using
std::cout << std::endl;
is a shortcut for
std::cout << '\n' << std::flush;
I am running 2 threads, and the text I print first is only displayed after the threads have finished executing.
string thread(string url)
{
mutex.lock();
//some function goes here
mutex.unlock();
}
int main()
{
cout<<"asd";
boost::thread t1(boost::bind(&thread));
boost::thread t2(boost::bind(&thread));
t1.join();
t2.join();
}
In the main program I just print the text asd, but it is always displayed after the threads have executed.
std::cout << "asd" << std::flush;
Since cout is buffered, data put to it may not appear immediately on the console (or wherever it may be redirected to). Thus, try flushing the output stream within the thread. E.g.
cout << "asd" << endl;
I can't comment on your original post (not enough posts yet). On a tangential note, consider using scoped_lock (if you're not already!); it is safer than explicit lock/unlock calls.
Also, one word of caution: flushing is expensive, so do it only when necessary.