std::cout overhead if closed (STDIN_FILENO)


I have a background process (a daemon on a Unix system) that uses std::cout for debug output in several places in the source. I run this daemon in either silent or non-silent mode. In silent mode, right after the process starts, I execute this bit of code:
std::cout.rdbuf(0);
close(STDIN_FILENO);
close(STDOUT_FILENO);
close(STDERR_FILENO);
As you can see, the std::cout statements are still present in the code and still run.
In non-silent mode there is a big overhead, because writing to the screen is a very expensive and slow I/O operation.
The question:
What is the overhead of this code in silent mode? Is there some "drag" on my program because std::cout is still present but STDOUT_FILENO is closed? (From time to time it tries to print up to 1 kilobyte of info.)
And how big is this overhead?

It obviously has some overhead. But not much; the first thing in every << is to test that the stream status is good. And it shouldn't be if the corresponding physical device is closed. At the very least, it will go bad after the first flush (due to the buffer becoming full). Alternatively, you could call std::cout.rdbuf( nullptr ), which should make it go bad immediately.
The traditional solution has been to create a no-op streambuf. This has the advantage that the stream doesn't go bad: reads just always see end of file, and output always works. It has the disadvantage that because the stream state is good, you actually format all of the output: std::cout << someDouble will do all the work of converting the double into a sequence of characters. If the stream state is bad (as it will be with a nullptr as the stream buffer), the << operators return before having tried to convert anything.
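The two options described above can be compared in a minimal sketch. NullBuf and both helper functions are hypothetical names made up for this illustration; separate std::ostream objects are used so that std::cout itself is left alone.

```cpp
#include <ostream>
#include <streambuf>

// Hypothetical no-op buffer: accepts every character and discards it.
class NullBuf : public std::streambuf {
protected:
    int_type overflow(int_type ch) override {
        return traits_type::not_eof(ch);  // report success, store nothing
    }
    std::streamsize xsputn(const char*, std::streamsize n) override {
        return n;  // pretend all n characters were written
    }
};

// With a no-op buffer the stream stays good, so every << still formats.
bool stays_good_with_null_buffer() {
    NullBuf buf;
    std::ostream os(&buf);
    os << "debug value: " << 3.14 << '\n';
    return os.good();
}

// With no buffer at all the stream is bad from the start, so every <<
// returns after the initial state check without converting anything.
bool is_bad_without_buffer() {
    std::ostream os(nullptr);
    os << "debug value: " << 3.14 << '\n';
    return os.bad();
}
```

Which one is preferable depends on whether the rest of the program ever tests the state of the stream.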

Related

why we need to close a file to complete the writing process of a file? [duplicate]

I understand cout << '\n' is preferred over cout << endl; but cout << '\n' doesn't flush the output stream. When should the output stream be flushed and when is it an issue?
What exactly is flushing?

Flushing forces an output stream to write any buffered characters. Read up on streamed input/output.
It depends on your application: in real-time or interactive applications you need to flush immediately, but in many cases you can wait until the file is closed and let the program flush automatically.

When must the output stream in C++ be flushed?
When you want to be sure that data written to it is visible to other programs or (in the case of file streams) to other streams reading the same file which aren't tied to this one; and when you want to be certain that the output is written even if the program terminates abnormally.
So you would want to do this when printing a message before a lengthy computation, or for printing a message to indicate that something's wrong (although you'd usually use cerr for that, which is automatically flushed after each output).
There's usually no need to flush cerr (which, by default, has its unitbuf flag set to flush after each output), or to flush cout before reading from cin (these streams are tied so that cout is flushed automatically before reading cin).
If the purpose of your program is to produce large amounts of output, either to cout or to a file, then don't flush after each line - that could slow it down significantly.
What exactly is flushing?
Output streams contain memory buffers, which are typically much faster to write to than the underlying output. Output operations put data into the buffer; flushing sends it to the final output.
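To make the buffer/flush distinction concrete, here is a small sketch in which a std::string stands in for the "final output". StagingBuf is a hypothetical class written for this illustration, not a standard component; real file buffers work on the same principle but talk to the operating system instead.

```cpp
#include <ostream>
#include <streambuf>
#include <string>

// Hypothetical buffer: stages characters in a small array and appends them
// to `device` only when the array fills up or the stream is flushed.
class StagingBuf : public std::streambuf {
public:
    explicit StagingBuf(std::string& device) : device_(device) {
        setp(area_, area_ + sizeof area_);
    }
    ~StagingBuf() override { drain(); }

protected:
    // Called when the staging area is full.
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }
    // Called by std::flush and std::endl.
    int sync() override {
        drain();
        return 0;
    }

private:
    void drain() {
        device_.append(pbase(), pptr());
        setp(area_, area_ + sizeof area_);
    }
    std::string& device_;
    char area_[64];
};
```

Writing a short string through a std::ostream attached to a StagingBuf leaves the device string empty until std::flush is sent or more than 64 characters accumulate.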

First, you read wrong. Whether you use std::endl or '\n'
depends largely on context, but when in doubt, std::endl is
the normal default. Using '\n' is reserved to cases where
you know in advance that the flush isn't necessary, and that it
will be too costly.
Flushing is involved with buffering. When you write to
a stream, (typically) the data isn't written immediately to the
system; it is simply copied into a buffer, which will be written
when it is full, or when the file is closed. Or when it is
explicitly flushed. This is for performance reasons: a system
call is often a fairly expensive operation, and it's generally
not a good idea to do it for every character. Historically,
C had something called line buffered mode, which flushed with
every '\n', and it turns out that this is a good compromise
for most things. For various technical reasons, C++ doesn't
have it; using std::endl is C++'s way of achieving the same
results.
My recommendation would be to just use std::endl until you
start having performance problems. If nothing else, it makes
debugging simpler. If you want to go further, it makes sense to
use '\n' when you're outputting a series of lines in just
a few statements. And there are special cases, like logging,
where you may want to explicitly control the flushing.
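The difference in flush behavior between the two styles can be counted directly. The following is a sketch only; CountingBuf and the two helper functions are hypothetical names introduced for the illustration, and the buffer writes to memory rather than to a real device.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical buffer that counts how often the stream asks it to flush.
class CountingBuf : public std::stringbuf {
public:
    int flushes = 0;

protected:
    int sync() override {
        ++flushes;
        return 0;
    }
};

// One flush request per line.
int flushes_with_endl(int lines) {
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < lines; ++i) os << i << std::endl;
    return buf.flushes;
}

// No flush request per line, one explicit flush at the end.
int flushes_with_newline(int lines) {
    CountingBuf buf;
    std::ostream os(&buf);
    for (int i = 0; i < lines; ++i) os << i << '\n';
    os << std::flush;
    return buf.flushes;
}
```

On a real file or terminal each of those flush requests typically turns into a system call, which is where the cost comes from.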

Flushing can be disastrous for performance if you are writing a large file with frequent line breaks.
For example
for(int i = 0; i < LARGENUMBER; i++)
{ // slow?
    auto point = xyz[i];
    cout << point.x << "," << point.y << endl;
}
vs
for(int i = 0; i < LARGENUMBER; i++)
{ // faster
    auto point = xyz[i];
    cout << point.x << "," << point.y << "\n";
}
vs
for(int i = 0; i < LARGENUMBER; i++)
{ // fastest?
    auto point = xyz[i];
    printf("%i,%i\n", point.x, point.y);
}
endl was also often known for doing other things, for example synchronizing threads in the so-called debug mode of MSVC, which resulted in multithreaded programs that, contrary to expectation, printed uninterrupted phrases from different threads.

I/O libraries buffer data sent to a stream for performance reasons. Whenever you need to be sure the data has actually been sent to the stream's destination, you need to flush it (otherwise it may still be in the buffer and not visible on screen or in the file).
Some operations flush streams automatically, but you can also explicitly call something like ostream::flush.
You need to be sure the data is flushed whenever, for example, another program is waiting for input from the first program.

It depends on what you are doing. For example, if you are using the console to warn the user about a long process... printing a series of dots in the same line... flushing can be interesting. For normal output, line per line, you should not care about flushing.
So, for char based output or non line based console output, flushing can be necessary. For line based output, it works as expected.
This other answer can clarify your question, based on why avoiding endl and flushing manually may be good for performance reasons:
mixing cout and printf for faster output
Regarding what flushing is: when you write to a buffered stream, like ostream, you have no guarantee that your data has arrived at the destination device (console, file, etc.). This happens because the stream can use intermediary buffers to hold your data so that your program does not stall. Usually, if the buffers are big enough, they will hold all the data and your program won't be stopped by a slow I/O device. You may have already noticed that the console is very slow. The flush operation tells the stream that you want to be sure all intermediary data has arrived at the destination device, or at least that the stream's buffers are now empty. This is very important for log files, for example, where you want to be reasonably (not 100%) sure that a line is on disk and not in a buffer somewhere. It becomes more important if your program can't lose data, i.e., if it crashes, you want to be sure you did your best to write your data to disk. For other applications, performance is more important and you can let the OS decide when to flush buffers for you, or wait until you close the stream, for example.

Speeding printf and cout in windows

cout and printf are really slow on Windows, so sending a lot of data through them slows applications down (this happens with code that runs for days to check that everything is working well).
One method to make them faster is to use a buffer, by writing the following code at the beginning of the main() function:
#ifndef __linux__ // Put this at the beginning of main() to greatly increase the speed of cout on Windows:
static char buffer_setvbuf[1024]; // static, so the buffer is still valid when stdout is flushed at program exit
setvbuf(stdout, buffer_setvbuf, _IOFBF, sizeof buffer_setvbuf); // Caveat: sometimes cout inside a function prints nothing until cout is used in main() or the end of the buffer is reached.
#endif
But unfortunately a side effect is that sometimes it does not print the data because the buffer is not full.
Then the questions:
1. How do I force the output to be printed: by writing \n?
2. How do I disable the buffer?

printf
I see you are trying to use a larger in-memory buffer to reduce the number of writes to stdout. Indeed, your code will not print anything until the buffer becomes full, because the buffering mode is set to _IOFBF (i.e. full buffering). Since you want to control when to flush, there are two ways to go about it:
Use _IOLBF (i.e. line buffering), and write a newline character whenever you want to flush. Note that the Microsoft C runtime documents _IOLBF as behaving the same as _IOFBF, so on Windows this option may not help.
Call fflush(stdout) to flush the buffer manually.
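The second option can be sketched as follows. Both function names are hypothetical, and the 64 KiB buffer size is an arbitrary choice for illustration; the static buffer is meant to serve a single stream.

```cpp
#include <cstdio>

// Hypothetical setup helper: give `out` a large, fully buffered staging area.
// Call it before the first read or write on the stream. The buffer is static
// so that it is still alive when the stream is flushed at program exit.
bool use_full_buffering(std::FILE* out) {
    static char buffer[1 << 16];
    return std::setvbuf(out, buffer, _IOFBF, sizeof buffer) == 0;
}

// Hypothetical helper: write one message and push it out of the stdio
// buffer right away, whatever buffering mode is in effect.
bool print_now(std::FILE* out, const char* message) {
    return std::fputs(message, out) >= 0 && std::fflush(out) == 0;
}
```

Ordinary printf calls keep the speed benefit of the large buffer, and print_now(stdout, ...) is used only for the messages that must appear immediately.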
std::cout
I think std::cout should be preferred when writing C++ code, because of its ease of use. One thing that might slow down the I/O is the synchronization between iostream and stdio. As far as I know, the default on many systems is to keep the two in sync, and this has some overhead. You can disable it by calling std::ios_base::sync_with_stdio(false).
When you need to flush output, you can use what is called "manipulators" for output stream - namely std::flush and std::endl. When those manipulators are put into an output stream like the following: std::cout << "your string" << std::endl, it is guaranteed that the output stream is flushed.
Bottom Line
Use fflush to flush stdout when using printf for output.
I recommend trying std::cout with sync off, and test if it fits your performance need.
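A minimal sketch of that recommendation, assuming the program does not mix printf-style and iostream output on stdout. Both function names are hypothetical.

```cpp
#include <iostream>
#include <ostream>

// Hypothetical setup helper: call once, before any output is produced.
void use_unsynced_iostreams() {
    std::ios_base::sync_with_stdio(false);
}

// Writes `count` lines using '\n' and flushes a single time at the end,
// so the data is guaranteed to have left the stream buffer on return.
void write_numbers(std::ostream& os, int count) {
    for (int i = 0; i < count; ++i) os << i << '\n';
    os << std::flush;
}
```

Whether this is fast enough is something to measure on the target system; the sketch only shows where the calls go.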

Will using fflush after printf slow down your program?

I am writing some console programs and I notice that sometimes when I use printf() and my program is idle, not everything is printed out (the last few lines are missing).
Eventually something will happen and the lines do get printed, but often when I close the program the last few lines are not there.
So I did some digging and it looks like the stdout buffer is not always emptied unless certain conditions are met (new line? / line feed?).
So I have created a "myprintf()" function which wraps printf to do the following (in pseudo code):
printf(...);
fflush(stdout);
The question is, apart from the obvious extra function call overhead, is this going to slow my program down? I.e. is this a bad practice performance wise?

It depends on a couple of things.
If stdout is connected to a terminal, it is line buffered by default, so when your printf ends with a newline character (\n) the buffer is flushed immediately and everything is displayed directly. In that case flushing again would indeed slow your program down, albeit only by a tiny amount. (When stdout is redirected to a file or a pipe it is normally fully buffered, and a newline alone does not trigger a flush.)
Now, if you don't end with a newline character, stdout will not flush automatically and you do need the fflush to display things promptly. It will then also slow down the program, although again only a little.
You can avoid the problem completely, though, by making stdout unbuffered, so that it does not wait for a newline before writing. This would make your wrapper redundant:
setbuf(stdout, NULL);
This guarantees that anything written to stdout is passed on immediately. It will be slightly more efficient than your direct call to fflush() every time.
In conclusion, unless you are operating under very tight performance constraints, the overhead generated is negligible.

Yes it will slow it down. If it didn't, then flushing would be the default behavior.
Judging by the tone of your post, it does not sound like you're doing the kinds of things where too much flushing would be noticeable. So unless you say why you're afraid of a slowdown, flush away.

Yes, there will be a slowdown, not necessarily a very noticeable one, but it will exist.
A better alternative to printf(); fflush(); would be changing the buffering policy with setvbuf(): a call to setvbuf(stdout,NULL,_IONBF,0); ensures that every write to stdout is passed on directly (better than calling fflush() explicitly each time).
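For comparison, the two approaches discussed in this thread could look like this. The name myprintf comes from the question; its body and the helper make_stdout_unbuffered are assumptions made for this sketch.

```cpp
#include <cstdarg>
#include <cstdio>

// Wrapper from the question: format like printf, then flush stdout.
// Returns what vprintf returns (characters written, or a negative value).
int myprintf(const char* fmt, ...) {
    va_list args;
    va_start(args, fmt);
    int written = std::vprintf(fmt, args);
    va_end(args);
    std::fflush(stdout);
    return written;
}

// Alternative from the answers: switch stdout to unbuffered mode once.
// Hypothetical helper; it should be called before the first output on stdout.
bool make_stdout_unbuffered() {
    return std::setvbuf(stdout, nullptr, _IONBF, 0) == 0;
}
```

With the second approach plain printf calls can stay as they are, at the price of every call going straight to the operating system.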

I/O is generally the most time-consuming operation, so an unnecessary flush could slow down your program. But you should think twice about it, because if the output is a terminal, the flush happens automatically when a newline is encountered or before a read (on the same terminal...). So the only use cases I can imagine are:
outputting a single char (. or * or ...) regularly to show the progress of a lengthy operation. Without the flush, nothing will be seen
expecting the program output to be piped to another program (tee for example) while wanting progress to appear immediately on screen
TL/DR: for common usage, the slowdown will not be noticeable, but the flush is useless.

is it safe to use a text file that is modified by c++ and is not closed?

The title is not so clear but what I mean is this:
std::fstream filestream("abc.dat", std::ios::out);
double write_to_file;
while (some_condition) {
    write_to_file = 1.345; // this number will be different in each loop iteration
    filestream.seekp( 345 ); // seekp moves the write position (seekg is the read-side call)
    filestream << std::setw(5) << write_to_file << std::flush;
    // write the number to replace the number that was written in the previous iteration
    system( "./Some_app ./abc.dat" ); // run an application on unix
    // which uses "abc.dat" as its input file
}
filestream.close();
That's the rough idea: each iteration rewrites the number in the file and flushes. I'm hoping not to open and close the file in each iteration, in order to save computing time (I'm also not sure how expensive open and close are). Is it OK to do this?

On unix, std::flush does not necessarily write to the physical device. Typically, it does not. std::ofstream::flush calls rdbuf()->pubsync(), which in turn calls rdbuf()->sync(), which in turn "synchronizes the controlled sequences with the arrays." What are those "controlled sequences"? Typically they're not the underlying physical device. In a modern OS such as unix, there are lots of things in between high level I/O constructs such as C++'s concept of an I/O buffer and the bits on the device.
Even the low-level POSIX function fsync() does not necessarily guarantee that bits are written to the device. Even closing and reopening the output file does not necessarily guarantee that bits are written to the device.
You might want to rethink your design.

You need at least to flush the C++ stream buffer with filestream.flush() before calling system (but you already do that with << std::flush).
I am assuming that ./Some_app does not write to the file and opens it for reading only.
But in your case you might as well open and close the file at each iteration, since the system call is obviously the much bigger bottleneck.
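The point about flushing before handing the file to another reader can be checked with a small sketch. The function name and the scratch file path passed by the caller are hypothetical; a second stream in the same process stands in for the external application.

```cpp
#include <fstream>
#include <string>

// Writes `text` through a stream that stays open, flushes it, and then reads
// the first line back through an independent stream, roughly as a second
// program opening the file would. The scratch file is left for the caller
// to delete.
std::string read_back_after_flush(const std::string& path,
                                  const std::string& text) {
    std::ofstream out(path, std::ios::trunc);
    out << text << std::flush;  // hand the bytes to the operating system

    std::ifstream in(path);     // independent reader; `out` is still open
    std::string seen;
    std::getline(in, seen);
    return seen;
}
```

This only demonstrates visibility to other readers through the operating system. It says nothing about the data having reached the physical device.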

Improving performance of ofstream?

I am communicating with some parallel processes using FIFOs. I am reading the pipe with read(). And I am writing to the named pipe by doing this:
ofstream pipe(namepipe);
pipe << data << endl;
pipe.close();
I have been noticing that the performance is horrible though! It takes like 40ms sometimes. It's an extreme latency in my opinion. I read that the use of std::endl can affect performance. Should I avoid using endl?
Does using ofstream affect performance? Are there any other alternatives to this method?
Thank you!

When working with large files with fstream, make sure to use a stream buffer and don't use endl (endl flushes the output stream).
At least the MSVC implementation copies 1 char at a time to the filebuf when no buffer was set (see streambuf::xsputn()), which can make your application CPU-bound, which will result in lower I/O rates.
So, try adding this to your code before doing the writing:
const size_t bufsize = 256*1024;
char buf[bufsize];
mystream.rdbuf()->pubsetbuf(buf, bufsize);
NB: You can find a complete sample application here.
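A self-contained variant of the snippet above might look like the following sketch. The function name is hypothetical, and the buffer is installed before the file is opened on the assumption that some implementations (libstdc++, as far as I know) ignore pubsetbuf once the file is open.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Writes `lines` numbered lines to `path` through a 256 KiB buffer.
// Returns true if the stream is still good after closing.
bool write_with_big_buffer(const std::string& path, int lines) {
    const std::size_t bufsize = 256 * 1024;
    std::vector<char> buf(bufsize);  // declared first, so it outlives `out`

    std::ofstream out;
    // Install the buffer before opening the file (see the assumption above).
    out.rdbuf()->pubsetbuf(buf.data(), static_cast<std::streamsize>(buf.size()));
    out.open(path);

    for (int i = 0; i < lines; ++i) out << i << '\n';  // '\n', not endl
    out.close();  // flushes while buf is still alive
    return out.good();
}
```

Keeping the buffer on the heap avoids placing 256 KiB on the stack, which the original snippet would do if it were used inside a function.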

A cheap hack:
std::ios::sync_with_stdio(false);
Note: use this only if you are not going to mix C I/O with C++ streams. It also only affects the standard streams (std::cout, std::cin and so on), not an std::ofstream.
The reason std::endl might affect I/O performance is that it flushes the stream. To avoid this, use '\n' instead.
Avoiding having to open and close the stream repeatedly will also help.
