What could cause a stream to enter the "bad" state? - c++

In C++, each stream has a bad bit:
This flag is set by operations performed on the stream when an error occurs while reading or writing data, generally causing the loss of integrity of the stream.
Source
What would cause a stream to "lose integrity" and enter the bad state? This is not the same as the fail state, which most often occurs when an input stream attempts to store a value into a variable that cannot accept said value (such as attempting to store a string into an integer variable).
Note that this question is a more general form of c++ file bad bit, which is specific to file input streams; this question is not an exact duplicate as it applies to both input and output streams in general.
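For contrast, here is a minimal sketch of the fail state described above (the bad state is what the answers below are about):

#include <iostream>
#include <sstream>

int main() {
    std::istringstream in("not a number");
    int n = 0;
    in >> n;    // cannot be parsed as an int: failbit is set, badbit is not
    std::cout << std::boolalpha
              << "fail(): " << in.fail() << "  bad(): " << in.bad() << '\n';
}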

According to cppreference.com:
The standard library sets badbit in the following situations:
Insertion into the output stream by put() or write() fails for any reason.
Insertion into the output stream by operator<<, std::put_money or std::put_time could not complete because the end of the output stream was reached (the facet's formatting output function, such as num_put::put() or money_put::put(), returns an iterator iter such that iter.failed() == true).
The stream is constructed with a null pointer for rdbuf(), or putback()/unget() is called on a stream with a null rdbuf(), or a null pointer is passed to operator<<(basic_streambuf*).
rdbuf()->sputbackc() or rdbuf()->sungetc() returns traits::eof() to putback() or unget().
rdbuf()->pubsync() returns -1 to sync(), to flush(), or to the destructor of ostream::sentry on a unitbuf stream.
An exception is thrown during an I/O operation by any member function of the associated stream buffer (e.g. sbumpc(), xsputn(), sgetc(), overflow(), etc.).
An exception is thrown in iword() or pword() (e.g. std::bad_alloc).
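For instance, the "null rdbuf()" case is easy to reproduce; a minimal sketch (the output shown in the comments is what you should expect):

#include <iostream>

int main() {
    std::ostream os(nullptr);    // constructed with a null streambuf: badbit is set immediately
    os << "hello";               // silently does nothing; the stream stays bad
    std::cout << std::boolalpha << "bad(): " << os.bad() << '\n';    // prints bad(): true
}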
This may be one more reason to choose cppreference.com over www.cplusplus.com, see:
What's wrong with cplusplus.com?

Take a look at the Apache C++ Standard Library User's Guide. Two potential causes for a badbit are listed there. I quote:
Memory shortage: There is no memory available to create the buffer, or the buffer has size 0 for other reasons (such as being provided from outside the stream), or the stream cannot allocate memory for its own internal data.
The underlying stream buffer throws an exception: The stream buffer might lose its integrity, as in memory shortage, or code conversion failure, or an unrecoverable read error from the external device. The stream buffer can indicate this loss of integrity by throwing an exception, which is caught by the stream and results in setting the badbit in the stream's state.
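As a rough illustration of the second point, a hand-written stream buffer that throws from overflow() drives the attached stream into the bad state (a toy sketch, not production code):

#include <iostream>
#include <stdexcept>
#include <streambuf>

// Toy stream buffer that simulates an unrecoverable device error by throwing from overflow().
struct throwing_buf : std::streambuf {
    int_type overflow(int_type) override {
        throw std::runtime_error("simulated device failure");
    }
};

int main() {
    throwing_buf buf;
    std::ostream os(&buf);    // exceptions are masked by default, so the stream swallows the throw
    os << 'x';                // overflow() throws; the stream catches it and sets badbit
    std::cout << std::boolalpha << "bad(): " << os.bad() << '\n';    // prints bad(): true
}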

Related

When exactly to check std::ifstream::good()?

Let's say I have some code to compute the size of a file:
std::ifstream ifs(path, std::ifstream::ate | std::ifstream::binary);
unsigned int size = ifs.tellg();
ifs.close();
Most of the time in C++, where/when is it relevant to call ifs.good()?
In my case, is it better after creating the stream or after calling tellg()?
Then, if good() returns false, should I close the stream explicitly?
Starting from the point that you never need to close a stream explicitly, good() is a function to:
Check whether state of stream is good
Returns true if none of the stream's error state flags (eofbit, failbit and badbit) is set.
You can call it to check whether something is going wrong, and then inspect the other bits to see what exactly is wrong. For example:
End-of-File reached on input operation (eofbit)
Logical error on i/o operation (failbit)
Read/writing error on i/o operation (badbit)
However, good() is not useful for deciding whether to close the stream manually.
In my case, is it better after creating the stream or after calling tellg()?
In my opinion, in this case you don't need to call good() at all, but if you want to call it, it is better to do so after tellg(), which can set a failure bit during its execution.
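A minimal sketch of that ordering, using a placeholder file name:

#include <fstream>
#include <iostream>

int main() {
    // "example.bin" is just a placeholder path for this sketch.
    std::ifstream ifs("example.bin", std::ifstream::ate | std::ifstream::binary);
    std::streamoff size = ifs.tellg();    // yields -1 if the open or the implicit seek failed

    if (!ifs.good()) {                    // checked after tellg(), as suggested above
        std::cerr << "could not determine the file size\n";
        return 1;
    }
    std::cout << "size: " << size << " bytes\n";
    // no explicit close(): the destructor closes the file
}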

When does output buffer flush?

Apart from manually calling flush, under what conditions will cout or stdout (printf) flush?
When exiting the current scope or the current function? Is it timed? When the buffer is full (and how big is the buffer)?
For <stdio.h> streams you can set the buffering mode using setvbuf(). It takes three buffering modes:
_IOFBF: the buffer is flushed when it is full or when a flush is explicitly requested.
_IOLBF: the buffer is flushed when a newline is found, the buffer is full, or a flush is requested.
_IONBF: the stream is unbuffered, i.e., output is sent as soon as available.
I had the impression that the default setup for stdout is _IOLBF, for stderr it is _IONBF, and for other streams it is _IOFBF. However, looking at the C standard I don't find any indication of what the default is for any C stream.
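For illustration, a small sketch that asks for full buffering on stdout via setvbuf() (the buffer size is an arbitrary choice, and the implementation is allowed to ignore the request):

#include <cstdio>

int main() {
    // setvbuf must be called before any other operation on the stream.
    static char buf[64 * 1024];
    std::setvbuf(stdout, buf, _IOFBF, sizeof buf);

    std::printf("stays in the buffer until it fills or a flush happens\n");
    std::fflush(stdout);    // explicit flush
}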
For the standard C++ stream objects there is no equivalent to _IOLBF: if you want line buffering you'd use std::endl or, preferably, '\n' and std::flush. There are a few setups for std::ostream, though:
You can generally use buf.pubsetbuf(0, 0) to make a stream unbuffered. Since stream buffers can be implemented by users, it isn't guaranteed that the corresponding call to set the buffer is honored, though.
You can set std::ios_base::unitbuf, which causes the stream to be flushed after each [properly implemented] output operation. By default std::ios_base::unitbuf is only set for std::cerr.
The normal setup for a std::ostream is to flush the buffer when the buffer is full or when a flush is explicitly requested; unfortunately, std::endl makes such an explicit request to flush the buffer (causing performance problems in many cases because it tends to be used as a surrogate for '\n', which it is not).
An interesting one is the ability to tie() an output stream to an input stream: if in.tie() contains a pointer to a std::ostream, this output stream will be flushed prior to any attempt to read from in (assuming correctly implemented input operators, of course). By default, std::cout is tie()d to std::cin.
Nearly forgot an important one: if std::ios_base::sync_with_stdio() wasn't called with false, the standard C++ streams (std::cin, std::cout, std::cerr and std::clog and their wchar_t counterparts) are probably entirely unbuffered! With the default setting of std::ios_base::sync_with_stdio(true), the standard C and C++ streams can be used in a mixed way. However, since the C library is generally oblivious of the C++ library, this means that the C++ standard stream objects can't do any buffering. Using std::ios_base::sync_with_stdio(true) is the major performance problem with the standard C++ stream objects!
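Putting a few of these knobs together in one sketch (whether each of them is a good idea depends on your program):

#include <iostream>

int main() {
    // Let the C++ streams do their own buffering (don't mix with printf afterwards).
    std::ios_base::sync_with_stdio(false);

    // Untie cin from cout: cout is no longer flushed before every read from cin.
    std::cin.tie(nullptr);

    std::cerr << "cerr has unitbuf set by default, so this is flushed immediately\n";

    std::cout << "plain newline, no forced flush\n";    // prefer '\n' over std::endl
    std::cout << std::flush;                            // flush explicitly when you need it
}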
Neither in C nor in C++ can you really control the size of the buffers: requests to set a non-zero buffer are allowed to be ignored and normally will be ignored. That is, the stream will pretty much be flushed at somewhat random places.

ifstream operator >> and error handling

I want to use ifstream to read data from a named pipe. I would like to use its operator>> to read formatted data (typically, an int).
However, I am a bit confused about the way error handling works.
Imagine I want to read an int but only 3 bytes are available. Error bits would be set, but what will happen to these 3 bytes? Will they "disappear", or will they be put back into the stream for later extraction?
Thanks,
As has been pointed out, you can't read binary data over an istream.
But concerning the number of available bytes issue (since you'll probably want to use basic_ios<char> and streambuf for your binary streams): istream and ostream use a streambuf for the actual sourcing and sinking of the bytes. And streambufs normally buffer: the procedure is, if a byte is in the buffer, return it; otherwise, try to reload the buffer, waiting until the reloading has finished or definitively failed. In case of definitive failure, the streambuf returns end of file, and that terminates the input; istream will memorize the end of file and not attempt any more input. So if the type you are reading needs four bytes, it will request four bytes from the streambuf, and will normally wait until those four bytes are there. No error will be set (because there isn't an error); you will simply not return from the operator>> until those four bytes arrive.
If you implement your own binary streams, I would strongly recommend using the same pattern; it will allow direct use of already existing standard components like std::ios_base and (perhaps) std::filebuf, and will provide other programmers with an idiom they are familiar with.
If the blocking is a problem, the simplest solution is just to run the input in a separate thread, communicating via a message queue or something similar. (Boost has support for asynchronous IO. This avoids threads, but is globally much more complicated, and doesn't work well with the classical stream idiom.)
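As a rough sketch of the formatted-input side (the pipe path "./mypipe" is made up for the example):

#include <fstream>
#include <iostream>

int main() {
    std::ifstream in("./mypipe");    // hypothetical named pipe

    int value = 0;
    if (in >> value) {               // blocks until enough characters arrive to parse an int
        std::cout << "read " << value << '\n';
    } else if (in.bad()) {
        std::cout << "unrecoverable read error\n";
    } else if (in.eof()) {
        std::cout << "writer closed the pipe before a complete value arrived\n";
    } else {
        std::cout << "the bytes that arrived did not parse as an int\n";
    }
}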

Will fseek function flush data in the buffer in C++?

We know that calls to functions like fprintf or fwrite will not write data to the disk immediately; instead, the data will be buffered until a threshold is reached. My question is: if I call the fseek function, will this buffered data be written to disk before seeking to the new position? Or is the data still in the buffer, and will it be written to the new position?
cheng
I'm not aware whether the buffer is guaranteed to be flushed; it may not be if you seek to a position close enough. However, there is no way that the buffered data will be written to the new position. The buffering is just an optimization, and as such it has to be transparent.
Yes; fseek() ensures that the file will look like it should according to the fwrite() operations you've performed.
The C standard, ISO/IEC 9899:1999 §7.19.9.2 fseek(), says:
The fseek function sets the file position indicator for the stream pointed to by stream.
If a read or write error occurs, the error indicator for the stream is set and fseek fails.
I don't believe it's specified that the data must be flushed on an fseek, but when the data is actually written to disk it must be written at the position the stream was at when the write function was called. Even if the data is still buffered, that buffer can't be written to a different part of the file when it is flushed, even if there has been a subsequent seek.
It seems that your real concern is whether previously-written (but not yet flushed) data would end up in the wrong place in the file if you do an fseek.
No, that won't happen. It'll behave as you'd expect.
I have vague memories of a requirement that you call fflush before fseek, but I don't have my copy of the C standard available to verify. (If you don't, it would be undefined behavior or implementation defined, or something like that.) The common Unix standard specifies that:
If the most recent operation, other than ftell(), on a given stream is fflush(), the file offset in the underlying open file description shall be adjusted to reflect the location specified by fseek().
[...]
If the stream is writable and buffered data had not been written to the underlying file, fseek() shall cause the unwritten data to be written to the file and shall mark the st_ctime and st_mtime fields of the file for update.
This is marked as an extension to the ISO C standard, however, so you can't count on it except on Unix platforms (or other platforms which make similar guarantees).
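A small sketch of the behaviour the answers above agree on: data written before a seek ends up at the position where it was written, not at the new position (the file name is a placeholder):

#include <cstdio>

int main() {
    std::FILE* f = std::fopen("demo.bin", "wb+");
    if (!f) return 1;

    std::fwrite("AAAA", 1, 4, f);    // may still sit in the stdio buffer
    std::fseek(f, 0, SEEK_SET);      // repositioning does not relocate buffered data
    std::fwrite("BB", 1, 2, f);      // overwrites the first two bytes

    std::fseek(f, 0, SEEK_SET);      // positioning call also separates the write from the read
    char out[5] = {0};
    std::fread(out, 1, 4, f);
    std::printf("%s\n", out);        // prints BBAA: the first write landed at offset 0
    std::fclose(f);
    return 0;
}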

c++ file bad bit

When I run this code, the open, seekg and tellg operations all succeed,
but when I read, it fails; the eof, bad and fail bits are 0, 1, 1.
What can cause a file stream to go bad?
thanks
int readriblock(int blockid, char* buffer)
{
    ifstream rifile("./ri/reverseindex.bin", ios::in | ios::binary);
    rifile.seekg(blockid * RI_BLOCK_SIZE, ios::beg);
    if(!rifile.good()){ cout << "block does not exist" << endl; return -1; }
    cout << rifile.tellg() << endl;
    rifile.read(buffer, RI_BLOCK_SIZE);
    cout << rifile.eof() << rifile.bad() << rifile.fail() << endl;    // <-- this prints 0 1 1
    if(!rifile.good()){ cout << "error reading block " << blockid << endl; return -1; }
    rifile.close();
    return 0;
}
Quoting the Apache C++ Standard Library User's Guide:
The flag std::ios_base::badbit indicates problems with the underlying stream buffer. These problems could be:
Memory shortage. There is no memory available to create the buffer, or the buffer has size 0 for other reasons (such as being provided from outside the stream), or the stream cannot allocate memory for its own internal data, as with std::ios_base::iword() and std::ios_base::pword().
The underlying stream buffer throws an exception. The stream buffer might lose its integrity, as in memory shortage, or code conversion failure, or an unrecoverable read error from the external device. The stream buffer can indicate this loss of integrity by throwing an exception, which is caught by the stream and results in setting the badbit in the stream's state.
That doesn't tell you what the problem is, but it might give you a place to start.
Keep in mind the EOF bit is generally not set until a read is attempted and fails. (In other words, checking rifile.good() after calling seekg() may not accomplish anything.)
As Andrey suggested, checking errno (or using an OS-specific API) might let you get at the underlying problem. This answer has example code for doing that.
Side note: Because rifile is a local object, you don't need to close it once you're finished. Understanding that is important for understanding RAII, a key technique in C++.
Try good old errno. It should show the real reason for the error. Unfortunately, there is no C++-ish way to do it.
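For example, a minimal sketch of reading errno right after a failed open (not guaranteed by the C++ standard, but it works on common implementations):

#include <cerrno>
#include <cstring>
#include <fstream>
#include <iostream>

int main() {
    errno = 0;
    std::ifstream rifile("./ri/reverseindex.bin", std::ios::in | std::ios::binary);
    if (!rifile) {
        // errno is typically set by the underlying open() call.
        std::cerr << "open failed: " << std::strerror(errno) << '\n';
        return 1;
    }
    return 0;
}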