Safely overloading stream operator>> - c++

There's a ton of information available on overloading operator<< to mimic a toString()-style method that converts a complex object to a string. I'm interested in also implementing the inverse, operator>> to deserialize a string into an object.
By inspecting the STL source, I've gathered that:
istream &operator>>(istream &, Object &);
would be the correct function signature for deserializing an object of type Object. Unfortunately, I have been at a loss for how to properly implement this - specifically how to handle errors:
How to indicate invalid data in the stream? Throw an exception?
What state should the stream be in if there is malformed data in the stream?
Should any flags be reset before returning the reference for operator chaining?

How to indicate invalid data in the stream? Throw an exception?
You should set the fail bit. If the user of the stream wants an exception to be thrown, they can configure the stream (using istream::exceptions), and the stream will throw accordingly. I would do it like this:
stream.setstate(ios_base::failbit);
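To illustrate the division of labour described above, here is a minimal sketch in which the caller, not the operator, opts into exceptions. The helper names extraction_throws and demo_exception_opt_in are made up for this example; only exceptions(), failbit and ios_base::failure come from the standard library.

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical helper: returns true if extracting an int from `text`
// raised std::ios_base::failure after the caller opted in via exceptions().
bool extraction_throws(const char* text) {
    std::istringstream in(text);
    in.exceptions(std::ios_base::failbit);  // the caller's choice, not the operator's
    int value = 0;
    try {
        in >> value;  // the operator only sets failbit; the stream does the throwing
    } catch (const std::ios_base::failure&) {
        return true;
    }
    return false;
}

// Hypothetical usage.
void demo_exception_opt_in() {
    assert(extraction_throws("abc"));   // not a number: failbit is set, so it throws
    assert(!extraction_throws("42"));   // parses fine: failbit is never set
}
```

An operator>> written this way works for both kinds of callers: those who check the stream state and those who prefer exceptions.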
What state should the stream be in if there is malformed data in the stream?
For malformed data that doesn't fit the format you want to read, you usually should set the fail bit. For internal stream specific errors, the bad bit is used (such as, if there is no buffer connected to the stream).
Should any flags be reset before returning the reference for operator chaining?
I haven't heard of such a thing.
For checking whether the stream is in a good state, you can use the istream::sentry class. Create an object of it, passing the stream and true (to tell it not to skip whitespace immediately). The sentry will evaluate to false if the eof, fail or bad bit is set.
istream::sentry s(stream, true);
if(!s) return stream;
// now, go on extracting data...
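Putting the pieces together, a complete operator>> might look roughly like the following. This is only a sketch under assumptions: Point, its "(x,y)" text format and parse_point are invented for the example, the sentry is constructed with the default second argument so that leading whitespace is skipped, and the fields are read with the built-in extractors for brevity.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical type used only for illustration.
struct Point {
    int x = 0;
    int y = 0;
};

// Reads the textual form "(x,y)". On malformed input the target object is
// left untouched and failbit is set on the stream.
std::istream& operator>>(std::istream& in, Point& p) {
    std::istream::sentry guard(in);  // default noskipws = false: skips leading whitespace
    if (!guard) return in;           // stream was not good; the state is already set

    Point tmp;
    char open = 0, comma = 0, close = 0;
    if (in >> open >> tmp.x >> comma >> tmp.y >> close) {
        if (open == '(' && comma == ',' && close == ')') {
            p = tmp;  // commit only after the whole record has been validated
        } else {
            in.setstate(std::ios_base::failbit);  // wrong punctuation: malformed data
        }
    }
    return in;
}

// Hypothetical helper for trying the operator on a string.
bool parse_point(const std::string& text, Point& p) {
    std::istringstream in(text);
    return static_cast<bool>(in >> p);
}

// Hypothetical usage.
void demo_point() {
    Point p;
    assert(parse_point(" (3,4)", p) && p.x == 3 && p.y == 4);
    assert(!parse_point("(5;6)", p) && p.x == 3 && p.y == 4);  // p keeps its old value
}
```

Reading into a temporary and assigning at the end is a design choice of this sketch; it keeps the caller's object unchanged when parsing fails halfway.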

Some additional notes:
when implementing operator>>, you should probably consider using the
streambuf directly rather than other overloads of operator>>;
exceptions occurring during the operation should be translated to the
failbit or the badbit (members of streambuf may throw, depending on the
class used);
setting the state may throw; if you set the state after catching an
exception, you should propagate the original exception and not the one
thrown by setstate;
the width is a field to which you should pay attention. If you are
taking it into account, you should reset it to 0. If you are using other
operator>> overloads to do the basic work, you have to compute the width
you pass on from the one you received;
consider taking the locale into account.
Langer and Kreft (Standard C++ IOStreams and Locales) cover this in even
more detail. They give template code for the error handling which takes
about one page.
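The first three notes can be sketched as follows. This is not the template from the book, only a small illustration under assumptions: HexByte and its two-hex-digit format are invented, characters are read straight from the streambuf, any exception is translated to badbit, and the original exception is rethrown only if the caller enabled exceptions for badbit.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical type: one byte written as exactly two hex digits.
struct HexByte { unsigned value = 0; };

std::istream& operator>>(std::istream& in, HexByte& out) {
    std::ios_base::iostate err = std::ios_base::goodbit;
    try {
        std::istream::sentry guard(in);  // skips whitespace if skipws is set
        if (guard) {
            std::streambuf* buf = in.rdbuf();
            unsigned value = 0;
            for (int i = 0; i < 2; ++i) {
                const int c = buf->sgetc();  // peek without consuming
                if (c == std::char_traits<char>::eof()) {
                    err |= std::ios_base::eofbit | std::ios_base::failbit;
                    break;
                }
                if (!std::isxdigit(c)) {
                    err |= std::ios_base::failbit;  // malformed data
                    break;
                }
                const int digit = std::isdigit(c) ? c - '0'
                                                  : std::tolower(c) - 'a' + 10;
                value = value * 16 + static_cast<unsigned>(digit);
                buf->sbumpc();  // consume the character just examined
            }
            if (err == std::ios_base::goodbit) out.value = value;
        }
    } catch (...) {
        // Record badbit, but do not let an exception from setstate()
        // replace the one thrown by the streambuf.
        try { in.setstate(std::ios_base::badbit); } catch (...) {}
        if ((in.exceptions() & std::ios_base::badbit) != 0) throw;  // original exception
    }
    if (err != std::ios_base::goodbit) in.setstate(err);  // may throw if the caller asked
    return in;
}

// Hypothetical usage.
void demo_hex_byte() {
    HexByte b;
    std::istringstream ok(" ff");
    ok >> b;
    assert(ok.good() && b.value == 255);

    std::istringstream bad("fz");
    bad >> b;
    assert(bad.fail() && !bad.bad() && b.value == 255);  // value untouched on failure
}
```

The sketch ignores the field width and the locale; an extractor that honours width() would also reset it to 0 before returning.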

As for flags, I don't know if there is any standard somewhere, but it is a good idea to reset them.
Boost has neat RAII wrappers for that: IO State Savers.
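If Boost is not available, the same idea fits in a few lines of standard C++. The class below is a hypothetical stand-in, not Boost's implementation: FlagsGuard and read_hex are names invented for this sketch, and only the format flags are saved (Boost's savers cover more state than that).

```cpp
#include <cassert>
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical std-only guard: remembers the format flags on construction
// and restores them on destruction, even if an exception propagates.
class FlagsGuard {
public:
    explicit FlagsGuard(std::ios_base& s) : stream_(s), saved_(s.flags()) {}
    ~FlagsGuard() { stream_.flags(saved_); }
    FlagsGuard(const FlagsGuard&) = delete;
    FlagsGuard& operator=(const FlagsGuard&) = delete;

private:
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Hypothetical helper: reads one hexadecimal number without leaking
// std::hex into the caller's stream.
int read_hex(std::istream& in) {
    FlagsGuard guard(in);
    int value = 0;
    in >> std::hex >> value;
    return value;
}

// Hypothetical usage.
void demo_flags_guard() {
    std::istringstream in("ff 10");
    assert(read_hex(in) == 255);
    int decimal = 0;
    in >> decimal;          // the base is decimal again here
    assert(decimal == 10);
}
```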

Related

What happens to the data inserted into an unopen stream?

My intuitive feeling is that the data is thrown away entirely. I cannot seem to find a source with which to verify this suspicion.
What happens to data inserted into an unopen stream? (eg. std::ofstream)
Is the data discarded? Perhaps it is stored in a buffer until the stream is opened? Perhaps something else?
The "Remarks" the standard gives for every file stream buffer function that operates on the buffer state that if is_open() == false, the function always fails. Failure is defined as returning traits_type::eof(). The higher-level IO functions detect this special value and in turn set the std::ios_base::badbit flag in the stream state.
If the output stream is in a failed state (e.g. not open), nothing happens to the stream: the request to output or buffer data is ignored entirely.
Note: If exceptions are enabled for std::ios_base::badbit, an exception will be thrown.
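This is easy to observe with a default-constructed std::ofstream. The function names below are made up for this small sketch; the behaviour shown is what the answer above describes.

```cpp
#include <cassert>
#include <fstream>
#include <ios>

// Returns the stream state after inserting into a never-opened std::ofstream.
std::ios_base::iostate write_to_unopened() {
    std::ofstream out;    // no file associated; is_open() is false
    out << "discarded";   // the filebuf refuses the characters
    return out.rdstate();
}

// Hypothetical usage.
void demo_unopened_stream() {
    const std::ios_base::iostate state = write_to_unopened();
    assert((state & std::ios_base::badbit) != 0);  // the write was reported as an error
}
```

Opening the file afterwards does not recover the text: it was never buffered anywhere.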

Can exceptions in a custom implementation of the std::streambuf be delivered to the stream user?

Is it standard-enforced behavior that when an exception is thrown
in a custom std::streambuf method (e.g. xsgetn),
it is caught (some status bits are set) but not rethrown?
Is there any way to alter this (or to pass the error message along without
dirty tricks)?
There is a general guarantee that things like operator<< and
operator>> will not raise an exception unless asked to. You
can ask it to by specifying the exception mask:
stream.exceptions( std::ios_base::badbit );
(You can specify any or all of the error conditions here, but
badbit is probably the only thing you'd ever want to.)
If this is set, and a streambuf function exits through an
exception, that exception will be rethrown.
Another possibility is to maintain information concerning the
error in the streambuf, with a function to extract it. Then,
when the client code detects an error, it can use
dynamic_cast<MyStreambuf*>( stream.rdbuf() ) to get the
derived streambuf, and call its member functions to get the
error information.
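Both options can be sketched with a buffer that always fails. FailingBuf, last_error() and the two helper functions are hypothetical names invented for the example; the error text is arbitrary.

```cpp
#include <cassert>
#include <istream>
#include <stdexcept>
#include <streambuf>
#include <string>

// Hypothetical buffer whose read hook always fails. It records the reason
// before throwing, so clients can query it afterwards.
class FailingBuf : public std::streambuf {
public:
    const std::string& last_error() const { return last_error_; }

protected:
    int_type underflow() override {
        last_error_ = "device unplugged";
        throw std::runtime_error(last_error_);
    }

private:
    std::string last_error_;
};

// Default exception mask: the exception is swallowed, badbit is set, and
// the details are fetched from the buffer through dynamic_cast.
bool swallowed_by_default() {
    FailingBuf buf;
    std::istream in(&buf);
    char c = 0;
    in.get(c);
    auto* mine = dynamic_cast<FailingBuf*>(in.rdbuf());
    return in.bad() && mine != nullptr && mine->last_error() == "device unplugged";
}

// With badbit in the exception mask, the original exception reaches the caller.
bool rethrown_when_asked() {
    FailingBuf buf;
    std::istream in(&buf);
    in.exceptions(std::ios_base::badbit);
    char c = 0;
    try {
        in.get(c);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}

// Hypothetical usage.
void demo_failing_buf() {
    assert(swallowed_by_default());
    assert(rethrown_when_asked());
}
```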

What is the difference between flush() and sync() in regard to fstream buffers?

I was reading the cplusplus.com tutorial on I/O. At the end, it says fstream buffers are synchronized with the file on disc
Explicitly, with manipulators: When certain manipulators are used on
streams, an explicit synchronization takes place. These manipulators
are: flush and endl.
and
Explicitly, with member function sync(): Calling
stream's member function sync(), which takes no parameters, causes an
immediate synchronization. This function returns an int value equal to
-1 if the stream has no associated buffer or in case of failure. Otherwise (if the stream buffer was successfully synchronized) it
returns 0.
in addition to a few other implicit cases ( such as destruction and stream.close() )
What is the difference between calling fstream::flush() and fstream::sync()? endl?
In my code, I've always used flush().
Documentation on std::flush():
Flush stream buffer
Synchronizes the buffer associated with the stream
to its controlled output sequence. This effectively means that all
unwritten characters in the buffer are written to its controlled
output sequence as soon as possible ("flushed").
Documentation on std::streambuf::sync():
Synchronize input buffer with source of characters
It is called to synchronize the stream buffer with the controlled sequence (like the file in the case of file streams). The public member function pubsync calls this protected member function to perform this action.
Forgive me if this is a newbie question; I am a noob.
basic_ostream::flush
This is a non-virtual function which writes uncommitted changes to the underlying buffer. In case of error, it sets an error flag in the stream object used. This is because the return value is a reference to the stream itself, to allow chaining.
basic_filebuf::sync
This is a virtual function which writes all pending changes to the underlying file and returns an error code to signal success or failure.
endl
This, when applied to an ostream, writes an '\n' to the stream and then calls flush on that stream.
So, essentially: flush is a more general function for any stream, whereas sync is explicitly bound to a file. flush is non-virtual, whereas sync is virtual. This changes how they can be used via pointers (to base class) in the case of inheritance. Furthermore, they differ in how they report errors.
sync is a member of input streams; all unread characters are cleared from the buffer. flush is a member of output streams; buffered output is passed down to the kernel.
C++ I/O involves a cooperation between a number of classes: stream, buffer, locale and locale::facet-s.
In particular, sync and flush are member functions that exist in both stream and streambuf, so be careful which documentation you are referring to, since they do different things.
On streams flush tells the stream to tell the buffer (note the redirection) to flush its content onto the destination. This makes sure that no "pending write" remains.
std::endl, when applied to thestream with <<, is no more than a
thestream.put('\n'); thestream.flush();
Still on streams, sync tells the stream to tell the buffer to flush its content (for output) and to read as much as it can to refill the buffer (for input).
Note that, in buffers, sync can also be called internally by overflow to handle the "buffer full" (for output) and "buffer empty" (for input) situations.
My sense is therefore that sync is much more an "internal" function used in stream-to-buffer communication and buffer implementation (where it is virtual and overridden in different buffer types), while flush is much more an interface between the stream and the client program.
endl ... is just a shortcut.
I've understood it to be as follows:
flush() will get the data out of the library buffers into the OS's write buffers and will eventually result in a full synchronization (the data is fully written out), but it's definitely up to the OS when the sync will be complete.
sync() will, to the extent possible in a given OS, attempt to force full synchronization to come about — but the OS involved may or may not facilitate this.
So flush() is: get the data out of the buffer and in line to be written.
sync() is: if possible, force the data to be definitively written out, now.
That's been my understanding of this, but as I think about it, I can't remember how I came to this understanding, so I'm curious to hear from others, too.
What is the difference between calling fstream::flush() and fstream::sync()?
There is none: Both essentially call rdbuf()->pubsync() which then calls std::streambuf::sync(), see links at https://en.cppreference.com/w/cpp/io/basic_fstream
After constructing and checking the sentry object, calls rdbuf()->pubsync()
and https://en.cppreference.com/w/cpp/io/basic_streambuf/pubsync
Calls sync() of the most derived class
The only difference is where the functions are defined: sync is inherited from istream while flush is inherited from ostream (and fstream inherits from both). And of course the return values are different: sync returns an int (0 for ok, -1 for failure) while flush returns a reference to the stream object. But you likely don't care about those anyway.
The naming difference comes from the direction: for an input stream the call "synchronizes" the internal buffer with the input sequence (here a file) in case that changed, whereas for an output stream it "flushes" pending changes from the internal buffer to the output sequence (again, here a file). That is, "sync from" and "flush to" made more sense naming-wise. But for an iostream
And for completeness (almost), from Emilio's answer:
std::endl, when applied to thestream with <<, is no more than a
thestream.put('\n').flush();
So it appends a newline and then calls the stream's flush function, which eventually calls the buffer's sync function (through pubsync).
Just a shortcut to basically use line buffering, i.e. write (up to) the end of that newline, then flush what was written.
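The claim that flush(), endl and the stream's sync() all end up in the buffer's virtual sync() can be checked with a buffer that counts the calls. CountingBuf and count_sync_calls are hypothetical names for this sketch; std::stringbuf is used only because it needs no file.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>

// Hypothetical buffer that counts how often its virtual sync() hook runs.
class CountingBuf : public std::stringbuf {
public:
    int sync_calls = 0;

protected:
    int sync() override {
        ++sync_calls;
        return std::stringbuf::sync();  // base behaviour: nothing to do, returns 0
    }
};

int count_sync_calls() {
    CountingBuf buf;
    std::iostream io(&buf);
    io << "abc";       // plain insertion: no synchronization requested
    io.flush();        // ostream side: rdbuf()->pubsync()
    io << std::endl;   // put('\n') followed by flush()
    io.sync();         // istream side: rdbuf()->pubsync() as well
    return buf.sync_calls;
}

// Hypothetical usage.
void demo_sync_count() {
    assert(count_sync_calls() == 3);  // one call each for flush, endl and sync
}
```

What sync() then does is up to the buffer class: a std::filebuf writes pending output to the file, while the std::stringbuf used here has nothing to write.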

Need help/advice on how to test an ostream passed to a function

I have a function that gets an ostream passed to it. I need to test the ostream to make sure it will work. I put the print statement in a try block with a catch statement to catch all exceptions. Now, to test whether this works, I need a little help creating a bad ostream to pass to the function.
I tried this:
filebuf buffer;
ostream out(&buffer); // a named stream; a non-const reference cannot bind to a temporary
test1.print(out);
But when I test it, I cannot get my error message to be printed. What would be a better way to create a bad ostream to pass to the function?
Also, is there a better way to check the ostream than using a try-catch block?
How about an ofstream that's not open to any file?
Streams don't in general throw exceptions when they don't work, so I would not expect your error message to be printed whether your operation succeeded or not.
The ios base class has a function called exceptions that can be used to control that, but it's fairly anti-social to write a function that changes the state of an argument like that, so you might have to be sure to restore it (even if it throws an exception...)
The usual way to test success of an operation on a stream is using if. A stream evaluates to a true value if everything is OK, a false value otherwise - by which I mean a value that's true/false as far as conditionals are concerned, not the bool values true or false.
To be precise, it's false if the failbit or badbit is set. The last operation "didn't work" if and only if at least one of those is set.
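A minimal sketch of both suggestions, the unopened ofstream and the plain if-style check, might look like this. The free function print is a hypothetical stand-in for the asker's test1.print, and the helper names are invented.

```cpp
#include <cassert>
#include <fstream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for the asker's print(): reports success as a bool
// instead of relying on exceptions.
bool print(std::ostream& out) {
    out << "report\n";
    return static_cast<bool>(out);  // false if failbit or badbit is set
}

// A "bad" stream for testing: an ofstream that was never opened.
bool print_fails_on_unopened_file() {
    std::ofstream closed;
    return !print(closed);
}

// Another one: any stream whose state was set to bad by hand.
bool print_fails_on_bad_stream() {
    std::ostringstream broken;
    broken.setstate(std::ios_base::badbit);
    return !print(broken) && broken.str().empty();  // nothing was written
}

// Hypothetical usage.
void demo_print_checks() {
    std::ostringstream fine;
    assert(print(fine) && fine.str() == "report\n");
    assert(print_fails_on_unopened_file());
    assert(print_fails_on_bad_stream());
}
```

Because the check reads the stream state, the function never has to touch the caller's exception mask.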

C++ Boost io streams, error handling

Is it possible to make a custom stream work like the standard ones with regard to errors? That is, by default use the good/fail/bad/eof bits rather than exceptions?
The Boost docs only mention throwing a std::ios_base::failure for stream errors and letting other errors propagate (e.g. a std::bad_alloc from trying to allocate a buffer). However, the Boost code does not seem to catch these, instead relying on the user code to handle them. All my existing code relies on the good(), bad() etc. methods, and on the clear() method in cases where it needs to try again after an error.
From http://www.trip.net/~bobwb/cppnotes/lec08.htm
The error state can be set using:
void clear(iostate = 0);
The default value of zero results in ios_base::goodbit being set.
clear();
is therefore equivalent to
clear(0);
which is equivalent to
clear(ios_base::goodbit);
Note that ios_base::goodbit is zero: it denotes the absence of any error bit rather than a bit of its own. clear() might also be used to set one of the other bits as part of a programmer's code for operator>>() for a particular object. For example:
if (bad_char) is.clear(ios_base::badbit); // state becomes badbit only; any other bits are cleared
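The difference between clear(), which replaces the whole state, and setstate(), which adds bits to it, is easy to miss. A small sketch, with a made-up function name:

```cpp
#include <cassert>
#include <ios>
#include <sstream>

// Hypothetical demonstration of how clear() and setstate() change rdstate().
void demo_clear_and_setstate() {
    using std::ios_base;
    std::istringstream is("x");

    assert(static_cast<int>(ios_base::goodbit) == 0);  // goodbit means "no bits set"

    is.clear(ios_base::badbit);       // replaces the state: exactly badbit
    assert(is.rdstate() == ios_base::badbit);

    is.setstate(ios_base::failbit);   // adds failbit to what is already there
    assert(is.rdstate() == (ios_base::badbit | ios_base::failbit));

    is.clear();                       // same as clear(ios_base::goodbit)
    assert(is.good());
}
```

Inside an operator>>, setstate(ios_base::failbit) is usually the safer call, because it does not wipe an eofbit or badbit that is already set.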