I was using the simple file-slurp idiom and decided to add some error checking. I was surprised to find that an empty file gives an error. This doesn't happen with every empty sequence either: inserting "" works fine. I also verified that rdbuf() returns a non-null pointer.
#include <iostream>
#include <sstream>
using namespace std;
int main(int, char**){
    istringstream in(""); // Succeeds if non-empty
    stringstream sstr;
    if (sstr << in.rdbuf()) { // Succeeds if I replace in.rdbuf() with ""
        cout << "Read Successful\n";
    } else {
        cout << "Read Failed\n";
    }
}
It sets failbit because the standard requires it. (I feel foolish now; I thought I might have been doing something wrong.) Section 27.7.3.6.3 paragraph 9 of the November 2014 draft says:
If the function inserts no characters, it calls setstate(failbit) (which may throw ios_base::failure (27.5.5.4)).
Why the committee decided to make this behavior different from other sequence insertions (such as char* and std::string), which do not treat inserting nothing as a failure, is still a mystery. But I now know it is not a failure indicating misuse of the object.
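A minimal sketch of a slurp that tolerates empty files (the peek-based guard is my own workaround, not something the standard prescribes):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main(){
    std::ifstream in("input.txt"); // hypothetical file; may be empty
    std::stringstream sstr;
    // Insert the buffer only if at least one character is available;
    // otherwise operator<< would insert nothing and set failbit.
    if (in.peek() != std::char_traits<char>::eof()) {
        if (!(sstr << in.rdbuf())) {
            std::cout << "Read Failed\n";
            return 1;
        }
    }
    std::cout << "Read Successful: \"" << sstr.str() << "\"\n";
}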
I am trying to store binary data that should have the type std::complex<float> into a vector by iterating over each element of the stream buffer. However, I keep getting an error saying
no matching function for call to ‘std::istreambuf_iterator<std::complex<float> >::istreambuf_iterator(std::ifstream&)’
std::for_each(std::istreambuf_iterator<std::complex<float> >(i_f1),
I've tried searching for a solution but cannot find anything that works. I am also trying to follow an example given in "How to read entire stream into a std::vector?". Furthermore, I'm compiling using g++ with -std=c++11.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <cmath>
#include <complex>
#include <boost/tuple/tuple.hpp>
#include <algorithm>
#include <iterator>

int main(){
    //path to files
    std::string data_path = "/$HOME/some_path/";
    //file to be opened
    std::string f_name1 = "ch1_d2.dat";
    std::ifstream i_f1(data_path + f_name1, std::ios::binary);
    if (!i_f1){
        std::cout << "Error occurred reading file " << f_name1 << std::endl;
        std::cout << "Exiting" << std::endl;
        return 0;
    }
    //Place buffer contents into vector
    std::vector<std::complex<float> > data1;
    std::for_each(std::istreambuf_iterator<std::complex<float> >(i_f1),
                  std::istreambuf_iterator<std::complex<float> >(),
                  [&data1](std::complex<float> vd){
                      data1.push_back(vd);
                  });
    // Test to see if vector was read in correctly
    for (auto i = data1.begin(); i != data1.end(); i++){
        std::cout << *i << " ";
    }
    i_f1.close();
    return 0;
}
I am quite lost as to what I'm doing wrong, so I'm wondering: why does std::istreambuf_iterator not accept the stream I am giving it as a parameter? The error message also confuses me, as it seems to imply that I am either calling the function the wrong way or calling a function that doesn't exist.
Thanks
You want to read std::complex<float> values from i_f1 (which is a std::ifstream) using operator>> for std::complex, so you need a std::istream_iterator instead of a std::istreambuf_iterator¹:
std::for_each(std::istream_iterator<std::complex<float> >(i_f1),
              std::istream_iterator<std::complex<float> >(),
              [&data1](std::complex<float> vd){
                  data1.push_back(vd);
              });
Your code can actually be simplified to:
std::vector<std::complex<float>> data1{
    std::istream_iterator<std::complex<float>>(i_f1),
    std::istream_iterator<std::complex<float>>()};
¹ std::istreambuf_iterator is used to iterate character by character over, e.g., a std::basic_istream's underlying buffer, not to iterate over it using overloads of operator>>.
You're probably using the wrong tool for the job.
You're trying to use a buffer iterator, which iterates over the constituent parts of a stream's buffer. But you're telling your computer that the buffer is one of complex<float>s … it isn't. An ifstream's buffer is of chars. Hence the constructor you're trying to use (one that takes an ifstream with a buffer of complex<float>) does not exist.
You can use an istream_iterator to perform a formatted iteration, i.e. to use the stream's magical powers (in this case, lexically interpreting input as complex<float>s) rather than directly accessing its underlying bytes.
You can read more in the previous question "The difference between istreambuf_iterator and istream_iterator".
The example you linked to also goes some way toward explaining this.
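To make the distinction concrete, here is a small sketch (using an in-memory text stream of my own invention, not the asker's binary file) that walks the same input both ways:

#include <complex>
#include <iostream>
#include <iterator>
#include <sstream>
#include <vector>

int main(){
    // istreambuf_iterator<char>: iterates the raw characters of the buffer.
    std::istringstream chars("(1,2) (3,4)");
    std::vector<char> raw{std::istreambuf_iterator<char>(chars), {}};
    std::cout << raw.size() << " raw chars\n"; // 11

    // istream_iterator<std::complex<float>>: parses values via operator>>.
    std::istringstream values("(1,2) (3,4)");
    std::vector<std::complex<float>> parsed{
        std::istream_iterator<std::complex<float>>(values), {}};
    std::cout << parsed.size() << " parsed values\n"; // 2
}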
Consider the following program:
#include <iostream>
#include <sstream>
#include <string>
int main(int, char **) {
    std::basic_stringstream<char16_t> stream;
    stream.put(u'\u0100');
    std::cout << " Bad: " << stream.bad() << std::endl;
    stream.put(u'\uFFFE');
    std::cout << " Bad: " << stream.bad() << std::endl;
    stream.put(u'\uFFFF');
    std::cout << " Bad: " << stream.bad() << std::endl;
    return 0;
}
The output is:
Bad: 0
Bad: 0
Bad: 1
It seems the badbit gets set because put() sets it when the character equals std::char_traits<char16_t>::eof(). Once that happens, I can no longer put to the stream.
At http://en.cppreference.com/w/cpp/string/char_traits it states:
int_type: an integer type that can hold all values of char_type plus EOF
But if char_type is the same as int_type (uint_least16_t), how can this be true?
The standard is quite explicit: std::char_traits<char16_t>::int_type is a typedef for std::uint_least16_t; see [char.traits.specializations.char16_t], which also says:
The member eof() shall return an implementation-defined constant that cannot appear as a valid UTF-16 code unit.
I'm not sure precisely how that interacts with http://www.unicode.org/versions/corrigendum9.html but existing practice in the major C++ implementations is to use the all-ones bit pattern for char_traits<char16_t>::eof(), even when uint_least16_t has exactly 16 bits.
After a bit more thought, I think it's possible for implementations to meet the Character traits requirements by making std::char_traits<char16_t>::to_int_type(char_type) return U+FFFD when given U+FFFF. This satisfies the requirement for eof() to return:
a value e such that X::eq_int_type(e,X::to_int_type(c)) is false for all values c.
This would also ensure that it's possible to distinguish success and failure when checking the result of basic_streambuf<char16_t>::sputc(u'\uFFFF'), so that it only returns eof() on failure, and returns u'\ufffd' otherwise.
I'll try that. I've created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80624 to track this in GCC.
I've also reported an issue against the standard, so we can fix the "cannot appear as a valid UTF-16 code unit" wording, and maybe fix it some other way too.
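For what it's worth, a quick probe (my own test, not part of the bug report) shows where the traits requirement currently breaks down on implementations where to_int_type is the identity:

#include <iostream>
#include <string>

int main(){
    using T = std::char_traits<char16_t>;
    // The requirement: eof() must not compare equal to to_int_type(c)
    // for any valid character c. With an identity to_int_type and an
    // all-ones eof(), u'\uFFFF' violates it:
    std::cout << std::boolalpha
              << T::eq_int_type(T::eof(), T::to_int_type(u'\uFFFF')) << '\n';
    // Prints true on such implementations -- the root of the problem.
}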
The interesting part of this behavior is that:
stream.put(u'\uFFFF');
sets the badbit, while:
stream << u'\uFFFF';
and:
char16_t c = u'\uFFFF';
stream.write(&c, 1);
do not set the badbit.
This answer focuses only on the differences.
So let's check GCC's source code. In bits/ostream.tcc, lines 164-165, we can see that put() checks whether the value equals eof(), and sets the badbit:
if (traits_type::eq_int_type(__put, traits_type::eof())) // <== It checks the value!
    __err |= ios_base::badbit;
From line 196, we can see that write() does not have this logic; it only checks whether all the characters were written to the buffer.
This explains the behavior.
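So if you need to emit u'\uFFFF', a sketch of the implied workaround is to avoid put() and use write() or operator<< instead (my own illustration, based on the difference described above):

#include <iostream>
#include <sstream>

int main(){
    std::basic_stringstream<char16_t> stream;
    char16_t c = u'\uFFFF';
    stream.write(&c, 1); // no comparison against eof(); badbit stays clear
    stream << c;         // likewise fine
    std::cout << "Bad: " << stream.bad() << std::endl; // prints Bad: 0
}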
From std::basic_ostream::put's description:
Internally, the function accesses the output sequence by first constructing a sentry object. Then (if good), it inserts c into its associated stream buffer object as if calling its member function sputc, and finally destroys the sentry object before returning.
It says nothing about a check against eof(). So I would think this is either a bug in the documentation or a bug in the implementation.
That really depends on what you mean by "large enough". char16_t is not "a type large enough to hold any Unicode character including those which I'm not allowed to use". You chose to try to cram \uFFFF, which is "reserved for internal use", into a char16_t, and thus you are the one at fault. The program is simply doing as you instructed.
Background: I'm trying to ensure that some function foo(std::stringstream&) consumes all data from the stream.
Answers to a previous question suggest that stringstream::str() is the right way to get the content of a stringstream. I've also seen it used to convert an arbitrary type to a string, like this:
std::stringstream sstr;
sstr << 10;
assert(sstr.str() == std::string("10")); // Conversion to std::string for clarity.
However, the notion of "content" is somewhat vague. For example, consider the following snippet:
#include <assert.h>
#include <sstream>
#include <iostream>
int main() {
    std::stringstream s;
    s << "10 20";
    int x;
    s >> x;
    std::cout << s.str() << "\n";
    return 0;
}
On Ideone (as well as on my system) this snippet prints 10 20, meaning that reading from a stringstream does not modify what str() returns. So my assumption is that str() returns some internal buffer, and it's up to the stringstream (or, more likely, its internal rdbuf, which is a stringbuf by default) to track the current position in that buffer. This much is well known.
Looking at the stringbuf::overflow() function (which reallocates the buffer when there is not enough space), I can see that:
this may modify the pointers to both the input and output controlled sequences (up to all six of eback, gptr, egptr, pbase, pptr, epptr).
So, basically, there is no theoretical guarantee that writing to a stringstream won't allocate a bigger buffer. Therefore, even using stringstream::str() to convert an int to a string seems flawed: assert(sstr.str() == std::string("10")) from my first snippet could fail, because the internal buffer is not guaranteed to be precisely the necessary size.
The question is: what is the correct way of getting the "content" of a stringstream, where "content" is defined as "all characters that could still be consumed from the stream"?
Of course, one can read character by character, but I'm hoping for a less verbose solution. I'm also interested in the case where nothing has been read from the stringstream (my first snippet), as I have never seen it fail.
You can use the tellg() function (inherited from std::basic_istream) to find the current input position. If it returns -1, there are no further characters to be consumed. Otherwise you can use s.str().substr(s.tellg()) to return the unconsumed characters in stringstream s.
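A minimal sketch of that approach applied to the earlier snippet (the -1 check matters once extraction has failed):

#include <iostream>
#include <sstream>
#include <string>

int main(){
    std::stringstream s;
    s << "10 20";
    int x;
    s >> x; // consumes "10", leaving " 20"
    auto pos = s.tellg();
    std::string rest = (pos == -1)
        ? std::string()
        : s.str().substr(static_cast<std::string::size_type>(pos));
    std::cout << '"' << rest << "\"\n"; // prints " 20"
}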
It is a well-known method to copy one stream into another using rdbuf:
#include <iostream>
#include <fstream>
int main()
{
    std::ifstream in{"/tmp/foo.txt"};
    std::cerr << in.rdbuf();
    std::cerr << "Done\n";
}
However, this breaks my cerr (it sets the failbit) when /tmp/foo.txt is empty. As a result, Done\n is not displayed.
Why is that? Observed with G++/libstdc++/GNU Linux and Clang++/libc++/OS X.
That seems to be the defined behaviour - see e.g. http://en.cppreference.com/w/cpp/io/basic_ostream/operator_ltlt:
If no characters were inserted, executes setstate(failbit)
I agree it's a bit unhelpful.
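A sketch of one way to keep cerr usable afterwards (clearing the state is my own suggestion, not something the quoted page recommends):

#include <fstream>
#include <iostream>

int main()
{
    std::ifstream in{"/tmp/foo.txt"};
    if (!(std::cerr << in.rdbuf())) {
        // Empty input: operator<< inserted nothing and set failbit on cerr.
        std::cerr.clear();
    }
    std::cerr << "Done\n"; // now always displayed
}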
I was recently tripped up by a subtle distinction between the behavior of std::istream::read and std::istream::ignore. Basically, read extracts N bytes from the input stream and stores them in a buffer. The ignore function extracts N bytes from the input stream but simply discards them rather than storing them in a buffer. So my understanding was that read and ignore are basically the same in every way, except that read saves the extracted bytes, whereas ignore just discards them.
But there is another subtle difference between read and ignore that caught me out. If you read to the end of a stream, the EOF condition is not triggered; you have to read past the end of the stream for that. But with ignore it is different: you only need to read to the end of the stream.
Consider:
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
    {
        std::stringstream ss;
        ss << "abcd";
        char buf[1024];
        ss.read(buf, 4);
        std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
    }
    {
        std::stringstream ss;
        ss << "abcd";
        ss.ignore(4);
        std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
    }
}
On GCC 4.4.5, this prints out:
EOF: false
EOF: true
So, why is the behavior different here? Is there some compelling reason for EOF to be triggered "early" by a call to ignore?
eof() should only return true if you have already attempted to read past the end. In neither case should it be true. This may be a bug in your implementation.
I'm going to go out on a limb here and answer my own question: it really looks like this is a bug in GCC.
The standard reads in 27.6.1.3 paragraph 23:
[istream::ignore] behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1). After constructing a sentry object, extracts characters and discards them. Characters are extracted until any of the following occurs:
- if n != numeric_limits<streamsize>::max() (18.2.1), n characters are extracted
- end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit), which may throw ios_base::failure (27.4.4.3));
- c == delim for the next available input character c (in which case c is extracted). Note: The last condition will never occur if delim == traits::eof()
My (somewhat tentative) interpretation is that GCC is wrong here, because of the "behaves as an unformatted input function" and "end-of-file occurs on the input sequence" wording above. ignore should behave as an unformatted input function (like read()), which means that end-of-file should only occur on the input sequence if there is an attempt to extract additional bytes after the last byte in the stream has been extracted.
I'll submit a bug report if I find that enough people agree with this answer.
The consensus seemed to be that this was a legitimate bug in gcc. Since I saw no indication a bug report had been filed, I'm doing so now. The report can be viewed at:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51651
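In the meantime, a portable way to check whether ignore discarded everything, regardless of when an implementation raises eofbit, is gcount() (a sketch of my own, based on the snippet above):

#include <iostream>
#include <sstream>

int main()
{
    std::stringstream ss;
    ss << "abcd";
    ss.ignore(4);
    // gcount() reports how many characters the last unformatted input
    // function (ignore included) extracted, independent of eofbit.
    std::cout << "Discarded all: " << std::boolalpha
              << (ss.gcount() == 4) << std::endl; // true either way
}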