difference between << s.str() and << s.rdbuf() - c++

Can someone explain the subtle difference in:
std::ofstream f("test.txt");
std::stringstream s;
s << "";
f << s.rdbuf();
f.good(); // false: the file stream is bad!!
and:
std::ofstream f("test.txt");
std::stringstream s;
s << "";
f << s.str();
f.good(); // is still ok!
I mostly use .rdbuf() to push the stringstream to the file (because it's more efficient), but if the stringstream is empty then the file stream goes bad...? Isn't this stupid?
I think I don't quite understand << s.rdbuf() ...

The insertion operator that "inserts" streambuffers sets the failbit if no characters could be extracted from the streambuffer - [ostream.inserters]/9:
If the function inserts no characters, it calls setstate(failbit)
(which may throw ios_base::failure (27.5.5.4)).
Whereas the insertion operator that outputs a string obviously doesn't consider the number of characters written.
It seems that this is because inserting a streambuffer "forwards" the streambuffer into the stream: if no characters could be extracted, most likely there was an error in the streambuffer itself, and this error should be represented by the stream's error state. Outputting an empty stream is an exception that was presumably not considered important enough to take into account when this rule was created.

Related

Is calling `stringstream::str()` for getting what's printed actually legal?

Pre-history: I'm trying to ensure that some function foo(std::stringstream&) consumes all data from the stream.
Answers to a previous question suggest that using stringstream::str() is the right way of getting content of a stringstream. I've also seen it being used to convert arbitrary type to string like this:
std::stringstream sstr;
sstr << 10;
assert(sstr.str() == std::string("10")); // Conversion to std::string for clarity.
However, the notion of "content" is somewhat vague. For example, consider the following snippet:
#include <assert.h>
#include <sstream>
#include <iostream>
int main() {
    std::stringstream s;
    s << "10 20";
    int x;
    s >> x;
    std::cout << s.str() << "\n";
    return 0;
}
On Ideone (as well as on my system) this snippet prints 10 20, meaning that reading from stringstream does not modify what str() returns. So, my assumption is that str() returns some internal buffer and it's up to stringstream (or, probably, its internal rdbuf, which is stringbuf by default) to handle "current position in that buffer". It's a known thing.
Looking at stringbuf::overflow() function (which re-allocates the buffer if there is not enough space), I can see that:
this may modify the pointers to both the input and output controlled sequences (up to all six of eback, gptr, egptr, pbase, pptr, epptr).
So, basically, there is no theoretical guarantee that writing to stringstream won't allocate a bigger buffer. Therefore, even using stringstream::str() for converting int to string is flawed: assert(sstr.str() == std::string("10")) from my first snippet can fail, because internal buffer is not guaranteed to be precisely of the necessary size.
Question is: what is the correct way of getting the "content" of stringstream, where "content" is defined as "all characters which could be consumed from the stream"?
Of course, one can read char-by-char, but I hope for a less verbose solution. I'm interested in the case where nothing is read from stringstream (my first snippet) as I never saw it fail.
You can use the tellg() function (inherited from std::basic_istream) to find the current input position. If it returns -1, the stream is in a failed or end-of-file state and there are no further characters to be consumed. Otherwise you can use s.str().substr(s.tellg()) to return the unconsumed characters in stringstream s.

peek() behavior in istringstream class

I saw a lot of questions on the peek method, but mine concerns a topic that may be almost obvious but is nevertheless (I think) interesting.
Suppose you have a binary file to read, and you choose to load it as a whole into program memory and use an istringstream object to perform the reading.
For instance, if you are searching for the position of a given byte in the stream, accessing the hard disk repeatedly would waste time and resources...
But once you create the istringstream object, any NULL byte is treated as an EOF signal.
At least this is what happened to me in the following short code:
// obvious parts omitted
std::istringstream is(buffer);
// where buffer is declared as char *
// and filled up with the contents of
// a binary file
char sample = 'a';
while(!is.eof() && is.peek() != sample)
{ is.get(); }
std::cout << "found " << sample << " at " << is.tellg() << std::endl;
This code works with neither g++ 4.9 nor clang 3.5 when there is a null byte inside buffer before a match with sample can be found, since that null byte sets the eof bit.
So my question is: should this kind of approach be avoided altogether, or is there some way to teach peek that a null byte is not "necessarily" the end of the stream?
If you look at your std::istringstream constructors, you'll see (2) takes a std::string. That can have embedded NULs, but if you pass buffer and that's a character array or char*, the string constructor you implicitly invoke will use a strlen-style ASCIIZ length determination to work out how much data to load. You should instead specify the buffer size explicitly - something like:
std::string str(buffer, bytes);
std::istringstream is(str);
Then your while(!is.eof() ...) is bodgy; there are hundreds of S.O. Q&As about that issue.

What happens to << when using an ofstream without any filename?

(1) default constructor
Constructs an ofstream object that is not associated with any file.
Internally, its ostream base constructor is passed a pointer to a newly constructed filebuf object (the internal file stream buffer).
What happens to << when using an ofstream that has no filename associated with it?
ofstream ofstream;
ofstream << 1 << endl;
Where does the "1" go? Are there any problems? I tried it and saw no issues, but I can't find any clue in the code for this. Can anybody show the internal code that explains it?
Nothing happens.
[C++11: 27.9.1.1/3]: In particular:
If the file is not open for reading the input sequence cannot be read.
If the file is not open for writing the output sequence cannot be written.
A joint file position is maintained for both the input sequence and the output sequence
The stream is closed, an error flag is set and the data is ignored.
Example:
#include <iostream>
#include <fstream>
int main()
{
    std::ofstream ofs;
    ofs << 1 << std::endl;
    std::cout << ofs.good() << std::endl;
}
// Output: 0
The short version: the operations on the ofstream all fail, causing nothing to happen. The data that you write is lost and not stored anywhere, and an error flag will be set (badbit, for a formatted insertion like this one), causing the stream's fail() member function to return true.
The long version: when an ofstream is constructed without specifying a file, it default-constructs a filebuf. This creates a filebuf where is_open evaluates to false. As part of the stream insertion operation, the data to be written will need to be sent to the disk by calling filebuf::overflow, which, since is_open is false, will return EOF, causing the operation to fail.
Hope this helps!

Different EOF behavior with read versus ignore

I was recently just tripped up by a subtle distinction between the behavior of std::istream::read versus std::istream::ignore. Basically, read extracts N bytes from the input stream, and stores them in a buffer. The ignore function extracts N bytes from the input stream, but simply discards them rather than storing them in a buffer. So, my understanding was that read and ignore are basically the same in every way, except for the fact that read saves the extracted bytes whereas ignore just discards them.
But there is another subtle difference between read and ignore which managed to trip me up. If you read to the end of a stream, the EOF condition is not triggered. You have to read past the end of a stream in order for the EOF condition to be triggered. But with ignore it is different: you only need to read to the end of a stream.
Consider:
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
    {
        std::stringstream ss;
        ss << "abcd";
        char buf[1024];
        ss.read(buf, 4);
        std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
    }
    {
        std::stringstream ss;
        ss << "abcd";
        ss.ignore(4);
        std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
    }
}
On GCC 4.4.5, this prints out:
EOF: false
EOF: true
So, why is the behavior different here? This subtle difference managed to confuse me enough to wonder why there is a difference. Is there some compelling reason that EOF is triggered "early" with a call to ignore?
eof() should only return true if you have already attempted to read past the end. In neither case should it be true. This may be a bug in your implementation.
I'm going to go out on a limb here and answer my own question: it really looks like this is a bug in GCC.
The standard reads in 27.6.1.3 paragraph 23:
[istream::ignore] behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1). After constructing a sentry object, extracts characters and discards them. Characters are extracted until any of the following occurs:
if n != numeric_limits<streamsize>::max() (18.2.1), n characters are extracted
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit), which may throw ios_base::failure (27.4.4.3));
c == delim for the next available input character c (in which case c is extracted).
Note: The last condition will never occur if delim == traits::eof()
My (somewhat tentative) interpretation is that GCC is wrong here, because of the wording quoted above. ignore should behave as an unformatted input function (like read()), which means that end-of-file should only occur on the input sequence if there is an attempt to extract additional bytes after the last byte in the stream has been extracted.
I'll submit a bug report if I find that enough people agree with this answer.
The consensus seemed to be that this was a legitimate bug in gcc. Since I saw no indication a bug report had been filed, I'm doing so now. The report can be viewed at:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51651

Reading a fixed number of chars with >> on an istream

I was trying out a few file reading strategies in C++ and I came across this.
ifstream ifsw1("c:\\trys\\str3.txt");
char ifsw1w[3];
do {
    ifsw1 >> ifsw1w;
    if (ifsw1.eof())
        break;
    cout << ifsw1w << flush << endl;
} while (1);
ifsw1.close();
The contents of the file were
firstfirst firstsecond
secondfirst secondsecond
When I see the output it is printed as
firstfirst
firstsecond
secondfirst
I expected the output to be something like:
fir
stf
irs
tfi
.....
Moreover I see that "secondsecond" has not been printed. I guess that the last read has met the eof and the cout might not have been executed. But the first behavior is not understandable.
The extraction operator has no concept of the size of the ifsw1w variable, and (by default) is going to extract characters until it hits whitespace or eof. These are likely being stored in the memory locations after your ifsw1w variable, which would cause bad bugs if you had additional variables defined.
To get the desired behavior, you should be able to call
ifsw1.width(3);
before each extraction (the width is reset to 0 after every >>) to limit the number of characters to extract. Note that a width of 3 stores at most two characters plus the terminating '\0'.
It's virtually impossible to use std::istream& operator>>(std::istream&, char *) safely -- it's like gets in this regard -- there's no way for you to specify the buffer size. The stream just writes to your buffer, going off the end. (Your example above invokes undefined behavior). Either use the overloads accepting a std::string, or use std::getline(std::istream&, std::string).
Checking eof() is incorrect. You want fail() instead. You really don't care if the stream is at the end of the file, you care only if you have failed to extract information.
For something like this you're probably better off just reading the whole file into a string and using string operations from that point. You can do that using a stringstream:
#include <fstream> // For ifstream
#include <string> // For string
#include <sstream> // For stringstream
#include <iostream> // As before
std::ifstream myFile(...);
std::stringstream ss;
ss << myFile.rdbuf(); // Read the file into the stringstream.
std::string fileContents = ss.str(); // Now you have a string, no loops!
You're trashing the memory... it's reading past the 3 chars you defined (it's reading until a space or a new line is met...).
Read char by char to achieve the output you had mentioned.
Edit : Irritate is right, this works too (with some fixes and not getting the exact result, but that's the spirit):
char ifsw1w[4];
do {
    ifsw1.width(4);
    ifsw1 >> ifsw1w;
    if (ifsw1.eof()) break;
    cout << ifsw1w << flush << endl;
} while (1);
ifsw1.close();
The code has undefined behavior. When you do something like this:
char ifsw1w[3];
ifsw1 >> ifsw1w;
The operator>> receives a pointer to the buffer, but has no idea of the buffer's actual size. As such, it has no way to know that it should stop reading after two characters (and note that it should be 2, not 3 -- it needs space for a '\0' to terminate the string).
Bottom line: in your exploration of ways to read data, this code is probably best ignored. About all you can learn from code like this is a few things you should avoid. It's generally easier, however, to just follow a few rules of thumb than try to study all the problems that can arise.
Use std::string to read strings.
Only use fixed-size buffers for fixed-size data.
When you do use fixed buffers, pass their size to limit how much is read.
When you want to read all the data in a file, std::copy can avoid a lot of errors:
std::vector<std::string> strings;
std::copy(std::istream_iterator<std::string>(myFile),
std::istream_iterator<std::string>(),
std::back_inserter(strings));
To read the whitespace, you could use "noskipws"; it will not skip whitespace.
ifsw1 >> noskipws >> ifsw1w;
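But if you want to read a fixed number of characters, I suggest you use the get method. Note that get(buf, n) stores at most n - 1 characters plus a terminating '\0', so with the 3-byte buffer this reads two characters at a time: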
ifsw1.get(ifsw1w,3);