Why doesn't seekg(0) clear the eof state of a stream? - c++

I would like to know if and why seekg(0) is not supposed to clear the eofbit of a stream.
I have already read the whole stream, so EOF has been reached (but the failbit is not set yet), and I want to go back with seekg() to a valid position and read some characters again. In this case seekg(0) seems "to work" with the eofbit set, but as soon as I try to read from the stream, the failbit is set. Is this behavior correct, or is my implementation at fault? Am I supposed to recognize this case and clear the eofbit manually (if the failbit is not set)?
EDIT:
The following program provided by a reader gives different results in my implementation ( mingw32-c++.exe (TDM-2 mingw32) 4.4.1 ):
#include <sstream>
#include <iostream>
#include <string>
int main() {
std::istringstream foo("AAA");
std::string a;
foo >> a;
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 1 0
foo.seekg(0);
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 0 0
foo >> a;
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 1 0
foo >> a;
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 1 1
}
The comments above are from the user who tried that program in his implementation. I obtain these results:
1 0
1 0
1 1
1 1

According to the new standard, seekg() is supposed to clear the eofbit (§ 27.7.2.3):
basic_istream<charT,traits>& seekg(pos_type pos);
Effects: Behaves as an unformatted input function ..., except that the function first clears eofbit ...
But in the old standard (§ 27.6.1.3) there is no mention of clearing the eofbit!
And a simple test:
#include <sstream>
#include <iostream>
#include <string>
int main() {
std::istringstream foo("AAA");
std::string a;
foo >> a;
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 1 0
foo.seekg(0);
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 0 0
foo >> a;
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 1 0
foo >> a;
std::cout << foo.eof() << " " << foo.fail() << std::endl; // 1 1
}

Why not just manually clear() the stream then go back once the eofbit has been set? EOF has been reached, why should seekg clear it automatically? Doing that would seem to cause more problems.

Related

std::istream::unget() setting fail and bad bits if first read failed but not if second or further reads failed

I have the following code:
#include <algorithm>
#include <ios>
#include <iostream>
#include <string>
#include <vector>
std::vector<int> read_ints(std::istream& is)
{
std::vector<int> res;
std::cout << "Please input a list of integers ('quit' to finish):" << std::endl;
for (int i; is >> i; )
{
res.push_back(i);
}
if (is.eof()) // fine: end of file
{
std::cout << "EOF reached" << std::endl;
return res;
}
std::cout << (is.bad() ? "1) BAD\n" : "");
std::cout << (is.fail() ? "1) FAIL\n" : "");
if (is.fail()) // we failed to read an int: was it the 'quit' string?
{
is.clear(); // reset the state to good
std::cout << (is.bad() ? "2) BAD\n" : "");
std::cout << (is.fail() ? "2) FAIL\n" : "");
is.unget(); // put the non-digit back into the stream
std::cout << (is.bad() ? "3) BAD\n" : "");
std::cout << (is.fail() ? "3) FAIL\n" : "");
std::string s;
if (is >> s && s == "quit")
{
std::cout << "Exiting correctly" << std::endl;
return res;
}
std::cout << "Exiting with an error. User typed: " << s << std::endl;
is.setstate(std::ios_base::failbit); // add fail() to stream's state
}
return res;
}
int main()
{
// Read ints
auto v = read_ints(std::cin);
bool first = true;
std::for_each(v.begin(), v.end(),
[&first](int n) { std::cout << (first ? "" : " ") << n; first = false; });
std::cout << std::endl;
}
If I input one or more numbers followed by a word (e.g. 1hola, 1 2 3 quit), is.unget() sets neither the fail bit nor the bad bit, and I get the expected output messages (e.g. Exiting with an error. User typed: hola, Exiting correctly).
But if I just input a word (e.g. hola or quit), the is.unget() sets fail bit and bad bit and I cannot recover the last input, getting the message Exiting with an error. User typed:.
Why can't we put whatever we read from the stream back into it for the latter case?
https://godbolt.org/z/dMoodE
Your mistake seems to be thinking that when an integer is being read and a non-digit is found, that character is consumed. It isn't. Just remove the call to is.unget().

Why do I get garbage when I try to parse the binary file in C++?

I'm trying to parse .wav files in C++.
The 44 bytes in the header of the .wav file are some of the file's meta information, which I am trying to parse.
I parsed it in Python and got the following, which should be correct
Chunk_id : RIFF
Chunk_size : 468556
Format : WAVE
fmt_id : fmt
fmt_size : 16
audio_format : 1
channels_count : 1
sample_rate : 44100
byte_rate : 88200
block_align : 2
bits_per_sample : 16
data_id : data
data_size : 468520
But when I switched to C++, I got this:
ChunkID: RIFFL&
ChunkSize: 468556
Format: WAVEfmt
FmtID: fmt
FmtChunkSize: 16
FmtAudioFormat: 1
FmtChannelNumber: 1
FmtSampleRate: 44100
FmtByteRate: 88200
FmtBlockAlign: 2
FmtBitPerSample: 16
DataChunkID: data(&
The problem is three fields consisting of a char array of four bytes.
ChunkID: RIFFL&, Format: WAVEfmt, DataChunkID: data(&
As parsed by Python, the contents of the three fields should be RIFF, WAVE, data.
And this is my C++ code.
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
struct WaveChunk
{
char ChunkID[4];
int ChunkSize;
char Format[4];
// fmt
char FmtID[4];
int FmtChunkSize;
short FmtAudioFormat;
short FmtChannelNumber;
int FmtSampleRate;
int FmtByteRate;
short FmtBlockAlign;
short FmtBitPerSample;
// fmt
// data
char DataChunkID[4];
int DataChunkSize;
// data
};
string WaveChunkToString(WaveChunk* wavechunk){
stringstream ss;
ss << "ChunkID: " << wavechunk->ChunkID << "\n";
ss << "ChunkSize: " << wavechunk->ChunkSize << "\n";
ss << "Format: " << wavechunk->Format << "\n";
ss << "FmtID: " << wavechunk->FmtID << "\n";
ss << "FmtChunkSize: " << wavechunk->FmtChunkSize << "\n";
ss << "FmtAudioFormat: " << wavechunk->FmtAudioFormat << "\n";
ss << "FmtChannelNumber: " << wavechunk->FmtChannelNumber << "\n";
ss << "FmtSampleRate: " << wavechunk->FmtSampleRate << "\n";
ss << "FmtByteRate: " << wavechunk->FmtByteRate << "\n";
ss << "FmtBlockAlign: " << wavechunk->FmtBlockAlign << "\n";
ss << "FmtBitPerSample: " << wavechunk->FmtBitPerSample << "\n";
ss << "DataChunkID: " << wavechunk->DataChunkID << "\n";
ss << "DataChunkSize: " << wavechunk->DataChunkSize << endl;
return ss.str();
}
using namespace std;
int main(){
WaveChunk w;
ifstream inf("target.wav", ios::binary|ios::in);
inf.read((char* ) &w, sizeof(WaveChunk));
cout << WaveChunkToString(&w);
return 0;
}
That's it. Why are these three fields parsed differently than expected, while the numeric fields are fine?
In this line and the other lines to print char something[4];:
ss << "ChunkID: " << wavechunk->ChunkID << "\n";
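operator<< treats a char array as a null-terminated C string and keeps reading until it finds the null character '\0'. The four-byte arrays don't contain one, so it reads past the end of each array into the fields that follow. For example, the L& after RIFF are the low bytes 0x4C and 0x26 of ChunkSize (468556 = 0x0007264C, stored little-endian).
To print such an array correctly, you have to specify how many bytes to print.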
It can be done like this:
ss << "ChunkID: "; ss.write(wavechunk->ChunkID, 4); ss << "\n";

libc++: Why is the stream still good after closing

I have a very simple program
#include <iostream>
#include <fstream>
void CHECK(std::iostream& s)
{
std::cout << "good(): " << s.good()
<< " fail(): " << s.fail()
<< " bad(): " << s.bad()
<< " eof(): " << s.eof() << std::endl;
}
int main(int argc, const char * argv[])
{
std::fstream ofs("test.txt", std::ios::out | std::ios::trunc);
std::cout << "opened" << std::endl;
CHECK(ofs);
ofs << "Hello, World!\n";
CHECK(ofs);
ofs.close();
std::cout << "closed" << std::endl;
CHECK(ofs);
ofs << "Hello, World!\n";
std::cout << "after operation" << std::endl;
CHECK(ofs);
return 0;
}
With libc++ I get the following last line:
good(): 1 fail(): 0 bad(): 0 eof(): 0
Expected (or with libstdc++):
good(): 0 fail(): 1 bad(): 1 eof(): 0
I have tested on OSX with Xcode 9.4.1 (and on Linux), and the result is always the same. Can anybody explain the situation to me? The file content was also not updated, because the file was already closed. Why is the stream still good after closing and a further operation?
What I suspect is happening is that the operations are stuffing the data into the rdbuf associated with the stream. That succeeds, as long as there is room in the buffer. Eventually, the buffer gets full, and the stream attempts to write to the file (which is closed) and that fails.
You can test that by making the last bit a loop:
ofs.close();
std::cout << "closed" << std::endl;
CHECK(ofs);
for (int i = 0; i < 500; ++i)
{
ofs << "Hello, World!\n";
std::cout << i << " ";
CHECK(ofs);
}
std::cout << "after operation" << std::endl;
On my machine, it fails after about 300 - and forever after that.
Is this correct behavior? (or even standards-compliant?)
I don't know.
[ Later: If I change libc++ to set the buffer size to 0 upon close, then the first write fails - so that suggests that my analysis is correct. However, I still haven't found anything in the standard about what this 'should' do. ]

How to implement serialization and de-serialization of a double?

I am trying to solve the relatively simple problem of being able to write a double to a file and then to read the file into a double again. Based on this answer I decided to use the human readable format.
I have successfully circumvented the problems some compilers have with nan and [-]infinity according to this question. With finite numbers I use the std::stod function to convert the string representation of a number into the number itself. But from time to time the parsing fails with numbers close to zero, such as in the following example:
#include <cmath>
#include <iostream>
#include <sstream>
#include <limits>
const std::size_t maxPrecision = std::numeric_limits<double>::digits;
const double small = std::exp(-730.0);
int main()
{
std::stringstream stream;
stream.precision(maxPrecision);
stream << small;
std::cout << "serialized: " << stream.str() << std::endl;
double out = std::stod(stream.str());
std::cout << "de-serialized: " << out << std::endl;
return 0;
}
On my machine the result is:
serialized: 9.2263152681638151025201733115952403273156653201666065e-318
terminate called after throwing an instance of 'std::out_of_range'
what(): stod
The program has unexpectedly finished.
That is, the number is too close to zero to be properly parsed. At first I thought that the problem is that this number is denormal, but this doesn't seem to be the case, since the mantissa starts with a 9 and not a 0.
Qt on the other hand has no problems with this number:
#include <cmath>
#include <limits>
#include <QString>
#include <QTextStream>
const std::size_t maxPrecision = std::numeric_limits<double>::digits;
const double small = std::exp(-730.0);
int main()
{
QString string = QString::number(small, 'g', maxPrecision);
QTextStream stream(stdout);
stream.setRealNumberPrecision(maxPrecision);
stream << "serialized: " << string << '\n';
bool ok;
double out = string.toDouble(&ok);
stream << "de-serialized: " << out << '\n' << (ok?"ok":"not ok") << '\n';
return 0;
}
Outputs:
serialized: 9.2263152681638151025201733115952403273156653201666065e-318
de-serialized: 9.2263152681638151025201733115952403273156653201666065e-318
ok
Summary:
Is this a bug in the gcc implementation of standard library?
Can I circumvent this elegantly?
Should I just use Qt?
Answering question #2:
This is probably my "C-way" kind of thinking, but you could copy the double into a uint64_t (mem-copying, not type-casting), serialize the uint64_t instead, and do the opposite on de-serialization.
Here is an example (without even having to copy from double into uint64_t and vice-versa):
uint64_t* pi = (uint64_t*)&small;
stringstream stream;
stream.precision(maxPrecision);
stream << *pi;
cout << "serialized: " << stream.str() << endl;
uint64_t out = stoull(stream.str());
double* pf = (double*)&out;
cout << "de-serialized: " << *pf << endl;
Please note that, to avoid violating the strict-aliasing rule, you actually do need to copy it first, because the standard does not guarantee that double and uint64_t have the same alignment:
uint64_t ismall;
memcpy((void*)&ismall,(void*)&small,sizeof(small));
stringstream stream;
stream.precision(maxPrecision);
stream << ismall;
cout << "serialized: " << stream.str() << endl;
ismall = stoull(stream.str());
double fsmall;
memcpy((void*)&fsmall,(void*)&ismall,sizeof(small));
cout << "de-serialized: " << fsmall << endl;
If you're open to other recording methods you can use frexp:
#include <cmath>
#include <iostream>
#include <sstream>
#include <limits>
const std::size_t maxPrecision = std::numeric_limits<double>::digits;
const double small = std::exp(-730.0);
int main()
{
std::stringstream stream;
stream.precision(maxPrecision);
int exp;
double x = frexp(small, &exp);
//std::cout << x << " * 2 ^ " << exp << std::endl;
stream << x << " * 2 ^ " << exp;
int outexp;
double outx;
stream.seekg(0);
stream >> outx;
stream.ignore(7); // >> " * 2 ^ "
stream >> outexp;
//std::cout << outx << " * 2 ^ " << outexp << std::endl;
std::cout << small << std::endl << outx * pow(2, outexp) << std::endl;
return 0;
}

std::stringstream strange behaviour

Some background information: for a homework assignment I had to write a Polish notation calculator using binary trees. For this to work, I had to parse command line input so that it would properly build the binary tree and then traverse it to give a valid answer to the mathematical expression that was entered.
For the parsing I used a std::stringstream so that I could easily convert the std::string I was handed into a valid float (or integer, or double). The code below showcases the error I ran across and how I worked around it. I was hoping that somebody here would be able to tell me whether I am doing something wrong and .clear() is not correct, or whether this is a bug in the standard library in the way it handles this particular input (it only happens for + and -).
#include <iostream>
#include <sstream>
#include <string>
int main() {
std::string mystring("+");
int num;
char op;
std::stringstream iss(mystring);
iss >> num;
// Seems it is not a number
if (iss.fail()) {
// This part does not work as you would expect it to
// We clear the error state of the stringstream
iss.clear();
std::cout << "iss fail bit: " << iss.fail() << std::endl;
iss.get(op);
std::cout << "op is: " << op << " iss is: " << iss.str() << std::endl;
std::cout << "iss fail bit: " << iss.fail() << std::endl;
// This however works as you would expect it to
std::stringstream oss(iss.str());
std::cout << "oss fail bit: " << oss.fail() << std::endl;
oss.get(op);
std::cout << "op is: " << op << " oss is: " << oss.str() << std::endl;
std::cout << "oss fail bit: " << oss.fail() << std::endl;
} else {
// We got a number
}
}
Sample output from the program:
iss fail bit: 0
op is: iss is: +
iss fail bit: 1
oss fail bit: 0
op is: + oss is: +
oss fail bit: 0
Maybe you guys will see something I missed, or if this is indeed a bug higher up beyond my program, in which case pointers as to where to report this would be greatly appreciated.
When you say:
iss.clear();
std::cout << "iss fail bit: " << iss.fail() << std::endl;
iss.get(op);
you are trying to read something that has already been read. You need to reset the stream's read pointer:
iss.clear();
iss.seekg(0); // start again
std::cout << "iss fail bit: " << iss.fail() << std::endl;
iss.get(op);