Unexpected behavior when reading a character from istringstream

I have a question about stream behavior; see the following example. I expected both ss_char and ss_int to be in the eof state, but only ss_int is.
My question is: why isn't ss_char in the eof state?
Is operator>> the wrong tool here, so that I should use istringstream::get() instead? And if so, why was the value still read successfully?
Output:
char value: a
int value: 42
ss_char eof: false // why false?
ss_int eof: true
Sorry for my poor English. I’m working on improving my English.
#include <iostream>
#include <sstream>
int main(int /*argc*/, char const * /*argv*/[])
{
char c;
int num;
std::istringstream ss_int("42");
std::istringstream ss_char("a");
if (ss_char >> c)
{
std::cout << "char value: " << c << std::endl;
}
else
{
std::cout << "cannot read char" << std::endl;
}
if (ss_int >> num)
{
std::cout << "int value: " << num << std::endl;
}
else
{
std::cout << "cannot read int" << std::endl;
}
std::cout << std::endl;
std::cout << "ss_char eof: " << std::boolalpha << ss_char.eof() << std::endl; // why false
std::cout << "ss_int eof: " << std::boolalpha << ss_int.eof() << std::endl;
return 0;
}

CppReference says: "This function only reports the stream state as set by the most recent I/O operation, it does not examine the associated data source. For example, if the most recent I/O was a get(), which returned the last byte of a file, eof() returns false. The next get() fails to read anything and sets the eofbit. Only then eof() returns true."
eofbit is set when a read operation attempts to read beyond the end of the file, but not when it reads exactly up to the end without trying to go further. When you read the char, the stream knows it needs a single character, so the read succeeds and the read position advances by one, which puts it at the end. The stream has not yet noticed that this is the end; it will notice if you try to read something else. When you read an integer, the stream has to look beyond 42 because the length of the number is not known in advance (it could have been 42901), so it keeps reading until it sees a character that cannot be part of the number, such as a space or a line break, or until it reaches the end of the stream.
The result of operator>> is the stream itself. When it is converted to bool (or to void* before C++11), it behaves like !fail(), so it tells you whether the read succeeded, regardless of whether it reached the end of the file. If the stream is now at the end, the next read operation will fail.

The EOF condition doesn't actually occur until you try to read past the end of the stream.
In the char case you read exactly one character, the only one available. You don't try to read past the end because there is no need to.
Extracting an int, on the other hand, attempts to consume as many digits as possible. It reads the 4 and the 2, and then it tries to read again to see if there is another digit to consume, so in this case it does attempt to read past the end. It notices that the input has come to an end and finishes the conversion of 42.

When extracting chars, operator>> pulls a single character at a time, skipping leading whitespace before each one.
When extracting an int, the parser attempts to pull as many characters as it can to form the number. This causes the integer extraction to hit the eof in your test case.

Related

the characters in a c++ string are being ignored

I'm trying to write to a hid device using signal11's hidapi (here).
In my troubleshooting, I've noticed that part of a string isn't being displayed to the console.
Here is a sample of my code
//device is a hid device and is assigned to in another part of the program.
//dataBuffer is a struct with only a char array called "buffer" and an int which is the size of the array called "size"
void DeviceCommands::Write(hid_device* device, dataBuffer* buf)
{
std::cout << "Attempting write >> buffer...\n";
buf->buffer[0] = 0;
std::cout << "Written to buffer...\n" << "writing buffer to device\n";
int res = hid_write(device, buf->buffer, sizeof(buf->buffer));
std::cout << "Write success: " + '\n';
std::cout << "Write complete\n";
}
I'm expecting for the console to return the following:
Attempting write >> buffer...
Written to buffer...
writing buffer to device
Write success: (0 if the write succeeds, -1 if it fails)
Write complete
But instead, this happens:
Attempting write >> buffer...
Written to buffer...
writing buffer to device
ess: Write complete
The "Write succ", the result, and the line break are missing. I'm somewhat new to C++ but I have experience with C#. I'm just confused, and some help would be much appreciated. Thanks in advance, and ask if you need more information!

This line:
std::cout << "Write success: " + '\n';
will print the string "Write success: " starting at an offset of 10 characters, because 10 is the ASCII value of \n. Hence you see ess: on the screen.
You probably want:
std::cout << "Write success: " << res << "\n";
assuming res holds 0 or -1 as needed.

Do not 'add' a character to a string. It will not do what you expect.
Here you are thinking you are adding the line feed character to your string "Write success: " when in fact you are telling the compiler to take your constant string and stream it starting at offset 10 (that is, from the 11th character). Remember that a string literal here is just an array of characters, and the single character '\n' is converted to the number 10, so the + is pointer arithmetic.
You are also not streaming the result.
So your second to last line should read:
std::cout << "Write success: " << res << std::endl;

How to determine in C++ if an element in a text file is a character or numeric?

I am trying to write C++ code that reads a text file containing a series of numbers. For example, I have this .txt file, which contains the following series of numbers mixed with a character:
1 2 3 a 5
I am trying to make the code capable of telling numbers from characters, such as the 4th entry above (which is a character), and then reporting an error.
What I am doing is this:
double value;
while(in) {
in >> value;
if(!isdigit(value)) {
cout << "Has non-numeric entry!" << endl;
break;
}
else
// some codes for storing the entry
}
However, the isdigit function doesn't work for my text file. It seems that when I do in >> value, the code implicitly converts a into a double.
Can anyone give me a suggestion?
Thanks a lot!

Check what your while loop actually iterates.
Without curly braces around the body, it only iterates one statement:
in >> value;
The rest of the statements are then outside the loop.
Using curly braces for the while body is always recommended.

I created a small mini script where I would be reading in a file through a standard fstream library object as I was a little unsure on what your "in" represented.
Essentially, try to read in every element as a character and check the digit function. If you're reading in elements that are not of just length 1, a few modifications would have to be made. Let me know if that's the case and I'll try to help!
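#include <cctype>    // std::isdigit
#include <fstream>
#include <iostream>
int main() {
    std::fstream fin("detect_char.txt");
    char x;
    // operator>> skips whitespace, so x is never a space or a newline.
    while (fin >> x) {
        // Cast to unsigned char: isdigit is undefined for negative char values.
        if (!std::isdigit(static_cast<unsigned char>(x))) {
            std::cout << "found non-int value = " << x << '\n';
        }
    }
    std::cout << '\n';
    return 0;
}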

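Try reading each token into a string and parsing it explicitly:
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <string>
using namespace std;
int main() {
    ifstream infile("data.txt");
    string token;
    while (infile >> token) {
        try {
            double num = stod(token);
            cout << num << endl;
        }
        catch (const invalid_argument&) {   // token does not start with a number
            cerr << "Has non-numeric entry!" << endl;
        }
        catch (const out_of_range&) {       // number does not fit in a double
            cerr << "Has out-of-range entry!" << endl;
        }
    }
}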

Since it looks like the Asker's end goal is to have a double value for their own nefarious purposes and not simply detect the presence of garbage among the numbers, what the heck. Let's read a double.
double value;
while (in) // loop until failed even after the error handling case
{
if (in >> value) // read a double.
{
std::cout << value; // printing for now. Store as you see fit
}
else // failed to read a double
{
in.clear(); // clear error
std::string junk;
in >> junk; // easiest way I know of to read up to any whitespace.
// It's kinda gross if the discard is long and the string resizes
}
}
Caveat:
What this can't handle is input like 3.14A. This will be read as 3.14 and stop, returning the 3.14 and leaving the A for the next read, where it will fail to parse and then be consumed and discarded by in >> junk;. Catching that efficiently is a bit trickier and is covered by William Lee's answer. If the exception handling of stod is deemed too expensive, use strtod and test that the end parameter reached the end of the string and that no range errors were generated. See the example in the linked strtod documentation.

Trying to understand cin behavior

I am writing software that takes a (huge) input stream from stdin and reads it into a vector of floats. I want to capture the case that the stream contains characters like commas, and either fail to accept it, or simply ignore everything that cannot be parsed as float (whichever is easier to implement, I have no preference). I have noticed the following behavior: when I call
echo "1.4, -0.7 890 23e-3" | ./cintest
this version
#include <iostream>
using std::endl;
using std::cin;
using std::cout;
int main ( int argc, const char* argv[] ){
float val;
while (cin >> val) {
cout << val << endl;
}
return 0;
}
prints
1.4
while this version
#include <iostream>
using std::endl;
using std::cin;
using std::cout;
int main ( int argc, const char* argv[] ){
float val;
while (cin) {
cin >> val;
cout << val << endl;
}
return 0;
}
prints
1.4
0
Without the comma, the first one prints
1.4
-0.7
890
0.023
while the second one prints
1.4
-0.7
890
0.023
0.023
Could somebody please explain what is going on here?

The first version of your code
while (cin >> val) {
tries to parse a float, and then checks whether the stream state is good. (Specifically, it calls operator>> to do the extraction, which will set failbit on error, and then uses the bool conversion to test failbit).
So, if the stream state is bad (because it couldn't convert , to a float), the loop body is not entered. Hence it terminates on the first failed conversion.
The second version
while (cin) {
cin >> val;
checks if the stream state is good (which just tells you the previous conversion succeeded), then tries parsing a float, and then assumes this succeeded without checking. It ought to check the stream state after conversion before using the float value, which in this case is left over from the previous iteration.
In a correct implementation, when conversion fails, you should check whether fail() is true but eof() is false (i.e., the conversion failed for some reason other than end-of-file). In this case, use ignore() to discard input: you could either require whitespace (and ignore until the next space), or just ignore one character and try again.
Note that the ignore documentation linked above includes sample code with correct error handling. If we choose to skip a single character on failed conversions, your code would become:
for(;;) {
float val;
std::cin >> val;
if (std::cin.eof() || std::cin.bad()) {
break;
} else if (std::cin.fail()) {
std::cin.clear(); // unset failbit
std::cin.ignore(1); // skip next char
} else {
std::cout << val << '\n';
}
}

Your results have to do with when the >> fails.
In both versions, you read your values and reach the comma (or EOF). The read obviously fails, because , and EOF are not valid floats that >> can parse. So >>'s return value (the stream itself) converts to false, and you exit the loop in the first version (this is how it should work).
In the second version (which isn't how you should usually do it), however, you still print whatever value ends up in val. Before C++11, val is left unchanged when >> fails; since C++11, >> writes 0 when it attempts a conversion and the conversion fails, which is the 0 you see after the comma. When nothing is left to read at all, no conversion is attempted and val keeps its previous value, which is why 0.023 is printed twice.
TL;DR: Your second version stops the loop too late, and writes one pass of garbage.

It's because in the second one you have a bug.
You should always check to see if the formatted input operator>> actually worked.
So this code:
cin >> val;
cout << val << endl;
Should be written as:
if (cin >> val) {
cout << val << endl;
}
If operator>> fails, it sets one of the failure bits on the stream and does not put a usable value into val (since C++11 a failed conversion stores 0). So there is no point in printing val, because nothing meaningful was read into it.
This is why your second version prints garbage when it has no data left to read. The read fails and then you print out a value. Then you try and re-start the loop (which fails).
The first one works correctly.
while (cin >> val) {
cout << val << endl;
}
Because you read a value and then check whether the read worked before entering the loop body.

std::cin is an instance of std::istream.
When there is a comma or any other input that does not match the target type, operator>> fails, and your code then prints whatever is left in val.
Linked here is a reference for 'std::istream >>'
http://www.cplusplus.com/reference/istream/istream/operator%3E%3E/

Why is part of my code being skipped and not letting me enter input?

Why does my code skip the last question when I put too much info in for the first one? What am I doing wrong?
const int SIZEC =31;
char phrase[SIZEC];
cout << " Provide a phrase, up to 30 characters with spaces. > " << endl;
cin.getline(phrase, SIZEC);
cin.ignore(numeric_limits<streamsize>::max(), '\n');
cout << " The phrase is: " << phrase << endl;
cout << endl;
cout << " Using sring Class Obects " << endl;
cout << "--------------------------" << endl;
cout << endl;
string leter;
cout << " Provide a single character > " << endl;
cin >> leter;
cout << " The single character is: " << leter << endl;
cout << endl;
If the code before this is needed tell me and I'll add it.

Use std::string::resize as a workaround.
string phrase;
getline(cin, phrase);
if (phrase.size() > 30)
    phrase.resize(30); // cut down to 30 chars; shorter input is left alone
string letter; // better to use char letter
cin >> letter;
if (letter.size() > 1)
    letter.resize(1); // keep only the first character

The main problem is that getline behaves differently in two cases:
If the line holds at least SIZEC characters before the newline (i.e. storing it would take at least SIZEC+1 bytes including the terminating null), getline stops after SIZEC-1 characters and sets the so-called failbit status bit on the stream, which means "I have failed to read something, so the input stream is probably incorrect". Quoting cplusplus.com:
The failbit flag is set if the function extracts no characters, or if
the delimiting character is not found once (n-1) characters have
already been written to s.
If a newline character is encountered, failbit is not set and the newline character is successfully read and discarded by getline.
What happens next is more interesting: extraction functions (all of them, I assume) fail immediately if the input stream is not good() (that is, if any of failbit, badbit, or eofbit is set on the stream). In particular, if a previous extraction operation failed, all subsequent ones will fail as well. So, basically, if the first line of the input does not fit in your phrase array, cin enters a failed state and all further read operations do nothing.
You can override that behavior by manually resetting the failbit after calling getline like this:
cin.clear();
Following read operations will succeed until another one fails.
In your particular case, I assume that you want to read the first line regardless of its length, and then a second line. In that case, I think you should first check whether getline failed (by checking cin.fail() or cin.good()) and then either do nothing (if it did not fail, so there is no leftover newline to read) or reset the failbit and ignore characters up to the first newline. Something like this:
#include <iostream>
#include <limits>
#include <string>
using namespace std;
int main() {
char buf[5];
cin.getline(buf, sizeof buf);
if (!cin) { // operator! is overloaded for istream; !cin is equivalent to cin.fail()
// If stream is bad: first line of the input was truncated,
// and following newline character was not read.
// Clear failbit so subsequent read operations do something.
cin.clear();
// Read remaining of the first line.
cin.ignore(numeric_limits<streamsize>::max(), '\n');
}
// At this point, first line of the input is consumed
// regardless of its length
int x;
cin >> x;
cout << "String: |" << buf << "|\n";
cout << "x: " << x << "\n";
}
You can read more on StackOverflow here and there.
However, if there is no reason to use C-style string together with istreams, I'd suggest you using string and std::getline instead (like in Shreevardhan's answer); it will produce cleaner code and there will be no extra cases.

EOF suddenly reaching with getline() in ifstream

I have a file "in.txt" which consists of 2 strings:
abcde
12345
I have my code:
#include <iostream>
#include <fstream>
int main() {
std::ifstream fileIn("in.txt", std::ios::in);
char* chPtr = new(char[10]);
char ch;
printf("fileIn.get()==EOF?: %d \n", (fileIn.get() == EOF)); // =0
std::cout << "fileIn.eof() = " << fileIn.eof() << "\n"; // =0
fileIn.getline(chPtr, 3);
std::cout << "chPtr-" << chPtr << "\n"; //output:"bc" (see 1.)
fileIn.get(ch);
std::cout << "ch-" << ch << "\n"; //(see 2.)
printf("fileIn.get()==EOF?: %d \n", (fileIn.get() == EOF)); // =1 (see 3.)
std::cout << "fileIn.eof() = " << fileIn.eof() << "\n"; // =0 (see 4.)
fileIn.close();
delete[] chPtr;
}
Remarks to code:
(1.) The first symbol 'a' was eaten by the get() just above, so the next 2 symbols are read here, and the 3rd symbol that I wanted to read is automatically set to '\0' by getline() (if I understand correctly).
(2.) And here is the question: this outputs a symbol with code [-52]. Unfortunately I don't have enough reputation to post images =( (The symbol looks like 2 vertical white lines, and the right line of the pair has a gap in the middle.)
(For information: I get this symbol every time I read an uninitialized element of a char array into a char variable.)
But why do I get it here? There are still unread symbols in the 1st string, plus the whole 2nd string!
(3.) It turns out that the cursor suddenly moved to the end of the file. But WHY? I can't understand it.
(4.) We still have zero here because (if I understand correctly) there was no attempt to read data beyond the end of the file. The cursor just moved to the position after the last symbol of the file, but not past the end of the file.

If istream::getline manages to read count-1 characters (count is 3 in your example) before reaching the delimiter or EOF, it sets failbit. See the reference.
This means all further extractions will fail unless you clear the flag; it does not mean that the cursor moved to the end. That is why get() returns EOF while eof() is still false, and why ch never gets initialized.
