#include <iostream>
#include <sstream>
#include <fstream>
using namespace std;
int main(){
istringstream input("1234");
char c[5];
while(input.getline(c, 5, '\n')){
cout << "OUTPUT: " << c << endl;
}
}
The output is
OUTPUT: 1234
when, as far as I can tell, every source says input should test as false and there should be no output. From the standard (N3337) [27.7.2.3]/18:
Effects: Behaves as an unformatted input function (as described in 27.7.2.3, paragraph 1). After constructing a sentry object, extracts characters and stores them into successive locations of an array whose first element is designated by s. Characters are extracted and stored until one of the following occurs:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
traits::eq(c, delim) for the next available input character c (in which case the input character is extracted but not stored);
n is less than one or n - 1 characters are stored (in which case the function calls setstate(failbit)).
Since 4 values get stored, failbit should be getting set. Other sources describe this function slightly differently, but I still find them confusing. Cplusplus:
The failbit flag is set if the function extracts no characters, or if the delimiting character is not found once (n-1) characters have already been written to s. Note that if the character that follows those (n-1) characters in the input sequence is precisely the delimiting character, it is also extracted and the failbit flag is not set (the extracted sequence was exactly n characters long).
Again, the delimiting character '\n' is not found after the 4, and so failbit should be getting set. Cppreference says a similar thing. What am I missing here?
Yes, it reads n-1 characters and it never encounters '\n', but you missed the first condition:
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit));
The conditions are tested in the order shown. Since you read exactly what was in the stream, end-of-file is hit first, so eofbit gets set rather than failbit, and you get the input.
If we add
std::cout << input.eof();
you can see that this is indeed what happened (live example).
I’m having a bit of a problem in C++. When I wrote this:
int a = ':';
cout << a;
This printed out 58. It checks out with the ASCII table.
But if I write this:
int a;
cin >> a;
//i type in “:”
cout << a;
This will print out 0. It seems like if I put in any non-numeric input, a will be 0. I expected it to print out the equivalent ASCII number.
Can someone explain this for me? Thank you!
There are two things at work here.
First, ':' is a char, and although a char looks like a piece of text in your source code, it's really just a number (typically, an index into ASCII). This number can be assigned to other numeric types, such as int.
However, to deal with this oddity in a useful way, the IOStreams library treats char specially, for a numeric type. When you insert an int into a stream using formatted insertion (e.g. cout << 42), it automatically generates a string that looks like that number; but when you insert a char into a stream using formatted insertion (e.g. cout << ';'), it does not do that and writes the character itself.
Similarly, when you do formatted extraction, extracting into an int will interpret the user's input string as a number. Forgetting the char oddity, : in a more general sense is not a number, so your cin >> a does not succeed, as there is no string that looks like a number to interpret. (If a were a char, this "decoding" would again be disabled, and the task would succeed by simply copying the character from the user input.)
It can be confusing, but you're working in two separate data domains: user input as interpreted by IOStreams, and C++ data types. What is true for one, is not necessarily true for the other.
You're declaring a as an int, so operator>> expects digits, but you give it a punctuation character, which makes the extraction fail. As a result, since C++11, a is set to 0; before C++11, a is left unmodified.
If extraction fails (e.g. if a letter was entered where a digit is expected), value is left unmodified and failbit is set. (until C++11)
If extraction fails, zero is written to value and failbit is set. (since C++11)
And
I expected it to print out the equivalent ASCII number.
No, even for valid digits, e.g. if you input 1, a will be set with value 1, but not its ASCII number, i.e. 49.
This will print out 0. It seems like if I put in any non-numeric input, a will be 0. I expected it to print out the equivalent ASCII number.
Since C++11 when extraction fails 0 will be automatically assigned.
However, there is a way where you can take a char input from std::cin and then print its ASCII value. It is called type-casting.
Here is an example:
#include <iostream>
int main()
{
char c;
std::cin >> c;
std::cout << int(c);
return 0;
}
Output:
:
58
When I read all data from a stream, but make no attempt to read past its end, the stream's EOF is not set. That's how C++ streams work, right? It's the reason this works:
#include <sstream>
#include <cassert>
char buf[255];
int main()
{
std::stringstream ss("abcdef");
ss.read(buf, 6);
assert(!ss.eof());
assert(ss.tellg() == 6);
}
However, if instead of read()ing data I ignore() it, EOF is set:
#include <sstream>
#include <cassert>
int main()
{
std::stringstream ss("abcdef");
ss.ignore(6);
assert(!ss.eof()); // <-- FAILS
assert(ss.tellg() == 6); // <-- FAILS
}
This is on GCC 4.8 and GCC trunk (Coliru).
It also has the unfortunate side-effect of making tellg() return -1 (because that's what tellg() does), which is annoying for what I'm doing.
Is this standard-mandated? If so, which passage and why? Why would ignore() attempt to read more than I told it to?
I can't find any reason for this behaviour on cppreference's ignore() page. I can probably .seekg(6, std::ios::cur) instead, right? But I'd still like to know what's going on.
I think this is a libstdc++ bug (42875, h/t NathanOliver). The requirements on ignore() in [istream.unformatted] are:
Characters are extracted until any
of the following occurs:
— n != numeric_limits<streamsize>::max() (18.3.2) and n characters have been extracted so far
— end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit),
which may throw ios_base::failure (27.5.5.4));
— traits::eq_int_type(traits::to_int_type(c), delim) for the next available input character
c (in which case c is extracted).
Remarks: The last condition will never occur if traits::eq_int_type(delim, traits::eof()).
So we have two conditions (the last is ignored) - we either read n characters, or at some point we hit end-of-file in which case we set the eofbit. But, we are able to read n characters from the stream in this case (there are in fact 6 characters in your stream), so we will not hit end-of-file on the input sequence.
In libc++, eof() is not set and tellg() does return 6.
Consider the following simple example
#include <string>
#include <sstream>
#include <iomanip>
using namespace std;
int main() {
string str = "string";
istringstream is(str);
is >> setw(6) >> str;
return is.eof();
}
At first sight, since the explicit width is specified by the setw manipulator, I'd expect the >> operator to finish reading the string after successfully extracting the requested number of characters from the input stream. I don't see any immediate reason for it to try to extract the seventh character, which means that I don't expect the stream to enter eof state.
When I run this example under MSVC++, it works as I expect it to: the stream remains in good state after reading. However, in GCC the behavior is different: the stream ends up in eof state.
The language standard gives the following list of completion conditions for this version of the >> operator:
n characters are stored;
end-of-file occurs on the input sequence;
isspace(c,is.getloc()) is true for the next available input character c.
Given the above, I don't see any reason for the >> operator to drive the stream into the eof state in the above code.
However, this is what the >> operator implementation in GCC library looks like
...
__int_type __c = __in.rdbuf()->sgetc();
while (__extracted < __n
&& !_Traits::eq_int_type(__c, __eof)
&& !__ct.is(__ctype_base::space,
_Traits::to_char_type(__c)))
{
if (__len == sizeof(__buf) / sizeof(_CharT))
{
__str.append(__buf, sizeof(__buf) / sizeof(_CharT));
__len = 0;
}
__buf[__len++] = _Traits::to_char_type(__c);
++__extracted;
__c = __in.rdbuf()->snextc();
}
__str.append(__buf, __len);
if (_Traits::eq_int_type(__c, __eof))
__err |= __ios_base::eofbit;
__in.width(0);
...
As you can see, at the end of each successful iteration, it attempts to prepare the next __c character for the next iteration, even though the next iteration might never occur. And after the cycle it analyzes the last value of that __c character and sets the eofbit accordingly.
So, my question is: triggering the eof stream state in the above situation, as GCC does - is it legal from the standard point of view? I don't see it explicitly specified in the document. Is both MSVC's and GCC's behavior compliant? Or is only one of them behaving correctly?
The definition for that particular operator>> is not relevant to the setting of the eofbit, as it only describes when the operation terminates, but not what triggers a particular bit.
The description for the eofbit in the standard (draft) says:
eofbit - indicates that an input operation reached the end of an input sequence;
I guess here it depends on how you want to interpret "reached". Note that gcc implementation correctly does not set failbit, which is defined as
failbit - indicates that an input operation failed to read the expected characters, or
that an output operation failed to generate the desired characters.
So I think eofbit does not necessarily mean that the end of file impeded the extractions of any new characters, just that the end of file has been "reached".
I can't seem to find a more accurate description for "reached", so I guess that would be implementation defined. If this logic is correct, then both MSVC and gcc behaviors are correct.
EDIT: In particular, it seems that eofbit gets set when sgetc() would return eof. This is described both in the istreambuf_iterator section and in the basic_istream::sentry section. So now the question is: when is the current position of the stream allowed to advance?
FINAL EDIT: It turns out that probably g++ has the correct behavior.
Every character scan passes through <locale>, in order to allow different character sets, money formats, time descriptions and number formats to be parsed. While there does not seem to be a thorough description of how operator>> works for strings, there are very specific descriptions of how the do_get functions for numbers, time and money are supposed to operate. You can find them from page 687 of the draft forward.
All of these start off by reading a ctype (the "global" version of a character, as read through locales) from a istreambuf_iterator (for numbers, you can find the call definitions at page 1018 of the draft). Then the ctype is processed, and finally the iterator is advanced.
So, in general, this requires the internal iterator to always point to the next character after the last one read; if that was not the case you could in theory extract more than you wanted:
string str = "strin1";
istringstream is(str);
is >> setw(6) >> str;
int x;
is >> x;
If the current character for is after the extraction for str was not on the eof, then the standard would require that x gets the value 1, since for numeric extraction the standard explicitly requires that the iterator is advanced after the first read.
Since this does not make much sense, and given that all complex extractions described in the standard behave in the same way, it makes sense that for strings the same would happen. Thus, as the pointer for is after reading 6 characters falls on the eof, the eofbit needs to be set.
I tried to do it like this
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
char b[2];
ifstream f("prad.txt");
f>>b ;
cout <<b;
return 0;
}
It should read 2 characters, but it reads the whole line. This worked in another language but doesn't work in C++ for some reason.
You can use read() to specify the number of characters to read:
char b[3] = "";
ifstream f("prad.txt");
f.read(b, sizeof(b) - 1); // Read one less than sizeof(b) so that b stays
cout << b; // null-terminated for use with cout.
This worked on another language but doesn't work in C++ for some reason.
Some things change from language to language. In particular, in this case you've run afoul of the fact that in C++ pointers and arrays are scarcely different. That array gets passed to operator>> as a pointer to char, which is interpreted as a string pointer, so it does what it does to char buffers (to wit, read until the width limit or the next whitespace character, whichever comes first). No width is set here, so nothing stops the read after 2 characters: you're overflowing your buffer, which is undefined behavior, and a crash would not be surprising.
istream& get (char* s, streamsize n );
Extracts characters from the stream and stores them as a c-string into
the array beginning at s. Characters are extracted until either (n -
1) characters have been extracted or the delimiting character '\n' is
found. The extraction also stops if the end of file is reached in the
input sequence or if an error occurs during the input operation. If
the delimiting character is found, it is not extracted from the input
sequence and remains as the next character to be extracted. Use
getline if you want this character to be extracted (and discarded).
The ending null character that signals the end of a c-string is
automatically appended at the end of the content stored in s.
I was recently just tripped up by a subtle distinction between the behavior of std::istream::read versus std::istream::ignore. Basically, read extracts N bytes from the input stream, and stores them in a buffer. The ignore function extracts N bytes from the input stream, but simply discards them rather than storing them in a buffer. So, my understanding was that read and ignore are basically the same in every way, except for the fact that read saves the extracted bytes whereas ignore just discards them.
But there is another subtle difference between read and ignore which managed to trip me up. If you read to the end of a stream, the EOF condition is not triggered. You have to read past the end of a stream in order for the EOF condition to be triggered. But with ignore it is different: you only need to read to the end of a stream.
Consider:
#include <sstream>
#include <iostream>
using namespace std;
int main()
{
{
std::stringstream ss;
ss << "abcd";
char buf[1024];
ss.read(buf, 4);
std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
}
{
std::stringstream ss;
ss << "abcd";
ss.ignore(4);
std::cout << "EOF: " << std::boolalpha << ss.eof() << std::endl;
}
}
On GCC 4.4.5, this prints out:
EOF: false
EOF: true
So, why is the behavior different here? This subtle difference managed to confuse me enough to wonder why there is a difference. Is there some compelling reason that EOF is triggered "early" with a call to ignore?
eof() should only return true if you have already attempted to read past the end. In neither case should it be true. This may be a bug in your implementation.
I'm going to go out on a limb here and answer my own question: it really looks like this is a bug in GCC.
The standard reads in 27.6.1.3 paragraph 23:
[istream::ignore] behaves as an unformatted input function (as described in 27.6.1.3, paragraph 1). After constructing a sentry object, extracts characters and discards them. Characters are extracted until any of the following occurs:
if n != numeric_limits::max() (18.2.1), n characters are extracted
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit), which may throw ios_base::failure(27.4.4.3));
c == delim for the next available input character c (in which case c is extracted). Note: The last condition will never occur if delim == traits::eof()
My (somewhat tentative) interpretation is that GCC is wrong here, because of the bold parts above. Ignore should behave as an unformatted input function, (like read()), which means that end-of-file should only occur on the input sequence if there is an attempt to extract additional bytes after the last byte in the stream has been extracted.
I'll submit a bug report if I find that enough people agree with this answer.
The consensus seemed to be that this was a legitimate bug in gcc. Since I saw no indication a bug report had been filed, I'm doing so now. The report can be viewed at:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51651