istream_iterator copy example keeps waiting for input - c++

I tried implementing an example of stream iterators from page 107 of "The C++ Standard Library". I get stuck on this line:
copy (istream_iterator<string>(cin), istream_iterator<string>(), back_inserter(coll));
The program keeps reading data from the console here, but does not pass on to the next line. How do I continue past this point?

From cppreference:
The default-constructed std::istream_iterator is known as the
end-of-stream iterator. When a valid std::istream_iterator reaches the
end of the underlying stream, it becomes equal to the end-of-stream
iterator. Dereferencing or incrementing it further invokes undefined
behavior
bold added
In other words, std::istream_iterator<string>(std::cin) keeps going until the end-of-input for std::cin. This doesn't happen at the end of the line, but at the end-of-file. In a console, there are specific commands to trigger the EOF:
In UNIX systems it is Ctrl+D, in Windows Ctrl+Z.

take for instance, if you would have made an input stream of int then you would have given input like - 45 56 45345 555 ....., so in all those cases, the reading operation of input stream would have returned a true value - while (cin>>var) { }
the while statement will not stop if it is getting a valid input, so to stop the reading of characters, we gave it following input, ... 54 56 3545 | , and as soon as it receives a special character the while loop stops as the conditions returns false.
That's the same case for all other type of input streams as well.
So I assume you understand here why your string type input stream never stops taking input because every possible input can be considered string.
The solution to this problem is using "ctrl + D in UNIX" and "ctrl + Z in windows", as it gives NULL in condition of while loop which means false, hence stoping the reading of string input.

Related

Does istream::ignore discard more than n characters?

(this is possibly a duplicate of Why does std::basic_istream::ignore() extract more characters than specified?, however my specific case doesn't deal with the delim)
From cppreference, the description of istream::ignore is the following:
Extracts and discards characters from the input stream until and including delim.
ignore behaves as an UnformattedInputFunction. After constructing and checking the sentry object, it extracts characters from the stream and discards them until any one of the following conditions occurs:
count characters were extracted. This test is disabled in the special case when count equals std::numeric_limitsstd::streamsize::max()
end of file conditions occurs in the input sequence, in which case the function calls setstate(eofbit)
the next available character c in the input sequence is delim, as determined by Traits::eq_int_type(Traits::to_int_type(c), delim). The delimiter character is extracted and discarded. This test is disabled if delim is Traits::eof()
However, let's say I've got the following program:
#include <iostream>
int main(void) {
int x;
char p;
if (std::cin >> x) {
std::cout << x;
} else {
std::cin.clear();
std::cin.ignore(2);
std::cout << "________________";
std::cin >> p;
std::cout << p;
}
Now, let's say I input something like p when my program starts. I expect cin to 'fail', then clear to be called and ignore to discard 2 characters from the buffer. So 'p' and '\n' that are left in the buffer should be discarded. However, the program still expects input after ignore gets called, so in reality it's only get to the final std::cin>>p after I've given it more than 2 characters to discard.
My issue:
Inputting something like 'b' and hitting Enter immediately after the first input (so 2 after the characters get discarded, 'p' and '\n') keeps 'b' in the buffer and immediately passes it to cin, without first printing the message. How can I make it so that the message gets printed immediately after the two characters are discarded and then << is called?
After a lot of back and forth in the comments (and reproducing the problem myself), it's clear the problem is that:
You enter p<Enter>, which isn't parsable
You try to discard exactly two characters with ignore
You output the underscores
You prompt for the next input
but in fact things seem to stop at step 2 until you give it more input, and the underscores only appear later. Well, bad news, you're right, the code is blocking at step 2 in ignore. ignore is blocking waiting for a third character to be entered (really, checking if it's EOF after those two characters), and by the spec, this is apparently the correct thing to do, I think?
The problem here is the same basic issue as the problem you linked just a different manifestation. When ignore terminates because it's read the number of characters requested, it always attempts to reads one more character, because it needs to know if condition 2 might also be true (it happened to read the last character so it can take the appropriate action, putting cin in EOF state, or leaving the next character in the buffer for the next read otherwise):
Effects: Behaves as an unformatted input function (as described above). After constructing a sentry object, extracts characters and discards them. Characters are extracted until any of the following occurs:
n != numeric_limits::max() (18.3.2) and n characters have been extracted so far
end-of-file occurs on the input sequence (in which case the function calls setstate(eofbit), which may throw ios_base::failure (27.5.5.4));
traits::eq_int_type(traits::to_int_type(c), delim) for the next available input character c (in which case c is extracted).
Since you didn't provide an end character for ignore, it's looking for EOF, and if it doesn't find it after two characters, it must read one more to see if it shows up after the ignored characters (if it does, it'll leave cin in EOF state, if not, the character it peeked at will be the next one you read).
Simplest solution here is to not try to specifically discard exactly two characters. You want to get rid of everything through the newline, so do that with:
std::cin.ignore(std::numeric_limits<std::stringsize>::max(), '\n');
instead of std::cin.ignore(2);; that will read any and all characters until the newline (or EOF), consume the newline, and it won't ever overread (in the sense that it continues forever until the delimiter or EOF is found, there is no condition under which it finishes reading a count of characters and needs to peek further).
If for some reason you want to specifically ignore exactly two characters (how do you know they entered p<Enter> and not pabc<Enter>?), just call .get() on it a couple times or .read(&two_byte_buffer, 2) or the like, so you read the raw characters without the possibility of trying to peek beyond them.
For the record, this seems a little from the cppreference spec (which may be wrong); condition 2 in the spec doesn't specify it needs to verify if it is at EOF after reading count characters, and cppreference claims condition 3 (which would need to peek) is explicitly not checked if the "delimiter" is the default Traits::eof(). But the spec quote found in your other answer doesn't include that line about condition 3 not applying for Traits::eof(), and condition 2 might allow for checking if you're at EOF, which would end up with the observed behavior.
Your problem is related to your terminal. When you press ENTER, you are most likely getting two characters -- '\r' and '\n'. Consequently, there is still one character left in the input stream to read from. Change that line to:
std::cin.ignore(10, '\n'); // 10 is not magical. You may use any number > 2
to see the behavior you are expecting.
Passing exact number of characters in buffer will do the trick:
std::cin.ignore(std::cin.rdbuf()->in_avail());

Explain how the for loop is terminating?

I came across the a code which I didn't understand. It was on a coding website. The code was like this-
#include<iostream>
using namespace std;
char s[4];
int x;
int main()
{
for(cin>>s;cin>>s;x+=44-s[1]);
cout<<x;
}
My question is how the for loop is terminating and since it was on a coding website so answers are checked using file operation in my knowledge. But if we are running it on IDE this for loop is not terminating instead it keeps on taking input from the user.So whats the explanation for this??
Sample Input
3
x++
x--
--x
Output
-1
EDIT
This is the problem link - Bit++
This is the solution link - In status filter set language to MS C++ Author name - wafizaini (Solution id - 27116030)
The loop is terminating because istream has operator bool() (prior to C++11 it was operator void*) which returns false when no additional input is available. Basically, the reason the loop stops is the same as why a more common while loop terminates:
while (cin >> s) {
...
}
The reason this does not terminate when you run with an IDE is that you need to supply an end-of-stream mark, which is delivered in a system-dependent way. On UNIX and other systems derived from it you press Ctrl+d, while on Windows you press Ctrl+z.
Note: Your program is at risk of getting a buffer overrun in case an end-user enters more than three characters (character #4 would be used for null terminator of the string). Also note that the initial input cin>>s is thrown away, because loop condition is checked before entering the body of the loop.
That's perfectly valid, although a bit difficult to read, C++11 code.
std::istream::operator>>()
returns a reference to the input stream itself, and
std::istream::operator bool()
in turn evaluates the stream to a boolean value, returning false whenever a fail bit is set.
When reading from a file, that loop will eventually try to read past the end of file, causing the eof fail bit to be set and thus stopping the loop.
However, when running that code on a shell, you need to manually input the EOF control code on the stream, otherwise the for loop won't stop. This can be done by pressing Ctrl+D on Unix shells, for example.
A more common loop condition is while (cin >> s).
The convention is that operator>> returns a reference to the stream (cin). The stream classes then have a conversion operator that will return false in if (cin) or while (cin) after some input has failed.
This would work in the middle part of a for-loop as well, but is a bit unusual.

C++ istream::peek - shouldn't it be nonblocking?

It seems well accepted that the istream::peek operation is blocking.
The standard, though arguably a bit ambiguous, leans towards nonblocking behavior. peek calls sgetc in turn, whose behavior is:
"The character at the current position of the controlled input sequence, as a value of type int.
If there are no more characters to read from the controlled input sequence, the function returns the end-of-file value (EOF)."
It doesn't say "If there are no more characters.......wait until there are"
Am I missing something here? Or are the peek implementations we use just kinda wrong?
The controlled input sequence is the file (or whatever) from which you're reading. So if you're at end of file, it returns EOF. Otherwise it returns the next character from the file.
I see nothing here that's ambiguous at all--if it needs a character that hasn't been read from the file, then it needs to read it (and wait till it's read, and return it).
If you're reading from something like a socket, then it's going to wait until data arrives (or the network stack detects EOF, such as the peer disconnecting).
The description from cppreference.com might be clearer than the one in your question:
Ensures that at least one character is available in the input area by [...] reading more data in from the input sequence (if applicable)."
"if applicable" does apply in this case; and "reading data from the input sequence" entails waiting for more data if there is none and the stream is not in an EOF or other error state.
When I get confused about console input I remind myself that console input can be redirected to come from a file, so the behavior of the keyboard more or less mimics the behavior of a file. When you try to read a character from file, you can get one of two results: you get a character, or you get EOF because you've reached the end of the file -- there are no more characters to be read. Same thing for keyboard input: either you get a character, or you get EOF because you've reached the end of the file. With a file, there is no notion of waiting for more characters: either a file has unread characters or it doesn't. Same thing for the keyboard. So if you have't reached EOF on the keyboard, reading a character returns the next character. You reach EOF on the keyboard by typing whatever character your system recognizes as EOF; on Unix systems that's ctrl-D, on Windows (if I remember correctly) that's ctrl-C. If you haven't reached EOF, there are more characters to be read.

Why does string extraction from a stream set the eof bit?

Let's say we have a stream containing simply:
hello
Note that there's no extra \n at the end like there often is in a text file. Now, the following simple code shows that the eof bit is set on the stream after extracting a single std::string.
int main(int argc, const char* argv[])
{
std::stringstream ss("hello");
std::string result;
ss >> result;
std::cout << ss.eof() << std::endl; // Outputs 1
return 0;
}
However, I can't see why this would happen according to the standard (I'm reading C++11 - ISO/IEC 14882:2011(E)). operator>>(basic_stream<...>&, basic_string<...>&) is defined as behaving like a formatted input function. This means it constructs a sentry object which proceeds to eat away whitespace characters. In this example, there are none, so the sentry construction completes with no problems. When converted to a bool, the sentry object gives true, so the extractor continues to get on with the actual extraction of the string.
The extraction is then defined as:
Characters are extracted and appended until any of the following occurs:
n characters are stored;
end-of-file occurs on the input sequence;
isspace(c,is.getloc()) is true for the next available input character c.
After the last character (if any) is extracted, is.width(0) is called and the sentry object k is destroyed.
If the function extracts no characters, it calls is.setstate(ios::failbit), which may throw ios_base::failure (27.5.5.4).
Nothing here actually causes the eof bit to be set. Yes, extraction stops if it hits the end-of-file, but it doesn't set the bit. In fact, the eof bit should only be set if we do another ss >> result;, because when the sentry attempts to gobble up whitespace, the following situation will occur:
If is.rdbuf()->sbumpc() or is.rdbuf()->sgetc() returns traits::eof(), the function calls setstate(failbit | eofbit)
However, this is definitely not happening yet because the failbit isn't being set.
The consequence of the eof bit being set is that the only reason the evil-idiom while (!stream.eof()) doesn't work when reading files is because of the extra \n at the end and not because the eof bit isn't yet set. My compiler is happily setting the eof bit when the extraction stops at the end of file.
So should this be happening? Or did the standard mean to say that setstate(eofbit) should occur?
To make it easier, the relevant sections of the standard are:
21.4.8.9 Inserters and extractors [string.io]
27.7.2.2 Formatted input functions [istream.formatted]
27.7.2.1.3 Class basic_istream::sentry [istream::sentry]
std::stringstream is a basic_istream and the operator>> of std::string "extracts" characters from it (as you found out).
27.7.2.1 Class template basic_istream
2 If rdbuf()->sbumpc() or rdbuf()->sgetc() returns traits::eof(), then the input function, except as
explicitly noted otherwise, completes its actions and does setstate(eofbit), which may throw ios_-
base::failure (27.5.5.4), before returning.
Also, "extracting" means calling these two functions.
3 Two groups of member function signatures share common properties: the formatted input functions (or
extractors) and the unformatted input functions. Both groups of input functions are described as if they
obtain (or extract) input characters by calling rdbuf()->sbumpc() or rdbuf()->sgetc(). They may use
other public members of istream.
So eof must be set.
Intuitively speaking, the EOF bit is set because during the read operation to extract the string, the stream did indeed hit the end of the file. Specifically, it continuously read characters out of the input stream, stopping because it hit the end of the stream before encountering a whitespace character. Accordingly, the stream set the EOF bit to mark that the end of stream was reached. Note that this is not the same as reporting failure - the operation was completed successfully - but the point of the EOF bit is not to report failure. It's to mark that the end of the stream was encountered.
I don't have a specific part of the spec to back this up, though I'll try to look for one when I get the chance.

Contents of the string after failed extraction from istream

If I do this:
ifstream stream("somefilewhichopenssuccesfully.txt");
string token;
if( stream >> token )
cout << token;
else
cout << token;
Is the output in the second case guaranteed to be an empty string? I can't seem to find the answer to this on cplusplus.com.
Thanks!
Is the output in the second case guaranteed to be an empty string?
The answer is : no, because it depends, as described below.
Since else block will be executed only if an attempt to read from the stream fails, and that can occur anytime in the course of reading.
If it fails at the very first attempt, then there is no character extraction from the stream, and hence token will be empty (as it was).
If it fails after few reads, then token will not be empty. It will contain the characters successfully read so far from the stream.
The section §21.3.7.9 from the Standard says,
Begins by constructing a sentry object
k as if k were constructed by typename
basic_istream::sentry
k(is). If bool(k) is true, it calls
str.erase() and then extracts
characters from is and appends them to
str as if by calling str.append(1,c).
If is.width() is greater than zero,
the maximum number n of characters
appended is is.width(); otherwise n is
str.max_size(). Characters are
extracted and appended until any of
the following occurs:
— n characters
are stored;
— end-of-file occurs on
the input sequence;
— isspace(c,is.getloc()) is true for the
next available input character c.
After the last character (if any) is
extracted, is.width(0) is called and
the sentry object k is destroyed.
If the function extracts no characters, it calls is.setstate(ios::failbit), which may throw ios_base::failure (27.4.4.3).
Also note that the section §21.3.1/2 from the Standard guarantees that the default constructed string will be empty. The Standard says its size will be zero, that means, empty.
I deleted my original answer because I wanted to test this. This is what I see, if there is an error whilst reading (EOF is not counted in this context), the original string is modified and the branch sees the modified version. To test I did the following, created a 2Gb file (touch then truncate), the above code to read. Whilst the code was running, removed the file (this should set the failbit - I think). Immediately stops reading, but the string is modified - it has a larger size.
To me this indicates that the string is modified even if the stream operation fails.
No, even if the operation fails, the string will contain the characters extracted so far.
The standard says (§21.4.8.9):
Effects: Behaves as a formatted input function (27.7.2.2.1). After constructing a sentry object, if the sentry converts to true, calls str.erase() and then extracts characters from is and appends them to str as if by calling str.append(1,c). If is.width() is greater than zero, the maximum number n of characters appended is is.width(); otherwise n is str.max_size(). Characters are extracted and appended until any of the following occurs:
— n characters are stored;
— end-of-file occurs on the input sequence;
— isspace(c,is.getloc()) is true for the next available input character c.