eof() is returning last character twice [duplicate] - c++

This question already has answers here:
Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?
(5 answers)
Closed 7 years ago.
I am reading in from an input file "input.txt" which has the string 'ABCDEFGH' and I am reading it in char by char. I am doing this using the code:
ifstream plaintext("input.txt");
char ch;
if (plaintext.is_open())
{
    while (!plaintext.eof()) {
        plaintext.get(ch);
        cout << ch << endl;
    }
    plaintext.close();
}
The string 'ABCDEFGHH' is printed out. I have no idea why it is printing 'H' twice. Any help would be appreciated. I got this code example from HERE

This is because the EOF test does not mean "our crystal ball tells us that there are no more characters available in this stream". Rather, it is a test we apply after an input operation fails, to determine whether the input failed because the data ran out (EOF) or because of some other condition (an I/O error of some sort).
In other words, eof() can still return false after we have successfully read what will turn out to be the last character. We then try to read again, and this time get fails and does not overwrite the existing value of ch, so it still holds the H.
Streams cannot predict the end of the data because then they could not be used for communication devices such as serial lines, interactive terminals or network sockets. On a terminal, we cannot tell that the user has typed the last character they will ever type. On a network, we cannot tell that the byte we have just received is the last one. Rather, we know that the previous byte was the last one, because the current read operation has failed.
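A minimal sketch of the usual fix, assuming the goal is simply to process each character once: put the read itself in the loop condition, so the body runs only when get actually stored a character. The helper name read_chars and the generic std::istream parameter are illustrative choices, not part of the original code.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>

// Hypothetical helper: collects every character that get() actually delivers.
std::string read_chars(std::istream& in) {
    std::string out;
    char ch;
    // get(ch) returns the stream, which tests false as soon as a read fails,
    // so the body never sees a stale value of ch.
    while (in.get(ch)) {
        out += ch;
    }
    return out;
}
```

Passing an ifstream opened on input.txt (or a std::istringstream holding "ABCDEFGH") yields each character exactly once, with no repeated H.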

Related

What is the general purpose of cin.clear? [duplicate]

This question already has answers here:
Why would we call cin.clear() and cin.ignore() after reading input?
(4 answers)
Closed 5 years ago.
I am a beginner in C++, and I just can't wrap my head around what cin.ignore and cin.clear do; they make absolutely no sense to me. When you explain this to me, please be very descriptive.
In C++ input processing, cin.fail() returns true if the last input operation on cin failed.
Usually, cin.fail() returns true in the following cases:
when you have reached EOF and try to read anything;
when you try to read an integer and the input cannot be converted to an integer.
When an operation fails and cin.fail() returns true, cin is placed in an "error state". That state blocks all further input processing until it is cleared.
Therefore, you have to call cin.clear(). It overwrites the stream's internal error state flags: all bits are replaced by those in its state argument, and because that argument defaults to goodbit, calling cin.clear() with no arguments clears all error flags.
cin.ignore first constructs a sentry object to access the input sequence. It then extracts characters from the associated stream buffer object as if by calling its member functions sbumpc or sgetc, discards them, and finally destroys the sentry object before returning.
It is therefore commonly used to extract and discard characters. A classic case is calling getline() after a formatted read such as cin >> x: the >> extraction leaves the newline in the buffer, so getline() would immediately return an empty line. That is why you must discard that newline from the buffer before calling getline().
std::cin.ignore() can be called three different ways:
No arguments: A single character is taken from the input buffer and discarded:
std::cin.ignore(); //discard 1 character
One argument: The number of characters specified are taken from the input buffer and discarded:
std::cin.ignore(33); //discard 33 characters
Two arguments: discard the number of characters specified, or discard characters up to and including the specified delimiter (whichever comes first):
std::cin.ignore(26, '\n'); //ignore 26 characters or to a newline, whichever comes first
source: http://www.augustcouncil.com/~tgibson/tutorial/iotips.html
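The two calls are typically used together after a failed read. The following is a sketch under assumptions: the helper name read_int and the "skip the whole bad line" policy are illustrative, not something the answer above prescribes.

```cpp
#include <istream>
#include <limits>
#include <sstream>  // std::istringstream, handy for trying this without a terminal

// Hypothetical helper: keeps trying to read an int, skipping lines that
// do not start with one. Returns false when the input runs out or breaks.
bool read_int(std::istream& in, int& value) {
    while (!(in >> value)) {
        if (in.eof() || in.bad()) {
            return false;  // no more data, or an unrecoverable I/O error
        }
        in.clear();  // reset failbit so the stream accepts reads again
        // discard the rest of the offending line, including its newline
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return true;
}
```

With std::cin as the stream, typing abc and then 42 on separate lines makes the first attempt fail, the bad line get discarded, and the second attempt store 42.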

C++ istream::peek - shouldn't it be nonblocking?

It seems well accepted that the istream::peek operation is blocking.
The standard, though arguably a bit ambiguous, leans towards nonblocking behavior. peek calls sgetc in turn, whose behavior is:
"The character at the current position of the controlled input sequence, as a value of type int.
If there are no more characters to read from the controlled input sequence, the function returns the end-of-file value (EOF)."
It doesn't say "If there are no more characters.......wait until there are"
Am I missing something here? Or are the peek implementations we use just kinda wrong?
The controlled input sequence is the file (or whatever) from which you're reading. So if you're at end of file, it returns EOF. Otherwise it returns the next character from the file.
I see nothing here that's ambiguous at all--if it needs a character that hasn't been read from the file, then it needs to read it (and wait till it's read, and return it).
If you're reading from something like a socket, then it's going to wait until data arrives (or the network stack detects EOF, such as the peer disconnecting).
The description from cppreference.com might be clearer than the one in your question:
"Ensures that at least one character is available in the input area by [...] reading more data in from the input sequence (if applicable)."
"if applicable" does apply in this case; and "reading data from the input sequence" entails waiting for more data if there is none and the stream is not in an EOF or other error state.
When I get confused about console input I remind myself that console input can be redirected to come from a file, so the behavior of the keyboard more or less mimics the behavior of a file. When you try to read a character from a file, you can get one of two results: you get a character, or you get EOF because you've reached the end of the file and there are no more characters to be read. Same thing for keyboard input: either you get a character, or you get EOF because you've reached the end of the file. With a file, there is no notion of waiting for more characters: either a file has unread characters or it doesn't. Same thing for the keyboard. So if you haven't reached EOF on the keyboard, reading a character returns the next character. You reach EOF on the keyboard by typing whatever key combination your system recognizes as EOF; on Unix systems that's Ctrl-D, on Windows that's Ctrl-Z (typically followed by Enter). If you haven't reached EOF, there are more characters to be read.
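The "either a character or EOF" view can be sketched with an in-memory stream, where nothing ever has to wait. The helper name has_more is hypothetical; on an interactive source the same call would block until a character or EOF arrives, as the answers above describe.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, used here as a non-interactive source
#include <string>   // std::char_traits

// Hypothetical helper: true if a character is available, false at end of input.
// peek() does not consume the character it reports.
bool has_more(std::istream& in) {
    return in.peek() != std::char_traits<char>::eof();
}
```

Because peek() leaves the character in place, a get() that follows a successful has_more() returns that same character.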

Reading in one byte at a time with .get() [duplicate]

This question already has answers here:
Why is iostream::eof inside a loop condition (i.e. `while (!stream.eof())`) considered wrong?
(5 answers)
Closed 7 years ago.
So I'm reading in an input file that contains:
lololololololol
I need to read it in as binary, one byte at a time, for something I'm doing later on. To do this I'm using get() to read each byte and then storing it in a char. It seems to be working correctly except for the last char that it reads in. The vector that it is reading into contains:
lololololololol
�
I'm not quite sure what this last value is, but it's totally throwing off my final output. So my question is: is there a reason get() would read in a value or byte from my text document that is not there? Or is it reading in something that I don't know of?
code:
while(istr.good()) {
temp = istr.get();
input.push_back(temp);
}
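It's storing the EOF (end of file) value. When no more characters are available, get() returns the EOF value instead of a character, and good() only becomes false after that failed read. You need to do the check after reading, to avoid inserting that value into the vector: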
while(temp = istr.get(), istr.good()) // comma operator
input.push_back(temp);
Or you might use the second std::istream::get overload, which takes a char&, and let the returned stream convert to bool in the loop condition:
while(istr.get(temp))
input.push_back(temp);
Or try more advanced approaches. operator>> and std::getline would also work fine for this kind of input.
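A third variant, sketched here under the assumption that the no-argument get() from the question is kept: store its result in an int and compare it with the EOF value before converting to char. The helper name read_bytes is illustrative.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <string>   // std::char_traits
#include <vector>

// Hypothetical helper: reads every byte of the stream into a vector.
std::vector<char> read_bytes(std::istream& in) {
    std::vector<char> bytes;
    int c;  // int, not char, so the EOF value stays distinguishable from a byte
    while ((c = in.get()) != std::char_traits<char>::eof()) {
        bytes.push_back(static_cast<char>(c));
    }
    return bytes;
}
```

Keeping the result as an int matters because the EOF value is outside the range of valid bytes; once it is narrowed to char it looks like ordinary data, which is the stray last element in the question.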

Why two EOF needed as input? [duplicate]

This question already has an answer here:
Canonical vs. non-canonical terminal input
(1 answer)
Closed 7 years ago.
When I run the code below, I use three inputs (in Ubuntu terminal):
1) abc(Ctrl+D)(Ctrl+D)
2) abc(Ctrl+D)(Enter)(Ctrl+D)
3) abc(Enter)(Ctrl+D)
The code reacts well in all cases. My question is: why do I need two EOFs in 1) and 2)?
#include <iostream>
int main()
{
int character;
while((character=std::cin.get())!=EOF){}
std::cout << std::endl << character << std::endl;
}
You don't have "two EOF". Bash is putting the tty in raw mode, and interpreting ^D differently depending on context. If you type ^D after a newline, bash closes the input stream on the foreground process. If you type a few characters first, bash requires you to type ^D twice before doing so. (The first ^D is treated like 'delete')
That's how the "EOF" character works (in "canonical" mode input, which is the default). It's actually never sent to the application, so it would be more accurate to call it the EOF signal.
The EOF character (normally Ctrl-D) causes the current line to be returned to the application program immediately. That's very similar to the behaviour of the EOL character (Enter), but unlike EOL, the EOF character is not included in the line.
If the EOF character is typed at the beginning of a line, then zero bytes are returned to the application program (since the EOF character is not sent). But if a read system call returns 0 bytes, that is considered an end-of-file indication. So at the beginning of a line, an EOF will be treated as terminating input; anywhere else, it will merely terminate the line and so you need two of them to terminate input.
For more details, see the POSIX terminal interface specification.
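The "zero bytes means end of file" rule can be sketched directly at the system-call level. This assumes a POSIX system; the helper name read_until_eof is hypothetical, and retrying on EINTR is deliberately left out to keep the sketch short.

```cpp
#include <unistd.h>  // read, ssize_t (POSIX)
#include <cstddef>
#include <string>

// Hypothetical helper: reads from fd until read() returns 0 (end of file)
// or a negative value (error).
std::string read_until_eof(int fd) {
    std::string data;
    char buf[256];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n <= 0) {
            break;  // n == 0 is the end-of-file indication; n < 0 is an error
        }
        data.append(buf, static_cast<std::size_t>(n));
    }
    return data;
}
```

On a terminal in canonical mode, typing abc and then Ctrl-D makes one read() return 3 bytes; only a second Ctrl-D, with nothing pending on the line, makes read() return 0 and ends the loop.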

Why is (foobar>>x) preferred over (! foobar.eof() ) [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Why is iostream::eof inside a loop condition considered wrong?
eof() bad practice?
My teacher said we shouldn't use eof() when reading text file or binary file information; instead we should use (afile>>x). He didn't explain why. Can someone explain it to me? Can someone also explain the differences between these two methods of reading?
//Assuming declaration
//ifstream foobar
while ( ! foobar.eof() )
{
foobar>>x; // This is discouraged by my teacher
}
while (foobar>>x)
{
//This is encouraged by my teacher
}
Because the file is not at the end before you try to read from it.
operator>> returns a reference to the stream in the state it is after the read has been attempted and either succeeded or failed, and the stream evaluates to true if it succeeded or false if it failed. Testing for eof() first means that the file can have no useful data in it but not be at EOF yet, then when you read from it, it's at EOF and the read fails.
Another important detail is that operator>> for streams skips all leading whitespace, not trailing whitespace. This is why a file can not be at EOF before the read and be at EOF after a read.
Additionally, while (foobar>>x) also stops correctly when the next data in the file cannot be read into an integer (for example, the next data is x), not just when it's at EOF, which is very important.
Example:
Consider the code:
int x, y;
f >> x;
if (!f.eof())
f >> y;
Assuming f is a file that contains the data 123␣ (the ␣ means space), the first read will succeed, but afterwards the file has no more integers in it and it is not at EOF. The second read will fail and the file will be at EOF, but you don't know because you tested for EOF before you tried reading. Then your code goes on to cause undefined behaviour because y is uninitialised.
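The difference between the two loop styles can be sketched side by side. Both helper names are hypothetical, and x is initialised here only so that the discouraged version stays well defined when its read fails.

```cpp
#include <istream>
#include <sstream>  // std::istringstream, handy for trying this without a file
#include <vector>

// Discouraged: tests eof() before reading, then uses x even if the read failed.
std::vector<int> read_eof_loop(std::istream& in) {
    std::vector<int> values;
    int x = 0;
    while (!in.eof()) {
        in >> x;              // may fail, but the result is stored anyway
        values.push_back(x);
    }
    return values;
}

// Preferred: the extraction is the condition, so only successful reads count.
std::vector<int> read_extraction_loop(std::istream& in) {
    std::vector<int> values;
    int x = 0;
    while (in >> x) {
        values.push_back(x);
    }
    return values;
}
```

On the input 123␣ from the example, read_eof_loop stores two elements while read_extraction_loop stores one. On non-numeric input such as x, the first loop never terminates, because the failed read sets failbit without ever reaching EOF.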