File Handling: What is the use of the peek() function in C++? - c++

I have a lot of trouble dealing with the eof of a file: whenever I code with fstream and eof is reached, I have to clear the stream before I can work with it again. I have searched a lot about eof, and the advice I found is that I should start using:
fstream file("Filename.txt",ios::in|ios::ate|ios::out);
char str[80];
while(file>>str)
{
//do the required stuff
}
//clear the stream and reuse it
file.clear();
file.seekp(0);
But I have also read about a function called peek(), which is also used for such purposes. I am a little confused about how it works, and I am not able to apply it in my code. Could anyone guide me through this?
I have also heard about a function called putback(). What is that?
Edit-1
fstream file("Filename.txt",ios::in|ios::ate|ios::out);
char str[80];
while(file>>str)
{
//do the required stuff
}
//clear the stream and reuse it
file.clear();
file.seekp(0);
//Now do the required writing operation after reading the whole file wherever is required
//I also want to perform writing operations and if this pattern seems most suitable for me

Say you want to write a parser for C or C++ and your code does something like this:
char c = source.get();
switch(c)
{
...
case '<':
// May be < or <=
if (source.peek() == '=')
{
source.get();
return less_or_equal;
}
// Ok, not <= so:
return less;
...
}
[I ignored that it may be part of a template, a shift, or something else like that]
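To make the fragment above concrete, here is a minimal, self-contained sketch of the same idea. The Token enum and the next_token name are made up for illustration, and a std::istringstream stands in for the real source; it is not a real lexer.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical token kinds, for this sketch only.
enum class Token { Less, LessOrEqual, Other, End };

// Reads one token. peek() looks at the character after '<'
// without removing it from the stream.
Token next_token(std::istream& source) {
    const int c = source.get();
    if (c == std::char_traits<char>::eof()) return Token::End;
    if (c == '<') {
        if (source.peek() == '=') {
            source.get();  // consume the '=' we just peeked at
            return Token::LessOrEqual;
        }
        return Token::Less;  // the following character stays in the stream
    }
    return Token::Other;
}
```

Called on a stream holding `<x`, this returns Token::Less and the next get() still yields 'x', because peek() never consumed it.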
The need for putback() is very little when you have peek(), but it does allow code that "normally consumes" the character to put it back "if it got it wrong". Say you know that <= is more common than <, then you could do:
char c = source.get();
switch(c)
{
...
case '<':
// May be < or <=
c = source.get();
if (c == '=')
{
// The '=' has already been consumed by the get() above
return less_or_equal;
}
source.putback(c);
// Ok, not <= so:
return less;
...
}
because it only does putback on the rare case [as per the assumed statistics above].
One can imagine cases where the common case is to get and the rare case is mismatch, e.g. if we want to read a number:
int number = 0;
char c;
do
{
c = input.get();
if (isdigit(c))
{
number *= 10;
number += c - '0';
}
else
{
input.putback(c);
}
} while (isdigit(c));
Since most numbers have more than one digit in them, the more common case is that the first and the subsequent characters are digits, and the unusual case is that we need to call putback(). [Of course, reading numbers "properly" will require a bit more stuff...]
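For comparison, the same digit loop can be written with peek(), so nothing ever has to be put back. This is only a sketch: read_number is a made-up name, and there is no sign or overflow handling.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Reads consecutive decimal digits from input into number.
// Returns false if the next character is not a digit.
bool read_number(std::istream& input, int& number) {
    bool any = false;
    number = 0;
    for (;;) {
        const int c = input.peek();  // look, but do not consume
        if (c == std::char_traits<char>::eof() || !std::isdigit(c)) break;
        input.get();                 // now actually consume the digit
        number = number * 10 + (c - '0');
        any = true;
    }
    return any;
}
```

The trade-off is the one described above: this version makes two calls per digit, while the putback() version makes one call per digit plus a single putback() at the end.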

But I have also read about a function called peek() which is also used for such purposes
peek() was created for a different purpose - it's there to let your program process two characters at a time - essentially emulating ungetc() functionality from the portion of the library based on the C standard library.
Using peek to see if you are about to hit eof is valid, but the approach that you show in the code, i.e. while(file>>str), is more idiomatic to C++ than the one based on peek.
I have also heard about a function called putback() what's that?
The istream::putback() member function lets you do the same thing that ungetc() does for FILE* streams: it pushes a character back so that the next read returns it.
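For the read-then-reuse pattern from the question, a minimal sketch might look like the following. The helper name read_words_and_rewind is made up, and it takes any std::iostream so it can be tried with a std::stringstream instead of a real file. It reads into std::string rather than a fixed char[80], so a long word cannot overflow the buffer.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads all whitespace-separated words, then rewinds the stream
// so it can be used again.
std::vector<std::string> read_words_and_rewind(std::iostream& file) {
    std::vector<std::string> words;
    std::string word;
    while (file >> word) words.push_back(word);
    file.clear();   // reset eofbit/failbit first; seeking does nothing while failbit is set
    file.seekg(0);  // move the read position back to the start
    return words;
}
```

With a real std::fstream the same two calls apply. Note that a file opened with ios::ate starts positioned at the end, so a seekg(0) would also be needed before the first read.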

Related

Symbol at the end of txt file appears [duplicate]

What is wrong with using feof() to control a read loop? For example:
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
char *path = "stdin";
FILE *fp = argc > 1 ? fopen(path=argv[1], "r") : stdin;
if( fp == NULL ){
perror(path);
return EXIT_FAILURE;
}
while( !feof(fp) ){ /* THIS IS WRONG */
/* Read and process data from file… */
}
if( fclose(fp) != 0 ){
perror(path);
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
What is wrong with this loop?
TL;DR
while(!feof) is wrong because it tests for something that is irrelevant and fails to test for something that you need to know. The result is that you are erroneously executing code that assumes that it is accessing data that was read successfully, when in fact this never happened.
I'd like to provide an abstract, high-level perspective. So continue reading if you're interested in what while(!feof) actually does.
Concurrency and simultaneity
I/O operations interact with the environment. The environment is not part of your program, and not under your control. The environment truly exists "concurrently" with your program. As with all things concurrent, questions about the "current state" don't make sense: There is no concept of "simultaneity" across concurrent events. Many properties of state simply don't exist concurrently.
Let me make this more precise: Suppose you want to ask, "do you have more data". You could ask this of a concurrent container, or of your I/O system. But the answer is generally unactionable, and thus meaningless. So what if the container says "yes" – by the time you try reading, it may no longer have data. Similarly, if the answer is "no", by the time you try reading, data may have arrived. The conclusion is that there simply is no property like "I have data", since you cannot act meaningfully in response to any possible answer. (The situation is slightly better with buffered input, where you might conceivably get a "yes, I have data" that constitutes some kind of guarantee, but you would still have to be able to deal with the opposite case. And with output the situation is certainly just as bad as I described: you never know if that disk or that network buffer is full.)
So we conclude that it is impossible, and in fact unreasonable, to ask an I/O system whether it will be able to perform an I/O operation. The only possible way we can interact with it (just as with a concurrent container) is to attempt the operation and check whether it succeeded or failed. At that moment where you interact with the environment, then and only then can you know whether the interaction was actually possible, and at that point you must commit to performing the interaction. (This is a "synchronisation point", if you will.)
EOF
Now we get to EOF. EOF is the response you get from an attempted I/O operation. It means that you were trying to read or write something, but when doing so you failed to read or write any data, and instead the end of the input or output was encountered. This is true for essentially all the I/O APIs, whether it be the C standard library, C++ iostreams, or other libraries. As long as the I/O operations succeed, you simply cannot know whether further, future operations will succeed. You must always first try the operation and then respond to success or failure.
Examples
In each of the examples, note carefully that we first attempt the I/O operation and then consume the result if it is valid. Note further that we always must use the result of the I/O operation, though the result takes different shapes and forms in each example.
C stdio, read from a file:
for (;;) {
size_t n = fread(buf, 1, bufsize, infile);
consume(buf, n);
if (n == 0) { break; }
}
The result we must use is n, the number of elements that were read (which may be as little as zero).
C stdio, scanf:
for (int a, b, c; scanf("%d %d %d", &a, &b, &c) == 3; ) {
consume(a, b, c);
}
The result we must use is the return value of scanf, the number of elements converted.
C++, iostreams formatted extraction:
for (int n; std::cin >> n; ) {
consume(n);
}
The result we must use is std::cin itself, which can be evaluated in a boolean context and tells us whether the extraction succeeded (it is true as long as neither failbit nor badbit is set).
C++, iostreams getline:
for (std::string line; std::getline(std::cin, line); ) {
consume(line);
}
The result we must use is again std::cin, just as before.
POSIX, write(2) to flush a buffer:
char const * p = buf;
ssize_t n = bufsize;
for (ssize_t k = bufsize; (k = write(fd, p, n)) > 0; p += k, n -= k) {}
if (n != 0) { /* error, failed to write complete buffer */ }
The result we use here is k, the number of bytes written. The point here is that we can only know how many bytes were written after the write operation.
POSIX getline()
char *buffer = NULL;
size_t bufsiz = 0;
ssize_t nbytes;
while ((nbytes = getline(&buffer, &bufsiz, fp)) != -1)
{
/* Use nbytes of data in buffer */
}
free(buffer);
The result we must use is nbytes, the number of bytes up to and including the newline (or EOF if the file did not end with a newline).
Note that the function explicitly returns -1 (and not EOF!) when an error occurs or it reaches EOF.
You may notice that we very rarely spell out the actual word "EOF". We usually detect the error condition in some other way that is more immediately interesting to us (e.g. failure to perform as much I/O as we had desired). In every example there is some API feature that could tell us explicitly that the EOF state has been encountered, but this is in fact not a terribly useful piece of information. It is much more of a detail than we often care about. What matters is whether the I/O succeeded, more so than how it failed.
A final example that actually queries the EOF state: Suppose you have a string and want to test that it represents an integer in its entirety, with no extra bits at the end except whitespace. Using C++ iostreams, it goes like this:
std::string input = " 123 "; // example
std::istringstream iss(input);
int value;
if (iss >> value >> std::ws && iss.get() == EOF) {
consume(value);
} else {
// error, "input" is not parsable as an integer
}
We use two results here. The first is iss, the stream object itself, to check that the formatted extraction to value succeeded. But then, after also consuming whitespace, we perform another I/O operation, iss.get(), and expect it to fail as EOF, which is the case if the entire string has already been consumed by the formatted extraction.
In the C standard library you can achieve something similar with the strto*l functions by checking that the end pointer has reached the end of the input string.
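A sketch of that end-pointer check with strtol could look like this. The function name parse_whole_long is made up for illustration; it accepts trailing whitespace to mirror the std::ws step in the iostreams version.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>

// Returns true and stores the value if text is an integer
// surrounded by nothing but whitespace.
bool parse_whole_long(const char* text, long& value) {
    char* end = nullptr;
    errno = 0;
    const long parsed = std::strtol(text, &end, 10);
    if (end == text || errno == ERANGE) return false;  // no digits, or out of range
    while (std::isspace(static_cast<unsigned char>(*end))) ++end;
    if (*end != '\0') return false;                    // extra bits after the number
    value = parsed;
    return true;
}
```

As in the other examples, the decision is made from the result of the attempted conversion (where end landed), not from a question asked beforehand.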
It's wrong because (in the absence of a read error) it enters the loop one more time than the author expects. If there is a read error, the loop never terminates.
Consider the following code:
/* WARNING: demonstration of bad coding technique!! */
#include <stdio.h>
#include <stdlib.h>
FILE *Fopen(const char *path, const char *mode);
int main(int argc, char **argv)
{
FILE *in;
unsigned count;
in = argc > 1 ? Fopen(argv[1], "r") : stdin;
count = 0;
/* WARNING: this is a bug */
while( !feof(in) ) { /* This is WRONG! */
fgetc(in);
count++;
}
printf("Number of characters read: %u\n", count);
return EXIT_SUCCESS;
}
FILE * Fopen(const char *path, const char *mode)
{
FILE *f = fopen(path, mode);
if( f == NULL ) {
perror(path);
exit(EXIT_FAILURE);
}
return f;
}
This program will consistently print one greater than the number of characters in the input stream (assuming no read errors). Consider the case where the input stream is empty:
$ ./a.out < /dev/null
Number of characters read: 1
In this case, feof() is called before any data has been read, so it returns false. The loop is entered, fgetc() is called (and returns EOF), and count is incremented. Then feof() is called and returns true, causing the loop to abort.
This happens in all such cases. feof() does not return true until after a read on the stream encounters the end of file. The purpose of feof() is NOT to check if the next read will reach the end of file. The purpose of feof() is to determine the status of a previous read function and distinguish between an error condition and the end of the data stream. If fread() returns 0, you must use feof/ferror to decide whether an error occurred or if all of the data was consumed. Similarly if fgetc returns EOF. feof() is only useful after fread has returned zero or fgetc has returned EOF. Before that happens, feof() will always return 0.
It is always necessary to check the return value of a read (either an fread(), or an fscanf(), or an fgetc()) before calling feof().
Even worse, consider the case where a read error occurs. In that case, fgetc() returns EOF, feof() returns false, and the loop never terminates. In all cases where while(!feof(p)) is used, there must be at least a check inside the loop for ferror(), or at the very least the while condition should be replaced with while(!feof(p) && !ferror(p)) or there is a very real possibility of an infinite loop, probably spewing all sorts of garbage as invalid data is being processed.
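For contrast, here is a sketch of the counting loop driven by the return value of fgetc() instead of feof(). The name count_chars and the choice of returning -1 on a read error are assumptions made for this example only.

```cpp
#include <cstdio>

// Counts the characters in a stream; returns -1 if a read error occurred.
long count_chars(std::FILE* in) {
    long count = 0;
    int c;  // int, not char, so that EOF is distinguishable from data
    while ((c = std::fgetc(in)) != EOF) ++count;
    // Only now, after a read has failed, is it meaningful to ask why.
    return std::ferror(in) ? -1 : count;
}
```

The loop ends on the first failed read whatever the cause, so an empty input yields 0 and a read error cannot spin forever.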
So, in summary, although I cannot state with certainty that there is never a situation in which it may be semantically correct to write "while(!feof(f))" (although there must be another check inside the loop with a break to avoid an infinite loop on a read error), it is the case that it is almost certainly always wrong. And even if a case ever arose where it would be correct, it is so idiomatically wrong that it would not be the right way to write the code. Anyone seeing that code should immediately hesitate and say, "that's a bug". And possibly slap the author (unless the author is your boss in which case discretion is advised.)
No it's not always wrong. If your loop condition is "while we haven't tried to read past end of file" then you use while (!feof(f)). This is however not a common loop condition - usually you want to test for something else (such as "can I read more"). while (!feof(f)) isn't wrong, it's just used wrong.
feof() indicates if one has tried to read past the end of file. That means it has little predictive effect: if it is true, you are sure that the next input operation will fail (you aren't sure the previous one failed, by the way), but if it is false, you aren't sure the next input operation will succeed. Moreover, input operations may fail for reasons other than the end of file (a format error for formatted input, a pure I/O failure such as a disk failure or network timeout for all input kinds), so even if you could be predictive about the end of file (and anybody who has tried to implement Ada's version, which is predictive, will tell you it can be complex if you need to skip spaces, and that it has undesirable effects on interactive devices, sometimes forcing the input of the next line before starting the handling of the previous one), you would still have to be able to handle a failure.
So the correct idiom in C is to loop with the IO operation success as loop condition, and then test the cause of the failure. For instance:
while (fgets(line, sizeof(line), file)) {
/* note that fgets doesn't strip the terminating \n; checking for its
presence allows handling lines longer than sizeof(line), not shown here */
...
}
if (ferror(file)) {
/* IO failure */
} else if (feof(file)) {
/* format error (not possible with fgets, but would be with fscanf) or end of file */
} else {
/* format error (not possible with fgets, but would be with fscanf) */
}
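The long-line handling mentioned in the comment above could be sketched as follows. The name count_lines and the deliberately tiny 8-byte buffer are assumptions chosen to force the long-line case. A chunk that does not end in '\n' is treated as the unfinished start of a longer line.

```cpp
#include <cstdio>
#include <cstring>

// Counts logical lines using a small buffer; returns -1 on I/O failure.
long count_lines(std::FILE* file) {
    char chunk[8];
    long lines = 0;
    bool line_open = false;  // true while the current line's newline is still unseen
    while (std::fgets(chunk, sizeof chunk, file)) {
        const std::size_t len = std::strlen(chunk);
        line_open = !(len > 0 && chunk[len - 1] == '\n');
        if (!line_open) ++lines;
    }
    if (std::ferror(file)) return -1;  // the loop stopped because of an error
    if (line_open) ++lines;            // last line had no trailing newline
    return lines;
}
```

The loop condition is still the success of fgets() itself; the cause of the failure is examined only after the loop has ended.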
feof() is not very intuitive. In my very humble opinion, the FILE's end-of-file state should be set to true if any read operation results in the end of file being reached. Instead, you have to manually check if the end of file has been reached after each read operation. For example, something like this will work if reading from a text file using fgetc():
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *in = fopen("testfile.txt", "r");
while(1) {
int c = fgetc(in); /* int, not char, so EOF stays distinguishable from data */
if (feof(in)) break;
printf("%c", c);
}
fclose(in);
return 0;
}
It would be great if something like this would work instead:
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *in = fopen("testfile.txt", "r");
while(!feof(in)) {
printf("%c", fgetc(in));
}
fclose(in);
return 0;
}

Calling putback() on istream multiple times

Many sites describe the istream::putback() function that lets you "put back" a character into the input stream so you can read it again in a subsequent reading operation.
What's to stop me, however, from calling putback() multiple times in sequence over the same stream? Of course, you're supposed to check for errors after every operation in order to find out if it succeeded; and yet, I wonder: is there any guarantee that a particular type of stream supports putting back more than one character at a time?
I'm only guessing here, but I can imagine istringstream is able to put back as many characters as the length of the string within the stream; but I'm not so sure that it is the same for ifstream.
Is any of this true? How do I find out how many characters I can putback() into an istream?
If you want to read multiple characters from a stream you may unget them using unget():
std::vector<char>&read_top5(std::istream & stream, std::vector<char> & container) {
std::ios_base::sync_with_stdio(false);
char c;
int i=4;
container.clear();
while (stream && stream.get(c)) {
container.push_back(c);
if (--i < 0) break;
if (c == '\n') break;
}
for (int j=0;j<(int)container.size();j++) {
//stream.putback(container[j]); // not working
stream.unget(); // working properly
}
return container;
}
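This function reads up to the first 5 characters from stream (stopping early at a newline) and then ungets them, so they are still in the stream after the function exits. How many characters can be ungot depends on the stream's buffer, so real code should check the stream state after each unget().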
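To find out how far a particular stream lets you go back, one option is simply to try and count the successes. The sketch below assumes a made-up helper name, unget_up_to. As far as I know, the standard only says that a failed unget() sets badbit; it does not promise any particular number of characters for a given stream type.

```cpp
#include <istream>
#include <sstream>

// Tries to unget up to wanted characters; returns how many calls succeeded.
// If a call fails, the stream is left with badbit set, so the caller
// must clear() it before reading again.
int unget_up_to(std::istream& stream, int wanted) {
    int done = 0;
    while (done < wanted && stream.unget()) ++done;
    return done;
}
```

On a std::istringstream holding "abc" from which three characters have been read, asking for five returns three, because the stream cannot back up past the start of the string.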

Is there a way to check input data type using only basic concepts?

I'm being challenged to find ways to perform tasks that usually require the use of headers (besides iostream and iomanip) or greater-than-basic C++ knowledge. How can I check the data type of user input using only logical operators, basic arithmetic (+, -, *, /, %), if statements, and while loops?
Obviously the input variable has a declared data type in the first place, but this problem is covering the possibility of the user inputting the wrong data type.
I've tried several methods including the if (!(cin >> var1)) trick, but nothing works correctly. Is this possible at all?
Example
int main() {
int var1, var2;
cin >> var1;
cin >> var2;
cout << var1 << " - " << var2 << " = " << (var1-var2);
return 0;
}
It's possible to input asdf and 5.25 here, so how do I check that the input isn't an integer as expected, using only the means I stated earlier?
I understand this problem is vague in many ways, mostly because the restrictions are extremely specific and listing everything I'm allowed to use would be a pain. I guess part of the problem as mentioned in the comments is figuring out how to distinguish between data types in the first place.
You can do that using simple operations, although it might be a little difficult. For example, the following function can be used to check if the input is a decimal integer. You can extend the idea and check for a period in between to handle floating-point numbers.
Add a comment if you need further help.
bool isNumber(const char *inp){
int i = 0;
// Skip an optional leading sign
if (inp[0] == '+' || inp[0] == '-') i = 1;
// Require at least one digit after the sign
if (!inp[i]) return false;
for (; inp[i]; i++){
if (!(inp[i] >= '0' && inp[i] <= '9'))
return false;
}
return true;
}
General checking after reading is done like this:
stream >> variable;
if (stream.fail()) {
// not successful
}
This can be done on any std::ios. It works for standard types (any numeric type, char, string, etc.), stopping at whitespace. If your variable could not be read, fail() returns true. (Testing good() instead would also reject a successful read that happened to reach the end of the input, because good() is false once eofbit is set.) You can customize it for your own classes, including control over the resulting stream state:
istream & operator>>(istream & stream, YourClass & c)
{
// Read the data from stream into c
return stream;
}
For your specific problem: Suppose you read the characters 42. There is no way of distinguishing between reading it as
- an int
- a double
as both would be perfectly fine. You have to specify the input format more precisely.
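If the format is pinned down as "an integer followed by whitespace or the end of input", the check can be sketched with the stream itself plus peek(). The name read_strict_int is made up, and <sstream> is only needed to try it out without a keyboard; with std::cin the function body stays the same.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one int and accepts it only if it is followed by
// whitespace or the end of the input.
bool read_strict_int(std::istream& in, int& out) {
    int value = 0;
    if (!(in >> value)) return false;  // e.g. "asdf": the extraction itself failed
    const int next = in.peek();        // look at what follows the digits
    if (next != std::char_traits<char>::eof() &&
        next != ' ' && next != '\n' && next != '\t')
        return false;                  // e.g. "5.25": ".25" was left behind
    out = value;
    return true;
}
```

After a failure the stream still holds the offending characters (and possibly failbit), so an interactive program would call clear() and discard the rest of the line before asking again.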
The standard library is not magic - you just have to parse the data read from the user, similarly to what the standard library does.
First read the input from the user:
std::string s;
cin >> s;
(you may use getline instead if you want to read a whole line)
Then you can go on parsing it; we'll try to distinguish between integer ( *[+-]?[0-9]+ *), real number ( *[+-]?[0-9]+(\.[0-9]*)?([Ee][+-]?[0-9]+)? *), string ( *"[^"]*" *) and anything else ("bad").
enum TokenType {
Integer,
Real,
String,
Bad
};
The basic building block is a routine that "eats" consecutive digits; this will help us with the [0-9]* and [0-9]+ parts.
void eatdigits(const char *&rp) {
while(*rp>='0' && *rp<='9') rp++;
}
Also, a routine that skips whitespace can be handy:
void skipws(const char *&rp) {
while(*rp==' ') rp++;
// feel free to skip also tabs and whatever
}
Then we can attack the real problem
TokenType categorize(const char *rp) {
first, we want to skip the whitespace
skipws(rp);
then, we'll try to match the easiest stuff: the string
if(*rp=='"') {
// Skip the opening quote, then the string content
rp++;
while(*rp && *rp!='"') rp++;
// If the string stopped with anything different than " we
// have a parse error
if(!*rp) return Bad;
// Otherwise, skip the closing quote and the trailing whitespace
rp++;
skipws(rp);
// And check if we got at the end
return *rp?Bad:String;
}
Then, on to numbers, notice that the real and integer definitions start in the same way; we have a common branch:
// If there's a + or -, it's fine, skip it
if(*rp=='+' || *rp=='-') rp++;
const char *before=rp;
// Skip the digits
eatdigits(rp);
// If we didn't manage to find any digit, it's not a valid number
if(rp==before) return Bad;
// If it ends here or after whitespace, it's an integer
if(!*rp) return Integer;
before = rp;
skipws(rp);
if(before!=rp) return *rp?Bad:Integer;
If we notice that there's still stuff, we tackle the real number:
// Maybe something after the decimal dot?
if(*rp=='.') {
rp++;
eatdigits(rp);
}
// Exponent
if(*rp=='E' || *rp=='e') {
rp++;
if(*rp=='+' || *rp=='-') rp++;
before=rp;
eatdigits(rp);
if(before==rp) return Bad;
}
skipws(rp);
return *rp?Bad:Real;
}
You can easily invoke this routine after reading the input.
(notice that here the string thing is just for fun, cin does not have any special processing for double-quotes delimited strings).
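Since the routine above is split across several fragments, here is the whole thing assembled in one piece as a sketch. It assumes the digit range '0' to '9', compares against the before pointer for the "no digits" check, and steps over both quotes in the string branch; otherwise it follows the fragments.

```cpp
enum TokenType { Integer, Real, String, Bad };

static void eatdigits(const char*& rp) { while (*rp >= '0' && *rp <= '9') rp++; }
static void skipws(const char*& rp) { while (*rp == ' ') rp++; }

TokenType categorize(const char* rp) {
    skipws(rp);
    if (*rp == '"') {
        rp++;                              // opening quote
        while (*rp && *rp != '"') rp++;    // string content
        if (!*rp) return Bad;              // unterminated string
        rp++;                              // closing quote
        skipws(rp);
        return *rp ? Bad : String;
    }
    if (*rp == '+' || *rp == '-') rp++;    // optional sign
    const char* before = rp;
    eatdigits(rp);
    if (rp == before) return Bad;          // no digits at all
    if (!*rp) return Integer;
    before = rp;
    skipws(rp);
    if (before != rp) return *rp ? Bad : Integer;
    if (*rp == '.') { rp++; eatdigits(rp); }  // optional fractional part
    if (*rp == 'E' || *rp == 'e') {           // optional exponent
        rp++;
        if (*rp == '+' || *rp == '-') rp++;
        before = rp;
        eatdigits(rp);
        if (before == rp) return Bad;
    }
    skipws(rp);
    return *rp ? Bad : Real;
}
```

After reading into a std::string s, it can be called as categorize(s.c_str()); for the inputs from the question, "asdf" comes back as Bad and "5.25" as Real.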

fstream never reaches eof [duplicate]

What is wrong with using feof() to control a read loop? For example:
#include <stdio.h>
#include <stdlib.h>
int
main(int argc, char **argv)
{
char *path = "stdin";
FILE *fp = argc > 1 ? fopen(path=argv[1], "r") : stdin;
if( fp == NULL ){
perror(path);
return EXIT_FAILURE;
}
while( !feof(fp) ){ /* THIS IS WRONG */
/* Read and process data from file… */
}
if( fclose(fp) != 0 ){
perror(path);
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
What is wrong with this loop?
TL;DR
while(!feof) is wrong because it tests for something that is irrelevant and fails to test for something that you need to know. The result is that you are erroneously executing code that assumes that it is accessing data that was read successfully, when in fact this never happened.
I'd like to provide an abstract, high-level perspective. So continue reading if you're interested in what while(!feof) actually does.
Concurrency and simultaneity
I/O operations interact with the environment. The environment is not part of your program, and not under your control. The environment truly exists "concurrently" with your program. As with all things concurrent, questions about the "current state" don't make sense: There is no concept of "simultaneity" across concurrent events. Many properties of state simply don't exist concurrently.
Let me make this more precise: Suppose you want to ask, "do you have more data". You could ask this of a concurrent container, or of your I/O system. But the answer is generally unactionable, and thus meaningless. So what if the container says "yes" – by the time you try reading, it may no longer have data. Similarly, if the answer is "no", by the time you try reading, data may have arrived. The conclusion is that there simply is no property like "I have data", since you cannot act meaningfully in response to any possible answer. (The situation is slightly better with buffered input, where you might conceivably get a "yes, I have data" that constitutes some kind of guarantee, but you would still have to be able to deal with the opposite case. And with output the situation is certainly just as bad as I described: you never know if that disk or that network buffer is full.)
So we conclude that it is impossible, and in fact unreasonable, to ask an I/O system whether it will be able to perform an I/O operation. The only possible way we can interact with it (just as with a concurrent container) is to attempt the operation and check whether it succeeded or failed. At that moment where you interact with the environment, then and only then can you know whether the interaction was actually possible, and at that point you must commit to performing the interaction. (This is a "synchronisation point", if you will.)
EOF
Now we get to EOF. EOF is the response you get from an attempted I/O operation. It means that you were trying to read or write something, but when doing so you failed to read or write any data, and instead the end of the input or output was encountered. This is true for essentially all the I/O APIs, whether it be the C standard library, C++ iostreams, or other libraries. As long as the I/O operations succeed, you simply cannot know whether further, future operations will succeed. You must always first try the operation and then respond to success or failure.
Examples
In each of the examples, note carefully that we first attempt the I/O operation and then consume the result if it is valid. Note further that we always must use the result of the I/O operation, though the result takes different shapes and forms in each example.
C stdio, read from a file:
for (;;) {
size_t n = fread(buf, 1, bufsize, infile);
consume(buf, n);
if (n == 0) { break; }
}
The result we must use is n, the number of elements that were read (which may be as little as zero).
C stdio, scanf:
for (int a, b, c; scanf("%d %d %d", &a, &b, &c) == 3; ) {
consume(a, b, c);
}
The result we must use is the return value of scanf, the number of elements converted.
C++, iostreams formatted extraction:
for (int n; std::cin >> n; ) {
consume(n);
}
The result we must use is std::cin itself, which can be evaluated in a boolean context and tells us whether the stream is still in the good() state.
C++, iostreams getline:
for (std::string line; std::getline(std::cin, line); ) {
consume(line);
}
The result we must use is again std::cin, just as before.
POSIX, write(2) to flush a buffer:
char const * p = buf;
ssize_t n = bufsize;
for (ssize_t k = bufsize; (k = write(fd, p, n)) > 0; p += k, n -= k) {}
if (n != 0) { /* error, failed to write complete buffer */ }
The result we use here is k, the number of bytes written. The point here is that we can only know how many bytes were written after the write operation.
POSIX getline()
char *buffer = NULL;
size_t bufsiz = 0;
ssize_t nbytes;
while ((nbytes = getline(&buffer, &bufsiz, fp)) != -1)
{
/* Use nbytes of data in buffer */
}
free(buffer);
The result we must use is nbytes, the number of bytes up to and including the newline (or EOF if the file did not end with a newline).
Note that the function explicitly returns -1 (and not EOF!) when an error occurs or it reaches EOF.
You may notice that we very rarely spell out the actual word "EOF". We usually detect the error condition in some other way that is more immediately interesting to us (e.g. failure to perform as much I/O as we had desired). In every example there is some API feature that could tell us explicitly that the EOF state has been encountered, but this is in fact not a terribly useful piece of information. It is much more of a detail than we often care about. What matters is whether the I/O succeeded, more-so than how it failed.
A final example that actually queries the EOF state: Suppose you have a string and want to test that it represents an integer in its entirety, with no extra bits at the end except whitespace. Using C++ iostreams, it goes like this:
std::string input = " 123 "; // example
std::istringstream iss(input);
int value;
if (iss >> value >> std::ws && iss.get() == EOF) {
consume(value);
} else {
// error, "input" is not parsable as an integer
}
We use two results here. The first is iss, the stream object itself, to check that the formatted extraction to value succeeded. But then, after also consuming whitespace, we perform another I/O/ operation, iss.get(), and expect it to fail as EOF, which is the case if the entire string has already been consumed by the formatted extraction.
In the C standard library you can achieve something similar with the strto*l functions by checking that the end pointer has reached the end of the input string.
It's wrong because (in the absence of a read error) it enters the loop one more time than the author expects. If there is a read error, the loop never terminates.
Consider the following code:
/* WARNING: demonstration of bad coding technique!! */
#include <stdio.h>
#include <stdlib.h>
FILE *Fopen(const char *path, const char *mode);
int main(int argc, char **argv)
{
FILE *in;
unsigned count;
in = argc > 1 ? Fopen(argv[1], "r") : stdin;
count = 0;
/* WARNING: this is a bug */
while( !feof(in) ) { /* This is WRONG! */
fgetc(in);
count++;
}
printf("Number of characters read: %u\n", count);
return EXIT_SUCCESS;
}
FILE * Fopen(const char *path, const char *mode)
{
FILE *f = fopen(path, mode);
if( f == NULL ) {
perror(path);
exit(EXIT_FAILURE);
}
return f;
}
This program will consistently print one greater than the number of characters in the input stream (assuming no read errors). Consider the case where the input stream is empty:
$ ./a.out < /dev/null
Number of characters read: 1
In this case, feof() is called before any data has been read, so it returns false. The loop is entered, fgetc() is called (and returns EOF), and count is incremented. Then feof() is called and returns true, causing the loop to abort.
This happens in all such cases. feof() does not return true until after a read on the stream encounters the end of file. The purpose of feof() is NOT to check if the next read will reach the end of file. The purpose of feof() is to determine the status of a previous read function
and distinguish between an error condition and the end of the data stream. If fread() returns 0, you must use feof/ferror to decide whether an error occurred or if all of the data was consumed. Similarly if fgetc returns EOF. feof() is only useful after fread has returned zero or fgetc has returned EOF. Before that happens, feof() will always return 0.
It is always necessary to check the return value of a read (either an fread(), or an fscanf(), or an fgetc()) before calling feof().
Even worse, consider the case where a read error occurs. In that case, fgetc() returns EOF, feof() returns false, and the loop never terminates. In all cases where while(!feof(p)) is used, there must be at least a check inside the loop for ferror(), or at the very least the while condition should be replaced with while(!feof(p) && !ferror(p)) or there is a very real possibility of an infinite loop, probably spewing all sorts of garbage as invalid data is being processed.
So, in summary, although I cannot state with certainty that there is never a situation in which it may be semantically correct to write "while(!feof(f))" (although there must be another check inside the loop with a break to avoid a infinite loop on a read error), it is the case that it is almost certainly always wrong. And even if a case ever arose where it would be correct, it is so idiomatically wrong that it would not be the right way to write the code. Anyone seeing that code should immediately hesitate and say, "that's a bug". And possibly slap the author (unless the author is your boss in which case discretion is advised.)
No it's not always wrong. If your loop condition is "while we haven't tried to read past end of file" then you use while (!feof(f)). This is however not a common loop condition - usually you want to test for something else (such as "can I read more"). while (!feof(f)) isn't wrong, it's just used wrong.
feof() indicates if one has tried to read past the end of file. That means it has little predictive effect: if it is true, you are sure that the next input operation will fail (you aren't sure the previous one failed BTW), but if it is false, you aren't sure the next input operation will succeed. More over, input operations may fail for other reasons than the end of file (a format error for formatted input, a pure IO failure -- disk failure, network timeout -- for all input kinds), so even if you could be predictive about the end of file (and anybody who has tried to implement Ada one, which is predictive, will tell you it can complex if you need to skip spaces, and that it has undesirable effects on interactive devices -- sometimes forcing the input of the next line before starting the handling of the previous one), you would have to be able to handle a failure.
So the correct idiom in C is to loop with the IO operation success as loop condition, and then test the cause of the failure. For instance:
while (fgets(line, sizeof(line), file)) {
/* note that fgets doesn't strip the terminating \n; checking for its
presence allows handling lines longer than sizeof(line), not shown here */
...
}
if (ferror(file)) {
/* IO failure */
} else if (feof(file)) {
/* format error (not possible with fgets, but would be with fscanf) or end of file */
} else {
/* format error (not possible with fgets, but would be with fscanf) */
}
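The same idiom can be sketched with fscanf, where a format error is actually possible. This is only an illustration under stated assumptions: ReadEnd, sum_ints and temp_file_with are hypothetical names invented here, and the code is written as C++ using the C stdio functions from <cstdio>. As in the comments above, the end-of-file result can also hide input that was cut off in the middle of a number.

```cpp
#include <cstdio>

// Hypothetical names for this sketch; they are not part of any library.
enum class ReadEnd { EndOfFile, FormatError, IoError };

// Sums whitespace-separated integers from `file`, using the success of
// fscanf itself as the loop condition, then reports why the loop stopped.
ReadEnd sum_ints(std::FILE* file, long& sum) {
    sum = 0;
    int value = 0;
    while (std::fscanf(file, "%d", &value) == 1) {
        sum += value;
    }
    if (std::ferror(file)) return ReadEnd::IoError;    // IO failure
    if (std::feof(file))   return ReadEnd::EndOfFile;  // ran out of input
    return ReadEnd::FormatError;  // input remains but does not parse as int
}

// Helper for trying the sketch: a temporary file holding `text`,
// rewound so that reading starts at the beginning.
std::FILE* temp_file_with(const char* text) {
    std::FILE* f = std::tmpfile();
    if (f) {
        std::fputs(text, f);
        std::rewind(f);
    }
    return f;
}
```

Calling sum_ints on a file containing "1 2 x" stops at the x with a format error after summing the first two values, while "1 2 3" followed by a newline ends with end of file.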
feof() is not very intuitive. In my very humble opinion, the FILE's end-of-file state should be set to true if any read operation results in the end of file being reached. Instead, you have to manually check if the end of file has been reached after each read operation. For example, something like this will work if reading from a text file using fgetc():
#include <stdio.h>
int main(int argc, char *argv[])
{
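FILE *in = fopen("testfile.txt", "r");
if (in == NULL) return 1; /* could not open the file */
while(1) {
int c = fgetc(in); /* int, not char, so that EOF stays distinguishable */
if (feof(in) || ferror(in)) break; /* stop at end of file or on a read error */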
printf("%c", c);
}
fclose(in);
return 0;
}
It would be great if something like this would work instead:
#include <stdio.h>
int main(int argc, char *argv[])
{
FILE *in = fopen("testfile.txt", "r");
while(!feof(in)) {
printf("%c", fgetc(in));
}
fclose(in);
return 0;
}
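To see why the second version does not behave as hoped, here is a small sketch that counts loop iterations. The helper names are made up for this illustration, and it is written as C++ using <cstdio>. For a file of three characters, the while(!feof(...)) body runs four times, because the end-of-file flag is only set once a read has already failed.

```cpp
#include <cstdio>

// Counts how many times the body of a while(!feof(f)) loop runs.
// Hypothetical helper written for this illustration only.
int feof_loop_iterations(std::FILE* f) {
    int iterations = 0;
    while (!std::feof(f)) {
        std::fgetc(f);   // the last call returns EOF and only then sets the flag
        ++iterations;
    }
    return iterations;
}

// Counts characters by testing the result of fgetc itself.
int fgetc_loop_iterations(std::FILE* f) {
    int iterations = 0;
    int c;               // int, so that EOF is distinguishable from any character
    while ((c = std::fgetc(f)) != EOF) {
        ++iterations;
    }
    return iterations;
}
```

The extra iteration in the first function is where the original program would print one spurious character.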

Reading a text document character by character

I am reading a text file character by character using ifstream infile.get().
The call sits inside an infinite while loop, which should exit once the end-of-file (EOF) condition is reached. The while loop itself sits within a function of type void.
Here is the pseudo-code:
void function (...) {
while(true) {
...
if ( (ch = infile.get()) == EOF) {return;}
...
}
}
When I "cout" the characters on the screen, it goes through all the characters and then keeps running, outputting what appears to be blank space, i.e. it never breaks out of the loop. I have no idea why. Any ideas?
In C++, you don't compare the return value with EOF. Instead, you can use a stream function such as good() to check if more data can be read. Something like this:
while (infile.good()) {
ch = infile.get();
if (!infile) break; // the read itself failed: end of file or error
// ...
}
One idiom that makes it relatively easy to read from a file and detect the end of the file correctly is to combine the reading and the testing into a single, atomic, event, such as:
while (infile >> ch)
or:
while (std::getline(infile, instring))
Of course, you should also consider using a standard algorithm, such as copy:
std::copy(std::istream_iterator<char>(infile),
std::istream_iterator<char>(),
std::ostream_iterator<char>(std::cout, "\n"));
One minor note: by default, reading with >> will skip white space. When you're doing character-by-character input/processing, you usually don't want that. Fortunately, disabling that is pretty easy:
infile.unsetf(std::ios_base::skipws);
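A minimal sketch of the difference that flag makes, assuming a string stream in place of a real file. copy_chars and its keep_whitespace switch are hypothetical, added only to compare the two behaviours.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>

// Copies every character of `in` into a string via istream_iterator.
// `keep_whitespace` is a hypothetical switch added for this comparison.
std::string copy_chars(std::istream& in, bool keep_whitespace) {
    if (keep_whitespace) {
        in.unsetf(std::ios_base::skipws);  // same call as shown above
    }
    std::string out;
    std::copy(std::istream_iterator<char>(in),
              std::istream_iterator<char>(),
              std::back_inserter(out));
    return out;
}
```

With the input "a b\nc", the default setting yields "abc", while clearing skipws keeps the space and the newline.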
Try converting the function to one that returns int, and return 1 when EOF is reached.
The reason it is not working is that get() returns an int but you are storing the result in a char.
Assigning the result of get() to a char is fine as long as the value read was a real character. BUT EOF is not a character: it is an int value outside the range of valid characters. When it is assigned to a char it gets truncated, so the subsequent comparison to EOF is unreliable. Where char is unsigned the comparison always fails, and where char is signed a legitimate character with value 0xFF compares equal to EOF.
This should work:
void function (...)
{
while(true)
{
...
int value;
if ( (value = infile.get()) == EOF) {return;}
char ch = value;
...
}
}
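The effect of the narrowing can be sketched as follows, assuming a string stream in place of the file. Both function names are hypothetical, and the cap parameter exists only so that the faulty loop cannot spin forever on platforms where char is unsigned.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Counts characters by keeping the result of get() in an int, so the
// end-of-file value stays distinct from every real character.
int count_with_int(std::istream& in) {
    int count = 0;
    int value;
    while ((value = in.get()) != std::char_traits<char>::eof()) {
        ++count;
    }
    return count;
}

// Same loop, but the result is narrowed to char before the comparison.
// Where char is signed, the byte 0xFF becomes -1 and looks like EOF;
// where char is unsigned, the comparison can never succeed, so the
// hypothetical `cap` stops the loop instead.
int count_with_char(std::istream& in, int cap) {
    int count = 0;
    char ch;
    while (count < cap &&
           (ch = static_cast<char>(in.get())) != std::char_traits<char>::eof()) {
        ++count;
    }
    return count;
}
```

For the five-byte input "ab\xffyz", count_with_int reports 5. count_with_char reports a wrong count in either case: it stops early at the 0xFF byte where char is signed, and runs up to the cap where char is unsigned.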
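But it should be noted that it is a lot easier to use the more standard pattern where the read is done as part of the condition. Unfortunately, the no-argument get() does not give you that functionality, because it returns the character rather than the stream (the get(char&) overload does return the stream). Another option is to switch to a method that uses iterators.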
Note the standard istream_iterator will not work as you expect (as it ignores white space). But you can use the istreambuf_iterator (notice the buf after istream) which does not ignore white space.
void function (...)
{
for(std::istreambuf_iterator<char> loop(infile);
loop != std::istreambuf_iterator<char>();
++loop)
{
char ch = *loop;
...
}
}
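One more option, besides the iterator loop, is the get(char&) overload of std::istream. It returns the stream, so the read can serve as the loop condition, and it does not skip white space. A minimal sketch, with read_all as a hypothetical name and a string stream standing in for the file:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads every character, whitespace included, with the read itself as
// the loop condition. `read_all` is a hypothetical name for this sketch.
std::string read_all(std::istream& in) {
    std::string out;
    char ch;
    while (in.get(ch)) {   // get(char&) returns the stream; false on EOF or error
        out.push_back(ch);
    }
    return out;
}
```

After the loop, in.eof() tells whether the stop was due to end of file, as opposed to some other failure.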