How to know if the next character is EOF in C++ - c++

I need to know if the next char in an ifstream is the end of file. I'm trying to do this with .peek():
if (file.peek() == -1)
and
if (file.peek() == file.eof())
But neither works. Is there a way to do this?
Edit: What I'm trying to do is to add a letter to the end of each word in a file. In order to do so I ask if the next char is a punctuation mark, but in this way the last word is left without an extra letter. I'm working just with char, not string.

istream::peek() returns the constant EOF (which is not guaranteed to be equal to -1) when it detects end-of-file or error. To check robustly for end-of-file, do this:
int c = file.peek();
if (c == EOF) {
if (file.eof())
// end of file
else
// error
} else {
// do something with 'c'
}
You should know that the underlying OS primitive, read(2), only signals EOF when you try to read past the end of the file. Therefore, file.eof() will not be true when you have merely read up to the last character in the file. In other words, file.eof() being false does not mean the next read operation will succeed.
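As a small illustration of the point above, the following sketch uses a std::istringstream in place of the file. The helper names at_end and eof_before_and_after_peek are made up for this example, and it compares against traits_type::eof() rather than the EOF macro, which is the same value for char streams.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Returns true when no further character can be read from `in`.
// peek() attempts the read without consuming anything, so a failed
// peek is what actually sets eofbit.
bool at_end(std::istream& in) {
    using traits = std::istream::traits_type;
    return traits::eq_int_type(in.peek(), traits::eof());
}

// Illustration: read the only character of a one-character stream and
// report what eof() says before and after peeking.
std::string eof_before_and_after_peek() {
    std::istringstream in("x");
    in.get();                          // consumes 'x', the last character
    std::string report = in.eof() ? "true" : "false";
    report += ",";
    at_end(in);                        // the peek hits the end and sets eofbit
    report += in.eof() ? "true" : "false";
    return report;                     // "false,true"
}
```

The first eof() is false even though nothing is left to read; only the attempted look-ahead flips it.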

This should work:
if (file.peek(), file.eof())
But why not just check for errors after making an attempt to read useful data?

file.eof() returns a flag value. It becomes true once a read has run into the end of the file. EOF is not an actual character stored in the file; it is a condition reported by the OS. So once a read (or peek) has hit the end, file.eof() will be true.
So, instead of if (file.peek() == file.eof()) you should have if (true == file.eof()) after a read (or peek) to check if you reached the end of file (which is what you're trying to do, if I understand correctly).

For a stream connected to the keyboard the eof condition is that I intend to type Ctrl+D/Ctrl+Z during the next input.
peek() is totally unable to see that. :-)

Usually to check end of file I used:
if(cin.fail())
{
// Do whatever here
}
Another such way to implement that would be..
while(!cin.fail())
{
// Do whatever here
}
Additional information would be helpful so we know what you want to do.

There is no way of telling if the next character is the end of the file, and trying to do so is one of the commonest errors that new C and C++ programmers make, because there is no end-of-file character in most operating systems. What you can tell is that reading past the current position in a stream will read past the end of file, but this is in general pretty useless information. You should instead test all read operations for success or failure, and act on that status.
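For the task described in the question (adding a letter after each word), testing each read could look roughly like the sketch below. The function name append_to_words is hypothetical, and it assumes a "word" is a run of alphabetic characters; the last word is handled after the loop, once the read has failed.

```cpp
#include <cctype>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: copies `in` to a string, adding `extra` after every
// run of letters. No look-ahead is needed: a word ends when a non-letter
// arrives, or when the read itself fails at end of input.
std::string append_to_words(std::istream& in, char extra) {
    std::string out;
    bool in_word = false;
    char c;
    while (in.get(c)) {                 // the read is the loop condition
        bool letter = std::isalpha(static_cast<unsigned char>(c)) != 0;
        if (in_word && !letter)
            out += extra;               // the previous word just ended
        out += c;
        in_word = letter;
    }
    if (in_word)
        out += extra;                   // the last word ended at end of input
    return out;
}
```

With the input "hi, you" and the extra letter 'x', this produces "hix, youx".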

You didn't show any code you are working with, so there is some guessing on my part. You don't usually need low-level facilities (like peek()) when working with streams. What you are probably interested in is istream_iterator. Here is an example:
cout << "enter value";
for(istream_iterator<double> it(cin), end;
it != end; ++it)
{
cout << "\nyou entered value " << *it;
cout << "\nTry again ...";
}
You can also use istreambuf_iterator to work on buffer directly:
cout << "Please, enter your name: ";
string name;
for(istreambuf_iterator<char> it(cin.rdbuf()), end;
it != end && *it != '\n'; ++it)
{
name += *it;
}
cout << "\nyour name is " << name;

Just use this code on macOS:
if (true == file.eof())
It works for me on macOS!

Related

Does this STL operator >> function make magic happen?

I have a weird problem when I test C++ STL features.
If I uncomment the line if(eee), my while loop never exits.
I'm using vs2015 under 64-bit Windows.
int i = 0;
istream& mystream = data.getline(mycharstr,128);
size_t mycount = data.gcount();
string str(mycharstr,mycharstr+mycount);
istringstream myinput(str);
WORD myfunclist[9] = {0};
for_each(myfunclist,myfunclist+9, [](WORD& i){ i = UINT_MAX;});
CALLEESET callee_set;
callee_set.clear();
bool failbit = myinput.fail();
bool eof = myinput.eof();
while (!failbit && !eof)
{
int eee = myinput.peek();
if (EOF == eee) break;
//if (eee) // if i uncomment this line ,the failbit and eof will always be false,so the loop will never exit.
{
myinput >> myfunclist[i++];
}
//else break;
failbit = myinput.fail();
eof = myinput.eof();
cout << myinput.rdstate() << endl;
}
I think that
int eee = myinput.peek();
at some point returns zero.
Then due to
if (eee)
you stop reading from the stream and never reach EOF.
Try to do
if (eee >= 0)
instead
As an alternative you could do:
if (eee < 0)
{
break;
}
// No need for further check of eee - just do the read
myinput >> myfunclist[i++];
The root cause of your problem is a misunderstanding about the way streams set their flags: fail() and eof() only become true once a read operation has failed or has tried to read past the last byte.
In other words, with C++ streams you may well have read the last byte of your input and be at the end of the file, yet eof() will stay false until you try to read more. You will find many questions and answers on StackOverflow about why you should not loop on eof in a C++ stream.
Consequences:
You will always enter into the loop, even if there is no character to read in myinput.
You therefore have to check for the special case of peek() returning EOF.
If you're still in the loop after the peek, then there are still characters to read. Keep in mind that peek() does not consume the characters. If you do not read them in a proper way, you stay at the same position in the stream. So if for any reason you do not reach myinput >> myfunclist[i++];, you're stuck in an endless loop, peeking at the same character over and over again. This is the 0 case that is well described in 4386427's answer: the character is still there and you make no progress in the stream.
Other comments:
Since your input can be 128 bytes long and you read integers in text form, a malicious input could contain up to 64 separate numbers, causing your loop to go out of bounds and, for example, corrupt memory.
It is not clear why you peek at all.
I'd suggest to forget about the flags, use the usual stream reading idiom and simplify the code to:
...
callee_set.clear(); // until there, no change
while (i<9 && myinput >> myfunclist[i++])
{
cout << myinput.rdstate() << endl; // if you really want to know ;-)
}
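A self-contained variant of the same idiom is sketched below. It is only an illustration: it uses int and std::vector instead of the WORD array from the question, and the function name read_up_to is made up. The index only advances after a successful extraction.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Reads at most `max_count` whitespace-separated integers from `text`.
// The extraction itself is the loop condition, so the loop stops on the
// first failed read (end of input or a non-numeric token).
std::vector<int> read_up_to(const std::string& text, std::size_t max_count) {
    std::istringstream in(text);
    std::vector<int> values;
    int v;
    while (values.size() < max_count && in >> v)
        values.push_back(v);            // only stored after a successful read
    return values;
}
```

Because the size check comes first, no extraction is attempted once the limit is reached, and no peek() or flag inspection is needed.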

c++ appending text into a string until see a specific character

I have more than one input files like this:
>1aab_
GKGDPKKPRGKMSSYAFFVQTSREEHKKKHPDASVNFSEFSKKCSERWKT
MSAKEKGKFEDMAKADKARYEREMKTYIPPKGE
>1j46_A
MQDRVKRPMNAFIVWSRDQRRKMALENPRMRNSEISKQLGYQWKMLTEAE
KWPFFQEAQKLQAMHREKYPNYKYRPRRKAKMLPK
>1k99_A
MKKLKKHPDFPKKPLTPYFRFFMEKRAKYAKLHPEMSNLDLTKILSKKYK
ELPEKKKMKYIQDFQREKQEFERNLARFREDHPDLIQNAKK
>2lef_A
MHIKKPLNAFMLYMKEMRANVVAESTLKESAAINQILGRRWHALSREEQA
KYYELARKERQLHMQLYPGWSARDNYGKKKKRKREK
Here, what I have to do:
vector <string> names;
vector <string> seqs;
names.resize(total); //"total" is already known.
seqs.resize(total);
counter=0;char input;
while ((input = myInput.get()) != EOF)
{
if(input=='>')
names[counter]= take all line (>1aab_, >1j46_A, so...)
else
untill the see next '>' append the character into sequence[counter]
counter++;
}
Finally it will be like this:
names[0]=">1aab_"
sequence[0]="GKGDPKKPRGKMSSYAFFVQTSREEHKKKHPDASVNFSEFSKKCSERWKTMSAKEKGKFEDMAKADKARYEREMKTYIPPKGE"
and so on..
I have been thinking about this for 2 hours and couldn't figure it out. Can anyone help with that? Thanks in advance.
There are a few ways to solve it; I'll present some examples, but I'm not testing/compiling this code, so there may be minor bugs - the logic is the important bit.
Since your pseudocode appears to be processing the input character by character, I've taken that as a requirement.
The way you seem to be thinking about it would be implemented with essentially a pair of loops - one for reading the name, the other for reading the sequence - which are enclosed in an outer loop, in order to process all records.
This would look something like the following:
// first character in file should be a '>', indicating the start
// of a record.
input = myInput.get();
if (input != '>')
{
std::cerr << "Malformed input file!" << std::endl;
return /*...*/;
}
do
{
// record name continues up until the newline
while ((input = myInput.get()) != EOF)
{
if (input == '\n' || input == '\r')
break;
names[counter].push_back(input);
}
// read sequence until we hit a '>' or EOF
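while ((input = myInput.get()) != EOF)
{
if (input == '>')
{
// advance to next record number
counter++;
break;
}
// skip line breaks so the sequence is stored as one continuous string
if (input == '\n' || input == '\r')
continue;
sequence[counter].push_back(input);
}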
} while (input != EOF && counter < total);
You'll also notice I moved the check for the initial '>' to before the loop, just as a way of ingesting (and discarding) the character, as well as a basic sanity check of the input. This is because we really use this character to mark the end of the sequence (rather than the "start of a record") - when we enter the loop, we assume we're already reading the record name.
Another way to approach it is to use a state machine. Essentially, this utilises additional variables to track the state the parser is in.
For this particular case, you only have two states: either you're reading a record name, or the sequence. So, we can just use a single boolean to track which state we're in.
Armed with the state variable, we can then make decisions about what to do with the character we just read based upon the state we're in. At the simplest level here, if we're in "read the record name" state, we add the character to the names variable, otherwise we add it to the sequence variable.
// state flag to indicate if we're currently reading a name line,
// i.e. a line starting with ">"
// This should be set true by the first record we encounter, so
// we'll set it false (to indicate we're reading a sequence) in
// order to allow us to detect bad input files.
bool reading_name = false;
// indicate we're on the first record, so we can avoid incrementing
// the record counter
bool first_record = true;
// process input character-by-character until end of file
while ((input = myInput.get()) != EOF)
{
// check for start of new record
if (input == '>')
{
// for robustness, verify we're not already reading a name,
// as this probably indicates invalid input
if (reading_name)
{
std::cerr << "Input is malformed?!" << std::endl;
break;
}
// switch to reading name state
reading_name = true;
// advance to next record, but only if it isn't the first record
if (first_record)
{
// disable the first_record flag, and explicitly set the
// record counter to 0.
first_record = false;
counter = 0;
}
else if (++counter >= total)
{
std::cerr << "Error: too many records!" << std::endl;
break;
}
}
// first character in file should start a new record
else if (first_record)
{
std::cerr << "Missing record start character at beginning of input!" << std::endl;
break;
}
// make sure we are processing a valid record number
else if (counter >= total)
{
std::cerr << "Invalid record number!" << std::endl;
break;
}
// continue reading the name
else if (reading_name)
{
// check if we've reached the end of the line; you
// may also want/need to check for \r if your input
// files may have Windows-style line endings
if (input == '\n')
{
// switch to reading sequence state
reading_name = false;
}
else
{
// add character to current name
names[counter].push_back(input);
}
}
// continue reading the sequence
else
{
// you might need to handle line ending characters here,
// maybe just skipping them?
// add character to current sequence
sequence[counter].push_back(input);
}
}
This adds a fair amount of complexity, which is of questionable value for this particular exercise, but does make adding additional states easier in future. It also has the benefit of only a single place in the code where I/O is done, which reduces the chances of errors (not checking for EOF, overflow array bounds, etc.).
In this case, we're actually using the '>' character as an indicator that a new record is starting, so we add a bit of extra logic to make sure that all works properly with the record counter. You could also just use a signed integer for your counter variable and start it at -1, so it will increment to 0 at the start of the first record, but using signed variables to index into arrays isn't a good idea.
There are more ways to approach this problem, but hopefully this gives you somewhere to start on your own solution.
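One such alternative, if the character-by-character requirement can be relaxed, is to read whole lines with std::getline. The sketch below is only an illustration: the Record struct and the read_records name are made up, and it grows a vector instead of relying on a known total.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct Record {          // hypothetical type, not from the question
    std::string name;    // header line, including the leading '>'
    std::string seq;     // sequence lines joined without newlines
};

// Reads '>'-delimited records line by line.
std::vector<Record> read_records(std::istream& in) {
    std::vector<Record> records;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty() && line.back() == '\r')
            line.pop_back();                 // tolerate Windows line endings
        if (line.empty())
            continue;
        if (line[0] == '>')
            records.push_back(Record{line, ""});
        else if (!records.empty())
            records.back().seq += line;      // data before any header is ignored
    }
    return records;
}
```

Each header line starts a new record, and every following line is appended to that record's sequence until the next header.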

Simple C++ not reading EOF

I'm having a hard time understanding why while (cin.get(Ch)) doesn't see the EOF. I read in a text file with 3 words, and when I debug my WordCount is at 3 (just what I hoped for). Then it goes back to the while loop and gets stuck. Ch then has no value. I thought that after the newline it would read the EOF and break out. I am not allowed to use <fstream>, I have to use redirection in DOS. Thank you so much.
#include <iostream>
using namespace std;
int main()
{
char Ch = ' ';
int WordCount = 0;
int LetterCount = 0;
cout << "(Reading file...)" << endl;
while (cin.get(Ch))
{
if ((Ch == '\n') || (Ch == ' '))
{
++WordCount;
LetterCount = 0;
}
else
++LetterCount;
}
cout << "Number of words => " << WordCount << endl;
return 0;
}
while (cin >> Ch)
{ // we get in here if, and only if, the >> was successful
if ((Ch == '\n') || (Ch == ' '))
{
++WordCount;
LetterCount = 0;
}
else
++LetterCount;
}
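That's the common way to write a read loop, with minimal changes to your code. Note, however, that operator>> skips whitespace by default, so inside this loop Ch will never be ' ' or '\n' unless std::noskipws is set on the stream; cin.get(Ch), as in your original, does not skip whitespace.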
(Your code is unusual, trying to scan all characters and count whitespace and newlines. I'll give a more general answer to a slightly different question - how to read in all the words.)
The safest way to check if a stream is finished is if(stream). Beware of if(stream.good()) - it doesn't always work as expected and will sometimes quit too early. The last >> into a char will not take us to EOF, but the last >> into an int or string will take us to EOF. This inconsistency can be confusing. Therefore, it is not correct to use good(), or any other test that tests EOF.
string word;
while(cin >> word) {
++word_count;
}
There is an important difference between if(cin) and if(cin.good()). The former is the operator bool conversion. Usually, in this context, you want to test:
"did the last extraction operation succeed or fail?"
This is not the same as:
"are we now at EOF?"
After the last word has been read by cin >> word, the stream is at EOF. But the extraction succeeded and word contains the last word.
TLDR: The eof bit is not what matters. The fail bit (together with the bad bit) is: it tells us that the last extraction was a failure.
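The difference can be observed with a std::istringstream standing in for cin. This is a minimal sketch; both function names are made up for the illustration.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Counts whitespace-separated words; the extraction is the loop condition.
int count_words(std::istream& in) {
    int n = 0;
    std::string word;
    while (in >> word)
        ++n;
    return n;
}

// Returns true if, after extracting the only word of "fun!", the stream
// reports eof() while the extraction itself still counts as a success.
bool last_word_is_valid_at_eof() {
    std::istringstream in("fun!");
    std::string word;
    in >> word;                 // reads up to the end of input
    return in.eof() && !in.fail() && word == "fun!";
}
```

A loop that stopped as soon as eof() or !good() was seen would drop that last word, whereas while (in >> word) keeps it.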
The Counting
The program counts newline and space characters as words. In your file contents "this is fun!" I see two spaces and no newline. This is consistent with the observed output indicating two words.
Have you tried looking at your file with a hex editor or something similar to be sure of the exact contents?
You could also change your program to count one more word if the last character read in the loop was a letter. This way you don't have to have newline terminated input files.
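A sketch of that change is shown below. It is an assumption about what the suggestion means rather than the poster's code: the function name is made up, it takes any std::istream so it can be tried without redirection, and it treats every whitespace character as a separator.

```cpp
#include <cctype>
#include <istream>
#include <sstream>

// Counts words character by character. A word is counted when whitespace
// follows it, and once more after the loop if the input ended mid-word.
int count_words_by_char(std::istream& in) {
    int words = 0;
    bool in_word = false;
    char ch;
    while (in.get(ch)) {
        bool space = std::isspace(static_cast<unsigned char>(ch)) != 0;
        if (space && in_word)
            ++words;            // whitespace closes the current word
        in_word = !space;
    }
    if (in_word)
        ++words;                // last word had no trailing newline
    return words;
}
```

Tracking in_word also keeps consecutive spaces from being counted as extra words.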
Loop Termination
I have no explanation for your loop termination issues. The while-condition looks fine to me. istream::get(char&) returns a stream reference. In a while-condition, depending on the C++ level your compiler implements, operator bool or operator void* will be applied to the reference to indicate if further reading is possible.
Idiom
The standard idiom for reading from a stream is
char c = 0;
while( cin >> c )
process(c);
I do not deviate from it without serious reason.
Your input file is
this is fun!{EOF}
The two spaces make WordCount increase to 2,
and then EOF is reached and the loop exits. If you add a newline, WordCount reaches 3 and your input file is
this is fun!\n{EOF}
I took your program, loaded it into Visual Studio 2013, changed cin to an fstream object that opened a file called stuff.txt containing the exact characters "This is fun!\n\r", and the program worked. As previous answers have indicated, be careful: if there is no \n at the end of the text, the program will miss the last word. However, I wasn't able to reproduce the application hanging in an infinite loop. The code as written looks correct to me.
cin.get(char) returns a reference to an istream object, which then has its operator bool() called; that returns false when failbit or badbit is set. There are some better ways to write this code to deal with other error conditions... but this code works for me.
In your case, the correct way to bail out of the loop is:
while (cin.good()) {
char Ch = cin.get();
if (cin.good()) {
// do something with Ch
}
}
That said, there are probably better ways to do what you're trying to do.

Detecting space in a file in c++

Hi, I was just wondering if anybody could help me. I am reading characters from a file and inserting them into a map. I have the code working, but how do I detect whether a space is in the file? I need to store the number of times a space occurred in the file. Any help would be great, thanks.
map<char, int> treeNodes; //character and the frequency
ifstream text("test.txt");
while(!text.eof())
{
text >> characters;
//getline(text,characters);
cout << characters;
if(treeNodes.count(characters) == 0)
{
if(isspace (characters))
{
cout << "space" << endl;
}
else
treeNodes.insert(pair<char,int>(characters,1));
}
else
{
treeNodes[characters] += 1;
}
}
Formatted input, i.e. when using the right shift operator>>() skips leading whitespace by default. You can turn this off using std::noskipws but depending on what sort of things you want to read it won't be a very happy experience. The best approach is probably using unformatted input, i.e. something like std::getline() and split the line on space within the program.
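For completeness, the std::noskipws route mentioned above could look like the following sketch. The function name is made up, and a std::istringstream stands in for the file in the usage note.

```cpp
#include <ios>
#include <istream>
#include <map>
#include <sstream>

// Counts every character, whitespace included, using formatted input
// with whitespace skipping turned off.
std::map<char, int> count_chars_noskipws(std::istream& in) {
    std::map<char, int> freq;
    char c;
    in >> std::noskipws;        // note: this flag stays set on the stream
    while (in >> c)             // the read is the loop condition
        ++freq[c];
    return freq;
}
```

For the input "a b a\n" this counts two spaces, two 'a', one 'b' and one newline.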
If you just want to count the number of times any particular character occurred, you probably want to use std::istreambuf_iterator<char> and just iterate over the content of the stream (this code also omits some other unnecessary clutter):
for (std::istreambuf_iterator<char> it(text), end; it != end; ++it) {
++treeNodes[*it];
}
BTW, you never want to use the result of eof() for something different than determining whether the last read failed because the stream has reached its end.
couldn't you just cast the char to an int and test if it is equal to the ascii value of a space?

C++ file handling (structures)

Following code, when compiled and run with g++,
prints '1' twice, whereas I expect '1' to be printed
only once, since I am dumping a single structure to
the file, but while reading back it seems to be
reading two structures. Why?
#include <iostream>
#include <fstream>
using namespace std;
int main(){
struct student
{
int rollNo;
};
struct student stud1;
stud1.rollNo = 1;
ofstream fout;
fout.open("stu1.dat");
fout.write((char*)&stud1,sizeof(stud1));
fout.close();
ifstream filin("stu1.dat");
struct student tmpStu;
while(!filin.eof())
{
filin.read((char*)&tmpStu,sizeof(tmpStu));
cout << tmpStu.rollNo << endl;
}
filin.close();
}
eof only gets set after a read fails, so the read runs twice, and the second time, it doesn't modify the buffer.
Try this:
while(filin.read((char*)&tmpStu,sizeof(tmpStu)))
{
cout << tmpStu.rollNo << endl;
}
Or
while(!filin.read((char*)&tmpStu,sizeof(tmpStu)).eof())
{
cout << tmpStu.rollNo << endl;
}
Read returns a reference to filin when called, which will evaluate to true if the stream is still good. When read fails to read any more data, the reference will evaluate to false, which will prevent it from entering the loop.
Your while loop is executing twice because the EOF condition is not true until the first attempt to read beyond the end of the file. So the cout is executed twice.
This prints 1 twice because of the exact way eof and read work. If you are at the very end of a file, read will fail, then calls to eof after that return true. If you have not attempted to read past the end of the file, eof will return false because the stream is not in the EOF state, even though there is no more data left to read.
To summarize, your calls look like this:
eof - false (at beginning of file)
read (at beginning of file)
eof - false (now at end of file, but EOF not set)
read (at end of file. fails and sets EOF state internally)
eof - true (EOF state set)
A better strategy would be to check eof right after the read call.
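One way to check the read itself is sketched below. It is an illustration rather than the original program: the struct is renamed Student, the function name read_students is made up, and the usage note assumes a std::stringstream in place of stu1.dat.

```cpp
#include <istream>
#include <sstream>
#include <vector>

struct Student {
    int rollNo;
};

// Reads fixed-size records until a read fails. A record is only stored
// after read() has delivered all sizeof(Student) bytes.
std::vector<Student> read_students(std::istream& in) {
    std::vector<Student> out;
    Student s;
    while (in.read(reinterpret_cast<char*>(&s), sizeof s))
        out.push_back(s);
    return out;
}
```

Writing a single Student with rollNo 1 into a stringstream and passing it to read_students yields exactly one record, because the second read fails before anything is stored or printed.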
I believe it is because you are checking for filin.eof() and that won't be true until the second time you read.
The documentation for read() notes that eofbit is set when "...The end of the source of characters is reached before n characters have been read ...". In your case you won't hit EOF until the second read.
Cool.
Another way (courtesy of experts-exchange, I asked the same question there :-))
while(filin.peek() != EOF)
{
filin.read((char*)&tmpStu,sizeof(tmpStu));
cout << tmpStu.rollNo << endl;
}