Unable to parse final word in text file using peek() - c++

I'm attempting to write a lexer and parser but I'm having trouble getting the final variable in a text file due to in_file.tellg() equaling -1. My program only works if I add a space character after the variable, otherwise I get a compiler error. I want to mention that I'm able to get every other variable in the text file but the last one. I believe the cause of the problem is in_file.peek()!=EOF setting in_file.tellg() to -1.
My program is something like this:
ifstream in(file_name);
char c;
in >> noskipws;
while(in >> c ){
if(is_letter_part_of_variable(c)) {
int start_pos = in.tellg(),
end_pos,
length;
while(is_letter_part_of_variable(c) && in.peek()!=EOF ) {
in>>c;
}
end_pos = in.tellg(); // This becomes -1 for some reason
length = end_pos - start_pos; // Should be 7
// Reset file pointer to original position to chomp word.
in.clear();
in.seekg(start_pos-1, in.beg);
// The word 'message' should go in here.
char *identifier = new char[length];
in.read(identifier, length);
identifier[length] = '\0';
}
}
example.text
message = "Hello, World"
print message
I tried removing peek()!= EOF which gives me an eternal loop. I tried !in_file.eof() and that also makes tellg() equal to -1. What can I do to fix/enhance this code?

I believe the cause of the problem is in_file.peek()!=EOF setting in_file.tellg() to -1.
Close. peek attempts to read a character and returns EOF if there is nothing left to read. Hitting the end of the stream this way sets the stream's eofbit, so the stream is no longer good(). tellg reports failure on a stream in that state and returns -1.
Simple Solution
Clear the stream's error flags with in.clear() before calling tellg.
Better solution
Use std::string.
std::string identifier;
while(in>>c && is_letter_part_of_variable(c)) {
identifier += c;
}
All of the messing around with peek, seekg, tellg and the dreaded new vanish.

How to read just before EOF from a file and put it into a string? [duplicate]

My function reads a file and puts it into a string in order for me to process it. I need to read just before EOF, obviously. The problem is that the EOF character is also put inside the string and I can't find a way to bypass it, since it leads other parts of the program to fail. I link the function below.
string name_to_open, ret = string();
ifstream in;
getline(cin, name_to_open);
in.open(name_to_open.c_str());
if (!in.is_open()) {
cout << "Error." << endl;
return string();
}
else {
ret += in.get();
while (in.good()) {
ret += in.get();
};
};
in.close();
return ret;
The function reads fine until the end of the file, then appends EOF and \0. How can I solve the problem? Does the EOF character work fine in controls? I also tried to put a line ret[ret.size() - 1] = '\0'; at the end of the cycle, but this doesn't seem to work either.
ret += in.get(); appends the character read from the file to the string whether the value read was good or not. You need to 1) read, 2) test that the read is valid and the value read is safe to use, 3) use the value read. Currently your code reads, uses, and then tests whether or not the value read was safe to use.
Possible solution:
int temp;
while ((temp = in.get()) != EOF) // read and test. Enter if not EOF
{
ret += static_cast<char>(temp); // add the character
}
Note: get returns an int, not a char. This is to be able to insert out-of-band codes such as EOF without colliding with an existing valid character. Immediately treating the return value as a char could result in bugs because a special code may be mishandled.
Note: there are many better ways to read an entire file into a string: How do I read an entire file into a std::string in C++?

How can I convert a char array to a string in C++?

I have a char array called firstFileStream[50], which is being written to from an infile using fstream.
I want to convert this char array into a string called firstFileAsString. If I write string firstFileAsString = firstFileStream; it only writes the first word within the array and stops at the first space, or empty character. If I write firstFileAsString(firstFileStream) I get the same output.
How do I write the whole char array, so all words within it, to a string?
Here is the code to read in and write:
string firstInputFile = "inputText1.txt";
char firstFileStream[50];
ifstream openFileStream;
openFileStream.open(firstInputFile);
if (strlen(firstFileStream) == 0) { // If the array is empty
cout << "First File Stream: " << endl;
while (openFileStream.good()) { // While we haven't reached the end of the file
openFileStream >> firstFileStream;
}
string firstFileAsString = firstFileStream;
}
My problem, as zdan pointed out, is I was only reading the first word of the file, so instead I've used istreambuf_iterator<char> to assign the content directly to the string rather than the character array first. This can then be broken down into a character array, rather than the other way around.
openFileStream >> firstFileStream;
reads only one word from the file.
A simple example of reading the whole file (at least up to the buffering capacity) looks like this:
openFileStream.read(firstFileStream, sizeof(firstFileStream) - 1);
// sizeof(firstFileStream) - 1 so we have space for the string terminator
int bytesread = 0; // stays 0 if the read fails, so the terminator below is still written in bounds
if (openFileStream.eof()) // read whole file
{
bytesread = openFileStream.gcount(); // read whatever gcount returns
}
else if (openFileStream) // no error. stopped reading before buffer overflow or end of file
{
bytesread = sizeof(firstFileStream) - 1; //read full buffer
}
else // file read error
{
// handle file error here. Maybe gcount, maybe return.
}
firstFileStream[bytesread] = '\0'; // null terminate string

Does this STL operator>> function make magic happen?

I have a weird problem when I test C++ STL features.
If I uncomment the line if(eee), my while loop never exits.
I'm using vs2015 under 64-bit Windows.
int i = 0;
istream& mystream = data.getline(mycharstr,128);
size_t mycount = data.gcount();
string str(mycharstr,mycharstr+mycount);
istringstream myinput(str);
WORD myfunclist[9] = {0};
for_each(myfunclist,myfunclist+9, [](WORD& i){ i = UINT_MAX;});
CALLEESET callee_set;
callee_set.clear();
bool failbit = myinput.fail();
bool eof = myinput.eof();
while (!failbit && !eof)
{
int eee = myinput.peek();
if (EOF == eee) break;
//if (eee) // if i uncomment this line ,the failbit and eof will always be false,so the loop will never exit.
{
myinput >> myfunclist[i++];
}
//else break;
failbit = myinput.fail();
eof = myinput.eof();
cout << myinput.rdstate() << endl;
}
I think that
int eee = myinput.peek();
at some point returns zero.
Then due to
if (eee)
you stop reading from the stream and never reach EOF.
Try to do
if (eee >= 0)
instead
As an alternative you could do:
if (eee < 0)
{
break;
}
// No need for further check of eee - just do the read
myinput >> myfunclist[i++];
The root cause of your problem is a misunderstanding about the way streams set their flags: fail() and eof() only become true once a read operation has failed or has tried to read past the last byte.
In other words, with C++ streams you may well have read the last byte of your input and be at the end of the file, yet eof() will stay false until you try to read more. You will find many questions and answers on StackOverflow about why you should not loop on eof in a C++ stream.
Consequences:
You will always enter into the loop, even if there is no character to read in myinput.
You therefore have to check for the special case of peek() returning EOF.
If you're still in the loop after the peek, then there are still characters to read. Keep in mind that peek() does not consume the character. If you do not then actually read it, you stay at the same position in the stream. So if for any reason you do not reach myinput >> myfunclist[i++];, you're stuck in an endless loop, peeking at the same character over and over again. This is the 0 case that is well described in 4386427's answer: the character is still there and you do not progress in the stream.
Other comments:
since your input can be 128 bytes long, and you read integers in text encoding, you could have evil input with 64 different words, causing your loop to go out of bounds and cause, for example, memory corruption.
It is not clear why you try to peek at all.
I'd suggest to forget about the flags, use the usual stream reading idiom and simplify the code to:
...
callee_set.clear(); // until there, no change
while (i < 9 && myinput >> myfunclist[i])
{
++i; // advance only after a successful read, so i counts the values actually read
cout << myinput.rdstate() << endl; // if you really want to know ;-)
}

c++ reading undefined number of lines with eof()

I'm dealing with a problem using eof().
using
string name;
int number, n=0;
while(!in.eof())
{
in >> name >> number;
//part of code that puts into object array
n++;
}
seems normal to me, as it should stop whenever there is no more text in the file.
But what I get is n being 4200317. When I view the array entries, I see that the first ones are the ones in the file and the others are 0s.
What could be the problem and how should I solve it? Maybe there's an alternative to this reading problem (having undefined number of lines)
The correct way:
string name;
int number;
int n = 0;
while(in >> name >> number)
{
// The loop is only entered if both name and number are read
// successfully from the input stream. If either read fails, the
// stream's failbit is set and the loop is not entered.
// This works because the result of the >> operator is the std::istream itself.
// When an istream is used in a boolean context, it is converted to
// a value that can be tested, based on !fail().
// If the stream has not failed, the result is treated as true.
//part of code that puts into object array
n++;
}
Why your code fails:
string name;
int number, n=0;
while(!in.eof())
{
// If you are on the last line of the file.
// This will read the last line. BUT it will not read past
// the end of file. So it will read the last line leaving no
// more data but it will NOT set the EOF flag.
// Thus it will reenter the loop one last time
// This last time it will fail to read any data and set the EOF flag
// But you are now inside the loop, so it will still process all the
// commands that come after the read.
in >> name >> number;
// To prevent anything bad.
// You must check the state of the stream after using it:
if (!in)
{
break; // or fix as appropriate.
}
// Only do work if the read worked correctly.
n++;
}
in << name << number;
This looks like writing, not reading.
Am I wrong?
int number, n = 0;
You weren't initializing n, and you seem to have a typo.
This would probably be more correct:
string name;
int number, n = 0;
while (in >> name && in >> number)
{
n++;
}
Looping on eof() is bad practice.
Note that there is a subtle difference here from your code: your code ended when it encountered eof, or silently looped forever if it found a malformed line (Hello World, for example). This code ends when it encounters an incorrectly formatted "tuple" of name + number, when the file ends, or on other errors (like the disk being disconnected during the operation :-) ). If you want to check whether the file was read completely, check in.eof() after the while loop. If it is true, the whole file was read correctly.

getline seems to not be working correctly

Please tell me what am I doing wrong here. What I want to do is this:
1.Having txt file with four numbers and each of this numbers has 15 digits:
std::ifstream file("numbers.txt",std::ios::binary);
I'm trying to read those numbers into my array:
char num[4][15];
And what I think I'm doing is: for as long as you don't reach the end of the file, write every line (max 15 chars, ending at '\n') into num[lines]. But this somehow doesn't work. Firstly, it correctly reads only the first number; the rest are just "" (empty strings). Secondly, file.eof() doesn't seem to work correctly either. With the txt file presented below this code, lines reached 156. What's going on?
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(num[lines],15,'\n');
}
So the whole "routine" looks like this:
int main()
{
std::ifstream file("numbers.txt",std::ios::binary);
char numbers[4][15];
for (unsigned lines = 0; !file.eof(); ++lines)
{
file.getline(numbers[lines],15,'\n');// sizeof(numbers[0])
}
}
This is contents of my txt file:
111111111111111
222222222222222
333333333333333
444444444444444
P.S.
I'm using VS2010 sp1
Do not use the eof() function! The canonical way to read lines is:
while( getline( cin, line ) ) {
// do something with line
}
file.getline() extracts 14 characters, filling in num[0][0] .. num[0][13]. Then it stores a '\0' in num[0][14] and sets the failbit on file because that's what it does when the buffer is full but terminating character not reached.
Further attempts to call file.getline() do nothing because failbit is set.
Tests for !file.eof() return true because the eofbit is not set.
Edit: to give a working example, best is to use strings, of course, but to fill in your char array, you could do this:
#include <iostream>
#include <fstream>
int main()
{
std::ifstream file("numbers.txt"); // not binary!
char numbers[4][16]={}; // 16 to fit 15 chars and the '\0'
for (unsigned lines = 0;
lines < 4 && file.getline(numbers[lines], 16);
++lines)
{
std::cout << "numbers[" << lines << "] = " << numbers[lines] << '\n';
}
}
tested on Visual Studio 2010 SP1
According to ifstream doc, reading stops either after n-1 characters are read or delim sign is found : first read would take then only 14 bytes.
It reads bytes: '1' (the character) is 0x31, so your buffer would be filled with 0x31 instead of the number 1 as you seem to expect; the last character will be 0 (end of C string).
Side note, your code doesn't check that lines doesn't go beyond your array.
Using getline supposes you're expecting text and you open the file in binary mode : seems wrong to me.
It looks like the '\n' at the end of the first line is not being consumed and remains in the buffer, so the next getline() reads it.
Try adding a file.get() after each getline().
If one file.get() does not work, try two, because on Windows a line ends with '\r\n'.
Change it to the following:
#include <fstream>
int main()
{
//no need to use std::ios_base::binary since it's ASCII data
std::ifstream file("numbers.txt");
//allocate one more position in array for the null terminator
char numbers[4][16];
//you only have 4 lines, so don't use EOF since that will cause an extra read
//which will then cause an extra loop iteration, writing past the end of the array (undefined behavior)
for (unsigned lines = 0; lines < 4; ++lines)
{
//copy into your buffer that also includes space for a terminating null
//placing in if-statement checks for the failbit of ifstream
if (!file.getline(numbers[lines], 16,'\n'))
{
//make sure to place a terminating NULL in empty string
//since the read failed
numbers[lines][0] = '\0';
}
}
}