std::getline and eol vs eof - c++

I've got a program that is tailing a growing file.
I'm trying to avoid grabbing a partial line from the file (e.g. reading before the line is completely written by the other process.) I know it's happening in my code, so I'm trying to catch it specifically.
Is there a sane way to do this?
Here's what I'm trying:
if (getline (stream, logbuffer))
{
if (stream.eof())
{
cout << "Partial line found!" << endl;
return false;
}
return true;
}
return false;
However, I can't easily reproduce the problem so I'm not sure I'm detecting it with this code. std::getline strips off newlines, so I can't check the buffer for a trailing newline. My log message (above) is NEVER tripping.
Is there some other way of trying to check what I want to detect? Is there a way to know if the last line I read hit EOF without finding an EOL character?
Thanks.

This check fires only when the line ran into the end of the file:
if (getline (stream, logbuffer))
{
if (stream.eof())
{
/// reached only if EOF was hit before a newline
If getline() extracts some characters and then hits the end of the file before finding a newline, it sets eofbit but not failbit, so the call still succeeds and eof() returns true. When it does find the newline, eofbit stays clear. The eof() and related state tests only report on the results of a previous read operation such as getline(); they do not predict what the next read will do.
So your test should already detect a partial line. However, if the other process writes a line at a time, the problems you say you are experiencing should be very rare (non-existent in my experience), depending to some extent on the OS you are using, which would explain why your log message never trips. I suspect the problem lies elsewhere, probably in your code. Tailing a file is a very common thing to do, and one does not normally need to resort to special code to do it.
However, should you find you do need to read partial lines, the basic algorithm is as follows:
forever do
wait for file change
read all possible input using read or readsome (not getline)
chop input into lines and possible partial line
process as required
end
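The "chop input into lines and possible partial line" step can be sketched as below. This is only an illustration under assumptions: LineAssembler is a hypothetical class, lines are assumed to end in '\n', and the waiting and raw reading steps (read or readsome) are left to the caller.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: accumulates raw chunks and hands back only
// complete lines; any trailing partial line stays buffered.
class LineAssembler {
public:
    // Append a chunk and return every line it completes (newline removed).
    std::vector<std::string> feed(const std::string& chunk) {
        pending_ += chunk;
        std::vector<std::string> lines;
        std::string::size_type start = 0;
        std::string::size_type nl;
        while ((nl = pending_.find('\n', start)) != std::string::npos) {
            lines.push_back(pending_.substr(start, nl - start));
            start = nl + 1;
        }
        pending_.erase(0, start);    // keep only the unfinished tail
        return lines;
    }

    // The partial line seen so far (empty if there is none).
    const std::string& partial() const { return pending_; }

private:
    std::string pending_;
};
```

Each time the file grows, the caller would pass the newly read bytes to feed() and process the returned lines; whatever partial() holds is simply completed by a later chunk.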

An istream object such as std::cin has a get function that stops reading when it gets to a newline without extracting it from the stream. You could then peek() or get() it to see if indeed it is a newline. The catch is that you have to know the maximum length of a line coming from the other application. Example (untested) code follows below:
char buf[81]; // assumes an 80-char line length + null char
memset(buf, 0, 81);
if (cin.get(buf, 81))
{
if (cin.peek() == EOF) // You ran out of data before hitting end of line
{
cout << "Partial line found!\n";
}
}

I have to take issue with one statement you made here:
However, I can't easily reproduce the problem so I'm not sure I'm detecting it with this code.
It seems like from what you said it would be extremely easy to replicate your problem, if it is what you said. You can easily create a text file in some text editor - just make sure that the last line ends at EOF instead of going on to a new line. Then point your program at that file and see what results.

Even if the other program isn't done writing the file, in the file that's where the line ends, so there's no way to tell the difference other than waiting to see if the other program writes something new.
edit: If you just want to tell if the line ends in a newline or not, you could write your own getline function that reads until it hits a newline but doesn't strip it.

Related

Program Almost Runs, Trouble With File Operation

The program almost runs, but I am not sure how to make the .txt file for this. It's not giving me an error.
the project asks me to:
"File encryption is the science of writing the contents of a file in a secret code. Your encryption program should work like a filter, reading the contents of one file, modifying the data into a code, and then writing the coded contents out to a second file.
The second file will be a version of the first file, but written in a secret code. Although there are complex encryption techniques, you should come up with a simple one of your own. For example, you could read the first file one character at a time, and add 10 to the ASCII code of each character before it is written to the second file. "
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
char ch;
fstream fin, fout;
fin.open("testone.txt", ios::in);
fout.open("encrypted.txt", ios::out);
while (!fin.eof())
{
fin.get(ch);
fout.put(ch + 10);
}
fin.close();
fout.close();
system("pause");
return 0;
}
Read this -
Error LNK1561: entry point must be defined
https://social.msdn.microsoft.com/Forums/vstudio/en-US/e1200aa9-34c7-487c-a87e-0d0368fb3bff/error-lnk1561-entry-point-must-be-definedproblem-in-c?forum=vclanguage
Not up on my Visual C, but you may need #include <cstdlib> to get system
LNK1561 means your main function can't be found. Clearly the main function is present, so this should compile. Follow Beta's suggestion and ensure you can compile and run a trivial program.
Putting compiling issues aside, this code won't work.
Overarching Problem: You are not checking for any errors along the way, so there is no way for your program to tell if anything has gone wrong.
For example, what if the file didn't open? The while (!fin.eof()) becomes an infinite loop. If the file is not open, you can never read EOF. Trying to use EOF as a loop condition is a bad idea anyway. Definitely read the link in @Steephen's comment.
If you fail to read a character with fin.get(ch); then what? The current code tries to use the character anyway. Bad idea.
Testing a stream is pretty simple. if (!fin) does the job. Read up on how streams work to learn why. This simple test doesn't tell you what went wrong, but at least you know something went wrong.
To make things easier, most stream functions return the stream. This lets you chain stream operations together and makes if (!fin.get(ch)) an easy way to tell if get worked.
So your IO loop can be as simple as
while (fin.get(ch) && fout.put(ch + 10))
{
}
If get couldn't get ch for any reason--unopened file, end of file, unreadable file--the while loop exits. Afterwards you can query fin to find out why. If EOF, awesome. If not EOF, the output file's probably wrong.
The same applies to put. If put failed, the loop ends. Test for why and decide if you want to keep the file.
I also recommend dropping a quick test at the end of main to print out a check.
fin.open("encrypted.txt", ios::in);
while (fin.get(ch) && std::cout.put(ch - 10))
{
}
A better test would be to read the character, undo the encryption, and compare against the original input.
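That round-trip check could look roughly like the sketch below. The helper names shift_copy and round_trips are hypothetical, and in-memory string streams stand in for testone.txt and encrypted.txt so the example needs no files.

```cpp
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: copies `in` to `out`, adding `shift` to each byte.
// Returns true only if input ended at EOF and every write succeeded.
bool shift_copy(std::istream& in, std::ostream& out, int shift) {
    char ch;
    while (in.get(ch)) {
        // Work in unsigned char so the arithmetic wraps around at 256.
        const unsigned char shifted =
            static_cast<unsigned char>(static_cast<unsigned char>(ch) + shift);
        if (!out.put(static_cast<char>(shifted))) {
            return false;            // write failed
        }
    }
    return in.eof();                 // false if reading stopped for another reason
}

// Encrypts, decrypts, and compares the result with the original text.
bool round_trips(const std::string& original) {
    std::istringstream plain(original);
    std::ostringstream coded;
    if (!shift_copy(plain, coded, 10)) return false;

    std::istringstream coded_in(coded.str());
    std::ostringstream decoded;
    if (!shift_copy(coded_in, decoded, -10)) return false;

    return decoded.str() == original;
}
```

With real files, the same shift_copy would take an ifstream and an ofstream; an input file that failed to open makes the first get fail without setting eofbit, so the function returns false instead of looping forever.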

Reading a text file from the first line multiple times (C++)

I'm using "getline" to read some lines in a text file. It works as it should, but I'm calling the method multiple times.
while(getline(file, line))
{
//Do something
}
//More code in between
while(getline(file, line))
{
//Do something else
}
The problem is that when I call "getline" the second time it starts reading from where it previously finished (e.g. If the first while loop ends at the second line then the next loop starts at the third line). How can I ensure that my program reads the file from the first line every time?
If you need that same first line multiple times I think you should reconsider your strategy.
Just read the line once.
Save it in a variable (or just keep it in the variable "line" you already have).
Close the file.
You would avoid a lot of unnecessary I/O operations...
Nonetheless, as other people suggested, if for any reason you want to proceed with this approach, you need to insert:
myinputstream.clear(); //clear the error flags (eofbit/failbit) left by the previous loop
myinputstream.seekg(0, ios::beg); //reset the reading position to beginning
between each attempt to read the same file.
And do not forget to close it eventually.
myinputstream.close();
There's a seekg() function that should help
http://www.cplusplus.com/reference/istream/istream/seekg/
file.seekg(0, ios::beg);
will move you to the beginning of the stream (call file.clear() first if a previous read already hit the end of the file)

Difference between while(!file.eof()) and while(file >> variable)

First things first - I've got a text file in which there are binary numbers, one number for each row. I'm trying to read them and sum them up in a C++ program. I've written a function which transforms them to decimal and adds them after that and I know for sure that function's ok. And here's my problem - for these two different ways of reading a text file, I get different results (and only one of these results is right) [my function is decimal()]:
ifstream file;
file.open("sample.txt");
int sum = 0;
string BinaryNumber;
while (!file.eof()){
file >> BinaryNumber;
sum+=decimal(BinaryNumber);
}
and that way my sum is too large, but by a small quantity.
ifstream file;
file.open("sample.txt");
int sum = 0;
string BinaryNumber;
while (file >> BinaryNumber){
sum+=decimal(BinaryNumber);
}
and this way gives me the right sum. After some testing I came to the conclusion that the while loop with eof() is making one more iteration than the other while loop. So my question is - what is the difference between those two ways of reading from a text file? Why does the first while loop give me the wrong result, and what is this extra iteration that it's doing?
The difference is that >> reads the data first, and then tells you whether it has been a success or not, while file.eof() does the check prior to the reading. That is why you get an extra read with the file.eof() approach, and that read is invalid.
You can modify the file.eof() code to make it work by moving the check to a place after the read, like this:
// This code has a problem, too!
while (true) { // We do not know if it's EOF until we try to read
file >> BinaryNumber; // Try reading first
if (file.eof()) { // Now it's OK to check for EOF
break; // We're at the end of file - exit the loop
}
sum+=decimal(BinaryNumber);
}
However, this code would break if there is no delimiter following the last data entry. So your second approach (i.e. checking the result of >>) is the correct one.
When using file.eof() to test the input, the last input probably fails and the value stays unchanged and is, thus, processed twice: when reading a string, the stream first skips leading whitespace and then reads characters until it finds whitespace. Assuming the last value is followed by a newline, the stream hasn't touched EOF yet, i.e., file.eof() isn't true, but reading a string fails because there are no non-whitespace characters left.
When using file >> value the operation is executed and checked for success: always use this approach! The use of eof() is only to determine whether the failure to read was due to EOF being hit or something else.

Having problems with 0x0A character in C++ even in binary mode. (interprets it as new file)

Hi, this might seem a bit noobie, but here we go. I'm developing a program that downloads leaderboards of a certain game from the internet and transforms them into a proper format to work with (elaborate rankings, etc).
The files contain the names, ordered by rank, but between each name there are 7 random control codes (obviously unprintable). The txt file looks like this:
..C...hName1..)...&Name2......)Name3..é...þName4..Ü...†Name5..‘...QName6..~...bName7..H...NName8..|....Name9..v...HName10.
I checked it in a hex editor and saw that the first control code after each name is always a null character (0x00). So, what I do is read everything, and then cout every character. When a 0x00 character is found, skip 7 characters and keep couting. Therefore you end up with the list, right?
At first I had the problem that among those random control codes you would sometimes find a "soft EOF" (0x1A), and the program would stop reading there. So I finally figured out to open it in binary mode. It worked, and then everything would be couted... or that's what I thought.
But I came across another file which still didn't work, and finally found out that there was an EOF character! (0x0A) Which doesn't make sense since I'm opening it in binary mode. But still, after reading that character, C++ interprets that as a new file, and hence skips 7 characters, so the name after that character will always appear cut.
Here's my current code:
#include <cstdlib>
#include <iostream>
#include <fstream>
using namespace std;
int main () {
string scores;
system("wget http://certainwebsite/001.txt"); //download file
ifstream highin ("001.txt", ios::binary);
ofstream highout ("board.txt", ios::binary);
if (highin.is_open())
{
while ( highin.good() )
{
getline (highin, scores);
for (int i=0;i<scores.length(); i++)
{
if (scores[i]==0x00){
i=i+7; //skip 7 characters if 'null' is found
cout << endl;
highout << endl;
}
cout << scores[i];
highout << scores[i]; //cout names and save them in output file
}
}
highin.close();
}
else cout << "Unable to open file";
system("pause>nul");
}
Not sure how to ignore that character if being already in binary mode doesn't work. Sorry for the long question but I wanted to be detailed and specific. In this case, the EOF character is located before the Name3, and hence this is what the output looks like:
http://i.imgur.com/yu1NjoZ.png
By default getline() reads until the end of line and discards the newline character. However, the delimiter character could be customized (by supplying the third parameter). If you wish to read until the null character (not until the end of line), you could try using getline (highin, scores, '\0'); (and adjusting the logic of skipping the characters).
I'm glad you figured it out and it doesn't surprise me that getline() was the culprit. I had a similar issue dealing with the newline character when I was trying to read in a CSV file. There are several different getline() functions in C++ depending on how you call the function and each seems to handle the newline character differently.
As a side note, in your for loop, I'd recommend against performing a method call in your test. That adds unnecessary overhead to the loop. It'd be better to call the method once and put that value into a variable, then enter the loop and test i against the length variable. Unless you expect the length to change, calling the length() method each iteration is a waste of system resources.
Thank you all guys, it worked, it was the getline() which was giving me problems indeed. Due to the 'while' loop, each time it found a new line character, it restarted the process, hence skipping those 7 characters.

What is the "right" way to read a file with C++ fstreams?

I am using the standard C++ fstreams library and I am wondering what is the right way to use it. By experience I sort of figured out a small usage protocol, but I am not really sure about it. For the sake of simplicity let's assume that I just want to read a file, e.g., to filter its content and put it on another file. My routine is roughly as follows:
I declare a local ifstream i("filename") variable to open the file;
I check either i.good() or i.is_open() to handle the case where something went bad when opening, e.g., because the file does not exist; after, I assume that the file exists and that i is ok;
I call i.peek() and then again i.good() or i.eof() to rule out the case where the file is empty; after, I assume that I have actually something to read;
I use >> or whatever to read the file's content, and eof() to check that I am over;
I do not explicitly close the file - I rely on RAII and keep my methods as short and coherent as I can.
Is it a sane (correct, minimal) routine? In the negative case, how would you fix it? Please note that I am not considering races - synchronization is a different affair.
I would eliminate the peek/good/eof (your third step). Simply attempt to read your data, and check whether the attempted read succeeded or failed. Likewise, in the fourth step, just check whether your attempted read succeeded or not.
Typical code would be something like:
std::ifstream i("whatever");
if (!i)
error("opening file");
while (i >> your_data)
process(your_data);
if (!i.eof())
// reading failed before end of file
It's simpler than you have described. The first two steps are fine (but the second is not necessary if you follow the rest of my advice). Then you should attempt extraction, but use the extraction as the condition of a loop or if statement. If, for example, the file is formatted as a series of lines (or other delimited sequences) all of the same format, you could do:
std::string line;
while (std::getline(i, line)) {
// Parse line
}
The body of the loop will only execute if the line extraction works. Of course, you will need to check the validity of the line inside the loop.
If you have a certain series of extractions or other operations to do on the stream, you can place them in an if condition like so:
if (i >> some_string &&
i.get() == '-' &&
i >> some_int) {
// Use some_string and some_int
}
If the first extraction fails, i.get() will not execute due to short-circuit evaluation of &&. The body of the if statement will only execute if both extractions succeed. If you have two extractions together, you can of course chain them:
if (i >> some_string >> some_int) {
// Use some_string and some_int
}
The second extraction in the chain will not occur if the first one fails. A failed extraction puts the stream in a state in which all following extractions also fail automatically.
For this reason, it's also fine to place the stream operations outside of the if condition and then check the state of the stream:
i >> some_string >> some_int;
if (i) {
// Use some_string and some_int
}
With both of these methods, you don't have to check for certain problems with the stream. Checking the stream for eof() doesn't necessarily mean that the next read will fail. A common case is when people use the following incorrect extraction loop:
// DO NOT DO THIS
while (!i.eof()) {
std::getline(i, line);
// Do something with line
}
Most text files end with a newline that text editors hide from you. When getline reads the last line, it stops at that \n without touching the end of the file, so eof() is still false. The loop therefore continues, attempts to extract a next line that doesn't exist, and fails, but the body runs anyway. People often observe this as "reading the last line of the file twice".