std::getline partially reads first and last line and sets eof-bit - c++

I need to read csv-files with C++: the first line of the file contains all column titles, the remaining lines contain floating point data (examples below, files have been shrunk down).
A few files cause problems. I'm using the following code:
#include <iostream>
#include <fstream>
#include <string>
// Compiled and tested with Clang++ on Ubuntu 14.04
int main(int argc, char** argv) {
std::ifstream in;
in.open(argv[1]);
if(!in.is_open()) {
std::cerr << "Cannot open file: " << argv[1] << "\n";
return 1;
}
std::string buff;
std::getline(in, buff);
while(!in.eof()) {
std::cout << buff << "\n";
getline(in, buff);
}
in.close();
return 0;
}
For most files this runs okay, reading one line each iteration; example of a 'good' file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RAU14,AU05,AU17,AU26,Forward,Backward
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,20.0
0.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.667,0.0
58.3,50.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
62.4,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,20.0
Some files go crazy and set the eof-bit after the first getline. After this first read, buff contains part of the first line and part of the last line; example of a 'bad' file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Occlusion,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RAU14,AU05,Au17,AU57,AU58
0,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0.3,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.3,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
57.9,66.667,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
60.3,33.333,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
And the contents of buff after one call to getline:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Occlusion,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RA60.3,33.333,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
As you can see, the first line gets mixed with the last line. I can't figure out what's going wrong. Each line ends with a \n, the file ends with an empty \n.
I suppose my question is: why does getline skip to end-of-file while mixing the first and last line for some of the files while others work fine?
Edit: I need to convert a big dataset to a new, more consistent format. The current format is full of inconsistencies (using 0 and 0.0 or AU17 and Au17). Still, these formatting problems should not affect simply reading the file, right?
Edit2:
cat -v -e -t on a good file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,AU05,AU17,AU26,Forward,Backward^M$
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,66.667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0^M$
0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0^M$
etc...
cat -v -e -t on a bad file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Occlusion,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RAU14,AU05,Au17,AU57,AU58^M0,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M0.3,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M1.3,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M1.4,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M1.8,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0,0,0,0,25,0^M2.8,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M3,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M31,0,0,0,0,33.333,0,0,0,0,25,0,0,0,0,0,0,0,0,0,0^M31.1,0,0,0,0,50,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0^M31.2,0,0,0,0,66.667,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0^M31.4,0,0,33.333,0,66.667,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0^M31.5,0,0,33.333,0,66.667,0,0,0,0,50,25,0,0,0,0,0,0,0,0,0^M32,0,0,33.333,0,66.667,0,0,0,0,50,25,0,0,0,0,0,0,0,0,25^M32.1,0,0,33.333,0,83.333,0,0,0,0,50,25,0,0,0,0,0,0,0,0,25^M32.2,0,0,33.333,0,83.333,0,0,0,0,25,25,0,0,0,0,0,0,0,0,25^M32.4,0,0,33.333,0,83.333,0,0,0,0,25,0,0,0,0,0,0,0,0,0,25^M32.7,0,0,33.333,0,83.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,25^M33,0,0,33.333,0,83.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M33.5,0,0,0,0,83.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M33.9,0,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M55,33.333,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M55.2,66.667,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M55.8,100,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M56.8,100,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,25^M57.4,66.667,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,25^M57.8,66.667,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M57.9,66.667,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M60.3,33.333,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Seems like a big difference, how can I solve this?

It seems that the files are missing the newline character and instead contain only carriage-return characters (shown as ^M, i.e. Ctrl-M).
You can fix it by using cat with the file, and piping to tr to translate each carriage return to a newline:
$ cat your-file | tr '\r' '\n' > your-file-fixed
After seeing your comment about the files coming from Mac OS, I assume that they use the old pre-OS X convention, in which a line ended with a single carriage return.

Related

devc++ input from file does not work

I'm trying to redirect a .txt content to .exe
program.exe < file.txt
and contents of file.txt are
35345345345
34543534562
23435635432
35683045342
69849593458
95238942394
28934928341
but the first index in array is the file path and the file contents is not displayed.
int main(int argc, char *args[])
{
for(int c = 0; c<argc; c++){
cout << "Param " << c << ": " << args[c] << "\n";
}
system("PAUSE");
return EXIT_SUCCESS;
}
Desired output:
Param0: 35345345345
Param1: 34543534562
Param2: 23435635432
Param3: 35683045342
Param4: 69849593458
Param5: 95238942394
Param6: 28934928341
The myapp < file.txt syntax passes to stdin (or cin if you prefer), not the arguments.
You have misunderstood what argc and argv are for. They contain the command line arguments to your program. If, for example, you ran:
program.exe something 123
The null terminated strings pointed to by argv will be program.exe, something, and 123.
You are attempting to redirect the contents of a file to program.exe using < file.txt. This is not a command line argument. It simply redirects the contents of the file to the standard input of your program. To get those contents you will need to extract from std::cin.
When you say "but the first index in array is the file path and the file contents is not displayed." it sounds like you're trying to read input from argv and argc. The angle bracket shell operator does not work that way. Instead, stdin (what cin and several C functions read from) has the contents of that file. So, to read from the file in the case above, you'd use cin.
If you instead really wanted to have a file automatically inserted into the argument list, I can't help you with the windows shell. However, if you have the option of using bash, the following will work:
program.exe `cat file.txt`
The backtick operator expands into the output of the command contained within, so the contents are then passed as arguments to program.exe (again, under the bash shell and not the Windows shell).
This code does what I was expecting the other one to do. Thanks to everybody who helped.
#include <iostream>
#include <string>
using namespace std;
int main()
{
string line;
while (getline(cin, line))
cout << "line: " << line << '\n';
}

Problems using getline()

I'm running out of hair to pull out, so I thought maybe someone here could help me with this frustration.
I'm trying to read a file line by line, which seems simple enough, using getline(). Problem is, my code seems to keep ignoring the \n, and putting the entire file into one string, which is problematic to say the least.
void MakeRandomLayout(int rows, int cols)
{
string fiveByFive = "cubes25.txt";
string fourByFour = "cubes16.txt";
ifstream infile;
while (true) {
infile.open(fourByFour.c_str());
if (infile.fail()) {
infile.clear();
cout << "No such file found";
} else {
break;
}
}
Vector<string> cubes;
string cube;
while (std::getline(infile, cube)) {
cubes.add(cube);
}
}
Edits: Running OSX 10.7.
The infinite loop for the file is unfinished, will eventually ask for a file.
No luck with extended getline() version, tried that earlier.
Same system for dev and build/run.
The text file i'm reading in looks as follows:
AAEEGN
ABBJOO
ACHOPS
AFFKPS
AOOTTW
CIMOTU
DEILRX
DELRVY
DISTTY
EEGHNW
EEINSU
EHRTVW
EIOSST
ELRTTY
HIMNQU
HLNNRZ
Each string is on a new line in the file. The second one that I'm not reading in is the same but 25 lines instead of 16
Mac software recognizes either '\r' or '\n' as line endings, for backward compatibility with Mac OS Classic. Make sure that your text editor hasn't put '\r' line endings in your file when your processing code is expecting '\n' (and verify that the '\n' characters you think are in the middle of the string aren't in fact '\r' instead).
I suspect that you are failing to display the contents of Vector correctly. When you dump the Vector, do you print a \n after each entry? You should, because getline discards the newlines on input.
FYI: the typical pattern for reading line-by-line is this:
Vector<string> cubes;
string cube;
while(std::getline(infile, cube)) {
cubes.add(cube);
}
Note that this will discard the newlines, but will put one line per entry in Vector.
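EDIT: For whatever it is worth, if you were using a std::vector, you could slurp the file in as follows. Note that istream_iterator<std::string> splits on any whitespace rather than on newlines, so it gives one entry per line here only because no line contains a space: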
std::ifstream ifile(av[1]);
std::vector<std::string> v(
(std::istream_iterator<std::string>(ifile)),
std::istream_iterator<std::string>());

What's the correct way to read a text file in C++?

I need to write a C++ program that reads and writes text files line by line in a specific format. The problem is that on my PC I work in Windows, while at college they have Linux, and I am having problems because line endings differ between these operating systems.
I am new to C++ and don't know how to make my program read the files regardless of whether they were written in Linux or Windows. Can anybody give me some hints? Thanks!
The input is like this:
James White 34 45.5 10 black
Miguel Chavez 29 48.7 9 red
David McGuire 31 45.8 10 blue
Each line being a record of a struct of 6 variables.
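Using the std::getline overload without the last (i.e. delimiter) parameter takes care of the end-of-line conversion when the file uses the platform's own convention and is opened in text mode (a Windows file read on Linux will still leave a trailing '\r' on each line):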
std::ifstream in("TheFile.txt");
std::string line;
while (std::getline(in, line)) {
// Do something with 'line'.
}
Here's a simple way to strip string of an extra "\r":
std::ifstream in("TheFile.txt");
std::string line;
std::getline(in, line);
if (!line.empty() && line[line.size() - 1] == '\r')
line.resize(line.size() - 1);
If you can already read the files, just check each line for leftover newline characters such as "\n" and "\r".
You can read this page: http://en.wikipedia.org/wiki/Newline
and here is a list of all the ascii codes including the newline characters:
http://www.asciitable.com/
Edit: Linux uses "\n", Windows uses "\r\n", classic Mac OS (before OS X) uses "\r". Thanks to Seth Carnegie
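Given those three conventions, one option is to normalize text to "\n" before processing it any further. The following is a minimal sketch; normalizeNewlines is a hypothetical helper name and it assumes the whole text fits in memory.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of text in which every "\r\n" pair
// and every lone "\r" has been replaced by a single "\n".
std::string normalizeNewlines(const std::string& text) {
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            out += '\n';
            if (i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;  // skip the '\n' of a "\r\n" pair
            }
        } else {
            out += text[i];
        }
    }
    return out;
}
```

The normalized string can then be wrapped in a std::istringstream and read with std::getline as usual.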
Since the result will be CR LF, I would add something like the following to consume the extras if they exist. So once you have read your record, call this before trying to read the next one.
std::cin.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
If you know the number of values you are going to read for each record you could simply use the ">>" method. For example:
fstream f("input.txt", std::ios::in);
string tempStr;
double tempVal;
// numRecords is assumed to hold the known number of records
for (int r = 0; r < numRecords; ++r) {
// read the first name
f >> tempStr;
// read the last name
f >> tempStr;
// read the number
f >> tempVal;
// and so on.
}
Shouldn't that suffice ?
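The two approaches can also be combined: read each record with std::getline, then split the line with formatted extraction. In the sketch below the struct and its field names are guesses, since the question only says that each record has 6 variables. A stray '\r' at the end of a Windows line counts as whitespace for operator>>, so it is skipped rather than stored.

```cpp
#include <sstream>
#include <string>

// Field names are assumptions; the question does not name the 6 variables.
struct Person {
    std::string firstName;
    std::string lastName;
    int age = 0;
    double value = 0.0;
    int count = 0;
    std::string color;
};

// Parses one line such as "James White 34 45.5 10 black".
// Returns false if any of the six fields is missing or malformed;
// in that case 'out' is left unchanged.
bool parsePerson(const std::string& line, Person& out) {
    std::istringstream fields(line);
    Person p;
    if (fields >> p.firstName >> p.lastName >> p.age
               >> p.value >> p.count >> p.color) {
        out = p;
        return true;
    }
    return false;
}
```

Parsing line by line has the advantage that one malformed record does not shift every field that follows it.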
Hi, I will give you the answer in stages. Please go through them in order to understand the code.
Stage 1: Design our program:
Our program based on the requirements should...:
...include a definition of a data type that would hold the data, i.e. our structure of 6 variables.
...provide user interaction, i.e. the user should be able to provide the program with the file name and its location.
...be able to open the chosen file.
...be able to read the file data and write/save them into our structure.
...be able to close the file after the data is read.
...be able to print out the saved data.
Usually you should split your code into functions representing the above.
Stage 2: Create an array of the chosen structure to hold the data
...
#define MAX 10
...
strPersonData sTextData[MAX];
...
Stage 3: Enable user to give in both the file location and its name:
.......
string sFileName;
cout << "Enter a file name: ";
getline(cin,sFileName);
ifstream inFile(sFileName.c_str(),ios::in);
.....
->Note 1 for stage 3. The accepted format provided then by the user should be:
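c:\SomeFolder\someTextFile.txt
The user types single \ backslashes. Doubled \\ backslashes are only needed inside string literals in source code, where a backslash starts an escape sequence; text read at run time with getline is taken literally.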
->Note 2 for stage 3. We use ifstream i.e. input file stream because we want to read data from file. This
is expecting (before C++11) the file name as a C-style string instead of a C++ string. For this reason we use:
..sFileName.c_str()..
Stage 4: Read all data of the chosen file:
...
while (inFile >> /* the six fields of one record */) { //we loop while a record can still be read; testing eof() before reading would process one bogus record at the end
...
}
...
So finally the code is as follows:
#include <iostream>
#include <fstream>
#include <string>
#define MAX 10
using namespace std;
int main()
{
string sFileName;
struct strPersonData {
char c1stName[25];
char c2ndName[30];
int iAge;
double dSomeData1; //i had no idea what the next 2 numbers represent in your code :D
int iSomeDate2;
char cColor[20]; //i dont remember the lenghts of the different colors.. :D
};
strPersonData sTextData[MAX];
cout << "Enter a file name: ";
getline(cin,sFileName);
ifstream inFile(sFileName.c_str(),ios::in);
int i=0;
//read one record per iteration; stop when the array is full or a read fails
while (i < MAX && inFile >>sTextData[i].c1stName>>sTextData[i].c2ndName>>sTextData[i].iAge
>>sTextData[i].dSomeData1>>sTextData[i].iSomeDate2>>sTextData[i].cColor) {
++i;
}
inFile.close();
cout << "Reading the file finished. See it yourself: \n"<< endl;
for (int j=0;j<i;j++) {
cout<<sTextData[j].c1stName<<"\t"<<sTextData[j].c2ndName
<<"\t"<<sTextData[j].iAge<<"\t"<<sTextData[j].dSomeData1
<<"\t"<<sTextData[j].iSomeDate2<<"\t"<<sTextData[j].cColor<<endl;
}
return 0;
}
I am going to give you some exercises now :D :D
1) In the last loop:
for (int j=0;j<i;j++) {
cout<<sTextData[j].c1stName<<"\t"<<sTextData[j].c2ndName
<<"\t"<<sTextData[j].iAge<<"\t"<<sTextData[j].dSomeData1
<<"\t"<<sTextData[j].iSomeDate2<<"\t"<<sTextData[j].cColor<<endl;}
Why do I use variable i instead of, let's say, MAX?
2) Could you change the program, based on stage 1, into something like:
int main(){
function1()
function2()
...
functionX()
...return 0;
}
I hope i helped...

Reading from a file, only reads text until it gets to empty space

I managed to successfully read the text in a file but it only reads until it hits an empty space, for example the text: "Hi, this is a test", cout's as: "Hi,".
Removing the "," made no difference.
I think I need to add something similar to "inFil.ignore(1000,'\n');" to the following bit of code:
inFil>>text;
inFil.ignore(1000,'\n');
cout<<"The file contains the following: "<<text<<endl;
I would prefer not to change to getline(inFil, variabel); because that would force me to redo a program that is essentially working.
Thank you for any help, this seems like a very small and easily fixed problem but I cant seem to find a solution.
std::ifstream file("file.txt");
if(!file) throw std::runtime_error("Could not open file.txt for reading!"); //needs <stdexcept>
std::string line;
//read until the next \n is found, essentially reading line by line until the file ends
while(std::getline(file, line))
{
//do something line by line
std::cout << "Line : " << line << "\n";
}
This will help you read the file. I don't know what you are trying to achieve since your code is not complete but the above code is commonly used to read files in c++.
You've been using formatted extraction to extract a single string, once: this means a single word.
If you want a string containing the entire file contents:
std::ifstream filestream("/path/to/file");
std::string all_of_the_file(
(std::istreambuf_iterator<char>(filestream)),
std::istreambuf_iterator<char>()
);
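To make the difference between formatted extraction and line-based reading concrete, here is a small sketch that applies both to the same text. firstWord and firstLine are hypothetical helper names used only for this illustration.

```cpp
#include <sstream>
#include <string>

// Returns the first whitespace-delimited token, as `in >> word` does.
std::string firstWord(const std::string& text) {
    std::istringstream in(text);
    std::string word;
    in >> word;
    return word;
}

// Returns everything up to the first '\n', as std::getline does.
std::string firstLine(const std::string& text) {
    std::istringstream in(text);
    std::string line;
    std::getline(in, line);
    return line;
}
```

For the text "Hi, this is a test", firstWord gives "Hi," while firstLine gives the whole sentence, which matches the behaviour described in the question.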

getline() returns empty line in Eclipse but working properly in Dev C++

Here is my code:
#include <iostream>
#include <stdlib.h>
#include <fstream>
using namespace std;
int main() {
string line;
ifstream inputFile;
inputFile.open("input.txt");
do {
getline(inputFile, line);
cout << line << endl;
} while (line != "0");
return 0;
}
input.txt content:
5 9 2 9 3
8 2 8 2 1
0
In Eclipse, it goes into an infinite loop. I'm using MinGW 5.1.6 + Eclipse CDT.
I tried many things but I couldn't find the problem.
Since you are on windows try:
} while (line != "0\r");
The last line is stored as "0\r\n". The \n is used as the line delimiter by getline so the actual line read will be "0\r"
or
you can convert the dos format file to UNIX format using command
dos2unix input.txt
Now your original program should work. The command will change the \r\n at the end of the line to \n
Also you should always do error checking after you try to open a file, something like:
inputFile.open("input.txt");
if(! inputFile.is_open()) {
cerr<< "Error opening file";
exit(1);
}
It will create an infinite loop if no line contains exactly 0. For example 0\n is not the same thing as 0. My guess is that that is your problem.
EDIT: To elaborate, getline should be discarding the newline. Perhaps the newline encoding of your file is wrong (i.e. Windows vs. Unix).
Your main problem is the working directory.
Because you are specifying a file using a relative path it searches for the file from the current working directory. The working directory can be specified by your dev environment. (Note: The working directory is not necessarily the same directory where the executable lives (this is a common assumption among beginners but only holds in very special circumstances)).
Though you have a special end-of-input marker "0", you should also check that getline() is not failing, as it could error out for other reasons (including badly formatted input). As such, it is usually best to check the condition of the file as you read it.
int main()
{
string line;
ifstream inputFile;
inputFile.open("input.txt");
while((getline(inputFile, line)) && (line != "0"))
{
// loop only entered if getline() worked and line !="0"
// In the original an infinite loop is entered when bad input results in EOF being hit.
cout << line << endl;
}
if (inputFile)
{
cout << line << endl; // If you really really really want to print the "0"
// Personally I think doing anything with the termination
// sequence is a mistake but added here to satisfy comments.
}
return 0;
}