Possible compiler bug while reading file and outputting contents - c++

While trying to help a friend with a problem with his code, I encountered a very weird bug when compiling the following code with GCC.
#include <fstream>
#include <iostream>
#include <string>
int main() {
    std::ifstream classes("classes.txt");
    std::string line;
    std::string txt = ".txt";
    while (std::getline(classes, line)) {
        std::cout << "[-]: " << line << "," << txt << std::endl;
    }
    return 0;
}
classes.txt contains the following:
CSC1
CSC2
CSC46
CSC151
MTH121
When compiled with Clang or MSVC, the output is as follows:
[-]: CSC1,.txt
[-]: CSC2,.txt
[-]: CSC46,.txt
[-]: CSC151,.txt
[-]: MTH121,.txt
But, when compiled with GCC, this is what the code outputs:
,.txtCSC1
,.txtCSC2
,.txtCSC46
,.txtCSC151
[-]: MTH121,.txt
I cannot make sense of what's happening here. Can anyone explain this?
[Image: GCC version and output]

No, this is not a compiler bug. You are running into line-ending differences between operating systems. My magic ball tells me that if you run dos2unix classes.txt, the problem will go away. Similarly, cat -v classes.txt should output something similar to the following:
CSC1^M
CSC2^M
CSC46^M
CSC151^M
MTH121^M
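Here, ^M denotes the carriage return character, \r. Together with the line feed \n that follows it, it forms the Windows line ending \r\n, known as CRLF or "carriage return, line feed". std::getline removes only the \n, so each line read from the file still ends in \r. On Linux, when the terminal encounters the carriage return, it moves the cursor back to the beginning of the line, so ,.txt overwrites the [-]: that was printed before it. The last line prints correctly, presumably because the file does not end with a trailing line ending.
N.B. If you are running Clang on an Apple system, which I'm guessing you are: classic Mac OS (before OS X) used a lone \r as its line ending, while modern macOS uses \n.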
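If converting the file is not an option, the trailing carriage return can also be dropped in code. The following is only a minimal sketch, assuming the input uses CRLF line endings. getline_strip_cr and render are hypothetical helper names that do not appear in the original program, and render works on an in-memory string instead of classes.txt so that it is easy to try out:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: reads one line and removes a single trailing '\r'
// that a CRLF line ending leaves behind after std::getline strips the '\n'.
bool getline_strip_cr(std::istream& in, std::string& line) {
    if (!std::getline(in, line)) {
        return false;  // end of input or read error
    }
    if (!line.empty() && line.back() == '\r') {
        line.pop_back();
    }
    return true;
}

// Hypothetical helper: formats every line the way the question's loop does,
// but returns the result as a string instead of printing it.
std::string render(const std::string& file_contents, const std::string& suffix) {
    std::istringstream in(file_contents);
    std::ostringstream out;
    std::string line;
    while (getline_strip_cr(in, line)) {
        out << "[-]: " << line << "," << suffix << '\n';
    }
    return out.str();
}
```

With this, render("CSC1\r\nMTH121", ".txt") yields the same text whether the input uses \r\n or \n, because the \r never reaches the terminal.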

Related

Why does g++ compiled file end with a % sign?

#include <iostream>
int main()
{
std::cout << "Hello";
}
I compiled it using g++ Hello.cpp
I received the following output when I ran the compiled a.out file using ./a.out
Why do I keep getting a % sign at the end of the output?
./a.out
Hello%
The % you see there might actually be your shell prompt, and not part of your program output. You're not writing a new line after your output, so the shell prompt appears at the very end of the output of the last command.
Possible solutions:
Write a newline at the end of the output, e.g. std::cout << "Hello\n"; (or append << '\n').
Add << std::endl to the end of your output, which writes a newline and also flushes the stream.

ofstream not generating new lines in Windows subsystem for Linux

I'm trying to write a .dat file using Windows subsystem for Linux, but the fstream library seems to bypass every endline command.
Here is my code:
#include <fstream>
#include <string>
using namespace std;
int main()
{
    string fname = "DataSheet.dat";
    ofstream fdata(fname.c_str(), ios::out);
    fdata << "First line" << endl;
    fdata << "Second line" << endl;
    fdata.close();
    return 0;
}
I tried substituting << endl with << "\n" and modifying the ofstream command as shown there, but nothing worked; the output was always First lineSecond line instead of First line and Second line on separate lines.
Besides, the code works perfectly well when I print the output to the screen using cout, or when I compile and run it on Cygwin.
Is it a problem of Windows subsystem for Linux or am I missing something important?
From a comment:
Try substituting << endl with << "\r\n".
This is due to the difference in line endings between Linux and Windows.
On Windows, a line ends with a carriage return followed by the newline character.
On Linux, there is no need for the carriage return.
The problem comes from the fact that you are compiling for Linux, so std::endl writes the Linux line ending (\n), but you are viewing the output in a Windows program that expects \r\n.
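As a rough sketch of that suggestion, the lines can be written with an explicit \r\n terminator. join_crlf and write_raw are hypothetical helper names, not something from the question. The file is opened with std::ios::binary on the assumption that the same code might later be built with a Windows-native runtime, whose text mode would otherwise turn "\r\n" into "\r\r\n"; on Linux or WSL the flag makes no difference:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: joins the lines, ending each one with an explicit
// Windows-style "\r\n" terminator.
std::string join_crlf(const std::vector<std::string>& lines) {
    std::string out;
    for (const std::string& line : lines) {
        out += line;
        out += "\r\n";
    }
    return out;
}

// Hypothetical helper: writes the bytes exactly as given. Binary mode
// disables any newline translation by the runtime library.
bool write_raw(const std::string& path, const std::string& bytes) {
    std::ofstream file(path, std::ios::out | std::ios::binary);
    if (!file) {
        return false;  // the file could not be opened
    }
    file.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    file.flush();
    return static_cast<bool>(file);
}
```

Calling write_raw("DataSheet.dat", join_crlf({"First line", "Second line"})) would then produce a file that Windows editors expecting \r\n display on two lines.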

is_open() function in C++ always return 0 value and getLine(myFile, line) does not return anything

I'm trying to read a file in C++ using fstream.
But is_open() always returns 0 and getline() does not read anything. Please see the code snippet below.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main() {
    string line;
    ifstream myfile("D:\xx\xx\xx\xx\testdata\task1.in.1");
    if (myfile.is_open()) {
        while (getline(myfile, line)) {
            cout << line << '\n';
        }
        myfile.close();
    }
    else
        cout << "Unable to open file";
    return 0;
}
You think you're opening D:\<somepath>\testdata\task1.in.1
but in fact you're trying to open D:\<somepath><tab char>estdata<tab char>ask1.in.1, since \t inside a string literal is interpreted as a tab character
(just as \n is a newline in printf("hello world\n");).
(\x is special too, by the way, so that can't be the real path, or you would have had a different error: error: \x used with no following hex digits, which might have been a clearer hint!)
You have to escape the backslashes like this:
D:\\xx\\xx\\xx\\xx\\testdata\\task1.in.1
Windows also accepts forward slashes, which are more convenient unless you want to generate batch scripts with cd commands or the like that require backslashes (/ is used as the option switch in batch commands):
D:/xx/xx/xx/xx/testdata/task1.in.1
As NathanOliver stated, you can use a raw string literal if your compiler has C++11 mode enabled (for example with -std=c++11):
R"(D:\xx\xx\xx\xx\testdata\task1.in.1)"
Last word, a dirty way of getting away with it:
D:\Xx\Xx\Xx\Xx\Testdata\Task1.in.1
Using uppercase in that case can appear to work, because Windows paths are case-insensitive and \T and \X are not recognized escape sequences, so no tab is inserted.
But that's mere luck: how an unrecognized escape is handled is up to the compiler, which will typically warn about it and may drop the backslash. A lot of people do that without realizing they're very close to a bug.
BTW, a lot of people capitalize Windows paths (as seen a lot on this site) because they noticed that their paths wouldn't work in lowercase, without knowing why.
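The spellings discussed above can be compared side by side. This is only an illustrative sketch: the path is a shortened version of the question's placeholder path, and the variable names and to_forward_slashes are hypothetical:

```cpp
#include <algorithm>
#include <string>

// Escaped backslashes and a raw string literal produce the same characters.
const std::string escaped = "D:\\xx\\testdata\\task1.in.1";
const std::string raw = R"(D:\xx\testdata\task1.in.1)";

// Forward slashes need no escaping at all.
const std::string forward = "D:/xx/testdata/task1.in.1";

// Here the backslashes before 't' are NOT escaped, so the compiler turns
// each "\t" into a tab character and the path silently changes.
const std::string unescaped_t = "D:\\xx\testdata\task1.in.1";

// Hypothetical helper: converts backslashes to forward slashes.
std::string to_forward_slashes(std::string path) {
    std::replace(path.begin(), path.end(), '\\', '/');
    return path;
}
```

Here escaped and raw compare equal, while unescaped_t contains two tab characters where the directory separators were meant to be.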

std::getline partially reads first and last line and sets eof-bit

I need to read csv-files with C++: the first line of the file contains all column titles, the remaining lines contain floating point data (examples below, files have been shrunk down).
A few files have issues. I'm using the following code:
#include <iostream>
#include <fstream>
#include <string>
// Compiled and tested with Clang++ on Ubuntu 14.04
int main(int argc, char** argv) {
    std::ifstream in;
    in.open(argv[1]);
    if(!in.is_open()) {
        std::cerr << "Cannot open file: " << argv[1] << "\n";
        return 1;
    }
    std::string buff;
    std::getline(in, buff);
    while(!in.eof()) {
        std::cout << buff << "\n";
        getline(in, buff);
    }
    in.close();
    return 0;
}
For most files this runs okay, reading one line each iteration; example of a 'good' file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RAU14,AU05,AU17,AU26,Forward,Backward
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,20.0
0.3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,16.667,0.0
58.3,50.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
62.4,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,100.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,20.0
Some files go crazy and set the eof-bit after the first getline. After this first read, buff contains part of the first line and part of the last line; example of a 'bad' file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Occlusion,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RAU14,AU05,Au17,AU57,AU58
0,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0.3,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
1.3,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
57.9,66.667,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
60.3,33.333,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
And the contents of buff after one call to getline:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Occlusion,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RA60.3,33.333,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
As you can see, the first line gets mixed with the last line. I can't figure out what's going wrong. Each line ends with a \n, the file ends with an empty \n.
I suppose my question is: why does getline skip to end-of-file while mixing the first and last line for some of the files while others work fine?
Edit: I need to convert a big dataset to a new, more consistent format. The current format is full of inconsistencies (using 0 and 0.0 or AU17 and Au17). Still, these formatting problems should not affect simply reading the file, right?
Edit2:
cat -v -e -t on a good file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,AU05,AU17,AU26,Forward,Backward^M$
0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,66.667,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0^M$
0.2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,33.333,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0^M$
etc...
cat -v -e -t on a bad file:
Time,Smile,AU04,AU02,AU15,Trackerfail,AU18,AU09,negAU12,AU10,Occlusion,Expressive,Unilateral_LAU12,Unilateral_RAU12,AU14,Unilateral_LAU14,Unilateral_RAU14,AU05,Au17,AU57,AU58^M0,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M0.3,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M1.3,0,0,0,0,16.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M1.4,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M1.8,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0,0,0,0,25,0^M2.8,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M3,0,0,0,0,33.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M31,0,0,0,0,33.333,0,0,0,0,25,0,0,0,0,0,0,0,0,0,0^M31.1,0,0,0,0,50,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0^M31.2,0,0,0,0,66.667,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0^M31.4,0,0,33.333,0,66.667,0,0,0,0,50,0,0,0,0,0,0,0,0,0,0^M31.5,0,0,33.333,0,66.667,0,0,0,0,50,25,0,0,0,0,0,0,0,0,0^M32,0,0,33.333,0,66.667,0,0,0,0,50,25,0,0,0,0,0,0,0,0,25^M32.1,0,0,33.333,0,83.333,0,0,0,0,50,25,0,0,0,0,0,0,0,0,25^M32.2,0,0,33.333,0,83.333,0,0,0,0,25,25,0,0,0,0,0,0,0,0,25^M32.4,0,0,33.333,0,83.333,0,0,0,0,25,0,0,0,0,0,0,0,0,0,25^M32.7,0,0,33.333,0,83.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,25^M33,0,0,33.333,0,83.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M33.5,0,0,0,0,83.333,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M33.9,0,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M55,33.333,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M55.2,66.667,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M55.8,100,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M56.8,100,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,25^M57.4,66.667,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,25^M57.8,66.667,0,0,0,66.667,0,0,0,0,0,25,0,0,0,0,0,0,0,0,0^M57.9,66.667,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0^M60.3,33.333,0,0,0,66.667,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
Seems like a big difference, how can I solve this?
It seems that the files are missing the newline character and have only carriage-return characters instead (shown as ^M, i.e. Ctrl-M). getline therefore finds no \n and reads the entire file in one call, which is why the eof-bit is set after the first read; the mixed-looking content is most likely your terminal acting on the embedded \r characters when buff is printed.
You can fix it by running cat on the file and piping the output to tr to translate each carriage return to a newline:
$ cat your-file | tr '\r' '\n' > your-file-fixed
After seeing your comment about the files coming from Mac OS, I assume it's one of the old pre-OS X versions, where the line ending was just a single carriage return.
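If the files cannot be converted up front, the same normalization can be done in the program. The following is a sketch rather than a drop-in fix: split_lines_any_eol is a hypothetical helper that takes the file contents as one string (for example, read in binary mode) and treats \r\n, a lone \n, and a lone \r each as a single line terminator:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: splits text into lines, accepting "\r\n", "\n" and
// a lone "\r" as line terminators.
std::vector<std::string> split_lines_any_eol(const std::string& text) {
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char c = text[i];
        if (c == '\r' || c == '\n') {
            // Treat "\r\n" as one terminator, not two.
            if (c == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
                ++i;
            }
            lines.push_back(current);
            current.clear();
        } else {
            current += c;
        }
    }
    if (!current.empty()) {
        lines.push_back(current);  // final line without a terminator
    }
    return lines;
}
```

With this approach the 'good' (\r\n) and 'bad' (\r only) files from the question would both split into one entry per row.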

ofstream does not print out newline to txt in Windows7

I have an issue when I want to print \n; I'm using endl for that. When I run the code on Windows 7, it doesn't write the newline, but it does on Ubuntu. Both systems use the same compiler, GNU g++.
So I wonder if there is a different way to write a newline to a file on Windows?
void translate(ofstream &out, const string &line, map<string, string> m)
{
    stringstream ss(line);
    string word;
    while(ss >> word)
    {
        if(m[word].size() == 0)
            out << "A";
        else
            out << m[word] << " ";
    }
    out << "\n";
}
Outputting '\n' and using endl result in exactly the same content (the only difference is that endl also flushes). When that \n character is written and the file is in "text mode", the runtime library converts it to the platform's native way of marking the end of a line. On Unix, no conversion is needed because that marker is a single \n byte. On Windows, that \n becomes \r\n (carriage return, line feed). I suspect you know all of this, but I'm reviewing it just in case.
In short, as long as your runtime library is set up for Windows, the code you have will work as you expect. I suspect you are using Cygwin's g++, or some other g++ port, that is not set up for Windows-style lines, even in text mode. Some editors will not correctly interpret that untranslated \n.
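One way to check which case applies is to read the output file back in binary mode and count the line endings it actually contains. A small sketch under that assumption follows; EolCounts, count_eols and read_file_bytes are hypothetical names introduced only for illustration:

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical result type: how many line endings of each kind were found.
struct EolCounts {
    std::size_t crlf = 0;  // "\r\n" pairs (Windows style)
    std::size_t lf = 0;    // lone "\n" (Unix style)
    std::size_t cr = 0;    // lone "\r" (classic Mac OS style)
};

// Counts the line endings in a buffer of raw bytes.
EolCounts count_eols(const std::string& bytes) {
    EolCounts counts;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (bytes[i] == '\r') {
            if (i + 1 < bytes.size() && bytes[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the '\n' that belongs to this pair
            } else {
                ++counts.cr;
            }
        } else if (bytes[i] == '\n') {
            ++counts.lf;
        }
    }
    return counts;
}

// Reads a whole file without newline translation. Returns an empty string
// if the file cannot be opened.
std::string read_file_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

If count_eols(read_file_bytes(path)) reports only lf for a file written on Windows, the runtime did not translate \n to \r\n, which would match the suspicion above.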