Reading a text file in Ubuntu yields an extra \r - C++

I am porting a C++ program from MS Visual Studio to Ubuntu. The program works fine except when it reads from a text file.
My text file consists of lines of information separated by the delimiter ':' :
General Manager:G001:def
Customer:C001:def:Lim:Tom:Mr:99999999:zor#hotmail.com:Blk 145 B North #03-03 Singapore 111111
Read method
while (getline(afile, line, '\n')) // read a line and store it in the string variable line
{
    stringstream ss(line);
    string s;
    while (getline(ss, s, ':')) // split the line on ':'
    {
        word.push_back(s);
    }
    word.clear();
}
On the Windows platform, the last field is stored correctly as def.
However, on the Ubuntu platform it is stored as def\r.
It works fine for the Customer record but gives a problem for the General Manager record.
I know it has something to do with the carriage return, but I am not sure how to resolve it.

If the text file was created on Windows, you can use the dos2unix command to remove the extra \r's from the file. The command is simply dos2unix filenamegoeshere

Related

How can I convert a Linux text file to a Windows text file using Qt?

When I copy text files to a USB flash drive with Qt on a Raspberry Pi 3 and then open these text files on Windows, the '\n' characters do not seem to work on Windows.
I researched this topic and saw that text file formats differ between Linux and Windows. So I have to copy Linux-based text files to the flash drive with Qt and open these files on Windows.
There are a few characters which can indicate a new line. The usual ones are these two:
'\n' or 0x0A (10 in decimal) -> this character is called "Line Feed" (LF).
'\r' or 0x0D (13 in decimal) -> this one is called "Carriage Return" (CR).
Different Operating Systems handle newlines in a different way. Here is a short list of the most common ones:
DOS and Windows:
They expect a newline to be the combination of two characters, namely '\r\n' (or 13 followed by 10).
Unix (and hence Linux as well):
Unix uses a single '\n' to indicate a new line.
Mac:
Macs use a single '\r'.
EDIT: As MSalters mentioned, Mac OS X is Unix and uses \n. The single \r is ancient Mac OS 9.
I guess you are just transporting the file, not doing anything with it, but I can't think of another option than opening it and rewriting the line endings.
If you open the .txt file on Windows, read from it (with C++ or C++/Qt), and then write the lines as you get them to a new file, the line endings should then fit the Windows specifics.
You can read and rewrite the file like this:
std::ifstream file;
file.open(filePath);
std::ofstream file2;
file2.open(filePath2);
std::string line;
while (std::getline(file, line))
{
    file2 << line << "\n"; // getline strips the delimiter, so add it back
}
At least the documentation states that getline searches for '\n', so it should work on both Windows and Unix. If it doesn't, you can still set the delimiter to '\n' explicitly.
If you want to write the file 'Windowslike' on your raspberry, you can try to replace the '\n' characters with '\r\n'
It should look somewhat like this:
std::string myFileAsString;
std::string toReplace = "\n";
std::string replaceWith = "\r\n";
myFileAsString.replace(myFileAsString.find(toReplace), toReplace.length(), replaceWith);
where find locates the first '\n' and replace substitutes "\r\n" for it. Note that this handles only the first newline; repeat it in a loop to convert the whole string.

Unable to use program to access a File

I am using a Macbook.
I created a file using TextEdit and wrote the numbers 3 and 4 in it with a space in between. I saved this file as 'mydata', which produced 'mydata.rtf'. I then changed the file's name to 'mydata.txt'.
I have then created this program to open this file, and then print the values within the file 'mydata.txt'.
However, it is printing the values: a = 446595126, b = 32767.
Could someone please explain why this program doesn't print 3 and 4?
Thank you.
#include <stdio.h>
int main(void)
{
    int a, b;
    FILE *fptr1;
    fptr1 = fopen("mydata.txt", "r");
    if (fptr1 == NULL)
    {
        printf("FILE mydata.txt did not open\n");
    }
    else
    {
        fscanf(fptr1, "%d%d", &a, &b);
        printf("a = %d, b = %d\n", a, b);
        fclose(fptr1); /* only close a stream that actually opened */
    }
    return 0;
}
Your file is RTF (Rich Text Format), no matter what extension it has: you've simply renamed it, not converted it to a different format.
You see those values because the first bytes of your file do not correspond to the numbers you entered in the editor but to the header of the RTF format.
As a solution, open the file with TextEdit again and save it in plain text format instead. You can take a look at this post too.
In general, avoid using editors that supports rich format (TextEdit, Word, Pages) for creating plain text files. Instead, use other ones like BBEdit, TextWrangler (although I think it's discontinued), emacs, Atom, vim, nano, etc.
The reason behind it is that you have included a space between the two numbers, and a space is a character with the ASCII value 32.
So try omitting the space when reading the numbers from the file.
This may work :-)

Reading file made by cmd, results in 3 weird symbols

I'm using this piece of code to read a file into a string, and it's working perfectly with files made manually in Notepad, Notepad++, or other text editors:
std::string utils::readFile(std::string file)
{
    std::ifstream t(file);
    std::string str((std::istreambuf_iterator<char>(t)),
                    std::istreambuf_iterator<char>());
    return str;
}
When I create a file via Notepad (or any other editor) and save it, I get this result in my program: (screenshot omitted)
But when I create a file via CMD (example command below) and run my program, I receive an unexpected result:
cmd /C "hostname">"C:\Users\Admin\Desktop\lel.txt" & exit
Result: (screenshot omitted)
When I open this file generated by CMD (lel.txt), this is the file contents: (screenshot omitted)
If I edit the generated file (lel.txt) with Notepad (adding a space to the end of the file) and try running my program again, I get the same weird 3-character result.
What might cause this? How can I read a file made via CMD correctly?
EDIT
I changed my command (now using PowerShell) and added a function I found, named SkipBOM, and now it works:
powershell -command "hostname | Out-File 'C:\Users\Admin\Desktop\lel.txt' -encoding UTF8"
SkipBOM:
void SkipBOM(std::ifstream &in)
{
    char test[3] = { 0 };
    in.read(test, 3);
    if ((unsigned char)test[0] == 0xEF &&
        (unsigned char)test[1] == 0xBB &&
        (unsigned char)test[2] == 0xBF)
    {
        return; // BOM found: leave the stream positioned just past it
    }
    in.clear();  // reset eof/fail flags in case the file had fewer than 3 bytes
    in.seekg(0); // no BOM: rewind to the beginning
}
This is almost certainly a BOM (Byte Order Mark): see here. It means your file is saved as Unicode with a BOM (the EF BB BF signature is the UTF-8 BOM).
There is a way to use C++ streams to read files with a BOM (you have to use converters) - let me know if you need help with that.
That is how Unicode looks when treated as an ANSI string. In Notepad, use File - Save As to see what the current format of a file is.
CMD uses the OEM code page, which matches ANSI for English characters, so any Unicode output will be converted to OEM by CMD. Perhaps you are grabbing the data yourself.
In VB you would use StrConv to convert it.

C++ ofstream, printing without CRLF

I have C++ code that I am running on Linux under Wine. I think this is actually part of the problem.
Usually, when I do something like this in a native Linux C++ program:
ofstream fout;
fout.open("myfile.txt");
fout << "blah blah" << endl;
fout << "blah blah 2" << endl;
fout.close();
the file is standard ASCII text. However, in the code I am running under Wine, myfile.txt comes out as ASCII text with CRLF line terminators.
This is a problem because if I want to read the file using native Linux C++ code running on the same machine, the CRLF line terminators really mess up a lot of the file handling and parsing.
Is there a way to get the code running under Wine to output files without CRLF line terminators, so that I can read them with native Linux C++ code on the same machine?
You could open the file in ios::binary mode. This doesn't, strictly speaking, mean that it's a binary file [any more than any other file is "text", since all files are binary]. Binary in this context just means "don't muck about with the stuff inside the file by interpreting characters as special, or adding or removing any characters".
Or, when you copy the file to Linux, use dos2unix myfile.txt to convert it from "DOS" (and Windows) format to a Unix-style text file.

Error with reading file in Linux

In my program, I take two file names from the command line arguments using the following code:
ifstream routesFile (argv[1]);
ifstream citiesFile (argv[2]);
I then proceed to read through the file and grab the data. Both files are CSVs:
while (citiesFile.good()) {
    string city;
    string country;
    string xString;
    string yString;
    getline(citiesFile, country, ',');
    getline(citiesFile, city, ',');
    getline(citiesFile, xString, ',');
    getline(citiesFile, yString);
    ...
}
When I do this in Visual Studio using hard-coded file names, it works fine. When I build with g++ on Linux and use the command-line arguments, it can open the files correctly, but after that it has a lot of errors. To test the file reading, I printed out some of the read values, which resulted in
terminate called after throwing an instance of 'std::out_of_range'
what(): map::at
hereELF
Òœc½Å¹jn!ýô (EÕL˜C
The appearance of here is due to its actually being printed by the program. It doesn't arise from the error; I printed it manually to test the code.
It seems unable to read the data correctly. In the file for citiesFile there are always 4 values per line, each separated by a single comma, no spaces, and a newline character separates the lines in the file. As I said above, it works fine in Visual Studio, so I don't think it's a problem with the actual data, just with reading it.
Linux and Windows have different newline conventions: Linux uses '\n', Windows uses '\r\n'. If you just copied the file to Linux, you need to handle this in your program. You can look at Mixing cin and getline under Linux and Windows as a reference.
If you simply moved the Windows files to Linux, check out the tool dos2unix to convert the file and fix the line endings. "EOL" in your output is a sign that something might be wrong with the endings.
http://www.linuxcommand.org/man_pages/dos2unix1.html