Clear CSV-file from non-specified symbols using C++ [duplicate] - c++

This question already has answers here:
Why can Windows not read beyond the 0x1A (EOF) character but Unix can? [duplicate]
(2 answers)
Closed 3 years ago.
I'm trying to convert a CSV file to a TXT file using simple C++ code like this:
std::ofstream txtFile(strFileName, std::ofstream::out | std::ofstream::app);
std::string strLine;
std::ifstream csvFile(strCSVDir);
while (std::getline(csvFile, strLine))
{
    std::string subString;
    std::stringstream s(strLine);
    while (std::getline(s, subString, ';'))
    {
        txtFile << subString << "\t";
    }
    txtFile << "\n";
}
txtFile.close();
csvFile.close();
It works fine, but only if the CSV file doesn't contain any non-specified symbols, like the arrow in this picture:
In this case my code reads the CSV file only up to the arrow symbol. How can I get around this?
Update: if I look at this CSV file in a byte representation (for example in the Far hex view), then I see that the code of the arrow symbol is "1A". The table of Unicode characters says that it is the Substitute character. I don't know how it got into this CSV file.

It might be easier to just read the entire file, replace the separators, and then save the result.
Going from your snippet:
std::stringstream sstr;
sstr << csvFile.rdbuf();
std::string buffer = sstr.str();
boost::replace_all(buffer, ";", "\t"); // the question writes a tab in place of each ';'
txtFile << buffer;
Update: if you don't have Boost, it is easy to replace with something else, such as a for loop (since it is just a single-character replacement).
Update 2: The reason the read may stop before the end of the file is that the file is opened in text mode. On Windows, text mode treats the 0x1A (Ctrl-Z) byte as an end-of-file marker, so reading stops there; opening the stream with std::ios::binary avoids this. See https://en.cppreference.com/w/cpp/io/c#Binary_and_text_modes for an explanation.
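As a rough illustration of the binary-mode approach, here is a minimal sketch that needs no Boost. The names convertCsvText and convertCsvFile are made up for this example, and dropping the 0x1A byte entirely is an assumption about what "clearing" the file should mean. Unlike the snippet in the question, it overwrites the output file instead of appending to it.

```cpp
#include <algorithm>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>

// Replace every ';' with a tab and drop 0x1A (SUB) bytes.
// Dropping SUB is an assumption; keep it if the byte is meaningful data.
std::string convertCsvText(std::string text)
{
    text.erase(std::remove(text.begin(), text.end(), '\x1A'), text.end());
    std::replace(text.begin(), text.end(), ';', '\t');
    return text;
}

// Hypothetical helper: reads csvPath in binary mode, so 0x1A is not
// treated as end-of-file, and writes the converted text to txtPath.
bool convertCsvFile(const std::string& csvPath, const std::string& txtPath)
{
    std::ifstream in(csvPath, std::ios::binary);
    if (!in) return false;

    // Read the whole file into memory in one go.
    std::string text((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());

    // Binary output keeps the line endings exactly as they were read.
    std::ofstream out(txtPath, std::ios::binary);
    if (!out) return false;
    out << convertCsvText(std::move(text));
    return static_cast<bool>(out);
}
```

Calling convertCsvFile(strCSVDir, strFileName) would then stand in for the loop from the question.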

Related

C++ how to read a line with delimiter until the end of each line? [duplicate]

This question already has answers here:
How can I read and parse CSV files in C++?
(39 answers)
Closed 6 years ago.
Hi I need to read a file that looks like this...
1|Toy Story (1995)|Animation|Children's|Comedy
2|Jumanji (1995)|Adventure|Children's|Fantasy
3|Grumpier Old Men (1995)|Comedy|Romance
4|Waiting to Exhale (1995)|Comedy|Drama
5|Father of the Bride Part II (1995)|Comedy
6|Heat (1995)|Action|Crime|Thriller
7|Sabrina (1995)|Comedy|Romance
8|Tom and Huck (1995)|Adventure|Children's
9|Sudden Death (1995)|Action
As you can see, each movie can have anywhere from one type to many. How can I read them until the end of each line?
I'm currently doing:
void readingenre(string filename,int **g)
{
    ifstream myfile(filename);
    cout << "reading file "+filename << endl;
    if(myfile.is_open())
    {
        string item;
        string name;
        string type;
        while(!myfile.eof())
        {
            getline(myfile,item,'|');
            //cout <<item<< "\t";
            getline(myfile,name,'|');
            while(getline(myfile,type,'|'))
            {
                cout<<type<<endl;
            }
            getline(myfile,type,'\n');
        }
        myfile.close();
        cout << "reading genre file finished" <<endl;
    }
}
the result is not what I want...It looks like:
Animation
Children's
Comedy
2
Jumanji (1995)
Adventure
Children's
Fantasy
3
Grumpier Old Men (1995)
Comedy
Romance
So it doesn't stop at the end of each line...How could I fix this?
Attempting to parse this input file one field at a time is the wrong approach.
This is a text file. A text file consists of lines terminated by newline characters. std::getline(), with its default delimiter, is what you use to read a text file one newline-terminated line at a time:
while (std::getline(myfile, line))
And not:
while(!myfile.eof())
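which is always a bug, because the end-of-file flag is set only after a read has failed, so the loop body runs one more time after the last line.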
So now you have a loop that reads each line of text. A std::istringstream can be constructed inside the loop, containing the line just read:
std::istringstream iline(line);
and then you can call std::getline() on this std::istringstream, with the optional delimiter character overridden to '|', to read each field in the line.
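Put together, the two-level approach might look like the sketch below. The Movie struct and the function names parseMovieLine and readMovies are invented for illustration; the sketch assumes every line has the id|title|genre... layout shown in the question.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type for one line such as
// "1|Toy Story (1995)|Animation|Children's|Comedy".
struct Movie
{
    std::string id;
    std::string title;
    std::vector<std::string> genres;
};

// Splits a single line on '|': first field is the id, second the title,
// and every remaining field is a genre.
Movie parseMovieLine(const std::string& line)
{
    Movie movie;
    std::istringstream iline(line);
    std::getline(iline, movie.id, '|');
    std::getline(iline, movie.title, '|');
    std::string genre;
    while (std::getline(iline, genre, '|'))
        movie.genres.push_back(genre);
    return movie;
}

// Outer loop: one std::getline call per newline-terminated line.
std::vector<Movie> readMovies(std::istream& in)
{
    std::vector<Movie> movies;
    std::string line;
    while (std::getline(in, line))
    {
        if (!line.empty())
            movies.push_back(parseMovieLine(line));
    }
    return movies;
}
```

Because the inner loop only ever sees one line, it cannot run past the newline into the next record, which is what went wrong in the question's version.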

c++ reading file with \ inside the text

I need help with a small problem.
I wrote a small program that reads every line of the text inside a .rd file into a string. But the text contains some \ characters, and when I output the strings the program thinks that the \ are escape characters.
What can I do to get the original text?
The program runs without an error.
Here is a small snippet of my code:
string find="something";
string replace="something2";
string line="";
fstream myfile;
myfile.open ("file.rb");
if (myfile.is_open())
{
    while (getline(myfile,line))
    {
        cout << line << '\n';
        if(line == find)
        {
            myfile << replace;
        }
        else
        {
            myfile << line;
        }
    }
    myfile.close();
}
You could try using a Unicode (wide-character) version of getline, or you could try adding ios::binary to your stream constructor flags.
See this article for further info.
However, if you read in a string like "\0" from stdin or a file, it should be treated as two separate characters: '\' and '0'. There is no additional processing that you have to do.
Escaping characters is only used for string/character literals. That is to say, when you want to hard-code something into your source code.
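To see this in action, here is a small sketch (the function names are invented for this example). A std::istringstream stands in for the file, and the raw string literal R"(a\nb)" is only a way to put a literal backslash into the test data; the line that comes back from std::getline still contains four separate characters.

```cpp
#include <sstream>
#include <string>

// Reads the first line from any input stream, exactly as std::getline delivers it.
std::string readFirstLine(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    return line;
}

// Simulates file contents holding the four characters: a, backslash, n, b.
// No escape processing happens while reading, so the backslash survives.
std::string sampleLine()
{
    std::istringstream in(R"(a\nb)");
    return readFirstLine(in);
}
```

If the output still looks wrong, the cause is more likely how the string is displayed or compared afterwards than how it was read.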

How to read from a text file and split sentences apart in C

I want to read a series of questions from a text file. Each question is separated by a comma, so I am thinking that I have to check for each character to not be equal to a comma before copying the character?
The text file looks something like this "Is it red?, Is it bigger than a mailbox?, Is it an animal?"
In case it affects the code, I want to copy each string into a node to put in a tree later on.
while (fgets(stringPtr, 100, filePtr) != ',')
strcpy(stringPtr, treeNode);
Is something like this ok?
Given your description, something like the following:
std::string question_string;
std::set<std::string> my_tree;
if (std::ifstream file_stream{filename})
{
    while (std::getline(file_stream, question_string, ','))
        my_tree.insert(question_string);
}
else
    std::cerr << "unable to open " << filename << '\n';
You'll need to get the filename from somewhere and include the relevant headers (<fstream>, <iostream>, <set> and <string>).
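One detail the snippet above leaves open: in the sample text each comma is followed by a space, so every question after the first would start with a blank. A possible variant, with the hypothetical helpers trim and splitQuestions, strips surrounding whitespace and keeps the questions in file order by using a std::vector instead of a std::set:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Removes leading and trailing spaces, tabs, and line breaks.
std::string trim(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::size_t first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::size_t last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Splits the stream on ',' and returns the non-empty, trimmed pieces in order.
std::vector<std::string> splitQuestions(std::istream& in)
{
    std::vector<std::string> questions;
    std::string piece;
    while (std::getline(in, piece, ','))
    {
        std::string question = trim(piece);
        if (!question.empty())
            questions.push_back(question);
    }
    return questions;
}
```

A std::set sorts and de-duplicates the questions; a std::vector is the safer choice if the order in the file matters for building the tree.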

how to test for white space c++: [duplicate]

This question already has answers here:
Why does reading a record struct fields from std::istream fail, and how can I fix it?
(9 answers)
Closed 8 years ago.
I'm trying to parse a .csv file, and I need to be able to test for a carriage return. Here is a test .csv file called sample.csv:
2
3
As you'll notice, there are two rows and one column in this file. I now write the following C++ code:
ifstream myfile ("sample.csv"); //Import file
char nextchar;
myfile.get(nextchar);
cout<<nextchar<<'\n';
myfile.get(nextchar);
cout<< nextchar<<" If 0, then that was not a carriage return. If 1, it was. :"<<(nextchar=='\n')<<'\n';
myfile.get(nextchar);
cout<<nextchar<<'\n';
I expect the following output:
2
If 0, then that was not a carriage return. If 1, it was. :1
3
however, I get:
2
If 0, then that was not a carriage return. If 1, it was. :0
3
How is this possible? How do I test for a carriage return?
The line may end with a pair of characters, CR + LF. In any case, you could output the code of this character yourself. Why didn't you do this?
You could also apply the standard function std::isspace, declared in the header <cctype>.
I suggest using the standard function std::getline to read a whole line instead of using get.
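If the file does turn out to have CR + LF line endings and is read somewhere that does not translate them, std::getline leaves the '\r' at the end of each line. A minimal sketch of a wrapper that removes it (getlineNoCR is a made-up name, not a standard function):

```cpp
#include <istream>
#include <sstream>
#include <string>

// Reads one line and removes a single trailing '\r', if present.
// Returns false when no line could be read.
bool getlineNoCR(std::istream& in, std::string& line)
{
    if (!std::getline(in, line)) return false;
    if (!line.empty() && line.back() == '\r')
        line.pop_back();
    return true;
}
```

It can be used in place of std::getline in a loop such as while (getlineNoCR(myfile, line)).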
There are a lot of things that can go wrong in the assumptions: OS behaviour, the text editor used to write the sample file, an undesired extra space or tab at the end of the line, and the ios_base::openmode used to open the file, as well as all possible combinations of those...
First insert this line to see what you actually read: is it 0x0d, 0x0a, or something else?
cout << "Char read: 0x" << std::hex << std::setw(2) << std::setfill('0') << (int)(unsigned char)nextchar << "\n"; // needs <iomanip>
cout << "If 0 ... // Existing line
You can also replace your sample with the following. It opens the file in binary mode and displays in hex the chars that are really in the file:
ifstream myfile ("sample.csv", ifstream::binary); //Import file
char nextchar;
while (myfile.get(nextchar)) { // stops as soon as a read fails
    unsigned char uc = static_cast<unsigned char>(nextchar); // isprint needs a non-negative value
    cout << "0x" << std::hex << std::setw(2) << std::setfill('0') << (int)uc // needs <iomanip>
         << " " << (isprint(uc) ? nextchar : '?') << "\n";
}
If the second and third lines of the output are 0x0d and 0x0a, you'll know for sure that your text editor has inserted the extra CR.
Then you can remove ifstream::binary in the code above. Normally you should have, as you pointed out, only 0x0a in the second line. If that's not the case, then you should investigate whether the default openmode was somehow altered.
By the way, I compiled your original code under Windows, prepared the sample file using Notepad, ran the program and got... what you expected! Then I redid the test with the following modification and finally got what you got.
Good luck!

Reading from a file only reads text until it gets to an empty space

I managed to successfully read the text in a file but it only reads until it hits an empty space, for example the text: "Hi, this is a test", cout's as: "Hi,".
Removing the "," made no difference.
I think I need to add something similar to "inFil.ignore(1000,'\n');" to the following bit of code:
inFil>>text;
inFil.ignore(1000,'\n');
cout<<"The file cointains the following: "<<text<<endl;
I would prefer not to change to getline(inFil, variabel); because that would force me to redo a program that is essentially working.
Thank you for any help. This seems like a very small and easily fixed problem, but I can't seem to find a solution.
std::ifstream file("file.txt");
if(!file) throw std::runtime_error("Could not open file.txt for reading!"); // needs <stdexcept>
std::string line;
//read until the next \n is found, essentially reading line by line until the file ends
while(std::getline(file, line))
{
    //do something line by line
    std::cout << "Line : " << line << "\n";
}
This will help you read the file. I don't know what you are trying to achieve since your code is not complete, but the above code is commonly used to read files in C++.
You've been using formatted extraction to extract a single string, once: this means a single word.
If you want a string containing the entire file contents:
std::ifstream fs("/path/to/file"); // input-only stream; the iterators below must use the same name
std::string all_of_the_file(
    (std::istreambuf_iterator<char>(fs)),
    std::istreambuf_iterator<char>()
);
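To make the difference concrete, here is a small sketch that works on any std::istream, so it can be tried with a std::istringstream before pointing it at a real file. The helper names firstWord and readAll are invented for this example.

```cpp
#include <iterator>
#include <sstream>
#include <string>

// Formatted extraction: skips leading whitespace, then stops at the next
// whitespace character, so it yields a single word.
std::string firstWord(std::istream& in)
{
    std::string word;
    in >> word;
    return word;
}

// Unformatted read of everything left in the stream, spaces and newlines included.
std::string readAll(std::istream& in)
{
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With the text from the question, firstWord returns only "Hi," while readAll returns the complete contents.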