General CSV Parser with multiple EOL characters - c++

I'm trying to change this function to also handle CSV files whose lines end with \r. I can't figure out how to make getline() take that into account.
vector<vector<string>> Parse::parseCSV(string file)
{
// input fstream instance
ifstream inFile;
inFile.open(file);
// check for error
if (inFile.fail()) { cerr << "Cannot open file" << endl; exit(1); }
vector<vector<string>> data;
string line;
while (getline(inFile, line))
{
stringstream inputLine(line);
char delimeter = ',';
string word;
vector<string> brokenLine;
while (getline(inputLine, word, delimeter)) {
word.erase(remove(word.begin(), word.end(), ' '), word.end()); // remove all white spaces
brokenLine.push_back(word);
}
data.push_back(brokenLine);
}
inFile.close();
return data;
}

This is a possible duplicate of "Getting std::ifstream to handle LF, CR, and CRLF?". The top answer there is particularly good.
If you know every line ends with a \r, you can pass it as the getline delimiter: getline(input, data, '\r'), where input is a stream, data is a string, and the third parameter is the character to split by. You could also try something like the following after the start of the first while loop:
// after the start of the first while loop
stringstream inputLine;
size_t pos = line.find('\r');
if (pos != string::npos) {
// copy the text before the '\r', a newline, then the text after it
inputLine << line.substr(0, pos);
inputLine << "\n";
inputLine << line.substr(pos + 1);
} else {
inputLine << line;
}
// the rest of your code here

Related

Reading a file, line by line and then read that line word by word in c++

I am trying to read through a text file line by line and then I am trying to read that line word by word. Once I found a particular word, skip the remaining word from that line and jump to next line.
I tried this way:
void SymbolScanning::symScanning()
{
std::string s;
std::ifstream myfile;
myfile.open("SymbolRead.txt");
if(myfile.is_open())
{
while(std::getline(myfile, s))
{
std::istringstream iss(s);
std::string word;
while(iss >> word)
{
std::cout << word << std::endl;
// if desired word found, skip remaining words and jump
// to the next line.
}
}
}
else
std::cout<<"File is not open";
}
Output : File is not open.
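For the skipping part described above, one possible sketch is to break out of the inner loop once the word is found, so the outer loop moves on to the next line. The helper name wordsUntil and its parameters are hypothetical, and the sketch reads from any std::istream, so it does not address why the file fails to open.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the words of each line up to and
// including `target`, ignoring whatever follows it on that line.
std::vector<std::string> wordsUntil(std::istream& in, const std::string& target)
{
    std::vector<std::string> seen;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream iss(line);
        std::string word;
        while (iss >> word) {
            seen.push_back(word);
            if (word == target) {
                break;  // leave the inner loop; the outer loop reads the next line
            }
        }
    }
    return seen;
}
```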

How to read a number from a file and use it as a variable in C++?

Let's say I have a file I'm reading that goes something like this :
#character posX posY //commentary line: explains what it represents
CharacterName1 50.0 0.0
CharacterName2 32.0 50.0
The goal here is to be able to read posX and posY and convert them in my C++ program into two double variables x and y.
For now, all I'm able to do is start reading the file and check whether a line is empty or a comment line.
Then, if the line contains the character name I'm looking for, I should be able to continue reading that line to get posX and posY, but I have no clue how to do that. I don't know how to skip the blanks, how to start reading the number, or how to finish reading it and then convert it to a double.
Any idea how I should do that?
I truly hope this is clear enough.
Thank you in advance.
Attempt example
void loadMap(std::string const& filepath) {
std::ifstream infile(filepath.c_str());
if(infile.fail()) {
std::cerr << "Error... " << std::endl;
} else { // opening occurred correctly
while ( !infile.eof() ) {
std::string line;
std::getline(infile, line);
if ( (line[0] != '#') or (line.empty()) ) { //if line not comment or not empty
if( line.find("CharacterName1") ) {.......
Then I'm lost.
Hope this piece of code will help.
#include <bits/stdc++.h>
using namespace std; //change headers and namespaces; included for ease of use;
vector<string> split(const string &text, const char sep) {
vector<string> tokens;
std::size_t start = 0, end = 0;
while ((end = text.find(sep, start)) not_eq string::npos) {
tokens.emplace_back(text.substr(start, end - start));
start = end + 1;
}
tokens.emplace_back(text.substr(start));
return tokens;
}
int main()
{
ofstream outdata;
outdata.open("example.txt");
if( not outdata ) {
cerr << "Error: file could not be opened" << endl;
exit(1);
}
outdata<<"CharacterName1"<<','<<10.0<<','<<40.0<<endl; //writing data into file
outdata<<"CharacterName2"<<','<<20.0<<','<<30.0<<endl;
outdata<<"CharacterName3"<<','<<30.0<<','<<20.0<<endl;
outdata<<"CharacterName4"<<','<<40.0<<','<<10.0<<endl;
outdata.close();
ifstream inputFile;
inputFile.open("example.txt",fstream::in);
if (inputFile.fail())
{
cerr<<"Error: file could not be opened"<<endl;
exit(1);
}
string line;
vector<string> col1;
vector<double> col2;
vector<double> col3;
while (getline(inputFile, line))
{
if(not line.empty()){
auto lineData = split(line, ','); //separator can change in your case
col1.emplace_back(lineData[0]);
col2.emplace_back(std::stod(lineData[1])); // stod, since the column vectors hold double
col3.emplace_back(std::stod(lineData[2]));
}
}
for(int i =0; i<(int) col1.size();i++) //printing the data;
cout<<col1[i]<<"\t"<<col2[i]<<"\t"<<col3[i]<<"\n";
return 0;
}
To understand the logic above:
1. Read each line of the file.
2. For each line, separate the column data with the split(string, sep) function, which returns a vector<string> holding the fields of that row. Here sep is the separator used in the file; since I made the input file comma-separated, I used ','.
3. Convert the fields of the returned vector<string> into the appropriate data types and store them in the respective column vectors col1, col2, col3.
Reference for the split() functionality.
For another column that may have some missing data, you can add logic like the following after the col3.emplace_back(std::stod(lineData[2])); line:
if(lineData.size() > 3)
col4.emplace_back(std::stod(lineData[3]));
else
col4.emplace_back(0.0);
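For the whitespace-separated layout shown in the question (with # comment lines), a stream-extraction variant may be closer to what was asked. This is only a sketch under that assumption; findPosition and its parameters are hypothetical names.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: scans `in` for a line whose first field equals
// `name` and stores the two numbers that follow it in `x` and `y`.
// Returns false if no such line is found.
bool findPosition(std::istream& in, const std::string& name, double& x, double& y)
{
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty() || line[0] == '#') {
            continue;  // skip blank lines and comment lines
        }
        std::istringstream fields(line);
        std::string first;
        double px = 0.0;
        double py = 0.0;
        // operator>> skips leading whitespace and converts the text to double.
        if (fields >> first >> px >> py && first == name) {
            x = px;
            y = py;
            return true;
        }
    }
    return false;
}
```

Called with an opened std::ifstream and "CharacterName1", this would fill x and y from that character's line.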

Replace words in a string without skipping whitespaces

I've got a string which contains a sentence. I have to search and replace a specific word in that string. In my case I have a vector of lines and another vector of words to replace.
Here's my function that generates a file with the final text:
void Generator::generate_file(const string& fileName){
string inBuffer, outBuffer;
std::stringstream ss;
std::ofstream outFile;
outFile.open(fileName);
for (const auto& inIT : userCode){
//userCode is a vector which contains lines of text
ss.str(inIT);
ss.clear();
outBuffer = "";
while (ss >> inBuffer){
for (auto keyIT : keywords){
//keywords is a vector which contains words to replace
if (keyIT == inBuffer)
inBuffer = "REPLACED";
}
outBuffer += inBuffer + " ";
}
outFile << outBuffer << endl;
}
outFile.close();
}
The problem with this function is that it skips all whitespaces. I need them in the output file. What should I do to achieve that?
Below you can see an example of how it works:
userCode:
userCode[0] = "class UrlEncoder(object): class";
userCode[1] = " def __init__(self, alphabet=DEFAULT_ALPHABET,\n block_size=DEFAULT_BLOCK_SIZE):";
Displaying the userCode vector:
class UrlEncoder(object):
def __init__(self, alphabet=DEFAULT_ALPHABET, block_size=DEFAULT_BLOCK_SIZE):
After executing my function it looks like this:
REPLACED UrlEncoder(object):
REPLACED __init__(self, alphabet=DEFAULT_ALPHABET, block_size=DEFAULT_BLOCK_SIZE):
As you can see it properly replaced the keywords. But unfortunately it skipped the tabulator.
The main issue is the way the stream extraction operator >> works. It removes and discards any leading whitespace characters when reading the next formatted input. Assuming you want to stick with ss >> inBuffer for grabbing input, you need to find some way to grab any leading whitespace yourself before you perform each extraction.
For example,
string eatwhite(const string &str, size_t pos)
{
size_t endwhite = str.find_first_not_of(" \t\n", pos);
if (endwhite == string::npos) return "";
return string(str.begin() + pos, str.begin() + endwhite);
}
Now you would call eatwhite before doing any >>:
string outBuffer = eatwhite(ss.str(), ss.tellg());
while (ss >> inBuffer)
{
for (auto keyIT : keywords)
{
//...
}
string whitesp = eatwhite(ss.str(), ss.tellg());
outBuffer += inBuffer + whitesp;
}
outFile << outBuffer << endl;
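A different technique, which avoids operator>> altogether, is to walk the line by hand and copy runs of whitespace unchanged while comparing runs of non-whitespace against the keywords. The sketch below assumes that approach; replaceKeywords and its parameters are hypothetical names, not part of the Generator class.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: returns `line` with every whitespace-delimited word
// that matches an entry of `keywords` replaced by `replacement`.
// Whitespace runs (spaces, tabs, newlines) are copied through untouched.
std::string replaceKeywords(const std::string& line,
                            const std::vector<std::string>& keywords,
                            const std::string& replacement)
{
    std::string out;
    std::size_t i = 0;
    while (i < line.size()) {
        const std::size_t start = i;
        const bool space = std::isspace(static_cast<unsigned char>(line[i])) != 0;
        // Advance to the end of the current run (all whitespace or all non-whitespace).
        while (i < line.size() &&
               (std::isspace(static_cast<unsigned char>(line[i])) != 0) == space) {
            ++i;
        }
        std::string token = line.substr(start, i - start);
        if (!space) {
            for (const auto& key : keywords) {
                if (token == key) {
                    token = replacement;
                    break;
                }
            }
        }
        out += token;
    }
    return out;
}
```

Inside generate_file, the body of the outer loop could then reduce to outFile << replaceKeywords(inIT, keywords, "REPLACED") << endl;.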

If I wanted to read in comma delineated data from an input file such as 1, 2, 3 INCLUDING the commas?

void readInputRecord(ifstream &inputFile,
string &taxID, string &firstName, string &lastName, string &phoneNumber) {
while (!inputFile.eof()) {
inputFile >> firstName >> lastName >> phoneNumber >> taxID;
}
}
As you can see, I read in the data the way you would with a standard input file. The trouble is that the data fields can be blank, such as ", ,", with no data between the commas. I've been reading forums here and elsewhere, and a common method seems to be getline(stuff, stuff, ','), but that reads in data and stops at the comma. What is the method to include the commas? The output file should contain ", ," for the variable if that is what was read.
You don't need to explicitly read the ',' to be sure there has been a ','. std::getline(...) in combination with std::stringstream offers a valid solution:
// Read the file line by line using the
// std line terminator '\n'
while(std::getline(fi,line)) {
std::stringstream ss(line);
std::string cell;
// Read the cells within the line using
// ',' as the "line terminator"
while(std::getline(ss,cell,',')) {
// here you have a string that may be empty when you got
// a ',,' sequence
std::cerr << "[" << cell << "]" << std::endl;
}
}
If you have boost-dev installed, include the header file <boost/algorithm/string.hpp> and use boost::split:
void readInputRecord(std::ifstream &inputFile, std::vector<std::string>& fields) {
std::string line;
fields.clear();
while (std::getline(inputFile, line)) {
boost::split(fields, line, boost::is_any_of(","));
for (std::vector<std::string>::iterator it = fields.begin(); it != fields.end(); ++it)
std::cout << *it << "#";
std::cout << std::endl;
}
}
All the fields end up in the vector, including the empty ones. The code is not tested, but it should work.
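Without Boost, a similar result can be sketched with std::getline on a string stream. One detail to watch: std::getline reports nothing after a trailing separator, so the last empty field has to be added by hand. The name splitFields is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `line` on `sep`, keeping empty fields.
std::vector<std::string> splitFields(const std::string& line, char sep = ',')
{
    std::vector<std::string> fields;
    std::istringstream in(line);
    std::string cell;
    while (std::getline(in, cell, sep)) {
        fields.push_back(cell);  // cell may be empty, e.g. for ",,"
    }
    // A trailing separator means one more (empty) field that getline did not report.
    if (!line.empty() && line.back() == sep) {
        fields.push_back("");
    }
    return fields;
}
```

If the original line has to be reproduced on output, writing the fields back with sep between them restores the commas, since they were only used as delimiters.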

Modify cin to also return the newlines

I know about getline() but it would be nice if cin could return \n when encountered.
Any way for achieving this (or similar)?
edit (example):
string s;
while(cin>>s){
if(s == "\n")
cout<<"newline! ";
else
cout<<s<<" ";
}
input file txt:
hola, em dic pere
caram, jo també .
the end result should be like:
hola, em dic pere newline! caram, jo també .
If you read individual lines, you know there is a newline after each line read. The exception is the last line of the file, which does not have to end with a newline character for the read to succeed. You can detect whether it did by checking eof(): if std::getline() succeeded but eof() is set, the last line did not end with a newline. Obviously, this requires the std::string version of std::getline():
for (std::string line; std::getline(in, line); )
{
std::cout << line << (in.eof()? "": "\n");
}
This should write the stream to std::cout as it was read.
The question asked for the data to be output but with newlines converted to say "newline!". You can achieve this with:
for (std::string line; std::getline(in, line); )
{
std::cout << line << (in.eof()? "": "newline! ");
}
If you don't care about the stream being split into lines and just want to get the entire file (including all newlines), you can read the stream into a std::string:
std::string file((std::istreambuf_iterator<char>(in)),
std::istreambuf_iterator<char>());
Note, however, that this exact approach is probably fairly slow (although I know that it can be made fast). If you know that the file doesn't contain a certain character, you can also use std::getline() to read the entire file into a std::string:
std::getline(in, file, '\0');
The above code assumes that your file doesn't contain any null characters.
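Another commonly used way to read a whole stream, which also copes with null characters, is to copy its buffer into a std::ostringstream. This is a sketch only; readAll is a hypothetical name, and whether it is faster than the iterator version depends on the standard library implementation.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: returns everything left in `in`, newlines included.
std::string readAll(std::istream& in)
{
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copies the remaining contents of the stream buffer
    return buffer.str();
}
```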
A modification of @Dietmar's answer should do the trick:
for (std::string line; std::getline(in, line); )
{
std::istringstream iss(line);
for (std::string word; iss >> word; ) { std::cout << word << " "; }
if (!in.eof()) { std::cout << "newline! "; } // eof() is set only when the last line had no newline
}
Just for the record, I ended up using this (I wanted to post it 11h ago)
string s0, s1;
while(getline(cin,s0)){
istringstream is(s0);
while(is>>s1){
cout<<s1<<" ";
}
cout<<"newline! ";
}