Reading tab-separated values from text file into array - c++

Hey I am quite new to C++ and I am facing a problem:
I have a textfile which looks like this:
500
1120
10
number1,1 number1,2 ... number1,500
number2,1
.
.
number1120,1
So the first two values on top of the text file describe the dimensions of the matrix. I now want to write code that reads all the values of the matrix into an array or vector of int values. I can read the first three values (500, 1120, 10) and store them as integers using getline and stringstream, but I can't figure out how to read the tab-separated matrix with a loop.

Something like this:
#include <iostream>
#include <sstream>
#include <string>
int main() {
    // Assume the input line is "12,34,56". In a real program you would
    // read it from the input file, e.g. with std::getline.
    std::string input = "12,34,56";
    // Wrap the line in a string stream so that it can be read
    // like any other input stream (such as std::cin).
    std::istringstream ss(input);
    std::string token;
    // Read from the stream up to each ',' delimiter
    // and store the text in token.
    while (std::getline(ss, token, ',')) {
        std::cout << token << '\n';
    }
}

Consider reading the matrix line by line in a loop and splitting each line with a tokenizer (e.g. std::strtok) or with a nested loop that splits the line at your delimiters.
There is a thread about tokenizers.
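The line-by-line approach could look roughly like the sketch below. It assumes the layout from the question: three header integers, then one matrix row per line with tab-separated values. The function name readMatrix and the parameter names are made up for illustration, and the meaning of the third header value is not known from the question. std::stoi throws on non-numeric cells, so this is only suitable for well-formed input.

```cpp
#include <istream>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads three header integers, then parses every
// remaining line as one matrix row of tab-separated ints.
// The header values are returned through the reference parameters.
std::vector<std::vector<int>> readMatrix(std::istream& in,
                                         int& columns, int& rows, int& third) {
    std::vector<std::vector<int>> matrix;
    if (!(in >> columns >> rows >> third)) {
        return matrix;  // header missing or malformed
    }
    // Skip the rest of the last header line so getline starts at the first row.
    in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');

    for (std::string line; std::getline(in, line); ) {
        std::istringstream fields(line);
        std::vector<int> row;
        // Inner loop: split the current line at each tab character.
        for (std::string cell; std::getline(fields, cell, '\t'); ) {
            if (!cell.empty()) {
                row.push_back(std::stoi(cell));  // throws on non-numeric text
            }
        }
        if (!row.empty()) {
            matrix.push_back(row);
        }
    }
    return matrix;
}
```

Pass it an std::ifstream opened on your file. The sketch does not check that the number of rows and columns actually matches the header values.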

Related

C++ CSV Getline

I have one column of floats in a csv file. No column header.
string val;
vector<float> array;
string file = "C:/path/test.csv";
ifstream csv(file);
if (csv.is_open())
{
string line;
getline(csv, line);
while (!csv.eof())
{
getline(csv, val, '\n');
array.push_back(stof(val));
}
csv.close();
}
I want to push the values in the column to vector array. When I use ',' as a delimiter it pushes the first line to the array but the rest of the column gets stuck together and unpushable. If I use '\n' it doesn't return the first line and I get a stof error.
I tried other answers unsuccessfully. What is the correct way to format this here?
test.csv
Your raw test.csv probably looks like this:
1.41286
1.425
1.49214
...
So there are no commas, and using ',' as the separator would read the whole file as a single field, of which stof parses only the first float (up to the first \n).
Also, the first line is read but never used:
getline(csv, line); // <- first line never used
while (!csv.eof())
{
getline(csv, val, '\n');
array.push_back(stof(val));
}
Since there is only one field you don't have to use a separator and, as already mentioned in the comments, using while(getline(...)) is the right way to do this:
if (csv.is_open())
{
string line;
while (getline(csv, line))
{
array.push_back(stof(line));
}
csv.close();
}
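For reference, here is the same idea as a small self-contained sketch. The function name readColumn is made up for illustration, and skipping blank lines is my own addition so that a trailing empty line does not make std::stof throw.

```cpp
#include <istream>
#include <sstream>  // only needed to try the function on an in-memory stream
#include <string>
#include <vector>

// Hypothetical helper: reads one float per line until the stream is exhausted.
std::vector<float> readColumn(std::istream& in) {
    std::vector<float> values;
    // getline is the loop condition, so a failed read is never processed.
    for (std::string line; std::getline(in, line); ) {
        // Skip lines that contain only whitespace (including a stray '\r').
        if (line.find_first_not_of(" \t\r") == std::string::npos) {
            continue;
        }
        values.push_back(std::stof(line));  // throws on non-numeric text
    }
    return values;
}
```

It works with any input stream, so you can hand it the std::ifstream from the question or an std::istringstream while testing.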

C++ file conversion: pipe delimited to comma delimited

I am trying to figure out how to turn this input file that is in pipe delimited form into comma delimited. I have to open the file, read it into an array, convert it into comma delimited in an output CSV file and then close all files. I have been told that the easiest way to do is within excel but I am not quite sure how.
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
ifstream inFile;
string myArray[5];
cout << "Enter the input filename:";
cin >> inFileName;
inFile.open(inFileName);
if(inFile.is_open())
std::cout<<"File Opened"<<std::endl;
// read file line by line into array
cout<<"Read";
for(int i = 0; i < 5; ++i)
{
file >> myArray[i];
}
// File conversion
// close input file
inFile.close();
// close output file
outFile.close();
...
What I need to convert is:
Miles per hour|6,445|being the "second" team |5.54|9.98|6,555.00
"Ending" game| left at "beginning"|Elizabeth, New Jersey|25.25|6.78|987.01
|End at night, or during the day|"Let's go"|65,978.21|0.00|123.45
Left-base night|10/07/1900|||4.07|777.23
"Let's start it"|Start Baseball Game|Starting the new game to win
What the output should look like in comma-delimited form:
Miles per hour,"6,445","being the ""second"" team member",5.54,9.98,"6,555.00",
"""Ending"" game","left at ""beginning""","Denver, Colorado",25.25,6.78,987.01,
,"End at night, during the day","""Let's go""","65,978.21",0.00,123.45,
Left-base night, 10/07/1900,,,4.07,777.23,
"""Let's start it""", Start Baseball Game, Starting the new game to win,
I will show you a complete solution and explain it to you. But first, let's have a look at it:
#include <iostream>
#include <vector>
#include <fstream>
#include <regex>
#include <string>
#include <algorithm>
// The manual input of the filenames is omitted here and left as an exercise.
// This example uses fixed filenames.
const std::string inputFileName("r:\\input.txt");
const std::string outputFileName("r:\\output.txt");
// The delimiter for the source csv file
std::regex re{ R"(\|)" };
std::string addQuotes(const std::string& s) {
// Replace each quote character (") in the string with two quote characters ("")
std::string result = std::regex_replace(s, std::regex(R"(")"), R"("")");
// If the string contains any quote (") or comma, then quote the complete string
if (std::any_of(result.begin(), result.end(), [](const char c) { return ((c == '\"') || (c == ',')); })) {
result = "\"" + result + "\"";
}
return result;
}
// Some output function
void printData(std::vector<std::vector<std::string>>& v, std::ostream& os) {
// Go through all rows
std::for_each(v.begin(), v.end(), [&os](const std::vector<std::string>& vs) {
// Define delimiter
std::string delimiter{ "" };
// Show the delimited strings
for (const std::string& s : vs) {
os << delimiter << s;
delimiter = ",";
}
os << "\n";
});
}
int main() {
// We open the output file first because, if it cannot be opened, there is no point in doing the rest of the exercise
// Open output file and check if it could be opened
if (std::ofstream outputFileStream(outputFileName); outputFileStream) {
// Open the input file and check if it could be opened
if (std::ifstream inputFileStream(inputFileName); inputFileStream) {
// This variable will store all lines from the CSV file, split up into columns
std::vector<std::vector<std::string>> data{};
// Now read all lines of the CSV file and split it into tokens
for (std::string line{}; std::getline(inputFileStream, line); ) {
// Split line into tokens and add to our resulting data vector
data.emplace_back(std::vector<std::string>(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {}));
}
std::for_each(data.begin(), data.end(), [](std::vector<std::string>& vs) {
std::transform(vs.begin(), vs.end(), vs.begin(), addQuotes);
});
// Output, to file
printData(data, outputFileStream);
// And to the screen
printData(data, std::cout);
}
else {
std::cerr << "\n*** Error: could not open input file '" << inputFileName << "'\n";
}
}
else {
std::cerr << "\n*** Error: could not open output file '" << outputFileName << "'\n";
}
return 0;
}
So, let's have a look. We have three functions:
main: reads the csv file, splits it into tokens, converts them, and writes the result
addQuotes: adds quotes where necessary
printData: prints the converted data to an output stream
Let's start with main. main first opens the output file and then the input file.
The input file contains a kind of structured data and is also called csv (comma separated values). But here we do not have a comma, but a pipe symbol as delimiter.
The result is typically stored in a 2d-vector: one dimension holds the rows and the other holds the columns.
So, what do we need to do next? As we can see, we first need to read all complete text lines from the source stream. This can easily be done with a one-liner:
for (std::string line{}; std::getline(inputFileStream, line); ) {
As you can see here, the for statement has a declaration/initialization part, then a condition, and then a statement that is carried out at the end of each iteration. This is well known.
We first define a variable "line" of type std::string and use the default initializer to create an empty string. Then we use std::getline to read a complete line from the stream and put it into our variable. std::getline returns a reference to the stream, and the stream has a conversion to bool that yields false once a read has failed (for example at end of file). So, the for loop does not need an additional check for the end of file. And we do not use the last statement of the for loop, because reading a line advances the file position automatically.
This gives us a very simple for loop for reading a complete file line by line.
Please note: defining the variable "line" in the for statement scopes it to the loop, meaning it is only visible inside the loop. This is generally a good way to avoid polluting the outer scope.
OK, now the next line:
data.emplace_back(std::vector<std::string>(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {}));
Uh oh, what is that?
OK, let's go step by step. First, we obviously want to add something to our 2-dimensional data vector. We use std::vector's function emplace_back, which constructs the new element directly inside the vector. push_back would have worked as well; since we pass a temporary row, it is moved into place rather than copied in either case.
And what do we want to add? We want to add a complete row, so a vector of columns. In our case a std::vector<std::string>. We build this row with the vector's range constructor. Please see here: Constructor number 5. The range constructor takes 2 iterators, a begin and an end iterator, as parameters, and copies all values in that range into the vector.
So, we expect a begin and an end iterator. And what do we see here:
The begin iterator is: std::sregex_token_iterator(line.begin(), line.end(), re, -1)
And the end iterator is simply {}
But what is this thing, the sregex_token_iterator?
This is an iterator that iterates over patterns in a line. The pattern is given by a regex. You may read here about the C++ regex library. Since it is very powerful, it unfortunately takes a while to learn, and I cannot cover it here. But let us describe its basic functionality for our purpose: you describe a pattern in a kind of meta language, and the std::sregex_token_iterator looks for that pattern in the line. In our case the pattern is very simple: the pipe symbol. Because | has a special meaning in a regex, it is escaped and written as "\|". The last argument, -1, tells the iterator to return not the matches themselves but the text between them, which is exactly the column data that we want.
Now to the {} as the end iterator. You may have read that the {} will do default construction/initialization. And if you read here, number 1, then you see that the "default-constructor" constructs an end-of-sequence iterator. So, exactly what we need.
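To see the splitting step in isolation, here is a minimal sketch. The function name splitOnPipe is made up for illustration; the regex and the -1 argument are the same as in the solution above. Note that fields between two adjacent pipes come out as empty strings, while an empty field after a trailing pipe at the end of the line is not returned.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits one line at every '|' character.
std::vector<std::string> splitOnPipe(const std::string& line) {
    // The pipe is escaped because '|' means alternation in a regex.
    static const std::regex pipe{ R"(\|)" };
    // -1 selects the text between the matches instead of the matches.
    return std::vector<std::string>(
        std::sregex_token_iterator(line.begin(), line.end(), pipe, -1),
        std::sregex_token_iterator{});
}
```

For example, splitOnPipe("a|b||c") yields the four strings "a", "b", "" and "c".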
After we have read all data, we transform the single strings into the required output. This is done with std::transform and the function addQuotes.
The strategy here is to first replace every quote character with two quote characters.
Then we check whether the string contains any comma or quote. If it does, we additionally enclose the whole string in quotes.
And last, but not least, we have a simple output function and print the converted data into a file and on the screen.
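If you prefer not to use a regex for the quoting step, the same strategy can be sketched with a plain loop. This is an alternative to addQuotes, not the function from the solution above, and the name quoteCsvField is made up for illustration.

```cpp
#include <string>

// Hypothetical alternative to addQuotes: doubles every quote character and
// wraps the field in quotes if it contained a quote or a comma.
std::string quoteCsvField(const std::string& field) {
    std::string escaped;
    bool needsQuotes = false;
    for (char c : field) {
        if (c == '"') {
            escaped += "\"\"";  // one quote becomes two
            needsQuotes = true;
        } else {
            if (c == ',') {
                needsQuotes = true;
            }
            escaped += c;
        }
    }
    return needsQuotes ? "\"" + escaped + "\"" : escaped;
}
```

With the sample data, 6,445 becomes "6,445", the field "Ending" game becomes """Ending"" game", and 5.54 is left unchanged.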

How can we take out the multiple integer numbers in the string array or string and assign them to different int data type?

I am new to C++ and I am reading in a text file. The content of text file is like:
$ (first line)
2 (second)
MY NAME IS (whatever sentence with 10 or below characters)(third)
12 21 (forth)
22 22 (fifth)
221 (sixth)
fly jump run (seventh)
fish animal (eighth)
So I need to read all of these and store them in different variables line by line. So far I have managed to store them in a string array line by line, but how can I store numbers like 12 21 in the fourth line in 2 different integer variables such as int b and int c?
And, for the last two lines,
how can I store fly jump run fish animal in 5 different string variables?
Basically, I am now putting the lines into a string array and trying to access them there and store the values.
if (file.is_open()){
cout<<"Congratulations! Your file was successfully read!";
while (!file.eof()){
getline(file,line);
txt[i]=line;
i++;
}
}
Just want to store every line into variables based on their data type.
The streams support streaming the content directly into the basic data types (int, double etc.). So the istream::operator>>(int&) does the work for you.
The below small sample class demonstrates it by reading your sample file into the members -- hope that helps:
class Creature
{
public:
void read(istream& stream)
{
string line;
stream.ignore(10, '\n'); // skip line 1 (= $)
stream >> m_integers[0]; // line 2 = single int
stream.ignore(1, '\n'); // skip end of line
getline(stream, m_sentence); // get the full sentence line ..
// and the rest ... we can read that in a single code line ...
stream >> m_integers[1] >> m_integers[2] >> m_integers[3] >> m_integers[4]
>> m_integers[5] >> m_whatCanIdDo[0] >> m_whatCanIdDo[1] >> m_whatCanIdDo[2] >> m_whatIAm[0] >> m_whatIAm[1];
}
private:
string m_sentence;
int m_integers[6];
string m_whatCanIdDo[3];
string m_whatIAm[2];
};
Calling the function:
int main()
{
ifstream file;
file.open("creature.txt");
Creature cr;
cr.read(file);
file.close();
}
There are several ways of doing this, but one of the most straightforward is to use a stringstream.
To do this, copy the line you want to tokenize from your txt array into a stringstream. Use the stream extraction operator (>>) to read each whitespace-separated word from that line into a separate variable.
//Required headers
#include <string>
#include <sstream>
...
string word1, word2;
stringstream words(txt[lineNumber]);
words >> word1 >> word2;
//Process words
For each line you tokenize, you'll have to reset the stream.
//Read in next line
lineNumber++;
//Reset stream flags
words.clear();
//Replace the stream's input string
words.str(txt[lineNumber]);
words >> word1 >> word2;
//Process new words
You can use the same process for both integers and strings. The stream extraction operator will automatically convert the text to whatever data type you give it. However, it's up to you to make sure that the data it's trying to convert is the correct type. If you try to extract non-numeric text into an int, the stringstream will set its fail bit and you won't get any useful output.
It's a good idea to write your input to a string, and then check whether that string is, in fact, a number, before trying to write it to an integer. But that's an entirely different topic, there are many ways to do it, and there are several other questions on this site that cover it.
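As a minimal sketch of checking the fail bit, the function below extracts two integers from one line and reports whether both extractions succeeded. The name parseTwoInts is made up for illustration; it only covers the "12 21" style lines from the question.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: extracts two ints from a line.
// Returns false if either extraction failed (e.g. the text is not numeric).
bool parseTwoInts(const std::string& line, int& first, int& second) {
    std::istringstream fields(line);
    // The stream converts to false as soon as one extraction has failed.
    return static_cast<bool>(fields >> first >> second);
}
```

Called with txt[3] from the question ("12 21"), it would fill b and c and return true. Called with a line such as "fly jump run", it returns false and the values should not be used.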

ifstream get line change output from char to string

C++ ifstream get line change getline output from char to string
I got a text file.. so i read it and i do something like
char data[50];
readFile.open(filename.c_str());
while(readFile.good())
{
readFile.getline(data,50,',');
cout << data << endl;
}
My question is: instead of creating a char array of size 50 named data, can I have getline write into a string instead, something like
string myData;
readFile.getline(myData,',');
My text file is something like this
Line2D, [3,2]
Line3D, [7,2,3]
I tried and the compiler say..
no matching function for getline(std::string&,char)
So is it possible to still split by delimiter, but assign the value to a string instead of a char array?
Updates:
Using
while (std::getline(readFile, line))
{
std::cout << line << std::endl;
}
It reads line by line, but I want to break each line at a delimiter. Originally, when using a char array, I specified the delimiter as the third argument:
readFile.getline(data,50,',');
How do I do the same with a string, i.e. split each line at the comma delimiter shown above?
Use std::getline():
std::string line;
while (std::getline(readFile, line, ','))
{
std::cout << line << std::endl;
}
Always check the result of read operations immediately otherwise the code will attempt to process the result of a failed read, as is the case with the posted code.
Though it is possible to specify a different delimiter in getline(), doing so treats the newline as ordinary data, so the last field of one line and the first field of the next line are returned as a single token. I recommend retrieving each line in full and then splitting the line. A useful utility for splitting lines is boost::split().
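If you would rather not pull in Boost, the two-step approach can be sketched with the standard library alone: an outer std::getline for the lines and an inner std::getline with ',' on a string stream for the fields. The function name readCommaSeparated is made up for illustration, and the fields are returned exactly as they appear, including any leading space or bracket.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns one vector of fields per input line.
std::vector<std::vector<std::string>> readCommaSeparated(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    // Outer loop: one complete line at a time.
    for (std::string line; std::getline(in, line); ) {
        std::istringstream fields(line);
        std::vector<std::string> row;
        // Inner loop: split the current line at each comma.
        for (std::string field; std::getline(fields, field, ','); ) {
            row.push_back(field);
        }
        rows.push_back(row);
    }
    return rows;
}
```

For the line "Line2D, [3,2]" this gives the three fields "Line2D", " [3" and "2]", so you still have to trim the spaces and brackets yourself.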

C++ length of file and vectors

Hi I have a file with some text in it. Is there some easy way to get the number of lines in the file without traversing through the file?
I also need to put the lines of the file into a vector. I am new to C++ but I think vector is like ArrayList in java so I wanted to use a vector and insert things into it. So how would I do it?
Thanks.
There is no way of finding the number of lines in a file without reading it. To read all lines:
1) create a std::vector of std::string
2) open a file for input
3) read a line as a std::string using getline()
4) if the read failed, stop
5) push the line into the vector
6) goto 3
You would need to traverse the file to detect the number of lines (or at least call a library method that traverse the file).
Here is sample code that reads a text file line by line using getline, assuming that the file name is passed as the first argument:
#include <string>
#include <vector>
#include <fstream>
#include <iostream>
int main(int argc, char* argv[])
{
std::vector<std::string> lines;
std::string line;
lines.clear();
// open the desired file for reading
std::ifstream infile (argv[1], std::ios_base::in);
// read each line individually (watch out for Windows new lines)
while (getline(infile, line, '\n'))
{
// add line to vector
lines.push_back (line);
}
// do anything you like with the vector. Output the size for example:
std::cout << "Read " << lines.size() << " lines.\n";
return 0;
}
Update: The code could fail for many reasons (e.g. file not found, concurrent modifications to file, permission issues, etc). I'm leaving that as an exercise to the user.
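Regarding the Windows new lines mentioned in the code comment: if a file with "\r\n" line endings is read on a platform that does not translate them, getline leaves a '\r' at the end of every line. One way to deal with that is sketched below. The function name readLinesPortable is made up for illustration.

```cpp
#include <istream>
#include <sstream>  // only needed to try the function on an in-memory stream
#include <string>
#include <vector>

// Hypothetical helper: reads all lines and strips a trailing '\r' from each.
std::vector<std::string> readLinesPortable(std::istream& in) {
    std::vector<std::string> lines;
    for (std::string line; std::getline(in, line); ) {
        if (!line.empty() && line.back() == '\r') {
            line.pop_back();  // drop the carriage return of a "\r\n" ending
        }
        lines.push_back(line);
    }
    return lines;
}
```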
1) No way to find number of lines without reading the file.
2) Take a look at getline function from the C++ Standard Library. Something like:
string line;
fstream file;
vector <string> vec;
...
while (getline(file, line)) vec.push_back(line);
Traversing the file is fundamentally required to determine the number of lines, regardless of whether you do it or some library routine does it. New lines are just another character, and the file must be scanned one character at a time in its entirety to count them.
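If you only need the count and not the lines themselves, that scan can be written as a short sketch with std::count over the stream's characters. The function name countNewlines is made up for illustration. Note that it counts newline characters, so a last line without a terminating newline is not included.

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>  // only needed to try the function on an in-memory stream

// Hypothetical helper: counts the '\n' characters in the rest of the stream.
std::size_t countNewlines(std::istream& in) {
    // istreambuf_iterator walks the stream one character at a time.
    return static_cast<std::size_t>(
        std::count(std::istreambuf_iterator<char>(in),
                   std::istreambuf_iterator<char>(), '\n'));
}
```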
Since you have to read the lines into a vector anyways, you might as well combine the two steps:
// Read lines from input stream in into vector out
// Return the number of lines read
int getlines(std::vector<std::string>& out, std::istream& in = std::cin) {
out.clear(); // remove any data in vector
std::string buffer;
while (std::getline(in, buffer))
out.push_back(buffer);
// return number of lines read
return out.size();
}