I'm using C++ to build my optimization model on Gurobi, and I have a question on how to assign values to coefficients. Currently, I did them in the .cpp file as
const int A = 4;
double B[] = { 1, 2, 3 };
double C[][A] = {
{ 5, 1, 0, 3 },
{ 7, 0, 2, 4 },
{ 4, 6, 8, 9 }
};
which means B[1]=1, B[2]=2, B[3]=3, and C[1][1]=5, C[1][2]=1, etc.
However, I would like to run the same model for different sets of coefficients, so instead of changing values in the .cpp file, it would be easier if I could read them from multiple .dat files.
May I know how to do it?
And is that OK if I save the .dat file in the following format?
[4]
[1, 2, 3]
[[5, 1, 0, 3],
[7, 0, 2, 4],
[4, 6, 8, 9]]
I would not recommend that. Some people would recommend using JSON or YAML but if your coefficients will always be so simple, here is a recommendation:
Original file
4
1 2 3
5 1 0 3
7 0 2 4
4 6 8 9
#include <iostream>
#include <sstream>
#include <vector>
struct Coefficients {
unsigned A;
std::vector<double> B;
std::vector< std::vector<double> > C;
};
std::vector<double> parseFloats( const std::string& s ) {
std::istringstream isf( s );
std::vector<double> res;
while ( isf.good() ) {
double value;
isf >> value;
res.push_back( value );
}
return res;
}
void readCoefficients( std::istream& fs, Coefficients& c ) {
fs >> c.A;
std::ws( fs );
std::string line;
std::getline( fs, line );
c.B = parseFloats( line );
while ( std::getline( fs, line ) ) {
c.C.push_back( parseFloats( line ) );
}
}
One example of usage:
std::string data = R"(
4
1 2 3
5 1 0 3
7 0 2 4
4 6 8 9
)";
int main() {
Coefficients coef;
std::istringstream isf( data );
readCoefficients( isf, coef );
std::cout << "A:" << coef.A << std::endl;
std::cout << "B:" << std::endl << " ";
for ( double val : coef.B ) {
std::cout << val << " ";
}
std::cout << std::endl;
std::cout << "C:" << std::endl;
for ( const std::vector<double>& row : coef.C ) {
std::cout << " ";
for ( double val : row ) {
std::cout << val << " ";
}
std::cout << std::endl;
}
}
Result:
A:4
B:
1 2 3
C:
5 1 0 3
7 0 2 4
4 6 8 9
Code: https://godbolt.org/z/9s3zffahj
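Since the question asks about reading from several .dat files, the same kind of reader can be fed from a std::ifstream instead of a string stream. The following is only a minimal sketch under that assumption: the names parseLine and loadCoefficients and any file name you pass in are illustrative and not part of the answer above, and the inner loop uses while (in >> v) as a variation on parseFloats.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

struct Coefficients {
    unsigned A = 0;
    std::vector<double> B;
    std::vector<std::vector<double>> C;
};

// Parse all whitespace-separated doubles found in one line of text.
std::vector<double> parseLine(const std::string& line) {
    std::istringstream in(line);
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);   // stops at the first failed read
    return values;
}

// Read A, then one line for B, then one line per row of C.
Coefficients readCoefficients(std::istream& in) {
    Coefficients c;
    if (!(in >> c.A)) throw std::runtime_error("missing value for A");
    std::string line;
    std::getline(in >> std::ws, line);
    c.B = parseLine(line);
    while (std::getline(in, line)) {
        std::vector<double> row = parseLine(line);
        if (!row.empty()) c.C.push_back(row);   // ignore blank lines
    }
    return c;
}

// Hypothetical helper: open a .dat file by name and parse it.
Coefficients loadCoefficients(const std::string& path) {
    std::ifstream file(path);
    if (!file) throw std::runtime_error("cannot open " + path);
    return readCoefficients(file);
}
```

With something like this, running the model on another data set would only mean calling loadCoefficients with a different file name, for example one taken from argv.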
Gurobi. Very interesting! And you have chosen C++ to interface with it. Good.
I will try to give you a simple answer that uses simple statements.
The code has low complexity and needs only a few statements.
There is no need for many C-like statements, because C++ is a very expressive language. You can build an understandable, easily readable abstraction for your problem. If you look at main and see only a few lines of code, you will understand what I mean.
Additionally, I will give you a detailed explanation of everything. So, I will not just dump code, but explain it step by step. I also add many comments and use reasonably long, "speaking" variable names.
This code will be in C++, and not in the C style that you often see.
So let us concentrate a little on how we would do things in C++.
If we look at your first definition:
const int A = 4;
double B[] = { 1, 2, 3 };
double C[][A] = {
{ 5, 1, 0, 3 },
{ 7, 0, 2, 4 },
{ 4, 6, 8, 9 }
};
We can see C-style arrays here, the ones with the [] brackets. The number of elements of the B array is defined by the number of initializer elements. So, 3 elements in the initializer list give us 3 elements in the array: the array size is 3. OK, understood.
The C elements form a 2-dimensional matrix. The number of columns is defined by `const int A = 4`. So, I am not sure whether A is just a size or really a coefficient. But in the end, it does not matter. The number of rows is given by the number of lines in the source text. So, we have a matrix with 3 rows and 4 columns.
First important information: in modern C++ we usually avoid C-style arrays []. We have basically 2 versatile workhorses instead:
The std::array, if the size of the array is known at compile time
The main C++ container, the std::vector: an array that can grow dynamically as needed. It is extremely powerful and used a lot in C++. It also knows how many elements it contains, so no explicit definition like A=4 is needed.
And the std::vector is the container to use for this purpose. Please read up on std::vector. So, even if we do not know the number of rows and columns in advance, we can use a std::vector and it will grow as needed.
For that reason, I am not sure whether the "const with value 4" is needed in your coefficient information at all. Anyway, I will add it.
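To illustrate the point that a std::vector knows its own size, here is a small sketch. The alias Matrix and the helper names rowCount, columnCount and isRectangular are made up for this example; they only show how the dimensions can be queried instead of being stored in a separate constant like A.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Number of rows is simply the size of the outer vector.
std::size_t rowCount(const Matrix& m) { return m.size(); }

// Number of columns, taken from the first row (0 for an empty matrix).
std::size_t columnCount(const Matrix& m) {
    return m.empty() ? 0 : m.front().size();
}

// A vector of vectors may be ragged, so it can be worth checking
// that every row has the same length.
bool isRectangular(const Matrix& m) {
    for (const std::vector<double>& row : m) {
        if (row.size() != columnCount(m)) return false;
    }
    return true;
}
```

If A really is just the number of columns, a check like isRectangular combined with columnCount could replace it; if A is a genuine coefficient, it has to stay in the data.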
Next, C++ supports object-oriented programming. In its very beginning, the language was even called 'C with Classes'. Objects in C++ are modeled with classes or structs.
One major idea of the object-oriented approach is to put data, and the methods operating on this data, together in one class (or struct).
So, we can define a Coefficient class and store all coefficient data there, like your A, B, C. And then, most importantly, we add functions to this class that operate on this data. In our special case we add read and write functionality.
As you know, C++ has a very versatile IO-stream library, with so-called "extraction" operators >> and "inserter" operators <<. The streams are implemented as a class hierarchy. That means it does not matter on which stream (e.g. std::cout, a file stream or a string stream) you use the << or >> operators; they work basically the same way everywhere.
The extractor >> and inserter << operators are already overloaded for many built-in data types. Because of that, you can output many different data types to, for example, std::cout.
But this will of course not work for custom types like our class "Coefficient". Here, we can simply add the functionality by defining the appropriate inserter and extractor operators. After that, we can use our new type in the same way as the built-in data types.
Then let us look now on the first code example:
struct Coefficient {
// The data
int A{};
std::vector<double> B{};
std::vector<std::vector<double>> C{};
friend std::istream& operator >>(std::istream& is, Coefficient& coefficient);
friend std::ostream& operator << (std::ostream& os, const Coefficient& coefficient);
};
That is all. Simple, isn't it? Now the class has the needed functionality.
We will show later the implementation for the operators.
Note, this mechanism is also called (de)serialization, because the data of our class is written and read in a serial, human-readable form. We need to make sure that the output and the input format are the same, so that data written by one operator can be read back by the other.
You should already see that we can later have extremely simple, low-complexity handling of IO operations in main or other functions. Let us look at main right away:
// Some example data in a stream
std::istringstream exampleFile{ R"(4
1 2 3
5 1 0 3
7 0 2 4
4 6 8 9 )" };
// Test/Driver code
int main() {
// Here we have our coefficients
Coefficient coefficient{};
// Simply extract all data from the file and store it in our coefficients variable
// This is just a one-liner, intuitive and simple to understand.
exampleFile >> coefficient;
// One-liner debug output. Even simpler
std::cout << coefficient;
}
This looks very intuitive and similar to the input and output of the built-in data types.
Let us now come to the actual input and output functions. And, because we want to keep it simple, we will structure your data in your “.dat” file in an easy to read way.
And that is: White space separated data. So: 81 999 42 and so on. Why is that simple? Because in C++ the formatted input functions (those with the extractor >>) will read such data easily. Example:
int x,y,z;
std::cin >> x >> y >> z;
If you give a white space separated input as shown above, it will read the characters, convert it to numbers and store it in the variables.
There is one problem in C++. The end-of-line character '\n' is treated as white space by the formatted input functions. So, reading values in a loop would not stop at the end of a line.
The standard solution for this problem is to use an unformatted input function like std::getline and first read a complete line into a std::string variable. Then, we put this string into a std::istringstream, which is again a stream, and extract the values from there.
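The difference between the two approaches can be seen in a small sketch. The function names readAllValues and readOneLine are invented for this illustration: the first keeps extracting with >> and runs straight across line ends, the second reads one line with std::getline first and only extracts from that line.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Formatted extraction only: '\n' counts as white space,
// so this reads every number up to the end of the stream.
std::vector<double> readAllValues(std::istream& in) {
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);
    return values;
}

// Read one complete line first, then extract the numbers of that line only.
std::vector<double> readOneLine(std::istream& in) {
    std::string line;
    std::getline(in, line);
    std::istringstream lineStream(line);
    return readAllValues(lineStream);
}
```

For an input of two lines, "1 2 3" and "5 1 0 3", readAllValues returns all seven numbers at once, while two successive calls of readOneLine return three and then four numbers.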
In your “.dat” file you have many lines with data. So, we need to do the above operation repeatedly. And for things that need to be done repeatedly, we use functions in C++. We need to have a function, that receives a stream (any stream) reads the values, store them in a std::vector and returns the vector.
Before I show you this function, I will save some typing work and abbreviate the vector and the 2d-vector with a using statement.
Please see:
// Some abbreviations for easier typing and reading
using DVec = std::vector<double>;
using DDVec = std::vector<DVec>;
// ---------------------------------------------------------------------------
// A function to retrieve a number of double values from a stream for one line
DVec getDVec(std::istream& is) {
// Read one complete line
std::string line{}; std::getline(is, line);
// Put it in an istringstream for better extracting
std::istringstream iss(line);
// And use the istream_iterator to iterate over all doubles and put the data in the resulting vector
return { std::istream_iterator<double>(iss), {} };
}
You see, a simple 3-line function. The last line may be difficult to understand for beginners; I will explain it below. Our function expects a reference to a stream as its input parameter and returns a std::vector<double> containing all doubles from one line.
So, first, we read a complete line into a variable of type std::string. Ultra simple, with the existing std::getline function.
Then, we put the string into a std::istringstream variable. This basically converts the string to a stream and allows us to use all stream functions on it. And remember why we did that: because we want to read a complete line and then extract the data from it. Now the last line:
return { std::istream_iterator<double>(iss), {} };
Uh, what's that? We expect to return a std::vector<double>. The compiler knows that we want to return such a type. Therefore it will kindly create a temporary of that type for us and use the vector's range constructor (number 5 in the reference) to initialize it. And with what?
You can read in the C++ reference that it expects 2 iterators: a begin iterator and an end iterator. Everything from the begin iterator up to, but not including, the end iterator is copied to the vector.
And the std::istream_iterator will simply call the extractor operator >> repeatedly and thereby read all doubles, until no more values can be read.
Cool!
Next, we can use this functionality in our class's extractor operator >>. It then looks like this:
// Simple extraction operator
friend std::istream& operator >>(std::istream& is, Coefficient& coefficient) {
// Get A and all the B coefficients
coefficient.B = std::move(getDVec(is >> coefficient.A >> std::ws));
// And in a simple for loop, read all C coefficients
for (DVec dVec{ getDVec(is) }; is and not dVec.empty(); dVec = getDVec(is))
coefficient.C.push_back(std::move(dVec));
return is;
}
It will first read
the value for A (>> coefficient.A),
then all white space that may follow in the stream (>> std::ws), and then
the line with the B coefficients (getDVec(is)).
Last but not least, we use a simple for loop to read all lines of the C coefficients and add them to the 2d output vector. The loop ends at the first empty line or at the end of the stream.
std::move avoids copying large data and gives us slightly better efficiency.
Output is even simpler: we use loops to show the data. Not much to explain here.
Now we have all functions. We made our life simpler by splitting up a big problem into smaller problems.
The final complete code would then look like this:
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
#include <utility>
// Some abbreviations for easier typing and reading
using DVec = std::vector<double>;
using DDVec = std::vector<DVec>;
// ---------------------------------------------------------------------------
// A function to retrieve a number of double values from a stream for one line
DVec getDVec(std::istream& is) {
// Read one complete line
std::string line{}; std::getline(is, line);
// Put it in an istringstream for better extracting
std::istringstream iss(line);
// And use the istream_iterator to iterate over all doubles and put the data in the resulting vector
return { std::istream_iterator<double>(iss), {} };
}
// -------------------------------------------------------------
// Coefficient class. Holds data and methods to operate on this data
struct Coefficient {
// The data
int A{};
DVec B{};
DDVec C{};
// Simple extraction operator
friend std::istream& operator >>(std::istream& is, Coefficient& coefficient) {
// Get A and all the B coefficients
coefficient.B = std::move(getDVec(is >> coefficient.A >> std::ws));
// And in a simple for loop, read all C coefficients
for (DVec dVec{ getDVec(is) }; is and not dVec.empty(); dVec = getDVec(is))
coefficient.C.push_back(std::move(dVec));
return is;
}
// Even more simple inserter operator. Output values in loops
friend std::ostream& operator << (std::ostream& os, const Coefficient& coefficient) {
os << coefficient.A << '\n';
for (const double d : coefficient.B) os << d << ' '; os << '\n';
for (const DVec& dv : coefficient.C) {
for (const double d : dv) os << d << ' '; os << '\n'; }
return os;
}
};
// Some example data in a stream
std::istringstream exampleFile{ R"(4
1 2 3
5 1 0 3
7 0 2 4
4 6 8 9 )" };
// Test/Driver code
int main() {
// Here we have our coefficients
Coefficient coefficient{};
// Simply extract all data from the file and store it in our coefficients variable
exampleFile >> coefficient;
// One-liner debug output
std::cout << coefficient;
}
Please again see the simple statements in main.
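The remark about (de)serialization, namely that the inserter and the extractor must agree on the format, can be checked with a round trip through a string stream. The sketch below restates the struct with free operator functions so that it is self-contained; readLine and roundTrips are hypothetical names. It assumes that B is not empty and that the values survive the default output precision of 6 significant digits.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using DVec = std::vector<double>;
using DDVec = std::vector<DVec>;

struct Coefficient {
    int A{};
    DVec B{};
    DDVec C{};
};

// Read one line and return all doubles found in it.
DVec readLine(std::istream& is) {
    std::string line;
    std::getline(is, line);
    std::istringstream iss(line);
    return DVec(std::istream_iterator<double>(iss), std::istream_iterator<double>());
}

// Write A, then B on one line, then one line per row of C.
std::ostream& operator<<(std::ostream& os, const Coefficient& c) {
    os << c.A << '\n';
    for (double d : c.B) os << d << ' ';
    os << '\n';
    for (const DVec& row : c.C) {
        for (double d : row) os << d << ' ';
        os << '\n';
    }
    return os;
}

// Read the same layout back; stops at the first empty line or end of stream.
std::istream& operator>>(std::istream& is, Coefficient& c) {
    is >> c.A >> std::ws;
    c.B = readLine(is);
    c.C.clear();
    for (DVec row = readLine(is); !row.empty(); row = readLine(is))
        c.C.push_back(std::move(row));
    return is;
}

// Serialize, deserialize, and compare with the original.
bool roundTrips(const Coefficient& original) {
    std::stringstream buffer;
    buffer << original;
    Coefficient copy;
    buffer >> copy;
    return copy.A == original.A && copy.B == original.B && copy.C == original.C;
}
```

A check like roundTrips is a cheap way to notice when one of the two operators is changed without the other.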
I hope I could help you a little.
Some additional notes.
In professional software development, code without comments is considered to have 0 quality.
Also, the guidelines on SO recommend, to not just dump code, but also give a comprehensive explanation.
Please do not use: while ( isf.good() ) { It is considered very bad practice and error prone, because the stream state is tested before the read instead of after it, so a failed read can still be processed.
If you made the decision to go to C++, you should try to move away from typical serial C programming and use a more object-oriented approach.
If you should have further questions then ask, I am happy to answer. Thank you for your question.
I am trying to figure out how to turn this input file that is in pipe delimited form into comma delimited. I have to open the file, read it into an array, convert it into comma delimited in an output CSV file and then close all files. I have been told that the easiest way to do is within excel but I am not quite sure how.
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
ifstream inFile;
string myArray[5];
cout << "Enter the input filename:";
cin >> inFileName;
inFile.open(inFileName);
if(inFile.is_open())
std::cout<<"File Opened"<<std::endl;
// read file line by line into array
cout<<"Read";
for(int i = 0; i < 5; ++i)
{
file >> myArray[i];
}
// File conversion
// close input file
inFile.close();
// close output file
outFile.close();
...
What I need to convert is:
Miles per hour|6,445|being the "second" team |5.54|9.98|6,555.00
"Ending" game| left at "beginning"|Elizabeth, New Jersey|25.25|6.78|987.01
|End at night, or during the day|"Let's go"|65,978.21|0.00|123.45
Left-base night|10/07/1900|||4.07|777.23
"Let's start it"|Start Baseball Game|Starting the new game to win
What the output should look like in comma-delimited form:
Miles per hour,"6,445","being the ""second"" team member",5.54,9.98,"6,555.00",
"""Ending"" game","left at ""beginning""","Denver, Colorado",25.25,6.78,987.01,
,"End at night, during the day","""Let's go""","65,978.21",0.00,123.45,
Left-base night, 10/07/1900,,,4.07,777.23,
"""Let's start it""", Start Baseball Game, Starting the new game to win,
I will show you a complete solution and explain it to you. But let's first have a look at it:
#include <iostream>
#include <vector>
#include <fstream>
#include <regex>
#include <string>
#include <algorithm>
// I omit in the example here the manual input of the filenames. This exercise can be done by somebody else
// Use fixed filenames in this example.
const std::string inputFileName("r:\\input.txt");
const std::string outputFileName("r:\\output.txt");
// The delimiter for the source csv file
std::regex re{ R"(\|)" };
std::string addQuotes(const std::string& s) {
// Escape embedded quotes: replace every quote character with two quote characters
std::string result = std::regex_replace(s, std::regex(R"(")"), R"("")");
// If there is any quote (") or comma in the string, then enclose the complete string in quotes
if (std::any_of(result.begin(), result.end(), [](const char c) { return ((c == '\"') || (c == ',')); })) {
result = "\"" + result + "\"";
}
return result;
}
// Some output function
void printData(std::vector<std::vector<std::string>>& v, std::ostream& os) {
// Go through all rows
std::for_each(v.begin(), v.end(), [&os](const std::vector<std::string>& vs) {
// Define delimiter
std::string delimiter{ "" };
// Show the delimited strings
for (const std::string& s : vs) {
os << delimiter << s;
delimiter = ",";
}
os << "\n";
});
}
int main() {
// We first open the output file, because if it cannot be opened, there is no point in doing the rest of the exercise
// Open output file and check, if it could be opened
if (std::ofstream outputFileStream(outputFileName); outputFileStream) {
// Open the input file and check, if it could be opened
if (std::ifstream inputFileStream(inputFileName); inputFileStream) {
// In this variable we will store all lines from the CSV file, already split into columns
std::vector<std::vector<std::string>> data{};
// Now read all lines of the CSV file and split it into tokens
for (std::string line{}; std::getline(inputFileStream, line); ) {
// Split line into tokens and add to our resulting data vector
data.emplace_back(std::vector<std::string>(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {}));
}
std::for_each(data.begin(), data.end(), [](std::vector<std::string>& vs) {
std::transform(vs.begin(), vs.end(), vs.begin(), addQuotes);
});
// Output, to file
printData(data, outputFileStream);
// And to the screen
printData(data, std::cout);
}
else {
std::cerr << "\n*** Error: could not open input file '" << inputFileName << "'\n";
}
}
else {
std::cerr << "\n*** Error: could not open output file '" << outputFileName << "'\n";
}
return 0;
}
So, let's have a look. We have the functions
main: reads the csv file, splits each line into tokens, converts them, and writes the result
addQuotes: adds quotes if necessary
printData: prints the converted data to an output stream
Let's start with main. main will first open the output file and then the input file.
The input file contains structured data of the kind usually called CSV (comma separated values). But here the delimiter is not a comma but a pipe symbol.
The result is typically stored in a 2d vector: one dimension is for the rows and the other is for the columns.
So, what do we need to do next? As we can see, we first need to read all complete text lines from the source stream. This can easily be done with a one-liner:
for (std::string line{}; std::getline(inputFileStream, line); ) {
As you can see here, the for statement has a declaration/initialization part, then a condition, and then a statement carried out at the end of each iteration. This is well known.
We first define a variable "line" of type std::string and use the default initializer to create an empty string. Then we use std::getline to read a complete line from the stream into our variable. std::getline returns a reference to the stream, and the stream has an overloaded bool operator that reports whether there was a failure (or end of file). So, the for loop does not need an additional check for the end of file. And we do not use the last statement of the for loop, because reading a line advances the file position automatically.
This gives us a very simple for loop for reading a complete file line by line.
Please note: defining the variable "line" in the for statement scopes it to the loop, meaning it is only visible inside the loop. This is generally a good way to avoid polluting the outer scope.
OK, now the next line:
data.emplace_back(std::vector<std::string>(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {}));
Uh oh, what is that?
OK, let's go step by step. First, we obviously want to add something to our 2-dimensional data vector. We use the std::vector's function emplace_back. We could also have used push_back; since the argument is a temporary, it is moved into the data vector rather than copied in either case.
And what do we want to add? We want to add a complete row, so a vector of columns. In our case a std::vector<std::string>. We construct this vector with the vector's range constructor (constructor number 5 in the reference). The range constructor takes 2 iterators, a begin and an end iterator, as parameters, and copies all values in the range between them into the vector.
So, we expect a begin and an end iterator. And what do we see here:
The begin iterator is: std::sregex_token_iterator(line.begin(), line.end(), re, -1)
And the end iterator is simply {}
But what is this thing, the sregex_token_iterator?
This is an iterator that iterates over patterns in a line. The pattern is given by a regex. You may want to read about the C++ regex library. Since it is very powerful, it unfortunately takes a while to learn, and I cannot cover it here. But let us describe its basic functionality for our purpose: you describe a pattern in a kind of meta language, and the std::sregex_token_iterator looks for that pattern. In our case the pattern is very simple: the pipe symbol. It is described with R"(\|)"; the backslash is needed because a bare | means "or" in a regex. The last parameter, -1, tells the iterator to return not the matches themselves, but the text between the matches, that is, the columns.
Now to the {} as the end iterator. You may have read that the {} will do default construction/initialization. And if you read here, number 1, then you see that the "default-constructor" constructs an end-of-sequence iterator. So, exactly what we need.
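The effect of the -1 parameter is easiest to see on a single line. The following sketch wraps the same technique in a small helper; the name splitOnPipe is made up for this illustration. Note that empty fields between two delimiters come out as empty strings, while an empty field after a trailing delimiter is not returned.

```cpp
#include <regex>
#include <string>
#include <vector>

// Split a pipe-delimited line into its fields.
// The submatch index -1 selects the text BETWEEN the matches of the regex.
std::vector<std::string> splitOnPipe(const std::string& line) {
    static const std::regex pipe(R"(\|)");
    return std::vector<std::string>(
        std::sregex_token_iterator(line.begin(), line.end(), pipe, -1),
        std::sregex_token_iterator());   // default-constructed: end of sequence
}
```

For the fourth line of the question's input, "Left-base night|10/07/1900|||4.07|777.23", this yields six fields, two of them empty.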
After we have read all data, we transform the single strings into the required output. This is done with std::transform and the function addQuotes.
The strategy here is to first double every quote character, so that each " becomes "".
Then we check whether there is any comma or quote in the string; if so, we additionally enclose the whole string in quotes.
And last, but not least, we have a simple output function and print the converted data into a file and on the screen.
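The quoting strategy described above can also be written without a regex, in a single pass over the string. This is only an alternative sketch, not the code of the answer; the name csvQuote is hypothetical. Like addQuotes, it only reacts to quotes and commas and does not treat embedded line breaks specially.

```cpp
#include <string>

// Double every embedded quote and, if the field contains a quote or a comma,
// wrap the whole field in quotes.
std::string csvQuote(const std::string& field) {
    std::string escaped;
    bool needsQuotes = false;
    for (char c : field) {
        if (c == '"') {
            escaped += "\"\"";      // one quote becomes two
            needsQuotes = true;
        } else {
            if (c == ',') needsQuotes = true;
            escaped += c;
        }
    }
    return needsQuotes ? '"' + escaped + '"' : escaped;
}
```

It has the same signature as addQuotes, so it could be passed to std::transform in the same way.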
We had an algorithm coding event in our school today, and they asked a question that no one could answer. I am trying to find an answer using only the standard library. (I am trying to solve this without .h files because in contests they want us to solve it like that.) So basically the question is as follows:
*Write a (C / C++) program that will get a graph model as an argument.
*You must get this values from console while we are starting your application.
*Your program must write down all the possible word combination by using the graph model.
Ex Input on Console to call your app: “yourapp.exe 5ABCD1BCD1CDE”
After your application name, second word gives you information about the graph.
Notation: [STEPS][FROM1-TO1-TO2-...TOn]1[FROM2-TO1-TO2-...TOn]1 .....
[STEPS] First integer value ( 5 in our example) is the maximum word length to measure.
[FROM TO ... TO] blocks show connections in the graph. Each node is symbolized with one upper-case letter. The first one is the connection's start position; the others are destinations. Each connection (link) is one way. So: “ABCD” means we have a connection from A to B, A to C and A to D.
The first node in the text is the start point for word creation.
This input means you have a graph like: https://imgur.com/BioHGqA
Desired Output:
A
AB
AC
AD
ABD
ABC
ABCD
ABCE
ACD
ACE
--------------------------------------------END OF THE QUESTION-----------------------------------------------------
I personally tried to find the index numbers in the input, the connection starts, etc., but I couldn't figure out how to solve this properly. Please help :=)
#include <iostream>
#include <string>
using namespace std;
int inputLength,maxLength,digitLength;
string word,digitIndex,starters;
int main(int argc, char *argv[])
{
//Saving the graph input as a variable named word
word = argv[1];
//Finding the max word and input lengths
maxLength=word[0] -'0';
inputLength=word.length();
cout<<"Your graph input "<<word<<endl;
cout<<"Maximum word length : "<<maxLength<<endl;
//Finding the digitIndexes in input.
for(int i=0;i<inputLength;i++){
if(isdigit(word[i])){
digitIndex+=to_string(i);
}
}
digitLength=digitIndex.length();
cout<<"digit indexes : "<<digitIndex<<endl;
cout<<"digitindex[1] : "<<digitIndex[1]-'0'<<endl;
cout<<"your word : "<<word<<endl;
//Finding the connection starts
for(int i=0; i<inputLength;i++){
if(isdigit(word[i])==true){
starters+=word[i+1];
}
}
cout<<"starters : "<<starters<<endl;
}
Interesting problem, but easy to implement using the standard algorithms and recursive calls. The choice of data structure also helps in designing such an application.
Unfortunately, the description of the input format is not fully clear. I understand the “steps” part, but there is no description for the rest of the digits. I assume that they are simply delimiters and have no further meaning.
We will split the big task into some subtasks. And, we will use a class Graph, where we store all needed data and functions.
So, obviously the first task is, to split the input string. For that we extract the first characters consisting of digits and convert them to the integer value “steps”. The rest of the string will be tokenized by using a C++ standard functionality: The std::sregex_token_iterator.
This is an iterator for iterating over a std::string, hence the "s" in "sregex". The begin()/end() part defines the range of input to operate on. Then comes a std::regex that describes what should be matched, or not matched, in the input string. The matching strategy is given by the last parameter:
1 --> give me what the first capture group of the regex matched, and
-1 --> give me what is NOT matched by the regex
We can use this iterator for storing the tokens in a std::vector. The std::vector has a range constructor, which takes 2 iterators as parameter, and copies the data between the first iterator and 2nd iterator to the std::vector. The statement:
std::vector<std::string> split(std::sregex_token_iterator(init.begin(), init.end(), re, 1), {});
defines a variable “split” as a std::vector and uses the so called range-constructor of the std::vector.
You can see that I do not write the std::sregex_token_iterator's end iterator explicitly. It is constructed from the empty brace-enclosed initializer list {} with the correct type, because the std::vector constructor requires both arguments to have the same type, so the type is deduced from the first argument.
Then we transform the resulting “from-to”-strings to our target storage. This is a std::map consisting of the “from” character as a key, and a std::vector of char’s (the targets) as the value. With that we always have an association from one start point (Vertex) to all end points and with that, implicitly the “Edges”. This data structure will span a virtual tree, which we can later traverse to find the required result.
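The parsing steps described so far (leading digits become "steps", the letter blocks become map entries) can be sketched in isolation. The names ParsedGraph, AdjacencyMap and parseGraph are invented for this illustration. Unlike the full example, this sketch takes the whole match (submatch 0) of a simpler regex and appends targets if the same start letter appears twice; it does no further input validation.

```cpp
#include <cstddef>
#include <map>
#include <regex>
#include <string>
#include <vector>

using AdjacencyMap = std::map<char, std::vector<char>>;

struct ParsedGraph {
    int steps = 0;          // maximum word length
    char root = '\0';       // first node mentioned in the input
    AdjacencyMap fromTo;    // start vertex -> all target vertices
};

ParsedGraph parseGraph(const std::string& input) {
    ParsedGraph result;
    std::size_t pos = 0;
    // std::stoi throws std::invalid_argument if there are no leading digits
    result.steps = std::stoi(input, &pos);
    const std::string rest = input.substr(pos);

    // Every run of letters is one block: first letter = from, the others = to.
    static const std::regex letters("[A-Za-z]+");
    for (std::sregex_token_iterator it(rest.begin(), rest.end(), letters, 0), end;
         it != end; ++it) {
        const std::string block = it->str();
        if (result.root == '\0') result.root = block[0];
        std::vector<char>& targets = result.fromTo[block[0]];
        targets.insert(targets.end(), block.begin() + 1, block.end());
    }
    return result;
}
```

For the input "5ABCD1BCD1CDE" this gives steps 5, root 'A', and three map entries: A to B, C, D; B to C, D; C to D, E.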
We put all this in an "init" function, which is called from the class constructor and also from the overloaded extractor operator. I added the extractor as additional functionality to make life easier. So you do not need to use the main function's argc and argv, but can read directly from std::cin via:
Graph graph;
std::cin >> graph;
Now we have all data in our map and can start to build the solution. We will store all resulting “ways” in a std::vector of std::string. For building the “ways” we “track” the way through the “graph”. So, every time, when we see a new vertex, we add it to the “track” and if we reach the end of a route or if the route is longer than “steps”, we store the new track in “ways”.
So, the OP requested a special output format. To create this, we must use a "breadth first" or "level order" traversal. Meaning, before we descend (with a simple recursive algorithm), we need to go horizontally, resulting in 2 for loops. But no problem. Very simple.
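For comparison, a strict level-order enumeration can also be written iteratively with a queue. This is a different technique from the recursive function used in this answer, and the order of its output differs from the listing in the question (all words of length 2 come first, then length 3, and so on), although the set of words is the same for the example graph. The names Graph and wordsByLevel are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <queue>
#include <string>
#include <vector>

using Graph = std::map<char, std::vector<char>>;

// Enumerate all words starting at root, shortest words first,
// up to a maximum length of maxLength characters.
std::vector<std::string> wordsByLevel(const Graph& graph, char root, std::size_t maxLength) {
    std::vector<std::string> words;
    if (maxLength == 0) return words;

    std::queue<std::string> pending;
    pending.push(std::string(1, root));
    while (!pending.empty()) {
        std::string word = pending.front();
        pending.pop();
        words.push_back(word);

        if (word.size() == maxLength) continue;        // do not grow any further
        auto node = graph.find(word.back());
        if (node == graph.end()) continue;             // no outgoing edges
        for (char next : node->second) pending.push(word + next);
    }
    return words;
}
```

The length limit also keeps the loop finite if the graph contains cycles.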
For simple output, I have overloaded the inserter operator.
And I validate the input (lowercase letters are also allowed).
Please see the full working example below.
#include <iostream>
#include <vector>
#include <utility>
#include <regex>
#include <map>
#include <iterator>
#include <string>
#include <algorithm>
//std::string test{"5ABCD1BCD1CDE"};
std::regex re1(R"(([a-zA-Z]+))");
std::regex re2(R"(([0-9]+[a-zA-Z]+)+)");
using Map = std::map<char, std::vector<char>>;
class Graph {
public:
// Constructor
Graph() : steps(), fromTo(), root(), ways(), track() {}
Graph(const std::string input) : steps(), fromTo(), root(), ways(), track() { init(input); }
// Build the result
void build() { int level{ 0 }; rBuild(root, level); }
// inserter
friend std::ostream& operator << (std::ostream& os, const Graph& g) {
std::copy(g.ways.begin(), g.ways.end(), std::ostream_iterator<std::string>(os, " "));
return os << "\n";
}
// extractor
friend std::istream& operator >> (std::istream& is, Graph& g) {
if (std::string input{}; std::getline(is, input)) g.init(input);
return is;
}
private:
// Values derived from input
int steps{};
Map fromTo{};
char root{};
// The result
std::vector<std::string> ways{};
std::string track{};
// Recursive function to build all ways
void rBuild(const char vertex, int& level);
// Initialize source values
void init(const std::string& input);
};
void Graph::init(const std::string& input) {
fromTo.clear(); ways.clear(); track.clear(); steps = 0; root = '\0';
if (std::regex_match(input, re2)) {
// Get steps
size_t pos{}; steps = std::stoi(input, &pos); std::string init = input.substr(pos);
// Split string into substrings
std::vector<std::string> split(std::sregex_token_iterator(init.begin(), init.end(), re1, 1), {});
// Get root
root = split[0][0]; track += root; ways.push_back(track);
// Convert substrings to map entries
std::transform(split.begin(), split.end(), std::inserter(fromTo, fromTo.end()), [](std::string & s) {
return std::make_pair(s[0], std::vector<char>(std::next(s.begin()), s.end())); });
}
else
std::cerr << "\n***** Error: Wrong input format\n";
}
// Recursive function to build all ways through the graph
void Graph::rBuild(const char vertex, int& level) {
// Allow only a certain depth, while descending
if (level < steps-1) {
// Search the start point for this entry
if (Map::iterator node{ fromTo.find(vertex) }; node != fromTo.end()) {
// Go through all edges to just the next vertex. This is not a breadth first traversal
// So, first we will go horizontally
for (const char to : node->second) {
// We want to track the way that we were going so far
track.push_back(to);
// Saving this track as a new way
ways.push_back(track);
// Restoring the original track before this way, so that we can generate the next way
track.pop_back();
}
// and now we will descend
for (const char to : node->second) {
// One level further down
++level;
// track will be one vertex longer
track.push_back(to);
// Recursive call, descent
rBuild(to, level);
// And backwards
track.pop_back();
--level;
}
}
}
}
int main(int argc, char* argv[]) {
if (argc == 2) {
std::string test = argv[1];
// Define and initialize the graph
Graph graph(test);
// Build the required strings
graph.build();
// Show result
std::cout << graph;
}
else {
std::cout << "\nEnter init string: \n";
if (Graph graph; std::cin >> graph) {
// Build the required strings
graph.build();
// Show result
std::cout << "\nResult:\n" << graph << "\n";
}
}
return 0;
}
What a pity that nobody will read that . . .
C++14
The university staff recommended that we use Boost to parse the file, but I installed it and have not managed to implement anything with it.
So I have to parse a CSV file line by line, where each line has 2 columns separated by a comma. Each of these two columns is a digit. I have to take the integer values of these two digits and use them to construct my Fractal objects at the end.
The first problem is: The file can look like for example so:
1,1
<HERE WE HAVE A NEWLINE>
<HERE WE HAVE A NEWLINE>
This file format is okay. But my solution outputs "Invalid input" for it, whereas the correct solution is supposed to print the corresponding fractal (1,1) exactly once.
The second problem is: The file can look like:
1,1
<HERE WE HAVE A NEWLINE>
1,1
This is supposed to be invalid input, but my solution treats it as correct and just skips over the NEWLINE in the middle.
Maybe you can guide me on how to fix these issues. It would really help, as I have been struggling with this exercise for 3 days from morning to evening.
This is my current parser:
#include <iostream>
#include "Fractal.h"
#include <fstream>
#include <stack>
#include <sstream>
const char *usgErr = "Usage: FractalDrawer <file path>\n";
const char *invalidErr = "Invalid input\n";
const char *VALIDEXT = "csv";
const char EXTDOT = '.';
const char COMMA = ',';
const char MINTYPE = 1;
const char MAXTYPE = 3;
const int MINDIM = 1;
const int MAXDIM = 6;
const int NUBEROFARGS = 2;
int main(int argc, char *argv[])
{
if (argc != NUBEROFARGS)
{
std::cerr << usgErr;
std::exit(EXIT_FAILURE);
}
std::stack<Fractal *> resToPrint;
std::string filepath = argv[1]; // Can be a relative/absolute path
if (filepath.substr(filepath.find_last_of(EXTDOT) + 1) != VALIDEXT)
{
std::cerr << invalidErr;
exit(EXIT_FAILURE);
}
std::stringstream ss; // Treat it as a buffer to parse each line
std::string s; // Use it with 'ss' to convert char digit to int
std::ifstream myFile; // Declare the input file stream
myFile.open(filepath); // Open CSV file
if (!myFile) // If failed to open the file
{
std::cerr << invalidErr;
exit(EXIT_FAILURE);
}
int type = 0;
int dim = 0;
while (myFile.peek() != EOF)
{
getline(myFile, s, COMMA); // Read to comma - the kind of fractal, store it in s
ss << s << WHITESPACE; // Save the number in ss delimited by ' ' to be able to perform the double assignment
s.clear(); // We don't want to keep this number in s, so it cannot be assigned somewhere else
getline(myFile, s, NEWLINE); // Read to NEWLINE - the dim of the fractal
ss << s;
ss >> type >> dim; // Double assignment
s.clear(); // We don't want to keep this number in s, so it cannot be assigned somewhere else
if (ss.peek() != EOF || type < MINTYPE || type > MAXTYPE || dim < MINDIM || dim > MAXDIM)
{
std::cerr << invalidErr;
std::exit(EXIT_FAILURE);
}
resToPrint.push(FractalFactory::factoryMethod(type, dim));
ss.clear(); // Clear the buffer to update new values of the next line at the next iteration
}
while (!resToPrint.empty())
{
std::cout << *(resToPrint.top()) << std::endl;
resToPrint.pop();
}
myFile.close();
return 0;
}
You do not need anything special to parse .csv files; the standard library containers and streams from C++11 onward provide all the tools necessary to parse virtually any .csv file. You do not need to know the number of values per row beforehand, though you do need to know the type of the values you are reading in order to apply the proper conversion. You do not need a third-party library like Boost either.
There are many ways to store the values parsed from a .csv file. The basic "handle any type" approach is to store the values in a std::vector<std::vector<type>> (which essentially provides a vector of vectors holding the values parsed from each line). You can specialize the storage as needed depending on the type you are reading and how you need to convert and store the values. Your base storage can be struct/class, std::pair, std::set, or just a basic type like int. Whatever fits your data.
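As a rough illustration of specializing the storage: if a file is known to hold exactly two integers per line, each line could go into a std::pair<int, int> instead of a nested vector. This is only a sketch under that assumption; the function name readPairs is made up here, and lines that do not look like int,int are simply skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: read "int,int" lines into a vector of pairs.
// Lines that do not match that shape (blank lines, text) are skipped.
std::vector<std::pair<int, int>> readPairs(std::istream& in) {
    std::vector<std::pair<int, int>> pairs;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream ss(line);
        int first = 0, second = 0;
        char sep = '\0';
        // Extract the first int, the separator character, then the second int
        if (ss >> first >> sep >> second && sep == ',')
            pairs.emplace_back(first, second);
    }
    return pairs;
}
```

Calling readPairs on an open std::ifstream (or a std::istringstream for testing) returns one pair per well-formed line.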
In your case you have basic int values in your file. The only caveat to a basic .csv parse is that you may have blank lines between the lines of values. That is easily handled by any number of tests. For instance, you can check whether the .length() of the line read is zero, or, for a bit more flexibility (to handle lines containing only whitespace or other non-value characters), you can use .find_first_of() to look for the first wanted character in the line and decide whether it is a line to parse.
For example, in your case, your read loop for your lines of value can simply read each line and check whether the line contains a digit. It can be as simple as:
...
std::string line; /* string to hold each line read from file */
std::vector<std::vector<int>> values {}; /* vector vector of int */
std::ifstream f (argv[1]); /* file stream to read */
while (getline (f, line)) { /* read each line into line */
/* if no digits in line - get next */
if (line.find_first_of("0123456789") == std::string::npos)
continue;
...
}
Above, each line is read into line and then line is checked on whether or not it contains digits. If so, parse it. If not, go get the next line and try again.
If it is a line containing values, then you can create a std::stringstream from the line and read integer values from the stringstream into a temporary int value and add the value to a temporary vector of int, consume the comma with getline and the delimiter ',', and when you run out of values to read from the line, add the temporary vector of int to your final storage. (Repeat until all lines are read).
Your complete read loop could be:
while (getline (f, line)) { /* read each line into line */
/* if no digits in line - get next */
if (line.find_first_of("0123456789") == std::string::npos)
continue;
int itmp; /* temporary int */
std::vector<int> tmp; /* temporary vector<int> */
std::stringstream ss (line); /* stringstream from line */
while (ss >> itmp) { /* read int from stringstream */
std::string tmpstr; /* temporary string to ',' */
tmp.push_back(itmp); /* add int to tmp */
if (!getline (ss, tmpstr, ',')) /* read to ',' w/tmpstr */
break; /* done if no more ',' */
}
values.push_back (tmp); /* add tmp vector to values */
}
There is no limit on the number of values read per line, or on the number of lines of values read per file (up to the limits of your virtual memory for storage).
Putting the above together in a short example, you could do something similar to the following which just reads your input file and then outputs the collected integers when done:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
int main (int argc, char **argv) {
if (argc < 2) { /* validate at least 1 argument given for filename */
std::cerr << "error: insufficient input.\nusage: ./prog <filename>\n";
return 1;
}
std::string line; /* string to hold each line read from file */
std::vector<std::vector<int>> values {}; /* vector vector of int */
std::ifstream f (argv[1]); /* file stream to read */
while (getline (f, line)) { /* read each line into line */
/* if no digits in line - get next */
if (line.find_first_of("0123456789") == std::string::npos)
continue;
int itmp; /* temporary int */
std::vector<int> tmp; /* temporary vector<int> */
std::stringstream ss (line); /* stringstream from line */
while (ss >> itmp) { /* read int from stringstream */
std::string tmpstr; /* temporary string to ',' */
tmp.push_back(itmp); /* add int to tmp */
if (!getline (ss, tmpstr, ',')) /* read to ',' w/tmpstr */
break; /* done if no more ',' */
}
values.push_back (tmp); /* add tmp vector to values */
}
for (const auto& row : values) { /* output collected values */
for (auto col : row)
std::cout << " " << col;
std::cout << '\n';
}
}
Example Input File
Using an input file with miscellaneous blank lines and two-integers per-line on the lines containing values as you describe in your question:
$ cat dat/csvspaces.csv
1,1
2,2
3,3
4,4
5,5
6,6
7,7
8,8
9,9
Example Use/Output
The resulting parse:
$ ./bin/parsecsv dat/csvspaces.csv
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
Example Input Unknown/Uneven No. of Columns
You don't need to know the number of values per-line in the .csv or the number of lines of values in the file. The STL containers handle the memory allocation needs automatically allowing you to parse whatever you need. Now you may want to enforce some fixed number of values per-row, or rows per-file, but that is simply up to you to add simple counters and checks to your read/parse routine to limit the values stored as needed.
Without any changes to the code above, it will handle any number of comma-separated-values per-line. For example, changing your data file to:
$ cat dat/csvspaces2.csv
1
2,2
3,3,3
4,4,4,4
5,5,5,5,5
6,6,6,6,6,6
7,7,7,7,7,7,7
8,8,8,8,8,8,8,8
9,9,9,9,9,9,9,9,9
Example Use/Output
Results in the expected parse of each value from each line, e.g.:
$ ./bin/parsecsv dat/csvspaces2.csv
1
2 2
3 3 3
4 4 4 4
5 5 5 5 5
6 6 6 6 6 6
7 7 7 7 7 7 7
8 8 8 8 8 8 8 8
9 9 9 9 9 9 9 9 9
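The "simple counters and checks" mentioned above could be sketched as follows for the rules described in the question: exactly two integers per line, blank lines tolerated only after the last data line. The function name readStrict is hypothetical, the range checks on type and dimension are left out, and Unix line endings are assumed (a line holding only '\r' would not count as blank here).

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical validator: returns false on the first malformed line.
// Assumed rules (taken from the question): exactly two ints per line,
// blank lines allowed only after the last data line.
bool readStrict(std::istream& in, std::vector<std::pair<int, int>>& out) {
    std::string line;
    bool seenBlank = false;
    while (std::getline(in, line)) {
        if (line.empty()) {          // remember the blank line, decide later
            seenBlank = true;
            continue;
        }
        if (seenBlank)               // data after a blank line: reject
            return false;
        std::istringstream ss(line);
        int a = 0, b = 0;
        char sep = '\0';
        if (!(ss >> a >> sep >> b) || sep != ',')
            return false;            // not of the form int,int
        if (ss >> sep)               // something is left over (third column, text)
            return false;
        out.emplace_back(a, b);
    }
    return true;
}
```

With this approach a file ending in several newlines is accepted, while a blank line sandwiched between two data lines is reported as invalid.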
Let me know if you have questions that I didn't cover or if you have additional questions about something I did and I'm happy to help further.
I will not update your code. I looked at your title, Parsing a CSV file - C++, and would like to show you how to read CSV files in a more modern way. Unfortunately you are still on C++14. With C++20 or the ranges library it would be ultra simple using getlines and split.
And in C++17 we could use CTAD and if with initializer and so on.
But what we do not need is Boost. C++'s standard library is sufficient. And we never use scanf and old stuff like that.
And in my very humble opinion the link to the 10 years old question How can I read and parse CSV files in C++? should not be given any longer. It is the year 2020 now. And more modern and now available language elements should be used. But as said. Everybody is free to do what he wants.
In C++ we can use the std::sregex_token_iterator, and its usage is ultra simple. It will also not slow down your program dramatically. A double std::getline would also be OK, although it is not as flexible: the number of columns must be known for that. The std::sregex_token_iterator does not care about the number of columns.
Please see the following example code. In it, we create a tiny proxy class and overload its extractor operator. Then we use the std::istream_iterator to read and parse the whole CSV file in a small one-liner.
#include <algorithm>
#include <fstream>
#include <iostream>
#include <iterator>
#include <regex>
#include <string>
#include <vector>
// Define Alias for easier Reading
// using Columns = std::vector<std::string>;
using Columns = std::vector<int>;
// The delimiter
const std::regex re(",");
// Proxy for the input Iterator
struct ColumnProxy {
// Overload extractor. Read a complete line
friend std::istream& operator>>(std::istream& is, ColumnProxy& cp) {
// Read a line
std::string line;
cp.columns.clear();
if(std::getline(is, line) && !line.empty()) {
// Split values and copy into resulting vector
std::transform(
std::sregex_token_iterator(line.begin(), line.end(), re, -1), {},
std::back_inserter(cp.columns),
[](const std::string& s) { return std::stoi(s); });
}
return is;
}
// Type cast operator overload. Convert the proxy to the
// underlying 'Columns' type (here std::vector<int>)
operator Columns() const { return columns; }
protected:
// Temporary to hold the read vector
Columns columns{};
};
int main() {
std::ifstream myFile("r:\\log.txt");
if(myFile) {
// Read the complete file, parse everything and store the result in a vector
std::vector<Columns> values(std::istream_iterator<ColumnProxy>(myFile), {});
// Show complete csv data
std::for_each(values.begin(), values.end(), [](const Columns& c) {
std::copy(c.begin(), c.end(),
std::ostream_iterator<int>(std::cout, " "));
std::cout << "\n";
});
}
return 0;
}
Please note: There are tons of other possible solutions. Please feel free to use whatever you want.
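For comparison, the "double std::getline" alternative mentioned above might look roughly like this when the number of columns is known to be two. This is only a sketch: the function name readTwoColumns is hypothetical, and std::stoi will throw on a non-numeric field, which is not handled here.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: the outer getline splits the input into lines,
// the inner getline splits each line at the comma.
// Assumes exactly two numeric columns per non-blank line.
std::vector<std::pair<int, int>> readTwoColumns(std::istream& in) {
    std::vector<std::pair<int, int>> rows;
    for (std::string line; std::getline(in, line); ) {
        if (line.empty()) continue;              // skip blank lines
        std::istringstream ls(line);
        std::string left, right;
        // First getline reads up to the comma, second reads the rest of the line
        if (std::getline(ls, left, ',') && std::getline(ls, right))
            rows.emplace_back(std::stoi(left), std::stoi(right));
    }
    return rows;
}
```

The fixed pair of getline calls is exactly why this variant needs to know the column count in advance.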
EDIT
Because I see a lot of complicated code here, I would like to show a 2nd example of how to approach Parsing a CSV file - C++.
Basically, you do not need more than 2 statements in the code. You first define a regex for digits. Then you use a C++ library element that was designed exactly for the purpose of tokenizing strings into substrings: the std::sregex_token_iterator. Because such a well-fitting element has been available in C++ for years, it may be worth considering. You could basically do the task in 2 lines instead of 10 or more, and it is easy to understand.
But of course, there are thousands of possible solutions; some like to continue in C style and others prefer more modern C++ features. That is everybody's personal decision.
The code below reads the CSV file as specified, regardless of how many rows (lines) it contains and how many columns each row has. Even foreign characters can be in it. An empty row will become an empty entry in the csv vector. This can also easily be prevented with an "if !empty" check before the emplace_back.
But some like so and the other like so. Whatever people want.
Please see a general example:
#include <algorithm>
#include <iterator>
#include <iostream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>
// Test data. Can of course also be taken from a file stream.
std::stringstream testFile{ R"(1,2
3, a, 4
5 , 6 b , 7
abc def
8 , 9
11 12 13 14 15 16 17)" };
std::regex digits{R"((\d+))"};
using Row = std::vector<std::string>;
int main() {
// Here we will store all the data from the CSV as std::vector<std::vector<std::string>>
std::vector<Row> csv{};
// These 2 extremely simple lines will read the complete CSV and parse the data
for (std::string line{}; std::getline(testFile, line); )
csv.emplace_back(Row(std::sregex_token_iterator(line.begin(), line.end(), digits, 1), {}));
// Now, you can do with the data, whatever you want. For example: Print double the value
std::for_each(csv.begin(), csv.end(), [](const Row& r) {
if (!r.empty()) {
std::transform(r.begin(), r.end(), std::ostream_iterator<int>(std::cout, " "), [](const std::string& s) {
return std::stoi(s) * 2; }
); std::cout << "\n";}});
return 0;
}
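The "if !empty" guard mentioned before the example could be sketched like this, so that lines without any digits never reach the result. The function name parseDigits is hypothetical; the regex is the same (\d+) pattern as in the loop above.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

using Row = std::vector<std::string>;

// Hypothetical variant of the loop above: rows without any digits are dropped
std::vector<Row> parseDigits(std::istream& in) {
    static const std::regex digits{R"((\d+))"};
    std::vector<Row> csv;
    for (std::string line; std::getline(in, line); ) {
        // Collect every run of digits found in the line
        Row row(std::sregex_token_iterator(line.begin(), line.end(), digits, 1), {});
        if (!row.empty())               // the "if !empty" check
            csv.push_back(std::move(row));
    }
    return csv;
}
```

The printing code then no longer needs its own emptiness test, since every stored row has at least one value.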
So, now, you may get the idea, you may like it, or you do not like it. Whatever. Feel free to do whatever you want.
I have a file that I need to read in. Each line of the file is exceedingly long, so I'd rather not read each line to a temporary string and then manipulate those strings (unless this isn't actually inefficient - I could be wrong). Each line of the file contains a string of triplets - two numbers and a complex number, separated by a colon (as opposed to a comma, which is used in the complex number). My current code goes something like this:
while (states.eof() == 0)
{
std::istringstream complexString;
getline(states, tmp_str, ':');
tmp_triplet.row() = stoi(tmp_str);
getline(states, tmp_str, ':');
tmp_triplet.col() = stoi(tmp_str);
getline(states, tmp_str, ':');
complexString.str (tmp_str);
complexString >> tmp_triplet.value();
// Then something useful done with the triplet before moving onto the next one
}
tmp_triplet is a variable that stores these three numbers. I want some way to run a function every line (specifically, the triplets in every line are pushed into a vector, and each line in the file denotes a different vector). I'm sure there's an easy way to go about this, but I just want a way to check whether the end of the line has been reached, and to run a function when this is the case.
When trying to plan stuff out, abstraction can be your best friend. If you break down what you want to do by abstract functionality, you can more easily decide what data types should be used and how different data types should be planned out, and often you can find some functions almost write themselves. And typically, your code will be more modular (almost by definition), which will make it easy to reuse, maintain, and adapt if future changes are needed.
For example, it sounds like you want to parse a file. So that should be a function.
To do that function, you want to read in the file lines then process the file lines. So you can make two functions, one for each of those actions, and just call the functions.
To read in file lines you just want to take a file stream, and return a collection of strings for each line.
To process file lines you want to take a collection of strings and for each one parse the string into a triplet value. So you can create a method that takes a string and breaks it into a triplet, and just use that method here.
To process a string you just need to take a string and assign the first part as the row, the second part as the column, and the third part as the value.
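#include <istream>
#include <sstream>
#include <string>
#include <vector>
struct TripletValue
{
int Row;
int Col;
int Val;
};
// Forward declarations, so the functions can be defined top-down below
std::vector<std::string> ReadFileLines(std::istream& inputStream);
std::vector<TripletValue> GetValuesFromData(std::vector<std::string> data);
TripletValue ParseLine(std::string fileLine);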
std::vector<TripletValue> ParseFile(std::istream& inputStream)
{
std::vector<std::string> fileLines = ReadFileLines(inputStream);
std::vector<TripletValue> parsedValues = GetValuesFromData(fileLines);
return parsedValues;
}
std::vector<std::string> ReadFileLines(std::istream& inputStream)
{
std::vector<std::string> fileLines;
std::string fileLine;
// Loop on the result of getline, so a failed read at end of file adds no extra empty line
while (getline(inputStream, fileLine))
{
fileLines.push_back(fileLine);
}
return fileLines;
}
std::vector<TripletValue> GetValuesFromData(std::vector<std::string> data)
{
std::vector<TripletValue> values;
for (const std::string& line : data)
{
TripletValue parsedValue = ParseLine(line);
values.push_back(parsedValue);
}
return values;
}
TripletValue ParseLine(std::string fileLine)
{
std::stringstream sstream;
sstream << fileLine;
TripletValue parsedValue;
std::string strValue;
sstream >> strValue;
parsedValue.Row = stoi(strValue);
sstream >> strValue;
parsedValue.Col = stoi(strValue);
sstream >> strValue;
parsedValue.Val = stoi(strValue);
return parsedValue;
}
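The code above assumes one whitespace-separated triplet of ints per line. For the format described in the question (many triplets per line, fields separated by ':', the third field a complex number), a rough sketch could look like the following. The Triplet struct and the function name readTripletLines are hypothetical, and the complex values are assumed to be written as (re,im), which is the form operator>> for std::complex accepts.

```cpp
#include <complex>
#include <istream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical triplet type; the question's own type is not shown
struct Triplet {
    int row;
    int col;
    std::complex<double> value;
};

// One vector of triplets per input line. Assumes fields are separated by ':'
// and complex values are written as (re,im).
std::vector<std::vector<Triplet>> readTripletLines(std::istream& in) {
    std::vector<std::vector<Triplet>> result;
    for (std::string line; std::getline(in, line); ) {
        std::istringstream ls(line);
        std::vector<Triplet> triplets;
        std::string rowStr, colStr, valStr;
        // Three colon-delimited fields make up one triplet
        while (std::getline(ls, rowStr, ':') &&
               std::getline(ls, colStr, ':') &&
               std::getline(ls, valStr, ':')) {
            Triplet t{};
            t.row = std::stoi(rowStr);
            t.col = std::stoi(colStr);
            std::istringstream vs(valStr);
            vs >> t.value;              // parses the (re,im) form
            triplets.push_back(t);
        }
        // End of the line reached: this is the place to act on the whole line
        result.push_back(std::move(triplets));
    }
    return result;
}
```

Reading each line into a string first keeps the end-of-line detection trivial: the inner loop ends when the line is exhausted, and whatever should happen once per line goes right after it. An empty input line produces an empty vector in this sketch.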