Constructing a Graph of Strings (Levenshtein Distance)


Currently, in my computer-science course, we are discussing graphs and how to find shortest distance using graphs. I received an assignment about a week ago where the teacher gave us the code for a graph using integers, and we have to adapt it to be able to calculate Levenshtein Distance using a list of words. The problem I'm having though is that I don't understand really how graphs work enough to manipulate one. I've tried googling graphs in c++ but none of the things I found resemble the type of program I was given.
We just finished a unit on linked lists, and I think graphs operate similarly? I understand that each node will point to many other nodes, but in a case where I have 2000 words all pointing to each other, how do I keep track of 2000 pointers per node without declaring that many nodes in my struct? I believe (not 100%) that in the program I was given my teacher used a vector of integer vectors to keep track but I don't know how to implement that.
I'm not asking anyone to fully comment each line as that is an enormous amount of work, but if someone could roughly explain how I would accomplish what I asked above and perhaps read the code and give me a rough understanding of what some sections mean (I'll put comments on some sections I'm specifically having trouble understanding) I would be extremely grateful.
Here is the code we were given:
#include <iostream>
#include <vector>
#include <algorithm> //for max<>
#include <limits>
using namespace std;
typedef vector <int> ivec;
typedef vector <ivec> imatrix; //A vector of vectors, not sure how this works or how to implement
typedef vector <bool> bvec;
struct graph
{
imatrix edges; //list of attached vertices for each node
int numVertices;
};
//I understand the ostream overloading
ostream & operator << (ostream & stream, ivec &vec)
{
for (int i = 0; i < vec.size(); i++)
{
stream << vec[i] << " ";
}
return stream;
}
ostream & operator << (ostream & stream, graph &g)
{
stream << endl << "numVert = " << g.numVertices << endl;
for (int i = 0; i < g.numVertices; i++)
{
stream << "vertex = " << i+1 << " | edges = " << g.edges[i] << endl;
}
return stream;
}
const int sentinel = -1;
bvec inTree;
ivec distanceNodes;
ivec parents;
void initGraph(graph * g);
void insertEdge(graph * g, int nodeNum, int edgeNum);
void initSearch(graph * g);
void shortestPath(graph * g, int start, int end);
int main()
{
//I understand the main, the two numbers in insertEdge are being hooked together and the two numbers in shortestPath are what we are looking to connect in the shortest way possible
graph g;
initGraph(&g);
insertEdge(&g, 1, 2);
insertEdge(&g, 1, 3);
insertEdge(&g, 2, 1);
insertEdge(&g, 2, 3);
insertEdge(&g, 2, 4);
insertEdge(&g, 3, 1);
insertEdge(&g, 3, 2);
insertEdge(&g, 3, 4);
insertEdge(&g, 4, 2);
insertEdge(&g, 4, 3);
insertEdge(&g, 4, 5);
insertEdge(&g, 5, 4);
insertEdge(&g, 6, 7);
insertEdge(&g, 7, 6);
cout << "The graph is " << g << endl;
shortestPath(&g, 1, 5);
shortestPath(&g, 2, 4);
shortestPath(&g, 5, 2);
shortestPath(&g, 1, 7);
return 0;
}
void initGraph(graph * g)
{
g -> numVertices = 0; //Why set the number of vertices to 0?
}
void insertEdge(graph * g, int nodeNum, int edgeNum)
{
int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other
numVertices = max(1, numVertices);
if (numVertices > g->numVertices)
{
for (int i = g->numVertices; i <= numVertices; i++)
{
ivec nodes;
if (g->edges.size() < i)
{
g -> edges.push_back(nodes);
}
}
g->numVertices = numVertices;
}
g->edges[nodeNum - 1].push_back(edgeNum);
}
void initSearch(graph * g) //I believe this function simply resets the values from a previous search
{
if (g == NULL)
{
return;
}
inTree.clear();
distanceNodes.clear();
parents.clear();
for (int i = 0; i <= g->numVertices; i++)
{
inTree.push_back(false);
distanceNodes.push_back(numeric_limits <int> :: max());
parents.push_back(sentinel);
}
}
void shortestPath(graph * g, int start, int end)
{
//Very confused about how this function works
initSearch(g);
int edge;
int curr; //current node
int dist;
distanceNodes[start] = 0;
curr = start;
while (! inTree[curr])
{
inTree[curr] = true;
ivec edges = g->edges[curr - 1];
for (int i = 0; i < edges.size(); i++)
{
edge = edges[i];
if (distanceNodes[edge] > distanceNodes[curr] + 1)
{
distanceNodes[edge] = distanceNodes[curr] + 1;
parents[edge] = curr;
}
}
curr = 1;
dist = numeric_limits <int> :: max();
for (int i = 1; i <= g->numVertices; i++)
{
if ((!inTree[i]) && (dist > distanceNodes[i]))
{
dist = distanceNodes[i];
curr = i;
}
}
}
ivec path;
if (distanceNodes[end] == numeric_limits <int> :: max()) //is there a numeric_limits <string> :: max?
{
cout << "No way from " << start << " to " << end << endl;
}
else
{
int temp = end;
while (temp != start)
{
path.push_back(temp);
temp = parents[temp];
}
path.push_back(start);
reverse(path.begin(), path.end());
cout << "From " << start << " to " << end << " is " << path << endl;
}
}
If you can help, that would be most welcome as I most likely will have more projects with graphs and I'm struggling due to not understanding them.
Thank you,
Tristan

typedef vector <ivec> imatrix; //A vector of vectors, not sure how this works or how to implement
Despite the name imatrix, the graph here is stored as an adjacency list: edges[i] is the list of vertices attached to vertex i+1. The other common representation is an adjacency matrix, a V x V table whose entry (i, j) says whether an edge joins vertices i and j.
g -> numVertices = 0; //Why set the number of vertices to 0?
It initializes the graph: at startup the number of vertices is zero. As edges and vertices are inserted through insertEdge, this number is updated.
int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other
The maximum of the two vertex numbers is used to make sure enough vertices exist before the edge is inserted. Vertices are numbered from 1, so the larger of the two endpoint numbers is the smallest vertex count the graph can have.
ivec nodes;
if (g->edges.size() < i)
{
g -> edges.push_back(nodes);
}
The code above inserts new, empty vertex lists. For your version you will probably still compare integers here, not strings: the string is the data stored at a node, not the node's number. If you do need to compare strings, C++ already overloads the comparison operators for std::string.
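To make that concrete for the word assignment, here is a rough sketch. It assumes (this is a guess about the assignment, not something stated in the question) that two words are connected when their Levenshtein distance is exactly 1. Each word's position in the vector is its vertex number, counted from 0 rather than from 1 as in the posted code. The names levenshtein and buildWordGraph are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Dynamic-programming edit distance: insert, delete and substitute each cost 1.
int levenshtein(const std::string& a, const std::string& b)
{
    std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j)
        prev[j] = static_cast<int>(j);          // distance from "" to the first j chars of b
    for (std::size_t i = 1; i <= a.size(); ++i) {
        cur[0] = static_cast<int>(i);           // distance from the first i chars of a to ""
        for (std::size_t j = 1; j <= b.size(); ++j) {
            int substitute = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, substitute});
        }
        prev.swap(cur);                         // the row just filled becomes the previous row
    }
    return prev[b.size()];
}

// Hypothetical helper: vertex i stands for words[i]; edges[i] lists the
// indices of all words exactly one edit away from words[i].
std::vector<std::vector<int>> buildWordGraph(const std::vector<std::string>& words)
{
    std::vector<std::vector<int>> edges(words.size());
    for (std::size_t i = 0; i < words.size(); ++i)
        for (std::size_t j = i + 1; j < words.size(); ++j)
            if (levenshtein(words[i], words[j]) == 1) {
                edges[i].push_back(static_cast<int>(j));
                edges[j].push_back(static_cast<int>(i));
            }
    return edges;
}
```

This also covers the "2000 pointers per node" worry: no node declares a fixed number of links, because each vertex owns a vector that only grows as edges are pushed onto it. With 2000 words the double loop makes roughly two million distance computations.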
As for initSearch and shortestPath: shortestPath finds the shortest path between two nodes using Dijkstra's algorithm with every edge given a weight of 1, and initSearch resets the values the search uses before each run. It sets the distance to every node to "infinity" (numeric_limits<int>::max()), marks every node as not yet in the tree, and resets every parent to the sentinel; the distances are then lowered as paths are found.
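Because every edge in the posted code counts as one step, a plain breadth-first search finds paths of the same length and may be easier to follow. The sketch below is not the teacher's code: it numbers vertices from 0 (the posted code uses 1-based numbers) and the name bfsPath is invented for this example.

```cpp
#include <algorithm>
#include <queue>
#include <vector>

// Returns the vertices on a shortest path from start to end (0-based),
// or an empty vector if end cannot be reached from start.
std::vector<int> bfsPath(const std::vector<std::vector<int>>& edges, int start, int end)
{
    std::vector<int> parent(edges.size(), -1);   // -1 plays the role of the sentinel
    std::vector<bool> seen(edges.size(), false);
    std::queue<int> todo;
    seen[start] = true;
    todo.push(start);
    while (!todo.empty()) {
        int curr = todo.front();
        todo.pop();
        if (curr == end) break;                  // first arrival is via a shortest path
        for (int next : edges[curr])
            if (!seen[next]) {
                seen[next] = true;
                parent[next] = curr;             // remember how we got here
                todo.push(next);
            }
    }
    std::vector<int> path;
    if (!seen[end]) return path;                 // no route at all
    for (int v = end; v != -1; v = parent[v])    // walk the parents back to start
        path.push_back(v);
    std::reverse(path.begin(), path.end());
    return path;
}
```

The parent array does the same job as parents in the posted code: once the target is reached, following parents backwards and reversing the result gives the route.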

Some answers:
Q. You asked why numVertices is set to 0 in the following:
void initGraph(graph * g)
{
g -> numVertices = 0; //Why set the number of vertices to 0?
}
A. Look at the declaration of g - it is default initialized:
int main()
{
graph g;
....
}
Now look at the definition of graph - it has no constructor:
struct graph
{
imatrix edges; //list of attached vertices for each node
int numVertices;
};
So edges gets initialized properly by default because vector has a default constructor. But numVertices is a primitive type, so it holds an indeterminate value (whatever happens to be in that memory location) until it is assigned, which means it must be initialized manually. That's why initGraph doesn't need to initialize edges but does need to initialize numVertices.
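If C++11 or later is available (an assumption; the course may target an older standard), a default member initializer is one way to get the same effect without a separate init function. This is only a sketch of an alternative, not a change the assignment asks for.

```cpp
#include <vector>

typedef std::vector<int> ivec;
typedef std::vector<ivec> imatrix;

struct graph
{
    imatrix edges;        // vector's own default constructor makes this empty
    int numVertices = 0;  // default member initializer: never indeterminate
};
```

With this definition, a plain `graph g;` already starts with zero vertices.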
Q. You asked how you can find the larger of two std::strings knowing that max() returns the larger of two integers:
int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other
A. According to http://www.cplusplus.com/reference/algorithm/max/, max "uses operator< (or comp, if provided) to compare the values", and std::string supports the < operator (it compares lexicographically), so there really is no problem.
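A tiny illustration of that point; laterWord is a made-up name. Note that in the posted insertEdge, max is applied to vertex numbers to decide how many vertex lists must exist, so a word graph that identifies words by index would most likely still call max on ints.

```cpp
#include <algorithm>
#include <string>

// std::max works on any type with operator<; for std::string that is
// lexicographic order, compared character by character.
std::string laterWord(const std::string& a, const std::string& b)
{
    return std::max(a, b);
}
```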
Q. You asked about a vector of vectors:
typedef vector <int> ivec;
typedef vector <ivec> imatrix; //A vector of vectors, not sure how this works or how to implement
A. You can index a vector with [], so if you had a variable x of type imatrix, x[0] would return an ivec (because that is the type of object stored in an imatrix). x[0][0] would then return the first integer stored in the ivec returned by x[0]. To change it to use strings, just say:
typedef vector <std::string> ivec;
typedef vector <ivec> imatrix;
You could also rename the variables if you wanted.
You would also need to #include <string>
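Here is a small sketch of filling and indexing a vector of vectors, using the question's original int typedefs. The function name makeExample and the values are invented; the rows follow the posted code's convention that row i belongs to vertex i + 1.

```cpp
#include <vector>

typedef std::vector<int> ivec;
typedef std::vector<ivec> imatrix;

// Builds the rows {2, 3} and {1}: row i lists the neighbours of vertex i + 1.
imatrix makeExample()
{
    imatrix x;
    x.push_back(ivec());   // add an empty row for vertex 1
    x[0].push_back(2);     // vertex 1 -> vertex 2
    x[0].push_back(3);     // vertex 1 -> vertex 3
    x.push_back(ivec());   // add an empty row for vertex 2
    x[1].push_back(1);     // vertex 2 -> vertex 1
    return x;
}
```

After this, x.size() is the number of vertices, x[0].size() is how many neighbours vertex 1 has, and x[0][1] is its second neighbour.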

Related

Weight implementation in graph

I am trying to find the path between two vertices and their distance.
My implementation is the following:
#include <iostream>
#include <list>
#include <string>
#include <vector>
using namespace std;
vector <string> v1 = {"Prague", "Helsinki", "Beijing", "Tokyo", "Jakarta","London", "New York"};
vector <int> w = {};
// A directed graph using
// adjacency list representation
class Graph {
int V; // No. of vertices in graph
list<int>* adj; // Pointer to an array containing adjacency lists
// A recursive function used by printAllPaths()
void printAllPathsUtil(int, int, bool[], int[], int&);
public:
Graph(int V); // Constructor
void addVertex(string name);
void addEdge(int u, int v, int weight);
void printAllPaths(int s, int d);
};
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
}
void Graph::addEdge(int u, int v, int weight)
{
adj[u].push_back(v); // Add v to u’s list.
w.push_back(weight);
}
// Prints all paths from 's' to 'd'
void Graph::printAllPaths(int s, int d)
{
// Mark all the vertices as not visited
bool* visited = new bool[V];
// Create an array to store paths
int* path = new int[V];
int path_index = 0; // Initialize path[] as empty
// Initialize all vertices as not visited
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to print all paths
printAllPathsUtil(s, d, visited, path, path_index);
}
// A recursive function to print all paths from 'u' to 'd'.
// visited[] keeps track of vertices in current path.
// path[] stores actual vertices and path_index is current
// index in path[]
void Graph::printAllPathsUtil(int u, int d, bool visited[],
int path[], int& path_index)
{
// Mark the current node and store it in path[]
visited[u] = true;
path[path_index] = u;
path_index++;
int sum = 0;
// If current vertex is same as destination, then print
// current path[]
if (u == d) {
for (int i = 0; i < path_index; i++){
sum += w[i];
cout << v1[path[i]] << " ";
}
cout << endl;
cout << "Total distance is: " << sum;
cout << endl;
}
else // If current vertex is not destination
{
// Recur for all the vertices adjacent to current vertex
list<int>::iterator i;
for (i = adj[u].begin(); i != adj[u].end(); ++i)
if (!visited[*i])
printAllPathsUtil(*i, d, visited, path, path_index);
}
// Remove current vertex from path[] and mark it as unvisited
path_index--;
visited[u] = false;
}
// Driver program
int main()
{
// Create a graph given in the above diagram
Graph g(7);
g.addEdge(0, 1, 1845);
g.addEdge(0, 5, 1264);
g.addEdge(1, 3, 7815);
g.addEdge(2, 5, 8132);
g.addEdge(2, 6, 11550);
g.addEdge(2, 3, 1303);
g.addEdge(3, 4, 5782);
g.addEdge(3, 6, 10838);
g.addEdge(4, 2, 4616);
g.addEdge(5, 3, 9566);
g.addEdge(6, 5, 5567);
int s = 0, d = 2;
cout << "Following are all different paths from " << v1[s] << " to " << v1[d] << endl;
g.printAllPaths(s, d);
return 0;
}
Obviously this part is wrong:
vector <int> w = {};
w.push_back(weight);
int sum = 0;
// If current vertex is same as destination, then print
// current path[]
if (u == d) {
for (int i = 0; i < path_index; i++){
sum += w[i];
cout << v1[path[i]] << " ";
}
cout << endl;
cout << "Total distance is: " << sum;
cout << endl;
Output is:
Prague Helsinki Tokyo Jakarta Beijing
Total distance is: 30606
Prague London Tokyo Jakarta Beijing
Total distance is: 30606
This is wrong (although the paths are correct): my implementation just prints the sum of the first 5 weights in w. I do not understand how to get the weights that belong to each path.
How can I get the corresponding weights and add them up?
What I expect:
Prague Helsinki Tokyo Jakarta Beijing
Total distance is: 20058
Prague London Tokyo Jakarta Beijing
Total distance is: (There will be some number I have not calculated yet)

There are two main problems here. When you create the edges, you do not couple their cost to them in any way. And when you traverse them in your algorithm, you do not save the cost of traversing each edge; you only save the cities.
Here is a simple solution if you want to keep an almost identical structure. You can accompany the adjacency lists with a list of the costs for each such edge. So instead of having the single w array, you have one such list for each node (city). The path array can then also be accompanied by another int array holding the cost of each step.
Starting with defining and creating the edges it would be:
class Graph {
int V; // No. of vertices in graph
list<int>* adj; // Pointer to an array containing adjacency lists
list<int>* adj_weights; // Pointer to an array containing the weight of each edge in adj
...
Graph::Graph(int V)
{
this->V = V;
adj = new list<int>[V];
adj_weights = new list<int>[V];
}
void Graph::addEdge(int u, int v, int weight)
{
adj[u].push_back(v); // Add v to u’s list.
adj_weights[u].push_back(weight); // Add the weight of the path as well.
}
Now we have stored the weights and the edges together in a better way, but we still need to use this in the algorithm. Here is an example change of the main functions:
// Prints all paths from 's' to 'd'
void Graph::printAllPaths(int s, int d)
{
// Mark all the vertices as not visited
bool* visited = new bool[V];
// Create an array to store paths
int* path = new int[V];
int* path_costs = new int[V];
int path_index = 0; // Initialize path[] and path_costs[] as empty
// Initialize all vertices as not visited
for (int i = 0; i < V; i++)
visited[i] = false;
// Call the recursive helper function to print all paths
// Note that we let cost = 0 since we don't have to move to the starting city
printAllPathsUtil(s, d, visited, path_costs, path, path_index, 0);
}
// A recursive function to print all paths from 'u' to 'd'.
// visited[] keeps track of vertices in current path.
// path[] stores actual vertices and path_index is current
// index in path[]
void Graph::printAllPathsUtil(int u, int d, bool visited[], int path_costs[],
int path[], int& path_index, int cost)
{
// Mark the current node and store it in path[]
visited[u] = true;
path[path_index] = u;
path_costs[path_index] = cost; // Save cost of this step
path_index++;
int sum = 0;
// If current vertex is same as destination, then print
// current path[]
if (u == d) {
for (int i = 0; i < path_index; i++){
sum += path_costs[i]; // Now add all the costs
cout << v1[path[i]] << " ";
}
cout << endl;
cout << "Total distance is: " << sum;
cout << endl;
}
else // If current vertex is not destination
{
// Recur for all the vertices adjacent to current vertex
// Now we loop over both adj and adj_weights
list<int>::iterator i, j;
for (i = adj[u].begin(), j = adj_weights[u].begin();
i != adj[u].end(); ++i, ++j)
if (!visited[*i])
printAllPathsUtil(*i, d, visited, path_costs, path,
path_index, *j);
}
// Remove current vertex from path[] and mark it as unvisited
path_index--;
visited[u] = false;
}
You can see a full version of the augmented code here https://ideone.com/xGju0y and it gives the following output:
Following are all different paths from Prague to Beijing
Prague Helsinki Tokyo Jakarta Beijing
Total distance is: 20058
Prague London Tokyo Jakarta Beijing
Total distance is: 21228
I hope this helps! I tried to stick to the concepts you introduced rather than solve things with new classes or imports. But there are much nicer ways to solve this. One example is to merge the path and path_costs arrays into one array of pairs of ints, and to merge adj and adj_weights into an array of lists of pairs of ints instead of two arrays of lists of ints.
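A rough sketch of that pair-based variant follows. It is not the code behind the ideone link: it swaps the raw arrays for std::vector, carries a running total instead of a per-step cost array, and returns the paths rather than printing them. Edge, AdjList, FoundPath, collectPaths and allPaths are all names invented here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, int> Edge;                 // (target vertex, weight)
typedef std::vector<std::vector<Edge>> AdjList;   // adj[u] = edges leaving u

struct FoundPath {
    std::vector<int> vertices;  // the cities visited, in order
    int total;                  // sum of the edge weights along the path
};

// Depth-first search that carries the running cost along with the path.
void collectPaths(const AdjList& adj, int u, int d, std::vector<bool>& visited,
                  std::vector<int>& path, int cost, std::vector<FoundPath>& out)
{
    visited[u] = true;
    path.push_back(u);
    if (u == d) {
        out.push_back(FoundPath{path, cost});
    } else {
        for (const Edge& e : adj[u])
            if (!visited[e.first])
                collectPaths(adj, e.first, d, visited, path, cost + e.second, out);
    }
    path.pop_back();        // backtrack: remove u from the current path
    visited[u] = false;
}

std::vector<FoundPath> allPaths(const AdjList& adj, int s, int d)
{
    std::vector<bool> visited(adj.size(), false);
    std::vector<int> path;
    std::vector<FoundPath> out;
    collectPaths(adj, s, d, visited, path, 0, out);
    return out;
}
```

Because the weight travels with its edge, it cannot get out of step with the target vertex, which was the root of the original problem with the global w.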

How to fix the error on erasing element from a vector?

I want to erase a particular element from a vector, and I tried the following code. What I actually want is to build an adjacency list, delete an element from it, and delete all edges connected to the deleted element.
#include <iostream>
#include <vector>
#include <algorithm>
using namespace std;
vector<vector<int> > tree(20);
void addEdge(int u, int v)
{
tree[u].push_back(v);
tree[v].push_back(u);
}
void printGraph(vector<vector<int>> tree [], int V)
{
vector<vector<int>>::iterator it;
int j;
for (it = tree.begin(); it != tree.end(); )
{
cout << "\n Adjacency list of vertex ";
for (j = 0; j < (*it).size(); j++)
{
cout << j << "\n head ";
cout << "-> " << (*it)[j];
}
cout << endl;
if (j==2) it = tree.erase(it);
}
}
int main()
{
int n = 5;
addEdge(1, 2);
addEdge(3, 2);
addEdge(4, 2);
addEdge(2, 5);
printGraph(tree, n);
}
How to fix the error on erasing element from a vector?

For your immediate needs, use
for (it = tree.begin(); it != tree.end(); ) // loops on the entire tree
{
for (int j = 0; j < (*it).size(); j++) // loops through adjacent vertices in the current node
cout << ' ' << (*it)[j];
cout << endl;
it = tree.erase(it); // erase current node; erase() returns an iterator to the next element
}
to print your tree and linearly delete each row.
Original Answer before question was updated:
With tree declared as a static array of std::vector<int>s, we won't be able to use std::vector<> member functions directly on tree (e.g. tree.begin() generates an error because there is no .begin() member function for a static array).
An alternative is to use std::begin(tree) and std::end(tree) defined from the <iterator> header file.
But since you want to delete (erase) parts of the array, a better way of structuring this data is std::vector<std::vector<int>> tree(20), which creates a 2D array. Here, the .erase() member function is valid.
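For the stated goal (delete a vertex and every edge connected to it), one possible sketch uses the erase-remove idiom on each neighbour list. The name removeVertex is made up, and keeping the deleted vertex's empty slot is a design assumption: erasing the row itself would shift every later vertex down by one and invalidate the numbers stored in the other lists.

```cpp
#include <algorithm>
#include <vector>

// Drop every edge touching vertex v. The slot tree[v] stays (now empty) so
// that the other vertices keep their index numbers.
void removeVertex(std::vector<std::vector<int>>& tree, int v)
{
    tree[v].clear();
    for (std::vector<int>& neighbours : tree)
        neighbours.erase(std::remove(neighbours.begin(), neighbours.end(), v),
                         neighbours.end());
}
```

std::remove moves the entries to keep to the front and returns the new logical end; the erase call then trims the leftovers.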
Alternative to Storing Data (Helpful)
Also I believe adjacency matrices usually have static size? For example:
A B C
A 0 1 0
B 1 0 1
C 0 1 0
would show that there are 3 nodes (A, B, C) and there are two edges (AB, BC).
Removing an edge would just be as simple as changing the edge from 1 to 0. Here, you can use std::vector<int> tree[20] or better, bool tree[20][20] (should use bool since we're only dealing with 1's and 0's). Then you would need to reimplement the addEdge function, but that should be in your scope of understanding.
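A minimal sketch of that matrix variant, assuming an undirected graph as in the A/B/C table. MatrixGraph and its members are illustrative names, and a vector of vector<bool> stands in for bool tree[20][20] so that the size is not hard-coded.

```cpp
#include <vector>

struct MatrixGraph {
    std::vector<std::vector<bool>> adj;   // adj[u][v] is true when an edge joins u and v

    explicit MatrixGraph(int n) : adj(n, std::vector<bool>(n, false)) {}

    void addEdge(int u, int v)
    {
        adj[u][v] = true;
        adj[v][u] = true;    // undirected: mirror the entry
    }

    void removeEdge(int u, int v)
    {
        adj[u][v] = false;
        adj[v][u] = false;
    }
};
```

Removing a vertex then amounts to clearing its row and its column, at the cost of n*n storage regardless of how many edges exist.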
Hope this helps.

why c++ stl priority_queue does not perform as efficiently as expected with customized compare functor

When I create an STL priority_queue with a customized compare function class, I find that it performs very poorly (pushing an element seems to take O(n) rather than O(log n)). But when I use an STL set with the same compare function, the set runs much faster than the priority_queue. Why does this happen?
//There are a lot of horizontal lines on x-axis. Each line is represented by a pair<int,int> p,
//where p.first is the left point and p.second is the right point.
//For example, p(3,6) is a line with end points of 3 and 6.
//a vector<pair<int,int>> lines represents all these lines.
//
//I would like to scan all the end points from left to right along the x-axis.
//I choose to use a struct end_point to represent each end point,
// struct end_point{
// int x_coor; //the x coordinate of the end point;
// int index; //the index of the line, this end point belongs to, in vector lines.
// };
//For example, if there are 3 lines on x-axis line1(1,10), line2(2,9) and line3(3,8),
//the vector of lines looks like vector<pair<int,int>> lines = [(1,10), (2,9), (3,8)];
//And the 6 end points look like {1,0}, {10,0}, {2,1}, {9,1}, {3,2}, {8,2}
//I want to sort these end points by comparing their x_coor(if equal, use index to identify different lines).
//If I choose a priority_queue to sort them, it has very poor performance (6000 μs to complete 1000 lines).
//But I do the same thing(by using the same customized compare function class) on a set,
//it runs much faster(just 246 μs) than priority_queue.
//
//My question:
//1. Why does priority_queue perform so badly compared to set in this example?
//2. In the comparator class, I introduced a variable count to record how many times the compare function is called in total.
//But for priority_queue, each call to the compare function seems to reset it to 0 instead of keeping it, while for the set it keeps
//increasing as I expect. Is there a bug in my code (or is that the reason they perform so differently)?
//
#include <iostream>
#include <queue>
#include <set>
#include <chrono>
using namespace std::chrono;
//generate 1000 lines of [(0, 10000), (1, 9999), (2,9998),(3,9997).....(999, 9001)]
std::vector<std::pair<int,int>> linesGenerator(){
std::vector<std::pair<int,int>> lines;
int lines_num = 1000;
int length_range = 10000;
for(int i = 0; i < lines_num; i++){
int start = i;
int end = length_range - i;
lines.push_back(std::make_pair(start, end));
}
return lines;
}
struct end_point{
int x_coor; //the x coordinate of the end point;
int index; //the index of the line, this end point belongs to, in vector lines.
end_point(int x, int i):x_coor(x), index(i){}
};
//customized compare function class for priority_queue
class pqCompare{
public:
//construct by keeping its own copy of lines.
pqCompare(std::vector<std::pair<int,int>> &_lines){
lines = _lines;
count = 0; //use for count how many times the compare operation is called.
}
bool operator()(end_point ep1, end_point ep2){
++count; //increase by 1 if this function is called once.
//std::cout<<"count:"<<count<<std::endl;
if(ep1.x_coor != ep2.x_coor){
return ep2.x_coor < ep1.x_coor;
}
return ep1.index > ep2.index;
}
private:
std::vector<std::pair<int,int>> lines;
int count;
};
//customized compare function for set, almost same as priority_queue,
//the difference is only because priority_queue is a max_heap.
class setCompare{
public:
setCompare(std::vector<std::pair<int,int>> &_lines){
lines = _lines;
count = 0;
}
bool operator()(end_point ep1, end_point ep2){
++count;
//std::cout<<"count:"<<count<<std::endl;
if(ep1.x_coor != ep2.x_coor){
return ep1.x_coor < ep2.x_coor;
}
return ep1.index < ep2.index;
}
private:
std::vector<std::pair<int,int>> lines;
int count;
};
void test_pqueue()
{
//generate 1000 lines;
std::vector<std::pair<int,int>> lines = linesGenerator();
//create a priority_queue with cmp as comparator.
pqCompare cmp(lines);
std::priority_queue<end_point, std::vector<end_point>, pqCompare> ct(cmp);
auto tp0 = steady_clock::now();
for (int i = 0; i < lines.size(); ++i){
//for each line, there are 2 end points.
int left_point = lines[i].first;
int right_point = lines[i].second;
ct.push(end_point(left_point, i));
ct.push(end_point(right_point, i));
}
//std::cout<<"total count"<<cmp.getCount()<<"\n";
auto tp1 = steady_clock::now();
std::cout << __PRETTY_FUNCTION__ << ':' << duration_cast<microseconds>(tp1-tp0).count() << " μs\n";
}
void test_set()
{
std::vector<std::pair<int,int>> lines = linesGenerator();
setCompare cmp(lines);
std::set<end_point, setCompare> ct(cmp);
auto tp0 = steady_clock::now();
for (int i = 0; i < lines.size(); ++i){
int left_point = lines[i].first;
int right_point = lines[i].second;
ct.insert(end_point(left_point, i));
ct.insert(end_point(right_point, i));
}
auto tp1 = steady_clock::now();
std::cout << __PRETTY_FUNCTION__ << ':' << duration_cast<microseconds>(tp1-tp0).count() << " μs\n";
}
int main()
{
test_pqueue();
test_set();
}

Speeding up algorithm in C++

TL;DR: My code is "fast" in Java but slow as hell in C++. Why?
#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
using namespace std;
int read(string data, int depth, int pos, vector<long>& wantedList) {
// 91 = [
if (data.at(pos) == 91) {
pos++;
// Get first part
pos = read(data, depth + 1, pos, wantedList);
// Get second part
pos = read(data, depth + 1, pos, wantedList);
} else {
// Get the weight
long weight = 0;
while (data.length() > pos && isdigit(data.at(pos))) {
weight = 10 * weight + data.at(pos++) - 48;
}
weight *= 2 << depth;
wantedList.push_back(weight);
}
return ++pos;
}
int doStuff(string data) {
typedef map<long, int> Map;
vector<long> wantedList;
Map map;
read(data, 0, 0, wantedList);
for (long i : wantedList) {
if (map.find(i) != map.end()) {
map[i] = map[i] + 1;
} else {
map[i] = 1;
}
}
vector<int> list;
for (Map::iterator it = map.begin(); it != map.end(); ++it) {
list.push_back(it->second);
}
sort(list.begin(), list.begin() + list.size());
cout << wantedList.size() - list.back() << "\n";
return 0;
}
int main() {
string data;
int i;
cin >> i;
for (int j = 0; j < i ; ++j) {
cin >> data;
doStuff(data);
}
return 0;
}
I have just tried my first C++ project, and it's re-written code from Java.
The original task was to calculate how many numbers that needed to be changed in order to "balance" the input, given that each level above something weighs double the lower
eg [1,2] would need 1 change (either 1->2 or 2->1 in order to be equal on both sides and [8,[4,2]] would need 1 change (2->4) in order for the "lower level" to become 8 and therefore be of equal weight on the higher level. The problem can be found here for those who are interested:
Problem link
And for those who wonder, this is a school assignment regarding algorithms, but I'm not asking for help with that, since I have already completed it in Java. The problem is that my algorithm seem to be pretty shit when it comes to C++.
In Java I get times around 0.6 seconds, and in C++, the "same" code gives >2 seconds (time limit exceeded).
Anyone care to give me a pointer as to why this is? I was under the impression that C++ is supposedly faster than Java when it comes to these type of problems.

One possible reason is copying.
Whenever you pass something by value in C++, a copy is created. For things like double, int or a pointer, that's not a problem.
But for objects like std::string, copying may be expensive. Since you don't modify data, it makes sense to pass it by const reference:
int read(const string &data, int depth, int pos, vector<long>& wantedList)
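As a sketch of what the whole thing could look like with that change applied: the parsing follows the question's read(), while the names readWeights and changesNeeded are invented, the counting is folded into a single map pass, and the result is returned instead of printed. Whether this alone is enough to meet the time limit is not something the thread establishes.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Same parsing idea as the question's read(), but data is taken by const
// reference, so the recursion no longer copies the string at every level.
std::size_t readWeights(const std::string& data, int depth, std::size_t pos,
                        std::vector<long>& weights)
{
    if (data.at(pos) == '[') {
        ++pos;
        pos = readWeights(data, depth + 1, pos, weights);  // first part
        pos = readWeights(data, depth + 1, pos, weights);  // second part
    } else {
        long weight = 0;
        while (pos < data.length() && std::isdigit(static_cast<unsigned char>(data[pos])))
            weight = 10 * weight + (data[pos++] - '0');
        weights.push_back(weight * (2L << depth));         // scale by depth, as in the question
    }
    return pos + 1;  // skip the ',' or ']' that follows this part
}

// Number of values that must change: everything except the most common weight.
std::size_t changesNeeded(const std::string& data)
{
    std::vector<long> weights;
    readWeights(data, 0, 0, weights);
    std::map<long, std::size_t> counts;
    std::size_t best = 0;
    for (long w : weights)
        best = std::max(best, ++counts[w]);   // operator[] starts missing keys at 0
    return weights.size() - best;
}
```

For the two inputs mentioned in the question, [1,2] and [8,[4,2]], this yields 1 in both cases.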

How to Create All Permutations of Variables from a Variable Number of STL Vectors [duplicate]

This question already has answers here:
Generate all combinations from multiple lists
(11 answers)
Closed 9 years ago.
I have a variable number of std::vector<int> objects; let's say I have 3 vectors in this example:
std::vector<int> vect1 {1,2,3,4,5};
std::vector<int> vect2 {1,2,3,4,5};
std::vector<int> vect3 {1,2,3,4,5};
The values of the vectors are not important here. Also, the lengths of these vectors will be variable.
From these vectors, I want to create every permutation of vector values, so:
{1, 1, 1}
{1, 1, 2}
{1, 1, 3}
...
...
...
{3, 5, 5}
{4, 5, 5}
{5, 5, 5}
I will then insert each combination into a key-value pair map for further use with my application.
What is an efficient way to accomplish this? I would normally just use a for loop, and iterate across all parameters to create all combinations, but the number of vectors is variable.
Thank you.
Edit: I will include more specifics.
So, first off, I'm not really dealing with ints, but rather a custom object. ints are just for simplicity. The vectors themselves exist in a map like so std::map<std::string, std::vector<int> >.
My ultimate goal is to have an std::vector< std::map< std::string, int > >, which is essentially a collection of every possible combination of name-value pairs.

Many (perhaps most) problems of the form "I need to generate all permutations of X" can be solved by creative use of simple counting (and this is no exception).
Let's start with the simple example: 3 vectors of 5 elements apiece. For our answer we will view an index into these vectors as a 3-digit, base-5 number. Each digit of that number is an index into one of the vectors.
So, to generate all the combinations, we simply count from 0 to 5^3 (125). We convert each number into 3 base-5 digits, and use those digits as indices into the vectors to get a permutation. When we reach 125, we've enumerated all the permutations of those vectors.
Assuming the vectors are always of equal length, changing the length and/or number of vectors is just a matter of changing the number of digits and/or number base we use.
If the vectors are of unequal lengths, we simply produce a result in which not all of the digits are in the same base. For example, given three vectors of lengths 7, 4 and 10, we'd still count from 0 to 7x4x10 = 280. We'd generate the least significant digit as N%10. We'd generate the next least significant as (N/10)%4.
Presumably that's enough to make it fairly obvious how to extend the concept to an arbitrary number of vectors, each of arbitrary size.
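The digit extraction can be sketched as one loop that peels off the least significant digit first. The function name indicesFor and the choice to treat the last vector as the least significant digit are assumptions made for this example.

```cpp
#include <cstddef>
#include <vector>

// Convert a counter n into one index per vector, treating n as a mixed-radix
// number whose last (least significant) digit has base sizes.back().
std::vector<std::size_t> indicesFor(std::size_t n, const std::vector<std::size_t>& sizes)
{
    std::vector<std::size_t> idx(sizes.size());
    for (std::size_t k = sizes.size(); k-- > 0; ) {
        idx[k] = n % sizes[k];   // this digit
        n /= sizes[k];           // what is left belongs to the more significant digits
    }
    return idx;
}
```

For vectors of lengths 7, 4 and 10, calling this for every n from 0 to 279 visits each combination exactly once; n = 45, for instance, gives the indices 1, 0, 5 because 45 = 1*40 + 0*10 + 5.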

0 -> 0,0,0
1 -> 0,0,1
2 -> 0,1,0
3 -> 0,1,1
4 -> 1,0,0
...
7 -> 1,1,1
8 -> 1,1,2
...
The map should translate a linear integer into a combination (ie: a1,a2,a3...an combination) that allows you to select one element from each vector to get the answer.
There is no need to copy any of the values from the initial vectors. You can use a mathematical formula to arrive at the right answer for each of the vectors. That formula will depend on some of the properties of your input vectors (how many are there? are they all the same length? how long are they? etc...)

Following may help: (https://ideone.com/1Xmc9b)
template <typename T>
bool increase(const std::vector<std::vector<T>>& v, std::vector<std::size_t>& it)
{
for (std::size_t i = 0, size = it.size(); i != size; ++i) {
const std::size_t index = size - 1 - i;
++it[index];
if (it[index] == v[index].size()) {
it[index] = 0;
} else {
return true;
}
}
return false;
}
template <typename T>
void do_job(const std::vector<std::vector<T>>& v, std::vector<std::size_t>& it)
{
// Print example.
for (std::size_t i = 0, size = v.size(); i != size; ++i) {
std::cout << v[i][it[i]] << " ";
}
std::cout << std::endl;
}
template <typename T>
void iterate(const std::vector<std::vector<T>>& v)
{
std::vector<std::size_t> it(v.size(), 0);
do {
do_job(v, it);
} while (increase(v, it));
}

This is an explicit implementation of what Lother and Jerry Coffin are describing, using the useful ldiv function in a for loop to iterate through vectors of varying length.
#include <cstdlib> // ldiv
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;
vector<int> vect1 {100,200};
vector<int> vect2 {10,20,30};
vector<int> vect3 {1,2,3,4};
typedef map<string,vector<int> > inputtype;
inputtype input;
vector< map<string,int> > output;
int main()
{
// init vectors
input["vect1"] = vect1;
input["vect2"] = vect2;
input["vect3"] = vect3;
long N = 1; // Total number of combinations
for( inputtype::iterator it = input.begin() ; it != input.end() ; ++it )
N *= it->second.size();
// Loop once for every combination to fill the output map.
for( long i=0 ; i<N ; ++i )
{
ldiv_t d = { i, 0 };
output.emplace_back();
for( inputtype::iterator it = input.begin() ; it != input.end() ; ++it )
{
d = ldiv( d.quot, input[it->first].size() );
output.back()[it->first] = input[it->first][d.rem];
}
}
// Sample output
cout << output[0]["vect1"] << endl; // 100
cout << output[0]["vect2"] << endl; // 10
cout << output[0]["vect3"] << endl; // 1
cout << output[N-1]["vect1"] << endl; // 200
cout << output[N-1]["vect2"] << endl; // 30
cout << output[N-1]["vect3"] << endl; // 4
return 0;
}

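Use a vector of vectors instead of separate variables, then use the following recursive algorithm:
#include <cstddef>
#include <cstdio>
#include <vector>
// Picks one index per vector; once all k are chosen, prints the combination.
// choices must have at least k elements.
void permutations(int i, int k, const std::vector<std::vector<int>>& vectors, std::vector<int>& choices) {
if (i < k) {
for (std::size_t x = 0; x < vectors[i].size(); x++) {
choices[i] = static_cast<int>(x);
permutations(i + 1, k, vectors, choices);
}
} else {
// choices[j] is an index into vectors[j], not into vectors itself
std::printf("\n %d", vectors[0][choices[0]]);
for (int j = 1; j < k; j++) {
std::printf(",%d", vectors[j][choices[j]]);
}
}
}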
