Find if a structure already exists in a vector (C++)

I have to find the weights of all edges in a graph. Since the edges are bidirectional, I don't want to include 2 -> 1 if I already have 1 -> 2 (they have the same weight). The edges are stored in a vector of the structure Edge. My initial idea was to look up whether an edge with the start and end positions swapped and the same weight already exists and, if so, do nothing. However, I don't know exactly how to put this into code, so any help would be appreciated. Approaches that could optimise the solution are also welcome.
struct Vertex {
Vertex(const int i = 0) : index {i}, key {max_key}, parent_index {undef}, processed {false} {}
int index; // vertex identifier
int key; // temporary minimal weight (Prim algorithm)
int parent_index; // temporary minimal distance neighbor vertex (Prim algorithm)
int processed; // flag used to mark vertices that are already included in V'
static constexpr int max_key = std::numeric_limits<int>::max();
static const int undef = -1;
};
struct Edge {
Edge(int va, int vb, int w) : vi1 {va}, vi2 {vb}, weight {w} { }
int vi1; //start point
int vi2; //end point
int weight;
};
struct Graph {
int N; // number of vertices
std::vector<Vertex> V; // set of vertices
std::vector<Edge> E; // set of edges
std::vector<Edge> MST; // minimal spanning tree
const int* weights_table; // weights given as distance matrix
};
The problem is in the call to find below. I know this is a lot of irrelevant code, but I am posting it so that you can picture the situation more clearly. If there is no connection between two vertices, they have a weight of -1.
// construct vertices and edges for a given graph
void createGraph(Graph& G) {
// TODO 5.1a: clear V and E and insert all vertex objects and edge objects
// - vertices are numbered (labeled) from 0 to N-1
// - edges exist if and only if there is positive distance between two vertices
// - edges are bidirectional, that is, edges are inserted only once between two vertices
G.E.clear();
G.V.clear();
for(int i = 0; i < G.N; i++){
Vertex V (i);
G.V.push_back(V);
}
for(int i = 0; i < G.N; i++){
for(int j = 0; j < G.N; j++){
Edge Ed (i,j,0);
int weight = getWeight(G,i,j);
if(weight > 0){
Ed.weight = weight;
auto it = find(G.E.begin(), G.E.end(), ....);
if( it != G.E.end() ) continue;
G.E.push_back(Ed);
}
}
}
}
Thanks!

"since the edges are bidirectional"
You can construct each Edge such that vi1 <= vi2; then there is only one representation of each possible edge.
struct Edge {
Edge(int va, int vb, int w) : vi1 {std::min(va, vb)}, vi2 {std::max(va, vb)}, weight {w} { }
int vi1; // earlier point
int vi2; // later point
int weight;
};
Aside: prefer constructing the Edge in place with emplace_back:
for(int i = 0; i < G.N; i++){
for(int j = G.N - 1; j >= 0 + i; j--){
int weight = getWeight(G,i,j);
if(weight > 0){
G.E.emplace_back(i, j, weight);
}
}
}

Okay, I think I got it by changing the second for loop to look like this, but I am also curious to see what the syntax would look like if find were used:
for(int i = 0; i < G.N; i++){
for(int j = G.N - 1; j >= 0 + i; j--){
Edge Ed (i, j , 0);
int weight = getWeight(G,i,j);
if(weight > 0){
Ed.weight = weight;
G.E.push_back(Ed);
}
}
}
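For the remaining question about find: a minimal sketch, assuming the Edge struct from the question. Because Edge has no operator==, plain std::find does not compile for it, so the sketch uses std::find_if with a lambda that matches an edge in either direction. The helper names hasEdge and addEdgeOnce are made up for this illustration.

```cpp
#include <algorithm>
#include <vector>

struct Edge {
    Edge(int va, int vb, int w) : vi1{va}, vi2{vb}, weight{w} {}
    int vi1;    // start point
    int vi2;    // end point
    int weight;
};

// Returns true if an edge joining a and b (in either direction) is already stored.
bool hasEdge(const std::vector<Edge>& edges, int a, int b) {
    auto it = std::find_if(edges.begin(), edges.end(), [a, b](const Edge& e) {
        return (e.vi1 == a && e.vi2 == b) || (e.vi1 == b && e.vi2 == a);
    });
    return it != edges.end();
}

// Hypothetical helper: appends the edge only if it is not stored yet.
// Returns true if the edge was added.
bool addEdgeOnce(std::vector<Edge>& edges, int a, int b, int w) {
    if (hasEdge(edges, a, b)) return false;
    edges.emplace_back(a, b, w);
    return true;
}
```

Each lookup scans the whole vector, so filling the graph this way costs time proportional to the square of the number of edges. Restricting the inner loop so that each pair is visited only once avoids the search entirely.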

Related

Finding a path in a graph with random edges

We have n vertices (where n is less than 100 000) and m random edges (where m is less than 10 000 000). We want to find a path between 2 given vertices. If there is no path we will just print -1.
My algorithm is to build a tree. Every vertex has a disjoint_index (i) which shows that all vertices with disjoint_index (i), are connected.
The default value of disjoint_index is the index of each vertex. After finding an edge between vertices v and u, I check whether they are already connected. If they are, I do nothing. Otherwise, I change the disjoint_index of u and of all vertices connected to u, using a function named dfs.
Here is the code of the function to build this tree in c++:
struct vertex{
int disjoint_index;
vector<int> adjacent;
};
void build_tree(int m, int s, int e)
{
for(int i = 0; i < m; i++)
{
int u = kiss() % n;
int v = kiss() % n;
if(disjoint_counter[u] > disjoint_counter[v])
{
int temp = u;
u = v;
v = temp;
}//counter v > u
if(ver[v].disjoint_index != ver[u].disjoint_index)
{
ver[v].adjacent.push_back(u);
ver[u].adjacent.push_back(v);
dfs(v, u, ver[v].disjoint_index);
disjoint_counter[v] += disjoint_counter[u];
}
if(ver[s].disjoint_index == ver[e].disjoint_index)
return;
}
}
void dfs(int parent, int v, int d)
{
ver[v].disjoint_index = d;
for(int i = 0; i < ver[v].adjacent.size(); i++)
{
if(ver[v].adjacent[i] == parent)
continue;
dfs(v, ver[v].adjacent[i], d);
}
}
You can skip kiss here. It's just a function that yields the two vertices u and v, meaning that there is an edge between them.
disjoint_counter[i] shows how many vertices are in connected group i.
After building this tree I will find a path with a simple dfs. The time limit is 1s and I get Time Limit Exceeded on some test cases.
Edit: Memory is limited so I can't save all the edges.
Maximum memory I can use is 32MB.
I used the disjoint set union algorithm, and it improved the speed.
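A minimal sketch of a disjoint set union structure, assuming vertices are numbered from 0 to n - 1. The type name DisjointSet and its members are choices made for this illustration, not the asker's actual code. It replaces the relabeling dfs with near-constant-time find and unite operations, and it needs only two int arrays of length n.

```cpp
#include <numeric>
#include <utility>
#include <vector>

struct DisjointSet {
    std::vector<int> parent;  // parent[x] == x marks a set representative
    std::vector<int> size;    // size[r] is valid only for representatives r

    explicit DisjointSet(int n) : parent(n), size(n, 1) {
        std::iota(parent.begin(), parent.end(), 0);  // every vertex starts alone
    }

    // Finds the representative of x and compresses the path (iteratively,
    // so long chains cannot overflow the call stack).
    int find(int x) {
        int root = x;
        while (parent[root] != root) root = parent[root];
        while (parent[x] != root) {
            int next = parent[x];
            parent[x] = root;
            x = next;
        }
        return root;
    }

    // Merges the sets of a and b; returns false if they were already joined.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (size[a] < size[b]) std::swap(a, b);  // attach smaller set to larger
        parent[b] = a;
        size[a] += size[b];
        return true;
    }

    bool connected(int a, int b) { return find(a) == find(b); }
};
```

Only the edges for which unite returns true join two different groups, so those are the only ones that would need to go into the adjacency lists for the final path search (at most n - 1 of them).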

Finding the number of connected components in an undirected graph

Problem:
Given n nodes labeled from 0 to n - 1 and a list of undirected edges (each edge is a pair of nodes), write a function to find the number of connected components in an undirected graph.
Approach:
class Solution
{
public:
int countComponents(int n, vector<vector<int>>& edges)
{
std::vector<bool> v(n, false);
int count = 0;
for(int i = 0; i < n; ++i)
{
if(!v[i])
{
dfs(edges, v, i);
count++;
}
}
return count;
}
void dfs(std::vector<std::vector<int>>& edges, std::vector<bool>& v, int i)
{
if(v[i] || i > edges.size())
return;
v[i] = true;
for(int j = 0; j < edges[i].size(); ++j)
dfs(edges, v, edges[i][j]);
}
};
Error:
heap-buffer overflow
I don't understand why my code causes a heap-buffer overflow for the test case:
5
[[0,1],[1,2],[2,3],[3,4]]
Any suggestions on how to fix my code would be really appreciated.
My guess is that your edges vector has only four elements for the provided input, since there is no outgoing edge from vertex 4. Your dfs function eventually recurses to the point where i == 4, but because edges has only four elements, the last valid position is edges[3]. The guard i > edges.size() does not catch this, because 4 > 4 is false.
I suggest that you represent a vertex with no outgoing edges by an empty vector.
Also, the second part of the if statement
if(v[i] || i > edges.size())
return;
seems unnecessary and should probably just be
if(v[i])
return;
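A minimal sketch of one way to avoid the out-of-bounds access, assuming edges is a list of [u, v] pairs (as the test case suggests) rather than an adjacency list. It first builds an adjacency list with one entry per vertex, empty for vertices without edges, and then searches that list. The iterative DFS and the names adj and visited are choices made for this sketch.

```cpp
#include <vector>

int countComponents(int n, const std::vector<std::vector<int>>& edges) {
    // One neighbor list per vertex; vertices without edges keep an empty list.
    std::vector<std::vector<int>> adj(n);
    for (const auto& e : edges) {
        adj[e[0]].push_back(e[1]);
        adj[e[1]].push_back(e[0]);  // undirected: store both directions
    }

    std::vector<bool> visited(n, false);
    int count = 0;
    for (int start = 0; start < n; ++start) {
        if (visited[start]) continue;
        ++count;  // start belongs to a component not seen before

        // Iterative DFS over everything reachable from start.
        std::vector<int> stack{start};
        visited[start] = true;
        while (!stack.empty()) {
            int u = stack.back();
            stack.pop_back();
            for (int v : adj[u]) {
                if (!visited[v]) {
                    visited[v] = true;
                    stack.push_back(v);
                }
            }
        }
    }
    return count;
}
```

Every index used on adj and visited is a vertex label below n, so the lookup stays in range as long as the input labels are valid.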

Prevent Cycles in Maximum Spanning Tree

I am trying to create a maximum spanning tree in C++ but am having trouble preventing cycles. The code I have works alright for some cases, but for the majority of cases there is a cycle. I am using an adjacency matrix to find the edges.
double maximumST( vector< vector<double> > adjacencyMatrix ) {
const int size = adjacencyMatrix.size();
vector <double> edges;
int edgeCount = 0;
double value = 0;
std::vector<std::vector<int>> matrix(size, std::vector<int>(size));
for (int i = 0; i < size; i++) {
for (int j = i; j < size; j++) {
if (adjacencyMatrix[i][j] != 0) {
edges.push_back(adjacencyMatrix[i][j]);
matrix[i][j] = adjacencyMatrix[i][j];
edgeCount++;
}
}
}
sort(edges.begin(), edges.end(), std::greater<int>());
for (int i = 0; i < (size - 1); i++) {
value += edges[i];
}
return value;
}
One way I tried to detect a cycle was to create a new adjacency matrix for the chosen edges and check it before adding a new edge, but that did not perform as expected. I also tried to build a 3D matrix, but I could not get that to work either.
What's a new approach I should try to prevent cycles?
You should add the edge if the lowest common ancestor (LCA) of the two vertices corresponding to that edge is not root.
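A commonly used alternative to an ancestor check is Kruskal's algorithm with a disjoint-set (union-find) structure: sort the edges by descending weight and accept an edge only if its endpoints are still in different sets. The following is a minimal sketch of that technique, not the method from the answer above. It assumes a symmetric matrix in which 0 means "no edge", as in the question; WeightedEdge and findRoot are names made up for the sketch.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

struct WeightedEdge {
    double weight;
    int u;
    int v;
};

// Union-find lookup with path halving.
static int findRoot(std::vector<int>& parent, int x) {
    while (parent[x] != x) {
        parent[x] = parent[parent[x]];
        x = parent[x];
    }
    return x;
}

// Total weight of a maximum spanning tree (or forest, if disconnected).
double maximumST(const std::vector<std::vector<double>>& adjacencyMatrix) {
    const int size = static_cast<int>(adjacencyMatrix.size());

    // Collect each undirected edge once from the upper triangle.
    std::vector<WeightedEdge> edges;
    for (int i = 0; i < size; ++i)
        for (int j = i + 1; j < size; ++j)
            if (adjacencyMatrix[i][j] != 0)
                edges.push_back({adjacencyMatrix[i][j], i, j});

    // Heaviest edges first.
    std::sort(edges.begin(), edges.end(),
              [](const WeightedEdge& a, const WeightedEdge& b) {
                  return a.weight > b.weight;
              });

    std::vector<int> parent(size);
    std::iota(parent.begin(), parent.end(), 0);

    double value = 0;
    for (const WeightedEdge& e : edges) {
        const int ru = findRoot(parent, e.u);
        const int rv = findRoot(parent, e.v);
        if (ru != rv) {       // endpoints not yet connected: no cycle
            parent[ru] = rv;  // merge the two components
            value += e.weight;
        }
    }
    return value;
}
```

Summing the size - 1 heaviest edges, as the question's code does, can pick all three edges of a triangle; the root comparison is what rejects the edge that would close the cycle.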

Find similar distances between all values in vector and subset them

Given is a vector of double values. I want to know which pairs of elements in this vector are separated by similar distances. In the best case, the result is a vector of subsets of the original values, where each subset has at least n members.
//given
vector<double> values = {1,2,3,4,8,10,12}; //with simple values as example
//some algorithm
//desired result as:
vector<vector<double> > subset;
//in case of above example I would expect some result like:
//subset[0] = {1,2,3,4}; //distance 1
//subset[1] = {8,10,12}; //distance 2
//subset[2] = {4,8,12}; // distance 4
//subset[3] = {2,4}; //also distance 2 but not connected with subset[1]
//subset[4] = {1,3}; //also distance 2 but not connected with subset[1] or subset[3]
//many others if n is just 2. If n is 3 (normally the minimum) these small subsets should be excluded.
This example is simplified: with integers, the possible distances could simply be iterated and tested against the vector, which is not possible for double or float.
My idea so far
I thought of calculating the distances and storing them in a vector, then creating a matrix of the differences between these distances and thresholding that matrix with some tolerance for similar distances.
//Calculate distances: result is a vector
vector<double> distances;
for (int i = 0; i < values.size(); i++)
for (int j = 0; j < values.size(); j++)
{
if (i >= j)
continue;
distances.push_back(abs(values[i] - values[j]));
}
//Calculate difference of these distances: result is a matrix
Mat DiffDistances = Mat::zero(Size(distances.size(), distances.size()), CV_32FC1);
for (int i = 0; i < distances.size(); i++)
for (int j = 0; j < distances.size(); j++)
{
if (i >= j)
continue;
DiffDistances.at<float>(i,j) = abs(distances[i], distances[j]);
}
//threshold this matrix with some tolerance in difference distances
threshold(DiffDistances, DiffDistances, maxDistTol, 255, CV_THRESH_BINARY_INV);
//get points with similar distances
vector<Points> DiffDistancePoints;
findNonZero(DiffDistances, DiffDistancePoints);
At this point I get stuck with finding the original values corresponding to my similar distances. It should be possible to find them, but it seems very complicated to trace back the indices and I wonder if there isn't an easier way to solve the problem.
Here is a solution that works as long as there are no branches, meaning that no two values are closer together than 2*threshold. That is the valid neighbor region, because neighboring bonds should differ by less than the threshold, if I understood @Phann correctly.
The solution is definitely neither the fastest nor the nicest possible, but you might use it as a starting point:
#include <iostream>
#include <vector>
#include <algorithm>
int main(){
std::vector< double > values = {1,2,3,4,8,10,12};
const unsigned int nValues = values.size();
std::vector< std::vector< double > > distanceMatrix(nValues - 1);
// The distanceMatrix has a triangular shape
// First vector contains all distances to value zero
// Second row all distances to value one for larger values
// nth row all distances to value n-1 except those already covered
std::vector< std::vector< double > > similarDistanceSubsets;
double threshold = 0.05;
std::sort(values.begin(), values.end());
for (unsigned int i = 0; i < nValues-1; ++i) {
distanceMatrix.at(i).resize(nValues-i-1);
for (unsigned j = i+1; j < nValues; ++j){
distanceMatrix.at(i).at(j-i-1) = values.at(j) - values.at(i);
}
}
for (unsigned int i = 0; i < nValues-1; ++i) {
for (unsigned int j = i+1; j < nValues; ++j) {
std::vector< double > thisSubset;
double thisDist = distanceMatrix.at(i).at(j-i-1);
// This distance already belongs to another cluster
if (thisDist < 0) continue;
double minDist = thisDist - threshold;
double maxDist = thisDist + threshold;
thisSubset.push_back(values.at(i));
thisSubset.push_back(values.at(j));
//Indicate that this is already clustered
distanceMatrix.at(i).at(j-i-1) = -1;
unsigned int lastIndex = j;
for (unsigned int k = j+1; k < nValues; ++k) {
thisDist = distanceMatrix.at(lastIndex).at(k-lastIndex-1);
// This distance already belongs to another cluster
if (thisDist < 0) continue;
// Check if you found a new valid pair
if ((thisDist > minDist) && (thisDist < maxDist)){
// Update the valid distance interval
minDist = thisDist - threshold;
maxDist = thisDist + threshold;
// Add the newly found point
thisSubset.push_back(values.at(k));
// Indicate that this is already clustered
distanceMatrix.at(lastIndex).at(k-lastIndex-1) = -1;
// Continue the search from here
lastIndex = k;
}
}
if (thisSubset.size() > 2) {
similarDistanceSubsets.push_back(thisSubset);
}
}
}
for (unsigned int i = 0; i < similarDistanceSubsets.size(); ++i) {
for (unsigned int j = 0; j < similarDistanceSubsets.at(i).size(); ++j) {
std::cout << similarDistanceSubsets.at(i).at(j);
if (j != similarDistanceSubsets.at(i).size()-1) {
std::cout << " ";
}
else {
std::cout << std::endl;
}
}
}
}
The idea is to precompute the distances and then, for every pair of values (starting from the smallest value and its larger neighbors), check whether there is another valid pair above it. If so, these values are all collected in a subset, which is added to the subset vector. For every new value, the valid neighbor region has to be updated to ensure that neighboring distances differ by less than the threshold. Afterwards, the program continues with the next smallest value and its larger neighbors, and so on.
Here is an algorithm which is slightly different from yours. It is O(n^3) in the length n of the vector, so not very efficient.
It is based on the premise that you want to have subsets of at least size 2. So what you can do is consider all the two-element subsets of the vector, then find all other elements that also match.
So given a function
std::vector<int> findSubset(std::vector<int> v, int baseValue, int distance) {
// Find the subset of all elements in v that differ by a multiple of
// distance from the base value
}
you can do
std::vector<std::vector<int>> findSubsets(std::vector<int> v) {
std::vector<std::vector<int>> subsets; // collects one subset per pair (i, j)
for(int i = 0; i < v.size(); i++) {
for(int j = i + 1; j < v.size(); j++) {
subsets.push_back(findSubset(v, v[i], std::abs(v[i] - v[j])));
}
}
return subsets;
}
The only remaining problem is keeping track of duplicates; you could keep a hashed list of (baseValue % distance, distance) pairs for all the subsets you have already found.
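The body of findSubset is left open above. A minimal sketch of one possible implementation for integer values follows, based only on the comment in the stub; passing the vector by const reference and the handling of a zero distance are assumptions of this sketch.

```cpp
#include <vector>

// Returns all elements of v whose difference from baseValue is a multiple
// of distance, in their original order.
std::vector<int> findSubset(const std::vector<int>& v, int baseValue, int distance) {
    std::vector<int> subset;
    for (int x : v) {
        if (distance == 0) {
            // Assumption: with a zero distance only exact matches qualify.
            if (x == baseValue) subset.push_back(x);
        } else if ((x - baseValue) % distance == 0) {
            // The remainder is 0 for any multiple, positive or negative.
            subset.push_back(x);
        }
    }
    return subset;
}
```

For the example vector {1,2,3,4,8,10,12}, a base value of 4 with distance 4 gives {4, 8, 12}. Note that this rule also merges values that are not adjacent in one chain: a base value of 8 with distance 2 gives {2, 4, 8, 10, 12}. For double values the modulo test would have to be replaced by a comparison with a tolerance.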

My implementation of Dijkstra's algorithm keeps messing up

I'm implementing Dijkstra's algorithm for school and my code keeps messing up. I've followed the pseudo-code on Wikipedia very closely. I implement the graph with a weighted adjacency list in this form so I check neighbours by iterating through the corresponding row.
Here's my graph class, along with my vertex struct.
struct vertex
{
//constructor
vertex(size_t d_arg, size_t n_arg)
{
n = n_arg;
d = d_arg;
}
//member variables, n is name and d is distance
size_t n;
size_t d;
//overloaded operator so I can use std::sort in my priority queue
bool operator<(const vertex& rhs) const
{
return d<rhs.d;
}
};
class graph
{
public:
graph(vector<vector<size_t> > v){ ed = v;};
vector<size_t> dijkstra(size_t src);
bool dfs(size_t src);
private:
//stores my matrix describing the graph
vector<vector<size_t> > ed;
};
The function dfs implements a depth-first search to check whether the graph is connected. I've got no problems with it. The function dijkstra, however, gives me the wrong values. This is how it's implemented.
vector<size_t> graph::dijkstra(size_t src)
{
//a vector storing the distances to the vertices and a priority queue
vector<size_t> dist;
dist[src] = 0;
p_q<vertex> q;
//set the distance for the vertices to infinity, since they're size_t and -1 is largest
for (size_t i = 0; i < ed.size(); i++) {
if(i!=src)
{
dist.push_back(-1);
}
//push the vertices to the priority queue
vertex node(dist[i], i);
q.push(node);
}
//while there's stuff in the queue
while(q.size())
{
//c, the current vertex, becomes the top
vertex c = q.pop();
//iterating through all the neighbours, listed in the adjacency matrix
for(int i = 0; i < ed[0].size(); i++)
{
//alternative distance to i is distance to current and distance between current and i
size_t alt = dist[c.n] + ed[c.n][i];
//if we've found a better distance
if(alt < dist[i])
{
//new distance is alternative distance, and it's pushed into the priority queue
dist[i] = alt;
vertex n(alt, i);
q.push(n);
}
}
}
return dist;
}
I can't see why I'm having trouble. I've debugged with this matrix.
0 3 -1 1
3 0 4 1
-1 4 0 -1
1 1 -1 0
And it didn't visit anything other than vertex 0 and vertex 3.
One of the problems is right at the beginning of graph::dijkstra, where an element of a zero-sized vector is assigned:
vector<size_t> dist;
dist[src] = 0;
That is OK in pseudo-code, but not in C++. You could change it like this:
vector<size_t> dist;
for (size_t i = 0; i < ed.size(); i++) {
if(i!=src)
{
dist.push_back(-1);
}
else
{
dist.push_back(0);
}
....
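For comparison, a minimal self-contained sketch of Dijkstra's algorithm on an adjacency matrix, using std::priority_queue instead of the custom p_q from the question. It assumes signed int weights in which -1 marks a missing edge, as in the debug matrix; the free function signature and the constant kUnreachable are choices made for this sketch. Keeping the weights signed matters: with size_t, a stored -1 becomes the largest possible value, and adding it to a distance wraps around instead of acting as infinity.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

constexpr int kUnreachable = std::numeric_limits<int>::max();

// ed[u][v] is the weight of edge u -> v, or -1 if there is no edge.
// Assumes total path weights fit in an int.
std::vector<int> dijkstra(const std::vector<std::vector<int>>& ed, std::size_t src) {
    const std::size_t n = ed.size();
    std::vector<int> dist(n, kUnreachable);  // sized up front, no push_back
    if (src >= n) return dist;
    dist[src] = 0;

    // Min-heap of (distance, vertex) pairs.
    using Item = std::pair<int, std::size_t>;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> q;
    q.push({0, src});

    while (!q.empty()) {
        const Item top = q.top();
        q.pop();
        const int d = top.first;
        const std::size_t u = top.second;
        if (d > dist[u]) continue;  // stale entry, a shorter path was found

        for (std::size_t v = 0; v < n; ++v) {
            const int w = ed[u][v];
            if (v == u || w < 0) continue;  // skip self and missing edges
            const int alt = d + w;
            if (alt < dist[v]) {
                dist[v] = alt;
                q.push({alt, v});
            }
        }
    }
    return dist;
}
```

With the matrix from the question and source 0, this sketch yields the distances 0, 2, 6 and 1 for vertices 0 to 3.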