Prevent Cycles in Maximum Spanning Tree - c++

I am trying to create a maximum spanning tree in C++ but am having trouble preventing cycles. The code I have works alright for some cases, but for the majority of cases there is a cycle. I am using an adjacency matrix to find the edges.
double maximumST( vector< vector<double> > adjacencyMatrix ) {
const int size = adjacencyMatrix.size();
vector <double> edges;
int edgeCount = 0;
double value = 0;
std::vector<std::vector<int>> matrix(size, std::vector<int>(size));
for (int i = 0; i < size; i++) {
for (int j = i; j < size; j++) {
if (adjacencyMatrix[i][j] != 0) {
edges.push_back(adjacencyMatrix[i][j]);
matrix[i][j] = adjacencyMatrix[i][j];
edgeCount++;
}
}
}
sort(edges.begin(), edges.end(), std::greater<double>());
for (int i = 0; i < (size - 1); i++) {
value += edges[i];
}
return value;
}
One way I've tried to detect a cycle was to create a new adjacency matrix for the edges and check it before adding a new edge, but that did not work as expected. I also tried to build a 3D matrix, but I could not get that to work either.
What's a new approach I should try to prevent cycles?

You should add the edge if the lowest common ancestor (LCA) of the two vertices corresponding to that edge is not the root.
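A different and widely used way to reject cycle-forming edges when processing edges in descending weight order (Kruskal's algorithm) is a disjoint-set (union-find) structure. This is not the LCA check described above, just a minimal sketch of the alternative. It assumes a symmetric adjacency matrix in which 0 means "no edge"; the names DisjointSet, WeightedEdge and maximumSTWeight are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: disjoint-set (union-find) with path halving.
struct DisjointSet {
    std::vector<std::size_t> parent;
    explicit DisjointSet(std::size_t n) : parent(n) {
        std::iota(parent.begin(), parent.end(), std::size_t{0});
    }
    std::size_t find(std::size_t x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];  // path halving
            x = parent[x];
        }
        return x;
    }
    // Returns false if a and b are already connected,
    // i.e. the edge (a, b) would close a cycle.
    bool unite(std::size_t a, std::size_t b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        parent[a] = b;
        return true;
    }
};

struct WeightedEdge {
    double weight;
    std::size_t u, v;
};

// Assumption: symmetric matrix, 0 means "no edge".
double maximumSTWeight(const std::vector<std::vector<double>>& adjacencyMatrix) {
    const std::size_t size = adjacencyMatrix.size();
    std::vector<WeightedEdge> edges;
    for (std::size_t i = 0; i < size; ++i)
        for (std::size_t j = i + 1; j < size; ++j)
            if (adjacencyMatrix[i][j] != 0)
                edges.push_back({adjacencyMatrix[i][j], i, j});

    // Heaviest edges first, keeping the endpoints attached to each weight.
    std::sort(edges.begin(), edges.end(),
              [](const WeightedEdge& a, const WeightedEdge& b) { return a.weight > b.weight; });

    DisjointSet components(size);
    double total = 0;
    for (const WeightedEdge& e : edges)
        if (components.unite(e.u, e.v))  // skip edges inside one component
            total += e.weight;
    return total;
}
```

The key difference from the code in the question is that each weight stays paired with its two endpoints, so the loop can tell whether an edge joins two separate components before adding it.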

Related

find if a structure already exists in a vector c++

I have to find the weight of all edges in a graph. Since the edges are bidirectional, I don't want to include 2 -> 1 if I already have 1 -> 2 (they will have the same weight). The edges are stored in a vector of the structure Edge. My initial idea was to look up whether an edge with the start and end positions swapped and the same weight already exists, and if so, just not do anything. However, I don't exactly know how to put that into code, so any help would be appreciated. Any approaches that could optimise the solution are also welcome.
struct Vertex {
Vertex(const int i = 0) : index {i}, key {max_key}, parent_index {undef}, processed {false} {}
int index; // vertex identifier
int key; // temporary minimal weight (Prim algorithm)
int parent_index; // temporary minimal distance neighbor vertex (Prim algorithm)
int processed; // flag used to mark vertices that are already included in V'
static constexpr int max_key = std::numeric_limits<int>::max();
static const int undef = -1;
};
struct Edge {
Edge(int va, int vb, int w) : vi1 {va}, vi2 {vb}, weight {w} { }
int vi1; //start point
int vi2; //end point
int weight;
};
struct Graph {
int N; // number of vertices
std::vector<Vertex> V; // set of vertices
std::vector<Edge> E; // set of edges
std::vector<Edge> MST; // minimal spanning tree
const int* weights_table; // weights given as distance matrix
};
The problem is here, in find. I know this is a lot of irrelevant code, but I post it so that you can picture it more clearly. If there is no connection between 2 vertices, they have a weight of -1.
// construct vertices and edges for a given graph
void createGraph(Graph& G) {
// TODO 5.1a: clear V and E and insert all vertex objects and edge objects
// - vertices are numbered (labeled) from 0 to N-1
// - edges exist if and only if there is positive distance between two vertices
// - edges are bidirectional, that is, edges are inserted only once between two vertices
G.E.clear();
G.V.clear();
for(int i = 0; i < G.N; i++){
Vertex V (i);
G.V.push_back(V);
}
for(int i = 0; i < G.N; i++){
for(int j = 0; j < G.N; j++){
Edge Ed (i,j,0);
int weight = getWeight(G,i,j);
if(weight > 0){
Ed.weight = weight;
auto it = find(G.E.begin(), G.E.end(), ....);
if( it != G.E.end() ) continue;
G.E.push_back(Ed);
}
}
}
}
Thanks!
since the edges are bidirectional
You can construct Edges such that vi1 <= vi2; then there is only one representation of each possible edge.
struct Edge {
Edge(int va, int vb, int w) : vi1 {std::min(va, vb)}, vi2 {std::max(va, vb)}, weight {w} { }
int vi1; // earlier point
int vi2; // later point
int weight;
};
Aside: prefer constructing the Edge in place
for(int i = 0; i < G.N; i++){
for(int j = G.N - 1; j >= 0 + i; j--){
int weight = getWeight(G,i,j);
if(weight > 0){
G.E.emplace_back(i, j, weight);
}
}
}
Okay, I think I got it by changing the second for loop to look this way, but I am also curious to see what the syntax would look like if find were used:
for(int i = 0; i < G.N; i++){
for(int j = G.N - 1; j >= 0 + i; j--){
Edge Ed (i, j , 0);
int weight = getWeight(G,i,j);
if(weight > 0){
Ed.weight = weight;
G.E.push_back(Ed);
}
}
}
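For the find syntax asked about above: std::find compares whole elements with operator==, which Edge does not define, so std::find_if with a predicate is the closer fit. A minimal sketch, assuming the Edge struct from the question; containsEdge is a hypothetical helper name.

```cpp
#include <algorithm>
#include <vector>

struct Edge {
    Edge(int va, int vb, int w) : vi1{va}, vi2{vb}, weight{w} {}
    int vi1;  // start point
    int vi2;  // end point
    int weight;
};

// Hypothetical helper: true if E already holds the edge (a, b) in either direction.
bool containsEdge(const std::vector<Edge>& E, int a, int b) {
    auto it = std::find_if(E.begin(), E.end(), [a, b](const Edge& e) {
        return (e.vi1 == a && e.vi2 == b) || (e.vi1 == b && e.vi2 == a);
    });
    return it != E.end();
}
```

Inside the double loop this would read if (containsEdge(G.E, i, j)) continue;. Note that every call scans the whole vector, so restricting the loop bounds as shown above is the cheaper way to avoid duplicates.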

Problem with a randomized graph contraction algorithm for min cut

Hi I am new to c++ and trying to implement a randomized graph contraction algorithm for min cut. The graph is represented as a vector of vectors (data) where the first entry of each vector denotes a vertex and the rest of the entries are the vertices connected to it.
Here is the function for randomized contraction:
std::vector<std::vector<int> > contraction (std::vector<std::vector<int> > data)
{
srand (time(NULL));
//choose a random vertex
int randNum = rand()%(data.size());
//choose a random vertex connected to the first
int randNum2 = rand()%(data[randNum].size()-1) +1;
//loop through all vertices
for (int i = 0; i < data.size(); i++)
{
//find the entry corresponding to the second vertex
if (data[i][0] == data[randNum][randNum2])
{
for (int j =1; j <data[i].size(); j++)
{
//no self loop
if (data[i][j] !=data[randNum][0])
//add the vertex connection from the second vertex to the first one
data[randNum].push_back(data[i][j]);
}
//remove the entry for the second vertex
data.erase (data.begin()+i-1);
}
}
for (int i = 0; i < data.size(); i++)
{
//all vertices connected to the second vertex are to be connected to the first
for (int j =1; j <data[i].size(); j++)
{
if (data[i][j] == data[randNum][randNum2])
data[i][j] = data[randNum][0];
}
}
//now the graph has one less vertex
return data;
}
This function is called iteratively in the main:
while (data.size() > 2)
{
data = contraction (data);
}
The graph is never contracted down to two vertices. The program terminates after a few iterations without any errors. The number of iterations it goes through is variable, typically 20-30. I cannot figure out why it terminates prematurely; any help would be appreciated.
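Two things in the function above stand out. data.erase(data.begin()+i-1) removes the row before the matched one (and is invalid when i is 0), and the loops keep reading data[randNum] after the erase, when that index may already refer to a different row. Below is a minimal sketch of a single contraction step that copies both vertex labels first and erases the merged row last. It assumes a connected graph with no self loops in the same row layout as the question. contractOnce and AdjacencyList are illustrative names, and it keeps the question's selection scheme (random vertex, then random neighbour), which is not uniform over edges.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

using AdjacencyList = std::vector<std::vector<int>>;

// Contracts one randomly chosen edge (u, v) by merging v into u.
// Assumptions: row[0] is the vertex label, row[1..] its neighbours,
// every row has at least one neighbour, and every neighbour has a row.
void contractOnce(AdjacencyList& data, std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pickRow(0, data.size() - 1);
    const std::size_t ui = pickRow(rng);
    std::uniform_int_distribution<std::size_t> pickNeighbour(1, data[ui].size() - 1);
    const int u = data[ui][0];
    const int v = data[ui][pickNeighbour(rng)];  // copy labels before mutating

    // Locate v's row.
    const auto vRow = std::find_if(data.begin(), data.end(),
                                   [v](const std::vector<int>& row) { return row[0] == v; });

    // Append v's neighbours to u's row.
    data[ui].insert(data[ui].end(), vRow->begin() + 1, vRow->end());

    // Relabel every reference to v as u.
    for (std::vector<int>& row : data)
        std::replace(row.begin() + 1, row.end(), v, u);

    // Drop self loops (u listed as its own neighbour).
    std::vector<int>& uRow = data[ui];
    uRow.erase(std::remove(uRow.begin() + 1, uRow.end(), u), uRow.end());

    // Remove v's row last, so ui and vRow stay valid until this point.
    data.erase(vRow);
}
```

The calling loop then becomes while (data.size() > 2) contractOnce(data, rng); with a single std::mt19937 created once in main, rather than reseeding with srand(time(NULL)) on every call.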

How to access a vector inside a vector?

So I have a vector of vectors of type double. I basically need to be able to set 360 numbers to cosY, and then put those 360 numbers into cosineY[0], then get another 360 numbers that are calculated with a different a, and put them into cosineY[1]. Technically my vector is going to be cosineY[a]. I then need to be able to take out just the cosY for an a that I specify...
My code is saying this:
for (int a = 0; a < 8; a++)
{
for (int n=0; n <= 360; n++)
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
which I hope is the correct way of actually setting it.
But then I need to take cosY for an a that I specify, and calculate another vector of 360 values, which will again be stored in a vector of vectors.
Right now I've got:
for (int a = 0; a < 8; a++)
{
for (int n = 0; n <= 360; n++)
{
cosProductPt[n] = (VectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
The VectorOfY is basically the amplitude of an input wave. What I am trying to do is create a cosine wave with different frequencies (a). I am then calculating the product of the input and cosine wave at each frequency. I need to be able to access these 360 points for each frequency later on in the program, and right now I also need to calculate the sum of all elements in cosProductPt, for every frequency (stored in CosProductY), and store it in a vector dotProductCos[a].
I've been trying to work it out but I don't know how to access all the elements in a vector of vectors to add them. I've been trying to do this for the whole day without any results. Right now I know so little that I don't even know how I would display or access a vector inside a vector, but I need to use that access point for the addition.
Thank you for your help.
for (int a = 0; a < 8; a++)
{
for (int n=0; n < 360; n++) // note traded in <= for <. I think you had an off by one
// error here.
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
Is sound so long as cosY has been pre-allocated to contain at least 360 elements. You could
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(360); // strongly consider replacing the 360 with a well-named
// constant
for (int a = 0; a < 8; a++) // same with that 8
{
for (int n=0; n < 360; n++)
{
cosY[n] = cos(a*vectorOfY[n]);
}
cosineY.push_back(cosY);
}
for example, but this hangs on to cosY longer than you need to and could cause problems later, so I'd probably scope cosY by throwing the above code into a function.
std::vector<std::vector<double>> buildStageOne(std::vector<double> &vectorOfY)
{
std::vector<std::vector<double>> cosineY;
std::vector<double> cosY(NumDegrees);
for (int a = 0; a < NumVectors; a++)
{
for (int n=0; n < NumDegrees; n++)
{
cosY[n] = cos(a*vectorOfY[n]); // take radians into account if needed.
}
cosineY.push_back(cosY);
}
return cosineY;
}
Returning the vector by value may look expensive, but since C++11 the local vector is at worst moved out of the function, and the vast majority of compilers apply copy elision (named return value optimization) to eliminate even that.
Then I'd do almost the exact same thing for the second step.
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch buffer, reused for every a
for (int a = 0; a < NumVectors; a++)
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosineY[a][n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
But we can make a couple optimizations
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees); // scratch buffer, reused for every a
for (int a = 0; a < NumVectors; a++)
{
// why risk constantly looking up cosineY[a]? grab it once and cache it
std::vector<double> & cosY = cosineY[a]; // note the reference
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
And the next is kind of an extension of the first:
std::vector<std::vector<double>> buildStageTwo(std::vector<double> &vectorOfY,
std::vector<std::vector<double>> &cosineY)
{
std::vector<std::vector<double>> CosProductY;
std::vector<double> cosProductPt(NumDegrees);
for (std::vector<double> & cosY: cosineY) // range based for. Gets rid of
{
for (int n = 0; n < NumDegrees; n++)
{
cosProductPt[n] = (vectorOfY[n]*cosY[n]);
}
CosProductY.push_back(cosProductPt);
}
return CosProductY;
}
We could do the same range-based for trick for the for (int n = 0; n < NumDegrees; n++), but since we are iterating multiple arrays here it's not all that helpful.
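For the last part of the question, summing the elements of each inner vector into something like dotProductCos[a], std::accumulate from <numeric> does the job. A minimal sketch; sumEachRow is a hypothetical helper name, and it is meant to be called with the vector of vectors returned by the second stage.

```cpp
#include <numeric>
#include <vector>

// Returns one sum per inner vector, so result[a] is the sum of rows[a].
std::vector<double> sumEachRow(const std::vector<std::vector<double>>& rows) {
    std::vector<double> sums;
    sums.reserve(rows.size());
    for (const std::vector<double>& row : rows) {
        // The initial value must be 0.0 (a double); a plain 0 would make
        // accumulate add everything up as int.
        sums.push_back(std::accumulate(row.begin(), row.end(), 0.0));
    }
    return sums;
}
```

A single element is reached with two indices, for example rows[a][n]: the first picks the inner vector, the second picks the value inside it.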

Find similar distances between all values in vector and subset them

Given is a vector with double values. I want to know which distances between any elements of this vector have a similar distance to each other. In the best case, the result is a vector of subsets of the original values where subsets should have at least n members.
//given
vector<double> values = {1,2,3,4,8,10,12}; //with simple values as example
//some algorithm
//desired result as:
vector<vector<double> > subset;
//in case of above example I would expect some result like:
//subset[0] = {1,2,3,4}; //distance 1
//subset[1] = {8,10,12}; //distance 2
//subset[2] = {4,8,12}; // distance 4
//subset[3] = {2,4}; //also distance 2 but not connected with subset[1]
//subset[4] = {1,3}; //also distance 2 but not connected with subset[1] or subset[3]
//many others if n is just 2. If n is 3 (normally the minimum) these small subsets should be excluded.
This example is simplified as the distances of integer numbers could be iterated and tested for the vector which is not the case for double or float.
My idea so far
I thought of something like calculating the distances and storing them in a vector. Creating a difference distance matrix and thresholding this matrix for some tolerance for similar distances.
//Calculate distances: result is a vector
vector<double> distances;
for (int i = 0; i < values.size(); i++)
for (int j = 0; j < values.size(); j++)
{
if (i >= j)
continue;
distances.push_back(abs(values[i] - values[j]));
}
//Calculate difference of these distances: result is a matrix
Mat DiffDistances = Mat::zeros(Size(distances.size(), distances.size()), CV_32FC1);
for (int i = 0; i < distances.size(); i++)
for (int j = 0; j < distances.size(); j++)
{
if (i >= j)
continue;
DiffDistances.at<float>(i,j) = abs(distances[i] - distances[j]);
}
//threshold this matrix with some tolerance in difference distances
threshold(DiffDistances, DiffDistances, maxDistTol, 255, CV_THRESH_BINARY_INV);
//get points with similar distances
vector<Point> DiffDistancePoints;
findNonZero(DiffDistances, DiffDistancePoints);
At this point I get stuck with finding the original values corresponding to my similar distances. It should be possible to find them, but it seems very complicated to trace back the indices and I wonder if there isn't an easier way to solve the problem.
Here is a solution that works as long as there are no branches, meaning that there are no values closer together than 2*threshold. That is the valid neighbor region, because neighboring bonds should differ by less than the threshold, if I understood @Phann correctly.
The solution is definitely neither the fastest nor the nicest possible. But you might use it as a starting point:
#include <iostream>
#include <vector>
#include <algorithm>
int main(){
std::vector< double > values = {1,2,3,4,8,10,12};
const unsigned int nValues = values.size();
std::vector< std::vector< double > > distanceMatrix(nValues - 1);
// The distanceMatrix has a triangular shape
// First vector contains all distances to value zero
// Second row all distances to value one for larger values
// nth row all distances to value n-1 except those already covered
std::vector< std::vector< double > > similarDistanceSubsets;
double threshold = 0.05;
std::sort(values.begin(), values.end());
for (unsigned int i = 0; i < nValues-1; ++i) {
distanceMatrix.at(i).resize(nValues-i-1);
for (unsigned j = i+1; j < nValues; ++j){
distanceMatrix.at(i).at(j-i-1) = values.at(j) - values.at(i);
}
}
for (unsigned int i = 0; i < nValues-1; ++i) {
for (unsigned int j = i+1; j < nValues; ++j) {
std::vector< double > thisSubset;
double thisDist = distanceMatrix.at(i).at(j-i-1);
// This distance already belongs to another cluster
if (thisDist < 0) continue;
double minDist = thisDist - threshold;
double maxDist = thisDist + threshold;
thisSubset.push_back(values.at(i));
thisSubset.push_back(values.at(j));
//Indicate that this is already clustered
distanceMatrix.at(i).at(j-i-1) = -1;
unsigned int lastIndex = j;
for (unsigned int k = j+1; k < nValues; ++k) {
thisDist = distanceMatrix.at(lastIndex).at(k-lastIndex-1);
// This distance already belongs to another cluster
if (thisDist < 0) continue;
// Check if you found a new valid pair
if ((thisDist > minDist) && (thisDist < maxDist)){
// Update the valid distance interval
minDist = thisDist - threshold;
maxDist = thisDist + threshold;
// Add the newly found point
thisSubset.push_back(values.at(k));
// Indicate that this is already clustered
distanceMatrix.at(lastIndex).at(k-lastIndex-1) = -1;
// Continue the search from here
lastIndex = k;
}
}
if (thisSubset.size() > 2) {
similarDistanceSubsets.push_back(thisSubset);
}
}
}
for (unsigned int i = 0; i < similarDistanceSubsets.size(); ++i) {
for (unsigned int j = 0; j < similarDistanceSubsets.at(i).size(); ++j) {
std::cout << similarDistanceSubsets.at(i).at(j);
if (j != similarDistanceSubsets.at(i).size()-1) {
std::cout << " ";
}
else {
std::cout << std::endl;
}
}
}
}
The idea is to precompute the distances and then look for every pair of particles, starting from the smallest and its larger neighbors, if there is another valid pair above it. If so these are all collected in a subset and this is added to the subset vector. For every new value the valid neighbor region has to be updated to ensure that neighboring distances differ by less than the threshold. Afterwards, the program continues with the next smallest value and its larger neighbors and so on.
Here is an algorithm which is slightly different from yours, which is O(n^3) in the length n of the vector - not very efficient.
It is based on the premise that you want to have subsets of at least size 2. So what you can do is consider all the two-element subsets of the vector, then find all other elements that also match.
So given a function
std::vector<int> findSubset(std::vector<int> v, int baseValue, int distance) {
// Find the subset of all elements in v that differ by a multiple of
// distance from the base value
}
you can do
std::vector<std::vector<int>> findSubsets(std::vector<int> v) {
std::vector<std::vector<int>> subsets;
for(int i = 0; i < v.size(); i++) {
for(int j = i + 1; j < v.size(); j++) {
subsets.push_back(findSubset(v, v[i], abs(v[i] - v[j])));
}
}
return subsets;
}
The only remaining problem is keeping track of duplicates; you could keep a hashed set of (baseValue % distance, distance) pairs for all the subsets you have already found.
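A minimal sketch of how the two functions and the duplicate tracking might fit together for integer values. It uses a std::set of pairs instead of a hashed container, purely to keep the example short, and it skips pairs of equal values. For the double values in the question the modulo test would have to be replaced by a comparison with a tolerance.

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>
#include <utility>
#include <vector>

// All elements of v that differ from baseValue by a multiple of distance.
// Assumes distance > 0.
std::vector<int> findSubset(const std::vector<int>& v, int baseValue, int distance) {
    std::vector<int> subset;
    for (int x : v)
        if ((x - baseValue) % distance == 0)
            subset.push_back(x);
    return subset;
}

std::vector<std::vector<int>> findSubsets(const std::vector<int>& v) {
    std::vector<std::vector<int>> subsets;
    std::set<std::pair<int, int>> seen;  // (baseValue mod distance, distance)
    for (std::size_t i = 0; i < v.size(); i++) {
        for (std::size_t j = i + 1; j < v.size(); j++) {
            const int distance = std::abs(v[i] - v[j]);
            if (distance == 0) continue;  // equal values define no distance
            // Normalise the remainder so negative values map to the same key.
            const int residue = ((v[i] % distance) + distance) % distance;
            if (seen.insert({residue, distance}).second)  // first time we see this key
                subsets.push_back(findSubset(v, v[i], distance));
        }
    }
    return subsets;
}
```

With the input {1, 3, 5, 6}, the pair (3, 5) produces the same key as (1, 3), so the subset {1, 3, 5} is only reported once.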

How to compute the complement of given vector Indices?

I have a 3D point vector, represented by class Point3D,
std::vector<Point3D> points;
I also have a size_t vector containing indices of the points vector,
std::vector<size_t> indices_true;
Now I want to build the inverse of indices_true, i.e. I want to build another index vector indices_false that contains all indices which are missing in indices_true. How can this be done in a faster way than the following:
for (size_t i = 0; i < points.size(); i++)
{
// TODO: The performance of the following is awful
if (std::find(indices_true.begin(), indices_true.end(), i) == indices_true.end())
indices_false.push_back(i);
}
Needs extra memory, but yields a linear algorithm:
Here is an attempt (neither compiled, nor tested):
indices_false.reserve(points.size() - indices_true.size());
std::vector<char> isTrue(points.size(), false); // avoided std::vector<bool> intentionally
for (const size_t i : indices_true)
{
isTrue[i] = true;
}
for (size_t i = 0; i < points.size(); ++i)
{
if (!isTrue[i])
indices_false.push_back(i);
}
Sort your indices_true vector first, then use std::binary_search to test each index. The sort below uses std::stable_sort, which keeps the relative order of equal elements.
std::stable_sort(indices_true.begin(), indices_true.end());
for (size_t i = 0; i < points.size(); i++)
{
if (!std::binary_search(indices_true.begin(), indices_true.end(), i))
indices_false.push_back(i);
}
Sort indices_true and walk an index k through the sorted vector, advancing it whenever i has passed indices_true[k]. Apart from the initial sort, this yields a linear algorithm.
Here is an attempt (neither compiled, nor tested):
std::sort(begin(indices_true), end(indices_true));
indices_false.reserve(points.size() - indices_true.size());
size_t k = 0;
for (size_t i = 0; i < points.size(); ++i)
{
while (k < indices_true.size() && i > indices_true[k]) // while, so duplicates in indices_true are skipped too
++k;
assert(k >= indices_true.size() || i <= indices_true[k]);
if (k >= indices_true.size() || i != indices_true[k])
indices_false.push_back(i);
}