Prim's Algorithm with matrices - c++

I am trying to implement Prim's algorithm with C++ and matrices.
Here is my problem:
int node[] = {11, 11, 0, 11, 11, 11, 11, 11};
int nodeCon[8];
void generatePrims() {
int cNode = 3;
for (int i = 1; i <= 8; i++) {
if (graph[cNode][i] != 0){
if (node[i] > graph[cNode][i]) {
node[i] = graph[cNode][i];
nodeCon[i] = cNode;
}
}
}
};
cNode is the starting node.
graph[][] is the 2D matrix that holds the connections.
nodeCon[] is the array that will hold the connections for the MST (which node is connected to which).
node[] holds the cost value for each entry in nodeCon.
My question is: how do I continue to the next hop? Let's say I have found the minimum connection and set cNode = minConnection; what should the loop look like? And how do I know that I have examined all the nodes?
Thanks in advance

Something like this:
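#include <climits> // INT_MAX
const int N = 8;
int node[N] = {11,11,0,11,11,11,11,11}; // cheapest known edge into each vertex (11 acts as "infinity")
int used[N] = {0};                      // 1 once a vertex has joined the tree
int nodeCon[N];                         // tree neighbour of each vertex
void generatePrims(){
    int cNode = 2;                      // start vertex, 0-based (the one with node[] == 0)
    for(int step = 0; step < N; ++step) {
        used[cNode] = 1;
        int next = -1, min_now = INT_MAX;
        for(int i = 0; i < N; ++i){
            if(used[i]) continue;
            // graph[cNode][i] == 0 means "no edge", as in the question
            if(graph[cNode][i] != 0 && node[i] > graph[cNode][i]){
                node[i] = graph[cNode][i];
                nodeCon[i] = cNode;
            }
            if(node[i] < min_now) {
                min_now = node[i];
                next = i;
            }
        }
        if(next == -1) break;           // every vertex is already in the tree
        cNode = next;
    }
}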
Also worth noting: it will be faster if, instead of the 'used' array, you keep a list of unused vertices.

I can't comment on the previous answer yet (I don't have enough reputation), so I'm posting this as a separate answer. Piotr's solution is almost correct; however, I believe Prim's algorithm takes into account more than just the current node. An example can be seen here: Prim's Algorithm. What this essentially means is that you need to check the edges leaving every node you have visited, not just the most recent one.
This means you will need to store a vector containing the nodes you have visited and "for each" through them, as opposed to just checking the edges from the last node you visited.
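One common way to take every visited node into account without rescanning them all is to remember, for each unvisited vertex, the cheapest edge that connects it to any vertex already in the tree. The sketch below illustrates that idea on an adjacency matrix. The function name primMST and the use of std::vector are assumptions made for this example; as in the question, a weight of 0 is taken to mean "no edge".

```cpp
#include <climits>
#include <vector>

// Hypothetical helper (not from the original posts): returns parent[v] for each
// vertex of the minimum spanning tree, with parent[start] == -1.
// graph is a square adjacency matrix; a weight of 0 means "no edge".
std::vector<int> primMST(const std::vector<std::vector<int>>& graph, int start) {
    const int n = static_cast<int>(graph.size());
    std::vector<int> key(n, INT_MAX);   // cheapest edge from the tree to each vertex
    std::vector<int> parent(n, -1);     // tree endpoint of that cheapest edge
    std::vector<bool> inTree(n, false);
    key[start] = 0;

    for (int step = 0; step < n; ++step) {
        // Pick the reachable vertex outside the tree with the smallest key.
        int u = -1;
        for (int v = 0; v < n; ++v)
            if (!inTree[v] && key[v] != INT_MAX && (u == -1 || key[v] < key[u]))
                u = v;
        if (u == -1) break;             // the remaining vertices are unreachable
        inTree[u] = true;

        // Edges leaving u may improve the keys of vertices still outside the tree.
        for (int v = 0; v < n; ++v)
            if (graph[u][v] != 0 && !inTree[v] && graph[u][v] < key[v]) {
                key[v] = graph[u][v];
                parent[v] = u;
            }
    }
    return parent;
}
```

Because key[v] is only ever lowered, it always holds the cheapest edge from v to any vertex visited so far, so no separate list of visited nodes has to be traversed. The loop ends after n steps, or earlier if the graph is disconnected.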

The following site has the algorithm implemented, along with a JUnit test class, so it should be what you are looking for. The unit test class has an actual matrix, of course, and the implementation class has the code.
http://www.geekviewpoint.com/java/graph/mst


How to write BFS function in C++?

#include <iostream>
#include <string>
#include <queue>
using namespace std;
void BFS(const string&, const string[], int[][10]);
int main()
{
const int CAP = 10;
string states[CAP] = { "Arizona", "California", "Idaho", "Nevada", "Oregon", "Utah", "Washington" };
string Point;
int matrix[CAP][CAP] =
{
{0,1,0,1,0,1,0},
{1,0,0,1,1,0,0},
{0,0,0,1,1,1,1},
{0,1,1,1,0,0,1},
{1,1,1,1,0,0,0},
{0,0,1,0,1,0,0},
{0,0,1,0,1,0,0}
};
BFS("California", states, matrix);
}
void BFS(const string& Point, const string states[], int matrix[][10])
{
int SPoint = 0;
queue<string> visited;
queue<string> Queue;
string temp = Point;
visited.push(temp);
do
{
for (int i = 0; i < 10; i++)
{
if (states[i] == temp)
{
SPoint = i;
}
}
for (int i = 0; i < 10; i++)
{
if (matrix[SPoint][i] == 1)
{
Queue.push(states[i]);
}
}
visited.push(Queue.front());
Queue.pop();
temp = visited.back();
} while (!Queue.empty());
for (int i = 0; i < 10; i++)
{
cout << visited.front();
visited.pop();
}
}
I'm doing an exercise where I have to make a function that does Breadth-First Search and prints out the visited path. But my function wouldn't print anything. What am I doing wrong here?
Note: The matrix is in alphabetical order and represents the connections between states.
My expected output: California Arizona Oregon Nevada Utah Idaho Washington
While I won't offer a complete solution, I can help identify some of the issues the code exhibits.
Major issues
Since you have a cyclic graph, it's important to mark nodes as visited during the BFS else you'll wind up with an infinite loop (which is why nothing gets printed in your current implementation). Your visited queue could be an unordered_set. When nodes are visited, add them to the set and write a conditional to avoid visiting them again.
The adjacency matrix doesn't appear correct. Since it's an undirected graph, I would anticipate that the matrix would be mirrored from top left to bottom right, but it's not. Also, there are no self-edges in the graph yet Nevada appears to have an edge to itself in the matrix.
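There's no need to loop over the states array to find an index; you can index into the adjacency matrix directly by mapping integer indexes and string names appropriately. If you do need to loop, note that only 7 of the 10 rows and columns hold data, so running to 10 walks over unused, zero-filled entries and empty state names.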
Minor issues
There's no sense in arbitrarily restricting the matrix size. Although the assignment enforces this, it's a poor design choice because the code needs to be rewritten any time you want to use a different input graph.
A matrix seems like a slightly awkward data structure here because it introduces an extra layer of indirection to translate strings into integers and back. Although the project doesn't permit it, I'd prefer using a structure like:
std::unordered_map<std::string, std::vector<std::string>> graph({
{"California", {"Oregon", "Nevada", "Arizona"}},
// ... more states ...
});
Ideally, these would be Node objects with neighbor vector members instead of strings.
C++ offers std::vector and std::array which are preferable to C arrays. I assume they haven't been introduced yet in your class or aren't permitted on the assignment, but if you're stuck, you can try writing the code using them, then re-introducing your instructor's constraints after you get it working. If nothing else, it'd be a learning experience.
Avoid using namespace std;.
Reserve uppercase variable names for class names. Objects and primitives should be lowercase.
Pseudocode for BFS
This assumes the preferred data structure above; it's up to you to convert to and from strings and adjacency matrix indexes as needed.
func BFS(string start, unordered_map<string, vector<string>> graph):
    vector traversal
    queue worklist = {start}
    unordered_set visited = {start}
    while !worklist.empty():
        curr = worklist.pop_front()
        traversal.push_back(curr)
        for neighbor in graph[curr]:
            if neighbor not in visited:
                visited.insert(neighbor)
                worklist.push(neighbor)
    return traversal
Since this is an assignment, I'll leave it at this and let you take another crack at the code. Good luck.
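For reference, the pseudocode maps onto standard containers roughly as follows. This is only a sketch on the string-keyed map suggested above; the Graph alias and the function name bfs are assumptions for this example, and it deliberately leaves out the conversion to and from the assignment's adjacency matrix.

```cpp
#include <queue>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical alias for the string-keyed adjacency list described above.
using Graph = std::unordered_map<std::string, std::vector<std::string>>;

// Returns the nodes reachable from start in breadth-first order.
std::vector<std::string> bfs(const std::string& start, const Graph& graph) {
    std::vector<std::string> traversal;
    std::queue<std::string> worklist;
    std::unordered_set<std::string> visited{start};
    worklist.push(start);

    while (!worklist.empty()) {
        std::string curr = worklist.front();
        worklist.pop();
        traversal.push_back(curr);

        auto it = graph.find(curr);
        if (it == graph.end()) continue;   // node without an adjacency entry
        for (const std::string& neighbor : it->second) {
            // insert() reports whether the neighbor was newly added,
            // so each node is enqueued at most once.
            if (visited.insert(neighbor).second)
                worklist.push(neighbor);
        }
    }
    return traversal;
}
```

Marking a node as visited when it is enqueued, rather than when it is dequeued, is what keeps a cyclic graph from filling the queue with duplicates.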

How to std::move class with std::vector<class> member into another vector<class>?

I'm trying to linearize a hierarchy of 'Node' classes into a single (std::vector) array. This is a complete c++ program code demonstrating the problem, minimalized as much as I think is possible:
#include <iostream>
#include <vector>
struct Node;
struct B{
int nvar1;
std::vector<Node> Children;
};
struct Node{
B bvar1;
};
void Linearize(Node & NODE, std::vector<Node> & ArrayOfNodes){
std::cout<<"Linearizing started.\n";
ArrayOfNodes.push_back(std::move(NODE));
Node & node = ArrayOfNodes.back();
for(int n = 0; n < node.bvar1.Children.size(); n++){
std::cout<<"Running loop "<<n<<" of "<<node.bvar1.Children.size()<<"\n";
Linearize(node.bvar1.Children[n], ArrayOfNodes);
}
std::cout<<"Done with node linearization.\n";
}
int main(){
Node ParentNode;
//Fill the ParentNode
ParentNode.bvar1.nvar1 = 0;
ParentNode.bvar1.Children.resize(2);
ParentNode.bvar1.Children[0].bvar1.Children.resize(2);
ParentNode.bvar1.Children[0].bvar1.Children[0].bvar1.nvar1 = 1;
ParentNode.bvar1.Children[0].bvar1.Children[1].bvar1.nvar1 = 2;
ParentNode.bvar1.Children[1].bvar1.nvar1 = 3;
std::cout<<"I do get to the linearizing.\n";
std::vector<Node> ArrayOfNodes;
Linearize(ParentNode, ArrayOfNodes);
std::cout<<"I do get to the displaying part.\n";
for(int n = 0; n < ArrayOfNodes.size(); n++){
std::cout<<ArrayOfNodes[n].bvar1.nvar1<<"\n";
}
return 0;
}
This crashes the program. The output until the crash is:
I do get to the linearizing.
Linearizing started.
Running loop 0 of 2
Linearizing started.
Running loop 0 of 2
Linearizing started.
Done with node linearization.
Done with node linearization.
Running loop 1 of 18446744073709191157
Linearizing started.
Running loop 0 of 1011712
Linearizing started.
I'm trying to get an elegant and efficient solution here. The 'Node' class can get large and contains many other classes and vectors. Given the data size, I'm reluctant to construct move constructors/assignments to cover all that data structure.
What I want to do would work with this code:
void Linearize(Node & NODE, std::vector<Node> & ArrayOfNodes){
std::cout<<"Linearizing started.\n";
ArrayOfNodes.push_back(NODE);
Node & node = ArrayOfNodes.back();
for(int n = 0; n < NODE.bvar1.Children.size(); n++){
std::cout<<"Running loop "<<n<<" of "<<node.bvar1.Children.size()<<"\n";
Linearize(NODE.bvar1.Children[n], ArrayOfNodes);
}
std::cout<<"Done with node linearization.\n";
}
But that would copy stuff, when I want to move it. I want it to be more efficient than this.
Basically two question(s/ groups):
If the default move constructor is called, why aren't the Nodes moved properly to ArrayOfNodes? Doesn't the default move constructor call the move constructor of every member, and std::vector has pointers inside anyway, so it should still point to the same data when moved? What part of the process am I misunderstanding?
What would be a standard/good/veteran coder solution to this kind of situation (linearization)?
Any and all comments welcome, this is my first question, if I'm doing something wrong or could just do better, tell me. Thanks!
What happens is that node is a reference (i.e. pointer) to an item in the std::vector that you are constructing. After taking the reference, you use push_back on the vector, which will grow the underlying array and hence might invalidate all pointers to it (growing the array often means a new, larger memory block is allocated and all data is moved to it). When you then want to access the next child of the node, you are referencing freed memory.
There are 3 ways to solve this. First, pre-allocate the array before starting the linearization process:
std::vector<Node> ArrayOfNodes;
ArrayOfNodes.reserve(numberOfNodes); // <-- you need to be able to determine this
Linearize(ParentNode, ArrayOfNodes);
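If the number of nodes is not known up front, it could be computed with a small recursive pass before reserving. The sketch below reuses the struct layout from the question; CountNodes is a hypothetical helper introduced only for this example, and the default initializer on nvar1 is an addition so that unset values are well defined.

```cpp
#include <cstddef>
#include <vector>

struct Node;
struct B {
    int nvar1 = 0;
    std::vector<Node> Children;
};
struct Node {
    B bvar1;
};

// Hypothetical helper: counts a node plus all of its descendants.
std::size_t CountNodes(const Node& node) {
    std::size_t total = 1;                       // the node itself
    for (const Node& child : node.bvar1.Children)
        total += CountNodes(child);
    return total;
}
```

With this, the reserve call would become ArrayOfNodes.reserve(CountNodes(ParentNode)). As long as the capacity is never exceeded, push_back does not reallocate, so references into the vector stay valid during the recursion.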
A second solution would be to push the children on the vector before you move the node:
void Linearize(Node & NODE, std::vector<Node> & ArrayOfNodes){
std::cout<<"Linearizing started.\n";
for(int n = 0; n < NODE.bvar1.Children.size(); n++){
std::cout<<"Running loop "<<n<<" of "<< NODE.bvar1.Children.size()<<"\n";
Linearize(NODE.bvar1.Children[n], ArrayOfNodes);
}
ArrayOfNodes.push_back(std::move(NODE));
std::cout<<"Done with node linearization.\n";
}
A third solution would be to not take a reference to the node in the vector, but to take its index:
void Linearize(Node & NODE, std::vector<Node> & ArrayOfNodes){
std::cout<<"Linearizing started.\n";
size_t index = ArrayOfNodes.size();
ArrayOfNodes.push_back(std::move(NODE));
for(int n = 0; n < ArrayOfNodes[index].bvar1.Children.size(); n++){
std::cout<<"Running loop "<<n<<" of "<<ArrayOfNodes[index].bvar1.Children.size()<<"\n";
Linearize(ArrayOfNodes[index].bvar1.Children[n], ArrayOfNodes);
}
std::cout<<"Done with node linearization.\n";
}
A totally different approach would be to not move the nodes at all, but to construct a std::vector<Node*>, and fill it with pointers to the nodes. But that might not be what you're after.
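A minimal sketch of that pointer-based variant, assuming the tree itself is not modified or destroyed while the pointers are in use. CollectPointers is a hypothetical name, and the structs mirror the ones in the question.

```cpp
#include <vector>

struct Node;
struct B {
    int nvar1 = 0;
    std::vector<Node> Children;
};
struct Node {
    B bvar1;
};

// Hypothetical helper: appends a pointer to node and to every descendant,
// parent first (pre-order). Nothing is copied or moved.
void CollectPointers(Node& node, std::vector<Node*>& out) {
    out.push_back(&node);
    for (Node& child : node.bvar1.Children)
        CollectPointers(child, out);
}
```

Growing the vector of pointers only relocates the pointers, not the nodes they refer to, so the invalidation problem described above does not arise. The trade-off is that the nodes stay scattered in memory instead of being contiguous.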

Execution time of creating a graph adt in C++

Generally, is creating an undirected graph ADT supposed to take a long time?
If I have a graph of 40 nodes, and each node is connected to 20% of the other nodes, my program will stall when it tries to link the nodes together.
The max I can really get up to is 20% density of 20 nodes. My code to link vertexes together looks like this:
while(CalculateDensity()){
LinkRandom();
numLinks++;
}
void LinkRandom(){
int index = rand()%edgeList.size();
int index2 = rand()%edgeList.size();
edgeList.at(index).links.push_back(edgeList.at(index2));
edgeList.at(index2).links.push_back(edgeList.at(index));
}
Is there any way to do this faster?
EDIT: Here is the data structure declaration:
for(int i=0; i<TOTAL_NODES; i++){
Node *ptr = new Node();
edgeList.push_back(*ptr); //populate edgelist with nodes
}
cout<<"edgelist populated"<<endl;
cout<<"linking nodes..."<<endl;
while(CalculateDensity()){
LinkRandom();
numLinks++;
}
Seems to me that you're copying a growing structure with each push_back.
That could be the cause of slowness.
If you could show the data structure declaration I could try to be more specific.
Edit: The Node declaration is still missing; nevertheless, I would try changing edgeList to a list of pointers to Node. Then:
// hypothetical declaration
class Node {
public:
    vector<Node*> links; // neighbours, stored as pointers rather than copies
};
vector<Node*> edgeList;
//populate edgelist with nodes
for(int i=0; i<TOTAL_NODES; i++)
    edgeList.push_back(new Node());
....
void LinkRandom(){
    int index = rand()%edgeList.size();
    int index2 = rand()%edgeList.size();
    edgeList.at(index)->links.push_back(edgeList.at(index2));
    edgeList.at(index2)->links.push_back(edgeList.at(index));
}
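An alternative that avoids both the copies and manual new/delete is to store neighbour indices instead of Node objects. The sketch below assumes a node needs nothing beyond its list of links; the Graph struct, the linkRandom signature, and the seed parameter are all made up for this example.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical index-based graph: links[v] lists the neighbours of vertex v.
struct Graph {
    std::vector<std::vector<std::size_t>> links;

    explicit Graph(std::size_t nodeCount) : links(nodeCount) {}

    // Adds an undirected edge by recording each endpoint in the other's list.
    void link(std::size_t a, std::size_t b) {
        links[a].push_back(b);
        links[b].push_back(a);
    }
};

// Adds edgeCount random undirected edges. Like the code in the question,
// this does not filter out self-loops or duplicate edges.
void linkRandom(Graph& g, std::size_t edgeCount, unsigned seed) {
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pick(0, g.links.size() - 1);
    for (std::size_t i = 0; i < edgeCount; ++i)
        g.link(pick(rng), pick(rng));
}
```

Each link now costs two integer push_backs, independent of how many links the endpoints already have, whereas pushing a Node by value copies that node's entire links vector, including the copies nested inside it.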

Finding occurrence of vector entries in another vector without nested for loops

I have a piece of code that I'm migrating from Fortran to C++, and I'd like to avoid some of the nested for loop structures I had to create in the original F77 code.
The problem is this: I have a vector of objects called nodes that each include a vector holding (among other important info) the indices of other node objects to which each is connected (a connection graph). Like this
struct Node {
vector<int> conNode;
};
vector<Node> listOfNodes;
vector<int> nodeListA; // a subset of nodes of interest stored as their vector indices
I need to look for nodes that nodes in nodeListA are connected to, but only if those nodes are also in nodeListA. Right now, my code looks something like this:
// Loop over the subset of node indices
for (int i=0; i<nodeListA.size(); i++) {
// Loop over the nodes connected to the node i
for (int j=0; j<listOfNodes[nodeListA[i]].conNode.size(); j++) {
// Loop over the subset of node indices again
for (int k=0; k<nodeListA.size(); k++) {
// and determine if any of node i's connections are in the subset list
if (nodeListA[k] == listOfNodes[nodeListA[i]].conNode[j]) {
// do stuff here
}
}
}
}
There HAS to be a much simpler way to do this. It seems like I'm making this way too complicated. How can I simplify this code, possibly using the standard algorithm library?
If your variable should express a set of values, use std::set instead of std::vector. Then you'll have
typedef std::set<int> SetOfIndices;
SetOfIndices setOfIndices; // instead of nodeListA
for(SetOfIndices::const_iterator iter = setOfIndices.begin(); iter != setOfIndices.end(); ++iter)
{
    Node const & node = listOfNodes[*iter];
    for (int j = 0; j < node.conNode.size(); ++j)
    {
        if (setOfIndices.find(node.conNode[j]) != setOfIndices.end())
        {
            // do stuff here
        }
    }
}
EDIT
As Jerry Coffin suggests, std::set_intersection can be used inside the outer loop:
typedef std::set<int> SetOfIndices;
struct Node {
    SetOfIndices conNode; // std::set_intersection needs sorted ranges, which std::set provides
};
SetOfIndices setOfIndices; // instead of nodeListA
for(SetOfIndices::const_iterator iter = setOfIndices.begin(); iter != setOfIndices.end(); ++iter)
{
    Node const & node = listOfNodes[*iter];
    std::vector<int> interestingNodes;
    std::set_intersection(setOfIndices.begin(), setOfIndices.end(),
                          node.conNode.begin(), node.conNode.end(),
                          std::back_inserter(interestingNodes));
    for (int j = 0; j < interestingNodes.size(); ++j)
    {
        // do stuff here
    }
}
ANOTHER EDIT
About efficiency: it depends on which operation dominates. The number of executions of the part marked "do stuff here" will not vary. The difference is in the time spent traversing your collections:
Your original code - nodeListA.size()^2 * [average conNode size]
My first solution - nodeListA.size() * log(nodeListA.size()) * [average conNode size]
After Jerry Coffin's suggestion - nodeListA.size() * (nodeListA.size() + [average conNode size]), because std::set_intersection walks both ranges in full
So it seems that using set_intersection doesn't help in this case.
I'd suggest using a dictionary (an O(log n) one like std::set, or better a hash-based one like std::unordered_set from C++11) for nodeListA. The following is a C++11 code example.
#include <unordered_set>
#include <vector>
struct Node {
    std::vector<int> conNode;
};
int main()
{
    std::vector<Node> listOfNodes;
    std::unordered_set<int> nodeListA;
    for (int node_id : nodeListA)
        for (int connected_id : listOfNodes[node_id].conNode)
            if (nodeListA.find(connected_id) != end(nodeListA))
                /* Do stuff here.. */
                ;
    return 0;
}
The advantage of using a std::unordered_set is that look-ups (i.e. searching for a given node-id) are extremely fast. The implementation included in your standard library, however, may not be particularly fast. Google's sparse hash and dense hash implementation is an alternative that provides the same interface and is known to be very good for most purposes: http://code.google.com/p/sparsehash/
Depending on what you want to do with the resulting nodes, it may be possible to replace the inner loop of the above code with an STL algorithm. For example, if you want to put all the nodes identified by the algorithm in a vector, you could code it as follows (use this as a replacement for both loops together):
std::vector<int> results;
for (int node_id : nodeListA)
std::copy_if(begin(listOfNodes[node_id].conNode),
end(listOfNodes[node_id].conNode),
back_inserter(results),
[&nodeListA](int id){return nodeListA.find(id) != end(nodeListA);});
Again, this is C++11 syntax; it uses a lambda as function argument.
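Wrapped into a function, the copy_if variant might look like the sketch below. The function name connectedWithinSubset is an assumption made for this example; the Node struct matches the one in the question.

```cpp
#include <algorithm>
#include <iterator>
#include <unordered_set>
#include <vector>

struct Node {
    std::vector<int> conNode;
};

// Hypothetical wrapper: for every node in nodeListA, collects those of its
// connections that are themselves members of nodeListA.
std::vector<int> connectedWithinSubset(const std::vector<Node>& listOfNodes,
                                       const std::unordered_set<int>& nodeListA) {
    std::vector<int> results;
    for (int node_id : nodeListA) {
        const std::vector<int>& connections = listOfNodes[node_id].conNode;
        std::copy_if(connections.begin(), connections.end(),
                     std::back_inserter(results),
                     [&nodeListA](int id) { return nodeListA.count(id) != 0; });
    }
    return results;
}
```

Note that std::unordered_set iterates in an unspecified order, so the order of the results is not fixed; sort them afterwards if a stable order matters.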

Implementation and Improvability of Depth First Search

I coded DFS the way I had it in my mind and didn't refer to any textbook or pseudocode for ideas. I think some lines of my code make unnecessary calculations. Any ideas on reducing the complexity of my algorithm?
vector<int>visited;
bool isFound(vector<int>vec,int value)
{
if(std::find(vec.begin(),vec.end(),value)==vec.end())
return false;
else
return true;
}
void dfs(int **graph,int numOfNodes,int node)
{
if(isFound(visited,node)==false)
visited.push_back(node);
vector<int>neighbours;
for(int i=0;i<numOfNodes;i++)
if(graph[node][i]==1)
neighbours.push_back(i);
for(int i=0;i<neighbours.size();i++)
if(isFound(visited,neighbours[i])==false)
dfs(graph,numOfNodes,neighbours[i]);
}
void depthFirstSearch(int **graph,int numOfNodes)
{
for(int i=0;i<numOfNodes;i++)
dfs(graph,numOfNodes,i);
}
PS: Could somebody please send me a link that explains how to insert C++ code with good formatting? I've tried syntax highlighting but it didn't work out.
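Your DFS has at least O(n^2) time complexity, and the linear visited checks push it towards O(n^2 + n*m), which is really bad (with an adjacency list and constant-time visited checks it should run in O(n + m)).
This line ruins your implementation, because searching in a vector takes time proportional to its length (and isFound also copies the whole vector on every call, since it takes it by value):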
if(std::find(vec.begin(),vec.end(),value)==vec.end())
To avoid this, you can remember what was visited in an array of boolean values.
The second problem with your DFS is that for bigger graphs it will probably cause a stack overflow, because the worst-case recursion depth equals the number of vertices in the graph. The remedy is also simple: use std::list<int> as your own stack.
So, code that does DFS should look more or less like this:
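// n is the number of vertices; graph is the adjacency matrix from the question
std::vector<bool> visited(n, false); // visited[v] becomes true once v has been processed
std::list<int> stack;                // explicit stack instead of recursion
std::list<int> order;                // vertices in the order they were visited
for(int i = 0; i < n; i++){
    if(!visited[i]){
        stack.push_back(i);
        while(!stack.empty()){
            int top = stack.back();
            stack.pop_back();
            if(visited[top])
                continue;
            visited[top] = true;
            order.push_back(top);
            for(int neighbour = 0; neighbour < n; neighbour++) // scan row `top` of the matrix
                if(graph[top][neighbour] == 1 && !visited[neighbour])
                    stack.push_back(neighbour);
        }
    }
}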
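To actually reach O(n + m), the graph has to be stored as adjacency lists, because scanning a matrix row costs O(n) per vertex no matter how the visited check is done. The following is a sketch under that assumption: the function name dfsOrder and the vector-of-vectors representation are choices made for this example, and it uses std::vector rather than std::list as the explicit stack.

```cpp
#include <vector>

// Hypothetical helper: adj[v] lists the neighbours of vertex v.
// Returns all vertices in depth-first order, covering every component.
std::vector<int> dfsOrder(const std::vector<std::vector<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<bool> visited(n, false);
    std::vector<int> order;
    order.reserve(n);
    std::vector<int> stack;              // explicit stack instead of recursion

    for (int start = 0; start < n; ++start) {
        if (visited[start]) continue;
        stack.push_back(start);
        while (!stack.empty()) {
            int top = stack.back();
            stack.pop_back();
            if (visited[top]) continue;  // may have been pushed more than once
            visited[top] = true;
            order.push_back(top);
            // Push in reverse so that earlier-listed neighbours are visited first.
            for (auto it = adj[top].rbegin(); it != adj[top].rend(); ++it)
                if (!visited[*it])
                    stack.push_back(*it);
        }
    }
    return order;
}
```

Every vertex is marked once and every adjacency entry is examined once, which gives the O(n + m) bound. A vertex can sit on the stack several times, but the total number of pushes is still bounded by n plus the number of adjacency entries.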