I am creating a mesh composed of nodes and springs, and I keep getting a segmentation fault (core dumped) error when I try to access, from main(), an element of the nodes vector defined in the Mesh class.
When I run a test output within the Mesh class's constructor, I can access the node member just fine. I'm sure it's a memory problem, but could anyone explain why this is happening?
Node class:
class Node
{
public:
/// (Non-)magic number indicating that the coordinate has not
/// been classified as pinned or free yet
static int Not_classified_yet;
/// (Non-)magic number indicating that the coordinate is pinned
static int Is_pinned;
/// Constructor: Pass the spatial dimension
Node(const unsigned& dim)
{
// Resize
X.resize(dim,0.0);
Eqn_number.resize(dim,Not_classified_yet);
}
/// Function to add a spring to the node
void add_spring_pt(Spring* spring_pt)
{
Spring_pt.push_back(spring_pt);
}
/// How many springs are attached to this node?
unsigned nspring()
{
return Spring_pt.size();
}
/// Access function to the ith spring connected to the node
Spring*& spring_pt(const unsigned& i)
{
return Spring_pt[i];
}
/// Access function to the position vector of the node
vector<double>& get_vector()
{
return X;
}
/// Access function to the coordinates of the node
double& x(int i)
{
return X[i];
}
/// Access function to the equation number for each coordinate
/// Can be negative if node is pinned in that direction.
int& eqn_number(const unsigned& i)
{
return Eqn_number[i];
}
/// Pin the i-th coordinate
void pin(const unsigned& i)
{
Eqn_number[i]=Is_pinned;
}
/// Is the i-th coordinate pinned?
bool is_pinned(const unsigned& i)
{
return (Eqn_number[i]==Is_pinned);
}
private:
/// Pointers to the springs attached to the node.
vector<Spring*> Spring_pt;
/// Coordinates of the node
vector<double> X;
/// Vector containing equation indices for each coordinate direction.
/// Can be negative if node is pinned in that direction.
vector<int> Eqn_number;
};
Mesh class:
class Mesh
{
public:
/// constructor (nX contains number of nodes in each direction)
Mesh(const vector<unsigned> nX)
{
/// Function "num_nodes" defined in "myFunctions.cpp" to find the
/// total number of nodes.
unsigned nNodes = num_nodes(nX);
/// Check the dimension of the mesh and construct a vector
/// of the nodes.
unsigned dim = nX.size();
vector<Node> nodes(nNodes,dim);
//std::cout<< nodes[1].x(0)<<std::endl;
/// Function "num_springs" defined in "myFunctions.cpp" to find the
/// total number of springs.
unsigned nsprings = num_springs(nX);
/// Vector to hold the springs.
vector<Spring> springs(nsprings);
/// Function to assign coordinates to all the nodes.
assign_coordinates(nodes,nX);
}
/// Access function to the ith node of the mesh.
Node& node(const unsigned& i)
{
return nodes[i];
}
/// Function declaration for assigning coordinates to nodes
void assign_coordinates(std::vector<Node>& nodes, std::vector<unsigned> nX);
/// Access function to the ith spring of the mesh.
Spring& spring(const unsigned& i)
{
return springs[i];
}
private:
/// Declare vectors to hold the nodes and springs.
vector<Node> nodes;
vector<Spring> springs;
};
And what I am trying to output from the main():
int main()
{
// create a mesh
// spatial dimensions
unsigned nx = 3;
unsigned ny = 3;
unsigned nz = 3;
vector<unsigned> nX(2);
nX[0] = nx;
nX[1] = ny;
//nX[2] = nz;
Mesh m(nX);
// segmentation fault (core dumped)
std::cout << m.node(6).eqn_number(1) << std::endl;
};
Thanks in advance for any help.
The problem (a problem?) is in the Mesh constructor, where you define
vector<Node> nodes(nNodes,dim);
I suppose that your intention was to initialize the nodes member of the Mesh class; what you get instead is a nodes variable that is local to the Mesh constructor and hides the member.
The nodes member of Mesh is initialized with the implicit default constructor of std::vector, so it has size zero: it contains no Node objects at all, and the local vector is destroyed when the constructor returns.
When you call
m.node(6).eqn_number(1)
with node(6) you access the 7th element of a std::vector of size zero, which is undefined behaviour (here it shows up as a segmentation fault).
Same problem with
std::vector<Spring> springs(nsprings);
in the Mesh constructor: you declare and initialize a springs variable that is local to the constructor, where your intention (I suppose) was to initialize the springs member of Mesh.
If I understand your intentions correctly, you should be able to solve the problem by writing your constructor as follows
Mesh (const std::vector<unsigned> nX)
: nodes(num_nodes(nX), nX.size()), springs(num_springs(nX))
{ assign_coordinates(nodes,nX); }
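To see the difference in isolation, here is a minimal sketch. ShadowedMesh and InitializedMesh are hypothetical stand-ins for Mesh (they hold plain doubles instead of Node objects), so treat it as an illustration of the mechanism rather than a drop-in fix.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Mesh: the constructor declares a local vector
// with the same name as the member, so the member is never filled.
struct ShadowedMesh {
    explicit ShadowedMesh(std::size_t n) {
        std::vector<double> nodes(n, 0.0);  // NEW local object, hides the member
        (void)nodes;                        // destroyed when the constructor returns
    }
    std::size_t size() const { return nodes.size(); }
    std::vector<double> nodes;  // default-constructed, stays empty
};

// Same class, but the member initializer list constructs the member itself.
struct InitializedMesh {
    explicit InitializedMesh(std::size_t n) : nodes(n, 0.0) {}
    std::size_t size() const { return nodes.size(); }
    std::vector<double> nodes;
};
```

With n = 9, ShadowedMesh reports a size of 0 while InitializedMesh reports 9, which is why indexing element 6 only works in the second case.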
At a quick glance it looks like you initialized your mesh with 2 nodes and then went asking for the 7th (indexed starting at 0).
A quick way to check for this is to change nodes[i] to nodes.at(i), which enables bounds checking for vector element access and throws std::out_of_range on a bad index instead of silently invoking undefined behaviour.
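As a small illustration of the at() suggestion, the helper below (index_is_out_of_range is a made-up name, not part of the question's code) reports whether a bounds-checked access would throw:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if v.at(i) throws std::out_of_range, false if the index is valid.
bool index_is_out_of_range(const std::vector<int>& v, std::size_t i) {
    try {
        (void)v.at(i);  // bounds-checked access; operator[] would not check
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Called with an empty vector and index 6, it returns true; with a vector of nine elements it returns false.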
Related
I'm trying to migrate a big C++11 library to CUDA. The library builds a hierarchical tree of "Nodes", sharing the same virtual interface but implementing different algorithms, and then calculates the value of the root node for a number of trials. In each trial, the library changes some setting in the leaf nodes of the hierarchy, and then proceeds to recursively recalculate all node values upwards towards the root.
The library has the trial change on the outermost loop, and heavily uses polymorphic virtual classes. It is not feasible to redesign this aspect of it. What I could redesign are the value calculation functions, in order to vectorize them and make them run in kernel space.
Here's a cut-down version of the original, serial library:
// virtual ancestor
class Node {
public:
// Return value of this node for the current trial.
// This method will recursively call the value() method
// of the direct children of this Node.
virtual float value() = 0;
};
// Implement y = a1*x1 + a2*x2 + ... + an*xn,
// where x* are the values of the direct children of this node
// and a* are constants
class LinearCombination: virtual public Node {
public:
// skip: constructor
float value() {
float acc = 0;
for (size_t i = 0; i < children.size(); i++) {
// Recursively get value of child
acc += children[i]->value() * weights[i];
}
return acc;
}
protected:
std::vector<Node *> children;
std::vector<float> weights;
};
int main() {
Node * root;
// skip: initialise tree of nodes
for (auto trial: trials) {
// skip: setup global trial settings. This will change
// the result of the leaf Nodes.
auto value = root->value();
// skip: dump to disk
}
}
My first conversion attempt changes the value() method to calculate the value of each node in a vectorized way, that is for all trials at once. All flow control remains in plain C++, and only the actual maths is moved to the graphics card.
This is necessary because virtual polymorphism doesn't work inside a kernel, and it's very difficult to implement tree data structures (Node ** children can't be used).
class Node {
public:
// Return value for all scenarios
// result is a buffer in the device memory
virtual void value(float * result, int n) = 0;
};
class LinearCombination: virtual public Node {
public:
// skip: constructor
void value(float * result, int n) {
hemi::parallel_for(0, n, [=] HEMI_LAMBDA (int i) {
result[i] = 0.0;
});
// Allocate temporary buffer to store the value of the underlyings
float * scratch;
cudaMalloc((void **)&scratch, n * sizeof(float));
for (size_t child_id = 0; child_id < children.size(); child_id++) {
// Recursively get value of child
children[child_id]->value(scratch, n);
auto weight = weights[child_id];
hemi::parallel_for(0, n, [=] HEMI_LAMBDA (int i) {
result[i] += scratch[i] * weight;
});
}
cudaFree(scratch);
}
protected:
std::vector<Node *> children;
std::vector<float> weights;
};
int main() {
Node * root;
// skip: allocate nodes
// Create buffer for output value of the root nodes
float * value;
cudaMalloc((void **)&value, n * sizeof(float));
// skip: initialise vectors of data for the leaf nodes
root->value(value, n);
// skip: dump to disk
cudaFree(value);
}
The above technically works, but its performance is poor - on a GeForce GTX 970, running on 500,000 trials in parallel, it goes only 10x as fast as the serial algorithm goes on a single CPU - put it on a 16-core computer, and the GPU is slower.
This is unsurprising, as in the linear combination example above the value() function performs 3n+1 memory accesses (where n is the number of children), which could be completely avoided if the whole computation were done inside a single kernel.
So I came up with the idea of using the new C++11 Lambda support in CUDA 7.5:
class Node {
public:
// Return __device__ lambda which returns the value
// of the node for a single trial
virtual std::function<float (int)> valueFunc() = 0;
};
class LinearCombination: virtual public Node {
public:
// skip: constructor
std::function<float (int)> valueFunc() {
auto func = [=] HEMI_LAMBDA (int i) {
return 0.0;
};
for (size_t child_id = 0; child_id < children.size(); child_id++) {
auto childFunc = children[child_id]->valueFunc();
auto weight = weights[child_id];
func = [=] HEMI_LAMBDA (int i) {
return func(i) + childFunc(i) * weight;
};
}
return func;
}
protected:
std::vector<Node *> children;
std::vector<float> weights;
};
int main() {
Node * root;
// skip: allocate nodes
// Create buffer for output value of the root node
float * scratch;
cudaMalloc((void **)&scratch, n * sizeof(float));
// skip: initialise vectors of data for the leaf nodes
auto valueFunc = root->valueFunc();
hemi::parallel_for(0, n, [=] HEMI_LAMBDA (int i) {
scratch[i] = valueFunc(i);
});
// skip: dump to disk
cudaFree(scratch);
}
The idea of the above is that there is one big kernel that processes the whole tree, assembled at runtime as a recursion of scalar lambdas, so the whole tree calculation performs ONE memory write, plus whatever input vector the leaf nodes need to read.
However, it doesn't compile, and I can't understand if it's just a matter of syntax or if what I'm trying to do is outright impossible.
If the above can't be fixed, are there any alternative solutions to the problem? As mentioned earlier, refactoring the whole library to be less recursive, less object-oriented, or less based on virtual polymorphism is not an option.
To my knowledge, CUDA does support virtual function calls, as well as calls through a function pointer. You just need to take the pointer to the __device__ function on the device, not on the host.
That being said, be aware that actual function calls on the device are very expensive. That is because you need to keep a call stack for thousands of threads at the same time. Keeping threads in sync is another potential challenge.
Typical CUDA programs actually inline all calls to produce a single block of code.
I don't know the details of your program, so I can only guess what you need.
How about trying the following approach:
Copy all the plain node data to the device, but keep all virtuality and the actual graph structure on the host.
Walk the graph structure on the CPU, but instead of performing the calculation, record what needs to be computed.
Identify each virtual function by a unique index. There is a finite number of them, right? (By finite I mean a manageable number.)
Create a planar work queue. Each element would hold the node index and the function index you need to compute. If you can, reorder the queue so that neighbouring threads perform the same thing.
Transfer the queue to the device. Run code with a big switch statement that selects the correct function depending on its index.
Yes, it seems a bit crude, but it can help you avoid a major overhead. When using CUDA with nonhomogeneous data (e.g. graphs) the hard part is usually how to organize and schedule your work, and not just how to actually compute what you need.
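The flattening idea above can be sketched on the host in plain C++. Everything here (Op, FlatNode, WorkItem, evaluate) is a hypothetical name invented for illustration, the sketch evaluates a single trial, and it runs on the CPU only; on the device the loop body would become the per-thread code and the std::vector members would have to be replaced by flat index ranges.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical function indices, one per former virtual value() override.
enum class Op { Leaf, LinearCombination };

// Plain node data; children are referenced by index instead of by pointer.
struct FlatNode {
    float leaf_value;                   // used by Op::Leaf
    std::vector<std::size_t> children;  // used by Op::LinearCombination
    std::vector<float> weights;         // one weight per child
};

// Work queue entry: the node index and the function index to run on it.
struct WorkItem {
    std::size_t node;
    Op op;
};

// Evaluate the queue for one trial. The queue is assumed to be ordered so
// that every child comes before its parent (a post-order walk on the host).
float evaluate(const std::vector<FlatNode>& nodes,
               const std::vector<WorkItem>& queue) {
    std::vector<float> value(nodes.size(), 0.0f);
    for (const WorkItem& item : queue) {
        const FlatNode& n = nodes[item.node];
        switch (item.op) {  // replaces the virtual dispatch
        case Op::Leaf:
            value[item.node] = n.leaf_value;
            break;
        case Op::LinearCombination: {
            float acc = 0.0f;
            for (std::size_t k = 0; k < n.children.size(); ++k)
                acc += value[n.children[k]] * n.weights[k];
            value[item.node] = acc;
            break;
        }
        }
    }
    // The last queue entry is the root when the queue is in post-order.
    return queue.empty() ? 0.0f : value[queue.back().node];
}
```

For example, two leaves holding 2 and 3 combined with weights 0.5 and 2 give a root value of 7.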
I have two classes. One is a container.
First class:
class node
{
private:
node *left, *right, *parent;
public:
node(node* parent,node* left,node* right);
virtual ~node() {cout<<"~node()"<<endl;}
};
Second class has a vector of pointers to the first class:
class tree
{
public:
vector<node*> nodes;
public:
tree(int size);
~tree();
void showPointers();
};
void tree::showPointers()
{
for (int i = 0; i < nodes.size(); i++)
{
cout<<"nodes["<<i<<"] = "<<nodes[i]<<endl;
}
}
I am creating one tree object with size 5 and looking at the address held by every member of the nodes vector.
int main()
{
tree d(5);
d.showPointers();
cout<<"end"<<endl;
}
In terminal I see (what showPointers shows):
The debugger shows:
What are these addresses from the debugger's variable pane?
#0x9dea0b8
#0x9dea0bc
#0x9dea0c0
#0x9dea0c4
#0x9dea0c8
I expected them to be the same as the pointers I store in the nodes vector.
The addresses that you see are the addresses of where the values are stored. These addresses have nothing to do with the values themselves; it's only a coincidence that your values are pointers. The debugger output would be the same had you used std::vector<intptr_t> instead of std::vector<node*>.
In your case, the expression &nodes[0] had value (node**)0x9dea0b8, etc.
You need to expand each of the [n] tree items to see their values - the values of the pointers that you store.
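A short sketch of that distinction, using a hypothetical empty node struct and two made-up helper functions: the slot addresses &nodes[i] are consecutive and sizeof(node*) bytes apart (4 bytes in the listing above, which suggests a 32-bit build), while the stored values nodes[i] point wherever the node objects happen to live.

```cpp
#include <cstddef>
#include <vector>

struct node {};  // hypothetical minimal stand-in for the node class above

// Byte distance between the storage slots of elements i and i+1.
std::ptrdiff_t slot_stride(const std::vector<node*>& nodes, std::size_t i) {
    const char* a = reinterpret_cast<const char*>(&nodes[i]);
    const char* b = reinterpret_cast<const char*>(&nodes[i + 1]);
    return b - a;
}

// True when the slot address (&nodes[i]) equals the stored pointer (nodes[i]).
bool slot_is_value(const std::vector<node*>& nodes, std::size_t i) {
    return static_cast<const void*>(&nodes[i]) ==
           static_cast<const void*>(nodes[i]);
}
```

For a vector holding the addresses of two separate node objects, slot_stride(nodes, 0) equals sizeof(node*) and slot_is_value(nodes, 0) is false.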
I am writing a C++ program that implements Dijkstra's algorithm. Here is the code.
#include <iostream>
#include <vector>
#include <map>
using namespace std;
class vertex;
class node
{
public:
int value;
//bool exp=false;
char c;
};
class edge
{
public:
vertex* head;
vertex* tail;
int length;
edge(vertex*h,vertex* t, int l)
{
head=h;
tail=t;
length=l;
}
};
class vertex:public node
{
public:
vector<edge*> a;
vertex& operator|(vertex &p)
{
int l;
cout<<"Give the length of edge "<<this->c<<p.c<<endl;
cin>>l;
edge q(&p,this,l);
a.push_back(&q);
}
vertex(char a)
{
c=a;
}
};
int main()
{
vertex e('e');
vertex d('d');
vertex b('b');
vertex c('c');
vertex a('a');
vertex s('s');
s.value=1;
a.value=2;
b.value=3;
c.value=4;
d.value=5;
e.value=6;
s|a;
s|b;
a|c;
b|c;
b|d;
c|d;
c|e;
d|e;
cout<<"4";
map <char ,int >A;
vector<edge*>::iterator minin;
vector<edge*>::iterator j;
int min=0;
vector<vertex*> X;
X.push_back(&s);
A['s']=0;
vector<vertex*>::iterator i=X.begin();
for(; i<X.end(); i++)
{
cout<<"1";
j=((*i)->a).begin();
for(; j<((*i)->a).end(); j++)
{
cout<<"2";
if((*j)->length+A[((*j)->tail)->c]>min)
{
cout<<"3";
minin=j;
min=(*j)->length+A[((*j)->tail)->c];
}
}
}
X.push_back((*minin)->head);
A[((*minin)->tail)->c]=min;
cout<<((*minin)->head)->value;
}
The program crashes with a segmentation fault. I have used various cout statements to check where the fault occurred, but nothing is printed to the console. I am able to input the edge length in the console, but right after I give the input the program segfaults.
In
a.push_back(&q);
you are storing the address of a local object, which will cease to exist once the function terminates.
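One possible way around the dangling pointer, sketched under assumptions: the vertex stores its edges by value, and a hypothetical connect() member replaces operator| so that the length is passed in instead of being read from cin. Note also that the original operator| is declared to return vertex& but has no return statement, which is undefined behaviour in its own right.

```cpp
#include <vector>

struct vertex;  // forward declaration, as in the question

struct edge {
    vertex* head;
    vertex* tail;
    int length;
};

struct vertex {
    char c;
    std::vector<edge> a;  // owns its edges by value instead of storing edge*

    explicit vertex(char name) : c(name) {}

    // Hypothetical replacement for operator|: the edge is copied into the
    // vector, so nothing refers to a local that is about to be destroyed.
    vertex& connect(vertex& p, int length) {
        a.push_back(edge{&p, this, length});
        return *this;
    }
};
```

The vertex pointers inside each edge stay valid only as long as the vertex objects themselves are not moved or destroyed, which holds for the stack-allocated vertices in the question's main().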
Why are you creating a class to keep your vertices/nodes? I think you should use plain integers from 0 to N - 1 to avoid making things more complicated. If vertices are identified by a string or something else, you could use a hash/map data structure to transform the keys into integers. That will help you avoid moving complex vertex structures around and using pointers.
The Edge class seems fine, because Dijkstra's algorithm needs all that data to work (start and end vertices, and the weight/cost of the path).
Having said that, the algorithm could be implemented using a binary heap data structure to prioritize the edge selection. You could also use a priority queue (http://en.cppreference.com/w/cpp/container/priority_queue) if you don't want to implement a binary heap.
Finally, I would use an Edge vector to iterate over the adjacent vertices of every vertex.
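Putting those suggestions together, a minimal sketch could look like the following. The Edge struct, the adjacency layout and the dijkstra() signature are my own assumptions, not the asker's code; vertices are plain integers and edge weights are assumed to be non-negative.

```cpp
#include <cstddef>
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

struct Edge {
    int to;
    int weight;  // assumed non-negative, as Dijkstra's algorithm requires
};

// adj[u] lists the edges leaving vertex u; vertices are 0 .. adj.size()-1.
// Returns the shortest distance from source to every vertex;
// std::numeric_limits<int>::max() marks an unreachable vertex.
std::vector<int> dijkstra(const std::vector<std::vector<Edge>>& adj, int source) {
    const int INF = std::numeric_limits<int>::max();
    std::vector<int> dist(adj.size(), INF);
    using Item = std::pair<int, int>;  // (distance so far, vertex)
    std::priority_queue<Item, std::vector<Item>, std::greater<Item>> pq;

    dist[source] = 0;
    pq.push(Item(0, source));
    while (!pq.empty()) {
        const int d = pq.top().first;
        const int u = pq.top().second;
        pq.pop();
        if (d > dist[u]) continue;  // stale entry: a shorter path was already found
        for (const Edge& e : adj[u]) {
            if (d + e.weight < dist[e.to]) {
                dist[e.to] = d + e.weight;
                pq.push(Item(dist[e.to], e.to));
            }
        }
    }
    return dist;
}
```

The std::greater comparator turns the default max-heap into a min-heap, so the closest unsettled vertex is always popped first.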
To represent a Graph in adjacency-list style, I'm using a vector containing pointers to lists of adjacent vertices.
class Graph
{
public:
Graph(int V)
{
vector<list<int> *> vertices(V);
}
// Member functions for Graph class
void addEdge();
void print();
void type(string);
private:
vector<list<int> *> vertices;
};
I get the number of vertices from the user in main and pass it to the constructor. Inside the constructor it all works, meaning the vector is initialized with the desired size. But as soon as control returns from the header file to main, things change: when I trace the value of vertices, its size has somehow been reset somewhere that I don't know of.
int size;
cout << "Enter the number of vertices in the Graph: ";
cin >> size;
Graph g(size);
I need help with this; what could possibly be going wrong?
You're creating a temporary vector (a local variable) in your constructor:
Graph(int V)
{
vector<list<int> *> vertices(V);
}
You need to initialize your member variable instead:
Graph(int V) : vertices(V) {}
Also, I would suggest using std::unique_ptr<list<int>> instead of raw pointers if you really need to use pointers, else simply store plain std::list<int>.
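For the "plain std::list<int>" variant, a sketch might look like this. The addEdge(from, to) signature, vertexCount() and adjacent() are assumptions added for illustration; the original addEdge() takes no arguments.

```cpp
#include <cstddef>
#include <list>
#include <vector>

class Graph {
public:
    // The initializer list sizes the member itself: V empty adjacency lists.
    explicit Graph(std::size_t V) : vertices(V) {}

    // Hypothetical signature; at() throws std::out_of_range on a bad vertex.
    void addEdge(std::size_t from, std::size_t to) {
        vertices.at(from).push_back(static_cast<int>(to));
    }

    std::size_t vertexCount() const { return vertices.size(); }
    const std::list<int>& adjacent(std::size_t v) const { return vertices.at(v); }

private:
    std::vector<std::list<int>> vertices;  // lists stored by value, no raw pointers
};
```

Storing the lists by value means there is nothing to allocate or delete by hand, and the size set in the constructor is the size main() sees.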
The following code is the beginning of an adjacency list representation of a graph.
In buildGraph, which is called immediately by main, two vertices are created, then an edge is created between them. Asking for the size of the edge list of a vertex should then return 1, not 0. I have tried putting couts in various places, and I'm just not able to figure out what the problem is, but I suspect it's due to a misunderstanding of pointers in some way. Thank you for your help!
#include "MinCut.h"
#include <iostream>
#include <list>
void buildGraph(undirected_graph *);
class vertex;
struct edge
{
vertex * start;
vertex * end;
};
class vertex
{
int vertexNumber;
std::list<edge> edges;
public:
int getVertexNumber(){return vertexNumber;}
std::list<edge> getEdges(){return edges;}
vertex(int n){vertexNumber=n;}
};
class undirected_graph
{
private:
std::list<vertex> graph;
public:
void addVertex(vertex v){graph.push_back(v);}
void createEdge(vertex * v1, vertex * v2);
};
void undirected_graph::createEdge(vertex * v1, vertex * v2)
{
std::list<edge> e1 = v1->getEdges();
std::list<edge> e2 = v2->getEdges();
edge e;
e.start=v1;
e.end=v2;
e1.push_back(e);
e2.push_back(e);
}
int main()
{
undirected_graph myGraph;
buildGraph(&myGraph);
return 0;
}
void buildGraph(undirected_graph * g)
{
vertex v1(1);
vertex v2(2);
g->addVertex(v1);
g->addVertex(v2);
g->createEdge(&v1,&v2);
std::list<edge> e = v1.getEdges();
std::cout<< "? " << e.size();
}
In createEdge() you have this:
e.start=v1;
e.start=v2;
Should it instead be
e.start=v1;
e.end=v2;
EDIT: Your problem is in createEdge: getEdges() returns the list by value, so e1 and e2 are just copies, and changes to them don't affect the actual vertex objects.
Here's my solution, seems to be working:
Add a function to vertex like so:
void addEdge(edge &e){edges.push_back(e);}
Then in createEdge():
edge e;
e.start=v1;
e.end=v2;
v1->addEdge(e);
v2->addEdge(e);
In addition to @PatLillis's answer, I think you're also going to run into problems here:
vertex v1(1);
vertex v2(2);
g->addVertex(v1);
g->addVertex(v2);
g->createEdge(&v1,&v2);
The pointers &v1 and &v2 refer to the local variables v1 and v2 in your buildGraph function. However:
Since you're passing v1 and v2 by value to addVertex, you're going to get copies of those vertices in addVertex. That means your pointers in buildGraph will be pointing one place, and the copies will be somewhere else.
Since you're storing your vertices by value in a std::list, you'll have the same problem again. The list will hold copies of the copies in addVertex, and your pointers will still be pointing to the originals in buildGraph.
One way to fix this is to deal with vertex* in e.g. addVertex and in your std::list. Alternatively, if you want your graph to "own" the vertices (as opposed to them having potentially separate lifetimes from the graph) you could switch to std::unique_ptr<vertex>.
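Here is a rough sketch of the std::unique_ptr<vertex> variant, combined with the addEdge() fix from the other answer. The addVertex(int) signature, which builds the vertex inside the graph and hands back a non-owning pointer, is an assumption of mine rather than something from the question.

```cpp
#include <list>
#include <memory>

class vertex;

struct edge {
    vertex* start;
    vertex* end;
};

class vertex {
    int vertexNumber;
    std::list<edge> edges;
public:
    explicit vertex(int n) : vertexNumber(n) {}
    int getVertexNumber() const { return vertexNumber; }
    // Return a reference so callers see the stored list, not a copy.
    const std::list<edge>& getEdges() const { return edges; }
    void addEdge(const edge& e) { edges.push_back(e); }
};

class undirected_graph {
    std::list<std::unique_ptr<vertex>> graph;  // the graph owns its vertices
public:
    // Hypothetical signature: constructs the vertex inside the graph and
    // returns a non-owning pointer that stays valid while the graph lives.
    vertex* addVertex(int n) {
        graph.push_back(std::unique_ptr<vertex>(new vertex(n)));
        return graph.back().get();
    }
    void createEdge(vertex* v1, vertex* v2) {
        edge e{v1, v2};
        v1->addEdge(e);
        v2->addEdge(e);
    }
};
```

With this layout the pointers stored in each edge refer to the very objects the graph holds, so after createEdge(v1, v2) both vertices report one edge.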