Previous content of std vector is replaced during iteration - c++

I am writing a program which is supposed to find a rooted spanning tree in a graph and all the unique paths joining the root with the other vertices in the spanning tree. I am trying to perform both operations with one function only:
void Spanning_tree_finder(){
int * v=Add_edges(s1); int control=0; int size;
for(int i=0; i<_g.GetE(); i++){
if(control==0 && v[i]==1) {
s1[i]=1; control=1;
size=_v.size();
for(int j=0; j<size; j++){
if(_v[j].Getv2()==_g.GetEdge1(i)){
Path pnew=_v[j];
pnew.Setv2(_g.GetEdge2(i));
pnew.Setp(i);
_v.push_back(pnew);
};
if(_v[j].Getv2()==_g.GetEdge2(i)){
Path pnew=_v[j];
pnew.Setv2(_g.GetEdge1(i));
pnew.Setp(i);
_v.push_back(pnew);
};
};
Spanning_tree_finder();
};
};
return;
};
For the sake of context, the function builds a spanning tree iteratively; at the end of the process the tree is contained in s1. At each step it takes the current tree, also contained in s1, finds all adjacent edges using the function Add_edges, and adds one of them to the tree. Then the function is called again (note that both Add_edges and Spanning_tree_finder are members of a class and s1 is a private member of that class). During this process, the function also constructs the path joining the root with the loose vertex of the newly introduced edge: it searches for an existing path joining the root with the non-loose vertex of that edge and appends the new edge to it. The paths are all stored in a vector of paths, _v. I know this explanation is a bit convoluted, but I hope it is clear.
However, there is a problem with this function, since it seems that at every iteration all the paths contained in _v are substituted with the path which was obtained in the current iteration. Instead of obtaining _v.size() different paths, _v contains _v.size() copies of the same path, and this holds at every iteration. I don't understand why this would happen, since it seems to me that the function never accesses previously-added elements.
I hope the problem as I explained is clear, and I am happy to provide any further clarification.
EDIT: More specifically, the lines of code which I think are problematic are
for(int j=0; j<size; j++){
if(_v[j].Getv2()==_g.GetEdge1(i)){
Path pnew=_v[j];
pnew.Setv2(_g.GetEdge2(i));
pnew.Setp(i);
_v.push_back(pnew);
};
if(_v[j].Getv2()==_g.GetEdge2(i)){
Path pnew=_v[j];
pnew.Setv2(_g.GetEdge1(i));
pnew.Setp(i);
_v.push_back(pnew);
};
};
The core of the problem is how pnew is inserted in _v. Instead of pnew being added at the end of the vector, all the elements in _v are replaced with pnew.

Sure sounds like the objects in _v have some data accessed by pointer, and that you are modifying the objects already in _v by modifying an object you think is new.
Crystal ball says Path pnew = _v[j] copies an existing Path object, which contains some pointers to data. The pointers will be copied to the new object, but point to the same location, so modifiers like Setv2 and Setp will change the data for all of them.
Try fixing Path either by replacing the explicit pointers with something else, like vectors, or by writing a proper copy constructor and assignment operator that deal with those pointer fields correctly (by allocating new memory and copying the contents).
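A minimal sketch of the first suggestion. The real Path class is not shown in the question, so everything here (the edges_ member, the Edges() accessor, the Extend helper) is a hypothetical stand-in. The point is only that a member held by value, such as a std::vector, is copied along with the object, so modifying the copy leaves the original alone:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the asker's Path class; the real members are
// not shown in the question.
class Path {
public:
    explicit Path(int root) : v2_(root) {}

    int Getv2() const { return v2_; }
    void Setv2(int v) { v2_ = v; }
    // Appends edge index p to this path only.
    void Setp(int p) { edges_.push_back(p); }
    const std::vector<int>& Edges() const { return edges_; }

private:
    int v2_;                  // current end vertex of the path
    std::vector<int> edges_;  // held by value, so copies are deep
};

// Extends a copy of paths[j] by edge `edge`, ending at `vertex`.
// Precondition (assumed): j < paths.size().
void Extend(std::vector<Path>& paths, std::size_t j, int edge, int vertex) {
    Path pnew = paths[j];   // deep copy: pnew gets its own edges_
    pnew.Setv2(vertex);
    pnew.Setp(edge);
    paths.push_back(pnew);  // paths[j] itself is unchanged
}
```

With this layout the compiler-generated copy constructor and assignment operator already do the right thing, so none need to be written by hand.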

Related

Best datastructure for iterating over and moving elements to front

This is part of a solution to a bigger problem: solving a maximum flow problem. In my implementation of the relabel-to-front algorithm I've hit a performance bottleneck that I didn't expect.
The general structure for storing the graph data is as follows:
struct edge{
int destination;
int capacity;
};
struct vertex{
int e_flow;
int h;
vector<edge> edges;
};
The specifics of the algorithm are not that important to the question. In the main loop of the solution I'm looping over all vertices except the source and the sink. If at some point a change is made to a vertex, then that vertex is moved to the front of the list and the iteration restarts from the beginning. We terminate once the end of the list is reached. This part currently looks as follows:
//nodes are 0..nodeCount-1 with source=0 and sink=nodeCount-1
vector<int> toDischarge(nodeCount-2,0);
for(int i=1;i<sink;i++){
toDischarge[i-1]=i;
}//skip over source and sink
//custom pointer to the entry of toDischarge we are currently accessing
int point = 0;
while(point != nodeCount-2){
int val = toDischarge[point];
int oldHeight = graph[val].h;
discharge(val, graph, graph[val].e_flow);
if(graph[val].h != oldHeight){
rotate(toDischarge.begin(), toDischarge.begin()+point, toDischarge.begin()+point+1);
//if the value of the vertex has changed move it to the front and reset pointer
point = 0;
}
point++;
}
I tried using a std::list before the vector solution, but that was even slower. Conceptually that didn't make sense to me, since (re)moving elements in a list should be cheap. After some research I found out that it probably performed so badly because of caching issues with list.
Even with the vector solution though I did some basic benchmarking using valgrind and have the following results.
If I understand this correctly then over 30% of my execution time is just spent doing vector element accesses.
Another solution I've tried is making a copy of the vertex needed for that iteration into a variable since it is accessed multiple times, but that was even worse performance because I think it is also making a copy of the whole edge list.
What data structure would improve the general performance of these operations? I'm also interested in other data structures for storing the graph data if that would help.
It seems to me that this is what std::deque<> is for. Imagine it as a 'non-contiguous vector', or some vector-like batches tied together. You can use the same interface as vector, except that you cannot assume that adding an index to the first element's pointer yields the given element (that is undefined behaviour rather than anything sensible); you need to use [] for indexing. Also, you have dq.insert(it, elem); that's quick if it is std::begin(dq) or std::end(dq).
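As a rough sketch of how the move-to-front step might look with a deque (MoveToFront is a hypothetical helper name, and it assumes pos is a valid index): erasing from the middle is still linear in the distance to the nearer end, while push_front takes constant time. Whether this actually beats std::rotate on a vector for a given graph is something only a benchmark can tell.

```cpp
#include <cstddef>
#include <deque>

// Moves the element at index `pos` to the front, keeping the relative order
// of all other elements.
// Precondition (assumed): pos < dq.size().
void MoveToFront(std::deque<int>& dq, std::size_t pos) {
    const int value = dq[pos];
    // erase() shifts whichever side of the deque is shorter.
    dq.erase(dq.begin() + static_cast<std::ptrdiff_t>(pos));
    dq.push_front(value);  // constant time at the front
}
```

In the loop from the question, this would take the place of the rotate call, with point reset to 0 as before.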

C++ Saving objects inside a list to reuse later

Suppose I own a list of edges saved inside a vector like:
typedef struct edge
{
int v;
size_t start;
size_t end;
}e;
typedef vector<list<e>> adj_list;
adj_list tree;
I have to run some logic on this tree object, but the logic is too complicated to do in place (and I am constrained not to recurse). I need an extra data structure to handle each node. As a simple example, let's consider incrementing each edge's v value:
list<e> aux;
aux.insert(aux.begin(), tree[0].begin(), tree[0].end());
while (!aux.empty())
{
e& now = aux.front();
aux.pop_front();
now.v++;
aux.insert(aux.begin(), tree[now.v].begin(), tree[now.v].end());
}
The problem with this is that changes made to the now variable are not reflected in tree. I need some container (it can be any: vector, linked list, queue, stack) that has an empty() check, like the one used in Dijkstra's algorithm, to handle my edge objects in tree. Is there an elegant way to do this? Can I use a list of iterators? I'm specifically asking for an "elegant" approach in the hope that it does not involve pointers.
As discussed in the comments, the solution is to store iterators instead of copies, e.g.:
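list<list<e>::iterator> aux;
// A range insert would try to build iterators from edges and not compile,
// so store each iterator explicitly.
for (auto it = tree[0].begin(); it != tree[0].end(); ++it)
    aux.push_back(it);
while (!aux.empty())
{
    e& now = *(aux.front()); // refers to the element inside tree, not a copy
    aux.pop_front();
    now.v++;
    auto pos = aux.begin();
    for (auto it = tree[now.v].begin(); it != tree[now.v].end(); ++it)
        aux.insert(pos, it); // inserts before pos, so the edge order is kept
}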
This works only if you can guarantee that nothing invalidates the stored iterators, as certain operations on tree could.
As pointed out by n. 'pronouns' m., iterators can be considered as "generalized pointers", so many problems that regular pointers have also apply to iterators.
Another (slightly safer) approach would be to store std::shared_ptrs in the inner lists of tree. You can then store another std::shared_ptr to the same object in aux, which makes sure that the object cannot be accidentally deleted while it is still being referenced.
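A small sketch of the shared_ptr variant, under a few assumptions that are not in the question: the struct is renamed Edge, IncrementAll is a hypothetical helper, the child index is read before v is incremented, indices outside the tree are treated as leaves, and the structure contains no cycles.

```cpp
#include <cstddef>
#include <list>
#include <memory>
#include <vector>

struct Edge {  // mirrors the question's `e`
    int v;
    std::size_t start;
    std::size_t end;
};

using EdgePtr = std::shared_ptr<Edge>;
using AdjList = std::vector<std::list<EdgePtr>>;

// Increments v on every edge reachable from tree[root] without recursion.
// Precondition (assumed): root < tree.size() and the structure is acyclic.
void IncrementAll(AdjList& tree, std::size_t root) {
    // Copying the pointers shares ownership; the edges are not duplicated.
    std::list<EdgePtr> aux(tree[root].begin(), tree[root].end());
    while (!aux.empty()) {
        EdgePtr now = aux.front();  // same object that tree refers to
        aux.pop_front();
        const std::size_t child = static_cast<std::size_t>(now->v);
        now->v++;                   // visible through tree as well
        if (child < tree.size())    // out-of-range index: treat as a leaf
            aux.insert(aux.begin(), tree[child].begin(), tree[child].end());
    }
}
```

Because aux holds its own shared_ptr to each edge, an edge removed from tree while it is still queued stays alive until aux releases it.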

Some vector elements do not change

I am experiencing very strange behaviour, which I cannot explain. I hope someone might shed some light on it.
Code snippet first:
class TContour {
public:
typedef std::pair<int,int> TEdge; // an edge is defined by indices of vertices
typedef std::vector<TEdge> TEdges;
TEdges m_oEdges;
void splitEdge(int iEdgeIndex, int iMiddleVertexIndex) {
TEdge & oEdge = m_oEdges[iEdgeIndex];
m_oEdges.push_back(TEdge(oEdge.first, iMiddleVertexIndex));
oEdge = TEdge(oEdge.second, iMiddleVertexIndex); // !!! THE PROBLEM
};
void splitAllEdges(void) {
size_t iEdgesCnt = m_oEdges.size();
for (int i=0; i<iEdgesCnt; ++i) {
int iSomeVertexIndex = 10000; // some new value, not actually important
splitEdge(i, iSomeVertexIndex);
}
};
};
When I call splitAllEdges(), the original edges are changed and new edges are added (doubling the container size). Everything is as expected, with the exception of one original edge, which does not change. In case it is of any interest, its index is 3 and its value is [1,242]. All the other original edges change, but this one remains unchanged. Adding debug prints confirms that a different value is written to the edge, but the contents of m_oEdges do not change.
I have a simple workaround: replacing the problematic line with m_oEdges[iEdgeIndex] = TEdge(oEdge.second, iMiddleVertexIndex); fixes the issue. My concern, though, is the cause of the unexpected behaviour. Might it be a compiler bug (and if so, what other issues should I expect?), or am I overlooking some stupid bug in my code?
/usr/bin/c++ --version
c++ (Debian 4.9.2-10) 4.9.2
Switching from c++98 to c++11 did not change anything.
You're using an invalid reference after your push_back operation.
This:
TEdge & oEdge = m_oEdges[iEdgeIndex];
acquires the reference. Then this:
m_oEdges.push_back(TEdge(oEdge.first, iMiddleVertexIndex));
potentially reallocates the vector's storage, and in so doing, invalidates the oEdge reference. At which point this:
oEdge = TEdge(oEdge.second, iMiddleVertexIndex);
is no longer defined behavior, as you're using a dangling reference. Reuse the index, not the reference, such as:
m_oEdges[iEdgeIndex] = TEdge(m_oEdges[iEdgeIndex].second, iMiddleVertexIndex);
Others have mentioned the invalidation of the reference, so I won't go into more details on that.
If performance is critical, you could explicitly reserve enough space in the original vector for the new edges before you start looping. This would avoid the problem, because push_back only invalidates references when it has to reallocate, but it is fragile: the reference becomes dangling again as soon as the reserve call is removed or turns out to be too small.
A safer, but slightly slower method would be to iterate through the vector, changing existing edges and generating new edges in a new vector (with sufficient space reserved beforehand for performance), and then at the end, append the new vector to the existing one.
The safest way (including being completely exception safe), would be to create a new vector (reserving double the size of the initial vector), iterate through the initial vector (without modifying any of its edges), pushing two new edges into the new vector for each old edge, and then right at the end vector.swap() the old vector with the new vector.
A big positive side-effect of this last approach is that your code either succeeds completely, or leaves the original edges unchanged. It maintains the integrity of the data even in the face of disaster.
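A sketch of that last approach as a free function, assuming the same TEdge and TEdges typedefs as in the question. SplitAllEdges and its middle parameter are hypothetical simplifications (the question uses a placeholder vertex index per edge). The layout matches the original loop: index i holds the rewritten edge and the new halves are appended after them.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

typedef std::pair<int, int> TEdge;
typedef std::vector<TEdge> TEdges;

// Builds the split edges in a separate vector and commits them with swap().
// If an allocation throws, `edges` is left untouched.
void SplitAllEdges(TEdges& edges, int middle) {
    const std::size_t n = edges.size();
    TEdges result;
    result.reserve(n * 2);
    for (std::size_t i = 0; i < n; ++i)  // rewritten edges keep their index
        result.push_back(TEdge(edges[i].second, middle));
    for (std::size_t i = 0; i < n; ++i)  // new halves are appended
        result.push_back(TEdge(edges[i].first, middle));
    edges.swap(result);                  // non-throwing commit
}
```

No reference into edges is held across a push_back here, so the invalidation problem cannot occur.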
P.S. I notice that you are doing:
TEdge(oEdge.first, iMiddleVertexIndex)
TEdge(oEdge.second, iMiddleVertexIndex)
If the rest of your code is sensitive to ring-orientation you probably want to reverse the parameters for the second edge. i.e.:
TEdge(oEdge.first, iMiddleVertexIndex)
TEdge(iMiddleVertexIndex, oEdge.second )

in C++, how can I find all elements of an array linked directly or indirectly to a specific element of the array?

I have an array of objects. Every object has a weight value, and some objects are attached to another object, which becomes their parent. I need to add the weight of all the child objects to the parent object; the weight of objects attached to child objects, and so on, must also be added to the parent.
Here is my best approach so far, but somehow it ends up not changing the parent's original weight at all:
void showMasterClass::mass_manager(int parent)
{
for (int n = 0; n < total_objects; n++)
{
object[n].setMass(object[n].getEmptyMass());
}
for (int n = 0; n < total_objects; n++)
{
if (object[n].getDockedTo() == parent)
{
object[parent].setMass(object[parent].getMass() + object[n].getMass());
mass_manager_subroutine(n, parent);
}
}
}
void showMasterClass::mass_manager_subroutine(int objeto, int parent)
{
for (int n = 0; n < total_objects; n++)
{
if (object[n].getDockedTo() == objeto)
{
object[parent].setMass(object[parent].getMass() + object[n].getMass());
mass_manager_subroutine(n, parent);
}
}
}
The thing you're trying to implement is a post-order, depth-first tree traversal. You just happen to be getting your tree in the form of an array with child-to-parent references.
The lack of parent-to-child references is going to make the process less efficient, but it's still doable.
Looking at it purely as a matter of structure (and avoiding code for now, to avoid making assumptions about yours), you're looking at a recursive call that, given the array and the index of a 'root':
Finds all of the nodes that have your root as their parent.
Recurses with each of those nodes as your new root.
Returns your root's weight plus the return values from each child (if any).
If your starting array isn't guaranteed to be free of circular dependencies, then you'll also want to pass down the current 'chain' of visited nodes, so you can return if you ever visit a node for a second time in the same branch of the descent.
To find the child nodes, you can just walk the full array each time. That'll give you an N^2 efficiency overall, which is pretty painful, but it's also the simplest approach to understand, so it's a good place to start. Once you understand how that's working, you can make performance improvements (like making a single pass at the start to map the parent-to-child relationships, which will make the traversal itself faster).
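The structure described above, including the single pass that maps parent-to-child relationships, might be sketched as follows. The flat parent and emptyMass vectors and the function names are hypothetical stand-ins for the asker's object array, and the sketch assumes there are no circular dependencies and that every non-negative parent index is valid.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<std::vector<std::size_t> > ChildMap;

// One pass over the child-to-parent links to build parent-to-child lists.
// parent[i] is the index node i is docked to, or -1 if it has no parent.
ChildMap BuildChildren(const std::vector<int>& parent) {
    ChildMap children(parent.size());
    for (std::size_t i = 0; i < parent.size(); ++i)
        if (parent[i] >= 0)
            children[static_cast<std::size_t>(parent[i])].push_back(i);
    return children;
}

// Post-order sum: a node's total is its own mass plus its children's totals.
double TotalMass(std::size_t root, const ChildMap& children,
                 const std::vector<double>& emptyMass) {
    double total = emptyMass[root];
    for (std::size_t k = 0; k < children[root].size(); ++k)
        total += TotalMass(children[root][k], children, emptyMass);
    return total;
}
```

Calling TotalMass on a parent returns its full weight without modifying any stored mass, so repeated calls cannot double count.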
Assuming that you do not have to deal with circular dependencies:
Iterate through the array and mark all objects that are parents.
The remaining objects have no children. Iterate through these objects and add their weights to their parents recursively.

c++ NULL terminated array segfault

I am writing a program that splits graphs. I have a Graph class and an Algorithm class. I compute the partitioning in my Algorithm class and split the graph with a method in the Graph class according to the partitioning.
My code looks like this:
In my GraphClass:
void bisectGraph(int *iPartitioning, Graph **Subgraphs, Edge **Separator){
...
// Store separators in an array
Separator = new Edge*[Separators.size()+1]; //Separators is a vector containing the separating edges
if(Separator == NULL)
writeErrorMsg("Error assigning memory.", "Graph::bisectGraph");
for(i=0, SepIter = Separators.begin(); SepIter != Separators.end(); i++, SepIter++)
Separator[i] = *SepIter;
Separator[Separators.size()] = NULL;
}
In my Algorithm class I call it like this:
Edge** separators;
Graph** subgraphs;
int* somePartitioning; // bisectGraph takes an int*
g->bisectGraph(somePartitioning, subgraphs, separators);
Works fine so far, but when I want to work on my separators array like this for instance:
for(int i=0; separators[i]!=NULL; i++){
...
}
I always get a segmentation fault. ddd tells me that at the end of bisectGraph separators contains some content. Since I can't find any other mistake, I think I got some concept wrong?
The new value of Separator is not being propagated to the separators variable outside the function call. Although the parameter has type Edge **, it is itself passed by value, so assigning to it inside the function only changes the function's own copy. Remember that C++ is pass-by-value unless otherwise specified.
You could change the signature to Edge **&, but it'd be more sensible to use a vector, and take a parameter of type vector<Edge *> &.
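A minimal sketch of the vector variant. The Edge struct and the CollectSeparators function are hypothetical stand-ins, since the real Graph class is not shown. Because the vector is taken by reference, what the function stores is visible to the caller, and no NULL terminator is needed since the vector knows its own size.

```cpp
#include <vector>

struct Edge {  // hypothetical stand-in for the asker's Edge class
    int from;
    int to;
};

// Fills the caller's vector with the separating edges.
// `found` plays the role of the Separators member in the question.
void CollectSeparators(const std::vector<Edge*>& found,
                       std::vector<Edge*>& separator) {
    separator.assign(found.begin(), found.end());
}
```

The caller then loops with an index from 0 up to separators.size() instead of testing for NULL.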