Problem moving vertex on Arrangement using Arrangement_accessor - c++

I'm trying to move a vertex in an arrangement using an Arrangement_accessor. What I have done so far is:
ArrangementAccessor acc(*(self.arr)); // I'm using Objective-C++
Point_2 newPoint = toPoint_2(mouse);

// Change the vertex position
acc.modify_vertex_ex(v, newPoint);

// Adjust all incident curves
Arrangement_2::Halfedge_around_vertex_circulator curr, first;
curr = first = v->incident_halfedges();
do {
    Point_2 sourcePoint = curr->source()->point();
    Segment_2 newSegment = Segment_2(sourcePoint, newPoint);
    acc.modify_edge_ex(curr, newSegment);
} while (++curr != first);
This is actually partially working (I make sure no overlap is produced). But as soon as the move changes the lexicographical order of some halfedge's endpoints, Arrangement_2::is_valid() returns false.
So my questions are:
Why does changing the xy-order destroy the arrangement? I know this is documented, but I don't really understand why it matters.
And: is there any way to fix this in my implementation? I have already tried the simpler remove-vertex / insert-vertex approach, but that is not really what I want (I want to keep the faces and their reference/index mapping alive).
Would be really glad if you could help me understand this.
Yours, Salabasti

The direction of every halfedge is cached to expedite frequent operations. The concept ArrangementDcelHalfedge (https://doc.cgal.org/latest/Arrangement_on_surface_2/classArrangementDcelHalfedge.html#a2bcd73c9eb8383be066161612de98033) requires supporting the methods direction() and set_direction(). So, either you also surgically fix the direction of the affected halfedges using the method set_direction() or you remove these halfedges and then re-insert them. You can retain low complexity by using the specialized insertion functions; see, e.g.,
https://doc.cgal.org/latest/Arrangement_on_surface_2/classCGAL_1_1Arrangement__2.html#a88fed7cf475e474d187e65beb894dde2, and
https://doc.cgal.org/latest/Arrangement_on_surface_2/classCGAL_1_1Arrangement__2.html#a7b90245d8a42ed90ea13b9d911adac73
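To illustrate the second option, here is a rough, unverified sketch of the remove-and-reinsert route, reusing the names from the question (arr stands for the underlying Arrangement_2, i.e. *(self.arr), and acc, v, newPoint are as above); treat it as an outline rather than tested code:
// Collect the incident edges and their opposite endpoints first,
// because removing edges invalidates the circulator.
std::vector<Arrangement_2::Vertex_handle> neighbors;
std::vector<Arrangement_2::Halfedge_handle> edges;
Arrangement_2::Halfedge_around_vertex_circulator curr, first;
curr = first = v->incident_halfedges();
do {
    Arrangement_2::Halfedge_handle he = curr;   // circulator -> handle
    neighbors.push_back(he->source());          // the other endpoint
    edges.push_back(he);
} while (++curr != first);

// Detach the incident edges but keep their end vertices
// (the two 'false' flags prevent removal of isolated end vertices).
for (std::size_t i = 0; i < edges.size(); ++i)
    arr.remove_edge(edges[i], false, false);

// Move the now-isolated vertex.
acc.modify_vertex_ex(v, newPoint);

// Re-attach the edges with the specialized insertion function,
// which only performs local work around the two end vertices.
for (std::size_t i = 0; i < neighbors.size(); ++i)
    arr.insert_at_vertices(Segment_2(neighbors[i]->point(), newPoint), neighbors[i], v);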

Related

Best datastructure for iterating over and moving elements to front

As part of the solution to a bigger problem, I am computing the solution to a maximum flow problem. In my implementation of the relabel-to-front algorithm I'm hitting a performance bottleneck that I didn't expect.
The general structure for storing the graph data is as follows:
struct edge{
    int destination;
    int capacity;
};

struct vertex{
    int e_flow;
    int h;
    vector<edge> edges;
};
The specifics of the algorithm are not that important to the question. In the main loop of the solution I'm looping over all vertices except the source and the sink. If at some point a change is made to a vertex, that vertex is put at the front of the list and the iteration starts again from the start, until the end of the list is reached and we terminate. This part currently looks as follows:
//nodes are 0..nodeCount-1 with source=0 and sink=nodeCount-1
vector<int> toDischarge(nodeCount-2,0);
for(int i=1;i<sink;i++){
    toDischarge[i-1]=i;
}//skip over source and sink

//custom pointer to the entry of toDischarge we are currently accessing
int point = 0;
while(point != nodeCount-2){
    int val = toDischarge[point];
    int oldHeight = graph[val].h;
    discharge(val, graph, graph[val].e_flow);
    if(graph[val].h != oldHeight){
        rotate(toDischarge.begin(), toDischarge.begin()+point, toDischarge.begin()+point+1);
        //if the value of the vertex has changed move it to the front and reset pointer
        point = 0;
    }
    point++;
}
I tried using an std::list before the vector solution, but it was even slower, which conceptually didn't make sense to me since (re)moving elements in a list should be cheap. After some research I found that it probably performed badly because of poor cache locality.
Even with the vector solution I did some basic benchmarking using valgrind. If I understand the results correctly, over 30% of my execution time is spent just doing vector element accesses.
Another solution I've tried is making a copy of the vertex needed for that iteration into a variable since it is accessed multiple times, but that was even worse performance because I think it is also making a copy of the whole edge list.
What data structure would improve the general performance of these operations? I'm also interested in other data structures for storing the graph data if that would help.
It seems to me that this is what std::deque<> is for. Imagine it as a 'non-contiguous vector', or a number of vector-like blocks tied together. You can use the same interface as with vector, except that you cannot assume that adding an index to a pointer to the first element yields the given element (or anything sensible other than UB); you need to use [] for indexing. Also, you have dq.insert(it, elem); that is quick if it is std::begin(dq) or std::end(dq).
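As an illustration, a minimal sketch of the move-to-front step with a deque (toDischarge and the index follow the question); note that erasing from the middle still shifts the elements on one side, so the main win over rotate on a vector is the constant-time push_front:
#include <deque>
#include <cstddef>

// Move the work-list entry at index 'pos' to the front of the deque.
void moveToFront(std::deque<int>& toDischarge, std::size_t pos) {
    int val = toDischarge[pos];
    toDischarge.erase(toDischarge.begin() + pos);  // still moves elements on one side
    toDischarge.push_front(val);                   // constant time at the front
}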

Generalised suffix tree traversal to find longest common substring

I'm working with suffix trees. As far as I can tell, I have Ukkonen's algorithm running correctly to build a generalised suffix tree from an arbitrary number of strings. I'm now trying to implement a find_longest_common_substring() method to do exactly that. For this to work, I understand that I need to find the deepest shared edge (with depth in terms of characters, rather than edges) between all strings in the tree, and I've been struggling for a few days to get the traversal right.
Right now I have the following in C++. I'll spare you all my code, but for context, I'm keeping the edges of each node in an unordered_map called outgoing_edges, and each edge has a vector of ints recorded_strings containing integers identifying the added strings. The child field of an edge is the node it leads to, and l and r identify its leftmost and rightmost indices, respectively. Finally, current_string_number is the current number of strings in the tree.
SuffixTree::Edge * SuffixTree::find_deepest_shared_edge(SuffixTree::Node * start, int current_length, int &longest) {
    Edge * deepest_shared_edge = new Edge;
    auto it = start->outgoing_edges.begin();
    while (it != start->outgoing_edges.end()) {
        if (it->second->recorded_strings.size() == current_string_number + 1) {
            int edge_length = it->second->r - it->second->l + 1;
            int path_length = current_length + edge_length;
            find_deepest_shared_edge(it->second->child, path_length, longest);
            if (path_length > longest) {
                longest = path_length;
                deepest_shared_edge = it->second;
            }
        }
        it++;
    }
    return deepest_shared_edge;
}
When trying to debug, as best I can tell, the traversal runs mostly fine, and correctly records the path length and sets longest. However, for reasons I don't quite understand, in the innermost conditional, deepest_shared_edge sometimes seems to get updated to a mistaken edge. I suspect I maybe don't quite understand how it->second is updated throughout the recursion. Yet I'm not quite sure how to go about fixing this.
I'm aware of this similar question, but the approach seems sufficiently different that I'm not quite sure how it applies here.
I'm mainly doing this for fun and learning, so I don't necessarily need working code to replace the above - pseudocode or just an explanation of where I'm confused would be just as welcome.
Your handling of deepest_shared_edge is wrong. First, the allocation you do at the start of the function is a memory leak, since you never free the memory. Secondly, the result of the recursive call is ignored, so whatever deepest edge it finds is lost (although you update the depth, you don't keep track of the deepest edge).
To fix this, you should either pass deepest_shared_edge as a reference parameter (like you do for longest), or you can initialize it to nullptr, then check the return from your recursive call for nullptr and update it appropriately.
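To make that concrete, here is a minimal sketch of the nullptr-based variant, assuming the member names from the question (outgoing_edges, recorded_strings, child, l, r, current_string_number); the caller starts with longest = 0 and must check the result for nullptr:
SuffixTree::Edge * SuffixTree::find_deepest_shared_edge(SuffixTree::Node * start, int current_length, int &longest) {
    Edge * deepest_shared_edge = nullptr;   // no allocation, so nothing can leak
    for (auto it = start->outgoing_edges.begin(); it != start->outgoing_edges.end(); ++it) {
        if (static_cast<int>(it->second->recorded_strings.size()) == current_string_number + 1) {
            int edge_length = it->second->r - it->second->l + 1;
            int path_length = current_length + edge_length;
            // Keep whatever the recursion found instead of discarding it.
            Edge * deeper = find_deepest_shared_edge(it->second->child, path_length, longest);
            if (deeper != nullptr)
                deepest_shared_edge = deeper;
            // Otherwise this edge itself may end deeper than anything seen so far.
            if (path_length > longest) {
                longest = path_length;
                deepest_shared_edge = it->second;
            }
        }
    }
    return deepest_shared_edge;
}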

Some vector elements do not change

I am experiencing very strange behaviour, which I cannot explain. I hope someone might shed some light on it.
Code snippet first:
class TContour {
public:
    typedef std::pair<int,int> TEdge; // an edge is defined by indices of vertices
    typedef std::vector<TEdge> TEdges;
    TEdges m_oEdges;

    void splitEdge(int iEdgeIndex, int iMiddleVertexIndex) {
        TEdge & oEdge = m_oEdges[iEdgeIndex];
        m_oEdges.push_back(TEdge(oEdge.first, iMiddleVertexIndex));
        oEdge = TEdge(oEdge.second, iMiddleVertexIndex); // !!! THE PROBLEM
    }

    void splitAllEdges(void) {
        size_t iEdgesCnt = m_oEdges.size();
        for (int i=0; i<iEdgesCnt; ++i) {
            int iSomeVertexIndex = 10000; // some new value, not actually important
            splitEdge(i, iSomeVertexIndex);
        }
    }
};
When I call splitAllEdges(), the original edges are changed and new edges are added (doubling the container size). Everything is as expected, except for one original edge, which does not change. Should it be of any interest, its index is 3 and its value is [1,242]. All the other original edges change, but this one remains unchanged. Adding debug prints confirms that the edge is written with a different value, yet the contents of m_oEdges do not change.
I have a simple workaround: replacing the problematic line with m_oEdges[iEdgeIndex] = TEdge(oEdge.second, iMiddleVertexIndex); does fix the issue. My concern, though, is the cause of the unexpected behaviour. Might it be a compiler bug (and if so, what other issues should I expect?), or am I overlooking some stupid bug in my code?
/usr/bin/c++ --version
c++ (Debian 4.9.2-10) 4.9.2
Switching from c++98 to c++11 did not change anything.
You're using an invalid reference after your push_back operation.
This:
TEdge & oEdge = m_oEdges[iEdgeIndex];
acquires the reference. Then this:
m_oEdges.push_back(TEdge(oEdge.first, iMiddleVertexIndex));
potentially resizes the vector, and in so doing, invalidates the oEdge reference. At which point this:
oEdge = TEdge(oEdge.second, iMiddleVertexIndex);
is no longer defined behavior, as you're using a dangling reference. Reuse the index, not the reference, for example:
m_oEdges[iEdgeIndex] = TEdge(m_oEdges[iEdgeIndex].second, iMiddleVertexIndex);
Others have mentioned the invalidation of the reference, so I won't go into more details on that.
If performance is critical, you could explicitly reserve enough space in the original vector for the new edges before you start looping. Since push_back does not invalidate references when no reallocation takes place, this avoids the problem, but it is fragile: if the reservation ever falls out of sync with the loop, you are back to undefined behaviour.
A safer, but slightly slower method would be to iterate through the vector, changing existing edges and generating new edges in a new vector (with sufficient space reserved beforehand for performance), and then at the end, append the new vector to the existing one.
The safest way (including being completely exception safe) would be to create a new vector (reserving double the size of the initial vector), iterate through the initial vector (without modifying any of its edges), pushing two new edges into the new vector for each old edge, and then, right at the end, swap() the old vector with the new vector.
A big positive side-effect of this last approach is that your code either succeeds completely, or leaves the original edges unchanged. It maintains the integrity of the data even in the face of disaster.
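For illustration, a minimal sketch of that last variant, reusing the TContour/TEdge definitions from the question (the placeholder vertex index is kept from the original code):
void splitAllEdges(void) {
    TEdges oNewEdges;
    oNewEdges.reserve(2 * m_oEdges.size());    // room for both halves of every edge
    for (size_t i = 0; i < m_oEdges.size(); ++i) {
        const TEdge & oEdge = m_oEdges[i];     // safe: m_oEdges is not modified in this loop
        int iMiddleVertexIndex = 10000;        // some new value, as in the question
        oNewEdges.push_back(TEdge(oEdge.first, iMiddleVertexIndex));
        oNewEdges.push_back(TEdge(oEdge.second, iMiddleVertexIndex));
    }
    m_oEdges.swap(oNewEdges);                  // commit the result only once it is complete
}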
P.S. I notice that you are doing:
TEdge(oEdge.first, iMiddleVertexIndex)
TEdge(oEdge.second, iMiddleVertexIndex)
If the rest of your code is sensitive to ring-orientation you probably want to reverse the parameters for the second edge. i.e.:
TEdge(oEdge.first, iMiddleVertexIndex)
TEdge(iMiddleVertexIndex, oEdge.second)

How to properly manage a vector of void pointers

First, some background:
I'm working on a project which requires me to simulate interactions between objects that can be thought of as polygons (usually triangles or quadrilaterals, almost certainly fewer than seven sides). Each side of a polygon is composed of the radius of two circles, with a variable (and possibly zero) number of 'rivers' of various constant widths passing between them and out of the polygon through some other side. As these rivers and circles and their widths (and the positions of the circles) are specified at runtime, one of these polygons with N sides and M rivers running through it can be completely described by an array of N+2M pointers, each referring to the relevant rivers/circles, starting from an arbitrary corner of the polygon and going around (in principle, since rivers can't overlap, they should be specifiable with less data, but in practice I'm not sure how to implement that).
I was originally programming this in Python, but quickly found that for more complex arrangements performance was unacceptably slow. In porting this over to C++ (chosen because of its portability and compatibility with SDL, which I'm using to render the result once optimization is complete) I am at somewhat of a loss as to how to deal with the polygon structure.
The obvious thing to do is to make a class for them, but as C++ lacks even runtime-sized arrays or multi-type arrays, the only way to do this would be with a ludicrously cumbersome set of vectors describing the list of circles, rivers, and their relative placement, or else an even more cumbersome 'edge' class of some kind. Rather than this, it seems like the better option is to use a much simpler, though still annoying, vector of void pointers, each pointing to the rivers/circles as described above.
Now, the question:
If I am correct, the proper way to handle the relevant memory allocations here with the minimum amount of confusion (not saying much...) is something like this:
int doStuffWithPolygons(){
    std::vector<std::vector<void *>> polygons;
    while(/*some circles aren't assigned a polygon*/){
        std::vector<void *> polygon;
        void *start = &/*next circle that has not yet been assigned a polygon*/;
        void *lastcircle = start;
        void *nextcircle;
        nextcircle = &/*next circle to put into the polygon*/;
        while(nextcircle != start){
            polygon.push_back(lastcircle);
            std::vector<River *> rivers = /*list of rivers between last circle and next circle*/;
            for(unsigned i = 0; i < rivers.size(); i++){
                polygon.push_back(rivers[i]);
            }
            lastcircle = nextcircle;
            nextcircle = &/*next circle to put into the polygon*/;
        }
        polygons.push_back(polygon);
    }
    int score = 0;
    //do whatever you're going to do to evaluate the polygons here
    return score;
}

int main(){
    int bestscore = 0;
    std::vector<int> bestarrangement; //contains position of each circle
    std::vector<int> currentarrangement = /*whatever arbitrary starting arrangement is appropriate*/;
    while(/*not done evaluating polygon configurations*/){
        //fiddle with current arrangement a bit
        int currentscore = doStuffWithPolygons();
        if(currentscore > bestscore){
            bestscore = currentscore;
            bestarrangement = currentarrangement;
        }
    }
    //somehow report what the best arrangement is
    return 0;
}
If I properly understand how this stuff is handled, I shouldn't need any delete or .clear() calls because everything goes out of scope after the function call. Am I correct about this? Also, is there any part of the above that is needlessly complex, or else is insufficiently complex? Am I right in thinking that this is as simple as C++ will let me make it, or is there some way to avoid some of the roundabout construction?
And if your response is going to be something like 'don't use void pointers' or 'just make a polygon class', unless you can explain how it will make the problem simpler, save yourself the trouble. I am the only one who will ever see this code, so I don't care about adhering to best practices. If I forget how/why I did something and it causes me problems later, that's my own fault for insufficiently documenting it, not a reason to have written it differently.
edit
Since at least one person asked, here's my original python, handling the polygon creation/evaluation part of the process:
# lots of setup stuff, such as the Circle and River classes

# circles, rivers contain all the circles / rivers to be placed. tree is a class
# describing which rivers go between which circles, unrelated to the problem at
# hand. arrangement contains the (x,y) position of each circle in the current
# arrangement.
def evaluateArrangement(circles, rivers, tree, arrangement):
    polygons = []
    unassignedCircles = range(len(circles))
    while unassignedCircles:
        polygon = []
        start = unassignedCircles[0]
        lastcircle = start
        lastlastcircle = start
        nextcircle = getNearest(start, arrangement)
        unassignedCircles.pop(start)
        unassignedCircles.pop(nextcircle)
        while not nextcircle == start:
            polygon += [lastcircle]
            polygon += getRiversBetween(tree, lastcircle, nextcircle)
            lastlastcircle = lastcircle
            lastcircle = nextcircle
            # the last argument here guarantees that the new nextcircle is not
            # the same as the last lastcircle, which it otherwise would have
            # been guaranteed to be.
            nextcircle = getNearest(lastcircle, arrangement, lastlastcircle)
            unassignedCircles.pop(nextcircle)
        polygons += [polygon]
    return EvaluatePolygons(polygons, circles, rivers)  # defined outside
void as a template argument must be lower case. Other than that it should work, but I also recommend using a base class for this. With a smart pointer you can let the system handle all the memory management.
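For what it's worth, a minimal sketch of that suggestion; Feature, Circle and River are made-up names standing in for your own types, and shared_ptr assumes the polygons (co-)own the objects they refer to:
#include <memory>
#include <vector>

struct Feature {                   // common base class for everything a polygon side can hold
    virtual ~Feature() = default;  // virtual destructor so deletion through the base is safe
};

struct Circle : Feature { double x = 0, y = 0, radius = 0; };
struct River  : Feature { double width = 0; };

// A polygon is the ordered walk of circles and rivers around its boundary.
typedef std::vector<std::shared_ptr<Feature>> Polygon;

int main() {
    std::vector<Polygon> polygons;

    Polygon polygon;
    polygon.push_back(std::make_shared<Circle>());
    polygon.push_back(std::make_shared<River>());
    polygons.push_back(polygon);

    // No delete (or .clear()) needed: the shared_ptrs release the objects
    // when the last polygon referring to them goes out of scope.
    return 0;
}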

Efficient way to remove elements from std vector according to predicate

I'm writing an algorithm which is supposed to remove, from a set of points stored inside a vector, every element that lies inside any of a list of rects that I supply.
I'm also using this as a testing ground for C++11, so, since I'm still getting used to the new features, I would like to know whether this is an efficient approach or whether it has some particular flaw that I'm not seeing.
vector<tuple<u16, u16, u16, u16>> limits;
FOR_EACH_AREA_TO_REMOVE
    limits.push_back(make_tuple(
        area->x - VIEWPORT_SIZE_X/2,
        area->x + VIEWPORT_SIZE_X/2,
        area->y - VIEWPORT_SIZE_Y/2,
        area->y + VIEWPORT_SIZE_Y/2));
FOR_EACH_AREA_TO_REMOVE_END

vector<Point2D> points;
remove_copy_if(suitablePoints.begin(), suitablePoints.end(),
    back_inserter(points), [&](const Point2D &point) {
        for (auto limit : limits)
            if (point->x > get<0>(limit) &&
                point->x < get<1>(limit) &&
                point->y > get<2>(limit) &&
                point->y < get<3>(limit))
                return true;
        return false;
    }
);
This seems the most straightforward solution to the problem: create a vector of the bounds that must be excluded from the point set and then iterate over the point set. I wonder if there's a better approach. I would like to point out that the set of points can be huge, while the set of rects is fairly small.
You could change auto into auto const&, since you do not need to create a copy of each rectangle in limits as you iterate through the collection:
for (auto const& limit : limits)
// ^^^^^^
This should bring some performance improvement (but as always when performance is concerned, measure it before drawing any conclusions).
Also, unless you need to create a copy of the elements you remove from your vector (the text of the question does not mention this), you could use std::remove_if() instead of std::remove_copy_if().
std::remove_if() works by overwriting removed elements with subsequent ones, and will return the new logical end of the vector without actually resizing the vector itself (which is a desirable behavior if you do not need to do that).
It is therefore up to you whether to actually shrink the vector; if you want to, call std::vector::erase() on the iterator returned by std::remove_if(). This very common practice is known as the erase-remove idiom.
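For reference, a sketch of the whole operation written with the erase-remove idiom (the predicate body follows the snippet in the question, with the auto const& change suggested above):
suitablePoints.erase(
    remove_if(suitablePoints.begin(), suitablePoints.end(),
        [&](const Point2D &point) {
            for (auto const& limit : limits)
                if (point->x > get<0>(limit) &&
                    point->x < get<1>(limit) &&
                    point->y > get<2>(limit) &&
                    point->y < get<3>(limit))
                    return true;   // inside one of the rects: remove it
            return false;          // outside all rects: keep it
        }),
    suitablePoints.end());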