I have a set of edges from a graph and would like to expand it with all edges that share a vertex with any edge already in the set. How could I do this efficiently with Boost.Graph?
The only way I've been able to come up with is the naive solution: extract all the source and target vertices, use boost::adjacent_vertices to get all adjacencies, and then create all the new edges with boost::edge. Is there a better way of doing this?
Context: The graph vertices are centroids of a terrain triangulation, the edges connect vertices whose corresponding triangles are adjacent (so sort of a dual graph). The set of edges I'm looking to expand corresponds to inter-triangle paths which are blocked, and the blocked area is expanding. The area is sort-of circular, so most of the edges I'll see using my naive approach above will already be part of the set.
Instead of considering all adjacent vertices in each step to generate the new edges, use a property map to mark vertices already encountered. Then you only need to consider edges incident to unmarked vertices in each step. A vertex is marked once all edges incident to it have been added to your set.
Given that the internal data structure used by Boost.Graph is either an adjacency list or an adjacency matrix, I do not think that any further improvement is possible.
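For illustration, here is a minimal sketch of that marking scheme (the typedefs and container choices are mine, not from the question; it assumes your edge set is ordered by the edge descriptor's operator<):

```cpp
#include <boost/graph/adjacency_list.hpp>
#include <set>
#include <vector>

using Graph  = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS>;
using Vertex = boost::graph_traits<Graph>::vertex_descriptor;
using Edge   = boost::graph_traits<Graph>::edge_descriptor;

// One expansion step: add every edge incident to an endpoint of an edge in
// 'blocked'. done[v] marks vertices whose incident edges are already all in
// the set, so later steps skip them.
void expand(const Graph& g, std::set<Edge>& blocked, std::vector<bool>& done) {
    std::set<Edge> frontier;
    for (const Edge& e : blocked) {
        for (Vertex v : { boost::source(e, g), boost::target(e, g) }) {
            if (done[v]) continue;                  // already fully expanded
            for (auto [oi, oi_end] = boost::out_edges(v, g); oi != oi_end; ++oi)
                frontier.insert(*oi);
            done[v] = true;                         // all incident edges added
        }
    }
    blocked.insert(frontier.begin(), frontier.end());
}
```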
Related
In a directed graph, how can you efficiently count the number of vertices from which every other vertex in the graph is reachable?
If there is no cycle in the graph, there can be only one such vertex: it must have in-degree zero, and there must be no other vertices with in-degree zero. You then have to run a DFS to check whether all other vertices are reachable from it. So the answer is either one or zero, depending on the result of the DFS.
If there is a cycle, then all the vertices in the cycle have this property or none of them have it.
If you detect a cycle, replace all the vertices in the cycle with one vertex and keep a label for that vertex of how many vertices it represents. Use the same procedure as above. I.e., check in-degrees and run DFS from the new node. The answer will be zero or the label.
Detecting a cycle can be accomplished using a DFS.
There might be several cycles in the graph. In that case you have to eliminate all of them. You can eliminate all of them in one linear pass of DFS, but that's tricky. You could also use Tarjan's algorithm as suggested by btilly in his answer.
Use Tarjan's strongly connected components algorithm to detect all cycles, then construct a graph with each strongly connected component collapsed to a single node.
Now, in this new graph, it is sufficient to look for a vertex with no in-edges. If that vertex connects to every other one (verifiable with a linear breadth-first search), then every vertex in the strongly connected component it came from is in your set; otherwise, no vertex is.
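A hedged sketch of this collapse-and-check idea using Boost's strong_components (all identifiers are mine). One observation it relies on: the collapsed graph is a DAG, and in a finite DAG every vertex is reachable from some in-degree-zero vertex, so a unique source component necessarily reaches everything:

```cpp
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/strong_components.hpp>
#include <boost/property_map/property_map.hpp>
#include <vector>

using Digraph = boost::adjacency_list<boost::vecS, boost::vecS, boost::directedS>;

// Number of vertices from which every other vertex is reachable.
std::size_t count_universal_sources(const Digraph& g) {
    // Collapse cycles: comp[v] = index of v's strongly connected component.
    std::vector<int> comp(boost::num_vertices(g));
    int n = boost::strong_components(
        g, boost::make_iterator_property_map(comp.begin(),
                                             boost::get(boost::vertex_index, g)));

    // In-degrees and sizes of the collapsed components.
    std::vector<std::size_t> indegree(n, 0), size(n, 0);
    for (std::size_t v = 0; v < boost::num_vertices(g); ++v) ++size[comp[v]];
    for (auto [ei, eend] = boost::edges(g); ei != eend; ++ei) {
        int a = comp[boost::source(*ei, g)], b = comp[boost::target(*ei, g)];
        if (a != b) ++indegree[b];   // an edge of the condensation
    }

    // The condensation is a DAG: a unique source component reaches all; two
    // or more source components cannot reach each other, so the answer is 0.
    int src = -1;
    for (int c = 0; c < n; ++c)
        if (indegree[c] == 0) {
            if (src != -1) return 0;
            src = c;
        }
    return src < 0 ? 0 : size[src];
}
```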
I am trying to test my graph class's implementation of Dijkstra's algorithm. To do this, I generate a graph with a couple thousand vertices and then make it connected by randomly adding thousands of edges until the graph is connected. I can then run the search between any two random vertices over and over and be sure that there is a path between them. The problem is that I often end up with a nearly dense graph, and because I am using an adjacency list representation, that makes my search algorithm terribly slow.
Question:
Given a set of vertices V, how do you generate a strongly connected directed graph that has significantly fewer edges than a dense graph over the same vertices would have?
I was thinking about simply doing the following:
vertex 1 <--> vertex 2, vertex 2 <--> vertex 3, ..., vertex n-1 <--> vertex n
And then randomly adding roughly n/10 edges throughout the graph, but this doesn't seem like an optimal way of coming up with random graph structures to test my search algorithms on.
One approach would be to maintain a set of strongly connected components (starting with |V| single-vertex components), and in each iteration, merge some random subset of them into a single connected component by connecting a random vertex of each one to a random vertex of the next one, forming a cycle.
This will tend to generate very sparse graphs, so depending on your use case, you might want to toss in some extra random edges as well.
EDIT: Intuitively I think you'd want to use an exponential distribution when deciding how many components to merge in a single iteration. I don't have any real support for that, though.
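A rough sketch of this construction (all identifiers are mine; for simplicity the number of components merged per round is drawn uniformly rather than from the exponential distribution suggested in the EDIT):

```cpp
#include <algorithm>
#include <random>
#include <utility>
#include <vector>

// Returns the edge list of a strongly connected digraph on n vertices.
std::vector<std::pair<int, int>> random_scc_graph(int n, std::mt19937& rng) {
    std::vector<std::vector<int>> comps(n);        // start: n singleton SCCs
    for (int v = 0; v < n; ++v) comps[v] = {v};
    std::vector<std::pair<int, int>> edges;

    while (comps.size() > 1) {
        // Pick how many components to merge this round (2..all of them).
        std::uniform_int_distribution<std::size_t> kd(2, comps.size());
        std::size_t k = kd(rng);
        std::shuffle(comps.begin(), comps.end(), rng);

        // Link a random vertex of each chosen component to a random vertex
        // of the next one; the chain closes into a cycle at i == k-1.
        std::vector<int> merged;
        for (std::size_t i = 0; i < k; ++i) {
            const std::vector<int>& a = comps[i];
            const std::vector<int>& b = comps[(i + 1) % k];
            std::uniform_int_distribution<std::size_t>
                da(0, a.size() - 1), db(0, b.size() - 1);
            edges.emplace_back(a[da(rng)], b[db(rng)]);
            merged.insert(merged.end(), a.begin(), a.end());
        }
        comps.erase(comps.begin(), comps.begin() + k);
        comps.push_back(std::move(merged));
    }
    return edges;   // toss in extra random edges here if desired
}
```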
I don't know if there is a better way of doing it, but at least this seems to work:
I would add E (directed) edges between random vertices. This will generate several clusters of vertices.
Then I need to connect those clusters to form a chain of clusters, ensuring that from any cluster I can reach any other cluster. For this I can label a random vertex of each cluster as the "master" vertex and join the master vertices into a loop: each master is connected to the next one, and the last master is connected back to the first, closing the loop. Thus, you have a strongly connected directed graph composed of clusters (not of vertices yet).
Now, in order to turn it into a strongly connected digraph composed of vertices, I need to make each cluster a strongly connected digraph by itself. But this is easy if I run a DFS starting at the master vertex of a cluster and, each time I find a leaf, add an edge from that leaf back to the master vertex. Note that the DFS must not traverse outside of the cluster.
I think this may work, though the topology will not be truly random: it will look like a big loop composed of smaller graphs joined together. But depending on the algorithm you need to test, this may come in handy.
EDIT:
If after that you want to have a more random topology, you can add random edges between vertices of different clusters. That doesn't invalidate the rules and creates more complex paths for your algorithm to traverse.
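A compact sketch of this recipe (identifiers are mine). One detail the description glosses over: with directed random edges, some vertices of a cluster may be unreachable from its master, so this sketch falls back to direct edges to and from the master for those:

```cpp
#include <random>
#include <utility>
#include <vector>

struct DSU {                      // union-find, used only to track clusters
    std::vector<int> p;
    explicit DSU(int n) : p(n) { for (int i = 0; i < n; ++i) p[i] = i; }
    int find(int v) { return p[v] == v ? v : p[v] = find(p[v]); }
    void unite(int a, int b) { p[find(a)] = find(b); }
};

std::vector<std::pair<int, int>> cluster_chain_graph(int n, int E, std::mt19937& rng) {
    std::uniform_int_distribution<int> dv(0, n - 1);
    std::vector<std::vector<int>> adj(n);
    std::vector<std::pair<int, int>> edges;
    DSU dsu(n);
    auto add = [&](int a, int b) {
        edges.emplace_back(a, b); adj[a].push_back(b); dsu.unite(a, b);
    };

    for (int i = 0; i < E; ++i) add(dv(rng), dv(rng));  // random clusters

    std::vector<int> root(n);                 // freeze cluster membership
    for (int v = 0; v < n; ++v) root[v] = dsu.find(v);

    // One "master" per cluster, joined into a directed loop.
    std::vector<int> master(n, -1), masters;
    for (int v = 0; v < n; ++v)
        if (master[root[v]] == -1) { master[root[v]] = v; masters.push_back(v); }
    if (masters.size() > 1)
        for (std::size_t i = 0; i < masters.size(); ++i)
            add(masters[i], masters[(i + 1) % masters.size()]);

    // DFS inside each cluster from its master; dead ends link back to it.
    std::vector<char> seen(n, 0);
    for (int m : masters) {
        std::vector<int> stack{m};
        seen[m] = 1;
        while (!stack.empty()) {
            int v = stack.back(); stack.pop_back();
            bool leaf = true;
            for (int w : adj[v])
                if (root[w] == root[m] && !seen[w]) {
                    seen[w] = 1; stack.push_back(w); leaf = false;
                }
            if (leaf && v != m) edges.emplace_back(v, m);  // leaf -> master
        }
    }
    // Fallback (my addition): vertices the DFS never reached get direct
    // edges to and from their cluster's master.
    for (int v = 0; v < n; ++v)
        if (!seen[v]) {
            edges.emplace_back(master[root[v]], v);
            edges.emplace_back(v, master[root[v]]);
        }
    return edges;
}
```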
I'm asking this question out of curiosity, having first tried to implement such an algorithm myself before using GLU's tessellator for performance reasons.
I've looked into common algorithms (Delaunay and ear clipping are often mentioned), but I can't seem to understand how GLU does its job so well all the time.
Do any of you have interesting papers or articles on that subject?
There are some notes alongside the source:
This is only a very brief overview. There is quite a bit of
additional documentation in the source code itself.
Goals of robust tesselation
The tesselation algorithm is fundamentally a 2D algorithm. We
initially project all data into a plane; our goal is to robustly
tesselate the projected data. The same topological tesselation is
then applied to the input data.
Topologically, the output should always be a tesselation. If the
input is even slightly non-planar, then some triangles will
necessarily be back-facing when viewed from some angles, but the goal
is to minimize this effect.
The algorithm needs some capability of cleaning up the input data as
well as the numerical errors in its own calculations. One way to do
this is to specify a tolerance as defined above, and clean up the
input and output during the line sweep process. At the very least,
the algorithm must handle coincident vertices, vertices incident to an
edge, and coincident edges.
Phases of the algorithm
1. Find the polygon normal N.
2. Project the vertex data onto a plane. It does not need to be perpendicular to the normal, eg. we can project onto the plane perpendicular to the coordinate axis whose dot product with N is largest.
3. Using a line-sweep algorithm, partition the plane into x-monotone regions. Any vertical line intersects an x-monotone region in at most one interval.
4. Triangulate the x-monotone regions.
5. Group the triangles into strips and fans.
Finding the normal vector
A common way to find a polygon normal is to compute the signed area
when the polygon is projected along the three coordinate axes. We
can't do this, since contours can have zero area without being
degenerate (eg. a bowtie).
We fit a plane to the vertex data, ignoring how they are connected
into contours. Ideally this would be a least-squares fit; however for
our purpose the accuracy of the normal is not important. Instead we
find three vertices which are widely separated, and compute the normal
to the triangle they form. The vertices are chosen so that the
triangle has an area at least 1/sqrt(3) times the largest area of any
triangle formed using the input vertices.
The contours do affect the orientation of the normal; after computing
the normal, we check that the sum of the signed contour areas is
non-negative, and reverse the normal if necessary.
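This is not the actual GLU code, but a small sketch of the idea just described: take the two vertices farthest apart along a coordinate axis, then a third vertex maximizing the triangle area, and use that triangle's (unnormalized) normal:

```cpp
#include <array>
#include <cstddef>
#include <vector>

using Vec3 = std::array<double, 3>;

static Vec3 sub(const Vec3& a, const Vec3& b) {
    return {a[0] - b[0], a[1] - b[1], a[2] - b[2]};
}
static Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0]};
}
static double len2(const Vec3& a) { return a[0]*a[0] + a[1]*a[1] + a[2]*a[2]; }

// Approximate polygon normal: v1, v2 are the extremes along the coordinate
// axis with the largest spread; v3 maximizes the area of triangle (v1,v2,v3).
Vec3 approx_normal(const std::vector<Vec3>& pts) {
    if (pts.size() < 3) return {0, 0, 1};   // degenerate input
    std::size_t lo = 0, hi = 0;
    double spread = -1.0;
    for (int c = 0; c < 3; ++c) {
        std::size_t mn = 0, mx = 0;
        for (std::size_t i = 1; i < pts.size(); ++i) {
            if (pts[i][c] < pts[mn][c]) mn = i;
            if (pts[i][c] > pts[mx][c]) mx = i;
        }
        if (pts[mx][c] - pts[mn][c] > spread) {
            spread = pts[mx][c] - pts[mn][c];
            lo = mn; hi = mx;
        }
    }
    Vec3 d1 = sub(pts[hi], pts[lo]), best{0, 0, 0};
    double best_area = -1.0;
    for (std::size_t i = 0; i < pts.size(); ++i) {
        Vec3 n = cross(d1, sub(pts[i], pts[lo]));
        if (len2(n) > best_area) { best_area = len2(n); best = n; }
    }
    return best;   // unnormalized; its sign is fixed by the contour areas
}
```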
Projecting the vertices
We project the vertices onto a plane perpendicular to one of the three
coordinate axes. This helps numerical accuracy by removing a
transformation step between the original input data and the data
processed by the algorithm. The projection also compresses the input
data; the 2D distance between vertices after projection may be smaller
than the original 3D distance. However by choosing the coordinate
axis whose dot product with the normal is greatest, the compression
factor is at most 1/sqrt(3).
Even though the accuracy of the normal is not that important (since
we are projecting perpendicular to a coordinate axis anyway), the
robustness of the computation is important. For example, if there are many vertices which lie almost along a line, and one vertex V
which is well-separated from the line, then our normal computation
should involve V; otherwise the results will be garbage.
The advantage of projecting perpendicular to the polygon normal is
that computed intersection points will be as close as possible to
their ideal locations. To get this behavior, define TRUE_PROJECT.
The Line Sweep
There are three data structures: the mesh, the event queue, and the
edge dictionary.
The mesh is a "quad-edge" data structure which records the topology of
the current decomposition; for details see the include file "mesh.h".
The event queue simply holds all vertices (both original and computed
ones), organized so that we can quickly extract the vertex with the
minimum x-coord (and among those, the one with the minimum y-coord).
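That ordering is easy to picture as an ordinary min-priority queue (GLU has its own priority-queue implementation; this is just a sketch of the comparison):

```cpp
#include <queue>
#include <vector>

struct Vert { double x, y; };

// Inverted comparison so the priority_queue pops the minimum-x vertex,
// breaking ties by minimum y.
struct VertGreater {
    bool operator()(const Vert& a, const Vert& b) const {
        return a.x != b.x ? a.x > b.x : a.y > b.y;
    }
};

using EventQueue = std::priority_queue<Vert, std::vector<Vert>, VertGreater>;
```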
The edge dictionary describes the current intersection of the sweep
line with the regions of the polygon. This is just an ordering of the
edges which intersect the sweep line, sorted by their current order of
intersection. For each pair of edges, we store some information about
the monotone region between them -- these are called "active regions"
(since they are crossed by the current sweep line).
The basic algorithm is to sweep from left to right, processing each
vertex. The processed portion of the mesh (left of the sweep line) is
a planar decomposition. As we cross each vertex, we update the mesh
and the edge dictionary, then we check any newly adjacent pairs of
edges to see if they intersect.
A vertex can have any number of edges. Vertices with many edges can
be created as vertices are merged and intersection points are
computed. For unprocessed vertices (right of the sweep line), these
edges are in no particular order around the vertex; for processed
vertices, the topological ordering should match the geometric
ordering.
The vertex processing happens in two phases: first we process all the left-going edges (all these edges are currently in the edge dictionary). This involves:
- deleting the left-going edges from the dictionary;
- relinking the mesh if necessary, so that the order of these edges around the event vertex matches the order in the dictionary;
- marking any terminated regions (regions which lie between two left-going edges) as either "inside" or "outside" according to their winding number.
When there are no left-going edges, and the event vertex is in an
"interior" region, we need to add an edge (to split the region into
monotone pieces). To do this we simply join the event vertex to the
rightmost left endpoint of the upper or lower edge of the containing
region.
Then we process the right-going edges. This involves:
- inserting the edges in the edge dictionary;
- computing the winding number of any newly created active regions (we can compute this incrementally, using the winding of each edge that we cross as we walk through the dictionary);
- relinking the mesh if necessary, so that the order of these edges around the event vertex matches the order in the dictionary;
- checking any newly adjacent edges for intersection and/or merging.
If there are no right-going edges, again we need to add one to split
the containing region into monotone pieces. In our case it is most
convenient to add an edge to the leftmost right endpoint of either
containing edge; however we may need to change this later (see the
code for details).
Invariants
These are the most important invariants maintained during the sweep.
We define a function VertLeq(v1,v2) which defines the order in which
vertices cross the sweep line, and a function EdgeLeq(e1,e2; loc)
which says whether e1 is below e2 at the sweep event location "loc".
This function is defined only at sweep event locations which lie
between the rightmost left endpoint of {e1,e2}, and the leftmost right
endpoint of {e1,e2}.
Invariants for the Edge Dictionary:
- Each pair of adjacent edges e2=Succ(e1) satisfies EdgeLeq(e1,e2) at any valid location of the sweep event.
- If EdgeLeq(e2,e1) as well (at any valid sweep event), then e1 and e2 share a common endpoint.
- For each e in the dictionary, e->Dst has been processed but not e->Org.
- Each edge e satisfies VertLeq(e->Dst,event) && VertLeq(event,e->Org), where "event" is the current sweep line event.
- No edge e has zero length.
- No two edges have identical left and right endpoints.

Invariants for the Mesh (the processed portion):
- The portion of the mesh left of the sweep line is a planar graph, ie. there is some way to embed it in the plane.
- No processed edge has zero length.
- No two processed vertices have identical coordinates.
- Each "inside" region is monotone, ie. can be broken into two chains of monotonically increasing vertices according to VertLeq(v1,v2). (A non-invariant: these chains may intersect slightly due to numerical errors, but this does not affect the algorithm's operation.)

Invariants for the Sweep:
- If a vertex has any left-going edges, then these must be in the edge dictionary at the time the vertex is processed.
- If an edge is marked "fixUpperEdge" (it is a temporary edge introduced by ConnectRightVertex), then it is the only right-going edge from its associated vertex. (This says that these edges exist only when it is necessary.)
Robustness
The key to the robustness of the algorithm is maintaining the
invariants above, especially the correct ordering of the edge
dictionary. We achieve this by:
- writing the numerical computations for maximum precision rather than maximum speed;
- making no assumptions at all about the results of the edge intersection calculations -- for sufficiently degenerate inputs, the computed location is not much better than a random number;
- when numerical errors violate the invariants, restoring them by making topological changes when necessary (ie. relinking the mesh structure).
Triangulation and Grouping
We finish the line sweep before doing any triangulation. This is
because even after a monotone region is complete, there can be further
changes to its vertex data because of further vertex merging.
After triangulating all monotone regions, we want to group the
triangles into fans and strips. We do this using a greedy approach.
The triangulation itself is not optimized to reduce the number of
primitives; we just try to get a reasonable decomposition of the
computed triangulation.
I have a set of non-overlapping polygons. These polygons can share vertices and edges, but they strictly do not overlap.
Now I am going to mesh them using the Constrained Delaunay Triangulation (CDT) technique. I can get the mesh without problems.
My problem is that, after meshing, I want to know which mesh element belongs to which original polygon. My current approach is to compute the centroid of each mesh element and check which of the original polygons that centroid falls into. But I don't like this approach, as it is very computationally intensive.
Are there any efficient ways to do this (in terms of big-O runtime)? My projects involve tens of thousands of polygons and I don't want the speed to slow down.
Edit: Checking that all the vertices of a mesh element share a common face is not going to work, because there are cases where all the vertices can have more than one common face, as below (the dotted line forms a mesh element whose vertices have two common faces):
I can think of two options, both somehow already mentioned:
Maintain the information in your points/vertices. See this other related question.
Recompute the information the way you did, locating each mesh element centroid in the original polygon, but optimize this by using a spatial_sort and locating the centroids sequentially, using the previous result as the hint for starting the next point location (as sketched below).
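A hedged sketch of that second option with CGAL (typedefs assumed; mapping the located face back to its source polygon is whatever bookkeeping your application uses). spatial_sort puts the query points in a spatially coherent order, so each locate() can start walking from the previous result:

```cpp
#include <CGAL/Exact_predicates_inexact_constructions_kernel.h>
#include <CGAL/Constrained_Delaunay_triangulation_2.h>
#include <CGAL/spatial_sort.h>
#include <vector>

using K   = CGAL::Exact_predicates_inexact_constructions_kernel;
using CDT = CGAL::Constrained_Delaunay_triangulation_2<K>;

// Locate many query points, reusing the previous face as the starting hint.
void locate_all(const CDT& cdt, std::vector<K::Point_2>& queries) {
    CGAL::spatial_sort(queries.begin(), queries.end());  // Hilbert order
    CDT::Face_handle hint;                               // previous result
    for (const K::Point_2& q : queries) {
        hint = cdt.locate(q, hint);   // the walk starts near the last hit
        // ... look up which input polygon 'hint' belongs to here ...
    }
}
```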
What about labeling each of your original vertices with a polygon id (or several, I guess, since polygons can share vertices)? Then, if I understand DT correctly, you can look at the three vertices of a given triangle in the mesh and see if they share a common label; if so, that mesh element came from the labeled polygon.
As Mikeb says, label all your original vertices with a polygon id.
Since you want the one that's inside the polygon, just make sure you always go clockwise around the polygons; this ensures that if the points are shared by two polygons, you get the one facing the correct direction.
I would expect this approach to remain close to O(n), where n is the number of points, as each triangle can only have one or two polygons that overlap all three points.
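A minimal sketch of the common-label test (container choices are mine): each vertex carries the sorted ids of the polygons it belongs to, and a triangle's polygon is the id shared by all three vertices:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

using Labels = std::vector<int>;   // sorted polygon ids attached to a vertex

// Intersect the label sets of a triangle's three vertices; the result is
// usually a single polygon id (ties need the orientation rule above).
Labels common_polygon(const Labels& a, const Labels& b, const Labels& c) {
    Labels ab, abc;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(ab));
    std::set_intersection(ab.begin(), ab.end(), c.begin(), c.end(),
                          std::back_inserter(abc));
    return abc;
}
```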
Create a new graph G(V,E) in the following way. For every mesh element create a node in V. For every dashed (non-constrained) edge create an edge in E that connects the two corresponding mesh elements. Don't map solid (constrained) edges into edges in E.
Run ConnectedComponents(G).
Every mesh element will then be labeled with a component label (with a 1-to-1 correspondence to polygons).
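A short sketch of this with Boost's connected_components (identifiers are mine; the triangle adjacency pairs across non-constrained edges are assumed to come from your CDT):

```cpp
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/connected_components.hpp>
#include <utility>
#include <vector>

using Dual = boost::adjacency_list<boost::vecS, boost::vecS, boost::undirectedS>;

// One node per triangle; an edge only where two triangles share a
// non-constrained ("dashed") side. The component index of a triangle then
// identifies its original polygon.
std::vector<int> label_triangles(
        std::size_t num_triangles,
        const std::vector<std::pair<int, int>>& dashed_adjacencies) {
    Dual g(num_triangles);
    for (const auto& [a, b] : dashed_adjacencies)  // constrained sides omitted
        boost::add_edge(a, b, g);
    std::vector<int> comp(num_triangles);
    boost::connected_components(g, comp.data());
    return comp;
}
```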
Maybe you can call CDT separately for each polygon, and label the triangles with their polygon after each call.
I have an undirected graph network made of streets and crossings, and I would like to know if there is any algorithm to help me find closed loops, i.e. places where I can put buildings.
Any help appreciated, thanks!
Based on comments to my earlier answer:
It seems the graphs are all undirected and planar, i.e. they can be embedded in a 2D plane without crossing edges, and one such embedding is given. This embedding partitions the plane. E.g. a figure 8 partitions the plane into three: two "inner" areas and an infinite outer area. An alternative view is that all edges of a node are cyclically ordered. (This is the essential part that allows us to apply graph theory.)
A partition is necessarily enclosed by a cycle, but not all cycles may partition a single area. In the trivial case of a figure 8, though, all three areas are directly associated with a distinct cycle.
The input graph can generally be simplified. Some nodes may have only a single edge; they can't contribute to the partitioning and can be removed along with the edge. Other nodes have two edges connecting distinct nodes. Here, the node and the two edges can be replaced by a direct edge connecting the neighbors. I.e. a figure 8 graph can be simplified to two nodes and three edges between them. (This is not a necessary step but helps computation).
Now, each edge will have an area on either side (since the graph is undirected, "left" and "right" aren't obvious distinctions). So, for |E| edges, we need to consider 2 * |E| sides. These are in general not distinct. Two adjacent edges (connected to the same node) may border the same area if they're also adjacent in the cyclic order of edges of that node. Obviously, for nodes with only two edges, the two edges share both areas (which is why we eliminated them in the previous step). For nodes with three edges, any two edges share at least one area.
So, here's how to enumerate those areas: Assign a sequential number to all edges and vertices. Assign a direction to each edge so it runs from the lower-numbered vertex to the higher. Start with edge 1, right side, and number this area 1. Trace the boundary edges of this area, assigning the same number 1 to the appropriate sides of its boundary edges. You do this by taking, at each node, the next adjacent edge in counter-cyclical order. When you get back to your starting point, you know all edges bounding area 1.
You then check the left side of edge 1. If it's not part of area 1, then it's area 2, and you apply the same algorithm. Next, check edge 2, right side and left side, etc. Each time you find an edge side that's not yet numbered, assign the next area number and trace the edges of the newly found area.
There's a slight problem with determining which area number corresponds to infinity. To see this, take a simple () graph: two edges, two nodes, and two areas (inside and outside). Due to the random numbering of edges and vertices, outside may end up as either 1 or 2. That's unavoidable; in graph theory there's no distinction between inside and outside.
It's a standard function in the Boost Graph library. See this previous answer for details.
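Presumably the function meant is planar_face_traversal. Here is a minimal sketch following the pattern from the BGL documentation (note that boyer_myrvold_planarity_test computes some planar embedding; if you already have the geometric embedding of your street network, supply that instead so the faces match your map):

```cpp
#include <iostream>
#include <vector>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/boyer_myrvold_planar_test.hpp>
#include <boost/graph/planar_face_traversal.hpp>

using namespace boost;

typedef adjacency_list<vecS, vecS, undirectedS,
                       property<vertex_index_t, int>,
                       property<edge_index_t, int>>
    Graph;

// Visitor that just counts faces; add next_vertex()/next_edge() overloads
// to collect the actual loops.
struct face_counter : planar_face_traversal_visitor {
    int count = 0;
    void begin_face() { ++count; }
};

int main() {
    Graph g(4);                       // a square block of four streets
    add_edge(0, 1, g); add_edge(1, 2, g);
    add_edge(2, 3, g); add_edge(3, 0, g);

    // planar_face_traversal needs an edge index map...
    property_map<Graph, edge_index_t>::type e_index = get(edge_index, g);
    int i = 0;
    for (auto [ei, ei_end] = edges(g); ei != ei_end; ++ei)
        put(e_index, *ei, i++);

    // ...and a planar embedding (computed here; use your own if you have it)
    std::vector<std::vector<graph_traits<Graph>::edge_descriptor>>
        embedding(num_vertices(g));
    boyer_myrvold_planarity_test(boyer_myrvold_params::graph = g,
                                 boyer_myrvold_params::embedding = &embedding[0]);

    face_counter vis;
    planar_face_traversal(g, &embedding[0], vis);
    std::cout << vis.count << " faces\n";   // 2: the block and the outer face
}
```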