Comparison of graph implementations in C++ using the STL

To implement a graph we can use a vector of lists, std::vector<std::list<vertex>>, but I have seen it claimed that if we use maps like std::map<vertex, std::set<vertex>> then we can do better. Can anybody explain how this is a better option than the first one, and in what respect (memory, speed, or whatever else) it is better?

There are two differences to note here.
std::vector<std::list<vertex>> is what is known as an "adjacency list", and std::map<vertex, std::set<vertex>> is known as an "adjacency set", with the added difference that the vertex is looked up in a (tree-based) map instead of being used directly as an index into a vector. I'll talk about the first difference first (that is, list<vertex> vs set<vertex>).
The first implementation is basically an array of linked lists, where each linked list gives all the vertices adjacent to a vertex. The second implementation is an ordered map mapping each vertex to a set of adjacent vertices.
Comparison of adjacency list vs adjacency set, order of growth:
Space: O(E + V) vs O(E + V)
Add edge: O(1) vs O(log V)
Check adjacency: O(degree of the vertex checked) vs O(log V)
Iterate through neighbours of a vertex: O(degree of the vertex checked) vs O(log V + degree of the vertex checked)
... where E is the number of edges and V the number of vertices, and degree of a vertex is the number of edges connected to it. (I'm using the language of an undirected graph but you can reason similarly for directed graphs). So if you have a very dense graph (each vertex has lots of edges, i.e. high degree) then you want to use adjacency sets.
Regarding the use of map vs vector: insert and erase are O(N) for vector and O(log N) for map, while lookup is O(1) for vector and O(log N) for map. Depending on your purposes you might prefer one over the other. You should also note that there are cache benefits to using a contiguous memory space (as vector does). I don't know much about that, but there are other answers that discuss it: vector or map, which one to use?
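As a rough illustration of the adjacency check in each representation (a minimal sketch, assuming int vertices numbered 0..V-1; the function names are just for this example):
#include <algorithm>
#include <list>
#include <map>
#include <set>
#include <vector>

// Adjacency list: scanning the list is O(degree of u).
bool adjacent_list(const std::vector<std::list<int>>& g, int u, int v) {
    const auto& nbrs = g[u];
    return std::find(nbrs.begin(), nbrs.end(), v) != nbrs.end();
}

// Adjacency set: O(log V) to find u in the map, then O(log degree) to find v in the set.
bool adjacent_set(const std::map<int, std::set<int>>& g, int u, int v) {
    auto it = g.find(u);
    return it != g.end() && it->second.count(v) > 0;
}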

Related

Deleting edges in a graph stored as adjacency list

I am trying to implement an algorithm for finding an Eulerian path in an undirected graph stored as an adjacency list. I need a fast (linear-time) way to remove an edge from the graph.
My initial idea was to use something like
vector<list<pair<Vertex, list<Vertex>::iterator>>> Graph
so when I delete the edge in one direction I have a fast way to delete it in the opposite direction, using the stored iterator to the place where it sits in the reverse direction. However, several sources claim that those iterators won't be valid anymore, because as I start deleting items the pointer structure will change and the iterators won't point to the right elements anymore.
My question is: is there a way to delete an edge in O(1) time using adjacency lists, or is there a way to mark the edge somehow, so that when I am at the adjacent vertex I know for sure that the edge in the opposite direction has already been traversed? Thanks in advance.
I need a fast (linear-time) way to remove an edge from the graph.
It's possible, but you have to change your graph representation because of the problems you have described.
Approach 1 -- guaranteed O(log E) complexity
Just use std::set instead of std::list:
std::vector<std::set<int>> Graph;
This allows you to traverse and process all adjacent nodes in the same manner:
// adj is your graph,
// v is current vertex
for (auto &w : adj[v]) {
    // process edge [v, w]
}
And you can remove the opposite edge in O(log E):
// remove [v,w] and [w,v]
adj[v].erase(w);
adj[w].erase(v);
Approach 2 -- average O(1), worst case O(E)
Constant time complexity is possible with std::unordered_set, but only on average:
std::vector<std::unordered_set<int>> Graph;
Traversing and erasing patterns stay the same, but personally I would prefer approach 1.
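Putting approach 1 together, a minimal self-contained sketch for an undirected graph on int vertices (the small example graph and the lambda names are just for illustration):
#include <cstdio>
#include <set>
#include <vector>

int main() {
    std::vector<std::set<int>> adj(4);                 // vertices 0..3

    auto add_edge    = [&](int v, int w) { adj[v].insert(w); adj[w].insert(v); };
    auto remove_edge = [&](int v, int w) { adj[v].erase(w);  adj[w].erase(v);  };  // O(log E)

    add_edge(0, 1); add_edge(1, 2); add_edge(2, 0); add_edge(2, 3);
    remove_edge(2, 0);                                 // both directions are gone

    for (int v = 0; v < (int)adj.size(); ++v)
        for (int w : adj[v])
            if (v < w) std::printf("edge [%d, %d]\n", v, w);
}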

What is the best data structure to represent a multigraph using STL in C++?

I'm looking for a data structure to store a multigraph in C++, but I want to make use of the STL as much as possible. Currently I am using something similar to a separate-chaining hash table: std::map<int, std::vector<int>>. The key of the map is the vertex and the value is a std::vector<int> that contains all of the vertices that form an edge with the key.
I'm mainly interested in O(1) lookups to see whether or not a vertex shares an edge with another vertex. Since this is an unweighted multigraph the vertex could share multiple edges with another vertex.
The graph is guaranteed to have an eulerian circuit and hamiltonian circuit, but I'm not sure if that is relevant or not.
Do you guys have any recommendations for a better data structure using the STL than std::map<int,std::vector<int>>?
If the number of vertices N is small, it's probably easiest to use an adjacency matrix unordered_map<int, unordered_map<int, int>> graph, where graph[u][v] is the number of edges from u to v.
If your vertex numbers all range from 0 to N−1, or 1 to N, or similar, you can simplify this to vector<vector<int>> graph, or even a plain 2-D array int graph[N][N] if N is known at compile time.
If N is large, and if the graph is sparse, as you indicated (since M ≈ N), it's probably best to use a set for each vertex: unordered_map<int, unordered_set<int>> graph or vector<unordered_set<int>> graph, where graph[u] is the set of all vertices v with an edge from u to v.
Notice that I'm using the unordered_map and unordered_set collections, which provide O(1) access on average. In the worst case, their performance is linear in the size of the map or set, as with any hash-based container.
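For the counting variant suggested above, a rough sketch might look like this (add_edge and edge_count are illustrative names, not part of any library):
#include <unordered_map>

using Multigraph = std::unordered_map<int, std::unordered_map<int, int>>;

// Undirected multigraph: graph[u][v] counts the parallel edges between u and v.
void add_edge(Multigraph& graph, int u, int v) {
    ++graph[u][v];
    ++graph[v][u];
}

// Average O(1): how many edges join u and v?
int edge_count(const Multigraph& graph, int u, int v) {
    auto it = graph.find(u);
    if (it == graph.end()) return 0;
    auto jt = it->second.find(v);
    return jt == it->second.end() ? 0 : jt->second;
}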
And if you don't want to deal with all this and want a ready-made solution, consider the Boost Graph Library, NGraph, LEMON, or Google OR-Tools.

Reconstruct all the edges in a triangle mesh

I have a triangle mesh which contains millions of triangles. Currently, only the triangles and the vertices are stored in my data structure. I want to reconstruct all the edges and store them in a data container. The idea is roughly this: traverse all the triangles, take each pair of its vertices, and create an edge between them. The problem is that a shared edge may be created twice. To overcome this, I need a data container EdgeContainer to store the edges, with a function to check whether an edge has already been created. So it is like a map with multiple keys, and according to my question this map should also have the following functions:
EdgeContainer(v1, v2) should return the same result as EdgeContainer(v2, v1), where v1 and v2 are the pointers to two vertices.
EdgeContainer should have a function like EdgeContainer::Remove(v1), which will remove all edges incident to vertex v1.
The implementation should be as efficient as possible.
Is there any existing library which can handle this?
First, I suggest you have a look at the concept of half-edge meshes (http://www.flipcode.com/archives/The_Half-Edge_Data_Structure.shtml). It is used in CGAL and also in OpenMesh, and you should be aware of the concept if you are going to use either of them.
I myself recommend OpenMesh (http://openmesh.org/Documentation/OpenMesh-2.0-Documentation/tutorial_01.html). It is free and open source, you can easily create a mesh from a set of vertices and indices, and after creating the mesh you can easily iterate over all edges.
Your easiest bet would be to use the CGAL library, which is basically designed for doing this.
http://doc.cgal.org/latest/Triangulation_2/index.html
It provides natural iterators for iterating over faces, edges and vertices.
Notice that in CGAL they do not actually store the edges explicitly; they are generated each time the structure is iterated. This can be done efficiently using some clever rules that stop you from counting things twice: looking at the code, it appears that each face is iterated once, and an edge is added for each neighbouring face of the current face that comes earlier in the list of faces than the current face.
Note that visiting the edges in this fashion only requires constant time per edge (depending on how you store your faces), so you are unlikely to benefit from storing them separately. Also note that here an edge is defined by two adjacent faces, rather than two adjacent vertices; you can convert between the two in constant time.
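The count-each-edge-once rule can be sketched without CGAL at all; this is only an illustration of the idea, not CGAL's actual interface, and the face-adjacency table neighbours is something you would have to build yourself:
#include <array>
#include <utility>
#include <vector>

// neighbours[f][i] is the face sitting across edge i of triangle f, or -1 on the boundary.
// Each interior edge is reported once: face f only pairs with a neighbour g that comes
// earlier in the face list (g < f).
std::vector<std::pair<int, int>> collect_edges(const std::vector<std::array<int, 3>>& neighbours) {
    std::vector<std::pair<int, int>> edges;            // an edge as the pair of faces sharing it
    for (int f = 0; f < (int)neighbours.size(); ++f)
        for (int g : neighbours[f])
            if (g != -1 && g < f)
                edges.push_back({g, f});
    return edges;
}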
A simple solution is to use a sorted pair of indices:
struct _edge_desc : public std::pair<int,int> {
    _edge_desc(int a, int b) : std::pair<int,int>(a < b ? a : b, a < b ? b : a) {}
};
std::set<_edge_desc> Edges;
If additional info about edges is needed, it can be stored in a separate vector; instead of using a set for storing edges, use a map that maps each edge to its index in that vector.
std::vector<some_struct> EdgesInfo;
std::map<_edge_desc, int> EdgesMap;
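For example, filling the set from a triangle list would look roughly like this (a sketch; the two hard-coded triangles are just for illustration, and plain int indices stand in for vertex pointers):
#include <array>
#include <set>
#include <utility>
#include <vector>

struct _edge_desc : public std::pair<int, int> {
    _edge_desc(int a, int b) : std::pair<int, int>(a < b ? a : b, a < b ? b : a) {}
};

int main() {
    // Two triangles sharing the edge {1, 2}.
    std::vector<std::array<int, 3>> triangles = {{0, 1, 2}, {1, 2, 3}};

    std::set<_edge_desc> Edges;
    for (const auto& t : triangles) {
        Edges.insert(_edge_desc(t[0], t[1]));
        Edges.insert(_edge_desc(t[1], t[2]));
        Edges.insert(_edge_desc(t[2], t[0]));
    }
    // Edges.size() is 5, not 6: the shared edge {1, 2} was inserted only once.
}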

Does time complexity of dijkstra's algorithm for shortest path depends on data structure used?

One way to store the graph is to implement nodes as structures, like
struct node {
    int vertex;
    node* next;
};
where vertex stores the vertex number and next contains a link to the next node in the list.
Another way I can think of is to implement it as vectors, like
vector<vector<pair<int,int>>> G;
Now, while applying Dijkstra's algorithm for shortest paths, we need to build a priority queue and other required data structures in case 1 just as in case 2 (the vector implementation).
Will there be any difference in complexity between the above two methods of representing the graph? Which one is preferable?
EDIT:
In the first case, every node is associated with a linked list of nodes that are directly reachable from the given node. In the second case:
G.size() is the number of vertices in our graph
G[i].size() is the number of vertices directly reachable from vertex with index i
G[i][j].first is the index of j-th vertex reachable from vertex i
G[i][j].second is the length of the edge heading from vertex i to vertex G[i][j].first
Both are adjacency list representations. If implemented correctly, that would be expected to result in the same time complexity. You'd get a different time complexity if you use an adjacency matrix representation.
In more detail - this comes down to the difference between an array (vector) and a linked-list. When all you're doing is iterating through the entire collection (i.e. the neighbours of a vertex), as you do in Dijkstra's algorithm, this takes linear time (O(n)) regardless of whether you're using an array or linked-list.
The resulting complexity for running Dijkstra's algorithm, as noted on Wikipedia, would be
O(|E| log |V|) with a binary heap in either case.
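For reference, a sketch of Dijkstra's algorithm on the vector representation with a std::priority_queue (assuming non-negative edge weights and vertices numbered 0..V-1):
#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// G[u] holds pairs {v, w}: an edge u -> v of length w, matching the representation above.
std::vector<long long> dijkstra(const std::vector<std::vector<std::pair<int, int>>>& G, int src) {
    const long long INF = std::numeric_limits<long long>::max();
    std::vector<long long> dist(G.size(), INF);
    std::priority_queue<std::pair<long long, int>,
                        std::vector<std::pair<long long, int>>,
                        std::greater<>> pq;            // min-heap of {distance, vertex}
    dist[src] = 0;
    pq.push({0, src});
    while (!pq.empty()) {
        auto [d, u] = pq.top();
        pq.pop();
        if (d > dist[u]) continue;                     // stale queue entry, skip it
        for (auto [v, w] : G[u])
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
    }
    return dist;
}
The lazy-deletion check (skipping stale queue entries) keeps this at the O(|E| log |V|) bound mentioned above.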

Finding edge in weighted graph

I have a graph with four nodes; each node represents a position, and they are laid out like a two-dimensional grid. Every node has a connection (an edge) to each positionally adjacent node. Every edge also has a weight.
Here are the nodes represented by A,B,C,D and the weight of the edges is indicated by the numbers:
A 100 B
120 220
C 150 D
I want to structure a container and an algorithm that switches the nodes sharing the edge with the highest weight. Then reset the weight of that edge. No node (position) can be switched more than once each time the algorithm is executed.
For example, processing the above, the highest weight is on edge BD, so we switch those. Since no node can be switched more than once, all edges involving either B or D are reset.
A D
120
C B
Then, the next highest weight is on the only edge left, switching those would give us the final layout: C,D,A,B.
I'm currently running a quite awful implementation of this. I store a long list of edges, holding four values for the nodes they are (potentially) connected to, a value for the weight, and the position of the node itself. Every time anything is requested, I loop through the entire list.
I'm writing this in C++; could some parts of the STL help speed this up? Also, how do I avoid the duplication of data? A node position is currently stored in five objects: the node itself and the four nodes indicating a connection to it.
In short, I want help with:
Can this be structured in a way so that there is no data duplication?
Recognise the problem? If any of this has a name, tell me so I can google for more info on the subject.
Fast algorithms are always nice.
As for names, this is a vertex cover problem. Optimal vertex cover is NP-hard with decent approximation solutions, but your problem is simpler. You're looking at a pseudo-maximum under a tighter edge selection criterion. Specifically, once an edge is selected every connected edge is removed (representing the removal of vertices to be swapped).
For example, here's a standard greedy approach:
0) sort the edges; retain adjacency information
while edges remain:
    1) select the highest-weight remaining edge
    2) remove all adjacent edges from the list
endwhile
The list of edges selected gives you the vertices to swap.
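In C++ that greedy loop might be sketched as follows (the Edge struct and function name are just for illustration):
#include <algorithm>
#include <vector>

struct Edge { int u, v, weight; };

// Greedy selection: take the heaviest remaining edge, then discard every edge
// sharing a vertex with it (those vertices have already been swapped).
std::vector<Edge> select_swaps(std::vector<Edge> edges, int num_vertices) {
    std::sort(edges.begin(), edges.end(),
              [](const Edge& a, const Edge& b) { return a.weight > b.weight; });
    std::vector<bool> used(num_vertices, false);       // true once a vertex has been swapped
    std::vector<Edge> selected;
    for (const Edge& e : edges)
        if (!used[e.u] && !used[e.v]) {                // both endpoints still free
            selected.push_back(e);
            used[e.u] = used[e.v] = true;              // implicitly removes adjacent edges
        }
    return selected;
}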
Time complexity is O(sorting the edges + a linear pass over the edges), which in general will boil down to O(sorting the edges), which will likely be O(E·log(E)).
The method of retaining adjacency information depends on the graph properties; see your friendly local algorithms text. Feel free to start with an adjacency matrix for simplicity.
As with the adjacency information, most other speed improvements will apply best to graphs of a certain shape but come with a tradeoff of time versus space complexity.
For example, your problem statement seems to imply that the vertices are laid out in a square pattern, from which we could derive many interesting properties. For example, that system is very easily parallelized. Also, the adjacency information would be highly regular but sparse at large graph sizes (most vertices wouldn't be connected to each other). This gives the adjacency matrix a high overhead; you could instead store adjacency in an array of 4-tuples, which retains fast access but almost entirely eliminates the overhead, as sketched below.
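For a square grid, such a 4-tuple table could be built roughly like this (a sketch; -1 marks a missing neighbour at the border):
#include <array>
#include <vector>

// One 4-tuple per vertex: the up/down/left/right neighbour, or -1 at the border.
std::vector<std::array<int, 4>> grid_adjacency(int rows, int cols) {
    std::vector<std::array<int, 4>> adj(rows * cols);
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c) {
            int v = r * cols + c;
            adj[v][0] = (r > 0)        ? v - cols : -1;   // up
            adj[v][1] = (r < rows - 1) ? v + cols : -1;   // down
            adj[v][2] = (c > 0)        ? v - 1    : -1;   // left
            adj[v][3] = (c < cols - 1) ? v + 1    : -1;   // right
        }
    return adj;
}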
If you have bigger graphs, look into the Boost Graph Library. It gives you good data structures for graphs and basic iterators for different types of graph traversal.