Nested lists to implement adjacency lists in C++

I was going through the code for topological sorting on this website.
I understood the code except for one part, the declaration of the adjacency list (on line 15), which is
list<int> *adj;
Basically, to me, adj should be a pointer to a list of integers, but based on how they have used it, it is a list of lists ... so shouldn't a list of lists be
list <list<int> > adj;
Can someone please explain this to me?

You could also do it like that, however what this website is creating is an array of lists, not exactly a list of lists. I think this approach (array of lists) is the usual way to represent adjacency lists in lots of programming languages.
This way the vertices are numbered from 0 to V-1, and you can access a vertex's list of adjacent vertices directly with the index operator, adj[i].
I don't really know the exact reason, but I imagine it's for efficiency purposes.
EDIT:
Notice that lists are, according to the C++ reference, doubly linked lists, so to access element i you have to iterate through the linked nodes until you reach it. With an array, you access the list you are interested in directly, without iterating, and therefore more efficiently.
In www.cplusplus.com we can read:
The main drawback of lists and forward_lists compared to these other sequence containers is that they lack direct access to the elements by their position; For example, to access the sixth element in a list, one has to iterate from a known position (like the beginning or the end) to that position, which takes linear time in the distance between these. They also consume some extra memory to keep the linking information associated to each element (which may be an important factor for large lists of small-sized elements).
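The difference between the two layouts can be sketched as follows. This is only an illustration: the function names neighbours_indexed and neighbours_walked are made up for this example, and std::vector stands in for the raw array used on the website.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <vector>

// Array-style container of lists: adj[v] is reached in constant time.
std::list<int>& neighbours_indexed(std::vector<std::list<int>>& adj, std::size_t v) {
    assert(v < adj.size());
    return adj[v];
}

// List of lists: reaching the v-th inner list means walking v links from the front.
std::list<int>& neighbours_walked(std::list<std::list<int>>& adj, std::size_t v) {
    assert(v < adj.size());
    return *std::next(adj.begin(), static_cast<std::ptrdiff_t>(v));
}
```

Both functions return the same kind of object, but the second one costs time proportional to v on every call.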

It's important to note that the code you've linked to is rather unidiomatic. One prominent issue it has is exactly with that member you mentioned, list<int> *adj.
In modern C++, we tend to avoid using new and delete directly, when instead, we could use a smart pointer (e.g. std::unique_ptr) or a container. In this specific case, instead of:
list<int> *adj;
// ... etc. ...
Graph(int V) {
adj = new std::list<int>[V];
}
it would indeed be better to use:
std::vector<std::list<int>> vertex_adjacencies;
// ... etc. ...
Graph(std::size_t num_vertices) : vertex_adjacencies(num_vertices) { }
Now, as for your suggestion of a list-of-lists - that's also possible:
std::list<std::list<int>> vertex_adjacencies;
// ... etc. ...
Graph(std::size_t num_vertices) : vertex_adjacencies()
{
auto empty_adjacencies = std::list<int>{};
std::fill_n(
std::front_inserter(vertex_adjacencies),
num_vertices,
empty_adjacencies);
}
but it would require rewriting various other methods. Also note that the graph is intended to have a fixed number of vertices, with no vertices added or removed, so placing the per-vertex adjacencies in a list does not make a lot of sense. (Not that a separate std::list for each vertex's adjacencies is such a good idea performance-wise anyway, but never mind that.)
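For completeness, here is a minimal sketch of what a container-based graph with a topological sort could look like. It is not the code from the linked website: it stores each vertex's adjacencies in a std::vector rather than a std::list, and it uses Kahn's algorithm (repeatedly removing vertices with in-degree zero). The class and method names are illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

class Graph {
public:
    explicit Graph(std::size_t num_vertices) : adjacency_(num_vertices) {}

    // Adds a directed edge from -> to.
    void add_edge(std::size_t from, std::size_t to) {
        assert(from < adjacency_.size() && to < adjacency_.size());
        adjacency_[from].push_back(to);
    }

    // Kahn's algorithm. The result has fewer entries than there are
    // vertices if the graph contains a cycle.
    std::vector<std::size_t> topological_order() const {
        std::vector<std::size_t> in_degree(adjacency_.size(), 0);
        for (const auto& neighbours : adjacency_)
            for (std::size_t v : neighbours) ++in_degree[v];

        std::queue<std::size_t> ready;  // vertices with no unprocessed predecessors
        for (std::size_t v = 0; v < in_degree.size(); ++v)
            if (in_degree[v] == 0) ready.push(v);

        std::vector<std::size_t> order;
        while (!ready.empty()) {
            const std::size_t u = ready.front();
            ready.pop();
            order.push_back(u);
            for (std::size_t v : adjacency_[u])
                if (--in_degree[v] == 0) ready.push(v);
        }
        return order;
    }

private:
    std::vector<std::vector<std::size_t>> adjacency_;  // adjacency_[v] = successors of v
};
```

No new or delete appears anywhere, and the compiler-generated copy, move and destructor do the right thing.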

Related

Is this a bad way to implement a graph?

In almost every book I look at, graphs are implemented only with the adjacency matrix method or the adjacency list method.
I'm trying to create a node-based editor, in which the number of edges leaving each node is small but there are a lot of vertices.
So I'm trying to implement the adjacency list method rather than the adjacency matrix method.
However, adjacency lists store and use each edge as a connection list.
But I would like to use nodes in the form shown below.
class GraphNode
{
int x, y;
dataType data;
vector<GraphNode*> in;
vector<GraphNode*> out;
public:
GraphNode(/* constructor arguments */);
};
So like this, I want to make the node act as a vertex and have access to other nodes that are connected.
Because when I create a node-based editor program, I have to access and process different nodes that are connected to each node.
I want to implement this without using a linked list.
And, I'm going to use graph algorithms in this state.
Is this a bad method?
Lastly, I apologize for my poor English.
Thank you for reading.
You're just missing the point of the difference between adjacency list and adjacency matrix. The main point is the complexity of operations, like finding edges or iterating over them. If you compare a std::list and a std::vector as datatype implementing an adjacency list, both have a complexity of O(n) (n being the number of edges) for these operations, so they are equivalent.
Other considerations:
If you're modifying the graph, insertion and deletion may be relevant as well. In that case, you may prefer a linked list.
I said that the two are equivalent, but generally std::vector has a better locality of reference and less memory overhead, so it performs better. The general rule in C++ is to use std::vector for any sequential container, until profiling has shown that it is a bottleneck.
Short answer: It is probably a reasonable way for implementing a graph.
Long answer: What graph data structure to use always depends on what you want to use it for. An adjacency matrix is good for very dense graphs, where it will not waste space on many 0 entries, and when we want to answer the question "Is there an edge between A and B?" fast. Iterating over all neighbors of a node can take pretty long, since it has to look at a whole row and not just the neighbors.
An adjacency list is good for sparse graphs and when we mostly want to look up all neighbors of a node (which is very often the case for graph traversal algorithms). In a directed graph where we want to treat ingoing and outgoing edges separately, it is probably a good idea to have a separate adjacency list for ingoing and outgoing edges (as your code does).
Regarding what container to use for the list, it depends on the use case. If you will iterate over the graph much more often than you delete something from it, using a vector over a list is a very good idea (basically all graph programs I ever wrote were of this type). If you have a graph that changes very often, you have to delete edges very often, you don't want iterator invalidation and so on, then maybe a list is better. But that is very seldom the case.
A good design would be to make it very easy to change between list and vector so you can easily profile both and then use what is better for your program.
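One possible way to keep the container swappable is to make it a template parameter. The following is a sketch under that assumption; BasicGraph, VectorGraph and ListGraph are names invented for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <vector>

// EdgeContainer is the only thing that differs between the two variants.
template <typename EdgeContainer>
class BasicGraph {
public:
    explicit BasicGraph(std::size_t num_vertices) : adjacency_(num_vertices) {}

    void add_edge(std::size_t from, std::size_t to) {
        assert(from < adjacency_.size() && to < adjacency_.size());
        adjacency_[from].push_back(to);
    }

    std::size_t out_degree(std::size_t v) const {
        assert(v < adjacency_.size());
        return adjacency_[v].size();
    }

    const EdgeContainer& neighbours(std::size_t v) const {
        assert(v < adjacency_.size());
        return adjacency_[v];
    }

private:
    std::vector<EdgeContainer> adjacency_;
};

// Switching the representation for profiling is a one-line change.
using VectorGraph = BasicGraph<std::vector<std::size_t>>;
using ListGraph = BasicGraph<std::list<std::size_t>>;
```

Algorithms written against neighbours() with a range-based for loop work unchanged with either alias.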
Btw if you often delete one edge, this is also pretty easily done fast with a vector, if you do not care about the order of your edges in adjacency list (so do not do this without thinking while iterating over the vector):
void delete_in_edge(size_t i) {
std::swap(in[i], in.back()); // The element to be deleted is now at the last position,
// the formerly last element is at position i
in.pop_back(); // Delete the current last element
}
This has O(1) complexity (and the swap is probably pretty fast).
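Putting the pieces together, a node with separate in and out vectors and swap-and-pop removal might look like the sketch below. The int payload and the helper names connect, disconnect and swap_and_pop are assumptions for illustration; note that finding the edge is still linear in the node's degree, only the removal itself is O(1).

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

struct GraphNode {
    int data = 0;                 // placeholder payload (assumption)
    std::vector<GraphNode*> in;   // nodes with an edge into this node
    std::vector<GraphNode*> out;  // nodes this node has an edge to
};

// Removes one occurrence of target from edges. The search is O(degree),
// the removal is O(1), and the order of the remaining edges is not preserved.
void swap_and_pop(std::vector<GraphNode*>& edges, GraphNode* target) {
    auto it = std::find(edges.begin(), edges.end(), target);
    assert(it != edges.end() && "edge must exist");
    std::swap(*it, edges.back());
    edges.pop_back();
}

// Adds the directed edge from -> to, updating both endpoints.
void connect(GraphNode& from, GraphNode& to) {
    from.out.push_back(&to);
    to.in.push_back(&from);
}

// Removes the directed edge from -> to, updating both endpoints.
void disconnect(GraphNode& from, GraphNode& to) {
    swap_and_pop(from.out, &to);
    swap_and_pop(to.in, &from);
}
```

Because the nodes hold raw pointers to each other, they must stay at stable addresses (for example, owned by a container that never relocates them) for as long as edges refer to them.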

Efficient data structure for finding constrained shortest path in a graph

The constrained shortest path problem in a graph G=(V,E) is, given a source node s and a sink node t, to find the shortest path from s to t such that the total resource consumed on the path is at most a scalar R. Each arc (i,j) in the graph has a scalar cost c_{ij} and consumes a scalar amount r_{ij} of the resource. The cost of a path is the sum of the costs of the arcs constituting the path, and the resource consumed by a path is the sum of the resources of those arcs. This problem is known to be NP-hard.
Most implementations to solve this problem use a dynamic programming approach which essentially does a sort of brute force enumeration along with other clever fathoming approaches to reduce the amount of searching done.
Dynamic programming is implemented using a labelling approach.
I have implemented this algorithm using a couple of different approaches and I want to ensure if I am doing it as efficiently as possible.
The labelling approach creates multiple labels, which are essentially partial paths from s to various other nodes. A large number of labels are created during the algorithm (note, the problem is NP-hard) until a stopping criterion is met.
Each label can be represented as a struct as follows.
struct labels_s {
double current_states[10];
double unscanned_states[10];
int already_visited[100];//If node i is already visited on partial path, already_visited[i] = 1, else 0
int do_not_visit[100];//if node i is not to be visited from this label, do_not_visit[i] = 1; 0 otherwise
struct labels_s* prev;
struct labels_s* next;
};
As the algorithm proceeds, many of the above structs need to be created and stored.
Method 1:
A very early implementation I had of this was very computationally inefficient. This involved newing structs as and when new labels were required and explicitly maintaining these in a linked list using the members next and prev of the structs.
Method 2:
Instead of newing structs, I started storing the new structs in a std::vector container:
vector <labels_s> labels;
To do this, and since vector gives integer index access to various labels, prev and next of struct labels_s could be changed to int prev; and int next;
Storing a label involved the following:
struct labels_s newlabel;//Step 1
//populate newlabel's members//Step 2
labels.push_back(newlabel);//Step 3
Computational times on the same problem using Method 2 are significantly better than Method 1. Labels are only added at the end of the vector. There is no need to insert in the middle of the vector or delete from the vector.
Is there any other way of managing these labels apart from Method 2?
My concern primarily is with Step 3 of Method 2. Since push_back() creates a copy of newlabel, is this copy operation costly and can this be avoided?
One alternative I was considering was to maintain a vector of pointers to label structs instead of vector of label structs as I do currently. But it appears to me that maintaining a vector of pointers to label structs should be no more efficient than Method 1.
Any input is appreciated.
In C++11 you can use emplace_back (cppreference) to construct a label in place at the end of the vector. You could do:
labels.emplace_back(); // default construct a new label at the end of labels
// then populate members like this:
labels.back().member1 = val1;
Depending on your use case, you could also create a constructor for labels_s that takes all the values of the members and initializes them. In this case you could write
labels.emplace_back(val1, val2, …);
and be done.
Apart from this, you should reserve (cppreference) generously before populating labels to avoid frequent reallocations.
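As a rough sketch of both suggestions combined: the Label struct below is a deliberately simplified stand-in for labels_s (the field names and the LabelStore wrapper are assumptions, not the asker's code), with a constructor so that emplace_back can build the label directly inside the vector, and with reserve called up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for labels_s; the fields are illustrative assumptions.
struct Label {
    double cost;
    double resource;
    int node;
    int prev;  // index of the parent label in the vector, -1 for the root

    Label(double cost_, double resource_, int node_, int prev_)
        : cost(cost_), resource(resource_), node(node_), prev(prev_) {}
};

class LabelStore {
public:
    // Reserving up front avoids repeated reallocation while labels are added.
    explicit LabelStore(std::size_t expected_labels) { labels_.reserve(expected_labels); }

    // Constructs the label in place at the end of the vector and returns its index.
    int add(double cost, double resource, int node, int prev) {
        assert(prev < static_cast<int>(labels_.size()));
        labels_.emplace_back(cost, resource, node, prev);
        return static_cast<int>(labels_.size()) - 1;
    }

    const Label& at(int index) const {
        assert(index >= 0 && static_cast<std::size_t>(index) < labels_.size());
        return labels_[static_cast<std::size_t>(index)];
    }

    std::size_t size() const { return labels_.size(); }

private:
    std::vector<Label> labels_;
};
```

Storing prev as an index rather than a pointer keeps the links valid even if the vector does have to reallocate.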

Difference between ways of sorting linked lists c++

I was thinking about ways of sorting a linked list and I came up with two different ways (using BubbleSort, because I'm relatively new at programming and it is the simplest algorithm for me). Example struct:
struct node {
int value;
node *next;
};
The two different methods:
1) Rearranging the list elements
2) Doing something like swap(root->value, root->next->value)
I did some Google searches on the subject, and from the looks of it, the first method seems to be more popular. From my experience, such as it is, rearranging the list is more complicated than simply swapping the actual node values. Is there any benefit in rearranging the whole list, and if yes, what is it?
I can think of two advantages:
1) Other pointers might exist, pointing to nodes in this list. If you rearrange the list, these pointers will still point to the same values they pointed to before the sorting; if you swap values, they won't. (Which one of these two is better depends on the details of your design, but there are designs in which it is better if they remain pointing to the same values.)
2) It doesn't matter much for a list of mere ints, but eventually you might be sorting a list of more complex things, so that swapping values is very expensive or even impossible.
As answered by Beta, it's better to rearrange the nodes (via the next pointers) than it is to swap node data.
If actually using a bubble sort or any sort that "swaps" nodes via the pointers, first swap the next (or head) pointers that point to the two nodes to be swapped, then swap those two nodes' own next pointers. This handles both the adjacent-node case, where 3 pointers are rotated, and the normal case, where 2 pairs of pointers are swapped.
Another simple option is to create a new empty list (node * pNew = NULL;) for the sorted list. Remove one node at a time from the original list and insert that node into the sorted list in order, or scan the original list for the largest node, remove that node, and prepend it to the sorted list.
If the list is large and speed is important, then bottom-up merge sorts are much faster.
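The "new empty list" option can be sketched as an insertion sort that only relinks nodes. This is one possible implementation, not the only way to do it; the function name insertion_sort is an assumption, and the node struct is the one from the question.

```cpp
#include <cassert>

struct node {
    int value;
    node* next;
};

// Sorts the list in ascending order by relinking nodes.
// No values are copied or swapped, so outside pointers to a node
// still see the same value afterwards.
node* insertion_sort(node* head) {
    node* sorted = nullptr;  // the new, initially empty, sorted list
    while (head != nullptr) {
        node* current = head;  // detach the first node of the input list
        head = head->next;

        // Find the link (head pointer or some next pointer) that
        // should end up pointing at current.
        node** link = &sorted;
        while (*link != nullptr && (*link)->value <= current->value)
            link = &(*link)->next;

        current->next = *link;  // splice current in at that position
        *link = current;
    }
    return sorted;
}
```

Using a pointer to the link avoids a special case for inserting at the head. The running time is quadratic in the worst case, like bubble sort.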

Hashmap to implement adjacency lists

I've implemented an adjacency list using the vector of vectors approach, where the nth element of the vector of vectors is the friend list of node n.
I was wondering if a hash map data structure would be more useful. I still have hesitations because I simply cannot identify the difference between them. For example, if I would like to check and perform an operation on the nth element's neighbors (search, delete), how could it be more efficient than the vector of vectors approach?
A vector<vector<ID>> is a good approach if the set of nodes is fixed. If however you suddenly decide to remove a node, you'll be annoyed. You cannot shrink the vector because it would displace the elements stored after the node and you would lose the references. On the other hand, if you keep a list of free (reusable) IDs on the side, you can just "nullify" the slot and then reuse later. Very efficient.
An unordered_map<ID, vector<ID>> allows you to delete nodes much more easily. You can go ahead and assign new IDs to newly created nodes without leaving empty slots behind. It is not as compact, especially on collisions, but not so bad either. With older compilers there can be some slowdowns on rehashing when a vector needs to be moved.
Finally, an unordered_multimap<ID, ID> is probably one of the easiest to manage. It also scatters memory to the wind, but hey :)
Personally, I would start prototyping with an unordered_multimap<ID, ID> and switch to another representation only if it proves too slow for my needs.
Note: if the adjacency relationship is symmetric, you can cut the number of stored entries in half by establishing that the relation (x, y) is stored for min(x, y) only.
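To illustrate the unordered_map<ID, vector<ID>> variant, here is a small sketch of an undirected graph in which a node can be removed together with all edges touching it. The ID alias, the SparseGraph class and its method names are assumptions made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

using ID = int;  // hypothetical node identifier type

class SparseGraph {
public:
    // Creates an empty neighbour list if the node is not present yet.
    void add_node(ID id) { adjacency_[id]; }

    // Adds an undirected edge, creating missing endpoints on the fly.
    void add_edge(ID a, ID b) {
        adjacency_[a].push_back(b);
        adjacency_[b].push_back(a);
    }

    // Removes the node and every edge that touches it.
    void remove_node(ID id) {
        auto it = adjacency_.find(id);
        if (it == adjacency_.end()) return;
        for (ID neighbour : it->second) {
            if (neighbour == id) continue;  // self loop: the list is erased below anyway
            auto& list = adjacency_.at(neighbour);  // at() never inserts, so `it` stays valid
            list.erase(std::remove(list.begin(), list.end(), id), list.end());
        }
        adjacency_.erase(it);
    }

    bool has_node(ID id) const { return adjacency_.count(id) != 0; }

    std::size_t degree(ID id) const {
        assert(has_node(id));
        return adjacency_.at(id).size();
    }

private:
    std::unordered_map<ID, std::vector<ID>> adjacency_;
};
```

Unlike the vector<vector<ID>> layout, no slot is left behind after remove_node, and IDs do not have to be dense.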
Vector of vectors
A vector of vectors is a good solution when you don't need to delete edges.
You can add an edge in O(1) and iterate over a node's neighbours in O(N), where N is the number of neighbours.
You can delete an edge by finding it in vector[node] and erasing it, but that is slow: the complexity is O(number of vertices) in the worst case.
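Since vector::erase takes an iterator rather than a value, the deletion needs a search first. A minimal sketch, with the AdjacencyList alias and the remove_edge name chosen for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

using AdjacencyList = std::vector<std::vector<int>>;

// Removes the edge from -> to if it is present and reports whether
// anything was removed. Both the search and the erase are linear in
// the number of neighbours of `from`.
bool remove_edge(AdjacencyList& adj, std::size_t from, int to) {
    assert(from < adj.size());
    auto& neighbours = adj[from];
    auto it = std::find(neighbours.begin(), neighbours.end(), to);
    if (it == neighbours.end()) return false;
    neighbours.erase(it);  // shifts the tail down, keeps the remaining order
    return true;
}
```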
Hash map
I am not sure how you want to use hash map. If inserting edge means setting hash_map[edge] = 1 then notice that you are unable to iterate over node's neighbours.
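If a hash-based layout is wanted while keeping neighbour iteration, one option is to hash on the node and keep a set of neighbours per node. This is only a sketch of that idea, not something from the question; the HashGraph class and its methods are assumptions.

```cpp
#include <cassert>
#include <unordered_map>
#include <unordered_set>

class HashGraph {
public:
    // Adds the directed edge from -> to; also makes sure `to` exists as a node.
    void add_edge(int from, int to) {
        adjacency_[from].insert(to);
        adjacency_[to];
    }

    // Average O(1) lookup.
    bool has_edge(int from, int to) const {
        auto it = adjacency_.find(from);
        return it != adjacency_.end() && it->second.count(to) != 0;
    }

    // Average O(1) removal; returns whether the edge existed.
    bool remove_edge(int from, int to) {
        auto it = adjacency_.find(from);
        return it != adjacency_.end() && it->second.erase(to) != 0;
    }

    // The neighbours of a node can still be iterated, in unspecified order.
    const std::unordered_set<int>& neighbours(int node) const {
        assert(adjacency_.count(node) != 0);
        return adjacency_.at(node);
    }

private:
    std::unordered_map<int, std::unordered_set<int>> adjacency_;
};
```

The price is more memory per edge and worse locality than a vector of vectors, so it mainly pays off when edges are searched for or deleted often.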

Implementing list position locator in C++?

I am writing a basic Graph API in C++ (I know libraries already exist, but I am doing it for the practice/experience). The structure is basically that of an adjacency list representation. So there are Vertex objects and Edge objects, and the Graph class contains:
list<Vertex *> vertexList
list<Edge *> edgeList
Each Edge object has two Vertex* members representing its endpoints, and each Vertex object has a list of Edge* members representing the edges incident to the Vertex.
All this is quite standard, but here is my problem. I want to be able to implement deletion of Edges and Vertices in constant time, so for example each Vertex object should have a Locator member that points to the position of its Vertex* in the vertexList. The way I first implemented this was by saving a list::iterator, as follows:
vertexList.push_back(v);
v->locator = --vertexList.end();
Then if I need to delete this vertex later, then rather than searching the whole vertexList for its pointer, I can call:
vertexList.erase(v->locator);
This works fine at first, but it seems that if enough changes (deletions) are made to the list, the iterators will become out-of-date and I get all sorts of iterator errors at runtime. This seems strange for a linked list, because it doesn't seem like you should ever need to re-allocate the remaining members of the list after deletions, but maybe the STL does this to optimize by keeping memory somewhat contiguous?
In any case, I would appreciate it if anyone has any insight as to why this happens. Is there a standard way in C++ to implement a locator that will keep track of an element's position in a list without becoming obsolete?
Much thanks,
Jeff
(I am assuming you are single-threaded, obviously list isn't thread-safe)
"but maybe the STL does this to optimize by keeping memory somewhat contiguous?"
Incorrect - list::insert, list::push_front and list::push_back do not affect the validity of list::iterators, and list::erase invalidates only iterators to the erased elements. If you only erase an element through its own stored iterator, and only once, the other iterators remain valid.
"In any case, I would appreciate it if anyone has any insight as to why this happens. Is there a standard way in C++ to implement a locator that will keep track of an element's position in a list without becoming obsolete?"
Your method should work, please post some code demonstrating it not working. In the meantime here are two alternative representations:
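For reference, here is a minimal sketch of the locator idea that relies only on the std::list iterator guarantees. The VertexRegistry class is a made-up stand-in for the asker's Graph. Typical ways to break this pattern are erasing the same locator twice or copying the list (the stored iterators then still point into the original), which is why the sketch marks removed vertices and forbids copying.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>

struct Vertex {
    int id = 0;
    std::list<Vertex*>::iterator locator;  // position of this vertex in the registry's list
};

class VertexRegistry {
public:
    VertexRegistry() = default;
    VertexRegistry(const VertexRegistry&) = delete;  // a copy would leave locators pointing at the original
    VertexRegistry& operator=(const VertexRegistry&) = delete;

    void add_vertex(Vertex* v) {
        vertices_.push_back(v);
        v->locator = std::prev(vertices_.end());
    }

    // O(1): erase invalidates only the iterator to the erased element.
    void remove_vertex(Vertex* v) {
        assert(v->locator != vertices_.end() && "vertex is not in the list");
        vertices_.erase(v->locator);
        v->locator = vertices_.end();  // mark as removed to catch double erasure
    }

    std::size_t size() const { return vertices_.size(); }

private:
    std::list<Vertex*> vertices_;
};
```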
Why not use:
struct Graph
{
typedef unique_ptr<Vertex> PVertex;
typedef unique_ptr<Edge> PEdge;
unordered_set<PVertex> verticies;
unordered_set<PEdge> edges;
};
That way you can delete them in constant time like you wish. unordered_set is generally implemented as a hash table, so lookup, insertion and erasure take O(1) time on average.
Also, unique_ptr means that the unordered_sets can "own" the objects.
If vertices are countable and have a fixed maximum upper limit (N), another representation would be:
struct Graph
{
typedef unique_ptr<Vertex> PVertex;
typedef unique_ptr<Edge> PEdge;
array<PVertex, N> verticies;
array<array<PEdge, N>, N> edges;
};
Where edges[i][j] holds the edge between verticies[i] and verticies[j]
If verticies[x] or edges[x][y] is nullptr, it means the corresponding vertex or edge does not exist.
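A small sketch of how the fixed-size representation could be used follows. The DenseGraph name, the Vertex and Edge members, the value of N and the helper methods are all assumptions for illustration; the member spelling verticies follows the snippet above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <memory>

struct Vertex { int id; };
struct Edge { double weight; };

constexpr std::size_t N = 4;  // hypothetical fixed upper bound on the number of vertices

struct DenseGraph {
    std::array<std::unique_ptr<Vertex>, N> verticies;
    std::array<std::array<std::unique_ptr<Edge>, N>, N> edges;

    void add_vertex(std::size_t i) {
        assert(i < N);
        verticies[i].reset(new Vertex{static_cast<int>(i)});
    }

    void add_edge(std::size_t i, std::size_t j, double weight) {
        assert(i < N && j < N && verticies[i] && verticies[j]);
        edges[i][j].reset(new Edge{weight});
    }

    // O(1): the slot simply becomes nullptr again.
    void remove_edge(std::size_t i, std::size_t j) {
        assert(i < N && j < N);
        edges[i][j].reset();
    }

    // O(N): clears the vertex's row and column, then the vertex itself.
    void remove_vertex(std::size_t i) {
        assert(i < N);
        for (std::size_t k = 0; k < N; ++k) {
            edges[i][k].reset();
            edges[k][i].reset();
        }
        verticies[i].reset();
    }
};
```

Edge removal is constant time here, but the storage grows with N squared, so this only makes sense for small or dense graphs.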
Old C++ Versions:
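unordered_set was introduced in TR1. If you don't have this, you can use Boost. If you don't want to use Boost, you can use a plain old set, which gives O(log n) access time, or you can implement your own hash table.
unique_ptr can be replaced with auto_ptr for older versions, but note that auto_ptr is not safe to store in standard containers, so for the set-based version a reference-counted pointer such as boost::shared_ptr, or plain pointers with manual deletion, is the safer substitute.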
array can be replaced with a regular array or with a vector.