Complexity in Dijkstra's algorithm - heap

So I've been attempting to analyze a specialized variant of Dijkstra's algorithm that I've been working on. I'm after the worst-case complexity.
The algorithm uses a Fibonacci heap, which in the case of normal Dijkstra would run in O(E + V log V).
However, this implementation needs to do a lookup in the inner loop where we update neighbours. This lookup executes once for every edge and takes logarithmic time, since it is done in a data structure that contains all edges. Also, the graph has the restriction that no node has more than 4 neighbours.
O(V log V) is the complexity of the outer loop, but I'm not sure what the worst case is for the inner loop. I'm thinking that since the inner loop runs O(E) times in total and each lookup takes logarithmic time, it should be O(E log E), which should exceed O(V log V) and make the overall complexity O(E log E).
Any insight would be awesome!

The amortized complexity of Decrease-Key on a Fibonacci heap is O(1), so if you perform |E| such operations on the heap, their total cost is O(E). You also have |V| Extract-Min operations, which cost O(log V) each, so the total cost is O(E + V log V). Your extra lookup adds O(log E) per edge, i.e. O(E log E) in total; but since no node has more than 4 neighbours, E = O(V) and log E = O(log V), so the edge term is O(V log V) and the overall complexity stays O(V log V).
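In practice Fibonacci heaps are rarely implemented by hand; a common alternative is a binary heap with lazy deletion (push a fresh entry instead of decreasing a key), which gives O(E log V), and with your degree bound that is still O(V log V). Here is a minimal C++ sketch of that approach - it is not a Fibonacci heap and not your actual code; the graph representation and names are my own assumptions:

#include <functional>
#include <limits>
#include <queue>
#include <utility>
#include <vector>

// Dijkstra with std::priority_queue and "lazy deletion": instead of a
// decrease-key, we push a fresh (distance, vertex) pair and skip stale
// entries when they are popped. Runs in O(E log V); with at most 4
// neighbours per node, E = O(V), so this is O(V log V) overall.
std::vector<long long> dijkstra(
    const std::vector<std::vector<std::pair<int, long long>>>& adj,  // adj[u] = {v, weight}
    int source) {
  const long long INF = std::numeric_limits<long long>::max();
  std::vector<long long> dist(adj.size(), INF);
  using Entry = std::pair<long long, int>;  // (tentative distance, vertex)
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;

  dist[source] = 0;
  pq.push({0, source});
  while (!pq.empty()) {
    auto [d, u] = pq.top();
    pq.pop();
    if (d != dist[u]) continue;  // stale entry: u was already relaxed further
    for (auto [v, w] : adj[u]) {
      if (dist[u] + w < dist[v]) {  // relax edge (u, v)
        dist[v] = dist[u] + w;
        pq.push({dist[v], v});
      }
    }
  }
  return dist;
}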

Related

Wrong Complexity Calculation for Hash Tables?

I was reading: https://www.geeksforgeeks.org/given-an-array-a-and-a-number-x-check-for-pair-in-a-with-sum-as-x/
I think there is a mistake in the calculation of the time complexity for Method 2 (Hashing), where they claim it's O(n) and I insist it's O(n) amortized.
Algorithm:
1 Initialize an empty hash table s.
2 Do the following for each element A[i] in A[]:
  2.1 If s[x - A[i]] is set, then print the pair (A[i], x - A[i]).
  2.2 Insert A[i] into s.
Step 1 is done in O(1). Step 2 does n iterations, where each iteration does O(1) amortized work (2.1 and 2.2), so in total we have O(n) amortized.
When an O(1) amortized step is performed n times, the conclusion is actually stronger than "O(n) amortized". The fact that a step is O(1) amortized means its average cost over the n executions is at most some constant c, and an average of at most c implies the total cost of those n steps is at most cn, so the cost of the n steps is O(n), not just O(n) amortized.
By the definition of amortized cost with the aggregate method, the fact that an operation is T(n)/n amortized means there is some upper bound T(n) on the cost of performing n operations. So, if an operation is O(1) amortized, meaning there is some c such that the average cost is at most c, we have T(n)/n ≤ c, and therefore T(n) ≤ cn, and therefore performing n operations costs at most cn. Therefore, the cost of n operations is O(n), not just O(n) amortized.
There can be some confusion from considering operations in isolation rather than as part of a sequence of n operations. If some program executes billions of unordered_set insertions and we take a random sample of n of them, it is not guaranteed that those n have O(1) amortized time: we could be unlucky and pick many of the insertions that happened to rebuild the table. In such a random selection, the statistical average time would be O(1), but any particular sample could fluctuate. In contrast, when we look at all the insertions used to put n elements into the table, their times are correlated; the nature of the algorithm guarantees that table rebuilds occur only with a certain frequency, and this guarantees that the total amount of work over those n insertions is O(n).
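For reference, here is a minimal C++ sketch of the hashing method under discussion (the function name and signature are my own, not taken from the GeeksforGeeks page):

#include <unordered_set>
#include <utility>
#include <vector>

// The hashing method: one pass over a, doing one lookup and one insertion per
// element. Each unordered_set operation is O(1) amortized, so the loop is O(n).
bool has_pair_with_sum(const std::vector<int>& a, int x, std::pair<int, int>* out) {
  std::unordered_set<int> seen;
  for (int value : a) {
    if (seen.count(x - value)) {   // step 2.1: is the complement already present?
      if (out != nullptr) *out = {value, x - value};
      return true;
    }
    seen.insert(value);            // step 2.2: remember this element
  }
  return false;
}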

Time complexity analysis in data structures and algo

Can we compare O(m+n) with O(n)? Are they the same, since we only need to focus on the highest power?
Both O(m+n) and O(n) are linear in relation to the input n. In relation to m, O(m+n) is linear while O(n) is constant.
So, unless we analyse only the input n and assume m to be constant, we cannot in general simplify O(m+n) to O(n).
Sometimes we may be able to combine two input dimensions into one. For example, if m is the number of input strings and n is the maximum length of an input string, then we might reframe the analysis in terms of the total length of all input strings.
O(m+n) is two-dimensional (it has two parameters, m and n), and you can't reduce it to one dimension without more information about the relationship between m and n.
A concrete example: Many graph algorithms (e.g. depth first search, topological sort) have time complexity O(v + e), where v is the number of vertices and e is the number of edges. You can consider two separate types of graph:
In a dense graph with lots of edges, e is proportional to v². The time complexity of the algorithm on this type of graph is O(v + v²), or O(v²).
In a sparse graph with few edges, e is proportional to v. The time complexity of the algorithm on this type of graph is O(v + v), or O(v).
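As a small, self-contained illustration of a genuinely two-parameter bound (my own example, not from the question): merging two sorted arrays of sizes m and n runs in O(m + n), and neither parameter can be dropped without knowing how m relates to n.

#include <cstddef>
#include <vector>

// Merging two sorted vectors touches each element of a and b exactly once,
// so the running time is O(m + n), where m = a.size() and n = b.size().
std::vector<int> merge_sorted(const std::vector<int>& a, const std::vector<int>& b) {
  std::vector<int> out;
  out.reserve(a.size() + b.size());
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size())
    out.push_back(a[i] <= b[j] ? a[i++] : b[j++]);
  while (i < a.size()) out.push_back(a[i++]);
  while (j < b.size()) out.push_back(b[j++]);
  return out;
}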

Prim’s MST for Adjacency List Representation

I was trying to implement Prim's algorithm using a min-heap.
Here is the link I was referring to:
https://www.geeksforgeeks.org/prims-mst-for-adjacency-list-representation-greedy-algo-6/
I was wondering: can we use a vector and sort it with the std::sort function in place of the min-heap?
You could, but remember that every time a new node is added to the MST you have to update the set of edges that cross the partition (edges to nodes that are not yet in the MST). That means you would have to re-sort your vector at every iteration of the algorithm, which wastes time: at each iteration you only need to know which of the crossing edges has the minimum weight. A min-heap is well suited to exactly that operation, since extracting the minimum costs O(log n), whereas a full sort costs O(n log n), which can make the algorithm noticeably slower on large or dense graphs.
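If you want to see the difference concretely, here is a minimal C++ sketch of Prim's algorithm using std::priority_queue with lazy deletion instead of a decrease-key (my own sketch, assuming a connected adjacency-list graph; it is not the code from the linked article):

#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Prim's MST with a binary min-heap (std::priority_queue) and lazy deletion.
// Each edge is pushed at most once per direction, so the cost is O(E log V),
// versus re-sorting a vector of crossing edges on every iteration.
// Assumes the graph is connected; starts from vertex 0.
long long prim_mst_weight(
    const std::vector<std::vector<std::pair<int, int>>>& adj) {  // adj[u] = {v, weight}
  using Entry = std::pair<int, int>;  // (weight of crossing edge, vertex)
  std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> pq;
  std::vector<bool> in_mst(adj.size(), false);

  long long total = 0;
  pq.push({0, 0});  // enter vertex 0 at zero cost
  while (!pq.empty()) {
    auto [w, u] = pq.top();
    pq.pop();
    if (in_mst[u]) continue;  // stale entry: u already joined via a cheaper edge
    in_mst[u] = true;
    total += w;
    for (auto [v, weight] : adj[u])
      if (!in_mst[v]) pq.push({weight, v});  // new crossing edge
  }
  return total;
}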

Understanding Time Complexity for tree traversal using BFS

I am trying to understand the time complexity when I traverse a tree with n nodes (not necessarily a binary tree) using BFS.
As per my understanding it should be O(n^2), since my outer loop runs n times, i.e. until the queue is empty, and the tree contains n nodes.
And my inner for loop has to keep adding the children of the current node to the queue. (Every node has a dict which contains the addresses of all its children.)
So, for example, if the root node has n-1 children (and those children have no children of their own), wouldn't the time complexity be n*(n-1) = O(n^2)?
Is my understanding correct?
Is there any way that this can be done in O(n)? Please explain.
It's often more useful to describe the complexity of graph algorithms in terms of both the number of nodes and edges. Typically |V| is used to represent the number of nodes, and |E| to represent the number of edges.
In BFS, we visit each of the |V| nodes once and add all of their neighbors to a queue. And, by the end of the algorithm, each edge in the graph has been processed exactly once. Therefore we can say BFS is O(|V| + |E|).
In a fully connected graph, |E| = |V|(|V| - 1)/2. So you are correct that the complexity is O(|V|^2) for fully connected graphs; however, O(|V| + |E|) is considered a tighter analysis for graphs that are known to be sparse.
Big-O notation gives an upper bound on the time complexity. You can of course say that the time complexity of BFS is O(n²), but that is not a tight upper bound.
To get a tight bound, you can look at BFS like this: each node is added to the queue exactly once, and each node is removed from the queue exactly once. Each add and each remove costs O(1) time, so the time complexity is O(n).
EDIT
To implement the O(n) BFS on a tree, you can follow this pseudocode.
procedure bfs(root: root of the tree)
    q := an empty queue
    push root into q
    while q is not empty
        v := the element at the head of q
        for each u in children of v
            push u into q
        pop v out of q
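If it helps, here is the same idea as runnable C++ (a sketch with a minimal node type of my own; your actual tree class may look different):

#include <queue>
#include <vector>

// A tree node with an arbitrary number of children (a stand-in for the
// question's "dict of children").
struct Node {
  std::vector<Node*> children;
};

// Each node is pushed and popped exactly once, and each parent-child edge is
// examined exactly once, so traversing a tree with n nodes costs O(n).
int bfs_count(Node* root) {
  int visited = 0;
  std::queue<Node*> q;
  if (root != nullptr) q.push(root);
  while (!q.empty()) {
    Node* v = q.front();
    q.pop();
    ++visited;  // "visit" v (here we just count the node)
    for (Node* u : v->children) q.push(u);
  }
  return visited;
}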

Is the complexity of Dijkstra's correct?

I have a question regarding the runtime complexity of Dijkstra's algorithm (see the pseudocode in CLRS, 3rd edition):
DIJKSTRA(G, w, s)
1  INITIALIZE-SINGLE-SOURCE(G, s)
2  S ← ∅
3  Q ← V[G]
4  while Q ≠ ∅
5      do u ← EXTRACT-MIN(Q)
6         S ← S ∪ {u}
7         for each vertex v ∈ Adj[u]
8             do RELAX(u, v, w)
I understand that line 3 is O(V), line 5 is O(V log V) in total, line 7 is O(E) in total, and line 8 implies decrease_key(), so O(log V) for each Relax() operation. But in Relax(), once d[v] > d[u] + weight and we decide to relax, shouldn't we first look up the position of v in the queue Q before we can call decrease_key(Q, pos, d[v]) to replace the key at pos with d[v]? Note that this lookup itself costs O(V), so each Relax() should cost O(V), not O(log V), right?
A question regarding space complexity: to compare the vertices in the queue Q, I designed a struct/class vertex with the distance as one member, and I implemented operator< to order vertices by comparing their distances. But it seems I have to keep a duplicate array dist[] in order to do dist[v] = dist[u] + weight in Relax(). If I do not keep the duplicate array, I have to look up the positions of v and u in the queue Q and then read and compare their distances. Is it supposed to work this way, or is my implementation just not good?
Dijkstra's algorithm (as you wrote it) does not have a runtime complexity unless you specify the data structures. You are more or less right that line 7 accounts for O(E) iterations in total, but let's go through the lines (fortunately, Dijkstra is "easy" to analyze).
Line 1: Initializing means giving all vertices an infinite distance, except for the source, which gets distance 0. Pretty easy; this can be done in O(V).
Line 2: What is the set S good for? You only ever write to it.
Line 3: You put all vertices into a queue. Here be dragons. What is a (priority!) queue? A data structure with the operations add, optionally decreaseKey (needed for Dijkstra), remove (not needed in Dijkstra), and extractMin. Depending on the implementation, these operations have certain runtimes. For example, you could build a dumb PQ that is just a (marking) set - then adding and decreasing a key take constant time, but extracting the minimum requires a search. The canonical choice in Dijkstra is a queue (like a binary heap) that implements all relevant operations in O(log n); let's analyze that case, although technically speaking a Fibonacci heap would be better. Don't implement the queue on your own - it's amazing how much you can save by using a real PQ implementation.
Line 4: You go through the loop |V| times.
Line 5: Each time you extract the minimum, which is O(V log V) in total over all iterations.
Line 6: Again - what is the set S good for?
Line 7: You go through the edges of each vertex at most once, i.e. you touch each edge at most twice, so whatever happens inside the loop is executed O(E) times in total.
Line 8: Relaxing means checking whether you have to decrease a key, and doing so if necessary. We already know that each such operation can add O(log V) in the queue (if it's a heap), and we have to do it O(E) times, so it's O(E log V), which dominates the total runtime.
If you use a Fibonacci heap you can get down to O(V log V + E), but that's academic; real implementations tune their heaps. If you want to know your implementation's performance, analyze its PQ operations. But as I said, it's better to use an existing implementation if you don't know exactly what you're doing. Your idea of "looking up a position before calling decreaseKey" tells me you should dig deeper into that topic before you end up with an implementation that effectively takes O(V) per insert (by sorting every time decreaseKey is called) or O(V) per extractMin (by finding the minimum on demand).
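About the "look up the position of v in Q" part: the standard fix is to keep, alongside the heap array, an index pos[v] that records where each vertex currently sits in the heap and is updated on every swap. Then Relax() finds v in O(1) and decrease_key costs O(log V). Below is a sketch of that bookkeeping in C++ (my own code; the struct and member names are illustrative, not from CLRS). The key[] array here also plays the role of your duplicate dist[] array, which is the usual arrangement, so keeping a separate distance array is not a flaw in your design.

#include <utility>
#include <vector>

// Binary min-heap over vertices, ordered by key[v] (the tentative distance).
// pos[v] tracks where v currently lives in the heap array, so decrease_key
// can find it in O(1) and then restore the heap property in O(log V).
struct IndexedMinHeap {
  std::vector<int> heap;       // heap[i] = vertex stored at heap position i
  std::vector<int> pos;        // pos[v]  = position of v in heap, or -1 if absent
  std::vector<long long> key;  // key[v]  = current tentative distance of v

  IndexedMinHeap(int n, long long inf) : pos(n, -1), key(n, inf) {}

  bool empty() const { return heap.empty(); }

  void swap_nodes(int i, int j) {
    std::swap(heap[i], heap[j]);
    pos[heap[i]] = i;  // keep the position index in sync on every swap
    pos[heap[j]] = j;
  }

  void sift_up(int i) {
    while (i > 0) {
      int parent = (i - 1) / 2;
      if (key[heap[parent]] <= key[heap[i]]) break;
      swap_nodes(i, parent);
      i = parent;
    }
  }

  void sift_down(int i) {
    int n = static_cast<int>(heap.size());
    while (true) {
      int l = 2 * i + 1, r = 2 * i + 2, smallest = i;
      if (l < n && key[heap[l]] < key[heap[smallest]]) smallest = l;
      if (r < n && key[heap[r]] < key[heap[smallest]]) smallest = r;
      if (smallest == i) break;
      swap_nodes(i, smallest);
      i = smallest;
    }
  }

  void push(int v, long long k) {  // O(log V)
    key[v] = k;
    pos[v] = static_cast<int>(heap.size());
    heap.push_back(v);
    sift_up(pos[v]);
  }

  int extract_min() {  // O(log V)
    int v = heap.front();
    swap_nodes(0, static_cast<int>(heap.size()) - 1);
    heap.pop_back();
    pos[v] = -1;
    if (!heap.empty()) sift_down(0);
    return v;
  }

  void decrease_key(int v, long long k) {  // O(log V); requires v to still be in the heap
    key[v] = k;          // the key only gets smaller, so sifting up suffices
    sift_up(pos[v]);
  }
};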