Lowest Common Ancestor Optimization


I have a rudimentary array with elements [0 to N - 1] where each element is a structure that has an index always pointing to a location earlier in the array.
At one point, as part of a much larger algorithm, I want to find every node Y after a node X such that the lowest common ancestor of X and Y is a specific node C.
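int LCA(int a, int b) {
    // Parents always have a lower index than their children, so the node
    // with the larger index can never be the ancestor: move that one up.
    while (a != b) {
        if (a > b) {
            a = nodes[a].parent;
        } else {
            b = nodes[b].parent;
        }
    }
    return a;
}

for (int y = x + 1; y < n; ++y) {
    if (LCA(x, y) == c) {
        // other code
    }
}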
The above code is really pseudo-code. I've managed to slightly improve performance of LCA() by having a look-up table generated as it is used. Something like this:
int LCA(int a, int b) {
    // A stored value of 0 is treated as "not cached yet".
    if (lookup[a][b]) {
        return lookup[a][b];
    }
    int oa = a, ob = b;
    while (a != b) {
        if (a > b) {
            a = nodes[a].parent;
        } else {
            b = nodes[b].parent;
        }
    }
    lookup[oa][ob] = a;
    lookup[ob][oa] = a;
    return a;
}
I know there's likely a way I can make some sort of specialized LCA() function, that is, replace all of the above code in some manner to specialize it so it's considerably faster. But I've not thought of anything interesting.
I've attempted to see if I could simply do an LCA check between C and Y by seeing if LCA(c, y) == LCA(x, y), but of course that was not accurate.
To re-cap: X is always less than Y. C is always less than X (and thus Y). Parents are always at a lower index than their children (so it is ordered).
Would nodes knowing their depth help at all?
This code accounts for 80% of CPU time of the entire algorithm that takes about 4 minutes in total. A solution to this would easily improve the algorithm as a whole. Thanks!

The LCA of x and y will be the node with the smallest height (that is, the smallest depth, closest to the root) between an occurrence of x and an occurrence of y in the Euler tour (*) of your tree. To find this in O(1) time per query, you need to solve the range minimum query (RMQ) problem over the tour.
(*): your tour needs a slight modification for this to work. You must append a value to your array each time you get back to it (return from a recursive call to a child) as well. For the wiki tree, it would look like this:
1 2 3 4 5 6 7 8 9 10 11
1 2 6 2 4 2 1 3 1 5 1
Note that there's no point in having leaves show up twice (although it wouldn't affect correctness).
So, for example, RMQ(2, 5) will be the node with minimum height out of these:
2 3 4 5 6 7 8 9 10
2 6 2 4 2 1 3 1 5
Which is node 1.
That is not the only valid interval you can take. It's also valid to take the last occurrence of 2:
6 7 8 9 10
2 1 3 1 5
This will also return 1 as the LCA.
This way, you can answer LCA queries in constant time with linear time spent on preprocessing.
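As a rough illustration of the Euler tour idea above, here is a minimal sketch for a tree stored the way the question describes (a parent array where every parent has a lower index than its children). The struct name EulerTourLca and its members are made up for this example, and it assumes node 0 is the root with parent[0] == -1 and at least one node. Note that it swaps in a sparse table for the RMQ step, which needs O(n log n) preprocessing rather than the linear preprocessing mentioned above; queries are still O(1).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper (not from the question): LCA queries on a tree stored as
// a parent array with parent[0] == -1 and parent[v] < v for every other node.
struct EulerTourLca {
    std::vector<int> depth;               // depth[v]: distance from the root
    std::vector<int> first;               // first[v]: first position of v in the tour
    std::vector<int> lg;                  // lg[len]: floor(log2(len))
    std::vector<std::vector<int>> table;  // table[k][i]: shallowest node in tour[i .. i + 2^k - 1]

    explicit EulerTourLca(const std::vector<int>& parent) {
        const int n = static_cast<int>(parent.size());
        std::vector<std::vector<int>> children(n);
        depth.assign(n, 0);
        for (int v = 1; v < n; ++v) {
            children[parent[v]].push_back(v);
            depth[v] = depth[parent[v]] + 1;  // safe because parent[v] < v
        }

        // Euler tour without recursion: record a node when it is entered and
        // again every time control returns to it from a child.
        std::vector<int> tour;
        first.assign(n, -1);
        std::vector<std::pair<int, std::size_t>> stack;  // (node, next child to visit)
        stack.emplace_back(0, 0);
        while (!stack.empty()) {
            const int v = stack.back().first;
            const std::size_t next = stack.back().second;
            if (first[v] < 0) first[v] = static_cast<int>(tour.size());
            tour.push_back(v);
            if (next < children[v].size()) {
                stack.back().second = next + 1;
                stack.emplace_back(children[v][next], 0);
            } else {
                stack.pop_back();
            }
        }

        // Sparse table over the tour, keyed by depth.
        const int m = static_cast<int>(tour.size());
        lg.assign(m + 1, 0);
        for (int len = 2; len <= m; ++len) lg[len] = lg[len / 2] + 1;
        table.push_back(tour);
        for (int k = 1; (1 << k) <= m; ++k) {
            const int half = 1 << (k - 1);
            std::vector<int> row(m - (1 << k) + 1);
            for (int i = 0; i + (1 << k) <= m; ++i)
                row[i] = shallower(table[k - 1][i], table[k - 1][i + half]);
            table.push_back(std::move(row));
        }
    }

    int shallower(int a, int b) const { return depth[a] <= depth[b] ? a : b; }

    // LCA of x and y: the shallowest node between their first occurrences.
    int query(int x, int y) const {
        int lo = first[x], hi = first[y];
        if (lo > hi) std::swap(lo, hi);
        const int k = lg[hi - lo + 1];
        return shallower(table[k][lo], table[k][hi - (1 << k) + 1]);
    }
};
```

With such a structure built once, the test in the question's loop would become something like lca.query(x, y) == c.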

Related

Efficient algorithms to check if a binary maze is solvable with restricted moves

I am given a problem to generate binary mazes of dimensions r x c (0/false for blocked cell and 1/true for free cell). Each maze is supposed to be solvable.
One can move from (i, j) to either (i + 1, j)(down) or (i, j + 1)(right). The solver is expected to reach (r - 1, c - 1)(last cell) starting from (0, 0)(first cell).
Below is my algorithm (a modified BFS) to check whether a maze is solvable. It runs in O(r*c) time. I am trying to find a solution with better time complexity. Can anyone suggest another algorithm? I don't want the path; I just want to check whether one exists.
#include <iostream>
#include <queue>
#include <vector>

const int r = 5, c = 5;

bool isSolvable(std::vector<std::vector<bool>> &m) {
    if (m[0][0]) {
        std::queue<std::pair<int, int>> q;
        q.push({0, 0});
        while (!q.empty()) {
            auto p = q.front();
            q.pop();
            if (p.first == r - 1 and p.second == c - 1)
                return true;
            if (p.first + 1 < r and m[p.first + 1][p.second])
                q.push({p.first + 1, p.second});
            if (p.second + 1 < c and m[p.first][p.second + 1])
                q.push({p.first, p.second + 1});
        }
    }
    return false;
}

int main() {
    char ch;
    std::vector<std::vector<bool>> maze(r, std::vector<bool>(c));
    for (auto &&row : maze)
        for (auto &&ele : row) {
            std::cin >> ch;
            ele = (ch == '1');
        }
    std::cout << isSolvable(maze) << std::endl;
    return 0;
}
Recursive Solution:
bool exploreMaze(std::vector<std::vector<bool>> &m, std::vector<std::vector<bool>> &dp, int x = 0, int y = 0) {
    if (x + 1 > r or y + 1 > c) return false;
    if (not m[x][y]) return false;
    if (x == r - 1 and y == c - 1) return true;
    if (dp[x][y + 1] and exploreMaze(m, dp, x, y + 1)) return true;
    if (dp[x + 1][y] and exploreMaze(m, dp, x + 1, y)) return true;
    return dp[x][y] = false;
}

bool isSolvable(std::vector<std::vector<bool>> &m) {
    std::vector<std::vector<bool>> dp(r + 1, std::vector<bool>(c + 1, true));
    return exploreMaze(m, dp);
}
Specific need:
I aim to use this function many times in my code: changing certain point of the grid, and then rechecking if that changes the result. Is there any possibility of memoization so that the results generated in a run can be re-used? That could give me better average time complexity.

If you call this function many times with few changes in between, there is a data structure called a link-cut tree that supports the following operations in O(log n) time:
Link (Links 2 graph nodes)
Cut (Cuts given edge from a graph)
Is Connected? (checks if 2 nodes are connected by some edges)
Given that a grid is an implicit graph, we can first build the link-cut tree in O(n*m*log(n*m)) time.
Then every update (adding or deleting a node) can be done by adding or deleting its neighboring edges, which takes only O(log(n*m)) time.
That said, I suggest optimizing the maze generation algorithm instead of using this complicated data structure. Maze generation can be done with DFS quite easily.

The problem you are looking at is known as dynamic connectivity and, as @Photon said, since you have an acyclic graph, one solution is to use a link-cut tree. Another is based on a different representation, the Euler tour.

You cannot go below O(r*c) in the general case because, with any pathfinding strategy, there is always a special case of a maze where you need to traverse a rectangular subregion of dimensions proportional to r and c before finding the correct path.
As for memoization: there is something you can do, but it might not help that much. You can build a copy of the maze but only keeping the valid paths, and putting in each cell the direction towards the previous and next cells, as well as the number of paths that traverse it. Let me illustrate.
Take the following maze, and the corresponding three valid paths:
maze         path 1       path 2       path 3
1 1 1 0 1    1 1 1 0 0    1 1 0 0 0    1 1 0 0 0
0 1 1 1 1    0 0 1 1 0    0 1 1 1 0    0 1 0 0 0
0 1 0 1 0    0 0 0 1 0    0 0 0 1 0    0 1 0 0 0
1 1 0 1 0    0 0 0 1 0    0 0 0 1 0    0 1 0 0 0
0 1 1 1 1    0 0 0 1 1    0 0 0 1 1    0 1 1 1 1
You can build what I'll call the forward direction grid (FDG), the backward direction grid (BDG), and the valuation grid:
FDG          BDG          valuation
R B D N N    B L L N N    3 3 1 0 0
N B R D N    N U B L N    0 2 2 2 0
N D N D N    N U N U N    0 1 0 2 0
N D N D N    N U N U N    0 1 0 2 0
N R R R B    N U L B L    0 1 1 3 3
R = right, D = down, L = left, U = up, B = both, and N = none.
The FDG tells you, in each cell, in what direction is the next cell on a valid path (or if both are). The BDG is the same thing in reverse. The valuation grid tells you how many valid paths contain each cell.
For convenience, I'm putting a B at the destination in the direction grids. You can see it as if the goal was to exit the maze, and to do so, you can go in either direction from the final cell. Note that there are always the same number of B cells, and that it's exactly the number of valid paths.
The easiest way to get these grids is to build them during a depth-first search. In fact, you can even use the BDG for the depth-first search since it contains backtracking information.
Now that you have these, you can block or free a cell and update the three grids accordingly. If you keep the number of valid paths separately as well, you can update it at the same time and the condition "the maze is solvable" becomes "the number of valid paths is not zero". Also note that you can combine both direction grids, but I find them easier to grasp separately.
To update the grids and the number of valid paths, there are three cases:
(A) you blocked a cell that was marked N; you don't need to do anything.
(B) you blocked a cell that was not marked N, so previously part of at least one valid path; decrement the number of valid paths by the cell's value in the valuation grid, and update all three grids accordingly.
(C) you freed a cell (that was necessarily marked N); update all three grids first and then increment the number of valid paths by the cell's new value in the valuation grid.
Updating the grids is a bit tricky, but the point is that you do not need to update every cell.
In case (B), if the number of valid paths hits zero, you can reset all three grids. Otherwise, you can use the FDG to update the correct cells forward until you hit the bottom-right, and the BDG to update the correct ones backward until you hit the top-left.
In case (C), you can update the direction grids first by doing a depth-first search, both forward and backward, and backtrack as soon as you hit a cell that isn't marked N (you need to update this cell as well). Then, you can make two sums of the values, in the valuation grid, of the cells you hit: one going forward and one going backward. The number of paths going through the new cell is the product of these two sums. Next, you can update the rest of the valuation grid with the help of the updated direction grids.
I would imagine this technique having an effect on performance with very large mazes, if the updates to the maze itself do not create or break too many paths every time.
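For reference, the valuation grid described above can also be computed from scratch: the number of valid paths through a cell is the number of right/down paths reaching it from the top-left corner multiplied by the number of such paths from it to the bottom-right corner. The sketch below only shows that full recomputation, not the incremental update; the function name valuationGrid and the Grid alias are made up for this example, it assumes a non-empty rectangular maze, and the counts can overflow on large, open mazes.

```cpp
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<unsigned long long>>;

// Hypothetical helper: value[i][j] is the number of right/down paths from the
// top-left to the bottom-right corner that pass through cell (i, j).
// maze[i][j] is 1 for a free cell and 0 for a blocked one.
Grid valuationGrid(const std::vector<std::vector<int>>& maze) {
    const std::size_t r = maze.size(), c = maze[0].size();
    Grid fromStart(r, std::vector<unsigned long long>(c, 0));
    Grid toEnd = fromStart, value = fromStart;

    // Paths from the top-left corner to each free cell.
    for (std::size_t i = 0; i < r; ++i)
        for (std::size_t j = 0; j < c; ++j) {
            if (!maze[i][j]) continue;
            if (i == 0 && j == 0) { fromStart[i][j] = 1; continue; }
            if (i > 0) fromStart[i][j] += fromStart[i - 1][j];
            if (j > 0) fromStart[i][j] += fromStart[i][j - 1];
        }

    // Paths from each free cell to the bottom-right corner.
    for (std::size_t i = r; i-- > 0;)
        for (std::size_t j = c; j-- > 0;) {
            if (!maze[i][j]) continue;
            if (i == r - 1 && j == c - 1) { toEnd[i][j] = 1; continue; }
            if (i + 1 < r) toEnd[i][j] += toEnd[i + 1][j];
            if (j + 1 < c) toEnd[i][j] += toEnd[i][j + 1];
        }

    for (std::size_t i = 0; i < r; ++i)
        for (std::size_t j = 0; j < c; ++j)
            value[i][j] = fromStart[i][j] * toEnd[i][j];
    return value;
}
```

The value at the top-left cell is the total number of valid paths, so the maze is solvable exactly when that entry is non-zero.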

Array folding into a single element

This is an interview question, not homework.
Given an array of the numbers 1 to 2^N, for example 1 2 3 4 5 6 7 8 (N = 3). Imagine this array is written on a strip of paper. We need to fold it in half, so that the left half is mirrored and then moved underneath the right half, like this:
1 2 3 4 5 6 7 8
left | right
half | half
becomes
5 6 7 8
4 3 2 1
And the next fold we take the right half instead, mirroring it and moving it below the left half,
5 6
4 3
8 7
1 2
The paper has to be folded, changing direction (left-vs-right) each time, until we have all the elements in the single column like this
6
3
7
2
5
4
8
1
My solution,
First step :
Create a linked list for the second half of the original array, and reverse the first half and connect it with head pointers,
5 6 7 8
| | | |
4 3 2 1
And store the head pointers of linked list in an array called headarray
Iteratively :
Fold the head array; on each fold, the head pointers of the first half are linked with those of the second half. Delete a head pointer from the head array once it has been linked.
Continue until we have a single head pointer in the head array.
But the interviewer asked me to solve it using a stack. Could anyone help me solve this with a stack, and also point out any mistakes in my solution? Thanks in advance.

This problem can be solved by using a stack and the original array. I will not code the solution for you, but I will point out how to solve it.
push the array elements on to the stack following the rules we'll discuss further down
right after that pop the stack back into the array starting at index 0
repeat until the end condition is fulfilled
Rule for filling the stack:
initially consider your array as one 'segment'
divide the segment in half; the first half you will iterate in reverse order(right->left), the second one in natural order (left->right)
You start pushing on to the stack from the end of the array:
if the iteration is odd, push the odd halves first,
if the iteration is even, push the even halves first
repeat, and keep halving your segments until they contain only one element; this is your stop condition
This is a little abstract, so let's consider your example:
iter=1 ->1234 <-5678 Arrows show the direction of iteration
start from the end and fill the stack; iter is odd, so start with the first odd half encountered
5
6
7
8
4 <-notice that the order of pushing the halves on the stack is shown by the arrows
3
2
1
pop the stack back : 5 6 7 8 4 3 2 1
Continue dividing the halves:
iter=2 <-56 ->78 <-43 ->21; odd halves 56,43; even halves 78,21
start from the end and fill the stack; iter is even, so start with the even halves first
5
6
4
3
8 <-even halves end, odd halves start
7
1
2
Pop the stack back: 5 6 4 3 8 7 1 2
Divide the segments again; since there will be only one element in each new half, the arrows are used just to highlight the rule:
iter=3 ->5 <-6 ->4 <-3 ->8 <-7 ->1 <-2
iter is odd, so fill the stack with the odd halves first
6
3
7
2
5
4
8
1
Pop the stack back, and you are done: 63725481
I hope this makes sense; happy coding :)
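To check the result of a stack-based solution, it can help to have a direct simulation of the folding rule from the question to compare against. The sketch below is not the stack approach itself: it simply keeps each column of the folded paper as a list (top layer first) and applies the alternating folds. The function name foldSequence is made up for this example, and n is assumed to be small enough that 2^n fits in an int.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: simulates the folds for the array 1 .. 2^n and returns
// the final single column, from top to bottom.
std::vector<int> foldSequence(int n) {
    // columns[i] lists the values stacked at position i, top layer first.
    std::vector<std::vector<int>> columns;
    for (int v = 1; v <= (1 << n); ++v) columns.push_back({v});

    for (int fold = 1; columns.size() > 1; ++fold) {
        const std::size_t half = columns.size() / 2;
        const bool leftGoesUnder = (fold % 2 == 1);
        std::vector<std::vector<int>> next(half);
        for (std::size_t j = 0; j < half; ++j) {
            // Odd folds: the mirrored left half goes under the right half.
            // Even folds: the mirrored right half goes under the left half.
            const std::vector<int>& top =
                leftGoesUnder ? columns[half + j] : columns[j];
            const std::vector<int>& bottom =
                leftGoesUnder ? columns[half - 1 - j] : columns[2 * half - 1 - j];
            next[j] = top;
            next[j].insert(next[j].end(), bottom.begin(), bottom.end());
        }
        columns = std::move(next);
    }
    return columns[0];
}
```

For n = 3 this reproduces the column 6 3 7 2 5 4 8 1 from the question.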

I have found a pattern: the elements whose (0-based) indices are 2*n-1 and 2*n, with n odd, always end up before the rest of the elements, whichever direction you fold. For example, for the array 12345678, the elements 2367 always come before 1458. This splits the array into two smaller arrays, and you may be able to find the same kind of pattern within each of them. I hope this helps.

Maybe your interviewer expected something like:
private int[] Fold(int pow)
{
    if (pow < 0)
        throw new Exception("illegal input");
    int n = 1;
    for (int factor = 1; factor <= pow; factor++)
        n *= 2;
    Stack<int> storage = new Stack<int>(n);
    this.Add(n, 1, storage);
    int[] result = new int[n];
    for (int k = 0; k < n; k++)
        result[k] = storage.Pop();
    return result;
}

private void Add(int n, int value, Stack<int> storage)
{
    storage.Push(value);
    int m = n;
    while (true)
    {
        int mirror = m + 1 - value;
        if (mirror <= value)
            break;
        this.Add(n, mirror, storage);
        m /= 2;
    }
}
{ demonstrating that you know about stacks AND about recursion ;-) }

Here's a recursive solution turned iterative; hence a stack, although probably not as intended. The function returns the starting position of an element based on the given position. It seems to be of time O(1/2n(log n + 1)) and space O(log n).
JavaScript Code:
function f(n,y,x,l){
    var stack = [[n,y,x,l]];
    while (stack[0]){
        var temp = stack.pop();
        var n = temp[0], y = temp[1], x = temp[2], l = temp[3];
        var m = 1 << l;
        if (m == 1)
            return x;
        if (l % 2 == 0){
            if (y > m / 2)
                stack.push([n * 2,y - m / 2,n + n - x + 1,l - 1]);
            else
                stack.push([n * 2,y,x,l - 1]);
        } else if (y > m / 2){
            stack.push([n * 2,y - m / 2,n - x + 1,l - 1]);
        } else
            stack.push([n * 2,y,x + n,l - 1]);
    }
}

function g(p){
    var n = 1 << p;
    for (var i=1; i<n; i+=2){
        var a = f(1,i,1,p);
        console.log(a);
        console.log(n - a + 1);
    }
}

g(3)

Indices of objects in a list of non-redundant pairs

I am implementing a collision detection algorithm that stores the distances between all pairs of objects in a single octree node. For instance, if there are 4 objects in the node, there is a distance between objects 1&2, 1&3, 1&4, 2&3, 2&4 and 3&4. The formula for the total number of pairs is t = n * (n-1) / 2, where t is the total number of pairs and n is the number of objects in a node.
My question is: how do I convert from a position in the list to a pair of objects? For instance, using the above list of pairs (with 0-based positions), 3 would return the pair 2&3.
To save space in memory, the list is just a list of floats for the distance instead of containing distance and pointers to 2 objects.
I am unsure how to mathematically convert the single list index to a pair of numbers. Any help would be great. I am hoping to be able to break this down to 2 functions, the first returns the first object in the pair and the second returns the second, both the functions taking 2 variables, one being the index and the other being the total objects in the node. If possible I would like to make a function without any looping or having a recursive function because this will be run in real time for my collision detection algorithm.

Better ordering
I suggest using colexicographical order, as in that case you won't have to supply the total number of objects. Order your pairs like this:
0: 1: 2: 3: 4: 5: 6: 7: 8: 9: 10: 11: 12: 13: …
0&1, 0&2, 1&2, 0&3, 1&3, 2&3, 0&4, 1&4, 2&4, 3&4, 0&5, 1&5, 2&5, 3&5, …
You'll be able to extend this list to infinite length, so you can know the index of any pair without knowing the number of items. This has the benefit that when you add new items to your data structure, you'll only have to append to your arrays, not relocate existing entries. I've adjusted the indices to be zero-based, as you tagged your question C++, so I assume you'll be using zero-based indexing. Everything in my answer below assumes this ordering.
You can also visualize the colex ordering like this:
   a:   0   1   2   3   4   5  …
b:
1       0
2       1   2                     index of
3       3   4   5                   a&b
4       6   7   8   9
5      10  11  12  13  14
6      15  16  17  18  19  20
⋮       ⋮                       ⋱
Pair to single index
Let us first turn a pair into a single index. The trick is that for every pair, you look at the second position and imagine all the pairs that had a lesser number in that position. So for example for the pair 2&4 you first count all the pairs where the second number is less than 4. This is the number of possible ways to choose two items from a set of 4 (i.e. the numbers 0 through 3), so you could express this as a binomial coefficient 4C2. If you evaluate it, you end up with 4(4−1)/2=6. To that you add the first number, as this is the number of pairs with lower index but with the same number in the second place. For 2&4 this is 2, so the overall index of 2&4 is 4(4−1)/2+2=8.
In general, for a pair a&b the index will be b(b−1)/2+a.
int index_from_pair(int a, int b) {
    return b*(b - 1)/2 + a;
}
Single index to pair
One way to turn the single index i back into a pair of numbers would be increasing b until b(b+1)/2 > i, i.e. the situation where the next value of b would result in indices larger than i. Then you can find a as the difference a = i−b(b−1)/2. This approach by incrementing b one at a time involves using a loop.
pair<int, int> pair_from_index(int i) {
    int a, b;
    for (b = 0; b*(b + 1)/2 <= i; ++b)
        /* empty loop body */;
    a = i - b*(b - 1)/2;
    return make_pair(a, b);
}
You could also interpret b(b−1)/2 = i as a quadratic equation, which you can solve using a square root. The real b you need is the floor of the floating point b you'd get as the positive solution to this quadratic equation. As you might encounter problems due to rounding errors in this approach, you might want to check whether b(b+1)/2 > i. If that is not the case, increment b as you would do in the loop approach. Once you have b, the computation of a remains the same.
pair<int, int> pair_from_index(int i) {
    int b = (int)floor((sqrt(8*i + 1) + 1)*0.5);
    if (b*(b + 1)/2 <= i) ++b; // handle possible rounding error
    int a = i - b*(b - 1)/2;
    return make_pair(a, b);
}
Sequential access
Note that you only need to turn indices back to pairs for random access to your list. When iterating over all pairs, a set of nested loops is easier. So instead of
for (int i = 0; i < n*(n - 1)/2; ++i) {
    pair<int, int> ab = pair_from_index(i);
    int a = ab.first, b = ab.second;
    // do stuff
}
you'd better write
for (int i = 0, b = 1; b != n; ++b) {
    for (int a = 0; a != b; ++a) {
        // do stuff
        ++i;
    }
}

Based on my understanding of the question, one way to get a pair a&b (1-based, 2&3 in your example) from the index (0-based, 3 in your example) and the number of objects n (4 in your example) is:
t = n * (n - 1) / 2;
a = n - floor((1 + sqrt(1 + 8 * (t - index - 1))) / 2);
b = index + (n - a) * (n - a + 1) / 2 - t + a + 1;
Some credits to http://oeis.org/A002024
Generalized algorithms (for tuples rather than pairs) can be found at Calculate Combination based on position and http://saliu.com/bbs/messages/348.html, but they seem to involve calculating combinations in a loop.
Edit: a nicer formula for a (from the same source):
a = n - floor(0.5 + sqrt(2 * (t - index)));
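As a sketch of how the first pair of formulas above might look as a C++ function, assuming the ordering from the question (1&2, 1&3, ..., 1&n, 2&3, ...), a 0-based index and 1-based object numbers. The function name pairFromIndexLex is made up for this example, and it relies on double-precision sqrt, so it is only meant for moderate values of n.

```cpp
#include <cmath>
#include <utility>

// Hypothetical helper: maps a 0-based list index to the 1-based pair (a, b),
// a < b, for pairs listed as 1&2, 1&3, ..., 1&n, 2&3, ..., (n-1)&n.
std::pair<int, int> pairFromIndexLex(int index, int n) {
    const int t = n * (n - 1) / 2;      // total number of pairs
    const int fromEnd = t - index - 1;  // position counted back from the last pair
    // Number of objects from a up to n, minus one (a triangular-number root).
    const int m = static_cast<int>(
        std::floor((1.0 + std::sqrt(1.0 + 8.0 * fromEnd)) / 2.0));
    const int a = n - m;
    const int b = index + (n - a) * (n - a + 1) / 2 - t + a + 1;
    return {a, b};
}
```

For the example in the question, pairFromIndexLex(3, 4) gives the pair 2&3.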

Make QuickSort sort by multiple criteria?

Is there any way to make a quicksort sort by multiple conditions? For example, I have a set of edges. Each edge has a source, destination, and length. I want to put the edge with the smaller length in my array first. But if the lengths are the same, I want to put the edge with the smaller source vertex first. If the source vertices are the same too, I want to put the edge with the smaller destination vertex first.
For example:
4 (source) 2 (destination) 3 (length)
1 (source) 5 (destination) 3 (length)
Since they both have the same length, we look at the source vertex. Since the second edge's source is smaller than the first edge's, we swap them because we compare by source vertex.
Below is my quicksort, and I'm honestly not sure why it's not sorting correctly. If there's a way to make quicksort less efficient but more stable, I would gladly take suggestions!
void quickSort(edge *e, int left, int right)
{
    int i = left, j = right;
    int temp, temp1, temp2;
    int pivot = (left + right)/2;
    while(i <= j)
    {
        while(e[i] < e[pivot])
            i++;
        while(e[pivot] < e[j])
            j--;
        if(i <= j)
        {
            temp = e[i].getLength();
            temp1 = e[i].getEdgeSrc();
            temp2 = e[i].getEdgeDes();
            e[i].setLength(e[j].getLength());
            e[i].setEdgeSrc(e[j].getEdgeSrc());
            e[i].setEdgeDes(e[j].getEdgeDes());
            e[j].setLength(temp);
            e[j].setEdgeSrc(temp1);
            e[j].setEdgeDes(temp2);
            i++;
            j--;
        } //if statement
    }///while loop
    if(left < j)
        quickSort(e, left, j);
    if(i < right)
        quickSort(e, i, right);
}
My sorting of conditions:
bool edge::operator<(const edge &other) const
{
    if (length < other.length)
        return true;
    else if ((length == other.length) && (source < other.source))
        return true;
    else if((length == other.length) && (source == other.source) && (destination < other.destination))
        return true;
    return false;
}
Again, if anyone knows a way to make this quicksort work correctly, even one that trades some efficiency for stability, I would gladly take any suggestions! Thank you!
Edit: This is how I invoked my quicksort. I invoked it based on the number of edges read.
quickSort(e, 0, edges-1); //-1 because if you put in edges, it'd go past the bounds of the array
EDIT: when I try to put in something like this in my algorithm:
0 1 1
0 3 1
1 3 1
2 5 1
4 10 1
4 8 1
10 8 1
11 6 2
11 7 2
6 7 1
9 6 1
9 7 1
This is the output:
0 1 1
0 3 1
1 3 1
2 5 1
4 8 1
4 10 1
6 7 1
6 9 1
8 10 1 <- should be below 7 9 1
7 9 1 <- should be above 8 10 1
6 11 2
7 11 2

It is cleaner to write it this way:
if (length != other.length)
    return length < other.length;
if (source != other.source)
    return source < other.source;
return destination < other.destination;
You should also be able to do temp = e[i] and so on, since the members are all ints.
This (and the comparison code you submitted) should do the task you want, I think.
If you are having stability issues, that's because quicksort isn't stable. You could get around it by adding more conditions so that lhs == rhs doesn't happen. Alternatively, you can try mergesort.
I don't have much experience with quicksort, frankly, but your implementation does look markedly different from Wikipedia's in-place algorithm. For instance, your pivot is not moved at all. Could you check if that is the problem?
Edit
After looking at your link
It looks like the algorithm linked also uses pivot as a value instead of as an index (as you do). It looks syntactically identical to yours until you consider that your pivot value might move, after which your pivot index would point to something else
int pivot = arr[(left + right) / 2];
Does this help?
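To make the pivot point concrete, here is a minimal sketch of the same partitioning scheme with the pivot copied by value before any swaps happen. The Edge struct and the std::tie comparison are illustrative stand-ins for the asker's edge class and operator<, and a std::vector is used instead of a raw array; the rest follows the structure of the code in the question.

```cpp
#include <tuple>
#include <utility>
#include <vector>

// Stand-in for the asker's edge class (assumption: plain int members).
struct Edge {
    int source, destination, length;
};

// Order by length, then source, then destination.
bool operator<(const Edge& a, const Edge& b) {
    return std::tie(a.length, a.source, a.destination) <
           std::tie(b.length, b.source, b.destination);
}

void quickSort(std::vector<Edge>& e, int left, int right) {
    if (left >= right) return;
    int i = left, j = right;
    // Copy the pivot VALUE: the element at the middle index may be swapped
    // away during partitioning, so comparing against e[index] would break.
    const Edge pivot = e[left + (right - left) / 2];
    while (i <= j) {
        while (e[i] < pivot) ++i;
        while (pivot < e[j]) --j;
        if (i <= j) {
            std::swap(e[i], e[j]);
            ++i;
            --j;
        }
    }
    if (left < j) quickSort(e, left, j);
    if (i < right) quickSort(e, i, right);
}
```

It would be called the same way as in the question, for example quickSort(edges, 0, static_cast<int>(edges.size()) - 1).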

EDIT: Here's pseudocode for in-place quicksort: http://en.wikipedia.org/wiki/Quicksort#In-place_version
This differs from your code in that the pivot is a value (the element at the midpoint of the left and right indices) rather than an index.
If you're looking for a simple non-optimal solution, mergesort the entire list by destination vertex, then mergesort the entire list by origin vertex, then mergesort the entire list by edge length. This takes advantage of the fact that mergesort is a stable sort algorithm and has running time O(E log E) in the number of edges E.
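The three-pass idea above could be sketched as follows. This uses std::stable_sort from the standard library in place of a hand-written mergesort (it gives the same stability guarantee), and the Edge struct and function name are made up for this example.

```cpp
#include <algorithm>
#include <vector>

// Stand-in for the asker's edge class (assumption: plain int members).
struct Edge {
    int source, destination, length;
};

// Sort by the least significant key first and the most significant key last;
// stability keeps the earlier orderings intact among equal elements.
void sortByStablePasses(std::vector<Edge>& edges) {
    std::stable_sort(edges.begin(), edges.end(),
                     [](const Edge& a, const Edge& b) { return a.destination < b.destination; });
    std::stable_sort(edges.begin(), edges.end(),
                     [](const Edge& a, const Edge& b) { return a.source < b.source; });
    std::stable_sort(edges.begin(), edges.end(),
                     [](const Edge& a, const Edge& b) { return a.length < b.length; });
}
```

A single sort with a three-key comparison does the same job in one pass; the multi-pass version is mainly useful to show why stability matters.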

Counting number of members in a disjoint set

I am having a little bit of trouble counting the number of elements in each of my disjoint set members. For example, if someone enters in:
Note: first number = source vertex, 2nd number = destination vertex, 3rd number = length
0 2 1
4 8 7
5 8 6
1 2 5
2 3 17
I would have 2 as a count for the set
4 8 7
5 8 6
and 3 as a count for the set below, since the two sets are connected by 2 and 3 elements respectively:
0 2 1
1 2 5
2 3 17
I had the idea of storing the count of elements for each disjoint set in an integer array, so I can access the count for every disjoint set. Below are my implementations for finding elements and unioning them into the same set. I also have a function for finding the root of every set.
int node::findSet(int v, int *parent)
{
    if(parent[v] < 0)
        return v;
    else
    {
        return parent[v] = findSet(parent[v], parent);
    }
}

void node::joinSets(int c, int p1, int p2, int *parents)
{
    join(findSet(p1,parents),findSet(p2,parents),parents);
}

void node::join(int p1, int p2, int *parents)
{
    if(parents[p2] < parents[p1])
        parents[p1] = p2;
    else
    {
        if(parents[p1] == parents[p2])
            parents[p1]--;
        parents[p2] = p1;
    }
}
I'm just not sure where/when to increment and maintain my counter. Any help would be appreciated. Thanks!

If you want to count the number of edges connecting each disjoint set, store the current size of every root in an array similar to parents.
When an edge comes, find roots of both nodes. If they're equal, increment the counter for the root (I'm assuming there are no repeating edges). If they're not, union the roots, and for the resultant root's counter value put the sum of the roots' counter values plus one.
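A minimal sketch of the bookkeeping described above, assuming vertices are numbered 0 to n - 1 and no edge is repeated. The struct EdgeCountingDsu and its member names are made up for this example, and it leaves out union by rank to keep the counting logic easy to see.

```cpp
#include <numeric>
#include <vector>

// Hypothetical helper: a disjoint-set structure that also tracks how many
// edges have been added inside each set. Counts are only valid at roots.
struct EdgeCountingDsu {
    std::vector<int> parent;
    std::vector<int> edgeCount;

    explicit EdgeCountingDsu(int n) : parent(n), edgeCount(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);  // every vertex is its own root
    }

    int find(int v) {
        while (parent[v] != v) {
            parent[v] = parent[parent[v]];  // path halving
            v = parent[v];
        }
        return v;
    }

    void addEdge(int u, int v) {
        const int ru = find(u), rv = find(v);
        if (ru == rv) {
            ++edgeCount[ru];  // edge inside an existing set
            return;
        }
        parent[rv] = ru;                     // union the two sets
        edgeCount[ru] += edgeCount[rv] + 1;  // both counts plus the new edge
    }

    int edgesInSetOf(int v) { return edgeCount[find(v)]; }
};
```

With the five edges from the question, the set containing vertex 4 reports 2 edges and the set containing vertex 0 reports 3.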
