Calculating the vertical sum in binary tree - c++

This is a piece of code I came across that calculates the vertical sum in a binary tree. As the code doesn't have any documentation at all, I am unable to understand how it actually works and what exactly the condition if(base==hd) does.
Help needed :)
void vertical_line(int base,int hd,struct node * node)
{
if(!node) return;
vertical_line(base-1,hd,node->left);
if(base==hd) cout<<node->data<<" ";
vertical_line(base+1,hd,node->right);
}
void vertical_sum(struct node * node)
{
int l=0,r=0;
struct node * temp=node;
while(temp->left){
--l;temp=temp->left;
}
temp=node;
while(temp->right){
++r;temp=temp->right;
}
for(int i=l;i<=r;i++)
{
cout<<endl<<"VERTICAL LINE "<<i-l+1<<" : ";
vertical_line(0,i,node);
}
}

It is trying to display the tree in vertical order. Let's first understand what vertical order is.
Take the following tree as an example:
      4
    /   \
   2     6
  / \   / \
 1   3 5   7
Following is the distribution of nodes across vertical lines
Vertical Line 1 - 1
Vertical Line 2 - 2
Vertical Line 3 - 3,4,5
Vertical Line 4 - 6
Vertical Line 5 - 7
How did we decide that 3,4,5 are part of vertical line 3? We need to find the horizontal distance of each node from the root to decide whether nodes belong to the same line. We start with the root, which has a horizontal distance of zero. If we move left, we decrement the parent's distance by 1, and if we move right, we increment the parent's distance by 1. The same applies to every node in the tree, i.e. if a parent has horizontal distance d, then its left child's distance is d-1 and its right child's distance is d+1.
In this case node 4 has distance 0. Node 2 is the left child of 4, so its distance is -1 (decrement by 1). Node 6 is the right child of 4, so its distance is 1 (increment by 1).
Node 2 has distance -1. Node 3 is the right child of 2, so its distance is 0 (increment by 1).
The same reasoning gives node 5 (the left child of 6) a distance of 0. Nodes 3,4,5 all have a horizontal distance of zero, so they fall on the same vertical line.
Now coming to your code
while(temp->left){
--l;temp=temp->left;
}
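In this loop, you are computing the horizontal distance of the farthest node on the left-hand side of the root. Every time you move left, you decrement l by 1. (This doesn't work all the time: the loop follows only left pointers, so it misses a node that is reached through a right child and still ends up further left, e.g. root->right->left->left->left, which has distance -2 even when the root has no left child.)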
while(temp->right){
++r;temp=temp->right;
}
With logic similar to the above, you are computing the distance of the farthest node on the right-hand side of the root.
Now that you know the farthest distances on the left-hand and right-hand sides, you display the nodes line by line:
for(int i=l;i<=r;i++)
{
cout<<endl<<"VERTICAL LINE "<<i-l+1<<" : ";
vertical_line(0,i,node);
}
Every iteration of the above loop displays the nodes on one vertical line by calling vertical_line for that line. (This is not efficient, because each call traverses the whole tree again.)
void vertical_line(int base,int hd,struct node * node)
{
if(!node) return;
vertical_line(base-1,hd,node->left);
if(base==hd) cout<<node->data<<" ";
vertical_line(base+1,hd,node->right);
}
The above method prints the nodes that fall on line hd. It traverses the entire tree, computing the horizontal distance of every node along the way: base holds the horizontal distance of the current node. If a node is on vertical line hd, then base is equal to hd, i.e. base == hd, which is when your code prints the value of the node.
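The title asks for the vertical sum, while the code above prints the nodes of each line. As a rough sketch of how the sums could be collected in a single traversal instead of one traversal per line, the horizontal distance can be used as a key in a std::map. The node layout and the names collect and vertical_sums are assumptions made for this illustration; they are not part of the original code.

```cpp
#include <map>

// Hypothetical node type, mirroring the fields the question's code uses.
struct node {
    int data;
    node* left;
    node* right;
};

// Adds each node's value to the running sum of its vertical line.
// hd is the horizontal distance of n from the root.
void collect(const node* n, int hd, std::map<int, int>& sums) {
    if (!n) return;
    sums[hd] += n->data;
    collect(n->left, hd - 1, sums);   // moving left decrements the distance
    collect(n->right, hd + 1, sums);  // moving right increments it
}

// Returns a map from horizontal distance to the sum of the nodes on that line.
// std::map keeps its keys sorted, so iterating goes from the leftmost line
// to the rightmost one.
std::map<int, int> vertical_sums(const node* root) {
    std::map<int, int> sums;
    collect(root, 0, sums);
    return sums;
}
```

For the example tree this gives one entry per line, with distance 0 holding 3 + 4 + 5. Because the map is filled from the actual distances, it also covers lines that the two while loops would miss.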

How does this Dijkstra code return minimum value (and not maximum)?

I am solving this question on LeetCode.com called Path With Minimum Effort:
You are given heights, a 2D array of size rows x columns, where heights[row][col] represents the height of cell (row, col). The aim is to go from the top-left cell to the bottom-right cell. You can move up, down, left, or right, and you wish to find a route that requires the minimum effort. A route's effort is the maximum absolute difference in heights between two consecutive cells of the route. Return the minimum effort required to travel from the top-left cell to the bottom-right cell. For example, if heights = [[1,2,2],[3,8,2],[5,3,5]], the answer is 2 (in green).
The code I have is:
class Solution {
public:
vector<pair<int,int>> getNeighbors(vector<vector<int>>& h, int r, int c) {
vector<pair<int,int>> n;
if(r+1<h.size()) n.push_back({r+1,c});
if(c+1<h[0].size()) n.push_back({r,c+1});
if(r-1>=0) n.push_back({r-1,c});
if(c-1>=0) n.push_back({r,c-1});
return n;
}
int minimumEffortPath(vector<vector<int>>& heights) {
int rows=heights.size(), cols=heights[0].size();
using arr=array<int, 3>;
priority_queue<arr, vector<arr>, greater<arr>> pq;
vector<vector<int>> dist(rows, vector<int>(cols, INT_MAX));
pq.push({0,0,0}); //weight,r,c: arrays compare element by element, so the weight must come first for the queue to order by it
dist[0][0]=0;
//Dijkstra
while(pq.size()) {
auto [wt,r,c]=pq.top();
pq.pop();
if(wt>dist[r][c]) continue;
vector<pair<int,int>> neighbors=getNeighbors(heights, r, c);
for(auto n: neighbors) {
int u=n.first, v=n.second;
int curr_cost=abs(heights[u][v]-heights[r][c]);
if(dist[u][v]>max(curr_cost,wt)) {
dist[u][v]=max(curr_cost,wt);
pq.push({dist[u][v],u,v});
}
}
}
return dist[rows-1][cols-1];
}
};
This gets accepted, but I have two questions:
a. Since we update dist[u][v] if it is greater than max(curr_cost,wt), how does it guarantee that in the end we return the minimum effort required? That is, why don't we end up returning the effort of the one in red above?
b. Some solutions such as this one, short-circuit and return immediately when we reach the bottom right the first time (ie, if(r==rows-1 and c==cols-1) return wt;) - how does this work? Can't we possibly get a shorter dist when we revisit the bottom right node in future?
The problem statement requires that we find the path with the minimum "effort".
And "effort" is defined as the maximum difference in heights between adjacent cells on a path.
The expression max(curr_cost, wt) takes care of the maximum part of the problem statement. When moving from one cell to another, the distance to the new cell is either the same as the distance to the old cell, or it's the difference in heights, whichever is greater. Hence max(difference_in_heights, distance_to_old_cell).
And Dijkstra's algorithm takes care of the minimum part of the problem statement, where instead of using a distance from the start node, we're using the "effort" needed to get from the start node to any given node. Dijkstra's attempts to minimize the distance, and hence it minimizes the effort.
Dijkstra's has two closely related concepts: visited and explored. A node is visited when any incoming edge is used to arrive at the node. A node is explored when its outgoing edges are used to visit its neighbors. The key design feature of Dijkstra's is that after a node has been explored, additional visits to that node will never improve the distance to that node. That's the reason for the priority queue. The priority queue guarantees that the node being explored has the smallest distance of any unexplored nodes.
In the sample grid, the red path will be explored before the green path because the red path has effort 1 until the last move, whereas the green path has effort 2. So the red path will set the distance to the bottom right cell to 3, i.e. dist[2][2] = 3.
But when the green path is explored, and we arrive at the 3 at row=2, col=1, we have
dist[2][2] = 3
curr_cost=2
wt=2
So dist[2][2] > max(curr_cost, wt), and dist[2][2] gets reduced to 2.
The answers to the questions:
a. The red path does set the bottom right cell to a distance of 3, temporarily. But the result of the red path is discarded in favor of the result from the green path. This is the natural result of Dijkstra's algorithm searching for the minimum.
b. When the bottom right node is ready to be explored, i.e. it's at the head of the priority queue, then it has the best distance it will ever have, so the algorithm can stop at that point. This is also a natural result of Dijkstra's algorithm. The priority queue guarantees that after a node has been explored, no later visit to that node will reduce its distance.
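To make both answers concrete, here is a minimal sketch of the same idea with the early return from question b. It is an illustration under assumptions, not the asker's submission: the queue entries are laid out as {effort, row, col} so that the heap orders by effort, and the neighbor lookup is inlined with the offset arrays dr and dc.

```cpp
#include <algorithm>
#include <array>
#include <climits>
#include <cstdlib>
#include <functional>
#include <queue>
#include <vector>

// Minimum over all routes of the largest height step on the route.
// Assumes a non-empty rectangular grid.
int minimumEffortPath(const std::vector<std::vector<int>>& heights) {
    const int rows = heights.size(), cols = heights[0].size();
    using State = std::array<int, 3>;  // {effort, row, col}
    std::priority_queue<State, std::vector<State>, std::greater<State>> pq;
    std::vector<std::vector<int>> dist(rows, std::vector<int>(cols, INT_MAX));
    dist[0][0] = 0;
    pq.push({0, 0, 0});
    const int dr[] = {1, -1, 0, 0};
    const int dc[] = {0, 0, 1, -1};
    while (!pq.empty()) {
        auto [wt, r, c] = pq.top();
        pq.pop();
        if (wt > dist[r][c]) continue;  // stale entry: a better one was explored already
        if (r == rows - 1 && c == cols - 1) return wt;  // first exploration is final
        for (int k = 0; k < 4; ++k) {
            int u = r + dr[k], v = c + dc[k];
            if (u < 0 || u >= rows || v < 0 || v >= cols) continue;
            // Effort of reaching (u, v) through (r, c): the larger of the
            // effort so far and the height step of this move.
            int effort = std::max(wt, std::abs(heights[u][v] - heights[r][c]));
            if (effort < dist[u][v]) {
                dist[u][v] = effort;
                pq.push({effort, u, v});
            }
        }
    }
    return dist[rows - 1][cols - 1];
}
```

On the sample grid the bottom-right cell is first pushed with effort 3 by the red path, but the entry with effort 2 from the green path is popped first, so the function returns 2 and the effort-3 entry is never explored.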

If edges are not inserted in the deque in sorted order of weights, does 0-1 BFS produce the right answer?

The general trend of 0-1 BFS algorithms is: if the edge is encountered having weight = 0, then the node is pushed to the front of the deque and if the edge's weight = 1, then it will be pushed to the back of the deque.
If we push the edges randomly, can 0-1 BFS still calculate the right answer? What if the edges entered into the deque are not in sorted order of their weights?
This is the general 0-1 BFS algorithm. If I skip out the last if and else parts and randomly push the edges, then what will happen?
To me, it should work, but then why is this algorithm made in this way?
void bfs (int start)
{
std::deque<int> Q; // double ended queue
Q.push_back(start);
distance[start] = 0;
while(!Q.empty())
{
int v = Q.front();
Q.pop_front();
for(int i = 0 ; i < edges[v].size(); i++)
{
// if distance of neighbour of v from start node is greater than sum of
// distance of v from start node and edge weight between v and its
// neighbour (distance between v and its neighbour of v) ,then change it
if(distance[edges[v][i].first] > distance[v] + edges[v][i].second)
{
distance[edges[v][i].first] = distance[v] + edges[v][i].second;
// if edge weight between v and its neighbour is 0
// then push it to front of
// double ended queue else push it to back
if(edges[v][i].second == 0)
{
Q.push_front(edges[v][i].first);
}
else
{
Q.push_back(edges[v][i].first);
}
}
}
}
}
It is all a matter of performance. While random insertion still finds the shortest path, you have to consider a lot more paths (exponential in the size of the graph). So basically, the structured insertion guarantees a linear time complexity. Let's start with why the 0-1 BFS guarantees this complexity.
The basic idea is the same as the one of Dijkstra's algorithm. You visit nodes ordered by their distance from the start node. This ensures that you won't discover an edge that would decrease the distance to a node observed so far (which would require you to compute the entire subgraph again).
In 0-1 BFS, you start with the start node and the distances in the queue are just:
d = [ 0 ]
Then you consider all neighbors. If the edge weight is zero, you push it to the front, if it is one, then to the back. So you get a queue like this:
d = [ 0 0 0 1 1]
Now you take the first node. It may have neighbors over zero-weight edges and neighbors over one-weight edges. So you do the same and end up with a queue like this (new nodes are marked with *):
d = [ 0* 0* 0 0 1 1 1*]
So as you see, the nodes are still ordered by their distance, which is essential. Eventually, you will arrive at this state:
d = [ 1 1 1 1 1 ]
Going from the first node over a zero-weight edge produces a total path length of 1. Going over a one-weight edge results in two. So doing 0-1 BFS, you will get:
d = [ 1* 1* 1 1 1 1 2* 2*]
And so on... So concluding, the procedure is required to make sure that you visit nodes in order of their distance to the start node. If you do this, you will consider every edge only twice (once in the forward direction, once in the backward direction). This is because when visiting a node, you know that you cannot get to the node again with a smaller distance. And you only consider the edges emanating from a node when you visit it. So even if the node is added to the queue again by one of its neighbors, you will not visit it because the resulting distance will not be smaller than the current distance. This guarantees the time complexity of O(E), where E is the number of edges.
So what would happen if you did not visit nodes ordered by their distance from the start node? Actually, the algorithm would still find the shortest path. But it will consider a lot more paths. So assume that you have visited a node and that node is put in the queue again by one of its neighbors. This time, we cannot guarantee that the resulting distance will not be smaller. Thus, we might need to visit it again and put all its neighbors in the queue again. And the same applies to the neighbors, so in the worst case this might propagate through the entire graph and you end up visiting nodes over and over again. You will find a solution eventually because you always decrease the distance. But the time needed is far more than for the smart BFS.
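For reference, here is a self-contained sketch of the structured version, written as a function that takes the graph as a parameter instead of using globals. The name zeroOneBfs and the adjacency-list layout are assumptions made for this example.

```cpp
#include <climits>
#include <deque>
#include <utility>
#include <vector>

// adj[v] holds (neighbour, weight) pairs; every weight must be 0 or 1.
// Returns the shortest distance from start to every node (INT_MAX if unreachable).
std::vector<int> zeroOneBfs(
        const std::vector<std::vector<std::pair<int, int>>>& adj, int start) {
    std::vector<int> dist(adj.size(), INT_MAX);
    std::deque<int> q;
    dist[start] = 0;
    q.push_back(start);
    while (!q.empty()) {
        int v = q.front();
        q.pop_front();
        for (auto [to, w] : adj[v]) {
            if (dist[v] + w < dist[to]) {
                dist[to] = dist[v] + w;
                // A zero-weight edge keeps the distance unchanged, so the
                // neighbour belongs at the front; a one-weight edge adds 1,
                // so it belongs at the back. This keeps the deque sorted.
                if (w == 0) q.push_front(to);
                else q.push_back(to);
            }
        }
    }
    return dist;
}
```

Replacing the push_front/push_back choice with an arbitrary end of the deque would still end with correct distances, since a node is re-queued whenever its distance improves, but nodes could then be processed many more times.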

Calculating depth of a node of a tree with certain constraints in C++

I have a tree with 3 levels. There is a root node, the root node has 3 child nodes, and each of those 3 child nodes has 3 leaf nodes of its own. The nodes represent servers. Now, I have to calculate the depth of a node for a given level. The depth is calculated as follows:
1) If a server(node) is "up" at any level and any column, then the depth of that node is 0.
2) If a server is in the last level and is "down", then depth of that node is infinity.
3) For all other cases, the depth of the node is the max depth of its child nodes + 1. By max depth, it means the majority value that has occurred in its child nodes.
A bottom up approach is followed here and hence, the depth of the root node is the depth at level 1. The level is taken as the input parameter in the program. Now, I have to calculate the depth of the root node.
I have made some assumptions regarding the program:
1) To find child nodes, follow the child pointer of the parent node.
2) To find all nodes in a given level, traverse the child nodes from root till I reach that level and make a list of them.
3) Assign the values according to the given constraints.
I am not sure whether my approach is right or not. Please help me guys. Thank you.
I think you want something along the lines of the following pseudocode:
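// Returns 1 if the node counts as up and -1 if it counts as down.
int nodeStatus(const Node& n) {
int status = 0;
if (n.isUp)
return 1;
else if (n.isLeaf)
return -1; // a down leaf has no children to fall back on
else {
for (const Node& child : n.children) // by reference, to avoid copying subtrees
status += nodeStatus(child);
}
if (status > 0) // the majority of the children are up
return 1;
else
return -1;
}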
This is a recursive method. It first checks if the node is up, in which case it returns 1. Then, if the node is down and is a leaf, it returns -1, as this is a failure with no children. Finally, if n is an intermediate node, it recursively calls this method for all the children, summing the results. The final if statement then tests whether the majority of the children are classed as 1 or -1 and returns the value accordingly. Notice that by using the values 1 and -1 it's possible to just sum the children up. Provided each node definitely has 3 (or any odd number of) children, there will never be a situation where status == 0, which would be the case where there is no majority.
You will need to define a struct called Node somewhere that looks like this:
// requires #include <vector>
struct Node {
Node(bool isUp, bool isLeaf) : isUp(isUp), isLeaf(isLeaf) {}
bool isUp;
bool isLeaf;
std::vector<Node> children; // filled with the 3 child nodes; stays empty for a leaf
};
I hope this answers your question, but it may be that I've interpreted it wrong.
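Putting the two pieces together, a compilable sketch might look like the following. It is a simplification under assumptions: the isLeaf flag is dropped and a node with no children is treated as a leaf, and Node is a plain aggregate so that small test trees can be written inline.

```cpp
#include <vector>

struct Node {
    bool isUp;
    std::vector<Node> children;  // empty for a server in the last level
};

// Returns 1 if the node counts as up and -1 if it counts as down.
int nodeStatus(const Node& n) {
    if (n.isUp) return 1;
    if (n.children.empty()) return -1;  // a down leaf
    int status = 0;
    for (const Node& child : n.children) status += nodeStatus(child);
    return status > 0 ? 1 : -1;  // majority vote of the children
}
```

For example, a down node whose children are up, up and down sums to 1 and is reported as 1, while a down node with down, down and up children sums to -1.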

Finding Depth of Binary Tree

I am having trouble understanding this maxDepth code. Any help would be appreciated. Here is the snippet example I followed.
int maxDepth(Node *&temp)
{
if(temp == NULL)
return 0;
else
{
int lchild = maxDepth(temp->left);
int rchild = maxDepth(temp->right);
if(lchild <= rchild)
return rchild+1;
else
return lchild+1;
}
}
Basically, what I understand is that the function recursively calls itself (for the left and right child of each node) until it reaches the last node. Once it does, it returns 0, then it does 0+1. Then the previous node is 1+1, and the next one is 2+1. If there is a BST with 3 left children, int lchild will return 3, and the extra +1 is the root. So my question is: where do all these +1 come from? It returns 0 at the last node, but why does it return 0+1 etc. when it goes back up through the left/right child nodes? I know it does it, but why?
Consider this part (of a bigger tree):
A
 \
  B
Now we want to calculate the depth of this part of the tree, so we pass a pointer to A as the parameter.
Obviously pointer to A is not NULL, so the code has to:
call maxDepth for each of A's children (left and right branches). A->right is B, but A->left is obviously NULL (as A has no left branch)
compare these, choose the greatest value
return this chosen value + 1 (as A itself takes a level, doesn't it?)
Now we're going to look at how maxDepth(NULL) and maxDepth(B) are calculated.
The former is quite easy: the first check will make maxDepth return 0. If the other child were NULL too, both depths would be equal (0), and we have to return 0 + 1 for A itself.
But B is not empty; it has no branches, though, so (as we noticed) its depth is 1 (greatest of 0 for NULLs at both parts + 1 for B itself).
Now let's get back to A. maxDepth of its left branch (NULL) is 0, maxDepth of its right branch is 1. Maximum of these is 1, and we have to add 1 for A itself - so it's 2.
The point is the same steps are to be done when A is just a part of the bigger tree; the result of this calculation (2) will be used in the higher levels of maxDepth calls.
Depth is being calculated using the previous node + 1
All the ones come from this part of the code:
if(lchild <= rchild)
return rchild + 1;
else
return lchild + 1;
Each node adds +1 for itself to the result obtained from its subtrees, starting at the leaves of the tree. These ones keep adding up until you exit all the recursive calls of the function and get back to the root node.
Remember in binary trees a node has at most 2 children (left and right)
It is a recursive algorithm, so it calls itself over and over.
If temp (the node being looked at) is null, it returns 0, as this node is nothing and should not count. That is the base case.
If the node being looked at is not null, it may have children. So it gets the max depth of the left subtree and of the right subtree, compares the two, and returns the greater of the two plus 1 for the level of the current node.
It dives down into the two subtrees (temp->left and temp->right) and repeats the operation until it reaches nodes without children. At that point it will call maxDepth on left and right, which will be null and return 0, and then start returning back up the chain of calls.
So if you have a chain of three nodes (say, root-left1-left2), it will get down to left2 and call maxDepth(left) and maxDepth(right). Each of those returns 0 (they are null). Then it is back at left2: it compares, both are 0, so the greater of the two is of course 0, and it returns 0+1. Then we are at left1, which repeats the process and finds that 1 is the greater of its left and right results (it has no right child in this chain), so it returns 1+1. Now we are at root, where the same thing happens: it returns 2+1 = 3, which is the depth.
Because the depth is calculated with previous node+1
To find the maximum depth of a binary tree, traverse the whole tree, basically performing a DFS
or
We can find the depth of the binary search tree in three different recursive ways
– using instance variables to record current depth and total depth at every level
– without using instance variables in top-bottom approach
– without using instance variables in bottom-up approach
The code snippet can be reduced to just:
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return 0;
}
A good way of looking at this code is from the top down:
What would happen if the BST had no nodes? We would have root = NULL and the function would immediately return an expected depth of 0.
Now suppose the tree was populated with a number of nodes. Starting at the top, the if condition would be true for the root node. We then ask, what is the max depth of the LEFT SUB TREE and the RIGHT SUB TREE by passing the root of those sub trees to maxDepth. Both the LST and the RST of the root are one level deeper than the root, so we must add one to get the depth of the tree at root of the tree passed to the function.
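As a small self-contained sketch of the reduced version, the following can be compiled and tried on the trees discussed in this thread. The Node struct with only left and right pointers is an assumption made for the example.

```cpp
#include <algorithm>

// Hypothetical minimal node: only the links matter for the depth.
struct Node {
    Node* left = nullptr;
    Node* right = nullptr;
};

// Depth counted in nodes: an empty tree has depth 0, a single node depth 1.
int maxDepth(const Node* root) {
    if (!root) return 0;  // base case: nothing here, contributes no level
    // One level for this node, plus the deeper of the two subtrees.
    return 1 + std::max(maxDepth(root->left), maxDepth(root->right));
}
```

For the two-node tree A with right child B, the calls return 0 for each null pointer, 1 for B and 2 for A. For a chain root-left1-left2 the results are 1, 2 and 3 on the way back up.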
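I think this is the right answer if depth is counted in edges rather than in nodes, so that an empty tree has depth -1 and a single node has depth 0: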
int maxDepth(Node *root){
if(root){ return 1 + max( maxDepth(root->left), maxDepth(root->right)); }
return -1;
}

Confused about definition of a 'median' when constructing a kd-Tree

I'm trying to build a kd-tree for searching through a set of points, but am getting confused about the use of 'median' in the Wikipedia article. For ease of use, the Wikipedia article states the pseudo-code of kd-tree construction as:
function kdtree (list of points pointList, int depth)
{
if pointList is empty
return nil;
else
{
// Select axis based on depth so that axis cycles through all valid values
var int axis := depth mod k;
// Sort point list and choose median as pivot element
select median by axis from pointList;
// Create node and construct subtrees
var tree_node node;
node.location := median;
node.leftChild := kdtree(points in pointList before median, depth+1);
node.rightChild := kdtree(points in pointList after median, depth+1);
return node;
}
}
I'm getting confused about the "select median..." line, simply because I'm not quite sure what is the 'right' way to apply a median here.
As far as I know, the median of an odd-sized (sorted) list of numbers is the middle element (aka, for a list of 5 things, element number 3, or index 2 in a standard zero-based array), and the median of an even-sized array is the sum of the two 'middle' elements divided by two (aka, for a list of 6 things, the median is the sum of elements 3 and 4 - or 2 and 3, if zero-indexed - divided by 2.).
However, surely that definition does not work here as we are working with a distinct set of points? How then does one choose the correct median for an even-sized list of numbers, especially for a length 2 list?
I appreciate any and all help, thanks!
-Stephen
It appears to me that you understand the meaning of median, but you are confused about something else. What do you mean by a distinct set of points?
The code presented by Wikipedia is a recursive function. You have a set of points, so you create a root node and choose a median of the set. Then you call the function recursively - for the left subtree you pass in a parameter with all the points smaller than the split-value (the median) of the original list, for the right subtree you pass in the equal and larger ones. Then for each subtree a node is created where the same thing happens. It goes like this:
First step (root node):
Original set: 1 2 3 4 5 6 7 8 9 10
Split value (median): 5.5
Second step - left subtree:
Set: 1 2 3 4 5
Split value (median): 3
Second step - right subtree:
Set: 6 7 8 9 10
Split value (median): 8
Third step - left subtree of left subtree:
Set: 1 2
Split value (median): 1.5
Third step - right subtree of left subtree:
Set: 3 4 5
Split value (median): 4
Etc.
So the median is chosen for each node in the tree based on the set of numbers (points, data) which go into that subtree. Hope this helps.
You have to choose a splitting axis with as many elements on one side as on the other. If the number of points is odd, or the points are positioned in such a way that this isn't possible, just choose an axis that gives as even a distribution as possible.
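If the node has to store one of the input points, as node.location := median in the pseudo-code suggests, one common way to handle even-sized lists is to skip the averaging and take the element at index n/2 of the list ordered along the current axis. The following is a sketch under that assumption; K, Point, KdNode and build are names invented for this example, and std::nth_element is used in place of a full sort.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <memory>
#include <vector>

constexpr int K = 2;  // number of dimensions (assumed for this example)
using Point = std::array<double, K>;

struct KdNode {
    Point location;
    std::unique_ptr<KdNode> left, right;
};

// Builds a kd-tree over pts[begin, end), reordering the points in place.
std::unique_ptr<KdNode> build(std::vector<Point>& pts, std::size_t begin,
                              std::size_t end, int depth) {
    if (begin >= end) return nullptr;
    int axis = depth % K;
    // Index n/2 of the range: the middle element for odd sizes, the upper
    // of the two middle elements for even sizes. No averaging is needed.
    std::size_t mid = begin + (end - begin) / 2;
    std::nth_element(pts.begin() + begin, pts.begin() + mid, pts.begin() + end,
                     [axis](const Point& a, const Point& b) {
                         return a[axis] < b[axis];
                     });
    auto node = std::make_unique<KdNode>();
    node->location = pts[mid];
    node->left = build(pts, begin, mid, depth + 1);     // points before the median
    node->right = build(pts, mid + 1, end, depth + 1);  // points after the median
    return node;
}
```

With this choice a list of length 2 puts its second point (along the axis) in the node and the first point in the left child, leaving the right child empty. Picking the lower middle element instead would work equally well; the tree only needs the split to be as even as possible.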