I have solved quite a few questions related to trees, however, I still don't feel confident about one particular aspect of trees (recursion in general):
How do you propagate values from the leaf to the root?
For example, consider we have a binary tree wherein we have to find the root to leaf path with the minimum sum. For the tree image here, the sum would be 7 (corresponding to two paths 0-3-2-1-1 or 0-6-1).
I wrote the following code:
struct Node
{
int cost;
vector<Node *> children;
Node *parent;
};
int getCheapestCost( Node *rootNode )
{
if(!rootNode) return 0;
return dfs(rootNode, INT_MAX, 0);
}
int dfs(Node* rootNode, int minVal, int currVal) {
if(!rootNode) return;
currVal+=rootNode->cost;
if(rootNode->children.empty()) {
minVal = min(minVal, currVal);
return minVal;
}
for(auto& neighbor: rootNode->children) {
dfs(neighbor, minVal, currVal);
}
return currVal; //this is incorrect, but what should I return?
}
I know the last return currVal is incorrect - but then what should I return? Technically, I only want to return the value of minVal when I reach the leaf nodes (and no value when I am at the intermediate nodes). So, how do I propagate the minVal from the leaf nodes to the topmost root node?
P.S.: I am preparing for interviews and this is a big pain area for me since I get stuck at this point almost every time. I would highly appreciate any help. Thanks.
Edit: For this particular one, I somehow wrote a solution using pass by reference.
Inside your for loop, take the minimum over the values returned by all the children, and return minVal instead of currVal.
for(auto& neighbor: rootNode->children) {
minVal = min(minVal, dfs(neighbor, minVal, currVal));
}
return minVal;
That way you're always returning the minVal, through the recursion all the way to the first call.
Edit: Explanation
I'll use the tree you provided in your question as an example. We'll start by entering the tree at the root(0). It'll add 0 to the currVal, won't enter the first if, then enter the for. Once it's there, the function will be called again, from the first child.
At the first node (5), it'll add that value, check if it's the end, and go to the next node (4), adds again, currVal is now 9. Then, since (4) has no children, it'll return min(currVal, minVal). At this point, minVal is INT_MAX, so it returns 9.
Once this value is returned, we go back to the function that called it, which was at node(5), exactly at the point when we called (4), and we'll (with my modification) compare whichever value it returned with minVal.
min(minVal, dfs(neighbor, minVal, currVal))
At this point, it's important to notice that the current minVal is still INT_MAX, as it's not a reference, and this was the value passed to the function. And as a result, we now set it to 9.
If (5) had other children, we would now enter a new instance of dfs for each of them and, once we had a result, compare that value with 9. Since it doesn't, we end the for loop and return minVal, going back to the root node (0).
From there, I believe you can guess what happens, we enter node(3) which branches to (2)->(1)->(1) and (0)->(10), returning 7 and 13 to the for loop respectively, and node (6) will finally also return 7 to (0)'s for loop.
In the end, (0) will first compare INT_MAX with 9, then with 7 and finally with 7 again, returning 7 to getCheapestCost.
In short:
Your code will keep entering dfs until it finds a node without children, once that happens, it'll return the minVal it got from that node, and return to the function that called it, which is the parent node.
Once in the parent node, you need to check which child provided the minimum minVal, by comparing each result with your previous minVal (from other children, branches or INT_MAX). After checking all children, minVal is returned to the next parent, which does the same with its own children, and so on until the root node is reached.
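A variation on the same idea, in case it helps: instead of carrying currVal and minVal down as parameters, each call can simply return the cheapest cost of its own subtree, and the parent adds its own cost on top. This is only a sketch under a few assumptions: the Node struct is trimmed to the two fields that matter (the parent pointer is dropped), and the helper name cheapest is made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <vector>

// Trimmed-down version of the Node from the question (no parent pointer).
struct Node {
    int cost;
    std::vector<Node*> children;
};

// Hypothetical helper: cheapest root-to-leaf sum of the subtree rooted at node.
// Assumes node is not null.
int cheapest(const Node* node) {
    // A leaf is the end of a path, so its own cost is the answer.
    if (node->children.empty()) return node->cost;

    // Otherwise take the cheapest child subtree and add this node's cost.
    int best = INT_MAX;
    for (const Node* child : node->children) {
        best = std::min(best, cheapest(child));
    }
    return node->cost + best;
}

int getCheapestCost(const Node* rootNode) {
    return rootNode ? cheapest(rootNode) : 0;
}
```

The value travels from the leaves to the root purely through return values, so there is nothing to pass by reference and no question of what an intermediate node should return.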
I am trying to find a way to output the amount of most left nodes found in a path.
For example:
The max nodes in this Binary Search Tree would be 2 (Goes from 5 ->3->1 and excluding the root).
What is the best way to approach this?
I have seen this thread which is fairly similar to what I am trying to achieve.
Count number of left nodes in BST
but there is like one line in the code that I don't understand.
count += countLeftNodes(overallRoot.left, count++);
overallRoot.left
My guess is that it calls a function on the object, but I can't figure out what goes into that function and what it would return.
Any answers to these two questions would be appreciated.
The answer you linked shows how to traverse the tree, but you need a different algorithm to get your count, since as you have noted, that question is trying to solve a slightly different problem.
At any given point in the traversal, you will have the current left count: this will be passed down the tree as a second parameter to countLeftNodes(). That starts with zero at the root, and is increased by one whenever you go into the left child of a node, but is set to zero when you enter the right node.
Then for both the left and right traversals, you set the left count to the greater of its current value, and the return from the recursive call to countLeftNodes(). And then this final value is what you return from countLeftNodes()
Here's a shot at the algorithm @dgnuff illustrated:
void maxLeftNodesInPath(Node *root, int count, int *best) {
if (root) {
maxLeftNodesInPath(root->left, ++count, best);
maxLeftNodesInPath(root->right, 0, best);
}
else if (count > *best) {
*best = count - 1;
}
}
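The explanation is pretty much the same: keep accumulating a count while traversing left, reset it when moving to a right child, and update the best once you run off the bottom of the tree (a null child). By then count has been incremented once too often, hence the count - 1. Call it with count set to 0 and *best initialized to 0.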
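For comparison, here is a rough sketch of the return-value flavour described in the first answer, where each call hands the best count back to its caller instead of writing through a pointer. The Node layout and the name maxLeftRun are assumptions made for this example only.

```cpp
#include <algorithm>
#include <cassert>

// Assumed minimal node layout for the example.
struct Node {
    int value;
    Node* left;
    Node* right;
};

// run is the number of consecutive left steps taken to arrive at node.
// Returns the longest such run found anywhere in this subtree.
int maxLeftRun(const Node* node, int run) {
    if (node == nullptr) return 0;

    int best = run;
    // Going left extends the current run by one.
    best = std::max(best, maxLeftRun(node->left, run + 1));
    // Going right starts a new run from zero.
    best = std::max(best, maxLeftRun(node->right, 0));
    return best;
}
```

Start it with maxLeftRun(root, 0); for a tree whose longest left chain is 5 -> 3 -> 1 this should give 2, since the root itself is not counted.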
I get a segmentation fault in the call to
auto n1=std::make_shared<Node>(n,n->x+i,n->y+j);
after a few recursive calls. Strange thing is that it's always at the same point in time. Can anyone spot the problem?
This is an implementation for a dynamic programming problem and here I'm accumulating the costs of a path. I have simplified the cost function but in this example the problem still occurs.
void HorizonLineDetector::dp(std::shared_ptr<Node> n)
{
n->cost= 1 + n->prev->cost;
//Check if we reached the last column(done!)
if (n->x==current_edges.cols-1)
{
//Save the info in the last node if it's the cheapest path
if (last_node->cost > n->cost)
{
last_node->cost=n->cost;
last_node->prev=n;
}
}
else
{
//Check for neighboring pixels to see if they are edges, launch dp with all the ones that are
for (int i=0;i<2;i++)
{
for (int j=-1;j<2;j++)
{
if (i==0 && j==0) continue;
if (n->x+i >= current_edges.cols || n->x+i < 0 ||
n->y+j >= current_edges.rows || n->y+j < 0) continue;
if (current_edges.at<char>(n->y+j,n->x+i)!=0)
{
auto n1=std::make_shared<Node>(n,n->x+i,n->y+j);
//n->next.push_back(n1);
nlist.push_back(n1);
dp(n1);
}
}
}
}
}
class Node
{
public:
Node(){}
Node(std::shared_ptr<Node> p,int x_,int y_){prev=p;x=x_;y=y_;lost=0;}
Node(Node &n1){x=n1.x;y=n1.y;cost=n1.cost;lost=n1.lost;prev=n1.prev;}//next=n1.next;}
std::shared_ptr<Node> prev; //Previous and next nodes
int cost; //Total cost until now
int lost; //Number of steps taken without a clear path
int x,y;
Node& operator=(const Node &n1){x=n1.x;y=n1.y;cost=n1.cost;lost=n1.lost;prev=n1.prev;return *this;}//next=n1.next;}
Node& operator=(Node &&n1){x=n1.x;y=n1.y;cost=n1.cost;lost=n1.lost;prev=n1.prev;n1.prev=nullptr;return *this;}//next=n1.next;n1.next.clear();}
};
Your code looks like a pathological path search: it checks almost every path, and it doesn't keep track of nodes it has already visited, even though many of them can be reached in more than one way.
This builds a recursion depth equal to the length of the longest path, then the next longest path, and so on down to the shortest one. I.e., something like O(# of pixels) depth.
This is bad. And, as call stack depth is limited, it will crash on you.
The easy solution is to modify dp into dp_internal, and have dp_internal return a vector of nodes to process next. Then write dp, which calls dp_internal and repeats on its return value.
std::vector<std::shared_ptr<Node>>
HorizonLineDetector::dp_internal(std::shared_ptr<Node> n)
{
std::vector<std::shared_ptr<Node>> retval;
...
if (current_edges.at<char>(n->y+j,n->x+i)!=0)
{
auto n1=std::make_shared<Node>(n,n->x+i,n->y+j);
//n->next.push_back(n1);
nlist.push_back(n1);
retval.push_back(n1);
}
...
return retval;
}
then dp becomes:
void HorizonLineDetector::dp(std::shared_ptr<Node> n)
{
std::vector<std::shared_ptr<Node>> nodes={n};
while (!nodes.empty()) {
auto node = nodes.back();
nodes.pop_back();
auto new_nodes = dp_internal(node);
nodes.insert(nodes.end(), new_nodes.begin(), new_nodes.end());
}
}
But (A) this will probably still crash once the number of queued-up nodes gets ridiculously large, and (B) it only patches over the recursion-causes-crash problem; it doesn't make your algorithm suck less.
Use A*.
This involves keeping track of which nodes you have visited and what nodes to process next with their current path cost.
You then use heuristics to figure out which of the ones to process next you should check first. If you are on a grid of some sort, the heuristic is to use the shortest possible distance if nothing was in the way.
Add the cost to get to the node-to-process, plus the heuristic distance from that node to the destination. Find the node-to-process that has the least total. Process that one: you mark it as visited, and add all of its adjacent nodes to the list of nodes to process.
Never add a node to the list of nodes to process that you have already visited (as that is redundant work).
Once you have a solution, prune from the list of nodes to process any node whose current path cost is greater than or equal to your solution. If you know your heuristic never overestimates (that is, it is impossible to reach the destination for less than the heuristic says), you can even prune based on the total of heuristic and current cost. Similarly, don't add a node to the list of nodes to process if it would be pruned by this paragraph.
The result is that your algorithm searches in a relatively straight line towards the target, and then expands outwards trying to find a way around any barriers. If there is a relatively direct route, it is used and the rest of the universe isn't even touched.
There are many optimizations on A* you can do, and even alternative solutions that don't rely on heuristics. But start with A*.
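To make the steps above concrete, here is a minimal A* sketch on a plain grid. It is not tailored to the horizon-line problem: the Grid type, the 4-connected moves with unit cost, and the function name aStar are all assumptions for illustration, and the start and target are assumed to lie inside the grid. The heuristic is the Manhattan distance, which never overestimates under those assumptions.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical grid: grid[y][x] != 0 means the cell can be entered.
using Grid = std::vector<std::vector<int>>;

// Returns the number of steps on a shortest path from (sx, sy) to (tx, ty),
// or -1 if the target cannot be reached.
int aStar(const Grid& grid, int sx, int sy, int tx, int ty) {
    const int rows = static_cast<int>(grid.size());
    const int cols = rows ? static_cast<int>(grid[0].size()) : 0;
    if (rows == 0 || cols == 0 || !grid[sy][sx] || !grid[ty][tx]) return -1;

    // Heuristic: distance to the target if nothing were in the way.
    auto h = [&](int x, int y) { return std::abs(x - tx) + std::abs(y - ty); };

    // Entries are (cost so far + heuristic, cost so far, x, y).
    // std::greater turns the queue into a min-heap on the total.
    using Entry = std::tuple<int, int, int, int>;
    std::priority_queue<Entry, std::vector<Entry>, std::greater<Entry>> open;
    std::vector<std::vector<bool>> visited(rows, std::vector<bool>(cols, false));
    open.emplace(h(sx, sy), 0, sx, sy);

    const int dx[] = {1, -1, 0, 0};
    const int dy[] = {0, 0, 1, -1};
    while (!open.empty()) {
        const Entry e = open.top();
        open.pop();
        const int cost = std::get<1>(e);
        const int x = std::get<2>(e);
        const int y = std::get<3>(e);

        if (visited[y][x]) continue;  // stale duplicate of a processed node
        visited[y][x] = true;
        if (x == tx && y == ty) return cost;

        for (int d = 0; d < 4; ++d) {
            const int nx = x + dx[d];
            const int ny = y + dy[d];
            if (nx < 0 || ny < 0 || nx >= cols || ny >= rows) continue;
            // Never queue blocked cells or cells that were already processed.
            if (!grid[ny][nx] || visited[ny][nx]) continue;
            open.emplace(cost + 1 + h(nx, ny), cost + 1, nx, ny);
        }
    }
    return -1;
}
```

The loop replaces the recursion entirely, so the call stack stays flat no matter how long the path gets; recovering the path itself would additionally need a predecessor stored per visited cell, much like the prev pointer in the question.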
I have a binary tree and a method for the size of the longest path (the diameter):
int diameter(struct node * tree)
{
if (tree == 0)
return 0;
int lheight = height(tree->left);
int rheight = height(tree->right);
int ldiameter = diameter(tree->left);
int rdiameter = diameter(tree->right);
return max(lheight + rheight + 1, max(ldiameter, rdiameter));
}
I want the function to return also the exact path (list of all the nodes of the diameter).
How can I do it?
Thanks
You have two options:
A) Think.
B) Search. Among the first few google hits you can find this: http://login2win.blogspot.hu/2012/07/print-longest-path-in-binary-tree.html
Choose A) if you want to learn, choose B) if you do not care, only want a quick, albeit not necessarily perfect solution.
There are many possible solutions, some of them:
1. In a divide and conquer approach you will probably end up with maintaining the so far longest paths on both sides, and keep only the longer.
2. The quoted solution does two traversals, one for determining the diameter, and the second for printing. This is a nice trick to overcome the problem of not knowing whether we are at the deepest point in approach 1.
3. Instead of a depth first search, do a breadth first one. Use a queue. Proceed level by level, for each node storing the parent. When you reach the last level (no children added to queue), you can print the whole path easily, because the last printed node is on (one) longest path, and you have the parent links.
Add a property struct node * next to the node struct. Before the return statement, add a line like this tree->next = (ldiameter > rdiameter ? tree->left : tree->right) to get the longer path node as the next node. After calling diameter(root), you should be able to iterate through all of the next nodes from the root to print the largest path.
I think the following may work... compute the diameter as follows in O(N) time.
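// C++ code
int findDiameter(node *root, int &max_length, node* &max_dia_node, int parent[], node* parent_of_root){
    if(!root) return 0;
    // Remember the parent of every node; the root has none, so -1 is
    // stored for it (the marker value is an assumption).
    parent[root->val] = parent_of_root ? parent_of_root->val : -1;
    // The recursive calls need the full argument list, with root as the parent.
    int left = findDiameter(root->left, max_length, max_dia_node, parent, root);
    int right = findDiameter(root->right, max_length, max_dia_node, parent, root);
    // The longest path through this node has left + right + 1 nodes.
    if(left+right+1 > max_length){
        max_dia_node = root;
        max_length = left+right+1;
    }
    // Height of this subtree, counted in nodes.
    return 1 + max(left,right);
}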
A number of things happen in this function. First, max_length ends up holding the diameter of the tree. Along with that, max_dia_node is set to the node at which that maximum was found.
This is the node that the longest path passes through.
Using this information, we can find the deepest node in the left subtree and in the right subtree of max_dia_node. From those two nodes we can recover the actual nodes on the path via the "parent" array.
That makes two traversals of the tree in total.
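Here is one way the two-pass idea could look when written out in full, as a sketch rather than a definitive answer. Instead of a parent array it rebuilds the two arms of the path with a second helper; the node layout and the names heightAndTop, deepestPath and diameterPath are all made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Assumed node layout for the example.
struct node {
    int val;
    node* left;
    node* right;
};

// Pass 1: returns the height (in nodes) of the subtree and records in top
// the node through which the longest path passes.
int heightAndTop(node* t, int& best, node*& top) {
    if (!t) return 0;
    int l = heightAndTop(t->left, best, top);
    int r = heightAndTop(t->right, best, top);
    if (l + r + 1 > best) {
        best = l + r + 1;
        top = t;
    }
    return 1 + std::max(l, r);
}

// Pass 2: values on the deepest chain below t, listed from the leaf up to t.
std::vector<int> deepestPath(const node* t) {
    if (!t) return {};
    std::vector<int> l = deepestPath(t->left);
    std::vector<int> r = deepestPath(t->right);
    std::vector<int>& longer = l.size() >= r.size() ? l : r;
    longer.push_back(t->val);
    return std::move(longer);
}

// Values along one longest path, from a leaf on the left side of top,
// through top, down to a leaf on its right side.
std::vector<int> diameterPath(node* root) {
    int best = 0;
    node* top = nullptr;
    heightAndTop(root, best, top);
    if (!top) return {};

    std::vector<int> path = deepestPath(top->left);     // leaf ... top->left
    path.push_back(top->val);
    std::vector<int> right = deepestPath(top->right);   // leaf ... top->right
    path.insert(path.end(), right.rbegin(), right.rend());
    return path;
}
```

When several paths tie for the longest, this returns just one of them; which one depends on the tie-breaking in the two comparisons.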
I'm trying to create a function that finds the average of some data within the nodes of a tree. The problem is, every node contains two pieces of data and unlike other BSTs, the primary data from which it is built is a string. Finding the average of number-based elements in a tree isn't an issue for me, but since each node contains a string (a person's name) and a seemingly random number (the weight of said person), the tree is actually in complete disarray, and I have no idea how to deal with it.
Here is my node so you see what I mean:
struct Node {
string name;
double weight;
Node* leftChild;
Node* rightChild;
};
Node* root;
Here's the function during one of its many stages:
// This isn't what I'm actually using so don't jump to conclusions
double nameTree::averageWeight(double total, double total, int count) const
{
if (parent != NULL)
{ //nonsense, nonsense
averageWeight(parent->leftChild, total, count);
averageWeight(parent->rightChild, total, count);
count++;
total = total + parent->weight;
return total;
}
return (total / count);
}
In an effort to traverse the tree, I tried some recursion but every time I manage to count and total everything, something gets screwey and it ends up doing return(total/count) each time. I've also tried an array implementation by traversing the tree and adding the weights to the array, but that didn't work because the returns and recursion interfered, or something.
And just because I know someone is going to ask, yes, this is for a school assignment. However, this is one out of like, 18 functions in a class so it's not like I'm asking anyone to do this for me. I've been on this one function for hours now and I've been up all night and my brain hurts so any help would be vastly appreciated!
You could try something like:
//total number of tree nodes
static int count=0;
// Calculate the total sum of the weights in the tree
double nameTree::calculateWeight(Node *parent)
{
    double total=0;
    if (parent != NULL)
    {
        //Calculate total weight for left sub-tree
        total+=calculateWeight(parent->leftChild);
        //Calculate weight for right sub-tree
        total+=calculateWeight(parent->rightChild);
        //add current node weight
        total+=parent->weight;
        //count only real nodes, not the NULL children
        count++;
    }
    //an empty sub-tree contributes 0
    return total;
}
//assumes averageWeight and root are members of nameTree
double nameTree::averageWeight()
{
    //reset the counter so repeated calls do not accumulate
    count=0;
    double weightSum=calculateWeight(root);
    if(count!=0)
        return (weightSum/count);
    else
    {
        cout<<"The tree is empty";
        return 0;
    }
}
I don't have a compiler here but I believe it works.
To calculate the average you need two numbers: the total value and the number of elements in the set. You need to provide a function (recursive is probably the simplest) that will walk the tree and either return a pair<double,int> with those values or else modify some argument passed as reference to store the two values.
As for your code, averageWeight returns a double, but when you call it recursively you are ignoring (discarding) the result. The count argument is passed by copy, which means that the modifications applied in the recursive calls will not be visible to the caller (which then does not know how much parent->weight should weigh towards the result).
This should be enough to get you started.
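If you want to see the pair<double,int> variant spelled out, something along these lines should do. It is a sketch using free functions rather than nameTree members, and the name sumAndCount is made up; the Node struct is the one from the question.

```cpp
#include <cassert>
#include <string>
#include <utility>

struct Node {
    std::string name;
    double weight;
    Node* leftChild;
    Node* rightChild;
};

// Returns {sum of weights, number of nodes} for the subtree rooted at n.
std::pair<double, int> sumAndCount(const Node* n) {
    if (n == nullptr) return {0.0, 0};  // empty subtree: nothing to add
    std::pair<double, int> l = sumAndCount(n->leftChild);
    std::pair<double, int> r = sumAndCount(n->rightChild);
    return {l.first + r.first + n->weight, l.second + r.second + 1};
}

// The division happens exactly once, at the top, never inside the recursion.
double averageWeight(const Node* root) {
    std::pair<double, int> totals = sumAndCount(root);
    return totals.second == 0 ? 0.0 : totals.first / totals.second;
}
```

Note that the ordering of the tree by name does not matter here: every node is visited once regardless of where it sits.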
I'm having a problem with a pointer and can't get around it..
In a hash table implementation, I have a list of ordered nodes in each bucket. The problem is in the insert function, in the comparison that checks whether the next node is greater than the value being inserted (so that the new node goes in at that position and the order is kept).
You might find this hash implementation strange, but I need to be able to do tons of lookups (but sometimes also very few) and count the number of repetitions if a value is already inserted. So I need fast lookups, thus the hash. I've thought about self-balancing trees such as AVL or red-black trees, but I don't know them, so I went with the solution I knew how to implement. Are they faster for this type of problem? I also need to retrieve the values in order when I've finished.
Before, I had a simple list, and I'd retrieve the array and then do a quicksort, but I think I might be able to improve things by keeping the lists ordered.
What I have to map is a 27-bit unsigned int (more exactly, three 9-bit numbers, which I convert to a 27-bit number with (Sr << 18 | Sg << 9 | Sb)), and that value doubles as the hash_value. If you know a good function to map that 27-bit int to a 12-, 13- or 14-bit table, let me know; I currently just do the typical mod-prime solution.
This is my hash_node struct:
class hash_node {
public:
unsigned int hash_value;
int repetitions;
hash_node *next;
hash_node( unsigned int hash_val,
hash_node *nxt);
~hash_node();
};
And this is the source of the problem
void hash_table::insert(unsigned int hash_value) {
unsigned int p = hash_value % tableSize;
if (table[p]!=0) { //The bucket has some elements already
hash_node *pred; //node to keep the last valid position on the list
for (hash_node *aux=table[p]; aux!=0; aux=aux->next) {
pred = aux; //last valid position
if (aux->hash_value == hash_value ) {
//It's already inserted, so we increment it repetition counter
aux->repetitions++;
} else if (hash_value < (aux->next->hash_value) ) { //The problem
//If the next one is greater than the one to insert, we
//create a node in the middle of both.
aux->next = new hash_node(hash_value,aux->next);
colisions++;
numElem++;
}
}//We have arrived at the end of the list without luck, so we insert it after
//the last valid position
pred->next = new hash_node(hash_value,0);
colisions++;
numElem++;
}else { //bucket it's empty, insert it right away.
table[p] = new hash_node(hash_value, 0);
numElem++;
}
}
This is what gdb shows:
Program received signal SIGSEGV, Segmentation fault.
0x08050b4b in hash_table::insert (this=0x806a310, hash_value=3163181) at ht.cc:132
132 } else if (hash_value < (aux->next->hash_value) ) {
Which effectively indicates I'm comparing a memory address with a value, right?
Hope it was clear. Thanks again!
aux->next->hash_value
There's no check whether "next" is NULL.
aux->next might be NULL at that point? I can't see where you have checked whether aux->next is NULL.
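For what it's worth, one way to write the per-bucket insert so that it never dereferences a null next is to walk a pointer to the link rather than a pointer to the node. The sketch below is standalone and makes some assumptions: hash_node is reduced to a plain struct, repetitions is assumed to start at 1, and the function name sorted_insert is made up; the counters from the question are left out.

```cpp
#include <cassert>

// Reduced version of the hash_node from the question.
struct hash_node {
    unsigned int hash_value;
    int repetitions;
    hash_node* next;
};

// Inserts value into the ascending list starting at head and returns the
// (possibly new) head. An existing value only gets its counter bumped.
hash_node* sorted_insert(hash_node* head, unsigned int value) {
    // link points at the pointer that should end up referring to the node
    // holding value: first the head pointer, later some node's next field.
    hash_node** link = &head;
    while (*link != nullptr && (*link)->hash_value < value) {
        link = &(*link)->next;
    }

    if (*link != nullptr && (*link)->hash_value == value) {
        (*link)->repetitions++;                    // already present
    } else {
        *link = new hash_node{value, 1, *link};    // splice in before *link
    }
    return head;
}
```

Because every dereference is guarded by the null check in the same condition, inserting at the front, in the middle and at the end all go through the same code path. In the hash table this would be used as table[p] = sorted_insert(table[p], hash_value).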