Pre-Order Traversals Of All Possible Binary Trees Given In-Order Traversal - c++

I was looking at questions on binary trees, and I came across the following:
Given the in-order traversal of a binary tree, print the pre-order traversals of all possible binary trees satisfying the given in-order traversal.
For example, if the in-order traversal is: {4, 5, 7}
The possible trees are:
4         4           5           7           7
 \         \         / \         /           /
  5         7       4   7       4           5
   \       /                     \         /
    7     5                       5       4
Therefore, the pre-order traversals are:
4 5 7
4 7 5
5 4 7
7 4 5
7 5 4
The solution I came up with:
Traverse the given in-order list. Upon each iteration, select an element from the list and make it the root of our tree. All elements preceding the current one will be part of the left subtree and all elements succeeding it will form the right subtree. We can then recursively do the same for the left and right subtrees.
For instance, in the above example, I begin by selecting 4 as the root of my tree. Now since there are no elements preceding 4, I cannot have a left subtree. I look at the remaining elements. They will form the right subtree. I select 5 to be the root of this subtree. For this, I am only left with one choice: to construct the right subtree of 5 from 7. That gives the first tree of the example.
Now, I keep 4 as the root, and instead of selecting 5, I select 7 as the root of the right subtree of 4. This leads me to the second tree of the example above.
That much is fine. The problem comes up with the code. I've spent quite some time translating the above solution into code, but I haven't been completely successful.
This is what I've tried in C++:
void solve(vector<int> inOrder, int beg, int end, string &s, bool &flag)
{
    for(int i = beg; i <= end; ++i)
    {
        s += to_string(inOrder[i]);
        flag = false;
        solve(inOrder, beg, i - 1, s, flag);
        solve(inOrder, i + 1, end, s, flag);
        if(s.size() == inOrder.size()) {flag = true; cout << s << endl;}
        if(s.size() && flag) {s.pop_back();}
    }
}
I use a string to store the current permutation of elements in the pre-order traversal. Elements are appended to the string when a permutation satisfies the in-order traversal.
Naturally, elements must be subsequently removed from the string to make way for other permutations. However, I haven't been able to figure out when to remove an element. In the above code, I append an element the first time it is encountered, and I start removing elements when the string has size equal to that of the in-order list.
So, let's say I begin with 4. I append 4 to the string. I do the same for 5 and then 7. Now the string size equals the total number of elements. So I remove the last one. My string is now 45. Since there are no more possible combinations with the current string, I remove 5. I'm left with 4. Now, I can append 7 and then 5, leading to 475. This works fine in this case, but I haven't been able to make it work for other combinations. It fails when I begin by making 5 the root instead of 4.
So my question is, how exactly should I proceed to solve the above problem? Am I even doing it the right way? Or should I give up this approach and think of something else? If yes, what direction should I proceed in?
I'm not looking for an exact solution, only a hint as to what I'm missing or what I could do better.
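One possible direction, sketched under some assumptions: instead of threading a single shared string through the recursion and trying to decide when to pop, let each call return the list of pre-order traversals for its own range, and combine every left result with every right result. The function name preOrders and the space-separated string format (with a trailing space) are illustrative choices, not part of the original attempt.

```cpp
#include <string>
#include <vector>

// Returns every pre-order traversal (as a space-separated string) of the
// binary trees whose in-order traversal is inOrder[beg..end].
std::vector<std::string> preOrders(const std::vector<int>& inOrder, int beg, int end)
{
    // An empty range has exactly one tree (the empty one), whose
    // traversal is the empty string.
    if (beg > end) return std::vector<std::string>(1, std::string());

    std::vector<std::string> result;
    for (int i = beg; i <= end; ++i) {
        // inOrder[i] is the root; everything before it forms the left
        // subtree and everything after it forms the right subtree.
        const std::vector<std::string> lefts = preOrders(inOrder, beg, i - 1);
        const std::vector<std::string> rights = preOrders(inOrder, i + 1, end);
        for (const std::string& l : lefts)
            for (const std::string& r : rights)
                result.push_back(std::to_string(inOrder[i]) + " " + l + r);
    }
    return result;
}
```

Calling preOrders(inOrder, 0, inOrder.size() - 1) for {4, 5, 7} and printing each returned string should give the five traversals listed in the question. Because every call builds its own strings, nothing ever has to be removed again, which sidesteps the question of when to pop.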

Related

Extract the root from a max heap

I am struggling to write a recursive algorithm to extract the max(root) from a max heap.
The heap is constructed as a tree.
I know that I should swap the last node with the root and push it down recursively. But there is no pseudocode on the internet or stack overflow to deal with trees.
The extract algorithms that I have seen are based on arrays.
So let's say that I find the right most leaf.
Is there any solution that you can suggest?
void find_last(heap_node* root,int level,int* last_level,bool isRight,heap_node** res){
    if(root == NULL)
        return;
    if(isRight && !root->left && !root->right && level > *last_level){
        *res = root;
        *last_level = level;
        return;
    }
    find_last(root->right,level+1,last_level,true,res);
    find_last(root->left,level+1,last_level,false,res);
}
And I made a function call like this:
heap_node* last = NULL;
int last_level = -1;
find_last(root,0,&last_level,false,&last);
That is the code for finding the deepest right node.
And it is not working :D
Efficiently finding the last node in a heap that's implemented as a tree requires that you maintain a count of nodes. That is, you know how many items are in the heap.
If you know how many items are in the heap, then you get the binary representation of the number and use that to trace through the tree to the last node. Let me give you an example. You have this heap:
        1
      /   \
     2     3
    / \   /
   4   5 6
There are 6 items in the heap. The binary representation is 110.
Now, move from left to right through the binary representation. Remove the first '1'; that puts you at the root node. The rule is that you go right if the digit is '1' and left if the digit is '0'. At the root node you have 10 remaining, so you go right and remove that digit, leaving you with 0. You're now at the node marked "3". The remaining digit is 0, so you go left. That puts you at the last node in the heap, the node marked "6".
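The bit-walking rule above could be sketched like this. The HeapNode type and the findLast name are stand-ins (the question's heap_node definition is not shown), and the sketch assumes the caller keeps an accurate node count.

```cpp
// Hypothetical node type standing in for the question's heap_node.
struct HeapNode {
    int value;
    HeapNode* left;
    HeapNode* right;
    explicit HeapNode(int v) : value(v), left(nullptr), right(nullptr) {}
};

// Returns the last node (in level order) of a heap holding `count` nodes,
// or nullptr if the heap is empty.
HeapNode* findLast(HeapNode* root, unsigned count)
{
    if (root == nullptr || count == 0) return nullptr;

    // Find the highest set bit of count.
    unsigned mask = 1;
    while (mask <= count / 2) mask <<= 1;

    // Drop the leading 1: it corresponds to standing at the root.
    mask >>= 1;

    HeapNode* node = root;
    while (mask != 0 && node != nullptr) {
        // 1 means go right, 0 means go left.
        node = (count & mask) ? node->right : node->left;
        mask >>= 1;
    }
    return node;
}
```

For the six-node heap in the example, the count 6 is 110 in binary, so the walk goes right and then left and ends at the node marked "6". The walk takes one step per level, so it is O(log n).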
The algorithm for sifting down through the heap is the same regardless of whether the heap is represented as an array or as a tree. The actual steps you take to swap nodes are different, of course. When swapping nodes, you have to be careful to set the child pointers correctly. One place people often forget this is when swapping the root node with the last node.
My suggestion is that you code this up and then single-step in the debugger to make sure that you have the pointer assignments right.

Quicksort w/ "median of three" pivot selection: Understanding the process

We're being introduced to Quicksort (with arrays) in our class. I've been running in to walls trying to wrap my head around how they want our Quicksort assignment to work with the "median of three" pivot selection method. I just need a high-level explanation of how it all works. Our text doesn't help and I'm having a hard time Googling to find a clear explanation.
This is what I think I understand so far:
The "median of three" function takes the elements in index 0(first), array_end_index(last), and (index 0 + array_end_index)/2(middle). The index with the median value of those 3 is calculated. The corresponding index is returned.
Function parameters below:
/* #param left
* the left boundary for the subarray from which to find a pivot
* #param right
* the right boundary for the subarray from which to find a pivot
* #return
* the index of the pivot (middle index); -1 if provided with invalid input
*/
int QS::medianOfThree(int left, int right){}
Then, in the "partition" function, the number whose index matches with the one returned by the "median of three" function acts as the pivot. My assignment states that, in order to proceed with partitioning the array, the pivot must lie in-between the left and right boundaries. The problem is, our "median of three" function returned one of three indices: the first, the middle, or the last index. Only one of those three indices(middle) could ever be "in-between" anything.
Function parameters below:
/* #param left
* the left boundary for the subarray to partition
* #param right
* the right boundary for the subarray to partition
* #param pivotIndex
* the index of the pivot in the subarray
* #return
* the pivot's ending index after the partition completes; -1 if
* provided with bad input
*/
int QS::partition(int left, int right, int pivotIndex){}
What am I misunderstanding?
Here are the entire descriptions of the functions:
/*
* sortAll()
*
* Sorts elements of the array. After this function is called, every
* element in the array is less than or equal its successor.
*
* Does nothing if the array is empty.
*/
void QS::sortAll(){}
/*
* medianOfThree()
*
* The median of three pivot selection has two parts:
*
* 1) Calculates the middle index by averaging the given left and right indices:
*
* middle = (left + right)/2
*
* 2) Then bubble-sorts the values at the left, middle, and right indices.
*
* After this method is called, data[left] <= data[middle] <= data[right].
* The middle index will be returned.
*
* Returns -1 if the array is empty, if either of the given integers
* is out of bounds, or if the left index is not less than the right
* index.
*
* #param left
* the left boundary for the subarray from which to find a pivot
* #param right
* the right boundary for the subarray from which to find a pivot
* #return
* the index of the pivot (middle index); -1 if provided with invalid input
*/
int QS::medianOfThree(int left, int right){}
/*
* Partitions a subarray around a pivot value selected according to
* median-of-three pivot selection.
*
* The values which are smaller than the pivot should be placed to the left
* of the pivot; the values which are larger than the pivot should be placed
* to the right of the pivot.
*
* Returns -1 if the array is null, if either of the given integers is out of
* bounds, or if the first integer is not less than the second integer, OR IF THE
* PIVOT IS NOT BETWEEN THE TWO BOUNDARIES.
*
* #param left
* the left boundary for the subarray to partition
* #param right
* the right boundary for the subarray to partition
* #param pivotIndex
* the index of the pivot in the subarray
* #return
* the pivot's ending index after the partition completes; -1 if
* provided with bad input
*/
int QS::partition(int left, int right, int pivotIndex){}
Start with understanding quicksort first, median-of-three next.
To perform a quicksort you:
1. Pick an item from the array you are sorting (any item will do, but which is the best one to go for we'll come back to).
2. Reorder the array so that all items less than the one you picked are before it in the array, and all of those greater than it are after it.
3. Recursively do the above to the sets before and after the item you picked.
Step 2 is called the "partition operation". Consider if you had the following:
3 2 8 4 1 9 5 7 6
Now say you picked the first of those numbers as your pivot element (the one we picked at step 1). After we apply step 2 we end up with something like:
2 1 3 4 8 9 5 7 6
The value 3 is now in the correct place, and every element is on the correct side of it. If we now sort the left-hand side we end up with:
1 2 3 4 8 9 5 7 6.
Now, let's consider just the elements to the right of it:
4 8 9 5 7 6.
If we pick 4 to pivot next we end up changing nothing as it was in the correct position to begin with. The set of elements to the left of it is empty, so there is nothing to do here. We now need to sort the set:
8 9 5 7 6.
If we use 8 as our pivot we could end up with:
5 7 6 8 9.
The 9 now on its right is only one element, so obviously already sorted. The 5 7 6 is left to sort. If we pivot on the 5 we end up leaving it alone, and we just need to sort 7 6 into 6 7.
Now, considering all those changes in the wider context, what we have ended up with is:
1 2 3 4 5 6 7 8 9.
So to summarise again, quicksort picks one item, moves elements around it so that they are all correctly positioned relative to that one item, and then does the same thing recursively with the remaining two sets until there are no unsorted blocks left, and everything is sorted.
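As a rough sketch of those steps, assuming the first element of each range is used as the pivot (as in the walkthrough above): the name quickSort and the int-vector interface are illustrative only, not the assignment's QS class.

```cpp
#include <utility>
#include <vector>

// Sorts a[lo..hi] (inclusive bounds) in place.
void quickSort(std::vector<int>& a, int lo, int hi)
{
    if (lo >= hi) return;             // zero or one element: already sorted

    const int pivot = a[lo];          // pick an item (here: the first one)

    // Partition: move everything smaller than the pivot to the front.
    int store = lo;
    for (int i = lo + 1; i <= hi; ++i)
        if (a[i] < pivot) std::swap(a[i], a[++store]);
    std::swap(a[lo], a[store]);       // the pivot is now in its final place

    // Recurse on the sets before and after the pivot.
    quickSort(a, lo, store - 1);
    quickSort(a, store + 1, hi);
}
```

Calling quickSort(v, 0, v.size() - 1) on a non-empty vector sorts it in place. The intermediate arrangements may differ from the ones in the walkthrough, since the partition step only guarantees which side of the pivot each element lands on.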
Let's come back to the matter I fudged over there when I said "any item will do". While it's true that any item will do, which item you pick will affect the performance. If you are lucky you will end up doing this in a number of operations proportional to n log n, where n is the number of elements. If you're just reasonably lucky it'll be a slightly bigger number, still proportional to n log n. If you're really unlucky it'll be a number proportional to n^2.
So which is the best item to pick? The best number is the item that will end up in the middle after you have done the partition operation. But we don't know what item that is, because to find the middle item we have to sort all of the items, and that's what we were trying to do in the first place.
So, there are a few strategies we can take:
Just go for the first one, because, meh, why not?
Go for the middle one, because maybe the array is already sorted or nearly sorted for some reason, and if not it's not any worse a choice than any other.
Pick a random one.
Pick the first one, middle one and last one, and go for the median of those three, because it's at least going to be the best of those three options.
Pick the median-of-three for the first third of the array, the median-of-three of the second third, the median-of-three of the last third and then finally go with the median of those three medians.
These have different pros and cons, but generally speaking each of those options gives better results in picking the best pivot than the previous, but at the cost of spending more time and effort on picking that pivot. (Random has the further advantage of beating cases where someone is deliberately trying to create data that you will have worst-case behaviour on, as part of some sort of DoS attack).
"My assignment states that, in order to proceed with partitioning the array, the pivot must lie in-between the left and right boundaries."
Yes. Consider above again when we had sorted 3 into the correct position, and sorted the left:
1 2 3 4 8 9 5 7 6.
Now, here we need to sort the range 4 8 9 5 7 6. The boundary is the line between the 3 and the 4 and the line between the 6 and the end of the array (or, another way of looking at it, the boundary is the 4 and the 6, but it's an inclusive boundary including those items). The three we pick are hence the 4 (first), the 6 (last) and either the 9 or the 5 depending on whether we round up or down in dividing the count by 2 (we probably round down as that's usual in integer division, so the 9). All of these are inside the boundary of the partition we are currently sorting. Our median-of-three is hence 6 (or if we did round up, we'd have gone for the 5).
(Incidentally, a magically perfect pivot-chooser that always picked the best pivot would have picked either the 6 or the 7, so picking 6 is pretty good here, though there are still times when median-of-three will be unlucky and end up picking the third-worst option, or perhaps even an arbitrary choice out of 3 equal elements all of which were the worst. It's just much less likely to happen than with other approaches).
The documentation for medianOfThree says:
* 2) Then bubble-sorts the values at the left, middle, and right indices.
*
* After this method is called, data[left] <= data[middle] <= data[right].
* The middle index will be returned.
So your description does not match the documentation. What you are doing is sorting the first, middle and last elements in place in your data, and always returning the middle index.
So, it is guaranteed that the pivot index is between the boundaries (except when the middle index coincides with a boundary...).
Even so, there's nothing wrong with pivoting the boundaries...
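Read that way, the documented behaviour might look roughly like the following. This is a sketch only: it is written as a free function over a std::vector<int> rather than as the assignment's QS member (whose data layout is not shown), and it computes the middle as left + (right - left) / 2, which gives the same result as (left + right) / 2 for valid indices while avoiding overflow.

```cpp
#include <utility>
#include <vector>

// Hypothetical free-function stand-in for QS::medianOfThree.
// Sorts data[left], data[middle], data[right] in place and returns middle,
// or -1 on invalid input.
int medianOfThree(std::vector<int>& data, int left, int right)
{
    const int n = static_cast<int>(data.size());
    if (n == 0 || left < 0 || right >= n || left >= right) return -1;

    const int middle = left + (right - left) / 2;

    // Three compare-and-swap steps: a bubble sort of three values.
    if (data[left] > data[middle]) std::swap(data[left], data[middle]);
    if (data[middle] > data[right]) std::swap(data[middle], data[right]);
    if (data[left] > data[middle]) std::swap(data[left], data[middle]);

    // Now data[left] <= data[middle] <= data[right].
    return middle;
}
```

On the range 4 8 9 5 7 6 from the earlier answer, the three sampled values are 4, 9 and 6; after the call the 6 sits at the middle index and the 9 at the right end, and the returned index is the middle one.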
Calculating the "median of three" is a way to get a pseudo-median element of your array and use its index as your pivot. It's a simple way to get a rough estimate of what the median of the array would be, leading to better performance.
Why would this be useful? Because in theory, you want to have this partition value to be the true median of your array, so when you do quicksort on this array, the pivot would have divided this array equally and enables the nice O(NlogN) sorting time that quick sort gives you.
Example: Your array is:
[5,3,1,7,9]
The median of three would look at 5, 1, and 9. The median is obviously 5, so this is the pivot value we want to consider for the partition function of quick sort. What you can do next is swap this index with the last and get
[9,3,1,7,5]
Now we attempt to have all values that are smaller than 5 on the left of the middle, all values bigger than five on the right of the middle. We now get
[1,3,7,9,5]
Swap the last element (where we stored the partition value) with the middle
[1,3,5,9,7]
And that's the idea of using the median of three. Imagine if our pivot was 1 or 9. You can imagine that the array we would get would not be a good case for quicksort.
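The swap-to-the-end scheme from this example could be sketched as below. It is a free function over a std::vector<int> rather than the assignment's QS::partition, and it assumes "between the two boundaries" means left <= pivotIndex <= right inclusive, which is an interpretation rather than something the assignment text confirms.

```cpp
#include <utility>
#include <vector>

// Hypothetical free-function stand-in for QS::partition.
// Partitions data[left..right] around data[pivotIndex] and returns the
// pivot's final index, or -1 on invalid input.
int partition(std::vector<int>& data, int left, int right, int pivotIndex)
{
    const int n = static_cast<int>(data.size());
    if (n == 0 || left < 0 || right >= n || left >= right ||
        pivotIndex < left || pivotIndex > right)
        return -1;

    const int pivot = data[pivotIndex];
    std::swap(data[pivotIndex], data[right]);    // park the pivot at the end

    int store = left;                            // next slot for a small value
    for (int i = left; i < right; ++i)
        if (data[i] < pivot) std::swap(data[i], data[store++]);

    std::swap(data[store], data[right]);         // move the pivot into place
    return store;
}
```

For [5,3,1,7,9] with the pivot at index 0, this returns 2 and leaves 5 at index 2 with the smaller values before it and the larger values after it. The exact order within each side may differ from the hand-worked example.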

Finding PostOrder traversal from LevelOrder traversal without constructing the tree

Given a binary tree where value of each internal node is 1 and leaf node is 0. Every internal
node has exactly two children. Now given level order traversal of this tree return postorder
traversal of the same tree.
This question can easily be solved in O(n) time if I construct the tree and then do its postorder traversal. But is it possible to print the postorder traversal without building the tree?
It's definitely possible.
If the tree is a perfect binary tree (every level completely filled), then once the number of nodes is determined, the shape of the tree is unique.
Treat the level-order traversal as an array, for example, 1 2 3 4 5 6 7.
It represents such tree:
1
2 3
4 5 6 7
What you want to get is the post order traversal: 4 5 2 6 7 3 1.
The first step is to calculate how deep the tree is:
depth = ceil(log(2, LevelOrderArray.length)) // =3 for this example
after that, set up a counter = 0, and extract nodes from the bottom level of the original array, one by one:
for(i=0, i<LevelOrderArray.length, i++){
postOrderArray[i] = LevelOrderArray[ 2 ^ (depth-1) +i ] //4,5,....
counter++; //1,2,.....
}
But notice that once the counter can be divided by 2, that means you need to retrieve another node from upper level:
if(counter mod 2^1 == 0)
postOrderArray[i] = LevelOrderArray[ 2 ^ (depth -2) + (++i) ] // =2 here,
//which is the node you need after 4 and 5, and get 3 after 6 and 7 at the 2nd round
Don't ++ the counter here, because the counter represents how many nodes you retrieved from the bottom level.
Every time 2^2 = 4 nodes have been popped out, retrieve another node from the 3rd level (counting from the bottom):
if(counter mod 2^2 == 0)
postOrderArray[i] = LevelOrderArray[ 2 ^ (depth -3) + (++i) ] // =1
Every time 2^3 = 8 nodes have been popped out, again, retrieve another node from the 4th level
.... until the loop is finished.
This is not strict C++ code, only the concept. If you fully understand the algorithm, the values of the tree nodes don't matter at all, even though they are all 0 and 1. The point is that although you don't build the tree in the program, you build it in your mind and convert that into an algorithm.
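For the tree described in the question, where the shape is not necessarily perfect, a different technique from the counter scheme above can be sketched. It relies on this observation: in a level-order listing where every internal node has exactly two children, the k-th internal node (counting from 0) has its children at positions 2k+1 and 2k+2. The function names are illustrative, and the sketch assumes the input is a valid level-order sequence of such a tree.

```cpp
#include <cstddef>
#include <vector>

namespace {
// Recursive post-order walk over array positions; no tree nodes are built.
void visit(const std::vector<int>& level, const std::vector<int>& onesBefore,
           int i, std::vector<int>& out)
{
    if (level[i] == 1) {
        // Position i holds internal node number k (0-based), so its
        // children sit at positions 2k+1 and 2k+2.
        const int k = onesBefore[i];
        visit(level, onesBefore, 2 * k + 1, out);
        visit(level, onesBefore, 2 * k + 2, out);
    }
    out.push_back(level[i]);
}
}  // namespace

// Returns the post-order sequence for the given level-order sequence.
std::vector<int> postOrderFromLevelOrder(const std::vector<int>& level)
{
    std::vector<int> out;
    if (level.empty()) return out;

    // onesBefore[i] = number of internal nodes (value 1) before position i.
    std::vector<int> onesBefore(level.size(), 0);
    for (std::size_t i = 1; i < level.size(); ++i)
        onesBefore[i] = onesBefore[i - 1] + level[i - 1];

    visit(level, onesBefore, 0, out);
    return out;
}
```

For the level order 1 1 0 0 0 (a root whose left child is internal) this gives 0 0 1 0 1, and for 1 0 1 0 0 (a root whose right child is internal) it gives 0 0 0 1 1. The work is O(n), with one auxiliary array of counts instead of a tree.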

Finding maximum values of rests of array

For example:
array[] = {3, 9, 10, 12, 1, 4, 7, 2, 6, 5}
First, I need the maximum value, 12. Then I need the maximum value among the rest of the array after it (1, 4, 7, 2, 6, 5), which is 7, then the maximum of the rest of the array after that, which is 6, and then 5. In the end I need the series of these values: (12, 7, 6, 5).
How to get these numbers?
I have tried the following code, but it seems to go on forever.
I think I'll need a recursive function, but how can I do this?
max=0; max2=0;...
for(i=0; i<array_length; i++){
if (matrix[i] >= max)
max=matrix[i];
else {
for (j=i; j<array_length; j++){
if (matrix[j] >= max2)
max2=matrix[j];
else{
...
...for if else for if else
...??
}
}
}
}
This is how you would do that in C++11 by using the std::max_element() standard algorithm:
#include <algorithm>
#include <iostream>
#include <iterator>
int main()
{
    int arr[] = {3,5,4,12,1,4,7,2,6,5};
    auto m = std::begin(arr);
    while (m != std::end(arr))
    {
        // Largest element in the remaining range [m, end)
        m = std::max_element(m, std::end(arr));
        // Print it, then continue the search just past it
        std::cout << *(m++) << std::endl;
    }
}
This is an excellent spot to use the Cartesian tree data structure. A Cartesian tree is a data structure built out of a sequence of elements with these properties:
The Cartesian tree is a binary tree.
The Cartesian tree obeys the heap property: every node in the Cartesian tree is greater than or equal to all its descendants.
An inorder traversal of a Cartesian tree gives back the original sequence.
For example, given the sequence
4 1 0 3 2
The Cartesian tree would be
4
 \
  3
 / \
1   2
 \
  0
Notice that this obeys the heap property, and an inorder walk gives back the sequence 4 1 0 3 2, which was the original sequence.
But here's the key observation: notice that if you start at the root of this Cartesian tree and start walking down to the right, you get back the number 4 (the biggest element in the sequence), then 3 (the biggest element in what comes after that 4), and the number 2 (the biggest element in what comes after the 3). More generally, if you create a Cartesian tree for the sequence, then start at the root and keep walking to the right, you'll get back the sequence of elements that you're looking for!
The beauty of this is that a Cartesian tree can be constructed in time Θ(n), which is very fast, and walking down the spine takes time only O(n). Therefore, the total amount of time required to find the sequence you're looking for is Θ(n). Note that the approach of "find the largest element, then find the largest element in the subarray that appears after that, etc." would run in time Θ(n^2) in the worst case if the input was sorted in descending order, so this solution is much faster.
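A minimal sketch of that idea, using the usual stack-based linear-time construction and storing child links as index arrays instead of allocated nodes. The function name suffixMaxima is made up for illustration, and on ties this version keeps the earlier of two equal elements on the spine.

```cpp
#include <vector>

// Builds a max-Cartesian tree of `a` and returns the values found by
// walking from the root down the right spine.
std::vector<int> suffixMaxima(const std::vector<int>& a)
{
    const int n = static_cast<int>(a.size());
    std::vector<int> left(n, -1), right(n, -1);  // child links as indices
    std::vector<int> spine;                      // current right spine, root first

    for (int i = 0; i < n; ++i) {
        // Pop every spine node smaller than a[i]; the last one popped
        // becomes the left child of the new node.
        int last = -1;
        while (!spine.empty() && a[spine.back()] < a[i]) {
            last = spine.back();
            spine.pop_back();
        }
        left[i] = last;
        if (!spine.empty()) right[spine.back()] = i;
        spine.push_back(i);
    }

    // Walk from the root along right children.
    std::vector<int> result;
    for (int node = spine.empty() ? -1 : spine.front(); node != -1; node = right[node])
        result.push_back(a[node]);
    return result;
}
```

For the array in the question this yields 12, 7, 6, 5, and for 4 1 0 3 2 it yields 4, 3, 2. Each index is pushed and popped at most once, which is where the linear bound comes from.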
Hope this helps!
If you can modify the array, your code will become simpler. Whenever you find a max, output that and change its value inside the original array to some small number, for example -MAXINT. Once you have output the number of elements in the array, you can stop your iterations.
std::vector<int> output;
for (auto i : array)
{
    auto pos = std::find_if(output.rbegin(), output.rend(), [i](int n) { return n > i; }).base();
    output.erase(pos,output.end());
    output.push_back(i);
}
Hopefully you can understand that code. I'm much better at writing algorithms in C++ than describing them in English, but here's an attempt.
Before we start scanning, output is empty. This is the correct state for an empty input.
We start by looking at the first unlooked at element I of the input array. We scan backwards through the output until we find an element G which is greater than I. Then we erase starting at the position after G. If we find none, that means that I is the greatest element so far of the elements we've searched, so we erase the entire output. Otherwise, we erase every element after G, because I is the greatest starting from G through what we've searched so far. Then we append I to output. Repeat until the input array is exhausted.

Confused about definition of a 'median' when constructing a kd-Tree

I'm trying to build a kd-tree for searching through a set of points, but am getting confused about the use of 'median' in the Wikipedia article. For ease of use, the Wikipedia article states the pseudo-code of kd-tree construction as:
function kdtree (list of points pointList, int depth)
{
    if pointList is empty
        return nil;
    else
    {
        // Select axis based on depth so that axis cycles through all valid values
        var int axis := depth mod k;
        // Sort point list and choose median as pivot element
        select median by axis from pointList;
        // Create node and construct subtrees
        var tree_node node;
        node.location := median;
        node.leftChild := kdtree(points in pointList before median, depth+1);
        node.rightChild := kdtree(points in pointList after median, depth+1);
        return node;
    }
}
I'm getting confused about the "select median..." line, simply because I'm not quite sure what is the 'right' way to apply a median here.
As far as I know, the median of an odd-sized (sorted) list of numbers is the middle element (aka, for a list of 5 things, element number 3, or index 2 in a standard zero-based array), and the median of an even-sized array is the sum of the two 'middle' elements divided by two (aka, for a list of 6 things, the median is the sum of elements 3 and 4 - or 2 and 3, if zero-indexed - divided by 2.).
However, surely that definition does not work here as we are working with a distinct set of points? How then does one choose the correct median for an even-sized list of numbers, especially for a length 2 list?
I appreciate any and all help, thanks!
-Stephen
It appears to me that you understand the meaning of median, but you are confused about something else. What do you mean by distinct set of points?
The code presented by Wikipedia is a recursive function. You have a set of points, so you create a root node and choose a median of the set. Then you call the function recursively - for the left subtree you pass in a parameter with all the points smaller than the split-value (the median) of the original list, for the right subtree you pass in the equal and larger ones. Then for each subtree a node is created where the same thing happens. It goes like this:
First step (root node):
Original set: 1 2 3 4 5 6 7 8 9 10
Split value (median): 5.5
Second step - left subtree:
Set: 1 2 3 4 5
Split value (median): 3
Second step - right subtree:
Set: 6 7 8 9 10
Split value (median): 8
Third step - left subtree of left subtree:
Set: 1 2
Split value (median): 1.5
Third step - right subtree of left subtree:
Set: 3 4 5
Split value (median): 4
Etc.
So the median is chosen for each node in the tree based on the set of numbers (points, data) which go into that subtree. Hope this helps.
You have to choose an axis with as many elements on one side as on the other. If the number of points is odd or the points are positioned in such a way that this isn't possible, just choose the one that gives as even a distribution as possible.
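One common way to make "as even as possible" concrete when the node stores an actual point, as in the Wikipedia pseudo-code, is to take the element at index size/2 after ordering along the current axis. The sketch below does that with std::nth_element; the 2-D Point type, the KdNode layout and the buildKdTree name are assumptions for illustration, and picking the upper of the two middle elements for even-sized lists is just one convention.

```cpp
#include <algorithm>
#include <array>
#include <memory>
#include <vector>

const int K = 2;                      // number of dimensions (assumed)
using Point = std::array<double, K>;

struct KdNode {
    Point location;
    std::unique_ptr<KdNode> left, right;
};

// Builds a kd-tree from pts[begin, end), reordering pts as it goes.
std::unique_ptr<KdNode> buildKdTree(std::vector<Point>& pts, int begin, int end, int depth)
{
    if (begin >= end) return nullptr;

    const int axis = depth % K;
    const int mid = begin + (end - begin) / 2;   // upper middle for even sizes

    // Put the median-by-axis point at position mid, with points that do not
    // compare greater before it and points that do not compare smaller after it.
    std::nth_element(pts.begin() + begin, pts.begin() + mid, pts.begin() + end,
                     [axis](const Point& a, const Point& b) { return a[axis] < b[axis]; });

    std::unique_ptr<KdNode> node(new KdNode());
    node->location = pts[mid];
    node->left = buildKdTree(pts, begin, mid, depth + 1);
    node->right = buildKdTree(pts, mid + 1, end, depth + 1);
    return node;
}
```

With this convention a list of length 2 stores its second point (along the axis) in the node, puts the first in the left subtree, and leaves the right subtree empty. No averaging is needed because the split value is always one of the points.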