Search for secondary data in BST

I have built a BST that holds a name as its primary data, along with the weight associated with that name (for example, the info is inserted as Tom - 150, but is sorted in the tree by Tom). I need to be able to determine who has the lowest weight, and I am not sure how to go about it. Below is the code for the class and the add method. I have plenty of other methods, but I haven't posted them because they seem unrelated (I can if needed, though).
#include <iostream>
#include <string>
using namespace std;
class tNode
{
public:
string name;
int wt;
tNode *left, *right;
tNode()
{
left = right = 0;
}
tNode(string name, int wt, tNode *l = 0, tNode *r = 0)
{
this->name = name;
this->wt = wt;
left = l;
right = r;
}
};
class bSTree
{
public:
tNode *root;
bSTree()
{
root = 0;
}
bool add(string name, int wt)
{
tNode *temp = root, *prev = 0;
while (temp != 0)
{
prev = temp;
if (name < temp->name)
{
temp = temp->left;
}
else
{
temp = temp->right;
}
}
if (root == 0)
{
root = new tNode(name, wt);
}
else if (name < prev->name)
{
prev->left = new tNode(name, wt);
}
else if (name > prev->name)
{
prev->right = new tNode(name, wt);
}
else
{
return false;
}
return true;
}
// (other methods omitted)
};
Anyway, what is the technique for doing this? I found the lowest name value (alphabetically) by going down the left of the tree as far as possible, but I am not sure of the technique for finding the lowest weight, since the tree is not sorted by weight. I'm not as experienced as I'd like to be in C++, and the only thing I can think of is going through every weight, copying each one into an int, then sorting the ints to find the lowest. I can't imagine this is the correct way to do it, or at least not the most efficient. Any help is always appreciated. Thanks!
EDIT:
This is what I have come up with so far:
void searchWeight(tNode* temp)
{
// DETERMINE LOWEST WEIGHT CONTAINED IN TREE
if (temp != 0)
{
cout << temp->wt << endl;
searchWeight(temp->left);
searchWeight(temp->right);
}
}
This will output all of the weights to the console, but I'm not sure how to go through each one and determine the lowest. I've tried putting another if statement in there where
if(currwt < minwt)
minwt = currwt
but no luck with it outputting properly in the end.

You do not have to sort the tree to get the node with minimum weight.
Just do a traversal of the tree and store the lowest weight and the person who has the lowest weight. Update the two variables if the current node's weight is less than the minimum weight. You will have the lowest weight at the end of the traversal.
The pseudocode will look something like this:
minWeight = infinity (for example, the largest int value)
minWeightPerson = ""
for each node in the tree:
    if (minWeight > weight of current node):
        minWeight = weight of current node
        minWeightPerson = person in current node
return minWeightPerson
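The pseudocode above could be written recursively in C++ along these lines. This is only a sketch: WNode is a hypothetical, trimmed-down stand-in for the question's tNode, and findLightest is a name made up for illustration. Returning a pointer instead of a weight avoids having to pick a starting value for the minimum.

```cpp
#include <string>

// Hypothetical stand-in for the question's tNode (same fields, trimmed down).
struct WNode {
    std::string name;
    int wt;
    WNode* left;
    WNode* right;
    WNode(const std::string& n, int w)
        : name(n), wt(w), left(nullptr), right(nullptr) {}
};

// Returns the node with the smallest weight in the subtree rooted at `node`,
// or nullptr if the subtree is empty. Every node is visited once: O(n).
const WNode* findLightest(const WNode* node) {
    if (node == nullptr) return nullptr;

    const WNode* best = node;  // start with the current node

    const WNode* l = findLightest(node->left);
    if (l != nullptr && l->wt < best->wt) best = l;

    const WNode* r = findLightest(node->right);
    if (r != nullptr && r->wt < best->wt) best = r;

    return best;
}
```

Calling findLightest(root) on the tree's root then gives both the lowest weight (->wt) and the person it belongs to (->name).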

You will need to loop through the entire BST since you're not searching by the tree's primary key. You can use any of the tree traversal algorithms.
If you need to search by weight a large number of times (and simply searching the entire tree isn't fast enough), then it will be worth it to build an "index", i.e. a second BST that points to the nodes in the first BST, but is sorted on the secondary key.
Looping through the tree once is O(n), but looping through it m times is O(m*n). Building a second binary search tree on a different key is O(n*log(n)) (assuming the tree stays balanced), and then searching the second tree m times is O(m*log(n)), so the entire operation is O(n*log(n)+m*log(n)) = O((n+m)*log(n)).
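One way to sketch such an index is with std::multimap, which is an ordered container (typically a balanced tree) keyed here on weight and pointing back at the nodes of the name-ordered tree. The names PNode, buildWeightIndex and lightest are made up for this example, and the index would have to be updated whenever the main tree changes, which is not shown.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the question's tNode.
struct PNode {
    std::string name;
    int wt;
    PNode* left;
    PNode* right;
    PNode(const std::string& n, int w)
        : name(n), wt(w), left(nullptr), right(nullptr) {}
};

// Secondary index: weight -> node in the primary (name-ordered) tree.
// multimap is used because several people may share the same weight.
typedef std::multimap<int, const PNode*> WeightIndex;

// Walks the primary tree once and records every node under its weight.
// n insertions at O(log n) each: O(n*log(n)) overall.
void buildWeightIndex(const PNode* node, WeightIndex& index) {
    if (node == nullptr) return;
    index.insert(std::make_pair(node->wt, node));
    buildWeightIndex(node->left, index);
    buildWeightIndex(node->right, index);
}

// The lightest person is the first entry of the weight-ordered index.
const PNode* lightest(const WeightIndex& index) {
    return index.empty() ? nullptr : index.begin()->second;
}
```

Once the index is built, index.find(w) or index.equal_range(w) looks up people by weight in O(log(n)) instead of scanning the whole tree.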

Related

Understanding complexity of deleting duplicates from a linked-list

I have written this program to delete duplicate nodes from an unsorted linked list:
#include<bits/stdc++.h>
using namespace std;
/* A linked list node */
struct Node
{
int data;
struct Node *next;
};
// Utility function to create a new Node
struct Node *newNode(int data)
{
Node *temp = new Node;
temp->data = data;
temp->next = NULL;
return temp;
}
/* Function to remove duplicates from a
unsorted linked list */
void removeDuplicates(struct Node *start)
{
// Hash to store seen values
unordered_set<int> seen;
/* Pick elements one by one */
struct Node *curr = start;
struct Node *prev = NULL;
while (curr != NULL)
{
// If current value is seen before
if (seen.find(curr->data) != seen.end())
{
prev->next = curr->next;
delete (curr);
}
else
{
seen.insert(curr->data);
prev = curr;
}
curr = prev->next;
}
}
/* Function to print nodes in a given linked list */
void printList(struct Node *node)
{
while (node != NULL)
{
printf("%d ", node->data);
node = node->next;
}
}
/* Driver program to test above function */
int main()
{
/* The constructed linked list is:
10->12->11->11->12->11->10*/
struct Node *start = newNode(10);
start->next = newNode(12);
start->next->next = newNode(11);
start->next->next->next = newNode(11);
start->next->next->next->next = newNode(12);
start->next->next->next->next->next =
newNode(11);
start->next->next->next->next->next->next =
newNode(10);
printf("Linked list before removing duplicates : \n");
printList(start);
removeDuplicates(start);
printf("\nLinked list after removing duplicates : \n");
printList(start);
return 0;
}
Does finding each element in the hash table affect the complexity? If yes, what would the time complexity of this algorithm be if the set were implemented as a binary search tree, where searching for an element costs O(logn) in the worst case?
By my reasoning, T(n)=T(n-1)+log(n-1), i.e. the nth element will perform log(n-1) comparisons (the height of a tree with n-1 elements).
Please give a mathematical analysis.

Does finding each element in the hash table affect the complexity?
Well, in your code you are using unordered_set, whose lookups and insertions are O(1) on average, so the simple answer is: no.
...considering that the set is implemented as a Binary Search tree where the cost of searching an element is O(logn) in worst case.
Again, you have chosen unordered_set, which is a hash table and not a binary search tree. I believe most implementations of std::set use red-black trees, and with those you would be looking at O(logN) per lookup, but with unordered_set it should be constant time on average. So the only remaining cost is the traversal of your linked list, which, since you are just walking it in one direction and visiting each node once, is an O(N) operation.
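Since the question asks for a mathematical analysis, here is a short derivation for both cases. It assumes the tree-based set is balanced and takes T(1) = O(1); the question's own recurrence is used for the tree case.

```latex
% Tree-based set (e.g. std::set): unroll the recurrence from the question.
T(n) = T(n-1) + \log(n-1)
     = \sum_{k=1}^{n-1} \log k + O(1)
     = \log\big((n-1)!\big) + O(1)
     = \Theta(n \log n)

% Hash-based set (std::unordered_set): each step costs O(1) on average.
T(n) = T(n-1) + O(1) \;\Longrightarrow\; T(n) = O(n) \quad \text{(expected)}
```

The step from log((n-1)!) to Θ(n log n) follows from Stirling's approximation. In the worst case, where every value hashes to the same bucket, a single unordered_set lookup degrades to O(n) and the whole loop to O(n^2).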

Displaying shortest path via bfs search in directed graph

I've been working on a project for college and ran into a rather large problem. I'm supposed to make a function that gets the shortest path through a directed graph from point A to point B and display the path in order.
For example, if each node holds a state name and we want to find the shortest path between California and Utah, the output would show california -> nevada -> utah
Currently, my traversal shows all nodes searched with bfs instead of the list of nodes that we took to get from point A to point B.
Below is my implementation of the assignment. My only real question is how would I go about keeping track of the nodes I actually traversed instead of all nodes searched.
bool DirectedGraph::GetShortestPath(
const string& startNode, const string& endNode,
bool nodeDataInsteadOfName, vector<string>& traversalList) const
{
//Nodes are the same
if (startNode.compare(endNode) == 0)
return false;
//Stores the location of our nodes in the node list
vector<int> path;
//Queue to hold the index of the node traversed
queue<int> q;
//Create our boolean table to handle visited nodes
bool *visited = new bool[m_nodes.size()];
//initialize bool table
memset(visited, false, sizeof(bool) * m_nodes.size());
//Label the start node as visited
visited[GetNodeIndex(startNode)] = true;
//Push the node onto our queue
q.push(GetNodeIndex(startNode));
while (!q.empty())
{
//Store the nodes index
int index = q.front();
path.push_back(q.front());
q.pop();
int i = 0;
for (i = 0; i < m_nodes[index]->Out.size(); i++)
{
//If this node matches what we are looking for break/return values
if (m_nodes[index]->Out[i]->targetI == GetNodeIndex(endNode))
{
path.push_back(m_nodes[index]->Out[i]->targetI);
if (nodeDataInsteadOfName)
{
path.push_back(m_nodes[index]->Out[i]->targetI);
for (int x = 0; x < path.size(); x++)
{
traversalList.push_back(m_nodes[path[x]]->Data);
}
}
else
{
for (int x = 0; x < path.size(); x++)
{
traversalList.push_back( m_nodes[path[x]]->Name);
}
}
return true;
}
//Continue through the data
if (!visited[m_nodes[index]->Out[i]->targetI])
{
visited[m_nodes[index]->Out[i]->targetI] = true;
q.push(m_nodes[index]->Out[i]->targetI);
}
}
}
// You must implement this function
return false;
}
//definition of graph private members
struct Edge
{
int srcI; // Index of source node
int targetI; // Index of target node
Edge(int sourceNodeIndex, int targetNodeIndex)
{
srcI = sourceNodeIndex;
targetI = targetNodeIndex;
}
};
struct Node
{
string Name;
string Data;
Node(const string& nodeName, const string& nodeData)
{
Name = nodeName;
Data = nodeData;
}
// List of incoming edges to this node
vector<Edge*> In;
// List of edges going out from this node
vector<Edge*> Out;
};
// We need a list of nodes and edges
vector<Node*> m_nodes;
vector<Edge*> m_edges;
// Used for efficiency purposes so that quick node lookups can be
// done based on node names. Maps a node name string to the index
// of the node within the nodes list (m_nodes).
unordered_map<string, int> m_nodeMap;

The first problem is with how path is filled. Every node taken off the queue is pushed onto path, so by the time the end node is found it holds every node the search has dequeued, not just the nodes on the route from start to end. I suggest you do not track the path inside the loop. Instead, assign each node a distance.
struct Node
{
string Name;
string Data;
int Distance;
Node(const string& nodeName, const string& nodeData)
{
Name = nodeName;
Data = nodeData;
Distance = INT_MAX;
}
// List of incoming edges to this node
vector<Edge*> In;
// List of edges going out from this node
vector<Edge*> Out;
};
and set the distance of the starting node to zero, before looping.
m_nodes[GetNodeIndex(startNode)]->Distance = 0;
At each iteration, pick a node from the queue (you called it index), loop through its adjacency list (outgoing arcs) and test if the adjacent node is visited. If the node is visited, skip it. If the node is not visited, visit it by setting its distance to
m_nodes[index]->Distance + 1
After updating the distance of each node, check whether it is the final node; if so, break out of the loops.
At this point you have the distances updated properly. Work your way backwards from the end node, each time selecting a node with an edge into the current node whose distance is the current node's distance - 1. You can do this using the m_edges vector (or the current node's In list): at each step you know targetI, so you can check the corresponding srcI values for the distance mentioned above.
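The distance-then-backtrack idea described above might look roughly like this. It is a simplified sketch rather than a drop-in replacement for GetShortestPath: nodes are plain indices, and out and in are hypothetical adjacency lists standing in for the Out and In edge lists of the question's Node struct.

```cpp
#include <algorithm>
#include <climits>
#include <queue>
#include <vector>

// out[u] lists the targets of edges leaving u; in[v] lists the sources of
// edges entering v. Returns the node indices along one shortest path from
// start to goal, or an empty vector if goal is unreachable.
std::vector<int> shortestPath(const std::vector<std::vector<int> >& out,
                              const std::vector<std::vector<int> >& in,
                              int start, int goal) {
    std::vector<int> dist(out.size(), INT_MAX);  // INT_MAX means "not visited"
    std::queue<int> q;
    dist[start] = 0;
    q.push(start);

    // Forward pass: plain BFS that labels each node with its distance and
    // stops expanding once the goal has been labelled.
    while (!q.empty() && dist[goal] == INT_MAX) {
        int u = q.front();
        q.pop();
        for (int v : out[u]) {
            if (dist[v] == INT_MAX) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }

    std::vector<int> path;
    if (dist[goal] == INT_MAX) return path;  // goal was never reached

    // Backward pass: from the goal, step to any predecessor that is exactly
    // one hop closer to the start. One always exists for a labelled node.
    int cur = goal;
    path.push_back(cur);
    while (cur != start) {
        for (int p : in[cur]) {
            if (dist[p] == dist[cur] - 1) {
                cur = p;
                break;
            }
        }
        path.push_back(cur);
    }
    std::reverse(path.begin(), path.end());  // path was collected goal-first
    return path;
}
```

Mapping the returned indices through m_nodes[...]->Name or ->Data would then fill traversalList in order. An equally common alternative is to store each node's parent index during the BFS and follow parents back from the goal, which avoids needing the incoming-edge lists.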

C++ - Min heap implementation and post order traversal

I have a small program that creates a min heap and inserts values based on user input. If the user says to change value 10 to 20, the program should change all occurrences of 10 to 20 and then heapify. When the user gives the print command, the program should traverse the tree in postorder and print all the values. I have written the program, but it gives me incorrect output when I print. What am I doing wrong here?
int pArray[500];
int i = 0;
//Definition of Node for tree
struct TNode {
int data;
TNode* left;
TNode* right;
};
void Heapify(TNode* root, TNode* child);
// Function to create a new Node in heap
TNode* GetNewNode(int data) {
TNode* newNode = new TNode();
newNode->data = data;
newNode->left = newNode->right = NULL;
return newNode;
}
// To insert data in the tree, returns address of root node
TNode* Insert(TNode* root,int data) {
if(root == NULL) { // empty tree
root = GetNewNode(data);
}
// if the left child is empty fill that in
else if(root->left == NULL) {
root->left = Insert(root->left,data);
}
// else, insert in right subtree.
else if(root->right == NULL){
root->right = Insert(root->right,data);
}
else {
root->left = Insert(root->left,data);
}
Heapify(root, root->left);
Heapify(root, root->right);
return root;
}
void Heapify(TNode* root, TNode* child){
if(root != NULL && child != NULL){
if(root->data > child->data){
int temp = child->data;
child->data = root->data;
root->data = temp;
}
}
}
void Change(TNode* root,int from, int to) {
if (root == NULL)
return;
else if (root->data == from)
root->data = to;
Change(root->left, from, to);
Change(root->right, from, to);
}
void postOrder(TNode* n){
if ( n ) {
postOrder(n->left);
postOrder(n->right);
pArray[i] = n->data;
i++;
}
}

What am I doing wrong here?
I'm going to assume that you've verified the heap before you print it. Your tree implementation is a bit confusing, but it looks like it should work. I would suggest, however, that the first thing you do is print the tree before calling your Change method, just to make sure that you have a valid heap.
Assuming that you have a valid heap, your Change method has a problem: it never calls Heapify. You end up changing values in the heap and not rearranging. So of course it's going to be out of order when you output it.
When you change an item's value, you have to move that node (or the node's value) to its proper final position in the tree before you change any other value. You can probably make that work with your current model (by calling Heapify repeatedly until the node is in its proper position), provided that you're increasing the value. If you're decreasing the value (i.e. changing 20 to 10), then you have a problem, because your code has no way to move an item up the tree.
As @noobProgrammer pointed out in his comment, a binary heap is typically implemented as an array rather than as a tree. It's a whole lot easier to implement that way, uses less memory, and is much more efficient. If you're interested in how that's done, you should read my multi-part blog series on heaps and priority queues. The first entry, Priority queues, describes the problem. From there you can follow the links to learn about binary heaps and how they're implemented. The code samples are in C#, but if you read the first two introductory articles and understand the concepts, you'll be able to convert them to C++ without trouble.

Creating an adjacency List for DFS

I'm having trouble creating a Depth First Search for my program. So far I have a class of edges and a class of regions. I want to store all the connected edges inside one node of my region. I can tell if something is connected by the getKey() function I have already implemented. If two edges have the same key, then they are connected. For the next region, I want to store another set of connected edges inside that region, etc etc. However, I am not fully understanding DFS and I'm having some trouble implementing it. I'm not sure when/where to call DFS again. Any help would be appreciated!
class edge
{
private:
int source, destination, length;
int key;
edge *next;
public:
getKey(){ return key; }
}
class region
{
edge *data;
edge *next;
region() { data = new edge(); next = NULL; }
};
void runDFS(int i, edge **edge, int a)
{
region *head = new region();
aa[i]->visited == true;//mark the first vertex as true
for(int v = 0; v < a; v++)
{
if(tem->edge[i].getKey() == tem->edge[v].getKey()) //if the edges of the vertex have the same root
{
if(head->data == NULL)
{
head->data = aa[i];
head->data->next == NULL;
} //create an edge
if(head->data)
{
head->data->next = aa[i];
head->data->next->next == NULL;
}//if there is already a node connected to ti
}
if(aa[v]->visited == false)
runDFS(v, edge, a); //call the DFS again
} //for loop
}

Assume n is the total number of edges and k is the final number of regions.
Creating an adjacency list for the requisite DFS might be too costly: O(n^2) if k = 1 (i.e. all edges belong to the same region), and hence the DFS will cost you O(V+E), i.e. O(n^2), in the worst case.
Otherwise the problem is easily solvable in O(n * log(k)) as follows:
First, traverse all edges, adding each one to the head of its corresponding region (using a balanced BST, e.g. std::map) [you may use hashing for this too].
Then traverse all the regions and connect them in the requisite linear fashion.
I don't think a guaranteed O(n) solution exists for this problem.
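The grouping step could be sketched as follows. SimpleEdge is a hypothetical, cut-down version of the question's edge class with public fields, and each map entry plays the role of one region. For brevity the edges of a region are appended to a vector rather than pushed onto the head of a linked list.

```cpp
#include <map>
#include <vector>

// Hypothetical, simplified edge: only the fields needed for grouping.
struct SimpleEdge {
    int source;
    int destination;
    int key;  // edges with equal keys belong to the same region
};

// Groups the edges by key. std::map is an ordered (tree-based) container,
// so each of the n insertions costs O(log k) for k distinct keys.
std::map<int, std::vector<SimpleEdge> >
groupByRegion(const std::vector<SimpleEdge>& edges) {
    std::map<int, std::vector<SimpleEdge> > regions;
    for (const SimpleEdge& e : edges) {
        regions[e.key].push_back(e);  // creates the region on first use
    }
    return regions;
}
```

Iterating over the returned map visits the regions in key order, which is where they could be linked together in whatever linear structure is required. Swapping std::map for std::unordered_map gives the hashing variant, with O(n) expected time overall.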

I tried to implement a function that creates an adjacency list. The next pointer of the adj_list struct takes you down the adjacency list (there is no relationship between two nodes connected by next), and the list pointer is the adjacency list itself. Each node holds the address of the adj_list that contains its adjacency list.
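struct adj_list; // forward declaration, since node refers to adj_list
struct node{
int id;
adj_list* adj;
node(int _id) : id(_id), adj(NULL) {} // needed for "new node(id)" below
};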
struct adj_list{
adj_list* next;
adj_list* list;
node* n;
adj_list(node& _n){
n = &(_n);
next = NULL;
list = NULL;
}
};
node* add_node(int id,std::queue<int> q , node* root)
{
node* n = new node(id);
adj_list* adj = new adj_list(*n);
n->adj = adj;
if(root == NULL){
return n;
}
std::queue<adj_list*> q1;
while(1){
adj_list* iter = root->adj;
if(q.empty())break;
int k = q.front();
q.pop();
while(iter){
if(iter->n->id == k){
q1.push(iter);
adj_list* temp = iter->list;
iter->list = new adj_list(*n);
break;
}
iter = iter->next;
}
}
adj_list* iter = root->adj;
while(iter->next){
iter = iter->next;
}
iter->next = adj;
while(!q1.empty()){
adj_list* temp = q1.front();
q1.pop();
adj->list = temp;
adj = temp;
}
return root;
}

How to create a function that returns smallest value of an unordered binary tree

This seems like it should be really easy but I've been having trouble with this for quite some time. As the title says, I'm just trying to find the node in a Binary tree (not a BST!) with the smallest value and return it. I can write a recursive void function pretty easily that can at least assign the smallest value in the function, but I'm getting stuck on how to back track to previous nodes once I reach a NULL pointer.
I have a node class that has a pointer to a left and right child, each with its own value. Here is my (failed) attempt so far:
int preOrder(Node *node, int value, int count, int sizeOfTree)
{
count++; //keeps track of whether or not we have traversed the whole tree
if(value < node->getValue())
value = node->getValue();
if(count == sizeOfTree);
return value;
if(node == NULL)
//Want to return to the previous function call
//How do I do this for a non void function?
//for a void function, you could jsut type "return;" and the function
//back tracks to your previous place in the tree
//but since I'm returning a value, How would I go about doing this?
//these 2 calls are incorrect but the idea is that I first traverse the left subtree
//followed by a traversal of the right subtree.
preOrder(node->getLeft(), value);
preOrder(node->getRight(), value);
}
If possible, I would like to try and do this without keeping track of a "count" as well to make the code cleaner.
Let me know if any more clarification is needed.

I don't really understand why, in your original code, you need to keep track of the number of elements traversed. Here is my solution:
int find_min(Node* node)
{
int value = node->getValue();
Node* left_node = node->getLeft();
if (left_node != NULL)
{
int left_value = find_min(left_node);
if (left_value < value)
value = left_value;
}
Node* right_node = node->getRight();
if (right_node != NULL)
{
int right_value = find_min(right_node);
if (right_value < value)
value = right_value;
}
return value;
}

Basically what you need to do is just visit every node and keep track of the smallest value you've seen. This can actually be done fairly simply:
#include <algorithm>
#include <limits>
int preOrder(Node *node)
{
if(node == NULL) return std::numeric_limits<int>::max();
// this should never affect the calculation of the minimum
// (What could possibly be bigger than INT_MAX? At worst it's equal)
int value = std::min(
node->getValue(),
preOrder(node->getLeft())
);
value = std::min(
value,
preOrder(node->getRight())
);
return value;
}

OK, so you have an unordered binary tree and you're trying to find the lowest element in it.
Since the tree is unordered, the lowest element can be at any position in the tree, so you must search the entire tree.
The characteristics of the search will be as follows:
thorough (whole tree is searched)
recursive (rather than iterative, which would be really yucky)
base case: node is NULL
base outcome: maintain current value
Let's write it then:
#include <algorithm>
#include <climits> // for INT_MAX
using namespace std;
int searchLowest(Node * node, int value = INT_MAX)
{
if (node == NULL) // base case
return value; // base outcome
// at this point, node must not be NULL
value = min(value, searchLowest(node->getRight(), value)); // thorough, always recurse
value = min(value, searchLowest(node->getLeft (), value)); // and check children
value = min(value, node->getValue());
return value;
}
Edit for thoroughness, justice, and OOness:
// Node.h
#include <algorithm>
using namespace std;
template <typename T>
class Node
{
public:
Node(T item)
{
data = item;
right = left = NULL; // no children yet; lowest() relies on this
}
T lowest()
{
T value = data;
if (right != NULL)
value = min(value, right->lowest());
if (left != NULL)
value = min(value, left->lowest());
return value;
}
Node<T> * getRight()
{
return right;
}
Node<T> * getLeft()
{
return left;
}
private:
T data;
Node<T> * right;
Node<T> * left;
};
// main.cpp
#include <iostream>
#include "Node.h"
using namespace std;
int main(int c, char * v[])
{
Node<int> * tree = sycamore(); // makes a nice big tree
cout << tree->lowest();
}
SEE JIMMY RUN
