Strcmp Error in Binary Search Tree Insertion (Recursion)? - c++

i'm trying to implement some functions that allow me to add "Books" to a binary search tree for the "Student" class, but I'm getting a strange error:
msvcr100d.dll!strcmp(unsigned char * str1, unsigned char * str2) Line 83 Asm
The program is entirely in C/C++, so I'm not sure why its returning an assembly language error? My first thought is something is wrong with my use of strcmp, and the Call Stack shows Line 188 as the last executed statement (before the above error), which means I'm probably messing up my recursion somewhere. I am calling the insertBook() function of "Student", so here is my "Student" class. Any help? Thanks.
class Student : public Personnel { //inherit from Personnel
public:
Book *bookTree;
Book* searchBookTree(Book *bookNode, char *title) {
if ((strcmp(title, bookNode->title)) < 0) //***LINE 188
return searchBookTree(bookNode->left, title);
else if ((strcmp(title, bookNode->title)) > 0)
return searchBookTree(bookNode->right, title);
else
return bookNode;
}
void insertBook(Book *node) {
Book *newBook, *parent;
newBook = node;
newBook->left = NULL;
newBook->right = NULL;
if (bookTree == NULL) { //if bookTree is empty
bookTree = newBook;
}
else {
parent = searchBookTree(bookTree, newBook->title);
newBook->left = parent->left;
newBook->right = parent->right;
}
}
void printBooks(Book *top) {
Book *root = top;
if (root != NULL) {
printBooks(root->left);
cout << "BOOK LIST" << endl;
cout << "Title:\t\t" << root->title << endl;
cout << "URL:\t\t" << root->url << endl;
printBooks(root->right);
}
}
void display() {
Personnel::display();
cout << "STUDENT" << endl;
cout << "Level:\t\t" << getLevel() << endl;
printBooks(bookTree); cout << endl;
}
Student(char *cName, char *cBirthday, char *cAddress, char *cPhone, char *cEmail, level gradeLevel)
: Personnel(cName, cBirthday, cAddress, cPhone, cEmail)
{
bookTree = NULL;
setLevel(gradeLevel);
}
};

Book* searchBookTree(Book *bookNode, char *title) {
if ((strcmp(title, bookNode->title)) < 0) //***LINE 188
// What happens if bookNode->left == NULL ???
return searchBookTree(bookNode->left, title);
else if ((strcmp(title, bookNode->title)) > 0)
// What happens if bookNode->right== NULL ???
return searchBookTree(bookNode->right, title);
else
return bookNode;
}
you'll need a termination point in your search function. At the top, I'd first check if bookNode == NULL.

Your recursive search an important termination test missing! At some point, you hit the bottom of the tree without finding the item. And so your search function is called with a null pointer for the tree node! The problem is not in strcmp, but in the null pointer in one of the argument expressions.
You have only considered the case when the item exists in the tree and is eventually found, neglecting the not-found case.
Programmers are not to be measured by their ingenuity and their logic but by the completeness of their case analysis.
Alan J. Perlis, Epigram #32

Your insert routine has problems. I suggest you make your searchBookTree just return a null pointer when it doesn't find anything. Do not use that routine in the implementation of insertBook. Rather, you can write insertBook recursively also:
private:
// Inserts bookNode into tree, returning new tree:
Book *insertBookHelper(Book *tree, Book *bookNode) {
if (tree == NULL)
return bookNode; // bookNode becomes new tree
// no need to call strcmp twice!!!
int cmp = strcmp(title, bookNode->title);
if (cmp < 0) {
tree->left = insertBookHelper(tree->left, bookNode->title);
else if (cmp > 0)
tree->right = insertBookHelper(tree->right, bookNode->title);
else {
// Uh oh! Tree already contains that title, what to do?
// Answer: update!
// I don't know how to write this because I don't know
// how your Book class handles the memory for the strings,
// and what other members it has besides the title.
// this could be a possibility:
// bookNode->left = tree->left; // install same child pointers
// bookNode->right = tree->right; // into bookNode.
// *tree = *bookNode; // if Book has a sane copy constructor!!!
}
return tree;
}
public:
void insertBook(Book *node) {
tree = insertBookHelper(tree, node);
}
Do you see how the recursion works? It's a little different from the pure search. Each recursive level handles the insertion into the subtree and returns the new subtree. Often, the returned tree is exactly the same as the tree that went in! But when inserting into an empty tree, the returned tree is not the same: the tree that went in is a null pointer, but a non-null pointer comes out. This trick of pretending that we are making a new tree and returning it as a replacement for the old tree makes for smooth code.

Related

Trie data structure using class C++

I am trying to implement trie data structure in C++ using class. In TrieNode class I have a TrieNode *children[26]; array and an isEndOfWord boolean value to determine if it is the end word. In that same class I have other functions appropriate to function like getters and setters and additionally insert and search.
Whenever I try to add a new word it is also setting the bool value as true at the end of each word by setting true to isEndOfWord. But in searching function it is not determining the end of the word. Please guide me as I am new to this data structure, and please comment on the way i write the code and what is the appropriate way to write it(in a Professional way, if interested). Thanks!
#include<cstdio>
#include<iostream>
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
using namespace std;
class TrieNode{
private:
TrieNode *children[26];
bool isEndOfWord;
public:
TrieNode(){
for(int i = 0; i < 26; i++){
children[i] = NULL;
}
isEndOfWord = false;
}
bool checkNull(char temp){
cout<<"\nIncheckNULL "<<temp<<" "<<(temp - 'a')<<" \n";
if(children[temp - 'a'] == NULL){
return true;
}
else{
return false;
}
}
void setNode(char temp){
cout<<"Setting node \n";
children[temp - 'a'] = new TrieNode();
}
TrieNode *getNode(char temp){
return children[temp - 'a'];
}
void setEndWord(){
this->isEndOfWord = true;
}
bool getEndWord(){
return this->isEndOfWord;
}
void insert(TrieNode*, string);
bool search(TrieNode*, string);
};
void TrieNode::insert(TrieNode *root, string key){
TrieNode *crawl = root;
//cout<<"key is "<<key<<endl;
int length = sizeof(key)/sizeof(key[0]);
//cout<<"find length\n";
for(int i = 0; key[i] != '\0'; i++){
cout<<"TEST null check key is "<<key[i]<<endl;
if(crawl->checkNull(key[i])){
cout<<"null check key is "<<key[i]<<endl;
crawl->setNode(key[i]);
crawl = crawl->getNode(key[i]);
if(key[i + 1] == '\0'){
cout<<"In setting end word\n";
if(crawl->getEndWord()){
cout<<"Word already exists";
}
else{
crawl->setEndWord();
cout<<"End word setted "<<crawl->getEndWord()<<endl;
}
}
}
else{
if(key[i + 1] == '\0'){
cout<<"In setting end word\n";
if(crawl->getEndWord()){
cout<<"Word already exists";
}
else{
crawl->setEndWord();
cout<<"End word setted\n";
}
}
else{
crawl = crawl->getNode(key[i]);
}
}
}
}
bool TrieNode::search(TrieNode *root, string key){
TrieNode *crawl = root;
cout<<"key is "<<key<<endl;
cout<<"\n In search\n";
int length = sizeof(key)/sizeof(key[0]);
for(int i = 0; key[i] != '\0'; i++){
if(crawl->checkNull(key[i])){
cout<<"INside search checknull"<<endl;
cout<<"Word does not exists"<<"sorry"<<endl;
break;
}
else{
cout<<"IN each character getting getEndWord "<<crawl->getEndWord()<<endl;
if(key[i + 1] == '\0'){
if(crawl->getEndWord()){
cout<<"Word Exists";
}
else{
cout<<"Word does not exists"<<"sorry"<<endl;
break;
}
}
else{
crawl = crawl->getNode(key[i]);
}
}
}
}
int main(){
TrieNode *root = new TrieNode();
cout<<"starting"<<endl;
root->insert(root, "hello");
cout<<"first added"<<endl;
root->insert(root, "anna");
root->insert(root, "anni");
cout<<"words added"<<endl;
root->search(root, "hello");
root->search(root, "anny");
}
Your insert and search functions can be simplified a bit.
Consider this. (Read the comments in the below code, they illustrate what the code does)
void TrieNode::insert(TrieNode *root, string key){
TrieNode *crawl = root;
if (!crawl) {
crawl = new TrieNode();
}
cout << "Adding " << key << " to the trie" << endl;
for (int index = 0, auto str_iterator = str.begin(); str_iterator < str.end(); ++str_iterator, ++index) {
char key_char = *str_iterator;
if(crawl -> checkNull(key_char)){
// If a node representing the char does not exist then make it
crawl -> setNode(key_char);
}
crawl = crawl -> getNode(key_char);
if (index == key.length() - 1) {
// We are at the last character, time to mark an end of word
crawl -> setEndWord();
}
}
}
bool TrieNode::search(TrieNode *root, string key){
TrieNode *crawl = root;
if (!crawl) {
cout << "Trie is empty!" << endl;
return false;
}
cout << "Searching for " << key << " in the trie" << endl;
for (int index = 0, auto str_iterator = str.begin(); str_iterator < str.end(); ++str_iterator, ++index) {
char key_char = *str_iterator;
if(crawl -> checkNull(key_char)){
cout << "Key is not in the trie" << endl;
return false;
}
crawl = crawl -> getNode(key_char);
if (index == key.length() - 1) {
if (!(crawl -> getEndWord())) {
cout << "Word is physically present in trie, but not present as a distinct word" << endl;
return false;
} else {
return true;
}
}
}
cout << "Code should not reach here" << endl; // IMO throw an exception I guess
return false;
}
Take advantage of the power of C++ std::string
Also your whole temp - 'a' logic is a bit iffy to me. I wouldn't much around with ASCII values unless I needed to
Why are you including a whole bunch of C headers? Just iostream should suffice to do what cstdio does.
if(!ptr) is a much more natural way to check for NULL.
In production don't use using namespace std; Instead just preface stuff like cout and endl with std::. The reason for this is to avoid polluting the standard namespace.
Read a good CPP OOP book :). It will help you a lot.
Also I lol'd at anna and anni. Your anna and anni must be proud to be in your trie :D
There are many things I'd give you feedback on, but this isn't a code review site, it's for specific questions. I'll point out briefly a few things I notice though:
1) don't include C headers; use c++ ones instead.
2) what type is string?
3) you compute length (incorrectly, assuming answer to question 2 is "the standard c++ string class"), but you don't use it.
4) search() returns a bool but you don't return anything. When you find the end of a word, you should return from the function.
5) search() calls checkNull() at the top of the for loop without ensuring that it's not null. After this: crawl = crawl->getNode(key[i]); it could be null, but then you loop and go through the pointer without testing it.
6) setNode is a public function, and unconditionally overwrites whatever is in the slot for the given variable. You can clobber an existing child if someone calls it with the same character twice and leak (and probably lose data in your tree.
7) search doesn't need to be a member of TrieNode. In fact, it doesn't access any data through "this". You probably don't want the TrieNode to be public at all, but an internal implenetation detail of Trie, which is where the search function should live, where the root should be stored and managed.
8) in c++ use nullptr instead of NULL
9) Looks like you need to debug search(), because it is not on the last letter when you check for end of word.
10) you need a destructor and need to deallocate your nodes. Or store them in unique_ptr<> for automatic deletion when your object goes out of scope.
11) don't "using namespace std;" in headers. It makes your headers toxic to include in my code.
The insert and search functions are a mess.
They use rather contrived ways to check the end of the string, duplicated unnecessarily and with a bug in one of the branches.
Here are simpler versions.
They use string size for the loop bounds, and the actions needed at the end of the loop are made after the loop, which is more natural.
void TrieNode::insert(TrieNode *root, string key){
TrieNode *crawl = root;
for(int i = 0; i < (int) (key.size()); i++){
if(crawl->checkNull(key[i])){
crawl->setNode(key[i]);
}
crawl = crawl->getNode(key[i]);
}
crawl->setEndWord();
}
bool TrieNode::search(TrieNode *root, string key){
TrieNode *crawl = root;
for(int i = 0; i < (int) (key.size()); i++){
if(crawl->checkNull(key[i])){
return false;
}
crawl = crawl->getNode(key[i]);
}
return crawl->getEndWord();
}
I used the same style, but omitted the debug outputs for readability.
Also, the code did not actually use search as a function, it didn't return a value.
Instead, it relied on debug output to show the result.
This is now corrected.
A main function complementing these is as follows.
int main(){
TrieNode *root = new TrieNode();
cout<<"starting"<<endl;
root->insert(root, "hello");
cout<<"first added"<<endl;
root->insert(root, "anna");
root->insert(root, "anni");
cout<<"words added"<<endl;
cout << root->search(root, "hello") << endl; // 1
cout << root->search(root, "anny") << endl; // 0
}

Recursive method to find value in linked list

template<class Type>
int StringList<Type>::find(Type value)
{
int count = 0;
// Start of linked list
Node<Type> *current = head;
// Traverse list until end (NULL)
while (current != NULL)
{
// Increase counter if found
if (current->data == value)
{
count++;
}
// If not, move to the next node
current = current->next;
}
cout << value << " was found " << count << " times" << endl;
return 0;
// same function but using Recursive method
// Start of linked list
Node<Type> *current = head;
int count = 0;
// Thinking this is the base case, since its similar to the while loop
if (current == NULL)
{
return 0;
}
// same as the while loop, finding the value increase the count, or in this case just prints to console
if ((current->data == value))
{
cout << "Found match" << endl;
return 0;
}
else
{ // If it didnt find a match, move the list forward and call the function again
current = current->next;
return find(value);
}
}
the function is supposed to find the value searched and return how many times that certain value was in the linked list.
how can I turn the first method, which uses a while loop, into something that does the same thing but uses recursion?
For starters instead of the return type int it is better to use an unsigned type like for example size_t
You can use the following approach. Define two methods. The first one is a public non-static method find defined like
template<class Type>
size_t StringList<Type>::find( const Type &value ) const
{
return find( head, value );
}
The second one is a private static method with two parameters defined like
template<class Type>
static size_t StringList<Type>::find( Node<Type> *current, const Type &value )
{
return current == nullptr ? 0 : ( current->data == value ) + find( current->next, value );
}
In order to use recursion, you will need to change the signature of your find function (or add a new function with the different signature) to take a node pointer as a parameter:
template<class Type>
int StringList<Type>::find(Type value, Node<Type> *where)
{
if (where != nullptr)
{
// Do things
}
}
Then, when you traverse the list, you pass where->next to the function. Once you hit the end of the list, with a nullptr value, the stack unrolls.
A key aspect of recursion as that the function or method being used only has to process a single node of your container. It then calls itself with the next node to be processed until there are no more nodes. In order to make this work, that function needs the node to process as a parameter, which is where your current code runs into problems.
Keep in mind that the elegance and simplicity of recursion is not free. Every call that a method makes to itself eats up stack, so a sufficiently large container can result in a crash if the stack for your process is depleted.
how can I turn the first method, which uses a while loop, into
something that does the same thing but uses recursion?
The following would be closer to what you want. You really should provide an [MCVE] ... the lack of which forces many guesses and assumptions about your code.
// It looks like StringList is a class (I ignored template issues),
// and it appears that your class holds 'anchors' such as head
// StringList is probably the public interface.
//
// To find and count a targetValue, the code starts
// at the head node, and recurses through the node list.
// I would make the following a public method.
//
int StringList::findAndCountTargetValue(int targetValue)
{
int retVal = 0;
if (nullptr != head) // exists elements to search?
retVal = head->countTV(targetValue); // recurse the nodes
// else no match is possible
return(retVal);
}
// visit each node in the list
int Node::countTV(const int targetValue)
{
int retVal = 0; // init the count
if (data != targetValue) // no match
{
if(nullptr != next) // more elements?
retVal += next->countTV() // continue recursive count
}
else
{
std::cout << "Found match" << std::endl; // for diag only
retVal += 1; // because 1 match has been found
if(nullptr != next) // more elments
retVal += next->countTV(); // continue recursive count
}
return (retVal); // always return value from each level
}

Insertion error in Binary Search tree

void BST::insert(string word)
{
insert(buildWord(word),root);
}
//Above is the gateway insertion function that calls the function below
//in order to build the Node, then passes the Node into the insert function
//below that
Node* BST::buildWord(string word)
{
Node* newWord = new Node;
newWord->left = NULL;
newWord->right = NULL;
newWord->word = normalizeString(word);
return newWord;
}
//The normalizeString() returns a lowercase string, no problems there
void BST::insert(Node* newWord,Node* wordPntr)
{
if(wordPntr == NULL)
{
cout << "wordPntr is NULL" << endl;
wordPntr = newWord;
cout << wordPntr->word << endl;
}
else if(newWord->word.compare(wordPntr->word) < 0)
{
cout << "word alphabetized before" << endl;
insert(newWord,wordPntr->left);
}
else if(newWord->word.compare(wordPntr->word) > 0)
{
cout << "word alphabetized after" << endl;
insert(newWord, wordPntr->right);
}
else
{
delete newWord;
}
}
So my problem is this: I call the gateway insert() externally (also no problems with the inflow of data) and every time it tells me that the root, or the initial Node* is NULL. But that should only be the case before the first insert. Each time the function is called, it sticks the newWord right at the root.
To clarify: These functions are part of the BST class, and root is a Node* and a private member of BST.h
It's possible it is quite obvious, and I have just been staring too long. Any help would be appreciated.
Also, this is a school-assigned project.
Best
Like user946850 says, the variable wordPntr is a local variable, if you change it to point to something else it will not be reflected in the calling function.
There are two ways of fixing this:
The old C way, by using a pointer to a pointer:
void BST::insert(Node *newWord, Node **wordPntr)
{
// ...
*wordPntr = newWord;
// ...
}
You call it this way:
some_object.insert(newWord, &rootPntr);
Using C++ references:
void BST::insert(Node *newWord, Node *&wordPntr)
{
// Nothing here or in the caller changes
// ...
}
To help you understand this better, I suggest you read more about scope and lifetime of variables.
The assignment wordPntr = newWord; is local to the insert function, it should somehow set the root of the tree in this case.

Segfault in recursive function

I'm getting a segfault when I run this code and I'm not sure why. Commenting out a particular line (marked below) removes the segfault, which led me to believe that the recursive use of the iterator "i" may have been causing trouble, but even after changing it to a pointer I get a segfault.
void executeCommands(string inputstream, linklist<linklist<transform> > trsMetastack)
{
int * i=new int;
(*i) = 0;
while((*i)<inputstream.length())
{
string command = getCommand((*i),inputstream);
string cmd = getArguments(command,0);
//cout << getArguments(command,0) << " " << endl;
if (cmd=="translate")
{
transform trs;
trs.type=1;
trs.arguments[0]=getValue(getArguments(command,2));
trs.arguments[1]=getValue(getArguments(command,3));
((trsMetastack.top)->value).push(trs);
executeCommands(getArguments(command,1),trsMetastack);
}
if (cmd=="group")
{
//make a NEW TRANSFORMS STACK, set CURRENT stack to that one
linklist<transform> transformStack;
trsMetastack.push(transformStack);
//cout << "|" << getAllArguments(command) << "|" << endl;
executeCommands(getAllArguments(command),trsMetastack); // COMMENTING THIS LINE OUT removes the segfault
}
if (cmd=="line")
{ //POP transforms off of the whole stack/metastack conglomeration and apply them.
while ((trsMetastack.isEmpty())==0)
{
while ((((trsMetastack.top)->value).isEmpty())==0) //this pops a single _stack_ in the metastack
{ transform tBA = ((trsMetastack.top)->value).pop();
cout << tBA.type << tBA.arguments[0] << tBA.arguments[1];
}
trsMetastack.pop();
}
}
"Metastack" is a linked list of linked lists that I have to send to the function during recursion, declared as such:
linklist<transform> transformStack;
linklist<linklist<transform> > trsMetastack;
trsMetastack.push(transformStack);
executeCommands(stdinstring,trsMetastack);
The "Getallarguments" function is just meant to extract a majority of a string given it, like so:
string getAllArguments(string expr) // Gets the whole string of arguments
{
expr = expr.replace(0,1," ");
int space = expr.find_first_of(" ",1);
return expr.substr(space+1,expr.length()-space-1);
}
And here is the linked list class definition.
template <class dataclass>
struct linkm {
dataclass value; //transform object, point object, string... you name it
linkm *next;
};
template <class dataclass>
class linklist
{
public:
linklist()
{top = NULL;}
~linklist()
{}
void push(dataclass num)
{
cout << "pushed";
linkm<dataclass> *temp = new linkm<dataclass>;
temp->value = num;
temp->next = top;
top = temp;
}
dataclass pop()
{
cout << "pop"<< endl;
//if (top == NULL) {return dataclass obj;}
linkm<dataclass> * temp;
temp = top;
dataclass value;
value = temp->value;
top = temp->next;
delete temp;
return value;
}
bool isEmpty()
{
if (top == NULL)
return 1;
return 0;
}
// private:
linkm<dataclass> *top;
};
Thanks for taking the time to read this. I know the problem is vague but I just spent the last hour trying to debug this with gdb, I honestly dunno what it could be.
It could be anything, but my wild guess is, ironically: stack overflow.
You might want to try passing your data structures around as references, e.g.:
void executeCommands(string &inputstream, linklist<linklist<transform> > &trsMetastack)
But as Vlad has pointed out, you might want to get familiar with gdb.

C++ Decision Tree Implementation Question: Think In Code

I've been coding for a few years but I still haven't gotten the hang of pseudo-coding or actually thinking things out in code yet. Due to this problem, I'm having trouble figuring out exactly what to do in creating a learning Decision Tree.
Here are a few sites that I have looked at and trust me there were plenty more
Decision Tree Tutorials
DMS Tutorials
Along with several books such as Ian Millington's AI for Games which includes a decent run-down of the different learning algorithms used in decision trees and Behavioral Mathematics for Game Programming which is basically all about Decision Trees and theory. I understand the concepts for a decision tree along with Entropy, ID3 and a little on how to intertwine a genetic algorithm and have a decision tree decide the nodes for the GA.
They give good insight, but not what I am looking for really.
I do have some basic code that creates the nodes for the decision tree, and I believe I know how to implement actual logic but it's no use if I don't have a purpose to the program or have entropy or a learning algorithm involved.
What I am asking is, can someone help me figure out what I need to do to create this learning decision tree. I have my nodes in a class of their own flowing through functions to create the tree, but how would I put entropy into this and should it have a class, a struct I'm not sure how to put it together. Pseudo-code and an Idea of where I am going with all this theory and numbers. I can put the code together if only I knew what I needed to code. Any guidance would be appreciated.
How Would I Go About This, Basically.
Adding a learning algorithm such as ID3 and Entropy. How should it be set up?
Once I do have this figured out on how to go about all this,I plan to implement this into a state machine that goes through different states in a game/simulation format. All of this is already set up, I just figure this could be stand-alone and once I figure it out I can just move it to the other project.
Here is the source code I have for now.
Thanks in advance!
Main.cpp:
int main()
{
//create the new decision tree object
DecisionTree* NewTree = new DecisionTree();
//add root node the very first 'Question' or decision to be made
//is monster health greater than player health?
NewTree->CreateRootNode(1);
//add nodes depending on decisions
//2nd decision to be made
//is monster strength greater than player strength?
NewTree->AddNode1(1, 2);
//3rd decision
//is the monster closer than home base?
NewTree->AddNode2(1, 3);
//depending on the weights of all three decisions, will return certain node result
//results!
//Run, Attack,
NewTree->AddNode1(2, 4);
NewTree->AddNode2(2, 5);
NewTree->AddNode1(3, 6);
NewTree->AddNode2(3, 7);
//Others: Run to Base ++ Strength, Surrender Monster/Player,
//needs to be made recursive, that way when strength++ it affects decisions second time around DT
//display information after creating all the nodes
//display the entire tree, i want to make it look like the actual diagram!
NewTree->Output();
//ask/answer question decision making process
NewTree->Query();
cout << "Decision Made. Press Any Key To Quit." << endl;
//pause quit, oh wait how did you do that again...look it up and put here
//release memory!
delete NewTree;
//return random value
//return 1;
}
Decision Tree.h:
//the decision tree class
class DecisionTree
{
public:
//functions
void RemoveNode(TreeNodes* node);
void DisplayTree(TreeNodes* CurrentNode);
void Output();
void Query();
void QueryTree(TreeNodes* rootNode);
void AddNode1(int ExistingNodeID, int NewNodeID);
void AddNode2(int ExistingNodeID, int NewNodeID);
void CreateRootNode(int NodeID);
void MakeDecision(TreeNodes* node);
bool SearchAddNode1(TreeNodes* CurrentNode, int ExistingNodeID, int NewNodeID);
bool SearchAddNode2(TreeNodes* CurrentNode, int ExistingNodeID, int NewNodeID);
TreeNodes* m_RootNode;
DecisionTree();
virtual ~DecisionTree();
};
Decisions.cpp:
int random(int upperLimit);
//for random variables that will effect decisions/node values/weights
int random(int upperLimit)
{
int randNum = rand() % upperLimit;
return randNum;
}
//constructor
//Step 1!
DecisionTree::DecisionTree()
{
//set root node to null on tree creation
//beginning of tree creation
m_RootNode = NULL;
}
//destructor
//Final Step in a sense
DecisionTree::~DecisionTree()
{
RemoveNode(m_RootNode);
}
//Step 2!
void DecisionTree::CreateRootNode(int NodeID)
{
//create root node with specific ID
// In MO, you may want to use thestatic creation of IDs like with entities. depends on how many nodes you plan to have
//or have instantaneously created nodes/changing nodes
m_RootNode = new TreeNodes(NodeID);
}
//Step 5.1!~
void DecisionTree::AddNode1(int ExistingNodeID, int NewNodeID)
{
//check to make sure you have a root node. can't add another node without a root node
if(m_RootNode == NULL)
{
cout << "ERROR - No Root Node";
return;
}
if(SearchAddNode1(m_RootNode, ExistingNodeID, NewNodeID))
{
cout << "Added Node Type1 With ID " << NewNodeID << " onto Branch Level " << ExistingNodeID << endl;
}
else
{
//check
cout << "Node: " << ExistingNodeID << " Not Found.";
}
}
//Step 6.1!~ search and add new node to current node
bool DecisionTree::SearchAddNode1(TreeNodes *CurrentNode, int ExistingNodeID, int NewNodeID)
{
//if there is a node
if(CurrentNode->m_NodeID == ExistingNodeID)
{
//create the node
if(CurrentNode->NewBranch1 == NULL)
{
CurrentNode->NewBranch1 = new TreeNodes(NewNodeID);
}
else
{
CurrentNode->NewBranch1 = new TreeNodes(NewNodeID);
}
return true;
}
else
{
//try branch if it exists
//for a third, add another one of these too!
if(CurrentNode->NewBranch1 != NULL)
{
if(SearchAddNode1(CurrentNode->NewBranch1, ExistingNodeID, NewNodeID))
{
return true;
}
else
{
//try second branch if it exists
if(CurrentNode->NewBranch2 != NULL)
{
return(SearchAddNode2(CurrentNode->NewBranch2, ExistingNodeID, NewNodeID));
}
else
{
return false;
}
}
}
return false;
}
}
//Step 5.2!~ does same thing as node 1. if you wanted to have more decisions,
//create a node 3 which would be the same as this maybe with small differences
void DecisionTree::AddNode2(int ExistingNodeID, int NewNodeID)
{
if(m_RootNode == NULL)
{
cout << "ERROR - No Root Node";
}
if(SearchAddNode2(m_RootNode, ExistingNodeID, NewNodeID))
{
cout << "Added Node Type2 With ID " << NewNodeID << " onto Branch Level " << ExistingNodeID << endl;
}
else
{
cout << "Node: " << ExistingNodeID << " Not Found.";
}
}
//Step 6.2!~ search and add new node to current node
//as stated earlier, make one for 3rd node if there was meant to be one
bool DecisionTree::SearchAddNode2(TreeNodes *CurrentNode, int ExistingNodeID, int NewNodeID)
{
if(CurrentNode->m_NodeID == ExistingNodeID)
{
//create the node
if(CurrentNode->NewBranch2 == NULL)
{
CurrentNode->NewBranch2 = new TreeNodes(NewNodeID);
}
else
{
CurrentNode->NewBranch2 = new TreeNodes(NewNodeID);
}
return true;
}
else
{
//try branch if it exists
if(CurrentNode->NewBranch1 != NULL)
{
if(SearchAddNode2(CurrentNode->NewBranch1, ExistingNodeID, NewNodeID))
{
return true;
}
else
{
//try second branch if it exists
if(CurrentNode->NewBranch2 != NULL)
{
return(SearchAddNode2(CurrentNode->NewBranch2, ExistingNodeID, NewNodeID));
}
else
{
return false;
}
}
}
return false;
}
}
//Step 11
void DecisionTree::QueryTree(TreeNodes* CurrentNode)
{
if(CurrentNode->NewBranch1 == NULL)
{
//if both branches are null, tree is at a decision outcome state
if(CurrentNode->NewBranch2 == NULL)
{
//output decision 'question'
///////////////////////////////////////////////////////////////////////////////////////
}
else
{
cout << "Missing Branch 1";
}
return;
}
if(CurrentNode->NewBranch2 == NULL)
{
cout << "Missing Branch 2";
return;
}
//otherwise test decisions at current node
MakeDecision(CurrentNode);
}
//Step 10
void DecisionTree::Query()
{
QueryTree(m_RootNode);
}
////////////////////////////////////////////////////////////
//debate decisions create new function for decision logic
// cout << node->stringforquestion;
//Step 12
void DecisionTree::MakeDecision(TreeNodes *node)
{
//should I declare variables here or inside of decisions.h
int PHealth;
int MHealth;
int PStrength;
int MStrength;
int DistanceFBase;
int DistanceFMonster;
////sets random!
srand(time(NULL));
//randomly create the numbers for health, strength and distance for each variable
PHealth = random(60);
MHealth = random(60);
PStrength = random(50);
MStrength = random(50);
DistanceFBase = random(75);
DistanceFMonster = random(75);
//the decision to be made string example: Player health: Monster Health: player health is lower/higher
cout << "Player Health: " << PHealth << endl;
cout << "Monster Health: " << MHealth << endl;
cout << "Player Strength: " << PStrength << endl;
cout << "Monster Strength: " << MStrength << endl;
cout << "Distance Player is From Base: " << DistanceFBase << endl;
cout << "Disntace Player is From Monster: " << DistanceFMonster << endl;
//MH > PH
//MH < PH
//PS > MS
//PS > MS
//DB > DM
//DB < DM
//good place to break off into different decision nodes, not just 'binary'
//if statement lower/higher query respective branch
if(PHealth > MHealth)
{
}
else
{
}
//re-do question for next branch. Player strength: Monster strength: Player strength is lower/higher
//if statement lower/higher query respective branch
if(PStrength > MStrength)
{
}
else
{
}
//recursive question for next branch. Player distance from base/monster.
if(DistanceFBase > DistanceFMonster)
{
}
else
{
}
//DECISION WOULD BE MADE
//if statement?
// inside query output decision?
//cout << <<
//QueryTree(node->NewBranch2);
//MakeDecision(node);
}
//Step.....8ish?
void DecisionTree::Output()
{
//take repsective node
DisplayTree(m_RootNode);
}
//Step 9
void DecisionTree::DisplayTree(TreeNodes* CurrentNode)
{
//if it doesn't exist, don't display of course
if(CurrentNode == NULL)
{
return;
}
//////////////////////////////////////////////////////////////////////////////////////////////////
//need to make a string to display for each branch
cout << "Node ID " << CurrentNode->m_NodeID << "Decision Display: " << endl;
//go down branch 1
DisplayTree(CurrentNode->NewBranch1);
//go down branch 2
DisplayTree(CurrentNode->NewBranch2);
}
//Final step at least in this case. A way to Delete node from tree. Can't think of a way to use it yet but i know it's needed
void DecisionTree::RemoveNode(TreeNodes *node)
{
//could probably even make it to where you delete a specific node by using it's ID
if(node != NULL)
{
if(node->NewBranch1 != NULL)
{
RemoveNode(node->NewBranch1);
}
if(node->NewBranch2 != NULL)
{
RemoveNode(node->NewBranch2);
}
cout << "Deleting Node" << node->m_NodeID << endl;
//delete node from memory
delete node;
//reset node
node = NULL;
}
}
TreeNodes.h:
using namespace std;
//tree node class
class TreeNodes
{
public:
//tree node functions
TreeNodes(int nodeID/*, string QA*/);
TreeNodes();
virtual ~TreeNodes();
int m_NodeID;
TreeNodes* NewBranch1;
TreeNodes* NewBranch2;
};
TreeNodes.cpp:
//contrctor
TreeNodes::TreeNodes()
{
NewBranch1 = NULL;
NewBranch2 = NULL;
m_NodeID = 0;
}
//deconstructor
TreeNodes::~TreeNodes()
{ }
//Step 3! Also step 7 hah!
TreeNodes::TreeNodes(int nodeID/*, string NQA*/)
{
//create tree node with a specific node ID
m_NodeID = nodeID;
//reset nodes/make sure! that they are null. I wont have any funny business #s -_-
NewBranch1 = NULL;
NewBranch2 = NULL;
}
Please correct me if I'm wrong but judging from the images at http://dms.irb.hr/tutorial/tut_dtrees.php and http://www.decisiontrees.net/?q=node/21 the actual decision logic should go in the nodes, not in the tree. You could model that by having polymorphic nodes, one for each decision there is to be made. With a bit of change to the tree construction and a minor minor modification for the decision delegation your code should be fine.
Basically you need to break everything down into stages and then modularise each part of the algorithm that you are trying to implement.
You can do this in pseudo-code or in code itself using functions/classes and stubs.
Each part of the algorithm you can then code up in some function and even test that function before adding it all together. You will basically end up with various functions or classes that are used for specific purposes in the algorithm implementation. So in your case for tree construction you will have one function that calculates entropy at a node, another that partitions the data into subsets at each node, etc.
I am talking in the general case here rather than specifically with respect to decision tree construction. Check out the book on Machine Learning by Mitchell if you need specific information on decision tree algorithms and the steps involved.
The pseudocode to implement a decision tree is as follows
createdecisiontree(data, attributes)
Select the attribute a with the highest information gain
for each value v of the attribute a
Create subset of data where data.a.val==v ( call it data2)
Remove the attribute a from the attribute list resulting in attribute_list2
Call CreateDecisionTree(data2, attribute_list2)
You will have to store nodes at each level with some code like
decisiontree[attr][val]=new_node