Entire program: http://pastebin.com/x7gfSY5v
I made a parser that takes lines from a .txt file and places them in a binary search tree based on their value. The ID is the first string; the values are the following strings, separated by /.
Example txt:
Abcd/foo/bar// #ID/value/value2
Abce/foo2// #ID/value
The first line would call:
//parser
SequenceMap a; #has id and value member variables
a.writeAcronym(id); #Abcd
a.writeValue(value); #foo
insertNode(a, root); #Abcd/foo put into tree
a.writeValue(value2); #bar
insertNode(a, root); #Abcd/bar put into tree
Creating the tree itself doesn't cause the crash, and it works as it should. It's only when I try to access the minimum value of the tree that it crashes. It still shows the correct minimum value though.
Here's the function that gets the minimum value,
I doubt that this is the root cause of the problem.
But running this in main() crashes my program:
string printMin(Node * x)
{
if (x == nullptr)
return nullptr; //this isn't ever called, I parse the tree before calling this function
if (x->left == nullptr)
return x -> sqmap.getAcronym(); //returns correct value (end of program) then crash
else
printMin(x->left);
}
void printMain(){
cout << printMin(root) << endl;
}
main():
a.parse("foo.txt"); //doesn't crash
a.printMain(); //crashes when called
//but doesn't crash if I remove both a.writeValue(value) from parser
Here is the insert/parser:
void parse(string file) //uses delimiter to split lines
{
string line, id, value, value2;
ifstream myfile(file);
if (myfile.is_open())
{
while(getline(myfile, line))
{
if (!line.empty())
{
SequenceMap a;
istringstream is(line);
getline(is, id, '/');
a.writeAcronym(id);
getline(is,value, '/');
a.writeValue(value); //program doesn't crash if I remove this
insertNode(a, root);
getline(is,value2,'/');
if (value2 != "")
{
a.writeValue(value2); //program doesn't crash if I remove this
insertNode(a, root);
}
}
}
myfile.close();
}
else
cout << "Could not open file.";
}
/*****************************************************/
void insertNode(SequenceMap & sq, Node * & x)
{
if (x == nullptr)
x = new Node(sq, nullptr, nullptr);
else if (sq.getValue() < x->sqmap.getValue())
insertNode(sq, x->left);
else if (sq.getValue() > x->sqmap.getValue())
insertNode(sq, x->right);
else if (sq.getValue() == x->sqmap.getValue())
x->sqmap.Merge(sq);
}
I'm seriously stuck right now; my only guess is that it's a memory bug. Everything works as it should, and I do get the correct minimum value. But right after the last line of code ends, I get the crash. I made a destructor for the tree but that didn't solve it. Can anyone figure this out?
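printMin doesn't return a value when it recurses into the left subtree: the else branch calls printMin(x->left) but discards the result, so control falls off the end of a non-void function. That is undefined behavior, and the caller gets garbage instead of a valid string. Change that line to return printMin(x->left);. If you compile with the warnings all the way up (for example -Wall on GCC or Clang), you should get a warning about that. Note also that return nullptr; in a function returning string is invalid if it is ever reached; return an empty string instead.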
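A minimal sketch of a corrected version, assuming a simplified Node that stores the acronym directly. The Node layout and the name minAcronym are illustrative and not taken from the original program. Every path returns a string, and the recursive result is passed back up to the caller:

```cpp
#include <string>

// Simplified stand-in for the question's Node (assumption); the real one holds a SequenceMap.
struct Node {
    std::string acronym;
    Node* left = nullptr;
    Node* right = nullptr;
};

// Returns the acronym stored in the leftmost node, or "" for an empty tree.
std::string minAcronym(const Node* x) {
    if (x == nullptr)
        return std::string();      // never build a std::string from nullptr
    if (x->left == nullptr)
        return x->acronym;         // leftmost node reached
    return minAcronym(x->left);    // propagate the recursive result
}
```

Calling minAcronym(root) on an empty tree then yields an empty string instead of invoking undefined behavior.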
Related
I'm in the middle of making a binary search tree that stores Items of type MechPart, which stores an int quantity and a string code. The MechParts are generated by reading from a text file and storing their data. A separate text file called MonthlyUpdate.txt is used to read a list of MechParts in the tree and then update their quantities. For example:
MechPart A0001's quantity = 12
MonthlyUpdate.txt says A0001's quantity = 6
Run an update function that finds A0001 in the tree
Replace it with the updated quantity value of 6 (12 - 6).
Here are the two functions that perform this task:
void DBInterface::updateFromFile(string f_Name)
{
ifstream file (f_Name.c_str());
string line;
MechPart tmp_mp;
if (file.is_open())
{
std::getline(file, line);
while (std::getline (file, line))
{
std::istringstream iss (line);
int q=0;
int pos=0;
pos = line.find('\t',0); //find position of the tab separator
string tmp_str = line.substr(0,pos); //create a substring
string tmp_str1 = line.substr((pos+1), string::npos);
stringstream ss (tmp_str1);
ss >> q;
tmp_mp.set_code(tmp_str); //set code
tmp_mp.set_quantity(q);
MechPart currentQuantity;
currentQuantity = tree.quantitySearch(tree.getRoot(), tmp_mp);
tmp_mp.set_quantity((currentQuantity.get_quantity()) + q);
tree.update(tree.getRoot(), tmp_mp);
cout << "Current node data: " << tmp_mp.get_code() << " | " << tmp_mp.get_quantity() << endl;
}
}
}
and BSTree.template:
template <typename Item>
Item BSTree<Item>::quantitySearch(BTNode<Item>* q_ptr, Item obj)
{
if (q_ptr == NULL)
{
//POINTER IS NULL
}
else if (q_ptr->data() == obj)
{
return q_ptr->data();
}
else if (obj > q_ptr->data())
{ //WORK ON RIGHT SIDE
quantitySearch(q_ptr->get_right(), obj);
}
else
{
//work on left side
quantitySearch(q_ptr->get_left(), obj);
}
}
The search goes through the tree and locates a MechPart with the same part name code as the parameter and then returns that MechPart.
I've been running the code through the GDB debugger. I have it displaying currentQuantity.get_quantity() to validate that the returned MechPart's quantity is correct; however, I am getting very large numbers for some reason. What also confuses me is that the MechPart constructor assigns a value of 0 to quantity.
Eventually the updateFromFile() function gives me a segmentation fault, so something is very wrong here but I can't work out what as yet.
Recursive functions need to return their recursive calls back up to their caller for them to work properly. Look at the classic factorial example of recursion:
int factorial(int n) {
if (n <= 1) {
return 1;
}
else {
return n*factorial(n-1);
}
}
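As others have pointed out, your quantitySearch function only returns q_ptr->data() but never returns the value from the recursive quantitySearch calls, and it returns nothing at all when q_ptr is NULL. Falling off the end of a non-void function like this is undefined behavior, which would explain the garbage quantities. I would start there, and I would strongly suggest adding cout statements in the recursive function to get a complete picture of what's happening "under the hood".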
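A possible corrected search, as a sketch only. The BTNode below is a hypothetical minimal stand-in that mimics the data(), get_left() and get_right() accessors used in the question, and returning a default-constructed Item for "not found" is an assumed convention, not something the original code specifies:

```cpp
// Hypothetical minimal node mirroring the accessors used in the question.
template <typename Item>
struct BTNode {
    Item value;
    BTNode* left;
    BTNode* right;
    const Item& data() const { return value; }
    BTNode* get_left() const { return left; }
    BTNode* get_right() const { return right; }
};

// Returns the matching item, or a default-constructed Item if none is found.
template <typename Item>
Item quantitySearch(const BTNode<Item>* q_ptr, const Item& obj) {
    if (q_ptr == nullptr)
        return Item();                                   // not found (assumed convention)
    if (q_ptr->data() == obj)
        return q_ptr->data();
    if (obj > q_ptr->data())
        return quantitySearch(q_ptr->get_right(), obj);  // return the recursive result
    return quantitySearch(q_ptr->get_left(), obj);
}
```

With every branch returning a value, the caller can no longer receive an uninitialized object. If "not found" must be distinguishable from a real part with default values, returning a pointer to the node instead is one alternative.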
I have created a binary search tree in C++ and have loaded it with two types of data, strings and ints. I am reading a text file and loading the tree alphabetically with the words I pull, along with the number of the line each word is found on. I am able to print the words and the numbers just fine. What I want to do now is check whether a word has already been printed, and if it has, print only the number of the line on which the word is found. My plan is to compare against the previous node's data as the tree is traversed and printed. This is my print function.
void inOrderPrint(Node *rootPtr ) {
if ( rootPtr != NULL ) {
for (int i =0; rootPtr->data[i]; i++){
while(ispunct(rootPtr->data[i]))
rootPtr->data.erase(i,1);
}
rootPtr->data = rootPtr->data.substr(0,10);
inOrderPrint( rootPtr->left );
cout << (rootPtr->data)<<rootPtr->lineNum <<endl;
inOrderPrint( rootPtr->right );
}
}
This is what I was thinking:
if (rootPtr->data == previous rootPtr->data)
cout<<setw(10)<<theCurrentNode lineNum;
else
do normal printing
I think that if this function were to run on the first node and compare it to the nonexistent previous node, it would automatically try to compare it to NULL, the if statement would evaluate to false, and it would move on to the else.
Any suggestions on how to go about doing this with actual C++ syntax? Or does anyone see a flaw in my logic?
Thanks in advance!
This answer describes how to make the program print each unique entry with the line number of its first occurrence in the file. For each duplicate occurrence it prints only the line number of the first occurrence. The approach is to make sure that there are no duplicate nodes in the tree and to count redundant occurrences.
To do this we might modify the node structure as follows:
struct Node{
string data;
int lineNum;
int count =1;
Node* left;
Node* right;
};
The function Insert might be edited to count duplicates like this:
Node* Insert(Node* rootPtr,string data,int lineNum){
if(rootPtr == NULL){
rootPtr = GetNewNode(data,lineNum);
for (int i =0; rootPtr->data[i]; i++){
while(ispunct(rootPtr->data[i]))
rootPtr->data.erase(i,1);
}
rootPtr->data = rootPtr->data.substr(0,10);
return rootPtr;
}
else if(data< rootPtr->data){
rootPtr->left = Insert(rootPtr->left,data,lineNum);
for (int i =0; rootPtr->data[i]; i++){
while(ispunct(rootPtr->data[i]))
rootPtr->data.erase(i,1);
}
rootPtr->data = rootPtr->data.substr(0,10);
}
else if(data > rootPtr->data) {
rootPtr->right = Insert(rootPtr->right,data,lineNum);
for (int i =0; rootPtr->data[i]; i++){
while(ispunct(rootPtr->data[i]))
rootPtr->data.erase(i,1);
}
rootPtr->data = rootPtr->data.substr(0,10);
}
else if(data == rootPtr->data)
++rootPtr->count;
return rootPtr;
}
Finally the print function can be modified:
void inOrderPrint(Node *rootPtr ) {
//ofstream outputFile;
//outputFile.open("Output.txt");
if ( rootPtr != NULL ) {
inOrderPrint( rootPtr->left );
cout << (rootPtr->data)<<" " << rootPtr->lineNum <<endl;
int j =rootPtr->count;
while( --j )
cout << rootPtr->lineNum <<endl;
//outputFile << (rootPtr->data)<<rootPtr->lineNum <<endl;
inOrderPrint( rootPtr->right );
}
}
Now this should be much closer to what you want. It would also be a good idea to separate the text processing from the node processing. (This answer sort of assumes that you will take care of that.) Otherwise duplicate nodes will be created if the preprocessed text does not match the processed text.
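One way to separate the text processing from the node processing, as a rough sketch. The helper names normalizeWord and InsertNormalized are made up for illustration; the Node layout follows the structure shown earlier in this answer. The word is cleaned once, before any comparison, so the tree only ever compares processed text with processed text:

```cpp
#include <cctype>
#include <string>

struct Node {
    std::string data;
    int lineNum;
    int count;
    Node* left;
    Node* right;
};

// Strips punctuation and keeps at most 10 characters, like the original loops.
std::string normalizeWord(const std::string& raw) {
    std::string out;
    for (unsigned char ch : raw)
        if (!std::ispunct(ch))
            out += static_cast<char>(ch);
    return out.substr(0, 10);
}

// Inserts an already-normalized word; duplicates only bump the counter.
Node* InsertNormalized(Node* rootPtr, const std::string& word, int lineNum) {
    if (rootPtr == nullptr)
        return new Node{word, lineNum, 1, nullptr, nullptr};
    if (word < rootPtr->data)
        rootPtr->left = InsertNormalized(rootPtr->left, word, lineNum);
    else if (word > rootPtr->data)
        rootPtr->right = InsertNormalized(rootPtr->right, word, lineNum);
    else
        ++rootPtr->count;   // duplicate: keep the first line number
    return rootPtr;
}

// Entry point: normalize once, then insert.
Node* Insert(Node* rootPtr, const std::string& raw, int lineNum) {
    return InsertNormalized(rootPtr, normalizeWord(raw), lineNum);
}
```

With this split, "hello," and "hello" end up in the same node, and the print function no longer needs to touch the text at all.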
Good luck!
#include<iostream>
#include<windows.h>
#include<string>
#include<fstream>
using namespace std;
class linklist //linked list class
{
struct main_node;
struct sub_node;
struct main_node // main node that only have head pointers in it
{
sub_node *head;
main_node()
{ head=NULL; }
};
main_node array[26];
struct sub_node
{
double frequency;
string word;
sub_node *next;
sub_node()
{ frequency=1; word=""; next=NULL; }
};
public:
void add_node(string phrase)
{
char alphabat1=phrase[0];
if(isupper(alphabat1))
{
alphabat1=tolower(alphabat1);
}
if(!isalpha(alphabat1))
return;
sub_node*temp = new sub_node;
temp->word = phrase;
sub_node*current = array[alphabat1-97].head;
if(current == NULL)
array[alphabat1-97].head = temp;
else
{
while(current -> next != NULL && phrase != current-> word)
{ current= current->next; }
if(current->word == phrase)
current->frequency++;
else
current->next = temp; //adding words to linklist
}
}
void display()
{
for(int i=0;i<26;i++)
{
sub_node *temp=array[i].head;
cout<<char(i+97)<<" -> ";
while(temp!=NULL)
{
cout<<temp->word<<" ("<<temp->frequency<<") ";
temp=temp->next;
}
cout<<"\n";
}
}
void parsing_documents(char *path)
{
char token[100];
ifstream read;
read.open(path);
do
{
read>>token; // parsing words
add_node(token); //sending words to linked list
}
while(!read.eof());
read.clear();
read.close();
}
void reading_directory()
{
// code to read multiple files
HANDLE hFile; // Handle to file
WIN32_FIND_DATA FileInformation; // File information
char tempPattern[90];
strcpy(tempPattern,"*.txt");
hFile = ::FindFirstFile(tempPattern, &FileInformation);
long count=0;
if(hFile != INVALID_HANDLE_VALUE)
{
do
{
count++;
cout<<"."<<count;
this->parsing_documents( FileInformation.cFileName);
}
while(TRUE == ::FindNextFile(hFile, &FileInformation));
}
::FindClose(hFile);
}
};
void main()
{
linklist member;
member.reading_directory();
member.display();
}
I am working on a project in which I have to read more than 50,000 text files, parse their words, and save them in a linked list in sorted order. I have written the code in C++ (shown above). It works quite efficiently, but I have one problem: it does not read all the files correctly, sometimes only 3000, sometimes 4000. I have searched a lot but could not find my fault.
If anybody can help me with this, I would be very thankful.
!read.eof() only checks for end of file, not for errors reading the file, such as a network-mounted file system not being ready, a disk error, or lack of permission to read the file. You should check for all failures with while(read), which uses the stream's overloaded conversion operator to check everything for you, so that if the file fails you stop trying to read from it. You should also check the status before trying to read from the file. As such, while(read) { ... } is preferable to the do/while loop, and while(read >> token) { ... } is better still, because the body then runs only after a successful read. After the loop, you might issue a warning or error to the user if you did not reach the end of file (!read.eof()) so they can investigate that specific file.
Try to avoid char * and char [] as much as possible, as they are highly error prone. You have a char[100]. What happens if a word is 100 characters or longer? read >> token may overwrite the stack, for example damaging the ifstream read. A std::string grows as needed and avoids this.
Consider using std::list<sub_node> to avoid having to re-invent and re-debug the wheel. You would no longer need the next pointer, as std::list already does that for you. This would leave far less code to debug.
void BST::insert(string word)
{
insert(buildWord(word),root);
}
//Above is the gateway insertion function that calls the function below
//in order to build the Node, then passes the Node into the insert function
//below that
Node* BST::buildWord(string word)
{
Node* newWord = new Node;
newWord->left = NULL;
newWord->right = NULL;
newWord->word = normalizeString(word);
return newWord;
}
//The normalizeString() returns a lowercase string, no problems there
void BST::insert(Node* newWord,Node* wordPntr)
{
if(wordPntr == NULL)
{
cout << "wordPntr is NULL" << endl;
wordPntr = newWord;
cout << wordPntr->word << endl;
}
else if(newWord->word.compare(wordPntr->word) < 0)
{
cout << "word alphabetized before" << endl;
insert(newWord,wordPntr->left);
}
else if(newWord->word.compare(wordPntr->word) > 0)
{
cout << "word alphabetized after" << endl;
insert(newWord, wordPntr->right);
}
else
{
delete newWord;
}
}
So my problem is this: I call the gateway insert() externally (also no problems with the inflow of data) and every time it tells me that the root, or the initial Node* is NULL. But that should only be the case before the first insert. Each time the function is called, it sticks the newWord right at the root.
To clarify: These functions are part of the BST class, and root is a Node* and a private member of BST.h
It's possible it is quite obvious, and I have just been staring too long. Any help would be appreciated.
Also, this is a school-assigned project.
Best
Like user946850 says, the variable wordPntr is a local variable; if you change it to point to something else, the change will not be reflected in the calling function.
There are two ways of fixing this:
The old C way, by using a pointer to a pointer:
void BST::insert(Node *newWord, Node **wordPntr)
{
// ...
*wordPntr = newWord;
// ...
}
You call it this way:
some_object.insert(newWord, &rootPntr);
Using C++ references:
void BST::insert(Node *newWord, Node *&wordPntr)
{
// Nothing here or in the caller changes
// ...
}
To help you understand this better, I suggest you read more about scope and lifetime of variables.
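A self-contained sketch of the reference version, written as a free function with a simplified Node (both are assumptions for illustration; in the question, insert is a member of BST). Because wordPntr is a reference to the caller's pointer, the assignment changes root, or the parent's left or right pointer, rather than a local copy:

```cpp
#include <string>

struct Node {
    std::string word;
    Node* left = nullptr;
    Node* right = nullptr;
};

// wordPntr refers to the caller's own pointer (the root, or some node's
// left/right member), so assigning to it updates the tree itself.
void insert(Node* newWord, Node*& wordPntr) {
    if (wordPntr == nullptr)
        wordPntr = newWord;
    else if (newWord->word < wordPntr->word)
        insert(newWord, wordPntr->left);
    else if (newWord->word > wordPntr->word)
        insert(newWord, wordPntr->right);
    else
        delete newWord;   // duplicate word: discard it, as in the question
}
```

The recursive calls pass wordPntr->left and wordPntr->right directly, which bind to the reference parameter, so no call site needs to change.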
The assignment wordPntr = newWord; is local to the insert function; it should somehow set the root of the tree in this case.
The problem appears in the insert function that I wrote.
Three cases must work. I tested inserting between 1 and 2, between 2 and 3, and as the last element, and they worked.
EDIT:
It was my own problem. I did not realize I had put MAXINPUT = 3 (instead of 4). I do appreciate all the efforts to help me become a better programmer, using more advanced and more concise features of C++.
Basically, the problem has been solved.
Efficiency is not my concern here (not yet). Please guide me through this debug process.
Thank you very much.
#include<iostream>
#include<string>
using namespace std;
struct List // we create a structure called List
{
string name;
string tele;
List *nextAddr;
};
void populate(List *);
void display(List *);
void insert(List *);
int main()
{
const int MAXINPUT = 3;
char ans;
List * data, * current, * point; // create three pointers
data = new List;
current = data;
for (int i = 0; i < (MAXINPUT - 1); i++)
{
populate(current);
current->nextAddr = new List;
current = current->nextAddr;
}
// last record we want to do it separately
populate(current);
current->nextAddr = NULL;
cout << "The current list consists of the following data records: " << endl;
display(data);
// now ask whether user wants to insert new record or not
cout << "Do you want to add a new record (Y/N)?";
cin >> ans;
if (ans == 'Y' || ans == 'y')
{
/*
To insert b/w first and second, use point as parameter
between second and third uses point->nextAddr
between third and fourth uses point->nextAddr->nextAddr
and insert as last element, uses current instead
*/
point = data;
insert(());
display(data);
}
return 0;
}
void populate(List *data)
{
cout << "Enter a name: ";
cin >> data->name;
cout << "Enter a phone number: ";
cin >> data->tele;
return;
}
void display(List *content)
{
while (content != NULL)
{
cout << content->name << " " << content->tele;
content = content->nextAddr;
cout << endl; // we skip to next line
}
return;
}
void insert(List *last)
{
List * temp = last->nextAddr; //save the next address to temp
last->nextAddr = new List; // now modify the address pointed to new allocation
last = last->nextAddr;
populate(last);
last->nextAddr = temp; // now link all three together, eg 1-NEW-2
return;
}
Your code works fine on my machine (once the insert(()) statement is "filled in" properly as explained in the code comment). The insertion works in all positions.
Something else, though: I initially had a look at your insert function. I thought I'd give you a hint on how to make it a little shorter and easier to understand what's going on:
void insert(List *last)
{
// create a new item and populate it:
List* new_item = new List;
populate(new_item);
// insert it between 'last' and the item succeeding 'last':
new_item->nextAddr = last->nextAddr;
last->nextAddr = new_item;
}
This would be preferable because it first creates a new, separate item, prepares it for insertion, and only then, when this has worked successfully, does the function "mess" with the linked list. That is, the linked list is not affected except in the very last statement, making your function "safer". Contrast this with your version of insert, where you mix code for constructing the new item with the actual insertion. If something goes wrong inside this function, chances are far higher that the linked list is messed up, too.
(What's still missing, by the way, is an initial check whether the passed argument last is actually valid, i.e. not a null pointer.)
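The missing null check could be added along these lines. This is a sketch under assumptions: the function is renamed insertAfter and takes the record's fields as parameters instead of calling populate, purely so the example does not depend on console input.

```cpp
#include <string>

struct List {
    std::string name;
    std::string tele;
    List* nextAddr;
};

// Inserts a new record after 'last'.
// Returns false and leaves the list untouched if 'last' is a null pointer.
bool insertAfter(List* last, const std::string& name, const std::string& tele) {
    if (last == nullptr)
        return false;
    // Build the new item completely before touching the list.
    List* new_item = new List{name, tele, last->nextAddr};
    last->nextAddr = new_item;   // the only statement that modifies the list
    return true;
}
```

The boolean result lets the caller decide how to report an invalid position instead of crashing on a null dereference.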
P.S.: Of course you could just use a standard C++ std::list container instead of building your own linked list, but seeing that you tagged your question beginner, I assume you want to learn how it actually works.
Step one should be to make the list into an object instead of just keeping a bunch of pointers around in main(). You want an object called List that knows about its own first (and maybe last) elements. It should also have methods like List.append() and List.insert().
Your current code is nigh unreadable.
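To make the suggestion concrete, here is one rough sketch of such a list object. The class name PhoneList and the members size and nameAt are invented for illustration (the name differs from List only to avoid clashing with the question's struct). Copying is disabled to keep ownership simple.

```cpp
#include <cstddef>
#include <string>

class PhoneList {
    struct Node { std::string name, tele; Node* next; };
    Node* first_ = nullptr;
    Node* last_ = nullptr;
    std::size_t size_ = 0;
public:
    PhoneList() = default;
    PhoneList(const PhoneList&) = delete;
    PhoneList& operator=(const PhoneList&) = delete;
    ~PhoneList() {
        while (first_) { Node* n = first_->next; delete first_; first_ = n; }
    }
    // Adds a record at the end of the list.
    void append(const std::string& name, const std::string& tele) {
        Node* n = new Node{name, tele, nullptr};
        if (last_) last_->next = n; else first_ = n;
        last_ = n;
        ++size_;
    }
    // Inserts before position pos (0 = front); positions past the end append.
    void insert(std::size_t pos, const std::string& name, const std::string& tele) {
        if (pos >= size_) { append(name, tele); return; }
        if (pos == 0) { first_ = new Node{name, tele, first_}; ++size_; return; }
        Node* prev = first_;
        for (std::size_t i = 1; i < pos; ++i) prev = prev->next;
        prev->next = new Node{name, tele, prev->next};
        ++size_;
    }
    std::size_t size() const { return size_; }
    // Precondition: pos < size().
    const std::string& nameAt(std::size_t pos) const {
        const Node* n = first_;
        while (pos--) n = n->next;
        return n->name;
    }
};
```

With the pointers hidden inside the class, main() shrinks to a few calls such as list.append(name, tele) and list.insert(1, name, tele).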
Use a std::list, unless this is homework, in which case it needs tagging as such.
In my experience, I have learned to start small and test, then build up. I'll guide you through these steps.
BTW, a linked list is a container of nodes. So we'll start with the node class first.
Minimally, a node must have a pointer to another node:
#include <iostream>
#include <cassert> // for assert
#include <cstdlib> // for EXIT_SUCCESS and EXIT_FAILURE
#include <new> // for std::nothrow
#include <string>
using std::cout;
using std::endl;
using std::cerr;
using std::cin;
using std::string;
struct Node
{
// Add a default constructor to set pointer to null.
Node()
: p_next(NULL)
{ ; }
Node * p_next;
};
// And the testing framework
int main(void)
{
Node * p_list_start(NULL);
// Allocate first node.
// Plain new throws std::bad_alloc on failure; the nothrow form returns null instead.
p_list_start = new (std::nothrow) Node;
// Test the allocation.
// ALWAYS test dynamic allocation for success.
if (!p_list_start)
{
cerr << "Error allocating memory for first node." << endl;
return EXIT_FAILURE;
}
// Validate the constructor
assert(p_list_start->p_next == NULL);
// Announce to user that test is successful.
cout << "Test successful." << endl;
// Delete the allocated object.
delete p_list_start;
// Pause if necessary.
cin.ignore(100000, '\n'); // Ignore input chars until limit of 100,000 or '\n'
return EXIT_SUCCESS;
}
Compile, and run this simple test. Fix errors until it runs correctly.
Next, modify the tester to link two nodes:
int main(void)
{
Node * p_list_start(NULL);
Node * p_node(NULL); // <-- This is a new statement for the 2nd node.
//...
// Validate the constructor
assert(p_list_start->p_next == NULL); // assert() comes from <cassert>
// Allocate a second node.
p_node = new (std::nothrow) Node; // nothrow form (from <new>) returns null on failure
if (!p_node)
{
cerr << "Error allocating memory for 2nd node." << endl;
// Remember to delete the previously allocated objects here.
delete p_list_start;
return EXIT_FAILURE;
}
// Link the first node to the second.
p_list_start->Link_To(p_node);
// Test the link
assert(p_list_start->p_next == p_node);
//...
// Delete the allocated object(s)
delete p_list_start;
delete p_node;
//...
}
Compile with the modifications.
It fails to compile: undefined method Node::Link_To.
Not to worry, this is expected. It shows us the compiler is working. :-)
Add the Link_To method to the Node structure:
struct Node
{
// ...
// Takes a pointer so it matches the call p_list_start->Link_To(p_node).
void Link_To(Node * p_node)
{
p_next = p_node;
return;
}
//...
};
Compile and run. Test should pass.
At this point the linking process has been validated. Onto adding content to the node.
Since the Node object has been tested, we don't want to touch it. So let's inherit from it to create a node with content:
struct Name_Node
: public Node // Inherit from the tested object.
{
std::string name;
std::string phone;
};
If you haven't learned inheritance yet, you can append to the existing node:
struct Node
{
//...
std::string name;
std::string phone;
};
At this point you can add functions for setting and displaying content. Add the testing statements. Run and validate.
The next step would be to create two content nodes and link them together. As you build up, keep the testing code. Also, if stuff works you may want to put the functionality into separate functions.
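That next step might be sketched as follows. The Node here is a condensed copy that assumes Link_To takes a pointer, and the helper name test_link_two_name_nodes and the sample data are made up for illustration:

```cpp
#include <cstddef>
#include <string>

struct Node {
    Node() : p_next(NULL) {}
    void Link_To(Node* p_node) { p_next = p_node; }
    Node* p_next;
};

struct Name_Node : public Node {   // inherit from the tested object
    std::string name;
    std::string phone;
};

// Returns true if two content nodes can be filled in and linked together.
bool test_link_two_name_nodes() {
    Name_Node first;
    Name_Node second;
    first.name = "Alice";
    first.phone = "555-0100";
    second.name = "Bob";
    second.phone = "555-0101";
    first.Link_To(&second);
    // Follow the link and read the content of the second node through it.
    const Name_Node* next = static_cast<const Name_Node*>(first.p_next);
    return first.p_next == &second
        && second.p_next == NULL
        && next->name == "Bob"
        && next->phone == "555-0101";
}
```

Keeping each check in its own small function like this makes it easy to keep the earlier tests around as the list grows.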
For more information on this process, check out Test Driven Development.