Adding to hash table in C++? - c++

Guessing I'm doing something stupidly simple wrong, but can't seem to find an answer in existing stack overflow questions. I'm trying to implement a simple hash table containing lists of strings in C++ for a programming class. My add() function appears to be working correctly from inside the function, but as soon as I check the hash table's contents from the contains() function it's obvious that something's gone wrong.
void string_set::add(const char *s) {
//copy s into new char array str
char str[strlen(s)];
strcpy(str, s);
//find hash value of string
int hValue = string_set::hash_function(s);
//create new node to contain string
node* newNode = new node();
newNode->s = str;
//if string's position in hash table is empty, add directly and
//set newNode's next to null. if not, set newNode next to
//current first node in list and then add to hash table
if(hash_table[hValue] == NULL) {
hash_table[hValue] = newNode;
newNode->next = NULL;
} else {
newNode->next = hash_table[hValue];
hash_table[hValue] = newNode;
}
cout << "string added: " << hash_table[hValue]->s << endl;
return;
}
This prints the expected string; i.e. if I add "e" it prints "e".
But when I call this immediately after:
int string_set::contains(const char *s) {
//find hash value of string
int hValue = string_set::hash_function(s);
//return inital value of hash table at that value
cout << "hash table points to " << hash_table[hValue]->s << endl;
}
It prints some junk. What have I done?
Since this is for a class, the specifications have been provided and I have no opportunity to change the way the hash table is set up. I'll be adding exceptions etc later, just want to get the add function working. Thanks!
EDIT: Sorry, new to stack overflow and not sure about comment formatting! Yes, I can use std::string. The hash function is as follows
int string_set::hash_function(const char *s) {
int cValue =0;
int stringSum = 0;
unsigned int i = 0;
for(i = 0; i < strlen(s); i++) {
cValue = (int) s[i];
stringSum = stringSum + cValue;
}
stringSum = stringSum % HASH_TABLE_SIZE;
return stringSum;
}

add() stores a pointer to str, a local array, in newNode->s. That array stops existing when add() returns, so every newNode->s becomes a dangling pointer, and reading it later (as contains() does) is undefined behavior in C++. With your compiler, the memory that used to hold str is simply reused by the next stack frame, which is why you see junk. Note also that char str[strlen(s)] leaves no room for the terminating '\0', so strcpy already writes one byte past the end of the array. To solve this, either allocate the copy dynamically on the heap or, better, use std::string instead of char*.
It is also worth pointing out that the standard C++ library already has a hash table implementation, std::unordered_map.
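The std::string approach can be sketched roughly as follows. This is only an illustration under assumptions: the assignment's header is not shown, so the node layout, the value of HASH_TABLE_SIZE, and the int return type of contains() are guesses based on the code in the question.

```cpp
#include <cstddef>
#include <string>

// Assumed value; the real constant comes from the assignment's header.
const int HASH_TABLE_SIZE = 100;

// Assumed node layout: each node owns its own copy of the string.
struct node {
    std::string s;
    node* next;
};

class string_set {
    node* hash_table[HASH_TABLE_SIZE];

public:
    string_set() {
        for (int i = 0; i < HASH_TABLE_SIZE; ++i) hash_table[i] = nullptr;
    }

    ~string_set() {
        // Release every node in every bucket.
        for (int i = 0; i < HASH_TABLE_SIZE; ++i) {
            while (hash_table[i] != nullptr) {
                node* dead = hash_table[i];
                hash_table[i] = dead->next;
                delete dead;
            }
        }
    }

    string_set(const string_set&) = delete;
    string_set& operator=(const string_set&) = delete;

    // Additive hash as in the question (unsigned char keeps the sum non-negative).
    static int hash_function(const char* s) {
        int sum = 0;
        for (std::size_t i = 0; s[i] != '\0'; ++i)
            sum += static_cast<unsigned char>(s[i]);
        return sum % HASH_TABLE_SIZE;
    }

    void add(const char* s) {
        int h = hash_function(s);
        // std::string copies the characters, so nothing points at a local buffer.
        // The new node becomes the head of the bucket's list.
        hash_table[h] = new node{std::string(s), hash_table[h]};
    }

    int contains(const char* s) const {
        for (const node* n = hash_table[hash_function(s)]; n != nullptr; n = n->next)
            if (n->s == s) return 1;
        return 0;
    }
};
```

Because the node owns its std::string, the stored text stays valid after add() returns, and the empty and non-empty bucket cases collapse into one line.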

Related

C++ Spell checking program with two classes; Dictionary and word

Here is the specification for the code:
You are to use the Word and Dictionary classes defined below and write all member functions and any necessary supporting functions to achieve the specified result.
The Word class should dynamically allocate memory for each word to be stored in the dictionary.
The Dictionary class should contain an array of pointers to Word. Memory for this array must be dynamically allocated. You will have to read the words in from the file. Since you do not know the "word" file size, you do not know how large to allocate the array of pointers. You are to let this grow dynamically as you read the file in. Start with an array size of 8, When that array is filled, double the array size, copy the original 8 words to the new array and continue.
You can assume the "word" file is sorted, so your Dictionary::find() function must contain a binary search algorithm. You might want to save this requirement for later - until you get the rest of your program running.
Make sure you store words in the dictionary as lower case and that you convert the input text to the same case - that way your Dictionary::find() function will successfully find "Four" even though it is stored as "four" in your Dictionary.
Here is my code so far.
#include <cstring>
#include <iostream>
#include <fstream>
using namespace std;
class Word
{
char* word_;
public:
Word(const char* text = 0);
~Word() { delete[] word_; word_ = nullptr; }
const char* word() const;
};
Word::Word(const char* arg)
: word_(new char[strlen(arg) + 1])
{
strcpy(word_, arg);
}
const char* Word::word() const
{
return word_;
}
class Dictionary
{
Word** words_;
unsigned int capacity_; // max number of words Dictionary can hold
unsigned int numberOfWordsInDictionary_;
void resize() {
capacity_ = capacity_ * 2;
cout << "Size = " << capacity_ << endl;
};
void addWordToDictionary(char* word) { words_ += *word; };
public:
Dictionary(const char* filename);
~Dictionary() {
delete[] words_; words_ = nullptr;
};
bool find(const char* word);
};
Dictionary::Dictionary(const char * filename)
: words_(new Word*[8]), capacity_(8), numberOfWordsInDictionary_(0)
{
ifstream fin(filename);
if (!filename) {
cout << "Failed to open file!" << endl;
}
char buffer[32];
while (fin.getline(buffer, sizeof(buffer)))
{
if (numberOfWordsInDictionary_ == capacity_)
{
resize();
}
addWordToDictionary(buffer);
}
}
bool Dictionary::find(const char * left)
{
int last = capacity_ - 1,
first = 0,
middle;
bool found = false;
while (!found && first <= last) {
middle = (first + last) / 2;
if (strcmp(left, reinterpret_cast<char*>(words_[middle])) == 0) {
found = true;
}
else if (left > reinterpret_cast<char*>(words_[middle]))
last = middle - 1;
else
first = middle + 1;
}
return found;
}
;
bool cleanupWord(char x[] ) {
bool lower = false;
int i = 0;
while (x[i]) {
char c = x[i];
putchar(tolower(c));
lower = true;
}
return lower;
}
int main()
{
char buffer[32];
Dictionary Websters("words.txt");
ifstream fin("gettysburg.txt");
cout << "\nSpell checking " << "gettysburg.text" << "\n\n";
while (fin >> buffer) {
if (cleanupWord(buffer) == true) {
if (!Websters.find(buffer)) {
cout << buffer << " not found in the Dictionary\n";
}
}
}
system("PAUSE");
}
When I run the program it stops after outputting "spellchecking Gettysburg.txt" and I don't know why. Thank you!
The most likely cause of this problem is the text files have not been opened. Add a check with is_open to make sure they have been opened.
When using relative paths (any path that does not start at the root of the file system, as an absolute path does), make sure the program is being run from the directory you believe it to be. The working directory is not always the directory that contains the executable. Search term to use to learn more about this: Working Directory.
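A minimal sketch of such a check is below. Note that the constructor in the question tests !filename (the pointer), not the stream, so it can never report a failed open. canOpen is a hypothetical helper name used only for illustration.

```cpp
#include <fstream>
#include <iostream>
#include <string>

// Hypothetical helper: returns true only if the file could be opened for reading.
bool canOpen(const std::string& path) {
    std::ifstream fin(path);
    if (!fin.is_open()) {
        // Report the path so a wrong working directory is easy to spot.
        std::cerr << "Failed to open " << path << '\n';
        return false;
    }
    return true;
}
```

Calling something like this for both words.txt and gettysburg.txt before reading them would show quickly whether the working directory is the problem.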
Now on to other reasons this program will not work:
void addWordToDictionary(char* word) { words_ += *word; };
is not adding words to the dictionary. Instead it is advancing the address at which words_ points by the numeric value of the letter at *word. This is extremely destructive as it loses the pointer to the buffer allocated for words_ in the constructor making delete[] words_; in the Dictionary destructor ineffective and probably fatal.
Instead, you want to do the following. (I say "want to" with a bit of trepidation: what you really want is std::vector and std::string, but I strongly suspect that would upset the assignment's marker.)
1. Dynamically allocate a new Word with new.
2. Place this Word in a free spot in the words_ array, something along the lines of words_[numberOfWordsInDictionary_] = myNewWord;
3. Increase numberOfWordsInDictionary_ by 1.
Note that the Words allocated with new must all be released in the Dictionary destructor. You will want a for loop to help with this.
In addition, I would move the
if (numberOfWordsInDictionary_ == capacity_)
{
resize();
}
from the Dictionary constructor into addWordToDictionary, so that the array is properly sized any time addWordToDictionary is called.
Hmmm. While we're at it, let's look at resize
void resize() {
capacity_ = capacity_ * 2;
cout << "Size = " << capacity_ << endl;
};
This increases the object's capacity_ but does nothing to allocate more storage for words_. This needs to be corrected. You must:
1. Double the value of capacity_. You already have this.
2. Allocate a larger buffer with new to hold the replacement of words_.
3. Copy all of the Words in words_ to the larger buffer.
4. Free the buffer currently pointed to by words_.
5. Point words_ at the new, larger buffer.
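Putting the steps for adding a word and for resizing together, a sketch might look like the following. It is not the assignment's required interface: the default constructor, the public addWordToDictionary, and the size(), capacity() and at() accessors are assumptions added so the example stands alone, and file reading is left out.

```cpp
#include <cstring>

class Word {
    char* word_;

public:
    explicit Word(const char* text) : word_(new char[std::strlen(text) + 1]) {
        std::strcpy(word_, text);
    }
    ~Word() { delete[] word_; }
    Word(const Word&) = delete;             // copying is not needed in this sketch
    Word& operator=(const Word&) = delete;
    const char* word() const { return word_; }
};

class Dictionary {
    Word** words_;
    unsigned int capacity_;                  // max number of words Dictionary can hold
    unsigned int numberOfWordsInDictionary_;

    void resize() {
        unsigned int newCapacity = capacity_ * 2;
        Word** bigger = new Word*[newCapacity];           // allocate the larger buffer
        for (unsigned int i = 0; i < numberOfWordsInDictionary_; ++i)
            bigger[i] = words_[i];                        // copy the existing pointers
        delete[] words_;                                  // free the old buffer only
        words_ = bigger;                                  // point at the new buffer
        capacity_ = newCapacity;
    }

public:
    Dictionary() : words_(new Word*[8]), capacity_(8), numberOfWordsInDictionary_(0) {}

    ~Dictionary() {
        // Every Word allocated with new must be released before the array itself.
        for (unsigned int i = 0; i < numberOfWordsInDictionary_; ++i)
            delete words_[i];
        delete[] words_;
    }

    Dictionary(const Dictionary&) = delete;
    Dictionary& operator=(const Dictionary&) = delete;

    void addWordToDictionary(const char* word) {
        if (numberOfWordsInDictionary_ == capacity_)
            resize();                                     // grow before storing
        words_[numberOfWordsInDictionary_] = new Word(word);
        ++numberOfWordsInDictionary_;
    }

    unsigned int size() const { return numberOfWordsInDictionary_; }
    unsigned int capacity() const { return capacity_; }
    const char* at(unsigned int i) const { return words_[i]->word(); }
};
```

Only the array of pointers is reallocated in resize(); the Word objects themselves stay where they are, which is why the copy loop moves pointers rather than strings.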
Addendum
I haven't looked closely at find because the carnage required to fix the reading and storage of the dictionary will most likely render find unusable even if it does currently work. The use of reinterpret_cast<char*> is an alarm bell, though. There should be no reason for a cast, let alone the most permissive of them all, in a find function. Rule of thumb: When you see a reinterpret_cast and you don't know what it's for, assume it's hiding a bug and approach it with caution and suspicion.
In addition to investigating the Rule of Three mentioned in the comments, look into the Rule of Five. This will allow you to make a much simpler, and probably more efficient, dictionary based around Word* words_, where words_ will point to an array of Word directly instead of pointers to Words.
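As a rough illustration of what that could look like for Word, here is a sketch with a copy constructor, a move constructor, and a single by-value assignment operator that serves both copy and move assignment (a common variation on the Rule of Five, not the only way to write it). The duplicate helper and the treatment of a null argument as an empty string are assumptions made for this example.

```cpp
#include <cstring>
#include <utility>

class Word {
    char* word_;

    // Hypothetical helper: allocates an owned copy; a null pointer is treated as "".
    static char* duplicate(const char* text) {
        if (text == nullptr) text = "";
        char* copy = new char[std::strlen(text) + 1];
        std::strcpy(copy, text);
        return copy;
    }

public:
    Word(const char* text = nullptr) : word_(duplicate(text)) {}

    ~Word() { delete[] word_; }

    // Copy constructor: deep copy, so two Words never share one buffer.
    Word(const Word& other) : word_(duplicate(other.word_)) {}

    // Move constructor: steal the buffer and leave the source empty.
    Word(Word&& other) noexcept : word_(other.word_) { other.word_ = nullptr; }

    // By-value assignment: the parameter is copy- or move-constructed by the
    // caller, then swapped in; the old buffer dies with the parameter.
    Word& operator=(Word other) noexcept {
        std::swap(word_, other.word_);
        return *this;
    }

    // A moved-from Word reports an empty string.
    const char* word() const { return word_ != nullptr ? word_ : ""; }
};
```

With copying and moving defined like this, an array allocated as new Word[n] can be grown by assigning elements into a larger array, without a second level of pointers.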

Skip List C++ segmentation fault

I'm trying to implement the Skip List using this article Skip List.
Code:
#include<iostream>
#include<cstdlib>
#include<ctime>
#include<limits>
using namespace std;
template<class T>
class SkipList{
private:
class SkipNode{
public:
T* key; //Pointer to the key
SkipNode** forward; //Forward nodes array
int level; //Node level
//SkipNode constructor
SkipNode(T* key, int maxlvl, int lvl){
forward = new SkipNode*[maxlvl];
this->key=key;
level=lvl;
}
//Method that print key and level node
void print(){
cout << "(" << *key << "," << level << ") ";
}
};
SkipNode *header,*NIL; //Root and End pointers
float probability; //Level rate
int level; //Current list level
int MaxLevel; //Maximum list levels number
//Function that returns a random level between 0 and MaxLevel-1
int randomLevel(){
int lvl = 0;
while( (float(rand())/RAND_MAX < probability) && (lvl < MaxLevel-1) )
lvl++;
return lvl;
}
public:
//SkipList constructor
SkipList(float probability, int maxlvl){
this->probability = probability;
MaxLevel = maxlvl;
srand(time(0));
header=new SkipNode(NULL,MaxLevel,0); //Header initialization
T* maxValue = new T;
*maxValue = numeric_limits<T>::max(); //Assign max value that T can reach
NIL = new SkipNode(maxValue,0,0); //NIL initialization
level=0; //First level
for(int i=0; i<MaxLevel; i++){ //Every header forward node points to NIL
header->forward[i]=NIL;
}
}
//SkipList destructor
~SkipList(){
delete header;
delete NIL;
}
//Method that search for a key in the list
SkipNode* search(T* key){
SkipNode* cursor = header;
//Scan the list
for(int i=level; i>=0; i--)
while(*(cursor->forward[i]->key) < (*key))
cursor=cursor->forward[i];
cursor=cursor->forward[0];
if(*(cursor->key) == *key)
return cursor;
return NULL;
}
//Method that insert a key in the list
SkipList* insert(T* key){
SkipNode* cursor = header;
SkipNode* update[MaxLevel]; //Support array used for fixing pointers
//Scan the list
for(int i=level; i>=0; i--){
while(*(cursor->forward[i]->key) < *(key))
cursor=cursor->forward[i];
update[i]=cursor;
}
cursor=cursor->forward[0];
if(*(cursor->key) == *(key)){ //Node already inserted
return this;
}
int lvl = randomLevel(); //New node random level
if(lvl > level){ //Adding missing levels
for(int i=level+1; i<=lvl; i++)
update[i]=header;
level=lvl;
}
SkipNode* x = new SkipNode(key,MaxLevel,lvl); //New node creation
for(int i=0; i<=lvl; i++){ //Fixing pointers
x->forward[i] = update[i]->forward[i];
update[i]->forward[i] = x;
}
return this;
}
//Method that delete a key in the list
SkipList* erase(T* key){
SkipNode* cursor = header;
SkipNode* update[MaxLevel]; //Support array used for fixing pointers
//Scan the list
for(int i=level; i>=0; i--){
while(*(cursor->forward[i]->key) < *(key))
cursor=cursor->forward[i];
update[i]=cursor;
}
cursor=cursor->forward[0];
if(*(cursor->key) == *(key)){ //Deletetion of the founded key
for(int i=0; i<=level && update[i]->forward[i] == cursor; i++){
update[i]->forward[i] = cursor->forward[i];
}
delete cursor;
while(level>0 && header->forward[level]==NIL){
level=level-1;
}
}
return this;
}
//Method that print every key with his level
SkipList* print(){
SkipNode* cursor = header->forward[0];
int i=1;
while (cursor != NIL) {
cursor->print();
cursor = cursor->forward[0];
if(i%15==0) cout << endl; i++;
}
cout << endl;
return this;
}
};
int main(){
SkipList<int>* list = new SkipList<int>(0.80, 8);
int v[100];
for(int i=0; i<100; i++){
v[i]=rand()%100;
list->insert(&v[i]);
}
list->print();
cout << endl << "Deleting ";
for(int i=0; i<10; i++){
int h = rand()%100;
cout << v[h] << " ";
list->erase(&v[h]);
}
cout << endl;
list->print();
cout << endl;
for(int i=0; i<10; i++){
int h = rand()%100;
cout << v[h] << " ";
if(list->search(&v[h]))
cout << " is in the list" << endl;
else
cout << " isn't in the list" << endl;
}
delete list;
}
It gives me a segmentation fault on line 59 (the for loop in insert), but I can't understand why. Could you help me, please? I will accept any other improvements that you suggest. My deadline is in two days, which is why I'm asking for help.
EDIT:
I've corrected the code following bebidek's suggestions (thanks). Now the first level is 0. It seems to be working, but sometimes some nodes are not inserted correctly and the search gives a bad result.
LAST EDIT:
It works, thanks to all
ONE MORE EDIT:
Added comments to code, if you have any suggestion you're welcome
The biggest problem in your code is probably NIL=new SkipNode(numeric_limits<T*>::max());
First of all, I suspect you want the key pointer to point to a memory address that contains the largest possible int value.
But that's not what's actually happening here. std::numeric_limits is not specialized for pointer types, so numeric_limits<T*>::max() just returns a value-initialized T*, that is, a null pointer. Either way, key does not point to memory your process is allowed to read.
Also, the forward member probably contains an array of junk pointers.
When the first loop in the insert method is executed, this leads to two problems:
while(*(cursor->forward[i]->key) < *(key)) dereferences an invalid pointer to read the key
cursor=cursor->forward[i]; re-assigns cursor to an invalid pointer
I would first suggest changing the design so that SkipNode stores a T by value instead of a pointer:
class SkipNode{
public:
T key;
SkipNode* forward[100];
This will make a lot of pointer-related code unnecessary and make the code simpler, so it is less likely to run into access violations.
It might also be cleaner to use an actual NULL (or, even better, nullptr) value instead of a dummy NIL node to indicate the end of the list.
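Both suggestions (keys stored by value, nullptr as the end marker) can be sketched together as below. This is an illustrative rewrite, not the asker's final code: it swaps the fixed-size forward array for std::vector, requires T to be default-constructible for the header node, assumes maxLevel is at least 1, and introduces contains() and findPredecessors() as hypothetical names.

```cpp
#include <cstdlib>
#include <vector>

template <class T>
class SkipList {
    struct Node {
        T key;                         // stored by value, no pointer to manage
        std::vector<Node*> forward;    // forward[i] is the next node on level i
        Node(const T& k, int levels) : key(k), forward(levels, nullptr) {}
    };

    int maxLevel_;
    float probability_;
    int level_;                        // highest level currently in use (inclusive)
    Node* header_;

    int randomLevel() const {
        int lvl = 0;
        while (lvl < maxLevel_ - 1 && float(std::rand()) / RAND_MAX < probability_)
            ++lvl;
        return lvl;
    }

    // Fills update[i] with the last node on level i whose key is less than key
    // and returns the candidate node on level 0 (nullptr at the end of the list).
    Node* findPredecessors(const T& key, std::vector<Node*>& update) const {
        Node* cursor = header_;
        for (int i = level_; i >= 0; --i) {
            while (cursor->forward[i] != nullptr && cursor->forward[i]->key < key)
                cursor = cursor->forward[i];
            update[i] = cursor;
        }
        return cursor->forward[0];
    }

public:
    SkipList(float probability, int maxLevel)
        : maxLevel_(maxLevel), probability_(probability), level_(0),
          header_(new Node(T(), maxLevel)) {}

    ~SkipList() {
        // Level 0 links every node, so walking it frees the whole list.
        while (header_ != nullptr) {
            Node* next = header_->forward[0];
            delete header_;
            header_ = next;
        }
    }

    SkipList(const SkipList&) = delete;
    SkipList& operator=(const SkipList&) = delete;

    bool contains(const T& key) const {
        std::vector<Node*> update(maxLevel_, header_);
        Node* found = findPredecessors(key, update);
        return found != nullptr && found->key == key;
    }

    void insert(const T& key) {
        std::vector<Node*> update(maxLevel_, header_);  // unused levels start at header
        Node* found = findPredecessors(key, update);
        if (found != nullptr && found->key == key) return;  // already present
        int lvl = randomLevel();
        if (lvl > level_) level_ = lvl;
        Node* x = new Node(key, lvl + 1);
        for (int i = 0; i <= lvl; ++i) {                 // splice into each level
            x->forward[i] = update[i]->forward[i];
            update[i]->forward[i] = x;
        }
    }

    void erase(const T& key) {
        std::vector<Node*> update(maxLevel_, header_);
        Node* found = findPredecessors(key, update);
        if (found == nullptr || !(found->key == key)) return;
        for (int i = 0; i <= level_ && update[i]->forward[i] == found; ++i)
            update[i]->forward[i] = found->forward[i];
        delete found;
        while (level_ > 0 && header_->forward[level_] == nullptr) --level_;
    }
};
```

With nullptr as the terminator there is no need for numeric_limits at all, so the list also works for key types that have no meaningful maximum value.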
So, first problem is when you create NIL node:
NIL=new SkipNode(numeric_limits<T*>::max());
As the argument, you should pass a pointer to an existing variable, for example:
T* some_name = new T;
*some_name = numeric_limits<T>::max();
NIL = new SkipNode(some_name);
Notice that I used T instead of T* in numeric_limits. Of course, you have to remember to delete this variable in the destructor.
The second problem is that the level variable in your code is sometimes inclusive (meaning level number level exists), as in line 61, and sometimes exclusive (level number level doesn't exist), as in line 71. You have to be consistent.
The third problem is in line 52. You probably mean cursor=cursor->forward[1];, because after the loop i = 0, and forward[0] doesn't make any sense in your code.
EDIT:
The fourth and fifth problems are in the erase function.
cursor->~SkipNode();
This won't delete your node; it only runs an empty destructor. Use delete cursor; instead.
And in the loop you probably wanted to write update[i]->forward[i] == cursor instead of !=.
ONE MORE EDIT:
You haven't implemented a destructor for SkipList, and you also forgot delete list; at the end of main(). Both omissions cause memory leaks.
ANOTHER EDIT:
srand(time(0));
This line should be executed once, at the beginning of main, and nowhere else. If you execute it before generating each random number, you will get the same result every time, because time(0) only has a resolution of one second and your program can call randomLevel() more than once per second.
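A tiny sketch of why reseeding matters: seeding with the same value restarts the same pseudo-random sequence, which is exactly what happens when srand(time(0)) runs several times within one second. firstValueAfterSeeding is a hypothetical helper used only to show the effect.

```cpp
#include <cstdlib>

// Hypothetical helper: reseeds the generator and returns the first value
// it produces. Calling it twice with the same seed returns the same number.
int firstValueAfterSeeding(unsigned int seed) {
    std::srand(seed);
    return std::rand();
}
```

Seeding once at program start and then only calling rand() avoids this repetition.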
You also forgot to set the precision variable in the constructor of SkipList.
NEXT EDIT:
Your insert function has no level randomization: you cannot insert a node whose level is lower than the level of the whole skip list. This is not an error that will crash your program or give wrong results, but it makes the time complexity of queries in your structure O(n) instead of O(log n).
You should use lvl instead of level in this loop in the insert function:
for(int i=1; i<level; i++){
x->forward[i] = update[i]->forward[i];
update[i]->forward[i] = x;
}
Also, the minimum result of your random function randomLevel should be 1 instead of 0, as you don't want a node with level=0.

Outputting a string pointer array

I'm having a bit of trouble figuring out exactly what I am doing wrong here, and haven't found any posts with the same issue. I am using a dynamic array of strings to hold a binary tree with the root at [0], first row of children, left to right, at [1] and [2], etc. While I haven't debugged that output format yet, I am much more concerned as to why that specific line is crashing my program.
I thought it was a pointer de-referencing issue, but outStream << &contestList[i] prints addresses as I'd expect, and outStream << *contestList[i] throws errors as I'd expect them to.
//3 lines are from other functions/files
typedef string elementType;
typedef elementType* elementTypePtr;
elementTypePtr contestList = new elementType[arraySize];
void BinTreeTourneyArray::printDownward(ostream &outStream)
{
int row = 1;
for (int i = 0; i < getArraySize(); i++)
{
outStream << contestList[i]; //this is crashing the program
if (isPowerOfTwo(i))
{
outStream << endl;
row++;
}
else
{
outStream << ":";
}
}
}
arraySize is a private member arraySize = ((2 * contestants) - 1) where contestants is the number of contestants in my tournament. Each round or "row" in the tree is synonymous with a tournament bracket. If there are n contestants, then there are 2n-1 nodes needed in the tree. The issue wouldn't be with this function.
getArraySize() { return arraySize; }
Turns out elementTypePtr contestList = new elementType[arraySize]; was the issue. contestList was a private member of the class, then I threw this line in a function declaring a local variable of the same name that disappears after the function ends. No biggie, except for the fact that I needed it in the print function...Oops.
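The mistake can be sketched like this. The class below is a cut-down guess at the real one: only the member names, the typedefs and the 2n-1 size come from the question, while the constructor taking the number of contestants and the at() accessor are assumptions for illustration.

```cpp
#include <string>

typedef std::string elementType;
typedef elementType* elementTypePtr;

class BinTreeTourneyArray {
    elementTypePtr contestList;   // member that printDownward reads later
    int arraySize;

public:
    explicit BinTreeTourneyArray(int contestants)
        : contestList(nullptr), arraySize((2 * contestants) - 1) {
        // BUG (as described in the question): writing
        //     elementTypePtr contestList = new elementType[arraySize];
        // here would declare a new LOCAL variable that shadows the member
        // and disappears when this function returns.
        // Assigning without a type name targets the member instead.
        contestList = new elementType[arraySize];
    }

    ~BinTreeTourneyArray() { delete[] contestList; }

    BinTreeTourneyArray(const BinTreeTourneyArray&) = delete;
    BinTreeTourneyArray& operator=(const BinTreeTourneyArray&) = delete;

    int getArraySize() const { return arraySize; }
    elementType& at(int i) { return contestList[i]; }
};
```

Many compilers can warn about this kind of shadowing when the relevant warnings are enabled, which makes the mistake easier to catch.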

Not able to double the size of an array

I want to resize the array when the rehash function is called, by copying the values of the initial dictionary into a new array and then finally redefining newdictionary as dictionary.
void rehash ()
{
int newsize=2*Size;
node **newdictionary;
newdictionary= new node*[newsize];
//Initialising the dictionary
for (int i = 0;i < newsize;i++)
{
newdictionary[i]->name = "";
newdictionary[i]->value = -1;
}
node **temp=dictionary;
delete [] dictionary;
dictionary=newdictionary;
SIZE=newsize;
for(int i=0;i<SIZE;i++)
{
if(temp[i]->value!= -1)
insertvalue(temp[i]->name,temp[i]->value);
}
delete [] temp;
};
Earlier I have defined insertvalue as:
void insertvalue (string filedata, int code)
{
// tableindex is the position where I want to insert the value
dictionary[tableindex]->name= filedata;
dictionary[tableindex]->value=code;
};
You didn't actually explain what problem(s) you're having, but your code has several issues:
void rehash ()
{
int newsize=2*Size;
node **newdictionary;
newdictionary= new node*[newsize];
At this point, newdictionary is simply an array of uninitialized pointers.
//Initialising the dictionary
for (int i = 0;i < newsize;i++)
{
newdictionary[i]->name = "";
newdictionary[i]->value = -1;
}
So the loop above is trying to access the members of node objects that don't yet exist.
node **temp=dictionary;
delete [] dictionary;
These two lines don't make sense. dictionary and temp point to the same memory, so when you delete dictionary you have also deleted the memory that temp points to.
dictionary=newdictionary;
SIZE=newsize;
for(int i=0;i<SIZE;i++)
{
if(temp[i]->value!= -1)
insertvalue(temp[i]->name,temp[i]->value);
}
Even if you hadn't just deleted the memory out from under temp, you're now trying to access temp from 0 to the new size, not the old size. In other words, this would access temp beyond its bounds.
Those are the major problems that I've noticed in the code so far. You at least need to correct all of them before there's any hope of this working. You probably need to spend some time really stepping through your logic to ensure it makes sense in the end.
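For comparison, here is a sketch of a rehash that avoids the three problems above. Much of it is assumed, because the question does not show how tableindex is computed: the NodeTable class name, the use of nullptr for empty slots, the std::hash plus linear-probing slot search, and the lookup() and size() helpers are all hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>

struct node {
    std::string name;
    int value;
};

class NodeTable {
    node** dictionary;
    int Size;

    // Assumed collision strategy: linear probing from a std::hash index.
    // The caller must keep at least one slot free, or this loop never ends.
    int findSlot(const std::string& name) const {
        int i = static_cast<int>(std::hash<std::string>()(name) %
                                 static_cast<std::size_t>(Size));
        while (dictionary[i] != nullptr && dictionary[i]->name != name)
            i = (i + 1) % Size;
        return i;
    }

    // Every slot starts as nullptr, so no uninitialized pointer is ever read.
    static node** makeTable(int n) {
        node** table = new node*[n];
        for (int i = 0; i < n; ++i) table[i] = nullptr;
        return table;
    }

public:
    explicit NodeTable(int size) : dictionary(makeTable(size)), Size(size) {}

    ~NodeTable() {
        for (int i = 0; i < Size; ++i) delete dictionary[i];
        delete[] dictionary;
    }

    NodeTable(const NodeTable&) = delete;
    NodeTable& operator=(const NodeTable&) = delete;

    void insertvalue(const std::string& filedata, int code) {
        int i = findSlot(filedata);
        if (dictionary[i] == nullptr)
            dictionary[i] = new node{filedata, code};
        else
            dictionary[i]->value = code;
    }

    // Returns -1 when the name is absent, mirroring the question's sentinel.
    int lookup(const std::string& name) const {
        int i = findSlot(name);
        return dictionary[i] != nullptr ? dictionary[i]->value : -1;
    }

    void rehash() {
        node** old = dictionary;          // keep the old table alive for now
        int oldSize = Size;
        dictionary = makeTable(2 * oldSize);
        Size = 2 * oldSize;
        for (int i = 0; i < oldSize; ++i) {   // iterate over the OLD size only
            if (old[i] != nullptr)
                dictionary[findSlot(old[i]->name)] = old[i];  // move the node
        }
        delete[] old;                     // free the old array after copying
    }

    int size() const { return Size; }
};
```

The old array is deleted exactly once, and only after every node pointer has been moved into the new table.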

C++ Multidimensional arrays generating segmentation faults?

I am writing a script which must copy some names into a multidimensional array, print the contents of the array and then deallocate the memory and terminate. The problem I am having is that when I run the script it only prints out the last name entered. Here is what I have done. Any help would be great! Thanks in advance!
#include <iostream>
#include <string.h>
using namespace std;
void createArray(int n);
void addDetail(char*& name, char*& surname);
void printArray();
void clear();
char ***details;
int used;
int size;
int main()
{
createArray(3);
char* tmpName = new char[20];
char* tmpSurName = new char[120];
strcpy(tmpName, "nameA");
strcpy(tmpSurName, "surnameA");
addDetail(tmpName,tmpSurName);
strcpy(tmpName, "nameB");
strcpy(tmpSurName, "surnameB");
addDetail(tmpName,tmpSurName);
strcpy(tmpName, "nameC");
strcpy(tmpSurName, "surnameC");
addDetail(tmpName,tmpSurName);
clear();
return 0;
}
void createArray(int n)
{
details= new char**[n];
for(int i=0; i<n; i++)
details[i] = new char*[2];
size = n;
used = 0;
}
void addDetail(char*& name, char*& surname)
{
if(used < size)
{
details[used][0] = name;
details[used][1] = surname;
used++;
}else{
cout << "Array Full " << endl;
}
}
void printArray()
{
for(int i=0; i<used; i++)
cout << details[i][0] << " " << details[i][1] << endl;
}
void clear()
{
for(int i=0; i<size; i++)
{
delete [] details[i];
details[i] = 0;
}
delete [] details;
details=0;
}
You didn't allocate memory for details[used][0] and details[used][1] so it's using whatever random address was in those locations.
Since this is C++ you can use string instead perhaps? std::string **details;. This should work with your existing code, except that it will leak memory.
Better still is to use a vector of vectors.
Something like:
std::vector<std::vector<std::string> > details;
Then the createArray function can go away completely and addDetail becomes simpler:
std::vector<std::string> newName;
newName.push_back(name);
newName.push_back(surname);
details.push_back(newName);
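Assembled into a complete fragment, the vector version might look like this. The const std::string& parameters are an assumption (the answer does not show the new signature); the function and variable names follow the question.

```cpp
#include <iostream>
#include <string>
#include <vector>

// One inner vector per person: element 0 is the name, element 1 the surname.
std::vector<std::vector<std::string> > details;

void addDetail(const std::string& name, const std::string& surname) {
    std::vector<std::string> newName;
    newName.push_back(name);      // std::string copies the characters
    newName.push_back(surname);
    details.push_back(newName);
}

void printArray() {
    for (std::size_t i = 0; i < details.size(); ++i)
        std::cout << details[i][0] << " " << details[i][1] << std::endl;
}
```

Because each entry owns its own strings, reusing or overwriting the caller's buffer afterwards no longer changes what is stored, and no clear function is needed.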
It is because each time, you are effectively storing the pointer tmpName and tmpSurName in the array details. Then in the next iteration, you overwrite the contents of the memory where tmpName and tmpSurName point to, so at the end you'll have a list that contains the last name/surname n times.
To solve it, you need to re-allocate tmpName and tmpSurName before each call to addDetail.
Btw, why do you need to use an (ewww) char***, and can't use e.g. the STL?
What it looks like is happening is that you are not adding the string to the array; you are adding a pointer to name and surname. Every entry points at the same variable, so when you ask the array what it contains, it goes and asks name and surname, which only hold the last value.
Also, that array: are you sure it's working how you want it to work? Arrays are... concrete things. You're essentially saying "I want 5 of these, and they will be this big (based on the type you put in)", and the computer quietly goes "well, I'll set aside space for those here and you can put them in when you're ready". When your code puts those names in there, there really isn't any prep on where to store them. If you fill up that space and go beyond it, you go to bad places. So what you should do is have that last * of char*** point to a char[120], so that you know (for your purposes at least) it never gets filled up. Do that in your createArray function after you have created the outer arrays.
You keep overwriting your temporary buffers rather than making new buffers for each entry in the array. As a result, only the last data written to the buffer survives.
Here's a rough guide on one way to fix it, though this sample may have some problems - I made no attempt to compile or test this.
This portion of main belongs in addDetail:
char* tmpName = new char[20];
char* tmpSurName = new char[120];
strcpy(tmpName, "nameA");
strcpy(tmpSurName, "surnameA");
So, your new addDetail would look something like:
void addDetail(const char* name, const char* surname)
{
if(used < size)
{
details[used][0] = new char[20];
details[used][1] = new char[120];
strcpy(details[used][0], name);
strcpy(details[used][1], surname);
used++;
}else{
cout << "Array Full " << endl;
}
}
And it would be called from main like:
addDetail("nameA", "surnameA");
You'd need to update clear to properly clean up the allocations made in addDetail, though.
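A sketch of the whole thing with the matching clear is below. It departs from the answer in a few labeled ways: buffers are sized with strlen + 1 instead of the fixed 20 and 120, the size variable from the question is renamed capacity here, addDetail returns a bool instead of printing, and copyString is a hypothetical helper.

```cpp
#include <cstring>

char*** details = nullptr;
int used = 0;
int capacity = 0;   // called size in the question

void createArray(int n) {
    details = new char**[n];
    for (int i = 0; i < n; ++i) {
        details[i] = new char*[2];
        details[i][0] = nullptr;   // so clear() can safely delete unused rows
        details[i][1] = nullptr;
    }
    capacity = n;
    used = 0;
}

// Hypothetical helper: returns a freshly allocated copy of text.
static char* copyString(const char* text) {
    char* copy = new char[std::strlen(text) + 1];
    std::strcpy(copy, text);
    return copy;
}

// Returns false when the array is full.
bool addDetail(const char* name, const char* surname) {
    if (used >= capacity) return false;
    details[used][0] = copyString(name);      // each entry owns its own buffers
    details[used][1] = copyString(surname);
    ++used;
    return true;
}

void clear() {
    for (int i = 0; i < capacity; ++i) {
        delete[] details[i][0];   // the strings allocated in addDetail
        delete[] details[i][1];
        delete[] details[i];      // the two-element row
    }
    delete[] details;             // the outer array
    details = nullptr;
    used = 0;
    capacity = 0;
}
```

Each level of new[] has exactly one matching delete[], released from the innermost allocation outward.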