I am working on a project for my university where I have to implement a hash table. I am quite new to C++, so please forgive me if I am not specific enough or if my assumptions are completely wrong.
My main problem is that I have a so-called "Bucket", which is a struct in my program that contains a pointer array of N (template parameter) places.
struct Bucket {
T *kptr{ nullptr };
Bucket *bptr{ nullptr }; //For overflow chains (linear Hashing)
Bucket(Bucket *bptr = nullptr) : kptr(new value_type[N]),bptr(bptr) {}
~Bucket() { if(bptr) delete[] bptr; if (kptr) delete[] kptr; }
};
In my main Class named My_Set for example I have an additional Bucket *table of [1<
My first assumption was to initialize the kptr array to nullptr and then in the insert method to make something like
void insert(Bucket &bkt, T &key) {
for (int i=0; i<N; ++i) {
if (bkt.kptr[i]) { //Check on nullptr!
kptr[i] = key;
}
}
}
But that's not possible, because then kptr should be Bucket T **kptr and not Bucket *kptr, as far as I understood it.
So, is there any other efficient way to check one single field of an array if it has been assigned to an Object already or not?
IMPORTANT: I am not allowed to use STL containers, smart pointers, or similar things that would make the whole thing much easier.
Thanks!
Check whether pointer in pointer array is already “filled”
... So, is there any other efficient way to check one single field of an array if it has been assigned to an Object already or not?
Yes: Initialize the pointer to nullptr. Then, if the pointer has a value other than nullptr, you know that it has been pointed to an object.
However, your professor is correct that your checking is inefficient. On every insert you iterate through all previously inserted objects.
That is unnecessary. You can avoid trying to check whether any of the pointers have been assigned by remembering where the next free pointer is. How can we "remember" things in algorithms? Answer: Using variables. Since you must remember for each instance of your container, you need a member variable.
Since you are using this new variable to remember the next free pointer, how about we name it next_free? Now, considering that the variable must refer to an existing object, what type should it have? A reference would be a good guess, but you must also be able to reassign it once an element is inserted. What can refer to an object like a reference, but can be reassigned? Answer: A pointer. Since this pointer is going to point to a pointer to T, what should its type be? Answer: T**. What should it be initialized to? Answer: The address of the first element of kptr. With such a member, insert can be implemented like this:
void insert(T &key) { // why would there be a Bucket argument for insert?
*next_free++ = new T(key); // Note: Do not do this in actual programs. Use
// RAII containers from the standard library instead
}
then kptr should be Bucket T **kptr and not Bucket *kptr as far as I understood it.
Correct. A T* can not point to an array that contains pointers (unless T happens to be a pointer) - it can point to an array of T objects. A T** can point to an array of pointers to T.
Instead of pointers to separately allocated objects, it would be more efficient to use a flat resizable array. But since you are not allowed to use std::vector, you would then have another standard container to re-implement. So consider whether the efficiency is worth the extra work.
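The next_free idea can be sketched roughly as follows, without any standard containers. This is only a sketch under assumptions: the fixed-size array member, the full() and size() helpers, the bool return value, and the bucket_demo function are not from the question, and the overflow chain (bptr) is left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bucket holding up to N pointers to T.
// Only kptr and next_free come from the discussion; the rest is assumed.
template <typename T, std::size_t N>
struct Bucket {
    T*  kptr[N] = {};        // every slot starts out as nullptr
    T** next_free = kptr;    // address of the first unused slot

    Bucket() = default;
    Bucket(const Bucket&) = delete;             // owns raw pointers, so forbid copies
    Bucket& operator=(const Bucket&) = delete;

    ~Bucket() {
        // Only the slots before next_free were ever filled.
        for (T** p = kptr; p != next_free; ++p) delete *p;
    }

    bool full() const { return next_free == kptr + N; }
    std::size_t size() const { return static_cast<std::size_t>(next_free - kptr); }

    // Constant-time insert: no scan over the existing slots.
    // Returns false instead of writing past the end when the bucket is full.
    bool insert(const T& key) {
        if (full()) return false;
        *next_free++ = new T(key);
        return true;
    }
};

// Usage: the third insert is rejected because the bucket has two slots.
inline void bucket_demo() {
    Bucket<int, 2> b;
    assert(b.insert(1));
    assert(b.insert(2));
    assert(!b.insert(3));
}
```

A full bucket is where the overflow chain from the question would take over; how that is wired up depends on the rest of the hash table.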
Related
I am having a small issue with a bit of code for a school assignment (I know that's frowned upon here, but I locked myself into using the std::list library and am paying for it). I have a function that is passed a list of pointers to classes, along with the ID of one of those classes, which I want to destroy before resizing my list. However, with my code, the list is never resized and it contains garbage values that crash my program. So it looks like the actual class is being removed, but the element is never removed from the list...
If I had time to make my own doubly-linked list implementation, I'd iterate over the list looking for the element that I want to delete. If it is found, create a temporary node pointer and point that to the node I am about to delete. Set the previous node's "next" element to the iterator's "next" element, and then delete the iterator node.
But using the std::list implementation, I'm at a loss what to do. Here is what I have so far, where DOCO is a class, and the elements in the list are pointers to instances of classes. I've looked into remove() vs. erase(), and maybe using both would fix it, but I'm not sure how to implement remove() with an iterator like this.
bool DOCO::kill_doco(std::list < DOCO* > docolist, int docoid)
{
for (std::list<DOCO*>::iterator it = docolist.begin(); it != docolist.end(); )
{
if ((*it)->id == docoid)
{
delete * it;
it = docolist.erase(it);
std::cerr << "item erased\n";
}
else
{
++it;
}
}
std::cerr << "leaving kill\n";
return true;
}
kill_doco(std::list < DOCO* > docolist
this creates a copy of the list. This copy is a list of pointers.
You proceed to modify the copy of the list, and delete an element in it.
The original list (which you copied) still has the original pointer, which is now pointing to a deleted object.
The easy fix is:
kill_doco(std::list < DOCO* >& docolist
C++ is a value-oriented language, unlike languages like Java or C#. The name of something refers to an actual value of that thing, not a reference to it.
Pointers are similarly the value of the address of the object.
Reference like semantics, or pointer-like semantics, can be done in C++. But, unlike Java/C#, by default every object in C++ is an actual value.
People who move from one language to the other (either way) can get confused by this.
The "default" object type in a C++ program is a regular type -- a type that acts like an integer when you copy it around and the like. It is relatively easy to move away from this, but that is the default.
So what you did was akin to:
void clear_bit( int x, int bit ) {
x = x & ~(1 << bit);
}
and being surprised that the value x you passed in wasn't modified by the function. The "dangling" pointer left in the original list is the 2nd thing that bit you.
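For illustration, here is a small self-contained version of the fix. The Doco struct is a hypothetical stand-in for the DOCO class (only an id member is assumed), and returning whether anything was removed is a choice made for this sketch, not something from the question.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for the DOCO class; only the id member matters here.
struct Doco {
    int id;
};

// The list is taken by reference, so the caller's list is the one modified.
// Returns true if at least one element was deleted and erased.
bool kill_doco(std::list<Doco*>& docolist, int docoid) {
    bool removed = false;
    for (auto it = docolist.begin(); it != docolist.end(); ) {
        if ((*it)->id == docoid) {
            delete *it;                 // free the object first
            it = docolist.erase(it);    // then drop the now-dangling pointer
            removed = true;
        } else {
            ++it;
        }
    }
    return removed;
}

// Usage: after the call, the caller's list no longer holds the deleted pointer.
inline void kill_doco_demo() {
    std::list<Doco*> docos;
    docos.push_back(new Doco{1});
    docos.push_back(new Doco{2});
    assert(kill_doco(docos, 1));
    assert(docos.size() == 1);
    delete docos.front();
}
```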
Suppose I have the following:
class Map
{
std::vector<Continent> continents;
public:
Map();
~Map();
Continent* getContinent(std::string name);
};
Continent* Map::getContinent(std::string name)
{
Continent * c = nullptr;
for (int i = 0; i < continents.size(); i++)
{
if (continents[i].getName() == name)
{
c = &continents[i];
break;
}
}
return c;
}
You can see here that there are continent objects that live inside the vector called continents. Would this be a correct way of getting the object's reference, or is there a better approach to this? Is there an underlying issue with vector which would cause this to misbehave?
It is OK to return a pointer or a reference to an object inside std::vector under one condition: the content of the vector must not change after you take the pointer or a reference.
This is easy to do when you initialize a vector at start-up or in the constructor, and never change it again. In situations where the vector is more dynamic than that, returning by value, rather than by pointer, is a more robust approach.
I would advise against doing something like the above. std::vector manages its own memory, which includes reallocating and moving the array when it runs out of capacity; any pointer or reference taken earlier is then left dangling. On the other hand, if the map contains a const vector, which guarantees that it is never altered, what you are doing would work.
Thanks
Sudharshan
The design is flawed, as others have pointed out.
However, if you don't mind using more memory, giving up contiguous storage, and losing random-access iterators, then a drop-in replacement would be to use std::list instead of std::vector.
The std::list does not invalidate pointers or references to the internal data when resized. The only time when a pointer / reference is invalidated is if you are removing the item being pointed to / referred to.
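A minimal sketch of that guarantee, assuming a simplified Continent type with just a name member (the real class from the question is not shown in full):

```cpp
#include <cassert>
#include <list>
#include <string>

// Simplified stand-in for the Continent class from the question.
struct Continent {
    std::string name;
};

// Takes a pointer to the first element, grows the list, and checks that
// the pointer still refers to the same, unchanged element.
bool list_pointer_survives_growth() {
    std::list<Continent> continents;
    continents.push_back(Continent{"Europe"});
    Continent* first = &continents.front();

    for (int i = 0; i < 1000; ++i) {
        continents.push_back(Continent{"filler"});   // never moves existing nodes
    }

    assert(first == &continents.front());
    return first->name == "Europe";
}
```

With std::vector, the same sequence of push_back calls may reallocate the storage, after which first must not be used at all.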
How can one store an arbitrary number of dynamically created instances (of different types) in an STL container so that the memory can be freed later only having the container?
It should work like this:
std::vector< void * > vec;
vec.push_back( new int(10) );
vec.push_back( new float(1.) );
Now, if vec goes out of scope the pointers to the instances are destructed, but the memory for int and float are not freed. And obviously I can't do:
for( auto i : vec )
delete *i;
because void* is not a pointer-to-object type.
You could object and argue that this isn't a good idea because one cannot access the elements of the vector. That is right, and I don't access them myself. The NVIDIA driver will access them, as it just needs addresses (void* is fine) for its parameters to a kernel call.
I guess the problem here is that it can be different types that are stored. Wondering if a union can do the trick in case one wants to pass this as arguments to a cuda kernel.
The kernel takes parameters of different types, which are collected by traversing an expression tree (expression templates) where you don't know the type beforehand. So upon visiting a leaf you store the parameter. It can only be void* or a built-in type such as int, float, etc.
The vector can be deleted right after the kernel launch (the launch is async but the driver copies the parameters first then continues host thread). 2nd question: Each argument is passed a void* to the driver. Regardless if its an int, float or even void*. So I guess one can allocate more memory than needed. I think the union thingy might be worth looking at.
You can use one vector of each type you want to support.
But while that's a great improvement on the idea of a vector of void*, it is still quite smelly.
This does sound like an XY problem: you have a problem X, you envision a solution Y, but Y obviously doesn't work without some kind of ingenious adaptation, so you ask about Y, when instead you should be asking about the real problem X. Which is?
Ok, FWIW
I would recommend using placement new combined with malloc. This allows you to store the pointers you create as void* in your vector. Then, when you are finished with the vector, you can simply iterate over it and call free() on each element. Since free() runs no destructor, this suits built-in types such as int and float.
I.E.
void* ptr = malloc(sizeof(int));
int* myNiceInt = new (ptr) int(myNiceValue);
vec.push_back(ptr);
//at some point later iterate over vec
free( *iter );
I believe that this will be the simplest solution to the problem in this case but do accept that this is a "C" like answer.
Just sayin' ;)
"NVIDIA driver" sounds like a C interface anyway, so malloc is not a crazy suggestion.
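The malloc approach could be wrapped up roughly like this. The helper names push_param and free_params are made up for this sketch, and the static_assert that restricts it to trivially destructible types is an added safety net, since free() never runs a destructor.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical helper: allocate with malloc, construct in place with
// placement new, and record the address as void*.
template <typename T>
T* push_param(std::vector<void*>& vec, const T& value) {
    static_assert(std::is_trivially_destructible<T>::value,
                  "free() will not run a destructor");
    vec.reserve(vec.size() + 1);          // may throw, but nothing is allocated yet
    void* raw = std::malloc(sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();
    T* obj = new (raw) T(value);          // placement new: construct in malloc'ed memory
    vec.push_back(raw);                   // cannot reallocate after the reserve above
    return obj;
}

// Hypothetical helper: release everything that push_param allocated.
void free_params(std::vector<void*>& vec) {
    for (void* p : vec) std::free(p);
    vec.clear();
}

// Usage: mixed int and float parameters in one vector of void*.
inline void params_demo() {
    std::vector<void*> vec;
    push_param(vec, 10);
    push_param(vec, 1.0f);
    assert(vec.size() == 2);
    free_params(vec);
}
```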
Another alternative, as you suggest, is to use a union... But you will also need to store "tags" in a parallel vector to record the actual type of the element, so that you can cast to the appropriate type on deletion.
In short, you must cast void * to an appropriate type before you can delete it. The "C++ way" would be to have a base class with a virtual destructor; you can call delete on that when it points to an instance of any sub-class. But if the library you are using has already determined the types, then that is not an option.
If you have control over the types you can create an abstract base class for them. Give that class a virtual destructor. Then you can have your std::vector<Object*> and iterate over it to delete anything which inherits from Object.
You probably need to have a second std::vector<void*> with pointers to the actual values, since the Object* probably hits the vtable first. A second virtual function like virtual void* ptr() { return &value; } would be useful here. And if it needs the size of the object you can add that too.
You could use the template pattern like this:
template<typename T>
class ObjVal : public Object {
public:
T val;
virtual void* ptr() { return &this->val; }
virtual size_t size() { return sizeof(this->val); }
};
Then you only have to type it once.
This is not particularly memory efficient because every Object picks up at least one extra pointer for the vtable.
However, new int(3) is not very memory efficient either because your allocator probably uses more than 4 bytes for it. Adding that vtable pointer may be essentially free.
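Put together, the base-class idea might look like the sketch below. The Object base class and the ObjectParams wrapper are assumptions made for illustration; the original only shows ObjVal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed abstract base class: the virtual destructor is what makes
// "delete" through an Object* safe for every derived type.
struct Object {
    virtual ~Object() {}
    virtual void* ptr() = 0;
    virtual std::size_t size() const = 0;
};

template <typename T>
struct ObjVal : Object {
    T val;
    explicit ObjVal(const T& v) : val(v) {}
    void* ptr() override { return &val; }
    std::size_t size() const override { return sizeof(val); }
};

// Hypothetical wrapper keeping the owning vector and the raw-address
// vector side by side so their lifetimes are tied together.
struct ObjectParams {
    std::vector<Object*> owners;   // used for deletion
    std::vector<void*>   raw;      // addresses of the values, for the C-style API

    ObjectParams() = default;
    ObjectParams(const ObjectParams&) = delete;
    ObjectParams& operator=(const ObjectParams&) = delete;
    ~ObjectParams() { for (Object* o : owners) delete o; }

    template <typename T>
    void add(const T& v) {
        owners.reserve(owners.size() + 1);   // reserve first so the pushes below cannot throw
        raw.reserve(raw.size() + 1);
        ObjVal<T>* o = new ObjVal<T>(v);
        owners.push_back(o);
        raw.push_back(o->ptr());
    }
};

// Usage: values of different types, one owner, no manual casts on deletion.
inline void object_params_demo() {
    ObjectParams p;
    p.add(10);
    p.add(1.5f);
    assert(p.raw.size() == 2);
}
```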
Use more than 1 vector. Keep the vector<void*> around to talk to the API (which I'm guessing requires a contiguous block of void*s of non-uniform types?), but also have a vector<std::unique_ptr<int>> and vector<std::unique_ptr<float>> which owns the data. When you create a new int, push a unique_ptr that owns the memory into your vector of ints, and then stick it on the API-compatible vector as a void*. Bundle the three vectors into one struct so that their lifetimes are tied together if possible (and it probably is).
You can also do this with a single vector that stores the ownership of the variables. A vector of roll-your-own RAII pseudo-unique_ptr, or shared_ptr with custom destroyers, or a vector of std::function<void()> that your "Bundle"ing struct's destroyer invokes, or what have you. But I wouldn't recommend these options.
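The three-vector bundle could be sketched like this. ParamBundle and its add overloads are hypothetical names; only int and float are covered, matching the example in the question.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical bundle: the typed vectors own the values, and api_args
// holds the same addresses as void* for the C-style API.
struct ParamBundle {
    std::vector<std::unique_ptr<int>>   ints;
    std::vector<std::unique_ptr<float>> floats;
    std::vector<void*>                  api_args;

    void add(int v) {
        ints.push_back(std::unique_ptr<int>(new int(v)));
        api_args.push_back(ints.back().get());
    }

    void add(float v) {
        floats.push_back(std::unique_ptr<float>(new float(v)));
        api_args.push_back(floats.back().get());
    }
    // No destructor needed: the unique_ptrs free the values
    // when the bundle goes out of scope.
};

// Usage: api_args.data() is the contiguous block of void* to hand over.
inline void param_bundle_demo() {
    ParamBundle b;
    b.add(10);
    b.add(1.0f);
    assert(b.api_args.size() == 2);
}
```

The addresses in api_args stay valid when the owning vectors grow, because reallocating a vector of unique_ptr moves the smart pointers, not the values they point to.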
I want to ask whether there are any problems with copying a vector of pointer items. Do I need to use strcpy or memcpy because there may be a deep copy problem?
For instance:
class B;
class A
{
....
private:
std::vector<B*> bvec;
public:
void setB(std::vector<B*>& value)
{
this->bvec = value;
}
};
int main()
{
....
std::vector<const B*> value; // and already has values
A a;
a.setB(value);
}
This example only assigns the value to the class variable bvec inside class A. Do I need to use memcpy, since I found that std::vector<B*> bvec; has pointer items? I am confused about deep copies in C++; could you clarify that for me? Thank you.
Think about this, if you remove and delete an item from the vector value after you call setB, then the vector in A will have a pointer that is no longer valid.
So either you need to do a "deep copy", have guarantees that the above scenario will never happen, or use shared smart pointers like std::shared_ptr instead of raw pointers. If you need pointers, I would recommend the last.
There is another alternative, and that is to store the vector in A as a reference to the real vector. However, this has other problems, like the real vector needs to be valid through the lifetime of the object. But here too you can use smart pointers, and allocate the vector dynamically.
It is unlikely you need strcpy or memcpy to solve your problem. However, I'm not sure what your problem is.
I will try to explain copying as it relates to std::vector.
When you assign value to bvec in setB, you are making a deep copy of the vector. This means all of the elements in the vector are copied from value to bvec. If you have a vector of objects, each object is copied. If you have a vector of pointers, each pointer is copied, but the objects they point to are not, so both vectors then refer to the same objects.
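A small sketch of what that means for a vector of pointers. The struct B with a single int member is an assumption made for the example.

```cpp
#include <cassert>
#include <vector>

// Assumed minimal definition of B for the example.
struct B {
    int n;
};

// Copies a vector of pointers and checks that both vectors still
// point at the same B object.
bool copy_shares_pointees() {
    B b{1};
    std::vector<B*> value;
    value.push_back(&b);

    std::vector<B*> bvec = value;   // copies the pointer, not the B it points to
    value[0]->n = 2;                // modify the object through the original vector

    assert(bvec[0] == value[0]);    // same address in both vectors
    return bvec[0]->n == 2;         // the change is visible through the copy
}
```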
Another option is to simply copy the pointer to the vector if you wish to reference the elements later on. Just be careful to manage the lifetimes properly!
I hope that helps!
You probably want to define your copy constructor for class A to ensure the problem you're asking about is handled correctly (though not by using memcpy or strcpy). Always follow the rule of three here. I'm pretty sure that with std::vector you're good, but if not, then use a for loop instead of memcpy.
I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value. And I feel dirty converting back to a reference, it just seems wrong.
Is it?
To clarify...
MyType *pObj = ...
MyType &obj = *pObj;
Isn't this 'dirty', since you can (even if only in theory since you'd check it first) dereference a NULL pointer?
EDIT: Oh, and you don't know if the objects were dynamically created or not.
Ensure that the pointer is not NULL before you try to convert the pointer to a reference, and that the object will remain in scope as long as your reference does (or remain allocated, in reference to the heap), and you'll be okay, and morally clean :)
Initialising a reference with a dereferenced pointer is absolutely fine, nothing wrong with it whatsoever. If p is a pointer, and if dereferencing it is valid (so it's not null, for instance), then *p is the object it points to. You can bind a reference to that object just like you bind a reference to any object. Obviously, you must make sure the reference doesn't outlive the object (like any reference).
So for example, suppose that I am passed a pointer to an array of objects. It could just as well be an iterator pair, or a vector of objects, or a map of objects, but I'll use an array for simplicity. Each object has a function, order, returning an integer. I am to call the bar function once on each object, in order of increasing order value:
void bar(Foo &f) {
// does something
}
bool by_order(Foo *lhs, Foo *rhs) {
return lhs->order() < rhs->order();
}
void call_bar_in_order(Foo *array, int count) {
std::vector<Foo*> vec(count); // vector of pointers
for (int i = 0; i < count; ++i) vec[i] = &(array[i]);
std::sort(vec.begin(), vec.end(), by_order);
for (int i = 0; i < count; ++i) bar(*vec[i]);
}
The reference that my example has initialized is a function parameter rather than a variable directly, but I could just have validly done:
for (int i = 0; i < count; ++i) {
Foo &f = *vec[i];
bar(f);
}
Obviously a vector<Foo> would be incorrect, since then I would be calling bar on a copy of each object in order, not on each object in order. bar takes a non-const reference, so quite aside from performance or anything else, that clearly would be wrong if bar modifies the input.
A vector of smart pointers, or a boost pointer vector, would also be wrong, since I don't own the objects in the array and certainly must not free them. Sorting the original array might also be disallowed, or for that matter impossible if it's a map rather than an array.
No. How else could you implement operator=? You have to dereference this in order to return a reference to yourself.
Note though that I'd still store the items in the STL container by value -- unless your object is huge, overhead of heap allocations is going to mean you're using more storage, and are less efficient, than you would be if you just stored the item by value.
My answer doesn't directly address your initial concern, but it appears you encounter this problem because you have an STL container that stores pointer types.
Boost provides the ptr_container library to address these types of situations. For instance, a ptr_vector internally stores pointers to types, but returns references through its interface. Note that this implies that the container owns the pointer to the instance and will manage its deletion.
Here is a quick example to demonstrate this notion.
#include <string>
#include <boost/ptr_container/ptr_vector.hpp>
void foo()
{
boost::ptr_vector<std::string> strings;
strings.push_back(new std::string("hello world!"));
strings.push_back(new std::string());
const std::string& helloWorld(strings[0]);
std::string& empty(strings[1]);
}
I'd much prefer to use references everywhere but the moment you use an STL container you have to use pointers unless you really want to pass complex types by value.
Just to be clear: STL containers were designed to support certain semantics ("value semantics"), such as "items in the container can be copied around." Since references aren't rebindable, they don't support value semantics (i.e., try creating a std::vector<int&> or std::list<double&>). You are correct that you cannot put references in STL containers.
Generally, if you're using references instead of plain objects you're either using base classes and want to avoid slicing, or you're trying to avoid copying. And, yes, this means that if you want to store the items in an STL container, then you're going to need to use pointers to avoid slicing and/or copying.
And, yes, the following is legit (although in this case, not very useful):
#include <iostream>
#include <vector>
// note signature, inside this function, i is an int&
// normally I would pass a const reference, but you can't add
// a "const int*" to a "std::vector<int*>"
void add_to_vector(std::vector<int*>& v, int& i)
{
v.push_back(&i);
}
int main()
{
int x = 5;
std::vector<int*> pointers_to_ints;
// x is passed by reference
// NOTE: this line could have simply been "pointers_to_ints.push_back(&x)"
// I simply wanted to demonstrate (in the body of add_to_vector) that
// taking the address of a reference returns the address of the object the
// reference refers to.
add_to_vector(pointers_to_ints, x);
// get the pointer to x out of the container
int* pointer_to_x = pointers_to_ints[0];
// dereference the pointer and initialize a reference with it
int& ref_to_x = *pointer_to_x;
// use the reference to change the original value (in this case, to change x)
ref_to_x = 42;
// show that x changed
std::cout << x << '\n';
}
Oh, and you don't know if the objects were dynamically created or not.
That's not important. In the above sample, x is on the stack and we store a pointer to x in pointers_to_ints. Sure, pointers_to_ints uses a dynamically allocated array internally (and delete[]s that array when the vector goes out of scope), but that array holds the pointers, not the pointed-to things. When pointers_to_ints falls out of scope, the internal int*[] is delete[]-ed, but the int*s are not deleted.
This, in fact, makes using pointers with STL containers hard, because the STL containers won't manage the lifetime of the pointed-to objects. You may want to look at Boost's pointer containers library. Otherwise, you'll either (1) want to use STL containers of smart pointers (like boost::shared_ptr, which is legal for STL containers) or (2) manage the lifetime of the pointed-to objects some other way. You may already be doing (2).
If you want the container to actually contain objects that are dynamically allocated, you shouldn't be using raw pointers. Use unique_ptr or whatever similar type is appropriate.
There's nothing wrong with it, but please be aware that at the machine-code level a reference is usually the same as a pointer. So, usually the pointer isn't really dereferenced (no memory access) when assigned to a reference.
So in real life the reference can be 0, and the crash occurs when the reference is used, which can happen much later than its assignment.
Of course what happens exactly heavily depends on compiler version and hardware platform as well as compiler options and the exact usage of the reference.
Officially the behaviour of dereferencing a 0-Pointer is undefined and thus anything can happen. This anything includes that it may crash immediately, but also that it may crash much later or never.
So always make sure that you never assign a 0-pointer to a reference; bugs like this are very hard to find.
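One way to enforce that check at the point of conversion is a tiny helper like the one below. The name deref_checked and the choice of std::invalid_argument are assumptions for this sketch; an assert or an early return would serve equally well.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper: converts a pointer to a reference, but refuses
// a null pointer instead of silently creating an invalid reference.
template <typename T>
T& deref_checked(T* p) {
    if (p == nullptr) throw std::invalid_argument("null pointer");
    return *p;
}

// Usage: the failure shows up where the reference is bound,
// not at some later use of it.
inline void deref_demo() {
    int x = 5;
    int& ref = deref_checked(&x);
    ref = 42;
    assert(x == 42);
}
```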
Edit: Made the "usually" italic and added paragraph about official "undefined" behaviour.