C++: Pointer as a key in a hashtable


Are there any problems using pointers as hashtable keys during program run? (no need to store to disk and use it later as that causes obvious problems)
There are many circumstances where I need to quickly know whether an object belongs to some object manager. A quick way to check is to store every object of an object manager in a hashtable where the object's pointer is the key to the actual object, e.g., HashTable

No, there are no problems. It's just like storing an int.
A pointer has a value that doesn't change and that uniquely identifies a resource.
There will of course be problems if you misuse your pointers, but that is a separate, unrelated issue.
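A minimal sketch of the idea from the question, assuming C++14 and using std::unordered_map in place of the unspecified HashTable. The names ObjectManager, Object, create, owns and destroy are hypothetical, chosen only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

struct Object { int id; };

// Hypothetical manager: owns its objects and indexes them by address.
class ObjectManager {
public:
    // Creates an object, stores it, and returns a non-owning pointer to it.
    Object* create(int id) {
        auto obj = std::make_unique<Object>(Object{id});
        Object* raw = obj.get();
        objects_.emplace(raw, std::move(obj));
        return raw;
    }

    // Average O(1) membership test; the pointer is hashed and compared,
    // never dereferenced.
    bool owns(const Object* p) const { return objects_.count(p) != 0; }

    // Destroys the object and erases its key in one step,
    // so no stale key is left behind in the table.
    bool destroy(const Object* p) { return objects_.erase(p) == 1; }

private:
    std::unordered_map<const Object*, std::unique_ptr<Object>> objects_;
};
```

One caveat worth keeping in mind: once an object is destroyed, its address may be reused by a later allocation, so a stale pointer kept elsewhere could match a different, newer object.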

It should work well enough. Are you seeing any problems? Maybe you should just try and see. :)

Off the top of my head: if the hashtable itself is deallocated without first deleting the objects that its pointers refer to, you are going to have memory leaks hanging around.

Related

Pointer pointing to deleted stack memory

I believe what I have just experienced is called "undefined behavior", but I'm not quite sure. Basically, I had an instance declared in an outer scope that holds addresses of a class. In the inner level I instantiated an object on the stack and stored the address of that instance into the holder.
After the inner scope had escaped, I checked to see if I could still access methods and properties of the removed instance. To my surprise it worked without any problem.
Is there a simple way to combat this? Is there a way I can clear deleted pointers from the list?
example:
std::vector<int*> holder;
{
    int inside = 12;
    holder.push_back(&inside);
}   // inside is destroyed here; holder[0] now dangles
std::cout << "deleted variable:" << holder[0] << std::endl;

Is there a simple way to combat this?
Sure, there are a number of ways to avoid this sort of problem.
The easiest way would be to not use pointers at all -- pass objects by value instead. i.e. In your example code, you could use a std::vector<int> instead of a std::vector<int *>.
If your objects are not copy-able for some reason, or are large enough that you think it will be too expensive to make copies of them, you could allocate them on the heap instead, and manage their lifetimes automatically using shared_ptr or unique_ptr or some other smart-pointer class. (Note that passing objects by value is more efficient than you might think, even for larger objects, since it avoids having to deal with the heap, which can be expensive... and modern CPUs are most efficient when dealing with contiguous memory. Finally, modern C++ has various optimizations that allow the compiler to avoid actually doing a data copy in many circumstances)
In general, retaining pointers to stack objects is a bad idea unless you are 100% sure that the pointer's lifetime will be a subset of the lifetime of the stack object it points to. (and even then it's probably a bad idea, because the next programmer who takes over the code after you've moved on to your next job might not see this subtle hazard and is therefore likely to inadvertently introduce dangling-pointer bugs when making changes to the code)
After the inner scope had escaped, I checked to see if I could still
access methods and properties of the removed instance. To my surprise
it worked without any problem.
That can happen if the memory where the object was hasn't been overwritten by anything else yet -- but definitely don't rely on that behavior (or any other particular behavior) if/when you dereference an invalid pointer, unless you like spending a lot of quality time with your debugger chasing down random crashes and/or other odd behavior :)
Is there a way I can clear deleted pointers from the list?
In principle, you could add code to the objects' destructors that would go through the list and look for pointers to themselves and remove them. In practice, I think that is a poor approach, since it uses up CPU cycles trying to recover from an error that a better design would not have allowed to be made in the first place.
Btw this is off topic but it might interest you that the Rust programming language is designed to detect and prevent this sort of error by catching it at compile-time. Maybe someday C++ will get something similar.

There is no such thing as a deleted pointer. A pointer is just a number representing an address in your process's virtual address space. Even when the stack frame is long gone, the memory that held it is still mapped, since the stack was allocated when the thread started. So, technically speaking, the address is still readable, in the sense that dereferencing it will usually give you something back. But since the object it pointed to is already gone, the correct term is dangling pointer, and dereferencing it is undefined behavior. The moral is that if you have a pointer to an object in a stack frame, there is no way to determine whether it is still valid, not even with functions like IsBadReadPtr (Win32 API, just as an example). The best way to prevent such situations is to avoid returning and storing pointers to stack objects.
However, if you wish to track your heap allocated memory and automatically deallocate it after it is no longer used, you could utilize smart pointers (std::shared_ptr, boost::shared_ptr, etc).
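As a rough sketch of the smart-pointer suggestion, here is the question's holder rewritten with std::shared_ptr, so the value created in the inner scope stays alive as long as the vector refers to it. The function name make_holder is made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Returns a holder whose element was created in an inner scope.
// The vector shares ownership, so the int outlives that scope.
std::vector<std::shared_ptr<int>> make_holder() {
    std::vector<std::shared_ptr<int>> holder;
    {
        auto inside = std::make_shared<int>(12);
        holder.push_back(inside);   // two owners: `inside` and the vector
    }                               // `inside` goes away; the vector still owns the int
    return holder;
}
```

If the element type is as cheap to copy as an int, storing values directly in a std::vector<int> is simpler still.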

Pointer to std::vector

So here is the problem I am having.
I have a pointer to a std::vector. After I initialize the pointer, I don't add any items to the vector, nor remove any. However, at a certain point in my code, my std::vector moves locations, and I end up with a dangling pointer. This seems to happen randomly, even though I never touch the vector after making the pointer.
It took me a while debugging this to figure the problem out. Is there a way to guarantee that my std::vector will not change memory locations? Or is it just a bad idea to have a pointer to a vector?

Or is it just a bad idea to have a pointer to a vector?
In general, I would say it is a bad idea to have raw pointers controlling an object's lifetime. Instead of raw pointers and manual memory management, try using smart pointers with the appropriate ownership semantics (std::shared_ptr<> or std::unique_ptr<>). Use raw pointers only as observing pointers (and if you want to be able to verify at run-time whether they are dangling, use weak_ptr<>).
Also, in many cases you may realize you do not need a pointer at all. In that case, just use an object with automatic storage, which can be efficiently moved around or passed/returned by value in C++11 thanks to move semantics.
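The weak_ptr remark can be illustrated with a small sketch. It assumes the vector is owned by a std::shared_ptr somewhere else; observed_size is a hypothetical helper, and the -1 return value is just a convention picked for this example.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Returns the size of the observed vector, or -1 if it has already been destroyed.
long observed_size(const std::weak_ptr<std::vector<int>>& observer) {
    if (auto owner = observer.lock()) {   // empty shared_ptr if the vector is gone
        return static_cast<long>(owner->size());
    }
    return -1;
}
```

Unlike a raw pointer, the observer can be asked whether its target still exists (via lock() or expired()) before it is used.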

This seems to happen randomly
No, it doesn't. As long as it stays in scope it has the same address. What is probably happening is that the vector is going out of scope, and since it was (it sounds like) automatically allocated, it is getting destroyed at that time. What you can do is allocate the vector on the heap (e.g., for ints):
std::vector<int>* pv = new std::vector<int>();
Then you will not have this problem. However, you must remember to explicitly delete it with
delete pv;
before pv goes out of scope or you'll get a memory leak.

Deleting dynamically allocated memory from a map

I've got a map of dynamically allocated objects and was wondering what the best way of deleting them was?
I was thinking maybe an iterator? Something like:
for (studentlist::const_iterator deletemem = studentmap.begin(); deletemem != studentmap.end(); ++deletemem)
{
    Record *z = deletemem->second; // Record is the type of object stored in the map studentmap
    delete z;
}
But I'm not sure, any help would be much appreciated!

Your code looks fine. However, manually deleting is probably not exception-safe. You might consider using shared_ptr (either the one from Boost, or, if you use C++0x, the standard implementation) for the values, or using a boost::ptr_map.
Edit: To remove the entries themselves from the map, call studentmap.clear() after deleting all the contents. This removes all elements, so the map no longer holds pointers to deleted objects.
Edit 2: The problem with your solution arises when, for instance due to an exception, your cleanup code does not get called. In that case you leak memory. To overcome this problem, you can employ the RAII-idiom. Two possible ways of doing this are described above.
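A minimal sketch of the RAII approach, assuming C++14 and using std::unique_ptr as the owning value type rather than the shared_ptr or boost::ptr_map mentioned above. The Record here is a stand-in for the asker's type, and the live counter exists only to make the cleanup observable.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the question's Record type.
struct Record {
    static int live;   // number of Record instances currently alive
    std::string name;
    explicit Record(std::string n) : name(std::move(n)) { ++live; }
    Record(const Record&) = delete;
    Record& operator=(const Record&) = delete;
    ~Record() { --live; }
};
int Record::live = 0;

using StudentMap = std::map<int, std::unique_ptr<Record>>;

// Fills a map whose values own their Records; no delete loop is needed later.
StudentMap make_students() {
    StudentMap students;
    students.emplace(1, std::make_unique<Record>("Ada"));
    students.emplace(2, std::make_unique<Record>("Linus"));
    return students;
}
```

Erasing an entry, clearing the map, or letting the map go out of scope (including during stack unwinding after an exception) destroys the corresponding Records automatically.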

A better solution would be to encapsulate your dynamically allocated memory inside another object which is stack allocated. This could either be one of your own invention or you could use the shared_ptr class from the new C++ standard or from boost.

Your code should work, but you should clear the studentmap at the end.
Of course, you also have to invalidate any other object that holds a copy of the pointers you are deallocating, otherwise your application will probably crash.

A std::vector of pointers?

Here is what I'm trying to do. I have a std::vector with a certain number of elements; it can grow but not shrink. The thing is that it's sort of cell based, so there may not be anything at a given position. Instead of creating an empty object and wasting memory, I thought of just NULLing that cell in the std::vector. The issue is: how do I get pointers in there without needing to manage my memory? How can I take advantage of not having to do new and keep track of the pointers?

How large are the objects and how sparse do you anticipate the vector will be? If the objects are not large or if there aren't many holes, the cost of having a few "empty" objects may be lower than the cost of having to dynamically allocate your objects and manage pointers to them.
That said, if you do want to store pointers in the vector, you'll want to use a vector of smart pointers (e.g., a vector<shared_ptr<T>>) or a container designed to own pointers (e.g., Boost's ptr_vector<T>).

If you're going to use pointers something will need to manage the memory.
It sounds like the best solution for you would be to use boost::optional. I believe it has exactly the semantics that you are looking for. (http://www.boost.org/doc/libs/1_39_0/libs/optional/doc/html/index.html).
Actually, after I wrote this, I realized that your use case(e.g. expensive default constructor) is used by the boost::optional docs: http://www.boost.org/doc/libs/1_39_0/libs/optional/doc/html/boost_optional/examples.html#boost_optional.examples.bypassing_expensive_unnecessary_default_construction
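A small sketch of the optional-based approach. It uses std::optional from C++17 as a stand-in for boost::optional (the interface used here is very similar, but this is a substitution, not the Boost code from the linked docs). Cell, Grid and occupied are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

struct Cell { int value; };

// Each slot either holds a Cell in place or is empty; no heap allocation per cell.
using Grid = std::vector<std::optional<Cell>>;

// Counts the slots that currently hold a Cell.
std::size_t occupied(const Grid& grid) {
    std::size_t n = 0;
    for (const auto& slot : grid) {
        if (slot.has_value()) ++n;
    }
    return n;
}
```

Note that an empty optional still reserves storage for a Cell inside the vector, so this avoids constructing unused objects rather than avoiding their memory footprint.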

You can use a deque to hold an ever-increasing number of objects, and use your vector to hold pointers to the objects. The deque won't invalidate pointers to existing objects it holds if you only add new objects to the end of it. This is far less overhead than allocating each object separately. Just ensure that the deque is destroyed after or at the same time as the vector so you don't create dangling pointers.
However, based on the size of the 3-D array you mentioned in another answer's comment, you may have difficulty storing that many pointers. You might want to look into a sparse array implementation so that you mainly use memory for the portions of the array where you have non-null pointers.
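The deque idea from this answer might look roughly like the sketch below. CellGrid, Cell, set and get are invented names; the class is made non-copyable because a copy would carry pointers into the original's deque.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Cell { int value; };

class CellGrid {
public:
    explicit CellGrid(std::size_t slots) : slots_(slots, nullptr) {}
    CellGrid(const CellGrid&) = delete;
    CellGrid& operator=(const CellGrid&) = delete;

    // Appends a new Cell to the deque and points the slot at it.
    Cell& set(std::size_t index, int value) {
        Cell*& slot = slots_.at(index);    // throws std::out_of_range on a bad index
        storage_.push_back(Cell{value});   // push_back leaves existing elements in place
        slot = &storage_.back();
        return *slot;
    }

    // Returns nullptr for an empty slot.
    const Cell* get(std::size_t index) const { return slots_.at(index); }

private:
    std::deque<Cell> storage_;    // owns the cells, grow-only
    std::vector<Cell*> slots_;    // non-owning; nullptr means "empty"
};
```

In this sketch, overwriting a slot leaves the previous Cell in the deque until the whole grid is destroyed, which matches the grow-only use case but would waste memory if slots were rewritten often.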

You could use a smart pointer. For example boost::shared_ptr.

The issue is that how do I get pointers in there without needing to manage my memory?
You can certainly do this using shared_ptr or the other similar techniques mentioned here. But in the near future you will come across some problem where you have to manage your own memory, so please get comfortable with the pointer concept.
In big server applications, managing the memory of objects is usually treated as a responsibility of its own, and a class is created specifically for that purpose. This is known as a pool. Whenever you need an object, you ask the pool to give you one, and whenever you are done with the object, you tell the pool that you are done. It is then the responsibility of the pool to decide what to do with that object.
The basic idea is that your main program still deals with pointers but does not care about the memory. Some other object takes care of it.
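A very small sketch of such a pool, under the assumption that objects can simply be recycled as they are. ConnectionPool, Connection, acquire, release and idle are hypothetical names; a real pool would usually also reset object state, cap its size, and handle threading.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

struct Connection { int uses = 0; };

class ConnectionPool {
public:
    // Hands out a recycled object if one is idle, otherwise creates a new one.
    std::unique_ptr<Connection> acquire() {
        if (free_.empty()) return std::make_unique<Connection>();
        auto obj = std::move(free_.back());
        free_.pop_back();
        return obj;
    }

    // The caller says "I am done"; this pool chooses to keep the object for reuse.
    void release(std::unique_ptr<Connection> obj) {
        if (obj) free_.push_back(std::move(obj));
    }

    // Number of objects currently waiting to be reused.
    std::size_t idle() const { return free_.size(); }

private:
    std::vector<std::unique_ptr<Connection>> free_;
};
```

The calling code only ever asks for an object and hands it back; whether the memory is freed, cached, or reused is decided inside the pool.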

How do I know who holds the shared_ptr<>?

I use boost::shared_ptr in my C++ application. The memory problem is really serious: the application takes a large amount of memory.
However, because I put every newed object into a shared_ptr, no memory leak can be detected when the application exits.
There must be something like std::vector<shared_ptr<> > pool holding the resource. How can I know who holds the shared_ptr, when debugging?
It is hard to review code line by line. Too much code...

You can't know, by only looking at a shared_ptr, where the "sibling pointers" are. You can test if one is unique() or get the use_count(), among other methods.

The popular widespread use of shared_ptr will almost inevitably cause unwanted and unseen memory occupation.
Cyclic references are a well-known cause, and some of them can be indirect and difficult to spot, especially in complex code that is worked on by more than one programmer; a programmer may decide that one object needs a reference to another as a quick fix and not have time to examine all the code to see whether he is closing a cycle. This hazard is hugely underestimated.
Less well understood is the problem of unreleased references. If an object is shared out to many shared_ptrs then it will not be destroyed until every one of them is zeroed or goes out of scope. It is very easy to overlook one of these references and end up with objects lurking unseen in memory that you thought you had finished with.
Although strictly speaking these are not memory leaks (it will all be released before the program exits) they are just as harmful and harder to detect.
These problems are the consequences of expedient false declarations:
Declaring what you really want to be single ownership as shared_ptr. scoped_ptr would be correct but then any other reference to that object will have to be a raw pointer, which could be left dangling.
Declaring what you really want to be a passive observing reference as shared_ptr. weak_ptr would be correct but then you have the hassle of converting it to shared_ptr every time you want to use it.
I suspect that your project is a fine example of the kind of trouble that this practice can get you into.
If you have a memory intensive application you really need single ownership so that your design can explicitly control object lifetimes.
With single ownership opObject=NULL; will definitely delete the object and it will do it now.
With shared ownership spObject=NULL; ........who knows?......

One solution we've used for dangling or circular smart pointer references is to customize the smart pointer class with a debug-only bookkeeping function. Whenever a smart pointer adds a reference to an object, it takes a stack trace and puts it in a map in which each entry keeps track of:
The address of the object being allocated (what the pointer points to)
The addresses of each smartpointer object holding a reference to the object
The corresponding stacktraces of when each smartpointer was constructed
When a smartpointer goes out of scope, its entry in the map gets deleted. When the last smartpointer to an object gets destroyed, the pointee object gets its entry in the map removed.
Then we have a "track leaks" command with two functions: '[re]start leak tracking' (which clears the whole map and enables tracking if it's not already enabled), and 'print open references', which shows all outstanding smart pointer references created since the 'start leak tracking' command was issued. Since you can see the stack traces of where those smart pointers came into being, you can easily tell exactly who is keeping your object from being freed. It slows things down when it's on, so we don't leave it on all the time.
It's a fair amount of work to implement, but definitely worth it if you've got a codebase where this happens a lot.

You may be experiencing a shared pointer memory leak via cycles. What happens is your shared objects may hold references to other shared objects which eventually lead back to the original. When this happens the cycle keeps all reference counts at 1 even though no one else can access the objects. The solution is weak pointers.
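To make the cycle concrete, here is a small sketch with a hypothetical Node type that can link back either with an owning or a weak pointer. The live counter and the make_pair_of_nodes helper are illustrative only.

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int live;                     // number of Node instances currently alive
    std::shared_ptr<Node> strong_peer;   // owning link
    std::weak_ptr<Node> weak_peer;       // observing link
    Node() { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Links a -> b with an owning pointer, and b -> a either weakly or with a
// second owning pointer (which closes a cycle). Returns an observer of a.
std::weak_ptr<Node> make_pair_of_nodes(bool use_weak_back_link) {
    auto a = std::make_shared<Node>();
    auto b = std::make_shared<Node>();
    a->strong_peer = b;
    if (use_weak_back_link) {
        b->weak_peer = a;     // b observes a without keeping it alive
    } else {
        b->strong_peer = a;   // cycle: a and b now keep each other alive
    }
    return a;                 // the local owners a and b are released on return
}
```

With the weak back link both nodes are destroyed when the function returns; with the owning back link both survive, unreachable by name, until something breaks the cycle by resetting one of the owning links.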

Try refactoring some of your code so that ownership is more explicitly expressed by the use of weak pointers instead of shared pointers in some places.
When looking at your class hierarchy it's possible to determine which class really should hold a shared pointer and which merely needs only the weak one, so you can avoid cycles if there are any and if the "real" owner object is destructed, "non-owner" objects should have already been gone. If it turns out that some objects lose pointers too early, you have to look into object destruction sequence in your app and fix it.

You're obviously holding onto references to your objects within your application. This means that you are, on purpose, keeping things in memory. That means, you don't have a memory leak. A memory leak is when memory is allocated, and then you do not keep a reference to its address.
Basically, you need to look at your design and figure out why you are keeping so many objects and data in memory, and how can you minimize it.
The one possibility that you have a pseudo-memory leak is that you are creating more objects than you think you are. Try putting breakpoints on all statements containing a 'new'. See if your application is constructing more objects than you thought it should, and then read through that code.
The problem is really not so much a memory-leak as it is an issue of your application's design.

I was going to suggest using UMDH if you are on Windows. It is a very powerful tool. Use it to find allocations per transaction/time period that you expect to be freed, then find who is holding them.
There is more information on this SO answer
Find memory leaks caused by smart pointers

It is not possible to tell from within the program which objects hold a shared_ptr. If you are on Linux, one sure way to debug memory leaks is the Valgrind tool: while it won't directly answer your question, it will tell you where the memory was allocated, which is usually sufficient for fixing the problem. I imagine Windows has comparable tools, but I do not know which one is best.
