Using an unordered_set with shared_ptr keys


I am trying to use the following data collection in my program:
boost::unordered_set<boost::shared_ptr<Entity> > _entities;
I am using unordered_set because I want fast insertion and removal (by key, not iterator) of Entities.
My question is: if I implement the following two functions:
void addEntity(boost::shared_ptr<Entity> entity) {
    _entities.insert(entity);
}
void removeEntity(boost::shared_ptr<Entity> entity) {
    _entities.erase(entity);
}
When I try to remove the entity, will the unordered_set find it? Because the shared_ptr that is stored inside the unordered_set is a copy of the shared_ptr that I am trying to use to remove the entity from the unordered_set if I call removeEntity().
What do I need to do for the unordered_set to find the Entity? Do I need to create a comparison function that checks the values of the shared_ptr's? But then won't the unordered_set slow down because its hash function uses the shared_ptr as hash? Will I need to create a hash function that uses the Entity as hash also?

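Yes, you can use boost::shared_ptr as the key of a boost::unordered_set (the same applies to the std versions of these classes).
boost::unordered_set uses the boost::hash function template to compute hash values for its keys. boost::hash supports boost::shared_ptr by hashing the underlying raw pointer, and shared_ptr equality also compares the raw pointers, so a copy of a shared_ptr finds the element that was inserted through the original.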
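A minimal sketch of that behaviour, using the std versions (assuming C++11 or later). Entity is a placeholder struct and both function names are hypothetical; the only point is that a copy of a shared_ptr hashes and compares equal to the stored one, because both operations look at the raw pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>

struct Entity { int value = 0; };  // placeholder type, not from the original post

// Inserts one entity, then erases it through a *copy* of the pointer.
// Returns the number of elements erased (1 if the copy was found).
std::size_t demo_erase_by_copy() {
    std::unordered_set<std::shared_ptr<Entity>> entities;
    auto original = std::make_shared<Entity>();
    entities.insert(original);

    std::shared_ptr<Entity> copy = original;  // same raw pointer, same hash
    return entities.erase(copy);
}

// Two distinct objects with equal contents are still different keys,
// because hashing and equality only look at the address.
bool distinct_objects_are_distinct_keys() {
    std::unordered_set<std::shared_ptr<Entity>> entities;
    entities.insert(std::make_shared<Entity>());
    entities.insert(std::make_shared<Entity>());
    return entities.size() == 2;
}
```

So no custom hash or comparison function is needed as long as identity (same object) is what you want, rather than equality of Entity contents.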

Let me try to explain this, if I understood it correctly:
boost::shared_ptr is implemented with reference counting. Every time you copy it, for example by passing it by value to a function, the reference count goes up; every time a copy is destroyed, the count goes down. Only when the count reaches 0 is the object deleted from memory.
Always be cautious when using it. It can save you from memory leaks only as long as your design accommodates it.
For example, I ran into one issue.
I had a class with a map containing shared_ptrs. Later, without my knowledge, this class was also made responsible for passing these shared_ptrs to other classes, which in turn stored them in their own containers. As a result, the code blew up, luckily in the tester's face, in the form of a memory leak.
I hope you can figure out why.

Related

Access to custom objects in unordered_set

Please help to figure out the logic of using unordered_set with custom structures.
Consider I have following class
struct MyClass {
int id;
// other members
};
used with shared_ptr
using CPtr = std::shared_ptr<MyClass>;
Because I want fast access by key, I planned to use an unordered_set with a custom hash and the MyClass::id member as the key:
template <class T> struct CHash;
template <> struct CHash<CPtr>
{
    std::size_t operator() (const CPtr& c) const
    {
        return std::hash<decltype(c->id)> {} (c->id);
    }
};
using CSet = std::unordered_set<CPtr, CHash<CPtr>>; // CSet is an assumed alias name
So far, unordered_set still seems to be an appropriate container. However, the standard find() functions for sets give only const access to the elements, to ensure that keys are not changed. I intend to change the objects while guaranteeing that the keys stay unchanged. So, the questions are:
1) How can I get easy access to an element of the set by its int key while keeping the ability to change the element? Something like
auto element = my_set.find(5);
element->b = 3.3;
It is possible to add a converting constructor and use something like
auto element = my_set.find(MyClass (5));
But that doesn't solve the problem with constness, and what if the class is huge?
2) Am I going about this the wrong way? Should I use another container? For example an unordered_map, which would store one more int key for each entry and so consume more memory.

A pointer doesn't project its constness onto the object it points to. That means that if you have a constant reference to a std::shared_ptr (as in a set), you can still modify the object via this pointer. Whether or not that is something you should do is a different question, and it doesn't solve your lookup problem.
Of course, if you want to look up a value by a key, then this is what std::unordered_map was designed for, so I'd have a closer look there. The main problem I see with this approach is not so much the memory overhead (unordered_set and unordered_map, as well as shared_ptr, have noticeable memory overhead anyway), but that you have to maintain redundant information (the id in the object and the id as a key).
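A rough sketch of the unordered_map variant, under a few assumptions: the member b is made up to mirror the element->b line in the question, and CMap, add and find_by_id are hypothetical names. Inserting through a helper that reads the key from the object is one way to keep the redundant id from drifting apart, at least at insertion time.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

struct MyClass {
    int id;
    double b;  // assumed member, mirroring `element->b` in the question
};

using CPtr = std::shared_ptr<MyClass>;
using CMap = std::unordered_map<int, CPtr>;  // hypothetical alias

// Inserts under the object's own id, so the key always starts out
// equal to the id stored in the object.
void add(CMap& m, CPtr p) {
    int key = p->id;  // read the key before moving from p
    m.emplace(key, std::move(p));
}

// Returns nullptr when no entry has the given id.
CPtr find_by_id(const CMap& m, int id) {
    auto it = m.find(id);
    if (it == m.end()) return nullptr;
    return it->second;
}
```

The returned pointer is non-const, so find_by_id(m, 5)->b = 3.3; modifies the stored object without touching the key.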
If you don't have many insertions, don't absolutely need the (on average) constant lookup time, and memory overhead is really important to you, you could consider a third solution (besides using a third-party or self-written data structure, of course): write a thin wrapper around a sorted std::vector<std::shared_ptr<MyClass>> or, if appropriate, even better a std::vector<std::unique_ptr<MyClass>> that uses std::upper_bound for lookups.
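One possible shape for such a wrapper, as a sketch only. SortedById is a hypothetical name and the member b is assumed. Note that this version uses std::lower_bound rather than std::upper_bound, because lower_bound lands directly on a matching element if there is one.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

struct MyClass {
    int id;
    double b;  // assumed member
};

class SortedById {  // hypothetical wrapper name
public:
    // Inserts while keeping the vector sorted by id. O(n), since later
    // elements shift. Returns false if the id is already present.
    bool insert(std::unique_ptr<MyClass> p) {
        auto pos = lower(p->id);
        if (pos != items_.end() && (*pos)->id == p->id) return false;
        items_.insert(pos, std::move(p));
        return true;
    }

    // O(log n) binary search. Returns nullptr if the id is absent.
    MyClass* find(int id) {
        auto pos = lower(id);
        return (pos != items_.end() && (*pos)->id == id) ? pos->get() : nullptr;
    }

private:
    // First element whose id is not less than the requested one.
    std::vector<std::unique_ptr<MyClass>>::iterator lower(int id) {
        return std::lower_bound(
            items_.begin(), items_.end(), id,
            [](const std::unique_ptr<MyClass>& e, int key) { return e->id < key; });
    }

    std::vector<std::unique_ptr<MyClass>> items_;
};
```

The only per-element overhead is the pointer itself, which is the memory argument for this layout.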

I think you are going the wrong way by using unordered_set, because its definition states very clearly:
Keys are immutable, therefore, the elements in an unordered_set cannot be modified once in the container - they can be inserted and removed, though.
You can see the definition at:
http://www.cplusplus.com/reference/unordered_set/unordered_set/.
I hope this helps. Thanks.

Observer with multiple subjects using std::unordered_set

I have seen implementations of the Observer design pattern in which the Observer is responsible for multiple Subjects. Most of these implementations use a std::vector<Subject*> in order to keep track of the Subjects.
Would it be possible for me to do a similar thing, using a std::unordered_set<weak_ptr<Subject>> instead?
The reason I want to use an unordered_set is that I will not need duplicates, and I don't need an ordered container. From what I understand, an unordered_set is the way to go in this situation. Also, the reason I am using a weak_ptr is that it should be safer?
If you disagree, leave an answer explaining what container I should use instead. If I did use the unordered_set, I would have to declare a hash function for the weak_ptr, but could this be accomplished by just using the hash function for the pointer inside, obtained with subjects.lock().get()?

First of all, in my answer I will use Subject for the one who sends messages to registered Observers, since that is the common use of these two terms.
Would it be possible for me to do a similar thing, using a std::unordered_set<weak_ptr<Observer>> instead?
It is possible. However, remember that the object referred to by a weak_ptr can be freed at any time, so a weak_ptr has to be converted to a shared_ptr (with lock()) before accessing the underlying object. It is done this way so that the object is not freed while you are handling it.
Would it be possible for me to do a similar thing, using a std::unordered_set<weak_ptr<Observer>> instead?
If you need to enforce uniqueness, the unordered_set looks like a good choice to me. If you don't, then a vector is the more straightforward solution. Some would say that unordered_set is slower and requires more memory than a vector, but unless you register Observers at a very high frequency or have thousands of them, you won't notice the difference.
About the weak pointer: it gives you the flexibility of having your Observers deallocated while registered, so it should be fine. This behaviour may be unexpected if you come from a memory-managed language like Java. If you want to keep them alive while they are registered in your Subject, you may use a shared_ptr instead.
I would have to declare a hash function for the weak_ptr, but could this be accomplished by just using the hash function for the pointer inside, obtained with observer.lock().get()?
Be careful when creating hash functions. I don't recommend using the object's pointer for the hash function, especially if your Subjects can be copied or moved. Instead, you may create a unique identifier for every Subject upon creation using a counter, and remember to write the copy/move constructors and operators accordingly.
If you cannot write an identifying hash function, then you shouldn't use the unordered_set, since you lose the advantages it brings.
As a footnote, the beauty of object containers is that you can fit them to your needs, every solution is the correct solution if it does what you really want.
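One more caveat about hashing through lock().get(): once the object is gone, lock() returns an empty pointer, so the hash of an element already in the container would change, which a hash container cannot cope with. A different technique that sidesteps this (not what the question asked for, so take it as an alternative sketch) is an ordered std::set with std::owner_less, which compares control blocks and therefore stays stable after expiry. Subject is a placeholder struct here, and SubjectSet and prune_expired are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <set>

struct Subject { int state = 0; };  // placeholder type

using WeakSubject = std::weak_ptr<Subject>;
// Ordered by control block, so the ordering does not change when a
// Subject is destroyed and its weak_ptr expires.
using SubjectSet = std::set<WeakSubject, std::owner_less<WeakSubject>>;

// Removes entries whose Subject has been destroyed.
// Returns how many entries were dropped.
std::size_t prune_expired(SubjectSet& subjects) {
    std::size_t dropped = 0;
    for (auto it = subjects.begin(); it != subjects.end();) {
        if (it->expired()) {
            it = subjects.erase(it);
            ++dropped;
        } else {
            ++it;
        }
    }
    return dropped;
}
```

Duplicates are still rejected, as with unordered_set, but lookups are O(log n) instead of average O(1).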

There isn't really one 'correct' answer for the choice of container; it depends upon what you are aiming for from the point of view of performance. And whether or not performance is really all that important for this.
It also depends upon memory efficiency. If you only have a few of these unordered_set objects and need very fast lookup then it may be a good choice. Since it is a hash table it will use a fairly large amount of memory per unordered_set object.
If you have a lot of unordered_set objects with fairly few items then it may get a bit expensive in terms of memory budget. If you need fast insertion and removal then std::set may be better in this case. However if the collection only ever contains a handful of items then the lookup will probably actually be faster with a linear search of std::vector due to the processor cache (i.e. better locality of reference of the vector elements compared to std::set - which may result in more elements being on the same cache line). Memory usage of vector will be lower than either of std::set or std::unordered_set.
If you need fast lookup of specific objects for some reason, use std::vector, and typically have a moderate number of elements, then you could insert items into the vector in sorted order. You can then use std::lower_bound to do an O(log n) binary search lookup. However, this has a potentially high cost for inserting and removing elements.
I'd probably just go for std::vector in most cases: you generally have few observers, so you may as well keep memory usage tighter.
And using weak_ptr is certainly a good option if these objects are used elsewhere with shared_ptr.

In the Observer pattern the Observer subscribes to notifications about changes on a Subject. The Subject is responsible for updating all subscribed Observers whenever its observable state changes. For that to work the Observers do not need to keep track of the Subjects. Instead the Subject must keep track of all subscribed Observers.
A nice explanation of the Observer pattern can be found here: https://sourcemaking.com/design_patterns/observer
Code outline:
#include <set>
class Subject;
class Observer
{
public:
    virtual ~Observer() = default;
    // when notified about a change, the Observer
    // knows which Subject changed, because of the parameter s
    virtual void subjectChanged(Subject* s) = 0;
};
class Subject
{
private:
    int m_internalState = 0;
    std::set<Observer*> m_observers;
public:
    void subscribe(Observer* o)
    {
        m_observers.insert(o);
    }
    void unsubscribe(Observer* o)
    {
        m_observers.erase(o);
    }
    void setInternalState(int state)
    {
        m_internalState = state;
        // notify every subscribed Observer about the change
        for (Observer* o : m_observers)
            o->subjectChanged(this);
    }
};
In most cases it won't matter which exact collection type you choose for storing the Observers, because there will be very few Observers. However, choosing a set-type has the advantage, that each Observer will receive only one notification. With a vector it could happen that the same Observer receives multiple notifications about the same change, if (for some reason) it was subscribed multiple times.

I really think that using std::unordered_set is a bit overkill.
What is the observer pattern? When an event or a change of state occurs, iterate over an array of state-checkers and make them do something if the state is invalid or special in any way.
That being said, you want to iterate over an array of objects with an overridden virtual function and call it. Why would a set give us any benefit?
Also, I don't get the weak_ptr idea: the owner of the observers is the array which holds them, and the owner of that array is the Subject.
Now that all that has been said, I would go with std::vector<std::unique_ptr<Observer>>.
Edit:
using C++11, I'd even go with std::vector<std::function<void(Subject&)>> and avoid the boilerplate of inheriting+overriding.
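A small sketch of what the std::function version could look like. The member names (subscribe, setState, state) are made up for illustration.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class Subject {
public:
    using Callback = std::function<void(Subject&)>;

    // Any callable with a compatible signature can observe: a lambda,
    // a free function, or a bound member function.
    void subscribe(Callback cb) { callbacks_.push_back(std::move(cb)); }

    // Changes the state, then notifies every registered callback.
    void setState(int s) {
        state_ = s;
        for (auto& cb : callbacks_) cb(*this);
    }

    int state() const { return state_; }

private:
    int state_ = 0;
    std::vector<Callback> callbacks_;
};
```

One trade-off: std::function objects cannot be compared, so unsubscribing needs some extra handle (for example an index or token returned by subscribe), which this sketch leaves out.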

The simplest thing to do is to roll with boost::signals2, which already implemented this for you, for all signatures. The fundamental problem with your approach is that the implementation is tied to a particular signature with a particular subject and observer, which is virtually worthless compared to a generic solution that applies for all cases.
The Observer pattern is not a pattern, it's a class template.

Efficient usage of a c++11 shared_ptr in an asset manager

I'm working on a game (and my own custom engine). I have quite a few assets (textures, skeletal animations, etc.) that are used by multiple models and therefore get loaded multiple times.
At first, my ambitions were smaller, game simpler and I could live with a little duplication, so shared_ptr which took care of resource cleanup once the last instance was gone seemed like a good idea. As my game grew, more and more resources got loaded multiple times and all the OpenGL state changing slowed the performance down to a crawl. To solve this problem, I decided to write an asset manager class.
I'm using an unordered_map to store a path to file in std::string and c++11's shared_ptr pointing to the actual loaded asset. If the file is already loaded, I return the pointer, if not, I call the appropriate Loader class. Plain and simple.
Unfortunately, I can't say the same about removal. One copy of the pointer remains in the unordered_map. Currently, I iterate through the entire map and perform .unique() checks every frame. Those pointers that turn out to be unique get removed from the map, destroying the last copy and forcing the destructor to run and do the cleanup.
Iterating through hundreds or thousands of objects is not the most efficient thing to do. (it's not a premature optimization, I am in optimization stage now) Is it possible to somehow override the shared pointer functionality? For example, add an "onLastRemains" event somehow? Maybe I should iterate through part of the unordered_map every frame (by bucket)? Some other way?
I know, I could try to write my own reference counted asset implementation, but all current code I have assumes that assets are shared pointers. Besides, shared pointers are excellent at what they do, so why re-invent the wheel?

Instead of storing shared_ptrs in the asset manager's map (use a regular std::map, see below), store weak_ptrs. When you construct a new asset, create a shared_ptr with a custom deleter that calls a function in the asset manager telling it to remove this pointer from its map. The custom deleter should hold the map iterator for this asset and supply that iterator when telling the asset manager to delete its element from the map. The reason a weak_ptr is used in the map is that any subsequent request for this element can still be given a shared_ptr (because you can make one from a weak_ptr), but the asset manager doesn't actually have ownership of the asset, so the custom deleter strategy works.
Edit: It was noted below that the above technique only works with a std::map, not a std::unordered_map, because inserting into an unordered_map can trigger a rehash that invalidates stored iterators. My recommendation would still be to do this, and switch to a regular map.
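Roughly, that could look like the following. This is a single-threaded sketch under stated assumptions: Asset is a placeholder struct, AssetManager, get and size are hypothetical names, the loader is replaced by a plain new, and the manager is assumed to outlive every asset it hands out (the deleter captures this).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct Asset { std::string path; };  // placeholder for a texture, animation, ...

class AssetManager {  // assumed to outlive every asset it hands out
public:
    std::shared_ptr<Asset> get(const std::string& path) {
        auto found = cache_.find(path);
        if (found != cache_.end()) {
            if (auto alive = found->second.lock()) return alive;
        }
        // std::map iterators stay valid until their own element is erased,
        // so the deleter can safely hold on to `pos`.
        auto pos = cache_.emplace(path, std::weak_ptr<Asset>()).first;
        std::shared_ptr<Asset> asset(new Asset{path}, [this, pos](Asset* a) {
            cache_.erase(pos);  // runs when the last shared_ptr goes away
            delete a;
        });
        pos->second = asset;
        return asset;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::map<std::string, std::weak_ptr<Asset>> cache_;
};
```

With this, no per-frame scan is needed: the map entry disappears at the moment the last user drops its pointer.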

Use a std::unique_ptr in your unordered_map of assets.
Expose a std::shared_ptr with a custom deleter that looks up said pointer in the unordered_map, and either deletes it, or moves it to a second container "to be deleted later". Remember, std::shared_ptr does not have to actually own the data in question! The deleter can do any arbitrary action, and can even be stateful.
This lets you keep O(1) lookups for your assets, bunch cleanup (if you want to) instead of doing cleanup in the middle of other scenes.
You can even support temporary 0 reference count without deleting as follows:
Store a std::shared_ptr, created with std::make_shared, in the unordered_map of assets.
Expose custom std::shared_ptr. These hold a raw T* in the data, and the deleter holds a copy of the std::shared_ptr in the asset map. It "deletes" itself by storing the name (which it also holds) into a central "to be deleted" list.
Then go over said "to be deleted" list and check if they are indeed unique() -- if not, it means someone else in the meantime has spawned one of the "child" std::shared_ptr< T*, std::function<void(T*)>>s.
The only downside to this is that the type of exposed std::shared_ptr is no longer a simple std::shared_ptr.

Perhaps something like this?
shared_ptr<asset> get_asset(string path) {
    static map<string, weak_ptr<asset>> cache;
    auto ap = cache[path].lock();
    if (!ap) cache[path] = ap = load_asset(path);
    return ap;
}

Design pattern to allow the efficient deletion of an element from multiple containers in C++

Say, for example, an element is referenced from multiple maps, e.g. a map from name to element, a map from address to element and a map from age to element. Now one looks up the element, for example via its name, and wishes to delete it from all three maps.
Several solutions come to mind:
1) The most straightforward: look up the element in the name-to-element map, then search both other maps to find the element in those, then remove the element's entry from all three.
2) Store weak pointers in all three maps. Store a shared pointer somewhere, at best maybe even in the element itself. After finding the element in one map, delete the element. When trying to access the element from the other maps and realizing the weak pointers can't be converted to shared pointers, remove the entry.
3) Use intrusive maps. This has the advantage that one does not need to search the remaining maps to find the element in those. However, as the object is stored in several maps, the element itself can't be made intrusive - rather the element might need to have a member implementing the hooks.
4) Others?
Is there a very clean nice solution to this? I have been bumping into this problem a few times...
A few thoughts. Solution 1 is typically the one that ends up being implemented naturally as a project grows. If the element itself has the key information of the other maps, and other containers are maps, this is probably quite acceptable. However, if the keys are missing, or if the container is e.g. a list, it can become very slow. Solution 2 depends on the implementation of weak pointers, and might also end up being quite slow. Solution 3 seems best, but maybe somewhat complicated?
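To make solution 1 concrete, here is a sketch of the case where the element carries its own keys. Person and Registry are hypothetical names, and it assumes names and addresses are unique while ages are not (hence the multimap).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct Person {  // hypothetical element type carrying all three keys
    std::string name;
    std::string address;
    int age;
};

class Registry {  // hypothetical wrapper around the three maps
public:
    void add(std::shared_ptr<Person> p) {
        byName_[p->name] = p;
        byAddress_[p->address] = p;
        byAge_.emplace(p->age, p);
    }

    // Looks the element up by name, then uses the keys stored in the
    // element itself to erase it from the other two maps.
    bool removeByName(const std::string& name) {
        auto it = byName_.find(name);
        if (it == byName_.end()) return false;
        std::shared_ptr<Person> p = it->second;  // keep alive while erasing
        byAddress_.erase(p->address);
        auto range = byAge_.equal_range(p->age);
        for (auto a = range.first; a != range.second; ++a) {
            if (a->second == p) { byAge_.erase(a); break; }
        }
        byName_.erase(it);
        return true;
    }

    std::size_t total() const {
        return byName_.size() + byAddress_.size() + byAge_.size();
    }

private:
    std::map<std::string, std::shared_ptr<Person>> byName_;
    std::map<std::string, std::shared_ptr<Person>> byAddress_;
    std::multimap<int, std::shared_ptr<Person>> byAge_;
};
```

Each removal costs one logarithmic lookup per map plus a scan over the elements sharing the same age, which is the "quite acceptable" case described above.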

boost::multi_index is designed specifically for this case.

Sounds like you haven't decided what manages the lifetime of the object; that comes first. Once you know that, use the observer pattern. When the object is to be destroyed, the object managing its lifetime notifies all the objects that wrap the maps holding the pointers, then destroys the object.
The observers can either implement a common interface like this:
#include <list>
class Obj;
class IObserver;
class ObjectLifetimeMgr
{
public:
    void CauseObjDeletion()
    {
        // ... notify all observers ...
    }
private:
    std::list<IObserver*> observers;
};
class IObserver
{
public:
    virtual ~IObserver() = default;
    virtual void ObjectDestroyed( Obj* ) = 0;
};
class ConcreteObserver : public IObserver
{
public:
    void ObjectDestroyed( Obj* ) override
    {
        // ... delete Obj from map ...
    }
};
Or, to do a really lovely job, you could implement a C++ delegate; this frees the observers from a common base class and simply allows them to register a callback using a member method.

Never found anything to replace solution 1. I ended up with shared_pointers and delete flags in a delete function (e.g. DeleteFromMaps(bool map1, bool map2, bool map3)) in the object. The call from eg map2 then becomes e.g.
it->DeleteFromMaps(true,false,true);
erase(it);

Hash map with concurrent values (in C++)

I have a hash map, from a string key to a pointer to a thread-safe class. Each object of this class contains a mutex which I use to synchronize between its various methods.
In addition, every once in a while I want to discard values that were not used for some time. I want to be able to safely delete the value pointer and remove it from the hash map, while making sure that no one uses my value.
What's the best way to achieve this (in general, and in C++)?

If you want people to get these pointers from your hash map and then keep using them without having to notify your hash map class when they've finished, then storing shared pointers in the hash map is the easiest way to tie the actual object deletion to the moment the last user of the object finishes with it. The hash map is then free to erase its shared pointer to the object at any time.
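A minimal sketch of that idea, assuming a single mutex around the map is acceptable. Value stands in for the thread-safe class from the question, and Cache, put, get and discard are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>
#include <utility>

struct Value { int data = 0; };  // stand-in for the thread-safe class

class Cache {  // hypothetical wrapper around the hash map
public:
    void put(const std::string& key, std::shared_ptr<Value> v) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = std::move(v);
    }

    // Hands out a shared_ptr copy, so the caller keeps the object alive
    // even if the entry is discarded right afterwards.
    std::shared_ptr<Value> get(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return nullptr;
        return it->second;
    }

    // Drops the map's reference. The object itself is destroyed only
    // when the last outstanding shared_ptr goes away.
    bool discard(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.erase(key) > 0;
    }

private:
    mutable std::mutex mutex_;
    std::unordered_map<std::string, std::shared_ptr<Value>> map_;
};
```

The map's mutex only protects the map structure; access to the Value itself still relies on the mutex inside that class, as described in the question.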

Store and pass around shared pointers (std::shared_ptr) to your objects. Use the std::shared_ptr::use_count method to check how many pointers are in use at the moment.

You can also use concurrent_hash_map from TBB library. It protects concurrent accesses and lifetime of its elements using accessors (kind of a smart pointer with a lock)
