Hash map with concurrent values (in C++)


I have a hash map, from a string key to a pointer to a thread-safe class. Each object of this class contains a mutex which I use to synchronize between its various methods.
In addition, every once in a while I want to discard values that were not used for some time. I want to be able to safely delete the value pointer and remove it from the hash map, while making sure that no one uses my value.
What's the best way to achieve this (in general, and in C++)?

If you want callers to get these pointers from your hash map and keep using them without having to notify the hash map when they are finished, then storing shared pointers in the hash map is the easiest way to tie the actual object deletion to the moment the last user of the object finishes with it. The hash map is then free to erase its shared pointer to the object at any time.

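Store and pass around shared pointers (std::shared_ptr) to your objects. Use the std::shared_ptr::use_count method to check how many shared_ptr instances currently own the object. Note that in multithreaded code the returned value is only approximate, since another thread may copy or release a pointer right after the call.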
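The idea of letting the map hold one shared_ptr per key can be sketched roughly as follows. This is a minimal illustration, not a full solution: the names Value, Registry, get and evict are hypothetical, and the policy for deciding when a value "was not used for some time" is left out.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <string>
#include <unordered_map>

// Hypothetical value type; the real class would carry its own mutex and state.
struct Value {
    int payload = 0;
};

// Hypothetical registry: the map owns one shared_ptr per key, callers hold copies.
class Registry {
public:
    // Returns the existing value for `key`, creating it on first use.
    std::shared_ptr<Value> get(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        auto& slot = map_[key];
        if (!slot) slot = std::make_shared<Value>();
        return slot;
    }

    // Drops the map's reference. The Value itself is destroyed only when the
    // last outstanding shared_ptr copy goes away, so current users stay safe.
    void evict(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_.erase(key);
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return map_.size();
    }

private:
    mutable std::mutex mutex_;  // guards the map itself, not the Values
    std::unordered_map<std::string, std::shared_ptr<Value>> map_;
};
```

A caller that obtained a pointer from get() before evict() ran can keep using it; the object is released when that caller's copy is destroyed.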

You can also use concurrent_hash_map from the TBB library. It protects concurrent access to its elements, and their lifetime, using accessors (a kind of smart pointer that holds a lock).

Related

Relationships between C++ objects

I have a vector of journeys and a vector of locations. A journey is between two places.
struct Data {
    std::vector<Journey> m_journeys;
    std::vector<Locations> m_locations;
};
struct Journey {
    ?? m_startLocation;
    ?? m_endLocation;
};
How can I create the relationship between each journey and two locations?
I thought I could just store references/pointers to the start and end locations, however if more locations are added to the vector, then it will reallocate storage and move all the locations elsewhere in memory, and then the pointers to the locations will point to junk.
I could store the place names and then search the list in Data, but that would require keeping a reference to Data (breaking encapsulation/SRP), and then a not so efficient search.
I think if all the objects were created on the heap, then shared_ptr could be used, (so Data would contain std::vector<std::shared_ptr<Journey>>), then this would work? (it would require massive rewrite so avoiding this would be preferable)
Is there some C++/STL feature that is like a pointer but abstracts away/is independent of memory location (or order in the vector)?

No, there isn't any "C++/STL feature that is like a pointer but abstracts away/is independent of memory location".
That answers that.
This is simply not the right set of containers for such a relationship between classes. You have to pick the appropriate container for your objects first, instead of selecting some arbitrary container first, and then trying to figure out how to make it work with your relationship.
Using a vector of std::shared_ptrs would be one option, just need to watch out for circular references. Another option would be to use std::list instead of std::vector, since std::list does not reallocate when it grows.
If each Locations instance has a unique identifier of some kind, another option is to store the locations in a std::map keyed by that identifier, have each journey refer to a location by its identifier, and look the location up in the map when needed. Although a std::map also doesn't reallocate upon growth, the layer of indirection offers some value as well.
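The identifier-based option could look something like the sketch below. The LocationId alias, the name member and the find helper are assumptions made for illustration; the question does not say what a location contains or how it is identified.

```cpp
#include <map>
#include <string>
#include <vector>

struct Location {
    std::string name;  // assumed content, for illustration only
};

using LocationId = int;  // hypothetical identifier type

struct Journey {
    LocationId m_startLocation;  // refers to a Location by id, not by address
    LocationId m_endLocation;
};

struct Data {
    std::vector<Journey> m_journeys;
    std::map<LocationId, Location> m_locations;

    // Resolves an identifier to a Location, or nullptr if it is unknown.
    const Location* find(LocationId id) const {
        auto it = m_locations.find(id);
        return it == m_locations.end() ? nullptr : &it->second;
    }
};
```

Because a Journey stores identifiers rather than addresses, adding more locations later cannot invalidate it.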

I'd say make a vector<shared_ptr<Location>> for your index of locations, and Journey would contain two weak_ptr<Location>.
#include <memory>
#include <vector>
struct Location;  // defined elsewhere
// Journey is declared first so that Data can hold a vector of it.
struct Journey {
    std::weak_ptr<Location> m_startLocation;
    std::weak_ptr<Location> m_endLocation;
};
struct Data {
    std::vector<Journey> m_journeys;
    std::vector<std::shared_ptr<Location>> m_locations;
};
A std::weak_ptr can outlive the object it refers to and can tell you when that has happened, which is exactly what you want. :)
The concern is that one could access a Journey containing a deleted Location. A weak pointer provides an expired() method that tells you whether the object owned by the parent shared pointer (the one in your m_locations vector) still exists.
Accessing the data through a weak pointer is safe, but it requires calling the lock() method, which returns a shared_ptr that is empty if the object is gone.
Here is a great example of how one usually uses a weak pointer:
http://en.cppreference.com/w/cpp/memory/weak_ptr/lock
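As a small sketch of that pattern applied to the Journey struct above: the name member of Location and the startName helper are hypothetical, added only to have something to read.

```cpp
#include <memory>
#include <string>

struct Location {
    std::string name;  // assumed content, for illustration only
};

struct Journey {
    std::weak_ptr<Location> m_startLocation;
    std::weak_ptr<Location> m_endLocation;
};

// Returns the start location's name, or a placeholder if it was deleted.
std::string startName(const Journey& j) {
    // lock() yields a shared_ptr that keeps the Location alive while in use,
    // or an empty shared_ptr if the Location no longer exists.
    if (auto loc = j.m_startLocation.lock()) {
        return loc->name;
    }
    return "<deleted>";
}
```

Checking the result of lock() is generally preferable to calling expired() first and dereferencing afterwards, because the shared_ptr returned by lock() keeps the object alive for as long as it is held.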

Using an unordered_set with shared_ptr keys

I am trying to use the following data collection in my program:
boost::unordered_set<boost::shared_ptr<Entity> > _entities;
I am using unordered_set because I want fast insertion and removal (by key, not iterator) of Entities.
My doubt is, if I implement the following two functions:
void addEntity(boost::shared_ptr<Entity> entity) {
    _entities.insert(entity);
}
void removeEntity(boost::shared_ptr<Entity> entity) {
    // unordered_set has no remove(); erase() removes by key
    _entities.erase(entity);
}
When I try to remove the entity, will the unordered_set find it? Because the shared_ptr that is stored inside the unordered_set is a copy of the shared_ptr that I am trying to use to remove the entity from the unordered_set if I call removeEntity().
What do I need to do for the unordered_set to find the Entity? Do I need to create a comparison function that checks the values of the shared_ptr's? But then won't the unordered_set slow down because its hash function uses the shared_ptr as hash? Will I need to create a hash function that uses the Entity as hash also?

Yes, you can use boost::shared_ptr in boost::unordered_set (the same applies to the std versions of these classes).
boost::unordered_set uses the boost::hash function template to compute hash values for its elements. This function is specialised for boost::shared_ptr to hash the underlying pointer. Equality between two shared_ptrs also compares the underlying pointers, so a copy of a stored shared_ptr will find the stored element.
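A quick way to convince yourself is a sketch like the following, written with the std versions of the classes. The World wrapper class and the id member are hypothetical; only addEntity and removeEntity come from the question.

```cpp
#include <cstddef>
#include <memory>
#include <unordered_set>
#include <utility>

struct Entity {
    int id = 0;  // assumed content, for illustration only
};

class World {
public:
    void addEntity(std::shared_ptr<Entity> entity) {
        _entities.insert(std::move(entity));
    }

    // Returns true if the entity was present. The argument may be any copy of
    // the stored shared_ptr: it hashes and compares equal because both hold
    // the same raw pointer.
    bool removeEntity(const std::shared_ptr<Entity>& entity) {
        return _entities.erase(entity) == 1;
    }

    std::size_t count() const { return _entities.size(); }

private:
    std::unordered_set<std::shared_ptr<Entity>> _entities;
};
```

Note that lookup is by pointer identity: a different Entity object with identical contents is not considered equal, so no custom hash or comparison over Entity values is needed for this use.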

Let me try to explain this, if I got it right:
boost::shared_ptr is implemented using a reference counting mechanism. That means whenever you pass a copy of it to, say, some other function, you are just increasing the reference count, whereas when a copy is destroyed the reference count decreases. Only when the reference count reaches 0 is the object deleted from memory.
Always be cautious when using it. It can save you from memory leaks only as far as your design is good enough to accommodate it.
For example, I faced one issue.
I had a class with a map containing shared_ptrs. Later this class (without my knowledge) was also made responsible for passing these shared_ptrs to some other classes, which in turn used some containers to store them. As a result this code blew up, luckily in the tester's face, in the form of a memory leak.
I hope you can figure out why.

Efficient usage of a c++11 shared_ptr in an asset manager

I'm working on a game (and my own custom engine). I have quite a few assets (textures, skeletal animations, etc.) that are used by multiple models and therefore get loaded multiple times.
At first, my ambitions were smaller, game simpler and I could live with a little duplication, so shared_ptr which took care of resource cleanup once the last instance was gone seemed like a good idea. As my game grew, more and more resources got loaded multiple times and all the OpenGL state changing slowed the performance down to a crawl. To solve this problem, I decided to write an asset manager class.
I'm using an unordered_map to store a path to file in std::string and c++11's shared_ptr pointing to the actual loaded asset. If the file is already loaded, I return the pointer, if not, I call the appropriate Loader class. Plain and simple.
Unfortunately, I can't say the same about removal. One copy of the pointer remains in the unordered_map. Currently, I iterate through the entire map and perform .unique() checks every frame. Those pointers that turn out to be unique get removed from the map, destroying the last copy and forcing the destructor to run and do the cleanup.
Iterating through hundreds or thousands of objects is not the most efficient thing to do. (it's not a premature optimization, I am in optimization stage now) Is it possible to somehow override the shared pointer functionality? For example, add an "onLastRemains" event somehow? Maybe I should iterate through part of the unordered_map every frame (by bucket)? Some other way?
I know, I could try to write my own reference counted asset implementation, but all current code I have assumes that assets are shared pointers. Besides, shared pointers are excellent at what they do, so why re-invent the wheel?

Instead of storing shared_ptrs in the asset manager's map (see below, use a regular map), store weak_ptrs. When you construct a new asset, create a shared_ptr with a custom deleter which calls a function in the asset manager that tells it to remove this pointer from its map. The custom deleter should contain the iterator into the map for this asset and supply that iterator when telling the asset manager to delete its element from the map. The reason a weak_ptr is used in the map is that any subsequent requests for this element can still be given a shared_ptr (because you can make one from a weak_ptr), but the asset manager doesn't actually have ownership of the asset, so the custom deleter strategy will work.
Edit: It was noted below that the above technique only works if you use a std::map, not a std::unordered_map. My recommendation would be to still do this, and switch to a regular map.
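A rough, single-threaded sketch of this weak_ptr plus custom deleter approach might look as follows. The Asset and AssetManager names are hypothetical, constructing an Asset directly stands in for the real Loader call, and the sketch assumes the manager outlives every asset it hands out.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <string>

struct Asset {
    std::string path;  // stand-in for real asset data
};

class AssetManager {
public:
    std::shared_ptr<Asset> get(const std::string& path) {
        auto it = cache_.find(path);
        // An entry exists only while some shared_ptr to the asset is alive,
        // because the deleter below erases it when the last one goes away.
        if (it != cache_.end()) return it->second.lock();

        // Reserve the map slot first so the deleter can capture its iterator.
        // std::map iterators stay valid when other elements are added/erased.
        it = cache_.emplace(path, std::weak_ptr<Asset>()).first;
        std::shared_ptr<Asset> asset(new Asset{path}, [this, it](Asset* a) {
            cache_.erase(it);  // remove the non-owning entry from the map
            delete a;          // then release the asset itself
        });
        it->second = asset;
        return asset;
    }

    std::size_t size() const { return cache_.size(); }

private:
    std::map<std::string, std::weak_ptr<Asset>> cache_;
};
```

With this arrangement no per-frame scan is needed: the map entry disappears at the moment the last user releases the asset.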

Use a std::unique_ptr in your unordered_map of assets.
Expose a std::shared_ptr with a custom deleter that looks up said pointer in the unordered_map, and either deletes it, or moves it to a second container "to be deleted later". Remember, std::shared_ptr does not have to actually own the data in question! The deleter can do any arbitrary action, and can even be stateful.
This lets you keep O(1) lookups for your assets and batch the cleanup (if you want to) instead of doing cleanup in the middle of other scenes.
You can even support temporary 0 reference count without deleting as follows:
Create the asset with std::make_shared and store the resulting std::shared_ptr in the unordered_map of assets.
Expose custom std::shared_ptrs. These hold a raw T* as their data, and the deleter holds a copy of the std::shared_ptr from the asset map. The deleter "deletes" the asset by storing its name (which it also holds) into a central "to be deleted" list.
Then go over said "to be deleted" list and check whether the entries are indeed unique(). If not, it means someone else has in the meantime spawned another of the "child" std::shared_ptrs (a std::shared_ptr<T> constructed with a deleter such as a std::function<void(T*)>).
The only downside to this is that the exposed std::shared_ptr is no longer a simple copy of the std::shared_ptr stored in the map: it has its own control block and a custom deleter.
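The first variant (a std::unique_ptr owner in the map, plus exposed std::shared_ptr handles whose deleter only schedules cleanup) could be sketched like this. It is a simplified, single-threaded illustration: Asset, AssetManager, Entry and collect are hypothetical names, the extra weak_ptr used to hand out the same handle on repeated requests is an addition not described in the answer, and the manager is assumed to outlive all handles.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

struct Asset {
    std::string path;  // stand-in for real asset data
};

class AssetManager {
public:
    std::shared_ptr<Asset> get(const std::string& path) {
        Entry& e = assets_[path];
        if (!e.data) e.data.reset(new Asset{path});  // stand-in for loading
        if (auto alive = e.handle.lock()) return alive;
        // Non-owning handle: its deleter only records the name for later cleanup.
        std::shared_ptr<Asset> h(e.data.get(), [this, path](Asset*) {
            pending_.push_back(path);
        });
        e.handle = h;
        return h;
    }

    // Erases every pending asset that still has no live handle.
    void collect() {
        for (const auto& path : pending_) {
            auto it = assets_.find(path);
            if (it != assets_.end() && it->second.handle.expired()) {
                assets_.erase(it);  // the unique_ptr destroys the asset here
            }
        }
        pending_.clear();
    }

    std::size_t size() const { return assets_.size(); }

private:
    struct Entry {
        std::unique_ptr<Asset> data;   // the real owner
        std::weak_ptr<Asset> handle;   // tracks the currently exposed handle
    };
    std::unordered_map<std::string, Entry> assets_;
    std::vector<std::string> pending_;  // the "to be deleted later" list
};
```

Calling collect() at a convenient moment (for example between scenes) performs the batched cleanup; an asset that was requested again before collect() ran is simply kept.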

Perhaps something like this?
shared_ptr<asset> get_asset(string path) {
    // entries hold weak_ptrs, so the cache never keeps an asset alive
    static map<string, weak_ptr<asset>> cache;
    auto ap = cache[path].lock();
    if(!ap) cache[path] = ap = load_asset(path);
    return ap;
}

Using shared pointers in map

I'm trying to decide what the best choice is for my homework.
I have a map (I coded it myself) that is supposed to store integer ids as keys and shared pointers to a class named Fan as values:
Map<Id, shared_ptr<Fan>> Online_list;
Which is better to use as the value type: shared_ptr<Fan>& or a non-reference shared_ptr<Fan>?
My homework is about creating a server like Facebook, with fans that can be online and offline. I have two maps, one called Online_list and the other Offline_list, so when a fan disconnects I need to remove him from the online list and add him to the offline list.

A shared_ptr is itself a sort of reference: a pointer with memory management. You can store the plain shared_ptr by value, since every copy refers to the same data anyway (the copy constructor increments the reference count, and so on).

Usually it's best to not store a pointer at all, but just store the Fan object by-value. Does it really make sense for two things to own this Fan object?
However, assuming that your design is correct, then you should simply store the shared_ptr by-value.
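For the online/offline case, storing the shared_ptr by value might look like the sketch below. std::map stands in for the hand-written Map class, and the Server struct, the disconnect function and the id member of Fan are hypothetical.

```cpp
#include <map>
#include <memory>
#include <utility>

struct Fan {
    int id;  // assumed content, for illustration only
};

using Id = int;

struct Server {
    std::map<Id, std::shared_ptr<Fan>> Online_list;
    std::map<Id, std::shared_ptr<Fan>> Offline_list;

    // Moves the fan from the online map to the offline map.
    // Returns false if the fan was not online.
    bool disconnect(Id id) {
        auto it = Online_list.find(id);
        if (it == Online_list.end()) return false;
        Offline_list[id] = std::move(it->second);  // transfer, no extra copy
        Online_list.erase(it);
        return true;
    }
};
```

The Fan object itself never moves or gets copied; only the small shared_ptr changes maps.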

Lock map and vector from accessing from two threads

I have two threads, each of which has a function that manipulates the same std::map and std::vector variables.
What is the best way to protect these variables?
Thanks

It depends on the kind of manipulations. Do you only overwrite the stored values, or do you also insert / remove elements? In the former case you could lock only a specific element of the container (e.g. by embedding a std::mutex inside each element), whereas in the latter case you need to lock the entire container during each manipulation.

There is no universal best way. You need to funnel all read/write calls to your synchronized structure through functions that lock and unlock a mutex accordingly. You might have multiple functions, but they should all operate on the same common mutex.
It's better to have a storage class that keeps the map and vector as private member variables, and to write forwarding functions in that class that lock the mutex, forward the read/write call to the actual map or vector, and unlock the mutex again. Then you have a limited number of doors to the actual structures, and it will be easier to manage.
You may use boost::mutex as a member variable of that class (or std::mutex in C++11 and later).
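Such a storage class could be sketched as follows, using std::mutex. The Storage class, its member functions and the choice of element types are all hypothetical, since the question does not say what the map and vector contain.

```cpp
#include <cstddef>
#include <map>
#include <mutex>
#include <string>
#include <vector>

class Storage {
public:
    // Writes to both containers under one lock, so other threads never see
    // the map updated without the matching vector entry.
    void put(const std::string& key, int value) {
        std::lock_guard<std::mutex> lock(mutex_);
        map_[key] = value;
        log_.push_back(key);
    }

    // Copies the value out while holding the lock; returns false if missing.
    bool get(const std::string& key, int& out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = map_.find(key);
        if (it == map_.end()) return false;
        out = it->second;
        return true;
    }

    std::size_t logSize() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return log_.size();
    }

private:
    mutable std::mutex mutex_;        // the single common mutex
    std::map<std::string, int> map_;  // never exposed directly
    std::vector<std::string> log_;
};
```

Returning copies rather than references or iterators matters here: a reference handed out of the class would be used after the lock has been released.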
