I often see code like:
if(myQMap.contains("my key")){
myValue = myQMap["my key"];
}
which theoretically performs two lookups in the QMap.
My first reaction is that it should be replaced by the following, which performs only one lookup and should be roughly twice as fast:
auto it = myQMap.find("my key");
if(it != myQMap.end()){
myValue = it.value();
}
I am wondering if QMap does this optimization automatically for me?
In other words, I am wondering if QMap saves the position of the last element found with QMap::contains() and checks it first before performing the next lookup?
I would expect that QMap provides both functions for a better interface to the class. It's more natural to ask if the map 'contains' a value with a specified key than it is to call the 'find' function.
As the code shows, both find and contains call the following internal function:
Node *n = d->findNode(akey);
So if you're going to use the returned iterator, then using find and checking the return value will be more efficient, but if you just want to know if the value exists in the map, calling contains is better for readability.
If you look at the source code, you'll see that QMap is implemented as a binary tree structure of nodes. Calling findNode iterates through the nodes and does not cache the result.
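To make the cost of the double lookup visible, here is a minimal sketch that counts key comparisons. It uses std::map as a stand-in for QMap (an assumption, chosen so the example builds without Qt), and CountingLess, makeMap and the two comparisonsFor... functions are hypothetical names introduced only for this illustration.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical comparator that counts how often the tree compares two keys.
struct CountingLess {
    static std::size_t calls;
    bool operator()(const std::string &a, const std::string &b) const {
        ++calls;
        return a < b;
    }
};
std::size_t CountingLess::calls = 0;

using CountedMap = std::map<std::string, int, CountingLess>;

// Builds a map with keys "key0" .. "key(n-1)" mapped to 0 .. n-1.
CountedMap makeMap(int n) {
    CountedMap m;
    for (int i = 0; i < n; ++i) m.emplace("key" + std::to_string(i), i);
    return m;
}

// Pattern 1: membership test followed by operator[] (two tree descents).
std::size_t comparisonsForContainsThenIndex(CountedMap &m, const std::string &key, int &out) {
    CountingLess::calls = 0;
    if (m.count(key)) out = m[key];
    return CountingLess::calls;
}

// Pattern 2: a single find() whose iterator is reused (one tree descent).
std::size_t comparisonsForFind(CountedMap &m, const std::string &key, int &out) {
    CountingLess::calls = 0;
    auto it = m.find(key);
    if (it != m.end()) out = it->second;
    return CountingLess::calls;
}
```

For a key that is present, the first pattern should report noticeably more comparisons than the second, since nothing is cached between the two calls. The exact numbers depend on the standard library implementation.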
The QMap source code reveals that there is no special code in the QMap::contains() method.
In some cases you can use QMap::value() or QMap::values() to get the value for a key and then check whether it is valid. These methods (and the const operator[]) copy the value, although this is probably fine for most Qt types since their underlying data is copy-on-write (notably QMap itself).
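QMap::value() also accepts a second argument that is returned when the key is missing. A rough standard-library analogue of that idea is sketched below; valueOr is a hypothetical helper name, and std::map is used instead of QMap only so that the snippet compiles without Qt.

```cpp
#include <map>
#include <string>

// Hypothetical helper: returns a copy of the mapped value, or `fallback`
// when the key is absent. Performs one lookup and never inserts.
template <typename Map>
typename Map::mapped_type valueOr(const Map &m,
                                  const typename Map::key_type &key,
                                  const typename Map::mapped_type &fallback) {
    auto it = m.find(key);
    return it != m.end() ? it->second : fallback;
}
```

This only works as an existence check when the fallback is a value that can never be stored legitimately, for example -1 in a map of non-negative counts.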
When I use a std::map, it seems that accessing elements takes a different amount of time based on the method used.
First Method: Direct Access
cnt += umap[t];
Second Method:
if (umap.find(t) != umap.end()){
cnt += umap[t];
}
The second method seems to be quite a bit faster than the first method, and I don't understand why. Can someone explain the differences between these two methods?
Each of the code snippets is doing a different thing.
The first snippet takes more time because, if the key t does not exist, it inserts t into umap (value-initializing the mapped value to zero) before adding that value to cnt.
In the second snippet no key is inserted. Because of the if condition, umap[t] is evaluated and added to cnt only when umap already has the key t.
The second snippet can be optimized further by storing the iterator returned by find. As written, operator[] performs a second lookup for the same key; this does not change the asymptotic complexity, but it roughly doubles the lookup work for keys that are present.
Hence, an attempt like this should be faster (picked up from user253751's comment; the if-with-initializer syntax requires C++17):
if(auto it = umap.find(t); it != umap.end())
cnt += it->second;
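The difference in side effects can be checked directly. The sketch below assumes umap is a std::unordered_map<int, long>, which the question does not state; sumWithIndex and sumWithFind are hypothetical names.

```cpp
#include <unordered_map>
#include <vector>

// Sums the values of the queried keys using operator[].
// Side effect: every missing key is inserted with a value of 0.
long sumWithIndex(std::unordered_map<int, long> &umap, const std::vector<int> &queries) {
    long cnt = 0;
    for (int t : queries) cnt += umap[t];
    return cnt;
}

// Same sum using one find() per query; the map is never modified.
long sumWithFind(const std::unordered_map<int, long> &umap, const std::vector<int> &queries) {
    long cnt = 0;
    for (int t : queries) {
        auto it = umap.find(t);
        if (it != umap.end()) cnt += it->second;
    }
    return cnt;
}
```

Both functions return the same total, but after sumWithIndex the map has grown by one entry per missing key. That growth (allocation and possible rehashing) is the likely source of the timing difference.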
I'm trying to figure out the best way to do a cache for resources. I am mainly looking for native C/C++/C++11 solutions (i.e. I don't have boost and the likes as an option).
What I am doing when retrieving from the cache is something like this:
Object *ResourceManager::object_named(const char *name) {
if (_object_cache.find(name) == _object_cache.end()) {
_object_cache[name] = new Object();
}
return _object_cache[name];
}
Where _object_cache is defined something like: std::unordered_map <std::string, Object *> _object_cache;
What I am wondering is about the time complexity of doing this, does find trigger a linear-time search or is it done as some kind of a look-up operation?
I mean if I do _object_cache["something"]; on the given example it will either return the object or if it doesn't exist it will call the default constructor inserting an object which is not what I want. I find this a bit counter-intuitive, I would have expected it to report in some way (returning nullptr for example) that a value for the key couldn't be retrieved, not second-guess what I wanted.
But again, if I do a find on the key, does it trigger a big search which in fact will run in linear time (since the key will not be found it will look at every key)?
Is this a good way to do it, or does anyone have suggestions? Perhaps it's possible to use a lookup or something similar to know whether the key is available. I may access the cache often, and if some time is spent searching I would like to eliminate it or at least make it as fast as possible.
Thankful for any input on this.
The default construction triggered by _object_cache["something"] is what you want: operator[] value-initializes the mapped value, and for a pointer type (e.g. Object *) that yields nullptr (8.5p6b1, footnote 103).
So:
auto &ptr = _object_cache[name];
if (!ptr) ptr = new Object;
return ptr;
You use a reference into the unordered map (auto &ptr) as your local variable so that you assign into the map and set your return value in the same operation. In C++03 or if you want to be explicit, write Object *&ptr (a reference to a pointer).
Note that you should probably be using unique_ptr rather than a raw pointer to ensure that your cache manages ownership.
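A minimal sketch of the unique_ptr variant, staying within C++11. The Object struct here is a placeholder (the real type is not shown in the question), and size() is a hypothetical accessor added only for illustration.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Placeholder resource type; the real Object is not shown in the question.
struct Object {
    int id = 0;
};

class ResourceManager {
public:
    // Returns a non-owning pointer; the cache keeps ownership.
    Object *object_named(const char *name) {
        // operator[] inserts a null unique_ptr if the key is new.
        std::unique_ptr<Object> &slot = _object_cache[name];
        if (!slot) slot.reset(new Object());  // std::make_unique would need C++14
        return slot.get();
    }

    // Hypothetical accessor, only for inspecting the cache.
    std::size_t size() const { return _object_cache.size(); }

private:
    std::unordered_map<std::string, std::unique_ptr<Object>> _object_cache;
};
```

Callers still receive a raw pointer, but every Object is destroyed together with the ResourceManager, so nothing leaks when the cache goes away.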
By the way, find has the same performance as operator[]; average constant, worst-case linear (only if every key in the unordered map has the same hash).
Here's how I'd write this:
auto it = _object_cache.find(name);
return it != _object_cache.end()
? it->second
: _object_cache.emplace(name, new Object).first->second;
The complexity of find on a std::unordered_map is O(1) (constant) on average, especially with std::string keys, which hash well and so have a very low collision rate. Even though the method is named find, it does not do a linear scan as you feared.
If you want to do some kind of caching, this container is definitely a good start.
Note that a cache typically is not just a fast O(1) access but also a bounded data structure. The std::unordered_map will dynamically increase its size when more and more elements are added. When resources are limited (e.g. reading huge files from disk into memory), you want a bounded and fast data structure to improve the responsiveness of your system.
In contrast, a cache will use an eviction strategy whenever size() reaches capacity(), by replacing the least valuable element.
You can implement a cache on top of a std::unordered_map. The eviction strategy can then be implemented by redefining the insert() member. If you want to go for an N-way (for small and fixed N) associative cache (i.e. one item can replace at most N other items), you could use the bucket() interface to replace one of the bucket's entries.
For a fully associative cache (i.e. any item can replace any other item), you could use a Least Recently Used eviction strategy by adding a std::list as a secondary data structure:
using key_tracker_type = std::list<K>;
using key_to_value_type = std::unordered_map<
K,std::pair<V,typename key_tracker_type::iterator>
>;
By wrapping these two structures inside your cache class, you can define the insert() to trigger a replace when your capacity is full. When that happens, you pop_front() the Least Recently Used item and push_back() the current item into the list.
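The description above could be sketched roughly as follows. This is not the implementation from the blog post mentioned below; LruCache, get, touch and the member names are assumptions made for the example, and the front of the list is treated as the least recently used end.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <unordered_map>
#include <utility>

template <typename K, typename V>
class LruCache {
public:
    explicit LruCache(std::size_t capacity) : capacity_(capacity) {}

    // Inserts or updates a value; evicts the least recently used entry when full.
    void insert(const K &key, const V &value) {
        auto it = map_.find(key);
        if (it != map_.end()) {  // existing key: update and mark as recent
            it->second.first = value;
            touch(it);
            return;
        }
        if (capacity_ == 0) return;  // a zero-capacity cache stores nothing
        if (map_.size() >= capacity_) {
            map_.erase(tracker_.front());  // front = least recently used
            tracker_.pop_front();
        }
        tracker_.push_back(key);  // back = most recently used
        map_.emplace(key, std::make_pair(value, std::prev(tracker_.end())));
    }

    // Returns a pointer to the cached value, or nullptr on a miss.
    V *get(const K &key) {
        auto it = map_.find(key);
        if (it == map_.end()) return nullptr;
        touch(it);
        return &it->second.first;
    }

    std::size_t size() const { return map_.size(); }

private:
    using key_tracker_type = std::list<K>;
    using key_to_value_type =
        std::unordered_map<K, std::pair<V, typename key_tracker_type::iterator>>;

    // Marks an entry as most recently used by moving its key to the back.
    // splice() relinks the node, so the stored list iterator stays valid.
    void touch(typename key_to_value_type::iterator it) {
        tracker_.splice(tracker_.end(), tracker_, it->second.second);
    }

    std::size_t capacity_;
    key_tracker_type tracker_;
    key_to_value_type map_;
};
```

Both insert and get are average constant time: the hash map finds the entry, and the stored list iterator lets the recency update happen without searching the list.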
On Tim Day's blog there is an extensive example with full source code that implements the above cache data structure. Its implementation can also be done efficiently using Boost.Bimap or Boost.MultiIndex.
The insert/emplace interfaces to map/unordered_map are enough to do what you want: find the position, and insert if necessary. Since the mapped values here are pointers, ekatmur's response is ideal. If your values are fully-fledged objects in the map rather than pointers, you could use something like this:
Object& ResourceManager::object_named(const char *name, const Object& initialValue) {
return _object_cache.emplace(name, initialValue).first->second;
}
The values name and initialValue are the arguments for the key-value pair that is inserted if there is no key equal to name. The emplace call returns a pair: second indicates whether anything was inserted (that is, whether the key in name was new), which we don't care about here, and first is the iterator pointing to the (perhaps newly created) entry whose key is equivalent to name. So if the key was already there, dereferencing first gives the original Object for the key, which has not been overwritten with initialValue; otherwise, the key was newly inserted using the value of name, the entry's value was copied from initialValue, and first points to that entry.
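The no-overwrite behaviour can be verified with a small free-function version of the same idea. The Object struct and the explicit cache parameter are assumptions made so that the sketch is self-contained.

```cpp
#include <string>
#include <unordered_map>

// Placeholder value type standing in for the question's Object.
struct Object {
    int value;
};

// Returns the cached Object for `name`, inserting a copy of `initialValue`
// only when the key is not present yet.
Object &object_named(std::unordered_map<std::string, Object> &cache,
                     const char *name, const Object &initialValue) {
    auto result = cache.emplace(name, initialValue);  // pair<iterator, bool>
    // result.second is true only if a new entry was created.
    return result.first->second;
}
```

Calling it twice with the same name and different initial values returns a reference to the same stored Object both times, and the second initial value is ignored.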
ekatmur's response is equivalent to this:
Object *ResourceManager::object_named(const char *name) {  // the map stores Object*, so return a pointer
bool res;
auto iter = _object_cache.end();
std::tie(iter, res) = _object_cache.emplace(name, nullptr);
if (res) {
iter->second = new Object(); // we inserted a null pointer - now replace it
}
return iter->second;
}
but profits from the fact that the default-constructed pointer value created by operator[] is null to decide whether a new Object needs to be allocated. It's more succinct and easier to read.
I have a certain struct:
struct MyClass::MyStruct
{
Statistics stats;
Object *objPtr;
bool isActive;
QDateTime expiration;
};
I need to store pointers to these structs in a private container. Client code will pass me objects, for which I need to return pointers to the matching MyStruct instances. For example:
QList<MyStruct*> MyClass::structPtr( Statistics stats )
{
// Return all MyStruct* for which myStruct->stats == stats (== is overloaded)
}
or
QList<MyStruct*> MyClass::structPtr( Object *objPtr )
{
// Return all MyStruct* for which myStruct->objPtr == objPtr
}
Right now I'm storing these in a QLinkedList<MyStruct*> so that I can have fast insertions, and lookups roughly equivalent to QList<MyStruct*>. Ideally I would like to be able to perform lookups faster, without losing my insertion speed. This leads me to look at QHash, but I am not sure how I would use a QHash when I'm only storing values without keys, or even if that is a good idea.
What is the proper Qt/C++ way to address a problem such as this? Ideally, lookup times should be O(log n) or better. Would a QHash be a good idea here? If so, what should I use for a key and/or value?
If you want to use QHash for fast lookups, the hash's key type must be the same as the search token type. For example, if you want to find elements by Statistics value, your hash should be QHash<Statistics, MyStruct*>.
If you can live with only looking up your data in one specific way, a QHash should be fine for you. Though, in your case where you're pulling lists out, you may want to investigate QMultiHash and its .values() member. However, it's important to note, from the documentation:
The key type of a QHash must provide operator==() and a global hash function called qHash()
If you need to be able to pull these lists based on different information at different times you might just be better off iterating over the lists. All of Qt's containers provide std-style iterators, including its hash maps.
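As an illustration of keying the container by the search token, here is a sketch using std::unordered_multimap as a standard-library stand-in for QMultiHash (an assumption, so that it builds without Qt). The fields of Statistics, the StatisticsHash functor and the StatsIndex alias are hypothetical; with Qt, operator== plus a global qHash() overload would play the same roles.

```cpp
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the question's types.
struct Statistics {
    int hits;
    int misses;
    bool operator==(const Statistics &o) const {
        return hits == o.hits && misses == o.misses;
    }
};

struct MyStruct {
    Statistics stats;
    bool isActive;
};

// Hash functor playing the role that qHash() plays for QHash keys.
struct StatisticsHash {
    std::size_t operator()(const Statistics &s) const {
        return std::hash<int>()(s.hits) ^ (std::hash<int>()(s.misses) << 1);
    }
};

using StatsIndex = std::unordered_multimap<Statistics, MyStruct *, StatisticsHash>;

// Returns every MyStruct* stored under `stats` (average constant-time lookup).
std::vector<MyStruct *> structPtr(const StatsIndex &index, const Statistics &stats) {
    std::vector<MyStruct *> result;
    auto range = index.equal_range(stats);
    for (auto it = range.first; it != range.second; ++it) result.push_back(it->second);
    return result;
}
```

The index stores non-owning pointers, so the MyStruct objects must outlive it, and an entry has to be re-inserted if its stats field changes after insertion.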
I'm debugging my code and at one point I have a multimap which contains pairs of a long and a Note object created like this:
void Track::addNote(Note &note) {
long key = note.measureNumber * 1000000 + note.startTime;
this->noteList.insert(make_pair(key, note));
}
I wanted to look if these values are actually inserted in the multi map so I placed a breakpoint and this is what the multimap looks like (in Xcode):
It seems like I can infinitely open the elements (my actual multimap is the first element called noteList) Any ideas if this is normal and why I can't read the actual pair values (the long and the Note)?
libstdc++ implements its maps and sets using a generic red-black tree. The nodes of the tree use a base class, _Rb_tree_node_base, which contains pointers of that same base type for the parent, left, and right nodes.
To access the data, it performs a static cast to the node type that's specific to the template arguments you provided. You won't be able to see the data in Xcode unless you can force the cast.
It does something similar with linked lists, with a linked list node base.
Edit: It does this to reduce the amount of duplicate code that is generated by the template. Rather than have a RbTree<Type1>, RbTree<Type2>, and so on, libstdc++ has a single set of operations that work on the base class, and those operations are the same regardless of the underlying type of the map. It only casts when it needs to examine the data, and the actual rotation/rebalance code is the same for all of the trees.
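The base-node technique can be illustrated with a much simpler structure. The sketch below uses a singly linked list rather than a red-black tree, and NodeBase, Node, length and sum are hypothetical names, not libstdc++'s own.

```cpp
#include <cstddef>

// Type-independent part of a node: only links, no payload.
struct NodeBase {
    NodeBase *next = nullptr;
};

// Type-specific part: adds the payload on top of the links.
template <typename T>
struct Node : NodeBase {
    explicit Node(const T &v) : value(v) {}
    T value;
};

// Shared, non-template code: works on any list regardless of payload type.
inline std::size_t length(const NodeBase *head) {
    std::size_t n = 0;
    for (const NodeBase *p = head; p; p = p->next) ++n;
    return n;
}

// Only code that needs the payload casts down to the concrete node type.
template <typename T>
T sum(const NodeBase *head) {
    T total = T();
    for (const NodeBase *p = head; p; p = p->next)
        total += static_cast<const Node<T> *>(p)->value;
    return total;
}
```

A debugger that only knows the static type NodeBase* sees the links but not value, which matches the endlessly expandable pointer view described in the question.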
Seems like a bug in the component that renders the collection. About halfway down the list there is an entry that is 0x00000000, but the rendering continues below that, without any valid pointers though. Perhaps you need to add your own common-sense interpretation of the displayed data and treat a null value as the end of that part of the tree.
I want to count the number of appearances of words in some phrases.
My problem is that I can't use map to do this:
map[word] = appearance++;
Instead I have a class that uses a binary tree and behaves like a map, but I only have the method:
void insert(string, int);
Is there a way to count the word appearances using this function? (I can't find a way to increment the int for each different word.) Or do I have to overload operator[] for the class? What should I do?
Presumably you also have a way to retrieve data from your map-like structure (storing data does little good unless you can also retrieve it). The obvious method would be to retrieve the current value, increment it, and store the result (or store 1 if retrieving showed the value wasn't present previously).
I guess this is homework and you're learning about binary trees. In that case I would implement operator[] to return a reference to the existing value (and if no value exists, default-construct a value, insert it, and return a reference to that). Obviously operator[] will be implemented quite similarly to your insert method.
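A minimal sketch of that suggestion, assuming an unbalanced binary search tree. TreeMap, its Node layout and countWords are hypothetical, since the asker's class is not shown.

```cpp
#include <memory>
#include <sstream>
#include <string>

// Hypothetical map-like class backed by an unbalanced binary search tree.
class TreeMap {
public:
    // Overwrites the value for `key`, or adds a new node.
    void insert(const std::string &key, int value) { (*this)[key] = value; }

    // Returns a reference to the value for `key`, inserting 0 if it is missing.
    int &operator[](const std::string &key) {
        std::unique_ptr<Node> *slot = &root_;
        while (*slot) {
            if (key < (*slot)->key) slot = &(*slot)->left;
            else if ((*slot)->key < key) slot = &(*slot)->right;
            else return (*slot)->value;  // key found
        }
        slot->reset(new Node(key));  // key missing: attach a new leaf here
        return (*slot)->value;
    }

private:
    struct Node {
        explicit Node(const std::string &k) : key(k), value(0) {}
        std::string key;
        int value;
        std::unique_ptr<Node> left, right;
    };
    std::unique_ptr<Node> root_;
};

// Counts how often each whitespace-separated word occurs in `phrase`.
void countWords(const std::string &phrase, TreeMap &counts) {
    std::istringstream in(phrase);
    std::string word;
    while (in >> word) ++counts[word];
}
```

With operator[] returning a reference, the counting loop reduces to ++counts[word], and insert can be written in terms of operator[] instead of duplicating the tree walk.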
Can you edit the insert function?
If you can, you could add a static variable inside the function that counts the appearances.