Google's dense_hash_map crashing in set_empty_key() function - c++

I am trying to use Google's dense_hash_map to store key-value data instead of std::map.
When I tested it with an (int, int) pair, I called set_empty_key(mymap, -2) and it worked.
But now, when I use it with my (hash, value) pair and call set_empty_key(mymap, -2) or set_empty_key(mymap, some_random_hash), my program crashes inside set_empty_key() in both cases.
Can anyone guide me with this? How can I fix this crash?
Thanks.

I don't know the exact reason for the crash you're getting, but based on your description I see at least two potential mistakes.
First. Check that both key_type and data_type are POD types and don't contain pointers to themselves. More specifically, quoting the original documentation:
Both key_type and data_type must be
plain old data. In addition, there should
be no data structures that point
directly into parts of key or value,
including the key or value itself (for
instance, you cannot have a value like
struct {int a = 1, *b = &a}). This is
because dense_hash_map uses malloc()
and free() to allocate space for the
key and value, and memmove() to
reorganize the key and value in
memory.
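A rough compile-time guard for the requirement quoted above, assuming a C++11 compiler. The helper is_memmove_safe and the three structs are made up for illustration; they are not part of sparsehash. Note that std::is_trivially_copyable only approximates "plain old data", and it cannot detect a struct that points into itself, so the last case still has to be reviewed by hand.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper (not part of sparsehash): true when objects of type T
// can be relocated with memmove(), which the quoted requirement relies on.
template <typename T>
constexpr bool is_memmove_safe() {
    return std::is_trivially_copyable<T>::value;
}

// Fine: a plain struct of integers, e.g. a 128-bit hash used as a key.
struct PlainKey {
    unsigned long long hi;
    unsigned long long lo;
};

// Rejected: std::string manages heap memory and is not trivially copyable.
struct StringKey {
    std::string text;
};

// NOT caught by the trait: the type is trivially copyable, but if `b` is set
// to point at `a` in the same object, it dangles once the object is moved.
struct SelfPointing {
    int a;
    int* b;
};

static_assert(is_memmove_safe<PlainKey>(), "PlainKey should be usable");
static_assert(!is_memmove_safe<StringKey>(), "StringKey should be rejected");
```

If your hash type fails such a check (for example because it wraps a std::string or a std::vector), that is a likely candidate for the crash.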
Second. Concerning the use of dense_hash_map: you need to set up a special "empty" key value that will never be used for real elements stored in your collection. Moreover, if you are going to use erase(), you need to specify a special key for deleted items, which also must never be used as a key for real stored items.
That is perfectly described here:
dense_hash_map requires you call
set_empty_key() immediately after
constructing the hash-map, and before
calling any other dense_hash_map
method. (This is the largest
difference between the dense_hash_map
API and other hash-map APIs. See
implementation.html for why this is
necessary.) The argument to
set_empty_key() should be a key-value
that is never used for legitimate
hash-map entries. If you have no such
key value, you will be unable to use
dense_hash_map. It is an error to call
insert() with an item whose key is the
"empty key." dense_hash_map also
requires you call set_deleted_key()
before calling erase(). The argument
to set_deleted_key() should be a
key-value that is never used for
legitimate hash-map entries. It must
be different from the key-value used
for set_empty_key(). It is an error to
call erase() without first calling
set_deleted_key(), and it is also an
error to call insert() with an item
whose key is the "deleted key."
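To see why the two reserved keys are needed, here is a toy open-addressing table. This is only a sketch of the general technique under my own assumptions (int keys, linear probing, fixed capacity, no resizing); ToyDenseMap is a made-up name and this is not sparsehash's actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy int -> int map with linear probing (illustration only).
class ToyDenseMap {
public:
    ToyDenseMap(int empty_key, int deleted_key, std::size_t capacity = 8)
        : empty_(empty_key), deleted_(deleted_key),
          slots_(capacity, std::make_pair(empty_key, 0)) {
        assert(capacity > 0);
        assert(empty_key != deleted_key);  // the two sentinels must differ
    }

    // Returns false if the key is one of the reserved keys or the table is full.
    bool insert(int key, int value) {
        if (is_reserved(key)) return false;
        const std::size_t none = slots_.size();
        std::size_t target = none;  // first reusable slot seen while probing
        std::size_t pos = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i, pos = next(pos)) {
            const int k = slots_[pos].first;
            if (k == key) { slots_[pos].second = value; return true; }
            if (k == deleted_ && target == none) target = pos;
            if (k == empty_) { if (target == none) target = pos; break; }
        }
        if (target == none) return false;
        slots_[target] = std::make_pair(key, value);
        return true;
    }

    // Returns a pointer to the stored value, or nullptr if the key is absent.
    const int* find(int key) const {
        if (is_reserved(key)) return nullptr;
        std::size_t pos = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i, pos = next(pos)) {
            if (slots_[pos].first == key) return &slots_[pos].second;
            if (slots_[pos].first == empty_) return nullptr;  // probe chain ends
        }
        return nullptr;
    }

    // Marks the slot with the deleted key so that probe chains stay intact.
    bool erase(int key) {
        if (is_reserved(key)) return false;
        std::size_t pos = home(key);
        for (std::size_t i = 0; i < slots_.size(); ++i, pos = next(pos)) {
            if (slots_[pos].first == key) { slots_[pos].first = deleted_; return true; }
            if (slots_[pos].first == empty_) return false;
        }
        return false;
    }

private:
    bool is_reserved(int key) const { return key == empty_ || key == deleted_; }
    std::size_t home(int key) const { return static_cast<unsigned>(key) % slots_.size(); }
    std::size_t next(std::size_t pos) const { return (pos + 1) % slots_.size(); }

    int empty_;
    int deleted_;
    std::vector<std::pair<int, int>> slots_;
};
```

With a capacity of 8, keys 1 and 9 start probing at the same slot. After erasing 1, the lookup for 9 still succeeds only because the erased slot holds the deleted key rather than the empty key. That is the reason the two values must differ and must never be inserted as real keys.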

Related

Operator [] overload for hash table

I want to overload the [] operator for use in a hash table that I have to write for homework.
I am using a vector of lists that contain pairs: std::vector<std::forward_list<std::pair<std::string, int>>>
What I want the operator to do is return the other part of the given pair. For instance, if there is a pair ("test", 21), writing vectorname["test"] should give me 21, and writing vectorname["test"] = 22 should modify the pair. Also, there should be no identical keys; if there are, only the first one should be taken into consideration.
This is my first Stack Overflow question, sorry if I didn't explain things very well.
In order to do this sort of thing, you need to have your operator[] return a reference-like type that can be assigned to (in order to update the table) or just used (when reading the hash table). The critical thing you need to decide is what to do when the key is not present in the table. You have two options:
1. Add the key to the table immediately. This means that when you try to read a key that is not present, you'll add it to the table with a default value (this is how STL maps work).
2. Don't add the key until you actually assign to the element. This is more work, but allows you to store value types without a default constructor.
In the former case, you just return an actual reference to the element value. In the latter case, you need to implement a custom element_ref class that can be assigned to (operator=) or can be implicitly converted to the element value type (operator int in your case).
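A minimal sketch of the first approach (insert on access, return a real reference), using the bucket layout from the question. The class name StringIntTable, the fixed bucket count, and the contains() helper are my own assumptions, and there is no rehashing.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical class; buckets are std::forward_list<std::pair<std::string, int>>.
class StringIntTable {
public:
    explicit StringIntTable(std::size_t bucket_count = 16) : buckets_(bucket_count) {
        assert(bucket_count > 0);
    }

    // Returns a reference to the value stored under `key`.
    // A missing key is inserted first with the value 0, like std::map does.
    int& operator[](const std::string& key) {
        auto& bucket = buckets_[index_of(key)];
        for (auto& entry : bucket) {
            if (entry.first == key) return entry.second;  // first match wins
        }
        bucket.emplace_front(key, 0);
        return bucket.front().second;
    }

    // Read-only membership test that never inserts.
    bool contains(const std::string& key) const {
        for (const auto& entry : buckets_[index_of(key)]) {
            if (entry.first == key) return true;
        }
        return false;
    }

private:
    std::size_t index_of(const std::string& key) const {
        return std::hash<std::string>()(key) % buckets_.size();
    }

    std::vector<std::forward_list<std::pair<std::string, int>>> buckets_;
};
```

Because operator[] only ever adds a key when it is not already in its bucket, duplicate keys cannot appear. The price is that a plain read such as table["typo"] silently creates an entry, which is what the second approach with a proxy class avoids.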

Accessing adjacent elements of a map in c++

Suppose I have a float-integer map m:
m[1.23] = 3
m[1.25] = 34
m[2.65] = 54
m[3.12] = 51
Imagine that I know that there's a mapping between 2.65 and 54, but I don't know about any other mappings.
Is there any way to visit the adjacent mappings without iterating from the beginning or searching using the find function?
In other words: can I directly access the adjacent values by just knowing about a single mapping...such as m[2.65]=54?
UPDATE: Perhaps a more important point than my answer, brought up by @MattMcNabb:
Floating point keys in std::map
Can I directly access the adjacent values by just knowing about a single mapping (m[2.65]=54)
Yes. std::map is an ordered collection; that is, if operator< exists for the key type (more generally, std::less), you can expect sorted access. In fact, you won't be able to make a map for a key type that lacks this comparison (unless you pass a comparison predicate as a template argument).
Note there is also a std::unordered_map which is often preferable for cases where you don't need this property of being able to navigate quickly between "adjacent" map entries. However you will need to have std::hash defined in that case. You can still iterate it, but adjacency of items in the iteration won't have anything to do with the sort order of the keys.
UPDATE also due to @MattMcNabb
Is there any way to visit the adjacent mappings without iterating from the beginning or searching using the find function?
You allude to array notation, and the general answer here would be "not really". Which is to say there is no way of saying:
if (not m[2.65][-2]) {
std::cout << "no element 2 steps prior to m[2.65]";
} else {
std::cout << "the element 2 before m[2.65] is " << *m[2.65][-2];
}
While no such notational means exist, the beauty (and perhaps the horror) of C++ is that you could write an augmentation of map that did that. Though people would come after you with torches and pitchforks. Or maybe they'd give you cult status and put your book on the best seller list. It's a fine line--but before you even try, count the letters and sequential consonants in your last name and make sure it's a large number.
What you need to access the ordering is an iterator. And find will get you one; and all the flexibility that it affords.
If you only use the array notation to read or write from a std::map, it's essentially a less-capable convenience layer built above iterators. So unless you build your own class derived from map, you're going to be stuck with the limits of that layer. The notation provides no way to get information about adjacent values...nor does it let you test for whether a key is in the map or not. (With find you can do this by comparing the result of a lookup to end(m) if m is your map.)
Technically speaking, find gives you the same effect as you could get by walking through the iterators front-to-back or back-to-front and comparing, as they are sorted. But that would be slower if you're seeking arbitrary elements. All the containers have a kind of algorithmic complexity guarantee that you can read up on.
When dereferencing an iterator, you will receive a pair whose first element is the key and second element is the value. The value will be mutable, but the key is constant. So you cannot find an element, then navigate to an adjacent element, and alter its key directly...just its value.
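Here is a small sketch of stepping to the neighbours once find has given you an iterator, assuming C++11 for std::prev and std::next. The helper names value_before and value_after and the fallback parameter are mine, and I use double keys to match the literals in the question.

```cpp
#include <cassert>
#include <iterator>
#include <map>

typedef std::map<double, int> FloatIntMap;

// Value stored just before `it`, or `fallback` when `it` is the first entry.
int value_before(const FloatIntMap& m, FloatIntMap::const_iterator it, int fallback) {
    assert(it != m.end());  // the caller must pass the result of a successful find()
    return it == m.begin() ? fallback : std::prev(it)->second;
}

// Value stored just after `it`, or `fallback` when `it` is the last entry.
int value_after(const FloatIntMap& m, FloatIntMap::const_iterator it, int fallback) {
    assert(it != m.end());
    FloatIntMap::const_iterator after = std::next(it);
    return after == m.end() ? fallback : after->second;
}
```

With the map from the question, calling m.find(2.65) and passing the iterator to these helpers yields 34 and 51. The lookup itself is logarithmic, and each step to a neighbour does not search again.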

QMap::contains() VS QMap::find()

I often see code like:
if(myQMap.contains("my key")){
myValue = myQMap["my key"];
}
which theoretically performs two lookups in the QMap.
My first reaction is that it should be replaced by the following, which performs only one lookup and should be about twice as fast:
auto it = myQMap.find("my key");
if(it != myQMap.end()){
myValue = it.value();
}
I am wondering if QMap does this optimization automatically for me?
In other words, I am wondering if QMap saves the position of the last element found with QMap::contains() and checks it first before performing the next lookup?
I would expect that QMap provides both functions for a better interface to the class. It's more natural to ask if the map 'contains' a value with a specified key than it is to call the 'find' function.
As the code shows, both find and contains call the following internal function:
Node *n = d->findNode(akey);
So if you're going to use the returned iterator, then using find and checking the return value will be more efficient, but if you just want to know if the value exists in the map, calling contains is better for readability.
If you look at the source code, you'll see that QMap is implemented as a binary tree structure of nodes. Calling findNode iterates through the nodes and does not cache the result.
QMap source code reveals that there is no special code in QMap::contains() method.
In some cases you can use QMap::value() or QMap::values() to get the value for a key and check whether it is valid. These methods (and the const operator[]) copy the value, although this is probably OK for most Qt types since their underlying data is copy-on-write (notably QMap itself).

std::map<int, int> vs. vector of vector

I need a container to store a value (int) according to two attributes, source (int) and destination (int) i.e. when a source sends something to a destination, I need to store it as an element in a container. The source is identified by a unique int ID (an integer from 0-M), where M is in the tens to hundreds, and so is the destination (0-N). The container will be updated by iterations of another function.
I have been using a vector(vector(int)), organized as source(destination(value)). A subsequent process needs to check this container to see whether an element exists for a particular source and a particular destination; it will need to differentiate between an empty 'space' and a filled one. The container may be very sparse.
The value to be stored CAN be 0 so I haven't had success trying to find out if the space is empty, since I can't seem to do something like container[M][N].empty().
I have no experience with maps, but I have seen another post that suggests a map might be useful, and an std::map<int, int> seems to be similar to a vector<vector<int>>.
To summarise:
Is there a way to check if a specific vector of vector 'space' is empty (since I can't compare it to 0)?
Is a std::map<int, int> better for this purpose, and how do I use one?
I need a container to store a value (int) according to two attributes,
source (int) and destination (int)
std::map<std::pair<int, int>, int>
A subsequent process needs to check this container, to see if an
element exists in for a particular source, and a particular
destination - it will need to differentiate between an empty 'space'
and a filled one.
std::map::find
http://www.cplusplus.com/reference/map/map/find/
The container has the possibility of being very sparse.
Use a std::map. The "correct" choice of a container is based on how you need to find things and how you need to insert/delete things. If you want to find things fast, use a map.
First of all, assuming you want an equivalent structure of
vector<vector<int>>
you would want
std::map<int,std::vector<int>>
because for each key in a map, there is one unique value only.
If your sources are indexed closely and sequentially as 0...N, and you will be doing a lot of look-ups and few deletions, you should use a vector of vectors.
If your sources have arbitrary IDs that do not closely follow a sequential order, or if you are going to do a lot of insertions/deletions, you should use a map<int,vector<int>>, which is usually implemented as a binary tree.
To check the size of a vector, you use
myvec.size()
To check whether a key exists in a map, you use
mymap.count(ID) //this will return 0 or 1 (we cannot have more than 1 value to a key)
I have used maps for a while and even though I'm nowhere close to an expert, they've been very convenient for me to use for storing and modifying connections between data.
P.S. If there's only up to one destination matching a source, you can proceed with
map<int,int>
Just use the count() method to see whether a key exists before reading it.
If you want to keep using a vector but want to add a check for whether the item contains a valid value, look at boost::optional. The type would now be std::vector<std::vector<boost::optional<int>>>.
You can also use a map, but the key into the map needs to be both IDs not just one.
std::map<std::pair<int,int>,int>
Edit: std::pair implements a comparison operator operator< that should be sufficient for use in a map, see http://en.cppreference.com/w/cpp/utility/pair/operator_cmp.
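A short sketch of the pair-keyed map, assuming C++11. The alias TrafficMap and the helpers record and lookup are names I made up for the example. Using find distinguishes "never recorded" from "recorded with value 0", which is the case the question struggles with.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical alias: (source, destination) -> value.
typedef std::map<std::pair<int, int>, int> TrafficMap;

// Records (or overwrites) the value sent from `source` to `destination`.
void record(TrafficMap& traffic, int source, int destination, int value) {
    assert(source >= 0 && destination >= 0);  // IDs in the question start at 0
    traffic[std::make_pair(source, destination)] = value;
}

// Returns a pointer to the stored value, or nullptr if nothing was recorded.
// Unlike operator[], find() never inserts, so the map stays sparse.
const int* lookup(const TrafficMap& traffic, int source, int destination) {
    TrafficMap::const_iterator it = traffic.find(std::make_pair(source, destination));
    return it == traffic.end() ? nullptr : &it->second;
}
```

Only the pairs that were actually recorded take up space, so a sparse M x N grid costs memory proportional to the number of filled entries rather than to M * N.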

what happens when you modify an element of an std::set?

If I change an element of an std::set, for example through an iterator, I know it is not "reinserted" or "resorted", but is there any mention of whether it triggers undefined behavior? For example, I would imagine insertions would get messed up. Is there any mention of what specifically happens?
You should not edit the values stored in the set directly. I copied this from MSDN documentation which is somewhat authoritative:
The STL container class set is used
for the storage and retrieval of data
from a collection in which the values
of the elements contained are unique
and serve as the key values according
to which the data is automatically
ordered. The value of an element in a
set may not be changed directly.
Instead, you must delete old values
and insert elements with new values.
Why this is so is pretty easy to understand. The set implementation has no way of knowing you have modified the value behind its back. The usual implementation is a red-black tree. Once you change the value, the position in the tree for that element will be wrong. You would expect to see all manner of wrong behaviour, such as existence queries returning the wrong result because the search goes down the wrong branch of the tree.
The precise answer is platform dependent, but as a general rule a "key" (the thing you put in a set, or the first type of a map) is supposed to be immutable. To put it simply, it should not be modified, and there is no such thing as automatic re-insertion.
More precisely, the member variables used to compare keys must not be modified.
The Windows VC compiler is quite permissive (tested with VC8), and this code compiles:
// creation
std::set<int> toto;
toto.insert(4);
toto.insert(40);
toto.insert(25);
// bad modif
(*toto.begin())=100;
// output
for(std::set<int>::iterator it = toto.begin(); it != toto.end(); ++it)
{
std::cout<<*it<<" ";
}
std::cout<<std::endl;
The output is 100 25 40, which is obviously not sorted... Bad...
Still, such behavior is useful when you want to modify data that does not participate in operator<. But you had better know what you're doing: that's the price you pay for being too flexible.
Some might prefer gcc behavior (tested with 3.4.4) which gives the error "assignment of read-only location". You can work around it with a const_cast:
const_cast<int&>(*toto.begin())=100;
That now compiles on gcc as well, with the same output: 100 25 40.
But at least, doing so will probably make you wonder what's happening, then go to Stack Overflow and see this thread :-)
You cannot do this; the elements are const. The set has no way to detect a change to a stored element, so you must not make one. Instead, you have to remove and reinsert the element. If your elements are expensive to copy, you may have to switch to pointers and custom comparators (or to a C++11 compiler that supports rvalue references, which makes things a whole lot nicer).
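For completeness, a minimal sketch of the remove-and-reinsert approach, assuming C++11. The helper name replace_value is mine, and the assert is only there to show that the ordering invariant holds afterwards.

```cpp
#include <algorithm>
#include <cassert>
#include <set>

// Replaces `old_value` with `new_value` without breaking the set's ordering.
// Returns false (and changes nothing) if `old_value` is not in the set.
bool replace_value(std::set<int>& s, int old_value, int new_value) {
    std::set<int>::iterator it = s.find(old_value);
    if (it == s.end()) return false;
    s.erase(it);          // remove the old element first
    s.insert(new_value);  // the set places the new element at its sorted position
    assert(std::is_sorted(s.begin(), s.end()));
    return true;
}
```

Applied to the set {4, 25, 40} from the example above, replacing 4 with 100 gives 25 40 100 when iterated, instead of the unsorted 100 25 40. If new_value is already present, the insert is a no-op and the set simply shrinks by one.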