I was following a hash table implementation online (https://www.youtube.com/watch?v=2_3fR-k-LzI) when I observed the video author initialize a std::list with an array index. This was very confusing to me as I was always under the impression that std::list was always meant to operate like a linked list and was not capable of supporting random indexing. However, I thought it was maybe a weird way to declare the size of a list and ignored it and moved on. Specifically, he did the following:
static const int hashGroups = 10;
std::list<std::pair<int, std::string>> table[hashGroups];
Upon trying to implement a function to search to see if a key resided in the hash table, I realized that I could not access the std::list objects as I would expect to be able to. In HashTable.cpp (which includes the header file that defines the two variables above) I was only able to access the table member variable's elements as a pointer with -> instead of with . as I would expect to be able to. It looks like what is directly causing this is using the array index in the list definition. This seems to change the type of the table variable from a std::list to a pointer to a std::list. I do not understand why this is the case. This also appears to break my current implementation of attempting to iterate through the table variable because when I declare an iterator to iterate through table's elements, I am able to see that the table has the correct data in the VS debugger but the iterator seems to have completely invalid data and does not iterate through the loop even once despite seeing table correctly have 10 elements. My attempt at the search function is pasted below:
std::string HashTable::searchTable(int key) {
for (std::list<std::pair<int, std::string>>::const_iterator it = table->begin(); it != table->end(); it++)
{
if (key == it->first) {
return it->second;
}
std::cout << "One iteration performed." << std::endl;
}
return "No value found for that key.";
}
With all of this being said, I have several burning questions:
Why are we even able to declare a list with brackets when a std::list does not support random access?
Why does declaring a list like this change the type of the list from std::list to a pointer?
What would be the correct way to iterate through table in its current implementation with an iterator?
Thank you for any help or insight provided!
After reading the responses from @IgorTandetnik I realized that I was thinking about the list incorrectly. What I didn't fully understand was that we were declaring an array of lists and not attempting to initialize a list like an array. Once I realized this, I was able to access the elements correctly since I was not trying to iterate through an array with an iterator for a list. My revised searchTable function, which to my knowledge now works correctly, looks like this:
std::string HashTable::searchTable(int key) {
int hashedKey = hashFunction(key);
if (table[hashedKey].size() > 0)
{
for (std::list<std::pair<int, std::string>>::const_iterator it = table[hashedKey].begin(); it != table[hashedKey].end(); it++)
{
if (key == it->first) {
return it->second;
}
}
}
return "No value found for that key.";
}
And to answer my three previous questions...
1. Why are we even able to declare a list with brackets when a std::list does not support random access?
Response: We are declaring an array of std::list that contains a std::pair of int and std::string, not a list with the array index operator.
2. Why does declaring a list like this change the type of the list from std::list to a pointer?
Response: It doesn't. We are declaring table as an array of std::list instances. In most expressions an array decays to a pointer to its first element, which is why table->begin() compiles and means table[0].begin(). The type of each list never changes.
3. What would be the correct way to iterate through table in its current implementation with an iterator?
Response: The current implementation only attempts to iterate over the first element of table. Create an iterator which uses the hashed key value as the array index of table and then tries to iterate through the std::list that holds instances of std::pair at that index.
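To make the three answers above concrete, here is a minimal sketch. It assumes a free-function version of searchTable and a simple modulo hashFunction (the original hashFunction is not shown, and the modulo version only suits non-negative keys); the Bucket alias and sizeSeenThroughArrow are hypothetical names added for illustration.

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>

// Hypothetical alias for the element type of the array in the question.
using Bucket = std::list<std::pair<int, std::string>>;
const int hashGroups = 10;

// Assumed hash function (the original is not shown); non-negative keys only.
int hashFunction(int key) { return key % hashGroups; }

// `table` is an array of lists, so table[i] is a Bucket accessed with '.'.
std::string searchTable(const Bucket (&table)[hashGroups], int key) {
    const Bucket& bucket = table[hashFunction(key)];
    for (const auto& entry : bucket) {
        if (entry.first == key) return entry.second;
    }
    return "No value found for that key.";
}

// table->size() is (*table).size(), i.e. table[0].size(): the array decays
// to a pointer to its first element, so only bucket 0 is ever looked at.
std::size_t sizeSeenThroughArrow(const Bucket (&table)[hashGroups]) {
    return table->size();
}
```

With one entry in bucket 0 and one in bucket 3, sizeSeenThroughArrow reports 1, which matches the behaviour of the first searchTable attempt: it only ever walked table[0].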
Related
I have nested map of type:
std::map<int,std::map<pointer,pointer>>
I am iterating over the map every frame and doing updates on it. So basically I have 2 nested if checks.
I have an array and I need to sort the data by 2 attributes. The first attribute is an integer, which is the key of the outer map; the second attribute is a pointer, which is the key of the nested map inside the main map. So my code is something like:
iterator = outermap.find();
if(iterator!=outermap.end()){
value = iterator->second;
it1 = value.find();
if(it1!=value.end()){
value1 = it1->second;
// do something
}
else{
// do something and add new value
}
}
else {
// do something and add the values
}
This is really slow and causing my application to drop frame rate. Is there any alternative to this? Can we use hash codes and linked list to achieve the same?
You can use std::unordered_map; it hashes the keys, so lookups complete faster on average. Writing value = iterator->second copies the entire inner map into value. Binding a reference instead avoids the copy and is better for performance, e.g. auto& value = iterator->second.
Also std::map is guaranteed to be ordered. This can be used to your advantage since your keys are integers for the outermost map.
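A minimal sketch of the reference-based lookup with std::unordered_map, assuming a placeholder Node type (the real pointee type is not shown in the question) and a hypothetical lookup helper:

```cpp
#include <unordered_map>

struct Node {};  // hypothetical pointee type; the original is not shown

using Inner = std::unordered_map<const Node*, Node*>;
using Outer = std::unordered_map<int, Inner>;

// Returns the stored pointer, or nullptr if either key is missing.
Node* lookup(const Outer& outer, int id, const Node* key) {
    auto outerIt = outer.find(id);
    if (outerIt == outer.end()) return nullptr;
    const Inner& inner = outerIt->second;  // reference: the inner map is not copied
    auto innerIt = inner.find(key);
    return innerIt == inner.end() ? nullptr : innerIt->second;
}
```

The important line is the const Inner& binding; the same change applies unmodified to the original std::map version.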
Firstly, your question is a bit vague, so this may or may not fit your problem.
Now, you have a map<int, map<pointer, pointer>>, but you never operate on the inner map itself. All you do is look up a value by an int and a pointer. This is also exactly what you should do instead, use an aggregate of those two as key in a map. The type for that is pair<int, pointer>, the map then becomes a map<pair<int, pointer>, pointer>.
One more note: You seem to know the keys to search in the map in advance. If the check whether the element exists is not just for safety, you could also use the overloaded operator[] of the map. The lookup then becomes outermap[ikey][pkey] and returns a value-initialized pointer when the key is missing (so a null pointer, if pointer really is a pointer); note that operator[] also inserts that element. For the suggested combined map, the lookup would be outermap[make_pair(ikey, pkey)].
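The combined-key idea could look roughly like this. Item, Key, KeyLess, FlatMap and lookup are hypothetical names. The explicit comparator is an addition of this sketch: std::less gives a guaranteed total order over pointers, whereas the default pair comparison relies on the built-in < for them.

```cpp
#include <functional>
#include <map>
#include <utility>

struct Item {};  // hypothetical pointee type

using Key = std::pair<int, const Item*>;

// Orders keys by the int first, then by the pointer via std::less.
struct KeyLess {
    bool operator()(const Key& a, const Key& b) const {
        if (a.first != b.first) return a.first < b.first;
        return std::less<const Item*>()(a.second, b.second);
    }
};

using FlatMap = std::map<Key, Item*, KeyLess>;

// Single lookup instead of two nested ones; nullptr if the key is absent.
Item* lookup(const FlatMap& m, int id, const Item* p) {
    auto it = m.find(Key(id, p));
    return it == m.end() ? nullptr : it->second;
}
```

Unlike operator[], find does not insert anything when the key is missing.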
I want to know if anyone has a quick way for adding an element to a std::list<T*> if the element is not already in it.
It's a generic function and I can not use loops so something like this
template <class T>
bool Class<T>::addElement(const T* element)
{
for (list<T*>::iterator it = list_.begin(); it != list_.end(); it++)
{
if (element == *it)
return false;
}
list_.push_back(element);
return true;
}
Is not ok because of the loop. Does anyone have ideas?
Why is what you have "not ok"? Looks perfectly fine and readable to me (modulo missing typename).
If you really don't want to use a loop, you can accomplish the same by using the algorithm that does precisely that loop: std::find:
template <class T>
bool Class<T>::addElement(T* element) // T* rather than const T*, so the pointer can be stored in list<T*>
{
if (std::find(list_.begin(), list_.end(), element) != list_.end()) {
return false;
}
list_.push_back(element);
return true;
}
If you can add other members to your class, you could add an index such as a std::unordered_set. That container stores a set of unique values and can be searched for a specific value in O(1) time on average, which means the implementation does not scan every element to check whether the value already exists. It stays fast even if you have a lot of values already stored.
With std::list, using library functions such as std::find avoids explicitly writing a loop, but the implementation still performs the loop, and this will be slow when a lot of values are already stored (O(n) complexity).
You can use intrusive list instead of the std::list. In this case each element in the list keeps its node data, so you can just query that data to find out if the element is already in the list. The disadvantage is that all elements in this list must be able to provide such data, and you can't put in such lists, for example, integer or boolean elements.
If you still need the std::list and/or the elements can be of any type, then the only way of quickly querying whether the element already exists in the list is to use an index. The indexes can be stored in a separate std::unordered_set for fast lookups. As indexes you can use either the list's values "as is" or values calculated by any custom function.
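As a rough sketch of the index idea, assuming the class is free to hold a second member: UniqueList, index_ and size are hypothetical names, and the pointer values themselves serve as the index "as is".

```cpp
#include <cstddef>
#include <list>
#include <unordered_set>

template <class T>
class UniqueList {
public:
    // Returns false if element is already present; otherwise appends it.
    bool addElement(T* element) {
        // insert() reports whether the value was new, in O(1) on average.
        if (!index_.insert(element).second) return false;
        list_.push_back(element);
        return true;
    }

    std::size_t size() const { return list_.size(); }

private:
    std::list<T*> list_;            // keeps insertion order
    std::unordered_set<T*> index_;  // answers "already stored?" without a scan
};
```

The cost is extra memory and the need to keep both containers in sync whenever an element is removed.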
I have a std::map associating const char* keys with int values:
std::map<const char*, int> myMap;
I initialize it with three keys, then check if it can find it:
myMap["zero"] = 0;
myMap["first"] = 1;
myMap["second"] = 2;
if (myMap.at("zero") != 0)
{
std::cerr << "We have a problem here..." << std::endl;
}
And nothing is printed. From here, everything looks ok.
But later in my code, without any alteration of this map, I try to find again a key:
int value = myMap.at("zero");
But the at function throws an std::out_of_range exception, which means it cannot find the element. myMap.find("zero") thinks the same, because it returns an iterator on the end of the map.
But the creepiest part is that the key is really in the map, if just before the call to the at function, I print the content of the map like this:
for (auto it = myMap.begin(); it != myMap.end(); it++)
{
std::cout << (*it).first << std::endl;
}
The output is as expected:
zero
first
second
How is it even possible? I don't use any beta-test library or anything supposed to be unstable.
You have a map of pointers to characters, not strings. The map lookup is based on the pointer value (address) and not the value of what's pointed at. In the first case, where "zero" is found in the map, your compiler has performed some string merging and is using one array of characters for both identical strings. This is not required by the language but is a common optimization. In the second case, when the string is not found, this merging has not been done (possibly your code here is in a different source module), so the address being used for the lookup is different from the one that was inserted and is therefore not found.
To fix this either store std::string objects in the map, or specify a comparison in your map declaration to order based on the strings and not the addresses.
The key of the map is char*, so the map's comparison function compares raw pointer values rather than checking C-style string equality. Declare the map with std::string as the key instead.
If you do not want to deal with std::string and still want the same functionality with improved time complexity, a more sophisticated data structure is a trie. Look at some implementations like Judy Array.
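Both fixes suggested in the answers above can be sketched as follows. CStrLess, ByContent and ByString are hypothetical names. With the const char* variant the pointed-to strings must outlive the map, since only the pointers are stored.

```cpp
#include <cstring>
#include <map>
#include <string>

// Orders C strings by their contents instead of by their addresses.
struct CStrLess {
    bool operator()(const char* a, const char* b) const {
        return std::strcmp(a, b) < 0;
    }
};

// Option 1: keep const char* keys, but compare what they point at.
using ByContent = std::map<const char*, int, CStrLess>;

// Option 2: let std::string own and compare the characters.
using ByString = std::map<std::string, int>;
```

With either type, a lookup through a different buffer that holds the same characters finds the entry.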
I'm trying to create a hash of arrays of pointers to my object.
The hash key is an int for the type of the object, and the array is a list of the objects to render.
What I'm trying to do is :
unordered_map<int, vector<Object*> > drawQueue;
drawQueue.clear(); // new empty draw queue
for ( ... ) {
drawQueue.at(type).push_back(my_obj);
}
So I'm not familiar enough with the nuances of the STL stuff, since I get an out_of_range exception, which is what happens when the key doesn't exist.
So I figured I need to create the key first, and then add to the vector :
if (drawQueue.count(type)) {
// key already exists
drawQueue.at(type).push_back(my_obj);
} else {
//key doesn't exist
drawQueue.insert(type, vector<Object*>); // problem here
drawQueue.at(type).push_back(my_obj);
}
But now I'm really lost, as I don't know how to create/initialise/whatever an empty vector to the insert of the unordered_map...
Or am I doing this the entirely wrong way?
You are not using insert in the proper way. This should work:
drawQueue.insert(std::make_pair(type, std::vector<Object*>()));
If using C++11, the previous statement can be simplified to:
drawQueue.emplace(type, std::vector<Object*>());
With this approach the map's element is constructed in place from the arguments; the temporary vector is moved, not copied, into it.
I also include links to the documentation for insert and emplace.
I think this is an easy approach. My example will create an unordered_map string as key and integer vector as values.
unordered_map<string,vector<int>> keys;
keys["a"] = vector<int>(); // Initialize key with an empty vector
keys["a"].push_back(1); // push values into vector.
keys["a"].push_back(5);
for(int i : keys["a"] ){
cout << i << "\t";
}
I think you could simplify it by
drawQueue[type].push_back(my_obj);
The operator [] would do the insert for you if the key is not found.
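A minimal sketch of the operator[] approach, assuming a placeholder Object type and a hypothetical enqueue helper:

```cpp
#include <unordered_map>
#include <vector>

struct Object {};  // hypothetical stand-in for the renderable type

using DrawQueue = std::unordered_map<int, std::vector<Object*>>;

// operator[] creates an empty vector the first time a type is seen,
// so no count()/insert() branch is needed.
void enqueue(DrawQueue& queue, int type, Object* obj) {
    queue[type].push_back(obj);
}
```

Keep in mind that operator[] also inserts an empty vector when it is only used to read a missing key; use find or at where that side effect is unwanted.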
The underlying data structure I am using is:
struct Cell { char c; Cell* next; };
map<int, Cell>
In effect the data structure maps an int to a linked list. The map(in this case implemented as a hashmap) ensures that finding a value in the list runs in constant time. The Linked List ensures that insertion and deletion also run in constant time. At each processing iteration I am doing something like:
Cell *cellPointer1 = new Cell;
//Process cells, build linked list
Once the list is built I put the Cell elements in the map. The structure was working just fine, and at the end of my program I deallocate the memory, for each Cell in the list:
delete cellPointer1;
But at the end of my program I have a memory leak!!
To test memory leak I use:
#include <stdlib.h>
#include <crtdbg.h>
#define _CRTDBG_MAP_ALLOC
_CrtDumpMemoryLeaks();
I'm thinking that somewhere along the way the fact that I am putting the Cells in the map does not allow me to deallocate the memory correctly. Does anyone have any ideas on how to solve this problem?
We'll need to see your code for insertion and deletion to be sure about it.
What I'd see as a memleak-free insert / remove code would be:
( NOTE: I'm assuming you don't store the Cells that you allocate in the map )
//
// insert
//
std::map<int, Cell> _map;
Cell a; // no new here!
Cell *iter = &a;
while( condition )
{
Cell *b = new Cell();
iter->next = b;
iter = b;
}
_map[id] = a; // will 'copy' a into the container slot of the map
//
// cleanup:
//
std::map<int,Cell>::iterator i = _map.begin();
while( i != _map.end() )
{
Cell &a = i->second;
Cell *iter = a.next; // list of cells associated to 'a'.
while( iter != NULL )
{
Cell *to_delete = iter;
iter = iter->next;
delete to_delete;
}
_map.erase(i++); // removes the Cell from the map (no need to 'delete'); i is advanced before the erase invalidates the old iterator
}
Edit: there was a comment indicating that I might not have understood the problem completely. If you insert ALL the cells you allocate in the map, then the faulty thing is that your map contains Cell, not Cell*.
If you define your map as std::map<int, Cell *>, your problem would be solved under two conditions:
you insert all the Cells that you allocate in the map
the integer (the key) associated to each cell is unique (important!!)
Now the deletion is simply a matter of:
std::map<int, Cell*>::iterator i = _map.begin();
while( i != _map.end() )
{
Cell *c = i->second;
delete c; // deleting a null pointer is a harmless no-op
++i; // advance, otherwise the loop never terminates
}
_map.clear();
I've built almost the exact same hybrid data structure you are after (list/map with the same algorithmic complexity if I were to use unordered_map instead) and have been using it from time to time for almost a decade though it's a kind of bulky structure (something I'd use with convenience in mind more than efficiency).
It's worth noting that this is quite different from just using std::unordered_map directly. For a start, it preserves the original order in which one inserts elements. Insertion, removal, and searches are guaranteed to happen in logarithmic time (or constant time depending on whether key searching is involved and whether you use a hash table or BST), iterators do not get invalidated on insertion/removal (the main requirement I needed which made me favor std::map over std::unordered_map), etc.
The way I did it was like this:
// I use this as the iterator for my container with
// the list being the main 'focal point' while I
// treat the map as a secondary structure to accelerate
// key searches.
typedef typename std::list<Value>::iterator iterator;
// Values are stored in the list.
std::list<Value> data;
// Keys and iterators into the list are stored in a map.
std::map<Key, iterator> accelerator;
If you do it like this, it becomes quite easy. push_back is a matter of pushing back to the list and adding the iterator to the last element to the map. Iterator removal is a matter of removing the key of the element the iterator points to from the map, then removing the element from the list through that iterator. Finding a key is a matter of searching the map and returning the associated value, which happens to be the list iterator. Key removal is just finding a key and then doing iterator removal, etc.
If you want to improve all methods to constant time, then you can use std::unordered_map instead of the std::map I used here (though that comes with some caveats).
Taking an approach like this should simplify things considerably over an intrusive list-based solution where you're manually having to free memory.
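The operations described above could be sketched like this. It is not the answerer's actual container: OrderedMap, eraseKey and the choice to store (key, value) pairs in the list (so that iterator removal can recover the key) are assumptions of this sketch.

```cpp
#include <iterator>
#include <list>
#include <map>
#include <utility>

template <class Key, class Value>
class OrderedMap {
public:
    typedef typename std::list<std::pair<Key, Value>>::iterator iterator;

    // Appends (key, value) if key is new; returns false if it already exists.
    bool push_back(const Key& key, const Value& value) {
        if (accelerator.count(key)) return false;
        data.push_back(std::make_pair(key, value));
        accelerator[key] = std::prev(data.end());  // iterator to the new element
        return true;
    }

    // Key search goes through the map and yields a list iterator.
    iterator find(const Key& key) {
        typename std::map<Key, iterator>::iterator it = accelerator.find(key);
        return it == accelerator.end() ? data.end() : it->second;
    }

    // Iterator removal: drop the key from the map, then the list node.
    void erase(iterator pos) {
        accelerator.erase(pos->first);
        data.erase(pos);
    }

    // Key removal: find the key, then do iterator removal.
    bool eraseKey(const Key& key) {
        iterator pos = find(key);
        if (pos == data.end()) return false;
        erase(pos);
        return true;
    }

    iterator begin() { return data.begin(); }
    iterator end() { return data.end(); }

private:
    std::list<std::pair<Key, Value>> data;  // values, in insertion order
    std::map<Key, iterator> accelerator;    // key -> position in the list
};
```

This works because std::list iterators stay valid when other elements are inserted or erased, so the iterators stored in the map never dangle until their own element is removed.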
Is there a reason why you are not using built-in containers like, say, STL?
Anyhow, you don't show the code where the allocation takes place, nor the map definition (is this coming from a library?).
Are you sure you deallocate all of the previously allocated Cells, starting from the last one and going backwards up to the first?
You could do this using the STL (remove next from Cell):
std::unordered_map<int,std::list<Cell>>
Or if cell only contains a char
std::unordered_map<int,std::string>
If your compiler doesn't support std::unordered_map then try boost::unordered_map.
If you really want to use intrusive data structures, have a look at Boost Intrusive.
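For the case where a cell only holds a char, the std::string variant might look like the sketch below. CharLists and append are hypothetical names. No new or delete appears anywhere, so there is nothing to leak.

```cpp
#include <string>
#include <unordered_map>

// Each id maps to its sequence of chars; the string owns the storage.
using CharLists = std::unordered_map<int, std::string>;

// Appends c to the sequence stored under id, creating it if needed.
void append(CharLists& lists, int id, char c) {
    lists[id].push_back(c);
}
```

All memory is released when the map goes out of scope or is cleared.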
As others have pointed out, it may be hard to see what you're doing wrong without seeing your code.
Someone should mention, however, that you're not helping yourself by overlaying two container types here.
If you're using a hash_map, you already have constant insertion and deletion time on average; see the related post "Hash : How does it work internally?". The main exception to the O(1) lookup time is when your implementation decides to resize the container, in which case you have added overhead regardless of your linked list addition. Having two addressing schemes is only going to make things slower (not to mention buggier).
Sorry if this doesn't point you to the memory leak, but I'm sure a lot of memory leaks / bugs come from not using stl / boost containers to their full potential. Look into that first.
You need to be very careful with what you are doing, because values in a C++ map need to be copyable and with your structure that has raw pointers, you must handle your copy semantics properly.
You would be far better off using std::list where you won't need to worry about your copy semantics.
If you can't change that then at least std::map<int, Cell*> will be a bit more manageable, although you would have to manage the pointers in your map because std::map will not manage them for you.
You could of course use std::map<int, shared_ptr<Cell> >, probably easiest for you for now.
If you also use shared_ptr within your Cell object itself, you will need to beware of circular references, and as Cell will know it's being managed by a shared_ptr, you could derive it from enable_shared_from_this.
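A minimal sketch of the shared_ptr variant, assuming the Cell layout from the question with next turned into an owning std::shared_ptr. CellMap and pushFront are hypothetical names.

```cpp
#include <map>
#include <memory>

struct Cell {
    char c;
    std::shared_ptr<Cell> next;  // owning link; a singly linked list has no cycles
};

using CellMap = std::map<int, std::shared_ptr<Cell>>;

// Prepends a cell holding c to the list stored under id.
void pushFront(CellMap& cells, int id, char c) {
    auto cell = std::make_shared<Cell>();
    cell->c = c;
    cell->next = cells[id];  // empty shared_ptr if id had no list yet
    cells[id] = cell;
}
```

Clearing or destroying the map releases every Cell without any explicit delete. One caveat of this sketch: destruction walks the chain recursively, so very long lists may need a manual unlinking loop.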
My final point will be that list is very rarely the correct collection type to use. It is the correct one to use sometimes, especially when you have an LRU cache situation and you want to move accessed elements to the end of the list fast. However that is the minority case and it probably doesn't apply here. Think of an alternative collection you really want. map< int, set<char> > perhaps? or map< int, vector< char > > ?
Your list has a lot of overhead just to store a few chars.