Making a hash table of iterators in C++

I'm trying to accelerate a specific linked-list operation by hashing some of the node pointers. This is the code I'm using:
unordered_set< typename list< int >::iterator > myhashset;
In Visual Studio 2012, I get an "error C2338: The C++ Standard doesn't provide a hash for this type", since the compiler doesn't know how to hash iterators. Therefore, I need to implement my own hash function for list iterators like so:
struct X{int i,j,k;};
struct hash_X{
size_t operator()(const X &x) const{
return hash<int>()(x.i) ^ hash<int>()(x.j) ^ hash<int>()(x.k);
}
};
(wikipedia reference)
I'm having trouble figuring out what members of the iterator guarantee uniqueness (and, therefore, the members that I want to hash). Another concern is that those members may be private.
One solution that comes to mind is re-implementing list and list::iterator, but that seems like a hack and introduces more code to maintain.

Use the address of the element that the iterator refers to.
struct list_iterator_hash {
size_t operator()(const list<int>::iterator &i) const {
return hash<int*>()(&*i);
}
};
But this will only work for dereferenceable iterators, not end() or list<int>::iterator().
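To show how that hasher plugs into an unordered_set, here is a minimal sketch. The names ListIteratorHash, IteratorSet and collect_even_nodes are made up for illustration, and it assumes every stored iterator stays dereferenceable.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <unordered_set>

// Hashes a list iterator by the address of the element it refers to.
// Only valid for dereferenceable iterators (never end()).
struct ListIteratorHash {
    std::size_t operator()(const std::list<int>::iterator& it) const {
        return std::hash<const int*>()(&*it);
    }
};

// Equality falls back to the iterator's own operator==.
typedef std::unordered_set<std::list<int>::iterator, ListIteratorHash> IteratorSet;

// Hypothetical helper: remembers the iterators of all even-valued nodes.
IteratorSet collect_even_nodes(std::list<int>& values) {
    IteratorSet result;
    for (std::list<int>::iterator it = values.begin(); it != values.end(); ++it) {
        if (*it % 2 == 0) {
            result.insert(it);
        }
    }
    return result;
}
```

Because std::list iterators stay valid until their own node is erased, the stored iterators remain usable while the list grows. An iterator has to be removed from the set before its node is erased from the list.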

You can use pointers to the element in place of iterators. Let's say you had a list of structs MyStruct. You can use
unordered_set<MyStruct*> myhashset;
and the C++ standard library already implements std::hash for any pointer.
So if you ever need to insert or search listIt then use &(*listIt) which will get the pointer of type MyStruct*.
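A small sketch of the pointer-based variant, assuming a trivial MyStruct with a single id field and a hypothetical helper is_recorded (neither comes from the question):

```cpp
#include <cassert>
#include <list>
#include <unordered_set>

// Hypothetical element type; only its address matters for the lookup.
struct MyStruct {
    int id;
};

// Returns true if the node that `it` refers to has been recorded in `seen`.
// `it` must be dereferenceable (not end()).
bool is_recorded(const std::unordered_set<MyStruct*>& seen,
                 std::list<MyStruct>::iterator it) {
    return seen.count(&*it) != 0;
}
```

One trade-off: std::list has no function that turns an element pointer back into an iterator, so this fits best when you only need membership tests or access to the element itself.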

Related

The std::pair in the std::map is returned as const

My example code is
class A
{
int a = 0;
public:
void setA(const int value)
{
a = value;
}
};
std::map<std::string, std::set<A>> Map{{"A", {}}};
Map.rbegin()->second.rbegin()->setA(2);
I get the following error: "Member function 'setA' not viable: 'this' argument has type 'const A', but function is not marked const"
My question is: why does rbegin() return a const pointer to A? Or why is the std::pair's second const in the std::map?
All std::set elements are exposed in const fashion. That's because they're both keys and values, and if you could modify keys willy-nilly then you'd ruin the tree structure inside the set.
It is currently not possible to directly modify set elements. You will have to remove then re-insert.
(This has nothing to do with the encapsulating map.)
Basically, rbegin() returns a reverse iterator which points to an object of type A which is stored in the std::set in a const manner.
The reason behind such behaviour is quite simple: it is necessary to protect std::set from inadvertent changes of elements which are stored inside.
Remember that std::set stores its elements in a tree-like data structure to ensure fast search/insert/remove operations. Changing an element in place could invalidate the ordering the tree relies on and corrupt the data structure. That is why all iterators returned by begin()/end() and their analogues expose elements in a const fashion.
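The "remove then re-insert" approach could look roughly like this. The class A here is a stand-in with an added constructor, getter and operator< (the question's A has none of these), and set_largest is a hypothetical helper.

```cpp
#include <cassert>
#include <set>

// Stand-in for the class A from the question, given an ordering so that it
// can actually be stored in a std::set.
class A {
    int a;
public:
    explicit A(int value = 0) : a(value) {}
    int getA() const { return a; }
    void setA(int value) { a = value; }
    bool operator<(const A& other) const { return a < other.a; }
};

// Replaces the largest element of `s` with a copy whose value is `value`.
void set_largest(std::set<A>& s, int value) {
    assert(!s.empty());
    std::set<A>::iterator last = s.end();
    --last;             // iterator to the largest element
    A copy = *last;     // copy the const element out of the set
    s.erase(last);      // remove the old element
    copy.setA(value);   // the copy is not const, so this is allowed
    s.insert(copy);     // re-insert; the set places it in sorted position
}
```

Note that the re-inserted element may end up somewhere other than the back of the set, which is exactly why in-place modification is not allowed.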

Use boost::circular_buffer<T> as STL container

I've written a lot of code using std::vector<T> and std::vector<T>::iterator. Now I've decided to replace the vector container with a circular buffer from boost, namely boost::circular_buffer<T>.
Of course now the compiler will complain for every function that uses std::... where I'm passing the boost::... counterpart. Do I have to rewrite all functions now? I'm asking since the container from boost works exactly the same. Also the Boost documentation states the following:
The circular_buffer is a STL compliant container. It is a kind of sequence similar to std::list or std::deque.
What does the "STL compliant" part mean? Is it referring to what I would like to do (interchangeability), or is it simply a note for programmers that the containers work the same way in Boost as in the STL?
EDIT: To give an example
class Item{ };
class Queue{
private:
std::vector<Item*> item_vector; // Want to replace only this...
std::vector<Item*>::iterator current_position; // ...and this
public:
Item* get_current_item() const {
return *current_position;
}
std::vector<Item*> get_item_vector(){
return item_vector;
}
};
Do I have to rewrite all functions now?
If your functions specifically use vector and its iterator types, then yes, you will have to change them to use different types.
If they are templates, designed to work with any sufficiently compatible container and iterator types, then they should work without change.
What does the "STL compliant" part mean?
It usually means it follows the specification for an iterable sequence defined by the C++ standard library (which was influenced by the ancient STL library, whose name some people loosely use to refer to some or all of the modern standard library).
For example, it has begin() and end() member functions returning an iterator type; and the iterator types can be incremented with ++ and dereferenced with *.
This means that an algorithm implemented as a template, for example:
template <typename InputIterator, typename T>
InputIterator find(InputIterator begin, InputIterator end, T const & value) {
for (InputIterator it = begin; it != end; ++it) {
if (*it == value) {
return it;
}
}
return end;
}
will work for any iterator type that supports these operations. While a non-generic function
void find(some_iterator begin, some_iterator end, some_type t);
will only work for a single specific iterator type, and will have to be changed or duplicated to support others.
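The same idea applies to functions that take a whole container. In the sketch below, count_non_null is a made-up function that only relies on begin(), end() and const_iterator. std::deque is used as a stand-in for boost::circular_buffer so the example needs nothing beyond the standard library; that circular_buffer would work the same way is an assumption based on the documentation quoted in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Counts the non-null pointers in any container that offers begin(), end()
// and a const_iterator type, and whose elements compare against nullptr.
template <typename Container>
std::size_t count_non_null(const Container& items) {
    std::size_t count = 0;
    for (typename Container::const_iterator it = items.begin();
         it != items.end(); ++it) {
        if (*it != nullptr) {
            ++count;
        }
    }
    return count;
}
```

A Queue class written against such templates (or against a single typedef for its container type) would only need that one type changed when swapping containers.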

Get a pointer to STL container an iterator is referencing?

For example, the following is possible:
std::set<int> s;
std::set<int>::iterator it = s.begin();
I wonder if the opposite is possible, say,
std::set<int>* pSet = it->getContainer(); // something like this...
No, there is no portable way to do this.
An iterator may not even have a reference to the container. For example, an implementation could use T* as the iterator type for both std::array<T, N> and std::vector<T>, since both store their elements as arrays.
In addition, iterators are far more general than containers, and not all iterators point into containers (for example, there are input and output iterators that read from and write to streams).
No. You must remember the container that an iterator came from, at the time that you find the iterator.
A possible reason for this restriction is that pointers were meant to be valid iterators and there's no way to ask a pointer to figure out where it came from (e.g. if you point 4 elements into an array, how from that pointer alone can you tell where the beginning of the array is?).
It is possible with at least one of the std iterators and some trickery.
The std::back_insert_iterator needs a pointer to the container to call its push_back method. Moreover, this pointer is a protected member rather than a private one, so a derived class can reach it.
#include <iterator>
#include <vector>
template <typename Container>
struct get_a_pointer_iterator : std::back_insert_iterator<Container> {
typedef std::back_insert_iterator<Container> base;
get_a_pointer_iterator(Container& c) : base(c) {}
Container* getPointer(){ return base::container;}
};
#include <iostream>
int main() {
std::vector<int> x{1};
auto p = get_a_pointer_iterator<std::vector<int>>(x);
std::cout << (*p.getPointer()).at(0);
}
This is of course of no practical use; it is merely an example of a std iterator that does carry a pointer to its container, though a rather special one (e.g. incrementing a std::back_insert_iterator is a no-op). The whole point of using iterators is not to know where the elements are coming from. On the other hand, if you ever wanted an iterator that lets you get a pointer to the container, you could write one.
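As a rough sketch of "writing one": the simplest version just bundles an iterator with a pointer to its container at the moment the iterator is obtained. tracked_iterator and tracked_begin are invented names; nothing like them exists in the standard library.

```cpp
#include <cassert>
#include <iterator>
#include <set>

// Hypothetical wrapper: an iterator plus a pointer to the container it was
// obtained from. Only dereferencing is forwarded here, to keep it short.
template <typename Container>
struct tracked_iterator {
    Container* container;
    typename Container::iterator position;

    Container* getContainer() const { return container; }

    typename std::iterator_traits<typename Container::iterator>::reference
    operator*() const {
        assert(container != nullptr);
        return *position;
    }
};

// Records the container at the point where the iterator is created.
template <typename Container>
tracked_iterator<Container> tracked_begin(Container& c) {
    tracked_iterator<Container> result = { &c, c.begin() };
    return result;
}
```

A complete version would also forward ++, ==, != and the iterator typedefs so it could be passed to standard algorithms.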

How to create an unordered_map with non-stl types such as UnicodeString from ICU?

I'd like to be able to do this:
std::unordered_map<icu::UnicodeString, icu::UnicodeString> mymap;
However, when I do (and I come to use it) I was getting "cannot convert size_t to UnicodeString" errors. So I had a look around and read up on unordered containers. This blog post makes the point that I need to make available a specialisation of std::hash<icu::UnicodeString>, so I did exactly that:
namespace std
{
template<>
class hash<icu::UnicodeString> {
public:
size_t operator()(const icu::UnicodeString &s) const
{
return (size_t) s.hashCode();
}
};
};
Not perfect, however, it satisfies the requirements. However, now I'm getting errors that stem from:
error C2039: 'difference_type' : is not a member of 'icu_48::UnicodeString'
The blog post itself hints that I need to be doing more; however, it doesn't tell me what I should do, ending on these remarks:
In addition to requiring a hash function, the unordered containers also need to be able to test two keys for equality. The canonical way for them to do this is with a version of operator==() defined at the global namespace. This is typically a function you are used to having to construct when creating new classes, but if you overlook it, you will be up against the same raft of incomprehensible compiler errors seen earlier in this article.
I didn’t have to deal with it in this article because the standard library already defines this operator for std::pair. Of course, when using std::pair you also have to make sure you have an equality operator for T1 and T2.
So, now I'm a little confused, because operator== is defined for UnicodeString.
I'm using C++11 with MSVC and GCC, and also compiling with Qt dependencies. My question is: what more do I need to do in order to add icu::UnicodeString types to an unordered map?
As requested, I'm later attempting to iterate over the map. The map itself is part of a class, called this->mymap:
std::unordered_map<icu::UnicodeString, icu::UnicodeString>::const_iterator it;
for ( it = this->mymap.begin(); it != this->mymap.end(); ++it )
{
// access it->first, it->second etc...
}
As OP discovered,
somebody had left a nice mymap->insert(key, value) which is wrong wrong wrong
Since an unordered map has a 2-argument insert method,
template <class P>
iterator insert(const_iterator hint, P&& obj);
the compiler will try to match the key as a const_iterator, which is probably why the difference_type type member is requested (it is a member of an iterator).
The correct way to insert an entry is to insert a pair,
mymap.insert(std::make_pair(key, value));
or just use the "emplace" method,
mymap.emplace(key, value);
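Putting the pieces together without ICU, here is a sketch using a hypothetical Key struct in place of icu::UnicodeString. Key, KeyMap and make_map are illustrative names only. It shows the std::hash specialisation, the operator== the blog post mentions, and both correct ways to insert.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical key type standing in for icu::UnicodeString.
struct Key {
    std::string text;
};

// Unordered containers need equality in addition to a hash.
inline bool operator==(const Key& lhs, const Key& rhs) {
    return lhs.text == rhs.text;
}

namespace std {
template <>
struct hash<Key> {
    size_t operator()(const Key& k) const {
        return hash<string>()(k.text);
    }
};
}  // namespace std

typedef std::unordered_map<Key, Key> KeyMap;

KeyMap make_map() {
    KeyMap m;
    m.insert(std::make_pair(Key{"a"}, Key{"alpha"}));  // insert a pair
    m.emplace(Key{"b"}, Key{"beta"});                  // or construct in place
    return m;
}
```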

Why does std::vector transfer its constness to the contained objects?

A const int * and an int *const are very different. Similarly with const std::auto_ptr<int> vs. std::auto_ptr<const int>. However, there appears to be no such distinction with const std::vector<int> vs. std::vector<const int> (actually I'm not sure the second is even allowed). Why is this?
Sometimes I have a function which I want to pass a reference to a vector. The function shouldn't modify the vector itself (eg. no push_back()), but it wants to modify each of the contained values (say, increment them). Similarly, I might want a function to only change the vector structure but not modify any of its existing contents (though this would be odd). This kind of thing is possible with std::auto_ptr (for example), but because std::vector::front() (for example) is defined as
const T &front() const;
T &front();
rather than just
T &front() const;
there's no way to express this.
Examples of what I want to do:
//create a (non-modifiable) auto_ptr containing a (modifiable) int
const std::auto_ptr<int> a(new int(3));
//this works and makes sense - changing the value pointed to, not the pointer itself
*a = 4;
//this is an error, as it should be
a.reset();
//create a (non-modifiable) vector containing a (modifiable) int
const std::vector<int> v(1, 3);
//this makes sense to me but doesn't work - trying to change the value in the vector, not the vector itself
v.front() = 4;
//this is an error, as it should be
v.clear();
It's a design decision.
If you have a const container, it usually stands to reason that you don't want anybody to modify the elements that it contains, which are an intrinsic part of it. That the container completely "owns" these elements "solidifies the bond", if you will.
This is in contrast to the historic, lower-level "container" implementations (i.e. raw arrays), which are more hands-off. As you quite rightly say, there is a big difference between int const* and int * const. But standard containers simply choose to pass the constness on.
The difference is that pointers to int do not own the ints that they point to, whereas a vector<int> does own the contained ints. A vector<int> can be conceptualised as a struct with int members, where the number of members just happens to be variable.
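The owned-versus-pointed-to distinction can be seen with a plain struct. Owner and bump_pointee below are made-up names for illustration.

```cpp
#include <cassert>

struct Owner {
    int value;     // owned: part of the object, so const reaches it
    int* pointee;  // not owned: const applies to the pointer, not its target
};

// Through a const Owner the member pointer becomes int* const, but the int
// it points to stays modifiable.
void bump_pointee(const Owner& o) {
    *o.pointee += 1;
    // o.value += 1;  // would not compile: value is const through o
}
```

A const std::vector<int> behaves like the value member here, while a const smart pointer behaves like the pointee member.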
If you want to create a function that can modify the values contained in the vector but not the vector itself then you should design the function to accept iterator arguments.
Example:
void setAllToOne(std::vector<int>::iterator begin, std::vector<int>::iterator end)
{
std::for_each(begin, end, [](int& elem) { elem = 1; });
}
If you can afford to put the desired functionality in a header, then it can be made generic as:
template<typename OutputIterator>
void setAllToOne(OutputIterator begin, OutputIterator end)
{
typedef typename std::iterator_traits<OutputIterator>::reference ref;
std::for_each(begin, end, [](ref elem) { elem = 1; });
}
One big problem syntactically with what you suggest is this: a std::vector<const T> is not the same type as a std::vector<T>. Therefore, you could not pass a vector<T> to a function that expects a vector<const T> without some kind of conversion. Not a simple cast, but the creation of a new vector<const T>. And that new one could not simply share data with the old; it would have to either copy or move the data from the old one to the new one.
You can get away with this with std::shared_ptr, but that's because those are shared pointers. You can have two objects that reference the same pointer, so the conversion from a std::shared_ptr<T> to shared_ptr<const T> doesn't hurt (beyond bumping the reference count). There is no such thing as a shared_vector.
std::unique_ptr works too because they can only be moved from, not copied. Therefore, only one of them will ever have the pointer.
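For comparison, this is what the smart-pointer conversions look like in practice. read_shared and freeze are hypothetical helper names.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Takes a shared pointer-to-const: shares ownership but cannot modify the int.
int read_shared(std::shared_ptr<const int> p) {
    return *p;
}

// Converts a unique_ptr<int> into a unique_ptr<const int> by moving from it,
// so there is still exactly one owner afterwards.
std::unique_ptr<const int> freeze(std::unique_ptr<int> p) {
    return std::unique_ptr<const int>(std::move(p));
}
```

Passing a std::shared_ptr<int> to read_shared creates a temporary std::shared_ptr<const int> that shares the same object. There is no equivalent cheap conversion from std::vector<int> to a vector of const elements.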
So what you're asking for is simply not possible.
You are correct: it is not possible to have a vector of const int, primarily because the elements would not be assignable (a requirement on the type of element contained in the vector).
If you want a function that only modifies the elements of a vector but does not add elements to the vector itself, this is primarily what the STL does for you: it has functions that are agnostic about which container a sequence of elements is contained in. The function simply takes a pair of iterators and does its thing for that sequence, completely oblivious to the fact that they are contained in a vector.
Look up "insert iterators" for getting to know about how to insert something into a container without needing to know what the elements are. E.g., back_inserter takes a container and all that it cares for is to know that the container has a member function called "push_back".
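A short sketch of the insert-iterator idea: append_doubled is a made-up function that writes through any output iterator, so it never learns which container, if any, is on the other end.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Writes the doubled value of each element in [first, last) through `out`.
// The function only sees iterators; it knows nothing about containers.
template <typename InputIt, typename OutputIt>
OutputIt append_doubled(InputIt first, InputIt last, OutputIt out) {
    for (; first != last; ++first) {
        *out++ = *first * 2;
    }
    return out;
}
```

Called as append_doubled(src.begin(), src.end(), std::back_inserter(dst)), each assignment through the output iterator turns into dst.push_back(...), whether dst is a std::vector, a std::list or any other container with push_back.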