implement the assign operator overload in a hashmap - c++

I'm working on the implementation of a hashmap in c++, however, there is an operation I want to do
myMap["key"] = value;
How can I implement this?
Thanks!

First of all you need to implement the map's index access operator [], which should return a reference to the corresponding value.
Then using that reference you can directly update the value.
Note: normally in C++ the map's operator [] automatically creates an entry when one does not exist.
It's good to first create two helper methods: find(), which returns an iterator to the entry (or end() if the key is absent), and insert(), which creates a new entry and returns an iterator to it. Then operator [] can be implemented simply as:
template <typename Key, typename T>
T& Map<Key, T>::operator[] (Key const& key) {
    auto it = find(key);
    // Key not found: insert a value-initialized T, as std::map does.
    if (it == end()) {
        it = insert(std::make_pair(key, T()));
    }
    return it->second;
}
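To make the find()/insert()/operator[] arrangement concrete, here is a minimal sketch. The class Map, its vector-of-pairs storage, and the linear search are assumptions made for illustration only (a real hash map would hash into buckets); the point is just how operator[] is layered on the two helpers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal map: a linear search stands in for real hashing.
template <typename Key, typename T>
class Map {
    std::vector<std::pair<Key, T>> entries_;
public:
    using iterator = typename std::vector<std::pair<Key, T>>::iterator;

    iterator end() { return entries_.end(); }
    std::size_t size() const { return entries_.size(); }

    // Returns an iterator to the entry, or end() if the key is absent.
    iterator find(Key const& key) {
        for (auto it = entries_.begin(); it != entries_.end(); ++it)
            if (it->first == key) return it;
        return entries_.end();
    }

    // Appends a new entry and returns an iterator to it.
    iterator insert(std::pair<Key, T> entry) {
        entries_.push_back(std::move(entry));
        return entries_.end() - 1;
    }

    // Creates a value-initialized entry when the key is missing.
    T& operator[](Key const& key) {
        auto it = find(key);
        if (it == end())
            it = insert(std::make_pair(key, T()));
        return it->second;
    }
};

// Usage: the assignment goes through the returned reference.
inline int map_demo() {
    Map<std::string, int> myMap;
    myMap["key"] = 1;    // inserts a default entry, then assigns
    myMap["key"] = 42;   // finds the existing entry and overwrites it
    assert(myMap.size() == 1);
    return myMap["key"];
}
```

Note that with vector storage a reference returned by operator[] is invalidated by a later insertion, so this sketch should not be used for expressions that hold one reference while creating another entry.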

Related

operator[] - differentiate between get and set?

Are there any advances in recent C++ that allow differentiating between getting and setting values via the operator[] of a class (as Python does via __setitem__ and __getitem__)?
const T& operator[](unsigned int index) const;
T& operator[](unsigned int index);
I am wrapping an std::unordered_map, and want to let my users access the data via the operator[], but also do some behind the scenes record-keeping to keep things aligned in my data structure.
Searching reveals a few answers, but they are all many years old, and I was wondering if C++ has added extra functionality in the meantime.
Assume your wrapper class implements set and get methods that perform the appropriate record keeping actions. The wrapper class can then also implement operator[] to return a result object that will delegate to one of those methods depending on how the result is used.
This is in line with the first related question you identified (Operator[] C++ Get/Set).
A simple illustration is below. Note that a const map would not be able to call set_item anyway, so the const overload of operator[] calls get_item directly.
class MapType {
...
struct Result {
MapType &map_;
KeyType key_;
Result (MapType &m, KeyType k) : map_(m), key_(k) {}
operator const ValueType & () const {
return map_.get_item(key_);
}
ValueType & operator = (ValueType rhs) {
return map_.set_item(key_, rhs);
}
};
...
const ValueType & get_item (KeyType key) const {
/* ... record keeping ... */
return map_.at(key);
}
ValueType & set_item (KeyType key, ValueType v) {
/* ... record keeping ... */
return map_[key] = v;
}
...
Result operator [] (KeyType key) { return Result(*this, key); }
const ValueType & operator [] (KeyType key) const {
return get_item(key);
}
...
};
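A filled-in version of the same idea might look like the sketch below. The class name CountingMap, the fixed std::string/int types, and the read/write counters standing in for "record keeping" are all assumptions for illustration, not part of the original answer.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical wrapper that counts reads and writes as its "record keeping".
class CountingMap {
    std::unordered_map<std::string, int> data_;
    mutable int reads_ = 0;
    int writes_ = 0;
public:
    class Result {
        CountingMap& map_;
        std::string key_;
    public:
        Result(CountingMap& m, std::string k) : map_(m), key_(std::move(k)) {}
        // Reading: the proxy is converted to the value type.
        operator int() const { return map_.get_item(key_); }
        // Writing: the proxy is assigned to.
        Result& operator=(int rhs) { map_.set_item(key_, rhs); return *this; }
    };

    int get_item(const std::string& key) const { ++reads_; return data_.at(key); }
    void set_item(const std::string& key, int v) { ++writes_; data_[key] = v; }

    Result operator[](const std::string& key) { return Result(*this, key); }
    int operator[](const std::string& key) const { return get_item(key); }

    int reads() const { return reads_; }
    int writes() const { return writes_; }
};

inline int counting_demo() {
    CountingMap m;
    m["a"] = 1;        // routed to set_item
    m["a"] = 2;        // routed to set_item
    int x = m["a"];    // routed to get_item via the conversion operator
    assert(x == 2);
    return m.writes() * 10 + m.reads();
}
```

One caveat of this design: `auto x = m["a"];` deduces the proxy type rather than int, so callers need to name the value type when reading.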
Actually, there is no such thing as a getting operator[] and a setting operator[]. There are just const and non-const [] operators.
When objects are large, returning by reference may save an object creation, and it also gives the caller a chance to change the element at that position, so it makes sense to define these operators to return by reference rather than by value. To make the operator available on constant objects too, mark the constant overload with the const keyword. The idea has not changed since the old days.
BTW, the STL behaves slightly differently depending on the container type. For example, calling [] with an out-of-range index on a vector is undefined behavior (it is not checked and does not throw), whereas a similar call on a non-constant map creates a node with the given key. To keep things a bit more consistent, the STL provides a function named at() which, for vector, map, and the other containers that offer it, checks the index or key and throws std::out_of_range if it is not present.
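The difference between at() and operator[] can be checked with a few lines. The helper function names below are made up for this sketch. Out-of-range operator[] on a vector is deliberately not exercised, since the standard leaves that case undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// True if v.at(i) throws std::out_of_range.
inline bool vector_at_throws(const std::vector<int>& v, std::size_t i) {
    try { (void)v.at(i); }
    catch (const std::out_of_range&) { return true; }
    return false;
}

// operator[] on a non-const map inserts a value-initialized entry.
inline std::size_t map_size_after_subscript() {
    std::map<std::string, int> m;
    int& created = m["missing"];
    assert(created == 0);
    (void)created;
    return m.size();
}

// at() on a map throws instead of inserting.
inline bool map_at_throws() {
    std::map<std::string, int> m;
    try { (void)m.at("missing"); }
    catch (const std::out_of_range&) { return true; }
    return false;
}
```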

`auto` for the result of `std::set::find` in non-const context resolves to `std::_Rb_tree_const_iterator` in g++ (GCC) 7.3.0 with `-std=c++11` [duplicate]

I find the update operation on std::set tedious, since there's no such API on cppreference. So what I currently do is something like this:
//find element in set by iterator
Element copy = *iterator;
... // update member value on copy, varies
Set.erase(iterator);
Set.insert(copy);
Basically, the iterator returned by Set is a const_iterator, and you can't change the element through it directly.
Is there a better way to do this? Or maybe I should override std::set by creating my own (which I don't know exactly how it works..)
set returns const_iterators (the standard says set<T>::iterator is const, and that set<T>::const_iterator and set<T>::iterator may in fact be the same type - see 23.2.4/6 in n3000.pdf) because it is an ordered container. If it returned a regular iterator, you'd be allowed to change the item's value out from under the container, potentially altering the ordering.
Your solution is the idiomatic way to alter items in a set.
C++17 introduced extract, see Barry's answer.
If you're stuck with an older version, there are 2 ways to do this, in the easy case:
You can use mutable on the member variables that are not part of the key
You can split your class into a key/value pair (and use a std::map)
Now, the question is about the tricky case: what happens when the update actually modifies the key part of the object? Your approach works, though I admit it's tedious.
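The first option can be sketched as follows. Employee, its members, and the helper function are hypothetical; the assumption is that only id takes part in the ordering, so marking title as mutable lets it be changed through the set's constant iterator without disturbing the order.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical element type: only `id` takes part in the ordering.
struct Employee {
    int id;
    mutable std::string title;   // not part of the key, so it may change in place
    bool operator<(const Employee& other) const { return id < other.id; }
};

inline std::string retitle_demo() {
    std::set<Employee> staff;
    staff.insert(Employee{7, "engineer"});
    auto it = staff.find(Employee{7, ""});   // the lookup compares id only
    assert(it != staff.end());
    it->title = "manager";                   // allowed because title is mutable
    return staff.begin()->title;
}
```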
In C++17 you can do better with extract(), thanks to P0083:
// remove element from the set, but without needing
// to copy it or deallocate it
auto node = Set.extract(iterator);
// make changes to the value in place
node.value() = 42;
// reinsert it into the set, but again without needing
// to copy or allocate
Set.insert(std::move(node));
This will avoid an extra copy of your type and an extra allocation/deallocation, and will also work with move-only types.
You can also extract by key. If the key is absent, this will return an empty node:
auto node = Set.extract(key);
if (node) // alternatively, !node.empty()
{
node.value() = 42;
Set.insert(std::move(node));
}
Update: The following held under the C++03 wording, but the behavior was considered a defect and was changed in C++11, where set iterators are constant and the assignment through iter below no longer compiles. How very sad.
There are several points that make your question rather confusing.
Functions can return values, classes can't. std::set is a class, and therefore cannot return anything.
If you can call s.erase(iter), then iter is not a const_iterator. erase requires a non-const iterator.
All member functions of std::set that return an iterator return a non-const iterator as long as the set is non-const as well.
You are allowed to change the value of an element of a set as long as the update doesn't change the order of elements. The following code compiles and works just fine.
#include <set>
int main()
{
std::set<int> s;
s.insert(10);
s.insert(20);
std::set<int>::iterator iter = s.find(20);
// OK
*iter = 30;
// error, the following changes the order of elements
// *iter = 0;
}
If your update changes the order of elements, then you have to erase and reinsert.
You may want to use an std::map instead. Use the portion of Element that affects the ordering as the key, and put all of Element as the value. There will be some minor data duplication, but you will have easier (and possibly faster) updates.
I encountered the very same issue in C++11, where ::std::set<T>::iterator is indeed constant and thus does not allow changing the element, even if we know the change will not affect the < invariant. You can get around this by wrapping ::std::set in a mutable_set type or by writing a wrapper for the content:
#include <utility> // std::move
template <typename T>
struct MutableWrapper {
mutable T data;
MutableWrapper(T const& data) : data(data) {}
MutableWrapper(T&& data) : data(std::move(data)) {}
MutableWrapper const& operator=(T const& data) { this->data = data; return *this; }
operator T&() const { return data; }
T* operator->() const { return &data; }
friend bool operator<(MutableWrapper const& a, MutableWrapper const& b) {
return a.data < b.data;
}
friend bool operator==(MutableWrapper const& a, MutableWrapper const& b) {
return a.data == b.data;
}
friend bool operator!=(MutableWrapper const& a, MutableWrapper const& b) {
return a.data != b.data;
}
};
I find this much simpler, and it works in 90% of cases without the user even noticing that there is something between the set and the actual type.
This is faster in some cases:
std::pair<std::set<int>::iterator, bool> result = Set.insert(value);
if (!result.second) {
Set.erase(result.first);
Set.insert(value);
}
If the value is usually not already in the std::set then this can have better performance.

Who deletes previous nlohmann json object resources when value is replaced?

I have a key and I want to change the value of the key with another json object.
json newjs = ...;
json tempjs = ...;
newjs["key"] = tempjs["key"];
What will happen to the data that previously existed in newjs["key"]?
Will nlohmann class automatically destroy it or is it a memory leak?
OR do I need to manually erase the key first and assign as above?
Internally it's kept in an "ordered_map: a minimal map-like container that preserves insertion order".
The actual standard container used in this ordered_map is a std::vector<std::pair<const Key, T>, Allocator> and the assignment you do is performed via
T& operator[](const Key& key)
{
return emplace(key, T{}).first->second;
}
where emplace is defined as:
std::pair<iterator, bool> emplace(const key_type& key, T&& t)
{
for (auto it = this->begin(); it != this->end(); ++it)
{
if (it->first == key)
{
return {it, false};
}
}
Container::emplace_back(key, t);
return {--this->end(), true};
}
This means that operator[] tries to emplace a default initialized T into the internal map. If key isn't present in the map, it will succeed, otherwise it will fail.
Regardless of which, when emplace returns, there will be a T in the map and it's a reference to that T that is returned by operator[] and it's that you then copy assign to.
It's a "normal" copy assignment and no leaks should happen.
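The same reasoning can be checked with a stand-in type. The sketch below does not use nlohmann::json at all: Blob is a hypothetical resource-owning class with a live-buffer counter, and std::map plays the role of the container. It only illustrates that assigning through the reference returned by operator[] is an ordinary copy assignment, which releases whatever the old value owned.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical resource-owning type that counts live heap buffers.
struct Blob {
    static int live_buffers;
    int* data;
    explicit Blob(int v = 0) : data(new int(v)) { ++live_buffers; }
    Blob(const Blob& o) : data(new int(*o.data)) { ++live_buffers; }
    Blob& operator=(const Blob& o) {
        if (this != &o) {
            int* fresh = new int(*o.data);
            delete data;          // the previous buffer is released here
            data = fresh;
        }
        return *this;
    }
    ~Blob() { delete data; --live_buffers; }
};
int Blob::live_buffers = 0;

inline int replace_demo() {
    {
        std::map<std::string, Blob> newjs, tempjs;
        newjs["key"] = Blob(1);
        tempjs["key"] = Blob(2);
        assert(Blob::live_buffers == 2);
        newjs["key"] = tempjs["key"];   // copy assignment, no erase needed
        assert(Blob::live_buffers == 2);
        assert(*newjs["key"].data == 2);
    }
    return Blob::live_buffers;          // everything released at scope exit
}
```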

c++ for_each() lamda function not correct

Could anybody tell me why the for_each() doesn't work in the code below.
I need it to check whether the third element in the tuple is not a nullptr and, if it is not, add the first and third elements to list.
However, it seems to be adding all elements to list.
std::vector<std::tuple<std::string, std::type_index, Value>> arguments;
std::vector<std::pair<std::string, mv::Value>> class::defaultValues() const
{
std::vector<std::pair<std::string, Value>> list;
list.reserve((arguments.size()));
std::for_each(arguments.begin(), arguments.end(),[&list](std::tuple<std::string, std::type_index, Value> arg)
{
if (&std::get<2>(arg) != nullptr)
list.push_back(make_pair(std::get<0>(arg),std::get<2>(arg)));
}
);
return list;
}
Update:
Value is a class.
When the default constructor is called for it, it sets ptr_ to nullptr.
Value() : ptr_(nullptr)
{
}
&std::get<2>(arg) returns the memory address of the Value object itself, not the value of the ptr_ that it holds. That address will NEVER be null (unless Value overloads operator& to return ptr_, which should not be done!).
You need to drop the & so you are comparing the actual Value object. But that will work in your example only if Value has implemented operator== to take a T* (where T is the type of ptr_) or a nullptr_t as input and compares it to ptr_. Otherwise, your lambda would have to access and compare ptr_ directly instead.
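You should also be passing the lambda's arg parameter by const reference instead of by value, so that you are acting on the original tuple stored in arguments, and not on a copy of it (a non-const reference will not bind if arguments is a member, because defaultValues() is const).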
Try this:
std::for_each(arguments.begin(), arguments.end(),
    [&list](const std::tuple<std::string, std::type_index, Value> &arg)
    {
        if (std::get<2>(arg) != nullptr) // or std::get<2>(arg).ptr_, depending on how Value is implemented
            list.push_back(std::make_pair(std::get<0>(arg), std::get<2>(arg)));
    }
);
In this situation, I would suggest making Value implement operator! instead (if it does not already) to return whether its ptr_ is nullptr, then you can do this:
std::for_each(arguments.begin(), arguments.end(),
    [&list](const std::tuple<std::string, std::type_index, Value> &arg)
    {
        if (!!std::get<2>(arg))
            list.push_back(std::make_pair(std::get<0>(arg), std::get<2>(arg)));
    }
);
Or, implement operator bool to return whether ptr_ is not nullptr, or implement operator T* to return ptr_ instead (where T is the type of ptr_), then you can do this:
std::for_each(arguments.begin(), arguments.end(),
    [&list](const std::tuple<std::string, std::type_index, Value> &arg)
    {
        if (std::get<2>(arg))
            list.push_back(std::make_pair(std::get<0>(arg), std::get<2>(arg)));
    }
);
Because &std::get<2>(arg) can never be nullptr. You are literally getting a pointer (with &) to some Value & returned from std::get<2>(arg).
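For completeness, here is a self-contained sketch of the operator bool variant. The Value class below is only a hypothetical stand-in for the asker's class (its real definition is not shown in the question), and defaultValues is written as a free function taking the argument list, which is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <typeindex>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical stand-in for the question's Value class.
class Value {
    const int* ptr_;
public:
    Value() : ptr_(nullptr) {}
    explicit Value(const int* p) : ptr_(p) {}
    explicit operator bool() const { return ptr_ != nullptr; }
};

using Argument = std::tuple<std::string, std::type_index, Value>;

inline std::vector<std::pair<std::string, Value>>
defaultValues(const std::vector<Argument>& arguments) {
    std::vector<std::pair<std::string, Value>> list;
    list.reserve(arguments.size());
    std::for_each(arguments.begin(), arguments.end(),
        [&list](const Argument& arg) {
            if (std::get<2>(arg))   // contextual conversion to bool tests ptr_
                list.push_back(std::make_pair(std::get<0>(arg), std::get<2>(arg)));
        });
    return list;
}

inline std::size_t for_each_demo() {
    int seven = 7;
    std::vector<Argument> arguments;
    arguments.emplace_back("set", std::type_index(typeid(int)), Value(&seven));
    arguments.emplace_back("unset", std::type_index(typeid(int)), Value());
    auto list = defaultValues(arguments);
    assert(list.size() == 1 && list[0].first == "set");
    return list.size();
}
```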

Overloading [] in C++ to return lvalue

I'm writing a simple hash map class:
template <class K, class V> class HashMap;
The implementation is very orthodox: I have a heap array which doubles in size when it grows large. The array holds small vectors of key/value pairs.
Vector<Pair<K, V> > *buckets;
I would like to overload the subscript operator in such a way that code like this will work:
HashMap<int, int> m;
m[0] = 10; m[0] = 20;
m[2] = m[1] = m[0];
In particular,
For m[k] = v where m does not contain k, I'd like a new entry to be added.
For m[k] = v where m does contain k, I'd like the old value to be replaced.
In both of these cases, I'd like the assignment to return v.
Presumably the code will look something like
V& operator[](K &key)
{
if (contains(key))
{
// the easy case
// return a reference to the associated value
}
else
{
Vector<Pair<K, V> > *buck = buckets + hash(key) % num_buckets;
// unfinished
}
}
How should I handle the case where the key is not found? I would prefer to avoid copying values to the heap if I can.
I suppose I could make a helper class which overloads both the assignment operator and a cast to V, but surely there is a simpler solution?
Edit: I didn't realize that std::map required that the value type have a zero argument constructor. I guess I will just default-construct a value as well.
How should I handle the case where the key is not found?
Insert a new element with that key and return a reference to the value of that new element. Effectively, your pseudocode becomes something equivalent to:
if (!contains(key))
insert(Pair<K, V>(key, V()));
return reference_to_the_element_with_that_key;
This is exactly what your requirement is, too. You said "For m[k] = v where m does not contain k, I'd like a new entry to be added."
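A compact sketch of that approach for a chained hash map follows. It is not the asker's class: the fixed bucket count, the use of std::hash, and std::list buckets are assumptions made here. std::list is chosen so that references returned by operator[] stay valid when other entries are inserted, which matters for chained expressions such as m[2] = m[1] = m[0].

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <list>
#include <utility>
#include <vector>

// Hypothetical fixed-size chained hash map (no resizing, for brevity).
template <class K, class V>
class HashMap {
    // std::list keeps references to elements valid across insertions.
    std::vector<std::list<std::pair<K, V>>> buckets_;
public:
    explicit HashMap(std::size_t num_buckets = 16) : buckets_(num_buckets) {}

    V& operator[](const K& key) {
        auto& bucket = buckets_[std::hash<K>()(key) % buckets_.size()];
        for (auto& entry : bucket)
            if (entry.first == key) return entry.second;   // key found
        bucket.push_back(std::make_pair(key, V()));        // key missing: insert V()
        return bucket.back().second;
    }
};

inline int chain_demo() {
    HashMap<int, int> m;
    m[0] = 10; m[0] = 20;      // the second assignment replaces the first
    m[2] = m[1] = m[0];        // creates entries 1 and 2, both set to 20
    return m[0] + m[1] + m[2];
}
```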
How should I handle the case where the key is not found?
std::map creates a new object, and inserts it into the map, and returns its reference. You can also do the same.
Alternatively, you can throw a KeyNotFoundException, the way the .NET map does. Of course, you have to define KeyNotFoundException yourself, possibly deriving it from std::runtime_error.
By the way, as a general rule, always implement operator[] as a pair of overloads:
V &operator[](const K &key);
const V &operator[](const K &key) const;
Just for the sake of const-correctness. However, if you decide to create a new object and insert it into the map when the key is not found, then this rule does not apply, as a const version would not make sense in that situation.
See this FAQ:
What's the deal with "const-overloading"?
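The throwing alternative combined with the const/non-const overload pair could look like the sketch below. Table, its add() method, and the linear search are hypothetical; KeyNotFoundException is user-defined here, as the answer says it must be.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// User-defined exception type, derived from std::runtime_error.
struct KeyNotFoundException : std::runtime_error {
    KeyNotFoundException() : std::runtime_error("key not found") {}
};

// Hypothetical lookup table whose operator[] never inserts.
template <class K, class V>
class Table {
    std::vector<std::pair<K, V>> items_;
public:
    void add(const K& key, const V& value) {
        items_.push_back(std::make_pair(key, value));
    }
    V& operator[](const K& key) {
        for (auto& item : items_)
            if (item.first == key) return item.second;
        throw KeyNotFoundException();
    }
    const V& operator[](const K& key) const {
        for (const auto& item : items_)
            if (item.first == key) return item.second;
        throw KeyNotFoundException();
    }
};

inline bool throws_on_missing(const Table<std::string, int>& t, const std::string& key) {
    try { (void)t[key]; }
    catch (const KeyNotFoundException&) { return true; }
    return false;
}

inline int table_demo() {
    Table<std::string, int> t;
    t.add("a", 1);
    t["a"] = 5;                                // non-const overload: writable
    const Table<std::string, int>& ct = t;
    assert(throws_on_missing(ct, "b"));        // absent key throws, nothing is inserted
    return ct["a"];                            // const overload: read-only access
}
```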
It sounds like what you want is a "smart reference", which you cannot generically implement in C++ because you cannot overload the dot operator (among other reasons).
In other words, instead of returning a reference to a V, you would return a "smart reference" to a V, which would contain a pointer to V. That smart reference would implement operator=(const V &v) as this->p = new V(v), which only requires a copy constructor (not a zero-argument constructor).
The problem is that the smart reference would have to behave like an actual reference in all other ways. I do not believe you can implement this in C++.
One not-quite-solution is to have your constructor take a "default" instance of V to use for initializing new entries. And it could default to V().
Like this:
template<class K, class V> class HashMap {
private:
V default_val;
public:
HashMap(const V& def = V()) : default_val(def) {
...
}
...
};
When V lacks a zero-argument constructor, HashMap h will not compile; the user will need to provide a V object whose value will be returned when a key is accessed for the first time.
This assumes V has a copy constructor, of course. But from your examples, that seems like a requirement anyway.
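As a rough illustration of the stored-default idea, the sketch below uses std::map purely as backing storage. DefaultingMap and Celsius are hypothetical names, and Celsius is deliberately given no zero-argument constructor.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical value type without a zero-argument constructor.
struct Celsius {
    double degrees;
    explicit Celsius(double d) : degrees(d) {}
};

// Hypothetical map that copies a stored default into new entries.
template <class K, class V>
class DefaultingMap {
    std::map<K, V> entries_;
    V default_val_;
public:
    // The default argument V() is only instantiated if a caller relies on it.
    explicit DefaultingMap(const V& def = V()) : default_val_(def) {}

    V& operator[](const K& key) {
        // insert() leaves an existing entry untouched and returns an iterator to it.
        return entries_.insert(std::make_pair(key, default_val_)).first->second;
    }
};

inline double default_demo() {
    DefaultingMap<int, Celsius> temps(Celsius(20.0));
    temps[1] = Celsius(25.5);              // new entry, then overwritten
    assert(temps[1].degrees == 25.5);
    return temps[2].degrees;               // first access yields the stored default
}
```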
The simple solution is to do as std::map does: construct a new entry,
using the default constructor of the value type. This has two
drawbacks: you won't be able to use [] on a HashMap const, and you
can't instantiate HashMap with a value type which doesn't have a default
constructor. The first is more or less implicit in the specification,
which says that [] may modify the map. There are several solutions
for the second: the simplest is probably to pass an instance of a
"default" value to the constructor, which saves it, and uses it to copy
construct the new instance, e.g.:
template <typename Key, typename Value>
class HashMap
{
// ...
Value m_defaultValue;
public:
HashMap( ..., Value const& defaultValue = Value() )
: ... , m_defaultValue( defaultValue )...
Value& operator[]( Key const& key )
{
// ...
// not found
insert( key, m_defaultValue );
// return reference to newly inserted value.
}
};
Alternatively, you can have operator[] return a proxy, something like:
template <typename Key, typename Value>
class HashMap<Key, Value>::Helper // Member class of HashMap
{
    HashMap* m_owner;
    Key m_key;
public:
    operator Value&() const
    {
        if ( ! m_owner->contains( m_key ) )
            m_owner->createEntryWithDefaultValue( m_key );
        return m_owner->getValue( m_key );
    }
    // Returns a const reference, since *this is const inside a const member.
    Helper const& operator=( Value const& newValue ) const
    {
        m_owner->insert( m_key, newValue );
        return *this;
    }
};
Note that you'll still need the default value for the case where someone
writes:
v = m[x];
and x isn't present in the map. And that things like:
m[x].f();
won't work. You can only copy the entire value out or assign to it.
(Given this, I'd rather prefer the first solution in this case. There
are other cases, however, where the proxy is the only solution, and we
have to live with it.)