I have an unordered_map of objects. Each object, in its destructor, browses the unordered map to find other objects, and then tweaks these other objects. This will fail if the other objects are zombie objects, but if the other objects are entirely removed from the unordered_map, there is no problem.
My questions:
does this work if I erase() an object, and its destructor tries to look for itself in the unordered map? Specifically, is the destructor called first, or is the object removed from the unordered_map first, or is there no guarantee?
does this work if the unordered_map is destroyed? Specifically, will the unordered_map be in a valid state as each individual destructor is called?
The lifetime of an object of type T ends when [...] if T is a class type with a non-trivial destructor (12.4), the destructor call starts [...]
[§ 3.8/1 N4431]
And, further down
The properties ascribed to objects throughout this International Standard apply for a given object only during its lifetime
[§ 3.8/3 N4431]
And finally
[...] after the lifetime of an object has ended [...] any pointer that refers to the storage location where the object will be or was located may be used but only in limited ways. [...] The program has undefined behavior if [...] the pointer is used to access a non-static data member or call a non-static member function of the object [...]
[§ 3.8/5 N4431]
So, since you must have some kind of reference (e.g. a pointer, or a real reference, which I'd count as a pointer here too) to the map, and its lifetime has ended, calling a member function (to get an iterator, for example) will - as far as I read this part of the standard - lead to undefined behaviour.
I was also looking at the part of the standard about unordered containers and containers in general, and couldn't find an exception to the above or any clue about the state during destruction.
So: Don't do this. Neither with unordered containers, nor with any other object.
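One way to get a well-defined ordering for a single removal is to take the element out of the container before letting it die. A minimal sketch, assuming C++17 and with names invented here for illustration:

#include <string>
#include <unordered_map>

struct Widget {                     // hypothetical element type
    std::string name;
    // ... a destructor that would like to inspect the other elements ...
};

using Registry = std::unordered_map<std::string, Widget>;

void removeSafely(Registry& reg, const std::string& key)
{
    auto node = reg.extract(key);   // the entry leaves the map here (C++17)
    // the Widget inside 'node' is destroyed at the end of this scope,
    // strictly after its removal from the map
}

With extract, the "removed first, destroyed second" ordering the question asks about is guaranteed.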
BTW: What kind of tweaking makes any sense when you do it on objects that will be destructed moments thereafter?
I think I found a decent solution. You could encapsulate the unordered_map in a class and use that class's destructor to raise a flag, which the destructor of the value type stored in the hash table can then detect as an edge case. Like this:
#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

template<typename K, typename V>
struct hash_table
{
    unordered_map<K, V> map;
    bool is_being_deleted = false;

    ~hash_table()
    {
        is_being_deleted = true; // the map member is destroyed after this body runs
    }
};

struct PageRefrence
{
    string str;
    int page;
    hash_table<string, PageRefrence>& refTable;

    ~PageRefrence()
    {
        if (refTable.is_being_deleted) // the map is in the process of being deleted
            return;
        else
        {   // normal case
            auto x = refTable.map.find(str);
            cout << (*x).second.page;
        }
    }
};

int main()
{
    hash_table<string, PageRefrence> refTable;
    refTable.map.insert({ "HELP", {"HELP", 42, refTable} });
}
Currently, I have a design that maps objects by a key, in an unordered_map. The problem is that in the constructor of this object, I need to look it up by the key, even though it doesn't exist yet. So far I have solved this problem by deferring everything, but it's quite awkward.
So I've been considering a kind of in-place-initializer. Something like
std::unordered_map<K, std::unique_ptr<T, FunkyDeleter>> map;
T* ptr = static_cast<T*>(malloc(sizeof(T)));
map[key] = std::unique_ptr<T, FunkyDeleter>(ptr);
try {
new (ptr) T(args);
} catch(...) {
map[key].release();
map.erase(key);
free(ptr);
throw;
}
This way, code in T's constructor can look it up in the map, even though it's not fully constructed yet.
What are the risks and problems inherent in this design? So far I have identified exception safety, the fact that the destructor for the unique_ptr is going to be awkward, and the risks of accessing a half-constructed T.
Edit:
Roughly speaking, T represents a node in a graph which is very definitely not acyclic and never will be acyclic, ever. In T's constructor, to calculate some things about T, I wanted to look at T's subnodes, which can contain direct references to that T instance. Imagine something like
#include <memory>
#include <unordered_map>
#include <vector>

class graph; // forward declaration so T's constructor can take a graph&

struct K {
    std::vector<K*> subkeys;
};

class T {
    std::vector<T*> child_nodes;
public:
    T(K* key, graph& g); // defined below, once graph is complete
    std::vector<T*> children() { return child_nodes; }
};

class graph {
    std::unordered_map<K*, std::unique_ptr<T>> nodes;
public:
    T* get(K* key) {
        if (nodes.find(key) == nodes.end())
            nodes[key] = std::unique_ptr<T>(new T(key, *this));
        return nodes[key].get();
    }
};

T::T(K* key, graph& g) {
    for (auto subkey : key->subkeys)
        child_nodes.push_back(g.get(subkey));
}

int main() {
    graph g;
    K key1;
    K key2;
    key1.subkeys.push_back(&key2);
    key2.subkeys.push_back(&key1);
    g.get(&key1);
}
This obviously doesn't work in the case of cyclic references in K objects. The problem is how I'm going to support them. So far I simply deferred all work so that T simply does not evaluate any potentially referencing code in the constructor, but that leads to some very awkward designs in some places. I wanted to try and place the pointer to the T into the map as it's being constructed, so that circular references evaluate correctly in the constructor and I can throw out this deferred work, as some of it actually has important side-effects (which I cannot avoid due to a third-party design) and managing deferred side-effects is a bitch.
When you have such tight dependencies, the constructor and destructor easily become a mess.
In general, you should avoid such things. Sometimes it's just clearer to construct objects in an invalid state and then initialize them with an initialize method. If you have some instance fields that you would like to be const, you can group them in a non-const struct field with const fields, which is assigned by initialize. You can even define some form of optional which can never be assigned more than once, but maybe that's overkill.
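For illustration, a minimal sketch of that idea, assuming C++17; all names are invented, and it uses the optional-style variant since a struct with const fields cannot be re-assigned directly:

#include <optional>
#include <string>
#include <utility>

// Hypothetical example: the members we would have liked to be const live in a
// small struct with const fields; the outer object keeps it in an optional and
// emplaces it exactly once from initialize().
struct NodeData {
    const int id;
    const std::string label;
    NodeData(int i, std::string l) : id(i), label(std::move(l)) {}
};

class Node {
    std::optional<NodeData> data_;   // empty until initialize() is called
public:
    void initialize(int id, std::string label) {
        data_.emplace(id, std::move(label));   // the single allowed "assignment"
    }
    bool is_initialized() const { return data_.has_value(); }
};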
Getting back to the question, what I see is:
The destructor of unique_ptr will call delete on a pointer which was not allocated with new, which is undefined behavior. You should use a unique_ptr with a dedicated/noop deallocator, unless you can control all the code which might remove that thing from the map, and be sure it uses that same release-erase-free boilerplate you used in your catch clause.
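For concreteness, one possible shape for such a deleter (the name is invented here; it has to undo placement-new plus malloc, i.e. run the destructor explicitly and then free(), never plain delete):

#include <cstdlib>

template <typename T>
struct MallocDeleter {              // hypothetical stand-in for the question's "FunkyDeleter"
    void operator()(T* p) const {
        if (p) {
            p->~T();                // object was created with placement new
            std::free(p);           // storage came from malloc, so free() it
        }
    }
};
// usage sketch: std::unordered_map<K, std::unique_ptr<T, MallocDeleter<T>>> map;

Note that this deleter assumes the object was fully constructed, which is exactly why the catch clause in the question releases the pointer before erasing.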
unique_ptr is meant to also work with incomplete types, so it will not try to access your object, and your T* is segregated inside the function. So you can only access a partially constructed T if:
T's constructor itself leaks a reference (this would be independent from your code here), or
Some other thread tries to look up the new item. That doesn't seem to be the case if your map is also local to the function. If you're doing multithreading, then you must change your design, unless performance is not important, because the only thing you could do here is take a reentrant mutex lock on the map, which would destroy all the benefits you could get from multithreading.
EDIT
In response to your edit. First of all, it seems clean to me. Doesn't look like you're doing something weird. But you have to deal with the destructor problem, because your map will eventually get destroyed (program termination also causes destructors to be called).
Anyway:
I think you're using keys in a weird way. If the key itself already contains information about the subnodes, why are you also duplicating that information in the node itself? Information duplication calls for out-of-sync-data bugs.
Can't you just change T's constructor to check the subkey before the lookup, and avoid looking up its own key?
I'm designing a class for my application that makes heavy use of standard shared pointers and standard containers such as std::map and std::vector.
It's a very specific question about this problem, so I just copied a piece of code from my header for clarification purposes.
Here is a snapshot of the declarations from the header:
struct Drag;
std::map<short, std::shared_ptr<Drag>> m_drag;
typedef sigc::signal<void, Drag&> signal_bet;
inline signal_bet signal_right_top();
And here is one of the functions that uses the above declarations. It creates a temporary shared_ptr which is intended to be used not only in this function but until some later time; that means that after the function returns, the shared pointer should still be alive, because it will at some point be assigned to another shared_ptr.
void Table::Field::on_signal_left_top(Drag& drag)
{
m_drag.insert(std::make_pair(drag.id, std::make_shared<Drag>(this))); // THIS!
auto iter = m_drag.find(drag.id);
*iter->second = drag;
iter->second->cx = 0 - iter->second->tx;
iter->second->cy = 0 - iter->second->ty;
invalidate_window();
}
The above function first inserts a new shared_ptr and then assigns the values from one object into another.
What I need from your answer is to tell me whether it is safe to insert a temporary shared_ptr into the map and be sure that it will not become dangling or suffer any other bad thing.
According to THIS website, the above function is not considered safe, because it would be much better to write it like so:
void Table::Field::on_signal_left_top(Drag& drag)
{
std::shared_ptr<Drag> pointer = std::make_shared<Drag>(this);
m_drag.insert(std::make_pair(drag.id, pointer));
auto iter = m_drag.find(drag.id);
*iter->second = drag;
// etc...
}
Well, that's one line more in the function.
Is it really required to write it like that, and why?
There's no difference between the two functions in regard to the std::shared_ptr, because the std::make_pair function will create a copy of the temporary object before the temporary object is destructed. That copy will in turn be copied into the std::map, and will then itself be destructed, leaving you with a copy-of-a-copy in the map. But because the two other objects have been destructed, the reference count of the object in the map will still be one.
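A quick way to convince yourself of that claim (Drag replaced by int for brevity; this is just an illustration, not code from the question):

#include <cassert>
#include <map>
#include <memory>

int main()
{
    std::map<short, std::shared_ptr<int>> m;
    m.insert(std::make_pair(short{1}, std::make_shared<int>(42)));   // temporary shared_ptr
    assert(m.at(1).use_count() == 1);   // the copy in the map is the only owner left
}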
As for handling the return value from insert, it's very simple:
auto result = m_drag.insert(...);
if (!result.second)
{
std::cerr << "Could not insert value\n";
return;
}
auto iter = result.first;
...
The code in the example given is different from your example code, because it is using the new operator instead of std::make_shared. The key part of their advice is here:
Since function arguments are evaluated in unspecified order, it is possible for new int(2) to be evaluated first, g() second, and we may never get to the shared_ptr constructor if g throws an exception.
std::make_shared eliminates this problem - any dynamic memory allocated while constructing an object within std::make_shared will be de-allocated if anything throws. You won't need to worry about temporary std::shared_ptrs in this case.
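To make the quoted advice concrete, here is a minimal sketch of the two call shapes; f and g are placeholders invented here, only the allocation pattern matters:

#include <iostream>
#include <memory>

int g() { return 0; }                                   // imagine this could throw
void f(std::shared_ptr<int> p, int) { std::cout << *p << '\n'; }

int main()
{
    // Risky before C++17: "new int(2)" may be evaluated first, and if g() then
    // threw, no shared_ptr would have taken ownership yet, so the int leaks.
    f(std::shared_ptr<int>(new int(2)), g());

    // Safe: allocation and ownership transfer both happen inside make_shared,
    // so either the shared_ptr exists or nothing was allocated.
    f(std::make_shared<int>(2), g());
}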
I have an std::function object I'm using as a callback to some event. I'm assigning a lambda to this object, within which, I assign the object to a different lambda mid execution. I get a segfault when I do this. Is this not something I'm allowed to do? If so, why? And how would I go about achieving this?
declaration:
std::function<void(Data *)> doCallback;
calling:
//
// This gets called after a sendDataRequest call returns with data
//
void onIncomingData(Data *data)
{
if ( doCallback )
{
doCallback(data);
}
}
assignment:
doCallback =
[=](Data *data)
{
//
// Change the callback within itself because we want to do
// something else after getting one request
//
doCallback =
[=](Data *data2)
{
... do some work ...
};
sendDataRequest();
};
sendDataRequest();
The standard does not specify at which points during std::function::operator() the function uses its internal state object. In practice, some implementations use it after the call returns.
So what you did is undefined behaviour, and on your implementation it happens to crash.
#include <functional>
#include <utility>

struct bob {
    std::function<void()> task;
    std::function<void()> next_task;

    void operator()() {
        next_task = task;            // keep a copy alive while task runs
        task();                      // task may overwrite next_task here
        task = std::move(next_task); // install whatever should run next time
    }
};
Now, if you want to change what happens the next time bob is invoked, from within bob's operator(), simply set next_task.
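A possible usage sketch, relying on the bob struct above (the lambdas and their bodies are made up for illustration):

#include <iostream>

int main()
{
    bob cb;
    cb.task = [&cb] {
        std::cout << "first call\n";
        cb.next_task = [] { std::cout << "every later call\n"; };   // swap in the replacement
    };
    cb();   // runs the first lambda; afterwards cb.task holds the replacement
    cb();   // runs the replacement
}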
Short answer
It depends on whether, after the (re)assignment, the lambda being called accesses any of its non-static data members or not. If it does, then you get undefined behavior. Otherwise, I believe nothing bad should happen.
Long answer
In the OP's example, a lambda object -- denoted here by l_1 -- held by a std::function object is invoked and, during its execution, the std::function object is assigned to another lambda -- denoted here by l_2.
The assignment calls template<class F> function& operator=(F&& f); which, by 20.8.11.2.1/18, has the effects of
function(std::forward<F>(f)).swap(*this);
where f binds to l_2 and *this is the std::function object being assigned to. At this time, the temporary std::function holds l_2 and *this holds l_1. After the swap the temporary holds l_1 and *this holds l_2 (*). Then the temporary is destroyed and so is l_1.
In summary, while running operator() on l_1 this object gets destroyed. Then according to 12.7/1
For an object with a non-trivial constructor, referring to any non-static member or base class of the object before the constructor begins execution results in undefined behavior. For an object with a non-trivial destructor, referring to any non-static member or base class of the object after the destructor finishes execution results in undefined behavior.
A lambda's non-static data members correspond to its captures. So if you don't access the captures after the reassignment, it should be fine.
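For illustration, a small self-contained sketch of that rule (names invented): the outer lambda copies everything it still needs into locals before reassigning the std::function, and never touches its captures afterwards.

#include <functional>
#include <iostream>

std::function<void(int)> doCallback;

int main()
{
    int tag = 7;                        // stand-in for whatever the real callback captures
    doCallback = [tag](int value)
    {
        int savedTag = tag;             // copy the capture into a local first

        doCallback = [](int v2)         // the old lambda object may be destroyed here
        {
            std::cout << "later call: " << v2 << '\n';
        };

        // from here on, use only locals and parameters, never the old captures
        std::cout << "first call: " << savedTag << ' ' << value << '\n';
    };

    doCallback(1);   // runs the first lambda, which installs the second
    doCallback(2);   // runs the replacement
}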
There's one more important point raised by Yakk's answer. As far as I understand, the concern was whether std::function::operator(), after having forwarded the call to l_1, tries to access l_1 (which is now dead) or not? I don't think this is the case because the effects of std::function::operator() don't imply that. Indeed, 20.8.11.2.4 says that the effect of this call is
INVOKE(f, std::forward<ArgTypes>(args)..., R) (20.8.2), where f is the target object (20.8.1) of *this.
which basically says that std::function::operator() calls l_1.operator() and does nothing else (at least, nothing that is detectable).
(*) I'm putting details on how the interchange happens under the carpet but the idea remains valid. (E.g. what if the temporary holds a copy of l_1 and not a pointer to it?)
I have objects which create other child objects within their constructors, passing 'this' so the child can save a pointer back to its parent. I use boost::shared_ptr extensively in my programming as a safer alternative to std::auto_ptr or raw pointers. So the child would have code such as shared_ptr<Parent>, and boost provides the shared_from_this() method which the parent can give to the child.
My problem is that shared_from_this() cannot be used in a constructor, which isn't really a crime because 'this' should not be used in a constructor anyways unless you know what you're doing and don't mind the limitations.
Google's C++ Style Guide states that constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method. This solves the 'this-in-constructor' problem as well as a few others.
What bothers me is that people using your code now must remember to call Init() every time they construct one of your objects. The only way I can think of to enforce this is by having an assertion that Init() has already been called at the top of every member function, but this is tedious to write and cumbersome to execute.
Are there any idioms out there that solve this problem at any step along the way?
Use a factory method to 2-phase construct & initialize your class, and then make the ctor & Init() function private. Then there's no way to create your object incorrectly. Just remember to keep the destructor public and to use a smart pointer:
#include <memory>
class BigObject
{
public:
static std::tr1::shared_ptr<BigObject> Create(int someParam)
{
std::tr1::shared_ptr<BigObject> ret(new BigObject(someParam));
ret->Init();
return ret;
}
private:
bool Init()
{
// do something to init
return true;
}
BigObject(int para)
{
}
BigObject() {}
};
int main()
{
std::tr1::shared_ptr<BigObject> obj = BigObject::Create(42);
return 0;
}
EDIT:
If you want the object to live on the stack, you can use a variant of the above pattern. As written, this will create a temporary and use the copy ctor:
#include <memory>
class StackObject
{
public:
StackObject(const StackObject& rhs)
: n_(rhs.n_)
{
}
static StackObject Create(int val)
{
StackObject ret(val);
ret.Init();
return ret;
}
private:
int n_;
StackObject(int n = 0) : n_(n) {};
bool Init() { return true; }
};
int main()
{
StackObject sObj = StackObject::Create(42);
return 0;
}
Google's C++ programming guidelines have been criticized here and elsewhere again and again. And rightly so.
I use two-phase initialization only if it's hidden behind a wrapping class. If manually calling initialization functions were good enough, we'd still be programming in C, and C++ with its constructors would never have been invented.
Depending on the situation, this may be a case where shared pointers don't add anything. They should be used any time lifetime management is an issue. If the child object's lifetime is guaranteed to be shorter than that of the parent, I don't see a problem with using raw pointers. For instance, if the parent creates and deletes the child objects (and no one else does), there is no question over who should delete the child objects.
KeithB has a really good point that I would like to extend (in a sense that is not related to the question, but that will not fit in a comment):
In the specific case of the relation between an object and its subobjects, the lifetimes are guaranteed: the parent object will always outlive the child object. In this case the child (member) object does not share ownership of the parent (containing) object, and a shared_ptr should not be used, both for semantic reasons (there is no shared ownership at all) and for practical reasons: you can introduce all sorts of problems, such as memory leaks and incorrect deletions.
To ease discussion I will use P to refer to the parent object and C to refer to the child or contained object.
If P's lifetime is externally handled with a shared_ptr, then adding another shared_ptr in C to refer to P will have the effect of creating a cycle. Once you have a cycle in memory managed by reference counting, you most probably have a memory leak: when the last external shared_ptr that refers to P goes out of scope, the pointer in C is still alive, so the reference count for P does not reach 0 and the object is not released, even though it is no longer accessible.
If P is handled by a different pointer, then when that pointer gets deleted it will call P's destructor, which will cascade into calling C's destructor. The reference count for P in the shared_ptr that C holds will then reach 0, and that will trigger a double deletion.
If P has automatic storage duration, then when its destructor gets called (the object goes out of scope, or the containing object's destructor is called), the shared_ptr will trigger the deletion of a block of memory that was never allocated with new.
The common solution is breaking cycles with weak_ptrs, so that the child object would not keep a shared_ptr to the parent, but rather a weak_ptr. At this stage the problem is the same: to create a weak_ptr the object must already be managed by a shared_ptr, which during construction cannot happen.
Consider using either a raw pointer (handling ownership of a resource through a raw pointer is unsafe, but here ownership is handled externally, so that is not an issue) or even a reference (which also tells other programmers that you trust the referred-to object P to outlive the referring object C).
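A minimal sketch of what that looks like (names are invented; the point is only that the back link is non-owning):

#include <memory>

class Parent;   // forward declaration so Child can store a Parent*

class Child {
    Parent* parent_;                 // non-owning: Parent always outlives Child
public:
    explicit Child(Parent* parent) : parent_(parent) {}
};

class Parent {
    std::unique_ptr<Child> child_;   // the parent alone owns the child
public:
    Parent() : child_(std::make_unique<Child>(this)) {}   // storing 'this' is fine here
};

int main()
{
    Parent p;   // the Child is created and destroyed strictly within p's lifetime
}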
An object that requires complex construction sounds like a job for a factory.
Define an interface or an abstract class, one that cannot be constructed, plus a free function that, possibly with parameters, returns a pointer to the interface but behind the scenes takes care of the complexity.
You have to think of design in terms of what the end user of your class has to do.
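A rough sketch of that shape (every name here is invented, and the "complexity" is elided):

#include <memory>

class Widget {                                   // the interface: cannot be constructed directly
public:
    virtual ~Widget() = default;
    virtual void doWork() = 0;
};

std::unique_ptr<Widget> makeWidget(int param);   // the free-function factory users call

// --- implementation side, normally hidden in a .cpp file ---
namespace {
class WidgetImpl : public Widget {
public:
    explicit WidgetImpl(int param) : param_(param) {}
    void init() { /* the complex second phase of construction goes here */ }
    void doWork() override {}
private:
    int param_;
};
} // namespace

std::unique_ptr<Widget> makeWidget(int param)
{
    auto w = std::make_unique<WidgetImpl>(param);
    w->init();                                   // callers can never forget this step
    return w;
}

int main()
{
    auto widget = makeWidget(42);
    widget->doWork();
}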
Do you really need to use the shared_ptr in this case? Can the child just have a plain pointer? After all, it's the child object, so it's owned by the parent; couldn't it just have a normal pointer to its parent?
I have code like this:
#include <list>
#include <map>
#include <string>

class MapIndex
{
private:
    typedef std::map<std::string, MapIndex*> Container;
    Container mapM;

public:
    void add(std::list<std::string>& values)
    {
        if (values.empty()) // sanity check
            return;

        std::string s(*(values.begin()));
        values.erase(values.begin());
        if (values.empty())
            return;

        MapIndex *&mi = mapM[s]; // <- question about this line
        if (!mi)
            mi = new MapIndex();
        mi->add(values);
    }
};
The main concern I have is whether the mapM[s] expression will return a reference to a NULL pointer when a new item is added to the map.
The SGI docs say this: data_type& operator[](const key_type& k)
Returns a reference to the object that is associated with a particular key. If the map does not already contain such an object, operator[] inserts the default object data_type().
So, my question is whether the insertion of the default object data_type() will create a NULL pointer, or whether it could create an invalid pointer pointing somewhere random in memory.
It'll create a NULL (0) pointer, which is an invalid pointer anyway :)
Yes, it should be a zero (NULL) pointer, as STL containers default-initialise objects when they aren't explicitly stored (i.e. when accessing a non-existent key in a map as you are doing, or when resizing a vector to a larger size).
C++ Standard, 8.5 paragraph 5 states:
To default-initialize an object of type T means:
- if T is a non-POD class type (clause 9), the default constructor for T is called (and the initialization is ill-formed if T has no accessible default constructor);
- if T is an array type, each element is default-initialized;
- otherwise, the storage for the object is zero-initialized.
You should also note that default-initialisation is different from simply omitting the initialiser. When you omit the initialiser and simply declare a variable of a simple type, you will get an indeterminate value.
int a; // not default constructed, will have random data
int b = int(); // will be initialised to zero
UPDATE: I completed my program, and that very line I was asking about is causing it to crash sometimes, but at a later stage. The problem is that I'm creating a new object without changing the pointer stored in the std::map. What is really needed is either a reference or a pointer to that pointer.
MapIndex *mi = mapM[s]; // <- question about this line
if (!mi)
mi = new MapIndex();
mi->add(values);
should be changed to:
MapIndex* &mi = mapM[s]; // <- question about this line
if (!mi)
mi = new MapIndex();
mi->add(values);
I'm surprised nobody noticed this.
The expression data_type() value-initializes the object. For a class type with a user-provided default constructor, that constructor is invoked; for types without one, such as pointers, the object is zero-initialized.
So yes, you can rely on your map creating a NULL pointer.
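A tiny check of that claim (MapIndex is left incomplete on purpose, since only a pointer is stored):

#include <cassert>
#include <map>
#include <string>

struct MapIndex;   // an incomplete type is enough when only pointers are stored

int main()
{
    std::map<std::string, MapIndex*> mapM;
    MapIndex*& mi = mapM["new key"];   // inserts a value-initialized pointer
    assert(mi == nullptr);             // guaranteed null, never a garbage address
}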
Not sure about the crash, but in the original version (without the reference) there's definitely a memory leak, because the condition in this statement:
if (!mi)
    mi = new MapIndex();
is always true: there, mi is a copy of, not a reference to, what mapM is holding for a particular value of s, so the pointer stored in the map stays NULL and a fresh MapIndex is leaked each time.
I would also avoid using regular pointers and instead use boost::shared_ptr or some other pointer that releases the memory when destroyed. This allows you to call mapM.clear() or erase(), which will call the destructors of the keys and values stored in the map. If the value is a plain pointer, as in your code, no destructor is called for the pointee, so unless you manually delete every value (iterating over the whole map) you will leak memory.
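A sketch of what that could look like (shown with std::shared_ptr here, which plays the same role as boost::shared_ptr):

#include <map>
#include <memory>
#include <string>

class MapIndex
{
    typedef std::map<std::string, std::shared_ptr<MapIndex>> Container;
    Container mapM;

public:
    void addChild(const std::string& key)
    {
        std::shared_ptr<MapIndex>& mi = mapM[key];   // value-initialized (empty) on first access
        if (!mi)
            mi = std::make_shared<MapIndex>();
    }
    // no manual delete loop needed: clear(), erase() and ~MapIndex release every child
};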