I am reading "C++ Concurrency in action" book and trying to comprehend exception safety in thread-safe data structures (for example stack). Author states that to avoid race condition, pop should do both operations - pop and return item from the stack, however:
If the pop() function was defined to return the value popped, as well as remove it from the stack, you have a potential problem: the value being popped is returned to the caller only after the stack has been modified, but the process of copying the data to return to the caller might throw an exception.
Here is proposed solution:
std::shared_ptr<T> pop()
{
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw runtime_error("empty");
std::shared_ptr<T> const res(std::make_shared<T>(data.top()));
data.pop();
return res;
}
In this case we got invocation of copy constructor on make_shared line. If copy constructor throws an exception, stack is not yet modified and we are good.
However I don't see how it differs much from this snippet:
T pop2() {
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw runtime_error("empty");
auto res = data.top();
data.pop();
return res;
}
Here we have copy constructor invocation at data.top() line and move constructor on return. Again, if copy constructor throws an exception we are good since stack is not yet modified. Move contstructor is not supposed to throw exceptions.
Am I missing something? What is the benefit of returning shared_ptr comparison to returning (movable) T ?
If T is not movable, your code might end up doing multiple copy-constructions. Using a shared_ptr to allocate the copy on the heap might be more efficient in this case.
And even if T is movable, your code might still do multiple (potentially costly) move-constructions. Move-constructing and/or copy-constructing a shared_ptr is known to be relatively cheap compared to move- or copy-constructing any possible type T.
Of course your mileage may vary depending on the exact types, your compiler and on your operating environment and hardware.
move-constructor may throw (in general). std::shared_ptr<T> has a noexcept move constructor.
So in first case return res; cannot throw, but in second case return res; may throw (and data.pop() has already been called).
Related
"C++ Concurrency in Action second edition" has an example of thread-safe stack implementation, and the following is the code for pop:
template <typename T>
std::shared_ptr<T> threadsafe_stack<T>::pop()
{
std::lock_guard<std::mutex> lock(m);
// `data` is a data member of type `std::stack<T>`
if(data.empty()) throw empty_stack(); // <--- 2
std::shared_ptr<T> const res(
std::make_shared<T>(std::move(data.top()))); // <--- 3
data.pop(); // <--- 4
return res;
}
The book has the following explanation for this code:
The creation of res (3) might throw an exception, though, for a couple of reasons: the call to std::make_shared might throw because it can’t allocate memory for the new object and the internal data required for reference counting, or the copy constructor or move constructor of the data item to be returned might throw when copying/moving into the freshly- allocated memory. In both cases, the C++ runtime and Standard Library ensure that there are no memory leaks and the new object (if any) is correctly destroyed. Because you still haven’t modified the underlying stack, you’re OK.
Does this explanation correct?
If data item's move constructor throws, and the data item is corrupted. Although this implementation avoid memory leak, the state of the stack has been damaged/modified (not strong exception safe), and latter read will get corrupted data. Right?
So it seems that this implementation works, only if the move constructor doesn't throw (or at least, even if it throws, it keeps the data item unmodified)?
It is assumed that the if the move constructor of T throws an exception it will make sure that the original object remains in its original state.
That is a general exception guarantee that move operations should satisfy, otherwise it is trivially impossible to make any guarantees about exception safety when the move operation is used.
If that is not supposed to be a precondition on the element type, the copy constructor can be used instead of the move constructor if the move constructor is not noexcept. The copy constructor should never modify the original object. There is std::move_if_noexcept as a replacement for std::move in the standard library to implement this behavior.
Obviously if the type is not copyable and the move constructor not noexcept we are back again at the original problem and then it will simply be impossible to make exception guarantees if T's move constructor doesn't make any.
I came across a thread-safe stack implementation of an interface method for stack<>::pop():
void pop(T& value)
{
std::lock_guard<std::mutex> lock(m);
if(data.empty()) throw empty_stack();
value=std::move(data.top()); <----- why not just value = data.top()?
data.pop();
}
Granted, my question has nothing to do with concurrency, but why MOVE the value at the top of the stack into variable value? My understanding is once it is moved, you can't pop it as it is no longer there.
Either it is a mistake in the source where I found it, or if someone can explain it to me I would be thankful.
Thanks,
Amine
value=std::move(data.top()); <----- why not just value = data.top()?
This largely depends on what T is, basically it is going to try and use the move constructor, T(T &&mv) instead of the copy constructor T(const T &cp) if the move constructor exists.
In both cases ~T() will be called for the original object on the data.pop(); line.
Firstly, using the move constructor might be required. Some objects are moveable, but not copyable, e.g. a unique_ptr.
Secondly, where move is provided, it is often more efficient. e.g. say T is a std::vector, the copy constructor will allocate another array and then copy over each element (which might also be expensive), and then it deletes the original array anyway. Which is pretty wasteful.
A move constructor just keeps the original elements by moving the internal data array from one vector to a new one, and leaves the original (which is about to be deleted anyway in this case) in an unspecified but valid state (probably "empty").
It might look something like:
template<typename T> class vector
{
public:
vector<T>(vector<T> &&mv)
: arr(mv.arr) , arr_len(mv.arr_len), arr_capacity(mv.arr_capacity)
{
mv.arr = nullptr;
mv.arr_len = 0;
mv.arr_capacity = 0;
}
...
private:
T *arr;
size_t arr_len;
size_t arr_capacity;
};
Since the original object state is "unspecified" in the general case if you want to keep using the original object, you have to be careful. Destroying it as in the pop case is OK, as is assignment.
T tmp = std::move(some_value);
some_value.foo(); // In general, what state some_value is in is unknown, this might vary even from compiler to compiler
some_value = some_other_value; // But assignment should work
some_value.foo(); // So it is now in a known state
Which can implement for example a "swap" without copying any "contents" where possible.
template<typename T> void swap(T &a, T &b)
{
T tmp = std::move(a);
a = std::move(b);
b = std::move(tmp);
}
Many types will be more specified, for example for std::vector, it promises to be empty().
std::vector<T> tmp = std::move(some_array);
assert(some_array.empty()); // guaranteed
some_array.push_back(x); // guaranteed to have one element
you can't pop it as it is no longer there.
So the important bit here is that data.top() does not remove the element, so it is still there. And the move doesn't actually delete the thing, just leaves it in some unspecified state.
Concurrency-safe stack interface method
On the separate subject of concurrency the thing here is both the top to access the value, and the pop to remove it are under the same lock. To be safe all accesses to this instance stack must use that same lock instance, so make sure any data.push also holds the lock.
A recent question (and especially my answer to it) made me wonder:
In C++11 (and newer standards), destructors are always implicitly noexcept, unless specified otherwise (i.e. noexcept(false)). In that case, these destructors may legally throw exceptions. (Note that this is still a you should really know what you are doing-kind of situation!)
However, all overloads of
std::unique_ptr<T>::reset() are declared to always be noexcept (see cppreference), even if the destructor if T isn't, resulting in program termination if a destructor throws an exception during reset(). Similar things apply to std::shared_ptr<T>::reset().
Why is reset() always noexcept, and not conditionally noexcept?
It should be possible to declare it noexcept(noexcept(std::declval<T>().~T())) which makes it noexcept exactly if the destructor of T is noexcept. Am I missing something here, or is this an oversight in the standard (since this is admittedly a highly academic situation)?
The requirements of the call to the function object Deleter are specific on this as listed in the requirements of the std::unique_ptr<T>::reset() member.
From [unique.ptr.single.modifiers]/3, circa N4660 §23.11.1.2.5/3;
unique_ptr modifiers
void reset(pointer p = pointer()) noexcept;
Requires: The expression get_deleter()(get()) shall be well formed, shall have well-defined behavior, and shall not throw exceptions.
In general the type would need to be destructible. And as per the cppreference on the C++ concept Destructible, the standard lists this under the table in [utility.arg.requirements]/2, §20.5.3.1 (emphasis mine);
Destructible requirements
u.~T() All resources owned by u are reclaimed, no exception is propagated.
Also note the general library requirements for replacement functions; [res.on.functions]/2.
std::unique_ptr::reset does not invoke destructor directly, instead it invokes operator () of the deleter template parameter (which defaults to std::default_delete<T>). This operator is required to not throw exceptions, as specified in
23.11.1.2.5 unique_ptr modifiers [unique.ptr.single.modifiers]
void reset(pointer p = pointer()) noexcept;
Requires: The expression get_deleter()(get()) shall be well-formed, shall have >well-defined behavior, and shall not throw exceptions.
Note that shall not throw is not the same as noexcept though. operator () of the default_delete is not declared as noexcept even though it only invokes delete operator (executes delete statement). So this seems to be a rather weak spot in the standard. reset should either be conditionally noexcept:
noexcept(noexcept(::std::declval<D>()(::std::declval<T*>())))
or operator () of the deleter should be required to be noexcept to give a stonger guarantee.
Without having been in the discussions in the standards committee, my first thought is that this is a case where the standards committee has decided that the pain of throwing in the destructor, which is generally considered undefined behaviour due to the destruction of stack memory when unwinding the stack, was not worth it.
For the unique_ptr in particular, consider what could happen if an object held by a unique_ptr throws in the destructor:
The unique_ptr::reset() is called.
The object inside is destroyed
The destructor throws
The stack starts unwinding
The unique_ptr goes out of scope
Goto 2
There was to ways of avoiding this. One is setting the pointer inside of the unique_ptr to a nullptr before deleting it, which would result in a memory leak, or to define what should happen if a destructor throws an exception in the general case.
Perhaps this would be easier to explain this with an example. If we assume that reset wasn't always noexcept, then we could write some code like this would cause problems:
class Foobar {
public:
~Foobar()
{
// Toggle between two different types of exceptions.
static bool s = true;
if(s) throw std::bad_exception();
else throw std::invalid_argument("s");
s = !s;
}
};
int doStuff() {
Foobar* a = new Foobar(); // wants to throw bad_exception.
Foobar* b = new Foobar(); // wants to throw invalid_argument.
std::unique_ptr<Foobar> p;
p.reset(a);
p.reset(b);
}
What do we when p.reset(b) is called?
We want to avoid memory leaks, so p needs to claim ownership of b so that it can destroy the instance, but it also needs to destroy a which wants to throw an exception. So how and we destroy both a and b?
Also, which exception should doStuff() throw? bad_exception or invalid_argument?
Forcing reset to always be noexcept prevents these problems. But this sort of code would be rejected at compile-time.
template<class T>
T Stack<T>::pop()
{
if (vused_ == 0)
{
throw "Popping empty stack";
}
else
{
T result = v_[used_ - 1];
--vused_;
return result;
}
}
I didn't understand all of it, or rather I understood none of it, but it was said that this code doesn't work, because it returns by value, I am guessing he was referring to result, and that calls the copy constructor and I have no idea how that's even possible. Can anyone care to explain?
Unlike the code in the question's example, std::stack<T>::pop does not return a value.
That's because if the item type needs to be copied, and the copying throws, then you have an operation failure that has changed the state of the object, with no means of re-establishing the original state.
I.e. the return-a-value-pop does not offer a strong exception guarantee (either succeed or no change).
Similarly, throwing a literal string is unconventional to say the least.
So while the code doesn't have any error in itself (modulo possible typing errors such as vused_ versus v_ etc.), it's weak on guarantees and so unconventional that it may lead to bugs in exception handling elsewhere.
A different viewpoint is that the non-value-returning pop of std::stack is impractical, leading to needlessly verbose client code.
And for using a stack object I prefer to have a value-returning pop.
But it's not either/or: a value-returning convenience method popped can be easily defined in terms of pop (state change) and top (inspection). This convenience method then has a weaker exception guarantee. But the client code programmer can choose. :-)
An improvement within the existing design would be to support movable objects, that is, replace
return result;
with
return move( result );
helping the compiler a little.
↑ Correction:
Actually, the above deleted text has the opposite effect of the intended one, namely, it inhibits RVO (guaranteeing a constructor call). Somehow my thinking got inverted here. But as a rule, don't use move on a return expression that is just the name of a non-parameter automatic variable, because the default is optimization, and the added move can not improve things, but can inhibit an RVO optimization.
Yes, returning by value formally calls the copy constructor. But that's not a problem at all, because in practice, compilers will typically be able to optimize away the additional copy. This technique is called "Return-Value Optimization".
More than the return statement (which can work if the class is movable but not copyable, e.g. you can return std::unique_ptrs), the problem is the copy you do here:
T result = v_[used_ - 1];
To make this copy possible, the type T must be copyable (e.g. T should have public copy constructor - required by the above statement - and copy assignment operator=).
As a side note, throwing a string is really bad: you should throw an exception class, e.g.
throw std::runtime_error("Popping empty stack.");
or just define an ad hoc class for this case and throw it, e.g.:
class StackUnderflowException : public std::runtime_error
{
public:
StackUnderflowException()
: std::runtime_error("Popping empty stack.")
{ }
};
....
throw StackUnderflowException();
I'm refreshing my C++ knowledge after not having used it in anger for a number of years. In writing some code to implement some data structure for practice, I wanted to make sure that my code was exception safe. So I've tried to use std::auto_ptrs in what I think is an appropriate way. Simplifying somewhat, this is what I have:
class Tree
{
public:
~Tree() { /* delete all Node*s in the tree */ }
void insert(const string& to_insert);
...
private:
struct Node {
...
vector<Node*> m_children;
};
Node* m_root;
};
template<T>
void push_back(vector<T*>& v, auto_ptr<T> x)
{
v.push_back(x.get());
x.release();
}
void Tree::insert(const string& to_insert)
{
Node* n = ...; // find where to insert the new node
...
push_back(n->m_children, auto_ptr<Node>(new Node(to_insert));
...
}
So I'm wrapping the function that would put the pointer into the container, vector::push_back, and relying on the by-value auto_ptr argument to
ensure that the Node* is deleted if the vector resize fails.
Is this an idiomatic use of auto_ptr to save a bit of boilerplate in my
Tree::insert? Any improvements you can suggest? Otherwise I'd have to have
something like:
Node* n = ...; // find where to insert the new node
auto_ptr<Node> new_node(new Node(to_insert));
n->m_children.push_back(new_node.get());
new_node.release();
which kind of clutters up what would have been a single line of code if I wasn't
worrying about exception safety and a memory leak.
(Actually I was wondering if I could post my whole code sample (about 300 lines) and ask people to critique it for idiomatic C++ usage in general, but I'm not sure whether that kind of question is appropriate on stackoverflow.)
It is not idiomatic to write your own container: it is rather exceptional, and for the most part useful only for learning how to write containers. At any rate, it is most certainly not idiomatic to use std::autp_ptr with standard containers. In fact, it's wrong, because copies of std::auto_ptr aren't equivalent: only one auto_ptr owns a pointee at any given time.
As for idiomatic use of std::auto_ptr, you should always name your auto_ptr on construction:
int wtv() { /* ... */ }
void trp(std::auto_ptr<int> p, int i) { /* ... */ }
void safe() {
std::auto_ptr<int> p(new int(12));
trp(p, wtv());
}
void danger() {
trp(std::auto_ptr<int>(new int(12)), wtv());
}
Because the C++ standard allows arguments to evaluate in any arbitrary order, the call to danger() is unsafe. In the call to trp() in danger(), the compiler may allocate the integer, then create the auto_ptr, and finally call wtv(). Or, the compiler may allocate a new integer, call wtv(), and finally create the auto_ptr. If wtv() throws an exception then danger() may or may not leak.
In the case of safe(), however, because the auto_ptr is constructed a-priori, RAII guarantees it will clean up properly whether or not wtv() throws an exception.
Yes it is.
For example, see the interface of the Boost.Pointer Container library. The various pointer containers all feature an insert function taking an auto_ptr which semantically guarantees that they take ownership (they also have the raw pointer version but hey :p).
There are however other ways to achieve what you're doing with regards to exception safety because it's only internal here. To understand it, you need to understand what could go wrong (ie throw) and then reorder your instructions so that the operations that may throw are done with before the side-effects occur.
For example, taking from your post:
auto_ptr<Node> new_node(new Node(to_insert)); // 1
n->m_children.push_back(new_node.get()); // 2
new_node.release(); // 3
Let's check each line,
The constructor may throw (for example if the CopyConstructor of the type throws), in this case however you are guaranteed that new will perform the cleanup for you
The call to push_back may throw a std::bad_alloc if the memory is exhausted. That's the only error possible as copying a pointer is a no-throw operation
This is guaranteed not to throw
If you look closely, you'll remark that you would not have to worry if you could somehow have 2 being executed before 1. It is in fact possible:
n->m_children.reserve(n->m_children.size() + 1);
n->m_children.push_back(new Node(to_insert));
The call to reserve may throw (bad_alloc), but if it completes normally you are then guaranteed that no reallocation will occur until size becomes equal to capacity and you try another insertion.
The call to new may fall if the constructor throw, in which case new will perform the cleanup, if it completes you're left with a pointer that is immediately inserted in the vector... which is guaranteed not to throw because of the line above.
Thus, the use of auto_ptr may be replaced here. It was nonetheless a good idea, though as has been noted you should refrain from executing RAII initialization within a function evaluation.
I like the idea of declaring the ownership of pointer. This is one of the great features in c++0x, std::unique_ptrs. However std::auto_ptr is so hard to understand and lethal is even slightly misued I would suggest avoiding it entirely.
http://www.gotw.ca/publications/using_auto_ptr_effectively.htm