Is it OK to have a throwing swap member-implementation? - c++

The general guideline when writing classes (using the copy-and-swap idiom) is to provide a non throwing swap member function. (Effective C++, 3rd edition, Item 25 and other resources)
However, what if I cannot provide the nothrow guarantee because my class uses a 3rd party class member that doesn't provide a swap operation?
// Warning: Toy code !!!
class NumberBuffer {
public:
...
void swap(NumberBuffer& rhs);
public:
float* m_data;
size_t m_n;
CString m_desc;
};
void swap(NumberBuffer& lhs, NumberBuffer& rhs) {
lhs.swap(rhs);
}
void NumberBuffer::swap(NumberBuffer& rhs) {
using std::swap;
swap(m_data, rhs.m_data);
swap(m_n, rhs.m_n);
swap(m_desc, rhs.m_desc); // could throw if CString IsLocked and out-of-mem
}
CString swap cannot be made no-throw, so there's the off chance the swap could fail.
Note: For rare 3rd party classes, using a smart ptr (pimpl) would be an option, but --
Note: CString is a good example, as no one in their right mind (?) would hold every member of a conceptually simple and ubiquitous class like CString via pimpl (smart ptr), because that would look horrible. On the other hand, there is no short- to mid-term chance of getting CString modified to allow a fully no-throw swap.
So, is it OK to have a potentially throwing swap member function if you can't help it? (Or do you know ways around this conundrum?)
Edit: And: Can a throwing swap member be used with the copy-and-swap idiom to provide the basic guarantee if not the strong guarantee?

So, is it OK to have a potentially throwing swap member function if you can't help it? (Or do you know ways around this conundrum?)
There is nothing inherently wrong with having a swap function that can potentially throw, but beware: without the strong exception guarantee in swap, it cannot be used to provide exception safety. It can only be used as a plain swap. That is, forget about the copy-and-swap idiom for that particular class as a way of providing the strong exception guarantee. You can still use the idiom to reduce the amount of code, as long as you document that it is not exception safe.
Alternatively, you can move the CString into a smart pointer that offers a no-throw swap (or at the very least the strong exception guarantee), not a nice solution, but it will at least be exception safe. Lastly, you can move away from CString altogether by using any other string library that provides whatever you need and offers a no-throw swap operation.
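The smart-pointer alternative could look roughly like the sketch below. It assumes C++11 (std::unique_ptr, noexcept), and LegacyString is a hypothetical stand-in for a third-party type such as CString; the members of NumberBuffer other than swap are invented for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a third-party string whose own swap may throw.
struct LegacyString {
    std::string text;
    explicit LegacyString(const std::string& s) : text(s) {}
};

class NumberBuffer {
public:
    NumberBuffer(std::size_t n, const std::string& desc)
        : m_data(new float[n]()), m_n(n), m_desc(new LegacyString(desc)) {}

    NumberBuffer(const NumberBuffer&) = delete;
    NumberBuffer& operator=(const NumberBuffer&) = delete;

    // Only pointers and an integer are exchanged, so nothing here can throw.
    void swap(NumberBuffer& rhs) noexcept {
        using std::swap;
        swap(m_data, rhs.m_data);
        swap(m_n, rhs.m_n);
        swap(m_desc, rhs.m_desc);  // swaps the pointers, never the strings
    }

    std::size_t size() const { return m_n; }
    const std::string& description() const { return m_desc->text; }

private:
    std::unique_ptr<float[]> m_data;
    std::size_t m_n;
    std::unique_ptr<LegacyString> m_desc;  // indirection buys the no-throw swap
};

inline void swap(NumberBuffer& lhs, NumberBuffer& rhs) noexcept { lhs.swap(rhs); }
```

The cost is one extra heap allocation and one extra indirection per object, which is the "not a nice solution" trade-off mentioned above.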

There's nothing inherently wrong with a throwing swap, it is just less useful than a no-throw version.
The copy and swap idiom doesn't need swap to be no-throw in order to provide the strong exception guarantee. swap needs only to provide the strong exception guarantee.
The difficulty is that if the no-throw guarantee cannot be provided, it is also likely that the strong exception guarantee cannot be provided. Naive swapping using a temporary and three copies only provides the basic guarantee unless the copying operation provides the no-throw guarantee, in which case swap is also no-throw.
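To make the last point concrete, here is a small sketch of a naive three-copy swap failing halfway. Flaky and swap_and_report are hypothetical names made up for this demonstration; the countdown only exists to force the second assignment to throw.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical type whose copy assignment fails once a countdown reaches zero.
struct Flaky {
    int value;
    static int assignments_left;  // copy assignments allowed before a throw

    explicit Flaky(int v) : value(v) {}
    Flaky(const Flaky&) = default;

    Flaky& operator=(const Flaky& other) {
        if (assignments_left == 0) throw std::runtime_error("assignment failed");
        --assignments_left;
        value = other.value;
        return *this;
    }
};
int Flaky::assignments_left = 0;

template <typename T>
void naive_swap(T& a, T& b) {
    T c(a);
    a = b;  // succeeds in the demo below
    b = c;  // throws in the demo below, after a has already changed
}

// Returns true if the swap threw. Both objects stay valid, but a was modified.
bool swap_and_report(Flaky& a, Flaky& b) {
    Flaky::assignments_left = 1;  // allow exactly one assignment
    try {
        naive_swap(a, b);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

Starting from a.value == 1 and b.value == 2, the failed swap leaves both objects holding 2: valid objects, so the basic guarantee holds, but the original state of a is gone, so the strong guarantee does not.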

You can easily make it nothrow:
void NumberBuffer::swap(NumberBuffer& rhs) throw()
{
try
{
std::swap(m_desc, rhs.m_desc); //could throw
std::swap(m_data, rhs.m_data);
std::swap(m_n, rhs.m_n);
}
catch(...)
{
}
}
Of course, this is no real solution to the problem, but now you at least got your non-throwing swap ;)

Related

Exception Safety: Benefits of a nothrow swap

If a type has a swap function which cannot fail, this can make it easier for other functions to provide the strong exception safety guarantee. This is because we can first do all of the function's work which may fail "off to the side", and then commit the work using non-throwing swaps.
However, are there any other benefits of guaranteeing that swap will never fail?
For example, is there a situation in which the presence of a no-fail swap makes it easier for another function to provide the basic guarantee?
Let's say I do this:
class C {
T a, b; // invariant: a > b
void swap(C& other);
};
It seems that there's no way to implement C::swap() with basic guarantee if T::swap(T&) might throw. I'd have to add a level of indirection and store T* instead of T.

Making swap faster, easier to use and exception-safe

I could not sleep last night and started thinking about std::swap. Here is the familiar C++98 version:
template <typename T>
void swap(T& a, T& b)
{
T c(a);
a = b;
b = c;
}
If a user-defined class Foo uses external resources, this is inefficient. The common idiom is to provide a method void Foo::swap(Foo& other) and a specialization of std::swap<Foo>. Note that this does not work with class templates since you cannot partially specialize a function template, and overloading names in the std namespace is illegal. The solution is to write a template function in one's own namespace and rely on argument dependent lookup to find it. This depends critically on the client to follow the "using std::swap idiom" instead of calling std::swap directly. Very brittle.
In C++0x, if Foo has a user-defined move constructor and a move assignment operator, providing a custom swap method and a std::swap<Foo> specialization has little to no performance benefit, because the C++0x version of std::swap uses efficient moves instead of copies:
#include <utility>
template <typename T>
void swap(T& a, T& b)
{
T c(std::move(a));
a = std::move(b);
b = std::move(c);
}
Not having to fiddle with swap anymore already takes a lot of burden away from the programmer.
Current compilers do not generate move constructors and move assignment operators automatically yet, but as far as I know, this will change. The only problem left then is exception-safety, because in general, move operations are allowed to throw, and this opens up a whole can of worms. The question "What exactly is the state of a moved-from object?" complicates things further.
Then I was thinking, what exactly are the semantics of std::swap in C++0x if everything goes fine? What is the state of the objects before and after the swap? Typically, swapping via move operations does not touch external resources, only the "flat" object representations themselves.
So why not simply write a swap template that does exactly that: swap the object representations?
#include <cstring>
template <typename T>
void swap(T& a, T& b)
{
unsigned char c[sizeof(T)];
memcpy( c, &a, sizeof(T));
memcpy(&a, &b, sizeof(T));
memcpy(&b, c, sizeof(T));
}
This is as efficient as it gets: it simply blasts through raw memory. It does not require any intervention from the user: no special swap methods or move operations have to be defined. This means that it even works in C++98 (which does not have rvalue references, mind you). But even more importantly, we can now forget about the exception-safety issues, because memcpy never throws.
I can see two potential problems with this approach:
First, not all objects are meant to be swapped. If a class designer hides the copy constructor or the copy assignment operator, trying to swap objects of the class should fail at compile-time. We can simply introduce some dead code that checks whether copying and assignment are legal on the type:
template <typename T>
void swap(T& a, T& b)
{
if (false) // dead code, never executed
{
T c(a); // copy-constructible?
a = b; // assignable?
}
unsigned char c[sizeof(T)];
std::memcpy( c, &a, sizeof(T));
std::memcpy(&a, &b, sizeof(T));
std::memcpy(&b, c, sizeof(T));
}
Any decent compiler can trivially get rid of the dead code. (There are probably better ways to check the "swap conformance", but that is not the point. What matters is that it's possible).
Second, some types might perform "unusual" actions in the copy constructor and copy assignment operator. For example, they might notify observers of their change. I deem this a minor issue, because such kinds of objects probably should not have provided copy operations in the first place.
Please let me know what you think of this approach to swapping. Would it work in practice? Would you use it? Can you identify library types where this would break? Do you see additional problems? Discuss!
So why not simply write a swap template that does exactly that: swap the object representations?
There are many ways in which an object, once constructed, can break when you copy the bytes it resides in. In fact, one could come up with a seemingly endless number of cases where this would not do the right thing, even though in practice it might work in 98% of all cases.
That's because the underlying problem to all this is that, other than in C, in C++ we must not treat objects as if they are mere raw bytes. That's why we have construction and destruction, after all: to turn raw storage into objects and objects back into raw storage. Once a constructor has run, the memory where the object resides is more than only raw storage. If you treat it as if it weren't, you will break some types.
However, essentially, moving objects shouldn't perform that much worse than your idea, because, once you start to recursively inline the calls to std::move(), you usually ultimately arrive at where built-ins are moved. (And if there's more to moving for some types, you'd better not fiddle with the memory of those yourself!) Granted, moving memory en bloc is usually faster than single moves (and it's unlikely that a compiler might find out that it could optimize the individual moves to one all-encompassing std::memcpy()), but that's the price we pay for the abstraction opaque objects offer us. And it's quite small, especially when you compare it to the copying we used to do.
You could, however, have an optimized swap() using std::memcpy() for aggregate types.
This will break class instances that have pointers to their own members. For example:
class SomeClassWithBuffer {
private:
enum {
BUFSIZE = 4096,
};
char buffer[BUFSIZE];
char *currentPos; // meant to point to the current position in the buffer
public:
SomeClassWithBuffer();
SomeClassWithBuffer(const SomeClassWithBuffer &that);
};
SomeClassWithBuffer::SomeClassWithBuffer():
currentPos(buffer)
{
}
SomeClassWithBuffer::SomeClassWithBuffer(const SomeClassWithBuffer &that)
{
memcpy(buffer, that.buffer, BUFSIZE);
currentPos = buffer + (that.currentPos - that.buffer);
}
Now, if you just do memcpy(), where would currentPos point? To the old location, obviously. This will lead to very funny bugs where each instance actually uses another's buffer.
Some types can be swapped but cannot be copied. Unique smart pointers are probably the best example. Checking for copyability and assignability is wrong.
If T isn't a POD type, using memcpy to copy/move is undefined behavior.
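If a byte-wise swap is wanted anyway, it can at least be restricted to types for which copying bytes is well defined. The sketch below assumes C++11, where the relevant property is being trivially copyable rather than POD; raw_swap and Point are hypothetical names.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

template <typename T>
void raw_swap(T& a, T& b) noexcept {
    // Refuse to compile for types with non-trivial copy/move/destroy semantics.
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw_swap requires a trivially copyable type");
    if (&a == &b) return;  // memcpy must not be given overlapping ranges
    unsigned char tmp[sizeof(T)];
    std::memcpy(tmp, &a, sizeof(T));
    std::memcpy(&a, &b, sizeof(T));
    std::memcpy(&b, tmp, sizeof(T));
}

struct Point {
    int x;
    int y;
};
```

With this check, raw_swap on std::string, on a class with a virtual function, or on the self-referencing SomeClassWithBuffer above (which has a user-provided copy constructor) is rejected at compile time instead of silently misbehaving.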
The common idiom is to provide a method void Foo::swap(Foo& other) and a specialization of std::swap<Foo>. Note that this does not work with class templates, …
A better idiom is a non-member swap and requiring users to call swap unqualified, so ADL applies. This also works with templates:
struct NonTemplate {};
void swap(NonTemplate&, NonTemplate&);
template<class T>
struct Template {
friend void swap(Template &a, Template &b) {
using std::swap;
#define S(N) swap(a.N, b.N);
S(each)
S(data)
S(member)
#undef S
}
};
The key is the using declaration for std::swap as a fallback. The friendship for Template's swap is nice for simplifying the definition; the swap for NonTemplate might also be a friend, but that's an implementation detail.
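From the caller's side, the idiom can be sketched like this. The namespace lib, Widget, the call counter, and exchange_values are all hypothetical names used only to show which overload gets picked.

```cpp
#include <cassert>
#include <utility>

namespace lib {
struct Widget {
    int id;
};

int custom_swap_calls = 0;  // instrumentation for the demo only

// Non-member swap in the same namespace as Widget, so ADL can find it.
void swap(Widget& a, Widget& b) noexcept {
    ++custom_swap_calls;
    std::swap(a.id, b.id);
}
}  // namespace lib

// Generic code: std::swap is the fallback, ADL supplies a better match if any.
template <typename T>
void exchange_values(T& a, T& b) {
    using std::swap;
    swap(a, b);  // unqualified on purpose
}
```

For lib::Widget the unqualified call resolves to lib::swap; for int there is no ADL candidate, so std::swap is used. Writing std::swap(a, b) inside exchange_values would bypass lib::swap entirely.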
I deem this a minor issue, because
such kinds of objects probably should
not have provided copy operations in
the first place.
That is, quite simply, a load of wrong. Classes that notify observers and classes that shouldn't be copied are completely unrelated. How about shared_ptr? It obviously should be copyable, but it also obviously notifies an observer- the reference count. Now it's true that in this case, the reference count is the same after the swap, but that's definitely not true for all types and it's especially not true if multi-threading is involved, it's not true in the case of a regular copy instead of a swap, etc. This is especially wrong for classes that can be moved or swapped but not copied.
because in general, move operations
are allowed to throw
They are most assuredly not. It is virtually impossible to guarantee strong exception safety in pretty much any circumstance involving moves when the move might throw. The C++0x definition of the Standard library, from memory, explicitly states any type usable in any Standard container must not throw when moving.
This is as efficient as it gets
That is also wrong. You're assuming that the move of any object is purely the move of its member variables, but it might not be all of them. I might have an implementation-based cache and I might decide that within my class, I should not move this cache. As an implementation detail it is entirely within my rights not to move any member variables that I deem are not necessary to be moved. You, however, want to move all of them.
Now, it's true that your sample code should be valid for a lot of classes. However, it's extremely very definitely not valid for many classes that are completely and totally legitimate, and more importantly, it's going to compile down to that operation anyway if the operation can be reduced to that. This is breaking perfectly good classes for absolutely no benefit.
Your swap version will cause havoc if someone uses it with polymorphic types.
Consider:
Base *b_ptr = new Base(); // Base and Derived contain definitions
Base *d_ptr = new Derived(); // of a virtual function called vfunc()
yourmemcpyswap( *b_ptr, *d_ptr );
b_ptr->vfunc(); //now calls Derived::vfunc, while it should call Base::vfunc
d_ptr->vfunc(); //now calls Base::vfunc while it should call Derived::vfunc
//...
This is wrong, because the object b_ptr points to now holds the vtable pointer of the Derived type, so Derived::vfunc is invoked on an object that isn't of type Derived.
The normal std::swap only swaps the data members of Base, so this is OK with std::swap.

Strong guarantee method calling strong guarantee methods

When I have a method that calls a set of methods that offer strong guarantee, I often have a problem on rolling back changes in order to also have a strong guarantee method too. Let's use an example:
// Would like this to offer strong guarantee
void MacroMethod() throw(...)
{
int i = 0;
try
{
for(i = 0; i < 100; ++i)
SetMethod(i); // this might throw
}
catch(const std::exception& _e)
{
// Undo changes that were done
for(int j = i - 1; j >= 0; --j) // SetMethod(i) threw, so member i was never changed
UnsetMethod(j); // this might throw
throw;
}
}
// Offers strong guarantee
void SetMethod(int i) throw(...)
{
// Does a change on member i
}
// Offers strong guarantee
void UnsetMethod(int i) throw(...)
{
// Undoes a change on member i
}
Obviously, UnsetMethod() could throw, in which case my MacroMethod() only offers the basic guarantee. I did all I could to offer a strong guarantee, but I can't be absolutely sure my UnsetMethod() will not throw. Here are my questions:
Should I even try to offer a strong guarantee in this case?
Should I document my MacroMethod() as having a basic or strong guarantee? Even if it is very unlikely UnsetMethod will throw?
Can you see a way to make this method truly offer a strong guarantee?
I should probably put the call to UnsetMethod() in a try, but that feels rather heavy, and what should I do in the catch?
Thanks!
A good pattern to try to achieve this is to make your method work on a copy of the object that you want to modify. When all modifications are done, you swap the objects (swap should be guaranteed not to throw). This only makes sense if copy and swap can be implemented efficiently.
This method has the advantage that you do not need any try...catch-blocks in your code, and also no cleanup-code. If an exception is thrown, the modified copy gets discarded during stack unwinding, and the original was not modified at all.
1. No, you can't in general with the code you gave. However, depending on your problem, maybe you can make a copy of the data, perform SetMethod on that copy and then swap the representation. This provides the strong guarantee, but again it depends on the problem.
2. You can document: strong guarantee if UnsetMethod doesn't throw, basic otherwise. This also explains why it is said that destructors should not throw. In fact, no undo operation should throw.
3. Yes, see 1.
4. No, it makes no sense.
If a strong exception guarantee is important to MacroMethod(), I would redesign UnsetMethod() to not throw anything at all, if you can. Of course how this can be done depends on what you're doing.
You're using UnsetMethod() to clean up after a failure from SetMethod(). If UnsetMethod() fails to clean up, what can you do about that? It's the same reason why throwing exceptions from destructors is extremely dangerous.
Fundamentally, what you need to do in functions that call throwing functions is work on a temporary copy of the data until all of the possibly-throwing sub functions have finished, then swap (which must not throw) the temporary values in to the 'real' structure.
Some illustrative pseudocode:
void MacroMethod() throw(...)
{
int i = 0;
MyValues working_copy[100] = *this; //obviously this is pseudocode as I don't know your real code
for(i = 0; i < 100; ++i)
working_copy[i].SetMethod(i); // if this throws the exception will propagate out, and no changes will be made
swap( *this, working_copy); // this must not throw
}
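A compilable variant of that pseudocode might look like the following. Since the real class is unknown, Members, its vector of 100 flags, and the fail_at test hook are assumptions invented for the sketch; set plays the role of SetMethod and set_all the role of MacroMethod.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

class Members {
public:
    Members() : m_values(100, 0), m_fail_at(100) {}

    // Test hook (hypothetical): make set(i) throw for this index; 100 = never.
    void fail_at(std::size_t i) { m_fail_at = i; }

    // Stand-in for SetMethod: changes member i, or throws without changing it.
    void set(std::size_t i) {
        if (i == m_fail_at) throw std::runtime_error("set failed");
        m_values[i] = 1;
    }

    // Exchanges internals only; nothing here allocates or throws.
    void swap(Members& other) noexcept {
        m_values.swap(other.m_values);
        std::swap(m_fail_at, other.m_fail_at);
    }

    // Stand-in for MacroMethod: all 100 members are set, or none are.
    void set_all() {
        Members working_copy(*this);        // may throw; *this untouched
        for (std::size_t i = 0; i < 100; ++i)
            working_copy.set(i);            // may throw; *this untouched
        swap(working_copy);                 // commit; cannot throw
    }

    std::size_t count_set() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < m_values.size(); ++i)
            if (m_values[i] == 1) ++n;
        return n;
    }

private:
    std::vector<int> m_values;
    std::size_t m_fail_at;
};
```

No try/catch and no undo loop is needed: if set throws, working_copy is destroyed during stack unwinding and the original object was never touched. The price is one full copy per call.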

Idiomatic use of auto_ptr to transfer ownership to a container

I'm refreshing my C++ knowledge after not having used it in anger for a number of years. In writing some code to implement some data structure for practice, I wanted to make sure that my code was exception safe. So I've tried to use std::auto_ptrs in what I think is an appropriate way. Simplifying somewhat, this is what I have:
class Tree
{
public:
~Tree() { /* delete all Node*s in the tree */ }
void insert(const string& to_insert);
...
private:
struct Node {
...
vector<Node*> m_children;
};
Node* m_root;
};
template<typename T>
void push_back(vector<T*>& v, auto_ptr<T> x)
{
v.push_back(x.get());
x.release();
}
void Tree::insert(const string& to_insert)
{
Node* n = ...; // find where to insert the new node
...
push_back(n->m_children, auto_ptr<Node>(new Node(to_insert)));
...
}
So I'm wrapping the function that would put the pointer into the container, vector::push_back, and relying on the by-value auto_ptr argument to
ensure that the Node* is deleted if the vector resize fails.
Is this an idiomatic use of auto_ptr to save a bit of boilerplate in my
Tree::insert? Any improvements you can suggest? Otherwise I'd have to have
something like:
Node* n = ...; // find where to insert the new node
auto_ptr<Node> new_node(new Node(to_insert));
n->m_children.push_back(new_node.get());
new_node.release();
which kind of clutters up what would have been a single line of code if I wasn't
worrying about exception safety and a memory leak.
(Actually I was wondering if I could post my whole code sample (about 300 lines) and ask people to critique it for idiomatic C++ usage in general, but I'm not sure whether that kind of question is appropriate on stackoverflow.)
It is not idiomatic to write your own container: it is rather exceptional, and for the most part useful only for learning how to write containers. At any rate, it is most certainly not idiomatic to use std::auto_ptr with standard containers. In fact, it's wrong, because copies of std::auto_ptr aren't equivalent: only one auto_ptr owns a pointee at any given time.
As for idiomatic use of std::auto_ptr, you should always name your auto_ptr on construction:
int wtv() { /* ... */ }
void trp(std::auto_ptr<int> p, int i) { /* ... */ }
void safe() {
std::auto_ptr<int> p(new int(12));
trp(p, wtv());
}
void danger() {
trp(std::auto_ptr<int>(new int(12)), wtv());
}
Because the C++ standard allows arguments to evaluate in any arbitrary order, the call to danger() is unsafe. In the call to trp() in danger(), the compiler may allocate the integer, then create the auto_ptr, and finally call wtv(). Or, the compiler may allocate a new integer, call wtv(), and finally create the auto_ptr. If wtv() throws an exception then danger() may or may not leak.
In the case of safe(), however, because the auto_ptr is constructed a-priori, RAII guarantees it will clean up properly whether or not wtv() throws an exception.
Yes it is.
For example, see the interface of the Boost.Pointer Container library. The various pointer containers all feature an insert function taking an auto_ptr which semantically guarantees that they take ownership (they also have the raw pointer version but hey :p).
There are however other ways to achieve what you're doing with regards to exception safety because it's only internal here. To understand it, you need to understand what could go wrong (ie throw) and then reorder your instructions so that the operations that may throw are done with before the side-effects occur.
For example, taking from your post:
auto_ptr<Node> new_node(new Node(to_insert)); // 1
n->m_children.push_back(new_node.get()); // 2
new_node.release(); // 3
Let's check each line:
1. The constructor may throw (for example if the copy constructor of the type throws); in this case, however, you are guaranteed that new will perform the cleanup for you.
2. The call to push_back may throw a std::bad_alloc if the memory is exhausted. That's the only error possible, as copying a pointer is a no-throw operation.
3. This is guaranteed not to throw.
If you look closely, you'll remark that you would not have to worry if you could somehow have 2 being executed before 1. It is in fact possible:
n->m_children.reserve(n->m_children.size() + 1);
n->m_children.push_back(new Node(to_insert));
The call to reserve may throw (bad_alloc), but if it completes normally you are then guaranteed that no reallocation will occur until size becomes equal to capacity and you try another insertion.
The call to new may fail if the constructor throws, in which case new will perform the cleanup. If it completes, you're left with a pointer that is immediately inserted in the vector, and that insertion is guaranteed not to throw because of the line above.
Thus, the use of auto_ptr may be replaced here. It was nonetheless a good idea, though as has been noted you should refrain from executing RAII initialization within a function evaluation.
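Put together as a compilable sketch, the reserve-then-push_back ordering could look like this. The Node layout and the helper add_child are assumptions made for the example (the question elides the real Node), and the deleted copy operations assume C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct Node {
    std::string value;
    std::vector<Node*> children;  // owning raw pointers, as in the question

    explicit Node(const std::string& v) : value(v) {}
    Node(const Node&) = delete;             // copying would double-delete children
    Node& operator=(const Node&) = delete;
    ~Node() {
        for (std::size_t i = 0; i < children.size(); ++i) delete children[i];
    }
};

// Hypothetical helper: appends a new child without leaking on failure.
Node* add_child(Node& parent, const std::string& value) {
    // Step 1: may throw bad_alloc, but nothing has been allocated for the node yet.
    parent.children.reserve(parent.children.size() + 1);
    // Step 2: new may throw (and cleans up after itself); push_back cannot
    // reallocate now, and copying a pointer does not throw.
    parent.children.push_back(new Node(value));
    return parent.children.back();
}
```

One thing to keep in mind: reserve is only required to make the capacity at least the requested value, so calling it with size() + 1 on every insert may reallocate more often than push_back alone would.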
I like the idea of declaring the ownership of a pointer. This is one of the great features in C++0x: std::unique_ptr. However, std::auto_ptr is so hard to understand, and so lethal if even slightly misused, that I would suggest avoiding it entirely.
http://www.gotw.ca/publications/using_auto_ptr_effectively.htm

Benefits of a swap function?

Browsing through some C++ questions I have often seen comments that a STL-friendly class should implement a swap function (usually as a friend.) Can someone explain what benefits this brings, how the STL fits into this and why this function should be implemented as a friend?
For most classes, the default swap is fine. However, the default swap is not optimal in all cases. The most common example is a class using the Pointer to Implementation idiom: whereas the default swap would copy a large amount of memory, a specialized swap can speed things up significantly by swapping only the pointers.
If possible, it shouldn't be a friend of the class. However, it may need to access private data (for example, the raw pointers) which your class probably doesn't want to expose in its API.
The standard version of std::swap() will work for most types that are assignable.
template <typename T>
void swap(T& lhs, T& rhs) // as declared inside namespace std
{
T tmp(lhs);
lhs = rhs;
rhs = tmp;
}
But it is not an optimal implementation as it makes a call to the copy constructor followed by two calls to the assignment operator.
By adding your own version of std::swap() for your class you can implement an optimized version of swap().
For example std::vector. The default implementation as defined above would be very expensive as you would need to make copy of the whole data area. Potentially release old data areas or re-allocate the data area as well as invoke the copy constructor for the contained type on each item copied. A specialized version has a very simple easy way to do std::swap()
// NOTE this is not real code.
// It is just an example to show how much more efficient swapping a vector could
// be. And how using a temporary for the vector object is not required.
std::swap(std::vector<T>& lhs,std::vector<T>& rhs)
{
std::swap(lhs.data,rhs.data); // swap a pointer to the data area
std::swap(lhs.size,rhs.size); // swap a couple of integers with size info.
std::swap(lhs.resv,rhs.resv);
}
As a result if your class can optimize the swap() operation then you should probably do so. Otherwise the default version will be used.
Personally I like to implement swap() as a non-throwing member method, then provide a free swap() overload that forwards to it:
class X
{
public:
// As a side Note:
// This is also useful for any non trivial class
// Allows the implementation of the assignment operator
// using the copy swap idiom.
void swap(X& rhs) throw (); // No throw exception guarantee
};
// Should be in the same namespace as X.
// This allows ADL to find the correct swap when used by objects/functions in
// other namespaces.
void swap(X& lhs,X& rhs)
{
lhs.swap(rhs);
}
If you want to swap (for example) two vectors without knowing anything about their implementation, you basically have to do something like this:
typedef std::vector<int> vec;
void myswap(vec &a, vec &b) {
vec tmp = a;
a = b;
b = tmp;
}
This is not efficient if a and b contain many elements since all those elements are copied between a, b and tmp.
But if the swap function would know about and have access to the internals of the vector, there might be a more efficient implementation possible:
void std::swap(vec &a, vec &b) {
// assuming the elements of the vector are actually stored in some memory area
// pointed to by vec::data
void *tmp = a.data;
a.data = b.data;
b.data = tmp;
// ...
}
In this implementation just a few pointers need to be copied, not all the elements like in the first version. And since this implementation needs access to the internals of the vector it has to be a friend function.
I interpreted your question as basically three different (related) questions.
Why does STL need swap?
Why should a specialized swap be implemented (instead of relying on the default swap)?
Why should it be implemented as a friend?
Why does STL need swap?
The reason an STL friendly class needs swap is that swap is used as a primitive operation in many STL algorithms. (e.g. reverse, sort, partition etc. are typically implemented using swap)
Why should a specialized swap be implemented (instead of relying on the default swap)?
There are many (good) answers to this part of your question already. Basically, knowing the internals of a class frequently allows you to write a much more optimized swap function.
Why should it be implemented as a friend?
The STL algorithms will always call swap as a free function. So it needs to be available as a non member function to be useful. And, since it's only beneficial to write a customized swap when you can use knowledge of internal structures to write a much more efficient swap, this means your free function will need access to the internals of your class, hence a friend.
Basically, it doesn't have to be a friend, but if it doesn't need to be a friend, there's usually no reason to implement a custom swap either.
Note that you should make sure the free function is inside the same namespace as your class, so that the STL algorithms can find your free function via Koenig lookup (argument-dependent lookup).
One other use of the swap function is to aid exception-safe code: http://www.gotw.ca/gotw/059.htm
Efficiency:
If you've got a class that holds (smart) pointers to data then it's likely to be faster to swap the pointers than to swap the actual data - 3 pointer copies vs. 3 deep copies.
If you use a 'using std::swap' + an unqualified call to swap (or just a qualified call to boost::swap), then ADL will pick up the custom swap function, allowing efficient template code to be written.
Safety:
Pointer swaps (raw pointers, std::auto_ptr and std::tr1::shared_ptr) do not throw, so can be used to implement a non-throwing swap. A non-throwing swap makes it easier to write code that provides the strong exception guarantee (transactional code).
The general pattern is:
class MyClass
{
//other members etc...
void method()
{
MyClass finalState(*this);//copy the current class
finalState.f1();//a series of function calls that can modify the internal
finalState.f2();//state of finalState and/or throw.
finalState.f3();
//this only gets called if no exception is thrown - so either the entire function
//completes, or no change is made to the object's state at all.
swap(*this,finalState);
}
};
As for whether it should be implemented as friend; swapping usually requires knowledge of implementation details. It's a matter of taste whether to use a non-friend that calls a member function or to use a friend.
Problems:
A custom swap is often faster than a single assignment - but a single assignment is always faster than the default three assignment swap. If you want to move an object, it's impossible to know in a generic way whether a swap or assignment would be best - a problem which C++0x solves with move constructors.
To implement assignment operators:
class C
{
C(C const&);
void swap(C&) throw();
C& operator=(C x) { this->swap(x); return *this; }
};
This is exception safe, the copy is done via the copy constructor when you pass by value, and the copy can be optimized out by the compiler when you pass a temporary (via copy elision).
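A fuller version of that assignment operator, as a sketch: IntArray is a hypothetical resource-owning class chosen for illustration, and noexcept assumes C++11 (the original uses throw() for the same purpose).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class IntArray {
public:
    explicit IntArray(std::size_t n, int fill = 0)
        : m_size(n), m_data(new int[n]) {
        std::fill(m_data, m_data + n, fill);
    }

    // The only operation in this class that can fail when assigning.
    IntArray(const IntArray& other)
        : m_size(other.m_size), m_data(new int[other.m_size]) {
        std::copy(other.m_data, other.m_data + other.m_size, m_data);
    }

    ~IntArray() { delete[] m_data; }

    void swap(IntArray& other) noexcept {
        std::swap(m_size, other.m_size);
        std::swap(m_data, other.m_data);
    }

    // Pass by value: the copy is made before *this is touched. If the copy
    // throws, the assignment never starts. The old buffer is released when
    // the parameter goes out of scope.
    IntArray& operator=(IntArray other) {
        swap(other);
        return *this;
    }

    std::size_t size() const { return m_size; }
    int& operator[](std::size_t i) { return m_data[i]; }
    const int& operator[](std::size_t i) const { return m_data[i]; }

private:
    std::size_t m_size;
    int* m_data;
};
```

Self-assignment needs no special case here: the parameter is an independent copy, so swapping with it is harmless, just not free.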