I have a custom class, and I'd like to minimize the chances that someone on my team accidentally copies it, as that could break certain invariants within our system. To this end, I made the copy constructor private, as there is no reason anyone should need to copy it in any legitimate usage of the class.
However, under the hood of the framework that the class is a part of, the object must be copy-constructed into a std::tuple. I tried to use friend, but the compiler still complains, as the inner class(es?) of std::tuple require friend access as well.
What is the best way to get what I want?
If the framework requires your class to be copyable, you really should provide a copyable class.
If your class really is only movable, or not even that, then maybe the framework should have a std::unique_ptr or similar to the object instead? Or you could create a movable adaptor class around that std::unique_ptr which forwards the interface...
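As a rough sketch of the adaptor idea, assuming the framework can be satisfied with a move rather than a copy: Widget, WidgetHandle and packIntoTuple below are hypothetical names, not part of any real framework.

```cpp
#include <cassert>
#include <memory>
#include <tuple>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for the class that must never be copied.
class Widget {
public:
    explicit Widget(int id) : id_(id) {}
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
    int id() const { return id_; }
private:
    int id_;
};

// Movable (but not copyable) adaptor that owns a Widget and forwards its interface.
// A moved-from handle holds a null pointer and must not be used.
class WidgetHandle {
public:
    explicit WidgetHandle(int id) : impl_(new Widget(id)) {}
    int id() const { return impl_->id(); }
private:
    std::unique_ptr<Widget> impl_;
};

// The handle is moved into the tuple; the Widget itself never changes address.
inline std::tuple<WidgetHandle, int> packIntoTuple(WidgetHandle h) {
    return std::tuple<WidgetHandle, int>(std::move(h), 0);
}
```

Copying a WidgetHandle still fails to compile, because the std::unique_ptr member makes the implicit copy constructor deleted.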
Whether a class is movable and/or copyable is part of its public interface. If you make it non-copyable, then unless it only ever lives inside the target application area, this limits code reuse, and it may confuse potential users of the class as to whether or not it is safe to copy / move it.
It may be that the framework doesn't really need to make a copy, and can be refactored to make moves instead?
It's very unclear from the question why you don't want it to be copyable. You seem to say that bad things will happen, but for some reason you aren't concerned if the framework makes a copy. Is it really okay to make copies or not?
It may be that you need to make a separate system for tracking / enforcing the invariant that you are concerned about, rather than just try to prohibit copying this class.
I used to think C++'s object model is very robust when best practices are followed.
Just a few minutes ago, though, I had a realization that I hadn't had before.
Consider this code:
class Foo
{
std::set<size_t> set;
std::vector<std::set<size_t>::iterator> vector;
// ...
// (assume every method ensures each iterator in vector always points to a valid element of set)
};
I have written code like this. And until today, I hadn't seen a problem with it.
But, thinking about it a bit more, I realized that this class is very broken:
Its copy-constructor and copy-assignment copy the iterators inside the vector, which implies that they will still point to the old set! The new one isn't a true copy after all!
In other words, I must manually implement the copy-constructor even though this class isn't managing any resources (no RAII)!
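To make the failure concrete, here is a small sketch that reduces the class to public members (named s and v for brevity; NaiveFoo and copyAliasesOriginal are made-up names) and checks where the copied iterator actually points.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical reduction of the class above, with public members for brevity.
struct NaiveFoo {
    std::set<std::size_t> s;
    std::vector<std::set<std::size_t>::iterator> v;

    void add(std::size_t value) {
        // insert() returns the iterator to the new or already existing element
        v.push_back(s.insert(value).first);
    }
};

// Returns true if the copy's stored iterator still refers to an element owned
// by the original's set rather than by the copy's own set.
inline bool copyAliasesOriginal() {
    NaiveFoo original;
    original.add(42);
    NaiveFoo copy = original;  // implicitly generated memberwise copy
    const std::size_t* viaCopy = &*copy.v.front();
    const std::size_t* inOriginal = &*original.s.begin();
    const std::size_t* inCopy = &*copy.s.begin();
    return viaCopy == inOriginal && viaCopy != inCopy;
}
```

If the original is destroyed before the copy, every iterator held by the copy dangles.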
This strikes me as astonishing. I've never come across this issue before, and I don't know of any elegant way to solve it. Thinking about it a bit more, it seems to me that copy construction is unsafe by default -- in fact, it seems to me that classes should not be copyable by default, because any kind of coupling between their instance variables risks rendering the default copy-constructor invalid.
Are iterators fundamentally unsafe to store? Or, should classes really be non-copyable by default?
The solutions I can think of, below, are all undesirable, as they don't let me take advantage of the automatically-generated copy constructor:
Manually implement a copy constructor for every nontrivial class I write. This is not only error-prone, but also painful to write for a complicated class.
Never store iterators as member variables. This seems severely limiting.
Disable copying by default on all classes I write, unless I can explicitly prove they are correct. This seems to run entirely against C++'s design, which is for most types to have value semantics, and thus be copyable.
Is this a well-known problem, and if so, does it have an elegant/idiomatic solution?
C++ copy/move ctor/assign are safe for regular value types. Regular value types behave like integers or other "regular" values.
They are also safe for pointer semantic types, so long as the operation does not change what the pointer "should" point to. Pointing to something "within yourself", or another member, is an example of where it fails.
They are somewhat safe for reference semantic types, but mixing pointer/reference/value semantics in the same class tends to be unsafe/buggy/dangerous in practice.
The rule of zero is that you make classes that behave like either regular value types, or pointer semantic types that don't need to be reseated on copy/move. Then you don't have to write copy/move ctors.
Iterators follow pointer semantics.
The idiomatic/elegant way around this is to tightly couple the iterator container with the pointed-into container, and block or write the copy ctor there. They aren't really separate things once one contains pointers into the other.
Yes, this is a well known "problem" -- whenever you store pointers in an object, you're probably going to need some kind of custom copy constructor and assignment operator to ensure that the pointers are all valid and point at the expected things.
Since iterators are just an abstraction of collection element pointers, they have the same issue.
Is this a well-known problem?
Well, it is known, but I would not say well-known. Sibling pointers do not occur often, and most implementations I have seen in the wild were broken in exactly the same way as yours is.
I believe the problem to be infrequent enough to have escaped most people's notice; interestingly, as I follow more Rust than C++ nowadays, it crops up there quite often because of the strictness of the type system (ie, the compiler refuses those programs, prompting questions).
does it have an elegant/idiomatic solution?
There are many types of sibling pointers situations, so it really depends, however I know of two generic solutions:
keys
shared elements
Let's review them in order.
When pointing to a class member, or pointing into an indexable container, one can use an offset or key rather than an iterator. It is slightly less efficient (and might require a look-up); however, it is a fairly simple strategy. I have seen it used to great effect in shared-memory situations (where using pointers is a no-no since the shared-memory area may be mapped at different addresses).
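A minimal sketch of the key-based approach, assuming the elements themselves can serve as keys; KeyedFoo and its member names are illustrative only. Because the vector stores values rather than iterators, the implicitly generated copy is already correct.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Sketch: remember elements by key instead of by iterator.
class KeyedFoo {
public:
    // Records the value in insertion order if it was not already in the set.
    void add(std::size_t value) {
        if (s_.insert(value).second) order_.push_back(value);
    }
    // Resolves the i-th remembered key against this object's own set (one lookup).
    std::set<std::size_t>::const_iterator at(std::size_t i) const {
        return s_.find(order_.at(i));
    }
    std::set<std::size_t>::const_iterator end() const { return s_.end(); }
private:
    std::set<std::size_t> s_;
    std::vector<std::size_t> order_;
};
```

The cost is one set lookup per access, which is the look-up mentioned above.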
The other solution is used by Boost.MultiIndex, and consists of an alternative memory layout. It stems from the principle of the intrusive container: instead of putting the element into the container (moving it in memory), an intrusive container uses hooks already inside the element to wire it at the right place. Starting from there, it is easy enough to use different hooks to wire a single element into multiple containers, right?
Well, Boost.MultiIndex kicks it two steps further:
It uses the traditional container interface (ie, move your object in), but the node to which the object is moved in is an element with multiple hooks
It uses various hooks/containers in a single entity
You can check various examples and notably Example 5: Sequenced Indices looks a lot like your own code.
Is this a well-known problem
Yes. Any time you have a class that contains pointers, or pointer-like data like an iterator, you have to implement your own copy-constructor and assignment-operator to ensure the new object has valid pointers/iterators.
and if so, does it have an elegant/idiomatic solution?
Maybe not as elegant as you might like, and probably is not the best in performance (but then, copies sometimes are not, which is why C++11 added move semantics), but maybe something like this would work for you (assuming the std::vector contains iterators into the std::set of the same parent object):
#include <algorithm>  // std::for_each
#include <cstddef>    // size_t
#include <set>
#include <vector>

class Foo
{
private:
    std::set<size_t> s;
    std::vector<std::set<size_t>::iterator> v;

    // Functor that re-seats one of the source's iterators onto foo's own set.
    struct findAndPushIterator
    {
        Foo &foo;
        findAndPushIterator(Foo &f) : foo(f) {}
        void operator()(const std::set<size_t>::iterator &iter)
        {
            std::set<size_t>::iterator found = foo.s.find(*iter);
            if (found != foo.s.end())
                foo.v.push_back(found);
        }
    };

public:
    Foo() {}
    Foo(const Foo &src)
    {
        *this = src;
    }
    Foo& operator=(const Foo &rhs)
    {
        if (this != &rhs) // self-assignment would otherwise clear v before it is read
        {
            v.clear();
            s = rhs.s;
            v.reserve(rhs.v.size());
            std::for_each(rhs.v.begin(), rhs.v.end(), findAndPushIterator(*this));
        }
        return *this;
    }
    //...
};
Or, if using C++11:
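#include <algorithm>  // std::for_each
#include <cstddef>    // size_t
#include <set>
#include <vector>

class Foo
{
private:
    std::set<size_t> s;
    std::vector<std::set<size_t>::iterator> v;

public:
    Foo() {}
    Foo(const Foo &src)
    {
        *this = src;
    }
    Foo& operator=(const Foo &rhs)
    {
        if (this != &rhs) // self-assignment would otherwise clear v before it is read
        {
            v.clear();
            s = rhs.s;
            v.reserve(rhs.v.size());
            // Re-seat each of rhs's iterators onto the equivalent element of our own set.
            std::for_each(rhs.v.begin(), rhs.v.end(),
                [this](const std::set<size_t>::iterator &iter)
                {
                    std::set<size_t>::iterator found = s.find(*iter);
                    if (found != s.end())
                        v.push_back(found);
                }
            );
        }
        return *this;
    }
    //...
};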
Yes, of course it's a well-known problem.
If your class stored pointers, as an experienced developer you would intuitively know that the default copy behaviours may not be sufficient for that class.
Your class stores iterators and, since they are also "handles" to data stored elsewhere, the same logic applies.
This is hardly "astonishing".
The assertion that Foo is not managing any resources is false.
Copy-constructor aside, if an element of set is removed, there must be code in Foo that manages vector so that the respective iterator is removed.
I think the idiomatic solution is to just use one container, a vector<size_t>, and check that the count of an element is zero before inserting. Then the copy and move defaults are fine.
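A sketch of that single-container idea, under the assumption that a linear search before each insert is acceptable; UniqueSequence is a made-up name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Keeps insertion order and uniqueness in one vector, so the implicitly
// generated copy and move operations are correct as they are.
class UniqueSequence {
public:
    // Inserts value only if it is not already present; returns whether it was added.
    bool insert(std::size_t value) {
        if (std::count(data_.begin(), data_.end(), value) != 0) return false;
        data_.push_back(value);
        return true;
    }
    std::size_t size() const { return data_.size(); }
    std::size_t operator[](std::size_t i) const { return data_[i]; }
private:
    std::vector<std::size_t> data_;
};
```

Each insert is linear in the number of stored elements, so this trades the set's logarithmic lookup for simpler copy semantics.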
"Inherently unsafe"
No, the features you mention are not inherently unsafe; the fact that you thought of three possible safe solutions to the problem is evidence that there is no "inherent" lack of safety here, even though you think the solutions are undesirable.
And yes, there is RAII here: the containers (set and vector) are managing resources. I think your point is that the RAII is "already taken care of" by the std containers. But you need to then consider the container instances themselves to be "resources", and in fact your class is managing them. You're correct that you're not directly managing heap memory, because this aspect of the management problem is taken care of for you by the standard library. But there's more to the management problem, which I'll talk a bit more about below.
"Magic" default behavior
The problem is that you are apparently hoping that you can trust the default copy constructor to "do the right thing" in a non-trivial case such as this. I'm not sure why you expected the right behavior--perhaps you're hoping that memorizing rules-of-thumb such as the "rule of 3" will be a robust way to ensure that you don't shoot yourself in the foot? Certainly that would be nice (and, as pointed out in another answer, Rust goes much further than other low-level languages toward making foot-shooting much harder), but C++ simply isn't designed for "thoughtless" class design of that sort, nor should it be.
Conceptualizing constructor behavior
I'm not going to try to address the question of whether this is a "well-known problem", because I don't really know how well-characterized the problem of "sister" data and iterator-storing is. But I hope that I can convince you that, if you take the time to think about copy-constructor-behavior for every class you write that can be copied, this shouldn't be a surprising problem.
In particular, when deciding to use the default copy-constructor, you must think about what the default copy-constructor will actually do: namely, it will call the copy-constructor of each non-primitive, non-union member (i.e. members that have copy-constructors) and bitwise-copy the rest.
When copying your vector of iterators, what does std::vector's copy-constructor do? It performs a "deep copy", i.e., the data inside the vector is copied. Now, if the vector contains iterators, how does that affect the situation? Well, it's simple: the iterators are the data stored by the vector, so the iterators themselves will be copied. What does an iterator's copy-constructor do? I'm not going to actually look this up, because I don't need to know the specifics: I just need to know that iterators are like pointers in this (and other) respects, and copying a pointer just copies the pointer itself, not the data pointed to. I.e., iterators and pointers do not have deep-copying by default.
Note that this is not surprising: of course iterators don't do deep-copying by default. If they did, you'd get a different, new set for each iterator being copied. And this makes even less sense than it initially appears: for instance, what would it actually mean if uni-directional iterators made deep-copies of their data? Presumably you'd get a partial copy, i.e., all the remaining data that's still "in front of" the iterator's current position, plus a new iterator pointing to the "front" of the new data structure.
Now consider that there is no way for a copy-constructor to know the context in which it's being called. For instance, consider the following code:
using iter = std::set<size_t>::iterator; // use typedef pre-C++11
std::vector<iter> foo = getIters(); // get a vector of iterators
useIters(foo); // pass vector by value
When getIters is called, the return value might be moved, but it might also be copy-constructed. The initialization of foo also invokes a copy-constructor, though this may also be elided. And unless useIters takes its argument by reference, then you've also got a copy constructor call there.
In any of these cases, would you expect the copy constructor to change which std::set is pointed to by the iterators contained by the std::vector<iter>? Of course not! So naturally std::vector's copy-constructor can't be designed to modify the iterators in that particular way, and in fact std::vector's copy-constructor is exactly what you need in most cases where it will actually be used.
However, suppose std::vector could work like this: suppose it had a special overload for "vector-of-iterators" that could re-seat the iterators, and that the compiler could somehow be "told" only to invoke this special constructor when the iterators actually need to be re-seated. (Note that the solution of "only invoke the special overload when generating a default copy constructor for a containing class that also contains an instance of the iterators' underlying data type" wouldn't work; what if the std::vector iterators in your case were pointing at a different standard set, and were being treated simply as a reference to data managed by some other class? Heck, how is the compiler supposed to know whether the iterators all point to the same std::set?) Ignoring this problem of how the compiler would know when to invoke this special constructor, what would the constructor code look like? Let's try it, using _Ctnr<T>::iterator as our iterator type (I'll use C++11/14isms and be a bit sloppy, but the overall point should be clear):
template <typename T, typename _Ctnr>
std::vector< _Ctnr<T>::iterator> (const std::vector< _Ctnr<T>::iterator>& rhs)
: _data{ /* ... */ } // initialize underlying data...
{
for (const auto& i : rhs)
{
_data.emplace_back( /* ... */ ); // What do we put here?
}
}
Okay, so we want each new, copied iterator to be re-seated to refer to a different instance of _Ctnr<T>. But where would this information come from? Note that the copy-constructor can't take the new _Ctnr<T> as an argument: then it would no longer be a copy-constructor. And in any case, how would the compiler know which _Ctnr<T> to provide? (Note, too, that for many containers, finding the "corresponding iterator" for the new container may be non-trivial.)
Resource management with std:: containers
This isn't just an issue of the compiler not being as "smart" as it could or should be. This is an instance where you, the programmer, have a specific design in mind that requires a specific solution. In particular, as mentioned above, you have two resources, both std:: containers. And you have a relationship between them. Here we get to something that most of the other answers have stated, and which by this point should be very, very clear: related class members require special care, since C++ does not manage this coupling by default. But what I hope is also clear by this point is that you shouldn't think of the problem as arising specifically because of data-member coupling; the problem is simply that default copy-construction isn't magic, and the programmer must be aware of the requirements for correctly copying a class before deciding to let the implicitly-generated constructor handle copying.
The elegant solution
...And now we get to aesthetics and opinions. You seem to find it inelegant to be forced to write a copy-constructor when you don't have any raw pointers or arrays in your class that must be manually managed.
But user-defined copy constructors are elegant; allowing you to write them is C++'s elegant solution to the problem of writing correct non-trivial classes.
Admittedly, this seems like a case where the "rule of 3" doesn't quite apply, since there's a clear need to either =delete the copy-constructor or write it yourself, but there's no clear need (yet) for a user-defined destructor. But again, you can't simply program based on rules of thumb and expect everything to work correctly, especially in a low-level language such as C++; you must be aware of the details of (1) what you actually want and (2) how that can be achieved.
So, given that the coupling between your std::set and your std::vector actually creates a non-trivial problem, solving the problem by wrapping them together in a class that correctly implements (or simply deletes) the copy-constructor is actually a very elegant (and idiomatic) solution.
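For the "simply deletes" variant, a minimal sketch might look like the following; CoupledFoo and its members are illustrative names. Note that declaring the copy operations as deleted also suppresses the implicit move operations, so this class is neither copyable nor movable unless moves are added explicitly.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <type_traits>
#include <vector>

// Wraps the coupled containers together and blocks copying outright.
class CoupledFoo {
public:
    CoupledFoo() = default;
    CoupledFoo(const CoupledFoo&) = delete;
    CoupledFoo& operator=(const CoupledFoo&) = delete;

    // Stores the value in the set and remembers an iterator to it.
    void add(std::size_t value) { v_.push_back(s_.insert(value).first); }
    std::size_t tracked() const { return v_.size(); }
    std::size_t distinct() const { return s_.size(); }
private:
    std::set<std::size_t> s_;
    std::vector<std::set<std::size_t>::iterator> v_;
};

static_assert(!std::is_copy_constructible<CoupledFoo>::value, "copy construction is blocked");
static_assert(!std::is_copy_assignable<CoupledFoo>::value, "copy assignment is blocked");
```

Any accidental pass-by-value then becomes a compile error instead of a silent aliasing bug.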
Explicitly defining versus deleting
You mention a potential new "rule of thumb" to follow in your coding practices: "Disable copying by default on all classes I write, unless I can explicitly prove they are correct." While this might be a safer rule of thumb (at least in this case) than the "rule of 3" (especially when your criterion for "do I need to implement the 3" is to check whether a destructor is required), my above caution against relying on rules of thumb still applies.
But I think the solution here is actually simpler than the proposed rule of thumb. You don't need to formally prove the correctness of the default method; you simply need to have a basic idea of what it would do, and what you need it to do.
Above, in my analysis of your particular case, I went into a lot of detail--for instance, I brought up the possibility of "deep-copying iterators". You don't need to go into this much detail to determine whether or not the default copy-constructor will work correctly. Instead, simply imagine what your manually-created copy constructor will look like; you should be able to tell pretty quickly how similar your imaginary explicitly-defined constructor is to the one the compiler would generate.
For example, a class Foo containing a single vector data will have a copy constructor that looks like this:
Foo::Foo(const Foo& rhs)
: data{rhs.data}
{}
Without even writing that out, you know that you can rely on the implicitly-generated one, because it's exactly the same as what you'd have written above.
Now, consider the constructor for your class Foo:
Foo::Foo(const Foo& rhs)
: set{rhs.set}
, vector{ /* somehow use both rhs.set AND rhs.vector */ } // ...????
{}
Right away, given that simply copying vector's members won't work, you can tell that the default copy constructor won't work. So now you need to decide whether your class needs to be copyable or not.
I've written a simple linked list because a recent interview programming challenge showed me how rusty my C++ has gotten. On my list I declared a private copy constructor because I wanted to explicitly avoid making any copies (and of course, laziness). I ran into some trouble when I wanted to return an object by value that owns one of my lists.
class Foo
{
MyList<int> list; // MyList has private copy constructor
public:
Foo() {};
};
class Bar
{
public:
Bar() {};
Foo getFoo()
{
return Foo();
}
};
I get a compiler error saying that MyList has a private copy constructor when I try to return a Foo object by value. Shouldn't return-value optimization negate the need for any copying? Am I required to write a copy constructor? I'd never heard of move constructors until I started looking for solutions to this problem; is that the best solution? If so, I'll have to read up on them. If not, what is the preferred way to solve this problem?
The standard explicitly states that the constructor still needs to be accessible, even if it is optimized away. See 12.8/32 in a recent draft.
I prefer making an object movable and non-copyable in such situations. It makes ownership very clear and explicit.
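A sketch of what movable-but-not-copyable could look like for the list in the question. The internals of MyList are not shown in the question, so the node layout, pushFront, size and makeFoo below are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical minimal singly linked list; only the ownership-related parts are shown.
template <typename T>
class MyList {
    struct Node { T value; Node* next; };
public:
    MyList() = default;
    MyList(const MyList&) = delete;
    MyList& operator=(const MyList&) = delete;
    // Moving steals the node chain and leaves the source empty.
    MyList(MyList&& other) noexcept : head_(other.head_), size_(other.size_) {
        other.head_ = nullptr;
        other.size_ = 0;
    }
    MyList& operator=(MyList&& other) noexcept {
        if (this != &other) {
            clear();
            head_ = other.head_;
            size_ = other.size_;
            other.head_ = nullptr;
            other.size_ = 0;
        }
        return *this;
    }
    ~MyList() { clear(); }

    void pushFront(const T& value) { head_ = new Node{value, head_}; ++size_; }
    std::size_t size() const { return size_; }
private:
    void clear() {
        while (head_) { Node* next = head_->next; delete head_; head_ = next; }
        size_ = 0;
    }
    Node* head_ = nullptr;
    std::size_t size_ = 0;
};

class Foo {
public:
    void add(int v) { list_.pushFront(v); }
    std::size_t size() const { return list_.size(); }
private:
    MyList<int> list_;  // Foo gets an implicit move constructor and a deleted copy constructor
};

inline Foo makeFoo() {
    Foo f;
    f.add(1);
    f.add(2);
    return f;  // moved or elided, never copied
}
```

Foo itself needs no special member functions: it inherits the move-only behaviour from its member.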
Otherwise, your users can always use a shared_ptr. Hiding shared ownership is at best a questionable idea (unless you can guarantee all your values are immutable).
The basic problem is that return by value might copy. The C++ implementation is not required by the standard to apply copy elision even where it is permitted. That's why the object still has to be copyable: so that the implementation's decision about when to use it doesn't affect whether the code is well-formed.
Anyway, it doesn't necessarily apply to every copy that the user might like it to. For example there is no elision of copy assignment.
I think your options are:
implement a proper copy. If someone ends up with a slow program due to copying it, then their profiler will tell them; you don't have to make it your job to stop them if you don't want to.
implement a proper move, but no copy (C++11 only).
change getFoo to take a Foo& (or maybe Foo*) parameter, and avoid a copy by somehow mutating their object. An efficient swap would come in handy for that. This is fairly pointless if getFoo really returns a default-constructed Foo as in your example, since the caller needs to construct a Foo before they call getFoo.
return a dynamically-allocated Foo wrapped in a smart pointer: either auto_ptr or unique_ptr. Functions defined to create an object and transfer sole ownership to their caller should not return shared_ptr since it has no release() function.
provide a copy constructor but make it blow up somehow (fail to link, abort, throw an exception) if it's ever used. The problems with this are (1) it's doomed to fail but the compiler says nothing, (2) you're enforcing quality of implementation, so your class doesn't work if someone deliberately disables RVO for whatever reason.
I may have missed some.
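The smart-pointer option from the list above can be sketched roughly as follows, using std::unique_ptr (C++11). This Foo is a hypothetical stand-in with an int payload, not the class from the question.

```cpp
#include <cassert>
#include <memory>

// Hypothetical type that is neither copyable nor movable.
class Foo {
public:
    explicit Foo(int v) : value_(v) {}
    Foo(const Foo&) = delete;
    Foo& operator=(const Foo&) = delete;
    int value() const { return value_; }
private:
    int value_;
};

// Transfers sole ownership to the caller; only the pointer is moved, never the Foo.
inline std::unique_ptr<Foo> getFoo() {
    return std::unique_ptr<Foo>(new Foo(42));
}
```

A caller who does want shared ownership can still convert the result, since std::shared_ptr can be constructed from a std::unique_ptr rvalue.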
The solution would be implementing your own copy constructor that would use other methods of MyList to implement the copy semantics.
... I wanted to explicitly avoid making any copies
You have to choose. Either you can't make copies of an object, like std::istream; then you have to hold such objects in pointers/references, since these can be copied (in C++11, you can use move semantics instead). Or you implement the copy constructor, which is probably easier than solving problems at each place a copy is needed.
I was just musing about the number of questions here that either are about the "big three" (copy constructor, assignment operator and destructor) or about problems caused by them not being implemented correctly, when it occurred to me that I could not remember the last time I had implemented them myself. A swift grep on my two most active projects indicates that I implement all three in only one class out of about 150.
That's not to say I don't implement/declare one or more of them - obviously base classes need a virtual destructor, and a large number of my classes forbid copying using the private copy ctor & assignment op idiom. But fully implemented, there is this single lonely class, which does some reference counting.
So I was wondering am I unusual in this? How often do you implement all three of these functions? Is there any pattern to the classes where you do implement them?
I think that it's rare that you need all three. Most classes that require an explicit destructor aren't really suitable for copying.
It's just better design to use self-destructing members (which normally don't require things like copy-construction) than a big explicit destructor.
I rarely implement them, but often declare them private (copy constructors and assignment operators, that is).
Like you, almost never.
But I'm not tied to the STL approach of programming where you copy everything in and around in containers - usually if it's not a primitive, I'll use a pointer, smart or otherwise.
I mainly use RAII patterns, thus avoid writing destructors. Although, I do put empty bodies in my .cc file to help keep code bloat down.
And, like you, I'll declare them private and unimplemented to prevent any accidental invoking.
It really depends on what type of problems you are working on. I have been working on a new project for the past few months and I think every class inherits from boost::noncopyable. Nine months ago I worked on a different project that used PODs quite a bit and I leveraged automatic copy ctor and assignment operator. If you are using boost::shared_ptr (and you should be), it should be rare to write your own copy ctor or assignment operator nowadays.
Most of the time, hardly ever. This is because the members that are used (reference based smart ptr, etc) already implement the proper semantics, or the object is non-copyable.
A few patterns come up when I find myself implementing these:
destructive copy, i.e. move pattern like auto_ptr or lock
dispose pattern, which hardly ever comes up in C++, but I've used it about three times in my career (and just a week ago in fact)
pimpl pattern, where the pimpl is fwd declared in the header, and managed by a smart ptr. Then the empty dtor goes in the .cc file but still classifies as "not compiler generated"
And one other trivial one that prints "I was destroyed" when I think I might have a circular reference somewhere and just want to make sure.
Any class that owns pointer members needs to define these three operations to implement deep copy (See here for a deep description).
I have a Shape class containing potentially many vertices, and I was contemplating making copy-constructor/copy-assignment private to prevent accidental needless copying of my heavyweight class (for example, passing by value instead of by reference).
To make a copy of Shape, one would have to deliberately call a "clone" or "duplicate" method.
Is this good practice? I wonder why STL containers don't use this approach, as I rarely want to pass them by value.
Restricting your users isn't always a good idea. Just documenting that copying may be expensive is enough. If a user really wants to copy, then using the native syntax of C++ by providing a copy constructor is a much cleaner approach.
Therefore, I think the real answer depends on the context. Perhaps the real class you're writing (not the imaginary Shape) shouldn't be copied, perhaps it should. But as a general approach, I certainly can't say that one should discourage users from copying large objects by forcing them to use explicit method calls.
IMHO, whether to provide a copy constructor and assignment operator depends more on what your class models than on the cost of copying.
If your class represents values, that is, if passing an object or a copy of the object doesn't make a difference, then provide them (and provide the equality operator as well).
If it doesn't, that is, if you think that objects of the class have an identity and a state (one also speaks of entities), don't. If a copy makes sense, provide it with a clone or copy member.
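For an entity-like class where a copy still makes sense, the "clone member" idea can be sketched like this. The Vertex layout and the member names are assumptions; the Shape in the question is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

struct Vertex { double x; double y; };

class Shape {
public:
    Shape() = default;
    Shape(Shape&&) = default;
    Shape& operator=(Shape&&) = default;

    void addVertex(Vertex v) { vertices_.push_back(v); }
    std::size_t vertexCount() const { return vertices_.size(); }

    // Copying must be requested by name; pass-by-value of an lvalue does not compile.
    Shape clone() const { return Shape(*this); }
private:
    Shape(const Shape&) = default;            // private: only members such as clone() may copy
    Shape& operator=(const Shape&) = delete;
    std::vector<Vertex> vertices_;
};
```

Keeping the move operations public means a Shape can still be returned from functions and stored in containers that only need to move their elements.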
There are sometimes classes you can't easily classify. Containers are in that position. It is meaningful to consider them as entities, pass them only by reference, and have special operations to make a copy when needed. You can also consider them simply as aggregations of values, so copying makes sense. The STL was designed around value types, and as everything is a value, it makes sense for containers to be so. That allows things like map<int, list<> >, which are useful. (Remember, you can't put non-copyable classes in an STL container.)
Generally, you do not make classes non-copyable just because they are heavy (you gave a good example yourself: the STL).
You make them non-copyable when they are connected to some non-copyable resource like a socket, file, or lock, or when they are not designed to be copied at all (for example, they have internal structures that can hardly be deep-copied).
However, in your case your object is copyable, so leave it as it is.
A small note about clone(): it is used as a polymorphic copy constructor, so it has a different meaning and is used differently.
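To illustrate that distinction, here is a rough sketch of clone() as a polymorphic copy, where the caller holds only a base-class reference. The hierarchy is hypothetical, and std::unique_ptr (C++11) is used for the return type.

```cpp
#include <cassert>
#include <memory>

class Shape {
public:
    virtual ~Shape() = default;
    // Copies the dynamic type of the object, which a copy constructor cannot do.
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual int sides() const = 0;
};

class Triangle : public Shape {
public:
    std::unique_ptr<Shape> clone() const override {
        return std::unique_ptr<Shape>(new Triangle(*this));
    }
    int sides() const override { return 3; }
};
```

Here clone() exists to avoid slicing, not to make copying deliberately awkward.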
Most programmers are already aware of the cost of copying various objects, and know how to avoid copies, using techniques such as pass by reference.
Note the STL's vector, string, map, list etc. could all be variously considered 'heavyweight' objects (especially something like a vector with 10,000 elements!). Those classes all still provide copy constructors and assignment operators, so if you know what you're doing (such as making a std::list of vectors), you can copy them when necessary.
So if it's useful, provide them anyway, but be sure to document they are expensive operations.
Depending on your needs...
If you want to ensure that a copy won't happen by mistake, and making a copy would cause a severe bottleneck or simply doesn't make sense, then this is good practice. Compile errors are better than performance investigations.
If you are not sure how your class will be used, and are unsure if it's a good idea or not then it is not good practice. Most of the time you would not limit your class in this way.