I was just musing about the number of questions here that either are about the "big three" (copy constructor, assignment operator and destructor) or about problems caused by them not being implemented correctly, when it occurred to me that I could not remember the last time I had implemented them myself. A swift grep on my two most active projects indicate that I implement all three in only one class out of about 150.
That's not to say I don't implement/declare one or more of them - obviously base classes need a virtual destructor, and a large number of my classes forbid copying using the private copy ctor & assignment op idiom. But fully implemented, there is this single lonely class, which does some reference counting.
So I was wondering am I unusual in this? How often do you implement all three of these functions? Is there any pattern to the classes where you do implement them?
I think that it's rare that you need all three. Most classes that require an explicit destructor aren't really suitable for copying.
It's just better design to use self-destructing members (which normally don't require things like copy-construction) than a big explicit destructor.
I rarely implement them, but often declare them private (copy constructors and assignemt operators, that is).
Like you, almost never.
But I'm not tied to the STL approach of programming where you copy everything in and around in containers - usually if it's not a primitive, I'll use a pointer, smart or otherwise.
I mainly use RAII patterns, thus avoid writing destructors. Although, I do put empty bodies in my .cc file to help keep code bloat down.
And, like you, I'll declare them private and unimplemented to prevent any accidental invoking.
It really depends on what type of problems you are working on. I have been working on a new project for the past few months and I think every class inherits from boost::noncopyable. Nine months ago I worked on a different project that used PODs quite a bit and I leveraged automatic copy ctor and assignment operator. If you are using boost::shared_ptr (and you should be), it should be rare to write your own copy ctor or assignment operator nowadays.
Most of the time, hardly ever. This is because the members that are used (reference based smart ptr, etc) already implement the proper semantics, or the object is non-copyable.
A few patterns come up when I find myself implementing these:
destructive copy , i.e. move pattern like auto_ptr or lock
dispose pattern which hardly every comes up in C++, but I've used it about three times in my career (and just a week ago in fact)
pimpl pattern, where the pimpl is fwd declared in the header, and managed by a smart ptr. Then the empty dtor goes in the .cc file but still classifies as "not complier generated"
And one other trivial one that prints "I was destroyed" when I think I might have a circular reference somewhere and just want to make sure.
Any class that owns some pointers members need to define this three operations to implement deep copy (See here for a deep description).
Related
According to the Google style guidelines, "Few classes need to be copyable. Most should have neither a copy constructor nor an assignment operator."
They recommend you make a class uncopyable (that is, not giving it a copy constructor or assignment operator), and instead recommending passing by reference or pointer in most situations, or using clone() methods which cannot be invoked implicitly.
However, I've heard some arguments against this:
Accessing a reference is (usually) slower than accessing a value.
In some computations, I might want to leave the original object the way it is and just return the changed object.
I might want to store the value of a computation as a local object in a function and return it, which I couldn't do if I returned it by reference.
If a class is small enough, passing by reference is slower.
What are the positives/negatives of following this guideline? Is there any standard "rule of thumb" for making classes uncopyable? What should I consider when creating new classes?
I have two issues with their advice:
It doesn't apply to modern C++, ignoring move constructors/assignment operators, and so assumes that taking objects by value (which would have copied before) is often inefficient.
It doesn't trust the programmer to do the right thing and design their code appropriately. Instead it limits the programmer until they're forced to break the rule.
Whether your class should be copyable, moveable, both or neither should be a design decision based on the uses of the class itself. For example, a std::unique_ptr is a great example of a class that should only be moveable because copying it would invalidate its entire purpose. When you design a class, ask yourself if it makes sense to copy it. Most of the time the answer will be yes.
The advice seems to be based on the belief that programmers default to passing objects around by value which can be expensive when the objects are complex enough. This is just not true any more. You should default to passing objects around by value when you need a copy of the object, and there's no reason to be scared of this - in many cases, the move constructor will be used instead, which is almost always a constant time operation.
Again, the choice of how you should pass objects around is a design decision that should be influenced by a number of factors, such as:
Am I going to need a copy of this object?
Do I need to modify this object?
What is the lifetime of the object?
Is the object optional?
These questions should be asked with every type you write (parameter, return value, variable, whatever). You should find plenty of uses for passing objects by value that don't lead to poor performance due to copying.
If you follow good C++ programming practices, your copy constructors will be bug free, so that shouldn't be a concern. In fact, many classes can get away with just the defaulted copy/move constructors. If a class owns dynamically allocated resources and you use smart pointers appropriately, implementing the copy constructor is often as simple as copying the objects from the pointers - not much room for bugs.
Of course, this advice from Google is for people working on their code to ensure consistency throughout their codebase. That's fine. I don't recommend blindly adopting it in its entirety for a modern C++ project, however.
Before anything else, let me take you all into the highway of my thoughts (to say it simply, I'm just imagining these things)
Suppose, I am using a third-party library class that uses move semantics (r-value references). Here is its definition:
class VeryHeavyObject {
VeryHeavyObject(const VeryHeavyObject& obj); // copy constructor
VeryHeavyObject& operator= (const VeryHeavyObject& obj); // copy operator
public:
VeryHeavyObject(); // constructor
VeryHeavyObject(VeryHeavyObject&& obj); // move constructor
VeryHeavyObject& operator= (VeryHeavyObject&& obj); // move operator
// ....
};
Apparently, the author's really concerned about the cost of copying of VeryHeavyObject and decided to force everything to be moved (More apparently, he doesn't know how to design classes with move semantics). But then, in some point of my code, I need to have a copy of VeryHeavyObject.
Well, the core question is:
How can I copy an object with only a move constructor and a move operator?
P.S.: I have tried but I can't really contact the author of the library (I think he's on vacation).
You cannot.
However, provided that you have sufficient access to its internals (getters and the like), then you can construct a clone by yourself.
A well defined interface, and we will assume that this is the case, some methods may not be available because the author wants to discourage certain uses for performance reasons. A well-known example is std::list, which does not include a [] operator because it has O(n) complexity compared with O(1) in other containers, such as std::vector.
In this case, the author of the library wants to discourage the use of copies because, as you state in your question, it is very costly. But this does not mean that it is impossible. If you really need to do it, you can probably write your own Clone() function that takes data from the original VeryHeavyObject as appropriate, constructs a new one with these data and returns it using std::move. Since we haven't got the interface for VeryHeavyObject we cannot try to do it, but I'm sure you can.
It might not be possible.
The class has declared the copies private, but we can't see whether the functions are ever defined. You seem to assume that the class has a copy operation that it's hiding away from you to stop you doing something slow, but that might not be the case. Some objects simply cannot be copied. For examples, consider streams.
You wouldn't expect privately-declared-but-not-defined functions in C++11, but there's no law against it. Anyway even if there is an implemented private copy function, it's probably private for a reason (maybe it can only be used under certain controlled circumstances: the class internals know how to use it safely and you don't). So if there's no public copy, then as far as this class's API is concerned it cannot be copied.
Perhaps the class has enough public accessors, that you can interrogate it for the state you need, and construct a new object that matches it. If so then you could reasonably complain to the author of the class that it should be publicly copyable. If not then maybe it has state that can't be duplicated.
Anything that provides unique access to something (streams, drivers, locks) has a reason not to be copyable, because the original and the copy can't both provide unique access to the same thing. Admittedly dup means that even file descriptors don't physically provide unique access to something, let alone the streams that wrap them. But a stream has state involving buffered data not yet written, which means that copying them would introduce complexity that the class is designed to protect you from. So logically you normally use a stream as though it is the only way to access something.
If the copy assignment operator is implemented, then you might be able to hack a way to call it even though it's private. That won't work for the copy constructor, though, you can't take pointers to constructors. As a brutal hack you could #define private public before including its header: it's undefined behavior but it might work on the implementation you're using. Forking the third-party source would be better.
In general, it is not possible without modifying the class, because there might be private data that you cannot access. It might be possible if a shallow copy is sufficient, because then you should be able to do it with a memccpy. (Note that if the class does not have any virtual members or pointers, shallow and deep copy are the same).
A colleague is cleaning up a couple of libraries. In doing so he's been reading API design for C++ and it talks about explicitly enabling or disabling copying in C++ classes. This is the same thing that Sutter and Alexandrescu say in their C++ Coding Standards.
He agrees that one should follow this advice, but what neither book seems to say are what are those guiding principles that tell when to enable or disable.
Any guidance one way or the other? Thanks!
It depends on the role the classes play in the application. Unless the
class represents a value, where identity isn't significant, you should
ban copy and assignment. Similarly if the class is polymorphic. As a
generally rule, if you're allocating objects of the class type
dynamically, it shouldn't be copiable. And inversely, if the class is
copiable, you shouldn't allocate instances of it dynamically. (But
there are some exceptions, and it's not rare to allocate dynamically and
avoid copying big objects, even when the semantics argue otherwise.)
If you're designing a low-level library, the choice is less clear.
Something like std::vector can play many roles in an application; in
most of them, copying wouldn't be appropriate, but banning copy would
make it unusable in the few where it is appropriate.
Classes which are non-copyable should be the exception, not the rule. Your class should be non-copyable if and only if you cannot retain value semantics while copying- for example, named mutexes, unique-ownership pointers. Else, your class should be copyable. Many C++ libraries depend on copyability, especially pre-C++0x where they cannot be movable.
Contrary to DeadMG, I believe most classes should be non-copyable.
Here is what Stroustrup wrote in his "The Design and Evolution of C++" book:
"I personally consider it unfortunate that copy operations are defined by default and I prohibit copying of objects of many of my classes"
I think you should try to write as little code as possible to have the class doing what it is supposed to do. If no one is trying to copy the class and it is not going to be copied in the near future then do not add stuff like a copy constructor or assignment operator. Just make the class non copyable.
When someday you actually want to copy the class, then add things like the copy constructor. But until then having the class non copyable means less code to test and maintain.
I sincerely believe that copy semantics should be provided automatically, or not at all.
However, badly written libraries may sometimes benefit from a manual copy constructor.
Note that the situation is very different in C++ (because copy semantics are usually required by the standard library !) than in C++0x, where my advice pretty much always applies.
I have a medium complex C++ class which holds a set of data read from disc. It contains an eclectic mix of floats, ints and structures and is now in general use. During a major code review it was asked whether we have a custom assignment operator or we rely on the compiler generated version and if so, how do we know it works correctly? Well, we didn't write a custom assignment and so a unit test was added to check that if we do:
CalibDataSet datasetA = getDataSet();
CalibDataSet dataSetB = datasetA;
then datasetB is the same as datasetA. A couple of hundred lines or so. Now the customer inists that we cannot rely on the compiler (gcc) being correct for future releases and we should write our own. Are they right to insist on this?
Additional info:
I'm impressed by the answers/comments already posted and the response time.Another way of asking this question might be:
When does a POD structure/class become a 'not' POD structure/class?
It is well-known what the automatically-generated assignment operator will do - that's defined as part of the standard and a standards-compliant C++ compiler will always generate a correctly-behaving assignment operator (if it didn't, then it would not be a standards-compliant compiler).
You usually only need to write your own assignment operator if you've written your own destructor or copy constructor. If you don't need those, then you don't need an assignment operator, either.
The most common reason to explicitly define an assignment operator is to support "remote ownership" -- basically, a class that includes one or more pointers, and owns the resources to which those pointers refer. In such a case, you normally need to define assignment, copying (i.e., copy constructor) and destruction. There are three primary strategies for such cases (sorted by decreasing frequency of use):
Deep copy
reference counting
Transfer of ownership
Deep copy means allocating a new resource for the target of the assignment/copy. E.g., a string class has a pointer to the content of the string; when you assign it, the assignment allocates a new buffer to hold the new content in the destination, and copies the data from the source to the destination buffer. This is used in most current implementations of a number of standard classes such as std::string and std::vector.
Reference counting used to be quite common as well. Many (most?) older implementations of std::string used reference counting. In this case, instead of allocating and copying the data for the string, you simply incremented a reference count to indicate the number of string objects referring to a particular data buffer. You only allocated a new buffer when/if the content of a string was modified so it needed to differ from others (i.e., it used copy on write). With multithreading, however, you need to synchronize access to the reference count, which often has a serious impact on performance, so in newer code this is fairly unusual (mostly used when something stores so much data that it's worth potentially wasting a bit of CPU time to avoid such a copy).
Transfer of ownership is relatively unusual. It's what's done by std::auto_ptr. When you assign or copy something, the source of the assignment/copy is basically destroyed -- the data is transferred from one to the other. This is (or can be) useful, but the semantics are sufficiently different from normal assignment that it's often counterintuitive. At the same time, under the right circumstances, it can provide great efficiency and simplicity. C++0x will make transfer of ownership considerably more manageable by adding a unique_ptr type that makes it more explicit, and also adding rvalue references, which make it easy to implement transfer of ownership for one fairly large class of situations where it can improve performance without leading to semantics that are visibly counterintuitive.
Going back to the original question, however, if you don't have remote ownership to start with -- i.e., your class doesn't contain any pointers, chances are good that you shouldn't explicitly define an assignment operator (or dtor or copy ctor). A compiler bug that stopped implicitly defined assignment operators from working would prevent passing any of a huge number of regression tests.
Even if it did somehow get released, your defense against it would be to just not use it. There's no real room for question that such a release would be replaced within a matter of hours. With a medium to large existing project, you don't want to switch to a compiler until it's been in wide use for a while in any case.
If the compiler isn't generating the assignment properly, then you have bigger problems to worry about than implementing the assignment overload (like the fact that you have a broken compiler). Unless your class contains pointers, it is not necessary to provide your own overload; however, it is reasonable to request an explicit overload, not because the compiler might break (which is absurd), but rather to document your intention that assignment be permitted and behave in that manner. In C++0x, it will be possible to document intent and save time by using = default for the compiler-generated version.
If you have some resources in your class that you are supposed to manage then writing your own assignment operator makes sense else rely on compiler generated one.
I guess the problem of shallow copy deep copy may appear in case you are dealing with strings.
http://www.learncpp.com/cpp-tutorial/912-shallow-vs-deep-copying/
but i think it is always advisable to write assignment overloads for user defined classes.
I have a Shape class containing potentially many vertices, and I was contemplating making copy-constructor/copy-assignment private to prevent accidental needless copying of my heavyweight class (for example, passing by value instead of by reference).
To make a copy of Shape, one would have to deliberately call a "clone" or "duplicate" method.
Is this good practice? I wonder why STL containers don't use this approach, as I rarely want to pass them by value.
Restricting your users isn't always a good idea. Just documenting that copying may be expensive is enough. If a user really wants to copy, then using the native syntax of C++ by providing a copy constructor is a much cleaner approach.
Therefore, I think the real answer depends on the context. Perhaps the real class you're writing (not the imaginary Shape) shouldn't be copied, perhaps it should. But as a general approach, I certainly can't say that one should discourage users from copying large objects by forcing them to use explicit method calls.
IMHO, providing a copy constructor and assignment operator or not depend more of what your class modelizes than the cost of copying.
If your class represent values, that is if passing an object or a copy of the object doesn't make a difference, then provide them (and provide the equality operator also)
If your class isn't, that is if you think that object of the class have an identity and a state (one also speak of entities), don't. If a copy make sense, provide it with a clone or copy member.
There are sometimes classes you can't easily classify. Containers are in that position. It is meaninfull the consider them as entities and pass them only by reference and have special operations to make a copy when needed. You can also consider them simply as agregation of values and so copying makes sense. The STL was designed around value types. And as everything is a value, it makes sense for containers to be so. That allows things like map<int, list<> > which are usefull. (Remember, you can't put nocopyable classes in an STL container).
Generally, you do not make classes non-copyable just because they are heavy (you had shown a good example STL).
You make them non-copyable when they connected to some non-copyable resource like socket, file, lock or they are not designed to be copied at all (for example have some internal structures that can be hardly deep copied).
However, in your case your object is copyable so leave it as this.
Small note about clone() -- it is used as polymorphic copy constructor -- it has different
meaning and used differently.
Most programmers are already aware of the cost of copying various objects, and know how to avoid copies, using techniques such as pass by reference.
Note the STL's vector, string, map, list etc. could all be variously considered 'heavyweight' objects (especially something like a vector with 10,000 elements!). Those classes all still provide copy constructors and assignment operators, so if you know what you're doing (such as making a std::list of vectors), you can copy them when necessary.
So if it's useful, provide them anyway, but be sure to document they are expensive operations.
Depending on your needs...
If you want to ensure that a copy won't happen by mistake, and making a copy would cause a severe bottleneck or simply doesn't make sense, then this is good practice. Compiling errors are better than performance investigations.
If you are not sure how your class will be used, and are unsure if it's a good idea or not then it is not good practice. Most of the time you would not limit your class in this way.