Is it good practice to generally make heavyweight classes non-copyable? - c++

I have a Shape class containing potentially many vertices, and I was contemplating making copy-constructor/copy-assignment private to prevent accidental needless copying of my heavyweight class (for example, passing by value instead of by reference).
To make a copy of Shape, one would have to deliberately call a "clone" or "duplicate" method.
Is this good practice? I wonder why STL containers don't use this approach, as I rarely want to pass them by value.

Restricting your users isn't always a good idea. Just documenting that copying may be expensive is enough. If a user really wants to copy, then using the native syntax of C++ by providing a copy constructor is a much cleaner approach.
Therefore, I think the real answer depends on the context. Perhaps the real class you're writing (not the imaginary Shape) shouldn't be copied, perhaps it should. But as a general approach, I certainly can't say that one should discourage users from copying large objects by forcing them to use explicit method calls.

IMHO, whether to provide a copy constructor and assignment operator depends more on what your class models than on the cost of copying.
If your class represents values, that is, if passing an object or a copy of the object makes no difference, then provide them (and provide the equality operator as well).
If it doesn't, that is, if you think objects of the class have an identity and a state (such classes are also called entities), don't. If a copy makes sense, provide it through a clone or copy member function.
Some classes are hard to classify. Containers are in that position. It is meaningful to consider them entities, pass them only by reference, and have special operations to make a copy when needed. You can also consider them simply aggregations of values, in which case copying makes sense. The STL was designed around value types, and since everything is a value, it makes sense for containers to be values too. That allows useful things like map<int, list<> >. (Remember, in C++03 you can't put non-copyable classes in an STL container; since C++11, movable but non-copyable types are accepted.)
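To make the value/entity distinction concrete, here is a minimal sketch. The Point and Shape classes and their members are invented for illustration and are not the asker's real code: Point is a plain value with equality, while Shape is treated as an entity whose implicit copy is disabled and whose copy has to be requested through clone().

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Value type: two equal Points are interchangeable, so the implicit
// copy operations are kept and equality is provided.
struct Point {
    double x;
    double y;
};

inline bool operator==(const Point& a, const Point& b) {
    return a.x == b.x && a.y == b.y;
}

// Entity type (hypothetical): implicit copies are disabled, but an
// explicit clone() is offered because a copy still makes sense.
class Shape {
public:
    explicit Shape(std::vector<Point> vertices)
        : vertices_(std::move(vertices)) {}

    Shape(const Shape&) = delete;             // no accidental pass-by-value
    Shape& operator=(const Shape&) = delete;
    Shape(Shape&&) = default;                 // moving stays cheap
    Shape& operator=(Shape&&) = default;

    // Deliberate, visible copy.
    Shape clone() const { return Shape(vertices_); }

    const std::vector<Point>& vertices() const { return vertices_; }

private:
    std::vector<Point> vertices_;
};

static_assert(!std::is_copy_constructible<Shape>::value,
              "Shape must not be copied implicitly");
static_assert(std::is_move_constructible<Shape>::value,
              "Shape can still be moved");

void example_usage() {
    std::vector<Point> points = {{0.0, 0.0}, {1.0, 0.0}, {0.0, 1.0}};
    Shape original(points);
    Shape copy = original.clone();   // the only way to duplicate a Shape
    assert(copy.vertices() == original.vertices());
}
```

With this layout, `void draw(Shape s)` called with an lvalue fails to compile, which is the accidental copy the question wants to catch.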

Generally, you do not make classes non-copyable just because they are heavy (the STL, which you mentioned, is a good example).
You make them non-copyable when they are tied to some non-copyable resource such as a socket, file, or lock, or when they are not designed to be copied at all (for example, they have internal structures that are hard to deep copy).
However, in your case the object is copyable, so leave it as it is.
A small note about clone(): it is normally used as a polymorphic copy constructor, so it has a different meaning and is used differently.
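For readers who have not seen clone() used as a polymorphic copy constructor, a minimal sketch follows. The Shape/Circle hierarchy and the duplicate() helper are made up for this example; the point is only that clone() copies the dynamic type, which a copy constructor invoked through a base reference cannot do.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical hierarchy used only for illustration.
class Shape {
public:
    virtual ~Shape() = default;
    virtual std::unique_ptr<Shape> clone() const = 0;
    virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
    explicit Circle(double radius) : radius_(radius) {}

    std::unique_ptr<Shape> clone() const override {
        // Uses Circle's own copy constructor, so nothing is sliced.
        return std::unique_ptr<Shape>(new Circle(*this));
    }
    std::string name() const override { return "Circle"; }
    double radius() const { return radius_; }

private:
    double radius_;
};

// Copies whatever concrete shape 'original' refers to.
std::unique_ptr<Shape> duplicate(const Shape& original) {
    return original.clone();
}

void example_usage() {
    Circle c(2.5);
    std::unique_ptr<Shape> copy = duplicate(c);
    assert(copy->name() == "Circle");
}
```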

Most programmers are already aware of the cost of copying various objects, and know how to avoid copies, using techniques such as pass by reference.
Note the STL's vector, string, map, list etc. could all be variously considered 'heavyweight' objects (especially something like a vector with 10,000 elements!). Those classes all still provide copy constructors and assignment operators, so if you know what you're doing (such as making a std::list of vectors), you can copy them when necessary.
So if it's useful, provide them anyway, but be sure to document they are expensive operations.

Depending on your needs...
If you want to ensure that a copy won't happen by mistake, and making a copy would cause a severe bottleneck or simply doesn't make sense, then this is good practice. Compile errors are better than performance investigations.
If you are not sure how your class will be used, and are unsure if it's a good idea or not then it is not good practice. Most of the time you would not limit your class in this way.

Related

Why/when should I use std::unique/shared_ptr (std::vector<>) over just std::vector<>?

I'm a little bit confused about the main use of std::unique/shared_ptr(std::vector<>) when I can simply use a std::vector<>, which, as far as I know, is itself inherently a dynamic array. I have also seen people say that there is no performance difference between the two. So, based on all this, what is the point of using a smart pointer pointing to a container (in this case, a vector) instead of a vector alone?
First of all, you shouldn't be using std::shared_ptr unless you need the specific "shared ownership" semantics associated with std::shared_ptr. If you need a smart pointer, you should default to std::unique_ptr, and only switch away from it when you expressly find that you need to.
Secondly: ostensibly, the reason to prefer std::unique_ptr<TYPE> over TYPE is if you plan to move the object around a lot. This is the common design paradigm for large objects that are either unmovable, or otherwise expensive to move—i.e. they implemented a Copy Constructor and didn't implement a Move Constructor, so moves are forced to behave like a Copy.
std::vector, however, does have relatively efficient move semantics: if you move a std::vector around, regardless of how complex its contained types are, the move only constitutes a couple of pointer swaps. There's no real risk that moving a std::vector will incur a large amount of computational complexity. Even in the scenario where you're overwriting a previously allocated array (invoking the destructors of all objects in the vector), you'd still have that complexity if you were using std::unique_ptr<std::vector<TYPE>> instead, saving you nothing.
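A quick way to convince yourself of this is to compare the buffer address before and after a move. This is only a sketch; move_reuses_buffer is a helper name invented here, and it assumes the default allocator.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Moves 'source' into a new vector and reports whether the heap buffer
// was handed over rather than copied element by element.
bool move_reuses_buffer(std::vector<int>& source) {
    const int* before = source.data();
    std::vector<int> destination = std::move(source);
    return destination.data() == before;
}

void example_usage() {
    std::vector<int> big(10000, 42);
    assert(move_reuses_buffer(big));
    assert(big.empty());  // vector's move constructor leaves the source empty
}
```

Nothing proportional to the 10,000 elements happens during the move, which is why wrapping the vector in a std::unique_ptr buys no speed.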
There are two advantages to std::unique_ptr<std::vector<TYPE>>. The first is that it gets rid of the implicit copy constructor; maybe you want to signal to maintenance programmers that the object shouldn't be copied. But that's a pretty niche use. The other advantage is that it lets you represent the case where there is no vector at all, i.e. vec.size() == 0 is a different condition than doesNotExist(vec). But even in that scenario, you should prefer std::optional<std::vector> instead, which better conveys the intent of the object through the code. Granted, std::optional is only available from C++17 onward, so maybe you're in an environment that hasn't implemented it yet. But otherwise, there's little reason to use std::unique_ptr<std::vector>.
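As a sketch of the "no vector" versus "empty vector" distinction with std::optional (C++17 required), consider the following. The find_scores and describe functions and the player names are invented for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical lookup: std::nullopt means "there is no vector at all",
// while an engaged but empty vector means "exists, nothing in it yet".
std::optional<std::vector<int>> find_scores(const std::string& player) {
    if (player == "unknown") return std::nullopt;        // no such player
    if (player == "rookie") return std::vector<int>{};   // player without scores
    return std::vector<int>{10, 20, 30};
}

std::string describe(const std::optional<std::vector<int>>& scores) {
    if (!scores) return "absent";
    if (scores->empty()) return "empty";
    return "has data";
}

void example_usage() {
    assert(describe(find_scores("unknown")) == "absent");
    assert(describe(find_scores("rookie")) == "empty");
}
```

Unlike the std::unique_ptr version, this involves no extra heap allocation for the wrapper and keeps value semantics.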
So in general, I don't believe there are practical uses for std::unique_ptr<std::vector>. There's no practical performance difference between it and std::vector, and using it will just make your code needlessly complex.

Be-friend'ing std::tuple

I have a custom class, and I'd like to minimize the chances that someone on my team accidentally copies it, as that could break certain invariants within our system. To this end, I made the copy constructor private, as there is no reason anyone should need to copy it in any legitimate usage of the class.
However, under-the-hood of the framework that the class is a part of, a copy construction of the object into a std::tuple is required. I tried to use friend, but the compiler still complains, as the inner class(es?) of std::tuple require friend-access as well.
What is the best way to get what I want?
If the framework requires your class to be copyable, you really should provide a copyable class.
If your class really is only movable, or not even that, then maybe the framework should have a std::unique_ptr or similar to the object instead? Or you could create a movable adaptor class around that std::unique_ptr which forwards the interface...
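Both options can be sketched as follows, assuming the framework can be changed to move into the tuple. Session, pack_by_move and pack_by_pointer are names invented for this example, not part of any real framework.

```cpp
#include <cassert>
#include <memory>
#include <tuple>
#include <utility>

// Hypothetical class whose invariants forbid copies but allow moves.
class Session {
public:
    explicit Session(int id) : id_(id) {}
    Session(const Session&) = delete;
    Session& operator=(const Session&) = delete;
    Session(Session&&) = default;
    Session& operator=(Session&&) = default;
    int id() const { return id_; }

private:
    int id_;
};

// A tuple can hold a move-only element as long as it is moved in.
std::tuple<int, Session> pack_by_move(Session s) {
    return std::tuple<int, Session>(1, std::move(s));
}

// If the object should not even be moved, the tuple can own a pointer to it.
std::tuple<int, std::unique_ptr<Session>> pack_by_pointer(int id) {
    return std::make_tuple(1, std::unique_ptr<Session>(new Session(id)));
}

void example_usage() {
    auto packed = pack_by_move(Session(7));
    assert(std::get<1>(packed).id() == 7);
}
```

Neither variant needs friend access to std::tuple, because no private member of Session is ever called.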
Part of the forward-facing interface to users of the class is, whether it is moveable and/or copyable. If you are trying to make it non-copyable, unless you happen to be a component of the target application area... this limits code reuse, and it may confuse potential users of the class as to whether or not it is safe to copy / move it.
It may be that the framework doesn't really need to make a copy, and can be refactored to make moves instead?
It's very unclear from the question why you don't want it to be copyable. You seem to say that bad things will happen, but for some reason you aren't concerned if the framework makes a copy. Is it really okay to make copies or not?
It may be that you need to make a separate system for tracking / enforcing the invariant that you are concerned about, rather than just try to prohibit copying this class.

When should you make a class uncopyable?

According to the Google style guidelines, "Few classes need to be copyable. Most should have neither a copy constructor nor an assignment operator."
They recommend making a class uncopyable (that is, not giving it a copy constructor or assignment operator), and instead recommend passing by reference or pointer in most situations, or using clone() methods, which cannot be invoked implicitly.
However, I've heard some arguments against this:
Accessing a reference is (usually) slower than accessing a value.
In some computations, I might want to leave the original object the way it is and just return the changed object.
I might want to store the value of a computation as a local object in a function and return it, which I couldn't do if I returned it by reference.
If a class is small enough, passing by reference is slower.
What are the positives/negatives of following this guideline? Is there any standard "rule of thumb" for making classes uncopyable? What should I consider when creating new classes?
I have two issues with their advice:
It doesn't apply to modern C++, ignoring move constructors/assignment operators, and so assumes that taking objects by value (which would have copied before) is often inefficient.
It doesn't trust the programmer to do the right thing and design their code appropriately. Instead it limits the programmer until they're forced to break the rule.
Whether your class should be copyable, moveable, both or neither should be a design decision based on the uses of the class itself. For example, a std::unique_ptr is a great example of a class that should only be moveable because copying it would invalidate its entire purpose. When you design a class, ask yourself if it makes sense to copy it. Most of the time the answer will be yes.
The advice seems to be based on the belief that programmers default to passing objects around by value which can be expensive when the objects are complex enough. This is just not true any more. You should default to passing objects around by value when you need a copy of the object, and there's no reason to be scared of this - in many cases, the move constructor will be used instead, which is almost always a constant time operation.
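Here is a small sketch of what "pass by value when you need a copy" looks like in practice. Document, set_lines and make_lines are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Document {
public:
    // Takes the argument by value because it will be stored anyway:
    // lvalue callers pay one copy, rvalue callers pay only moves.
    void set_lines(std::vector<std::string> lines) { lines_ = std::move(lines); }

    // Read-only access needs no copy, so this returns a const reference.
    const std::vector<std::string>& lines() const { return lines_; }

private:
    std::vector<std::string> lines_;
};

// Returning a local by value is moved (or elided), not deep-copied.
std::vector<std::string> make_lines(std::size_t count) {
    std::vector<std::string> result(count, "text");
    return result;
}

void example_usage() {
    Document doc;
    doc.set_lines(make_lines(3));                  // temporary: moved all the way in
    std::vector<std::string> mine = make_lines(2);
    doc.set_lines(mine);                           // lvalue: copied once, 'mine' is untouched
    assert(mine.size() == 2);
    assert(doc.lines().size() == 2);
}
```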
Again, the choice of how you should pass objects around is a design decision that should be influenced by a number of factors, such as:
Am I going to need a copy of this object?
Do I need to modify this object?
What is the lifetime of the object?
Is the object optional?
These questions should be asked with every type you write (parameter, return value, variable, whatever). You should find plenty of uses for passing objects by value that don't lead to poor performance due to copying.
If you follow good C++ programming practices, your copy constructors will be bug free, so that shouldn't be a concern. In fact, many classes can get away with just the defaulted copy/move constructors. If a class owns dynamically allocated resources and you use smart pointers appropriately, implementing the copy constructor is often as simple as copying the objects from the pointers - not much room for bugs.
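To illustrate both cases, here is a rough sketch with two invented classes: Polygon relies entirely on the compiler-generated operations, and Holder owns a heap object through std::unique_ptr and copies the pointee by hand.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Every member knows how to copy and move itself, so the
// compiler-generated copy/move operations are already correct.
struct Polygon {
    std::string name;
    std::vector<double> coordinates;
};

// A class owning a heap object through unique_ptr has to spell out the
// copy, but it amounts to copying the object behind the pointer.
class Holder {
public:
    explicit Holder(int value) : data_(new int(value)) {}

    Holder(const Holder& other)
        : data_(other.data_ ? new int(*other.data_) : nullptr) {}
    Holder& operator=(const Holder& other) {
        Holder tmp(other);             // copy first, then take over the copy
        data_ = std::move(tmp.data_);
        return *this;
    }
    Holder(Holder&&) = default;
    Holder& operator=(Holder&&) = default;

    // Precondition for both: the Holder has not been moved from.
    int value() const { return *data_; }
    void set(int v) { *data_ = v; }

private:
    std::unique_ptr<int> data_;
};

void example_usage() {
    Holder a(5);
    Holder b = a;   // deep copy: b gets its own int
    b.set(9);
    assert(a.value() == 5);
    assert(b.value() == 9);
}
```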
Of course, this advice from Google is for people working on their code to ensure consistency throughout their codebase. That's fine. I don't recommend blindly adopting it in its entirety for a modern C++ project, however.

Copying objects with only move semantics

Before anything else, let me take you all into the highway of my thoughts (to say it simply, I'm just imagining these things)
Suppose, I am using a third-party library class that uses move semantics (r-value references). Here is its definition:
class VeryHeavyObject {
    VeryHeavyObject(const VeryHeavyObject& obj); // copy constructor
    VeryHeavyObject& operator= (const VeryHeavyObject& obj); // copy operator
public:
    VeryHeavyObject(); // constructor
    VeryHeavyObject(VeryHeavyObject&& obj); // move constructor
    VeryHeavyObject& operator= (VeryHeavyObject&& obj); // move operator
    // ....
};
Apparently, the author is really concerned about the cost of copying VeryHeavyObject and decided to force everything to be moved (more apparently, he doesn't know how to design classes with move semantics). But then, at some point in my code, I need to have a copy of VeryHeavyObject.
Well, the core question is:
How can I copy an object with only a move constructor and a move operator?
P.S.: I have tried but I can't really contact the author of the library (I think he's on vacation).
You cannot.
However, provided that you have sufficient access to its internals (getters and the like), then you can construct a clone by yourself.
Even in a well-defined interface, and we will assume that this is the case, some methods may be unavailable because the author wants to discourage certain uses for performance reasons. A well-known example is std::list, which does not include a [] operator because it would have O(n) complexity, compared with O(1) in other containers such as std::vector.
In this case, the author of the library wants to discourage the use of copies because, as you state in your question, it is very costly. But this does not mean that it is impossible. If you really need to do it, you can probably write your own Clone() function that takes data from the original VeryHeavyObject as appropriate, constructs a new one with these data and returns it using std::move. Since we haven't got the interface for VeryHeavyObject we cannot try to do it, but I'm sure you can.
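As a rough sketch of such a Clone() function: the class below is only a stand-in for the library type, and the samples() and set_samples() accessors are assumptions made for this example, since the real interface is not shown in the question.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for the library class. The accessors samples() and
// set_samples() are hypothetical; the real interface is unknown.
class VeryHeavyObject {
public:
    VeryHeavyObject() = default;
    VeryHeavyObject(VeryHeavyObject&&) = default;
    VeryHeavyObject& operator=(VeryHeavyObject&&) = default;
    VeryHeavyObject(const VeryHeavyObject&) = delete;
    VeryHeavyObject& operator=(const VeryHeavyObject&) = delete;

    const std::vector<int>& samples() const { return samples_; }
    void set_samples(std::vector<int> s) { samples_ = std::move(s); }

private:
    std::vector<int> samples_;
};

// Free function that rebuilds an equivalent object from public state.
VeryHeavyObject Clone(const VeryHeavyObject& original) {
    VeryHeavyObject copy;
    copy.set_samples(original.samples());  // the expensive part, done on purpose
    return copy;  // a named local is moved (or elided) on return
}

void example_usage() {
    VeryHeavyObject original;
    original.set_samples({1, 2, 3});
    VeryHeavyObject duplicate = Clone(original);
    assert(duplicate.samples() == original.samples());
}
```

Returning the local by name is enough here; the move constructor is selected automatically, so an explicit std::move on the return statement is not required.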
It might not be possible.
The class has declared the copy operations private, but we can't see whether the functions are ever defined. You seem to assume that the class has a copy operation that it's hiding away from you to stop you doing something slow, but that might not be the case. Some objects simply cannot be copied. For example, consider streams.
You wouldn't expect privately-declared-but-not-defined functions in C++11, but there's no law against it. Anyway even if there is an implemented private copy function, it's probably private for a reason (maybe it can only be used under certain controlled circumstances: the class internals know how to use it safely and you don't). So if there's no public copy, then as far as this class's API is concerned it cannot be copied.
Perhaps the class has enough public accessors, that you can interrogate it for the state you need, and construct a new object that matches it. If so then you could reasonably complain to the author of the class that it should be publicly copyable. If not then maybe it has state that can't be duplicated.
Anything that provides unique access to something (streams, drivers, locks) has a reason not to be copyable, because the original and the copy can't both provide unique access to the same thing. Admittedly dup means that even file descriptors don't physically provide unique access to something, let alone the streams that wrap them. But a stream has state involving buffered data not yet written, which means that copying them would introduce complexity that the class is designed to protect you from. So logically you normally use a stream as though it is the only way to access something.
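The standard library follows the same reasoning, which can be checked with type traits. The is_copyable and is_movable helpers below are just shorthand introduced for this sketch.

```cpp
#include <cassert>
#include <fstream>
#include <mutex>
#include <type_traits>
#include <vector>

// Shorthand helpers (invented for this example).
template <class T>
constexpr bool is_copyable() { return std::is_copy_constructible<T>::value; }

template <class T>
constexpr bool is_movable() { return std::is_move_constructible<T>::value; }

// Streams: unique access plus buffered state, so move-only.
static_assert(!is_copyable<std::ofstream>(), "file streams cannot be copied");
static_assert(is_movable<std::ofstream>(), "file streams can be moved");

// A mutex can be neither copied nor moved.
static_assert(!is_copyable<std::mutex>(), "mutexes cannot be copied");
static_assert(!is_movable<std::mutex>(), "mutexes cannot be moved");

// A lock on a mutex represents unique ownership, so it is move-only.
static_assert(!is_copyable<std::unique_lock<std::mutex>>(), "locks cannot be copied");
static_assert(is_movable<std::unique_lock<std::mutex>>(), "locks can be moved");

void example_usage() {
    // A container of plain values, by contrast, is copyable however large it is.
    assert(is_copyable<std::vector<int>>());
    assert(!is_copyable<std::ofstream>());
}
```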
If the copy assignment operator is implemented, then you might be able to hack a way to call it even though it's private. That won't work for the copy constructor, though, you can't take pointers to constructors. As a brutal hack you could #define private public before including its header: it's undefined behavior but it might work on the implementation you're using. Forking the third-party source would be better.
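In general, it is not possible without modifying the class, because there might be private data that you cannot access. It might be possible if a shallow copy is sufficient, because then you should be able to do it with memcpy. (Note that if the class does not have any virtual members or pointers, shallow and deep copy are the same.) Be aware that memcpy is only well-defined for trivially copyable types, and a class with user-provided copy or move operations is not one, so this is a hack rather than a portable solution.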

Should I use shared_ptr or unique_ptr

I've been making some objects using the pimpl idiom, but I'm not sure whether to use std::shared_ptr or std::unique_ptr.
I understand that std::unique_ptr is more efficient, but this isn't so much of an issue for me, as these objects are relatively heavyweight anyway so the cost of std::shared_ptr over std::unique_ptr is relatively minor.
I'm currently going with std::shared_ptr just because of the extra flexibility. For example, using a std::shared_ptr allows me to store these objects in a hashmap for quick access while still being able to return copies of these objects to callers (as I believe any iterators or references may quickly become invalid).
However, these objects in a way really aren't being copied, as changes affect all copies, so I was wondering that perhaps using std::shared_ptr and allowing copies is some sort of anti-pattern or bad thing.
Is this correct?
I've been making some objects using the pimpl idiom, but I'm not sure whether to use shared_ptr or unique_ptr.
Definitely unique_ptr or scoped_ptr.
Pimpl is not a pattern, but an idiom, which deals with compile-time dependency and binary compatibility. It should not affect the semantics of the objects, especially with regard to its copying behavior.
You may use whatever kind of smart pointer you want under the hood, but those 2 guarantee that you won't accidentally share the implementation between two distinct objects, as they require a conscious decision about the implementation of the copy constructor and assignment operator.
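A minimal sketch of a unique_ptr-based pimpl with a deliberately written deep copy might look like this. The Widget class and its label member are invented for the example, and the header and source parts are shown in one file for brevity.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// --- what would live in the header ---
class Widget {
public:
    explicit Widget(std::string label);
    ~Widget();                               // defined where Impl is complete
    Widget(const Widget& other);             // conscious decision: deep copy
    Widget& operator=(const Widget& other);
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    const std::string& label() const;
    void set_label(std::string label);

private:
    struct Impl;                             // only forward-declared here
    std::unique_ptr<Impl> impl_;
};

// --- what would live in the source file ---
struct Widget::Impl {
    std::string label;
};

Widget::Widget(std::string label) : impl_(new Impl{std::move(label)}) {}
Widget::~Widget() = default;

// Both copy operations assume neither object is in a moved-from state.
Widget::Widget(const Widget& other) : impl_(new Impl(*other.impl_)) {}
Widget& Widget::operator=(const Widget& other) {
    *impl_ = *other.impl_;
    return *this;
}

Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::label() const { return impl_->label; }
void Widget::set_label(std::string label) { impl_->label = std::move(label); }

void example_usage() {
    Widget a("first");
    Widget b = a;            // b gets its own Impl
    b.set_label("second");
    assert(a.label() == "first");
}
```

Had impl_ been a shared_ptr, the compiler-generated copy would have compiled silently and both objects would share one Impl.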
However, these objects in a way really aren't being copied, as changes affect all copies, so I was wondering that perhaps using shared_ptr and allowing copies is some sort of anti-pattern or bad thing.
It is not an anti-pattern; in fact, it is a pattern: aliasing. You already use it in C++ with bare pointers and references. shared_ptr offers an extra measure of "safety" to avoid dead references, at the cost of extra complexity and new issues (beware of cycles, which create memory leaks).
Unrelated to Pimpl
I understand unique_ptr is more efficient, but this isn't so much of an issue for me, as these objects are relatively heavyweight anyway so the cost of shared_ptr over unique_ptr is relatively minor.
If you can factor out some state, you may want to take a look at the Flyweight pattern.
If you use shared_ptr, it's not really the classical pimpl
idiom (unless you take additional steps). But the real question
is why you want to use a smart pointer to begin with; it's very
clear where the delete should occur, and there's no issue of
exception safety or other to be concerned with. At most,
a smart pointer will save you a line or two of code. And the
only one which has the correct semantics is boost::scoped_ptr,
and I don't think it works in this case. (IIRC, it requires
a complete type in order to be instantiated, but I could be
wrong.)
An important aspect of the pimpl idiom is that its use should be
transparent to the client; the class should behave exactly as if
it were implemented classically. This means either inhibiting
copy and assignment or implementing deep copy, unless the class
is immutable (no non-const member functions). None of the usual
smart pointers implement deep copy; you could implement one, of
course, but it would probably still require a complete type
whenever the copy occurs, which means that you'd still have to
provide a user defined copy constructor and assignment operator
(since they can't be inline). Given this, it's probably not
worth the bother using the smart pointer.
An exception is if the objects are immutable. In this case, it
doesn't matter whether the copy is deep or not, and shared_ptr
handles the situation completely.
When you use a shared_ptr (for example in a container, then look this up and return it by-value), you are not causing a copy of the object it points to, simply a copy of the pointer with a reference count.
This means that if you modify the underlying object from multiple points, your changes all apply to the same instance. This is exactly what it is designed for, so it is not some anti-pattern!
When passing a shared_ptr (as the comments say), it's better to pass by const reference and copy (thereby incrementing the reference count) where needed. As for returning, decide case by case.
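The following sketch shows the behaviour described above. Texture, TextureCache and owners are hypothetical names: the cache hands out copies of the pointer, every copy refers to the same object, and passing by const reference does not touch the reference count.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical heavyweight object.
struct Texture {
    int width;
    int height;
};

class TextureCache {
public:
    void add(const std::string& key, int width, int height) {
        cache_[key] = std::make_shared<Texture>(Texture{width, height});
    }

    // Returns a copy of the pointer (null if the key is missing);
    // the Texture itself is not copied.
    std::shared_ptr<Texture> find(const std::string& key) const {
        auto it = cache_.find(key);
        if (it == cache_.end()) return nullptr;
        return it->second;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<Texture>> cache_;
};

// Passing by const reference does not increment the reference count.
long owners(const std::shared_ptr<Texture>& texture) {
    return texture.use_count();
}

void example_usage() {
    TextureCache cache;
    cache.add("grass", 64, 64);
    std::shared_ptr<Texture> a = cache.find("grass");
    std::shared_ptr<Texture> b = cache.find("grass");
    a->width = 128;            // visible through b as well: same object
    assert(b->width == 128);
    assert(owners(a) == 3);    // the cache, a, and b
}
```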
Yes, please use them. Simply put, shared_ptr is an implementation of a smart pointer. unique_ptr is an implementation of an automatic pointer: