Related
I used to think C++'s object model is very robust when best practices are followed.
Just a few minutes ago, though, I had a realization that I hadn't had before.
Consider this code:
class Foo
{
std::set<size_t> set;
std::vector<std::set<size_t>::iterator> vector;
// ...
// (assume every method ensures p always points to a valid element of s)
};
I have written code like this. And until today, I hadn't seen a problem with it.
But, thinking about it a more, I realized that this class is very broken:
Its copy-constructor and copy-assignment copy the iterators inside the vector, which implies that they will still point to the old set! The new one isn't a true copy after all!
In other words, I must manually implement the copy-constructor even though this class isn't managing any resources (no RAII)!
This strikes me as astonishing. I've never come across this issue before, and I don't know of any elegant way to solve it. Thinking about it a bit more, it seems to me that copy construction is unsafe by default -- in fact, it seems to me that classes should not be copyable by default, because any kind of coupling between their instance variables risks rendering the default copy-constructor invalid.
Are iterators fundamentally unsafe to store? Or, should classes really be non-copyable by default?
The solutions I can think of, below, are all undesirable, as they don't let me take advantage of the automatically-generated copy constructor:
Manually implement a copy constructor for every nontrivial class I write. This is not only error-prone, but also painful to write for a complicated class.
Never store iterators as member variables. This seems severely limiting.
Disable copying by default on all classes I write, unless I can explicitly prove they are correct. This seems to run entirely against C++'s design, which is for most types to have value semantics, and thus be copyable.
Is this a well-known problem, and if so, does it have an elegant/idiomatic solution?
C++ copy/move ctor/assign are safe for regular value types. Regular value types behave like integers or other "regular" values.
They are also safe for pointer semantic types, so long as the operation does not change what the pointer "should" point to. Pointing to something "within yourself", or another member, is an example of where it fails.
They are somewhat safe for reference semantic types, but mixing pointer/reference/value semantics in the same class tends to be unsafe/buggy/dangerous in practice.
The rule of zero is that you make classes that behave like either regular value types, or pointer semantic types that don't need to be reseated on copy/move. Then you don't have to write copy/move ctors.
Iterators follow pointer semantics.
The idiomatic/elegant around this is to tightly couple the iterator container with the pointed-into container, and block or write the copy ctor there. They aren't really separate things once one contains pointers into the other.
Yes, this is a well known "problem" -- whenever you store pointers in an object, you're probably going to need some kind of custom copy constructor and assignment operator to ensure that the pointers are all valid and point at the expected things.
Since iterators are just an abstraction of collection element pointers, they have the same issue.
Is this a well-known problem?
Well, it is known, but I would not say well-known. Sibling pointers do not occur often, and most implementations I have seen in the wild were broken in the exact same way than yours is.
I believe the problem to be infrequent enough to have escaped most people's notice; interestingly, as I follow more Rust than C++ nowadays, it crops up there quite often because of the strictness of the type system (ie, the compiler refuses those programs, prompting questions).
does it have an elegant/idiomatic solution?
There are many types of sibling pointers situations, so it really depends, however I know of two generic solutions:
keys
shared elements
Let's review them in order.
Pointing to a class-member, or pointing into an indexable container, then one can use an offset or key rather than an iterator. It is slightly less efficient (and might require a look-up) however it is a fairly simple strategy. I have seen it used to great effect in shared-memory situation (where using pointers is a no-no since the shared-memory area may be mapped at different addresses).
The other solution is used by Boost.MultiIndex, and consists in an alternative memory layout. It stems from the principle of the intrusive container: instead of putting the element into the container (moving it in memory), an intrusive container uses hooks already inside the element to wire it at the right place. Starting from there, it is easy enough to use different hooks to wire a single elements into multiple containers, right?
Well, Boost.MultiIndex kicks it two steps further:
It uses the traditional container interface (ie, move your object in), but the node to which the object is moved in is an element with multiple hooks
It uses various hooks/containers in a single entity
You can check various examples and notably Example 5: Sequenced Indices looks a lot like your own code.
Is this a well-known problem
Yes. Any time you have a class that contains pointers, or pointer-like data like an iterator, you have to implement your own copy-constructor and assignment-operator to ensure the new object has valid pointers/iterators.
and if so, does it have an elegant/idiomatic solution?
Maybe not as elegant as you might like, and probably is not the best in performance (but then, copies sometimes are not, which is why C++11 added move semantics), but maybe something like this would work for you (assuming the std::vector contains iterators into the std::set of the same parent object):
class Foo
{
private:
std::set<size_t> s;
std::vector<std::set<size_t>::iterator> v;
struct findAndPushIterator
{
Foo &foo;
findAndPushIterator(Foo &f) : foo(f) {}
void operator()(const std::set<size_t>::iterator &iter)
{
std::set<size_t>::iterator found = foo.s.find(*iter);
if (found != foo.s.end())
foo.v.push_back(found);
}
};
public:
Foo() {}
Foo(const Foo &src)
{
*this = src;
}
Foo& operator=(const Foo &rhs)
{
v.clear();
s = rhs.s;
v.reserve(rhs.v.size());
std::for_each(rhs.v.begin(), rhs.v.end(), findAndPushIterator(*this));
return *this;
}
//...
};
Or, if using C++11:
class Foo
{
private:
std::set<size_t> s;
std::vector<std::set<size_t>::iterator> v;
public:
Foo() {}
Foo(const Foo &src)
{
*this = src;
}
Foo& operator=(const Foo &rhs)
{
v.clear();
s = rhs.s;
v.reserve(rhs.v.size());
std::for_each(rhs.v.begin(), rhs.v.end(),
[this](const std::set<size_t>::iterator &iter)
{
std::set<size_t>::iterator found = s.find(*iter);
if (found != s.end())
v.push_back(found);
}
);
return *this;
}
//...
};
Yes, of course it's a well-known problem.
If your class stored pointers, as an experienced developer you would intuitively know that the default copy behaviours may not be sufficient for that class.
Your class stores iterators and, since they are also "handles" to data stored elsewhere, the same logic applies.
This is hardly "astonishing".
The assertion that Foo is not managing any resources is false.
Copy-constructor aside, if a element of set is removed, there must be code in Foo that manages vector so that the respective iterator is removed.
I think the idiomatic solution is to just use one container, a vector<size_t>, and check that the count of an element is zero before inserting. Then the copy and move defaults are fine.
"Inherently unsafe"
No, the features you mention are not inherently unsafe; the fact that you thought of three possible safe solutions to the problem is evidence that there is no "inherent" lack of safety here, even though you think the solutions are undesirable.
And yes, there is RAII here: the containers (set and vector) are managing resources. I think your point is that the RAII is "already taken care of" by the std containers. But you need to then consider the container instances themselves to be "resources", and in fact your class is managing them. You're correct that you're not directly managing heap memory, because this aspect of the management problem is taken care of for you by the standard library. But there's more to the management problem, which I'll talk a bit more about below.
"Magic" default behavior
The problem is that you are apparently hoping that you can trust the default copy constructor to "do the right thing" in a non-trivial case such as this. I'm not sure why you expected the right behavior--perhaps you're hoping that memorizing rules-of-thumb such as the "rule of 3" will be a robust way to ensure that you don't shoot yourself in the foot? Certainly that would be nice (and, as pointed out in another answer, Rust goes much further than other low-level languages toward making foot-shooting much harder), but C++ simply isn't designed for "thoughtless" class design of that sort, nor should it be.
Conceptualizing constructor behavior
I'm not going to try to address the question of whether this is a "well-known problem", because I don't really know how well-characterized the problem of "sister" data and iterator-storing is. But I hope that I can convince you that, if you take the time to think about copy-constructor-behavior for every class you write that can be copied, this shouldn't be a surprising problem.
In particular, when deciding to use the default copy-constructor, you must think about what the default copy-constructor will actually do: namely, it will call the copy-constructor of each non-primitive, non-union member (i.e. members that have copy-constructors) and bitwise-copy the rest.
When copying your vector of iterators, what does std::vector's copy-constructor do? It performs a "deep copy", i.e., the data inside the vector is copied. Now, if the vector contains iterators, how does that affect the situation? Well, it's simple: the iterators are the data stored by the vector, so the iterators themselves will be copied. What does an iterator's copy-constructor do? I'm not going to actually look this up, because I don't need to know the specifics: I just need to know that iterators are like pointers in this (and other respect), and copying a pointer just copies the pointer itself, not the data pointed to. I.e., iterators and pointers do not have deep-copying by default.
Note that this is not surprising: of course iterators don't do deep-copying by default. If they did, you'd get a different, new set for each iterator being copied. And this makes even less sense than it initially appears: for instance, what would it actually mean if uni-directional iterators made deep-copies of their data? Presumably you'd get a partial copy, i.e., all the remaining data that's still "in front of" the iterator's current position, plus a new iterator pointing to the "front" of the new data structure.
Now consider that there is no way for a copy-constructor to know the context in which it's being called. For instance, consider the following code:
using iter = std::set<size_t>::iterator; // use typedef pre-C++11
std::vector<iter> foo = getIters(); // get a vector of iterators
useIters(foo); // pass vector by value
When getIters is called, the return value might be moved, but it might also be copy-constructed. The assignment to foo also invokes a copy-constructor, though this may also be elided. And unless useIters takes its argument by reference, then you've also got a copy constructor call there.
In any of these cases, would you expect the copy constructor to change which std::set is pointed to by the iterators contained by the std::vector<iter>? Of course not! So naturally std::vector's copy-constructor can't be designed to modify the iterators in that particular way, and in fact std::vector's copy-constructor is exactly what you need in most cases where it will actually be used.
However, suppose std::vector could work like this: suppose it had a special overload for "vector-of-iterators" that could re-seat the iterators, and that the compiler could somehow be "told" only to invoke this special constructor when the iterators actually need to be re-seated. (Note that the solution of "only invoke the special overload when generating a default constructor for a containing class that also contains an instance of the iterators' underlying data type" wouldn't work; what if the std::vector iterators in your case were pointing at a different standard set, and were being treated simply as a reference to data managed by some other class? Heck, how is the compiler supposed to know whether the iterators all point to the same std::set?) Ignoring this problem of how the compiler would know when to invoke this special constructor, what would the constructor code look like? Let's try it, using _Ctnr<T>::iterator as our iterator type (I'll use C++11/14isms and be a bit sloppy, but the overall point should be clear):
template <typename T, typename _Ctnr>
std::vector< _Ctnr<T>::iterator> (const std::vector< _Ctnr<T>::iterator>& rhs)
: _data{ /* ... */ } // initialize underlying data...
{
for (auto i& : rhs)
{
_data.emplace_back( /* ... */ ); // What do we put here?
}
}
Okay, so we want each new, copied iterator to be re-seated to refer to a different instance of _Ctnr<T>. But where would this information come from? Note that the copy-constructor can't take the new _Ctnr<T> as an argument: then it would no longer be a copy-constructor. And in any case, how would the compiler know which _Ctnr<T> to provide? (Note, too, that for many containers, finding the "corresponding iterator" for the new container may be non-trivial.)
Resource management with std:: containers
This isn't just an issue of the compiler not being as "smart" as it could or should be. This is an instance where you, the programmer, have a specific design in mind that requires a specific solution. In particular, as mentioned above, you have two resources, both std:: containers. And you have a relationship between them. Here we get to something that most of the other answers have stated, and which by this point should be very, very clear: related class members require special care, since C++ does not manage this coupling by default. But what I hope is also clear by this point is that you shouldn't think of the problem as arising specifically because of data-member coupling; the problem is simply that default-construction isn't magic, and the programmer must be aware of the requirements for correctly copying a class before deciding to let the implicitly-generated constructor handle copying.
The elegant solution
...And now we get to aesthetics and opinions. You seem to find it inelegant to be forced to write a copy-constructor when you don't have any raw pointers or arrays in your class that must be manually managed.
But user-defined copy constructors are elegant; allowing you to write them is C++'s elegant solution to the problem of writing correct non-trivial classes.
Admittedly, this seems like a case where the "rule of 3" doesn't quite apply, since there's a clear need to either =delete the copy-constructor or write it yourself, but there's no clear need (yet) for a user-defined destructor. But again, you can't simply program based on rules of thumb and expect everything to work correctly, especially in a low-level language such as C++; you must be aware of the details of (1) what you actually want and (2) how that can be achieved.
So, given that the coupling between your std::set and your std::vector actually creates a non-trivial problem, solving the problem by wrapping them together in a class that correctly implements (or simply deletes) the copy-constructor is actually a very elegant (and idiomatic) solution.
Explicitly defining versus deleting
You mention a potential new "rule of thumb" to follow in your coding practices: "Disable copying by default on all classes I write, unless I can explicitly prove they are correct." While this might be a safer rule of thumb (at least in this case) than the "rule of 3" (especially when your criterion for "do I need to implement the 3" is to check whether a deleter is required), my above caution against relying on rules of thumb still applies.
But I think the solution here is actually simpler than the proposed rule of thumb. You don't need to formally prove the correctness of the default method; you simply need to have a basic idea of what it would do, and what you need it to do.
Above, in my analysis of your particular case, I went into a lot of detail--for instance, I brought up the possibility of "deep-copying iterators". You don't need to go into this much detail to determine whether or not the default copy-constructor will work correctly. Instead, simply imagine what your manually-created copy constructor will look like; you should be able to tell pretty quickly how similar your imaginary explicitly-defined constructor is to the one the compiler would generate.
For example, a class Foo containing a single vector data will have a copy constructor that looks like this:
Foo::Foo(const Foo& rhs)
: data{rhs.data}
{}
Without even writing that out, you know that you can rely on the implicitly-generated one, because it's exactly the same as what you'd have written above.
Now, consider the constructor for your class Foo:
Foo::Foo(const Foo& rhs)
: set{rhs.set}
, vector{ /* somehow use both rhs.set AND rhs.vector */ } // ...????
{}
Right away, given that simply copying vector's members won't work, you can tell that the default constructor won't work. So now you need to decide whether your class needs to be copyable or not.
Copy constructors were traditionally ubiquitous in C++ programs. However, I'm doubting whether there's a good reason to that since C++11.
Even when the program logic didn't need copying objects, copy constructors (usu. default) were often included for the sole purpose of object reallocation. Without a copy constructor, you couldn't store objects in a std::vector or even return an object from a function.
However, since C++11, move constructors have been responsible for object reallocation.
Another use case for copy constructors was, simply, making clones of objects. However, I'm quite convinced that a .copy() or .clone() method is better suited for that role than a copy constructor because...
Copying objects isn't really commonplace. Certainly it's sometimes necessary for an object's interface to contain a "make a duplicate of yourself" method, but only sometimes. And when it is the case, explicit is better than implicit.
Sometimes an object could expose several different .copy()-like methods, because in different contexts the copy might need to be created differently (e.g. shallower or deeper).
In some contexts, we'd want the .copy() methods to do non-trivial things related to program logic (increment some counter, or perhaps generate a new unique name for the copy). I wouldn't accept any code that has non-obvious logic in a copy constructor.
Last but not least, a .copy() method can be virtual if needed, allowing to solve the problem of slicing.
The only cases where I'd actually want to use a copy constructor are:
RAII handles of copiable resources (quite obviously)
Structures that are intended to be used like built-in types, like math vectors or matrices -
simply because they are copied often and vec3 b = a.copy() is too verbose.
Side note: I've considered the fact that copy constructor is needed for CAS, but CAS is needed for operator=(const T&) which I consider redundant basing on the exact same reasoning;
.copy() + operator=(T&&) = default would be preferred if you really need this.)
For me, that's quite enough incentive to use T(const T&) = delete everywhere by default and provide a .copy() method when needed. (Perhaps also a private T(const T&) = default just to be able to write copy() or virtual copy() without boilerplate.)
Q: Is the above reasoning correct or am I missing any good reasons why logic objects actually need or somehow benefit from copy constructors?
Specifically, am I correct in that move constructors took over the responsibility of object reallocation in C++11 completely? I'm using "reallocation" informally for all the situations when an object needs to be moved someplace else in the memory without altering its state.
The problem is what is the word "object" referring to.
If objects are the resources that variables refers to (like in java or in C++ through pointers, using classical OOP paradigms) every "copy between variables" is a "sharing", and if single ownership is imposed, "sharing" becomes "moving".
If objects are the variables themselves, since each variables has to have its own history, you cannot "move" if you cannot / don't want to impose the destruction of a value in favor of another.
Cosider for example std::strings:
std::string a="Aa";
std::string b=a;
...
b = "Bb";
Do you expect the value of a to change, or that code to don't compile? If not, then copy is needed.
Now consider this:
std::string a="Aa";
std::string b=std::move(a);
...
b = "Bb";
Now a is left empty, since its value (better, the dynamic memory that contains it) had been "moved" to b. The value of b is then chaged, and the old "Aa" discarded.
In essence, move works only if explicitly called or if the right argument is "temporary", like in
a = b+c;
where the resource hold by the return of operator+ is clearly not needed after the assignment, hence moving it to a, rather than copy it in another a's held place and delete it is more effective.
Move and copy are two different things. Move is not "THE replacement for copy". It an more efficient way to avoid copy only in all the cases when an object is not required to generate a clone of itself.
Short anwer
Is the above reasoning correct or am I missing any good reasons why logic objects actually need or somehow benefit from copy constructors?
Automatically generated copy constructors are a great benefit in separating resource management from program logic; classes implementing logic do not need to worry about allocating, freeing or copying resources at all.
In my opinion, any replacement would need to do the same, and doing that for named functions feels a bit weird.
Long answer
When considering copy semantics, it's useful to divide types into four categories:
Primitive types, with semantics defined by the language;
Resource management (or RAII) types, with special requirements;
Aggregate types, which simply copy each member;
Polymorphic types.
Primitive types are what they are, so they are beyond the scope of the question; I'm assuming that a radical change to the language, breaking decades of legacy code, won't happen. Polymorphic types can't be copied (while maintaining the dynamic type) without user-defined virtual functions or RTTI shenanigans, so they are also beyond the scope of the question.
So the proposal is: mandate that RAII and aggregate types implement a named function, rather than a copy constructor, if they should be copied.
This makes little difference to RAII types; they just need to declare a differently-named copy function, and users just need to be slightly more verbose.
However, in the current world, aggregate types do not need to declare an explicit copy constructor at all; one will be generated automatically to copy all the members, or deleted if any are uncopyable. This ensures that, as long as all the member types are correctly copyable, so is the aggregate.
In your world, there are two possibilities:
Either the language knows about your copy-function, and can automatically generate one (perhaps only if explicitly requested, i.e. T copy() = default;, since you want explicitness). In my opinion, automatically generating named functions based on the same named function in other types feels more like magic than the current scheme of generating "language elements" (constructors and operator overloads), but perhaps that's just my prejudice speaking.
Or it's left to the user to correctly implement copying semantics for aggregates. This is error-prone (since you could add a member and forget to update the function), and breaks the current clean separation between resource management and program logic.
And to address the points you make in favour:
Copying (non-polymorphic) objects is commonplace, although as you say it's less common now that they can be moved when possible. It's just your opinion that "explicit is better" or that T a(b); is less explicit than T a(b.copy());
Agreed, if an object doesn't have clearly defined copy semantics, then it should have named functions to cover whatever options it offers. I don't see how that affects how normal objects should be copied.
I've no idea why you think that a copy constructor shouldn't be allowed to do things that a named function could, as long as they are part of the defined copy semantics. You argue that copy constructors shouldn't be used because of artificial restrictions that you place on them yourself.
Copying polymorphic objects is an entirely different kettle of fish. Forcing all types to use named functions just because polymorphic ones must won't give the consistency you seem to be arguing for, since the return types would have to be different. Polymorphic copies will need to be dynamically allocated and returned by pointer; non-polymorphic copies should be returned by value. In my opinion, there is little value in making these different operations look similar without being interchangable.
One case where copy constructors come in useful is when implementing the strong exception guarantees.
To illustrate the point, let's consider the resize function of std::vector. The function might be implemented roughly as follows:
void std::vector::resize(std::size_t n)
{
if (n > capacity())
{
T *newData = new T [n];
for (std::size_t i = 0; i < capacity(); i++)
newData[i] = std::move(m_data[i]);
delete[] m_data;
m_data = newData;
}
else
{ /* ... */ }
}
If the resize function were to have a strong exception guarantee we need to ensure that, if an exception is thrown, the state of the std::vector before the resize() call is preserved.
If T has no move constructor, then we will default to the copy constructor. In this case, if the copy constructor throws an exception, we can still provide strong exception guarantee: we simply delete the newData array and no harm to the std::vector has been done.
However, if we were using the move constructor of T and it threw an exception, then we have a bunch of Ts that were moved into the newData array. Rolling this operation back isn't straight-forward: if we try to move them back into the m_data array the move constructor of T may throw an exception again!
To resolve this issue we have the std::move_if_noexcept function. This function will use the move constructor of T if it is marked as noexcept, otherwise the copy constructor will be used. This allows us to implement std::vector::resize in such a way as to provide a strong exception guarantee.
For completeness, I should mention that C++11 std::vector::resize does not provide a strong exception guarantee in all cases. According to www.cplusplus.com we have the the follow guarantees:
If n is less than or equal to the size of the container, the function never throws exceptions (no-throw guarantee).
If n is greater and a reallocation happens, there are no changes in the container in case of exception (strong guarantee) if the type of the elements is either copyable or no-throw moveable.
Otherwise, if an exception is thrown, the container is left with a valid state (basic guarantee).
Here's the thing. Moving is the new default- the new minimum requirement. But copying is still often a useful and convenient operation.
Nobody should bend over backwards to offer a copy constructor anymore. But it is still useful for your users to have copyability if you can offer it simply.
I would not ditch copy constructors any time soon, but I admit that for my own types, I only add them when it becomes clear I need them- not immediately. So far this is very, very few types.
Given a class "A" exists and is correct. What would be some of the negative results of using a reference to "A" instead of a pointer in a class "B". That is:
// In Declaration File
class A;
class B
{
public:
B();
~B();
private:
A& a;
};
// In Definition File
B::B(): a(* new A())
{}
B::~B()
{
delete &a;
}
Omitted extra code for further correctness of "B", such as the copy constructor and assignment operator, just wanted to demonstrate the concept of the question.
The immediate limitations are that:
You cannot alter a reference's value. You can alter the A it refers to, but you cannot reallocate or reassign a during B's lifetime.
a must never be 0.
Thus:
The object is not assignable.
B should not be copy constructible, unless you teach A and its subtypes to clone properly.
B will not be a good candidate as an element of collections types if stored as value. A vector of Bs would likely be implemented most easily as std::vector<B*>, which may introduce further complications (or simplifications, depending on your design).
These may be good things, depending on your needs.
Caveats:
slicing is another problem to be aware of if a is assignable and assignment is reachable within B.
You can't change the object referred to by a after the fact, e.g. on assignment. Also, it makes your type non-POD (the type given would be non-POD anyway due to the private data member anyway, but in some cases it might matter).
But the main disadvantage is probably it might confuse readers of your code.
Of course, by adding a reference member to your class B means that the compiler can no longer generate the implicit default and copy constructors, and assignment operators; and neither a manually written assignment operator can reasign a.
I don't think there are negative results, other than the fact that delete &a may look odd. The fact that the object was created by new is somewhat lost by binding the result to a reference, and it may only matter since the fact that its lifetime has to be controlled by B is not clear.
If you use a reference:
You must provide the value at construction time
You cannot change what it refers to
It cannot be null
It will prevent your class from being assignable
You might perhaps consider using a smart pointer of some sort instead (std::unique_ptr, std::shared_ptr, etc). This has the added benefit of automatically deleting the object for you.
A reference to a dynamically-allocated object violates the principle of least surprise; that is, no one normally expects code written as you have. In general that will make the maintenance cost higher in the future.
The problem with references is that it looks like a alias or other name for same variable but it you look at the machine code generated internally they use constant pointers to do operations, and this can be a performance issue because you may assume within a code that variable arithmetic is taking place but at assembly code level they are manipulating pointers which you may not realize at code level.
You may not want to use pointer manipulation where you want fast integer or float arithmetic.
A problem of "value types" with external resources (like std::vector<T> or std::string) is that copying them tends to be quite expensive, and copies are created implicitly in various contexts, so this tends to be a performance concern. C++0x's answer to this problem is move semantics, which is conceptionally based on the idea of resource pilfering and technically powered by rvalue references.
Does D have anything similar to move semantics or rvalue references?
I believe that there are several places in D (such as returning structs) that D manages to make them moves whereas C++ would make them a copy. IIRC, the compiler will do a move rather than a copy in any case where it can determine that a copy isn't needed, so struct copying is going to happen less in D than in C++. And of course, since classes are references, they don't have the problem at all.
But regardless, copy construction already works differently in D than in C++. Generally, instead of declaring a copy constructor, you declare a postblit constructor: this(this). It does a full memcpy before this(this) is called, and you only make whatever changes are necessary to ensure that the new struct is separate from the original (such as doing a deep copy of member variables where needed), as opposed to creating an entirely new constructor that must copy everything. So, the general approach is already a bit different from C++. It's also generally agreed upon that structs should not have expensive postblit constructors - copying structs should be cheap - so it's less of an issue than it would be in C++. Objects which would be expensive to copy are generally either classes or structs with reference or COW semantics.
Containers are generally reference types (in Phobos, they're structs rather than classes, since they don't need polymorphism, but copying them does not copy their contents, so they're still reference types), so copying them around is not expensive like it would be in C++.
There may very well be cases in D where it could use something similar to a move constructor, but in general, D has been designed in such a way as to reduce the problems that C++ has with copying objects around, so it's nowhere near the problem that it is in C++.
I think all answers completely failed to answer the original question.
First, as stated above, the question is only relevant for structs. Classes have no meaningful move. Also stated above, for structs, a certain amount of move will happen automatically by the compiler under certain conditions.
If you wish to get control over the move operations, here's what you have to do. You can disable copying by annotating this(this) with #disable. Next, you can override C++'s constructor(constructor &&that) by defining this(Struct that). Likewise, you can override the assign with opAssign(Struct that). In both cases, you need to make sure that you destroy the values of that.
For assignment, since you also need to destroy the old value of this, the simplest way is to swap them. An implementation of C++'s unique_ptr would, therefore, look something like this:
struct UniquePtr(T) {
private T* ptr = null;
#disable this(this); // This disables both copy construction and opAssign
// The obvious constructor, destructor and accessor
this(T* ptr) {
if(ptr !is null)
this.ptr = ptr;
}
~this() {
freeMemory(ptr);
}
inout(T)* get() inout {
return ptr;
}
// Move operations
this(UniquePtr!T that) {
this.ptr = that.ptr;
that.ptr = null;
}
ref UniquePtr!T opAssign(UniquePtr!T that) { // Notice no "ref" on "that"
swap(this.ptr, that.ptr); // We change it anyways, because it's a temporary
return this;
}
}
Edit:
Notice I did not define opAssign(ref UniquePtr!T that). That is the copy assignment operator, and if you try to define it, the compiler will error out because you declared, in the #disable line, that you have no such thing.
D have separate value and object semantics :
if you declare your type as struct, it will have value semantic by default
if you declare your type as class, it will have object semantic.
Now, assuming you don't manage the memory yourself, as it's the default case in D - using a garbage collector - you have to understand that object of types declared as class are automatically pointers (or "reference" if you prefer) to the real object, not the real object itself.
So, when passing vectors around in D, what you pass is the reference/pointer. Automatically. No copy involved (other than the copy of the reference).
That's why D, C#, Java and other language don't "need" moving semantic (as most types are object semantic and are manipulated by reference, not by copy).
Maybe they could implement it, I'm not sure. But would they really get performance boost as in C++? By nature, it don't seem likely.
I somehow have the feeling that actually the rvalue references and the whole concept of "move semantics" is a consequence that it's normal in C++ to create local, "temporary" stack objects. In D and most GC languages, it's most common to have objects on the heap, and then there's no overhead with having a temporary object copied (or moved) several times when returning it through a call stack - so there's no need for a mechanism to avoid that overhead too.
In D (and most GC languages) a class object is never copied implicitly and you're only passing the reference around most of the time, so this may mean that you don't need any rvalue references for them.
OTOH, struct objects are NOT supposed to be "handles to resources", but simple value types behaving similar to builtin types - so again, no reason for any move semantics here, IMHO.
This would yield a conclusion - D doesn't have rvalue refs because it doesn't need them.
However, I haven't used rvalue references in practice, I've only had a read on them, so I might have skipped some actual use cases of this feature. Please treat this post as a bunch of thoughts on the matter which hopefully would be helpful for you, not as a reliable judgement.
I think if you need the source to loose the resource you might be in trouble. However being GC'ed you can often avoid needing to worry about multiple owners so it might not be an issue for most cases.
I can't use shared_ptr in my project, no boost :(
So, I'm having a class roughly similar to the one below:
class MyClass
{
private:
std::auto_ptr<MyOtherClass> obj;
};
Now, I want to store the instances of above class in std::vector. Is it safe? I've read here that it's wrong to use std::auto_ptr with STL containers. Does it apply to my situation here?
It is not safe, bacause when container will copy MyClass instnace default copy operator will call copy for all members - and for auto_ptr member too and we will have same situation as you describe in your question ( storing auto_ptr in container )
BTW: for avoid confusion at compile time add
private:
MyClass& operator=( const MyClass& );
MyClass( const MyClass& );
compiler output error if you will try use copy operators, this can save you from hours of debug.
As Neil Butterworth said, auto_ptr is probably not the way to go.
boost::shared_ptr clearly is, but you say you can't use boost.
Let me mention that you could download boost, extract what you need for shared\ptr only using the bcp tool and use boost::shared_ptr. It would only mean a few added hpp files in your project. I believe it's the right way to go.
It is not valid to have an object that contains an auto_ptr in a standard container. You run into undefined behavior. Two common problems:
std::vector<>::resize copies its argument into each created element. The first copy will "succeed" (see below why not), but each further copy will be empty, because the element copied is also empty!
If something during reallocation throws, you can happen to have some elements copied (to a new buffer) - but the copy being thrown away - and other elements not, because push_back must not have any effects if an exception is being thrown. Thus some of your elements are now empty.
As this is all about undefined behavior it does not really matter. But even if we try to come up with this behavior based on what we think is valid, we would fail anyway. All the member functions like push_back, resize and so on have a const reference that takes an object of type T. Thus, a reference of type T const& is tried to copied into elements of the vector. But the implicitly created copy constructor/copy assignment operator looks like T(T&) - that is, it requires a non-const object to be copied from! Good implementations of the Standard library check that, and fail to compile if necessary.
Until the next C++ version, you have to live with this. The next one will support element types that are merely movable. That is, a moved object does not need to be equal to the object moved to. That will allow putting streams, transfer-of-ownership pointers and threads into containers.
See what the Standard says for this (17.4.3.6):
In certain cases (replacement functions, handler functions, operations on types used to instantiate standard library template components), the C++ Standard Library depends on components supplied by a C++ program. If these components do not meet their requirements, the Standard places no requirements on the implementation.
In particular, the effects are undefined in the following cases:
for types used as template arguments when instantiating a template component, if the operations on the type do not implement the semantics of the applicable Requirements subclause (20.1.5, 23.1, 24.1, 26.1).
I've posted a question as a follow-up
to this answer, see
Class containing auto_ptr stored in vector.
Assming your class does not have a user-defined copy constructor, then no, it is probably (see below) not safe. When your class is copied (as will happen when it is added to a vector) the copy constructor of the auto_ptr will be used. This has the weird behaviour of tranferring ownership of the thing being copied to the copy and, so the thing being copied's pointer is now null.
It is possible, though unlikely, that you actually want this behaviour, in which case an auto_ptr is safe. Assuming you do not, you should either:
add a copy constructor to manage the copying
Note this is not enough - see the follow-up question mentioned above for more info.
or:
use a smarter, possibly reference counted pointer, such as one of the boost smart pointers
Copying MyClass object will cause either call to assignment operator or copy constructor. If they are not overloaded to handle auto_ptr<> in unusual way, they will propagate the call to copy constructor (or assignment operator) to the auto_ptr<> member. This may lead to problems described in question you had linked.
The reason why it is not safe to instanciate a vector of auto_pointer is that there is an algorithm : sort(), that will do a copy of one object in your container on the stack. (sort() implements quicksort which needs a "pivot")
And therefore deleting it when going out of scpope of the sort() function.
As well any algorithm, or function of your own that are able to take your container as parameter, and copy one of its object on the stack will cause this issue as a result.
Well in your case, it is simple you must ensure your class does not behaves as an auto_ptr, or ensure you will never call such function/algorithm that can delete your underlying objects. The first solution is best, according to me :)
So your copy constructor and your affectation operator as well, should not give away property of the pointer object.
The best way to achieve that is to wrapp a boost smart pointer instead of an auto_ptr, to make your container safe when calling such function/algorithm.
By the way according to me, defining a better copy constructor/affectation operator to bypass this issue is not a good solution: I can't see a good copy constructor implementation (and affectation operator as well) that could keep safe the result of applying the sort() algorithm.
If you want to use a class that uses auto_ptr in a container, you can just provide a copy-constructor and assignment operator yourself:
class MyClass
{
private:
const std::auto_ptr<MyOtherClass> obj; // Note const here to keep the pointer from being modified.
public:
MyClass(const MyClass &other) : obj(new MyOtherClass(*other.obj)) {}
MyClass &operator=(const MyClass &other)
{
*obj = *other.obj;
return *this;
}
};
But as mentioned elsewhere, the standard lets containers make copies and assignments and assumes that the contained classes will behave in a specific manner that auto_ptr violates. By defining the methods above, you can make a class that contains an auto_ptr behave. Even if your implementation works fine with auto_ptrs, you run the risk of finding another implementation doesn't work. The standard only make guarantees of performance and observable behaviour, not implementation.
It will not work. auto_ptr doesn't count references which means at the first destructor call your pointer will be freed.
Use boost::shared_ptr instead.