C++: Move Semantic with Integer


int c = 2;
int d = std::move(c);
std::cout << "c is: " << c << std::endl;
std::cout << "d is: " << d << std::endl;
This code outputs:
c is: 2
d is: 2
I thought that after move(c) to d, c will be empty. Why does it still have 2 as its value? Can anyone please help me explain this? Thank you.

I thought that after move(c) to d, c will be empty,
Your expectation was misinformed.
why does it still have 2 as its value ?
Fundamental types do not have move constructors. You have simply made a copy. Copying does not modify the source object.
For class types, it would not be safe to assume what the move constructor does exactly, and specifically what state the source object is left in. It is not necessarily guaranteed to be "empty". See the documentation of the class for what it does. If there is no documentation, or documentation doesn't give any guarantees, then you cannot assume anything about the state of the source object.

std::move doesn't move anything (contrary to its name)! It is exactly equivalent to a static_cast to an rvalue reference type.
That is, it is just a cast to an rvalue, more specifically to an xvalue as opposed to a prvalue. Having a cast named move sometimes confuses people; the intent of the name, however, is not to confuse but to make your code more readable.
Using the xvalue, we can trigger the right overload, and in such overloads we can (but aren't required to) use std::swap to take ownership of another object's resources.
For example, a move constructor of a linked list might copy the pointer to the head of the list and store nullptr in the argument instead of allocating and copying individual nodes.
why does it still have 2 as its value
As mentioned, std::move doesn't move anything; the real work of swapping or moving resources is performed by overloads such as the move constructor and move assignment operator. std::move's only task is to cast, so that the compiler can select the right overload (for example, the move constructor in favor of the copy constructor); the actual moving of resources has to be defined by the developer in those overloads. Since fundamental types like int don't have a move constructor, the statement int d = std::move(c); merely copies the value of c to d.
Try this:
#include <iostream>
#include <utility>
void hello(int& a)
{
    std::cout << "LVALUE" << std::endl;
}
void hello(int&& a)
{
    std::cout << "RVALUE" << std::endl;
}
int main(void)
{
    int a = 8;
    hello(a);
    hello(std::move(a));
    return 0;
}

First, as mentioned by @eerorika, moving a fundamental type is equivalent to copying it. The reason for this behavior is pretty clear: move semantics were developed to save computational resources, and you would save nothing (and in fact waste something) by clearing the value of an integer variable that is not going to be used again. Leaving it as it is is the best option here.
Second, a "moved" variable is not necessarily "empty" or "cleared". Formally it may be in any state, but for standard library objects there are some guarantees: (Quoted from here)
Unless otherwise specified, all standard library objects that have been moved from are placed in a valid but unspecified state. That is, only the functions without preconditions, such as the assignment operator, can be safely used on the object after it was moved from.
As a result, a "moved" std::vector may still contain arbitrary elements, and that is perfectly correct. Never assume such a std::vector is (or is not) empty, because code that relies on that assumption might yield undefined behaviour. More generally, make no assumptions about the state of an object that was moved from (except that, for standard library objects, the state is valid).

Related

Is moved variable valid to use after std::move?

I'm having a hard time understanding this: if I std::move a POD from one variable to another, is the source variable still valid to use, or does it act something like a dangling pointer? Does it still point to stack memory?
For example:
int a = 5;
int b = std::move(a); // b owns a's resources now
a = 10; // is this valid? does it have a memory address?
std::cout << a; // prints 10 obviously valid?

Note that std::move does not move its argument; it just casts it to an rvalue reference. What actually moves an object is a constructor or an assignment operator that accepts an rvalue reference.
But int is a built-in type and does not have such a constructor or operator=, so applying std::move to an int will not cause it to get moved.
Putting built-in types aside, the C++ standard library specifies that, unless stated otherwise, a moved-from object of one of its types is left in a valid but unspecified state. Usually this means that we cannot rely on its value, but we can re-assign it.

std::move does nothing to a POD.
int a = 5;
int b = std::move(a);
a is still good after that.
For non-POD types, the moved-from object may be valid for some operations and invalid for others; it all depends on what the move constructor or move assignment operator does.

When you use std::move on a POD type, nothing special happens: it just makes a plain copy, and both the source and the destination are still usable.

Overloading the assignment operator in C++ using return by value

Considering :
class MyObject{
public:
    MyObject();
    MyObject(int,int);
    int x;
    int y;
    MyObject operator =(MyObject rhs);
};
MyObject::MyObject(int xp, int yp){
    x = xp;
    y = yp;
}
MyObject MyObject::operator =(MyObject rhs){
    MyObject temp;
    temp.x = rhs.x;
    temp.y = rhs.y;
    return temp;
}
int main(){
    MyObject one(1,1);
    MyObject two(2,2);
    MyObject three(3,3);
    one = two = three;
    cout << one.x << ", " << one.y;
    cout << two.x << ", " << two.y;
    cout << three.x << ", " << three.y;
}
By doing this, the variables x and y in one, two and three are unchanged. I know that I should update the member variables of MyObject, return by reference, and return *this for proper behaviour. However, what actually happens to the return values in one = two = three? Where does the returned temp actually end up in the chain, step by step?

The call to the assignment operator in
two = three
returns a temporary object as rvalue. This is of type MyObject and is passed on to the next call of the assignment operator
one = t
(I use t to refer to the temporary object.)
Unfortunately, this won't compile because the assignment operator expects a reference MyObject&, and not an rvalue of type MyObject.
(Your code won't compile for various reasons, including uppercased Class and typos, too.)
However, if you were to define an assignment operator that takes an rvalue (i.e. takes the argument by value, const-reference, or indeed by rvalue reference MyObject&& if C++11 is used), the call would work and the temporary object would be copied into the function. Internally, assignments would be made and another temporary object would be returned.
The final temporary object would then go out of scope, i.e. cease to exist. There would be no way to access its contents.
Thanks to Joachim Pileborg and Benjamin Lindley for the helpful comments.
To answer the request for more details: MyObject is a class type, and the C++ Standard includes an entire section on the life cycle of temporary objects of class type (Section 12.2). There are various complex situations that are detailed there in length, and I won't explain them all. But the basic concepts are as follows:
C++ has the notion of expressions. Expressions are, along with declarations and statements, the basic units the code of a program is composed of. For example, a function call f(a,b,c) is an expression, as is an assignment like a = b. Expressions may contain other expressions: a = f(b,c) is a function call nested in an assignment expression. C++ also introduces the concept of full-expressions. In the previous example, c is part of the expression f(b,c), but also of a = f(b,c), and if that is not nested in another expression, we say that a = f(b,c) is the full-expression that lexically contains c.
The Standard defines a variety of situations where temporary objects may be created. One such situation is the returning of an object by value from a function call (aka returning a prvalue, §6.6.3).
The Standard states that the life time of such a temporary object ends when the full-expression that contains it has been fully evaluated:
[...] Temporary objects are destroyed as the last step in evaluating the full-expression (1.9) that (lexically) contains the point where they were created. [...]
(Note. The Standard then goes on to define several exceptions to this rule. The case of the return value of your assignment operator, however, is not such an exception.)
Now, what does it mean that an object (of class type) is destroyed? It means, first and foremost, that its destructor is called (§12.2/3). It also means that the storage for that object can no longer be safely accessed. So if you somehow managed to store the address of the temporary object in a pointer before the evaluation of the full-expression has ended, then dereferencing that pointer after the evaluation has ended generally causes undefined behaviour.
In practice, this may in many cases mean the following – I describe the entire life cycle of the temporary object in one possible scenario:
To provide for storage for the temporary, the compiler makes sure that sufficient stack space is allocated when the function that contains the full-expression is entered (this happens before the full-expression is actually evaluated).
During the evaluation of the assignment expression, the temporary is created. The compiler makes sure that its constructor is called to initialise the space that was allocated for it.
Then the contents of the temporary may be accessed or modified in the course of the evaluation of the full-expression it is part of.
When the expression has been fully evaluated (in your case, this moment corresponds to the end of the line that contains the assignment expression), the destructor for the temporary is called. After that it is no longer safe to access the memory that was allocated for it, although in reality that space will continue to be part of the current stack frame until the evaluation of the function in which all of this happens has finished.
But, again, this is only an example of what may happen. The creation of temporaries is in many situations not actually required. The compiler may perform optimizations that mean the temporary is never actually created. In this case, the compiler must nevertheless ensure that it could have been created, e.g. it must ensure that the required constructors and destructors exist (they may never be called though).

Do built-in types have move semantics?

Consider this code:
#include <iostream>
#include <utility> // std::move
using namespace std;
void Func(int&& i) {
    ++i;
}
int main() {
    int num = 1234;
    cout << "Before: " << num << endl;
    Func(std::move(num));
    cout << "After: " << num << endl;
}
Its output is:
Before: 1234
After: 1235
Clearly, num is being modified inside Func, as it is bound to parameter i after being "converted" to an r-value reference by std::move.
Well, my point:
Moving an object means transferring ownership of resources from one object to another. However, built-in types hold no resources, because they themselves are the resources. It makes no sense to transfer the resources they hold. As shown by the example, num's value is modified. Its resource, itself, is the one being modified.
Do built-in types have move semantics?
Also, is using a built-in type object after it has been moved from (if it is moved at all) well-defined behavior?

And so, is the one shown by the example a well-defined behavior?
Yes, the behaviour shown in the example is the only behaviour allowed by the standard. That is because std::move doesn't move. The things that move are move constructors and move assignment operators.
All std::move does is change an lvalue into an xvalue, so that it can bind to rvalue references. It does not invoke any constructor or anything else. Changing the value category happens at the type level. Nothing happens at runtime.
Rvalue references are still references: they refer to the original object. The function increments the original integer through the reference given.
If a function takes an argument by reference, no copies nor moves happen: the original object is bound to the reference.
If a function takes an argument by value, then we might have a move.
However, fundamental types don't have move constructors. Moves degrade to copies in that case.

Could a smart compiler do all the things std::move does without it being part of the language?

This is a somewhat theoretical question. Although I have a basic understanding of std::move, I'm still not certain whether it provides some additional functionality to the language that theoretically couldn't be achieved with super-smart compilers. I know that code like:
{
std::string s1="STL";
std::string s2(std::move(s1));
std::cout << s1 <<std::endl;
}
is a new semantic behavior not just performance sugar. :D But tbh I guess nobody will use var x after doing std::move(x).
Also, for move-only types (std::unique_ptr<>, std::thread), couldn't the compiler automatically do the move construction and clear the old variable if the type is declared movable?
Again, this would mean that more code would be generated behind the programmer's back (for example, now you can count copy constructor and move constructor calls; with automagic std::move you couldn't do that).

No.
But tbh I guess nobody will use var x after doing std::move(x)
Absolutely not guaranteed. In fact, a decent part of the reason why std::move(x) is not automatically usable by the compiler is because, well, it can't be decided automatically whether or not you intend this. It's explicitly well-defined behaviour.
Also, removing rvalue references would imply that the compiler can automagically write all the move constructors for you. This is definitely not true. D has a similar scheme, but it's a complete failure, because there are numerous useful situations in which the compiler-generated "move constructor" won't work correctly, but you can't change it.
It would also prevent perfect forwarding, which has other uses.
The Committee makes many stupid mistakes, but rvalue references are not one of them.
Edit:
Consider something like this:
int main() {
    std::unique_ptr<int> x = std::make_unique<int>();
    some_func_that_takes_ownership(x);
    int input = 0;
    std::cin >> input;
    if (input == 0)
        some_other_func(x);
}
Ouch. Now what? You can't magic the value of "input" to be known at compile time. This is doubly a problem if the bodies of some_other_func and some_func_that_takes_ownership are unknown. This is the Halting Problem: you can't prove that x is or is not used after some_func_that_takes_ownership.
D fails. I promised an example. Basically, in D, "move" is "binary copy and don't destruct the old". Unfortunately, consider a class with, say, a pointer to itself- something you will find in most string classes, most node-based containers, in designs for std::function, boost::variant, and lots of other similar handy value types. The pointer to the internal buffer will be copied but oh noes! points to the old buffer, not the new one. Old buffer is deallocated - GG your program.

It depends on what you mean by "what move does". To satisfy your curiosity, I think what you're looking for is to be told about the existence of Uniqueness Type Systems and Linear Type Systems.
These are type systems that enforce, at compile time (in the type system), that a value only be referenced by one location, or that no new references be made. std::unique_ptr is the best approximation C++ can provide, given its rather weak type system.
Let's say we had a new storage-class specifier called uniqueref. This is like const, and specifies that the value has a single unique reference; nobody else has the value. It would enable this:
int main()
{
    int* uniqueref x(new int); // only x has this reference
    // unique type feature: error, would no longer be unique
    auto y = x;
    // linear type feature: okay, x no longer usable, z is now the unique owner
    auto z = uniquemove(x);
    // linear type feature: error: x is no longer usable
    *x = 5;
}
(Also interesting to note are the immense optimizations that can be made, knowing a pointer value is really truly only referenced through that pointer. It's a bit like C99's restrict in that respect.)
In terms of what you're asking, since we can now say that a type is uniquely referenced, we can guarantee that it's safe to move. That said, move operations are ultimately user-defined, and can do all sorts of weird stuff if desired, so implicitly doing this is a bad idea in current C++ anyway.
Everything above is obviously not formally thought-out and specified, but should give you an idea of what such a type system might look like. More generally, you probably want an Effect Type System.
But yes, these ideas do exist and are formally researched. C++ is just too established to add them.

Doing this the way you suggest is a lot more complicated than necessary:
std::string s1="STL";
std::string s2(s1);
std::cout << s1 <<std::endl;
In this case, it is fairly sure that a copy is meant. But if you drop the last line, s1 essentially ends its lifetime after the construction of s2.
In a reference counted implementation, the copy constructor for std::string will only increment the reference counter, while the destructor will decrement and delete if it becomes zero.
So the sequence is
(inlined std::string::string(char const *))
determine string length
allocate memory
copy string
initialize reference counter to 1
initialize pointer in string object
(inlined std::string::string(std::string const &))
increment reference counter
copy pointer to string representation
Now the compiler can flatten that, simply initialize the reference counter to 2 and store the pointer twice. Common Subexpression Elimination then finds out that s1 and s2 keep the same pointer value, and merges them into one.
In short, the only difference in generated code should be that the reference counter is initialized to 2.

Construct object with itself as reference?

I just realised that this program compiles and runs (gcc version 4.4.5 / Ubuntu):
#include <iostream>
using namespace std;
class Test
{
public:
// copyconstructor
Test(const Test& other);
};
Test::Test(const Test& other)
{
if (this == &other)
cout << "copying myself" << endl;
else
cout << "copying something else" << endl;
}
int main(int argv, char** argc)
{
Test a(a); // compiles, runs and prints "copying myself"
Test *b = new Test(*b); // compiles, runs and prints "copying something else"
}
I wonder why on earth this even compiles. I assume that (just as in Java) arguments are evaluated before the method / constructor is called, so I suspect that this case must be covered by some "special case" in the language specification?
Questions:
Could someone explain this (preferably by referring to the specification)?
What is the rationale for allowing this?
Is it standard C++ or is it gcc-specific?
EDIT 1: I just realised that I can even write int i = i;
EDIT 2: Even with -Wall and -pedantic the compiler doesn't complain about Test a(a);.
EDIT 3: If I add a method
Test method(Test& t)
{
cout << "in some" << endl;
return t;
}
I can even do Test a(method(a)); without any warnings.

The reason this "is allowed" is that the rules say an identifier's scope starts immediately after the identifier is declared, before its initializer. In the case
int i = i;
the RHS i is "after" the LHS i so i is in scope. This is not always bad:
void *p = (void*)&p; // p contains its own address
because a variable can be addressed without its value being used. In the case of the OP's copy constructor no error can be given easily, since binding a reference to a variable does not require the variable to be initialised: it is equivalent to taking the address of a variable. A legitimate constructor could be:
struct List { List *next; List(List &n) { next = &n; } };
where you see the argument is merely addressed, its value isn't used. In this case a self-reference could actually make sense: the tail of a list is given by a self-reference. Indeed, if you change the type of "next" to a reference, there's little choice since you can't easily use NULL as you might for a pointer.
As usual, the question is backwards. The question is not why an initialisation of a variable can refer to itself, the question is why it can't refer forward. [In Felix, this is possible]. In particular, for types as opposed to variables, the lack of ability to forward reference is extremely broken, since it prevents recursive types being defined other than by using incomplete types, which is enough in C, but not in C++ due to the existence of templates.

I have no idea how this relates to the specification, but this is how I see it:
When you do Test a(a); it allocates space for a on the stack. Therefore the location of a in memory is known to the compiler at the start of main. When the constructor is called (the memory is of course allocated before that), the correct this pointer is passed to it because it's known.
When you do Test *b = new Test(*b);, you need to think of it as two steps. First the object is allocated and constructed, and then the pointer to it is assigned to b. The reason you get the message you get is that you're essentially passing an uninitialized pointer to the constructor, and then comparing it with the actual this pointer of the object (which will eventually get assigned to b, but not before the constructor exits).

The second one where you use new is actually easier to understand; what you're invoking there is exactly the same as:
Test *b;
b = new Test(*b);
and you're actually performing an invalid dereference. Try to add a << &other << to your cout lines in the constructor, and make that
Test *b = (Test *)0xF00D1E44BADD1E5;
to see that you're passing through whatever value a pointer on the stack has been given. If not explicitly initialized, that's undefined. But even if you don't initialize it with some sort of (in)sane default, it'll be different from the return value of new, as you found out.
For the first, think of it as an in-place new. Test a is a local variable not a pointer, it lives on the stack and therefore its memory location is always well defined - this is very much unlike a pointer, Test *b which, unless explicitly initialized to some valid location, will be dangling.
If you write your first instantiation like:
Test a(*(&a));
it becomes clearer what you're invoking there.
I don't know a way to make the compiler disallow (or even warn) about this sort of self-initialization-from-nowhere through the copy constructor.

The first case is (perhaps) covered by 3.8/6:
before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any lvalue which refers to the original object may be used but only in limited ways. Such an lvalue refers to allocated storage (3.7.3.2), and using the properties of the lvalue which do not depend on its value is well-defined.
Since all you're using of a (and other, which is bound to a) before the start of its lifetime is the address, I think you're good: read the rest of that paragraph for the detailed rules.
Beware though that 8.3.2/4 says, "A reference shall be initialized to refer to a valid object or function." There is some question (as a defect report on the standard) what "valid" means in this context, so possibly you can't bind the parameter other to the unconstructed (and hence, "invalid"?) a.
So, I'm uncertain what the standard actually says here - I can use an lvalue, but not bind it to a reference, perhaps, in which case a isn't good, while passing a pointer to a would be OK as long as it's only used in the ways permitted by 3.8/5.
In the case of b, you're using the value before it's initialized (because you dereference it, and also because even if you got that far, &other would be the value of b). This clearly is not good.
As ever in C++, it compiles because it's not a breach of language constraints, and the standard doesn't explicitly require a diagnostic. Imagine the contortions the spec would have to go through in order to mandate a diagnostic when an object is invalidly used in its own initialization, and imagine the data flow analysis that a compiler might have to do to identify complex cases (it may not even be possible at compile time, if the pointer is smuggled through an externally-defined function). Easier to leave it as undefined behavior, unless anyone has any really good suggestions for new spec language ;-)

If you crank your warning levels up, your compiler will probably warn you about using uninitialized stuff. UB doesn't require a diagnostic, so many things that are "obviously" wrong may compile.

I don't know the spec reference, but I do know that accessing an uninitialized pointer always results in undefined behaviour.
When I compile your code in Visual C++ I get:
test.cpp(20): warning C4700: uninitialized local variable 'b' used
