Construct object with itself as reference? - c++

I just realised that this program compiles and runs (gcc version 4.4.5 / Ubuntu):
#include <iostream>
using namespace std;
class Test
{
public:
// copyconstructor
Test(const Test& other);
};
Test::Test(const Test& other)
{
if (this == &other)
cout << "copying myself" << endl;
else
cout << "copying something else" << endl;
}
int main(int argv, char** argc)
{
Test a(a); // compiles, runs and prints "copying myself"
Test *b = new Test(*b); // compiles, runs and prints "copying something else"
}
I wonder why on earth this even compiles. I assume that (just as in Java) arguments are evaluated before the method / constructor is called, so I suspect that this case must be covered by some "special case" in the language specification?
Questions:
Could someone explain this (preferably by referring to the specification)?
What is the rationale for allowing this?
Is it standard C++ or is it gcc-specific?
EDIT 1: I just realised that I can even write int i = i;
EDIT 2: Even with -Wall and -pedantic the compiler doesn't complain about Test a(a);.
EDIT 3: If I add a method
Test method(Test& t)
{
cout << "in some" << endl;
return t;
}
I can even do Test a(method(a)); without any warnings.

The reason this "is allowed" is because the rules say an identifiers scope starts immediately after the identifier. In the case
int i = i;
the RHS i is "after" the LHS i so i is in scope. This is not always bad:
void *p = (void*)&p; // p contains its own address
because a variable can be addressed without its value being used. In the case of the OP's copy constructor no error can be given easily, since binding a reference to a variable does not require the variable to be initialised: it is equivalent to taking the address of a variable. A legitimate constructor could be:
struct List { List *next; List(List &n) { next = &n; } };
where you see the argument is merely addressed, its value isn't used. In this case a self-reference could actually make sense: the tail of a list is given by a self-reference. Indeed, if you change the type of "next" to a reference, there's little choice since you can't easily use NULL as you might for a pointer.
As usual, the question is backwards. The question is not why an initialisation of a variable can refer to itself, the question is why it can't refer forward. [In Felix, this is possible]. In particular, for types as opposed to variables, the lack of ability to forward reference is extremely broken, since it prevents recursive types being defined other than by using incomplete types, which is enough in C, but not in C++ due to the existence of templates.

I have no idea how this relates to the specification, but this is how I see it:
When you do Test a(a); it allocates space for a on the stack. Therefore the location of a in memory is known to the compiler at the start of main. When the constructor is called (the memory is of course allocated before that), the correct this pointer is passed to it because it's known.
When you do Test *b = new Test(*b);, you need to think of it as two steps. First the object is allocated and constructed, and then the pointer to it is assigned to b. The reason you get the message you get is that you're essentially passing in an uninitialized pointer to the constructor, and the comparing it with the actual this pointer of the object (which will eventually get assigned to b, but not before the constructor exits).

The second one where you use new is actually easier to understand; what you're invoking there is exactly the same as:
Test *b;
b = new Test(*b);
and you're actually performing an invalid dereference. Try to add a << &other << to your cout lines in the constructor, and make that
Test *b = (Test *)0xFOOD1E44BADD1E5;
to see that you're passing through whatever value a pointer on the stack has been given. If not explicitly initialized, that's undefined. But even if you don't initialize it with some sort of (in)sane default, it'll be different from the return value of new, as you found out.
For the first, think of it as an in-place new. Test a is a local variable not a pointer, it lives on the stack and therefore its memory location is always well defined - this is very much unlike a pointer, Test *b which, unless explicitly initialized to some valid location, will be dangling.
If you write your first instantiation like:
Test a(*(&a));
it becomes clearer what you're invoking there.
I don't know a way to make the compiler disallow (or even warn) about this sort of self-initialization-from-nowhere through the copy constructor.

The first case is (perhaps) covered by 3.8/6:
before the lifetime of an object has
started but after the storage which
the object will occupy has been
allocated or, after the lifetime of an
object has ended and before the
storage which the object occupied is
reused or released, any lvalue which
refers to the original object may be
used but only in limited ways. Such an
lvalue refers to allocated storage
(3.7.3.2), and using the properties of
the lvalue which do not depend on its
value is well-defined.
Since all you're using of a (and other, which is bound to a) before the start of its lifetime is the address, I think you're good: read the rest of that paragraph for the detailed rules.
Beware though that 8.3.2/4 says, "A reference shall be initialized to refer to a valid object or function." There is some question (as a defect report on the standard) what "valid" means in this context, so possibly you can't bind the parameter other to the unconstructed (and hence, "invalid"?) a.
So, I'm uncertain what the standard actually says here - I can use an lvalue, but not bind it to a reference, perhaps, in which case a isn't good, while passing a pointer to a would be OK as long as it's only used in the ways permitted by 3.8/5.
In the case of b, you're using the value before it's initialized (because you dereference it, and also because even if you got that far, &other would be the value of b). This clearly is not good.
As ever in C++, it compiles because it's not a breach of language constraints, and the standard doesn't explicitly require a diagnostic. Imagine the contortions the spec would have to go through in order to mandate a diagnostic when an object is invalidly used in its own initialization, and imagine the data flow analysis that a compiler might have to do to identify complex cases (it may not even be possible at compile time, if the pointer is smuggled through an externally-defined function). Easier to leave it as undefined behavior, unless anyone has any really good suggestions for new spec language ;-)

If you crank your warning levels up, your compiler will probably warn you about using uninitialized stuff. UB doesn't require a diagnostic, many things that are "obviously" wrong may compile.

I don't know the spec reference, but I do know that accessing an uninitialized pointer always results in undefined behaviour.
When I compile your code in Visual C++ I get:
test.cpp(20): warning C4700:
uninitialized local variable 'b' used

Related

C++: Move Semantic with Integer

int c = 2;
int d = std::move(c);
std::cout << "c is: " << c << std::endl;
std::cout << "d is: " << d << std::endl;
this code output:
c is: 2
d is: 2
I thought that after move(c) to d, c will be empty, why does it still have 2 as its value ? Can you anyone please help me explain this ? Thank you.
I thought that after move(c) to d, c will be empty,
Your expectation was mis informed.
why does it still have 2 as its value ?
Fundamental types do not have move constructors. You have simply made a copy. Copying does not modify the source object.
For class types, it would not be safe to assume what the move constructor does exactly, and specifically what state the source object is left in. It is not necessarily guaranteed to be "empty". See the documentation of the class for what it does. If there is no documentation, or documentation doesn't give any guarantees, then you cannot assume anything about the state of the source object.
std::move doesn't move anything! (contrary to it's name). It is exactly equivalent to a static_cast to an rvalue reference type.
That it, it is just a cast to rvalue- more specifically to an xvalue, as opposed to a prvalue. And it is also true that having a cast named move sometimes confuses people. However the intent of this naming is not to confuse, but rather to make your code more readable.
Using the xvalue, we can trigger the right overload and hence, we can use std::swap in such overloads to take the ownership of another object (but aren't required).
For example, a move constructor of a linked list might copy the pointer to the head of the list and store nullptr in the argument instead of allocating and copying individual nodes.
why does it still have 2 as its value
As mentioned std::move doesn't move and the real job of swapping/moving the resources is being performed by the overloads like move constructor and move assignment. std::move tasks is just to cast so that compiler can call the right overload (for example, move constructor in favor of copy constructor) and the actual moving of resources has to be defined by the software developer in their respective overloads. Since, fundamental types like int doesn't have any move constructor, the statement int c = std::move(a); merely copies the value of a to c.
Try this:
#include <iostream>
#include <utility>
void hello(int& a)
{
std::cout << "LVALUE" << std::endl;
}
void hello(int&& a)
{
std::cout << "RVALUE" << std::endl;
}
int main(void)
{
int a = 8;
hello(a);
hello(std::move(a));
return 0;
}
First, as mentioned by #eerorika. Moving fundamental types is equivalent to copying. The reason of this behavior is pretty clear. Move semantic is developed for saving computational resource, and you clearly saved nothing (but wasted something) by clearing the value of an integer variable, which is not going to be further used. Leave it there is the best here.
Second, a "moved" variable is not necessarily "empty" or "cleared". It may be in any status formally, but for standard library objects there are some guarantees: (Quoted from here)
Unless otherwise specified, all standard library objects that have
been moved from are placed in a valid but unspecified state. That is,
only the functions without preconditions, such as the assignment
operator, can be safely used on the object after it was moved from
As a result, you may see a "moved" std::vector contains random values and it is perfectly correct. Never assume such a std::vector is (or is not) empty because it might yields undefined behaviours. More generally, make no assumption (except the status is valid for standard library object) about the status of a object that was moved from.

Calling non-static member function outside of object's lifetime in C++17

Does the following program have undefined behavior in C++17 and later?
struct A {
void f(int) { /* Assume there is no access to *this here */ }
};
int main() {
auto a = new A;
a->f((a->~A(), 0));
}
C++17 guarantees that a->f is evaluated to the member function of the A object before the call's argument is evaluated. Therefore the indirection from -> is well-defined. But before the function call is entered, the argument is evaluated and ends the lifetime of the A object (see however the edits below). Does the call still have undefined behavior? Is it possible to call a member function of an object outside its lifetime in this way?
The value category of a->f is prvalue by [expr.ref]/6.3.2 and [basic.life]/7 does only disallow non-static member function calls on glvalues referring to the after-lifetime object. Does this imply the call is valid? (Edit: As discussed in the comments I am likely misunderstanding [basic.life]/7 and it probably does apply here.)
Does the answer change if I replace the destructor call a->~A() with delete a or new(a) A (with #include<new>)?
Some elaborating edits and clarifications on my question:
If I were to separate the member function call and the destructor/delete/placement-new into two statements, I think the answers are clear:
a->A(); a->f(0): UB, because of non-static member call on a outside its lifetime. (see edit below, though)
delete a; a->f(0): same as above
new(a) A; a->f(0): well-defined, call on the new object
However in all these cases a->f is sequenced after the first respective statement, while this order is reversed in my initial example. My question is whether this reversal does allow for the answers to change?
For standards before C++17, I initially thought that all three cases cause undefined behavior, already because the evaluation of a->f depends on the value of a, but is unsequenced relative to the evaluation of the argument which causes a side-effect on a. However, this is undefined behavior only if there is an actual side-effect on a scalar value, e.g. writing to a scalar object. However, no scalar object is written to because A is trivial and therefore I would also be interested in what constraint exactly is violated in the case of standards before C++17, as well. In particular, the case with placement-new seems unclear to me now.
I just realized that the wording about the lifetime of objects changed between C++17 and the current draft. In n4659 (C++17 draft) [basic.life]/1 says:
The lifetime of an object o of type T ends when:
if T is a class
type with a non-trivial destructor (15.4), the destructor call starts
[...]
while the current draft says:
The lifetime of an object o of type T ends when:
[...]
if T is a class type, the destructor call starts, or
[...]
Therefore, I suppose my example does have well-defined behavior in C++17, but not he current (C++20) draft, because the destructor call is trivial and the lifetime of the A object isn't ended. I would appreciate clarification on that as well. My original question does still stands even for C++17 for the case of replacing the destructor call with delete or placement-new expression.
If f accesses *this in its body, then there may be undefined behavior for the cases of destructor call and delete expression, however in this question I want to focus on whether the call in itself is valid or not.
Note however that the variation of my question with placement-new would potentially not have an issue with member access in f, depending on whether the call itself is undefined behavior or not. But in that case there might be a follow-up question especially for the case of placement-new because it is unclear to me, whether this in the function would then always automatically refer to the new object or whether it might need to potentially be std::laundered (depending on what members A has).
While A does have a trivial destructor, the more interesting case is probably where it has some side effect about which the compiler may want to make assumptions for optimization purposes. (I don't know whether any compiler uses something like this.) Therefore, I welcome answers for the case where A has a non-trivial destructor as well, especially if the answer differs between the two cases.
Also, from a practical perspective, a trivial destructor call probably does not affect the generated code and (unlikely?) optimizations based on undefined behavior assumptions aside, all code examples will most likely generate code that runs as expected on most compilers. I am more interested in the theoretical, rather than this practical perspective.
This question intends to get a better understanding of the details of the language. I do not encourage anyone to write code like that.
It’s true that trivial destructors do nothing at all, not even end the lifetime of the object, prior to (the plans for) C++20. So the question is, er, trivial unless we suppose a non-trivial destructor or something stronger like delete.
In that case, C++17’s ordering doesn’t help: the call (not the class member access) uses a pointer to the object (to initialize this), in violation of the rules for out-of-lifetime pointers.
Side note: if just one order were undefined, so would be the “unspecified order” prior to C++17: if any of the possibilities for unspecified behavior are undefined behavior, the behavior is undefined. (How would you tell the well-defined option was chosen? The undefined one could emulate it and then release the nasal demons.)
The postfix expression a->f is sequenced before the evaluation of any arguments (which are indeterminately sequenced relative to one another). (See [expr.call])
The evaluation of the arguments is sequenced before the body of the function (even inline functions, see [intro.execution])
The implication, then is that calling the function itself is not undefined behavior. However, accessing any member variables or calling other member functions within would be UB per [basic.life].
So the conclusion is that this specific instance is safe per the wording, but a dangerous technique in general.
You seem to assume that a->f(0) has these steps (in that order for most recent C++ standard, in some logical order for previous versions):
evaluating *a
evaluating a->f (a so called bound member function)
evaluating 0
calling the bound member function a->f on the argument list (0)
But a->f doesn't have either a value or type. It's essentially a non-thing, a meaningless syntax element needed only because the grammar decomposes member access and function call, even on a member function call which by define combines member access and function call.
So asking when a->f is "evaluated" is a meaningless question: there is no such thing as a distinct evaluation step for the a->f value-less, type-less expression.
So any reasoning based on such discussions of order of evaluation of non entity is also void and null.
EDIT:
Actually this is worse than what I wrote, the expression a->f has a phony "type":
E1.E2 is “function of parameter-type-list cv returning T”.
"function of parameter-type-list cv" isn't even something that would be a valid declarator outside a class: one cannot have f() const as a declarator as in a global declaration:
int ::f() const; // meaningless
And inside a class f() const doesn't mean "function of parameter-type-list=() with cv=const”, it means member-function (of parameter-type-list=() with cv=const). There is no proper declarator for proper "function of parameter-type-list cv". It can only exist inside a class; there is no type "function of parameter-type-list cv returning T" that can be declared or that real computable expressions can have.
In addition to what others said:
a->~A(); delete a;
This program has a memory leak which itself is technically not undefined behavior.
However, if you called delete a; to prevent it - that should have been undefined behavior because delete would call a->~A() second time [Section 12.4/14].
a->~A()
Otherwise in reality this is as others suggested - compiler generates machine code along the lines of A* a = malloc(sizeof(A)); a->A(); a->~A(); a->f(0);.
Since no member variables or virtuals all three member functions are empty ({return;}) and do nothing. Pointer a even still points to valid memory.
It will run but debugger may complain of memory leak.
However, using any nonstatic member variables inside f() could have been undefined behavior because you are accessing them after they are (implicitly) destroyed by compiler-generated ~A(). That would likely result in a runtime error if it was something like std::string or std::vector.
delete a
If you replaced a->~A() with expression that invoked delete a; instead then I believe this would have been undefined behavior because pointer a is no longer valid at that point.
Despite that, the code should still run without errors because function f() is empty. If it accessed any member variables it may have crashed or led to random results because the memory for a is deallocated.
new(a) A
auto a = new A; new(a) A; is itself undefined behavior because you are calling A() a second time for the same memory.
In that case calling f() by itself would be valid because a exists but constructing a twice is UB.
It will run fine if A does not contain any objects with constructors allocating memory and such. Otherwise it could lead to memory leaks, etc, but f() would access the "second" copy of them just fine.
I'm not a language lawyer but I took your code snippet and modified it slightly. I wouldn't use this in production code but this seems to produce valid defined results...
#include <iostream>
#include <exception>
struct A {
int x{5};
void f(int){}
int g() { std::cout << x << '\n'; return x; }
};
int main() {
try {
auto a = new A;
a->f((a->~A(), a->g()));
catch(const std::exception& e) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
I'm running Visual Studio 2017 CE with compiler language flag set to /std:c++latest and my IDE's version is 15.9.16 and I get the follow console output and exit program status:
console output
5
IDE exit status output
The program '[4128] Test.exe' has exited with code 0 (0x0).
So this does seem to be defined in the case of Visual Studio, I'm not sure how other compilers will treat this. The destructor is being invoked, however the variable a is still in dynamic heap memory.
Let's try another slight modification:
#include <iostream>
#include <exception>
struct A {
int x{5};
void f(int){}
int g(int y) { x+=y; std::cout << x << '\n'; return x; }
};
int main() {
try {
auto a = new A;
a->f((a->~A(), a->g(3)));
catch(const std::exception& e) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
console output
8
IDE exit status output
The program '[4128] Test.exe' has exited with code 0 (0x0).
This time let's not change the class anymore, but let's make call on a's member afterwards...
int main() {
try {
auto a = new A;
a->f((a->~A(), a->g(3)));
a->g(2);
} catch( const std::exception& e ) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
console output
8
10
IDE exit status output
The program '[4128] Test.exe' has exited with code 0 (0x0).
Here it appears that a.x is maintaining its value after a->~A() is called since new was called on A and delete has not yet been called.
Even more if I remove the new and use a stack pointer instead of allocated dynamic heap memory:
int main() {
try {
A b;
A* a = &b;
a->f((a->~A(), a->g(3)));
a->g(2);
} catch( const std::exception& e ) {
std::cerr << e.what();
return EXIT_FAILURE;
}
return EXIT_SUCCESS;
}
I'm still getting:
console output
8
10
IDE exit status output
When I change my compiler's language flag setting from /c:std:c++latest to /std:c++17 I'm getting the same exact results.
What I'm seeing from Visual Studio it appears to be well defined without producing any UB within the contexts of what I've shown. However as from a language perspective when it concerns the standard I wouldn't rely on this type of code either. The above also doesn't consider when the class has internal pointers both stack-automatic storage as well as dynamic-heap allocation and if the constructor calls new on those internal objects and the destructor calls delete on them.
There are also a bunch of other factors than just the language setting for the compiler such as optimizations, convention calling, and other various compiler flags. It is hard to say and I don't have an available copy of the full latest drafted standard to investigate this any deeper. Maybe this can help you, others who are able to answer your question more thoroughly, and other readers to visualize this type of behavior in action.

How to change const std::shared_ptr after first assignment?

If I define a shared_ptr and a const shared_ptr of the same type, like this:
std::shared_ptr<int> first = std::shared_ptr<int>(new int);
const std::shared_ptr<int> second = std::shared_ptr<int>();
And later try to change the value of the const shared_ptr like this:
second = first;
It cause a compile error (as it should). But even if I try to cast away the const part:
(std::shared_ptr<int>)second = first;
The result of the code above is that second ends up being Empty, while first is untouched (eg ref count is still 1).
How can I change the value of a const shared_ptr after it was originally set? Is this even possible with std's pointer?
Thanks!
It is undefined behavior to modify in any way a variable declared as const outside of its construction or destruction.
const std::shared_ptr<int> second
this is a variable declared as const.
There is no standard compliant way to change what it refers to after construction and before destruction.
That being said, manually calling the destructor and constructing a new shared_ptr in the same spot might be legal, I am uncertain. You definitely cannot refer to said shared_ptr by its original name, and possibly leaving the scope where the original shared_ptr existed is illegal (as the destructor tries to destroy the original object, which the compiler can prove is an empty shared pointer (or a non-empty one) based on how the const object was constructed).
This is a bad idea even if you could make an argument the standard permits it.
const objects cannot be changed.
...
Your cast to a shared_ptr<int> simply creates a temporary copy. It is then assigned to, and the temporary copy is changed. Then the temporary copy is discarded. The const shared_ptr<int> not being modified is expected behavior. The legality of assigning to a temporary copy is because shared_ptr and most of the std library was designed before we had the ability to overload operator= based on the r/lvalue-ness of the left hand side.
...
Now, why is this the case? Actual constness is used by the compiler as an optimization hint.
{
const std::shared_ptr<int> bob = std::make_shared<int>();
}
in the above case, the compiler can know for certain that bob is non-empty at the end of the scope. Nothing can be done to bob that could make it empty and still leave you with defined behavior.
So the compiler can eliminate the branch at the end of the scope when destroying bob that checks if the pointer is null.
Similar optimizations could occur if you pass bob to an inline function that checks for bob's null state; the compiler can omit the check.
Suppose you pass bob to
void secret_code( std::shared_ptr<int> const& );
where the compiler cannot see into the implementation of secret_code. It can assume that secret code will not edit bob.
If it wasn't declared const, secret_code could legally do a const_cast<std::shared_ptr&> on the parameter and set it to null; but if the argument to secret_code is actually const this is undefined behavior. (Any code casting away const is responsible for guaranteeing that no actual modification of an actual const value occurs by doing so)
Without const on bob, the compiler could not guarantee:
{
const std::shared_ptr<int> bob = std::make_shared<int>();
secret_code(bob);
if (bob) {
std::cout << "guaranteed to run"
}
}
that the guaranteed to run string would be printed.
With const on bob, the compiler is free to elimiate the if check above.
...
Now, do not confuse my explanation asto why the standard states you cannot edit const stack variables with "if this doesn't happen there is no problem". The standard states you shall not do it; the consequences if you do it are unbounded and can grow with new versions of your compiler.
...
From comments:
For deserialize process, which is actually a type of constructor that deserialize object from file. C++ is nice, but it got its imperfections and sometimes its OK to search for less orthodox methods.
If it is a constructor, make it a constructor.
In C++17 a function returning a T has basically equal standing to a real constructor in many ways (due to guaranteed elision). In C++14, this isn't quite true (you also need a move constructor, and the compiler needs to elide it).
So a deserialization constructor for a type T in C++ needs to return a T, it cannot take a T by-reference and be a real constructor.
Composing this is a bit of a pain, but it can be done. Using the same code for serialization and deserialization is even more of a pain (I cannot off hand figure out how).

Could a smart compiler do all the things std::move does without it being part of the language?

This is a bit theoretical question, but although I have some basic understanding of the std::move Im still not certain if it provides some additional functionality to the language that theoretically couldnt be achieved with supersmart compilers. I know that code like :
{
std::string s1="STL";
std::string s2(std::move(s1));
std::cout << s1 <<std::endl;
}
is a new semantic behavior not just performance sugar. :D But tbh I guess nobody will use var x after doing std::move(x).
Also for movable only data (std::unique_ptr<>, std::thread) couldnt compiler automatically do the move construction and clearing of the old variable if type is declared movable?
Again this would mean that more code would be generated behind programmers back(for example now you can count cpyctor and movector calls, with automagic std::moving you couldnt do that ).
No.
But tbh I guess nobody will use var x after doing std::move(x)
Absolutely not guaranteed. In fact, a decent part of the reason why std::move(x) is not automatically usable by the compiler is because, well, it can't be decided automatically whether or not you intend this. It's explicitly well-defined behaviour.
Also, removing rvalue references would imply that the compiler can automagically write all the move constructors for you. This is definitely not true. D has a similar scheme, but it's a complete failure, because there are numerous useful situations in which the compiler-generated "move constructor" won't work correctly, but you can't change it.
It would also prevent perfect forwarding, which has other uses.
The Committee make many stupid mistakes, but rvalue references is not one of them.
Edit:
Consider something like this:
int main() {
std::unique_ptr<int> x = make_unique<int>();
some_func_that_takes_ownership(x);
int input = 0;
std::cin >> input;
if (input == 0)
some_other_func(x);
}
Owch. Now what? You can't magic the value of "input" to be known at compile-time. This is doubly a problem if the bodies of some_other_func and some_func_that_takes_ownership are unknown. This is Halting Problem- you can't prove that x is or is not used after some_func_that_takes_ownership.
D fails. I promised an example. Basically, in D, "move" is "binary copy and don't destruct the old". Unfortunately, consider a class with, say, a pointer to itself- something you will find in most string classes, most node-based containers, in designs for std::function, boost::variant, and lots of other similar handy value types. The pointer to the internal buffer will be copied but oh noes! points to the old buffer, not the new one. Old buffer is deallocated - GG your program.
It depends on what you mean by "what move does". To satisfy your curiosity, I think what you're looking to be told about the existence of Uniqueness Type Systems and Linear Type Systems.
These are types systems that enforce, at compile-time (in the type system), that a value only be referenced by one location, or that no new references be made. std::unique_ptr is the best approximation C++ can provide, given its rather weak type system.
Let's say we had a new storage-class specifier called uniqueref. This is like const, and specifies that the value has a single unique reference; nobody else has the value. It would enable this:
int main()
{
int* uniqueref x(new int); // only x has this reference
// unique type feature: error, would no longer be unique
auto y = x;
// linear type feature: okay, x not longer usable, z is now the unique owner
auto z = uniquemove(x);
// linear type feature: error: x is no longer usable
*x = 5;
}
(Also interesting to note the immense optimizations that can be taking, knowing a pointer value is really truly only referenced through that pointer. It's a bit like C99's restrict in that aspect.)
In terms of what you're asking, since we can now say that a type is uniquely referenced, we can guarantee that it's safe to move. That said, move operates are ultimately user-defined, and can do all sorts of weird stuff if desired, so implicitly doing this is a bad idea in current C++ anyway.
Everything above is obviously not formally thought-out and specified, but should give you an idea of what such a type system might look like. More generally, you probably want an Effect Type System.
But yes, these ideas do exist and are formally researched. C++ is just too established to add them.
Doing this the way you suggest is a lot more complicated than necessary:
std::string s1="STL";
std::string s2(s1);
std::cout << s1 <<std::endl;
In this case, it is fairly sure that a copy is meant. But if you drop the last line, s1 essentially ends its lifetime after the construction of s2.
In a reference counted implementation, the copy constructor for std::string will only increment the reference counter, while the destructor will decrement and delete if it becomes zero.
So the sequence is
(inlined std::string::string(char const *))
determine string length
allocate memory
copy string
initialize reference counter to 1
initialize pointer in string object
(inlined std::string::string(std::string const &))
increment reference counter
copy pointer to string representation
Now the compiler can flatten that, simply initialize the reference counter to 2 and store the pointer twice. Common Subexpression Elimination then finds out that s1 and s2 keep the same pointer value, and merges them into one.
In short, the only difference in generated code should be that the reference counter is initialized to 2.

const casting an int in a class vs outside a class

I read on the wikipedia page for Null_pointer that Bjarne Stroustrup suggested defining NULL as
const int NULL = 0;
if "you feel you must define NULL." I instantly thought, hey.. wait a minute, what about const_cast?
After some experimenting, I found that
int main() {
const int MyNull = 0;
const int* ToNull = &MyNull;
int* myptr = const_cast<int*>(ToNull);
*myptr = 5;
printf("MyNull is %d\n", MyNull);
return 0;
}
would print "MyNull is 0", but if I make the const int belong to a class:
class test {
public:
test() : p(0) { }
const int p;
};
int main() {
test t;
const int* pptr = &(t.p);
int* myptr = const_cast<int*>(pptr);
*myptr = 5;
printf("t.p is %d\n", t.p);
return 0;
}
then it prints "t.p is 5"!
Why is there a difference between the two? Why is "*myptr = 5;" silently failing in my first example, and what action is it performing, if any?
First of all, you're invoking undefined behavior in both cases by trying to modify a constant variable.
In the first case the compiler sees that MyNull is declared as a constant and replaces all references to it within main() with a 0.
In the second case, since p is within a class the compiler is unable to determine that it can just replace all classInstance.p with 0, so you see the result of the modification.
Firstly, what happens in the first case is that the compiler most likely translates your
printf("MyNull is %d\n", MyNull);
into the immediate
printf("MyNull is %d\n", 0);
because it knows that const objects never change in a valid program. Your attempts to change a const object leads to undefined behavior, which is exactly what you observe. So, ignoring the undefined behavior for a second, from the practical point of view it is quite possible that your *myptr = 5 successfully modified your Null. It is just that your program doesn't really care what you have in your Null now. It knows that Null is zero and will always be zero and acts accordingly.
Secondly, in order to define NULL per recommendation you were referring to, you have to define it specifically as an Integral Constant Expression (ICE). Your first variant is indeed an ICE. You second variant is not. Class member access is not allowed in ICE, meaning that your second variant is significantly different from the first. The second variant does not produce a viable definition for NULL, and you will not be able to initialize pointers with your test::p even though it is declared as const int and set to zero
SomeType *ptr1 = Null; // OK
test t;
SomeType *ptr2 = t.p; // ERROR: cannot use an `int` value to initialize a pointer
As for the different output in the second case... undefined behavior is undefined behavior. It is unpredictable. From the practical point of view, your second context is more complicated, so the compiler was unable to prefrom the above optimization. i.e. you are indeed succeeded in breaking through the language-level restrictions and modifying a const-qualified variable. Language specification does not make it easy (or possible) for the compilers to optimize out const members of the class, so at the physical level that p is just another member of the class that resides in memory, in each object of that class. Your hack simply modifies that memory. It doesn't make it legal though. The behavior si still undefined.
This all, of course, is a rather pointless exercise. It looks like it all began from the "what about const_cast" question. So, what about it? const_cast has never been intended to be used for that purpose. You are not allowed to modify const objects. With const_cast, or without const_cast - doesn't matter.
Your code is modifying a variable declared constant so anything can happen. Discussing why a certain thing happens instead of another one is completely pointless unless you are discussing about unportable compiler internals issues... from a C++ point of view that code simply doesn't have any sense.
About const_cast one important thing to understand is that const cast is not for messing about variables declared constant but about references and pointers declared constant.
In C++ a const int * is often understood to be a "pointer to a constant integer" while this description is completely wrong. For the compiler it's instead something quite different: a "pointer that cannot be used for writing to an integer object".
This may apparently seem a minor difference but indeed is a huge one because
The "constness" is a property of the pointer, not of the pointed-to object.
Nothing is said about the fact that the pointed to object is constant or not.
The word "constant" has nothing to do with the meaning (this is why I think that using const it was a bad naming choice). const int * is not talking about constness of anything but only about "read only" or "read/write".
const_cast allows you to convert between pointers and references that can be used for writing and pointer or references that cannot because they are "read only". The pointed to object is never part of this process and the standard simply says that it's legal to take a const pointer and using it for writing after "casting away" const-ness but only if the pointed to object has not been declared constant.
Constness of a pointer and a reference never affects the machine code that will be generated by a compiler (another common misconception is that a compiler can produce better code if const references and pointers are used, but this is total bogus... for the optimizer a const reference and a const pointer are just a reference and a pointer).
Constness of pointers and references has been introduced to help programmers, not optmizers (btw I think that this alleged help for programmers is also quite questionable, but that's another story).
const_cast is a weapon that helps programmers fighting with broken const-ness declarations of pointers and references (e.g. in libraries) and with the broken very concept of constness of references and pointers (before mutable for example casting away constness was the only reasonable solution in many real life programs).
Misunderstanding of what is a const reference is also at the base of a very common C++ antipattern (used even in the standard library) that says that passing a const reference is a smart way to pass a value. See this answer for more details.