C++: assign a integer to an integer pointer [duplicate] - c++

According to " How to get around the warning "rvalue used as lvalue"? ", Visual Studio will merely warn on code such as this:
int bar() {
return 3;
}
void foo(int* ptr) {
}
int main() {
foo(&bar());
}
In C++ it is not allowed to take the address of a temporary (or, at least, of an object referred to by an rvalue expression?), and I thought that this was because temporaries are not guaranteed to even have storage.
But then, although diagnostics may be presented in any form the compiler chooses, I'd still have expected MSVS to error rather than warn in such a case.
So, are temporaries guaranteed to have storage? And if so, why is the above code disallowed in the first place?

Actually, in the original language design it was allowed to take the address of a temporary. As you have noticed correctly, there is no technical reason for not allowing this, and MSVC still allows it today through a non-standard language extension.
The reason why C++ made it illegal is that binding references to temporaries clashes with another C++ language feature that was inherited from C: Implicit type conversion.
Consider:
void CalculateStuff(long& out_param) {
long result;
// [...] complicated calculations
out_param = result;
}
int stuff;
CalculateStuff(stuff); //< this won't compile in ISO C++
CalculateStuff() is supposed to return its result via the output parameter. But what really happens is this: The function accepts a long& but is given an argument of type int. Through C's implicit type conversion, that int is now implicitly converted to a variable of type long, creating an unnamed temporary in the process.
So instead of the variable stuff, the function really operates on an unnamed temporary, and all side-effects applied by that function will be lost once that temporary is destroyed. The value of the variable stuff never changes.
References were introduced to C++ to allow operator overloading, because from the caller's point of view, they are syntactically identical to by-value calls (as opposed to pointer calls, which require an explicit & on the caller's side). Unfortunately it is exactly that syntactical equivalence that leads to troubles when combined with C's implicit type conversion.
Since Stroustrup wanted to keep both features (references and C-compatibility), he introduced the rule we all know today: Unnamed temporaries only bind to const references. With that additional rule, the above sample no longer compiles. Since the problem only occurs when the function applies side-effects to a reference parameter, it is still safe to bind unnamed temporaries to const references, which is therefore still allowed.
This whole story is also described in Chapter 3.7 of Design and Evolution of C++:
The reason to allow references to be initialized by non-lvalues was to allow the distinction between call-by-value and call-by-reference to be a detail specified by the called function and of no interest to the caller. For const references, this is possible; for non-const references it is not. For Release 2.0 the definition of C++ was changed to reflect this.
I also vaguely remember reading in a paper who first discovered this behavior, but I can't remember right now. Maybe someone can help me out?

Certainly temporaries have storage. You could do something like this:
template<typename T>
const T *get_temporary_address(const T &x) {
return &x;
}
int bar() { return 42; }
int main() {
std::cout << (const void *)get_temporary_address(bar()) << std::endl;
}
In C++11, you can do this with non-const rvalue references too:
template<typename T>
T *get_temporary_address(T &&x) {
return &x;
}
int bar() { return 42; }
int main() {
std::cout << (const void *)get_temporary_address(bar()) << std::endl;
}
Note, of course, that dereferencing the pointer in question (outside of get_temporary_address itself) is a very bad idea; the temporary only lives to the end of the full expression, and so having a pointer to it escape the expression is almost always a recipe for disaster.
Further, note that no compiler is ever required to reject an invalid program. The C and C++ standards merely call for diagnostics (ie, an error or warning), upon which the compiler may reject the program, or it may compile a program, with undefined behavior at runtime. If you would like your compiler to strictly reject programs which produce diagnostics, configure it to convert warnings to errors.

You're right in saying that "temporaries are not guaranteed to even have storage", in the sense that the temporary may not be stored in addressable memory. In fact, very often functions compiled for RISC architectures (e.g. ARM) will return values in general use registers and would expect inputs in those registers as well.
MSVS, producing code for x86 architectures, may always produce functions that return their values on the stack. Therefore they're stored in addressable memory and have a valid address.

Temporary objects do have memory. Sometimes the compiler creates temporaries as well. In poth cases these objects are about to go away, i.e. they shouldn't gather important changes by chance. Thus, you can get hold of a temporary only via an rvalue reference or a const reference but not via a non-const reference. Taking the address of an object which about to go away also feels like a dangerous thing and thus isn't supported.
If you are sure you really want a non-const reference or a pointer from a temporary object you can return it from a corresponding member function: you can call non-const member functions on temporaries. And you can return this from this member. However, note that the type system is trying to help you. When you trick it you better know that what you are diing is the Right Thing.

As others mentioned, we all agreed temporaries do have storage.
why is it illegal to take the address of a temporary?
Because temporaries are allocated on stack, the compiler is free to use that address to any other purposes it wants to.
int foo()
{
int myvar=5;
return &myvar;
}
int main()
{
int *p=foo();
print("%d", *p);
return 0;
}
Let's say the address of 'myvar' is 0x1000. This program will most likely print 99 even though it's illegal to access 0x1000 in main(). Though, not necessarily all the time.
With a slight change to the above main():
int foo()
{
int myvar=5;
return &myvar; // address of myvar is 0x1000
}
int main()
{
int *p=foo(); //illegal to access 0x1000 here
print("%d", *p);
fun(p); // passing *that address* to fun()
return 0;
}
void fun(int *q)
{
int a,b; //some variables
print("%d", *q);
}
The second printf is very unlikely to print '5' as the compiler might have even allocated the same portion of stack (which contains 0x1000) for fun() as well. No matter whether it prints '5' for both printfs OR in either of them, it is purely an unintentional side effect on how stack memory is being used/allocated. That's why it's illegal to access an address which is not alive in the scope.

Temporaries do have storage. They are allocated on the stack of the caller (note: might be subject of calling convention, but I think they all use caller's stack):
caller()
{
callee1( Tmp() );
callee2( Tmp() );
}
Compiler will allocate space for the result Tmp() on stack of the caller. You can take address of this memory location - it'll be some address on stack of caller. What compiler does not guarantee is that it will preserve values at this stack address after callee returns. For example, compiler can place there another temporary etc.
EDIT: I believe, it's disallowed to eliminate code like this :
T bar();
T * ptr = &bar();
because it will very likely lead to problems.
EDIT: here is a little test:
#include <iostream>
typedef long long int T64;
T64 ** foo( T64 * fA )
{
std::cout << "Address of tmp inside callee : " << &fA << std::endl;
return ( &fA );
}
int main( void )
{
T64 lA = -1;
T64 lB = -2;
T64 lC = -3;
T64 lD = -4;
T64 ** ptr_tmp = foo( &lA );
std::cout << "**ptr_tmp = *(*ptr_tmp ) = lA\t\t\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lA << std::endl << std::endl;
foo( &lB );
std::cout << "**ptr_tmp = *(*ptr_tmp ) = lB (compiler override)\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lB << std::endl
<< std::endl;
*ptr_tmp = &lC;
std::cout << "Manual override" << std::endl << "**ptr_tmp = *(*ptr_tmp ) = lC (manual override)\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp
<< " = " << lC << std::endl << std::endl;
*ptr_tmp = &lD;
std::cout << "Another attempt to manually override" << std::endl;
std::cout << "**ptr_tmp = *(*ptr_tmp ) = lD (manual override)\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lD << std::endl
<< std::endl;
return ( 0 );
}
Program output GCC:
Address of tmp inside callee : 0xbfe172f0
**ptr_tmp = *(*ptr_tmp ) = lA **0xbfe172f0 = *(0xbfe17328) = -1 = -1
Address of tmp inside callee : 0xbfe172f0
**ptr_tmp = *(*ptr_tmp ) = lB (compiler override) **0xbfe172f0 = *(0xbfe17320) = -2 = -2
Manual override
**ptr_tmp = *(*ptr_tmp ) = lC (manual override) **0xbfe172f0 = *(0xbfe17318) = -3 = -3
Another attempt to manually override
**ptr_tmp = *(*ptr_tmp ) = lD (manual override) **0xbfe172f0 = *(0x804a3a0) = -5221865215862754004 = -4
Program output VC++:
Address of tmp inside callee : 00000000001EFC10
**ptr_tmp = *(*ptr_tmp ) = lA **00000000001EFC10 = *(000000013F42CB10) = -1 = -1
Address of tmp inside callee : 00000000001EFC10
**ptr_tmp = *(*ptr_tmp ) = lB (compiler override) **00000000001EFC10 = *(000000013F42CB10) = -2 = -2
Manual override
**ptr_tmp = *(*ptr_tmp ) = lC (manual override) **00000000001EFC10 = *(000000013F42CB10) = -3 = -3
Another attempt to manually override
**ptr_tmp = *(*ptr_tmp ) = lD (manual override) **00000000001EFC10 = *(000000013F42CB10) = 5356268064 = -4
Notice, both GCC and VC++ reserve on the stack of main hidden local variable(s) for temporaries and MIGHT silently reuse them. Everything goes normal, until last manual override: after last manual override we have additional separate call to std::cout. It uses stack space to where we just wrote something, and as a result we get garbage.
Bottom line: both GCC and VC++ allocate space for temporaries on stack of caller. They might have different strategies on how much space to allocate, how to reuse this space (it might depend on optimizations as well). They both might reuse this space at their discretion and, therefore, it is not safe to take address of a temporary, since we might try to access through this address the value we assume it still has (say, write something there directly and then try to retrieve it), while compiler might have reused it already and overwrote our value.

Related

Force C++ to assign new addresses to arguments

It seems that when I pass different integers directly to a function, C++ assigns them the same address as opposed to assigning different addresses to different values. Is this by design, or an optimization that can be turned off? See the code below for an illustration.
#include <iostream>
const int *funct(const int &x) { return &x; }
int main() {
int a = 3, b = 4;
// different addresses
std::cout << funct(a) << std::endl;
std::cout << funct(b) << std::endl;
// same address
std::cout << funct(3) << std::endl;
std::cout << funct(4) << std::endl;
}
The bigger context of this question is that I am trying to construct a list of pointers to integers that I would add one by one (similar to funct(3)). Since I cannot modify the method definition (similar to funct's), I thought of storing the address of each argument, but they all ended up having the same address.
The function const int *funct(const int &x) takes in a reference that is bound to an int variable.
a and b are int variables, so x can be bound to them, and they will have distinct memory addresses.
Since the function accepts a const reference, that means the compiler will also allow x to be bound to a temporary int variable as well (whereas a non-const reference cannot be bound to a temporary).
When you pass in a numeric literal to x, like funct(3), the compiler creates a temporary int variable to hold the literal value. That temporary variable is valid only for the lifetime of the statement that is making the function call, and then the temporary goes out of scope and is destroyed.
As such, when you are making multiple calls to funct() in separate statements, the compiler is free to reuse the same memory for those temporary variables, eg:
// same address
std::cout << funct(3) << std::endl;
std::cout << funct(4) << std::endl;
Is effectively equivalent to this:
// same address
int temp;
{
temp = 3;
std::cout << funct(temp) << std::endl;
}
{
temp = 4;
std::cout << funct(temp) << std::endl;
}
However, if you make multiple calls to funct() in a single statement, the compiler will be forced to make separate temporary variables, eg:
// different addresses
std::cout << funct(3) << std::endl << funct(4) << std::endl;
Is effectively equivalent to this:
// different addresses
{
int temp1 = 3;
int temp2 = 4;
std::cout << funct(temp1) << std::endl << funct(temp2) << std::endl;
}
Demo
The function
const int *funct(const int &x) { return &x; }
will return the address of whatever x is referencing.
So this will, as you expected, print the address of a:
std::cout << funct(a) << std::endl;
The problem with the expression funct(3) is that it is impossible to make a reference of a constant and pass it as a parameter. A constant doesn't have an address, and therefore for practical reasons C++ doesn't support taking a reference of a constant. What C++ actually does support is making a temporary object, initializing it with the value 3, and taking the reference of that object.
Basically, the compiler will, in this case, translate this:
std::cout << funct(3) << std::endl;
into something equivalent to this:
{
int tmp = 3;
std::cout << funct(tmp) << std::endl;
}
Unless you do something to extend the lifetime of a temporary object, it will go out of scope after the function call (or right before the next sequence point, I am not sure).
Since the temporary created by 3 goes out of scope before you create a temporary from 4, the memory used by the first temporary may be reused for the second temporary.

Why can't assign const initialize while compile by pointer

I'm curious about why my research result is strange
#include <iostream>
int test()
{
return 0;
}
int main()
{
/*include either the next line or the one after*/
const int a = test(); //the result is 1:1
const int a = 0; //the result is 0:1
int* ra = (int*)((void*)&a);
*ra = 1;
std::cout << a << ":" << *ra << std::endl;
return 0;
}
why the constant var initialize while runtime can completely change, but initialize while compile will only changes pointer's var?
The function isn't really that relevant here. In principle you could get same output (0:1) for this code:
int main() {
const int a = 0;
int* ra = (int*)((void*)&a);
*ra = 1;
std::cout << a << ":" << *ra;
}
a is a const int not an int. You can do all sorts of senseless c-casts, but modifiying a const object invokes undefined behavior.
By the way in the above example even for std::cout << a << ":" << a; the compiler would be allowed to emit code that prints 1:0 (or 42:3.1415927). When your code has undefinded behavior anything can happen.
PS: the function and the different outcomes is relevant if you want to study internals of your compiler and what it does to code that is not valid c++ code. For that you best look at the output of the compiler and how it differs in the two cases (https://godbolt.org/).
It is undefined behavior to cast a const variable and change it's value. If you try, anything can happen.
Anyway, what seems to happen, is that the compiler sees const int a = 0;, and, depending on optimization, puts it on the stack, but replaces all usages with 0 (since a will not change!). For *ra = 1;, the value stored in memory actually changes (the stack is read-write), and outputs that value.
For const int a = test();, dependent on optimization, the program actually calls the function test() and puts the value on the stack, where it is modified by *ra = 1, which also changes a.

Why does const_cast remove constness for a pointer but not for a pointer to a const?

I understand that const_cast works with pointers and references.
I'm assuming that the input to const_cast should be a pointer or reference. I want to know why it doesn't remove the constness if the input is a pointer/reference to a const int?
The following code works as expected.
const_cast with multilevel pointers
int main()
{
using std::cout;
#define endl '\n'
const int * ip = new int(123);
const int * ptr = ip;
*const_cast<int*>(ptr) = 321;
cout << "*ip: " << *ip << endl; // value of *ip is changed to 321
}
But when I try a pointer to const int or reference to const int, the value doesn't seem to change.
const_cast with reference to const int
int main()
{
using std::cout;
#define endl '\n'
const int i = 123;
const int & ri = i;
const_cast<int&>(ri) = 321;
cout << "i: " << i << endl; // value in 'i' is 123
}
const_cast with pointer to const int
int main()
{
using std::cout;
#define endl '\n'
const int i = 123;
const int * ri = &i;
*const_cast<int*>(ri) = 321;
cout << "i: " << i << endl; // value in 'i' is 123
}
(1) works as expected, but I'm unable to comprehend why (2) & (3) don't work the way I think though the input to the const_cast is a pointer/reference.
Please help me understand the philosophy behind this. Thanks.
There are two kinds of constness.
Constness of an object is an inherent property of an object. It cannot be changed.
Think of a page in a printed book. It can be viewed as a string of characters, and it cannot be changed. It says what it says and that's it. So it's a const string.
Now think of a blackboard. It may have something written on it. You can wipe that and write something else. So the blackboard is a non-const string.
The other kind of constness is pointer and reference constness. This constness is not an inherent property of the pointed-to object, but a permission. It says you are not allowed to modify the object through this pointer. It says nothing about whether the object itself can be modified.
So if you have a const pointer, you don't necessarily know what it really points to. Maybe it's a book page. Maybe it's a blackboard. The pointer doesn't tell.
Now if you do know somehow that it is indeed a blackboard, you can be nasty and demand permission to go ahead and change what's written on it. That's what const_cast does. It gives you permission to do something.
What happens if you demand permission to modify a string, and it turns out to be a printed page? You get your permission, you go ahead and wipe it... and... What exactly happens is undefined. Perhaps nothing at all. Perhaps the print is smeared and you can neither recognise the original string nor write anything on top. Perhaps your world explodes to little pieces. You can try and see, but there's no guarantee the same thing will happen tomorrow.
(2) and (3) has the same principle, so I'll talk about (2) only.
The line
const_cast<int&>(ri) = 321;
has undefined behavior.
You cannot modify a const object according to the standard, not even with const_cast. If you remove const from a pointer/reference, and modify the pointed/referenced object, the pointed/referenced object must not been declared as const in the first place.
const_cast should be used only, when you have a const pointer to something for some reason, and you know that something is not declared as const.
Modifying the constant via the const_cast is undefined behaviour.
The compiler sees that you are trying to print a constant variable, knows it can never change so compiles:
cout << "i: " << i << endl;
to:
cout << "i: " << 123 << endl;
see: https://godbolt.org/z/bYb0mx. With optimisations enabled it optimises your code to just printing 123: https://godbolt.org/z/4Ttlmj.
Ultimately compilers make assumptions in order to create faster/smaller code, if you enter the realms of undefined behaviour some of those assumptions may be incorrect and produce surprising results.

Does binding a reference actually evaluate the operand?

Consider this code:
int& x=*new int;
Does RHS of the assignment actually dereference the newly-created pointer, leading to UB due to reading uninitialized variable? Or can this be legitimately used to later assign a value like x=5;?
As far as I can tell, none of what you've done involves undefined behavior.
It does, however, immediately create a risk of a memory leak. It can be quickly resolved (since &x would resolve to the address of the leaked memory, and can therefore be deleted) but if you were to leave scope, you would have no way of retrieving that pointer.
Edit: to the point, if you were to write
int& x=*new int;
x = 5;
std::cout << x << std::endl;
std::cin >> x;
std::cout << x << std::endl;
The code would behave as though you had simply declared x as int x;, except that the pointer would also be left dangling after the program exited scope.
You would achieve undefined behavior if you were to attempt to read the uninitialized variable before assigning it a value, but that wouldn't be untrue if x were stack allocated.
I would not advise doing this. Aliasing dynamic memory with a variable that looks static seems dangerous to me. Consider the following code
#include <iostream>
int* dynamicAlloc() {
int& x = *new int;
std::cout << "Inside dynamicAlloc: " << x << std::endl;
std::cout << "Modified: " << (x = 1) << std::endl;
return &x;
}
int main() {
int *dall = dynamicAlloc();;
std::cout << "dall (address): " << dall << std::endl;
std::cout << "dall -> " << *dall << std::endl; // Memory not cleaned
delete dall; // Don't forget to delete
return 0;
}
I get output like this:
Inside dynamicAlloc: -842150451
Modified: 1
dall (address): 00F642A0
dall -> 1
Note how dereferencing dall doesn't result in a segfault. It wasn't deallocated when x fell out of scope. Otherwise a segfault would have likely occurred.
In conclusion:
int& x makes x an alias for the dynamically allocated memory. You can legitimately modify that memory, but it will not cleanup as it falls out of scope automatically easily leading to potential memory leaks.

Returning pointer to local function variable [duplicate]

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Returning the address of local or temporary variable
Can a local variable's memory be accessed outside its scope?
I know that I should not return pointers to local function variables (local stack variables) because when the function return, the variables will be invalid and the stack will be cleaned up unless I made these variables static or allocated them on heap.
The following code demonstrates that:
const char* v1() {
return "ABC";
}
const char* v2() {
string s = "DEF";
return s.c_str();
}
const char* v3() {
static string s = "JHI";
return s.c_str();
}
cout << v1() << endl; // Output: ABC
cout << v2() << endl; // Output: garbage (♀ ╠╠╠╠╠╠╠╠)
cout << v3() << endl; // Output: JHI
However, I returned pointer to a primitive local function variable and I was able to get its value although it is not static, as the following code shows:
int i1() {
int i = 5;
return i;
}
int* i2() {
int i = 6;
return &i;
}
int* i3() {
static int i = 7;
return &i;
}
cout << i1() << endl; // Output: 5
cout << *i2() << endl; // Output: 6 !!
cout << *i3() << endl; // Output: 7
The compiler only gives me warning that I am returning address of local variable or temporary (Visual C++ 2008). Is this behaviour common to all compilers and how the compiler allows me to use pointer to a local function variable to access the value it points to although the variable is invalidated when the function return?
you return an address. Returning an address is valid - always. But in your case, you also dereference it. This is undefined behavior. Theoretically, for undefined behavior, anything can happen. The compiler is even allowed to embed code to format your hard-disc. Practically it will dereference the address without any checks. If it is still accessible, it'll return the value at that address otherwise it'll cause an access violation.
Your address is on the stack, so it is always accessible. Depending on the calls you made in between, the value might still be there or not. So in simple cases, you get the value back, in more complicated cases you won't. It may even work sometimes and sometimes it does not.
For more information, you should read some information on how function calls are made in assembler to understand what the compiler is doing there on the stack (placing parameters, return address, placing local variables, stack cleanup on return, calling conventions).
it can be removed from the stack because it is local but the value will remain until another overwirte it.
it is c++ unsafe language u can do much strange things