Returning pointer to local function variable [duplicate] - c++

This question already has answers here:
Closed 11 years ago.
Possible Duplicate:
Returning the address of local or temporary variable
Can a local variable's memory be accessed outside its scope?
I know that I should not return pointers to local function variables (local stack variables) because when the function return, the variables will be invalid and the stack will be cleaned up unless I made these variables static or allocated them on heap.
The following code demonstrates that:
const char* v1() {
return "ABC";
}
const char* v2() {
string s = "DEF";
return s.c_str();
}
const char* v3() {
static string s = "JHI";
return s.c_str();
}
cout << v1() << endl; // Output: ABC
cout << v2() << endl; // Output: garbage (♀ ╠╠╠╠╠╠╠╠)
cout << v3() << endl; // Output: JHI
However, I returned pointer to a primitive local function variable and I was able to get its value although it is not static, as the following code shows:
int i1() {
int i = 5;
return i;
}
int* i2() {
int i = 6;
return &i;
}
int* i3() {
static int i = 7;
return &i;
}
cout << i1() << endl; // Output: 5
cout << *i2() << endl; // Output: 6 !!
cout << *i3() << endl; // Output: 7
The compiler only gives me warning that I am returning address of local variable or temporary (Visual C++ 2008). Is this behaviour common to all compilers and how the compiler allows me to use pointer to a local function variable to access the value it points to although the variable is invalidated when the function return?

you return an address. Returning an address is valid - always. But in your case, you also dereference it. This is undefined behavior. Theoretically, for undefined behavior, anything can happen. The compiler is even allowed to embed code to format your hard-disc. Practically it will dereference the address without any checks. If it is still accessible, it'll return the value at that address otherwise it'll cause an access violation.
Your address is on the stack, so it is always accessible. Depending on the calls you made in between, the value might still be there or not. So in simple cases, you get the value back, in more complicated cases you won't. It may even work sometimes and sometimes it does not.
For more information, you should read some information on how function calls are made in assembler to understand what the compiler is doing there on the stack (placing parameters, return address, placing local variables, stack cleanup on return, calling conventions).

it can be removed from the stack because it is local but the value will remain until another overwirte it.
it is c++ unsafe language u can do much strange things

Related

C++: assign a integer to an integer pointer [duplicate]

According to " How to get around the warning "rvalue used as lvalue"? ", Visual Studio will merely warn on code such as this:
int bar() {
return 3;
}
void foo(int* ptr) {
}
int main() {
foo(&bar());
}
In C++ it is not allowed to take the address of a temporary (or, at least, of an object referred to by an rvalue expression?), and I thought that this was because temporaries are not guaranteed to even have storage.
But then, although diagnostics may be presented in any form the compiler chooses, I'd still have expected MSVS to error rather than warn in such a case.
So, are temporaries guaranteed to have storage? And if so, why is the above code disallowed in the first place?
Actually, in the original language design it was allowed to take the address of a temporary. As you have noticed correctly, there is no technical reason for not allowing this, and MSVC still allows it today through a non-standard language extension.
The reason why C++ made it illegal is that binding references to temporaries clashes with another C++ language feature that was inherited from C: Implicit type conversion.
Consider:
void CalculateStuff(long& out_param) {
long result;
// [...] complicated calculations
out_param = result;
}
int stuff;
CalculateStuff(stuff); //< this won't compile in ISO C++
CalculateStuff() is supposed to return its result via the output parameter. But what really happens is this: The function accepts a long& but is given an argument of type int. Through C's implicit type conversion, that int is now implicitly converted to a variable of type long, creating an unnamed temporary in the process.
So instead of the variable stuff, the function really operates on an unnamed temporary, and all side-effects applied by that function will be lost once that temporary is destroyed. The value of the variable stuff never changes.
References were introduced to C++ to allow operator overloading, because from the caller's point of view, they are syntactically identical to by-value calls (as opposed to pointer calls, which require an explicit & on the caller's side). Unfortunately it is exactly that syntactical equivalence that leads to troubles when combined with C's implicit type conversion.
Since Stroustrup wanted to keep both features (references and C-compatibility), he introduced the rule we all know today: Unnamed temporaries only bind to const references. With that additional rule, the above sample no longer compiles. Since the problem only occurs when the function applies side-effects to a reference parameter, it is still safe to bind unnamed temporaries to const references, which is therefore still allowed.
This whole story is also described in Chapter 3.7 of Design and Evolution of C++:
The reason to allow references to be initialized by non-lvalues was to allow the distinction between call-by-value and call-by-reference to be a detail specified by the called function and of no interest to the caller. For const references, this is possible; for non-const references it is not. For Release 2.0 the definition of C++ was changed to reflect this.
I also vaguely remember reading in a paper who first discovered this behavior, but I can't remember right now. Maybe someone can help me out?
Certainly temporaries have storage. You could do something like this:
template<typename T>
const T *get_temporary_address(const T &x) {
return &x;
}
int bar() { return 42; }
int main() {
std::cout << (const void *)get_temporary_address(bar()) << std::endl;
}
In C++11, you can do this with non-const rvalue references too:
template<typename T>
T *get_temporary_address(T &&x) {
return &x;
}
int bar() { return 42; }
int main() {
std::cout << (const void *)get_temporary_address(bar()) << std::endl;
}
Note, of course, that dereferencing the pointer in question (outside of get_temporary_address itself) is a very bad idea; the temporary only lives to the end of the full expression, and so having a pointer to it escape the expression is almost always a recipe for disaster.
Further, note that no compiler is ever required to reject an invalid program. The C and C++ standards merely call for diagnostics (ie, an error or warning), upon which the compiler may reject the program, or it may compile a program, with undefined behavior at runtime. If you would like your compiler to strictly reject programs which produce diagnostics, configure it to convert warnings to errors.
You're right in saying that "temporaries are not guaranteed to even have storage", in the sense that the temporary may not be stored in addressable memory. In fact, very often functions compiled for RISC architectures (e.g. ARM) will return values in general use registers and would expect inputs in those registers as well.
MSVS, producing code for x86 architectures, may always produce functions that return their values on the stack. Therefore they're stored in addressable memory and have a valid address.
Temporary objects do have memory. Sometimes the compiler creates temporaries as well. In poth cases these objects are about to go away, i.e. they shouldn't gather important changes by chance. Thus, you can get hold of a temporary only via an rvalue reference or a const reference but not via a non-const reference. Taking the address of an object which about to go away also feels like a dangerous thing and thus isn't supported.
If you are sure you really want a non-const reference or a pointer from a temporary object you can return it from a corresponding member function: you can call non-const member functions on temporaries. And you can return this from this member. However, note that the type system is trying to help you. When you trick it you better know that what you are diing is the Right Thing.
As others mentioned, we all agreed temporaries do have storage.
why is it illegal to take the address of a temporary?
Because temporaries are allocated on stack, the compiler is free to use that address to any other purposes it wants to.
int foo()
{
int myvar=5;
return &myvar;
}
int main()
{
int *p=foo();
print("%d", *p);
return 0;
}
Let's say the address of 'myvar' is 0x1000. This program will most likely print 99 even though it's illegal to access 0x1000 in main(). Though, not necessarily all the time.
With a slight change to the above main():
int foo()
{
int myvar=5;
return &myvar; // address of myvar is 0x1000
}
int main()
{
int *p=foo(); //illegal to access 0x1000 here
print("%d", *p);
fun(p); // passing *that address* to fun()
return 0;
}
void fun(int *q)
{
int a,b; //some variables
print("%d", *q);
}
The second printf is very unlikely to print '5' as the compiler might have even allocated the same portion of stack (which contains 0x1000) for fun() as well. No matter whether it prints '5' for both printfs OR in either of them, it is purely an unintentional side effect on how stack memory is being used/allocated. That's why it's illegal to access an address which is not alive in the scope.
Temporaries do have storage. They are allocated on the stack of the caller (note: might be subject of calling convention, but I think they all use caller's stack):
caller()
{
callee1( Tmp() );
callee2( Tmp() );
}
Compiler will allocate space for the result Tmp() on stack of the caller. You can take address of this memory location - it'll be some address on stack of caller. What compiler does not guarantee is that it will preserve values at this stack address after callee returns. For example, compiler can place there another temporary etc.
EDIT: I believe, it's disallowed to eliminate code like this :
T bar();
T * ptr = &bar();
because it will very likely lead to problems.
EDIT: here is a little test:
#include <iostream>
typedef long long int T64;
T64 ** foo( T64 * fA )
{
std::cout << "Address of tmp inside callee : " << &fA << std::endl;
return ( &fA );
}
int main( void )
{
T64 lA = -1;
T64 lB = -2;
T64 lC = -3;
T64 lD = -4;
T64 ** ptr_tmp = foo( &lA );
std::cout << "**ptr_tmp = *(*ptr_tmp ) = lA\t\t\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lA << std::endl << std::endl;
foo( &lB );
std::cout << "**ptr_tmp = *(*ptr_tmp ) = lB (compiler override)\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lB << std::endl
<< std::endl;
*ptr_tmp = &lC;
std::cout << "Manual override" << std::endl << "**ptr_tmp = *(*ptr_tmp ) = lC (manual override)\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp
<< " = " << lC << std::endl << std::endl;
*ptr_tmp = &lD;
std::cout << "Another attempt to manually override" << std::endl;
std::cout << "**ptr_tmp = *(*ptr_tmp ) = lD (manual override)\t\t**" << ptr_tmp << " = *(" << *ptr_tmp << ") = " << **ptr_tmp << " = " << lD << std::endl
<< std::endl;
return ( 0 );
}
Program output GCC:
Address of tmp inside callee : 0xbfe172f0
**ptr_tmp = *(*ptr_tmp ) = lA **0xbfe172f0 = *(0xbfe17328) = -1 = -1
Address of tmp inside callee : 0xbfe172f0
**ptr_tmp = *(*ptr_tmp ) = lB (compiler override) **0xbfe172f0 = *(0xbfe17320) = -2 = -2
Manual override
**ptr_tmp = *(*ptr_tmp ) = lC (manual override) **0xbfe172f0 = *(0xbfe17318) = -3 = -3
Another attempt to manually override
**ptr_tmp = *(*ptr_tmp ) = lD (manual override) **0xbfe172f0 = *(0x804a3a0) = -5221865215862754004 = -4
Program output VC++:
Address of tmp inside callee : 00000000001EFC10
**ptr_tmp = *(*ptr_tmp ) = lA **00000000001EFC10 = *(000000013F42CB10) = -1 = -1
Address of tmp inside callee : 00000000001EFC10
**ptr_tmp = *(*ptr_tmp ) = lB (compiler override) **00000000001EFC10 = *(000000013F42CB10) = -2 = -2
Manual override
**ptr_tmp = *(*ptr_tmp ) = lC (manual override) **00000000001EFC10 = *(000000013F42CB10) = -3 = -3
Another attempt to manually override
**ptr_tmp = *(*ptr_tmp ) = lD (manual override) **00000000001EFC10 = *(000000013F42CB10) = 5356268064 = -4
Notice, both GCC and VC++ reserve on the stack of main hidden local variable(s) for temporaries and MIGHT silently reuse them. Everything goes normal, until last manual override: after last manual override we have additional separate call to std::cout. It uses stack space to where we just wrote something, and as a result we get garbage.
Bottom line: both GCC and VC++ allocate space for temporaries on stack of caller. They might have different strategies on how much space to allocate, how to reuse this space (it might depend on optimizations as well). They both might reuse this space at their discretion and, therefore, it is not safe to take address of a temporary, since we might try to access through this address the value we assume it still has (say, write something there directly and then try to retrieve it), while compiler might have reused it already and overwrote our value.

Function returns null pointer instead of address

I'm playing around with pointers to understand this concept better
and wanted to ask
Why do i get null pointer as return for the second function?
and why it isn't possible to get the address 0x7fff15504044.
What is happening and where inside memory is the integer 5 stored,
when im working with it inside the function?.
#include <iostream>
using namespace std;
int* return_adress(int* input){ return input; }
int* return_adress_from_input(int input){ return &input; }
int main(){
int k = 3;
cout << return_adress(&k) << endl;
cout << return_adress_from_input(k) << endl;
}
Output:
0x7fff15504044
0
With int* return_adress_from_input(int input), input is a value copy of k in the caller. They are two different variables therefore with different addresses.
input goes out of scope conceptually once the closing brace of the function is reached.
The pointer &input then points to memory that you no longer own, and the behaviour of reading that pointer value (let alone dereferencing it) is undefined prior to C++14, and implementation defined from and including C++14.
Because you pass input by value, not by reference. Compiller first creates a local copy of input and then returns address of this local copy. To get the variable address use
int* return_adress_from_input(int& input){ return &input; }
In general, you get undefined behavior which can lead to returning nullptr in the particular case
On my PC i get
00AFFC80
00AFFBA8
so you are just lucky with zero return

Function Output Problems [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 4 years ago.
I made a function to convert an array of ints (a zipcode) into a cstring. If I call the function, the return is gibberish, but if I cout the returned variable inside the function it's exactly as expected.
const char* print_zip(const int* zip) {
char output[6];
char* ctemp = output;
const int *itemp = zip;
for (int i = 0; i < 5; i++) {
*ctemp = *itemp + 48; // convert to char numbers.
itemp++; // iterate using pointers rather than []
ctemp++; // per assignment specifications
}
*ctemp = '\0';
std::cout << output << std::endl; // (debug only) as expected
return output; // cout << print_zip(); not as expected
}
When you cout the return variable inside the function, it should be expected to return the value wanted. This is because when you create a local variable inside a function it resides in that function's stack frame. Thus, when STILL inside the function, the scope is still set to see that variable.
On the contrary, when the pointer to the local variable is returned, the stack frame no longer exists, basically meaning the pointed to object could or could not be complete junk.
In order to return a variable that will be constant within or out of the stack frame, you would need to pass it by reference, which is usually done by making a copy of object your trying to return, then returning that copy.

Behaviour of reference(&) data member pointing to stack variable

I have come across a sample source code regarding use reference data member and i am confused about output. Here is sample code.
class Test {
private:
int &t;
public:
Test (int y):t(y) { }
int getT() { return t; }
};
int main() {
int x = 20;
Test t1(x);
cout << t1.getT() << "\n"; // Prints 20 as output. however y has already been destroyed but still prints 20.
x = 30;
cout << t1.getT() << endl; // Prints Garbage as output Why ? Ideally both steps should be Garbage.
return 0;
}
And to add for more confusion here is one more piece of code for same class
int main() {
int x = 20;
int z = 60;
Test t1(x);
Test t2(z);
cout<<t1.getT()<<"\n"; // Prints 60! WHY? Should print garbage
cout<<t2.getT() << "\n"; // Prints Garbage
cout<<t1.getT() << endl; // Prints Same Garbage value as previous expression
return 0;
}
x is passed by value using a temporary, so t is a reference to that temporary, not x. That temporary will be destroyed after constructor returns. Your code has undefined behavior. anything can come up as output. Your problem can be solved by passing a reference to x like
Test (int& y):t(y);
but this is not a good idea. There can be cases where x goes out of scope but the Test object is still used , then the same problem will appear.
Your constructor:
Test (int y):t(y) { }
sets t to be a reference to y, the local (temporary) variable on the stack, and not the variable in the calling function. When you change the variable value in the calling function it does not change anything in the object you created.
The fact that the reference is to a temporary variable that is lost at the end of the life of the constructor means that getT() returns an undefined value.
Every call to int getT() accesses the memory address for y. That memory address was released from the stack at the end of the constructor, so it points to memory that is not on the stack or the heap and so may be reused at any time. The time of reuse is not defined and depends on other operations established by the compiler and dependency libraries. The return value of int getT() therefor depends on other elements on your OS that affect memory, the compiler type and version, and the OS amongst other things.
Now i got it. Yes it is undefined but to answer my question why it is printing 20 or 60 before printing garbage? Actually answer is that 20 and 60 both values are garbage and ideally both getT function calls should print Garbage but it doesn't.Because there is no other instruction between Test t2(z);
cout<<t1.getT()<<"\n";
but for next statement \n works as a instruction and meanwhile stack clears the value.

C++, Accessing a non-global variable declared inside other method [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
(20 answers)
Closed 9 years ago.
This code takes value returned from a function, creates and puts it in a new address space called variable 'b'
int main()
{
int b;
b = topkek();
std::cout << &b;
};
int topkek()
{
int kek = 2;
return kek;
};
Now I understand because variable kek was inside of topkek method I can not access it from main() method. With C++ and pointers, I figured out how I could access a variable declared inside a method even after the method has terminated, look at this code.
int main()
{
int * a;
a = topkek(); //a is now pointing at kek
std::cout << *a; //outputs 2, the value of kek.
*a = 3;
std::cout << *a; //output 3, changed the value of kek.
std::cin >> *a;
};
int * topkek()
{
int kek = 2;
int* b = &kek; //intializing pointer b to point to kek's address in stack
return b; //returning pointer to kek
};
Is this method safe? Would the compiler prevent kek's address space being overwritten later in code because it still understand it's being used?
This is undefined behavior: once the function finishes, accessing a variable inside it by pointer or reference makes the program invalid, and may cause a crash.
It is perfectly valid, however, to access a variable when you go "back" on the stack:
void assign(int* ptr) {
*ptr = 1234;
}
int main() {
int kek = 5;
cout << kek << endl;
assign(&kek);
cout << kek << endl;
}
Note how assign has accessed the value of a local variable declared inside another function. This is legal, because main has not finished at the time the access has happened.
No it is not safe. After your function finished execution, you are not pointing to an int anymore but just to some random place in memory. Its value could be anything since it can be modified elsewhere in your program.
Most definitely not safe. Undefined behavior will result.
When a local variable is "created", the compiler does so by giving the variable some space on the the stack [1]. This space is available for the variable as long as you are inside that function. When the function returns, the space is released, and available for other functions to use. So the address of b in your second code is returning a pointer to memory that will be releasted just after the return finishes.
Try adding something like this:
int foo()
{
int x = 42;
cout << "x = " << x << endl;
return x;
}
and call foo after your call to topkek, and it's pretty certain that the value in kek (or pointed to by kek) will change.
[1] For the pedants: Yes, the C++ standard doesn't specify that there needs to be a stack, or how local variables are supposed to be used, etc, etc. But in general, in nearly all compilers available today, this is how it works.
This will be safe:
int * topkek()
{
static int kek = 2;
int* b = &kek; //intializing pointer b to point to kek's address in stack
return b; //returning pointer to kek
};
Locally allocated automatic types will be automatically deallocated when exiting the function scope. The pointer you return will then point to deallocated memory. Accessing this memory will result in undefined behaviour.
int* topkek() {
int kek = 2; // Create a local int kek
int* b = &kek; // Declare pointer to kek
return b; // Return pointer to local variable. <-- Never do this!
}; // kek is destroyed and the returned pointer points to deallocated memory.
This is undefined behavior. After returning from the function, kek does not exist any more, and the pointer returned points into nirvana, to the location where kek once was. Accessing it gives undefined behavior. Undefined behavior means, anything can happen. You could indeed get the value that once was in kek or your application crashes, or you get some garbage value, or your compiler sees fit to order you a pizza online, with extra cheese, please.
This is undefined behavior, kek is a local variable and the memory it resides in will be released once you exit topkek. It may seem like it works but since it is undefined anything can result, it can break at any time and you can not rely on the results.
First, this is definitely unsafe as others have noted. But based on your usage, I think you might actually be trying to define an object. For example, this code might be what you want:
struct topkek{
int kek;
int *operator()() {
kek = 2;
return &kek;
}
};
int main()
{
topkek mykek;
int *a = mykek(); // sets mykek.kek to 2, returns pointer to kek
std::cout << *a;
*a = 3;
std::cout << *a;
std::cin >> *a;
};
This will give you clean access to the underlying kek variable in a totally safe way. You can of course put any code you want into the operator() method.