reference and literals in C++ - c++

I know that "literals" (C strings, ints, or whatever) are stored somewhere (apparently in a read-only data section, .rodata), though maybe this is not accurate.
I want to understand why this code causes a runtime error:
#include <iostream>
using namespace std;
const int& foo()
{
return 2;
}
const char* bar()
{
return "Hello !";
}
int main() {
cout << foo() << endl; // crash
cout << bar() << endl; // OK
return 0;
}
foo returns a const reference to a literal (2). Why does this cause a crash? Is the integer 2 stored on the stack of foo()?
See also : Why are string literals l-value while all other literals are r-value?

I see why this is confusing so I will try to break it down.
First case:
const int& foo()
{
return 2;
}
The return statement makes a temporary object which is a copy of the literal 2. So its address is either non-extant or different from the location of the literal 2 (assuming literal 2 has a location - not guaranteed).
It is that temporary whose reference is returned.
Second case:
const char* bar()
{
return "Hello !";
}
The return statement makes a temporary object which is a copy of a pointer to the address of the first element of the literal char array. That pointer contains the actual address of the literal array and that address is returned by copy to the caller.
So to sum up. The second one works because the return statement takes a copy of the literal's address and not a copy of the literal itself. It doesn't matter that the storage for the address is temporary because the address still points to the correct place after the temporary holding its value collapses.

That is indeed very confusing, and in order to understand what's happening, one has to dive very deep in the language specification.
But before we do this, let me remind you that compiler warnings are your friends. With a sufficient warning level, you should see the following when compiling your example:
In function 'const int& foo()':
3 : warning: returning reference to temporary [-Wreturn-local-addr]
    return 2;
           ^
Now, what is happening in your first example? You cannot take the address of an integer literal, because such literals do not really exist as objects. However, you are allowed to bind const references to literals. How is that possible, when everybody knows that references are akin to pointers? The reason is that when you bind a const reference to a literal, you do not really bind it to the literal. Instead, the compiler creates a temporary object and binds your reference to it. That temporary is an object, albeit a short-lived one. Once your function returns, the temporary is destroyed and you end up with a dangling reference. Reading through it is undefined behavior, which in your case shows up as a crash.
In the second example, "Hello !" is a literal, but you are not returning the literal; you are returning a pointer to the string. And the pointer remains valid, because the string it points to remains valid.
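A minimal sketch of the two usual ways around the dangling reference, assuming the goal is simply to hand the value 2 back to the caller. The names foo_by_value, foo_static and check_foo_and_bar are hypothetical and not from the question; bar is repeated unchanged for comparison.

```cpp
#include <cassert>
#include <cstring>

// Return by value: the caller receives its own copy of 2, so nothing dangles.
int foo_by_value()
{
    return 2;
}

// Return a reference to an object with static storage duration.
// That object outlives every call, so the reference stays valid.
const int& foo_static()
{
    static const int value = 2;
    return value;
}

// As in the question: the pointer is returned by copy, and the string
// literal it points to has static storage duration.
const char* bar()
{
    return "Hello !";
}

// Hypothetical self-check exercising all three functions.
void check_foo_and_bar()
{
    assert(foo_by_value() == 2);
    assert(foo_static() == 2);
    assert(&foo_static() == &foo_static()); // same static object on every call
    assert(std::strcmp(bar(), "Hello !") == 0);
}
```

Calling check_foo_and_bar() from main should pass without tripping any assertion. For a type as small as int, returning by value is usually the simpler choice.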

Related

Why do char arrays get lost when returning from a function in C++?

I know that, if we declare variables inside a function without allocating memory for them, they will be lost after the function finishes its job.
The following code prints:
(null)
5
#include <cstdio>
char* getString()
{
char arr[] = "SomeText";
return arr;
}
int getInt()
{
int b = 5;
return b;
}
int main()
{
printf("%s", getString());
printf("\n");
printf("%d", getInt());
return 0;
}
Both arr and b variables are created on the stack, so they both should be destroyed when the functions end. My question is, how come variable b doesn't get lost while variable arr is lost?
A unique and often confusing feature of C (and inherited by C++) is that an array when used in an expression is not treated as a collection of values, but (most of the time) as a pointer to its first element.† So, when you return an array from a function, you are returning the address of its first element.
Dereferencing the address of an object with automatic storage duration that is no longer in scope results in undefined behavior.
When you return a value from a function, a copy of the value is returned to the caller.
Thus, when you return an integer, the caller receives a copy of that integer value.
If the value is a pointer, the copied value is a pointer. If the pointer is pointing to an invalid object, then if the receiver of the pointer tried to dereference the pointer value, it would result in undefined behavior.
† There are 3 exceptions: (1) As an operand to &; (2) As an operand to sizeof; and (3) A string literal used to initialize an array. In C++, there are other exceptions: (4) As an operand of decltype; (5) As the function argument to a reference parameter; (6) An object to initialize a reference variable; ... probably something else I am forgetting...
Both getInt and getString return a value.
getInt returns an int value of 5. It remains 5 in the caller.
getString returns a char * value that points to arr. While the caller receives a pointer, the thing it points to, arr, no longer exists (in the C standard’s model of computation) when the function returns.
Thus, it is not the value being returned by the function that is the problem so much as its meaning. The number 5 retains its meaning. A pointer to a thing that ceases to exist does not retain its meaning.
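A hedged sketch of two ways to return the text so that it keeps its meaning in the caller, assuming C++ and std::string are acceptable here. The names getStringCopy, getStaticString and check_get_functions are hypothetical; getInt is the function from the question.

```cpp
#include <cassert>
#include <string>

// The characters are copied into the returned std::string before
// the local array is destroyed, so the caller owns valid data.
std::string getStringCopy()
{
    char arr[] = "SomeText";   // automatic storage, gone after return
    return std::string(arr);   // copy made while arr is still alive
}

// A static array lives for the whole program, so a pointer to it stays valid.
const char* getStaticString()
{
    static const char arr[] = "SomeText";
    return arr;
}

// Unchanged: the int value itself is copied to the caller.
int getInt()
{
    int b = 5;
    return b;
}

// Hypothetical self-check.
void check_get_functions()
{
    assert(getStringCopy() == "SomeText");
    assert(std::string(getStaticString()) == "SomeText");
    assert(getInt() == 5);
}
```

In both variants the caller ends up with something that still refers to live data, which is exactly what the pointer to the local arr cannot offer.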

const char * value lifetime [duplicate]

This question already has answers here:
Can a local variable's memory be accessed outside its scope?
EDIT: The question about why the code in this question works has been answered by the linked question in the duplicate marking. The question about string literal lifetime is answered in the answer to this question.
I am trying to understand how and when the string pointed to by const char * gets deallocated.
Consider:
const char **p = nullptr;
{
const char *t = "test";
p = &t;
}
cout << *p;
After leaving the inner scope I would expect p to be a dangling pointer to const char *. However in my tests it is not. That would imply that the value of t actually continues to be valid and accessible even after t gets out of scope.
It could be due to prolonging the lifetime of a temporary by binding it to a const reference. But I do no such thing, and even saving the address of t in a member variable and printing the value from a different function later still gives me the correct value.
class CStringTest
{
public:
void test1()
{
const char *t = "test";
m_P = &t;
test2();
}
void test2()
{
cout << *m_P;
}
private:
const char **m_P = nullptr;
};
So what is the lifetime of t's value here? I would say I am invoking undefined behaviour by dereferencing a pointer to a variable that went out of scope. But it works every time, so I think that is not the case.
When trying some other type like QString:
QString *p = nullptr;
{
QString str = "test";
p = &str;
}
cout << *p;
the code always prints the value correctly too even though it should not. str went out of scope with its value and I have not prolonged its lifetime by binding it to const reference either.
Interestingly, the class example with QString behaves as I would expect: test2() prints gibberish because the value indeed went out of scope and m_P became a dangling pointer.
So what is the actual lifetime of const char *'s value?
The variables p and t are stack variables that you declared, so they have a lifetime that ends at the end of their enclosing block.
But the value of t is the address of the string literal "test", and that is not a variable you declared, it's not on the stack. It's a string literal, which is a constant defined in the program (similar to the integer literal 99 or the floating point literal 0.99). Literals don't go out of scope as you expect, because they are not created or destroyed, they just are.
The standard says:
Evaluating a string-literal results in a string literal object with static storage duration, initialized from the given characters as specified above.
So the object that the compiler creates to represent the literal "test" has static storage duration, which is the same duration as globals and static variables, meaning it doesn't go out of scope like a stack variable.
The value of p is the address of t, which does become an invalid pointer when t goes out of scope, but that doesn't mean that the value stored at that address suddenly becomes inaccessible or gets wiped. The expression *p is undefined behaviour, but it appears to work because nothing has reused that memory location yet so *p still contains the address of the string literal. For more details on that see the top answer to Can a local variable's memory be accessed outside its scope?
Compilers put literal strings into a statically allocated space which is loaded into a protected segment of the virtual memory, such that those strings can be shared over the entire lifetime of the process (the value is a constant, so no need to take the overhead of constantly springing them into existence). Looking for something like that to be deallocated is a waste of time, since it never actually happens.
The variables are stack-allocated. The string constant should be thought of as just that: a string constant...like the number 3.
String literals get allocated in the static storage.
If you mention a string literal anywhere in your program, it's as if you did:
static const char someUniqueIdentifier[]="the data";
in the global scope.
const char* str = "some string"; means make sure "some string" exists as a constant null-terminated array in the static section of the program and point str to it.
In your first example, however, you are referencing your automatic (= on the stack) pointer, not the string in static storage. That pointer's lifetime is indeed limited to its scope. In the class example, on the other hand, the scope of test1() hasn't ended yet when you invoke test2().
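To make the distinction concrete, here is a small sketch that copies the value of t (the address of the literal) out of the inner scope instead of taking the address of t itself. The function names literal_after_scope and check_literal_lifetime are hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Copies the VALUE of t (the literal's address) out of the inner scope.
const char* literal_after_scope()
{
    const char* p = nullptr;
    {
        const char* t = "test";
        p = t;      // p holds the address of the literal, not the address of t
    }               // t's lifetime ends here; the literal's does not
    return p;       // still valid: the literal has static storage duration
}

// Hypothetical self-check.
void check_literal_lifetime()
{
    assert(std::strcmp(literal_after_scope(), "test") == 0);
}
```

Here nothing dangles, because no pointer to the automatic variable t survives the block. Storing &t, as the question does, is what produces the dangling pointer.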

return type in c++

#include<iostream>
int & fun();
int main()
{
int p = fun();
std::cout << p;
return 0;
}
int & fun()
{
int a=10;
return a;
}
Why is this program not giving error at line no.6 as "invalid conversion from int* to int", as it happens in case we do like this?
int x = 9;
int a = &x;
int& is a type; it means "a reference to int."
&x is an expression; it means "take the address of x." The unary & operator is the address operator. It takes the address of its argument. If x is an int, then the type of &x is "a pointer to int" (that is, int*).
int& and int* are different types. References and pointers are the same in many respects; namely, they both refer to objects, but they are quite different in how they are used. For one thing, a reference implicitly refers to an object and no indirection is needed to get to the referenced object. Explicit indirection (using * or ->) is needed to get the object referenced by a pointer.
These two uses of the & are completely different. They aren't the only uses either: for example, there is also the binary & operator that performs the bitwise and operation.
Note also that your function fun is incorrect because you return a reference to a local variable. Once the function returns, a is destroyed and ceases to exist so you can never use the reference that is returned from the function. If you do use it, e.g. by assigning the result of fun() to p as you do, the behavior is undefined.
When returning a reference from a function you must be certain that the object to which the reference refers will exist after the function returns.
Why is this program not giving error at line no.5 as "invalid conversion from int* to int", as it happens in case we do like this?
That's because you are trying to return the variable by reference and not by address. However your code invokes Undefined Behaviour because returning a reference to a local variable and then using the result is UB.
Because in one case it's a pointer and in the other a reference:
int a=&x means set a to the address of x - wrong
int &p=fun() means set p to a reference to an int - ok
Functions in C++ are not the same as macros, i.e. when you write int p = fun() it doesn't become int p = &a; (I guess that is what you are expecting from your question). What you are doing is returning a reference from the function fun. You are nowhere taking the address of any variable. BTW, the above code will invoke undefined behavior, as you are returning a reference to a local variable.
You're not returning an int *, you're returning an int &. That is, you're returning a reference to an integer, not a pointer. Initializing an int from that reference simply copies the value of the int it refers to.
Those are two different things, although they both use the ampersand symbol. In your first example, you are returning a reference to an int, which is assignable to an int. In your second example, you are trying to assign the address of x (pointer) to an int, which is illegal.
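The two meanings of the ampersand can be checked by the compiler itself. The sketch below assumes C++11 or later; fun_static is a hypothetical, well-defined variant of fun that returns a reference to a static object, and ampersand_demo is likewise an illustrative name.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical well-defined variant of fun(): the referenced object is
// static, so it still exists after the function returns.
int& fun_static()
{
    static int a = 10;
    return a;
}

int ampersand_demo()
{
    int x = 9;
    int& r = x;     // '&' in a declaration: r is another name for x
    int* ptr = &x;  // '&' in an expression: the address of x, of type int*

    static_assert(std::is_same<decltype(&x), int*>::value,
                  "&x is a pointer to int");
    static_assert(std::is_same<decltype(fun_static()), int&>::value,
                  "fun_static() yields a reference to int");

    int p = fun_static();        // copies the int the reference refers to
    p = 11;                      // changes only the copy
    assert(fun_static() == 10);  // the static object is untouched

    r = 7;                       // writes through the reference to x
    assert(x == 7 && *ptr == 7);
    return p + x;                // 11 + 7
}
```

Writing int bad = &x; inside ampersand_demo would be rejected by the compiler, since an int* does not convert implicitly to int.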

Does dereferencing a pointer make a copy of it?

Does dereferencing a pointer and passing that to a function which takes its argument by reference create a copy of the object?
In this case the value at the pointer is copied (though this is not necessarily the case as the optimiser may optimise it out).
int val = *pPtr;
In this case however no copy will take place:
int& rVal = *pPtr;
The reason no copy takes place is because a reference is not a machine code level construct. It is a higher level construct and thus is something the compiler uses internally rather than generating specific code for it.
The same, obviously, goes for function parameters.
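A short sketch of the difference between the two declarations above, using a hypothetical local variable target in place of whatever pPtr points to in real code:

```cpp
#include <cassert>

int copy_versus_reference()
{
    int target = 1;
    int* pPtr = &target;

    int val = *pPtr;     // copy of the pointed-to value
    int& rVal = *pPtr;   // alias for the pointed-to object, no copy

    val = 10;            // modifies only the copy
    assert(target == 1);

    rVal = 20;           // modifies the object pPtr points to
    assert(target == 20);
    assert(&rVal == pPtr);  // the reference names the very same object

    return target + val;    // 20 + 10
}
```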
In the simple case, no. There are more complicated cases, though:
void foo(float const& arg);
int * p = new int(7);
foo(*p);
Here, a temporary object is created, because the type of the dereferenced pointer (int) does not match the base type of the function parameter (float). A conversion sequence exists, and the converted temporary can be bound to arg since that's a const reference.
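One way to observe the temporary is to compare addresses inside the callee. This is only a sketch: same_object_int, same_object_float and check_binding are hypothetical helpers, and a local int is used instead of new int(7) so that nothing needs to be freed.

```cpp
#include <cassert>

// True when arg is the very object that p points to.
bool same_object_int(const int& arg, const int* p)
{
    return &arg == p;
}

// Same comparison for a const float& parameter. When called with an int,
// arg binds to a temporary float, which is a different object.
bool same_object_float(const float& arg, const void* p)
{
    return static_cast<const void*>(&arg) == p;
}

// Hypothetical self-check.
void check_binding()
{
    int value = 7;
    int* p = &value;
    assert(same_object_int(*p, p));     // no copy: arg is *p itself
    assert(!same_object_float(*p, p));  // arg is a converted temporary
}
```

The comparison is done inside the callee on purpose, while the temporary is still alive.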
Hopefully it does not. It would only if the called function took its argument by value.
Furthermore, that's the expected behavior of a reference :
void inc(int &i) { ++i; }
int main()
{
int i = 0;
int *j = &i;
inc(*j);
std::cout << i << std::endl;
}
This code is expected to print 1 because inc takes its argument by reference. Had a copy been made upon inc call, the code would print 0.
No. A reference is more or less just like a pointer with different notation and the restriction that there is no null reference. But like a pointer it contains just the address of an object.

Pass temporary object to function that takes pointer

I tried the following code:
#include<iostream>
#include<string>
using namespace std;
string f1(string s)
{
return s="f1 called";
}
void f2(string *s)
{
cout<<*s<<endl;
}
int main()
{
string str;
f2(&f1(str));
}
But this code doesn't compile.
What I think is : f1 returns by value so it creates temporary, of which I am taking address and passing to f2.
Please explain where my reasoning is wrong.
The unary & takes an lvalue (or a function name). Function f1() doesn't return an lvalue, it returns an rvalue (for a function that returns something, unless it returns a reference, its return value is an rvalue), so the unary & can't be applied to it.
It is possible to create (and pass) a pointer to a temporary object, assuming that you know what you are doing. However, it should be done differently.
A function with return value of non-reference type returns an rvalue. In C++ applying the built-in unary operator & to an rvalue is prohibited. It requires an lvalue.
This means, that if you want to obtain a pointer to your temporary object, you have to do it in some other way. For example, as a two-line sequence
const string &r = f1(str);
f2(&r);
which can also be folded into a single line by using a cast
f2(&(const string &) f1(str));
In both cases above the f2 function should accept a const string * parameter. Just a string * as in your case won't work, unless you cast away constness from the argument (which, BTW, will make the whole thing even uglier than it already is). Although, if memory serves me, in both cases there's no guarantee that the reference is attached to the original temporary and not to a copy.
Just keep in mind, though, that creating pointers to temporary objects is a rather dubious practice because of the obvious lifetime issues. Normally you should avoid the need to do that.
Your program doesn't compile because the value returned from a function by value is an rvalue, and you can't take its address with the built-in unary &.
Try this:
int main()
{
string str;
string str2 = f1(str); // copy the temporary
f2(&str2);
}
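For completeness, a sketch of the const-reference approach mentioned above, with the pointer parameter changed to const std::string* as that answer requires. f2_length and call_through_reference are hypothetical names; f2_length returns the length instead of printing so that the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

std::string f1(std::string s)
{
    return s = "f1 called";
}

// Takes a pointer to const, so it can accept the address of the object
// that a const reference is bound to.
std::size_t f2_length(const std::string* s)
{
    return s->size();
}

std::size_t call_through_reference()
{
    std::string str;
    const std::string& r = f1(str);  // temporary's lifetime is extended to r's
    return f2_length(&r);            // &r is valid while r is in scope
}

// Hypothetical self-check.
void check_call_through_reference()
{
    assert(call_through_reference() == std::string("f1 called").size());
}
```

The pointer must not be kept beyond the scope of r, since the temporary is destroyed together with the reference.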