What happens after C++ references are compiled? - c++

After compilation, what does the reference become, an address, or a constant pointer?
I know the difference between pointers and references, but I want to know the difference between the underlying implementations.
#include <iostream>
using namespace std;
int main()
{
    int a = 1;
    int &b = a;
    int *ptr = &a;
    cout << b << " " << *ptr << endl; // 1 1
    cout << "&b: " << &b << endl; // 0x61fe0c
    cout << "ptr: " << ptr << endl; // 0x61fe0c
    return 0;
}

The pedantic answer is: Whatever the compiler feels like, all that matters is that it works as specified by the language's semantics.
To get the actual answer, you have to look at the resulting assembly or make heavy use of undefined behavior. At that point, it becomes a compiler-specific question, not a "C++ in general" question.
In practice, references that need to be stored essentially become pointers, while local references tend to get compiled out of existence. The latter is generally the case because a reference can never be reseated: if the compiler can see the reference being initialized, it knows exactly what it refers to. However, you should not rely on this for correctness.
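To make the "essentially a pointer" point concrete, here is a minimal sketch (the function names are made up for illustration) of a reference parameter and a pointer parameter doing the same job, plus a local reference whose target the compiler can see. Mainstream compilers typically emit the same machine code for the two parameter forms, but nothing in the standard promises that; comparing the generated assembly is the only way to know for a given compiler.

```cpp
#include <cassert>

// Hypothetical helpers: same effect, one through a reference, one through a pointer.
void increment_by_ref(int& r) { ++r; }
void increment_by_ptr(int* p) { ++*p; }

// A local reference whose target is visible to the compiler. Optimizers
// commonly treat `alias` as just another name for `value`.
int local_alias_demo() {
    int value = 1;
    int& alias = value;
    alias += 41;
    assert(&alias == &value);  // taking the address of a reference yields the referent's address
    return value;
}
```

Calling increment_by_ref(a) and increment_by_ptr(&a) on the same int has the same observable effect; only the call syntax differs.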
For the sake of completeness
It is possible to get some insight into what the compiler is doing from within valid code by memcpying the contents of a struct containing a reference into a char buffer:
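#include <array>
#include <cstring>
#include <iomanip>
#include <iostream>
struct X {
    int& ref;
};
int main() {
    constexpr std::size_t x_size = sizeof(X);
    int val = 12;
    X val_ref = {val};
    std::array<unsigned char, x_size> raw;
    // Copy the object representation of the struct, reference included.
    std::memcpy(raw.data(), &val_ref, x_size);
    std::cout << &val << std::endl;
    std::cout << "0x" << std::hex << std::setfill('0');
    for (const unsigned char c : raw) {
        // Pad each byte to two hex digits so bytes below 0x10 keep their leading zero.
        std::cout << std::setw(2) << static_cast<int>(c);
    }
    std::cout << std::endl;
}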
When I ran this on my compiler, I got the (endian flipped) address of val stored within the struct.

It depends heavily on the compiler. The compiler may decide to optimize the code, in which case a reference may be replaced by the value itself or by something else entirely. But as far as I know, references are compiled like pointers: if you look at the resulting assembly, they are handled the same way a pointer would be.

Related

Const references with and without type conversion - why feature type conversion at all? - C++

I began learning C++ this week, and currently I am reading about compound types and constant variables. Unlike in most cases, references to const support type conversion by creating a temporary variable. But if so, then what's the difference in behaviour between:
int i = 42;
double di = i;
and
int i = 42;
const double &di = i;
Don't we end up with two independent variables that can end up having different values if we try to change i? Is the only difference that in the example with the const reference, the reference cannot be changed? The thing that bugs me the most is that when the types of a non-const variable and a const ref match, the reference points to the same address in memory and changes along with the change in the original variable, whereas this does not happen for a non-typematching const ref to a non-const variable:
#include <iostream>
int main() {
int i = 42;
const int &ri = i;
const double &dri = i;
++i;
std::cout << i << " at " << &i << ", " << ri << " at "
<< &ri << ", " << dri << " at " << &dri << std::endl;
int j = i;
int jj = ri;
int djj = dri;
std::cout << j << " at " << &j << ", " << jj << " at "
<< &jj << ", " << djj << " at " << &dri << std::endl;
return 0;
}
Output:
43 at %Address1%, 43 at %Address1%, 42 at %Address2%
43 at %Address3%, 43 at %Address4%, 42 at %Address2%
This seems to me like a major difference in behavior that is easy to overlook when simply looking at the syntax, on top of the fact that such behavior seems counter-intuitive to the entire idea of references. Also, why is jj allocated a separate space but not djj, which shows the same address as dri?
Let's say you have a function of the form:
void foo(double const& d);
And now, let's say you have a float f somewhere, and you want to pass it to this function via foo(f);. If a T const& could not bind to any object convertible to T, this wouldn't work. Every user of this function who doesn't have a double would have to write foo(static_cast<double>(f)) or an equivalent.
You might say that maybe foo should take double by value. And for double specifically, maybe it should.
But what about if it's std::string, and I want to call foo("some string"). Well, "some string" is not a std::string; it is a string literal which is convertible to std::string. So we allow that conversion.
Again, you might say that it should take the string by value. But what about the cases when the caller really does have a std::string? They'd have to copy that string, a copy that is discarded and is therefore unnecessary.
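A minimal sketch of that trade-off, assuming a hypothetical function length_of that takes its argument by reference-to-const: the same signature accepts an existing std::string without copying it, and also a string literal, for which a temporary std::string is created for the duration of the call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical function taking its argument by reference-to-const.
std::size_t length_of(std::string const& s) { return s.size(); }

void demo_calls() {
    std::string owned = "some string";
    // Binds directly to the caller's string; no copy is made.
    assert(length_of(owned) == 11);
    // The literal is converted to a temporary std::string that lives
    // until the end of the full expression containing the call.
    assert(length_of("some string") == 11);
}
```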
Of course, C++'s rules should be uniform. So if we want this to work for function arguments and parameters, it also has to work for named variables. But even then, it could be useful. You might call a function that you expect to return a string of some form, but aren't especially picky about which form, just so long as it is convertible to a std::string. This might be in template code:
template<typename T>
void foo(T t)
{
std::string const& data = t.get_a_string();
}
Do you really care if get_a_string returns std::string exactly, or just some string type convertible to std::string? Probably the latter.
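The behaviour observed in the question follows from the same rule. As a small sketch (the function name is only for illustration): a reference-to-const of the same type binds directly to the variable, while one of a different type binds to a temporary holding the converted value, which is a separate object and no longer tracks the original.

```cpp
#include <cassert>

// Returns the value seen through the converting reference after i has changed.
double demo_bindings() {
    int i = 42;
    const int& ri = i;      // same type: binds directly to i
    const double& dri = i;  // different type: binds to a temporary double holding 42.0
    ++i;

    assert(ri == 43);    // ri is an alias for i, so it sees the increment
    assert(dri == 42.0); // the temporary was created before the increment

    // ri shares i's address; the temporary is a distinct object.
    assert(static_cast<const void*>(&ri) == static_cast<const void*>(&i));
    assert(static_cast<const void*>(&dri) != static_cast<const void*>(&i));
    return dri;
}
```

The temporary's lifetime is extended to that of the reference, which is why reading dri after ++i is still valid.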

Why can't a const initialized at compile time be changed through a pointer?

I'm curious why the result of my experiment is so strange:
#include <iostream>
int test()
{
return 0;
}
int main()
{
/*include either the next line or the one after*/
const int a = test(); //the result is 1:1
const int a = 0; //the result is 0:1
int* ra = (int*)((void*)&a);
*ra = 1;
std::cout << a << ":" << *ra << std::endl;
return 0;
}
Why can a constant that is initialized at run time be changed completely, while for a constant initialized at compile time only the value read through the pointer changes?
The function isn't really that relevant here. In principle you could get same output (0:1) for this code:
int main() {
const int a = 0;
int* ra = (int*)((void*)&a);
*ra = 1;
std::cout << a << ":" << *ra;
}
a is a const int, not an int. You can do all sorts of senseless C-style casts, but modifying a const object invokes undefined behavior.
By the way, in the above example, even for std::cout << a << ":" << a; the compiler would be allowed to emit code that prints 1:0 (or 42:3.1415927). When your code has undefined behavior, anything can happen.
PS: the function and the different outcomes are relevant if you want to study the internals of your compiler and what it does to code that is not valid C++. For that, you'd best look at the output of the compiler and how it differs in the two cases (https://godbolt.org/).
It is undefined behavior to cast away the constness of a const variable and change its value. If you try, anything can happen.
Anyway, what seems to happen, is that the compiler sees const int a = 0;, and, depending on optimization, puts it on the stack, but replaces all usages with 0 (since a will not change!). For *ra = 1;, the value stored in memory actually changes (the stack is read-write), and outputs that value.
For const int a = test();, dependent on optimization, the program actually calls the function test() and puts the value on the stack, where it is modified by *ra = 1, which also changes a.
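For contrast, here is a minimal sketch (the function name is hypothetical) of the one situation in which writing through a cast-away const is well defined: the object being written to was never declared const, and only the access path to it was. The undefined behaviour in the question comes from the object a itself being const, not from the cast as such.

```cpp
#include <cassert>

// Well-defined: `value` is a non-const object; only the reference is const-qualified.
int modify_non_const_through_const_ref() {
    int value = 0;
    const int& view = value;
    const_cast<int&>(view) = 1;  // fine, because the underlying object is not const
    assert(view == 1);           // the reference observes the change as well
    return value;
}
```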

C++ Why does my code for overwriting const int x = *(&y); work?

Why does my code for overwriting const int variable work? Is it safe?
#include <iostream>
#include <cstring>
using namespace std;
int z = 5;
const int x = *(&z);
int main()
{
cout << "A:" << x << ", " << &x << endl;
int y = 7;
cout << "B:" << y << ", " << &y << endl;
memcpy((int*)&x, &y, sizeof(int));
cout << "C:" << x << ", " << &x << endl;
}
Output would be:
A:5, 0x600f94
B:7, 0x7a7efb68019c
C:7, 0x600f94
I am not sure if this has been asked before since I don't know what to search for in this situation.
Answering your questions:
It's not safe and should never be done; it's an example of how not to use C++.
It's based on undefined behaviour; that means the specification places no requirements on how such an attempt is treated.
As for why it works: the answer here is true for GCC only, as other compilers may optimise or treat const differently. Technically, const int x is a declaration of a variable with a qualifier. That means that (unless it is optimised away) it has its place in memory (in some circumstances in a read-only section). When you cast away the qualifier and give the address of the variable to memcpy() (an ordinary library call that is unaware of memory protection), it attempts to write the new data to that address. If the compiler puts the variable into a read-only section (I have run into that failure in the past), execution on any Unix would end with a segmentation fault, caused by a write instruction violating the memory protection of the read-only segment your program uses to hold constant data.
In C++, true compile-time constants are declared with constexpr; however, that has other implications.
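As a rough sketch of the difference (all names here are illustrative, not taken from the question): a const int initialized from a non-const variable is merely a read-only variable, whereas a constexpr int is usable wherever the language demands a constant expression, such as an array bound or a static_assert.

```cpp
#include <cassert>
#include <cstddef>

int source_value = 5;
const int runtime_init = *(&source_value);  // const, but not a constant expression
constexpr int compile_time = 5;             // a genuine compile-time constant

// Only the constexpr variable can be used here; replacing it with
// runtime_init would be rejected by the compiler.
static_assert(compile_time == 5, "usable in a constant expression");
int buffer[compile_time];

std::size_t buffer_length() { return sizeof(buffer) / sizeof(buffer[0]); }

void demo_constants() {
    assert(runtime_init == 5);
    assert(buffer_length() == 5);
}
```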

Dereference a structure to get value of first member

I found out that the address of the first element of a structure is the same as the address of the structure. But dereferencing the address of the structure doesn't give me the value of the first data member, whereas dereferencing the address of the first data member does give its value. E.g., if the address of the structure is 100, the address of its first element is also 100, so dereferencing should work the same way on both.
Code:
#include <iostream>
#include <cstring>
struct things{
int good;
int bad;
};
int main()
{
things *ptr = new things;
ptr->bad = 3;
ptr->good = 7;
std::cout << *(&(ptr->good)) <<" " << &(ptr->good) << std::endl;
std::cout << "ptr also print same address = " << ptr << std::endl;
std::cout << "But *ptr does not print 7 and gives compile time error. Why ?" << *ptr << std::endl;
return 0;
}
*ptr gives you an object of type things, for which there is no operator << defined, hence the compile-time error.
A struct is not the same as an array†. That is, it doesn't decay to a pointer to its first element. The compiler, in fact, is free to (and often does) insert padding in a struct so that its members align to certain byte boundaries‡. So even if a struct could decay in the same way as an array (bad idea), simply printing it would not guarantee printing of the first element!
† I mean a C-style array like int[]
‡ These boundaries are implementation-dependent and can often be controlled in some manner via preprocessor directives like #pragma pack
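If the goal is simply to print the struct, the usual fix is to supply the missing operator << yourself. A minimal sketch, assuming an output format chosen here purely for illustration (format_things is likewise a made-up helper):

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

struct things {
    int good;
    int bad;
};

// User-provided stream insertion operator; the format is an arbitrary choice.
std::ostream& operator<<(std::ostream& os, const things& t) {
    return os << "{good=" << t.good << ", bad=" << t.bad << "}";
}

// Hypothetical helper that renders a things object through the operator above.
std::string format_things(const things& t) {
    std::ostringstream out;
    out << t;
    return out.str();
}

void demo_format() {
    things t{7, 3};
    assert(format_things(t) == "{good=7, bad=3}");
}
```

With this operator in scope, std::cout << *ptr from the question compiles and prints both members.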
Try any of these:
#include <iostream>
#include <cstring>
struct things{
int good;
int bad;
};
int main()
{
things *ptr = new things;
ptr->bad = 3;
ptr->good = 7;
std::cout << *(int*)ptr << std::endl;
std::cout << *reinterpret_cast<int*>(ptr) << std::endl;
int* p = reinterpret_cast<int*>(ptr);
std::cout << *p << std::endl;
return 0;
}
You can cast the pointer to the struct to a pointer to the first element of the struct, so the compiler knows what size and alignment to use to read the value from memory.
If you want a "clean" cast, you can consider converting it to a void pointer first:
(Struct*) to (void*) to (FirstElem*)
Hope it helps!!
I found out that address of first element of structure is same as the address of structure.
Wherever you found this out, it wasn't stated as a general rule in the C++ standard. The standard guarantees it only for standard-layout types; in the general case it's an incorrect assumption.
There is nothing but misery and pain for you if you continue down this path.
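Whether the assumption holds for a particular type can be checked at compile time rather than guessed. A minimal sketch for the things struct from the question (the function name is made up): the standard guarantees that a pointer to a standard-layout struct can be converted to a pointer to its first member, so the static_asserts below document exactly what the cast relies on.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct things {
    int good;
    int bad;
};

// The cast below is only justified for standard-layout types.
static_assert(std::is_standard_layout<things>::value, "things must be standard-layout");
static_assert(offsetof(things, good) == 0, "first member must sit at offset 0");

// Hypothetical helper: reads the first member through a pointer to the struct.
int first_member_via_cast(things& t) {
    assert(static_cast<void*>(&t) == static_cast<void*>(&t.good));
    return *reinterpret_cast<int*>(&t);
}
```

For a class that is not standard-layout (for example, one with virtual functions), the first static_assert fails and the cast should not be used.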

Passing a temp pointer to a temp pointer to an object

#include <iostream>
#include <tchar.h>
void output(int *param)
{
std::cout << "Value: " << *param << std::endl;
};
int _tmain(int argc, _TCHAR* argv[])
{
int i = 34;
output(&i);
return 0;
}
obviously writes "Value: 34" to the console.
But if I make the following changes
...
void output(int **param)
{
std::cout << "Value: " << **param << std::endl;
}
...
output(&(&i));
...
I get a compile error "'&' requires l-value".
By the way, I even tried to make the following change:
output(&34);
Indeed this feels wrong ... somehow.
My question is: why is it not allowed to use & on an r-value? Is there some reason at the assembler level?
You are trying to take the address of an r-value, and that is not defined: the result of &i is a temporary value that is not guaranteed to have a storage location of its own on the stack or heap, so there is nothing for the second & to point to. That is why C++11 introduced r-value references, but that is a totally different subject from your question.
To get your code to compile, you need to do the following:
int i = 34;
int* pi = &i;
output(&pi);
By storing the address in the named variable pi, you give the compiler a real object on the stack whose address can be passed to output.
You're trying to get the address of an address, and an address is not an l-value. (You can very roughly think of l-values as values that can stand on the left side of an assignment. Variables and other "named values" are l-values, for example.)
Store the first address somewhere.
int number = 4;
int* firstAddress = &number;
int** secondAddress = &firstAddress;
output(secondAddress);
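Put together as a small self-checking sketch (read_through is a hypothetical stand-in for output that returns the value instead of printing it):

```cpp
#include <cassert>

// Hypothetical stand-in for output(int**): follows both levels of indirection.
int read_through(int** param) { return **param; }

int pointer_to_pointer_demo() {
    int number = 34;
    int* first = &number;   // a named pointer is an l-value with its own address
    int** second = &first;  // so taking its address is allowed
    assert(*second == &number);
    // read_through(&(&number)) would not compile: &number is an r-value.
    return read_through(second);
}
```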