Referencing and dereferencing in the same instruction - c++

Wondering through the LLVM source code i stumbled upon this line of code
MachineInstr *MI = &*I;
I am kinda newb in c++ and the difference between references and pointers is quite obscure to me, and I think that it has something to do about this difference, but this operation makes no sense to me. Does anybody have an explanation for doing that?

The type of I is probably some sort of iterator or smart pointer which has the unary operator*() overloaded to yield a MachineInstr&. If you want to get a built-in pointer to the object referenced by I you get a reference to the object using *I and then you take the address of this reference, using &*I.

C++ allows overloading of the dereference operator, so it is using the overloaded method on the object, and then it is taking the address of the result and putting it into the pointer.

This statement:
MachineInstr *MI = &*I;
Dereferences I with *, and gets the address of its result with & then assigns it to MI which is a pointer to MachineInstr. It looks like I is an iterator, so *I is the value stored in a container since iterators define the * operator to return the item at iteration point. The container (e.g. a list) must be containing a MachineInstr. This might be a std::list<MachineInstr>.

Related

dereferencing a pointer when passing by reference

what happens when you dereference a pointer when passing by reference to a function?
Here is a simple example
int& returnSame( int &example ) { return example; }
int main()
{
int inum = 3;
int *pinum = & inum;
std::cout << "inum: " << returnSame(*pinum) << std::endl;
return 0;
}
Is there a temporary object produced?
Dereferencing the pointer doesn't create a copy; it creates an lvalue that refers to the pointer's target. This can be bound to the lvalue reference argument, and so the function receives a reference to the object that the pointer points to, and returns a reference to the same. This behaviour is well-defined, and no temporary object is involved.
If it took the argument by value, then that would create a local copy, and returning a reference to that would be bad, giving undefined behaviour if it were accessed.
The Answer To Your Question As Written
No, this behavior is defined. No constructors are called when pointer types are dereferenced or reference types used, unless explicitly specified by the programmer, as with the following snippet, in which the new operator calls the default constructor for the int type.
int* variable = new int;
As for what is really happening, as written, returnSame(*pinum) is the same variable as inum. If you feel like verifying this yourself, you could use the following snippet:
returnSame(*pinum) = 10;
std::cout << "inum: " << inum << std::endl;
Further Analysis
I'll start by correcting your provided code, which it doesn't look like you tried to compile before posting it. After edits, the one remaining error is on the first line:
int& returnSame( int &example ) { return example; } // semi instead of colon
Pointers and References
Pointers and references are treated in the same way by the compiler, they differ in their use, not so much their implementation. Pointer types and reference types store, as their value, the location of something else. Pointer dereferencing (using the * or -> operators) instructs the compiler to produce code to follow the pointer and perform the operation on the location it refers to rather than the value itself. No new data is allocated when you dereference a pointer (no constructors are called).
Using references works in much the same way, except the compiler automatically assumes that you want the value at the location rather than the location itself. As a matter of fact, it is impossible to refer to the location specified by a reference in the same way pointers allow you to: once assigned, a reference cannot be reseated (changed) (that is, without relying on undefined behavior), however you can still get its value by using the & operator on it. It's even possible to have a NULL reference, though handling of these is especially tricky and I don't recommend using them.
Snippet analysis
int *pinum = & inum;
Creates a pointer pointing to an existing variable, inum. The value of the pointer is the memory address that inum is stored in. Creating and using pointers will NOT call a constructor for a pointed-to object implicitly, EVER. This task is left to the programmer.
*pinum
Dereferencing a pointer effectively produces a regular variable. This variable may conceptually occupy the same space that another named variable uses, or it may not. in this case, *pinum and inum are the same variable. When I say "produces", it's important to note than no constructors are called. This is why you MUST initialize pointers before using them: Pointer dereferencing will NEVER allocate storage.
returnSame(*pinum)
This function takes a reference and returns the same reference. It's helpful to realize that this function could be written with pointers as well, and behave exactly the same way. References do not perform any initialization either, in that they do not call constructors. However, it is illegal to have an uninitialized reference, so running into uninitialized memory through them is not as common a mistake as with pointers. Your function could be rewritten to use pointers in the following way:
int* returnSamePointer( int *example ) { return example; }
In this case, you would not need to dereference the pointer before passing it, but you would need to dereference the function's return value before printing it:
std::cout << "inum: " << *(returnSamePointer(pinum)) << std::endl;
NULL References
Declaring a NULL reference is dangerous, since attempting to use it will automatically attempt to dereference it, which will cause a segmentation fault. You can, however, safely check if a reference is a null reference. Again, I highly recommend not using these ever.
int& nullRef = *((int *) NULL); // creates a reference to nothing
bool isRefNull = (&nullRef == NULL); // true
Summary
Pointer and Reference types are two different ways to accomplish the same thing
Most of the gotchas that apply to one apply to the other
Neither pointers nor references will call constructors or destructors for referenced values implicitly under any circumstances
Declaring a reference to a dereferenced pointer is perfectly legal, as long as the pointer is initialized properly
A compiler doesn't "call" anything. It just generates code. Dereferencing a pointer would at the most basic level correspond to some sort of load instruction, but in the present code the compiler can easily optimize this away and just print the value directly, or perhaps shortcut directly to loading inum.
Concerning your "temporary object": Dereferencing a pointer always gives an lvalue.
Perhaps there's a more interesting question hidden in your question, though: How does the compiler implement passing function arguments as references?

Why do parameters passed by reference in C++ not require a dereference operator?

I'm new to the C++ community, and just have a quick question about how C++ passes variables by reference to functions.
When you want to pass a variable by reference in C++, you add an & to whatever argument you want to pass by reference. How come when you assign a value to a variable that is being passed by reference why do you say variable = value; instead of saying *variable = value?
void add_five_to_variable(int &value) {
// If passing by reference uses pointers,
// then why wouldn't you say *value += 5?
// Or does C++ do some behind the scene stuff here?
value += 5;
}
int main() {
int i = 1;
add_five_to_variable(i);
cout << i << endl; // i = 6
return 0;
}
If C++ is using pointers to do this with behind the scenes magic, why aren't dereferences needed like with pointers? Any insight would be much appreciated.
When you write,
int *p = ...;
*p = 3;
That is syntax for assigning 3 to the object referred to by the pointer p. When you write,
int &r = ...;
r = 3;
That is syntax for assigning 3 to the object referred to by the reference r. The syntax and the implementation are different. References are implemented using pointers (except when they're optimized out), but the syntax is different.
So you could say that the dereferencing happens automatically, when needed.
C++ uses pointers behind the scenes but hides all that complication from you. Passing by reference also enables you to avoid all the problems asssoicated with invalid pointers.
When you pass an object to a function by reference, you manipulate the object directly in the function, without referring to its address like with pointers. Thus, when manipulating this variable, you don't want to dereference it with the *variable syntax. This is good practice to pass objects by reference because:
A reference can't be redefined to point to another object
It can't be null. you have to pass a valid object of that type to the function
How the compiler achieves the "pass by reference" is not really relevant in your case.
The article in Wikipedia is a good ressource.
There are two questions in one, it seems:
one question is about syntax: the difference between pointer and reference
the other is about mechanics and implementation: the in-memory representation of a reference
Let's address the two separately.
Syntax of references and pointers
A pointer is, conceptually, a "sign" (as road sign) toward an object. It allows 2 kind of actions:
actions on the pointee (or object pointed to)
actions on the pointer itself
The operator* and operator-> allow you to access the pointee, to differenciate it from your accesses to the pointer itself.
A reference is not a "sign", it's an alias. For the duration of its life, come hell or high water, it will point to the same object, nothing you can do about it. Therefore, since you cannot access the reference itself, there is no point it bothering you with weird syntax * or ->. Ironically, not using weird syntax is called syntactic sugar.
Mechanics of a reference
The C++ Standard is silent on the implementation of references, it merely hints that if the compiler can it is allowed to remove them. For example, in the following case:
int main() {
int a = 0;
int& b = a;
b = 1;
return b;
}
A good compiler will realize that b is just a proxy for a, no room for doubts, and thus simply directly access a and optimize b out.
As you guessed, a likely representation of a reference is (under the hood) a pointer, but do not let it bother you, it does not affect the syntax or semantics. It does mean however that a number of woes of pointers (like access to objects that have been deleted for example) also affect references.
The explicit dereference is not required by design - that's for convenience. When you use . on a reference the compiler emits code necessary to access the real object - this will often include dereferencing a pointer, but that's done without requiring an explicit dereference in your code.

c++ question, about * and ->

when I have a pointer like:
MyClass * pRags = new MyClass;
So i can use
pRags->foo()
or
(*pRags).foo()
to call foo.
Why these 2 are identical? and what is *pRags?
Thank you
Why are these two identical?
They are equivalent because the spec says they are equivalent. The built-in -> is defined in terms of the built-in * and ..
What is *pRags?
It is the MyClass object pointed to by pRags. The * dereferences the pointer, yielding the pointed-to object.
For more information, consider picking up a good introductory C++ book.
In addition to the other answers, the '->' is there for convenience. Dereferencing a pointer to an object every time you access a class variable for function is quite ugly, inconvenient, and potentially more confusing.
For instance:
(*(*(*car).engine).flux_capacitor).init()
vs
car->engine->flux_capacitor->init()
pRags->foo() is defined as syntactic sugar that is equivalent to (*pRags).foo().
the * operator dereferences a pointer. That is, it says that you're operating on what the pointer points to, not the pointer itself.
The -> is just a convenient way to write (*).. and *pRags is the object at the address stored in pRags.
Yes they are identical. -> was included in C (and thus inherited into C++) as a notational convenience.
* in that context is used to dereference a pointer.
Unary * is known as the dereferencing operator. Dereferencing a pointer turns a T* into a T&; dereferencing an object invokes that object type's unary operator* on that object value if one is defined, or gives an error otherwise.
(*pRags) that goes through the pointer pRags and get you whole object, so on that you could use regular dot notation . .
pRags is a pointer of type MyClass. Just like you can have pointers for primitive data types, e.g. int ptr, you can also have pointers to objects and in this case represented by pRags. Now accessing the object "members" is done using the arrow operator (->) or you can use the "." and dereference "*" to access the object's member values. Here a member is a variable inside MyClass. So, foo() would have a definition inside MyClass.

About pointer and reference syntax

Embarrassing though it may be I know I am not the only one with this problem.
I have been using C/C++ on and off for many years. I never had a problem grasping the concepts of addresses, pointers, pointers to pointers, and references.
I do constantly find myself tripping over expressing them in C syntax, however. Not the basics like declarations or dereferencing, but more often things like getting the address of a pointer-to-pointer, or pointer to reference, etc. Essentially anything that goes a level or two of indirection beyond the norm. Typically I fumble with various semi-logical combinations of operators until I trip upon the correct one.
Clearly somewhere along the line I missed a rule or two that simplifies and makes it all fall into place. So I guess my question is: do you know of a site or reference that covers this matter with clarity and in some depth?
I don't know of any website but I'll try to explain it in very simple terms. There are only three things you need to understand:
variable will contain the contents of the variable. This means that if the variable is a pointer it will contain the memory address it points to.
*variable (only valid for pointers) will contain the contents of the variable pointed to. If the variable it points to is another pointer, ptr2, then *variable and ptr2 will be the same thing; **variable and *ptr2 are the same thing as well.
&variable will contain the memory address of the variable. If it's a pointer, it will be the memory address of the pointer itself and NOT the variable pointed to or the memory address of the variable pointed to.
Now, let's see a complex example:
void **list = (void **)*(void **)info.List;
list is a pointer to a pointer. Now let's examine the right part of the assignment starting from the end: (void **)info.List. This is also a pointer to a pointer.
Then, you see the *: *(void **)info.List. This means that this is the value the pointer info.List points to.
Now, the whole thing: (void **)*(void **)info.List. This is the value the pointer info.List points to casted to (void **).
I found the right-left-right rule to be useful. It tells you how to read a declaration so that you get all the pointers and references in order. For example:
int *foo();
Using the right-left-right rule, you can translate this to English as "foo is a function that returns a pointer to an integer".
int *(*foo)(); // "foo is a pointer to a function returning a pointer to an int"
int (*foo[])(); // "foo is an array of pointers to functions returning ints"
Most explanations of the right-left-right rule are written for C rather than C++, so they tend to leave out references. They work just like pointers in this context.
int &foo; // "foo is a reference to an integer"
Typedefs can be your friend when things get confusing. Here's an example:
typedef const char * literal_string_pointer;
typedef literal_string_pointer * pointer_to_literal_string_pointer;
void GetPointerToString(pointer_to_literal_string_pointer out_param)
{
*out_param = "hi there";
}
All you need to know is that getting the address of an object returns a pointer to that object, and dereferencing an object takes a pointer and turns it into to object that it's pointing to.
T x;
A a = &x; // A is T*
B b = *a; // B is T
C c = &a; // C is T**
D d = *c; // D is T*
Essentially, the & operator takes a T and gives you a T* and the * operator takes a T* and gives you a T, and that applies to higher levels of abstraction equally e.g.
using & on a T* will give you a T**.
Another way of thinking about it is that the & operator adds a * to the type and the * takes one away, which leads to things like &&*&**i == i.
I'm not sure exactly what you're looking for, but I find it helpful to remember the operator precedence and associativity rules. That said, if you're ever confused, you might as well throw in some more parens to disambiguate, even if it's just for your benefit and not the compiler's.
edit: I think I might understand your question a little better now. I like to think of a chain of pointers like a stack with the value at the bottom. The dereferencing operator (*) pops you down the stack, where you find the value itself at the end. The reference operator (&) lets you push another level of indirection onto the stack. Note that it's always legal to move another step away, but attempting to dereference the value is analogous to popping an empty stack.

C++: difference between ampersand "&" and asterisk "*" in function/method declaration?

Is there some kind of subtle difference between those:
void a1(float &b) {
b=1;
};
a1(b);
and
void a1(float *b) {
(*b)=1;
};
a1(&b);
?
They both do the same (or so it seems from main() ), but the first one is obviously shorter, however most of the code I see uses second notation. Is there a difference? Maybe in case it's some object instead of float?
Both do the same, but one uses references and one uses pointers.
See my answer here for a comprehensive list of all the differences.
Yes. The * notation says that what's being pass on the stack is a pointer, ie, address of something. The & says it's a reference. The effect is similar but not identical:
Let's take two cases:
void examP(int* ip);
void examR(int& i);
int i;
If I call examP, I write
examP(&i);
which takes the address of the item and passes it on the stack. If I call examR,
examR(i);
I don't need it; now the compiler "somehow" passes a reference -- which practically means it gets and passes the address of i. On the code side, then
void examP(int* ip){
*ip += 1;
}
I have to make sure to dereference the pointer. ip += 1 does something very different.
void examR(int& i){
i += 1;
}
always updates the value of i.
For more to think about, read up on "call by reference" versus "call by value". The & notion gives C++ call by reference.
In the first example with references, you know that b can't be NULL. With the pointer example, b might be the NULL pointer.
However, note that it is possible to pass a NULL object through a reference, but it's awkward and the called procedure can assume it's an error to have done so:
a1(*(float *)NULL);
In the second example the caller has to prefix the variable name with '&' to pass the address of the variable.
This may be an advantage - the caller cannot inadvertently modify a variable by passing it as a reference when they thought they were passing by value.
Aside from syntactic sugar, the only real difference is the ability for a function parameter that is a pointer to be null. So the pointer version can be more expressive if it handles the null case properly. The null case can also have some special meaning attached to it. The reference version can only operate on values of the type specified without a null capability.
Functionally in your example, both versions do the same.
The first has the advantage that it's transparent on the call-side. Imagine how it would look for an operator:
cin >> &x;
And how it looks ugly for a swap invocation
swap(&a, &b);
You want to swap a and b. And it looks much better than when you first have to take the address. Incidentally, bjarne stroustrup writes that the major reason for references was the transparency that was added at the call side - especially for operators. Also see how it's not obvious anymore whether the following
&a + 10
Would add 10 to the content of a, calling the operator+ of it, or whether it adds 10 to a temporary pointer to a. Add that to the impossibility that you cannot overload operators for only builtin operands (like a pointer and an integer). References make this crystal clear.
Pointers are useful if you want to be able to put a "null":
a1(0);
Then in a1 the method can compare the pointer with 0 and see whether the pointer points to any object.
One big difference worth noting is what's going on outside, you either have:
a1(something);
or:
a1(&something);
I like to pass arguments by reference (always a const one :) ) when they are not modified in the function/method (and then you can also pass automatic/temporary objects inside) and pass them by pointer to signify and alert the user/reader of the code calling the method that the argument may and probably is intentionally modified inside.