Trouble understanding how pointer dereference works in C++ - c++

I'm having some trouble understanding how pointer dereferencing in C++ works. Let's look at this simple example:
#include <iostream>
struct Value {
int x = 0;
void Inc() { x++; }
};
int main(int argc, char* argv[]) {
Value* v = new Value();
v->Inc();
std::cout << v->x << std::endl; // prints 1, as I would expect
(*v).Inc();
std::cout << v->x << std::endl; // prints 2, but I would have expected it to print 1,
// as I thought (*v) would create a local copy of
// the original `Value` object.
Value v2 = *v;
v2.Inc();
std::cout << v->x << std::endl; // prints 2, as I would expect
delete v;
}
I'm a bit confused here. I would assume that the 2nd and 3rd calls to Inc() would be equivalent. Namely, that (*v).Inc() would unfold into a temporary variable holding a copy of v on the stack, and that Inc() would then increment that copy on the stack of v instead of the original v. Why is that not the case?
Thanks

In the (*v).Inc(); statement, the LHS of the . operator is the result of the indirection of the v pointer. This will be an lvalue expression referring to the object to which v points. From this Draft C++ Standard (emphasis mine):
8.5.2.1 Unary operators      [expr.unary.op]
1     The unary *
operator performs indirection: the expression to which it is applied
shall be a pointer to an object type, or a pointer to a function type
and the result is an lvalue referring to the object or function to
which the expression points.
So, in this first case, no temporary object need be created and the Inc() function is called on the original Value object created by the new operation.
However, in this statement: Value v2 = *v;, you are declaring a separate Value object and initialising it with a copy of the Value pointed to by v. Thus, any subsequent modifications to v2 will not affect the object referred to by v.
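A minimal sketch of the two cases, reusing the Value struct from the question. The helper names inc_through_indirection and inc_through_copy are made up for illustration and are not part of the original code.

```cpp
#include <cassert>

struct Value {
    int x = 0;
    void Inc() { x++; }
};

// Increments through (*v): *v is an lvalue naming the original object,
// so the object that v points to is modified.
inline int inc_through_indirection(Value* v) {
    (*v).Inc();
    return v->x;
}

// Increments a separate object that was copy-initialized from *v,
// so the object that v points to is left unchanged.
inline int inc_through_copy(Value* v) {
    Value copy = *v;          // a distinct object with its own address
    assert(&copy != v);
    copy.Inc();
    return v->x;
}
```

Calling inc_through_indirection on a fresh Value changes its x, while a following call to inc_through_copy leaves x as it was.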

*pointer yields an lvalue referring to the object the pointer points to; it does not copy that object. Quoting [expr.unary.op]/1:
The unary * operator performs indirection: the expression to which it is applied shall be a pointer to an object type, or a pointer to a function type and the result is an lvalue referring to the object or function to which the expression points
Value v2 = *v is a form of initialisation, so it actually calls a constructor. This would be equivalent to Value v2{ *v } (for this particular class).
As for why *pointer doesn't create a temporary: there are well-defined rules on when temporaries are created:
Temporary objects are created when a prvalue is materialized so that it can be used as a glvalue, which occurs (since C++17) in the following situations:
binding a reference to a prvalue
initializing an object of type std::initializer_list from a braced-init-list (since C++11)
returning a prvalue from a function
conversion that creates a prvalue (including T(a,b,c) and T{})
lambda expression (since C++11)
copy-initialization that requires conversion of the initializer
reference-initialization to a different but convertible type or to a bitfield.
plus some other scenarios in C++17. For this particular case, the important point is that indirection yields an lvalue, not a prvalue, so none of these rules applies to *v on its own; a temporary could only appear if the surrounding expression required one.

Related

Why I can take address of *v.begin() where v is a std::vector

#include <vector>
#include <cstdio>
using namespace std;
int f()
{
int* a = new int(3);
return *a;
}
int main()
{
//printf("%p\n", &f());
vector<int> v{3};
printf("%p\n", &(*(v.begin())));
}
I cannot take the address of f(). If I uncomment printf("%p\n", &f());, I get the error: lvalue required as unary ‘&’ operand.
But how is it possible to take the address of *(v.begin())? Isn't the * operator the same as a function?
The function f returns a temporary object of the type int
int f()
{
int* a = new int(3);
return *a;
}
You may not apply the address-of operator to a temporary object.
You could instead return a reference to the dynamically created object, for example
int & f()
{
int* a = new int(3);
return *a;
}
In that case, a call of printf written like
printf("%p\n", ( void * )&f());
will be correct.
As for the expression &(*(v.begin())), the dereferencing operator returns a reference to the object the iterator points to.
I cannot take the address of f(). If I uncomment printf("%p\n", &f());, I get the error: lvalue required as unary ‘&’ operand
We cannot take the address of f() because f returns by value, so the expression f() is a prvalue of type int. The built-in & operator requires an lvalue operand, which f() is not. This is exactly what the error says.
how is it possible to take address of *(v.begin())?
On the other hand, std::vector::begin returns an iterator to the first element of the vector (provided the vector is not empty). Applying * to v.begin() then gives us that first element itself (i.e. *(v.begin()) is an lvalue), whose address we can take.
As mentioned in the comments, dereferencing iterators returns a reference, i.e. a construct similar to a pointer (technically, under the hood). You can take the address of *v.begin() (the parentheses are not needed, since . has higher precedence than * anyway) because taking the address of a reference means taking the address of the referred-to object, which in this case is the first int of the vector. This is just the same as taking the address of a normal object or of the first element in an array, as if, within f, you did &a[0].
On the other hand, f itself returns a value. This is just a temporary object that doesn't have a valid address (potentially at least: depending on the calling convention, the value might be returned on the stack, in which case it does have an address, but it might just as well be returned in a CPU register, in which case there simply is no address). The value first needs to be assigned to a variable whose address you can then take.
Indeed, v.begin() returns the iterator by value, so that expression is an rvalue and &v.begin() will not work. But the iterator's dereference operator, as in *v.begin(), returns int& here (const int& for a const vector), which is an lvalue reference, so the address-of operator can be applied to it.
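The value categories discussed in these answers can be checked at compile time with decltype. The following is a small sketch; by_value stands in for the question's f, and first_element_address_matches is a hypothetical helper name.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

int by_value();  // stand-in for f(); only declared, used in decltype below

// A call to a function returning int is a prvalue of type int.
static_assert(std::is_same<decltype(by_value()), int>::value,
              "by_value() is not a reference");

// Dereferencing a vector<int> iterator yields an lvalue reference.
static_assert(
    std::is_same<decltype(*std::declval<std::vector<int>&>().begin()),
                 int&>::value,
    "*v.begin() is int& for a non-const vector");
static_assert(
    std::is_same<decltype(*std::declval<const std::vector<int>&>().begin()),
                 const int&>::value,
    "*v.begin() is const int& for a const vector");

// The address of *v.begin() is the address of the first element.
inline bool first_element_address_matches(std::vector<int>& v) {
    assert(!v.empty());
    int* p = &*v.begin();
    return p == &v[0] && p == v.data();
}
```

For a non-empty vector, first_element_address_matches should return true, since all three expressions name the same int.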

Is the object "this" points to the same as a const object?

This question has to do with overloading the assignment operator in C++. Take a look at the following code. It shows the function definition given by my book to overload the assignment operator.
const cAssignmentOprOverload& cAssignmentOprOverload::operator=(
const cAssignmentOprOverload& otherList) {
if (this != &otherList) // avoid self-assignment; Line 1
{
delete[] list; // Line 2
maxSize = otherList.maxSize; // Line 3
length = otherList.length; // Line 4
list = new int[maxSize]; // Line 5
for (int i = 0; i < length; i++) // Line 6
list[i] = otherList.list[i]; // Line 7
}
return *this; // Line 8
}
The biggest issue that makes this hard to understand is that the function returns *this. Is *this a const object? I don't think it is, so why are we allowed to return a non-const object when the return type is supposed to be const?
Inside the body of a non-static member function, the expression this can be used to get a pointer to the object the function has been called on [expr.prim.this]. Since your operator = is not a const member function, this will point to a non-const object (which makes sense since we're assigning a new value to something). Thus, *this will result in a non-const lvalue of type cAssignmentOprOverload. However, a reference to const can be bound to a non-const lvalue [dcl.init.ref]/5.1.1. In general, a less const-qualified type can always be implicitly converted to a more const-qualified one. This makes sense: you should be able to use a modifiable object in places where a non-modifiable one is sufficient. Nothing can really go wrong when a modifiable object is treated as non-modifiable. All that happens is that you lose the information that the object was actually modifiable. The opposite, treating a non-modifiable object as modifiable, is what causes problems…
Note that this way of writing an overloaded operator = is not how this is typically done. The canonical form would be
cAssignmentOprOverload& operator=(const cAssignmentOprOverload& otherList)
i.e., returning a reference to non-const…
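As a small illustration of both points, here is a sketch with a hypothetical Counter class (not the class from the question): a non-const *this binds to a reference to const, and the canonical operator= returns a reference to non-const.

```cpp
#include <cassert>
#include <type_traits>

struct Counter {
    int n = 0;

    // In a non-const member function, *this is a non-const lvalue,
    // yet it can be returned as a reference to const.
    const Counter& as_const() {
        static_assert(std::is_same<decltype(*this), Counter&>::value,
                      "*this is a non-const lvalue here");
        return *this;
    }

    // Canonical form: return a reference to non-const.
    Counter& operator=(const Counter& other) {
        if (this != &other) {
            n = other.n;
        }
        return *this;
    }
};
```

With the non-const return type, the result of an assignment can itself be modified, as in (a = b).n = 7; with a const return type that statement would not compile.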
From the rules on implicit conversions (qualification conversions):
A prvalue of type pointer to cv-qualified type T can be converted to a prvalue pointer to a more cv-qualified same type T (in other words, constness and volatility can be added).

What does the address of an lvalue reference to a prvalue represent?

When a function parameter is of type lvalue reference lref:
void PrintAddress(const std::string& lref) {
std::cout << &lref << std::endl;
}
and lref is bound to a prvalue:
PrintAddress(lref.substr() /* temporary of type std::string */)
what does the address represent? What lives there?
A prvalue cannot have its address taken. But an lvalue reference to a prvalue can have its address taken, which is curious to me.
Inside the function, lref is not a prvalue; it is an lvalue, and you can take its address.
There is a common misconception about rvalues vs. lvalues.
A named parameter is always an lvalue. No matter whether it is a reference type that is bound to an rvalue. Through a const & reference type you can't even tell which kind of value category the object actually has at the point where the function is called. Rvalue references and non-const Lvalue references give you that information:
void foo(std::string& L, std::string&& R)
{
// yeah i know L is already an lvalue at the point where foo is called
// R on the other hand is an rvalue at the point where we get called
// so we can 'safely' move from it or something...
}
The temporary string is a prvalue in the context of the caller (at the point PrintAddress is called). Within the context of the callee (in PrintAddress), lref is a named lvalue reference, so the expression lref is an lvalue.
PrintAddress isn't aware of the limited lifetime of the passed argument and from PrintAddress' point of view the object is "always" there.
std::string q("abcd");
PrintAddress(q.substr(1)); // print address of temporary
is conceptually equivalent to:
std::string q("abcd");
{
const std::string& lref = q.substr(1);
std::cout << &lref << std::endl;
}
where the temporary's lifetime is extended to the end of the scope in which lref is defined. (In the actual function call, the temporary lives until the end of the full-expression containing the call, which covers the whole body of PrintAddress.)
what does the address represent? What lives there?
A std::string object containing the passed content.
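A short sketch of what that address refers to. The helpers address_of and size_via_address are hypothetical names introduced for illustration; the address is only usable while the temporary is alive, that is, within the full-expression that contains the call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the address of whatever object the reference is bound to.
inline const std::string* address_of(const std::string& lref) {
    return &lref;
}

inline std::size_t size_via_address() {
    std::string q("abcd");
    // For an lvalue argument, the address is that of the original object.
    assert(address_of(q) == &q);
    // For a temporary, the address is that of the temporary std::string.
    // It stays valid until the end of this full-expression, so using it
    // here is fine; storing it for later use would leave it dangling.
    return address_of(q.substr(1))->size();
}
```

Here size_via_address reads the size of the temporary "bcd" through its address before the temporary is destroyed.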
And is it legal (in C++, and with respect to memory) to write to that address?
No, not through a reference to const. It would be legal if you used an rvalue reference:
void PrintAddressR(std::string&& rref) {
rref += "Hello"; // writing possible
std::cout << &rref << std::endl; // taking the address possible
}
// ...
PrintAddressR(q.substr(1)); // yep, can do that...
The same applies here: rref is an lvalue (it has a name) so you can take its address plus it is mutable.
In short: the temporary created from the prvalue is bound to a reference, and the name of that reference is an lvalue, so its address can be taken.
what does the address represent? What lives there?
The address represents an object, the object referenced by lref.
A temporary created from a prvalue is short-lived: it is normally destroyed at the end of the full-expression that creates it.
But when you bind a reference to a prvalue (either an rvalue reference or a const lvalue reference), the temporary's lifetime is extended. Reference:
An rvalue may be used to initialize a const lvalue [rvalue] reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
Now it actually makes sense to take its address: within the scope of the reference the object can be named like any other object, and that name is an lvalue.
Taking the address of a prvalue doesn't make sense, however, and that's probably why it is disallowed:
1. The value is destroyed at the end of the statement, so you can't do anything with the address, except maybe print it out.
2. If you take the address of something, the compiler is required to actually create the object. Sometimes the compiler will optimize out trivial variables, but if you take their address, it may no longer be allowed to do so.
Taking the address of a prvalue would thus prevent the compiler from eliding the value completely, for no advantage whatsoever (see point 1).
In simple English:
void PrintAddress(const std::string& lref) {
std::cout << &lref << std::endl;
}
Any object that has a name is an lvalue, hence any use of lref within the scope of the function above is an lvalue use.
When you called the function with:
PrintAddress(lref.substr() /* temporary of type std::string */)
Of course, lref.substr() produces a temporary, which is an rvalue, but rvalues can bind to (and have their lifetime extended by) const lvalue references or rvalue references.
Even if you provide an rvalue overload, the parameter has a name, so it is an lvalue within its scope. For example:
#include <string>
#include <iostream>
void PrintAddress(const std::string& lref) {
std::cout << "LValue: " << &lref << std::endl;
}
void PrintAddress(std::string&& rref) {
std::cout << "RValue: " << &rref << std::endl; //You can take address of `rref`
}
int main(){
std::string str = "Hahaha";
PrintAddress(str);
PrintAddress(str.substr(2));
}
Just remember:
In C++, any variable (whether of value type, reference type or pointer type) that has a name is an lvalue when used as an expression
Also know that some expressions produce lvalues too.
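One way to observe the rule is through overload resolution. This is a sketch; the function names category, called_with_name and called_with_move are invented for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

inline std::string category(const std::string&) { return "lvalue"; }
inline std::string category(std::string&&) { return "rvalue"; }

// rref is declared as std::string&&, but the expression `rref` has a
// name and is therefore an lvalue: the const& overload is selected.
inline std::string called_with_name(std::string&& rref) {
    const std::string* p = &rref;  // taking the address is allowed
    assert(p == &rref);
    return category(rref);
}

// std::move(rref) turns the expression back into an rvalue (an xvalue),
// so the && overload is selected.
inline std::string called_with_move(std::string&& rref) {
    return category(std::move(rref));
}
```

Both helpers must be called with an rvalue argument, for example called_with_name(str.substr(2)); only the expression used inside the body differs.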

C++ Type and Value Category for Expression and Variable

From this link, it says that
Objects, references, functions including function template specializations, and expressions have a property called type
So given the following:
int &&rf_int = 10;
I can say that variable rf_int is of compound type rvalue reference to int.
But when talking about value category, it specifically says that
Each expression has some non-reference type
and
Each C++ expression (an operator with its operands, a literal, a variable name, etc.)
Based on the above two statement, rf_int can be treated as an expression and expression has non-reference type.
Now I am really confused. Does rf_int have a reference type or not? Do we have to provide context when talking about the type of a name, be it a variable or an expression?
More specifically, when a variable name is used in function call:
SomeFunc(rf_int);
Is rf_int now considered an expression (thus it is an lvalue with type int), or a variable (thus it is an lvalue with type rvalue reference to int)?
EDIT: A comment here got me wondering about this issue.
It confused me first too but let me clear out the ambiguity in a simple way.
EXPRESSION is something that can be evaluated and must evaluate to a non-reference type right ?
( Yes, of course duh !! )
Now we also know a variable name is an l-value EXPRESSION.
( Dude get to the point already and stop making the expression word bold )
Okay now here is the catch, when we say a variable we mean a place in memory. Now will we call a place in memory an expression ? No, definitely not that is just completely absurd.
Expression is a generic term that we identify by defining some rules and anything that fall under those rules is an expression. It is necessary to define it this way to make sense of the code during compiler construction. One of that rule is that anything that evaluates to a value is an expression. Since from coding perspective using a variable name means that you wish the actual value is used when the code is compiled, so we call that variable name an expression.
So when they say a variable is an expression they don't mean the variable as in place in memory but the variable NAME from coding perspective. But using the term "variable name" to differentiate from the actual variable (place in memory) is just absurd. So saying "variable is an expression" is just fine as long as you think it from coding perspective.
Now to answer this first :
More specifically, when a variable name is used in function call:
SomeFunc(rf_int);
Is rf_int now considered an expression (thus it is an lvalue with
type int), or a variable (thus it is an lvalue with type rvalue
reference to int)?
A single variable is also an expression. So this question becomes invalid.
Now to answer this :
Based on the above two statement, rf_int can be treated as an
expression and expression has non-reference type.
Now I am really confused. Does rf_int have a reference type or not?
What if I say rf_int is an r-value reference and rf_int is also an l-value EXPRESSION.
( Oh brother, this guy and his obsession with expression )
It is true because if you do the following it will work.
int &&rf_int = 10; // rf_int is an r-value reference
int &x = rf_int; // x is an l-value reference and l-value reference can be initialized with l-value expression
cout << x; //Output will be 10
Now is rf_int an expression or a r-value reference, what will it be ? The answer is both. It depends on from which perspective you are thinking.
In other words, if we think of rf_int as a variable (some place in memory), then it surely has the type r-value reference. But rf_int is also a variable name, and from a coding perspective that name is an expression, more precisely an l-value expression. Whenever you use this variable for the sake of evaluation you get the value 10, which is an int, so we are bound to say that the type of rf_int as an expression is int, a non-reference type.
If you think from the compiler's perspective for a moment, what line of code evaluates to a reference? None, right? You can try searching, and if you find any, do let me know as well. But the point here is that the type of an expression is not the type of the variable. It is the type of the value that you get after evaluating the expression.
Hope I have clarified your question. If I missed something do let me know.
Does rf_int have a reference type or not?
The entity (or variable) with the name rf_int has type int&& (a reference type) because of the way it is declared, but the expression rf_int has type int (a non-reference type) per [expr]/5:
If an expression initially has the type “reference to T” ([dcl.ref],
[dcl.init.ref]), the type is adjusted to T prior to any further
analysis. The expression designates the object or function denoted by
the reference, and the expression is an lvalue or an xvalue, depending
on the expression. [ Note: Before the lifetime of the reference
has started or after it has ended, the behavior is undefined (see
[basic.life]).  — end note ]
Do we have to provide context when talking about the type of a name,
be it a variable or an expression?
Yes, we do. rf_int can be said to have different types depending on whether it refers to the entity or the expression.
More specifically, when a variable name is used in function call:
SomeFunc(rf_int);
Is rf_int now considered an expression (thus it is an lvalue with
type int), or a variable (thus it is an lvalue with type rvalue
reference to int)?
It is considered an expression, which is an lvalue of type int. (Note that value category is a property of expressions. It is not correct to say a variable is an lvalue.)
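The distinction between the declared type of the entity and the type and value category of the expression can be checked with decltype. This is a sketch; the function name read_through_lvalue_ref is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

inline int read_through_lvalue_ref() {
    int&& rf_int = 10;

    // decltype(name) reports the declared type of the entity.
    static_assert(std::is_same<decltype(rf_int), int&&>::value,
                  "the variable rf_int has type int&&");
    // decltype((expr)) reports the expression: T& means lvalue of type T.
    static_assert(std::is_same<decltype((rf_int)), int&>::value,
                  "the expression rf_int is an lvalue of type int");

    int& x = rf_int;      // OK: an lvalue reference binds to an lvalue
    // int&& y = rf_int;  // error: an rvalue reference cannot bind to an lvalue
    assert(&x == &rf_int);
    x = 42;
    return rf_int;        // reads the same object that x refers to
}
```

The extra parentheses in decltype((rf_int)) are what switch decltype from inspecting the declaration to inspecting the expression.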
Based on this, each function call is an expression.
Each argument passed to the function is also an expression.
Therefore, when you make a call to SomeFunc(rf_int); you create an expression from the single rf_int variable.
ISO/IEC 14882 (C++14 standard) states that "The expression designates the object or function denoted by the reference, and the expression is an lvalue or an xvalue, depending on the expression."
Thus, the rf_int expression will be of type int.
This comment (you mentioned it earlier) is about a type conversion error.
To illustrate my explanation, I have prepared a more complex, but (hopefully) more understandable example.
This example is about why we need the rvalue reference type and how to use it correctly.
class Obj {
int* pvalue;
public:
Obj(int m) {
pvalue = new int[100];
for(int i = 0; i < 100; i++)
pvalue[i] = m + i;
}
Obj(const Obj& o) { // copy constructor
pvalue = new int[100];
for(int i = 0; i < 100; i++)
pvalue[i] = o.pvalue[i];
}
Obj(Obj&& o) noexcept { // move constructor: take over the buffer
pvalue = o.pvalue;
o.pvalue = nullptr;
}
// ...
};
Example 1
Obj obj1(3);
Obj obj2(obj1); // copy constructor
Obj obj3(std::move(obj1)); // move constructor
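Obj obj4(Obj(3)); // move constructor (the move is usually elided; guaranteed since C++17)
Line #1 - obj1 created.
Line #2 - obj2 created as a copy of obj1.
Line #3 - obj3 created by moving the value of obj1 into obj3; obj1 loses its value.
Line #4 - temporary object created by Obj(3); obj4 is created by moving the value of the temporary into obj4, so the temporary loses its value. (Compilers usually elide this move, and since C++17 obj4 is initialized directly without a temporary.)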
I think we have no real need of int&& but Obj&& can be very useful. Especially in the line #4.
Example 2
with my (simplified) interpretation of compiler errors
void SomeFunc0(Obj arg) {};
void SomeFunc1(Obj& arg) {};
void SomeFunc2(Obj&& arg) {};
int main()
{
Obj obj1(3); // object
Obj& obj2 = obj1; // reference to the object
Obj&& obj3 = Obj(3); // reference to the temporary object with extended lifetime
SomeFunc0(obj1); // ok - new object created from Obj
SomeFunc0(obj2); // ok - new object created from Obj&
SomeFunc0(obj3); // ok - new object created from Obj&&
SomeFunc0(Obj(3)); // ok - new object created from temporary object
SomeFunc0(std::move(obj1)); // ok - new object move-constructed from obj1 (xvalue)
SomeFunc0(std::move(obj2)); // ok - new object move-constructed from obj1, via obj2 (xvalue)
SomeFunc0(std::move(obj3)); // ok - new object move-constructed from the temporary, via obj3 (xvalue)
SomeFunc1(obj1); // ok - reference to obj1 passed
SomeFunc1(obj2); // ok - reference to obj1 passed
SomeFunc1(obj3); // ok - reference to temp. obj. passed
SomeFunc1(Obj(3)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc1(std::move(obj1)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc1(std::move(obj2)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc1(std::move(obj3)); // error - non-const lvalue reference cannot bind to an rvalue
SomeFunc2(obj1); // error - rvalue reference cannot bind to an lvalue
SomeFunc2(obj2); // error - rvalue reference cannot bind to an lvalue
SomeFunc2(obj3); // error - obj3 has a name, so the expression is an lvalue
SomeFunc2(Obj(3)); // ok
SomeFunc2(std::move(obj1)); // ok
SomeFunc2(std::move(obj2)); // ok
SomeFunc2(std::move(obj3)); // ok
return 0;
}
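The ok/error table above can also be verified by the compiler without writing lines that fail to build. The sketch below uses a small detection trait, can_call, which is a helper written for this example (not a standard facility), and an empty stand-in Obj. An argument type of Obj& models an lvalue argument; Obj models an rvalue argument.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Obj {};  // empty stand-in for the class above

// can_call<F, Arg>::value is true when calling an F with an argument
// of type and value category Arg is well-formed.
template <typename F, typename Arg, typename = void>
struct can_call : std::false_type {};

template <typename F, typename Arg>
struct can_call<F, Arg,
                decltype(void(std::declval<F>()(std::declval<Arg>())))>
    : std::true_type {};

using ByValue = void (*)(Obj);
using ByLRef = void (*)(Obj&);
using ByConstLRef = void (*)(const Obj&);
using ByRRef = void (*)(Obj&&);

static_assert(can_call<ByValue, Obj&>::value, "by value accepts lvalues");
static_assert(can_call<ByValue, Obj>::value, "by value accepts rvalues");
static_assert(can_call<ByLRef, Obj&>::value, "Obj& binds to lvalues");
static_assert(!can_call<ByLRef, Obj>::value, "Obj& rejects rvalues");
static_assert(can_call<ByConstLRef, Obj>::value, "const Obj& accepts rvalues");
static_assert(can_call<ByRRef, Obj>::value, "Obj&& binds to rvalues");
static_assert(!can_call<ByRRef, Obj&>::value, "Obj&& rejects lvalues");

// The same facts are available at run time, e.g. for printing.
inline void check_at_runtime() {
    assert((can_call<ByRRef, Obj>::value));
    assert((!can_call<ByRRef, Obj&>::value));
}
```

The rejected cases correspond to the error lines in Example 2: the reference simply cannot bind to an argument of that value category.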

Auto reference for const array

I'm getting into C++11 and really can't understand why this happens:
const int arrSource[4] = { 5,7,6,4 };
for (auto& i : arrSource) {
std::cout << i << " ";
++i; //error
}
It says that i must be a modifiable lvalue and i: you cannot assign to a variable that is const.
So it means, that if the arrSource[] is const, it makes i const too?
So it means, that if the arrSource[] is const, it makes i const too?
Yes, if the array is const, each element in the array is also const.
The auto& deduces the type based on the initialiser, in this case it is deduced to be int const& and hence cannot be modified.
The increment is probably not needed (not sure on your intent). The range based for loop takes care of the incrementing between iterations.
If modification of the array is intended (via i), then you need to remove the const.
N4567 § 3.9.3 [basic.type.qualifier] p6
Cv-qualifiers applied to an array type attach to the underlying element type, so the notation “cv T”, where
T is an array type, refers to an array whose elements are so-qualified. An array type whose elements are
cv-qualified is also considered to have the same cv-qualifications as its elements.
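A brief sketch of the deduction, using the array from the question. The function name sum_of_incremented_copies is made up; it shows that auto& deduces const int& for a const array, while plain auto gives a modifiable copy of each element.

```cpp
#include <cassert>
#include <type_traits>

inline int sum_of_incremented_copies() {
    const int arrSource[4] = {5, 7, 6, 4};

    for (auto& i : arrSource) {
        // The elements are const int, so auto& deduces const int&.
        static_assert(std::is_same<decltype(i), const int&>::value,
                      "auto& over a const array yields const int&");
        (void)i;  // ++i would not compile here
    }

    int sum = 0;
    for (auto i : arrSource) {
        // Plain auto drops the const: i is a modifiable copy.
        static_assert(std::is_same<decltype(i), int>::value,
                      "auto over a const array yields int");
        ++i;
        sum += i;
    }

    assert(arrSource[0] == 5);  // the const array itself is untouched
    return sum;
}
```

If the elements themselves must change, the array has to be declared without const, as the answer says.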