Clarification on smart pointer's operator* and operator-> overloading - c++

It's been a while since I used C++, so here is a (probably dumb) question:
A basic smart pointer object should behave like a normal pointer, so in a typical implementation we add the * and -> operators to the class, something like this:
template <class T> class auto_ptr
{
T* ptr;
public:
explicit auto_ptr(T* p = 0) : ptr(p) {}
~auto_ptr() {delete ptr;}
T& operator*() {return *ptr;}
T* operator->() {return ptr;}
// ...
};
Now, as far as I know, the C++ * operator (dereference) stands for: "get the value pointed to in the heap by the value of ptr" (is that right?), and the type of *ptr should be T. So why would we return an address?
T& operator*() {return *ptr;}
Instead of:
T operator*() {return *ptr;}
Second, by having the following snippet:
void foo()
{
auto_ptr<MyClass> p(new MyClass);
p->DoSomething();
}
Now, how can I access the ptr->DoSomething() method by just writing p->DoSomething()? Logically I would end up writing the wrong code:
p->->DoSomething();
Because p-> returns a T*, and then I need another -> operator to access the DoSomething() method.
Thanks for any answer/clarification, and sorry for any bad English.

In C++, when you evaluate a function, you end up with a value (unless the function's return type is void). The type of a value is always an object type. So when you say f(), that expression is a value of type T. However, there are different categories of value:
T f(); => f() is a prvalue, passed along by copy
T & f(); => f() is an lvalue, the same object that is bound to "return"
T && f(); => f() is an xvalue, the same object that is bound to "return"
So if you want a function to produce an existing value that you don't want to copy, you have to declare the return type of the function as one of the reference types. If the return type is not a reference type, then a copy of the return value will be made, and the caller only ever sees that copy.
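As a rough sketch of the three cases listed above (Widget, storage and the three function names are made up for illustration), decltype applied to a call expression reports T for a prvalue, T& for an lvalue and T&& for an xvalue:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Widget { int id; };

Widget storage{1};  // hypothetical object that the reference-returning functions hand back

Widget   by_value() { return storage; }             // call is a prvalue: caller gets a copy
Widget&  by_lref()  { return storage; }             // call is an lvalue: caller sees storage itself
Widget&& by_rref()  { return std::move(storage); }  // call is an xvalue: still storage itself

// decltype((expr)) yields T for a prvalue, T& for an lvalue, T&& for an xvalue.
static_assert(std::is_same<decltype((by_value())), Widget>::value, "prvalue");
static_assert(std::is_same<decltype((by_lref())), Widget&>::value, "lvalue");
static_assert(std::is_same<decltype((by_rref())), Widget&&>::value, "xvalue");

// Returns true when a write to the by-value result leaves storage alone
// while a write through the reference result reaches storage.
bool reference_aliases_original() {
    storage.id = 1;
    Widget copy = by_value();
    copy.id = 99;                              // changes only the copy
    bool copy_is_separate = (storage.id == 1);
    by_lref().id = 2;                          // changes storage
    assert(&by_lref() == &storage);            // same object, not a copy
    return copy_is_separate && storage.id == 2;
}
```

Calling reference_aliases_original() from main should return true: only the reference-returning call lets the caller reach the original object.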

The dereference operator returns a reference because then you can do e.g.
*somePointer = someValue;
and the value of what somePointer points to would change to someValue. If you returned by value, the above expression would have a temporary value that is assigned to, and then that temporary value is destructed and the change is lost.
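To make the "change is lost" point concrete, here is a minimal sketch with two hypothetical wrappers (RefDeref and ValueDeref are illustration names, with no ownership or copy control). It uses a class type as the pointee because, for a built-in type such as int, assigning to the by-value result would not even compile:

```cpp
#include <cassert>

struct Counter {
    int n;
    void bump() { ++n; }
};

// operator* returns a reference: the expression *p denotes the pointee itself.
template <class T>
struct RefDeref {
    T* ptr;
    T& operator*() const { return *ptr; }
};

// operator* returns by value: the expression *p is a temporary copy.
template <class T>
struct ValueDeref {
    T* ptr;
    T operator*() const { return *ptr; }
};

int modify_through_reference() {
    Counter c{0};
    RefDeref<Counter> p{&c};
    *p = Counter{41};   // assigns to c itself
    (*p).bump();        // c.n becomes 42
    return c.n;
}

int modify_through_copy() {
    Counter c{0};
    ValueDeref<Counter> p{&c};
    *p = Counter{41};   // assigns to a temporary, which is destroyed right away
    (*p).bump();        // bumps another temporary
    assert(c.n == 0);   // c never changed
    return c.n;
}
```

Here modify_through_reference() returns 42 and modify_through_copy() returns 0.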

The reason you don't have to write p->->DoSomething is that an overloaded operator-> is applied repeatedly until the result is a raw pointer.
p-> calls your operator->, which returns a T*. That is a raw pointer, so the chain stops there and the built-in -> is applied to it, selecting DoSomething on the MyClass object it points to.
Note that smart pointers aren't considered pointers in this case.
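A small sketch of that drill-down, assuming two made-up wrapper types (Inner and Outer are not from the question). A single -> in the source ends up calling both overloads before the built-in -> is applied:

```cpp
#include <cassert>

struct MyClass {
    int calls = 0;
    void DoSomething() { ++calls; }
};

// Innermost wrapper: operator-> returns a raw pointer, which ends the chain.
struct Inner {
    MyClass* raw;
    MyClass* operator->() const { return raw; }
};

// Outer wrapper: operator-> returns another class type, so the compiler
// applies Inner::operator-> to that result as well.
struct Outer {
    Inner inner;
    Inner operator->() const { return inner; }
};

int call_through_chain() {
    MyClass obj;
    Outer p{Inner{&obj}};
    p->DoSomething();                             // one arrow in the source
    p.operator->().operator->()->DoSomething();   // the same call spelled out
    assert(obj.calls == 2);
    return obj.calls;
}
```

Both statements reach the same MyClass object, so call_through_chain() returns 2.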

Related

Why do we return *this in assignment operator and generally (and not &this) when we want to return a reference to the object?

I'm learning C++ and pointers and I thought I understood pointers until I saw this.
On one hand, the asterisk (*) operator is dereferencing, which means it returns the value at the address the pointer is pointing to, and the ampersand (&) operator is the opposite, and returns the address of where the value is stored in memory.
Reading now about assignment overloading, it says "we return *this because we want to return a reference to the object". Though from what I read *this actually returns the value of this, and actually &this logically should be returned if we want to return a reference to the object.
How does this add up? I guess I'm missing something here because I didn't find this question asked elsewhere, but the explanation seems like the complete opposite of what should be, regarding the logic of * to dereference, & get a reference.
For example here:
struct A {
A& operator=(const A&) {
cout << "A::operator=(const A&)" << endl;
return *this;
}
};
this is a pointer that keeps the address of the current object. So dereferencing the pointer like *this you will get the lvalue of the current object itself. And the return type of the copy assignment operator of the presented class is A&. So returning the expression *this you are returning a reference to the current object.
According to the C++ 17 Standard (8.1.2 This)
1 The keyword this names a pointer to the object for which a
non-static member function (12.2.2.1) is invoked or a non-static data
member’s initializer (12.2) is evaluated.
Consider the following code snippet as a simplified example.
int x = 10;
int *this_x = &x;
Now to return a reference to the object you need to use the expression *this_x as for example
std::cout << *this_x << '\n';
& has multiple meanings depending on the context. In C, used in an expression, it can be either the bitwise AND operator or the address-of operator applied to something named by a symbol.
In C++, after a type name, it also means that what follows is a reference to an object of this type.
This means that if you write:
int a = 0;
int & b = a;
… b will become de facto an alias of a.
In your example, operator= is declared to return a reference to an object of type A (not a pointer to it). Callers see it as an A, but what is actually returned is the existing object itself, namely the instance on which the member function was called.
Yes, *this is (the value of?) the current object. But the pointer to the current object is this, not &this.
&this, if it were legal, would be a pointer-to-pointer to the current object. But it's illegal, since this (the pointer itself) is a prvalue, and you can't take the address of a prvalue with &.
It would make more sense to ask why we don't do return this;.
The answer is: forming a pointer requires &, but forming a reference doesn't. Compare:
int x = 42;
int *ptr = &x;
int &ref = x;
So, similarly:
int *f1() { return &x; }
int &f2() { return x; }
A simple mnemonic you can use is that the * and & operators match the type syntax of the thing you're converting from, not the thing you're converting to:
* converts a foo* to a foo&
& converts a foo& to a foo*
In expressions, there's no meaningful difference between foo and foo&, so I could have said that * converts foo* to foo, but the version above is easier to remember.
C++ inherited its type syntax from C, and C type syntax named types after the expression syntax for using them, not the syntax for creating them. Arrays are written foo x[...] because you use them by accessing an element, and pointers are written foo *x because you use them by dereferencing them. Pointers to arrays are written foo (*x)[...] because you use them by dereferencing them and then accessing an element, while arrays of pointers are written foo *x[...] because you use them by accessing an element and then dereferencing it. People don't like the syntax, but it's consistent.
References were added later, and break the consistency, because there isn't any syntax for using a reference that differs from using the referenced object "directly". As a result, you shouldn't try to make sense of the type syntax for references. It just is.
The reason this is a pointer is also purely historical: this was added to C++ before references were. But since it is a pointer, and you need a reference, you have to use * to get rid of the *.
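Putting the pieces together, a minimal sketch (the value member and the address() helper are added here purely for illustration) showing that return *this; hands back the object itself, which is what makes chained assignment work:

```cpp
#include <cassert>

struct A {
    int value = 0;

    A& operator=(const A& other) {
        value = other.value;
        return *this;   // this is an A*; *this is the object; the A& return type binds to it
    }

    // Illustration only: returning the pointer itself needs no operator at all.
    A* address() { return this; }
};

bool assignment_returns_same_object() {
    A a, b, c;
    c.value = 7;
    a = b = c;              // parsed as a = (b = c)
    A& result = (a = c);    // a reference to a, not a copy of it
    assert(a.address() == &a);
    return a.value == 7 && b.value == 7 && &result == &a;
}
```

assignment_returns_same_object() should return true: the result of the assignment expression has the same address as a.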

Why can delete operator be used in const context?

This question is different from:
Is a destructor considered a const function?
new-expression and delete-expression on const reference and const pointer
Deleting a pointer to const (T const*)
I wrote a class Test like this.
class Test {
private:
int *p;
public:
//constructor
Test(int i) {
p = new int(i);
}
Test & operator = (const Test &rhs) {
delete p;
p = new int(*(rhs.p));
return *this;
}
};
When the parameter rhs of the operator function is itself (i.e. Test t(3); t = t;), delete p; also changes the pointer p of rhs. Why is this allowed?
C++ standard (N3092, "3.7.4.2 Deallocation functions") says
If the argument given to a deallocation function in the standard library is a pointer that is not the null pointer value (4.10), the deallocation function shall deallocate the storage referenced by the pointer, rendering invalid all pointers referring to any part of the deallocated storage. The effect of using an invalid pointer value (including passing it to a deallocation function) is undefined.
(Note: a delete-expression internally calls a deallocation function, so this excerpt is relevant to the delete operator.)
So I think delete p; may change the member p of rhs though rhs is a const reference.
Someone may insist that "to render a pointer invalid is not to change the value of a pointer" but I don't find such a statement in the standard. I suspect there is a possibility that the address stored in rhs's p has changed after delete p; in operator =(*).
(*): Whether or not this situation can be reproduced on popular compilers doesn't matter. I want a theoretical guarantee.
Supplement:
I've changed delete p; to delete rhs.p;, but it still works. Why?
Full code here:
#include <iostream>
class Test {
private:
int *p;
//print the address of a pointer
void print_address() const {
std::cout << "p: " << p << "\n";
}
public:
//constructor
Test(int i) {
p = new int(i);
}
Test & operator = (const Test &rhs) {
print_address(); //=> output1
delete rhs.p;
print_address(); //=> output2
p = new int(*(rhs.p));
return *this;
}
};
int main() {
Test t(3);
t = t;
}
In this case, it is guaranteed that p is invalidated. But who guarantees invalidate != (change the value)? i.e. Does the standard guarantee that output1 and output2 are the same?
So I think delete p; may change the member p of rhs though rhs is a const reference.
No. delete p; doesn't change p. Invalidation is not modification.
Regardless, having a const reference to an object (rhs) does not by any means prevent the referred object from being modified. It merely prevents modification through the const reference. In this case we access the object through this which happens to be a pointer to non-const, so modification is allowed.
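A tiny sketch of that last point, using a plain int instead of the Test class (the function name is made up for illustration): the const reference only restricts what can be done through that name, not what happens to the object.

```cpp
#include <cassert>

int observe_through_const_ref() {
    int x = 1;
    const int& r = x;   // read-only view of x
    // r = 2;           // would not compile: no modification through r
    x = 2;              // fine: x itself is not const
    assert(&r == &x);   // r and x name the same object
    return r;           // r sees the new value
}
```

observe_through_const_ref() returns 2, even though r was never assigned through.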
Someone may insist that "to render a pointer invalid is not to change the value of a pointer" but I don't find such a statement in the standard.
The behaviour of delete expression is specified in [expr.delete]. Nowhere in that section does it mention that the operand is modified.
Becoming invalid is specified like this:
[basic.compound]
... A pointer value becomes invalid when the storage it denotes reaches the end of its storage duration ...
Note that it is the value that becomes invalid. The pointer still has the same value because the pointer was not modified. The value that the pointer had and still has is simply a value that no longer points to an object - it is invalid.
Supplement: I've changed delete p; to delete rhs.p;, but it still works. Why?
The second point above (modification through this) no longer applies, but the first one does: delete rhs.p; does not modify rhs.p.
Calling delete on a member pointer frees the memory the pointer points to but does not change the pointer itself. Thus, it does not change the bitwise contents of the object, thus it can be done in a const member.
C++ only cares about bitwise const (of the object the method is invoked on). Not logical const. If no bits in the object change, then all is well - const wise - as far as the C++ language is concerned. It does not matter whether the logical behaviour of the object is changed (for example by changing something member pointers point to). That's not what the compiler checks for.
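Separately from the const question, note that in the self-assignment case (t = t) the original operator= reads *(rhs.p) after the pointee has already been deleted. The answers above do not address this. One possible sketch of a fix, assuming a copy-before-delete approach, with the copy constructor, destructor and value() accessor added only to make the example complete:

```cpp
#include <cassert>

class Test {
    int* p;
public:
    explicit Test(int i) : p(new int(i)) {}
    Test(const Test& other) : p(new int(*other.p)) {}
    ~Test() { delete p; }

    Test& operator=(const Test& rhs) {
        int* fresh = new int(*rhs.p);  // read rhs while its pointee is still alive
        delete p;                      // frees the old int; the delete does not write to p
        p = fresh;                     // this assignment is the only modification of p
        return *this;
    }

    int value() const { return *p; }
};

int self_assign_keeps_value() {
    Test t(3);
    Test& alias = t;
    t = alias;              // self-assignment through an alias
    assert(t.value() == 3);
    return t.value();
}
```

With this ordering, self_assign_keeps_value() returns 3, because the new int is created before the old one is released.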

Regarding definition of dereferencing and member selection operators in smart pointer

In smart pointer implementation, dereferencing operator and member selection operators are always defined as below.
T& operator* () const // dereferencing operator
{
return *(m_pRawPointer);
}
T* operator->() const // member selection operator
{
return m_pRawPointer;
}
I don't quite understand why the former is returned by reference, the latter is returned by pointer. Is it just to differentiate them or some other reasons?
To be more specific, can I make dereferencing operator returns by pointer, while the other one returns by reference?
why the former is returned by reference
So that the expression *thing gives an lvalue denoting an object of type T, just as it would if thing were a pointer.
the latter is returned by pointer
Because that's how the language is specified. Note that you never use the result of -> directly, but always in an expression of the form thing->member.
If thing is a class type, that's evaluated by calling operator->, then applying ->member to the result of that. To support that, it must return either a pointer, or another class type which also overloads operator->.
can I make dereferencing operator returns by pointer
Yes, but that would be rather confusing since it would behave differently from applying the same operator to a pointer. You'd have to say **thing to access the T.
while the other one returns by reference
No, because that would break the language's built-in assumptions about how the overloaded operator should work, making it unusable.
The reason that the dereference operator returns by reference and the member selection operator returns by pointer is to line up the syntax of using a smart pointer with the syntax of using a raw pointer:
int* p = new int(42);                 // raw pointer
*p = 7;
std::unique_ptr<int> q(new int(42));  // smart pointer, same syntax
*q = 7;
You could absolutely make your dereference operator return anything you like:
struct IntPtr {
int* p;
int* operator*() { return p; }
};
But that would be pretty confusing for your users when they have to write:
IntPtr p{new int{42}};
**p = 7;
The arrow operator is a little different in that [over.ref]:
An expression x->m is interpreted as (x.operator->())->m
So you have to return something on which you can call ->m, otherwise you'll just get an error like (from gcc):
error: result of 'operator->()' yields non-pointer result
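For completeness, a minimal sketch of a smart pointer whose two operators follow the conventional signatures from the question. SimplePtr is a made-up name, and the deleted copy operations are an assumption added to keep the example safe. Used side by side with a raw pointer, the syntax is identical:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

template <class T>
class SimplePtr {
    T* m_pRawPointer;
public:
    explicit SimplePtr(T* p) : m_pRawPointer(p) {}
    ~SimplePtr() { delete m_pRawPointer; }
    SimplePtr(const SimplePtr&) = delete;             // no copying: a single owner
    SimplePtr& operator=(const SimplePtr&) = delete;

    T& operator*() const { return *m_pRawPointer; }   // *sp is an lvalue of type T
    T* operator->() const { return m_pRawPointer; }   // sp->m becomes (sp.operator->())->m
};

std::size_t same_syntax_as_raw_pointer() {
    std::string* raw = new std::string("ab");
    SimplePtr<std::string> smart(new std::string("ab"));

    *raw += "c";      // built-in dereference
    *smart += "c";    // overloaded dereference, same spelling

    std::size_t total = raw->size() + smart->size();  // 3 + 3
    assert((smart.operator->())->size() == smart->size());
    delete raw;       // smart releases its own string in the destructor
    return total;
}
```

same_syntax_as_raw_pointer() returns 6, and the assert inside it checks the [over.ref] rewrite quoted above.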

Invalid initialization of non-const reference from a rvalue

MyObject& MyObject::operator++(int)
{
MyObject e;
e.setVector(this->vector);
...
return &e;
}
invalid initialization of non-const reference of type 'MyObject&' from an rvalue of type 'MyObject*'
return &e;
^
I am not sure what it's saying. Is it saying that e is a pointer, because it's not a pointer. Also, if you'd make a pointer to the address of e, it would get wiped out at the end of the bracket and the pointer would be lost.
Your return type is MyObject&, a reference to a (non-temporary) MyObject object. However, your return expression has a type of MyObject*, because you are getting the address of e.
return &e;
^
Still, your operator++, which is a postfix increment operator due to the dummy int argument, is poorly defined. According to https://stackoverflow.com/a/4421719/1619294, it should be defined more or less as
MyObject MyObject::operator++(int)
{
MyObject e;
e.setVector(this->vector);
...
return e;
}
without the reference in the return type.
You're correct that e is not a pointer, but &e very much is a pointer.
I'm reasonably certain that returning a reference to a stack variable that will be out of scope before you can use it is also not such a good idea.
The general way to implement postfix operator++ is to save the current value to return it, and modify *this with the prefix variant, such as:
Type& Type::operator++ () { // ++x
this->addOne(); // whatever you need to do to increment
return *this;
}
Type Type::operator++ (int) { // x++
Type retval (*this);
++(*this);
return retval;
}
Especially note the fact that the prefix variant returns a reference to the current object (after incrementing) while the postfix variant returns a non-reference copy of the original object (before incrementing).
That's covered in the C++ standard. In C++11 13.6 Built-in operators /3:
For every pair (T, VQ), where T is an arithmetic type, and VQ is either volatile or empty, there exist candidate operator functions of the form:
VQ T & operator++(VQ T &);
T operator++(VQ T &, int);
If, for some reason, you can't use the constructor to copy the object, you can still do it the way you have it (creating a local e and setting its vector) - you just have to ensure you return e (technically a copy of e) rather than &e.
Change return &e; to return e;. In the same way that a function like
void Func(int &a);
isn't called with Func(&some_int) you don't need the & in the return statement. &e takes the location of e and is of type MyObject*.
Also note, MyObject& is a reference to the object, not a copy. You are returning a reference to e, which will be destroyed when the function finishes and as such will be invalid when you next make use of it.
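A compact sketch of both operators on a hypothetical Counter class (not the asker's MyObject), following the prefix-returns-reference and postfix-returns-copy pattern described above:

```cpp
#include <cassert>

class Counter {
    int n_;
public:
    explicit Counter(int n) : n_(n) {}
    int get() const { return n_; }

    // ++x: increment, then return the object itself.
    Counter& operator++() {
        ++n_;
        return *this;
    }

    // x++: remember the old state, increment via the prefix form, return the copy.
    Counter operator++(int) {
        Counter old(*this);
        ++(*this);
        return old;   // returned by value, so no reference to a dead local
    }
};

bool increments_behave_like_int() {
    Counter c(5);
    Counter before = c++;   // before holds 5; c is now 6
    Counter& same = ++c;    // same refers to c; c is now 7
    assert(&same == &c);
    return before.get() == 5 && c.get() == 7;
}
```

increments_behave_like_int() should return true, matching what the built-in operators do for an int.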

Question About & operator in C++

I am looking at the .h file of a Wrapper class. And the class contains one private member:
T* dataPtr;
(where T is as in template < class T > defined at the top of the .h file)
The class provides two "* overloading operator" methods:
T& operator*()
{
return *dataPtr;
}
const T& operator*() const
{
return *dataPtr;
}
Both simply return *dataPtr, but what does the notation "*dataPtr" actually return, in plain English? And how does it fit with the return type "T&"?
The return type T& states that you are returning a reference to an object of type T. dataPtr is a pointer, which you "dereference" (get the object the pointer points to) using *.
dataPtr is a pointer to something.
The * operator dereferences the pointer, so *dataPtr is (or, instead of 'is', you can say 'refers to' or 'is a reference to') the pointee, i.e. the thing that dataPtr is pointing to.
T& means 'a reference to an object whose type is T' (not to be confused with T* which means 'a pointer to an object whose type is T').
*dataPtr is the actual data pointed to by dataPtr. Both operators return a reference to T. A reference is a type that you should think of like another name for the value it refers to. "Under the hood," it is similar to a pointer, but don't think of it that way. It can't do pointer math, or be "reseated." One of the operators is const and is used on a const object, and the other is used on a normal object.
The wrapper class seems to be acting like a C++ pointer.
Operator * dereferences the wrapper, which evaluates to the thing it stores (through dataPtr). What you get is a reference to its contents. E.g. you can assign something to the reference
*intWrapper = 42;
There are two operators because there is a constant and a non-constant version. When you dereference a constant wrapper, you can't assign to the result (a const reference, const T&, is returned).
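A short sketch of how the two overloads get picked. This Wrapper is a stripped-down, non-owning stand-in for the class in the question, and its constructor is an assumption added so it can be used:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <class T>
class Wrapper {
    T* dataPtr;   // non-owning in this sketch
public:
    explicit Wrapper(T* p) : dataPtr(p) {}

    T& operator*() { return *dataPtr; }              // chosen for non-const wrappers
    const T& operator*() const { return *dataPtr; }  // chosen for const wrappers
};

// The overload is selected by the constness of the wrapper, not of T.
static_assert(std::is_same<decltype(*std::declval<Wrapper<int>&>()), int&>::value,
              "non-const wrapper yields T&");
static_assert(std::is_same<decltype(*std::declval<const Wrapper<int>&>()), const int&>::value,
              "const wrapper yields const T&");

int write_then_read() {
    int storage = 0;
    Wrapper<int> w(&storage);
    *w = 42;                      // non-const overload: result is assignable
    const Wrapper<int>& cw = w;
    // *cw = 1;                   // would not compile: const T& is read-only
    assert(storage == 42);        // the write went to the pointee
    return *cw;                   // const overload: read access only
}
```

write_then_read() returns 42.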