All considerations about when to use which aside, I am still unsure about pointer vs reference semantics.
Right now, I am under the impression that references are essentially pointers that must be initialized when they are declared and that, from that point on, cannot point to anything else. In other words, they are like a Type* const (not a Type const*): they cannot be reseated. A reference essentially becomes a "new name" for that object. I have heard that the compiler does not actually need to implement references using pointers, but I am under the impression that you can still think of them this way with regard to their visible behavior.
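To make the "cannot be reseated" point concrete, here is a minimal sketch. The function name reference_demo is made up for illustration; it only shows that assigning through a reference writes to the referent, much like assigning through a Type* const.

```cpp
#include <cassert>

// Hypothetical helper: shows that assigning through a reference writes to the
// referent and never rebinds the reference.
int reference_demo() {
    int a = 1;
    int b = 2;

    int& ref = a;         // must be initialized; ref is now another name for a
    int* const ptr = &a;  // closest pointer analogue: the pointer itself is const

    ref = b;              // does NOT reseat ref; it copies b's value into a
    assert(&ref == &a);   // ref still names a
    assert(ptr == &ref);  // both designate the same object

    *ptr = 7;             // writing through the pointer is visible through ref
    return ref + b;       // 7 + 2
}
```

Calling reference_demo() yields 9: a ends up as 7 and b is left at 2.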
But why can't you do something like this:
int& foo = new int;
I want to create a reference to dynamic memory. This does not compile. I get the error
error: invalid initialization of non-const reference of type 'int&' from a temporary of type 'int*'
That makes sense to me. It seems the new operator returns a pointer of given type to the address of memory that the OS? dynamically allocated for me.
So how do I create a "reference" to dynamic memory?
Edit: Links to resources that precisely explain the difference between references and pointers in C++ would be appreciated.
new returns a pointer to the allocated memory, so you need to capture the return value in a pointer.
You can create a reference to a pointer after allocation is done.
int *ptr = new int;
int* &ref = ptr;
then delete it after use as:
delete ref;
or more simply,
int &ref = *(new int);
delete it after use as:
delete &ref;
References are syntactic sugar. They allow one to access an object with the dot operator rather than the arrow.
Your choice of whether to use a pointer or a reference is semantic. When you pass an object by reference to a method, or return a reference from a method, you are saying: "This is my object and you may use it, but I own it (and it may be on the stack or the heap.)" It follows that the other answers here which suggest syntax like delete &foo; might technically work, but they smell bad. If you have a reference to an object then you shouldn't be deleting it. You don't own it and, most importantly, because you can't reseat the reference, you end up with a reference to deallocated memory, which is a bad thing.
Now, if you have allocated an object on the heap (called 'new' to create it) then you do own it, and are responsible for cleaning it up later, so you need to hold a pointer to it. Why? So you can safely delete it later and null-out the pointer.
It follows that the difference between a pointer and a reference, other than the mechanical difference of using dot rather than arrow, is that by passing by reference to a method you indicate something about how an object should be used. To initialise a reference directly by calling new is nonsense, even if possible.
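A rough sketch of that ownership convention, assuming C++14 and using std::unique_ptr in place of the raw owning pointer described above. The names increment and owner_demo are made up for illustration.

```cpp
#include <cassert>
#include <memory>

// Hypothetical non-owning user: it may read and modify the int, but it has no
// way (and no business) to delete it.
void increment(int& value) {
    ++value;
}

// Hypothetical owner: it allocated the object, so it keeps the owning handle.
int owner_demo() {
    std::unique_ptr<int> owned = std::make_unique<int>(41);  // requires C++14
    increment(*owned);         // lend the object by reference
    assert(owned != nullptr);  // lending did not transfer ownership
    return *owned;             // the int is released automatically at scope exit
}
```

The reference parameter documents "you may use it, I own it"; only the scope holding the owning handle is responsible for cleanup.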
You can get a reference like this:
int& foo = *(new int);
In general, to get from T* to T& you use * to "dereference" the pointer.
However this is not a very good idea in the first place. You usually use pointers to store addresses of heap-allocated objects.
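For illustration, a small sketch of the usual arrangement: keep the pointer as the owning handle and bind a reference to the object only for convenient access. The function name deref_demo is hypothetical.

```cpp
#include <cassert>

int deref_demo() {
    int* ptr = new int(5);  // new yields int*, so store it in a pointer
    int& ref = *ptr;        // T* -> T&: dereference to bind a reference
    ref += 1;               // modifies the heap object
    assert(&ref == ptr);    // T& -> T*: address-of gives the pointer back
    int result = *ptr;
    delete ptr;             // delete through the owning pointer; ref now dangles
    ptr = nullptr;          // the pointer can be nulled, the reference cannot
    return result;
}
```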
Related
How do I use the move assignment operator when working with raw pointers?
Is there any other way than doing something like:
void Function(T* dest)
{
    T* src = LoadT();
    (*dest) = std::move(*src);
    delete src;
}
Your move is fine. The object pointed to by src will be moved into the object pointed to by dest.
About your updated code example, the version with Function:
If your LoadT() returns a raw pointer to an object allocated with new, that does not get stored somewhere else and later deleted, you will have a memory leak.
When you std::move something, you move the contents of that object, the object itself remains alive, only "empty"/in whatever state you leave it after moving.
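A minimal sketch of that point, assuming a hypothetical Payload type and a hypothetical LoadPayload() standing in for T and LoadT(). Only the contents move; the source object stays alive in a valid but unspecified state and still has to be deleted.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for T: a movable type that owns a buffer.
struct Payload {
    std::vector<int> data;
};

// Hypothetical stand-in for LoadT(): the caller receives ownership of the result.
Payload* LoadPayload() {
    return new Payload{{1, 2, 3}};
}

std::size_t move_demo() {
    Payload dest;
    Payload* src = LoadPayload();
    dest = std::move(*src);  // moves the contents; *src is still a live object
    assert(dest.data.size() == 3);
    delete src;              // so the moved-from object must still be deleted
    return dest.data.size();
}
```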
If you return a pointer to an object that is owned by someone else and will be cleaned up somehow beyond the code that's seen here, you could make that explicit by changing your pointers to references - that way you will explicitly specify that: a) the pointers are guaranteed to not be null; b) that there should be no worries about deleting the object you get from LoadT, since a reference can't have ownership of that object.
Unless you reference the objects in a temporary variable you are out of luck. Technically (not sure if illegal) you are allowed to provide your own specialization of move in which you can hide that behavior. But at one point or another, you have to dereference those pointers.
If you want the pointed-to objects moved onto one another, then you are already doing that in the correct/standard/only way: dereference the pointers and then move *src onto *dest.
That being said, your interface is semantically 'problematic'. A pointer passed to a function is by convention taken to mean non-owning and optional. Ownership seems not to be the issue here, but optional is. Why use a pointer when you can use a reference? And moreover, why use a reference when you can just use a return value?
In effect you could simplify to:
T Function()
{
    return LoadT();
}
Then let users of the function decide how to best use T. If T is polymorphic you could return std::unique_ptr<T> instead.
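As a sketch of what the return-by-value version gives the caller, assume a hypothetical Record type in place of T and a hypothetical LoadRecord() in place of LoadT() that itself returns by value:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for T.
struct Record {
    std::string name;
};

// Hypothetical stand-in for LoadT(), here returning by value.
Record LoadRecord() {
    return Record{"loaded"};
}

Record Function() {
    return LoadRecord();  // no pointer, no null check, no manual delete
}

std::string return_demo() {
    Record fresh = Function();  // the caller can initialize a new object...
    Record existing{"old"};
    existing = Function();      // ...or move-assign into an existing one
    assert(fresh.name == existing.name);
    return existing.name;
}
```

Assigning from the returned temporary uses the move assignment operator automatically, so no explicit std::move is needed at the call site.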
There have been tons of questions asked about passing by reference or pointer, and when to use pointers.
My understanding of the subject so far is the following rules:
Always try to pass by reference
Pass by pointer (only use pointers) if you must
In my case, I must use pointers in order to retain polymorphic behaviours as I am storing the passed object into a vector for later use (it is an 'add' method).
See: C++ Overridden method not getting called
I have read:
shared_ptr by reference or by value? which talks about passing shared_ptrs specifically
Passing shared pointers as arguments which tells me that I should only pass by shared_ptr if I am trying to transfer ownership, but later goes on to say that I should then pass by reference
Should I use std::shared pointer to pass a pointer? which pretty much tells me what the previous question told me, but in the context of a unique_ptr
So my question is this:
If I am trying to pass a pointer already contained in a shared_ptr to add to a vector, should I
pass a reference to the shared_ptr to be added into the vector (because the other method is unwieldy)
or
use shared_ptr::get to get the actual pointer, pass that pointer, re-wrap it using shared_ptr::reset, and then add it to the vector? (because I should only pass smart pointers if I'm transferring ownership)
Code:
//method definition
void addToVector(shared_ptr<Object>& obj) {
    myVector.push_back(obj);
}
//call
shared_ptr<Object> myObj = make_shared<Object>();
addToVector(myObj);
or
//method definition
void addToVector(Object* obj) {
    shared_ptr<Object> toAdd;
    toAdd.reset(obj);
    myVector.push_back(toAdd);
}
//call
shared_ptr<Object> myObj = make_shared<Object>();
addToVector(myObj.get());
I must use pointers in order to retain polymorphic behaviours as I am storing the passed object into a vector for later use
If you store the pointed-to object itself in the vector (by value), it is sliced to the element type and you don't retain polymorphic behaviours. Storing a pointer to it does retain them.
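A small sketch of the difference, assuming a hypothetical Derived subclass and a virtual name() function that are not part of the question:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Object {
    virtual ~Object() = default;
    virtual std::string name() const { return "Object"; }
};

struct Derived : Object {  // hypothetical subclass for illustration
    std::string name() const override { return "Derived"; }
};

std::string sliced_name() {
    std::vector<Object> by_value;
    Derived d;
    by_value.push_back(d);  // copies only the Object part (slicing)
    assert(by_value.size() == 1);
    return by_value.front().name();
}

std::string pointer_name() {
    std::vector<std::shared_ptr<Object>> by_pointer;
    by_pointer.push_back(std::make_shared<Derived>());
    return by_pointer.front()->name();  // virtual dispatch still works
}
```

Here sliced_name() returns "Object" while pointer_name() returns "Derived".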
Should I ... use shared_ptr::get to get the actual pointer, pass that pointer, re-wrap it using shared_ptr::reset, and then add it to the vector?
No. A shared pointer may not take ownership of the pointer that is already owned by another shared pointer. This would have undefined behaviour.
(because I should only pass smart pointers if I'm transferring ownership)
If you intend to store a shared pointer to the object, then you are transferring (sharing) the ownership. If that is your intention, then pass a const reference to the shared pointer, as described in the linked answer.
If you don't intend to share the ownership, then storing a shared pointer is not what you should do. You may want to store a reference wrapper, bare pointer, or a weak pointer instead. How you should pass the reference to the function, will depend on what you choose to do with it.
The second example is undefined behavior, so cannot be considered as a valid approach at all:
void addToVector(Object* obj) {
    shared_ptr<Object> toAdd;
    toAdd.reset(obj); // n.b. could just use shared_ptr(obj) ctor
    myVector.push_back(toAdd);
}
shared_ptr<Object> myObj = make_shared<Object>();
addToVector(myObj.get()); // UB
What happens is that myObj owns its referent, then you use get() to form a raw pointer to that referent, then you create a new shared_ptr with the same referent in addToVector(). Now you have two smart pointers which refer to the same object but the two smart pointers don't know about each other, so will each destroy the object, which is double-free, which is undefined behavior.
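For contrast, a minimal sketch of the safe variant, which passes the shared_ptr itself so the stored copy shares ownership with the original. Object is reduced to an empty struct here, and shared_demo is a made-up name.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Object {};  // placeholder definition for illustration

std::vector<std::shared_ptr<Object>> myVector;

// Takes the smart pointer itself, so the stored copy joins the existing owners.
void addToVector(const std::shared_ptr<Object>& obj) {
    myVector.push_back(obj);
}

long shared_demo() {
    myVector.clear();
    std::shared_ptr<Object> myObj = std::make_shared<Object>();
    addToVector(myObj);
    assert(myVector.back().get() == myObj.get());  // same object
    return myObj.use_count();  // both owners are counted together
}
```

In this single-threaded sketch use_count() reports 2, so the object is destroyed exactly once, when the last of the two owners goes away.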
Suppose I have:
class SomeObject {
};

SomeObject& f() {
    SomeObject *s = new SomeObject();
    return *s;
}

// Variant 1
int main() {
    SomeObject& s = f();
    // Do something with s
}

// Variant 2
int main() {
    SomeObject s = f();
    // Do something with s
}
Is there any difference between the first variant and the second? Are there any cases where I would use one over the other?
Edit: One more question, what does s contain in both cases?
First, you never want to return a reference to an object which
was dynamically allocated in the function. This is a memory
leak waiting to happen.
Beyond that, it depends on the semantics of the object, and what
you are doing. Using the reference (variant 1) allows
modification of the object it refers to, so that some other
function will see the modified value. Declaring a value
(variant 2) means that you have your own local copy, and any
modifications, etc. will be to it, and not to the object
referred to in the function return.
Typically, if a function returns a reference to a non-const,
it's because it expects the value to be modified; a typical
example would be something like std::vector<>::operator[],
where an expression like:
v[i] = 42;
is expected to modify the element in the vector. If this is
not the case, then the function should return a value, not
a reference (and you should almost never use such a function to
initialize a local reference). And of course, this only makes
sense if you return a reference to something that is accessible
elsewhere; either a global variable or (far more likely) data
owned by the class of which the function is a member.
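A minimal sketch of that distinction, using a hypothetical Counter class whose member function returns a reference to data it owns:

```cpp
#include <cassert>

// Hypothetical class whose accessor returns a reference to data it owns.
class Counter {
public:
    int& value() { return value_; }
    int current() const { return value_; }
private:
    int value_ = 0;
};

int reference_vs_copy_demo() {
    Counter counter;

    int copy = counter.value();  // like variant 2: an independent local copy
    copy = 10;                   // does not affect counter
    assert(counter.current() == 0);

    int& ref = counter.value();  // like variant 1: another name for the member
    ref = 42;                    // modifies the object inside counter
    return counter.current();
}
```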
In the first variant you attach a reference directly to a dynamically allocated object. This is a rather unorthodox way to own dynamic memory (a pointer would be better suited for that purpose), but still it gives you the opportunity to properly deallocate that object. I.e. at the end of your first main you can do
delete &s;
In the second variant you lose the reference, i.e. you lose the only link to that dynamically allocated object. The object becomes a memory leak.
Again, owning a dynamically allocated object through a reference does not strike me as a good practice. It is usually better to use a pointer or a smart pointer for that purpose. For that reason, both of your variants are flawed, even though the first one is formally redeemable.
Variant 1 only binds a reference to the existing object (in effect copying its address) and will be fast.
Variant 2 will copy the whole object and will be slower (as already pointed out, in Variant 2 you can't delete the object which you created by calling new).
For the edit: in Variant 1, s refers to the object allocated in f; in Variant 2, s is a separate copy of that object.
Neither of the two options you asked about is very good. In this particular case you should use shared_ptr or unique_ptr (or auto_ptr if you are stuck with a pre-C++11 compiler; it was deprecated in C++11 and removed in C++17), and change the function so it returns a pointer, not a reference. Another good option is returning the object by value, especially if the object is small and cheap to construct.
Modification to return the object by value:
SomeObject f() { return SomeObject(); }
SomeObject s(f());
Simple, clean, safe - no memory leaking here.
Using unique_ptr:
SomeObject* f() { return new SomeObject(); }
unique_ptr<SomeObject> s(f());
One of the advantages of using a unique_ptr or shared_ptr here is that you can change your function f at some point to return objects of a class derived from SomeObject and none of your client code will need to be changed - just make sure the base class (SomeObject) has a virtual destructor.
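For deletion through a base-class pointer to be well defined, the base class needs a virtual destructor. A minimal sketch, assuming C++14 and a hypothetical DerivedObject class; here f returns a unique_ptr directly rather than a raw pointer, and the global counter exists only to observe destruction.

```cpp
#include <cassert>
#include <memory>

int derived_destroyed = 0;  // hypothetical counter, only for observing destruction

struct SomeObject {
    virtual ~SomeObject() = default;  // virtual: deleting via SomeObject* is safe
};

struct DerivedObject : SomeObject {   // hypothetical derived class
    ~DerivedObject() override { ++derived_destroyed; }
};

// f can later switch to a derived type without changing client code.
std::unique_ptr<SomeObject> f() {
    return std::make_unique<DerivedObject>();  // requires C++14
}

int destructor_demo() {
    derived_destroyed = 0;
    {
        std::unique_ptr<SomeObject> s = f();
        assert(s != nullptr);
    }  // s goes out of scope; the derived destructor runs through the base pointer
    return derived_destroyed;
}
```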
Why the options you were considering are not very good:
Variant 1:
SomeObject& s = f();
How are you going to destroy the object? You will need the address of the object to delete it anyway, so at some point you would have to take the address of the object that s refers to (&s).
Variant 2: You have a leak here and no way to call the destructor of the object returned from your function.
I'm sorry to ask such a newbie question.
Here is my problem :
MyClass* c = new MyClass("test");
method(c);//error cannot convert MyClass* to MyClass
the definition :
method(MyClass c);
should I also define ?
method(MyClass* c);
I don't want to duplicate the code, what is the proper way ?
You're clearly a Java programmer! First of all, you will need to do:
MyClass* c = new MyClass("test");
Note that c is a "pointer to MyClass" - this is necessary because the new expression gives you a pointer to the dynamically allocated object.
Now, to pass this to a function that takes a MyClass argument, you will need to dereference the pointer. That is, you will do:
method(*c);
Note that because the function takes a MyClass by value (not a reference), the object will be copied into your function. The MyClass object inside method will be a copy of the object you allocated earlier. The type of the argument depends on exactly what you want your function to do and convey. If instead you want a reference to the object, so that modifying the object inside the function will modify the c object outside the function, you need your function to take a MyClass& argument - a reference to MyClass.
If you were to have the argument be of type MyClass*, then you could simply do:
method(c);
This will give you similar semantics to passing a reference, because the pointer c will be copied into the function but the object that pointer refers to will still be the dynamically allocated MyClass object. That is, if inside the function you modify the object through the pointer parameter, the object pointed to by c is also modified.
Passing a raw pointer like this is usually not a very good approach. The function will have to explicitly check that the pointer is not null, otherwise your program may crash under certain conditions. If you want pass-by-reference semantics, use reference types - it's what they're for.
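The three parameter styles can be compared side by side in a short sketch. MyClass is reduced to a hypothetical struct with a single int, and the function names byValue, byReference and byPointer are made up for illustration.

```cpp
#include <cassert>

struct MyClass {  // hypothetical minimal definition
    int value;
};

void byValue(MyClass c) { c.value = 1; }       // modifies a copy
void byReference(MyClass& c) { c.value = 2; }  // modifies the caller's object
void byPointer(MyClass* c) {                   // must handle null
    if (c != nullptr) {
        c->value = 3;
    }
}

int passing_demo() {
    MyClass* c = new MyClass{0};
    byValue(*c);
    assert(c->value == 0);  // the original is untouched
    byReference(*c);
    assert(c->value == 2);
    byPointer(c);
    byPointer(nullptr);     // safe only because of the explicit check
    int result = c->value;
    delete c;
    return result;
}
```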
However, you're better off not dynamically allocating your MyClass object in the first place. I guess you're only using new because you did it a lot in Java. In C++, the new expression is used to dynamically allocate an object. More often than not, you do not want to dynamically allocate an object. It is perfectly fine to create it with automatic storage duration, which you would do like so:
MyClass c("test");
method(c);
It is considered very good practise in C++ to avoid pointers and dynamic allocation unless you have a good reason. It will only lead you to more complicated code with more room for errors and bugs. In fact, the code you've given already has a problem because you didn't delete c;. When you do as I have just suggested, you don't need to explicitly delete anything, because the object will be destroyed when it goes out of scope.
MyClass *c = new MyClass("test");
method(*c);
This works if you define the method as method(MyClass c).
But if you define your method as method(MyClass *c);
then it should be
MyClass *c = new MyClass("test");
method(c);
The two alternatives have consequences depending on what you want to do with the object you have created.
I have a basic question regarding the const pointers. I am not allowed to call any non-const member functions using a const pointer. However, I am allowed to do this on a const pointer:
delete p;
This will call the destructor of the class which in essence is a non-const 'method'. Why is this allowed? Is it just to support this:
delete this;
Or is there some other reason?
It's to support:
// dynamically create object that cannot be changed
const Foo * f = new Foo;
// use const member functions here
// delete it
delete f;
But note that the problem is not limited to dynamically created objects:
{
const Foo f;
// use it
} // destructor called here
If destructors could not be called on const objects we could not use const objects at all.
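Both cases can be observed in a small sketch. Foo is given a hypothetical read() member here, and the global counter exists only to count destructor calls.

```cpp
#include <cassert>

int foo_destroyed = 0;  // hypothetical counter for observing destructor calls

struct Foo {
    int read() const { return 1; }
    ~Foo() { ++foo_destroyed; }
};

int const_destruction_demo() {
    foo_destroyed = 0;

    const Foo* f = new Foo;  // pointer to const: only const members are callable
    assert(f->read() == 1);
    delete f;                // allowed even though the pointee is const

    {
        const Foo local{};   // const object with automatic storage
        assert(local.read() == 1);
    }                        // destructor runs here as well

    return foo_destroyed;
}
```

The function returns 2: one destructor call from delete and one at the end of the inner scope.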
Put it this way - if it weren't allowed there would be no way to delete const objects without using const_cast.
Semantically, const is an indication that an object should be immutable. That does not imply, however, that the object should not be deleted.
Constructors and Destructors should not be viewed as 'methods'. They are special constructs to initialise and tear down an object of a class.
'const pointer' is to indicate that the state of the object would not be changed when operations are performed on it while it is alive.
I am not allowed to call any non-const member functions using a const pointer.
Yes you are.
class Foo
{
public:
    void aNonConstMemberFunction();
};
Foo* const aConstPointer = new Foo;
aConstPointer->aNonConstMemberFunction(); // legal
const Foo* aPointerToConst = new Foo;
aPointerToConst->aNonConstMemberFunction(); // illegal
You have confused a const pointer to a non-const object, with a non-const pointer to a const object.
Having said that,
delete aConstPointer; // legal
delete aPointerToConst; // legal
it's legal to delete either, for the reasons already stated by the other answers here.
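A compact sketch of the same distinction, with hypothetical set() and get() members added to Foo so the effect can be observed:

```cpp
#include <cassert>
#include <type_traits>

class Foo {
public:
    void set(int v) { value_ = v; }     // non-const member function
    int get() const { return value_; }  // const member function
private:
    int value_ = 0;
};

// The pointee type decides which members are callable, not the pointer's own constness.
static_assert(!std::is_const<std::remove_pointer<Foo* const>::type>::value,
              "Foo* const points to a mutable Foo");
static_assert(std::is_const<std::remove_pointer<const Foo*>::type>::value,
              "const Foo* points to a const Foo");

int constness_demo() {
    Foo* const aConstPointer = new Foo;
    aConstPointer->set(5);                // legal: the pointee is not const
    const Foo* aPointerToConst = aConstPointer;
    // aPointerToConst->set(6);           // would not compile: pointee is const
    int result = aPointerToConst->get();  // const members are fine
    assert(result == 5);
    delete aPointerToConst;               // legal through either pointer; delete once
    return result;
}
```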
Another way to look at it: the precise meaning of a const pointer is that you will not be able to make changes to the pointed-to object that would be visible via that or any other pointer or reference to the same object. But when an object destructs, all other pointers to the address previously occupied by the now-deleted object are no longer pointers to that object. They store the same address, but that address is no longer the address of any object (in fact it may soon be reused as the address of a different object).
This distinction would be more obvious if pointers in C++ behaved like weak references, i.e. as soon as the object is destroyed, all extant pointers to it would immediately be set to 0. (That's the kind of thing considered to be too costly at runtime to impose on all C++ programs, and in fact it is impossible to make it entirely reliable.)
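The standard library offers something close to this behaviour for objects managed by shared_ptr. The following is a sketch with std::weak_ptr, not a description of how raw pointers behave; the function name weak_reference_demo is made up.

```cpp
#include <cassert>
#include <memory>

bool weak_reference_demo() {
    std::weak_ptr<int> observer;
    {
        std::shared_ptr<int> owner = std::make_shared<int>(7);
        observer = owner;               // observe without owning
        assert(!observer.expired());
        assert(*observer.lock() == 7);  // lock() yields a temporary owner
    }                                   // last owner gone; the int is destroyed
    return observer.expired() && observer.lock() == nullptr;
}
```

Once the last shared_ptr is gone, the weak_ptr reports that it has expired instead of silently holding a stale address.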
UPDATE: Reading this back nine years later, it's lawyer-ish. I now find your original reaction understandable. To disallow mutation but allow destruction is clearly problematic. The implied contract of const pointers/references is that their existence will act as a block on destruction of the target object, a.k.a. automatic garbage collection.
The usual solution to this is to use almost any other language instead.