Easier way to have shared_ptr own an existing pointer - c++

Many programmers advocate the use of make_shared because it reduces typing and reduces programming errors. However there are some cases where using the constructors of shared_ptr is unavoidable. One of these cases is when you have an existing pointer that you want shared_ptr to own, since shared_ptr<Foo>(&existing_ptr) is bad code. Instead, you have to use the unwieldy shared_ptr<Foo>(shared_ptr<Foo>(), p). Not only are you repeating yourself, but you have to create a temporary object.
int main()
{
using namespace std;
Foo foo;
foo.n = 1;
{
auto ptr = make_shared<Foo>(move(foo));
ptr->n = 42;
cout << ptr->n << " " << foo.n << '\n';
}
{
auto p = &foo;
auto ptr = shared_ptr<Foo>(shared_ptr<Foo>(), p);
ptr->n = 42;
cout << ptr->n << " " << foo.n << '\n';
}
return 0;
}
Foo::Foo()
Foo::Foo(Foo &&)
42 1
Foo::~Foo()
42 42
Foo::~Foo()
What's a less verbose way to have shared_ptr own an existing pointer?

The intended use of that constructor is to allow shared pointers to sub objects of shared pointers.
Your use is not the intended use, and is quite dangerous, as you have implicitly created a guarantee that the data you are passing to the shared pointer will last as long as the shared pointer or its copies does, and then failing to enforce that guarantee in any meaningful sense.
If you pass a shared pointer to a function, it has every right to cache a copy if it and use it 15 minutes later. And if you aren't passing a shared pointer to a function, you don't need one.
In general, a function should only require a shared pointer if it intends to extend the lifetime of its argument in a difficult to predict way. So if you have a function that takes a shared pointer and never extends its lifetime (or the lifetime of a pointer to it), it should not be taking a shared pointer. The problem is in the function you are calling, not with how you have to jump through hoops to call it.
Only when you both have a function that is broken, and are unable to fix it and making a copy of Foo on the free store is overly expensive, is your technique worth trying. And that should be extreme corner cases anyhow.

IMO what you're doing there shouldn't be easy because it's very dangerous, and only necessary under highly specialized circumstances where typing a class name twice should be the least of your worries.
But here is a slightly more succinct way to achieve the same thing by using a dummy deleter:
auto ptr = std::shared_ptr<Foo>(&foo, [](void*){});
Also, in your approach it isn't really correct to say that ptr owns foo; rather it owns the null object in the empty std::shared_ptr<Foo> and points to foo (see this answer for a longer discussion). In my code above it does "own" foo in some technical sense (or at least it thinks it does); it's just been prevented from doing anything to it when its reference count reaches zero.

Related

Using std::vector from an object referenced by shared_ptr after shared_ptr's destruction

I apologise if the title is different from what I will be describing, I don't quite know how to describe it apart from using examples.
Suppose I have a shared_ptr of an object, and within that object, is a vector. I assign that vector to a variable so I can access it later on, and the shared_ptr gets destroyed as it goes out of scope. Question, is the vector I saved "safe" to access?
In the example below, from main(), outer() is called, and within outer(), inner() is called. inner() creates a shared_ptr to an object that contains a std::vector, and assigns it to a variable passed by reference. The role of outer() is to create some form of seperation, so that we know that the shared_ptr is destroyed. In main(), this referenced variable is accessed, but is it safe to use this variable?
#include <iostream>
#include <vector>
#include <memory>
struct sample_compound_obj {
std::vector<int> vektor;
sample_compound_obj(){std::cout << "I'm alive!" << std::endl;};
~sample_compound_obj(){std::cout << "Goodbye, thank you forever!" << std::endl;};
};
bool inner(std::vector<int>& input) {
std::cout << "About to create sample_compound_obj..." << std::endl;
std::shared_ptr<sample_compound_obj> hehe(new sample_compound_obj);
hehe->vektor.push_back(1);
hehe->vektor.push_back(2);
hehe->vektor.push_back(3);
input = hehe->vektor;
std::cout << "About to return from inner()..." << std::endl;
return true;
}
bool outer(std::vector<int>& input) {
std::cout << "About to enter inner()..." << std::endl;
inner(input);
std::cout << "About to return from outer()..." << std::endl;
return true;
}
int main() {
std::cout << "About to enter outer()..." << std::endl;
std::vector<int> vector_to_populate;
outer(vector_to_populate);
for (std::vector<int>::iterator it = vector_to_populate.begin(); it != vector_to_populate.end(); it++) {
std::cout << *it <<std::endl; // <-- is it even "safe" to access this vector
}
}
https://godbolt.org/z/47EWfPGK3
To avoid XY problem, I first thought of this issue when I was writing some ROS code, where a subscriber callback passes by reference the incoming message as a const shared_ptr&, and the message contains a std::vector. In this callback, the std::vector is assigned (via =) to a global/member variable, to be used some time later, after the end of the callback, so presumably the original shared_ptr is destroyed. One big difference is that in my example, I passed the std::vector by reference between the functions, instead of a global variable, but I hope it does not alter the behavior. Question is, is the std::vector I have "saved", suitable to be used?
is it safe to use this variable?
Yes, in the below statement, you copy the whole vector using the std::vector::operator= overload doing copy assignment. The two vectors do not share anything and live their separate lives and can be used independently of each other.
input = hehe->vektor;
In this case it's safe, because you get copy of the vector in this line:
input = hehe->vektor;
One big difference is that in my example, I passed the std::vector by reference between the functions, instead of a global variable, but I hope it does not alter the behavior.
Any reference can be bound only once, and in your scenario input reference is already bound to the argument passed (to be precise std::vector<int>& input of inner function is bound to std::vector<int>& input of outer function which itself is bound to std::vector<int> vector_to_populate). After a reference is bound, it acts as is object itself, so in the assignment statement you actually end up with calling something like this:
input.operator=(hehe->vektor);
Where operator= refers to the std::vector<T>::operator=(const std::vector<T> rhs) function.
If you're familiar with the RAII concept and C++ references, you can pretty much skip the following explanations
In C++, vectors, and as a matter of fact, pretty much all structs, defined in std work very differently from what you might be used to in other higher level languages, like Java or C#. In C++, the RAII (resource acquisition is initialization) technique is used. I highly recommend that you actually read the article, but in short, it means that an object will define a constructor and a destructor, that allocate and free all memory used by the object, in your case, a std::vector, and the language is going to call the destructor when the object falls out of scope. This ensures that there are no memory leaks.
However, how would we go about passing a std::vector to a function, for example? Well, if we just made a straight up copy of the object, byte by byte, as we'd do in C, the function would run fine, until we reach the end of the function, and we call the destructor of the vector. In that case, after the function executes, when we go back to the caller, the vector is no longer valid, because its data got freed by the callee.
void callee(std::vector<int> vec) { }
void caller() {
std::vector<int> vec;
vec.push_back(10);
callee(vec);
vec.push_back(10); // This will break, with our current logic
}
Well, the keyword here is "copy". We copied the vector so that we can pass it in the callee function. We can solve this issue by creating custom copy behavior, in C++ terms that would be a copy constructor. In simple terms, a copy constructor takes an instance of the type itself as an argument, and copies it in the current instance. This allows us now to execute the code from above without any issues.
There are a lot more intricacies to it than I've written here, but other people have said it better than I have. In short, whenever you try to make an assignment, or pass an argument, you utilize the copy constructor (with some exceptions). In your case, you assign a vector variable with a vector value. This vector value gets copied, and so is valid until the function goes out of scope. There is a catch to that thou: if you try to modify the vector, you will modify only the copy, not the original. If you want to do so, you'll need to utilize references.
In C++, we have the concept of references. You can consider the reference something like a pointer, with some caveats to it. First of all, unlike pointers, you can't have a reference to a reference. The reference tells C++ that you don't work with the object itself, but an "alias" of the object. The object will exist in one place in the memory, but you will be able to access it in two places:
int a = 10;
int &ref_a = a;
std::cout << ref_a << ", " << a << "\n"; // 10, 10
ref_a = 5;
std::cout << ref_a << ", " << a << "\n"; // 5, 5
a = 8;
std::cout << ref_a << ", " << a << "\n"; // 8, 8
In the line int &ref_a = a, we don't actually copy a, but we tell ref_a that it is a reference of a. This means that any operations (including assignment) will be applied to a, not to ref_a. We can use references with variables, fields, return values and parameters. A reference is valid as long as the value it refers to is valid, so as soon as the value goes out of scope, the reference is no longer valid.
References can be used in parameters, in order to avoid copying the value. This can provide a lot of performance benefits, since we don't need to copy the value, but just pass a "pointer" to it. Of course, this means that if the function modifies the parameter, that is reflected in the caller:
void func(int &ref) {
ref = 5;
}
void func2() {
int a = 10;
func(a);
std::cout << a; // 5
}
TL; DR
In your case, you're returning a vector via an out parameter (reference parameter). Still, even if you're working with references, setting a reference will actually set the object behind the reference, and will use the copy constructor of the object. You can avoid that by making that a reference of a pointer to a vector, but working with pointers in C++ is strongly advised against. Regardless, the answer to your question is that this code is completely safe. Still, if you try to modify input in outer, you won't modify the vector in hehe, but instead you're going to modify the copy that inener has created.

How can i handle shared_ptr ++

int main ()
{
shared_ptr<int[]> a(new int[2]);
a.get()[0] = 5;
a.get()[1] = 10;
int* foo = a.get();
++foo;
cout << *foo << endl;
return 0;
}
The output is "10" as i expected. But I used regular pointer (int* foo) how can i implement/use ++ operator with only shared_ptr.
A std::shared_ptr contains two parts:
A pointer to the bookkeeping.
A payload-pointer which should be valid at least as long.
So, for your pointer-arithmetic, just use a dependent shared_ptr setting both individually:
shared_ptr<int[]> foo(a, a.get() + 1);
As an aside, an array-shared_ptr supports indexing, so no need to go over .get() for that.
Also, as long as you keep one shared_ptr aroud to keep the data live, there is nothing wrong (actually, it is expected and more efficient) with using raw pointers and references for processing.
There isn't another way.
The only method is to call get and the increment it in the copy returned from the get.
I attach you some usefull link :
smartpointer - shared_ptr
cppreference.com - shared_ptr
What you are trying to do is semantically meaningless. A shared_ptr expresses the intent to own something (in this case an array of two int's). All it does is track ownership. If you increment a shared_ptr, does that mean you release the ownership of the first int? Or does it mean you own the second int twice? Either way is not what you are trying to do.
In general, use unique_ptr/shared_ptr only for ownership issues. To manipulate and use the memory directly, use a view class, something like a span. Not all compilers have this facility yet, so may have to use either the GSL library or write your own. However you implement it, it is better to make a separate object that views the object, but does not claim to own it. If you can tie the length of the array to pointer, this is better still.

Can you call the destructor without calling the constructor?

I've been trying not to initialize memory when I don't need to, and am using malloc arrays to do so:
This is what I've run:
#include <iostream>
struct test
{
int num = 3;
test() { std::cout << "Init\n"; }
~test() { std::cout << "Destroyed: " << num << "\n"; }
};
int main()
{
test* array = (test*)malloc(3 * sizeof(test));
for (int i = 0; i < 3; i += 1)
{
std::cout << array[i].num << "\n";
array[i].num = i;
//new(array + i) i; placement new is not being used
std::cout << array[i].num << "\n";
}
for (int i = 0; i < 3; i += 1)
{
(array + i)->~test();
}
free(array);
return 0;
}
Which outputs:
0 ->- 0
0 ->- 1
0 ->- 2
Destroyed: 0
Destroyed: 1
Destroyed: 2
Despite not having constructed the array indices. Is this "healthy"? That is to say, can I simply treat the destructor as "just a function"?
(besides the fact that the destructor has implicit knowledge of where the data members are located relative to the pointer I specified)
Just to specify: I'm not looking for warnings on the proper usage of c++. I would simply like to know if there's things I should be wary of when using this no-constructor method.
(footnote: the reason I don't wanna use constructors is because many times, memory simply does not need to be initialized and doing so is slow)
No, this is undefined behaviour. An object's lifetime starts after the call to a constructor is completed, hence if a constructor is never called, the object technically never exists.
This likely "seems" to behave correctly in your example because your struct is trivial (int::~int is a no-op).
You are also leaking memory (destructors destroy the given object, but the original memory allocated via malloc still needs to be freed).
Edit: You might want to look at this question as well, as this is an extremely similar situation, simply using stack allocation instead of malloc. This gives some of the actual quotes from the standard around object lifetime and construction.
I'll add this as well: in the case where you don't use placement new and it clearly is required (e.g. struct contains some container class or a vtable, etc.) you are going to run into real trouble. In this case, omitting the placement-new call is almost certainly going to gain you 0 performance benefit for very fragile code - either way, it's just not a good idea.
Yes, the destructor is nothing more than a function. You can call it at any time. However, calling it without a matching constructor is a bad idea.
So the rule is: If you did not initialize memory as a specific type, you may not interpret and use that memory as an object of that type; otherwise it is undefined behavior. (with char and unsigned char as exceptions).
Let us do a line by line analysis of your code.
test* array = (test*)malloc(3 * sizeof(test));
This line initializes a pointer scalar array using a memory address provided by the system. Note that the memory is not initialized for any kind of type. This means you should not treat these memory as any object (even as scalars like int, let aside your test class type).
Later, you wrote:
std::cout << array[i].num << "\n";
This uses the memory as test type, which violates the rule stated above, leading to undefined behavior.
And later:
(array + i)->~test();
You used the memory a test type again! Calling destructor also uses the object ! This is also UB.
In your case you are lucky that nothing harmful happens and you get something reasonable. However UBs are solely dependent on your compiler's implementation. It can even decide to format your disk and that's still standard-conforming.
That is to say, can I simply treat the destructor as "just a function"?
No. While it is like other functions in many ways, there are some special features of the destructor. These boil down to a pattern similar to manual memory management. Just as memory allocation and deallocation need to come in pairs, so do construction and destruction. If you skip one, skip the other. If you call one, call the other. If you insist upon manual memory management, the tools for construction and destruction are placement new and explicitly calling the destructor. (Code that uses new and delete combine allocation and construction into one step, while destruction and deallocation are combined into the other.)
Do not skip the constructor for an object that will be used. This is undefined behavior. Furthermore, the less trivial the constructor, the more likely that something will go wildly wrong if you skip it. That is, as you save more, you break more. Skipping the constructor for a used object is not a way to be more efficient — it is a way to write broken code. Inefficient, correct code trumps efficient code that does not work.
One bit of discouragement: this sort of low-level management can become a big investment of time. Only go this route if there is a realistic chance of a performance payback. Do not complicate your code with optimizations simply for the sake of optimizing. Also consider simpler alternatives that might get similar results with less code overhead. Perhaps a constructor that performs no initializations other than somehow flagging the object as not initialized? (Details and feasibility depend on the class involved, hence extend outside the scope of this question.)
One bit of encouragement: If you think about the standard library, you should realize that your goal is achievable. I would present vector::reserve as an example of something that can allocate memory without initializing it.
You currently have UB as you access field from non-existing object.
You might let field uninitialized by doing a constructor noop. compiler might then easily doing no initialization, for example:
struct test
{
int num; // no = 3
test() { std::cout << "Init\n"; } // num not initalized
~test() { std::cout << "Destroyed: " << num << "\n"; }
};
Demo
For readability, you should probably wrap it in dedicated class, something like:
struct uninitialized_tag {};
struct uninitializable_int
{
uninitializable_int(uninitialized_tag) {} // No initalization
uninitializable_int(int num) : num(num) {}
int num;
};
Demo

Return newly allocated pointer or update the object through parameters?

I'm actually working on pointers to user-defined objects but for simplicity, I'll demonstrate the situation with integers.
int* f(){
return new int(5);
}
void f2(int* i){
*i = 10;
}
int main(){
int* a;
int* b = new int();
a = f();
f2(b);
std::cout << *a << std::endl; // 5
std::cout << *b << std::endl; // 10
delete b;
delete a;
return 0;
}
Consider that in functions f() and f2() there are some more complex calculations that determine the value of the pointer to be returned(f()) or updated through paramteres(f2()).
Since both of them work, I wonder if there is a reason to choose one over the other?
From looking at the toy code, my first thought is just to put the f/f2 code into the actual object's constructor and do away with the free functions entirely. But assuming that isn't an option, it's mostly a matter of style/preference. Here are a few considerations:
The first is easier to read (in my opinion) because it's obvious that the pointer is an output value. When you pass a (nonconst) pointer as a parameter, it's hard to tell at a glance whether it's input, output or both.
Another reason to prefer the first is if you subscribe to the school of thought that says objects should be made immutable whenever possible in order to simplify reasoning about the code and to preclude thread safety problems. If you're attempting that, then the only real choice is for f() to create the object, configure it, and return const Foo* (again, assuming you can't just move the code to the constructor).
A reason to prefer the second is that it allows you to configure objects that were created elsewhere, and the objects can be either dynamic or automatic. Though this can actually be a point against this approach depending on context--sometimes it's best to know that objects of a certain type will always be created and initialized in one spot.
If the allocation function f() does the same thing as new, then just call new. You can do whatever initialisation in the object's construction.
As a general rule, try to avoid passing around raw pointers, if possible. However that may not be possible if the object must outlive the function that creates it.
For a simple case, like you have shown, you might do something like this.
void DoStuff(MyObj &obj)
{
// whatever
}
int Func()
{
MyObj o(someVal);
DoStuff(o);
// etc
}
f2 is better only because the ownership of the int is crystal clear, and because you can allocate it however you want. That's reason enough to pick it.

C++ Objects: When should I use pointer or reference

I can use an object as pointer to it, or its reference. I understand that the difference is that pointers have to be deleted manually, and references remain until they are out of scope.
When should I use each of them? What is the practical difference?
Neither of these questions answered my doubts:
Pointer vs. Reference
C++ difference between reference, objects and pointers
A reference is basically a pointer with restrictions (has to be bound on creation, can't be rebound/null). If it makes sense for your code to use these restrictions, then using a reference instead of a pointer allows the compiler to warn you about accidentally violating them.
It's a lot like the const qualifier: the language could exist without it, it's just there as a bonus feature of sorts that makes it easier to develop safe code.
"pointers I have to delete and reference they remain until their scope finish."
No, that's completely wrong.
Objects which are allocated with new must be deleted[*]. Objects which are not allocated with new must not be deleted. It is possible to have a pointer to an object that was not allocated with new, and it is possible to have a reference to an object that was allocated with new.
A pointer or a reference is a way of accessing an object, but is not the object itself, and has no bearing on how the object was created. The conceptual difference is that a reference is a name for an object, and a pointer is an object containing the address of another object. The practical differences, how you choose which one to use, include the syntax of each, and the fact that references can't be null and can't be reseated.
[*] with delete. An array allocated with new[] must be deleted with delete[]. There are tools available that can help keep track of allocated resources and make these calls for you, called smart pointers, so it should be quite rare to explicitly make the call yourself, as opposed to just arranging for it to be done, but nevertheless it must be done.
suszterpatt already gave a good explanation. If you want a rule of thumb that is easy to remember, I would suggest the following:
If possible use references, use pointers only if you can not avoid them.
Even shorter: Prefer references over pointers.
Here's another answer (perhaps I should've edited the first one, but since it has a different focus, I thought it would be OK to have them separate).
When you create a pointer with new, the memory for it is reserved and it persists until you call delete on it - but the identifier's life span is still limited to the code block's end. If you create objects in a function and append them to an external list, the objects may remain safely in the memory after the function returns and you can still reference them without the identifier.
Here's a (simplified) example from Umbra, a C++ framework I'm developing. There's a list of modules (pointers to objects) stored in the engine. The engine can append an object to that list:
void UmbraEngine::addModule (UmbraModule * module) {
modules.push(module);
module->id = modules.size() - 1;
}
Retrieve one:
UmbraModule * UmbraEngine::getModule (int id) {
for (UmbraModule **it=modules.begin(); it != modules.end(); it++) {
if ((*it)->id == id) return *it;
}
}
Now, I can add and get modules without ever knowing their identifiers:
int main() {
UmbraEngine e;
for (int i = 0; i < 10; i++) {
e.addModule(new UmbraModule());
}
UmbraModule * m = e.getModule(5); //OK
cout << m << endl; //"0x127f10" or whatever
for (int j = 0; k < 10; j++) {
UmbraModule mm; //not a pointer
e.addModule(&mm);
}
m = e.getModule(15);
cout << m << endl; //{null}
}
The modules list persists throughout the entire duration of the program, I don't need to care about the modules' life span if they're instantiated with new :). So that's basically it - with pointers, you can have long-lived objects that don't ever need an identifier (or a name, if you will) in order to reference them :).
Another nice, but very simple example is this:
void getVal (int * a) {
*a = 10;
}
int main() {
int b;
getVal(&b);
return b;
}
You have many situations wherein a parameter does not exist or is invalid and this can depend on runtime semantics of the code. In such situations you can use a pointer and set it to NULL (0) to signal this state. Apart from this,
A pointer can be re-assigned to a new
state. A reference cannot. This is
desirable in some situations.
A pointer helps transfer owner-ship semantics. This is especially useful
in multi-threaded environment if the parameter-state is used to execute in
a separate thread and you do not usually poll till the thread has exited. Now the thread can delete it.
Erm... not exactly. It's the IDENTIFIER that has a scope. When you create an object using new, but its identifier's scope ends, you may end up with a memory leak (or not - depends on what you want to achieve) - the object is in the memory, but you have no means of referencing it anymore.
The difference is that a pointer is an address in memory, so if you have, say, this code:
int * a = new int;
a is a pointer. You can print it - and you'll get something like "0x0023F1" - it's just that: an address. It has no value (although some value is stored in the memory at that address).
int b = 10;
b is a variable with a value of 10. If you print it, you'll get 10.
Now, if you want a to point to b's address, you can do:
a = &b; //a points to b's address
or if you want the address pointed by a to have b's value:
*a = b; //value of b is assigned to the address pointed by a
Please compile this sample and comment/uncomment lines 13 and 14 to see the difference (note WHERE the identifiers point and to WHAT VALUE). I hope the output will be self-explanatory.
#include <iostream>
using namespace std;
int main()
{
int * a = new int;
int b = 10;
cout << "address of a: " << a << endl;
cout << "address of b: " << &b << endl;
cout << "value of a: " << *a << endl;
cout << "value of b: " << b << endl;
a = &b; //comment/uncomment
//*a = b; //comment/uncomment
cout << "address of a: " << a << endl;
cout << "address of b: " << &b << endl;
cout << "value of a: " << *a << endl;
cout << "value of b: " << b << endl;
}
Let's answer the last question first. Then the first question will make more sense.
Q: "What is the practical difference[ between a pointer and a reference]?"
A: A reference is just a local pseudonym for another variable. If you pass a parameter by reference, then that parameter is exactly the same variable as the one that was listed in the calling statement. However, internally there usually is no difference between a pointer and a reference. References provide "syntax sugar" by allowing you to reduce the amount of typing you have to do when all you really wanted was access to a single instance of a given variable.
Q: "When should I use each of em?"
A: That's going to be a matter of personal preference. Here's the basic rule I follow. If I'm going to need to manipulate a variable in another scope, and that variable is either an intrinsic type, a class that should be used like an intrinsic type (i.e. std::string, etc...), or a const class instance, then I pass by reference. Otherwise, I'll pass by pointer.
The thing is you cannot rebind a reference to another object. References are bound compile time and cannot be null or rebound. So pointers aren't redundant if your doubt was that :)
As my c++ teacher used to put it, pointers point to the memory location while references are aliases . Hence the main advantage is that they can be used in the same way as the object's name they refer to, but in a scope where the object is not available by passing it there.
While pointers can be redirected to some other location, references being like constant pointers, can't be redirected. So references cant be used for traversing arrays in a functions etc.
However a pointer being a separate entity takes up some memory, but the reference being the same as the referred object doesn't take any additional space. This is one of its advantages.
I have also read that the processing time for references are less,
as
int & i = b ;
i++ ; takes lesser time than
int * j = b ;
(*j) ++ ;
but I am yet to confirm this. If anyone can throw light on this claim it would be great.
Comments are welcome :)