API return type -- Force a smart pointer onto the user?

API return type -- Force a smart pointer onto the user? - c++

Suppose I have an API-border function that creates and returns an object:
<?> createOneObject();
What would be the best way to return the object to the caller?
Return the raw MyObject* and let the caller handle the pointer herself
Return std::shared_ptr<MyObject>
Something else?

Depends.
In general, returning a raw pointer is bad, because the type itself does not communicate ownership. Should the user delete the pointer? If yes, then how? If not, then when does the library do it? How does the user know if the pointer is still valid? Those things have to be documented, and lazy programmers might skip reading the docs. But, if your API must be callable from C or other languages, then a raw pointer may be the only option.
Shared pointer might be useful in special cases, but they do have overhead. If you don't need shared ownership, then use of a shared pointer may be overkill.
Unique pointer is what I would use unless there is specific reason to use something else.
Although, that assumes that a pointer should be returned in the first place. Another great option could be to return an object.
To be a bit more concrete: I have a player class that just stores an id and just offers lots of methods. In this case you would prefer to return by value?
Most definitely.

You have 2 options (without any nasty drawbacks):
C Style
Have a matching void destroyOneObject(MyObject* object); that cleans up the resource. This is the proper choice when destroying the object isn't as simple as deleting it (e.g. needs to be unregistered at some manager class).
Smart pointer
When all that needs to happen is to delete the object, return a std::unique_ptr<MyObject>. This uses RAII to clean up the object, and fits modern C++ a lot better.
The bad solutions
Don't return a raw pointer and call delete on it. You cannot guarantee that the delete matches the new that was used to create the object. It also doesn't communicate very well if you are supposed to delete it in the first place.
Don't use shared pointers. They are complete overkill in almost any situation, and are usually the result of a lack of understanding of an application's structure.

Related

C++ Is making objects depend on others a good design?

I have a basic design that consists of three classes : A Data class, A Holder class wich holds and manages multiple Data objects, and a Wrapper returned by the Holder wich contains a reference to a Data object.
The problem is that Wrapper must not outlive Holder, or it will contain a dangling reference, as Holder is responsible for deleting the Data objects. But as Wrapper is intended to have a very short lifetime (get it in a function, make some computation on its data, and let it go out of scope), this should not be a problem, but i'm not sure this is a good design.
Here are some solutions i thought about:
-Rely on the user reading the documentation, technically the same thing happens with STL iterators
-Using shared_ptr to make sure the data lasts long enought, but it feels like overkill
-Make Wrapper verify its Holder still exists each time you use it
-Any idea?
(I hope everyone can understand this, as english is not my native language)
Edit : If you want to have a less theoric approach, this all comes from a little algorithm i'm trying to write to solve Sudokus, the Holder is the grid, the Data is the content of each box (either a result or a temporary supposition), and the Wrapper is a Box class wich contains a reference to the Data, plus additional information like row and column.
I did not originally said it because i want to know what to do in a more general situation.

Only ever returning a Wrapper by value will help ensure the caller doesn't hold onto it beyond the calling scope. In conjunction with a comment or documentation that "Wrappers are only valid for the lifetime of the Holder that created them" should be enough.
Either that or you decide that ownership of a Data is transferred to the Wrapper when the Wrapper is created, and the Data is destroyed along with the Wrapper - but what if you want a to destroy a Wrapper without deleting the Data? You'd need a method to optionally relinquish ownership of the Data back to the Holder.
Whichever you choose, you need to decide what owns (ie: is responsible for the lifetime of) Data and when - once you've done that you can, if you want, use smart pointers to help with that management - but they won't make the design decision for you, and you can't simply say "oh I'll use smart pointers instead of thinking about it".
Remember, if you can't manage heap memory without smart pointers - you've got no business managing heap memory with them either!

To elaborate on what you have already listed as options,
As you suggested, shared_ptr<Data> is a good option. Unless performance is an issue, you should use it.
Never hold a pointer to Data in Wrapper. Store a handle that can be used to get a pointer to the appropriate Data object. Before Data is accessed through Wrapper, get a pointer the Data object. If the pointer is not valid, throw an exception. If the pointer is valid, proceed along the happy path.

Correctly using smart pointers

I'm having trouble getting things organized properly with smart pointers. Almost to the point that I feel compelled to go back to using normal pointers.
I would like to make it easy to use smart pointers throughout the program without having to type shared_ptr<...> every time. One solution I think of right away is to make a template class and add a typedef sptr to it so I can do class Derived : public Object < Derived > .. and then use Derived::sptr = ... But this obviously is horrible because it does not work with another class that is then derived from Derived object.
And even doing typedef shared_ptr<..> MyObjectPtr is horrible because then it needs to be done for each kind of smart pointer for consistency's sake, or at least for unique_ptr and shared_ptr.
So what's the standard way people use smart pointers? Because frankly I'm starting to see it as being too much hassle to use them. :/

So what's the standard way people use smart pointers?
Rarely. The fact that you find it a hassle to use them is a sign that you over-use pointers. Try to refactor your code to make pointers the exception, not the rule. shared_ptr in particular has its niche, but it’s a small one: namely, when you genuinely have to share ownership of a resource between several objects. This is a rare situation.
Because frankly I'm starting to see it as being too much hassle to use them. :/
Agreed. That’s the main reason not to use pointers.
There are more ways to avoid pointers. In particular, shared_ptr really only needs to spelled out when you actually need to pass ownership. In functions which don’t deal with ownership, you wouldn’t pass a shared_ptr, or a raw pointer; you would pass a reference, and dereference the pointer upon calling the function.
And inside functions you almost never need to spell out the type; for instance, you can (and should) simply say auto x = …; instead of shared_ptr<Class> x = …; to initialise variables.
In summary, you should only need to spell out shared_ptr in very few places in your code.

I have a lot of code that creates objects dynamically. So using pointers is necessary because the number of objects is not known from the start. An object is created in one subsystem, then stored in another, then passed for further processing to the subsystem that created it. So that I guess means using shared_ptr. Good design? I don't know, but it seems most logical to ask subsystem to create a concrete object that it owns, return a pointer to an interface for that object and then pass it for further processing to another piece of code that will interact with the object through it's abstract interface.
I could return unique_ptr from factory method. But then I would run into trouble if I need to pass the object for processing multiple times. Because I would still need to know about the object after I pass it to another method and unique_ptr would mean that I lose track of the object after doing move(). Since I need to have at least two references to the object this means using shared_ptr.
I heard somewhere that most commonly used smart pointer is unique_ptr. Certainly not so in my application. I end up with using shared_ptr mush more often. Is this a sign of bad design then?

Is it alright to return a reference to a non-pointer member variable as a pointer?

I recently came across some C++ code that looked like this:
class SomeObject
{
private:
// NOT a pointer
BigObject foobar;
public:
BigObject * getFoobar() const
{
return &foobar;
}
};
I asked the programmer why he didn't just make foobar a pointer, and he said that this way he didn't have to worry about allocating/deallocating memory. I asked if he considered using some smart pointer, he said this worked just as well.
Is this bad practice? It seems very hackish.

That's perfectly reasonable, and not "hackish" in any way; although it might be considered better to return a reference to indicate that the object definitely exists. A pointer might be null, and might lead some to think that they should delete it after use.
The object has to exist somewhere, and existing as a member of an object is usually as good as existing anywhere else. Adding an extra level of indirection by dynamically allocating it separately from the object that owns it makes the code less efficient, and adds the burden of making sure it's correctly deallocated.
Of course, the member function can't be const if it returns a non-const reference or pointer to a member. That's another advantage of making it a member: a const qualifier on SomeObject applies to its members too, but doesn't apply to any objects it merely has a pointer to.
The only danger is that the object might be destroyed while someone still has a pointer or reference to it; but that danger is still present however you manage it. Smart pointers can help here, if the object lifetimes are too complex to manage otherwise.

You are returning a pointer to a member variable not a reference. This is bad design.
Your class manages the lifetime of foobar object and by returning a pointer to its members you enable the consumers of your class to keep using the pointer beyond the lifetime of SomeObject object. And also it enables the users to change the state of SomeObject object as they wish.
Instead you should refactor your class to include the operations that would be done on the foobar in SomeObject class as methods.
ps. Consider naming your classes properly. When you define it is a class. When you instantiate, then you have an object of that class.

It's generally considered less than ideal to return pointers to internal data at all; it prevents the class from managing access to its own data. But if you want to do that anyway I see no great problem here; it simplifies the management of memory.

Is this bad practice? It seems very hackish.
It is. If the class goes out of scope before the pointer does, the member variable will no longer exist, yet a pointer to it still exists. Any attempt to dereference that pointer post class destruction will result in undefined behaviour - this could result in a crash, or it could result in hard to find bugs where arbitrary memory is read and treated as a BigObject.
if he considered using some smart pointer
Using smart pointers, specifically std::shared_ptr<T> or the boost version, would technically work here and avoid the potential crash (if you allocate via the shared pointer constructor) - however, it also confuses who owns that pointer - the class, or the caller? Furthermore, I'm not sure you can just add a pointer to an object to a smart pointer.
Both of these two points deal with the technical issue of getting a pointer out of a class, but the real question should be "why?" as in "why are you returning a pointer from a class?" There are cases where this is the only way, but more often than not you don't need to return a pointer. For example, suppose that variable needs to be passed to a C API which takes a pointer to that type. In this case, you would probably be better encapsulating that C call in the class.

As long as the caller knows that the pointer returned from getFoobar() becomes invalid when the SomeObject object destructs, it's fine. Such provisos and caveats are common in older C++ programs and frameworks.
Even current libraries have to do this for historical reasons. e.g. std::string::c_str, which returns a pointer to an internal buffer in the string, which becomes unusable when the string destructs.
Of course, that is difficult to ensure in a large or complex program. In modern C++ the preferred approach is to give everything simple "value semantics" as far as possible, so that every object's life time is controlled by the code that uses it in a trivial way. So there are no naked pointers, no explicit new or delete calls scattered around your code, etc., and so no need to require programmers to manually ensure they are following the rules.
(And then you can resort to smart pointers in cases where you are totally unable to avoid shared responsibility for object lifetimes.)

Two unrelated issues here:
1) How would you like your instance of SomeObject to manage the instance of BigObject that it needs? If each instance of SomeObject needs its own BigObject, then a BigObject data member is totally reasonable. There are situations where you'd want to do something different, but unless that situation arises stick with the simple solution.
2) Do you want to give users of SomeObject direct access to its BigObject? By default the answer here would be "no", on the basis of good encapsulation. But if you do want to, then that doesn't change the assessment of (1). Also if you do want to, you don't necessarily need to do so via a pointer -- it could be via a reference or even a public data member.
A third possible issue might arise that does change the assessment of (1):
3) Do you want to give users of SomeObject direct access to an instance of BigObject that they continue using beyond the lifetime of the instance of SomeObject that they got it from? If so then of course a data member is no good. The proper solution might be shared_ptr, or for SomeObject::getFooBar to be a factory that returns a different BigObject each time it's called.
In summary:
Other than the fact it doesn't compile (getFooBar() needs to return const BigObject*), there is no reason so far to suppose that this code is wrong. Other issues could arise that make it wrong.
It might be better style to return const & rather than const *. Which you return has no bearing on whether foobar should be a BigObject data member.
There is certainly no "just" about making foobar a pointer or a smart pointer -- either one would necessitate extra code to create an instance of BigObject to point to.

Which kind of (auto) pointer to use?

I came accross several questions where answers state that using T* is never the best idea.
While I already make much use of RIIC, there is one particular point in my code, where I use T*. Reading about several auto-pointers, I couldn't find one where I'd say that I have a clear advantage from using it.
My scenario:
class MyClass
{
...
// This map is huge and only used by MyClass and
// and several objects that are only used by MyClass as well.
HashMap<string, Id> _hugeIdMap;
...
void doSomething()
{
MyMapper mapper;
// Here is what I pass. The reason I can't pass a const-ref is
// that the mapper may possibly assign new IDs for keys not yet in the map.
mapper.setIdMap(&_hugeIdMap);
mapper.map(...);
}
}
MyMapper now has a HashMap<...>* member, which - according to highly voted answers in questions on unrelated problems - never is a good idea (Altough the mapper will go out of scope before the instance of MyClass does and hence I do not consider it too much of a problem. There's no new in the mapper and no delete will be needed).
So what is the best alternative in this particular use-case?

Personally I think a raw pointer (or reference) is okay here. Smart pointers are concerned with managing the lifetime of the object pointed to, and in this case MyMapper isn't managing the lifetime of that object, MyClass is. You also shouldn't have a smart pointer pointing to an object that was not dynamically allocated (which the hash map isn't in this case).
Personally, I'd use something like the following:
class MyMapper
{
public:
MyMapper(HashMap<string, Id> &map)
: _map(map)
{
}
private:
HashMap<string, Id> &_map
};
Note that this will prevent MyMapper from having an assignment operator, and it can only work if it's acceptable to pass the HashMap in the constructor; if that is a problem, I'd make the member a pointer (though I'd still pass the argument as a reference, and do _map(&map) in the initializer list).
If it's possible for MyMapper or any other class using the hash map to outlive MyClass, then you'd have to start thinking about smart pointers. In that case, I would probably recommend std::shared_ptr, but you'd have to use it everywhere: _hugeIdMap would have to be a shared_ptr to a dynamically allocated value, not a regular non-pointer field.
Update:
Since you said that using a reference is not acceptable due to the project's coding standards, I would suggest just sticking with a raw pointer for the reasons mentioned above.

Naked pointers (normally referred to as raw pointers) are just fine when the object has no responsibility to delete the object. In the case of MyMapper then the pointer points to an object already owned by MyClass and is therefore absolutely fine to not delete it. The problem arises when you use raw pointers when you do intend for objects to be deleted through them, which is where problems lie. People only ask questions when they have problems, which is why you almost always see it only used in a problematic context, but raw pointers in a non-owning context is fine.

How about passing it into the constructor and keeping a reference (or const-reference) to it? That way your intent of not owning the object is made clear.
Passing auto-pointers or shared-pointers are mostly for communicating ownership.
shared pointers indicate it's shared
auto-pointers indicate it's the receivers responsibility
references indicate it's the senders responsibility
blank pointers indicate nothing.
About your coding style:
our coding standards have a convention that says never pass non-const references.
Whether you use the C++ reference mechanism or the C++ pointer mechanism, you're passing a (English-meaning) reference to the internal storage that will change. I think your coding standard is trying to tell you not to do that at all, not so much that you can't use references to do so but that you can do it in another way.

What is the best way to implement smart pointers in C++?

I've been evaluating various smart pointer implementations (wow, there are a LOT out there) and it seems to me that most of them can be categorized into two broad classifications:
1) This category uses inheritance on the objects referenced so that they have reference counts and usually up() and down() (or their equivalents) implemented. IE, to use the smart pointer, the objects you're pointing at must inherit from some class the ref implementation provides.
2) This category uses a secondary object to hold the reference counts. For example, instead of pointing the smart pointer right at an object, it actually points at this meta data object... Who has a reference count and up() and down() implementations (and who usually provides a mechanism for the pointer to get at the actual object being pointed to, so that the smart pointer can properly implement operator ->()).
Now, 1 has the downside that it forces all of the objects you'd like to reference count to inherit from a common ancestor, and this means that you cannot use this to reference count objects that you don't have control over the source code to.
2 has the problem that since the count is stored in another object, if you ever have a situation that a pointer to an existing reference counted object is being converted into a reference, you probably have a bug (I.E., since the count is not in the actual object, there is no way for the new reference to get the count... ref to ref copy construction or assignment is fine, because they can share the count object, but if you ever have to convert from a pointer, you're totally hosed)...
Now, as I understand it, boost::shared_pointer uses mechanism 2, or something like it... That said, I can't quite make up my mind which is worse! I have only ever used mechanism 1, in production code... Does anyone have experience with both styles? Or perhaps there is another way thats better than both of these?

"What is the best way to implement smart pointers in C++"
Don't! Use an existing, well tested smart pointer, such as boost::shared_ptr or std::tr1::shared_ptr (std::unique_ptr and std::shared_ptr with C++ 11)
If you have to, then remember to:
use safe-bool idiom
provide an operator->
provide the strong exception guarantee
document the exception requirements your class makes on the deleter
use copy-modify-swap where possible to implement the strong exception guarantee
document whether you handle multithreading correctly
write extensive unit tests
implement conversion-to-base in such a way that it will delete on the derived pointer type (policied smart pointers / dynamic deleter smart pointers)
support getting access to raw pointer
consider cost/benifit of providing weak pointers to break cycles
provide appropriate casting operators for your smart pointers
make your constructor templated to handle constructing base pointer from derived.
And don't forget anything I may have forgotten in the above incomplete list.

Just to supply a different view to the ubiquitous Boost answer (even though it is the right answer for many uses), take a look at Loki's implementation of smart pointers. For a discourse on the design philosophy, the original creator of Loki wrote the book Modern C++ Design.

I've been using boost::shared_ptr for several years now and while you are right about the downside (no assignment via pointer possible), I think it was definitely worth it because of the huge amount of pointer-related bugs it saved me from.
In my homebrew game engine I've replaced normal pointers with shared_ptr as much as possible. The performance hit this causes is actually not so bad if you are calling most functions by reference so that the compiler does not have to create too many temporary shared_ptr instances.

Boost also has an intrusive pointer (like solution 1), that doesn't require inheriting from anything. It does require changing the pointer to class to store the reference count and provide appropriate member functions. I've used this in cases where memory efficiency was important, and didn't want the overhead of another object for each shared pointer used.
Example:
class Event {
public:
typedef boost::intrusive_ptr<Event> Ptr;
void addRef();
unsigned release();
\\ ...
private:
unsigned fRefCount;
};
inline void Event::addRef()
{
fRefCount++;
}
inline unsigned Event::release(){
fRefCount--;
return fRefCount;
}
inline void intrusive_ptr_add_ref(Event* e)
{
e->addRef();
}
inline void intrusive_ptr_release(Event* e)
{
if (e->release() == 0)
delete e;
}
The Ptr typedef is used so that I can easily switcth between boost::shared_ptr<> and boost::intrusive_ptr<> without changing any client code

If you stick with the ones that are in the standard library you will be fine.
Though there are a few other types than the ones you specified.
Shared: Where the ownership is shared between multiple objects
Owned: Where one object owns the object but transfer is allowed.
Unmovable: Where one object owns the object and it can not be transferred.
The standard library has:
std::auto_ptr
Boost has a couple more than have been adapted by tr1 (next version of the standard)
std::tr1::shared_ptr
std::tr1::weak_ptr
And those still in boost (which in relatively is a must have anyway) that hopefully make it into tr2.
boost::scoped_ptr
boost::scoped_array
boost::shared_array
boost::intrusive_ptr
See:
Smart Pointers: Or who owns you baby?

It seems to me this question is kind of like asking "Which is the best sort algorithm?" There is no one answer, it depends on your circumstances.
For my own purposes, I'm using your type 1. I don't have access to the TR1 library. I do have complete control over all the classes I need to have shared pointers to. The additional memory and time efficiency of type 1 might be pretty slight, but memory usage and speed are big issues for my code, so type 1 was a slam dunk.
On the other hand, for anyone who can use TR1, I'd think the type 2 std::tr1::shared_ptr class would be a sensible default choice, to be used whenever there isn't some pressing reason not to use it.

The problem with 2 can be worked around. Boost offers boost::shared_from_this for this same reason. In practice, it's not a big problem.
But the reason they went with your option #2 is that it can be used in all cases. Relying on inheritance isn't always an option, and then you're left with a smart pointer you can't use for half your code.
I'd have to say #2 is best, simply because it can be used in any circumstances.

Our project uses smart pointers extensively. In the beginning there was uncertainty about which pointer to use, and so one of the main authors chose an intrusive pointer in his module and the other a non-intrusive version.
In general, the differences between the two pointer types were not significant. The only exception being that early versions of our non-intrusive pointer implicitly converted from a raw pointer and this can easily lead to memory problems if the pointers are used incorrectly:
void doSomething (NIPtr<int> const &);
void foo () {
NIPtr<int> i = new int;
int & j = *i;
doSomething (&j); // Ooops - owned by two pointers! :(
}
A while ago, some refactoring resulted in some parts of the code being merged, and so a choice had to be made about which pointer type to use. The non-intrusive pointer now had the converting constructor declared as explicit and so it was decided to go with the intrusive pointer to save on the amount of code change that was required.
To our great surprise one thing we did notice was that we had an immediate performance improvement by using the intrusive pointer. We did not put much research into this, and just assumed that the difference was the cost of maintaining the count object. It is possible that other implementations of non-intrusive shared pointer have solved this problem by now.

What you are talking about are intrusive and non-intrusive smart pointers. Boost has both. boost::intrusive_ptr calls a function to decrease and increase the reference count of your object, everytime it needs to change the reference count. It's not calling member functions, but free functions. So it allows managing objects without the need to change the definition of their types. And as you say, boost::shared_ptr is non-intrusive, your category 2.
I have an answer explaining intrusive_ptr: Making shared_ptr not use delete. In short, you use it if you have an object that has already reference counting, or need (as you explain) an object that is already referenced to be owned by an intrusive_ptr.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js