auto_ptr design

In my opinion, a class should provide a well-defined abstraction, and no private members should be modified without the knowledge of the class. But when I checked "auto_ptr" (or any other smart pointer), I found that this rule is violated. Please see the following code:
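#include <memory>

class Foo{
public:
    Foo(){}
};
int main(int argc, char* argv[])
{
    std::auto_ptr<Foo> fooPtr(new Foo);   // auto_ptr: deprecated in C++11, removed in C++17
    delete fooPtr.operator ->();          // deletes the object behind auto_ptr's back
    return 0;                             // auto_ptr's destructor then deletes it again
}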
The operator overload (->) gives out the underlying pointer, and it can then be modified without the knowledge of "auto_ptr". I can't think of it as bad design, as the smart pointers are designed by C++ geeks, but I am wondering why they allowed this. Is there any way to write a smart pointer without this problem?
Appreciate your thoughts.

There are two desirable properties a smart pointer should have:
The raw pointer can be retrieved (e.g. for passing to legacy library functions)
The raw pointer cannot be retrieved (to prevent double-delete)
Obviously, these properties are contradictory and cannot be realised at the same time! Even Boost's shared_ptr<Foo> et al. have get(), so they have this "problem." In practice, the first is more important, so the second has to go.
By the way, I'm not sure why you reached for the slightly obscure operator->() when the ordinary old get() method causes the same problem:
std::auto_ptr<Foo> fooPtr(new Foo);
delete fooPtr.get();

In order to provide fast, convenient, "pointer-like" access to the underlying object, operator-> unfortunately has to "leak" its abstraction a bit. Otherwise, smart pointers would have to manually wrap all of the members that are allowed to be exposed. This would require either a lot of "configuration" work on the part of those instantiating the smart pointer, or a level of meta-programming that just isn't present in C++. Besides, as pyrsta points out, even if this hole were plugged, there are still many other (perhaps non-standard) ways to subvert C++'s access control mechanisms.

Is there any way to write a smart pointer without this problem.
It isn't easy, and generally no (i.e., you can't do it for every, general Foo class).
The only way I can think of, to do this, would be by changing the declaration of the Foo class: make the Foo destructor private (or define a private delete operator as a member of the Foo class), and also specify in the declaration of the Foo class that std::auto_ptr<Foo> is a friend.
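A minimal sketch of the private-destructor idea follows. It is an adaptation, not the exact suggestion: it assumes std::unique_ptr instead of auto_ptr (auto_ptr was removed in C++17), and with unique_ptr the class that actually calls delete is std::default_delete<Foo>, so that is what gets befriended. The live counter and the function name are hypothetical and exist only to observe the behaviour.

```cpp
#include <cassert>
#include <memory>

// Hypothetical class: only std::default_delete<Foo> may destroy a Foo.
class Foo {
public:
    Foo() { ++live; }
    static int live;      // number of living instances, for illustration only
private:
    ~Foo() { --live; }    // private: "delete ptr" in user code does not compile
    friend struct std::default_delete<Foo>;
};
int Foo::live = 0;

int live_after_scope() {
    {
        std::unique_ptr<Foo> p(new Foo);
        assert(Foo::live == 1);
        // delete p.get();   // error: Foo::~Foo() is private in this context
    }                        // default_delete<Foo> destroys the object here
    return Foo::live;
}
```

The cost is visible in the sketch: Foo itself has to be changed, and it can no longer be created as a plain stack variable, which is why this does not work for every, general Foo.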

No, there's no way to completely prohibit such bad usage in C++.
As a general rule, the user of any library code should never call delete on any wrapped pointers unless specifically documented. And in my opinion, all modern C++ code should be designed so that the user of the classes is never left with the full responsibility of manually releasing her acquired resources (i.e. use RAII instead).
As an aside: std::auto_ptr<T> isn't the best option anymore. Its bad behaviour on copying can lead to serious coding errors. Often a better idea is to use boost::scoped_ptr<T>, or std::tr1::shared_ptr<T> or its Boost variant, instead.
Moreover, in C++0x, std::unique_ptr<T> will functionally supersede std::auto_ptr<T> as a safer-to-use class. Some discussion on the topic and a recent C++03 implementation for unique_ptr emulation can be found here.

I don't think this shows that auto_ptr has an encapsulation problem. Whenever dealing with owned pointers, it is critical for people to understand who owns what. In the case of auto_ptr, it owns the pointer that it holds[1]; this is part of auto_ptr's abstraction. Therefore, deleting that pointer in any other way violates the contract that auto_ptr provides.
I'd agree that it's relatively easy to misuse auto_ptr[2], which is far from ideal, but in C++ you can never avoid the fundamental issue of "who owns this pointer?", because, for better or worse, C++ does not manage memory for you.
[1] Quote from cplusplus.com: "auto_ptr objects have the peculiarity of taking ownership of the pointers assigned to them": http://www.cplusplus.com/reference/std/memory/auto_ptr/
[2] For example, you might mistakenly believe that it has value semantics, and use it as a vector template parameter: http://www.gamedev.net/topic/502150-c-why-is-stdvectorstdauto_ptrmytype--bad/

I think this question addresses a non-issue. Smart pointers are there to manage ownership of pointers, and if doing so they make the pointer inaccessible, they fail their purpose.
Also consider this. Any container type gives you iterators over its elements; if it is such an iterator, then &*it is a pointer to an item in the container, and if you say delete &*it then you are dead. But exposing the addresses of its items is not a defect of container types.

Related

Why is raw pointer to shared_ptr construction allowed in all cases?

I was reading Top 10 dumb mistakes to avoid with C++11 smart pointer.
Number #5 reads:
Mistake # 5 : Not assigning an object(raw pointer) to a shared_ptr as
soon as it is created !
int main()
{
Aircraft* myAircraft = new Aircraft("F-16");
shared_ptr<Aircraft> pAircraft(myAircraft);
...
shared_ptr<Aircraft> p2(myAircraft);
// will do a double delete and possibly crash
}
and the recommendation is something like:
Use make_shared or new and immediately construct the pointer with
it.
OK, no doubt about the problem or the recommendation.
However I have a question about the design of shared_ptr.
This is a very easy mistake to make, and the whole "safe" design of shared_ptr can be thrown away by misuses that would be very easy to detect.
Now the question is: could this have been easily fixed with an alternative design of shared_ptr in which the only constructor from a raw pointer takes an r-value reference?
template<class T>
struct shared_ptr{
shared_ptr(T*&& t){...basically current implementation...}
shared_ptr(T* t) = delete; // this is to...
shared_ptr(T* const& t) = delete; // ... illustrate the point.
shared_ptr(T*& t) = delete;
...
};
In this way shared_ptr could only be initialized from the result of new or of some factory function.
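The rvalue-only idea can be tried out without touching shared_ptr itself. The sketch below uses a hypothetical free function, adopt, whose parameter is T*&&. Because T is deduced from the pointee, this is an ordinary rvalue reference rather than a forwarding reference, so an lvalue pointer does not bind to it and no deleted overloads are needed for the illustration.

```cpp
#include <memory>
#include <utility>

// Hypothetical helper: accepts only rvalue pointers, e.g. the result of new.
template <class T>
std::shared_ptr<T> adopt(T*&& fresh) {
    return std::shared_ptr<T>(fresh);
}

int adopt_demo() {
    std::shared_ptr<int> a = adopt(new int(5));      // OK: rvalue
    int* raw = new int(7);
    // adopt(raw);                                   // error: lvalue does not bind to T*&&
    std::shared_ptr<int> b = adopt(std::move(raw));  // explicit opt-in
    return *a + *b;
}
```

Note that raw is still a valid, copyable pointer after the std::move, so the restriction makes the mistake more visible without making it impossible.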
Is this an underexploitation of the C++ language in the library?
Or, what is the point of having a constructor from a raw pointer (l-value) reference if this is most likely going to be a misuse?
Is this a historical accident (e.g. shared_ptr was proposed before r-value references were introduced)? Backwards compatibility?
(Of course one could say std::shared_ptr<type>(std::move(ptr)); but that is easier to catch, and it is also a workaround if this is really necessary.)
Am I missing something?

Pointers are very easy to copy. Even if you restrict the constructor to r-value references, you can still easily make copies (like when you pass a pointer as a function parameter), which will invalidate the safety setup. Moreover, you will run into problems in templates, where you can easily have T* const or T*& as a type and get type mismatches.
So you are proposing to create more restrictions without significant safety gains, which is likely why it was not in the standard to begin with.
The point of make_shared is to make the construction of a shared pointer atomic. Say you have f(shared_ptr<int>(new int(5)), throw_some_exception()). The order of argument evaluation is not guaranteed by the standard. The compiler is allowed to create the new int, run throw_some_exception and only then construct the shared_ptr, which means that you could leak the int (if throw_some_exception actually throws an exception). make_shared creates the object and the shared pointer inside itself, which doesn't allow the compiler to change the order, so it becomes safe.
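A small sketch of the safe form, assuming two hypothetical functions, may_throw and use, that stand in for throw_some_exception and f. (Since C++17 the evaluation of one argument may no longer be interleaved with another, which closes this particular leak for the new-expression form as well; make_shared remains the simpler habit.)

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical stand-ins for throw_some_exception() and f() from the text.
int may_throw(bool fail) {
    if (fail) throw std::runtime_error("boom");
    return 1;
}
int use(std::shared_ptr<int> p, int extra) { return *p + extra; }

int safe_call(bool fail) {
    // make_shared allocates and takes ownership inside one function call,
    // so no raw pointer is left unowned while may_throw() runs.
    return use(std::make_shared<int>(5), may_throw(fail));
}
```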

I do not have any special insight into the design of shared_ptr, but I think the most likely explanation is that the timelines involved made this impossible:
The shared_ptr was introduced at the same time as rvalue-references, in C++11. The shared_ptr already had a working reference implementation in boost, so it could be expected to be added to standard libraries relatively quickly.
If the constructor for shared_ptr had only supported construction from rvalue references, it would have been unusable until the compiler had also implemented support for rvalue references.
And at that time, compiler and standards development were much more asynchronous, so it could have taken years until all compilers had implemented support, if at all. (export templates were still fresh in people's minds in 2011)
Additionally, I assume the standards committee would have felt uncomfortable standardizing an API that did not have a reference implementation, and could not even get one until after the standard was published.

There's a number of cases in which you may not be able to call make_shared(). For example, your code may not be responsible for allocating and constructing the class in question. The following paradigm (private constructors + factory functions) is used in some C++ code bases for a variety of reasons:
struct A {
private:
A();
};
A* A_factory();
In this case, if you wanted to stick the A* you get from A_factory() into a shared_ptr<>, you'd have to use the constructor which takes a raw pointer instead of make_shared().
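A sketch of that situation, filling in the declarations above with assumptions: the friend declaration, the value() member and the wrap_A helper are hypothetical additions made only so the example compiles on its own.

```cpp
#include <memory>

class A {
public:
    int value() const { return 42; }   // hypothetical member, for illustration
private:
    A() {}
    friend A* A_factory();             // only the factory may construct an A
};

A* A_factory() { return new A(); }

// make_shared<A>() would not compile here (the constructor is private),
// so the raw pointer is handed straight to the shared_ptr constructor.
std::shared_ptr<A> wrap_A() {
    return std::shared_ptr<A>(A_factory());
}
```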
Off the top of my head, some other examples:
You want to get aligned memory for your type using posix_memalign() and then store it in a shared_ptr<> with a custom deleter that calls free() (this use case will go away soon when we add aligned allocation to the language!).
You want to stick a pointer to a memory-mapped region created with mmap() into a shared_ptr<> with a custom deleter that calls munmap() (this use case will go away when we get a standardized facility for shmem, something I'm hoping to work on in the next few months).
You want to stick a pointer allocated by into a shared_ptr<> with a custom deleter.
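The custom-deleter cases above all have the same shape. The sketch below uses std::malloc and std::free as a portable stand-in for posix_memalign() or mmap(); the function name is hypothetical.

```cpp
#include <cstdlib>
#include <memory>

// Memory from a C allocator must be released with free(), not delete,
// so the matching deleter is passed alongside the raw pointer.
std::shared_ptr<int> make_c_allocated_int(int value) {
    int* raw = static_cast<int*>(std::malloc(sizeof(int)));
    if (!raw) return nullptr;          // allocation failed: empty shared_ptr
    *raw = value;
    return std::shared_ptr<int>(raw, [](int* p) { std::free(p); });
}
```

make_shared cannot express this, because it always allocates and destroys the object itself.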

Revisiting reasons for no implicit conversion on smart pointer

[One answer, below is to force a choice between .release() and .get(), sadly the wrong general advice is to just use .get()]
Summary: This question is asking for technical reasons, or behavioural reasons (perhaps based on experience), why smart pointers such as unique_ptr are stripped of the major characteristic of a pointer, i.e. the ability to be passed where a pointer is expected (C API's).
I have researched the topic and cite two major claimed reasons below, but these hardly seem valid.
But my arrogance is not boundless, and my experience not so extensive as to convince me that I must be right.
It may be that unique_ptr was not designed with simple lifetime management of dumb C API pointers as a main use (certainly this proposed development of unique_ptr would not be [http://bartoszmilewski.com/2009/05/21/unique_ptr-how-unique-is-it/]). However, unique_ptr claims to be "what auto_ptr should have been (but that we couldn't write in C++98)" [http://www.stroustrup.com/C++11FAQ.html#std-unique_ptr], though perhaps that is not a prime use of auto_ptr either.
I'm using unique_ptr for management of some C API resources, and shocked [yes, shocked :-)] to find that so-called smart pointers hardly behave as pointers at all.
The API's I use expect pointers, and I really don't want to be adding .get() all over the place. It all makes the smart pointer unique_ptr seem quite dumb.
What's the current reasoning for unique_ptr not automatically converting to the pointer it holds when it is being cast to the type of that pointer?
void do_something(BLOB* b);
unique_ptr<BLOB> b(new_BLOB(20));
do_something(b);
// equivalent to do_something(b.get()) because of implicit cast
I have read http://herbsutter.com/2012/06/21/reader-qa-why-dont-modern-smart-pointers-implicitly-convert-to/ and it remarkably (given the author) doesn't actually answer the question convincingly, so I wonder if there are more real examples or technical/behavioural reasons to justify this.
To reference the article examples, I'm not trying do_something(b + 42), and + is not defined on the object I'm pointing to so *b + 42 doesn't make sense.
But if it did, and if I meant it, then I would actually type *b + 42, and if I wanted to add 42 to the pointer I would type b + 42 because I'm expecting my smart pointer to actually act like a pointer.
Can a reason for making smart pointers dumb really be the fear that the C++ coder won't understand how to use a pointer, or will keep forgetting to dereference it? That if I make an error with a smart pointer, it will silently compile and behave just as it does with a dumb pointer? [Surely that argument has no end, I might forget the > in ->]
For the other point in the post, I'm no more likely to write delete b than I am to write delete b.get(), though this seems to be a commonly proposed reason (perhaps because of legacy code conversions), and it is discussed in "C++ "smart pointer" template that auto-converts to bare pointer but can't be explicitly deleted". However, the Meyers ambiguity of 1996, mentioned in http://www.informit.com/articles/article.aspx?p=31529&seqNum=7, seems to answer that case well: it defines a conversion to void* as well as to T*, so that delete can't work out which conversion to use.
The delete problem seems to have had some legitimacy, as it is likely to be a real problem when porting some legacy code, but it seems to be well addressed even prior to Meyers (http://www.boost.org/doc/libs/1_43_0/libs/smart_ptr/sp_techniques.html#preventing_delete)
Are there more technical reasons for denying this basic pointer behaviour to smart pointers? Or did the reasons just seem very compelling at the time?
Previous discussions contain general warnings of bad things [Add implicit conversion from unique_ptr<T> to T* , Why doesn't `unique_ptr<QByteArray>` degrade to `QByteArray*`?, C++ "smart pointer" template that auto-converts to bare pointer but can't be explicitly deleted, http://bartoszmilewski.com/2009/05/21/unique_ptr-how-unique-is-it/] but nothing specific, or that is any worse than use of non-smart C pointers to C API's.
The inconvenience measured against the implied risks of coders blindly adding .get() everywhere and getting all the same harms they were supposed to have been protected against make the whole limitation seem very unworthwhile.
In my case, I used Meyers's trick of two casts and took on the accompanying hidden risks, hoping that readers will help me learn what they are.
template<typename T, typename D, D F>
class unique_dptr : public std::unique_ptr<T, D> {
public:
    unique_dptr(T* t) : std::unique_ptr<T, D>(t, F) { }
    operator T*() { return this->get(); }
    operator void*() { return this->get(); }   // second conversion makes "delete ptr" ambiguous
};
// Expands to the deleter's type followed by the deleter itself.
#define d_type(f) decltype(&f), f
with thanks to @Yakk for the macro tip (Clean implementation of function template taking function pointer, How to fix error refactoring decltype inside template)
using RIP_ptr = unique_dptr<RIP, d_type(::RIP_free)>;
RIP_ptr rip1(RIP_new_bitmap("/tmp/test.png"));
Now that's a smart pointer I can use.
* I declare the free function once when the smart pointer type is defined
* I can pass it around like a pointer
* It gets freed when the scope dies
Yes, I can use the API wrongly and leak owned references, but .get() doesn't stop that, despite its inconvenience.
Maybe I should have some consts in there as a nod to lack of ownership-transference.

One answer that I'm surprised I haven't found in my searches is implied in the documentation for unique_ptr::release [http://en.cppreference.com/w/cpp/memory/unique_ptr/release]
release() returns the pointer and leaves the unique_ptr holding nullptr, so clearly it can be used for passing the owned reference on to an API that doesn't use smart pointers.
By inference, get() is the corresponding function for passing an unowned reference.
As a pair, these functions explain why automatic de-referencing is not permitted; the coder is forced to replace each pointer use either with .release() or .get() depending on how the called function will treat the pointer.
And thus the coder is forced to take intelligent action and choose one behaviour or the other, and upgrade the legacy code to be more explicit and safe.
.get() is a weird name for that use, but this explanation makes good sense to me.
Sadly, this strategy is ineffective if the only advice the coder has is to use .get(); the coder is not aware of the choice and misses a chance to make his code safe and clear.
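The choice between the two can be sketched as follows. Blob, blob_size and blob_consume are hypothetical stand-ins for a C-style API in which one function only borrows the pointer and the other takes ownership of it.

```cpp
#include <memory>

struct Blob { int size; };

// Hypothetical C-style API: the first call borrows, the second takes ownership.
int  blob_size(const Blob* b) { return b->size; }
void blob_consume(Blob* b)    { delete b; }

bool release_vs_get() {
    std::unique_ptr<Blob> b(new Blob{20});
    int n = blob_size(b.get());     // borrow: b still owns the Blob
    blob_consume(b.release());      // transfer: b is now empty, no double delete
    return n == 20 && b == nullptr;
}
```

An implicit conversion to Blob* could only ever pick one of these two behaviours, and the call site would not show which.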

Customising std::shared_ptr or boost::shared_ptr to throw an exception on NULL dereference

I have a few projects that use boost::shared_ptr or std::shared_ptr extensively (I can convert to either implementation soon enough, if there is a good answer to this question for one, but not the other). The Boost implementation uses Boost.Assert to avoid returning in the case of encountering an empty (NULL) pointer in operator* or operator-> at runtime; while the libc++ implementation seems to lack any check.
While of course the validity of a shared_ptr should be checked before use, a large, mixed-paradigm codebase leads me to want to try an exception-throwing variation; as most of the code is relatively exception-aware and will at most fail to a high-level but resumable state, rather than std::terminate() or segfault.
How should I best customise these accessors while maintaining the robustness of shared_ptr? It seems that encapsulating shared_ptr in a throwing_shared_ptr may be the best option, but I'm wary of breaking the magic. Am I best off copying the Boost source and just changing the ASSERTs to an appropriate throw statement?
The actual type name used everywhere for the appropriate smart_ptr<T> type is a typedef expanded from a macro; i.e. ForwardDeclarePtr(Class) expands to something like:
class Class;
typedef boost::smart_ptr<Class> ClassPtr;
Everything passes, takes, or stores a ClassPtr - so I can replace the underlying type pretty freely; and I suspect this alleviates the potential slicing/hiding issue.

There's really no "magic" in std::shared_ptr<T> that would be removed if you wrapped it inside a custom class that would throw an exception when dereferencing a NULL shared pointer. So I don't see why that approach wouldn't work, as long as your new wrapper-class follows all the semantics of the std::shared_ptr<T> type.
BTW, you could also take a slightly different approach, and that is create a wrapper-class that simply won't allow others to pass NULL pointers to the wrapped std::shared_ptr<T> data-member in the first-place. Basically it would be a class that would enforce the std::make_shared<T> idiom in its constructor. I'm not sure, based on the workings of your code if this is possible, but it's another way to circumvent the problem using a RAII approach rather than throwing exceptions.
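A minimal sketch of the wrapping approach, under the assumption that only dereferencing needs to change. The class name follows the question; the base() accessor and the choice of std::runtime_error are hypothetical, and a real wrapper would forward more of the shared_ptr interface (reset, comparisons, conversions between related types).

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

template <class T>
class throwing_shared_ptr {
public:
    throwing_shared_ptr() = default;
    throwing_shared_ptr(std::shared_ptr<T> p) : ptr_(std::move(p)) {}

    // Dereferencing an empty pointer throws instead of invoking undefined behaviour.
    T& operator*() const  { return *checked(); }
    T* operator->() const { return checked(); }

    explicit operator bool() const noexcept { return static_cast<bool>(ptr_); }
    const std::shared_ptr<T>& base() const noexcept { return ptr_; }

private:
    T* checked() const {
        if (!ptr_) throw std::runtime_error("null shared_ptr dereferenced");
        return ptr_.get();
    }
    std::shared_ptr<T> ptr_;   // ownership and reference counting are unchanged
};
```

Because the wrapper holds an ordinary shared_ptr as a member, copying it copies the member, and the reference counting behaves exactly as before.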

Just subclass std::shared_ptr into throwing_shared_ptr, redefine those two operators (they are not virtual, so this hides rather than overrides them), and have them check the pointer before calling through to std::shared_ptr's implementation. This should work fine as long as you use throwing_shared_ptr everywhere instead of slicing it to a std::shared_ptr.

Should I use shared_ptr or unique_ptr

I've been making some objects using the pimpl idiom, but I'm not sure whether to use std::shared_ptr or std::unique_ptr.
I understand that std::unique_ptr is more efficient, but this isn't so much of an issue for me, as these objects are relatively heavyweight anyway so the cost of std::shared_ptr over std::unique_ptr is relatively minor.
I'm currently going with std::shared_ptr just because of the extra flexibility. For example, using a std::shared_ptr allows me to store these objects in a hashmap for quick access while still being able to return copies of these objects to callers (as I believe any iterators or references may quickly become invalid).
However, these objects in a way really aren't being copied, as changes affect all copies, so I was wondering that perhaps using std::shared_ptr and allowing copies is some sort of anti-pattern or bad thing.
Is this correct?

I've been making some objects using the pimpl idiom, but I'm not sure whether to use shared_ptr or unique_ptr.
Definitely unique_ptr or scoped_ptr.
Pimpl is not a pattern, but an idiom, which deals with compile-time dependency and binary compatibility. It should not affect the semantics of the objects, especially with regard to its copying behavior.
You may use whatever kind of smart pointer you want under the hood, but those 2 guarantee that you won't accidentally share the implementation between two distinct objects, as they require a conscious decision about the implementation of the copy constructor and assignment operator.
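A minimal single-file sketch of a pimpl built on unique_ptr; the Widget class and its members are hypothetical. Since unique_ptr is not copyable, the copy constructor and assignment operator have to be written by hand, which is the "conscious decision" mentioned above. Here they are chosen to perform a deep copy.

```cpp
#include <memory>

// --- widget.h (hypothetical) ---
class Widget {
public:
    Widget();
    ~Widget();                              // defined where Impl is complete
    Widget(const Widget& other);            // deep copy, written deliberately
    Widget& operator=(const Widget& other);
    int value() const;
    void set_value(int v);
private:
    struct Impl;                            // incomplete type in the header
    std::unique_ptr<Impl> impl_;
};

// --- widget.cpp (hypothetical) ---
struct Widget::Impl { int value = 0; };

Widget::Widget() : impl_(new Impl) {}
Widget::~Widget() = default;
Widget::Widget(const Widget& other) : impl_(new Impl(*other.impl_)) {}
Widget& Widget::operator=(const Widget& other) {
    *impl_ = *other.impl_;                  // copy the state, not the pointer
    return *this;
}
int Widget::value() const { return impl_->value; }
void Widget::set_value(int v) { impl_->value = v; }
```

With shared_ptr in place of unique_ptr, the compiler-generated copy operations would compile silently and two Widget objects would end up sharing one Impl.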
However, these objects in a way really aren't being copied, as changes affect all copies, so I was wondering that perhaps using shared_ptr and allowing copies is some sort of anti-pattern or bad thing.
It is not an anti-pattern; in fact, it is a pattern: Aliasing. You already use it, in C++, with bare pointers and references. shared_ptr offers an extra measure of "safety" to avoid dead references, at the cost of extra complexity and new issues (beware of cycles, which create memory leaks).
Unrelated to Pimpl
I understand unique_ptr is more efficient, but this isn't so much of an issue for me, as these objects are relatively heavyweight anyway so the cost of shared_ptr over unique_ptr is relatively minor.
If you can factor out some state, you may want to take a look at the Flyweight pattern.

If you use shared_ptr, it's not really the classical pimpl
idiom (unless you take additional steps). But the real question
is why you want to use a smart pointer to begin with; it's very
clear where the delete should occur, and there's no issue of
exception safety or other to be concerned with. At most,
a smart pointer will save you a line or two of code. And the
only one which has the correct semantics is boost::scoped_ptr,
and I don't think it works in this case. (IIRC, it requires
a complete type in order to be instantiated, but I could be
wrong.)
An important aspect of the pimpl idiom is that its use should be
transparent to the client; the class should behave exactly as if
it were implemented classically. This means either inhibiting
copy and assignment or implementing deep copy, unless the class
is immutable (no non-const member functions). None of the usual
smart pointers implement deep copy; you could implement one, of
course, but it would probably still require a complete type
whenever the copy occurs, which means that you'd still have to
provide a user defined copy constructor and assignment operator
(since they can't be inline). Given this, it's probably not
worth the bother using the smart pointer.
An exception is if the objects are immutable. In this case, it
doesn't matter whether the copy is deep or not, and shared_ptr
handles the situation completely.

When you use a shared_ptr (for example in a container, then look this up and return it by-value), you are not causing a copy of the object it points to, simply a copy of the pointer with a reference count.
This means that if you modify the underlying object from multiple points, then you affect changes on the same instance. This is exactly what it is designed for, so not some anti-pattern!
When passing a shared_ptr (as the comments say), it's better to pass by const reference and copy (thereby incrementing the reference count) where needed. As for returning one, decide case by case.

Yes, please use them. Simply put, the shared_ptr is an implementation of smart pointer. unique_ptr is an implementation of automatic pointer:

Letting go of auto_ptr

Occasionally, for fleeting moments, I think auto_ptr is cool. But most of the time I recognize that there are much simpler techniques that make it irrelevant. For example, if I want to have an object freed automatically, even if an exception is thrown, I could new up the object and assign to an auto_ptr. Very cool! But I could have more easily created my object as a local variable, and let the stack take care of it (duh!).
Thus I was not too surprised when I found Google's C++ coding standards banning the use of auto_ptr. Google states that scoped_ptr should be used instead (if a smart pointer is needed).
I'd like to know if anyone, contrary to my experience, can give a solid reason (or reasons) for when auto_ptr is the best or simplest thing to use. If not, then I suppose I will ban using it myself (following Google's lead).
update: For those who expressed concern, no, I am not adopting Google's standards. For example, against Google's advice, I agree exception handling should be activated. I also like using preprocessor macros, such as the printable enum I made. It is just the auto_ptr topic that struck me.
update2: It turns out my answer comes from two of the responders below, and a note from Wikipedia. First, Herb Sutter did show a valid use (source-sink idiom and lifetime-linked object composition). Second, there are shops where TR1 and boost are not available or banned and only C++03 is allowed. Third, according to Wikipedia, the C++0x spec is deprecating auto_ptr and replacing it with unique_ptr. So my answer is: use unique_ptr if available to me (on all platforms in consideration) else use auto_ptr for the cases that Sutter depicts.

It's the simplest thing to use when you need a scoped or unique pointer and you are working in a strict C++03 environment with no access to a tr1 implementation or boost.

Herb Sutter can help you out on this one: http://www.drdobbs.com/184403837

While banning auto_ptr seems attractive, there is one issue:
template <typename T>
some_smart_ptr<T> create();
What will you replace the some_smart_ptr placeholder with ?
The generic answer, shared_ptr, is only worth it if the ownership is truly shared. If the function grants the caller exclusive ownership of the resource, it's misleading at best (and a typical case of premature pessimization as far as I am concerned).
On the other hand, in C++03, no other form of smart pointer can deliver: it's impossible, without move semantics, to provide what we'd like here. auto_ptr or a naked pointer are the two logical contenders. But then a naked pointer exposes you to the risk of leaks if the caller is careless.
With C++0x, unique_ptr advantageously replaces auto_ptr in every situation.
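A sketch of the create() function with unique_ptr filled in for the placeholder; Resource is a hypothetical type and the function is made non-template to keep the example short.

```cpp
#include <memory>

struct Resource { int id; };

// The return type states that the caller receives exclusive ownership.
std::unique_ptr<Resource> create(int id) {
    return std::unique_ptr<Resource>(new Resource{id});
}
```

A caller that ignores the result, as in create(1);, does not leak, and a caller that does want shared ownership can still write std::shared_ptr<Resource> s = create(2);.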

std::auto_ptr still has pointer semantics, so automatic (non-pointer) variables aren't a substitute. In particular, std::auto_ptr supports polymorphism and re-assignment. With stack variables you can use references for polymorphism, but references don't allow for re-assignment.
Sometimes std::auto_ptr will do just fine, for example for implementing a pimpl. True, in the vast majority of cases Boost's smart pointer library offers better choices for a smart pointer. But right now std::auto_ptr is a standard solution, whereas Boost's smart pointers aren't.

Using auto_ptr as a function return value, you incur no copying overhead and never leak memory. std::auto_ptr<obj> foo() can be safely called in { foo(); } while obj *foo() cannot. boost::shared_ptr can solve this, but with higher overhead.
Also, some objects can't be created on stack because of memory constraints: thread stacks are relatively small. But boost::scoped_ptr is better in this case since it can't be accidentally released.

Well, one reason would be that scoped_ptr is non-copyable, so it's safer to use and harder to make mistakes with. auto_ptr allows transfer of ownership (e.g. by passing it to another auto_ptr as a constructor parameter). If you need to think about things like transferring ownership, the chances are you're better off with a smart pointer like shared_ptr.
