Returning C++ polymorphic objects (interfaces) - c++

I'd like to know what is considered nowadays the best practice when returning a pointer to a polymorphic object from a function, for example when using factories. If I transfer the ownership, should I return boost::unique_ptr<Interface>? What should I return if I don't transfer the ownership (e.g. returning a reference to a member)? Is there an alternative, non boost-based way which is also commonly used? Thanks.
EDIT: it is supposed to be C++03 compatible, with a possibility to easily upgrade to 0x
EDIT2: Please note I'm explicitly asking about common approaches, best practices, and not just "a way to do this". A solution implying a conditional search-and-replace over the codebase in future does not look like a good practice, does it?
EDIT3: Another point about auto_ptr is that it is deprecated, whatever neat it is, so it looks strange to advertise its usage at the interface level. Then, someone unaware will put the returned pointer into a STL container, and so on and so forth. So if you know another somehow common solution, you are very welcome to add an answer.

Use ::std::auto_ptr for now, and when C++0x is available, then switch to ::std::unique_ptr. At least in the factory case where you are handing ownership back to the caller.
Yes ::std::auto_ptr has problems and is ugly. Yes, it's deprecated in C++0x. But that is the recommended way to do it. I haven't examined ::boost::unique_ptr, but without move semantics I don't see that it can do any better than ::std::auto_ptr.
I prefer the idea of upgrading by doing a search and replace, though there are some unusual cases in which that won't have the expected result. Fortunately, these cases generate compiler errors:
::std::auto_ptr<int> p(new int);
::std::auto_ptr<int> p2 = p;
will have to become at least like this
::std::unique_ptr<int> p(new int);
::std::unique_ptr<int> p2 = ::std::move(p);
I prefer search and replace because I find that using macros and typedefs for things like this tend to make things more obscure and difficult to understand later. A search and replace of your codebase can be applied selectively if need be (::std::auto_ptr won't go away in C++0x, it's just deprecated) and leaves your code with clear and obvious intent.
As for what's 'commonly' done, I don't think the problem has been around for long enough for there to be a commonly accepted method of handling the changeover.

Related

C++ std features and Binary size

I was told recently in a job interview their project works on building the smallest size binary for their application (runs embedded) so I would not be able to use things such as templating or smart pointers as these would increase the binary size, they generally seemed to imply using things from std would be generally a no go (not all cases).
After the interview, I tried to do research online about coding and what features from standard lib caused large binary sizes and I could find basically nothing in regards to this. Is there a way to quantify using certain features and the size impact they would have (without needing to code 100 smart pointers in a code base vs self managed for example).
This question probably deserves more attention than it’s likely to get, especially for people trying to pursue a career in embedded systems. So far the discussion has gone about the way that I would expect, specifically a lot of conversation about the nuances of exactly how and when a project built with C++ might be more bloated than one written in plain C or a restricted C++ subset.
This is also why you can’t find a definitive answer from a good old fashioned google search. Because if you just ask the question “is C++ more bloated than X?”, the answer is always going to be “it depends.”
So let me approach this from a slightly different angle. I’ve both worked for, and interviewed at companies that enforced these kinds of restrictions, I’ve even voluntarily enforced them myself. It really comes down to this. When you’re running an engineering organization with more than one person with plans to keep hiring, it is wildly impractical to assume everyone on your team is going to fully understand the implications of using every feature of a language. Coding standards and language restrictions serve as a cheap way to prevent people from doing “bad things” without knowing they’re doing “bad things”.
How you define a “bad thing” is then also context specific. On a desktop platform, using lots of code space isn’t really a “bad” enough thing to rigorously enforce. On a tiny embedded system, it probably is.
C++ by design makes it very easy for an engineer to generate lots of code without having to type it out explicitly. I think that statement is pretty self-evident, it’s the whole point of meta-programming, and I doubt anyone would challenge it, in fact it’s one of the strengths of the language.
So then coming back to the organizational challenges, if your primary optimization variable is code space, you probably don’t want to allow people to use features that make it trivial to generate code that isn’t obvious. Some people will use that feature responsibly and some people won’t, but you have to standardize around the least common denominator. A C compiler is very simple. Yes you can write bloated code with it, but if you do, it will probably be pretty obvious from looking at it.
(Partially extracted from comments I wrote earlier)
I don't think there is a comprehensive answer. A lot also depends on the specific use case and needs to be judged on a case-by-case basis.
Templates
Templates may result in code bloat, yes, but they can also avoid it. If your alternative is introducing indirection through function pointers or virtual methods, then the templated function itself may become bigger in code size simply because function calls take several instructions and removes optimization potential.
Another aspect where they can at least not hurt is when used in conjunction with type erasure. The idea here is to write generic code, then put a small template wrapper around it that only provides type safety but does not actually emit any new code. Qt's QList is an example that does this to some extend.
This bare-bones vector type shows what I mean:
class VectorBase
{
protected:
void** start, *end, *capacity;
void push_back(void*);
void* at(std::size_t i);
void clear(void (*cleanup_function)(void*));
};
template<class T>
class Vector: public VectorBase
{
public:
void push_back(T* value)
{ this->VectorBase::push_back(value); }
T* at(std::size_t i)
{ return static_cast<T*>(this->VectorBase::at(i)); }
~Vector()
{ clear(+[](void* object) { delete static_cast<T*>(object); }); }
};
By carefully moving as much code as possible into the non-templated base, the template itself can focus on type-safety and to provide necessary indirections without emitting any code that wouldn't have been here anyway.
(Note: This is just meant as a demonstration of type erasure, not an actually good vector type)
Smart pointers
When written carefully, they won't generate much code that wouldn't be there anyway. Whether an inline function generates a delete statement or the programmer does it manually doesn't really matter.
The main issue that I see with those is that the programmer is better at reasoning about code and avoiding dead code. For example even after a unique_ptr has been moved away, the destructor of the pointer still has to emit code. A programmer knows that the value is NULL, the compiler often doesn't.
Another issue comes up with calling conventions. Objects with destructors are usually passed on the stack, even if you declare them pass-by-value. Same for return values. So a function unique_ptr<foo> bar(unique_ptr<foo> baz) will have higher overhead than foo* bar(foo* baz) simply because pointers have to be put on and off the stack.
Even more egregiously, the calling convention used for example on Linux makes the caller clean up parameters instead of the callee. That means if a function accepts a complex object like a smart pointer by value, a call to the destructor for that parameter is replicated at every call site, instead of putting it once inside the function. Especially with unique_ptr this is so stupid because the function itself may know that the object has been moved away and the destructor is superfluous; but the caller doesn't know this (unless you have LTO).
Shared pointers are a different beast altogether, simply because they allow a lot of different tradeoffs. Should they be atomic? Should they allow type casting, weak pointers, what indirection is used for destruction? Do you really need two raw pointers per shared pointer or can the reference counter be accessed through shared object?
Exceptions, RTTI
Generally avoided and removed via compiler flags.
Library components
On a bare-metal system, pulling in parts of the standard library can have a significant effect that can only be measured after the linker step. I suggest any such project use continuous integration and tracks the code size as a metric.
For example I once added a small feature, I don't remember which, and in its error handling it used std::stringstream. That pulled in the entire iostream library. The resulting code exceeded my entire RAM and ROM capacity. IIRC the issue was that even though exception handling was deactivated, the exception message was still being set up.
Move constructors and destructors
It's a shame that C++'s move semantics aren't the same as for example Rust's where objects can be moved with a simple memcpy and then "forgetting" their original location. In C++ the destructor for a moved object is still invoked, which requires more code in the move constructor / move assignment operator, and in the destructor.
Qt for example accounts for such simple cases in its meta type system.

What would be the proper implementation of a concept describing a raw pointer (to objects)?

This might be of academic interest only.
Because I noticed that my practical example might not be the suitable one to raise that question.
Originally I had written something like
auto bak_ptr{ptr}; // think of an `int* ptr`
to make a backup of an address.
Now I wondered if I could do better by replacing that dumb auto by a more meaningful, horizon broadening C++20 concept.
After some thought, the right choice in that specific example might just be
std::copyable auto bak_ptr{ptr};
because that's mirroring the intent of a backup, a copy of something.
But the academic question remaining is: What would be a proper (concepts standard library) implementation correctly mapping the behavior of the raw pointer concept? And let's restrict 'pointer' to pointer to objects for know, if necessary. There are also pointers to void, functions, members - perhaps complicating the issue.
I thought of a relationship to iterators. Hopefully not yet again answering my own question correctly: would std::contiguous_iterator be a valuable choice in cases?
Could someone please share a more or less complete expert view on that topic?
And why there might be no std::pointer_to_object standard library implementation for that concept yet (yelling the thing into your face)?
Just building up upon std::is_pointer there should be nothing wrong with
template <class T>
concept pointer =
std::is_pointer_v<T>;
Thanks to #cigien for that idea.
And I would say that in the simple backup example the choice of std::copyable instead is exactly right. As that's what you want to do with the backup, restore it again by copy.
To signal that bak_ptr is a raw pointer-to-object does not require any fancy shenanigans:
auto* bak_ptr{ptr};

std::auto_ptr<T> Usage

I've read a reasonable amount in decent textbooks about the auto_ptr class. While I understand what it is, and how it gets you around the problem of getting exceptions in places like constructors, I am having trouble figuring out when someone would actually use it.
An auto_ptr can only hold a single type (no array new[] initialization is supported). It changes ownership when you pass it into functions or try and duplicate it (it's not a reference counting smart pointer).
What is a realistic usage scenario for this class give its limitations? It seems like most of the textbook examples of its use are reaching because there isn't even a reason to be using a pointer over a stack variable in most of the cases...
Anyway, I'll stop my rant - but if you can provide a short example/description or a link to a good usage scenario for this I'd be grateful. I just want to know where I should use it in practice in case I come across the situation - I like to practice what I learn so I remember it.
I'll give you a short example for a good usage. Consider this:
auto_ptr<SomeResource> some_function() {
auto_ptr<SomeResource> my_ptr = get_the_resource();
function_that_throws_an_exception();
return my_ptr;
}
The function that raises an exception would normally cause your pointer to be lost, and the object pointed to would not be deleted. With the auto_ptr this can't happen, since it is destroyed when it leaves the frame it was created, if it hasn't been assigned (for example with return).
auto_ptr has been deprecated in the now finalized C++11 standard. Some of the replacements are already available through TR1 or the Boost libraries. Examples are shared_ptr and unique_ptr (scoped_ptr in boost).

Pointer vs variable in class

I know what is the difference and how they both work but this question is more about coding style.
Whenever I'm coding I make many classes, they all have variables and some of them are pointers and some are normal variables. I usually prefer variables to pointers if that members lasts as long as the class does but then my code becomes like this:
engine->camera.somevar->x;
// vs
engine->camera->somevar->x;
I don't like the dot in the middle. Or with private variables:
foo_.getName();
// vs
foo_->gatName();
I think that dot "disappears" in a long code. I find -> easier to read in some cases.
My question would be if you use pointers even if the variable is going to be created in the constructor and deleted in the destructor? Is there any style advice in this case?
P.S. I do think that dot is looks better in some cases.
First of all it is bad form to expose member variables.
Second your class should probably never container pointers.
Slight corolary: Classes that contain business logic should never have pointers (as this means they also contain pointer management code and pointer management code should be left to classes that have no business logic but are designed specifically for the purpose of managing pointers (smart pointers and containers).
Pointer management classes (smart pointers/containers) should be designed to manage a single pointer. Managing more than one is much more difficult than you expect and I have yet to find a situation where the extra complexity paid off.
Finally public members should not expose the underlying implementation (you should not provide access to members even via getters/setters). This binds the interface to tightly to the implementation. Instead your public interface should provide a set of actions that can be performed on the object. i.e. methods are verbs.
In C++ it is rare to see pointers.
They are generally hidden inside other classes. But you should get used to using a mixture of -> and . as it all depends on context and what you are trying to convey. As long as the code is clean and readable it does not matter too much.
A personal addendum:
I hate the _ at then end of your identifier it makes the . disapear foo_.getName() I think it would look a lot better as foo.getName()
If the "embedded" struct has exactly the same lifetime as the "parent" struct and it is not referenced anywhere else, I prefer to have it as a member, rather than use a pointer. The produced code is slightly more efficient, since it saves a number of calls to the memory allocator and it avoids a number of pointer dereferences.
It is also easier to handle, since the chance of pointer-related mistakes is reduced.
If, on the other hand, there is the slightest chance that the embedded structure may be referenced somewhere else I prefer to use a separate struct and pointers. That way I won't have to refactor my code if it turns out that the embedded struct needs to be pulled out from its parent.
EDIT:
I guess that means that I usually go with the pointer alternative :-)
EDIT 2:
And yes, my answer is assuming that you really want (or have) to chose between the two i.e. that you write C-style code. The proper object-oriented way to access class members is through get/set functions.
My comments regarding whether to include an actual class instance or a pointer/reference to one are probably still valid, however.
You should not make your choice because you find '->' easier to read :)
Using a member variable is usually better as you can not make mistakes with you pointer.
This said, using a member variable force you to expose your implementation, thus you have to use references. But then you have to initialize then in your constructor, which is not always possible ...
A solution is to use std::auto_ptr or boost::scoped_ptr ot similar smart pointer. There you will get advantage of both solution, with very little drawbacks.
my2c
EDIT:
Some useful links :
Article on std::auto_ptr
boost::scoped_ptr
Pimpl : private implementation
Ideally, you shouldn't use either: you should use getter/setter methods. The performance hit is minimal (the compiler will probably optimize it away, anyway).
The second consideration is that using pointers is a generally dangerous idea, because at some point you're likely to screw it up.
If neither of these faze you, then I'd say all that's left is a matter of personal preference.

Should I stop using auto_ptr?

I've recently started appreciating std::auto_ptr and now I read that it will be deprecated. I started using it for two situations:
Return value of a factory
Communicating ownership transfer
Examples:
// Exception safe and makes it clear that the caller has ownership.
std::auto_ptr<Component> ComponentFactory::Create() { ... }
// The receiving method/function takes ownership of the pointer. Zero ambiguity.
void setValue(std::auto_ptr<Value> inValue);
Despite the problematic copy semantics I find auto_ptr useful. And there doesn't seem to be an alternative for the above examples.
Should I just keep using it and later switch to std::unique_ptr? Or is it to be avoided?
It is so very very useful, despite it's flaws, that I'd highly recommend just continuing to use it and switching to unique_ptr when it becomes available.
::std::unique_ptr requires a compiler that supports rvalue references which are part of the C++0x draft standard, and it will take a little while for there to be really wide support for it. And until rvalue references are available, ::std::auto_ptr is the best you can do.
Having both ::std::auto_ptr and ::std::unique_ptr in your code might confuse some people. But you should be able to search and replace for ::std::unique_ptr when you decide to change it. You may get compiler errors if you do that, but they should be easily fixable. The top rated answer to this question about replacing ::std::auto_ptr with ::std::unique_tr has more details.
deprecated doesn't mean it's going away, just that there will be a better alternative.
I'd suggest keeping using it on current code, but using the new alternative on new code (New programs or modules, not small changes to current code). Consistency is important
I'd suggest you go use boost smart pointers.