I'm learning C++ while I run into this situation, where I want to implement an equivalently efficient version in C++ of the following symbolic code in C.
<header.h>
struct Obj;
Obj* create(...);
void do_some_thing(Obj*);
void do_another_thing(Obj*);
void destroy(Obj*);
The requirements are:
The implementation is provided in a library (static/dynamic) and the
header doesn't expose any detail other than the interface
It should be equally efficient
Exposing an interface (COM-like) with virtual functions doesn't qualify; that's a solution to enable polymorphism (more than one implementation exposed through the same interface) which isn't the case, and since I don't need the value it brings, I can't see why I should pay the cost of calling functions through 2 indirect pointers.
So my next thought was the pimpl idiom:
<header.h>
class Obj {
public:
Obj();
~Obj();
void do_some_thing();
void do_another_thing();
private:
class Impl;
smart_ptr<Impl> impl_; // ! What should I use here, unique_ptr<> or shared_ptr<> ?
};
shared_ptr<> doesn't seem to qualify, I would pay for unnecessary interlocked increment/decrement that didn't exist in the original implementation.
On the other hand unique_ptr<> makes Obj non-copyable. This means that the client can't call his own functions that take Obj by value, and Obj is merely a wrapper for a pointer, so essentially he can't pass pointers by value! He could do that in the original version. (passing by reference still doesn't qualify: he's still passing a pointer to a pointer)
So what should be the equally efficient way to implement this in C++?
EDIT:
I gave it some more thought and I came to this solution:
<header.h>
class ObjRef // I exist only to provide an interface to implementation
{ // (without virtual functions and their double-level indirect pointers)
public:
ObjRef();
ObjRef(ObjRef); // I simply copy pointers value
ObjRef operator=(ObjRef); // ...
void do_some_thing();
void do_another_thing();
protected:
class Impl;
Impl* impl_; // raw pointer here, I'm not responsible for lifetime management
};
class Obj : public ObjRef
{
Obj(Obj const&); // I'm not copyable
Obj& operator=(Obj const&); // or assignable
public:
Obj(Obj&&); // but I'm movable (I'll have to implement unique_ptr<> functionality)
Obj& operator=(Obj&&);
Obj();
~Obj(); // I destroy impl_
// and expose the interface of ObjRef through inheritance
};
Now I return to client Obj, and If client needs to distribute the usage of Obj by calling some other his functions, he can declare them as
void use_obj(ObjRef o);
// and call them:
Obj o = call_my_lib_factory();
use_obj(o);
Why not keep the C original? The reason that you didn't have to pay the reference counting premium in the C version is that the C version relied on the caller to keep any tally of the number of copies of Obj* in use.
By trying to ensure that the replacement is both copyable and that ensure that the underlying destroy method is only called once the last reference is destroyed you are imposing additional requirements over the original, so it is only natural that the proper solution (which seems to me to be shared_ptr) has some limited additional expense over the original.
I assume there's a reason why you need the a pointer to the object in the first place. Because the simplest, and most efficient approach, would be to just create it on the stack:
{
Obj x(...);
x.do_some_thing();
x.do_another_thing();
} // let the destructor handle `destroy()` when the object goes out of scope
But say you need it created on the heap, for whatever reason, most of it is just as simple, and just as efficient:
{
std::unique_ptr<Obj> x = Obj::create(...); // if you want a separate `create` function
std::unique_ptr<Obj> x = new Obj(...); // Otherwise, a plain constructor will do
x->do_some_thing();
x->do_another_thing();
} // as before, let the destructor handle destruction
There is no need for inheritance or interfaces or virtual functions. You're writing C++, not Java. One of the fundamental rules of C++ is "don't pay for what you don't use". C++ is as performant as C by default. You don't have to do anything special to achieve good performance. All you have to do is not use the features which have a performance cost if you don't need them.
You didn't need reference counting in your C version, so why would you use reference counting in the C++ version (with shared_ptr)? You didn't need the functionality that virtual functions provide in the C version (where it would be implemented through function pointers), so why would you make anything virtual in the C++ function?
But C++ lets you tidy up the create/destroy stuff in particular, and that costs you nothing. So use that. Create the object in its constructor, and let its destructor destroy it. And just place the object in an appropriate scope, so it'll go out of scope when you want it destroyed. And it gives you nicer syntax for calling "member" functions, so make use of that too.
On the other hand unique_ptr<> makes Obj non-copyable. This means that the client can't call his own functions that take Obj by value, and Obj is merely a wrapper for a pointer, so essentially he can't pass pointers by value! He could do that in the original version. (passing by reference still doesn't qualify: he's still passing a pointer to a pointer)
The client can take Obj by value if you use move semantics. But if you want the object to be copied, then let it be copied. Use my first example, and create the object on the stack, and just pass it by value when it needs to be passed by value. It it contains any complex resources (such as pointers to allocated memory), then be sure to write an appropriate copy constructor and assignment operator, and then you can create the object (on the stack, or wherever you need it), pass it around as you desire, by value, by reference, by wrapping it in a smart pointer, or by moving it (if you implement the necessary move constructor and move assignment operator).
Related
I am trying to hold a unique pointer to both a vector (from a Base class) and a Derivated class object, so later I am able to call a method for all. However I want to call aswell methods from Derivated classes so I need to store the reference for them aswell.
class Foo
{
vector<unique_ptr<Base>> bar;
unique_ptr<Derivated1> bar2;
unique_ptr<Derivated2> bar3;
}
Foo::Foo(){
this->bar2 = make_unique<Derivated1>();
this->bar2->doSomethingDerivated1();
this->bar3 = make_unique<Derivated2>();
this->bar->push_back(move(bar2));
this->bar->push_back(move(bar3));
}
Foo::callForAll() {
for (const auto& foo: this->bar) foo->doSomethingBase();
}
Foo::callForDerivated1() {
this->bar2->doSomethingDerivated1();
}
Is something like this possible? For my understanding this code will most likely fail. Will move place bar2 and bar3 to nullptr? Creating a unique_ptr, store a raw pointer with bar2->get() and then push_back to vector works?
Note: These objects only belong to this class, so unique would make sense.
No. The whole point of unique_ptr is to be unique. This means you cannot copy it.
However, you CAN make copies of the underlying pointer that unique_ptr manages. E.g.
unique_ptr<Foo> foo(new Foo);
Foo* foo_copy = foo.get();
There are a couple things that you need to be careful of when doing this:
Do not delete foo_copy. That is the job of unique_ptr. If you delete foo_copy, then you will have committed the sin of double delete, which has undefined behavior (i.e. the compiler is allowed to generate code that launches a nuclear missile).
Do not use foo_copy after foo has been destroyed because when foo is destroyed, the underlying pointer is deleted (unless foo has relinquished ownership of the underlying pointer, and the underlying pointer hasn't been deleted by some other means yet). This is the sin of use after free, and it also has UB.
Moving is one way for unique_ptr to relinquish ownership. Once you move out of a unique_ptr, it no longer points to anything (that is the whole point of moving; when you walk from A to B, you are no longer at A, because there is only one of you, and you have decided to instead be located at B). I believe that trying to call a method using the -> operator on an empty unique_ptr is also UB.
It seems like what you should do is
vector<unique_ptr<Base>> v;
Derived* ob = v[i].get();
Let me reiterate that this is slightly dangerous and unusual.
Tangent: I find it highly suspicious that you are not using virtual methods; virtual really should be the default (kind of like how double is the default, not float, even though float looks like it is and should be the default). Actually, I find it suspicious that you are using inheritance. Inheritance is one of the most over-rated features of all time. I suspect that this is where your troubles may be originating. For example, the Go language doesn't even have inheritance, yet it does have polymorphism (i.e. different implementations of the same interface).
PS: Please, do not accidentally extinguish the human race by invoking UB.
There is a little code example here:
struct Data {
};
struct Init {
Data *m_data;
Init() : m_data(new Data) { }
~Init() {
delete m_data;
}
};
class Object {
private:
int m_initType;
Data *m_data;
public:
Object(const Init &init) : m_initType(0), m_data(init.m_data) { }
Object(Init &&init) : m_initType(1), m_data(init.m_data) { init.m_data = nullptr; }
~Object() {
if (m_initType==1) {
delete m_data;
}
}
};
Object can be initialized two ways:
const Init &: this initialization just stores m_data as a pointer, m_data is not owned, so ~Object() doesn't have to do anything (in this case, m_data will be destroyed at ~Init())
Init &&: this initialization transfers ownership of m_data, Object becomes the owner of m_data, so ~Object() needs to destroy it
Now, there is a function:
void somefunction(Object object);
This function is called in callInitA and callInitB:
void callInitA() {
Init x;
somefunction(x); // calls the "const Init &" constructor
}
void callInitB() {
somefunction(Init()); // calls the "Init &&" constructor
}
Now, here's what I'd like to accomplish: in the callInitA case, I'd like to make the compiler to optimize away the destructor call of the resulting temporary Object (Object is used frequently, and I'd like to decrease code size).
However, the compiler doesn't optimize it away (tested with GCC and clang).
Object is designed so it doesn't have any functions which alter m_initType, so the compiler would be able to find out that if m_initType is set to 0 at construct time, then it won't change, so at the destructor it is still be 0 -> no need to call destructor at all, as it would do nothing.
Even, m_initType is an unnecessary member of Object: it is only needed at destruct time.
Do you have any design ideas how to accomplish this?
UPDATE: I mean that using some kind of c++ construct (helper class, etc.). C++ is a powerful language, maybe with some kind of c++ trickery this can be done.
(My original problem is more complex that this simplified one: Object can be initialized with other kind of Init structures, but all Objects constructors boils down to getting a "Data*" somehow)
void callInitA() {
Init x;
somefunction(x); // calls the "const Init &" constructor
}
The destruction of x cannot be optimized away, regardless of the contents of Init. Doing so would violate the design of the language.
It's not just a matter of whether Init contains resources or not. Init x, like all objects, will allocate space on the stack that later needs to be cleaned up, as an implicit (not part of code that you yourself write) part of the destructor. It's impossible to avoid.
If the intention is for x to be an object that somefunction can call without having to repeatedly create and delete references to x, you should be handling it like this:
void callInitA(Init & x) { //Or Init const& x
somefunction(x); // calls the "const Init &" constructor
}
A few other notes:
Make sure you implement the Rule of Five (sometimes known as Rule of Three) on any object that owns resources.
You might consider wrapping all pointers inside std::unique_ptr, as it doesn't seem like you need functionality beyond what std::unique_ptr offers.
Your m_initType actually distinguishes between two kinds of Objects - those which own their memory and those which don't. Also, you mention that actually there are many kinds of Objects which can be initialized with all sorts of inputs; so actually there are all sorts of Objects. That would suggest Object should better be some abstract base class. Now, that wouldn't speed anything up or avoid destructor calls, but it might make your design more reasonable. Or maybe Object could be an std::variant (new in C++17, you can read up on it).
But then, you say that temporary Objects are "used frequently". So perhaps you should go another way: In your example, suppose you had
template <bool Owning> class Object;
which you would then specialize for the non-owning case, with only a const Init& constructor and default destruction, and the owning case, with only an Init&& constructor (considering the two you mentioned) and a destructor which deletes. This would mean templatizing the code that uses Object, which many mean larger code size, as well as having to know what kind of Objects you pass in; but if would avoid the condition check if that really bugs you so much.
I'd like to decrease code size
I kind of doubt that you do. Are you writing code for an embedded system? In that case it's kind of strange you use lots of temporary Objects which are sort-of polymorphic.
I'm using an old open-source library, with the following (simplified) API of interest:
// some class that holds a raw pointer to memory on the heap
// DOES NOT delete it in its destructor
// DOES NOT do a "deep" copy when copied/assigned (i.e., after copying both objects
// will point to the same address)
class Point;
// function used to construct a point and allocate its data on the heap
Point AllocPoint();
// function used to release the memory of the point's data
void DeallocPoint(Point& p);
// Receives a pointer/c-array of Points, along with the number of points
// Doesn't own the memory
void Foo(Point* points, int npts);
What's the best (safest/most readable/most elegant) way of using this API in C++11. I can't simply use vector<unique_ptr<Point, PointDeleter>> (where PointDeleter is a simple custom deleter I can implement), because then I will not be able to use the function Foo (which expects Point* and not unique_ptr<Point>*).
Thanks
If you really want to make it look nice, you're probably going to have to write a set of really comprehensive wrappers which completely hide the library's API - effectively, wrap the entire library with one that behaves in a modern C++ way on the outside and hides all the mess inside.
Not a pleasant task, but if you can get the behaviour of that library right then it should make your life a lot easier in the long term. Might not be worth it if you're not going to use this external library very extensively though.
I would wrap this non-RAII C-like API in RAII building blocks, and then use them in C++11 code.
For example: you can define a RaiiPoint class that wraps the (non-RAII) Point class, and in its constructor calls AllocPoint(), in the destructor DeallocPoint(). Then you can define proper copy constructor and copy operator=, or just implement move semantics (with move constructor and move operator=), or make the wrapper class both copyable and movable, basing on your requirements.
Then you can simply use a std::vector<RaiiPoint> with your RAII-based wrapper class.
(This is a general approach that you can use when you want to use C libraries in modern C++ code: you can wrap the "raw" C library handles and objects in safe RAII boundaries, and use these robust safe wrapper classes in your modern C++ code.)
You can use std::vector<Point>, calling Foo( &v[0],
v.size() ). But managing the memory here could be tricky,
since Point apparently doesn't provide any clean copy and
assignment; a custom deleter in the allocator will be called for
each element, even if it is copied.
If the vector should actually own the points, then you can wrap
it in a more complex class, which calls AllocPoint for each
insertion (and inserts the results), and DeallocPoint for each
removal (and for everything remaining in the vector on
destruction). This class should not allow write access to the
Point (non-const operator[], non-const iterators, etc.),
however, since this would allow changing any pointers in
Point, and loosing what is needed for DeallocPoint to work
correctly. Presumably, there other functions for manipulating
Point; you'll have to arrange for these to be available
through the wrapper interface.
"You" could write a simple wrapper to free the memory:
struct PointVectorWrapper {
vector<Point> points;
~PointVectorWrapper() {
for (Point& p : points) {
DeallocPoint(p);
}
}
PointVectorWrapper& operator=(const PointVectorWrapper&) = delete;
PointVectorWrapper(const PointVectorWrapper&) = delete;
};
// Now the usage is simple and safe:
PointVectorWrapper points;
// ... populate points ...
Foo(points.data(), points.size())
But this seems a little "adhoc". What's a more standard/reusable solution?
You could use a standard vector with a custom allocator, that invoke AllocPoint on construct method and DeallocPoint() on destruct method.
template<typename T>
class CustomAllocator : public std::allocator<T>
{
//Rebind and constructors
};
template<>
class CustomAllocator<Point> : public std::allocator<Point>
{
//Rebind and constructors
//For c++11
void construct( pointer p )
{
new (p) Point();
*p = AllocPoint();
}
void construct( pointer p, const_reference val )
{
construct(p);
//copy member from val to point if neccessary
};
void destroy( pointer p )
{
DeallocPoint(*p);
p->~Point();
}
};
typedef std::vector<Point, CustomAllocator<Point> > PointVector;
I have some code that currently uses raw pointers, and I want to change to smart pointers. This helps cleanup the code in various ways. Anyway, I have factory methods that return objects and its the caller's responsibility to manager them. Ownership isn't shared and so I figure unique_ptr would be suitable. The objects I return generally all derive from a single base class, Object.
For example,
class Object { ... };
class Number : public Object { ... };
class String : public Object { ... };
std::unique_ptr<Number> State::NewNumber(double value)
{
return std::unique_ptr<Number>(new Number(this, value));
}
std::unique_ptr<String> State::NewString(const char* value)
{
return std::unique_ptr<String>(new String(this, value));
}
The objects returned quite often need to be passed to another function, which operates on objects of type Object (the base class). Without any smart pointers the code is like this.
void Push(const Object* object) { ... } // push simply pushes the value contained by object onto a stack, which makes a copy of the value
Number* number = NewNumber(5);
Push(number);
When converting this code to use unique_ptrs I've run into issues with polymorphism. Initially I decided to simply change the definition of Push to use unique_ptrs too, but this generates compile errors when trying to use derived types. I could allocate objects as the base type, like
std::unique_ptr<Object> number = NewNumber(5);
and pass those to Push - which of course works. However I often need to call methods on the derived type. In the end I decided to make Push operate on a pointer to the object stored by the unique_ptr.
void Push(const Object* object) { ... }
std::unique_ptr<Object> number = NewNumber(5);
Push(number.get());
Now, to the reason for posting. I'm wanting to know if this is the normal way to solve the problem I had? Is it better to have Push operate on the unique_ptr vs the object itself? If so how does one solve the polymorphism issues? I would assume that simply casting the ptrs wouldn't work. Is it common to need to get the underlying pointer from a smart pointer?
Thanks, sorry if the question isn't clear (just let me know).
edit: I think my Push function was a bit ambiguous. It makes a copy of the underlying value and doesn't actually modify, nor store, the input object.
Initially I decided to simply change the definition of Push to use
unique_ptrs too, but this generates compile errors when trying to use
derived types.
You likely did not correctly deal with uniqueness.
void push(std::unique_ptr<int>);
int main() {
std::unique_ptr<int> i;
push(i); // Illegal: tries to copy i.
}
If this compiled, it would trivially break the invariant of unique_ptr, that only one unique_ptr owns an object, because both i and the local argument in push would own that int, so it is illegal. unique_ptr is move only, it's not copyable. It has nothing to do with derived to base conversion, which unique_ptr handles completely correctly.
If push owns the object, then use std::move to move it there. If it doesn't, then use a raw pointer or reference, because that's what you use for a non-owning alias.
Well, if your functions operate on the (pointed to) object itself and don't need its address, neither take any ownership, and, as I guess, always need a valid object (fail when passed a nullptr), why do they take pointers at all?
Do it properly and make them take references:
void Push(const Object& object) { ... }
Then the calling code looks exactly the same for raw and smart pointers:
auto number = NewNumber(5);
Push(*number);
EDIT: But of course no matter if using references or pointers, don't make Push take a std::unique_ptr if it doesn't take ownership of the passed object (which would make it steal the ownership from the passed pointer). Or in general don't use owning pointers when the pointed to object is not to be owned, std::shared_ptr isn't anything different in this regard and is as worse a choice as a std::unique_ptr for Push's parameter if there is no ownership to be taken by Push.
If Push does not take owenrship, it should probably take reference instead of pointer. And most probably a const one. So you'll have
Push(*number);
Now that's obviously only valid if Push isn't going to keep the pointer anywhere past it's return. If it does I suspect you should try to rethink the ownership first.
Here's a polymorphism example using unique pointer:
vector<unique_ptr<ICreature>> creatures;
creatures.emplace_back(new Human);
creatures.emplace_back(new Fish);
unique_ptr<vector<string>> pLog(new vector<string>());
for each (auto& creature in creatures)
{
auto state = creature->Move(*pLog);
}
I have objects which create other child objects within their constructors, passing 'this' so the child can save a pointer back to its parent. I use boost::shared_ptr extensively in my programming as a safer alternative to std::auto_ptr or raw pointers. So the child would have code such as shared_ptr<Parent>, and boost provides the shared_from_this() method which the parent can give to the child.
My problem is that shared_from_this() cannot be used in a constructor, which isn't really a crime because 'this' should not be used in a constructor anyways unless you know what you're doing and don't mind the limitations.
Google's C++ Style Guide states that constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method. This solves the 'this-in-constructor' problem as well as a few others as well.
What bothers me is that people using your code now must remember to call Init() every time they construct one of your objects. The only way I can think of to enforce this is by having an assertion that Init() has already been called at the top of every member function, but this is tedious to write and cumbersome to execute.
Are there any idioms out there that solve this problem at any step along the way?
Use a factory method to 2-phase construct & initialize your class, and then make the ctor & Init() function private. Then there's no way to create your object incorrectly. Just remember to keep the destructor public and to use a smart pointer:
#include <memory>
class BigObject
{
public:
static std::tr1::shared_ptr<BigObject> Create(int someParam)
{
std::tr1::shared_ptr<BigObject> ret(new BigObject(someParam));
ret->Init();
return ret;
}
private:
bool Init()
{
// do something to init
return true;
}
BigObject(int para)
{
}
BigObject() {}
};
int main()
{
std::tr1::shared_ptr<BigObject> obj = BigObject::Create(42);
return 0;
}
EDIT:
If you want to object to live on the stack, you can use a variant of the above pattern. As written this will create a temporary and use the copy ctor:
#include <memory>
class StackObject
{
public:
StackObject(const StackObject& rhs)
: n_(rhs.n_)
{
}
static StackObject Create(int val)
{
StackObject ret(val);
ret.Init();
return ret;
}
private:
int n_;
StackObject(int n = 0) : n_(n) {};
bool Init() { return true; }
};
int main()
{
StackObject sObj = StackObject::Create(42);
return 0;
}
Google's C++ programming guidelines have been criticized here and elsewhere again and again. And rightly so.
I use two-phase initialization only ever if it's hidden behind a wrapping class. If manually calling initialization functions would work, we'd still be programming in C and C++ with its constructors would never have been invented.
Depending on the situation, this may be a case where shared pointers don't add anything. They should be used anytime lifetime management is an issue. If the child objects lifetime is guaranteed to be shorter than that of the parent, I don't see a problem with using raw pointers. For instance, if the parent creates and deletes the child objects (and no one else does), there is no question over who should delete the child objects.
KeithB has a really good point that I would like to extend (in a sense that is not related to the question, but that will not fit in a comment):
In the specific case of the relation of an object with its subobjects the lifetimes are guaranteed: the parent object will always outlive the child object. In this case the child (member) object does not share the ownership of the parent (containing) object, and a shared_ptr should not be used. It should not be used for semantic reasons (no shared ownership at all) nor for practical reasons: you can introduce all sorts of problems: memory leaks and incorrect deletions.
To ease discussion I will use P to refer to the parent object and C to refer to the child or contained object.
If P lifetime is externally handled with a shared_ptr, then adding another shared_ptr in C to refer to P will have the effect of creating a cycle. Once you have a cycle in memory managed by reference counting you most probably have a memory leak: when the last external shared_ptr that refers to P goes out of scope, the pointer in C is still alive, so the reference count for P does not reach 0 and the object is not released, even if it is no longer accessible.
If P is handled by a different pointer then when the pointer gets deleted it will call the P destructor, that will cascade into calling the C destructor. The reference count for P in the shared_ptr that C has will reach 0 and it will trigger a double deletion.
If P has automatic storage duration, when it's destructor gets called (the object goes out of scope or the containing object destructor is called) then the shared_ptr will trigger the deletion of a block of memory that was not new-ed.
The common solution is breaking cycles with weak_ptrs, so that the child object would not keep a shared_ptr to the parent, but rather a weak_ptr. At this stage the problem is the same: to create a weak_ptr the object must already be managed by a shared_ptr, which during construction cannot happen.
Consider using either a raw pointer (handling ownership of a resource through a pointer is unsafe, but here ownership is handled externally so that is not an issue) or even a reference (which also is telling other programmers that you trust the referred object P to outlive the referring object C)
A object that requires complex construction sounds like a job for a factory.
Define an interface or an abstract class, one that cannot be constructed, plus a free-function that, possibly with parameters, returns a pointer to the interface, but behinds the scenes takes care of the complexity.
You have to think of design in terms of what the end user of your class has to do.
Do you really need to use the shared_ptr in this case? Can the child just have a pointer? After all, it's the child object, so it's owned by the parent, so couldn't it just have a normal pointer to it's parent?