I have been working on replacing raw pointers with reference-counted pointers that expose only a const version of the underlying one. My objective is to reduce memory usage (and time spent unnecessarily constructing and destructing complex objects) without putting myself in a situation where any code has access to memory that it does not own. I am aware of the circular reference problem with reference counting, but my code should never create such a situation.
Requiring const-ness works because I am using a system in which classes generally expose no non-const members, and rather than altering an object you must call a method on it that returns a new object that results from the alteration. This is probably a design pattern, but I do not know it by name.
My problem arises when I have a method that returns a pointer to an object of its own type, which is sometimes the object itself. Previously it looked something like this:
const Foo * Foo::GetAfterModification( const Modification & mod ) const
{
if( ChangesAnything( mod ) )
{
Foo * asdf = new Foo;
asdf->DoModification( mod );
return asdf;
}
else
return this;
}
I cannot see a good way to make this return a smart pointer. The naive approach would be something like return CRcPtr< Foo >( this ), but this breaks the ownership semantics because what I am returning and whoever previously owned the object now each think they have ownership but do not know about each other. The only safe thing to do would be return CRcPtr< Foo >( new Foo( *this ) ), but that defeats my intention of restricting unnecessary memory use.
Is there some way to safely return a smart pointer without allocating any additional memory? My suspicion is that there is not. If there were, how would it work if the object had been allocated on the stack? This question seems related, but is not the same because he can just make his functions take raw pointers as parameters and because he is using the boost library.
For reference, my home-rolled smart pointer implementation is below. I am sure it could be more robust, but it is more portable than depending on boost or tr1 being available everywhere, and until this issue it has worked well for me.
template <class T>
class CRcPtr
{
public:
explicit CRcPtr( T * p_pBaldPtr )
{
m_pInternal = p_pBaldPtr;
m_iCount = new unsigned short( 1 );
}
CRcPtr( const CRcPtr & p_Other )
{ Acquire( p_Other ); }
template <class U>
explicit CRcPtr( const CRcPtr< U > & p_It )
{
m_pInternal = dynamic_cast< T * >( p_It.m_pInternal );
if( m_pInternal )
{
m_iCount = p_It.m_iCount;
(*m_iCount)++;
}
else
m_iCount = new unsigned short( 1 );
}
~CRcPtr()
{ Release(); }
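CRcPtr & operator=( const CRcPtr & p_Other )
{
// Guard against self-assignment: Release() could delete the object we then re-acquire.
if( this != &p_Other )
{
Release();
Acquire( p_Other );
}
return *this;
}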
const T & operator*() const
{ return *m_pInternal; }
const T * operator->() const
{ return m_pInternal; }
const T * get() const
{ return m_pInternal; }
private:
void Release()
{
(*m_iCount)--;
if( *m_iCount == 0 )
{
delete m_pInternal;
delete m_iCount;
m_pInternal = 0;
m_iCount = 0;
}
}
void Acquire( const CRcPtr & p_Other )
{
m_pInternal = p_Other.m_pInternal;
m_iCount = p_Other.m_iCount;
(*m_iCount)++;
}
template <class U>
friend class CRcPtr;
T * m_pInternal;
unsigned short * m_iCount;
};
template <class U, class T>
CRcPtr< U > ref_cast( const CRcPtr< T > & p_It )
{ return CRcPtr< U >( p_It ); }
Edit: Thanks for the replies. I was hoping to avoid using boost or tr1, but I recognize that not using well tested libraries is generally unwise. I am fairly sure that what I have implemented is not similar to std::auto_ptr, but rather is similar to tr1::shared_ptr except that it only exposes a const version of the internal pointer and lacks some of the features in the official version. I really would like to avoid an intrusive scheme like the one proposed by Gian Paolo. I am aware that my implementation is not threadsafe, but this is a single-threaded application.
Take a look at the source for Boost shared pointers. If you derive a class from enable_shared_from_this<T>, you can then call the shared_from_this() member function to do that sort of thing.
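A minimal sketch of the shared_from_this() suggestion, written against std::enable_shared_from_this (the standard-library counterpart of the Boost class) rather than the asker's CRcPtr. The WithValue method and the int payload are hypothetical stand-ins for GetAfterModification and Modification. It assumes the object is already owned by a shared_ptr; for a stack-allocated object shared_from_this() throws std::bad_weak_ptr since C++17 and is undefined behavior before that.

```cpp
#include <cassert>
#include <memory>

// Hypothetical immutable type, loosely modelled on the Foo in the question.
class Foo : public std::enable_shared_from_this<Foo> {
public:
    explicit Foo(int value) : value_(value) {}
    int value() const { return value_; }

    // Returns this very object when nothing changes, otherwise a new one.
    // Precondition: *this is already owned by a std::shared_ptr.
    std::shared_ptr<const Foo> WithValue(int v) const {
        if (v == value_)
            return shared_from_this();  // joins the existing reference count
        return std::make_shared<Foo>(v);
    }

private:
    int value_;
};

// Usage: the "unchanged" case allocates nothing and shares ownership.
inline void shared_from_this_usage() {
    std::shared_ptr<Foo> a = std::make_shared<Foo>(1);
    std::shared_ptr<const Foo> same = a->WithValue(1);
    assert(same.get() == a.get());
    assert(a.use_count() == 2);
}
```

Because the const overload of shared_from_this() returns shared_ptr<const Foo>, the const-only access the question asks for falls out naturally.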
As far as semantics go, the great thing about immutability is that you can safely share data between objects, threads, and not have to worry about people changing values you depend on. You shouldn't need to worry about ownership so long as your refcounting scheme is correct (I didn't read yours, but use Boost anyway).
As alluded to by Ferruccio, you need some form of shared pointer, i.e. one that can be shared (safely!) between multiple objects. One way to make this work is to derive all your objects (at least those that are intended for use with the shared pointer) from a class that implements the actual reference count. That way, the actual pointer itself carries the current count with it, and you share the pointer between as many different objects as you like.
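The intrusive scheme described above could look roughly like the sketch below. All names (RefCountedBase, CIntrusivePtr, Node, WithValue) are hypothetical and not taken from any library. Because the count lives inside the object, wrapping this in a new smart pointer is safe, but only for heap-allocated objects, since the last release calls delete.

```cpp
#include <cassert>

// Hypothetical base class: the reference count is stored in the object itself.
class RefCountedBase {
public:
    RefCountedBase() : m_count(0) {}
    RefCountedBase(const RefCountedBase &) : m_count(0) {}  // a copy starts unowned
    RefCountedBase & operator=(const RefCountedBase &) { return *this; }
    void AddRef() const { ++m_count; }
    void ReleaseRef() const { if (--m_count == 0) delete this; }
    unsigned UseCount() const { return m_count; }
protected:
    virtual ~RefCountedBase() {}
private:
    mutable unsigned m_count;
};

// Const-only smart pointer over types derived from RefCountedBase.
template <class T>
class CIntrusivePtr {
public:
    explicit CIntrusivePtr(const T * p = 0) : m_p(p) { if (m_p) m_p->AddRef(); }
    CIntrusivePtr(const CIntrusivePtr & o) : m_p(o.m_p) { if (m_p) m_p->AddRef(); }
    ~CIntrusivePtr() { if (m_p) m_p->ReleaseRef(); }
    CIntrusivePtr & operator=(const CIntrusivePtr & o) {
        if (o.m_p) o.m_p->AddRef();  // acquire first so self-assignment is safe
        if (m_p) m_p->ReleaseRef();
        m_p = o.m_p;
        return *this;
    }
    const T * operator->() const { return m_p; }
    const T * get() const { return m_p; }
private:
    const T * m_p;
};

// Hypothetical immutable type that can hand out a pointer to itself.
class Node : public RefCountedBase {
public:
    explicit Node(int v) : m_v(v) {}
    int Value() const { return m_v; }
    CIntrusivePtr<Node> WithValue(int v) const {
        if (v == m_v)
            return CIntrusivePtr<Node>(this);  // safe: the count travels with *this
        return CIntrusivePtr<Node>(new Node(v));
    }
private:
    int m_v;
};

inline void intrusive_usage() {
    CIntrusivePtr<Node> a(new Node(1));
    CIntrusivePtr<Node> same = a->WithValue(1);
    assert(same.get() == a.get());
    assert(a->UseCount() == 2);
}
```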
What you currently have is very similar to std::auto_ptr and you are bumping into the same issues with ownership. Google it and you should find some useful info.
Be aware that there are additional complications with shared pointers: in particular, in multi-threaded environments your reference counting must be atomic, and self-assignment must be handled so that the internal count is not incremented in a way that leaves the object never destroyed.
Again, Google is your friend here. Just look for information on shared pointers in C++ and you'll find tonnes of it.
So I've solved this problem, but I need your opinion if what I did is best practice.
A simple class holds a vector of unique_ptrs to order objects. I will explain the member variable null_unique below.
class order_collection {
typedef std::unique_ptr<order> ord_ptr;
typedef std::vector<ord_ptr> ord_ptr_vec;
ord_ptr_vec orders;
ord_ptr null_unique;
public:
...
const ord_ptr & find_order(std::string);
....
So I need the users of this class to get access to the order unique_ptr if found. However I'm not going to move the object out of the vector so I'm returning the unique_ptr as const ref. My implementation of the find_order method:
const order_collection::ord_ptr & order_collection::find_order(std::string id) {
auto it = std::find_if(orders.begin(),orders.end(),
[&](const order_collection::ord_ptr & sptr) {
return sptr->getId() == id;
});
if (it == orders.end())
return null_unique; // can't return nullptr here
return *it;
}
Since I'm returning by reference, I can't return a nullptr. If I try to do so, I get the warning "returning reference to a temporary", and if nothing is found the program crashes. So I added a unique_ptr<order> member variable called null_unique, and I return it when find doesn't find an order. This solves the problem: the warning is gone and the program doesn't crash when no order is found.
However, I'm doubting my solution, as it makes my class ugly. Is this the best practice for handling this situation?
You should only return and accept smart pointers when you care about their ownership semantics. If you only care about what they're pointing to, you should instead return a reference or a raw pointer.
Since you're returning a dummy null_unique, it is clear that the caller of the method doesn't care about the ownership semantics. You also need a null state, so you should return a raw pointer:
order* order_collection::find_order(std::string id) {
auto it = std::find_if(orders.begin(),orders.end(),
[&](const order_collection::ord_ptr & sptr) {
return sptr->getId() == id;
});
if (it == orders.end())
return nullptr;
return it->get();
}
It doesn't really make sense to return a unique_ptr here, reference or otherwise. A unique_ptr implies ownership over the object, and those aren't really the semantics being conveyed by this code.
As suggested in the comments, simply returning a raw pointer is fine here, provided that your Project Design explicitly prohibits you or anyone on your team from calling delete or delete[] outside the context of the destructor of a Resource-owning object.
Alternatively, if you have access to either Boost or C++17, a std::optional<std::reference_wrapper<order>> might be the ideal solution.
std::optional<std::reference_wrapper<order>> order_collection::find_order(std::string id) {
auto it = std::find_if(orders.begin(),orders.end(),
[&](const order_collection::ord_ptr & sptr) {
return sptr->getId() == id;
});
if (it == orders.end())
return {}; //empty optional object
return **it; //will implicitly convert to the correct object type.
}
/*...*/
void func() {
auto opt = collection.find_order("blah blah blah");
if(!opt) return;
order & ord = opt->get();
/*Do whatever*/
}
(EDIT: In testing on the most recent version of MSVC 2017, it looks like std::reference_wrapper<T> will happily do an implicit conversion to T& if you tell it to. So replacing opt->get() with *opt should work exactly the same.)
As long as I'm here, I might point out that a std::vector<std::unique_ptr<type>> object has a very "Code Smell" sense to it. std::vector<type> implies ownership of the object as is, so unless you have a good reason to prefer this (maybe the objects are large, unmovable/uncopyable, and you need to insert and remove entries frequently? Maybe this is a polymorphic type?), you're probably better off reducing this to a simple std::vector<type>.
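As a rough sketch of that simplification, the collection below stores order objects by value and find_order returns a non-owning pointer. The order class and the add method are hypothetical stand-ins, since the question does not show them. One caveat to keep in mind: a pointer into a std::vector is invalidated when the vector reallocates, so callers should not hold it across insertions.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the order class from the question.
class order {
public:
    explicit order(std::string id) : m_id(std::move(id)) {}
    const std::string & getId() const { return m_id; }
private:
    std::string m_id;
};

class order_collection {
public:
    void add(order o) { orders.push_back(std::move(o)); }

    // Non-owning result; nullptr means "not found".
    order * find_order(const std::string & id) {
        auto it = std::find_if(orders.begin(), orders.end(),
            [&](const order & o) { return o.getId() == id; });
        return it == orders.end() ? nullptr : &*it;
    }

private:
    std::vector<order> orders;  // the vector itself owns the elements
};

inline void order_collection_usage() {
    order_collection c;
    c.add(order("a1"));
    order * hit = c.find_order("a1");
    assert(hit != nullptr && hit->getId() == "a1");
}
```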
EDIT:
The boost version is subtly different, because boost::optional has no restrictions against "optional references", which are specifically forbidden by the C++ Standard Library's version of std::optional. The boost version is actually going to be slightly simpler:
//return type changes, nothing else changes
boost::optional<order&> order_collection::find_order(std::string id) {
auto it = std::find_if(orders.begin(),orders.end(),
[&](const order_collection::ord_ptr & sptr) {
return sptr->getId() == id;
});
if (it == orders.end())
return {}; //empty optional object
return **it; //will implicitly convert to the correct object type.
}
/*...*/
//Instead of calling opt->get(), we use *opt instead.
void func() {
auto opt = collection.find_order("blah blah blah");
if(!opt) return;
order & ord = *opt;
/*Do whatever*/
}
Say I have an Object:
class Object{
public:
Object(std::vector<int>stuff){
}
};
Each of these objects is only accessible from a class Foo:
class Foo{
public:
std::unordered_map<int,Object> _objects;
bool getObjectForId(const int& objectId,Object& rep){
bool found = false;
std::unordered_map<int, Object>::const_iterator got = _objects.find(objectId);
if(got != _objects.end()){
found = true;
rep = _objects[objectId];
}
return found;
}
};
In some other class I will try to get a reference to an object by doing:
class Other{
private:
Foo myFoo;
public:
void changeSomeObjectProperty(const int& objectId){
Object rep;
bool gotIt = myFoo.getObjectForId(objectId,rep);
//Then I will do some stuff with the rep, if gotIt is true
}
};
Does this pattern make sense ? I do not want a copy of the object. I want a reference to the object, but I am trying to avoid using pointers...
I'd plump for boost::optional as it conforms to the direction in which idiomatic C++ code is heading.
It will be adopted into the C++ standard from C++17 onwards as std::optional. For more details, see http://en.cppreference.com/w/cpp/utility/optional.
If you're reluctant to use the boost library or the timescales in migrating your toolchain to a C++17 standards compliant compiler are too long, then you could handcode the relevant functionality of std::optional in a few lines of code.
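A sketch of the hand-coded route, assuming only the "optional reference" case from the question is needed. OptionalRef, the simplified Object, and the add method are hypothetical names introduced for illustration; this is not a substitute for the full std::optional interface.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical minimal "optional reference": either empty or bound to an object.
template <typename T>
class OptionalRef {
public:
    OptionalRef() : m_ptr(nullptr) {}
    explicit OptionalRef(T & ref) : m_ptr(&ref) {}
    explicit operator bool() const { return m_ptr != nullptr; }
    T & operator*() const { assert(m_ptr); return *m_ptr; }
    T * operator->() const { assert(m_ptr); return m_ptr; }
private:
    T * m_ptr;  // non-owning
};

// Simplified stand-in for the Object in the question.
struct Object { int value; };

class Foo {
public:
    void add(int id, int value) { _objects[id] = Object{value}; }

    // No copy is made: the result refers to the element stored in the map.
    OptionalRef<Object> getObjectForId(int id) {
        auto it = _objects.find(id);
        if (it == _objects.end())
            return OptionalRef<Object>();
        return OptionalRef<Object>(it->second);
    }

private:
    std::unordered_map<int, Object> _objects;
};
```

The caller tests the result before use, for example `if (auto rep = myFoo.getObjectForId(id)) rep->value = 42;`, and any change lands in the stored object rather than in a copy.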
Returning a non-owning pointer is perfectly reasonable and idiomatic. Treating pointers as "references to data someone else owns that might not exist" is a reasonable pattern.
An alternative is boost::optional<T&>, but that is basically a pointer, and C++17 std::optional last I checked did not support optional references.
std::experimental::observer_ptr<T> is another option, or writing your own, if you want to be extremely clear that your T* is not-owning. An observer_ptr<T> is basically a boost::optional<T&> I believe.
Here are two ideas, to tackle the problem:
Use pointers
You said you don't want to use pointers. But I find that they are a perfect match here.
Object * Foo::getObjectForId( int id )
{
const auto it = _objects.find( id );
return it != _objects.end() ? &it->second : nullptr;
}
In fact, a pointer is pretty much an std::optional<T&>.
Otherwise, use lambdas
Another way to treat the problem without unnecessary copies would be using lambdas.
template <typename F>
bool Foo::applyIfPresent( int id, F && f )
{
const auto it = _objects.find( id );
if ( it == _objects.end() )
return false;
f( it->second );
return true;
}
You can use this function like this:
Foo myFoo;
myFoo.applyIfPresent( id, []( Object & obj )
{
doSomethingWith( obj );
} );
This appears to be a more modern (functional) approach. It's harder to shoot yourself in the foot. However, it's also harder to read and it smells a bit like over-engineering. I would prefer the good ol' pointers.
My method can return some kind of pointer (for example boost::shared_ptr), and this pointer may be NULL. Is there any way to force users of my code to check whether it is empty?
Some examples of such things: Scala's Option container; maybe Boost has something like boost::option?
You can do the following:
return a smart pointer type that throws an exception if it is accessed while set to NULL.
throw an exception instead of returning a NULL pointer
return a std::optional (or boost::optional) which expresses intent (i.e. "value may be missing") much better than a pointer
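The first option in the list above could be sketched as follows. CheckedPtr is a hypothetical wrapper around std::shared_ptr, not an existing library type, and std::logic_error is just one plausible choice of exception.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <utility>

// Hypothetical wrapper: dereferencing a null pointer throws instead of crashing.
template <typename T>
class CheckedPtr {
public:
    explicit CheckedPtr(std::shared_ptr<T> p = nullptr) : m_p(std::move(p)) {}
    explicit operator bool() const { return static_cast<bool>(m_p); }
    T & operator*() const { return *checked(); }
    T * operator->() const { return checked(); }
private:
    T * checked() const {
        if (!m_p)
            throw std::logic_error("CheckedPtr: dereferenced while null");
        return m_p.get();
    }
    std::shared_ptr<T> m_p;
};

inline void checked_ptr_usage() {
    CheckedPtr<int> full(std::make_shared<int>(7));
    assert(*full == 7);

    CheckedPtr<int> empty;
    bool threw = false;
    try { (void)*empty; } catch (const std::logic_error &) { threw = true; }
    assert(threw);  // the missing check surfaces as an exception
}
```

This does not force a check at compile time; it only turns a forgotten check into a well-defined failure at the point of access.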
The usual solution is to wrap the return value in a class, which
contains a flag which is set if the pointer is checked or
copied, and whose destructor crashes if the flag wasn't set.
Something like:
template <typename T>
class MustBeChecked
{
T* myValue;
mutable bool myHasBeenChecked;
public:
MustBeChecked( T* value )
: myValue( value )
, myHasBeenChecked( false )
{
}
MustBeChecked( MustBeChecked const& other )
: myValue( other.myValue )
, myHasBeenChecked( false )
{
other.myHasBeenChecked = true;
}
~MustBeChecked()
{
assert( myHasBeenChecked );
}
bool operator==( std::nullptr_t ) const
{
myHasBeenChecked = true;
return myValue == nullptr;
}
bool operator!=( std::nullptr_t ) const
{
myHasBeenChecked = true;
return myValue != nullptr;
}
operator T*() const
{
assert( myHasBeenChecked );
return myValue;
}
};
To be frank, I find this to be overkill in most cases. But I've
seen it used on some critical systems.
The reality here is that the callers of your function already have to check. If they try to access the shared pointer without checking, then a seg-fault is coming their way if the underlying pointer is NULL.
You don't specify if you're writing a library, or some code within a project. Nor do you specify any details of the context this code lives in -- all of these might decide which approach I'd take in this situation -- but broadly speaking, all of utnapistim's suggestions are good ones.
Ok so I'm wanting to write a precise 'mark and sweep' garbage collector in C++. I have hopefully made some decisions that can help me as in all my pointers will be wrapped in a 'RelocObject' and I'll have a single block of memory for the heap. This looks something like this:
// This class acts as an indirection to the actual object in memory so that it can be
// relocated in the sweep phase of garbage collector
class MemBlock
{
public:
void* Get( void ) { return m_ptr; }
private:
MemBlock( void ) : m_ptr( NULL ){}
void* m_ptr;
};
// This is of the same size as the above class and is directly cast to it, but is
// typed so that we can easily debug the underlying object
template<typename _Type_>
class TypedBlock
{
public:
_Type_* Get( void ) { return m_pObject; }
private:
TypedBlock( void ) : m_pObject( NULL ){}
// Pointer to actual object in memory
_Type_* m_pObject;
};
// This is our wrapper class that every pointer is wrapped in
template< typename _Type_ >
class RelocObject
{
public:
RelocObject( void ) : m_pRef( NULL ) {}
static RelocObject New( void )
{
RelocObject ref( (TypedBlock<_Type_>*)Allocator()->Alloc( this, sizeof(_Type_), __alignof(_Type_) ) );
new ( ref.m_pRef->Get() ) _Type_();
return ref;
}
~RelocObject(){}
_Type_* operator-> ( void ) const
{
assert( m_pRef && "ERROR! Object is null\n" );
return (_Type_*)m_pRef->Get();
}
// Equality
bool operator ==(const RelocObject& rhs) const { return m_pRef->Get() == rhs.m_pRef->Get(); }
bool operator !=(const RelocObject& rhs) const { return m_pRef->Get() != rhs.m_pRef->Get(); }
RelocObject& operator= ( const RelocObject& rhs )
{
if(this == &rhs) return *this;
m_pRef = rhs.m_pRef;
return *this;
}
private:
RelocObject( TypedBlock<_Type_>* pRef ) : m_pRef( pRef )
{
assert( m_pRef && "ERROR! Can't construct a null object\n");
}
RelocObject* operator& ( void ) { return this; }
_Type_& operator* ( void ) const { return *(_Type_*)m_pRef->Get(); }
// SS:
TypedBlock<_Type_>* m_pRef;
};
// We would use it like so...
typedef RelocObject<Impl::Foo> Foo;
int main( void )
{
Foo foo = Foo::New();
}
So in order to find the 'root' RelocObjects when I allocate in 'RelocObject::New' I pass in the 'this' pointer of the RelocObject into the allocator (garbage collector). The allocator then checks to see if the 'this' pointer is in the range of the memory block for the heap, and if it is then I can assume it's not a root.
So the issue comes when I want to trace from the roots through the child objects using the zero or more RelocObjects located inside each child object.
I want to find the RelocObjects in a class (ie a child object) using a 'precise' method. I could use a reflection approach and make the user Register where in each class his or her RelocObjects are. However this would be very error prone and so I'd like to do this automatically.
So instead I'm looking to use Clang to find the offsets of the RelocObjects within the classes at compile time and then load this information at program start and use this in the mark phase of the garbage collector to trace through and mark the child objects.
So my question is can Clang help? I've heard you can gather all kinds of type information during compilation using its compile time hooks. If so what should I look for in Clang ie are there any examples of doing this kind of thing?
Just to be explicit: I want to use Clang to automatically find the offset of 'Foo' (which is a typedef of RelocObject) in FooB without the user providing any 'hints' ie they just write:
class FooB
{
public:
int m_a;
Foo m_ptr;
};
Thanks in advance for any help.
Whenever a RelocObject is instantiated, its address can be recorded in a RelocObject ownership database along with sizeof(*derivedRelocObject), which will immediately identify which Foo belongs to which FooB. You don't need Clang for that. Also, since Foo will be created shortly after FooB, your ownership database system can be very simple, as the order of "I've been created, here's my address and size" calls will show the owning RelocObject record directly before the RelocObject instances that it owns.
Each RelocObject has an ownership_been_declared flag initialized to false. Upon first use (which would be after the constructors have completed, since no real work should be done in the constructor), a newly created object requests that the database update its ownership. The database goes through its queue of recorded addresses, identifies which objects belong to which, clears those from its list, and sets their ownership_been_declared flags to true; you will have the offsets too (if you still need them).
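A much simplified sketch of the address-recording idea, assuming the allocator reports each block's base and size and each handle reports its own address from its constructor. HandleRegistry, Handle, and the function names are hypothetical; the deferred ownership_been_declared step described above is left out, and containment is resolved immediately instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical registry: for each heap block, the offsets of the handles inside it.
class HandleRegistry {
public:
    // Called by the allocator when it hands out a block.
    void blockAllocated(const void * base, std::size_t size) {
        m_blocks[addr(base)] = Block{size, {}};
    }

    // Called from a handle's constructor with its own address.
    // Returns true when the handle lies inside a known block (so it is not a root).
    bool handleConstructed(const void * handle) {
        const std::uintptr_t a = addr(handle);
        auto it = m_blocks.upper_bound(a);  // first block starting after a
        if (it == m_blocks.begin())
            return false;
        --it;                               // last block starting at or before a
        if (a - it->first >= it->second.size)
            return false;
        it->second.offsets.push_back(static_cast<std::size_t>(a - it->first));
        return true;
    }

    const std::vector<std::size_t> & offsetsOf(const void * base) const {
        return m_blocks.at(addr(base)).offsets;
    }

private:
    struct Block { std::size_t size; std::vector<std::size_t> offsets; };
    static std::uintptr_t addr(const void * p) {
        return reinterpret_cast<std::uintptr_t>(p);
    }
    std::map<std::uintptr_t, Block> m_blocks;
};

// Hypothetical stand-ins for RelocObject and the FooB from the question.
struct Handle { int dummy; };
struct FooB { int m_a; Handle m_ptr; };

inline void registry_usage() {
    HandleRegistry reg;
    FooB obj = FooB();
    reg.blockAllocated(&obj, sizeof(FooB));
    assert(reg.handleConstructed(&obj.m_ptr));  // member handle: not a root
    assert(reg.offsetsOf(&obj)[0] == offsetof(FooB, m_ptr));
}
```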
p.s. if you like I can share my code for an Incremental Garbage Collector I wrote many years ago, which you might find helpful.
Basically I need to do reference counting on certain resources (like an integer index) that are not immediately equivalent to pointer/address semantics; I need to pass the resource around and call a certain custom function when the count reaches zero. Also, read/write access to the resource is not a simple pointer dereference but something more complex. I don't think boost::shared_ptr will fit the bill here, but maybe I'm missing some other equivalent Boost class I might use?
example of what i need to do:
struct NonPointerResource
{
NonPointerResource(int a) : rec(a) {}
int rec;
};
int createResource ()
{
data BasicResource("get/resource");
boost::shared_resource< NonPointerResource > r( BasicResource.getId() ,
boost::function< BasicResource::RemoveId >() );
TypicalUsage( r );
}
//when r goes out of scope, it will call BasicResource::RemoveId( NonPointerResource& ) or something similar
int TypicalUsage( boost::shared_resource< NonPointerResource > r )
{
data* d = access_object( r );
// do something with d
}
Allocate NonPointerResource on the heap and just give it a destructor as normal.
Maybe boost::intrusive_ptr could fit the bill. Here's a RefCounted base class and ancillary functions that I'm using in some of my code. Instead of delete ptr you can specify whatever operation you need.
struct RefCounted {
int refCount;
RefCounted() : refCount(0) {}
virtual ~RefCounted() { assert(refCount==0); }
};
// boost::intrusive_ptr expects the following functions to be defined:
inline
void intrusive_ptr_add_ref(RefCounted* ptr) { ++ptr->refCount; }
inline
void intrusive_ptr_release(RefCounted* ptr) { if (!--ptr->refCount) delete ptr; }
With that in place you can then have
boost::intrusive_ptr<DerivedFromRefCounted> myResource = ...
Here is a small example of using shared_ptr<void> as a counted handle.
Preparing proper create/delete functions enables us to use shared_ptr<void> as a handle for almost any resource.
However, as you can see, since this is weakly typed, using it is somewhat inconvenient.
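A sketch of the create/delete-function idea, using a typed std::shared_ptr<const int> instead of shared_ptr<void> to avoid the weak typing mentioned above. ResourceTable, ResourceHandle, and makeHandle are hypothetical names; the table stands in for whatever actually owns the integer ids and must outlive every handle.

```cpp
#include <cassert>
#include <memory>
#include <set>

// Hypothetical owner of integer resource ids.
class ResourceTable {
public:
    int acquire() { int id = m_next++; m_live.insert(id); return id; }
    void release(int id) { m_live.erase(id); }
    bool isLive(int id) const { return m_live.count(id) != 0; }
private:
    int m_next = 0;
    std::set<int> m_live;
};

// Counted handle to an id: copies share one count.
using ResourceHandle = std::shared_ptr<const int>;

// "Create" function: wraps a fresh id together with its custom release action.
inline ResourceHandle makeHandle(ResourceTable & table) {
    const int id = table.acquire();
    return ResourceHandle(new int(id), [&table](const int * p) {
        table.release(*p);  // runs once, when the last copy goes away
        delete p;
    });
}

inline void handle_usage() {
    ResourceTable table;
    int id = -1;
    {
        ResourceHandle h = makeHandle(table);
        id = *h;
        ResourceHandle copy = h;  // passing the handle around bumps the count
        assert(table.isLive(id));
    }
    assert(!table.isLive(id));    // custom release ran when the count hit zero
}
```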