Implementing weak intrusive pointers in C++

Weak pointers are like smart pointers, except that references from weak
pointers do not prevent garbage collection, and weak pointers must
have their validity checked before they are used.
In our project (Linderdaum Engine http://www.linderdaum.com) we use intrusive pointers. To avoid circular references and islands of isolation we have implemented weak intrusive pointers the following way:
namespace LPtr
{
clPtr<iObject> GetObjectsGraphPtrWrapper( sEnvironment* Env, iObject* Obj, size_t Generation );
};
/// Intrusive weak smart pointer
template <class T> class clWeakPtr
{
public:
/// default constructor
clWeakPtr(): Env( NULL ), FObject( NULL ), FGeneration( 0 ) {}
explicit clWeakPtr( T* Ptr )
: Env( Ptr ? Ptr->Env : NULL )
, FObject( Ptr )
, FGeneration( Ptr ? Ptr->FGeneration : 0 ) {}
explicit clWeakPtr( const clPtr<T>& Ptr )
: Env( Ptr ? Ptr->Env : NULL )
, FObject( Ptr.GetInternalPtr() )
, FGeneration( Ptr ? Ptr->FGeneration : 0 ) {}
clPtr<T> Lock() const
{
clPtr<iObject> P = LPtr::GetObjectsGraphPtrWrapper( Env, FObject, FGeneration );
return P.DynamicCast<T>();
}
private:
sEnvironment* Env;
T* FObject;
size_t FGeneration;
};
GetObjectsGraphPtrWrapper is here just for the sake of forward declarations and does roughly this:
LMutex Lock( &FObjectsGraphMutex );
clObjectsGraph::const_iterator i = std::find( Env->ObjectsGraph.begin(), Env->ObjectsGraph.end(), Obj );
if ( i == Env->ObjectsGraph.end() ) return clPtr<iObject>();
bool IsSame = Obj->FGeneration == Generation;
bool IsAlive = Obj->GetReferenceCounter() > 0;
return ( IsSame && IsAlive ) ? clPtr<iObject>( Obj ) : clPtr<iObject>();
Generation is global in the scope of sEnvironment and is atomic-incremented every time a new object is instantiated.
My questions are:
1) Is it safe to implement weak-references like this?
2) Are there any ways to optimize the clWeakPtr::Lock()?

1) It seems safe indeed, but any modification of the graph will have some contention with LPtr::GetObjectsGraphPtrWrapper
2) a read-write lock could help, at least you'll be able to call several Lock() in parallel
The problem with your solution is that it defeats the locality that non-intrusive weak pointers bring.
Depending on the concurrency level, this might become a problem: without a read-write lock, each call to Lock() blocks object creation as well as every other Lock() call.
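A minimal sketch of the read-write lock idea, assuming C++17's std::shared_mutex is available. ObjectRegistry and its member names are hypothetical stand-ins for the engine's ObjectsGraph, not Linderdaum code, and a hash map keyed by address is swapped in for the linear std::find:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical stand-in for the object graph: maps the address of every
// live object to the generation it was created with.
class ObjectRegistry
{
public:
    // Writers (object creation and destruction) take the lock exclusively.
    void Register( const void* obj, std::size_t generation )
    {
        std::unique_lock<std::shared_mutex> guard( mutex_ );
        live_[obj] = generation;
    }

    void Unregister( const void* obj )
    {
        std::unique_lock<std::shared_mutex> guard( mutex_ );
        live_.erase( obj );
    }

    // Readers (what a weak pointer's Lock() would call) share the lock,
    // so several lookups can run in parallel. The object itself is never
    // dereferenced here; only the registry entry is inspected.
    bool IsAlive( const void* obj, std::size_t generation ) const
    {
        std::shared_lock<std::shared_mutex> guard( mutex_ );
        auto it = live_.find( obj );
        return it != live_.end() && it->second == generation;
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<const void*, std::size_t> live_;
};
```

Only Register() and Unregister() exclude other threads. A real Lock() would still have to take its strong reference while the shared lock is held, otherwise the object could be destroyed between the check and the increment.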

Related

Understanding std::move and its purpose by example [duplicate]

Consider the following code:
class c
{
large_type_t i;
public:
// the most simple variant
c( large_type_t i ): i( i ) { }
// move the parameter copied
c( large_type_t i ): i( std::move( i ) ) { }
// copy is done directly in interaction with our class member from original
c( const large_type_t &i ): i( i ) { }
// parameter was constructed in the call, just treat as rvalue
c( large_type_t &&i ): i( std::move( i ) ) { }
// this is here just to show all possible combinations in case someone in the future sees this question and someone answering it wants to explain everything
c( large_type_t &&i ): i( i ) { }
};
What is the best way to do this? Are these all going to boil down to the same code anyway and it doesn't matter? I feel that I am fundamentally not understanding the purpose of move.
Move constructors (or move assignments) provide the means for your program to avoid spending excessive time copying the contents of one object to another, if the copied-from object will no longer be used afterwards. For example, when an object is to be the returned value of a method or function.
This is more obvious when you have an object with, say, dynamically allocated content.
Example:
class MyClass {
private:
size_t _itemCount ;
double * _bigBufferOfDoubles ;
public:
// ... Initial constructor, creating a big list of doubles
explicit MyClass( size_t itemCount ) {
_itemCount = itemCount ;
_bigBufferOfDoubles = new double[ itemCount ];
}
// ... Copy constructor, to be used when the 'other' object must persist
// beyond this call
MyClass( const MyClass & other ) {
// ... This is a complete copy, and it takes a lot of time
_itemCount = other._itemCount ;
_bigBufferOfDoubles = new double[ _itemCount ];
for ( size_t i = 0; i < _itemCount; ++i ) {
_bigBufferOfDoubles[ i ] = other._bigBufferOfDoubles[ i ] ;
}
}
// ... Move constructor, when the 'other' can be discarded (i.e. when it's
// a temp instance, like a return value in some other method call)
MyClass( MyClass && other ) {
// ... Blazingly fast, as we're just copying over the pointer
_itemCount = other._itemCount ;
_bigBufferOfDoubles = other._bigBufferOfDoubles ;
// ... Good practice to clear the 'other' as it won't be needed
// anymore
other._itemCount = 0 ;
other._bigBufferOfDoubles = nullptr ;
}
~MyClass() {
delete [] _bigBufferOfDoubles ;
}
};
// ... Move semantics are useful to return an object 'by value'
// Since the returned object is temporary in the function,
// the compiler will invoke the move constructor of MyClass
MyClass someFunctionThatReturnsByValue() {
MyClass myClass( 1000000 ) ; // a really big buffer...
return myClass ; // this is moved (or elided entirely by the compiler), NOT copied
}
// ... So far everything is handled implicitly, without an explicit
// call to std::move
void someOtherFunction() {
// ... You can explicitly force the use of move semantics
MyClass myTempClass( 1000000 ) ;
// ... Say I want to transfer the contents, but I don't want to invoke the
// copy constructor: std::move selects the move constructor instead
MyClass myOtherClass( std::move( myTempClass ) ) ;
// ... At this point, I should abstain from using myTempClass
// as its contents have been 'transferred' over to myOtherClass.
}
What is the best way to do this?
The simplest variant. The reason: Because it is the simplest variant.
Trivial types have only trivial copy and move constructors, so "moving" them is the same as copying.
Are these all going to boil down to the same code anyway and it doesn't matter?
If optimisation and in particular inline expansion is involved, then probably yes. You can verify whether that is the case for your program by comparing the resulting assembly. Assuming no (or poor) optimisation, the simplest variant is potentially fastest.
Regarding your edit: For non-trivial types, the second variant is best because it moves from rvalue arguments instead of copying.
All of those are different and it very much depends on what large_type_t i; is.
Does it manage dynamically allocated data? If yes, and it supports both move and copy construction, then the following two constructors should be used:
c( const large_type_t &i ): mi( i ) { };
c( large_type_t &&i ): mi( std::move( i ) ) {};
Alternatively, at the cost of being potentially slightly less efficient, you can use a single constructor for both purposes:
c( large_type_t i): mi( std::move( i ) ) { };
For the above, without std::move it will likely result in unnecessary copying and memory allocations.
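To make the difference visible, here is a small sketch that counts copies. Tracked, BySink and ByCopy are hypothetical names invented for this illustration; Tracked stands in for large_type_t:

```cpp
#include <cassert>
#include <utility>

// Stand-in for large_type_t that counts how often it is copied or moved.
struct Tracked
{
    static int copies;
    static int moves;
    Tracked() = default;
    Tracked( const Tracked& ) { ++copies; }
    Tracked( Tracked&& ) noexcept { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// By-value parameter, then std::move into the member.
class BySink
{
    Tracked mi;
public:
    explicit BySink( Tracked i ) : mi( std::move( i ) ) {}
};

// By-value parameter WITHOUT std::move: the named parameter is an lvalue,
// so the member is copy-constructed from it.
class ByCopy
{
    Tracked mi;
public:
    explicit ByCopy( Tracked i ) : mi( i ) {}
};

// Number of copies made when constructing Holder from an lvalue.
template <class Holder>
int copiesFromLvalue()
{
    Tracked source;
    Tracked::copies = 0;
    Holder h( source );
    (void)h;
    return Tracked::copies;
}

// Number of copies made when constructing Holder from a temporary.
template <class Holder>
int copiesFromRvalue()
{
    Tracked::copies = 0;
    Holder h{ Tracked() };
    (void)h;
    return Tracked::copies;
}
```

With std::move, an lvalue argument costs one copy and a temporary costs none; without it, each case pays one extra copy.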
If it doesn't manage any dynamically allocated data, then moving and copying are usually the same thing (for exotic types you will have to work it out yourself, since it depends too much on the circumstances). Thus you only need the copying constructor:
c( const large_type_t &i ): mi( i ) { };
For small trivial types, you'd better use the most basic form:
c( small_type_t i ): mi( i ) { };

Enforce NULL checking in c++

My method can return some kind of pointer (for example boost::shared_ptr), and this pointer may be NULL. Is there any way to force users of my code to check whether it is empty or not?
An example of such a thing is Scala's Option container; maybe Boost has something like boost::option?
You can do the following:
return a smart pointer type that throws an exception if accessed and set to NULL.
throw an exception instead of returning a NULL pointer
return a std::optional (or boost::optional) which expresses intent (i.e. "value may be missing") much better than a pointer
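A minimal sketch of the last option, assuming C++17 for std::optional. The function findPort and its lookup table are hypothetical and exist only for this illustration:

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Returns the port registered under `name`, or an empty optional when
// there is none. The return type itself tells callers the value may be
// missing.
inline std::optional<int> findPort( const std::map<std::string, int>& table,
                                    const std::string& name )
{
    auto it = table.find( name );
    if ( it == table.end() )
        return std::nullopt;
    return it->second;
}
```

Note that std::optional does not strictly enforce the check either: dereferencing an empty optional with * is undefined behaviour, while value() throws std::bad_optional_access and value_or() supplies a fallback.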
The usual solution is to wrap the return value in a class, which
contains a flag which is set if the pointer is checked or
copied, and whose destructor crashes if the flag wasn't set.
Something like:
template <typename T>
class MustBeChecked
{
T* myValue;
mutable bool myHasBeenChecked;
public:
MustBeChecked( T* value )
: myValue( value )
, myHasBeenChecked( false )
{
}
MustBeChecked( MustBeChecked const& other )
: myValue( other.myValue )
, myHasBeenChecked( false )
{
other.myHasBeenChecked = true;
}
~MustBeChecked()
{
assert( myHasBeenChecked );
}
bool operator==( std::nullptr_t ) const
{
myHasBeenChecked = true;
return myValue == nullptr;
}
bool operator!=( std::nullptr_t ) const
{
myHasBeenChecked = true;
return myValue != nullptr;
}
operator T*() const
{
assert( myHasBeenChecked );
return myValue;
}
};
To be frank, I find this to be overkill in most cases. But I've
seen it used on some critical systems.
The reality here is that the callers of your function already have to check. If they try to access the shared pointer without checking, then a seg-fault is coming their way if the underlying pointer is NULL.
You don't specify if you're writing a library, or some code within a project. Nor do you specify any details of the context this code lives in -- all of these might decide which approach I'd take in this situation -- but broadly speaking, all of utnapistim's suggestions are good ones.

Thread-safe lock-free array

I have a C++ library which is supposed to do some computations on multiple threads. I made the threads' code independent (i.e. there are no shared variables between them), except for one array. The problem is, I don't know how to make it thread-safe.
I looked at mutex lock/unlock (QMutex, as I'm using Qt), but it doesn't fit my task: while one thread holds the mutex, the other threads wait!
Then I read about std::atomic, which looked like exactly what I needed. Nevertheless, I tried to use it in the following way:
std::vector<std::atomic<uint64_t>> *myVector;
And it produced a compiler error (use of deleted function 'std::atomic::atomic(const std::atomic&)'). Then I found a solution: use a special wrapper for std::atomic. I tried this:
struct AtomicUInt64
{
std::atomic<uint64_t> atomic;
AtomicUInt64() : atomic() {}
AtomicUInt64 ( std::atomic<uint64_t> a ) : atomic ( atomic.load() ) {}
AtomicUInt64 ( AtomicUInt64 &auint64 ) : atomic ( auint64.atomic.load() ) {}
AtomicUInt64 &operator= ( AtomicUInt64 &auint64 )
{
atomic.store ( auint64.atomic.load() );
}
};
std::vector<AtomicUInt64> *myVector;
This compiles successfully, but I can't fill the vector:
myVector = new std::vector<AtomicUInt64>();
for ( int x = 0; x < 100; ++x )
{
/* This approach produces compiler error:
* use of deleted function 'std::atomic<long long unsigned int>::atomic(const std::atomic<long long unsigned int>&)'
*/
AtomicUInt64 value( std::atomic<uint64_t>( 0 ) ) ;
myVector->push_back ( value );
/* And this one produces the same error: */
std::atomic<uint64_t> value1 ( 0 );
myVector->push_back ( value1 );
}
What am I doing wrong? I assume I tried everything (maybe not, anyway) and nothing helped. Are there any other ways for thread-safe array sharing in C++?
By the way, I use MinGW 32bit 4.7 compiler on Windows.
Here is a cleaned up version of your AtomicUInt64 type:
template<typename T>
struct MobileAtomic
{
std::atomic<T> atomic;
MobileAtomic() : atomic(T()) {}
explicit MobileAtomic ( T const& v ) : atomic ( v ) {}
explicit MobileAtomic ( std::atomic<T> const& a ) : atomic ( a.load() ) {}
MobileAtomic ( MobileAtomic const&other ) : atomic( other.atomic.load() ) {}
MobileAtomic& operator=( MobileAtomic const &other )
{
atomic.store( other.atomic.load() );
return *this;
}
};
typedef MobileAtomic<uint64_t> AtomicUInt64;
and use:
AtomicUInt64 value;
myVector->push_back ( value );
or:
AtomicUInt64 value(x);
myVector->push_back ( value );
Your problem was that you took a std::atomic by value, which requires a copy, and copying a std::atomic is deleted. Oh, and you failed to return from operator=. I also made some constructors explicit, probably needlessly. And I added const to your copy constructor.
I would also be tempted to add store and load methods to MobileAtomic that forwards to atomic.store and atomic.load.
You're trying to copy a non-copyable type: the AtomicUInt64 constructor takes an atomic by value.
If you need it to be initialisable from atomic, then it should take the argument by (const) reference. However, in your case, it doesn't look like you need to initialise from atomic at all; why not initialise from uint64_t instead?
Also a couple of minor points:
The copy constructor and assignment operator should take their values by const reference, to allow temporaries to be copied.
Allocating the vector with new is a rather odd thing to do; you're just adding an extra level of indirection with no benefit.
Make sure you never resize the array while other threads might be accessing it.
In this line:
AtomicUInt64 ( std::atomic<uint64_t> a ) : atomic ( atomic.load() ) {}
you're completely ignoring the argument you pass in. You probably want it to be a.load(), and you probably want to take the argument by const reference so it isn't copied.
AtomicUInt64 (const std::atomic<uint64_t>& a) : atomic (a.load()) {}
As for what you're doing, I'm not sure if it is correct. The modification of the variables inside the array will be atomic, but if the vector is modified or reallocated (which is possible with push_back), then there's nothing to guarantee your array modifications will work between threads and be atomic.
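One way to sidestep both the copy problem and the reallocation concern is to size the vector once, before any thread starts, and never resize it afterwards. The sketch below assumes that approach; countInParallel and its parameters are hypothetical names for this illustration:

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Every worker increments every slot `perThread` times. The vector is sized
// once up front, so no element is ever copied or moved.
inline std::vector<std::uint64_t> countInParallel( std::size_t slots,
                                                   unsigned threadCount,
                                                   unsigned perThread )
{
    // Constructing with a size builds each atomic in place; no copy needed.
    std::vector<std::atomic<std::uint64_t>> counters( slots );
    for ( auto& c : counters )
        c.store( 0 ); // explicit start value, set before any thread runs

    std::vector<std::thread> workers;
    for ( unsigned t = 0; t < threadCount; ++t )
    {
        workers.emplace_back( [&counters, perThread]
        {
            for ( unsigned i = 0; i < perThread; ++i )
                for ( auto& c : counters )
                    c.fetch_add( 1, std::memory_order_relaxed );
        } );
    }
    for ( auto& w : workers )
        w.join();

    // Copy the plain values out once all workers have finished.
    std::vector<std::uint64_t> result;
    result.reserve( slots );
    for ( auto& c : counters )
        result.push_back( c.load() );
    return result;
}
```

Because push_back is never called on the vector of atomics, the deleted copy constructor is never needed and no reallocation can happen while threads are running.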

C++ precise garbage collector using clang/llvm?

Ok so I'm wanting to write a precise 'mark and sweep' garbage collector in C++. I have hopefully made some decisions that can help me as in all my pointers will be wrapped in a 'RelocObject' and I'll have a single block of memory for the heap. This looks something like this:
// This class acts as an indirection to the actual object in memory so that it can be
// relocated in the sweep phase of garbage collector
class MemBlock
{
public:
void* Get( void ) { return m_ptr; }
private:
MemBlock( void ) : m_ptr( NULL ){}
void* m_ptr;
};
// This is of the same size as the above class and is directly cast to it, but is
// typed so that we can easily debug the underlying object
template<typename _Type_>
class TypedBlock
{
public:
_Type_* Get( void ) { return m_pObject; }
private:
TypedBlock( void ) : m_pObject( NULL ){}
// Pointer to actual object in memory
_Type_* m_pObject;
};
// This is our wrapper class that every pointer is wrapped in
template< typename _Type_ >
class RelocObject
{
public:
RelocObject( void ) : m_pRef( NULL ) {}
static RelocObject New( void )
{
RelocObject ref( (TypedBlock<_Type_>*)Allocator()->Alloc( this, sizeof(_Type_), __alignof(_Type_) ) );
new ( ref.m_pRef->Get() ) _Type_();
return ref;
}
~RelocObject(){}
_Type_* operator-> ( void ) const
{
assert( m_pRef && "ERROR! Object is null\n" );
return (_Type_*)m_pRef->Get();
}
// Equality
bool operator ==(const RelocObject& rhs) const { return m_pRef->Get() == rhs.m_pRef->Get(); }
bool operator !=(const RelocObject& rhs) const { return m_pRef->Get() != rhs.m_pRef->Get(); }
RelocObject& operator= ( const RelocObject& rhs )
{
if(this == &rhs) return *this;
m_pRef = rhs.m_pRef;
return *this;
}
private:
RelocObject( TypedBlock<_Type_>* pRef ) : m_pRef( pRef )
{
assert( m_pRef && "ERROR! Can't construct a null object\n");
}
RelocObject* operator& ( void ) { return this; }
_Type_& operator* ( void ) const { return *(_Type_*)m_pRef->Get(); }
// SS:
TypedBlock<_Type_>* m_pRef;
};
// We would use it like so...
typedef RelocObject<Impl::Foo> Foo;
void main( void )
{
Foo foo = Foo::New();
}
So in order to find the 'root' RelocObjects when I allocate in 'RelocObject::New' I pass in the 'this' pointer of the RelocObject into the allocator(garbage collector). The allocator then checks to see if the 'this' pointer is in the range of the memory block for the heap and if it is then I can assume its not a root.
So the issue comes when I want to trace from the roots through the child objects using the zero or more RelocObjects located inside each child object.
I want to find the RelocObjects in a class (ie a child object) using a 'precise' method. I could use a reflection approach and make the user Register where in each class his or her RelocObjects are. However this would be very error prone and so I'd like to do this automatically.
So instead I'm looking to use Clang to find the offsets of the RelocObjects within the classes at compile time and then load this information at program start and use this in the mark phase of the garbage collector to trace through and mark the child objects.
So my question is can Clang help? I've heard you can gather all kinds of type information during compilation using its compile time hooks. If so what should I look for in Clang ie are there any examples of doing this kind of thing?
Just to be explicit: I want to use Clang to automatically find the offset of 'Foo' (which is a typedef of RelocObject) in FooB without the user providing any 'hints' ie they just write:
class FooB
{
public:
int m_a;
Foo m_ptr;
};
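For illustration only, this is roughly the kind of metadata such a tool would need to emit, written here by hand with offsetof. Handle, TypeLayout, LayoutOfFooB and ChildrenOf are hypothetical names, Handle is a simplified stand-in for RelocObject, and nothing here uses Clang:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for RelocObject<T>: a single pointer to a block.
struct Handle { void* block; };

struct FooB
{
    int m_a;
    Handle m_ptr;
};

// Per-type description: object size plus the byte offset of every handle.
struct TypeLayout
{
    std::size_t size;
    std::vector<std::size_t> handleOffsets;
};

// Hand-written here; the goal in the question is to generate this table.
inline TypeLayout LayoutOfFooB()
{
    return TypeLayout{ sizeof( FooB ), { offsetof( FooB, m_ptr ) } };
}

// Mark-phase helper: lists the non-null blocks an object refers to.
inline std::vector<void*> ChildrenOf( const void* object, const TypeLayout& layout )
{
    std::vector<void*> children;
    const char* base = static_cast<const char*>( object );
    for ( std::size_t offset : layout.handleOffsets )
    {
        const Handle* h = reinterpret_cast<const Handle*>( base + offset );
        if ( h->block )
            children.push_back( h->block );
    }
    return children;
}
```

The mark phase would call something like ChildrenOf for each reachable object and recurse into the returned blocks.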
Thanks in advance for any help.
Whenever a RelocObject is instantiated, its address can be recorded in a RelocObject ownership database along with sizeof(*derivedRelocObject), which will immediately identify which Foo belongs to which FooB. You don't need Clang for that. Also, since Foo will be created shortly after FooB, your ownership database can be very simple: in the sequence of "I've been created, here's my address and size" calls, the record of the owning object appears directly before the records of the RelocObject instances that it owns.
Each RelocObject has an ownership_been_declared flag initialized to false. On first use (which would be after the constructors have completed, since no real work should be done in the constructor), a newly created object asks the database to update its ownership. The database goes through its queue of recorded addresses, identifies which objects belong to which, removes the resolved entries from its list, and sets their ownership_been_declared flags to true. You will have the offsets too (if you still need them).
p.s. if you like I can share my code for an Incremental Garbage Collector I wrote many years ago, which you might find helpful.

Smart Pointers with "this" in C++

I have been working on replacing raw pointers with reference-counted pointers that expose only a const version of the underlying one. My objective is to reduce memory usage (and time spent unnecessarily constructing and destructing complex objects) without putting myself in a situation where any code has access to memory that it does not own. I am aware of the circular reference problem with reference counting, but my code should never create such a situation.
Requiring const-ness works because I am using a system in which classes generally expose no non-const members, and rather than altering an object you must call a method on it that returns a new object that results from the alteration. This is probably a design pattern, but I do not know it by name.
My problem has come when I have a method that returns pointer to an object of its type, which is sometimes itself. Previously it looked something like this:
Foo * Foo::GetAfterModification( const Modification & mod ) const
{
if( ChangesAnything( mod ) )
{
Foo * asdf = new Foo;
asdf->DoModification( mod );
return asdf;
}
else
return this;
}
I cannot see a good way to make this return a smart pointer. The naive approach would be something like return CRcPtr< Foo >( this ), but this breaks the ownership semantics because what I am returning and whomever previously owned the object now each think they have ownership but do not know about each other. The only safe thing to do would be return CRcPtr< Foo >( new Foo( *this ) ), but that defeats my intention of restricting unnecessary memory use.
Is there some way to safely return a smart pointer without allocating any additional memory? My suspicion is that there is not. If there were, how would it work if the object had been allocated on the stack? This question seems related, but is not the same because he can just make his functions take raw pointers as parameters and because he is using the boost library.
For reference, my home-rolled smart pointer implementation is below. I am sure it could be more robust, but it is more portable than depending on boost or tr1 being available everywhere, and until this issue it has worked well for me.
template <class T>
class CRcPtr
{
public:
explicit CRcPtr( T * p_pBaldPtr )
{
m_pInternal = p_pBaldPtr;
m_iCount = new unsigned short( 1 );
}
CRcPtr( const CRcPtr & p_Other )
{ Acquire( p_Other ); }
template <class U>
explicit CRcPtr( const CRcPtr< U > & p_It )
{
m_pInternal = dynamic_cast< T * >( p_It.m_pInternal );
if( m_pInternal )
{
m_iCount = p_It.m_iCount;
(*m_iCount)++;
}
else
m_iCount = new unsigned short( 1 );
}
~CRcPtr()
{ Release(); }
CRcPtr & operator=( const CRcPtr & p_Other )
{
if( this != &p_Other )
{
Release();
Acquire( p_Other );
}
return *this;
}
const T & operator*() const
{ return *m_pInternal; }
const T * operator->() const
{ return m_pInternal; }
const T * get() const
{ return m_pInternal; }
private:
void Release()
{
(*m_iCount)--;
if( *m_iCount == 0 )
{
delete m_pInternal;
delete m_iCount;
m_pInternal = 0;
m_iCount = 0;
}
}
void Acquire( const CRcPtr & p_Other )
{
m_pInternal = p_Other.m_pInternal;
m_iCount = p_Other.m_iCount;
(*m_iCount)++;
}
template <class U>
friend class CRcPtr;
T * m_pInternal;
unsigned short * m_iCount;
};
template <class U, class T>
CRcPtr< U > ref_cast( const CRcPtr< T > & p_It )
{ return CRcPtr< U >( p_It ); }
Edit: Thanks for the replies. I was hoping to avoid using boost or tr1, but I recognize that not using well tested libraries is generally unwise. I am fairly sure that what I have implemented is not similar to std::auto_ptr, but rather is similar to tr1::shared_ptr except that it only exposes a const version of the internal pointer and lacks some of the features in the official version. I really would like to avoid an intrusive scheme like the one proposed by Gian Paolo. I am aware that my implementation is not threadsafe, but this is a single-threaded application.
Take a look at the source for Boost shared pointers. If you derive a class from enable_shared_from_this<T>, you can then call the shared_from_this() member function to do that sort of thing.
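A minimal sketch of that approach, using the standard library equivalents (std::shared_ptr and std::enable_shared_from_this, available since C++11) rather than Boost. The class body and the withValue method are hypothetical, loosely modelled on GetAfterModification from the question:

```cpp
#include <cassert>
#include <memory>

class Foo : public std::enable_shared_from_this<Foo>
{
public:
    explicit Foo( int value = 0 ) : value_( value ) {}

    int value() const { return value_; }

    // Returns this very object when nothing changes, otherwise a new one.
    std::shared_ptr<const Foo> withValue( int value ) const
    {
        if ( value == value_ )
            return shared_from_this(); // shares ownership, no allocation
        return std::make_shared<Foo>( value );
    }

private:
    int value_;
};
```

This only works when the object is already owned by a shared_ptr. For an object on the stack, shared_from_this() throws std::bad_weak_ptr in C++17 and is undefined behaviour in earlier standards.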
As far as semantics go, the great thing about immutability is that you can safely share data between objects, threads, and not have to worry about people changing values you depend on. You shouldn't need to worry about ownership so long as your refcounting scheme is correct (I didn't read yours, but use Boost anyway).
As alluded to by Ferruccio, you need some form of shared pointer, i.e. one that can be shared (safely!) between multiple objects. One way to make this work is to derive all your objects (at least those that are intended for use with the shared pointer) from a class that implements the actual reference count. That way, the actual pointer itself carries the current count with it, and you share the pointer between as many different objects as you like.
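A rough, single-threaded sketch of such an intrusive scheme follows. RefCounted, IntrusivePtr and Value are hypothetical names for this illustration, and the counter is deliberately not atomic:

```cpp
#include <cassert>
#include <utility>

// Base class that carries the reference count inside the object itself.
class RefCounted
{
public:
    RefCounted() : refs_( 0 ) {}
    RefCounted( const RefCounted& ) : refs_( 0 ) {} // a copy starts unowned
    RefCounted& operator=( const RefCounted& ) { return *this; }

    void AddRef() const { ++refs_; }
    void Release() const { if ( --refs_ == 0 ) delete this; }
    unsigned RefCount() const { return refs_; }

protected:
    virtual ~RefCounted() {}

private:
    mutable unsigned refs_;
};

// Smart pointer that exposes only a const view of the object.
template <class T>
class IntrusivePtr
{
public:
    IntrusivePtr() : p_( nullptr ) {}
    explicit IntrusivePtr( const T* p ) : p_( p ) { if ( p_ ) p_->AddRef(); }
    IntrusivePtr( const IntrusivePtr& other ) : p_( other.p_ ) { if ( p_ ) p_->AddRef(); }
    ~IntrusivePtr() { if ( p_ ) p_->Release(); }

    // Copy-and-swap: safe for self-assignment.
    IntrusivePtr& operator=( IntrusivePtr other )
    {
        std::swap( p_, other.p_ );
        return *this;
    }

    const T* operator->() const { return p_; }
    const T* get() const { return p_; }

private:
    const T* p_;
};

class Value : public RefCounted
{
public:
    explicit Value( int number ) : number_( number ) {}
    int Number() const { return number_; }

    IntrusivePtr<Value> With( int number ) const
    {
        if ( number == number_ )
            return IntrusivePtr<Value>( this ); // just bumps the shared count
        return IntrusivePtr<Value>( new Value( number ) );
    }

private:
    int number_;
};
```

Because the count lives in the object, wrapping this in a new smart pointer joins the existing owners instead of creating a second, independent count. Objects must still be allocated with new for the final delete to be valid.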
What you currently have is very similar to std::auto_ptr and you are bumping into the same issues with ownership. Google it and you should find some useful info.
Be aware that there are additional complications with shared pointers: in particular in multi-threaded environments your reference counting must be atomic and self assignment must be handled to avoid incrementing the internal count such that the object is never destroyed.
Again google is your friend here. Just look for info on shared pointers in C++ and you'll find tonnes of info.