I am currently using the dlopen functions for a plugin project.
dlopen returns a void* handle, which I then save into a map named handles:
void* handle = dlopen(path.c_str(), RTLD_LAZY);
handles[file] = handle;
My goal is to pass ownership to the map. I was thinking of a unique_ptr, but I'm not sure that is even possible.
If it's not possible, what other alternatives do I have?
If I understand correctly you can do something like
Define a close function and an alias for the pointer type:
auto closeFunc = [](void* vp) {
dlclose(vp);
};
using HandlePtr = std::unique_ptr<void, decltype(closeFunc)>;
std::map<std::string, HandlePtr> handles;
and then create the handles and add to the map:
void* handle = dlopen(path.c_str(), RTLD_LAZY);
HandlePtr ptr( handle, closeFunc );
handles[file] = std::move( ptr );
Then closeFunc will be called when the unique ptr goes out of scope
The raw pointer variable can be avoided by combining the two lines above:
HandlePtr handle(dlopen(path.c_str(), RTLD_LAZY), closeFunc );
handles[file] = std::move( handle );
This makes use of the second argument to the std::unique_ptr that specifies the deleter to use.
PS: maps and unique_ptrs don't play well together as-is: std::map::operator[] default-constructs the mapped type, and a unique_ptr with a lambda deleter is not default-constructible before C++20, so you may need emplace or insert_or_assign depending on the C++ standard you are using. Or use shared_ptr instead.
Even if TheBadger's answer is the best to follow, remember that there is no concept of ownership in C++. C++ provides classes to simulate it and make good use of RAII, but you don't really have any guarantee from the language assuring a resource's ownership, even with unique_ptr. For example:
{
auto p_i = new int{5};
auto up_i = std::unique_ptr<int>{p_i};
*p_i = 6; // nothing in the language prevents something like this
assert(*up_i == 6); // the unique_ptr's ownership is not enforced, as you can see here
delete p_i; // still not illegal
} // at run time the RAII mechanism strikes: it destroys the unique_ptr, which tries to free already-freed memory, ending in a core dump
You even have a get member function in the unique_ptr class that gives you the opportunity to mess up your memory just as you could with raw pointers.
In C++ there is no mechanism to detect "ownership" violations at compile time. You may find external tools that check for this, but it is not in the standard; it is only a matter of recommended practice.
Thus, it is more relevant and precise to talk about "managed" memory in C++ than about "ownership".
The std::unique_ptr is particularly useful for wrapping C libraries that operate on a resource that needs deleting or closing when finished. Not only does it stop you from forgetting to delete/close the resource, it also makes using the resource exception safe, because the resource gets properly cleaned up in the event of an exception.
I would probably do something like this:
// closer function to clean up the resource
struct dl_closer{ void operator()(void* dl) const { dlclose(dl); }};
// type alias so you don't forget to include the closer functor
using unique_dl_ptr = std::unique_ptr<void, dl_closer>;
// helper function that ensures you get a valid dl handle
// or an exception.
unique_dl_ptr make_unique_dl(std::string const& file_name)
{
auto ptr = unique_dl_ptr(dlopen(file_name.c_str(), RTLD_LAZY));
if(!ptr)
throw std::runtime_error(dlerror());
return ptr;
}
And use it a bit like this:
// open a dl
auto dl_ptr = make_unique_dl("/path/to/my/plugin/library.so");
// now set your symbols using POSIX recommended casting gymnastics
void (*my_function_name)();
if(!(*(void**)&my_function_name = dlsym(dl_ptr.get(), "my_function_name")))
throw std::runtime_error(dlerror());
I have been changing some vulkan code to use the vulkan.hpp structures and methods.
Since I like RAII, I am using the Unique wrappers so that I don't have to explicitly handle resource management.
So far I have been able to create two versions of each wrapping method I am making: one version creates a non-unique object, and the other calls the first and then initializes a unique handle from the result. Example:
vector<vk::UniqueFramebuffer> CreateFramebuffersUnique(vk::SurfaceKHR surface,
vk::PhysicalDevice phys_device, vk::Device device, vk::RenderPass render_pass,
vector<vk::ImageView> image_views)
{
auto framebuffers =
CreateFramebuffers(surface, phys_device, device, render_pass, image_views);
vector<vk::UniqueFramebuffer> u_framebuffers;
for(auto &fb : framebuffers)
u_framebuffers.push_back(vk::UniqueFramebuffer(fb, device));
return u_framebuffers;
}
The above method creates an array of framebuffers and then re-initializes each framebuffer as a unique framebuffer before returning them.
I tried doing the same with command buffers:
vector<vk::UniqueCommandBuffer> CreateCommandBuffersUnique(vk::SurfaceKHR &surface,
vk::PhysicalDevice &phys_device, vk::Device &device, vk::RenderPass &render_pass,
vk::Pipeline &graphics_pipeline, vk::CommandPool &command_pool,
vector<vk::Framebuffer> &framebuffers)
{
auto command_buffers = CreateCommandBuffers(surface, phys_device, device, render_pass,
graphics_pipeline, command_pool, framebuffers);
vector<vk::UniqueCommandBuffer> u_command_buffers;
for(auto &cb : command_buffers)
u_command_buffers.push_back(vk::UniqueCommandBuffer(cb, device));
return u_command_buffers;
}
The above technically works, but upon program termination the validation layers complain about a mistake in resource deallocation:
validation layer: vkFreeCommandBuffers: required parameter commandPool specified as VK_NULL_HANDLE
UNASSIGNED-GeneralParameterError-RequiredParameter
4096
validation layer: Invalid CommandPool Object 0x0. The Vulkan spec states: commandPool must be a valid VkCommandPool handle (https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VUID-vkFreeCommandBuffers-commandPool-parameter)
VUID-vkFreeCommandBuffers-commandPool-parameter
This happens because the command pool field of the unique handle is not set properly:
(gdb) p t_command_buffers[0]
$6 = {<vk::PoolFree<vk::Device, vk::CommandPool, vk::DispatchLoaderStatic>> = {m_owner = {m_device = 0x555555ec14e0}, m_pool = {m_commandPool = 0x0},
m_dispatch = 0x7fffffffe2f7}, m_value = {m_commandBuffer = 0x555555fe6390}}
I checked, and although there is a commandBuffer.getPool(), there is no setPool().
Any suggestions on properly setting the field?
You're not including the source or the signature of the CreateCommandBuffers function you're calling internally from CreateCommandBuffersUnique. However, let's assume it returns a std::vector<vk::CommandBuffer>.
It looks like you're then looping over those and wrapping them manually in vk::UniqueCommandBuffer instances. However, you're passing the cb (assumed here to be a vk::CommandBuffer) and a vk::Device to the UniqueCommandBuffer constructor here
u_command_buffers.push_back(vk::UniqueCommandBuffer(cb, device));
I'm actually not clear on how that even compiles, unless vk::Device has an operator ()(). Regardless, the second parameter to a vk::UniqueCommandBuffer should be a deleter object.
So it turns out that all of the UniqueXXX types derive from UniqueHandle<>, which in turn derives from UniqueHandleTraits<...> which is specialized for each type, in order to give it a default deleter type. Most of the UniqueHandleTraits<...> specializations use an ObjectDestroy<...> template because most objects only need their creating Device to destroy them. However, the UniqueHandleTraits<CommandBuffer> and UniqueHandleTraits<DescriptorSet> specializations use PoolFree<...> for their deleters.
Therefore, your use of
vk::UniqueCommandBuffer(cb, device));
implicitly turns into
vk::UniqueCommandBuffer(cb,
vk::PoolFree<Device, CommandPool,Dispatch> { device, {}, {} }
)
That {} after device is where the pool is supposed to be specified.
See the corresponding code from vk::Device::allocateCommandBuffersUnique
PoolFree<Device,CommandPool,Dispatch> deleter( *this, allocateInfo.commandPool, d );
for ( size_t i=0 ; i<allocateInfo.commandBufferCount ; i++ )
{
commandBuffers.push_back( UniqueCommandBuffer( buffer[i], deleter ) );
}
To fix your code you either need to use vk::Device::allocateCommandBuffersUnique or replicate its behavior: specifically, create a deleter object or lambda and pass it as the second parameter to the UniqueCommandBuffer ctor.
Based on my examination of the UniqueHandleTraits, it looks like you'd be able to fix your code just by changing this line:
u_command_buffers.push_back(vk::UniqueCommandBuffer(cb, device));
to
u_command_buffers.push_back(vk::UniqueCommandBuffer(cb, { device, command_pool }));
Playing around, I have found a solution that solves the issue, although I would prefer doing things more like the framebuffer example I provided:
vector<vk::UniqueCommandBuffer> CreateCommandBuffersUnique(vk::SurfaceKHR &surface,
vk::PhysicalDevice &phys_device, vk::Device &device, vk::RenderPass &render_pass,
vk::Pipeline &graphics_pipeline, vk::CommandPool &command_pool,
vector<vk::Framebuffer> &framebuffers)
{
vk::CommandBufferAllocateInfo alloc_info(command_pool,
vk::CommandBufferLevel::ePrimary, framebuffers.size());
auto[result, u_command_buffers] = device.allocateCommandBuffersUnique(alloc_info);
if (result != vk::Result::eSuccess)
Log::RecordLogError("Failed to allocate command buffers!");
auto command_buffers = CreateCommandBuffers(surface, phys_device, device, render_pass,
graphics_pipeline, command_pool, framebuffers);
for(auto &cb : command_buffers)
{
u_command_buffers[&cb - &command_buffers[0]].reset(cb);
}
return std::move(u_command_buffers);
}
I'm picking apart some C++ Python wrapper code that allows the consumer to construct custom old style and new style Python classes from C++.
The original code comes from PyCXX, with old and new style classes here and here. I have however rewritten the code substantially, and in this question I will reference my own code, as it allows me to present the situation in the greatest clarity that I am able. I think there would be very few individuals capable of understanding the original code without several days of scrutiny... For me it has taken weeks and I'm still not clear on it.
The old style simply derives from PyObject,
template<typename FinalClass>
class ExtObj_old : public ExtObjBase<FinalClass>
// ^ which : ExtObjBase_noTemplate : PyObject
{
public:
// forwarding function to mitigate awkwardness retrieving static method
// from base type that is incomplete due to templating
static TypeObject& typeobject() { return ExtObjBase<FinalClass>::typeobject(); }
static void one_time_setup()
{
typeobject().set_tp_dealloc( [](PyObject* t) { delete (FinalClass*)(t); } );
typeobject().supportGetattr(); // every object must support getattr
FinalClass::setup();
typeobject().readyType();
}
// every object needs getattr implemented to support methods
Object getattr( const char* name ) override { return getattr_methods(name); }
// ^ MARKER1
protected:
explicit ExtObj_old()
{
PyObject_Init( this, typeobject().type_object() ); // MARKER2
}
When one_time_setup() is called, it forces (by accessing base class typeobject()) creation of the associated PyTypeObject for this new type.
Later when an instance is constructed, it uses PyObject_Init
So far so good.
But the new style class uses much more complicated machinery. I suspect this is related to the fact that new style classes allow derivation.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does? i.e. Just type convert from the PyObject base type? And seeing as it doesn't do that, does this mean it is making no use of its PyObject base type?
This is a huge question, and I will keep amending the post until I'm satisfied it represents the issue well. It isn't a good fit for SO's format, I'm sorry about that. However, some world-class engineers frequent this site (one of my previous questions was answered by the lead developer of GCC for example), and I value the opportunity to appeal to their expertise. So please don't be too hasty to vote to close.
The new style class's one-time setup looks like this:
template<typename FinalClass>
class ExtObj_new : public ExtObjBase<FinalClass>
{
private:
PythonClassInstance* m_class_instance;
public:
static void one_time_setup()
{
TypeObject& typeobject{ ExtObjBase<FinalClass>::typeobject() };
// these three functions are listed below
typeobject.set_tp_new( extension_object_new );
typeobject.set_tp_init( extension_object_init );
typeobject.set_tp_dealloc( extension_object_deallocator );
// this should be named supportInheritance, or supportUseAsBaseType
// old style class does not allow this
typeobject.supportClass(); // does: table->tp_flags |= Py_TPFLAGS_BASETYPE
typeobject.supportGetattro(); // always support get and set attr
typeobject.supportSetattro();
FinalClass::setup();
// add our methods to the extension type's method table
{ ... typeobject.set_methods( /* ... */); }
typeobject.readyType();
}
protected:
explicit ExtObj_new( PythonClassInstance* self, Object& args, Object& kwds )
: m_class_instance{self}
{ }
So the new style uses a custom PythonClassInstance structure:
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase_noTemplate* m_pycxx_object;
}
PyObject_HEAD, if I dig into Python's object.h, is just a macro for PyObject ob_base; -- no further complications, like #if #else. So I don't see why it can't simply be:
struct PythonClassInstance
{
PyObject ob_base;
ExtObjBase_noTemplate* m_pycxx_object;
}
or even:
struct PythonClassInstance : PyObject
{
ExtObjBase_noTemplate* m_pycxx_object;
}
Anyway, it seems that its purpose is to tag a pointer onto the end of a PyObject. This is because the Python runtime will often call functions we have placed in its function table, passing as the first parameter the PyObject responsible for the call. The extra pointer allows us to retrieve the associated C++ object.
But we also need to do that for the old-style class.
Here is the function responsible for doing that:
ExtObjBase_noTemplate* getExtObjBase( PyObject* pyob )
{
if( pyob->ob_type->tp_flags & Py_TPFLAGS_BASETYPE )
{
/*
New style class uses a PythonClassInstance to tag on an additional
pointer onto the end of the PyObject
The old style class just seems to typecast the pointer back up
to ExtObjBase_noTemplate
ExtObjBase_noTemplate does indeed derive from PyObject
So it should be possible to perform this typecast
Which begs the question, why on earth does the new style class feel
the need to do something different?
This looks like a really nice way to solve the problem
*/
PythonClassInstance* instance = reinterpret_cast<PythonClassInstance*>(pyob);
return instance->m_pycxx_object;
}
else
return static_cast<ExtObjBase_noTemplate*>( pyob );
}
My comment articulates my confusion.
And here, for completeness is us inserting a lambda-trampoline into the PyTypeObject's function pointer table, so that Python runtime can trigger it:
table->tp_setattro = [] (PyObject* self, PyObject* name, PyObject* val) -> int
{
try {
ExtObjBase_noTemplate* p = getExtObjBase( self );
return ( p -> setattro(Object{name}, Object{val}) );
}
catch( Py::Exception& ) { /* indicate error */
return -1;
}
};
(In this demonstration I'm using tp_setattro, note that there are about 30 other slots, which you can see if you look at the doc for PyTypeObject)
(in fact the major reason for working this way is that we can try{}catch{} around every trampoline. This saves the consumer from having to code repetitive error trapping.)
So, we pull out the "base type for the associated C++ object" and call its virtual setattro (just using setattro as an example here). A derived class will have overridden setattro, and this override will get called.
The old-style class provides such an override, which I've labelled MARKER1 -- it is in the top listing for this question.
The only thing I can think of is that maybe different maintainers have used different techniques. But is there some more compelling reason why old and new style classes require different architectures?
PS for reference, I should include the following methods from new style class:
static PyObject* extension_object_new( PyTypeObject* subtype, PyObject* args, PyObject* kwds )
{
PyObject* pyob = subtype->tp_alloc(subtype,0);
PythonClassInstance* o = reinterpret_cast<PythonClassInstance *>( pyob );
o->m_pycxx_object = nullptr;
return pyob;
}
^ to me, this looks absolutely wrong.
It appears to allocate memory, re-cast it to some structure that might exceed the amount allocated, and then write a null right at the end of it.
I'm surprised it hasn't caused any crashes.
I can't see any indication anywhere in the source code that these 4 bytes are owned.
static int extension_object_init( PyObject* _self, PyObject* _args, PyObject* _kwds )
{
try
{
Object args{_args};
Object kwds{_kwds};
PythonClassInstance* self{ reinterpret_cast<PythonClassInstance*>(_self) };
if( self->m_pycxx_object )
self->m_pycxx_object->reinit( args, kwds );
else
// NOTE: observe this is where we invoke the constructor, but indirectly (i.e. through final)
self->m_pycxx_object = new FinalClass{ self, args, kwds };
}
catch( Exception & )
{
return -1;
}
return 0;
}
^ note that there is no implementation for reinit, other than the default
virtual void reinit ( Object& args , Object& kwds ) {
throw RuntimeError( "Must not call __init__ twice on this class" );
}
static void extension_object_deallocator( PyObject* _self )
{
PythonClassInstance* self{ reinterpret_cast< PythonClassInstance* >(_self) };
delete self->m_pycxx_object;
_self->ob_type->tp_free( _self );
}
EDIT: I will hazard a guess, thanks to insight from Yhg1s on the IRC channel.
Maybe it is because when you create a new old-style class, it is guaranteed to overlap a PyObject structure exactly.
Hence it is safe to derive from PyObject, and pass a pointer to the underlying PyObject into Python, which is what the old-style class does (MARKER2)
On the other hand, new style class creates a {PyObject + maybe something else} object.
i.e. It wouldn't be safe to do the same trick, as Python runtime would end up writing past the end of the base class allocation (which is only a PyObject).
Because of this, we need to get Python to allocate for the class, and return us a pointer which we store.
Because we are now no longer making use of the PyObject base-class for this storage, we cannot use the convenient trick of typecasting back to retrieve the associated C++ object.
Which means that we need to tag on an extra sizeof(void*) bytes to the end of the PyObject that actually does get allocated, and use this to point to our associated C++ object instance.
However, there is some contradiction here.
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase_noTemplate* m_pycxx_object;
}
^ if this is indeed the structure that accomplishes the above, then it is saying that the new style class instance is indeed fitting exactly over a PyObject, i.e. It is not overlapping into the m_pycxx_object.
And if this is the case, then surely this whole process is unnecessary.
EDIT: here are some links that are helping me learn the necessary ground work:
http://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence
http://realmike.org/blog/2010/07/18/introduction-to-new-style-classes-in-python
Create an object using Python's C API
to me, this looks absolutely wrong. It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this. I'm surprised it hasn't caused any crashes. I can't see any indication anywhere in the source code that these 4 bytes are owned
PyCXX does allocate enough memory, but it does so by accident. This appears to be a bug in PyCXX.
The amount of memory Python allocates for the object is determined by the first call to the following static member function of PythonClass<T>:
static PythonType &behaviors()
{
...
p = new PythonType( sizeof( T ), 0, default_name );
...
}
The constructor of PythonType sets the tp_basicsize of the Python type object to sizeof(T). This way, when Python allocates an object, it knows to allocate at least sizeof(T) bytes. It works because sizeof(T) turns out to be larger than sizeof(PythonClassInstance) (T is derived from PythonClass<T>, which derives from PythonExtensionBase, which is large enough).
However, it misses the point: it should actually allocate only sizeof(PythonClassInstance). This appears to be a bug in PyCXX, in that it allocates too much, rather than too little, space for storing a PythonClassInstance object.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does?
Here's my theory why new style classes are different from the old style classes in PyCXX.
Before Python 2.2, where new style classes were introduced, there was no tp_init member in the type object. Instead, you needed to write a factory function that would construct the object. This is how PythonExtension<T> is supposed to work: the factory function converts the Python arguments to C++ arguments, asks Python to allocate the memory, and then calls the constructor using placement new.
Python 2.2 added the new style classes and the tp_init member. Python first creates the object and then calls the tp_init method. Keeping the old way would have required that the objects would first have a dummy constructor that creates an "empty" object (e.g. initializes all members to null) and then when tp_init is called, would have had an additional initialization stage. This makes the code uglier.
It seems that the author of PyCXX wanted to avoid that. PyCXX works by first creating a dummy PythonClassInstance object and then when tp_init is called, creates the actual PythonClass<T> object using its constructor.
... does this mean it is making no use of its PyObject base type?
This appears to be correct; the PyObject base class does not seem to be used anywhere. All the interesting methods of PythonExtensionBase use the virtual self() method, which returns m_class_instance, ignoring the PyObject base class completely.
My guess (only a guess, though) is that PythonClass<T> was added to an existing system, and it seemed easier to just derive from PythonExtensionBase instead of cleaning up the code.
I am writing a library in C++ but want it to have a C API, which should also be thread safe. One thing the API needs to do is pass handles (e.g. a structure containing a reference or pointer) of objects created within the library back and forth. These objects need to be destroyed at some point, and any handles to such an object still in existence would then become invalid.
EDIT: We cannot assume that each handle is only used within a single client thread. In particular, I want to handle the case where two client threads access the same resource simultaneously, one trying to destroy it while the other tries to modify it.
There are two paradigms for dealing with this. One is the smart pointer paradigm, for example boost::shared_ptr or std::shared_ptr, which make sure the object is only destroyed when there are no more references to it. However, as I understand it, such pointers aren't possible to implement in C (as it doesn't support constructors and destructors), so this approach won't work in my case. I don't want to rely on the user to call a "release" function for every instance of a handle they obtain, since they inevitably won't.
Another paradigm is to simply destroy the object within the library, and have any subsequent function calls which pass a handle to that object as an input simply return an error code. My question is what techniques or libraries are available to implement this approach, specifically in a multi-threaded application?
EDIT:
The library should of course handle all the allocation and deallocation of internal memory itself. Smart pointers may also be used within the library; they may not be passed across the API, however.
EDIT:
More detail on how handles might be used on the client side: the client might have two threads, one of which creates an object. That thread might pass the handle to the second thread, or the second thread might get it from the library using a "find_object" function. The second thread might then continuously update the object, but while that is going on the first thread might destroy the object, making all handles to it invalid.
I appreciate rough suggestions of an approach - I have come up with some of these myself too. However, the nitty gritty details of things like how a C++ class instance is retrieved and locked, given a handle, while other threads may be attempting to destroy that object, are non-trivial, so I'm really after answers like "I have done the following and it works without fail." or, even better, "This library implements what you're after sensibly and safely".
IMHO handles (e.g. simple integer numbers) kept internally as key values in a map of smart pointers might be a viable solution.
Though you'll need a mechanism that guarantees a handle freed by a particular client won't destroy the map entry as long as the handle is still in use by other clients in a multithreaded environment.
UPDATE:
At least #user1610015's answer isn't such a bad idea (if you're working in a COM-enabled environment). In any case you'll need to keep track of your internally managed class instances with some reference counting.
I'm not so experienced with C++11's or Boost's smart pointer features, or with how it's possible to intercept or override the reference counting mechanism in them. But the smart pointer considerations in e.g. Alexandrescu's Loki library may let you implement an appropriate policy for how the reference counting is handled and which interfaces can access it.
It should be possible with Alexandrescu's SmartPtr to provide a ref-counting mechanism that supports access via a C API and the C++ internal implementation concurrently and thread-safely.
One thing that the API needs to do is to pass back and forth handles
Ok so far
(e.g. a structure containing a reference or pointer)
Why? A "handle" is just a way to identify an object. Doesn't necessarily mean it has to hold the reference or pointer.
One is the smart pointer paradigm, for example boost::shared_ptr or std::shared_ptr, which make sure the object is only destroyed when there are no more references to it.
Sure, a
map<int, boost::shared_ptr<my_object>>
might work fine here if you want to use it for your memory deallocation mechanism.
simply destroy the object within the library,
This can coexist with smart pointers; it's not one or the other.
have any subsequent function calls which pass a handle to that object as an input simply return an error code.
Sure, sounds good.
If your library is responsible for allocating memory, then it should be responsible for deallocating memory.
I would return simple integer "handles" from the library's GetNewObject() method.
Your library needs a map of handles to internal objects. No one outside the library should see the objects from the C interface.
All the library methods should take a handle as their first parameter.
For the multi-threaded stuff, do two threads need to access the same object? If so, you'll need to put in some sort of locking that occurs when a C API function is entered, and released before it leaves. You will have to make a decision if you want the code outside the library to know about this locking (you probably don't), a C function that calls the library function will probably just want to get the return value and not worry about the lock/unlock.
So your library needs:
an interface to allocate and deallocate objects, viewed as handles by the outside
an interface to do stuff given the handles.
EDIT: MORE INFO
Within the library I would use a Factory pattern to create new objects. The Factory should hand out a shared_ptr after the object allocation. This way everything else in the library just uses shared_ptr, and the cleanup will be fairly automatic (i.e. the factory doesn't have to store a list of what is created to remember to clean up, and no one has to call delete explicitly). Store the shared_ptr in the map with the handle. You'll probably need some sort of static counter along with a GetNextHandle() function to get the next available handle and to deal with wrap around (depending on how many objects are created and destroyed within the lifetime of the running program).
Next, put your shared pointer into a Proxy. The proxy should be very lightweight, and you can have many Proxy objects per actual object. Each proxy will hold a private shared_ptr and whatever thread/mutex object you choose to use (you haven't given any information about this, so it's hard to be any more specific). When a Proxy is created, it should acquire the mutex, and release it on destruction (i.e. RAII for releasing the lock).
You haven't included any information on how to determine if you want to create a new object or find an existing object, and how two different threads would "find" the same object. However, lets assume that you have a GetObject() with enough parameters to uniquely identify each object and return the handle from the map, if the objects exists.
In this case, each of your visible extern C library functions would accept an object handle and:
Create a new Proxy for the given handle. In the Proxy constructor, the Proxy would look in the map to find the handle, if it doesn't exist, ask the Factory to create one (or return an error, your choice here). The Proxy would then acquire the lock. Your function would then get the pointer from the Proxy and use it. When the function exits, the Proxy goes out of scope, releases the lock, and decrements the reference counter.
If two functions are running in different threads, as long as a Proxy exists in one of the functions, the object will still exist. The other function could ask the library to delete the object which would remove the reference from the map. Once all the other functions with active Proxy objects finish, the final shared_ptr will go out of scope and the object will be deleted.
You can do most of this generically with templates, or write concrete classes.
EDIT: MORE INFO
The Proxy will be a small class. It will have a shared_ptr and a lock. You would instantiate a Proxy within the scope of the extern C function called by the client (note, this is actually a C++ function with all the benefits, such as being able to use C++ classes). The Proxy is small and should go on the stack (do not new and delete it; that's more trouble than it's worth. Just make a scoped variable and let C++ do the work for you). The Proxy would use the RAII pattern to get a copy of the shared_ptr (which increments the reference count) and acquire the lock at construction. When the Proxy goes out of scope, its shared_ptr is destroyed, decrementing the reference count, and the Proxy destructor releases the lock. BTW, you may want to think about blocking and how you want your thread mutexes to work. I don't know enough about your specific mutex implementation to suggest anything yet.
The map will contain the "master" shared_ptr that all others are copied from. However, this is flexible and decoupled because once a Proxy gets a shared_ptr from the map, it doesn't have to worry about "giving it back". The shared_ptr in the map can be removed (i.e. the object no longer "exists" to the Factory), but there can still be Proxy classes that have a shared_ptr, so the actual object will still exist as long as something is using it.
When code uses handles, it is almost always responsible for calling a dispose-handle function, and operating on a disposed handle is then illegal.
Handles are usually either pointers to a forward-declared struct with no body, or pointers to a struct with a preamble field or two (maybe to help debugging on the C side) that is otherwise incomplete. Creation and destruction of them occur inside the API.
Inside the API you have a full view of the contents of the handle, which need not be a plain C struct -- it can contain unique_ptrs or whatever. You will have to delete the handle manually, but that is inevitable.
As noted below, another possible handle is a GUID, with the API keeping a map from GUID to data. This is slower, but the space of GUIDs is practically infinite, so you can detect use of erased handles and return an error. Note that failure to return a handle leaks the resource, but this eliminates dangling-pointer segfaults at a modest runtime cost.
Another way is to expose a COM API. This has the benefit that it's object-oriented, unlike a C API. But it still can be used from C. It would look like this:
C++:
// Instantiation:
ISomeObject* pObject;
HRESULT hr = CoCreateInstance(..., IID_PPV_ARGS(&pObject));
// Use:
hr = pObject->SomeMethod(...);
// Cleanup:
pObject->Release();
C:
// Instantiation:
ISomeObject* pObject;
HRESULT hr = CoCreateInstance(..., IID_PPV_ARGS(&pObject));
// Use:
hr = (*pObject->lpVtbl->SomeMethod)(pObject, ...);
// Cleanup:
(*pObject->lpVtbl->Release)(pObject);
Also, if the client is C++, it can use a COM smart pointer like ATL's CComPtr to automate memory management. So the C++ code could be turned into:
// Instantiation:
CComPtr<ISomeObject> pSomeObject;
HRESULT hr = pSomeObject.CoCreateInstance(...);
// Use:
hr = pSomeObject->SomeMethod(...);
// Cleanup is done automatically at the end of the current scope
Perhaps I have too much time on my hands... but I've thought about this a few times and decided to just go ahead and implement it. C++-callable, no external libs. Totally reinvented the wheel, just for fun (if you can call coding on a Sunday fun).
Note, synchronization is not included here, because I don't know what OS you are on...
SmartPointers.h:
#ifndef SMARTPOINTER_H
#define SMARTPOINTER_H
#ifdef __cplusplus
extern "C" {
#endif
#ifndef __cplusplus
#define bool int
#define true (1 == 1)
#define false (1 == 0)
#endif
// Forward declarations
struct tSmartPtr;
typedef struct tSmartPtr SmartPtr;
struct tSmartPtrRef;
typedef struct tSmartPtrRef SmartPtrRef;
// Type used to describe the object referenced.
typedef void * RefObjectPtr;
// Type used to describe the object that owns a reference.
typedef void * OwnerPtr;
// "Virtual" destructor, called when all references are freed.
typedef void (*ObjectDestructorFunctionPtr)(RefObjectPtr pObjectToDestruct);
// Create a smart pointer to the object pObjectToReference, and pass a destructor that knows how to delete the object.
SmartPtr *SmartPtrCreate( RefObjectPtr pObjectToReference, ObjectDestructorFunctionPtr Destructor );
// Make a new reference to the object, pass in a pointer to the object that will own the reference. Returns a new object reference.
SmartPtrRef *SmartPtrMakeRef( SmartPtr *pSmartPtr, OwnerPtr pReferenceOwner );
// Remove a reference to an object, pass in a pointer to the object that owns the reference. If the last reference is removed, the object destructor is called.
bool SmartPtrRemoveRef( SmartPtr *pSmartPtr, OwnerPtr pReferenceOwner );
// Remove a reference via a pointer to the smart reference itself.
// Calls the destructor if all references are removed.
// Does SmartPtrRemoveRef() using internal pointers, so call either this or SmartPtrRemoveRef(), not both.
bool SmartPtrRefRemoveRef( SmartPtrRef *pRef );
// Get the pointer to the object that the SmartPointer points to.
void *SmartPtrRefGetObjectPtr( SmartPtrRef *pRef );
#ifdef __cplusplus
}
#endif
#endif // #ifndef SMARTPOINTER_H
SmartPointers.c:
#include "SmartPointers.h"
#include <string.h>
#include <stdlib.h>
#include <assert.h>
typedef struct tLinkedListNode {
struct tLinkedListNode *pNext;
} LinkedListNode;
typedef struct tLinkedList {
LinkedListNode dummyNode;
} LinkedList;
struct tSmartPtrRef {
LinkedListNode listNode;
OwnerPtr pReferenceOwner;
RefObjectPtr pObjectReferenced;
struct tSmartPtr *pSmartPtr;
};
struct tSmartPtr {
OwnerPtr pObjectRef;
ObjectDestructorFunctionPtr ObjectDestructorFnPtr;
LinkedList refList;
};
// Initialize singly linked list
static void LinkedListInit( LinkedList *pList )
{
pList->dummyNode.pNext = &pList->dummyNode;
}
// Add a node to the list
static void LinkedListAddNode( LinkedList *pList, LinkedListNode *pNode )
{
pNode->pNext = pList->dummyNode.pNext;
pList->dummyNode.pNext = pNode;
}
// Remove a node from the list
static bool LinkedListRemoveNode( LinkedList *pList, LinkedListNode *pNode )
{
bool removed = false;
LinkedListNode *pPrev = &pList->dummyNode;
while (pPrev->pNext != &pList->dummyNode) {
if (pPrev->pNext == pNode) {
pPrev->pNext = pNode->pNext;
removed = true;
break;
}
pPrev = pPrev->pNext;
}
return removed;
}
// Return true if list is empty.
static bool LinkedListIsEmpty( LinkedList *pList )
{
return (pList->dummyNode.pNext == &pList->dummyNode);
}
// Find a reference by pReferenceOwner
static SmartPtrRef * SmartPtrFindRef( SmartPtr *pSmartPtr, OwnerPtr pReferenceOwner )
{
SmartPtrRef *pFoundNode = NULL;
LinkedList * const pList = &pSmartPtr->refList;
LinkedListNode *pIter = pList->dummyNode.pNext;
while ((pIter != &pList->dummyNode) && (NULL == pFoundNode)) {
SmartPtrRef *pCmpNode = (SmartPtrRef *)pIter;
if (pCmpNode->pReferenceOwner == pReferenceOwner) {
pFoundNode = pCmpNode;
}
pIter = pIter->pNext;
}
return pFoundNode;
}
// Commented in header.
SmartPtrRef *SmartPtrMakeRef( SmartPtr *pSmartPtr, OwnerPtr pReferenceOwner )
{
// TODO: Synchronization here!
SmartPtrRef *pRef = (SmartPtrRef *)malloc(sizeof(SmartPtrRef) );
LinkedListAddNode( &pSmartPtr->refList, &pRef->listNode );
pRef->pReferenceOwner = pReferenceOwner;
pRef->pObjectReferenced = pSmartPtr->pObjectRef;
pRef->pSmartPtr = pSmartPtr;
return pRef;
}
// Commented in header.
bool SmartPtrRemoveRef( SmartPtr *pSmartPtr, OwnerPtr pReferenceOwner )
{
// TODO: Synchronization here!
SmartPtrRef *pRef = SmartPtrFindRef( pSmartPtr, pReferenceOwner );
if (NULL != pRef) {
bool removed = LinkedListRemoveNode( &pSmartPtr->refList, &pRef->listNode );
assert( removed ); /* keep the removal outside assert(): its side effect would vanish under NDEBUG */
pRef->pReferenceOwner = NULL;
pRef->pObjectReferenced = NULL;
free( pRef );
if (LinkedListIsEmpty( &pSmartPtr->refList ) ) {
pSmartPtr->ObjectDestructorFnPtr( pSmartPtr->pObjectRef );
}
}
return (NULL != pRef);
}
// Commented in header.
bool SmartPtrRefRemoveRef( SmartPtrRef *pRef )
{
return SmartPtrRemoveRef( pRef->pSmartPtr, pRef->pReferenceOwner );
}
// Commented in header.
void *SmartPtrRefGetObjectPtr( SmartPtrRef *pRef )
{
return pRef->pObjectReferenced;
}
// Commented in header.
SmartPtr *SmartPtrCreate( void *pObjectToReference, ObjectDestructorFunctionPtr Destructor )
{
SmartPtr *pThis = (SmartPtr *)malloc( sizeof( SmartPtr ) );
memset( pThis, 0, sizeof( SmartPtr ) );
LinkedListInit( &pThis->refList );
pThis->ObjectDestructorFnPtr = Destructor;
pThis->pObjectRef = pObjectToReference;
return pThis;
}
And a test program (main.cpp)
// SmartPtrs.cpp : Defines the entry point for the console application.
//
#include "stdafx.h"
#include "SmartPointers.h"
#include <assert.h>
#include <stdlib.h>
#include <string.h>
typedef struct tMyRefObj {
int refs;
SmartPtr *pPointerToMe;
bool deleted;
} MyRefObj;
static bool objDestructed = false;
static MyRefObj *MyObjectGetReference( MyRefObj *pThis, void *pObjectReferencing )
{
// TODO: Synchronization here...
pThis->refs++;
SmartPtrRef * const pRef = SmartPtrMakeRef( pThis->pPointerToMe, pObjectReferencing );
return (MyRefObj *)SmartPtrRefGetObjectPtr( pRef );
}
static void MyObjectRemoveReference( MyRefObj *pThis, void *pObjectReferencing )
{
// TODO: Synchronization here...
pThis->refs--;
bool removed = SmartPtrRemoveRef( pThis->pPointerToMe, pObjectReferencing );
assert( removed ); /* keep the call outside assert() so it still runs under NDEBUG */
}
static void MyObjectDestructorFunction(void *pObjectToDestruct)
{
MyRefObj *pThis = (MyRefObj *)pObjectToDestruct;
assert( pThis->refs == 0 );
free( pThis );
objDestructed = true;
}
static MyRefObj *MyObjectConstructor( void )
{
MyRefObj *pMyRefObj = (MyRefObj *)malloc( sizeof( MyRefObj ) ); /* must match the free() in the destructor; new/free would mismatch */
memset( pMyRefObj, 0, sizeof( MyRefObj ) );
pMyRefObj->pPointerToMe = SmartPtrCreate( pMyRefObj, MyObjectDestructorFunction );
return pMyRefObj;
}
#define ARRSIZE 125
int main(int argc, char* argv[])
{
int i;
// Array of references
MyRefObj *refArray[ARRSIZE];
// Create an object to take references of.
MyRefObj *pNewObj = MyObjectConstructor();
// Create a bunch of references.
for (i = 0; i < ARRSIZE; i++) {
refArray[i] = MyObjectGetReference( pNewObj, &refArray[i] );
}
assert( pNewObj->refs == ARRSIZE );
for (i = 0; i < ARRSIZE; i++) {
MyObjectRemoveReference( pNewObj, &refArray[i] );
refArray[i] = NULL;
}
assert(objDestructed);
return 0;
}
I've recently posted a general question about RAII at SO.
However, I still have some implementation issues with my HANDLE example.
A HANDLE is typedef'd to void * in windows.h. Therefore, the correct shared_ptr definition needs to be
std::tr1::shared_ptr<void> myHandle (INVALID_HANDLE_VALUE, CloseHandle);
Example 1 CreateToolhelp32Snapshot: returns HANDLE and works.
const std::tr1::shared_ptr<void> h
(CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, NULL), CloseHandle);
Since I use void in the definition (what is the correct way?), the problems continue when I try to call further WinAPI functions with this pointer. They work functionally, but they are ugly, and I am sure there must be a better solution.
In the following examples, h is a pointer which was created via the definition at the top.
Example 2 OpenProcessToken: last argument is a PHANDLE. medium ugly with the cast.
OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES | TOKEN_QUERY,
(PHANDLE)&h);
Example 3 Process32First: first argument is a HANDLE. REALLY ugly.
Process32First(*((PHANDLE)&h), &pEntry);
Example 4 simple comparison with a constant HANDLE. REALLY ugly.
if (*((PHANDLE)&h) == INVALID_HANDLE_VALUE) { /* do something */ }
What is the correct way to create a proper shared_ptr for a HANDLE?
Example 1 is OK
Example 2 is wrong. By blindly casting to PHANDLE, the shared_ptr logic is bypassed. It should be something like this instead:
HANDLE h;
OpenProcessToken(...., &h);
shared_ptr<void> safe_h(h, &::CloseHandle);
or, to assign to a pre-existing shared_ptr:
shared_ptr<void> safe_h = ....
{
HANDLE h;
OpenProcessToken(...., &h);
safe_h.reset(h, &::CloseHandle);
}//For extra safety, limit visibility of the naked handle
or, create your own, safe, version of OpenProcessToken that returns a shared handle instead of taking a PHANDLE:
// Using SharedHandle defined at the end of this post
SharedHandle OpenProcess(....)
{
HANDLE h = INVALID_HANDLE_VALUE;
::OpenProcessToken(...., &h);
return SharedHandle(h);
}
Example 3: No need to take these detours. This should be ok:
Process32First(h.get(), ...);
Example 4: Again, no detour:
if (h.get() == INVALID_HANDLE_VALUE){...}
To make things nicer, you could typedef something like:
typedef shared_ptr<void> SharedHandle;
or better yet, if all handles are to be closed with CloseHandle(), create a SharedHandle class wrapping a shared_ptr and automatically providing the right deleter:
// Warning: Not tested. For illustration purposes only
class SharedHandle
{
public:
explicit SharedHandle(HANDLE h) : m_Handle(h, &::CloseHandle){};
HANDLE get()const{return m_Handle.get();}
//Expose other shared_ptr-like methods as needed
//...
private:
shared_ptr<void> m_Handle;
};
Don't bother with shared_ptr for that, use ATL::CHandle.
Here is why:
When you see CHandle you know that it's a RAII wrapper for a handle.
When you see shared_ptr<void> you don't know what it is.
CHandle doesn't make ownership shared (however, in some cases you may want shared ownership).
CHandle is standard for the Windows development stack.
CHandle is more compact than shared_ptr<void> with custom deleter (less typing/reading).
Take a look at the Boost smart pointer programming techniques document, in particular the part on wrapping resource handles with shared_ptr.
Here is my alternative, which is quite nice except that you always need to dereference after .get(), and it requires a functor or lambda:
template<typename HandleType, typename Deleter>
std::shared_ptr<HandleType> make_shared_handle(HandleType _handle, Deleter _dx)
{
return std::shared_ptr<HandleType>(new HandleType(_handle), _dx);
}
then:
auto closeHandleDeleter = [](HANDLE* h) {
::CloseHandle(*h);
delete h;
};
std::shared_ptr<HANDLE> sp = make_shared_handle(a_HANDLE, closeHandleDeleter);
f_that_takes_handle(*sp.get());
what I like most about this is there is no extra work to have access to this:
std::weak_ptr<HANDLE> wp = sp; // Yes. This could make sense in some designs.
and of course, the helper function works with any handle type of the likes.