Python C-API Object Allocation

I want to use the new and delete operators for creating and destroying my objects.
The problem is that Python seems to break this into several stages: tp_new, tp_init and tp_alloc for creation, and tp_del, tp_free and tp_dealloc for destruction. C++, however, just has new, which allocates and fully constructs the object, and delete, which destructs and deallocates it.
Which of the Python tp_* methods do I need to provide, and what must they do?
I also want to be able to create the object directly in C++, e.g. "PyObject *obj = new MyExtensionObject(args);". Will I need to overload the new operator in some way to support this?
I would also like to be able to subclass my extension types in Python. Is there anything special I need to do to support this?
I'm using Python 3.0.1.
EDIT:
OK, tp_init seems to make objects a bit too mutable for what I'm doing (e.g. take a Texture object: changing the contents after creation is fine, but changing fundamental aspects of it such as size, bit depth, etc. will break lots of existing C++ code that assumes those things are fixed). If I don't implement it, will that simply stop people calling __init__ AFTER the object is constructed (or at least ignore the call, like tuple does)? Or should I have some flag that throws an exception or something if tp_init is called more than once on the same object?
Apart from that I think I've got most of the rest sorted.
extern "C"
{
//creation + destruction
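PyObject* global_alloc(PyTypeObject *type, Py_ssize_t items)
{
//tp_alloc must return zeroed memory with the refcount and type already set
char *mem = new char[type->tp_basicsize + items*type->tp_itemsize]();
return PyObject_Init((PyObject*)mem, type);
}
void global_free(void *mem)
{
delete[] (char*)mem;
}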
}
template<class T> class ExtensionType
{
PyTypeObject *t;
ExtensionType()
{
t = new PyTypeObject();//not sure on this one, what is the "correct" way to create an empty type object
static const PyTypeObject init = {PyVarObject_HEAD_INIT(NULL, 0)};
*t = init;//zeroes every slot and sets up the object header
t->tp_basicsize = sizeof(T);
t->tp_itemsize = 0;
t->tp_name = "unknown";
t->tp_alloc = (allocfunc) global_alloc;
t->tp_free = (freefunc) global_free;
t->tp_new = (newfunc) T::obj_new;
t->tp_dealloc = (destructor)T::obj_dealloc;
...
}
...bunch of methods for changing stuff...
PyObject *Finalise()
{
...
}
};
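template <class T> class PyObjectExtension : public PyObject
{
...
//static member functions cannot be declared extern "C", so the linkage spec is dropped
static PyObject* obj_new(PyTypeObject *subtype, PyObject *args, PyObject *kwds)
{
void *mem = (void*)subtype->tp_alloc(subtype, 0);
if (!mem) return NULL;//allocation failed, the error is already set
return (PyObject*)new(mem) T(args, kwds);
}
static void obj_dealloc(PyObject *obj)
{
static_cast<T*>(obj)->~T();//run the C++ destructor without freeing the memory
obj->ob_type->tp_free(obj);//most of the time this is global_free(obj)
}
...
};
class MyObject : public PyObjectExtension<MyObject>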
{
public:
static PyObject* InitType()
{
ExtensionType<MyObject> extType;//no parentheses: "extType()" would declare a function
...sets other stuff...
return extType.Finalise();
}
...
};

The documentation for these is at http://docs.python.org/3.0/c-api/typeobj.html and
http://docs.python.org/3.0/extending/newtypes.html describes how to make your own type.
tp_alloc does the low-level memory allocation for the instance. This is equivalent to malloc(), plus initializing the refcnt to 1. Python has its own allocator, PyType_GenericAlloc, but a type can implement a specialized allocator.
tp_new is the same as Python's __new__. It's usually used for immutable objects where the data is stored in the instance itself, as compared to a pointer to data. For example, strings and tuples store their data in the instance, instead of using a char * or a PyTuple *.
For this case, tp_new figures out how much memory is needed, based on the input parameters, and calls tp_alloc to get the memory, then initializes the essential fields. tp_new does not need to call tp_alloc. It can for example return a cached object.
tp_init is the same as Python's __init__. Most of your initialization should be in this function.
The distinction between __new__ and __init__ is called two-stage initialization, or two-phase initialization.
You say "c++ just has new" but that's not correct. tp_alloc corresponds to a custom arena allocator in C++, __new__ corresponds to a custom type allocator (a factory function), and __init__ is more like the constructor. That last link discusses more about the parallels between C++ and Python style.
Also read http://www.python.org/download/releases/2.2/descrintro/ for details about how __new__ and __init__ interact.
You write that you want to "create the object directly in c++". That's rather difficult because at the least you'll have to convert any Python exceptions that occurred during object instantiation into a C++ exception. You might try looking at Boost.Python for some help with this task. Or you can use a two-phase initialization. ;)
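The edit in the question asks about a flag that rejects a second __init__ call. As a minimal sketch of that idea in plain C++ (no Python headers involved), the hypothetical Texture class below separates a "new" stage from an "init" stage and refuses to run the second stage twice. The class and function names are assumptions made for illustration, not part of the C-API.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for an extension object; not part of the Python C-API.
class Texture {
public:
    // Stage 1 (the role tp_new plays): a valid but not yet initialized object.
    Texture() : width_(0), height_(0), initialized_(false) {}

    // Stage 2 (the role tp_init plays): set the fixed properties exactly once.
    void init(int width, int height) {
        if (initialized_) {
            throw std::logic_error("init called twice on the same Texture");
        }
        width_ = width;
        height_ = height;
        initialized_ = true;
    }

    int width() const { return width_; }
    int height() const { return height_; }

private:
    int width_;
    int height_;
    bool initialized_;
};

// Returns true when a second init call is rejected and the size is unchanged.
bool second_init_is_rejected() {
    Texture t;
    t.init(64, 32);
    assert(t.width() == 64);
    try {
        t.init(128, 128);
    } catch (const std::logic_error&) {
        return t.width() == 64 && t.height() == 32;
    }
    return false;
}
```

In a real extension type the thrown exception would have to be translated into a Python error at the C boundary, which this sketch leaves out.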

I don't know the python APIs at all, but if python splits up allocation and initialization, you should be able to use placement new.
e.g.:
// tp_alloc
void *buffer = new char[sizeof(MyExtensionObject)];
// tp_init or tp_new (not sure what the distinction is there)
new (buffer) MyExtensionObject(args);
return static_cast<MyExtensionObject*>(buffer);
...
// tp_del
myExtensionObject->~MyExtensionObject(); // call dtor
// tp_dealloc (or tp_free? again I don't know the python apis)
delete [] (static_cast<char*>(static_cast<void*>(myExtensionObject)));
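The fragment above can be fleshed out into a small self-contained sketch. It keeps the four steps (raw allocation, construction, destruction, release) as separate functions so that each could be called from a different hook. Widget and the function names are hypothetical, and the mapping to the tp_* slots in the comments is only the rough analogy used in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Hypothetical payload type used only for this illustration.
struct Widget {
    explicit Widget(int v) : value(v) { ++live_count; }
    ~Widget() { --live_count; }
    int value;
    static int live_count;  // number of currently constructed Widgets
};
int Widget::live_count = 0;

// Step 1: raw allocation only (roughly the job of tp_alloc).
void* raw_alloc(std::size_t size) { return ::operator new(size); }

// Step 2: construction in already allocated memory via placement new.
Widget* construct(void* mem, int v) { return new (mem) Widget(v); }

// Step 3: destruction without releasing the memory.
void destroy(Widget* w) { w->~Widget(); }

// Step 4: release of the raw memory (roughly the job of tp_free).
void raw_free(void* mem) { ::operator delete(mem); }

// Runs the four steps in order and returns the value seen while alive.
int lifecycle_demo() {
    void* mem = raw_alloc(sizeof(Widget));
    Widget* w = construct(mem, 7);
    assert(Widget::live_count == 1);
    int seen = w->value;
    destroy(w);
    assert(Widget::live_count == 0);
    raw_free(mem);
    return seen;
}
```

The allocation and release functions must match each other; memory obtained from one allocator should not be handed to a different one for freeing.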

Related

Getting raw pointer from shared_ptr to pass it to function that requires raw

Ok first off I'm very new to C++ so apologies if my understanding is poor. I'll try to explain myself as best I can. I am using a library function that returns a std::shared_ptr<SomeObject>, and I then have a different library function that takes a raw pointer argument (more specifically the node-addon-api Napi::External<T>::New(Napi::Env env, T *data) static function). I want to create a Napi::External object using my std::shared_ptr. What I am currently doing is this:
{
// ...
std::shared_ptr<SomeObject> pSomeObject = something.CreateSomeObject();
auto ext = Napi::External<SomeObject>::New(info.Env(), pSomeObject.get());
auto instance = MyNapiObjectWrapper::Create({ ext });
return instance;
}
But I am worried this will run into memory issues.
My pSomeObject only exists in the current scope, so I imagine what should happen is that after the return, its reference count will drop to 0 and the SomeObject instance it points to will be destroyed, and as such I will have issues with the instance I return, which uses this object. However, I have been able to run this code and call functions on SomeObject from my instance, so I'm thinking maybe my understanding is wrong.
My question is: what should I do when I am given a shared pointer but need to work with a raw pointer because of other third party library requirements? One option that was proposed to me was to make a deep copy of the object and create a pointer to that.
If my understanding on any of this is wrong please correct me, as I said I'm quite new to C++.
============================
Edit:
My original post was missing information about ownership and what exactly this block is. The block is an instance method of my implementation of a Napi::ObjectWrap instance. This instance method needs to return a Napi::Object which will be available to the caller in node.js. I am using Napi::External because I need to pass a subtype of Napi::Value to the constructor New function when creating the Napi::Object I return, and I need the wrapped SomeObject object in the external, which I extract in my MyNapiObjectWrapper constructor like so:
class MyNapiObjectWrapper
{
private:
SomeObject* someObject;
static Napi::FunctionReference constructor; // ignore for now
public:
static void Init(Napi::Env env) {...}
MyNapiObjectWrapper(const Napi::CallbackInfo& info)
{
Napi::Env env = info.Env();
Napi::HandleScope scope(env);
// My original code to match the above example
this->someObject = info[0].As<const Napi::External<SomeObject>>().Data();
}
void DoSomething()
{
this->someObject->DoSomething();
}
};
I have since come to realise I can pass the address of the shared pointer when creating my external and use it as follows
// modified first sample
{
// ...
std::shared_ptr<SomeObject> pSomeObject = something.CreateSomeObject();
auto ext = Napi::External<std::shared_ptr<SomeObject>>::New(info.Env(), &pSomeObject);
auto instance = MyNapiObjectWrapper::Create({ ext });
return instance;
}
// modified second sample
class MyNapiObjectWrapper
{
private:
std::shared_ptr<SomeObject> someObject;
static Napi::FunctionReference constructor; // ignore for now
public:
static void Init(Napi::Env env) {...}
MyNapiObjectWrapper(const Napi::CallbackInfo& info)
{
Napi::Env env = info.Env();
Napi::HandleScope scope(env);
// Copy the shared_ptr out of the external, which adds one owner
this->someObject =
*info[0].As<const Napi::External<std::shared_ptr<SomeObject>>>().Data();
}
void DoSomething()
{
this->someObject->DoSomething();
}
};
So now I am passing a pointer to a shared_ptr to create my Napi::External. My question now is: is this OK? Like I said at the start I'm new to C++, but this seems like a bit of a smell. However, I tested it with some debugging and could see the reference count go up, so I'm thinking I'm in the clear?

Here is the important part of the documentation:
The Napi::External template class implements the ability to create a Napi::Value object with arbitrary C++ data. It is the user's responsibility to manage the memory for the arbitrary C++ data.
So you need to ensure that the object passed as data to Napi::External<T>::New exists until the Napi::External<T> object is destructed.
So the code that you have shown is not correct.
What you could do is to pass a Finalize callback to the New function:
static Napi::External Napi::External::New(napi_env env,
T* data,
Finalizer finalizeCallback);
And use a lambda function as the finalizer. The lambda can hold a copy of the shared pointer through its capture, which keeps the shared pointer alive until the finalizer is called.
std::shared_ptr<SomeObject> pSomeObject = something.CreateSomeObject();
auto ext = Napi::External<SomeObject>::New(
info.Env(),
pSomeObject.get(),
[pSomeObject](Napi::Env /*env*/, SomeObject* /*data*/) {});
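The capture trick does not depend on Napi, so it can be checked in isolation. The sketch below uses a hypothetical FakeExternal class (a stand-in that stores a raw pointer plus a finalizer, not the node-addon-api type) to show that a by-value capture of the shared_ptr keeps the object alive after the local shared_ptr is gone, and releases it once the handle itself is destroyed.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

struct SomeObject {
    int value = 42;
};

// Hypothetical stand-in for an external handle: it stores a raw pointer and a
// finalizer, and runs the finalizer when it is destroyed. Not the Napi API.
class FakeExternal {
public:
    FakeExternal(SomeObject* data, std::function<void(SomeObject*)> finalizer)
        : data_(data), finalizer_(std::move(finalizer)) {}
    ~FakeExternal() { if (finalizer_) finalizer_(data_); }
    FakeExternal(const FakeExternal&) = delete;
    FakeExternal& operator=(const FakeExternal&) = delete;
    SomeObject* Data() const { return data_; }
private:
    SomeObject* data_;
    std::function<void(SomeObject*)> finalizer_;
};

// Builds a handle whose finalizer owns a copy of the shared_ptr.
// The observer lets the caller watch the object's lifetime without owning it.
std::unique_ptr<FakeExternal> make_external(std::weak_ptr<SomeObject>& observer) {
    std::shared_ptr<SomeObject> p = std::make_shared<SomeObject>();
    observer = p;
    // The by-value capture adds one owner that lives as long as the finalizer.
    std::unique_ptr<FakeExternal> ext(
        new FakeExternal(p.get(), [p](SomeObject*) {}));
    assert(p.use_count() == 2);  // the local p plus the captured copy
    return ext;
}   // the local p is released here; the captured copy still owns the object
```

By contrast, storing the address of a local shared_ptr in the handle, as in the modified sample in the question, would leave the handle pointing at a stack variable that no longer exists once the function returns.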

In the V8 javascript engine, how to make a constructor function that re-uses an ObjectTemplate for each instance?

I have working code where I can create as many Point objects as I want, but it re-creates the object template each time the constructor is called, which seems like it's probably wrong.
Local<ObjectTemplate> global_templ = ObjectTemplate::New(isolate);
// make the Point constructor function available to JS
global_templ->Set(v8::String::NewFromUtf8(isolate, "Point"), FunctionTemplate::New(isolate, v8_Point));
and then the constructor itself:
void v8_Point(const v8::FunctionCallbackInfo<v8::Value>& args) {
HandleScope scope(args.GetIsolate());
// this bit should probably be cached somehow
Local<ObjectTemplate> point_template = ObjectTemplate::New(args.GetIsolate());
point_template->SetInternalFieldCount(1);
point_template->SetAccessor(String::NewFromUtf8(args.GetIsolate(), "x"), GetPointX, SetPointX);
point_template->SetAccessor(String::NewFromUtf8(args.GetIsolate(), "y"), GetPointY, SetPointY);
// end section to be cached
Local<Object> obj = point_template->NewInstance();
Point * p = new Point(1,1);
obj->SetInternalField(0, External::New(args.GetIsolate(), p));
args.GetReturnValue().Set(obj);
}
But it seems like I should be able to pass in the point_template object instead of re-creating it each time. I saw that there's a Data() field in args, but that only allows for a Value type and an ObjectTemplate is of type Template, not Value.
Any help on the right way to do this would be greatly appreciated.

I figured it out finally.
In javascript, when you add a function via a FunctionTemplate and then call it as a constructor (e.g. new MyFunction), then in your C++ callback args.This() will be a new object created using the FunctionTemplate's InstanceTemplate object template.
// Everything has to go in a single global template (as I understand)
Local<ObjectTemplate> global_templ = ObjectTemplate::New(isolate);
// create the function template and tell it the callback to use
Local<FunctionTemplate> point_constructor = FunctionTemplate::New(isolate, v8_Point);
// set the internal field count so our actual c++ object can tag along
// with the javascript object so our accessors can use it
point_constructor->InstanceTemplate()->SetInternalFieldCount(1);
// associate getters and setters for the 'x' field on point
point_constructor->InstanceTemplate()->SetAccessor(String::NewFromUtf8(isolate, "x"), GetPointX, SetPointX);
... add any other function and object templates to the global template ...
// add the global template to the context our javascript will run in
Local<Context> x_context = Context::New(isolate, NULL, global_templ);
Then, for the actual function:
void v8_Point(const v8::FunctionCallbackInfo<v8::Value>& args) {
// (just an example of a handy utility function)
// whether or not it was called as "new Point()" or just "Point()"
printf("Is constructor call: %s\n", args.IsConstructCall()?"yes":"no");
// create your c++ object that will follow the javascript object around
// make sure not to make it on the stack or it won't be around later when you need it
Point * p = new Point();
// another handy helper function example
// see how the internal field count is what it was set to earlier
// in the InstanceTemplate
printf("Internal field count: %d\n",args.This()->InternalFieldCount()); // this prints the value '1'
// put the new Point object into the internal field
args.This()->SetInternalField(0, External::New(args.GetIsolate(), p));
// return the new object back to the javascript caller
args.GetReturnValue().Set(args.This());
}
Now, when you write the getter and setter, you have access to your actual c++ object in the body of them:
void GetPointX(Local<String> property,
const PropertyCallbackInfo<Value>& info) {
Local<Object> self = info.Holder();
// This is where we take the actual c++ object that was embedded
// into the javascript object and get it back to a useable c++ object
Local<External> wrap = Local<External>::Cast(self->GetInternalField(0));
void* ptr = wrap->Value();
int value = static_cast<Point*>(ptr)->x_; //x_ is the name of the field in the c++ object
// return the value back to javascript
info.GetReturnValue().Set(value);
}
void SetPointX(Local<String> property, Local<Value> value,
const PropertyCallbackInfo<void>& info) {
Local<Object> self = info.Holder();
// same concept here as in the "getter" above where you get access
// to the actual c++ object and then set the value from javascript
// into the actual c++ object field
Local<External> wrap = Local<External>::Cast(self->GetInternalField(0));
void* ptr = wrap->Value();
static_cast<Point*>(ptr)->x_ = value->Int32Value();
}
Almost all of this came from here: https://developers.google.com/v8/embed?hl=en#accessing-dynamic-variables
except it doesn't talk about the proper way to make your objects in a repeatable fashion.
I figured out how to clean up the c++ object in the internal field, but I don't have time to put the whole answer here. You have to pass in a Global object into your weak callback by creating a hybrid field (a struct works well) on the heap that has both the global object and a pointer to your c++ object. You can then delete your c++ object, call Reset() on your Global and then delete the whole thing. I'll try to add actual code, but may forget.
Here is a good source: https://code.google.com/p/chromium/codesearch#chromium/src/v8/src/d8.cc&l=1064 lines 1400-1441 are what you want. (edit: line numbers seem to be wrong now - maybe the link above has changed?)
Remember, v8 won't garbage collect small amounts of memory, so you may never see it. Also, just because your program ends doesn't mean the GC will run. You can use isolate->AdjustAmountOfExternalAllocatedMemory(length); to tell v8 about the size of the memory you've allocated (it includes this in its calculations about when there's too much memory in use and GC needs to run) and you can use isolate->IdleNotificationDeadline(1); to give the GC a chance to run (though it may choose not to).

Why does PyCXX handle new-style classes in the way it does?

I'm picking apart some C++ Python wrapper code that allows the consumer to construct custom old style and new style Python classes from C++.
The original code comes from PyCXX, with old and new style classes here and here. I have however rewritten the code substantially, and in this question I will reference my own code, as it allows me to present the situation in the greatest clarity that I am able. I think there would be very few individuals capable of understanding the original code without several days of scrutiny... For me it has taken weeks and I'm still not clear on it.
The old style simply derives from PyObject,
template<typename FinalClass>
class ExtObj_old : public ExtObjBase<FinalClass>
// ^ which : ExtObjBase_noTemplate : PyObject
{
public:
// forwarding function to mitigate awkwardness retrieving static method
// from base type that is incomplete due to templating
static TypeObject& typeobject() { return ExtObjBase<FinalClass>::typeobject(); }
static void one_time_setup()
{
typeobject().set_tp_dealloc( [](PyObject* t) { delete (FinalClass*)(t); } );
typeobject().supportGetattr(); // every object must support getattr
FinalClass::setup();
typeobject().readyType();
}
// every object needs getattr implemented to support methods
Object getattr( const char* name ) override { return getattr_methods(name); }
// ^ MARKER1
protected:
explicit ExtObj_old()
{
PyObject_Init( this, typeobject().type_object() ); // MARKER2
}
When one_time_setup() is called, it forces (by accessing base class typeobject()) creation of the associated PyTypeObject for this new type.
Later, when an instance is constructed, it uses PyObject_Init.
So far so good.
But the new style class uses much more complicated machinery. I suspect this is related to the fact that new style classes allow derivation.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does? i.e. Just type convert from the PyObject base type? And seeing as it doesn't do that, does this mean it is making no use of its PyObject base type?
This is a huge question, and I will keep amending the post until I'm satisfied it represents the issue well. It isn't a good fit for SO's format, I'm sorry about that. However, some world-class engineers frequent this site (one of my previous questions was answered by the lead developer of GCC for example), and I value the opportunity to appeal to their expertise. So please don't be too hasty to vote to close.
The new style class's one-time setup looks like this:
template<typename FinalClass>
class ExtObj_new : public ExtObjBase<FinalClass>
{
private:
PythonClassInstance* m_class_instance;
public:
static void one_time_setup()
{
TypeObject& typeobject{ ExtObjBase<FinalClass>::typeobject() };
// these three functions are listed below
typeobject.set_tp_new( extension_object_new );
typeobject.set_tp_init( extension_object_init );
typeobject.set_tp_dealloc( extension_object_deallocator );
// this should be named supportInheritance, or supportUseAsBaseType
// old style class does not allow this
typeobject.supportClass(); // does: table->tp_flags |= Py_TPFLAGS_BASETYPE
typeobject.supportGetattro(); // always support get and set attr
typeobject.supportSetattro();
FinalClass::setup();
// add our methods to the extension type's method table
{ ... typeobject.set_methods( /* ... */); }
typeobject.readyType();
}
protected:
explicit ExtObj_new( PythonClassInstance* self, Object& args, Object& kwds )
: m_class_instance{self}
{ }
So the new style uses a custom PythonClassInstance structure:
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase_noTemplate* m_pycxx_object;
};
PyObject_HEAD, if I dig into Python's object.h, is just a macro for PyObject ob_base; -- no further complications, like #if #else. So I don't see why it can't simply be:
struct PythonClassInstance
{
PyObject ob_base;
ExtObjBase_noTemplate* m_pycxx_object;
};
or even:
struct PythonClassInstance : PyObject
{
ExtObjBase_noTemplate* m_pycxx_object;
};
Anyway, it seems that its purpose is to tag a pointer onto the end of a PyObject. This will be because Python runtime will often trigger functions we have placed in its function table, and the first parameter will be the PyObject responsible for the call. So this allows us to retrieve the associated C++ object.
But we also need to do that for the old-style class.
Here is the function responsible for doing that:
ExtObjBase_noTemplate* getExtObjBase( PyObject* pyob )
{
if( pyob->ob_type->tp_flags & Py_TPFLAGS_BASETYPE )
{
/*
New style class uses a PythonClassInstance to tag on an additional
pointer onto the end of the PyObject
The old style class just seems to typecast the pointer back up
to ExtObjBase_noTemplate
ExtObjBase_noTemplate does indeed derive from PyObject
So it should be possible to perform this typecast
Which begs the question, why on earth does the new style class feel
the need to do something different?
This looks like a really nice way to solve the problem
*/
PythonClassInstance* instance = reinterpret_cast<PythonClassInstance*>(pyob);
return instance->m_pycxx_object;
}
else
return static_cast<ExtObjBase_noTemplate*>( pyob );
}
My comment articulates my confusion.
And here, for completeness is us inserting a lambda-trampoline into the PyTypeObject's function pointer table, so that Python runtime can trigger it:
table->tp_setattro = [] (PyObject* self, PyObject* name, PyObject* val) -> int
{
try {
ExtObjBase_noTemplate* p = getExtObjBase( self );
return ( p -> setattro(Object{name}, Object{val}) );
}
catch( Py::Exception& ) { /* indicate error */
return -1;
}
};
(In this demonstration I'm using tp_setattro, note that there are about 30 other slots, which you can see if you look at the doc for PyTypeObject)
(in fact the major reason for working this way is that we can try{}catch{} around every trampoline. This saves the consumer from having to code repetitive error trapping.)
So, we pull out the "base type for the associated C++ object" and call its virtual setattro (just using setattro as an example here). A derived class will have overridden setattro, and this override will get called.
The old-style class provides such an override, which I've labelled MARKER1 -- it is in the top listing for this question.
The only thing I can think of is that maybe different maintainers have used different techniques. But is there some more compelling reason why old and new style classes require different architecture?
PS for reference, I should include the following methods from new style class:
static PyObject* extension_object_new( PyTypeObject* subtype, PyObject* args, PyObject* kwds )
{
PyObject* pyob = subtype->tp_alloc(subtype,0);
PythonClassInstance* o = reinterpret_cast<PythonClassInstance *>( pyob );
o->m_pycxx_object = nullptr;
return pyob;
}
^ to me, this looks absolutely wrong.
It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this.
I'm surprised it hasn't caused any crashes.
I can't see any indication anywhere in the source code that these 4 bytes are owned.
static int extension_object_init( PyObject* _self, PyObject* _args, PyObject* _kwds )
{
try
{
Object args{_args};
Object kwds{_kwds};
PythonClassInstance* self{ reinterpret_cast<PythonClassInstance*>(_self) };
if( self->m_pycxx_object )
self->m_pycxx_object->reinit( args, kwds );
else
// NOTE: observe this is where we invoke the constructor, but indirectly (i.e. through final)
self->m_pycxx_object = new FinalClass{ self, args, kwds };
}
catch( Exception & )
{
return -1;
}
return 0;
}
^ note that there is no implementation for reinit, other than the default
virtual void reinit ( Object& args , Object& kwds ) {
throw RuntimeError( "Must not call __init__ twice on this class" );
}
static void extension_object_deallocator( PyObject* _self )
{
PythonClassInstance* self{ reinterpret_cast< PythonClassInstance* >(_self) };
delete self->m_pycxx_object;
_self->ob_type->tp_free( _self );
}
EDIT: I will hazard a guess, thanks to insight from Yhg1s on the IRC channel.
Maybe it is because when you create a new old-style class, it is guaranteed it will overlap perfectly a PyObject structure.
Hence it is safe to derive from PyObject, and pass a pointer to the underlying PyObject into Python, which is what the old-style class does (MARKER2)
On the other hand, new style class creates a {PyObject + maybe something else} object.
i.e. It wouldn't be safe to do the same trick, as Python runtime would end up writing past the end of the base class allocation (which is only a PyObject).
Because of this, we need to get Python to allocate for the class, and return us a pointer which we store.
Because we are now no longer making use of the PyObject base-class for this storage, we cannot use the convenient trick of typecasting back to retrieve the associated C++ object.
Which means that we need to tag on an extra sizeof(void*) bytes to the end of the PyObject that actually does get allocated, and use this to point to our associated C++ object instance.
However, there is some contradiction here.
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase_noTemplate* m_pycxx_object;
};
^ if this is indeed the structure that accomplishes the above, then it is saying that the new style class instance is indeed fitting exactly over a PyObject, i.e. It is not overlapping into the m_pycxx_object.
And if this is the case, then surely this whole process is unnecessary.
EDIT: here are some links that are helping me learn the necessary ground work:
http://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence
http://realmike.org/blog/2010/07/18/introduction-to-new-style-classes-in-python
Create an object using Python's C API

to me, this looks absolutely wrong. It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this. I'm surprised it hasn't caused any crashes. I can't see any indication anywhere in the source code that these 4 bytes are owned
PyCXX does allocate enough memory, but it does so by accident. This appears to be a bug in PyCXX.
The amount of memory Python allocates for the object is determined by the first call to the following static member function of PythonClass<T>:
static PythonType &behaviors()
{
...
p = new PythonType( sizeof( T ), 0, default_name );
...
}
The constructor of PythonType sets the tp_basicsize of the python type object to sizeof(T). This way when Python allocates an object it knows to allocate at least sizeof(T) bytes. It works because sizeof(T) turns out to be larger than sizeof(PythonClassInstance) (T is derived from PythonClass<T> which derives from PythonExtensionBase, which is large enough).
However, it misses the point. It should actually allocate only sizeof(PythonClassInstance). This appears to be a bug in PyCXX: it allocates too much, rather than too little, space for storing a PythonClassInstance object.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does?
Here's my theory why new style classes are different from the old style classes in PyCXX.
Before Python 2.2, where new style classes were introduced, there was no tp_init member in the type object. Instead, you needed to write a factory function that would construct the object. This is how PythonExtension<T> is supposed to work - the factory function converts the Python arguments to C++ arguments, asks Python to allocate the memory and then calls the constructor using placement new.
Python 2.2 added the new style classes and the tp_init member. Python first creates the object and then calls the tp_init method. Keeping the old way would have required that the objects would first have a dummy constructor that creates an "empty" object (e.g. initializes all members to null) and then when tp_init is called, would have had an additional initialization stage. This makes the code uglier.
It seems that the author of PyCXX wanted to avoid that. PyCXX works by first creating a dummy PythonClassInstance object and then when tp_init is called, creates the actual PythonClass<T> object using its constructor.
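The scheme described above can be modelled without Python at all. In the sketch below, Header plays the role of PyObject, Instance plays the role of PythonClassInstance, and kBasicSize plays the role of tp_basicsize; all of these names are hypothetical stand-ins chosen for illustration. The "new" stage allocates the instance with a null pointer, the "init" stage creates the real C++ object, and a second "init" is refused, mirroring the three functions quoted in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical stand-ins; none of these are Python or PyCXX names.
struct Header { long refcount; };      // plays the role of PyObject

struct CxxObject {                      // plays the role of the C++ object
    explicit CxxObject(int v) : value(v) {}
    int value;
};

struct Instance {                       // header plus a tagged-on pointer
    Header head;
    CxxObject* cxx;
};

// The size the runtime is told to allocate for each instance. It must cover
// the tagged-on pointer as well as the header.
const std::size_t kBasicSize = sizeof(Instance);

// "new" stage: allocate zeroed memory and leave the C++ object unset.
Header* instance_new() {
    void* mem = std::calloc(1, kBasicSize);
    if (!mem) throw std::bad_alloc();
    Instance* inst = static_cast<Instance*>(mem);
    inst->head.refcount = 1;
    inst->cxx = nullptr;
    return &inst->head;
}

// "init" stage: build the real C++ object on first call, refuse a second call.
void instance_init(Header* h, int v) {
    Instance* inst = reinterpret_cast<Instance*>(h);
    if (inst->cxx) throw std::runtime_error("init called twice");
    inst->cxx = new CxxObject(v);
}

// Recover the C++ object from the header pointer the runtime hands back.
CxxObject* get_cxx(Header* h) {
    return reinterpret_cast<Instance*>(h)->cxx;
}

// "dealloc" stage: destroy the C++ object, then free the instance memory.
void instance_dealloc(Header* h) {
    Instance* inst = reinterpret_cast<Instance*>(h);
    delete inst->cxx;
    std::free(inst);
}
```

The cast from Header* back to Instance* relies on the header being the first member of the instance struct, which is the same layout assumption the PythonClassInstance struct makes.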
... does this mean it is making no use of its PyObject base type?
This appears to be correct, the PyObject base class does not seem to be used anywhere. All the interesting methods of PythonExtensionBase use the virtual self() method, which returns m_class_instance and completely ignore the PyObject base class.
My guess (and it is only a guess) is that PythonClass<T> was added to an existing system and it seemed easier to just derive from PythonExtensionBase instead of cleaning up the code.

Using manual memory management with a reference?

I have a variable which is referenced a lot. It started out as an automatic variable.
Now I decided that in the middle of some code I want to call its dtor to reset its state, so I intend to deallocate and reallocate it. The standard way to do this of course is to call delete on it and make a new one.
Before:
void func() {
ClassName varname;
while (varname.check()/*...*/) { if (varname.function()/*...*/) { /* bunches of code ... */
/*... some more code ... */
}
}
}
Now I want:
void func() {
ClassName varname;
while (varname.check()/*...*/) { if (varname.function()/*...*/) { /* bunches of code ... */
if (key_code[SDLK_r]) { // Pressing R key should reset "varname"!
/* Here I want to dealloc and realloc varname! */
/* But if I declare varname as a ptr on line 2, */
/* line 3 (rest of code) must be refactored. */
}
}
}
}
My first attempt is to go change line 2 to be something like this
ClassName *varnamep = new ClassName();
ClassName& varname = *varnamep;
But I'm not sure if that means I'll be able to call delete on it later and reassign the reference!
delete &varname;
varnamep = new ClassName();
varname = *varnamep; // I assume compiler will error here because I can't reassign a ref.
Can I do this some other way? Or should I just suck it up and do a find-replace for turning varname. into varname->? In this particular case for my actual real situation I will probably implement a member function reset() and not worry about this actual problem. But I would like to know if there is some shortcut to being able to effectively treat references as pointers (or it could turn out that this is absurd nonsense)

Given ClassName varname, you could do this:
varname.~ClassName();
new (&varname) ClassName;
But I wouldn't recommend it. This uses two less-commonly-known features of C++: an explicit destructor call, and placement new. Only use this if it makes a significant difference in performance, as measured by your profiler, and the ClassName constructor can't throw an exception.
If ClassName::operator= does what you need (or you can modify it to do what you need), you can do this:
varname = ClassName();
That is more easily understood than using an explicit destructor call followed by placement-new.
Another common idiom:
ClassName().swap(varname); // called on the temporary, since a temporary cannot bind to a non-const reference parameter
This works if ClassName has an efficient swap method, like standard containers do. This is subtle enough that it probably deserves a comment if you decide to use it.
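The three options can be compared side by side in a small sketch. Counter is a hypothetical class standing in for ClassName. Note that the swap variant is written as Counter().swap(c): with a swap member that takes a non-const reference, the temporary has to be the object the call is made on rather than the argument.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical class standing in for ClassName.
class Counter {
public:
    Counter() : hits_() {}
    void hit() { hits_.push_back(1); }
    std::size_t count() const { return hits_.size(); }
    void swap(Counter& other) { hits_.swap(other.hits_); }
private:
    std::vector<int> hits_;
};

// Option 1: explicit destructor call followed by placement new.
// If the constructor threw, c would be left destroyed, which is why
// this form is the least recommended one.
void reset_in_place(Counter& c) {
    c.~Counter();
    new (&c) Counter;
    assert(c.count() == 0);
}

// Option 2: assign from a freshly constructed temporary.
void reset_by_assignment(Counter& c) {
    c = Counter();
    assert(c.count() == 0);
}

// Option 3: swap with a temporary; the old state dies with the temporary.
void reset_by_swap(Counter& c) {
    Counter().swap(c);
    assert(c.count() == 0);
}
```

All three leave every existing reference to the object valid, because the object's address never changes.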

The standard way is not to delete and create a new instance. Just reassign the variable:
ClassName varname = .... ;
....
if (some condition) {
varname = SomethingElse;
}
and make sure that the copy constructor, assignment operator and destructor correctly deal with resources managed by ClassName.
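The question assumed that assigning to a reference would be rejected by the compiler. It is accepted: assignment through a reference copy-assigns into the object the reference is bound to, and the reference itself is never rebound. A minimal sketch, with State as a hypothetical type:

```cpp
#include <cassert>

// Hypothetical type used only for this illustration.
struct State {
    int value;
    explicit State(int v = 0) : value(v) {}
};

// Shows that assigning through a reference changes the referenced object
// and leaves the reference bound to the same object.
bool reference_assignment_copies() {
    State first(1);
    State second(2);
    State& ref = first;
    ref = second;          // copy-assigns second into first
    second.value = 3;      // has no effect on first
    assert(&ref == &first);
    return first.value == 2;
}
```

This is also why the delete-then-reassign attempt in the question is unsafe: after delete &varname, the statement varname = *varnamep would assign into an object that has already been destroyed.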

How can I emulate constructor and destructor behavior (for particular data types) in C

I have a C (nested) structure that I would like to automagically initialize and destroy in my code.
I am compiling with GCC (4.4.3) on Linux. I am vaguely aware of GCC function attributes constructor and destructor, but the construction/destruction they provide seem to relate to the entire program (i.e. before main() is called etc).
I want to be able to have different init/cleanup funcs for different data types - is this C++ like behaviour something that I can emulate using POC?
I have included the C++ tag because this is really C++ behaviour I am trying to emulate in C.

There's no way to do this automatically, at least not in any portable manner. In C you'd typically have functions that work somewhat like constructors and destructors (they allocate or deallocate memory and initialize or deinitialize fields), except they have to be called explicitly:
typedef struct MyStruct MyStruct; /* members are defined alongside the implementation */
MyStruct *MyStruct_New(void);
void MyStruct_Free(MyStruct *obj);
The language was simply not designed for this and you shouldn't try to force it, imo. If you want to have automatic destruction, you shouldn't be using C.
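Since the question mentions a nested structure, here is a minimal sketch of how explicit create and destroy functions can delegate to the functions of an embedded member. The types Inner and Outer and all four function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical inner type that owns a heap buffer. */
typedef struct {
    char *text;
} Inner;

/* Hypothetical outer type that embeds an Inner by value. */
typedef struct {
    int id;
    Inner inner;
} Outer;

/* "Constructor" for Inner: returns 0 on success, -1 if allocation fails. */
static int Inner_Init(Inner *in, const char *text)
{
    assert(in != NULL && text != NULL);
    size_t n = strlen(text) + 1;
    in->text = malloc(n);
    if (in->text == NULL)
        return -1;
    memcpy(in->text, text, n);
    return 0;
}

/* "Destructor" for Inner: releases the buffer it owns. */
static void Inner_Destroy(Inner *in)
{
    free(in->text);
    in->text = NULL;
}

/* Allocates an Outer and initializes its nested member. */
Outer *Outer_New(int id, const char *text)
{
    Outer *o = malloc(sizeof *o);
    if (o == NULL)
        return NULL;
    o->id = id;
    if (Inner_Init(&o->inner, text) != 0) {
        free(o); /* undo the partial construction */
        return NULL;
    }
    return o;
}

/* Destroys the nested member first, then the Outer itself. */
void Outer_Free(Outer *o)
{
    if (o == NULL)
        return;
    Inner_Destroy(&o->inner);
    free(o);
}
```

Callers pair every Outer_New with exactly one Outer_Free; nothing in the language enforces that pairing, which is the limitation this answer describes.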

#define your way through the problem...
As pointed out by previous authors, there is no automatic way of doing what you are asking, which is sadly kind of obvious since C has no way of doing true OOP.
But a programmer can always hack their way around an obstacle. At the end of this post is a sample hack that works around the problem.
There are ways to clean up the macro provided, though the result won't be as portable.
C99 implementation: http://ideone.com/9XcCt
C89 implementation: http://ideone.com/WYrjU
- C99 implementation
#include <stdio.h>
#include <stdlib.h>
...
#define SCOPIFY(TYPE,NAME, ...) { \
ctor_ ## TYPE(& NAME); \
__VA_ARGS__ \
dtor_ ## TYPE(& NAME); \
} (void)0
...
typedef struct {
int * p;
} Obj;
void
ctor_Obj (Obj* this) {
this->p = malloc (sizeof (int));
*this->p = 123;
fprintf (stderr, "Obj::ctor, (this -> %p)\n", (void*)this);
}
void
dtor_Obj (Obj* this) {
free (this->p);
fprintf (stderr, "Obj::dtor, (this -> %p)\n", (void*)this);
}
...
int
main (int argc, char *argv[])
{
Obj o1, o2;
SCOPIFY (Obj, o1,
fprintf (stderr, " o1.p -> %d\n", *o1.p);
SCOPIFY (Obj, o2,
int a, b;
fprintf (stderr, " o2.p -> %d\n", *o2.p);
(*o1.p) += (*o2.p);
);
fprintf (stderr, " o1.p -> %d\n", *o1.p);
);
return 0;
}
output (http://ideone.com/WYrjU)
Obj::ctor, (this -> 0xbf8f05ac)
o1.p -> 123
Obj::ctor, (this -> 0xbf8f05a8)
o2.p -> 123
Obj::dtor, (this -> 0xbf8f05a8)
o1.p -> 246
Obj::dtor, (this -> 0xbf8f05ac)

From what you write, I figure that you already know how to write init and destroy functions that recursively call their counterparts for the individual parts.
Yes, there is no standard mechanism in C that would allow for something like automatic construction or destruction.
Construction can be somewhat replaced by writing an initializer macro. Designated initializers come in handy for that
#define TOTO_INITIALIZER(TUTU_PARAM, TATA_PARAM) \
{ \
.tata_member = TATA_INITIALIZER(TATA_PARAM), \
.tutu_member = TUTU_INITIALIZER(TUTU_PARAM), \
}
since they make such code robust against reordering of members.
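The initializer macro above can be fleshed out roughly as follows. The struct layouts and the bodies of TATA_INITIALIZER and TUTU_INITIALIZER are assumptions made for illustration; only the shape of TOTO_INITIALIZER comes from the answer. The members of toto are deliberately declared in a different order than the macro lists them.

```c
#include <assert.h>

/* Hypothetical member types. */
typedef struct { int a; int b; } tata;
typedef struct { double x; } tutu;

/* Declared in the opposite order to the initializer below. */
typedef struct {
    tutu tutu_member;
    tata tata_member;
} toto;

/* Assumed per-type initializers; .b is given a fixed default. */
#define TATA_INITIALIZER(A) { .a = (A), .b = 0 }
#define TUTU_INITIALIZER(X) { .x = (X) }

#define TOTO_INITIALIZER(TUTU_PARAM, TATA_PARAM)         \
    {                                                    \
        .tata_member = TATA_INITIALIZER(TATA_PARAM),     \
        .tutu_member = TUTU_INITIALIZER(TUTU_PARAM),     \
    }

/* Builds a toto by value using the macro. */
static toto make_toto(void)
{
    toto t = TOTO_INITIALIZER(1.5, 42);
    assert(t.tata_member.b == 0); /* default supplied by TATA_INITIALIZER */
    return t;
}
```

Because each value is matched to its member by name, swapping the two members in the struct definition does not change the result.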
For destructors there is nothing that can be coupled to a variable or data type. The only approach I know of is scope-based resource management, which in C you can implement through hidden local variables declared in a for statement.
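One possible reading of the for-scope technique is sketched below, assuming C99. The macro WITH_OBJ, the guard variable, and the counter live_objects are hypothetical names; Obj mirrors the struct used in the earlier macro answer. The for statement declares the object, runs the init function once in its initializer, executes the body once, and runs the destroy function in its iteration expression.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int *p; } Obj;

/* Counts objects that have been initialized but not yet destroyed. */
static int live_objects = 0;

static void obj_init(Obj *o)
{
    o->p = malloc(sizeof *o->p);
    if (o->p != NULL)
        *o->p = 123;
    live_objects++;
}

static void obj_destroy(Obj *o)
{
    free(o->p);
    o->p = NULL;
    live_objects--;
}

/* Declares NAME plus a hidden guard pointer. The guard is non-NULL for
 * exactly one pass through the body; the iteration expression then
 * destroys the object and clears the guard so the loop ends. */
#define WITH_OBJ(NAME)                                          \
    for (Obj NAME, *NAME##_guard = (obj_init(&NAME), &NAME);    \
         NAME##_guard != NULL;                                  \
         obj_destroy(&NAME), NAME##_guard = NULL)

static int scoped_demo(void)
{
    int seen = -1;
    WITH_OBJ(o) {
        assert(live_objects == 1); /* o is alive inside the block */
        if (o.p != NULL)
            seen = *o.p;
    }
    /* obj_destroy has already run here. */
    return seen;
}
```

The weak point of this design is that leaving the body with break, return, or goto skips the iteration expression, so the destroy function is not called on those paths.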

There's no default way to have a function automatically called when you create a struct. Here's an example of a creation and initialisation function set for a certain type of struct:
#include <stdio.h>
#include <stdlib.h>
// Simple struct that holds an ID number and a file pointer.
typedef struct
{
int id;
FILE *data;
} Datum;
// Function to create a Datum from a given file.
Datum *create_datum(const char *fname)
{
// Create Datum object.
Datum *d = (Datum*)malloc(sizeof(Datum));
// malloc may return NULL if we're out of memory.
if(d)
{
// Initialise ID to something.
d->id = 0;
// Open filename passed. fopen returns NULL if the file can't be opened.
d->data = fopen(fname, "r");
}
return d;
}
// Function to safely destroy a Datum. This function takes a pointer-pointer so
// that it can set the pointer to NULL after deleting the object. Saves you
// from dangling pointers.
void destroy_datum(Datum **dp)
{
if(!dp)
return;
// Get a plain pointer for convenience
Datum *d = *dp;
if(d)
{
// Close the file, but only if it was opened successfully.
if(d->data)
fclose(d->data);
// Delete the object.
free(d);
// Set the pointer to NULL.
*dp = NULL;
}
}
// Now use these functions:
int main(void)
{
Datum *datum = create_datum("test.txt");
if(datum)
{
// Do some things!
}
destroy_datum(&datum);
// datum is now equal to NULL.
}
Hope that helps! Like Homunculus has said, C isn't a great language if you need to do a lot of this sort of stuff - but sometimes you just want to abstract away the process of creating a struct, as well as cleaning it up. This is especially helpful in modular design, where a module can provide the create_ and destroy_ interface functions, and hide the actual implementation of those.

I did not see the gcc tag, but since the original poster mentions the GCC constructor/destructor attributes explicitly:
https://gcc.gnu.org/onlinedocs/gcc-4.7.0/gcc/Function-Attributes.html#index-g_t_0040code_007bconstructor_007d-function-attribute-2500
I'd like to point out that there is also the cleanup attribute:
https://gcc.gnu.org/onlinedocs/gcc-6.1.0/gcc/Common-Variable-Attributes.html#index-g_t_0040code_007bcleanup_007d-variable-attribute-3486
cleanup (cleanup_function)
The cleanup attribute runs a function when
the variable goes out of scope. This attribute can only be applied to
auto function scope variables; it may not be applied to parameters or
variables with static storage duration. The function must take one
parameter, a pointer to a type compatible with the variable. The
return value of the function (if any) is ignored. If -fexceptions is
enabled, then cleanup_function is run during the stack unwinding that
happens during the processing of the exception. Note that the cleanup
attribute does not allow the exception to be caught, only to perform
an action. It is undefined what happens if cleanup_function does not
return normally.
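A minimal sketch of the cleanup attribute in use, assuming GCC or a compatible compiler such as Clang (this is an extension, not standard C). The function names free_int and cleanup_demo and the counter cleanup_calls are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Counts how often the cleanup function has run. */
static int cleanup_calls = 0;

/* The cleanup function receives a pointer to the variable itself,
 * so for an int * variable the parameter type is int **. */
static void free_int(int **pp)
{
    assert(pp != NULL);
    free(*pp); /* free(NULL) is harmless if the allocation failed */
    cleanup_calls++;
}

static int cleanup_demo(void)
{
    /* free_int(&p) runs automatically when p goes out of scope,
     * on every return path of this function. */
    __attribute__((cleanup(free_int))) int *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    *p = 123;
    return *p; /* the value is read before the cleanup runs */
}
```

Unlike the macro-based approaches, an early return inside the scope still triggers the cleanup function, which is what makes this attribute the closest match to a C++ destructor among the options discussed here.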
