Downcasting trouble - c++

This is my first experience with downcasting in C++ and I just can't understand the problem.
AInstruction and CInstruction inherit from AssemblerInstruction.
Parser takes the info in its ctor and creates one of those derived instruction types for its mInstruction member (accessed by getInstruction). In the program, a method of the base AssemblerInstruction class is used, for happy polymorphism.
But when I want to test that the Parser has created the correct instruction, I need to query the derived instruction members, which means I need to downcast parser.getInstruction() to an AInstruction or CInstruction.
As far as I can tell this needs to be done using a bunch of pointers and references. This is how I can get the code to compile:
TEST(ParserA, parsesBuiltInConstants)
{
AssemblerInstruction inst = Parser("#R3", 0).getInstruction();
EXPECT_EQ(inst.getInstructionType(), AssemblerInstruction::InstructionType::A);
AssemblerInstruction* i = &(inst);
AInstruction* a = dynamic_cast<AInstruction*>(i);
EXPECT_EQ(a->getLine(), "R3");
}
Running this gives this error:
unknown file: error: SEH exception with code 0xc0000005 thrown in the test body.
And stepping through the code, when the debugger is on the final line of the function, a is pointing to
0x00000000 <NULL>.
I imagine this is an instance where I don't have a full enough understanding of C++, meaning that I could be making a n00b mistake. Or maybe it's some bigger crazy problem. Help?
Update
I've been able to make this work by making mInstruction into a (dumb) pointer:
// in parser, when parsing
mInstructionPtr = new AInstruction(assemblyCode.substr(1), lineNumber);
// elsewhere in AssemblerInstruction.cpp
AssemblerInstruction* AssemblyParser::getInstructionPtr() { return mInstructionPtr; }
TEST(ParserA, parsesBuiltInConstants)
{
auto ptr = Parser("#R3", 0).getInstructionPtr();
AInstruction* a = dynamic_cast<AInstruction*>(ptr);
EXPECT_EQ(a->getLine(), "R3");
}
However I have trouble implementing it with a unique_ptr:
(I'm aware that mInstruction (non-pointer) is redundant, as are two types of pointers. I'll get rid of it later when I clean all this up)
class AssemblyParser
{
public:
AssemblyParser(std::string assemblyCode, unsigned int lineNumber);
AssemblerInstruction getInstruction();
std::unique_ptr<AssemblerInstruction> getUniqueInstructionPtr();
AssemblerInstruction* getInstructionPtr();
private:
AssemblerInstruction mInstruction;
std::unique_ptr<AssemblerInstruction> mUniqueInstructionPtr;
AssemblerInstruction* mInstructionPtr;
};
// in AssemblyParser.cpp
// in parser as in example above. this works fine.
mUniqueInstructionPtr = make_unique<AInstruction>(assemblyCode.substr(1), lineNumber);
// this doesn't compile!!!
unique_ptr<AssemblerInstruction> AssemblyParser::getUniqueInstructionPtr()
{
return mUniqueInstructionPtr;
}
In getUniqueInstructionPtr, there is a squiggle under mUniqueInstructionPtr with this error:
'std::unique_ptr<AssemblerInstruction,std::default_delete>::unique_ptr(const std::unique_ptr<AssemblerInstruction,std::default_delete> &)': attempting to reference a deleted function
What!? I haven't declared any functions as deleted or defaulted!

You cannot downcast an object to something that doesn't match its dynamic type. In your code,
AssemblerInstruction inst = Parser("#R3", 0).getInstruction();
inst has a fixed type, which is AssemblerInstruction: returning by value copies only the base part of whatever the parser created (slicing). A dynamic_cast to AInstruction* therefore returns a null pointer, because that is not what inst is, and dereferencing that null pointer is undefined behavior, which shows up here as the crash.
If you want your getInstruction to return a dynamically-typed object, it has to return a [smart] pointer to the base class while constructing an object of the derived class. Something like this (pseudo code):
std::unique_ptr<AssemblerInstruction> getInstruction(...) {
return std::make_unique<AInstruction>(...);
}
Also, if you find yourself needing to downcast an object based on a value stored in the class, you are doing something wrong, as you are trying to home-brew polymorphism. Most of the time it indicates a design flaw, and the work should instead be done using C++'s built-in polymorphism support, namely virtual functions.
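The "deleted function" error in the update comes from returning the unique_ptr member by value, which would require copying it, and unique_ptr cannot be copied. A minimal sketch of one way around that, assuming the parser keeps ownership and only hands out a non-owning pointer. The class bodies, the one-argument constructor and the helper lineOfAInstruction are simplified stand-ins, not the asker's real code:

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Simplified stand-ins for the classes in the question (bodies are assumed).
class AssemblerInstruction {
public:
    virtual ~AssemblerInstruction() = default;  // polymorphic base, required for dynamic_cast
};

class AInstruction : public AssemblerInstruction {
public:
    explicit AInstruction(std::string line) : mLine(std::move(line)) {}
    const std::string& getLine() const { return mLine; }
private:
    std::string mLine;
};

class CInstruction : public AssemblerInstruction {};

class AssemblyParser {
public:
    explicit AssemblyParser(const std::string& assemblyCode) {
        // Assumed rule for this sketch: a leading '#' marks an A instruction.
        if (!assemblyCode.empty() && assemblyCode[0] == '#')
            mInstruction = std::make_unique<AInstruction>(assemblyCode.substr(1));
        else
            mInstruction = std::make_unique<CInstruction>();
    }

    // Non-owning access: the parser keeps ownership, nothing is copied or sliced.
    const AssemblerInstruction* getInstruction() const { return mInstruction.get(); }

private:
    std::unique_ptr<AssemblerInstruction> mInstruction;
};

// Hypothetical test helper: the line of an A instruction, or "" for anything else.
std::string lineOfAInstruction(const AssemblyParser& parser) {
    const auto* a = dynamic_cast<const AInstruction*>(parser.getInstruction());
    return a ? a->getLine() : std::string{};
}
```

Checking the result of dynamic_cast before dereferencing it turns the crash from the original test into an ordinary test failure.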

Related

c++, dealing with exceptions from constructors

I have a class which is loaded from an external file, so ideally I would want its constructor to load from a given path. If the load fails because the file is not found or not readable, I want to throw an error (throwing errors from constructors is not a horrible idea, see ISO's FAQ).
There is a problem with this though: I want to handle errors myself in some controlled manner, and I want to do that immediately, so I need to put a try-catch statement around the constructor for this object ... and if I do that, the object is not declared outside the try statement, i.e.:
//in my_class.hpp
class my_class
{
...
public:
my_class(string path);//Throws file not found, or some other error
...
};
//anywhere my_class is needed
try
{
my_class my_object(string);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
//Problem... now my_object doesn't exist anymore
I have tried a number of ways of getting around it, but I don't really like any of them:
Firstly, I could use a pointer to my_class instead of the class itself:
my_class* my_pointer;
try
{
my_pointer = new my_class(string);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
The problem is that the instance of this object doesn't always end up in the same object which created it, so deleting all pointers correctly would be easy to do wrong, and besides, I personally think it is ugly to have some objects be pointers to objects, and have most others be "regular objects".
Secondly, I could use a vector with only one element in much the same way:
std::vector<my_class> single_vector;
try
{
single_vector.push_back(my_class(string));
single_vector.shrink_to_fit();
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
I don't like the idea of having a lot of single-element vectors though.
Thirdly, I can create an empty faux constructor and use another loading function, i.e.
//in my_class.hpp
class my_class
{
...
public:
my_class() {}// Faux constructor which does nothing
void load(string path);//All the code in the constructor has been moved here
...
};
//anywhere my_class is needed
my_class my_object;
try
{
my_object.load(path);
}
catch(/*Whatever error I am interested in*/)
{
//error handling
}
This works, but largely defeats the purpose of having a constructor, so I don't really like this either.
So my question is, which of these methods for constructing an object, which may throw errors in the constructor, is the best (or least bad)? and are there better ways of doing this?
Edit: Why don't you just use the object within the try-statement
Because the object may need to be created as the program is first started, and stopped much later. In the most extreme case (which I do actually need in this case also) that would essentially be:
int main()
{
try
{
//... things which might fail
//A few hundred lines of code
}
catch(/*whatever*/)
{
}
}
I think this makes my code hard to read since the catch statement will be very far from where things actually went wrong.
One possibility is to wrap the construction and error handling in a function that returns the constructed object. Example:
#include <string>
class my_class {
public:
my_class(std::string path);
};
my_class make_my_object(std::string path)
{
try {
return {std::move(path)};
}
catch(...) {
// Handle however you want
}
}
int main()
{
auto my_object = make_my_object("this path doesn't exist");
}
But beware that the example is incomplete because it isn't clear what you intend to do when construction fails. The catch block has to either return something, throw or terminate.
If you could return a different instance, one with a "bad" or "default" state, you could have just initialized your instance to that state in my_class(std::string path) when it was determined the path is invalid. So in that case, the try/catch block is not needed.
If you rethrow the exception, then there is no point in catching it in the first place. In that case, the try/catch block is also not needed, unless you want to do a bit of extra work, like logging.
If you want to terminate, you can just let the exception go uncaught. Again, in that case, the try/catch block is not needed.
The real solution here is probably to not use a try/catch block at all, unless there is actually error handling you can do that shouldn't be implemented as part of my_class which isn't made apparent in the question (maybe a fallback path?).
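If there is real error handling to do, such as the fallback path mentioned above, the factory function can be completed so that every control path either returns an object or throws. A minimal sketch under assumptions: the my_class body (which "fails" on an empty path), its path() accessor and the second parameter are made up for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for my_class: "loading" fails for an empty path.
class my_class {
public:
    explicit my_class(std::string path) : path_(std::move(path)) {
        if (path_.empty())
            throw std::runtime_error("file not found");
    }
    const std::string& path() const { return path_; }
private:
    std::string path_;
};

// Tries the primary path first; if that throws, tries the fallback path.
// An exception thrown while loading the fallback propagates to the caller.
my_class make_my_object(const std::string& path, const std::string& fallback) {
    try {
        return my_class{path};
    } catch (const std::runtime_error&) {
        return my_class{fallback};
    }
}
```

The caller then gets a fully constructed object or an exception, never a half-initialized one.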
and if I do that, the object is not declared outside the try statement
I have tried a number of ways of getting around it
That doesn't need to be a problem. There isn't necessarily a need to get around it. Simply use the object within the try statement.
If you really cannot have the try block around the entire lifetime, then this is a use case for std::optional:
std::optional<my_class> maybe_my_object;
try {
maybe_my_object.emplace(string);
} catch(...) {}
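A more complete version of the std::optional approach (C++17) might look like the sketch below. The my_class body, which throws on an empty path, and the try_load helper are assumptions made only so the example compiles on its own.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for my_class: "loading" fails for an empty path.
class my_class {
public:
    explicit my_class(std::string path) : path_(std::move(path)) {
        if (path_.empty())
            throw std::runtime_error("file not found");
    }
    const std::string& path() const { return path_; }
private:
    std::string path_;
};

// Constructs the object in place and handles failure right next to the construction.
std::optional<my_class> try_load(const std::string& path) {
    std::optional<my_class> result;  // empty: no my_class exists yet
    try {
        result.emplace(path);        // runs the throwing constructor in place
    } catch (const std::runtime_error&) {
        // error handling goes here; result stays empty
    }
    return result;
}
```

Code after the try block checks has_value() (or uses the optional in a boolean context) before touching the object, so the object can outlive the try block without a default "faux" constructor.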
The problem is that the instance of this object doesn't always end up in the same object which created it, so deleting all pointers correctly would be easy to do wrong,
A pointer returned by new is always valid to delete. In the error case, simply set the pointer to null and there is no problem, since deleting a null pointer does nothing. That said, use a smart pointer instead for dynamic allocation, if you were to use this approach.
single_vector.push_back(my_class(string));
single_vector.shrink_to_fit();
Don't push and shrink when you know the number of objects that are going to be in the vector. Use reserve instead if you were to use this approach.
The object creation can fail because a resource is unavailable. It's not the creation which fails; it is a prerequisite which is not fulfilled.
Consequently, separate these two concerns: First obtain all resources and then, if that succeeded, create the object with these resources and use it. The object creation as such in this design cannot fail, the constructor is nothrow; it is trivial boilerplate code (copy data etc.). If, on the other hand, resource acquisition failed, object creation and object use are both skipped: Your problem with existing but unusable objects is gone.
Responding to your edit about try/catch comprising the entire program: Exceptions as error indicators are better suited for things which are done in many places at various times in a program because they guarantee error handling (by default through an abort) while separating it from the normal control flow. This is impossible to do with classic return value examination, which leaves us with a choice between unreadable or unreliable programs.
But if you have long-lived objects which are created only rarely (in your example: only at startup) you don't need exceptions. As you said, constructor exceptions guarantee that only properly initialized objects can be used. But if such an object is only created at startup this danger is low. You check for success one way or another and exit the program which cannot perform its purpose if the initial resource acquisition failed. This way the error is handled where it occurred. Even in less extreme cases (e.g. when an object is created at the beginning of a large function other than main) this may be the simpler solution.
In code, my suggestion looks like this:
struct T1;
struct T2;
struct myEx { myEx(const char *); };
void exit(int);
T1 *acquireResource1(); // e.g. read file
T2 *acquireResource2(); // e.g. connect to db
void log(const char *what);
class ObjT
{
public:
struct RsrcT
{
T1 *mT1;
T2 *mT2;
operator bool() { return mT1 && mT2; }
};
ObjT(const RsrcT& res) noexcept
{
// initialize from file data etc.
}
// more member functions using data from file and db
};
int main()
{
ObjT::RsrcT rsrc = { acquireResource1(), acquireResource2() };
if(!rsrc)
{
log("bummer");
exit(1);
}
///////////////////////////////////////////////////
// all resources are available. "Real" code starts here.
///////////////////////////////////////////////////
ObjT obj(rsrc);
// 1000 lines of code using obj
}

Convert data type of a class object (C++)

I am writing a game in which one Object has an ability to turn into an object of another class (e.g. Clark Kent -> Superman). I would like to know what is the most efficient way to implement this.
The logic of my current code:
I have created a turnInto() function inside the ClarkKent class. The turnInto function calls the constructor of Superman class, passing all needed infos to it. The next step is to assign the address of Superman object to the current ClarkKent object.
void ClarkKent::turnInto() {
Superman sMan(getName(), getMaxHP(), getDamage());
&(*this) = &w; // <- error here
this->ClarkKent::~ClarkKent();
}
As you might have guessed, the compiler gives an error that the expression is not assignable. Not sure how to find a correct solution to this.
Keep it simple and don't play tricks you don't understand with your objects.
Superman ClarkKent::turnInto() {
return {getName(), getMaxHP(), getDamage()};
}
At the call site:
ClarkKent some_guy{...};
auto some_other_guy = some_guy.turnInto();
Or if you need something fancy:
using NotBatman = std::variant<ClarkKent, Superman>;
NotBatman some_guy = ClarkKent{...};
some_guy = std::get<ClarkKent>(some_guy).turnInto();
IDK
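For the std::variant idea, a self-contained sketch (C++17) could look like this. The member layout of both classes, the Hero alias and the transform function are assumptions for illustration; the real game classes presumably carry more state.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <variant>

// Minimal stand-ins for the two game classes.
struct Superman {
    std::string name;
    int maxHP;
    int damage;
};

struct ClarkKent {
    std::string name;
    int maxHP;
    int damage;
    // Builds a new Superman from this object's data; *this is left untouched.
    Superman turnInto() const { return Superman{name, maxHP, damage}; }
};

using Hero = std::variant<ClarkKent, Superman>;

// Replaces a ClarkKent held in the variant with the Superman made from it.
// Does nothing if the variant already holds a Superman.
void transform(Hero& hero) {
    if (const auto* clark = std::get_if<ClarkKent>(&hero)) {
        Superman next = clark->turnInto();  // finish copying before the old object is destroyed
        hero = std::move(next);
    }
}
```

The variable keeps one type (Hero) for its whole lifetime while the object stored inside it changes, which is as close as C++ gets to an object "turning into" another class.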

Correctly perform dynamic cast

In Setting a variable in a child class, I was trying to figure out how to correctly derive variables in polymorphic classes. After some help, I found out that I needed to use a dynamic_cast on a pointer to correctly access the information I need. I am having some trouble with this.
This is the function I am currently working on.
void translateLines(Parser parser, Code code)
{
while(parser.hasMoreCommands())
{
vector<Command>::const_iterator it = parser.currentCommand();
if(it->commandType() == "A")
{
//SubType* item = dynamic_cast<SubType*>(*the_iterator);
A_COMMAND* a_command = dynamic_cast<A_COMMAND*>(*it); //line that is throwing the error
//string symbol = a_command->get_symbol();
//cout << "symbol: " << symbol << endl;
//perform binary conversion
}
/*else if(command.commandType() == "C")
{
string dest = command.get_dest();
}*/
//shouldn't be any L commands in symbol-less version
else
{
std::cout << "unexpected command value \n";
}
parser.advance();
}
}
This is my Parser.h, which has the relevant information regarding the iterator for the vector.
#include "Command.h"
#include <vector>
class Parser {
private:
std::vector<Command> commands;
std::vector<Command>::const_iterator command_it = commands.begin();
public:
Parser(std::vector<std::string>);
bool hasMoreCommands() //are there more commands in the input?
{
if(command_it != commands.end())
return true;
else
return false;
}
void advance(){ ++command_it; } //move to next command; should only be called if hasMoreCommands() returns true
std::vector<Command>::const_iterator currentCommand(){return command_it;}
std::vector<std::string> translateCommands(); //convert commands into binary strings
};
Here is the error I am receiving:
g++ -O0 -g3 -Wall -c -fmessage-length=0 -std=c++11 -o Assembler.o "..\\Assembler.cpp"
..\Assembler.cpp: In function 'void translateLines(Parser, Code)':
..\Assembler.cpp:32:55: error: cannot dynamic_cast 'it.__gnu_cxx::__normal_iterator<_Iterator, _Container>::operator*<Command*, std::vector<Command> >()' (of type 'class Command') to type 'class A_COMMAND*' (source is not a pointer)
A_COMMAND* a_command = dynamic_cast<A_COMMAND*>(*it);
^
Any clue what's wrong here?
EDIT: So I see now that I can't use a vector of Commands, rather I need pointers to the commands. I already have changed Parser.h to handle vector<Command*> rather than vector<Command>. For the input I tried something like this:
A_COMMAND command();
commands.push_back(&command);
But this isn't quite working for me, as the vector is expecting pointers and not references. What would be the easiest way to create a pointer to the memory and push it into the vector?
You have a vector of Commands. You cannot cast a Command to an A_COMMAND*. It's important to note that a vector<Command> cannot possibly contain an A_COMMAND. If you want to do runtime polymorphism in C++, you have to use pointers or references. In this case your Parser::commands would need to be a std::vector<Command*> (or some type of smart pointer like std::vector<std::shared_ptr<Command>>).
Take for example this code:
std::vector<Command> commands;
A_COMMAND a_command;
commands.push_back(a_command);
commands does not contain an A_COMMAND object. It contains a Command object that is a copy of the Command part of a_command. It is more or less equivalent to this:
std::vector<Command> commands;
A_COMMAND a_command;
Command temp(a_command);
commands.push_back(temp);
Remember, in C++ a variable is an object, not a reference to an object like in some other languages (Java or C# for instance). Objects will never change type, but you can have a reference or pointer of one type that points to an object of a derived type:
std::vector<Command*> commands;
A_COMMAND a_command;
commands.push_back(&a_command);
In this case commands[0] is a Command*, but it points to an A_COMMAND object.
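The difference between the two vectors can be observed directly. In this sketch the class bodies are assumptions: commandType() is made virtual and returns "base" for a plain Command so that the slicing becomes visible.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Command {
    virtual ~Command() = default;
    virtual std::string commandType() const { return "base"; }
};

struct A_COMMAND : Command {
    std::string commandType() const override { return "A"; }
};

// Stores a copy of the Command part only: the A_COMMAND part is sliced away.
std::string typeAfterCopy(const A_COMMAND& a) {
    std::vector<Command> commands;
    commands.push_back(a);
    return commands[0].commandType();
}

// Stores a pointer: the pointee is still the original A_COMMAND.
std::string typeThroughPointer(const A_COMMAND& a) {
    std::vector<const Command*> commands;
    commands.push_back(&a);
    return commands[0]->commandType();
}
```

Here the pointer version is safe only because the vector does not outlive the object it points to.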
RE your edit:
You are adding a pointer. &some_variable returns a pointer to some_variable, BUT you should absolutely never, ever do something like that. As soon as command goes out of scope, it will be destroyed and any access to it will result in undefined behavior. You will need to use dynamic memory allocation with new. It would probably be best to use a smart pointer class like std::shared_ptr<Command> to hold your dynamically allocated objects so you don't have to worry about deleting them later.
If you use raw pointers then something like this will work:
A_COMMAND* command = new A_COMMAND;
commands.push_back(command);
If you go with that approach, you'll need to delete all of your commands when you're done with them (probably Parser's destructor):
for(Command* command : commands) {
delete command;
}
It would be better to use std::shared_ptrs though. Declare commands as std::vector<std::shared_ptr<Command>> commands;, then:
std::shared_ptr<A_COMMAND> command = std::make_shared<A_COMMAND>();
commands.push_back(command);
Then your objects will all get automatically deleted when the last shared_ptr to them goes out of scope. If you use smart pointers you'll need to cast them slightly differently though. Look into std::dynamic_pointer_cast.
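A short sketch of std::dynamic_pointer_cast on such a vector follows. The class bodies, the A_COMMAND constructor argument and the symbolOf helper are assumptions; get_symbol is the accessor named in the question.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Command {
    virtual ~Command() = default;  // polymorphic base, required for the cast
};

struct A_COMMAND : Command {
    explicit A_COMMAND(std::string symbol) : symbol_(std::move(symbol)) {}
    const std::string& get_symbol() const { return symbol_; }
private:
    std::string symbol_;
};

struct C_COMMAND : Command {};

// Returns the symbol if the command is an A_COMMAND, otherwise an empty string.
std::string symbolOf(const std::shared_ptr<Command>& command) {
    // The cast yields an empty shared_ptr when the dynamic type does not match.
    if (auto a = std::dynamic_pointer_cast<A_COMMAND>(command))
        return a->get_symbol();
    return {};
}
```

The returned shared_ptr shares ownership with the original, so the object stays alive for as long as either pointer exists.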
The real question is why use dynamic_cast at all.
This is a job for virtual methods.
If another class is derived, your dynamic_cast code will also need updating. With a virtual method you do not need to be concerned with what the derived class is, only that it overrides the virtual method, which can be enforced by using an interface class for the base (pure virtual methods, no state). This sounds like an application for the strategy pattern: https://en.wikipedia.org/wiki/Strategy_pattern.
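A sketch of the virtual-method approach, under assumptions: the method name translate(), the constructor arguments and the placeholder output strings are invented for illustration and do not represent the real binary encoding.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Interface: every command knows how to translate itself.
struct Command {
    virtual ~Command() = default;
    virtual std::string translate() const = 0;  // hypothetical method name
};

struct A_COMMAND : Command {
    explicit A_COMMAND(std::string symbol) : symbol_(std::move(symbol)) {}
    // Placeholder output, not a real binary encoding.
    std::string translate() const override { return "A(" + symbol_ + ")"; }
private:
    std::string symbol_;
};

struct C_COMMAND : Command {
    explicit C_COMMAND(std::string dest) : dest_(std::move(dest)) {}
    // Placeholder output, not a real binary encoding.
    std::string translate() const override { return "C(" + dest_ + ")"; }
private:
    std::string dest_;
};

// The loop never asks which derived type it is dealing with.
std::vector<std::string> translateAll(const std::vector<std::unique_ptr<Command>>& commands) {
    std::vector<std::string> out;
    for (const auto& command : commands)
        out.push_back(command->translate());
    return out;
}
```

Adding a new command type then means adding one class with its own translate(); translateAll stays unchanged.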
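try (&*it) instead of (*it)
dynamic_cast needs a pointer or a reference, and *it yields the Command object itself. An iterator is not a pointer, so passing it directly will not compile either; &*it gives a const Command* here (it is a const_iterator), so the target type must be const A_COMMAND*. Note that with a vector<Command> the cast will still return a null pointer, because the stored elements are plain Command objects.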

Why does PyCXX handle new-style classes in the way it does?

I'm picking apart some C++ Python wrapper code that allows the consumer to construct custom old style and new style Python classes from C++.
The original code comes from PyCXX, with old and new style classes here and here. I have however rewritten the code substantially, and in this question I will reference my own code, as it allows me to present the situation in the greatest clarity that I am able. I think there would be very few individuals capable of understanding the original code without several days of scrutiny... For me it has taken weeks and I'm still not clear on it.
The old style simply derives from PyObject,
template<typename FinalClass>
class ExtObj_old : public ExtObjBase<FinalClass>
// ^ which : ExtObjBase_noTemplate : PyObject
{
public:
// forwarding function to mitigate awkwardness retrieving static method
// from base type that is incomplete due to templating
static TypeObject& typeobject() { return ExtObjBase<FinalClass>::typeobject(); }
static void one_time_setup()
{
typeobject().set_tp_dealloc( [](PyObject* t) { delete (FinalClass*)(t); } );
typeobject().supportGetattr(); // every object must support getattr
FinalClass::setup();
typeobject().readyType();
}
// every object needs getattr implemented to support methods
Object getattr( const char* name ) override { return getattr_methods(name); }
// ^ MARKER1
protected:
explicit ExtObj_old()
{
PyObject_Init( this, typeobject().type_object() ); // MARKER2
}
When one_time_setup() is called, it forces (by accessing base class typeobject()) creation of the associated PyTypeObject for this new type.
Later when an instance is constructed, it uses PyObject_Init
So far so good.
But the new style class uses much more complicated machinery. I suspect this is related to the fact that new style classes allow derivation.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does? i.e. Just type convert from the PyObject base type? And seeing as it doesn't do that, does this mean it is making no use of its PyObject base type?
This is a huge question, and I will keep amending the post until I'm satisfied it represents the issue well. It isn't a good fit for SO's format, I'm sorry about that. However, some world-class engineers frequent this site (one of my previous questions was answered by the lead developer of GCC for example), and I value the opportunity to appeal to their expertise. So please don't be too hasty to vote to close.
The new style class's one-time setup looks like this:
template<typename FinalClass>
class ExtObj_new : public ExtObjBase<FinalClass>
{
private:
PythonClassInstance* m_class_instance;
public:
static void one_time_setup()
{
TypeObject& typeobject{ ExtObjBase<FinalClass>::typeobject() };
// these three functions are listed below
typeobject.set_tp_new( extension_object_new );
typeobject.set_tp_init( extension_object_init );
typeobject.set_tp_dealloc( extension_object_deallocator );
// this should be named supportInheritance, or supportUseAsBaseType
// old style class does not allow this
typeobject.supportClass(); // does: table->tp_flags |= Py_TPFLAGS_BASETYPE
typeobject.supportGetattro(); // always support get and set attr
typeobject.supportSetattro();
FinalClass::setup();
// add our methods to the extension type's method table
{ ... typeobject.set_methods( /* ... */); }
typeobject.readyType();
}
protected:
explicit ExtObj_new( PythonClassInstance* self, Object& args, Object& kwds )
: m_class_instance{self}
{ }
So the new style uses a custom PythonClassInstance structure:
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase_noTemplate* m_pycxx_object;
};
PyObject_HEAD, if I dig into Python's object.h, is just a macro for PyObject ob_base; -- no further complications, like #if #else. So I don't see why it can't simply be:
struct PythonClassInstance
{
PyObject ob_base;
ExtObjBase_noTemplate* m_pycxx_object;
};
or even:
struct PythonClassInstance : PyObject
{
ExtObjBase_noTemplate* m_pycxx_object;
};
Anyway, it seems that its purpose is to tag a pointer onto the end of a PyObject. This will be because Python runtime will often trigger functions we have placed in its function table, and the first parameter will be the PyObject responsible for the call. So this allows us to retrieve the associated C++ object.
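The "header plus back pointer" layout can be sketched without Python's headers. Everything here is a stand-in: FakePyObject imitates PyObject, CppBase imitates ExtObjBase_noTemplate, Instance imitates PythonClassInstance, and the two helper functions are hypothetical. None of it calls the real CPython API.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Stand-in for PyObject; the real one is defined by Python's headers.
struct FakePyObject {
    std::ptrdiff_t ob_refcnt;
    void*          ob_type;
};

// Stand-in for ExtObjBase_noTemplate.
struct CppBase {
    virtual ~CppBase() = default;
    virtual int id() const = 0;
};

struct CppImpl : CppBase {
    int id() const override { return 42; }
};

// Header first, then the back pointer, mirroring PythonClassInstance.
struct Instance {
    FakePyObject head;
    CppBase*     cpp_object;
};

static_assert(std::is_standard_layout<Instance>::value,
              "a pointer to Instance may be converted to a pointer to its first member");

// What the runtime would pass to a slot function: only the header pointer.
FakePyObject* asHeader(Instance* instance) { return &instance->head; }

// What a slot function does to find the associated C++ object again.
CppBase* fromHeader(FakePyObject* pyob) {
    return reinterpret_cast<Instance*>(pyob)->cpp_object;
}
```

The cast back is only valid if the memory behind the header really is at least sizeof(Instance) bytes, which in CPython is governed by the type's tp_basicsize.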
But we also need to do that for the old-style class.
Here is the function responsible for doing that:
ExtObjBase_noTemplate* getExtObjBase( PyObject* pyob )
{
if( pyob->ob_type->tp_flags & Py_TPFLAGS_BASETYPE )
{
/*
New style class uses a PythonClassInstance to tag on an additional
pointer onto the end of the PyObject
The old style class just seems to typecast the pointer back up
to ExtObjBase_noTemplate
ExtObjBase_noTemplate does indeed derive from PyObject
So it should be possible to perform this typecast
Which begs the question, why on earth does the new style class feel
the need to do something different?
This looks like a really nice way to solve the problem
*/
PythonClassInstance* instance = reinterpret_cast<PythonClassInstance*>(pyob);
return instance->m_pycxx_object;
}
else
return static_cast<ExtObjBase_noTemplate*>( pyob );
}
My comment articulates my confusion.
And here, for completeness is us inserting a lambda-trampoline into the PyTypeObject's function pointer table, so that Python runtime can trigger it:
table->tp_setattro = [] (PyObject* self, PyObject* name, PyObject* val) -> int
{
try {
ExtObjBase_noTemplate* p = getExtObjBase( self );
return ( p -> setattro(Object{name}, Object{val}) );
}
catch( Py::Exception& ) { /* indicate error */
return -1;
}
};
(In this demonstration I'm using tp_setattro, note that there are about 30 other slots, which you can see if you look at the doc for PyTypeObject)
(in fact the major reason for working this way is that we can try{}catch{} around every trampoline. This saves the consumer from having to code repetitive error trapping.)
So, we pull out the "base type for the associated C++ object" and call its virtual setattro (just using setattro as an example here). A derived class will have overridden setattro, and this override will get called.
The old-style class provides such an override, which I've labelled MARKER1 -- it is in the top listing for this question.
The only thing I can think of is that maybe different maintainers have used different techniques. But is there some more compelling reason why old and new style classes require different architecture?
PS for reference, I should include the following methods from new style class:
static PyObject* extension_object_new( PyTypeObject* subtype, PyObject* args, PyObject* kwds )
{
PyObject* pyob = subtype->tp_alloc(subtype,0);
PythonClassInstance* o = reinterpret_cast<PythonClassInstance *>( pyob );
o->m_pycxx_object = nullptr;
return pyob;
}
^ to me, this looks absolutely wrong.
It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this.
I'm surprised it hasn't caused any crashes.
I can't see any indication anywhere in the source code that these 4 bytes are owned.
static int extension_object_init( PyObject* _self, PyObject* _args, PyObject* _kwds )
{
try
{
Object args{_args};
Object kwds{_kwds};
PythonClassInstance* self{ reinterpret_cast<PythonClassInstance*>(_self) };
if( self->m_pycxx_object )
self->m_pycxx_object->reinit( args, kwds );
else
// NOTE: observe this is where we invoke the constructor, but indirectly (i.e. through final)
self->m_pycxx_object = new FinalClass{ self, args, kwds };
}
catch( Exception & )
{
return -1;
}
return 0;
}
^ note that there is no implementation for reinit, other than the default
virtual void reinit ( Object& args , Object& kwds ) {
throw RuntimeError( "Must not call __init__ twice on this class" );
}
static void extension_object_deallocator( PyObject* _self )
{
PythonClassInstance* self{ reinterpret_cast< PythonClassInstance* >(_self) };
delete self->m_pycxx_object;
_self->ob_type->tp_free( _self );
}
EDIT: I will hazard a guess, thanks to insight from Yhg1s on the IRC channel.
Maybe it is because when you create a new old-style class, it is guaranteed it will overlap perfectly a PyObject structure.
Hence it is safe to derive from PyObject, and pass a pointer to the underlying PyObject into Python, which is what the old-style class does (MARKER2)
On the other hand, new style class creates a {PyObject + maybe something else} object.
i.e. It wouldn't be safe to do the same trick, as Python runtime would end up writing past the end of the base class allocation (which is only a PyObject).
Because of this, we need to get Python to allocate for the class, and return us a pointer which we store.
Because we are now no longer making use of the PyObject base-class for this storage, we cannot use the convenient trick of typecasting back to retrieve the associated C++ object.
Which means that we need to tag on an extra sizeof(void*) bytes to the end of the PyObject that actually does get allocated, and use this to point to our associated C++ object instance.
However, there is some contradiction here.
struct PythonClassInstance
{
PyObject_HEAD
ExtObjBase_noTemplate* m_pycxx_object;
};
^ if this is indeed the structure that accomplishes the above, then it is saying that the new style class instance is indeed fitting exactly over a PyObject, i.e. It is not overlapping into the m_pycxx_object.
And if this is the case, then surely this whole process is unnecessary.
EDIT: here are some links that are helping me learn the necessary ground work:
http://eli.thegreenplace.net/2012/04/16/python-object-creation-sequence
http://realmike.org/blog/2010/07/18/introduction-to-new-style-classes-in-python
Create an object using Python's C API
to me, this looks absolutely wrong. It appears to be allocating memory, re-casting to some structure that might exceed the amount allocated, and then nulling right at the end of this. I'm surprised it hasn't caused any crashes. I can't see any indication anywhere in the source code that these 4 bytes are owned
PyCXX does allocate enough memory, but it does so by accident. This appears to be a bug in PyCXX.
The amount of memory Python allocates for the object is determined by the first call to the following static member function of PythonClass<T>:
static PythonType &behaviors()
{
...
p = new PythonType( sizeof( T ), 0, default_name );
...
}
The constructor of PythonType sets the tp_basicsize of the python type object to sizeof(T). This way when Python allocates an object it knows to allocate at least sizeof(T) bytes. It works because sizeof(T) turns out to be larger than sizeof(PythonClassInstance) (T is derived from PythonClass<T> which derives from PythonExtensionBase, which is large enough).
However, it misses the point. It should actually allocate only sizeof(PythonClassInstance) . This appears to be a bug in PyCXX - that it allocates too much, rather than too little space for storing a PythonClassInstance object.
And this is my question, why is the new style class handling implemented in the way that it is? Why is it having to create this extra PythonClassInstance structure? Why can't it do things the same way the old-style class handling does?
Here's my theory why new style classes are different from the old style classes in PyCXX.
Before Python 2.2, where new style classes were introduced, there was no tp_init member in the type object. Instead, you needed to write a factory function that would construct the object. This is how PythonExtension<T> is supposed to work - the factory function converts the Python arguments to C++ arguments, asks Python to allocate the memory and then calls the constructor using placement new.
Python 2.2 added the new style classes and the tp_init member. Python first creates the object and then calls the tp_init method. Keeping the old way would have required that the objects would first have a dummy constructor that creates an "empty" object (e.g. initializes all members to null) and then when tp_init is called, would have had an additional initialization stage. This makes the code uglier.
It seems that the author of PyCXX wanted to avoid that. PyCXX works by first creating a dummy PythonClassInstance object and then when tp_init is called, creates the actual PythonClass<T> object using its constructor.
... does this mean it is making no use of its PyObject base type?
This appears to be correct, the PyObject base class does not seem to be used anywhere. All the interesting methods of PythonExtensionBase use the virtual self() method, which returns m_class_instance and completely ignore the PyObject base class.
My guess (and it is only a guess) is that PythonClass<T> was added to an existing system and it seemed easier to just derive from PythonExtensionBase instead of cleaning up the code.

Pointer object in C++

I have a very simple class that looks as follows:
class CHeader
{
public:
CHeader();
~CHeader();
void SetCommand( const unsigned char cmd );
void SetFlag( const unsigned char flag );
public:
unsigned char iHeader[32];
};
void CHeader::SetCommand( const unsigned char cmd )
{
iHeader[0] = cmd;
}
void CHeader::SetFlag( const unsigned char flag )
{
iHeader[1] = flag;
}
Then, I have a method which takes a pointer to CHeader as input and looks
as follows:
void updateHeader(CHeader *Hdr)
{
unsigned char cmd = 'A';
unsigned char flag = 'B';
Hdr->SetCommand(cmd);
Hdr->SetFlag(flag);
...
}
Basically, this method simply sets some array values to a certain value.
Afterwards, I create a pointer to an object of class CHeader and pass it to
the updateHeader function:
CHeader* hdr = new CHeader();
updateHeader(hdr);
When I do this, the program crashes as soon as it executes the Hdr->SetCommand(cmd)
line. Does anyone see the problem? Any input would be really appreciated.
When you run into a crash, act like a crime investigator: investigate the crime scene.
What information do you get from your environment (access violation? any debug messages? what does the memory at *Hdr look like? ...)
Is the passed-in Hdr pointer valid?
Then use logical deduction, e.g.:
the dereferencing of Hdr causes an access violation
=> passed in Hdr points to invalid memory
=> either memory wasn't valid to start with (wrong pointer passed in), or memory was invalidated (object was deleted before passing in the pointer, or someone painted over the memory)
...
It's probably SEGFAULTing. Check the pointers.
Given
the source code you added
your comment that the thing runs on another machine
the fact that you use the terms 'flag' and 'cmd' and some very small datatypes
I assume the target machine is quite limited in capacity. I suggest testing the result of the new CHeader for validity: if the system runs out of resources, the resulting pointer will not refer to valid memory.
There is nothing wrong with the code you've provided.
Are you sure the pointer you've created still holds the same address once you enter the 'updateHeader' function? Just to be sure, after new() note the address, fill the memory, sizeof(CHeader), with something you know is unique like 0xDEAD, then trace into the updateHeader function, making sure everything is equal.
Other than that, I wonder if it is an alignment issue. I know you're using 8 bit values, but try changing your array to unsigned ints or longs and see if you get the same issue. What architecture are you running this on?
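The address-and-pattern check can be sketched roughly as follows. This is only an illustration: the CHeader here is a simplified stand-in without the constructor and destructor from the question, the helper names are made up, and a single repeated byte (0xAD) is used instead of 0xDEAD because std::memset writes one byte value.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Simplified stand-in for the class in the question (assumption: no
// user-defined constructor/destructor, just the 32-byte array).
struct CHeader {
    unsigned char iHeader[32];
};

const unsigned char kPattern = 0xAD;   // arbitrary, easy-to-spot byte value

// Call right after 'new': overwrite the whole object with the pattern.
void fillWithPattern(CHeader* hdr) {
    std::memset(hdr, kPattern, sizeof(CHeader));
}

// Call first thing inside the callee: print the address, compare it with
// the one noted at allocation time, and verify every byte of the pattern.
bool looksUntouched(const CHeader* hdr, const void* expectedAddress) {
    std::printf("inside callee: hdr = %p\n", static_cast<const void*>(hdr));
    if (static_cast<const void*>(hdr) != expectedAddress) {
        return false;   // a different pointer arrived than was created
    }
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(hdr);
    for (std::size_t i = 0; i < sizeof(CHeader); ++i) {
        if (bytes[i] != kPattern) {
            return false;   // something wrote over the memory
        }
    }
    return true;
}
```

If looksUntouched returns false at the top of updateHeader, the problem lies between the allocation and the call rather than in SetCommand itself.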
Your code looks fine. The only potential issue I can see is that you have declared a CHeader constructor and destructor in your class, but do not show the implementation of either. I guess you have just omitted to show these, else the linker should have complained (if I duplicate this project in VC++6 it comes up with an 'unresolved external' error for the constructor. It should also have shown the same error for the destructor if you had a... delete hdr; ...statement in your code).
But it is actually not necessary to have an implementation for every method declared in a class unless the methods are actually going to get called (any unimplemented non-virtual methods are simply ignored by the compiler/linker if never called). Of course, in the case of an object, one of the constructors has to be called when the object is instantiated - which is the reason the compiler will create a default constructor for you if you omit to add any constructors to your class. But it would be a serious error for your compiler to compile/link the above code without the implementation of your declared constructor, so I would really be surprised if this is the reason for your problem.
But the symptoms you describe definitely sound like the 'hdr' pointer you are passing to the updateHeader function is invalid, because the first time you dereference this pointer after the updateHeader function call is in the... Hdr->SetCommand(cmd); ...call (which you say crashes).
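A small illustration of that point, using a made-up Sample class: a non-virtual member function that is declared but never defined causes no error as long as nothing calls it.

```cpp
// Hypothetical class for illustration only.
struct Sample {
    int used() const { return 42; }
    int neverDefined() const;   // declared, but no definition exists anywhere
};

// Only used() is called, so the missing definition of neverDefined()
// is never needed by the linker.
int callUsed() {
    Sample s;
    return s.used();
}
```

A declared constructor is different: every Sample-like object has to run one, so `new CHeader()` with a declared but unimplemented CHeader() would be expected to fail at link time.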
I can only think of 2 possible scenarios for this invalid pointer:
a.) You have some problem with your heap and the allocation of memory with the 'new' operator failed on creation of the 'hdr' object. Maybe you have insufficient heap space. On some embedded environments you may also need to provide 'custom' versions of the 'new' and 'delete' operator. The easiest way to check this (and you should always do so) is to check the validity of the pointer after the allocation:
CHeader* hdr = new CHeader();
if(hdr) {
updateHeader(hdr);
}
else {
//handle or throw exception...
}
The normal behaviour when 'new' fails should actually be to throw an exception - so the following code will cater for that as well:
CHeader* hdr = nullptr; // declared outside the try block so it is still in scope below
try {
hdr = new CHeader();
} catch(...) {
//handle or throw specific exception i.e. AfxThrowMemoryException() for MFC
}
if(hdr) {
updateHeader(hdr);
}
else {
//handle or throw exception...
}
b.) You are using some older (possibly 16 bit and/or embedded) environment, where you may need to use a FAR pointer (which includes the SEGMENT address) for objects created on the heap.
I suspect that you will need to provide more details of your environment plus compiler to get any useful feedback on this problem.
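For scenario a.), here is a minimal sketch of the two standard ways to detect a failed allocation. In standard C++ a plain `new` reports failure by throwing std::bad_alloc, so a null check only helps with `new (std::nothrow)`, which returns a null pointer instead. The CHeader below is a simplified stand-in with inline member definitions, and the two helper functions are hypothetical names.

```cpp
#include <new>   // std::nothrow, std::bad_alloc

// Simplified stand-in for the class in the question.
class CHeader {
public:
    void SetCommand(unsigned char cmd) { iHeader[0] = cmd; }
    void SetFlag(unsigned char flag) { iHeader[1] = flag; }
    unsigned char iHeader[32] = {};
};

// Returns false instead of dereferencing a null pointer.
bool updateHeader(CHeader* hdr) {
    if (hdr == nullptr) {
        return false;
    }
    hdr->SetCommand('A');
    hdr->SetFlag('B');
    return true;
}

// Variant 1: nothrow new yields nullptr on failure, so a null check works.
bool createWithNothrow() {
    CHeader* hdr = new (std::nothrow) CHeader();
    if (hdr == nullptr) {
        return false;   // allocation failed
    }
    bool ok = updateHeader(hdr);
    delete hdr;
    return ok;
}

// Variant 2: plain new throws std::bad_alloc on failure.
bool createWithException() {
    try {
        CHeader* hdr = new CHeader();
        bool ok = updateHeader(hdr);
        delete hdr;
        return ok;
    } catch (const std::bad_alloc&) {
        return false;   // allocation failed
    }
}
```

Older or embedded toolchains may deviate from this behaviour (for example by returning null from plain `new`), which is one more reason the compiler and environment details matter here.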