I am trying to access the type of a userdata so that I can process it accordingly. Imagine I have a class named Foo that derives from CObject:
class CObject
{
public:
    virtual int type(void) = 0;
};
class Foo : public CObject
{
public:
    Foo() : CObject() {}
    int type() { return 1; }
};
The rationale is that every class extending CObject has a type that must be made known by an integer (later on, an enum). The class Foo is bound to Lua using LuaWrapper (https://bitbucket.org/alexames/luawrapper/src/fd9c4fdbf4b25034e3b8475a2c8da66b7caab427?at=default).
Foo* Foo_new(lua_State* L)
{
Foo* f=new Foo();
lua_newuserdata(L,sizeof(f));
std::cout<<"f="<<f;
return f;
}
In Lua user calls this as:
f=Foo.new()
print(f)
Now I have a C++ function, say print:
int lua_print(lua_State* L)
{
void *ud = luaL_checkudata(L, 1, "Foo"); //ud is not zero
std::cout<<"ud="<<ud;
CObject* obj=(CObject*)ud; //Casting to CObject
int objtype=obj->type(); //program CRASHES here
}
I have seen that the program crashes because the memory addresses of Foo and ud are not the same. I assume ud refers to memory on the stack which contains the memory address of Foo. How can I access the stack's memory address, or preferably the memory address of Foo?
You have to use placement new to initialize the object in the memory returned by lua_newuserdata.
Something along the lines of:
void *ud = lua_newuserdata(L,sizeof(Foo));
new (ud) Foo();
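The placement-new idea can be tried without Lua at all. The following is only a sketch: `fake_newuserdata` is a hypothetical stand-in for `lua_newuserdata` (it just returns raw memory), and the virtual destructor on `CObject` is an addition that is not in the question's code.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

struct CObject {
    virtual ~CObject() = default;  // added for the sketch, not in the question
    virtual int type() = 0;
};

struct Foo : CObject {
    int type() override { return 1; }
};

// Hypothetical stand-in for lua_newuserdata: returns raw, uninitialised
// memory of the requested size.
void* fake_newuserdata(std::size_t size) { return std::malloc(size); }

int demo_placement_new() {
    // Ask for room for a whole Foo, not for a pointer to one.
    void* ud = fake_newuserdata(sizeof(Foo));
    assert(ud != nullptr);

    // Construct the Foo inside that block of memory.
    Foo* f = new (ud) Foo();

    // The block now really holds a Foo, so converting it back is valid.
    CObject* obj = static_cast<Foo*>(ud);
    int t = obj->type();

    // Objects built with placement new need an explicit destructor call;
    // with real Lua this would typically happen in a __gc metamethod.
    f->~Foo();
    std::free(ud);
    return t;
}
```

Calling `demo_placement_new()` returns 1. The important difference from the question's code is `sizeof(Foo)` instead of `sizeof(f)`, and the fact that the object lives inside the userdata block rather than somewhere else on the heap.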
Foo_new should just return the pointer to the object.
In other words, your Foo_new would look like this:
Foo* Foo_new(lua_State* L)
{
return new Foo();
}
However, if you have no special initialization you need to do, you don't even need to write this function. This function is supplied for you by magical templates if you don't write one yourself.
When you want to get your Foo object from the Lua state, you do this:
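int lua_print(lua_State* L)
{
    Foo* ud = luaW_to<Foo>(L, 1); // ud is the Foo* stored by LuaWrapper
    std::cout << "ud=" << ud;
    CObject* obj = ud; // implicit upcast, no C-style cast needed
    int objtype = obj->type();
    return 0; // number of values returned to Lua
}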
If CObject is registered with LuaWrapper too, you don't even need to do the manual cast. You can just do luaW_to<CObject>(L, 1);
The problem
I am writing a thin C++ wrapper around an object-oriented C library. The idea was to automate memory management, but so far it's not been very automatic. Basically, when I use my wrapper classes, I get all kinds of invalid memory accesses and double-free problems.
Minimal example of C library
Let's say the C library consists of A and B classes, each of which has a few 'methods' associated with it:
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <memory>
#include <string>
extern "C" {
typedef struct {
unsigned char *string;
} A;
A *c_newA(const char *string) {
A *a = (A *) malloc(sizeof(A)); // yes I know, don't use malloc in C++. This is a demo to simulate the C library that uses it.
auto *s = (char *) malloc(strlen(string) + 1);
strcpy(s, string);
a->string = (unsigned char *) s;
return a;
}
void c_freeA(A *a) {
free(a->string);
free(a);
}
void c_printA(A *a) {
std::cout << a->string << std::endl;
}
typedef struct {
A *firstA;
A *secondA;
} B;
B *c_newB(const char *first, const char *second) {
B *b = (B *) malloc(sizeof(B));
b->firstA = c_newA(first);
b->secondA = c_newA(second);
return b;
}
void c_freeB(B *b) {
c_freeA(b->firstA);
c_freeA(b->secondA);
free(b);
}
void c_printB(B *b) {
std::cout << b->firstA->string << ", " << b->secondA->string << std::endl;
}
A *c_getFirstA(B *b) {
return b->firstA;
}
A *c_getSecondA(B *b) {
return b->secondA;
}
}
Test the 'C lib'
void testA() {
A *a = c_newA("An A");
c_printA(a);
c_freeA(a);
// outputs: "An A"
// valgrind is happy =]
}
void testB() {
B *b = c_newB("first A", "second A");
c_printB(b);
c_freeB(b);
// outputs: "first A, second A"
// valgrind is happy =]
}
Wrapper classes for A and B
class AWrapper {
struct deleter {
void operator()(A *a) {
c_freeA(a);
}
};
std::unique_ptr<A, deleter> aptr_;
public:
explicit AWrapper(A *a)
: aptr_(a) {
}
static AWrapper fromString(const std::string &string) { // preferred way of instantiating
A *a = c_newA(string.c_str());
return AWrapper(a);
}
void printA() {
c_printA(aptr_.get());
}
};
class BWrapper {
struct deleter {
void operator()(B *b) {
c_freeB(b);
}
};
std::unique_ptr<B, deleter> bptr_;
public:
explicit BWrapper(B *b)
: bptr_(std::unique_ptr<B, deleter>(b)) {
}
static BWrapper fromString(const std::string &first, const std::string &second) {
B *b = c_newB(first.c_str(), second.c_str());
return BWrapper(b);
}
void printB() {
c_printB(bptr_.get());
}
AWrapper getFirstA(){
return AWrapper(c_getFirstA(bptr_.get()));
}
AWrapper getSecondA(){
return AWrapper(c_getSecondA(bptr_.get()));
}
};
Wrapper tests
void testAWrapper() {
AWrapper a = AWrapper::fromString("An A");
a.printA();
// outputs "An A"
// valgrind is happy =]
}
void testBWrapper() {
BWrapper b = BWrapper::fromString("first A", "second A");
b.printB();
// outputs "first A, second A"
// valgrind is happy =]
}
Demonstration of the problem
Great, so I move on and develop the full wrapper (lots of classes) and realise that when classes like this (i.e. in an aggregation relationship) are both in scope, C++ will automatically call the destructors of both classes separately. Because of the structure of the underlying library (i.e. the calls to free), we get memory problems:
void testUsingAWrapperAndBWrapperTogether() {
BWrapper b = BWrapper::fromString("first A", "second A");
AWrapper a1 = b.getFirstA();
// valgrind no happy =[
}
Valgrind output
Things I've tried
Cloning not possible
The first thing I tried was to take a copy of A, rather than having both wrappers try to free the same A. This, while a good idea, is not possible in my case because of the nature of the library I'm using. There is actually a caching mechanism in place so that when you create a new A with a string it has seen before, it gives you back the same A. See this question for my attempts at cloning A.
Custom destructors
I took the code for the C library destructors (freeA and freeB here) and copied them into my source code. Then I tried to modify them such that A does not get freed by B. This has partially worked. Some instances of memory problems have been resolved, but because this idea does not tackle the problem at hand (just kind of temporarily glosses over the main issue), new problems keep popping up, some of which are obscure and difficult to debug.
The question
So at last we arrive at the question: How can I modify this C++ wrapper to resolve the memory problems that arise from the interactions between the underlying C objects? Can I make better use of smart pointers? Should I abandon the C++ wrapper completely and just use the library's pointers as is? Or is there a better way I haven't thought of?
Thanks in advance.
Edits: response to the comments
Since asking the previous question (linked above) I have restructured my code so that the wrapper is being developed and built in the same library as the one it wraps. So the objects are no longer opaque.
The pointers are generated from function calls to the library, which uses calloc or malloc to allocate.
In the real code A is raptor_uri* (typedef librdf_uri*) from raptor2 and is allocated with librdf_new_uri, while B is raptor_term* (aka librdf_node*) and is allocated with the librdf_new_node_* functions. The librdf_node has a librdf_uri field.
Edit 2
I can also point to the line of code where the same A is returned if it's the same string. See line 137 here
The problem is that getFirstA and getSecondA return instances of AWrapper, which is an owning type. Constructing an AWrapper means handing it ownership of an A *, but getFirstA and getSecondA don't actually give up ownership: the pointers they wrap are still managed by the BWrapper, which frees them through c_freeB.
The easiest solution is to return an A * instead of the wrapper class. This way you're not passing on ownership of the inner A member. I would also recommend making the pointer-taking constructors of the wrapper classes private and adding a fromPointer static method, similar to fromString, which takes ownership of the pointer passed to it. This way you won't accidentally create instances of the wrapper classes from raw pointers.
If you want to avoid using raw pointers or want to have methods on the returned objects from getFirstA and getSecondA you could write a simple reference wrapper, which has a raw pointer as a member.
class AReference
{
private:
A *a_ref_;
public:
explicit AReference(A *a_ref) : a_ref_(a_ref) {}
// other methods here, such as print or get
};
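To see the effect of a non-owning reference type, here is a small self-contained sketch. The `toy_*` functions and the `free_count` counter are made up for the illustration and merely imitate the shape of the C library from the question; only `AReference` and `BWrapper` correspond to names used above.

```cpp
#include <cassert>
#include <memory>

// Toy stand-ins for the C library; free_count records how often an A is freed.
struct A { int value; };
struct B { A* first; };
static int free_count = 0;

A* toy_newA(int v) { return new A{v}; }
void toy_freeA(A* a) { ++free_count; delete a; }
B* toy_newB(int v) { return new B{toy_newA(v)}; }
void toy_freeB(B* b) { toy_freeA(b->first); delete b; }

// Non-owning view of an A: it never frees the pointer it holds.
class AReference {
    A* a_ref_;
public:
    explicit AReference(A* a_ref) : a_ref_(a_ref) {}
    int value() const { return a_ref_->value; }
};

// Owning wrapper: the only place where the B (and its A) gets freed.
class BWrapper {
    struct deleter {
        void operator()(B* b) const { toy_freeB(b); }
    };
    std::unique_ptr<B, deleter> bptr_;
public:
    explicit BWrapper(B* b) : bptr_(b) {}
    AReference getFirstA() const { return AReference(bptr_->first); }
};

int demo_reference() {
    free_count = 0;
    int seen = 0;
    {
        BWrapper b(toy_newB(42));
        AReference a = b.getFirstA();  // valid only while b is alive
        seen = a.value();
    }  // b is destroyed here and frees its A exactly once
    assert(free_count == 1);
    return seen;
}
```

The trade-off is the usual one for non-owning views: an `AReference` must not outlive the `BWrapper` it came from.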
You are freeing A twice
BWrapper b = BWrapper::fromString("first A", "second A");
When b goes out of scope, c_freeB is called which also calls c_freeA
AWrapper a1 = b.getFirstA();
Wraps A with another unique_ptr, then when a1 goes out of scope it will call c_freeA on the same A.
Note that getFirstA in BWrapper gives ownership of an A to another unique_ptr when using the AWrapper constructor.
Ways to fix this:
Don't let B manage A memory, but since you are using a lib that won't be possible.
Let BWrapper manage A, don't let AWrapper manage A and make sure the BWrapper exists when using AWrapper. That is, use a raw pointer in AWrapper instead of a smart pointer.
Make a copy of A in the AWrapper(A *) constructor, for this you might want to use a function from the library.
Edit:
shared_ptr won't work in this case because c_freeB will call c_freeA anyways.
Edit 2:
In this specific case considering the raptor lib you mentioned, you could try the following:
explicit AWrapper(A *a)
: aptr_(raptor_uri_copy(a)) {
}
assuming that A is a raptor_uri. raptor_uri_copy(raptor_uri *) will increase the reference count and return the same passed pointer. Then, even if raptor_free_uri is called twice on the same raptor_uri * it will call free only when the counter becomes zero.
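The reference-counting behaviour described here can be illustrated with a toy model. This is not raptor's implementation: `Uri`, `toy_new_uri`, `toy_uri_copy`, `toy_free_uri` and the `real_frees` counter are all hypothetical names that only mimic the described semantics (copy bumps a counter, free releases memory when the counter reaches zero).

```cpp
#include <cassert>

// Toy reference-counted object; 'usage' plays the role of the library's counter.
struct Uri { int usage; };
static int real_frees = 0;  // how many times memory was actually released

Uri* toy_new_uri() { return new Uri{1}; }

// "Copy" hands back the same pointer with one more owner registered.
Uri* toy_uri_copy(Uri* u) { ++u->usage; return u; }

// "Free" only releases the memory once the last owner lets go.
void toy_free_uri(Uri* u) {
    if (--u->usage > 0) return;
    ++real_frees;
    delete u;
}

int demo_refcount() {
    real_frees = 0;
    Uri* owned_by_b = toy_new_uri();                   // what B holds
    Uri* owned_by_wrapper = toy_uri_copy(owned_by_b);  // what AWrapper holds
    assert(owned_by_wrapper == owned_by_b);

    toy_free_uri(owned_by_wrapper);  // AWrapper goes out of scope
    assert(real_frees == 0);         // still alive: B owns a reference
    toy_free_uri(owned_by_b);        // B is freed afterwards
    return real_frees;
}
```

`demo_refcount()` returns 1: two logical owners call free, but the memory is released once.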
If I create a class in C++, it is possible to call a member function through a pointer to that class even if no object of the class exists.
For example:
Class:
class ExampleClass
{
private:
double m_data;
public:
void readSomeData(double param)
{
m_data = param;
}
};
Any function where this class is used:
int main()
{
ExampleClass* myClass;
myClass->readSomeData(2.5);
}
Of course this won't work, because myClass does not point to an object.
To avoid such situations, I check inside the member function whether this is nullptr, for example:
void readSomeData(double param)
{
if(this == nullptr)
return;
m_data = param;
}
But gcc says:
'this' pointer cannot be null in well-defined C++ code; comparison may
be assumed to always evaluate to false.
Of course that is only a warning, but I would rather not have it. Is there a better way to check whether a pointer to a class object is valid?
Testing it inside the class is the wrong approach. The warning is correct: in well-defined code, this can never be null, so the test belongs at the point where you call the member function:
int main()
{
ExampleClass* myClass = nullptr; // always initialize a raw pointer to ensure
// that it does not point to a random address
// ....
if (myClass != nullptr) {
myClass->readSomeData(2.5);
}
return 0;
}
If a pointer must not be null in a certain part of your code, follow C++ Core Guideline I.12: declare a pointer that must not be null as not_null.
Microsoft provides a Guidelines Support Library (GSL) that has an implementation of not_null.
Or, if possible, don't use pointers at all and use std::optional instead.
So a code setup could look like this:
#include <gsl/gsl>
struct ExampleClass {
void readSomeData(double ){}
};
// now it is clear that myClass must not and can not be null within work_with_class
// it could still hold an invalid pointer, but that's another problem
void work_with_class(gsl::not_null<ExampleClass*> myClass) {
myClass->readSomeData(2.5);
}
int main()
{
ExampleClass* myClass = nullptr; // always initialize a raw pointer to ensure
// that it does not point to a random address
// ....
work_with_class(myClass);
return 0;
}
The best way is not to use pointers at all:
int main()
{
ExampleClass myClass;
myClass.readSomeData(2.5);
}
That way there's no need for any check, and in fact, checking this inside the function is moot.
If you need nullability, use std::optional instead.
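As a rough illustration of the std::optional suggestion (C++17), the sketch below keeps the question's `ExampleClass` and adds a few things that are assumptions of mine: the `data()` accessor, the helper `read_if_present`, and the `-1.0` sentinel it returns when no object is present.

```cpp
#include <cassert>
#include <optional>

class ExampleClass {
    double m_data = 0.0;
public:
    void readSomeData(double param) { m_data = param; }
    double data() const { return m_data; }  // accessor added for the sketch
};

// Hypothetical helper: forwards to readSomeData only if an object exists.
// Returns the stored value, or -1.0 (an arbitrary sentinel) when empty.
double read_if_present(std::optional<ExampleClass>& maybe, double param) {
    if (!maybe) return -1.0;     // the "null check", done at the call site
    maybe->readSomeData(param);  // safe: the optional holds a value
    return maybe->data();
}

double demo_optional() {
    std::optional<ExampleClass> empty;  // no object, and no dangling pointer
    assert(read_if_present(empty, 2.5) == -1.0);

    std::optional<ExampleClass> full{ExampleClass{}};
    return read_if_present(full, 2.5);
}
```

The emptiness check lives outside the class, which is where the answers above say it belongs, and there is no raw pointer that could be left uninitialised.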
Either don't use pointers as Bartek Banachewicz has pointed out, or properly initialize and check the pointer:
int main()
{
ExampleClass* myClass= 0;
if (myClass)
myClass->readSomeData(2.5);
return 0;
}
Of course you still have to add the instantiation of the object at some point, otherwise the code is nonsense.
[Global Scope]
myClass *objA, *objB, *obj;
int objnum;
I want to switch between objA and objB and assign them alternatively to obj, so in main() I have:
int main()
{
objA = new myClass(parameters...);
objB = new myClass(parameters...);
// start with objA;
objnum = 0;
obj = objA;
}
At some point a function is called that switches between the two objects:
void switchObjects()
{
if (++objnum > 1) objnum = 0;
obj = objnum == 0 ? objA : objB;
}
And in the function where I use the object, I have:
void doYourJob()
{
int res = obj->work();
}
Now the weird thing is that if I don't assign obj to either objA or objB, it still works. I would expect an exception, instead. Even if I do obj = NULL;, it still works! What's this voodoo?
OK, I can provide a different example that leads to the same result, without using a NULL pointer:
myClass *obj[2];
int objnum;
void switchObject()
{
if (++objnum > 1) objnum = 0;
}
void doYourJob()
{
int res = obj[objnum]->work();
}
int main()
{
obj[0] = new myClass(parameters...);
obj[1] = new myClass(parameters...);
objnum = 0;
}
With the above code, regardless of the value of objnum, I still get both objects working together, even if I'm calling work() on only one instance.
And if I replace the function doYourJob() with this:
void doYourJob()
{
int res1 = obj[0]->work();
int res2 = obj[1]->work();
}
I always get the results doubled, as if I were calling the function work() twice on every object.
Consider a simpler example:
#include <iostream>
struct X
{
void foo() { std::cout << "Works" << std::endl; }
};
int main() {
X* x = nullptr;
x->foo();
}
With most compilers and on most platforms, this code will appear to work fine, despite having called foo on a null pointer. However, the behaviour is technically undefined. That is, the C++ language gives no restrictions about what might happen if you do this.
Why does it work? Well, calling a member function only requires knowing the type of the object it is being called on. We know that x points at an X, so we know what function to call: X::foo. In many cases, it may be difficult or even impossible to know if a pointer points at a real object, so the compiler just lets it happen. The body of the function, in this case, doesn't actually depend on the X object actually existing, so it just works. This isn't something you can depend on though.
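One way to picture this is to write the member function as an ordinary function with an explicit object pointer. This is only a mental model of how typical compilers lower non-virtual member calls, not something the standard promises, and calling `x->foo()` on a null pointer stays undefined. The names `foo_like`, `bar_like` and the `value` member are invented for the sketch.

```cpp
#include <cassert>
#include <string>

struct X { int value = 7; };

// Roughly how a non-virtual X::foo() looks after lowering: a plain function
// with the object pointer as an explicit parameter.
std::string foo_like(X* self) {
    (void)self;  // never dereferenced, like the body of X::foo above
    return "Works";
}

// A function that does touch the object needs a real one.
int bar_like(X* self) {
    return self->value;  // dereferences self
}

std::string demo_hidden_this() {
    X real;
    assert(bar_like(&real) == 7);

    // For a free function that ignores its argument, passing nullptr is
    // well defined; this mirrors why the null member call "appears to work".
    return foo_like(nullptr);
}
```

As soon as the body reads a member, as `bar_like` does, a null pointer is dereferenced and the program typically crashes.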
I recently created my own scripting language. My code structure is heavily based on polymorphism. (I'm not really sure what this is called: I have a virtual function, then I derive from the class and let the runtime decide which implementation to call.)
class Statement
{
public:
    virtual void exec() = 0;
};
class PrintStmt : public Statement
{
public:
    void exec()
    {
        std::cout << expression->eval();
    }
};
class AssignStmt : public Statement
{
public:
    void exec()
    {
        vm->bind_var(name, expression->eval());
    }
};
Any ideas how I can rework this so it can be compiled by a pure C compiler?
I know this is general question and there is no single answer, but how would you do this?
Note: I already downloaded the python code as a reference, but it will take time until I figure out how it is working.
Statement would be a struct. In addition to its data members, you will need a function pointer e.g.
struct Statement
{
void(*exec)(struct Statement* this); // Function pointer
// Other members
};
You would then have different implementations of the functions per statement type and a function for manufacturing objects of the right type e.g.
static void printExec(struct Statement* this)
{
printf("%s", this->whatever);
}
struct Statement* createPrintStatement()
{
struct Statement* statement = calloc(1, sizeof(struct Statement));
statement->exec = printExec;
return statement;
}
And you would invoke it like this:
statement->exec(statement);
The this pointer gives you access to the data members of the particular struct i.e. the instance whose exec method you invoked.
If you have lots of functions, consider using a vtable.
struct Statement; // forward declaration so the vtable can refer to it
struct VTable
{
void(*exec)(struct Statement* this); // Function pointer
const char* (*stringValue)(struct Statement* this); // Function pointer
};
struct Statement
{
struct VTable* vtable;
// Other members
};
You build the vtable for each kind of object only once:
struct VTable printVTable =
{
printExec,
printStringValue
};
You create new objects thus:
struct Statement* createPrintStatement()
{
struct Statement* statement = calloc(1, sizeof(struct Statement));
statement->vtable = &printVTable;
return statement;
}
and invoke the methods thus
statement->vtable->exec(statement);
The vtable method is more or less what C++ does behind the scenes.
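Putting the vtable pieces together, a complete version might look like the sketch below. It is compiled as C++ here but sticks to C-style structs and function pointers. The parameter is called `self` because `this` is a keyword in C++, and the `text` member and the `demo_vtable` driver are assumptions added so there is something to print and check.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

struct Statement;  // forward declaration so VTable can mention it

struct VTable {
    void (*exec)(struct Statement* self);
    const char* (*stringValue)(struct Statement* self);
};

struct Statement {
    const struct VTable* vtable;  // shared by all statements of one kind
    const char* text;             // hypothetical payload for the sketch
};

static const char* printStringValue(struct Statement* self) {
    return self->text;
}

static void printExec(struct Statement* self) {
    // Dispatch through the vtable, as a virtual call would.
    std::printf("%s\n", self->vtable->stringValue(self));
}

// One vtable per statement kind, built once.
static const struct VTable printVTable = { printExec, printStringValue };

struct Statement* createPrintStatement(const char* text) {
    struct Statement* s =
        (struct Statement*)std::calloc(1, sizeof(struct Statement));
    if (s == NULL) return NULL;
    s->vtable = &printVTable;
    s->text = text;
    return s;
}

// Driver (the only part using a C++ type, for easy checking).
std::string demo_vtable() {
    struct Statement* s = createPrintStatement("hello");
    assert(s != NULL);
    s->vtable->exec(s);  // prints the text followed by a newline
    std::string value = s->vtable->stringValue(s);
    std::free(s);
    return value;
}
```

Adding an `AssignStmt` kind would mean writing its two functions and one more `VTable` instance; the calling code stays the same.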
The most straightforward way to convert this to C would probably be to use function pointers.
As @DrewNorman said, you will need to understand how vtables, class layouts and so on work, and reimplement them (at least partially) in C. The example code below is very limited but gives you a hint of what to expect.
struct Statement {
void (*exec)(struct Statement* s);
};
struct PrintStmt {
struct Statement statement;
char* what;
};
void print_function(struct Statement* s) {
    /* Valid because the Statement is the first member of PrintStmt */
    struct PrintStmt* p = (struct PrintStmt*)s;
    printf("%s", p->what);
}
// ...
struct PrintStmt p;
p.statement.exec = &print_function;
p.what = "Hello world";
p.statement.exec(&p.statement);
There are numerous C projects that use this kind of technique, GObject comes to my mind but it's far from the only one.
(Note: I'm used to C++ not really to C so this may not even be valid C but you get the idea anyway)
Using C++ I built a Class that has many setter functions, as well as various functions that may be called in a row during runtime.
So I end up with code that looks like:
A* a = new A();
a->setA();
a->setB();
a->setC();
...
a->doA();
a->doB();
Not that this is bad, but I don't like typing "a->" over and over again.
So I rewrote my class definitions to look like:
class A{
public:
A();
virtual ~A();
A* setA();
A* setB();
A* setC();
A* doA();
A* doB();
// other functions
private:
// vars
};
So then I could init my class like: (method 1)
A* a = new A();
a->setA()->setB()->setC();
...
a->doA()->doB();
(which I prefer as it is easier to write)
To give a more precise implementation of this you can see my SDL Sprite C++ Class I wrote at http://ken-soft.com/?p=234
Everything seems to work just fine. However, I would be interested in any feedback to this approach.
I have noticed one problem. If I init my class like this: (method 2)
A a = A();
a.setA()->setB()->setC();
...
a.doA()->doB();
Then I have various memory issues and sometimes things don't work as they should (you can see this by changing how I init all Sprite objects in main.cpp of my Sprite Demo).
Is that normal? Or should the behavior be the same?
Edit: the setters are primarily there to make my life easier during initialization. My main question is: why do method 1 and method 2 behave differently for me?
Edit: Here's an example getter and setter:
Sprite* Sprite::setSpeed(int i) {
speed = i;
return this;
}
int Sprite::getSpeed() {
return speed;
}
One note unrelated to your question, the statement A a = A(); probably isn't doing what you expect. In C++, objects aren't reference types that default to null, so this statement is almost never correct. You probably want just A a;
A a creates a new instance of A, but the = A() part invokes A's copy constructor with a temporary default-constructed A (unless the compiler elides the copy, which many do and which C++17 and later require). If you had written just A a; it would simply have created a new instance of A using the default constructor.
If you don't explicitly implement your own copy constructor for a class, the compiler will create one for you. The compiler created copy constructor will just make a carbon copy of the other object's data; this means that if you have any pointers, it won't copy the data pointed to.
So, essentially, that line is creating a new instance of A, then constructing another temporary instance of A with the default constructor, then copying the temporary A to the new A, then destructing the temporary A. If the temporary A acquires resources in its constructor and deallocates them in its destructor, you could run into issues where your object is trying to use data that has already been deallocated, which is undefined behavior.
Take this code for example:
struct A {
A() {
myData = new int;
std::cout << "Allocated int at " << myData << std::endl;
}
~A() {
delete myData;
std::cout << "Deallocated int at " << myData << std::endl;
}
int* myData;
};
A a = A();
std::cout << "a.myData points to " << a.myData << std::endl;
The output will look something like:
Allocated int at 0x9FB7128
Deallocated int at 0x9FB7128
a.myData points to 0x9FB7128
As you can see, a.myData is pointing to an address that has already been deallocated. If you attempt to use the data it points to, you could be accessing completely invalid data, or even the data of some other object that took its place in memory. And once your a goes out of scope, it will attempt to delete the data a second time, which will cause more problems.
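One common way to rule out this kind of double delete is to let a smart pointer own the resource, so that the class can no longer be copied by accident. The following is a sketch of that idea rather than a fix for the Sprite class itself: `A` and `myData` follow the example above, and `demo_single_owner` is a made-up driver.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct A {
    A() : myData(std::make_unique<int>(0)) {}
    // std::unique_ptr makes A move-only: an accidental copy no longer
    // compiles, and a move transfers the int instead of sharing it.
    std::unique_ptr<int> myData;
};

int demo_single_owner() {
    A temp;
    *temp.myData = 5;

    A a = std::move(temp);           // ownership moves from temp to a
    assert(temp.myData == nullptr);  // temp no longer points at the int
    return *a.myData;                // the int is deleted once, by a
}
```

`demo_single_owner()` returns 5. The alternative is to keep the raw pointer and write a copy constructor and copy assignment operator that duplicate the pointed-to data.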
What you have implemented there is called a fluent interface. I have mostly encountered them in scripting languages, but there is no reason you can't use them in C++.
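A common C++ variant of a fluent interface returns a reference to the object (`*this`) instead of a pointer, so the same chaining syntax works for objects on the stack and on the heap. The sketch below reuses `setSpeed` and `getSpeed` from the question; `setX`, `getX` and `demo_fluent` are hypothetical additions.

```cpp
#include <cassert>

class Sprite {
    int speed_ = 0;
    int x_ = 0;
public:
    // Each setter returns *this by reference so calls can be chained with '.'
    Sprite& setSpeed(int s) { speed_ = s; return *this; }
    Sprite& setX(int x) { x_ = x; return *this; }
    int getSpeed() const { return speed_; }
    int getX() const { return x_; }
};

int demo_fluent() {
    Sprite onStack;  // plain object, no 'new' and no copy
    onStack.setSpeed(3).setX(10);
    assert(onStack.getSpeed() == 3);

    Sprite* onHeap = new Sprite();
    onHeap->setSpeed(4).setX(20);  // only the first call needs '->'

    int sum = onStack.getSpeed() + onStack.getX()
            + onHeap->getSpeed() + onHeap->getX();
    delete onHeap;
    return sum;
}
```

`demo_fluent()` returns 37. Returning a reference does not by itself cure copy-related bugs in a class that manages raw resources, but it removes the mix of `.` and `->` in a chain.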
If you really, really hate calling lots of set functions one after the other, then you may enjoy the following code. For most people, this is way overkill for the 'problem' solved.
This code demonstrates how to create a set function that can accept set classes of any number in any order.
#include "stdafx.h"
#include <stdarg.h>
// Base class for all setter classes
class cSetterBase
{
public:
// the type of setter
int myType;
// a union capable of storing any kind of data that will be required
union data_t {
int i;
float f;
double d;
} myValue;
cSetterBase( int t ) : myType( t ) {}
};
// Base class for float valued setter functions
class cSetterFloatBase : public cSetterBase
{
public:
cSetterFloatBase( int t, float v ) :
cSetterBase( t )
{ myValue.f = v; }
};
// A couple of sample setter classes with float values
class cSetterA : public cSetterFloatBase
{
public:
cSetterA( float v ) :
cSetterFloatBase( 1, v )
{}
};
// A couple of sample setter classes with float values
class cSetterB : public cSetterFloatBase
{
public:
cSetterB( float v ) :
cSetterFloatBase( 2, v )
{}
};
// this is the class that actually does something useful
class cUseful
{
public:
// set attributes using any number of setter classes of any kind
void Set( int count, ... );
// the attributes to be set
float A, B;
};
// set attributes using any setter classes
void cUseful::Set( int count, ... )
{
va_list vl;
va_start( vl, count );
for( int kv=0; kv < count; kv++ ) {
cSetterBase s = va_arg( vl, cSetterBase );
cSetterBase * ps = &s;
switch( ps->myType ) {
case 1:
A = ((cSetterA*)ps)->myValue.f; break;
case 2:
B = ((cSetterB*)ps)->myValue.f; break;
}
}
va_end(vl);
}
int _tmain(int argc, _TCHAR* argv[])
{
cUseful U;
U.Set( 2, cSetterB( 47.5 ), cSetterA( 23 ) );
printf("A = %f B = %f\n",U.A, U.B );
return 0;
}
You may consider the ConstrOpt paradigm. I first heard about this when reading the XML-RPC C/C++ lib documentation here: http://xmlrpc-c.sourceforge.net/doc/libxmlrpc++.html#constropt
Basically the idea is similar to yours, but the "ConstrOpt" paradigm uses a subclass of the one you want to instantiate. This subclass is then instantiated on the stack with default options and then the relevant parameters are set with the "reference-chain" in the same way as you do.
The constructor of the real class then uses the constrOpt class as the only constructor parameter.
This is not the most efficient solution, but can help to get a clear and safe API design.
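A minimal sketch of the idea follows, assuming that "subclass" here means an options class nested inside the class being constructed. `Client`, its `timeout` and `host` options and their defaults are hypothetical and are not taken from the XML-RPC library.

```cpp
#include <cassert>
#include <string>

class Client {
public:
    // Options object: created with defaults, then adjusted by chained setters.
    class constrOpt {
        friend class Client;  // lets Client read the collected options
        int timeout_ = 30;
        std::string host_ = "localhost";
    public:
        constrOpt& timeout(int t) { timeout_ = t; return *this; }
        constrOpt& host(const std::string& h) { host_ = h; return *this; }
    };

    // The real class takes the options object as its only constructor parameter.
    explicit Client(const constrOpt& opt)
        : timeout_(opt.timeout_), host_(opt.host_) {}

    int timeout() const { return timeout_; }
    const std::string& host() const { return host_; }

private:
    int timeout_;
    std::string host_;
};

int demo_constropt() {
    // Only the options that differ from the defaults are mentioned.
    Client c(Client::constrOpt().timeout(5));
    assert(c.host() == "localhost");
    return c.timeout();
}
```

`demo_constropt()` returns 5. The object is fully configured by the time its constructor finishes, so no half-initialised `Client` is ever visible to callers.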