C++ new[] into base class pointer crash on array access - c++

When I allocate a single object, this code works fine. When I try to add array syntax, it segfaults. Why is this? My goal here is to hide from the outside world the fact that class c is using b objects internally. I have posted the program to codepad for you to play with.
#include <iostream>
using namespace std;
// file 1
class a
{
public:
virtual void m() { }
virtual ~a() { }
};
// file 2
class b : public a
{
int x;
public:
void m() { cout << "b!\n"; }
};
// file 3
class c : public a
{
a *s;
public:
// PROBLEMATIC SECTION
c() { s = new b[10]; } // s = new b;
void m() { for(int i = 0; i < 10; i++) s[i].m(); } // s->m();
~c() { delete[] s; } // delete s;
// END PROBLEMATIC SECTION
};
// file 4
int main(void)
{
c o;
o.m();
return 0;
}

Creating an array of 10 b's with new and then assigning its address to an a* is just asking for trouble.
Do not treat arrays polymorphically.
For more information see ARR39-CPP. Do not treat arrays polymorphically, at section 06. Arrays and the STL (ARR) of the CERT C++ Secure Coding Standard.

One problem is that the expression s[i] uses pointer arithmetic to compute the address of the desired object. Since s is declared as a pointer to a, the result is correct for an array of a objects and incorrect for an array of b objects. The dynamic binding provided by inheritance works only for member functions, nothing else (there are no virtual data members and no virtual sizeof). Thus, when calling s[i].m(), the this pointer is set to where the ith a object would be in the array. But since the array actually holds b objects, it ends up (sometimes) pointing somewhere in the middle of an object, and you get a segfault (probably when the program tries to access the object's vtable). You might be able to rectify the problem by virtualizing and overloading operator[](). (I didn't think it through to see whether it will actually work, though.)
Another problem is the delete in the destructor, for similar reasons. You might be able to virtualize and overload it too. (Again, just a random idea that popped into my head. Might not work.)
Of course, casting (as suggested by others) will work too.

You have an array of type "b", not of type "a", and you are assigning it to a pointer of type a*. Polymorphism doesn't transfer to dynamic arrays. Change
a* s
to a
b* s
and you will see this start working.
Only not-yet-bound pointers can be treated polymorphically. Think about it:
a* s = new B(); // works
//a* is a holder for an address
a* s = new B[10]
//a* is a holder for an address
//at that address is a contiguous block of 10 B objects, like so
// [B0][B1]...[B9] (memory layout)
When you iterate over the array using s, think about what is used:
s[i]
//s[i] uses the ith B object from memory. It's of type B. It has no polymorphism.
// That's why you use the . notation to call m(), not the -> notation
Before you converted to an array you just had
a* s = new B();
s->m();
s here is just an address; it's not a static object like s[i]. Just the address s can still be dynamically bound. What is at s? Who knows? Something at an address s.
See Ari's great answer below for more information about why this also doesn't make sense in terms of how C-style arrays are laid out.

Each instance of b contains both the x data member and the "vptr" (pointer to the virtual table).
Each instance of a contains only the "vptr".
Thus, sizeof(a) != sizeof(b).
Now when you do "s = new b[10]", you lay out 10 instances of b in a row in memory, and s (which has the type a*) receives the address of the beginning of that row of data.
In the c::m() method, you tell the compiler to iterate over an array of "a" (because s has the type a*), BUT s is actually pointing to an array of "b". So when you write s[i], what the compiler actually computes is the address s + i * sizeof(a) (in bytes): it steps in units of "a" instead of units of "b", and since a and b don't have the same size, you get a lot of mumbo jumbo.
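To make the stride mismatch concrete, here is a small sketch that only computes byte offsets and never dereferences anything, so it stays clear of the undefined behavior itself. The classes mirror the ones in the question; element_offset and print_strides are names made up for this illustration, and the actual sizes depend on your platform.

```cpp
#include <cstddef>
#include <iostream>

// Same shapes as the classes in the question.
class a {
public:
    virtual void m() {}
    virtual ~a() {}
};

class b : public a {
    int x = 0;
public:
    void m() override { std::cout << "b!\n"; }
};

// Byte offset the compiler uses for element i of an array of T
// (hypothetical helper, not part of the original code).
template <typename T>
std::size_t element_offset(std::size_t i) {
    return i * sizeof(T);
}

// Prints where s[i] is looked up through an a* (stride sizeof(a))
// next to where the i-th b really starts (stride sizeof(b)).
void print_strides(std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        std::cout << "i=" << i
                  << "  offset via a*: " << element_offset<a>(i)
                  << "  real b offset: " << element_offset<b>(i) << '\n';
    }
}
```

Calling print_strides(3) shows the two columns agreeing only for i = 0, which is why the single-object version worked and the array version did not.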

I have figured out a workaround based on your answers. It allows me to hide the implementation specifics using a layer of indirection. It also allows me to mix and match objects in my array. Thanks!
#include <iostream>
using namespace std;
// file 1
class a
{
public:
virtual void m() { }
virtual ~a() { }
};
// file 2
class b : public a
{
int x;
public:
void m() { cout << "b!\n"; }
};
// file 3
class c : public a
{
a **s;
public:
// PROBLEMATIC SECTION
c() { s = new a* [10]; for(int i = 0; i < 10; i++) s[i] = new b(); }
void m() { for(int i = 0; i < 10; i++) s[i]->m(); }
~c() { for(int i = 0; i < 10; i++) delete s[i]; delete[] s; }
// END PROBLEMATIC SECTION
};
// file 4
int main(void)
{
c o;
o.m();
return 0;
}
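For what it's worth, the same layer of indirection can be written with standard containers so that no manual delete is needed. This is only a sketch of that variation, assuming C++14 for std::make_unique; the static counter b::calls and the size() member are additions made purely so the behaviour can be checked.

```cpp
#include <cstddef>
#include <iostream>
#include <memory>
#include <vector>

class a {
public:
    virtual void m() {}
    virtual ~a() {}
};

class b : public a {
public:
    static int calls;  // hypothetical counter, only here to observe the calls
    void m() override { std::cout << "b!\n"; ++calls; }
};
int b::calls = 0;

class c : public a {
    // Each element is a pointer, so every slot has the same size
    // no matter which derived type it points to.
    std::vector<std::unique_ptr<a>> s;
public:
    explicit c(std::size_t n = 10) {
        s.reserve(n);
        for (std::size_t i = 0; i < n; ++i)
            s.push_back(std::make_unique<b>());
    }
    void m() override {
        for (auto& p : s) p->m();  // virtual dispatch through a*
    }
    std::size_t size() const { return s.size(); }
    // No user-written destructor: the vector releases every element.
};
```

Usage is unchanged from the question: c o; o.m(); prints "b!" ten times, and the objects are released when o goes out of scope.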

Related

How to resolve memory related errors that arise from interaction between C objects in a C++ wrapper?

The problem
I am writing a thin C++ wrapper around an object-oriented C library. The idea was to automate memory management, but so far it's not been very automatic. Basically, when I use my wrapper classes, I get all kinds of memory access and inappropriate freeing problems.
Minimal example of C library
Let's say the C library consists of A and B classes, each of which has a few 'methods' associated with it:
#include <cstdlib>   // malloc, free
#include <cstring>   // strlen, strcpy
#include <iostream>
#include <memory>
extern "C" {
typedef struct {
unsigned char *string;
} A;
A *c_newA(const char *string) {
A *a = (A *) malloc(sizeof(A)); // yes I know, don't use malloc in C++. This is a demo to simulate the C library that uses it.
auto *s = (char *) malloc(strlen(string) + 1);
strcpy(s, string);
a->string = (unsigned char *) s;
return a;
}
void c_freeA(A *a) {
free(a->string);
free(a);
}
void c_printA(A *a) {
std::cout << a->string << std::endl;
}
typedef struct {
A *firstA;
A *secondA;
} B;
B *c_newB(const char *first, const char *second) {
B *b = (B *) malloc(sizeof(B));
b->firstA = c_newA(first);
b->secondA = c_newA(second);
return b;
}
void c_freeB(B *b) {
c_freeA(b->firstA);
c_freeA(b->secondA);
free(b);
}
void c_printB(B *b) {
std::cout << b->firstA->string << ", " << b->secondA->string << std::endl;
}
A *c_getFirstA(B *b) {
return b->firstA;
}
A *c_getSecondA(B *b) {
return b->secondA;
}
}
Test the 'C lib'
void testA() {
A *a = c_newA("An A");
c_printA(a);
c_freeA(a);
// outputs: "An A"
// valgrind is happy =]
}
void testB() {
B *b = c_newB("first A", "second A");
c_printB(b);
c_freeB(b);
// outputs: "first A, second A"
// valgrind is happy =]
}
Wrapper classes for A and B
class AWrapper {
struct deleter {
void operator()(A *a) {
c_freeA(a);
}
};
std::unique_ptr<A, deleter> aptr_;
public:
explicit AWrapper(A *a)
: aptr_(a) {
}
static AWrapper fromString(const std::string &string) { // preferred way of instantiating
A *a = c_newA(string.c_str());
return AWrapper(a);
}
void printA() {
c_printA(aptr_.get());
}
};
class BWrapper {
struct deleter {
void operator()(B *b) {
c_freeB(b);
}
};
std::unique_ptr<B, deleter> bptr_;
public:
explicit BWrapper(B *b)
: bptr_(std::unique_ptr<B, deleter>(b)) {
}
static BWrapper fromString(const std::string &first, const std::string &second) {
B *b = c_newB(first.c_str(), second.c_str());
return BWrapper(b);
}
void printB() {
c_printB(bptr_.get());
}
AWrapper getFirstA(){
return AWrapper(c_getFirstA(bptr_.get()));
}
AWrapper getSecondA(){
return AWrapper(c_getSecondA(bptr_.get()));
}
};
Wrapper tests
void testAWrapper() {
AWrapper a = AWrapper::fromString("An A");
a.printA();
// outputs "An A"
// valgrind is happy =]
}
void testBWrapper() {
BWrapper b = BWrapper::fromString("first A", "second A");
b.printB();
// outputs "first A, second A"
// valgrind is happy =]
}
Demonstration of the problem
Great, so I move on and develop the full wrapper (a lot of classes) and realise that when classes like this (i.e. in an aggregation relationship) are both in scope, C++ will automatically call the destructors of both classes separately, but because of the structure of the underlying library (i.e. the calls to free), we get memory problems:
void testUsingAWrapperAndBWrapperTogether() {
BWrapper b = BWrapper::fromString("first A", "second A");
AWrapper a1 = b.getFirstA();
// valgrind no happy =[
}
Valgrind output
Things I've tried
Cloning not possible
The first thing I tried was to take a copy of A, rather than having them try to free the same A. This, while a good idea, is not possible in my case because of the nature of the library I'm using. There is actually a caching mechanism in place so that when you create a new A with a string it's seen before, it'll give you back the same A. See this question for my attempts at cloning A.
Custom destructors
I took the code for the C library destructors (freeA and freeB here) and copied them into my source code. Then I tried to modify them such that A does not get freed by B. This has partially worked. Some instances of memory problems have been resolved, but because this idea does not tackle the problem at hand (just kind of temporarily glosses over the main issue), new problems keep popping up, some of which are obscure and difficult to debug.
The question
So at last we arrive at the question: How can I modify this C++ wrapper to resolve the memory problems that arise due to the interactions between the underlying C objects? Can I make better use of smart pointers? Should I abandon the C++ wrapper completely and just use the library's pointers as is? Or is there a better way I haven't thought of?
Thanks in advance.
Edits: response to the comments
Since asking the previous question (linked above) I have restructured my code so that the wrapper is being developed and built in the same library as the one it wraps. So the objects are no longer opaque.
The pointers are generated from function calls to the library, which uses calloc or malloc to allocate.
In the real code A is raptor_uri* (typedef librdf_uri*) from raptor2 and is allocated with librdf_new_uri while B is raptor_term* (aka librdf_node*) and allocated with librdf_new_node_* functions. The librdf_node has a librdf_uri field.
Edit 2
I can also point to the line of code where the same A is returned if it's the same string. See line 137 here
The problem is that getFirstA and getSecondA return instances of AWrapper, which is an owning type. Constructing an AWrapper means handing it ownership of an A *, but getFirstA and getSecondA have no ownership to hand over: the pointers from which the returned objects are constructed are still managed by the BWrapper.
The easiest solution is to return an A * instead of the wrapper class. This way you're not passing on ownership of the inner A member. I would also recommend making the pointer-taking constructors of the wrapper classes private and adding a fromPointer static method similar to fromString, which takes ownership of the pointer passed to it. This way you won't accidentally make instances of the wrapper classes from raw pointers.
If you want to avoid using raw pointers or want to have methods on the returned objects from getFirstA and getSecondA you could write a simple reference wrapper, which has a raw pointer as a member.
class AReference
{
private:
A *a_ref_;
public:
explicit AReference(A *a_ref) : a_ref_(a_ref) {}
// other methods here, such as print or get
};
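Here is a rough sketch of how that reference wrapper could fit together with BWrapper. The C functions are cut-down stand-ins for the ones in the question (malloc results are not checked, for brevity), and g_freedA and str() are hypothetical additions that exist only so the number of frees can be observed.

```cpp
#include <cstdlib>
#include <cstring>
#include <memory>
#include <string>

// Cut-down stand-in for the C library in the question.
struct A { char *string; };
struct B { A *firstA; A *secondA; };

static int g_freedA = 0;  // hypothetical counter: how many times c_freeA ran

A *c_newA(const char *s) {
    A *a = static_cast<A *>(std::malloc(sizeof(A)));
    a->string = static_cast<char *>(std::malloc(std::strlen(s) + 1));
    std::strcpy(a->string, s);
    return a;
}
void c_freeA(A *a) { ++g_freedA; std::free(a->string); std::free(a); }
B *c_newB(const char *first, const char *second) {
    B *b = static_cast<B *>(std::malloc(sizeof(B)));
    b->firstA = c_newA(first);
    b->secondA = c_newA(second);
    return b;
}
void c_freeB(B *b) { c_freeA(b->firstA); c_freeA(b->secondA); std::free(b); }

// Non-owning view of an A: it never frees, so it must not outlive
// the BWrapper that owns the underlying object.
class AReference {
    A *a_ref_;
public:
    explicit AReference(A *a_ref) : a_ref_(a_ref) {}
    std::string str() const { return a_ref_->string; }
};

class BWrapper {
    struct deleter { void operator()(B *b) const { c_freeB(b); } };
    std::unique_ptr<B, deleter> bptr_;
    explicit BWrapper(B *b) : bptr_(b) {}  // private: use fromString
public:
    static BWrapper fromString(const std::string &first, const std::string &second) {
        return BWrapper(c_newB(first.c_str(), second.c_str()));
    }
    AReference getFirstA() const { return AReference(bptr_->firstA); }
    AReference getSecondA() const { return AReference(bptr_->secondA); }
};
```

With this shape, holding an AReference from getFirstA() next to the BWrapper frees each A exactly once, when the BWrapper is destroyed. The price is that the caller has to keep the BWrapper alive for as long as the reference is in use.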
You are freeing A twice
BWrapper b = BWrapper::fromString("first A", "second A");
When b goes out of scope, c_freeB is called which also calls c_freeA
AWrapper a1 = b.getFirstA();
Wraps A with another unique_ptr, then when a1 goes out of scope it will call c_freeA on the same A.
Note that getFirstA in BWrapper gives ownership of an A to another unique_ptr when using the AWrapper constructor.
Ways to fix this:
Don't let B manage A memory, but since you are using a lib that won't be possible.
Let BWrapper manage A, don't let AWrapper manage A and make sure the BWrapper exists when using AWrapper. That is, use a raw pointer in AWrapper instead of a smart pointer.
Make a copy of A in the AWrapper(A *) constructor, for this you might want to use a function from the library.
Edit:
shared_ptr won't work in this case because c_freeB will call c_freeA anyways.
Edit 2:
In this specific case considering the raptor lib you mentioned, you could try the following:
explicit AWrapper(A *a)
: aptr_(raptor_uri_copy(a)) {
}
assuming that A is a raptor_uri. raptor_uri_copy(raptor_uri *) will increase the reference count and return the same passed pointer. Then, even if raptor_free_uri is called twice on the same raptor_uri * it will call free only when the counter becomes zero.

Segmentation fault when calling derived class method

I have a problem related to designing derived classes that hold arrays. I have class B derived from A, and class BB derived from AA; AA holds an array of A and BB an array of B.
#include <iostream>
class A
{
public:
A(){}
virtual void foo(){std::cout<<"foo A\n";}
int idx[3];
};
class B: public A
{
public:
B():A(){}
void foo(){std::cout<<"foo B\n";}
int uidx[3];
};
class AA
{
public:
AA(){}
AA(int count){
m_count = count;
m_a = new A[count];
}
virtual A* getA(){return m_a;}
~AA(){ delete[] m_a;}
protected:
A* m_a;
int m_count;
};
class BB: public AA
{
public:
BB(int count):AA()
{
m_count = count;
m_a = new B[count];
}
B* getA(){return dynamic_cast<B*>(m_a);}
};
int main()
{
AA* aa = new AA(2);
BB* bb = new BB(2);
B* b = bb->getA();
B& b0 = *b;
b0.idx[0] = 0;
b0.idx[1] = 1;
b0.idx[2] = 2;
B& b1 = *(b+1);
b1.idx[0] = 2;
b1.idx[1] = 3;
b1.idx[2] = 4;
std::cout<<bb->getA()[1].idx[0]<<"\n"; //prints 2
std::cout<<bb->getA()[1].idx[1]<<"\n"; //prints 3
std::cout<<bb->getA()[1].idx[2]<<"\n"; //prints 4
AA* cc = static_cast<AA*>(bb);
cc->getA()[0].foo(); //prints foo B
std::cout<<cc->getA()[1].idx[0]<<"\n"; //prints 4198624 ??
std::cout<<cc->getA()[1].idx[1]<<"\n"; //prints 0 ??
std::cout<<cc->getA()[1].idx[2]<<"\n"; //prints 2 ??
cc->getA()[1].foo(); //segmentation fault
delete aa;
delete bb;
return 0;
}
After the static_cast from BB* to AA*, I can't access the A elements at indices greater than 0.
How to solve this issue?
Thank you.
Note that the type of cc->getA() is determined by the static type of cc, which is AA*, so the call returns an A* (not a B*), even though BB::getA() is the function that actually runs.
Now, since B is a subclass of A that adds some extra fields, sizeof(B) > sizeof(A). Since cc->getA()[n] is basically *(cc->getA() + n), the line
cc->getA()[1].foo();
does the same thing as:
A * const tmp = cc->getA();
A & tmp2 = *(tmp + 1); // sizeof(A) bytes past tmp
tmp2.foo();
which causes undefined behaviour due to §5.7.6 [expr.add] of the C++ standard which states:
For addition or subtraction, if the expressions P or Q have type “pointer to cv T”, where T and the array element type are not similar ([conv.qual]), the behavior is undefined. [ Note: In particular, a pointer to a base class cannot be used for pointer arithmetic when the array contains objects of a derived class type. — end note ]
You probably wanted behaviour similar to the following:
A * const tmp = cc->getA();
A & tmp2 = *(static_cast<B *>(tmp) + 1); // sizeof(B) bytes past tmp
tmp2.foo();
For that you need to use something like:
std::cout<<static_cast<B*>(cc->getA())[1].idx[0]<<"\n"; // prints 2
std::cout<<static_cast<B*>(cc->getA())[1].idx[1]<<"\n"; // prints 3
std::cout<<static_cast<B*>(cc->getA())[1].idx[2]<<"\n"; // prints 4
static_cast<B*>(cc->getA())[1].foo(); // prints foo B
However, it is better to implement a virtual A & operator[](std::size_t) operator for AA and override it in BB.
I can see 2 issues in your code:
Since your classes are responsible for memory management, I would suggest making your destructors virtual, because if you, at any point, try to delete a derived class object via a base pointer, the destructor of the derived class will not be invoked. It shouldn't be a problem in your current code, but may become a problem in the future.
I.e:
int main ()
{
AA* aa = new BB (2);
delete aa;
}
Will not call the BB::~BB() in your case.
The problem that you are noticing, and writing this question about.
After you cast your variable from BB* to AA* (even though the cast isn't necessary; you can assign directly, because a BB* converts implicitly to an AA*) in the line:
AA* cc = static_cast<AA*>(bb);
your variable cc is treated as if it is of type AA* (it doesn't matter that its runtime type is BB*; in the general case you don't know, and should not care about, the exact runtime type). Virtual method calls are dispatched to the correct type via the vtable.
And now, why are you getting strange values printed in the console/segmentation fault? What's the result of cc->getA ()? Since the variable cc is treated as AA*, the return value is A* (as explained above, actual type is B*, but, due to is-a relationship of inheritance is treated as A*). What's the problem, you may ask: The array m_a is the same size in both cases, right?
Well, not really, to explain that, I would need to explain how array indexing works in C++, and how it is related to sizes of the objects.
I guess, that I wouldn't shock you, stating that size of object of type B (sizeof (B)), is larger than that of type A (sizeof (A)), since B has everything that A has (due to inheritance), with some stuff of its own. On my machine sizeof(A) = 16 bytes, and sizeof(B) = 28 bytes.
So, when you create an array, the total amount of space that array takes up is [element_count] * [size of the element] bytes, which seems logical. But, when you need to take an element from an array, it needs to figure, where exactly, that element is, in the memory, in all the space that array is taking up, so it does so, by calculating it. It does so as follows: [start of the array] + [index] * [size of element].
And now we arrive at the source of the problem. You are trying to do cc->getA()[1]. Since cc is, under the hood, a BB*, the storage behind AA::m_a occupies 2 * sizeof(B) bytes (= 2 * 28 = 56 on my machine): the first object starts at offset 0 (0 * sizeof(B)) and the second at offset 28 (1 * sizeof(B)). But because cc->getA() is treated as an A*, fetching the second element (index 1) reads from offset 1 * sizeof(A) (16 on my machine), which unfortunately lies in the middle of the space reserved for the first B object. Any value can be printed and anything can happen: undefined behavior is invoked.
How to fix it? I would fix it by implementing virtual indexing operators, instead of the getA method, on classes AA/BB, as follows:
class AA
{
public:
...
virtual A& operator[] (int idx)
{
return m_a[idx];
}
...
};
class BB : public AA
{
public:
...
virtual B& operator[] (int idx)
{
return dynamic_cast<B*>(m_a)[idx];
}
...
};
But then you need to be careful to call the operator on the object itself and not on the pointer to it (cc[1] would index the pointer, not invoke the operator):
std::cout << cc->operator[](1).idx[0] << "\n";
std::cout << cc->operator[](1).idx[1] << "\n";
std::cout << cc->operator[](1).idx[2] << "\n";
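Putting both points together (virtual destructor plus virtual indexing), a self-contained sketch might look like the following. It is a variation rather than the exact code above: BB keeps its array in its own B* member (m_b, a name introduced here) so that no cast is needed, foo() returns a string instead of printing so the result can be checked, and copying is disabled to keep the example short.

```cpp
#include <cstddef>
#include <string>

class A {
public:
    virtual ~A() = default;
    virtual std::string foo() const { return "foo A"; }
    int idx[3] = {0, 0, 0};
};

class B : public A {
public:
    std::string foo() const override { return "foo B"; }
    int uidx[3] = {0, 0, 0};
};

class AA {
public:
    explicit AA(std::size_t count) : m_a(new A[count]), m_count(count) {}
    virtual ~AA() { delete[] m_a; }  // virtual: safe to delete through AA*
    AA(const AA&) = delete;
    AA& operator=(const AA&) = delete;
    virtual A& operator[](std::size_t i) { return m_a[i]; }
    std::size_t size() const { return m_count; }
protected:
    AA() : m_a(nullptr), m_count(0) {}  // for subclasses with their own storage
    A* m_a;
    std::size_t m_count;
};

class BB : public AA {
public:
    explicit BB(std::size_t count) : m_b(new B[count]) { m_count = count; }
    ~BB() override { delete[] m_b; }
    // Covariant return type: indexing always uses the sizeof(B) stride.
    B& operator[](std::size_t i) override { return m_b[i]; }
private:
    B* m_b;  // the array is kept under its real element type
};
```

Through an AA* you would then write (*cc)[1].foo(), which dispatches to BB::operator[] and lands on a whole B object.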

C++ Object-oriented programming

I have one question, because I am curious how to handle the following problem.
I have a base class called "Pracownik" (Worker) and two subclasses that inherit publicly from Pracownik:
- Informatyk (IT specialist)
- Księgowy (Accountant)
Writing the classes was easy and I finished them quickly, but I have a small problem with main. I am helping a friend with this program, and I have not used C++ for a while. So:
This is my header file "funkcje.h"
#include <iostream>
using namespace std;
class Pracownik
{
private:
string nazwisko;
int pensja;
public:
Pracownik(string="",int=0);
~Pracownik();
string getNazwisko();
int getPensja();
friend double srednia_pensja(int,Pracownik);
};
class Informatyk : public Pracownik
{
private:
string certyfikat_Cisco;
string certyfikat_Microsoft;
public:
Informatyk(string="",int=0, string="", string="");
~Informatyk();
void info();
};
class Ksiegowy : public Pracownik
{
private:
bool audytor;
public:
Ksiegowy(string="",int=0, bool=false);
~Ksiegowy();
void info();
};
double srednia_pensja(int,Pracownik);
These are definitions of my functions "funkcje.cpp"
#include "funkcje.h"
Pracownik::Pracownik(string a,int b)
{
nazwisko=a;
pensja=b;
}
Pracownik::~Pracownik()
{
}
string Pracownik::getNazwisko()
{
return nazwisko;
}
int Pracownik::getPensja()
{
return pensja;
}
Informatyk::Informatyk(string a, int b, string c, string d) : Pracownik(a,b)
{
certyfikat_Cisco=c;
certyfikat_Microsoft=d;
}
Informatyk::~Informatyk()
{
}
Ksiegowy::Ksiegowy(string a, int b, bool c) : Pracownik(a,b)
{
audytor=c;
}
Ksiegowy::~Ksiegowy()
{
}
void Informatyk::info()
{
cout<<"Nazwisko pracownika: "<<Pracownik::getNazwisko()<<endl;
cout<<"Pensja pracownika: "<<Pracownik::getPensja()<<endl;
cout<<"Certyfikat Cisco: "<<certyfikat_Cisco<<endl;
cout<<"Certyfikat Microsoft: "<<certyfikat_Microsoft<<endl;
}
void Ksiegowy::info()
{
cout<<"Nazwisko pracownika: "<<Pracownik::getNazwisko()<<endl;
cout<<"Pensja pracownika: "<<Pracownik::getPensja()<<endl;
cout<<"Audytor: ";
if(audytor)
cout<<"Tak"<<endl;
else
cout<<"Nie"<<endl;
}
double srednia_pensja(int a,Pracownik *b)
{
return 0;
}
And finally main!
#include <iostream>
#include "funkcje.h"
using namespace std;
int main()
{
Pracownik lista[10];
Pracownik *lista_wsk = new Pracownik[10];
Informatyk a("Kowalski1",1000,"Cisco1","Microsoft1");
Informatyk b("Kowalski2",2000,"Cisco2","Microsoft2");
Informatyk c("Kowalski3",3000,"Cisco3","Microsoft3");
Ksiegowy d("Kowalski4",4000,1);
Ksiegowy e("Kowalski5",5000,0);
lista[0]=a;
lista[1]=b;
lista[2]=c;
lista[3]=d;
lista[4]=e;
Informatyk *ab = new Informatyk("Kowalski1",1000,"Cisco1","Microsoft1");
Informatyk *ac = new Informatyk("Kowalski2",2000,"Cisco2","Microsoft2");
Informatyk *ad = new Informatyk("Kowalski3",3000,"Cisco3","Microsoft3");
Ksiegowy *ae = new Ksiegowy("Kowalski4",3000,1);
Ksiegowy *af = new Ksiegowy("Kowalski5",3000,0);
lista_wsk[0]=*ab;
lista_wsk[1]=*ac;
lista_wsk[2]=*ad;
lista_wsk[3]=*ae;
lista_wsk[4]=*af;
for(int i;i<5;i++)
{
lista[i].info();
cout<<endl;
}
cout<<endl;
// for(int i;i<5;i++)
// {
// lista_wsk[i].info();
// }
return 0;
}
OK, and here are my questions:
I had to create an array filled with objects of the base class "Pracownik".
Second, I had to create an array of pointers to objects of class "Pracownik".
(I hope those first two steps are done correctly.)
Next, I had to put three objects of class Informatyk and two of class Ksiegowy into the array.
So I created 5 objects manually and added them to the array like this: array[0]=a;. I guess this is still good.
Next, I had to create similar objects with new and add them to the array of pointers. So I created the array with new and the pointers to the objects with new. (I hope that's correct too.)
And FINALLY:
I had to call info() on the objects added to the arrays.
This is my main question: if my array is of type "Pracownik" and I want to use the info() function from the subclasses, how should I do that? And how will the compiler know whether it should use info() from Ksiegowy or Informatyk when I print that information in the "for" loop?
In an array of Pracownik, the elements are of type Pracownik. Any information about the objects being of a subclass of Pracownik are lost when you copy the elements into the array.
This is called object slicing and leads to the fact that there is no way to invoke Informatyk::info() on these objects.
If you want to call methods of a subclass, you have to prevent object slicing by storing pointers or references in the array.
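A tiny illustration of the difference, using stripped-down stand-ins for the classes in the question (the name() member and the two helper functions are made up for this sketch):

```cpp
#include <string>

struct Pracownik {
    virtual ~Pracownik() = default;
    virtual std::string name() const { return "Pracownik"; }
};

struct Informatyk : Pracownik {
    std::string name() const override { return "Informatyk"; }
};

// Copying into an array of base objects keeps only the base part.
std::string via_copy() {
    Informatyk d;
    Pracownik lista[1];
    lista[0] = d;  // sliced: the element is still a plain Pracownik
    return lista[0].name();
}

// Storing a pointer keeps the whole object and its dynamic type.
std::string via_pointer() {
    Informatyk d;
    Pracownik* lista_wsk[1] = { &d };
    return lista_wsk[0]->name();
}
```

Here via_copy() returns "Pracownik" while via_pointer() returns "Informatyk", even though both started from the same Informatyk object.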
As Oswald says in his answer,
Pracownik * lista_wsk = new Pracownik[10];
allocates an array of 10 Pracownik objects. This is probably not what you want. With polymorphism involved, we usually want to deal with pointers or references. Hence, you'd want an array of Pracownik * pointers. Since you already know at compile time that it will have 10 members, there is no need for a dynamic allocation here. I think you meant to write
Pracownik * lista_wsk[10];
instead. Now we don't put objects but pointers to objects into the array. For example:
lista_wsk[2] = new Informatyk("Kowalski3", 3000, "Cisco3", "Microsoft3");
And then, once every slot holds a valid pointer, we can iterate over the items like so:
for (unsigned i = 0; i < 10; ++i)
std::cout << lista_wsk[i]->getNazwisko() << std::endl;
As you have already discovered, it is impossible to call a member function of a subclass on a superclass object. It would be possible to figure out the actual type at run time yourself by means of a cast.
for (unsigned i = 0; i < 10; ++i)
if (Informatyk * info_ptr = dynamic_cast<Informatyk *>(lista_wsk[i]))
info_ptr->info();
dynamic_cast returns a pointer to the target class if this is possible or a nullptr (which evaluates to false, hence the conditional) otherwise. Note however that this is considered very poor style. It is better to use virtual functions. Therefore, add
virtual void
info()
{
// Do what is appropriate to do for a plain Pracownik.
// Maybe leave this function empty.
}
to the superclass and again to the subclass
virtual void
info() // override
{
// Do what is appropriate to do for an Informatyk.
}
The function in the subclass with the same signature is said to override the function inherited from the superclass. Since the function is marked as virtual, the compiler will generate additional code to figure out at run-time what version of the function to call.
If you are using C++11 or later, you can make the override explicit by placing the keyword override after the function's parameter list, as shown above (uncomment the override). I recommend using it to avoid bugs that arise from accidental misspellings or other typos.
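Pulling the pieces together, a compact version of the whole idea could look like this. It is a sketch, not a drop-in replacement for the question's code: info() takes a std::ostream& here (an assumption made so the output can be captured), Informatyk carries only one certificate, smart pointers replace raw new, and describe_all is a made-up helper.

```cpp
#include <array>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

class Pracownik {
    std::string nazwisko;
    int pensja;
public:
    Pracownik(std::string n = "", int p = 0) : nazwisko(std::move(n)), pensja(p) {}
    virtual ~Pracownik() = default;  // needed when deleting through a base pointer
    virtual void info(std::ostream& out) const {
        out << "Nazwisko pracownika: " << nazwisko << '\n'
            << "Pensja pracownika: " << pensja << '\n';
    }
};

class Informatyk : public Pracownik {
    std::string certyfikat_Cisco;
public:
    Informatyk(std::string n, int p, std::string cisco)
        : Pracownik(std::move(n), p), certyfikat_Cisco(std::move(cisco)) {}
    void info(std::ostream& out) const override {
        Pracownik::info(out);  // reuse the common part
        out << "Certyfikat Cisco: " << certyfikat_Cisco << '\n';
    }
};

class Ksiegowy : public Pracownik {
    bool audytor;
public:
    Ksiegowy(std::string n, int p, bool a) : Pracownik(std::move(n), p), audytor(a) {}
    void info(std::ostream& out) const override {
        Pracownik::info(out);
        out << "Audytor: " << (audytor ? "Tak" : "Nie") << '\n';
    }
};

// Fills an array of base-class pointers and lets virtual dispatch
// pick the right info() for each element.
std::string describe_all() {
    std::array<std::unique_ptr<Pracownik>, 2> lista_wsk;
    lista_wsk[0] = std::make_unique<Informatyk>("Kowalski1", 1000, "Cisco1");
    lista_wsk[1] = std::make_unique<Ksiegowy>("Kowalski4", 4000, true);
    std::ostringstream out;
    for (const auto& p : lista_wsk) p->info(out);
    return out.str();
}
```

The loop never asks which subclass it is holding; the override chosen at run time does that work, which is the answer to the "how will the compiler know" part of the question.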

C++ Access memory which isn't part of the object itself

It sounds weird, I guess, but I'm creating some low-level code for a hardware device. Depending on specific conditions, I need to allocate more space than the actual struct needs, store information there, and pass the address of the object itself to the caller.
When the user deallocates such an object, I need to read this information before I actually deallocate the object.
At the moment, I'm using simple pointer operations to get the addresses (either of the class or of the extra space). However, I thought it would be more understandable if I did the pointer arithmetic in member functions of an internal (!) type. The allocator, which deals with the addresses, is the only one that knows about this internal type. In other words, the type which is returned to the user is a different one.
The following example shows what I mean:
struct foo
{
int& get_x() { return reinterpret_cast<int*>(this)[-2]; }
int& get_y() { return reinterpret_cast<int*>(this)[-1]; }
// actual members of foo
enum { size = sizeof(int) * 2 };
};
int main()
{
char* p = new char[sizeof(foo) + foo::size];
foo* bar = reinterpret_cast<foo*>(p + foo::size);
bar->get_x() = 1;
bar->get_y() = 2;
std::cout << bar->get_x() << ", " << bar->get_y() << std::endl;
delete[] p; // p was allocated with new[]
return 0;
}
Is it reasonable to do it this way?
It seems needlessly complex to do it this way. If I were to implement something like this, I would take a simpler approach:
#pragma pack(push, 1)
struct A
{
int x, y;
};
struct B
{
int z;
};
#pragma pack(pop)
// allocate space for A and B:
unsigned char* data = new unsigned char[sizeof(A) + sizeof(B)];
A* a = reinterpret_cast<A*>(data);
B* b = reinterpret_cast<B*>(a + 1);
a->x = 0;
a->y = 1;
b->z = 2;
// When deallocating:
unsigned char* address = reinterpret_cast<unsigned char*>(a);
delete [] address;
This implementation is subtly different, but much easier (in my opinion) to understand, and doesn't rely on intimate knowledge of what is or is not present. If all instances of the pointers are allocated as unsigned char and deleted as such, the user doesn't need to keep track of specific memory addresses aside from the first address in the block.
A very straightforward idea: wrap your extra logic in a factory that creates the objects for you and deletes them the smart way.
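As a rough sketch of what such a factory might look like: the extra data lives in a header placed directly in front of the object, and only the factory functions know about it. Everything here (hidden_header, header_size, create_foo, header_of, destroy_foo, and the value member) is a hypothetical name chosen for the illustration, and the pointer arithmetic on raw storage is the kind of thing you should check against your compiler and language standard before relying on it.

```cpp
#include <cstddef>
#include <new>

struct foo {
    int value = 0;  // stand-in for the "actual members of foo"
};

// Extra data stored in front of each foo; the fields are made up.
struct hidden_header {
    int x;
    int y;
};

// Header size rounded up so the foo that follows it is correctly aligned.
constexpr std::size_t header_size =
    (sizeof(hidden_header) + alignof(foo) - 1) / alignof(foo) * alignof(foo);

static_assert(alignof(hidden_header) <= alignof(std::max_align_t),
              "operator new must return storage aligned for the header");

// Allocates header + object in one block and returns only the object.
foo* create_foo(int x, int y) {
    void* raw = ::operator new(header_size + sizeof(foo));
    new (raw) hidden_header{x, y};
    return new (static_cast<unsigned char*>(raw) + header_size) foo();
}

// Steps back from the object to the header in front of it.
hidden_header& header_of(foo* f) {
    unsigned char* raw = reinterpret_cast<unsigned char*>(f) - header_size;
    return *reinterpret_cast<hidden_header*>(raw);
}

// Reads the header (if needed), then destroys and frees the whole block.
void destroy_foo(foo* f) {
    if (!f) return;
    hidden_header* h = &header_of(f);
    f->~foo();
    h->~hidden_header();
    ::operator delete(h);  // h is the start of the original allocation
}
```

The user only ever sees foo* values from create_foo and hands them back to destroy_foo; the negative-offset arithmetic stays in one place instead of being spread over member functions of foo.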
You can also create the struct as a much larger object, and use a factory function to return an instance of the struct, but cast to a much smaller object that would basically act as the object's handle. For instance:
struct foo_handle {};
struct foo
{
int a;
int b;
int c;
int d;
int& get_a() { return a; }
int& get_b() { return b; }
//...more member methods
//static factory functions to create and delete objects
static foo_handle* create_obj() { return reinterpret_cast<foo_handle*>(new foo()); } // foo* does not convert to foo_handle* implicitly
static void delete_obj(foo_handle* obj) { delete reinterpret_cast<foo*>(obj); }
};
void another_function(foo_handle* masked_obj)
{
foo* ptr = reinterpret_cast<foo*>(masked_obj);
//... do something with ptr
}
int main()
{
foo_handle* handle = foo::create_obj();
another_function(handle);
foo::delete_obj(handle);
return 0;
}
Now you can hide any extra space you may need in your foo struct, and to the user of your factory functions, the actual value of the pointer doesn't matter since they are mainly working with an opaque handle to the object.
It seems your question is a candidate for the popular struct hack.
Is the "struct hack" technically undefined behavior?

C++ Class design - easily init / build objects

Using C++ I built a Class that has many setter functions, as well as various functions that may be called in a row during runtime.
So I end up with code that looks like:
A* a = new A();
a->setA();
a->setB();
a->setC();
...
a->doA();
a->doB();
Not that this is bad, but I don't like typing "a->" over and over again.
So I rewrote my class definitions to look like:
class A{
public:
A();
virtual ~A();
A* setA();
A* setB();
A* setC();
A* doA();
A* doB();
// other functions
private:
// vars
};
So then I could init my class like: (method 1)
A* a = new A();
a->setA()->setB()->setC();
...
a->doA()->doB();
(which I prefer as it is easier to write)
To give a more precise implementation of this you can see my SDL Sprite C++ Class I wrote at http://ken-soft.com/?p=234
Everything seems to work just fine. However, I would be interested in any feedback to this approach.
I have noticed one problem. If I init my class like this: (method 2)
A a = A();
a.setA()->setB()->setC();
...
a.doA()->doB();
Then I have various memory issues and sometimes things don't work as they should (you can see this by changing how I init all Sprite objects in main.cpp of my Sprite demo).
Is that normal? Or should the behavior be the same?
Edit: the setters are primarily to make my life easier in initialization. My main question is: why do method 1 and method 2 behave differently for me?
Edit: Here's an example getter and setter:
Sprite* Sprite::setSpeed(int i) {
speed = i;
return this;
}
int Sprite::getSpeed() {
return speed;
}
One note unrelated to your question: the statement A a = A(); probably isn't doing what you expect. In C++, objects aren't reference types that default to null, so this statement is almost never needed. You probably want just A a;
A a creates a new instance of A, but the = A() part formally copy-initializes it from a temporary, default-constructed A. Compilers are allowed to elide that copy (and C++17 requires them to), but where the copy is not elided, A's copy constructor runs. If you had written just A a; it would simply have created a new instance of A using the default constructor.
If you don't explicitly implement your own copy constructor for a class, the compiler will create one for you. The compiler-generated copy constructor copies the object member by member; this means that if you have any pointers, it copies the pointer values, not the data they point to.
So, where the copy is not elided, that line creates a new instance of A, constructs a temporary instance of A with the default constructor, copies the temporary A into the new A, and then destroys the temporary A. If the temporary A acquires resources in its constructor and releases them in its destructor, you could run into issues where your object tries to use data that has already been deallocated, which is undefined behavior.
Take this code for example:
struct A {
A() {
myData = new int;
std::cout << "Allocated int at " << myData << std::endl;
}
~A() {
delete myData;
std::cout << "Deallocated int at " << myData << std::endl;
}
int* myData;
};
A a = A();
std::cout << "a.myData points to " << a.myData << std::endl;
If the copy is not elided, the output will look something like:
Allocated int at 0x9FB7128
Deallocated int at 0x9FB7128
a.myData points to 0x9FB7128
As you can see, a.myData is pointing to an address that has already been deallocated. If you attempt to use the data it points to, you could be accessing completely invalid data, or even the data of some other object that took its place in memory. And then once your a goes out of scope, it will attempt to delete the data a second time, which will cause more problems.
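One way to make copies of such a class safe, whether or not the compiler elides them, is to give each copy its own data. The following is only a sketch of that idea using std::unique_ptr (C++14 for std::make_unique), not the asker's Sprite class:

```cpp
#include <memory>

struct A {
    A() : myData(std::make_unique<int>(0)) {}

    // Deep copy: the new object gets its own int instead of sharing the pointer.
    A(const A& other) : myData(std::make_unique<int>(*other.myData)) {}

    A& operator=(const A& other) {
        if (this != &other) *myData = *other.myData;  // copy the value, keep our own storage
        return *this;
    }

    // No destructor needed: unique_ptr frees the int exactly once.
    std::unique_ptr<int> myData;
};
```

With this shape, A a = A(); and A b = a; both leave every object pointing at memory it owns, so nothing is freed twice and nothing dangles.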
What you have implemented there is called a fluent interface. I have mostly encountered them in scripting languages, but there is no reason you can't use one in C++.
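In C++ a fluent interface is often written with setters that return a reference to the object rather than a pointer, so the same chain works on stack objects and, via ->, on heap objects. A minimal sketch, with a made-up Sprite that is much simpler than the one in the question (setPosition and move are hypothetical members):

```cpp
class Sprite {
    int speed_ = 0;
    int x_ = 0;
    int y_ = 0;
public:
    // Each setter returns *this by reference so calls can be chained with '.'.
    Sprite& setSpeed(int s) { speed_ = s; return *this; }
    Sprite& setPosition(int x, int y) { x_ = x; y_ = y; return *this; }
    Sprite& move() { x_ += speed_; return *this; }  // hypothetical action

    int getSpeed() const { return speed_; }
    int getX() const { return x_; }
    int getY() const { return y_; }
};
```

Usage then reads Sprite s; s.setSpeed(3).setPosition(1, 2).move(); for a stack object, and p->setSpeed(5).move(); for a Sprite* p.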
If you really, really hate calling lots of set functions one after the other, then you may enjoy the following code. For most people, this is way overkill for the 'problem' it solves.
This code demonstrates how to create a Set function that accepts any number of setter objects in any order.
#include "stdafx.h"
#include <stdarg.h>
// Base class for all setter classes
class cSetterBase
{
public:
// the type of setter
int myType;
// a union capable of storing any kind of data that will be required
union data_t {
int i;
float f;
double d;
} myValue;
cSetterBase( int t ) : myType( t ) {}
};
// Base class for float valued setter functions
class cSetterFloatBase : public cSetterBase
{
public:
cSetterFloatBase( int t, float v ) :
cSetterBase( t )
{ myValue.f = v; }
};
// A couple of sample setter classes with float values
class cSetterA : public cSetterFloatBase
{
public:
cSetterA( float v ) :
cSetterFloatBase( 1, v )
{}
};
// A couple of sample setter classes with float values
class cSetterB : public cSetterFloatBase
{
public:
cSetterB( float v ) :
cSetterFloatBase( 2, v )
{}
};
// this is the class that actually does something useful
class cUseful
{
public:
// set attributes using any number of setter classes of any kind
void Set( int count, ... );
// the attributes to be set
float A, B;
};
// set attributes using any setter classes
void cUseful::Set( int count, ... )
{
va_list vl;
va_start( vl, count );
for( int kv=0; kv < count; kv++ ) {
cSetterBase s = va_arg( vl, cSetterBase );
cSetterBase * ps = &s;
switch( ps->myType ) {
case 1:
A = ((cSetterA*)ps)->myValue.f; break;
case 2:
B = ((cSetterB*)ps)->myValue.f; break;
}
}
va_end(vl);
}
int _tmain(int argc, _TCHAR* argv[])
{
cUseful U;
U.Set( 2, cSetterB( 47.5 ), cSetterA( 23 ) );
printf("A = %f B = %f\n",U.A, U.B );
return 0;
}
You may consider the ConstrOpt paradigm. I first heard about this when reading the XML-RPC C/C++ lib documentation here: http://xmlrpc-c.sourceforge.net/doc/libxmlrpc++.html#constropt
Basically the idea is similar to yours, but the "ConstrOpt" paradigm uses a subclass of the one you want to instantiate. This subclass is then instantiated on the stack with default options and then the relevant parameters are set with the "reference-chain" in the same way as you do.
The constructor of the real class then uses the constrOpt class as the only constructor parameter.
This is not the most efficient solution, but can help to get a clear and safe API design.
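To give a feel for the shape of this, here is a small sketch written from the description above, not taken from the xmlrpc-c sources. Client, Options and every member name are hypothetical, and the option holder is a nested class here for brevity.

```cpp
#include <string>
#include <utility>

class Client {
public:
    // Option holder with defaults; each setter returns *this for chaining.
    class Options {
        friend class Client;
        std::string host_ = "localhost";
        int port_ = 80;
        int timeoutMs_ = 1000;
    public:
        Options& host(std::string h) { host_ = std::move(h); return *this; }
        Options& port(int p) { port_ = p; return *this; }
        Options& timeoutMs(int t) { timeoutMs_ = t; return *this; }
    };

    // The real class takes the option object as its only constructor parameter.
    explicit Client(const Options& opt)
        : host_(opt.host_), port_(opt.port_), timeoutMs_(opt.timeoutMs_) {}

    const std::string& host() const { return host_; }
    int port() const { return port_; }
    int timeoutMs() const { return timeoutMs_; }

private:
    std::string host_;
    int port_;
    int timeoutMs_;
};
```

A call site then looks like Client cl(Client::Options().port(8080).host("example.org")); and any option that is not mentioned keeps its default. Compared with chaining setters on the object itself, the object is fully configured by the time its constructor finishes.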