It sounds weird, I guess, but I'm creating some low-level code for a hardware device. Dependend on specific conditions I need to allocate more space than the actual struct needs, store informations there and pass the address of the object itself to the caller.
When the user is deallocating such an object, I need to read these informations before I actually deallocate the object.
At the moment, I'm using simple pointer operations to get the addresses (either of the class or the extra space). However, I tought it would be more understandable if I do the pointer arithmetics in member functions of an internal (!) type. The allocator, which is dealing with the addresses, is the only one who know's about this internal type. In other words, the type which is returned to the user is a different one.
The following example show's what I mean:
struct foo
{
int& get_x() { return reinterpret_cast<int*>(this)[-2]; }
int& get_y() { return reinterpret_cast<int*>(this)[-1]; }
// actual members of foo
enum { size = sizeof(int) * 2 };
};
int main()
{
char* p = new char[sizeof(foo) + foo::size];
foo* bar = reinterpret_cast<foo*>(p + foo::size);
bar->get_x() = 1;
bar->get_y() = 2;
std::cout << bar->get_x() << ", " << bar->get_y() << std::endl;
delete p;
return 0;
}
Is it arguable to do it in that way?
It seems needlessly complex to do it this way. If I were to implement something like this, I would take a simpler approach:
#pragma pack(push, 1)
struct A
{
int x, y;
};
struct B
{
int z;
};
#pragma pack(pop)
// allocate space for A and B:
unsigned char* data = new char[sizeof(A) + sizeof(B)];
A* a = reinterpret_cast<A*>(data);
B* b = reinterpret_cast<B*>(a + 1);
a->x = 0;
a->y = 1;
b->z = 2;
// When deallocating:
unsigned char* address = reinterpret_cast<unsigned char*>(a);
delete [] address;
This implementation is subtly different, but much easier (in my opinion) to understand, and doesn't rely on intimate knowledge of what is or is not present. If all instances of the pointers are allocated as unsigned char and deleted as such, the user doesn't need to keep track of specific memory addresses aside from the first address in the block.
The very straightforward idea: wrap your extra logic in a factory which will create objects for you and delete them smart way.
You can also create the struct as a much larger object, and use a factory function to return an instance of the struct, but cast to a much smaller object that would basically act as the object's handle. For instance:
struct foo_handle {};
struct foo
{
int a;
int b;
int c;
int d;
int& get_a() { return a; }
int& get_b() { return b; }
//...more member methods
//static factory functions to create and delete objects
static foo_handle* create_obj() { return new foo(); }
static void delete_obj(foo_handle* obj) { delete reinterpret_cast<foo*>(obj); }
};
void another_function(foo_handle* masked_obj)
{
foo* ptr = reinterpret_cast<foo*>(masked_obj);
//... do something with ptr
}
int main()
{
foo_handle* handle = foo::create_obj();
another_function(handle);
foo::delete_obj(handle);
return 0;
}
Now you can hide any extra space you may need in your foo struct, and to the user of your factory functions, the actual value of the pointer doesn't matter since they are mainly working with an opaque handle to the object.
It seems your question is a candidate for the popular struct hack.
Is the "struct hack" technically undefined behavior?
Related
I have a class foo that manages data using small buffer optimization (SBO).
When size < 16, the data is held locally (in buffer), otherwise it is stored on the heap, with reserved holding the allocated space.
class foo {
static const int sbo_size = 16;
long size = 0;
char *ptr;
union {
char buffer[sbo_size];
long reserved;
};
public:
foo()
{
for (int i = 0; i < sbo_size; ++i)
buffer[i] = 0;
}
void clone(const foo &f)
{
// release 'ptr' if necessary
if (f.size < sbo_size)
{
memcpy(this, &f, sizeof(foo));
ptr = buffer;
} else
{
// handle non-sbo case
}
}
};
Question about clone():
With the SBO case, it may not be clear for the compiler that union::buffer will be used.
is it correct to use memcpy and set ptr accordingly?
If you can use C++17, I would side-step any potential type-punning problems by using std::variant in place of a union.
Although this uses a small amount of storage internally to keep track of the current type it contains, it's probably a win overall as your ptr variable can disappear (although that should be inside your union anyway).
It's also typesafe, which a union is not (because std::get will throw if the variant doesn't contain the desired type) and will keep track of the type of data it contains simply by assigning to it.
The resulting class fragment might look something like this (no doubt this code can be improved):
class foo
{
private:
static const size_t sbo_size = 16;
using small_buf = std::array <char, sbo_size>;
size_t size = 0;
std::variant <small_buf, char *> buf = { };
public:
void clone (const foo &f)
{
char **bufptr = std::get_if <char *> (&buf);
if (bufptr)
delete [] *bufptr;
size = f.size;
if (size < sbo_size)
buf = std::get <small_buf> (f.buf);
else
{
buf = new char [size];
std::memcpy (std::get <char *> (buf), std::get <char *> (f.buf), size);
}
}
};
Notes:
You will see that I've used std::array instead of a C-style array because std:array has lots of nice features that C-style arrays do not
Why clone and not a copy constructor?
if you want foo to have an empty state (after being default constructed, say), then you can look into the strangely named std::monostate.
For raw storage, std::byte is probably to be preferred over char.
Fully worked example here.
Edit: To answer the question as posed, I am no language lawyer but it seems to me that, inside clone, the compiler has no clue what the active member of f might be as it has, in effect, been parachuted in from outer space.
In such circumstances, I would expect compiler writers to play it safe and set the active member of the union to "don't know" until some concrete information comes along. But (and it's a big but), I wouldn't like to bet my shirt on that. It's a complex job and compiler writers do make mistakes.
So, in a spirit of sharing, here's a slightly modified version of your original code which fixes that. I've also moved ptr inside your union since it clearly belongs there:
class foo {
static const int sbo_size = 16;
long size = 0;
union {
std::array <char, sbo_size> buffer; // changing this
char *ptr;
long reserved;
};
public:
foo()
{
for (int i = 0; i < sbo_size; ++i)
buffer[i] = 0;
}
void clone(const foo &f)
{
// release 'ptr' if necessary
if (f.size < sbo_size)
{
buffer = f.buffer; // lets me do this
ptr = buffer.data ();
} else
{
// handle non-sbo case
}
}
};
So you can see, by using std::array for buffer (rather than one of those hacky C-style arrays), you can directly assign to it (rather than having to resort to memcpy) and the compiler will then make that the active member of your union and you should be safe.
In conclusion, the question is actually rather meaningless since one shouldn't (ever) need to write code like that. But no doubt someone will immediately come up with something that proves me wrong.
Guys I have a function like this (this is given and should not be modified).
void readData(int &ID, void*&data, bool &mybool) {
if(mybool)
{
std::string a = "bla";
std::string* ptrToString = &a;
data = ptrToString;
}
else
{
int b = 9;
int* ptrToint = &b;
data = ptrToint;
}
}
So I want to use this function in a loop and save the returned function parameters in a vector (for each iteration).
To do so, I wrote the following struct:
template<typename T>
struct dataStruct {
int id;
T** data; //I first has void** data, but would not be better to
// have the type? instead of converting myData back
// to void* ?
bool mybool;
};
my main.cpp then look like this:
int main()
{
void* myData = nullptr;
std::vector<dataStruct> vec; // this line also doesn't compile. it need the typename
bool bb = false;
for(int id = 1 ; id < 5; id++) {
if (id%2) { bb = true; }
readData(id, myData, bb); //after this line myData point to a string
vec.push_back(id, &myData<?>); //how can I set the template param to be the type myData point to?
}
}
Or is there a better way to do that without template? I used c++11 (I can't use c++14)
The function that you say cannot be modified, i.e. readData() is the one that should alert you!
It causes Undefined Behavior, since the pointers are set to local variables, which means that when the function terminates, then these pointers will be dangling pointers.
Let us leave aside the shenanigans of the readData function for now under the assumption that it was just for the sake of the example (and does not produce UB in your real use case).
You cannot directly store values with different (static) types in a std::vector. Notably, dataStruct<int> and dataStruct<std::string> are completely unrelated types, you cannot store them in the same vector as-is.
Your problem boils down to "I have data that is given to me in a type-unsafe manner and want to eventually get type-safe access to it". The solution to this is to create a data structure that your type-unsafe data is parsed into. For example, it seems that you inteded for your example data to have structure in the sense that there are pairs of int and std::string (note that your id%2 is not doing that because the else is missing and the bool is never set to false again, but I guess you wanted it to alternate).
So let's turn that bunch of void* into structured data:
std::pair<int, std::string> readPair(int pairIndex)
{
void* ptr;
std::pair<int, std::string> ret;
// Copying data here.
readData(2 * pairIndex + 1, ptr, false);
ret.first = *reinterpret_cast<int*>(ptr);
readData(2 * pairIndex + 2, ptr, true);
ret.second = *reinterpret_cast<std::string*>(ptr);
}
void main()
{
std::vector<std::pair<int, std::string>> parsedData;
parsedData.push_back(readPair(0));
parsedData.push_back(readPair(1));
}
Demo
(I removed the references from the readData() signature for brevity - you get the same effect by storing the temporary expressions in variables.)
Generally speaking: Whatever relation between id and the expected data type is should just be turned into the data structure - otherwise you can only reason about the type of your data entries when you know both the current ID and this relation, which is exactly something you should encapsulate in a data structure.
Your readData isn't a useful function. Any attempt at using what it produces gives undefined behavior.
Yes, it's possible to do roughly what you're asking for without a template. To do it meaningfully, you have a couple of choices. The "old school" way would be to store the data in a tagged union:
struct tagged_data {
enum { T_INT, T_STR } tag;
union {
int x;
char *y;
} data;
};
This lets you store either a string or an int, and you set the tag to tell you which one a particular tagged_data item contains. Then (crucially) when you store a string into it, you dynamically allocate the data it points at, so it will remain valid until you explicitly free the data.
Unfortunately, (at least if memory serves) C++11 doesn't support storing non-POD types in a union, so if you went this route, you'd have to use a char * as above, not an actual std::string.
One way to remove (most of) those limitations is to use an inheritance-based model:
class Data {
public:
virtual ~Data() { }
};
class StringData : public Data {
std::string content;
public:
StringData(std::string const &init) : content(init) {}
};
class IntData : public Data {
int content;
public:
IntData(std::string const &init) : content(init) {}
};
This is somewhat incomplete, but I think probably enough to give the general idea--you'd have an array (or vector) of pointers to the base class. To insert data, you'd create a StringData or IntData object (allocating it dynamically) and then store its address into the collection of Data *. When you need to get one back, you use dynamic_cast (among other things) to figure out which one it started as, and get back to that type safely. All somewhat ugly, but it does work.
Even with C++11, you can use a template-based solution. For example, Boost::variant, can do this job quite nicely. This will provide an overloaded constructor and value semantics, so you could do something like:
boost::variant<int, std::string> some_object("input string");
In other words, it's pretty what you'd get if you spent the time and effort necessary to finish the inheritance-based code outlined above--except that it's dramatically cleaner, since it gets rid of the requirement to store a pointer to the base class, use dynamic_cast to retrieve an object of the correct type, and so on. In short, it's the right solution to the problem (until/unless you can upgrade to a newer compiler, and use std::variant instead).
Apart from the problem in given code described in comments/replies.
I am trying to answer your question
vec.push_back(id, &myData<?>); //how can I set the template param to be the type myData point to?
Before that you need to modify vec definition as following
vector<dataStruct<void>> vec;
Now you can simple push element in vector
vec.push_back({id, &mydata, bb});
i have tried to modify your code so that it can work
#include<iostream>
#include<vector>
using namespace std;
template<typename T>
struct dataStruct
{
int id;
T** data;
bool mybool;
};
void readData(int &ID, void*& data, bool& mybool)
{
if (mybool)
{
data = new string("bla");
}
else
{
int b = 0;
data = &b;
}
}
int main ()
{
void* mydata = nullptr;
vector<dataStruct<void>> vec;
bool bb = false;
for (int id = 0; id < 5; id++)
{
if (id%2) bb = true;
readData(id, mydata, bb);
vec.push_back({id, &mydata, bb});
}
}
Using C++ I built a Class that has many setter functions, as well as various functions that may be called in a row during runtime.
So I end up with code that looks like:
A* a = new A();
a->setA();
a->setB();
a->setC();
...
a->doA();
a->doB();
Not, that this is bad, but I don't like typing "a->" over and over again.
So I rewrote my class definitions to look like:
class A{
public:
A();
virtual ~A();
A* setA();
A* setB();
A* setC();
A* doA();
A* doB();
// other functions
private:
// vars
};
So then I could init my class like: (method 1)
A* a = new A();
a->setA()->setB()->setC();
...
a->doA()->doB();
(which I prefer as it is easier to write)
To give a more precise implementation of this you can see my SDL Sprite C++ Class I wrote at http://ken-soft.com/?p=234
Everything seems to work just fine. However, I would be interested in any feedback to this approach.
I have noticed One problem. If i init My class like: (method 2)
A a = A();
a.setA()->setB()->setC();
...
a.doA()->doB();
Then I have various memory issues and sometimes things don't work as they should (You can see this by changing how i init all Sprite objects in main.cpp of my Sprite Demo).
Is that normal? Or should the behavior be the same?
Edit the setters are primarily to make my life easier in initialization. My main question is way method 1 and method 2 behave different for me?
Edit: Here's an example getter and setter:
Sprite* Sprite::setSpeed(int i) {
speed = i;
return this;
}
int Sprite::getSpeed() {
return speed;
}
One note unrelated to your question, the statement A a = A(); probably isn't doing what you expect. In C++, objects aren't reference types that default to null, so this statement is almost never correct. You probably want just A a;
A a creates a new instance of A, but the = A() part invokes A's copy constructor with a temporary default constructed A. If you had done just A a; it would have just created a new instance of A using the default constructor.
If you don't explicitly implement your own copy constructor for a class, the compiler will create one for you. The compiler created copy constructor will just make a carbon copy of the other object's data; this means that if you have any pointers, it won't copy the data pointed to.
So, essentially, that line is creating a new instance of A, then constructing another temporary instance of A with the default constructor, then copying the temporary A to the new A, then destructing the temporary A. If the temporary A is acquiring resources in it's constructor and de-allocating them in it's destructor, you could run into issues where your object is trying to use data that has already been deallocated, which is undefined behavior.
Take this code for example:
struct A {
A() {
myData = new int;
std::cout << "Allocated int at " << myData << std::endl;
}
~A() {
delete myData;
std::cout << "Deallocated int at " << myData << std::endl;
}
int* myData;
};
A a = A();
cout << "a.myData points to " << a.myData << std::endl;
The output will look something like:
Allocated int at 0x9FB7128
Deallocated int at 0x9FB7128
a.myData points to 0x9FB7128
As you can see, a.myData is pointing to an address that has already been deallocated. If you attempt to use the data it points to, you could be accessing completely invalid data, or even the data of some other object that took it's place in memory. And then once your a goes out of scope, it will attempt to delete the data a second time, which will cause more problems.
What you have implemented there is called fluent interface. I have mostly encountered them in scripting languages, but there is no reason you can't use in C++.
If you really, really hate calling lots of set functions, one after the other, then you may enjoy the following code, For most people, this is way overkill for the 'problem' solved.
This code demonstrates how to create a set function that can accept set classes of any number in any order.
#include "stdafx.h"
#include <stdarg.h>
// Base class for all setter classes
class cSetterBase
{
public:
// the type of setter
int myType;
// a union capable of storing any kind of data that will be required
union data_t {
int i;
float f;
double d;
} myValue;
cSetterBase( int t ) : myType( t ) {}
};
// Base class for float valued setter functions
class cSetterFloatBase : public cSetterBase
{
public:
cSetterFloatBase( int t, float v ) :
cSetterBase( t )
{ myValue.f = v; }
};
// A couple of sample setter classes with float values
class cSetterA : public cSetterFloatBase
{
public:
cSetterA( float v ) :
cSetterFloatBase( 1, v )
{}
};
// A couple of sample setter classes with float values
class cSetterB : public cSetterFloatBase
{
public:
cSetterB( float v ) :
cSetterFloatBase( 2, v )
{}
};
// this is the class that actually does something useful
class cUseful
{
public:
// set attributes using any number of setter classes of any kind
void Set( int count, ... );
// the attributes to be set
float A, B;
};
// set attributes using any setter classes
void cUseful::Set( int count, ... )
{
va_list vl;
va_start( vl, count );
for( int kv=0; kv < count; kv++ ) {
cSetterBase s = va_arg( vl, cSetterBase );
cSetterBase * ps = &s;
switch( ps->myType ) {
case 1:
A = ((cSetterA*)ps)->myValue.f; break;
case 2:
B = ((cSetterB*)ps)->myValue.f; break;
}
}
va_end(vl);
}
int _tmain(int argc, _TCHAR* argv[])
{
cUseful U;
U.Set( 2, cSetterB( 47.5 ), cSetterA( 23 ) );
printf("A = %f B = %f\n",U.A, U.B );
return 0;
}
You may consider the ConstrOpt paradigm. I first heard about this when reading the XML-RPC C/C++ lib documentation here: http://xmlrpc-c.sourceforge.net/doc/libxmlrpc++.html#constropt
Basically the idea is similar to yours, but the "ConstrOpt" paradigm uses a subclass of the one you want to instantiate. This subclass is then instantiated on the stack with default options and then the relevant parameters are set with the "reference-chain" in the same way as you do.
The constructor of the real class then uses the constrOpt class as the only constructor parameter.
This is not the most efficient solution, but can help to get a clear and safe API design.
I'm having this problem for quite a long time - I have fixed sized 2D array as a class member.
class myClass
{
public:
void getpointeM(...??????...);
double * retpointM();
private:
double M[3][3];
};
int main()
{
myClass moo;
double *A[3][3];
moo.getpointM( A ); ???
A = moo.retpointM(); ???
}
I'd like to pass pointer to M matrix outside. It's probably very simple, but I just can't find the proper combination of & and * etc.
Thanks for help.
double *A[3][3]; is a 2-dimensional array of double *s. You want double (*A)[3][3];
.
Then, note that A and *A and **A all have the same address, just different types.
Making a typedef can simplify things:
typedef double d3x3[3][3];
This being C++, you should pass the variable by reference, not pointer:
void getpointeM( d3x3 &matrix );
Now you don't need to use parens in type names, and the compiler makes sure you're passing an array of the correct size.
Your intent is not clear. What is getpointeM supposed to do? Return a pointer to the internal matrix (through the parameter), or return a copy of the matrix?
To return a pointer, you can do this
// Pointer-based version
...
void getpointeM(double (**p)[3][3]) { *p = &M; }
...
int main() {
double (*A)[3][3];
moo.getpointM(&A);
}
// Reference-based version
...
void getpointeM(double (*&p)[3][3]) { p = &M; }
...
int main() {
double (*A)[3][3];
moo.getpointM(A);
}
For retpointM the declaration would look as follows
...
double (*retpointM())[3][3] { return &M; }
...
int main() {
double (*A)[3][3];
A = moo.retpointM();
}
This is rather difficult to read though. You can make it look a lot clearer if you use a typedef-name for your array type
typedef double M3x3[3][3];
In that case the above examples will transform into
// Pointer-based version
...
void getpointeM(M3x3 **p) { *p = &M; }
...
int main() {
M3x3 *A;
moo.getpointM(&A);
}
// Reference-based version
...
void getpointeM(M3x3 *&p) { p = &M; }
...
int main() {
double (*A)[3][3];
moo.getpointM(A);
}
// retpointM
...
M3x3 *retpointM() { return &M; }
...
int main() {
M3x3 *A;
A = moo.retpointM();
}
The short answer is that you can get a double * to the start of the array:
public:
double * getMatrix() { return &M[0][0]; }
Outside the class, though, you can't really trivially turn the double * into another 2D array directly, at least not in a pattern that I've seen used.
You could create a 2D array in main, though (double A[3][3]) and pass that in to a getPoint method, which could copy the values into the passed-in array. That would give you a copy, which might be what you want (instead of the original, modifiable, data). Downside is that you have to copy it, of course.
class myClass
{
public:
void getpointeM(double *A[3][3])
{
//Initialize array here
}
private:
double M[3][3];
};
int main()
{
myClass moo;
double *A[3][3];
moo.getpointM( A );
}
You may want to take the code in your main function which works with the 2D array of doubles, and move that into myClass as a member function. Not only would you not have to deal with the difficulty of passing a pointer for that 2D array, but code external to your class would no longer need to know the details of how your class implements A, since they would now be calling a function in myClass and letting that do the work. If, say, you later decided to allow variable dimensions of A and chose to replace the array with a vector of vectors, you wouldn't need to rewrite any calling code in order for it to work.
In your main() function:
double *A[3][3];
creates a 3x3 array of double* (or pointers to doubles). In other words, 9 x 32-bit contiguous words of memory to store 9 memory pointers.
There's no need to make a copy of this array in main() unless the class is going to be destroyed, and you still want to access this information. Instead, you can simply return a pointer to the start of this member array.
If you only want to return a pointer to an internal class member, you only really need a single pointer value in main():
double *A;
But, if you're passing this pointer to a function and you need the function to update its value, you need a double pointer (which will allow the function to return the real pointer value back to the caller:
double **A;
And inside getpointM() you can simply point A to the internal member (M):
getpointeM(double** A)
{
// Updated types to make the assignment compatible
// This code will make the return argument (A) point to the
// memory location (&) of the start of the 2-dimensional array
// (M[0][0]).
*A = &(M[0][0]);
}
Make M public instead of private. Since you want to allow access to M through a pointer, M is not encapsulated anyway.
struct myClass {
myClass() {
std::fill_n(&M[0][0], sizeof M / sizeof M[0][0], 0.0);
}
double M[3][3];
};
int main() {
myClass moo;
double (*A)[3] = moo.M;
double (&R)[3][3] = moo.M;
for (int r = 0; r != 3; ++r) {
for (int c = 0; c != 3; ++c) {
cout << A[r][c] << R[r][c] << ' ';
// notice A[r][c] and R[r][c] are the exact same object
// I'm using both to show you can use A and R identically
}
}
return 0;
}
I would, in general, prefer R over A because the all of the lengths are fixed (A could potentially point to a double[10][3] if that was a requirement) and the reference will usually lead to clearer code.
When I allocate a single object, this code works fine. When I try to add array syntax, it segfaults. Why is this? My goal here is to hide from the outside world the fact that class c is using b objects internally. I have posted the program to codepad for you to play with.
#include <iostream>
using namespace std;
// file 1
class a
{
public:
virtual void m() { }
virtual ~a() { }
};
// file 2
class b : public a
{
int x;
public:
void m() { cout << "b!\n"; }
};
// file 3
class c : public a
{
a *s;
public:
// PROBLEMATIC SECTION
c() { s = new b[10]; } // s = new b;
void m() { for(int i = 0; i < 10; i++) s[i].m(); } // s->m();
~c() { delete[] s; } // delete s;
// END PROBLEMATIC SECTION
};
// file 4
int main(void)
{
c o;
o.m();
return 0;
}
Creating an array of 10 b's with new and then assigning its address to an a* is just asking for trouble.
Do not treat arrays polymorphically.
For more information see ARR39-CPP. Do not treat arrays polymorphically, at section 06. Arrays and the STL (ARR) of the CERT C++ Secure Coding Standard.
One problem is that the expression s[i] uses pointer arithmetic to compute the address of the desired object. Since s is defined as pointer to a, the result is correct for an array of as and incorrect for an array of bs. The dynamic binding provided by inheritance only works for methods, nothing else (e.g., no virtual data members, no virtual sizeof). Thus when calling the method s[i].m() the this pointer gets set to what would be the ith a object in the array. But since in actuality the array is one of bs, it ends up (sometimes) pointing to somewhere in the middle of an object and you get a segfault (probably when the program tries to access the object's vtable). You might be able to rectify the problem by virtualizing and overloading operator[](). (I Didn't think it through to see if it will actually work, though.)
Another problem is the delete in the destructor, for similar reasons. You might be able to virtualize and overload it too. (Again, just a random idea that popped into my head. Might not work.)
Of course, casting (as suggested by others) will work too.
You have an array of type "b" not of type "a" and you are assigning it to a pointer of type a. Polymorphism doesn't transfer to dynamic arrays.
a* s
to a
b* s
and you will see this start working.
Only not-yet-bound pointers can be treated polymorphically. Think about it
a* s = new B(); // works
//a* is a holder for an address
a* s = new B[10]
//a* is a holder for an address
//at that address are a contiguos block of 10 B objects like so
// [B0][B2]...[B10] (memory layout)
when you iterate over the array using s, think about what is used
s[i]
//s[i] uses the ith B object from memory. Its of type B. It has no polymorphism.
// Thats why you use the . notation to call m() not the -> notation
before you converted to an array you just had
a* s = new B();
s->m();
s here is just an address, its not a static object like s[i]. Just the address s can still be dynamically bound. What is at s? Who knows? Something at an address s.
See Ari's great answer below for more information about why this also doesn't make sense in terms of how C style arrays are layed out.
Each instance of B contains Both X data member and the "vptr" (pointer to the virtual table).
Each instance of A contain only the "vptr"
Thus , sizeof(a) != sizeof(b).
Now when you do this thing : "S = new b[10]" you lay on the memory 10 instances of b in a raw , S (which has the type of a*) is getting the beginning that raw of data.
in C::m() method , you tell the compiler to iterate over an array of "a" (because s has the type of a*) , BUT , s is actualy pointing to an array of "b". So when you call s[i] what the compiler actualy do is "s + i * sizeof(a)" , the compiler jumps in units of "a" instead of units of "b" and since a and b doesn't have the same size , you get a lot of mambojumbo.
I have figured out a workaround based on your answers. It allows me to hide the implementation specifics using a layer of indirection. It also allows me to mix and match objects in my array. Thanks!
#include <iostream>
using namespace std;
// file 1
class a
{
public:
virtual void m() { }
virtual ~a() { }
};
// file 2
class b : public a
{
int x;
public:
void m() { cout << "b!\n"; }
};
// file 3
class c : public a
{
a **s;
public:
// PROBLEMATIC SECTION
c() { s = new a* [10]; for(int i = 0; i < 10; i++) s[i] = new b(); }
void m() { for(int i = 0; i < 10; i++) s[i]->m(); }
~c() { for(int i = 0; i < 10; i++) delete s[i]; delete[] s; }
// END PROBLEMATIC SECTION
};
// file 4
int main(void)
{
c o;
o.m();
return 0;
}