I'm wondering if it's possible to address a class with a dynamic base address as opposed to a static one. The basic idea is as follows:
Have an object A defined like so:
class A
{
//member variables
...
//non-virtual member functions
...
//virtual methods
virtual void foo(...);
...
};
This class cannot be instantiated as a stack object, and does not have a standard new operator.
Instead, the object has a placement new that takes the base address and offset into memory from the base address, and uses this to compute the absolute address for construction.
What I want to do is have the object be accessed in code as follows:
A* clsA=new(base,offset) A();
...
clsA->foo( ... );
where
(char*)clsA == (char*)(base+offset)
but additionally be able to do the following
base+=4;
...
clsA->foo( ... );
and still have this:
(char*)clsA == (char*)(base+offset)
hold true.
I have no idea if this is possible in C++. I do know it can be done in ASM (x86/amd64), but I'd like a solution with as much portability as possible (which I recognize will still be next to none, but its better than nothing). Any suggestions?
EDIT:
So I guess I wasn't too clear about the problem I have. The idea is to allow for a dynamic object (one allocated on the heap) to be moved around in memory. Normally this wouldn't be a problem, but since the objects cannot be instantiated through stack memory then the only way to access the object is through a pointer to the memory underlying the object. When the array moves (in the example, by four bytes), the pointers loaned from the array are no longer valid and need to be updated. As this process would not only be lengthy, but consume more memory than I wish to, I would like to be able to have the classes recalculate their memory addresses on access, rather than storing a relocation table with an entry for each loaned pointer.
Some assembly that might represent this concept would be
;rax stores clsA
mov rcx, rax
shr rcx, 32
mov DWORD PTR[rdx], rax
lea rax, rdx+rcx
push rax
call foo
EDIT 2: As it so turns out, there is also a MSVC modifier for this exact type of behavior. __based declares a pointer relative to another pointer, so the underlying memory can be moved around and the pointer remains valid. The article is here.
If I understand you, what you need is pointers that are always relative to another pointer. This is easy, but usually a bad idea.
template<class T>
struct reloc_ptr {
template<class U>
reloc_ptr(char*& base, int offset, U&& v)
:base(&base), offset(offset)
{new(get())T(std::forward<U>(v));}
T* get() const {return (T*)(*base+offset);}
T* operator->() const {return get();}
void destroy() {get()->~T();}
private:
char** base;
int offset;
};
and then normal usage is like this
int main() {
char* base = new char[1000];
//construct
reloc_ptr<A> clsA(base,4, "HI"); //I construct the A with the param "HI"
//test
clsA->foo();
//cleanup
clsA.destroy();
delete[] base;
}
Note that the reloc_ptr is relative to the char* base that it was constructed from, so be very very careful with that. If the char* base variable is in a function, and the function ends, then all pointers constructed with that char* base variable become invalid and using them will make the program do strange things.
http://ideone.com/4DNUGQ
Something very similar to what you are asking is the placement new syntax of C++.
Placement new is used when you do not want operator new to allocate memory (you have pre-allocated it and you want to place the object there), but you do want the object to be constructed.
In your example you might allocate a class A on a particular location in memory in order to then apply method foo() to it.
void* memoryBuffer;
...
unsigned int i = 0;
for (uint i= 0; i < N; i += offsetSize){
//initialize a given a specific location in memoryBuffer
A* a = new(memoryBuffer + i)A(...);
//apply foo on that specific memory location
a->foo();
}
This is what appens for instance using Eigen to wrap a matrix object around a preallocated numerical buffer.
If you're talking about addressing the object of type A * at an offset position, an object someone else moved somehow, you'd use something like (clsA+offset)->foo( ... ), where offset is in sizeof A units, ie. your earlier 4 would actually be a 1 because you probably mean to move to the "next" adjacent object in memory.
If you're talking about actually moving the object, you can placement-new at the new address in memory, then call the copy constructor (or memcpy for a POD), but this is pretty iffy depending on what your object holds, if it insists on ownership of pointers and when you call placement-delete that memory gets freed you're pretty SOL.
I can't emphasize this enough, tell us what you're trying to accomplish because there might be a better way, this is pretty against the proverbial grain right here.
Related
Where exactly is the 'this' pointer stored in memory? Is it allocated on the stack, in the heap, or in the data segment?
#include <iostream>
using namespace std;
class ClassA
{
int a, b;
public:
void add()
{
a = 10;
b = 20;
cout << a << b << endl;
}
};
int main()
{
ClassA obj;
obj.add();
return 0;
}
In the above code I am calling the member function add() and the receiver object is passed implicitly as the 'this' pointer. Where is this stored in memory?
The easiest way is to think of this as being a hidden extra argument that is always passed automatically.
So, a fictional method like:
size_t String::length(void) const
{
return strlen(m_string);
}
is actually more like this under the hood:
size_t String__length(const String *this)
{
return strlen(this->m_string);
}
and a call like:
{
String example("hello");
cout << example.length();
}
becomes something like:
cout << String__length(&example);
Note that the above transformation is simplified, hopefully to make my point a bit clearer. No need to fill up the comments with "whaaa, where's the marshalling for method overloading, huh?"-type objection, please. :)
That transforms the question into "where are arguments stored?", and the answer is of course "it depends". :)
It's often on the stack, but it could be in registers too, or any other mechanism that the compiler considers is good for the target architecture.
Other answers have done a very good job explaining how a typical compiler implements this (by passing it as an implicit first parameter to the function).
I think it's also useful to see what the C++ ISO spec explicitly says about this. According to the C++03 ISO spec, ยง9.3.2/1:
In the body of a nonstatic (9.3) member function, the keyword this is a non-lvalue expression whose value is the address of the object for which the function is called.
It's important to note that this is not a variable - it's an expression, much in the same way that the expression 1 + 2 * 3 is an expression. The value of this expression is permitted to be stored pretty much anywhere. The compiler might put it on the stack and pass it as an implicit parameter to a function, or it might put it in a register, and it conceivably could put it in the heap or in the data segment. The C++ specification deliberately gives the implementation some flexibility here.
I think that the "language-lawyer" answer is "this is completely implementation-defined, and moreover this is technically not a pointer, but an expression that evaluates to a pointer."
Hope this helps!
this is usually passed as a hidden argument of the method (the only difference throughout different calling conventions is how).
If you call:
myClass.Method(1, 2, 3);
Compiler generates the following code:
Method(&myClass, 1, 2, 3);
Where the first parameter is actually the pointer to this.
Let's check the following code:
class MyClass
{
private:
int a;
public:
void __stdcall Method(int i)
{
a = i;
}
};
int main(int argc, char *argv[])
{
MyClass myClass;
myClass.Method(5);
return 0;
}
By using __stdcall I forced the compiler to pass all parameters through the stack. If you then start the debugger and inspect the assembly code, you'll find something like the following:
myClass.Method(5);
00AA31BE push 5
00AA31C0 lea eax,[myClass]
00AA31C3 push eax
00AA31C4 call MyClass::Method (0AA1447h)
As you see, the parameter of the method is passed through the stack, then address of myClass is loaded to eax register and again pushed on the stack. In other words, this is treated as a regular parameter of this method.
this is an rvalue (you cannot take its address), so it doesn't
(necessarily) occupy memory at all. Depending on the compiler
and the target architecture, it will often be in a register: i0
on a Sparc, ECX with MSVC on Intel, etc. When the optimizer is
active, it can even move around. (I've seen it in different
registers with MSVC).
this behaves mostly like a function argument, and as such will be stored on the stack or - if the binary calling conventions of the architecture allow that - in a register.
this isn't stored at a well-defined location! The object that it points to is stored somewhere, and has a well-defined address, but the address itself does not have a specific home address. It is communicated around in the program. Not only that, but there can be many copies of that pointer.
In the following imaginary init function, the object registers itself to receive events and timer callbacks (using imaginary event source objects). So after the registration, there are two additional copies of this:
void foo_listener::init()
{
g_usb_events.register(this); // register to receive USB events
g_timer.register(this, 5); // register for a 5 second timer
}
I a function activation chain, there will also be multiple copies of the this pointer. Suppose we have an object obj and call its foo function. That function calls the same object's bar function, and bar calls another function called update. Each function activation level has the this pointer. It's stored in a machine register, or in a memory location in the stack frame of the function activation.
How can one store an arbitrary number of dynamically created instances (of different types) in an STL container so that the memory can be freed later only having the container?
It should work like this:
std::vector< void * > vec;
vec.push_back( new int(10) );
vec.push_back( new float(1.) );
Now, if vec goes out of scope the pointers to the instances are destructed, but the memory for int and float are not freed. And obviously I can't do:
for( auto i : vec )
delete *i;
because void* is not a pointer-to-object type.
You could object and argue that this isn't a good idea because one can not access the elements of the vector. That is right, and I don't access them myself. The NVIDIA driver will access them as it just needs addresses (void* is fine) for it parameters to a kernel call.
I guess the problem here is that it can be different types that are stored. Wondering if a union can do the trick in case one wants to pass this as arguments to a cuda kernel.
The kernel takes parameters of different types and are collected by traversing an expression tree (expression templates) where you don't know the type beforehand. So upon visiting the leaf you store the parameter. it can only be void*, and built-in types int, float, etc.
The vector can be deleted right after the kernel launch (the launch is async but the driver copies the parameters first then continues host thread). 2nd question: Each argument is passed a void* to the driver. Regardless if its an int, float or even void*. So I guess one can allocate more memory than needed. I think the union thingy might be worth looking at.
You can use one vector of each type you want to support.
But while that's a great improvement on the idea of a vector of void*, it still quite smelly.
This does sound like an XY-problem: you have a problem X, you envision a solution Y, but Y obviously doesn't work without some kind of ingenious adaption, so ask about Y. When instead, should be asking about the real problem X. Which is?
Ok, FWIW
I would recomend using an in-place new combined with malloc. what this would do is allow you store the pointers created as void* in your vector. Then when the vector is finished with it can simply be iterated over and free() called.
I.E.
void* ptr = malloc(sizeof(int));
int* myNiceInt = new (ptr) int(myNiceValue);
vec.push_back(ptr);
//at some point later iterate over vec
free( *iter );
I believe that this will be the simplest solution to the problem in this case but do accept that this is a "C" like answer.
Just sayin' ;)
"NVIDIA driver" sounds like a C interface anyway, so malloc is not a crazy suggestion.
Another alternative, as you suggest, is to use a union... But you will also need to store "tags" in a parallel vector to record the actual type of the element, so that you can cast to the appropriate type on deletion.
In short, you must cast void * to an appropriate type before you can delete it. The "C++ way" would be to have a base class with a virtual destructor; you can call delete on that when it points to an instance of any sub-class. But if the library you are using has already determined the types, then that is not an option.
If you have control over the types you can create an abstract base class for them. Give that class a virtual destructor. Then you can have your std::vector<Object*> and iterate over it to delete anything which inherits from Object.
You probably need to have a second std::vector<void*> with pointers to the actual values, since the Object* probably hits the vtable first. A second virtual function like virtual void* ptr() { return &value; } would be useful here. And if it needs the size of the object you can add that too.
You could use the template pattern like this:
template<typename T>
class ObjVal : public Object {
public:
T val;
virtual void* ptr() { return &this->val; }
virtual size_t size() { return sizeof(this->val); }
};
Then you only have to type it once.
This is not particularly memory efficient because every Object picks up at least one extra pointer for the vtable.
However, new int(3) is not very memory efficient either because your allocator probably uses more than 4 bytes for it. Adding that vtable pointer may be essentially free.
Use more than 1 vector. Keep the vector<void*> around to talk to the API (which I'm guessing requires a contiguous block of void*s of non-uniform types?), but also have a vector<std::unique_ptr<int>> and vector<std::unique_ptr<float>> which owns the data. When you create a new int, push a unique_ptr that owns the memory into your vector of ints, and then stick it on the API-compatible vector as a void*. Bundle the three vectors into one struct so that their lifetimes are tied together if possible (and it probably is).
You can also do this with a single vector that stores the ownership of the variables. A vector of roll-your-own RAII pseudo-unique_ptr, or shared_ptr with custom destroyers, or a vector of std::function<void()> that your "Bundle"ing struct's destroyer invokes, or what have you. But I wouldn't recommend these options.
This is one topic that is not making sense to me. Pointers to data members of a class can be declared and used. However,
What is the logic that supports the idea ? [I am not talking about the syntax, but the logic of this feature]
Also,if i understand this correctly, this would imply an indefinite/variable amount of memory being allocated at the pointer initialization as any number of objects may exist at that time. Also, new objects may be created and destroyed during runtime. Hence, in effect, a single statement will cause a large number of allocations/deallocations. This seems rather counter-intuitive as compared to the rest of the language. Or is my understanding of this incorrect ? I dont think there is any other single initialization statement that will implicitly affect program execution as widely as this.
Lastly, how is memory allocated to these pointers ? Where are they placed with respect to objects ? Is it possible to see physical memory addresses of these pointers ?
A single declaration of a pointer to a data member, creates pointers for every object of that class.
No, it does not. A pointer to a member is a special object that is very different from a pointer; it is a lot more similar to an offset. Given a pointer to an object of the class and a member pointer, you'd be able to get the value of a member; without the pointer to an object of a class a pointer to a member is useless.
Questions 2 and 3 stem from the same basic misunderstanding.
A single declaration of a pointer to a data member, creates pointers for every object of that class.
No. It creates a pointer to a member (which can be though of as an offset from the base of object)
You can then use it with a pointer to an object to get that member.
struct S
{
int x;
int y;
};
int S::* ptrToMember = &S::x; // Pointer to a member.
S obj;
int* ptrToData = &obj.x; // Pointer to object
// that happens to be a member
Notice in creating the pointer to a member we don't use an object (we just use the type information). So this pointer is an offset into the class to get a specific member.
You can access the data member via a pointer or object.
(obj.*ptrToMember) = 5; // Assign via pointer to member (requires an object)
*ptrToData = 6; // Assign via pointer already points at object.
Why does this happen as opposed to a single pointer being created to point to only one specific instance of the class ?
That is called a pointer.
A similar but parallel concept (see above).
What is the logic that supports the idea ?
Silly example:
void addOneToMember(S& obj, int S::* member) { (obj.*member) += 1; }
void addOneToX(S& obj) { addOneToMember(obj, &Obj::x);}
void addOneToY(S& obj) { addOneToMember(obj, &Obj::y);}
Also,if i understand this correctly, this would imply an indefinite/variable amount of memory being allocated at the pointer initialization as any number of objects may exist at that time.
No. Because a pointer to a member is just an offset into an object. You still need the actual object to get the value.
Lastly, how is memory allocated to these pointers ?
Same way as other objects. There is nothing special about them in terms of layout.
But the actual layout is implementation defined. So there is no way of answering this question without referring to the compiler. But it is really of no use to you.
Is it possible to see physical memory addresses of these pointers ?
Sure. They are just like other objects.
// Not that this will provide anything meaningful.
std::cout.write(reinterpret_cast<char*>(&ptrToMember), sizeof(ptrToMember));
// 1) take the address of the pointer to member.
// 2) cast to char* as required by write.
// 3) pass the size of the pointer to member
// and you should write the values printed out.
// Note the values may be non printable but I am sure you can work with that
// Also note the meaning is not useful to you as it is compiler dependent.
Internally, for a class that does not have virtual bases, a pointer-to-member-data just has to hold the offset of the data member from the start of an object of that type. With virtual bases it's a bit more complicated, because the location of the virtual base can change, depending on the type of the most-derived object. Regardless, there's a small amount of data involved, and when you dereference the pointer-to-data-member the compiler generates appropriate code to access it.
I am building a class hierarchy that uses SSE intrinsics functions and thus some of the members of the class need to be 16-byte aligned. For stack instances I can use __declspec(align(#)), like so:
typedef __declspec(align(16)) float Vector[4];
class MyClass{
...
private:
Vector v;
};
Now, since __declspec(align(#)) is a compilation directive, the following code may result in an unaligned instance of Vector on the heap:
MyClass *myclass = new MyClass;
This too, I know I can easily solve by overloading the new and delete operators to use _aligned_malloc and _aligned_free accordingly. Like so:
//inside MyClass:
public:
void* operator new (size_t size) throw (std::bad_alloc){
void * p = _aligned_malloc(size, 16);
if (p == 0) throw std::bad_alloc()
return p;
}
void operator delete (void *p){
MyClass* pc = static_cast<MyClass*>(p);
_aligned_free(p);
}
...
So far so good.. but here is my problem. Consider the following code:
class NotMyClass{ //Not my code, which I have little or no influence over
...
MyClass myclass;
...
};
int main(){
...
NotMyClass *nmc = new NotMyClass;
...
}
Since the myclass instance of MyClass is created statically on a dynamic instance of NotMyClass, myclass WILL be 16-byte aligned relatively to the beginning of nmc because of Vector's __declspec(align(16)) directive. But this is worthless, since nmc is dynamically allocated on the heap with NotMyClass's new operator, which doesn't nesessarily ensure (and definitely probably NOT) 16-byte alignment.
So far, I can only think of 2 approaches on how to deal with this problem:
Preventing MyClass users from being able to compile the following code:
MyClass myclass;
meaning, instances of MyClass can only be created dynamically, using the new operator, thus ensuring that all instances of MyClass are truly dynamically allocatted with MyClass's overloaded new. I have consulted on another thread on how to accomplish this and got a few great answers:
C++, preventing class instance from being created on the stack (during compiltaion)
Revert from having Vector members in my Class and only have pointers to Vector as members, which I will allocate and deallocate using _aligned_malloc and _aligned_free in the ctor and dtor respectively. This methos seems crude and prone to error, since I am not the only programmer writing these Classes (MyClass derives from a Base class and many of these classes use SSE).
However, since both solutions have been frowned upon in my team, I come to you for suggestions of a different solution.
If you're set against heap allocation, another idea is to over allocate on the stack and manually align (manual alignment is discussed in this SO post). The idea is to allocate byte data (unsigned char) with a size guaranteed to contain an aligned region of the necessary size (+15), then find the aligned position by rounding down from the most-shifted region (x+15 - (x+15) % 16, or x+15 & ~0x0F). I posted a working example of this approach with vector operations on codepad (for g++ -O2 -msse2). Here are the important bits:
class MyClass{
...
unsigned char dPtr[sizeof(float)*4+15]; //over-allocated data
float* vPtr; //float ptr to be aligned
public:
MyClass(void) :
vPtr( reinterpret_cast<float*>(
(reinterpret_cast<uintptr_t>(dPtr)+15) & ~ 0x0F
) )
{}
...
};
...
The constructor ensures that vPtr is aligned (note the order of members in the class declaration is important).
This approach works (heap/stack allocation of containing classes is irrelevant to alignment), is portabl-ish (I think most compilers provide a pointer sized uint uintptr_t), and will not leak memory. But its not particularly safe (being sure to keep the aligned pointer valid under copy, etc), wastes (nearly) as much memory as it uses, and some may find the reinterpret_casts distasteful.
The risks of aligned operation/unaligned data problems could be mostly eliminated by encapsulating this logic in a Vector object, thereby controlling access to the aligned pointer and ensuring that it gets aligned at construction and stays valid.
You can use "placement new."
void* operator new(size_t, void* p) { return p; }
int main() {
void* p = aligned_alloc(sizeof(NotMyClass));
NotMyClass* nmc = new (p) NotMyClass;
// ...
nmc->~NotMyClass();
aligned_free(p);
}
Of course you need to take care when destroying the object, by calling the destructor and then releasing the space. You can't just call delete. You could use shared_ptr<> with a different function to deal with that automatically; it depends if the overhead of dealing with a shared_ptr (or other wrapper of the pointer) is a problem to you.
The upcoming C++0x standard proposes facilities for dealing with raw memory. They are already incorporated in VC++2010 (within the tr1 namespace).
std::tr1::alignment_of // get the alignment
std::tr1::aligned_storage // get aligned storage of required dimension
Those are types, you can use them like so:
static const floatalign = std::tr1::alignment_of<float>::value; // demo only
typedef std::tr1::aligned_storage<sizeof(float)*4, 16>::type raw_vector;
// first parameter is size, second is desired alignment
Then you can declare your class:
class MyClass
{
public:
private:
raw_vector mVector; // alignment guaranteed
};
Finally, you need some cast to manipulate it (it's raw memory until now):
float* MyClass::AccessVector()
{
return reinterpret_cast<float*>((void*)&mVector));
}
I am storing objects in a buffer. Now I know that I cannot make assumptions about the memory layout of the object.
If I know the overall size of the object, is it acceptible to create a pointer to this memory and call functions on it?
e.g. say I have the following class:
[int,int,int,int,char,padding*3bytes,unsigned short int*]
1)
if I know this class to be of size 24 and I know the address of where it starts in memory
whilst it is not safe to assume the memory layout is it acceptible to cast this to a pointer and call functions on this object which access these members?
(Does c++ know by some magic the correct position of a member?)
2)
If this is not safe/ok, is there any other way other than using a constructor which takes all of the arguments and pulling each argument out of the buffer one at a time?
Edit: Changed title to make it more appropriate to what I am asking.
You can create a constructor that takes all the members and assigns them, then use placement new.
class Foo
{
int a;int b;int c;int d;char e;unsigned short int*f;
public:
Foo(int A,int B,int C,int D,char E,unsigned short int*F) : a(A), b(B), c(C), d(D), e(E), f(F) {}
};
...
char *buf = new char[sizeof(Foo)]; //pre-allocated buffer
Foo *f = new (buf) Foo(a,b,c,d,e,f);
This has the advantage that even the v-table will be generated correctly. Note, however, if you are using this for serialization, the unsigned short int pointer is not going to point at anything useful when you deserialize it, unless you are very careful to use some sort of method to convert pointers into offsets and then back again.
Individual methods on a this pointer are statically linked and are simply a direct call to the function with this being the first parameter before the explicit parameters.
Member variables are referenced using an offset from the this pointer. If an object is laid out like this:
0: vtable
4: a
8: b
12: c
etc...
a will be accessed by dereferencing this + 4 bytes.
Basically what you are proposing doing is reading in a bunch of (hopefully not random) bytes, casting them to a known object, and then calling a class method on that object. It might actually work, because those bytes are going to end up in the "this" pointer in that class method. But you're taking a real chance on things not being where the compiled code expects it to be. And unlike Java or C#, there is no real "runtime" to catch these sorts of problems, so at best you'll get a core dump, and at worse you'll get corrupted memory.
It sounds like you want a C++ version of Java's serialization/deserialization. There is probably a library out there to do that.
Non-virtual function calls are linked directly just like a C function. The object (this) pointer is passed as the first argument. No knowledge of the object layout is required to call the function.
It sounds like you're not storing the objects themselves in a buffer, but rather the data from which they're comprised.
If this data is in memory in the order the fields are defined within your class (with proper padding for the platform) and your type is a POD, then you can memcpy the data from the buffer to a pointer to your type (or possibly cast it, but beware, there are some platform-specific gotchas with casts to pointers of different types).
If your class is not a POD, then the in-memory layout of fields is not guaranteed, and you shouldn't rely on any observed ordering, as it is allowed to change on each recompile.
You can, however, initialize a non-POD with data from a POD.
As far as the addresses where non-virtual functions are located: they are statically linked at compile time to some location within your code segment that is the same for every instance of your type. Note that there is no "runtime" involved. When you write code like this:
class Foo{
int a;
int b;
public:
void DoSomething(int x);
};
void Foo::DoSomething(int x){a = x * 2; b = x + a;}
int main(){
Foo f;
f.DoSomething(42);
return 0;
}
the compiler generates code that does something like this:
function main:
allocate 8 bytes on stack for object "f"
call default initializer for class "Foo" (does nothing in this case)
push argument value 42 onto stack
push pointer to object "f" onto stack
make call to function Foo_i_DoSomething#4 (actual name is usually more complex)
load return value 0 into accumulator register
return to caller
function Foo_i_DoSomething#4 (located elsewhere in the code segment)
load "x" value from stack (pushed on by caller)
multiply by 2
load "this" pointer from stack (pushed on by caller)
calculate offset of field "a" within a Foo object
add calculated offset to this pointer, loaded in step 3
store product, calculated in step 2, to offset calculated in step 5
load "x" value from stack, again
load "this" pointer from stack, again
calculate offset of field "a" within a Foo object, again
add calculated offset to this pointer, loaded in step 8
load "a" value stored at offset,
add "a" value, loaded int step 12, to "x" value loaded in step 7
load "this" pointer from stack, again
calculate offset of field "b" within a Foo object
add calculated offset to this pointer, loaded in step 14
store sum, calculated in step 13, to offset calculated in step 16
return to caller
In other words, it would be more or less the same code as if you had written this (specifics, such as name of DoSomething function and method of passing this pointer are up to the compiler):
class Foo{
int a;
int b;
friend void Foo_DoSomething(Foo *f, int x);
};
void Foo_DoSomething(Foo *f, int x){
f->a = x * 2;
f->b = x + f->a;
}
int main(){
Foo f;
Foo_DoSomething(&f, 42);
return 0;
}
A object having POD type, in this case, is already created (Whether or not you call new. Allocating the required storage already suffices), and you can access the members of it, including calling a function on that object. But that will only work if you precisely know the required alignment of T, and the size of T (the buffer may not be smaller than it), and the alignment of all the members of T. Even for a pod type, the compiler is allowed to put padding bytes between members, if it wants. For a non-POD types, you can have the same luck if your type has no virtual functions or base classes, no user defined constructor (of course) and that applies to the base and all its non-static members too.
For all other types, all bets are off. You have to read values out first with a POD, and then initialize a non-POD type with that data.
I am storing objects in a buffer. ... If I know the overall size of the object, is it acceptable to create a pointer to this memory and call functions on it?
This is acceptable to the extent that using casts is acceptable:
#include <iostream>
namespace {
class A {
int i;
int j;
public:
int value()
{
return i + j;
}
};
}
int main()
{
char buffer[] = { 1, 2 };
std::cout << reinterpret_cast<A*>(buffer)->value() << '\n';
}
Casting an object to something like raw memory and back again is actually pretty common, especially in the C world. If you're using a class hierarchy, though, it would make more sense to use pointer to member functions.
say I have the following class: ...
if I know this class to be of size 24 and I know the address of where it starts in memory ...
This is where things get difficult. The size of an object includes the size of its data members (and any data members from any base classes) plus any padding plus any function pointers or implementation-dependent information, minus anything saved from certain size optimizations (empty base class optimization). If the resulting number is 0 bytes, then the object is required to take at least one byte in memory. These things are a combination of language issues and common requirements that most CPUs have regarding memory accesses. Trying to get things to work properly can be a real pain.
If you just allocate an object and cast to and from raw memory you can ignore these issues. But if you copy an object's internals to a buffer of some sort, then they rear their head pretty quickly. The code above relies on a few general rules about alignment (i.e., I happen to know that class A will have the same alignment restrictions as ints, and thus the array can be safely cast to an A; but I couldn't necessarily guarantee the same if I were casting parts of the array to A's and parts to other classes with other data members).
Oh, and when copying objects you need to make sure you're properly handling pointers.
You may also be interested in things like Google's Protocol Buffers or Facebook's Thrift.
Yes these issues are difficult. And, yes, some programming languages sweep them under the rug. But there's an awful lot of stuff getting swept under the rug:
In Sun's HotSpot JVM, object storage is aligned to the nearest 64-bit boundary. On top of this, every object has a 2-word header in memory. The JVM's word size is usually the platform's native pointer size. (An object consisting of only a 32-bit int and a 64-bit double -- 96 bits of data -- will require) two words for the object header, one word for the int, two words for the double. That's 5 words: 160 bits. Because of the alignment, this object will occupy 192 bits of memory.
This is because Sun is relying on a relatively simple tactic for memory alignment issues (on an imaginary processor, a char may be allowed to exist at any memory location, an int at any location that is divisible by 4, and a double may need to be allocated only on memory locations that are divisible by 32 -- but the most restrictive alignment requirement also satisfies every other alignment requirement, so Sun is aligning everything according to the most restrictive location).
Another tactic for memory alignment can reclaim some of that space.
If the class contains no virtual functions (and therefore class instances have no vptr), and if you make correct assumptions about the way in which the class' member data is laid out in memory, then doing what you're suggesting might work (but might not be portable).
Yes, another way (more idiomatic but not much safer ... you still need to know how the class lays out its data) would be to use the so-called "placement operator new" and a default constructor.
That depends upon what you mean by "safe". Any time you cast a memory address into a point in this way you are bypassing the type safety features provided by the compiler, and taking the responsibility to yourself. If, as Chris implies, you make an incorrect assumption about the memory layout, or compiler implementation details, then you will get unexpected results and loose portability.
Since you are concerned about the "safety" of this programming style it is likely worth your while to investigate portable and type-safe methods such as pre-existing libraries, or writing a constructor or assignment operator for the purpose.