I have tried union...
struct foo
{
union
{
struct // 2 bytes
{
char var0_1;
};
struct // 5 bytes
{
char var1_1;
int var1_2;
};
};
};
Problem: Unions do what I want, except they will always take the size of the biggest datatype. In my case I need struct foo to have some initialization that allows me to tell it which structure to chose of the two (if that is even legal) as shown below.
So after that, I tried class template overloading...
template <bool B>
class foo { }
template <>
class foo<true>
{
char var1;
}
template <>
class foo<false>
{
char var0;
int var1;
}
Problem: I was really happy with templates and the fact that I could use the same variable name on the char and int, but the problem was the syntax. Because the classes are created on compile-time, the template boolean variable needed to be a hardcoded constant, but in my case the boolean needs to be user-defined on runtime.
So I need something of the two "worlds." How can I achieve what I'm trying to do?
!!NOTE: The foo class/struct will later be inherited, therefore as already mentioned, size of foo is of utmost importance.
EDIT#1::
Application:
Basically this will be used to read/write (using a pointer as an interface) a specific data buffer and also allow me to create (new instance of the class/struct) the same data buffer. The variables you see above specify the length. If it's a smaller data buffer, the length is written in a char/byte. If it's a bigger data buffer, the first char/byte is null as a flag, and the int specifies the length instead. After the length it's obvious that the actual data follows, hence why the inheritance. Size of class is of the utmost importance. I need to have my cake and eat it too.
A layer of abstraction.
struct my_buffer_view{
std::size_t size()const{
if (!m_ptr)return 0;
if (*m_ptr)return *m_ptr;
return *reinterpret_cast<std::uint32_t const*>(m_ptr+1);
}
std::uint8_t const* data() const{
if(!m_ptr)return nullptr;
if(*m_ptr)return m_ptr+1;
return m_ptr+5;
}
std::uint8_t const* begin()const{return data();}
std::uint8_t const* end()const{return data()+size();}
my_buffer_view(std::uint_t const*ptr=nullptr):m_ptr(ptr){}
my_buffer_view(my_buffer_view const&)=default;
my_buffer_view& operator=(my_buffer_view const&)=default;
private:
std::uint8_t const* m_ptr=0;
};
No variable sized data anywhere. I coukd have used a union for size etx:
struct header{
std::uint8_t short_len;
union {
struct{
std::uint32_t long_len;
std::uint8_t long_buf[1];
}
struct {
std::short_buf[1];
}
} body;
};
but I just did pointer arithmetic instead.
Writing such a buffer to a bytestream is another problem entirely.
Your solution does not make sense. Think about your solution: you could define two independents classes: fooTrue and fooFalse with corresponding members exactly with the same result.
Probably, you are looking for a different solution as inheritance. For example, your fooTrue is baseFoo and your fooFalse is derivedFoo with as the previous one as base and extends it with another int member.
In this case, you have the polymorphism as the method to work in runtime.
You can't have your cake and eat it too.
The point of templates is that the specialisation happens at compile time. At run time, the size of the class is fixed (albeit, in an implementation-defined manner).
If you want the choice to be made at run time, then you can't use a mechanism that determines size at compile-time. You will need a mechanism that accommodates both possible needs. Practically, that means your base class will need to be large enough to contain all required members - which is essentially what is happening with your union based solution.
In reference to your "!!NOTE". What you are doing qualifies as premature optimisation. You are trying to optimise size of a base class without any evidence (e.g. measurement of memory usage) that the size difference is actually significant for your application (e.g. that it causes your application to exhaust available memory). The fact that something will be a base for a number of other classes is not sufficient, on its own, to worry about its size.
I am trying to copy part of a large structure and I was hoping I could use pointer arithmetic to copy chunks of it at a time. So if I have the following stucture
struct {
int field1;
char field2;
myClass field3;
.
.
.
myOtherClass field42;
} myStruct;
struct origionalStruct;
struct *pCopyStruct;
can I use memcpy() to copy part of it using pointer arithmetic?
memcpy(pCopyStruct, &origionalStruct.field1,
(char*)&origionalStuct.field1 - (char*)&origionalStuct.field23);
I know that pointer arithmetic is only valid for arrays, but I was hoping I could get around that by casting everything to (char*).
My answer only holds for c++.
Using memcpy() to copy member variables of objects breaks encapsulation and is not good practice in general. I.e. only do that if you have very good reason. It can work if you are careful, but you are making your program very brittle: You increase the risk that future changes will introduce bugs.
E.g. also see http://flylib.com/books/en/2.123.1.431/1/
It would be best to put the fields you want to copy into a nested struct and just assign that to the corresponding field of the new struct. That would avoid writing, increases greatly readability and - least not last - maintains type-safety. All which memcpy does not provide.
offsetof() or using the addresses of enclosing fields would obviously not work if the copied fields are at the end or beginning of the struct.
struct {
int field1;
struct { char fields } cpy_fields;
} a, b;
a.cpy_fields = b.cpy_fields;
When using gcc, you can enable plan9-extensions and use an anonymous struct, but need a typedef for the inner:
typedef struct { char field1; } Inner;
struct {
int field1;
Inner;
} a, b;
This does not change existing code which can do: a.field2. You can still access the struct as a whole by its typename (provided you only have one instance in the outer struct): a.Inner = b.Inner.
While the first part (anonymous struct) is standard since C99, the latter is part of the plan9-extensions (which are very interesting for its other feature, too). Actually the other feature might provide an even better sulution for your problem. You might have a look at the doc-page and let it settle for a sec or two to get the implications. Still wonder why this feature did not make it into the standard (no extra code, more type-safety as much less casts required).
Yes, you can do:
memcpy(pCopyStruct, &origionalStruct.field1,
(char*)&origionalStuct.field23 - (char*)&origionalStuct.field1);
However, you probably want to use the offsetof() macro found in stddef.h.
memcpy(pCopyStruct, &originalStruct.field1,
offsetof(struct myStruct, field23)-offsetof(struct myStruct, field1));
Where if field1 is at offset 0 (first in the struct) you can omit the subtraction.
For completeness, the offsetof macro can be defined as:
#define offsetof(st, m) ((size_t)(&((st *)0)->m))
I program mainly in C for the embedded world and recently I have been experimenting around with C++ and I have an idea. This question pertains to data transferred over a network.
Currently in C I do something like this contrived example (disregarding packing):
typedef struct {
time_t date;
float value;
} Message1;
typedef union {
char raw[sizeof(Message1)];
Message1 msg;
} Overlay;
int my_func(Message1* ptr)
{
/* do stuff with stuff */
}
Data is placed into Overlay.raw and inspected through msg (regarding endianness of course). Can I do something similar in C++ without using a struct?
class Message1 {
public:
time_t date;
float value;
int my_func() { /* do stuff with stuff */ };
}
typedef union {
char raw[sizeof(Message1)];
Message1 msg;
}
I've done some experiments and from what I can tell it seems to be working so far. However I want to know more details about how C++ aligns stuff in the class. Like, will it break if I put a private section after the public section? What if I use inheritance? Is this a Dumb(tm) thing to do?
You generally want to keep unions simple. None of the construct, copy, assign, or move semantics apply to them; even if members have the functions defined. It's generally not a good idea to use them with complex data types though, since you need to worry about vtables, placement of access modified members, etc... However, POD classes are basically the same as C structs (C++ structs are also essentially the same as classes).
As I understand it, memory layout isn't part of C++ the standard, aside from the order of member variables for POD types. Public, protected, and private variables can be placed in separate memory regions. I think inherited member layouts are also implementation defined. So any code that would depend on layout would be platform/compiler specific. Members are generally laid out in sequential order, but again it's generally not a good idea to depend on layout (multiple inheritance for example). Obviously alignments are still platform/compiler defined as well, but you can control alignment using alignas(T) (C++11).
Also, it's probably just style preference, but it might be better to use the union as an explicit type. instead of a typedef.
union pkt {
char raw[sizeof(Message)]
Message msg;
}
I can't see a good reason to use unions here, at all.
You get no benefit of using a union with a byte array over a cast of the struct pointer to a (char*).
If you want to send a packet you don't need a union to access the data.
typedef struct {
time_t date;
float value;
} Message1;
void sendData(uin8_t *pData, int size)
{
while (size--)
sendByte(*pData++);
}
int main()
{
Message1 myMessage;
sendData( &myMessage, sizeof(myMessage) );
}
Btw. Sending data directly from a structure over a network results regular in problems with padding and/or endianess between different platforms.
I'm having trouble figuring out which is better in C++:
I use a struct to manage clients in a message queue, the struct looks like this:
typedef struct _MsgClient {
int handle;
int message_class;
void *callback;
void *private_data;
int priority;
} MsgClient;
All of these being POD entities.
Now, I have an array of these structs where I store my clients (I use an array for memory constraints, I have to limit fragmentation). So in my class I have something like this:
class Foo
{
private:
MsgClient _clients[32];
public:
Foo()
{
memset(_clients, 0x0, sizeof(_clients));
}
}
Now, I read here and there on SO that using memset is bad in C++, and that I'd rather use a constructor for my structure.
I figured something like this:
typedef struct _MsgClient {
int handle;
int message_class;
void *callback;
void *private_data;
int priority;
// struct constructor
_MsgClient(): handle(0), message_class(0), callback(NULL), private_data(NULL), priority(0) {};
} MsgClient;
...would eliminate the need of the memset. But my fear is that when foo is initialized, the struct constructor will be called 32 times, instead of optimizing it as a simple zero out of the memory taken by the array.
What's your opinion on this?
I just found this: Can a member struct be zero-init from the constructor initializer list without calling memset? , is it appropriate in my case (which is different: I have an array, not a single instance of the structure)?
Also, according to this post, adding a constructor to my structure will automatically convert it into a non-POD structure, is it right?
On a conforming implementation, it's perfectly valid to value-initialize an array in the constructor initializer list with an empty member initializer. For your array members, this will have the effect of zero-initializing the members of each array element.
The compiler should be able to make this very efficient and there's no need for you to add a constructor to your struct.
E.g.
Foo() : _clients() {}
You can you memset freely even in C++, as long as you understand what you are doing.
About the performance - the only way to see which way is really faster is to build your program in release configuration, and then see in the disassembler the code generated.
Using memset sounds somewhat faster than per-object initialization. However there's a chance the compiler will generated the same code.
Instead of having to remember to initialize a simple 'C' structure, I might derive from it and zero it in the constructor like this:
struct MY_STRUCT
{
int n1;
int n2;
};
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct()
{
memset(this, 0, sizeof(MY_STRUCT));
}
};
This trick is often used to initialize Win32 structures and can sometimes set the ubiquitous cbSize member.
Now, as long as there isn't a virtual function table for the memset call to destroy, is this a safe practice?
You can simply value-initialize the base, and all its members will be zero'ed out. This is guaranteed
struct MY_STRUCT
{
int n1;
int n2;
};
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct():MY_STRUCT() { }
};
For this to work, there should be no user declared constructor in the base class, like in your example.
No nasty memset for that. It's not guaranteed that memset works in your code, even though it should work in practice.
PREAMBLE:
While my answer is still Ok, I find litb's answer quite superior to mine because:
It teaches me a trick that I did not know (litb's answers usually have this effect, but this is the first time I write it down)
It answers exactly the question (that is, initializing the original struct's part to zero)
So please, consider litb's answer before mine. In fact, I suggest the question's author to consider litb's answer as the right one.
Original answer
Putting a true object (i.e. std::string) etc. inside will break, because the true object will be initialized before the memset, and then, overwritten by zeroes.
Using the initialization list doesn't work for g++ (I'm surprised...). Initialize it instead in the CMyStruct constructor body. It will be C++ friendly:
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct() { n1 = 0 ; n2 = 0 ; }
};
P.S.: I assumed you did have no control over MY_STRUCT, of course. With control, you would have added the constructor directly inside MY_STRUCT and forgotten about inheritance. Note that you can add non-virtual methods to a C-like struct, and still have it behave as a struct.
EDIT: Added missing parenthesis, after Lou Franco's comment. Thanks!
EDIT 2 : I tried the code on g++, and for some reason, using the initialization list does not work. I corrected the code using the body constructor. The solution is still valid, though.
Please reevaluate my post, as the original code was changed (see changelog for more info).
EDIT 3 : After reading Rob's comment, I guess he has a point worthy of discussion: "Agreed, but this could be an enormous Win32 structure which may change with a new SDK, so a memset is future proof."
I disagree: Knowing Microsoft, it won't change because of their need for perfect backward compatibility. They will create instead an extended MY_STRUCTEx struct with the same initial layout as MY_STRUCT, with additionnal members at the end, and recognizable through a "size" member variable like the struct used for a RegisterWindow, IIRC.
So the only valid point remaining from Rob's comment is the "enormous" struct. In this case, perhaps a memset is more convenient, but you will have to make MY_STRUCT a variable member of CMyStruct instead of inheriting from it.
I see another hack, but I guess this would break because of possible struct alignment problem.
EDIT 4: Please take a look at Frank Krueger's solution. I can't promise it's portable (I guess it is), but it is still interesting from a technical viewpoint because it shows one case where, in C++, the "this" pointer "address" moves from its base class to its inherited class.
Much better than a memset, you can use this little trick instead:
MY_STRUCT foo = { 0 };
This will initialize all members to 0 (or their default value iirc), no need to specifiy a value for each.
This would make me feel much safer as it should work even if there is a vtable (or the compiler will scream).
memset(static_cast<MY_STRUCT*>(this), 0, sizeof(MY_STRUCT));
I'm sure your solution will work, but I doubt there are any guarantees to be made when mixing memset and classes.
This is a perfect example of porting a C idiom to C++ (and why it might not always work...)
The problem you will have with using memset is that in C++, a struct and a class are exactly the same thing except that by default, a struct has public visibility and a class has private visibility.
Thus, what if later on, some well meaning programmer changes MY_STRUCT like so:
struct MY_STRUCT
{
int n1;
int n2;
// Provide a default implementation...
virtual int add() {return n1 + n2;}
};
By adding that single function, your memset might now cause havoc.
There is a detailed discussion in comp.lang.c+
The examples have "unspecified behaviour".
For a non-POD, the order by which the compiler lays out an object (all bases classes and members) is unspecified (ISO C++ 10/3). Consider the following:
struct A {
int i;
};
class B : public A { // 'B' is not a POD
public:
B ();
private:
int j;
};
This can be laid out as:
[ int i ][ int j ]
Or as:
[ int j ][ int i ]
Therefore, using memset directly on the address of 'this' is very much unspecified behaviour. One of the answers above, at first glance looks to be safer:
memset(static_cast<MY_STRUCT*>(this), 0, sizeof(MY_STRUCT));
I believe, however, that strictly speaking this too results in unspecified behaviour. I cannot find the normative text, however the note in 10/5 says: "A base class subobject may have a layout (3.7) different from the layout of a most derived object of the same type".
As a result, I compiler could perform space optimizations with the different members:
struct A {
char c1;
};
struct B {
char c2;
char c3;
char c4;
int i;
};
class C : public A, public B
{
public:
C ()
: c1 (10);
{
memset(static_cast<B*>(this), 0, sizeof(B));
}
};
Can be laid out as:
[ char c1 ] [ char c2, char c3, char c4, int i ]
On a 32 bit system, due to alighments etc. for 'B', sizeof(B) will most likely be 8 bytes. However, sizeof(C) can also be '8' bytes if the compiler packs the data members. Therefore the call to memset might overwrite the value given to 'c1'.
Precise layout of a class or structure is not guaranteed in C++, which is why you should not make assumptions about the size of it from the outside (that means if you're not a compiler).
Probably it works, until you find a compiler on which it doesn't, or you throw some vtable into the mix.
If you already have a constructor, why not just initialize it there with n1=0; n2=0; -- that's certainly the more normal way.
Edit: Actually, as paercebal has shown, ctor initialization is even better.
My opinion is no. I'm not sure what it gains either.
As your definition of CMyStruct changes and you add/delete members, this can lead to bugs. Easily.
Create a constructor for CMyStruct that takes a MyStruct has a parameter.
CMyStruct::CMyStruct(MyStruct &)
Or something of that sought. You can then initialize a public or private 'MyStruct' member.
From an ISO C++ viewpoint, there are two issues:
(1) Is the object a POD? The acronym stands for Plain Old Data, and the standard enumerates what you can't have in a POD (Wikipedia has a good summary). If it's not a POD, you can't memset it.
(2) Are there members for which all-bits-zero is invalid ? On Windows and Unix, the NULL pointer is all bits zero; it need not be. Floating point 0 has all bits zero in IEEE754, which is quite common, and on x86.
Frank Kruegers tip addresses your concerns by restricting the memset to the POD base of the non-POD class.
Try this - overload new.
EDIT: I should add - This is safe because the memory is zeroed before any constructors are called. Big flaw - only works if object is dynamically allocated.
struct MY_STRUCT
{
int n1;
int n2;
};
class CMyStruct : public MY_STRUCT
{
public:
CMyStruct()
{
// whatever
}
void* new(size_t size)
{
// dangerous
return memset(malloc(size),0,size);
// better
if (void *p = malloc(size))
{
return (memset(p, 0, size));
}
else
{
throw bad_alloc();
}
}
void delete(void *p, size_t size)
{
free(p);
}
};
If MY_STRUCT is your code, and you are happy using a C++ compiler, you can put the constructor there without wrapping in a class:
struct MY_STRUCT
{
int n1;
int n2;
MY_STRUCT(): n1(0), n2(0) {}
};
I'm not sure about efficiency, but I hate doing tricks when you haven't proved efficiency is needed.
Comment on litb's answer (seems I'm not yet allowed to comment directly):
Even with this nice C++-style solution you have to be very careful that you don't apply this naively to a struct containing a non-POD member.
Some compilers then don't initialize correctly anymore.
See this answer to a similar question.
I personally had the bad experience on VC2008 with an additional std::string.
What I do is use aggregate initialization, but only specifying initializers for members I care about, e.g:
STARTUPINFO si = {
sizeof si, /*cb*/
0, /*lpReserved*/
0, /*lpDesktop*/
"my window" /*lpTitle*/
};
The remaining members will be initialized to zeros of the appropriate type (as in Drealmer's post). Here, you are trusting Microsoft not to gratuitously break compatibility by adding new structure members in the middle (a reasonable assumption). This solution strikes me as optimal - one statement, no classes, no memset, no assumptions about the internal representation of floating point zero or null pointers.
I think the hacks involving inheritance are horrible style. Public inheritance means IS-A to most readers. Note also that you're inheriting from a class which isn't designed to be a base. As there's no virtual destructor, clients who delete a derived class instance through a pointer to base will invoke undefined behaviour.
I assume the structure is provided to you and cannot be modified. If you can change the structure, then the obvious solution is adding a constructor.
Don't over engineer your code with C++ wrappers when all you want is a simple macro to initialise your structure.
#include <stdio.h>
#define MY_STRUCT(x) MY_STRUCT x = {0}
struct MY_STRUCT
{
int n1;
int n2;
};
int main(int argc, char *argv[])
{
MY_STRUCT(s);
printf("n1(%d),n2(%d)\n", s.n1, s.n2);
return 0;
}
It's a bit of code, but it's reusable; include it once and it should work for any POD. You can pass an instance of this class to any function expecting a MY_STRUCT, or use the GetPointer function to pass it into a function that will modify the structure.
template <typename STR>
class CStructWrapper
{
private:
STR MyStruct;
public:
CStructWrapper() { STR temp = {}; MyStruct = temp;}
CStructWrapper(const STR &myStruct) : MyStruct(myStruct) {}
operator STR &() { return MyStruct; }
operator const STR &() const { return MyStruct; }
STR *GetPointer() { return &MyStruct; }
};
CStructWrapper<MY_STRUCT> myStruct;
CStructWrapper<ANOTHER_STRUCT> anotherStruct;
This way, you don't have to worry about whether NULLs are all 0, or floating point representations. As long as STR is a simple aggregate type, things will work. When STR is not a simple aggregate type, you'll get a compile-time error, so you won't have to worry about accidentally misusing this. Also, if the type contains something more complex, as long as it has a default constructor, you're ok:
struct MY_STRUCT2
{
int n1;
std::string s1;
};
CStructWrapper<MY_STRUCT2> myStruct2; // n1 is set to 0, s1 is set to "";
On the downside, it's slower since you're making an extra temporary copy, and the compiler will assign each member to 0 individually, instead of one memset.