Question: Is there an automatic way to do structure packing?
Background:
Structure packing is very useful for reducing the memory cost of certain fundamental data. Basically, it is the trick of minimizing a struct's size by reordering its members. My question is: is there an automatic way to do that? For example, I have a struct Foo here (assume a 32-bit platform).
struct Foo {
char flag;
char* p;
short number;
};
After an automatic check (whether by a script or not, native or not), I should get a memory-optimized version of Foo, which is:
struct Foo {
char* p;
short number;
char flag;
};
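To see what the reordering buys, the two layouts can be compared with sizeof. This is a minimal sketch rather than part of the original question: the names FooOriginal, FooReordered and bytes_saved are made up for illustration, and the actual numbers depend on the platform's pointer size and alignment rules.

```cpp
#include <cassert>
#include <cstddef>

// Declaration order from the question: a 1-byte member sits in front of a pointer.
struct FooOriginal {
    char flag;
    char* p;
    short number;
};

// Hand-reordered version: members sorted by decreasing alignment.
struct FooReordered {
    char* p;
    short number;
    char flag;
};

// Bytes saved by the reordering on the current platform (hypothetical helper).
inline std::size_t bytes_saved() {
    assert(sizeof(FooReordered) <= sizeof(FooOriginal));
    return sizeof(FooOriginal) - sizeof(FooReordered);
}
```

On a typical 32-bit ABI one would expect 12 bytes versus 8, and on a typical 64-bit ABI 24 versus 16, but neither figure is guaranteed by the language.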
This is just a toy example. Consider the more difficult situations below, where manual reordering would be quite a lot of work:
The struct contains another struct:
struct Foo {
char* p;
short number;
MoreFoo more_foo; // How to deal with this?
char flag;
};
The struct is in legacy code and you are not familiar with the codebase.
You want the code to be cross-platform. Sadly enough, this trick is compiler-dependent.
I am not considering the "packed" attribute, since it can lead to performance issues.
Can __attribute__((packed)) affect the performance of a program?
In C++03 you can give the compiler permission to reorder members by placing every one inside a separate access section, e.g.:
struct Foo
{
public:
char* p;
public:
short number;
public:
MoreFoo more_foo;
public:
char flag;
};
Whether a particular compiler uses this additional flexibility or not I do not know.
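One way to find out what a given compiler actually does is to compare member offsets. The following is only a sketch: MoreFoo here is a hypothetical stand-in for the real type, and declaration_order_kept is a made-up helper name.

```cpp
#include <cassert>
#include <cstddef>

struct MoreFoo { int a; char b; };  // hypothetical stand-in for the real MoreFoo

struct Foo {
public:
    char* p;
public:
    short number;
public:
    MoreFoo more_foo;
public:
    char flag;
};

// True when the members appear in memory in declaration order,
// i.e. the compiler did not use the freedom to reorder them.
inline bool declaration_order_kept() {
    return offsetof(Foo, p) < offsetof(Foo, number)
        && offsetof(Foo, number) < offsetof(Foo, more_foo)
        && offsetof(Foo, more_foo) < offsetof(Foo, flag);
}
```

If this returns true on your toolchain, the extra access specifiers bought nothing and the members still have to be reordered by hand.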
This doesn't change the declaration order, it simply unlinks memory order from declaration order, so PaulMcKenzie's concern about initialization order does not apply. (And I think he overstated the concern; it is very rare for member initializers to refer to other subobjects in the first place)
This works because it causes the following rule, from 9.2, to no longer apply:
Nonstatic data members of a (non-union) class declared without an intervening access-specifier are allocated so that later members have higher addresses within a class object. The order of allocation of nonstatic data members separated by an access-specifier is unspecified (11.1). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
Also, it's questionable whether this still works in C++11, since the wording changed from "without an intervening access-specifier" to "with the same access control":
Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object. The order of allocation of non-static data members with different access control is unspecified (Clause 11). Implementation alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for space for managing virtual functions (10.3) and virtual base classes (10.1).
In C, the compiler cannot reorder a struct automatically, because that would go against the way the language was designed. C allows low-level access to the hardware; in fact, it is only a step above assembly language. It is designed for writing hardware-dependent code, so members are laid out in the order in which they are declared.
Given this, unfortunately, you can only reorder a struct manually. You would probably need to find the size of each of the struct's members, like so:
printf ("Size of char is %zu\n", sizeof (char));
printf ("Size of char* is %zu\n", sizeof (char*));
printf ("Size of short is %zu\n", sizeof (short));
printf ("Size of MoreFoo is %zu\n", sizeof (MoreFoo));
And then order the struct based on these values.
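The manual procedure can be sketched as a small helper that simulates the layout for a given member order. Everything here is illustrative: Member, layout_size and sort_by_alignment are hypothetical names, and the model assumes the usual rule that each member is placed at the next multiple of its alignment and the total is rounded up to the largest alignment. Sorting by decreasing alignment is a common heuristic, not something the language does for you.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct Member {
    const char* name;
    std::size_t size;   // sizeof the member
    std::size_t align;  // alignof the member
};

// Simulated sizeof for a struct whose members are laid out in the given order.
inline std::size_t layout_size(const std::vector<Member>& members) {
    std::size_t offset = 0, max_align = 1;
    for (const Member& m : members) {
        assert(m.align != 0);
        offset = (offset + m.align - 1) / m.align * m.align;  // pad up to alignment
        offset += m.size;
        max_align = std::max(max_align, m.align);
    }
    return (offset + max_align - 1) / max_align * max_align;  // tail padding
}

// Returns the members ordered by decreasing alignment (ties keep their order).
inline std::vector<Member> sort_by_alignment(std::vector<Member> members) {
    std::stable_sort(members.begin(), members.end(),
                     [](const Member& a, const Member& b) { return a.align > b.align; });
    return members;
}
```

Feeding in the 32-bit figures for Foo (flag: 1/1, p: 4/4, number: 2/2) gives 12 bytes in declaration order and 8 bytes after sorting under this model.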
Related
I have a very big class with a bunch of members, and I want to initialize them all with a given value. The code below is the most naive implementation, but I don't like it since it's inelegant and hard to maintain, because I have to list all the members in the constructor.
struct I_Dont_Like_This_Approach {
int foo;
long bar;
unsigned baz;
int a;
int b;
int c;
int d;
SomeStruct and_so_on;
/*...*/
public:
explicit I_Dont_Like_This_Approach(int i) : foo(i), bar(i), baz(i), a(i), b(i), c(i), d(i), and_so_on(i) /*...*/ {}
};
I thought of an alternative implementation using templates.
template <int N>
struct MyBigClass {
int foo{N};
long bar{N};
unsigned baz{N};
int a{N};
int b{N};
int c{N};
int d{N};
SomeStruct and_so_on{N};
/*...*/
};
but I'm not sure if the code below is safe.
MyBigClass<1> all_one;
MyBigClass<2> all_two;
/* Is the following reinterpret_cast safe? */
all_one = reinterpret_cast<decltype(all_one) &>(all_two);
Does the C++ specification have any guarantees about the data layout compatibility of such templated structs? Or is there a more reasonable implementation? (in standard C++, and don't use macros)
I would argue that the first one is much more maintainable. With the right warnings enabled (and a modern compiler), you will see at compile time if your initializer list gets out of sync with the class fields.
As to your alternative: you're using templates as compiler arguments, which is not what they're meant for. That brings a whole slew of issues:
instantiated templates get copied in memory, making your executable larger. Though in this case, I'm hoping your compiler is smart enough to see that the field structure is the same and treat it as one type.
your code now works only with constant literal integers, no more run-time variables.
there is indeed no guarantee that the memory structure of those two classes is the same. You can disable optimizations in most compilers (like pack, alignment, etc), but that comes at the cost of disabling optimizations, which isn't actually necessary except to support your specific code.
And related to the last one, if you ever need to consider whether this is ever going to break, you're heading down a very dark road. I mean any sane person can tell you it will "probably work", but the fact that you have no guarantees in the language that pretty much popularized memory corruption and buffer overflows should terrify you. Write constructors.
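For comparison, here is a minimal sketch of the constructor approach. SomeStruct is a hypothetical stand-in for the type in the question, the member list is shortened, and copy_from is a made-up helper. Because there is only one type, the fill value can be a run-time variable and no cast is needed to copy between objects.

```cpp
#include <cassert>

struct SomeStruct {               // hypothetical stand-in for the question's SomeStruct
    int v;
    explicit SomeStruct(int i) : v(i) {}
};

struct MyBigClass {
    int foo;
    long bar;
    unsigned baz;
    SomeStruct and_so_on;

    // One ordinary type; the fill value may be computed at run time.
    explicit MyBigClass(int i)
        : foo(i), bar(i), baz(static_cast<unsigned>(i)), and_so_on(i) {}
};

// Copying between an "all one" and an "all two" object is plain assignment.
inline MyBigClass copy_from(const MyBigClass& src) {
    MyBigClass dst(0);
    dst = src;
    return dst;
}
```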
I have two classes for communication that will never exist at the same time but both will be in use.
Example:
class CommA
{
public:
void SendA();
...
private:
ProtocolA a;
...
};
class CommB
{
public:
void SendB();
...
private:
ProtocolB b;
...
};
Is it possible to hold them in Union to save memory?
union CommAB
{
CommA a;
CommB b;
};
The std::variant was already raised by others in comments and answers. I'll explain why it is the recommended approach.
There is an important thing to know about unions: only one member of a union can be active at any time. If you create a CommAB object, it's either a or b, and if you access the wrong one, it's undefined behavior (UB). How do you know which one is the active one? Fortunately the standard provides some guarantees: if the classes of all union members start with the same common members, you can access those safely. The usual trick is then to have a first member in all the classes of the union to determine which is the active one. Another issue arises when you have an active member and want to change it.
The advantage of the variant is that it takes care of these practical aspects without you having to worry, and without having to adjust the classes of the members.
Since you are looking to save space: the union takes the space of its largest member. So if you have to add a common extra member to your classes to track the active union member, your union will be larger than either the initial A or B. The variant does the work for you, so it will be larger as well.
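A minimal sketch of the variant approach, assuming C++17. CommA and CommB are reduced to hypothetical stubs here (the real classes would hold ProtocolA and ProtocolB), and send is a made-up helper that dispatches on the active alternative.

```cpp
#include <cassert>
#include <variant>

// Hypothetical stand-ins: the real classes would wrap ProtocolA / ProtocolB.
struct CommA { int SendA() { return 1; } };
struct CommB { int SendB() { return 2; } };

using CommAB = std::variant<CommA, CommB>;

// Dispatch on whichever alternative is currently active.
inline int send(CommAB& comm) {
    if (CommA* a = std::get_if<CommA>(&comm)) {
        return a->SendA();
    }
    return std::get<CommB>(comm).SendB();
}
```

Assigning a CommB to the variant destroys the stored CommA and makes CommB the active alternative, which is the switching step a raw union leaves to you.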
Given a standard layout class with standard layout members such as:
struct foo {
int n;
int m;
unsigned char garbage;
};
will it always be safe, according to the standard, to write in the last byte of the struct without writing into the memory areas of n and m (but possibly writing into garbage)? E.g.,
foo f;
*(static_cast<unsigned char *>(static_cast<void *>(&f)) + (sizeof(foo) - 1u)) = 0u;
After spending some time reading the C++11 standard, it seems to me like the answer might be yes.
From 9.2/15:
Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so
that later members have higher addresses within a class object. The order of allocation of non-static data
members with different access control is unspecified (11). Implementation alignment requirements might
cause two adjacent members not to be allocated immediately after each other; so might requirements for
space for managing virtual functions (10.3) and virtual base classes (10.1).
Hence, the garbage member has a higher address than the other two members (which are stored contiguously themselves as they have standard layout), and the last byte of the struct must hence either belong to garbage or be part of the final padding.
Is this reasoning correct? Am I meddling with the f object lifetime here? Is writing into padding bytes a problem?
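Whether the last byte is garbage itself or tail padding can be checked per platform with offsetof. This is only a sketch; tail_padding and last_byte_is_garbage are hypothetical helper names.

```cpp
#include <cassert>
#include <cstddef>

struct foo {
    int n;
    int m;
    unsigned char garbage;
};

// Number of padding bytes after `garbage` on this platform.
inline std::size_t tail_padding() {
    return sizeof(foo) - (offsetof(foo, garbage) + sizeof(unsigned char));
}

// True when the last byte of the object is `garbage` itself rather than padding.
inline bool last_byte_is_garbage() {
    return tail_padding() == 0;
}
```

With a 4-byte int aligned to 4 bytes, one would expect garbage at offset 8 and sizeof(foo) to be 12, in which case the last byte is padding rather than garbage.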
EDIT
In reply to the comments, what I am trying to achieve here has to do with a variant-like class I am writing.
If I proceed in a straightforward way (i.e., place an int member in the variant class to record which type is being stored), the padding will make the class almost 50% bigger than it needs to be.
What I am trying to do is to make sure that every last byte of each class type I am going to store in the variant is writable, so I can incorporate the storage flag into the raw storage (aligned raw char array) I am using in the variant. In my specific case, this eliminates most of the wasted space.
EDIT 2
As an actual example, consider these two classes to be stored in a variant on a typical 64-bit machine:
// Small dynamic vector class storing 8-bit integers.
class first {
std::int8_t *m_ptr;
unsigned short m_size_capacity; // Size and capacity packed into a single ushort.
};
// Vector class with static storage.
class second {
std::int8_t m_data[15];
std::uint8_t m_size;
};
class variant
{
char m_data[...]; // Properly sized and aligned for first and second.
bool m_flag; // Flag to signal which class is being stored.
};
The size of these two classes is 16 on my machine; the extra member needed in the variant class makes the size go up to 24. If I now add the garbage byte at the end:
// Small dynamic vector class storing 8-bit integers.
class first {
std::int8_t *m_ptr;
unsigned short m_size_capacity; // Size and capacity packed into a single ushort.
unsigned char m_garbage;
};
// Vector class with static storage.
class second {
std::int8_t m_data[14]; // Note I lost a vector element here.
std::uint8_t m_size;
unsigned char m_garbage;
};
The size of both classes will still be 16, but if now I can use the last byte of each class freely I can do away with the flag member in the variant, and the final size will still be 16.
Instead, you should put the tag first, followed by the other small members.
// Small dynamic vector class storing 8-bit integers.
struct first
{
unsigned char m_tag;
std::uint8_t m_size;
std::uint8_t m_capacity;
std::int8_t *m_ptr;
};
// Vector class with static storage.
struct second
{
unsigned char m_tag;
std::uint8_t m_size;
std::int8_t m_data[14];
};
Then, the language rules allow you to put these into a union and use either one to access m_tag, even if that wasn't the "active" member of the union, because the initial layout is the same (special rule for common initial sequence of members).
union tight_vector
{
first dynamic;
second small_opt;
};
tight_vector v;
if (v.dynamic.m_size < 4) throw std::runtime_error("Not enough data");
if (v.dynamic.m_tag == DYNAMIC) { /* use v.dynamic */ }
else { /* use v.small_opt */ }
The rule in question is 9.2/18:
If a standard-layout union contains two or more standard-layout structs that share a common initial sequence, and if the standard-layout union object currently contains one of these standard-layout structs, it is permitted to inspect the common initial part of any of them. Two standard-layout structs share a common initial sequence if corresponding members have layout-compatible types and either neither member is a bit-field or both are bit-fields with the same width for a sequence of one or more initial members.
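Put together, the tagged union might look like the following sketch. The Tag enumeration and the helpers make_small, is_small and stored_size are hypothetical additions for illustration; only m_tag and m_size belong to the common initial sequence, so those are the only members read through the inactive struct.

```cpp
#include <cassert>
#include <cstdint>

enum Tag : unsigned char { DYNAMIC, SMALL };  // hypothetical tag values

struct first {
    unsigned char m_tag;
    std::uint8_t m_size;
    std::uint8_t m_capacity;
    std::int8_t* m_ptr;
};

struct second {
    unsigned char m_tag;
    std::uint8_t m_size;
    std::int8_t m_data[14];
};

union tight_vector {
    first dynamic;
    second small_opt;
};

inline tight_vector make_small(std::uint8_t size) {
    tight_vector v;
    v.small_opt = second{};      // small_opt becomes the active member
    v.small_opt.m_tag = SMALL;
    v.small_opt.m_size = size;
    return v;
}

// Both helpers read through `dynamic` even when `small_opt` is active;
// m_tag and m_size are part of the common initial sequence.
inline bool is_small(const tight_vector& v) { return v.dynamic.m_tag == SMALL; }
inline std::uint8_t stored_size(const tight_vector& v) { return v.dynamic.m_size; }
```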
Yes, in C++ any object, including a class, is represented as a series of addressable char objects ("bytes"), and objects declared in sequence in a class without intervening access specifiers have sequentially ascending addresses. Therefore storage for garbage must have a higher address (when addressed as char *) than n or m.
In theory the compiler could store a base class at the end of the object, or something like a vtable pointer, but in practice such things always go at the beginning for the sake of simplicity. I'm not sure what the standard guarantees about the size of a standard-layout class. That matters for whether padding may be added, and therefore for whether an implementation could rely on the presence of padding for some purpose. It probably comes down to the implementation being allowed to use padding which happens to be there, while not being allowed to add any (and access would perhaps not be simple or efficient anyway).
What I am trying to do is to make sure that every last byte of each class type I am going to store in the variant is writable
How is this different from using the garbage member itself? If you know it's there, presumably you can simply access it.
The last byte of the struct will not be part of n and m, but how do you know the compiler hasn't stored something else in the last byte, such as type information?
I don't recall the standard guaranteeing any such thing. Only that a memcpy of sizeof(T) into and out will result in the same value, which doesn't mean that the final byte doesn't hold information.
After seeing this question a few minutes ago, I wondered why the language designers allow it as it allows indirect modification of private data. As an example
class TestClass {
private:
int cc;
public:
TestClass(int i) : cc(i) {};
};
TestClass cc(5);
int* pp = (int*)&cc;
*pp = 70; // private member has been modified
I tested the above code and indeed the private data has been modified. Is there any explanation of why this is allowed to happen or this just an oversight in the language? It seems to directly undermine the use of private data members.
Because, as Bjarne puts it, C++ is designed to protect against Murphy, not Machiavelli.
In other words, it's supposed to protect you from accidents -- but if you go to any work at all to subvert it (such as using a cast) it's not even going to attempt to stop you.
When I think of it, I have a somewhat different analogy in mind: it's like the lock on a bathroom door. It gives you a warning that you probably don't want to walk in there right now, but it's trivial to unlock the door from the outside if you decide to.
Edit: as to the question @Xeo discusses, about why the standard says "have the same access control" instead of "have all public access control", the answer is long and a little tortuous.
Let's step back to the beginning and consider a struct like:
struct X {
int a;
int b;
};
C always had a few rules for a struct like this. One is that in an instance of the struct, the address of the struct itself has to equal the address of a, so you can cast a pointer to the struct to a pointer to int, and access a with well defined results. Another is that the members have to be arranged in the same order in memory as they are defined in the struct (though the compiler is free to insert padding between them).
For C++, there was an intent to maintain that, especially for existing C structs. At the same time, there was an apparent intent that if the compiler wanted to enforce private (and protected) at run-time, it should be easy to do that (reasonably efficiently).
Therefore, given something like:
struct Y {
int a;
int b;
private:
int c;
int d;
public:
int e;
// code to use `c` and `d` goes here.
};
The compiler should be required to maintain the same rules as C with respect to Y.a and Y.b. At the same time, if it's going to enforce access at run time, it may want to move all the public variables together in memory, so the layout would be more like:
struct Z {
int a;
int b;
int e;
private:
int c;
int d;
// code to use `c` and `d` goes here.
};
Then, when it's enforcing things at run-time, it can basically do something like if (offset > 3 * sizeof(int)) access_violation();
To my knowledge nobody's ever done this, and I'm not sure the rest of the standard really allows it, but there does seem to have been at least the half-formed germ of an idea along that line.
To enforce both of those, C++98 said Y::a and Y::b had to be in that order in memory, and Y::a had to be at the beginning of the struct (i.e., C-like rules). But, because of the intervening access specifiers, Y::c and Y::e no longer had to be in order relative to each other. In other words, all the consecutive variables defined without an access specifier between them were grouped together, and the compiler was free to rearrange those groups (but still had to keep the first one at the beginning).
That was fine until some jerk (i.e., me) pointed out that the way the rules were written had another little problem. If I wrote code like:
struct A {
int a;
public:
int b;
public:
int c;
public:
int d;
};
...you ended up with a little bit of self-contradiction. On one hand, this was still officially a POD struct, so the C-like rules were supposed to apply -- but since you had (admittedly meaningless) access specifiers between the members, it also gave the compiler permission to rearrange the members, thus breaking the C-like rules they intended.
To cure that, they re-worded the standard a little so it would talk about the members all having the same access, rather than about whether or not there was an access specifier between them. Yes, they could have just decreed that the rules would only apply to public members, but it would appear that nobody saw anything to be gained from that. Given that this was modifying an existing standard with lots of code that had been in use for quite a while, they opted for the smallest change they could make that would still cure the problem.
Because of backward compatibility with C, where you can do the same thing.
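In C++11 and later, the effect of the reworded rule can be observed through std::is_standard_layout. A small sketch, reusing the A and Y structs from above; sum_private is a hypothetical member standing in for the "code to use c and d".

```cpp
#include <cassert>
#include <type_traits>

// Redundant access specifiers, but every data member is public.
struct A {
    int a;
public:
    int b;
public:
    int c;
public:
    int d;
};

// Mixed access control: public and private data members.
struct Y {
    int a;
    int b;
private:
    int c;
    int d;
public:
    int e;
    int sum_private() const { return c + d; }
};

static_assert(std::is_standard_layout<A>::value, "same access control throughout");
static_assert(!std::is_standard_layout<Y>::value, "mixed access control");
```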
For all people wondering, here's why this is not UB and is actually allowed by the standard:
First, TestClass is a standard-layout class (§9 [class] p7):
A standard-layout class is a class that:
has no non-static data members of type non-standard-layout class (or array of such types) or reference, // OK: non-static data member is of type 'int'
has no virtual functions (10.3) and no virtual base classes (10.1), // OK
has the same access control (Clause 11) for all non-static data members, // OK, all non-static data members (1) are 'private'
has no non-standard-layout base classes, // OK, no base classes
either has no non-static data members in the most derived class and at most one base class with non-static data members, or has no base classes with non-static data members, and // OK, no base classes again
has no base classes of the same type as the first non-static data member. // OK, no base classes again
And with that, you are allowed to reinterpret_cast the class to the type of its first member (§9.2 [class.mem] p20):
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa.
In your case, the C-style (int*) cast resolves to a reinterpret_cast (§5.4 [expr.cast] p4).
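The argument can be condensed into a sketch that checks the precondition at compile time. The value() accessor and the overwrite_first_member helper are hypothetical additions, present only so that the effect can be observed.

```cpp
#include <cassert>
#include <type_traits>

class TestClass {
private:
    int cc;
public:
    TestClass(int i) : cc(i) {}
    int value() const { return cc; }  // hypothetical accessor, added only to observe the result
};

static_assert(std::is_standard_layout<TestClass>::value,
              "the initial-member rule only covers standard-layout classes");

// Overwrites the private member through a pointer to the initial member.
inline void overwrite_first_member(TestClass& t, int v) {
    *reinterpret_cast<int*>(&t) = v;
}
```

If a virtual function or a data member with different access were added to TestClass, the static_assert would fail and the cast would no longer be covered by the quoted rule.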
A good reason is to allow compatibility with C but extra access safety on the C++ layer.
Consider:
struct S {
#ifdef __cplusplus
private:
#endif // __cplusplus
int i, j;
#ifdef __cplusplus
public:
int get_i() const { return i; }
int get_j() const { return j; }
#endif // __cplusplus
};
By requiring that the C-visible S and the C++-visible S be layout-compatible, S can be used across the language boundary with the C++ side having greater access safety. The reinterpret_cast access safety subversion is an unfortunate but necessary corollary.
As an aside, the restriction on having all members with the same access control is because the implementation is permitted to rearrange members relative to members with different access control. Presumably some implementations put members with the same access control together, for the sake of tidiness; it could also be used to reduce padding, although I don't know of any compiler that does that.
The whole purpose of reinterpret_cast (and a C style cast is even more powerful than a reinterpret_cast) is to provide an escape path around safety measures.
If you had tried int *pp = &cc.cc, the compiler would have given you an error telling you that you cannot access a private member.
In your code you are reinterpreting the address of cc as a pointer to an int. You wrote it the C-style way; the C++-style way would have been int* pp = reinterpret_cast<int*>(&cc);. A reinterpret_cast is always a warning that you are casting between two pointer types that are not related. In such a case you must make sure that you are doing it right. You must know the underlying memory layout. The compiler does not prevent you from doing so, because this is often needed.
When doing the cast you throw away all knowledge about the class. From now on the compiler only sees an int pointer. Of course you can access the memory the pointer points to. In your case, on your platform the compiler happened to put cc in the first n bytes of a TestClass object, so a TestClass pointer also points to the cc member.
This is because you are manipulating the memory where your object is located. In your case it just happens to store the private member at this memory location, so you change it. It is not a very good idea, because you do not know how the object will be stored in memory.
Is it possible to write a C++ class or struct that is fully compatible with a C struct? By compatibility I mean the size of the object and the memory locations of the variables. I know that it's evil to use *(point*)&pnt or even (float*)&pnt (in a different case where the variables are floats), but consider that it's really required for the sake of performance. It's not logical to use the regular type-casting operator a million times per second.
Take this example
Class Point {
long x,y;
Point(long x, long y) {
this->x=x;
this->y=y;
}
float Distance(Point &point) {
return ....;
}
};
C version is a POD struct
struct point {
long x,y;
};
The cleanest way to do this is to inherit from the C struct:
struct point
{
long x, y;
};
class Point : public point
{
public:
Point(long x, long y)
{ this->x=x; this->y=y; }
float Distance(Point &point)
{ return ....; }
};
The C++ compiler guarantees the C style struct point has the same layout as with the C compiler. The C++ class Point inherits this layout for its base class portion (and since it adds no data or virtual members, it will have the same layout). A pointer to class Point will be converted to a pointer to struct point without a cast, since conversion to a base class pointer is always supported. So, you can use class Point objects and freely pass pointers to them to C functions expecting a pointer to struct point.
Of course, if there is already a C header file defining struct point, then you can just include this instead of repeating the definition.
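A compilable sketch of this arrangement, with compile-time checks of the layout claims. The function c_sum is a hypothetical stand-in for a C function taking the C struct, and the constructor parameters are renamed only to avoid shadowing the inherited members.

```cpp
#include <cassert>
#include <type_traits>

struct point {          // in real code this would come from the shared C header
    long x, y;
};

// Hypothetical stand-in for a C function that expects the C struct.
inline long c_sum(const point* p) { return p->x + p->y; }

class Point : public point {
public:
    Point(long x_, long y_) { x = x_; y = y_; }
};

static_assert(std::is_standard_layout<Point>::value, "no data members or virtuals added");
static_assert(sizeof(Point) == sizeof(point), "derived class adds no storage");
```

A Point* converts to point* implicitly, so c_sum(&p) compiles without a cast for any Point p.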
Yes.
Use the same types in the same order in both languages
Make sure the class doesn't have anything virtual in it (so you don't get a vtable pointer stuck on the front)
Depending on the compilers used you may need to adjust the structure packing (usually with pragmas) to ensure compatibility.
(edit)
Also, you must take care to check the sizeof() of the types with your compilers. For example, I've encountered a compiler that stored shorts as 32-bit values (when most use 16). A more common case is long, which is usually 32 bits on a 32-bit architecture and 64 bits on most 64-bit Unix-like platforms, but stays 32 bits on 64-bit Windows.
POD applies to C++. You can have member functions. "A POD type in C++ is an aggregate class that contains only POD types as members, has no user-defined destructor, no user-defined copy assignment operator, and no nonstatic members of pointer-to-member type"
You should design your POD data structures so they have natural alignment, and then they can be passed between programs created by different compilers on different architectures. Natural alignment is where the memory offset of any member is divisible by the size of that member, i.e., a float is located at an address that is divisible by 4, and a double at an address divisible by 8. If you declare a char followed by a float, most architectures will pad 3 bytes, but some could conceivably pad 1 byte. If you declare a float followed by a char, all compilers (I ought to add a source for this claim, sorry) will not put any padding between them.
C and C++ are different languages, but it has always been C++'s intention that you can have an implementation that supports both languages in a binary-compatible fashion. Because they are different languages, it is always a compiler implementation detail whether this is actually supported. Typically vendors who supply both a C and a C++ compiler (or a single compiler with two modes) do support full compatibility for passing POD-structs (and pointers to POD-structs) between C++ code and C code.
Often, merely having a user-defined constructor breaks the guarantee, although sometimes you can pass a pointer to such an object to a C function expecting a pointer to a struct with an identical data structure and it will work.
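The two orderings mentioned above can be inspected with offsetof. This is a sketch with made-up struct names; the exact amount of padding is up to the platform's ABI.

```cpp
#include <cassert>
#include <cstddef>

struct CharThenFloat {
    char c;
    float f;   // usually preceded by padding so that it is naturally aligned
};

struct FloatThenChar {
    float f;
    char c;    // placed directly after f; any padding comes at the end
};

// Padding bytes inserted between the two members of CharThenFloat.
inline std::size_t padding_before_float() {
    return offsetof(CharThenFloat, f) - sizeof(char);
}
```

Note that FloatThenChar typically still has the same sizeof as CharThenFloat, because the padding moves to the end of the struct so that arrays of it stay aligned.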
In short: check your compiler documentation.
Use the same "struct" in both C and C++. If you want to add methods in the C++ implementation, you can inherit the struct and the size should be the same as long as you don't add data members or virtual functions.
Be aware that if you have an empty struct or data members that are empty structs, they are different sizes in C and C++. In C, sizeof(empty-struct) == 0 although in C99, empty-structs are not supposed to be allowed (but may be supported anyway as a "compiler extension"). In C++, sizeof(empty-struct) != 0 (typical value is 1).
In addition to the other answers, I would be sure not to put any access specifiers (public:, private: etc.) into your C++ class / struct. IIRC the compiler is allowed to reorder blocks of member variables according to visibility, so that private: int a; public: int b; might get a and b swapped round. See e.g. this link: http://www.embedded.com/design/218600150?printable=true
I admit to being baffled as to why the definition of POD does not include a prohibition to this effect.
As long as your class doesn't exhibit some advanced traits of its kind, like growing something virtual, it should be pretty much the same struct.
Besides, you can change Class (which is invalid due to capitalization, anyway) to struct without doing any harm, except that the members will become public (they are private now).
But now that I think about it, you're talking about type conversion… There's no way you can turn a float into a long representing the same value, or vice versa, by casting the pointer type. I hope you only want these pointers for the sake of moving stuff around.
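The difference between converting a value and reinterpreting its storage can be shown in a few lines. This sketch assumes a 32-bit IEEE-754 float; as_value and as_bits are hypothetical helper names, and memcpy is used instead of a pointer cast so that the example stays within the aliasing rules.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");

// Value conversion: the number 1.0 becomes the integer 1.
inline long as_value(float f) { return static_cast<long>(f); }

// Bit-pattern view: what reading the same storage as an integer yields.
inline std::uint32_t as_bits(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

For 1.0f the value conversion yields 1, while the bit pattern is 0x3F800000, which is why a pointer cast cannot stand in for a conversion.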