Packing bitfields even more tightly

I have a problem with bitfields in derived classes.
With the g++ compiler, you can assign __attribute__((packed)) to a class and it will pack bitfields. So
class A
{
public:
    int one:10;
    int two:10;
    int three:10;
} __attribute__ ((__packed__));
takes up only 4 bytes. So far, so good.
However, if you inherit a class, like this
class B
{
public:
    int one:10;
    int two:10;
} __attribute__ ((__packed__));
class C : public B
{
public:
    int three:10;
} __attribute__ ((__packed__));
I would expect class C, which has the same content as class A above, to have the same layout as well, i.e. take up 4 bytes. However, C turns out to occupy 5 bytes.
So my question is, am I doing something wrong, and if so, what? Or is this a problem with the compiler? An oversight, a real bug?
I tried googling, but haven't really come up with anything, apart from a difference between Linux and Windows (where the compiler tries to emulate MSVC), which I'm not interested in. This is just on Linux.

I believe the problem is with B, which cannot easily be 2.5 bytes. It has to be at least 3 bytes.
Theoretically, the derived class might be allowed to reuse padding from the base class, but I have never seen that happen.
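A minimal sketch for checking the sizes discussed above on the compiler at hand. It assumes GCC or a compatible compiler that accepts __attribute__((__packed__)); the function name print_sizes is made up for illustration, and the printed values are whatever your toolchain produces.

```cpp
#include <cstdio>

// All three 10-bit fields declared in one class: 30 bits in total.
class A {
public:
    int one : 10;
    int two : 10;
    int three : 10;
} __attribute__((__packed__));

// Only the first two fields: 20 bits, rounded up to whole bytes.
class B {
public:
    int one : 10;
    int two : 10;
} __attribute__((__packed__));

// The third field is added by inheritance, after B's storage.
class C : public B {
public:
    int three : 10;
} __attribute__((__packed__));

// Hypothetical helper: print the sizes so the layouts can be compared.
void print_sizes() {
    std::printf("sizeof(A) = %zu\n", sizeof(A));
    std::printf("sizeof(B) = %zu\n", sizeof(B));
    std::printf("sizeof(C) = %zu\n", sizeof(C));
}
```

Calling print_sizes() from main shows whether B is rounded up to whole bytes before C's field is placed.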

Imagine for a second that what you are asking for is possible. What would be possible side-effects or issues of that? Let's look at the particular example that you have. Also assume a 32-bit architecture with 1-byte memory alignment.
There are 20 consecutive bits in class B that you can address via the class's members one and two. That addressing is very convenient for you, the human. But what does the compiler do to make it happen? It uses masks and bit shifts to move those bits into the correct places.
So far so good, seems simple and safe enough.
Now add 10 more bits in class C. Let's say there was some amazingly smart compiler that let you squeeze those extra 10 bits into the already used 32-bit word (they fit nicely, don't they?).
Here comes trouble:
B* base = new C; // upcast to base class
base->one = 1;
base->two = 2;
// What happens to the bits of three in this context? The code for these
// writes knows only about B, and the compiler is free to do all sorts of
// optimizations when generating code for class B.
Because of the above, the compiler has to use different, separately addressable memory locations for the members of class B and the members of class C, causing those 10 bits to "spill" into the next addressable memory location: the next byte.
Even more trouble comes when you consider multiple inheritance - what is the one true way of arranging the bits in a derived class?
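The masks and shifts mentioned above can be written out by hand. This is only a sketch of the idea, not what any particular compiler emits: the type Packed30 and its members are hypothetical, the fields are treated as unsigned, and all three 10-bit fields live in a single 32-bit word, which is one way to get the 4-byte layout without relying on inheritance.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: three unsigned 10-bit fields in one 32-bit word.
struct Packed30 {
    std::uint32_t bits = 0;

    static constexpr std::uint32_t kMask = 0x3FFu; // low 10 bits set

    // Read field `index` (0, 1 or 2): shift it down, then mask off the rest.
    std::uint32_t get(unsigned index) const {
        assert(index < 3);
        return (bits >> (index * 10u)) & kMask;
    }

    // Write field `index`: clear its 10 bits, then OR in the new value.
    void set(unsigned index, std::uint32_t value) {
        assert(index < 3);
        const unsigned shift = index * 10u;
        bits = (bits & ~(kMask << shift)) | ((value & kMask) << shift);
    }
};
```

Each write touches only its own 10 bits, which is exactly the guarantee a compiler could not give if a base class's code were unaware of bits owned by a derived class.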

Related

Class static members contributing to program memory footprint even if class is not used

In class I want to have constant array of constant C strings:
.cpp
const char* const Colors::Names[] = {
"red",
"green"
};
.h
class Colors {
public:
static const char* const Names[];
};
The array should be common to all instances of class Colors (even though I plan to have just one instance, it should not matter), hence declaring the array static.
The requirement is that if the class is not instantiated, the array should not consume any memory in the binary file.
However, with the above solution, it does:
.rodata._ZN6Colors5NamesE
0x00000000 0x8
I'm not sure about the C strings themselves, as I cannot find them in the map file, but I assume they consume memory as well.
I know that one solution would be to use constexpr and C++17, where it is no longer necessary to define static constexpr members outside of the class.
However, for some reasons (e.g. higher compilation times in my build system and a slightly higher program memory footprint) I don't want to change the C++ standard version.
Another idea is to drop static (as I plan to have one instance anyway). However, the first issue with this solution is that I have to specify the array size, which I would prefer not to do; otherwise I get:
error: flexible array member 'Colors::Names' in an otherwise empty 'class Colors'
The second issue is that the array is placed in RAM (inside the class object), and only the C strings are placed in FLASH memory.
Does anyone know other solutions to this issue?
PS. My platform is an STM32 MCU and I'm using the GCC ARM compiler.
EDIT (to address some of the answers in comments)
As suggested in comments this can't be done with just static members.
So the question should probably be: how do I create a (non-static) class array member that is placed in read-only memory (not initialized), that occupies memory only if the class is actually used in the program, and that is preferably common to all instances of that class? The array itself is only used from that class.
Some background info:
Let's say the array has 256 entries and each C string is 40 chars. That's 1 kB for the array + 10 kB for the C strings (32-bit architecture). The class is part of a library that is used by different projects (programs). If the class is not used in a project, then I don't want it (and its array) to occupy even a single byte, because I need that FLASH space for other things; therefore compression is not an option.
If there are no other solutions, I will consider having the linker remove unused sections (although I was hoping for a simpler solution).
Thanks for all suggestions.
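One direction that may be worth trying, sketched here under assumptions rather than offered as a verified fix for this toolchain: keep the table as a local static inside an inline member function. A toolchain typically emits such an array only in programs that actually call the function, and no out-of-class definition or explicit array size is needed. The names NameTable, names and name are hypothetical, and whether the data really disappears from an unused build should be confirmed in the map file.

```cpp
#include <cassert>
#include <cstddef>

class Colors {
public:
    // Hypothetical view type: pointer to the table plus its element count.
    struct NameTable {
        const char* const* data;
        std::size_t size;
    };

    // Defined in the class body, so implicitly inline. The array lives
    // inside the function and its size is deduced from the initializer.
    static NameTable names() {
        static const char* const table[] = { "red", "green" };
        const NameTable view = { table, sizeof(table) / sizeof(table[0]) };
        return view;
    }

    // Convenience accessor for a single entry.
    static const char* name(std::size_t index) {
        const NameTable view = names();
        assert(index < view.size);
        return view.data[index];
    }
};
```

Usage would look like Colors::name(0). The table stays const, so on a typical embedded GCC setup it would be expected to land in a read-only section rather than RAM.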

Are classes larger in memory than their members in C++?

Let's say I have some class whose only member is an int. If it wasn't in a class, the int alone would be 4 bytes. Does the class take more than 4 bytes of memory (in C++)?

The decision about how big a class ends up being is implementation-specific and depends on a lot of different factors. Sometimes, due to structure and class padding, a class might end up bigger than the size of its members. If you have any virtual functions in your class, then you'll typically end up with a virtual function table pointer (vtable pointer) at the front of the class that adds a bit of space. And it's entirely possible that the compiler might just For The Heck Of It make your class bigger than the size of its members if it thinks it will help out in some way (or if you have a lazy compiler!)
In your case, with a single 32-bit integer, I'd be surprised if the class ended up being any larger than the integer itself, since you aren't using any virtual functions and there aren't any members to insert padding bytes between. However, you cannot necessarily rely on this across systems.
If you're working on an application where it's absolutely essential that your class be the same size as the fields - perhaps, for example, if you're reading raw bytes and want to reinterpret them as class objects - you could use a static_assert to check for this:
class MyClass {
...
};
static_assert(sizeof(MyClass) == sizeof(int), "MyClass must have the same size as an integer.");
Many compilers have custom options (often through #pragma directives) that you can tune to ensure that classes get sized in a way that you'd like, so you could also consider reading up on that.

The actual size is implementation-dependent, so it can change across different compilers and architectures due to padding and other implementation details. Never trust a simple sum like in the following pseudocode:
size = sizeof(member1) + ... + sizeof(memberN)
Also if the class has virtual functions, yes, it can be more than 4 bytes.
Moreover, in the case of virtual functions and class inheritance the size can be hard to work out at first sight:
Each class that includes virtual functions has a vtable in memory holding pointers to those virtual functions, and each object of the class typically stores a hidden pointer to that table.
A class A with virtual functions that inherits from another class B that also has virtual functions could need more than one table to store both A's and B's function pointers.
See this answer for more details: how to determine sizeof class with virtual functions?
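A small sketch of the effect described in these answers. The three struct names are made up for illustration; the exact sizes are implementation-dependent, so only relationships that hold on common compilers are checked.

```cpp
// A class with a single int member and nothing else.
struct Plain {
    int value;
};

// Adding a non-virtual member function does not add per-object storage.
struct WithMethod {
    int value;
    int get() const { return value; }
};

// Adding virtual functions typically adds a hidden vtable pointer.
struct WithVirtual {
    int value;
    virtual ~WithVirtual() {}
    virtual int get() const { return value; }
};

static_assert(sizeof(WithMethod) == sizeof(Plain),
              "a non-virtual member function should not change the size");
static_assert(sizeof(WithVirtual) > sizeof(Plain),
              "virtual functions are expected to make the object larger");
```

On a typical 64-bit platform the virtual version is usually 16 bytes (an 8-byte pointer, the int, and padding), but that figure is not guaranteed.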

cast void* to a struct with an array member

I'm trying to directly cast a stream of data into a structure that actually has a variable number of other structures as members. Here's an example:
struct player
{
double lastTimePlayed;
double timeJoined;
};
struct team
{
uint32_t numberOfPlayers;
player everyone[];
};
then I call:
team *myTeam = (team*)get_stream();
This should work like some kind of serialization, I know my stream is structured exactly as represented above, but I have the problem of the numberOfPlayers being a variable.
My stream starts with 4 bytes representing the number of players of the team, then it contains each player (in this case, each player has only lastTimePlayed and timeJoined).
The code posted seems to be working. I still get a warning from the compiler because of the default assignment and copy constructors, but my question is whether it's possible to do this some other way, a better way.
BTW, my stream is actually a direct mapping to a file, and my goal is to use the structure as if it was the file itself (that part is working properly).

uint32_t is 4 bytes. If it starts with 8 bytes you want a uint64_t.
If you want to get rid of the warning you can make the default copy and assignment private:
struct team {
// ...
private:
team(const team &);
team &operator=(const team &);
};
Since you'd probably want to pass everything by pointer anyways it'll prevent ever doing an accidental copy.
Casting the mapped pointer to the struct is probably the easiest way. The big thing is to just make sure everything is lining up correctly.

Visual Studio 2012 gives the following:
warning C4200: nonstandard extension used : zero-sized array in struct/union
A structure or union contains an array with zero size.
Level-2 warning when compiling a C++ file and a Level-4 warning when compiling a C file.
This seems to be a legitimate message. I would recommend you to modify your struct to:
struct team
{
uint32_t numberOfPlayers;
player everyone[1];
};
Such a definition is less elegant, but the result will be basically the same. C++ does not check index values. Tons of code uses this.
New development should avoid the "array size violations" where possible. Describing external structures in this way is acceptable.

Both scaryrawr's solution, and yours do the trick, but I was in fact searching for another way.
I did in fact find it. I used a uint32_t everyonePtr instead of the array, then I convert the address of that uint32_t to a pointer using a reinterpret_cast like this:
player *entries = reinterpret_cast<player*>(&myTeam->everyonePtr);
then my mapping will work as expected, and I think it's easier to understand than the array[1] or even the empty one. Thank you guys.
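For comparison, here is a sketch that decodes the same stream layout by copying instead of casting. It does not achieve the in-place mapping the question asks for, but it sidesteps the alignment and zero-sized-array concerns raised in this thread. The function read_team and its parameters are hypothetical, and the stream is assumed to hold a 4-byte count followed by that many player records in host byte order.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

struct player {
    double lastTimePlayed;
    double timeJoined;
};

// Hypothetical helper: decode a team from `data`. Returns an empty vector
// if the buffer is too short for the count or for the records it announces.
std::vector<player> read_team(const unsigned char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<player> players;

    std::uint32_t count = 0;
    if (size < sizeof(count)) return players;
    std::memcpy(&count, data, sizeof(count)); // safe for any alignment

    const std::size_t available = (size - sizeof(count)) / sizeof(player);
    if (count > available) return players;    // truncated stream

    players.resize(count);
    if (count > 0) {
        std::memcpy(players.data(), data + sizeof(count),
                    count * sizeof(player));
    }
    return players;
}
```

The trade-off is one copy of the data in exchange for not depending on how the records happen to be aligned inside the mapping.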

writing structs and classes to disk

The following function writes a struct to a file.
#define PAGESIZE sizeof(BTPAGE)
#define HEADERSIZE 2L
int btwrite(short rrn, BTPAGE *page_ptr)
{
long addr;
addr = (long) rrn * (long) PAGESIZE + HEADERSIZE;
lseek(btfd, addr, 0);
return (write(btfd, page_ptr, PAGESIZE));
}
The following is the struct.
typedef struct {
short keycount; /* number of keys in page */
int key[MAXKEYS]; /* the actual keys */
int value[MAXKEYS]; /* the actual values */
short child[MAXKEYS+1]; /* ptrs to rrns of descendants */
} BTPAGE;
What would happen if I changed the struct to a class, would it still work the same?
If I added class functions, would the size it takes up on disk increase?

There's a lot you need to learn here.
First of all, you're treating a structure as an array of bytes. Copying the bytes of a trivially copyable struct out and back in is allowed, but the byte layout (padding, member sizes, endianness) is implementation-defined, so the file format ends up tied to one compiler and platform. Use proper serialization (for example via boost) instead. Yes, it's tedious. Yes, it's necessary.
Even if you choose to become dependent on some particular compiler implementation (which may change even in the next compiler version), there are still reasons not to do it.
If you save a file on one machine, then load it on another, you may get garbage, because the second machine uses a different float representation, or a different endianness, or has different alignment rules, etc.
If your struct contains any pointers, it's very likely that saving them verbatim then loading them back will result in an address that does not point to any meaningful place.
Typically when you add a member function, this happens:
the function's machine code is stored in a place shared by all the class instances (it wouldn't make sense to duplicate it, since it's logically immutable)
a hidden "this" pointer is passed to the function when it's called, so it knows which object it's been called on.
none of this requires any storage space in the instances.
However, when you add at least one virtual function, the compiler typically needs to also add a hidden pointer in each object to a data chunk called a vtable (read up on it). This makes it possible to call different code depending on the current runtime type of the object (aka polymorphism). So the first virtual function you add to the class likely does increase the object size.

In C++, the difference between a struct and a class is simply that the members and base classes of a struct are public by default, whereas for a class they are private by default.
The technique of simply writing the bytes of the struct to a file and then reading them back in again only works if the struct is a plain old data, or POD, type. If you modify your struct such that it is no longer POD, this technique is not guaranteed to work (the rules describing what makes a POD struct are listed in answers to that linked question).
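One way to have the compiler enforce this, sketched with the C++11 type traits. MAXKEYS is given an assumed value of 4 because the original does not show it, and VirtualPage is a made-up counterexample.

```cpp
#include <type_traits>

const int MAXKEYS = 4; // assumed value; not shown in the question

typedef struct {
    short keycount;          /* number of keys in page */
    int key[MAXKEYS];        /* the actual keys */
    int value[MAXKEYS];      /* the actual values */
    short child[MAXKEYS+1];  /* ptrs to rrns of descendants */
} BTPAGE;

// Fails to compile if BTPAGE ever stops being safe to copy byte by byte.
static_assert(std::is_trivially_copyable<BTPAGE>::value,
              "BTPAGE must be trivially copyable to be written as raw bytes");
static_assert(std::is_standard_layout<BTPAGE>::value,
              "BTPAGE must be standard-layout to be written as raw bytes");

// Hypothetical counterexample: a virtual function breaks the guarantee.
struct VirtualPage {
    short keycount;
    virtual void dump() {}
};
```

Placing the assertions next to the struct definition means that adding a virtual function or a non-trivial member later produces a compile error instead of a corrupted file.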

If the class has any virtual function, then you're in trouble; if no virtual functions, you should still be OK (the same applies to a struct, of course, since it, too, could have virtual functions: the difference between struct and class is just that the default visibility in struct is public, in class it's private).

If you are doing more serialisation of classes, consider using Google Protocol Buffers or something similar; see this question.

C++, statically detect base classes with differing addresses?

If I have a derived class with multiple bases, each this pointer for each base will be different from that of the derived object's this pointer, except for one. Given two types in an inheritance hierarchy, I'd like to detect at compile time whether they share the same this pointer. Something like this should work, but doesn't:
BOOST_STATIC_ASSERT(static_cast<Base1*>((Derived *)0xDEADBEEF) == (Derived*)0xDEADBEEF);
Because it needs to be an 'integral constant expression' and only integer casts are allowed in those according to the standard (which is stupid, because they only need compile time information if no virtual inheritance is being used). The same problem occurs trying to pass the results as integer template parameters.
The best I've been able to do is check at startup, but I need the information during compile (to get some deep template hackery to work).

I don't know how to check what you want, but note that your assumption is false in the presence of empty base classes. Any number of them can share the same offset from the start of the object, as long as they are of different types.

I am trying to solve this exact same issue. I have an implementation that works if you know what member variable is at the beginning of the base class's layout. E.g. if member variable "x" exists at the start of each class, then the following code will work to yield the byte offset of a particular base class layout from the derived class layout: offsetof(derived, base2::x).
In the case of:
struct base1 { char x[16]; };
struct base2 { int x; };
struct derived : public base1, public base2 { int x; };
static const int my_constant = offsetof(derived, base2::x);
The compiler will properly assign "16" to my_constant on my architecture (x86_64).
The difficulty is to get "16" when you don't know what member variable is at the start of a base class's layout.
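When the first member is not known, the offset can at least be measured on a real object at run time, which is the startup check the question mentions rather than the compile-time answer it is after. This is a sketch: base_offset and check_same_address are hypothetical names, and the value it reports for base2 is specific to the ABI in use.

```cpp
#include <cassert>
#include <cstddef>

struct base1 { char x[16]; };
struct base2 { int x; };
struct derived : public base1, public base2 { int x; };

// Byte offset of the Base subobject inside `object`, measured at run time.
template <typename Base, typename Derived>
std::ptrdiff_t base_offset(const Derived& object) {
    // The implicit derived-to-base conversion adjusts the pointer.
    const Base* base = &object;
    return reinterpret_cast<const char*>(base) -
           reinterpret_cast<const char*>(&object);
}

// Startup check: base1 is expected to share the address of derived.
void check_same_address() {
    derived d;
    assert(base_offset<base1>(d) == 0);
}
```

With the classes above, base_offset<base2>(d) would be expected to match the 16 reported for offsetof on x86_64.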

I am not even sure that this offset is a constant in the first place. Do you have normative wording suggesting otherwise?
I'd agree that a non-const offset would be bloody hard to implement in the absence of virtual inheritance, and pointless to boot. That's beside the point.

Classes do not have a this pointer - instances of classes do, and it will be different for each instance, no matter how they are derived.

What about using
BOOST_STATIC_ASSERT(boost::is_convertible<Derived*,Base*>::value)
as documented in the following locations...
http://www.boost.org/doc/libs/1_39_0/doc/html/boost_staticassert.html
http://www.boost.org/doc/libs/1_38_0/libs/type_traits/doc/html/boost_typetraits/reference/is_convertible.html

I didn't realize that the compiler would insert this check at runtime, but your underlying assumption isn't entirely correct. Probably not in ways that you care about, though: the compiler can use the Empty Base Class Optimization if you happen to inherit from more than one empty base class (sizeof is never 0, but an empty base need not take up any space in the derived object). That would result in (base class *)(derived *)1 == at least one other base class.
Like I said, this probably isn't something you would really need to care about.
