I often see structures in code that have a memory reserve at the end.
struct STAT_10K4
{
    int32_t npos; // position number
    ...
    float Plts;
    float Pxts;
    float Plto[NUM];
    uint32_t reserv[(NUM * 3) % 2 + 1];
};
Why do they do this?
Why do some of the reserve sizes depend on constants?
What can happen if you do not make such reserves? Or make a mistake in their size?
This is a form of manual padding of a class to make its size a multiple of some number. In your case:
uint32_t reserv[(NUM * 3) % 2 + 1];
(NUM * 3) % 2 is a roundabout way of writing NUM % 2 (not considering overflow). So when NUM is odd, we pad the struct with one additional uint32_t on top of the + 1 element that is always there; for example, with NUM = 5 and counting only the fields shown, the fields before the reserve occupy 32 bytes, and the two reserve words bring the total to 40. This padding means that STAT_10K4's size is always a multiple of 8 bytes.
You will have to consult the documentation of your software to see why exactly this is done. Perhaps padding this struct with up to 8 bytes makes some algorithm easier to implement. Or maybe it has some perceived performance benefit. But this is pure speculation.
Typically, the compiler will pad your structs to 64-bit boundaries if you use any 64-bit types, so you don't need to do this manually.
Note: This answer is specific to mainstream compilers and x86. Obviously this does not apply to compiling for TI-calculators with 20-bit char & co.
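To see whether the formula really maintains that invariant, a compile-time assertion can capture it. Below is a minimal sketch, assuming NUM is a small positive constant and 4-byte int32_t/float/uint32_t; the field list is abbreviated from the question:

    /* Sketch: verify the multiple-of-8 invariant at compile time (C11). */
    #include <stdint.h>

    #define NUM 5

    struct STAT_10K4_sketch {
        int32_t  npos;
        float    Plts;
        float    Pxts;
        float    Plto[NUM];
        uint32_t reserv[(NUM * 3) % 2 + 1];  /* 2 words if NUM is odd, 1 if even */
    };

    _Static_assert(sizeof(struct STAT_10K4_sketch) % 8 == 0,
                   "manual padding should make the size a multiple of 8");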
This would typically be to support variable-length records. A couple of ways this could be used:
1. If the maximum number of records is known, then a simple structure definition can accommodate all cases.
2. In many protocols there is a "header-data" idiom. The header has a fixed size, but the data is variable and is received as a "blob". The structure of the header can thus be declared and accessed through a pointer to the blob, with the data following on from it. For example:
typedef struct
{
    uint32_t messageId;
    uint32_t dataType;
    uint32_t dataLenBytes;
    uint8_t  data[MAX_PAYLOAD];
} tsMessageFormat;
The data is received as a blob, i.e. a void* ptr and a size_t len.
The buffer pointer is then cast so the message can be read as follows:
tsMessageFormat* pMessage = (tsMessageFormat*) ptr;
for (uint32_t i = 0; i < pMessage->dataLenBytes; i++)
{
    // do something with pMessage->data[i];
}
In some languages the "data" could be specified as an empty record, but C++ does not allow this. Sometimes you will see the "data" member omitted, and you have to perform pointer arithmetic to access the data, as the sketch below shows.
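For illustration, here is a minimal sketch of that pointer-arithmetic variant, with the header declared without the data member (names borrowed from the example above):

    #include <stdint.h>

    typedef struct
    {
        uint32_t messageId;
        uint32_t dataType;
        uint32_t dataLenBytes;
    } tsMessageHeader;

    void handleBlob(void* ptr)
    {
        tsMessageHeader* pHeader = (tsMessageHeader*)ptr;
        /* The payload starts right after the fixed-size header. */
        uint8_t* data = (uint8_t*)ptr + sizeof(tsMessageHeader);

        for (uint32_t i = 0; i < pHeader->dataLenBytes; i++)
        {
            /* do something with data[i] */
        }
    }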
The alternative to this would be to use a builder pattern and/or streams.
Windows uses this pattern a lot; many structures have a cbSize field which allows additional data to be conveyed beyond the structure. The structure accommodates most cases, but having cbSize allows additional data to be provided if necessary.
I want to use and store "Handles" to data in an object buffer to reduce allocation overhead. The handle is simply an index into an array holding the objects. However, I need to detect use-after-reallocation, as this could slip in quite easily. The common approach seems to be using bit fields. However, this leads to 2 problems:
Bit fields are implementation defined
Bit shifting is not portable across big/little endian machines.
What I need:
Store the handle to a file (the file handler can manage either integer types (byte swapping) or byte arrays)
Store 2 values in the handle with minimum space
What I got:
template<class T_HandleDef, typename T_Storage = uint32_t>
struct Handle
{
    typedef T_HandleDef HandleDef;
    typedef T_Storage Storage;
    Handle(): handle_(0){}
private:
    const T_Storage handle_;
};
template<unsigned T_numIndexBits = 16, typename T_Tag = void>
struct HandleDef{
    static const unsigned numIndexBits = T_numIndexBits;
};

template<class T_Handle>
struct HandleAccessor{
    typedef typename T_Handle::Storage Storage;
    typedef typename T_Handle::HandleDef HandleDef;
    static const unsigned numIndexBits = HandleDef::numIndexBits;
    static const unsigned numMagicBits = sizeof(Storage) * 8 - numIndexBits;
    /// "Magic" struct that splits the handle into values
    union HandleData{
        struct
        {
            Storage index : numIndexBits;
            Storage magic : numMagicBits;
        };
        T_Handle handle;
    };
};
An example usage would be:
typedef Handle<HandleDef<24> > FooHandle;
FooHandle Create(unsigned idx, unsigned m){
    HandleAccessor<FooHandle>::HandleData data;
    data.index = idx;
    data.magic = m;
    return data.handle;
}
My goal was to keep the handle as opaque as possible, add a bool check but nothing else. Users of the handle should not be able to do anything with it but pass it around.
So problems I run into:
Reading the inactive union member is UB -> replace its T_Handle by Storage and add a ctor to Handle from Storage
How does the compiler lay out the bit field? I fill the whole union/type, so there should be no padding. So probably the only thing that can differ is which member comes first, depending on endianness, correct?
How can I store handle_ to a file and load it on a machine with possibly different endianness, and still have index and magic be correct? I think I can store the containing Storage 'endian-correct' and get correct values, IF both members occupy exactly half the space (2 shorts in a uint). But I always want more space for the index than for the magic value.
Note: There are already questions about bitfields and unions. Summary:
Bitfields may have unexpected padding (impossible here, as the whole type is occupied)
Order of "members" depends on the compiler (only 2 possible ways here; it should be safe to assume the order depends entirely on endianness, so this may or may not actually help)
A specific binary layout of bits can be achieved by manual shifting (or e.g. wrappers: http://blog.codef00.com/2014/12/06/portable-bitfields-using-c11/) -> not an answer here, as I also need a specific layout of the values IN the bitfield. So I'm not sure what I get if I e.g. create a handle as handle = (magic << numIndexBits) | index and save/load it as binary (no endianness conversion). I'm missing a big-endian machine for testing.
Note: No C++11, but boost is allowed.
The answer is pretty simple (based on another question I forgot the link to, and comments by @Jeremy Friesner):
As "numbers" are already an abstraction in C++, one can be sure to always have the same value behavior when the variable is in a CPU register (when it is used for anything calculation-like). Bit shifts in C++ are also defined on values, independent of endianness: x << 1 always equals x * 2.
The only time one gets endianness problems is when saving to a file, sending/receiving over a network, or accessing the memory differently (e.g. via pointers...).
One cannot use C++ bitfields here, as one cannot be 100% sure about the order of the "entries". Bitfield containers might be OK if they allow access to the data as a "number".
Safest is (still) using bit shifts, which are very simple in this case (only 2 values). During storing/serialization the number must then be stored in an endian-agnostic way.
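A minimal sketch of that approach, assuming 24 index bits and 8 magic bits in a uint32_t (the names are illustrative, not from the question). Packing and unpacking work on values, so they are endian-independent; only serialization needs care, done here byte-wise:

    #include <stdint.h>

    #define NUM_INDEX_BITS 24u
    #define INDEX_MASK ((1u << NUM_INDEX_BITS) - 1u)

    uint32_t make_handle(uint32_t index, uint32_t magic)
    {
        return (magic << NUM_INDEX_BITS) | (index & INDEX_MASK);
    }

    uint32_t handle_index(uint32_t h) { return h & INDEX_MASK; }
    uint32_t handle_magic(uint32_t h) { return h >> NUM_INDEX_BITS; }

    /* Endian-agnostic serialization: always write the least significant
       byte first, regardless of host byte order. */
    void store_handle(uint32_t h, uint8_t out[4])
    {
        out[0] = (uint8_t)(h);
        out[1] = (uint8_t)(h >> 8);
        out[2] = (uint8_t)(h >> 16);
        out[3] = (uint8_t)(h >> 24);
    }

    uint32_t load_handle(const uint8_t in[4])
    {
        return (uint32_t)in[0] | ((uint32_t)in[1] << 8)
             | ((uint32_t)in[2] << 16) | ((uint32_t)in[3] << 24);
    }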
I have an embedded device connected to a PC,
and some big struct S with many fields and arrays of a custom-defined type FixedPoint_t.
FixedPoint_t is a templated POD class with exactly one data member, which varies in size from char to long depending on the template params. In any case it passes static_assert((std::is_pod<FixedPoint_t<0,8,8> >::value == true), "");
It would be good if this big struct had a compatible underlying memory representation on both the embedded system and the controlling PC. This allows a significant simplification of the communication protocol, down to commands like "set the word/byte at offset N to value V". Assume endianness is the same on both platforms.
I see 3 solutions here:
1. Use something like #pragma pack on both sides.
BUT I get a warning when I put __attribute__((packed)) on the struct S declaration:
warning: ignoring packed attribute because of unpacked non-POD field.
This is because FixedPoint_t is not declared as packed.
I don't want to declare it as packed, because this type is widely used in the whole program and packing could lead to a performance drop.
2. Do proper struct serialization. This is not acceptable because of code bloat, additional RAM usage... The protocol would also be more complicated, because I need random access into the struct. Right now I think this is not an option.
3. Control the padding manually. I can add some fields and reorder others... just to achieve no padding on both platforms. This will satisfy me for the moment, but I need a good way to write a test that shows me whether padding is there or not.
I can compare the sum of sizeof() of each field to sizeof(struct).
I can compare offsetof() of each struct field on both platforms.
Both variants are ugly enough...
What do you recommend? I am especially interested in manual padding control and automatic padding detection in tests.
EDIT: Is it sufficient to compare sizeof(big struct) on the two platforms to detect layout compatibility (assuming endianness is equal)? I think the sizes should not match if the padding differs.
EDIT2:
//this struct should have padding on a 32-bit machine
//and no padding on an 8-bit one
typedef struct
{
    uint8_t f8;
    uint32_t f32;
    uint8_t arr[5];
} serialize_me_t;

//count of members in the struct
#define SERTABLE_LEN 3

//one table entry for each serialize_me_t data member
static const struct {
    size_t width;
    size_t offset;
    // size_t cnt; //why do we need cnt?
} ser_des_table[SERTABLE_LEN] =
{
    { sizeof(serialize_me_t::f8),  offsetof(serialize_me_t, f8)  },
    { sizeof(serialize_me_t::f32), offsetof(serialize_me_t, f32) },
    { sizeof(serialize_me_t::arr), offsetof(serialize_me_t, arr) },
};
void serialize(const void* serialize_me_ptr, char* buf)
{
    const char* struct_ptr = (const char*)serialize_me_ptr;
    for (int i = 0; i < SERTABLE_LEN; i++)
    {
        // copy from the member's offset; offsets are absolute, not cumulative
        memcpy(buf, struct_ptr + ser_des_table[i].offset, ser_des_table[i].width);
        buf += ser_des_table[i].width;
    }
}
I strongly recommend using option 2:
You are safe for future changes (new PCD/ABI, compiler, platform, etc.).
Code bloat can be kept to a minimum if well thought out; there is just one function needed per direction.
You can create the required tables/code (semi-)automatically (I use Python for such things). This way both sides stay in sync.
You should definitely add a CRC to the data anyway. As you likely do not want to calculate it in the rx/tx interrupt, you'll have to provide an array anyway.
Using a struct directly will soon become a maintenance nightmare. Even worse if someone else has to maintain this code.
Protocols, etc. tend to be reused. If a future platform has different endianness, the struct approach goes bang.
To create the data structs and ser/des tables, you can use offsetof to get the offset of each field in the struct. If that table is made an include file, it can be used on both sides. You can even create the struct and the table, e.g. by a Python script. Adding that to the build process ensures they are always up to date, and you avoid additional typing.
For instance (in C, just to get the idea):
// protocol.inc
typedef struct {
    uint32_t i;
    uint16_t s[5];
    uint32_t j;
} ProtocolType;

static const struct {
    size_t width;
    size_t offset;
    size_t cnt;
} ser_des_table[] = {
    { sizeof(((ProtocolType*)0)->i),    offsetof(ProtocolType, i), 1 },
    { sizeof(((ProtocolType*)0)->s[0]), offsetof(ProtocolType, s), 5 },
    ...
};
If not created automatically, I'd use macros to generate the data, possibly by including the file twice: once to generate the struct definition and once for the table. This is possible by redefining the macros in between, as sketched below.
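As an illustration of the include-twice idea, here is one possible shape (the file name protocol_fields.inc and the FIELD macro are assumptions; scalars are declared as one-element arrays to keep the macro uniform):

    #include <stddef.h>
    #include <stdint.h>

    /* protocol_fields.inc contains only FIELD(...) lines, e.g.:
         FIELD(uint32_t, i, 1)
         FIELD(uint16_t, s, 5)
         FIELD(uint32_t, j, 1)
    */

    /* First expansion: the struct definition. */
    #define FIELD(type, name, cnt) type name[cnt];
    typedef struct {
    #include "protocol_fields.inc"
    } ProtocolType;
    #undef FIELD

    /* Second expansion: the ser/des table, guaranteed in sync. */
    #define FIELD(type, name, cnt) \
        { sizeof(type), offsetof(ProtocolType, name), cnt },
    static const struct {
        size_t width;
        size_t offset;
        size_t cnt;
    } ser_des_table[] = {
    #include "protocol_fields.inc"
    };
    #undef FIELD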
You should take care with the representation of signed integers and floats (implementation-defined; floats are likely IEEE 754, as proposed by the standard).
As an alternative to the width field, you can use a "type" code (e.g. a char which maps to an implementation-defined type). This way you can add custom types with the same width but different encodings (e.g. uint32_t and an IEEE 754 float). This will completely abstract the network protocol encoding from the physical machine (the best solution). Note that nothing hinders you from using common encodings which do not complicate the code a single bit (literally).
I am using a 32-bit Linux OS
and the GCC compiler.
I tried three different kinds of structure.
In the first structure I defined only one char variable. The size of this structure is 1, which is correct.
In the second structure I defined only one int variable. Here the size of the structure shows as 4, which is also correct.
But in the third structure, when I define one char and one int, the total size should be 5, yet the output shows 8. Can anyone please explain how a structure is laid out?
typedef struct struct_size_tag
{
    char c;
    //int i;
} struct_size;

int main()
{
    printf("Size of structure:%zu\n", sizeof(struct_size));
    return 0;
}

Output: Size of structure:1

typedef struct struct_size_tag
{
    //char c;
    int i;
} struct_size;

int main()
{
    printf("Size of structure:%zu\n", sizeof(struct_size));
    return 0;
}

Output: Size of structure:4

typedef struct struct_size_tag
{
    char c;
    int i;
} struct_size;

int main()
{
    printf("Size of structure:%zu\n", sizeof(struct_size));
    return 0;
}

Output: Size of structure:8
The difference in size is due to alignment. The compiler is free to insert padding bytes, so the total size of a structure is not necessarily the sum of the sizes of its individual elements.
If the padding of a structure is undesired, because it has to match some hardware layout (or for other reasons), compilers usually support packing structures, which disables the padding.
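A small sketch that makes the padding visible, using offsetof and the widely supported (but non-standard) #pragma pack; the sizes in the comments are typical for a 32-bit GCC target like the one in the question:

    #include <stdio.h>
    #include <stddef.h>

    struct padded   { char c; int i; };   /* typically 8: 3 padding bytes after c */

    #pragma pack(push, 1)
    struct packed_s { char c; int i; };   /* 5: padding disabled */
    #pragma pack(pop)

    int main(void)
    {
        printf("offsetof(padded, i) = %zu\n", offsetof(struct padded, i));  /* typically 4 */
        printf("sizeof(padded)      = %zu\n", sizeof(struct padded));       /* typically 8 */
        printf("sizeof(packed_s)    = %zu\n", sizeof(struct packed_s));     /* 5 */
        return 0;
    }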
You are definitely seeing data structure alignment:
"Data alignment means putting the data at a memory offset equal to some
multiple of the word size, which increases the system's performance
due to the way the CPU handles memory. To align the data, it may be
necessary to insert some meaningless bytes between the end of the last
data structure and the start of the next, which is data structure
padding."
For more, take a look at this, Data Structure Alignment
The C standard allows a compiler to add padding bytes to structs after any field, to allow the following field to be aligned according to any specific requirements of the compiler (or of the user of the compiler). The standard does not specify how, but typically a compiler will provide a command-line argument to set the (default) alignment. Good compilers also invariably support the de facto standard #pragma pack, including the push and pop options.
Padding bytes improve performance by reducing the number of memory accesses required for suitably aligned data types. For example, on a 32-bit processor (more specifically, a system whose memory has 32 data lines), reading or writing a 32-bit integer requires two memory accesses instead of one if the value crosses a 4-byte boundary (i.e., unless the bottom two bits of its address are zero).
See My Blog Post for more details (better than the Wikipedia article).
The magic word is padding / memory alignment; see data structure alignment.
Okay, allow me to re-ask the question, as none of the answers got at what I was really interested in (apologies if wholesale editing of the question like this is a faux pas).
A few points:
This is offline analysis with a different compiler than the one I'm testing, so sizeof() or similar won't work for what I'm doing.
I know it's implementation-defined, but I happen to know the implementation that is of interest to me, which is below.
Let's make a function called pack, which takes as input an integer, called alignment, and a tuple of integers, called elements. It outputs another integer, called size.
The function works as follows:
int pack (int alignment, int[] elements)
{
    total_size = 0;
    foreach (element in elements)
    {
        while (total_size % min(alignment, element) != 0) { ++total_size; }
        total_size += element;
    }
    while (total_size % alignment != 0) { ++total_size; }
    return total_size;
}
I think what I want to ask is "what is the inverse of this function?", but I'm not sure whether inversion is the correct term--I don't remember ever dealing with inversions of functions with multiple inputs, so I could just be using a term that doesn't apply.
Something like what I want (sort of) exists; here I provide pseudo code for a function we'll call determine_align. The function is a little naive, though, as it just calls pack over and over again with different inputs until it gets an answer it expects (or fails).
int determine_align(int actual_size, int[] elements)
{
    for (packing = 1, 2, 4, ..., 64) // expected answers
    {
        size_at_cur_packing = pack(packing, elements);
        if (actual_size == size_at_cur_packing)
        {
            return packing;
        }
    }
    return unknown;
}
So the question is, is there a better implementation of determine_align?
Thanks,
Alignment of struct members in C/C++ is entirely implementation-defined. There are a few guarantees there, but I don't see how they would help you.
Thus, there's no generic way to do what you want. In the context of a particular implementation, you should refer to the documentation of that implementation that covers this (if it is covered).
When choosing how to pack members into a struct, an implementation doesn't have to follow the sort of scheme you describe in your algorithm, although it is a common one (i.e., the minimum of the sizeof of the type being aligned and the preferred machine alignment).
You don't have to compare the overall size of a struct to determine the padding applied to individual struct members, though. The standard macro offsetof will give you the byte offset of any individual member from the start of the struct.
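For example, a short sketch (the struct here is chosen only for illustration) that prints the offsets, from which the per-member padding can be read off directly:

    #include <stdio.h>
    #include <stddef.h>

    struct example { char a; short b; char c; int d; };

    int main(void)
    {
        printf("a at %zu, b at %zu, c at %zu, d at %zu, total %zu\n",
               offsetof(struct example, a), offsetof(struct example, b),
               offsetof(struct example, c), offsetof(struct example, d),
               sizeof(struct example));
        /* Gaps between consecutive offsets (and after d) are the padding. */
        return 0;
    }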
I let the compiler do the alignment for me.
In gcc,
typedef struct _foo
{
    u8  v1 __attribute__((aligned(4)));
    u16 v2 __attribute__((aligned(4)));
    u32 v3 __attribute__((aligned(8)));
    u8  v4 __attribute__((aligned(4)));
} foo;
Edit: Note that sizeof(foo) will return the correct value including any padding.
Edit2: And offsetof(foo, v2) also works. Given these two functions/macros, you can figure out everything you need to know about the layout of the struct in memory.
I'm honestly not sure what you're trying to do, and I'm probably completely misunderstanding what you're looking for, but if you want to simply determine what the alignment requirement of a struct is, the following macro might be helpful:
#define ALIGNMENT_OF( t ) offsetof( struct { char x; t test; }, test )
To determine the alignment of your foo structure, you can do:
ALIGNMENT_OF(foo);
If this isn't what you're ultimately trying to do, it might be possible that the macro will help in whatever algorithm you do come up with.
You need to pad based on the alignment of the next field and then pad the last element based on the maximum alignment you've seen in the struct. Note that the actual alignment of a field is the minimum of its natural alignment and the packing for that struct. I.e., if you have a struct packed at 4 bytes, a double will be aligned to 4 bytes, even though its natural alignment is 8.
You can make the inner loop faster with total_size += (align - total_size % align) % align, where align = min(alignment, element.size). If align is a power of two, this reduces to total_size = (total_size + align - 1) & ~(align - 1), as sketched below.
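Putting that together, a sketch of pack() with the usual power-of-two round-up (this assumes, as the pseudocode above does, that each element's natural alignment equals its size and that alignments are powers of two):

    #include <stddef.h>

    /* Round n up to the next multiple of align (align must be a power of two). */
    static size_t round_up(size_t n, size_t align)
    {
        return (n + align - 1) & ~(align - 1);
    }

    size_t pack(size_t alignment, const size_t *elements, size_t count)
    {
        size_t total = 0;
        size_t max_align = 1;
        for (size_t i = 0; i < count; ++i)
        {
            /* Actual alignment = min(natural alignment, packing). */
            size_t a = elements[i] < alignment ? elements[i] : alignment;
            if (a > max_align) max_align = a;
            total = round_up(total, a) + elements[i];
        }
        /* Trailing padding is based on the largest alignment seen. */
        return round_up(total, max_align);
    }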
If the problem is just that you want to guarantee a particular alignment, that is easy. For a particular alignment=2^n:
void* p = malloc(sizeof(_foo) + alignment - 1);
p = (void*)(((uintptr_t)p + alignment - 1) & ~(uintptr_t)(alignment - 1));
I've neglected to save the original p returned from malloc. If you intend to free this memory, you need to save that pointer somewhere.
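A sketch of the same idiom with the original pointer kept around so it can be freed later (the helper name is made up):

    #include <stdlib.h>
    #include <stdint.h>

    /* Over-allocate, round up, and hand back the raw pointer for free(). */
    void* alloc_aligned(size_t size, size_t alignment, void** raw_out)
    {
        void* raw = malloc(size + alignment - 1);
        if (raw == NULL) return NULL;
        *raw_out = raw;  /* pass this, not the aligned pointer, to free() */
        uintptr_t aligned = ((uintptr_t)raw + alignment - 1)
                          & ~(uintptr_t)(alignment - 1);
        return (void*)aligned;
    }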
I'm not sure what you want to achieve here. As Pavel Minaev said, alignment is handled by the compiler, which in turn is constrained by the platform's Application Binary Interface for data that is made accessible to code compiled by a different compiler. The following paper discusses the problem in the context of a compiler that needs to implement calling conventions:
Christian Lindig and Norman Ramsey. Declarative Composition of Stack Frames. In Evelyn Duesterwald, editor, Proc. of the 14th International Conference on Compiler Construction, Springer, LNCS 2985, 2004.
I am refactoring some old code and have found a few structs containing zero-length arrays (below). Warnings are suppressed by a pragma, of course, but I've failed to create new structures containing such structs (error C2233). The array 'byData' is used as a pointer, but why not use a pointer instead? Or an array of length 1? And of course, no comments were added to make me enjoy the process...
Any reasons to use such a thing? Any advice on refactoring it?
struct someData
{
    int nData;
    BYTE byData[0];
};
NB: It's C++, Windows XP, VS 2003.
Yes, this is a C hack.
To create an array of any length:
struct someData* mallocSomeData(int size)
{
    struct someData* result = (struct someData*)malloc(sizeof(struct someData) + size * sizeof(BYTE));
    if (result)
    {
        result->nData = size;
    }
    return result;
}
Now you have an object of someData with an array of a specified length.
There are, unfortunately, several reasons why you would declare a zero length array at the end of a structure. It essentially gives you the ability to have a variable length structure returned from an API.
Raymond Chen did an excellent blog post on the subject. I suggest you take a look at this post because it likely contains the answer you want.
Note in his post, it deals with arrays of size 1 instead of 0. This is the case because zero length arrays are a more recent entry into the standards. His post should still apply to your problem.
http://blogs.msdn.com/oldnewthing/archive/2004/08/26/220873.aspx
EDIT
Note: Even though Raymond's post says 0-length arrays are legal in C99, they are in fact still not legal in C99. Instead of a 0-length array, you should be using a length-1 array here.
This is an old C hack to allow flexible-sized arrays.
In the C99 standard this is not necessary, as it supports the arr[] syntax (flexible array members).
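For reference, a sketch of the C99 flexible-array-member version of the struct from the question (makeSomeData is a made-up name):

    #include <stdlib.h>

    struct someData99
    {
        int nData;
        unsigned char byData[];  /* C99 flexible array member */
    };

    struct someData99* makeSomeData(int size)
    {
        struct someData99* p = malloc(sizeof(struct someData99) + size);
        if (p) p->nData = size;
        return p;
    }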
Your intuition about "why not use an array of size 1" is spot on.
The code is doing the "C struct hack" wrong, because declarations of zero length arrays are a constraint violation. This means that a compiler can reject your hack right off the bat at compile time with a diagnostic message that stops the translation.
If we want to perpetrate a hack, we must sneak it past the compiler.
The right way to do the "C struct hack" (which is compatible with C dialects going back to 1989 ANSI C, and probably much earlier) is to use a perfectly valid array of size 1:
struct someData
{
    int nData;
    unsigned char byData[1];
};
Moreover, instead of sizeof struct someData, the size of the part before byData is calculated using:
offsetof(struct someData, byData);
To allocate a struct someData with space for 42 bytes in byData, we would then use:
struct someData *psd = (struct someData *) malloc(offsetof(struct someData, byData) + 42);
Note that this offsetof calculation is in fact the correct calculation even in the case of the array size being zero. You see, sizeof the whole structure can include padding. For instance, if we have something like this:
struct hack {
unsigned long ul;
char c;
char foo[0]; /* assuming our compiler accepts this nonsense */
};
The size of struct hack is quite possibly padded for alignment because of the ul member. If unsigned long is four bytes wide, then quite possibly sizeof (struct hack) is 8, whereas offsetof(struct hack, foo) is almost certainly 5. The offsetof method is the way to get the accurate size of the preceding part of the struct just before the array.
So that would be the way to refactor the code: make it conform to the classic, highly portable struct hack.
Why not use a pointer? Because a pointer occupies extra space and has to be initialized.
There are other good reasons not to use a pointer, namely that a pointer requires an address space in order to be meaningful. The struct hack is externalizeable: that is to say, there are situations in which such a layout conforms to external storage such as areas of files, packets or shared memory, in which you do not want pointers because they are not meaningful.
Several years ago, I used the struct hack in a shared memory message passing interface between kernel and user space. I didn't want pointers there, because they would have been meaningful only to the original address space of the process generating a message. The kernel part of the software had a view to the memory using its own mapping at a different address, and so everything was based on offset calculations.
It's worth pointing out, IMO, the best way to do the size calculation, which is the one used in the Raymond Chen article linked above.
struct foo
{
    size_t count;
    int data[1];
};
size_t foo_size_from_count(size_t count)
{
    return offsetof(foo, data[count]);
}
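A possible usage sketch for this helper (make_foo is a made-up name, assuming <stdlib.h> for malloc):

    /* Allocate a foo with room for n ints. */
    struct foo* make_foo(size_t n)
    {
        struct foo* p = (struct foo*)malloc(foo_size_from_count(n));
        if (p) p->count = n;
        return p;
    }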
The offset of the first entry off the end of desired allocation, is also the size of the desired allocation. IMO it's an extremely elegant way of doing the size calculation. It does not matter what the element type of the variable size array is. The offsetof (or FIELD_OFFSET or UFIELD_OFFSET in Windows) is always written the same way. No sizeof() expressions to accidentally mess up.