actual structure size - c++

How can I know the actual structure size?
using sizeof returns the number of bytes after alignment.
For example :
struct s {
char c;
int i
}
sizeof(s) = 8;
I am interested to get the size of bytes without alignment ,i.e 5

The actual size of structure is size of structure after alignment. I mean sizeof return size of structure in the memory. if you want to have an structure that aligned on byte you should tell the compiler to do that. for example something like this:
#ifdef _MSVC
# pragma pack(push)
# pragma pack(1)
# define PACKED
#else
# define PACKED __attribute((packed))
#endif
struct byte_aligned {
char c;
int i;
} PACKED;
#ifdef _MSVC
# pragma pack(pop)
#endif
now sizeof(byte_aligned) return 5 and its actually 5 byte in memory

sizeof returns the actual structure size.
The size without padding is a bit meaningless, because the compiler will insert the padding it feels appropriate. Some compilers support pragmas that turn off structure padding, in which case you should use that, but you'll probably get all sorts of fun things happening.

You have misunderstood the documentation.
using sizeof returns the number of bytes after alignment
This is incorrect. sizeof returns the number of bytes after padding, which is not the same thing.
Thus, if what you want to know is "how much space will my struct take", sizeof() is telling you the correct answer.

Related

Why is this struct not the size I expect?

I am taking binary input from a file to a buffer vector then casting the pointer of that buffer to be my struct type.
The goal is for the data to populate the struct perfectly.
I know the size of all the various fields and the order they're going to come in.
As a result my struct needs to be tightly packed and be 42 bytes long.
My issue is that it is coming out at 44 bytes long when I test it.
Also, the first value lines up. After that, the data is incorrect.
Here's the struct:
#pragma pack(push, 1)
struct myStruct
{
uint8_t ID;
uint32_t size: 24;
uint16_t value;
char name[12];
char description[4];
char shoppingList[14];
char otherValue[6];
};
#pragma pack(pop)
Also, the first value lines up. After that, the data is incorrect.
uint32_t size: 24;
If you want to guarantee portably that this is three bytes with no padding before the next member, you're going to need to use a byte buffer and do the conversions yourself.
#pragma pack is an extension, and the packing of bitfield members is anyway implementation-defined.
FWIW both GCC and CLANG do seem to do what you want in this case, but unless it's defined by a platform ABI depending on this is still brittle.

why size of structure does not change when use 24 bit integer

I am trying to port embed code in windows platform.
I have come across below problem I am posting a sample code here.
here even after I use int24 size remains 12 bytes in windows why ?
struct INT24
{
INT32 data : 24;
};
struct myStruct
{
INT32 a;
INT32 b;
INT24 c;
};
int _tmain(int argc, _TCHAR* argv[])
{
unsigned char myArr[11] = { 0x00,0x00,0x00,0x00, 0x00,0x00,0x00,0x00, 0xFF,0xFF,0xFF };
myStruct *p = (myStruct*)myArr;
cout << sizeof(*p);
}
There are two reasons, each of which would be enough by themselves.
Presumably the size of INT32 is 4 bytes. The size of INT24 is also 4 bytes, because it contains a INT32 bit field. Since, myStruct contains 3 members of size 4, its size must therefore be at least 12.
Presumably the alignment requirement of INT32 is 4. So, even if the size of INT24 were 3, the size of myStruct would still have to be 12, because it must have at least the alignment requirement of INT32 and therefore the size of myStruct must be padded to the nearest multiple of 4.
any way or workaround ?
This is implementation specific, but the following hack may work for some compilers/cpu combinations. See the manual of your compiler for the syntax for similar feature, and the manual for your target cpu whether it supports unaligned memory access. Also do realize that unaligned memory access does have a performance penalty.
#pragma pack(push, 1)
struct INT24
{
INT32 data : 24;
};
#pragma pack(pop)
#pragma pack(push, 1)
struct myStruct
{
INT32 a;
INT32 b;
INT24 c;
};
#pragma pack(pop)
Packing a bit field might not work the same in all compilers. Be sure to check how yours behaves.
I think that a standard compliant way would be to store char arrays of sizes 3 and 4, and whenever you need to read or write one of the integer, you'd have to std::memcpy the value. That would be a bit burdensome to implement and possibly also slower than the #pragma pack hack.
Sadly for you, the compiler in optimising the code for a particular architecture reserves the right to pad out the structure by inserting spaces between members and even at the end of the structure.
Using a bit field does not reduce the size of the struct; you still get the whole of the "fielded" type in the struct.
The standard guarantees that the address of the first member of a struct is the same as the address of the struct, unless it's a polymorphic type.
But all is not lost: you can rely on the fact that an array of char will always be contiguous and contain no packing.
If CHAR_BIT is defined as 8 on your system (it probably is), you can model an array of 24 bit types on an array of char. If it's not 8 then even this approach will not work: I'd then suggest resorting to inline assembly.

size of struct-array

If I have a struct A defined as:
struct A {
char* c;
float f;
int i;
};
and an array
A col[5];
then why is
sizeof(*(col+0))
16?
On your platform, 16 bytes are required to hold that structure, the structure being of type A.
You should keep in mind that *(col+0) is identical to col[0] so it's only one of the structure, not the entire array of them. If you wanted the size of the array, you would use sizeof(col).
Possibly because:
you are on a 64-bit platform and char* takes 8 bytes while int and float take 4 bytes,
you are on a 32-bit platform and char* takes 4 bytes but your compiler decided that the array would be faster if it dropped 4 bytes of padding there. Padding can be controlled on most compilers by #pragma pack(push,1) and #pragma pack(pop) respectively.
If you want to be sure, you can use offsetof (on GCC) or create an object and examine the addresses of its member fields to inspect which fields got actually padded and how much.
For starters, your original declaration was incorrect (this has now been fixed in a question edit). A is the name of the type; to declare an array named col, you want
A col[5];
not
col A[5];
sizeof(*(col+0)) is the same as sizeof col[0], which is the same as sizeof (A).
It's 16 because that's the size of that structure, for the compiler and system you're using (you haven't mentioned what it is).
I take it from the question that you were expecting something different, but you didn't say so.
Compilers may insert padding bytes between members, or after the last member, to ensure that each member is aligned properly. I find 16 bytes to be an unsurprising size for that structure on a 64-bit system -- and in this particular case, it's probably that no padding is even required.
And in case you weren't aware, sizeof yields a result in bytes, where a byte is usually (but not always) 8 bits.
Your problem is most likely that your processor platform uses 8-byte alignment on floats. So, your char* will take 4 (assuming you're on a 32-bit system) since it's a pointer which is an address. Your float will take 8, and your int will take another 4 which totals 16 bytes.
Compilers will often make certain types align on certain byte boundaries in order to speed up computation on the hardware platform in use.
For example, if you did:
struct x {
char y;
int z;
};
Your system would (probably) say the size of x was 8, padding the char out to an int inside the structure.
You can add pragmas (implementation dependent) to stop this:
#pragma pack(1)
struct x {
char y;
int z;
};
#pragma pack(0)
which would make the size of this equal to 5.
Edit: There seem to be two parts to this question. "Why is sizeof(A) equal to 16?" On balance, I see now that this is probably the question that was intended. Instead I am answering the second part, i.e. "Why is sizeof(*(col+0)) == sizeof(A)?"
col is an array. col + 0 is meaningless for arrays, so the compiler must convert col to a pointer first. Then col is effectively just an A*. Adding zero to a pointer changes nothing. Finally, you dereference it with * and are left with a simple A of size 16.
In short, sizeof(A) == sizeof(*(col+0))
PS: I have not addressed the question "Why does that one element of the array take up 16 bytes?" Others have answered that well.
On a modern x86-64 processor, char* is 8 bytes, float is 4 bytes, int is 4 bytes. So the sizes of the members added together is 16. What else would you be expecting? Did someone tell you a pointer is 4 bytes? Because that's only true for x86-32.

struct size is different from typedef version?

I have the following struct declaration and typedef in my code:
struct blockHeaderStruct {
bool allocated;
unsigned int length;
};
typedef struct blockHeaderStruct blockHeader;
When I do sizeof(blockheader), I get the value of 4 bytes back, but when I do sizeof(struct blockHeaderStruct), I get 8 bytes.
Why is this happening? Why am I not getting 5 back instead?
Firstly, you cannot do sizeof(blockHeaderStruct). That simply will not compile. What you can do is sizeof(struct blockHeaderStruct), which could indeed give you 8 bytes as result.
Secondly, getting a different result from sizeof(blockheader) is highly unlikely. Judging by your reference to sizeof(blockHeaderStruct) (which, again, will not even compile) your description of the problem is inaccurate. Take a closer look at what is it you are really doing. Most likely, you are taking a sizeof of a pointer type (which gives you 4), not a struct type.
In any case, try posting real code.
Looking at the definition of your struct, you have 1 byte value followed by 4 byte Integer. This integer needs to be allocated on 4 byte boundary, which will force compiler to insert a 3 byte padding after your 1 byte bool. Which makes the size of struct to 8 byte. To avoid this you can change order of elements in the struct.
Also for two sizeof calls returning different values, are you sure you do not have a typo here and you are not taking size of pointer or different type or some integer variable.
The most likely scenario is that you are actually looking at the size of a pointer, not the struct, on a 32-bit system.
However, int may be 2 bytes (16 bits). In that case, the expected size of the structure is 4:
2 bytes for the int
1 byte for the bool
round up to the next multiple of 2, because the size of the struct is usually rounded to a multiple of the size of its largest primitive member.
Nothing could explain sizeof(blockHeaderStruct) != sizeof(struct blockHeader), though, given that typedef. That is completely impossible.
Struct allocation normally occurs on a 4 byte boundary. That is the compiler will pad data types within a struct up to the 4 byte boundary before starting with the next data type. Given that this is c++ (bool is a sizeof 1) and not c (bool needs to be #define as something)
struct blockHeaderStruct {
bool allocated; // 1 byte followed by 3 pad bytes
unsigned int length; // 4 bytes
};
typedef struct blockHeaderStruct blockHeader;
typedef struct blockHeaderStruct *blockHeaderPtr;
A sizeof operation would result:
sizeof(blockHeader) == 8
sizeof(struct blockHeader) == 8
sizeof(blockHeaderPtr) == 4
(Note: The last entry will be 8 for a
64 bit compiler. )
There should be no difference in sizes between the first two lines of code. A typedef merely assigns an alias to an existing type. The third is taking the sizeof a pointer which is 4 bytes in a 32 bit machine and 8 bytes on a 64 bit machine.
To fix this, simply apply the #pragma pack directive before a structure is defined. This forces the compiler to pack on the specified boundary. Usually set as 1,2,or 4 (although 4 is normally the default and doesn't need to be set).
#include <stddef.h>
#include <stdio.h>
#pragma pack(1)
struct blockHeaderStruct {
bool allocated;
unsigned int length;
};
typedef struct blockHeaderStruct blockHeader;
int main()
{
printf("sizeof(blockHeader) == %li\n", sizeof(blockHeader));
printf("sizeof(struct blockHeader) == %li\n", sizeof(struct blockHeaderStruct));
return 0;
}
Compiled with g++ (Ubuntu
4.4.1-4ubuntu9) 4.4.1
Results in:
sizeof(blockHeader) == 5
sizeof(structblockHeader) == 5
You don't normally need this directive. Just remember to pack your structs efficiently. Group smaller data types together. Do not alternate < 4 byte datatypes and 4 byte data types as your structs will be mostly unused space. This can cause unnecessary bandwidth for network related applications.
I actually copied that snippet direct
from my source file.
OK.
When I do sizeof(blockheader), I get
the value of 4 bytes back
It looks like blockheader is typedef'ed somewhere and its type occupies 4 bytes or its type requires 4 byte alignment.
If you try sizeof(blockHeader) you'll get your type.
when I do sizeof(blockHeaderStruct), I
get 8 bytes.
The reason why alignment matters is that if you need an array of blockHeaders then you can compute how much memory you need. Or if you have an array and need to compute how much memory to copy, you can compute it.
If you want to align all struct members to addresses that are multiples of 1 instead of 4 or instead of your compiler's defaults, your compiler might offer a #pragma to do it. Then you'll save memory but accesses might be slower in order to access unaligned data.
Some compilers will optimize to always allocate data by powers of 2 (4 stays 4, but 5 is rounded up to 8).

C++ Data Member Alignment and Array Packing

During a code review I've come across some code that defines a simple structure as follows:
class foo {
unsigned char a;
unsigned char b;
unsigned char c;
}
Elsewhere, an array of these objects is defined:
foo listOfFoos[SOME_NUM];
Later, the structures are raw-copied into a buffer:
memcpy(pBuff,listOfFoos,3*SOME_NUM);
This code relies on the assumptions that: a.) The size of foo is 3, and no padding is applied, and b.) An array of these objects is packed with no padding between them.
I've tried it with GNU on two platforms (RedHat 64b, Solaris 9), and it worked on both.
Are the assumptions above valid? If not, under what conditions (e.g. change in OS/compiler) might they fail?
It would definitely be safer to do:
sizeof(foo) * SOME_NUM
An array of objects is required to be contiguous, so there's never padding between the objects, though padding can be added to the end of an object (producing nearly the same effect).
Given that you're working with char's, the assumptions are probably right more often than not, but the C++ standard certainly doesn't guarantee it. A different compiler, or even just a change in the flags passed to your current compiler could result in padding being inserted between the elements of the struct or following the last element of the struct, or both.
If you copy your array like this you should use
memcpy(pBuff,listOfFoos,sizeof(listOfFoos));
This will always work as long as you allocated pBuff to the same size.
This way you are making no assumptions on padding and alignment at all.
Most compilers align a struct or class to the required alignment of the largest type included. In your case of chars that means no alignment and padding, but if you add a short for example your class would be 6 bytes large with one byte of padding added between the last char and your short.
I think the reason that this works because all of the fields in the structure are char which align one. If there is at least one field that does not align 1, the alignment of the structure/class will not be 1 (the alignment will depends on the field order and alignment).
Let see some example:
#include <stdio.h>
#include <stddef.h>
typedef struct {
unsigned char a;
unsigned char b;
unsigned char c;
} Foo;
typedef struct {
unsigned short i;
unsigned char a;
unsigned char b;
unsigned char c;
} Bar;
typedef struct { Foo F[5]; } F_B;
typedef struct { Bar B[5]; } B_F;
#define ALIGNMENT_OF(t) offsetof( struct { char x; t test; }, test )
int main(void) {
printf("Foo:: Size: %d; Alignment: %d\n", sizeof(Foo), ALIGNMENT_OF(Foo));
printf("Bar:: Size: %d; Alignment: %d\n", sizeof(Bar), ALIGNMENT_OF(Bar));
printf("F_B:: Size: %d; Alignment: %d\n", sizeof(F_B), ALIGNMENT_OF(F_B));
printf("B_F:: Size: %d; Alignment: %d\n", sizeof(B_F), ALIGNMENT_OF(B_F));
}
When executed, the result is:
Foo:: Size: 3; Alignment: 1
Bar:: Size: 6; Alignment: 2
F_B:: Size: 15; Alignment: 1
B_F:: Size: 30; Alignment: 2
You can see that Bar and F_B has alignment 2 so that its field i will be properly aligned. You can also see that Size of Bar is 6 and not 5. Similarly, the size of B_F (5 of Bar) is 30 and not 25.
So, if you is a hard code instead of sizeof(...), you will get a problem here.
Hope this helps.
It all comes down to memory alignment. Typical 32-bit machines read or write 4 bytes of memory per attempt. This structure is safe from problems because it falls under that 4 bytes easily with no confusing padding issues.
Now if the structure was as such:
class foo {
unsigned char a;
unsigned char b;
unsigned char c;
unsigned int i;
unsigned int j;
}
Your coworkers logic would probably lead to
memcpy(pBuff,listOfFoos,11*SOME_NUM);
(3 char's = 3 bytes, 2 ints = 2*4 bytes, so 3 + 8)
Unfortunately, due to padding the structure actually takes up 12 bytes. This is because you cannot fit three char's and an int into that 4 byte word, and so there's one byte of padded space there which pushes the int into it's own word. This becomes more and more of a problem the more diverse the data types become.
For situations where stuff like this is used, and I can't avoid it, I try to make the compilation break when the presumptions no longer hold. I use something like the following (or Boost.StaticAssert if the situation allows):
static_assert(sizeof(foo) <= 3);
// Macro for "static-assert" (only usefull on compile-time constant expressions)
#define static_assert(exp) static_assert_II(exp, __LINE__)
// Macro used by static_assert macro (don't use directly)
#define static_assert_II(exp, line) static_assert_III(exp, line)
// Macro used by static_assert macro (don't use directly)
#define static_assert_III(exp, line) enum static_assertion##line{static_assert_line_##line = 1/(exp)}
I would've been safe and replaced the magic number 3 with a sizeof(foo) I reckon.
My guess is that code optimised for future processor architectures will probably introduce some form of padding.
And trying to track down that sort of bug is a real pain!
As others have said, using sizeof(foo) is a safer bet. Some compilers (especially esoteric ones in the embedded world) will add a 4-byte header to classes. Others can do funky memory-alignment tricks, depending upon your compiler settings.
For a mainstream platform, you're probably alright, but its not a guarantee.
There might still be a problem with sizeof() when you are passing the data between two computers. On one of them the code might compile with padding and in the other without, in which case sizeof() would give different results. If the array data is passed from one computer to the other it will be misinterpreted because the array elements will not be found where expected.
One solution is to make sure that #pragma pack(1) is used whenever possible, but that may not be enough for the arrays. Best is to foresee the problem and use padding to a multiple of 8 bytes per array element.