Getting different header size by changing window size - c++

I have a C++ program representing a TCP header as a struct:
#include "stdafx.h"
/* TCP HEADER
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Source Port | Destination Port |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Sequence Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Acknowledgment Number |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Data | |U|A|P|R|S|F| |
| Offset| Reserved |R|C|S|S|Y|I| Window |
| | |G|K|H|T|N|N| |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Checksum | Urgent Pointer |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Options | Padding |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| data |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
*/
typedef struct { // RFC793
WORD wSourcePort;
WORD wDestPort;
DWORD dwSequence;
DWORD dwAcknowledgment;
unsigned int byReserved1:4;
unsigned int byDataOffset:4;
unsigned int fFIN:1;
unsigned int fSYN:1;
unsigned int fRST:1;
unsigned int fPSH:1;
unsigned int fACK:1;
unsigned int fURG:1;
unsigned int byReserved2:2;
unsigned short wWindow;
WORD wChecksum;
WORD wUrgentPointer;
} TCP_HEADER, *PTCP_HEADER;
int _tmain(int argc, _TCHAR* argv[])
{
printf("TCP header length: %d\n", sizeof(TCP_HEADER));
return 0;
}
If I run this program I get the size of this header as 24 bytes, which is not the size I was expecting. If I change the type of the field "wWindow" to "unsigned int wWindow:16", which has the same number of bits as an unsigned short, the program tells me the size of the struct is now 20 bytes, the correct size. Why is this?
I am using Microsoft Visual Studio 2005 with SP1 on a 32-bit x86 machine.

Because the compiler is packing your bitfield into a 32-bit int, not a 16-bit entity.
In general you should avoid bitfields and use other manifest constants (enums or whatever) with explicit bit masking and shifting to access the 'sub-fields' in a field.
Here's one reason why bitfields should be avoided - they aren't very portable between compilers even for the same platform. from the C99 standard (there's similar wording in the C90 standard):
An implementation may allocate any
addressable storage unit large enough
to hold a bitfield. If enough space
remains, a bit-field that immediately
follows another bit-field in a
structure shall be packed into
adjacent bits of the same unit. If
insufficient space remains, whether a
bit-field that does not fit is put
into the next unit or overlaps
adjacent units is
implementation-defined. The order of
allocation of bit-fields within a unit
(high-order to low-order or low-order
to high-order) is
implementation-defined. The alignment
of the addressable storage unit is
unspecified.
You cannot guarantee whether a bit field will 'span' an int boundary or not and you can't specify whether a bitfield starts at the low-end of the int or the high end of the int (this is independant of whether the processor is big-endian or little-endian).

Your series of "unsigned int:xx" bitfields use up only 16 of the 32 bits in an int. The other 16 bits (2 bytes) are there, but unused. This is followed by the unsigned short, which is on an int boundary, and then a WORD, which is along aligned on an int boundary which means that there 2 bytes of padding between them.
When you switch to "unsigned int wWindow:16", instead of being a separate short, the compiler uses the unused parts of the previous bitfield, so no waste, no short, and no padding after the short, hence you save four bytes.

See this question: Why isn't sizeof for a struct equal to the sum of sizeof of each member? .
I believe that compiler takes a hint to disable padding when you use the "unsigned int wWindow:16" syntax.
Also, note that a short is not guaranteed to be 16 bits. The guarantee is that: 16 bits <= size of a short <= size of an int.

The compiler is padding the non-bitfield struct member to 32-bit -- native word alignment. To fix this, do #pragma pack(0) before the struct and #pragma pack() after.

Struct boundaries in memory can be padded by the compiler depending on the size and order of fields.

Not a C/C++ expert when it comes to packing. But I imagine there is a rule in the spec which says that when a non-bitfield follows a bitfield it must be aligned on the word boundary regardless of whether or not it fits in the remaining space. By making it an explicit bitvector you are avoiding this problem.
Again this is speculation with a touch of experience.

Interesting - I would think that "WORD" would evaluate to "unsigned short", so you'd have that problem in more than one place.
Also be aware that you'll need to deal with endian issues in any value over 8 bits.

I think Mike B got it right, but but not perfectly clear. When you ask for "short", it's aligned on 32bit boundary. When you ask for int:16, it's not. So int:16 fits right after th ebit fields, while short skips 2 bytes and starts at the next 32-bit block.
The rest of what he is saying is perfectly applicable - the bit field must never be used to code an externally-visible structure, because there are no guarantee as to how they are allocated. At best, they belong in embedded programs where saving a byte is important. And even there, you can't use them to actually controly bits in memory-mapped ports.

You are seeing different values because of compiler packing rules. You can see rules specific to visual studio here.
When you have a structure that must be packed (or adhere to some specific alignment requirements), you should use the #pragma pack() option. For your code, you can use #pragma pack(0) which will align all structure members on byte boundaries. You can then use #pragma pack() to reset structure packing to it's default state. You can see more information on the pack pragma here.

Related

g++ question about endianism that *should* work?

I thought this should work, but I'm obviously wrong, but I don't know why :-)
Assume I had the following bytes from the network 0x03 0x02. In my head, I expect it to be converted to little endian and the following union
struct decoded {
uint16_t opcode : 12;
uint8_t unused : 1;
uint8_t numRegs : 3;
}
union words {
decoded a;
uint8_t byes[2];
}
I would expect that I could to a be16toh(a.opcode) and get 0x030 and numRegisters is 0x02. I find, even with endian conversion, I get things like 0x302 and 0x00, but I have no idea why :-(
To expand on my comment, people tend to make a lot of assumptions about structure layout in general and about bitfield layout in particular that simply are not founded on the C or C++ language specification. The details should be described in the the applicable Application Binary Interface (ABI) specification, but this varies from architecture to architecture and from OS to OS.
All that you can rely upon in general is that
bitfields will be stored in "pure binary notation"
storage for bitfields will be assigned within "addressable storage units" chosen by the C or C++ implementation, whose size and alignment requirement is unspecified.
each ASU will contain at least one bitfield in full
if there is sufficient space in the chosen ASU then adjacent full bitfields will be packed into adjacent bits of the same one.
It is implementation-defined whether bitfields will span two ASUs in the event that there is some space available at the end of one, but not enough to accommodate the next bitfield.
It is unspecified in what order bitfields will appear within a given ASU.
There are other unspecified and implementation defined aspects of bitfields, too.
But let's consider a program exploring your particular bitfields. In light of the question being tagged both C and C++, this is written in C++, but in largely C-like idiom:
#include <cstdio>
#include <cstdint>
struct decoded {
uint16_t opcode : 12;
uint8_t unused : 1;
uint8_t numRegs : 3;
};
union words {
decoded a;
uint8_t byes[2];
};
int main(void) {
words u;
u.byes[0] = 0x03;
u.byes[1] = 0x02;
printf("structure size: %zu\n", sizeof(decoded));
printf("opcode: %#06hx; unused: %#04hhx; numRegs: %#04hhx\n", u.a.opcode, u.a.unused, u.a.numRegs);
}
On my x86-64 Linux workstation, its output is:
structure size: 2
opcode: 0x0203; unused: 0000; numRegs: 0000
This shows the following about my system:
the compiler chose a single 16-bit ASU for the structure. It can't be smaller because it must accommodate a 12-bit bitfield and be a multiple of the size of a char (8 bits on this machine). It is not larger because the size of the whole structure is 16 bits.
the compiler assigned the opcode member to the least-significant 12 bits (0 - 11)
we can conclude that the compiler assigned unused to bit 12, and numRegs to bits 13 - 15.
So here's the layout, in storage order:
0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 0
L----------- words -----------|
L---------- decoded ----------|
L------------ ASU ------------|
L-- bytes[0] --|--- bytes[1] -|
L----|-|------- opcode -------|
\ \
\ +- unused
+- numRegs
It should be clear why the unused and numRegs fields are both 0.
The bits of the opcode are 001100000010, so the question arises of how to interpret that? The answer is that the bit pattern is padded with zeroes on the left to widen it to the number of bits of a uint16_t (since that's the declared type of the bitfield), and interpret it in the ordinary (for this machine) way from there. Since the machine is little-endian, that gives the reported 0x0203.
I would expect that I could to a be16toh(a.opcode) and get 0x030 and numRegisters is 0x02.
That would be a plausible result on a machine were bitfields were laid out from left (most-significant) to right. Mine is not such a machine, and I guess yours isn't either.

I expected a 9 but i am getting 12, can someone tell me why? [duplicate]

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?
This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:
Mis-aligned access might be a hard error (often SIGBUS).
Mis-aligned access might be a soft error.
Either corrected in hardware, for a modest performance-degradation.
Or corrected by emulation in software, for a severe performance-degradation.
In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.
Here's an example using typical settings for an x86 processor (all used 32 and 64 bit modes):
struct X
{
short s; /* 2 bytes */
/* 2 padding bytes */
int i; /* 4 bytes */
char c; /* 1 byte */
/* 3 padding bytes */
};
struct Y
{
int i; /* 4 bytes */
char c; /* 1 byte */
/* 1 padding byte */
short s; /* 2 bytes */
};
struct Z
{
int i; /* 4 bytes */
short s; /* 2 bytes */
char c; /* 1 byte */
/* 1 padding byte */
};
const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */
One can minimize the size of structures by sorting members by alignment (sorting by size suffices for that in basic types) (like structure Z in the example above).
IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.
Packing and byte alignment, as described in the C FAQ here:
It's for alignment. Many processors can't access 2- and 4-byte
quantities (e.g. ints and long ints) if they're crammed in
every-which-way.
Suppose you have this structure:
struct {
char a[3];
short int b;
long int c;
char d[3];
};
Now, you might think that it ought to be possible to pack this
structure into memory like this:
+-------+-------+-------+-------+
| a | b |
+-------+-------+-------+-------+
| b | c |
+-------+-------+-------+-------+
| c | d |
+-------+-------+-------+-------+
But it's much, much easier on the processor if the compiler arranges
it like this:
+-------+-------+-------+
| a |
+-------+-------+-------+
| b |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d |
+-------+-------+-------+
In the packed version, notice how it's at least a little bit hard for
you and me to see how the b and c fields wrap around? In a nutshell,
it's hard for the processor, too. Therefore, most compilers will pad
the structure (as if with extra, invisible fields) like this:
+-------+-------+-------+-------+
| a | pad1 |
+-------+-------+-------+-------+
| b | pad2 |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d | pad3 |
+-------+-------+-------+-------+
If you want the structure to have a certain size with GCC for example use __attribute__((packed)).
On Windows you can set the alignment to one byte when using the cl.exe compier with the /Zp option.
Usually it is easier for the CPU to access data that is a multiple of 4 (or 8), depending platform and also on the compiler.
So it is a matter of alignment basically.
You need to have good reasons to change it.
This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:
#include "stdio.h"
struct oneInt {
int x;
};
struct twoInts {
int x;
int y;
};
struct someBits {
int x:2;
int y:6;
};
int main (int argc, char** argv) {
printf("oneInt=%zu\n",sizeof(struct oneInt));
printf("twoInts=%zu\n",sizeof(struct twoInts));
printf("someBits=%zu\n",sizeof(struct someBits));
return 0;
}
Have members who's sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4 - where the last structure is padded so that it is a single word (4 x 8 bit bytes on my 32bit platform).
oneInt=4
twoInts=8
someBits=4
See also:
for Microsoft Visual C:
http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx
and GCC claim compatibility with Microsoft's compiler.:
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Structure_002dPacking-Pragmas.html
In addition to the previous answers, please note that regardless the packaging, there is no members-order-guarantee in C++. Compilers may (and certainly do) add virtual table pointer and base structures' members to the structure. Even the existence of virtual table is not ensured by the standard (virtual mechanism implementation is not specified) and therefore one can conclude that such guarantee is just impossible.
I'm quite sure member-order is guaranteed in C, but I wouldn't count on it, when writing a cross-platform or cross-compiler program.
The size of a structure is greater than the sum of its parts because of what is called packing. A particular processor has a preferred data size that it works with. Most modern processors' preferred size if 32-bits (4 bytes). Accessing the memory when data is on this kind of boundary is more efficient than things that straddle that size boundary.
For example. Consider the simple structure:
struct myStruct
{
int a;
char b;
int c;
} data;
If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem (assuming no structure alignment). In this example, let us assume that the structure data starts at address 1024 (0x400 - note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary - 0x400. The access to data.b will also work fine, because it is at address 0x404 - another 32-bit boundary. But an unaligned structure would put data.c at address 0x405. The 4 bytes of data.c are at 0x405, 0x406, 0x407, 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is on the next boundary). So, the system would have to do a second memory access to get the 4th byte,
Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.
Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".
On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.
For example:
#pragma pack 1
struct MyStruct
{
int a;
char b;
int c;
short d;
} myData;
I = sizeof(myData);
I should now have the length of 11. Without the pragma, I could be anything from 11 to 14 (and for some systems, as much as 32), depending on the default packing of the compiler.
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
6.5.3.4 The sizeof operator:
3 When applied to an operand that has structure or union type,
the result is the total number of bytes in such an object,
including internal and trailing padding.
6.7.2.1 Structure and union specifiers:
13 ... There may be unnamed
padding within a structure object, but not at its beginning.
and:
15 There may be unnamed padding at the end of a structure or union.
The new C99 flexible array member feature (struct S {int is[];};) may also affect padding:
16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply.
Annex J Portability Issues reiterates:
The following are unspecified: ...
The value of padding bytes when storing values in structures or unions (6.2.6.1)
C++11 N3337 standard draft
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
5.3.3 Sizeof:
2 When applied
to a class, the result is the number of bytes in an object of that class including any padding required for
placing objects of that type in an array.
9.2 Class members:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. — end note ]
I only know enough C++ to understand the note :-)
It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned 4 will always be a multiple of 4 bytes even if the size of its members would be something that's not a multiple of 4 bytes.
Also a library may be compiled under x86 with 32-bit ints and you may be comparing its components on a 64-bit process would would give you a different result if you were doing this by hand.
C language leaves compiler some freedom about the location of the structural elements in the memory:
memory holes may appear between any two components, and after the last component. It was due to the fact that certain types of objects on the target computer may be limited by the boundaries of addressing
"memory holes" size included in the result of sizeof operator. The sizeof only doesn't include size of the flexible array, which is available in C/C++
Some implementations of the language allow you to control the memory layout of structures through the pragma and compiler options
The C language provides some assurance to the programmer of the elements layout in the structure:
compilers required to assign a sequence of components increasing memory addresses
Address of the first component coincides with the start address of the structure
unnamed bit fields may be included in the structure to the required address alignments of adjacent elements
Problems related to the elements alignment:
Different computers line the edges of objects in different ways
Different restrictions on the width of the bit field
Computers differ on how to store the bytes in a word (Intel 80x86 and Motorola 68000)
How alignment works:
The volume occupied by the structure is calculated as the size of the aligned single element of an array of such structures. The structure should
end so that the first element of the next following structure does not the violate requirements of alignment
p.s More detailed info are available here: "Samuel P.Harbison, Guy L.Steele C A Reference, (5.6.2 - 5.6.7)"
The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.
struct pixel {
unsigned char red; // 0
unsigned char green; // 1
unsigned int alpha; // 4 (gotta skip to an aligned offset)
unsigned char blue; // 8 (then skip 9 10 11)
};
// next offset: 12
The x86 architecture has always been able to fetch misaligned addresses. However, it's slower and when the misalignment overlaps two different cache lines, then it evicts two cache lines when an aligned access would only evict one.
Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data on for those. (They ignored the low-order bits.)
Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.
TL;DR: alignment is important.
In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl.
Among the other well-explained answers about memory alignment and structure padding/packing, there is something which I have discovered in the question itself by reading it carefully.
"Why isn't sizeof for a struct equal to the sum of sizeof of each member?"
"Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members"?
Both questions suggest something what is plain wrong. At least in a generic, non-example focused view, which is the case here.
The result of the sizeof operand applied to a structure object can be equal to the sum of sizeof applied to each member separately. It doesn't have to be larger/different.
If there is no reason for padding, no memory will be padded.
One most implementations, if the structure contains only members of the same type:
struct foo {
int a;
int b;
int c;
} bar;
Assuming sizeof(int) == 4, the size of the structure bar will be equal to the sum of the sizes of all members together, sizeof(bar) == 12. No padding done here.
Same goes for example here:
struct foo {
short int a;
short int b;
int c;
} bar;
Assuming sizeof(short int) == 2 and sizeof(int) == 4. The sum of allocated bytes for a and b is equal to the allocated bytes for c, the largest member and with that everything is perfectly aligned. Thus, sizeof(bar) == 8.
This is also object of the second most popular question regarding structure padding, here:
Memory alignment in C-structs
given a lot information(explanation) above.
And, I just would like to share some method in order to solve this issue.
You can avoid it by adding pragma pack
#pragma pack(push, 1)
// your structure
#pragma pack(pop)

why there are 6 bytes of padding in a class containing two char and an integer? [duplicate]

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?
This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:
Mis-aligned access might be a hard error (often SIGBUS).
Mis-aligned access might be a soft error.
Either corrected in hardware, for a modest performance-degradation.
Or corrected by emulation in software, for a severe performance-degradation.
In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.
Here's an example using typical settings for an x86 processor (all used 32 and 64 bit modes):
struct X
{
short s; /* 2 bytes */
/* 2 padding bytes */
int i; /* 4 bytes */
char c; /* 1 byte */
/* 3 padding bytes */
};
struct Y
{
int i; /* 4 bytes */
char c; /* 1 byte */
/* 1 padding byte */
short s; /* 2 bytes */
};
struct Z
{
int i; /* 4 bytes */
short s; /* 2 bytes */
char c; /* 1 byte */
/* 1 padding byte */
};
const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */
One can minimize the size of structures by sorting members by alignment (sorting by size suffices for that in basic types) (like structure Z in the example above).
IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.
Packing and byte alignment, as described in the C FAQ here:
It's for alignment. Many processors can't access 2- and 4-byte
quantities (e.g. ints and long ints) if they're crammed in
every-which-way.
Suppose you have this structure:
struct {
char a[3];
short int b;
long int c;
char d[3];
};
Now, you might think that it ought to be possible to pack this
structure into memory like this:
+-------+-------+-------+-------+
| a | b |
+-------+-------+-------+-------+
| b | c |
+-------+-------+-------+-------+
| c | d |
+-------+-------+-------+-------+
But it's much, much easier on the processor if the compiler arranges
it like this:
+-------+-------+-------+
| a |
+-------+-------+-------+
| b |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d |
+-------+-------+-------+
In the packed version, notice how it's at least a little bit hard for
you and me to see how the b and c fields wrap around? In a nutshell,
it's hard for the processor, too. Therefore, most compilers will pad
the structure (as if with extra, invisible fields) like this:
+-------+-------+-------+-------+
| a | pad1 |
+-------+-------+-------+-------+
| b | pad2 |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d | pad3 |
+-------+-------+-------+-------+
If you want the structure to have a certain size with GCC for example use __attribute__((packed)).
On Windows you can set the alignment to one byte when using the cl.exe compier with the /Zp option.
Usually it is easier for the CPU to access data that is a multiple of 4 (or 8), depending platform and also on the compiler.
So it is a matter of alignment basically.
You need to have good reasons to change it.
This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:
#include "stdio.h"
struct oneInt {
int x;
};
struct twoInts {
int x;
int y;
};
struct someBits {
int x:2;
int y:6;
};
int main (int argc, char** argv) {
printf("oneInt=%zu\n",sizeof(struct oneInt));
printf("twoInts=%zu\n",sizeof(struct twoInts));
printf("someBits=%zu\n",sizeof(struct someBits));
return 0;
}
Have members who's sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4 - where the last structure is padded so that it is a single word (4 x 8 bit bytes on my 32bit platform).
oneInt=4
twoInts=8
someBits=4
See also:
for Microsoft Visual C:
http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx
and GCC claim compatibility with Microsoft's compiler.:
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Structure_002dPacking-Pragmas.html
In addition to the previous answers, please note that regardless the packaging, there is no members-order-guarantee in C++. Compilers may (and certainly do) add virtual table pointer and base structures' members to the structure. Even the existence of virtual table is not ensured by the standard (virtual mechanism implementation is not specified) and therefore one can conclude that such guarantee is just impossible.
I'm quite sure member-order is guaranteed in C, but I wouldn't count on it, when writing a cross-platform or cross-compiler program.
The size of a structure is greater than the sum of its parts because of what is called packing. A particular processor has a preferred data size that it works with. Most modern processors' preferred size if 32-bits (4 bytes). Accessing the memory when data is on this kind of boundary is more efficient than things that straddle that size boundary.
For example. Consider the simple structure:
struct myStruct
{
int a;
char b;
int c;
} data;
If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem (assuming no structure alignment). In this example, let us assume that the structure data starts at address 1024 (0x400 - note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary - 0x400. The access to data.b will also work fine, because it is at address 0x404 - another 32-bit boundary. But an unaligned structure would put data.c at address 0x405. The 4 bytes of data.c are at 0x405, 0x406, 0x407, 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is on the next boundary). So, the system would have to do a second memory access to get the 4th byte,
Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.
Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".
On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.
For example:
#pragma pack 1
struct MyStruct
{
int a;
char b;
int c;
short d;
} myData;
I = sizeof(myData);
I should now have the length of 11. Without the pragma, I could be anything from 11 to 14 (and for some systems, as much as 32), depending on the default packing of the compiler.
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
6.5.3.4 The sizeof operator:
3 When applied to an operand that has structure or union type,
the result is the total number of bytes in such an object,
including internal and trailing padding.
6.7.2.1 Structure and union specifiers:
13 ... There may be unnamed
padding within a structure object, but not at its beginning.
and:
15 There may be unnamed padding at the end of a structure or union.
The new C99 flexible array member feature (struct S {int is[];};) may also affect padding:
16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply.
Annex J Portability Issues reiterates:
The following are unspecified: ...
The value of padding bytes when storing values in structures or unions (6.2.6.1)
C++11 N3337 standard draft
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
5.3.3 Sizeof:
2 When applied
to a class, the result is the number of bytes in an object of that class including any padding required for
placing objects of that type in an array.
9.2 Class members:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. — end note ]
I only know enough C++ to understand the note :-)
It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned 4 will always be a multiple of 4 bytes even if the size of its members would be something that's not a multiple of 4 bytes.
Also a library may be compiled under x86 with 32-bit ints and you may be comparing its components on a 64-bit process would would give you a different result if you were doing this by hand.
C language leaves compiler some freedom about the location of the structural elements in the memory:
memory holes may appear between any two components, and after the last component. It was due to the fact that certain types of objects on the target computer may be limited by the boundaries of addressing
"memory holes" size included in the result of sizeof operator. The sizeof only doesn't include size of the flexible array, which is available in C/C++
Some implementations of the language allow you to control the memory layout of structures through the pragma and compiler options
The C language provides some assurance to the programmer of the elements layout in the structure:
compilers required to assign a sequence of components increasing memory addresses
Address of the first component coincides with the start address of the structure
unnamed bit fields may be included in the structure to the required address alignments of adjacent elements
Problems related to the elements alignment:
Different computers line the edges of objects in different ways
Different restrictions on the width of the bit field
Computers differ on how to store the bytes in a word (Intel 80x86 and Motorola 68000)
How alignment works:
The volume occupied by the structure is calculated as the size of the aligned single element of an array of such structures. The structure should
end so that the first element of the next following structure does not the violate requirements of alignment
p.s More detailed info are available here: "Samuel P.Harbison, Guy L.Steele C A Reference, (5.6.2 - 5.6.7)"
The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.
struct pixel {
unsigned char red; // 0
unsigned char green; // 1
unsigned int alpha; // 4 (gotta skip to an aligned offset)
unsigned char blue; // 8 (then skip 9 10 11)
};
// next offset: 12
The x86 architecture has always been able to fetch misaligned addresses. However, it's slower and when the misalignment overlaps two different cache lines, then it evicts two cache lines when an aligned access would only evict one.
Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data on for those. (They ignored the low-order bits.)
Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.
TL;DR: alignment is important.
In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl.
Among the other well-explained answers about memory alignment and structure padding/packing, there is something which I have discovered in the question itself by reading it carefully.
"Why isn't sizeof for a struct equal to the sum of sizeof of each member?"
"Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members"?
Both questions suggest something what is plain wrong. At least in a generic, non-example focused view, which is the case here.
The result of the sizeof operand applied to a structure object can be equal to the sum of sizeof applied to each member separately. It doesn't have to be larger/different.
If there is no reason for padding, no memory will be padded.
One most implementations, if the structure contains only members of the same type:
struct foo {
int a;
int b;
int c;
} bar;
Assuming sizeof(int) == 4, the size of the structure bar will be equal to the sum of the sizes of all members together, sizeof(bar) == 12. No padding done here.
Same goes for example here:
struct foo {
short int a;
short int b;
int c;
} bar;
Assuming sizeof(short int) == 2 and sizeof(int) == 4. The sum of allocated bytes for a and b is equal to the allocated bytes for c, the largest member and with that everything is perfectly aligned. Thus, sizeof(bar) == 8.
This is also object of the second most popular question regarding structure padding, here:
Memory alignment in C-structs
given a lot information(explanation) above.
And, I just would like to share some method in order to solve this issue.
You can avoid it by adding pragma pack
#pragma pack(push, 1)
// your structure
#pragma pack(pop)

Why the size of struct is 24 bytes rather than 20 [duplicate]

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?
This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:
Mis-aligned access might be a hard error (often SIGBUS).
Mis-aligned access might be a soft error.
Either corrected in hardware, for a modest performance-degradation.
Or corrected by emulation in software, for a severe performance-degradation.
In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.
Here's an example using typical settings for an x86 processor (all used 32 and 64 bit modes):
struct X
{
short s; /* 2 bytes */
/* 2 padding bytes */
int i; /* 4 bytes */
char c; /* 1 byte */
/* 3 padding bytes */
};
struct Y
{
int i; /* 4 bytes */
char c; /* 1 byte */
/* 1 padding byte */
short s; /* 2 bytes */
};
struct Z
{
int i; /* 4 bytes */
short s; /* 2 bytes */
char c; /* 1 byte */
/* 1 padding byte */
};
const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */
One can minimize the size of structures by sorting members by alignment (sorting by size suffices for that in basic types) (like structure Z in the example above).
IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.
Packing and byte alignment, as described in the C FAQ here:
It's for alignment. Many processors can't access 2- and 4-byte
quantities (e.g. ints and long ints) if they're crammed in
every-which-way.
Suppose you have this structure:
struct {
char a[3];
short int b;
long int c;
char d[3];
};
Now, you might think that it ought to be possible to pack this
structure into memory like this:
+-------+-------+-------+-------+
| a | b |
+-------+-------+-------+-------+
| b | c |
+-------+-------+-------+-------+
| c | d |
+-------+-------+-------+-------+
But it's much, much easier on the processor if the compiler arranges
it like this:
+-------+-------+-------+
| a |
+-------+-------+-------+
| b |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d |
+-------+-------+-------+
In the packed version, notice how it's at least a little bit hard for
you and me to see how the b and c fields wrap around? In a nutshell,
it's hard for the processor, too. Therefore, most compilers will pad
the structure (as if with extra, invisible fields) like this:
+-------+-------+-------+-------+
| a | pad1 |
+-------+-------+-------+-------+
| b | pad2 |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d | pad3 |
+-------+-------+-------+-------+
If you want the structure to have a certain size with GCC for example use __attribute__((packed)).
On Windows you can set the alignment to one byte when using the cl.exe compier with the /Zp option.
Usually it is easier for the CPU to access data that is a multiple of 4 (or 8), depending platform and also on the compiler.
So it is a matter of alignment basically.
You need to have good reasons to change it.
This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:
#include "stdio.h"
struct oneInt {
int x;
};
struct twoInts {
int x;
int y;
};
struct someBits {
int x:2;
int y:6;
};
int main (int argc, char** argv) {
printf("oneInt=%zu\n",sizeof(struct oneInt));
printf("twoInts=%zu\n",sizeof(struct twoInts));
printf("someBits=%zu\n",sizeof(struct someBits));
return 0;
}
Have members who's sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4 - where the last structure is padded so that it is a single word (4 x 8 bit bytes on my 32bit platform).
oneInt=4
twoInts=8
someBits=4
See also:
for Microsoft Visual C:
http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx
and GCC claim compatibility with Microsoft's compiler.:
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Structure_002dPacking-Pragmas.html
In addition to the previous answers, please note that regardless the packaging, there is no members-order-guarantee in C++. Compilers may (and certainly do) add virtual table pointer and base structures' members to the structure. Even the existence of virtual table is not ensured by the standard (virtual mechanism implementation is not specified) and therefore one can conclude that such guarantee is just impossible.
I'm quite sure member-order is guaranteed in C, but I wouldn't count on it, when writing a cross-platform or cross-compiler program.
The size of a structure is greater than the sum of its parts because of what is called packing. A particular processor has a preferred data size that it works with. Most modern processors' preferred size if 32-bits (4 bytes). Accessing the memory when data is on this kind of boundary is more efficient than things that straddle that size boundary.
For example. Consider the simple structure:
struct myStruct
{
int a;
char b;
int c;
} data;
If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem (assuming no structure alignment). In this example, let us assume that the structure data starts at address 1024 (0x400 - note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary - 0x400. The access to data.b will also work fine, because it is at address 0x404 - another 32-bit boundary. But an unaligned structure would put data.c at address 0x405. The 4 bytes of data.c are at 0x405, 0x406, 0x407, 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is on the next boundary). So, the system would have to do a second memory access to get the 4th byte,
Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.
Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".
On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.
For example:
#pragma pack 1
struct MyStruct
{
int a;
char b;
int c;
short d;
} myData;
I = sizeof(myData);
I should now have the length of 11. Without the pragma, I could be anything from 11 to 14 (and for some systems, as much as 32), depending on the default packing of the compiler.
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
6.5.3.4 The sizeof operator:
3 When applied to an operand that has structure or union type,
the result is the total number of bytes in such an object,
including internal and trailing padding.
6.7.2.1 Structure and union specifiers:
13 ... There may be unnamed
padding within a structure object, but not at its beginning.
and:
15 There may be unnamed padding at the end of a structure or union.
The new C99 flexible array member feature (struct S {int is[];};) may also affect padding:
16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply.
Annex J Portability Issues reiterates:
The following are unspecified: ...
The value of padding bytes when storing values in structures or unions (6.2.6.1)
C++11 N3337 standard draft
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
5.3.3 Sizeof:
2 When applied
to a class, the result is the number of bytes in an object of that class including any padding required for
placing objects of that type in an array.
9.2 Class members:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. — end note ]
I only know enough C++ to understand the note :-)
It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned 4 will always be a multiple of 4 bytes even if the size of its members would be something that's not a multiple of 4 bytes.
Also a library may be compiled under x86 with 32-bit ints and you may be comparing its components on a 64-bit process would would give you a different result if you were doing this by hand.
C language leaves compiler some freedom about the location of the structural elements in the memory:
memory holes may appear between any two components, and after the last component. It was due to the fact that certain types of objects on the target computer may be limited by the boundaries of addressing
"memory holes" size included in the result of sizeof operator. The sizeof only doesn't include size of the flexible array, which is available in C/C++
Some implementations of the language allow you to control the memory layout of structures through the pragma and compiler options
The C language provides some assurance to the programmer of the elements layout in the structure:
compilers required to assign a sequence of components increasing memory addresses
Address of the first component coincides with the start address of the structure
unnamed bit fields may be included in the structure to the required address alignments of adjacent elements
Problems related to the elements alignment:
Different computers line the edges of objects in different ways
Different restrictions on the width of the bit field
Computers differ on how to store the bytes in a word (Intel 80x86 and Motorola 68000)
How alignment works:
The volume occupied by the structure is calculated as the size of the aligned single element of an array of such structures. The structure should
end so that the first element of the next following structure does not the violate requirements of alignment
p.s More detailed info are available here: "Samuel P.Harbison, Guy L.Steele C A Reference, (5.6.2 - 5.6.7)"
The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.
struct pixel {
unsigned char red; // 0
unsigned char green; // 1
unsigned int alpha; // 4 (gotta skip to an aligned offset)
unsigned char blue; // 8 (then skip 9 10 11)
};
// next offset: 12
The x86 architecture has always been able to fetch misaligned addresses. However, it's slower and when the misalignment overlaps two different cache lines, then it evicts two cache lines when an aligned access would only evict one.
Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data on for those. (They ignored the low-order bits.)
Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.
TL;DR: alignment is important.
In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl.
Among the other well-explained answers about memory alignment and structure padding/packing, there is something which I have discovered in the question itself by reading it carefully.
"Why isn't sizeof for a struct equal to the sum of sizeof of each member?"
"Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members"?
Both questions suggest something what is plain wrong. At least in a generic, non-example focused view, which is the case here.
The result of the sizeof operand applied to a structure object can be equal to the sum of sizeof applied to each member separately. It doesn't have to be larger/different.
If there is no reason for padding, no memory will be padded.
One most implementations, if the structure contains only members of the same type:
struct foo {
int a;
int b;
int c;
} bar;
Assuming sizeof(int) == 4, the size of the structure bar will be equal to the sum of the sizes of all members together, sizeof(bar) == 12. No padding done here.
Same goes for example here:
struct foo {
short int a;
short int b;
int c;
} bar;
Assuming sizeof(short int) == 2 and sizeof(int) == 4. The sum of allocated bytes for a and b is equal to the allocated bytes for c, the largest member and with that everything is perfectly aligned. Thus, sizeof(bar) == 8.
This is also object of the second most popular question regarding structure padding, here:
Memory alignment in C-structs
given a lot information(explanation) above.
And, I just would like to share some method in order to solve this issue.
You can avoid it by adding pragma pack
#pragma pack(push, 1)
// your structure
#pragma pack(pop)

Why Data alignment of a 2 byte variable at a 2's multiple location in 4 byte compiler (32 bits) needed? [duplicate]

Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members?
This is because of padding added to satisfy alignment constraints. Data structure alignment impacts both performance and correctness of programs:
Mis-aligned access might be a hard error (often SIGBUS).
Mis-aligned access might be a soft error.
Either corrected in hardware, for a modest performance-degradation.
Or corrected by emulation in software, for a severe performance-degradation.
In addition, atomicity and other concurrency-guarantees might be broken, leading to subtle errors.
Here's an example using typical settings for an x86 processor (all used 32 and 64 bit modes):
struct X
{
short s; /* 2 bytes */
/* 2 padding bytes */
int i; /* 4 bytes */
char c; /* 1 byte */
/* 3 padding bytes */
};
struct Y
{
int i; /* 4 bytes */
char c; /* 1 byte */
/* 1 padding byte */
short s; /* 2 bytes */
};
struct Z
{
int i; /* 4 bytes */
short s; /* 2 bytes */
char c; /* 1 byte */
/* 1 padding byte */
};
const int sizeX = sizeof(struct X); /* = 12 */
const int sizeY = sizeof(struct Y); /* = 8 */
const int sizeZ = sizeof(struct Z); /* = 8 */
One can minimize the size of structures by sorting members by alignment (sorting by size suffices for that in basic types) (like structure Z in the example above).
IMPORTANT NOTE: Both the C and C++ standards state that structure alignment is implementation-defined. Therefore each compiler may choose to align data differently, resulting in different and incompatible data layouts. For this reason, when dealing with libraries that will be used by different compilers, it is important to understand how the compilers align data. Some compilers have command-line settings and/or special #pragma statements to change the structure alignment settings.
Packing and byte alignment, as described in the C FAQ here:
It's for alignment. Many processors can't access 2- and 4-byte
quantities (e.g. ints and long ints) if they're crammed in
every-which-way.
Suppose you have this structure:
struct {
char a[3];
short int b;
long int c;
char d[3];
};
Now, you might think that it ought to be possible to pack this
structure into memory like this:
+-------+-------+-------+-------+
| a | b |
+-------+-------+-------+-------+
| b | c |
+-------+-------+-------+-------+
| c | d |
+-------+-------+-------+-------+
But it's much, much easier on the processor if the compiler arranges
it like this:
+-------+-------+-------+
| a |
+-------+-------+-------+
| b |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d |
+-------+-------+-------+
In the packed version, notice how it's at least a little bit hard for
you and me to see how the b and c fields wrap around? In a nutshell,
it's hard for the processor, too. Therefore, most compilers will pad
the structure (as if with extra, invisible fields) like this:
+-------+-------+-------+-------+
| a | pad1 |
+-------+-------+-------+-------+
| b | pad2 |
+-------+-------+-------+-------+
| c |
+-------+-------+-------+-------+
| d | pad3 |
+-------+-------+-------+-------+
If you want the structure to have a certain size with GCC for example use __attribute__((packed)).
On Windows you can set the alignment to one byte when using the cl.exe compier with the /Zp option.
Usually it is easier for the CPU to access data that is a multiple of 4 (or 8), depending platform and also on the compiler.
So it is a matter of alignment basically.
You need to have good reasons to change it.
This can be due to byte alignment and padding so that the structure comes out to an even number of bytes (or words) on your platform. For example in C on Linux, the following 3 structures:
#include "stdio.h"
struct oneInt {
int x;
};
struct twoInts {
int x;
int y;
};
struct someBits {
int x:2;
int y:6;
};
int main (int argc, char** argv) {
printf("oneInt=%zu\n",sizeof(struct oneInt));
printf("twoInts=%zu\n",sizeof(struct twoInts));
printf("someBits=%zu\n",sizeof(struct someBits));
return 0;
}
Have members who's sizes (in bytes) are 4 bytes (32 bits), 8 bytes (2x 32 bits) and 1 byte (2+6 bits) respectively. The above program (on Linux using gcc) prints the sizes as 4, 8, and 4 - where the last structure is padded so that it is a single word (4 x 8 bit bytes on my 32bit platform).
oneInt=4
twoInts=8
someBits=4
See also:
for Microsoft Visual C:
http://msdn.microsoft.com/en-us/library/2e70t5y1%28v=vs.80%29.aspx
and GCC claim compatibility with Microsoft's compiler.:
https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gcc/Structure_002dPacking-Pragmas.html
In addition to the previous answers, please note that regardless the packaging, there is no members-order-guarantee in C++. Compilers may (and certainly do) add virtual table pointer and base structures' members to the structure. Even the existence of virtual table is not ensured by the standard (virtual mechanism implementation is not specified) and therefore one can conclude that such guarantee is just impossible.
I'm quite sure member-order is guaranteed in C, but I wouldn't count on it, when writing a cross-platform or cross-compiler program.
The size of a structure is greater than the sum of its parts because of what is called packing. A particular processor has a preferred data size that it works with. Most modern processors' preferred size if 32-bits (4 bytes). Accessing the memory when data is on this kind of boundary is more efficient than things that straddle that size boundary.
For example. Consider the simple structure:
struct myStruct
{
int a;
char b;
int c;
} data;
If the machine is a 32-bit machine and data is aligned on a 32-bit boundary, we see an immediate problem (assuming no structure alignment). In this example, let us assume that the structure data starts at address 1024 (0x400 - note that the lowest 2 bits are zero, so the data is aligned to a 32-bit boundary). The access to data.a will work fine because it starts on a boundary - 0x400. The access to data.b will also work fine, because it is at address 0x404 - another 32-bit boundary. But an unaligned structure would put data.c at address 0x405. The 4 bytes of data.c are at 0x405, 0x406, 0x407, 0x408. On a 32-bit machine, the system would read data.c during one memory cycle, but would only get 3 of the 4 bytes (the 4th byte is on the next boundary). So, the system would have to do a second memory access to get the 4th byte,
Now, if instead of putting data.c at address 0x405, the compiler padded the structure by 3 bytes and put data.c at address 0x408, then the system would only need 1 cycle to read the data, cutting access time to that data element by 50%. Padding swaps memory efficiency for processing efficiency. Given that computers can have huge amounts of memory (many gigabytes), the compilers feel that the swap (speed over size) is a reasonable one.
Unfortunately, this problem becomes a killer when you attempt to send structures over a network or even write the binary data to a binary file. The padding inserted between elements of a structure or class can disrupt the data sent to the file or network. In order to write portable code (one that will go to several different compilers), you will probably have to access each element of the structure separately to ensure the proper "packing".
On the other hand, different compilers have different abilities to manage data structure packing. For example, in Visual C/C++ the compiler supports the #pragma pack command. This will allow you to adjust data packing and alignment.
For example:
#pragma pack 1
struct MyStruct
{
int a;
char b;
int c;
short d;
} myData;
I = sizeof(myData);
I should now have the length of 11. Without the pragma, I could be anything from 11 to 14 (and for some systems, as much as 32), depending on the default packing of the compiler.
C99 N1256 standard draft
http://www.open-std.org/JTC1/SC22/WG14/www/docs/n1256.pdf
6.5.3.4 The sizeof operator:
3 When applied to an operand that has structure or union type,
the result is the total number of bytes in such an object,
including internal and trailing padding.
6.7.2.1 Structure and union specifiers:
13 ... There may be unnamed
padding within a structure object, but not at its beginning.
and:
15 There may be unnamed padding at the end of a structure or union.
The new C99 flexible array member feature (struct S {int is[];};) may also affect padding:
16 As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply.
Annex J Portability Issues reiterates:
The following are unspecified: ...
The value of padding bytes when storing values in structures or unions (6.2.6.1)
C++11 N3337 standard draft
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3337.pdf
5.3.3 Sizeof:
2 When applied
to a class, the result is the number of bytes in an object of that class including any padding required for
placing objects of that type in an array.
9.2 Class members:
A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. — end note ]
I only know enough C++ to understand the note :-)
It can do so if you have implicitly or explicitly set the alignment of the struct. A struct that is aligned 4 will always be a multiple of 4 bytes even if the size of its members would be something that's not a multiple of 4 bytes.
Also a library may be compiled under x86 with 32-bit ints and you may be comparing its components on a 64-bit process would would give you a different result if you were doing this by hand.
C language leaves compiler some freedom about the location of the structural elements in the memory:
memory holes may appear between any two components, and after the last component. It was due to the fact that certain types of objects on the target computer may be limited by the boundaries of addressing
"memory holes" size included in the result of sizeof operator. The sizeof only doesn't include size of the flexible array, which is available in C/C++
Some implementations of the language allow you to control the memory layout of structures through the pragma and compiler options
The C language provides some assurance to the programmer of the elements layout in the structure:
compilers required to assign a sequence of components increasing memory addresses
Address of the first component coincides with the start address of the structure
unnamed bit fields may be included in the structure to the required address alignments of adjacent elements
Problems related to the elements alignment:
Different computers line the edges of objects in different ways
Different restrictions on the width of the bit field
Computers differ on how to store the bytes in a word (Intel 80x86 and Motorola 68000)
How alignment works:
The volume occupied by the structure is calculated as the size of the aligned single element of an array of such structures. The structure should
end so that the first element of the next following structure does not the violate requirements of alignment
p.s More detailed info are available here: "Samuel P.Harbison, Guy L.Steele C A Reference, (5.6.2 - 5.6.7)"
The idea is that for speed and cache considerations, operands should be read from addresses aligned to their natural size. To make this happen, the compiler pads structure members so the following member or following struct will be aligned.
struct pixel {
unsigned char red; // 0
unsigned char green; // 1
unsigned int alpha; // 4 (gotta skip to an aligned offset)
unsigned char blue; // 8 (then skip 9 10 11)
};
// next offset: 12
The x86 architecture has always been able to fetch misaligned addresses. However, it's slower and when the misalignment overlaps two different cache lines, then it evicts two cache lines when an aligned access would only evict one.
Some architectures actually have to trap on misaligned reads and writes, and early versions of the ARM architecture (the one that evolved into all of today's mobile CPUs) ... well, they actually just returned bad data on for those. (They ignored the low-order bits.)
Finally, note that cache lines can be arbitrarily large, and the compiler doesn't attempt to guess at those or make a space-vs-speed tradeoff. Instead, the alignment decisions are part of the ABI and represent the minimum alignment that will eventually evenly fill up a cache line.
TL;DR: alignment is important.
In addition to the other answers, a struct can (but usually doesn't) have virtual functions, in which case the size of the struct will also include the space for the vtbl.
Among the other well-explained answers about memory alignment and structure padding/packing, there is something which I have discovered in the question itself by reading it carefully.
"Why isn't sizeof for a struct equal to the sum of sizeof of each member?"
"Why does the sizeof operator return a size larger for a structure than the total sizes of the structure's members"?
Both questions suggest something what is plain wrong. At least in a generic, non-example focused view, which is the case here.
The result of the sizeof operand applied to a structure object can be equal to the sum of sizeof applied to each member separately. It doesn't have to be larger/different.
If there is no reason for padding, no memory will be padded.
One most implementations, if the structure contains only members of the same type:
struct foo {
int a;
int b;
int c;
} bar;
Assuming sizeof(int) == 4, the size of the structure bar will be equal to the sum of the sizes of all members together, sizeof(bar) == 12. No padding done here.
Same goes for example here:
struct foo {
short int a;
short int b;
int c;
} bar;
Assuming sizeof(short int) == 2 and sizeof(int) == 4. The sum of allocated bytes for a and b is equal to the allocated bytes for c, the largest member and with that everything is perfectly aligned. Thus, sizeof(bar) == 8.
This is also object of the second most popular question regarding structure padding, here:
Memory alignment in C-structs
given a lot information(explanation) above.
And, I just would like to share some method in order to solve this issue.
You can avoid it by adding pragma pack
#pragma pack(push, 1)
// your structure
#pragma pack(pop)