How to ensure certain struct layout across compilations? - c++

The C++ standard says nothing about packing and padding of structs, because it is implementation defined.
If it is implementation defined, then why, for example, is it safe to pass a struct to a DLL, if this DLL could have been compiled with a different compiler, which could have different methods for struct padding?
Is the struct padding method enforced by the OS's ABI (for example, the padding will be the same on all Windows platforms)?
Or, is there standard method for padding when compiling for a PC (x64 or x86_64 systems) that is used in every modern compiler?
If there is nothing that can guarantee the layout of variables, then is it safe to assume that each basic type in C++ (char, all numeric variables and pointers) must be aligned to an address that is a multiple of its size, and because of that, padding inside a struct can be done by hand without performance problems or UB?
From what I have checked, g++ compiles structs in such a way, that it inserts minimum amount of padding, just to ensure alignment of the next variable.
For example:
struct foo
{
char a;
// char _padding1[3]; <- inserted by compiler
uint32_t b;
};
There are 3 bytes of padding after a because that is the minimum amount that will give us a suitably aligned address for b.
Can we take for granted that compilers will do this that way? Or, can we force this kind of padding by hand without UB or performance issues?
By hand, I mean:
#pragma pack(1)
struct foo
{
char a;
char _padding1[3]; //<- manually adding padding bytes
uint32_t b;
};
#pragma pack()
Just to be clear: I am asking about behavior of compilers only on PC platforms : Windows, Linux distros, and maybe MacOS.
Sorry if my question is in category of "you dig into this too much". I just couldn't find a satisfying answer on the Internet. Some people say that it is not guaranteed. Others say that compiling with different compilers on systems that use the same ABI guarantee that the same struct will have the same layout. Others show how to reduce struct padding assuming that compilers pack structs the way that I described above (it is with minimum required padding to align variables).

If it is implementation defined, then for example, why it is safe to pass struct to dll
Because the dll and the caller follow the same Application binary interface (ABI) that defines the layout.
By the way, DLLs are a language extension and not part of standard C++.
if this dll could have been compiled with different compiler, which could have different method for struct padding?
If the library and the dependent don't follow an intercompatible ABI, then they cannot work together.
Is struct padding method enforced by the OS's ABI
Yes, class layout (structs are classes) is defined by the ABI.
For example padding will be the same on all Windows platforms
Not quite, since Windows on ARM has a different ABI for example. But within the same CPU architecture, the layout would be the same in Windows.
Or is there standard method for padding when compiling for PC (x64 or x86_64 systems) that is used in every modern compiler?
No, there is no universal class layout followed by every OS, even within the x86_64 architecture.
From what I checked, g++ compiles structs in such way, that it inserts minimum amount of padding, just to ensure alignment of next variable.
All objects in C++ must be aligned as per the alignment requirement of the type of the object. This guarantee isn't compiler specific. However alignment requirements of types - and even the sizes of types - vary across different ABIs.
Bonus info: Compilers have language extensions that remove such guarantee.
There are 3 bytes of padding after a because it is minimum amount that will give us suitably aligned address for b. Can we take for granted that compilers will do this that way?
In general no. On some systems, alignof(std::uint32_t) == 1 in which case there wouldn't be need for any padding.
Within a single ABI, you can take for granted that the layout is the same, but across multiple systems - which might not follow the same ABI - you cannot take it for granted.
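If your code relies on one particular layout, you can let the compiler confirm it for the ABI you are actually building against. A minimal sketch, assuming a typical x86_64 ABI where alignof(std::uint32_t) is 4; the numbers in the assertions and the helper name foo_layout_ok are assumptions for illustration, not something the standard promises:

```cpp
#include <cassert>
#include <cstddef>   // offsetof
#include <cstdint>   // std::uint32_t

struct foo
{
    char a;
    // 3 bytes of padding are expected here on common x86_64 ABIs
    std::uint32_t b;
};

// Compilation fails if the target ABI lays foo out differently from what
// the rest of the code assumes. The numbers are assumptions about the ABI.
static_assert(offsetof(foo, a) == 0, "a is expected at offset 0");
static_assert(offsetof(foo, b) == 4, "b is expected at offset 4");
static_assert(sizeof(foo) == 8, "foo is expected to occupy 8 bytes");

// Hypothetical helper so the same facts can also be checked at run time.
constexpr bool foo_layout_ok()
{
    return offsetof(foo, b) == 4 && sizeof(foo) == 8;
}
```

On a platform with a different layout the build stops at the failing static_assert, which is usually preferable to silently exchanging mismatched structs.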
When dealing with binary layout across systems (for example, when reading from a file or network), the standard compliant way is to treat the data as an array of bytes [1], and to copy each sequence of bytes [2] from pre-determined offsets onto fixed width [3] fundamental objects (not classes whose layout may differ). In practice, you don't need to care about sign representation although that used to be a problem historically.
If the optimiser does its job, there ideally shouldn't be any performance penalty if the layout of input data matches the native layout. In case it doesn't match, then there may be a cost (compared to a matching layout) that cannot be optimised away.
[1] This isn't sufficient when byte size differs across systems, but you don't need to worry about that since you care about x86_64 only.
[2] In order to support systems with varying byte endianness, you must interpret the bytes in order of their significance rather than memory order, but you don't need to worry about that since you care about x86_64 only.
[3] I.e. not int, short, long etc., but rather std::int32_t etc.
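The byte-array approach can be sketched roughly as follows. The wire format here (a 1-byte tag followed by a 32-bit little-endian value) and the names Record, read_u32_le and parse_record are made up for illustration. The value is assembled in order of significance, so the sketch also works on big-endian hosts even though that is not needed for x86_64:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wire format: a 1-byte tag at offset 0 and a 32-bit
// little-endian value at offset 1 (no padding in the byte stream).
struct Record
{
    std::uint8_t tag;
    std::uint32_t value;
};

// Assemble a 32-bit value from four bytes in order of significance,
// so the result does not depend on the host's endianness.
inline std::uint32_t read_u32_le(const unsigned char* p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Decode field by field from fixed offsets; the in-memory layout of
// Record never has to match the layout of the input bytes.
inline Record parse_record(const unsigned char* bytes)
{
    Record r{};
    r.tag = bytes[0];
    r.value = read_u32_le(bytes + 1);
    return r;
}
```

Because Record is only ever filled member by member, its padding and alignment are free to differ between compilers without affecting the format on disk or on the wire.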

The C and C++ standards were written to describe existing languages. In situations where 99+% of implementations would do things a certain way, and it was obvious that implementations should do things that way absent a compelling reason for doing otherwise, the standards would generally leave open the possibility of implementations doing something unusual.
Consider, for example, given something like:
struct foo {int i; char a,b[4],c,d,e;}; // Assume sizeof (int) is 4
struct foo myFoo;
On most platforms, making foo a three-word type which contains all of the individual bytes packed together may be more efficient than doing anything else. On the other hand, on a platform that uses word-addressed storage, but includes instructions to load or store bytes at a specified byte offset from a specified word address, word-aligning the start of b may allow a construct like myFoo.b[i] to be processed by directly using the value of i as an offset from the word-aligned address of myFoo.b.
The standards were designed to let people designing compilers for such platforms weigh the pros and cons of following normal practice versus deviating from it to better fit the target architecture.
Machines that use word addresses but allow byte-based loads and stores are of course exceptionally rare, and for very little code that isn't deliberately written for such machines would compatibility with them offer any added value whatsoever.
The committees weren't willing to say that such machines should be viewed as archaic and not worth supporting, but that doesn't mean they didn't expect and intend that programs written for commonplace implementations could exploit aspects of behavior that were shared by all commonplace implementations, even if not by some obscure ones.

Related

size of pointers and architecture

By conducting a basic test by running a simple C++ program on a normal desktop PC it seems plausible to suppose that sizes of pointers of any type (including pointers to functions) are equal to the target architecture bits ?
For example: in 32 bits architectures -> 4 bytes and in 64 bits architectures -> 8 bytes.
However I remember reading that, it is not like that in general!
So I was wondering what would be such circumstances?
For equality of size of pointers to data types compared with size of pointers to other data types
For equality of size of pointers to data types compared with size of pointers to functions
For equality of size of pointers to target architecture
No, it is not reasonable to assume. Making this assumption can cause bugs.
The sizes of pointers (and of integer types) in C or C++ are ultimately determined by the C or C++ implementation. Normal C or C++ implementations are heavily influenced by the architectures and the operating systems they target, but they may choose the sizes of their types for reasons other than execution speed, such as goals of supporting lower memory use (smaller pointers means less memory used in programs with lots of pointers), supporting code that was not written to be fully portable to any type sizes, or supporting easier use of big integers.
I have seen a compiler targeted for a 64-bit system but providing 32-bit pointers, for the purpose of building programs with smaller memory use. (It had been observed that the sizes of pointers were a considerable factor in memory consumption, due to the use of many structures with many connections and references using pointers.) Source code written with the assumption that the pointer size equalled the 64-bit register size would break.
It is reasonable to assume that in general sizes of pointers of any type (including pointers to functions) are equal to the target architecture bits?
Depends. If you're aiming for a quick estimate of memory consumption it can be good enough. But not if your program's correctness depends on it.
(including pointers to functions)
But here is one important remark. Although most pointers will have the same size, function pointers may differ. It is not guaranteed that a void* will be able to hold a function pointer. At least, this is true for C. I don't know about C++.
So I was wondering what would be such circumstances if any?
There can be tons of reasons why it differs. If your program's correctness depends on this size, it is NEVER ok to make such an assumption. Check it instead. It shouldn't be hard at all.
You can use this macro to check such things at compile time in C:
#include <assert.h>
static_assert(sizeof(void*) == 4, "Pointers are assumed to be exactly 4 bytes");
When compiling, this gives an error message:
$ gcc main.c
In file included from main.c:1:
main.c:2:1: error: static assertion failed: "Pointers are assumed to be exactly 4 bytes"
static_assert(sizeof(void*) == 4, "Pointers are assumed to be exactly 4 bytes");
^~~~~~~~~~~~~
If you're using C++, you can skip #include <assert.h> because static_assert is a keyword in C++. (And you can use the keyword _Static_assert in C, but it looks ugly, so use the include and the macro instead.)
Since these two lines are so extremely easy to include in your code, there's NO excuse not to do so if your program would not work correctly with the wrong pointer size.
It is reasonable to assume that in general sizes of pointers of any type (including pointers to functions) are equal to the target architecture bits?
It might be reasonable, but it isn't reliably correct. So I guess the answer is "no, except when you already know the answer is yes (and aren't worried about portability)".
Potentially:
systems can have different register sizes, and use different underlying widths for data and addressing: it's not apparent what "target architecture bits" even means for such a system, so you have to choose a specific ABI (and once you've done that you know the answer, for that ABI).
systems may support different pointer models, such as the old near, far and huge pointers; in that case you need to know what mode your code is being compiled in (and then you know the answer, for that mode)
systems may support different pointer sizes, such as the X32 ABI already mentioned, or either of the other popular 64-bit data models described here
Finally, there's no obvious benefit to this assumption, since you can just use sizeof(T) directly for whatever T you're interested in.
If you want to convert between integers and pointers, use intptr_t. If you want to store integers and pointers in the same space, just use a union.
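Both suggestions can be sketched in a few lines. This assumes the implementation provides std::uintptr_t (it is formally optional, though mainstream platforms have it); the names pointer_round_trips and IntOrPointer are only illustrative:

```cpp
#include <cassert>
#include <cstdint>

// Round-trip a data pointer through an integer type that is wide enough.
// std::uintptr_t is an optional type, but mainstream platforms provide it.
inline bool pointer_round_trips(int* p)
{
    std::uintptr_t as_integer = reinterpret_cast<std::uintptr_t>(p);
    int* back = reinterpret_cast<int*>(as_integer);
    return back == p;
}

// Storage that is large enough for either member, whatever their sizes are.
union IntOrPointer
{
    long long number;
    void* pointer;
};

static_assert(sizeof(IntOrPointer) >= sizeof(void*), "union holds a pointer");
static_assert(sizeof(IntOrPointer) >= sizeof(long long), "union holds an integer");
```

Neither piece hard-codes 4 or 8 anywhere, so the code keeps working when the pointer size is not what you guessed.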
Target architecture "bits" refers to the register size. E.g. the Intel 8051 is 8-bit and operates on 8-bit registers, but (external) RAM and (external) ROM are accessed with 16-bit addresses.
For correctness, you cannot assume anything. You have to check and be prepared to deal with weird situations.
As a general rule of thumb, it is a reasonable default assumption.
It's not universally true though. See the X32 ABI, for example, which uses 32bit pointers on 64bit architectures to save a bit of memory and cache footprint. Same for the ILP32 ABI on AArch64.
So, for guesstimating memory use, you can use your assumption and it will often be right.
It is reasonable to assume that in general sizes of pointers of any type (including pointers to functions) are equal to the target architecture bits?
If you look at all types of CPUs (including microcontrollers) currently being produced, I would say no.
Extreme counterexamples would be architectures where two different pointer sizes are used in the same program:
x86, 16-bit
In MS-DOS and 16-bit Windows, a "normal" program used both 16- and 32-bit pointers.
x86, 32-bit segmented
There were only a few, less known operating systems using this memory model.
Programs typically used both 32- and 48-bit pointers.
STM8A
This modern automotive 8-bit CPU uses 16- and 24-bit pointers. Both in the same program, of course.
AVR tiny series
RAM is addressed using 8-bit pointers, Flash is addressed using 16-bit pointers.
(However, AVR tiny cannot be programmed with C++, as far as I know.)
It's not correct, for example DOS pointers (16 bit) can be far (seg+ofs).
However, for the usual targets (Windows, OSX, Linux, Android, iOS) then it's correct. Because they all use the flat programming model which relies on paging.
In theory, you can also have systems which use only the lower 32 bits when in x64. An example is a Windows executable linked without LARGEADDRESSAWARE. However this is to help the programmer avoid bugs when switching to x64. The pointers are truncated to 32 bits, but they are still 64 bit.
In x64 operating systems then this assumption is always true, because the flat mode is the only valid one. Long mode in CPU forces GDT entries to be 64 bit flat.
Someone also mentioned the x32 ABI; I believe it is based on the same paging technology, forcing all pointers to be mapped to the lower 4 GB. However, this must be based on the same theory as in Windows. In x64 you can only have flat mode.
In 32 bit protected mode you could have pointers up to 48 bits. (Segmented mode). You can also have callgates. But, no operating system uses that mode.
Historically, on microcomputers and microcontrollers, pointers were often wider than general-purpose registers so that the CPU could address enough memory and still fit within the transistor budget. Most 8-bit CPUs (such as the 8080, Z80 or 6502) had 16-bit addresses.
Today, a mismatch is more likely to be because an app doesn’t need multiple gigabytes of data, so saving four bytes of memory on every pointer is a win.
Both C and C++ provide separate size_t and uintptr_t types, and POSIX adds off_t. These represent, respectively, the largest possible object size (which might be smaller than the size of a pointer if the memory model is not flat), an integral type wide enough to hold a pointer, and a file offset (often wider than the largest object allowed in memory). A size_t (unsigned) or ptrdiff_t (signed) is the most portable way to get the native word size. Additionally, POSIX guarantees that the system compiler has some flag that means a long can hold any of these, but you cannot always assume so.
Generally pointers will be size 2 on a 16-bit system, 3 on a 24-bit system, 4 on a 32-bit system, and 8 on a 64-bit system. It depends on the ABI and C implementation. AMD has long and legacy modes, and there are differences between AMD64 and Intel64 for Assembly language programmers but these are hidden for higher level languages.
Any problems with C/C++ code are likely to be due to poor programming practices and ignoring compiler warnings. See: "20 issues of porting C++ code to the 64-bit platform".
See also: "Can pointers be of different sizes?" and LRiO's answer:
... you are asking about C++ and its compliant implementations, not some specific physical machine. I'd have to quote the entire standard in order to prove it, but the simple fact is that it makes no guarantees on the result of sizeof(T*) for any T, and (as a corollary) no guarantees that sizeof(T1*) == sizeof(T2*) for any T1 and T2).
Note: as answered by JeremyP, C99 section 6.3.2.3, paragraph 8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
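C++ gives the same round-trip guarantee for reinterpret_cast between function pointer types. A small sketch of what the quoted paragraph permits; add_one, IntFn, VoidFn and round_trip are made-up names for illustration:

```cpp
#include <cassert>

inline int add_one(int x) { return x + 1; }

using IntFn  = int (*)(int);
using VoidFn = void (*)();

// Convert to an unrelated function pointer type and back again. Calling
// through the intermediate VoidFn would be undefined behaviour; only the
// round trip itself is guaranteed.
inline IntFn round_trip(IntFn f)
{
    VoidFn erased = reinterpret_cast<VoidFn>(f);
    return reinterpret_cast<IntFn>(erased);
}
```

Note that the intermediate type is another function pointer, not void*. Whether a function pointer fits in a void* is exactly the kind of thing that is not guaranteed.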
In GCC you can avoid incorrect assumptions by using built-in functions: "Object Size Checking Built-in Functions":
Built-in Function: size_t __builtin_object_size (const void * ptr, int type)
is a built-in construct that returns a constant number of bytes from ptr to the end of the object ptr pointer points to (if known at compile time). To determine the sizes of dynamically allocated objects the function relies on the allocation functions called to obtain the storage to be declared with the alloc_size attribute (see Common Function Attributes). __builtin_object_size never evaluates its arguments for side effects. If there are any side effects in them, it returns (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3. If there are multiple objects ptr can point to and all of them are known at compile time, the returned number is the maximum of remaining byte counts in those objects if type & 2 is 0 and minimum if nonzero. If it is not possible to determine which objects ptr points to at compile time, __builtin_object_size should return (size_t) -1 for type 0 or 1 and (size_t) 0 for type 2 or 3.

Size of Primitive data types

What exactly does the size of a primitive data type like int depend on?
Compiler
Processor
Development Environment
Or is it a combination of these or other factors?
An explanation on the reason of the same will be really helpful.
EDIT: Sorry for the confusion..I meant to ask about Primitive data type like int and not regarding PODs, I do understand PODs can include structure and with structure it is a whole different ball game with padding coming in to the picture.
I have corrected the Q, the edit note here should ensure the answers regarding POD don't look irrelevant.
I think there are two parts to this question:
What sizes primitive types are allowed to be.
This is specified by the C and C++ standards: the types have allowed minimum value ranges they must have, which implicitly places a lower bound on their size in bits (e.g. long must be at least 32 bit to comply with the standard).
The standards do not specify the size in bytes, because the definition of the byte is up to the implementation, e.g. char is byte, but byte size (CHAR_BIT macro) may be 16 bit.
The actual size as defined by the implementation.
This, as other answers have already pointed out, is dependent on the implementation: the compiler. And the compiler implementation, in turn, is heavily influenced by the target architecture. So it's plausible to have two compilers running on the same OS and architecture, but having different size of int. The only assumption you can make is the one stated by the standard (given that the compiler implements it).
There also may be additional ABI requirements (e.g. fixed size of enums).
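The two parts can be seen side by side in code: the lower bounds come from the standard, the actual numbers come from the implementation. A minimal sketch; storage_bits is a hypothetical helper name, and the figure it returns includes padding bits if the type has any:

```cpp
#include <cassert>
#include <climits>   // CHAR_BIT
#include <cstddef>

// Width of a type's storage in bits: sizeof counts bytes, CHAR_BIT says how
// many bits a byte has on this implementation. Padding bits, if any, are
// included in this figure.
template <typename T>
constexpr std::size_t storage_bits()
{
    return sizeof(T) * CHAR_BIT;
}

// Lower bounds implied by the minimum value ranges in the standards.
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
static_assert(storage_bits<short>() >= 16, "short holds at least 16 bits");
static_assert(storage_bits<int>() >= 16, "int holds at least 16 bits");
static_assert(storage_bits<long>() >= 32, "long holds at least 32 bits");
static_assert(storage_bits<long long>() >= 64, "long long holds at least 64 bits");
```

These assertions should hold on any conforming implementation; anything stricter (for example int being exactly 32 bits) is a property of your compiler and target, not of the language.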
First of all, it depends on the compiler. The compiler in turn usually depends on the architecture, processor, development environment etc. because it takes them into account. So you may say it's a combination of all. But I would NOT say that. I would say the compiler, since on the same machine you may have different sizes of POD and built-in types if you use different compilers. Also note that your source code is input to the compiler, so it's the compiler which makes the final decision on the sizes of POD and built-in types. However, it's also true that this decision is influenced by the underlying architecture of the target machine. After all, a really useful compiler has to emit efficient code that eventually runs on the machine you target.
Compilers provide options too. A few of them might affect sizes as well!
EDIT: What Standards say,
Size of char, signed char and unsigned char is defined by C++ Standard itself! Sizes of all other types are defined by the compiler.
C++03 Standard $5.3.3/1 says,
sizeof(char), sizeof(signed char) and sizeof(unsigned char) are 1; the result of sizeof applied to any other fundamental type (3.9.1) is implementation-defined. [Note: in particular, sizeof(bool) and sizeof(wchar_t) are implementation-defined.]
C99 Standard ($6.5.3.4) also itself defines the size of char, signed char and unsigned char to be 1, but leaves the size of other types to be defined by the compiler!
EDIT:
I found this C++ FAQ chapter really good. The entire chapter. It's very tiny chapter though. :-)
http://www.parashift.com/c++-faq-lite/intrinsic-types.html
Also read the comments below, there are some good arguments!
If you're asking about the size of a primitive type like int, I'd say it depends on the factors you cited.
The compiler/environment couple (where environment often means OS) is surely a part of it, since the compiler can map the various "sensible" sizes on the builtin types in different ways for various reasons: for example, compilers on x86_64 Windows will usually have a 32 bit long and a 64 bit long long to avoid breaking code thought for plain x86; on x86_64 Linux, instead, long is usually 64 bit because it's a more "natural" choice and apps developed for Linux are generally more architecture-neutral (because Linux runs on a much greater variety of architectures).
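The Windows versus Linux difference in long shows up directly in sizeof. A rough sketch that names the data model from the sizes the compiler picked, assuming 8-bit bytes; data_model is a hypothetical helper, and the labels are just the conventional names for these combinations:

```cpp
#include <cassert>
#include <string>

// Name the data model from the sizes the compiler actually chose.
// The labels are the conventional ones; anything else is reported as "other".
inline std::string data_model()
{
    constexpr auto i = sizeof(int);
    constexpr auto l = sizeof(long);
    constexpr auto p = sizeof(void*);

    if (i == 4 && l == 4 && p == 4) return "ILP32";  // typical 32-bit targets
    if (i == 4 && l == 4 && p == 8) return "LLP64";  // e.g. 64-bit Windows
    if (i == 4 && l == 8 && p == 8) return "LP64";   // e.g. 64-bit Linux, macOS
    return "other";
}
```

Building the same source with a Windows toolchain and a Linux toolchain on the same x86_64 machine would be expected to give different answers, which is the point made above.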
The processor surely matters in the decision: int should be the "natural size" of the processor, usually the size of the general-purpose registers of the processor. This means that it's the type that will work faster on the current architecture. long instead is often thought as a type which trades performance for an extended range (this is rarely true on regular PCs, but on microcontrollers it's normal).
If instead you're also talking about structs & co. (which, if they respect some rules, are POD), again the compiler and the processor influence their size, since they are made of builtin types and of the appropriate padding chosen by the compiler to achieve the best performance on the target architecture.
As I commented under @Nawaz's answer, it technically depends solely on the compiler.
The compiler is just tasked with taking valid C++ code, and outputting valid machine code (or whatever language it targets).
So a C++ compiler could decide to make an int have a size of 15, and require it to be aligned on 5-byte boundaries, and it could decide to insert arbitrary padding between the variables in a POD. Nothing in the standard prohibits this, and it could still generate working code.
It'd just be much slower.
So in practice, compilers take some hints from the system they're running on, in two ways:
- the CPU has certain preferences: for example, it may have 32-bit wide registers, so making an int 32 bits wide would be a good idea, and it usually requires variables to be naturally aligned (a 4-byte wide variable must be aligned on an address divisible by 4, for example), so a sensible compiler respects these preferences because it yields faster code.
- the OS may have some influence too, in that if it uses another ABI than the compiler, making system calls is going to be needlessly difficult.
But those are just practical considerations to make life a bit easier for the programmer or to generate faster code. They're not required.
The compiler has the final word, and it can choose to completely ignore both the CPU and the OS. As long as it generates a working executable with the semantics specified in the C++ standard.
It depends on the implementation (compiler).
Implementation-defined behavior means unspecified behavior where each implementation documents how the choice is made.
A struct can also be POD, in which case you can explicitly control potential padding between members with #pragma pack on some compilers.
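For illustration, here is roughly what that looks like with the push/pop form of the pragma, which GCC, Clang and MSVC accept. It is a compiler extension rather than standard C++, and the struct names Natural and Packed are invented for this sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct Natural
{
    char a;
    std::uint32_t b;   // usually preceded by 3 padding bytes
};

// push/pop limits the packing to this one declaration. The pragma is a
// compiler extension (GCC, Clang and MSVC accept this form).
#pragma pack(push, 1)
struct Packed
{
    char a;
    std::uint32_t b;   // placed directly after a, possibly misaligned
};
#pragma pack(pop)

static_assert(sizeof(Packed) == sizeof(char) + sizeof(std::uint32_t),
              "packed struct has no padding");
static_assert(sizeof(Natural) >= sizeof(Packed),
              "padding can only add bytes");
```

If a compiler ignores the pragma, the first static_assert fails at compile time instead of the mismatch going unnoticed. Keep in mind that b in Packed may be misaligned, so access to it can be slower, and forming a pointer or reference to it is best avoided.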

Is it possible to share a C struct in shared memory between apps compiled with different compilers?

I realize that in general the C and C++ standards give compiler writers a lot of latitude. But in particular they guarantee that POD types like C struct members have to be laid out in memory in the same order that they're listed in the struct's definition, and most compilers provide extensions letting you fix the alignment of members. So if you had a header that defined a struct and manually specified the alignment of its members, then compiled two apps with different compilers using the header, shouldn't one app be able to write an instance of the struct into shared memory and the other app be able to read it without errors?
I am assuming though that the size of the types contained is consistent across two compilers on the same architecture (it has to be the same platform already since we're talking about shared memory). I realize that this is not always true for some types (e.g. long vs. long long in GCC and MSVC 64-bit) but nowadays there are uint16_t, uint32_t, etc. types, and float and double are specified by IEEE standards.
As long as you can guarantee the exact same memory layout, including offsets, and the data types have the same sizes between the 2 compilers then yes this is fine. Because at that point the struct is identical with respect to data access.
Yes, sure. I've done this many times. The problems and solutions are the same whether mixed code is compiled and linked together, or when transmitting struct-formatted data between machines.
In the bad old days, this frequently occurred when integrating MS C and almost anything else: Borland Turbo C, DEC VAX C, Greenhills C.
The easy part is getting the number of bytes for various data types to agree. For example short on a 32-bit compiler on one side being the same as int on a 16-bit compiler at the other end. Since common source code to declare structures is usually a good thing, a number of to-the-point declarations are helpful:
typedef signed long s32;
typedef signed short s16;
typedef signed char s8;
typedef unsigned long u32;
typedef unsigned short u16;
typedef unsigned char u8;
...
Microsoft C is the most annoying. Its default is to pad members to 16-bit alignment, and maybe more with 64-bit code. Other compilers on x86 don't pad members.
struct {
int count;
char type;
char code;
char data [100];
} variable;
It might seem like the offset of code should be the next byte after type, but there might be a padding byte inserted between. The fix is usually
#ifdef _MSC_VER // if it's any Microsoft compiler
#pragma pack(1) // byte align structure members--that is, no padding
#endif
There is also a compiler command line option to do the same.
The way memory is laid out is important in addition to the datatype size if you need struct from library 1 compiled by compiler 1 to be used in library 2 compiled by compiler 2.
It is indeed possible, you just have to make sure that all compilers involved generate the same data structure from the same code. One way to test this is to write a sample program that creates a struct and writes it to a binary file. Open the resulting files in a hex editor and verify that they are the same. Alternatively, you can cast the struct to an array of uint8_t and dump the individual bytes to the screen.
One way to make sure that the data sizes are the same is to use data types like int16_t (from stdint.h) instead of a plain old int which may change sizes between compilers (although this is rare on two compilers running on the same platform).
It's not as difficult as it sounds. There are many pre-compiled libraries out there that can be used with multiple compilers. The key thing is to build a test program that will let you verify that both compilers are treating the structure equally.
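The byte-dump test described above might look something like this. Message and hex_dump are hypothetical names, and the fields are placeholders for whatever your shared struct contains:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Example structure; the fields are placeholders.
struct Message
{
    std::uint16_t id;
    std::uint8_t  kind;
    std::uint32_t length;
};

// Render the object representation of any trivially copyable object as hex,
// padding bytes included. Run it under each compiler and compare the output.
template <typename T>
std::string hex_dump(const T& object)
{
    unsigned char bytes[sizeof(T)];
    std::memcpy(bytes, &object, sizeof(T));

    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (std::size_t i = 0; i < sizeof(T); ++i)
    {
        out += digits[bytes[i] >> 4];
        out += digits[bytes[i] & 0x0F];
    }
    return out;
}
```

Padding bytes have indeterminate values, so for a reproducible comparison zero the object with std::memset before assigning the fields, and give each field a distinctive value so that its position is easy to spot in the dump.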
Refer to your compiler manuals.
most compilers provide extensions letting you fix the alignment of members
Are you restricting yourself to those compilers and a mutually compatible #pragma align style? If so, the safety is dictated by their specification.
In the interest of portability, you are possibly better off ditching #pragma align and relying on your ABI, which may provide a "reasonable" standard for compliance of all compilers of your platform.
As the C and C++ standards allow any deterministic struct layout methodology, they're essentially irrelevant.

Determining the alignment of C/C++ structures in relation to its members

Can the alignment of a structure type be found if the alignments of the structure members are known?
Eg. for:
struct S
{
a_t a;
b_t b;
c_t c[];
};
is the alignment of S = max(alignment_of(a), alignment_of(b), alignment_of(c))?
Searching the internet I found that "for structured types the largest alignment requirement of any of its elements determines the alignment of the structure" (in What Every Programmer Should Know About Memory) but I couldn't find anything remotely similar in the standard (latest draft more exactly).
Edited:
Many thanks for all the answers, especially to Robert Gamble who provided a really good answer to the original question and the others who contributed.
In short:
To ensure alignment requirements for structure members, the alignment of a structure must be at least as strict as the alignment of its strictest member.
As for determining the alignment of structure a few options were presented and with a bit of research this is what I found:
c++ std::tr1::alignment_of
not standard yet, but close (technical report 1), should be in the C++0x
the following restrictions are present in the latest draft: Precondition: T shall be a complete type, a reference type, or an array of unknown bound, but shall not be a function type or (possibly cv-qualified) void.
this means that my presented use case with the C99 flexible array won't work (this is not that surprising since flexible arrays are not standard c++)
in the latest c++ draft it is defined in terms of a new keyword - alignof (this has the same complete type requirement)
in my opinion, should c++ standard ever support C99 flexible arrays, the requirement could be relaxed (the alignment of the structure with the flexible array should not change based on the number of the array elements)
c++ boost::alignment_of
mostly a tr1 replacement
seems to be specialized for void and returns 0 in that case (this is forbidden in the c++ draft)
Note from developers: strictly speaking you should only rely on the value of ALIGNOF(T) being a multiple of the true alignment of T, although in practice it does compute the correct value in all the cases we know about.
I don't know if this works with flexible arrays, it should (might not work in general, this resolves to compiler intrinsic on my platform so I don't know how it will behave in the general case)
Andrew Top presented a simple template solution for calculating the alignment in the answers
this seems to be very close to what boost is doing (boost will additionally return the object size as the alignment if it is smaller than the calculated alignment as far as I can see) so probably the same notice applies
this works with flexible arrays
use Windbg.exe to find out the alignment of a symbol
not compile time, compiler specific, didn't test it
using offsetof on the anonymous structure containing the type
see the answers, not reliable, not portable with c++ non-POD
compiler intrinsics, eg. MSVC __alignof
works with flexible arrays
alignof keyword is in the latest c++ draft
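For reference, the offsetof-based idea from the list can be sketched like this. This is not the template from the answers, just the general trick with hypothetical names (AlignmentProbe, alignment_by_offset). As the boost note says, strictly speaking the result is only known to be a multiple of the true alignment, and offsetof is only conditionally supported when the probed type is not standard-layout:

```cpp
#include <cassert>
#include <cstddef>   // offsetof, std::size_t

// Put a single char in front of a T; the compiler has to pad so that
// value is suitably aligned, which makes its offset a multiple of alignof(T).
template <typename T>
struct AlignmentProbe
{
    char first;
    T value;
};

template <typename T>
constexpr std::size_t alignment_by_offset()
{
    return offsetof(AlignmentProbe<T>, value);
}
```

With a C++11 or later compiler, alignof(T) gives the answer directly, so a probe like this is mainly useful for older toolchains or for cross-checking.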
If we want to use the "standard" solution we're limited to std::tr1::alignment_of, but that won't work if you mix your c++ code with c99's flexible arrays.
As I see it there is only 1 solution - use the old struct hack:
struct S
{
a_t a;
b_t b;
c_t c[1]; // "has" more than 1 member, strictly speaking this is undefined behavior in both c and c++ when used this way
};
The diverging c and c++ standards and their growing differences are unfortunate in this case (and every other case).
Another interesting question is (if we can't find out the alignment of a structure in a portable way) what the strictest possible alignment requirement is. There are a couple of solutions I could find:
boost (internally) uses a union of variety of types and uses the boost::alignment_of on it
the latest c++ draft contains std::aligned_storage
The value of default-alignment shall be the most stringent alignment requirement for any C++ object type whose size is no greater than Len
so std::alignment_of<std::aligned_storage<BigEnoughNumber>::type>::value should give us the maximum alignment
draft only, not standard yet (if ever), tr1::aligned_storage does not have this property
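For what it's worth, C++11 and later also provide std::max_align_t, whose alignment is at least as strict as that of every scalar type. A minimal sketch, assuming a C++11 compiler (the constant name kMaxFundamentalAlignment is made up for illustration):

```cpp
#include <cstddef>   // std::max_align_t, std::size_t

// Hypothetical constant name; the value is whatever the implementation reports.
constexpr std::size_t kMaxFundamentalAlignment = alignof(std::max_align_t);

// Every scalar type must fit within that alignment.
static_assert(kMaxFundamentalAlignment >= alignof(long double), "long double fits");
static_assert(kMaxFundamentalAlignment >= alignof(long long),   "long long fits");
static_assert(kMaxFundamentalAlignment >= alignof(void*),       "pointers fit");
```

Note that this says nothing about over-aligned types (for example SIMD types), which may need a stricter alignment than this.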
Any thoughts on this would also be appreciated.
I have temporarily unchecked the accepted answer to get more visibility and input on the new sub-questions
There are two closely related concepts here:
The alignment required by the processor to access a particular object
The alignment that the compiler actually uses to place objects in memory
To ensure alignment requirements for structure members, the alignment of a structure must be at least as strict as the alignment of its strictest member. I don't think this is spelled out explicitly in the standard but it can be inferred from the following facts (which are spelled out individually in the standard):
Structures are allowed to have padding between their members (and at the end)
Arrays are not allowed to have padding between their elements
You can create an array of any structure type
If the alignment of a structure were not at least as strict as that of each of its members, you would not be able to create an array of structures, since some members of some elements would not be properly aligned.
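The inference above can be checked at compile time. A small sketch, assuming a C++11 compiler; the struct Sample is a made-up example, not something from the question:

```cpp
#include <cstddef>   // offsetof

// Hypothetical structure used only to illustrate the argument.
struct Sample {
    char   tag;
    double value;
    short  count;
};

// The structure is at least as strictly aligned as its strictest member.
static_assert(alignof(Sample) >= alignof(double), "struct alignment covers double");
// Each member sits at an offset that suits its own alignment.
static_assert(offsetof(Sample, value) % alignof(double) == 0, "value is aligned");
// The size is a multiple of the alignment, so array elements stay aligned.
static_assert(sizeof(Sample) % alignof(Sample) == 0, "size is a multiple of alignment");
// Arrays have no padding between elements.
static_assert(sizeof(Sample[4]) == 4 * sizeof(Sample), "no inter-element padding");
```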
Now the compiler must ensure a minimum alignment for the structure based on the alignment requirements of its members but it can also align objects in a stricter fashion than required, this is often done for performance reasons. For example, many modern processors will allow access to 32-bit integers in any alignment but accesses may be significantly slower if they are not aligned on a 4-byte boundary.
There is no portable way to determine the alignment enforced by the processor for any given type because this is not exposed by the language, although since the compiler obviously knows the alignment requirements of the target processor it could expose this information as an extension.
There is also no portable way (at least in C) to determine how a compiler will actually align an object although many compilers have options to provide some level of control over the alignment.
I wrote this type trait code to determine the alignment of any type (based on the compiler rules already discussed). You may find it useful:
template <class T>
class Traits
{
public:
struct AlignmentFinder
{
char a;
T b;
};
enum {AlignmentOf = sizeof(AlignmentFinder) - sizeof(T)};
};
So now you can go:
std::cout << "The alignment of structure S is: " << Traits<S>::AlignmentOf << std::endl;
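If a C++11 compiler is available, the same trick can be cross-checked against the built-in alignof. This is only a sketch: AlignmentProbe and probe_matches_alignof are hypothetical names, and the agreement is what mainstream compilers are expected to produce, not something the standard promises.

```cpp
#include <cstddef>

// Same idea as Traits<T>::AlignmentFinder: a char followed by a T.
template <class T>
struct AlignmentProbe {
    char a;
    T    b;
};

// True when the size difference equals what alignof reports for T.
template <class T>
constexpr bool probe_matches_alignof() {
    return sizeof(AlignmentProbe<T>) - sizeof(T) == alignof(T);
}

static_assert(probe_matches_alignof<short>(), "short");
static_assert(probe_matches_alignof<int>(),   "int");
static_assert(probe_matches_alignof<void*>(), "pointer");
```

If packing pragmas or options are in effect, the probe reports the packed layout rather than the natural alignment, so the two can legitimately disagree.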
The following macro will return the alignment requirement of any given type (even if it's a struct):
#define TYPE_ALIGNMENT( t ) offsetof( struct { char x; t test; }, test )
Note: I probably borrowed this idea from a Microsoft header at some point way back in my past...
Edit: as Robert Gamble points out in the comments, this macro is not guaranteed to work. In fact, it will certainly not work very well if the compiler is set to pack elements in structures. So if you decide to use it, use it with caution.
Some compilers have an extension that allows you obtain the alignment of a type (for example, starting with VS2002, MSVC has an __alignof() intrinsic). Those should be used when available.
As the others mentioned, it's implementation dependent. Visual Studio 2005 uses 8 bytes as the default structure alignment. Internally, items are aligned by their size - a float has 4 byte alignment, a double uses 8, etc.
You can override the behavior with #pragma pack. GCC (and most compilers) have similar compiler options or pragmas.
It is possible to assume a structure alignment if you know more details about the compiler options that are in use. For example, #pragma pack(1) will force alignment on the byte level for some compilers.
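To make the effect of #pragma pack(1) concrete, here is a sketch that assumes a compiler accepting the push/pop form of the pragma (MSVC, GCC and Clang do) and 8-bit bytes. The struct names Natural and Packed are invented for the example.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // std::uint32_t

// Default packing: the compiler may insert padding after 'a'.
struct Natural {
    char          a;
    std::uint32_t b;
};

#pragma pack(push, 1)
// Packing of 1: members are laid out back to back.
struct Packed {
    char          a;
    std::uint32_t b;
};
#pragma pack(pop)

static_assert(offsetof(Packed, b) == 1, "b directly follows a");
static_assert(sizeof(Packed) == 5,      "no padding at all");
static_assert(alignof(Packed) == 1,     "byte alignment");
static_assert(sizeof(Natural) >= sizeof(Packed), "natural layout is never smaller");
```

Keep in mind that Packed::b is then misaligned, which is exactly the trade-off the rest of this thread discusses.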
Side note: I know the question was about alignment, but a side issue is padding. For embedded programming, binary data, and so forth -- In general, don't assume anything about structure alignment if possible. Rather use explicit padding if necessary in the structures. I've had cases where it was impossible to duplicate the exact alignment used in one compiler to a compiler on a different platform without adding padding elements. It had to do with the alignment of structures inside of structures, so adding padding elements fixed it.
If you want to find this out for a particular case in Windows, open up windbg:
Windbg.exe -z \path\to\somemodule.dll -y \path\to\symbols
Then, run:
dt somemodule!CSomeType
I don't think memory layout is guaranteed in any way in any C standard. This is very much vendor- and architecture-dependent. There might be ways to do it that work in 90% of cases, but they are not standard.
I would be very glad to be proven wrong, though =)
I agree mostly with Paul Betts, Ryan and Dan. Really, it's up to the developer: you can either keep the default alignment semantics that Robert described (Robert's explanation is just the default behaviour and not by any means enforced or required), or you can set up whatever alignment you want with /Zp[##].
What this means is that if you have a type containing floats, long doubles, unsigned chars and assorted arrays, and then another type that has some of these oddly shaped members, then a single byte, then another odd member, it will simply be aligned according to whatever preference the make/solution file defines.
As noted earlier, using windbg's dt command at runtime you can find out how the compiler laid out the structure in memory.
You can also use any pdb reading tool like dia2dump to extract this info from pdb's statically.
Modified from Peeter Joot's Blog
C structure alignment is generally based on the largest native type in the structure (an exception is something like using a 64-bit integer on win32, where only 32-bit alignment is required).
If you have only chars and arrays of chars, once you add an int, that int will end up starting on a 4 byte boundary (with possible hidden padding before the int member). Additionally, if the structure isn’t a multiple of sizeof(int), hidden padding will be added at the end. Same thing for short and 64-bit types.
Example:
struct blah1 {
char x ;
char y[2] ;
};
sizeof(blah1) == 3
struct blah1plusShort {
char x ;
char y[2] ;
// <<< hidden one byte inserted by the compiler here
// <<< z will start on a 2 byte boundary (if beginning of struct is aligned).
short z ;
char w ;
// <<< hidden one byte tail pad inserted by the compiler.
// <<< the total struct size is a multiple of the biggest element.
// <<< This ensures alignment if used in an array.
};
sizeof(blah1plusShort) == 8
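The hidden padding described in the comments can be verified with offsetof. A sketch assuming a C++11 compiler; the exact numbers (offset 4 for z, total size 8) are what you would typically see where short is 2 bytes, so only the portable properties are asserted here:

```cpp
#include <cstddef>   // offsetof

struct blah1plusShort {
    char  x;
    char  y[2];
    short z;   // preceded by hidden padding if needed
    char  w;
};

// z starts on a boundary suitable for short.
static_assert(offsetof(blah1plusShort, z) % alignof(short) == 0, "z is aligned");
// Tail padding keeps every array element aligned for short.
static_assert(sizeof(blah1plusShort) % alignof(short) == 0, "size suits arrays");
// The struct is at least as large as the sum of its members.
static_assert(sizeof(blah1plusShort) >= 3 + sizeof(short) + 1, "members fit");
```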
I read this answer after 8 years and I feel that the accepted answer from @Robert is generally right, but mathematically wrong.
To ensure alignment requirements for structure members, the alignment of a structure must be at least as strict as the least common multiple of the alignment of its members. Consider an odd example, where the alignment requirements of members are 4 and 10; in which case the alignment of the structure is LCM(4, 10) which is 20, and not 10. Of course, it is odd to see platforms with such alignment requirement which is not a power of 2, and thus for all practical cases, the structure alignment is equal to the maximum alignment of its members.
The reason for this is that only if the structure starts at an address that is a multiple of the LCM of its member alignments can the alignment of all the members be satisfied while the padding between the members and at the end of the structure stays independent of the start address.
Update: As pointed out by @chqrlie in the comments, the C standard does not allow such odd alignment values. However, this answer still shows why the structure alignment is the maximum of its member alignments: the maximum happens to be the least common multiple, and thus the members are always aligned relative to an address that is a common multiple.
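The arithmetic is easy to check. A small sketch with hand-written helpers (gcd_of, lcm_of and max_of are hypothetical names, written for C++11 constexpr):

```cpp
#include <cstddef>

// Euclid's algorithm, recursive so it is valid in C++11 constexpr.
constexpr std::size_t gcd_of(std::size_t a, std::size_t b) {
    return b == 0 ? a : gcd_of(b, a % b);
}
constexpr std::size_t lcm_of(std::size_t a, std::size_t b) {
    return a / gcd_of(a, b) * b;
}
constexpr std::size_t max_of(std::size_t a, std::size_t b) {
    return a > b ? a : b;
}

// The odd example from the text: LCM and maximum differ.
static_assert(lcm_of(4, 10) == 20, "LCM(4, 10) is 20, not 10");
// For powers of two the LCM is simply the larger value.
static_assert(lcm_of(4, 8) == max_of(4, 8), "LCM equals max for powers of two");
static_assert(lcm_of(alignof(char), alignof(int)) == max_of(alignof(char), alignof(int)),
              "holds for real alignments too");
```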

Are POD types always aligned?

For example, if I declare a long variable, can I assume it will always be aligned on a "sizeof(long)" boundary? Microsoft Visual C++ online help says so, but is it standard behavior?
some more info:
a. It is possible to explicitly create a misaligned integer (*bar):
char foo[5];
int * bar = (int *)(&foo[1]);
b. Apparently, #pragma pack() only affects structures, classes, and unions.
c. MSVC documentation states that POD types are aligned to their respective sizes (but is it always or by default, and is it standard behavior, I don't know)
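Regarding (a): if the goal is just to read or write an int at an arbitrary byte offset, copying the bytes avoids dereferencing a misaligned pointer altogether. A sketch with two made-up helpers, load_int_unaligned and store_int_unaligned:

```cpp
#include <cassert>
#include <cstring>   // std::memcpy

// Hypothetical helper: reads an int that may start at any byte address.
inline int load_int_unaligned(const char* p) {
    assert(p != nullptr);
    int value;
    std::memcpy(&value, p, sizeof value);   // byte copy, no alignment needed
    return value;
}

// Hypothetical helper: writes an int to any byte address.
inline void store_int_unaligned(char* p, int value) {
    assert(p != nullptr);
    std::memcpy(p, &value, sizeof value);
}
```

Compilers commonly turn such a fixed-size memcpy into a plain load or store where the target allows it, but that is an optimisation detail rather than a guarantee.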
As others have mentioned, this isn't part of the standard and is left up to the compiler to implement as it sees fit for the processor in question. For example, VC could easily implement different alignment requirements for an ARM processor than it does for x86 processors.
Microsoft VC implements what is basically called natural alignment up to the size specified by the #pragma pack directive or the /Zp command line option. This means that, for example, any POD type with a size smaller or equal to 8 bytes will be aligned based on its size. Anything larger will be aligned on an 8 byte boundary.
If it is important that you control alignment for different processors and different compilers, then you can use a packing size of 1 and pad your structures.
#pragma pack(push)
#pragma pack(1)
struct Example
{
short data1; // offset 0
short padding1; // offset 2
long data2; // offset 4
};
#pragma pack(pop)
In this code, the padding1 variable exists only to make sure that data2 is naturally aligned.
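One way to make sure a hand-padded layout really is what every compiler produces is to assert it at compile time. This sketch is a variant of the struct above, not the original: it swaps short and long for fixed-width types (long is 4 bytes on Windows but 8 on most 64-bit Linux systems), and WireExample is a made-up name. It assumes C++11 and a compiler that supports #pragma pack.

```cpp
#include <cstddef>   // offsetof
#include <cstdint>   // fixed-width integer types

#pragma pack(push, 1)
struct WireExample {
    std::int16_t data1;     // offset 0
    std::int16_t padding1;  // offset 2, explicit padding
    std::int32_t data2;     // offset 4, naturally aligned by hand
};
#pragma pack(pop)

// If any compiler lays this out differently, the build fails instead of
// silently producing an incompatible structure.
static_assert(offsetof(WireExample, data1)    == 0, "data1 offset");
static_assert(offsetof(WireExample, padding1) == 2, "padding1 offset");
static_assert(offsetof(WireExample, data2)    == 4, "data2 offset");
static_assert(sizeof(WireExample)             == 8, "total size");
```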
Answer to a:
Yes, that can easily cause misaligned data. On an x86 processor, this doesn't really hurt much at all. On other processors, this can result in a crash or a very slow execution. For example, the Alpha processor would throw a processor exception which would be caught by the OS. The OS would then inspect the instruction and then do the work needed to handle the misaligned data. Then execution continues. The __unaligned keyword can be used in VC to mark unaligned access for non-x86 programs (i.e. for CE).
By default, yes. However, it can be changed via the pack() #pragma.
I don't believe the C++ Standard makes any requirement in this regard; it leaves it up to the implementation.
C and C++ don't mandate any kind of alignment. But natural alignment is strongly preferred by x86 and is required by most other CPU architectures, and compilers generally do their utmost to keep CPUs happy. So in practice you won't see a compiler generate misaligned data unless you really twist its arm.
Yes, all types are always aligned to at least their alignment requirements.
How could it be otherwise?
But note that the sizeof() of a type is not the same as its alignment.
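A quick illustration of that difference, assuming C++11 for alignof (Triple is a made-up struct):

```cpp
// Hypothetical aggregate of three ints.
struct Triple {
    int a, b, c;
};

// An array of 16 chars is 16 bytes big but only needs byte alignment.
static_assert(sizeof(char[16]) == 16, "size is 16");
static_assert(alignof(char[16]) == 1, "alignment is 1");

// Triple is at least three ints big, yet only has to satisfy int's alignment.
static_assert(sizeof(Triple) >= 3 * sizeof(int), "holds three ints");
static_assert(alignof(Triple) >= alignof(int),   "aligned for int");
```

On typical platforms with a 4-byte int, sizeof(Triple) is 12 while alignof(Triple) is 4.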
You can use the following macro to determine the alignment requirements of a type:
#define ALIGNMENT_OF( t ) offsetof( struct { char x; t test; }, test )
Depends on the compiler, the pragmas and the optimisation level. With modern compilers you can also choose time or space optimisation, which could change the alignment of types as well.
Generally it will be, because reading/writing to it is faster that way. But almost every compiler has a switch to turn this off. In gcc it's -malign-???. Aggregates are generally aligned and sized based on the alignment requirements of each element within.