Data alignment in C++, standard and portability - c++

I want to construct an object of class T by using ::operator new(size_t) and placement new.
To "extend" the size of char v[1], which is the last declared data member in T, I allocate sizeof(T) + n - 1 bytes with operator new(), where n is the desired size in bytes. This trick allows me to access v[i] for any i in [0, n - 1].
My questions are about the C++ standard:
Does the order of declaration of data members in T reflect the order in which data is represented in memory?
If the order is preserved, are data member alignments also preserved no matter how much larger the allocated memory is?

1) Yes. From the section on pointer comparisons, the standard states that pointers to later members must compare as greater than pointers to earlier members. Edit: As pointed out by Martin, the standard only mandates this for POD structs.
2) Yes. Data alignment is not affected by the size of the allocation.
The thing is, nothing in the standard actually guarantees that the trick of using arrays this way works (IIRC).
struct something {
...
char buf[1];
};
However, this is done so commonly that it is a de-facto standard. The standards folks, last time I checked, were working on a way that they could codify these existing practices (It's already made its way into the C standard and it's only a matter of time before it's standardized in C++).
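A minimal sketch of the trick described in the question, assuming a hypothetical struct Blob in place of T (the names Blob, make_blob, blob_bytes and destroy_blob are made up for illustration). As noted above, the standard does not bless indexing past the declared bound of v, so this sketch reaches the tail bytes through a char pointer computed from the start of the allocation instead:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <new>

// Hypothetical stand-in for the question's class T.
struct Blob {
    std::size_t size;  // number of usable bytes in the tail
    char v[1];         // last member; the allocation extends past it
};

// Pointer to the first tail byte, derived from the start of the storage.
char* blob_bytes(Blob* b) {
    return reinterpret_cast<char*>(b) + offsetof(Blob, v);
}

// Allocate sizeof(Blob) + n - 1 bytes and construct a Blob at the front.
// Assumes n >= 1.
Blob* make_blob(std::size_t n) {
    void* raw = ::operator new(sizeof(Blob) + n - 1);
    Blob* b = new (raw) Blob;  // placement new into the raw storage
    b->size = n;
    std::memset(blob_bytes(b), 0, n);  // stays inside the allocation
    return b;
}

// Destroy the object, then release the raw storage it lived in.
void destroy_blob(Blob* b) {
    b->~Blob();
    ::operator delete(b);
}

// True if the address satisfies Blob's alignment requirement.
bool blob_is_aligned(const Blob* b) {
    return reinterpret_cast<std::uintptr_t>(b) % alignof(Blob) == 0;
}
```

The size of the request does not change where the members sit: the offset of v and the alignment of the block are the same for every n.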

1a: Yes, just like C, the order of data members defines the order in memory.
1b: Not unless the class is a POD (plain-old-data) class. To get that, it must not have constructors or virtual functions. The C++ standard has a list of rules that define what qualifies as POD.

You can influence the alignment in memory with the pragma pack declarations, just Google for it.
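As a rough illustration, assuming a compiler that supports the common (non-standard) #pragma pack(push, n) / #pragma pack(pop) form, as GCC, Clang and MSVC do. The struct names are hypothetical:

```cpp
#include <cassert>
#include <cstddef>

// Default layout: the compiler may insert padding after 'tag'
// so that 'value' is suitably aligned.
struct NaturalPair {
    char tag;
    int value;
};

// Packed layout: members are placed back to back.
// This is a compiler extension, not standard C++.
#pragma pack(push, 1)
struct PackedPair {
    char tag;
    int value;
};
#pragma pack(pop)
```

Packed members may be misaligned for their type, so access can be slower or, on some architectures, may need special handling by the compiler.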

Does the order of declaration of data members in T reflect the order in which data is represented in memory?
In limited situations yes.
See Section 9 Classes [class] paragraph 7 (Look for details about A standard-layout class)
But in general no. There are no guarantees about the order of members in different protected/private/public regions.
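A small sketch of how this can be checked at compile time in C++11, using hypothetical types Header and Mixed. std::is_standard_layout reports whether the ordering guarantees apply, and offsetof exposes the resulting order:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// All members share the same access control: standard-layout.
struct Header {
    int id;
    short flags;
    char tag;
};

static_assert(std::is_standard_layout<Header>::value,
              "Header is expected to be standard-layout");
static_assert(offsetof(Header, id) == 0,
              "the first member sits at the start of the object");
static_assert(offsetof(Header, id) < offsetof(Header, flags),
              "later members have higher addresses");
static_assert(offsetof(Header, flags) < offsetof(Header, tag),
              "later members have higher addresses");

// Members with different access control: not standard-layout,
// so the relative order of 'a' and 'b' is not guaranteed.
class Mixed {
public:
    int a;
private:
    int b;
};

static_assert(!std::is_standard_layout<Mixed>::value,
              "Mixed is not standard-layout");
```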
If the order is preserved, are data member alignments also preserved no matter how much larger the allocated memory is?
What do you mean by "preserved"? The compiler decides the alignments for you. Once they are defined for a class, they are constant throughout the code.
I want to construct an object of class T by using ::operator new(size_t) and placement new.
This is guaranteed. As long as you use new to allocate a block of memory at least the same size as T, it is guaranteed to be aligned correctly for objects of type T.
3.11 Alignment [basic.align]
Paragraph 5:
Alignments have an order from weaker to stronger or stricter alignments. Stricter alignments have larger alignment values. An address that satisfies an alignment requirement also satisfies any weaker valid alignment requirement.
Thus if you have an object that is aligned to a stricter requirement, it is guaranteed to be aligned for weaker alignments. Thus space aligned for something that is larger than T is also aligned for objects of size T.
5.3.4 New [expr.new]
Paragraph 10
A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the strictest fundamental alignment requirement (3.11) of any object type whose size is no greater than the size of the array being created. [ Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type with fundamental alignment, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. — end note ]
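A quick way to observe this, sketched with a hypothetical type Sample and helper names of my own choosing: allocate more than sizeof(Sample) bytes with ::operator new and check the returned address against alignof(Sample).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// True if p satisfies the alignment requirement of T.
template <class T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Hypothetical type with fundamental alignment.
struct Sample {
    double d;
    int i;
    char c;
};

// Allocate sizeof(Sample) + extra bytes and report whether the block
// is suitably aligned for a Sample placed at its start.
bool oversized_block_is_aligned(std::size_t extra) {
    void* raw = ::operator new(sizeof(Sample) + extra);
    bool ok = is_aligned_for<Sample>(raw);
    ::operator delete(raw);
    return ok;
}
```

This only covers types with fundamental alignment; over-aligned types (for example, those declared with a large alignas) are outside the guarantee quoted above.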

Related

Do the standard library versions of alignment-unaware array-form allocation functions meet the requirements on alignment?

The relevant paragraph is [basic.stc.dynamic.allocation]/3 (emphasis mine):
(3) For an allocation function other than a reserved placement allocation function, the pointer returned on a successful call shall represent the address of storage that is aligned as follows:
(3.1) -- If the allocation function takes an argument of type std::align_val_t, the storage will have the alignment specified by the value of this argument.
(3.2) -- Otherwise, if the allocation function is named operator new[], the storage is aligned for any object that does not have new-extended alignment and is no larger than the requested size.
(3.3) -- Otherwise, the storage is aligned for any object that does not have new-extended alignment and is of the requested size.
My understanding is as follows:
Both the single-object and the array forms of alignment-unaware allocation functions cap the guaranteed alignment to __STDCPP_DEFAULT_NEW_ALIGNMENT__.
With that constraint, and assuming __STDCPP_DEFAULT_NEW_ALIGNMENT__ == 8u:
The single-object form aligns for any object of the requested size. Thus, a request of 4 bytes would only guarantee 4-byte-aligned storage, as an 8-byte-aligned object would be at least 8 bytes in size. A 3-byte request would only guarantee 1-byte alignment, as an object with any stricter alignment could not be 3 bytes in size. (An object's size is a (non-zero) multiple of its alignment requirement (sizeof(x) % alignof(decltype(x)) == 0).)
The array form aligns for any object no larger than the requested size. Thus, a request of 4 bytes would only guarantee 4-byte-aligned storage (as above), but a 3-byte request would guarantee 2-byte alignment, as a 2-byte-aligned object could be only 2 bytes in size.
The array form must therefore provide stronger guarantees; it must satisfy alignment requirements for a superset of objects for which the single-object form must satisfy such requirements. In other words, the post-conditions of the former subsume (and strengthen) those of the latter. Yet, the default behavior of the standard library version of the array form is to simply forward to the corresponding single-object form and return its result. Would that not mean that ::operator new[](3), being equivalent (by default) to ::operator new(3), yields a pointer to storage only guaranteed to have 1-byte alignment, failing the above requirements?
You seem to have proven that single element operator new provided by the standard library must allocate memory more aligned than the requirement for single element allocators in general.
That seems plausible.
Those restrictions are the minimum requirements for any allocation function in C++, including ones a programmer writes.
It turns out, if your logic is right, that the base operator new has to return 2 byte aligned memory when asked for 3 bytes.
But when you write your own you have to only follow those rules.
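The arithmetic in the question can be written out as a small model. This is only a sketch of the reasoning, not of any implementation: it assumes alignments are powers of two, that an object's size is a non-zero multiple of its alignment, and that cap plays the role of __STDCPP_DEFAULT_NEW_ALIGNMENT__. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// (3.3): aligned for any object *of* the requested size n.
// An object of size n can have alignment a only if n is a multiple of a,
// so the guarantee is the largest power of two a <= cap dividing n.
std::size_t single_object_guarantee(std::size_t n, std::size_t cap) {
    std::size_t best = 1;
    for (std::size_t a = 1; a <= cap; a *= 2) {
        if (n % a == 0) best = a;
    }
    return best;
}

// (3.2): aligned for any object *no larger than* the requested size n.
// An object with alignment a can be as small as a bytes,
// so the guarantee is the largest power of two a <= min(cap, n).
std::size_t array_form_guarantee(std::size_t n, std::size_t cap) {
    std::size_t best = 1;
    for (std::size_t a = 1; a <= cap && a <= n; a *= 2) {
        best = a;
    }
    return best;
}
```

With cap == 8, a 3-byte request gives 1 for the single-object form and 2 for the array form, which is the discrepancy the question points out; a 4-byte request gives 4 for both.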

Is a byte array allocated with new[] aligned on platform word boundary? [duplicate]

Is allocating a buffer via new char[sizeof(T)] guaranteed to allocate memory which is properly aligned for the type T, where all members of T have their natural, implementation-defined alignment (that is, you have not used the alignas keyword to modify their alignment)?
I have seen this guarantee made in a few answers around here but I'm not entirely clear how the standard arrives at this guarantee. 5.3.4-10 of the standard gives the basic requirement: essentially new char[] must be aligned to max_align_t.
What I'm missing is the bit which says alignof(T) will always be a valid alignment with a maximum value of max_align_t. I mean, it seems obvious, but must the resulting alignment of a structure be at most max_align_t? Even point 3.11-3 says extended alignments may be supported, so may the compiler decide on its own a class is an over-aligned type?
The expressions new char[N] and new unsigned char[N] are guaranteed
to return memory sufficiently aligned for any object. See §5.3.4/10
"[...] For arrays of char and unsigned char, the difference between the
result of the new-expression and the address returned by the allocation
function shall be an integral multiple of the strictest fundamental
alignment requirement (3.11) of any object type whose size is no greater
than the size of the array being created. [ Note: Because allocation
functions are assumed to return pointers to storage that is
appropriately aligned for objects of any type with fundamental
alignment, this constraint on array allocation overhead permits the
common idiom of allocating character arrays into which objects of other
types will later be placed. —end note ]".
From a stylistic point of view, of course: if what you want is to allocate raw
memory, it's clearer to say so: operator new(N). Conceptually,
new char[N] creates N char; operator new(N) allocates N bytes.
What I'm missing is the bit which says alignof(T) will always be a valid alignment with a maximum value of max_align_t. I mean, it seems obvious, but must the resulting alignment of a structure be at most max_align_t ? Even point 3.11-3 says extended alignments may be supported, so may the compiler decide on its own a class is an over-aligned type ?
As noted by Mankarse, the best quote I could get is from [basic.align]/3:
A type having an extended alignment requirement is an over-aligned type. [ Note:
every over-aligned type is or contains a class type to which extended alignment applies (possibly through a non-static data member). —end note ]
which seems to imply that extended alignment must be explicitly required (and then propagates) but cannot
I would have preferred a clearer mention; the intent is obvious for a compiler-writer, and any other behavior would be insane, still...
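If you do not want to rely on the intent, the assumption can be checked at compile time. The sketch below uses hypothetical helpers (has_fundamental_alignment, fits_in_char_array) and hypothetical types; it assumes the compiler accepts alignas(64), which is conditionally supported but available on mainstream compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>

// True if T's alignment does not exceed that of std::max_align_t,
// i.e. T is not an over-aligned type.
template <class T>
constexpr bool has_fundamental_alignment() {
    return alignof(T) <= alignof(std::max_align_t);
}

struct Plain {
    double d;
    int i;
};

// Explicitly over-aligned (assumes 64 exceeds alignof(std::max_align_t)).
struct alignas(64) Wide {
    char c;
};

// Construct a T inside storage obtained from new char[sizeof(T)].
// Rejected at compile time for over-aligned types.
template <class T>
bool fits_in_char_array() {
    static_assert(has_fundamental_alignment<T>(),
                  "new char[] is not guaranteed to be aligned for this type");
    char* buf = new char[sizeof(T)];
    bool aligned = reinterpret_cast<std::uintptr_t>(buf) % alignof(T) == 0;
    if (aligned) {
        T* obj = new (buf) T();  // placement new into the char array
        obj->~T();
    }
    delete[] buf;
    return aligned;
}
```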

Cast A primitive type pointer to A structure pointer - Alignment and Padding?

Just 20 minutes ago, when I answered a question, I came up with an interesting scenario whose behavior I'm not sure of:
Let me have an integer array of size n, pointed by intPtr;
int* intPtr;
and let me also have a struct like this:
typedef struct {
int val1;
int val2;
//and less or more integer declarations goes on like this(not any other type)
}intStruct;
My question is if I do a cast intStruct* structPtr = (intStruct*) intPtr;
Am I sure to get every element correctly if I traverse the elements of the struct? Is there any possibility of misalignment (possibly because of padding) on any architecture/compiler?
The standard is fairly specific that even a POD-struct (which is, I believe the most restrictive class of structs) can have padding between members. ("There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment." -- a non-normative note, but still makes the intent quite clear).
For example, contrast the requirements for a standard-layout struct (C++11, §1.8/4):
An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage.
...with those for an array (§8.3.4/1):
An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.
In the array, the elements themselves are required to be allocated contiguously, whereas in the struct, only the storage is required to be contiguous.
The third possibility that might make the "contiguous storage" requirement make more sense would be to consider a struct/class that is not trivially copyable or standard layout. In this case, it's possible that the storage might not be contiguous at all. For example, an implementation might set aside one area of memory for holding all the private variables, and an entirely separate area of memory to hold all the public variables. To make that a little more concrete, consider two definitions like:
class A {
int a;
public:
int b;
} a;
class B {
int x;
public:
int y;
} b;
With these definitions, the memory might be laid out something like:
a.a;
b.x;
// ... somewhere else in memory entirely:
a.b;
b.y;
In this case, neither the elements nor the storage needs to be contiguous, so interleaving parts of entirely separate structs/classes is allowable.
That said, the first element must be at the same address as the struct as a whole (9.2/17): "A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa."
In your case, you have a POD-struct, so the rule quoted above (§9.2/17) applies. Since the first member must be aligned, and the remaining members are all of the same type, it's impossible for any padding to be truly necessary between the other members (i.e., except for bit-fields, any type you can put in a struct you can also put in an array, where contiguous allocation of the elements is required). If you have elements smaller than a word on a word-oriented machine, it's possible that padding could make access somewhat simpler, though. For example, early DEC Alphas (at the hardware level) were only capable of reading/writing an entire (64-bit) word at a time. As such, let's consider something like a struct of four char elements:
struct foo {
char a, b, c, d;
};
If it was required to lay these out in memory so they were contiguous, accessing a foo::b (for example) would require that the CPU load the word, then shift it 8-bits right, then mask to zero-extend that byte to fill the entire register.
Storing would be even worse -- the CPU would have to load the current value of the whole word, mask out the current contents of the appropriate char-sized piece of that, shift the new value to the correct place, OR it into the word, and finally store the result.
By contrast, with padding between the elements, each of those becomes a simple load/store, with no shifting, masking, etc.
At least if memory serves, with DEC's normal compiler for the Alpha, int was 32 bits, and long was 64 bits (it predated long long). As such, with your struct of four ints, you could have expected to see another 32 bits of padding between the elements (and another 32 bits after the last element as well).
Given that you do have a POD-struct, you still have some possibilities though. The one I'd probably prefer would be to use offsetof to get the offsets of the members of the struct, create an array of them, and access the members via those offsets. I showed how to do this in a couple of previous answers.
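A minimal sketch of that offsetof approach, using the question's intStruct with two members; the table name member_offsets and the helper read_member are made up for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct intStruct {
    int val1;
    int val2;
};

// Table of member offsets, in declaration order.
const std::size_t member_offsets[] = {
    offsetof(intStruct, val1),
    offsetof(intStruct, val2),
};

// Read the i-th int member through its recorded offset.
// memcpy avoids any assumption about padding between members.
int read_member(const intStruct& s, std::size_t i) {
    int value;
    std::memcpy(&value,
                reinterpret_cast<const char*>(&s) + member_offsets[i],
                sizeof value);
    return value;
}
```

This lets you iterate over the members by index without assuming the struct has the same layout as an array of int.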
Strictly speaking, such pointer casts aren't allowed and lead to undefined behavior.
The main issue with the cast is however that the compiler is free to add any number of padding bytes anywhere inside a struct, except before the very first element. So whether it will work or not depends on the alignment requirements of the specific system, and also whether struct padding is enabled or not.
int is not necessarily of the same size as the optimal size for an addressable chunk of data, even though this is true for most 32-bit systems. Some 32-bitters don't care about misalignment, some will allow misalignment but produce less efficient code, and some must have the data aligned. In theory, 64-bitters may also want to add padding after an int (which will be 32 bit there) to get a 64-bit chunk, but in practice they support 32-bit instruction sets.
If you write code relying on this cast, you should add something like this:
static_assert(sizeof(intStruct) == sizeof(int) + sizeof(int),
              "intStruct must not contain padding");
It is guaranteed to be legal, given that the element type is standard-layout. Note: all references in the following are to the C++11 standard.
8.3.4 Arrays [dcl.array]
1 - [...] An object of array type contains a contiguously allocated non-empty set of N subobjects of type T. [...]
Regarding a struct with N members of type T,
9.2 Class members [class.mem]
14 - Nonstatic data members of a (non-union) class with the same access control are allocated so
that later members have higher addresses within a class object. [...] Implementation alignment requirements might
cause two adjacent members not to be allocated immediately after each other [...]
20 - A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member [...] and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. —end note ]
So the question is whether any alignment-required padding within a struct could cause its members not to be contiguously allocated with respect to each other. The answer is:
1.8 The C++ object model [intro.object]
4 - [...] An object of trivially copyable or standard-layout type shall occupy contiguous bytes of storage.
In other words, a standard-layout struct a containing at least two members x, y of the same (standard-layout) type that does not respect the identity &a.y == &a.x + 1 is in violation of 1.8:4.
Note that alignment is defined as (3.11 Alignment [basic.align]) the number of bytes between successive addresses at which a given object can be allocated; it follows that alignment of a type T can be no greater than the distance between adjacent objects in an array of T, and (since 5.3.3 Sizeof [expr.sizeof] specifies that the size of an array of n elements is n times the size of an element) alignof(T) can be no greater than sizeof(T). Thus any additional padding between adjacent elements of a struct of the same type would not be required by alignment and so would not be countenanced by 9.2:14.
With regard to AProgrammer's point, I would interpret the language in 26.4 Complex numbers [complex.numbers] as requiring that the instantiations of std::complex<T> should behave as standard-layout types with regard to the position of their members, without being required to conform to all the requirements of standard-layout types.
The behavior there is almost certainly compiler-, architecture-, and ABI-dependent. However, if you're using gcc, you can make use of __attribute__((packed)) to force the compiler to pack struct members one after the other, without any padding. With that, the memory layout should match that of a flat array.
I found nothing which guarantees it is valid when I searched some time ago, and I found an explicit guarantee for the case of std::complex<> in C++ which could have been formulated more easily if it were more generally true, so I doubt I missed something in my search (but absence of proof is hardly a proof of absence, and the standard is sometimes obscure in its formulation).
A typical alignment of C structs guarantees that the data structure members in the struct will be stored sequentially which is the same as a C array. So order cannot be a problem.
As for alignment, since you have only one data type (int), there is no scenario in which padding would be necessary to align your data members, even though the compiler is allowed to add it. The compiler can add padding before the beginning of the struct, but it cannot add padding at the beginning of the data structure. So if the compiler were to add padding in your situation,
Instead of this:
[4Byte int][4Byte int][4Byte int]...[4Byte int]
Your data structure would have to be stored like this:
[4Byte Data][4Byte Padding][4Byte Data]... which is unreasonable.
Overall, I think this cast should work with no problems in your situation, though I think it is bad practice to use it.
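If the goal is only to view a run of ints as an intStruct, a copy avoids the cast entirely. The sketch below assumes the two-member version of intStruct and a hypothetical helper from_ints; the static_assert stops compilation if the compiler did add padding:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct intStruct {
    int val1;
    int val2;
};

// Refuse to compile if the struct is not laid out like int[2].
static_assert(sizeof(intStruct) == 2 * sizeof(int),
              "intStruct contains padding; it does not match int[2]");

// Build an intStruct from the first two ints at p by copying bytes,
// instead of reinterpreting the pointer.
intStruct from_ints(const int* p) {
    intStruct s;
    std::memcpy(&s, p, sizeof s);
    return s;
}
```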

Alignment of char arrays

How is STL vector usually implemented? It has a raw storage of char[] which it occasionally resizes by a certain factor and then calls placement new when an element is pushed_back (a very interesting grammatical form I should note - linguists should study such verb forms as pushed_back :)
And then there are the alignment requirements. So a natural question arises how can I call a placement new on a char[] and be sure the alignment requirements are satisfied. So I searched the C++ standard of 2003 for the word "alignment" and found these:
Paragraph 3.9 Clause 5
Object types have alignment requirements (3.9.1, 3.9.2). The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.
Paragraph 5.3.4 Clause 10:
A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the most stringent alignment requirement (3.9) of any object type whose size is no greater than the size of the array being created. [Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. ]
These two give a perfectly satisfactory answer for my above question, but...
Statement1:
An alignment requirement for an object of type X where sizeof(X) == n is at least the requirement that address of X be divisible by n or something like that (put all the architecture-dependent things into the "or something like that").
Question1: Please confirm, refine, or deny the above statement1.
Statement2: If statement1 is correct then from the second quote in the standard it follows that an array of 5000000 chars is allocated at an address divisible by 5000000 which is completely unnecessary if I just need the array of char as such, not as a raw storage for possible placement of other objects.
Question2: So, are the chances of successfully allocating 1000 chars really lower than those of allocating 500 shorts (provided short is 2 bytes)? Is it a problem in practice?
When you dynamically allocate memory using operator new, you have the guarantee that:
The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type and then used to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function) (C++03 3.7.3.1/2).
vector does not create an array of char; it uses an allocator. The default allocator uses ::operator new to allocate memory.
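To make this concrete, here is a sketch of a minimal C++11 allocator that does what the default allocator is described as doing, namely forwarding to ::operator new. RawAllocator and vector_storage_is_aligned are hypothetical names, not part of any library:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Minimal allocator: raw storage comes straight from ::operator new.
template <class T>
struct RawAllocator {
    using value_type = T;

    RawAllocator() = default;
    template <class U>
    RawAllocator(const RawAllocator<U>&) {}

    T* allocate(std::size_t n) {
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const RawAllocator<T>&, const RawAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const RawAllocator<T>&, const RawAllocator<U>&) { return false; }

// Fill a vector<double> using the allocator above and check that the
// element storage satisfies alignof(double).
bool vector_storage_is_aligned() {
    std::vector<double, RawAllocator<double>> v;
    for (int i = 0; i < 100; ++i) v.push_back(i * 0.5);
    return reinterpret_cast<std::uintptr_t>(v.data()) % alignof(double) == 0;
}
```

The vector constructs its elements in that storage itself; no char array is involved at any point.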
An alignment requirement for an object
of type X where sizeof(X) == n is at
least the requirement that address of
X be divisible by n or something like
that
No. The alignment requirement of a type is always a factor of its size, but need not be equal to its size. It is usually equal to the greatest of the alignment requirements of all the members of a class.
An array of 5M char, on its own account, need only have an alignment requirement of 1, the same as the alignment requirement of a single char.
So, the text you quote about the alignment of memory allocated via global operator new, (and malloc has a similar although IIRC not identical requirement) in effect means that a large allocation must obey the most stringent alignment requirement of any type in the system. Further to that, implementations often exclude large SIMD types from this, and require that memory for SIMD be specially allocated. This is slightly dubious, but I think they justify it on the basis that non-standard, extension types can impose arbitrary special requirements.
So in practice the number which you think is 5000000 is often 4 :-)
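These relationships can be checked directly with alignof and sizeof. A short sketch with a hypothetical struct Record and helper size_is_multiple_of_alignment:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct mixing a small and a large member.
struct Record {
    char c;
    double d;
};

// The size of a type is always a multiple of its alignment.
template <class T>
constexpr bool size_is_multiple_of_alignment() {
    return sizeof(T) % alignof(T) == 0;
}

using BigCharArray = char[5000000];

// A huge char array has the same alignment requirement as a single char.
static_assert(alignof(BigCharArray) == 1, "char arrays are 1-aligned");
static_assert(size_is_multiple_of_alignment<BigCharArray>(), "");
static_assert(size_is_multiple_of_alignment<Record>(), "");
// Alignment is at least that of the strictest member, yet smaller than the size.
static_assert(alignof(Record) >= alignof(double), "");
static_assert(alignof(Record) < sizeof(Record), "");
```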
Q1: Alignment isn't related to size.
Q2: Theoretically yes, but you will hardly find an architecture that has a type with such huge alignment. SSE requires 16 bytes alignment (the biggest I have seen).

What is the byte alignment of the elements in a std::vector<char>?

I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).
Does anyone know how standard library containers get aligned?
The elements of the container have at least the alignment required for them in that implementation: if int is 4-aligned in your implementation, then each element of a vector<int> is an int and therefore is 4-aligned. I say "if" because there's a difference between size and alignment requirements - just because int has size 4 doesn't necessarily mean that it must be 4-aligned, as far as the standard is concerned. It's very common, though, since int is usually the word size of the machine, and most machines have advantages for memory access on word boundaries. So it makes sense to align int even if it's not strictly necessary. On x86, for example, you can do unaligned word-sized memory access, but it's slower than aligned. On ARM unaligned word operations are not allowed, and typically crash.
vector guarantees contiguous storage, so there won't be any "padding" in between the first and second element of a vector<char>, if that's what you're concerned about. The specific requirement for std::vector is that for 0 <= n < vec.size(), &vec[n] == &vec[0] + n.
[Edit: this bit is now irrelevant, the questioner has disambiguated: The container itself will usually have whatever alignment is required for a pointer, regardless of what the value_type is. That's because the vector itself would not normally incorporate any elements, but will have a pointer to some dynamically-allocated memory with the elements in that. This isn't explicitly required, but it's a predictable implementation detail.]
Every object in C++ is 1-aligned, the only things that aren't are bitfields, and the elements of the borderline-crazy special case that is vector<bool>. So you can rest assured that your hope for std::vector<char> is well-founded. Both the vector and its first element will probably also be 4-aligned ;-)
As for how they get aligned - the same way anything in C++ gets aligned. When memory is allocated from the heap, it is required to be aligned sufficiently for any object that can fit into the allocation. When objects are placed on the stack, the compiler is responsible for designing the stack layout. The calling convention will specify the alignment of the stack pointer on function entry, then the compiler knows the size and alignment requirement of each object it lays down, so it knows whether the stack needs any padding to bring the next object to the correct alignment.
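Both properties are easy to check at run time. A small sketch with hypothetical helper names (it does not apply to vector<bool>, whose elements are not addressable objects):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Checks the contiguity requirement: &v[n] == &v[0] + n for every valid n.
template <class T>
bool is_contiguous(const std::vector<T>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) return false;
    }
    return true;
}

// Checks that every element address satisfies alignof(T).
template <class T>
bool elements_aligned(const std::vector<T>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (reinterpret_cast<std::uintptr_t>(&v[n]) % alignof(T) != 0) {
            return false;
        }
    }
    return true;
}
```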
I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).
To put it simply, std::vector is a wrapper for a C array. The elements of the vector are aligned as if they were in the array: elements are guaranteed to occupy a contiguous memory block without any added gaps, so that std::vector<N> v can be accessed as a C array using &v[0]. (This is why vector sometimes has to reallocate storage when elements are added to it.)
Does anyone know how standard library containers get aligned?
Alignment of elements is platform specific but generally a simple variable is aligned so that its address is divisible by its size (natural alignment). Structures/etc are padded (empty filler space at the end) on the largest data type they contain to ensure that if the structure is put into an array, all fields would retain their natural alignment.
For other containers (like std::list or std::map), the user data is, via template mechanics, made a part of an internal structure, and that structure is allocated by operator new. new is guaranteed (custom implementations must obey the rule too; it is inherited from malloc()) to return a memory block which is aligned for the largest available primitive data type (*). That is to ensure that regardless of what structure or variable is placed in the memory block, it will be accessed in an aligned fashion. Unlike std::vector, obviously, the elements of most other STL containers are not guaranteed to be within the same contiguous memory block: they are newed one by one, not with new[].
(*) As per the C++ standard, "The allocation function (basic.stc.dynamic.allocation) called by a new-expression (expr.new) to allocate size bytes of storage suitably aligned to represent any object of that size." That is a softer requirement than the one malloc() generally abides by, as per POSIX: "The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object [...]". The C++ requirement in a way reinforces the natural alignment requirement: a dynamically allocated char would be aligned as char requires, but not more.
Do you mean the vector members, or the vector structure itself? Members are guaranteed to be contiguous in memory but structure alignment is platform/compiler-dependent. On Windows this can be set at compile-time and also overridden using #pragma pack().
The answer for other containers is likely not the same as for vector, so I would ask specific questions about the ones you care about.
The alignment of the whole container is implementation dependent. It is usually at least sizeof(void*), that is 4 or 8 bytes depending on the platform, but may be larger.
If special (guaranteed) alignment is needed use plain arrays or write/adapt some generic array class with:
// allocation
char* pointer = static_cast<char*>(_mm_malloc(size, alignment));
// deallocation
_mm_free(pointer);
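_mm_malloc is compiler/platform specific. As a portable alternative, one possible sketch in standard C++11 over-allocates with ::operator new and uses std::align to locate a suitably aligned address inside the block. AlignedBlock, allocate_aligned, free_aligned and aligned_roundtrip are hypothetical names, and alignment is assumed to be a power of two:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <new>

struct AlignedBlock {
    void* raw;      // pointer to hand back to ::operator delete
    void* aligned;  // first address in the block with the requested alignment
};

// Allocate at least 'size' usable bytes at an address that is a multiple
// of 'alignment' (which must be a power of two).
AlignedBlock allocate_aligned(std::size_t size, std::size_t alignment) {
    std::size_t space = size + alignment;  // slack so an aligned address fits
    void* raw = ::operator new(space);
    void* cursor = raw;
    // std::align advances 'cursor' to the next aligned address, or
    // returns nullptr if 'size' bytes no longer fit in 'space'.
    void* aligned = std::align(alignment, size, cursor, space);
    return AlignedBlock{raw, aligned};
}

void free_aligned(const AlignedBlock& block) {
    ::operator delete(block.raw);
}

// Allocate, verify the alignment, and release again.
bool aligned_roundtrip(std::size_t size, std::size_t alignment) {
    AlignedBlock block = allocate_aligned(size, alignment);
    bool ok = block.aligned != nullptr &&
              reinterpret_cast<std::uintptr_t>(block.aligned) % alignment == 0;
    free_aligned(block);
    return ok;
}
```

The caller has to keep the raw pointer around for deallocation, which is the price of not relying on a platform-specific aligned allocator.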