C++ pointer's suitable alignment - c++

[basic.stc.dynamic.allocation]/2 about allocation functions:
The pointer returned shall be suitably aligned so that it can be
converted to a pointer of any complete object type with a fundamental
alignment requirement (3.11) and then used to access the object or
array in the storage allocated (until the storage is explicitly
deallocated by a call to a corresponding deallocation function).
It is a bit unclear. I thought that a pointer to any type (including void*) has an alignment equal to 8. What is the point of "The pointer returned shall be suitably aligned so..."? Could you give an example of a pointer that is not suitably aligned?

Many systems require that dereferenced pointers be aligned to a multiple of the size of the type. For instance, pointers to short would be on multiples of 2 bytes, char pointers are unrestricted, etc. Not all systems have this requirement, but on those that do not, accesses to unaligned memory are frequently very slow, so programmers typically try to keep everything aligned anyway.
You can find the alignment requirement for a type with alignof, if you want to poke around on your system. A pointer that isn't suitably aligned for every type might be something like 0xFFFF0002, which wouldn't be aligned for any type that requires 4-byte or stricter alignment.
In short, what that wording is saying is that the memory returned will be aligned for any type with a fundamental alignment requirement.

Related

Do the standard library versions of alignment-unaware array-form allocation functions meet the requirements on alignment?

The relevant paragraph is [basic.stc.dynamic.allocation]/3 (emphasis mine):
(3) For an allocation function other than a reserved placement allocation function, the pointer returned on a successful call shall represent the address of storage that is aligned as follows:
(3.1) -- If the allocation function takes an argument of type std::align_val_t, the storage will have the alignment specified by the value of this argument.
(3.2) -- Otherwise, if the allocation function is named operator new[], the storage is aligned for any object that does not have new-extended alignment and is no larger than the requested size.
(3.3) -- Otherwise, the storage is aligned for any object that does not have new-extended alignment and is of the requested size.
My understanding is as follows:
Both the single-object and the array forms of alignment-unaware allocation functions cap the guaranteed alignment to __STDCPP_DEFAULT_NEW_ALIGNMENT__.
With that constraint, and assuming __STDCPP_DEFAULT_NEW_ALIGNMENT__ == 8u:
The single-object form aligns for any object of the requested size. Thus, a request of 4 bytes would only guarantee 4-byte-aligned storage, as an 8-byte-aligned object would be at least 8 bytes in size. A 3-byte request would only guarantee 1-byte alignment, as an object with any stricter alignment could not be 3 bytes in size. (An object's size is a (non-zero) multiple of its alignment requirement (sizeof(x) % alignof(decltype(x)) == 0).)
The array form aligns for any object no larger than the requested size. Thus, a request of 4 bytes would only guarantee 4-byte-aligned storage (as above), but a 3-byte request would guarantee 2-byte alignment, as a 2-byte-aligned object could be only 2 bytes in size.
The array form must therefore provide stronger guarantees; it must satisfy alignment requirements for a superset of objects for which the single-object form must satisfy such requirements. In other words, the post-conditions of the former subsume (and strengthen) those of the latter. Yet, the default behavior of the standard library version of the array form is to simply forward to the corresponding single-object form and return its result. Would that not mean that ::operator new[](3), being equivalent (by default) to ::operator new(3), yields a pointer to storage only guaranteed to have 1-byte alignment, failing the above requirements?
You seem to have proven that single element operator new provided by the standard library must allocate memory more aligned than the requirement for single element allocators in general.
That seems plausible.
Those restrictions are the minimum requirements for any allocation function in C++, including ones a programmer writes.
It turns out, if your logic is right, that the base operator new has to return 2 byte aligned memory when asked for 3 bytes.
But when you write your own you have to only follow those rules.

How to tell the maximum data alignment requirement in C++

Referencing this question and answer, "Memory alignment : how to use alignof / alignas?": "Alignment is a restriction on which memory positions a value's first byte can be stored." Is there a portable way in C++ to find out the highest granularity of alignment that is required for instructions not to fault (for example on ARM)? Is alignof(intmax_t) sufficient, since it's the largest integer primitive type?
Further, how does something in malloc align data to a struct's boundaries? For example if I have a struct like so
struct alignas(16) SomethingElse { ... };
the SomethingElse struct is required to be aligned to a boundary that is a multiple of 16 bytes. Now if I request memory for a struct from malloc like so
SomethingElse* ptr = static_cast<SomethingElse*>(malloc(sizeof(SomethingElse)));
Then what happens when malloc returns a pointer that points to an address like 40? Is that invalid since the pointer to SomethingElse objects must be a multiple of 16?
How to tell the maximum data alignment requirement in C++
There is std::max_align_t which
is a POD type whose alignment requirement is at least as strict (as large) as that of every scalar type.
Vector instructions may use array operands that require a higher alignment than any scalar. I don't know if there is a portable way to find out the upper bound for that.
To find the alignment requirement for a particular type, you can use alignof.
Further, how does something in malloc align data to a struct's boundaries?
malloc aligns to some boundary that is enough for any scalar. In practice exactly the same as std::max_align_t.
It won't align correctly for "overaligned" types. By overaligned, I mean a type that has higher alignment requirement than std::max_align_t.
Then what happens when malloc returns a pointer that points to an address like 40?
Dereferencing such a SomethingElse* would have undefined behaviour.
Is that invalid since the pointer to SomethingElse objects must be a multiple of 16?
Yes.
how even would one go about fixing that? There is no way to tell malloc the alignment that you want right?
To allocate memory for overaligned types, you need std::aligned_alloc which is to be introduced in C++17.
Until then, you can use platform specific functions like posix_memalign or you can over allocate with std::malloc (allocate the alignment - 1 worth of extra bytes), then use std::align to find the correct boundary. But then you must keep track of both the start of the buffer, and the start of the memory block separately.
You might be looking for std::max_align_t

How does compiler know how to increment different pointers?

I understand that, in general, pointers to any data type will have the same size: on a 16-bit system normally 2 bytes, and on a 32-bit system 4 bytes.
Depending on what the pointer points to, incrementing it advances it by a different number of bytes: one amount for a char pointer, another for a long pointer, etc.
My query is: how does the compiler know by how many bytes to increment the pointer? Isn't it just a variable stored in memory like any other? Are pointers stored in some symbol table with information about how much they should be incremented by? Thanks
That is why there are data types. Each pointer variable will have an associated data type and that data type has a defined size (See about complete/incomplete type in footnote). Pointer arithmetic will take place based on the data type.
To add to that, for pointer arithmetic to happen, the pointer(s) should be (quoted from c11 standard)
pointer to a complete object type
So, the size of the "object" the pointer points to is known and defined.
Footnote: FWIW, that is why, pointer arithmetic on void pointers (incomplete type) is not allowed / defined in the standard. (Though GCC supports void pointer arithmetic via an extension.)
Re
” I understand that in general , pointers to any data type will have the same size
No. Different pointer sizes are unusual for ¹simple pointers to objects, but can occur on word-addressed machines. Then char* is the largest pointer, and void* is the same size.
C++14 §3.9.2/4
” An object of type cv void* shall have the same
representation and alignment requirements as cv char*.
All pointers to class type object are however the same size. For example, you wouldn't be able to use an array of pointers to base type, if this wasn't the case.
Re
” how does the compiler know by how many bytes to increment this pointer
It knows the size of the type of object pointed to.
If it does not know the size of the object pointed to, i.e. that type is incomplete, then you can't increment the pointer.
For example, if p is a void*, then you can't do ++p.
Notes:
¹ In addition to ordinary pointers to objects, there are function pointers, and pointers to members. The latter kind are more like offsets, that must be combined with some specification of a relevant object, to yield a reference.
The data type of the pointer variable defines how many bytes to be incremented.
for example:
1) On incrementing a character pointer, the pointer is incremented by 1 byte.
2) Likewise, an integer pointer is incremented by sizeof(int) bytes, which is 4 on most common 32-bit and 64-bit platforms.

Alignment of char arrays

How is STL vector usually implemented? It has a raw storage of char[] which it occasionally resizes by a certain factor and then calls placement new when an element is pushed_back (a very interesting grammatical form I should note - linguists should study such verb forms as pushed_back :)
And then there are the alignment requirements. So a natural question arises how can I call a placement new on a char[] and be sure the alignment requirements are satisfied. So I searched the C++ standard of 2003 for the word "alignment" and found these:
Paragraph 3.9 Clause 5
Object types have alignment requirements (3.9.1, 3.9.2). The alignment of a complete object type is an implementation-defined integer value representing a number of bytes; an object is allocated at an address that meets the alignment requirements of its object type.
Paragraph 5.3.4 Clause 10:
A new-expression passes the amount of space requested to the allocation function as the first argument of type std::size_t. That argument shall be no less than the size of the object being created; it may be greater than the size of the object being created only if the object is an array. For arrays of char and unsigned char, the difference between the result of the new-expression and the address returned by the allocation function shall be an integral multiple of the most stringent alignment requirement (3.9) of any object type whose size is no greater than the size of the array being created. [Note: Because allocation functions are assumed to return pointers to storage that is appropriately aligned for objects of any type, this constraint on array allocation overhead permits the common idiom of allocating character arrays into which objects of other types will later be placed. ]
These two give a perfectly satisfactory answer for my above question, but...
Statement1:
An alignment requirement for an object of type X where sizeof(X) == n is at least the requirement that address of X be divisible by n or something like that (put all the architecture-dependent things into the "or something like that").
Question1: Please confirm, refine, or deny the above statement1.
Statement2: If statement1 is correct then from the second quote in the standard it follows that an array of 5000000 chars is allocated at an address divisible by 5000000 which is completely unnecessary if I just need the array of char as such, not as a raw storage for possible placement of other objects.
Question2: So, are the chances of successfully allocating 1000 chars really lower than those of allocating 500 shorts (provided short is 2 bytes)? Is it practically a problem?
When you dynamically allocate memory using operator new, you have the guarantee that:
The pointer returned shall be suitably aligned so that it can be converted to a pointer of any complete object type and then used to access the object or array in the storage allocated (until the storage is explicitly deallocated by a call to a corresponding deallocation function) (C++03 3.7.3.1/2).
vector does not create an array of char; it uses an allocator. The default allocator uses ::operator new to allocate memory.
An alignment requirement for an object of type X where sizeof(X) == n is at least the requirement that address of X be divisible by n or something like that
No. The alignment requirement of a type is always a factor of its size, but need not be equal to its size. It is usually equal to the greatest of the alignment requirements of all the members of a class.
An array of 5M char, on its own account, need only have an alignment requirement of 1, the same as the alignment requirement of a single char.
So, the text you quote about the alignment of memory allocated via global operator new, (and malloc has a similar although IIRC not identical requirement) in effect means that a large allocation must obey the most stringent alignment requirement of any type in the system. Further to that, implementations often exclude large SIMD types from this, and require that memory for SIMD be specially allocated. This is slightly dubious, but I think they justify it on the basis that non-standard, extension types can impose arbitrary special requirements.
So in practice the number which you think is 5000000 is often 4 :-)
Q1: Alignment isn't related to size.
Q2: Theoretically yes, but you will hardly find an architecture that has a type with such huge alignment. SSE requires 16 bytes alignment (the biggest I have seen).

What is the byte alignment of the elements in a std::vector<char>?

I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).
Does anyone know how standard library containers get aligned?
The elements of the container have at least the alignment required for them in that implementation: if int is 4-aligned in your implementation, then each element of a vector<int> is an int and therefore is 4-aligned. I say "if" because there's a difference between size and alignment requirements - just because int has size 4 doesn't necessarily mean that it must be 4-aligned, as far as the standard is concerned. It's very common, though, since int is usually the word size of the machine, and most machines have advantages for memory access on word boundaries. So it makes sense to align int even if it's not strictly necessary. On x86, for example, you can do unaligned word-sized memory access, but it's slower than aligned. On ARM unaligned word operations are not allowed, and typically crash.
vector guarantees contiguous storage, so there won't be any "padding" in between the first and second element of a vector<char>, if that's what you're concerned about. The specific requirement for std::vector is that for 0 < n < vec.size(), &vec[n] == &vec[0] + n.
[Edit: this bit is now irrelevant, the questioner has disambiguated: The container itself will usually have whatever alignment is required for a pointer, regardless of what the value_type is. That's because the vector itself would not normally incorporate any elements, but will have a pointer to some dynamically-allocated memory with the elements in that. This isn't explicitly required, but it's a predictable implementation detail.]
Every object in C++ is 1-aligned, the only things that aren't are bitfields, and the elements of the borderline-crazy special case that is vector<bool>. So you can rest assured that your hope for std::vector<char> is well-founded. Both the vector and its first element will probably also be 4-aligned ;-)
As for how they get aligned - the same way anything in C++ gets aligned. When memory is allocated from the heap, it is required to be aligned sufficiently for any object that can fit into the allocation. When objects are placed on the stack, the compiler is responsible for designing the stack layout. The calling convention will specify the alignment of the stack pointer on function entry, then the compiler knows the size and alignment requirement of each object it lays down, so it knows whether the stack needs any padding to bring the next object to the correct alignment.
I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).
To put it simply, std::vector is a wrapper for a C array. The elements of the vector are aligned as if they were in the array: elements are guaranteed to occupy a contiguous memory block without any added gaps/etc, so that std::vector<N> v can be accessed as a C array using &v[0]. (This is why vector sometimes has to reallocate storage when elements are added to it.)
Does anyone know how standard library containers get aligned?
Alignment of elements is platform specific, but generally a simple variable is aligned so that its address is divisible by its size (natural alignment). Structures/etc are padded (empty filler space at the end) to a multiple of the alignment of the largest data type they contain, to ensure that if the structure is put into an array, all fields retain their natural alignment.
In other containers (like std::list or std::map), the user's data is made, via template mechanics, a part of an internal structure, and that structure is allocated by operator new. The new is guaranteed (custom implementations must obey the rule too; it is inherited from malloc()) to return a memory block which is aligned for the largest available primitive data type (*). That is to ensure that regardless of what structure or variable is placed in the memory block, it will be accessed in an aligned fashion. Unlike std::vector, obviously, the elements of most other STL containers are not guaranteed to be within the same contiguous memory block: they are newed one by one, not with new[].
(*) As per the C++ standard, "The allocation function (basic.stc.dynamic.allocation) called by a new-expression (expr.new) to allocate size bytes of storage suitably aligned to represent any object of that size." That is a softer requirement than the one malloc() generally abides by, as per POSIX: "The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object [...]". The C++ requirement in a way reinforces the natural alignment requirement: a dynamically allocated char would be aligned as char requires, but not more.
Do you mean the vector members, or the vector structure itself? Members are guaranteed to be contiguous in memory but structure alignment is platform/compiler-dependent. On Windows this can be set at compile-time and also overridden using #pragma pack().
The answer for other containers is likely not the same as for vector, so I would ask specific questions about the ones you care about.
The alignment of the whole container is implementation dependent. It is usually at least sizeof(void*), that is 4 or 8 bytes depending on the platform, but may be larger.
If special (guaranteed) alignment is needed use plain arrays or write/adapt some generic array class with:
// allocation
char* pointer = static_cast<char*>(_mm_malloc(size, alignment));
// deallocation
_mm_free(pointer);