Do I understand C/C++ strict-aliasing correctly?

I've read this article about C/C++ strict aliasing. I think the same applies to C++.
As I understand, strict aliasing is used to rearrange the code for performance optimization. That's why two pointers of different (and unrelated in C++ case) types cannot refer to the same memory location.
Does this mean that problems can occur only if memory is modified (apart from possible problems with memory alignment)?
For example, when handling a network protocol or de-serializing data: I have a dynamically allocated byte array, and the packet struct is properly aligned within it. Can I reinterpret_cast it to my packet struct?
char const* buf = ...; // dynamically allocated
unsigned int i = *reinterpret_cast<unsigned int const*>(buf + shift); // [shift] satisfies alignment requirements; the cast keeps const, since reinterpret_cast cannot remove it

The problem here is not strict aliasing so much as structure representation requirements.
First, it is safe to alias between char, signed char, or unsigned char and any one other type (in your case, unsigned int). This allows you to write your own memory-copy loops, as long as they're defined using a char type. This is authorized by the following language in C99 (§6.5):
6. The effective type of an object for an access to its stored value is the declared type of the object, if any. [Footnote: Allocated objects have no declared type] [...] If a value is copied into an object having no declared type using
memcpy or memmove, or is copied as an array of character type, then the effective type
of the modified object for that access and for subsequent accesses that do not modify the
value is the effective type of the object from which the value is copied, if it has one. For
all other accesses to an object having no declared type, the effective type of the object is
simply the type of the lvalue used for the access.
7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [Footnote: The intent of this list is to specify those circumstances in which an object may or may not be aliased.]
a type compatible with the effective type of the object,
[...]
a character type.
Similar language can be found in the C++0x draft N3242 §3.10/10, although it is not as clear when the 'dynamic type' of an object is assigned (I'd appreciate any further references on what the dynamic type is of a char array, to which a POD object has been copied as a char array with proper alignment).
As such, aliasing is not a problem here. However, a strict reading of the standard indicates that a C++ implementation has a great deal of freedom in choosing a representation of an unsigned int.
As one random example, an unsigned int might be a 24-bit integer, represented in four bytes, with 8 padding bits interspersed; if any of these padding bits does not match a certain (constant) pattern, the value is treated as a trap representation, and dereferencing the pointer will result in a crash. Is this a likely implementation? Perhaps not. But there have been, historically, systems with parity bits and other oddness, and so directly reading from the network into an unsigned int, by a strict reading of the standard, is not kosher.
Now, the problem of padding bits is mostly a theoretical issue on most systems today, but it's worth noting. If you plan to stick to PC hardware, you don't really need to worry about it (but don't forget your ntohls - endianness is still a problem!)
Structures make it even worse, of course - alignment representations depend on your platform. I have worked on an embedded platform in which all types have an alignment of 1 - no padding is ever inserted into structures. This can result in inconsistencies when using the same structure definitions on multiple platforms. You can either manually work out the byte offsets for data structure members and reference them directly, or use a compiler-specific alignment directive to control padding.
So you must be careful when directly casting from a network buffer to native types or structures. But the aliasing itself is not a problem in this case.
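A minimal sketch of the safer alternative to casting, assuming 8-bit bytes and that std::uint32_t exists on the platform. The helper names load_u32_native and load_u32_be are hypothetical, not from the question. The first copies the bytes with std::memcpy, which sidesteps both alignment and aliasing concerns; the second decodes a big-endian (network order) value byte by byte, so it does not depend on host endianness.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 32-bit value in host byte order.
// std::memcpy has no alignment requirement on its arguments.
std::uint32_t load_u32_native(const char* buf, std::size_t offset) {
    std::uint32_t value;
    std::memcpy(&value, buf + offset, sizeof value);
    return value;
}

// Hypothetical helper: decode a 32-bit big-endian (network order) value
// one byte at a time, independent of the host's endianness.
std::uint32_t load_u32_be(const char* buf, std::size_t offset) {
    const unsigned char* p =
        reinterpret_cast<const unsigned char*>(buf + offset);
    return (std::uint32_t{p[0]} << 24) | (std::uint32_t{p[1]} << 16) |
           (std::uint32_t{p[2]} << 8)  |  std::uint32_t{p[3]};
}
```

Accessing the buffer through unsigned char is covered by the character-type exception quoted above, so neither helper depends on how the buffer was allocated.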

Actually this code already has UB at the point you dereference the reinterpret_casted integer pointer without even needing to invoke strict-aliasing rules. Not only that, but if you aren't rather careful, reinterpreting directly to your packet structure could cause all sorts of issues depending on struct packing and endianness.
Given all that, and given that you're already invoking UB, I suspect that it's "likely to work" on multiple compilers, and you're free to take that (possibly measurable) risk.

Related

Can I reinterpret_cast some byte range of a POD C-Array to std::array<char,N>?

I want to use fixed contiguous bytes of a long byte array s as keys in a std::map<std::array<char,N>,int>.
Can I do this without copying by reinterpreting subarrays of s as std::array<char,N>?
Here is a minimal example:
#include <array>
#include <map>
int main() {
std::map<std::array<char,10>,int> m;
const char* s="Some long contiguous data";
// reinterpret some contiguous 10 bytes of s as std::array<char,10>
// Is this UB or valid?
const std::array<char,10>& key=*reinterpret_cast<const std::array<char,10>*>(s+5);
m[key]=1;
}
I would say yes, because char is a POD type that does not require alignment to specific addresses (in contrast to bigger POD types, see https://stackoverflow.com/a/32590117/6212870). Therefore, it should be OK to reinterpret_cast to std::array<char,N> starting at every address as long as the covered bytes are still a subrange of s, i.e. as long as I ensure that I do not have buffer overflow.
Can I really do such reinterpret_cast or is it UB?
EDIT:
In the comments, people correctly pointed to the fact that I cannot know for sure that for std::array<char,10> arr it holds that (void*)&arr==(void*)&arr[0] due to the possibility of padding of the internal c-array data member of the std::array template class, even though this typically should not be the case, especially since we are considering a char POD array. So I update my question:
Can I rely on the reinterpret_cast as done above when I check via static_assert that there is indeed no padding? Of course the code won't compile anymore on compiler/platform combinations where there is padding, so I won't use this method there. But I want to know: are there other concerns apart from the padding? Or is the code valid with a static_assert check?
No—there is no object of type std::array<char,10> at that address, regardless of the layout of that type. (The special rules for char do not apply to a type that happens to have char subobjects.) As always, it is not the reinterpret_cast itself whose behavior is undefined, but rather the access through that non-object when using it as a map key. (What you are allowed to do in this case is merely cast it back to the real type, for use with C-like interfaces that require a fixed pointer type but do not actually use the object.)
This access also of course involves a copy; if your goal was to avoid copying at all, just make a
std::map<const char*,int,ten_cmp>
where ten_cmp is a functor type that compares 10 bytes starting from each address (via std::strncmp or std::string_view).
If you do want the map to own its key data, just std::memcpy from the string into a key; compilers often recognize that such temporary “buffers” don’t need to exist independently and actually read from the source in the fashion you hope to do with reinterpret_cast.
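Both suggestions can be sketched as follows, assuming C++17 for std::string_view. The constant kKeyLen, the alias KeyMap, and the helper make_key are hypothetical names introduced for illustration; ten_cmp is the comparator named above. The caller must guarantee that every pointer stored in the map has at least kKeyLen readable bytes behind it and outlives the map.

```cpp
#include <array>
#include <cstddef>
#include <cstring>
#include <map>
#include <string_view>

constexpr std::size_t kKeyLen = 10;  // hypothetical key length

// Orders pointers by the kKeyLen bytes they point at, not by address.
struct ten_cmp {
    bool operator()(const char* a, const char* b) const {
        return std::string_view(a, kKeyLen) < std::string_view(b, kKeyLen);
    }
};

// Non-owning variant: keys point into the original byte array.
using KeyMap = std::map<const char*, int, ten_cmp>;

// Owning variant: copy the bytes into a real std::array object.
std::array<char, kKeyLen> make_key(const char* p) {
    std::array<char, kKeyLen> key;
    std::memcpy(key.data(), p, kKeyLen);
    return key;
}
```

With the non-owning variant, two different addresses that hold the same ten bytes map to the same entry.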

Is object address guaranteed to be a multiple of its type alignment?

Alignment is defined in the Standard as follows:
An alignment is an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated.
However, this does not imply that such addresses are multiples of the alignment value. For instance, two double objects at addresses 0x01 and 0x09 satisfy the above definition.
Is it guaranteed somehow that an address of an object is a multiple of the alignment value for its type?
No, it isn't.
Only a linear relationship, not a proportionality, is guaranteed, and even then the alignment requirements could be relaxed by structure packing, for example:
/*packed*/ struct s {double a; char b; double c;};
Note that nullptr does not even have to be the zero memory byte, virtual memory or otherwise.
For objects that are not struct or union members, the Standard does not define any means by which alignment could be observed, unless operations fail because of misalignment. If a platform can silently process objects with arbitrary alignment (though perhaps not as fast as objects that are correctly aligned), and if an implementation for that platform doesn't define uintptr_t or intptr_t, there would be no observable way by which failure to align standalone objects could be detected, and thus no requirement to actually align objects.
Most implementations would document ways of detecting pointer alignment, and should process alignment directives so they behave in a fashion consistent with their documentation. One could have a conforming implementation that did otherwise, just as one could have conforming-but-low-quality implementations do all sorts of weird goofy things, but quality implementations should refrain from such nonsense.
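On implementations that do provide std::uintptr_t, a common (but implementation-defined) way to observe alignment is to convert the pointer to an integer and take the remainder. The sketch below assumes the usual flat mapping from pointers to integers; is_aligned_for is a hypothetical helper name.

```cpp
#include <cstdint>

// Hypothetical helper: true if p's integer representation is a multiple
// of alignof(T). The pointer-to-integer mapping is implementation-defined,
// so this reports what the implementation exposes, not a Standard guarantee.
template <class T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```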

storing flag inside pointer

I have heard quite a lot about storing external data in a pointer.
For example, in short string optimization (SSO).
For example:
When we want to overload << for our SSO class, depending on the length of the string we want to print either the value of the pointer or the string.
Instead of creating a bool flag, we could encode this flag inside the pointer itself. If I am not mistaken, it's thanks to the PC architecture, which adds padding to prevent unaligned memory access.
But I have yet to see it in an example. How could we detect such a flag, when binary operations such as & to check if the RSB or LSB is set to 1 (as a flag) are not allowed on pointers? Also, wouldn't this mess up dereferencing pointers?
All answers are appreciated.
It is quite possible to do such things (unlike what others have said). Most modern architectures (x86-64, for example) enforce alignment requirements that allow you to use the fact that the least significant bits of a pointer may be assumed to be zero, and make use of that storage for other purposes.
Let me pause for a second and say that what I'm about to describe is considered 'undefined behavior' by the C and C++ standards. You are going off the rails in a non-portable way by doing what I describe, but there are more standards governing the rules of a computer than the C++ standard (such as the processor's assembly reference and architecture docs). Caveat emptor.
With the assumption that we're working on x86_64, let us say that you have a class/structure that starts with a pointer member:
struct foo {
bar * ptr;
/* other stuff */
};
By the x86 architectural constraints, that pointer in foo must be aligned on an 8-byte boundary. In this trivial example, you can assume that every pointer to a struct foo is therefore an address divisible by 8, meaning the lowest 3 bits of a foo * will be zero.
In order to take advantage of such a constraint, you must play some casting games to allow the pointer to be treated as a different type. There's a bunch of different ways of performing the casting, ranging from the old C method (not recommended) of casting it to and from a uintptr_t to cleaner methods of wrapping the pointer in a union. In order to access either the pointer or ancillary data, you need to logically 'and' the datum with a bitmask that zeros out the part of the datum you don't wish.
As an example of this explanation, I wrote an AVL tree a few years ago that sinks the balance book-keeping data into a pointer, and you can take a look at that example here: https://github.com/jschmerge/structures/blob/master/tree/avl_tree.h#L31 (everything you need to see is contained in the struct avl_tree_node at the line I referenced).
Swinging back to a topic you mentioned in your initial question... Short string optimization isn't implemented quite the same way. The implementations of it in Clang and GCC's standard libraries differ somewhat, but both boil down to using a union to overload a block of storage with either a pointer or an array of bytes, and play some clever tricks with the string's internal length field for differentiating whether the data is a pointer or local array. For more of the details, this blog post is rather good at explaining: https://shaharmike.com/cpp/std-string/
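The masking described above can be sketched with the uintptr_t approach. This is an illustrative sketch, not the code from the linked AVL tree: tagged_ptr and kFlagMask are hypothetical names, and it assumes that the implementation-defined pointer-to-integer conversion leaves the lowest bit zero for any suitably aligned object.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kFlagMask = 1;  // lowest bit carries the flag

template <class T>
class tagged_ptr {
    static_assert(alignof(T) >= 2, "T must leave at least one low bit free");
    std::uintptr_t bits_;

public:
    tagged_ptr(T* p, bool flag) : bits_(reinterpret_cast<std::uintptr_t>(p)) {
        // Assumption check: an aligned pointer has a zero low bit.
        assert((bits_ & kFlagMask) == 0);
        set_flag(flag);
    }

    // Mask the flag out before converting back, so the original pointer
    // value is restored and can be dereferenced safely.
    T* get() const { return reinterpret_cast<T*>(bits_ & ~kFlagMask); }

    bool flag() const { return (bits_ & kFlagMask) != 0; }

    void set_flag(bool f) {
        bits_ = (bits_ & ~kFlagMask) | (f ? kFlagMask : 0);
    }
};
```

The bitwise operations are applied to the integer, never to the pointer itself, which is how the "& is not allowed on pointers" restriction is avoided.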
"encode this flag inside pointer itself"
No, you are not allowed to do this in either C or C++.
The behaviour on setting (let alone dereferencing) a pointer to memory you don't own is undefined in either language.
Sadly what you want to achieve is to be done at the assembler level, where the distinction between a pointer and integer is sufficiently blurred.

Is a byte array allocated with new[] aligned on platform word boundary? [duplicate]

Is allocating a buffer via new char[sizeof(T)] guaranteed to allocate memory which is properly aligned for the type T, where all members of T have their natural, implementation-defined alignment (that is, you have not used the alignas keyword to modify their alignment)?
I have seen this guarantee made in a few answers around here but I'm not entirely clear how the standard arrives at this guarantee. 5.3.4-10 of the standard gives the basic requirement: essentially new char[] must be aligned to max_align_t.
What I'm missing is the bit which says alignof(T) will always be a valid alignment with a maximum value of max_align_t. I mean, it seems obvious, but must the resulting alignment of a structure be at most max_align_t? Even point 3.11-3 says extended alignments may be supported, so may the compiler decide on its own a class is an over-aligned type?
The expressions new char[N] and new unsigned char[N] are guaranteed
to return memory sufficiently aligned for any object. See §5.3.4/10
"[...] For arrays of char and unsigned char, the difference between the
result of the new-expression and the address returned by the allocation
function shall be an integral multiple of the strictest fundamental
alignment requirement (3.11) of any object type whose size is no greater
than the size of the array being created. [ Note: Because allocation
functions are assumed to return pointers to storage that is
appropriately aligned for objects of any type with fundamental
alignment, this constraint on array allocation overhead permits the
common idiom of allocating character arrays into which objects of other
types will later be placed. —end note ]".
From a stylistic point of view, of course: if what you want is to allocate raw
memory, it's clearer to say so: operator new(N). Conceptually,
new char[N] creates N char; operator new(N) allocates N bytes.
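A minimal sketch of the operator new idiom, assuming the type is not over-aligned. Packet, make_packet, and destroy_packet are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>

// Hypothetical payload type with fundamental alignment.
struct Packet {
    std::uint32_t id;
    double value;
};

// operator new(N) only promises fundamental alignment, so reject
// over-aligned types at compile time.
static_assert(alignof(Packet) <= alignof(std::max_align_t),
              "Packet must not be over-aligned");

Packet* make_packet(std::uint32_t id, double value) {
    void* raw = ::operator new(sizeof(Packet));  // raw bytes, no objects yet
    return new (raw) Packet{id, value};          // construct in place
}

void destroy_packet(Packet* p) {
    p->~Packet();          // end the object's lifetime
    ::operator delete(p);  // release the raw storage
}
```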
What I'm missing is the bit which says alignof(T) will always be a valid alignment with a maximum value of max_align_t. I mean, it seems obvious, but must the resulting alignment of a structure be at most max_align_t ? Even point 3.11-3 says extended alignments may be supported, so may the compiler decide on its own a class is an over-aligned type ?
As noted by Mankarse, the best quote I could get is from [basic.align]/3:
A type having an extended alignment requirement is an over-aligned type. [ Note:
every over-aligned type is or contains a class type to which extended alignment applies (possibly through a non-static data member). —end note ]
which seems to imply that extended alignment must be explicitly required (and then propagates) but cannot
I would have preferred a clearer mention; the intent is obvious to a compiler writer, and any other behavior would be insane, still...

Cast A primitive type pointer to A structure pointer - Alignment and Padding?

Just 20 minutes ago, while answering a question, I came up with an interesting scenario whose behavior I'm not sure of:
Let me have an integer array of size n, pointed to by intPtr;
int* intPtr;
and let me also have a struct like this:
typedef struct {
int val1;
int val2;
// more or fewer int members follow in the same way (no other types)
}intStruct;
My question is if I do a cast intStruct* structPtr = (intStruct*) intPtr;
Am I sure to get every element correctly if I traverse the elements of the struct? Is there any possibility of misalignment (possibly because of padding) on any architecture/compiler?
The standard is fairly specific that even a POD-struct (which is, I believe the most restrictive class of structs) can have padding between members. ("There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment." -- a non-normative note, but still makes the intent quite clear).
For example, contrast the requirements for a standard-layout struct (C++11, §1.8/4):
An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage.
...with those for an array (§8.3.4/1):
An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.
In the array, the elements themselves are required to be allocated contiguously, whereas in the struct, only the storage is required to be contiguous.
The third possibility that might make the "contiguous storage" requirement make more sense would be to consider a struct/class that is not trivially copyable or standard layout. In this case, it's possible that the storage might not be contiguous at all. For example, an implementation might set aside one area of memory for holding all the private variables, and an entirely separate area of memory to hold all the public variables. To make that a little more concrete, consider two definitions like:
class A {
int a;
public:
int b;
} a;
class B {
int x;
public:
int y;
} b;
With these definitions, the memory might be laid out something like:
a.a;
b.x;
// ... somewhere else in memory entirely:
a.b;
b.y;
In this case, neither the elements nor the storage needs to be contiguous, so interleaving parts of entirely separate structs/classes is allowable.
That said, the first element must be at the same address as the struct as a whole (9.2/17): "A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa."
In your case, you have a POD-struct, so that guarantee applies. Since the first member must be aligned, and the remaining members are all of the same type, it's impossible for any padding to be truly necessary between the other members (i.e., except for bit-fields, any type you can put in a struct you can also put in an array, where contiguous allocation of the elements is required). If you have elements smaller than a word on a word-oriented machine (e.g., early DEC Alphas), it's possible that padding could make access somewhat simpler, though. For example, early DEC Alphas (at the hardware level) were only capable of reading/writing an entire (64-bit) word at a time. As such, let's consider something like a struct of four char elements:
struct foo {
char a, b, c, d;
};
If it was required to lay these out in memory so they were contiguous, accessing a foo::b (for example) would require that the CPU load the word, then shift it 8-bits right, then mask to zero-extend that byte to fill the entire register.
Storing would be even worse -- the CPU would have to load the current value of the whole word, mask out the current contents of the appropriate char-sized piece of that, shift the new value to the correct place, OR it into the word, and finally store the result.
By contrast, with padding between the elements, each of those becomes a simple load/store, with no shifting, masking, etc.
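The load and store sequences described above can be written out in portable C++ as an illustration of the extra work involved. This is a sketch of the arithmetic only, not actual Alpha code; load_byte and store_byte are hypothetical names, and byte 0 is assumed to be the least significant byte of the word.

```cpp
#include <cstdint>

// Extract byte `index` (0 = least significant) from a 64-bit word:
// shift it down, then mask to zero-extend.
std::uint8_t load_byte(std::uint64_t word, unsigned index) {
    return static_cast<std::uint8_t>((word >> (8 * index)) & 0xFF);
}

// Replace byte `index`: clear the old byte, shift the new value into
// place, OR it in, and return the word that would be stored back.
std::uint64_t store_byte(std::uint64_t word, unsigned index,
                         std::uint8_t value) {
    const std::uint64_t mask = std::uint64_t{0xFF} << (8 * index);
    return (word & ~mask) | (std::uint64_t{value} << (8 * index));
}
```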
At least if memory serves, with DEC's normal compiler for the Alpha, int was 32 bits, and long was 64 bits (it predated long long). As such, with your struct of four ints, you could have expected to see another 32 bits of padding between the elements (and another 32 bits after the last element as well).
Given that you do have a POD-struct, you still have some possibilities though. The one I'd probably prefer would be to use offsetof to get the offsets of the members of the struct, create an array of them, and access the members via those offsets. I showed how to do this in a couple of previous answers.
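One way the offsetof approach might look is sketched below. It is not the code from the previous answers mentioned above: the third member val3, the table kOffsets, and the helpers get_member and set_member are hypothetical. Members are read and written through std::memcpy on the object's bytes, so the result is correct whether or not padding is present.

```cpp
#include <cstddef>
#include <cstring>

struct intStruct {
    int val1;
    int val2;
    int val3;  // hypothetical extra member for illustration
};

// Table of member offsets, computed by the compiler.
const std::size_t kOffsets[] = {
    offsetof(intStruct, val1),
    offsetof(intStruct, val2),
    offsetof(intStruct, val3),
};

// Read member i by copying its bytes out of the struct.
int get_member(const intStruct& s, std::size_t i) {
    int v;
    std::memcpy(&v, reinterpret_cast<const unsigned char*>(&s) + kOffsets[i],
                sizeof v);
    return v;
}

// Write member i by copying the new value's bytes into the struct.
void set_member(intStruct& s, std::size_t i, int v) {
    std::memcpy(reinterpret_cast<unsigned char*>(&s) + kOffsets[i], &v,
                sizeof v);
}
```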
Strictly speaking, such pointer casts aren't allowed and lead to undefined behavior.
The main issue with the cast is however that the compiler is free to add any number of padding bytes anywhere inside a struct, except before the very first element. So whether it will work or not depends on the alignment requirements of the specific system, and also whether struct padding is enabled or not.
int is not necessarily of the same size as the optimal size for an addressable chunk of data, even though this is true for most 32-bit systems. Some 32-bitters don't care about misalignment, some will allow misalignment but produce less efficient code, and some must have the data aligned. In theory, 64-bitters may also want to add padding after an int (which will be 32 bit there) to get a 64-bit chunk, but in practice they support 32-bit instruction sets.
If you write code relying on this cast, you should add something like this:
static_assert(sizeof(intStruct) == sizeof(int) + sizeof(int),
"intStruct must not contain padding"); // the message is required before C++17
It is guaranteed to be legal, given that the element type is standard-layout. Note: all references in the following are to the c++11 standard.
8.3.4 Arrays [dcl.array]
1 - [...] An object of array type contains a contiguously allocated non-empty set of N subobjects of type T. [...]
Regarding a struct with N members of type T,
9.2 Class members [class.mem]
14 - Nonstatic data members of a (non-union) class with the same access control are allocated so
that later members have higher addresses within a class object. [...] Implementation alignment requirements might
cause two adjacent members not to be allocated immediately after each other [...]
20 - A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member [...] and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. —end note ]
So the question is whether any alignment-required padding within a struct could cause its members not to be contiguously allocated with respect to each other. The answer is:
1.8 The C++ object model [intro.object]
4 - [...] An object of trivially copyable or standard-layout type shall occupy contiguous bytes of storage.
In other words, a standard-layout struct a containing at least two members x, y of the same (standard-layout) type that does not respect the identity &a.y == &a.x + 1 is in violation of 1.8:4.
Note that alignment is defined as (3.11 Alignment [basic.align]) the number of bytes between successive addresses at which a given object can be allocated; it follows that alignment of a type T can be no greater than the distance between adjacent objects in an array of T, and (since 5.3.3 Sizeof [expr.sizeof] specifies that the size of an array of n elements is n times the size of an element) alignof(T) can be no greater than sizeof(T). Thus any additional padding between adjacent elements of a struct of the same type would not be required by alignment and so would not be countenanced by 9.2:14.
With regard to AProgrammer's point, I would interpret the language in 26.4 Complex numbers [complex.numbers] as requiring that the instantiations of std::complex<T> should behave as standard-layout types with regard to the position of their members, without being required to conform to all the requirements of standard-layout types.
The behavior there is almost certainly compiler-, architecture-, and ABI-dependent. However, if you're using gcc, you can make use of __attribute__((packed)) to force the compiler to pack struct members one after the other, without any padding. With that, the memory layout should match that of a flat array.
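A small sketch of the effect of the attribute, assuming GCC or Clang (other compilers spell this differently). The struct names are hypothetical; a char followed by an int is used because that is a layout where padding normally appears.

```cpp
#include <cstddef>

// GCC/Clang-specific: members are laid out back to back with no padding.
struct __attribute__((packed)) packed_record {
    char tag;
    int value;
};

// Same members without the attribute; padding after `tag` is typical.
struct plain_record {
    char tag;
    int value;
};

static_assert(sizeof(packed_record) == sizeof(char) + sizeof(int),
              "packed struct should contain no padding");
static_assert(offsetof(packed_record, value) == 1,
              "value should start right after tag");
```

Note that members of a packed struct may be misaligned, so taking their address and dereferencing it as a plain int* can fault on architectures that require aligned access.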
I found nothing that guarantees it is valid when I searched some time ago, and I did find an explicit guarantee for the case of std::complex<> in C++, which could have been formulated more easily if it were more generally true. So I doubt I missed something in my search (but absence of proof is hardly a proof of absence, and the standard is sometimes obscure in its formulation).
A typical alignment of C structs guarantees that the data structure members in the struct will be stored sequentially which is the same as a C array. So order cannot be a problem.
When it comes to alignment, since you have only one data type (int), there is no scenario in which it would be necessary to add padding to align your data members, even though the compiler is permitted to do so. The compiler can add padding between members or at the end of the struct, but it cannot add padding at the beginning of the data structure. So if the compiler were to add padding in your situation,
Instead of this:
[4Byte int][4Byte int][4Byte int]...[4Byte int]
Your data structure would have to be stored like this:
[4Byte Data][4Byte Padding][4Byte Data]... which is unreasonable.
Overall, I think this cast should work with no problems in your situation, though I think it is bad practice to use it.