I have tested this code on ideone.com and it outputs 16 as it should. However when I try it in Visual Studio 2013 it shows 8. Is it a bug or lack of C++11 support from the compiler?
#include <iostream>
#include <type_traits>
using namespace std;
using float_pack = aligned_storage<4 * sizeof(float), 16>::type;
int main() {
cout << alignment_of<float_pack>::value << endl;
return 0;
}
I have used alignment_of because MSVC doesn't support alignof.
Edit: I see that I can't get 16-byte alignment with aligned_storage. But why does this snippet work?
#include <iostream>
#include <type_traits>
#include <xmmintrin.h>
using namespace std;
__declspec(align(16)) struct float_pack {
float x[4];
};
int main()
{
cout << alignment_of<float_pack>::value << endl;
}
Output is 16. Does that mean that the compiler can provide larger alignment when using extensions? Why can't I achieve the same result with aligned_storage? Is it only because MSVC doesn't provide that with aligned_storage?
It looks like alignof(std::max_align_t) is 8 with that compiler, which you can check with:
std::cout << alignment_of<std::max_align_t>::value << '\n';
In the draft C++ standard section 3.11 Alignment it says:
A fundamental alignment is represented by an alignment less than or equal to the greatest alignment supported by the implementation in all contexts, which is equal to alignof(std::max_align_t) (18.2).[...]
In other words, that is the greatest alignment the implementation supports in all contexts. This seems to be backed up by this boost doc, which says:
An extended alignment is represented by an alignment greater than alignof(std::max_align_t). It is implementation-defined whether any extended alignments are supported and the contexts in which they are supported. A type having an extended alignment requirement is an over-aligned type.
By the standard, max_align_t is tied to the fundamental alignment, which, as James has informed us, is 8 bytes. An extension, on the other hand, does not have to stick to this as long as it is documented. If we read the docs for __declspec align, we see that they say:
Writing applications that use the latest processor instructions
introduces some new constraints and issues. In particular, many new
instructions require that data must be aligned to 16-byte boundaries.
Additionally, by aligning frequently used data to the cache line size
of a specific processor, you improve cache performance. For example,
if you define a structure whose size is less than 32 bytes, you may
want to align it to 32 bytes to ensure that objects of that structure
type are efficiently cached.
[...]
Without __declspec(align(#)), Visual C++ aligns data on natural
boundaries based on the size of the data, for example 4-byte integers
on 4-byte boundaries and 8-byte doubles on 8-byte boundaries. Data in
classes or structures is aligned within the class or structure at the
minimum of its natural alignment and the current packing setting (from #pragma pack or the /Zp compiler option).
std::aligned_storage defines a type of size Len, with the alignment requirement you provide. If you ask for an unsupported alignment, your program is ill-formed.
template <std::size_t Len, std::size_t Align
= default-alignment > struct aligned_storage;
Len shall not be zero. Align shall be equal to alignof(T) for some type T or to default-alignment.
The value of default-alignment shall be the most stringent alignment requirement for any C++ object type whose size is no greater than Len (3.9). The member typedef type shall be a POD type suitable for use as uninitialized storage for any object whose size is at most Len and whose alignment is a divisor of Align.
[ Note: A typical implementation would define aligned_storage as:
template <std::size_t Len, std::size_t Alignment>
struct aligned_storage {
typedef struct {
alignas(Alignment) unsigned char __data[Len];
} type;
};
—end note ]
And for alignas:
7.6.2 Alignment specifier [dcl.align]
1 An alignment-specifier may be applied to a variable or to a class data member, but it shall not be applied to a bit-field, a function parameter, the formal parameter of a catch clause (15.3), or a variable declared with the register storage class specifier. An alignment-specifier may also be applied to the declaration of a class or enumeration type. An alignment-specifier with an ellipsis is a pack expansion (14.5.3).
2 When the alignment-specifier is of the form alignas( assignment-expression ):
— the assignment-expression shall be an integral constant expression
— if the constant expression evaluates to a fundamental alignment, the alignment requirement of the
declared entity shall be the specified fundamental alignment
— if the constant expression evaluates to an extended alignment and the implementation supports that
alignment in the context of the declaration, the alignment of the declared entity shall be that alignment
— if the constant expression evaluates to an extended alignment and the implementation does not support
that alignment in the context of the declaration, the program is ill-formed
— if the constant expression evaluates to zero, the alignment specifier shall have no effect
— otherwise, the program is ill-formed.
What does the following C++ code mean?
unsigned char a : 1;
unsigned char b : 7;
I guess it creates two char a and b, and both of them should be one byte long, but I have no idea what the ": 1" and ": 7" part does.
The 1 and the 7 are bit sizes to limit the range of the values. They're typically found in structures and unions. For example, on some systems (depends on char width and packing rules, etc), the code:
typedef struct {
unsigned char a : 1;
unsigned char b : 7;
} tOneAndSevenBits;
creates an 8-bit value, one bit for a and 7 bits for b.
Typically used in C to access "compressed" values such as a 4-bit nybble which might be contained in the top half of an 8-bit char:
typedef struct {
unsigned char leftFour : 4;
unsigned char rightFour : 4;
} tTwoNybbles;
For the language lawyers amongst us, the 9.6 section of the C++11 standard explains this in detail, slightly paraphrased:
Bit-fields [class.bit]
A member-declarator of the form
identifier[opt] attribute-specifier[opt] : constant-expression
specifies a bit-field; its length is set off from the bit-field name by a colon. The optional attribute-specifier appertains to the entity being declared. The bit-field attribute is not part of the type of the class member.
The constant-expression shall be an integral constant expression with a value greater than or equal to zero. The value of the integral constant expression may be larger than the number of bits in the object representation of the bit-field’s type; in such cases the extra bits are used as padding bits and do not participate in the value representation of the bit-field.
Allocation of bit-fields within a class object is implementation-defined. Alignment of bit-fields is implementation-defined. Bit-fields are packed into some addressable allocation unit.
Note: bit-fields straddle allocation units on some machines and not on others. Bit-fields are assigned right-to-left on some machines, left-to-right on others. - end note
I believe those would be bitfields.
Strictly speaking, in C a bitfield must be an int, unsigned int, or _Bool, although most compilers will take any integral type.
Ref C11 6.7.2.1:
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.
Your compiler will probably allocate 1 byte of storage, but it is free to grab more.
Ref C11 6.7.2.1:
An implementation may allocate any addressable storage unit large
enough to hold a bit-field.
The savings come when you have multiple bitfields that are declared one after another. In this case, the storage allocated will be packed if possible.
Ref C11 6.7.2.1:
If enough space remains, a bit-field that
immediately follows another bit-field in a structure shall be packed
into adjacent bits of the same unit. If insufficient space remains,
whether a bit-field that does not fit is put into the next unit or
overlaps adjacent units is implementation-defined.
The C++ standard states, regarding the std::aligned_storage template, that
Align shall be equal to alignof(T) for some type T or to default-alignment.
Does that mean that there must be such a type in the program, or that it must be possible to make such a type? In particular, the possible implementation suggested on cppreference is
template<std::size_t Len, std::size_t Align /* default alignment not implemented */>
struct aligned_storage {
typedef struct {
alignas(Align) unsigned char data[Len];
} type;
};
It seems like this makes a type with that alignment, if possible (that is, if Align is a valid alignment). Is that behavior required, or is it undefined behavior to specify an Align if such a type does not already exist?
And, perhaps more importantly, is it plausible in practice that the compiler or standard library would fail to do the right thing in this case, assuming that Align is at least a legal alignment for a type to have?
You can always attempt to make a type with arbitrary (valid) alignment N:
template <std::size_t N> struct X { alignas(N) char c; };
When N is greater than alignof(std::max_align_t), X has extended alignment. The support for extended alignment is implementation-defined, and [dcl.align] says:
if the constant expression does not evaluate to an alignment value (6.11), or evaluates to an extended
alignment and the implementation does not support that alignment in the context of the declaration, the program is ill-formed.
Therefore, when you attempt to say X<N> for an extended alignment that is not supported, you will face a diagnostic. You can now use the existence (or otherwise) of X<N> to justify the validity of the specialization aligned_storage<Len, N> (whose condition is now met with T = X<N>).
Since aligned_storage will effectively use something like X internally, you don't even have to actually define X. It's just a mental aid in the explanation. The aligned_storage will be ill-formed if the requested alignment is not supported.
I'm reading a bit about alignment in C++, and I am not sure why the alignment of a class that contains solely a char array member is not the sizeof of the array, but turns out to be always 1. For example
#include <iostream>
struct Foo{char m_[16];}; // shouldn't this have a 16 byte alignment?!
int main()
{
std::cout << sizeof(Foo) << " " << alignof(Foo);
}
In the code above, sizeof(Foo) is clearly 16, yet its alignment is 1.
Why is the alignof(Foo) 1 in this case?
Note that if I replace char m_[16]; with a fundamental type like int m_;, then alignof(Foo) becomes what I would've expected, i.e. sizeof(int) (on my machine this is 4).
Same happens if I simply declare an array char arr[16];, then alignof(arr) will be 1.
Note: data alignment has been explained in detail in this article. If you want to know what the term means in general and why it is an important issue, read the article.
Alignment is defined in C++ as an implementation-defined integer value representing the number of bytes between successive addresses at which a given object can be allocated [6.11.1] Alignment.
Moreover, alignments must be non-negative integral powers of 2 [6.11.4] Alignment.
When we calculate the alignment of a struct we have to take into account yet another rule [6.11.5] Alignment:
Alignments have an order from weaker to stronger or stricter
alignments. Stricter alignments have larger alignment values. An
address that satisfies an alignment requirement also satisfies any
weaker valid alignment requirement.
It's not directly stated but these rules imply that struct alignment has to be at least as strict as the alignment of its most strictly aligned member. It could be bigger but it doesn't have to be and usually isn't.
So when the alignment of the struct from OP's example is decided the alignment of the struct must be no less than alignment of its only member's type char[16]. Then by the 8.3.6 [expr.alignof]:
When alignof is applied to a reference type, the result is the
alignment of the referenced type. When alignof is applied to an array
type, the result is the alignment of the element type.
alignof(char[16]) equals alignof(char) which will usually be 1 because of [6.11.6] Alignment:
(...) narrow character types shall have the weakest alignment requirement.
In this example:
struct Foo
{
char c[16];
double d;
};
double has more strict alignment than char so alignof(Foo) equals alignof(double).
I need to access unaligned values using GCC vector extension
The program below crashes - in both clang and gcc
typedef int __attribute__((vector_size(16))) int4;
typedef int __attribute__((vector_size(16),aligned(4))) *int4p;
int main()
{
int v[64] __attribute__((aligned(16))) = {};
int4p ptr = reinterpret_cast<int4p>(&v[7]);
int4 val = *ptr;
}
However if I change
typedef int __attribute__((vector_size(16),aligned(4))) *int4p;
to
typedef int __attribute__((vector_size(16),aligned(4))) int4u;
typedef int4u *int4up;
The generated assembly code is correct (using unaligned load) - in both clang and gcc.
What is wrong with single definition or what do I miss? Can it be the same bug in both clang and gcc?
Note: it happens in both clang and gcc
TL;DR
You've altered the alignment of the pointer type itself, not the pointee type. This has nothing to do with the vector_size attribute and everything to do with the aligned attribute. It's also not a bug, and it's implemented correctly in both GCC and Clang.
Long Story
From the GCC documentation, § 6.33.1 Common Type Attributes (emphasis added):
aligned (alignment)
This attribute specifies a minimum alignment (in bytes) for variables of the specified type. [...]
The type in question is the type being declared, not the type pointed to by the type being declared. Therefore,
typedef int __attribute__((vector_size(16),aligned(4))) *int4p;
declares a new type T that points to objects of type *T, where:
*T is a 16-byte vector with default alignment for its size (16 bytes)
T is a pointer type, and the variables of this type may be exceptionally stored aligned to as low as 4-byte boundaries (even though what they point to is a type *T that is far more aligned).
Meanwhile, § 6.49 Using Vector Instructions through Built-in Functions says (emphasis added):
On some targets, the instruction set contains SIMD vector instructions which operate on multiple values contained in one large register at the same time. For example, on the x86 the MMX, 3DNow! and SSE extensions can be used this way.
The first step in using these extensions is to provide the necessary data types. This should be done using an appropriate typedef:
typedef int v4si __attribute__ ((vector_size (16)));
The int type specifies the base type, while the attribute specifies the vector size for the variable, measured in bytes. For example, the declaration above causes the compiler to set the mode for the v4si type to be 16 bytes wide and divided into int sized units. For a 32-bit int this means a vector of 4 units of 4 bytes, and the corresponding mode of foo is V4SI.
The vector_size attribute is only applicable to integral and float scalars, although arrays, pointers, and function return values are allowed in conjunction with this construct. Only sizes that are a power of two are currently allowed.
Demo
#include <stdio.h>
typedef int __attribute__((aligned(128))) * batcrazyptr;
struct batcrazystruct{
batcrazyptr ptr;
};
int main()
{
printf("Ptr: %zu\n", sizeof(batcrazyptr));
printf("Struct: %zu\n", sizeof(batcrazystruct));
}
Output:
Ptr: 8
Struct: 128
Which is consistent with batcrazyptr ptr itself having its alignment requirement changed, not its pointee, and in agreement with the documentation.
Solution
I'm afraid you'll be forced to use a chain of typedefs, as you have done with int4u. It would be unreasonable to have a separate attribute to specify the alignment of each pointer level in a typedef.