C++ alignment and arrays - c++

I have some type T that I explicitly specify as x-aligned
x > sizeof(T)
x > any implementation fundamental alignment
(ex: x is page or cache aligned)
Suppose I now have: T arr[y], where arr is x-aligned (either by being allocated on the stack, or in data, or by an x-aligned heap allocation)
Then at least some of arr[1],...,arr[y-1] are not x-aligned.
Correct? (In fact, it must be correct if sizeof(T) does not change with extended alignment specification)
Note1: This is not the same question as How is an array aligned in C++ compared to a type contained?. This question asks about the alignment of the array itself, not of the individual elements inside.
Note2: This question: Does alignas affect the value of sizeof? is essentially what I'm asking - but for extended-alignment.
Note3: https://stackoverflow.com/a/4638295/7226419 is an authoritative answer to the question: sizeof(T) includes any padding necessary to satisfy alignment requirements, so that all T's in an array of T's are properly aligned.

If type T is x-aligned, every object of type T is x-aligned, including any array elements. In particular, this means x > sizeof(T) cannot possibly hold.
A quick test with a couple of modern compilers confirms:
#include <iostream>
struct alignas(16) overaligned {};
struct unaligned {};
template <class T> void sizes()
{
T m, marr[2];
std::cout << sizeof(m) << " " << sizeof(marr) << std::endl;
}
int main ()
{
sizes<unaligned>();
sizes<overaligned>();
}
Output:
1 2
16 32
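The same point can be checked for an extended alignment such as a cache line. The following is a minimal sketch, assuming a 64-byte alignment is supported for automatic variables on the target; the names CacheAligned, all_elements_aligned and check_array_alignment are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical over-aligned type: 10 bytes of payload, 64-byte alignment.
struct alignas(64) CacheAligned {
    char payload[10];
};

static_assert(alignof(CacheAligned) == 64, "alignas sets the alignment requirement");
static_assert(sizeof(CacheAligned) % alignof(CacheAligned) == 0,
              "sizeof is padded up to a multiple of the alignment");

// Returns true if every element of a local array sits on a 64-byte boundary.
bool all_elements_aligned() {
    CacheAligned arr[4];
    for (const CacheAligned& element : arr) {
        const auto address = reinterpret_cast<std::uintptr_t>(&element);
        if (address % alignof(CacheAligned) != 0) {
            return false;
        }
    }
    return true;
}

void check_array_alignment() {
    assert(all_elements_aligned());
}
```

Because sizeof(CacheAligned) is rounded up to a multiple of 64, consecutive elements are 64-byte aligned whenever the array itself is.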

Related

Is there a standard-compliant way to determine the alignment of a non-static member?

Suppose I have some structure S and a non-static member member, as in this example:
struct S { alignas(alignof(void *)) char member[sizeof(void *)]; };
How do you get the alignment of member?
The alignof operator can only be applied to complete types, not expressions [in 7.6.2.5.1], although GCC allows alignof(S::member) as an extension and Clang supports it too.
What is the "language-lawyerly" standard way to do it without this restriction?
Also, sizeof allows expression arguments, is there a reason for the asymmetry?
The practical concern is to be able to get the alignment of members of template structures, you can do decltype to get their type, sizeof to get their size, but then you also need the alignment.
The alignment of a type or variable is a description of what memory addresses the variable can inhabit—the address must be a multiple of the alignment*. However, for data-members, the address of the data-member can be any K * alignof(S) + offsetof(S, member). Let's define the alignment of a data-member to be the maximum possible integer E such that &some_s.member is always a multiple of E.
Given a type S with member member, let A = alignof(S), O = offsetof(S, member).
The valid addresses of S{}.member are V = K * A + O for some integer K.
V = K * A + O = gcd(A, O) * (K * A / gcd(A, O) + O / gcd(A, O)).
For the case where K = 1, no other factors exist.
Thus, gcd(A, O) is the best factor valid for unknown K.
In other words, "alignof(S.member)" == gcd(alignof(S), offsetof(S, member)).
Note that this alignment is always a power of two, as alignof(S) is always a power of two.
*: In my brief foray into the standard, I couldn't find this guarantee, meaning that the address of the variable could be K * alignment + some_integer. However, this doesn't affect the final result.
We can define a macro to compute the alignment of a data-member:
#include <cstddef> // for offsetof(...)
#include <numeric> // for std::gcd
// Must be a macro, as `offsetof` is a macro because the member name must be known
// at preprocessing time.
#define ALIGNOF_MEMBER(cls, member) (::std::gcd(alignof(cls), offsetof(cls, member)))
This is only guaranteed valid for standard layout types, as offsetof is only guaranteed valid for standard layout types. If the class is not standard layout, this operation is conditionally supported.
Example:
#include <cstddef>
#include <numeric>
struct S1 { char foo; alignas(alignof(void *)) char member[sizeof(void *)]; };
struct S2 { char foo; char member[sizeof(void *)]; };
#define ALIGNOF_MEMBER(cls, member) (::std::gcd(alignof(cls), offsetof(cls, member)))
int f1() { return ALIGNOF_MEMBER(S1, member); } // returns alignof(void *) == 8
int f2() { return ALIGNOF_MEMBER(S1, foo); } // returns 8*
int f3() { return ALIGNOF_MEMBER(S2, member); } // returns 1
// *: alignof(S1) == 8, so the `foo` member must always be at an alignment of 8
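The gcd argument above can also be checked by brute force. This is only a sketch under stated assumptions: it samples K from 0 to 63 rather than proving the claim, it needs C++17 for a constexpr std::gcd, and the helper names divides_all and gcd_is_best are hypothetical.

```cpp
#include <cstddef>
#include <numeric>

// True if `e` divides K * a + o for every sampled K in [0, 64).
constexpr bool divides_all(std::size_t e, std::size_t a, std::size_t o) {
    for (std::size_t k = 0; k < 64; ++k) {
        if ((k * a + o) % e != 0) {
            return false;
        }
    }
    return true;
}

// True if gcd(a, o) divides every sampled address while the next
// power of two above it does not.
constexpr bool gcd_is_best(std::size_t a, std::size_t o) {
    const std::size_t g = std::gcd(a, o);
    return divides_all(g, a, o) && !divides_all(2 * g, a, o);
}

static_assert(gcd_is_best(8, 0));  // member at offset 0 inherits the class alignment
static_assert(gcd_is_best(8, 1));  // member at offset 1 is only 1-aligned
static_assert(gcd_is_best(8, 4));  // member at offset 4 is 4-aligned
```

Since alignments are powers of two, ruling out 2 * gcd(a, o) is enough to rule out every larger candidate.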
I don't think it's possible. In the general case, declaring a non-static data member with an alignment specifier might not change the layout of the class that contains it. In the below example, if (as is most common) int has a size and alignment of 4, the structs S1 and S2 are likely to have the same layout, with a total size of 8 bytes. Each is likely to have 3 bytes of padding at the end:
struct S1 {
int x;
char y;
};
struct S2 {
int x;
alignas(4) char y;
};
This prevents us from using any information about the layout of the struct to determine the alignment of y. And as the OP noted, alignof(S::member) isn't valid.
By the way, there also isn't any way to query the alignment specifier of a regular variable. You can use the std::align function to check whether the variable is allocated at an address that is appropriately aligned for an object with alignment X, but this doesn't imply that the variable was actually declared with an alignment of X or greater. It could have been declared with an alignment less than X and coincidentally ended up allocated at an address that could have supported an object with alignment X.
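As a rough illustration of the std::align check described above, here is a minimal sketch. The helper is_aligned_to and the function check_alignment_probe are hypothetical names; the alignment argument is assumed to be a power of two.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Returns true if `p` is a multiple of `alignment` (assumed to be a power of two).
bool is_aligned_to(const void* p, std::size_t alignment) {
    void* candidate = const_cast<void*>(p);
    // With `alignment` bytes of space, std::align always finds room for a
    // 1-byte object, so it never returns nullptr here.
    std::size_t space = alignment;
    // std::align returns the first suitably aligned address at or after `p`;
    // that address equals `p` only when `p` is already aligned.
    return std::align(alignment, 1, candidate, space) == p;
}

void check_alignment_probe() {
    alignas(16) char buffer[32];
    assert(is_aligned_to(buffer, 16));       // declared with alignas(16)
    assert(!is_aligned_to(buffer + 1, 16));  // one byte past a 16-byte boundary
    assert(is_aligned_to(buffer + 16, 16));  // aligned by position, not by declaration
}
```

The last assertion shows the limitation: the check only reports where the address happens to fall, not which alignment specifier the variable was declared with.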
Since this functionality is unsupported not only for non-static data members but also regular variables, I'm inclined to think that it's not an oversight; it's deliberately not supported because it's not useful. The compiler needs to know the alignment specifier so that it can allocate the variable or data member appropriately. That is not the programmer's job. Sure, the programmer may need to know the alignment requirement of a type in order to appropriately allocate memory for instances of that type, but you cannot, as the programmer, create additional instances of a variable, other than by triggering some condition that makes it happen automatically (e.g., continuing to the next iteration of a loop will deallocate and reallocate automatic variables in the loop's body). Nor can you, as of now, create a second class at compile time that's guaranteed to be layout-compatible with a given class, which is the main application I can think of for the hypothetical "query alignment of non-static data member" feature. I expect that, once C++ provides enough other reflection functionality so that something like that is close to possible, someone will also put forth a realistic proposal to add a way to query the alignment of a non-static data member.

How does a union determine max size from a list of objects?

I am not sure the question is well put: I understand how to do this, but I don't know how to phrase the part I don't understand. Here it is:
I have some classes:
class Animal {};
class Rabbit : public Animal {};
class Horse : public Animal {};
class Mouse : public Animal {};
class Pony : public Horse {};
My goal was to find the maximum size among these object types in order to use it for memory allocation afterwards. I stored the sizeof of each type in an array and then took the max of the array. My superior (to whom I send the code for review) suggested that I use a union to find the maximum size at compile time. The idea seemed very nice to me, so I did it like this:
typedef union
{
Rabbit rabbitObject;
Horse horseObject;
Mouse mouseObject;
Pony ponyObject;
} Size;
... because a union allocates memory according to its largest member.
The next suggestion was to do it like this:
typedef union
{
unsigned char RabbitObject[sizeof(Rabbit)];
unsigned char HorseObject[sizeof(Horse)];
unsigned char MouseObject[sizeof(Mouse)];
unsigned char PonyObject[sizeof(Pony)];
} Interesting;
My question is:
How does the Interesting union get the maximum object size? To me, it makes no sense to create arrays of unsigned char of length sizeof(class) inside it. Why would the second option solve the problem when the previous union doesn't?
What is happening behind the scenes that I am missing?
PS: Circumstances are such that I cannot ask the guy personally.
Thank you in advance
The assumptions are incorrect, and the question is moot. The standard does not require the size of a union to equal the size of its largest member. Instead, it requires the union to be large enough to hold the largest member, which is not the same thing at all. Both solutions are flawed if the size of the largest class needs to be known exactly.
Instead, something like this should be used:
template<class L, class Y, class... T> struct max_size
: std::integral_constant<size_t, std::max(sizeof (L), max_size<Y, T...>::value)> { };
template<class L, class Y> struct max_size<L, Y>
: std::integral_constant<size_t, std::max(sizeof (L), sizeof (Y))> { };
As @Caleth suggested below, it could be shortened using the initializer-list overload of std::max (and variable templates):
template<class... Ts>
constexpr size_t max_size_v = std::max({sizeof(Ts)...});
The two approaches provide a way to find a maximum size that all of the objects of the union will fit within. I would prefer the first as it is clearer as to what is being done and the second provides nothing that the first does not for your needs.
And the first, a union composed of the various classes, offers the ability to access a specific member of the union as well.
See also Is a struct's address the same as its first member's address?, as well as sizeof a union in C/C++ and Anonymous union and struct [duplicate].
For some discussions on memory layout of classes see the following postings:
Structure of a C++ Object in Memory Vs a Struct
How is the memory layout of a class vs. a struct
memory layout C++ objects [closed]
C++ Class Memory Model And Alignment
What does an object look like in memory? [duplicate]
C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?
Since the compiler is free to add to the sizes of the various components in order to align variables on particular memory address boundaries, the size of the union may be larger than the actual size of the data. Some compilers offer a pragma or other type of directive to instruct the compiler as to whether packing of the class, struct, or union members should be done or not.
The size reported by sizeof is the size of the variable or type specified; again, this may include additional unused memory that pads the variable to the next desirable memory address alignment. See Why isn't sizeof for a struct equal to the sum of sizeof of each member?.
Typically a class, struct, or union is sized so that if an array of the type is created then each element of the array will begin on the most useful memory alignment such as a double word memory alignment for an Intel x86 architecture. This padding is typically on the end of the variable.
Your superior suggested that you use the array version because a union could have padding. For instance, if you have
union padding {
char arr[sizeof (double) + 1];
double d;
};
then this could be of size sizeof(double) + 1, or it could be sizeof(double) * 2, as the union could be padded to keep it aligned for doubles.
However if you have
union padding {
char arr[sizeof(double) + 1];
char d[sizeof(double)];
};
then the union need not be double-aligned, and it most likely has a size of sizeof(double) + 1. This is not guaranteed, though: the size can be greater than that of its largest member.
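The two unions above can be compared directly. The sketch below renames them WithDouble and BytesOnly (hypothetical names, since both are called padding in the text) and only asserts what is guaranteed; the actual sizes are platform-dependent, so no expected output is given.

```cpp
#include <cstddef>
#include <iostream>

union WithDouble {
    char arr[sizeof(double) + 1];
    double d;  // forces the union to be aligned for double
};

union BytesOnly {
    char arr[sizeof(double) + 1];
    char d[sizeof(double)];  // same byte count as a double, but char alignment
};

// Guaranteed: a union is at least as large as its largest member,
// and its size is a multiple of its own alignment.
static_assert(sizeof(WithDouble) >= sizeof(double) + 1, "must hold the largest member");
static_assert(sizeof(BytesOnly) >= sizeof(double) + 1, "must hold the largest member");
static_assert(sizeof(WithDouble) % alignof(WithDouble) == 0, "size is a multiple of alignment");

void print_union_sizes() {
    std::cout << "WithDouble: size " << sizeof(WithDouble)
              << ", alignment " << alignof(WithDouble) << '\n';
    std::cout << "BytesOnly:  size " << sizeof(BytesOnly)
              << ", alignment " << alignof(BytesOnly) << '\n';
}
```

Any difference between the two printed sizes is tail padding added to keep arrays of WithDouble aligned.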
If you want to be sure to get the largest size, I would suggest using
auto max_size = std::max({sizeof(Rabbit), sizeof(Horse), sizeof(Mouse), sizeof(Pony)});
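Since the stated goal is to allocate memory afterwards, the maximum alignment matters as well as the maximum size. Here is a minimal C++14 sketch; the data members given to the classes, the max_align_v template and the AnimalBuffer type are assumptions added for illustration and do not appear in the question.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical members, added only so that the sizes differ.
struct Animal {};
struct Rabbit : Animal { int ears[2]; };
struct Horse : Animal { double stride[3]; };
struct Mouse : Animal { char tail; };
struct Pony : Horse { int height; };

// Largest size and strictest alignment over a list of types.
template <class... Ts>
constexpr std::size_t max_size_v = std::max({sizeof(Ts)...});
template <class... Ts>
constexpr std::size_t max_align_v = std::max({alignof(Ts)...});

// Raw storage that can hold any one of the listed types.
struct AnimalBuffer {
    alignas(max_align_v<Rabbit, Horse, Mouse, Pony>)
        unsigned char bytes[max_size_v<Rabbit, Horse, Mouse, Pony>];
};

static_assert(sizeof(AnimalBuffer) >= sizeof(Pony), "buffer must fit the largest type");
static_assert(alignof(AnimalBuffer) >= alignof(Horse), "buffer must satisfy every alignment");
```

An object would then be created in the buffer with placement new; unlike the union approach, max_size_v gives the exact maximum without any tail padding.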

alignment when referring to structs vs variables

When we talk about alignment, we always refer to variables inside a struct and not to single variables.
Could you please tell me why that is?
When we refer to a single variable, would it take up the whole size of a "word"?
Most likely because if you're messing with variable alignment you are either in the world of low-level optimisation (based on cache line estimations and whatnot) or doing some embedded programming.
Most developers don't do that, and the ones who do are more likely to read their platform specs and rethink the alignment principles that they probably already know than to discuss it on the internet (there are exceptions of course; it's just not the general tendency).
I have yet to see a variable alignment which is not a derivative of:
// the array "cacheline" will be aligned to 128-byte boundary
alignas(128) char cacheline[128];
On the other hand, you don't need very specific situations to see the impact of aggregate (struct) alignment on a program.
This is something a beginner will write and question at some point or another:
#include <iostream>
struct no_align
{
char c;
double d;
int i;
};
struct align
{
double d;
int i;
char c;
};
int main(void)
{
no_align no_align_array[100];
align align_array[100];
std::cout << sizeof(no_align_array) << std::endl;
std::cout << sizeof(align_array) << std::endl;
}
On my machine the result is:
2400
1600
And that's the point where you'll go around the internet asking why in the world one version uses 800 more bytes than the other, if no teacher ever explained that to you.
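To see where the extra bytes go, the member offsets can be printed with offsetof. This is a small sketch reusing the two structs from the example; print_layouts is a hypothetical helper, and the numbers it prints depend on the platform, so no output is shown.

```cpp
#include <cstddef>
#include <iostream>

struct no_align {
    char c;
    double d;
    int i;
};

struct align {
    double d;
    int i;
    char c;
};

// Guaranteed on every platform: the size is a multiple of the alignment,
// which is what lets array elements sit back to back.
static_assert(sizeof(no_align) % alignof(no_align) == 0, "size is a multiple of alignment");
static_assert(sizeof(align) % alignof(align) == 0, "size is a multiple of alignment");

// Prints each member's offset; gaps between offsets are padding.
void print_layouts() {
    std::cout << "no_align: c@" << offsetof(no_align, c)
              << " d@" << offsetof(no_align, d)
              << " i@" << offsetof(no_align, i)
              << " size=" << sizeof(no_align) << '\n';
    std::cout << "align:    d@" << offsetof(align, d)
              << " i@" << offsetof(align, i)
              << " c@" << offsetof(align, c)
              << " size=" << sizeof(align) << '\n';
}
```

In no_align, padding typically appears both after c (so that d is aligned) and after i (so that the next array element is aligned); ordering members from largest to smallest alignment usually leaves only the tail padding.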
Every type has a size, which is fixed, and an alignment requirement.
A struct has members with their own alignment requirements. As a logical consequence, a struct must have an alignment requirement at least as strong as those of all its members. A struct may have to add padding so that all its members meet their alignment requirements.
An array stores multiple array elements consecutively without any padding. As a logical consequence, the size of any type must be a multiple of its alignment requirement (so a struct containing an int and a char cannot have an alignment requirement of four bytes and a size of five bytes, because that wouldn't work for the second array element in an array of two such structs).
Variables need to have addresses so their alignment requirements are satisfied, so your first sentence is wrong.
However, there is the "as-if" rule: Normally, the compiler has to do what the language tells it. But the "as-if" rule says that the compiler can do whatever it wants to do as long as a program cannot find the difference. So if storing an int on an unaligned address makes no difference (except maybe a tiny cost in time), the compiler is allowed to do this.

Casting double array to a struct of doubles

Is it OK to cast a double array to a struct made of doubles?
struct A
{
double x;
double y;
double z;
};
int main (int argc , char ** argv)
{
double arr[3] = {1.0,2.0,3.0};
A* a = static_cast<A*>(static_cast<void*>(arr));
std::cout << a->x << " " << a->y << " " << a->z << "\n";
}
This prints 1 2 3. But is it guaranteed to work every time with any compiler?
EDIT: According to
9.2.21: A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (...) and vice versa.
if I replace my code with
struct A
{
double & x() { return data[0]; }
double & y() { return data[1]; }
double & z() { return data[2]; }
private:
double data[3];
};
int main (int, char **)
{
double arr[3] = {1.0,2.0,3.0};
A* a = reinterpret_cast<A*>(arr);
std::cout << a->x() << " " << a->y() << " " << a->z() << "\n";
}
then it is guaranteed to work. Correct? I understand that many people would not find this aesthetically pleasing, but there are advantages in working with a struct and not having to copy the input array data. I can define member functions in that struct to compute scalar and vector products, distances, etc., which will make my code much easier to understand than if I work with arrays.
How about
int main (int, char **)
{
double arr[6] = {1.0,2.0,3.0,4.0,5.0,6.0};
A* a = reinterpret_cast<A*>(arr);
std::cout << a[0].x() << " " << a[0].y() << " " << a[0].z() << "\n";
std::cout << a[1].x() << " " << a[1].y() << " " << a[1].z() << "\n";
}
Is this also guaranteed to work or the compiler could put something AFTER the data members so that sizeof(A) > 3*sizeof(double)? And is there any portable way to prevent the compiler from doing so?
No, it's not guaranteed.
The only thing prohibiting any compiler from inserting padding between x and y, or between y and z is common sense. There is no rule in any language standard that would disallow it.
Even if there is no padding, even if the representation of A is exactly the same as that of double[3], then it's still not valid. The language doesn't allow you to pretend one type is really another type. You're not even allowed to treat an instance of struct A { int i; }; as if it's a struct B { int i; };.
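If a copy is acceptable, one commonly used alternative is to copy the bytes instead of reinterpreting the array. The following is a sketch under stated assumptions, not a guarantee from the standard quoted here: it assumes A is trivially copyable and standard-layout, and it refuses to compile if the struct contains padding. The helper names from_array and check_from_array are hypothetical.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

struct A {
    double x;
    double y;
    double z;
};

// Refuse to compile if the assumptions behind the byte copy do not hold.
static_assert(std::is_trivially_copyable<A>::value, "A must be trivially copyable");
static_assert(std::is_standard_layout<A>::value, "A must be standard-layout");
static_assert(sizeof(A) == 3 * sizeof(double), "A contains padding on this platform");

// Builds an A from three doubles by copying bytes rather than by
// pretending that the array is an A.
A from_array(const double (&values)[3]) {
    A result;
    std::memcpy(&result, values, sizeof(result));
    return result;
}

void check_from_array() {
    const double arr[3] = {1.0, 2.0, 3.0};
    const A a = from_array(arr);
    assert(a.x == 1.0 && a.y == 2.0 && a.z == 3.0);
}
```

The static_assert on sizeof(A) also answers the padding question in a practical way: there is no portable way to forbid padding, but its presence can be detected at compile time.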
The standard gives few guarantees about the memory layout of objects.
For classes/structs:
9.2./15: Nonstatic data members of a class with the same access control are allocated so that later members have higher addresses
within a class object. The order of allocation of non-static data
members with different access control is unspecified. Implementation
alignment requirements might cause two adjacent members not to be allocated immediately after each other; so might requirements for
space for managing virtual functions and virtual base classes.
For arrays, the elements are contiguous. Nothing is said about alignment, so an array may or may not use the same alignment rules as a struct:
8.3.4: An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.
The only thing you can be sure of in your specific example is that a.x corresponds to arr[0], if using a reinterpret_cast:
9.2.21: A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its initial member (...) and vice versa.
No, it is not guaranteed, even though it should work with all compilers I know on common architectures, because the C language specification says:
6.2.6 Representations of types
6.2.6.1 General, paragraph 1: The representations of all types are unspecified except as stated in this subclause. And it says nothing about the default padding in a struct.
Of course, common architectures use words of at most 64 bits, which is the size of a double on those architectures, so there should be no padding and your conversion should work.
But beware: you are explicitly invoking undefined behaviour, and the next generation of compilers could do anything when compiling such a cast.
From all I know the answer is: yes.
The only thing that could throw you off is a #pragma directive with some very unusual alignment setting for the struct. If for example a double takes 8 bytes on your machine and the #pragma directive tells to align every member on 16-byte boundaries that could cause problems. Other than that you are fine.
The std::complex implementation in MSVC uses the array solution, and LLVM's libc++ uses the former form.
I think you should just check the std::complex implementation of your standard library and use the same solution.
I disagree with the consensus here. A struct with three doubles in it, is exactly the same as an array with 3 doubles in it. Unless you specifically pack the struct differently and are on a weird processor that has an odd number of bytes for doubles.
It's not built into the language, but I would feel safe in doing it. Style wise I wouldn't do it, because it's just confusing.

Why is the size of my class zero? How can I ensure that different objects have different address?

I created a class but its size is zero. Now, how can I be sure that all objects have different addresses? (As we know, empty classes have a non-zero size.)
#include<cstdio>
#include<iostream>
using namespace std;
class Test
{
int arr[0]; // Why is the size zero?
};
int main()
{
Test a,b;
cout <<"size of class"<<sizeof(a)<<endl;
if (&a == &b)// now how we ensure about address of objects ?
cout << "impossible " << endl;
else
cout << "Fine " << endl;//Why isn't the address the same?
return 0;
}
Your class definition is illegal. C++ does not allow array declarations with size 0 in any context. But even if you make your class definition completely empty, the sizeof is still required to evaluate to a non-zero value.
9/4 Complete objects and member subobjects of class type shall have
nonzero size.
In other words, if your compiler accepts the class definition and evaluates the above sizeof to zero, that compiler is going outside of scope of standard C++ language. It must be a compiler extension that has no relation to standard C++.
So, the only answer to the "why" question in this case is: because that's the way it is implemented in your compiler.
I don't see what it all has to do with ensuring that different objects have different addresses. The compiler can easily enforce this regardless of whether object size is zero or not.
The standard says that having an array of zero size causes undefined behavior. When you trigger undefined behavior, other guarantees that the standard provides, such as requiring that objects be located at a different address, may not hold.
Don't create arrays of zero size, and you shouldn't have this problem.
This is largely a repetition of what the other answers have already said, but with a few more references to the ISO C++ standard and some musings about the odd behavior of g++.
The ISO C++11 standard, in section 8.3.4 [dcl.array], paragraph 1, says:
If the constant-expression (5.19) is present, it shall be an
integral constant expression and its value shall be greater than zero.
Your class definition:
class Test
{
int arr[0];
};
violates this rule. Section 1.4 [intro.compliance] applies here:
If a program contains a violation of any diagnosable rule [...], a
conforming implementation shall issue at least one diagnostic message.
As I understand it, if a compiler issues this diagnostic and then accepts the program, the program's behavior is undefined. So that's all the standard has to say about your program.
Now it becomes a question about your compiler rather than about the language.
I'm using g++ version 4.7.2, which does permit zero-sized arrays as an extension, but prints the required diagnostic (a warning) if you invoke it with, for example, -std=c++11 -pedantic:
warning: ISO C++ forbids zero-size array ‘arr’ [-pedantic]
(Apparently you're also using g++.)
Experiment shows that g++'s treatment of zero-sized arrays is a bit odd. Here's an example, based on the one in your program:
#include <iostream>
class Empty {
/* This is valid C++ */
};
class Almost_Empty {
int arr[0];
};
int main() {
Almost_Empty arr[2];
Almost_Empty x, y;
std::cout << "sizeof (Empty) = " << sizeof (Empty) << "\n";
std::cout << "sizeof (Almost_Empty) = " << sizeof (Almost_Empty) << "\n";
std::cout << "sizeof arr[0] = " << sizeof arr[0] << '\n';
std::cout << "sizeof arr = " << sizeof arr << '\n';
if (&x == &y) {
std::cout << "&x == &y\n";
}
else {
std::cout << "&x != &y\n";
}
if (&arr[0] == &arr[1]) {
std::cout << "&arr[0] == &arr[1]\n";
}
else {
std::cout << "&arr[0] != &arr[1]\n";
}
}
I get the required warning on int arr[0];, and then the following run-time output:
sizeof (Empty) = 1
sizeof (Almost_Empty) = 0
sizeof arr[0] = 0
sizeof arr = 0
&x != &y
&arr[0] == &arr[1]
C++ requires a class, even one with no members, to have a size of at least 1 byte. g++ follows this requirement for class Empty, which has no members. But adding a zero-sized array to a class actually causes the class itself to have a size of 0.
If you declare two objects of type Almost_Empty, they have distinct addresses, which is sensible; the compiler can allocate distinct objects any way it likes.
But for elements in an array, a compiler has less flexibility: an array of N elements must have a size of N times the size of one element.
In this case, since class Almost_Empty has a size of 0, it follows that an array of Almost_Empty elements has a size of 0 and that all elements of such an array have the same address.
This does not indicate that g++ fails to conform to the C++ standard. It's done its job by printing a diagnostic (even though it's a non-fatal warning); after that, as far as the standard is concerned, it's free to do whatever it likes.
But I would probably argue that it's a bug in g++. Just in terms of common sense, adding an empty array to a class should not make the class smaller.
But there is a rationale for it. As DyP points out in a comment, the gcc manual (which covers g++) mentions this feature as a C extension which is also available for C++. Zero-length arrays are intended primarily to be used as the last member of a structure that's really a header for a variable-length object. This is known as the struct hack. It's replaced in C99 by flexible array members, and in C++ by container classes.
My advice: Avoid all this confusion by not defining zero-length arrays. If you really need sequences of elements that can be empty, use one of the C++ standard container classes such as std::vector or std::array.
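For comparison, here is a minimal sketch of the standard-conforming alternative, using std::array<int, 0> in place of int arr[0]. The names Empty, HoldsEmptyArray, elements_have_distinct_addresses and check_distinct_addresses are made up for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct Empty {};

// Standard C++ replacement for a class holding `int arr[0]`.
struct HoldsEmptyArray {
    std::array<int, 0> none;
};

// Complete objects of class type always have nonzero size.
static_assert(sizeof(Empty) >= 1, "empty classes still occupy storage");
static_assert(sizeof(HoldsEmptyArray) >= 1, "an empty std::array member does not change that");

// Returns true if two adjacent array elements have different addresses.
bool elements_have_distinct_addresses() {
    HoldsEmptyArray items[2];
    static_assert(sizeof(items) == 2 * sizeof(HoldsEmptyArray),
                  "array size is N times the element size");
    return &items[0] != &items[1];
}

void check_distinct_addresses() {
    assert(elements_have_distinct_addresses());
}
```

Because the element size is nonzero, consecutive elements cannot share an address, which is exactly the guarantee that the zero-sized extension gives up.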
There is a difference between variable declaration and variable initialization. In your case, you just declare variables; A and B. Once you have declared a variable, you need to initialize it using either NEW or MALLOC.
The initialization will now allocate memory to the variables that you just declared. You can initialize the variable to an arbitrary size or block of memory.
A and B are both variables meaning you have created two variables A and B. The compiler will identify this variable as unique variables, it will then allocate A to a memory address say 2000 and then allocate B to another memory address say 150.
If you want A to point to B or B to point to A, you can make a reference to A or B such as;
A = &B. Now A holds a memory reference (address) to B, or rather A points to B. This is called passing variables; in C++ you can either pass variables by reference or pass variables by value.