Array of class holding an array memory layout - c++

If we have a class which holds an array, let's call it vector and hold the values in a simple array called data:
class vector
{
public:
double data[3];
<...etc..>
};
Note: I call it vector only to make the explanation clearer; it is not std::vector!
So my question is: if the class stores only some typedefs and some constexpr members next to this array, am I correct that an object of the class will be just 3 consecutive doubles in memory?
And then if I create an array of vectors like:
vector vl[3];
Note: the size of the array is not always known at compile time; I just use 3 for the example.
then in memory it will be just 9 consecutive doubles, right?
So will vl[0].data[3] always return the 2nd vector's 1st element? And in this case, is it guaranteed that the result is always laid out like a simple array in memory?
I only found cases with arrays of arrays, but not with an array of classes holding an array, and I'm not sure whether it ends up exactly the same. I ran some tests and it seems to work as I expected, but I don't know if it is always true.
Thank you!

Mostly, yes.
The standard doesn't promise that there never is anything after data in the representation of a vector, but all the implementations that I know of won't add any padding in this case.
What is promised is that there is no padding before data in the representation of vector, because it is a standard-layout type.

You are right with your first example: The class layout is like a C struct. The first member resides at the address of the struct itself, and if it is an array, all the array's members are adjacent.
Between struct members, however, there may be padding, so there is no guarantee that the size of a struct is the sum of all member sizes. I'd have to dig into the standard, but I assume this includes padding at the end. This answer affirms that; assert(sizeof(vector) == 3*sizeof(double)) may not hold. In reality, I'd assume that an implementation may pad a struct containing three chars so that the struct aligns at word boundaries in an array, but not three doubles, which are typically the type with the strongest alignment requirements. But there is no guarantee across implementations, architectures and compiler options: imagine we switch to 128-bit CPUs.
With respect to your second example: The above applies recursively, so the standard gives no guarantee that the 9 doubles will be adjacent. On the other hand, I bet they will be, and the program can assert it with a simple compile-time static_assert.
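The assumptions discussed in these answers can be written down as checks. This is only a minimal sketch: vec3 is a hypothetical stand-in for the question's class, and is_flat is an illustrative helper, not anything from the original posts. Note that even when all checks pass, an expression such as vl[0].data[3] still indexes past the end of a 3-element array, which the language does not define; the checks only confirm where the bytes sit.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the question's class (not std::vector).
struct vec3 {
    double data[3];
};

// Guaranteed for a standard-layout class: no padding before the first member.
static_assert(std::is_standard_layout<vec3>::value, "vec3 should be standard-layout");
static_assert(offsetof(vec3, data) == 0, "data should start at the object's address");

// Not guaranteed by the standard, so check it: no padding after data.
static_assert(sizeof(vec3) == 3 * sizeof(double), "unexpected padding inside vec3");

// Returns true if element j of vl[i] sits at byte offset (3*i + j) * sizeof(double)
// from the start of the array, i.e. the storage looks like one flat double array.
inline bool is_flat(const vec3* vl, std::size_t count) {
    const char* base = reinterpret_cast<const char*>(vl);
    for (std::size_t i = 0; i < count; ++i) {
        for (std::size_t j = 0; j < 3; ++j) {
            const char* p = reinterpret_cast<const char*>(&vl[i].data[j]);
            if (static_cast<std::size_t>(p - base) != (3 * i + j) * sizeof(double)) {
                return false;
            }
        }
    }
    return true;
}
```

If the size check ever fails on some compiler, the build stops instead of silently producing a different layout.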

Related

Why does std::vector<bool> have no .data()?

The specialisation of std::vector<bool>, as specified in C++11 23.3.7/1, doesn't declare a data() member.
The question is: Why does a std::vector<bool> have no .data()? This is the very same question as why is a vector of bools not stored contiguously in memory. What are the benefits in not doing so?
Why can a pointer to an array of bools not be returned?
Why does a std::vector<bool> have no .data()?
Because std::vector<bool> stores multiple values in 1 byte.
Think of it as a compressed storage system, where every boolean value needs only 1 bit. So, instead of one element per memory block (one element per array cell), each byte holds several elements, one per bit.
Assuming that you want to index a block to get a value, how would you use operator[]? It can't return bool& (a bool& would refer to a whole byte, which stores more than one bool), and for the same reason you cannot obtain a bool* to an element. In other words, bool *bool_ptr = &v[0]; is not valid code and results in a compilation error.
Moreover, a conforming implementation is allowed to skip the memory optimization (compression). So what data() does would depend on the implementation: it would either have to copy to the expected return type, or the standard would have to force the optimization instead of just allowing it.
Why can a pointer to an array of bools not be returned?
Because std::vector<bool> is not stored as an array of bools, no pointer can be returned in a straightforward way. It could copy the data to an array and return that array, but it's a design choice not to do that (if it did, I would expect it to work like data() does for all other containers, which would be misleading).
What are the benefits in not doing so?
Memory optimization.
Usually 8 times less memory usage, since it stores multiple bits in a single byte. To be exact, CHAR_BIT times less.
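To make the proxy behaviour concrete, here is a small sketch; flip_first is a hypothetical helper name, and v is assumed to be non-empty. It shows that operator[] on std::vector<bool> hands back a class-type proxy (std::vector<bool>::reference) rather than bool&, which is why there is nothing a bool* could point at.

```cpp
#include <cassert>
#include <type_traits>
#include <vector>

// For ordinary element types, reference is a real T&.
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "vector<int> yields int&");

// For bool it is a proxy class standing in for a single bit.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> does not yield bool&");

// Toggles the first element through the proxy and returns its new value.
// Assumes v is not empty.
inline bool flip_first(std::vector<bool>& v) {
    std::vector<bool>::reference bit = v[0];  // proxy object, not bool&
    bit.flip();                               // writes back into the packed storage
    return v[0];
}
```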

Cast A primitive type pointer to A structure pointer - Alignment and Padding?

Just 20 minutes ago, when I answered a question, I came up with an interesting scenario whose behavior I'm not sure of:
Let me have an integer array of size n, pointed to by intPtr;
int* intPtr;
and let me also have a struct like this:
typedef struct {
int val1;
int val2;
// ...followed by more (or fewer) int members like these, and no other types
}intStruct;
My question is: if I do a cast intStruct* structPtr = (intStruct*) intPtr;
can I be sure to get every element correctly if I traverse the elements of the struct? Is there any possibility of misalignment (possibly because of padding) on any architecture/compiler?
The standard is fairly specific that even a POD-struct (which is, I believe the most restrictive class of structs) can have padding between members. ("There might therefore be unnamed padding within a POD-struct object, but not at its beginning, as necessary to achieve appropriate alignment." -- a non-normative note, but still makes the intent quite clear).
For example, contrast the requirements for a standard-layout struct (C++11, §1.8/4):
An object of trivially copyable or standard-layout type (3.9) shall occupy contiguous bytes of storage.
...with those for an array (§8.3.4/1):
An object of array type contains a contiguously allocated non-empty set of N subobjects of type T.
In the array, the elements themselves are required to be allocated contiguously, whereas in the struct, only the storage is required to be contiguous.
The third possibility that might make the "contiguous storage" requirement make more sense would be to consider a struct/class that is not trivially copyable or standard layout. In this case, it's possible that the storage might not be contiguous at all. For example, an implementation might set aside one area of memory for holding all the private variables, and an entirely separate area of memory to hold all the public variables. To make that a little more concrete, consider two definitions like:
class A {
int a;
public:
int b;
} a;
class B {
int x;
public:
int y;
} b;
With these definitions, the memory might be laid out something like:
a.a;
b.x;
// ... somewhere else in memory entirely:
a.b;
b.y;
In this case, neither the elements nor the storage needs to be contiguous, so interleaving parts of entirely separate structs/classes is allowable.
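Whether a class still qualifies for the standard-layout guarantees can be checked with std::is_standard_layout (C++11). The following sketch uses two hypothetical classes: MixedAccess has the same shape as class A above, and SameAccess is a counterpart whose members share one access level.

```cpp
#include <cassert>
#include <type_traits>

// Same shape as class A above: data members with different access control.
class MixedAccess {
    int a;
public:
    int b;
};

// Hypothetical counterpart: all data members have the same access control.
struct SameAccess {
    int a;
    int b;
};

// Mixing access levels among non-static data members drops the standard-layout property.
static_assert(!std::is_standard_layout<MixedAccess>::value,
              "mixed access control is not standard-layout");
static_assert(std::is_standard_layout<SameAccess>::value,
              "uniform access control is standard-layout");
```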
That said, the first element must be at the same address as the struct as a whole (9.2/17): "A pointer to a POD-struct object, suitably converted using a reinterpret_cast, points to its initial member (or if that member is a bit-field, then to the unit in which it resides) and vice versa."
In your case, you have a POD-struct, so the guarantee from §9.2/17 applies. Since the first member must be aligned, and the remaining members are all of the same type, it's impossible for any padding to be truly necessary between the other members (i.e., except for bit-fields, any type you can put in a struct you can also put in an array, where contiguous allocation of the elements is required). If you have elements smaller than a word on a word-oriented machine, it's possible that padding could make access somewhat simpler, though. For example, early DEC Alphas (at the hardware level) were only capable of reading/writing an entire (64-bit) word at a time. As such, let's consider something like a struct of four char elements:
struct foo {
char a, b, c, d;
};
If these had to be laid out contiguously in memory, accessing foo::b (for example) would require the CPU to load the word, shift it 8 bits right, then mask to zero-extend that byte to fill the entire register.
Storing would be even worse -- the CPU would have to load the current value of the whole word, mask out the current contents of the appropriate char-sized piece of that, shift the new value to the correct place, OR it into the word, and finally store the result.
By contrast, with padding between the elements, each of those becomes a simple load/store, with no shifting, masking, etc.
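The load and store sequences described here can be sketched in portable code. This is only an illustration of the shift-and-mask work, not a model of any particular Alpha compiler's output; load_byte and store_byte are hypothetical names, and byte 0 is assumed to be the least significant byte of the word.

```cpp
#include <cassert>
#include <cstdint>

// Reads byte `index` (0 = least significant) out of a 64-bit word:
// shift the wanted byte down, then mask off everything above it.
inline std::uint8_t load_byte(std::uint64_t word, unsigned index) {
    return static_cast<std::uint8_t>((word >> (8 * index)) & 0xFFu);
}

// Replaces byte `index` of a 64-bit word and returns the new word:
// clear the old byte, shift the new value into place, OR it in.
inline std::uint64_t store_byte(std::uint64_t word, unsigned index, std::uint8_t value) {
    const std::uint64_t mask = static_cast<std::uint64_t>(0xFFu) << (8 * index);
    return (word & ~mask) | (static_cast<std::uint64_t>(value) << (8 * index));
}
```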
At least if memory serves, with DEC's normal compiler for the Alpha, int was 32 bits, and long was 64 bits (it predated long long). As such, with your struct of four ints, you could have expected to see another 32 bits of padding between the elements (and another 32 bits after the last element as well).
Given that you do have a POD-struct, you still have some possibilities though. The one I'd probably prefer would be to use offsetof to get the offsets of the members of the struct, create an array of them, and access the members via those offsets. I showed how to do this in a couple of previous answers.
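A minimal sketch of the offsetof idea, under some assumptions: a third member val3 is added purely for illustration, and kOffsets and member_at are hypothetical names. Each access goes through the recorded offset, so any padding the compiler inserted between members is skipped automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The question's struct, with a hypothetical third member added for illustration.
struct intStruct {
    int val1;
    int val2;
    int val3;
};

// Byte offset of each member, in declaration order.
static const std::size_t kOffsets[] = {
    offsetof(intStruct, val1),
    offsetof(intStruct, val2),
    offsetof(intStruct, val3),
};

// Reads the i-th member (i must be less than 3) through its recorded offset.
inline int member_at(const intStruct& s, std::size_t i) {
    int value;
    std::memcpy(&value, reinterpret_cast<const char*>(&s) + kOffsets[i], sizeof value);
    return value;
}
```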
Strictly speaking, such pointer casts aren't allowed and lead to undefined behavior.
The main issue with the cast is however that the compiler is free to add any number of padding bytes anywhere inside a struct, except before the very first element. So whether it will work or not depends on the alignment requirements of the specific system, and also whether struct padding is enabled or not.
int is not necessarily of the same size as the optimal size for an addressable chunk of data, even though this is true for most 32-bit systems. Some 32-bitters don't care about misalignment, some will allow misalignment but produce less efficient code, and some must have the data aligned. In theory, 64-bitters may also want to add padding after an int (which will be 32 bit there) to get a 64-bit chunk, but in practice they support 32-bit instruction sets.
If you write code relying on this cast, you should add something like this:
static_assert(sizeof(intStruct) == sizeof(int) + sizeof(int), "intStruct must not contain padding");
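One way to sidestep the pointer cast entirely is to copy the ints into a struct object. The sketch below assumes the two-member version of intStruct; load_pair is a hypothetical helper name. The size check keeps the copy from silently reading the wrong bytes if a compiler ever did add padding.

```cpp
#include <cassert>
#include <cstring>

struct intStruct {
    int val1;
    int val2;
};

// Refuse to compile if the struct is not exactly two ints wide.
static_assert(sizeof(intStruct) == 2 * sizeof(int), "intStruct must not contain padding");

// Copies two consecutive ints starting at p into an intStruct.
// Assumes p points at at least two valid ints.
inline intStruct load_pair(const int* p) {
    intStruct s;
    std::memcpy(&s, p, sizeof s);
    return s;
}
```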
It is guaranteed to be legal, given that the element type is standard-layout. Note: all references in the following are to the c++11 standard.
8.3.4 Arrays [dcl.array]
1 - [...] An object of array type contains a contiguously allocated non-empty set of N subobjects of type T. [...]
Regarding a struct with N members of type T,
9.2 Class members [class.mem]
14 - Nonstatic data members of a (non-union) class with the same access control are allocated so
that later members have higher addresses within a class object. [...] Implementation alignment requirements might
cause two adjacent members not to be allocated immediately after each other [...]
20 - A pointer to a standard-layout struct object, suitably converted using a reinterpret_cast, points to its
initial member [...] and vice versa. [ Note:
There might therefore be unnamed padding within a standard-layout struct object, but not at its beginning,
as necessary to achieve appropriate alignment. —end note ]
So the question is whether any alignment-required padding within a struct could cause its members not to be contiguously allocated with respect to each other. The answer is:
1.8 The C++ object model [intro.object]
4 - [...] An object of trivially copyable or standard-layout type shall occupy contiguous bytes of storage.
In other words, a standard-layout struct a containing at least two members x, y of the same (standard-layout) type that does not respect the identity &a.y == &a.x + 1 is in violation of 1.8:4.
Note that alignment is defined as (3.11 Alignment [basic.align]) the number of bytes between successive addresses at which a given object can be allocated; it follows that alignment of a type T can be no greater than the distance between adjacent objects in an array of T, and (since 5.3.3 Sizeof [expr.sizeof] specifies that the size of an array of n elements is n times the size of an element) alignof(T) can be no greater than sizeof(T). Thus any additional padding between adjacent elements of a struct of the same type would not be required by alignment and so would not be countenanced by 9.2:14.
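The size and alignment relations used in this argument can be spelled out as compile-time checks. This is a sketch only: two_ints and gap_between_members are hypothetical names, and the gap function reports what one particular compiler does, which is an observation rather than a proof of the claim above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical struct with two members of the same type.
struct two_ints {
    int x;
    int y;
};

// The size of an array of n elements is n times the element size.
static_assert(sizeof(int[4]) == 4 * sizeof(int), "array size is n * element size");

// Consequently alignment cannot exceed size, and size is a multiple of alignment.
static_assert(alignof(int) <= sizeof(int), "alignment is no greater than size");
static_assert(sizeof(int) % alignof(int) == 0, "size is a multiple of alignment");

// Number of padding bytes this implementation places between x and y.
inline std::size_t gap_between_members() {
    return offsetof(two_ints, y) - (offsetof(two_ints, x) + sizeof(int));
}
```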
With regard to AProgrammer's point, I would interpret the language in 26.4 Complex numbers [complex.numbers] as requiring that the instantiations of std::complex<T> should behave as standard-layout types with regard to the position of their members, without being required to conform to all the requirements of standard-layout types.
The behavior there is almost certainly compiler-, architecture-, and ABI-dependent. However, if you're using gcc, you can make use of __attribute__((packed)) to force the compiler to pack struct members one after the other, without any padding. With that, the memory layout should match that of a flat array.
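A short sketch of the packed attribute, assuming GCC or a compatible compiler such as Clang (the attribute is an extension, not standard C++). A mixed char/int struct is used here because an all-int struct normally has no padding to remove; plain_pair and packed_pair are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Ordinary layout: the compiler may pad after c so that i is aligned.
struct plain_pair {
    char c;
    int i;
};

// GCC/Clang extension: members are placed back to back, with no padding.
struct __attribute__((packed)) packed_pair {
    char c;
    int i;
};

static_assert(sizeof(packed_pair) == sizeof(char) + sizeof(int), "packed struct has no padding");
static_assert(offsetof(packed_pair, i) == sizeof(char), "i follows c immediately");
static_assert(sizeof(plain_pair) >= sizeof(packed_pair), "packing never grows the struct");
```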
I found nothing that guarantees it is valid when I searched some time ago, and I did find an explicit guarantee for the case of std::complex<> in C++, which could have been formulated more easily if it were more generally true. So I doubt I missed something in my search (but absence of proof is hardly proof of absence, and the standard is sometimes obscure in its formulation).
The layout rules for C structs guarantee that the members of the struct are stored sequentially, in declaration order, just like the elements of a C array. So order cannot be a problem.
As for alignment: since you have only one data type (int), there is no scenario in which the compiler would need to add padding to align your data members, even though it is permitted to do so. The compiler can place padding before the struct (outside it, to align the struct itself), but it cannot add padding at the beginning of the struct's own storage. So if the compiler were to add padding in your situation,
Instead of this:
[4Byte int][4Byte int][4Byte int]...[4Byte int]
Your data structure would have to be stored like this:
[4Byte Data][4Byte Padding][4Byte Data]... which is unreasonable.
Overall, I think this cast should work with no problems in your situation, though I think it is bad practice to use it.

C++ alignment of multidimensional array structure

In my code, I have to consider an array of arrays, where the inner arrays are of a fixed dimension. In order to make use of STL algorithms, it is useful to actually store the data as array of arrays, but I also need to pass that data to a C library, which takes a flattened C-style array.
It would be great to be able to convert (i.e. flatten) the multi-dimensional array cheaply and in a portable way. I will stick to a very simple case, the real problem is more general.
struct my_inner_array { int data[3]; };
std::vector<my_inner_array> x(15);
Is
&(x[0].data[0])
a pointer to a contiguous block of memory of size 45*sizeof(int) containing the same entries as x? Or do I have to worry about alignment? I am afraid that this will work for me (at least for certain data types and inner array sizes) but that it is not portable.
1. Is this code portable?
2. If not, is there a way to make it work?
3. If not, do you have any suggestions what I could do?
4. Does it change anything at all if my_inner_array is not a POD struct, but contains some methods (as long as the class does not contain any virtual methods)?
1. Theoretically, no. The compiler may decide to add padding to my_inner_array. In practice, I don't see a reason why the compiler would add padding to a struct that has only an array in it. In such a case there's no alignment problem creating an array of such structs. You can use a compile-time assert:
typedef int my_inner_array_array[3];
BOOST_STATIC_ASSERT(sizeof(my_inner_array) == sizeof(my_inner_array_array));
4 If there are no virtual methods it shouldn't make any difference.
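With C++11 the same check can be written with static_assert instead of Boost. The sketch below also adds a fallback that does not depend on layout at all: flatten_copy is a hypothetical helper that copies the elements into a flat std::vector<int>, at the cost of one copy, for cases where the C library call must stay strictly portable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct my_inner_array {
    int data[3];
};

// Build fails if the compiler pads the struct beyond its three ints.
static_assert(sizeof(my_inner_array) == sizeof(int[3]), "my_inner_array has padding");

// Layout-independent fallback: copy every inner array into one flat vector.
inline std::vector<int> flatten_copy(const std::vector<my_inner_array>& x) {
    std::vector<int> flat;
    flat.reserve(x.size() * 3);
    for (std::size_t i = 0; i < x.size(); ++i) {
        flat.insert(flat.end(), x[i].data, x[i].data + 3);
    }
    return flat;
}
```

The flat vector's data() can then be handed to the C library without relying on how my_inner_array is laid out.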

Questions on usages of sizeof

Question 1
I have a struct like,
struct foo
{
int a;
char c;
};
When I say sizeof(foo), I am getting 8 on my machine. As per my understanding, 4 bytes for int, 1 byte for char and 3 bytes for padding. Is that correct? Given a struct like the above, how will I find out how many bytes will be added as padding?
Question 2
I am aware that sizeof can be used to calculate the size of an array. Mostly I have seen the usage like (foos is an array of foo)
sizeof(foos)/sizeof(*foos)
But I found that the following will also give same result.
sizeof(foos) / sizeof(foo)
Is there any difference in these two? Which one is preferred?
Question 3
Consider the following statement.
foo foos[] = {10,20,30};
When I do sizeof(foos) / sizeof(*foos), it gives 2. But the array has 3 elements. If I change the statement to
foo foos[] = {{10},{20},{30}};
it gives correct result 3. Why is this happening?
Any thoughts..
Answer 1
Yes - your calculation is correct. On your machine, sizeof(int) == 4, and int must be 4-byte aligned.
You can find out about the padding by manually adding the sizes of the base elements and subtracting that from the size reported by sizeof(). You can predict the padding if you know the alignment requirements on your machine. Note that some machines are quite fussy and give SIGBUS errors when you access misaligned data; others are more lax but slow you down when you access misaligned data (and they might support '#pragma pack' or something similar). Often, a basic type has a size that is a power of 2 (1, 2, 4, 8, 16) and an n-byte type like that must be n-byte aligned. Also, remember that structures have to be padded so that an array of structures will leave all elements properly aligned. That means the structure will normally be padded up to a multiple of the size of the most stringently aligned member in the structure.
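The subtraction described here, plus offsetof to see where the padding sits, might look like the sketch below. The three function names are hypothetical. On the asker's machine (sizeof(foo) == 8, sizeof(int) == 4) they would give 3, 0 and 3 respectively; other platforms may differ.

```cpp
#include <cassert>
#include <cstddef>

struct foo {
    int a;
    char c;
};

// Total padding: size of the struct minus the sizes of its members.
inline std::size_t total_padding() {
    return sizeof(foo) - (sizeof(int) + sizeof(char));
}

// Padding between a and c (a starts at offset 0 in a standard-layout struct).
inline std::size_t padding_before_c() {
    return offsetof(foo, c) - sizeof(int);
}

// Padding after c, added so that every foo in an array stays aligned.
inline std::size_t padding_after_c() {
    return sizeof(foo) - (offsetof(foo, c) + sizeof(char));
}
```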
Answer 2
Generally, a variant on the first is better; it remains correct when you change the base type of the array from a 'foo' to a 'foobar'. The macro I customarily use is:
#define DIM(x) (sizeof(x)/sizeof(*(x)))
Other people have other names for the same basic operation - and you can put the name I use down to pollution from the dim and distant past and some use of BASIC.
As usual, there are caveats. Most notably, you can't apply this meaningfully to array arguments to a function or to a dynamically allocated array (using malloc() et al or new[]); you have to apply it to the actual definition of an array. Normally the value is a compile-time constant. Under C99, it could be evaluated at runtime if the array is a VLA (variable-length array).
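For C++ code, one possible alternative to the macro is a function template that only accepts real arrays, so passing a pointer fails to compile instead of giving a wrong count. This is a sketch: array_size is a hypothetical name (C++17 offers std::size for the same purpose), shown next to the DIM macro from the answer.

```cpp
#include <cassert>
#include <cstddef>

// The macro from the answer: works for arrays, silently wrong for pointers.
#define DIM(x) (sizeof(x) / sizeof(*(x)))

// Template variant: deduces N from the array type, rejects pointers at compile time.
template <typename T, std::size_t N>
constexpr std::size_t array_size(const T (&)[N]) {
    return N;
}
```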
Answer 3
Because of the way initialization works when you don't have enough braces. Your 'foo' structure must have two elements. The 10 and the 20 are allocated to the first row; the 30 and an implicit 0 are supplied to the second row. Hence the size is two. When you supply the sub-braces, then there are 3 elements in the array, the first components of which have the values 10, 20, 30 and the second components all have zeroes.
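The effect of the missing braces can be checked directly. In this sketch the array names elided and braced are hypothetical; the comments show how the initializers are distributed over the members in each case. Compilers may warn about the missing braces in the first form.

```cpp
#include <cassert>

struct foo {
    int a;
    char c;
};

// Braces elided: initializers are consumed member by member.
static const foo elided[] = {10, 20, 30};        // {10, 20}, {30, 0}

// One brace pair per element: each value initializes only member a.
static const foo braced[] = {{10}, {20}, {30}};  // {10, 0}, {20, 0}, {30, 0}
```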
The padding is usually related to the size of the registers on the host CPU - in your case, you've got a 32-bit CPU, so the "natural" size of an int is 4 bytes. It is slower and more difficult for the CPU to access quantities of memory smaller than this size, so it is generally preferable to align values onto 4-byte boundaries. The struct thus comes out as a multiple of 4 bytes in size. Most compilers will allow you to modify the amount of padding used (e.g. with "#pragma"s), but this should only be used where the memory footprint of the struct is absolutely critical.
"*foos" references the first entry in the foos array. "foo" references (a single instance of) the type. So they are essentially the same. I would use sizeof(type) or sizeof(array[0]) myself, as *array is easier to mis-read.
In your first example, you are not initialising the array entries correctly. Your struct has 2 members, so you must use { a, b } to initialise each element of the array. So you need the form { {a, b}, {a, b}, {a, b} } to correctly initialise the entries.
To find out how much padding you have, simply add up the sizeof() each element of the structure, and subtract this sum from the sizeof() the whole structure.
You can use offsetof() to find out exactly where the padding is, in more complex structs. This may help you to fill holes by rearranging elements, reducing the size of the struct as a whole.
It is good practice to explicitly align structure elements, by manually inserting padding elements so that every element is guaranteed to be "naturally aligned". You can reuse these padding elements for useful data in the future. If you ever write a library that will require a stable ABI, this will be a required technique.

Casting between multi- and single-dimensional arrays

This came up from this answer to a previous question of mine.
Is it guaranteed for the compiler to treat array[4][4] the same as array[16]?
For instance, would either of the below calls to api_func() be safe?
void api_func(const double matrix[4][4]);
// ...
{
typedef double Matrix[4][4];
double* array1 = new double[16];
double array2[16];
// ...
api_func(reinterpret_cast<Matrix&>(*array1)); // dereference first: array1 is a pointer, not an array
api_func(reinterpret_cast<Matrix&>(array2));
}
From the C++ standard, referring to the sizeof operator:
When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element.
From this, I'd say that double[4][4] and double[16] would have to have the same underlying representation.
I.e., given
sizeof(double[4]) = 4*sizeof(double)
and
sizeof(double[4][4]) = 4*sizeof(double[4])
then we have
sizeof(double[4][4]) = 4*4*sizeof(double) = 16*sizeof(double) = sizeof(double[16])
I think a standards-compliant compiler would have to implement these the same, and I think that this isn't something that a compiler would accidentally break. The standard way of implementing multi-dimensional arrays works as expected. Breaking the standard would require extra work, for likely no benefit.
The C++ standard also states that an array consists of contiguously-allocated elements, which eliminates the possibility of doing anything strange using pointers and padding.
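The size identities above, and the resulting element positions, can be checked as follows. This is a sketch with a hypothetical helper byte_offset; it confirms only the memory layout, and does not by itself settle whether the reinterpret_cast in the question is sanctioned by the standard.

```cpp
#include <cassert>
#include <cstddef>

// The chain of equalities from the answer, as compile-time checks.
static_assert(sizeof(double[4]) == 4 * sizeof(double), "row is 4 doubles");
static_assert(sizeof(double[4][4]) == 4 * sizeof(double[4]), "matrix is 4 rows");
static_assert(sizeof(double[4][4]) == sizeof(double[16]), "same total size");

// Byte offset of m[i][j] from the start of a double[4][4] (i and j must be < 4).
inline std::size_t byte_offset(std::size_t i, std::size_t j) {
    static const double m[4][4] = {};
    return static_cast<std::size_t>(reinterpret_cast<const char*>(&m[i][j]) -
                                    reinterpret_cast<const char*>(&m[0][0]));
}
```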
I don't think there is a problem with padding introduced by having a multi-dimensional array.
Each element in an array must satisfy the alignment requirements imposed by the architecture. An array [N][M] is always going to have the same in-memory representation as one of [M*N].
Each array element should be laid out sequentially in memory by the compiler. The two declarations whilst different types are the same underlying memory structure.
@Konrad Rudolph:
I get those two (row major/column major) mixed up myself, but I do know this: It's well-defined.
int x[3][5], for example, is an array of size 3, whose elements are int arrays of size 5. (§6.5.2.1) Adding all the rules from the standard about arrays, addressing, etc., you get that the second subscript references consecutive integers, whereas the first subscript references consecutive 5-int objects. (So 3 is the bigger number; you have 5 ints between x[1][0] and x[2][0].)
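The same reading of int x[3][5] can be confirmed with the type traits from <type_traits>. In this sketch Grid and ints_between_rows are hypothetical names; the function measures the distance between the starts of two consecutive rows in units of int.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

typedef int Grid[3][5];

// Outer dimension is 3, inner dimension is 5.
static_assert(std::extent<Grid, 0>::value == 3, "3 rows");
static_assert(std::extent<Grid, 1>::value == 5, "5 ints per row");

// Each element of the outer array is itself an int[5].
static_assert(std::is_same<std::remove_extent<Grid>::type, int[5]>::value, "element type is int[5]");
static_assert(sizeof(Grid) == 3 * sizeof(int[5]), "3 rows of 5 ints");

// Distance from x[1][0] to x[2][0], measured in ints.
inline std::size_t ints_between_rows() {
    static const int x[3][5] = {};
    const std::size_t bytes = static_cast<std::size_t>(
        reinterpret_cast<const char*>(&x[2][0]) - reinterpret_cast<const char*>(&x[1][0]));
    return bytes / sizeof(int);
}
```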
I would be worried about padding being added for things like Matrix[5][5] to make each row word aligned, but that could be simply my own superstition.
A bigger question is: do you really need to perform such a cast?
Although you might be able to get away with it, it would still be more readable and maintainable to avoid altogether. For example, you could consistently use double[m*n] as the actual type, and then work with a class that wraps this type, and perhaps overloads the [] operator for ease of use. In that case, you might also need an intermediate class to encapsulate a single row -- so that code like my_matrix[3][5] still works as expected.
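A minimal sketch of such a wrapper, under stated assumptions: FlatMatrix and its nested Row are hypothetical names, storage is a std::vector<double> in row-major order, and there is no bounds checking. The intermediate Row object is what makes the two-subscript syntax work while the data stays in one flat buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: flat row-major storage with m[row][col] access.
class FlatMatrix {
public:
    // Intermediate object representing one row; its operator[] picks the column.
    class Row {
    public:
        explicit Row(double* first) : first_(first) {}
        double& operator[](std::size_t col) const { return first_[col]; }
    private:
        double* first_;
    };

    FlatMatrix(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0.0) {}

    // Returns a Row pointing at the first element of the requested row.
    Row operator[](std::size_t row) { return Row(data_.data() + row * cols_); }

    // Flat view, suitable for APIs that expect a plain double array.
    const double* data() const { return data_.data(); }

private:
    std::size_t cols_;
    std::vector<double> data_;
};
```

With this, m[3][2] and m.data()[3 * 4 + 2] refer to the same element of a 4x4 matrix, and no cast between array types is needed.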