Why does std::vector<bool> have no .data()? - c++

The specialisation of std::vector<bool>, as specified in C++11 23.3.7/1, doesn't declare a data() member (e.g. mentioned here and here).
The question is: Why does a std::vector<bool> have no .data()? This is the very same question as why is a vector of bools not stored contiguously in memory. What are the benefits in not doing so?
Why can a pointer to an array of bools not be returned?

Why does a std::vector<bool> have no .data()?
Because std::vector<bool> stores multiple values in 1 byte.
Think of it as compressed storage in which every boolean value needs only 1 bit. So, instead of one element per array cell, several elements are packed into each byte.
Assuming that you want to index into this storage to get a value, what would operator[] return? It can't return bool&, since the smallest addressable unit is a byte and that byte holds more than one bool. It returns a proxy object instead, so you cannot obtain a bool* from it. In other words, bool *bool_ptr = &v[0]; is not valid code and results in a compilation error.
Moreover, the standard allows the space optimization but does not require it, so a conforming implementation may choose not to pack the bits. data() would then have to copy into the expected return type depending on the implementation (or the standard would have to mandate the optimization instead of just allowing it).
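The proxy behaviour of operator[] can be checked directly. This is a minimal sketch, not part of the original answer; the helper name set_and_read is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// The standard specifies vector<bool>::reference as a proxy class, not bool&.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "operator[] does not return bool&");

// Hypothetical helper: sets element i through the proxy and reads it back.
inline bool set_and_read(std::vector<bool>& v, std::size_t i) {
    assert(i < v.size());
    std::vector<bool>::reference bit = v[i];  // proxy object referring to one bit
    bit = true;                               // the write goes through to the vector
    // bool* p = &v[i];                       // does not compile: no bool object to point at
    return v[i];
}
```

Calling set_and_read(v, 3) on a vector of ten false values leaves v[3] set and the other elements untouched.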
Why can a pointer to an array of bools not be returned?
Because std::vector<bool> is not stored as an array of bools, no pointer can be returned in a straightforward way. It could copy the data to an array and return that array, but it's a design choice not to do that (if it did, one would expect it to behave like data() on every other container, which would be misleading).
What are the benefits in not doing so?
Memory optimization.
Usually 8 times less memory usage, since it stores multiple bits in a single byte. To be exact, CHAR_BIT times less.

Related

Array of class holding an array memory layout

If we have a class which holds an array, let's call it vector and hold the values in a simple array called data:
class vector
{
public:
double data[3];
<...etc..>
};
Note: I call it vector only to make the explanation clearer; it is not std::vector!
So my question is: if the only other things I put in the class besides this array are typedefs and some constexpr constants, am I correct that an object of the class will be just 3 doubles one after another in memory?
And then, if I create an array of vectors like:
vector vl[3];
Note: the size of the array is not always known at compile time; I just use 3 for the example.
Then in memory it will be just 9 doubles one after another, right?
So will vl[0].data[3] always return the 2nd vector's 1st element? And in this case, is it guaranteed that the result will always look like a simple array in memory?
I found only cases with array of arrays, but not with array of classes holding an array, and I'm not sure if it is exactly the same at the end. I made some tests and it seems like it is working as I expected, but I don't know if it is always true..
Thank you!
Mostly, yes.
The standard doesn't promise that there never is anything after data in the representation of a vector, but all the implementations that I know of won't add any padding in this case.
What is promised is that there is no padding before data in the representation of vector, because it is a StandardLayout type.
You are right with your first example: The class layout is like a C struct. The first member resides at the address of the struct itself, and if it is an array, all the array's members are adjacent.
Between struct members, however, there may be padding, so there is no guarantee that the size of a struct is the sum of all member sizes. I'd have to dig into the standard, but I assume this includes padding at the end. This answer affirms that; assert(sizeof(vector) == 3*sizeof(double)) may not hold. In reality I'd assume that an implementation may pad a struct containing three chars so that the struct aligns at word boundaries in an array, but not one containing three doubles, which are typically the type with the strongest alignment requirements. But there is no guarantee across implementations, architectures and compiler options: imagine we switch to 128-bit CPUs.
With respect to your second example: The above applies recursively, so the standard gives no guarantee that the 9 doubles will be adjacent. On the other hand, I bet they will be, and the program can assert it with a simple compile-time static_assert.
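The compile-time check mentioned above could look like the following sketch. The struct is renamed vec3 here (my choice, not the asker's) to keep it apart from std::vector.

```cpp
#include <cstddef>

// Stand-in for the question's class, holding only the array.
struct vec3 {
    double data[3];
};

// Guaranteed for standard-layout types: nothing precedes the first member.
static_assert(offsetof(vec3, data) == 0, "data must sit at offset 0");
// Not guaranteed by the standard: no padding after the array, so check it.
static_assert(sizeof(vec3) == 3 * sizeof(double), "unexpected trailing padding");
// If the previous assertion holds, an array of 3 objects is 9 packed doubles.
static_assert(sizeof(vec3[3]) == 9 * sizeof(double), "array is not 9 packed doubles");
```

If an implementation ever did add padding, the build would fail instead of the program silently reading the wrong memory.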

Do map and set allocate 1 item at a time always?

I am implementing an allocator for std::map and std::set in C++14. The allocator has to provide a function with the signature pointer allocate(size_type n) that allocates space for n items at a time.
After some tests, I have seen std::map and std::set always do allocate(1) on my platform, I have not seen any n > 1. It makes sense to me if I think about the internal tree representation.
Does the standard guarantee this behavior? Or can I safely trust n == 1 always in any specific platform?
Does the standard guarantee this behavior?
No. The standard does not guarantee this.
Or can I safely trust n == 1 always in any specific platform?
The number of allocations when inserting is constrained by the complexity of the container's methods. For example, for std::map::insert the standard specifies (from cppreference, only the first 3 overloads, inserting a single element):
1-3) Logarithmic in the size of the container, O(log(size())).
Then the implementers are free to choose an implementation that fulfills this specification. The log(size()) part is because you need to find the place where to insert and allocating space for a fixed number of elements is just a constant contribution to complexity.
An implementation could choose to allocate space for two elements every second time it is called. 2 is just as constant as 1. However, it shouldn't be too hard to find cases where allocating 1 is more efficient than allocating 2 in absolute terms. Moreover, std::map and std::set are not required to store their elements in contiguous memory.
Hence, I would assume that it is always 1, but you have no guarantee. If you want to be certain you have to look at the specific implementation, but then you rely on implementation details.
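To see what a particular implementation does, one can log every n that reaches allocate(). This is a sketch with made-up names (LoggingAllocator, allocation_log, largest_request); it only reports the behaviour of the library it is compiled against and proves nothing about others.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical global log of every n passed to allocate().
inline std::vector<std::size_t>& allocation_log() {
    static std::vector<std::size_t> log;
    return log;
}

// Minimal allocator that records n and forwards to std::allocator.
template <class T>
struct LoggingAllocator {
    using value_type = T;

    LoggingAllocator() = default;
    template <class U>
    LoggingAllocator(const LoggingAllocator<U>&) {}

    T* allocate(std::size_t n) {
        allocation_log().push_back(n);
        return std::allocator<T>().allocate(n);
    }
    void deallocate(T* p, std::size_t n) { std::allocator<T>().deallocate(p, n); }
};

template <class T, class U>
bool operator==(const LoggingAllocator<T>&, const LoggingAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const LoggingAllocator<T>&, const LoggingAllocator<U>&) { return false; }

using LoggedMap = std::map<int, int, std::less<int>,
                           LoggingAllocator<std::pair<const int, int>>>;

// Inserts `count` elements and returns the largest n seen by allocate().
inline std::size_t largest_request(int count) {
    assert(count > 0);
    allocation_log().clear();
    LoggedMap m;
    for (int i = 0; i < count; ++i) m.emplace(i, i);
    std::size_t largest = 0;
    for (std::size_t n : allocation_log())
        if (n > largest) largest = n;
    return largest;
}
```

A result of 1 from largest_request(100) matches the asker's observation on that platform; any other value would show an implementation that batches node allocations.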
allocate(n) is not the same as allocate(1) n times.
A::allocate(n) must return a single pointer, hence it is not trivial to allocate non-contiguous memory. There is however no requirement that this pointer is a T*. Instead A::allocate(n) returns an A::pointer. This can be any type as long as it satisfies NullablePointer, LegacyRandomAccessIterator, and LegacyContiguousIterator.
cppreference mentions boost::interprocess::offset_ptr as an example of how to allocate segmented memory. You might want to take a look at that. Here is the full quote:
Fancy pointers
When the member type pointer is not a raw pointer type, it is commonly
referred to as a "fancy pointer". Such pointers were introduced to
support segmented memory architectures and are used today to access
objects allocated in address spaces that differ from the homogeneous
virtual address space that is accessed by raw pointers. An example of
a fancy pointer is the mapping address-independent pointer
boost::interprocess::offset_ptr, which makes it possible to allocate
node-based data structures such as std::set in shared memory and
memory mapped files mapped in different addresses in every process.
Fancy pointers can be used independently of the allocator that
provided them, through the class template std::pointer_traits.

Memory allocation of C++ vector<bool>

The vector<bool> class in the C++ STL is optimized for memory to allocate one bit per bool stored, rather than one byte. Every time I output sizeof(x) for vector<bool> x, the result is 40 bytes, which is the vector structure itself. sizeof(x.at(0)) always returns 16 bytes, which must be the allocated memory for many bool values, not just the one at position zero. How many elements do the 16 bytes cover? 128 exactly? What if my vector has more or fewer elements?
I would like to measure the size of the vector and all of its contents. How would I do that accurately? Is there a C++ library available for viewing allocated memory per variable?
I don't think there's any standard way to do this. The only information a vector<bool> implementation gives you about how it works is the reference member type, but there's no reason to assume that this has any congruence with how the data are actually stored internally; it's just that you get a reference back when you dereference an iterator into the container.
So you've got the size of the container itself, and that's fine, but to get the amount of memory taken up by the data, you're going to have to inspect your implementation's standard library source code and derive a solution from that. Though, honestly, this seems like a strange thing to want in the first place.
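As a rough approximation only, one can combine sizeof with capacity(). The sketch below assumes the implementation stores exactly one bit per element of capacity, which the standard does not promise; the function name estimated_bytes is made up, and allocator overhead and word-sized rounding are ignored.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Rough estimate, ASSUMING one bit per element of capacity (not guaranteed).
inline std::size_t estimated_bytes(const std::vector<bool>& v) {
    // Round the bit count up to whole bytes.
    std::size_t payload = (v.capacity() + CHAR_BIT - 1) / CHAR_BIT;
    return sizeof(v) + payload;  // container object plus heap payload
}
```

For an exact figure you would still have to read the library's source, as the answer says.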
Actually, using vector<bool> is kind of a strange thing to want in the first place. All of the above is essentially why its use is frowned upon nowadays: it's almost entirely incompatible with conventions set by other standard containers… or even those set by other vector specialisations.

What is the byte alignment of the elements in a std::vector<char>?

I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).
Does anyone know how standard library containers get aligned?
The elements of the container have at least the alignment required for them in that implementation: if int is 4-aligned in your implementation, then each element of a vector<int> is an int and therefore is 4-aligned. I say "if" because there's a difference between size and alignment requirements - just because int has size 4 doesn't necessarily mean that it must be 4-aligned, as far as the standard is concerned. It's very common, though, since int is usually the word size of the machine, and most machines have advantages for memory access on word boundaries. So it makes sense to align int even if it's not strictly necessary. On x86, for example, you can do unaligned word-sized memory access, but it's slower than aligned. On ARM unaligned word operations are not allowed, and typically crash.
vector guarantees contiguous storage, so there won't be any "padding" in between the first and second element of a vector<char>, if that's what you're concerned about. The specific requirement for std::vector is that for 0 <= n < vec.size(), &vec[n] == &vec[0] + n.
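The contiguity requirement and the element alignment can both be checked at run time. A small sketch with made-up helper names; neither helper is meaningful for vector<bool>, whose elements are not addressable objects.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: checks &vec[n] == &vec[0] + n for every valid index.
template <class T>
bool is_contiguous(const std::vector<T>& vec) {
    for (std::size_t n = 0; n < vec.size(); ++n)
        if (&vec[n] != &vec[0] + n) return false;
    return true;
}

// Hypothetical helper: checks that every element address is a multiple of alignof(T).
template <class T>
bool elements_aligned(const std::vector<T>& vec) {
    for (const T& e : vec)
        if (reinterpret_cast<std::uintptr_t>(&e) % alignof(T) != 0) return false;
    return true;
}
```

Both helpers are expected to return true for any std::vector<char> or std::vector<int> on a conforming implementation.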
[Edit: this bit is now irrelevant, the questioner has disambiguated: The container itself will usually have whatever alignment is required for a pointer, regardless of what the value_type is. That's because the vector itself would not normally incorporate any elements, but will have a pointer to some dynamically-allocated memory with the elements in that. This isn't explicitly required, but it's a predictable implementation detail.]
Every object in C++ is at least 1-aligned; the only things that aren't are bit-fields and the elements of the borderline-crazy special case that is vector<bool>. So you can rest assured that your hope for std::vector<char> is well-founded. Both the vector and its first element will probably also be 4-aligned ;-)
As for how they get aligned - the same way anything in C++ gets aligned. When memory is allocated from the heap, it is required to be aligned sufficiently for any object that can fit into the allocation. When objects are placed on the stack, the compiler is responsible for designing the stack layout. The calling convention will specify the alignment of the stack pointer on function entry, then the compiler knows the size and alignment requirement of each object it lays down, so it knows whether the stack needs any padding to bring the next object to the correct alignment.
I'm hoping that the elements are 1 byte aligned and similarly that a std::vector<int> is 4 byte aligned ( or whatever size int happens to be on a particular platform ).
To put it simply, std::vector is a wrapper for a C array. The elements of the vector are aligned as if they were in the array: elements are guaranteed to occupy a contiguous memory block without any added gaps, so that a std::vector<T> v can be accessed as a C array using &v[0]. (This is why vector sometimes has to reallocate its storage when elements are added to it.)
Does anyone know how standard library containers get aligned?
Alignment of elements is platform-specific, but generally a simple variable is aligned so that its address is divisible by its size (natural alignment). Structures are padded (with empty filler space, including at the end) to the alignment of the largest data type they contain, to ensure that if the structure is put into an array, all fields retain their natural alignment.
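The effect of padding on a structure can be expressed as compile-time checks. In this sketch the struct Mixed is a made-up example; the exact amount of padding is platform-specific, so only the relationships are asserted.

```cpp
#include <cstddef>

// Made-up example struct mixing a small and a large member.
struct Mixed {
    char   tag;    // 1 byte, usually followed by padding
    double value;  // typically needs stronger alignment than char
};

// A type's size is a multiple of its alignment, so array elements stay aligned.
static_assert(sizeof(Mixed) % alignof(Mixed) == 0, "size is a multiple of alignment");
// The struct is at least as strictly aligned as its most demanding member.
static_assert(alignof(Mixed) >= alignof(double), "struct inherits member alignment");
// The second member starts at an offset suitable for a double.
static_assert(offsetof(Mixed, value) % alignof(double) == 0, "value is aligned");
```

On a typical 64-bit platform sizeof(Mixed) is 16 rather than 9, but that number is an observation about common ABIs, not a guarantee.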
For other containers (like std::list or std::map), the user's data is made part of an internal node structure via templates, and that structure is allocated by operator new. new is guaranteed (custom implementations must obey the rule too; it is inherited from malloc()) to return a memory block that is aligned for the largest available primitive data type (*). That ensures that whatever structure or variable is placed in the memory block, it will be accessed in an aligned fashion. Unlike std::vector, obviously, the elements of most other STL containers are not guaranteed to be within the same contiguous memory block: they are newed one by one, not with new[].
(*) As per the C++ standard, "The allocation function (basic.stc.dynamic.allocation) called by a new-expression (expr.new) to allocate size bytes of storage suitably aligned to represent any object of that size." That is a softer requirement than the one malloc() generally abides by, as per POSIX: "The pointer returned if the allocation succeeds shall be suitably aligned so that it may be assigned to a pointer to any type of object [...]". The C++ requirement in a way reinforces the natural alignment requirement: a dynamically allocated char would be aligned as char requires, but not more.
Do you mean the vector members, or the vector structure itself? Members are guaranteed to be contiguous in memory but structure alignment is platform/compiler-dependent. On Windows this can be set at compile-time and also overridden using #pragma pack().
The answer for other containers is likely not the same as for vector, so I would ask specific questions about the ones you care about.
The alignment of the whole container is implementation dependent. It is usually at least sizeof(void*), that is 4 or 8 bytes depending on the platform, but may be larger.
If special (guaranteed) alignment is needed use plain arrays or write/adapt some generic array class with:
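// _mm_malloc and _mm_free are declared in <xmmintrin.h> on GCC and Clang
// allocation (the returned void* needs an explicit cast in C++)
char* pointer = static_cast<char*>(_mm_malloc(size, alignment));
// deallocation (memory from _mm_malloc must be released with _mm_free)
_mm_free(pointer);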

Casting between multi- and single-dimensional arrays

This came up from this answer to a previous question of mine.
Is it guaranteed for the compiler to treat array[4][4] the same as array[16]?
For instance, would either of the below calls to api_func() be safe?
void api_func(const double matrix[4][4]);
// ...
{
typedef double Matrix[4][4];
double* array1 = new double[16];
double array2[16];
// ...
api_func(*reinterpret_cast<Matrix*>(array1)); // array1 is a pointer: cast the pointer, then dereference
api_func(reinterpret_cast<Matrix&>(array2));
}
From the C++ standard, referring to the sizeof operator:
When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element.
From this, I'd say that double[4][4] and double[16] would have to have the same underlying representation.
I.e., given
sizeof(double[4]) = 4*sizeof(double)
and
sizeof(double[4][4]) = 4*sizeof(double[4])
then we have
sizeof(double[4][4]) = 4*4*sizeof(double) = 16*sizeof(double) = sizeof(double[16])
I think a standards-compliant compiler would have to implement these the same, and I think that this isn't something that a compiler would accidentally break. The standard way of implementing multi-dimensional arrays works as expected. Breaking the standard would require extra work, for likely no benefit.
The C++ standard also states that an array consists of contiguously-allocated elements, which eliminates the possibility of doing anything strange using pointers and padding.
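The size argument above can be written down as compile-time checks, and the position of an element can be measured in bytes. A minimal sketch; byte_offset is a made-up helper, and it shows only the layout, not whether the cast in the question is permitted.

```cpp
#include <cassert>
#include <cstddef>

// The size relations from the answer, verified by the compiler.
static_assert(sizeof(double[4]) == 4 * sizeof(double), "row size");
static_assert(sizeof(double[4][4]) == 4 * sizeof(double[4]), "matrix size");
static_assert(sizeof(double[4][4]) == sizeof(double[16]), "same total size");

// Hypothetical helper: byte distance of m[i][j] from the start of the matrix.
inline std::size_t byte_offset(const double (&m)[4][4], std::size_t i, std::size_t j) {
    assert(i < 4 && j < 4);
    return static_cast<std::size_t>(reinterpret_cast<const char*>(&m[i][j]) -
                                    reinterpret_cast<const char*>(&m));
}
```

byte_offset(m, i, j) equals (i * 4 + j) * sizeof(double), which is where element i * 4 + j of a double[16] would sit.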
I don't think there is a problem with padding introduced by having a multi-dimensional array.
Each element in an array must satisfy the padding requirements imposed by the architecture. An array [N][M] is always going to have the same in memory representation as one of [M*N].
Each array element should be laid out sequentially in memory by the compiler. The two declarations whilst different types are the same underlying memory structure.
@Konrad Rudolph:
I get those two (row major/column major) mixed up myself, but I do know this: It's well-defined.
int x[3][5], for example, is an array of size 3, whose elements are int arrays of size 5. (§6.5.2.1) Adding all the rules from the standard about arrays, addressing, etc. you get that the second subscript references consecutive integers, whereas the first subscript will reference consecutive 5-int objects. (So 3 is the bigger number; you have 5 ints between x[1][0] and x[2][0].)
I would be worried about padding being added for things like Matrix[5][5] to make each row word aligned, but that could be simply my own superstition.
A bigger question is: do you really need to perform such a cast?
Although you might be able to get away with it, it would still be more readable and maintainable to avoid it altogether. For example, you could consistently use double[m*n] as the actual type, and then work with a class that wraps this type, and perhaps overloads the [] operator for ease of use. In that case, you might also need an intermediate class to encapsulate a single row -- so that code like my_matrix[3][5] still works as expected.
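One possible shape for such a wrapper is sketched below. The class name FlatMatrix is made up, and instead of the intermediate row class suggested above it returns a raw pointer to the start of the row, which is simpler but gives no bounds check on the column index.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: flat storage with two-level indexing m[row][col].
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), cells_(rows * cols, 0.0) {}

    // A pointer to the first element of the row stands in for a row class.
    double* operator[](std::size_t row) {
        assert(row < rows_);
        return cells_.data() + row * cols_;
    }
    const double* operator[](std::size_t row) const {
        assert(row < rows_);
        return cells_.data() + row * cols_;
    }

    // Flat view, suitable for an API that expects rows * cols doubles.
    double* data() { return cells_.data(); }
    const double* data() const { return cells_.data(); }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> cells_;  // row-major storage
};
```

With this in place, m[2][3] = 7.5 writes the same cell that m.data()[2 * m.cols() + 3] reads, and no reinterpret_cast is needed.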