Alternative to zero-sized array in embedded C++


The Multiboot Specification has structures like this:
struct multiboot_tag_mmap
{
multiboot_uint32_t type;
multiboot_uint32_t size;
multiboot_uint32_t entry_size;
multiboot_uint32_t entry_version;
struct multiboot_mmap_entry entries[0];
};
The intent seems to be that the array size can vary. The information is not known until passed along by the boot loader. In hosted C++, the suggested advice is to "use vector". Well, I can't do that. The alternative is to use dynamic allocation, but that would require implementing a significant chunk of the kernel (paging, MMU, etc.) before I even have the memory map information. A bit of a chicken or egg problem.
The "hack" is to just enable extensions with gnu++11. But I try to avoid using extensions as much as possible to avoid C-ish code or code that could potentially lead to undefined behavior. The more portable the code is, the less chance of bugs in my opinion.
Finally, you iterate over the memory map like this:
for (mmap = ((struct multiboot_tag_mmap *) tag)->entries;
(multiboot_uint8_t *) mmap
< (multiboot_uint8_t *) tag + tag->size;
mmap = (multiboot_memory_map_t *)
((unsigned long) mmap
+ ((struct multiboot_tag_mmap *) tag)->entry_size))
So the size of the structure is tag->size.
I can modify the multiboot header so long as the semantics are the same. The point is how it looks to the bootloader. What can I do?

Instead of a zero-sized array, you can use a one-element array:
struct multiboot_tag_mmap
{
...
struct multiboot_mmap_entry entries[1];
};
This changes only the result of sizeof(struct multiboot_tag_mmap), which should not be used anyway: the size of the allocated structure should be computed as
offsetof(struct multiboot_tag_mmap, entries) + <num-of-entries> * sizeof(struct multiboot_mmap_entry)
The alignment of the map structure does not depend on the number of elements in the entries array, only on the entry type.
Strictly conforming alternative:
If there is a known upper bound on the array size, you can use that bound in the type declaration:
struct multiboot_tag_mmap
{
...
struct multiboot_mmap_entry entries[<UPPER-BOUNDARY>];
};
With such a declaration, none of the issues described below apply.
NOTE about accessing elements:
To access elements beyond the first one in such a flexible array, declare a new pointer variable:
struct multiboot_mmap_entry* entries = tag->entries;
entries[index] = ...; // This is OK.
instead of using the entries field directly:
tag->entries[index] = ...; // WRONG! May spuriously fail!
The thing is that the compiler, knowing that only one element exists in the entries array, may optimize the latter case to:
tag->entries[0] = ...; // Compiler is within its rights to assume index has the only allowed value
Standard conformance issues: With the flexible array approach, there is no object of type struct multiboot_tag_mmap in memory (on the heap or on the stack). All we have is a pointer of this type, which is never dereferenced (e.g. to make a full copy of the object). Similarly, there is no object of the array type struct multiboot_mmap_entry[1] corresponding to the entries field of the structure; the field is used only for conversion to a plain pointer of type struct multiboot_mmap_entry*.
So the phrase in the C standard that denotes undefined behavior,
An object is assigned to an inexactly overlapping object or to an exactly overlapping object with incompatible type
does not apply to accessing the entries array field through a plain pointer: there is no overlapping object here.
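Another way to avoid the array member altogether is to declare only the fixed header and compute the entry addresses from tag->size and tag->entry_size, as the iteration loop in the question already does. The following is a minimal sketch under stated assumptions, not the Multiboot header itself: mmap_tag_header, mmap_entry and total_length are hypothetical names, the entry fields are stand-ins for the real multiboot_mmap_entry, and it assumes a memcpy implementation is available in the freestanding environment.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for struct multiboot_mmap_entry.
struct mmap_entry {
    std::uint64_t addr;
    std::uint64_t len;
    std::uint32_t type;
    std::uint32_t zero;
};

// Same fixed fields as the question's struct, minus the zero-sized array.
// The entries are assumed to follow this header directly in memory.
struct mmap_tag_header {
    std::uint32_t type;
    std::uint32_t size;          // size of the whole tag in bytes, header included
    std::uint32_t entry_size;    // stride from one entry to the next
    std::uint32_t entry_version;
};

// Sums the len field of every entry that follows the header.
inline std::uint64_t total_length(const mmap_tag_header* tag) {
    // Reject obviously malformed tags instead of walking off the end.
    if (tag->size < sizeof(mmap_tag_header) || tag->entry_size < sizeof(mmap_entry))
        return 0;

    const unsigned char* base = reinterpret_cast<const unsigned char*>(tag);
    const unsigned char* end  = base + tag->size;
    const unsigned char* p    = base + sizeof(mmap_tag_header);

    std::uint64_t total = 0;
    while (static_cast<std::size_t>(end - p) >= sizeof(mmap_entry)) {
        mmap_entry e;
        std::memcpy(&e, p, sizeof e);   // copy out rather than alias the raw bytes
        total += e.len;
        if (static_cast<std::size_t>(end - p) < tag->entry_size)
            break;                      // no room for another full stride
        p += tag->entry_size;
    }
    return total;
}
```

Because the header has no array member, sizeof(mmap_tag_header) is exactly the offset of the first entry, and no compiler extension is involved. What the boot loader sees in memory is unchanged.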

Related

How does an union determine max size from a list of objects?

I am not sure the question is well put, because I understood how, but I don't know to write the questions with the thing I don't understand. Here it is:
I have some classes:
class Animal {};
class Rabbit : public Animal {};
class Horse : public Animal {};
class Mouse : public Animal {};
class Pony : public Horse {};
My goal was to find the maximum size among these classes in order to use it for memory allocation afterwards. I stored the sizeof of each class in an array and then took the max of the array. My superior (to whom I sent the code for review) suggested that I use a union to find the maximum size at compile time. The idea seemed very nice to me, so I did it like this:
typedef union
{
Rabbit rabbitObject;
Horse horseObject;
Mouse mouseObject;
Pony ponyObject;
} Size;
... because a union allocates memory according to its largest element.
The next suggestion was to do it like this:
typedef union
{
unsigned char RabbitObject[sizeof(Rabbit)];
unsigned char HorseObject[sizeof(Horse)];
unsigned char MouseObject[sizeof(Mouse)];
unsigned char PonyObject[sizeof(Pony)];
} Interesting;
My question is:
How does the Interesting union get the maximum object size? To me, it makes no sense to create arrays of unsigned char of length sizeof(class) inside it. Why would the second option solve the problem when the first union doesn't?
What is happening behind the scenes that I am missing?
PS: The conditions are in that way that I cannot ask the guy personally.
Thank you in advance

The assumptions are incorrect, and the question is moot. The standard does not require the size of a union to equal the size of its largest member. It only requires the union to be large enough to hold the largest member, which is not the same thing. Both solutions are flawed if the size of the largest class needs to be known exactly.
Instead, something like this should be used:
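#include <algorithm>    // std::max (constexpr since C++14)
#include <cstddef>      // std::size_t
#include <type_traits>  // std::integral_constant
template<class L, class Y, class... T> struct max_size
: std::integral_constant<std::size_t, std::max(sizeof (L), max_size<Y, T...>::value)> { };
template<class L, class Y> struct max_size<L, Y>
: std::integral_constant<std::size_t, std::max(sizeof (L), sizeof (Y))> { };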
As @Caleth suggested, this can be shortened using the initializer-list overload of std::max (and a variable template):
template<class... Ts>
constexpr size_t max_size_v = std::max({sizeof(Ts)...});
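As a usage sketch for the variable template, assuming C++14 or later: the class bodies below are hypothetical placeholders (the question's classes are empty), and kMaxAnimalSize is an illustrative name.

```cpp
#include <algorithm>
#include <cstddef>

template <class... Ts>
constexpr std::size_t max_size_v = std::max({sizeof(Ts)...});

// Hypothetical class bodies, only here so the sizes differ.
struct Animal { int legs; };
struct Rabbit : Animal { char ears[3]; };
struct Horse  : Animal { double speed; };
struct Mouse  : Animal { char tail; };
struct Pony   : Horse  { char name[5]; };

// Evaluated entirely at compile time.
constexpr std::size_t kMaxAnimalSize = max_size_v<Rabbit, Horse, Mouse, Pony>;

static_assert(kMaxAnimalSize >= sizeof(Rabbit), "must fit a Rabbit");
static_assert(kMaxAnimalSize >= sizeof(Horse),  "must fit a Horse");
static_assert(kMaxAnimalSize >= sizeof(Mouse),  "must fit a Mouse");
static_assert(kMaxAnimalSize >= sizeof(Pony),   "must fit a Pony");
```

Unlike sizeof applied to a union, the result is exactly the largest sizeof in the list, with no padding added on top. If the value is used to size a raw buffer, the buffer's alignment has to be handled separately.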

The two approaches provide a way to find a maximum size that all of the objects of the union will fit within. I would prefer the first as it is clearer as to what is being done and the second provides nothing that the first does not for your needs.
And the first, a union composed of the various classes, offers the ability to access a specific member of the union as well.
See also "Is a struct's address the same as its first member's address?", "sizeof a union in C/C++", and "Anonymous union and struct [duplicate]".
For some discussions on memory layout of classes see the following postings:
Structure of a C++ Object in Memory Vs a Struct
How is the memory layout of a class vs. a struct
memory layout C++ objects [closed]
C++ Class Memory Model And Alignment
What does an object look like in memory? [duplicate]
C++11 introduced a standardized memory model. What does it mean? And how is it going to affect C++ programming?
Since the compiler is free to add to the sizes of the various components in order to align variables on particular memory address boundaries, the size of the union may be larger than the actual size of the data. Some compilers offer a pragma or other type of directive to instruct the compiler as to whether packing of the class, struct, or union members should be done or not.
The size as reported by sizeof() will be the size of the variable or type specified however again this may include additional unused memory area to pad the variable to the next desirable memory address alignment. See Why isn't sizeof for a struct equal to the sum of sizeof of each member?.
Typically a class, struct, or union is sized so that if an array of the type is created then each element of the array will begin on the most useful memory alignment such as a double word memory alignment for an Intel x86 architecture. This padding is typically on the end of the variable.

Your superior suggested the array version because a union could have padding. For instance, if you have
union padding {
char arr[sizeof (double) + 1];
double d;
};
Then this could either be of size sizeof(double) + 1 or of size sizeof(double) * 2, as the union could be padded to keep it aligned for doubles.
However if you have
union padding {
char arr[sizeof(double) + 1];
char d[sizeof(double)];
};
Then the union need not be double-aligned and most likely has a size of sizeof(double) + 1. This is not guaranteed though, and the size can be greater than that of its largest element.
If you want to be sure to get the largest size, I would suggest using
auto max_size = std::max({sizeof(Rabbit), sizeof(Horse), sizeof(Mouse), sizeof(Pony)});
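The padding effect can be checked at compile time. This is a small sketch; WithDouble and BytesOnly are hypothetical names for the two unions shown above, and only properties the language guarantees are asserted.

```cpp
#include <cstddef>

union WithDouble {
    char   arr[sizeof(double) + 1];
    double d;
};

union BytesOnly {
    char arr[sizeof(double) + 1];
    char d[sizeof(double)];
};

// A union is always big enough for its largest member...
static_assert(sizeof(WithDouble) >= sizeof(double) + 1, "holds arr");
static_assert(sizeof(BytesOnly)  >= sizeof(double) + 1, "holds arr");

// ...and its size is a multiple of its alignment, which for WithDouble is at
// least alignof(double). Wherever alignof(double) > 1 this forces padding.
static_assert(alignof(WithDouble) >= alignof(double), "aligned for double");
static_assert(sizeof(WithDouble) % alignof(double) == 0, "padded to alignment");
```

On a platform where alignof(double) is greater than 1, sizeof(double) + 1 is not a multiple of that alignment, so sizeof(WithDouble) must come out larger than its largest member.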

How to do c++ aligned array allocation?

I'd like to modify an array allocation:
float * a = new float[n] ;
to use an aligned allocator. I was inclined to try to use placement new and posix_memalign (or the new c++11 equivalent), but see that placement new with arrays is problematic with array allocations, because the compiler may need to have additional storage for count or other metadata.
I tried:
int main()
{
float * a = new alignas(16) float[3] ;
a[2] = 0.0 ;
return a[2] ;
}
but the compiler seems to indicate that the alignas is ignored:
$ g++ -std=c++11 t.cc -Werror
t.cc: In function ‘int main()’:
t.cc:4:39: error: attribute ignored [-Werror=attributes]
float * a = new alignas(16) float[3] ;
^
t.cc:4:39: note: an attribute that appertains to a type-specifier is ignored
It looks like the proper way to use alignas is to declare a structure with alignas, but that will only work with a fixed size.
There is also an aligned_storage template, but I think that will also only work with fixed sizes.
Is there any standard way to do an aligned array allocation that will invoke the constructor on all the elements?

As other people said, overaligned types are not required to be supported. Check your compiler documentation before using it.
You can try to solve your problem using one of the following approaches:
1) Overallocate your array (by (desired alignment / sizeof element) - 1 elements) and use std::align. A link to libstdc++ implementation.
2) Declare a struct containing an array of (desired alignment / sizeof element) elements, aligned to the desired alignment. An array of such structs should give you a compact representation in memory, but you will not be able to use normal array notation or pointer arithmetic across the structs (as (a) it is undefined behaviour, and (b) there is a small chance that the elements will not be placed as you want).
3) Write your own aligned allocation function. Notice that you can add your own versions of operator new and delete.
namespace my
{
struct aligned_allocator_tag {};
aligned_allocator_tag aligned;
}
void* operator new( std::size_t count, my::aligned_allocator_tag, std::size_t aligment);
void* operator new[]( std::size_t count, my::aligned_allocator_tag, std::size_t aligment)
{
return ::operator new(count, my::aligned, aligment);
}
//Usage
foo* c = new(my::aligned, 16) foo[20];
You will need to allocate memory with enough extra space to store the original pointer (returned by malloc or whatever you use) or the number of bytes the pointer was displaced, so that the subsequent delete frees the correct pointer; then align the pointer to the needed boundary and return it.
Here is an answer, and another one, which show how to align memory.
Notice that both of these answers use the implementation-defined behaviour of doing bitwise arithmetic on pointers converted to integers and converting them back. The only really completely standard way would be to cast memory to char* and add the difference between its value and the next aligned address.
If you can use some nonstandard memory allocation functions, you can wrap them into custom operator new too.
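The overallocate-and-align idea can be sketched with std::align, which does the pointer adjustment without integer arithmetic on pointers. This is only an illustration under assumptions: aligned_malloc and aligned_free are hypothetical helper names, the alignment is assumed to be a power of two, and the original pointer is stashed just in front of the returned block.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Hypothetical helper: returns `bytes` of storage aligned to `alignment`
// (a power of two), or nullptr on failure.
inline void* aligned_malloc(std::size_t bytes, std::size_t alignment) {
    // Payload + worst-case alignment slack + room for the saved pointer.
    std::size_t space = bytes + alignment + sizeof(void*);
    void* raw = std::malloc(space);
    if (!raw) return nullptr;

    // Skip the slot reserved for the saved pointer, then align what is left.
    void* p = static_cast<char*>(raw) + sizeof(void*);
    space -= sizeof(void*);
    void* aligned = std::align(alignment, bytes, p, space);
    if (!aligned) { std::free(raw); return nullptr; }

    // Remember the pointer malloc returned, directly before the aligned block.
    std::memcpy(static_cast<char*>(aligned) - sizeof(void*), &raw, sizeof raw);
    return aligned;
}

// Hypothetical counterpart: frees a block obtained from aligned_malloc.
inline void aligned_free(void* aligned) {
    if (!aligned) return;
    void* raw;
    std::memcpy(&raw, static_cast<char*>(aligned) - sizeof(void*), sizeof raw);
    std::free(raw);
}
```

A custom operator new like the one declared above could forward to such a helper, provided the matching operator delete forwards to aligned_free. To run constructors without array placement new, one option is to construct the elements one at a time with single-object placement new and destroy them in reverse order before freeing.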

Basically, you're stuck because, in [expr.new]:
It is implementation-defined whether over-aligned types are supported.
There is a proposal to support this better. Until then, if you want to do what you're trying to do, you'll have to use aligned_alloc instead of new.
If you stick your array in a struct:
struct S {
alignas(16) float _[3];
};
then new S will give you the right alignment for _, though not necessarily for S itself. That may suffice. If it doesn't, then you can overload operator new() and operator delete() on S itself to guarantee the correct behavior.

Native support for alignment in C++ is still dismal. You're aligning to a 4-float vector, from the look of it, so you're missing the hazard where C++14 can't do better than 16 byte alignments, but even so, it's not a feature reliably supported by all C++14 compilers. If you need portable code that uses new float[], you've lost the race right out of the gate.
I think the closest you could get, within the current standard, would be to create a vector-sized and vector-aligned data type (e.g. with std::aligned_storage) and then get in the habit of using that, rather than arrays of individual floats, for your vector math. If you need variable-length vectors, then you'd have to round up to the nearest 4 floats.
(Sorry, it's not the answer you wanted, but I think it's the answer you need in order to move forward.)

Storing a Dynamic Array of Structures

I have been working on a project which utilizes a dynamic array of structures. To avoid storing the number of structures in its own variables (the count of structures), I have been using an array of pointers to the structure variables with a NULL terminator.
For example, let's say my structure type is defined as:
typedef struct structure_item{
/* ... Structure Variables Here ... */
} item_t;
Now let's say my code has item_t **allItems = { item_1, item_2, item_3, ..., item_n, NULL }; and all item_#s are of the type item_t *.
Using this setup, I then do not have to keep track of another variable which tells me the total number of items. Instead, I can determine the total number of items as needed by saying:
int numberOfStructures;
for( numberOfStructures = 0;
*(allItems + numberOfStructures) != NULL;
numberOfStructures++
);
When this code executes, it counts the total number of pointers before NULL.
As a comparison, this system is similar to C-style strings; whereas tracking the total number of structures would be similar to a Pascal-style string. (Because C uses a NULL terminated array of characters vs. Pascal which tracks the length of its array of characters.)
My question is rather simple, is an array of pointers (pointer to pointer to struct) really necessary or could this be done with an array of structs (pointer to struct)? Can anybody provide better ways to handle this?
Note: it is important that the solution is compatible with both C and C++. This is being used in a wrapper library which is wrapping a C++ library for use in standard C.
Thank you all in advance!

What you need is a sentinel value, a recognizable valid value that means "nothing". For pointers, the standard sentinel value is NULL.
If you want to use your structs directly, you will need to decide on a sentinel value of type item_t, and check for that. Your call.

Yes, it is possible to have an array of structs in which (at least) one element is a defined sentinel (which is what the '\0' at the end of strings is, and what the NULL pointer is in your case).
What you need to do, for your struct type, is reserve one or more possible values of that struct (composed of the set of values of its members) to indicate a sentinel.
For example, let's say we have a struct type
struct X {int a; char *p;};
then define a function
int is_sentinel(struct X x)
{
return x.p == NULL;
}
This will mean any struct X for which the member p is NULL can be used as a sentinel (and the member a would not matter in this case).
Then just loop looking for a sentinel.
Note: to be compatible in both C and C++, the struct type needs to be compatible (e.g. POD).
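Counting the elements of a sentinel-terminated array of structs could then look like the sketch below. It is written in the common subset of C and C++; count_items is a hypothetical helper, and this variant of is_sentinel takes a pointer rather than a copy of the struct.

```cpp
#include <stddef.h>

struct X {
    int   a;
    char *p;
};

/* An element whose p member is NULL marks the end of the array. */
static int is_sentinel(const struct X *x)
{
    return x->p == NULL;
}

/* Counts the elements that come before the sentinel. */
static size_t count_items(const struct X *items)
{
    size_t n = 0;
    while (!is_sentinel(&items[n]))
        ++n;
    return n;
}
```

The trade-off is the same as with C strings: no separate count has to be stored, but every length query walks the array, and a missing sentinel means reading past the end.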

Accessing consecutive members with a single pointer

I want to access continuously declared member arrays of the same type with a single pointer.
So for example say I have :
struct S
{
double m1[2];
double m2[2];
};
int main()
{
S obj;
double *sp = obj.m1;
// Code that may be unsafe !!
for (int i(0); i < 4; ++i)
*(sp++) = i; // *
return 0;
}
Under what circumstances is line (*) problematic ?
I know there's for sure a problem when virtual functions are present but I need a more structured answer than my assumptions

You can be sure that the members of the struct are stored in a contiguous block of bytes, in the order they appear. Besides, the elements of each array are contiguous. So it seems that everything is fine.
The problem here is that there is no standard way of knowing whether there are padding bytes between consecutive members of the struct.
So it is unsafe to assume that there are no padding bytes at all.
If you can be quite sure, for some particular reason, that there are no padding bytes, then the 4 double elements will be contiguous, as you want.
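The no-padding assumption can at least be turned into a compile-time check. The sketch below assumes the struct S from the question; fill is a hypothetical helper showing one way to write all four values without running a single pointer from m1 into m2.

```cpp
#include <cstddef>

struct S {
    double m1[2];
    double m2[2];
};

// Compilation fails if the compiler inserts padding between or after the arrays.
static_assert(offsetof(S, m2) == offsetof(S, m1) + sizeof(double[2]),
              "unexpected padding between m1 and m2");
static_assert(sizeof(S) == 4 * sizeof(double), "unexpected trailing padding");

// Writes 0, 1, 2, 3 into m1[0], m1[1], m2[0], m2[1], indexing each array
// only within its own bounds.
inline void fill(S& obj) {
    double* parts[] = {obj.m1, obj.m2};
    int value = 0;
    for (double* part : parts)
        for (int i = 0; i < 2; ++i)
            part[i] = value++;
}
```

Note that the assertions only establish the layout. Pointer arithmetic that starts in m1 and steps past its end is still not something the language rules promise will work, so the helper avoids it.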

The C++ standard makes certain guarantees about the layout of "plain old data" (or in C++11, standard layout) types. For the most part, these inherit from how C treated such data.
What follows only applies to "plain old data"/"standard layout" structures and data.
If you have two structs with the same initial order and type of arguments, casting a pointer to one to a pointer to the other and accessing their common initial prefix is valid, and will access the corresponding field. This is known as "layout compatible". This also applies if you have a structure X and a structure Y, and X is the first element of the structure Y -- a pointer to Y can be cast to a pointer to X, and it will access the fields of the X substructure in Y.
Now, while it is a common assumption, I am unaware of a requirement in either C or C++ that a structure starting with fields of the same type and count as an array is layout compatible with that array.
Your case is somewhat similar, in that we have two arrays adjacent to each other in a structure, and you are treating it as one large array of size equal to the sum of those two arrays size. It is a relatively common and safe assumption that it works, but I am unaware of a guarantee in the standard that it actually works.
With this kind of undefined behavior, you have to examine your particular compiler's guarantees (de facto or explicit) about the layout of plain old data/standard layout data, as the C++ standard does not guarantee that your code does what you want it to do.

Only one array without a size allowed per struct?

I was writing a struct to describe a constant value I needed, and noticed something strange.
namespace res{
namespace font{
struct Structure{
struct Glyph{
int x, y, width, height, easement, advance;
};
int glyphCount;
unsigned char asciiMap[]; // <-- always generates an error
Glyph glyphData[]; // <-- never generates an error
};
const Structure system = {95,
{
// mapping data
},
{
// glyph spacing data
}
}; // system constructor
} // namespace font
} // namespace res
The last two members of Structure, the unsized arrays, do not stop the compiler if they are by themselves. But if they are both included in the struct's definition, it causes an error, saying the "type is incomplete"
This stops being a problem if I give the first array a size. Which isn't a problem in this case, but I'm still curious...
My question is, why can I have one unsized array in my struct, but two cause a problem?

In standard C++, you can't do this at all, although some compilers support it as an extension.
In C, every member of a struct needs to have a fixed position within the struct. This means that the last member can have an unknown size; but nothing can come after it, so there is no way to have more than one member of unknown size.
If you do take advantage of your compiler's non-standard support for this hack in C++, then beware that things may go horribly wrong if any member of the struct is non-trivial. An object can only be "created" with a non-empty array at the end by allocating a block of raw memory and reinterpreting it as this type; if you do that, no constructors or destructors will be called.

You are using a non-standard Microsoft extension. C11 (note: C, not C++) allows the last member of a structure to be an unsized array (read: at most one such array):
A Microsoft extension allows the last member of a C or C++ structure or class to be a variable-sized array. These are called unsized arrays. The unsized array at the end of the structure allows you to append a variable-sized string or other array, thus avoiding the run-time execution cost of a pointer dereference.
// unsized_arrays_in_structures1.cpp
// compile with: /c
struct PERSON {
unsigned number;
char name[]; // Unsized array
};
If you apply the sizeof operator to this structure, the ending array size is considered to be 0. The size of this structure is 2 bytes, which is the size of the unsigned member. To get the true size of a variable of type PERSON, you would need to obtain the array size separately.
The size of the structure is added to the size of the array to get the total size to be allocated. After allocation, the array is copied to the array member of the structure, as shown below:

The compiler needs to be able to decide on the offset of every member within the struct. That's why you're not allowed to place any further members after an unsized array. It follows from this that you can't have two unsized arrays in a struct.

It is an extension from Microsoft, and sizeof(structure) == sizeof(structure_without_variable_size_array).
I guess they use the initializer to find the size of the array. If you have two variable size arrays, you can't find it (equivalent to find one unique solution of a 2-unknown system with only 1 equation...)

Arrays without a dimension are not allowed in a struct,
period, at least in C++. In C, the last member (and only the
last) may be declared without a dimension, and some compilers
allow this in C++, as an extension, but you shouldn't count on
it (and in strict mode, they should at least complain about it).
Other compilers have implemented the same semantics if the last
element had a dimension of 0 (also an extension, requiring
a diagnostic in strict mode).
The reason for limiting incomplete array types to the last
element is simple: what would be the offset of any following
elements? Even when it is the last element, there are
restrictions to the use of the resulting struct: it cannot be
a member of another struct or an array, for example, and
sizeof ignores this last element.
