C++ Zero Length Arrays in Header File - c++

From ISO/IEC 14882:2003 8.3.4/1:
If the constant-expression (5.19) is present, it shall be an integral
constant expression and its value shall be greater than zero.
Therefore the following should not compile:
#pragma once
class IAmAClass
{
public:
    IAmAClass();
    ~IAmAClass();
private:
    int somearray[0]; // Zero sized array
};
But it does. However, the following:
#pragma once
class IAmAClass
{
public:
    IAmAClass();
    ~IAmAClass();
private:
    int somearray[0];
    int var = 23; // Added this expression
};
does not compile; Visual C++ reports the following error, as would be expected:
error C2229: class 'IAmAClass' has an illegal zero-sized array
When the code is in a function, it never compiles, in accordance with the standard.
So why does the code behave this way in a header file, where whether compilation passes or fails appears to depend on whether another member declaration follows the zero-sized array declaration?

The keyword in "If the constant-expression (5.19) is present," is if. It's not, so the first version compiles.
However, such variant arrays are only permissible (and sane) when they are the last element in a struct or class, where it's expected that they'll use extra space allocated to the struct on a case-by-case basis.
If an unknown-length array were allowed before other elements, how would other code know where in memory to find those elements?

This is a Visual C++ language extension: Declaring Unsized Arrays in Member Lists. From the linked MSDN page:
Unsized arrays can be declared as the last data member in class member lists if the program is not compiled with the ANSI-compatibility option (/Za)
Edit: If the member has been declared as a zero-sized array (like int somearray[0];) instead of an array of unknown bounds (like int somearray[];), this is still a language extension, albeit a different one
A zero-sized array is legal only when the array is the last field in a struct or union and when the Microsoft extensions (/Ze) are enabled.
This extension is similar to C99's flexible array members (C11/n1570 §6.7.2.1/18):
As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member.
and /20 contains an example:
EXAMPLE 2 After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical
way to use this is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to
by p behaves, for most purposes, as if p had been declared as:
struct { int n; double d[m]; } *p;
[...]
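Standard C++ has no flexible array members, but the allocation pattern from the malloc example above can be approximated with raw storage and placement new. The following is only a sketch under stated assumptions: the names Packet, make_packet, packet_data and destroy_packet are hypothetical (they do not come from the question or the standard), and the trailing elements are assumed to be trivially destructible doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>

// Hypothetical header that precedes the trailing doubles. alignas(double)
// makes sizeof(Packet) a multiple of alignof(double), so the storage that
// follows the header is suitably aligned for double.
struct alignas(double) Packet {
    std::size_t n;  // number of trailing doubles
};

// Address of the first trailing double (the storage right after the header).
inline double* packet_data(Packet* p) {
    assert(p != nullptr);
    unsigned char* raw = reinterpret_cast<unsigned char*>(p);
    return reinterpret_cast<double*>(raw + sizeof(Packet));
}

// Allocate header and payload in one block, in the spirit of
// malloc(sizeof(struct s) + sizeof(double[m])) from the C example.
inline Packet* make_packet(std::size_t n) {
    void* raw = ::operator new(sizeof(Packet) + n * sizeof(double));
    Packet* p = ::new (raw) Packet{n};
    unsigned char* payload = static_cast<unsigned char*>(raw) + sizeof(Packet);
    // Construct each trailing element explicitly, zero-initialized.
    std::uninitialized_fill_n(reinterpret_cast<double*>(payload), n, 0.0);
    return p;
}

// double is trivially destructible, so releasing the block is enough.
inline void destroy_packet(Packet* p) {
    p->~Packet();
    ::operator delete(p);
}
```

Unlike the compiler extension, this keeps the variable-length part out of the type itself, so sizeof(Packet) and every member offset remain compile-time constants.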

Related

Incomplete type error when store an array of T in the struct T

Why this is allowed:
// 1
struct S {
std::vector<S> v;
};
// 2
struct T {
T* ptr;
};
and this is not:
// 3
struct X {
X arr[];
};
Here is the error for the third example (clang-1001.0.46.3 compiler):
sample.cpp:9:4: error: field has incomplete type 'X'
X arr[];
^
sample.cpp:8:8: note: definition of 'X' is not complete until the closing '}'
struct X {
^
I understand why a fixed-size array of T is not allowed inside struct T: it would require sizeof(T) = sizeof(T)*array_size + size_of_other_members. But why does it compile fine with std::vector<T> and not with T[]?
For a variable to be defined the compiler needs to know the size of the variable.
In the first two examples, what you have are effectively pointers to S and T (remember that std::vector allocates its memory dynamically on the heap, and therefore only needs to store pointers), which is okay because the compiler knows the size of a pointer.
In the third example, the type X (the structure) isn't fully defined yet when you use it, so the compiler doesn't know the size of X. Furthermore, in C++ you can't have "empty" arrays; all arrays must have a compile-time fixed size.
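To illustrate the point about sizes, here is a minimal sketch (the type names Node and Link and the helper count_nodes are hypothetical). It assumes C++17, where std::vector may be instantiated with an incomplete element type. Both members have a size that is fixed no matter how many elements they refer to, which is why they may appear inside the type they refer to.

```cpp
#include <cstddef>
#include <vector>

struct Node {
    std::vector<Node> children;  // fixed-size handle; the elements live on the heap
};

struct Link {
    Link* next = nullptr;  // a pointer has a known size even while Link is incomplete
};

// sizeof is a compile-time constant for both types.
static_assert(sizeof(Link) >= sizeof(Link*), "Link holds one pointer");
static_assert(sizeof(Node) >= sizeof(std::vector<Node>), "Node holds one vector");

// Count a node and all of its descendants.
inline std::size_t count_nodes(const Node& n) {
    std::size_t total = 1;
    for (const Node& child : n.children) total += count_nodes(child);
    return total;
}
```

Adding children changes what the vector points to, not sizeof(Node).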
This is called a flexible array member; it was introduced in the C99 standard of the C programming language.
So in C the following is possible:
struct IntContainer
{
size_t length;
int arr[];
};
But C++ does not have flexible array members.
For more information, see the Wikipedia article "Flexible array member".

Is there a standard-compliant way to determine the alignment of a non-static member?

Suppose I have some structure S and a non-static member member, as in this example:
struct S { alignas(alignof(void *)) char member[sizeof(void *)]; };
How do you get the alignment of member?
The alignof operator can only be applied to complete types, not expressions (see 7.6.2.5.1), although GCC accepts alignof(S::member) as an extension, and Clang supports it too.
What is the "language-lawyerly" standard way to do it without this restriction?
Also, sizeof allows expression arguments, is there a reason for the asymmetry?
The practical concern is to be able to get the alignment of members of template structures, you can do decltype to get their type, sizeof to get their size, but then you also need the alignment.
The alignment of a type or variable is a description of what memory addresses the variable can inhabit—the address must be a multiple of the alignment*. However, for data-members, the address of the data-member can be any K * alignof(S) + offsetof(S, member). Let's define the alignment of a data-member to be the maximum possible integer E such that &some_s.member is always a multiple of E.
Given a type S with member member, let A = alignof(S), O = offsetof(S, member).
The valid addresses of S{}.member are V = K * A + O for some integer K.
V = K * A + O = gcd(A, O) * (K * A / gcd(A, O) + O / gcd(A, O)).
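gcd(A, O) divides every such V. Taking K = 0 and K = 1 gives V = O and V = A + O, so any factor common to all valid addresses must divide both O and A; no factor larger than gcd(A, O) exists.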
Thus, gcd(A, O) is the best factor valid for unknown K.
In other words, "alignof(S.member)" == gcd(alignof(S), offsetof(S, member)).
Note that this alignment is always a power of two, as alignof(S) is always a power of two.
*: In my brief foray into the standard, I couldn't find this guarantee, meaning that the address of the variable could be K * alignment + some_integer. However, this doesn't affect the final result.
We can define a macro to compute the alignment of a data-member:
#include <cstddef> // for offsetof(...)
#include <numeric> // for std::gcd
// Must be a macro, as `offsetof` is a macro because the member name must be known
// at preprocessing time.
#define ALIGNOF_MEMBER(cls, member) (::std::gcd(alignof(cls), offsetof(cls, member)))
This is only guaranteed valid for standard layout types, as offsetof is only guaranteed valid for standard layout types. If the class is not standard layout, this operation is conditionally supported.
Example:
#include <cstddef>
#include <numeric>
struct S1 { char foo; alignas(alignof(void *)) char member[sizeof(void *)]; };
struct S2 { char foo; char member[sizeof(void *)]; };
#define ALIGNOF_MEMBER(cls, member) (::std::gcd(alignof(cls), offsetof(cls, member)))
int f1() { return ALIGNOF_MEMBER(S1, member); } // returns alignof(void *) == 8
int f2() { return ALIGNOF_MEMBER(S1, foo); } // returns 8*
int f3() { return ALIGNOF_MEMBER(S2, member); } // returns 1
// *: alignof(S1) == 8, so the `foo` member must always be at an alignment of 8
I don't think it's possible. In the general case, declaring a non-static data member with an alignment specifier might not change the layout of the class that contains it. In the below example, if (as is most common) int has a size and alignment of 4, the structs S1 and S2 are likely to have the same layout, with a total size of 8 bytes. Each is likely to have 3 bytes of padding at the end:
struct S1 {
int x;
char y;
};
struct S2 {
int x;
alignas(4) char y;
};
This prevents us from using any information about the layout of the struct to determine the alignment of y. And as the OP noted, alignof(S::member) isn't valid.
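The claim that the two structs are likely to share a layout can be checked on a given implementation with a small sketch like the one below. The names Plain, Aligned and layouts_coincide are hypothetical stand-ins for S1 and S2, and the result is implementation-specific, not something the standard guarantees.

```cpp
#include <cstddef>

struct Plain   { int x; char y; };
struct Aligned { int x; alignas(4) char y; };

// True when size, alignment and the offset of y coincide on this
// implementation. If they do, the layout alone cannot reveal that
// Aligned::y was declared with a stricter alignment.
constexpr bool layouts_coincide() {
    return sizeof(Plain) == sizeof(Aligned) &&
           alignof(Plain) == alignof(Aligned) &&
           offsetof(Plain, y) == offsetof(Aligned, y);
}
```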
By the way, there also isn't any way to query the alignment specifier of a regular variable. You can use the std::align function to check whether the variable is allocated at an address that is appropriately aligned for an object with alignment X, but this doesn't imply that the variable was actually declared with an alignment of X or greater. It could have been declared with an alignment less than X and coincidentally ended up allocated at an address that could have supported an object with alignment X.
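A minimal sketch of the std::align check described above follows; the helper name is_aligned_to is hypothetical. As the paragraph notes, a true result only says that the address happens to be suitably aligned, not that the object was declared with that alignment.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Returns true if p is a multiple of `alignment`, which must be a power of two.
inline bool is_aligned_to(const void* p, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    void* candidate = const_cast<void*>(p);
    std::size_t space = alignment;  // room for 1 byte plus the worst-case adjustment
    // std::align moves `candidate` forward to the next aligned address;
    // if it did not have to move, p was already aligned.
    return std::align(alignment, 1, candidate, space) == p;
}
```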
Since this functionality is unsupported not only for non-static data members but also regular variables, I'm inclined to think that it's not an oversight; it's deliberately not supported because it's not useful. The compiler needs to know the alignment specifier so that it can allocate the variable or data member appropriately. That is not the programmer's job. Sure, the programmer may need to know the alignment requirement of a type in order to appropriately allocate memory for instances of that type, but you cannot, as the programmer, create additional instances of a variable, other than by triggering some condition that makes it happen automatically (e.g., continuing to the next iteration of a loop will deallocate and reallocate automatic variables in the loop's body). Nor can you, as of now, create a second class at compile time that's guaranteed to be layout-compatible with a given class, which is the main application I can think of for the hypothetical "query alignment of non-static data member" feature. I expect that, once C++ provides enough other reflection functionality so that something like that is close to possible, someone will also put forth a realistic proposal to add a way to query the alignment of a non-static data member.

Class with incomplete char array [duplicate]

Why does C permit this:
typedef struct s
{
int arr[];
} s;
where the array arr has no size specified?
This is a C99 feature called flexible array members. Its main purpose is to allow variable-length-array-like behavior inside a struct. R.., in an answer to another question on flexible array members, provides a list of benefits of using flexible arrays over pointers. The draft C99 standard, in section 6.7.2.1 Structure and union specifiers, paragraph 16, says:
As a special case, the last element of a structure with more than one named member may
have an incomplete array type; this is called a flexible array member. In most situations,
the flexible array member is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more trailing padding than
the omission would imply. [...]
So if you had an s*, you would allocate space for the array in addition to the space required for the struct; usually you would have other members in the structure:
s *s1 = malloc( sizeof(struct s) + n*sizeof(int) ) ;
The draft standard actually has an instructive example in paragraph 17:
EXAMPLE After the declaration:
struct s { int n; double d[]; };
the structure struct s has a flexible array member d. A typical way to use this
is:
int m = /* some value */;
struct s *p = malloc(sizeof (struct s) + sizeof (double [m]));
and assuming that the call to malloc succeeds, the object pointed to by p
behaves, for most purposes, as if p had been declared as:
struct { int n; double d[m]; } *p;
(there are circumstances in which this equivalence is broken; in particular, the
offsets of member d might not be the same).
You are probably looking for flexible arrays in C99. Flexible array members are members of unknown size at the end of a struct/union.
As a special case, the last element of a structure with more than one
named member may have an incomplete array type; this is called a
flexible array member. In most situations, the flexible array member
is ignored. In particular, the size of the structure is as if the
flexible array member were omitted except that it may have more
trailing padding than the omission would imply.
You may also look at the reason for the struct hack in the first place.
It's not clear if it's legal or portable, but it is rather popular. An implementation of the technique might look something like this:
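#include <stdlib.h>
#include <string.h>
/* Declaration assumed by the function below; it is not shown in the
   excerpt, but the "-1 for initial [1]" comment implies this shape. */
struct name {
    int namelen;
    char namestr[1];
};
struct name *makename(char *newname)
{
    struct name *ret =
        malloc(sizeof(struct name)-1 + strlen(newname)+1);
        /* -1 for initial [1]; +1 for \0 */
    if(ret != NULL) {
        ret->namelen = strlen(newname);
        strcpy(ret->namestr, newname);
    }
    return ret;
}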
This function allocates an instance of the name structure with the
size adjusted so that the namestr field can hold the requested name
(not just one character, as the structure declaration would suggest).
Despite its popularity, the technique is also somewhat notorious -
Dennis Ritchie has called it "unwarranted chumminess with the C implementation." An official interpretation has deemed that it is NOT
strictly conforming with the C Standard, although it does seem to work
under all known implementations. Compilers that check array bounds
carefully might issue warnings.

Why does this variable need to be static?

class armon
{
    static const int maxSize=10;
    int array[maxSize];
    int count=0;
    int* topOfStack=array;
};
Why does maxSize need to be static for it to be used inside array?
It doesn't have to be static, but it must be a constant expression.
C++ Standard § 8.3.4 [dcl.array] (emphasis mine) :
If the constant-expression (5.19) is present, it shall be a converted constant expression of type std::size_t and its value shall be greater than zero
That is, the following is also valid :
#include <cstddef> // for std::size_t
constexpr std::size_t Size() { return 10; }
struct Y
{
int array[Size()];
};
Note:
Since the compiler needs to know the size of the class, you cannot do this :
struct Y
{
const int size = 10;
int array[size];
};
Otherwise, different instances of Y could have different sizes.
Also note that in this context, size is not a constant expression, because it makes use of this; see the C++ standard, § 5.19 [expr.const]:
A conditional-expression e is a core constant expression unless the evaluation of e, following the rules of the abstract machine (1.9), would evaluate one of the following expressions:
— this (5.1.1), except in a constexpr function or a constexpr constructor that is being evaluated as part of e;
(the evaluation of size is really this->size)
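As a sketch of the valid alternatives (all type and function names here are hypothetical), an array bound can come from a static constexpr member, a constexpr function, or a template parameter. None of these needs this to be evaluated.

```cpp
#include <cstddef>

constexpr std::size_t default_capacity() { return 10; }

struct FromStatic {
    static constexpr std::size_t maxSize = 10;  // one value shared by the class
    int array[maxSize];
};

struct FromFunction {
    int array[default_capacity()];  // call to a constexpr function
};

template <std::size_t N>
struct FromParameter {
    int array[N];  // bound fixed per instantiation
};

// Number of elements in a built-in array, deduced at compile time.
template <typename T, std::size_t N>
constexpr std::size_t length(const T (&)[N]) { return N; }
```

In each case the size of the class is known before any object exists, which is the property the non-static const int version lacks.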
There are two aspects to this question
Aspect 1
A C++ array has a fixed size, which must be known at compile time. If the decision has to be deferred until run time, the array declaration is ill-formed.
Aspect 2
Declaring a member variable as non-static makes it an instance variable, whose value exists only once an object is instantiated, which happens at run time. A static variable is a class variable, whose value can be determined at compile time.
Without static, your particular example (shown again here) becomes the classic chicken-and-egg paradox.
class armon
{
    static const int maxSize=10;
    int array[maxSize];
};
In order to instantiate your class armon, you need to know its size.
In order to know its size, you need to know the size of individual members. In your particular case, you need to know the size of the array.
In order to know the size of the array, you need to know the value of the dependent variable maxSize.
In order to access the dependent variable maxSize you need to instantiate the class armon.
In order to instantiate your class armon, you need to know its size.
So the variable that your array size depends on must be a constant expression, which in your particular case means it should be a static variable.
It doesn't have to be static, it has to be constant.
When you declare a constant inside a class, you make a separate constant for each instance of the class.
Also, if your maxSize is just const, you have to initialize it in the constructor initializer list, because a const maxSize is treated as a variable whose value you can't change.
Inside a class, the const keyword means "this is a constant during the whole lifetime of this object". Different objects of the same class can have different values for that constant.
But when it is a static constant, there is only one constant for all instances of the class. This means that you have to initialize the constant's value on the same line where you define it.
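The difference can be seen in a short sketch. The class Stack and its members are hypothetical and only loosely modelled on armon: the per-object const member is set in the constructor initializer list, while the static constant is shared and can serve as an array bound.

```cpp
#include <cassert>

struct Stack {
    static const int sharedMax = 10;  // one value for the whole class; valid array bound
    const int limit;                  // fixed per object, chosen at construction
    int storage[sharedMax];

    explicit Stack(int lim) : limit(lim), storage{} {
        assert(lim >= 0 && lim <= sharedMax);
    }
};
```

Two Stack objects can carry different limit values, but both have the same sizeof, because only sharedMax takes part in the layout.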

Only one array without a size allowed per struct?

I was writing a struct to describe a constant value I needed, and noticed something strange.
namespace res{
    namespace font{
        struct Structure{
            struct Glyph{
                int x, y, width, height, easement, advance;
            };
            int glyphCount;
            unsigned char asciiMap[]; // <-- always generates an error
            Glyph glyphData[]; // <-- never generates an error
        };
        const Structure system = {95,
            {
                // mapping data
            },
            {
                // glyph spacing data
            }
        }; // system constructor
    } // namespace font
} // namespace res
The last two members of Structure, the unsized arrays, do not stop the compiler if they are by themselves. But if they are both included in the struct's definition, it causes an error, saying the "type is incomplete".
This stops being a problem if I give the first array a size. Which isn't a problem in this case, but I'm still curious...
My question is, why can I have one unsized array in my struct, but two cause a problem?
In standard C++, you can't do this at all, although some compilers support it as an extension.
In C, every member of a struct needs to have a fixed position within the struct. This means that the last member can have an unknown size; but nothing can come after it, so there is no way to have more than one member of unknown size.
If you do take advantage of your compiler's non-standard support for this hack in C++, then beware that things may go horribly wrong if any member of the struct is non-trivial. An object can only be "created" with a non-empty array at the end by allocating a block of raw memory and reinterpreting it as this type; if you do that, no constructors or destructors will be called.
You are using a non-standard Microsoft extension. C11 (note: C, not C++) allows the last array in a structure to be unsized (read: at most one array):
A Microsoft extension allows the last member of a C or C++ structure or class to be a variable-sized array. These are called unsized arrays. The unsized array at the end of the structure allows you to append a variable-sized string or other array, thus avoiding the run-time execution cost of a pointer dereference.
// unsized_arrays_in_structures1.cpp
// compile with: /c
struct PERSON {
unsigned number;
char name[]; // Unsized array
};
If you apply the sizeof operator to this structure, the ending array size is considered to be 0. The size of this structure is 2 bytes, which is the size of the unsigned member. To get the true size of a variable of type PERSON, you would need to obtain the array size separately.
The size of the structure is added to the size of the array to get the total size to be allocated. After allocation, the array is copied to the array member of the structure.
The compiler needs to be able to decide on the offset of every member within the struct. That's why you're not allowed to place any further members after an unsized array. It follows from this that you can't have two unsized arrays in a struct.
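For constant data like the font table in the question, one standard-conforming alternative is to make both bounds template parameters, so every member gets a compile-time offset. This is only a sketch: FontTable, SmallFont and small_font are hypothetical names, and the glyph values are made-up sample data.

```cpp
#include <cstddef>

struct Glyph {
    int x, y, width, height, easement, advance;
};

// Both array bounds are template parameters, so every member has a
// compile-time offset and more than one array can be stored inline.
template <std::size_t MapSize, std::size_t GlyphCount>
struct FontTable {
    int glyphCount;
    unsigned char asciiMap[MapSize];
    Glyph glyphData[GlyphCount];
};

// An alias is needed because offsetof is a macro and would split on the comma.
using SmallFont = FontTable<4, 2>;

constexpr std::size_t map_offset = offsetof(SmallFont, asciiMap);
constexpr std::size_t glyph_offset = offsetof(SmallFont, glyphData);
static_assert(glyph_offset >= map_offset + 4, "glyphData starts after asciiMap");

// Sample data only; real tables would be generated elsewhere.
constexpr SmallFont small_font = {
    2,
    {'a', 'b', 0, 0},
    {{0, 0, 5, 7, 0, 6}, {5, 0, 5, 7, 0, 6}},
};
```

The trade-off is that each combination of sizes is a distinct type, so code that handles fonts generically has to be templated or work through pointers and lengths.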
It is an extension from Microsoft, and sizeof(structure) == sizeof(structure_without_variable_size_array).
I guess they use the initializer to find the size of the array. If you have two variable size arrays, you can't find it (equivalent to find one unique solution of a 2-unknown system with only 1 equation...)
Arrays without a dimension are not allowed in a struct,
period, at least in C++. In C, the last member (and only the
last) may be declared without a dimension, and some compilers
allow this in C++, as an extension, but you shouldn't count on
it (and in strict mode, they should at least complain about it).
Other compilers have implemented the same semantics if the last
element had a dimension of 0 (also an extension, requiring
a diagnostic in strict mode).
The reason for limiting incomplete array types to the last
element is simple: what would be the offset of any following
elements? Even when it is the last element, there are
restrictions to the use of the resulting struct: it cannot be
a member of another struct or an array, for example, and
sizeof ignores this last element.