How is memory allocated for a static multi-dimensional array?

All,
This has been bugging me for a while now. In C/C++ (and, I guess, Java and .NET as well) we do not have to specify the first dimension (the number of rows) of a multi-dimensional array.
So, for example, I can declare an array of ints as such:
int Array[][100];
I think static arrays in general are represented as contiguous memory on the stack. So, taking a column-major representation, how does the compiler know how much memory to allocate in the above case as it's missing one of the dimensions?

In C++ you can't just write
int Array[][100]; /* ERROR: incomplete type */
because that would be a definition of an object of incomplete type, which is explicitly illegal in C++. You can use that in a non-defining declaration
extern int Array[][100];
(or as a static member of a class), but when it comes to the actual definition of the same array object, both sizes have to be specified explicitly (or derived from an explicit initializer).
In C the situation is not much different, except that in C there are such things as tentative definitions which let you write
int Array[][100];
However, a tentative definition in this regard is pretty similar to a non-defining declaration, which is why it is allowed. Eventually you will have to define the same object with an explicitly specified size in the same translation unit (some compilers don't require that, as a non-standard extension). If you try something like that in a non-tentative definition, you'll get an error
static int Array[][100]; /* ERROR: incomplete type */
So, if you think of it, aside from tentative definitions, the situation in C and C++ is not much different: it is illegal to define objects of incomplete type in these languages and an array of unspecified size is an incomplete type.
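As a rough illustration of the C++ rules described above, the sketch below shows a non-defining declaration with the first bound omitted, a later definition that completes the type, and a first bound derived from an initializer. The names Table, Deduced and cell are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Non-defining declaration: the first bound may be omitted (incomplete type).
extern int Table[][100];

// The definition must supply every bound; this completes the type.
int Table[3][100];

// Alternatively, the first bound is derived from an explicit initializer.
int Deduced[][2] = {{1, 2}, {3, 4}, {5, 6}};

// Row counts computed from the now-complete array types.
constexpr std::size_t table_rows = sizeof(Table) / sizeof(Table[0]);
constexpr std::size_t deduced_rows = sizeof(Deduced) / sizeof(Deduced[0]);

static_assert(table_rows == 3, "bound comes from the definition");
static_assert(deduced_rows == 3, "bound comes from the initializer");
static_assert(sizeof(Table) == 3 * 100 * sizeof(int),
              "one contiguous block, no padding between rows");

// Bounds-checked access to one element (hypothetical helper).
int& cell(std::size_t r, std::size_t c) {
    assert(r < table_rows && c < 100);
    return Table[r][c];
}
```

In both cases the compiler knows the full size at the point of definition, which is all it needs in order to reserve storage.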

In Java and .NET, don't think about "the stack" -- objects live on the heap. And in C, that's just a declaration -- only a definition actually reserves memory! So that would NOT be an acceptable definition -- if you put it as the only line in the file a.c:
$ gcc -c a.c
a.c:1: warning: array ‘Array’ assumed to have one element
so gcc is just treating it as if it were int Array[1][100];, as it warns you it's doing.

It does not know how much memory to allocate. What it knows with array[] in a function parameter is that array is a pointer (like int *array). Likewise, array[][100] as a parameter is not the same as array[100]: it is a pointer to an array of 100 ints, i.e. int (*array)[100].
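One way to check what a parameter spelled array[][100] really is: the sketch below compares the function types the compiler sees. The function names are made up for illustration, and this only concerns function parameters; a definition such as int Array[3][100] is a real array, not a pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical declarations that differ only in how the parameter is spelled.
void takes_rows(int array[][100]);
void takes_row_pointer(int (*array)[100]);
void takes_flat(int array[100]);
void takes_int_pointer(int* array);

// A parameter of array type is adjusted to a pointer to the element type.
static_assert(std::is_same<decltype(takes_rows), decltype(takes_row_pointer)>::value,
              "int array[][100] becomes int (*)[100]");
static_assert(std::is_same<decltype(takes_flat), decltype(takes_int_pointer)>::value,
              "int array[100] becomes int*");
static_assert(!std::is_same<decltype(takes_rows), decltype(takes_flat)>::value,
              "a pointer to a row of 100 ints is not a pointer to int");

// Sums one row. The 100 is part of the pointer type, so rows[r] steps by whole rows.
long sum_row(int rows[][100], std::size_t r) {
    assert(rows != nullptr);
    long total = 0;
    for (std::size_t c = 0; c < 100; ++c) total += rows[r][c];
    return total;
}
```

The omitted bound is the only one the callee does not need: the row length is enough to compute the address of rows[r][c].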

Related

How can this structure have sizeof == 0?

There is an old post asking for a construct for which sizeof would return 0. There are some high score answers from high reputation users saying that by the standard no type or variable can have sizeof 0. And I agree 100% with that.
However there is this new answer which presents this solution:
struct ZeroMemory {
int *a[0];
};
I was just about to down-vote and comment on it, but time spent here taught me to check even the things that I am 100% sure on. So... to my surprise both gcc and clang show the same results: sizeof(ZeroMemory) == 0. Even more, sizeof a variable is 0:
ZeroMemory z{};
static_assert(sizeof(z) == 0); // Awkward...
Whaaaat...?
Godbolt link
How is this possible?

Before C was standardized, many compilers would have had no difficulty handling zero-size types as long as code never tried to subtract one pointer to a zero-size type from another. Such types were useful, and supporting them was easier and cheaper than forbidding them. Other compilers decided to forbid such types, however, and some static-assertion code may have relied upon the fact that they would squawk if code tried to create a zero-sized array. The authors of the Standard were faced with a choice:
1. Allow compilers to silently accept zero-sized array declarations, even in cases where the purpose of such declarations would be to trigger a diagnostic and abort compilation, and require that all compilers accept such declarations (though not necessarily silently) as producing zero-sized objects.
2. Allow compilers to silently accept zero-sized array declarations, even in cases where the purpose of such declarations would be to trigger a diagnostic and abort compilation, and allow compilers encountering such declarations to either abort compilation or continue it at their leisure.
3. Require that implementations issue a diagnostic if code declares a zero-sized array, but then allow implementations to either abort compilation or continue it (with whatever semantics they see fit) at their leisure.
The authors of the Standard opted for #3. Consequently, zero-sized array declarations are regarded by the Standard as an "extension", even though such constructs were widely supported before the Standard forbade them.
The C++ Standard allows for the existence of empty objects, but in an effort to allow the addresses of empty objects to be usable as tokens it mandates that they have a minimum size of 1. For an object that has no members to have a size of 0 would thus violate the Standard. If an object contains zero-sized members, however, the C++ Standard imposes no requirements about how it is processed beyond the fact that a program containing such a declaration must trigger a diagnostic. Since most code that uses such declarations expects the resulting objects to have a size of zero, the most useful behavior for compilers receiving such code is to treat them that way.

As pointed out by Jarod42, zero-size arrays are not standard C++ but a GCC and Clang extension.
Adding -pedantic produces this warning:
5 : <source>:5:12: warning: zero size arrays are an extension [-Wzero-length-array]
int *a[0];
^
I always forget that -std=c++XX (instead of -std=gnu++XX) doesn't disable all extensions.
This still doesn't explain the sizeof behavior. But at least we know it's not standard...

In C++, a zero-size array is illegal.
ISO/IEC 14882:2003 8.3.4/1:
[..] If the constant-expression (5.19) is present, it shall be an integral constant expression and its value shall be greater than zero. The constant expression specifies the bound of (number of elements in) the array. If the value of the constant expression is N, the array has N elements numbered 0 to N-1, and the type of the identifier of D is “derived-declarator-type-list array of N T”. [..]
g++ requires the -pedantic flag to give a warning on a zero-sized array.

Zero length arrays are an extension by GCC and Clang. Applying sizeof to zero-length arrays evaluates to zero.
An empty C++ class can't have size 0, but note that the class ZeroMemory is not empty. It has a named member of size 0, so applying sizeof to it returns zero.
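For contrast, here is a minimal sketch of the standard-conforming case, using only an ordinary empty class and no zero-length array extension. The names Empty, distinct_addresses and array_bytes are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

struct Empty {};  // no members at all

static_assert(sizeof(Empty) >= 1,
              "in standard C++ a complete object type never has size 0");

// Two distinct objects of the same type have distinct addresses.
bool distinct_addresses() {
    Empty a, b;
    return &a != &b;
}

// Elements of an array of empty objects are still separate objects,
// so the array occupies at least one byte per element.
std::size_t array_bytes() {
    Empty many[4];
    assert(&many[0] != &many[1]);
    return sizeof(many);
}
```

The nonzero size is what lets addresses of empty objects serve as distinct tokens, which is the motivation given in the earlier answer.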

Memory-layout compatibility between C and C++

I'm building a C++ library which uses many functions and structs defined in a C library. To avoid porting any code to C++, I add the typical conditional preprocessing to the C header files. For example,
//my_struct.h of the C library
#include <complex.h>
#ifdef __cplusplus
extern "C" {
#endif
typedef struct {
double d1,d2,d3;
#ifdef __cplusplus
std::complex<double> z1,z2,z3;
std::complex<double> *pz;
#else
double complex z1,z2,z3;
double complex *pz;
#endif
int i,j,k;
} my_struct;
//Memory allocating + initialization function
my_struct *
alloc_my_struct(double);
#ifdef __cplusplus
}
#endif
The implementation of alloc_my_struct() is compiled in C. It simply allocates memory via malloc() and initializes members of my_struct.
Now when I do the following in my C++ code,
#include "my_struct.h"
...
my_struct *const ms = alloc_my_struct(2.);
I notice that *ms always has the expected memory layout, i.e., any access such as ms->z1 evaluates to the expected value. I find this really cool considering that (correct me if I'm wrong) the memory layout of my_struct during allocation is decided by the C compiler (in my case gcc -std=c11), while during access it is decided by the C++ compiler (in my case g++ -std=c++11).
My question is : Is this compatibility standardized? If not, is there any way around it?
NOTE : I don't have enough knowledge to argue against alignment, padding, and other implementation-defined specifics. But it is noteworthy that the GNU scientific library, which is C-compiled, is implementing the same approach (although their structs do not involve C99 complex numbers) for use in C++. On the other hand, I've done sufficient research to conclude that C++11 guarantees layout compatibility between C99 double complex and std::complex<double>.

C and C++ do share memory layout rules. In both languages structs are placed in memory in the same way. And even if C++ did want to do things a little differently, placing the struct inside extern "C" {} guarantees C layout.
But what your code is doing relies on C++ std::complex and C99 complex being the same.
So see:
https://gcc.gnu.org/ml/libstdc++/2007-02/msg00161.html
C Complex Numbers in C++?

Your program has undefined behaviour: your definitions of my_struct are not lexically identical.
You're gambling that alignment, padding and various other things will not change between the two compilers, which is bad enough… but since this is UB anything could happen even if it were true!

It may not always be identical!
In this case it looks like sizeof(std::complex<double>) is identical to sizeof(double complex).
Also pay attention to the fact that the compilers may (or may not) add padding to the structs to align them to a specific value, depending on the optimization configuration. The padding may not always be identical, resulting in different structure sizes (between C and C++).
Links to related posts:
C/C++ Struct memory layout equivalency
I would add compiler-specific attributes to "pack" the fields,
thereby guaranteeing all the ints are adjacent and compact. This is
less about C vs. C++ and more about the fact that you are likely using
two "different" compilers when compiling in the two languages, even if
those compilers come from a single vendor.
Adding a constructor will not change the layout (though it will make
the class non-POD), but adding access specifiers like private between
the two fields may change the layout (in practice, not only in
theory).
C struct memory layout?
In C, the compiler is allowed to dictate some alignment for every
primitive type. Typically the alignment is the size of the type. But
it's entirely implementation-specific.
Padding bytes are introduced so every object is properly aligned.
Reordering is not allowed.
Possibly every remotely modern compiler implements #pragma pack which
allows control over padding and leaves it to the programmer to comply
with the ABI. (It is strictly nonstandard, though.)
From C99 §6.7.2.1:
12 Each non-bit-field member of a structure or union object is aligned
in an implementation-defined manner appropriate to its type.
13 Within a structure object, the non-bit-field members and the units
in which bit-fields reside have addresses that increase in the order
in which they are declared. A pointer to a structure object, suitably
converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa.
There may be unnamed padding within a structure object, but not at its
beginning.

In general, C and C++ have compatible struct layouts, because the layout is dictated by the platform's ABI rules, not just by the language, and (for most implementations) C and C++ follow the same ABI rules for type sizes, data layout, calling conventions etc.
C++11 even defined a new term, standard-layout, which means the type will have a compatible layout to a similar type in C. That means it can't use virtual functions, private data members, multiple inheritance (and a few other things). A C++ standard-layout type should have the same layout as an equivalent C type.
As noted in other answers, your specific code is not safe in general because std::complex<double> and complex double are not equivalent types, and there is no guarantee that they are layout-compatible. However GCC's C++ standard library ensures it will work because std::complex<double> and std::complex<float> are implemented in terms of the underlying C types. Instead of containing two double, GCC's std::complex<double> has a single member of type __complex__ double, which the compiler implements identically to the equivalent C type.
GCC does this specifically to support code like yours, because it's a reasonable thing to want to do.
So combining GCC's special efforts for std::complex with the standard-layout rules and the platform ABI means that your code will work with that implementation.
This is not necessarily portable to other C++ implementations.
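A small sketch of how some of these assumptions could be checked from the C++ side. Sample is a hypothetical stand-in for the C++ view of such a struct, not the asker's my_struct. The helper functions rely on the array-oriented access that C++11 requires for std::complex (element 0 is the real part, element 1 the imaginary part), which is the same element order C99 specifies for double complex.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for the C++ view of a C struct with a complex member.
struct Sample {
    double d1;
    std::complex<double> z1;
    int i;
};

// Whether Sample is standard-layout depends on how the library implements
// std::complex, so this is reported at run time rather than asserted.
bool sample_is_standard_layout() {
    return std::is_standard_layout<Sample>::value;
}

// Array-oriented access to a std::complex<double>, as permitted since C++11.
double real_via_array(const std::complex<double>& z) {
    return reinterpret_cast<const double(&)[2]>(z)[0];
}
double imag_via_array(const std::complex<double>& z) {
    return reinterpret_cast<const double(&)[2]>(z)[1];
}

// Small self-check exercising the helpers above.
void complex_layout_check() {
    const std::complex<double> z(3.0, 4.0);
    assert(real_via_array(z) == 3.0);
    assert(imag_via_array(z) == 4.0);
}
```

Checks like these cover only the complex members themselves. Padding and alignment of the enclosing struct still come from the platform ABI that both compilers follow.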

Also note that by using malloc() to allocate a struct containing a C++ object (std::complex<double>) you skip the constructor, and this is also UB. Even if you expect the constructor to be empty, or to just zero the value, and therefore to be harmless to skip, you can't complain if this breaks. So your program works by pure luck.
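If the allocation can be done on the C++ side, one possible way to keep malloc() while still running the constructor is placement new. This is only a sketch under that assumption. Holder, make_holder and destroy_holder are hypothetical names, and it does not apply when the C library itself performs the allocation.

```cpp
#include <cassert>
#include <complex>
#include <cstdlib>
#include <new>

// Hypothetical struct with a C++ class-type member.
struct Holder {
    std::complex<double> z;
    int i;
};

// Get raw storage from malloc, then start the object's lifetime explicitly
// with placement new so that the members are actually constructed.
Holder* make_holder(double re) {
    void* raw = std::malloc(sizeof(Holder));
    if (raw == nullptr) return nullptr;
    return new (raw) Holder{std::complex<double>(re, 0.0), 0};
}

// Mirror image: run the destructor, then release the raw storage.
void destroy_holder(Holder* h) {
    if (h == nullptr) return;
    h->~Holder();
    std::free(h);
}

// Small self-check exercising both helpers.
void holder_demo() {
    Holder* h = make_holder(2.0);
    assert(h != nullptr);
    assert(h->z.real() == 2.0 && h->z.imag() == 0.0 && h->i == 0);
    destroy_holder(h);
}
```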

Initializing an array

I am doing the following to initialize an array in C++:
int a;
cin>>a;
float b[a];
This compiles and works on my computer. Is this correct? I thought that we could only do this if a was a const int.

Depends on your definition of "correct".
This is called variable-length array (or just VLA) and it's not officially supported in the current versions of C++ (100% sure for C++03 and before, 99.99% sure for C++11), but it is in C.
Some compilers allow this as a compiler extension.

It's not about whether a is a const int. It's about whether the value of a is known at compile time. The compiler needs a compile-time constant to determine how much storage to allocate. The C++ standard doesn't support variable-length arrays right now.
In C99, this variable-length array syntax is valid, but the C++ standard does not allow it. It is a very useful feature that leaves all the hairy memory allocation to the compiler.
In GCC and Clang, this feature is supported as a compiler extension, so by default you won't get any warning or error. The MSVC compiler, however, reports an error saying that it cannot allocate an array of constant size 0, so this is compiler-specific.
In standard C++ you can get a similar effect by allocating the array yourself with the new operator:
int a;
cin>>a;
float *b = new float[a];
This is valid standard C++.
Another thing is that, although it is called a variable-length array, its length is not variable at all. Once the array is defined, its length is fixed and never changes. You can't expand or shrink it.
It is much better to use the vector container, which is truly variable in length and far more flexible.
See the post for more discussion on Why aren't variable-length arrays part of the C++ standard?
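A minimal sketch of the std::vector alternative mentioned above. The function names are made up, and the size is passed as a parameter here instead of being read from cin.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Runtime-sized buffer that does not rely on the VLA extension.
std::vector<float> make_buffer(std::size_t n) {
    return std::vector<float>(n);  // n elements, each value-initialized to 0.0f
}

// Small self-check showing that, unlike a VLA, the length can change later.
void buffer_demo() {
    std::vector<float> b = make_buffer(5);
    assert(b.size() == 5);
    b.resize(8);         // grow in place; new elements are 0.0f
    b.push_back(1.5f);   // append one more element
    assert(b.size() == 9 && b[7] == 0.0f && b.back() == 1.5f);
}
```

The vector also releases its storage automatically, so there is no delete[] to forget as there is with new float[a].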

Why the size of the object is zero

I am getting zero as the sizeof an object, which ought not to happen. Please explain why the compiler gives this answer.
#include<iostream>
using namespace std;
class xxx{
public: int a[]; // Why this line is not giving error.
};
int main(int argc, char *argv[])
{
xxx x1;
cout<<sizeof(x1); //Q=Why this code is not giving error.
return 0;
}

As the others have said, an object in C++ can never have size 0.
However, since the code isn’t valid C++ in the first place (arrays cannot be empty) this is inconsequential. The compiler just does what it wants.
GCC with -pedantic rejects this code. MSVC at least warns. My version of clang++ with -pedantic ICEs but does emit a warning before that.

You're not using a standard-compliant compiler. An object size can't be 0, even an empty class or struct has size 1. Moreover, the array dimension has to be specified.
EDIT: It's strange, ideone also prints out 0. In MSVS I get a warning, but at least the size is 1.
5.3.3. Sizeof
[...] When applied to a class, the result is the number of bytes in an object of that class [...] The size of a most derived class shall
be greater than zero. [...] The result of applying sizeof to a base class subobject is the size of the base class type. [...]
EDIT 2:
I tried the following in MSVS:
xxx a[100];
and it fails to compile. Strange how it doesn't pick up the error beforehand.

That element a in your class xxx is called a flexible array member.
Flexible array members are not in the C++ standard. They are a part of C99. However, many compiler vendors provide flexible array members as a C++ extension.
Your code as-is is not legal C code. It uses C++ specific constructs. Your code is easy to change to C. Change the class to struct, get rid of the public, and change the use of C++ I/O to C's printf. With those changes, your converted code is still illegal C99 code. Flexible array members are only allowed as the last element of a structure that is otherwise non-empty.
Apparently your vendor took the flexible array member concept over to C++, but not the constraint that the structure be otherwise non-empty.

The size of an object cannot be zero. Even if the class is empty, its size is never zero.
Check out Bjarne Stroustrup's C++ Style and Technique FAQ to learn more.

Why is this "invalid C++"

I was reading intro on gtest and found this part confusing:
The compiler complains about "undefined references" to some static
const member variables, but I did define them in the class body.
What's wrong?
If your class has a static data member:
// foo.h
class Foo {
...
static const int kBar = 100;
};
You also need to define it outside of the class body in foo.cc:
const int Foo::kBar; // No initializer here.
Otherwise your code is invalid C++, and may break in unexpected
ways. In particular, using it in Google Test comparison assertions
(EXPECT_EQ, etc) will generate an "undefined reference" linker error.
Can somebody explain why defining a static const int in a class without defining it outside the class body is illegal C++?

First things first, inside a class body is not a definition, it's a declaration. The declaration specifies the type and value of the constant, the definition reserves storage space. You might not need the storage space, for instance if you only use the value as a compile time constant. In this case your code is perfectly legal C++. But if you do something like pass the constant by reference, or make a pointer point to the constant then you are going to need the storage as well. In these cases you would get an 'undefined reference' error.

The standard basically states that even though you can give a value in the header, if the static variable is "used" you must still define it in the source file.
In this context "used" is generally understood to mean that some part of the program needs actual memory and/or an address of the variable.
Most likely the google test code takes the address of the variable at some point (or uses it in some other equivalent way).

Roughly: In the class definition, static const int kBar = 100; tells the compiler "Foo will have a kBar constant (which I promise will always be 100)". However, the compiler doesn't know where that variable is yet. In the foo.cc file, the const int Foo::kBar; tells the compiler "alright, make kBar in this spot". Otherwise, the linker goes looking for kBar, but can't find it anywhere.
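The difference between using the value and needing the storage can be sketched in a single file as follows. The helper names scaled, bar_ref and kbar_demo are made up, and the split into foo.h and foo.cc is indicated only by comments.

```cpp
#include <cassert>

// foo.h (sketch): declaration with an in-class initializer.
class Foo {
public:
    static const int kBar = 100;
};

// foo.cc (sketch): the definition that reserves storage. No initializer here.
const int Foo::kBar;

// Using only the value does not require storage.
int scaled() { return Foo::kBar * 2; }

// Binding a reference (or taking the address) requires the definition above.
// Without it, this would typically fail at link time with "undefined reference".
const int& bar_ref() { return Foo::kBar; }

// Small self-check exercising both kinds of use.
void kbar_demo() {
    assert(scaled() == 200);
    assert(bar_ref() == 100);
    assert(&bar_ref() == &Foo::kBar);
}
```

Assertion macros such as EXPECT_EQ plausibly take their arguments by reference, which would explain why they trigger the linker error when the out-of-class definition is missing.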
