T* versus char* pointer arithmetic

Assume we have an array that contains N elements of type T.
T a[N];
According to the C++14 Standard, under which conditions do we have a guarantee that
(char*)(void*)&a[0] + n*sizeof(T) == (char*)(void*)&a[n], (0<=n<N) ?
While this is true for many types and implementations, the standard mentions it in a footnote, and in an ambiguous way:
§5.7.6, footnote 85) Another way to approach pointer arithmetic ...
There is little indication that this other way was thought of being equivalent to the standard's way. It might rather be a hint for implementers that suggests one of many conforming implementations.
Edits:
People have underestimated the difficulty of this question.
This question is not about what you can read in textbooks; it is about what you can deduce from the C++14 Standard through the use of logic and reason.
If you use 'contiguous' or 'contiguously', please also say what is being contiguous.
While T[] and T* are closely related, they are abstractions, and the addition on T* x N may be defined by the implementation in any consistent way.
The equation was rearranged using pointer addition. If p points to a char, p+1 is always defined using (§5.7 (4)) or unary addition, so we don't run into UB. The original included a pointer subtraction, which might have caused UB early on. (The char pointers are only compared, not dereferenced).

In [dcl.array]:
An object of array type contains a contiguously allocated non-empty
set of N subobjects of type T.
Contiguous implies that the offset between any consecutive subobjects of type T is sizeof(T), which implies that the offset of the nth subobject is n*sizeof(T).
The upper bound of n < N comes from [expr.add]:
When an expression that has integral type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the expression P points to element x[i] of an array object x with n elements,
the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) element x[i + j] if 0 <= i + j < n; otherwise, the behavior is undefined.
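As a rough illustration of the identity asked about in the question, the sketch below checks it at run time for a given array. The helper name byte_offsets_match is made up for this example. It only reports what one particular implementation does; it cannot show what the standard guarantees.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the post): returns true if every element a[n]
// starts exactly n * sizeof(T) bytes after a[0] on this implementation.
template <class T, std::size_t N>
bool byte_offsets_match(const T (&a)[N]) {
    const char* base = reinterpret_cast<const char*>(&a[0]);
    for (std::size_t n = 0; n < N; ++n) {
        const char* nth = reinterpret_cast<const char*>(&a[n]);
        if (base + n * sizeof(T) != nth) {
            return false;  // element n is not at the expected byte offset
        }
    }
    return true;
}
```

For an array such as int a[8], a call like assert(byte_offsets_match(a)) would be expected to pass on mainstream compilers.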

It's always true, but instead of looking at the rules for pointer arithmetic you must rely on the semantics given for the sizeof operator (5.3.3 [expr.sizeof]):
When applied to a reference or a reference type, the result is the size of the referenced type. When applied to a class, the result is the number of bytes in an object of that class including any padding required for placing objects of that type in an array. The size of a most derived class shall be greater than zero.
The result of applying sizeof to a base class subobject is the size of the base class type. When applied to an array, the result is the total number of bytes in the array. This implies that the size of an array of n elements is n times the size of an element.
It should be clear that there's only one packing that puts n non-overlapping elements in space of n * sizeof(element), namely that they are regularly spaced sizeof (element) bytes apart. And only one ordering is allowed by the pointer comparison rules found under the relational operator section (5.9 [expr.rel]):
Comparing pointers to objects is defined as follows:
If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript compares greater.
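The two ingredients of this argument, the size of an array and the ordering of its elements, can be spot-checked with a small sketch. The type Padded and both function names are assumptions introduced only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical element type that has internal padding on most ABIs.
struct Padded {
    char tag;
    int value;
};

// sizeof applied to an array of N elements is N times the element size.
template <class T, std::size_t N>
constexpr bool array_size_is_n_times_element() {
    return sizeof(T[N]) == N * sizeof(T);
}

static_assert(array_size_is_n_times_element<int, 16>(), "int[16]");
static_assert(array_size_is_n_times_element<Padded, 7>(), "Padded[7]");

// Pointers to elements with higher subscripts compare greater ([expr.rel]).
inline bool subscripts_order_pointers() {
    Padded a[3] = {};
    return &a[0] < &a[1] && &a[1] < &a[2];
}
```

The static_asserts are evaluated at compile time; subscripts_order_pointers() is a run-time check that should return true.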

The declaration in the first line is also a definition. (§3.1(2))
It creates the array object. (§1.8(1))
An object can be accessed via multiple lvalues
due to the aliasing rules. (§3.10(10)) In particular, the objects on the
right hand side may be legally accessed (aliased) through char pointers.
Let's look at a sentence in the array definition and then disambiguate 'contiguous'.
"An object of array type contains a contiguously allocated non-empty set
of N subobjects of type T." [dcl.array] §8.3.4.
Disambiguation
We start from the binary symmetric relation 'contiguous' for char objects, which should be obvious. ('iff' is short for 'if and only if'; sets and sequences are mathematical ones, not C++ containers.) If you can link to a better or more widely acknowledged definition, please comment.
A sequence x_1 ... x_N of char objects is contiguous iff
x_i and x_{i+1} are contiguous in memory for all i=1...N-1.
A set M of char objects is contiguous iff the objects in
M can be numbered, x_1 ...x_N, say, such that the sequence (x_i)_i is contiguous.
That is, iff M is the image of a contiguous, injective sequence.
Two sets M_1, M_2 of char objects are contiguous iff there
exist x_1 in M_1 and x_2 in M_2 such that x_1 and x_2 are contiguous.
A sequence M_1 ... M_N of sets of char objects is contiguous iff
M_i and M_{i+1} are contiguous for all i=1...N-1.
A set of sets of char objects is contiguous iff it is the image of
a contiguous, injective sequence of sets of char objects.
Now which version of 'contiguous' to apply? Linguistic overload resolution:
1) 'contiguous' may refer to 'allocation'. As an allocation function call provides a
subset of the available char objects, this would invoke the set-of-chars variant. That is,
the set of all char objects that occur in any of the N subobjects would be meant to be contiguous.
2) 'contiguous' may refer to 'set'. This would invoke the set-of-sets-of-chars variant with every subobject considered as a set of char objects.
What does this mean? First, while the authors numbered the array subobjects a[0] ... a[N-1], they chose not to say anything about the
order of subobjects in memory: they used 'set' instead of 'sequence'.
They described the allocation as contiguous, but they do not say that
a[j] and a[j+1] are contiguous in memory. Also, they chose not to write down the
straightforward formula involving (char*) pointers and sizeof(). While it looks like they
deliberately separated contiguity from ordering concerns,
§5.9 (3) requires one and the same ordering for array subobjects of all types.
If pointers point to two different elements of the same array, or a subobject thereof, the pointer
to the element with the higher subscript compares greater.
Now do the bytes that make up the array subobjects qualify as
subobjects in the sense of the above quote? Reading §1.8(2) and the question "Complete object or subobject?",
the answer is: No, at least not for arrays whose elements don't contain subobjects and are not arrays of chars, e.g. arrays of ints. So we may find examples where no particular ordering is imposed on the array elements.
But for the moment let's assume that our array subobjects are populated with chars only.
What does this mean considering the two possible interpretations of 'contiguous'?
1) We have a contiguous set of bytes that coincides with an ordered set of subobjects.
Then the claim in the OP is unconditionally true.
2) We have a contiguous sequence of subobjects, each of which may be non-contiguous individually.
This may happen in two ways: either the subobjects may have gaps, that is, they
contain two char objects at distance greater than sizeof(subobject)-1. Or the
subobjects may be distributed among different sequences of contiguous bytes.
In case 2) there is no guarantee that the claim in the OP is true.
Therefore, it is important to be clear about what 'contiguous' means.
Finally, here's an example of an implementation where no obvious ordering is imposed on the array subobjects by §5.9 because the array subobjects don't have subobjects themselves. Readers raised concerns that this would contradict the standard in other places, but no definite contradiction has been demonstrated yet.
Assume T is int, and we have one particular conforming implementation that behaves as expected naively with one exception:
It allocates arrays of ints in reversed memory order,
putting the array's first element at the high-memory-address end of the object:
a[N-1], a[N-2], ... a[0]
instead of
a[0], a[1], ... a[N-1]
This implementation satisfies any reasonable contiguity
requirement, so we don't have to agree on a single interpretation of
'contiguous' to proceed with the argument.
Then if p points to a, mapping p to &a[0] (invoking [conv.array]) would make the pointer jump near the high memory end of a.
As array arithmetic has to be compatible with pointer arithmetic, we'd also have
int * p= &intVariable;
(char*)(p+1) + sizeof(int) == (char*)p
and
int a[N];
(char*)(void*)&a[n] + n*sizeof(int)==(char*)(void*)&a[0], (0<=n<N)
Then, for T=int, there is no guarantee that the claim in the original post is true.
edit history: removed and reintroduced in modified form a possibly erroneous shortcut that was due to not applying a relevant part of the pointer < relation specification. It has not been determined yet whether this was justified or not, but the main argument about contiguity comes through anyway.

Related

Global variables in a translation unit, will they be stored contiguous and can pointer arithmetic be done?

Say I have global variables defined in a TU such as:
extern const std::string s0{"s0"};
extern const std::string s1{"s11"};
extern const std::string s2{"s222"};
// etc...
And a function get_1 to get them depending on an index:
size_t get_1(size_t i)
{
switch (i)
{
case 0: return s0.size();
case 1: return s1.size();
case 2: return s2.size();
// etc...
}
}
And someone proposes replacing get_1 with get_2 with:
size_t get_2(size_t i)
{
return (&s0 + i)->size();
}
Are global variables defined next to each other in a translation unit like this guaranteed to be stored contiguously, and in the order defined?
I.e., will &s1 == &s0 + 1 and &s2 == &s1 + 1 always be true?
Or does the standard allow a compiler to place the variable s0 higher than s1 in memory, i.e. to swap them?
Is it well defined behaviour to perform pointer arithmetic, like in get_2, over such variables? (that crucially aren't in the same sub-object or in an array etc., they're just globals like this)
Do the rules about using relational operators on pointers from https://stackoverflow.com/a/9086675/8594193 apply to pointer arithmetic too? (Is the last comment on that answer, about std::less and friends yielding a total order over any void* where the normal relational operators don't, relevant here too?)
Edit: this is not necessarily a duplicate of/asking about variables on the stack and their layout in memory, I'm aware of that already, I was specifically asking about global variables. Although the answer turns out to be the same, the question is not.

Pointer arithmetic on disparate objects yields undefined behavior as per [expr.add]:
4 When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
(4.1) — If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
(4.2) — Otherwise, if P points to an array element i of an array object x with n elements (9.3.4.5), the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 ≤ i + j ≤ n and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.
(4.3) — Otherwise, the behavior is undefined.
Since s0 through s2 are not elements of an array, get_2 yields explicitly documented undefined behavior.
As far as I can tell, the standard puts no limits on the order in memory of these variables, so the compiler could order them any way it wanted, with any amount of padding or other variables between them. This is not explicitly mentioned as such, but as was pointed out to me in the comments, [expr.rel] and [expr.eq] determine that the results of relational operators in these cases are undefined/unspecified. In particular, [expr.eq] states about operators == and != that
(3.1) — If one pointer represents the address of a complete object, and another pointer represents the address one past the last element of a different complete object, the result of the comparison is unspecified.
and [expr.rel] about <, >, <=, >= that
4 The result of comparing unequal pointers to objects is defined in terms of a partial order consistent with the following rules:
(4.1) — If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.
(4.2) — If two pointers point to different non-static data members of the same object, or to subobjects of such members, recursively, the pointer to the later declared member is required to compare greater provided the two members have the same access control (11.9), neither member is a subobject of zero size, and their class is not a union.
(4.3) — Otherwise, neither pointer is required to compare greater than the other.
Again, since s0, s1, s2 are not part of the same array and not members of the same object, 4.3 is relevant, and the result of comparing pointers to them is unspecified. In practical terms, this means that the compiler can order them in memory in an arbitrary fashion.
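One way to keep index-based lookup without pointer arithmetic across unrelated objects is a table of pointers, which is a real array. This is only a sketch: the name get_3 is made up, and the globals are declared without extern so the block stands on its own.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

const std::string s0{"s0"};
const std::string s1{"s11"};
const std::string s2{"s222"};

// Hypothetical replacement for get_2: the table is an actual array, so
// indexing into it is well defined regardless of where s0..s2 live in memory.
std::size_t get_3(std::size_t i) {
    static const std::array<const std::string*, 3> table{{&s0, &s1, &s2}};
    return table.at(i)->size();  // at() throws std::out_of_range for i >= 3
}
```

With these definitions, get_3(0), get_3(1) and get_3(2) yield the same sizes as get_1.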

Does C or C++ guarantee array < array + SIZE?

Suppose you have an array:
int array[SIZE];
or
int *array = new(int[SIZE]);
Does C or C++ guarantee that array < array + SIZE, and if so where?
I understand that regardless of the language spec, many operating systems guarantee this property by reserving the top of the virtual address space for the kernel. My question is whether this is also guaranteed by the language, rather than just by the vast majority of implementations.
As an example, suppose an OS kernel lives in low memory and sometimes gives the highest page of virtual memory out to user processes in response to mmap requests for anonymous memory. If malloc or ::operator new[] directly calls mmap for the allocation of a huge array, and the end of the array abuts the top of the virtual address space such that array + SIZE wraps around to zero, does this amount to a non-compliant implementation of the language?
Clarification
Note that the question is not asking about array+(SIZE-1), which is the address of the last element of the array. That one is guaranteed to be greater than array. The question is about a pointer one past the end of an array, or also p+1 when p is a pointer to a non-array object (which the section of the standard pointed to by the selected answer makes clear is treated the same way).
Stackoverflow has asked me to clarify why this question is not the same as this one. The other question asks how to implement total ordering of pointers. That other question essentially boils down to how could a library implement std::less such that it works even for pointers to differently allocated objects, which the standard says can only be compared for equality, not greater and less than.
In contrast, my question was about whether one past the end of an array is always guaranteed to be greater than the array. Whether the answer to my question is yes or no doesn't actually change how you would implement std::less, so the other question doesn't seem relevant. If it's illegal to compare to one past the end of an array, then std::less could simply exhibit undefined behavior in this case. (Also, typically the standard library is implemented by the same people as the compiler, and so is free to take advantage of properties of the particular compiler.)

Yes. From section 6.5.8, paragraph 5 of the C standard:
If the expression P points to an element of an array object
and the expression Q points to the last element of the same array
object, the pointer expression Q+1 compares greater than P.
Expression array is P. The expression array + SIZE - 1 points to the last element of array, which is Q.
Thus:
array + SIZE = array + SIZE - 1 + 1 = Q + 1 > P = array

C requires this. Section 6.5.8 para 5 says:
pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values
I'm sure there's something analogous in the C++ specification.
This requirement effectively prevents allocating objects that wrap around the address space on common hardware, because it would be impractical to implement all the bookkeeping necessary to implement the relational operator efficiently.

The guarantee does not hold for the case int *array = new(int[SIZE]); when SIZE is zero.
The result of new int[0] is required to be a valid pointer that can have 0 added to it, but array == array + SIZE in this case, and a strictly less-than test will yield false.
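The zero-length case can be sketched as follows. The function name is hypothetical; it simply bundles the two comparisons described above.

```cpp
#include <cassert>

// Hypothetical check: for a zero-length allocation, begin and end coincide,
// so the strict less-than comparison is false.
inline bool zero_length_end_equals_begin() {
    int* array = new int[0];          // valid, non-null, must not be dereferenced
    const bool equal = (array == array + 0);
    const bool less = (array < array + 0);
    delete[] array;
    return equal && !less;
}
```

The function is expected to return true, since array + 0 is the same pointer value as array.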

This is defined in C++, from 7.6.6.4 (p139 of current C++23 draft):
When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
(4.1) — If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
(4.2) — Otherwise, if P points to an array element i of an array object x with n elements (9.3.4.5) the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 <= i + j <= n and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 <= i − j <= n.
(4.3) — Otherwise, the behavior is undefined.
Note that 4.2 explicitly has "<= n", not "< n". It's undefined for any value larger than size(), but is defined for size().
The ordering of array elements is defined in 7.6.9 (p141):
(4.1) If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.
Which means the hypothetical element n will compare greater than the array itself (element 0) for all well defined cases of n > 0.
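The "<= n" bound is what makes the usual pointer loop legal: the end pointer may be formed and compared, but not dereferenced. A minimal sketch, with a made-up function name:

```cpp
#include <cassert>
#include <cstddef>

// Sums an int array by walking from the first element up to, but not
// including, the hypothetical element N.
template <std::size_t N>
int sum_elements(const int (&x)[N]) {
    const int* const end = x + N;  // one past the end: formed, never dereferenced
    int total = 0;
    for (const int* p = x; p != end; ++p) {
        total += *p;
    }
    return total;
}
```

Forming x + N + 1 instead would fall under (4.3) and be undefined.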

The relevant rule in C++ is [expr.rel]/4.1:
If two pointers point to different elements of the same array, or to subobjects thereof, the pointer to the element with the higher subscript is required to compare greater.
The above rule appears to only cover pointers to array elements, and array + SIZE doesn't point to an array element. However, as mentioned in the footnote, a one-past-the-end pointer is treated as if it were an array element here. The relevant language rule is in [basic.compound]/3:
For purposes of pointer arithmetic ([expr.add]) and comparison ([expr.rel], [expr.eq]), a pointer past the end of the last element of an array x of n elements is considered to be equivalent to a pointer to a hypothetical array element n of x and an object of type T that is not an array element is considered to belong to an array with one element of type T.
So C++ guarantees that array + SIZE > array (at least when SIZE > 0), and that &x + 1 > &x for any object x.
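Both conclusions can be written down as a short sketch. The function names are assumptions made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// For an array with N > 0 elements, the one-past-the-end pointer compares
// greater than the pointer to the first element.
template <class T, std::size_t N>
bool end_compares_greater(T (&array)[N]) {
    return array < array + N;
}

// A non-array object is treated as an array of one element, so &x + 1 is a
// valid one-past-the-end pointer that compares greater than &x.
template <class T>
bool one_past_object_compares_greater(T& x) {
    T* p = std::addressof(x);
    return p < p + 1;
}
```

Both functions should return true for any array or object passed to them.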

An array is guaranteed to occupy consecutive memory. Since C++03 or so, a vector is guaranteed to do so as well, for &vec[0] ... &vec[vec.size() - 1]. This automatically means that what you're asking about is true.
This is called contiguous storage. For vectors, it is described here:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0944r0.html
The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size(). Presumably five more years of studying the interactions of contiguity with caching made it clear to WG21 that contiguity needed to be mandated and non-contiguous vector implementation should be clearly banned.
The latter is quoted from the standards committee paper linked above, so my guess of C++03 was right.
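The identity quoted for vectors can be checked with a small sketch. The function name is made up, and T must not be bool, since vector<bool> does not hand out real element addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Checks &v[n] == &v[0] + n for all 0 <= n < v.size().
// Not usable with std::vector<bool>.
template <class T>
bool obeys_contiguity_identity(const std::vector<T>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) {
            return false;
        }
    }
    return true;  // also true for an empty vector: nothing to check
}
```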

Can ptrdiff_t represent all subtractions of pointers to elements of the same array object?

For subtraction of pointers i and j to elements of the same array object the note in [expr.add#5] reads:
[ Note: If the value i−j is not in the range of representable values of type std::ptrdiff_t, the behavior is undefined. — end note ]
But given [support.types.layout#2], which states that (emphasis mine):
The type ptrdiff_t is an implementation-defined signed integer type that can hold the difference of two subscripts in an array object, as described in [expr.add].
Is it even possible for the result of i-j not to be in the range of representable values of ptrdiff_t?
PS: I apologize if my question is caused by my poor understanding of the English language.
EDIT: Related: Why is the maximum size of an array "too large"?

Is it even possible for the result of i-j not to be in the range of representable values of ptrdiff_t?
Yes, but it's unlikely.
In fact, [support.types.layout]/2 does not say much except the proper rules about pointers subtraction and ptrdiff_t are defined in [expr.add]. So let us see this section.
[expr.add]/5
When two pointers to elements of the same array object are subtracted, the type of the result is an implementation-defined signed integral type; this type shall be the same type that is defined as std::ptrdiff_t in the <cstddef> header.
First of all, note that the case where i and j are subscript indexes of different arrays is not considered. This allows us to treat i-j as P-Q, where P is a pointer to the element of an array at subscript i and Q is a pointer to the element of the same array at subscript j. Indeed, subtracting two pointers to elements of different arrays is undefined behavior:
[expr.add]/5
If the expressions P and Q point to, respectively, elements x[i] and x[j] of the same array object x, the expression P - Q has the value i−j
; otherwise, the behavior is undefined.
As a conclusion, with the notation defined previously, i-j and P-Q are defined to have the same value, with the latter being of type std::ptrdiff_t. But nothing is said about the possibility for this type to hold such a value. This question can, however, be answered with the help of std::numeric_limits; especially, one can detect if an array some_array is too big for std::ptrdiff_t to hold all index differences:
static_assert(std::numeric_limits<std::ptrdiff_t>::max() > sizeof(some_array)/sizeof(some_array[0]),
"some_array is too big, subtracting its first and one-past-the-end element indexes "
"or pointers would lead to undefined behavior as per [expr.add]/5."
);
Now, on the usual targets this would not normally happen, since sizeof(std::ptrdiff_t) == sizeof(void*), which means an array would need to be enormous for ptrdiff_t to overflow. But there is no guarantee of it.
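The relationship between index differences and std::ptrdiff_t can be sketched as below. Both helper names are hypothetical, and the range check is done in std::uintmax_t so the comparison itself cannot overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// True if every index difference in an array of n elements (from -n to n)
// is representable in std::ptrdiff_t.
constexpr bool index_differences_fit(std::uintmax_t n) {
    return n <= static_cast<std::uintmax_t>(
                    std::numeric_limits<std::ptrdiff_t>::max());
}

// Returns i - j computed via pointer subtraction within the same array.
// Indices up to N (one past the end) are allowed.
template <class T, std::size_t N>
std::ptrdiff_t index_difference(const T (&x)[N], std::size_t i, std::size_t j) {
    assert(i <= N && j <= N);
    return (x + i) - (x + j);
}
```

For int x[10], index_difference(x, 7, 2) is 5 and index_difference(x, 0, 10) is -10.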

I think it is a bug in the wording.
The rule in [expr.add] is inherited from the same rule for pointer subtraction in the C standard. In the C standard, ptrdiff_t is not required to hold any difference of two subscripts in an array object.
The rule in [support.types.layout] comes from Core Language Issue 1122. It added direct definitions for std::size_t and std::ptrdiff_t, which is supposed to solve the problem of circular definition. I don't see any reason (at least none mentioned in any official document) to make std::ptrdiff_t hold any difference of two subscripts in an array object. I guess it just uses an improper definition to solve the circular definition issue.
As further evidence, [diff.library] does not mention any difference between std::ptrdiff_t in C++ and ptrdiff_t in C. Since ptrdiff_t has no such constraint in C, std::ptrdiff_t should not have such a constraint in C++ either.

`std::complex<T>[n]` and `T[n*2]` type aliasing

Since C++11 std::complex<T>[n] is guaranteed to be aliasable as T[n*2], with well defined values. Which is exactly what one would expect for any mainstream architecture. Is this guarantee achievable with standard C++ for my own types, say struct vec3 { float x, y, z; } or is it only possible with special support from the compiler?

TL;DR: The compiler must inspect reinterpret_casts and figure out that (standard library) specializations of std::complex are involved. We cannot conformably mimic the semantics.
I think it's fairly clear that treating three distinct members as array elements is not going to work, since pointer arithmetic on pointers to them is extremely restricted (e.g. adding 1 yields a pointer past-the-end).
So let's assume vec3 contained an array of three ints instead.
Even then, the underlying reinterpret_cast<int*>(&v) you implicitly need (where v is a vec3) does not leave you with a pointer to the first element. See the exhaustive requirements on pointer-interconvertibility:
Two objects a and b are pointer-interconvertible if:
they are the same object, or
one is a standard-layout union object and the other is a non-static data member of that object ([class.union]), or
one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no
non-static data members, the first base class subobject of that object
([class.mem]), or
there exists an object c such that a and c are pointer-interconvertible, and c and b are
pointer-interconvertible.
If two objects are pointer-interconvertible, then they have the same
address, and it is possible to obtain a pointer to one from a pointer
to the other via a reinterpret_cast. [ Note: An array object
and its first element are not pointer-interconvertible, even though
they have the same address.  — end note ]
That's quite unequivocal; while we can get a pointer to the array (being the first member), and while pointer-interconvertibility is transitive, we cannot obtain a pointer to its first element.
And finally, even if you managed to obtain a pointer to the first element of your member array, if you had an array of vec3s, you cannot traverse all the member arrays using simple pointer increments, since we get pointers past-the-end of the arrays in between. launder doesn't solve this problem either, because the objects that the pointers are associated with don't share any storage (cf [ptr.launder] for specifics).

It's only possible with special support from the compiler, mostly.
Unions don't get you there because the common approach actually has undefined behaviour, although there are exceptions for layout-compatible initial sequences, and you may inspect an object through an unsigned char* as a special case. That's it, though.
Interestingly, unless we assume a broad and useless meaning of "below", the standard is technically contradictory in this regard:
[C++14: 5.2.10/1]: [..] Conversions that can be performed explicitly using reinterpret_cast are listed below. No other conversion can be performed explicitly using reinterpret_cast.
The case for complex<T> is then not mentioned. Finally the rule you're referring to is introduced much, much later, in [C++14: 26.4/4].
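For reference, the array-oriented access that the standard grants to std::complex looks roughly like this. The function names are made up for the example; the cast itself is the one the complex-number rule covers.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>

// Checks that the flat view places real parts at even and imaginary parts
// at odd indices, as guaranteed for std::complex<double>.
inline bool parts_match(std::complex<double>* z, std::size_t n) {
    double* flat = reinterpret_cast<double*>(z);
    for (std::size_t i = 0; i < n; ++i) {
        if (flat[2 * i] != z[i].real() || flat[2 * i + 1] != z[i].imag()) {
            return false;
        }
    }
    return true;
}

// Sums all 2 * n components through the flat view.
inline double sum_of_parts(std::complex<double>* z, std::size_t n) {
    double* flat = reinterpret_cast<double*>(z);
    double total = 0.0;
    for (std::size_t k = 0; k < 2 * n; ++k) {
        total += flat[k];
    }
    return total;
}
```

The same cast applied to an array of a user-defined struct is not covered by this guarantee.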

I think it would work for a single vec3 if your type contained float x[3] instead, and you ensure sizeof(vec3) == 3*sizeof(float) && is_standard_layout_v<vec3>. Given those conditions, the standard guarantees that the first member is at zero offset so the address of the first float is the address of the object, and you can perform array arithmetic to get the other elements in the array:
struct vec3 { float x[3]; } v = { };
float* x = reinterpret_cast<float*>(&v); // points to first float
assert(x == v.x);
assert(&x[0] == &v.x[0]);
assert(&x[1] == &v.x[1]);
assert(&x[2] == &v.x[2]);
What you can't do is treat an array of vec3 as an array of floats three times the length. Array arithmetic on the array inside each vec3 won't allow you to access the array inside the next vec3. CWG 2182 is relevant here.
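If flat indexing over an array of vec3 is needed anyway, one option that stays within each member array is to compute the two indices explicitly. This is a sketch; flat_at is a hypothetical helper, not something from the answer.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct vec3 { float x[3]; };

static_assert(std::is_standard_layout<vec3>::value, "vec3 must be standard layout");
static_assert(sizeof(vec3) == 3 * sizeof(float), "vec3 must have no padding");

// Maps a flat index k to element k % 3 of the member array in vs[k / 3].
// No pointer ever crosses from one vec3's member array into the next.
inline float& flat_at(vec3* vs, std::size_t k) {
    return vs[k / 3].x[k % 3];
}
```

For vec3 vs[2], flat_at(vs, 4) refers to vs[1].x[1].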

Does the standard define the type for `a[i]` where `a` is `T [M][N]`?

I very occasionally make use of multidimensional arrays, and got curious what the standard says (C11 and/or C++11) about the behavior of indexing with fewer "dimensions" than were declared for the array.
Given:
int a[2][2][2] = {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}};
Does the standard say what type a[1] or a[0][1] is, whether this is legal, and whether it properly indexes sub-arrays as expected?
auto& b = a[1];
std::cout << b[1][1];

a[1] is just of type int[2][2]. Likewise a[0][1] is just int[2]. And yes, indexing as sub-arrays works the way you think it does.

Does the standard define the type for a[i] where a is T [M][N]?
Of course. The standard basically defines the types of all expressions, and if it did not, that would call for a defect report. But I guess you are more interested in what that type might be...
While the standard may not explicitly mention your case, the rules are stated and are simple: given an array a of N elements of type T, the expression a[0] is an lvalue expression of type T. In the variable declaration int a[2][2], the type of a is array of 2 elements of type array of 2 elements of type int, which, applying the rule above, means that a[0] is an lvalue referring to an array of 2 ints, or, as you would have to type it in a program: int (&)[2]. Adding extra dimensions does not affect the mechanism.
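The types can be confirmed by the compiler itself. The sketch below uses the array from the question; last_element is a made-up name wrapping the question's two lines.

```cpp
#include <cassert>
#include <type_traits>

int a[2][2][2] = {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}};

// decltype of an lvalue subscript expression yields a reference to its type.
static_assert(std::is_same<decltype(a[1]), int (&)[2][2]>::value, "a[1]");
static_assert(std::is_same<decltype(a[0][1]), int (&)[2]>::value, "a[0][1]");
static_assert(std::is_same<decltype(a[1][1][1]), int&>::value, "a[1][1][1]");

// Binds a reference to the sub-array a[1] and indexes the remaining dimensions.
inline int last_element() {
    auto& b = a[1];
    return b[1][1];
}
```

last_element() returns 8, the same value as a[1][1][1].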

I think this example in C11 explained it implicitly.
C11 6.5.2.1 Array subscripting
EXAMPLE Consider the array object defined by the declaration int x[3][5]; Here x is a 3 × 5 array of ints; more precisely, x is an array of three element objects, each of which is an array of five ints. In the expression x[i], which is equivalent to (*((x) + (i))), x is first converted to a pointer to the initial array of five ints. Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects. The results are added and indirection is applied to yield an array of five ints. When used in the expression x[i][j], that array is in turn converted to a pointer to the first of the ints, so x[i][j] yields an int.
A similar example is in C++11 8.3.4 Arrays:
Example: consider
int x[3][5];
Here x is a 3 × 5 array of integers. When x appears in an expression, it is converted to a pointer to (the first of three) five-membered arrays of integers. In the expression x[i] which is equivalent to *(x + i), x is first converted to a pointer as described; then x + i is converted to the type of x, which involves multiplying i by the length of the object to which the pointer points, namely five integer objects. The results are added and indirection applied to yield an array (of five integers), which in turn is converted to a pointer to the first of the integers. If there is another subscript the same argument applies again; this time the result is an integer.

The key point to remember is that, in both C and C++, a multidimensional array is simply an array of arrays (so a 3-dimensional array is an array of arrays of arrays). All the syntax and semantics of multidimensional arrays follow from that (and from the other rules of the language, of course).
So given an object definition:
int m[2][2][2];
m is an object of type int[2][2][2] (an array of two elements, each of which is an array of two elements, each of which is an array of two ints).
When you write m[1][1][1], you're already evaluating m, m[1] and m[1][1].
The expression m is an lvalue referring to an array object of type int[2][2][2].
In m[1], the array expression m is implicitly converted to ("decays" to) a pointer to the array's first element. This pointer is of type int(*)[2][2], a pointer to a two-element array of two-element arrays of int. m[1] is by definition equivalent to *(m+1); the +1 advances m by one element and dereferences the resulting pointer value. So m[1] refers to an object of type int[2][2] (an array of two arrays, each of which consists of two int elements).
(The array indexing operator [] is defined to operate on a pointer, not an array. In a common case like arr[42], the pointer happens to be the result of an implicit array-to-pointer conversion.)
We repeat the process for m[1][1]: m[1] decays to a pointer to an array of two ints (of type int(*)[2]), and m[1][1] then refers to an object of type int[2].
Finally, m[1][1][1] takes the result of evaluating m[1][1] and repeats the process yet again, giving us an lvalue referring to an object of type int. And that's how multidimensional arrays work.
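The decay steps described above can be spelled out with std::decay, which applies the array-to-pointer conversion to a type. This is a sketch; the initializer values and the function name are assumptions for the example.

```cpp
#include <cassert>
#include <type_traits>

int m[2][2][2] = {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}};

// Each array expression decays to a pointer to its first element.
static_assert(std::is_same<decltype(m), int[2][2][2]>::value, "m");
static_assert(std::is_same<std::decay<decltype(m)>::type, int (*)[2][2]>::value, "m decays");
static_assert(std::is_same<std::decay<decltype(m[1])>::type, int (*)[2]>::value, "m[1] decays");
static_assert(std::is_same<std::decay<decltype(m[1][1])>::type, int*>::value, "m[1][1] decays");

// Each subscript is equivalent to pointer addition followed by indirection.
inline bool subscript_matches_pointer_form() {
    return &m[1] == m + 1
        && &m[1][1] == *(m + 1) + 1
        && &m[1][1][1] == *(*(m + 1) + 1) + 1;
}
```

subscript_matches_pointer_form() should return true, since m[i] is defined as *(m + i) at every level.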
Just to add to the frivolity, an expression like foo[index1][index2][index3] can work directly with pointers as well as with arrays. That means you can construct something that works (almost) like a true multidimensional array using pointers and allocations of arbitrary size. This gives you the possibility of having "ragged" arrays with different numbers of elements in each row, or even rows and elements that are missing. But then it's up to you to manage the allocation and deallocation for each row, or even for each element.
Recommended reading: Section 6 of the comp.lang.c FAQ.
A side note: There are languages where multidimensional arrays are not arrays of arrays. In Ada, for example (which uses parentheses rather than square brackets for array indexing), you can have an array of arrays, indexed like arr(i)(j), or you can have a two-dimensional array, indexed like arr(i, j). C is different; C doesn't have direct built-in support for multidimensional arrays, but it gives you the tools to build them yourself.
