In C++, what is the difference (if any) between using char and char[1]?
examples:
struct SomeStruct
{
char x;
char y[1];
};
Do the same reasons follow for unsigned char?
The main difference is just the syntax you use to access your one char.
By "access" I mean, act upon it using the various operators in the language, most or all of which do different things when applied to a char compared with a char array. This makes it sound as if x and y are almost entirely different. If fact they both "consist of" one char, but that char has been represented in a very different way.
The implementation could cause there to be other differences, for example it could align and pad the structure differently according to which one you use. But I doubt it will.
An example of the operator differences is that a char is assignable, and an array isn't:
SomeStruct a;
a.x = 'a';
a.y[0] = 'a';
SomeStruct b;
b.x = a.x; // OK
b.y = a.y; // not OK
b.y[0] = a.y[0]; // OK
But the fact that y isn't assignable doesn't stop SomeStruct being assignable:
b = a; // OK
All this is regardless of the type, char or not. An object of a type, and an array of that type with size 1, are pretty much the same in terms of what's in memory.
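To illustrate that last point, here is a minimal sketch (SomeStruct is the struct from the question; the compile-time checks are purely illustrative):
#include <type_traits>
struct SomeStruct
{
    char x;
    char y[1];
};
// Both members provide storage for exactly one char.
static_assert(sizeof(char) == sizeof(char[1]), "one char either way");
static_assert(std::is_same<decltype(SomeStruct::y), char[1]>::value, "y really has array type");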
As an aside, there is a context in which it makes a big difference which you "use" out of char and char[1], and which sometimes helps confuse people into thinking that arrays are really pointers. Not your example, but as a function parameter:
void foo(char c); // a function which takes a char as a parameter
void bar(char c[1]); // a function which takes a char* as a parameter
void baz(char c[12]); // also a function which takes a char* as a parameter
The numbers provided in the declarations of bar and baz are completely ignored by the C++ language. Apparently someone at some point felt that it would be useful to programmers as a form of documentation, indicating that the function baz is expecting its pointer argument to point to the first element of an array of 12 char.
In bar and baz, c never has array type - it looks like an array type, but it isn't, it's just a fancy special-case syntax with the same meaning as char *c. Which is why I put the quotation marks on "use" - you aren't really using char[1] at all, it just looks like it.
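As a quick, hedged sketch of that: inside such a function the parameter's type really is char*, whatever number was written in the brackets.
#include <type_traits>
void bar(char c[1])
{
    // The bracketed 1 was discarded; the parameter was adjusted to a pointer.
    static_assert(std::is_same<decltype(c), char*>::value, "c is a char*");
}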
If you've actually seen the construct char y[1] as the last member of a struct in production code, then it is fairly likely that you've encountered an instance of the struct hack.
That short array is a stand-in for a real, but variable-length, array (recall that before C99 there was no such thing in the C standard). The programmer would always allocate such structs on the heap, taking care to ensure that the allocation was big enough for the actual size of array they wanted to use.
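A minimal sketch of that idiom (the names msg and make_msg are invented for illustration; this is the classic pre-C99 hack, not something the standard blesses):
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct msg
{
    int  len;
    char payload[1];   // stand-in for the real, variable-length data
};

msg *make_msg(const char *text)
{
    std::size_t n = std::strlen(text) + 1;
    // Over-allocate: the struct header plus enough room for the whole payload.
    msg *m = static_cast<msg *>(std::malloc(offsetof(msg, payload) + n));
    if (m) {
        m->len = static_cast<int>(n);
        std::memcpy(m->payload, text, n);   // writes past payload[0] into the extra space
    }
    return m;
}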
As well as the notational differences in usage emphasised by Steve, char[1] can be passed to e.g. template <int N> void f(char(&a)[N]), where char x = '\0'; f(&x); wouldn't match. Reliably capturing the size of array arguments is very convenient and reassuring.
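A brief sketch of that size capture (f and demo are illustrative names):
#include <cstddef>

template <std::size_t N>
void f(const char (&a)[N]) { (void)a; }   // binds only to real arrays; N is deduced

void demo()
{
    char y[1] = { '\0' };
    char x = '\0';
    (void)x;
    f(y);     // OK: y has type char[1], so N is deduced as 1
    // f(&x); // would not compile: &x is a char*, not a reference to an array
}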
It may also imply something different: either that the real length may be longer (as explained by dmckee), or that the content is logically an ASCIIZ string (that happens to be empty in this case), or an array of characters (that happens to have one element). If the structure was one of several related structures (e.g. a mathematical vector where the array size was a template argument, or an encoding of the layout of memory needed for some I/O operation), then it's entirely possible that some similarity with other fields where the arrays may be larger would suggest a preference for a single-character array, allowing support code to be simpler and/or more universally applicable.
Related
I am reading a book about C and the following paragraph is a bit unclear to me:
Surprisingly, passing the pointer is not efficient in the above example! That's because of the fact that the int type is 4 bytes and copying it is more efficient than copying 8 bytes of its pointer. But this is not the case regarding structures and arrays. Since copying structures and arrays is done byte-wise, and all of the bytes in them should be copied one by one, it is usually better to pass pointers instead.
As far as I know, all CPU operations are limited to arithmetic (plus or minus) or bit-wise operations, so:
What does the writer mean about copying array and structure, isn't an int copying a bit shifting operation?
Second: are pointers array?
NOTE: the book is Extreme C, published by Packt, and the following example is what the author is referring to:
#include <stdio.h>

void func(int* a) {
    int b = 9;
    *a = 5;
    a = &b;
}

int main(int argc, char** argv) {
    int x = 3;
    int* xptr = &x;
    printf("Value before call: %d\n", x);
    printf("Pointer before function call: %p\n", (void*)xptr);
    func(xptr);
    printf("Value after call: %d\n", x);
    printf("Pointer after function call: %p\n", (void*)xptr);
    return 0;
}
The book is not clear and it's also wrong.
The assumption seems to be that an 8-byte pointer is "harder" to copy than a 4-byte integer. That's wrong for nearly all modern CPUs.
Further, the part about copying an array is just plain wrong. That is not what C does. Passing an array in C does not involve a copy; it's actually just like passing a pointer.
The part about structs is however correct... as long as the struct isn't just a simple integer or char but "something bigger".
What does the writer mean about copying array
Sounds like rubbish... C doesn't pass arrays by doing a copy.
What does the writer mean about copying ... structure,
Structs are copied by value. So passing a struct to a function involves copying every byte of the struct. That is rather expensive if the struct is large.
are pointers array?
No. Pointers are pointers. But... Under the correct circumstances a pointer can be used as an array because *(p + i) is the same as p[i]
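A tiny sketch of that equivalence:
#include <cassert>

void demo()
{
    int data[3] = { 10, 20, 30 };
    int *p = data;             // the array decays to a pointer to its first element
    assert(p[2] == *(p + 2));  // subscripting is defined in terms of pointer arithmetic
    assert(p[2] == 30);
}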
What does the writer mean about copying array and structure?
Let's compare two functions taking a large amount of data (e.g. a struct with lots of data members):
void f(const big_type_t* p_big_type);
void g(const big_type_t big_type);
Both can effectively read the values from the caller-specified big_type_t object, but in the former case f() need only be passed a pointer (which is typically 8 bytes on modern everyday hardware) to tell it where the caller has a big_type_t object for it to use. In the latter case, g()'s pass-by-value argument asks the compiler to make a complete copy of the caller's big_type_t argument and place it at a location on the stack where g() implicitly knows to find it. Every byte of the data in the struct must be copied (unless the compiler's smart enough to optimise under the as-if rule, but that's a bit of a distraction - it's generally best to write code so it's not unnecessarily inefficient if not optimised well).
For built-in arrays, the situation is different. C and C++ implicitly pass arrays by pointer, so...
void h(const int* my_array);
void i(const int my_array[]);
...are both called the same way, with the my_array argument actually being a pointer to the first int the caller specifies.
In C++ there are also std::array<>s, which are effectively struct/classes with a static-sized array data member (i.e. template <typename T, size_t N> struct array { T data_[N]; ... }). They can be passed by-value, the same as structs. So, for large std::array objects, access via a pointer or reference is more efficient than doing a full copy.
Sometimes a function really does want a copy though, as it may need to do something like sort it without affecting the caller-specified variable passed to that argument. In that case, there's not much point passing by pointer or reference.
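As a hedged sketch of that case, a function that deliberately takes its std::array by value so it can sort a private copy (median is just an illustrative name):
#include <algorithm>
#include <array>

// Pass by value on purpose: the function wants its own copy it is free to reorder.
int median(std::array<int, 5> values)
{
    std::sort(values.begin(), values.end());
    return values[values.size() / 2];   // the caller's array is left untouched
}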
isn't an int copying a bit shifting operation?
No... the term "bit shifting" has a very specific meaning in programming. Consider an 8-bit integer - say 0x10010110. If we shift this value one bit to the right, we get 0x01001011 - a 0 is introduced on the left, and a 0 is discarded on the right. If we shift the new value to the right again, we could get either 0x00100101 (add 0 at left; discard at right) or - what's called a circular shift or rotation - 0x100100101`, where the right-most bit is moved to become the left-most bit. Bit-shifting happens to CPU registers, and the shifted values can be stored back into the memory where a variable is located, or used in some calculation.
All that's quite unrelated to memory copying, which is where the bits in one value are (at least notionally, without optimisation) copied into "another" value. For large amounts of data, this usually does mean actually copying the bits in a value read from memory to another area of memory.
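A short sketch contrasting the two operations:
#include <cstring>

void demo()
{
    unsigned char v = 0x96;          // binary 10010110
    unsigned char shifted = v >> 1;  // bit shift: 01001011 (0x4B)

    unsigned char src[4] = { 1, 2, 3, 4 };
    unsigned char dst[4];
    std::memcpy(dst, src, sizeof src);  // memory copy: bytes are duplicated, not shifted
    (void)shifted;
    (void)dst;
}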
Second: are pointers array?
No they're not. But, when you have an array, it easily "decays" to a pointer to its first element. For example:
void f(const char* p);
f("hello");
In C++, "hello" is a string literal of type char[6] (as there's implicitly a null character at the end. When calling f, it decays from array form to a pointer to the first character - 'h'. That's usually desirable to give the called function access to the array data. In C++, you can also do this:
template <size_t N> void f(const char(&arr)[N]);
f("hello");
The call above does not involve decay from an array to a pointer - arr is bound to the string literal array and N is derived as 6.
What does the writer mean about copying array and structure, isn't an int copying a bit shifting operation?
When you pass an object of struct type as a parameter in a function, the contents of that structure are copied into the formal parameter:
struct foo {
...
};
void do_something_with( struct foo arg )
{
// do something with arg
}
int main( void )
{
struct foo f = { 1, 2.0, "three" };
...
do_something_with( f );
...
}
The objects main:f and do_something_with:arg are two separate instances of struct foo - when you pass f as an argument, its contents are copied into arg. Any changes you make to the contents of arg do not affect the contents of f.
The thing is, the author of the book is wrong about arrays - when you pass an array expression as an argument to a function, what you are actually passing is a pointer to the first element, not the whole array.
Second: are pointers array?
Arrays are not pointers - however, unless it is the operand of the sizeof or unary & operators, an expression of type "N-element array of T" will be converted, or "decay", to an expression of type "pointer to T" and the value will be the address of the first element of the array.
When you pass an array expression as an argument to a function, what the function actually receives is a pointer to the first element of the array - no copy of the array is made like it is for the struct above.
Finally - while runtime efficiency does matter, correctness, clarity, and maintainability matter more. If it makes sense to pass an argument as a pointer (such as you want the function to modify the argument), then by all means do so. But don't start passing everything as a pointer because it might speed things up. Start by making things clear and correct - then, measure the performance of your code and take action based on that. Most of your runtime performance gains come from using the right data structures and algorithms, not how you pass arguments.
While the sample code leaves much to be desired and has some bugs, I think the gist of what the author is saying is that for a small data type it is more efficient to pass a parameter to a function directly by value (int) rather than by pointer (int *). When a function is called, parameters are pushed onto the stack, and an int would typically require 4 bytes, but an int * parameter may require 4 or 8 bytes depending on the system.
When passing a struct as a parameter, the overall size of the struct will typically be greater than 4 or 8 bytes, so passing a pointer to the struct may be more efficient, since only 4 or 8 bytes would need to be copied to the stack.
I am not sure why the author mentioned arrays, since an array cannot be passed to a function by value unless it is contained in a struct.
I'm trying to understand the nature of type-decay. For example, we all know arrays decay into pointers in a certain context. My attempt is to understand how int[] equates to int* but how two-dimensional arrays don't correspond to the expected pointer type. Here is a test case:
std::is_same<int*, std::decay<int[]>::type>::value; // true
This returns true as expected, but this doesn't:
std::is_same<int**, std::decay<int[][1]>::type>::value; // false
Why is this not true? I finally found a way to make it return true, and that was by making the first dimension a pointer:
std::is_same<int**, std::decay<int*[]>::type>::value; // true
And the assertion holds true for any type with pointers but with the last being the array. For example (int***[] == int****; // true).
Can I have an explanation as to why this is happening? Why don't the array types correspond to the pointer types as would be expected?
Why does int*[] decay into int** but not int[][]?
Because it would be impossible to do pointer arithmetic with it.
For example, int p[5][4] means an array of (length-4 array of int). There are no pointers involved, it's simply a contiguous block of memory of size 5*4*sizeof(int). When you ask for a particular element, e.g. int a = p[i][j], the compiler is really doing this:
char *tmp = (char *)p // Work in units of bytes (char)
+ i * sizeof(int[4]) // Offset for outer dimension (int[4] is a type)
+ j * sizeof(int); // Offset for inner dimension
int a = *(int *)tmp; // Back to the contained type, and dereference
Obviously, it can only do this because it knows the size of the "inner" dimension(s). Casting to an int (*)[4] retains this information; it's a pointer to (length-4 array of int). However, an int ** doesn't; it's merely a pointer to (pointer to int).
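A sketch of how that retained inner size gets used (sum2d is an invented name for illustration):
// p is a pointer to (length-4 array of int); the inner bound 4 is part of the type,
// so p[i][j] can compute the offset i * 4 + j correctly.
int sum2d(int (*p)[4], int rows)
{
    int total = 0;
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < 4; ++j)
            total += p[i][j];
    return total;
}

void demo()
{
    int grid[5][4] = {};
    sum2d(grid, 5);   // grid decays to int (*)[4], not to int**
}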
For another take on this, see the following sections of the C FAQ:
6.18: My compiler complained when I passed a two-dimensional array to a function expecting a pointer to a pointer.
6.19: How do I write functions which accept two-dimensional arrays when the width is not known at compile time?
6.20: How can I use statically- and dynamically-allocated multidimensional arrays interchangeably when passing them to functions?
(This is all for C, but this behaviour is essentially unchanged in C++.)
C was not really "designed" as a language; instead, features were added as needs arose, with an effort not to break earlier code. Such an evolutionary approach was a good thing in the days when C was being developed, since it meant that for the most part developers could reap the benefits of the earlier improvements in the language before everything the language might need to do was worked out. Unfortunately, the way in which array- and pointer handling have evolved has led to a variety of rules which are, in retrospect, unfortunate.
In the C language of today, there is a fairly substantial type system, and variables have clearly defined types, but things were not always thus. A declaration char arr[8]; would allocate 8 bytes in the present scope, and make arr point to the first of them. The compiler wouldn't know that arr represented an array--it would represent a char pointer just like any other char*. From what I understand, if one had declared char arr1[8], arr2[8];, the statement arr1 = arr2; would have been perfectly legal, being somewhat equivalent conceptually to char *st1 = "foo", *st2 = "bar"; st1 = st2;, but would have almost always represented a bug.
The rule that arrays decompose into pointers stemmed from a time when arrays and pointers really were the same thing. Since then, arrays have come to be recognized as a distinct type, but the language needed to remain essentially compatible with the days when they weren't. When the rules were being formulated, the question of how two-dimensional arrays should be handled wasn't an issue because there was no such thing. One could do something like char foo[20]; char *bar[4]; int i; for (i=0; i<4; i++) bar[i] = foo + (i*5); and then use bar[x][y] in the same way as one would now use a two-dimensional array, but a compiler wouldn't view things that way--it just saw bar as a pointer to a pointer. If one wanted to make foo[1] point somewhere completely different from foo[2], one could perfectly legally do so.
When two-dimensional arrays were added to C, it was not necessary to maintain compatibility with earlier code that declared two-dimensional arrays, because there wasn't any. While it would have been possible to specify that char bar[4][5]; would generate code equivalent to what was shown using the foo[20], in which case a char[][] would have been usable as a char**, it was thought that just as assigning array variables would have been a mistake 99% of the time, so too would have been re-assignment of array rows, had that been legal. Thus, arrays in C are recognized as distinct types, with their own rules which are a bit odd, but which are what they are.
Because int[M][N] and int** are incompatible types.
However, int[M][N] can decay into int (*)[N] type. So the following :
std::is_same<int(*)[1], std::decay<int[1][1]>::type>::value;
should give you true.
Two dimensional arrays are not stored as pointer to pointers, but as a contiguous block of memory.
An object declared as type int[y][x] is a block of size sizeof(int) * x * y, whereas an object of type int ** is a pointer to an int*.
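Putting the cases from the question together, here is a small compile-time sketch:
#include <type_traits>

// int[] decays to int*, but int[1][1] decays to int (*)[1], not to int**.
static_assert(std::is_same<int*, std::decay<int[]>::type>::value, "");
static_assert(std::is_same<int(*)[1], std::decay<int[1][1]>::type>::value, "");
static_assert(!std::is_same<int**, std::decay<int[1][1]>::type>::value, "");
static_assert(std::is_same<int**, std::decay<int*[]>::type>::value, "");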
This is what I found during my learning period:
#include <iostream>
#include <cstring>   // for strlen
using namespace std;

int dis(char a[1])
{
    int length = strlen(a);
    char c = a[2];   // compiles fine even though the parameter says [1]
    return length;
}

int main()
{
    char b[4] = "abc";
    int c = dis(b);
    cout << c;
    return 0;
}
So in the function int dis(char a[1]), the [1] seems to do nothing and doesn't work at all, because I can use a[2], just like with int a[] or char *a. I know that an array name decays to a pointer and how to pass an array, so my puzzle is not about that part.
What I want to know is why compilers allow this behavior (int a[1]). Or does it have other meanings that I don't know about?
It is a quirk of the syntax for passing arrays to functions.
Actually it is not possible to pass an array in C. If you write syntax that looks like it should pass the array, what actually happens is that a pointer to the first element of the array is passed instead.
Since the pointer does not include any length information, the contents of your [] in the function formal parameter list are actually ignored.
The decision to allow this syntax was made in the 1970s and has caused much confusion ever since...
The length of the first dimension is ignored, but the lengths of any additional dimensions are necessary to allow the compiler to compute offsets correctly. In the following example, the foo function is passed a pointer to a two-dimensional array.
#include <stdio.h>
void foo(int args[10][20])
{
printf("%zd\n", sizeof(args[0]));
}
int main(int argc, char **argv)
{
int a[2][20];
foo(a);
return 0;
}
The size of the first dimension [10] is ignored; the compiler will not prevent you from indexing off the end (notice that the formal wants 10 elements, but the actual provides only 2). However, the size of the second dimension [20] is used to determine the stride of each row, and here, the formal must match the actual. Again, the compiler will not prevent you from indexing off the end of the second dimension either.
The byte offset from the base of the array to an element args[row][col] is determined by:
sizeof(int)*(col + 20*row)
Note that if col >= 20, then you will actually index into a subsequent row (or off the end of the entire array).
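As a quick check of that formula (the values row = 1 and col = 3 are arbitrary):
#include <cassert>
#include <cstddef>

void check(int args[10][20])
{
    std::size_t row = 1, col = 3;
    // Offset computed with the formula from the text...
    std::ptrdiff_t offset = sizeof(int) * (col + 20 * row);
    // ...matches the address the compiler computes for args[row][col].
    assert((char *)&args[row][col] - (char *)args == offset);
}

void demo()
{
    int a[2][20] = {};
    check(a);   // as above, only the second dimension matters for the stride
}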
sizeof(args[0]) returns 80 on my machine, where sizeof(int) == 4. However, if I attempt to take sizeof(args), I get the following compiler warning:
foo.c:5:27: warning: sizeof on array function parameter will return size of 'int (*)[20]' instead of 'int [10][20]' [-Wsizeof-array-argument]
printf("%zd\n", sizeof(args));
^
foo.c:3:14: note: declared here
void foo(int args[10][20])
^
1 warning generated.
Here, the compiler is warning that it is only going to give the size of the pointer into which the array has decayed instead of the size of the array itself.
The problem and how to overcome it in C++
The problem has been explained extensively by pat and Matt. The compiler basically ignores the first dimension of the array's size, effectively ignoring the size of the passed argument.
In C++, on the other hand, you can easily overcome this limitation in two ways:
using references
using std::array (since C++11)
References
If your function is only trying to read or modify an existing array (not copying it) you can easily use references.
For example, let's assume you want to have a function that resets an array of ten ints setting every element to 0. You can easily do that by using the following function signature:
void reset(int (&array)[10]) { ... }
Not only will this work just fine, but it will also enforce the dimension of the array.
You can also make use of templates to make the above code generic:
template<class Type, std::size_t N>
void reset(Type (&array)[N]) { ... }
And finally you can take advantage of const correctness. Let's consider a function that prints an array of 10 elements:
void show(const int (&array)[10]) { ... }
By applying the const qualifier we are preventing possible modifications.
The standard library class for arrays
If you consider the above syntax both ugly and unnecessary, as I do, we can throw it in the can and use std::array instead (since C++11).
Here's the refactored code:
void reset(std::array<int, 10>& array) { ... }
void show(std::array<int, 10> const& array) { ... }
Isn't it wonderful? Not to mention that the generic code trick I showed you earlier still works:
template<class Type, std::size_t N>
void reset(std::array<Type, N>& array) { ... }
template<class Type, std::size_t N>
void show(const std::array<Type, N>& array) { ... }
Not only that, but you get copy and move semantics for free. :)
template<class Type, std::size_t N>
void copy(std::array<Type, N> array) {
    // a copy of the original passed array
    // is made and can be dealt with independently
    // from the original
}
So, what are you waiting for? Go use std::array.
It's a fun feature of C that allows you to effectively shoot yourself in the foot if you're so inclined. I think the reason is that C is just a step above assembly language. Size checking and similar safety features have been removed to allow for peak performance, which isn't a bad thing if the programmer is being very diligent. Also, assigning a size to the function argument has the advantage that when the function is used by another programmer, there's a chance they'll notice a size restriction. Just using a pointer doesn't convey that information to the next programmer.
First, C never checks array bounds. Doesn't matter if they are local, global, static, parameters, whatever. Checking array bounds means more processing, and C is supposed to be very efficient, so array bounds checking is done by the programmer when needed.
Second, there is a trick that makes it possible to pass-by-value an array to a function. It is also possible to return-by-value an array from a function. You just need to create a new data type using struct. For example:
typedef struct {
int a[10];
} myarray_t;
myarray_t my_function(myarray_t foo) {
myarray_t bar;
...
return bar;
}
You have to access the elements like this: foo.a[1]. The extra ".a" might look weird, but this trick adds great functionality to the C language.
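A short usage sketch (increment_all and demo are invented names) showing that the wrapped array really is copied in and out by value:
typedef struct {
    int a[10];
} myarray_t;

// The whole struct, array included, is copied in and copied back out.
myarray_t increment_all(myarray_t foo) {
    for (int i = 0; i < 10; i++)
        foo.a[i]++;
    return foo;              // returned by value, too
}

void demo(void) {
    myarray_t x = { { 0 } };
    myarray_t y = increment_all(x);
    // x.a[0] is still 0; y.a[0] is 1 - the function worked on a copy.
    (void)y;
}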
To tell the compiler that myArray points to an array of at least 10 ints:
void bar(int myArray[static 10])
This is C99 syntax (it is not valid C++). A good compiler may then warn if it can see a caller passing an array with fewer than 10 elements, or a null pointer. Without the "static" keyword, the 10 would mean nothing at all.
This is a well-known "feature" of C, passed over to C++ because C++ is supposed to correctly compile C code.
Problem arises from several aspects:
An array name is supposed to be completely equivalent to a pointer.
C is supposed to be fast, originally developed to be a kind of "high-level Assembler" (especially designed to write the first "portable Operating System": Unix), so it is not supposed to insert "hidden" code; runtime range checking is thus "forbidden".
Machine code generated to access a static array or a dynamic one (either on the stack or heap-allocated) is actually different.
Since the called function cannot know the "kind" of array passed as an argument, everything is supposed to be a pointer and treated as such.
You could say arrays are not really supported in C (this is not really true, as I was saying before, but it is a good approximation); an array is really treated as a pointer to a block of data and accessed using pointer arithmetic.
Since C does NOT have any form of RTTI, you have to declare the size of the array element in the function prototype (to support pointer arithmetic). This is even "more true" for multidimensional arrays.
Anyway, all of the above is not really true anymore :p
Most modern C/C++ compilers can do some bounds checking, but the standards do not require it, and it is off by default (for backward compatibility and performance). Reasonably recent versions of gcc, for example, do some compile-time range checking with "-O2 -Wall" (via -Warray-bounds), and run-time checks can be enabled with sanitizers such as "-fsanitize=address" or "-fsanitize=bounds".
C will not only transform a parameter of type int[5] into int*; given the declaration typedef int intArray5[5];, it will transform a parameter of type intArray5 into int* as well. There are some situations where this behavior, although odd, is useful (especially with things like the va_list defined in stdarg.h, which some implementations define as an array). It would be illogical to allow as a parameter a type defined as int[5] (ignoring the dimension) but not allow int[5] to be specified directly.
I find C's handling of parameters of array type to be absurd, but it's a consequence of efforts to take an ad-hoc language, large parts of which weren't particularly well-defined or thought-out, and try to come up with behavioral specifications that are consistent with what existing implementations did for existing programs. Many of the quirks of C make sense when viewed in that light, particularly if one considers that when many of them were invented, large parts of the language we know today didn't exist yet. From what I understand, in C's predecessors BCPL and B, compilers didn't really keep track of variable types very well. A declaration int arr[5]; was equivalent to int anonymousAllocation[5], *arr = anonymousAllocation; and once the allocation was set aside, the compiler neither knew nor cared whether arr was a pointer or an array. When accessed as either arr[x] or *arr, it would be regarded as a pointer regardless of how it was declared.
One thing that hasn't been answered yet is the actual question.
The answers already given explain that arrays cannot be passed by value to a function in either C or C++. They also explain that a parameter declared as int[] is treated as if it had type int *, and that a variable of type int[] can be passed to such a function.
But they don't explain why it has never been made an error to explicitly provide an array length.
void f(int *); // makes perfect sense
void f(int []); // sort of makes sense
void f(int [10]); // makes no sense
Why isn't the last of these an error?
One reason is that making it an error would cause problems with typedefs.
typedef int myarray[10];
void f(myarray array);
If it were an error to specify the array length in function parameters, you would not be able to use the myarray name in the function parameter. And since some implementations use array types for standard library types such as va_list, and all implementations are required to make jmp_buf an array type, it would be very problematic if there were no standard way of declaring function parameters using those names: without that ability, there could not be a portable implementation of functions such as vprintf.
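A small sketch of that, assuming the myarray typedef above: even through the typedef, the parameter is adjusted to a pointer.
#include <type_traits>

typedef int myarray[10];

void f(myarray array)
{
    // The parameter declared with the array typedef was adjusted to int*.
    static_assert(std::is_same<decltype(array), int*>::value, "adjusted to int*");
}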
It is also allowed so that compilers are able to check whether the size of the array passed is the same as what is expected, and they may warn if it is not.
Is there a "good" way to write "pointer to something" in C/C++ ?
I use to write void foo( char *str ); But sometimes I find it quite illogical because the type of str is "pointer to char", then it should more logical to attach the * to the type name.
Is there a rule to write pointers ?
char*str;
char* str;
char *str;
char * str;
There is no strict rule, but bear in mind that the * attaches to the variable, so:
char *str1, *str2; // str1 and str2 are pointers
char* str1, str2; // str1 is a pointer, str2 is a char
Some people like to do char * str1 as well, but it's up to you or your company's coding standard.
The common C convention is to write T *p, whereas the common C++ convention is to write T* p. Both parse as T (*p); the * is part of the declarator, not the type specifier. It's purely an accident of pointer declaration syntax that you can write it either way.
C (and by extension, C++) declaration syntax is expression-centric; IOW, the form of a declaration should match the form of an expression of the same type in the code.
For example, suppose we had a pointer to int, and we wanted to access that integer value. To do so, we dereference the pointer with the * indirection operator, like so:
x = *p;
The type of the expression *p is int; thus, it follows that the declaration of p should be
int *p
The int-ness of p is provided by the type specifier int, but the pointer-ness of p is provided by the declarator *p.
As a slightly more complicated example, suppose we had a pointer to an array of float, and wanted to access the floating point value at the i'th element of the array through the pointer. We dereference the array pointer and subscript the result:
f = (*ap)[i];
The type of the expression (*ap)[i] is float, so it follows that the declaration of the array pointer is
float (*ap)[N];
The float-ness of ap is provided by the type specifier float, but the pointer-ness and array-ness are provided by the declarator (*ap)[N]. Note that in this case the * must explicitly be bound to the identifier; [] has a higher precedence than unary * in both expression and declaration syntax, so float* ap[N] would be parsed as float *(ap[N]), or "array of pointers to float", rather than "pointer to array of float". I suppose you could write that as
float(* ap)[N];
but I'm not sure what the point would be; it doesn't make the type of ap any clearer.
Even better, how about a pointer to a function that returns a pointer to an array of pointer to int:
int *(*(*f)())[N];
Again, at least two of the * operators must explicitly be bound in the declarator; binding the last * to the type specifier, as in
int* (*(*f)())[N];
just indicates confused thinking IMO.
Even though I use it in my own C++ code, and even though I understand why it became popular, the problem I have with the reasoning behind the T* p convention is that it just doesn't apply outside of the simplest of pointer declarations, and it reinforces a simplistic-to-the-point-of-being-wrong view of C and C++ declaration syntax. Yes, the type of p is "pointer to T", but that doesn't change the fact that as far as the language grammar is concerned * binds to the declarator, not the type specifier.
For another case, if the type of a is "N-element array of T", we don't write
T[N] a;
Obviously, the grammar doesn't allow it. Again, the argument just doesn't apply in this case.
EDIT
As Steve points out in the comments, you can use typedefs to hide some of the complexity. For example, you could rewrite
int *(*(*f)())[N];
as something like
typedef int *iptrarr[N]; // iptrarr is an array of pointer to int
typedef iptrarr *arrptrfunc(); // arrptrfunc is a function returning
// a pointer to iptrarr
arrptrfunc *f; // f is a pointer to arrptrfunc
Now you can cleanly apply the T* p convention, declaring f as arrptrfunc* f. I personally am not fond of doing things this way, since it's not necessarily clear from the typedef how f is supposed to be used in an expression, or how to use an object of type arrptrfunc. The non-typedef'd version may be ugly and difficult to read, but at least it tells you everything you need to know up front; you don't have to go digging through all the typedefs.
The "good way" depends on
internal coding standards in your project
your personal preferences
(probably) in that order.
There is no right or wrong in this. The important thing is to pick one coding standard and stick to it.
That being said, I personally believe that the * belongs with the type and not the variable name, as the type is "pointer to char". The variable name is not a pointer.
I think this is going to be heavily influenced by the general pattern in how one declares the variables.
For example, I have a tendency to declare only one variable per line. This way, I can add a comment reminding me how the variable is to be used.
However, there are times, when it is practical to declare several variables of the same type on one line. Under such circumstances, my personal coding rule is to never, NEVER, EVER declare pointers on the same line as non-pointers. I find that mixing them can be a source of errors, so I try to make it easier to see "wrongness" by avoiding mixing.
As long as I follow the first guideline, I find that it does not matter overly much how I declare the pointers so long as I am consistent.
However, if I use the second guideline and declare several pointers on the same line, I find the following style to be most beneficial and clear (of course others may disagree) ...
char *ptr1, *ptr2, *ptr3;
By having no space between the * and the pointer name, it becomes easier to spot whether I have violated the second guideline.
Now, if I wanted to be consistent between my two personal style guidelines, when declaring only one pointer on a line, I would use ...
char *ptr;
Anyway, that's my rationale for part of why I do what I do. Hope this helps.