Understanding difference/similarities with array and pointers in c++ [duplicate] - c++

This question already has answers here:
Why am I being told that an array is a pointer? What is the relationship between arrays and pointers in C++?
(6 answers)
Closed 5 years ago.
I am trying to understand arrays and pointers in c++.
In a past project, I created a pointer and then pointed it at an array of the same type. I've read previous posts, though, that say you can't assign a pointer to an array. From the pointer, I was also able to use an index to get a certain element (square brackets []).
So I was wondering if anyone could explain arrays and pointers in C++ to me, and why something like this is possible. I've tried looking for it online, but haven't found anything.
Example,
#include <iostream>
using namespace std;
int main()
{
int *pointer = new int[10];
for (int i = 0; i < 10; i++)
{
pointer[i] = i;
}
for (int i = 0; i < 10; i++)
{
cout << pointer[i];
}
delete[] pointer; // release the array allocated with new[]
return 0; // 0 signals success
}

The main difference between arrays and pointers is that they are completely different things.
An array is a collection of objects, which is laid out contiguously in memory. For example, int x[5] defines an array named x, which is a collection of 5 integers, laid out side by side in memory. Individual elements in the array may be accessed using "array syntax" of the form x[i] where i is an integral value between 0 and 4. (Other values of i will result in undefined behaviour).
A pointer is a variable which holds a value that is an address in memory. For example, int *p defines p as a pointer to an int, and it can be initialised with the address of a variable of type int. For example, p = &some_int causes p to contain the address of some_int. When that is done, the notation *p (called dereferencing) provides access to the pointed-to variable. For example, *p = 42 will set some_int to have the value 42.
You'll notice, in the description above, I have not used the word "pointer" in describing an array, nor have I used the word "array" to describe a pointer. They are completely different things.
However, they can be used in ways that makes them seem the same, because of a few rules in the language. Firstly, there is a conversion called the "array-to-pointer" conversion. Because of this, it is possible to do
int x[5];
int *p = x;
The initialisation of p actually works by using the array-to-pointer conversion. Because it is being used to initialise a pointer, the compiler implicitly converts x to a pointer, equal to the address of x[0]. To do this explicitly (without the compiler silently and sneakily doing a conversion) you could have written
int *p = &x[0];
and got exactly the same effect. Either way, the assignment *p = 42 will subsequently have the effect of assigning 42 to x[0].
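As a minimal sketch of the conversion just described (the helper name decay_matches_address_of_first is made up for illustration), the implicit and explicit spellings can be compared directly:

```cpp
#include <cassert>

// Hypothetical helper: returns true if both spellings yield the same pointer
// and a write through that pointer lands in x[0].
bool decay_matches_address_of_first() {
    int x[5] = {0, 0, 0, 0, 0};
    int *p = x;        // implicit array-to-pointer conversion
    int *q = &x[0];    // the same thing, written explicitly
    *p = 42;           // writes through the pointer into x[0]
    return p == q && x[0] == 42;
}
```

Calling decay_matches_address_of_first() from main should return true on any conforming compiler.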
That suggests there is a relationship between expressions involving pointers and expressions involving (the name of) arrays. If p is equal to &x[0], then
p + i is equivalent to &x[i]; AND
*(p + i) is equivalent to x[i].
The language rules of C and C++ make these relationships symmetric, so (look carefully here)
x + i is equivalent to &x[i]; AND
*(x + i) is equivalent to x[i]
and, with pointers
p + i is equivalent to &p[i]; AND
*(p + i) is equivalent to p[i]
Which basically means that pointer syntax can be used to work with arrays (thanks to the array-to-pointer conversion) AND array syntax can be used to work with pointers.
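The equivalences listed above can be checked mechanically. This is only a sketch; the helper name subscript_equivalences_hold and the sample values are assumptions, not part of the original answer:

```cpp
#include <cassert>

// Hypothetical helper: checks each listed equivalence for every valid index.
bool subscript_equivalences_hold() {
    int x[5] = {10, 20, 30, 40, 50};
    int *p = x;
    for (int i = 0; i < 5; ++i) {
        if (x + i != &x[i]) return false;     // the array name decays, then + i
        if (*(x + i) != x[i]) return false;
        if (p + i != &p[i]) return false;     // same rules for a real pointer
        if (*(p + i) != p[i]) return false;
    }
    return true;
}
```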
Really bad textbooks then go on from this and conclude that pointers are arrays and that arrays are pointers. But they are not. If you find textbooks which say such things, burn them. Arrays and pointers are different things entirely. What we have here is a syntactic equivalence - even though arrays and pointers are different things entirely, they can be worked on using the same syntax.
One of the differences - where the syntactic equivalence does not apply - is that arrays cannot be reassigned. For example;
int x[5];
int y[5];
int *p = y; // OK - array-to-pointer conversion
x = y; // error since x is an array
x = p; // error since x is an array
The last two statements will be diagnosed by a C or C++ compiler as an error, because x is an array.
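If the goal is to copy the contents of one array into another, the elements have to be copied individually. The sketch below uses std::copy for that, and uses std::is_assignable to let the compiler confirm the rule; the helper name copy_and_read_last is hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <type_traits>

// The type system agrees: an int[5] lvalue is not assignable, an int* is.
static_assert(!std::is_assignable<int (&)[5], int (&)[5]>::value,
              "arrays cannot be assigned");
static_assert(std::is_assignable<int *&, int (&)[5]>::value,
              "a pointer can be assigned from an array");

// Hypothetical helper: copies y into x element by element,
// since x = y would be rejected by the compiler.
int copy_and_read_last() {
    int x[5] = {};
    int y[5] = {1, 2, 3, 4, 5};
    std::copy(std::begin(y), std::end(y), std::begin(x));
    return x[4];
}
```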
Your example
int *pointer = new int[10];
is a little different again. pointer is still not an array. It is a pointer, initialised with a "new expression", which dynamically allocates an array of 10 integers. But because of the syntactic equivalence of pointers and arrays, pointer can be treated syntactically AS IF it is an array of 10 elements.
Note: the above is concerned with raw arrays. The C++ standard library also has a type named std::array which is a data structure which contains an array, but behaves somewhat differently than described here.
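For comparison, here is a rough sketch of how std::array differs from a raw array in the respects discussed above (the helper name std_array_demo is made up):

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper contrasting std::array with a raw array.
std::size_t std_array_demo() {
    std::array<int, 5> a = {1, 2, 3, 4, 5};
    std::array<int, 5> b = {};
    b = a;                  // whole-array assignment is allowed
    int *p = b.data();      // no implicit decay; ask for the pointer explicitly
    assert(p[4] == 5);      // array syntax still works on that pointer
    return b.size();        // the size travels with the object
}
```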

A pointer is really just a variable that holds the address of the memory it points to. The memory itself can be allocated by one of the heap management facilities such as 'malloc' or 'new'. In general, you ask for a certain amount of memory to be allocated and get its address back, which you then keep in a pointer variable.
Arrays in 'c++/c' are contiguous chunks of memory. So, in terms of memory layout, there is no difference between allocating 40 bytes and allocating an array of 10 integers (assuming that an int is 4 bytes long).
'c/c++' also understands the type of the data which the allocated memory contains. The languages provide a feature named 'pointer arithmetic'. This means that you can add to a pointer or subtract from it, and the language will scale the value by the size of the pointed-to type. I.e.
int *a = new ...
int b = *(a+4);
In the above case the value of 'b' will be the value stored 16 bytes past the address held in 'a' (again assuming a 4-byte int).
The above is very similar to array indexing arithmetic, and the language allows you to use array indexing instead of the above:
b = a[4];
So, in this sense both worlds intersect and you can interchangeably use either pointer arithmetic or array arithmetic in the language.
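To see the scaling by element size, one can measure the distance in bytes between a and a + 4. This is a sketch only; byte_offset_of_fourth is a hypothetical name, and the result is 16 only on platforms where int is 4 bytes:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: returns the byte distance between a and a + 4.
std::ptrdiff_t byte_offset_of_fourth() {
    int *a = new int[10]();   // ten ints, value-initialised to zero
    a[4] = 7;
    int b = *(a + 4);         // same element as a[4]
    assert(b == 7);
    std::ptrdiff_t bytes =
        reinterpret_cast<char *>(a + 4) - reinterpret_cast<char *>(a);
    delete[] a;
    return bytes;             // 4 * sizeof(int)
}
```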
Arrays do not have to be allocated on the heap, and pointers do not have to address the heap only. You can have an array allocated on the stack or in the global scope and have a pointer to it:
int myarray[10];
int *pointer = myarray;
Now you can apply either pointer arithmetic or array arithmetic to the pointer. The following are equivalent (if you did not advance the pointer):
myarray[3]
pointer[3]
*(pointer + 3)
Hope it clarifies the issue for you.

From wikipedia:
A pointer references a location in memory, and obtaining the value
stored at that location is known as dereferencing the pointer
Arrays are the contiguous memory locations and their location in memory is referenced by a pointer.
int *pointer = new int[10];
for (int i = 0; i < 10; i++)
{
pointer[i] = i;
}
The code above actually accesses the address pointer + i * sizeof(int), thanks to operator[] (reading pointer here as a plain integer address, so this is byte arithmetic rather than pointer arithmetic).

Related

Error: cannot convert ‘std::string (*)[3]' to ‘std::string** in return [duplicate]

I'm trying to understand the nature of type-decay. For example, we all know arrays decay into pointers in a certain context. My attempt is to understand how int[] equates to int* but how two-dimensional arrays don't correspond to the expected pointer type. Here is a test case:
std::is_same<int*, std::decay<int[]>::type>::value; // true
This returns true as expected, but this doesn't:
std::is_same<int**, std::decay<int[][1]>::type>::value; // false
Why is this not true? I finally found a way to make it return true, and that was by making the first dimension a pointer:
std::is_same<int**, std::decay<int*[]>::type>::value; // true
And the assertion holds true for any type with pointers but with the last being the array. For example (int***[] == int****; // true).
Can I have an explanation as to why this is happening? Why don't the array types correspond to the pointer types as would be expected?
Why does int*[] decay into int** but not int[][]?
Because it would be impossible to do pointer arithmetic with it.
For example, int p[5][4] means an array of (length-4 array of int). There are no pointers involved, it's simply a contiguous block of memory of size 5*4*sizeof(int). When you ask for a particular element, e.g. int a = p[i][j], the compiler is really doing this:
char *tmp = (char *)p // Work in units of bytes (char)
+ i * sizeof(int[4]) // Offset for outer dimension (int[4] is a type)
+ j * sizeof(int); // Offset for inner dimension
int a = *(int *)tmp; // Back to the contained type, and dereference
Obviously, it can only do this because it knows the size of the "inner" dimension(s). Casting to an int (*)[4] retains this information; it's a pointer to (length-4 array of int). However, an int ** doesn't; it's merely a pointer to (pointer to int).
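The offset calculation above can be turned into a small experiment. This is a sketch under the assumption that the compiler lays the rows out back to back, as described; manual_index and manual_index_matches are hypothetical names, and real code would simply write p[i][j]:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: reads p[i][j] using explicit byte arithmetic.
int manual_index(int (*p)[4], std::size_t i, std::size_t j) {
    const char *tmp = reinterpret_cast<const char *>(p)
                    + i * sizeof(int[4])   // skip i whole rows
                    + j * sizeof(int);     // skip j ints within the row
    return *reinterpret_cast<const int *>(tmp);
}

// Compares the manual calculation with the built-in subscript for every cell.
bool manual_index_matches() {
    int grid[5][4];
    for (int i = 0; i < 5; ++i)
        for (int j = 0; j < 4; ++j)
            grid[i][j] = 10 * i + j;
    int (*rows)[4] = grid;   // decays to pointer to (array of 4 int), not int**
    for (std::size_t i = 0; i < 5; ++i)
        for (std::size_t j = 0; j < 4; ++j)
            if (manual_index(rows, i, j) != grid[i][j]) return false;
    return true;
}
```

Note that manual_index needs the 4 in its parameter type to compute the row size, which is exactly the information an int ** would not carry.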
For another take on this, see the following sections of the C FAQ:
6.18: My compiler complained when I passed a two-dimensional array to a function expecting a pointer to a pointer.
6.19: How do I write functions which accept two-dimensional arrays when the width is not known at compile time?
6.20: How can I use statically- and dynamically-allocated multidimensional arrays interchangeably when passing them to functions?
(This is all for C, but this behaviour is essentially unchanged in C++.)
C was not really "designed" as a language; instead, features were added as needs arose, with an effort not to break earlier code. Such an evolutionary approach was a good thing in the days when C was being developed, since it meant that for the most part developers could reap the benefits of the earlier improvements in the language before everything the language might need to do was worked out. Unfortunately, the way in which array- and pointer handling have evolved has led to a variety of rules which are, in retrospect, unfortunate.
In the C language of today, there is a fairly substantial type system, and variables have clearly defined types, but things were not always thus. A declaration char arr[8]; would allocate 8 bytes in the present scope, and make arr point to the first of them. The compiler wouldn't know that arr represented an array--it would represent a char pointer just like any other char*. From what I understand, if one had declared char arr1[8], arr2[8];, the statement arr1 = arr2; would have been perfectly legal, being somewhat equivalent conceptually to char *st1 = "foo", *st2 = "bar"; st1 = st2;, but would have almost always represented a bug.
The rule that arrays decay into pointers stemmed from a time when arrays and pointers really were the same thing. Since then, arrays have come to be recognized as a distinct type, but the language needed to remain essentially compatible with the days when they weren't. When the rules were being formulated, the question of how two-dimensional arrays should be handled wasn't an issue because there was no such thing. One could do something like char foo[20]; char *bar[4]; int i; for (i=0; i<4; i++) bar[i] = foo + (i*5); and then use bar[x][y] in the same way as one would now use a two-dimensional array, but a compiler wouldn't view things that way--it just saw bar as a pointer to a pointer. If one wanted to make bar[1] point somewhere completely different from bar[2], one could perfectly legally do so.
When two-dimensional arrays were added to C, it was not necessary to maintain compatibility with earlier code that declared two-dimensional arrays, because there wasn't any. While it would have been possible to specify that char bar[4][5]; would generate code equivalent to what was shown using the foo[20], in which case a char[][] would have been usable as a char**, it was thought that just as assigning array variables would have been a mistake 99% of the time, so too would have been re-assignment of array rows, had that been legal. Thus, arrays in C are recognized as distinct types, with their own rules which are a bit odd, but which are what they are.
Because int[M][N] and int** are incompatible types.
However, int[M][N] can decay into the type int (*)[N]. So the following:
std::is_same<int(*)[1], std::decay<int[1][1]>::type>::value;
should give you true.
Two dimensional arrays are not stored as pointer to pointers, but as a contiguous block of memory.
An object declared as type int[y][x] is a block of size sizeof(int) * x * y, whereas an object of type int ** is a pointer to an int*.
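The type relationships from this thread can be handed to the compiler as static_asserts. The following is a sketch; sum_through_row_pointer is a made-up helper that shows the pointer type a two-dimensional array actually decays to:

```cpp
#include <cassert>
#include <type_traits>

static_assert(std::is_same<int *, std::decay<int[]>::type>::value, "");
static_assert(std::is_same<int (*)[1], std::decay<int[][1]>::type>::value, "");
static_assert(!std::is_same<int **, std::decay<int[][1]>::type>::value, "");
static_assert(std::is_same<int **, std::decay<int *[]>::type>::value, "");
static_assert(sizeof(int[3][4]) == 3 * 4 * sizeof(int), "one contiguous block");

// Hypothetical helper: walks a 3x2 grid through the pointer type it decays to.
int sum_through_row_pointer() {
    int grid[3][2] = {{1, 2}, {3, 4}, {5, 6}};
    int (*rows)[2] = grid;      // OK: int[3][2] decays to int (*)[2]
    // int **wrong = grid;      // would not compile: incompatible types
    int sum = 0;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 2; ++j)
            sum += rows[i][j];
    return sum;
}
```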

Array and pointers in c++ [duplicate]

This question already has answers here:
C/C++ int[] vs int* (pointers vs. array notation). What is the difference?
(5 answers)
Why am I being told that an array is a pointer? What is the relationship between arrays and pointers in C++?
(6 answers)
Closed 6 years ago.
I often hear that the name of an array is constant pointer to a block of memory therefore statement like
int a[10];
and
int * const p= a;
must be equal in a sense that p is pointer that points to the same block of memory as array a[] and also it may not be changed to point to another location in memory.
However if you try to print sizeof these two pointers you get different results:
cout<< sizeof(a); // outputs size of 10 integer elements
While
cout<< sizeof(p); // outputs sizeof pointer to int
So, why do compilers treat these two differently? What's the true relation between arrays and pointers from compiler point of view?
I often hear that the name of an array is constant pointer to a block of memory
You've often been misled - or you've simply misunderstood. An array is not a constant pointer to a block of memory. An array is an object that contains a sequence of sub-objects. Every object occupies a block of memory. A pointer is an object that contains the address of an object, i.e. it points to the object.
So in the following quote, a is an array, p points to the first sub-object within a.
int a[10];
and
int * const p= a;
must be equal in a sense that p is pointer that points to the same block of memory as array a[] and also it may not be changed to point to another location in memory.
If that is your definition of equal, then that holds for non-array objects as well:
char c;
char * const p = &c;
Here p "points to the same memory as c" and may not be changed to point to another location in memory. Does that mean that char objects are "equal" to pointers? No. And arrays aren't either.
But isn't a (the name of the array), a constant pointer that points to the same element of the array?
No, the name of the array isn't a constant pointer. Just like name of the char isn't a constant pointer.
the name of an array holds the address of the first element in the array, right?
Let's be more general, this is not specific to arrays. The name of a variable "holds the address" of the object that the variable names. The address is not "held" in the memory at run time. It's "held" by the compiler at compile time. When you operate on a variable, the compiler makes sure that operations are done to the object at the correct address.
The address of the array is always the same address as where the first element (sub-object) of the array is. Therefore, the name indeed does - at least conceptually - hold the same address.
And if i use *(a+1), this is the same as a[1], right? [typo fixed]
Right. I'll elaborate: for pointers, one is just another way of writing the other. Oh, but a isn't a pointer! Here is the catch: the array operand is implicitly converted to a pointer to its first element. This implicit conversion is called decaying. It is a special feature of array types, and probably the feature that does the most to make the difference between pointers and arrays hard to understand.
So, even though the name of the array isn't a pointer, it can decay into a pointer. The name doesn't always decay into a pointer, just in certain contexts. It decays when you use operator[], and it decays when you use operator+. It decays when you pass the array to a function that accepts a pointer to the type of the sub-object. It doesn't decay when you use sizeof and it doesn't decay when you pass it to a function that accepts an array by reference.
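A short sketch of the contexts listed above, with made-up function names: sizeof and a reference-to-array parameter see the whole array, while a pointer parameter and operator+ see only the decayed pointer.

```cpp
#include <cassert>
#include <cstddef>

// The argument decays: inside, only a pointer is visible.
std::size_t size_seen_by_pointer(int *p) { return sizeof(p); }

// No decay: the parameter refers to the array itself, size included.
std::size_t size_seen_by_reference(int (&a)[10]) { return sizeof(a); }

// Hypothetical helper combining the checks.
bool decay_contexts_demo() {
    int a[10] = {};
    return sizeof(a) == 10 * sizeof(int)                   // sizeof: no decay
        && size_seen_by_reference(a) == 10 * sizeof(int)   // by reference: no decay
        && size_seen_by_pointer(a) == sizeof(int *)        // pointer parameter: decay
        && *(a + 1) == a[1];                               // operator+: decay
}
```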

C++ why accessing dynamic array is used without the asterisk operator?

for example:
#include <iostream>
using namespace std;
int main(){
int *a;
a = new int[2];
a[1] = 1;
}
From what I understand, an array of 2 ints is allocated in "the heap memory" and the pointer a takes the memory address of that newly created array. However, when accessing (for example) the second element of the array (a[1]), the code does so without the asterisk operator, and I don't understand why. I'm used to seeing the value stored at the address held by a pointer accessed as *pointername, not as pointername[value].
So my question is, why do we use the subscript operator to access a pointer which points to an array without the asterisk operator?
In C++, applying operator[] to a pointer p with index i is the semantic equivalent of
*(p + i)
and
*(i + p)
You can think of it as syntactic sugar. Also note that this implies that p[N] is equivalent to N[p].
a[N] is equal to *(a+N) if a is a pointer. Thus, a[1] dereferences the pointer a+1.
An array is a block of memory containing multiple elements of the same size.
The subscript operator just applies an offset from the original pointer to reach the right element. So a[0] is equal to *a, and a[1] is equal to *(a + 1), which refers to the memory 1 * sizeof(element) bytes past a. So in a sense you are right: we do use the asterisk operator in the end, it's just hidden behind syntactic sugar.
The index operator has a particular meaning for built-in arrays and pointers: array[index] is equivalent to *(array+index). One side effect is that if you want to obfuscate your code, index[array] means exactly the same thing (for built-in arrays and pointers, of course). In these cases, the index operator is syntactic sugar for a dereference.
Class types, on the other hand, can overload operator[]. This is the opposite of syntactic sugar.
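Both halves of these answers can be tried out in a few lines. This is only a sketch; Doubler and subscript_demo are hypothetical names introduced for illustration:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class type: its operator[] is an ordinary member function
// and involves no pointer arithmetic at all.
struct Doubler {
    int operator[](std::size_t i) const { return static_cast<int>(2 * i); }
};

bool subscript_demo() {
    int *a = new int[2];
    a[0] = 5;
    a[1] = 1;
    bool ok = a[1] == *(a + 1)   // built-in subscript is defined as *(a + 1)
           && 1[a] == a[1]       // so the operands may be swapped
           && a[0] == *a;        // and a[0] is the plain dereference
    delete[] a;
    Doubler d;
    return ok && d[21] == 42;    // user-defined operator[]
}
```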

What are the ramifications of passing & assigning arrays as pointers in C++?

As background, I gave an answer to this post a little while ago:
Return array in a function
And it unintentionally kicked off a really long comment chain about pointers vs. arrays in C++ because I tried to oversimplify and I made the statement "arrays are pointers". Though my final answer sounds pretty decent, it was only after some heavy editing in response to a lot of the comments I got.
This question is not meant to be troll bait, I understand that a pointer and an array are not the same thing, but some of the available syntax in the C++ language certainly makes them behave very similarly in a lot of cases. (FYI, my compiler is i686-apple-darwin9-g++-4.0.1 on OS X 10.5.8)
For instance, this code compiles and runs just fine for me (I realize x[8] is a potential segmentation fault):
//this is just a simple pointer
int *x = new int;
cout << x << " " << (*x) << " " << x[8] << endl; //might segfault
//this is a dynamic array
int* y = new int[10];
cout << y << " " << (*y) << " " << y[8] << endl;
//this is a static array
int z[10];
cout << z << " " << (*z) << " " << z[8] << endl;
That particular snippet makes it look like pointers and arrays can be used almost identically, but if I add this to the bottom of that code, the last two lines won't compile:
x = y;
x = z;
y = x;
y = z;
//z = x; //won't compile
//z = y; //won't compile
So clearly the compiler at least understands that z and x are different things, but I can interchange x and y just fine.
This is further confusing when you look at passing arrays to functions and returning arrays from functions. Consider this example (again, I am aware of the potential segmentation faults here when passing x):
void foo(int in[])
{
cout << in[8] << endl;
}
void bar(int* in)
{
cout << in[8] << endl;
}
int main()
{
//this is just a simple pointer
int *x = new int;
foo(x);
bar(x);
//this is a dynamic array
int* y = new int[10];
foo(y);
bar(y);
//this is a static array
int z[10];
foo(z);
bar(z);
}
All this code properly compiles and runs on my machine.
I feel like I have a decent internal understanding of what's going on here, but if you asked me to articulate exactly what's happening, I don't feel like I could satisfactorily explain. So here's what I'm getting at:
When I pass an array to a function as int* in instead of int in[], what am I gaining or losing? Is the same true when returning an array as int*? Are there ever bad side effects from doing this?
If I asked you what the data type of y is, would you say pointer to int, array of ints or something else?
Similarly, what happens when I say x = y vs. x = z? I'm still able to use x[] and access the things that were originally in y or z, but is this really just because pointer arithmetic happens to land me in memory space that is still valid?
I've dug through all the similar array/pointer questions on SO and I'm having trouble finding the definitive explanation that clears this up for me once and for all.
C++ is statically typed, so of course the compiler understands that x and z are not the same kind of thing. They have different types - z is an array, x and y are pointers.
The reason z = x doesn't compile isn't (just) that the types are incompatible, though, it's that you can't assign to an array variable at all. Ever. x = z assigns to x, a pointer to the first element of z. x = y assigns the value of y to x.[*]
When I pass an array to a function as int* in instead of int in[], what am I gaining or losing?
They do exactly the same thing, so you have no choice to make. Possibly you have been misled by the fact that C++ syntax permits int in[] as a function parameter. The type of the parameter in is not any kind of array, it is int*.
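The compiler can confirm that the two spellings declare the same function type. A minimal sketch, reusing the names foo and bar from the question; parameter_is_pointer and parameter_demo are made up:

```cpp
#include <cassert>
#include <type_traits>

void foo(int in[]);   // declared with array syntax
void bar(int *in);    // declared with pointer syntax

// Both declarations have the function type void(int*).
static_assert(std::is_same<decltype(foo), decltype(bar)>::value,
              "same function type");

// Hypothetical helper: inside the function, the parameter's type is int*.
bool parameter_is_pointer(int in[]) {
    return in != nullptr && std::is_same<decltype(in), int *>::value;
}

bool parameter_demo() {
    int z[10] = {};
    return parameter_is_pointer(z);   // z decays to &z[0] at the call
}
```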
If I asked you what the data type of y is
It's int*. That's what it's declared as, so that's what it is.
The value that it holds is a pointer to (the first element of) an array. I frequently use that formula: "pointer to (the first element of)" in cases where I'd like to say "pointer to array", but can't because there's the potential for ambiguity as to whether the type involved is pointer-to-array, or not.
However, pointers-to-arrays are rarely used in C++, because the size of the array is part of the type. There's no such type as "pointer to an array of int" in C++, just "pointer to array of 1 int", "pointer to array of 2 int", etc. This usually isn't very convenient, hence the use of a pointer to the first element of an array whose size may not be known at compile time.
is this really just because pointer arithmetic happens to land me in memory space that is still valid
Pretty much, yes. The size of the array is part of the type of z, but is not part of the type of x or y, and also is not part of the type of the result of z decaying to a pointer to its first element. So y could be a pointer to the first of 10 elements, or just to 1 element. You only know the difference by context, and by requiring of your callers that the value you have points to what it's supposed to point to.
"Happens" is leaving too much to chance, though - part of your job when using arrays is to make sure you don't stray beyond their bounds.
[*] z = x isn't allowed, even after you've done x = z, because z is (and always will be) a particular array of 10 ints in memory. Back when C was designed, there was a question of whether array variables could in principle be "reseatable", meaning that you could do:
int z[10];
int y[10];
z = y; // z is now an alias for y
y[0] = 3;
// z[0] now has the value 3
Dennis Ritchie decided not to allow this, because it would prevent him from distinguishing arrays from pointers in a way that he needed to do. So z cannot ever refer to a different array from the one it was declared as. Read all about it here: http://cm.bell-labs.com/cm/cs/who/dmr/chist.html, under "Embryonic C".
Another plausible meaning for z = y could be memcpy(z,y,sizeof(z)). It wasn't given that meaning either.
The fundamental difference between a pointer and an array is that a pointer is itself an object with its own memory location, which holds the address of the array data.
An array name, though treated as a pointer in some contexts, has no separate pointer object stored anywhere in memory. (Taking &z is allowed, but it yields the address of the array itself, with a pointer-to-array type.) When the name is treated as a pointer, its value is simply the address of the array's first element.
That is why you can assign its value to another pointer but not vice versa. There is no pointer memory location to treat as an l-value.
Arrays are not pointers, but arrays easily decay to pointers to their first element. Additionally, C (and thus C++) allow array access syntax to be used for pointers.
When I pass an array to a function as int* in instead of int in[], what am I gaining or losing? Is the same true when returning an array as int*? Are there ever bad side effects from doing this?
You're gaining nothing, because int[] is just another way to write int* in a parameter list. If you want to pass an actual array, you have to pass it by reference, exactly matching its size. Non-type template arguments can ease the problem with the exact size:
template< std::size_t N >
void f(int (&arr)[N])
{
...
}
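As one possible way to fill in such a template (a sketch, not the answerer's code; element_count and count_demo are hypothetical names), the deduced N can simply be returned:

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the argument's array type, so no size needs to be passed.
template <std::size_t N>
std::size_t element_count(int (&)[N]) {
    return N;
}

std::size_t count_demo() {
    int z[10] = {};
    // int *y = z; element_count(y);   // would not compile: y is not an array
    return element_count(z);
}
```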
If I asked you what the data type of y is, would you say pointer to int, array of ints or something else?
It's a pointer to the first element of a dynamically allocated array.
Similarly, what happens when I say x = y vs. x = z?
You assign the addresses of different objects of different types to the same pointer. (And you leak an int on the heap. :))
I'm still able to use x[] and access the things that were originally in y or z, but is this really just because pointer arithmetic happens to land me in memory space that is still valid?
Yep. As I said, pointers conveniently and confusingly allow array syntax to be applied to them. However, that still doesn't make a pointer an array.
Here's a snippet from this book (C++ semantics follow from its backward compatibility with C). Arrays "are" pointers in the following cases:
An array name in an expression (in contrast with a declaration) is treated by the compiler as a pointer to the first element of the array (this does not apply to sizeof) (ANSI C Standard, 6.2.2.1)
A subscript is always equivalent to an offset from a pointer (6.3.2.1)
An array name in the declaration of a function parameter is treated by the compiler as a pointer to the first element of the array (6.7.1)
This basically means that:
int arr[20]; int* p = arr;
is equivalent to:
int arr[20]; int* p = &arr[0];
Then
int arr[20]; int x = arr[10];
is equivalent to:
int arr[20]; int x = *( arr + 10 );
And
void func( int arr[] );
is equivalent to:
void func( int* arr );
On the other hand, pointers are never transformed back into arrays - that's why your last two lines do not compile.
When I pass an array to a function as int* in instead of int in[], what am I gaining or losing? Is the same true when returning an array as int*? Are there ever bad side effects from doing this?
AFAIK, one is just syntactic sugar for the other and they mean exactly the same.
The version with [] probably just gives a strong hint that this function expects a pointer into an array, not a pointer to a single object.
You will notice a difference when it comes to real multi-dimensional arrays vs array of pointers (to arrays), because in such case only the first dimension decays to a pointer with multi-dimensional arrays. Those things have a completely different layout in memory (one big contiguous block vs one small block of pointers to distinct blocks of memory).
If I asked you what the data type of y is, would you say pointer to int, array of ints or something else?
The type of y is a pointer to int. In fact, in case of a dynamically allocated array, you never get to see the array at all! That is, there is no way to determine the size of the allocation with sizeof, unlike actual arrays.
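To make the sizeof point concrete, here is a sketch that compares a real array, a pointer to a dynamically allocated array, and, as one commonly used alternative that is not mentioned in the answer, a std::vector that keeps track of its own size. The name size_tracking_demo is made up:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

bool size_tracking_demo() {
    int z[10] = {};
    int *y = new int[10]();
    std::vector<int> v(10);
    bool ok = sizeof(z) / sizeof(z[0]) == 10   // real array: size is part of the type
           && sizeof(y) == sizeof(int *)       // pointer: only the pointer's own size
           && v.size() == 10;                  // vector: size stored at run time
    delete[] y;
    return ok;
}
```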
Similarly, what happens when I say x = y vs. x = z? I'm still able to use x[] and access the things that were originally in y or z, but is this really just because pointer arithmetic happens to land me in memory space that is still valid?
This is because x is a pointer. You will not be able to do z = x; because you can't assign to arrays.
There's no difference (at all) between a function parameter like int *in and int in[]. For a function parameter, these are just different ways of spelling pointer to T. The only way they can differ at all is (possibly) something like readability (e.g., if you intend to always pass the base address of an array, you might find array notation more fitting, whereas if you intend to pass the address of a single object, you might find pointer notation more tasteful).
In the code above, y clearly has the type pointer to int.
x and y are pointers, which can be assigned. z is an array, which cannot be assigned.
As an aside (which has some relevance to the topic - would have added this as a comment but don't have enough rep.) - you can gauge the number of elements in an array vs. using a pointer. Or stated differently sizeof returns sizeof(array_type)*num_elements_in_array vs returning the size of the pointer. Glib provides this macro for this purpose.
When I pass an array to a function as int* in instead of int in[], what am I gaining or losing? Is the same true when returning an array as int*? Are there ever bad side effects from doing this?
You're not gaining or losing anything.
If I asked you what the data type of y is, would you say pointer to int, array of ints or something else?
I would call y a pointer to an array of ints
Similarly, what happens when I say x = y vs. x = z? I'm still able to use x[] and access the things that were originally in y or z, but is this really just because pointer arithmetic happens to land me in memory space that is still valid?
x = y does not make a copy of the array pointed to by y, only a copy of the pointer in y.
x = z does not make a copy of the array z; it only stores a pointer to the first element of z in x.
Also, free allocated memory.
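A sketch of the question's x, y and z with the cleanup added. The name cleanup_demo and the stored values are assumptions; the point is only that x must be released before it is overwritten, and that memory from new[] is released with delete[]:

```cpp
#include <cassert>

int cleanup_demo() {
    int *x = new int(1);
    int *y = new int[10]();
    int z[10] = {};
    delete x;      // release the single int before x is overwritten
    x = y;         // copies the pointer: x and y now share the dynamic array
    x[8] = 3;
    int seen = y[8];
    x = z;         // x now points at z[0]; y still owns the dynamic array
    x[8] = 4;
    seen += z[8];
    delete[] y;    // array form for memory obtained with new[]
    return seen;   // 3 + 4
}
```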