Interpretation of int (*a)[3] - c++

When working with arrays and pointers in C, one quickly discovers that they are by no means equivalent although it might seem so at a first glance. I know about the differences in L-values and R-values. Still, recently I tried to find out the type of a pointer that I could use in conjunction with a two-dimensional array, i.e.
int foo[2][3];
int (*a)[3] = foo;
However, I just can't find out how the compiler "understands" the type definition of a in spite of the regular operator precedence rules for * and []. If instead I were to use a typedef, the problem becomes significantly simpler:
int foo[2][3];
typedef int my_t[3];
my_t *a = foo;
At the bottom line, can someone answer me the questions as to how the term int (*a)[3] is read by the compiler? int a[3] is simple, int *a[3] is simple as well. But then, why is it not int *(a[3])?
EDIT: Of course, instead of "typecast" I meant "typedef" (it was just a typo).

Every time you have doubts with complex declarations, you can use the cdecl tool in Unix like systems:
[/tmp]$ cdecl
Type `help' or `?' for help
cdecl> explain int (*a)[10];
declare a as pointer to array 10 of int
EDIT:
There is also a online version of this tool available here.
Thanks to Tiberiu Ana and gf

First, you mean "typedef" not "typecast" in your question.
In C, a pointer to type T can point to an object of type T:
int *pi;
int i;
pi = &i;
The above is simple to understand. Now, let's make it a bit more complex. You seem to know the difference between arrays and pointers (i.e., you know that arrays are not pointers, they behave like them sometimes though). So, you should be able to understand:
int a[3];
int *pa = a;
But for completeness' sake: in the assignment, the name a is equivalent to &a[0], i.e., a pointer to the first element of the array a. If you are not sure about how and why this works, there are many answers explaining exactly when the name of an array "decays" to a pointer and when it does not:
My answer to a question titled type of an array,
Another answer with examples of instances when the name of an array does not decay to a pointer, and
The answers to what is array decaying.
I am sure there are many more such questions and answers on SO, I just mentioned some that I found from a search.
Back to the topic: when we have:
int foo[2][3];
foo is of type "array [2] of array [3] of int". This means that foo[0] is an array of 3 ints, and foo[1] is an array of 3 ints.
Now let's say we want to declare a pointer, and we want to assign that to foo[0]. That is, we want to do:
/* declare p somehow */
p = foo[0];
The above is no different in form to the int *pa = a; line, because the types of a and of foo[0] are the same. So, we need int *p; as our declaration of p.
Now, the main thing to remember about arrays is that "the rule" about array's name decaying to a pointer to its first element applies only once. If you have an array of an array, then in value contexts, the name of the array will not decay to the type "pointer to pointer", but rather to "pointer to array". Going back to foo:
/* What should be the type of q? */
q = foo;
The name foo above is a pointer to the first element of foo, i.e., we can write the above as:
q = &foo[0];
The type of foo[0] is "array [3] of int". So we need q to be a pointer to an "array [3] of int":
int (*q)[3];
The parentheses around q are needed because [] binds more tightly than * in C, so int *q[3] declares q as an array of pointers, and we want a pointer to an array. int *(q[3]) is, from above, equivalent to int *q[3], i.e., an array of 3 pointers to int.
Hope that helps. You should also read C for smarties: arrays and pointers for a really good tutorial on this topic.
About reading declarations in general: you read them "inside-out", starting with the name of the "variable" (if there is one). You go left as much as possible unless there is a [] to the immediate right, and you always honor parentheses. cdecl should be able to help you to an extent:
$ cdecl
cdecl> declare p as pointer to array 3 of int
int (*p)[3]
cdecl> explain int (*p)[3]
declare p as pointer to array 3 of int
To read
int (*a)[3];
a # "a is"
(* ) # parentheses, so precedence changes.
# "a pointer to"
[3] # "an array [3] of"
int ; # "int".
For
int *a[3];
a # "a is"
[3] # "an array [3] of"
* # can't go right, so go left.
# "pointer to"
int ; # "int".
For
char *(*(*a[])())()
a # "a is"
[] # "an array of"
* # "pointer to"
( )() # "function taking unspecified number of parameters"
(* ) # "and returning a pointer to"
() # "function"
char * # "returning pointer to char"
(Example from c-faq question 1.21. In practice, if you are reading such a complicated declaration, there is something seriously wrong with the code!)

It declares a pointer to an array of 3 ints.
The parentheses are necessary as the following declares an array of 3 pointers to int:
int* a[3];
You get better readability when using typedef:
typedef int threeInts[3];
threeInts* pointerToThreeInts;

It means
declare a as pointer to array 3 of
int
See cdecl for future reference.

Its probably easier for the compiler to interpret than it is to you. A language defines a set of rules for what constitutes a valid expression and the compiler uses these rules to interpret such expressions (might sound complicated but compilers have been studied in length and there are a lot of standardized algorithms and tools available, you might want to read about parsing). cdecl is a tool that will translate C expressions to English. The source code is available on the website so you can check it out. The book The C++ Programming Language included sample code for writing such a program if I'm not mistaking.
As for interpreting these expressions yourself, I found the best technique to do that in the book Thinking in C++ volume 1:
To define a pointer to a function that has no arguments and no return value, you say:
void (*funcPtr)();
When you are looking at a complex
definition like this, the best way to
attack it is to start in the middle
and work your way out. “Starting in
the middle” means starting at the
variable name, which is funcPtr.
“Working your way out” means looking
to the right for the nearest item
(nothing in this case; the right
parenthesis stops you short), then
looking to the left (a pointer denoted
by the asterisk), then looking to the
right (an empty argument list
indicating a function that takes no
arguments), then looking to the left
(void, which indicates the function
has no return value). This
right-left-right motion works with
most declarations.
To review, “start in the middle”
(“funcPtr is a ...”), go to the right
(nothing there – you're stopped by the
right parenthesis), go to the left and
find the ‘*’ (“... pointer to a ...”),
go to the right and find the empty
argument list (“... function that
takes no arguments ... ”), go to the
left and find the void (“funcPtr is a
pointer to a function that takes no
arguments and returns void”).
You can download the book for free on Bruce Eckel's download site. Let's try to apply it to int (*a)[3]:
Start at the name in the middle, in this case the a. So far it reads: 'a is a...'
Go to the right, you encounter a closing parenthesis stopping you, so you go to the left and find the *. Now you read it: 'a is a pointer to...'
Go to the right and you find [3], meaning an array of 3 elements. The expression becomes: 'a is a pointer to an array of 3...'
Finally you go to the left and find the int, so finally you get 'a is a pointer to an array of 3 ints'.

You may find this article helpful:
Reading C Declarations: A Guide for the Mystified
I've made this answer a community wiki (since it's just a link to an article, not an answer), if anyone wants to add further resources to it.

Why what is "not int *(a[3])" exactly (referring to your last question)? The int *(a[3]) is the same as plain int *a[3]. The braces are redundant. It is an array of 3 pointers to int and you said you know what it means.
The int (*a)[3] is a pointer to an array of 3 int (i.e. a pointer to int[3] type). The braces in this case are important. You yourself provided a correct example the makes an equivalent declaration through an intermediate typedef.
The simple popular inside-out C declaration reading rule works in this case perfectly well. In the int *a[3] declaration we'd have to start from a and go right first, i.e. to get a[3], meaning that a is an array of 3 something (and so on). The () in int (*a)[3] changes that, so we can't go right and have to start from *a instead: a is a pointer to something. Continuing to decode the declaration we'll come to the conclusion that that something is an array of 3 int.
In any case, find a good tutorial on reading C declarations, of which there are quite a few available on the Net.

In situations when you don't know what is declaration for, I found very helpful so called Right-left rule.
BTW. Once you've learned it, such declarations are peace of cake:)

Related

Why is the dereference operator (*) also used to declare a pointer?

I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.
When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.
Consider the following:
void foo(int* bar)
{
}
int main()
{
int x = 5;
int* y = NULL;
y = &x;
*y = 15;
foo(y);
}
The function foo(int*) takes an int pointer as parameter. Since I've declared y as int pointer, I can pass y to foo, but when first learning C++ I associated the * symbol with dereferencing, as such I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work.
Wouldn't it have been easier to have a separate operator for declaring a pointer? (or for dereferencing). For example:
void test(int# x)
{
}
In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:
The second innovation that most clearly distinguishes C from its
predecessors is this fuller type structure and especially its
expression in the syntax of declarations... given an object of any
type, it should be possible to describe a new object that gathers
several into an array, yields it from a function, or is a pointer to
it.... [This] led to a
declaration syntax for names mirroring that of the expression syntax
in which the names typically appear. Thus,
int i, *pi, **ppi; declare an integer, a pointer to an integer, a
pointer to a pointer to an integer. The syntax of these declarations
reflects the observation that i, *pi, and **ppi all yield an int type
when used in an expression.
Similarly, int f(), *f(), (*f)(); declare
a function returning an integer, a function returning a pointer to an
integer, a pointer to a function returning an integer. int *api[10],
(*pai)[10]; declare an array of pointers to integers, and a pointer to
an array of integers.
In all these cases the declaration of a
variable resembles its usage in an expression whose type is the one
named at the head of the declaration.
An accident of syntax contributed to the perceived complexity of the
language. The indirection operator, spelled * in C, is syntactically a
unary prefix operator, just as in BCPL and B. This works well in
simple expressions, but in more complex cases, parentheses are
required to direct the parsing. For example, to distinguish
indirection through the value returned by a function from calling a
function designated by a pointer, one writes *fp() and (*pf)()
respectively. The style used in expressions carries through to
declarations, so the names might be declared
int *fp(); int (*pf)();
In more ornate but still realistic cases,
things become worse: int *(*pfp)(); is a pointer to a function
returning a pointer to an integer.
There are two effects occurring.
Most important, C has a relatively rich set of ways of describing
types (compared, say, with Pascal). Declarations in languages as
expressive as C—Algol 68, for example—describe objects equally hard to
understand, simply because the objects themselves are complex. A
second effect owes to details of the syntax. Declarations in C must be
read in an `inside-out' style that many find difficult to grasp.
Sethi [Sethi 81] observed that many of the nested
declarations and expressions would become simpler if the indirection
operator had been taken as a postfix operator instead of prefix, but
by then it was too late to change.
The reason is clearer if you write it like this:
int x, *y;
That is, both x and *y are ints. Thus y is an int *.
That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.
Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:
A pointer type
The dereference operator
Multiplication
You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that it they are a significant barrier to understanding for language newcomers.
Why not choose another symbol for C++?
Backwards-compatibility is the root cause... best to re-use existing symbols in a new context than to break C programs by translating previously-not-operators into new meanings.
Why not choose another symbol for C?
It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:
when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}
This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:
int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style
though both varieties are found in programs of both languages, varyingly.
Page 65 of Expert C Programming: Deep C Secrets includes the following: And then, there is the C philosophy that the declaration of an object should look like its use.
Page 216 of The C Programming Language, 2nd edition (aka K&R) includes: A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.
I prefer the way van der Linden puts it.
Haha, I feel your pain, I had the exact same problem.
I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.
After a while I thought for myself, every type in C has to be read backwards, like
int * const a
is for me
a constant something, when dereferenced equals an int.
Something that has to be dereferenced, has to be a pointer.

good way to write "pointer to something" in C/C++

Is there a "good" way to write "pointer to something" in C/C++ ?
I use to write void foo( char *str ); But sometimes I find it quite illogical because the type of str is "pointer to char", then it should more logical to attach the * to the type name.
Is there a rule to write pointers ?
char*str;
char* str;
char *str;
char * str;
There is no strict rule, but bear in mind that the * attaches to the variable, so:
char *str1, *str2; // str1 and str2 are pointers
char* str1, str2; // str1 is a pointer, str2 is a char
Some people like to do char * str1 as well, but it's up to you or your company's coding standard.
The common C convention is to write T *p, whereas the common C++ convention is to write T* p. Both parse as T (*p); the * is part of the declarator, not the type specifier. It's purely an accident of pointer declaration syntax that you can write it either way.
C (and by extension, C++) declaration syntax is expression-centric; IOW, the form of a declaration should match the form of an expression of the same type in the code.
For example, suppose we had a pointer to int, and we wanted to access that integer value. To do so, we dereference the pointer with the * indirection operator, like so:
x = *p;
The type of the expression *p is int; thus, it follows that the declaration of p should be
int *p
The int-ness of p is provided by the type specifier int, but the pointer-ness of p is provided by the declarator *p.
As a slightly more complicated example, suppose we had a pointer to an array of float, and wanted to access the floating point value at the i'th element of the array through the pointer. We dereference the array pointer and subscript the result:
f = (*ap)[i];
The type of the expression (*ap)[i] is float, so it follows that the declaration of the array pointer is
float (*ap)[N];
The float-ness of ap is provided by the type specifier float, but the pointer-ness and array-ness are provided by the declarator (*ap)[N]. Note that in this case the * must explicitly be bound to the identifer; [] has a higher precedence than unary * in both expression and declaration syntax, so float* ap[N] would be parsed as float *(ap[N]), or "array of pointers to float", rather than "pointer to array of float". I suppose you could write that as
float(* ap)[N];
but I'm not sure what the point would be; it doesn't make the type of ap any clearer.
Even better, how about a pointer to a function that returns a pointer to an array of pointer to int:
int *(*(*f)())[N];
Again, at least two of the * operators must explicitly be bound in the declarator; binding the last * to the type specifier, as in
int* (*(*f)())[N];
just indicates confused thinking IMO.
Even though I use it in my own C++ code, and even though I understand why it became popular, the problem I have with the reasoning behind the T* p convention is that it just doesn't apply outside of the simplest of pointer declarations, and it reinforces a simplistic-to-the-point-of-being-wrong view of C and C++ declaration syntax. Yes, the type of p is "pointer to T", but that doesn't change the fact that as far as the language grammar is concerned * binds to the declarator, not the type specifier.
For another case, if the type of a is "N-element array of T", we don't write
T[N] a;
Obviously, the grammar doesn't allow it. Again, the argument just doesn't apply in this case.
EDIT
As Steve points out in the comments, you can use typedefs to hide some of the complexity. For example, you could rewrite
int *(*(*f)())[N];
as something like
typedef int *iptrarr[N]; // iptrarr is an array of pointer to int
typedef iptrarr *arrptrfunc(); // arrptrfunc is a function returning
// a pointer to iptrarr
arrptrfunc *f; // f is a pointer to arrptrfunc
Now you can cleanly apply the T* p convention, declaring f as arrptrfunc* f. I personally am not fond of doing things this way, since it's not necessarily clear from the typedef how f is supposed to be used in an expression, or how to use an object of type arrptrfunc. The non-typedef'd version may be ugly and difficult to read, but at least it tells you everything you need to know up front; you don't have to go digging through all the typedefs.
The "good way" depends on
internal coding standards in your project
your personal preferences
(probably) in that order.
There is no right or wrong in this. The important thing is to pick one coding standard and stick to it.
That being said, I personally believe that the * belongs with the type and not the variable name, as the type is "pointer to char". The variable name is not a pointer.
I think this is going to be heavily influenced by the general pattern in how one declares the variables.
For example, I have a tendency to declare only one variable per line. This way, I can add a comment reminding me how the variable is to be used.
However, there are times, when it is practical to declare several variables of the same type on one line. Under such circumstances, my personal coding rule is to never, NEVER, EVER declare pointers on the same line as non-pointers. I find that mixing them can be a source of errors, so I try to make it easier to see "wrongness" by avoiding mixing.
As long as I follow the first guideline, I find that it does not matter overly much how I declare the pointers so long as I am consistent.
However, if I use the second guideline and declare several pointers on the same line, I find the following style to be most beneficial and clear (of course others may disagree) ...
char *ptr1, *ptr2, *ptr3;
By having no space between the * and the pointer name, it becomes easier to spot whether I have violated the second guideline.
Now, if I wanted to be consistent between my two personal style guidelines, when declaring only one pointer on a line, I would use ...
char *ptr;
Anyway, that's my rationale for part of why I do what I do. Hope this helps.

How do you declare a pointer to a function that returns a pointer to an array of int values in C / C++?

Is this correct?
int (*(*ptr)())[];
I know this is trivial, but I was looking at an old test about these kind of constructs, and this particular combination wasn't on the test and it's really driving me crazy; I just have to make sure. Is there a clear and solid understandable rule to these kind of declarations?
(ie: pointer to... array of.. pointers to... functions that.... etc etc)
Thanks!
R
The right-left rule makes it easy.
int (*(*ptr)())[];can be interpreted as
Start from the variable name ------------------------------- ptr
Nothing to right but ) so go left to find * -------------- is a pointer
Jump out of parentheses and encounter () ----------- to a function that takes no arguments(in case of C unspecified number of arguments)
Go left, find * ------------------------------------------------ and returns a pointer
Jump put of parentheses, go right and hit [] ---------- to an array of
Go left again, find int ------------------------------------- ints.
In almost all situations where you want to return a pointer to an array the simplest thing to do is to return a pointer to the first element of the array. This pointer can be used in the same contexts as an array name an provides no more or less indirection than returning a pointer of type "pointer to array", indeed it will hold the same pointer value.
If you follow this you want a pointer to a function returning a pointer to an int. You can build this up (construction of declarations is easier than parsing).
Pointer to int:
int *A;
Function returning pointer to int:
int *fn();
pointer to function returning a pointer to int:
int *(*pfn)();
If you really want to return a pointer to a function returning a pointer to an array of int you can follow the same process.
Array of int:
int A[];
Pointer to array of int:
int (*p)[];
Function returning pointer ... :
int (*fn())[];
Pointer to fn ... :
int (*(*pfn)())[];
which is what you have.
You don't. Just split it up into two typedefs: one for pointer to int array, and one for pointer to functions. Something like:
typedef int (*IntArrayPtr_t)[];
typedef IntArrayPtr_t (*GetIntArrayFuncPtr_t)(void);
This is not only more readable, it also makes it easier to declare/define the functions that you are going to assign the variable:
IntArrayPtr_t GetColumnSums(void)
{ .... }
Of course this assumes this was a real-world situation, and not an interview question or homework. I would still argue this is a better solution for those cases, but that's only me. :)
If you feel like cheating:
typedef int(*PtrToArray)[5];
PtrToArray function();
int i = function;
Compiling that on gcc yields: invalid conversion from 'int (*(*)())[5]' to 'int'. The first bit is the type you're looking for.
Of course, once you have your PtrToArray typedef, the whole exercise becomes rather more trivial, but sometimes this comes in handy if you already have the function name and you just need to stick it somewhere. And, for whatever reason, you can't rely on template trickery to hide the gory details from you.
If your compiler supports it, you can also do this:
typedef int(*PtrToArray)[5];
PtrToArray function();
template<typename T> void print(T) {
cout << __PRETTY_FUNCTION__ << endl;
}
print(function);
Which, on my computer box, produces void function(T) [with T = int (* (*)())[5]]
Being able to read the types is pretty useful, since understanding compiler errors is often dependent on your ability to figure out what all those parenthesis mean. But making them yourself is less useful, IMO.
Here's my solution...
int** (*func)();
Functor returning an array of int*'s. It isn't as complicated as your solution.
Using cdecl you get the following
cdecl> declare a as pointer to function returning pointer to array of int;
Warning: Unsupported in C -- 'Pointer to array of unspecified dimension'
(maybe you mean "pointer to object")
int (*(*a)())[]
This question from C-faq is similar but provides 3 approaches to solve the problem.

How do you decipher complex declarations of pointers+arrays?

Although I use std::vector almost all the time, I am interested in understanding as much as I can about pointers. Examples of what I am talking about:
char* array[5]; // What does it mean?
// 1) pointer to an array of 5 elements!
// 2) an array of 5 pointers?
I am interested in the precise definition of this declaration.
Not just pointers and arrays: How to interpret complex C/C++ declarations:
Start reading the declaration from the
innermost parentheses, go right, and
then go left. When you encounter
parentheses, the direction should be
reversed. Once everything in the
parentheses has been parsed, jump out
of it. Continue till the whole
declaration has been parsed.
One small change to the right-left
rule: When you start reading the
declaration for the first time, you
have to start from the identifier, and
not the innermost parentheses.
You example:
char* array[5];
Is an array of 5 pointers to char.
cdecl is a program which is nice for this sort of thing. (particularly when you add function pointers into the mix!)
Type `help' or `?' for help
cdecl> explain char* foo[5]
declare foo as array 5 of pointer to char
cdecl> declare bar as array 5 of pointer to function (integer, integer) returning char
char (*bar[5])(int , int )
I learned the clockwise/spiral rule long ago from some magazine article. Here's an online article that describes the technique:
http://c-faq.com/decl/spiral.anderson.html
It's served me well, though I still struggle mightily with some of the monstrous template-based declarations I come across at times.
The general procedure for reading a type in C/C++ is:
Identify the final type, which can be a basic type or a typedef identifier, and which can have type modifiers such as const, volatile, etc. In your example that's "char".
Apply the operators to the identifier in the same order of precedence as they have in expressions. These operators can be * (dereference), [] (index) and () (call function).
In the original philosophy of the syntax, your example would have been written "char *array[5]", the identifier is "array" and the operators are [] (index) then * (dereference).
The declaration then reads like a contract "if you apply these operators in that order, then you get an object of the final type".
In your case the full sentence is "if you index the variable "array", then dereference the resulting expression, you get a char".
You can also consider it like that "if you index the variable "array", then you get an object such that if you dereference it, you get a char"
The trick is mostly to keep track of the fact that [] and () have a higher precedence than *. You can control the operator order with parentheses just like for regular expressions.
char * is the type and you have an array of 5 of them.
[] has higher precedence than *, that's why it's an array of pointers and not the other way around.
you always read pointers from right to left interpreting the '*' as a pointer.
for example char** a[5] is an array of 5 pointers to pointers of characters...

Why do we use "type * var" instead of "type & var" when defining a pointer?

I'm relatively new to C++ (about one year of experience, on and off). I'm curious about what led to the decision of type * name as the syntax for defining pointers. It seems to me that the syntax should be type & name as the & symbol is used everywhere else in code to refer to the variable's memory address. So, to use the traditional example of int pointers:
int a = 1;
int * b = &a;
would become
int a = 1;
int & b = &a
I'm sure there's some reason for this that I'm just not seeing, and I'd love to hear some input from C++ veterans.
Thanks,
-S
C++ adopts the C syntax. As revealed in "The Development of the C Language" (by Dennis Ritchie) C uses * for pointers in type declarations because it was decided that type syntax should follow use.
For each object of [a compound type], there was already a way to mention the underlying object: index the array, call the function, use the indirection operator [*] on the pointer. Analogical reasoning led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,
int i, *pi, **ppi;
declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression.
Here's a more complex example:
int *(*foo)[4][];
This declaration means an expression *(*foo)[4][0] has type int, and from that (and that [] has higher precedence than unary *) you can decode the type: foo is a pointer to an array of size 4 of array of pointers to ints.
This syntax was adopted in C++ for compatibility with C. Also, don't forget that C++ has a use for & in declarations.
int & b = a;
The above line means a reference variable refering to another variable of type int. The difference between a reference and pointer roughly is that references are initialized only, and you can not change where they point, and finally they are always dereferenced automatically.
int x = 5, y = 10;
int& r = x;
int sum = r + y; // you do not need to say '*r' automatically dereferenced.
r = y; // WRONG, 'r' can only have one thing pointing at during its life, only at its infancy ;)
I think that Dennis Ritchie answered this in The Development of the C Language:
For each object of such a composed
type, there was already a way to
mention the underlying object: index
the array, call the function, use the
indirection operator on the pointer.
Analogical reasoning led to a
declaration syntax for names mirroring
that of the expression syntax in which
the names typically appear. Thus,
int i, *pi, **ppi;
declare an integer, a pointer to an
integer, a pointer to a pointer to an
integer. The syntax of these
declarations reflects the observation
that i, *pi, and **ppi all yield an
int type when used in an expression.
Similarly,
int f(), *f(), (*f)();
declare a function returning an
integer, a function returning a
pointer to an integer, a pointer to a
function returning an integer;
int *api[10], (*pai)[10];
declare an array of pointers to
integers, and a pointer to an array of
integers. In all these cases the
declaration of a variable resembles
its usage in an expression whose type
is the one named at the head of the
declaration.
So we use type * var to declare a pointer because this allows the declaration to mirror the usage (dereferencing) of the pointer.
In this article, Ritchie also recounts that in "NB", an extended version of the "B" programming language, he used int pointer[] to declare a pointer to an int, as opposed to int array[10] to declare an array of ints.
If you are a visual thinker, it may help to imagine the asterisk as a black hole leading to the data value. Hence, it is a pointer.
The ampersand is the opposite end of the hole, think of it as an unraveled asterisk or a spaceship wobbling about in an erratic course as the pilot gets over the transition coming out of the black hole.
I remember being very confused by C++ overloading the meaning of the ampersand, to give us references. In their desperate attempt to avoid using any more characters, which was justified by the international audience using C and known issues with keyboard limitations, they added a major source of confusion.
One thing that may help in C++ is to think of references as pre-prepared dereferenced pointers. Rather than using &someVariable when you pass in an argument, you've already used the trailing ampersand when you defined someVariable. Then again, that might just confuse you further!
One of my pet hates, which I was unhappy to see promulgated in Apple's Objective-C samples, is the layout style int *someIntPointer instead of int* someIntPointer
IMHO, keeping the asterisk with the variable is an old-fashioned C approach emphasizing the mechanics of how you define the variable, over its data type.
The data type of someIntPointer is literally a pointer to an integer and the declaration should reflect that. This does lead to the requirement that you declare one variable per line, to avoid subtle bugs such as:
int* a, b; // b is a straight int, was that our intention?
int *a, *b; // old-style C declaring two pointers
int* a;
int* b; // b is another pointer to an int
Whilst people argue that the ability to declare mixed pointers and values on the same line, intentionally, is a powerful feature, I've seen it lead to subtle bugs and confusion.
Your second example is not valid C code, only C++ code. The difference is that one is a pointer, whereas the other is a reference.
On the right-hand side the '&' always means address-of. In a definition it indicates that the variable is a reference.
On the right-hand side the '*' always means value-at-address. In a definition it indicates that the variable is a pointer.
References and pointers are similar, but not the same. This article addresses the differences.
Instead of reading int* b as "b is a pointer to int", read it as int *b: "*b is an int". Then, you have & as an anti-*: *b is an int. The address of *b is &*b, or just b.
I think the answer may well be "because that's the way K&R did it."
K&R are the ones who decided what the C syntax for declaring pointers was.
It's not int & x; instead of int * x; because that's the way the language was defined by the guys who made it up -- K&R.