Why is the dereference operator (*) also used to declare a pointer? - c++

I'm not sure if this is a proper programming question, but it's something that has always bothered me, and I wonder if I'm the only one.
When initially learning C++, I understood the concept of references, but pointers had me confused. Why, you ask? Because of how you declare a pointer.
Consider the following:
void foo(int* bar)
{
}
int main()
{
int x = 5;
int* y = NULL;
y = &x;
*y = 15;
foo(y);
}
The function foo(int*) takes an int pointer as parameter. Since I've declared y as int pointer, I can pass y to foo, but when first learning C++ I associated the * symbol with dereferencing, as such I figured a dereferenced int needed to be passed. I would try to pass *y into foo, which obviously doesn't work.
Wouldn't it have been easier to have a separate operator for declaring a pointer? (or for dereferencing). For example:
void test(int# x)
{
}

In The Development of the C Language, Dennis Ritchie explains his reasoning thusly:
The second innovation that most clearly distinguishes C from its
predecessors is this fuller type structure and especially its
expression in the syntax of declarations... given an object of any
type, it should be possible to describe a new object that gathers
several into an array, yields it from a function, or is a pointer to
it.... [This] led to a
declaration syntax for names mirroring that of the expression syntax
in which the names typically appear. Thus,
int i, *pi, **ppi; declare an integer, a pointer to an integer, a
pointer to a pointer to an integer. The syntax of these declarations
reflects the observation that i, *pi, and **ppi all yield an int type
when used in an expression.
Similarly, int f(), *f(), (*f)(); declare
a function returning an integer, a function returning a pointer to an
integer, a pointer to a function returning an integer. int *api[10],
(*pai)[10]; declare an array of pointers to integers, and a pointer to
an array of integers.
In all these cases the declaration of a
variable resembles its usage in an expression whose type is the one
named at the head of the declaration.
An accident of syntax contributed to the perceived complexity of the
language. The indirection operator, spelled * in C, is syntactically a
unary prefix operator, just as in BCPL and B. This works well in
simple expressions, but in more complex cases, parentheses are
required to direct the parsing. For example, to distinguish
indirection through the value returned by a function from calling a
function designated by a pointer, one writes *fp() and (*pf)()
respectively. The style used in expressions carries through to
declarations, so the names might be declared
int *fp(); int (*pf)();
In more ornate but still realistic cases,
things become worse: int *(*pfp)(); is a pointer to a function
returning a pointer to an integer.
There are two effects occurring.
Most important, C has a relatively rich set of ways of describing
types (compared, say, with Pascal). Declarations in languages as
expressive as C—Algol 68, for example—describe objects equally hard to
understand, simply because the objects themselves are complex. A
second effect owes to details of the syntax. Declarations in C must be
read in an `inside-out' style that many find difficult to grasp.
Sethi [Sethi 81] observed that many of the nested
declarations and expressions would become simpler if the indirection
operator had been taken as a postfix operator instead of prefix, but
by then it was too late to change.

The reason is clearer if you write it like this:
int x, *y;
That is, both x and *y are ints. Thus y is an int *.

That is a language decision that predates C++, as C++ inherited it from C. I once heard that the motivation was that the declaration and the use would be equivalent, that is, given a declaration int *p; the expression *p is of type int in the same way that with int i; the expression i is of type int.

Because the committee, and those that developed C++ in the decades before its standardisation, decided that * should retain its original three meanings:
A pointer type
The dereference operator
Multiplication
You're right to suggest that the multiple meanings of * (and, similarly, &) are confusing. I've been of the opinion for some years that it they are a significant barrier to understanding for language newcomers.
Why not choose another symbol for C++?
Backwards-compatibility is the root cause... best to re-use existing symbols in a new context than to break C programs by translating previously-not-operators into new meanings.
Why not choose another symbol for C?
It's impossible to know for sure, but there are several arguments that can be — and have been — made. Foremost is the idea that:
when [an] identifier appears in an expression of the same form as the declarator, it yields an object of the specified type. {K&R, p216}
This is also why C programmers tend to[citation needed] prefer aligning their asterisks to the right rather than to the left, i.e.:
int *ptr1; // roughly C-style
int* ptr2; // roughly C++-style
though both varieties are found in programs of both languages, varyingly.

Page 65 of Expert C Programming: Deep C Secrets includes the following: And then, there is the C philosophy that the declaration of an object should look like its use.
Page 216 of The C Programming Language, 2nd edition (aka K&R) includes: A declarator is read as an assertion that when its identifier appears in an expression of the same form as the declarator, it yields an object of the specified type.
I prefer the way van der Linden puts it.

Haha, I feel your pain, I had the exact same problem.
I thought a pointer should be declared as &int because it makes sense that a pointer is an address of something.
After a while I thought for myself, every type in C has to be read backwards, like
int * const a
is for me
a constant something, when dereferenced equals an int.
Something that has to be dereferenced, has to be a pointer.

Related

Is there a difference between int *x and int* x in C++? [duplicate]

This question already has an answer here:
Difference between char* var; and char *var;? [duplicate]
(1 answer)
Closed 9 years ago.
I'm getting back into my C++ studies, and really trying to understand the basics. Pointers have always given me trouble, and I want to make sure I really get it before I carry on and get confused down the road.
Sadly, I've been stymied at the very outset by an inconsistency in the tutorials I'm reading. Some do pointer declarations this way:
int *x
and some do it this way:
int* x
Now the former of the two seems to be by far the most common. Which is upsetting, because the second makes much more sense to me. When I read int *x, I read "Here is an int, and its name is *x", which isn't quite right. But when I read the second, I see "here is an int pointer, and its name is x", which is pretty accurate.
So, before I build my own mental map, is there a functional difference between the two? Am I going to be a leper outcast if I do it the second way?
Thanks.
The famous Bjarne Stroustrup, notable for the creation and the development of the C++, said ...
The choice between "int* p;" and "int *p;" is not about right and
wrong, but about style and emphasis. C emphasized expressions;
declarations were often considered little more than a necessary evil.
C++, on the other hand, has a heavy emphasis on types.
A "typical C programmer" writes "int *p;" and explains it "*p is what
is the int" emphasizing syntax, and may point to the C (and C++)
declaration grammar to argue for the correctness of the style. Indeed,
the * binds to the name p in the grammar.
A "typical C++ programmer" writes "int* p;" and explains it "p is a
pointer to an int" emphasizing type. Indeed the type of p is int*. I
clearly prefer that emphasis and see it as important for using the
more advanced parts of C++ well.
So, there's no difference. It is just a code style.
From here
To the compiler, they have exactly the same meaning.
Stylistically, there are arguments for and against both.
One argument is that the first version is preferable because the second version:
int* x, y, z;
implies that x, y and z are all pointers, which they are not (only x is).
The first version does not have this problem:
int *x, y, z;
In other words, since the * binds to the variable name and not the type, it makes sense to place it right next to the variable name.
The counter-argument is that one shouldn't mix pointer and non-pointer type in the same declaration. If you don't, the above argument doesn't apply and it makes sense to place the * right next to int because it's part of the type.
Whichever school of thought you decide to subscribe to, you'll encounter both styles in the wild, and more (some people write int * x).
No difference at all.
It is just a matter of style.
I personally prefer this:
int *x;
over this,
int* x;
Because the latter is less readable when you declare many variables on the same line. For example, see this:
int* x, y, z;
Here x is a pointer to int, but are y and z pointers too? It looks like they're pointers, but they are not.
There is no difference. I use the int* x form because I prefer to keep all of the type grouped together away from the name, but that kind of falls apart with more complex types like int (*x)[10].
Some people prefer int *x because you can read it as "dereferencing x gives you an int". But even that falls apart when you start to use reference types; int &x does not mean that taking the address of x will give you an int.
Another reason that you might prefer int *x is because, in terms of the grammar, the int is the declaration specifier sequence and the *x is the declarator. They are two separate parts of the declaration. This becomes more obvious when you have multiple declarators like int *x, y;. It might be easier to see that y is not a pointer in this case.
However, many people, like myself, prefer not to declare multiple variables in a single declaration; there isn't exactly much need to in C++. That might be another reason for preferring the int* x form.
There's only one rule: be consistent.
There's absolutely no difference.
No difference. I've tended to prefer the
int* x
Due to thinking of 'x' as having type int* .
To the compiler they are the same
But at the same time the difference is readabilty when there are multiple variables
int *x,y
However this is more misleading
int* x,y
Both are the same thing.
The former is although preferred. Suppose you want to declare 2 pointers p & q in the same line.
int* p, q;
This actually means 'p' is a pointer to an int, while 'q' is an int.
The correct syntax would be:
int* p, *q;
Hence, we include the * in the pointer's name.
This is a type of style of coding. There's no difference, just a matter of preference as you said yourself.Hope you clear.Enjoy coding
In my opinion, in the expression int * x,y; x and y are names of variables. When these names are declared like this by programmer, it means they have same type and int * is the
type.

Declaring multiple object pointers on one line causes compiler error

when I do this (in my class)
public:
Entity()
{
re_sprite_eyes = new sf::Sprite();
re_sprite_hair = new sf::Sprite();
re_sprite_body = new sf::Sprite();
}
private:
sf::Sprite* re_sprite_hair;
sf::Sprite* re_sprite_body;
sf::Sprite* re_sprite_eyes;
Everything works fine. However, if I change the declarations to this:
private:
sf::Sprite* re_sprite_hair, re_sprite_body, re_sprite_eyes;
I get this compiler error:
error: no match for 'operator=' in '((Entity*)this)->Entity::re_sprite_eyes = (operator new(272u), (<statement>, ((sf::Sprite*)<anonymous>)))
And then it says candidates for re_sprite_eyes are sf::Sprite objects and/or references.
Why does this not work? Aren't the declarations the same?
sf::Sprite* re_sprite_hair, re_sprite_body, re_sprite_eyes;
Does not declare 3 pointers - it is one pointer and 2 objects.
sf::Sprite* unfortunately does not apply to all the variables declared following it, just the first. It is equivalent to
sf::Sprite* re_sprite_hair;
sf::Sprite re_sprite_body;
sf::Sprite re_sprite_eyes;
You want to do:
sf::Sprite *re_sprite_hair, *re_sprite_body, *re_sprite_eyes;
You need to put one star for each variable. In such cases I prefer to keep the star on the variable's side, rather than the type, to make exactly this situation clear.
In both C and C++, the * binds to the declarator, not the type specifier. In both languages, declarations are based on the types of expressions, not objects.
For example, suppose you have a pointer to an int named p, and you want to access the int value that p points to; you do so by dereferencing the pointer with the unary * operator, like so:
x = *p;
The type of the expression *p is int; thus, the declaration of p is
int *p;
This is true no matter how many pointers you declare within the same declaration statement; if q and r also need to be declared as pointers, then they also need to have the unary * as part of the declarator:
int *p, *q, *r;
because the expressions *q and *r have type int. It's an accident of C and C++ syntax that you can write T *p, T* p, or T * p; all of those declarations will be interpreted as T (*p).
This is why I'm not fond of the C++ style of declaring pointer and reference types as
T* p;
T& r;
because it implies an incorrect view of how C and C++ declaration syntax works, leading to the exact kind of confusion that you just experienced. However, I've written enough C++ to realize that there are times when that style does make the intent of the code clearer, especially when defining container types.
But it's still wrong.
This is a (two years late) response to Lightness Races in Orbit (and anyone else who objects to my labeling the T* p convention as "wrong")...
First of all, you have the legion of questions just like this one that arise specifically from the use of the T* p convention, and how it doesn't work like people expect. How many questions on this site are on the order of "why doesn't T* p, q declare both p and q as pointers?"
It introduces confusion - that by itself should be enough to discourage its use.
But beyond that, it's inconsistent. You can't separate array-ness or function-ness from the declarator, why should you separate pointer-ness from it?
"Well, that's because [] and () are postfix operators, while * is unary". Yes, it is, so why aren't you associating the operator with its operand? In the declaration T* p, T is not the operand of *, so why are we writing the declaration as though it is?
If a is "an array of pointers", why should we write T* a[N]? If f is "a function returning a pointer", why should we write T* f()? The declarator system makes more sense and is internally consistent if you write those declarations as T *a[N] and T *f(). That should be obvious from the fact that I can use T as a stand-in for any type (indeed, for any sequence of declaration specifiers).
And then you have pointers to arrays and pointers to functions, where the * must be explicitly bound to the declarator1:
T (*a)[N];
T (*f)();
Yes, pointer-ness is an important property of the thing you're declaring, but so are array-ness and function-ness, and emphasizing one over the other creates more problems than it solves. Again, as this question shows, the T* p convention introduces confusion.
Because * is unary and a separate token on its own you can write T* p, T *p, T*p, and T * p and they'll all be accepted by the compiler, but they will all be interpreted as T (*p). More importantly, T* p, q, r will be interpreted as T (*p), q, r. That interpretation is more obvious if you write T *p, q, r. Yeah, yeah, yeah, "declare only one thing per line and it won't be a problem." You know how else to not make it a problem? Write your declarators properly. The declarator system itself will make more sense and you will be less likely to make mistake.
We're not arguing over an "antique oddity" of the language, it's a fundamental component of the language grammar and its philosophy. Pointer-ness is a property of the declarator, just like array-ness and function-ness, and pretending it's somehow not just leads to confusion and makes both C and C++ harder to understand than they need to be.
I would argue that making the dereference operator unary as opposed to postfix was a mistake2, but that's how it worked in B, and Ritchie wanted to keep as much of B as possible. I will also argue that Bjarne's promotion of the T* p convention is a mistake.
At this point in the discussion, somebody will suggest using a typedef liketypedef T arrtype[N];
arrtype* p;
which just totally misses the point and earns the suggester a beating with the first edition of "C: The Complete Reference" because it's big and heavy and no good for anything else.
Writing T a*[N]*() as opposed to T (*(*a)[N])() is definitely less eye-stabby and scans much more easily.
In C++11 you have a nice little workaround, which might be better than shifting spaces back and forth:
template<typename T> using type=T;
template<typename T> using func=T*;
// I don't like this style, but type<int*> i, j; works ok
type<int*> i = new int{3},
j = new int{4};
// But this one, imho, is much more readable than int(*f)(int, int) = ...
func<int(int, int)> f = [](int x, int y){return x + y;},
g = [](int x, int y){return x - y;};
Another thing that may call your attention is the line:
int * p1, * p2;
This declares the two pointers used in the previous example. But notice that there is an asterisk (*) for each pointer, in order for both to have type int* (pointer to int). This is required due to the precedence rules. Note that if, instead, the code was:
int * p1, p2;
p1 would indeed be of type int*, but p2 would be of type int. Spaces do not matter at all for this purpose. But anyway, simply remembering to put one asterisk per pointer is enough for most pointer users interested in declaring multiple pointers per statement. Or even better: use a different statemet for each variable.
From http://www.cplusplus.com/doc/tutorial/pointers/
The asterisk binds to the pointer-variable name. The way to remember this is to notice that in C/C++, declarations mimic usage.
The pointers might be used like this:
sf::Sprite *re_sprite_body;
// ...
sf::Sprite sprite_bod = *re_sprite_body;
Similarly,
char *foo[3];
// ...
char fooch = *foo[1];
In both cases, there is an underlying type-specifier, and the operator or operators required to "get to" an object of that type in an expression.

Why can't arrays be passed as function arguments?

Why can't you pass arrays as function arguments?
I have been reading this C++ book that says 'you can't pass arrays as function arguments', but it never explains why. Also, when I looked it up online I found comments like 'why would you do that anyway?' It's not that I would do it, I just want to know why you can't.
Why can't arrays be passed as function arguments?
They can:
void foo(const int (&myArray)[5]) {
// `myArray` is the original array of five integers
}
In technical terms, the type of the argument to foo is "reference to array of 5 const ints"; with references, we can pass the actual object around (disclaimer: terminology varies by abstraction level).
What you can't do is pass by value, because for historical reasons we shall not copy arrays. Instead, attempting to pass an array by value into a function (or, to pass a copy of an array) leads its name to decay into a pointer. (some resources get this wrong!)
Array names decay to pointers for pass-by-value
This means:
void foo(int* ptr);
int ar[10]; // an array
foo(ar); // automatically passing ptr to first element of ar (i.e. &ar[0])
There's also the hugely misleading "syntactic sugar" that looks like you can pass an array of arbitrary length by value:
void foo(int ptr[]);
int ar[10]; // an array
foo(ar);
But, actually, you're still just passing a pointer (to the first element of ar). foo is the same as it was above!
Whilst we're at it, the following function also doesn't really have the signature that it seems to. Look what happens when we try to call this function without defining it:
void foo(int ar[5]);
int main() {
int ar[5];
foo(ar);
}
// error: undefined reference to `func(int*)'
So foo takes int* in fact, not int[5]!
(Live demo.)
But you can work-around it!
You can hack around this by wrapping the array in a struct or class, because the default copy operator will copy the array:
struct Array_by_val
{
int my_array[10];
};
void func (Array_by_val x) {}
int main() {
Array_by_val x;
func(x);
}
This is somewhat confusing behaviour.
Or, better, a generic pass-by-reference approach
In C++, with some template magic, we can make a function both re-usable and able to receive an array:
template <typename T, size_t N>
void foo(const T (&myArray)[N]) {
// `myArray` is the original array of N Ts
}
But we still can't pass one by value. Something to remember.
The future...
And since C++11 is just over the horizon, and C++0x support is coming along nicely in the mainstream toolchains, you can use the lovely std::array inherited from Boost! I'll leave researching that as an exercise to the reader.
So I see answers explaining, "Why doesn't the compiler allow me to do this?" Rather than "What caused the standard to specify this behavior?" The answer lies in the history of C. This is taken from "The Development of the C Language" (source) by Dennis Ritchie.
In the proto-C languages, memory was divided into "cells" each containing a word. These could be dereferenced using the eventual unary * operator -- yes, these were essentially typeless languages like some of today's toy languages like Brainf_ck. Syntactic sugar allowed one to pretend a pointer was an array:
a[5]; // equivalent to *(a + 5)
Then, automatic allocation was added:
auto a[10]; // allocate 10 cells, assign pointer to a
// note that we are still typeless
a += 1; // remember that a is a pointer
At some point, the auto storage specifier behavior became default -- you may also be wondering what the point of the auto keyword was anyway, this is it. Pointers and arrays were left to behave in somewhat quirky ways as a result of these incremental changes. Perhaps the types would behave more alike if the language were designed from a bird's-eye view. As it stands, this is just one more C / C++ gotcha.
Arrays are in a sense second-class types, something that C++ inherited from C.
Quoting 6.3.2.1p3 in the C99 standard:
Except when it is the operand of the sizeof operator or the unary
& operator, or is a string literal used to initialize an array, an
expression that has type "array of type" is converted to an
expression with type "pointer to type" that points to the initial
element of the array object and is not an lvalue. If the array object
has register storage class, the behavior is undefined.
The same paragraph in the C11 standard is essentially the same, with the addition of the new _Alignof operator. (Both links are to drafts which are very close to the official standards. (UPDATE: That was actually an error in the N1570 draft, corrected in the released C11 standard. _Alignof can't be applied to an expression, only to a parenthesized type name, so C11 has only the same 3 exceptions that C99 and C90 did. (But I digress.)))
I don't have the corresponding C++ citation handy, but I believe it's quite similar.
So if arr is an array object, and you call a function func(arr), then func will receive a pointer to the first element of arr.
So far, this is more or less "it works that way because it's defined that way", but there are historical and technical reasons for it.
Permitting array parameters wouldn't allow for much flexibility (without further changes to the language), since, for example, char[5] and char[6] are distinct types. Even passing arrays by reference doesn't help with that (unless there's some C++ feature I'm missing, always a possibility). Passing pointers gives you tremendous flexibility (perhaps too much!). The pointer can point to the first element of an array of any size -- but you have to roll your own mechanism to tell the function how big the array is.
Designing a language so that arrays of different lengths are somewhat compatible while still being distinct is actually quite tricky. In Ada, for example, the equivalents of char[5] and char[6] are the same type, but different subtypes. More dynamic languages make the length part of an array object's value, not of its type. C still pretty much muddles along with explicit pointers and lengths, or pointers and terminators. C++ inherited all that baggage from C. It mostly punted on the whole array thing and introduced vectors, so there wasn't as much need to make arrays first-class types.
TL;DR: This is C++, you should be using vectors anyway! (Well, sometimes.)
Arrays are not passed by value because arrays are essentially continuous blocks of memmory. If you had an array you wanted to pass by value, you could declare it within a structure and then access it through the structure.
This itself has implications on performance because it means you will lock up more space on the stack. Passing a pointer is faster because the envelope of data to be copied onto the stack is far less.
I believe that the reason why C++ did this was, when it was created, that it might have taken up too many resources to send the whole array rather than the address in memory. That is just my thoughts on the matter and an assumption.
It's because of a technical reason. Arguments are passed on the stack; an array can have a huge size, megabytes and more. Copying that data to the stack on every call will not only be slower, but it will exhaust the stack pretty quickly.
You can overcome that limitation by putting an array into a struct (or using Boost::Array):
struct Array
{
int data[512*1024];
int& operator[](int i) { return data[i]; }
};
void foo(Array byValueArray) { .......... }
Try to make nested calls of that function and see how many stack overflows you'll get!

good way to write "pointer to something" in C/C++

Is there a "good" way to write "pointer to something" in C/C++ ?
I use to write void foo( char *str ); But sometimes I find it quite illogical because the type of str is "pointer to char", then it should more logical to attach the * to the type name.
Is there a rule to write pointers ?
char*str;
char* str;
char *str;
char * str;
There is no strict rule, but bear in mind that the * attaches to the variable, so:
char *str1, *str2; // str1 and str2 are pointers
char* str1, str2; // str1 is a pointer, str2 is a char
Some people like to do char * str1 as well, but it's up to you or your company's coding standard.
The common C convention is to write T *p, whereas the common C++ convention is to write T* p. Both parse as T (*p); the * is part of the declarator, not the type specifier. It's purely an accident of pointer declaration syntax that you can write it either way.
C (and by extension, C++) declaration syntax is expression-centric; IOW, the form of a declaration should match the form of an expression of the same type in the code.
For example, suppose we had a pointer to int, and we wanted to access that integer value. To do so, we dereference the pointer with the * indirection operator, like so:
x = *p;
The type of the expression *p is int; thus, it follows that the declaration of p should be
int *p
The int-ness of p is provided by the type specifier int, but the pointer-ness of p is provided by the declarator *p.
As a slightly more complicated example, suppose we had a pointer to an array of float, and wanted to access the floating point value at the i'th element of the array through the pointer. We dereference the array pointer and subscript the result:
f = (*ap)[i];
The type of the expression (*ap)[i] is float, so it follows that the declaration of the array pointer is
float (*ap)[N];
The float-ness of ap is provided by the type specifier float, but the pointer-ness and array-ness are provided by the declarator (*ap)[N]. Note that in this case the * must explicitly be bound to the identifer; [] has a higher precedence than unary * in both expression and declaration syntax, so float* ap[N] would be parsed as float *(ap[N]), or "array of pointers to float", rather than "pointer to array of float". I suppose you could write that as
float(* ap)[N];
but I'm not sure what the point would be; it doesn't make the type of ap any clearer.
Even better, how about a pointer to a function that returns a pointer to an array of pointer to int:
int *(*(*f)())[N];
Again, at least two of the * operators must explicitly be bound in the declarator; binding the last * to the type specifier, as in
int* (*(*f)())[N];
just indicates confused thinking IMO.
Even though I use it in my own C++ code, and even though I understand why it became popular, the problem I have with the reasoning behind the T* p convention is that it just doesn't apply outside of the simplest of pointer declarations, and it reinforces a simplistic-to-the-point-of-being-wrong view of C and C++ declaration syntax. Yes, the type of p is "pointer to T", but that doesn't change the fact that as far as the language grammar is concerned * binds to the declarator, not the type specifier.
For another case, if the type of a is "N-element array of T", we don't write
T[N] a;
Obviously, the grammar doesn't allow it. Again, the argument just doesn't apply in this case.
EDIT
As Steve points out in the comments, you can use typedefs to hide some of the complexity. For example, you could rewrite
int *(*(*f)())[N];
as something like
typedef int *iptrarr[N]; // iptrarr is an array of pointer to int
typedef iptrarr *arrptrfunc(); // arrptrfunc is a function returning
// a pointer to iptrarr
arrptrfunc *f; // f is a pointer to arrptrfunc
Now you can cleanly apply the T* p convention, declaring f as arrptrfunc* f. I personally am not fond of doing things this way, since it's not necessarily clear from the typedef how f is supposed to be used in an expression, or how to use an object of type arrptrfunc. The non-typedef'd version may be ugly and difficult to read, but at least it tells you everything you need to know up front; you don't have to go digging through all the typedefs.
The "good way" depends on
internal coding standards in your project
your personal preferences
(probably) in that order.
There is no right or wrong in this. The important thing is to pick one coding standard and stick to it.
That being said, I personally believe that the * belongs with the type and not the variable name, as the type is "pointer to char". The variable name is not a pointer.
I think this is going to be heavily influenced by the general pattern in how one declares the variables.
For example, I have a tendency to declare only one variable per line. This way, I can add a comment reminding me how the variable is to be used.
However, there are times, when it is practical to declare several variables of the same type on one line. Under such circumstances, my personal coding rule is to never, NEVER, EVER declare pointers on the same line as non-pointers. I find that mixing them can be a source of errors, so I try to make it easier to see "wrongness" by avoiding mixing.
As long as I follow the first guideline, I find that it does not matter overly much how I declare the pointers so long as I am consistent.
However, if I use the second guideline and declare several pointers on the same line, I find the following style to be most beneficial and clear (of course others may disagree) ...
char *ptr1, *ptr2, *ptr3;
By having no space between the * and the pointer name, it becomes easier to spot whether I have violated the second guideline.
Now, if I wanted to be consistent between my two personal style guidelines, when declaring only one pointer on a line, I would use ...
char *ptr;
Anyway, that's my rationale for part of why I do what I do. Hope this helps.

Why do we use "type * var" instead of "type & var" when defining a pointer?

I'm relatively new to C++ (about one year of experience, on and off). I'm curious about what led to the decision of type * name as the syntax for defining pointers. It seems to me that the syntax should be type & name as the & symbol is used everywhere else in code to refer to the variable's memory address. So, to use the traditional example of int pointers:
int a = 1;
int * b = &a;
would become
int a = 1;
int & b = &a
I'm sure there's some reason for this that I'm just not seeing, and I'd love to hear some input from C++ veterans.
Thanks,
-S
C++ adopts the C syntax. As revealed in "The Development of the C Language" (by Dennis Ritchie) C uses * for pointers in type declarations because it was decided that type syntax should follow use.
For each object of [a compound type], there was already a way to mention the underlying object: index the array, call the function, use the indirection operator [*] on the pointer. Analogical reasoning led to a declaration syntax for names mirroring that of the expression syntax in which the names typically appear. Thus,
int i, *pi, **ppi;
declare an integer, a pointer to an integer, a pointer to a pointer to an integer. The syntax of these declarations reflects the observation that i, *pi, and **ppi all yield an int type when used in an expression.
Here's a more complex example:
int *(*foo)[4][];
This declaration means an expression *(*foo)[4][0] has type int, and from that (and that [] has higher precedence than unary *) you can decode the type: foo is a pointer to an array of size 4 of array of pointers to ints.
This syntax was adopted in C++ for compatibility with C. Also, don't forget that C++ has a use for & in declarations.
int & b = a;
The above line means a reference variable refering to another variable of type int. The difference between a reference and pointer roughly is that references are initialized only, and you can not change where they point, and finally they are always dereferenced automatically.
int x = 5, y = 10;
int& r = x;
int sum = r + y; // you do not need to say '*r' automatically dereferenced.
r = y; // WRONG, 'r' can only have one thing pointing at during its life, only at its infancy ;)
I think that Dennis Ritchie answered this in The Development of the C Language:
For each object of such a composed
type, there was already a way to
mention the underlying object: index
the array, call the function, use the
indirection operator on the pointer.
Analogical reasoning led to a
declaration syntax for names mirroring
that of the expression syntax in which
the names typically appear. Thus,
int i, *pi, **ppi;
declare an integer, a pointer to an
integer, a pointer to a pointer to an
integer. The syntax of these
declarations reflects the observation
that i, *pi, and **ppi all yield an
int type when used in an expression.
Similarly,
int f(), *f(), (*f)();
declare a function returning an
integer, a function returning a
pointer to an integer, a pointer to a
function returning an integer;
int *api[10], (*pai)[10];
declare an array of pointers to
integers, and a pointer to an array of
integers. In all these cases the
declaration of a variable resembles
its usage in an expression whose type
is the one named at the head of the
declaration.
So we use type * var to declare a pointer because this allows the declaration to mirror the usage (dereferencing) of the pointer.
In this article, Ritchie also recounts that in "NB", an extended version of the "B" programming language, he used int pointer[] to declare a pointer to an int, as opposed to int array[10] to declare an array of ints.
If you are a visual thinker, it may help to imagine the asterisk as a black hole leading to the data value. Hence, it is a pointer.
The ampersand is the opposite end of the hole, think of it as an unraveled asterisk or a spaceship wobbling about in an erratic course as the pilot gets over the transition coming out of the black hole.
I remember being very confused by C++ overloading the meaning of the ampersand, to give us references. In their desperate attempt to avoid using any more characters, which was justified by the international audience using C and known issues with keyboard limitations, they added a major source of confusion.
One thing that may help in C++ is to think of references as pre-prepared dereferenced pointers. Rather than using &someVariable when you pass in an argument, you've already used the trailing ampersand when you defined someVariable. Then again, that might just confuse you further!
One of my pet hates, which I was unhappy to see promulgated in Apple's Objective-C samples, is the layout style int *someIntPointer instead of int* someIntPointer
IMHO, keeping the asterisk with the variable is an old-fashioned C approach emphasizing the mechanics of how you define the variable, over its data type.
The data type of someIntPointer is literally a pointer to an integer and the declaration should reflect that. This does lead to the requirement that you declare one variable per line, to avoid subtle bugs such as:
int* a, b; // b is a straight int, was that our intention?
int *a, *b; // old-style C declaring two pointers
int* a;
int* b; // b is another pointer to an int
Whilst people argue that the ability to declare mixed pointers and values on the same line, intentionally, is a powerful feature, I've seen it lead to subtle bugs and confusion.
Your second example is not valid C code, only C++ code. The difference is that one is a pointer, whereas the other is a reference.
On the right-hand side the '&' always means address-of. In a definition it indicates that the variable is a reference.
On the right-hand side the '*' always means value-at-address. In a definition it indicates that the variable is a pointer.
References and pointers are similar, but not the same. This article addresses the differences.
Instead of reading int* b as "b is a pointer to int", read it as int *b: "*b is an int". Then, you have & as an anti-*: *b is an int. The address of *b is &*b, or just b.
I think the answer may well be "because that's the way K&R did it."
K&R are the ones who decided what the C syntax for declaring pointers was.
It's not int & x; instead of int * x; because that's the way the language was defined by the guys who made it up -- K&R.