I've recently decided that I just have to finally learn C/C++, and there is one thing I do not really understand about pointers or more precisely, their definition.
How about these examples:
int* test;
int *test;
int * test;
int* test,test2;
int *test,test2;
int * test,test2;
Now, to my understanding, the first three cases are all doing the same: Test is not an int, but a pointer to one.
The second set of examples is a bit more tricky. In case 4, both test and test2 will be pointers to an int, whereas in case 5, only test is a pointer, whereas test2 is a "real" int. What about case 6? Same as case 5?
4, 5, and 6 are the same thing, only test is a pointer. If you want two pointers, you should use:
int *test, *test2;
Or, even better (to make everything clear):
int* test;
int* test2;
White space around asterisks have no significance. All three mean the same thing:
int* test;
int *test;
int * test;
The "int *var1, var2" is an evil syntax that is just meant to confuse people and should be avoided. It expands to:
int *var1;
int var2;
Many coding guidelines recommend that you only declare one variable per line. This avoids any confusion of the sort you had before asking this question. Most C++ programmers I've worked with seem to stick to this.
A bit of an aside I know, but something I found useful is to read declarations backwards.
int* test; // test is a pointer to an int
This starts to work very well, especially when you start declaring const pointers and it gets tricky to know whether it's the pointer that's const, or whether its the thing the pointer is pointing at that is const.
int* const test; // test is a const pointer to an int
int const * test; // test is a pointer to a const int ... but many people write this as
const int * test; // test is a pointer to an int that's const
Use the "Clockwise Spiral Rule" to help parse C/C++ declarations;
There are three simple steps to follow:
Starting with the unknown element, move in a spiral/clockwise
direction; when encountering the following elements replace them with
the corresponding english statements:
[X] or []: Array X size of... or Array undefined size of...
(type1, type2): function passing type1 and type2 returning...
*: pointer(s) to...
Keep doing this in a spiral/clockwise direction until all tokens have been covered.
Always resolve anything in parenthesis first!
Also, declarations should be in separate statements when possible (which is true the vast majority of times).
There are three pieces to this puzzle.
The first piece is that whitespace in C and C++ is normally not significant beyond separating adjacent tokens that are otherwise indistinguishable.
During the preprocessing stage, the source text is broken up into a sequence of tokens - identifiers, punctuators, numeric literals, string literals, etc. That sequence of tokens is later analyzed for syntax and meaning. The tokenizer is "greedy" and will build the longest valid token that's possible. If you write something like
inttest;
the tokenizer only sees two tokens - the identifier inttest followed by the punctuator ;. It doesn't recognize int as a separate keyword at this stage (that happens later in the process). So, for the line to be read as a declaration of an integer named test, we have to use whitespace to separate the identifier tokens:
int test;
The * character is not part of any identifier; it's a separate token (punctuator) on its own. So if you write
int*test;
the compiler sees 4 separate tokens - int, *, test, and ;. Thus, whitespace is not significant in pointer declarations, and all of
int *test;
int* test;
int*test;
int * test;
are interpreted the same way.
The second piece to the puzzle is how declarations actually work in C and C++1. Declarations are broken up into two main pieces - a sequence of declaration specifiers (storage class specifiers, type specifiers, type qualifiers, etc.) followed by a comma-separated list of (possibly initialized) declarators. In the declaration
unsigned long int a[10]={0}, *p=NULL, f(void);
the declaration specifiers are unsigned long int and the declarators are a[10]={0}, *p=NULL, and f(void). The declarator introduces the name of the thing being declared (a, p, and f) along with information about that thing's array-ness, pointer-ness, and function-ness. A declarator may also have an associated initializer.
The type of a is "10-element array of unsigned long int". That type is fully specified by the combination of the declaration specifiers and the declarator, and the initial value is specified with the initializer ={0}. Similarly, the type of p is "pointer to unsigned long int", and again that type is specified by the combination of the declaration specifiers and the declarator, and is initialized to NULL. And the type of f is "function returning unsigned long int" by the same reasoning.
This is key - there is no "pointer-to" type specifier, just like there is no "array-of" type specifier, just like there is no "function-returning" type specifier. We can't declare an array as
int[10] a;
because the operand of the [] operator is a, not int. Similarly, in the declaration
int* p;
the operand of * is p, not int. But because the indirection operator is unary and whitespace is not significant, the compiler won't complain if we write it this way. However, it is always interpreted as int (*p);.
Therefore, if you write
int* p, q;
the operand of * is p, so it will be interpreted as
int (*p), q;
Thus, all of
int *test1, test2;
int* test1, test2;
int * test1, test2;
do the same thing - in all three cases, test1 is the operand of * and thus has type "pointer to int", while test2 has type int.
Declarators can get arbitrarily complex. You can have arrays of pointers:
T *a[N];
you can have pointers to arrays:
T (*a)[N];
you can have functions returning pointers:
T *f(void);
you can have pointers to functions:
T (*f)(void);
you can have arrays of pointers to functions:
T (*a[N])(void);
you can have functions returning pointers to arrays:
T (*f(void))[N];
you can have functions returning pointers to arrays of pointers to functions returning pointers to T:
T *(*(*f(void))[N])(void); // yes, it's eye-stabby. Welcome to C and C++.
and then you have signal:
void (*signal(int, void (*)(int)))(int);
which reads as
signal -- signal
signal( ) -- is a function taking
signal( ) -- unnamed parameter
signal(int ) -- is an int
signal(int, ) -- unnamed parameter
signal(int, (*) ) -- is a pointer to
signal(int, (*)( )) -- a function taking
signal(int, (*)( )) -- unnamed parameter
signal(int, (*)(int)) -- is an int
signal(int, void (*)(int)) -- returning void
(*signal(int, void (*)(int))) -- returning a pointer to
(*signal(int, void (*)(int)))( ) -- a function taking
(*signal(int, void (*)(int)))( ) -- unnamed parameter
(*signal(int, void (*)(int)))(int) -- is an int
void (*signal(int, void (*)(int)))(int); -- returning void
and this just barely scratches the surface of what's possible. But notice that array-ness, pointer-ness, and function-ness are always part of the declarator, not the type specifier.
One thing to watch out for - const can modify both the pointer type and the pointed-to type:
const int *p;
int const *p;
Both of the above declare p as a pointer to a const int object. You can write a new value to p setting it to point to a different object:
const int x = 1;
const int y = 2;
const int *p = &x;
p = &y;
but you cannot write to the pointed-to object:
*p = 3; // constraint violation, the pointed-to object is const
However,
int * const p;
declares p as a const pointer to a non-const int; you can write to the thing p points to
int x = 1;
int y = 2;
int * const p = &x;
*p = 3;
but you can't set p to point to a different object:
p = &y; // constraint violation, p is const
Which brings us to the third piece of the puzzle - why declarations are structured this way.
The intent is that the structure of a declaration should closely mirror the structure of an expression in the code ("declaration mimics use"). For example, let's suppose we have an array of pointers to int named ap, and we want to access the int value pointed to by the i'th element. We would access that value as follows:
printf( "%d", *ap[i] );
The expression *ap[i] has type int; thus, the declaration of ap is written as
int *ap[N]; // ap is an array of pointer to int, fully specified by the combination
// of the type specifier and declarator
The declarator *ap[N] has the same structure as the expression *ap[i]. The operators * and [] behave the same way in a declaration that they do in an expression - [] has higher precedence than unary *, so the operand of * is ap[N] (it's parsed as *(ap[N])).
As another example, suppose we have a pointer to an array of int named pa and we want to access the value of the i'th element. We'd write that as
printf( "%d", (*pa)[i] );
The type of the expression (*pa)[i] is int, so the declaration is written as
int (*pa)[N];
Again, the same rules of precedence and associativity apply. In this case, we don't want to dereference the i'th element of pa, we want to access the i'th element of what pa points to, so we have to explicitly group the * operator with pa.
The *, [] and () operators are all part of the expression in the code, so they are all part of the declarator in the declaration. The declarator tells you how to use the object in an expression. If you have a declaration like int *p;, that tells you that the expression *p in your code will yield an int value. By extension, it tells you that the expression p yields a value of type "pointer to int", or int *.
So, what about things like cast and sizeof expressions, where we use things like (int *) or sizeof (int [10]) or things like that? How do I read something like
void foo( int *, int (*)[10] );
There's no declarator, aren't the * and [] operators modifying the type directly?
Well, no - there is still a declarator, just with an empty identifier (known as an abstract declarator). If we represent an empty identifier with the symbol λ, then we can read those things as (int *λ), sizeof (int λ[10]), and
void foo( int *λ, int (*λ)[10] );
and they behave exactly like any other declaration. int *[10] represents an array of 10 pointers, while int (*)[10] represents a pointer to an array.
And now the opinionated portion of this answer. I am not fond of the C++ convention of declaring simple pointers as
T* p;
and consider it bad practice for the following reasons:
It's not consistent with the syntax;
It introduces confusion (as evidenced by this question, all the duplicates to this question, questions about the meaning of T* p, q;, all the duplicates to those questions, etc.);
It's not internally consistent - declaring an array of pointers as T* a[N] is asymmetrical with use (unless you're in the habit of writing * a[i]);
It cannot be applied to pointer-to-array or pointer-to-function types (unless you create a typedef just so you can apply the T* p convention cleanly, which...no);
The reason for doing so - "it emphasizes the pointer-ness of the object" - is spurious. It cannot be applied to array or function types, and I would think those qualities are just as important to emphasize.
In the end, it just indicates confused thinking about how the two languages' type systems work.
There are good reasons to declare items separately; working around a bad practice (T* p, q;) isn't one of them. If you write your declarators correctly (T *p, q;) you are less likely to cause confusion.
I consider it akin to deliberately writing all your simple for loops as
i = 0;
for( ; i < N; )
{
...
i++;
}
Syntactically valid, but confusing, and the intent is likely to be misinterpreted. However, the T* p; convention is entrenched in the C++ community, and I use it in my own C++ code because consistency across the code base is a good thing, but it makes me itch every time I do it.
I will be using C terminology - the C++ terminology is a little different, but the concepts are largely the same.
As others mentioned, 4, 5, and 6 are the same. Often, people use these examples to make the argument that the * belongs with the variable instead of the type. While it's an issue of style, there is some debate as to whether you should think of and write it this way:
int* x; // "x is a pointer to int"
or this way:
int *x; // "*x is an int"
FWIW I'm in the first camp, but the reason others make the argument for the second form is that it (mostly) solves this particular problem:
int* x,y; // "x is a pointer to int, y is an int"
which is potentially misleading; instead you would write either
int *x,y; // it's a little clearer what is going on here
or if you really want two pointers,
int *x, *y; // two pointers
Personally, I say keep it to one variable per line, then it doesn't matter which style you prefer.
#include <type_traits>
std::add_pointer<int>::type test, test2;
In 4, 5 and 6, test is always a pointer and test2 is not a pointer. White space is (almost) never significant in C++.
The rationale in C is that you declare the variables the way you use them. For example
char *a[100];
says that *a[42] will be a char. And a[42] a char pointer. And thus a is an array of char pointers.
This because the original compiler writers wanted to use the same parser for expressions and declarations. (Not a very sensible reason for a langage design choice)
I would say that the initial convention was to put the star on the pointer name side (right side of the declaration
in the c programming language by Dennis M. Ritchie the stars are on the right side of the declaration.
by looking at the linux source code at https://github.com/torvalds/linux/blob/master/init/main.c
we can see that the star is also on the right side.
You can follow the same rules, but it's not a big deal if you put stars on the type side.
Remember that consistency is important, so always but the star on the same side regardless of which side you have choose.
In my opinion, the answer is BOTH, depending on the situation.
Generally, IMO, it is better to put the asterisk next to the pointer name, rather than the type. Compare e.g.:
int *pointer1, *pointer2; // Fully consistent, two pointers
int* pointer1, pointer2; // Inconsistent -- because only the first one is a pointer, the second one is an int variable
// The second case is unexpected, and thus prone to errors
Why is the second case inconsistent? Because e.g. int x,y; declares two variables of the same type but the type is mentioned only once in the declaration. This creates a precedent and expected behavior. And int* pointer1, pointer2; is inconsistent with that because it declares pointer1 as a pointer, but pointer2 is an integer variable. Clearly prone to errors and, thus, should be avoided (by putting the asterisk next to the pointer name, rather than the type).
However, there are some exceptions where you might not be able to put the asterisk next to an object name (and where it matters where you put it) without getting undesired outcome — for example:
MyClass *volatile MyObjName
void test (const char *const p) // const value pointed to by a const pointer
Finally, in some cases, it might be arguably clearer to put the asterisk next to the type name, e.g.:
void* ClassName::getItemPtr () {return &item;} // Clear at first sight
The pointer is a modifier to the type. It's best to read them right to left in order to better understand how the asterisk modifies the type. 'int *' can be read as "pointer to int'. In multiple declarations you must specify that each variable is a pointer or it will be created as a standard variable.
1,2 and 3) Test is of type (int *). Whitespace doesn't matter.
4,5 and 6) Test is of type (int *). Test2 is of type int. Again whitespace is inconsequential.
I have always preferred to declare pointers like this:
int* i;
I read this to say "i is of type int-pointer". You can get away with this interpretation if you only declare one variable per declaration.
It is an uncomfortable truth, however, that this reading is wrong. The C Programming Language, 2nd Ed. (p. 94) explains the opposite paradigm, which is the one used in the C standards:
The declaration of the pointer ip,
int *ip;
is intended as a mnemonic; it says that the expression *ip is an
int. The syntax of the declaration for a variable mimics the syntax
of expressions in which the variable might appear. This reasoning
applies to function declarations as well. For example,
double *dp, atof(char *);
says that in an expression *dp and atof(s) have values of type
double, and that the argument of atof is a pointer to char.
So, by the reasoning of the C language, when you declare
int* test, test2;
you are not declaring two variables of type int*, you are introducing two expressions that evaluate to an int type, with no attachment to the allocation of an int in memory.
A compiler is perfectly happy to accept the following:
int *ip, i;
i = *ip;
because in the C paradigm, the compiler is only expected to keep track of the type of *ip and i. The programmer is expected to keep track of the meaning of *ip and i. In this case, ip is uninitialized, so it is the programmer's responsibility to point it at something meaningful before dereferencing it.
A good rule of thumb, a lot of people seem to grasp these concepts by: In C++ a lot of semantic meaning is derived by the left-binding of keywords or identifiers.
Take for example:
int const bla;
The const applies to the "int" word. The same is with pointers' asterisks, they apply to the keyword left of them. And the actual variable name? Yup, that's declared by what's left of it.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Declaring pointers; asterisk on the left or right of the space between the type and name?
I've always been wondering what's the exact correct position to put * and &. it seems that C++ is pretty tolerant about where to put these marks.
For example I've seem pointer and ampersand been put on both left and right of a keyword or in the middle of two keywords, but confusingly sometimes they seems to mean the same thing, especially when used together with const
void f1(structure_type const& parameter)
void f2(structure_type const ¶meter)
void f2(structure_type const *sptr);
void f2(structure_type const* sptr);
void f2(structure_type const * sptr);
The examples are not exhaustive. I see them everywhere while being declared or been passed to a function. Do they even mean the same thing? But I also see cases while putting * will affect which object being referred as a pointer (likely the case where * is in between two keywords).
EDITED:
int const *Constant
int const * Constant // this above two seem the same to me, both as pointer to a constant value
int const* Constant // EDIT: this one seems the same as above. instead of a constant pointer
const int * Constant // this is also a pointer to a constant value, but the word order changed while the pointer position stays the same, rather confusing.
int* const Constant
int * const Constant // instead, these two are constant pointers
So I concluded this:
T const* p; // pointer to const T
const T* p // seems same from above
T* const p; // const pointer to T
Still, this confused the hell out of me. Doesn't the compiler care about position and the spacing required for them?
Edit:
I want to know in general of the position matters. If yes in what cases.
White space matters only to the degree that it keeps tokens from running together and (for example) creating a single token, so (for example) int x is obviously different from intx.
When you're dealing with something like: int const*x;, whitespace on either size of the * makes absolutely no difference to the compiler at all.
The difference between a pointer to const int and const pointer to int depends on which side of the * the const is on.
int const *x; // pointer to const int
int *const x; // const pointer to int
The primary difference is readability when/if you define/declare multiple objects in the same declaration.
int* x, y;
int *x, y;
In the first, somebody might think that x and y are pointers to int -- but in fact, x is a pointer to an int, and y is an int. To some people's eyes, the second reflects that fact more accurately.
One way to prevent any misinterpretation is to only ever define one object at a time:
int *x;
int y;
For any of these, correct interpretation is fairly easy if you ignore whitespace entirely (except to tell you where one toke ends and another starts, so you know "const int" is two tokens) and read from right to left, reading * as "pointer to". For example: int volatile * const x; is read as "x is a const pointer to a volatile int".
int const *Constant
int const * Constant
int const* Constant
All of the above intend to declare a non constant pointer to a constant integer.
Simple rule:
If const follows after the * it applies to the pointer if not it applies to the pointed object. The spacing doesn't matter.
Both & and * placements in variable declaration are acceptable and only depends on your own flavour. They strictly mean the same thing, creating respectively a pointer and a reference.
However, the const keyword placement is primordial, as a int const* variable declares a constant pointer to a non-constant int, and const int* variable is a non-constant pointer to a constant int.
This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
c++ * vs & in function declaration
I know that this probably seems like an incredibly elementary question to many of you, but I have genuinely had an impossible time finding a good, thorough explanation, despite all my best Googling. I'm certain that the answer is out there, and so my search terms must be terrible.
In C++, a variety of symbols and combinations thereof are used to mark parameters (as well as arguments to those parameters). What, exactly, are their meanings?
Ex: What is the difference between void func(int *var) and void func(int **var)? What about int &var?
The same question stands for return types, as well as arguments. What does int& func(int var) mean, as compared to int* func(int var)? And in arguments, how does y = func(*x) differ from y = func(&x)?
I am more than happy to read enormous volumes on the subject if only you could point me in the right direction. Also, I'm extremely familiar with general programming concepts: OO, generics/templates, etc., just not the notation used in C/C++.
EDIT: It seems I may have given the impression that I do not know what pointers are. I wonder how that could be :)
So for clarification: I understand perfectly how pointers work. What I am not grasping, and am weirdly unable to find answers to, is the meaning of, for example 'void func(int &var)'. In the case of an assignment statement, the '&' operator would be on the right hand side, as in 'int* x = &y;', but in the above, the '&' operator is effectively on the left hand side. In other words, it is operating on the l-value, rather than the r-value. This clearly cannot have the same meaning.
I hope that I'm making more sense now?
To understand this you'll first need to understand pointers and references. I'll simply explain the type declaration syntax you're asking about assuming you already know what pointers and references are.
In C, it is said that 'declaration follows use.' That means the syntax for declaring a variable mimics using the variable: generally in a declaration you'll have a base type like int or float followed something that looks like an expression. For example in int *y the base type is int and the expression look-alike is *y. Thereafter that expression evaluates to a value with the given base type.
So int *y means that later an expression *y is an int. That implies that y must be a pointer to an int. The same holds true for function parameters, and in fact for whole function declarations:
int *foo(int **bar);
In the above int **bar says **bar is an int, implying *bar is a pointer to an int, and bar is a pointer to a pointer to an int. It also declares that *foo(arg) will be an int (given arg of the appropriate type), implying that foo(arg) results in a pointer to an int.¹ So the whole function declaration reads "foo is a function taking a pointer to a pointer to an int, and returning a pointer to an int."
C++ adds the concept of references, and messes C style declarations up a little bit in the process. Because taking the address of a variable using the address-of operator & must result in a pointer, C doesn't have any use for & in declarations; int &x would mean &x is an int, implying that x is some type where taking the address of that type results in an int.² So because this syntax is unused, C++ appropriates it for a completely different purpose.
In C++ int &x means that x is a reference to an int. Using the variable does not involve any operator to 'dereference' the reference, so it doesn't matter that the reference declarator symbol clashes with the address-of operator. The same symbol means completely different things in the two contexts, and there is never a need to use one meaning in the context where the other is allowed.
So char &foo(int &a) declares a function taking a reference to an int and returning a reference to a char. func(&x) is an expression taking the address of x and passing it to func.
1. In fact in the original C syntax for declaring functions 'declarations follow use' was even more strictly followed. For example you'd declare a function as int foo(a,b) and the types of parameters were declared elsewhere, so that the declaration would look exactly like a use, without the extra typenames.
2. Of course int *&x; could make sense in that *&x could be an int, but C doesn't actually do that.
What you're asking about are called pointers (*), and reference to (&), which I think is best explained here.
The symbols & and * are used to denote a reference and pointer type, respectively.
int means simply the type 'int',
int* means 'pointer to int',
int& means 'reference to int',
A pointer is a variable which is used to store the address of a variable.
A reference has the syntax of its base type, but the semantics of a pointer to that type. This means you don't need to dereference it in order to change the value.
To take an example, the following code blocks two are semantically equivalent:
int* a = &value;
*a = 0;
And:
int& a = value;
a = 0;
The main reasons to use pointers or references as an argument type is to avoid copying of objects and to be able to change the value of a passed argument. Both of these work because, when you pass by reference, only the address is copied, giving you access to the same memory location as was "passed" to the function.
In contrast, if a reference or pointer type is not used, a full copy of the argument will be made, and it is this copy which is available inside the function.
The symbols * and & have three meanings each in C++:
When applied to an expression, they mean "dereference" and "address-of" respectively, as you know.
When part of a type, they mean "pointer" and "reference", respectively.
Since C++ doesn't care about arbitrary spacing, the declaration int *ptr is exactly the same as the declaration int* ptr, in which you can now more clearly see that this is an object called ptr of type int*.1
When used between two expressions, they mean "multiply" and "bitwise AND", respectively.
1 - though, frustratingly, this isn't actually how the internal grammar reads it, thanks to the nasty legacy of C's type system. So avoid single-line multi-declarations involving pointers unless you want a surprise.
Ex: What is the difference between 'void func(int *var)' and 'void
func(int **var)'? What about 'int &var'?
The same question stands for return types, as well as arguments. What
does 'int& func(int var)' mean, as compared to 'int* func(int var)'?
And in arguments, how does 'y = func(*x)' differ from 'y = func(&x)'?
(1)
<return type> <function name> <parameters>
void func (int *var)
<parameter> here int *var is a pointer to integer, ie it can point to
an array or any buffer that should be handled with integer pointer
arithmetic. In simple terms , var holds the address of the respective
**actual parameter.**
eg: int arr[10];
func(arr);
int a = 33;
func(&a);
Here, &a means we are explicitly passing address of the the variable 'a'.
(2)
int m = 0;
int &var = m;
Here var means reference, ie it another alias name for variable 'm' ,
so any change 'var' makes will change the contents of variable 'm'.
var = 2; /* will change the actual contents of 'm' */
This simple example will not make sense , unless you understand the context.
Reference are usually use to pass parameter to function, so that changes made by
the function to the passed variable is visible at the caller.
int swap(int &m, int &n) {
tmp = m;
m = n;
n = tmp;
}
void main( void ) {
int x = 1, y = 2;
swap(x, y);
/* x = 2, y =1 */
}
(3)
'int& func(int var)' mean, as compared to 'int* func(int var)'?
int& func(int var) means the function returns a reference;
int* func(int var) means the function returns a pointer / address;
Both of the them has its context;
int& setNext() {
return next;
}
setNext() = previous;
where as
int* setNext() {
return &next;
}
int *inptr;
inptr = setNext();
*inptr = previous;
In the previous two lines,
int *inptr <- integer pointer declaration;
*inptr <- means we are accessing contents of the address pointed by inptr;
ie we are actually referring to 'next' variable.
The actual use is context specific. It can't be generalized.
(4)
how does 'y = func(*x)' differ from 'y = func(&x)'?
y = func(&x) is already explained.
y = func(*x) , well i'm not sure if you actually meant *x.
int swap(int *m, int *n) {
tmp = *m;
*m = *n;
*n = tmp;
}
To start you probably know that const can be used to make either an object's data or a pointer not modifiable or both.
const Object* obj; // can't change data
Object* const obj; // can't change pointer
const Object* const obj; // can't change data or pointer
However you can also use the syntax:
Object const *obj; // same as const Object* obj;
The only thing that seems to matter is which side of the asterisk you put the const keyword. Personally I prefer to put const on the left of the type to specify it's data is not modifiable as I find it reads better in my left-to-right mindset but which syntax came first?
More importantly why is there two correct ways of specifying const data and in what situation would you prefer or need one over the other if any?
Edit:
So it sounds like this was an arbitrary decision when the standard for how compilers should interpret things was drafted long before I was born. Since const is applied to what is to the left of the keyword (by default?) I guess they figured there was no harm in adding "shortcuts" to apply keywords and type qualifiers in other ways at least until such a time as the declaration changes by parsing a * or & ...
This was the case in C as well then I'm assuming?
why is there two correct ways of specifying const data and in what situation would you prefer or need one over the other if any?
Essentially, the reason that the position of const within specifiers prior to an asterisk does not matter is that the C grammar was defined that way by Kernighan and Ritchie.
The reason they defined the grammar in this way was likely that their C compiler parsed input from left-to-right and finished processing each token as it consumed that. Consuming the * token changes the state of the current declaration to a pointer type. Encountering const after * means the const qualifier is applied to a pointer declaration; encountering it prior to the * means the qualifier is applied to the data pointed to.
Because the semantic meaning does not change if the const qualifier appears before or after the type specifiers, it is accepted either way.
A similar sort of case arises when declaring function pointers, where:
void * function1(void) declares a function which returns void *,
void (* function2)(void) declares a function pointer to a function which returns void.
Again the thing to notice is that the language syntax supports a left-to-right parser.
The rule is:
const applies to the thing left of it. If there is nothing on the left then it applies to the thing right of it.
I prefer using const on the right of the thing to be const just because it is the "original" way const is defined.
But I think this is a very subjective point of view.
I prefer the second syntax. It helps me keep track of 'what' is constant by reading the type declaration from right to left:
Object * const obj; // read right-to-left: const pointer to Object
Object const * obj; // read right-to-left: pointer to const Object
Object const * const obj; // read right-to-left: const pointer to const Object
The order of the keywords in a declaration isn't all that fixed. There are many alternatives to "the one true order". Like this
int long const long unsigned volatile i = 0;
or should it be
volatile unsigned long long int const i = 0;
??
The first rule is to use whichever format your local coding standards
requires. After that: putting the const in front leads to no end of
confusion when typedefs are involved, e.g.:
typedef int* IntPtr;
const IntPtr p1; // same as int* const p1;
If your coding standard allows typedef's of pointers, then it really
should insist on putting the const after the type. In every case but
when applied to the type, const must follow what it applies to, so
coherence also argues in favor of the const after. But local coding
guidelines trump all of these; the difference isn't normally important
enough to go back and change all of the existing code.
There are historical reasons that either left or right is acceptable. Stroustrup had added const to C++ by 1983, but it didn't make it to C until C89/C90.
In C++ there's a good reason to always use const on the right. You'll be consistent everywhere because const member functions must be declared this way:
int getInt() const;
C uses a right-to-left syntax. Just read the declarations from right to left:
int var = 0;
// one is a pointer to a const int
int const * one = &var;
// two is a pointer to an int const (same as "const int")
const int * two = &var;
// three is a constant pointer to an int
int * const three = &var;
The first thing left to the "const" is affected by it.
For more fun read this guide:
http://cseweb.ucsd.edu/~ricko/rt_lt.rule.html
With
using P_Int = int *;
P_Int const a = nullptr;
int * const b = nullptr;
const P_Int c = nullptr;
const int * d = nullptr;
the variables a and b are the same type as each other but, somewhat confusingly, the variables c and d are not the same type as each other. My preference is for the first scenario, without the confusion: have const on the right of the type. N.B. it is the pointer that is const with a, b, and c; but the int is const with d.
When I work with the existing code, I follow the way already being used which is most of the time, const on left e.g const int a = 10; but if I got a chance to work from start I choose the "right way", means const on right e.g int const a = 10;, which is a more universal approach and straightforward to read.
As mentioned already, the rule is
const applies to the thing left of it. If there is nothing on the left then it applies to the thing right of it.
Here is the example code for basic use cases.
int main() {
int const a = 10; // Constant integer
const int b = 20; // Constant integer (same as above)
int c = 30; // Integer (changeable)
int * const d = &c; // Constant pointer to changeable integer
int const * e = &a; // Changeable pointer to constant integer
int const * const f = &a; // Constant pointer to constant integer
return 0;
}
At the end of the day, if there is no guideline to follow, this is subjective, and harmless to use either approach.