Are references distinct types? [duplicate] - c++

From what I can tell, references can be used wherever the original type can (I'm not implying the reverse is true), the only difference is their mutation semantics (when the variables are used as lvalues).
Wouldn't they then qualify as the same type as the original? If so, why is the fact that something is a reference, stored in its type?
Edit: if references are a different type, why can they be substituted for the original type in so many situations, without explicit casting? Is there implicit cast involved?
Example:
void bar(int& a);
int x;
int& y = x;
bar(y); // matching type
bar(x); // what happened here? was x cast to a reference?

A reference is formally a type; or at least you can read things like "if T is a reference type" in the C++ standard itself.
However, your question is perfectly legitimate in that references have very confusing semantics. They are not quite first-class types (for example, you can't have a reference-to-reference or a pointer-to-reference), and in my opinion that's because C++ managed to conflate two different kinds of types with how it defines and uses references.
What a reference really does is it gives an alternate name to an already-existing object (value). What does this mean? It means that it doesn't "qualify" the type of the value it refers to per se; it rather qualifies the name ("variable", "storage") itself that is used for referencing the value.
In C++, the semantics of a type and a value often depend on additional properties of the storage where the object/value is stored. This is not always explicit, and that's what confuses people.
I think because C++ heavily relies on exposing the concept of "storage" (rather than hiding it as an implementation detail), there really should be two type systems: one for pure values themselves, and one for storage, where the type system for storage should be a superset of the type system for values.
Another example where a very similar issue appears is CV-qualification. It's not an object/value itself that should be const or volatile. It's the storage containing that value that may or may not be mutable, and may or may not need to be protected from certain load/store optimizations. Again, this could be better expressed if there was a way to express these properties of types separately in the case of values and storage.

From what I can tell, references can be used wherever the original type can
That is simply not true.
Consider:
void foo(int x);
void bar(int& x);
foo(3);
bar(3); // whoops!
And how about this:
struct T
{
int& x;
};
It wouldn't make sense not to have a distinct type for references. This way, you get function overloading powers and every other benefit that the type system gives you.
You would otherwise need to invent some other mechanism to denote a thing as being a reference rather than a non-reference; surely the type system is the perfect mechanism to denote that?
int and int& are two distinct types.
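To illustrate the overloading point: because int& and const int& are distinct parameter types, the compiler can pick between them based on the argument. This is only a minimal sketch; the function names are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// int and int& are distinct types as far as the type system is concerned.
static_assert(!std::is_same<int, int&>::value, "int and int& differ");

// Two overloads that differ only in the kind of reference they take
// (hypothetical names, not from the question).
std::string which(int&)       { return "int&"; }
std::string which(const int&) { return "const int&"; }

std::string pick_for_variable() {
    int x = 0;
    return which(x);   // x is a modifiable lvalue, so which(int&) is preferred
}

std::string pick_for_literal() {
    return which(3);   // 3 cannot bind to int&, so only which(const int&) is viable
}
```

If references were not part of the type, there would be nothing for overload resolution to distinguish here.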

“From what I can tell, references can be used wherever the original type can”
No. A reference refers. You can think of it as a pointer in disguise.
“Are references separate types in C++?”
Yes.
“If not, why are they written in the type?”
That's just the syntax for specifying a reference type, using & as a type builder symbol. As another example, * is a type builder for pointers. Except for a limitation of type inference, we could now replace that impractical syntax (1) with template syntax.
1) Both the creators of C and the creator of C++ have on several occasions described the original C declaration syntax as a “failed experiment”.

Unlike a pointer, a reference cannot be reseated: once initialized, it cannot be made to refer to a different object. But like a pointer, a reference is useful for avoiding copies, since it creates an alias to something that already exists. Knowing that something is a reference and not an object tells the compiler not to copy the object when the reference is initialized or when it is passed to a function.
EDIT: regarding the updated questions, "if references are a different type, why can they be substituted for the original type in so many situations, without explicit casting? Is there implicit casting involved?" ... not casting, it is a reference so it simply gets "dereferenced" by "pointing" to the original object; it may help to just think of it as just a substitution name, or an alias, etc.
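The two claims above (no conversion happens when binding, and a reference cannot be reseated) can be checked with a small sketch. The helper names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: reports the address of whatever object `a` is bound to.
const int* address_of_param(int& a) { return &a; }

// Passing x or y binds the parameter to the same object; nothing is converted or copied.
bool same_object_either_way() {
    int x = 0;
    int& y = x;
    return address_of_param(x) == &x && address_of_param(y) == &x;
}

// Assigning through a reference writes to the referent; it does not reseat the reference.
int assign_through_reference() {
    int first = 1;
    int second = 2;
    int& ref = first;
    ref = second;    // copies the value 2 into first
    second = 99;     // has no effect on first or ref
    return first;    // 2
}
```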

Related

Contradicting definition of references

I am learning about references in C++. In particular, I have learned that references are not actual objects. Instead they refer to some other object. That is, references are just aliases for other objects.
Then I came across this, which says:
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. It is neither a pointer to the object, nor a copy of the object. It is the object. There is no C++ syntax that lets you operate on the reference itself separate from the object to which it refers.
I get that the above quote means that we can't operate on the reference itself separate from the object to which it refers but it still seems to imply that "a reference is an object".
Also, I have come across the sentence given below:
In ISO C++, a reference is not an object. As such, it needs not have any memory representation.
I don't have a link to this second quote, but I read it in a Stack Overflow post somewhere.
My question is: assuming the second quote is also from the standard (which may not be the case), don't these two quoted statements contradict each other? Or at least, isn't the first quote misleading? Which one is correct?
My current understanding (from reading books like C++ Primer, 5th edition) is that references are aliases for objects, which leads me to think that they should not take any space in memory.
Important note: Even though a reference is often implemented using an address in the underlying assembly language, please do not think of a reference as a funny looking pointer to an object. A reference is the object, just with another name. ...
Notes are informal and usually should not be interpreted as strict rules. If an interpretation contradicts the standard's rules, then that interpretation is wrong.
References and objects are different kinds of entities. A reference is not an object distinct from the one that it names. It isn't possible to form a pointer to a reference. A "pointer to reference" isn't even a valid type category.
The note is trying to say that a reference "is" the object that it names, in the sense that using the reference is using the referred-to object.
I was thinking of confirming that whether or not references take any space
References take space or they don't take space. It's up to the language implementation to figure out whether it needs space in each case.
Standard quote:
[dcl.ref] It is unspecified whether or not a reference requires storage
Outside of standard specifications, if you want an example of reference using space, try adding a reference member to a class and you are very likely to observe that the size of the class increases.
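A rough sketch of that experiment follows. The struct names are invented for the example, and the sizes it reports are implementation-specific; only the static_assert is guaranteed by the language.

```cpp
#include <cassert>
#include <cstddef>

struct Plain   { int value; };
struct WithRef { int value; int& alias; };   // hypothetical types for illustration

// sizeof applied to a reference type yields the size of the referenced type,
// so it cannot be used to observe a reference's own storage.
static_assert(sizeof(int&) == sizeof(int), "sizeof(T&) is sizeof(T)");

std::size_t plain_size()    { return sizeof(Plain); }
std::size_t with_ref_size() { return sizeof(WithRef); }
```

Printing plain_size() and with_ref_size() on a typical implementation will likely show the second being larger by roughly the size of a pointer plus padding, but the standard does not promise this.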
since pointers take space then reference should also take space. ...
Pointers do take space in the abstract machine that the standard specifies. But if you never observe the storage, then it's entirely possible that the storage never exists in practice. A significant difference between references and pointers is that you cannot observe the storage of a reference directly.
Philosopher: "If a tree falls in an abstract machine and no one is around to observe it, does it have an effect?"
Optimiser: "Not if I can help it."
A reference provides another way of referring to an object. That's useful particularly when passing parameters by reference to functions. More formally, a reference is an alias that binds to a variable including, in this case, an anonymous temporary.
Fortunately we don't need to concern ourselves with how they are implemented. That's the job of the compiler, and techniques vary. The C++ standard does not require them to occupy any memory.
There is a way of distinguishing reference types by the way. Non-separability is really more about not being able to bind the reference to any other variable. See
#include <iostream>
#include <type_traits>
int main() {
    int a = 0;
    int& ref = a;
    // decltype(ref) is int&, so the first string is printed
    std::cout << (
        std::is_reference<decltype(ref)>::value ? "pay me a bonus\n" : "reformat my hard disk\n"
    );
    // decltype(a) is int, so the second string is printed
    std::cout << (
        std::is_reference<decltype(a)>::value ? "pay me a bonus\n" : "reformat my hard disk\n"
    );
}
Note finally that &a and &ref must always be the same.
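As a compile-time variation on the same idea, here is a small sketch (the function name is made up) showing that the declared types of a and ref differ, while the expressions a and ref are both lvalues of type int naming one object.

```cpp
#include <cassert>
#include <type_traits>

bool name_and_alias_agree() {
    int a = 0;
    int& ref = a;

    // decltype of a plain name reports the declared type...
    static_assert(std::is_same<decltype(a), int>::value, "a is declared as int");
    static_assert(std::is_same<decltype(ref), int&>::value, "ref is declared as int&");

    // ...while decltype of a parenthesized lvalue expression yields T& for both.
    static_assert(std::is_same<decltype((a)), int&>::value, "(a) is an lvalue of type int");
    static_assert(std::is_same<decltype((ref)), int&>::value, "(ref) is an lvalue of type int");

    ref = 42;                       // writes to a
    return &a == &ref && a == 42;   // one object, two names
}
```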
The first quote is really saying the reference is not separable from the object.
... still seems to imply that "a reference is an object".
It really implies that a reference is a non-separable, non-first-class alias for an object, exactly as you first said.
The difficulty with these discussions is that the standardese meaning of "object" is already different from the meaning used in most less-formal contexts.
Let's start simple:
int a;
Would often be described as declaring an integer object a, right? It actually does two things: it declares an integer object, and it binds the name a to that object in the appropriate scope.
Now, if we write
int &b = a;
we could say that b is the object in the same way as we could say that a is the object. Actually neither is correct, but given that informal text already uses the latter, it's no worse.
We should instead say that the name b refers to the same object as the name a. This is exactly consistent with calling it an alias, but informal or introductory texts would seem pretty cumbersome if they wrote "... the integer object referred to by the name a ..." everywhere instead of just "the integer a".
As for taking space in memory ... it depends. If I introduce 100 aliases for a single object inside a single function I'd be really surprised if the compiler didn't just collapse them (although of course they might still show up in debug symbols). No information is being lost here by eliding the redundant names.
If I pass an argument by reference to a non-inlined function, some actual information is being communicated, and that information must be stored somewhere.
Asking what a reference actually "is" doesn't make much sense: you could say it is the referenced object or that it is an alias to it, and these claims are both true in some sense.
int main()
{
int a(0);
int& ref(a);
ref = 1; // Will actually affect the value of a
return 0;
}
Let's go through this program line by line.
int a(0); allocates some memory (usually 4 bytes) on the stack to hold an integer.
int& ref(a); doesn't necessarily allocate memory, and whether it actually does is compiler-specific. In this sense, ref itself is not an object: it is simply an alias, another name, for a. This is what the second quote means by "a reference is not an object". (Please note that sometimes, for example when the referenced object can't be known at compile time, a reference has to reserve additional space for the object's address. In these cases, references are just syntactic sugar for pointers.)
ref = 1; sets the value of a to one. In this sense, you can think of ref as being precisely the same object as a. Any operation "on" the reference will actually operate on the referenced object. This is what the first quote means by "It is the object".

Are alias and reference the same thing in C++?

I'm not asking the difference between pointer and reference. Just a bit confused about the difference between reference and alias.
As far as I'm concerned, reference is a data type while alias is just a word describing the utility of this data type?
Thanks!
Aliasing refers to any way to refer to the same data through different names. References and pointers are two ways of achieving this behavior.
No, a reference is not a data-type, a reference references some other variable. Using a reference is the same as using the variable it references. It's very similar to a pointer (and it's not unlikely that the compiler treats references as pointers under the hood).
I've never heard of "alias" by itself in the context of C++, but there are type-aliases, created by e.g. typedef or using. There's also aliasing which is unrelated to both references and type-aliases.
Sorry, you said you were not asking the difference between a pointer and a reference.
To answer what you’re asking, the word reference means that a variable is pointing to a location in memory. Alias has a few different meanings, but the one I’ve seen most often in this context is that more than one variable are referencing the same location in memory, such as when you try to call memcpy(p, p, n);. One way to make this happen is with a C++ reference, which is a term of art for a language feature similar but not identical to pointers. Not every reference in C++ necessarily refers to something which ever has another name. You can also do aliasing with pointers, the address operator, a call by reference, or the compiler merging constants so that "Hello" and "Hello" in two places point to the same bytes in memory, or unions. Probably not exhaustive.
If people want to call a reference to something an “alias” even when there isn't any other variable referencing the same memory at the same time, I’m not strongly motivated to argue.
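A small sketch of why aliasing in this sense matters: the same function gives different results depending on whether its two reference parameters name the same object. The function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical function: adds src to dst twice.
void add_twice(int& dst, const int& src) {
    dst += src;
    dst += src;   // if dst and src alias, src has already changed here
}

int without_aliasing() {
    int a = 1, b = 1;
    add_twice(a, b);   // 1 + 1 + 1
    return a;          // 3
}

int with_aliasing() {
    int a = 1;
    add_twice(a, a);   // 1 + 1 = 2, then 2 + 2
    return a;          // 4
}
```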
As several others have pointed out, C++ (since C++11) uses the term “alias” for type names and alias templates declared with using. (http://en.cppreference.com/w/cpp/language/type_alias)
A type alias declaration introduces a name which can be used as a synonym for the type denoted by type-id. It does not introduce a new type and it cannot change the meaning of an existing type name. There is no difference between a type alias declaration and typedef declaration. This declaration may appear in block scope, class scope, or namespace scope.
From http://en.cppreference.com/w/cpp/language/type_alias
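To keep the two meanings apart, the sketch below declares a reference type through a type alias. The alias names are invented for the example; the static_asserts show that neither typedef nor using creates a new type.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical alias names for illustration.
typedef int& IntRefTypedef;
using IntRefAlias = int&;

// An alias template, which typedef cannot express.
template <typename T>
using RefTo = T&;

static_assert(std::is_same<IntRefTypedef, IntRefAlias>::value, "typedef and using name the same type");
static_assert(std::is_same<IntRefAlias, int&>::value, "the alias is not a new type");
static_assert(std::is_same<RefTo<int>, int&>::value, "the alias template instantiation is int&");

int write_through_alias() {
    int x = 0;
    IntRefAlias r = x;   // a reference declared via a type alias
    r = 7;
    return x;            // 7
}
```

Here IntRefAlias is an alias for a type, while r is an alias for the object x.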

is reference in c++ internally compiled as pointers or alias?

This tutorial says,
You're probably noticing a similarity to pointers here--and that's true, references are often implemented by the compiler writers as pointers
Similarly, a comment on "What is a reference variable in C++?" says:
Technically not. If bar was a variable you could get its address. A reference is an alias to another variable (not the address of as this would imply the compiler would need to insert a dereference operation). When this gets compiled out bar probably is just replaced by foo
Which statement is true?
Both are true, but under different circumstances.
Semantically, a reference variable just introduces a new name for an object (in the C++ sense of "object").
(There's plenty of confusion around what "variable" and "object" mean, but I think that a "variable" in many other languages is called an "object" in C++, and that's what your second quote refers to as a "variable".)
If this reference isn't stored anywhere or passed as a parameter, it doesn't necessarily have any representation at all (the compiler can just use whatever it refers to instead).
If it is stored (e.g. as a member) or passed as a parameter, the compiler needs to give it a representation, and the most sensible one is to use the address of the object it refers to, which is exactly the same way as pointers are represented.
Note that the standard explicitly says that it is unspecified whether a reference requires any storage at all.
The C++ Standard states, at §8.3.2/4:
It is unspecified whether or not a reference requires storage.
And this non-specification is the main reason why both a pointer implementation and an aliasing implementation are valid implementations.
Therefore, both can be right.
They're both true, in a manner of speaking. Whether a reference gets compiled as a pointer is an implementation detail of the compiler, rather than a part of the C++ standard. Some compilers may use regular pointers, and some may use some other form of aliasing the referenced variable.
Consider the following lines:
int var = 0;
int &myRef = var;
Compiler "A" may compile myRef as a pointer, and compiler "B" might use some other method for using myRef.
Of course, the same compiler may also compile the reference in different ways depending on the context. For example, in my example above, myRef may get optimized away completely, whereas in contexts where the reference is required to be present (such as a method parameter), it may be compiled to a pointer.
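For the parameter case, a sketch of two functions with the same observable effect is shown below (the names are hypothetical). On many implementations a reference parameter is passed as an address, much like the pointer version, but that is an implementation choice and not something the standard requires.

```cpp
#include <cassert>

// Hypothetical pair of functions with the same observable effect.
void increment_via_reference(int& n) { ++n; }
void increment_via_pointer(int* n)   { ++*n; }

int run_both() {
    int value = 0;
    increment_via_reference(value);   // no explicit address-of at the call site
    increment_via_pointer(&value);    // the address is taken explicitly
    return value;                     // 2
}
```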

What do I call a "normal" variable?

int i;
int* p;
int& r = i; // a reference must be initialized
double d;
double* p2;
double& r2 = d;
p and p2 are pointers, r and r2 are references, but what are i and d? (No, I am not looking for the answer "an int and a double")
I am looking for a name to use for "normal" variables, setting them apart from pointers and references. I don't believe that such a name doesn't exist (after all, I can't be the first one who wants to distinguish them from pointers and references). I do have the feeling that it's something really easy and I'm just missing it here.
Who knows what to call "normal" variables?
Additional info
I am looking for a name that can refer to anything but references and pointers, so including classes. The whole same story could be held when the following was included as well:
MyClass c;
MyClass* p3;
MyClass& r3 = c; // a reference must be initialized
I am not looking for a way to refer to i, a way to refer to d and a way to refer to c. I am looking for a way to refer to the group (of non-references, non-pointers) which i, d and c are part of.
If I understand what you are talking about, I would call it a value type.
Pointers and references are variables as well. I think it is sufficient to say that i is a variable of type int and p is a variable of type pointer to int. If they are members of a class i is a member variable of type int would be the most precise description.
Edit: There is no definite answer to your question. int would be a fundamental type in standard terminology. Other types such as classes, unions and pointers are called compound types. This just isn't helpful in your case as you would refer to
A* a;
int* b;
A c;
as only consisting of compound types. But you want to emphasize that you use pointers. Just say it.
If you want the official terminology, then the C++ standard defines:
fundamental types, built-in value types such as int and double, and
compound types, including pointers and references, and also arrays, functions, classes, unions and enumerations.
It also uses the terms "object types", "reference types" and "function types", but it seems a bit vague to me about whether a pointer is an "object type" or a "reference type".
If you want to include fundamental types and classes, I would use the term "object type", and leave it to pedants to quibble about whether that should include pointers.
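The standard library's type traits follow these categories, so they can be queried directly. In the sketch below, MyClass is a stand-in for the question's class and is_neither_pointer_nor_reference is a hypothetical helper. Note that std::is_object is true for pointer types, so "object type" alone does not single out the group the question asks about.

```cpp
#include <cassert>
#include <type_traits>

struct MyClass {};   // stand-in for the question's MyClass

// Fundamental vs. compound types.
static_assert(std::is_fundamental<int>::value, "int is fundamental");
static_assert(std::is_compound<int*>::value, "pointers are compound");
static_assert(std::is_compound<int&>::value, "references are compound");
static_assert(std::is_compound<MyClass>::value, "classes are compound");

// Object types include pointers but exclude references.
static_assert(std::is_object<int*>::value, "a pointer type is an object type");
static_assert(!std::is_object<int&>::value, "a reference type is not an object type");

// Hypothetical helper: true for types that are neither pointers nor references.
template <typename T>
constexpr bool is_neither_pointer_nor_reference() {
    return !std::is_pointer<T>::value && !std::is_reference<T>::value;
}
```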
Value types (value variables) was my first thought, but it seems to make some people uncomfortable, so nonreferential types (nonreferential variables) works just as well: pointers and references are both "referential types" in the sense that they refer to another location, while ordinary value types do not.
They are just variables. Not pointers and not references, variables.
I think you want to use "non-pointer and non-reference type". There is no name especially designed for your purpose, I think.
A name which allows me to say, I am using pointers all over the place, but there are many cases where there is no reason to do so, so I should change them to <normal variable>s wherever possible.
It sounds to me like you are rather looking for "pass by copy" or "copy" vs "pass by reference" or "reference". The fact that you can pass a pointer by value makes it impossible to use an absolute term, but makes it necessary to use a term tied to the use case:
int **p;
Is this a variable used to change a pointer of type int* (then it would be "pass by reference" in the sense you aim to modify the referent. To clarify that a pointer is used, you may wish to use the term "pass by pointer" too), or is this variable just used to hold a value of type int** (then it would be "pass by value" in the sense that you copy such a value)?
Values, or variables.
I typically call those "plain ol' data" (POD) variables, after the convention of referring to a struct as "POD" if it contains only data members (no functions). That's not an official convention, but it gets the point across (for simple types like int and float, it doesn't apply for classes).
I have also heard these types of variables called "concrete" variables. I think the distinction that was trying to be made is that these variables are something in and of themselves, whereas pointers and references simply tell you about some other piece of data somewhere else.
More than anything else, I hear these "loose" variables (that is, not a member of another object) referred to simply by their type (integer, floating-point number, class, etc).
i and d are consistently called objects in the C++ standard. But, then again, so are p, p2, and objects of class type such as the "c" of MyClass c;.
Personally I like calling i, d, and p objects, though it might be a little bit confusing to programmers of other languages such as Java, where they would be known as primitive variables with the term objects reserved for instances of classes.
EDIT: Instead of
I am using pointers all over the place, but there are many cases where there is no reason to do so, so I should change them to <normal variable>s wherever possible.
I would say: "I am using pointers all over the place, but there are many cases where there is no reason to do so, so I should remove the levels of indirection wherever possible."
I think we should call them OBJECTs. I think there's no need to be that strict.
Consider:
typedef int* pint;
pint foo;
What do you think foo is?
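Assuming the typedef was meant to name the type int* as pint, the following sketch answers the question at compile time: foo is a pointer, even though no * appears in its declaration. The function name is made up for the example.

```cpp
#include <cassert>
#include <type_traits>

typedef int* pint;   // assumed intent of the typedef in the answer

static_assert(std::is_pointer<pint>::value, "pint is a pointer type");
static_assert(std::is_same<pint, int*>::value, "pint is just another name for int*");

// const applies to the pointer itself, not to the int it points to.
static_assert(std::is_same<const pint, int* const>::value, "const pint is int* const");

int write_through_pint() {
    int target = 0;
    pint foo = &target;   // foo is a pointer, even though no * appears here
    *foo = 5;
    return target;        // 5
}
```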