C++ reference & const pointers in C/C++ - c++

Recently, while I was explaining the basic difference between pointers and references (in the context of C++ programming) to someone, I gave the usual explanations, which talk about the different argument-passing conventions - call by value, call by pointer, call by reference - and all the associated theory about references.
But then I thought: whatever a C++ reference does in terms of argument passing (it allows a memory-efficient way of passing large structures/objects and, at the same time, keeps things safe by not allowing the callee to modify any members of the object passed by reference, if our design demands it),
a const pointer in C would achieve the same thing. E.g., if one needs to pass a structure pointer, say struct mystr *ptr, by casting the structure pointer to const -
func(int,int,(const struct mystr*)(ptr));
will ptr not be some kind of equivalent to a reference?
Will it not work in a way that is memory efficient, by not replicating the structure (pass by pointer), but also safe, by disallowing any changes to the structure fields because it is passed as a const pointer?
(In a C++ object context, we may pass a const object pointer instead of an object reference and achieve the same functionality.)
If yes, then what use-case scenario in C++ needs references?
Any specific benefits of references, and associated drawbacks?
thank you.
-AD

You can bind a const reference to an rvalue:
void foo(const std::string& s);
foo(std::string("hello"));
But it is impossible to pass the address of an rvalue:
void bar(const std::string* s);
bar(&std::string("hello")); // error: & operator requires lvalue
References were introduced into the language to support operator overloading. You want to be able to say a - b, not &a - &b. (Note that the latter already has a meaning: pointer subtraction.)
References primarily support pass by reference, pointers primarily support reference semantics. Yes I know, the distinction isn't always quite clear in C++.
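The operator-overloading point above can be sketched as follows. Vec2 is a hypothetical type invented for this illustration; it does not come from the question.

```cpp
// Hypothetical 2D vector type, used only for illustration.
struct Vec2 {
    double x;
    double y;
};

// The operands are taken by reference to const: no copy is made,
// and the caller writes a - b rather than &a - &b.
// Temporaries can be passed as operands as well.
Vec2 operator-(const Vec2& a, const Vec2& b) {
    return Vec2{a.x - b.x, a.y - b.y};
}
```

A pointer-based overload is not an option here: an operator function needs at least one parameter of class or enumeration type (or a reference to one), and &a - &b would mean pointer subtraction anyway.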

There are two typical use-case scenarios:
First: Pointers denote optional arguments. Since references cannot be NULL but pointers can, document in the coding style that any argument written as a pointer may be NULL and that the function needs to handle that. Optional arguments can then be const or non-const, as can mandatory (reference) arguments.
Second: References are only used in conjunction with the const keyword, because the calling syntax suggests to the reader pass-by-value semantics, which is by definition constant. Then pointers are only used for arguments that can be changed by the callee.
I personally prefer the first option, because there each of the four cases "const reference", "non-const reference", "const pointer", "non-const pointer" has a different meaning. Option two only differentiates between two "things": "function may modify that value" vs. "function will not modify that value".
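A minimal sketch of the first convention, assuming a hypothetical greet function (the name and the strings are made up for illustration):

```cpp
#include <string>

// Mandatory argument: a reference, so the caller must supply an object.
// Optional argument: a pointer, where nullptr means "not provided".
std::string greet(const std::string& name, const std::string* title = nullptr) {
    if (title != nullptr) {
        return "Hello, " + *title + " " + name;
    }
    return "Hello, " + name;
}
```

Under this convention a reader can tell from the signature alone that name is required and title may be absent.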

In terms of the referent object, what code can manipulate given a const reference is the same as what it can do with a pointer to a const object, and similarly what it can manipulate given a non-const reference is the same as given a pointer to a non-const object.
What a reference prevents the called function from doing is changing which object the reference refers to, or pointer arithmetic on the reference.
In terms of the caller, a const reference can be bound directly to an rvalue, and you have well defined semantics when creating and destroying objects passed as arguments. This is a common idiom - to construct a temporary for an argument:
// declaration
void bar ( const Foo& );
// use
bar ( Foo() );
But it isn't immediately obvious that the Foo object in these has a lifespan which exceeds the length of the function call:
// declaration
void bar ( const Foo* );
// use
Foo temp;
bar ( &temp );
// cast to avoid warning about taking address of temporary
bar ( &static_cast<const Foo&>( Foo() ) );
// helper function to same effect
template<typename T> const T* address_of ( const T& t) { return &t; }
bar ( address_of ( Foo() ) );
Though obviously the latter, being just a function call, should make it obvious it does.
Basically, the references are syntactic sugar for pointers which point to single objects, not the start of arrays.

Related

C++ demo function uses both a const argument that is a pointer why is this?

In the std::filesystem c++17 library documentation there are multiple instances of an input to a function being a const and yet also using its reference.
void demo_exists(const fs::path& p, fs::file_status s = fs::file_status{});
Surely the const and the & contradict. My understanding is that const is used when you don't want the variable to be modified and an & when you want to mutate multiple arguments within the function without having to return them as an array after.
Surely using both const and & is oxymoronic and confusing or is there a reason why they are both used?
Thanks in advance
original example from the docs https://en.cppreference.com/w/cpp/filesystem/exists
My understanding is that const is used when you don't want the variable to be modified and an & when you want to mutate multiple arguments within the function without having to return them as an array after.
That is not quite correct. You pass by reference when you want to avoid a copy. Though since move semantics were introduced, many uses of passing by (const) reference are obsolete.
You declare the reference as const when the method does not modify the parameter and only non-const when it has to modify it.
Example:
void foo(const SomeLargeObject&); // pass by reference to avoid copy
void bar(int&); // pass non-const reference to modify the parameter
PS:
C++ demo function uses both a const argument that is a pointer why is this?
There is not a single pointer here. What might be confusing is that the address-of operator and the symbol for reference types are the same: &. However, it has two very different meanings:
int x = 0;
int* pointer_to_x = &x; // here it is address-of operator
int& reference_to_x = x; // here it declares reference_to_x as reference to int
PS2: For the sake of completeness I want to mention one more thing. The second unfortunate thing in C++ here is that "references" usually refers to C++ references (see the examples above). On the other hand, the term "pass by reference" is used in a wider context and uses "reference" in the sense of "some reference", i.e. it can be a reference or it can be a pointer. In the wider sense (*) those two are both pass-by-reference, allowing the method to modify what is referred to:
void foo(int& x); // A
void foo(int* x); // B
In C++, references (A) should be preferred unless "no value" (i.e. a null pointer) is a valid input.
(*) To be precise the pointer itself is actually passed by value.

Meaning of pass by reference in C and C++?

I am confused about the meaning of "pass by reference" in C and C++.
In C, there are no references. So I guess pass by reference means passing a pointer. But then why not call it pass by pointer?
In C++, we have both pointers and references (and stuff like iterators that lies close). So what does pass by reference mean here?
In colloquial usage, "pass by reference" means that, if the callee modifies its arguments, it affects the caller, because the argument as seen by the callee refers to the value as seen by the caller.
The phrase is used independent of the actual programming language, and how it calls things (pointers, references, whatever).
In C++, call-by-reference can be done with references or pointers. In C, call-by-reference can only be achieved by passing a pointer.
"Call by value":
void foo( int x )
{
// x is a *copy* of whatever argument foo() was called with
x = 42;
}
int main()
{
int a = 0;
foo( a );
// at this point, a == 0
}
"Call by reference", C style:
void foo( int * x )
{
// x is still a *copy* of foo()'s argument, but that copy *refers* to
// the value as seen by the caller
*x = 42;
}
int main()
{
int a = 0;
foo( &a );
// at this point, a == 42
}
So, strictly speaking, there is no pass-by-reference in C. You either pass the variable by-value, or you pass a pointer to that variable by-value.
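For comparison with the C-style version above, here is a sketch of the same idea using a C++ reference parameter. The function names are chosen for this illustration only.

```cpp
// By value: x is a copy, so the assignment is invisible to the caller.
void set_by_value(int x) {
    x = 42;
}

// By C++ reference: x is an alias for the caller's variable, so no
// explicit address-of at the call site and no dereference here.
void set_by_reference(int& x) {
    x = 42;
}
```

Calling set_by_value(a) leaves a unchanged, while set_by_reference(a) changes it, even though the two calls look identical at the call site.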
In C, there are no references
There are no reference variables. But you can refer to objects using pointers. Therefore pointers are "references" from an abstract point of view.
But then why not call it pass by pointer?
You can call it pass by pointer. Reference is a more general term than pointer. It is often preferable to use the more general term when you want to discuss abstractions and want to ignore implementation details. You would call it pass by reference for the same reason that you call a variable "integer" rather than "int32_t".
In C++, we have both pointers and references (and stuff like iterators that lies close). So what does pass by reference mean here?
Depends on context. Often it means that the function argument is a reference variable, but it may also refer to a pointer, iterator, a reference wrapper... anything that refers to an object.
Reference is an abstract concept that exists beyond c and c++; even beyond programming. In c++, the term is ambiguous with reference variables and the context and convention (which isn't universal) determines the meaning.
In C, there are no reference variables, but you can pass by reference using pointers.
In wikipedia, there is this definition.
In call-by-reference evaluation (also referred to as pass-by-reference), a function receives an implicit reference to a variable used as argument, rather than a copy of its value. So this term is for type of parameter passing as mentioned by Thomas. So yes, since C is older than C++, also this idea is older than C++.
However, in C++ both pointers and references can be used for passing to a function (call by address and call by reference). They work in much the same way, with only a few differences:
Once a reference is created, it cannot be later made to reference another object; it cannot be reseated. This is often done with pointers.
References cannot be NULL. Pointers are often made NULL to indicate that they are not pointing to any valid thing.
A reference must be initialized when declared. There is no such restriction with pointers.
With these differences, if you use call by reference instead of call by pointer, you can reduce the possibility of NULL pointer error kind of problems.
Let's clear your confusion.
In C, there are no references. So I guess pass by reference means passing a pointer. But then why not call it pass by pointer?
Because every argument passing in C is pass-by-value. Even a pointer argument is a copy. But it contains (or points to, if you prefer) the same value -- memory address. That is how you can change the variable it points to, but not the pointer itself. Since it's a copy, whatever you do will not affect the pointer on the caller level.
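The point that the pointer itself is a copy can be sketched like this. The function name retarget_and_write is hypothetical, and the code compiles as C as well as C++.

```cpp
// The parameter p is a copy of the caller's pointer.
void retarget_and_write(int* p, int* other) {
    *p = 1;      // writes through the copy: the caller's int changes
    p = other;   // changes only the local copy of the pointer
    *p = 2;      // writes to the int that 'other' points to
}
```

After the call, the caller's own pointer variable still holds the address it held before; only the pointed-to values have changed.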
In C++, we have both pointers and references (and stuff like iterators that lies close). So what does pass by reference mean here?
It means, that the argument is an alias of a variable on the caller level, not a copy, which allows us to change it.
Hope that helped.
A reference in general is an instance that is referencing something else. Thus in a wider sense, also a pointer can be considered as one possible implementation of a reference. References in C++ are just called references, because apart from referencing something they offer no other features.
Pass-by-reference is used in general to distinguish from pass-by-value. Whether it is via pointer or via a reference is often just a minor detail. However, with C++ references it is imho more clear what is the purpose of the function parameter. Eg:
int foo(int& a); // pass-by-reference
int foo(const int& a); // is practically pass-by-value
// (+ avoiding the copy of the parameter)
On the other hand, with references (as compared to pointers) it is not so obvious at the call site whether it is pass-by-value or pass-by-reference. E.g.
int x;
int y = foo(x); // could be pass-by-value or pass-by-reference
int z = foo(&x); // obviously pass-by-reference (as a pointer)
Imagine you have to paint your house...
by value: you bring a copy of your house to the painter => much effort (maybe on rails)
by reference: you give your house address to the painter so he can come and paint it
"Pass by reference" (or "call by reference") is a term for a type of parameter passing when calling a function, and the idea is older than C++. It does not necessarily have to be done using C++ "references". C doesn't have a built-in mechanism to do this, so you have to use pointers.
Just to add to the answers, referencing does not mean reference by address. The compiler may use any method to reference to a variable.
When you pass something by reference, you're working with the address and not the value of a variable directly. If you use a reference parameter, you're getting the address of the variable you pass in.
From there you can manipulate it however you want, as the variable you passed in WILL change if you change the reference in the function. It's an easier way to work with large amounts of data, and it saves memory, among other things.
In C there are two concepts
1. Call by value - copies of the values are passed, so the actual values will not change outside the function.
2. Call by reference - the addresses of the actual operands are passed, so changes made by the function are visible outside it.
Whereas in C++ there are two concepts
1. Pass by value - the same as in C: the actual values will not change, and the copies exist only within the function.
2. Pass by reference - the addresses of the actual operands are passed, so if the values are changed in the function, the change is visible to the rest of the program.
In pass by reference, the addresses of the operands are passed; that is why it is called pass by reference and not pass by pointer.

Function-pointer syntax ambiguity

Take the following example. I create a function pointer named s, set it to f and call it. This compiles fine of course:
void f() {}
int main() {
void (*s)();
s = f;
s();
}
But take this next example, where I declare s now as a "function reference" (if it's so called) and set to f inline. This compiles fine as well:
void f() {}
int main() {
void (&s)() = f;
s();
}
What are the differences between these two ways to create and initialize a function-pointer? Note that when I use the reference syntax I am required to initialize it "in-line" to f whereas with the "pointer" syntax I had the ability to do it both ways. Can you explain that as well? And with that, can you explain what their differences are in terms of usability, and when must I use one form over the other?
Fundamentally the calling side has no distinct difference. But the decl side definitely does. As you have pointed out, references must be initialized to reference something. This makes them "safer", but even then there is no guarantee of "safety".
The function pointer need NOT point to a function at all. It could be NULL, or even uninitialized (pointing to garbage). Not that it matters, because you can always change it later (something you can NOT do with references).
void (*s)(); // uninitialized
or
void (*s)() = NULL; // set to null
and later
void foo()
{
}
s = foo;
You can do none of those with a reference. The reference must be initialized to something, and preferably something valid:
void (&f)() = foo; // ok. also references foo().
void (&f)() = *s; // also ok, but wait, is s valid?? does it have to be??
However, even here a function reference is not guaranteed to be safe, just safer. You can certainly do this:
void (*s)();
void (&f)() = *s;
You may get a compiler warning out of this (I do, "s used before being initialized"), but in the end f is still a reference to a "function" that isn't a function at all; just the random stack garbage in the s pointer. Worse, since references cannot be reassigned, this thing will always point at garbage.
The differences are the same as for any pointer/reference.
References must be initialized and cannot be reassigned later:
int i,j;
int &r = i;
r = j; // i = j, not &r == &j
References cannot be treated as objects distinct from the object they reference (as opposed to pointers, which are objects distinct from the object they point at)...
int i;
int *p = &i; // &p != &i
int &r = i; // &r == &i
Using a function pointer looks syntactically the same as using a reference, but that's because of a special rule with function pointers that allows you to use them without dereferencing them.
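A small sketch of that special rule, with hypothetical function names: a function pointer can be called with or without an explicit dereference, and a function reference is called like the function itself.

```cpp
int answer() { return 42; }

int call_through_pointer() {
    int (*p)() = answer;   // the function name converts to a pointer
    return p() + (*p)();   // both call forms are accepted: 42 + 42
}

int call_through_reference() {
    int (&r)() = answer;   // must be bound at the point of declaration
    return r();            // called exactly like the function itself
}
```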
You said it yourself, a difference is that with a reference, you have to bind it upon declaration, which guarantees that references always refer to valid objects.
Another difference is that references cannot be rebound after they are declared, so they refer to one and only one object throughout their lives.
Other than that they are the same thing.
I have met some purists that prefer references and said that pointers are a vestige of C that shouldn't be used.
Other people prefer pointers because they are more "explicit" about the fact that they are pointers.
Whether using one or the other depends on your needs. The way to choose is, use a reference if possible, but if you really need to be able to point it to a different function, then use a pointer.
A reference to a type P is a lot like a const pointer to a type P (not a pointer to a const P, which is different).
As it happens most of the ways they differ are not important if your type P is a function type. & behaves slightly differently, you can directly assign the pointer to a non const pointer, and functions that take one may not take the other.
If the type P was not a function type there would be loads of other differences -- operator=, lifetime of temporaries, etc.
In short, the answer is 'not much'.
Function identifiers are expressions of function type, but they implicitly convert to pointer-to-function type or reference-to-function type. So they can be passed to constructor of either reference or pointer and to operator= of pointer.
Since references syntactically behave like instances, there is no way to act on the reference rather than the referred-to object. That's why they can only be initialized. By the way, the preferred syntax in C++ is with parentheses, not =.
You should use references when possible and pointers only if you can't use a reference. The reason is that since many things can't be done to a reference (pointing to NULL, changing the referred-to object, deleting it, etc.), there are fewer things you have to look out for when reading the code. Plus it saves some * and & characters.

When to use const and const reference in function args?

When writing a C++ function that takes arguments, my understanding is that const should always be used if you can guarantee that the object will not be changed, or a const pointer if the pointer won't be changed.
When else is this practice advised?
When would you use a const reference and what are the advantages over just passing it through a pointer for example?
What about this void MyObject::Somefunc(const std::string& mystring) What would be the point in having a const string if a string is in fact already an immutable object?
Asking whether to add const is the wrong question, unfortunately.
Compare non-const ref to passing a non-const pointer
void modifies(T &param);
void modifies(T *param);
This case is mostly about style: do you want the call to look like call(obj) or call(&obj)? However, there are two points where the difference matters. If you want to be able to pass null, you must use a pointer. And if you're overloading operators, you cannot use a pointer instead.
Compare const ref to by value
void doesnt_modify(T const &param);
void doesnt_modify(T param);
This is the interesting case. The rule of thumb is "cheap to copy" types are passed by value — these are generally small types (but not always) — while others are passed by const ref. However, if you need to make a copy within your function regardless, you should pass by value. (Yes, this exposes a bit of implementation detail. C'est le C++.)
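The rule of thumb above might look like this in practice. This is a sketch assuming C++11 (for the lambda), and the function names are made up for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Only reads the argument: reference to const avoids copying the vector.
std::size_t count_positive(const std::vector<int>& v) {
    return static_cast<std::size_t>(
        std::count_if(v.begin(), v.end(), [](int x) { return x > 0; }));
}

// Needs its own modifiable copy anyway: take the parameter by value and
// let the caller's argument initialize it (a copy for lvalues, a move
// for temporaries).
std::vector<int> sorted_copy(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return v;
}
```

The caller's vector is left untouched in both cases; the by-value version simply makes the copy visible in the signature.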
Compare const pointer to non-modifying plus overload
void optional(T const *param=0);
// vs
void optional();
void optional(T const &param); // or optional(T param)
This is related to the non-modifying case above, except passing the parameter is optional. There's the least difference here between all three situations, so choose whichever makes your life easiest. Of course, the default value for the non-const pointer is up to you.
Const by value is an implementation detail
void f(T);
void f(T const);
These declarations are actually the exact same function! When passing by value, const is purely an implementation detail. Try it out:
void f(int);
void f(int const) {/*implements above function, not an overload*/}
typedef void C(int const);
typedef void NC(int);
NC *nc = &f; // nc is a function pointer
C *c = nc; // C and NC are identical types
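The same claim can also be checked at compile time. This is a sketch assuming C++11 (static_assert and <type_traits>), not part of the original answer.

```cpp
#include <type_traits>

// Top-level const on a by-value parameter is dropped from the function type.
static_assert(std::is_same<void(int), void(int const)>::value,
              "void(int) and void(int const) are the same type");

// By contrast, const on the referred-to type is part of the function type.
static_assert(!std::is_same<void(int&), void(int const&)>::value,
              "void(int&) and void(int const&) are different types");
```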
The general rule is, use const whenever possible, and only omit it if necessary. const may enable the compiler to optimize and helps your peers understand how your code is intended to be used (and the compiler will catch possible misuse).
As for your example, strings are not immutable in C++. If you hand a non-const reference to a string to a function, the function may modify it. C++ does not have the concept of immutability built into the language, you can only emulate it using encapsulation and const (which will never be bullet-proof though).
After thinking about @Eamon's comment and reading some stuff, I agree that optimization is not the main reason for using const. The main reason is to have correct code.
The questions are based on some incorrect assumptions, so not really meaningful.
std::string does not model immutable string values. It models mutable values.
There is no such thing as a "const reference". There are references to const objects. The distinction is subtle but important.
Top-level const for a function argument is only meaningful for a function implementation, not for a pure declaration (where it's disregarded by the compiler). It doesn't tell the caller anything. It's only a restriction on the implementation. E.g. int const is pretty much meaningless as argument type in a pure declaration of a function. However, the const in std::string const& is not top level.
Passing by reference to const avoids inefficient copying of data. In general, for an argument passing data into a function, you pass small items (such as an int) by value, and potentially larger items by reference to const. In the machine code the reference to const may be optimized away or it may be implemented as a pointer. E.g., in 32-bit Windows an int is 4 bytes and a pointer is 4 bytes. So argument type int const& would not reduce data copying but could, with a simple-minded compiler, introduce an extra indirection, which means a slight inefficiency -- hence the small/large distinction.
Cheers & hth.,
The main advantage of a const reference over a const pointer is the following: it's clear that the parameter is required and cannot be NULL.
Vice versa, if I see a const pointer, I immediately assume the reason for it not being a reference is that the parameter could be NULL.

Working around the C++ limitation on non-const references to temporaries

I've got a C++ data-structure that is a required "scratchpad" for other computations. It's not long-lived, and it's not frequently used so not performance critical. However, it includes a random number generator amongst other updatable tracking fields, and while the actual value of the generator isn't important, it is important that the value is updated rather than copied and reused. This means that in general, objects of this class are passed by reference.
If an instance is only needed once, the most natural approach is to construct it wherever needed (perhaps using a factory method or a constructor), and then pass the scratchpad to the consuming method. Consumers' method signatures use pass by reference since they don't know this is the only use, but factory methods and constructors return by value - and you can't pass unnamed temporaries by reference.
Is there a way to avoid clogging the code with nasty temporary variables? I'd like to avoid things like the following:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
xyz.initialize_computation(useless_temp);
I could make the scratchpad intrinsically mutable and just label all parameters const &, but that doesn't strike me as best-practice since it's misleading, and I can't do this for classes I don't fully control. Passing by rvalue reference would require adding overloads to all consumers of scratchpad, which kind of defeats the purpose - having clear and concise code.
Given the fact that performance is not critical (but code size and readability are), what's the best-practice approach to passing in such a scratchpad? Using C++0x features is OK if required but preferably C++03-only features should suffice.
Edit: To be clear, using a temporary is doable, it's just unfortunate clutter in code I'd like to avoid. If you never give the temporary a name, it's clearly only used once, and the fewer lines of code to read, the better. Also, in constructors' initializers, it's impossible to declare temporaries.
While it is not okay to pass rvalues to functions accepting non-const references, it is okay to call member functions on rvalues, but the member function does not know how it was called. If you return a reference to the current object, you can convert rvalues to lvalues:
class scratchpad_t
{
// ...
public:
scratchpad_t& self()
{
return *this;
}
};
void foo(scratchpad_t& r)
{
}
int main()
{
foo(scratchpad_t().self());
}
Note how the call to self() yields an lvalue expression even though scratchpad_t is an rvalue.
Please correct me if I'm wrong, but Rvalue reference parameters don't accept lvalue references so using them would require adding overloads to all consumers of scratchpad, which is also unfortunate.
Well, you could use templates...
template <typename Scratch> void foo(Scratch&& scratchpad)
{
// ...
}
If you call foo with an rvalue parameter, Scratch will be deduced to scratchpad_t, and thus Scratch&& will be scratchpad_t&&.
And if you call foo with an lvalue parameter, Scratch will be deduced to scratchpad_t&, and because of reference collapsing rules, Scratch&& will also be scratchpad_t&.
Note that the formal parameter scratchpad is a name and thus an lvalue, no matter if its type is an lvalue reference or an rvalue reference. If you want to pass scratchpad on to other functions, you don't need the template trick for those functions anymore, just use an lvalue reference parameter.
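The deduction rules described above can be observed directly. This is a sketch that assumes C++11 and a stand-in scratchpad_t with a single counter; the real class from the question is not shown in the thread.

```cpp
#include <type_traits>

// Stand-in for the scratchpad type from the question (assumption).
struct scratchpad_t {
    int uses = 0;
};

// Returns true when Scratch was deduced as an lvalue reference type,
// i.e. when the function was called with an lvalue argument.
template <typename Scratch>
bool deduced_as_lvalue(Scratch&& scratchpad) {
    ++scratchpad.uses;  // the named parameter is an lvalue either way
    return std::is_lvalue_reference<Scratch>::value;
}
```

Calling deduced_as_lvalue with a named scratchpad_t yields true, and calling it with scratchpad_t() yields false, which matches the two deduction cases in the answer.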
By the way, you do realize that the temporary scratchpad involved in xyz.initialize_computation(scratchpad_t(1, 2, 3)); will be destroyed as soon as initialize_computation is done, right? Storing the reference inside the xyz object for later use would be an extremely bad idea.
self() doesn't need to be a member method, it can be a templated function
Yes, that is also possible, although I would rename it to make the intention clearer:
template <typename T>
T& as_lvalue(T&& x)
{
return x;
}
Is the problem just that this:
scratchpad_t<typeX<typeY,potentially::messy>, typename T> useless_temp = factory(rng_parm);
is ugly? If so, then why not change it to this?:
auto useless_temp = factory(rng_parm);
Personally, I would rather see const_cast than mutable. When I see mutable, I'm assuming someone's doing logical const-ness, and don't think much of it. const_cast however raises red flags, as code like this should.
One option would be to use something like shared_ptr (auto_ptr would work too depending on what factory is doing) and pass it by value, which avoids the copy cost and maintains only a single instance, yet can be passed in from your factory method.
If you allocate the object in the heap you might be able to convert the code to something like:
std::auto_ptr<scratch_t> create_scratch();
foo( *create_scratch() );
The factory creates and returns an auto_ptr instead of an object in the stack. The returned auto_ptr temporary will take ownership of the object, but you are allowed to call non-const methods on a temporary and you can dereference the pointer to get a real reference. At the next sequence point the smart pointer will be destroyed and the memory freed. If you need to pass the same scratch_t to different functions in a row you can just capture the smart pointer:
std::auto_ptr<scratch_t> s( create_scratch() );
foo( *s );
bar( *s );
This can be replaced with std::unique_ptr in the upcoming standard.
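A sketch of the same idea with std::unique_ptr, assuming C++11. Here scratch_t is a stand-in with a single counter, and foo returns the counter only so that the effect is observable; neither detail comes from the answer.

```cpp
#include <memory>

// Stand-in for the scratchpad type (assumption).
struct scratch_t {
    int draws = 0;
};

// The factory returns ownership of a heap-allocated scratchpad.
std::unique_ptr<scratch_t> create_scratch() {
    return std::unique_ptr<scratch_t>(new scratch_t());
}

// The consumer takes a non-const reference, as in the question.
int foo(scratch_t& s) {
    return ++s.draws;
}
```

With these definitions, foo(*create_scratch()) compiles: the temporary smart pointer lives until the end of the full expression, so the reference stays valid for the duration of the call.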
I marked FredOverflow's response as the answer for his suggestion to use a method to simply return a non-const reference; this works in C++03. That solution requires a member method per scratchpad-like type, but in C++0x we can also write that method more generally for any type:
template <typename T> T & temp(T && temporary_value) {return temporary_value;}
This function simply forwards normal lvalue references, and converts rvalue references into lvalue references. Of course, doing this returns a modifiable value whose result is ignored - which happens to be exactly what I want, but may seem odd in some contexts.
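A usage sketch for temp, assuming C++11. The scratchpad_t stand-in and the consume function are hypothetical and exist only to make the effect visible.

```cpp
// The helper from above, repeated so the sketch is self-contained.
template <typename T>
T& temp(T&& temporary_value) {
    return temporary_value;
}

// Stand-in scratchpad with one updatable field (assumption).
struct scratchpad_t {
    int updates = 0;
};

// A consumer that requires a non-const lvalue reference.
int consume(scratchpad_t& pad) {
    return ++pad.updates;
}
```

Both consume(temp(scratchpad_t())) and consume(temp(named_pad)) compile. In the first call the temporary is destroyed at the end of the full expression, so the returned reference must not be stored.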