I am a C guy and I'm trying to understand some C++ code. I have the following function declaration:
int foo(const string &myname) {
cout << "called foo for: " << myname << endl;
return 0;
}
How does the function signature differ from the equivalent C:
int foo(const char *myname)
Is there a difference between using string *myname vs string &myname? What is the difference between & in C++ and * in C to indicate pointers?
Similarly:
const string &GetMethodName() { ... }
What is the & doing here? Is there some website that explains how & is used differently in C vs C++?
The "&" denotes a reference instead of a pointer to an object (In your case a constant reference).
The advantage of having a function such as
foo(string const& myname)
over
foo(string const* myname)
is that in the former case you are guaranteed that myname is non-null, since C++ does not allow NULL references. Since you are passing by reference, the object is not copied, just like if you were passing a pointer.
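To make the null guarantee concrete, here is a minimal sketch. The names length_by_ref and length_by_ptr are hypothetical, chosen only for illustration. The pointer version has to decide what a null argument means, while the reference version can use its argument directly.

```cpp
#include <cstddef>
#include <string>

// Reference parameter: the caller must supply a real string object,
// so there is nothing to check before using it.
std::size_t length_by_ref(const std::string& name)
{
    return name.size();
}

// Pointer parameter: a null pointer is a legal argument, so the
// function has to handle it (here by treating it as length 0).
std::size_t length_by_ptr(const std::string* name)
{
    return name ? name->size() : 0;
}
```

A call looks like length_by_ref(s) versus length_by_ptr(&s); only the second can legally be handed nullptr.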
Your second example:
const string &GetMethodName() { ... }
Would allow you to return a constant reference to, for example, a member variable. This is useful if you do not wish a copy to be returned, and again be guaranteed that the value returned is non-null. As an example, the following allows you direct, read-only access:
class A
{
public:
int bar() const {return someValue;}
//Big, expensive to copy class
private:
int someValue = 0;
};
class B
{
public:
A const& getA() const { return mA;}
private:
A mA;
};
void someFunction()
{
B b = B();
//Access A, ability to call const functions on A
//No need to check for null, since reference is guaranteed to be valid.
int value = b.getA().bar();
}
You have to of course be careful to not return invalid references.
Compilers will happily compile the following (depending on your warning level and how you treat warnings)
int const& foo()
{
int a;
//This is very bad: it returns a reference to a local variable that is
//destroyed when the function returns. Using the result is undefined behaviour.
return a;
}
Basically, it is your responsibility to ensure that whatever you are returning a reference to is actually valid.
Here, & is not used as an operator. As part of function or variable declarations, & denotes a reference. The C++ FAQ Lite has a pretty nifty chapter on references.
string * and string& differ in a couple of ways. First of all, a pointer is a variable that holds the address of the data, and that address may be null. A reference is another name for the data itself. If you had the following function:
int foo(string *param1);
You would have to check in the function body to make sure that param1 pointed to a valid location. Comparatively:
int foo(string &param1);
Here, it is the caller's responsibility to make sure the referenced data is valid. You can't pass a "NULL" value, for example, in the second function above.
With regards to your second question, about the method return values being a reference, consider the following three functions:
string &foo();
string *foo();
string foo();
In the first case, you would be returning a reference to the data. If your function definition looked like this:
string &foo()
{
string localString = "Hello!";
return localString;
}
You would probably get a compiler warning, since you are returning a reference to a string that lives on the stack of that function. Once the function returns, that location is no longer valid, and using the reference is undefined behaviour. Typically, you would want to return a reference to a class member or something like that.
The second function above returns a pointer, so the same lifetime rule applies: whatever it points to must outlive the call. You would also have to check for NULL pointers.
Finally, in the third case, the data returned would be copied into the return value for the caller. So if your function was like this:
string foo()
{
string localString = "Hello!";
return localString;
}
You'd be okay, since the string "Hello!" would be copied (or moved) into the return value for that function, accessible in the caller's memory space.
Your function declares a constant reference to a string:
int foo(const string &myname) {
cout << "called foo for: " << myname << endl;
return 0;
}
A reference has some special properties, which make it a safer alternative to pointers in many ways:
it can never be NULL
it must always be initialised
it cannot be changed to refer to a different variable once set
it can be used in exactly the same way as the variable to which it refers (which means you do not need to dereference it like a pointer)
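The properties above can be sketched as follows. The function names are hypothetical and exist only for this illustration. The key point is that assigning to a reference writes to the object it was bound to and never makes it refer to something else.

```cpp
// Returns the final value of a, showing that assignment through a
// reference modifies the referenced object instead of rebinding.
int assign_through_reference()
{
    int a = 1;
    int b = 2;
    int& r = a;  // must be initialised here; r is now another name for a
    r = b;       // copies the value of b into a; r still refers to a
    b = 10;      // has no effect on a or r
    return a;
}

// Taking the address of a reference yields the address of the
// object it refers to.
bool reference_has_same_address()
{
    int a = 1;
    int& r = a;
    return &r == &a;
}
```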
How does the function signature differ from the equivalent C:
int foo(const char *myname)
There are several differences. The first takes a string object, which knows its own length and can be used directly. The second takes the address of a character array, which must be dereferenced to reach the data.
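As a rough side-by-side sketch of the two signatures (length_cpp and length_c are hypothetical names, not from the question):

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// C++ style: the parameter is a read-only alias for a string object.
std::size_t length_cpp(const std::string& name)
{
    return name.size();
}

// C style: the parameter is the address of the first character of a
// null-terminated array, so the length has to be found by scanning.
std::size_t length_c(const char* name)
{
    return std::strlen(name);
}
```

A string literal can be passed to either function. For length_cpp("abc") a temporary std::string is built from the literal and bound to the const reference. Going the other way, an existing std::string s can be handed to the C-style function as length_c(s.c_str()).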
Is there a difference between using string *myname vs string &myname?
The main difference when dealing with parameters is that you do not need to dereference myname. A simpler example is:
int add_ptr(int *x, int* y)
{
return *x + *y;
}
int add_ref(int &x, int &y)
{
return x + y;
}
which do exactly the same thing. The only difference in this case is that you do not need to dereference x and y as they refer directly to the variables passed in.
const string &GetMethodName() { ... }
What is the & doing here? Is there some website that explains how & is used differently in C vs C++?
This returns a constant reference to a string. So the caller gets to access the returned variable directly, but only in a read-only sense. This is sometimes used to return string data members without allocating extra memory.
There are some subtleties with references - have a look at the C++ FAQ on References for some more details.
#include<iostream>
using namespace std;
int add(int &number);
int main ()
{
int number;
int result;
number=5;
cout << "The value of the variable number before calling the function : " << number << endl;
result=add(&number);
cout << "The value of the variable number after the function is returned : " << number << endl;
cout << "The value of result : " << result << endl;
return(0);
}
int add(int &p)
{
*p=*p+100;
return(*p);
}
This is invalid code on several counts. Running it through g++ gives:
crap.cpp: In function ‘int main()’:
crap.cpp:11: error: invalid initialization of non-const reference of type ‘int&’ from a temporary of type ‘int*’
crap.cpp:3: error: in passing argument 1 of ‘int add(int&)’
crap.cpp: In function ‘int add(int&)’:
crap.cpp:19: error: invalid type argument of ‘unary *’
crap.cpp:19: error: invalid type argument of ‘unary *’
crap.cpp:20: error: invalid type argument of ‘unary *’
A valid version of the code reads:
#include<iostream>
using namespace std;
int add(int &number);
int main ()
{
int number;
int result;
number=5;
cout << "The value of the variable number before calling the function : " << number << endl;
result=add(number);
cout << "The value of the variable number after the function is returned : " << number << endl;
cout << "The value of result : " << result << endl;
return(0);
}
int add(int &p)
{
p=p+100;
return p;
}
What is happening here is that you are passing a variable "as is" to your function. This is roughly equivalent to:
int add(int *p)
{
*p=*p+100;
return *p;
}
However, passing a reference to a function ensures that you cannot do things like pointer arithmetic with the reference. For example:
int add(int &p)
{
*p=*p+100;
return p;
}
is invalid.
If you need a pointer to the referenced object, you have to take its address explicitly:
int add(int &p)
{
int* i = &p;
i=i+100L;
return *i;
}
Which on a test run gives (as expected) junk output, because the pointer no longer points at number and reading through it is undefined behaviour:
The value of the variable number before calling the function : 5
The value of the variable number after the function is returned : 5
The value of result : 1399090792
One way to look at the & (reference) declarator in C++ is that it is merely syntactic sugar for a pointer. For example, the following are roughly equivalent:
void foo(int &x)
{
x = x + 1;
}
void foo(int *x)
{
*x = *x + 1;
}
This is more useful when you're dealing with a class, so that your method calls turn from x->bar() into x.bar().
The reason I said roughly is that using references imposes additional compile-time restrictions on what you can do with the reference, in order to protect you from some of the problems caused when dealing with pointers. For instance, you can't accidentally change the pointer, or use the pointer in any way other than to reference the singular object you've been passed.
In this context, & makes the function take its string argument by reference.
The difference between references and pointers is:
When you take a reference to a variable, that reference is the variable you referenced. You don't need to dereference it or anything; working with the reference is semantically equal to working with the referenced variable itself.
NULL is not a valid value to a reference and will result in a compiler error. So generally, if you want to use an output parameter (or a pointer/reference in general) in a C++ function, and passing a null value to that parameter should be allowed, then use a pointer (or smart pointer, preferably). If passing a null value makes no sense for that function, use a reference.
You cannot 're-seat' a reference. While the value of a pointer can be changed to point at something else, a reference has no similar functionality. Once you take a variable by reference, you are effectively dealing with that variable directly. Just as you can't change the value of a by writing b = 4, you can't change what a reference refers to by assigning to it. A reference's value is the value of whatever it refers to.
Related
I've looked for an answer to this one, but I can't seem to find anything, so I'm asking here:
Do reference parameters decay into pointers where it is logically necessary?
Let me explain what I mean:
If I declare a function with a reference to an int as a parameter:
void sum(int& a, const int& b) { a += b; }
(assuming that this won't be inlined)
The logical assumption would be that calling this function can be optimized by not passing any parameters, but by letting the function access the variables that are already on the stack. Changing these directly prevents the need for passing pointers.
Problem with this is that (again, assuming this doesn't get inlined), if the function is called from a ton of different places, the relevant values for each call are potentially in different places in the stack, which means the call can't be optimized.
Does that mean that, in those cases (which could potentially make up the majority of one's cases if the function is called from a ton of different places in the code), the reference decays into a pointer, which gets passed to the function and used to influence the variables in the outer scope?
Bonus question: If this is true, does that mean I should consider caching referenced parameters inside of function bodies, so that I avoid the hidden dereferences that come with passing these references? I would then conservatively access the actual reference parameters, only when I need to actually write something to them. Is this approach warranted or is it best to trust the compiler to cache the values for me if it deems the cost of dereferencing higher than the cost of copying them one time?
Code for bonus question:
void sum(int& a, const int& b) {
int aCached = a;
// Do processing that requires reading from a with aCached.
// Do processing that requires writing to a with the a reference.
a += b;
}
Bonus bonus question: Is it safe to assume (assuming everything above is true), that, when "const int& b" is passed, the compiler will be smart enough to pass b by value when passing by pointer isn't efficient enough? My reasoning behind this is that values are ok for "const int& b" because you never try to write to it, only read.
The compiler can decide to implement references as pointers, or inlining or any other method it chooses to use. In terms of performance, it's irrelevant. The compiler can and will do whatever it wants to when it comes to optimization. The compiler can implement your reference as a pass-by-value if it wants to (and if it's valid to do so in the specific situation).
Caching the result won't help because the compiler will do that anyways.
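One case where the compiler cannot simply treat a const reference as a value is aliasing, that is, when both parameters refer to the same variable. The following sketch uses hypothetical functions add_twice_ref and add_twice_val (not from the question) to show that the two signatures can give different results, so any such optimisation is only valid when the compiler can prove it makes no difference.

```cpp
// b is a reference: if a and b name the same variable, the second
// addition sees the value already updated by the first.
void add_twice_ref(int& a, const int& b)
{
    a += b;
    a += b;
}

// b is a copy taken at the call: later changes to a do not affect it.
void add_twice_val(int& a, int b)
{
    a += b;
    a += b;
}
```

Calling add_twice_ref(x, x) with x equal to 1 leaves x at 4, whereas add_twice_val(x, x) leaves it at 3.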
If you want to explicitly tell the compiler that the value might change (because of another thread that has access to the same pointer), you need to use std::atomic or guard the accesses with a std::mutex; volatile is not a synchronisation tool.
Edit: The keyword "volatile" is never required for multithreading. std::mutex is enough.
If you don't use the keyword volatile, the compiler will almost certainly cache the result for you (if appropriate).
There are, however, at least 2 actual differences in the rules between pointers and references.
You cannot take the address of a temporary value (a prvalue such as a literal); such a program is ill-formed and will not compile.
References cannot be reseated or copied as references, so they sometimes need to be wrapped in std::ref.
Here I'll provide examples for both differences.
This code using references is valid:
static int do_stuff(const int& i)
{
return i;
}
int main()
{
do_stuff(5);
return 0;
}
But this code is ill-formed and will not compile:
static int do_stuff(const int* i)
{
return *i;
}
int main()
{
do_stuff(&5);
return 0;
}
That's because the built-in & operator requires an lvalue, and a literal such as 5 is not one. The value is not guaranteed to have an address. Note that taking the address like this is valid:
static int do_stuff(const int& i)
{
const int *ptr = &i;
return *ptr;
}
int main()
{
do_stuff(5);
return 0;
}
Because inside of the function do_stuff, the variable has a name and is therefore an lvalue. That means that by the time it's inside of do_stuff it's guaranteed to have an address.
So that's one difference between a pointer and a reference in C++.
There is another difference, and that is the constness / immutability.
One important thing to know about in C++ is the helper function std::ref.
Consider the following code:
#include <functional>
#include <thread>
#include <future>
#include <chrono>
#include <iostream>
struct important_t
{
int val = 0;
};
static void work(const volatile important_t& arg)
{
std::cout << "Doing work..." << std::endl;
std::this_thread::sleep_for(std::chrono::seconds(3));
}
int main()
{
important_t my_object;
{
std::cout << "Starting thread" << std::endl;
std::future<void> t = std::async(std::launch::async, work, std::ref(my_object));
std::cout << "Waiting for thread to finish" << std::endl;
}
return 0;
}
The above code will compile just fine, and is perfectly valid C++ code.
But if you wrote it like this:
std::future<void> t = std::async(std::launch::async, work, my_object);
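It wouldn't compile. That's because of the missing std::ref.
The reason that the code doesn't compile without std::ref is that the function std::async (and also std::thread) stores its own copy of every argument and then passes those copies to the function as rvalues, and an rvalue cannot bind to a const volatile important_t& parameter. std::ref wraps my_object in a copyable std::reference_wrapper that converts back to a reference when work is called.
That demonstrates a fundamental difference between references and other built-in types in C++. A reference is not an object in its own right, so it cannot be copied or rebound; copying "a reference" copies the object it refers to.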
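The copy-versus-reference behaviour is easier to observe without threads. The sketch below swaps in std::bind, which also stores copies of its arguments, in place of std::async; the function names are hypothetical. It is only meant to show what std::ref changes, not how std::async is implemented.

```cpp
#include <functional>

// Increments whatever int it is given by reference.
void increment(int& n)
{
    ++n;
}

// std::bind stores a copy of n, so the call increments the copy
// and the local n is left untouched.
int call_with_copy()
{
    int n = 0;
    auto bound = std::bind(increment, n);
    bound();
    return n;
}

// std::ref(n) stores a std::reference_wrapper<int> instead, so the
// call reaches the local n itself.
int call_with_reference()
{
    int n = 0;
    auto bound = std::bind(increment, std::ref(n));
    bound();
    return n;
}
```

Here call_with_copy() returns 0 and call_with_reference() returns 1.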
Consider the following code:
#include <iostream>
int main()
{
// Perfectly valid
// Prints 5
{
int val = 0;
int& val_ref = val;
val_ref = 5;
std::cout << val << std::endl;
}
// Compiler error:
// A reference must always be initialized.
// A reference will always point to the same value throughout its lifetime.
{
int val = 0;
int& val_ref;
val_ref = val;
val_ref = 5;
std::cout << val << std::endl;
}
// We will encounter a similar compiler error with a const pointer:
// A const value must always be initialized.
// A const pointer will always point to the same value throughout its lifetime.
{
int val = 0;
int *const val_ptr;
val_ptr = &val;
*val_ptr = 5;
std::cout << val << std::endl;
}
return 0;
}
That leads to the conclusion that a reference is not the same thing as a pointer in C++. It's almost the same thing as a const pointer.
Just a little bit of clarification:
A const pointer to a const int:
void do_stuff(const int *const val)
{
int i;
*val = 5; // Error
val = &i; // Error
}
A const pointer to an int:
void do_stuff(int *const val)
{
int i;
*val = 5; // Allowed. The int is not const.
val = &i; // Error
}
A pointer to a const int:
void do_stuff(const int* val)
{
int i;
*val = 5; // Error
val = &i; // Allowed
}
An int reference in C++ is the closest thing to a const pointer to an int. The int is editable, the pointer is not.
I am currently taking a data structures and algorithms class and my professor gave us material that included functions that take in pointer values and pointer/reference values.
These functions would look like this:
int function1(int a); // Pass by value
int function2(int &ref); // Pass by reference
int function3(int* ptr); // This will take in a pointer value
int function4(int*& ptr); // This will take in a pointer/reference value
I understand the difference between pass by value, and pass by reference. I also have tried implementing both of the latter two examples as basic functions, but I am not entirely sure how these two arguments differ from pass by reference or how they differ from each other.
Could somebody explain how these two functions parameters work and how they could be used practically?
[...] but I am not entirely sure how these two arguments differ from
pass by reference or how they differ from each other.
In the first function
int function3(int* ptr);
// ^^^^
you pass the pointer to an int by value. Meaning int* by value.
In second one,
int function4(int*& ptr);
// ^^ --> here
you pass the pointer to the int by reference. Meaning you are passing the reference to the int* type.
But how does passing the pointer by value and by reference differ in
usage from passing a regular variable type such as an integer by value
or reference?
Same. When you pass the pointer by value, the changes that you make to the passed pointer (e.g. assigning another pointer to it) are only visible within the function scope. On the other hand, when the pointer is passed by reference, the function changes the pointer in main() directly. For example, see the following demonstration.
#include <iostream>
// some integer
int valueGlobal{ 200 };
void ptrByValue(int* ptrInt)
{
std::cout << "ptrByValue()\n";
ptrInt = &valueGlobal;
std::cout << "Inside function pointee: " << *ptrInt << "\n";
}
void ptrByRef(int *& ptrInt)
{
std::cout << "ptrByRef()\n";
ptrInt = &valueGlobal;
std::cout << "Inside function pointee: " << *ptrInt << "\n";
}
int main()
{
{
std::cout << "Pointer pass by value\n";
int value{ 1 };
int* ptrInt{ &value };
std::cout << "In main() pointee before function call: " << *ptrInt << "\n";
ptrByValue(ptrInt);
std::cout << "In main() pointee after function call: " << *ptrInt << "\n\n";
}
{
std::cout << "Pointer pass by reference\n";
int value{ 1 };
int* ptrInt{ &value };
std::cout << "In main() pointee before function call: " << *ptrInt << "\n";
ptrByRef(ptrInt);
std::cout << "In main() pointee after function call: " << *ptrInt << "\n\n";
}
}
Output:
Pointer pass by value
In main() pointee before function call: 1
ptrByValue()
Inside function pointee: 200
In main() pointee after function call: 1
Pointer pass by reference
In main() pointee before function call: 1
ptrByRef()
Inside function pointee: 200
In main() pointee after function call: 200
Passing a pointer by reference allows you to change the pointer itself and not only the pointed-to value. So if you assign to the pointer in the function int function4(int*& ptr) in this way:
ptr = nullptr;
the caller's pointer is set to "nullptr".
In the function int function3(int* ptr) the pointer passed is copied into the variable ptr, which is valid only in the scope of function3. In this case you can only change the value pointed to by the pointer and not the pointer itself (or, more precisely, you can change the pointer but only within the scope of the function). So the previous expression ptr = nullptr; has no effect in the caller.
If you want to modify an outside int value, use this:
void function1(int* ptr) {
*ptr = 100; // Now the outside value is 100
ptr = nullptr; // Useless. This does not affect the outside pointer
}
If you want to modify an outside int* pointer, i.e. redirect that pointer, use this:
void function2(int*& ptr) {
ptr = nullptr; // Now the outside pointer is nullptr
}
Or this:
void function3(int** ptr) {
*ptr = nullptr; // Same
}
The first and second are pass by value and pass by reference respectively, as commented in the code.
The third one takes an integer pointer as a parameter by value. The pointer can be assigned a new memory location, but that change is only visible locally inside the function scope.
The fourth one takes an integer pointer by reference, as indicated by the ampersand (&) after int*, so the function works on the caller's pointer itself.
They are function declarations, not definitions, so until they are given a function body you can't explore their practical uses (or the endless list of things you could do with an integer value).
If you want a case of pass by value vs pass by reference, the classic example is swapping two values within a function (without returning values): pass by reference works and pass by value fails. In the same way, your first and third functions would change their parameters only locally inside the function, whereas the second and fourth would change the caller's variables.
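The swap example mentioned above could be sketched like this. The three function names are hypothetical; the last one applies the same idea to pointers, matching the int*& form from the question.

```cpp
// Swaps only the local copies; the caller's variables are unchanged.
void swap_by_value(int a, int b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// Swaps the caller's variables, because a and b are aliases for them.
void swap_by_reference(int& a, int& b)
{
    int tmp = a;
    a = b;
    b = tmp;
}

// Swaps where the caller's two pointers point; the pointed-to ints
// themselves are not modified.
void swap_pointers(int*& a, int*& b)
{
    int* tmp = a;
    a = b;
    b = tmp;
}
```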
If I have a function that takes a reference to a map:
pair<int,int> myFunc(map<int,char> &myVar){...}
I can pass it a map without needing the '&'.
myFunc(someMapitoa);
Is there any difference? Is a copy made and then thrown away? Should I use the '&' anyway?
C++ is pass-by-value by default.
So, this makes a copy:
void foo (bar b);
This does not:
void foo (bar & b);
This makes a copy of a pointer, but not the actual data that it points to:
void foo (bar * b);
If you really want to get deeper into it then see this SO post about move semantics.
Anyway, the first two of the above examples are called the same way, while the pointer version needs & at the call site:
#include <iostream>
using namespace std;
int alpha (int arg) {
// we can do anything with arg and it won't impact my caller
// because arg is just a copy of what my caller passed me
arg = arg + 1;
return arg;
}
int bravo (int & arg) {
// if I do anything to arg it'll change the value that my caller passed in
arg = arg + 1;
return arg;
}
int charlie (int * arg) {
// when we deal with it like this it's pretty much the same thing
// as a reference even though it's not exactly the same thing
*arg = *arg + 1;
return *arg;
}
int main () {
int a = 0;
// 1
cout << alpha (a) << endl;
// 1
cout << bravo (a) << endl;
// 2
cout << charlie (&a) << endl;
return 0;
}
You should think of this in terms of what is being initialized from what.
When you call a function, each argument is used to initialize the corresponding parameter. If the parameter is declared with reference type, it's a reference. If the parameter is not declared with reference type, it's an object.
The initialization of a reference to class type T from an expression of type T never makes a copy.
The initialization of an object of class type T from an expression of type T either copies or moves.
The rules here are the same as the rules for initializing non-parameter variables, as in:
T t = ...
T& r = ...
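One way to check these rules is to count copy-constructor calls. The sketch below uses a hypothetical type Counted and hypothetical helper functions; it is an illustration under those assumptions and not part of the original answer.

```cpp
// A type that records how many times it has been copy-constructed.
struct Counted
{
    static int copies;
    Counted() = default;
    Counted(const Counted&) { ++copies; }
};
int Counted::copies = 0;

// The parameter is an object, so it is initialised by copying.
void by_value(Counted) {}

// The parameter is a reference, so it is bound to the argument.
void by_reference(const Counted&) {}

// Number of copies made by one call to by_value with an lvalue.
int copies_made_by_value()
{
    Counted c;
    int before = Counted::copies;
    by_value(c);
    return Counted::copies - before;
}

// Number of copies made by one call to by_reference.
int copies_made_by_reference()
{
    Counted c;
    int before = Counted::copies;
    by_reference(c);
    return Counted::copies - before;
}
```

Passing the named object c by value makes exactly one copy, and passing it by reference makes none, which is why a large map is normally taken by reference.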
The fact that a function may take a reference to an argument even when there is no explicit notation at the call site is viewed by some as confusing. This is why some style guides ban non-const reference parameters (such as the Google C++ style guide) and force you to declare the argument as a pointer so that & must be used at the call site. I don't advocate this coding style, but it is an option you might want to consider.
The following code compiles and runs but I'm not sure what exactly is going on at a lower level. Doesn't a reference just store the address of the object being referenced? If so, both test functions are receiving an address as a parameter? Or is the C++ implementation able to differentiate between these types in some other way?
#include <iostream>
using namespace std;
char test(int &i) {
return 'a';
}
char test(int *i) {
return 'b';
}
int main() {
int i = 1;
cout << test(i) << endl;
}
As int& and int* are distinct types and i can be treated as an int& but not as an int*, overload resolution is absolutely unambiguous here.
It doesn't matter at this point that references are just a somewhat cloaked kind of pointer. From a language point of view they are distinct types.
References in C++ are more akin to an alias than a pointer. A reference is not a separate variable in itself, but a new "name" for an existing variable. In your example the first test would get called because you are passing an integer to the function. A pointer is a separate variable that holds the address of another variable, so for the second function to be called you would have to call test with a pointer, like so: test(&i); While a tad confusing, the operator & gets the address of a variable, while a variable declared with an & like int &i declares a reference.
Your code only matches char test(int &i), since you are passing an int lvalue to the function, and an int cannot be converted to int*.
In a C++ function like this:
int& getNumber();
what does the & mean? Is it different from:
int getNumber();
It's different.
int g_test = 0;
int& getNumberReference()
{
return g_test;
}
int getNumberValue()
{
return g_test;
}
int main()
{
int& n = getNumberReference();
int m = getNumberValue();
n = 10;
cout << g_test << endl; // prints 10
g_test = 0;
m = 10;
cout << g_test << endl; // prints 0
return 0;
}
getNumberReference() returns a reference; under the hood it's like a pointer that points to an integer variable. Any change applied to the reference applies to the returned variable.
The call getNumberReference() is also an lvalue, therefore it can be used like this:
getNumberReference() = 10;
Yes, the int& version returns a reference to an int. The int version returns an int by value.
See the section on references in the C++ FAQ
Yes, it's different.
The & means you return a reference. Otherwise it will return a copy (well, sometimes the compiler optimizes it, but that's not the problem here).
An example is vector. Its operator[] returns a reference. This allows us to do:
my_vector[2] = 42;
That wouldn't work with a copy.
The difference is that without the & what you get back is a copy of the returned int, suitable for passing into other routines, comparing to stuff, or copying into your own variable.
With the &, what you get back is essentially the variable containing the returned integer. That means you can actually put it on the left-hand side of an assignment, like so:
getNumber() = 200;
The first version allows you to write getNumber() = 42, which is probably not what you want. Returning references is very useful when overloading operator[] for your own containers types. It enables you to write container[9] = 42.
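A minimal sketch of such a container, assuming a hypothetical fixed-size class named SmallBuffer. The non-const overload returns int& so that elements can be assigned to, and the const overload returns const int& for read-only access. Bounds checking is left out to keep the example short.

```cpp
#include <array>
#include <cstddef>

class SmallBuffer
{
public:
    // Returns a reference to the stored element, so the result of
    // buffer[i] can appear on the left-hand side of an assignment.
    int& operator[](std::size_t i) { return data_[i]; }

    // Read-only access when the buffer itself is const.
    const int& operator[](std::size_t i) const { return data_[i]; }

private:
    std::array<int, 10> data_{};  // ten elements, zero-initialised
};
```

With this in place, buffer[9] = 42; stores into the buffer itself. If operator[] returned a plain int, the same statement would not compile, because the result would be a temporary value.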
int& getNumber(): function returns an integer by reference.
int getNumber(): function returns an integer by value.
They differ in some ways and one of the interesting differences being that the 1st type can be used on the left side of assignment which is not possible with the 2nd type.
Example:
int global = 1;
int& getNumber() {
return global; // return global by reference.
}
int main() {
cout<<"before "<<global<<endl;
getNumber() = 2; // assign 2 to the return value which is reference.
cout<<"after "<<global<<endl;
return 0;
}
Output:
before 1
after 2
"&" means reference, in this case "reference to an int".
It means that it is a reference type. What's a reference?
Wikipedia:
In the C++ programming language, a reference is a simple reference datatype that is less powerful but safer than the pointer type inherited from C. The name C++ reference may cause confusion, as in computer science a reference is a general concept datatype, with pointers and C++ references being specific reference datatype implementations. The declaration of the form:
Type & Name
where <Type> is a type and <Name> is an identifier whose type is reference to <Type>.
Examples:
int A = 5;
int& rA = A;
extern int& rB;
int& foo ();
void bar (int& rP);
class MyClass { int& m_b; /* ... */ };
int funcX() { return 42 ; }; int (&xFunc)() = funcX;
Here, rA and rB are of type "reference to int", foo() is a function that returns a reference to int, bar() is a function with a reference parameter, which is reference to int, MyClass is a class with a member which is reference to int, funcX() is a function that returns an int, xFunc() is an alias for funcX.
Rest of the explanation is here
It's a reference
It means it's returning a reference to an int, not an int itself.
It's a reference, which behaves much like a pointer except you don't have to use a pointer-dereference operator (* or ->) with it; the dereferencing is implied. Unlike a pointer, it cannot be null or rebound.
Especially note that all the lifetime concerns (such as don't return a stack variable by address) still need to be addressed just as if a pointer was used.