When a function requires a pointer as an argument (and not the variable that the pointer references), is this simply because of the size of the values that would be passed to the function?
I can understand why someone would want to pass a pointer to an array or struct rather than the entire array or struct, but are there other reasons for this decision? For example, a function requiring a pointer to an int (4 bytes) rather than the int (4 bytes) itself.
If you would like your function to change the value of a parameter (such as an int), then you must pass in a pointer to it. Otherwise, any changes that your function makes will be made on a copy.
In general, so-called "output parameters" in C and C++ are often pointers to whatever variable is to be affected by the function.
As for arrays, C doesn't permit passing an array to a function by value: an array argument is converted to a pointer to its first element, so we have no choice but to pass a pointer.
(Edit: as discussed in the comments, this answer applies to pointers only. In C++, one may also use references)
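A minimal sketch of the difference described above, written so that it compiles as C++ (the pointer parts look the same in C). The names bump_copy, bump, and divide are made up for illustration and do not come from the question.

```cpp
#include <cassert>

// Receives a copy: the caller's variable is not affected.
void bump_copy(int x) { ++x; }

// Receives the address: the caller's variable is modified.
void bump(int* x) {
    assert(x != nullptr);  // this sketch assumes a valid pointer
    ++*x;
}

// Output parameters: the return value reports success,
// the results are written through the two pointers.
bool divide(int num, int den, int* quot, int* rem) {
    if (den == 0 || quot == nullptr || rem == nullptr) return false;
    *quot = num / den;
    *rem = num % den;
    return true;
}
```

Calling bump_copy(n) leaves n unchanged, while bump(&n) increments it; divide(17, 5, &q, &r) stores 3 in q and 2 in r.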
In C++ you would pass built-in types by value, except when you want to modify them in the method or function and have the modification apply to the original variable.
You can pass by reference or by pointer. Some people prefer to pass by pointer if they are going to modify the input, because it is more explicit: you have to dereference the pointer.
For example:
void foo(int& a, int* b)
{
    a = 1;  // This modifies the external variable, but you can't see that just by looking at this line
    *b = 1; // Explicitly modifying the external variable
}
int z = 0;
int y = 0;
// z is explicitly being allowed to be modified; that y can be modified too
// isn't apparent until you look at the function declaration.
foo(y, &z);
Others think passing pointers is ugly and don't like it.
The best practice for passing large types around is by const reference, which says you won't be modifying the instance.
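As a rough illustration of passing a large object by const reference, here is a sketch with two hypothetical functions that compute the same result. Only the parameter type differs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// By value: the whole vector (and every string in it) is copied on each call.
std::size_t total_length_by_value(std::vector<std::string> words) {
    std::size_t total = 0;
    for (const std::string& w : words) total += w.size();
    return total;
}

// By const reference: no copy is made, and the function cannot modify the argument.
std::size_t total_length(const std::vector<std::string>& words) {
    std::size_t total = 0;
    for (const std::string& w : words) total += w.size();
    return total;
}
```

Both return the same value; the const reference version simply avoids the copy and documents that the argument is input-only.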
The answer in one line is: pass-by-(pointer/reference)-to-const if you are dealing with input parameters of non-fundamental types, pass-by-value if you are dealing with input parameters of fundamental types, pass-by-(pointer/reference) otherwise. As pointed out in the comments (thanks TonyD), the pass-by-value "rule" for fundamental types is meant to be an optimisation over using pass-by-(pointer/reference)-to-const; it is likely unnecessary, but it's worth knowing. Note that passing by reference to const does not prevent you from calling a function with a temporary (be it a literal or a result from a function call) as the argument.
Several distinctions have to be made to answer this question appropriately. First of all, C and C++ are two different beasts: the only options in C are pass-by-value (pbv), pass-by-pointer (pbp) and pass-by-pointer-to-const (pbptc). In C++ you also have the option to pass-by-reference (pbr) and pass-by-reference-to-const (pbrtc). Secondly, there is the distinction between an input parameter and an (input/)output parameter; when a parameter belongs to the second class you have no options but pbp or pbr (if applicable, i.e. if using C++). As for input parameters, the considerations to be made are more subtle. Alexandrescu addresses this issue in his book "Modern C++ Design":
you sometimes need to answer the following question: Given an
arbitrary type T, what is the most efficient way of passing and
accepting objects of type T as arguments to functions? In general, the
most efficient way is to pass elaborate types by reference and scalar
types by value. (Scalar types consist of the arithmetic types
described earlier as well as enums, pointers, and pointers to
members.) For elaborate types you avoid the overhead of an extra
temporary (constructor-plus-destructor calls), and for scalar types
you avoid the overhead of the indirection resulting from the
reference.
(of course, for input parameters, he is referring to pbrtc). Similarly, you should choose to pbptc for "elaborate" types in C.
Finally, if you are using C++, you can automate this choice by using "type traits" (either the standard ones or custom-written ones, see Modern C++ Design for more on this). Type traits allow you to automatically know if a type is a fundamental type, if it is a reference already (in which case you cannot pass it by reference, because C++ doesn't allow references to references) and all kinds of other useful things. By means of type_traits, for example, you can write something like this:
#include <type_traits>
typedef int& my_type;
// my_type is already a reference: the resulting parameter type is still int&
// (no reference to a reference is formed, and const on a reference type is ignored)
void f(std::add_lvalue_reference<const my_type>::type a){
}
typedef int my_type2;
// my_type2 is not a reference: the resulting parameter type is const int&
void g(std::add_lvalue_reference<const my_type2>::type a){
}
int main() {
}
Of course, this is a made-up example, but you can see the utility of the approach, which is much greater if you are using templates. Notice that type_traits is part of the C++11 standard library; if you are not using C++11 you have to write your own (or use a library such as Loki).
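To show how this pays off with templates, here is a sketch that picks the parameter type automatically: scalars by value, everything else by reference to const. std::conditional and std::is_scalar are standard C++11 traits; the names param_t and Holder are hypothetical, and this is not the implementation from the book or from Loki.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// T itself for scalar types, const T& for everything else.
template <typename T>
using param_t =
    typename std::conditional<std::is_scalar<T>::value, T, const T&>::type;

static_assert(std::is_same<param_t<int>, int>::value,
              "scalars are passed by value");
static_assert(std::is_same<param_t<std::string>, const std::string&>::value,
              "class types are passed by reference to const");

// A class template can use the alias directly in its member signatures.
template <typename T>
struct Holder {
    T value;
    void set(param_t<T> v) { value = v; }
};
```

The alias is used inside a class template here because param_t<T> in a function template's own parameter list would not allow T to be deduced.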
When you want to change the int variable, you can use a reference too.
For an array, the array name is converted to a pointer to the first element when it is passed to a function, so the function sees only an ordinary pointer and the size information is lost. That is why you must also pass the number of elements in the array as a parameter.
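A small sketch of that convention, with a hypothetical function sum that receives the pointer and the element count separately:

```cpp
#include <cassert>
#include <cstddef>

// The parameter is a plain pointer; the element count must come separately.
int sum(const int* data, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) total += data[i];
    return total;
}
```

At the call site the count can be computed from the real array, for example sum(a, sizeof a / sizeof a[0]); inside sum, sizeof data would only give the size of the pointer.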
Using variables versus pointers to variables as parameters for a function
General recommendations:
If a function does not change a parameter, pass it by value.
#include <iostream>
// arg is a copy; the caller's variable is never touched
void test(int arg){
    std::cout << arg;
}
int main(int argc, char** argv){
    int a = 6;
    test(a);
    return 0;
}
If a function needs to change the passed parameter, pass it by reference.
#include <iostream>
// arg refers to the caller's variable, so the assignment is visible outside
void test(int &arg){
    arg = 6;
}
int main(int argc, char** argv){
    int a = 0;
    test(a);
    std::cout << a;
    return 0;
}
If a function does not need to change a parameter, but the parameter is BIG, pass it by const reference.
If a function needs to change the passed parameter AND this parameter is optional, pass it by pointer.
#include <iostream>
// a null pointer means "no argument supplied"
void test(int *arg){
    if (arg)
        *arg = 6;
}
int main(int argc, char** argv){
    int a = 0, b = 1;
    test(0);   // nothing is written
    test(&b);  // b becomes 6
    std::cout << a << std::endl << b << std::endl;
    return 0;
}
If a function does not need to change the passed parameter, the parameter is big and the parameter is optional, pass it by pointer to const.
Reasoning: references and pointers can be used to modify values "outside" of function, but references cannot be set to 0/NULL.
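The last recommendation (big, read-only, optional) has no example above, so here is a minimal sketch. The function greet and its fallback text are invented for illustration.

```cpp
#include <cassert>
#include <string>

// name is optional: a null pointer means "no name supplied".
// Pointer to const: the function can read the string but not modify it.
std::string greet(const std::string* name) {
    if (name == nullptr) return "Hello, stranger";
    return "Hello, " + *name;
}
```

A reference could not express the "no name" case, because a reference always has to refer to an object.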
pointer to an int (4 bytes)
Depending on the platform, a pointer to int may not be 4 bytes big. On a typical 64-bit system it is 8 bytes, for example.
Returning a pointer to int makes sense if that function allocates a memory block. Returning a pointer/reference to int makes sense if the function is used as a "selector" and you need to write into the returned value.
#include <iostream>
int& getMaxVal(int &a, int &b){
return (a > b)? a: b;
}
int main(int argc, char** argv){
int i = 3, j = 4;
std::cout << i << " " << j << std::endl;
getMaxVal(i, j) /= 2;
std::cout << i << " " << j << std::endl;
return 0;
}
I need to store a function reference as a char*, and later call that function with only the char*.
I am looking for something like:
char* funcRef = (char*)myFunc;
//...
(void (*funcRef)())();
How do you do this? (Note: I am not asking how to call a function by reference, just if it's possible to store a reference to it as a char* and then convert it back)
This conversion is not allowed. The allowable conversions are given in section 6.3 Conversions, sub-subsection 6.3.2.3 Pointers, of which paragraphs (6) and (8) apply to function pointers.
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined...
and
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the pointed-to type, the behavior is undefined.
Conversion between pointer to function and pointer to object is not on the allowed list; it is therefore disallowed.
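The second quoted paragraph does permit a round trip through another function pointer type. The sketch below shows that route in C++, where reinterpret_cast between function pointer types carries the same round-trip guarantee. GenericFn, store, and call_int are illustrative names, and the sketch assumes the caller knows the original function type.

```cpp
#include <cassert>

int sqr(int k) { return k * k; }

// A "generic" function pointer type used only for storage, never called directly.
using GenericFn = void (*)();

GenericFn store(int (*f)(int)) {
    return reinterpret_cast<GenericFn>(f);
}

// Convert back to the original type before calling.
int call_int(GenericFn g, int k) {
    return reinterpret_cast<int (*)(int)>(g)(k);
}
```

Calling through GenericFn without converting back would be undefined behavior, as the quoted text says.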
There are systems where pointers to functions and pointers to objects are not interchangeable. The most obvious case is where the size of a function pointer is not the same as the size of an object pointer, such as Harvard architecture, or the 8086 in Compact or Medium model.
The reason your code isn't working is because a char* pointer is not a function pointer. You cannot call a char* as if it was a function. You need to cast that pointer back to a function pointer prior to using it as a function pointer.
Note very well: Whether this will work at all is highly compiler and system dependent.
There is nothing in the C standard that says anything about converting a pointer to a function to a pointer to an object, or vice versa. This is undefined behavior. On the other hand, the POSIX standard requires that a compliant implementation be able to convert a pointer to void to a pointer to a function. (Note: The reverse capability is not required.)
This question is also tagged as C++. Prior to C++11, converting a pointer to a function to a pointer to an object, or vice versa, was illegal. The compiler had to issue a diagnostic message. On POSIX-compliant systems, the compiler would issue a diagnostic and then generate the POSIX-compliant object code. In C++11 and later, converting between a pointer to a function and a pointer to an object is conditionally-supported.
With those caveats, the following works on my POSIX-compliant machine, with multiple compilers. Whether it works on a non-POSIX-compliant machine with non-POSIX-compliant compilers is anyone's guess.
C++ version:
#include <iostream>
int sqr (int k) { return k*k; }
int p42 (int k) { return k+42; }
void call_callback(void* vptr, int k)
{
using Fptr = int(*)(int);
Fptr fun = reinterpret_cast<Fptr>(vptr);
std::cout << fun(k) << '\n';
}
int main ()
{
call_callback(reinterpret_cast<void*>(sqr), 2);
call_callback(reinterpret_cast<void*>(p42), 2);
}
C version:
#include <stdio.h>
int sqr (int k) { return k*k; }
int p42 (int k) { return k+42; }
void call_callback(void* vptr, int k)
{
printf ("%d\n", ((int(*)(int))(vptr))(k));
}
int main ()
{
call_callback((void*)(sqr), 2);
call_callback((void*)(p42), 2);
}
I came across something I don't understand well. Let's suppose I want to pass a character pointer to a function that takes a reference to a void pointer.
void doStuff(void*& buffer)
{
// do something
}
I would usually do something like this :
int main()
{
unsigned char* buffer = 0;
void* b = reinterpret_cast<void *>(buffer);
doStuff(b);
return 0;
}
Why is it not possible to pass the reinterpret_cast directly to the function?
int main()
{
unsigned char* buffer = 0
// This generate a compilation error.
doStuff(reinterpret_cast<void *>(buffer));
// This would be fine.
doStuff(reinterpret_cast<void *&>(buffer));
return 0;
}
There must be a good reason behind this behavior but I don't see it.
In the first example, you're actually passing the pointer variable b. So it works.
In the second example, the first reinterpret_cast returns a pointer (by value), which doesn't match the reference the function should get, while the second returns said reference.
As an example to show you how references work, look at these two functions,
void doSomething( unsigned char *ptr );
void doSomethingRef( unsigned char *&ptr );
Say we have this pointer,
unsigned char *a;
Both functions are called the same way,
doSomething( a ); // Passing pointer a by value
doSomethingRef( a );// Passing pointer a by reference
Although the second call looks like it passes the pointer by value, the function takes a reference, so the pointer is passed by reference.
A reference is similar to a pointer, but it has to be initialized with an lvalue and can't be null.
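The binding rule can be seen with two small hypothetical functions: one takes a non-const reference to a void pointer, the other a reference to a const void pointer.

```cpp
#include <cassert>

// Non-const lvalue reference: needs a named void* variable (an lvalue).
void reset(void*& p) { p = nullptr; }

// Reference to const: also accepts temporaries such as the result of a cast.
bool is_null(void* const& p) { return p == nullptr; }
```

With unsigned char* buffer = nullptr;, the call is_null(static_cast<void*>(buffer)) compiles, whereas reset(static_cast<void*>(buffer)) does not, because the cast produces a temporary.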
Having said that, there are much better alternatives to using void* and especially void*&. void* makes code harder to read and easier to shoot yourself in the foot (if anything by making yourself use these strange casts).
As I said in the comments, you could use a template and not bother with void casting.
template< class T > void doStuff( T *&buffer ) {
...
}
Or,
template< class T > T* doStuff( T* buffer ) {
...
}
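A sketch of the first template with a concrete body filled in; the name advance_by and its behaviour are assumptions made for illustration, since the original leaves the body elided.

```cpp
#include <cassert>
#include <cstddef>

// T is deduced from the argument, so no cast to void*& is needed.
template <class T>
void advance_by(T*& buffer, std::size_t count) {
    buffer += count;  // modifies the caller's pointer
}
```

Given char str[] = "0123456789"; char* ptr = str;, the call advance_by(ptr, 4) leaves ptr pointing at the character '4'.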
EDIT: On a side note, your second example is missing a semicolon,
unsigned char* buffer = 0; // Right here
int main()
{
unsigned char* buffer = 0;
void* b = reinterpret_cast<void *>(buffer);
doStuff(b);
return 0;
}
b is a named variable (an lvalue) of type void*, so the parameter of type void*& can bind directly to it. The types match: doStuff receives a reference to b itself.
int main()
{
unsigned char* buffer = 0
// This generate a compilation error.
doStuff(reinterpret_cast<void *>(buffer));
// This would be fine.
doStuff(reinterpret_cast<void *&>(buffer));
return 0;
}
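The second call is like the call in the function above with b as the parameter: the cast yields a reference (an lvalue), which the void*& parameter can bind to.
The first call simply passes a void pointer by value, which is a temporary. The types are different: look closer, void* is not the same as void*&, and a temporary cannot bind to a non-const reference.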
This is how you would specify a reinterpret_cast as the function argument directly, without using an intermediate variable. As others have told you, it's bad practice, but I want to answer your original question. This is for educational purposes only, of course!
#include <iostream>
void doStuff(void*& buffer) {
static const int count = 4;
buffer = static_cast<void*>(static_cast<char*>(buffer) + count);
}
int main() {
char str[] = "0123456789";
char* ptr = str;
std::cout << "Before: '" << ptr << "'\n";
doStuff(*reinterpret_cast<void**>(&ptr)); // <== Here's the Magic!
std::cout << "After: '" << ptr << "'\n";
}
Here we have a pointer to char named ptr and we want to wrangle its type to void*& (a reference to a void pointer), suitable for passing as an argument to function doStuff.
Although references are implemented like pointers, they are semantically more like transparent aliases for another value, so the language doesn't provide the kind of flexibility you get for manipulating pointers.
The trick is: a dereferenced pointer converts directly into a correspondingly typed reference.
So to get a reference to a pointer, we start with a pointer to a pointer:
&ptr (char** - a pointer to a pointer to char)
Now the magic of reinterpret_cast brings us closer to our goal:
reinterpret_cast<void**>(&ptr) (now void** - a pointer to a void pointer)
Finally add the dereferencing operator and our masquerade is complete:
*reinterpret_cast<void**>(&ptr) (void*& - a reference to a void pointer)
This compiles fine in Visual Studio 2013. Here is what the program spits out:
Before: '0123456789'
After: '456789'
The doStuff function successfully advanced ptr by 4 characters, where ptr is a char*, passed by reference as a reinterpret_cast void*.
Obviously, one reason this demonstration works is because doStuff casts the pointer back to a char* to get the updated value. On most mainstream implementations, all object pointers have the same size and representation, so you can probably still get away with this kind of manipulation while switching between types.
But, if you start manipulating pointed-to values using reinterpreted pointers, all kinds of badness can happen. You will also probably be in violation of the "strict aliasing" rule then, so you might as well just change your name to Mister Undefined Behavior and join the circus. Freak.
I'm not sure if this is right, but...
I believe it's as simple as matching the argument type:
void doStuff(void* buffer) {
std::cout << reinterpret_cast<char*>(buffer) << std::endl;
return;
}
You could do the above and the int main() would compile correctly.
A reference is different from a copy of a value. A copied value does not need to live in a named variable or in any particular place in memory; it can be a temporary. A non-const reference, on the other hand, must not bind to an expiring value. This becomes important once you start playing around with reference and value semantics.
tl;dr: Don't mix references and values when casting. Operating on a reference is different from operating on a value, even when the argument is implicitly converted.
There is such code:
const int fun(){ return 2; } // can be assigned to int and const int
int fun2(){ return 2; } // can be assigned to int and const int
Is there any difference in using these functions? They both return by value so it is always copied at the end of function call.
Is there any difference in using these functions?
No. There is, however, a difference in their type, and if the functions returned a class type, there would be difference regarding invoking methods on the return value.
There's no practical difference when returning an int, basically because anything you do with a temporary of builtin type only needs its value. You can take a const reference to a temporary - it may be valid (if unwise) in the latter case to cast that const reference to non-const and modify the temporary through it, but I can't be bothered to look up whether temporaries of builtin type really are mutable, and there's not any great practical need to do anything like that.
When returning a class type there is a difference: in the second case you can call a non-const member function on the function's return value, and in the first case you can't. For example, given std::string fun2() { return "hello"; } you can do std::cout << (fun2() += " world\n");, or std::string s("foo"); std::cout << s; fun2().swap(s); std::cout << s;. Such tricks are potential optimizations (especially before C++11 move semantics came along), and they don't work if fun2 returns const std::string. The second trick is called "swaptimization", which at least tells you that it's used enough to be worth naming.
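The two tricks can be written out as a compilable sketch. The function names make, make_const, appended, and swapped_out are hypothetical stand-ins for the fun and fun2 of the question.

```cpp
#include <cassert>
#include <string>

std::string make() { return "hello"; }              // non-const return
const std::string make_const() { return "hello"; }  // const return

std::string appended() {
    // operator+= is a non-const member; it is allowed on the non-const temporary.
    return make() += " world";
}

std::string swapped_out() {
    std::string s("foo");
    make().swap(s);  // s takes over the temporary's contents
    return s;
}

// The same operations on make_const() are rejected by the compiler:
//   make_const() += " world";
//   make_const().swap(s);
```

Reading from either return value works the same way; only the modifying member functions are affected by the const.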
There is no difference in using these functions. Note that your assumption that copying would happen may not be true in the face of an optimizer, that would probably inline the value 2 at the call site.
Apparently, we can pass complex class instances to functions, but why can't we pass arrays to functions?
The origin is historical. The appeal is that the rule "arrays decay into pointers when passed to a function" is simple.
Copying arrays would be kind of complicated and not very clear, since the behavior would change for different parameters and different function declarations.
Note that you can still do an indirect pass by value:
struct A { int arr[2]; };
void func(struct A);
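To make the indirect pass by value concrete, here is a sketch in which a hypothetical function sum_then_clear modifies its parameter; the caller's array is unaffected because the whole struct, array included, was copied.

```cpp
#include <cassert>

struct A { int arr[2]; };

// p is a copy of the caller's struct, including the array inside it.
int sum_then_clear(A p) {
    int total = p.arr[0] + p.arr[1];
    p.arr[0] = p.arr[1] = 0;  // modifies only the local copy
    return total;
}
```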
Here's another perspective: There isn't a single type "array" in C. Rather, T[N] is a different type for every N. So T[1], T[2], etc., are all different types.
In C there's no function overloading, and so the only sensible thing you could have allowed would be a function that takes (or returns) a single type of array:
void foo(int a[3]); // hypothetical
Presumably, that was just considered far less useful than the actual decision to make all arrays decay into a pointer to the first element and require the user to communicate the size by other means. After all, the above could be rewritten as:
void foo(int * a)
{
static const unsigned int N = 3;
/* ... */
}
So there's no loss of expressive power, but a huge gain in generality.
Note that this isn't any different in C++, but template-driven code generation allows you to write a templated function foo(T (&a)[N]), where N is deduced for you -- but this just means that you can create a whole family of distinct, different functions, one for each value of N.
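A sketch of such a templated function, taking the array by reference so that N is deduced at each call site. The names sum and length are chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>

// One function is instantiated per distinct N; the array does not decay.
template <std::size_t N>
int sum(const int (&a)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) total += a[i];
    return total;
}

// The element count is part of the type, so it can be recovered directly.
template <typename T, std::size_t N>
constexpr std::size_t length(const T (&)[N]) { return N; }
```

Passing an int[3] and an int[5] instantiates two separate functions, which is exactly the "family of distinct functions" described above. Note that length("Hello") is 6, because the string literal includes its terminating null character.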
As an extreme case, imagine that you would need two functions print6(const char[6]) and print12(const char[12]) to say print6("Hello") and print12("Hello World") if you didn't want to decay arrays to pointers, or otherwise you'd have to add an explicit conversion, print_p((const char*)"Hello World").
Answering a very old question: since the question is tagged C++, I am just adding this for completeness. We can use std::array and pass arrays to functions by value or by reference. A std::array carries its size with it, and its at() member gives protection against out-of-bounds indexes (operator[] is not checked).
Below is a sample:
#include <array>
#include <cstddef>
#include <iostream>
// pass array by reference
template<std::size_t N>
void fill_array(std::array<int, N>& arr){
    for(std::size_t idx = 0; idx < arr.size(); ++idx)
        arr[idx] = static_cast<int>(idx*idx);
}
// pass array by value
template<std::size_t N>
void print_array(std::array<int, N> arr){
    for(std::size_t idx = 0; idx < arr.size(); ++idx)
        std::cout << arr[idx] << std::endl;
}
int main()
{
std::array<int, 5> arr;
fill_array(arr);
print_array(arr);
//use different size
std::array<int, 10> arr2;
fill_array(arr2);
print_array(arr2);
}
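Bounds checking with std::array comes from its at() member, which throws std::out_of_range for an invalid index, whereas operator[] performs no check. A minimal sketch, with checked_read as a hypothetical helper:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Returns true and stores the element in out if idx is valid, false otherwise.
template <std::size_t N>
bool checked_read(const std::array<int, N>& arr, std::size_t idx, int& out) {
    try {
        out = arr.at(idx);  // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range&) {
        return false;
    }
}
```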
The reason you can't pass an array by value is because there is no specific way to track an array's size such that the function invocation logic would know how much memory to allocate and what to copy. You can pass a class instance because classes have constructors. Arrays do not.
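Summary:
Passing an array passes the address of its first element: &a, a (after decay) and &a[0] all hold the same address, although their types differ.
The function receives a new pointer (a new object with its own address, 4 bytes on a 32-bit system).
It points to the same memory location, but has a different type.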
Example 1:
void by_value(bool* arr) // pointer_value passed by value
{
arr[1] = true;
arr = NULL; // changes only the local copy of the pointer; the caller's array is untouched
}
int main()
{
bool a[3] = {};
cout << a[1] << endl; // 0
by_value(a);
cout << a[1] << endl; // 1 !!!
}
Addresses:
[main]
a = 0046FB18 // **Original**
&a = 0046FB18 // **Original**
[func]
arr = 0046FB18 // **Original**
&arr = 0046FA44 // TempPTR
[func]
arr = NULL
&arr = 0046FA44 // TempPTR
Example 2:
void by_value(bool* arr)
{
cout << &arr << arr; // &arr != arr
}
int main()
{
bool a[3] = {};
cout << &a << a; // &a == a == &a[0]
by_value(a);
}
Addresses
Prints:
[main] 0046FB18 = 0046FB18
[func] 0046FA44 != 0046FB18
Please Note:
&(required-lvalue): lvalue -to-> rvalue
Array Decay: new pointer (temporary) points to (by value) array address
readmore:
Rvalue
Array Decay
It was done that way in order to preserve syntactical and semantic compatibility with B language, in which arrays were implemented as physical pointers.
A direct answer to this question is given in Dennis Ritchie's "The Development of the C Language", see the "Critique" section. It says
For example, the empty square brackets in the function declaration
int f(a) int a[]; { ... }
are a living fossil, a remnant of NB's way of declaring a pointer; a is, in this special case only, interpreted in C as a pointer. The notation survived in part for the sake of compatibility, in part under the rationalization that it would allow programmers to communicate to their readers an intent to pass f a pointer generated from an array, rather than a reference to a single integer. Unfortunately, it serves as much to confuse the learner as to alert the reader.
This should be taken in the context of the previous part of the article, especially "Embryonic C", which explains how the introduction of struct types in C resulted in the rejection of the B- and BCPL-style approach to implementing arrays (i.e. as ordinary pointers). C switched to a non-pointer array implementation, keeping the legacy B-style semantics in function parameter lists only.
So, the current variant of array parameter behavior is the result of a compromise: on the one hand, we had to have copyable arrays in structs; on the other hand, we wanted to preserve semantic compatibility with functions written in B, where arrays are always passed "by pointer".
The equivalent of that would be to first make a copy of the array and then pass it to the function (which can be highly inefficient for large arrays).
Other than that I would say it's for historical reasons, i.e. one could not pass arrays by value in C.
My guess is that the reasoning behind NOT introducing passing arrays by value in C++ was that objects were thought to be moderately sized compared to arrays.
As pointed out by delnan, when using std::vector you can actually pass array-like objects to functions by value.
You are passing by value: the value of the pointer to the array. Remember that using square bracket notation in C is simply shorthand for de-referencing a pointer. ptr[2] means *(ptr+2).
Dropping the brackets gets you a pointer to the array's first element, which can be passed by value to a function:
int x[2] = {1, 2};
int result;
result = DoSomething(x);
See the list of types in the ANSI C spec. Arrays are not basic types but derived types, constructed from an element type. (The construction is described under "Array type derivation".)
Actually, a pointer to the array's first element is passed by value. Using that pointer inside the called function gives you the feeling that the array is passed by reference, which is wrong. Try changing the pointer inside your function so that it points to another array, and you will find that the original array is not affected, which means that the array is not passed by reference.
The following code compiles and runs but I'm not sure what exactly is going on at a lower level. Doesn't a reference just store the address of the object being referenced? If so, both test functions are receiving an address as a parameter? Or is the C++ implementation able to differentiate between these types in some other way?
int main() {
int i = 1;
cout << test(i) << endl;
}
char test(int &i) {
return 'a';
}
char test(int *i) {
return 'b';
}
As int& and int* are distinct types, and i can be treated as an int& but not as an int*, overload resolution is absolutely unambiguous here.
It doesn't matter at this point that references are just a somewhat cloaked kind of pointer. From a language point of view they are distinct types.
References in C++ are more akin to an alias than a pointer. A reference is not a separate variable in itself; it is a new "name" for an existing variable. In your example the first test would get called because you are passing an integer to the function. A pointer is a separate variable that holds the address of another variable, so for the second function to be called you would have to call test with a pointer, like so: test(&i);. While a tad confusing, the operator & gets the address of a variable, while a variable declared with an &, like int &i, declares a reference.
Your code only matches char test(int &i), since you are passing an int lvalue to the function and that cannot be converted to int*.
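Both overloads can be exercised side by side. This sketch repeats the two functions from the question, declared before use, so that each call picks a different overload.

```cpp
#include <cassert>

char test(int&) { return 'a'; }  // chosen for an int lvalue
char test(int*) { return 'b'; }  // chosen for a pointer to int
```

With int i = 1;, test(i) returns 'a' and test(&i) returns 'b'.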