Passing arrays as a reference - c++

In C++ how can I pass an array as a reference when I don't know the size at compile time? So far, I found out that the only way to make it work is to use something like
const double( &numbers ) [size]
but it means that I need to know the size of the array at compile time, thus I cannot use it in an external function.
My questions are:
If I don't pass an array as a ( const double( &numbers ) [length] ), because for example I don't know its size, how do I make sure that it doesn't get copied, but it is referenced?
If I pass an array like in the above example, ( double array[] ) is it referenced or is it copied?

The other answers are great, but no one has mentioned using templates to handle this. You should still make your function take a pointer and a size, but templates can fill that in automatically for you:
void f( double const numbers[], std::size_t length )
{ ... }
template< std::size_t Length >
void f( double const (&numbers)[ Length ] )
{
return f( numbers, Length );
}

In C++, you should use std::vector.
In C/C++, you can't pass arrays as a copy. Arrays are always passed by reference.
EDIT
In C++, passing an array by reference has a different meaning. In both C and C++, arrays decay into a pointer to the first element of the array. Please check the comments below.

If I don't pass an array as a ( const double( &numbers ) [length] ),
because for example I don't know its size, how do I make sure that it
doesn't get copied, but it is referenced?
Yes, it means you're passing an array as reference,
void Foo(const double( &numbers ) [length]);
Note that the length is a constant integer.
If I pass an array like in the above example, ( double array[] ) is it
referenced or is it copied?
No, it is not copied. It means you're passing a pointer to your array which is equivalent to,
void Foo(const double *numbers);
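The equivalence can be checked at compile time. This is only a sketch: the three function names are made up for the illustration, and nothing here is ever called.

```cpp
#include <type_traits>

// Three hypothetical declarations; the names are illustrative only.
void by_brackets(const double numbers[]);
void by_sized_brackets(const double numbers[10]);
void by_pointer(const double* numbers);

// The compiler adjusts an array parameter to a pointer parameter,
// so all three declarations have the same function type.
static_assert(std::is_same<decltype(by_brackets), decltype(by_pointer)>::value,
              "a double[] parameter is adjusted to double*");
static_assert(std::is_same<decltype(by_sized_brackets), decltype(by_pointer)>::value,
              "the 10 in double[10] is ignored");
```

If either assertion were false the file would not compile, so a successful build is the whole demonstration.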

A couple things:
C++ doesn't allow variable-sized arrays anyway. So all your arrays will need to have known sizes at compile time. So I'm not entirely sure if your initial question is even applicable since you won't be able to make an array with an unknown size in the first place.
When you pass an array, it is done by reference. It is not copied.
In any case, you'll probably want to consider using vector instead.
EDIT : See comments.
#include <cstddef>
#include <iostream>
#include <numeric>
using namespace std;

double average(const double *arr, size_t len){
// Compute average; the 0.0 initial value keeps the accumulation in double
return accumulate(arr, arr + len, 0.0) / (double)len;
}
int main(){
double array[10] = {}; // Initialize it
cout << average(array, 10) << endl;
// Alternatively: This could probably be made a macro.
// But be careful though since the function can still take a pointer instead
// of an array.
cout << average(array, sizeof(array) / sizeof(double)) << endl;
return 0;
}

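In C++, an array's name behaves in most expressions like a constant pointer to its first element (strictly speaking, the array is converted to such a pointer; it is not a pointer itself). A constant pointer means a pointer that's capable of changing whatever it points to, but it can't be changed to point to something else.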
This means that whenever you pass an array, you're actually passing a constant pointer to that array. In other words, you're already passing it by reference, no need for extra efforts. To be more accurate, what is actually copied is that constant pointer, so the final (hopefully not-that-confusing) phrasing is that you're passing a constant pointer to the array by value.
If you don't know your array's size at compile time, just use a (normal) pointer to your data type instead of an explicit array. So whatever is T my_array[] (where T is a type, like int, double or even one of your classes) becomes T* my_array, and the syntax is exactly the same hereafter... my_array[i] will work fine (another syntax also exists, but not as elegant). For initialization, use the new operator:
T* my_array;
my_array = new T[3];
or
T* my_array;
my_array = new T[x];
where x is an integer (not necessarily constant as is the case with normal arrays). This way you can take this x from the user at runtime and create your "array" then. Just take care not to forget delete[] my_array after you finish using it to avoid memory leaks.
[Final Note] Using such a dynamically allocated array is a good choice only when you know exactly how many elements you want... either at compile time or even at runtime. So, for example, if after the user supplies his x you'll exactly be using those, that's fine. Otherwise you're facing the danger of overflowing your array (if you need more than x) - which usually crashes the application - or just wasting some space. But even if this is the case, you'll implement most of the functions you need for array manipulation yourself. That's why it's preferable to use containers provided by the C++ Standard Library, like std::vector (as Donotalo mentioned). I just wanted to elaborate on that point more.
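To make the std::vector suggestion concrete, here is a minimal sketch of the same "size known only at runtime" situation without new or delete[]. The function name first_squares is hypothetical and only serves the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds the first x squares, with x known only at run time.
std::vector<int> first_squares(std::size_t x) {
    std::vector<int> values(x);            // replaces: int* values = new int[x];
    for (std::size_t i = 0; i < x; ++i) {
        values[i] = static_cast<int>(i * i);
    }
    assert(values.size() == x);            // the vector remembers its own length
    return values;                         // no delete[] needed; storage is freed automatically
}
```

A caller can still use result[i] just like with the raw pointer, and can call push_back if it turns out that more than x elements are needed.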

Other languages like Java and Python do store the length of arrays at runtime. In C++ arrays, the length of the array is not stored. That means you need to store it manually somewhere.
Whenever you have a fixed-size array in your code, the compiler knows the size of the array since it reads it from the source code itself. But once the code gets compiled, that length information is lost. For example:
void f1(double array[10]) {...}
The compiler won't enforce the size of the array. The following code compiles without complaint, since the array parameter of f1 is just a pointer to the first element of an array:
void h1() {
double a[10];
double b[5];
f1(a); // OK.
f1(b); // Also OK.
}
Since the compiler ignores the static size of the array when passing it to a function, the only way you have to know the size of an arbitrarily sized array passed as a reference is to explicitly state the size:
void f2(double array[], size_t array_size) {...}
Then you can call that function with any array:
void h2() {
double a[10];
double b[19];
f2(a, sizeof(a) / sizeof(a[0]));
f2(b, sizeof(b) / sizeof(b[0]));
}
The parameter array_size contains the actual size of the array.
As a note, sizeof(array) only works on statically defined arrays. When you pass that array to another function, the size information is lost.
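The loss of size information can be seen at compile time. In this sketch the two function names and the kSamples array are made up for the illustration. One function takes the array by reference, the other takes it the usual decayed way.

```cpp
#include <cstddef>

// Array passed by reference: the parameter is still an array of N doubles.
template <std::size_t N>
constexpr std::size_t bytes_seen_by_reference(const double (&array)[N]) {
    return sizeof(array);  // size of the whole array
}

// Array passed the usual way: the parameter is only a pointer.
constexpr std::size_t bytes_seen_after_decay(const double* array) {
    return sizeof(array);  // size of a pointer, not of the caller's array
}

constexpr double kSamples[10] = {};

static_assert(bytes_seen_by_reference(kSamples) == 10 * sizeof(double),
              "the reference still knows about all ten elements");
static_assert(bytes_seen_after_decay(kSamples) == sizeof(const double*),
              "after decay only the pointer size is visible");
```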
An array is not the same as a pointer. However, an array of undefined size like the parameter of f2 is just a pointer to the first double element in the sequence:
void f3(double* array, size_t array_size) {...}
For any practical purpose, f2 and f3 are equivalent.
This is roughly how std::vector works. Conceptually, a vector is a class that holds a pointer to the first element and the number of elements in the vector. This makes things a little simpler when you want to accept an array of any size as a parameter:
void g(std::vector<double>& v) {...}
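As a rough sketch of what such a function body might look like, assuming the elements are double and the function only reads them (the name average is my own choice, not part of the answer above):

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical function: the vector carries its own length, so no size parameter is needed.
double average(const std::vector<double>& v) {
    assert(!v.empty());  // assumption of this sketch: the caller passes at least one element
    return std::accumulate(v.begin(), v.end(), 0.0) / static_cast<double>(v.size());
}
```

Taking the vector by const reference means nothing is copied, and v.size() plays the role of the array_size parameter of f2.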

Related

Why is the size of an array passed to a function by reference known to the compiler in C++?

I know that when I want to pass an array to a function, it will decay into pointer, so its size won't be known and these two declarations are equivalent:
void funtion(int *tab, int size);
and
void funtion(int tab[], int size);
And I understand why. However, I checked that when I pass an array as a reference:
void funtion(int (&tab)[4]);
the compiler will know the size of the array and won't let me pass an array of different size as an argument of this function.
Why is that? I know that when I pass an array by address, the size isn't taken into account while computing the position of the ith element in the array, so it is discarded even if I explicitly include it in the function declaration:
void funtion(int tab[4], int size);
But what is different when I pass an array by reference? Why is its size known to the compiler?
Note: I'm interested in arrays whose size is known at compile time, so I didn't use any templates.
I found a similar question on Stack Overflow, however it doesn't answer my question - it doesn't explain why the compiler knows the size of the array, there is just some information on how to pass arrays to functions.
Because it can, and because checking adds extra safety. The compiler knows the size of the array because you tell it so, right there in the function declaration. And since that information is available, why wouldn't it use it to signal errors in your source code?
The real question is why your last example wouldn't do the same check. This is, unfortunately, another piece of C legacy - you are never passing an array, it always decays into a pointer. The size of the array then becomes irrelevant.
Could a check be added? Possibly, but it would be of limited use (since we are all using std::array now - right!?), and because it would undoubtedly break some code. This, for example:
void func (char Values [4]);
func ("x");
This is currently legal, but wouldn't be with an additional check on array size.
Because there is no odd implicit type change committed by the compiler in that case. Normally when you write:
void func(int[4]);
or
void func(void());
The compiler decides to "help" you and translates those into:
void func(int *);
or
void func(void(*)());
Funny thing though - it wouldn't aid you in such a way when you try returning one of those. Try writing:
int func()[4], func1()();
Ooops - surprise - compiler error.
Otherwise arrays are arrays and have constant size which can be acquired by using the sizeof operator.
This however is often forgotten because of the compiler behavior noted above and also because of the implicit pointer conversion applied to objects of array type when such isn't expected. And this is very often. Though here are the few exceptions when no implicit array object conversion is applied:
size_t arr[4],
(*parr)[3] = &arr, //taking the address of array
(&refarr)[3] = arr, //storing reference to array
sizearrobject = sizeof(arr); //taking the array object size
The above examples will trigger compiler error because of incompatible types on the second and third line.
I'm talking about the cases when arr object isn't automatically converted to something like this:
(size_t*)&arr
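For comparison, here is a variant of those declarations with matching bounds, so that it compiles. It is only a sketch; the static_assert lines spell out which type each expression has.

```cpp
#include <cstddef>
#include <type_traits>

std::size_t arr[4];

// Taking the address yields a pointer to the whole array, not to its first element.
static_assert(std::is_same<decltype(&arr), std::size_t (*)[4]>::value,
              "&arr has type size_t (*)[4]");

// sizeof sees the whole array object.
static_assert(sizeof(arr) == 4 * sizeof(std::size_t),
              "sizeof covers all four elements");

// A reference can bind to the array itself, keeping the bound in its type.
std::size_t (&refarr)[4] = arr;
static_assert(std::is_same<decltype(refarr), std::size_t (&)[4]>::value,
              "the reference keeps the bound in its type");

// In most other expressions the array converts to a pointer to its first element.
static_assert(std::is_same<decltype(+arr), std::size_t*>::value,
              "unary plus forces the array-to-pointer conversion");
```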
Well, there are several ways to pass an array to a function. You can pass it by pointer and by reference, and there are ways to define or not to define its size explicitly for both ways.
In your question you compare these 2 ways:
Pointer to first element: void f(int *arr)
Reference to an entire array: void f(int (&arr)[size])
You ask why you need to specify size only in one of these cases.
It looks like you assume that the only difference between them is the fact that one uses pointer and another uses reference. But this statement is incorrect, they have more differences: One is pointer to first element, but second is a reference to an entire array.
You can pass an array by pointer to an entire array:
void f(int (*arr)[size])
Compare it with your example, passing by reference to an entire array:
void f(int (&arr)[size])
They are similar: they have similar syntax, and they both explicitly define the array size.
Also, consider this:
void f(int &arr)
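It looks like passing a single int by reference, but you can hand it the first element of an array of unknown size (for example f(arr[0])), and the function can reach the other elements through that element's address.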
Pointer alternative to it is
void f(int *arr)
You ask why you need to specify array size only in one of those cases. It's because of the syntax you used, not because one is pointer and other is reference.
As I said, you can use pointer or reference. And you can specify array size or you can allow an array of any size to be used. These two are not connected.
//                    by pointer              by reference
/* Any size      */   void f(int *arr)        void f(int &arr)
/* Specific size */   void f(int (*arr)[x])   void f(int (&arr)[x])
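A small sketch of the "specific size" row, showing how the two forms differ at the call site and inside the function. The helper names and the kTriple array are hypothetical.

```cpp
// Hypothetical helpers; both require an array of exactly three ints.
constexpr int sum_via_pointer(const int (*arr)[3]) {
    return (*arr)[0] + (*arr)[1] + (*arr)[2];  // dereference first, then index
}

constexpr int sum_via_reference(const int (&arr)[3]) {
    return arr[0] + arr[1] + arr[2];           // index directly
}

constexpr int kTriple[3] = {1, 2, 3};

static_assert(sum_via_pointer(&kTriple) == 6, "the caller passes the array's address");
static_assert(sum_via_reference(kTriple) == 6, "the caller passes the array itself");
```

Passing an array of any other length to either function is a compile error, because the length is part of the parameter type in both cases.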

C++ multidimensional array member accessor

I have a class that has a large 2 dimensional array in it. It used to be a dynamic array allocated on the heap and now it is statically sized which I prefer.
private:
int fontTextureCoords_[100][7];
I had to add the type casting to the accessor in order to return the array for access outside the class which is currently working okay, but I'm not sure it is safe or the correct way to handle this.
public:
inline int **getFontTextureCoords()
{
return (int**)fontTextureCoords_;
}
Is this safe / the correct way to do this or is there a more preferred method for returning a pointer to a multi-dimensional array?
That's not the correct way to do that, and without the cast it wouldn't compile. A 2D array is not convertible to a pointer to pointer. You'd have to return a pointer to an array, which is easiest to write using a type alias:
using T = int[7];
inline T* getFontTextureCoords() { return fontTextureCoords_; }
Although it'd be much better to just return a reference to the full array:
using T = int[100][7];
inline T& getFontTextureCoords() { return fontTextureCoords_; }
You could also just use std::array<std::array<int, 7>, 100>.
Maybe this diagram shows you the difference between the two types of multi-dimensional array declarations. (Sometimes people don't understand this.)
The first one says a is a single block of 100 consecutive 7-int chunks, or 700 ints total, all together in one piece.
The second says a is an array of pointers, where each pointer points to a different chunk of ints, scattered all over memory.
The compiler needs to know this, because if you write a[i][j] it has to generate totally different code.
Casting an array such as int fontTextureCoords_[100][7]; to an int** is not right. It leads to undefined behavior.
If it is not too much, change getFontTextureCoords to:
inline int (*getFontTextureCoords()) [7]
{
return fontTextureCoords_;
}
and use it as:
int (*ptr)[7] = getFontTextureCoords();
If you have the option of using std::vector or std::array, it will be better to use them.
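If std::array is an option, the accessor might look roughly like this. The class name FontAtlas and the Row and Grid aliases are invented for the sketch; only fontTextureCoords_ and getFontTextureCoords come from the question.

```cpp
#include <array>
#include <cstddef>

// Hypothetical class; only the member and accessor names come from the question.
class FontAtlas {
public:
    using Row = std::array<int, 7>;
    using Grid = std::array<Row, 100>;

    // Return references so callers see the real storage without a cast or a copy.
    Grid& getFontTextureCoords() { return fontTextureCoords_; }
    const Grid& getFontTextureCoords() const { return fontTextureCoords_; }

private:
    Grid fontTextureCoords_{};  // value-initialized: every entry starts at 0
};
```

Callers keep the familiar atlas.getFontTextureCoords()[i][j] syntax, and both dimensions stay part of the type.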
There are no multi-dimensional arrays in C/C++. There are only single-dimensional arrays. You can have a single-dimensional array, with every element of it being another single-dimensional array. While there seems to be no difference, it is there and is very important.
This is exactly why transitive logic does not work. Everybody has gone through it: 'If single-dimensional arrays are passed as a pointer to the first element, 2-D arrays should be passed as a pointer to pointer to first element, etc'. But since it is not a two-dimensional array, but an array of arrays, the logic cannot be applied.
You can reason about it in the following way. Let's say, you have an array of types X.
X x[10];
How do you access element number 5? Easy -
x[5] = 42;
But what does the compiler do when it sees it? It does approximately this:
*(&x[0] + 5) = 42;
So it takes the address of the first element, and adds 5 to it to get to the address of element number 5. But what does adding 5 mean? In bytes, how many bytes should be skipped from the address of the beginning of the array to arrive at the requested memory location? Of course, 5 * sizeof(X). Now, if you have a '2-D' array, declared like this:
X x[2][3];
And you try to work with it through the pointer to pointer:
px = (X**)x;
px[3][4] = 42;
Remember, to generate the code for [][], the compiler needs to express it in the form *(px + something). And that something has to be the size of the array (as the elements of your array are arrays). But you need to know the array size for this, and as you can see, your px does not have any array size encoded in it. The only size it knows is the size of X, which is not enough.
Hope it makes sense, and explains why you can't use int** instead of x[][].
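The arithmetic can be written out as a small sketch. Here X is arbitrarily taken to be int, and flat_offset is a made-up helper that mirrors what the compiler has to compute for x[i][j].

```cpp
#include <cstddef>

// Hypothetical helper: offset (in elements) of x[i][j] from the start of an
// array declared as X x[rows][cols]. The row length cols must be known.
constexpr std::size_t flat_offset(std::size_t i, std::size_t j, std::size_t cols) {
    return i * cols + j;
}

using X = int;  // any element type works; int is an arbitrary choice
X x[2][3];

// Each element of x is itself an array of 3 X, so stepping one row skips 3 elements.
static_assert(sizeof(x[0]) == 3 * sizeof(X), "a row is three X objects");
static_assert(sizeof(x) == 2 * 3 * sizeof(X), "one contiguous block, no pointers stored");
static_assert(flat_offset(1, 2, 3) == 5, "x[1][2] is the sixth element of the block");
```

An X** carries no value for cols, which is why the compiler cannot do this calculation through it.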

c++ pass array to function question

How is it possible to pass an array parameter to a function, and calculate the size of the array parameter? Every time I try to calculate it, the result is always 4.
Thanks for the fast replies. I'm going to pass the size of the array as well then, or I'll give vectors a shot. Thanks.
You can do it using templates as described here:
template<size_t Size>
void AcceptsArray( ParameterType( &Array )[Size] )
{
//use Size to find the number of elements
}
It is called like this:
ParameterType array[100];
AcceptsArray( array ); //Size will be auto-deduced by compiler and become 100
The only drawback is that you now have a templated function and that increases code bloat. This can be addressed by redirecting the call to a non-templated function that accepts the address of the first element and the number of elements.
This is (one of) the difference between arrays and pointers. Taking the size of an array results in its size in bytes, whereas taking the size of a pointer yields the size of pointers on that system. However, whenever you pass an array to a function, it decays to a pointer, the size of which is always the same no matter what type of pointer it is (4 on a 32 bit machine).
So technically, it's impossible to pass an array into a function since whenever you try, it becomes a pointer.
You'll need to pass the size of the array in to the function as well, or better yet if you can, use std::vector or std::array, or std::string if you're using the array as a C-style string.
In C++ it is considered bad practice to use raw arrays. You should consider using std::vector or boost::array instead.
It is difficult to calculate the size of an array if you do not keep track of the size or supply some sort of guardian value at the end. In C-strings (not std::strings), for example, the '\0' character marks the end of the string (a char array).
This works with g++:
template <typename T>
std::size_t size_of_array (T const & array)
{
return sizeof (array) / sizeof (*array);
}
int main ()
{
int x [4];
std::cout << "size: " << size_of_array (x) << '\n';
}
I guess it is because the function is inlined, but still it seems the array does not decay in this case. Can somebody explain why?
If you are using C arrays, it's not possible because they're automatically converted to a pointer when you pass them to a function. So 4 is the size of the pointer.
Solution: use std::vector
#include <vector>
int carray[] = {1,2,3,4};
std::vector<int> v(carray, carray + sizeof(carray)/sizeof(int));
my_function(v);

when do we need to pass the size of array as a parameter

I am a little bit confused about passing an array in C/C++. I saw some cases in which the signature is like this
void f(int arr[])
and some like this
void f(int arr[], int size)
Could anybody elaborate what's the difference and when and how to use it?
First, an array passed to a function actually passes a pointer to the first element of the array, e.g., if you have
int a[] = { 1, 2, 3 };
f(a);
Then, f() gets &a[0] passed to it. So, when writing your function prototypes, the following are equivalent:
void f(int arr[]);
void f(int *arr);
This means that the size of the array is lost, and f(), in general, can't determine the size. (This is the reason I prefer void f(int *arr) form over void f(int arr[]).)
There are two cases where f() doesn't need the information, and in those two cases, it is OK to not have an extra parameter to it.
First, there is some special, agreed value in arr that both the caller and f() take to mean "the end". For example, one can agree that a value 0 means "Done".
Then one could write:
int a[] = { 1, 2, 3, 0 }; /* make sure there is a 0 at the end */
int result = f(a);
and define f() something like:
int f(int *a)
{
size_t i;
int result = 0;
for (i=0; a[i]; ++i) /* loop until we see a 0 */
result += a[i];
return result;
}
Obviously, the above scheme works only if both the caller and the callee agree to a convention, and follow it. An example is strlen() function in the C library. It calculates the length of a string by finding a 0. If you pass it something that doesn't have a 0 at the end, all bets are off, and you are in the undefined behavior territory.
The second case is when you don't really have an array. In this case, f() takes a pointer to an object (int in your example). So:
int change_me = 10;
f(&change_me);
printf("%d\n", change_me);
with
void f(int *a)
{
*a = 42;
}
is fine: f() is not operating on an array anyway.
When an array is passed in C or C++, only its address is passed. That is why the second case is quite common, where the second parameter is the number of elements in the array. The function has no idea, only by looking at the address of the array, how many elements it is supposed to contain.
you can write
void f( int *arr, int size )
as well. Having the latter (size) lets you avoid stepping outside the array boundaries while reading from or writing to it.
C and C++ are not the same thing. They have some common subset, though. What you observed here is that the "first" array dimension when passed to a function always results just in a pointer being passed. The "signature" (C doesn't use this term) of a function declared as
void toto(double A[23]);
is always just
void toto(double *A);
That is, the 23 above is somewhat redundant and not used by the compiler. Modern C (aka C99) has an extension here that lets you declare that A always has at least 23 elements:
void toto(double A[static 23]);
or that the pointer is const qualified
void toto(double A[const 23]);
If you add another dimension, the picture changes and the array size is used:
void toto(double A[23][7]);
in both C and C++ is
void toto(double (*A)[7]);
that is a pointer to an array of 7 elements. In C++ these array bounds must be an integer constant. In C it can be dynamic.
void toto(size_t n, size_t m, double A[n][m]);
The only thing that you have to watch here is that n and m come before A in the parameter list. So it is better to always declare functions with the parameters in that order.
The first signature just passes the array with no way to tell how big the array is, which can lead to out-of-bounds errors and/or security flaws.
The second signature is a more secure version because it allows the function to check against the size of the array to prevent the first version's shortcomings.
Unless this is homework, raw arrays are a bit out-dated. Use std::vector instead. It allows passing the vector around without having to manually pass the size as it does this for you.
The size of an array is not passed with the array itself. Therefore, if the other function needs the size, it will have it as a parameter.
The thing is, some functions implicitly understand the array to be of a certain size. So they won't need to have it specified explicitly. For example, if your function operates on an array of 3 floats, you don't need the user to tell you that it is an array of 3 floats. Just take an array.
And then there are those functions (let's call them "terrible" because they are) that will fill an array in with arbitrary data up to a point defined by that data. sprintf is probably the "best" example. It will keep putting characters in that array until it is finished writing them. That's very bad, because there's no explicit or implicit agreement between the user and the function as to how big this array could be. sprintf will write some number of characters, but there's no way for the user to know exactly how many get written (in the general case).
Which is why you should never use sprintf; use snprintf or _snprintf, depending on your compiler.
Anytime you need to know the size of the array, it needs to be provided. There is nothing special about the two forms of passing the array itself; the first parameter is the same either way. The second method simply provides the information needed to know the size of the array while the first does not.
Sometimes the array itself holds the information about its size, though. In your first example, for instance, perhaps arr[0] is set to the size of the array, and the actual data begins at arr[1]. Or consider the case of c-strings... you provide just a char[], and the array is assumed to end at the first element equal to \0. In your example, a negative value may act as a similar sentinel. Or perhaps the function simply doesn't care about the array's size, and will simply assume it is large enough.
Such methods are inherently unsafe, though... it is easy to forget to set arr[0] or to accidentally overwrite the null terminator. Then, f suddenly has no way of knowing how much space it has available to it. Always prefer to provide the size explicitly, either via a size parameter like you show, or with a second pointer to the end of the array. The latter is the method generally taken by the standard library functions in C++. You still have the issue of providing an incorrect size, though, which is why in C++ it isn't recommended you ever use such an array in the first place... use an actual container that will keep track of that information for you.
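A minimal sketch of the "second pointer to the end" style, with sum_range and kValues as made-up names. Using std::begin and std::end on a real array means the size never has to be typed by hand. This assumes at least C++14, where both are constexpr.

```cpp
#include <iterator>

// Hypothetical function taking a [first, last) pointer pair instead of a size.
constexpr int sum_range(const int* first, const int* last) {
    int total = 0;
    for (; first != last; ++first) {
        total += *first;
    }
    return total;
}

constexpr int kValues[4] = {1, 2, 3, 4};

// std::begin and std::end deduce the bounds of a real array.
static_assert(sum_range(std::begin(kValues), std::end(kValues)) == 10,
              "sums all four elements");
```

The same function works on any sub-range, for example sum_range(kValues, kValues + 2) for the first two elements.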
The difference is that the second one includes a parameter that indicates the array size. The logical conclusion is that if you don't use such a parameter, the function doesn't know what the array size is. And this indeed turns out to be the case. In fact, it doesn't know you have an array. In fact, you don't have to have an array to call the function.
The array syntax here, without a specified size inside the square brackets, is a fake-out. The parameter is actually a pointer. For more information, please see http://c-faq.com/aryptr/index.html , especially section 4.

What is useful about a reference-to-array parameter?

I recently found some code like this:
typedef int TenInts[10];
void foo(TenInts &arr);
What can you do in the body of foo() that is useful, that you could not do if the declaration was:
void foo(int *arr); // or,
void foo(int arr[]); // or,
void foo(int arr[10]); // ?
I found a question that asks how to pass a reference to an array. I guess I am asking why.
Also, only one answer to "When is pointer to array useful?" discussed function parameters, so I don't think this is a duplicate question.
The reference-to-array parameter does not allow array type to decay to pointer type. i.e. the exact array type remains preserved inside the function. (For example, you can use the sizeof arr / sizeof *arr trick on the parameter and get the element count). The compiler will also perform type checking in order to make sure the array argument type is exactly the same as the array parameter type, i.e. if the parameter is declared as a array of 10 ints, the argument is required to be an array of exactly 10 ints and nothing else.
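The type checking can be demonstrated without triggering a compile error. In this sketch the accepts helper is hypothetical; it reports at compile time whether a call foo(Arg) would be well-formed. foo itself is only declared, as in the question.

```cpp
#include <type_traits>
#include <utility>

void foo(int (&arr)[10]);  // declaration only; never called in this sketch

// Hypothetical detection helper: yields std::true_type if foo(Arg) would compile.
template <typename Arg>
auto accepts(int) -> decltype(foo(std::declval<Arg>()), std::true_type{});
template <typename Arg>
std::false_type accepts(...);

static_assert(decltype(accepts<int (&)[10]>(0))::value,
              "an array of exactly 10 ints is accepted");
static_assert(!decltype(accepts<int (&)[9]>(0))::value,
              "an array of 9 ints is rejected");
static_assert(!decltype(accepts<int*&>(0))::value,
              "a plain pointer is rejected");
```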
In fact, in situations when the array size is fixed at compile time, using a reference-to-array (or pointer-to-array) parameter declaration can be perceived as the primary, preferred way to pass an array. The other variant (when the array type is allowed to decay to pointer type) is reserved for situations when it is necessary to pass arrays of run-time size.
For example, the correct way to pass an array of compile-time size to a function is
void foo(int (&arr)[10]); // reference to an array
or
void foo(int (*arr)[10]); // pointer to an array
An arguably incorrect way would be to use a "decayed" approach
void foo(int arr[]); // pointer to an element
// Bad practice!!!
The "decayed" approach should be normally reserved for arrays of run-time size and is normally accompanied by the actual size of the array in a separate parameter
void foo(int arr[], unsigned n); // pointer to an element
// Passing a run-time sized array
In other words, there's really no "why" question when it comes to reference-to-array (or pointer-to-array) passing. You are supposed to use this method naturally, by default, whenever you can, if the array size is fixed at compile-time. The "why" question should really arise when you use the "decayed" method of array passing. The "decayed" method is only supposed to be used as a specialized trick for passing arrays of run-time size.
The above is basically a direct consequence of a more generic principle. When you have a "heavy" object of type T, you normally pass it either by pointer T * or by reference T &. Arrays are no exception from this general principle. They have no reason to be.
Keep in mind though that in practice it often makes sense to write functions that work with arrays of run-time size, especially when it comes to generic, library-level functions. Such functions are more versatile. That means that often there's a good reason to use the "decayed" approach in real-life code. Nevertheless, this does not excuse the author of the code from recognizing the situations when the array size is known at compile time and using the reference-to-array method accordingly.
One difference is that it's (supposed to be) impossible to pass a null reference. So in theory the function does not need to check if the parameter is null, whereas an int *arr parameter could be passed null.
You can write a function template to find out the size of an array at compile time.
template<class E, size_t size>
size_t array_size(E(&)[size])
{
return size;
}
int main()
{
int test[] = {2, 3, 5, 7, 11, 13, 17, 19};
std::cout << array_size(test) << std::endl; // prints 8
}
No more sizeof(test) / sizeof(test[0]) for me ;-)
Shouldn't we also address the words in bold from the question:
What can you do in the body of foo() that is useful, that you could not do if the declaration was void foo(int arr[]);?
The answer is: nothing. Passing an argument by reference allows a function to change its value and pass back this change to the caller. However, it is not possible to change the value of the array as a whole, which would have been a reason to pass it by reference.
void foo(int (&arr)[3]) { // reference to an array
arr = {1, 2 ,3}; // ILLEGAL: array type int[3] is not assignable
arr = new(int[3]); // same issue
arr = arr2; // same issue, with arr2 global variable of type int[3]
}
You can ensure that the function is only called on int arrays of size 10. That may be useful from a type-checking standpoint.
You get more semantic meaning regarding what the function is expecting.