size of string array through function - c++

How can I find the size of a string array passed to a function? The size should be computed inside the function.
#include<iostream>
using namespace std;
template <typename T,unsigned S>
unsigned arraysize(const T (&v)[S]) { return S; }
void func(string args[])
{
unsigned m=arraysize(args);
cout<<m;
}
int main()
{
string str_arr[]={"hello","foo","bar"};
func(str_arr);
}
What I don't understand is:
If the statement arraysize(str_arr) is used in main, it doesn't pose a problem. str_arr is an array, so str_arr acts as a pointer, so when we use arraysize(str_arr) we're sending the address to the arraysize function (correct me if I'm wrong).
But in func(), I don't understand why there is a problem, i.e. the statement arraysize(args) sends the address of the string array args (or the address of the pointer args). Or is it more complicated because it becomes some kind of double pointer? Please explain.
Also, please correct the above code.

str_arr is an array of strings. When you do sizeof(str_arr), you get the size of that array. However, despite the fact that args looks like an array of strings, it's not. An argument that looks like an array is really of pointer type. That is, string args[] is transformed to string* args by the compiler. When you do sizeof(args) you are simply getting the size of the pointer.
You can either pass the size of the array into the function or take a reference to the array with a template parameter size (as you did with arraysize):
template <size_t N>
void func(string (&args)[N])
{
// ...
}
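Both options can be sketched roughly as follows. The name total_length and the idea of summing the string lengths are made up for illustration; they are not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Option 1: the caller passes the element count explicitly.
std::size_t total_length(const std::string* args, std::size_t count)
{
    assert(args != nullptr || count == 0);
    std::size_t total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += args[i].size();
    return total;
}

// Option 2: take the array by reference so the compiler deduces N.
template <std::size_t N>
std::size_t total_length(const std::string (&args)[N])
{
    return total_length(args, N);  // args decays to a pointer only here
}
```

With string str_arr[] = {"hello", "foo", "bar"}, the call total_length(str_arr) picks the template and deduces N as 3.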

There is no way to determine the size of an array inside a function that receives it through a plain array or pointer parameter. Only a pointer to the first element is passed to the function, and a pointer carries no information about how many elements follow it. (Taking the array by reference, as arraysize does, is the exception.)

The information of the array's size is never visible in your function, as you threw it away when you decided to use string args[] for the argument. From the compiler's perspective, it's the same as string* args. You could change the function to:
template<size_t M>
void func(string (&args)[M])
{
cout<<M;
}
but it seems you already know that, right?

If the statement arraysize(str_arr) is used in main,it wouldn't pose a
problem. The str_arr is an array, so str_arr acts as a pointer, so
when we use arraysize(str_arr) that means we're sending the address to
arraysize function.(correct me if i'm wrong).
I have to correct you here. You state a correct premise, but draw the wrong conclusion.
The key point is indeed that str_arr is an array in main. While an array decays to a pointer in many (most) expression contexts, this does not apply when a reference to array is initialized. That is the reason why arraysize is declared to take a reference-to-array parameter: this is the only way to have a parameter of array type, which implies that it comes with a defined length.
That is not the case for func. When a function parameter is declared to be of plain array type, the array-to-pointer decay is applied to that declaration. Your declaration of func is equivalent to void func(string * args). Thus args is a pointer, not an array. You could call func as
string str_non_array;
func(&str_non_array);
Because of this, a reference-to-array can't bind to it. And anyways, args has completely lost all information about the size of the array it is pointing to.
You could use the same reference-to-array trick as is used in arraysize: declare func as
template <std::size_t N>
void func(string (&args)[N]);
But this gets impractical to do everywhere (and may lead to code bloat, if applied naively to all array-handling code). The C++ equivalent of an array-with-length as available in other languages is std::vector<string> (for dynamically sized arrays) or std::array<string,N> (for fixed size known at compile time). Note that the latter can cause the same code bloat as mentioned above, so in most cases, std::vector<string> would be the preferred type for array that you need to pass to various functions.
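A minimal sketch of the container-based alternative, assuming hypothetical helper names count_args and first_arg (neither appears in the question):

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Dynamically sized: the vector carries its own length.
std::size_t count_args(const std::vector<std::string>& args)
{
    return args.size();
}

// Fixed size known at compile time: N is part of the type,
// so each N instantiates its own function.
template <std::size_t N>
std::size_t count_args(const std::array<std::string, N>& args)
{
    return args.size();
}

// With a container, preconditions can be checked inside the function.
const std::string& first_arg(const std::vector<std::string>& args)
{
    assert(!args.empty());
    return args.front();
}
```

Neither overload needs a separate size parameter, and neither can be called with a bare pointer by accident.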

Dmitry is right, and I would like to explain it a bit further. The reason this happens is that an array is not a first-class citizen in C++: when passed as a parameter it decays to a pointer, so what the called function gets is a pointer to the first element, and the size is lost.
You can refer to "C++ arrays as function arguments" to see what alternative options are available.

Related

Why is the size of an array passed to a function by reference known to the compiler in C++?

I know that when I want to pass an array to a function, it will decay into pointer, so its size won't be known and these two declarations are equivalent:
void funtion(int *tab, int size);
and
void funtion(int tab[], int size);
And I understand why. However, I checked that when I pass an array as a reference:
void funtion(int (&tab)[4]);
the compiler will know the size of the array and won't let me pass an array of different size as an argument of this function.
Why is that? I know that when I pass an array by address, the size isn't taken into account while computing the position of the ith element in the array, so it is discarded even if I explicitly include it in the function declaration:
void funtion(int tab[4], int size);
But what is different when I pass an array by reference? Why is its size known to the compiler?
Note: I'm interested in arrays whose size is known at compile time, so I didn't use any templates.
I found a similar question on Stack Overflow, however it doesn't answer my question - it doesn't explain why the compiler knows the size of the array, there is just some information on how to pass arrays to functions.
Because it can, and because checking adds extra safety. The compiler knows the size of the array because you tell it so, right there in the function declaration. And since that information is available, why wouldn't it use it to signal errors in your source code?
The real question is why your last example wouldn't do the same check. This is, unfortunately, another piece of C legacy - you are never passing an array, it always decays into a pointer. The size of the array then becomes irrelevant.
Could a check be added? Possibly, but it would be of limited use (since we are all using std::array now - right!?), and because it would undoubtedly break some code. This, for example:
void func (const char Values [4]);
func ("x");
This is currently legal, even though "x" is an array of only 2 chars, but it wouldn't be with an additional check on array size.
Because there is no odd implicit type change committed by the compiler in the case. Normally when you write:
void func(int[4]);
or
void func(void());
The compiler decides to "help" you and translates those into:
void func(int *);
or
void func(void(*)());
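The adjustment described here can be checked at compile time with std::is_same. This is only a sketch; the function names are invented for the check and the functions are never defined or called.

```cpp
#include <type_traits>

void takes_array(int[4]);          // adjusted to void(int*)
void takes_function(void());       // adjusted to void(void(*)())
void takes_array_ref(int (&)[4]);  // no adjustment: the 4 stays in the type

static_assert(std::is_same<decltype(takes_array), void(int*)>::value,
              "array parameter is adjusted to a pointer");
static_assert(std::is_same<decltype(takes_function), void(void (*)())>::value,
              "function parameter is adjusted to a function pointer");
static_assert(std::is_same<decltype(takes_array_ref), void(int (&)[4])>::value,
              "reference-to-array parameter keeps its bound");
```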
Funny thing though - it wouldn't aid you in such a way when you try returning one of those. Try writing:
int func()[4], func1()();
Ooops - surprise - compiler error.
Otherwise arrays are arrays, and they have a constant size that can be obtained with the sizeof operator.
This is often forgotten, however, because of the compiler behavior noted above and because of the implicit array-to-pointer conversion that is applied to objects of array type, very often when it isn't expected. Here are the few exceptions where no implicit conversion of the array object is applied:
size_t arr[4],
(*parr)[3] = &arr, //taking the address of array
(&refarr)[3] = arr, //storing reference to array
sizearrobject = sizeof(arr); //taking the array object size
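The example above triggers compiler errors on the second and third lines because the types are incompatible: arr is a size_t[4], so neither a pointer to size_t[3] nor a reference to size_t[3] can be initialized from it.
These are the cases in which the arr object isn't automatically converted to something like this: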
(size_t*)&arr
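For comparison, here is a sketch of the same declarations with matching bounds, which does compile. The static_assert lines are additions for illustration, as is the extra variable first that shows the ordinary conversion.

```cpp
#include <cstddef>
#include <type_traits>

std::size_t arr[4] = {};
std::size_t (*parr)[4] = &arr;   // address of the whole array
std::size_t (&refarr)[4] = arr;  // reference to the whole array
std::size_t* first = arr;        // the usual conversion to pointer-to-first-element

static_assert(sizeof(arr) == 4 * sizeof(std::size_t),
              "sizeof sees the whole array");
static_assert(sizeof(refarr) == sizeof(arr),
              "a reference to the array keeps the full size");
static_assert(std::is_same<decltype(&arr), std::size_t (*)[4]>::value,
              "&arr is a pointer to the array, not to an element");
```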
Well, there are several ways to pass an array to a function. You can pass it by pointer and by reference, and in both cases you can either specify its size explicitly or leave it unspecified.
In your question you compare these 2 ways:
Pointer to first element: void f(int *arr)
Reference to an entire array: void f(int (&arr)[size])
You ask why you need to specify size only in one of these cases.
It looks like you assume that the only difference between them is the fact that one uses pointer and another uses reference. But this statement is incorrect, they have more differences: One is pointer to first element, but second is a reference to an entire array.
You can pass an array by pointer to an entire array:
void f(int (*arr)[size])
Compare it with your example, passing by reference to an entire array:
void f(int (&arr)[size])
They are similar: they have similar syntax, and they both explicitly define the array size.
Also, consider this:
void f(int &arr)
It looks like passing a single int by reference, but you can pass the first element of an array of any size to it (f(arr[0])) and reach the other elements through its address.
Pointer alternative to it is
void f(int *arr)
You ask why you need to specify array size only in one of those cases. It's because of the syntax you used, not because one is pointer and other is reference.
As I said, you can use pointer or reference. And you can specify array size or you can allow an array of any size to be used. These two are not connected.
                      // by pointer             // by reference
/* Any size */        void f(int *arr)          void f(int &arr)
/* Specific size */   void f(int (*arr)[x])     void f(int (&arr)[x])
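Three of the four cells in this table can be sketched like this. The function names and the fixed size 3 are chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>

// Any size, by pointer to the first element: the length travels separately.
int sum(const int* arr, std::size_t count)
{
    assert(arr != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += arr[i];
    return total;
}

// Specific size, by pointer to the entire array.
int sum3_by_pointer(const int (*arr)[3])
{
    assert(arr != nullptr);
    return sum(*arr, 3);
}

// Specific size, by reference to the entire array.
int sum3_by_reference(const int (&arr)[3])
{
    return sum(arr, 3);
}
```

The call sites differ too: for const int a[3], the pointer version is called as sum3_by_pointer(&a) and the reference version as sum3_by_reference(a).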

Passing array to function C++

I have one quick question about the passing of arrays in C++ which I don't understand.
Basically, when you want to pass an array of type integer to another function, you have to pass an address to that array instead of directly passing the whole block of contiguous memory. Exactly why is this the case?
Also, why is it that char arrays can be passed directly to another function in C++ without the need to pass an address instead?
I have tried looking for learning materials for this online (such as cplusplus.com) but I haven't managed to find an explanation for this.
Thanks for your time, Dan.
As far as C++ is concerned, passing char arrays and passing int arrays are the same.
There are 2 ways to pass arrays in C++.
Address is passed
int fn(int *arrays, int len);
int fn(int arrays[], int len); // Equivalent to the above; a syntax holdover from ANSI C
Array reference is passed
int fn(int (&array)[SIZE]); // Actual array passed as reference
You can templatize the above function as
template<size_t SIZE>
int fn(int (&array)[SIZE]);
The above method allows you to pass an array of any size to this function. But beware: a different function is instantiated from the template for each size. If your function's side effects change local state (a static variable, for example), this should be used with care.
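The caveat about static variables can be seen in a small sketch. count_calls is a hypothetical function, written only to expose the per-size state.

```cpp
#include <cassert>  // for checks like the ones in the note below
#include <cstddef>

// Each distinct SIZE instantiates a separate function,
// and each instantiation gets its own static counter.
template <std::size_t SIZE>
int count_calls(int (&)[SIZE])
{
    static int calls = 0;
    return ++calls;
}
```

Given int two[2] and int three[3], calling count_calls(two) twice returns 1 and then 2, while the first count_calls(three) returns 1 again, because it is a different function.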
If you don't want to change the contents, use const with the arguments.
If you want a copy of the array in the function argument, consider using a standard container like std::array or std::vector, or embed the array in your class.
It isn't entirely clear from your question exactly what you're trying and what problems you've had, but I'll try to give you useful answers anyway.
Firstly, what you're talking about is probably a parameter of type int[] or int* (or some other element type), which isn't an array itself... it's a pointer to a chunk of memory, which can be accessed as if it were an array. Because all the function has is a pointer, the array is effectively passed around by reference.
Secondly, passing around an array as a "whole block of contiguous memory" is rather inefficient... passing the pointer around might only involve moving a 32 or 64 bit value. Passing by reference is often a sensible thing with memory buffers, and you can explicitly use functions like memcpy to copy data if you need to.
Thirdly, I don't understand what you mean about char arrays being "directly" passable while other types of arrays are not. There's nothing magic about char arrays when it comes to passing or storing them... they're just arrays like any other. The principal difference is that compilers allow you to use string literals to create char arrays.
Lastly, if you're using C++11, you might want to consider the new std::array<T, N> class template. It provides various handy facilities, including value semantics (it can be copied and assigned) and keeping track of its own size. You can pass these by value, template<class T, std::size_t N> void foo(std::array<T, N> bar), or by reference, template<class T, std::size_t N> void foo(std::array<T, N>& bar), as you like.
You can't pass any raw array by value. What you can pass by value is either a struct containing an array or, since C++11, a std::array.
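Both by-value options might look like this in practice. Triple and zero_first are made-up names for the sketch.

```cpp
#include <array>
#include <cassert>  // for checks like the ones in the note below

// A raw array wrapped in a struct is copied when the struct is passed by value.
struct Triple
{
    int values[3];
};

int zero_first(Triple t)  // works on a copy of the wrapped array
{
    t.values[0] = 0;
    return t.values[0] + t.values[1] + t.values[2];
}

int zero_first(std::array<int, 3> a)  // std::array is copied the same way
{
    a[0] = 0;
    return a[0] + a[1] + a[2];
}
```

In both cases the caller's data is untouched after the call, which is exactly what a decayed int* parameter cannot offer.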

Need help understanding the syntax of this C++ template

I am new to C++ templates and encountered this template-related code, but I am not able to understand its meaning:
class StringBuffer
{
CharBuffer cb;
..
template <size_t ArrayLength>
bool append(const char (&array)[ArrayLength]) {
return cb.append(array, array + ArrayLength - 1); /* No trailing '\0'. */
}
};
What does the bool append(const char (&array)[ArrayLength]) mean? It seems to me that the function template will be instantiated to something taking a parameter with a specific ArrayLength. But isn't that we cannot specify an array length in the parameter list of a function? Also what does const char (&array) mean? Shouldn't it be something like const char &(without the parentheses)?
I am reading the book C++ Templates The Complete Guide by David Vandevoorde/Nicolai M.Josuttis, which part of the book covers the above syntax?
It means "reference to const char array".
The reason for it is that if you pass it like
template <typename T, int S>
void f(T a[S]){}
you will lose the size information because of the array-to-pointer adjustment of array parameters: a pointer doesn't hold array size information. (In other words, the standard says so.)
So you will have to pass by reference and not by pointer value.
The parentheses around &a are required because [] binds more tightly than &; to make & apply first, it needs to be written like
T (&a)[S]
const char (&array)[ArrayLength]
is a reference to an array of ArrayLength objects of type const char.
Without the parentheses, it would be an array of references, which is not allowed. Without the &, it would be an array which (as a function parameter) decays to a pointer, losing the information about the size of the array.
It seems to me that the function template will be instantiated to something taking a parameter with a specific ArrayLength.
That's right. The array length is known at compile time, and this will instantiate a function that can use that compile-time value.
But isn't that we cannot specify an array length in the parameter list of a function?
Yes, you could supply the length as an extra function parameter; but that would be a runtime value, and there'd be no way to validate that it was correct. The template ensures that the template argument really is the size of the array.
which part of the book covers the above syntax?
I don't have that book but, looking at the table of contents I'd suggest looking at 4.2 (Nontype Function Template Parameters) and 11 (Template Argument Deduction) for this kind of thing.
It's the syntax for passing the array by reference (since arrays can't be passed by value in C++):
void foo(const char (&array)[10]) { ... } // We can pass only an array of length 10
Now throw a template parameter in the mix instead of the 10. The compiler knows the size of an array at compile time and can instantiate the template with correct value.
template<size_t N>
void foo(const char (&array)[N])
{
// use N, it'll be whatever the size of the array you instantiate the template with is
}
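A rough, self-contained analogue of the append function from the question, using std::string in place of the unknown CharBuffer class. append_literal is a hypothetical name, and the null-termination check is an assumption added for the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends the characters of a string literal (or another null-terminated
// char array) to out, leaving off the trailing '\0'.
template <std::size_t N>
void append_literal(std::string& out, const char (&array)[N])
{
    assert(array[N - 1] == '\0');      // assumes a null-terminated array
    out.append(array, array + N - 1);  // N counts the terminator
}
```

For the literal "hi", N is deduced as 3, so exactly two characters are appended.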
This syntax will set a template parameter based on the size of a statically allocated array argument.
The templated version of "append" (that you included) forwards to a member function of cb which takes 2 arguments: a pointer to the first char and a pointer one past the last char to append (you did not include this).
So you might have an array like:
const char my_string[] = "hi";
You would use the "append" member function like this:
my_string_buffer_object.append(my_string);
And the length of my_string will be auto-detected, setting the ArrayLength parameter to the number of elements in my_string (3 here, counting the trailing '\0'). Then a more verbose version of "append" is called with the end of the string automatically computed for you.
Basically, this version of "append" wraps another version. It lets you pass an array as the only argument, automatically filling in the length using the template parameter's info.
If you use this syntax, keep in mind that these array length parameters count elements and not object sizes (what sizeof would tell you about the array). For char these are the same, but arrays with larger-sized element types will produce a template array length parameter smaller than its sizeof.
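The difference between the element count and sizeof can be checked at compile time. element_count and primes are names invented for this sketch.

```cpp
#include <cstddef>

// N is the number of elements, independent of the element type.
template <typename T, std::size_t N>
constexpr std::size_t element_count(const T (&)[N])
{
    return N;
}

constexpr int primes[] = {2, 3, 5, 7};

static_assert(element_count("hi") == 3,
              "two characters plus the terminating null");
static_assert(element_count(primes) == 4,
              "four elements");
static_assert(sizeof(primes) == 4 * sizeof(int),
              "sizeof counts bytes, not elements");
```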
The given code is a good lesson:
First, it wants to pass an array. An array can't be passed by value, so it is passed by reference (&), and the const qualifier makes that safe because the function cannot modify the array.
As you know, raw arrays in C/C++ do not carry their length, so the author of this code made the array length a template parameter to solve this problem.

when do we need to pass the size of array as a parameter

I am a little bit confused about pass an array in C/C++. I saw some cases in which the signature is like this
void f(int arr[])
some is like this
void f(int arr[], int size)
Could anybody elaborate what's the difference and when and how to use it?
First, an array passed to a function actually passes a pointer to the first element of the array, e.g., if you have
int a[] = { 1, 2, 3 };
f(a);
Then, f() gets &a[0] passed to it. So, when writing your function prototypes, the following are equivalent:
void f(int arr[]);
void f(int *arr);
This means that the size of the array is lost, and f(), in general, can't determine the size. (This is the reason I prefer void f(int *arr) form over void f(int arr[]).)
There are two cases where f() doesn't need the information, and in those two cases, it is OK to not have an extra parameter to it.
First, there is some special, agreed value in arr that both the caller and f() take to mean "the end". For example, one can agree that a value 0 means "Done".
Then one could write:
int a[] = { 1, 2, 3, 0 }; /* make sure there is a 0 at the end */
int result = f(a);
and define f() something like:
int f(int *a)
{
size_t i;
int result = 0;
for (i=0; a[i]; ++i) /* loop until we see a 0 */
result += a[i];
return result;
}
Obviously, the above scheme works only if both the caller and the callee agree to a convention, and follow it. An example is strlen() function in the C library. It calculates the length of a string by finding a 0. If you pass it something that doesn't have a 0 at the end, all bets are off, and you are in the undefined behavior territory.
The second case is when you don't really have an array. In this case, f() takes a pointer to an object (int in your example). So:
int change_me = 10;
f(&change_me);
printf("%d\n", change_me);
with
void f(int *a)
{
*a = 42;
}
is fine: f() is not operating on an array anyway.
When an array is passed in C or C++, only its address is passed. That is why the second case is quite common, where the second parameter is the number of elements in the array. The function has no idea, only by looking at the address of the array, how many elements it is supposed to contain.
You can write
void f( int *arr, int size )
as well; having the latter parameter (size) lets the function avoid stepping outside the array boundaries while reading from or writing to it.
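A minimal sketch of how the size parameter is used for bounds checking. set_at is a hypothetical helper, and it uses std::size_t rather than int for the size and index.

```cpp
#include <cassert>  // for checks like the ones in the note below
#include <cstddef>

// Writes value at index only if the index is inside the array.
// Returns false instead of stepping past the end.
bool set_at(int* arr, std::size_t size, std::size_t index, int value)
{
    if (arr == nullptr || index >= size)
        return false;
    arr[index] = value;
    return true;
}
```

For int a[3], set_at(a, 3, 1, 42) succeeds, while set_at(a, 3, 3, 7) is rejected. The check is only as good as the size the caller passes in.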
C and C++ are not the same thing. They have some common subset, though. What you observed here is that the "first" array dimension when passed to a function always results just in a pointer being passed. The "signature" (C doesn't use this term) of a function declared as
void toto(double A[23]);
is always just
void toto(double *A);
That is, the 23 above is somewhat redundant and not used by the compiler. Modern C (C99 and later, but not C++) has a feature here that lets you declare that A always has at least 23 elements:
void toto(double A[static 23]);
or that the pointer is const qualified
void toto(double A[const 23]);
If you add other dimension the picture changes, then the array size is used:
void toto(double A[23][7]);
in both C and C++ is
void toto(double (*A)[7]);
that is a pointer to an array of 7 elements. In C++ these array bounds must be an integer constant. In C it can be dynamic.
void toto(size_t n, size_t m, double A[n][m]);
The only thing that you have to watch here is that n and m must come before A in the parameter list. So it is better to always declare functions with the parameters in that order.
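On the C++ side, the adjustment of the two-dimensional parameter can be verified with a compile-time check. This sketch covers only the fixed-bound case, since the variable-length form above is C only.

```cpp
#include <type_traits>

void toto(double A[23][7]);  // only the first bound is dropped

static_assert(std::is_same<decltype(toto), void(double (*)[7])>::value,
              "the parameter is a pointer to an array of 7 doubles");
```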
The first signature just passes the array with no way to tell how big the array is and can lead to problems with out-of-bounds errors and/or security flaws.
The second signature is a more secure version because it allows the function to check against the size of the array to prevent the first version's shortcomings.
Unless this is homework, raw arrays are a bit outdated. Use std::vector instead. It allows passing the vector around without having to manually pass the size as it does this for you.
The size of an array is not passed with the array itself. Therefore, if the other function needs the size, it will have it as a parameter.
The thing is, some functions implicitly understand the array to be of a certain size. So they won't need to have it specified explicitly. For example, if your function operates on an array of 3 floats, you don't need the user to tell you that it is an array of 3 floats. Just take an array.
And then there are those functions (let's call them "terrible" because they are) that will fill an array in with arbitrary data up to a point defined by that data. sprintf is probably the "best" example. It will keep putting characters in that array until it is finished writing them. That's very bad, because there's no explicit or implicit agreement between the user and the function as to how big this array could be. sprintf will write some number of characters, but there's no way for the user to know exactly how many get written (in the general case).
Which is why you should never use sprintf; use snprintf or _snprintf, depending on your compiler.
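As a sketch of the bounded alternative, here is std::snprintf (standard since C++11) combined with a reference-to-array parameter so the buffer size cannot be mistyped. format_int is a hypothetical wrapper.

```cpp
#include <cassert>  // for checks like the ones in the note below
#include <cstddef>
#include <cstdio>

// Formats value into buf without ever writing more than N bytes.
// Returns what std::snprintf returns: the length the full text would need.
template <std::size_t N>
int format_int(char (&buf)[N], int value)
{
    return std::snprintf(buf, N, "%d", value);
}
```

With a char[4] buffer and the value 123456, the buffer ends up holding "123" plus the terminator, and the return value 6 tells the caller that the output was truncated.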
Anytime you need to know the size of the array, it needs to be provided. There is nothing special about the two forms of passing the array itself; the first parameter is the same either way. The second method simply provides the information needed to know the size of the array while the first does not.
Sometimes the array itself holds the information about its size, though. In your first example, for instance, perhaps arr[0] is set to the size of the array, and the actual data begins at arr[1]. Or consider the case of c-strings... you provide just a char[], and the array is assumed to end at the first element equal to \0. In your example, a negative value may act as a similar sentinel. Or perhaps the function simply doesn't care about the array's size, and will simply assume it is large enough.
Such methods are inherently unsafe, though... it is easy to forget to set arr[0] or to accidentally overwrite the null terminator. Then, f suddenly has no way of knowing how much space it has available to it. Always prefer to provide the size explicitly, either via a size parameter like you show, or with a second pointer to the end of the array. The latter is the method generally taken by the standard library functions in C++. You still have the issue of providing an incorrect size, though, which is why in C++ it isn't recommended you ever use such an array in the first place... use an actual container that will keep track of that information for you.
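The "second pointer to the end" convention might be sketched like this. sum_range is a made-up name; std::accumulate, std::begin and std::end are the standard library pieces.

```cpp
#include <cassert>
#include <iterator>
#include <numeric>

// Half-open range [first, last), the convention used by standard algorithms.
int sum_range(const int* first, const int* last)
{
    assert(first <= last);
    return std::accumulate(first, last, 0);
}
```

For a real array a, the call sum_range(std::begin(a), std::end(a)) lets the compiler work out the end pointer, so the caller does not have to count elements by hand.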
The difference is that the second one includes a parameter that indicates the array size. The logical conclusion is that if you don't use such a parameter, the function doesn't know what the array size is. And this indeed turns out to be the case. In fact, it doesn't know you have an array. In fact, you don't have to have an array to call the function.
The array syntax here, without a specified size inside the square brackets, is a fake-out. The parameter is actually a pointer. For more information, please see http://c-faq.com/aryptr/index.html , especially section 4.

What is useful about a reference-to-array parameter?

I recently found some code like this:
typedef int TenInts[10];
void foo(TenInts &arr);
What can you do in the body of foo() that is useful, that you could not do if the declaration was:
void foo(int *arr); // or,
void foo(int arr[]); // or,
void foo(int arr[10]); // ?
I found a question that asks how to pass a reference to an array. I guess I am asking why.
Also, only one answer to "When is pointer to array useful?" discussed function parameters, so I don't think this is a duplicate question.
The reference-to-array parameter does not allow array type to decay to pointer type, i.e. the exact array type remains preserved inside the function. (For example, you can use the sizeof arr / sizeof *arr trick on the parameter and get the element count.) The compiler will also perform type checking in order to make sure the array argument type is exactly the same as the array parameter type, i.e. if the parameter is declared as an array of 10 ints, the argument is required to be an array of exactly 10 ints and nothing else.
In fact, in situations when the array size is fixed at compile-time, using a reference-to-array (or pointer-to-array) parameter declaration can be perceived as the primary, preferred way to pass an array. The other variants (when the array type is allowed to decay to pointer type) are reserved for situations when it is necessary to pass arrays of run-time size.
For example, the correct way to pass an array of compile-time size to a function is
void foo(int (&arr)[10]); // reference to an array
or
void foo(int (*arr)[10]); // pointer to an array
An arguably incorrect way would be to use a "decayed" approach
void foo(int arr[]); // pointer to an element
// Bad practice!!!
The "decayed" approach should be normally reserved for arrays of run-time size and is normally accompanied by the actual size of the array in a separate parameter
void foo(int arr[], unsigned n); // pointer to an element
// Passing a run-time sized array
In other words, there's really no "why" question when it comes to reference-to-array (or pointer-to-array) passing. You are supposed to use this method naturally, by default, whenever you can, if the array size is fixed at compile-time. The "why" question should really arise when you use the "decayed" method of array passing. The "decayed" method is only supposed to be used as a specialized trick for passing arrays of run-time size.
The above is basically a direct consequence of a more generic principle. When you have a "heavy" object of type T, you normally pass it either by pointer T * or by reference T &. Arrays are no exception from this general principle. They have no reason to be.
Keep in mind, though, that in practice it often makes sense to write functions that work with arrays of run-time size, especially when it comes to generic, library-level functions. Such functions are more versatile. That means that often there's a good reason to use the "decayed" approach in real-life code. Nevertheless, this does not excuse the author of the code from recognizing the situations when the array size is known at compile time and using the reference-to-array method accordingly.
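The sizeof trick mentioned in this answer can be sketched as follows. count_by_reference is a hypothetical name, and the fixed size 10 matches the declarations above.

```cpp
#include <cstddef>

// The parameter keeps the type int[10], so sizeof sees the whole array.
// With a decayed parameter (int arr[] or int *arr) the same expression
// would instead divide the size of a pointer by the size of an int.
constexpr std::size_t count_by_reference(int (&arr)[10])
{
    return sizeof arr / sizeof *arr;
}
```

Passing an int[9] or an int* to this function is a compile-time error, which is the type checking described at the start of the answer.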
One difference is that it's (supposed to be) impossible to pass a null reference. So in theory the function does not need to check if the parameter is null, whereas an int *arr parameter could be passed null.
You can write a function template to find out the size of an array at compile time.
#include <cstddef>
#include <iostream>

template<class E, std::size_t size>
std::size_t array_size(E(&)[size])
{
return size;
}
int main()
{
int test[] = {2, 3, 5, 7, 11, 13, 17, 19};
std::cout << array_size(test) << std::endl; // prints 8
}
No more sizeof(test) / sizeof(test[0]) for me ;-)
Shouldn't we also address the words in bold from the question:
What can you do in the body of foo() that is useful, that you could not do if the declaration was void foo(int arr[]);?
The answer is: nothing, at least as far as modifying the array as a whole goes. Passing an argument by reference allows a function to change its value and pass this change back to the caller. However, it is not possible to assign a new value to an array as a whole, which would have been a reason to pass it by reference.
void foo(int (&arr)[3]) { // reference to an array
arr = {1, 2 ,3}; // ILLEGAL: array type int[3] is not assignable
arr = new(int[3]); // same issue
arr = arr2; // same issue, with arr2 global variable of type int[3]
}
You can ensure that the function is only called on int arrays of size 10. That may be useful from a type-checking standpoint.
You get more semantic meaning regarding what the function is expecting.