Array as parameter - c++

I was wondering which one of these is the best when I pass an array as parameter?
void function(int arr[]) {...};
or
void function(int* arr) {...};
Could you tell me your reason? and which book you might refer to? Thanks!

Since this question is tagged c++, I would use neither. If you must use this, both are equivalent.
But since you use C++, a better approach is to use a std::vector for such tasks
void function(std::vector<int> &arr) {...}
or, if you don't modify the array/vector
void function(const std::vector<int> &arr) {...}

If you just want to pass any array (including dynamically allocated), they are equivalent.
Should your function require an actual fixed-size array, you could do this:
template <size_t N>
void function(char (&str)[N]) { ... }
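A minimal sketch of the two points above, using hypothetical names (by_brackets, by_pointer, element_count) that are not part of the answers: the two parameter spellings declare the same function type, while a reference to an array keeps the size so the compiler can deduce N.

```cpp
#include <cstddef>
#include <type_traits>

// Two spellings of the same parameter: the compiler adjusts int arr[] to int* arr.
void by_brackets(int arr[]);
void by_pointer(int* arr);

static_assert(std::is_same<decltype(by_brackets), decltype(by_pointer)>::value,
              "both declarations have the type void(int*)");

// A reference to an array does not decay, so N is deduced from the argument.
template <std::size_t N>
std::size_t element_count(int (&)[N]) {
    return N;
}
```

Calling element_count with an int[4] should return 4, whereas passing a plain int* would not compile.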

They are semantically identical.
#1 is slightly better (in my opinion) because it makes it explicit that you are passing an array.

You ask which to choose of
void function(int arr[]) {...};
and
void function(int* arr) {...};
The latter allows you to declare the pointer itself as const, while the former only allows you to declare the items as const. However, to the reader [] indicates an array as opposed to possibly a pointer to a single object. Personally I think the constraint offered by const is more practically useful than the indication of array'ness offered by [].
For higher level code, use std::vector or std::array, whichever is more appropriate.

The best way to pass an array as a parameter is actually:
void function(type_t* array, int amountOfElts, int arrayAllocatedElts);
For some cases (like when passing strings) you can often skip these two parameters. But they may be useful either way, for example when doing some operations on the string, or they may save you a strlen call or two.
Using the [] option in function arguments is in my opinion confusing and should be avoided. But I don't think it's a convention.
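As a rough sketch of why the two extra parameters can be useful, here is a hypothetical helper (append_value is an illustration, not something from the answer) that takes the pointer, the number of elements in use, and the number of allocated elements. It uses std::size_t rather than int for the counts, which is an assumption on my part.

```cpp
#include <cstddef>

// Hypothetical helper: stores value after the last used element if there is
// spare capacity, and returns the new number of used elements.
std::size_t append_value(int* array, std::size_t used, std::size_t allocated, int value) {
    if (used < allocated) {
        array[used] = value;
        ++used;
    }
    return used; // unchanged when the buffer is already full
}
```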

Depending on the layer of abstraction you want to provide, you should choose between Olaf's approach and the iterator-based style that the STL uses throughout:
template <class Iter>
void function(Iter first, Iter last)
{
// for example, to get size:
std::size_t size = std::distance(first, last);
// for example, to iterate:
for ( Iter it = first; it != last; ++it )
{
// *it to get the value, like:
std::cout << *it << std::endl;
}
}
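That way, you can use the function not only for arrays, but also for various STL containers: vectors, lists, deques, sets and so on. (Container adaptors such as std::queue and std::stack do not expose iterators, so they cannot be used this way.)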
int tab[4] = {1,2,3,4};
function(tab, tab+4);
std::vector<int> vec;
// fill vector with something
function(vec.begin(), vec.end());

Related

Why in particular should I rather pass a std::span than a std::vector& to a function?

I know this might overlap with the question What is a “span” and when should I use one?, but I think the answer to this specific part of the question is pretty confusing. On one hand, there are quotes like this:
Don't use it if you have a standard library container (or a Boost container etc.) which you know is the right fit for your code. It's not intended to supplant any of them.
But in the same answer, this statement occurs:
is the reasonable alternative to passing const vector& to functions when you expect your data to be contiguous in memory. No more getting scolded by high-and-mighty C++ gurus!
So what part am I not getting here? When would I do this:
void foo(const std::vector<int>& vec) {}
And when this?
void foo(std::span<int> sp) {}
Also, would this
void foo(const std::span<int> sp) {}
make any sense? I figured that it shouldn't, because a std::span is just a struct, containing a pointer and the length. But if it doesn't prevent you from changing the values of the std::vector you passed as an argument, how can it replace a const std::vector<T>&?
The equivalent of passing a std::vector<int> const& is not std::span<int> const, but rather std::span<int const>. The span itself being const or not won't really change anything, but more const is certainly good practice.
So when should you use it?
I would say that it entirely depends on the body of the function, which you omitted from your examples.
For example, I would still pass a vector around for this kind of function:
std::vector<int> stored_vec;
void store(std::vector<int> vec) {
stored_vec = std::move(vec);
}
This function does store the vector, so it needs a vector. Here's another example:
void needs_vector(std::vector<int> const&);
void foo(std::vector<int> const& vec) {
needs_vector(vec);
}
As you can see, we need a vector. With a span you would have to create a new vector and therefore allocate.
For this kind of function, I would pass a span:
auto array_sum(std::span<int const> const values) -> int {
auto total = int{0};
for (auto const v : values) {
total += v;
}
return total;
}
As you can see, this function doesn't need a vector.
Even if you need to mutate the values in the range, you can still use span:
void increment(std::span<int> const values) {
for (auto& v : values) {
++v;
}
}
For things like getters, I tend to use a span too, in order not to expose direct references to members of the class:
struct Bar {
auto get_vec() const -> std::span<int const> {
return vec;
}
private:
std::vector<int> vec;
};
Regarding the difference between passing a std::vector& and passing a std::span, I can think of two important things:
std::span allows you to pass only the data you want the function to see or modify, as opposed to the whole vector (and you don't have to pass a start index and an end index). I've found this was much needed to keep code clean. After all, why would you give a function access to any more data than it needs?
std::span can take data from multiple types of containers (e.g. std::array, std::vector, C-style arrays).
This can of course be also done by passing C-style arrays - std::span is just a wrapper around C-style arrays with some added safety and convenience.
Another differentiator between the two: In order to modify size of the owning vector inside your function (via std::vector::assign or std::vector::clear, for instance), you would rather pass a std::vector& than a std::span, since span doesn't provide those features.
You can modify the contents of a std::span, but you can't change its size.

Is it bad practice to template array sizes when calling methods that take in arrays?

I am writing an implementation for a neural network, and I am passing in the number of nodes in each layer into the constructor. Here is my constructor:
class Network {
public:
template<size_t n>
Network(int inputNodes, int (&hiddenNodes)[n], int outputNodes);
};
I am wondering if it is bad practice to use templates to specify array size. Should I be doing something like this instead?
class Network {
public:
Network(int inputNodes, int numHiddenLayers, int* hiddenNodes, int outputNodes);
};
Templates are necessary when you want to write something that uses variable types. You don't need it when you just want to pass a value of a given type. So one argument against using a template for this is to keep things simple.
Another problem with the template approach is that you can only pass in a constant value for the size. You can't write:
size_t n;
std::cin >> n;
Network<n> network(...); // compile error
A third issue with the template approach is that the compiler will have to instantiate a specialization of the function for every size you use. For small values of n, that might give some benefits, because the compiler could optimize each specialization better when it knows the exact value (for example, by unrolling loops), but for large values it will probably not be able to optimize it any better than if it didn't know the size. And having multiple specializations might mean that the instruction cache in your CPU is thrashed more easily, and that your program's binary is larger and thus uses more disk space and memory.
So it likely is much better to pass the size as a variable, or instead of using a size and a pointer to an array, use a (reference to an) STL container, or if you can use C++20, consider using std::span.
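A minimal sketch of the container-based alternative, assuming the constructor only needs to record the layer sizes. The member layerSizes_ and the accessor layerCount are hypothetical, added purely for illustration:

```cpp
#include <cstddef>
#include <vector>

class Network {
public:
    // The number of hidden layers is simply hiddenNodes.size(), known at run time.
    Network(int inputNodes, const std::vector<int>& hiddenNodes, int outputNodes) {
        layerSizes_.push_back(inputNodes);
        layerSizes_.insert(layerSizes_.end(), hiddenNodes.begin(), hiddenNodes.end());
        layerSizes_.push_back(outputNodes);
    }

    // Hypothetical accessor, added only for illustration.
    std::size_t layerCount() const { return layerSizes_.size(); }

private:
    std::vector<int> layerSizes_; // hypothetical member: number of nodes per layer
};
```

With this signature the hidden-layer sizes can come from user input or a file, which the template version does not allow.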
Use std::span<int> or write your own.
struct int_span {
int* b = 0;
int* e = 0;
// iteration:
int* begin() const { return b; }
int* end() const { return e; }
// container-like access:
int& operator[](std::size_t i) const { return begin()[i]; }
std::size_t size() const { return end()-begin(); }
int* data() const { return begin(); }
// implicit constructors from various contiguous buffers:
template<std::size_t N>
int_span( int(&arr)[N] ):int_span( arr, N ) {}
template<std::size_t N>
int_span( std::array<int, N>& arr ):int_span( arr.data(), N ) {}
template<class A>
int_span( std::vector<int, A>& v ):int_span(v.data(), v.size()) {}
// From a pair of pointers, or pointer+length:
int_span( int* s, int* f ):b(s),e(f) {}
int_span( int* s, std::size_t len ):int_span(s, s+len) {}
// special member functions. Copy is enough:
int_span() = default;
// This is a view type; so assignment and copy is copying the selection,
// not the contents:
int_span(int_span const&) = default;
int_span& operator=(int_span const&) = default;
};
There we go: an int_span, which represents a view into a contiguous buffer of ints of some size.
class Network {
public:
Network(int inputNodes, int_span hiddenNodes, int outputNodes);
};
From the way you write the second function argument
int (&hiddenNodes)[n]
I guess you're not an experienced C/C++ programmer. Be careful with this syntax: the size is deduced and checked only because the array is taken by reference. With a plain array parameter such as int hiddenNodes[n], n would be ignored by the compiler and you would lose any possibility of verifying that the size of the C-style array you pass in and the n passed as the template parameter are equal, or at least coherent with each other.
So, forget about templates. Go std::vector<int>.
The only advantage of using a template (or std::array) here is that the compiler might optimize your code better than with std::vector. The chances that you'll be able to exploit it are, however, very small, and even if you succeed, the speedup will most likely be hardly measurable.
The advantage of std::vector is that it is practically as fast and easy to use as std::array, but far more flexible (its size is adjustable at runtime). If you go std::array or templates and you are going to use in your program hidden layers of different sizes, soon you'll have to turn other parts of your program into templates and it is likely that rather than implementing your neural network, you'll find yourself fighting with templates. It's not worth it.
However, when you'll have a working implementation of your NN, based on std::vector, you can THEN consider its optimization, which may include std::array or templates. But I'm 99.999% sure you'll stay with std::vector.
I've never implemented a neural network, but did a lot of time-consuming simulations. The first choice is always std::vector and only if one has some special, well defined requirements for the data container does one use other containers.
Finally, keep in mind that a std::array stores its elements inside the object itself, so a local std::array lives on the stack, whereas std::vector allocates its elements on the heap. The heap is much larger, and in some scenarios this is a crucial factor to consider.
EDIT
In short:
If an array size may vary freely, never pass its value as a template parameter. Use std::vector.
If it can take on 2, 3, 4, perhaps 5 sizes from a fixed set, you CAN consider std::array, but std::vector will most likely be as efficient and the code will be simpler
If the array will always be of the same size known at compile-time, and the limited size of the function stack is not an issue, use std::array.

Passing the pointer to the first element of a vector if any

I have a function taking double*
void funct(double*);
and I have a vector:
std::vector<double> myVec;
how should I correctly and safely call funct(myVec)?
not safe:
funct(&myVec[0]);
funct(&*myVec.begin());
not nice to be read:
funct( myVec.empty()? NULL : &myVec[0]);
funct( myVec.empty()? NULL : &*myVec.begin());
any suggestion?
What's the standard approach?
Well, the standard class type std::vector has a member function called data, which is supposed to return the pointer to the underlying storage. Apparently data() is nothing more than &front() with the guarantee that:
The pointer is such that range [data(); data() + size()) is always a valid range, even if the container is empty.
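Therefore I'd say that:
funct(vector.data());
can always be safely used, while:
funct(&vector.front());
is safe only when the vector is not empty, because calling front() on an empty vector is undefined behaviour.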
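The guarantee quoted above can be illustrated with a short sketch. It assumes, unlike the funct in the question, that the receiving function also takes an element count; sum_range and sum_vector are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical consumer: reads count doubles starting at first.
// With count == 0 the pointer is never dereferenced.
double sum_range(const double* first, std::size_t count) {
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += first[i];
    }
    return total;
}

double sum_vector(const std::vector<double>& v) {
    // data() may be called on an empty vector; the range
    // [data(), data() + size()) is then simply empty.
    return sum_range(v.data(), v.size());
}
```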
But the real question is: what are you trying to do inside the function?
I can see 3 obvious answers to this, and I'm going to propose alternatives for all:
I only want one element of the array
I want an optional argument
I want to pass a container
Let's start with the first, shall we? If you only want an element of the array, why bother with pointers and array in general? You can just use:
void funct(double);
and be done with it. And if you want to modify that double, why not pass it by reference?
void funct(double&);
and call the function as:
funct(vector[0]);
The number two has two very possible answers. One is to use function overloading like this:
void funct();
void funct(double);
That is, provide one overload that takes no argument and one that takes an argument. The simplest solution is probably the right one, correct?
Otherwise, if you are really feeling fancy and you can't be bothered to write funct two times, you can even use boost::optional or std::optional (C++17), which clearly express the intent of the argument:
void funct(std::optional<double> optional) {
if (optional) {
// we have a double
} else {
// we don't have a double
}
}
And finally, the third one has three possible answers (can you see the pattern?).
If you only want a specific kind of container (why would you want that, only God knows) you can simply do:
void funct(const std::vector<double>&);
Otherwise you can either use templates like Bartek explained below or use my favorite solution: iterators (which is the choice of standard algorithms as well, just to make it less official).
And guess what? It works also with C-style arrays (which you shouldn't be using by the way). Here's the solution:
template<class Iterator>
void funct(Iterator begin, Iterator end) {
for (auto it = begin; it != end; ++it) {
// do something to element (*it)
}
}
And BOOM. You can use it like this:
double x[100];
funct(std::begin(x), std::end(x));
or:
std::vector<double> x(100);
funct(x.begin(), x.end());
Happy coding.
std::vector has member function data. So you can use it like this:
func(myVec.data());
I did not find in the Standard that data has to return 0 if the vector is empty. Maybe it is a Standard defect. Besides, an empty vector can have non-zero capacity.
If you need to check whether a vector is empty then you can write:
func(myVec.empty() ? NULL : myVec.data());
Usually, when you pass an array to a function, you should specify a second parameter that contains the size of the array. So maybe it would be better if func was declared as:
void func(double *, std::vector<double>::size_type);
In this case you could call the function as:
func(myVec.data(), myVec.size());
If you need to process only one element then the standard approach is the following:
if (!myVec.empty()) func(&myVec.front());
Try
funct(&myVec.at(0));
This performs bounds checking and will throw std::out_of_range if element is not within the range of the container.
Create a utility function. This will both hide the not-niceness and prevent code duplication.
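template <class T>
// Take T& rather than T&&: with T&&, an lvalue argument deduces T as a
// reference type, and T::value_type would then fail to compile.
typename T::value_type* first_ptr(T& container)
{
return container.empty() ? nullptr : &container.front();
}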
int main()
{
funct(first_ptr(myVec));
}
I would wrap or change the function into more idiomatic optional primitive:
void funct(optional<double&> f);
Let's think about the passing then. The function should, if the vector is not empty, pass the first element, and nothing otherwise.
Directly transcribes to
if (!v.empty()) {
funct(v.front());
} else {
funct(none);
}
I would probably change it to regular value semantics, though; referencing elements from collections directly is rather dangerous.
Of course you can pack it into a reusable function:
template<class Container>
optional<typename Container::value_type&> myFront(Container& cont) {
if (!cont.empty())
return cont.front();
else
return none;
}
funct(myFront(v));
Now you only need lift :).
std::vector data() member function returns a pointer to its internal data (array), so we can use it this way:
if ( !(myVec.size() == 0)) func( myVec.data());
or:
if ( !myVec.empty()) func( myVec.data());
The choice between size() and empty() depends on the implementation of these functions that you are using. The C++ standard guarantees that empty() is constant time for all standard containers.
Not really an answer but anyway: this seems like an XY problem.
I would say none of the solutions proposed in the OP are good fits based on the following: it makes little sense to be calling a function with a null pointer argument so I would say this should be handled at the call site. Something like:
if(!myVec.empty()) funct(&myVec[0]);
else ...

How to get std::set pointer to the raw data?

I want to pass the whole set as an argument to a function, like the way we do for arrays (i.e &array[0]). I am not able to figure out how to get the pointer to the raw data for a set.
It is not possible to do it in the same way as an array because std::set is not required to have its data arranged in a contiguous block of memory. It is a binary tree so it most likely consists of linked nodes. But you can pass it by reference, or use the begin() and end() iterators.
template <typename T>
void foo(const std::set<T>& s);
template <typename Iterator>
void bar(Iterator first, Iterator last);
std::set<int> mySet = ....;
foo(mySet);
bar(mySet.begin(), mySet.end());
You can't get a pointer to the raw data in the same sense as you'd do for an array, because a set doesn't reside in contiguous memory.
I want to pass the whole set as an argument to a function
Pass it by reference. There's no memory overhead (if that's what you were worrying about):
void foo(std::set<int>& x);
You will have to iterate through the std::set to extract all the elements of the std::set.
Unlike std::vector and arrays, there is no requirement imposed by the standard that std::set elements should be located in contiguous memory.
So pass a reference or pointer to the std::set into the function and extract the data inside the function by iterating over it.
It depends what you mean by:
"I want to pass the whole set as an argument to a function"
std::set<int> data;
// fill data;
You can pass the set by reference:
plop(data); // void plop(std::set<int>& data); // passing by reference would be the C++ way
Alternatively you can pass iterators.
This abstracts away the type of container you are using and thus allows the writers of plop() to concentrate on the algorithm. In this case the iterators behave in the same way as pointers (in C++ code).
plop(data.begin(), data.end()); // template<typename I> void plop(I begin, I end);
Alternatively do you mean you want to pass the data in a set to a C like function.
In this case you need to pass a pointer (as that is the only thing C can understand). Unfortunately you can not pass a pointer into a set directly as that has no real meaning. But you can copy the data into a vector and from there into a C program:
std::vector<int> datavec(data.begin(), data.end());
plop(&datavec[0], datavec.size()); // void plop(int* data, std::size_t size);
This works because vector stores the data in contiguous memory.
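Put together, the copy-into-a-vector approach might look like the sketch below. The C-style function sum_ints and the wrapper sum_set are hypothetical stand-ins for whatever C interface you actually need to call.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical C-style function that only understands pointer + size.
int sum_ints(const int* data, std::size_t size) {
    int total = 0;
    for (std::size_t i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

int sum_set(const std::set<int>& s) {
    // Copy the elements into contiguous storage first.
    std::vector<int> buffer(s.begin(), s.end());
    return sum_ints(buffer.data(), buffer.size());
}
```

The copy costs one allocation and a pass over the set, so it is only worth doing when the receiving function really requires a raw pointer.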

How to initialize std::vector from C-style array?

What is the cheapest way to initialize a std::vector from a C-style array?
Example: In the following class, I have a vector, but due to outside restrictions, the data will be passed in as C-style array:
class Foo {
std::vector<double> w_;
public:
void set_data(double* w, int len){
// how to cheaply initialize the std::vector?
}
};
Obviously, I can call w_.resize() and then loop over the elements, or call std::copy(). Are there any better methods?
Don't forget that you can treat pointers as iterators:
w_.assign(w, w + len);
You use the word initialize so it's unclear if this is one-time assignment or can happen multiple times.
If you just need a one time initialization, you can put it in the constructor and use the two iterator vector constructor:
Foo::Foo(double* w, int len) : w_(w, w + len) { }
Otherwise use assign as previously suggested:
void set_data(double* w, int len)
{
w_.assign(w, w + len);
}
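For reference, a complete version of the class with both variants could look like this sketch. The accessors size and at are hypothetical additions for illustration, and std::size_t is used for the length instead of the int in the question.

```cpp
#include <cstddef>
#include <vector>

class Foo {
    std::vector<double> w_;
public:
    Foo() = default;

    // One-time initialization: the iterator-pair constructor copies [w, w + len).
    Foo(const double* w, std::size_t len) : w_(w, w + len) {}

    // Repeated assignment: assign replaces the current contents.
    void set_data(const double* w, std::size_t len) {
        w_.assign(w, w + len);
    }

    // Hypothetical accessors, added only for illustration.
    std::size_t size() const { return w_.size(); }
    double at(std::size_t i) const { return w_.at(i); }
};
```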
Well, Pavel was close, but there's even a more simple and elegant solution to initialize a sequential container from a c style array.
In your case:
w_ (array, std::end(array))
array will get us a pointer to the beginning of the array (didn't catch its name),
std::end(array) will get us an iterator to the end of the array.
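A short sketch of this approach, with a hypothetical function make_weights and made-up values. Note that std::begin and std::end need a real array whose size is known to the compiler; they will not compile for a plain double* such as the w parameter in the question.

```cpp
#include <iterator>
#include <vector>

std::vector<double> make_weights() {
    // A real array: its size is part of its type, so std::end can find the end.
    const double source[] = {0.5, 1.5, 2.5};
    return std::vector<double>(std::begin(source), std::end(source));
}
```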
The quick generic answer:
std::vector<double> vec(carray,carray+carray_size);
or question specific:
std::vector<double> w_(w,w+len);
based on above: Don't forget that you can treat pointers as iterators
You can 'learn' the size of the array automatically:
template<typename T, size_t N>
void set_data(const T (&w)[N]){
w_.assign(w, w+N);
}
Hopefully, you can change the interface to set_data as above. It still accepts a C-style array as its first argument. It just happens to take it by reference.
How it works
[ Update: See here for a more comprehensive discussion on learning the size ]
Here is a more general solution:
template<typename T, size_t N>
void copy_from_array(vector<T> &target_vector, const T (&source_array)[N]) {
target_vector.assign(source_array, source_array+N);
}
This works because the array is being passed as a reference-to-an-array. In C/C++, you cannot pass an array to a function by value; instead it decays to a pointer and you lose the size. But in C++, you can pass a reference to the array.
Passing an array by reference requires the types to match up exactly. The size of an array is part of its type. This means we can use the template parameter N to learn the size for us.
It might be even simpler to have this function which returns a vector. With appropriate compiler optimizations in effect, this should be faster than it looks.
template<typename T, size_t N>
vector<T> convert_array_to_vector(const T (&source_array)[N]) {
return vector<T>(source_array, source_array+N);
}
std::vector<double>::assign is the way to go, because it's little code. But how does it work, actually? Doesn't it resize and then copy? In the MS implementation of the STL I am using, it does exactly that.
I'm afraid there's no faster way to implement (re)initializing your std::vector.