Convert const float* to std::array<float, ...> - c++

I have a const float* pointing to a huge array, and would like to be able to access the elements through a std::array.
What is the best way to do this? Without copying the elements if possible.
Thanks!

In order to use std::array, you need to know the size of the array at compile time. You then create an array of that size and use std::copy to copy the elements into it.
If the code which uses your const float* only knows the size at runtime, then you cannot use std::array and have to use std::vector instead. std::vector has a constructor to which you can pass pointers to the beginning and end of the range to copy.
Note that in both cases, the container owns a copy of the original elements.
Without copying the elements if possible.
No, that is not possible. C++ standard containers are designed to own their contents, not just representing a view into them.
Here is an example to illustrate the difference:
#include <algorithm>
#include <array>
#include <vector>

#define SIZE 10 // let's assume some C or legacy code which uses macros

void f(const float* arr)
{
    // size is known at compile time
    std::array<float, SIZE> a;
    std::copy(arr, arr + SIZE, std::begin(a));
}

void g(const float* arr, int size)
{
    // size is only known at runtime
    std::vector<float> v(arr, arr + size);
}

There are at least two proposed additions to the standard library that do something like this. One is std::experimental::to_array<T, N>(), which makes a deep copy of the data into a std::array. That's wasteful in terms of both time and memory, but can be useful if you really do want a copy, especially a const copy which you could not otherwise create and then fill in.
If what you want is a container-like interface to a range of data described by an arbitrary pointer and element count, the Guideline Support Library offers a span template. Rather than passing the pointer and the count around separately, I recommend wrapping them in a lightweight object like that.
Since you said you can change the interface to the constructor, I strongly suggest you do so. If you want to keep the version that takes a std::array, you can, and have it delegate to the version that takes an arbitrary range of data, such as:
#include <algorithm>
#include <array>
#include <iterator>
#include <gsl/span> // span from the Guideline Support Library

using gsl::span;

MyClass::MyClass( const span<value_type>& s /*, more, params */ )
{
    /* In this example, m_storage is some private container and span has a
     * container interface.
     */
    auto it = std::back_inserter(m_storage);
    std::copy_n( s.begin(), s.size(), it );
    // ...
}

MyClass::MyClass( std::array<value_type, initializer_size>& a /*, more, params */ )
    : MyClass( span<value_type>( a.data(), a.size() ) /*, more, params */ )
{}

std::array cannot take ownership of a pointer; you have to copy the elements. Use std::copy, which works on iterators in general, and pointers qualify as iterators.
const float *source = ...; // C array of at least size N
std::array<float, N> destination; // N must be a compile-time constant
std::copy(source, source + N, destination.begin());
An aside: without copying the elements, the element type of any hypothetical container that took ownership of the pointer would have to be const float as well. Alternatively, you could use const_cast<> if you were sure the pointed-to data is not actually const, but please reserve const_cast<> for cases where it's absolutely unavoidable due to a flaw in an external API. :)

Related

Is it bad practice to template array sizes when calling methods that take in arrays?

I am writing an implementation for a neural network, and I am passing in the number of nodes in each layer into the constructor. Here is my constructor:
class Network {
public:
template<size_t n>
Network(int inputNodes, int (&hiddenNodes)[n], int outputNodes);
};
I am wondering if it is bad practice to use templates to specify array size. Should I be doing something like this instead?
class Network {
public:
Network(int inputNodes, int numHiddenLayers, int* hiddenNodes, int outputNodes);
};
Templates are necessary when you want to write something that works with arbitrary types. You don't need one when you just want to pass a value of a given type, so one argument against using a template here is to keep things simple.
Another problem with the template approach is that the size must be a compile-time constant. You can't write:
size_t n;
std::cin >> n;
Network<n> network(...); // compile error: n is not a compile-time constant
A third issue with the template approach is that the compiler has to instantiate a separate specialization for every size you use. For small values of n that might even help, because the compiler can optimize each specialization for the exact value (for example, by unrolling loops); but for large values it will probably not generate better code than if it didn't know the size. Meanwhile, the extra specializations mean your CPU's instruction cache is thrashed more easily, and your program's binary is larger, using more disk space and memory.
So it is likely much better to pass the size as a variable or, instead of a size and a pointer to an array, pass a (reference to an) STL container; if you can use C++20, consider std::span.
Use std::span<int> or write your own.
#include <array>   // for the std::array constructor
#include <cstddef> // std::size_t
#include <vector>  // for the std::vector constructor

struct int_span {
    int* b = nullptr;
    int* e = nullptr;

    // iteration:
    int* begin() const { return b; }
    int* end() const { return e; }

    // container-like access:
    int& operator[](std::size_t i) const { return begin()[i]; }
    std::size_t size() const { return end() - begin(); }
    int* data() const { return begin(); }

    // implicit constructors from various contiguous buffers:
    template<std::size_t N>
    int_span( int(&arr)[N] ) : int_span( arr, N ) {}
    template<std::size_t N>
    int_span( std::array<int, N>& arr ) : int_span( arr.data(), N ) {}
    template<class A>
    int_span( std::vector<int, A>& v ) : int_span( v.data(), v.size() ) {}

    // from a pair of pointers, or pointer + length:
    int_span( int* s, int* f ) : b(s), e(f) {}
    int_span( int* s, std::size_t len ) : int_span( s, s + len ) {}

    // special member functions; copying is enough. This is a view type,
    // so assignment and copy duplicate the selection, not the contents:
    int_span() = default;
    int_span(int_span const&) = default;
    int_span& operator=(int_span const&) = default;
};
There we go: an int_span which represents a view into a contiguous buffer of ints of some size.
class Network {
public:
Network(int inputNodes, int_span hiddenNodes, int outputNodes);
};
From the way you wrote the second function argument
int (&hiddenNodes)[n]
I guess you're not an experienced C/C++ programmer. To be fair, with a reference-to-array parameter the compiler does deduce n and verify that the array you pass really has that size; but the syntax is easy to get wrong, and the similar-looking int hiddenNodes[n] silently decays to a pointer and loses the size information altogether.
So, forget about templates. Go std::vector<int>.
The only advantage of using a template (or std::array) here is that the compiler might optimize your code better than with std::vector. The chances that you'll be able to exploit this are very small, however, and even if you succeed, the speedup will most likely be barely measurable.
The advantage of std::vector is that it is practically as fast and easy to use as std::array, but far more flexible: its size is adjustable at runtime. If you go with std::array or templates and your program uses hidden layers of different sizes, soon you'll have to turn other parts of your program into templates as well, and it is likely that rather than implementing your neural network, you'll find yourself fighting with templates. It's not worth it.
However, when you'll have a working implementation of your NN, based on std::vector, you can THEN consider its optimization, which may include std::array or templates. But I'm 99.999% sure you'll stay with std::vector.
I've never implemented a neural network, but did a lot of time-consuming simulations. The first choice is always std::vector and only if one has some special, well defined requirements for the data container does one use other containers.
Finally, keep in mind that a std::array stores its elements inside the object itself, which for a local variable means on the stack, whereas std::vector allocates its elements on the heap. The heap is much larger, and in some scenarios this is a crucial factor to consider.
EDIT
In short:
- If the array size may vary freely, never pass it as a template parameter; use std::vector.
- If it can take on 2, 3, 4, perhaps 5 sizes from a fixed set, you CAN consider std::array, but std::vector will most likely be just as efficient and the code will be simpler.
- If the array will always have the same size, known at compile time, and the limited size of the function stack is not an issue, use std::array.

Is there an advantage by passing std::array on the stack for API call

I was reading about the key differences between std::array and C-style arrays, and came to know that one of them is that a C-style array is passed to an API as a pointer, while a std::array is passed as a copy. One of the blogs mentioned that as an advantage, but I don't see it that way. What are the key advantages of using std::array over a C-style array? My exploration and understanding suggest that almost everything can be done with C-style arrays; they can even be passed to STL algorithms. So I don't see any key advantage of using std::array. Is that actually so?
It's true that everything you can do with std::array could be done with a C array. In fact, std::array simply wraps a fixed-size C array internally. However, std::array is often more convenient.
Being able to pass an array by value is an advantage. To do this with a C array, you'd have to pass a pointer and a size and then create a local copy yourself. With std::array you can avoid that and choose what better suits your needs:
void takeArrayByValue(std::array<int, 5> arr)
{
    arr[0] = newValue; // arr is a local copy; the caller does not see this
}

void takeArrayByReference(std::array<int, 5>& arr)
{
    arr[0] = newValue; // the variable passed as argument is modified
}

// compare with this:
void takeCArrayAndMakeLocalCopy(const int (&arr)[5])
{
    int localArr[5];
    std::copy(std::begin(arr), std::end(arr), localArr);
    // do something with localArr
}
Another thing: it's easier to misuse C arrays:
void takeCArray(int arr[5]);

int arr[3];
takeCArray(arr); // compiles
In this example, takeCArray really takes a pointer: the 5 in the parameter declaration is ignored. It should have been takeCArray(int (&arr)[5]), but the compiler won't complain about the mismatch, so the bug can stay unnoticed at first. This can't happen with std::array:
void takeStdArray(std::array<int, 5>& arr);

std::array<int, 3> arr;
takeStdArray(arr); // compiler error!

How to use std::move() to move char[] to a vector<char>

I am trying to concatenate two char arrays, char a[9000] and char b[9000], into a container std::vector<char>. Since the arrays are not small, I am trying to use std::move():
class obj
{
public:
    obj& operator<<(char*&& ptr)
    {
        // ???
        ptr = nullptr;
        return *this;
    }
private:
    std::vector<char> m_obj;
};
So in the main function
int main()
{
    obj o;
    enum {SIZE = 9000};
    char a[SIZE]; memset(a, 1, SIZE);
    char b[SIZE]; memset(b, 2, SIZE);
    o << a << b;
    return 0;
}
My question is what is the correct way to move a char*?
Environment:
Ubuntu 16.04.4 LTS
g++ 5.4.0
tl;dr
Call reserve() and copy.
What's wrong with moving?
Several things. First, the expectation you seem to have about std::move is wrong: std::move does not magically move anything (in fact, it does not move anything at all). What it does is cast its argument to an rvalue reference, which may enable moving under some conditions.
Second, beginning with C++03, std::vector is guaranteed to have its elements laid out contiguously in memory (before C++03 this was the case in practice anyway, but no explicit guarantee was given). If you use std::move, you are necessarily on a version later than C++03, so your std::vector necessarily lays out its elements contiguously.
This means that, except for the pure coincidence that the two arrays happen to be adjacent in memory, there is no way of getting the elements of two arrays into one vector without copying. It's simply not possible to move them, because at least one array's contents must be relocated (= copied).
Further, I couldn't imagine how moving elements from an array the way you intend to do it would be possible in the first place. Moving assumes that you take over ownership, which usually means you modify the moved-from object in some way so that it is still valid, but without whatever resources it owned previously.
How do you do that with an array that has automatic storage duration in a surrounding scope? What is dereferencing the array after the move (which is possible and legitimate) going to do then?
Also, you want to decay an array to a pointer (that's perfectly legal) and then somehow magically move it. That, however, would not move the contents of the array! The best you can do is move the pointer itself, which makes no sense, since moving a pointer is just the same as copying it.
There is no way you're getting around having to copy the array elements, but you can save one memory allocation and one needless copy by properly calling reserve() ahead of time.
Well, char* isn't exactly an object you want to move, so getting an rvalue reference to the pointer won't help. Why not try the array-oriented way of moving instead and use the std::move algorithm from <algorithm>?
http://en.cppreference.com/w/cpp/algorithm/move
It'd look something like this (note that the vector must grow rather than be overwritten, so that calling it twice concatenates the two arrays; for char, moving an element is the same as copying it):
obj& obj::addChars(char* ptr, std::size_t size)
{
    const std::size_t old_size = m_obj.size();
    m_obj.resize(old_size + size); // grow; don't clobber earlier contents
    std::move(ptr, ptr + size, m_obj.begin() + old_size);
    return *this;
}
The following works because a vector is contiguous memory. It is necessary to resize the vector to hold the number of elements being copied. &vc[0] is the address of the vector's first element, and &vc[SIZE] is the address of the element at index SIZE, which is where the copy of array b should start. This will execute much faster than iterating through the arrays with push_back or assigning the elements one by one.
enum {SIZE = 9000};

char a[SIZE];
memset(a, 1, SIZE);
char b[SIZE];
memset(b, 2, SIZE);

// make a vector of characters
std::vector<char> vc;

// resize to hold both arrays
vc.resize(SIZE * 2);

// copy array a to the beginning of the vector
memcpy(&vc[0], a, SIZE);

// copy array b right after it
memcpy(&vc[SIZE], b, SIZE);
edit:
std::vector<char> vc(SIZE * 2);
is equivalent to the create-then-resize above: it constructs the vector with the specified size up front, so the separate resize is not necessary.
Even if you allocated the arrays using new, there is no way to make a vector adopt them; a copy will take place. The recommended approach is std::copy, which in most implementations will be recognized and optimized into the equivalent of a std::memcpy().
#include <algorithm>
#include <cstring>
#include <iostream>
#include <vector>

#define SIZE 90

int main(void) {
    std::vector<char> v;
    v.resize(2 * SIZE);

    char a[SIZE];
    char b[SIZE];
    std::memset(a, 'a', SIZE);
    std::memset(b, 'b', SIZE);

    std::copy(a, a + SIZE, v.begin());
    std::copy(b, b + SIZE, v.begin() + SIZE);

    for (char c : v) {
        std::cout << c;
    }
    std::cout << std::endl;
    return 0;
}
If it isn't so optimized (unlikely), use a std::memcpy() directly.
How you implement that in your class is another debate. You could implement an add(char *start, char *end) member.
As others point out, 9000 chars isn't huge, and you might be over-engineering if you do much more.
A further extension would be to resize() or reserve() the needed size in the vector up front, avoiding an otherwise inevitable reallocation during the second copy.

Standard way of std::vector<double> borrowing the contents of a double[]

Is there a C++ (preferably C++11) standard-compliant idiom which I can employ to allow a std::vector<double> to borrow the contents of a double[] of known size?
I have a function (actually a functor masquerading as a callback from an optimiser) with prototype:
double MyFunctorClass::operator()(double s[]) const;
(MyFunctorClass also has m_size which reveals the number of elements of s).
I want to call a function that takes a const std::vector<double>& as an input.
One solution technique involves my creating a std::vector<double> member variable and somehow switching the double[] data into the data area of that std::vector, call the function, then switch it back to the caller. I'd rather not copy due to performance concerns: it is the objective function. Any ideas?
No, you cannot do that.
std::vector allocates space for its stored content on the heap (and owns it), so you cannot force it to use your own memory. By 'your own memory' I mean memory with valid content which is preserved and never touched by the container unless you explicitly say so. Of course, you can define your own memory-allocation policy by supplying a custom 'allocator' parameter, but that is not a solution in this case.
Change your function to accept templated begin and end iterators:

instead of

void foo( std::vector<double> vd )

use

template<typename Iter> void foo( Iter begin, Iter end )

This will let you pass in any standard container or plain pointers. Iterate like so:

while( begin != end ) {
    /*const?*/ double& value = *begin;
    // whatever you were going to do
    ++begin;
}
If the double[] you want to temporarily store into the vector has no particular allocation constraints, you can use a second std::vector<double> and use std::swap(vec1, vec2) for quick elements exchange. Then to obtain a double[] from any of the vectors just do &vec[0].
Note that I don't think swapping is guaranteed to preserve the addresses of the internal arrays (although I cannot think of an implementation where it doesn't). Edit: it actually is guaranteed (see comments).

How to initialize std::vector from C-style array?

What is the cheapest way to initialize a std::vector from a C-style array?
Example: In the following class, I have a vector, but due to outside restrictions, the data will be passed in as C-style array:
class Foo {
    std::vector<double> w_;
public:
    void set_data(double* w, int len){
        // how to cheaply initialize the std::vector?
    }
};
Obviously, I can call w_.resize() and then loop over the elements, or call std::copy(). Are there any better methods?
Don't forget that you can treat pointers as iterators:
w_.assign(w, w + len);
You use the word initialize so it's unclear if this is one-time assignment or can happen multiple times.
If you just need a one time initialization, you can put it in the constructor and use the two iterator vector constructor:
Foo::Foo(double* w, int len) : w_(w, w + len) { }
Otherwise use assign as previously suggested:
void set_data(double* w, int len)
{
w_.assign(w, w + len);
}
Well, Pavel was close, but there's an even simpler and more elegant solution for initializing a sequential container from a C-style array.
In your case:
w_ (array, std::end(array))
in the constructor's initializer list: array (I didn't catch its real name) decays to a pointer to the beginning of the array, and std::end(array) gives us an iterator to the end of the array.
The quick generic answer:
std::vector<double> vec(carray, carray + carray_size);
or, specific to the question:
std::vector<double> w_(w, w + len);
Based on the answer above: don't forget that you can treat pointers as iterators.
You can 'learn' the size of the array automatically:
template<typename T, size_t N>
void set_data(const T (&w)[N]){
    w_.assign(w, w + N);
}
Hopefully you can change the interface to set_data as above. It still accepts a C-style array as its first argument; it just happens to take it by reference.
How it works
[ Update: See here for a more comprehensive discussion on learning the size ]
Here is a more general solution:
template<typename T, size_t N>
void copy_from_array(std::vector<T>& target_vector, const T (&source_array)[N]) {
    target_vector.assign(source_array, source_array + N);
}
This works because the array is passed as a reference-to-array. In C/C++, you cannot pass an array to a function by value; instead it decays to a pointer, and you lose the size. But in C++, you can pass a reference to the array.
Passing an array by reference requires the types to match exactly, and the size of an array is part of its type. This means we can let the template parameter N learn the size for us.
It might be even simpler to have a function which returns the vector. With appropriate compiler optimizations in effect, this should be faster than it looks.
template<typename T, size_t N>
std::vector<T> convert_array_to_vector(const T (&source_array)[N]) {
    return std::vector<T>(source_array, source_array + N);
}
std::vector<double>::assign is the way to go, because it's little code. But how does it actually work? Doesn't it just resize and then copy? In the MS implementation of the STL I am using, it does exactly that.
I'm afraid there's no faster way to implement (re)initializing your std::vector.