C-style array vs std::array for library interface - c++

I want to write a library with an interface that provides a read function.
C-style arrays are error-prone but allow passing a buffer of any size.
C++ arrays are safer but must be declared with a fixed size.
// interface.h
// C-style array
int read (std::uint8_t* buf, size_t len);
// C++ array
int read (std::array<std::uint8_t, 16>& buff);
How can I have the best of both worlds?
I was thinking about a function template, but it does not seem practical for a library interface.
template <size_t N>
int read (std::array<std::uint8_t, N>& buf);
EDIT
std::vector could be a good candidate, but unlike char* and std::array it involves dynamic allocation.
EDIT I like the solution with gsl::span a lot. I am stuck with C++14, so no std::span. I don't know whether using a third-party library (GSL) will be an issue or even allowed.
EDIT I did not think that using char rather than another type could influence the answer, so to be clearer: the goal is to manipulate bytes. I changed char to std::uint8_t.
EDIT Since C++11 guarantees that a returned std::vector is moved rather than copied, returning std::vector<std::uint8_t> is acceptable.
std::vector<std::uint8_t> read();

You could do what the standard library does: Use a pair of iterators.
template <typename Iter> int read(Iter begin, Iter end)
{
// Some static asserts to make sure `Iter` is actually a proper iterator type
}
It gives you the best of both worlds: slightly better safety and the ability to read into an arbitrary part of a buffer. It also allows you to read into non-contiguous containers.
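A minimal sketch of the iterator-pair idea, assuming C++14. The `mylib` namespace, the placeholder payload (0, 1, 2, ...) and the choice to return the number of elements written are assumptions made for illustration, not part of the answer above.

```cpp
#include <cstdint>
#include <iterator>
#include <type_traits>

namespace mylib {  // hypothetical namespace; also avoids clashing with POSIX read()

template <typename Iter>
int read(Iter begin, Iter end)
{
    // Reject anything that is not at least a forward iterator.
    using category = typename std::iterator_traits<Iter>::iterator_category;
    static_assert(std::is_base_of<std::forward_iterator_tag, category>::value,
                  "Iter must be at least a forward iterator");
    // Reject iterators whose elements cannot receive a byte.
    static_assert(std::is_assignable<decltype(*begin), std::uint8_t>::value,
                  "a std::uint8_t must be assignable through Iter");

    int count = 0;
    for (; begin != end; ++begin, ++count) {
        *begin = static_cast<std::uint8_t>(count);  // placeholder payload, not real I/O
    }
    return count;  // number of elements written
}

}  // namespace mylib
```

Because the range is given explicitly, a caller can fill only part of a buffer, for example `mylib::read(raw + 1, raw + 3)` on a four-element array.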

How can I have the best of both worlds?
By using std::vector:
Like std::array: it is safer than C arrays.
Like C arrays: it allows you to work with functions that must be able to take an array of arbitrary size.
EDIT: std::vector does not necessarily imply dynamic allocation (as in dynamic storage duration). That depends on the allocator used. You can still provide a user-specified stack allocator.

I will go against the grain and say that for a read-type function, taking a void* pointer and a size is likely the best option. This is the approach taken by unformatted read functions everywhere.
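A small sketch of what that pointer-plus-size shape could look like, in the spirit of fread or POSIX read. The `mylib` namespace and the fixed four-byte `source` array are hypothetical stand-ins for a real device or file.

```cpp
#include <cstddef>
#include <cstring>

namespace mylib {  // hypothetical namespace

// Copies at most len bytes into buf and returns how many were copied.
inline std::size_t read(void* buf, std::size_t len)
{
    // Assumed data source, standing in for a device or file.
    static const unsigned char source[] = {0xDE, 0xAD, 0xBE, 0xEF};
    const std::size_t n = len < sizeof(source) ? len : sizeof(source);
    std::memcpy(buf, source, n);
    return n;
}

}  // namespace mylib
```

Any contiguous container can be passed as `mylib::read(c.data(), c.size() * sizeof(c[0]))`, which is what makes this form usable from C arrays, std::array and std::vector alike, at the cost of the caller having to get the size right.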

Why don't you use a gsl::span, which was meant for the purpose of eliminating pointer and length parameter pairs for a sequence of contiguous objects? Something like this would work:
int read(gsl::span<uint8_t> buf)
{
    for (auto& elem : buf)
    {
        // Do whatever with elem
    }
    return static_cast<int>(buf.size()); // placeholder: report how many bytes were handled
}
The only problem is that, unfortunately, gsl::span is not part of the C++ standard (it might arrive as std::span in C++20), and using it requires a library such as GSL-lite.
Here are more details about span, from Herb Sutter.
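If pulling in GSL is not an option on C++14, a span-like view can be sketched by hand. The `byte_span` class below is a hypothetical, much-reduced stand-in for gsl::span<std::uint8_t> (no bounds checking, no subspans), and the placeholder fill pattern is an assumption, so treat it as an illustration of the idea rather than a replacement for the real thing.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

namespace mylib {  // hypothetical namespace

// Hypothetical non-owning view over a contiguous run of bytes.
class byte_span {
public:
    byte_span(std::uint8_t* data, std::size_t size) : data_(data), size_(size) {}

    // Implicit conversions so callers can pass their container directly.
    template <std::size_t N>
    byte_span(std::uint8_t (&arr)[N]) : data_(arr), size_(N) {}

    template <std::size_t N>
    byte_span(std::array<std::uint8_t, N>& arr) : data_(arr.data()), size_(N) {}

    byte_span(std::vector<std::uint8_t>& vec) : data_(vec.data()), size_(vec.size()) {}

    std::uint8_t* begin() const { return data_; }
    std::uint8_t* end() const { return data_ + size_; }
    std::size_t size() const { return size_; }

private:
    std::uint8_t* data_;
    std::size_t size_;
};

// Non-template entry point: fills the buffer with a placeholder pattern
// (0, 1, 2, ...) and returns the number of bytes written.
inline int read(byte_span buf)
{
    std::uint8_t value = 0;
    for (auto& elem : buf) {
        elem = value++;
    }
    return static_cast<int>(buf.size());
}

}  // namespace mylib
```

The library function itself is not a template, so it can live in a compiled translation unit; only the small view class has to be in the header.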

Do you really care about the type of the underlying container ?
template<typename Iterator>
int read_n(Iterator begin, size_t len);
Assuming this function returns the number of elements read, I would change the return type to size_t as well.
char *dyn = new char[20];
char stat[20];
std::vector<char> vec(20);
read_n(dyn, 20);
read_n(stat, 20);
read_n(vec.begin(), 20);

I think when designing a library's interface you need to take into consideration where it will be used.
A library with a C interface using "char *" can be used from a wide variety of languages (C, C++ and others). Using std::array limits your library's potential clients.
The third possible variant:
struct buf_t allocBuf();
int rmBuf( struct buf_t b );
int read( struct buf_t b );
char * bufData( struct buf_t b );
size_t bufSize( struct buf_t b );
Surely it can be rewritten in C++ in a more elegant way.

You can make a wrapper function template that delegates to the C-interface function:
int read(std::uint8_t* buf, size_t len);
template <size_t N>
int read(std::array<std::uint8_t, N>& buf)
{
return read(buf.data(), buf.size());
}
I've found such constructs useful when I need to expose something over a C ABI but don't want to lose some of the comforts that C++ gives me: the template function is compiled as part of the library client's code and doesn't need to be C ABI compatible, while the function it calls is C ABI compatible.

Just return a std::vector<uint8_t>, unless this is a DLL, in which case go with a C style interface.
Note: answer changed from std::string to std::vector after change of question from char to uint8_t.

Related

Nice syntax to get sized reference to vector's/array's data?

I'm wondering if there's any std:: function to get a sized pointer/reference to a vector/array's underlying data? Something better than:
const size_t(&asArray1)[N] = *(size_t(*)[N]) vec.data();
const size_t(&asArray2)[arr.size()] = *(size_t(*)[arr.size()]) arr.data();
Clarification - something I could pass to the below:
template<size_t N>
void foo(size_t(&sizedArray)[N]) {}
Update -- SOLUTION:
Use helper functions defined once, that do the appropriate casting and leave the call-site cleaner... See my answer below for helper code.
Live demo: https://onlinegdb.com/S167RI20U
What you're asking for is to decide a type (which must be done at compile time) with information only available at runtime. It is impossible.
C++20 std::span encapsulates ugly syntax in its own non-explicit constructor, so just pass it.
Before C++20, there are std::begin, std::end, and std::size, which are not exactly what you need, but may help.
All of this can be implemented with an old compiler; it does not require any special compiler support.
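As a rough illustration of how std::begin and std::end can help before C++20, the sketch below accepts both a raw array and a std::vector through one template. The `demo` namespace and the `sum` function are hypothetical names chosen for this example.

```cpp
#include <cstddef>
#include <iterator>
#include <numeric>
#include <vector>

namespace demo {  // hypothetical namespace

// Works for raw arrays and standard containers alike, because
// std::begin/std::end are overloaded for both.
template <typename Range>
std::size_t sum(const Range& range)
{
    return std::accumulate(std::begin(range), std::end(range), std::size_t{0});
}

}  // namespace demo
```

This does not give a sized array reference, but it removes the need for one in algorithms that only have to walk the elements.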
SOLUTION: Use helper functions defined once, that do the appropriate casting and leave the call-site cleaner... I'm surprised that helpers like these aren't available at least for std::array...
For vectors, I will admit that it's a rare use case where you have a compile-time known size for the vector, but I do have a use case where I know my vector contains at least as many elements as an array, and I want the first N of them to be processed through some algorithm. I can guarantee that that many elements are available, and include asserts to boot, etc...
Live demo: https://onlinegdb.com/S167RI20U
template<size_t N, typename T>
using CArrayPtr = T(*)[N];
template<size_t N, typename T>
auto& cArray(std::array<T, N>& arr) {
    return *(CArrayPtr<N, T>)arr.data();
}
// N cannot be deduced here: call as cArray<N>(vec), and make sure vec.size() >= N
template<size_t N, typename T>
auto& cArray(std::vector<T>& vec) {
    return *(CArrayPtr<N, T>)vec.data();
}

Passing std::array to a function that accepts C style array by address [duplicate]

What's the canonical way to get the reference to std::array's underlying raw (C) array?
The data() method returns just a raw pointer, which makes it unsuitable e.g. for passing into functions which accept a reference to a raw array of a known size.
Also, is there a good reason why data() returns a raw pointer, and not a reference to the underlying raw array, or is this just an oversight?
What's the canonical way to get an std::array's underlying raw (C)
array?
There is no way of getting the underlying C array.
Also, is there a good reason why data() returns a raw pointer, and not
a reference to the underlying raw array, or is this just an oversight?
It's backwards: there is no good reason for the std::array to provide the underlying C array. As you already said, the C array would be useful (over the raw pointer) only with functions getting a reference to C arrays.
When was the last time you had a function:
void foo(int (&arr)[5])
Me? Never. I never saw a function with a C array reference parameter with the exception of getting the size of array (and rejecting pointers):
template <class T, std::size_t N>
auto safe_array_size(T (&)[N]) { return N; }
Let's dive a little into why parameters references to arrays are not used.
For starters, in the C era a pointer with a separate size parameter was the only way to pass arrays around, due to array-to-pointer decay and the lack of a reference type.
In C++ there are alternatives to C arrays, like std::vector and std::array. But even when you have a (legacy) C array you have 2 situations:
if you pass it to a C function you don't have the option of a reference, so you are stuck with pointer + size
when you want to pass it to a C++ function, the idiomatic C++ way is to pass begin + end pointers.
First of all, a begin + end iterator pair is generic: it accepts any kind of container. But it is not uncommon to see a reference to std::vector when you want to avoid templates, so why not a reference to a C array if you have one? Because of a big drawback: you have to know the size of the array:
void foo(int (&arr)[5])
which is extremely limiting.
To get around this you need to make it a template:
template <std::size_t N>
void foo(int (&arr)[N])
which defeats the purpose of avoiding templates, so you are better off with begin + end template iterators instead.
In some cases (e.g. math calculations on just 2 or 3 values which have
the same semantics, so they shouldn't be separate parameters) a
specific array size is called for, and making the function generic
wouldn't make sense. In those cases, specifying the size of the array
guarantees safety since it only allows passing in an array of the
correct size at compile-time; therefore it's advantageous and isn't a
"big drawback"
One of the beauties of (C and) C++ is the enormous scope of applicability. So yes, you will always find some fields that use or need a certain unique feature in an unique way. That being said, even in your example I would still shy away from arrays. When you have a fixed number of values that shouldn't be semantically separated I think a structure would be the correct choice over arrays most of the time (e.g. glm::mat4 instead of float[4]).
But let's not forget what std::array is: a modern replacement for C arrays. One thing I learned when analyzing options is that there is no absolute "better than". There is always an "it depends". But not in this case: std::array should unquestionably replace C arrays in interfaces. So in the rare case where a fixed-size container is needed as a reference parameter, it doesn't make sense to encourage the use of C arrays when you already have std::array. The only valid case where exposing the underlying C array of std::array is needed is for some old libraries that have C array reference parameters. But I think that, in the bigger picture, adding this to the interface is not justified. New code should use a struct (btw, std::tuple is getting easier and easier to use with each standard) or std::array.
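To make the struct-versus-array point concrete, here is a small sketch with a hypothetical `Vec3` type and `dot` function (both names are assumptions). The two overloads compute the same thing; the struct version names its components and cannot be confused with any other three-float array.

```cpp
namespace demo {  // hypothetical namespace

// A fixed number of values with shared semantics, expressed as a struct.
struct Vec3 {
    float x;
    float y;
    float z;
};

inline float dot(const Vec3& a, const Vec3& b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// The array-reference equivalent: the size is checked at compile time,
// but any float[3] is accepted, whatever it actually represents.
inline float dot(const float (&a)[3], const float (&b)[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

}  // namespace demo
```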
AFAIK, there's no direct or typesafe way to do it, but one workaround if you need to pass it to a function (with a signature you cannot change to std::array) is to use reinterpret_cast like this:
some_function(*reinterpret_cast<int (*)[myarr.size()]>(myarr.data()));
If you wanted to make it safer:
#include <array>
void passarray(int (&myarr)[5]){}
template <typename ValueT, std::size_t size>
using CArray = ValueT[size];
template <typename ValueT, std::size_t size>
CArray<ValueT, size> & c_array_cast(std::array<ValueT, size> & arg)
{
return *reinterpret_cast<CArray<ValueT,size>*>(arg.data());
}
int main()
{
std::array<int,5> myarr = { {1,2,3,4,5} };
passarray(*reinterpret_cast<int (*)[myarr.size()]>(myarr.data()));
passarray(c_array_cast(myarr));
return 0;
}
There isn't one.
I can see why it would be useful, especially when working with legacy code, but since a couple of decades ago we're supposed to be moving away from code like that and towards iterator-aware algorithms. And when working with C code you'd have to use a pointer anyway. I presume these are factors in the decision not to provide this functionality.
Rewrite your code to accept std::array<T, N>& instead, if possible.
You can reinterpret_cast the .data() to a raw array reference, like:
template <typename T, std::size_t N>
inline static decltype(auto) to_raw_array(const std::array<T, N> & arr_v) {
return reinterpret_cast<const T(&) [N]>(*arr_v.data());
}
But it is an ugly hack. As the others have already suggested, I recommend you to use std::array as-is.
Usage:
#include <cstdint>
#include <array>
template <typename T, std::size_t N>
inline static decltype(auto) to_raw_array(const std::array<T, N> & arr_v) {
return reinterpret_cast<const T(&) [N]>(*arr_v.data());
}
void foo(const std::uint8_t(&buf)[5]){
// ...
}
int main(void){
std::array<std::uint8_t, 5> arr = {1,2,3,4,5};
foo(to_raw_array(arr));
}
Why not pass std::array's begin()? It worked with SDL2 on:
int SDL_RenderDrawLines(SDL_Renderer *renderer, const SDL_Point *points, int count)
The line to be drawn:
std::array<SDL_Point, 8> a_line;
I passed like this:
SDL_RenderDrawLines(r_s_game.p_renderer, a_line.begin(), 8);

In C++ what is the point of std::array if the size has to be determined at compile time?

Pardon my ignorance, it appears to me that std::array is meant to be an STL replacement for your regular arrays. But because the array size has to be passed as a template parameter, it prevents us from creating std::array with a size known only at runtime.
std::array<char,3> nums {1,2,3}; // Works.
constexpr size_t size = 3;
std::array<char,size> nums {1,2,3}; // Works.
const size_t buf_size = GetSize();
std::array<char, buf_size> nums; // Doesn't work.
I would assume that one very important use case for an array in C++ is to create a fixed size data structure based on runtime inputs (say allocating buffer for reading files).
The workarounds I use for this are:
// Create an array pointer for on-the-spot use cases like reading from a file.
char *data = new char[size];
...
delete[] data;
or:
// Use unique_ptr as a class member and I don't want to manage the memory myself.
std::unique_ptr<char[]> myarr_ = std::unique_ptr<char[]>(new char[size]);
If I don't care about fixed size, I am aware that I can use std::vector<char> with the size pre-defined as follows:
std::vector<char> my_buf (buf_size);
Why did the designers of std::array choose to ignore this use case? Perhaps I don't understand the real usecase for std::array.
EDIT: I guess another way to phrase my question could also be - Why did the designers choose to have the size passed as a template param and not as a constructor param? Would opting for the latter have made it difficult to provide the functionality that std::array currently has? To me it seems like a deliberate design choice and I don't understand why.
Ease of programming
std::array provides several beneficial interfaces and idioms that are also used in std::vector. With normal C-style arrays, one cannot have .size() (no sizeof hack), .at() (exception for out of range), front()/back(), iterators, and so on. Everything has to be hand-coded.
Many programmers may choose std::vector even for arrays whose size is known at compile time, just because they want to use the above programming methodologies. But that gives up the performance available with compile-time fixed-size arrays.
Hence std::array was provided by the library makers to discourage C-style arrays, and yet avoid std::vector when the size is known at compile time.
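A short sketch of the conveniences listed above, using only standard std::array members and algorithms. The `demo` namespace and the two helper functions are hypothetical names made up for the example.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <numeric>
#include <stdexcept>

namespace demo {  // hypothetical namespace

// Iterators let std::array work with standard algorithms directly.
inline int sum_sorted(std::array<int, 5>& values)
{
    std::sort(values.begin(), values.end());
    return std::accumulate(values.begin(), values.end(), 0);
}

// .at() performs a bounds check and throws instead of reading out of range.
inline bool at_throws(const std::array<int, 5>& values, std::size_t index)
{
    try {
        (void)values.at(index);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

}  // namespace demo
```

None of this needs heap allocation: the five ints live wherever the std::array object itself lives.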
The two main reasons I understand are:
std::array implements STL's interfaces for collection-types, allowing an std::array to be passed as-is to functions and methods that accept any STL iterator.
To prevent array pointer decay... (below)
...this is the preservation of type information across function/method boundaries because it prevents Array Pointer Decay.
Given a naked C/C++ array, you can pass it to another function as an argument in 4 ways:
void by_value1 ( const T* array )
void by_value2 ( const T array[] )
void by_pointer ( const T (*array)[U] )
void by_reference( const T (&array)[U] )
by_value1 and by_value2 are semantically identical and cause pointer decay: the receiving function does not know the sizeof the array.
by_pointer and by_reference both require that U be a known compile-time constant, but preserve sizeof information.
So if you avoid array decay by using by_pointer or by_reference, you now have a maintenance problem: every time you change the size of the array, you have to manually update all of the call-sites that have that size in U.
By using std::array this is taken care of for you by making those functions template functions where U is a parameter (granted, you could still use the by_pointer and by_reference techniques, but with messier syntax).
...so std::array adds a 5th way:
template<typename T, size_t N>
void by_stdarray( const std::array<T,N>& array )
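The difference between the decaying and size-preserving forms can be sketched as follows. The `demo` namespace and function names are hypothetical; the pointer version can only report the size of a pointer, while the reference and std::array versions recover the element count.

```cpp
#include <array>
#include <cstddef>

namespace demo {  // hypothetical namespace

// The array has decayed: sizeof sees a pointer, not the array.
inline std::size_t size_seen_by_pointer(const int* array)
{
    return sizeof(array);
}

// Reference to array: N is deduced, so the element count survives the call.
template <std::size_t N>
std::size_t count_by_reference(const int (&)[N])
{
    return N;
}

// std::array: the size is part of the type and available via size().
template <typename T, std::size_t N>
std::size_t count_by_stdarray(const std::array<T, N>& array)
{
    return array.size();
}

}  // namespace demo
```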
std::array is a replacement for C-style arrays.
The C++ standards don't allow C-style arrays to be declared without compile-time defined sizes.

calculate number of elements from a fixed array (similar to sizeof)

I'm developing a library in C++ to help developers with some tasks.
Usually, to calculate the size of an array of integers (for example) dynamically (without using #define SIZE or static int SIZE), I do sizeof(v) / sizeof(int). I'm trying to write a piece of code that does this for me automatically, and I decided to call it lengthof.
The code is here:
template <class T> int _typesize(T*) { return sizeof(T); }
#define lengthof(x) (sizeof(x) / _typesize(&x))
I use the template to get the type of the array, then I return its size in bytes. In GCC I know that it's possible to use typeof, so I can replace _typesize(&x) with sizeof(typeof(x)), but that's not possible on MSVC. _typesize is a compatible way, but I think it can be expensive because it passes the pointer by copy. Is there an elegant way to do this?
No need for macros for this task. If you have a conforming compiler:
template<class T, size_t len>
constexpr size_t lengthof(T(&)[len]) {return len;}
//the parameter is an unnamed reference to a `T[len]`,
//where `T` is deduced as the element type of the array
//and len is deduced as the length of the array.
//similar to `T(*)[len]` in C, except you can pass the array
//directly, instead of passing a pointer to it.
//added benefit that if you pass a `T*` to it, it produces a compiler error.
Or if you're using Visual Studio which is not yet conforming...
template<class T, size_t len>
std::integral_constant<size_t, len> lengthof(T(&)[len]) {return {};}
//VC++ doesn't have constexpr, so we have to use `std::integral_constant` instead :(
//but how it works is 100% identical
If you want a more portable way, macros still work best:
#define lengthof(arr) (sizeof(arr) / sizeof((arr)[0]))
//doesn't respect namespaces, evaluates arguments multiple times
//and if you pass a `T*` to it, it evaluates to `1` depending on context.
But to reiterate my comment, I would consider all of this bad code. Use std::vector or std::array.
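For completeness, a sketch of how the constexpr version of lengthof can be used at compile time. The `demo` namespace, the `primes` table and `sum_of_squares` are hypothetical names introduced only for this example.

```cpp
#include <cstddef>

namespace demo {  // hypothetical namespace

template <class T, std::size_t len>
constexpr std::size_t lengthof(T (&)[len])
{
    return len;
}

constexpr int primes[] = {2, 3, 5, 7, 11};

// The result is a constant expression, so it works in static_assert...
static_assert(lengthof(primes) == 5, "primes should have five entries");

inline int sum_of_squares()
{
    // ...and as the bound of another array.
    int squares[lengthof(primes)] = {};
    int total = 0;
    for (std::size_t i = 0; i < lengthof(squares); ++i) {
        squares[i] = primes[i] * primes[i];
        total += squares[i];
    }
    return total;
}

}  // namespace demo
```

Passing a plain pointer to `demo::lengthof` fails to compile, which is the safety advantage over the sizeof-based macro.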
Usually, you would use: sizeof(x) / sizeof(x[0]) which doesn't rely on any extensions.
The canonical C++ way to get the length of an array is sizeof(arr) / sizeof(arr[0]). Whether you want to hide that by packing it into a macro is another debate entirely.
As a side note, if your _typesize is in the global namespace then that name is reserved for the implementation and illegal to use. In a namespace it's technically legal but generally speaking you can avoid reserved name problems by just avoiding leading underscores entirely.