Iterate all elements in std::vector using c-style API

Iterate all elements in std::vector using c-style API - c++

According to “Item 16. Know how to pass vector and string data to legacy APIs.” of Effective STL of Scott Meyers:
It is safe to use c-style API to access all the elements of vector,
since vectors are guaranteed to have the same underlying memory layout as arrays.
//example 1, do sth to all elements in vector using c-style API
void doSomething(const int *pInts, size_t numlnts);
vector<int> v;
if (!v.empty()) {
doSomething(&v[0], v.size());
}
//example 2, init vector with c-style API
size_t fillArray(double *pArray, size_t arraySize);
vector<double> vd(maxNumDoubles);
vd.resize(fillArray(&vd[0], vd.size()));
To use vector together with c-style API, is there any requirement for the element type T in c++ standard?
Is it always safe if T is a build-in type or POD type?

No, there is no requirement for the element type T. The vector will allocate memory so that each element in the vector consumes sizeof(T) bytes. When you write a loop that iterates over the underlying array via pointer arithmetic (or indexing, which is pointer arithmetic under the hood), the exact same element size is used (sizeof(T)) during the increment/decrement.
However, assuming that you will be reading/writing the underlying data array in true C, then you will face limitation in the sense that C++ class types like std::string, MyCustomClass, etc. cannot be used as T because it will be impossible for your C functions to accept such types safely. As long as T is a type that both C++ and C languages know the storage size of (i.e. both can use sizeof(T) without compilation problems), then you will be in good shape.

Related

In C++ what is the point of std::array if the size has to be determined at compile time?

Pardon my ignorance, it appears to me that std::array is meant to be an STL replacement for your regular arrays. But because the array size has to be passed as a template parameter, it prevents us from creating std::array with a size known only at runtime.
std::array<char,3> nums {1,2,3}; // Works.
constexpr size_t size = 3;
std::array<char,size> nums {1,2,3}; // Works.
const buf_size = GetSize();
std::array<char, buf_size> nums; // Doesn't work.
I would assume that one very important use case for an array in C++ is to create a fixed size data structure based on runtime inputs (say allocating buffer for reading files).
The workarounds I use for this are:
// Create a array pointer for on-the-spot usecases like reading from a file.
char *data = new char[size];
...
delete[] data;
or:
// Use unique_ptr as a class member and I don't want to manage the memory myself.
std::unique_ptr<char[]> myarr_ = std::unique_ptr<char[]>(new char[size]);
If I don't care about fixed size, I am aware that I can use std::vector<char> with the size pre-defined as follows:
std::vector<char> my_buf (buf_size);
Why did the designers of std::array choose to ignore this use case? Perhaps I don't understand the real usecase for std::array.
EDIT: I guess another way to phrase my question could also be - Why did the designers choose to have the size passed as a template param and not as a constructor param? Would opting for the latter have made it difficult to provide the functionality that std::array currently has? To me it seems like a deliberate design choice and I don't understand why.

Ease of programming
std::array facilitates several beneficial interfaces and idioms which are used in std::vector. With normal C-style arrays, one cannot have .size() (no sizeof hack), .at() (exception for out of range), front()/back(), iterators, so on. Everything has to be hand-coded.
Many programmers may choose std::vector even for compile time known sized arrays, just because they want to utilize above programming methodologies. But that snatches away the performance available with compile time fixed size arrays.
Hence std::array was provided by the library makers to discourage the C-style arrays, and yet avoid std::vectors when the size is known at the compile time.

The two main reasons I understand are:
std::array implements STL's interfaces for collection-types, allowing an std::array to be passed as-is to functions and methods that accept any STL iterator.
To prevent array pointer decay... (below)
...this is the preservation of type information across function/method boundaries because it prevents Array Pointer Decay.
Given a naked C/C++ array, you can pass it to another function as a parameter argument by 4 ways:
void by_value1 ( const T* array )
void by_value2 ( const T array[] )
void by_pointer ( const T (*array)[U] )
void by_reference( const T (&array)[U] )
by_value1 and by_value2 are both semantically identical and cause pointer decay because the receiving function does not know the sizeof the array.
by_pointer and by_reference both requires that U by a known compile-time constant, but preserve sizeof information.
So if you avoid array decay by using by_pointer or by_reference you now have a maintenance problem every time you change the size of the array you have to manually update all of the call-sites that have that size in U.
By using std::array it's taken care of for you by making those functions template functions where U is a parameter (granted, you could still use the by_pointer and by_reference techniques but with messier syntax).
...so std::array adds a 5th way:
template<typename T, size_t N>
void by_stdarray( const std::array<T,N>& array )

std::array is a replacement for C-style arrays.
The C++ standards don't allow C-style arrays to be declared without compile-time defined sizes.

Is the vector reserved space always linerar? [duplicate]

My question is simple: are std::vector elements guaranteed to be contiguous? In other words, can I use the pointer to the first element of a std::vector as a C-array?
If my memory serves me well, the C++ standard did not make such guarantee. However, the std::vector requirements were such that it was virtually impossible to meet them if the elements were not contiguous.
Can somebody clarify this?
Example:
std::vector<int> values;
// ... fill up values
if( !values.empty() )
{
int *array = &values[0];
for( int i = 0; i < values.size(); ++i )
{
int v = array[i];
// do something with 'v'
}
}

This was missed from C++98 standard proper but later added as part of a TR. The forthcoming C++0x standard will of course contain this as a requirement.
From n2798 (draft of C++0x):
23.2.6 Class template vector [vector]
1 A vector is a sequence container that supports random access iterators. In addition, it supports (amortized)
constant time insert and erase operations at the end; insert and erase in the middle take linear time. Storage
management is handled automatically, though hints can be given to improve efficiency. The elements of a
vector are stored contiguously, meaning that if v is a vector where T is some type other
than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().

As other answers have pointed out, the contents of a vector is guaranteed to be continuous (excepting bool's weirdness).
The comment that I wanted to add, is that if you do an insertion or a deletion on the vector, which could cause the vector to reallocate it's memory, then you will cause all of your saved pointers and iterators to be invalidated.

The standard does in fact guarantee that a vector is continuous in memory and that &a[0] can be passed to a C function that expects an array.
The exception to this rule is vector<bool> which only uses one bit per bool thus although it does have continuous memory it can't be used as a bool* (this is widely considered to be a false optimization and a mistake).
BTW, why don't you use iterators? That's what they're for.

As other's have already said, vector internally uses a contiguous array of objects. Pointers into that array should be treated as invalid whenever any non-const member function is called IIRC.
However, there is an exception!!
vector<bool> has a specialised implementation designed to save space, so that each bool only uses one bit. The underlying array is not a contiguous array of bool and array arithmetic on vector<bool> doesn't work like vector<T> would.
(I suppose it's also possible that this may be true of any specialisation of vector, since we can always implement a new one. However, std::vector<bool> is the only, err, standard specialisation upon which simple pointer arithmetic won't work.)

I found this thread because I have a use case where vectors using contiguous memory is an advantage.
I am learning how to use vertex buffer objects in OpenGL. I created a wrapper class to contain the buffer logic, so all I need to do is pass an array of floats and a few config values to create the buffer.
I want to be able to generate a buffer from a function based on user input, so the length is not known at compile time. Doing something like this would be the easiest solution:
void generate(std::vector<float> v)
{
float f = generate_next_float();
v.push_back(f);
}
Now I can pass the vector's floats as an array to OpenGL's buffer-related functions. This also removes the need for sizeof to determine the length of the array.
This is far better than allocating a huge array to store the floats and hoping I made it big enough, or making my own dynamic array with contiguous storage.

cplusplus.com:
Vector containers are implemented as dynamic arrays; Just as regular arrays, vector containers have their elements stored in contiguous storage locations, which means that their elements can be accessed not only using iterators but also using offsets on regular pointers to elements.

Yes, the elements of a std::vector are guaranteed to be contiguous.

Declaring arrays in C++

I am new to C++ and currently learning it with a book by myself. This book seems to say that there are several kinds of arrays depending on how you declare it. I guess the difference between dynamic arrays and static arrays are clear to me. But I do not understand the difference between the STL std::array class and a static array.
An STL std::array variable is declared as:
std::array < int, arraySize > array1;
Whereas a static array variable is declared as:
int array1[arraySize];
Is there a fundamental difference between the two? Or is it just syntax and the two are basically the same?

A std::array<> is just a light wrapper around a C-style array, with some additional nice interface member functions (like begin, end etc) and typedefs, roughly defined as
template<typename T, size_t N>
class array
{
public:
T _arr[N];
T& operator[](size_t);
const T& operator[](size_t) const;
// other member functions and typedefs
}
One fundamental difference though is that the former can be passed by value, whereas for the latter you only pass a pointer to its first element or you can pass it by reference, but you cannot copy it into the function (except via a std::copy or manually).
A common mistake is to assume that every time you pass a C-style array to a function you lose its size due to the array decaying to a pointer. This is not always true. If you pass it by reference, you can recover its size, as there is no decay in this case:
#include <iostream>
template<typename T, size_t N>
void f(T (&arr)[N]) // the type of arr is T(&)[N], not T*
{
std::cout << "I'm an array of size " << N;
}
int main()
{
int arr[10];
f(arr); // outputs its size, there is no decay happening
}
Live on Coliru

The main difference between these two is an important one.
Besides the nice methods the STL gives you, when passing a std::array to a function, there is no decay. Meaning, when you receive the std::array in the function, it is still a std::array, but when you pass an int[] array to a function, it effectively decays to an int* pointer and the size of the array is lost.
This difference is a major one. Once you lose the array size, the code is now prone to a lot of bugs, as you have to keep track of the array size manually. sizeof() returns the size of a pointer type instead of the number of elements in the array. This forces you to manually keep track of the array size using interfaces like process(int *array, int size). This is an ok solution, but prone to errors.
See the guidelines by Bjarne Stroustroup:
https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#Rp-run-time
That can be avoided with a better data type, which std::array is designed for, among many other STL classes.
As a side note, unless there's a strong reason to use a fixed size array, std::vector may be a better choice as a contiguous memory data structure.

std::array and C-style arrays are similar:
They both store a contiguous sequence of objects
They are both aggregate types and can therefore be initialized using aggregate initialization
Their size is known at compile time
They do not use dynamic memory allocation
An important advantage of std::array is that it can be passed by value and doesn't implicitly decay to a pointer like a C-style array does.

In both cases, the array is created on the stack.
However, the STL's std::array class template offers some advantages over the "raw" C-like array syntax of your second case:
int array1[arraySize];
For example, with std::array you have a typical STL interface, with methods like size (which you can use to query the array's element count), front, back, at, etc.
You can find more details here.

Is there a fundamental difference between the two? or is it just syntax and the two are basically the same?
There's a number of differences for a raw c-style array (built-in array) vs. the std::array.
As you can see from the reference documentation there's a number of operations available that aren't with a raw array:
E.g.: Element access
at()
front()
back()
data()
The underlying data type of the std::array is still a raw array, but garnished with "syntactic sugar" (if that should be your concern).

The key differences of std::array<> and a C-style array is that the former is a class that wraps around the latter. The class has begin() and end() methods that allow std::array objects to be easily passed in as parameters to STL algorithms that expect iterators (Note that C-style arrays can too via non member std::begin/std::end methods). The first points to the beginning of the array and the second points to one element beyond its end. You see this pattern with other STL containers, such as std::vector, std::map, std::set, etc.
What's also nice about the STL std::array is that it has a size() method that lets you get the element count. To get the element count of a C-style array, you'll have to write sizeof(cArray)/sizeof(cArray[0]), so doesn't stlArray.size() looks much more readable?
You can get full reference here:
http://en.cppreference.com/w/cpp/container/array

Usually you should prefer std::array<T, size> array1; over T array2[size];, althoug the underlying structure is identical.
The main reason for that is that std::array always knows its size. You can call its size() method to get the size. Whereas when you use a C-style array (i.e. what you called "built-in array") you always have to pass the size around to functions that work with that array. If you get that wrong somehow, you could cause buffer overflows and the function tries to read from/write to memory that does not belong to the array anymore. This cannot happen with std::array, because the size is always clear.

IMO,
Pros: It’s efficient, in that it doesn’t use any more memory than built-in fixed arrays.
Cons: std::array over a built-in fixed array is a slightly more awkward syntax, and that you have to explicitly specify the array length (the compiler won’t calculate it for you from the initializer).

Take an array reference `T(&)[n]` to the contents of a `std::array<T, n>`

Suppose I have a std::array<T, n> and want to take an array reference to its contents (i.e. to the non-exposed elems array member).
I was surprised to find that std::array<T, n>::data() returns T * and not T (&)[n], so it seems that some kind of cast is necessary. I can write:
std::array<int, 5> arr;
int (&ref)[5] = *reinterpret_cast<int (*)[5]>(arr.data());
However, this looks ugly and potentially unsafe. Is it legitimate (well-defined) code and is there a better way to do this?

The standard doesn't provide for the underlying implementation of array, but if it uses int[5] as the underlying representation, then for that implementation only your cast would be (non-portably) legal. For any other underlying representation you violate the strict aliasing rules and enter undefined behavior.
Instead of trying to return the array as an array, can you use iterator pairs to delimit the range you're interested in, following precedent with the standard library?

The array in C++ is a defective type (here I am talking about c-style array, not std::array). The reason of that is because the size of the array isn't stored anywhere in the memory, it is known only at compile-time. Right as you cast array to some other type (commonly, to a pointer), the size of the array is lost.
Now you can see that reverse cast cannot be performed, because there is no way compiler could know size of an array, looking only at the pointer to the first member. As was already suggested, you could use a pair of iterators instead.

Are std::vector elements guaranteed to be contiguous?

My question is simple: are std::vector elements guaranteed to be contiguous? In other words, can I use the pointer to the first element of a std::vector as a C-array?
If my memory serves me well, the C++ standard did not make such guarantee. However, the std::vector requirements were such that it was virtually impossible to meet them if the elements were not contiguous.
Can somebody clarify this?
Example:
std::vector<int> values;
// ... fill up values
if( !values.empty() )
{
int *array = &values[0];
for( int i = 0; i < values.size(); ++i )
{
int v = array[i];
// do something with 'v'
}
}

This was missed from C++98 standard proper but later added as part of a TR. The forthcoming C++0x standard will of course contain this as a requirement.
From n2798 (draft of C++0x):
23.2.6 Class template vector [vector]
1 A vector is a sequence container that supports random access iterators. In addition, it supports (amortized)
constant time insert and erase operations at the end; insert and erase in the middle take linear time. Storage
management is handled automatically, though hints can be given to improve efficiency. The elements of a
vector are stored contiguously, meaning that if v is a vector where T is some type other
than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size().

As other answers have pointed out, the contents of a vector is guaranteed to be continuous (excepting bool's weirdness).
The comment that I wanted to add, is that if you do an insertion or a deletion on the vector, which could cause the vector to reallocate it's memory, then you will cause all of your saved pointers and iterators to be invalidated.

The standard does in fact guarantee that a vector is continuous in memory and that &a[0] can be passed to a C function that expects an array.
The exception to this rule is vector<bool> which only uses one bit per bool thus although it does have continuous memory it can't be used as a bool* (this is widely considered to be a false optimization and a mistake).
BTW, why don't you use iterators? That's what they're for.

As other's have already said, vector internally uses a contiguous array of objects. Pointers into that array should be treated as invalid whenever any non-const member function is called IIRC.
However, there is an exception!!
vector<bool> has a specialised implementation designed to save space, so that each bool only uses one bit. The underlying array is not a contiguous array of bool and array arithmetic on vector<bool> doesn't work like vector<T> would.
(I suppose it's also possible that this may be true of any specialisation of vector, since we can always implement a new one. However, std::vector<bool> is the only, err, standard specialisation upon which simple pointer arithmetic won't work.)

I found this thread because I have a use case where vectors using contiguous memory is an advantage.
I am learning how to use vertex buffer objects in OpenGL. I created a wrapper class to contain the buffer logic, so all I need to do is pass an array of floats and a few config values to create the buffer.
I want to be able to generate a buffer from a function based on user input, so the length is not known at compile time. Doing something like this would be the easiest solution:
void generate(std::vector<float> v)
{
float f = generate_next_float();
v.push_back(f);
}
Now I can pass the vector's floats as an array to OpenGL's buffer-related functions. This also removes the need for sizeof to determine the length of the array.
This is far better than allocating a huge array to store the floats and hoping I made it big enough, or making my own dynamic array with contiguous storage.

cplusplus.com:
Vector containers are implemented as dynamic arrays; Just as regular arrays, vector containers have their elements stored in contiguous storage locations, which means that their elements can be accessed not only using iterators but also using offsets on regular pointers to elements.

Yes, the elements of a std::vector are guaranteed to be contiguous.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js

Iterate all elements in std::vector using c-style API - c++

Related

In C++ what is the point of std::array if the size has to be determined at compile time?

Is the vector reserved space always linerar? [duplicate]

Declaring arrays in C++

Take an array reference `T(&)[n]` to the contents of a `std::array<T, n>`

Are std::vector elements guaranteed to be contiguous?

Categories

Resources