Why can't we change data pointer of std::vector? - c++

I have an array char* source and a vector std::vector<char> target. I'd like to make the vector target point to source in O(1), without copying the data.
Something along these lines:
#include <vector>
char* source = new char[3] { 1, 2, 3 };
std::vector<char> target;
target.resize(3);
target.setData(source); // <- Doesn't exist
// OR
std::swap(target.data(), source); // <- swap() does not support char*
delete[] source;
Why is it not possible to manually change where a vector points to? Is there some specific, unmanageable problem that would arise if this was possible?

C++ vector class supports adding and deleting elements, with guaranteed consecutive order in memory. If you could initialize your vector with existing memory buffer, and add enough elements to it, it would either overflow or require reallocation.
The interface of vector assumes that it manages its internal buffer, that is, it can allocate, deallocate, resize it whenever it wants (within spec, of course). If you need something that is not allowed to manage its buffer, you cannot use vector - use a different data structure or write one yourself.
You can create a vector object by copying your data (using a constructor with two pointers or assign), but this is obviously not what you want.
Alternatively, you can use string_view, which looks almost or maybe exactly what you need.

std::vector is considered to be the owner of the underlying buffer. You can change the buffer but this change causes allocation i.e. making a copy of the source buffer which you don't want (as stated in the question).
You could do the following:
#include <vector>
int main() {
char* source = new char[3] { 1, 2, 3 };
std::vector<char> target;
target.resize(3);
target.assign(source, source + 3);
delete[] source;
return 0;
}
but again std::vector::assign:
Replaces the contents with copies of those in the range [first, last).
So copy is performed again. You can't get away from it while using std::vector.
If you don't want to copy data, then you should use std::span from C++20 (or create your own span) or use std::string_view (which looks suitable for you since you have an array of chars).
1st option: Using std::string_view
Since you are limited to C++17, std::string_view might be perfect for you. It constructs a view of the first 3 characters of the character array starting with the element pointed by source.
#include <iostream>
#include <string_view>
int main() {
char* source = new char[3] { 1, 2, 3 };
std::string_view strv( source, 3 );
delete[] source;
return 0;
}
2nd option: Using std::span from C++20
std::span comes from C++20 so it might not be the most perfect way for you, but you might be interested in what it is and how it works. You can think of std::span as a bit generalized version of std::string_view because it is a contiguous sequence of objects of any type, not just characters. The usage is similar as with the std::string_view:
#include <span>
#include <iostream>
int main() {
char* source = new char[3] { 1, 2, 3 };
std::span s( source, 3 );
delete[] source;
return 0;
}
3rd option: Your own span
If you are limited to C++17, you can think of creating your own span struct. It might still be an overkill but let me show you (btw take a look at this more elaborated answer):
template<typename T>
class span {
T* ptr_;
std::size_t len_;
public:
span(T* ptr, std::size_t len) noexcept
: ptr_{ptr}, len_{len}
{}
T& operator[](int i) noexcept {
return *ptr_[i];
}
T const& operator[](int i) const noexcept {
return *ptr_[i];
}
std::size_t size() const noexcept {
return len_;
}
T* begin() noexcept {
return ptr_;
}
T* end() noexcept {
return ptr_ + len_;
}
};
int main() {
char* source = new char[3] { 1, 2, 3 };
span s( source, 3 );
delete[] source;
return 0;
}
So the usage is the same as with the C++20's version of std::span.

The std::string_view and std::span are good things to have (if you have compiler version supporting them). Rolling your own similars is ok too.
But some people miss the whole point why one would want to do this exactly to a vector:
Because you have an API that gives Struct[] + size_t and give you ownership
and you also have an API that accepts std::vector<Struct>
ownership could be easily transferred into the vector and no copies made!
You can say: But what about custom allocators, memory mapped file pointers, rom memory that I could then set as the pointer?
If you are already about to set vector internals you should know what you are doing.
You can try to supply a "correct" allocator in those cases to your vector actually.
Maybe give a warning on compiling this code yes, but it would be nice if it would be possible.
I would do it this way:
std::vector would get a constructor that asks for a std::vector::raw_source
std::vector::raw_source is an uint8_t*, size_t, bool struct (for now)
bool takeOwnership: tells if we are taking ownership (false => copy)
size_t size: the size of the raw data
uint8_t* ptr: the pointer to raw data
When taking ownership, vector resize and such uses the vectors allocation strategy as you otherwise provided with your template params anyways - nothing new here. If that does not fit the data you are doing wrong stuff!
Yes API I say look more complicated than a single "set_data(..)" or "own_memory(...)" function, but tried to make it clear that anyone who ever uses this api pretty much understands implications of it automagically. I would like this API to exists, but maybe I still overlook other causes of issues?

Related

Why does std::array require the size as a template parameter and not as a constructor parameter?

There are many design issues I have found with this, particularly with passing std::array<> to functions. Basically, when you initialize std::array, it takes in two template parameters, <class T and size_t size>. However, when you create a function that requires and std::array, we do not know the size, so we need to create template parameters for the functions also.
template <size_t params_size> auto func(std::array<int, params_size> arr);
Why couldn't std::array take in the size at the constructor instead? (i.e.):
auto array = std::array<int>(10);
Then the functions would look less aggressive and would not require template params, as such:
auto func (std::array<int> arr);
I just want to know the design choice for std::array, and why it was designed this way.
This isn't a question due to a bug, but rather a question why std::array<> was designed in such a manner.
std::array<T,N> var is intended as a better replacement for C-style arrays T var[N].
The memory space for this object is created locally, ie on the stack for local variables or inside the struct itself when defined as a member.
std::vector<T> in contrary always allocate it's element's memory in the heap.
Therefore as std::array is allocated locally, it cannot have a variable size since that space needs to be reserved at compile time. std::vector in the other hand has the ability to reallocate and resize since its memory is unbounded.
As a consequence, the big advantage of std::array in terms of performance is that it eliminates that one level of indirection that std::vector pays for its flexibility.
For example:
#include <cstdint>
#include <iostream>
#include <vector>
#include <array>
int main() {
int a;
char b[10];
std::vector<char> c(10);
std::array<char,10> d;
struct E {
std::array<char,10> e1;
std::vector<char> e2{10};
};
E e;
printf( "Stack address: %p\n", __builtin_frame_address(0));
printf( "Address of a: %p\n", &a );
printf( "Address of b: %p\n", b );
printf( "Address of b[0]: %p\n", &b[0] );
printf( "Address of c: %p\n", &c );
printf( "Address of c[0]: %p\n", &c[0] );
printf( "Address of d: %p\n", &d );
printf( "Address of d[0]: %p\n", &d[0] );
printf( "Address of e: %p\n", &e );
printf( "Address of e1: %p\n", &e.e1 );
printf( "Address of e1[0]:%p\n", &e.e1[0] );
printf( "Address of e2: %p\n", &e.e2);
printf( "Address of e2[0]:%p\n", &e.e2[0] );
}
Produces
Program stdout
Stack address: 0x7fffeb115ed0
Address of a: 0x7fffeb115eb0
Address of b: 0x7fffeb115ea6
Address of b[0]: 0x7fffeb115ea6
Address of c: 0x7fffeb115e80
Address of c[0]: 0x1cad2b0
Address of d: 0x7fffeb115e76
Address of d[0]: 0x7fffeb115e76
Address of e: 0x7fffeb115e40
Address of e1: 0x7fffeb115e40
Address of e1[0]:0x7fffeb115e40
Address of e2: 0x7fffeb115e50
Address of e2[0]:0x1cad2d0
Godbolt: https://godbolt.org/z/75s47T56f
Not an answer, really, because I used to despise std::array<> for the same reasons as you — anything with Monadic qualities are not good design (IMNSHO).
Fortunately, C++20 has the solution: a dynamic std::span<>.
#include <array>
#include <iostream>
#include <span>
namespace detail
{
void print( const std::span<const int> & xs )
{
for (size_t n = 0; n < xs.size(); n++)
std::cout << xs[n] << " ";
}
}
void print( const std::span<const int> & xs )
{
std::cout << "{ ";
detail::print( xs );
std::cout << "}\n";
}
void add( const std::span<int> & xs, int n )
{
for (int & x : xs)
x += n;
}
int main()
{
std::array<int,5> xs { 1, 2, 4, 6, 10 };
add( xs, 1 );
print( xs );
}
Notice that the span itself is const in all cases, but the elements themselves are modifiable unless they too are tagged const. This is exactly what an array is like.
std::span is a C++20 object. I know that MS and maybe others had a array_view in older versions of their libraries.
tl;dr
Use std::array only to declare your array object. Pass it around with a dynamic std::span.
std::array vs C array
The use-case for std::array is actually very narrow: encapsulate a fixed-size array as a first-class container object (one that can be copied, not just referenced).
At first blush this doesn’t seem to be much of an improvement over standard C-style arrays:
typedef int myarray[10]; // (1)
using myarray = std::array<int,10>; // (2)
void f( myarray a );
But it is! The difference is in what f() actually gets:
For a C-style array, the argument is just a pointer — a reference to the caller’s data (that you can modify!). You know the size of the referenced array (10), but writing code to get that size is not straight-forward even with the usual C array-size idiom (sizeof(myarray)/sizeof(a[0]), since sizeof(a) is the size of a pointer).
For the std::array, the argument value is an actual local copy of the caller’s data. If you want to be able to modify the caller’s data then you need to be explicit about declaring the formal argument as a reference type (myarray & a) or just to avoid an expensive copy (const myarray & a). This falls in line with how other C++ objects are passed. And though the size is still 10, your code can query the size of the array with the usual C++ container idiom: a.size()!
The usual way C overcomes this is to clutter the call site and formal argument lists with information about the array size so that it doesn’t get lost.
int f( int array[], size_t n ) // traditional C
{
printf( "There are %zu elements.\n", n );
recurse with f( array, n );
}
int main(void)
{
int my_array[10];
f( my_array, ARRAY_SIZE(my_array) );
The std::array way is cleaner.
int f( std::array<int,10> & array ) // C++
{
std::cout << "There are " << array.size() << " elements.\n";
recurse with f( array );
}
int main()
{
std::array<int,10> my_array;
f( my_array );
But while cleaner, it is significantly less flexible than the C array, simply because its length is fixed. A caller cannot pass a std::array<int,12> to the function, for example.
I’ll refer you to the other good answers here to consider more about container choice when handling arrayed data.
If you have a problem with std::array and you think std::span is a solution, now you will have two problems.
More seriously, without knowing what kind of conceptual operation is func it is difficult to tell what is the right alternative.
First, if you want or can exploit to know the size at compile-time there is nothing cooler than what you are trying to avoid.
template<std::size_t N>
void func(std::array<int, N> arr); // add & or && or const& if appropiate
Imagine it, knowing the size at compile time can allow you and the compiler to do all sorts of tricks, like unrolling loops completely or verifying logic at compile time (e.g. if you know the size must be smaller or bigger than a constant).
Or the coolest trick of all, not needing to allocate memory for any auxiliary operation inside func (because you know the size of the problem a priori).
If you want a dynamic array, use (and pass) a std::vector.
void func(std::vector<int> dynarr); // add & or && or const& if appropiate
But then you force your caller to use std::vector as the container.
If you want a fixed array, and it will work with everything,
template<class FixedArray>
void func(FixedArray dynarr); // add & or && or const& if appropiate
Ask yourself, how specific is your function such that you really really want to make it work with any size of std::array but not with std::vector?
Why specifically ints even?
template<class ArithmeticRange>
void func(ArithmeticRange dynarr); // add & or && or const& if appropiate
There are a few contiguous containers and ranges in C++ std. They serve different purposes. There are also a few techniques for passing them around.
I'll try to be exhaustive.
std::array<int, 7>
this is a buffer of 7 ints. They are stored within the object itself. Putting an array somewhere is putting enough storage for exactly 7 ints in that location (plus possible padding for alignment reasons, but that is at the end of the buffer).
You use this when, at compile time, you know exactly how big something is, or need to know.
std::vector<int>
this object holds ownership of a buffer of ints. The memory that holds those ints is dynamically allocated and can change at runtime. The object itself is usually 3 pointers in size. It has some strategies to grow that avoids doing N^2 work when you keep adding 1 element at a time to it.
This object can be efficiently moved -- it will steal the buffer if the old object is marked (by std::move or other ways) as being safe to steal state from.
std::span<int>
This represents an externally owned sequence of ints, possibly stored in a std::array or owned by a std::vector, or stored somewhere else. It knows where in memory it starts and when it ends.
Unlike the two above, it is not a container, but a range or a view of the contents. So you can't assign spans to each other (the semantics are confusing), and you are responsible to ensure that the source buffer lasts "long enough" that you don't use it after it is gone.
span is often used as a function argument. In your case, it probably solves most of your problem -- it lets you pass arrays of different sizes to a function, and within that function you can read or write the values.
span followed pointer semantics. That means const std::span<int> is like a int*const -- the pointer is const, but the thing pointed to is not! You are free to modify the elements in const std::span<int>. In comparison, std::span<const int> is like a int const* -- the pointer is not const, but the thing pointed to is. You are free to change what range of elements the span refers to in std::span<const int>, but you aren't allowed to modify the elements themselves.
A final technique is auto or templates. Here we implement the body of the function in the header (or equivalent) and leave the type unconstrained (or, constrained by concepts).
template<std::size_t N>
int total0( std::array<int, N> const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
int total1( std::vector<int> const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
int total2( std::span<int const> elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
int total3( auto const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
template<class Ints>
int total4( Ints const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
notice these all have the same implementation.
total3 and total4 are identical; you need a more modern compiler to use total3 syntax.
total1 and total2 allow you to split the implementation into a cpp file away from the header file. Also, code generation isn't done for different arguments.
total0, total3 and total4 all result in different code to be generated based on the type of the arguments. This can cause binary bloat issues, especially if the body was more complex than shown, and causes build time problems in larger projects.
total1 won't work with a std::array directly. You can do total1({arr.begin(), arr.end()}) which would copy the contents to a dynamic vector before using the code.
Finally, note that span<int> is the closest you get to the C way of arr[], size. Span is, in essence, a pointer-to-first and length pair, with utility code wrapping it.
The main purpose of a C++11 std::array<> is to be a decent replacement for C-style arrays [], especially when they're declared with new and dismissed with delete[].
The main goal here is to get an official, managed object that serves as an array, while maintaining as constant expressions everything that can be.
Principal issues with regular arrays is that since they're not actually objects, one cannot derivate a class from them (forcing you to implement iterators) and are a pain when you copy classes that uses them as object properties.
Since new, delete and delete[] return pointers, you need each time either to implement a copy constructor that will declare another array them copy its content or maintaining your own dynamic reference counter on it.
From this perpective, std::array<> is a good way to declare purely static arrays that will be managed by the language itself.

What is the use of const std::vector?

I've seen code that uses a const std::vector, however can't see why it wouldn't make more sense to simply use an std::array instead.
The values for the vector seem to be initialized at compile-time.
What is the benefit of a const std::vector?
Edit: The vector was not big, only a couple of strings, however I see that this may be one advantage.
const std::vector lets you have a "fixed sized array" whose size you only know at run time but still allows you to have all the benefits of a standard container. If you used a raw/smart pointer in your code you will need to manually pass the size of the array that it points to into the function(s) that need to know the size of the array.
The std::vector could be used with a large list of data, because it could use Dynamic Allocated Array to store values and before C++20 it's doesn't have a constexpr constructor. But std::array uses a raw C-Array and it could be used at compile time with the stack size restricted list of data. So:
std::array const is good for:
Where your data size is less than the stack size.
Where data locality is important.
You want your list at the compile time (before C++20).
std::vector const is good for:
Where your data size is large than the stack size.
It's quite common to see C++ code that expects references to vectors, sometimes const, something like this:
auto do_sum(std::vector<int> const& numbers) -> int {
return std::accumulate(numbers.begin(), numbers.end(), 0);
}
Even though the good intentions are not about not copying memory, this code require the caller to allocate a std::vector, even though the amount of value and the actual values might be known at compile time.
This might be fixed by using a span instead:
auto do_sum(std::span<int const> numbers) -> int {
return std::accumulate(numbers.begin(), numbers.end(), 0);
}
Now this code don't require the users to allocate a vector for the array, and might use std::array, or plain C arrays.
Also, sometimes you don't know what are the numbers of element the array can have, and might be determined at runtime. Consider this:
auto create_vector() -> std::vector<int> {
auto vec = std::vector<int>{};
if (/* runtime condition */) {
vec.push_back(1);
vec.push_back(2);
vec.push_back(3);
} else {
vec.push_back(4);
vec.push_back(5);
}
return vec;
}
int main() {
std::vector<int> const vec = create_vector();
}
As you can see here, as the vector is moved from scope to scope, the const-ness changes and is expressing the intent of the developer to only make it mutable in some initializing scopes.
Because the std::array needs you to specify the size as part of the type. By using std::vector the compiler can work that out for your dynamically.
This matters in maintenance situations as it prevents errors. You only add/remove a string from the initializer of the object and the vector will have the correct size automatically. If you use an array you need to add/remove the string and change the type of the array.
const std::vector<std::string> dataV = { "A", "B", "C" };
const std::array<std::string, 3> dataA = { "A", "B", "C" };
If I now modify these to only have two values.
// The vector will auto resize
const std::vector<std::string> dataV = { "A", "B"};
// This will still be of size three.
// This is not usually what you want.
const std::array<std::string, 3> dataA = { "A", "B"};
// The person modifying the code has to manually spot that and
// change the type to explicitly have two member array. Note
// It may not be as obvious as you think as the type may
// be hidden with a type alias of some description
using DataStore = std::array<std::string, 3>;
/// Lots of code:
DataStore dataA = { "A", "B"}; // Would you have spotted.

Is it possible that a pointer gets owned by a vector without any data copy?

If I have a C type raw pointer, is it possible to create a std::vector from the same type that owns the pointer's data without any data copy (only moving)? What motivates me for asking this question is the existence of data() member function for std::vector which means vector's elements are residing somewhere in the memory consecutively.
Edit: I have to add that the hope I had was also intensified by the existence of functions like std::make_shared.
I don't think that this is directly possible, although you're not the first one to miss this feature. It is even more painful with std::string which doesn't have a non-const data member. Hopefully, this will change in C++17.
If you are allocating the buffer yourself, there is a solution, though. Just use a std::vector up-front. For example, assume you have the following C-style function,
extern void
fill_in_the_numbers(double * buffer, std::size_t count);
then you can do the following.
std::vector<double>
get_the_numbers_1st(const std::size_t n)
{
auto numbers = std::vector<double> (n);
fill_in_the_numbers(numbers.data(), numbers.size());
return numbers;
}
Alternatively, if you're not so lucky and your C-style function insists in allocating the memory itself,
extern double *
allocate_the_buffer_and_fill_in_the_numbers(std::size_t n);
you could resort to a std::unique_ptr, which is sadly inferior.
std::unique_ptr<double[], void (*)(void *)>
get_the_numbers_2nd(const std::size_t n)
{
return {
allocate_the_buffer_and_fill_in_the_numbers(n),
&std::free
};
}
No, std::vector is not designed to be able to assume/utilize a pre-existing array for its internal storage.
Yes, provided that you've created and populated the vector before getting the pointers and that you will not
erase any element
you will not add new elements when vec.size() == vec.capacity() - 1 ,,, doing so will change the address of the elements
Example
#include <iostream>
void fill_my_vector<std::vector<double>& vec){
for(int i=0; i<300; i++){
vec.push_back(i);
}
}
void do_something(double* d, int size)
{ /* ..... */ }
int main(){
std::vector<double> vec;
fill_my_vector(vec);
//You hereby promise to follow the contract conditions, then you are safe doing this
double* ptr;
int ptr_len = vec.size();
ptr = &vec[0];
//call do_something
do_something(ptr, ptr_len);
//ptr will be alive until this function scope exits
}
EDIT
If you mean managing the data from an already created array, you can't... vector manages its own array... It cannot take ownership of an array that wasn't created by its class (vector).

Interface for returning a bunch of values

I have a function that takes a number and returns up to that many things (say, ints). What's the cleanest interface? Some thoughts:
Return a vector<int>. The vector would be copied several times, which is inefficient.
Return a vector<int>*. My getter now has to allocate the vector itself, as well as the elements. There are all the usual problems of who has to free the vector, the fact that you can't allocate once and use the same storage for many different calls to the getter, etc. This is why STL algorithms typically avoid allocating memory, instead wanting it passed in.
Return a unique_ptr<vector<int>>. It's now clear who deletes it, but we still have the other problems.
Take a vector<int> as a reference parameter. The getter can push_back() and the caller can decide whether to reserve() the space. However, what should the getter do if the passed-in vector is non-empty? Append? Overwrite by clearing it first? Assert that it's empty? It would be nice if the signature of the function allowed only a single interpretation.
Pass a begin and end iterator. Now we need to return the number of items actually written (which might be smaller than desired), and the caller needs to be careful not to access items that were never written to.
Have the getter take an iterator, and the caller can pass an insert_iterator.
Give up and just pass a char *. :)
In C++11, where move semantics is supported for standard containers, you should go with option 1.
It makes the signature of your function clear, communicating that you just want a vector of integers to be returned, and it will be efficient, because no copy will be issued: the move constructor of std::vector will be invoked (or, most likely, Named Return Value Optimization will be applied, resulting in no move and no copy):
std::vector<int> foo()
{
std::vector<int> v;
// Fill in v...
return v;
}
This way you won't have to deal with issues such as ownership, unnecessary dynamic allocations, and other stuff which are just polluting the simplicity of your problem: returning a bunch of integers.
In C++03, you may want to go with option 4 and take an lvalue reference to a non-const vector: standard containers in C++03 are not move-aware, and copying a vector may be expensive. Thus:
void foo(std::vector<int>& v)
{
// Fill in v...
}
However, even in that case, you should consider whether this penalty is really significant for your use cases. If it is not, you may well opt for a clearer function signature at the expense of some CPU cycles.
Also, C++03 compilers are capable of performing Named Return Value Optimization, so even though in theory a temporary should be copy-constructed from the value you return, in practice no copying is likely to happen.
You wrote it yourself:
... This is why STL algorithms typically avoid allocating memory, instead wanting it passed in
except that STL algorithms don't typically "want memory passed in", they operate on iterators instead. This is specifically to decouple the algorithm from the container, giving rise to:
option 8
decouple the value generation from both the use and storage of those values, by returning an input iterator.
The easiest way is using boost::function_input_iterator, but a sketch mechanism is below (mostly because I was typing faster than thinking).
Input iterator type
(uses C++11, but you can replace the std::function with a function pointer or just hard-code the generation logic):
#include <functional>
#include <iterator>
template <typename T>
class Generator: public std::iterator<std::input_iterator_tag, T> {
int count_;
std::function<T()> generate_;
public:
Generator() : count_(0) {}
Generator(int count, std::function<T()> func) : count_(count)
, generate_(func) {}
Generator(Generator const &other) : count_(other.count_)
, generate_(other.generate_) {}
// move, assignment etc. etc. omitted for brevity
T operator*() { return generate_(); }
Generator<T>& operator++() {
--count_;
return *this;
}
Generator<T> operator++(int) {
Generator<T> tmp(*this);
++*this;
return tmp;
}
bool operator==(Generator<T> const &other) const {
return count_ == other.count_;
}
bool operator!=(Generator<T> const &other) const {
return !(*this == other);
}
};
Example generator function
(again, it's trivial to replace the lambda with an out-of-line function for C++98, but this is less typing)
#include <random>
Generator<int> begin_random_integers(int n) {
static std::minstd_rand prng;
static std::uniform_int_distribution<int> pdf;
Generator<int> rv(n,
[]() { return pdf(prng); }
);
return rv;
}
Generator<int> end_random_integers() {
return Generator<int>();
}
Example use
#include <vector>
#include <algorithm>
#include <iostream>
int main()
{
using namespace std;
vector<int> out;
cout << "copy 5 random ints into a vector\n";
copy(begin_random_integers(5), end_random_integers(),
back_inserter(out));
copy(out.begin(), out.end(),
ostream_iterator<int>(cout, ", "));
cout << "\n" "print 2 random ints straight from generator\n";
copy(begin_random_integers(2), end_random_integers(),
ostream_iterator<int>(cout, ", "));
cout << "\n" "reuse vector storage for 3 new ints\n";
out.clear();
copy(begin_random_integers(3), end_random_integers(),
back_inserter(out));
copy(out.begin(), out.end(),
ostream_iterator<int>(cout, ", "));
}
return vector<int>, it will not be copied, it will be moved.
In C++11 the right answer is to return the std::vector<int> is to return it, ensuring that it will be either explicitly or implicitly moved. (Prefer implicit move, because explicit move can block some optimizations)
Amusingly, if you are concerned about reusing the buffer, the easiest way is to throw in an optional parameter that takes a std::vector<int> by value like this:
std::vector<int> get_stuff( int how_many, std::vector<int> retval = std::vector<int>() ) {
// blah blah
return retval;
}
and, if you have a preallocated buffer of the right size, just std::move it into the get_stuff function and it will be used. If you don't have a preallocated buffer of the right size, don't pass a std::vector in.
Live example: http://ideone.com/quqnMQ
I'm uncertain if this will block NRVO/RVO, but there isn't a fundamental reason why it should, and moving a std::vector is cheap enough that you probably won't care if it does block NRVO/RVO anyhow.
However, you might not actually want to return a std::vector<int> - possibly you just want to iterate over the elements in question.
In that case, there is an easy way and a hard way.
The easy way is to expose a for_each_element( Lambda ) method:
#include <iostream>
struct Foo {
int get_element(int i) const { return i*2+1; }
template<typename Lambda>
void for_each_element( int up_to, Lambda&& f ) {
for (int i = 0; i < up_to; ++i ) {
f( get_element(i) );
}
}
};
int main() {
Foo foo;
foo.for_each_element( 7, [&](int e){
std::cout << e << "\n";
});
}
and possibly use a std::function if you must hide the implementation of the for_each.
The hard way would be to return a generator or a pair of iterators that generate the elements in question.
Both of these avoid the pointless allocation of the buffer when you only want to deal with the elements one at a time, and if generating the values in question is expensive (it might require traversing memory
In C++98 I would take a vector& and clear() it.

STL vectors with uninitialized storage?

I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.
Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?
(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)
Also, see my comments below for a clarification about when the initialization occurs.
SOME CODE:
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size()
memberVector.resize(mvSize + count); // causes 0-initialization
for (int i = 0; i < count; ++i) {
memberVector[mvSize + i].d1 = data1[i];
memberVector[mvSize + i].d2 = data2[i];
}
}
std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you were to access the uninitialized section of the array as if it were initialized.
The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.
Using your example code:
struct YourData {
int d1;
int d2;
YourData(int v1, int v2) : d1(v1), d2(v2) {}
};
std::vector<YourData> memberVector;
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size();
// Does not initialize the extra elements
memberVector.reserve(mvSize + count);
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a temporary.
memberVector.push_back(YourData(data1[i], data2[i]));
}
}
The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size though, at least the number of copies will be minimal on average.
In the current version of C++, the inner loop is a bit inefficient as a temporary value is constructed on the stack, copy-constructed to the vectors memory, and finally the temporary is destroyed. However the next version of C++ has a feature called R-Value references (T&&) which will help.
The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:
template <typename T>
class my_vector_replacement {
// ...
template <typename F>
my_vector::push_back_using_factory(F factory) {
// ... check size of array, and resize if needed.
// Copy construct using placement new,
new(arrayData+end) T(factory())
end += sizeof(T);
}
char* arrayData;
size_t end; // Of initialized data in arrayData
};
// One of many possible implementations
struct MyFactory {
MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
YourData operator()() const {
return YourData(*d1,*d2);
}
int* d1;
int* d2;
};
void GetsCalledALot(int* data1, int* data2, int count) {
// ... Still will need the same call to a reserve() type function.
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a factory
memberVector.push_back_using_factory(MyFactory(data1+i, data2+i));
}
}
Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times where using a factory function like this is better, for instance if the insert is conditional on some other value, and you would have to otherwise unconditionally construct some expensive temporary even if it wasn't actually needed.
In C++11 (and boost) you can use the array version of unique_ptr to allocate an uninitialized array. This isn't quite an stl container, but is still memory managed and C++-ish which will be good enough for many applications.
auto my_uninit_array = std::unique_ptr<mystruct[]>(new mystruct[count]);
C++0x adds a new member function template emplace_back to vector (which relies on variadic templates and perfect forwarding) that gets rid of any temporaries entirely:
memberVector.emplace_back(data1[i], data2[i]);
To clarify on reserve() responses: you need to use reserve() in conjunction with push_back(). This way, the default constructor is not called for each element, but rather the copy constructor. You still incur the penalty of setting up your struct on stack, and then copying it to the vector. On the other hand, it's possible that if you use
vect.push_back(MyStruct(fieldValue1, fieldValue2))
the compiler will construct the new instance directly in the memory thatbelongs to the vector. It depends on how smart the optimizer is. You need to check the generated code to find out.
You can use boost::noinit_adaptor to default initialize new elements (which is no initialization for built-in types):
std::vector<T, boost::noinit_adaptor<std::allocator<T>> memberVector;
As long as you don't pass an initializer into resize, it default initializes the new elements.
So here's the problem, resize is calling insert, which is doing a copy construction from a default constructed element for each of the newly added elements. To get this to 0 cost you need to write your own default constructor AND your own copy constructor as empty functions. Doing this to your copy constructor is a very bad idea because it will break std::vector's internal reallocation algorithms.
Summary: You're not going to be able to do this with std::vector.
You can use a wrapper type around your element type, with a default constructor that does nothing. E.g.:
template <typename T>
struct no_init
{
T value;
no_init() { static_assert(std::is_standard_layout<no_init<T>>::value && sizeof(T) == sizeof(no_init<T>), "T does not have standard layout"); }
no_init(T& v) { value = v; }
T& operator=(T& v) { value = v; return value; }
no_init(no_init<T>& n) { value = n.value; }
no_init(no_init<T>&& n) { value = std::move(n.value); }
T& operator=(no_init<T>& n) { value = n.value; return this; }
T& operator=(no_init<T>&& n) { value = std::move(n.value); return this; }
T* operator&() { return &value; } // So you can use &(vec[0]) etc.
};
To use:
std::vector<no_init<char>> vec;
vec.resize(2ul * 1024ul * 1024ul * 1024ul);
Err...
try the method:
std::vector<T>::reserve(x)
It will enable you to reserve enough memory for x items without initializing any (your vector is still empty). Thus, there won't be reallocation until to go over x.
The second point is that vector won't initialize the values to zero. Are you testing your code in debug ?
After verification on g++, the following code:
#include <iostream>
#include <vector>
struct MyStruct
{
int m_iValue00 ;
int m_iValue01 ;
} ;
int main()
{
MyStruct aaa, bbb, ccc ;
std::vector<MyStruct> aMyStruct ;
aMyStruct.push_back(aaa) ;
aMyStruct.push_back(bbb) ;
aMyStruct.push_back(ccc) ;
aMyStruct.resize(6) ; // [EDIT] double the size
for(std::vector<MyStruct>::size_type i = 0, iMax = aMyStruct.size(); i < iMax; ++i)
{
std::cout << "[" << i << "] : " << aMyStruct[i].m_iValue00 << ", " << aMyStruct[0].m_iValue01 << "\n" ;
}
return 0 ;
}
gives the following results:
[0] : 134515780, -16121856
[1] : 134554052, -16121856
[2] : 134544501, -16121856
[3] : 0, -16121856
[4] : 0, -16121856
[5] : 0, -16121856
The initialization you saw was probably an artifact.
[EDIT] After the comment on resize, I modified the code to add the resize line. The resize effectively calls the default constructor of the object inside the vector, but if the default constructor does nothing, then nothing is initialized... I still believe it was an artifact (I managed the first time to have the whole vector zerooed with the following code:
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
So...
:-/
[EDIT 2] Like already offered by Arkadiy, the solution is to use an inline constructor taking the desired parameters. Something like
struct MyStruct
{
MyStruct(int p_d1, int p_d2) : d1(p_d1), d2(p_d2) {}
int d1, d2 ;
} ;
This will probably get inlined in your code.
But you should anyway study your code with a profiler to be sure this piece of code is the bottleneck of your application.
I tested a few of the approaches suggested here.
I allocated a huge set of data (200GB) in one container/pointer:
Compiler/OS:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Settings: (c++-17, -O3 optimizations)
g++ --std=c++17 -O3
I timed the total program runtime with linux-time
1.) std::vector:
#include <vector>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
}
real 0m36.246s
user 0m4.549s
sys 0m31.604s
That is 36 seconds.
2.) std::vector with boost::noinit_adaptor
#include <vector>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
}
real 0m0.002s
user 0m0.001s
sys 0m0.000s
So this solves the problem. Just allocating without initializing costs basically nothing (at least for large arrays).
3.) std::unique_ptr<T[]>:
#include <memory>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
auto data = std::unique_ptr<size_t[]>(new size_t[size]);
}
real 0m0.002s
user 0m0.002s
sys 0m0.000s
So basically the same performance as 2.), but does not require boost.
I also tested simple new/delete and malloc/free with the same performance as 2.) and 3.).
So the default-construction can have a huge performance penalty if you deal with large data sets.
In practice you want to actually initialize the allocated data afterwards.
However, some of the performance penalty still remains, especially if the later initialization is performed in parallel.
E.g., I initialize a huge vector with a set of (pseudo)random numbers:
(now I use fopenmp for parallelization on a 24 core AMD Threadripper 3960X)
g++ --std=c++17-fopenmp -O3
1.) std::vector:
#include <vector>
#include <random>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m41.958s
user 4m37.495s
sys 0m31.348s
That is 42s, only 6s more than the default initialization.
The problem is, that the initialization of std::vector is sequential.
2.) std::vector with boost::noinit_adaptor:
#include <vector>
#include <random>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m10.508s
user 1m37.665s
sys 3m14.951s
So even with the random-initialization, the code is 4 times faster because we can skip the sequential initialization of std::vector.
So if you deal with huge data sets and plan to initialize them afterwards in parallel, you should avoid using the default std::vector.
From your comments to other posters, it looks like you're left with malloc() and friends. Vector won't let you have unconstructed elements.
From your code, it looks like you have a vector of structs each of which comprises 2 ints. Could you instead use 2 vectors of ints? Then
copy(data1, data1 + count, back_inserter(v1));
copy(data2, data2 + count, back_inserter(v2));
Now you don't pay for copying a struct each time.
If you really insist on having the elements uninitialized and sacrifice some methods like front(), back(), push_back(), use boost vector from numeric . It allows you even not to preserve existing elements when calling resize()...
I'm not sure about all those answers that says it is impossible or tell us about undefined behavior.
Sometime, you need to use an std::vector. But sometime, you know the final size of it. And you also know that your elements will be constructed later.
Example : When you serialize the vector contents into a binary file, then read it back later.
Unreal Engine has its TArray::setNumUninitialized, why not std::vector ?
To answer the initial question
"Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?"
yes and no.
No, because STL doesn't expose a way to do so.
Yes because we're coding in C++, and C++ allows to do a lot of thing. If you're ready to be a bad guy (and if you really know what you are doing). You can hijack the vector.
Here a sample code that works only for the Windows's STL implementation, for another platform, look how std::vector is implemented to use its internal members :
// This macro is to be defined before including VectorHijacker.h. Then you will be able to reuse the VectorHijacker.h with different objects.
#define HIJACKED_TYPE SomeStruct
// VectorHijacker.h
#ifndef VECTOR_HIJACKER_STRUCT
#define VECTOR_HIJACKER_STRUCT
struct VectorHijacker
{
std::size_t _newSize;
};
#endif
template<>
template<>
inline decltype(auto) std::vector<HIJACKED_TYPE, std::allocator<HIJACKED_TYPE>>::emplace_back<const VectorHijacker &>(const VectorHijacker &hijacker)
{
// We're modifying directly the size of the vector without passing by the extra initialization. This is the part that relies on how the STL was implemented.
_Mypair._Myval2._Mylast = _Mypair._Myval2._Myfirst + hijacker._newSize;
}
inline void setNumUninitialized_hijack(std::vector<HIJACKED_TYPE> &hijackedVector, const VectorHijacker &hijacker)
{
hijackedVector.reserve(hijacker._newSize);
hijackedVector.emplace_back<const VectorHijacker &>(hijacker);
}
But beware, this is hijacking we're speaking about. This is really dirty code, and this is only to be used if you really know what you are doing. Besides, it is not portable and relies heavily on how the STL implementation was done.
I won't advise you to use it because everyone here (me included) is a good person. But I wanted to let you know that it is possible contrary to all previous answers that stated it wasn't.
Use the std::vector::reserve() method. It won't resize the vector, but it will allocate the space.
Do the structs themselves need to be in contiguous memory, or can you get away with having a vector of struct*?
Vectors make a copy of whatever you add to them, so using vectors of pointers rather than objects is one way to improve performance.
I don't think STL is your answer. You're going to need to roll your own sort of solution using realloc(). You'll have to store a pointer and either the size, or number of elements, and use that to find where to start adding elements after a realloc().
int *memberArray;
int arrayCount;
void GetsCalledALot(int* data1, int* data2, int count) {
memberArray = realloc(memberArray, sizeof(int) * (arrayCount + count);
for (int i = 0; i < count; ++i) {
memberArray[arrayCount + i].d1 = data1[i];
memberArray[arrayCount + i].d2 = data2[i];
}
arrayCount += count;
}
I would do something like:
void GetsCalledALot(int* data1, int* data2, int count)
{
const size_t mvSize = memberVector.size();
memberVector.reserve(mvSize + count);
for (int i = 0; i < count; ++i) {
memberVector.push_back(MyType(data1[i], data2[i]));
}
}
You need to define a ctor for the type that is stored in the memberVector, but that's a small cost as it will give you the best of both worlds; no unnecessary initialization is done and no reallocation will occur during the loop.