If I want to build a very simple array like:
int myArray[3] = {1,2,3};
Should I use std::array instead?
std::array<int, 3> a = {{1, 2, 3}};
What are the advantages of using std::array over usual ones? Is it more performant? Just easier to handle for copy/access?
What are the advantages of using std::array over usual ones?
It has friendly value semantics, so it can be passed to or returned from functions by value. Its interface also makes it more convenient to find the size and to use it with STL-style iterator-based algorithms.
Is it more performant ?
It should be exactly the same. By definition, it's a simple aggregate containing an array as its only member.
Just easier to handle for copy/access ?
Yes.
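A minimal sketch of those conveniences (identifiers are illustrative; assumes a C++11 compiler):
#include <algorithm>
#include <array>
#include <iostream>

// std::array can be passed and returned by value, unlike int[3].
std::array<int, 3> doubled(std::array<int, 3> a) {
    for (int& x : a) x *= 2;
    return a;
}

int main() {
    std::array<int, 3> a = {{1, 2, 3}};
    std::array<int, 3> b = a;                      // plain copy, no std::copy/memcpy needed
    b = doubled(b);                                // pass and return by value
    std::sort(b.begin(), b.end());                 // STL iterator-based algorithms work directly
    std::cout << b.size() << ' ' << b[2] << '\n';  // prints "3 6"
}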
A std::array is a very thin wrapper around a C-style array, basically defined as
template<typename T, size_t N>
struct array
{
    T _data[N];
    T& operator[](size_t);
    const T& operator[](size_t) const;
    // other member functions and typedefs
};
It is an aggregate, and it allows you to use it almost like a fundamental type (i.e. you can pass it by value, assign it, etc., whereas a standard C array cannot be assigned or copied directly into another array). You should take a look at a standard implementation (jump to definition from your favourite IDE or directly open <array>); it is a piece of the C++ standard library that is quite easy to read and understand.
std::array is designed as a zero-overhead wrapper around C arrays that gives them the "normal" value-like semantics of the other C++ containers.
You should not notice any difference in runtime performance while you still get to enjoy the extra features.
Using std::array instead of int[] style arrays is a good idea if you have C++11 or boost at hand.
Is it more performant ?
It should be exactly the same. By definition, it's a simple aggregate containing an array as its only member.
The situation seems to be more complicated, as std::array does not always produce identical assembly code compared to C-array depending on the specific platform.
I tested this specific situation on godbolt:
#include <array>
#include <cstddef>

void test(double* const C, const double* const A,
          const double* const B, const size_t size) {
    for (size_t i = 0; i < size; i++) {
        //double arr[2] = {0.e0};
        std::array<double, 2> arr = {0.e0}; // different from double arr[2] for some compilers
        for (size_t j = 0; j < size; j++) {
            arr[0] += A[i] * B[j];
            arr[1] += A[j] * B[i];
        }
        C[i] += arr[0];
        C[i] += arr[1];
    }
}
GCC and Clang produce identical assembly code for both the C-array version and the std::array version.
MSVC and ICPC, however, produce different assembly code for each array version. (I tested ICPC19 with -Ofast and -Os; MSVC -Ox and -Os)
I have no idea why this is the case (I would indeed expect exactly identical behavior of std::array and C-array). Maybe different optimization strategies are employed.
As a little extra:
There seems to be a bug in ICPC with
#pragma simd
for vectorization when using the c-array in some situations
(the c-array code produces a wrong output; the std::array version works fine).
Unfortunately, I do not have a minimal working example for that yet, since I discovered that problem while optimizing a quite complicated piece of code.
I will file a bug report to Intel once I am sure that I did not just misunderstand something about C-arrays/std::array and #pragma simd.
std::array has value semantics while raw arrays do not. This means you can copy std::array and treat it like a primitive value. You can receive them by value or reference as function arguments and you can return them by value.
If you never copy a std::array, there is no performance difference compared to a raw array. If you do need to make copies, std::array will do the right thing and should still give equal performance.
You will get the same performance results using std::array and a C array.
If you run this code:
std::array<QPair<int, int>, 9> *m_array = new std::array<QPair<int, int>, 9>();
QPair<int, int> *carr = new QPair<int, int>[10];
QElapsedTimer timer;
timer.start();
for (int j = 0; j < 1000000000; j++)
{
    for (int i = 0; i < 9; i++)
    {
        m_array->operator[](i).first = i + j;
        m_array->operator[](i).second = j - i;
    }
}
qDebug() << "std::array<QPair<int, int>" << timer.elapsed() << "milliseconds";
timer.start();
for (int j = 0; j < 1000000000; j++)
{
    for (int i = 0; i < 9; i++)
    {
        carr[i].first = i + j;
        carr[i].second = j - i;
    }
}
qDebug() << "QPair<int, int> took" << timer.elapsed() << "milliseconds";
return 0;
You will get these results:
std::array<QPair<int, int> 5670 milliseconds
QPair<int, int> took 5638 milliseconds
Mike Seymour is right, if you can use std::array you should use it.
Related
I would like to use a class with the same functionality as std::vector, but
Replace std::vector<T>::size_type by some signed integer (like int64_t or simply int), instead of the usual size_t. It is very annoying to see the warnings a compiler produces for comparisons between signed and unsigned numbers when I use the standard vector interface. I can't just disable such warnings, because they really help to catch programming errors.
Put assert(0 <= i && i < size()); inside operator[](int i) to catch out-of-range errors. As I understand it, this is a better option than calling .at() because I can disable assertions in release builds, so performance will be the same as in the standard implementation of the vector. It is almost impossible for me to use std::vector without manually checking the range before each operation, because operator[] is the source of almost all weird errors related to memory access.
The possible options that come to my mind are to
Inherit from std::vector. It is not a good idea, as said in the following question: Extending the C++ Standard Library by inheritance?.
Use composition (put std::vector inside my class) and repeat the whole interface of std::vector. This option forces me to track the current C++ standard, because the interface of some methods and iterators differs slightly between C++98, 11, 14, and 17. I would like to be sure that when C++20 becomes available, I can simply use it without reimplementing the whole interface of my vector.
An answer aimed more at the underlying problem, read from the comment:
For example, I don't know how to write in a ranged-based for way:
for (int i = a.size() - 2; i >= 0; i--) { a[i] = 2 * a[i+1]; }
You may change it to a generic one like this:
#include <iostream>
#include <vector>

std::vector<int> vec1{ 1, 2, 3, 4, 5, 6 };
std::vector<int> vec2 = vec1;

int main()
{
    // generic code
    for ( auto it = vec1.rbegin() + 1; it != vec1.rend(); it++ )
    {
        *it = 2 * *(it - 1);
    }

    // your code
    for (int i = vec2.size() - 2; i >= 0; i--)
    {
        vec2[i] = 2 * vec2[i+1];
    }

    for ( auto& el : vec1) { std::cout << el << std::endl; }
    for ( auto& el : vec2) { std::cout << el << std::endl; }
}
A range-based for is not used here because it cannot access elements relative to the current position.
Regarding point 1: we hardly ever get those warnings here, because we use vectors' size_type where appropriate and/or cast to it if needed (with a 'checked' cast like boost::numeric_cast for safety). Is that not an option for you? Otherwise, write a function to do it for you, i.e. the non-const version would be something like
#include <cstdint>
#include <vector>
#include <boost/numeric/conversion/cast.hpp>  // boost::numeric_cast, the "checked cast" mentioned above

template<class T>
T& ati(std::vector<T>& v, std::int64_t i)
{
    return v.at(boost::numeric_cast<typename std::vector<T>::size_type>(i));
}
And yes, inheriting is still a problem. And even if it weren't you'd break the definition of vector (and the Liskov substitution principle I guess), because the size_type is defined as
an unsigned integral type that can represent any non-negative value of difference_type
So it's down to composition, or a bunch of free functions for accessing with a signed size_type and a range check. Personally I'd go for the latter: less work, just as easy to use, and you can still pass your vector to functions taking vectors without problems.
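To round that out, a hypothetical signed-size helper to pair with ati() above (a sketch, not a library facility):
#include <cstdint>
#include <vector>

// Expose the size as a signed integer so loop counters can stay int64_t
// without triggering signed/unsigned comparison warnings.
template <class T>
std::int64_t ssize64(const std::vector<T>& v)
{
    return static_cast<std::int64_t>(v.size());
}

// usage: for (std::int64_t i = 0; i < ssize64(vec); ++i) { ... ati(vec, i) ... }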
(This is more a comment than a real answer, but has some code, so...)
For the second part (range checking at runtime), a third option would be to use some macro trick:
#ifdef DEBUG
#define VECTOR_AT(v,i) v.at(i)
#else
#define VECTOR_AT(v,i) v[i]
#endif
This can be used this way:
std::vector<sometype> vect(somesize);
VECTOR_AT(vect,i) = somevalue;
Of course, this requires editing your code in a quite non-standard way, which may not be an option. But it does the job.
I am trying to use a pointer to an array inside of a for each loop in C++. The code below won't work because the "for each statement cannot operate on variables of type 'int *'". I'd prefer to use the new operator so that the array is on the heap and not the stack, but I just can't seem to figure out the syntax here. Any suggestions?
#include <iostream>
using namespace std;

int main() {
    int total = 0;
    int* array = new int[6];
    array[0] = 10; array[1] = 20; array[2] = 30;
    array[3] = 40; array[4] = 50; array[5] = 60;
    for each(int i in array) {
        total += i;
    }
    cout << total << endl;
}
That for each thing you are using is a Visual C++ extension that's not even recommended by some microsoft employees (I know I've heard STL say bad things about it, I can't remember where).
There are other options, like std::for_each, and range-based for from C++11 (though I don't think Visual C++ supports that yet). However, that's not what you should be using here. You should be using std::accumulate, because this is the job that it was made for:
total = std::accumulate(array, array + 6, 0);
If you're really just interested in how to use this Microsoft for each construct, well, I'm pretty sure you can't if you just have a pointer. You should use a std::vector instead. You should be doing that anyway.
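A minimal, self-contained version of that suggestion, using std::vector instead of a raw new[] buffer (a sketch, assuming a C++11 compiler; values match the question):
#include <iostream>
#include <numeric>   // std::accumulate
#include <vector>

int main() {
    std::vector<int> values{10, 20, 30, 40, 50, 60};
    int total = std::accumulate(values.begin(), values.end(), 0);
    std::cout << total << std::endl;   // prints 210
}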
C++0x introduced range-based for loops, which work like foreach in other languages. The syntax for them is something like this:
int arr[5]={1,2,3,4,5};
for( int & tmp : arr )
{
//do something
}
These loops work for C-style arrays, initializer lists, and any type that has begin() and end() functions defined for it that return iterators.
An int * doesn't have begin() and end() functions that return iterators; it's just a raw pointer. Other foreach equivalents, such as foreach in Qt or the one you've posted, work the same way, so you can't use them like this either. MSDN says that it works for collections:
for each (type identifier in expression) {
statements
}
expression:
A managed array expression or collection. The compiler must be able
to convert the collection element from Object to the identifier type.
expression evaluates to a type that implements IEnumerable, IEnumerable<T>,
or a type that defines a GetEnumerator method. In the latter case,
GetEnumerator should either return a type that implements IEnumerator or
declare all the methods defined in IEnumerator.
Once again, you have a raw pointer, so it will not work.
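If a range-based for is available, here is a sketch of the question's example rewritten around std::vector, which does provide begin()/end() (assumes a C++11 compiler):
#include <iostream>
#include <vector>

int main() {
    std::vector<int> values{10, 20, 30, 40, 50, 60};   // heap-backed, unlike a raw new[] pointer
    int total = 0;
    for (int v : values)   // C++11 range-based for
        total += v;
    std::cout << total << std::endl;   // prints 210
}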
You can always use a plain for loop like this:
for (int i = 0; i < 6; i++)
{
    total += array[i];
}
An answer using "for each" with "gcnew" has already been given, so I am omitting that. As an alternative, you can also use a vector as follows:
#include <iostream>
#include <vector>
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
    int total = 0;
    vector<int> myarray;
    myarray.push_back(10);
    myarray.push_back(20);
    myarray.push_back(30);
    myarray.push_back(40);
    myarray.push_back(50);
    myarray.push_back(60);
    for each(int i in myarray) {
        total += i;
    }
    cout << total << endl;
    return 0;
}
Hope this will help...
The only way I can think of is iterating over an array of reference types, especially if you want your storage on the heap.
Here Microsoft shows you how to do so
But for your case, the simplest alternative (if you want your array on the heap) would be as follows:-
array<int>^ arr = gcnew array<int>{10, 20, 30, 40, 50, 60};
int total = 0;
for each (int i in arr){
    total += i;
}
gcnew creates an instance of a managed type (reference or value type) on the garbage collected heap. The result of the evaluation of a gcnew expression is a handle (^) to the type being created.
You have to use a standard library collection such as std::vector or std::array to use for each.
Please note that this code is not standard C++ and therefore not portable, because for each is a Visual C++ extension. I recommend using std::for_each or C++11 range-based for loops.
VC++ is not different from ISO/ANSI C++. Anybody who tells you that it is, is wrong. Now, to answer your question of the for each statement. There is no such statement in the ISO C++ specification. Microsoft supports the 'foreach' statement in C#, as part of the .Net framework. As a result, there might be a chance that this is supported in Visual Studio, although I would recommend not using it.
Like the user shubhansh answered a few replies back, try using a vector. However, I'm guessing you would like to use a generic size, rather than hard-coding it in. The following for loop would help you in this regard:
for (vector<int>::size_type i = 0; i < myarray.size(); i++)
{
    total += myarray[i];
}
This is the perfect way to iterate through a vector, as defined by the ISO standard.
Hope this helps you in your development.
Cheers!
Is it possible to do this without creating new data structure?
Suppose we have
struct Span{
    int from;
    int to;
};
vector<Span> s;
We want to get an integer vector from s directly, by casting
vector<Span> s;
to
vector<int> s;
so we could remove/change some "from", "to" elements, then cast it back to
vector<Span> s;
This is not really a good idea, but I'll show you how.
You can get a raw pointer to the integer this way:
int * myPointer2 = (int*)&(s[0]);
but this is really bad practice because you can't guarantee that the span structure doesn't have any padding, so while it might work fine for me and you today we can't say much for other systems.
#include <iostream>
#include <vector>

struct Span{
    int from;
    int to;
};

int main()
{
    std::vector<Span> s;
    Span a = { 1, 2};
    Span b = {2, 9};
    Span c = {10, 14};
    s.push_back(a);
    s.push_back(b);
    s.push_back(c);

    int * myPointer = (int*)&(s[0]);
    for(int k = 0; k < 6; k++)
    {
        std::cout << myPointer[k] << std::endl;
    }
    return 0;
}
As I said, that hard reinterpret cast will often work but is very dangerous and lacks the cross-platform guarantees you normally expect from C/C++.
The next worse thing is this, that will actually do what you asked but you should never do. This is the sort of code you could get fired for:
// Baaaad mojo here: turn a vector<span> into a vector<int>:
std::vector<int> * pis = (std::vector<int>*)&s;
for ( std::vector<int>::iterator It = pis->begin(); It != pis->end(); It++ )
std::cout << *It << std::endl;
Notice how I'm using a pointer to vector and pointing it at the address of the vector object s. My hope is that the internals of both vectors are the same, so I can use them just like that. For me this works, and while the standard may happen to make this the case here, it is not generally so for templated classes (think of padding and template specialization).
Consider instead copying out an array (see ref 2 below) or just using s[1].from and s[2].to.
Related Reading:
Are std::vector elements guaranteed to be contiguous?
How to convert vector to array in C++
If sizeof(Span) == sizeof(int) * 2 (that is, Span has no padding), then you can safely use reinterpret_cast<int*>(&v[0]) to get a pointer to array of int that you can iterate over. You can guarantee no-padding structures on a per-compiler basis, with __attribute__((__packed__)) in GCC and #pragma pack in Visual Studio.
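One way to guard that no-padding assumption at compile time (a sketch using a C++11 static_assert; v stands for the std::vector<Span> from the question):
static_assert(sizeof(Span) == 2 * sizeof(int),
              "Span has padding; the reinterpret_cast trick is not safe here");
int* flat = reinterpret_cast<int*>(&v[0]);   // only meaningful if the check above passes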
However, there is a way that is guaranteed by the standard. Define Span like so:
struct Span {
int endpoints[2];
};
endpoints[0] and endpoints[1] are required to be contiguous. Add some from() and to() accessors for your convenience, if you like, but now you can use reinterpret_cast<int*>(&v[0]) to your heart’s content.
But if you’re going to be doing a lot of this pointer-munging, you might want to make your own vector-like data structure that is more amenable to this treatment, one that offers more safety guarantees so you can avoid shooting yourself in the foot.
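A minimal sketch of that endpoints-based layout in use (assumes a C++11 compiler; the aliasing caveats discussed above still apply):
#include <cstddef>
#include <iostream>
#include <vector>

struct Span {
    int endpoints[2];                     // endpoints[0] = from, endpoints[1] = to
    int& from() { return endpoints[0]; }
    int& to()   { return endpoints[1]; }
};

int main() {
    std::vector<Span> v;
    v.push_back(Span{{1, 2}});
    v.push_back(Span{{10, 14}});

    int* flat = reinterpret_cast<int*>(&v[0]);   // 2 * v.size() contiguous ints
    for (std::size_t i = 0; i < 2 * v.size(); ++i)
        std::cout << flat[i] << '\n';            // prints 1 2 10 14
}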
Disclaimer: I have absolutely no idea about what you are trying to do. I am simply making educated guesses and showing possible solutions based on that. Hopefully I'll guess one right and you won't have to do crazy shenanigans with stupid casts.
If you want to remove a certain element from the vector, all you need to do is find it and remove it, using the erase function. You need an iterator to your element, and obtaining that iterator depends on what you know about the element in question. Given std::vector<Span> v;:
If you know its index:
v.erase(v.begin() + idx);
If you have an object that is equal to the one you're looking for:
Span doppelganger;
v.erase(std::find(v.begin(), v.end(), doppelganger));
If you have an object that is equal to what you're looking for but want to remove all equal elements, you need the erase-remove idiom:
Span doppelganger;
v.erase(std::remove(v.begin(), v.end(), doppelganger),
        v.end());
If you have some criterion to select the element:
v.erase(std::find_if(v.begin(), v.end(),
    [](Span const& s) { return s.from == 0; }));
// in C++03 you need a separate function for the criterion
bool starts_from_zero(Span const& s) { return s.from == 0; }
v.erase(std::find_if(v.begin(), v.end(), starts_from_zero));
If you have some criterion and want to remove all elements that fit that criterion, you need the erase-remove idiom again:
v.erase(std::remove_if(v.begin(), v.end(), starts_from_zero),
        v.end());
I work with a lot of calculation code written in C++ with high performance and low memory overhead in mind. It uses STL containers (mostly std::vector) a lot, and iterates over those containers in almost every single function.
The iterating code looks like this:
for (int i = 0; i < things.size(); ++i)
{
// ...
}
But it produces the signed/unsigned mismatch warning (C4018 in Visual Studio).
Replacing int with some unsigned type is a problem because we frequently use OpenMP pragmas, and it requires the counter to be int.
I'm about to suppress the (hundreds of) warnings, but I'm afraid I've missed some elegant solution to the problem.
On iterators. I think iterators are great when applied in appropriate places. The code I'm working with will never change random-access containers into std::list or something (so iterating with int i is already container agnostic), and will always need the current index. And all the additional code you need to type (iterator itself and the index) just complicates matters and obfuscates the simplicity of the underlying code.
It's all in the type of things.size(). It isn't int, but std::vector<T>::size_type, which is almost always a typedef for size_t, an unsigned type (e.g. unsigned int on x86_32).
The "less than" operator (<) cannot compare a signed and an unsigned operand directly: there are no such opcodes, so the usual arithmetic conversions apply, the signed operand is converted to unsigned (effectively treating the signed number as unsigned), and the compiler emits that warning.
It would be correct to write it like
for (size_t i = 0; i < things.size(); ++i) { /**/ }
or even faster
for (size_t i = 0, ilen = things.size(); i < ilen; ++i) { /**/ }
Ideally, I would use a construct like this instead:
for (std::vector<your_type>::const_iterator i = things.begin(); i != things.end(); ++i)
{
// if you ever need the distance, you may call std::distance
// it won't cause any overhead because the compiler will likely optimize the call
size_t distance = std::distance(things.begin(), i);
}
This has the neat advantage that your code suddenly becomes container agnostic.
And regarding your problem: if some library you use requires you to use int where an unsigned int would fit better, its API is messy. Anyway, if you are sure that those int values are always positive, you may just do:
int int_distance = static_cast<int>(distance);
Which will specify clearly your intent to the compiler: it won't bug you with warnings anymore.
If you can't/won't use iterators and if you can't/won't use std::size_t for the loop index, make a .size() to int conversion function that documents the assumption and does the conversion explicitly to silence the compiler warning.
#include <cassert>
#include <cstddef>
#include <limits>
// When using int loop indexes, use size_as_int(container) instead of
// container.size() in order to document the inherent assumption that the size
// of the container can be represented by an int.
template <typename ContainerType>
/* constexpr */ int size_as_int(const ContainerType &c) {
const auto size = c.size(); // if no auto, use `typename ContainerType::size_type`
assert(size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
return static_cast<int>(size);
}
Then you write your loops like this:
for (int i = 0; i < size_as_int(things); ++i) { ... }
The instantiation of this function template will almost certainly be inlined. In debug builds, the assumption will be checked. In release builds, it won't be and the code will be as fast as if you called size() directly. Neither version will produce a compiler warning, and it's only a slight modification to the idiomatic loop.
If you want to catch assumption failures in the release version as well, you can replace the assertion with an if statement that throws something like std::out_of_range("container size exceeds range of int").
Note that this solves both the signed/unsigned comparison as well as the potential sizeof(int) != sizeof(Container::size_type) problem. You can leave all your warnings enabled and use them to catch real bugs in other parts of your code.
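A sketch of that throwing variant (same assumptions as size_as_int above; the name is hypothetical):
#include <cstddef>
#include <limits>
#include <stdexcept>

template <typename ContainerType>
int size_as_int_checked(const ContainerType &c) {
    const auto size = c.size();
    if (size > static_cast<std::size_t>(std::numeric_limits<int>::max()))
        throw std::out_of_range("container size exceeds range of int");
    return static_cast<int>(size);
}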
You can use:
size_t type, to remove warning messages
iterators + distance (like the first hint)
only iterators
function object
For example:
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <vector>

// simple class that outputs its value
class ConsoleOutput
{
public:
    ConsoleOutput(int value) : m_value(value) { }
    int Value() const { return m_value; }
private:
    int m_value;
};

// function object
class Predicat
{
public:
    void operator()(ConsoleOutput const& item)
    {
        std::cout << item.Value() << std::endl;
    }
};

int main()
{
    // fill list
    std::vector<ConsoleOutput> list;
    list.push_back(ConsoleOutput(1));
    list.push_back(ConsoleOutput(8));

    // 1) using size_t
    for (size_t i = 0; i < list.size(); ++i)
    {
        std::cout << list.at(i).Value() << std::endl;
    }

    // 2) iterators + distance
    std::vector<ConsoleOutput>::iterator itDistance = list.begin(), endDistance = list.end();
    for ( ; itDistance != endDistance; ++itDistance)
    {
        // int or size_t
        int const position = static_cast<int>(std::distance(list.begin(), itDistance));
        std::cout << list.at(position).Value() << std::endl;
    }

    // 3) iterators
    std::vector<ConsoleOutput>::const_iterator it = list.begin(), end = list.end();
    for ( ; it != end; ++it)
    {
        std::cout << (*it).Value() << std::endl;
    }

    // 4) functional objects
    std::for_each(list.begin(), list.end(), Predicat());
}
C++20 now has std::cmp_less
In C++20, we have the standard constexpr functions
std::cmp_equal
std::cmp_not_equal
std::cmp_less
std::cmp_greater
std::cmp_less_equal
std::cmp_greater_equal
added in the <utility> header, exactly for this kind of scenarios.
Compare the values of two integers t and u. Unlike builtin comparison operators, negative signed integers always compare less than (and not equal to) unsigned integers: the comparison is safe against lossy integer conversion.
That means, if (for some weird reason) one must use i as an int in the loop and needs to compare it with an unsigned integer, that can be done:
#include <utility> // std::cmp_less
for (int i = 0; std::cmp_less(i, things.size()); ++i)
{
// ...
}
This also covers the case where we mistakenly static_cast -1 (i.e. an int) to unsigned int. That means the following will not give you an error:
static_assert(1u < -1);
But the usage of std::cmp_less will
static_assert(std::cmp_less(1u, -1)); // error
I can also propose the following solution for C++11.
for (auto p = 0U; p < sys.size(); p++) {
}
(C++ is not smart enough to deduce an unsigned type from auto p = 0, so I have to write p = 0U...)
I will give you a better idea
for(decltype(things.size()) i = 0; i < things.size(); i++){
//...
}
decltype is
Inspects the declared type of an entity or the type and value category
of an expression.
So it deduces the type of things.size(), and i will have the same type as things.size(). As a result,
i < things.size() compiles without any warning.
I had a similar problem. Using size_t was not working. I tried the other one which worked for me. (as below)
for (int i = things.size() - 1; i >= 0; i--)
{
    //...
}
I would just do
int pnSize = primeNumber.size();
for (int i = 0; i < pnSize; i++)
cout << primeNumber[i] << ' ';
I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.
Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?
(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)
Also, see my comments below for a clarification about when the initialization occurs.
SOME CODE:
void GetsCalledALot(int* data1, int* data2, int count) {
    int mvSize = memberVector.size();
    memberVector.resize(mvSize + count); // causes 0-initialization
    for (int i = 0; i < count; ++i) {
        memberVector[mvSize + i].d1 = data1[i];
        memberVector[mvSize + i].d2 = data2[i];
    }
}
std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you were to access the uninitialized section of the array as if it were initialized.
The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.
Using your example code:
struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

std::vector<YourData> memberVector;

void GetsCalledALot(int* data1, int* data2, int count) {
    int mvSize = memberVector.size();
    // Does not initialize the extra elements
    memberVector.reserve(mvSize + count);
    // Note: consider using std::generate_n or std::copy instead of this loop.
    for (int i = 0; i < count; ++i) {
        // Copy construct using a temporary.
        memberVector.push_back(YourData(data1[i], data2[i]));
    }
}
The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size though, at least the number of copies will be minimal on average.
In the current version of C++, the inner loop is a bit inefficient: a temporary value is constructed on the stack, copy-constructed into the vector's memory, and finally the temporary is destroyed. However, the next version of C++ has a feature called r-value references (T&&) which will help.
The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:
template <typename T>
class my_vector_replacement {
public:
    // ...
    template <typename F>
    void push_back_using_factory(F factory) {
        // ... check size of array, and resize if needed.
        // Copy construct using placement new.
        new (arrayData + end) T(factory());
        end += sizeof(T);
    }

private:
    char* arrayData;
    size_t end; // Of initialized data in arrayData
};
// One of many possible implementations
struct MyFactory {
MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
YourData operator()() const {
return YourData(*d1,*d2);
}
int* d1;
int* d2;
};
void GetsCalledALot(int* data1, int* data2, int count) {
// ... Still will need the same call to a reserve() type function.
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a factory
memberVector.push_back_using_factory(MyFactory(data1+i, data2+i));
}
}
Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times where using a factory function like this is better, for instance if the insert is conditional on some other value, and you would have to otherwise unconditionally construct some expensive temporary even if it wasn't actually needed.
In C++11 (and boost) you can use the array version of unique_ptr to allocate an uninitialized array. This isn't quite an stl container, but is still memory managed and C++-ish which will be good enough for many applications.
auto my_uninit_array = std::unique_ptr<mystruct[]>(new mystruct[count]);
C++0x adds a new member function template emplace_back to vector (which relies on variadic templates and perfect forwarding) that gets rid of any temporaries entirely:
memberVector.emplace_back(data1[i], data2[i]);
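A sketch of the questioner's function built on emplace_back (this assumes memberVector holds a struct with the two-argument constructor shown in the earlier answer):
void GetsCalledALot(int* data1, int* data2, int count) {
    memberVector.reserve(memberVector.size() + count);   // capacity only, no element initialization
    for (int i = 0; i < count; ++i)
        memberVector.emplace_back(data1[i], data2[i]);   // constructs in place, no temporary
}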
To clarify on reserve() responses: you need to use reserve() in conjunction with push_back(). This way, the default constructor is not called for each element, but rather the copy constructor. You still incur the penalty of setting up your struct on stack, and then copying it to the vector. On the other hand, it's possible that if you use
vect.push_back(MyStruct(fieldValue1, fieldValue2))
the compiler will construct the new instance directly in the memory that belongs to the vector. It depends on how smart the optimizer is. You need to check the generated code to find out.
You can use boost::noinit_adaptor to default initialize new elements (which is no initialization for built-in types):
std::vector<T, boost::noinit_adaptor<std::allocator<T>>> memberVector;
As long as you don't pass an initializer into resize, it default initializes the new elements.
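A sketch of the questioner's function with such an allocator (assumes Boost is available; the element type name Elem is illustrative):
#include <cstddef>
#include <memory>
#include <vector>
#include <boost/core/noinit_adaptor.hpp>

struct Elem { int d1; int d2; };

std::vector<Elem, boost::noinit_adaptor<std::allocator<Elem>>> memberVector;

void GetsCalledALot(int* data1, int* data2, int count) {
    std::size_t mvSize = memberVector.size();
    memberVector.resize(mvSize + count);   // new elements are default-initialized, i.e. left uninitialized for Elem
    for (int i = 0; i < count; ++i) {
        memberVector[mvSize + i].d1 = data1[i];
        memberVector[mvSize + i].d2 = data2[i];
    }
}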
So here's the problem: resize is calling insert, which does a copy construction from a default-constructed element for each of the newly added elements. To get this to zero cost you would need to write your own default constructor AND your own copy constructor as empty functions. Doing this to your copy constructor is a very bad idea, because it will break std::vector's internal reallocation algorithms.
Summary: You're not going to be able to do this with std::vector.
You can use a wrapper type around your element type, with a default constructor that does nothing. E.g.:
#include <type_traits>
#include <utility>

template <typename T>
struct no_init
{
    T value;
    no_init() { static_assert(std::is_standard_layout<no_init<T>>::value && sizeof(T) == sizeof(no_init<T>), "T does not have standard layout"); }
    no_init(const T& v) { value = v; }
    T& operator=(const T& v) { value = v; return value; }
    no_init(const no_init<T>& n) { value = n.value; }
    no_init(no_init<T>&& n) { value = std::move(n.value); }
    T& operator=(const no_init<T>& n) { value = n.value; return value; }
    T& operator=(no_init<T>&& n) { value = std::move(n.value); return value; }
    T* operator&() { return &value; } // So you can use &(vec[0]) etc.
};
To use:
std::vector<no_init<char>> vec;
vec.resize(2ul * 1024ul * 1024ul * 1024ul);
Err...
try the method:
std::vector<T>::reserve(x)
It will enable you to reserve enough memory for x items without initializing any (your vector is still empty). Thus, there won't be a reallocation until you go over x.
The second point is that vector won't initialize the values to zero. Are you testing your code in debug?
After verification on g++, the following code:
#include <iostream>
#include <vector>
struct MyStruct
{
int m_iValue00 ;
int m_iValue01 ;
} ;
int main()
{
MyStruct aaa, bbb, ccc ;
std::vector<MyStruct> aMyStruct ;
aMyStruct.push_back(aaa) ;
aMyStruct.push_back(bbb) ;
aMyStruct.push_back(ccc) ;
aMyStruct.resize(6) ; // [EDIT] double the size
for(std::vector<MyStruct>::size_type i = 0, iMax = aMyStruct.size(); i < iMax; ++i)
{
std::cout << "[" << i << "] : " << aMyStruct[i].m_iValue00 << ", " << aMyStruct[0].m_iValue01 << "\n" ;
}
return 0 ;
}
gives the following results:
[0] : 134515780, -16121856
[1] : 134554052, -16121856
[2] : 134544501, -16121856
[3] : 0, -16121856
[4] : 0, -16121856
[5] : 0, -16121856
The initialization you saw was probably an artifact.
[EDIT] After the comment on resize, I modified the code to add the resize line. The resize effectively calls the default constructor of the objects inside the vector, but if the default constructor does nothing, then nothing is initialized... I still believe it was an artifact (I managed the first time to have the whole vector zeroed with the following code:
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
So...
:-/
[EDIT 2] Like already offered by Arkadiy, the solution is to use an inline constructor taking the desired parameters. Something like
struct MyStruct
{
MyStruct(int p_d1, int p_d2) : d1(p_d1), d2(p_d2) {}
int d1, d2 ;
} ;
This will probably get inlined in your code.
But you should anyway study your code with a profiler to be sure this piece of code is the bottleneck of your application.
I tested a few of the approaches suggested here.
I allocated a huge set of data (200GB) in one container/pointer:
Compiler/OS:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Settings: (c++-17, -O3 optimizations)
g++ --std=c++17 -O3
I timed the total program runtime with linux-time
1.) std::vector:
#include <vector>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
}
real 0m36.246s
user 0m4.549s
sys 0m31.604s
That is 36 seconds.
2.) std::vector with boost::noinit_adaptor
#include <vector>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
}
real 0m0.002s
user 0m0.001s
sys 0m0.000s
So this solves the problem. Just allocating without initializing costs basically nothing (at least for large arrays).
3.) std::unique_ptr<T[]>:
#include <memory>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
auto data = std::unique_ptr<size_t[]>(new size_t[size]);
}
real 0m0.002s
user 0m0.002s
sys 0m0.000s
So basically the same performance as 2.), but does not require boost.
I also tested simple new/delete and malloc/free with the same performance as 2.) and 3.).
So the default-construction can have a huge performance penalty if you deal with large data sets.
In practice you want to actually initialize the allocated data afterwards.
However, some of the performance penalty still remains, especially if the later initialization is performed in parallel.
E.g., I initialize a huge vector with a set of (pseudo)random numbers:
(now I use -fopenmp for parallelization on a 24-core AMD Threadripper 3960X)
g++ --std=c++17 -fopenmp -O3
1.) std::vector:
#include <vector>
#include <random>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m41.958s
user 4m37.495s
sys 0m31.348s
That is 42 s, only 6 s more than with the default initialization.
The problem is that the initialization of std::vector is sequential.
2.) std::vector with boost::noinit_adaptor:
#include <vector>
#include <random>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m10.508s
user 1m37.665s
sys 3m14.951s
So even with the random-initialization, the code is 4 times faster because we can skip the sequential initialization of std::vector.
So if you deal with huge data sets and plan to initialize them afterwards in parallel, you should avoid using the default std::vector.
From your comments to other posters, it looks like you're left with malloc() and friends. Vector won't let you have unconstructed elements.
From your code, it looks like you have a vector of structs each of which comprises 2 ints. Could you instead use 2 vectors of ints? Then
copy(data1, data1 + count, back_inserter(v1));
copy(data2, data2 + count, back_inserter(v2));
Now you don't pay for copying a struct each time.
If you really insist on having the elements uninitialized and are willing to sacrifice some methods like front(), back(), push_back(), use the boost vector from numeric. It even allows you not to preserve the existing elements when calling resize()...
I'm not sure about all those answers that say it is impossible or tell us about undefined behavior.
Sometimes you need to use an std::vector. But sometimes you know its final size in advance, and you also know that your elements will be constructed later.
Example: when you serialize the vector contents into a binary file and then read them back later.
Unreal Engine has its TArray::SetNumUninitialized, so why not std::vector?
To answer the initial question
"Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?"
yes and no.
No, because STL doesn't expose a way to do so.
Yes, because we're coding in C++, and C++ lets you do a lot of things if you're ready to be a bad guy (and if you really know what you are doing): you can hijack the vector.
Here is sample code that works only with Microsoft's STL implementation; for another platform, look at how std::vector is implemented to use its internal members:
// This macro is to be defined before including VectorHijacker.h. Then you will be able to reuse the VectorHijacker.h with different objects.
#define HIJACKED_TYPE SomeStruct
// VectorHijacker.h
#ifndef VECTOR_HIJACKER_STRUCT
#define VECTOR_HIJACKER_STRUCT
struct VectorHijacker
{
std::size_t _newSize;
};
#endif
template<>
template<>
inline decltype(auto) std::vector<HIJACKED_TYPE, std::allocator<HIJACKED_TYPE>>::emplace_back<const VectorHijacker &>(const VectorHijacker &hijacker)
{
// We're modifying directly the size of the vector without passing by the extra initialization. This is the part that relies on how the STL was implemented.
_Mypair._Myval2._Mylast = _Mypair._Myval2._Myfirst + hijacker._newSize;
}
inline void setNumUninitialized_hijack(std::vector<HIJACKED_TYPE> &hijackedVector, const VectorHijacker &hijacker)
{
hijackedVector.reserve(hijacker._newSize);
hijackedVector.emplace_back<const VectorHijacker &>(hijacker);
}
But beware, this is hijacking we're speaking about. This is really dirty code, and this is only to be used if you really know what you are doing. Besides, it is not portable and relies heavily on how the STL implementation was done.
I won't advise you to use it because everyone here (me included) is a good person. But I wanted to let you know that it is possible contrary to all previous answers that stated it wasn't.
Use the std::vector::reserve() method. It won't resize the vector, but it will allocate the space.
Do the structs themselves need to be in contiguous memory, or can you get away with having a vector of struct*?
Vectors make a copy of whatever you add to them, so using vectors of pointers rather than objects is one way to improve performance.
I don't think STL is your answer. You're going to need to roll your own sort of solution using realloc(). You'll have to store a pointer and either the size, or number of elements, and use that to find where to start adding elements after a realloc().
// The stored struct holds the two ints from the original example.
struct Item { int d1; int d2; };

Item* memberArray = NULL;
int arrayCount = 0;

void GetsCalledALot(int* data1, int* data2, int count) {
    memberArray = (Item*)realloc(memberArray, sizeof(Item) * (arrayCount + count));
    for (int i = 0; i < count; ++i) {
        memberArray[arrayCount + i].d1 = data1[i];
        memberArray[arrayCount + i].d2 = data2[i];
    }
    arrayCount += count;
}
I would do something like:
void GetsCalledALot(int* data1, int* data2, int count)
{
    const size_t mvSize = memberVector.size();
    memberVector.reserve(mvSize + count);
    for (int i = 0; i < count; ++i) {
        memberVector.push_back(MyType(data1[i], data2[i]));
    }
}
You need to define a ctor for the type that is stored in the memberVector, but that's a small cost as it will give you the best of both worlds; no unnecessary initialization is done and no reallocation will occur during the loop.