cannot access the vector members of a class - c++

I have tried to access the members of a class Part that are vector elements of type integer inside the vector tasks.
#include <iostream>
#include <vector>
using namespace std;
class Part{
vector<int> tasks;
public:
void setTasks(void);
void getTasks(void);
};
void Part::setTasks(void){
vector<int>::iterator it;
int i=1;
for (it = this->tasks.begin(); it != this->tasks.end(); ++it)
{
*it=i;
i=i+1;
}
}
void Part::getTasks(void){
vector<int>::iterator it;
for (it = this->tasks.begin(); it != this->tasks.end(); ++it)
cout<<*it<<"\t";
}
int main()
{
Part one;
one.setTasks();
one.getTasks();
return 0;
}
I am simply trying to access the values and print them yet failing. There is no compilation error. In run-time, nothing is outputted in the terminal. Where is the error?

A default constructed vector has zero size, so the for loop in setTasks is never entered (since the begin() and end() iterators are the same at that point). If you set an initial size to the vector your code will work as intended. For instance, try adding the following at the beginning of setTasks
tasks.resize(10); // sets vector size to 10 elements, each initialized to 0
Another way to write that function would be
#include <numeric>
...
void Part::setTasks(void){
tasks.resize(10);
std::iota(tasks.begin(), tasks.end(), 1); // requires C++11
}
You could also set the initial size of the vector in the default constructor of Part if you wanted to. In that case add the following public constructor
Part() : tasks(10)
{}
Yet another way to achieve setting the size upon construction would be
class Part{
vector<int> tasks = vector<int>(10); // requires C++11

The size of your vector is 0 when you call setTasks(), so begin() equals end() and the body of the for loop never runs. You need to think about what exactly you want your setTasks() to do. How many elements of the vector did you intend to set? You should either construct your vector with that size, or call push_back that many times to fill it with the desired values.
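The push_back route can be sketched as follows. This is only a minimal illustration: the `count` parameter and the getter returning a const reference are assumptions, since the original setTasks() and getTasks() take no arguments and return nothing.

```cpp
#include <vector>

class Part {
    std::vector<int> tasks;   // default-constructed: size() == 0
public:
    // Appends the values 1..count. push_back grows the vector as needed,
    // so no element is indexed before it exists.
    // NOTE: the count parameter is an assumption, not part of the original class.
    void setTasks(int count) {
        tasks.clear();
        for (int i = 1; i <= count; ++i)
            tasks.push_back(i);
    }

    // Read-only access to the stored values (hypothetical replacement
    // for the printing getTasks() in the question).
    const std::vector<int>& getTasks() const { return tasks; }
};
```

Calling one.setTasks(3) on a fresh Part would leave the vector holding 1, 2, 3.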

Your vector is empty. Try giving it a size. For example, vector<int> tasks(10). See option 3 in this.

Alternatively, you can use a "back-insert" iterator (#include <iterator>), which internally calls std::vector::push_back, like this:
void Part::setTasks(void){
auto back_it = std::back_inserter(tasks);
for(int i = 0; i < 10; ++i)
*back_it++ = i;
}
This kind of iterator is especially useful in algorithms where your destination size is unknown. Although if you know the size in advance, you should use reserve/resize or specify the size at construction, since push-ing back into a vector can sometimes be slow due to re-allocation.
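When the size is known in advance, reserve and a back-insert iterator can be combined so that only one allocation happens. A minimal sketch, assuming a free helper function (fill_consecutive is a hypothetical name, not something from the question):

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Fills `out` with n consecutive values starting at `first`.
// fill_consecutive is a hypothetical helper used only for illustration.
void fill_consecutive(std::vector<int>& out, std::size_t n, int first) {
    out.clear();
    out.reserve(n);   // allocates capacity up front; size() is still 0
    int next = first;
    // back_inserter calls push_back for every generated value,
    // so the vector's size grows while its capacity stays put.
    std::generate_n(std::back_inserter(out), n, [&next] { return next++; });
}
```

Because the capacity is reserved first, none of the push_back calls made through the iterator should need to reallocate.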

Related

I am fairly new to the STL in C++ and I tried making a heap using vectors. I didn't get the desired output.

#include<bits/stdc++.h>
using namespace std;
class Heap
{
vector <int> v;
int length;
public:
void create(vector <int> v, int s);
void display();
};
void Heap::create(vector <int> v, int s)
{
length=s+1;
for(int i=1;i<=s;i++)
{
this->v[i]=v[i-1];
}
int temp;
int j;
for(int i=2;i<length;i++)
{
temp=v[i];
j=i;
while(j>1&&temp>v[j/2])
{
swap(v[j],v[j/2]);
j=j/2;
}
if(j==1)
{
v[j]=temp;
}
}
}
void Heap::display()
{
for(int i=1;i<length;i++)
{
cout<<v[i]<<"\t";
}
cout<<endl;
}
int main()
{
vector <int> v;
int ans=1;
int d;
while(ans==1)
{
cout<<"Enter the Data\n";
cin>>d;
v.push_back(d);
cout<<"Do you want to enter more data?\n";
cin>>ans;
}
cout<<endl;
Heap h;
h.create(v,((int)v.size()));
h.display();
}
When I execute this code, it asks me to enter the data value. I enter all the data values I want to enter and click the enter button. It shows segmentation error. Also, the execution is taking a lot of time, which is very unusual. I use Code::Blocks version 20.
When I execute this code, it asks me to enter the data value. I enter all the data values I want to enter and click the enter button
Yeah, I'm not interested in guessing what you typed in order to reproduce your problem. I'm also not interested in guessing whether the issue is in your I/O code or the code you think you're testing.
Always remove interactive input when you're preparing a minimal reproducible example so that other people can actually reproduce it.
Sometimes removing the interactive input may fix your problem, in which case you've learnt something important (and probably want to ask a different question about your input code).
it shows segmentation error
A segmentation fault interrupts your program at the exact point where it happens. If you run your program in a debugger, it will show you where this is, and the state of everything in your program when it happened. You should try this, and learn to use your debugger.
this->v[i]=v[i-1];
As correctly pointed out in the other answer, there is a bug on this line.
You correctly called push_back when reading input, so you could just do the same here. Alternatively you need to explicitly size this->v before indexing elements that don't exist.
The other main problem with this function is that it mixes up this->v (used, illegally, only once on the line above) and v which is a local copy of the v in main, and which goes out of scope and is lost forever at the end of the function.
Just give your variables different names so you don't have to write this->v on all the other lines where you currently refer to v. Also, consider passing the original v by const ref instead of making a copy.
NB. I do see and understand that you're deliberately switching to 1-based indexing for the sort. If for some reason you can't just use std::sort or std::make_heap, you could at least explicitly set the zeroth element to zero, and then just std::copy the rest.
Finally, Heap::create really looks like it should just be a constructor. Forcing two-phase initialization is poor style in general, and I don't see any reason for it here.
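Putting those suggestions together (constructor instead of create, const reference parameter, distinct member name, std::make_heap), a Heap class might look roughly like this. It is a sketch under the assumption that the standard algorithm is allowed; the member and accessor names are invented for illustration.

```cpp
#include <algorithm>
#include <vector>

class Heap {
    std::vector<int> data_;   // 0-based storage; data_.size() replaces `length`
public:
    // Copies the input once into the member, then arranges it as a max-heap.
    explicit Heap(const std::vector<int>& values) : data_(values) {
        std::make_heap(data_.begin(), data_.end());
    }

    bool empty() const { return data_.empty(); }

    // Largest element; only meaningful when the heap is not empty.
    int top() const { return data_.front(); }

    // Read-only view of the underlying heap-ordered storage.
    const std::vector<int>& items() const { return data_; }
};
```

Since the constructor sizes the member by copying, there is no point at which an element is indexed before it exists.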
First issue: you use this->v before giving it any elements. At this point:
this->v[i]=v[i-1];
this->v has size 0, so it has no elements that can be accessed by index.
Furthermore, you have used the wrong indices for it. Assuming this->v has been sized, correct index access looks like this:
this->v[i-1]=v[i-1];
Finally, it is better to sort the std vectors by using std::sort builtin function:
#include <algorithm>
std::sort(this->v.begin(), this->v.end());
This is obviously a school exercise. So I will only give you pointers as to where your code goes wrong.
class Heap
{
// vector <int> v; // v is not a suitable name for a class member, it's too short
// int length; // why length ? Your container declared above has length information, using
// a duplicate can only introduce opportunities for bugs!!!
vector<int> heap; // I've also renamed it in code below
public:
void create(vector <int> v, int s);
void display();
};
// some documentation is needed here...
// As I read it, it should be something like this, at least (this may reveal some bug):
//
// Initializes heap from range [v[1], v[s])
// void Heap::create(vector <int> v, int s) // you do not need a copy of the source vector!
void Heap::create(const vector<int>& v, int s) // use a const reference instead.
{
// This is not how you assign a vector from a range
// length=s+1;
// for(int i=1;i<=s;i++)
// {
// this->v[i]=v[i-1];
// }
// check inputs always, I'll throw, but you should decide how to handle errors
// This test assumes you want to start at v[1], see comment below.
if (0 > s || s >= v.size())
throw std::out_of_range ("parameter 's' is out of range in Heap::Create()");
// assign our private storage, why the offset by '1' ????
// I doubt this is a requirement of the assignment.
heap.assign(v.begin() + 1, v.begin() + s + 1);
//int temp; // these trivial variables are not needed outside the loop.
//int j;
// why '2' ?? what happens to the first element of heap?
// shouldn't the largest element be already stored there at this point?
// something is obviously missing before this line.
// you'll notice that v - the parameter - is used, and heap, our
// private storage is left unchanged by your code. Another hint
// that v is not suitable for a member name.
for(int i = 2; i < (int)v.size(); i++) // std::vector has size(), not length()
{
int temp = v[i]; // temp defined here
int j = i;
//while(j > 1 && temp > v[j/2]) // avoid using while() when you can use for().
//{
// swap(v[j],v[j/2]);
// j=j/2;
//}
// This is your inner loop. it does not look quite right
for (; j > 1 && temp > v[j / 2]; j = j / 2)
swap(v[j], v[j/2]);
if (j == 1)
v[j] = temp;
}
}
void Heap::display()
{
for(int i=1;i<length;i++)
{
cout<<v[i]<<"\t";
}
cout<<endl;
}
From reading your code, it seems you forgot that vectors are zero-based arrays, i.e. The first element of vector v is v[0], and not v[1]. This creates all kinds of near unrecoverable errors in your code.
As a matter of personal preference, I'd declare Heap as deriving publicly from std::vector, instead of storing data in a member variable. Just something you should consider. You could use std::vector<>::at() to access and assign elements within the object.
As is, your code will not function correctly, even after fixing the memory access errors.
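For comparison, a hand-written insertion into a max-heap with zero-based indexing could be sketched as below. This is not the original poster's algorithm; heap_push is a hypothetical helper, and the only assumption is the usual array layout where the parent of index i is (i - 1) / 2.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inserts `value` into a max-heap stored 0-based in `heap`.
// heap_push is a hypothetical helper name used for illustration.
void heap_push(std::vector<int>& heap, int value) {
    heap.push_back(value);                  // grow first, then index
    std::size_t i = heap.size() - 1;
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;   // 0-based parent index
        if (heap[parent] >= heap[i])
            break;                          // heap property already holds
        std::swap(heap[parent], heap[i]);   // sift the new value up
        i = parent;
    }
}
```

Every index used here is below heap.size(), so no out-of-range access occurs, and the root ends up at heap[0].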

Declaration of Vectors

Vectors size dynamically, so why is this giving a seg fault:
#include <iostream>
#include <string>
#include <vector>
using namespace std;
int main(){
vector<int> vectorOfInts;
vectorOfInts[0] = 3;
}
What I'm trying to actually do is declare a vector in a class.
#include <iostream>
#include <string>
#include <vector>
using namespace std;
class Directory{
public:
string name;
int maxIndex;
vector<Directory> subDirectories;
void addSubdirectory(string x){
Directory newSubdirectory(x);
subDirectories[maxIndex++] = newSubdirectory;
}
Directory(string x){
name = x;
maxIndex = 0;
}
};
int main(){
Directory root("root");
root.addSubdirectory("games");
}
But this also gives a seg fault.
Vectors don't resize entirely automatically. You use push_back or resize to change the size of a vector at run-time, but the vector will not automatically resize itself based on the index you use--if you index beyond its current size, you get undefined behavior.
In your demo code, you could do something like this:
vector<int> vectorOfInts(1);
vectorOfInts[0] = 3;
Alternatively, since you're just adding 3 to the end of the existing data (or nonexistent data, in this case) you could just use push_back (or emplace_back):
vector<int> vectorOfInts;
vectorOfInts.push_back(3);
It looks like the same basic approach will work with your real code as well. It also simplifies things a bit, since you don't need to explicitly track the maxIndex as you've done.
A default-constructed vector has no elements (i.e. its size() returns zero).
The operator[] does not check if it is supplied a valid index, and gives undefined behaviour if supplied an invalid index. It does not resize the vector. A vector with size zero has no valid indices.
That combination explains your problem.
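The difference between unchecked and checked access can be made visible with at(), which throws instead of invoking undefined behaviour. A small sketch (write_rejected is a hypothetical helper, not part of the question):

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Tries to write `value` at index `i` using bounds-checked access.
// Returns true if the index was rejected as out of range.
// write_rejected is a hypothetical helper used only for illustration.
bool write_rejected(std::vector<int>& v, std::size_t i, int value) {
    try {
        v.at(i) = value;   // at() throws std::out_of_range for i >= size()
        return false;
    } catch (const std::out_of_range&) {
        return true;       // the vector was not resized by the attempt
    }
}
```

On an empty vector, index 0 is rejected; after a push_back, the same index is accepted.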
The seg fault comes from the fact that you try to access an element that does not exist. When you use operator[], be sure that the element already exists, i.e. that it was created using resize, push_back, emplace_back...
To make your code work, just replace this
void
addSubdirectory(string x)
{
Directory newSubdirectory(x);
subDirectories[maxIndex++] = newSubdirectory;
}
by
void
addSubdirectory(string x)
{
subDirectories.emplace_back(x); // c++11
// else subDirectories.push_back(Directory(x));
}
and you don't need maxIndex any more: the index of the last element is available from the size method as subDirectories.size() - 1.
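With that change applied, the whole class could be sketched like this. The subdirectoryCount accessor is an addition for illustration, and the sketch assumes a standard library that accepts std::vector of an incomplete type as a member (guaranteed since C++17, and what the question's own code already relies on).

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class Directory {
public:
    std::string name;
    std::vector<Directory> subDirectories;   // starts empty

    explicit Directory(std::string n) : name(std::move(n)) {}

    // Constructs the child in place at the end of the vector (C++11).
    void addSubdirectory(const std::string& childName) {
        subDirectories.emplace_back(childName);
    }

    // Number of children; replaces the hand-maintained maxIndex.
    // (Hypothetical accessor, not in the original class.)
    std::size_t subdirectoryCount() const { return subDirectories.size(); }
};
```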

Initialize a C++ vector with a variable initial value

I was coding up a union-find data structure and was trying to initialize the parent vector so that parent[i] = i. Is there a way in C++ to initialize a vector like this, that is, to declare a vector of size N and give each element a position-dependent value rather than one fixed value (without using any obvious for loops)?
This is what I was looking for:
std::vector<int> parent(Initializer);
where Initializer is some class or a function.
To try out my hand a bit, I wrote this:
#include <iostream>
#include <vector>
using namespace std;
class Initializer {
private:
static int i;
public:
int operator() ()
{
return i++;
}
};
int main()
{
vector<int> parent(Initializer);
cout << parent[0];
return 0;
}
However I think I have messed up my concepts pretty bad here, and I am not getting what the declaration means, or what it is doing.
Please answer both the questions,
(1) How to initialize a vector with variable initial values.
(2) What exactly is the code I wrote doing?
This is a function declaration:
vector<int> parent(Initializer);
Because Initializer is a type name, you declared a function parent that takes an Initializer as an (unnamed) parameter and returns vector<int>. See Most vexing parse.
To do what you want, you can do this:
std::vector<int> parent(N); // where N is the size you want
std::iota(parent.begin(), parent.end(), 0); // fill it with consecutive values
// starting with 0
There's std::generate algorithm that you can use to save result of a function (or function object) in a range:
std::generate(parent.begin(), parent.end(), Initializer());
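A function object along the lines of the question's Initializer can be made to work with std::generate. In this sketch the counter is an ordinary member rather than a static, which avoids needing an out-of-class definition; the names Counter and counted are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Function object returning 0, 1, 2, ... on successive calls.
// The counter is a regular member, so nothing else needs defining.
class Counter {
    int next_ = 0;
public:
    int operator()() { return next_++; }
};

// Builds a vector of n elements where element i holds the value i.
std::vector<int> counted(std::size_t n) {
    std::vector<int> parent(n);
    // Counter{} creates a temporary object. Braces cannot be read as a
    // function declaration, unlike `vector<int> parent(Counter);`.
    std::generate(parent.begin(), parent.end(), Counter{});
    return parent;
}
```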
There are several alternatives. If you want to initialize the vector with increasing values, then you can use std::iota.
std::vector<int> vec(size);
std::iota(std::begin(vec), std::end(vec), 0);
If you want something more general you could use std::generate.
std::vector<int> vec(size);
int n = 0;
std::generate(std::begin(vec), std::end(vec), [&n]() {return n++;});
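Applied to the union-find use case from the question, the std::iota approach might be wrapped up as follows. This is a sketch: make_parent and find_root are hypothetical names, and find_root deliberately omits path compression to stay short.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Builds the initial parent array for a union-find over n elements:
// every element starts as its own root, so parent[i] == i.
std::vector<int> make_parent(std::size_t n) {
    std::vector<int> parent(n);                   // n elements, all zero
    std::iota(parent.begin(), parent.end(), 0);   // 0, 1, 2, ..., n-1
    return parent;
}

// Follows parent links until an element that is its own parent is found.
int find_root(const std::vector<int>& parent, int x) {
    while (parent[x] != x)
        x = parent[x];
    return x;
}
```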

How to pass a vector to another vector push back? (without creating a extra variable to pass)

Well I am questioning myself if there is a way to pass a vector directly in a parameter, with that I mean, like this:
int xPOS = 5, yPOS = 6, zPOS = 2;
//^this is actually a struct but
//I simplified the code to this
std::vector <std::vector<int>> NodePoints;
NodePoints.push_back(
std::vector<int> {xPOS,yPOS,zPOS}
);
This code of course gives an error: typename not allowed, and expected a ')'
I would have used a struct, but I have to pass the data to a Abstract Virtual Machine where I need to access the node positions as Array[index][index] like:
public GPS_WhenRouteIsCalculated(...)
{
for(new i = 0; i < amount_of_nodes; ++i)
{
printf("Point(%d)=NodeID(%d), Position(X;Y;Z):{%f;%f;%f}",i,node_id_array[i],NodePosition[i][0],NodePosition[i][1],NodePosition[i][2]);
}
return 1;
}
Of course I could do it like this:
std::vector <std::vector<int>> NodePoints;//global
std::vector<int> x;//local
x.push_back(xPOS);
x.push_back(yPOS);
x.push_back(zPOS);
NodePoints.push_back(x);
or this:
std::vector <std::vector<int>> NodePoints;//global
std::vector<int> x;//global
x.push_back(xPOS);
x.push_back(yPOS);
x.push_back(zPOS);
NodePoints.push_back(x);
x.clear();
but then I'm wondering which of the two would be faster/more efficient/better?
Or is there a way to get my initial code working (first snippet)?
Use C++11, or something from boost for this (also you can use simple v.push_back({1,2,3}), vector will be constructed from initializer_list).
http://liveworkspace.org/code/m4kRJ$0
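The initializer-list form mentioned above can be sketched in isolation like this, assuming a C++11 compiler. The add_point helper is hypothetical and exists only to make the call easy to try out.

```cpp
#include <vector>

// Appends one {x, y, z} row without a named temporary (C++11).
// add_point is a hypothetical helper used only for illustration.
void add_point(std::vector<std::vector<int>>& points, int x, int y, int z) {
    // The braced list constructs the inner std::vector<int> directly
    // as the argument of push_back.
    points.push_back({x, y, z});
}
```

After add_point(NodePoints, 5, 6, 2), the values are reachable as NodePoints[0][0], NodePoints[0][1] and NodePoints[0][2].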
You can use boost::assign as well, if you have no C++11.
#include <vector>
#include <boost/assign/list_of.hpp>
using namespace boost::assign;
int main()
{
std::vector<std::vector<int>> v;
v.push_back(list_of(1)(2)(3));
}
http://liveworkspace.org/code/m4kRJ$5
and of course you can use old variant
int ptr[] = {1, 2, 3};
v.push_back(std::vector<int>(ptr, ptr + sizeof(ptr) / sizeof(*ptr)));
If you don't have access to either Boost or C++11 then you could consider quite a simple solution based around a class. By wrapping a vector to store your three points within a class with some simple access controls, you can create the flexibility you need. First create the class:
class NodePoint
{
public:
NodePoint( int a, int b, int c )
{
dim_.push_back( a );
dim_.push_back( b );
dim_.push_back( c );
}
int& operator[]( size_t i ){ return dim_[i]; }
private:
vector<int> dim_;
};
The important thing here is to encapsulate the vector as an aggregate of the object. The NodePoint can only be initialised by providing the three points. I've also provided operator[] to allow indexed access to the object. It can be used as follows:
NodePoint a(5, 6, 2);
cout << a[0] << " " << a[1] << " " << a[2] << endl;
Which prints:
5 6 2
Note that operator[] as written simply forwards to std::vector::operator[], which does not check bounds, so an out-of-range index is undefined behaviour rather than an exception, just as with a fixed array. If you want an exception on a bad index, return dim_.at(i) instead, which throws std::out_of_range. I don't see this as a perfect solution but it should get you reasonably safely to where you want to be.
If your main goal is to avoid unnecessary copies of vector<> then here how you should deal with it.
C++03
Insert an empty vector into the nested vector (e.g. Nodepoints) and then use std::swap() or std::vector::swap() upon it.
NodePoints.push_back(std::vector<int>()); // add an empty vector
std::swap(x, NodePoints.back()); // swaps contents of `x` and last element of `NodePoints`
So after the swap(), the contents of x will be transferred to NodePoints.back() without any copying.
C++11
Use std::move() to avoid extra copies
NodePoints.push_back(std::move(x)); // #include<utility>
Here is the explanation of std::move and here is an example.
Both of the above solutions have somewhat similar effect.
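The C++11 variant can be wrapped in a small function to show where the move happens. A minimal sketch, with add_row as a hypothetical helper name:

```cpp
#include <utility>
#include <vector>

// Takes `row` by value so the caller decides whether to copy or move
// into the parameter, then moves the parameter into the outer vector.
// add_row is a hypothetical helper used only for illustration.
void add_row(std::vector<std::vector<int>>& points, std::vector<int> row) {
    points.push_back(std::move(row));   // steals row's buffer, no element copy
}
```

A caller that no longer needs its local vector x would write add_row(NodePoints, std::move(x)); a caller that wants to keep x would pass it without std::move and pay for one copy.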

STL vectors with uninitialized storage?

I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.
Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?
(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)
Also, see my comments below for a clarification about when the initialization occurs.
SOME CODE:
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size();
memberVector.resize(mvSize + count); // causes 0-initialization
for (int i = 0; i < count; ++i) {
memberVector[mvSize + i].d1 = data1[i];
memberVector[mvSize + i].d2 = data2[i];
}
}
std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you were to access the uninitialized section of the array as if it were initialized.
The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.
Using your example code:
struct YourData {
int d1;
int d2;
YourData(int v1, int v2) : d1(v1), d2(v2) {}
};
std::vector<YourData> memberVector;
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size();
// Does not initialize the extra elements
memberVector.reserve(mvSize + count);
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a temporary.
memberVector.push_back(YourData(data1[i], data2[i]));
}
}
The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size though, at least the number of copies will be minimal on average.
In the current version of C++, the inner loop is a bit inefficient as a temporary value is constructed on the stack, copy-constructed to the vector's memory, and finally the temporary is destroyed. However the next version of C++ has a feature called rvalue references (T&&) which will help.
The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:
template <typename T>
class my_vector_replacement {
// ...
template <typename F>
void push_back_using_factory(F factory) {
// ... check size of array, and resize if needed.
// Copy construct using placement new,
new(arrayData+end) T(factory());
end += sizeof(T);
}
char* arrayData;
size_t end; // Of initialized data in arrayData
};
// One of many possible implementations
struct MyFactory {
MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
YourData operator()() const {
return YourData(*d1,*d2);
}
int* d1;
int* d2;
};
void GetsCalledALot(int* data1, int* data2, int count) {
// ... Still will need the same call to a reserve() type function.
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a factory
memberVector.push_back_using_factory(MyFactory(data1+i, data2+i));
}
}
Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times where using a factory function like this is better, for instance if the insert is conditional on some other value, and you would have to otherwise unconditionally construct some expensive temporary even if it wasn't actually needed.
In C++11 (and boost) you can use the array version of unique_ptr to allocate an uninitialized array. This isn't quite an stl container, but is still memory managed and C++-ish which will be good enough for many applications.
auto my_uninit_array = std::unique_ptr<mystruct[]>(new mystruct[count]);
C++0x adds a new member function template emplace_back to vector (which relies on variadic templates and perfect forwarding) that gets rid of any temporaries entirely:
memberVector.emplace_back(data1[i], data2[i]);
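Combined with reserve(), the emplace_back form might look like the sketch below. It reuses the YourData struct from the earlier answer; append_pairs is a hypothetical name, and the output vector is passed in as a parameter instead of being a member.

```cpp
#include <cstddef>
#include <vector>

struct YourData {
    int d1;
    int d2;
    YourData(int v1, int v2) : d1(v1), d2(v2) {}
};

// Appends `count` elements built from the two input arrays.
// append_pairs is a hypothetical helper used only for illustration.
void append_pairs(std::vector<YourData>& out,
                  const int* data1, const int* data2, std::size_t count) {
    out.reserve(out.size() + count);            // capacity only, nothing constructed
    for (std::size_t i = 0; i < count; ++i)
        out.emplace_back(data1[i], data2[i]);   // constructed in place from the arguments
}
```

One caveat worth measuring: calling reserve() with an exact size on every call can defeat the vector's geometric growth, so for many small appends it may be cheaper to reserve once up front or not at all.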
To clarify on reserve() responses: you need to use reserve() in conjunction with push_back(). This way, the default constructor is not called for each element, but rather the copy constructor. You still incur the penalty of setting up your struct on stack, and then copying it to the vector. On the other hand, it's possible that if you use
vect.push_back(MyStruct(fieldValue1, fieldValue2))
the compiler will construct the new instance directly in the memory that belongs to the vector. It depends on how smart the optimizer is. You need to check the generated code to find out.
You can use boost::noinit_adaptor to default initialize new elements (which is no initialization for built-in types):
std::vector<T, boost::noinit_adaptor<std::allocator<T>>> memberVector;
As long as you don't pass an initializer into resize, it default initializes the new elements.
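Where Boost is not available, a similar effect can be approximated with a small allocator adaptor that default-initializes instead of value-initializing. The following is a sketch of that general technique, not Boost's implementation; default_init_allocator is a hypothetical name, and elements of built-in type must be written before they are read.

```cpp
#include <memory>
#include <new>
#include <type_traits>
#include <utility>
#include <vector>

// Allocator adaptor: construct() with no arguments performs
// default-initialization, which leaves built-in types uninitialized.
// default_init_allocator is a hypothetical name for this sketch.
template <typename T, typename A = std::allocator<T>>
class default_init_allocator : public A {
    using traits = std::allocator_traits<A>;
public:
    template <typename U>
    struct rebind {
        using other =
            default_init_allocator<U, typename traits::template rebind_alloc<U>>;
    };

    using A::A;   // inherit the base allocator's constructors

    // Used by vector(n) and resize(n): no value-initialization.
    template <typename U>
    void construct(U* ptr) noexcept(std::is_nothrow_default_constructible<U>::value) {
        ::new (static_cast<void*>(ptr)) U;
    }

    // All other constructions are forwarded to the base allocator.
    template <typename U, typename... Args>
    void construct(U* ptr, Args&&... args) {
        traits::construct(static_cast<A&>(*this), ptr, std::forward<Args>(args)...);
    }
};
```

A vector declared as std::vector<int, default_init_allocator<int>> then behaves like a normal vector, except that resize(n) without a value argument does not zero the new elements.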
So here's the problem, resize is calling insert, which is doing a copy construction from a default constructed element for each of the newly added elements. To get this to 0 cost you need to write your own default constructor AND your own copy constructor as empty functions. Doing this to your copy constructor is a very bad idea because it will break std::vector's internal reallocation algorithms.
Summary: You're not going to be able to do this with std::vector.
You can use a wrapper type around your element type, with a default constructor that does nothing. E.g.:
template <typename T>
struct no_init
{
T value;
no_init() { static_assert(std::is_standard_layout<no_init<T>>::value && sizeof(T) == sizeof(no_init<T>), "T does not have standard layout"); }
no_init(const T& v) { value = v; }
T& operator=(const T& v) { value = v; return value; }
no_init(const no_init<T>& n) { value = n.value; }
no_init(no_init<T>&& n) { value = std::move(n.value); }
no_init<T>& operator=(const no_init<T>& n) { value = n.value; return *this; } // return the object, not the this pointer
no_init<T>& operator=(no_init<T>&& n) { value = std::move(n.value); return *this; }
T* operator&() { return &value; } // So you can use &(vec[0]) etc.
};
To use:
std::vector<no_init<char>> vec;
vec.resize(2ul * 1024ul * 1024ul * 1024ul);
Err...
try the method:
std::vector<T>::reserve(x)
It will enable you to reserve enough memory for x items without initializing any (your vector is still empty). Thus, there won't be any reallocation until you go over x.
The second point is that vector won't initialize the values to zero. Are you testing your code in debug ?
After verification on g++, the following code:
#include <iostream>
#include <vector>
struct MyStruct
{
int m_iValue00 ;
int m_iValue01 ;
} ;
int main()
{
MyStruct aaa, bbb, ccc ;
std::vector<MyStruct> aMyStruct ;
aMyStruct.push_back(aaa) ;
aMyStruct.push_back(bbb) ;
aMyStruct.push_back(ccc) ;
aMyStruct.resize(6) ; // [EDIT] double the size
for(std::vector<MyStruct>::size_type i = 0, iMax = aMyStruct.size(); i < iMax; ++i)
{
std::cout << "[" << i << "] : " << aMyStruct[i].m_iValue00 << ", " << aMyStruct[0].m_iValue01 << "\n" ;
}
return 0 ;
}
gives the following results:
[0] : 134515780, -16121856
[1] : 134554052, -16121856
[2] : 134544501, -16121856
[3] : 0, -16121856
[4] : 0, -16121856
[5] : 0, -16121856
The initialization you saw was probably an artifact.
[EDIT] After the comment on resize, I modified the code to add the resize line. The resize effectively calls the default constructor of the object inside the vector, but if the default constructor does nothing, then nothing is initialized... I still believe it was an artifact (the first time, I managed to get the whole vector zeroed with the following code):
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
So...
:-/
[EDIT 2] Like already offered by Arkadiy, the solution is to use an inline constructor taking the desired parameters. Something like
struct MyStruct
{
MyStruct(int p_d1, int p_d2) : d1(p_d1), d2(p_d2) {}
int d1, d2 ;
} ;
This will probably get inlined in your code.
But you should anyway study your code with a profiler to be sure this piece of code is the bottleneck of your application.
I tested a few of the approaches suggested here.
I allocated a huge set of data (200GB) in one container/pointer:
Compiler/OS:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Settings: (c++-17, -O3 optimizations)
g++ --std=c++17 -O3
I timed the total program runtime with linux-time
1.) std::vector:
#include <vector>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
}
real 0m36.246s
user 0m4.549s
sys 0m31.604s
That is 36 seconds.
2.) std::vector with boost::noinit_adaptor
#include <vector>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
}
real 0m0.002s
user 0m0.001s
sys 0m0.000s
So this solves the problem. Just allocating without initializing costs basically nothing (at least for large arrays).
3.) std::unique_ptr<T[]>:
#include <memory>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
auto data = std::unique_ptr<size_t[]>(new size_t[size]);
}
real 0m0.002s
user 0m0.002s
sys 0m0.000s
So basically the same performance as 2.), but does not require boost.
I also tested simple new/delete and malloc/free with the same performance as 2.) and 3.).
So the default-construction can have a huge performance penalty if you deal with large data sets.
In practice you want to actually initialize the allocated data afterwards.
However, some of the performance penalty still remains, especially if the later initialization is performed in parallel.
E.g., I initialize a huge vector with a set of (pseudo)random numbers:
(now I use OpenMP via -fopenmp for parallelization on a 24-core AMD Threadripper 3960X)
g++ --std=c++17 -fopenmp -O3
1.) std::vector:
#include <vector>
#include <random>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m41.958s
user 4m37.495s
sys 0m31.348s
That is 42s, only 6s more than the default initialization.
The problem is, that the initialization of std::vector is sequential.
2.) std::vector with boost::noinit_adaptor:
#include <vector>
#include <random>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m10.508s
user 1m37.665s
sys 3m14.951s
So even with the random-initialization, the code is 4 times faster because we can skip the sequential initialization of std::vector.
So if you deal with huge data sets and plan to initialize them afterwards in parallel, you should avoid using the default std::vector.
From your comments to other posters, it looks like you're left with malloc() and friends. Vector won't let you have unconstructed elements.
From your code, it looks like you have a vector of structs each of which comprises 2 ints. Could you instead use 2 vectors of ints? Then
copy(data1, data1 + count, back_inserter(v1));
copy(data2, data2 + count, back_inserter(v2));
Now you don't pay for copying a struct each time.
If you really insist on having the elements uninitialized and are willing to sacrifice some methods like front(), back(), push_back(), use the vector from Boost's numeric library (boost::numeric::ublas::vector). It even lets you choose not to preserve existing elements when calling resize()...
I'm not sure about all those answers that say it is impossible or tell us about undefined behavior.
Sometimes you need to use a std::vector. But sometimes you also know its final size, and you know that your elements will be constructed later.
Example: when you serialize the vector contents into a binary file, then read it back later.
Unreal Engine has its TArray::SetNumUninitialized, so why not std::vector?
To answer the initial question
"Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?"
yes and no.
No, because STL doesn't expose a way to do so.
Yes, because we're coding in C++, and C++ allows a lot of things. If you're ready to be a bad guy (and if you really know what you are doing), you can hijack the vector.
Here is sample code that works only for the Windows (MSVC) STL implementation; for another platform, look at how std::vector is implemented to use its internal members:
// This macro is to be defined before including VectorHijacker.h. Then you will be able to reuse the VectorHijacker.h with different objects.
#define HIJACKED_TYPE SomeStruct
// VectorHijacker.h
#ifndef VECTOR_HIJACKER_STRUCT
#define VECTOR_HIJACKER_STRUCT
struct VectorHijacker
{
std::size_t _newSize;
};
#endif
template<>
template<>
inline decltype(auto) std::vector<HIJACKED_TYPE, std::allocator<HIJACKED_TYPE>>::emplace_back<const VectorHijacker &>(const VectorHijacker &hijacker)
{
// We're modifying directly the size of the vector without passing by the extra initialization. This is the part that relies on how the STL was implemented.
_Mypair._Myval2._Mylast = _Mypair._Myval2._Myfirst + hijacker._newSize;
}
inline void setNumUninitialized_hijack(std::vector<HIJACKED_TYPE> &hijackedVector, const VectorHijacker &hijacker)
{
hijackedVector.reserve(hijacker._newSize);
hijackedVector.emplace_back<const VectorHijacker &>(hijacker);
}
But beware, this is hijacking we're speaking about. This is really dirty code, and this is only to be used if you really know what you are doing. Besides, it is not portable and relies heavily on how the STL implementation was done.
I won't advise you to use it because everyone here (me included) is a good person. But I wanted to let you know that it is possible contrary to all previous answers that stated it wasn't.
Use the std::vector::reserve() method. It won't resize the vector, but it will allocate the space.
Do the structs themselves need to be in contiguous memory, or can you get away with having a vector of struct*?
Vectors make a copy of whatever you add to them, so using vectors of pointers rather than objects is one way to improve performance.
I don't think STL is your answer. You're going to need to roll your own sort of solution using realloc(). You'll have to store a pointer and either the size, or number of elements, and use that to find where to start adding elements after a realloc().
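#include <cstdlib>
struct YourData { int d1; int d2; }; // element type; the name is assumed
YourData *memberArray = NULL;
int arrayCount = 0;
void GetsCalledALot(int* data1, int* data2, int count) {
// realloc returns void*, so C++ needs a cast; a NULL result is not handled here
memberArray = static_cast<YourData*>(realloc(memberArray, sizeof(YourData) * (arrayCount + count)));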
for (int i = 0; i < count; ++i) {
memberArray[arrayCount + i].d1 = data1[i];
memberArray[arrayCount + i].d2 = data2[i];
}
arrayCount += count;
}
I would do something like:
void GetsCalledALot(int* data1, int* data2, int count)
{
const size_t mvSize = memberVector.size();
memberVector.reserve(mvSize + count);
for (int i = 0; i < count; ++i) {
memberVector.push_back(MyType(data1[i], data2[i]));
}
}
You need to define a ctor for the type that is stored in the memberVector, but that's a small cost as it will give you the best of both worlds; no unnecessary initialization is done and no reallocation will occur during the loop.