C++ Class Variables: Initialization vs. Assignment and Initialization of vectors - c++

I am working on a C++ program that has a series of class variables that contain vectors on some or all of the member variables. My question is three-fold:
Is it straight-forward to use constructors to initialize vector variables that are part of a class (see sample class definition below)? Could someone post an example constructor for the class below (or for at least the single and two-dimension vector variables)?
Is there a problem with simply initializing the variables myself in my code (i.e., iterating through each element of the vectors using loops to assign an initial value)?
Along the same lines, if the variables need to be initialized to different values in different contexts (e.g., zero in one instance, some number in another instance), is there a way to handle that through constructors?
Sample class definition:
class CreditBasedPoolLevel {
public:
int NumofLoans;
int NumofPaths;
int NumofPeriods;
double TotalPoolBal;
vector<int> NumofModeled;
vector<double> ForbearanceAmt;
vector<double> TotalAmtModeled;
vector<vector<int>> DefCountPoolVector;
vector<vector<double>> TermDefBalPoolVector;
vector<vector<double>> BalloonDefBalPoolVector;
vector<vector<double>> TermDefBalPoolVectorCum;
vector<vector<double>> TermSeverityAmt;
vector<vector<double>> TermELAmtPoolVector;
vector<vector<double>> BalloonELAmtPoolVector;
vector<vector<double>> TermELAmtPoolVectorCum;
};

In C++, initializing a variable calls its constructor. In a vector's case, this means it creates an instance of a vector with whatever the initial capacity is (10 I believe), with no values. At this point, you need to use push_back in order to fill the vector - even though it has a capacity, it will cause undefined behavior if you try to access unfilled areas directly (such as with NumofModeled[0]). You can also initialize it with some amount of space by using vector NumofModeled(x) (x being the number of spaces), but generally because vectors have dynamic size, it's easier to use push_back unless there is some reason you need to enter your data out of order.
Relates to the capacity part of one, if you try to access unfilled space in a vector you will get undefined behavior. It's pretty standard practice to fill a vector with a loop though, such as:
vector<int> v;
int in = 0;
while (cin)
{
cin >> in;
v.push_back(in);
}
Yes, but remember that like functions, constructors only differentiate by the type of input parameters. So, for example, you could have CreditBasedPoolLevel(int level) and CreditBasedPoolLevel(vector<int> levels), but not another with the definition CreditBasedPoolLevel(int otherint), because it would conflict with the first. If you want to be able to take different contextual input of the same type, you can use another variable to define the constructor type, such as CreditBasedPoolLevel(int input, string type) and use a switch block to define the initialization logic based on the type.

As for question number three, simply add a constructor with an argument that is the value you want to initialize the vectors with.
And if you just want the vectors to be default constructed, then there's nothing that needs to be done.

Constructor may look something like this:
CreditBasedPoolLevel::CreditBasedPoolLevel()
{
const int numDefCountPools = 13;
const int numDefCountPoolEntries = 25;
for(int i = 0; i < numDefCountPools; i++)
{
vector<int> v;
for(int j = 0; j < numDefCountPoolEntries; j++)
{
v.push_back(j + i * 5); // Don't know what value you ACTUALLY want to fill here
}
DefCountPoolVector.push_back(v);
}
}
Note that this is ONE solution, it really depends on what values you want, how you went them organized, etc, what is the "right" solution for your case.

Related

Initializing multi-dimensional std::vector without knowing dimensions in advance

Context: I have a class, E (think of it as an organism) and a struct, H (a single cell within the organism). The goal is to estimate some characterizing parameters of E. H has some properties that are stored in multi-dimensional matrices. But, the dimensions depend on the parameters of E.
E reads a set of parameters from an input file, declares some objects of type H, solves each of their problems and fills the matrices, computes a likelihood function, exports it, and moves on to next set of parameters.
What I used to do: I used to declare pointers to pointers to pointers in H's header, and postpone memory allocation to H's constructor. This way, E could pass parameters to constructor, and memory allocation could be done afterwards. I de-allocated memory in the destructor.
Problem: Yesterday, I realized this is bad practice! So, I decided to try vectors. I have read several tutorials. At the moment, the only thing that I can think of is using push_back() as used in the question here. But, I have a feeling that this might not be the best practice (as mentioned by many, e.g., here, under method 3).
There are tens of questions that are tangent to this, but none answers this question directly: What is the best practice if dimensions are not known in advance?
Any suggestion helps: Do I have any other solution? Should I stick to arrays?
Using push_back() should be fine, as long as the vector has reserved the appropriate capacity.
If your only hesitancy to using push_back() is the copy overhead when a reallocation is performed, there is a straightforward way to resolve that issue. You use the reserve() method to inform the vector how many elements the vector will eventually have. So long as
reserve() is called before the vector is used, there will just be a single allocation for the needed amount. Then, push_back() will not incur any reallocations as the vector is being filled.
From the example in your cited source:
std::vector<std::vector<int>> matrix;
matrix.reserve(M);
for (int i = 0; i < M; i++)
{
// construct a vector of ints with the given default value
std::vector<int> v;
v.reserve(N);
for (int j = 0; j < N; j++) {
v.push_back(default_value);
}
// push back above one-dimensional vector
matrix.push_back(v);
}
This particular example is contrived. As #kei2e noted in a comment, the inner v variable could be initialized once on the outside of the loop, and then reused for each row.
However, as noted by #Jarod42 in a comment, the whole thing can actually be accomplished with the appropriate construction of matrix:
std::vector<std::vector<int>> matrix(M, std::vector<int>(N, default_value));
If this initialization task was populating matrix with values from some external source, then the other suggestion by #Jarod42 could be used, to move the element into place to avoid a copy.
std::vector<std::vector<int>> matrix;
matrix.reserve(M);
for (int i = 0; i < M; i++)
{
std::vector<int> v;
v.reserve(N);
for (int j = 0; j < N; j++) {
v.push_back(source_of_value());
}
matrix.push_back(std::move(v));
}

Specifying the size of a vector in declaration vs using reserve [duplicate]

I know the size of a vector, which is the best way to initialize it?
Option 1:
vector<int> vec(3); //in .h
vec.at(0)=var1; //in .cpp
vec.at(1)=var2; //in .cpp
vec.at(2)=var3; //in .cpp
Option 2:
vector<int> vec; //in .h
vec.reserve(3); //in .cpp
vec.push_back(var1); //in .cpp
vec.push_back(var2); //in .cpp
vec.push_back(var3); //in .cpp
I guess, Option2 is better than Option1. Is it? Any other options?
Somehow, a non-answer answer that is completely wrong has remained accepted and most upvoted for ~7 years. This is not an apples and oranges question. This is not a question to be answered with vague cliches.
For a simple rule to follow:
Option #1 is faster...
...but this probably shouldn't be your biggest concern.
Firstly, the difference is pretty minor. Secondly, as we crank up the compiler optimization, the difference becomes even smaller. For example, on my gcc-5.4.0, the difference is arguably trivial when running level 3 compiler optimization (-O3):
So in general, I would recommending using method #1 whenever you encounter this situation. However, if you can't remember which one is optimal, it's probably not worth the effort to find out. Just pick either one and move on, because this is unlikely to ever cause a noticeable slowdown in your program as a whole.
These tests were run by sampling random vector sizes from a normal distribution, and then timing the initialization of vectors of these sizes using the two methods. We keep a dummy sum variable to ensure the vector initialization is not optimized out, and we randomize vector sizes and values to make an effort to avoid any errors due to branch prediction, caching, and other such tricks.
main.cpp:
/*
* Test constructing and filling a vector in two ways: construction with size
* then assignment versus construction of empty vector followed by push_back
* We collect dummy sums to prevent the compiler from optimizing out computation
*/
#include <iostream>
#include <vector>
#include "rng.hpp"
#include "timer.hpp"
const size_t kMinSize = 1000;
const size_t kMaxSize = 100000;
const double kSizeIncrementFactor = 1.2;
const int kNumVecs = 10000;
int main() {
for (size_t mean_size = kMinSize; mean_size <= kMaxSize;
mean_size = static_cast<size_t>(mean_size * kSizeIncrementFactor)) {
// Generate sizes from normal distribution
std::vector<size_t> sizes_vec;
NormalIntRng<size_t> sizes_rng(mean_size, mean_size / 10.0);
for (int i = 0; i < kNumVecs; ++i) {
sizes_vec.push_back(sizes_rng.GenerateValue());
}
Timer timer;
UniformIntRng<int> values_rng(0, 5);
// Method 1: construct with size, then assign
timer.Reset();
int method_1_sum = 0;
for (size_t num_els : sizes_vec) {
std::vector<int> vec(num_els);
for (size_t i = 0; i < num_els; ++i) {
vec[i] = values_rng.GenerateValue();
}
// Compute sum - this part identical for two methods
for (size_t i = 0; i < num_els; ++i) {
method_1_sum += vec[i];
}
}
double method_1_seconds = timer.GetSeconds();
// Method 2: reserve then push_back
timer.Reset();
int method_2_sum = 0;
for (size_t num_els : sizes_vec) {
std::vector<int> vec;
vec.reserve(num_els);
for (size_t i = 0; i < num_els; ++i) {
vec.push_back(values_rng.GenerateValue());
}
// Compute sum - this part identical for two methods
for (size_t i = 0; i < num_els; ++i) {
method_2_sum += vec[i];
}
}
double method_2_seconds = timer.GetSeconds();
// Report results as mean_size, method_1_seconds, method_2_seconds
std::cout << mean_size << ", " << method_1_seconds << ", " << method_2_seconds;
// Do something with the dummy sums that cannot be optimized out
std::cout << ((method_1_sum > method_2_sum) ? "" : " ") << std::endl;
}
return 0;
}
The header files I used are located here:
rng.hpp
timer.hpp
Both variants have different semantics, i.e. you are comparing apples and oranges.
The first gives you a vector of n default-initialized values, the second variant reserves the memory, but does not initialize them.
Choose what better fits your needs, i.e. what is "better" in a certain situation.
The "best" way would be:
vector<int> vec = {var1, var2, var3};
available with a C++11 capable compiler.
Not sure exactly what you mean by doing things in a header or implementation files. A mutable global is a no-no for me. If it is a class member, then it can be initialized in the constructor initialization list.
Otherwise, option 1 would be generally used if you know how many items you are going to use and the default values (0 for int) would be useful.
Using at here means that you can't guarantee the index is valid. A situation like that is alarming itself. Even though you will be able to reliably detect problems, it's definitely simpler to use push_back and stop worrying about getting the indexes right.
In case of option 2, generally it makes zero performance difference whether you reserve memory or not, so it's simpler not to reserve*. Unless perhaps if the vector contains types that are very expensive to copy (and don't provide fast moving in C++11), or the size of the vector is going to be enormous.
* From Stroustrups C++ Style and Technique FAQ:
People sometimes worry about the cost of std::vector growing
incrementally. I used to worry about that and used reserve() to
optimize the growth. After measuring my code and repeatedly having
trouble finding the performance benefits of reserve() in real
programs, I stopped using it except where it is needed to avoid
iterator invalidation (a rare case in my code). Again: measure before
you optimize.
While your examples are essentially the same, it may be that when the type used is not an int the choice is taken from you. If your type doesn't have a default constructor, or if you'll have to re-construct each element later anyway, I would use reserve. Just don't fall into the trap I did and use reserve and then the operator[] for initialisation!
Constructor
std::vector<MyType> myVec(numberOfElementsToStart);
int size = myVec.size();
int capacity = myVec.capacity();
In this first case, using the constructor, size and numberOfElementsToStart will be equal and capacity will be greater than or equal to them.
Think of myVec as a vector containing a number of items of MyType which can be accessed and modified, push_back(anotherInstanceOfMyType) will append it the the end of the vector.
Reserve
std::vector<MyType> myVec;
myVec.reserve(numberOfElementsToStart);
int size = myVec.size();
int capacity = myVec.capacity();
When using the reserve function, size will be 0 until you add an element to the array and capacity will be equal to or greater than numberOfElementsToStart.
Think of myVec as an empty vector which can have new items appended to it using push_back with no memory allocation for at least the first numberOfElementsToStart elements.
Note that push_back() still requires an internal check to ensure that size < capacity and to increment size, so you may want to weigh this against the cost of default construction.
List initialisation
std::vector<MyType> myVec{ var1, var2, var3 };
This is an additional option for initialising your vector, and while it is only feasible for very small vectors, it is a clear way to initialise a small vector with known values. size will be equal to the number of elements you initialised it with, and capacity will be equal to or greater than size. Modern compilers may optimise away the creation of temporary objects and prevent unnecessary copying.
Option 2 is better, as reserve only needs to reserve memory (3 * sizeof(T)), while the first option calls the constructor of the base type for each cell inside the container.
For C-like types it will probably be the same.
How it Works
This is implementation specific however in general Vector data structure internally will have pointer to the memory block where the elements would actually resides. Both GCC and VC++ allocate for 0 elements by default. So you can think of Vector's internal memory pointer to be nullptr by default.
When you call vector<int> vec(N); as in your Option 1, the N objects are created using default constructor. This is called fill constructor.
When you do vec.reserve(N); after default constructor as in Option 2, you get data block to hold 3 elements but no objects are created unlike in option 1.
Why to Select Option 1
If you know the number of elements vector will hold and you might leave most of the elements to its default values then you might want to use this option.
Why to Select Option 2
This option is generally better of the two as it only allocates data block for the future use and not actually filling up with objects created from default constructor.
Since it seems 5 years have passed and a wrong answer is still the accepted one, and the most-upvoted answer is completely useless (missed the forest for the trees), I will add a real response.
Method #1: we pass an initial size parameter into the vector, let's call it n. That means the vector is filled with n elements, which will be initialized to their default value. For example, if the vector holds ints, it will be filled with n zeros.
Method #2: we first create an empty vector. Then we reserve space for n elements. In this case, we never create the n elements and thus we never perform any initialization of the elements in the vector. Since we plan to overwrite the values of every element immediately, the lack of initialization will do us no harm. On the other hand, since we have done less overall, this would be the better* option.
* better - real definition: never worse. It's always possible a smart compiler will figure out what you're trying to do and optimize it for you.
Conclusion: use method #2.
In the long run, it depends on the usage and numbers of the elements.
Run the program below to understand how the compiler reserves space:
vector<int> vec;
for(int i=0; i<50; i++)
{
cout << "size=" << vec.size() << "capacity=" << vec.capacity() << endl;
vec.push_back(i);
}
size is the number of actual elements and capacity is the actual size of the array to imlement vector.
In my computer, till 10, both are the same. But, when size is 43 the capacity is 63. depending on the number of elements, either may be better. For example, increasing the capacity may be expensive.
Another option is to Trust Your Compiler(tm) and do the push_backs without calling reserve first. It has to allocate some space when you start adding elements. Perhaps it does that just as well as you would?
It is "better" to have simpler code that does the same job.
I think answer may depend on situation. For instance:
Lets try to copy simple vector to another vector. Vector hold example class which has only integer. In first example lets use reserve.
#include <iostream>
#include <vector>
#include <algorithm>
class example
{
public:
// Copy constructor
example(const example& p1)
{
std::cout<<"copy"<<std::endl;
this->a = p1.a;
}
example(example&& o) noexcept
{
std::cout<<"move"<<std::endl;
std::swap(o.a, this->a);
}
example(int a_)
{
std::cout<<"const"<<std::endl;
a = a_;
}
example()
{
std::cout<<"Def const"<<std::endl;
}
int a;
};
int main()
{
auto vec = std::vector<example>{1,2,3};
auto vec2 = std::vector<example>{};
vec2.reserve(vec.size());
auto dst_vec2 = std::back_inserter(vec2);
std::cout<<"transform"<<std::endl;
std::transform(vec.begin(), vec.end(),
dst_vec2, [](const example& ex){ return ex; });
}
For this case, transform will call copy and move constructors.
The output of the transform part:
copy
move
copy
move
copy
move
Now lets remove the reserve and use the constructor.
#include <iostream>
#include <vector>
#include <algorithm>
class example
{
public:
// Copy constructor
example(const example& p1)
{
std::cout<<"copy"<<std::endl;
this->a = p1.a;
}
example(example&& o) noexcept
{
std::cout<<"move"<<std::endl;
std::swap(o.a, this->a);
}
example(int a_)
{
std::cout<<"const"<<std::endl;
a = a_;
}
example()
{
std::cout<<"Def const"<<std::endl;
}
int a;
};
int main()
{
auto vec = std::vector<example>{1,2,3};
std::vector<example> vec2(vec.size());
auto dst_vec2 = std::back_inserter(vec2);
std::cout<<"transform"<<std::endl;
std::transform(vec.begin(), vec.end(),
dst_vec2, [](const example& ex){ return ex; });
}
And in this case transform part produces:
copy
move
move
move
move
copy
move
copy
move
As it is seen, for this specific case, reserve prevents extra move operations because there is no initialized object to move.

How to initialize a constant int to use for array size?

I have a static integer variable Game::numPlayers, which is read in as an input from user. I then have the following class defined as such:
class GridNode
{
private:
/* Members */
static const int nPlayers = Game::numPlayers;
std::vector<Unit> units[nPlayers];
//...
};
This won't compile ("in-class initializer for static data member is not a constant expression").
I obviously cant just assign the array size of Game::numPlayers, and I also tried not initializing it and letting a constructor do the work, but that didn't work either.
I don't understand what I'm doing wrong here and what else I could possibly do to get this to work as intended.
I'm just copying a value, how is that any different from static const int nPlayers = 8 which copies the value 8 into nPlayers and works?
Edit:
To clarify, I choose to have an array of vectors because I want each node to have a quick-access container of units, but one container for each user/player so as to distinguish which units belong to which player within each node (e.g. index 0 of the array = player 1, index 1 = player 2, index 2 = player 3, and so on), otherwise I would just have one vector or a vector of vectors. I thought a map might work, but I thought an array of vectors would be faster to access and push into.
Also, Game::numPlayers is read in as a user input, but only read and assigned once within one game loop, but if I close/restart/play a new game, it needs to read in the user input again and assign it once.
I don't see why you need to use an array of std::vector if the number of elements will be obtained at runtime.
Instead, create a std::vector<std::vector<Units>> and size it appropriately on construction. if you need to reset the size, have a member function resize the vector.
Example:
class GridNode
{
private:
/* Members */
std::vector<std::vector<Unit>> units;
public:
GridNode(int nPlayers=10) : units(nPlayers) {}
std::vector<Unit>& get_units(size_t player)
{ return units[player]; } // gets the units for player n
void set_num_players(int nPlayers)
{ units.resize(nPlayers); } // set the number of players
int get_num_players() const { return units.size(); }
};
int main()
{
int numPlayers;
cin >> numPlayers;
//...
GridNode myGrid(numPlayers); // creates a GridNode with
// numPlayers vectors.
//...
Unit myUnit;
myGrid.get_units(0).push_back(myUnit); // places a Unit in the
// first player
}
Also, it isn't a good idea to have extraneous variables tell you the vector's size. The vector knows its own size already by calling the vector::size() function. Carrying around unnecessary variables that supposedly gives you this information opens yourself up for bugs.
Only integral constant expressions are allowed as array sizes in array declarations in C++. A const int object initailized with something that is not an integral constant expression (your Game::numPlayers is not, since it is read from the user), does not itself qualify as integral constant expression.
The bottom line here is that regardless of how you slice it, it is not possible to sneak in a run-time value into an array declaration in C++. C++11 does support some semblance of C99-style Variable Length Arrays, but your case (a member array) is not covered by it anyway.
If the array size is a run-tuime value, use std::vector. In your case that would become std::vector of std::vectors.

Copy values to and from a member of a structure in C++

I have a structure as shown below, which I use for the purpose of sorting a vector while keeping track of the indices.
struct val_order{
int order;
double value;
};
(1). Currently what I do is use a loop to copy values as shown below. So my first question is if there is a faster way to copy values to a member of a structure (not copying an entire structure to another structure)
int n=x.size();
std::vector<val_order> index(n);
for(int i=0;i<x.size();i++){ //<------So using a loop to copy values.
index[i].value=x[i];
index[i].order=i;
}
(2). My second question has to do with copying one member of a structure to an array. I found a post here that discusses using memcpy to accomplish that. But I was unable to make it work (code below). The error message I got was class std::vector<val_order, std::allocator<val_order> > has no member named ‘order’. But I was able to access the values in index.order by iterating over it. So I wonder what is wrong with my code.
int *buf=malloc(sizeof(index[0].order)*n);
int *output=buf;
memcpy(output, index.order, sizeof(index.order));
Question 1
Since you are initializing your n vectors from two different sources (array x and variable i), it would be difficult to avoid a loop. (you could initialize the vectors from index.assign if you had an array of val_order already filled with values, see this link)
Question 2
You want to copy all n order values into a int array, and memcpy seems convenient for that. Unfortunately,
each element of a vector is a val_order structure, so even though you could copy via memcpy that would not only copy the int *order* value but also the double *value* value
furthermore, you are dealing with vector, which internal structure is not a simple array (vector allows operations that are not possible with a regular array) and thus you cannot copy a bunch of vector to a int array by simply giving the address of, say, the first vector element to memcpy.
also, memcpy wouldn't work like you want anyway, but it expects an address - thus you would have to give, e.g., &index[0] ... but again, this is not what you want given the points above
So you would have to make another loop instead, like
int *buf = (int *)malloc(sizeof(int)*n);
int *output = buf;
for (int i=0 ; i<n ; i++) {
output[i] = index[i].order;
}

Vector: initialization or reserve?

I know the size of a vector, which is the best way to initialize it?
Option 1:
vector<int> vec(3); //in .h
vec.at(0)=var1; //in .cpp
vec.at(1)=var2; //in .cpp
vec.at(2)=var3; //in .cpp
Option 2:
vector<int> vec; //in .h
vec.reserve(3); //in .cpp
vec.push_back(var1); //in .cpp
vec.push_back(var2); //in .cpp
vec.push_back(var3); //in .cpp
I guess, Option2 is better than Option1. Is it? Any other options?
Somehow, a non-answer answer that is completely wrong has remained accepted and most upvoted for ~7 years. This is not an apples and oranges question. This is not a question to be answered with vague cliches.
For a simple rule to follow:
Option #1 is faster...
...but this probably shouldn't be your biggest concern.
Firstly, the difference is pretty minor. Secondly, as we crank up the compiler optimization, the difference becomes even smaller. For example, on my gcc-5.4.0, the difference is arguably trivial when running level 3 compiler optimization (-O3):
So in general, I would recommending using method #1 whenever you encounter this situation. However, if you can't remember which one is optimal, it's probably not worth the effort to find out. Just pick either one and move on, because this is unlikely to ever cause a noticeable slowdown in your program as a whole.
These tests were run by sampling random vector sizes from a normal distribution, and then timing the initialization of vectors of these sizes using the two methods. We keep a dummy sum variable to ensure the vector initialization is not optimized out, and we randomize vector sizes and values to make an effort to avoid any errors due to branch prediction, caching, and other such tricks.
main.cpp:
/*
* Test constructing and filling a vector in two ways: construction with size
* then assignment versus construction of empty vector followed by push_back
* We collect dummy sums to prevent the compiler from optimizing out computation
*/
#include <iostream>
#include <vector>
#include "rng.hpp"
#include "timer.hpp"
const size_t kMinSize = 1000;
const size_t kMaxSize = 100000;
const double kSizeIncrementFactor = 1.2;
const int kNumVecs = 10000;
int main() {
for (size_t mean_size = kMinSize; mean_size <= kMaxSize;
mean_size = static_cast<size_t>(mean_size * kSizeIncrementFactor)) {
// Generate sizes from normal distribution
std::vector<size_t> sizes_vec;
NormalIntRng<size_t> sizes_rng(mean_size, mean_size / 10.0);
for (int i = 0; i < kNumVecs; ++i) {
sizes_vec.push_back(sizes_rng.GenerateValue());
}
Timer timer;
UniformIntRng<int> values_rng(0, 5);
// Method 1: construct with size, then assign
timer.Reset();
int method_1_sum = 0;
for (size_t num_els : sizes_vec) {
std::vector<int> vec(num_els);
for (size_t i = 0; i < num_els; ++i) {
vec[i] = values_rng.GenerateValue();
}
// Compute sum - this part identical for two methods
for (size_t i = 0; i < num_els; ++i) {
method_1_sum += vec[i];
}
}
double method_1_seconds = timer.GetSeconds();
// Method 2: reserve then push_back
timer.Reset();
int method_2_sum = 0;
for (size_t num_els : sizes_vec) {
std::vector<int> vec;
vec.reserve(num_els);
for (size_t i = 0; i < num_els; ++i) {
vec.push_back(values_rng.GenerateValue());
}
// Compute sum - this part identical for two methods
for (size_t i = 0; i < num_els; ++i) {
method_2_sum += vec[i];
}
}
double method_2_seconds = timer.GetSeconds();
// Report results as mean_size, method_1_seconds, method_2_seconds
std::cout << mean_size << ", " << method_1_seconds << ", " << method_2_seconds;
// Do something with the dummy sums that cannot be optimized out
std::cout << ((method_1_sum > method_2_sum) ? "" : " ") << std::endl;
}
return 0;
}
The header files I used are located here:
rng.hpp
timer.hpp
Both variants have different semantics, i.e. you are comparing apples and oranges.
The first gives you a vector of n default-initialized values, the second variant reserves the memory, but does not initialize them.
Choose what better fits your needs, i.e. what is "better" in a certain situation.
The "best" way would be:
vector<int> vec = {var1, var2, var3};
available with a C++11 capable compiler.
Not sure exactly what you mean by doing things in a header or implementation files. A mutable global is a no-no for me. If it is a class member, then it can be initialized in the constructor initialization list.
Otherwise, option 1 would be generally used if you know how many items you are going to use and the default values (0 for int) would be useful.
Using at here means that you can't guarantee the index is valid. A situation like that is alarming itself. Even though you will be able to reliably detect problems, it's definitely simpler to use push_back and stop worrying about getting the indexes right.
In case of option 2, generally it makes zero performance difference whether you reserve memory or not, so it's simpler not to reserve*. Unless perhaps if the vector contains types that are very expensive to copy (and don't provide fast moving in C++11), or the size of the vector is going to be enormous.
* From Stroustrups C++ Style and Technique FAQ:
People sometimes worry about the cost of std::vector growing
incrementally. I used to worry about that and used reserve() to
optimize the growth. After measuring my code and repeatedly having
trouble finding the performance benefits of reserve() in real
programs, I stopped using it except where it is needed to avoid
iterator invalidation (a rare case in my code). Again: measure before
you optimize.
While your examples are essentially the same, it may be that when the type used is not an int the choice is taken from you. If your type doesn't have a default constructor, or if you'll have to re-construct each element later anyway, I would use reserve. Just don't fall into the trap I did and use reserve and then the operator[] for initialisation!
Constructor
std::vector<MyType> myVec(numberOfElementsToStart);
int size = myVec.size();
int capacity = myVec.capacity();
In this first case, using the constructor, size and numberOfElementsToStart will be equal and capacity will be greater than or equal to them.
Think of myVec as a vector containing a number of items of MyType which can be accessed and modified, push_back(anotherInstanceOfMyType) will append it the the end of the vector.
Reserve
std::vector<MyType> myVec;
myVec.reserve(numberOfElementsToStart);
int size = myVec.size();
int capacity = myVec.capacity();
When using the reserve function, size will be 0 until you add an element to the array and capacity will be equal to or greater than numberOfElementsToStart.
Think of myVec as an empty vector which can have new items appended to it using push_back with no memory allocation for at least the first numberOfElementsToStart elements.
Note that push_back() still requires an internal check to ensure that size < capacity and to increment size, so you may want to weigh this against the cost of default construction.
List initialisation
std::vector<MyType> myVec{ var1, var2, var3 };
This is an additional option for initialising your vector, and while it is only feasible for very small vectors, it is a clear way to initialise a small vector with known values. size will be equal to the number of elements you initialised it with, and capacity will be equal to or greater than size. Modern compilers may optimise away the creation of temporary objects and prevent unnecessary copying.
Option 2 is better, as reserve only needs to reserve memory (3 * sizeof(T)), while the first option calls the constructor of the base type for each cell inside the container.
For C-like types it will probably be the same.
How it Works
This is implementation specific however in general Vector data structure internally will have pointer to the memory block where the elements would actually resides. Both GCC and VC++ allocate for 0 elements by default. So you can think of Vector's internal memory pointer to be nullptr by default.
When you call vector<int> vec(N); as in your Option 1, the N objects are created using default constructor. This is called fill constructor.
When you do vec.reserve(N); after default constructor as in Option 2, you get data block to hold 3 elements but no objects are created unlike in option 1.
Why to Select Option 1
If you know the number of elements vector will hold and you might leave most of the elements to its default values then you might want to use this option.
Why to Select Option 2
This option is generally better of the two as it only allocates data block for the future use and not actually filling up with objects created from default constructor.
Since it seems 5 years have passed and a wrong answer is still the accepted one, and the most-upvoted answer is completely useless (missed the forest for the trees), I will add a real response.
Method #1: we pass an initial size parameter into the vector, let's call it n. That means the vector is filled with n elements, which will be initialized to their default value. For example, if the vector holds ints, it will be filled with n zeros.
Method #2: we first create an empty vector. Then we reserve space for n elements. In this case, we never create the n elements and thus we never perform any initialization of the elements in the vector. Since we plan to overwrite the values of every element immediately, the lack of initialization will do us no harm. On the other hand, since we have done less overall, this would be the better* option.
* better - real definition: never worse. It's always possible a smart compiler will figure out what you're trying to do and optimize it for you.
Conclusion: use method #2.
In the long run, it depends on the usage and numbers of the elements.
Run the program below to understand how the compiler reserves space:
vector<int> vec;
for(int i=0; i<50; i++)
{
cout << "size=" << vec.size() << "capacity=" << vec.capacity() << endl;
vec.push_back(i);
}
size is the number of actual elements and capacity is the actual size of the array to imlement vector.
In my computer, till 10, both are the same. But, when size is 43 the capacity is 63. depending on the number of elements, either may be better. For example, increasing the capacity may be expensive.
Another option is to Trust Your Compiler(tm) and do the push_backs without calling reserve first. It has to allocate some space when you start adding elements. Perhaps it does that just as well as you would?
It is "better" to have simpler code that does the same job.
I think answer may depend on situation. For instance:
Lets try to copy simple vector to another vector. Vector hold example class which has only integer. In first example lets use reserve.
#include <iostream>
#include <vector>
#include <algorithm>
class example
{
public:
// Copy constructor
example(const example& p1)
{
std::cout<<"copy"<<std::endl;
this->a = p1.a;
}
example(example&& o) noexcept
{
std::cout<<"move"<<std::endl;
std::swap(o.a, this->a);
}
example(int a_)
{
std::cout<<"const"<<std::endl;
a = a_;
}
example()
{
std::cout<<"Def const"<<std::endl;
}
int a;
};
int main()
{
auto vec = std::vector<example>{1,2,3};
auto vec2 = std::vector<example>{};
vec2.reserve(vec.size());
auto dst_vec2 = std::back_inserter(vec2);
std::cout<<"transform"<<std::endl;
std::transform(vec.begin(), vec.end(),
dst_vec2, [](const example& ex){ return ex; });
}
For this case, transform will call copy and move constructors.
The output of the transform part:
copy
move
copy
move
copy
move
Now lets remove the reserve and use the constructor.
#include <iostream>
#include <vector>
#include <algorithm>
class example
{
public:
// Copy constructor
example(const example& p1)
{
std::cout<<"copy"<<std::endl;
this->a = p1.a;
}
example(example&& o) noexcept
{
std::cout<<"move"<<std::endl;
std::swap(o.a, this->a);
}
example(int a_)
{
std::cout<<"const"<<std::endl;
a = a_;
}
example()
{
std::cout<<"Def const"<<std::endl;
}
int a;
};
int main()
{
auto vec = std::vector<example>{1,2,3};
std::vector<example> vec2(vec.size());
auto dst_vec2 = std::back_inserter(vec2);
std::cout<<"transform"<<std::endl;
std::transform(vec.begin(), vec.end(),
dst_vec2, [](const example& ex){ return ex; });
}
And in this case transform part produces:
copy
move
move
move
move
copy
move
copy
move
As it is seen, for this specific case, reserve prevents extra move operations because there is no initialized object to move.