Reserving space for a double vector - C++

Suppose T is a type, and I want to make a vector<vector<T>>. I know the eventual size will be m x n, where m and n are runtime constants. (If they were compile-time constants I'd use std::array<std::array<T, n>, m>.) Suppose I have three choices of what to do with my double vector before I continue with my program in earnest:
Option 1
std::vector<std::vector<T>> dbl_vect(m);
for (auto & v : dbl_vect)
    v.reserve(n);
Option 2
std::vector<std::vector<T>> dbl_vect;
dbl_vect.reserve(m);
Option 3
std::vector<std::vector<T>> dbl_vect;
Let's suppose I am not worried about iterator & reference invalidation from vector reallocation, so we can remove that from the decision process.
Of course the code that follows these would have to differ slightly, since #1 creates the (empty) rows of dbl_vect, so we have to access those rows rather than push new ones back.
Option #2 seems fairly useless, because it has no idea how much space to reserve for each row.
Option #1 requires me to make a linear pass over m empty vectors and reserve each one manually, but it prevents reallocation. If T were pretty big, this would almost certainly be preferable, I believe, because it would prevent copies/moves.
Question: Suppose T = char (or pick your favorite POD type). Under what circumstances should I be indifferent between options 1 and 3, or even prefer #3? Is this mostly due to the relatively small size of a char, or because of the way the compiler will (not) default-initialize a char? If T is larger, maybe user-defined, at what point (in size of the double vector or in size of T) should I start caring?
Here a somewhat similar question is asked, regarding one vector and T=int.

If you know the inner size will be n, one possibility is to make a std::vector<S>, where S is your custom type standing in for a std::vector<T>, except that it knows how many entries it will have. A similar solution is suggested here (except that there the inner size is a compile-time constant).
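A minimal sketch of the idea (the name Row and the inheritance from std::vector are purely illustrative):

#include <cstddef>
#include <vector>

// Hypothetical row type: a std::vector that reserves its capacity
// as soon as it is constructed.
template <class T>
struct Row : std::vector<T> {
    explicit Row(std::size_t n) { this->reserve(n); }
};

int main() {
    std::size_t m = 100, n = 1000;   // runtime values in practice
    std::vector<Row<int>> dbl_vect;
    dbl_vect.reserve(m);
    for (std::size_t i = 0; i < m; ++i)
        dbl_vect.emplace_back(n);    // each row arrives pre-reserved for n
}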

#3 just default-initializes the vector. There is nothing you gain from this, as the containing vector will have a capacity of zero. Dynamically allocating memory is slow, so to minimize this I would always go with #1 or a variant thereof.

Related

Dynamic array of static arrays

I want a program that creates an undetermined number of lists. The size of each list is fixed, but I can't determine at compile time how many lists I am going to need.
I understand I cannot create a vector of arrays. I also understand I can use a vector of vectors, but I wonder if this is the most efficient way to do it, considering the fact that I need to reserve a fixed amount of memory each time I need a new array.
Erm, you can use a vector of arrays, for example,
std::vector<std::array<T, N>> some_vec;
#Nim is right, but I cannot upvote yet :p
Also his solution is C++11.
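For completeness, a runnable version of Nim's suggestion (N is illustrative; it must be a compile-time constant):

#include <array>
#include <cstddef>
#include <vector>

int main() {
    constexpr std::size_t N = 16;           // fixed size of each list
    std::vector<std::array<int, N>> lists;  // as many lists as needed at runtime
    lists.push_back({});                    // append one zero-initialized list
    lists[0][3] = 42;
}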
Another alternative, which is to be preferred over std::vector<std::vector<T> >, is to have one vector of size X * Y. Then you can access element (x, y) with v[x * Y + y] (taking the data as X rows of Y elements each).
This works with all versions of C++ and should be just as efficient as the std::vector<std::array<T, N> > solution.
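For illustration, a minimal sketch of the flat layout (the dimensions are made up):

#include <cstddef>
#include <vector>

int main() {
    const std::size_t X = 4, Y = 3;   // X rows of Y elements, runtime in practice
    std::vector<int> v(X * Y);        // one contiguous allocation
    for (std::size_t x = 0; x < X; ++x)
        for (std::size_t y = 0; y < Y; ++y)
            v[x * Y + y] = static_cast<int>(x * 10 + y);  // element (x, y)
}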

Define struct with minimum size

I want to define a struct, e.g. type, such that sizeof(type) is no less than some value.
Motivation:
I have a vector std::vector<type> and I will remove some elements from it. Also, I have saved the indexes of some elements in other places, so instead of erasing I want to just mark elements as unused and reuse their slots in the future. This leads me to store the next available position as a free list threaded through the erased positions. As a result, sizeof(type) should be no less than sizeof(size_t), and type should be properly aligned as well.
Possible Solutions:
boost::variant<type, size_t>
This has two problems from my point of view. If I use boost::get<type>, the performance decreases significantly. If I use boost::apply_visitor, the syntax is awkward and the performance also decreases, according to my profiling.
union{type t; size_t s;}
This of course works, except for two shortcomings. First, the syntax to refer to the members of type would be messier. Second, I would have to define the constructor, copy constructor, etc. for this union.
Extend type by char[sizeof(size_t) - sizeof(type)]
This almost fulfills my requirements. However, it risks a zero-length array, which the C++ standard does not allow, and possibly wrong alignment.
Since I won't use type as size_t often, I'd like to just ensure that I can use reinterpret_cast<size_t&> when needed.
Addendum
After reading the comments, I think the best solution for my problem should be boost::variant. But I am still wondering whether there is a way to combine the benefits of solutions 2 and 3, i.e.
a. I can access members of type without changes.
b. Get the guarantee that reinterpret_cast<size_t&> works.
You can mitigate the concerns about solution 3 with something like:
struct data
{
    // ...
};

template<class T, bool> class pad_;
template<class T> class pad_<T, true>  { char dummy[sizeof(T) - sizeof(data)]; };
template<class T> class pad_<T, false> {};
template<class T> using pad = pad_<T, (sizeof(T) > sizeof(data))>;

class type : public data, pad<size_t>
{
    // ...
};
This code:
assumes the empty base optimization, so that pad can be optimized out of type's layout entirely when sizeof(data) >= sizeof(size_t)
avoids the risk of a zero-length array
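Note that the padding only addresses size, not the alignment the question also asks about. A sketch of how one might check both at compile time, restating the snippet above with an added alignas (the payload in data is illustrative):

#include <cstddef>

struct data { char c; };   // illustrative payload

template<class T, bool> class pad_;
template<class T> class pad_<T, true>  { char dummy[sizeof(T) - sizeof(data)]; };
template<class T> class pad_<T, false> {};
template<class T> using pad = pad_<T, (sizeof(T) > sizeof(data))>;

// alignas ensures the storage may legitimately be reused as a size_t slot.
class alignas(alignof(std::size_t)) type : public data, pad<std::size_t>
{
};

static_assert(sizeof(type) >= sizeof(std::size_t),
              "type is big enough to store a free-list index");
static_assert(alignof(type) >= alignof(std::size_t),
              "type is aligned at least as strictly as size_t");

int main() {}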
Though this is an interesting problem, the design itself seems questionable.
When inserting a new element, items marked unused are considered first, before growing the vector. It means that the relative order of items is unpredictable. If that's acceptable, you could have just used a vector of (smart) pointers.
Typically a vector is inefficient when removing items from the middle. Since the order doesn't matter, it is possible to swap the element being removed with the last element and pop the last element.
All elements are of the same size; allocating them using a pool could be faster than using the system allocator.
A pool basically allocates memory in big chunks and hands out smaller chunks on request. A pool usually stores the free list in yet unallocated chunks to track available memory (the same very idea described in the question). There are some good implementations readily available (from Boost and other sources).
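To make the free-list idea concrete, here is a minimal sketch (illustrative only; a real pool threads the list through the free chunks themselves instead of keeping a side array of indices):

#include <cstddef>
#include <vector>

class IndexPool {
    std::vector<std::size_t> next_;  // next_[i]: the free slot after slot i
    std::size_t head_;               // first free slot; next_.size() means "none"
public:
    explicit IndexPool(std::size_t n) : next_(n), head_(0) {
        for (std::size_t i = 0; i < n; ++i)
            next_[i] = i + 1;
    }
    bool empty() const { return head_ == next_.size(); }
    std::size_t acquire() {          // caller must check empty() first
        std::size_t i = head_;
        head_ = next_[i];
        return i;
    }
    void release(std::size_t i) {    // return slot i to the front of the list
        next_[i] = head_;
        head_ = i;
    }
};

int main() {
    IndexPool pool(4);
    std::size_t a = pool.acquire();  // 0
    std::size_t b = pool.acquire();  // 1
    pool.release(a);                 // slot 0 becomes reusable
    std::size_t c = pool.acquire();  // 0 again, reused
    (void)b; (void)c;
}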
Concerning the original design, it is cumbersome to enumerate the elements in the vector, since real elements are mixed with "holes"; the logic is going to be obscured by additional checks.
Probably there is some solid reasoning behind the original design; unfortunately #user1535111 is not telling the details.

Performance implications of using a list of vectors versus a vector of vectors when appending in parallel

It seems that, in general, vectors are to be preferred over lists when appending simple types; see for example here.
What if I want to fill a matrix with simple types? Every vector is a column, so I am going to go through the outer vector, and append 1 item to each vector, repeatedly.
Do the latter vectors of the outer vector always have to be moved when the previous vectors increase their reserved space? As in, is all the data in one contiguous block? Or do the vectors each just hold a pointer to their individual memory region, so the outer vector's memory size remains unchanged even as the individual vectors grow?
Taken from the comments, it appears vectors of vectors can happily be used.
For small to medium applications, the efficiency of the vectors will seldom be anything to worry about.
There are a couple of cases where you might worry, but they will be uncommon.
class CData {}; // define this
typedef std::vector<CData> Column;
typedef std::vector<Column> Table;
Table tab;
To add a new row, you will append an item to every column. In a worst case, you might cause a reallocation of each column. That could be a problem if CData is extremely complex and the columns currently hold a very large number of CData cells (I'd say 10s of thousands, at least)
Similarly, if you add a new column and force the Table vector to reallocate, it might have to copy each column and again, for very large data sets, that might be a bit slow.
Note, however, that a new-ish compiler will probably be able to move the columns from the old table to the new (rather than copying them), making that trivially fast.
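If either worst case matters in practice, reserving capacity up front sidesteps the reallocations; a sketch along the lines of the typedefs above (the counts are illustrative):

#include <cstddef>
#include <vector>

class CData {};                       // stand-in for the real cell type
typedef std::vector<CData> Column;
typedef std::vector<Column> Table;

// Reserve once, then append rows freely: no per-column reallocation
// occurs while the reserved capacity lasts.
void reserve_rows(Table& tab, std::size_t expected_rows) {
    for (Column& col : tab)
        col.reserve(expected_rows);
}

void add_row(Table& tab, const CData& value) {
    for (Column& col : tab)
        col.push_back(value);
}

int main() {
    Table tab(10);                    // 10 columns
    reserve_rows(tab, 100000);        // expected number of rows
    add_row(tab, CData());
}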
As #kkuryllo said in a comment, it generally isn't anything to worry about.
Work on making your code as clean, simple and correct as possible. Only if profiling reveals a performance problem should you worry about optimising for speed.

How do I make multi-dimensional vectors?

Okay, this may sound like a stupid question, but I haven't read anything from the documentation that says it is not possible. Either that, or I overlooked something again.
By multi-dimensional, I mean like arrays. Is something like
vector<vector<double>>
possible? What are the possible drawbacks, at least when compared to arrays?
It's possible, but note that you need a space between the two >s to avoid ambiguity with the right shift operator, i.e.
vector<vector<double> >
Also, I wouldn't call those vectors arrays, since array has a very well-defined meaning in C++:
double matrix[10][10];
edit: As people pointed out, you don't need a space when using C++11.
It is possible.
One of the possible drawbacks could be that it results in multiple separate allocations from the free store, because each inner vector makes its own allocation. In contrast, a dynamic array is allocated only once, as contiguous memory, which is more cache-friendly.
What you're describing is absolutely possible, although if you aren't using a C++11 compiler you need to type it as:
vector<vector<double> >
The space between the two > characters being necessary so that the compiler doesn't think you're using the >> operator, as in:
cin >> x;
Of course, with a vector of vectors, you can add and remove elements, either at the top level where the elements are vectors, or at the second level where the elements are doubles. This can be a blessing, a curse, or both, depending on what you are trying to do - note that if you add a double to one of the second-level vectors, the length of that vector is different from all of the others. Because the second-level vectors can have different lengths, I would recommend against using them as a replacement for 2D arrays if fixed dimensions are what you want.
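A small example of how the second-level lengths can diverge:

#include <iostream>
#include <vector>

int main() {
    std::vector<std::vector<double>> v(3);  // three empty rows
    v[0].push_back(1.0);                    // row 0 now holds 1 element
    v[2].assign(4, 0.0);                    // row 2 holds 4; row 1 is still empty
    for (const auto& row : v)
        std::cout << row.size() << ' ';     // prints: 1 0 4
    std::cout << '\n';
}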

Choice of the most performant container (array)

This is my little big question about containers, in particular, arrays.
I am writing a physics code that mainly manipulates a big (> 1 000 000) set of "particles" (with 6 double coordinates each). I am looking for the best way (in terms of performance) to implement a class that will contain a container for these data and that will provide manipulation primitives for them (e.g. instantiation, operator[], etc.).
There are a few restrictions on how this set is used:
its size is read from a configuration file and won't change during execution
it can be viewed as a big two dimensional array of N (e.g. 1 000 000) lines and 6 columns (each one storing the coordinate in one dimension)
the array is manipulated in a big loop, each "particle / line" is accessed and computation takes place with its coordinates, and the results are stored back for this particle, and so on for each particle, and so on for each iteration of the big loop.
no new elements are added or deleted during the execution
First conclusion: as the elements are essentially accessed one by one with [], I think that I should use a normal dynamic array.
I have explored a few things, and I would like to have your opinion on the one that can give me the best performances.
As I understand there is no advantage to use a dynamically allocated array instead of a std::vector, so things like double** array2d = new ..., loop of new, etc are ruled out.
So is it a good idea to use std::vector<double> ?
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > my_array that can be indexed like my_array[i][j], or is it a bad idea, and would it be better to use std::vector<double> other_array and access it with other_array[6*i+j]?
Maybe this can give better performance, especially as the number of columns is fixed and known from the beginning.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with an index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like a function call at each access)?
Another option, the one that I am using so far is to use Blitz, in particular blitz::Array:
typedef blitz::Array<double,TWO_DIMENSIONS> store_t;
store_t my_store;
Where my elements are accessed like that: my_store(line, column);.
I think there is not much advantage to using Blitz in my case, because I am accessing each element one by one; Blitz would be interesting if I were applying operations directly to arrays (like matrix multiplication), which I am not.
Do you think that Blitz is OK, or is it useless in my case ?
These are the possibilities I have considered so far, but maybe the best one is still another one, so don't hesitate to suggest other things.
Thanks a lot for your help on this problem!
Edit:
From the very interesting answers and comments below, a good solution seems to be the following:
Use a structure particle (containing 6 doubles) or a static array of 6 doubles (this avoids the use of two-dimensional dynamic arrays)
Use a vector or a deque of this particle structure or array. It is then good to traverse them with iterators, and that allows switching from one container to the other later.
In addition, I can also use a blitz::TinyVector<double,6> instead of a structure.
So is it a good idea to use std::vector<double> ?
Usually, a std::vector should be the first choice of container. You could use either std::vector<>::reserve() or std::vector<>::resize() to avoid reallocations while populating the vector. Whether any other container is better can be found by measuring. And only by measuring. But first measure whether anything the container is involved in (populating, accessing elements) is worth optimizing at all.
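For example, the difference between the two calls (a minimal sketch):

#include <vector>

int main() {
    std::vector<double> a, b;

    a.reserve(1000);  // capacity 1000, size 0: populate with push_back
    for (int i = 0; i < 1000; ++i)
        a.push_back(i * 0.5);

    b.resize(1000);   // size 1000, elements value-initialized: assign by index
    for (int i = 0; i < 1000; ++i)
        b[i] = i * 0.5;
}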
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > [...]?
No. IIUC, you are accessing your data per particle, not per row. If that's the case, why not use a std::vector<particle>, where particle is a struct holding six values? And even if I understood incorrectly, you should rather write a two-dimensional wrapper around a one-dimensional container. Then align your data either in rows or columns, whatever is faster with your access patterns.
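A minimal sketch of such a wrapper (the class name and the row-major layout are my choice; any optimizing compiler will inline operator(), so there is no call overhead):

#include <cstddef>
#include <vector>

class Grid {
    std::size_t cols_;
    std::vector<double> data_;
public:
    Grid(std::size_t rows, std::size_t cols) : cols_(cols), data_(rows * cols) {}
    double&       operator()(std::size_t i, std::size_t j)       { return data_[i * cols_ + j]; }
    const double& operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }
};

int main() {
    Grid particles(1000000, 6);   // N particles, 6 coordinates each
    particles(42, 3) = 2.5;       // coordinate 3 of particle 42
}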
Do you think that Blitz is OK, or is it useless in my case?
I have no practical knowledge about blitz++ and the areas it is used in. But isn't blitz++ all about expression templates that unroll loop operations and optimize away temporaries when doing matrix manipulations? ICBWT (I could be wrong there).
First of all, you don't want to scatter the coordinates of one given particle all over the place, so I would begin by writing a simple struct:
struct Particle { /* coords */ };
Then we can make a simple one dimensional array of these Particles.
I would probably use a deque, because that's the default container, but you may wish to try a vector; it's just that 1,000,000 particles means a single contiguous chunk of about 48 MB (six 8-byte doubles per particle). It should hold, but it might strain your system if this ever grows, while the deque will allocate several smaller chunks.
WARNING:
As Alexandre C remarked, if you go the deque road, refrain from using operator[] and prefer iteration. If you really need random access and it's performance-sensitive, the vector should prove faster.
The first rule when choosing from containers is to use std::vector. Then, only after your code is complete and you can actually measure performance, you can try other containers. But stick to vector first. (And use reserve() from the start)
Then, you shouldn't use an std::vector<std::vector<double> >. You know the size of your data: it's 6 doubles. No need for it to be dynamic; it is constant and fixed. You can define a struct to hold your particle's members (the six doubles), or use a fixed-size array type such as std::array<double, 6> (a raw-array typedef like typedef double particle[6] won't work here, because array types cannot be stored in a std::vector). Then, use a vector of particles: std::vector<particle>.
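For instance (a sketch; std::array requires C++11, and the counts come from the question):

#include <array>
#include <cstddef>
#include <vector>

typedef std::array<double, 6> particle;    // six coordinates per particle

int main() {
    std::vector<particle> particles;
    particles.reserve(1000000);            // size known from the config file
    particle p = {{0.0, 0.0, 0.0, 1.0, 0.0, 0.0}};
    particles.push_back(p);
    double vx = particles[0][3];           // coordinate 3 of particle 0
    (void)vx;
}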
Furthermore, as your program uses the particle data contained in the vector sequentially, you will take full advantage of the modern CPU cache's read-ahead (prefetching).
You could go several ways. But in your case, don't declare a std::vector<std::vector<double> >. You would be allocating a vector (and copying it around) for every 6 doubles. That's way too costly.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with an index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like a function call at each access)?
(other_array[i,j] won't do what you want: i,j uses the comma operator, which evaluates "i", discards the result, then evaluates and returns "j", so it's equivalent to other_array[j]).
You will need to use one of:
other_array[i][j]
other_array(i, j) // if other_array implements operator()(int, int),
// but std::vector<> et al don't.
other_array[i].identifier // identifier is a member variable
other_array[i].identifier() // member function getting value
other_array[i].identifier(double) // member function setting value
You may or may not prefer to put get_ and set_ prefixes (or similar) on the last two functions, should you find them useful, but from your question I think you won't: functions are preferred in APIs between parts of large systems involving many developers, or when the data items may vary and you want the algorithms working on the data to be independent thereof.
So, a good test: if you find yourself writing code like other_array[i][3], where you've decided "3" is the double with the speed in it, and other_array[i][5] because "5" is the acceleration, then stop doing that and give them proper identifiers so you can say other_array[i].speed and .acceleration. Then other developers can read and understand it, and you're much less likely to make accidental mistakes. On the other hand, if you are iterating over those 6 elements doing exactly the same things to each, then you probably do want Particle to hold a double[6], or to provide an operator[](int). There's no problem doing both:
struct Particle
{
    double x[6];
    double& speed()       { return x[3]; }
    double  speed() const { return x[3]; }
    double& acceleration() { return x[5]; }
    ...
};
BTW, the reason that vector<vector<double> > may be too costly is that each set of 6 doubles will be allocated on the heap, and for fast allocation and deallocation many heap implementations use fixed-size buckets, so your small request will be rounded up to the next bucket size: that can be a significant overhead. The outer vector will also need to record an extra pointer to that memory. Further, heap allocation and deallocation is relatively slow; in your case, you'd only be doing it at startup and shutdown, but there's no particular point in making your program slower for no reason. Even more importantly, the areas on the heap may jump around in memory, so your operator[] may cause cache misses, pulling in more distinct memory pages than necessary and slowing the entire program. Put another way, vectors store their elements contiguously, but the pointed-to vectors may not be contiguous with one another.