Define struct with minimum size - c++

I want to define a struct, e.g. type, such that sizeof(type) is no less than some value.
Motivation:
I have a vector std::vector<type> and I will remove some elements from it. Also, I have saved the indexes of some elements to other places, thus I want just mark it as not used and reuse it in the future. This leads me to save the next available position as a list in erased positions. As a result, sizeof(type) should be no less than sizeof(size_t) and type should be properly aligned as well.
Possible Solutions:
boost::variant<type, size_t>
This has two problems from my point of view. If I use boost::get<type>, the performance will decrease significantly. If I use boost::apply_visitor, the syntax would be weird and the performance also decreases according to my profile.
union{type t; size_t s;}
This of course works except for two shortfalls. Firstly, the syntax to refer the member of type would be more messy. Secondly, I have to define constructor, copy constructor, etc. for this union.
Extend type by char[sizeof(size_t) - sizeof(type)]
This almost fulfills my requirements. However, this risks of zero length array which is not supported by the c++ standard and possibly wrong alignment.
Since I won't use type as size_t often, I'd like to just ensure I can use reinterpret_cast<size_t> when needed.
Complements
After reading the comments, I think the best solution for my problem should be boost::variant. But I am still wondering is there a way to combine the benefits of solution 2 and 3, i.e.
a. I can access members of type without changes.
b. Get the guarantee that reinterpret_cast<size_t> works.

You can mitigate the concerns about solution 3 with something like:
struct data
{
// ...
};
template<class T, bool> class pad_;
template<class T> class pad_<T, true> { char dummy[sizeof(T) - sizeof(data)]; };
template<class T> class pad_<T, false> {};
template<class T> using pad = pad_<T, (sizeof(T) > sizeof(data))>;
class type : public data, pad<size_t>
{
// ...
};
This code:
assumes empty base optimization so that pad could be completely optimized out from type layout when sizeof(data) >= sizeof(size_t)
hasn't the risk of zero length array

Though this being an interesting problem the design itself seams questionable.
When inserting a new element items marked unused are considered first before growing the vector. It means that the relative order of items is unpredictable. If that's being acceptable you could have just used a vector of (smart) pointers.
Typically a vector is inefficient when removing items from the middle. Since the order doesn't matter it is possible to swap the element being removed with the last element and pop the last element.
All elements are of the same size; allocating them using a pool could be faster then using the system allocator.
A pool basically allocates memory in big chunks and hands out smaller chunks on request. A pool usually stores the free list in yet unallocated chunks to track available memory (the same very idea described in the question). There are some good implementations readily available (from Boost and other sources).
Concerning the original design it is cumbersome to enumerate elements in the vector since real elements are mixed with "holes", the logic is going to be obfuscated with additional checks.
Probably there is some sold reasoning behind the original design; unfortunately #user1535111 is not telling the details.

Related

Reserving space for a double vector

Suppose T is a type, and I want to make a vector<vector<T>>. I know the eventual size will be m x n, where m and n are runtime constants. (If they were compile-time constants I'd use std::array<std::array<T, n>, m>.) Suppose I have three choices of what to do with my double vector before I continue with my program in earnest:
Option 1
std::vector<std::vector<T>> dbl_vect(m);
for (auto & v : dbl_vect)
v.reserve(n);
Option 2
std::vector<std::vector<T>> dbl_vect;
dbl_vect.reserve(m);
Option 3
std::vector<std::vector<T>> dbl_vect;
Let's suppose I am not worried about iterator & reference invalidation from vector reallocation, so we can remove that from the decision process.
Of course the code that follows these would have to be a little different, since #1 creates the (empty) rows of the dbl_vector, so we have to access the rows rather than pushing more back.
Option #2 seems fairly useless, because it has no idea how much space to reserve for each row.
Option #1 requires me to go through a linear pass of m empty vectors and resize them manually, but it prevents reallocation. If T were pretty big, this would almost certainly be preferable, I believe, because it would prevent copies/moves.
Question: Suppose T = char (or pick your favorite POD type). Under what circumstances should I be indifferent between options 1 and 3, or even prefer #3? Is this mostly due to the relatively small size of a char, or because of the way the compiler will (not) default-initialize a char? If T is larger, maybe user-defined, at what point (in size of the double vector or in size of T) should I start caring?
Here a somewhat similar question is asked, regarding one vector and T=int.
If you know the inner size will be m, one possibility is to make a std::vector<S>, where S is your custom type standing for a std::vector<T>, except it knows how many entries it will have. A similar solution is suggested here (except there m is a compile-time constant).
#3 just default-initializes the vector. There is nothing you gain from this, as the containing vector will have a capacity of zero. Dynamically allocating memory is slow, so to minimize this I would always go with #1 or a variant thereof.

C++ index-to-index map

I have a list of IDs (integers).
They are sorted in a really efficient way so that my application can easily handle them, for example
9382
297832
92
83723
173934
(this sort is really important in my application).
Now I am facing the problem of having to access certain values of an ID in another vector.
For example certain values for ID 9382 are located on someVectorB[30].
I have been using
const int UNITS_MAX_SIZE = 400000;
class clsUnitsUnitIDToArrayIndex : public CBaseStructure
{
private:
int m_content[UNITS_MAX_SIZE];
long m_size;
protected:
void ProcessTxtLine(string line);
public:
clsUnitsUnitIDToArrayIndex();
int *Content();
long Size();
};
But now that I raised UNITS_MAX_SIZE to 400.000, I get page stack errors, and that tells me that I am doing something wrong. I think the entire approach is not really good.
What should I use if I want to locate an ID in a different vector if the "position" is different?
ps: I am looking for something simple that can be easily read-in from a file and that can also easily be serialized to a file. That is why I have been using this brute-force approach before.
If you want a mapping from int's to int's and your index numbers non-consecutive you should consider a std::map. In this case you would define it as such:
std::map<int, int> m_idLocations;
A map represents a mapping between two types. The first type is the "key" and is used for lookup up the second type known as the "value". For each id lookup you can insert it with:
m_idLocations[id] = position;
// or
m_idLocations.insert(std::pair<int,int>(id, position));
And you can look them up using the following syntax:
m_idLocations[id];
Typically a std::map in the stl is implemented using red-black trees which have a worse-cast lookup speed of O(log n). This is slightly slower then O(1) that you'll be getting from the huge array however it's a substantially better use of a space and you're unlikely to notice the difference in practise unless you're storing truly gigantic amounts of numbers or doing an enourmous amount of lookups.
Edit:
In response to some of the comments I think it's important to point out that moving from O(1) to O(log n) can make a significant difference in the speed of your application not to mention practical speed concerns from moving to fixed blocks of memory to tree based structure. However I think that it's important to initially represent what you're trying to say (an int-to-int) mapping and avoid the pitfall of premature optimization.
After you've represented the concept you should then use a profiler to determine if and where the speed issues are. If you find that the map is causing issues then you should look at replacing your mapping with something that you think will be quicker. Make sure to test that the optimization helped and don't forget to include a big comment about what you are representing and why it needed to be changed.
if nothing else works you can just allocate the array dynamically in the constructor. this will move the large array on the heap and avoid your page stack error. you should also remember to release the resource while destroying your clsUnitsUnitIDToArrayIndex
But the recommended usage is as suggested by other members, use a std::vector or std::map
Probably you are getting stackoverflow error due to int m_content[UNITS_MAX_SIZE]. The array is allocated on the stack and 400000 is a pretty big number for the stack. You can use std::vector instead, it is dynamically allocated and you can return a reference of vector member to avoid copy operation:
std::vector<int> m_content(UNITS_MAX_SIZE);
const std::vector<int> &clsUnitsUnitIDToArrayIndex::Content() const
{
return m_content;
}

Efficient use of boolean true and false in C++?

Would any compiler experts be able to comment on the efficient use of boolean values? Specifically, is the compiler able to optimize a std::vector<boolean> to use minimal memory? Is there an equivalent data structure that would?
Back in the day, there were languages that had compilers that could compress an array of booleans to a representation of just one bit per boolean value. Perhaps the best that could be done for C++ is to use std::vector<char> to store the boolean values for minimal memory usage?
The use case here would be storing hundreds of millions of boolean values, where a single byte would save lots of space over 4 or more bytes per value and a single bit, even more.
See std::vector
Specializations
The standard library provides a specialization of std::vector for the type bool, which is optimized for space efficiency.
vector<bool> space-efficient dynamic bitset
(class template specialization)
and from "Working Draft C++, 2012-11-02"
23.3.7 Class vector [vector.bool]
1 To optimize space allocation, a specialization of vector for bool elements is provided:
template <class Allocator> class vector<bool, Allocator> {
...
}
3 There is no requirement that the data be stored as a contiguous allocation of bool values. A space-optimized representation of bits is recommended instead.
So there is no requirement, but only a recommendation, to store the bool values as bits.
std::vector for bool is a template specialization that does what you are asking for.
You can read more here.
You may also want to explore the standard bitset.
Note, that vector<bool> is not a container, however it pretends to be one and provides iterators.
One day that may cause confusion and errors if you treat it like a normal container, e.g. trying to get an address of elements.
You may consider std::bitset or boost::dynamic_bitset if you need to store 1 bit per Boolean value. These data structures do not pretend to be containers, so it is unlikely you make any errors when using any of them, especially in template code.
In what is widely considered to be a flaw in the standard, std::vector is specialised to use a single bit to represent each bool value.
If that happens to be what you are looking for, then just use it.
As a standard-agnostic way of guaranteeing efficient storage, you could create your own Bitvector class. Essentially for every 8 bool values you only need to allocate a single char and then you can store each bool in a single bit. You can then use bit shifting techniques in the accessors/mutators to store/retrieve your individual bits.
One such example is outlined in Ron Penton and André LaMothe's Data Structures for Game Programmers (which I'd also recommend as a general data structure reference). It's not too difficult to write your own though, and, though I haven't searched at great length, there are probably some further examples on the Internet.

Choice of the most performant container (array)

This is my little big question about containers, in particular, arrays.
I am writing a physics code that mainly manipulates a big (> 1 000 000) set of "particles" (with 6 double coordinates each). I am looking for the best way (in term of performance) to implement a class that will contain a container for these data and that will provide manipulation primitives for these data (e.g. instantiation, operator[], etc.).
There are a few restrictions on how this set is used:
its size is read from a configuration file and won't change during execution
it can be viewed as a big two dimensional array of N (e.g. 1 000 000) lines and 6 columns (each one storing the coordinate in one dimension)
the array is manipulated in a big loop, each "particle / line" is accessed and computation takes place with its coordinates, and the results are stored back for this particle, and so on for each particle, and so on for each iteration of the big loop.
no new elements are added or deleted during the execution
First conclusion, as the access on the elements is essentially done by accessing each element one by one with [], I think that I should use a normal dynamic array.
I have explored a few things, and I would like to have your opinion on the one that can give me the best performances.
As I understand there is no advantage to use a dynamically allocated array instead of a std::vector, so things like double** array2d = new ..., loop of new, etc are ruled out.
So is it a good idea to use std::vector<double> ?
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > my_array that can be indexed like my_array[i][j], or is it a bad idea and it would be better to use std::vector<double> other_array and acces it with other_array[6*i+j].
Maybe this can gives better performance, especially as the number of columns is fixed and known from the beginning.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
Another option, the one that I am using so far is to use Blitz, in particular blitz::Array:
typedef blitz::Array<double,TWO_DIMENSIONS> store_t;
store_t my_store;
Where my elements are accessed like that: my_store(line, column);.
I think there are not much advantage to use Blitz in my case because I am accessing each element one by one and that Blitz would be interesting if I was using operations directly on array (like matrix multiplication) which I am not.
Do you think that Blitz is OK, or is it useless in my case ?
These are the possibilities I have considered so far, but maybe the best one I still another one, so don't hesitate to suggest me other things.
Thanks a lot for your help on this problem !
Edit:
From the very interesting answers and comments bellow a good solution seems to be the following:
Use a structure particle (containing 6 doubles) or a static array of 6 doubles (this avoid the use of two dimensional dynamic arrays)
Use a vector or a deque of this particle structure or array. It is then good to traverse them with iterators, and that will allow to change from one to another later.
In addition I can also use a Blitz::TinyVector<double,6> instead of a structure.
So is it a good idea to use std::vector<double> ?
Usually, a std::vector should be the first choice of container. You could use either std::vector<>::reserve() or std::vector<>::resize() to avoid reallocations while populating the vector. Whether any other container is better can be found by measuring. And only by measuring. But first measure whether anything the container is involved in (populating, accessing elements) is worth optimizing at all.
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > [...]?
No. IIUC, you are accessing your data per particle, not per row. If that's the case, why not use a std::vector<particle>, where particle is a struct holding six values? And even if I understood incorrectly, you should rather write a two-dimensional wrapper around a one-dimensional container. Then align your data either in rows or columns - what ever is faster with your access patterns.
Do you think that Blitz is OK, or is it useless in my case?
I have no practical knowledge about blitz++ and the areas it is used in. But isn't blitz++ all about expression templates to unroll loop operations and optimizing away temporaries when doing matrix manipulations? ICBWT.
First of all, you don't want to scatter the coordinates of one given particle all over the place, so I would begin by writing a simple struct:
struct Particle { /* coords */ };
Then we can make a simple one dimensional array of these Particles.
I would probably use a deque, because that's the default container, but you may wish to try a vector, it's just that 1.000.000 of particles means about a single chunk of a few MBs. It should hold but it might strain your system if this ever grows, while the deque will allocate several chunks.
WARNING:
As Alexandre C remarked, if you go the deque road, refrain from using operator[] and prefer to use iteration style. If you really need random access and it's performance sensitive, the vector should prove faster.
The first rule when choosing from containers is to use std::vector. Then, only after your code is complete and you can actually measure performance, you can try other containers. But stick to vector first. (And use reserve() from the start)
Then, you shouldn't use an std::vector<std::vector<double> >. You know the size of your data: it's 6 doubles. No need for it to be dynamic. It is constant and fixed. You can define a struct to hold you particle members (the six doubles), or you can simply typedef it: typedef double particle[6]. Then, use a vector of particles: std::vector<particle>.
Furthermore, as your program uses the particle data contained in the vector sequentially, you will take advantage of the modern CPU cache read-ahead feature at its best performance.
You could go several ways. But in your case, don't declare astd::vector<std::vector<double> >. You're allocating a vector (and you copy it around) for every 6 doubles. Thats way too costly.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
(other_array[i,j] won't work too well, as i,j employs the comma operator to evaluate the value of "i", then discards that and evaluates and returns "j", so it's equivalent to other_array[i]).
You will need to use one of:
other_array[i][j]
other_array(i, j) // if other_array implements operator()(int, int),
// but std::vector<> et al don't.
other_array[i].identifier // identifier is a member variable
other_array[i].identifier() // member function getting value
other_array[i].identifier(double) // member function setting value
You may or may not prefer to put get_ and set_ or similar on the last two functions should you find them useful, but from your question I think you won't: functions are prefered in APIs between parts of large systems involving many developers, or when the data items may vary and you want the algorithms working on the data to be independent thereof.
So, a good test: if you find yourself writing code like other_array[i][3] where you've decided "3" is the double with the speed in it, and other_array[i][5] because "5" is the the acceleration, then stop doing that and give them proper identifiers so you can say other_array[i].speed and .acceleration. Then other developers can read and understand it, and you're much less likely to make accidental mistakes. On the other hand, if you are iterating over those 6 elements doing exactly the same things to each, then you probably do want Particle to hold a double[6], or to provide an operator[](int). There's no problem doing both:
struct Particle
{
double x[6];
double& speed() { return x[3]; }
double speed() const { return x[3]; }
double& acceleration() { return x[5]; }
...
};
BTW / the reason that vector<vector<double> > may be too costly is that each set of 6 doubles will be allocated on the heap, and for fast allocation and deallocation many heap implementations use fixed-size buckets, so your small request will be rounded up t the next size: that may be a significant overhead. The outside vector will also need to record a extra pointer to that memory. Further, heap allocation and deallocation is relatively slow - in you're case, you'd only be doing it at startup and shutdown, but there's no particular point in making your program slower for no reason. Even more importantly, the areas on the heap may just around in memory, so your operator[] may have cache-faults pulling in more distinct memory pages than necessary, slowing the entire program. Put another way, vectors store elements contiguously, but the pointed-to-vectors may not be contiguous.

Is this use of nested vector/multimap/map okay?

I am looking for the perfect data structure for the following scenario:
I have an index i, and for each one I need to support the following operation 1: Quickly look up its Foo objects (see below), each of which is associated with a double value.
So I did this:
struct Foo {
int a, b, c;
};
typedef std::map<Foo, double> VecElem;
std::vector<VecElem> vec;
But it turns out to be inefficient because I also have to provide very fast support for the following operation 2: Remove all Foos that have a certain value for a and b (together with the associated double values).
To perform this operation 2, I have to iterate over the maps in the vector, checking the Foos for their a and b values and erasing them one by one from the map, which seems to be very expensive.
So I am now considering this data structure instead:
struct Foo0 {
int a, b;
};
typedef std::multimap<Foo0, std::map<int, double> > VecElem;
std::vector<VecElem> vec;
This should provide fast support for both operations 1 and 2 above. Is that reasonable? Is there lots of overhead from the nested container structures?
Note: Each of the multimaps will usually only have one or two keys (of type Foo0), each of which will have about 5-20 values (of type std::map<int,double>).
To answer the headline question: yes, nesting STL containers is perfectly fine. Depending on your usage profile, this could result in excessive copying behind the scenes though. A better option might be to wrap the contents of all but top-level container using Boost::shared_ptr, so that container housekeeping does not require a deep copy of your nested container's entire contents. This would be the case say if you plan on spending a lot of time inserting and removing VecElem in the toplevel vector - expensive if VecElem is a direct multimap.
Memory overhead in the data structures is likely to be not significantly worse than anything you could design with equivalent functionality, and more likely better unless you plan to spend more time on this than is healthy.
Well, you have a reasonable start on this idea ... but there are some questions that must be addressed first.
For instance, is the type Foo mutable? If it is, then you need to be careful about creating a type Foo0 (um ... a different name may be a good idea hear to avoid confusion) since changes to Foo may invalidate Foo0.
Second, you need to decide whether you also need this structure to work well for inserts/updates. If the population of Foo is static and unchanging - this isn't an issue, but if it isn't, you may end up spending a lot of time maintaining Vec and VecElem.
As far as the question of nesting STL containers goes, this is fine - and is often used to create arbitrarily complex structures.