Efficient use of boolean true and false in C++? - c++

Would any compiler experts be able to comment on the efficient use of boolean values? Specifically, is the compiler able to optimize a std::vector<bool> to use minimal memory? Is there an equivalent data structure that would?
Back in the day, there were languages whose compilers could compress an array of booleans into a representation of just one bit per value. Perhaps the best that can be done in C++ is to use std::vector<char> to store the boolean values for minimal memory usage?
The use case here is storing hundreds of millions of boolean values, where a single byte per value would save a lot of space over 4 or more bytes, and a single bit even more.

See std::vector
Specializations
The standard library provides a specialization of std::vector for the type bool, which is optimized for space efficiency.
vector<bool> space-efficient dynamic bitset
(class template specialization)
and from "Working Draft C++, 2012-11-02"
23.3.7 Class vector<bool> [vector.bool]
1 To optimize space allocation, a specialization of vector for bool elements is provided:
template <class Allocator> class vector<bool, Allocator> {
...
}
3 There is no requirement that the data be stored as a contiguous allocation of bool values. A space-optimized representation of bits is recommended instead.
So there is no requirement, but only a recommendation, to store the bool values as bits.

std::vector for bool is a template specialization that does what you are asking for.
You can read more here.
You may also want to explore the standard bitset.

Note that vector<bool> is not a true container, although it pretends to be one and provides iterators.
One day that may cause confusion and errors if you treat it like a normal container, e.g. when trying to take the address of an element.
You may consider std::bitset or boost::dynamic_bitset if you need to store one bit per Boolean value. These data structures do not pretend to be containers, so you are unlikely to make errors when using either of them, especially in template code.
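A minimal sketch of the std::bitset suggestion (boost::dynamic_bitset has a near-identical interface but takes its size at runtime and requires Boost); the size 1000 and the flag indices below are arbitrary choices for illustration:

```cpp
#include <bitset>

// std::bitset stores one bit per flag; the size must be a compile-time
// constant. All bits start at zero.
std::bitset<1000> make_flags() {
    std::bitset<1000> flags;
    flags.set(42);      // mark one flag via set()
    flags[100] = true;  // operator[] also works for writing
    return flags;
}
```

Queries like count(), test(), and any() then work directly on the packed representation.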

In what is widely considered to be a flaw in the standard, std::vector<bool> is specialised to use a single bit to represent each bool value.
If that happens to be what you are looking for, then just use it.

As a standard-agnostic way of guaranteeing efficient storage, you could create your own Bitvector class. Essentially, for every 8 bool values you only need to allocate a single char, and you can then store each bool in a single bit. You can use bit-shifting techniques in the accessors/mutators to store and retrieve the individual bits.
One such example is outlined in Ron Penton and André LaMothe's Data Structures for Game Programmers (which I'd also recommend as a general data structure reference). It's not too difficult to write your own, though, and there are probably further examples on the Internet.
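A minimal sketch of such a Bitvector, assuming the usual one-bit-per-value packing (the class name and layout here are this sketch's own, not taken from the book):

```cpp
#include <cstddef>
#include <vector>

// Packs 8 booleans per byte; set/get use the mask-and-shift technique
// described above.
class Bitvector {
public:
    explicit Bitvector(std::size_t n) : size_(n), bytes_((n + 7) / 8, 0) {}

    void set(std::size_t i, bool v) {
        if (v) bytes_[i / 8] |= static_cast<unsigned char>(1u << (i % 8));
        else   bytes_[i / 8] &= static_cast<unsigned char>(~(1u << (i % 8)));
    }

    bool get(std::size_t i) const {
        return (bytes_[i / 8] >> (i % 8)) & 1u;
    }

    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    std::vector<unsigned char> bytes_;  // 1/8 the memory of vector<char>
};
```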

Related

std::tuple sizeof, is it a missed optimization?

I've checked all major compilers, and sizeof(std::tuple<int, char, int, char>) is 16 for all of them. Presumably they just put elements in order into the tuple, so some space is wasted because of alignment.
If tuple stored elements internally like: int, int, char, char, then its sizeof could be 12.
Is it possible for an implementation to do this, or is it forbidden by some rule in the standard?
std::tuple sizeof, is it a missed optimization?
Yep.
Is it possible for an implementation to do this[?]
Yep.
[Is] it forbidden by some rule in the standard?
Nope!
Reading through [tuple], there is no constraint placed upon the implementation to store the members in template-argument order.
In fact, every passage I can find seems to go to lengths to avoid making any reference to member-declaration order at all: get<N>() is used in the description of operational semantics. Other wording is stated in terms of "elements" rather than "members", which seems like quite a deliberate abstraction.
In fact, some implementations do apparently store the members in reverse order, at least, probably simply due to the way they use inheritance recursively to unpack the template arguments (and because, as above, they're permitted to).
Speaking specifically about your hypothetical optimisation, though, I'm not aware of any implementation that doesn't store elements in [some trivial function of] the user-given order; I'm guessing that it would be "hard" to come up with such an order and to provide the machinery for std::get, at least as compared to the amount of gain you'd get from doing so. If you are really concerned about padding, you may of course choose your element order carefully to avoid it (on some given platform), much as you would with a class (without delving into the world of "packed" attributes). (A "packed" tuple could be an interesting proposal…)
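The "choose your element order carefully" point can be illustrated as follows; the sizes in the comments are what the major compilers typically produce on 64-bit platforms with 4-byte int, not guarantees:

```cpp
#include <tuple>

// Interleaving char and int forces padding after each char, while
// grouping members of equal alignment together minimizes it.
using Interleaved = std::tuple<int, char, int, char>;  // typically 16 bytes
using Grouped     = std::tuple<int, int, char, char>;  // typically 12 bytes

static_assert(sizeof(Grouped) <= sizeof(Interleaved),
              "grouping by alignment never costs space here");
```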
Yes, it's possible and has been (mostly) done by R. Martinho Fernandes. He used to have a blog called Flaming Danger Zone, which is now down for some reason, but its sources are still available on GitHub.
Here are all four parts of the Size Matters series on this exact topic: 1, 2, 3, 4.
You might wish to view them raw, since GitHub doesn't understand the C++ highlighting markup used and renders the code snippets as unreadable one-liners.
He essentially computes a permutation of tuple indices via a C++11 template metaprogram that sorts elements by alignment in non-ascending order, stores the elements according to it, and then applies it to the index on every access.
They could. One possible reason they don't: some architectures, including x86, have an addressing mode that can compute base + size × index in a single instruction, but only when size is a power of 2. It might also be slightly faster to do a load or store aligned to a 16-byte boundary. This could make code that indexes arrays of std::tuple slightly faster and more compact if four padding bytes are added.

Vector of multiple numeric types in C++11

Is there an efficient way in C++11 to store multiple numeric types in a vector using std::vector? Most solutions I can find also store strings and whatnot. I just want to store signed and unsigned integers ranging from 8 to 32 bits.
So far I have come across boost::variant, but that seems a bit of overkill. Is there a neat trick I am missing out on? Or should I just go with boost?
I want to do something like this:
std::vector<Numeric> v{(uint16_t) 1, (int32_t)-200};
Nope. There is no generic numeric type in C++.
First, to handle your specific case: if you don't run out of memory, a std::vector<std::int64_t> will happily store all your data and will be really fast. It is unlikely that you need anything else.
Otherwise:
If you need speed (i.e. after the generic solutions prove too slow), settle for a single type that covers your needs as well as possible.
If speed is not that important (i.e. you cannot prove any significant disadvantage), the generic solutions like boost.variant and boost.any will serve you well.
Dynamically sized containers in C++ are homogeneous, meaning that all of the elements in them must be of the same type. If you want to emulate the storage of elements of different types in such a container then you will have to use a tagged union of some kind. Boost.Variant, as mentioned, is one option.
On the other hand, if you don't need the container to resize dynamically you can use a heterogeneous container, like std::tuple or boost::tuple.
You said:
I just want to store signed and unsigned integers ranging from 8 to 32 bits.
You can use std::vector<int64_t> to store any number in that range. However, if you need to also know the size/type of the elements of the vector, you'll need to store more information.
You can use a struct that is something like:
struct MyNumber
{
    enum Type {INT8, UINT8, INT16, UINT16, INT32, UINT32};
    Type type;
    uint64_t value;
};
Then use a std::vector<MyNumber>.
To make conversions to and from MyNumber, you will need to add a bunch of constructors and other helper functions to MyNumber.
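The "bunch of constructors" could look like the following sketch. Note one assumption of the sketch that differs from the answer: it uses int64_t rather than uint64_t for the payload, so negative values round-trip without casting.

```cpp
#include <cstdint>
#include <vector>

// Tagged-value struct from the answer, with one converting constructor
// per supported integer type so vectors can be brace-initialized.
struct MyNumber
{
    enum Type { INT8, UINT8, INT16, UINT16, INT32, UINT32 };
    Type type;
    int64_t value;  // wide enough to hold any of the listed types

    MyNumber(int8_t v)   : type(INT8),   value(v) {}
    MyNumber(uint8_t v)  : type(UINT8),  value(v) {}
    MyNumber(int16_t v)  : type(INT16),  value(v) {}
    MyNumber(uint16_t v) : type(UINT16), value(v) {}
    MyNumber(int32_t v)  : type(INT32),  value(v) {}
    MyNumber(uint32_t v) : type(UINT32), value(v) {}
};
```

With these in place, the usage from the question becomes `std::vector<MyNumber> v{uint16_t(1), int32_t(-200)};`, and each element remembers which type it was constructed from.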

Define struct with minimum size

I want to define a struct, e.g. type, such that sizeof(type) is no less than some value.
Motivation:
I have a vector std::vector<type> and I will remove some elements from it. Also, I have saved the indexes of some elements elsewhere, so I want to just mark an element as unused and reuse its slot in the future. This leads me to store the next available position as a free list in the erased positions. As a result, sizeof(type) should be no less than sizeof(size_t), and type should be properly aligned as well.
Possible Solutions:
boost::variant<type, size_t>
This has two problems from my point of view. If I use boost::get<type>, performance decreases significantly. If I use boost::apply_visitor, the syntax is awkward and performance also decreases, according to my profiling.
union{type t; size_t s;}
This of course works, except for two shortfalls. Firstly, the syntax to refer to members of type becomes messier. Secondly, I have to define the constructor, copy constructor, etc. for this union.
Extend type with char[sizeof(size_t) - sizeof(type)]
This almost fulfills my requirements. However, it risks a zero-length array, which is not supported by the C++ standard, and possibly wrong alignment.
Since I won't use type as size_t often, I'd just like to ensure that reinterpret_cast<size_t> works when needed.
Complements
After reading the comments, I think the best solution for my problem is boost::variant. But I am still wondering whether there is a way to combine the benefits of solutions 2 and 3, i.e.
a. I can access members of type without changes.
b. Get the guarantee that reinterpret_cast<size_t> works.
You can mitigate the concerns about solution 3 with something like:
struct data
{
    // ...
};

template<class T, bool> class pad_;
template<class T> class pad_<T, true> { char dummy[sizeof(T) - sizeof(data)]; };
template<class T> class pad_<T, false> {};
template<class T> using pad = pad_<T, (sizeof(T) > sizeof(data))>;

class type : public data, pad<size_t>
{
    // ...
};
This code:
assumes the empty base optimization, so that pad can be completely optimized out of the type layout when sizeof(data) >= sizeof(size_t)
does not risk a zero-length array
Though this is an interesting problem, the design itself seems questionable.
When inserting a new element, slots marked unused are considered first, before growing the vector. This means the relative order of items is unpredictable. If that's acceptable, you could have just used a vector of (smart) pointers.
Typically a vector is inefficient at removing items from the middle. Since the order doesn't matter here, it is possible to swap the element being removed with the last element and pop the last element.
All elements are the same size, so allocating them from a pool could be faster than using the system allocator.
A pool basically allocates memory in big chunks and hands out smaller chunks on request. A pool usually stores the free list in as-yet-unallocated chunks to track available memory (the very idea described in the question). There are some good implementations readily available (from Boost and other sources).
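A minimal sketch of that free-list idea, assuming trivially copyable elements (the class name and interface here are invented for illustration, not from any particular library):

```cpp
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Freed slots are reused before the backing vector grows; each free slot
// stores the index of the next free slot, forming an in-place free list.
template <class T>
class SlotPool {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable types");
    union Slot {
        T value;
        std::size_t next_free;
    };
    std::vector<Slot> slots_;
    std::size_t free_head_ = SIZE_MAX;  // SIZE_MAX means "no free slots"

public:
    std::size_t allocate(const T& v) {
        std::size_t i;
        if (free_head_ != SIZE_MAX) {       // reuse a freed slot first
            i = free_head_;
            free_head_ = slots_[i].next_free;
        } else {                            // otherwise grow the vector
            i = slots_.size();
            slots_.emplace_back();
        }
        slots_[i].value = v;
        return i;
    }

    void release(std::size_t i) {
        slots_[i].next_free = free_head_;   // push slot onto the free list
        free_head_ = i;
    }

    T& operator[](std::size_t i) { return slots_[i].value; }
};
```

Indices handed out by allocate() stay valid across release() of other slots, which is exactly the stability property the question is after.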
Concerning the original design, it is cumbersome to enumerate the elements in the vector, since real elements are mixed with "holes"; the logic is going to be obfuscated by additional checks.
Probably there is some solid reasoning behind the original design; unfortunately @user1535111 is not telling the details.

An integer hashing problem

I have a (C++) std::map<int, MyObject*> that contains a couple of million objects of type MyObject*. The maximum number of objects I can have is around 100 million. The key is the object's id. During a certain process, these objects must somehow be marked (with a 0 or 1) as fast as possible. The marking cannot happen on the objects themselves (so I cannot introduce a member variable and use that for the marking process). Since I know the minimum and maximum id (1 to 100,000,000), my first thought was to use a std::bitset<100000000> and perform my marking there. This solves my problem and also makes it easier when marking processes run in parallel, since each process can use its own bitset. But I was wondering what the solution would be if I had to use something other than a 0-1 marking, e.g. what could I use if I had to mark all objects with an integer number?
Is there some form of data structure that can deal with this kind of problem in a compact (memory-wise) manner and also be fast? The main queries of interest are whether an object is marked, and what it was marked with.
Thank you.
Note: std::map<int, MyObject*> cannot be changed. Whatever data structure I use, must not deal with the map itself.
How about making the value_type of your map a std::pair<bool, MyObject*> instead of MyObject*?
If you're not concerned with memory, then a std::vector<int> (or whatever suits your need in place of an int) should work.
If you don't like that, and you can't modify your map, then why not create a parallel map for the markers?
std::map<id,T> my_object_map;
std::map<id,int> my_marker_map;
If you cannot modify the objects directly, have you considered wrapping the objects before you place them in the map? e.g.:
struct T_wrapper
{
    int marker;
    T *p_x;
};
std::map<int,T_wrapper> my_map;
If you're going to need to do lookups anyway, then this will be no slower.
EDIT: As @tenfour suggests in his/her answer, a std::pair may be a cleaner solution here, as it saves the struct definition. Personally, I'm not a big fan of std::pair, because you have to refer to everything as first and second rather than by meaningful names. But that's just me...
The most important question to ask yourself is "How many of these 100,000,000 objects might be marked (or remain unmarked)?" If the answer is smaller than roughly 100,000,000/(2*sizeof(int)), then just use another std::set or std::tr1::unordered_set (hash_set prior to TR1) to track which ones are marked (or remain unmarked).
Where does 2*sizeof(int) come from? It's an estimate of the amount of memory overhead to maintain a heap structure in a deque of the list of items that will be marked.
If it is larger, then use std::bitset as you were about to. Its overhead is effectively 0% for the scale of quantity you need. You'll need about 13 megabytes of contiguous RAM to hold the bitset.
If you need to store a marking as well as presence, then use std::tr1::unordered_map using the key of Object* and value of marker_type. And again, if the percentage of marked nodes is higher than the aforementioned comparison, then you'll want to use some sort of bitset to hold the number of bits needed, with suitable adjustments in size, at 12.5 megabytes per bit.
A purpose-built object holding the bitset might be your best choice, given the clarification of the requirements.
Edit: this assumes that you've done proper time-complexity computations for what are acceptable solutions to you, since changing the base std::map structure is no longer permitted.
If you don't mind using hacks, take a look at the memory optimization used in Boost.MultiIndex. It can store one bit in the LSB of a stored pointer.
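A sketch of that LSB trick, assuming the platform round-trips pointers through uintptr_t and that alignment keeps the low bit free (the class here is hypothetical, not Boost.MultiIndex's actual API):

```cpp
#include <cstdint>

// Pointers to types with alignment > 1 always have a zero low bit, so
// that bit can carry a boolean mark. Relies on implementation-defined
// pointer <-> integer round-tripping, hence "hack".
template <class T>
class TaggedPtr {
    static_assert(alignof(T) > 1, "need a spare low bit");
    std::uintptr_t bits_ = 0;

public:
    void set_ptr(T* p) {
        bits_ = reinterpret_cast<std::uintptr_t>(p) | (bits_ & 1u);
    }
    void set_mark(bool m) {
        bits_ = (bits_ & ~std::uintptr_t(1)) | (m ? 1u : 0u);
    }
    T* ptr() const {
        return reinterpret_cast<T*>(bits_ & ~std::uintptr_t(1));
    }
    bool mark() const { return bits_ & 1u; }
};
```

This stores the marker at zero extra memory cost per pointer, at the price of masking on every dereference.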

Best Data Structure for Genetic Algorithm in C++?

I need to implement a genetic algorithm customized for my problem (a college project), and the first version had it coded as a matrix of short (bits per chromosome x size of population).
That was a bad design, since I was declaring a short but only using the values "0" and "1"... but it was just a prototype, and it worked as intended. Now it is time for me to develop a new, improved version. Performance is important here, but simplicity is also appreciated.
I researched around and came up with:
for the chromosome :
- String class (like "0100100010")
- Array of bool
- Vector (vector<bool> appears to be optimized for space)
- Bitset (sounds the most natural one)
and for the population:
- C Array[]
- Vector
- Queue
I am inclined to pick vector for the chromosome and array for the population, but I would like the opinion of anyone with experience on the subject.
Thanks in advance!
I'm guessing you want random access to the population and to the genes. You say performance is important, which I interpret as execution speed. So you're probably best off using a vector<> for the chromosomes and a vector<char> for the genes. The reason for vector<char> is that bitset<> and vector<bool> are optimized for memory consumption, and are therefore slow. vector<char> will give you higher speed at the cost of x8 memory (assuming char = byte on your system). So if you want speed, go with vector<char>. If memory consumption is paramount, then use vector<bool> or bitset<>. bitset<> would seem like a natural choice here, however, bear in mind that it is templated on the number of bits, which means that a) the number of genes must be fixed and known at compile time (which I would guess is a big no-no), and b) if you use different sizes, you end up with one copy per bitset size of each of the bitset methods you use (though inlining might negate this), i.e., code bloat. Overall, I would guess vector<bool> is better for you if you don't want vector<char>.
If you're concerned about the aesthetics of vector<char> you could typedef char gene; and then use vector<gene>, which looks more natural.
A string is just like a vector<char> but more cumbersome.
Specifically to answer your question: I am not exactly sure what you are suggesting. You talk about arrays and the string class. Are you talking about the STL container classes, where you can have a queue, bitset, vector, linked list, etc.? I would suggest a vector for your population (the closest thing to a C array there is) and a bitset for your chromosome if you are worried about memory capacity. Otherwise, stay with a vector of your string representation of your DNA ("10110110").
For ideas and a good tool to dabble with, I recommend you download and initially use this library. It works with the major compilers and on Unix variants, and has all the source code.
All the framework stuff is done for you and you will learn a lot. Later on you could write your own code from scratch or inherit from these classes. You can also use them in commercial code if you want.
Because they are objects, you can change the representation of your DNA easily from integers to reals to structures to trees to bit arrays, etc.
There is always a learning curve involved, but it is worth it.
I use it to generate thousands of neural nets, then weed them out with a simple fitness function, then run them for real.
galib
http://lancet.mit.edu/ga/
Assuming that you want to code this yourself (if you want an external library, kingchris seems to have a good one there), it really depends on what kind of manipulation you need to do. To get the most bang for your buck in terms of memory, you could use any integer type and set/manipulate individual bits via bitmasks, etc. This approach is likely not optimal in terms of ease of use, though. The string example above would work OK; however, again, it is not significantly different from the shorts: you are now just representing '0' or '1' with an 8-bit value as opposed to a 16-bit value. Also, again depending on the manipulation, the string case will probably prove unwieldy. So if you could give some more info on the algorithm, we could maybe give more feedback. Myself, I like the individual bits as part of an integer (a bitset), but if you aren't used to masks, shifts, and all that good stuff, it may not be right for you.
I suggest writing a class for each member of the population; that simplifies things considerably, since you can keep all your member-relevant functions in one place, nicely wrapped with the actual data.
If you need an "array of bools", I suggest using an int or several ints (then use masks and bitwise operations to access, modify, or flip each bit), depending on the number of your chromosomes.
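That mask-and-shift idea might look like the following sketch, assuming a fixed 32-gene chromosome packed into one unsigned integer (the struct and its names are invented for illustration):

```cpp
#include <cstdint>

// One bit per gene, 32 genes per chromosome; accessors use the
// mask/shift operations described above, and flip() doubles as a
// point-mutation operator.
struct Chromosome {
    std::uint32_t genes = 0;

    bool get(unsigned i) const   { return (genes >> i) & 1u; }
    void set(unsigned i, bool v) {
        if (v) genes |=  (1u << i);
        else   genes &= ~(1u << i);
    }
    void flip(unsigned i)        { genes ^= (1u << i); }
};
```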
I usually used some sort of collection class for the population, because a plain array of population members doesn't let you simply add to your population. I would suggest implementing some sort of dynamic list (if you are familiar with ArrayList, that is a good example).
I had major success with genetic algorithms using the recipe above. If you prepare your member class properly, it can really simplify things and lets you focus on writing better genetic algorithms instead of worrying about your data structures.