Bitset Reference - c++

From http://www.cplusplus.com/reference/stl/bitset/:
Because no such small elemental type exists in most C++ environments, the individual elements are accessed as special references which mimic bool elements.
How, exactly, does this bit reference work?
The only way I could think of would be to use a static array of chars, but then each instance would need to store its index in the array. Since each reference instance would have at least the size of a size_t, that would destroy the compactness of the bitset. Additionally, resizing may be slow, and bit manipulation is expected to be fast.

I think you are confusing two things.
The bitset class stores the bits in a compact representation, e.g. in a char array, typically 8 bits per char (but YMMV on "exotic" platforms).
The bitset::reference class is provided to allow users of the bitset class to have reference-like objects to the bits stored in a bitset.
Because regular pointers and references don't have enough granularity to point to the single bits stored in the bitset (their minimum granularity is the char), this class mimics the semantics of a reference to fake reference-like lvalue operations on the bits. This is needed, in particular, to allow the value returned by operator[] to work "normally" as an lvalue (and that probably constitutes 99% of its "normal" use). In this case it can be seen as a "proxy object".
This behavior is achieved by overloading the assignment operator and the conversion-to-bool operator; the bitset::reference class will probably encapsulate a reference to the parent bitset object and the offset (bytes+bit) of the referenced bit, that are used by such operators to retrieve and store the value of the bit.
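To make the proxy idea concrete, here is a minimal sketch. TinyBitset and everything inside it are names I made up for illustration; this is not the code of any real standard library, just one way the overloaded assignment and conversion-to-bool operators could fit together. The proxy here stores a pointer to the word and a bit position.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical minimal fixed-size bit container; not the real std::bitset.
template <std::size_t N>
class TinyBitset {
    using Word = unsigned long;
    static constexpr std::size_t kBitsPerWord = sizeof(Word) * CHAR_BIT;
    static constexpr std::size_t kWords = (N + kBitsPerWord - 1) / kBitsPerWord;
    Word words_[kWords] = {};

public:
    // Proxy standing in for "bool&": remembers the word and the bit position.
    class reference {
        Word* word_;
        std::size_t bit_;
        friend class TinyBitset;
        reference(Word* word, std::size_t bit) : word_(word), bit_(bit) {}

    public:
        reference(const reference&) = default;

        // Writing through the proxy sets or clears the underlying bit.
        reference& operator=(bool value) {
            const Word mask = Word(1) << bit_;
            if (value) *word_ |= mask; else *word_ &= ~mask;
            return *this;
        }
        // Assigning one proxy to another copies the bit value, not the proxy.
        reference& operator=(const reference& other) {
            return *this = static_cast<bool>(other);
        }
        // Reading through the proxy extracts the bit.
        operator bool() const { return ((*word_ >> bit_) & 1u) != 0; }
    };

    reference operator[](std::size_t i) {
        assert(i < N);
        return reference(&words_[i / kBitsPerWord], i % kBitsPerWord);
    }
    bool operator[](std::size_t i) const {
        assert(i < N);
        return ((words_[i / kBitsPerWord] >> (i % kBitsPerWord)) & 1u) != 0;
    }
};
```

With this, b[70] = true; and if (b[70]) read like operations on a bool, while the storage stays at one bit per element. Only the short-lived proxy object is pointer-sized or larger.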
---EDIT---
Actually, the g++ implementation makes bitset::reference directly store a pointer to the memory word in which the bit is stored, and the bit number within that word. This, however, is just an implementation detail to boost its performance.
By the way, in the library sources I found a very compact but clear explanation of what bitset::reference is and what it does:
/**
* This encapsulates the concept of a single bit. An instance of this
* class is a proxy for an actual bit; this way the individual bit
* operations are done as faster word-size bitwise instructions.
*
* Most users will never need to use this class directly; conversions
* to and from bool are automatic and should be transparent. Overloaded
* operators help to preserve the illusion.
*
* (On a typical system, this <em>bit %reference</em> is 64
* times the size of an actual bit. Ha.)
*/

I haven't looked at the STL source, but I would expect a Bitset reference to contain a pointer to the actual bitset, and a bit number of size size_t. The references are only created when you attempt to get a reference to a bitset element.
Normal use of bitsets is most unlikely to use references extensively (if at all), so there shouldn't be much of a performance issue. And, it's conceptually similar to char types. A char is normally 8 bits, but to store a 'reference' to a char requires a pointer, so typically 32 or 64 bits.

I've never looked at the reference implementation, but obviously it must know the bitset it is referring to via a reference, and the index of the bit it is responsible for changing. It can then use the rest of the bitset's interface to make the required changes. This can be quite efficient. Note that bitsets cannot be resized.

I am not quite sure what you are asking, but I can tell you a way to access individual bits in a byte, which is perhaps what bitsets do. Mind you that the following code is not my own and is Microsoft spec (!).
Create a struct as such:
struct Byte
{
bool bit1:1;
bool bit2:1;
bool bit3:1;
bool bit4:1;
bool bit5:1;
bool bit6:1;
bool bit7:1;
bool bit8:1;
};
The ':1' parts of this code declare bit fields. http://msdn.microsoft.com/en-us/library/ewwyfdbe(v=vs.80).aspx
They define how many bits a variable is desired to occupy, so in this struct there are 8 bools that occupy 1 bit each. In total, the 'Byte' struct is therefore typically 1 byte in size (the exact layout of bit fields is implementation-defined).
Now if you have a byte of data, such as a char, you can store this data in a Byte object as follows:
char a = 'a';
Byte oneByte;
oneByte = *(Byte*)(&a); // Get the address of a (a pointer, basically), cast this
// char* pointer to a Byte*,
// then use the dereference operator to copy the data that
// this points to into the variable oneByte (assumes sizeof(Byte) == 1).
Now you can access (and alter) the individual bits by accessing the bool member variables of oneByte. In order to store the altered data in a char again, you can do as follows:
char b;
b = *(char*)(&oneByte); // Basically, this is the reverse of what you do to
// store the char in a Byte.
I'll try to find the source of this technique, to give credit where credit is due.
Also, again I am not entirely sure whether this answer is of any use to you. I interpreted your question as being 'how would access to individual bits be handled internally?'.
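For comparison, the same per-bit access can be written with plain shifts and masks. This is a different technique from the bit-field struct above and does not depend on how the compiler lays out bit fields. The helper names get_bit and set_bit are my own, so treat this as a sketch rather than anything from a library.

```cpp
// Hypothetical helpers: read or change one bit of a byte, bit 0 being the
// least significant bit.
inline bool get_bit(unsigned char byte, unsigned pos) {
    return ((byte >> pos) & 1u) != 0;
}

inline unsigned char set_bit(unsigned char byte, unsigned pos, bool value) {
    const unsigned char mask = static_cast<unsigned char>(1u << pos);
    return value ? static_cast<unsigned char>(byte | mask)
                 : static_cast<unsigned char>(byte & ~mask);
}
```

Here pos is assumed to be in the range 0 to 7; nothing in the sketch checks it.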

Related

std::bitset c++ external initialization

I want to wrap or cast a std::bitset over a given constant data array
or, to formulate it differently, initialize a bitset with foreign data.
The user knows the index of the bit, which he can then check via bitset.test(i). The data is big, so it must be efficient. (Machine bit order does not matter, we can store it in the right way.)
That's what I tried:
constexpr uint32_t data[32] = {1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32};
constexpr std::bitset<1000> bset(1); //bit bitset initialized with a value here
constexpr std::bitset<1000> bset2(data); //init it with our data, this is not working
The data array holds 32*32=1024 bits. With my bitset I can address almost the full range; the user does not need more than 1000. Can someone please explain how this is done in C++ for bset2 in my example above?
Unfortunately std::bitset does not have suitable design for what you want.
It is not designed as an aggregate (like std::array is), so aggregate initialization is impossible (and copying bits into it with std::memcpy is undefined behavior).
It can take only one unsigned long long in constexpr constructor.
The operator [] and set method will become constexpr in C++23 so there will be a way after that.
Just use constexpr raw array or std::array and add bit accessing methods until then.
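A minimal sketch of that last suggestion, assuming 32-bit words with the least significant bit first. ConstBits and kBits are hypothetical names I chose; only test() mirrors the std::bitset member of the same name.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical read-only view of packed bits stored in 32-bit words.
template <std::size_t Words>
struct ConstBits {
    std::uint32_t words[Words];

    static constexpr std::size_t size() { return Words * 32; }

    // Bit i lives in word i / 32 at position i % 32 (least significant bit first).
    constexpr bool test(std::size_t i) const {
        return ((words[i / 32] >> (i % 32)) & 1u) != 0;
    }
};

// Same data as in the question, wrapped as an aggregate.
constexpr ConstBits<32> kBits = {{1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11,
                                  12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
                                  23, 24, 25, 26, 27, 28, 29, 30, 31, 32}};

static_assert(kBits.test(0), "word 0 is 1, so bit 0 is set");
static_assert(!kBits.test(1), "bit 1 of word 0 is clear");
static_assert(kBits.test(33), "word 1 is 2, so its bit 1 is set");
```

Because the struct is an aggregate, the data can be initialized at compile time, and test() involves no copying at all.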

Why does std::vector<bool> have no .data()?

The specialisation of std::vector<bool>, as specified in C++11 23.3.7/1, doesn't declare a data() member (e.g. mentioned here and here).
The question is: Why does a std::vector<bool> have no .data()? This is the very same question as why is a vector of bools not stored contiguously in memory. What are the benefits in not doing so?
Why can a pointer to an array of bools not be returned?
Why does a std::vector<bool> have no .data()?
Because std::vector<bool> stores multiple values in 1 byte.
Think about it like a compressed storage system, where every boolean value needs 1 bit. So, instead of having one element per memory block (one element per array cell), each cell packs several elements, typically CHAR_BIT (usually 8) bools per byte.
Assuming that you want to index a block to get a value, how would you use operator []? It can't return bool& (the smallest addressable unit is a byte, which here stores more than one bool), so you can't take the address of the result as a bool*. In other words, bool *bool_ptr = &v[0]; is not valid code and results in a compilation error.
Moreover, an implementation is allowed not to perform the memory optimization (compression). So data() would have to copy to the expected return type depending on the implementation (or the standard would have to require the optimization instead of just allowing it).
Why can a pointer to an array of bools not be returned?
Because std::vector<bool> is not stored as an array of bools, thus no pointer can be returned in a straightforward way. It could do that by copying the data to an array and return that array, but it's a design choice not to do that (if they did, I would think that this works as the data() for all containers, which would be misleading).
What are the benefits in not doing so?
Memory optimization.
Usually 8 times less memory usage, since it stores multiple bits in a single byte. To be exact, CHAR_BIT times less.
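The difference in what operator[] returns can be checked at compile time. A small sketch follows; flip_first is just an illustrative helper name of mine.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// operator[] on std::vector<bool> yields a proxy class, not a real bool&.
static_assert(!std::is_same<decltype(std::declval<std::vector<bool>&>()[0]), bool&>::value,
              "vector<bool>::operator[] does not return bool&");
// For any other element type the usual reference is returned.
static_assert(std::is_same<decltype(std::declval<std::vector<char>&>()[0]), char&>::value,
              "vector<char>::operator[] returns char&");

// Flips the first element through the proxy and reports the new value.
inline bool flip_first(std::vector<bool>& v) {
    std::vector<bool>::reference r = v[0];  // proxy object that refers to one element
    r = !r;                                 // writes through to the vector's storage
    // bool* p = &v[0];                     // would not compile: there is no bool object
    return v[0];
}
```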

Can storing unrelated data in the least-significant-bit of a pointer work reliably?

Let me just say up front that I'm aware that what I'm about to propose is a mortal sin, and that I will probably burn in Programming Hell for even considering it.
That said, I'm still interested in knowing if there's any reason why this wouldn't work.
The situation is: I have a reference-counting smart-pointer class that I use everywhere. It currently looks something like this (note: incomplete/simplified pseudocode):
class IRefCountable
{
public:
IRefCountable() : _refCount(0) {}
virtual ~IRefCountable() {}
void Ref() {_refCount++;}
bool Unref() {return (--_refCount==0);}
private:
unsigned int _refCount;
};
class Ref
{
public:
Ref(IRefCountable * ptr, bool isObjectOnHeap) : _ptr(ptr), _isObjectOnHeap(isObjectOnHeap)
{
_ptr->Ref();
}
~Ref()
{
if ((_ptr->Unref())&&(_isObjectOnHeap)) delete _ptr;
}
private:
IRefCountable * _ptr;
bool _isObjectOnHeap;
};
Today I noticed that sizeof(Ref)=16. However, if I remove the boolean member variable _isObjectOnHeap, sizeof(Ref) is reduced to 8. That means that for every Ref in my program, there are 7.875 wasted bytes of RAM... and there are many, many Refs in my program.
Well, that seems like a waste of some RAM. But I really need that extra bit of information (okay, humor me and assume for the sake of the discussion that I really do). And I notice that since IRefCountable is a non-POD class, it will (presumably) always be allocated on a word-aligned memory address. Therefore, the least significant bit of (_ptr) should always be zero.
Which makes me wonder... is there any reason why I can't OR my one bit of boolean data into the least-significant bit of the pointer, and thus reduce sizeof(Ref) by half without sacrificing any functionality? I'd have to be careful to AND out that bit before dereferencing the pointer, of course, which would make pointer dereferences less efficient, but that might be made up for by the fact that the Refs are now smaller, and thus more of them can fit into the processor's cache at once, and so on.
Is this a reasonable thing to do? Or am I setting myself up for a world of hurt? And if the latter, how exactly would that hurt be visited upon me? (Note that this is code that needs to run correctly in all reasonably modern desktop environments, but it doesn't need to run in embedded machines or supercomputers or anything exotic like that)
If you want to use only the standard facilities and not rely on any implementation then with C++0x there are ways to express alignment (here is a recent question I answered). There's also std::uintptr_t to reliably get an unsigned integral type large enough to hold a pointer. Now the one thing guaranteed is that a conversion from the pointer type to std::[u]intptr_t and back to that same type yields the original pointer.
I suppose you could argue that if you can get back the original std::intptr_t (with masking), then you can get the original pointer. I don't know how solid this reasoning would be.
[edit: thinking about it there's no guarantee that an aligned pointer takes any particular form when converted to an integral type, e.g. one with some bits unset. probably too much of a stretch here]
The problem here is that it is entirely machine-dependent. It isn't something one often sees in C or C++ code, but it has certainly been done many times in assembly. Old Lisp interpreters almost always used this trick to store type information in the low bit(s). (I have seen it in C code, but in projects that were being implemented for a specific target platform.)
Personally, if I were trying to write portable code, I probably wouldn't do this. The fact is that it will almost certainly work on "all reasonably modern desktop environments". (Certainly, it will work on every one I can think of.)
A lot depends on the nature of your code. If you are maintaining it, and nobody else will ever have to deal with the "world of hurt", then it might be ok. You will have to add ifdef's for any odd architecture that you might need to support later on. On the other hand, if you are releasing it to the world as "portable" code, that would be cause for concern.
Another way to handle this is to write two versions of your smart pointer, one for machines on which this will work and one for machines where it won't. That way, as long as you maintain both versions, it won't be that big a deal to change a config file to use the 16-byte version.
It goes without saying that you would have to avoid writing any other code that assumes sizeof(Ref) is 8 rather than 16. If you are using unit tests, run them with both versions.
Any reason? Unless things have changed in the standard lately, the value representation of a pointer is implementation-defined. It is certainly possible that some implementation somewhere may pull the same trick, defining these otherwise-unused low bits for its own purposes. It's even more possible that some implementation might use word-pointers rather than byte-pointers, so instead of two adjacent words being at "addresses" 0x8640 and 0x8642, they would be at "addresses" 0x4320 and 0x4321.
One tricky way around the problem would be to make Ref a (de facto) abstract class, and all instances would actually be instances of RefOnHeap and RefNotOnHeap. If there are that many Refs around, the extra space used to store the code and metadata for three classes rather than one would be made up by the space savings in having each Ref being half the size. (Won't work too well, the compiler can omit the vtable pointer if there are no virtual methods and introducing virtual methods will add the 4-or-8 bytes back to the class).
You always have at least a free bit to use in the pointer as long as
you're not pointing to arbitrary positions inside a struct or array with alignment of 1, or
the platform gives you a free bit
Since IRefCountable has an alignment of at least 4 (it holds an unsigned int, plus a vtable pointer), you'll have at least 2 free bottom bits in IRefCountable* to use.
Regarding the first point, storing data in the least significant bit is always reliable if the pointer is aligned to a power of 2 larger than 1. That means it'll work for everything apart from char*/bool* or a pointer to a struct containing all char/bool members, and obviously it'll work for IRefCountable* in your case. In C++11 you can use alignof or std::alignment_of to ensure that you have the required alignment like this
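static_assert(alignof(Ref) > 1, "Ref must be aligned to more than 1 byte");
static_assert(alignof(IRefCountable) > 1, "IRefCountable must be aligned to more than 1 byte");
// This check for power of 2 is likely redundant
static_assert((alignof(Ref) & (alignof(Ref) - 1)) == 0, "alignment must be a power of 2");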
// Now IRefCountable* is always aligned,
// so its least significant bit can be used freely
Even if you have some object with only 1-byte alignment, for example if you change the _refCount in IRefCountable to uint8_t, then you can still enforce alignment requirement with alignas, or with other extensions in older C++ like __declspec(align). Dynamically allocated memory is already aligned to max_align_t, or you can use aligned_alloc() for a higher level alignment
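Putting the alignment check and the masking together, a low-bit tag could be sketched as below. TaggedPtr is a hypothetical helper of my own. It assumes that the std::uintptr_t value of a suitably aligned pointer has a zero low bit, which holds on mainstream desktop platforms but is not promised by the standard.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: packs a T* and one bool into a single pointer-sized word.
template <typename T>
class TaggedPtr {
    static_assert(alignof(T) > 1, "T must be aligned to at least 2 bytes");
    std::uintptr_t bits_;

public:
    TaggedPtr(T* ptr, bool flag)
        : bits_(reinterpret_cast<std::uintptr_t>(ptr) | static_cast<std::uintptr_t>(flag)) {
        // Check the platform assumption: the low bit of the raw pointer must be free.
        assert((reinterpret_cast<std::uintptr_t>(ptr) & 1u) == 0);
    }

    // Mask the flag out before using the value as a pointer again.
    T* get() const { return reinterpret_cast<T*>(bits_ & ~static_cast<std::uintptr_t>(1)); }
    bool flag() const { return (bits_ & 1u) != 0; }
    T* operator->() const { return get(); }
    T& operator*() const { return *get(); }
};
```

In the Ref class from the question, _ptr and _isObjectOnHeap could then collapse into one TaggedPtr<IRefCountable> member, at the cost of one AND per dereference.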
My second bullet point means that if you really need to store arbitrary pointers to objects with 1-byte alignment, then most of the time you can still use a spare bit provided by the platform:
On many 32-bit platforms the address space is split in half for user and kernel processes. User pointers will always have the most significant bit unset so you can use that to store data. Of course it won't work on platforms with more than 2GB of user address space, like when the split is 3/1 or 4/4
On 64-bit platforms, most currently have only 48-bit virtual addresses, and a few newer high-end CPUs may have 57-bit virtual addresses, which is far from the total 64 bits. Therefore you'll have lots of bits to spare. And in reality this always works in personal computing since you'll never be able to fill that vast address space.
This is called tagged pointer
If the data is always heap-allocated then you can tell the OS to limit the range of address space to use to get more bits
For more information read Using the extra 16 bits in 64-bit pointers
Yes, this can work reliably. This is, in fact, used by the Linux kernel as part of its red-black tree implementation. Instead of storing an extra boolean to indicate whether a node is red or black (which can take up quite a bit of additional space), the kernel uses the low-order bit of the parent node address.
From rbtree_types.h:
struct rb_node {
unsigned long __rb_parent_color;
struct rb_node *rb_right;
struct rb_node *rb_left;
} __attribute__((aligned(sizeof(long))));
The __rb_parent_color field stores both the address of the node's parent and the color of the node (in the least-significant bit).
Getting The Pointer
To retrieve the parent address from this field you just clear the low-order bits (this clears the lowest 2 bits).
From rbtree.h:
#define rb_parent(r) ((struct rb_node *)((r)->__rb_parent_color & ~3))
Getting The Boolean
To retrieve the color you just extract the lower bit and treat it like a boolean.
From rbtree_augmented.h:
#define __rb_color(pc) ((pc) & 1)
#define __rb_is_black(pc) __rb_color(pc)
#define __rb_is_red(pc) (!__rb_color(pc))
#define rb_color(rb) __rb_color((rb)->__rb_parent_color)
#define rb_is_red(rb) __rb_is_red((rb)->__rb_parent_color)
#define rb_is_black(rb) __rb_is_black((rb)->__rb_parent_color)
Setting The Pointer And Boolean
You set the pointer and boolean value using standard bit manipulation operations (making sure to preserve each part of the final value).
From rbtree_augmented.h:
static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
{
rb->__rb_parent_color = rb_color(rb) | (unsigned long)p;
}
static inline void rb_set_parent_color(struct rb_node *rb,
struct rb_node *p, int color)
{
rb->__rb_parent_color = (unsigned long)p | color;
}
You can also clear the boolean value, setting it to false, via (unsigned long)p & ~1.
There will always be a sense of uncertainty even if this method works, because ultimately you are relying on the internal architecture, which may or may not be portable.
On the other hand, to solve this problem while avoiding the bool variable, I would suggest a simple constructor such as:
Ref(IRefCountable * ptr) : _ptr(ptr)
{
if(ptr != 0)
_ptr->Ref();
}
From the code, I smell that the reference counting is needed only when the object is on heap. For automatic objects, you can simply pass 0 to the class Ref and put appropriate null checks in constructor/destructor.
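Spelled out a bit more, the null-check variant might look like the sketch below. IRefCountable is restated from the question so the example compiles on its own; the copy constructor and the deleted assignment operator are my additions and were not part of the original pseudocode.

```cpp
// Restated from the question so that the sketch compiles on its own.
class IRefCountable {
public:
    IRefCountable() : _refCount(0) {}
    virtual ~IRefCountable() {}
    void Ref() { ++_refCount; }
    bool Unref() { return --_refCount == 0; }
private:
    unsigned int _refCount;
};

// Variant of Ref without the bool member. A null pointer means "nothing to
// count", so every access to _ptr is guarded.
class Ref {
public:
    explicit Ref(IRefCountable* ptr) : _ptr(ptr) {
        if (_ptr != nullptr) _ptr->Ref();
    }
    Ref(const Ref& other) : _ptr(other._ptr) {
        if (_ptr != nullptr) _ptr->Ref();
    }
    Ref& operator=(const Ref&) = delete;  // left out of the sketch for brevity
    ~Ref() {
        // Only counted (non-null) objects are ever deleted.
        if (_ptr != nullptr && _ptr->Unref()) delete _ptr;
    }
private:
    IRefCountable* _ptr;
};
```

Note the trade-off: a Ref constructed from a null pointer cannot be used to reach the automatic object at all, so this only fits code that does not need to access stack objects through a Ref.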
Have you thought about an out of class storage ?
Depending on whether you have (or not) to worry about multi-threading and control the implementation of new/delete/malloc/free, it might be worth a try.
The point would be that instead of incrementing a local counter (local to the object), you would maintain a "counter" map address --> count that would haughtily ignore addresses passed that are outside the allocated area (stack for example).
It may seem silly (there is room for contention in MT), but it also plays rather nice with read-only since the object is not "modified" only for counting.
Of course, I have no idea of the performance you might hope to achieve with this :p
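A single-threaded sketch of such a side table, to show the shape of the idea. RefTable and its member names are invented for illustration, and a real version would need locking (or per-thread tables) plus a hook into the allocator to call track().

```cpp
#include <cstddef>
#include <unordered_map>

// Hypothetical side table mapping object addresses to reference counts.
class RefTable {
    std::unordered_map<const void*, std::size_t> counts_;

public:
    // Start tracking an address, for example right after a heap allocation.
    void track(const void* p) { counts_.emplace(p, 0); }

    // Increment the count; addresses that were never tracked are ignored.
    void ref(const void* p) {
        auto it = counts_.find(p);
        if (it != counts_.end()) ++it->second;
    }

    // Decrement the count. Returns true when a tracked object has lost its
    // last reference and should be destroyed by the caller.
    bool unref(const void* p) {
        auto it = counts_.find(p);
        if (it == counts_.end()) return false;  // untracked, e.g. a stack object
        if (it->second > 1) {
            --it->second;
            return false;
        }
        counts_.erase(it);
        return true;
    }
};
```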

C++ 2.5 bytes (20-bit) integer

I know it's ridiculous, but I need it for storage optimization. Is there any good way to implement it in C++?
It has to be flexible enough so that I can use it as a normal data type e.g Vector< int20 >, operator overloading, etc..
If storage is your main concern, I suspect you need quite a few 20-bit variables. How about storing them in pairs? You could create a class representing two such variables and store them in 2.5+2.5 = 5 bytes.
To access the variables conveniently you could override the []-operator so you could write:
int fst = pair[0];
int snd = pair[1];
Since you may want to allow for manipulations such as
pair[1] += 5;
you would not want to return a copy of the backing bytes, but a reference. However, you can't return a direct reference to the backing bytes (since writing through it would mess up its neighboring value), so you'd actually need to return a proxy for the backing bytes (which in turn has a reference to the backing bytes) and let the proxy overload the relevant operators.
As a matter of fact, as @Tony suggests, you could generalize this to a general container holding N such 20-bit variables.
(I've done this myself in a specialization of a vector for efficient storage of booleans (as single bits).)
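Here is roughly what such a pair with a proxy could look like. Int20Pair and Proxy are hypothetical names, the values are treated as unsigned, and the bit order inside the five bytes is an arbitrary choice of mine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container for two unsigned 20-bit values packed into 5 bytes.
class Int20Pair {
    std::uint8_t bytes_[5] = {};

    // Gather the five bytes into one 40-bit number (byte 0 is least significant).
    std::uint64_t gather() const {
        std::uint64_t all = 0;
        for (int b = 4; b >= 0; --b) all = (all << 8) | bytes_[b];
        return all;
    }
    // Value 0 uses bits 0..19 and value 1 uses bits 20..39 of the 40-bit block.
    std::uint32_t load(int i) const {
        return static_cast<std::uint32_t>((gather() >> (20 * i)) & 0xFFFFFu);
    }
    void store(int i, std::uint32_t v) {
        const std::uint64_t mask = std::uint64_t(0xFFFFF) << (20 * i);
        const std::uint64_t all =
            (gather() & ~mask) | ((std::uint64_t(v) << (20 * i)) & mask);
        for (int b = 0; b < 5; ++b) bytes_[b] = static_cast<std::uint8_t>(all >> (8 * b));
    }

public:
    // Proxy returned by operator[]; it forwards reads and writes to the pair.
    class Proxy {
        Int20Pair& pair_;
        int index_;
    public:
        Proxy(Int20Pair& pair, int index) : pair_(pair), index_(index) {}
        operator std::uint32_t() const { return pair_.load(index_); }
        Proxy& operator=(std::uint32_t v) { pair_.store(index_, v); return *this; }
        Proxy& operator+=(std::uint32_t v) { return *this = pair_.load(index_) + v; }
    };

    Proxy operator[](int i) { assert(i == 0 || i == 1); return Proxy(*this, i); }
    std::uint32_t operator[](int i) const { assert(i == 0 || i == 1); return load(i); }
};

static_assert(sizeof(Int20Pair) == 5, "two 20-bit values fit in five bytes");
```

Values wider than 20 bits are silently truncated by the mask in store(); a real class would probably want to decide whether that is an error.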
No... you can't do that as a single value-semantic type... any class data must be a multiple of the 8-bit character size (inviting all the usual quips about CHAR_BIT etc).
That said, let's clutch at straws...
Unfortunately, you're obviously handling very many data items. If this is more than 64k, any proxy object into a custom container of packed values will probably need a >16 bit index/handle too, but still one of the few possibilities I can see worth further consideration. It might be suitable if you're only actively working with and needing value semantic behaviour for a small subset of the values at one point in time.
struct Proxy
{
Int20_Container& container_; // might not need if a singleton
Int20_Container::size_type index_;
...
};
So, the proxy might be 32, 64 or more bits - the potential benefit is only if you can create them on the fly from indices into the container, have them write directly back into the container, and keep them short-lived with few concurrently. (One simple way - not necessarily the fastest - to implement this model is to use an STL bitset or vector as the Int20_Container, and either store 20 times the logical index in index_, or multiply on the fly.)
It's also vaguely possible that although your values range over a 20-bit space, you've less than say 64k distinct values in actual use. If you have some such insight into your data set, you can create a lookup table where 16-bit array indices map to 20-bit values.
Use a class. As long as you respect the copy/assign/clone/etc... STL semantics, you won't have any problem.
But it will not optimize the memory space on your computer. In particular, if you put it in a flat array, each 20-bit value will likely be aligned on a 32-bit boundary, so a 20-bit type brings no benefit there.
In that case, you will need to define your own optimized array type, that could be compatible with the STL. But don't expect it to be fast. It won't be.
Use a bitfield. (I'm really surprised nobody has suggested this.)
struct int20_and_something_else {
int less_than_a_million : 20;
int less_than_four_thousand : 12; // total 32 bits
};
This only works as a mutual optimization of elements in a structure, where you can spackle the gaps with some other data. But it works very well!
If you truly need to optimize a gigantic array of 20-bit numbers and nothing else, there is:
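struct int20_x3 {
// Declared with a 64-bit type: with plain int, most compilers will not let a
// 20-bit field straddle a 32-bit boundary, and the struct grows to 12 bytes.
long long one : 20;
long long two : 20;
long long three : 20; // 60 bits is almost 64
void set( int index, int value );
int get( int index );
};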
You can add getter/setter functions to make it prettier if you like, but you can't take the address of a bitfield, and they can't participate in an array. (Of course, you can have an array of the struct.)
Use as:
int20_x3 *big_array = new int20_x3[ array_size / 3 + 1 ];
big_array[ index / 3 ].set( index % 3, value );
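The set and get members are only declared above, so here is one way they might be filled in. This sketch uses std::int64_t as the field type (an assumption on my part, so that the three fields can share one 64-bit unit) and a differently named struct, Int20x3, to make clear it is not the original code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical completed version: three signed 20-bit fields in one 64-bit unit.
struct Int20x3 {
    std::int64_t one : 20;
    std::int64_t two : 20;
    std::int64_t three : 20;

    // Stores value in slot index (0, 1 or 2); value must fit in 20 signed bits.
    void set(int index, int value) {
        assert(index >= 0 && index < 3);
        switch (index) {
            case 0: one = value; break;
            case 1: two = value; break;
            default: three = value; break;
        }
    }

    // Returns the value held in slot index (0, 1 or 2).
    int get(int index) const {
        assert(index >= 0 && index < 3);
        switch (index) {
            case 0: return static_cast<int>(one);
            case 1: return static_cast<int>(two);
            default: return static_cast<int>(three);
        }
    }
};

// Holds on mainstream compilers, but bit-field layout is implementation-defined.
static_assert(sizeof(Int20x3) == 8, "three 20-bit fields expected to share 64 bits");
```

Signed 20-bit fields cover -524288 to 524287; if you need 0 to 1048575 instead, an unsigned field type would be the thing to try.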
You can use C++ std::bitset. Store everything in a bitset and access your data using the correct index (x20).
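A rough sketch of that indexing scheme, copying one bit at a time. BitsetInt20Array is a made-up wrapper name; only std::bitset's own set() and test() are real library calls. It favours simplicity over speed.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>

// Hypothetical wrapper: Count unsigned 20-bit values stored in one std::bitset.
template <std::size_t Count>
class BitsetInt20Array {
    std::bitset<Count * 20> bits_;

public:
    // Value number index occupies bits [index * 20, index * 20 + 19].
    void set(std::size_t index, std::uint32_t value) {
        for (std::size_t b = 0; b < 20; ++b)
            bits_.set(index * 20 + b, ((value >> b) & 1u) != 0);
    }

    std::uint32_t get(std::size_t index) const {
        std::uint32_t value = 0;
        for (std::size_t b = 0; b < 20; ++b)
            if (bits_.test(index * 20 + b)) value |= std::uint32_t(1) << b;
        return value;
    }
};
```

Since std::bitset has a compile-time size, this only fits when the number of values is known up front; for a runtime size the same indexing would have to sit on top of something like a vector of words.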
You're not going to be able to get exactly 20 bits as a type (even with a bit-packed struct), as it will always be aligned (at the smallest granularity) to a byte. IMO the only way to go, if you must have 20 bits, is to create a bitstream to handle the data (which you can overload to accept indexing etc.).
You can use the union keyword to create a bit field. I've used it way back when bit fields were a necessity. Otherwise, you can create a class that holds 3 bytes, but through bitwise operations exposes just the most significant 20.
As far as I know that isn't possible.
The easiest option would be to define a custom type that uses an int32_t as the backing storage and implements the appropriate maths as overloaded operators.
For better storage density, you could store 3 int20 in a single int64_t value.
Just an idea: use optimized storage (5 bytes for two instances), and for operations, convert it into 32-bit int and then back.
While it's possible to do this a number of ways, one possibility would be to use bit twiddling to store them as the left and right parts of a 5-byte array, with a class to store/retrieve that converts your desired array entry to an entry in a byte5[] array and extracts the left or right half as appropriate.
However, most hardware requires integers to be word-aligned, so as well as the bit twiddling to extract the integer you would need some bit shifting to align it properly.
I think it would be more efficient to increase your swap space and let virtual memory take care of your large array (after all 20 vs 32 is not much of a saving!) always assuming you have a 64 bit OS.

When to use STL bitsets instead of separate variables?

In what situation would it be more appropriate for me to use a bitset (STL container) to manage a set of flags rather than having them declared as a number of separate (bool) variables?
Will I get a significant performance gain if I used a bitset for 50 flags rather than using 50 separate bool variables?
Well, 50 bools as a bitset need 50 bits, i.e. 7 bytes (in practice sizeof(std::bitset<50>) is typically rounded up to 8), while 50 bools as bools will take 50 bytes. These days that's not really a big deal, so using bools is probably fine.
However, one place a bitset might be useful is if you need to pass those bools around a lot, especially if you need to return the set from a function. Using a bitset you have less data that has to be moved around on the stack for returns. Then again, you could just use refs instead and have even less data to pass around. :)
std::bitset will give you extra points when you need to serialize / deserialize it. You can just write it to a stream or read it from a stream. But certainly, the separate bools are going to be faster. They are optimized for this kind of use after all, while a bitset is optimized for space and still has function calls involved. It will never be faster than separate bools.
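As a quick illustration of the stream support, the sketch below round-trips 50 flags through a string. save_flags and load_flags are names I picked; operator<< and operator>> for std::bitset are the standard ones.

```cpp
#include <bitset>
#include <sstream>
#include <string>

// Writes the flags as a string of '0' and '1' characters, highest bit first.
inline std::string save_flags(const std::bitset<50>& flags) {
    std::ostringstream out;
    out << flags;
    return out.str();
}

// Reads the flags back from the same textual form.
inline std::bitset<50> load_flags(const std::string& text) {
    std::istringstream in(text);
    std::bitset<50> flags;
    in >> flags;
    return flags;
}
```

The textual form uses one character per bit, so it is convenient rather than compact.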
Bitset
Very space efficient
Less efficient due to bit fiddling
Provides serialize / de-serialize with op<< and op>>
All bits packed together: You will have the flags at one place.
Separate bools
Very fast
Bools are not packed together. They will be members somewhere.
Decide on the facts. I, personally, would use std::bitset for code that is not performance-critical, and would use bools if I either have only a few of them (so it's easy to keep an overview), or if I need the extra performance.
It depends what you mean by 'performance gain'. If you only need 50 of them, and you're not low on memory then separate bools is pretty much always a better choice than a bitset. They will take more memory, but the bools will be much faster. A bitset is usually implemented as an array of ints (the bools are packed into those ints). So the first 32 bools (bits) in your bitset will only take up a single 32bit int, but to read each value you have to do some bitwise operations first to mask out all the values you don't want. E.g. to read the 2nd bit of a bitset, you need to:
Find the int that contains the bit you want (in this case, it's the first int)
Bitwise And that int with '2' (i.e. value & 0x02) to find out if that bit is set
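Those two steps, written out for bits packed 32 to a word (read_bit is just an illustrative name, and the least-significant-bit-first order is an assumption):

```cpp
#include <cstddef>
#include <cstdint>

// Reads bit number index from bits packed 32 per word, least significant first.
inline bool read_bit(const std::uint32_t* words, std::size_t index) {
    const std::uint32_t word = words[index / 32];                 // step 1: find the int
    const std::uint32_t mask = std::uint32_t(1) << (index % 32);  // for index 1 this is 0x02
    return (word & mask) != 0;                                    // step 2: mask and test
}
```

That is a division (or shift), a load, and an AND per read, which is where the extra cost compared to a plain bool comes from.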
However, if memory is a bottleneck and you have a lot of bools, using a bitset could make sense (e.g. if your target platform is a mobile phone, or it's some state in a very busy web service).
NOTE: A std::vector of bool usually has a specialisation to use the equivalent of a bitset, thus making it much smaller and also slower for the same reasons. So if speed is an issue, you'll be better off using a vector of char (or even int), or even just use an old school bool array.
RE @Wilka:
Actually, bit fields are supported by C and C++ in a way that doesn't require you to do your own masking. I don't remember the exact syntax, but it's something like this:
struct MyBitset {
bool firstOption:1;
bool secondOption:1;
bool thirdOption:1;
int fourBitNumber:4;
};
You can reference any value in that struct by just using dot notation, and the right things will happen:
MyBitset bits;
bits.firstOption = true;
bits.fourBitNumber = 2;
if(bits.thirdOption) {
// Whatever!
}
You can use arbitrary bit sizes for things. The resulting struct is padded, so it can be 7 bits or more larger than the data you define; the exact size and layout are implementation-defined, and compilers usually round the size up to the alignment of the underlying field types.