What are flags and bitfields? - c++

Can someone explain to me what flags and bitfields are? They seem to be related to each other, or maybe I got the wrong idea. I kind of grasp bits and pieces of what they do and are, but I would like to have them fully explained, and I can't really find any good tutorials or guides.
I would be really thankful if someone could give some good examples of how to use them. For instance, I see these kinds of expressions all the time and I don't fully understand them, just that they are some kind of logical operators or something:
VARIABLE1 | VARIABLE2
Thanks in advance!

An introduction to bitwise operations can be found here: http://www.codeproject.com/Articles/2247/An-introduction-to-bitwise-operators
As they apply to flags, bitwise operations are advantageous in that they are very fast and save on space. You can store many different states of an object within a single variable by using mutually exclusive bits. i.e.
0001 // property 1 (== 1, 0x01)
0010 // property 2 (== 2, 0x02)
0100 // property 3 (== 4, 0x04)
1000 // property 4 (== 8, 0x08)
These can represent four different properties of an object (these are the "masks"). We can add a property to an object's flags state by using or:
short objState = 0; // initialize to 0
objState |= 0x02; // binary 0010 (a bare 0010 would be an octal literal in C++)
This adds property 2 above to objState by "or"-ing 0010 with 0000, resulting in 0010. If we add another flag/property like so:
objState |= 0x04; // binary 0100
we end up with objState = 0110 in binary.
Now we can check if the object has the flag for property 2 set, for example, by using and:
if (objState & 0x02) // do something
A bitwise and yields 1 in a position if and only if both input bits are 1, so if the bit for property 2 is set in objState, the result of the above operation is guaranteed to be non-zero.
So as I mentioned, the advantages of handling the properties/flags of an object this way are speed and space efficiency. Think of it this way: you can store a whole set of properties in a single variable using this method.
Let's say, for example, you have a file type and you wish to keep track of the properties using bit masks (I'll use audio files). Maybe bits 0 - 3 can store the bit-depth of the file, bits 4 - 7 can store the file type (Wav, Aif, etc.), and so on. You then just need this one variable to pass around to different functions and can test using your defined bit-masks instead of having to keep track of potentially dozens of variables.
Hope that sheds some light on at least this one application of bitwise operation.
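As a rough sketch of the operations described above (setting a flag with or, testing it with and), the following also adds a helper for clearing a flag. The names PROPERTY_1 through PROPERTY_4, setFlag, clearFlag and hasFlag are made up for this illustration, and the masks are written as binary literals, which assumes C++14 or later.

```cpp
#include <cstdint>

// Hypothetical property masks, one bit each (binary literals need C++14).
constexpr std::uint8_t PROPERTY_1 = 0b0001;
constexpr std::uint8_t PROPERTY_2 = 0b0010;
constexpr std::uint8_t PROPERTY_3 = 0b0100;
constexpr std::uint8_t PROPERTY_4 = 0b1000;

// Return `state` with the bits in `mask` switched on.
constexpr std::uint8_t setFlag(std::uint8_t state, std::uint8_t mask) {
    return static_cast<std::uint8_t>(state | mask);
}

// Return `state` with the bits in `mask` switched off.
constexpr std::uint8_t clearFlag(std::uint8_t state, std::uint8_t mask) {
    return static_cast<std::uint8_t>(state & ~mask);
}

// True if every bit in `mask` is set in `state`.
constexpr bool hasFlag(std::uint8_t state, std::uint8_t mask) {
    return (state & mask) == mask;
}

// Mirrors the walkthrough: start at 0000, add property 2, then property 3.
constexpr std::uint8_t objState = setFlag(setFlag(0, PROPERTY_2), PROPERTY_3);
static_assert(objState == 0b0110, "properties 2 and 3 are set");
static_assert(hasFlag(objState, PROPERTY_2), "property 2 is set");
```

Clearing uses and with the inverted mask, so every bit except the one being removed is left untouched.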

A "bitfield" is a set of one or more bits in a "word" (that is, say, an int, long or char) that are stored together in one variable.
You can, for example, have "none", "spots" and/or "stripes" on some animal, and it can also have a "none", "short", "medium" or "long" tail.
So, we need two bits to represent the length of a tail:
enum tail
{
tail_none = 0,
tail_short = 1,
tail_medium = 2,
tail_long = 3
};
We then store these as bits in the "attributes":
enum bits_shifts
{
tail_shift = 0, // Uses bits 0..1
spots_shift = 2, // Uses bit 2
stripes_shift = 3
};
enum bits_counts
{
tail_bits = 2, // Uses bits 0..1
spots_bits = 1, // Uses bit 2
stripes_bits = 1
};
We now pretend that we have fetched the tail_length, has_stripes and has_spots variables from some input.
int attributes;
attributes = tail_length << tail_shift;
if (has_spots)
{
attributes |= 1 << spots_shift;
}
if (has_stripes)
{
attributes |= 1 << stripes_shift;
}
Later we want to figure what the attributes are:
switch ((attributes >> tail_shift) & ((1 << tail_bits) - 1))
{
case tail_none:
cout << "no tail";
break;
case tail_short:
cout << "short tail";
break;
case tail_medium:
cout << "medium tail";
break;
case tail_long:
cout << "long tail";
break;
}
if (attributes & (1 << stripes_shift))
{
cout << "has stripes";
}
if (attributes & (1 << spots_shift))
{
cout << "has spots";
}
Now, we have stored all this in one integer, and then "fished it out" again.
You can of course do something like this too:
enum bitfields
{
has_widget1 = 1,
has_widget2 = 2,
has_widget3 = 4,
has_widget4 = 8,
has_widget5 = 16,
...
has_widget25 = 16777216,
...
};
int widgets = has_widget1 | has_widget5;
...
if (widgets & has_widget1)
{
...
}
It is really just an easy way to pack several things into one variable.
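The animal example above can be collected into a small self-contained sketch. The enums are copied from the answer; the helper names packAttributes, tailOf, hasSpots and hasStripes are hypothetical and only serve to show the shift-and-mask steps in one place.

```cpp
// Field layout from the answer: tail in bits 0..1, spots in bit 2, stripes in bit 3.
enum tail { tail_none = 0, tail_short = 1, tail_medium = 2, tail_long = 3 };
enum bits_shifts { tail_shift = 0, spots_shift = 2, stripes_shift = 3 };
enum bits_counts { tail_bits = 2, spots_bits = 1, stripes_bits = 1 };

// Pack the three properties into one integer (hypothetical helper).
constexpr unsigned packAttributes(tail tail_length, bool has_spots, bool has_stripes) {
    return (static_cast<unsigned>(tail_length) << tail_shift)
         | (static_cast<unsigned>(has_spots) << spots_shift)
         | (static_cast<unsigned>(has_stripes) << stripes_shift);
}

// Extract the tail field: shift it down to bit 0, then keep only tail_bits bits.
constexpr tail tailOf(unsigned attributes) {
    return static_cast<tail>((attributes >> tail_shift) & ((1u << tail_bits) - 1u));
}

// Single-bit fields: shift the bit down and test it.
constexpr bool hasSpots(unsigned attributes)   { return ((attributes >> spots_shift) & 1u) != 0; }
constexpr bool hasStripes(unsigned attributes) { return ((attributes >> stripes_shift) & 1u) != 0; }

// A long-tailed, spotted animal packs to binary 0111.
static_assert(packAttributes(tail_long, true, false) == 7u, "tail=3, spots=1, stripes=0");
```

The mask (1u << tail_bits) - 1u evaluates to binary 11, which is what keeps the tail field from picking up the spots and stripes bits.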

The bits of an integer value can be used as bools.
http://msdn.microsoft.com/en-us/library/yszfawxh(v=vs.80).aspx
Combine flags with | and retrieve them with &.
enum { ENHANCED_AUDIO = 1, BIG_SPEAKERS = 2, LONG_ANTENNA = 4 };
processExtraFeatures(ENHANCED_AUDIO | BIG_SPEAKERS); // equivalent to processExtraFeatures(3)
void processExtraFeatures(int flags) {
bool enhancedAudio = flags & ENHANCED_AUDIO; // true
bool bigSpeakers = flags & BIG_SPEAKERS; // true
bool longAntenna = flags & LONG_ANTENNA; // false
}

A "flag" is a notional object that can be set or not set; it is not a part of the C++ language.
A bitfield is a language construct for using sets of bits that may not make up an addressable object. Fields of a single bit are one way, and often a very good one, of implementing a flag.
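For comparison with the manual shifting and masking shown earlier, here is a minimal sketch of the language-level bit-field syntax. The struct AnimalAttributes and the helper makeAnimal are hypothetical names; how the fields are laid out in memory is implementation-defined, so this form is better suited to in-memory state than to file or wire formats.

```cpp
// Each member declares how many bits it occupies.
struct AnimalAttributes {
    unsigned tail_length : 2; // two bits: values 0..3
    unsigned has_spots   : 1; // one-bit field used as a flag
    unsigned has_stripes : 1; // one-bit field used as a flag
};

// Build a value; the compiler generates the shifts and masks (C++14 constexpr).
constexpr AnimalAttributes makeAnimal(unsigned tail_length, bool spots, bool stripes) {
    AnimalAttributes a{};
    a.tail_length = tail_length & 3u; // keep only the two bits that fit
    a.has_spots = spots;
    a.has_stripes = stripes;
    return a;
}

static_assert(makeAnimal(3, false, true).has_stripes == 1, "stripes flag is set");
```

Note that the address of a bit-field member cannot be taken, which is the "may not make up an addressable object" part of the definition above.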


Compact non-repetitive way for an ALL flag in enums that represent bitflags in C++

I often use enums for bitflags like the following
enum EventType {
NODE_ADDED = 1 << 0,
NODE_DELETED = 1 << 1,
LINK_ADDED = 1 << 2,
LINK_DELETED = 1 << 3,
IN_PIN_ADDED = 1 << 4,
IN_PIN_DELETED = 1 << 5,
IN_PIN_CHANGE = 1 << 6,
OUT_PIN_ADDED = 1 << 7,
OUT_PIN_DELETED = 1 << 8,
OUT_PIN_CHANGE = 1 << 9,
ALL = NODE_ADDED | NODE_DELETED | ...,
};
Is there a clean, less repetitive way to define an ALL flag that combines all other flags in an enum? For small enums the above works well, but let's say there are 30 flags in an enum; it gets tedious to do it this way. Does something like this work (in general)?
ALL = -1
?
Use something that'll always cover every other option, like:
ALL = 0xFFFFFFFF
Or as Swordfish commented, you can flip the bits of an unsigned integer literal:
ALL = ~0u
To answer your comment, you can explicitly tell the compiler what type you want your enum to have:
enum EventType : unsigned int
The root problem here is how many one-bits you need. That depends on the number of enumerators defined previously. Trying to define ALL inside the enum makes that a case of circular logic.
Instead, you have to define it outside the enum:
const auto ALL = (EventType) ~EventType{};
EventType{} has sufficient zeroes, and ~ turns it into an integral type with enough ones, so you need another cast back to EventType.
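A minimal sketch of the first two suggestions combined, assuming C++11 so the underlying type can be fixed. The enumerator list is shortened from the question, and contains is a hypothetical helper added only to show that ALL matches every flag.

```cpp
// Fixing the underlying type makes ~0u a representable enumerator value.
enum EventType : unsigned int {
    NODE_ADDED   = 1u << 0,
    NODE_DELETED = 1u << 1,
    LINK_ADDED   = 1u << 2,
    LINK_DELETED = 1u << 3,
    ALL          = ~0u // every bit set, so it also covers flags added later
};

// True if `flag` shares at least one bit with `mask` (hypothetical helper).
constexpr bool contains(unsigned int mask, EventType flag) {
    return (mask & flag) != 0;
}

static_assert(contains(ALL, LINK_DELETED), "ALL matches every flag");
```

One trade-off of this approach: ALL also sets bits that do not correspond to any enumerator, so it is not equal to the or of the defined flags. That matters only if code compares masks for equality instead of testing individual bits.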

Fast inner product of ternary vectors

Consider two vectors, A and B, of size n, 7 <= n <= 23. Both A and B consist of -1s, 0s and 1s only.
I need a fast algorithm which computes the inner product of A and B.
So far I've thought of storing the signs and values in separate uint32_ts using the following encoding:
sign 0, value 0 → 0
sign 0, value 1 → 1
sign 1, value 1 → -1.
The C++ implementation I've thought of looks like the following:
struct ternary_vector {
uint32_t sign, value;
};
int inner_product(const ternary_vector & a, const ternary_vector & b) {
uint32_t psign = a.sign ^ b.sign;
uint32_t pvalue = a.value & b.value;
psign &= pvalue;
pvalue ^= psign;
return __builtin_popcount(pvalue) - __builtin_popcount(psign);
}
This works reasonably well, but I'm not sure whether it is possible to do it better. Any comment on the matter is highly appreciated.
I like having the two uint32_ts, but I think your actual calculation is a bit wasteful.
Just a few minor points:
I'm not sure about the reference (getting a and b by const &) - this adds a level of indirection compared to putting them on the stack. When the code is this small (a couple of clocks maybe) this is significant. Try passing by value and see what you get
__builtin_popcount can be, unfortunately, very inefficient. I've used it myself, but found that even a very basic implementation I wrote was far faster than this. However - this is dependent on the platform.
Basically, if the platform has a hardware popcount implementation, __builtin_popcount uses it. If not - it uses a very inefficient replacement.
The one serious problem here is the reuse of the psign and pvalue variables for the positive and negative vectors. You are doing neither your compiler nor yourself any favors by obfuscating your code in this way.
Would it be possible for you to encode your ternary state in a std::bitset<2> and define the product in terms of and? For example, if your ternary types are:
1 = P = (1, 1)
0 = Z = (0, 0)
-1 = M = (1, 0) or (0, 1)
I believe you could define their product as:
1 * 1 = 1 => P * P = P => (1, 1) & (1, 1) = (1, 1) = P
1 * 0 = 0 => P * Z = Z => (1, 1) & (0, 0) = (0, 0) = Z
1 * -1 = -1 => P * M = M => (1, 1) & (1, 0) = (1, 0) = M
Then the inner product could start by taking the and of the bits of the elements and... I am working on how to add them together.
Edit:
My foolish suggestion did not consider that (-1)(-1) = 1, which cannot be handled by the representation I proposed. Thanks to #user92382 for bringing this up.
Depending on your architecture, you may want to optimize away the temporary bit vectors -- e.g. if your code is going to be compiled to FPGA, or laid out to an ASIC, then a sequence of logical operations will be better in terms of speed/energy/area than storing and reading/writing to two big buffers.
In this case, you can do:
int inner_product(const ternary_vector & a, const ternary_vector & b) {
return __builtin_popcount( a.value & b.value & ~(a.sign ^ b.sign))
- __builtin_popcount( a.value & b.value & (a.sign ^ b.sign));
}
This will lay out very well -- the (a.value & b.value & ... ) can enable/disable an XOR gate, whose output splits into two signed accumulators, with the first pathway NOTed before accumulation.
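To make the sign/value encoding concrete, here is a hedged sketch that encodes two small vectors and evaluates the popcount-based inner product. The encode helper and popcount32 are hypothetical additions: popcount32 is a plain bit-counting loop used as a portable stand-in for __builtin_popcount, not a claim about which is faster. Element i of a vector is assumed to live in bit i.

```cpp
#include <cstdint>

struct ternary_vector {
    std::uint32_t sign, value;
};

// Portable bit count: each iteration clears the lowest set bit.
constexpr int popcount32(std::uint32_t x) {
    int n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

// Encode up to 32 entries from {-1, 0, 1}; element i goes to bit i.
constexpr ternary_vector encode(const int* v, int n) {
    ternary_vector t{0, 0};
    for (int i = 0; i < n; ++i) {
        if (v[i] != 0) t.value |= std::uint32_t{1} << i; // non-zero entry
        if (v[i] < 0)  t.sign  |= std::uint32_t{1} << i; // negative entry
    }
    return t;
}

constexpr int inner_product(ternary_vector a, ternary_vector b) {
    const std::uint32_t nonzero  = a.value & b.value;           // both entries non-zero
    const std::uint32_t negative = nonzero & (a.sign ^ b.sign); // signs differ: product is -1
    return popcount32(nonzero & ~negative) - popcount32(negative);
}
```

For A = {1, -1, 0, 1, -1, 0, 1} and B = {1, -1, -1, 1, 1, 1, 0}, the element-wise products are 1, 1, 0, 1, -1, 0, 0, so the result should be 2.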

What does this c++ statement check? (Box2d)

if ((catA & maskB) != 0 && (catB & maskA) != 0)
It is in Box2d's manual: 6.2, and is used to check if two objects should collide (after filtering)
It checks that catA has at least one common '1' bit with maskB, and catB has at least one common '1' bit with maskA.
For example, if catA is 3 (binary 00000011) and maskB is 170 (binary 10101010), then (catA & maskB) != 0 is true because catA & maskB is 00000010.
This is called masking, which means only keeping bits of interest.
You frequently have this kind of construct:
#define READ 1
#define WRITE 2
#define READWRITE (READ|WRITE)
#define DIRECTORY 4
int i=getFileInfo("myfile");
if(i & READWRITE)puts("you can read or write in myfile");
if(i & DIRECTORY)puts("myfile is a directory");
BTW, "i & DIRECTORY" means the same as "(i & DIRECTORY) != 0"
catA is a bit field of collision categories for object A
maskA is a bit field of categories that object A can collide with.
For example:
catA = 100010000010010 // Object A is in 4 collision categories
maskA = 001010000000000 // Object A can collide with two different categories
catB = 000010001000001 // Object B is in 3 collision categories
maskB = 100000000000000 // Object B can only collide with the category represented by the highest bit
catA & maskB means the bits that are 1 in both catA and maskB, so 100000000000000. It's not 0, because Object B can collide with objects in the highest bit, and object A has that bit set.
catB & maskA means the bits that are 1 in both catB and maskA, so 000010000000000. It's also not zero since Object A can collide with objects in the 5th highest bit category, and Object B is in that category.
So the two objects can collide.
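The worked example above can be checked with a short sketch. shouldCollide is a hypothetical helper that mirrors the condition from the question; the example constants are the 15-bit values from this answer, written as binary literals (C++14), and the 16-bit type is an assumption made for the illustration.

```cpp
#include <cstdint>

// Two objects collide only if each one's category matches the other's mask.
constexpr bool shouldCollide(std::uint16_t catA, std::uint16_t maskA,
                             std::uint16_t catB, std::uint16_t maskB) {
    return (catA & maskB) != 0 && (catB & maskA) != 0;
}

// Example values from the answer above.
constexpr std::uint16_t exampleCatA  = 0b100010000010010; // A is in 4 categories
constexpr std::uint16_t exampleMaskA = 0b001010000000000; // A collides with 2 categories
constexpr std::uint16_t exampleCatB  = 0b000010001000001; // B is in 3 categories
constexpr std::uint16_t exampleMaskB = 0b100000000000000; // B collides with 1 category

static_assert(shouldCollide(exampleCatA, exampleMaskA, exampleCatB, exampleMaskB),
              "both directions overlap, so the objects collide");
```

Both halves of the condition must hold: if either object's mask excludes the other's categories, the pair is filtered out.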
The symbol & is the bitwise AND operator. Therefore, the statement
(catA & maskB) != 0
checks whether any bits are set in both of those values. The full condition checks A's categories against B's mask first, then B's categories against A's mask.
I know many C/C++ programmers love terseness but I find this type of code can be made much more readable by moving this test into a function (you can inline it if you want). They could also have gotten rid of the comment by moving the code to a well named method.
if (FixturesCanCollide() )
{
}
You can actually throw away the comparison to != 0 (although you might prefer it for clarity); either way it probably compiles down to the same code.
inline bool FixturesCanCollide()
{
return (catA & maskB) && (catB & maskA);
}

evaluate whether a number is integer power of 4

The following function is claimed to evaluate whether a number is integer power of 4. I do not quite understand how it works?
bool fn(unsigned int x)
{
if ( x == 0 ) return false;
if ( x & (x - 1) ) return false;
return x & 0x55555555;
}
The first condition rules out 0, which is obviously not a power of 4 but would incorrectly pass the following two tests. (EDIT: No, it wouldn't, as pointed out. The first test is redundant.)
The next one is a nice trick: the condition x & (x - 1) is zero if and only if the number is a power of 2 (given that it is non-zero), so the function returns false for anything else. A power of two is characterized by having only one bit set. A number with one bit set minus one results in a number with all bits below that bit being set (e.g. binary 1000 minus one is 0111). AND those two numbers, and you get 0. In any other case (i.e. not a power of 2), there will be at least one bit that overlaps.
So at this point, we know it's a power of 2.
x & 0x55555555 returns non-zero (= true) if any even bit is set (bit 0, bit 2, bit 4, bit 6, etc.). Since only one bit is set at this point, that means it's a power of 4 (i.e. 2 doesn't pass, but 4 passes, 8 doesn't pass, 16 passes, etc.).
Every power of 4 must be in the form of 1 followed by an even number of zeros (binary representation): 100...00:
100 = 4
10000 = 16
1000000 = 64
The 1st test ("if") is obvious.
When subtracting 1 from a number of the form XY100...00 you get XY011...11. So, the 2nd test checks whether there is more than one "1" bit in the number (XY in this example).
The last test checks whether this single "1" is in the correct position, i.e. bit #0, 2, 4, 6, etc. If it is not, the masking (&) will return zero.
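The two tests can be separated into named steps, which may make the reasoning easier to follow. This is a sketch assuming a 32-bit unsigned input; isPowerOfTwo and isPowerOfFour are illustrative names, not the names used in the question.

```cpp
#include <cstdint>

// Exactly one bit set: non-zero, and clearing the lowest set bit leaves nothing.
constexpr bool isPowerOfTwo(std::uint32_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

// The single set bit must sit at an even position (0, 2, 4, ...);
// 0x55555555 is the mask with exactly those positions set.
constexpr bool isPowerOfFour(std::uint32_t x) {
    return isPowerOfTwo(x) && (x & 0x55555555u) != 0;
}

static_assert(isPowerOfFour(16), "16 == 4^2");
static_assert(!isPowerOfFour(8), "8 is a power of 2 but not of 4");
```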
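The solution below (Java) checks whether a can be reached by repeatedly doubling b, so it is only a reliable power test when b is 2. For other bases it also accepts non-powers; for example, isPowerOf(8, 4) returns true.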
public static boolean isPowerOf(int a, int b)
{
while(b!=0 && (a^b)!=0)
{
b = b << 1;
}
return (b!=0)?true:false;
}
isPowerOf(4,2) > true
isPowerOf(8,2) > true
isPowerOf(8,3) > false
isPowerOf(16,4) > true
var isPowerOfFour = function (n) {
let x = Math.log(n) / Math.log(4)
if (Number.isInteger(x)) {
return true;
}
else {
return false
}
};
isPowerOfFour(4) ->true
isPowerOfFour(1) ->true
isPowerOfFour(5) ->false

c++ multiple enums in one function argument using bitwise or "|"

I recently came across some functions where you can pass multiple enums like this:
myFunction(One | Two);
Since I think this is a really elegant way I tried to implement something like that myself:
void myFunction(int _a){
switch(_a){
case One:
cout<<"!!!!"<<endl;
break;
case Two:
cout<<"?????"<<endl;
break;
}
}
Now if I try to call the function with One | Two, I want both switch cases to get called. I am not really good with binary operators so I don't really know what to do. Any ideas would be great!
Thanks!
For that you have to define the enum like this:
enum STATE {
STATE_A = 1,
STATE_B = 2,
STATE_C = 4
};
i.e. each enum element's value should be a power of 2, so that every element owns a single bit and can be tested on its own in an if statement.
Then you can do something like:
void foo( int state) {
if ( state & STATE_A ) {
// do something
}
if ( state & STATE_B ) {
// do something
}
if ( state & STATE_C ) {
// do something
}
}
int main() {
foo( STATE_A | STATE_B | STATE_C);
}
Bitwise operators behave well only with powers of 2:
0010
| 0100
------
0110 // both bits are set
0110
& 0100
------
0100 // nonzero, i.e. true: the flag is set
If you try to do the same with arbitrary numbers, you'll get unexpected results:
0101 // 5
| 1100 // 12
------
1101 // 13
Which contains the possible (arbitrary) numbers as set flags: 0001 (1), 0100 (4), 0101 (5), 1000 (8), 1001 (9), 1100 (12), 1101 (13)
So instead of giving two options, you just gave seven.
Usually arguments that are combined that way are flags (a value with a single bit set) with a decimal value of 1, 2, 4, 8, etc. Assuming that One and Two follow this rule, you cannot use a switch to check for both. Switches only follow one path. Your combined argument does not equal One or Two, but a combination of them (1 | 2 == 3). You can check to see if One or Two is set like this:
if (_a & One)
{
}
if (_a & Two)
{
}
Remember that a standard enum without explicit values will just count upwards, not use the next bit. If your next defined value is Three, it will likely equal 3 which is not a value of a single bit, and will then act as if you had passed both flags (One | Two) to the function. You'll need to set the values of the enum yourself.
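Putting the two points above together (explicit single-bit values, and independent if tests instead of a switch), a minimal sketch could look like this. The enum name Option and the function handledBranches are hypothetical; the function just counts the branches that fire so the effect is easy to check, where real code would do its work inside each branch.

```cpp
// Each enumerator is given exactly one bit explicitly.
enum Option {
    One   = 1 << 0, // 1
    Two   = 1 << 1, // 2
    Three = 1 << 2  // 4, not 3: the default counting would collide with One | Two
};

// Every `if` is evaluated on its own, so One | Two triggers two branches,
// which a single switch on the combined value cannot do.
constexpr int handledBranches(int options) {
    int handled = 0;
    if (options & One)   ++handled; // handle One here
    if (options & Two)   ++handled; // handle Two here
    if (options & Three) ++handled; // handle Three here
    return handled;
}

static_assert(handledBranches(One | Two) == 2, "both branches run");
```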
You must split the possible "tokens" (non-overlapping of course... use power of 2):
if (_a & One) { ... }
It is not really elegant to do what you want with one switch statement: split the checks using if statements instead.
You are better off doing it with a set of if statements ...
ie
if ( _a & ONE )
{
// Do stuff.
}
if ( _a & TWO )
{
// Do other stuff.
}
edit: You could also do it in a switch statement but it would be a nightmare. You'd need something like this:
switch( _a )
{
case ONE:
// Handle ONE.
break;
case TWO:
// Handle TWO.
break;
case ONE | TWO:
// Handle ONE.
// Handle TWO.
break;
};
It's relatively sane for only 2 options, but once you get more than that it starts to balloon out of control. If you have 32 options you'd have a switch statement that would be unlikely to fit on any machine. All in all, the "if" solution is much cleaner and far more sane :)