Using C++/WinRT, Bluetooth LE, VS2017, Win10
I have a characteristic in my Bluetooth LE device that is Readable and Writable.
When checking the properties of various characteristics of a service with something like:
if (characteristic.CharacteristicProperties() == GattCharacteristicProperties::Write)
{
std::wcout << "IsWriteable = true; ";
}
The Read/Write characteristic will not get hit with ::Write and will not get hit with ::Read. The docs say that
This enumeration supports a bitwise combination of its member values.
So I tried the AND & operator since this is Read AND Write
if (characteristic.CharacteristicProperties() == (GattCharacteristicProperties::Read & GattCharacteristicProperties::Write))
{
std::wcout << "IsReadWrite = true; ";
}
However, that did not get hit either. The enumerated value of Read is 2 and of Write is 8 and in Debug this characteristic property showed as "Read | Write (10)". So I used the OR | operator in the snippet above and that hit.
My question is, why would the ::Read not hit and the ::Write not hit but the ::Read OR ::Write hit and the ::Read AND ::Write not hit?
Just kinda curious since this doesn't make sense to me.
That's not how bitwise operators work. When you write CharacteristicProperties() == (Read & Write), that doesn't mean "must be equal to Read AND Write". It compares the value against the result of performing a bitwise AND on Read and Write. Those values are 0b0010 and 0b1000 in binary, they have no set bits in common, and so the operation returns 0.
What you meant to implement was a check whether both bits are set. That is done by performing a bitwise OR on the flags and comparing against that result:
if (characteristic.CharacteristicProperties() == (GattCharacteristicProperties::Read | GattCharacteristicProperties::Write))
{
std::wcout << "IsReadWrite = true; ";
}
This verifies whether properties is exactly equal to the combination of Read and Write flags.
It is more common, though, to check whether those flags are set, possibly alongside other flags. Using p, r, and w for the properties, read, and write flags, the following expression does that:
auto const mask = r | w;
if ((p & mask) == mask) { ... }
Another common operation is verifying whether any flags of a set of given flags are set, e.g.
auto const mask = r | w;
if ((p & mask) != 0) { ... }
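To make the three checks concrete, here is a minimal, self-contained sketch. Props, has_all and has_any are hypothetical names invented for illustration (they are not part of C++/WinRT); only the values 2 for Read and 8 for Write come from the question, and Other is an arbitrary extra flag.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real flags enum. Only Read = 2 and
// Write = 8 are taken from the question; Other is made up for the demo.
enum class Props : std::uint32_t { None = 0, Read = 2, Write = 8, Other = 16 };

constexpr Props operator|(Props a, Props b) {
    return static_cast<Props>(static_cast<std::uint32_t>(a) |
                              static_cast<std::uint32_t>(b));
}

constexpr Props operator&(Props a, Props b) {
    return static_cast<Props>(static_cast<std::uint32_t>(a) &
                              static_cast<std::uint32_t>(b));
}

// True when every bit of mask is set in p (other bits may be set as well).
constexpr bool has_all(Props p, Props mask) { return (p & mask) == mask; }

// True when at least one bit of mask is set in p.
constexpr bool has_any(Props p, Props mask) { return (p & mask) != Props::None; }

// Read & Write share no bits, which is why the == (Read & Write) test never hit.
static_assert((Props::Read & Props::Write) == Props::None, "2 & 8 is 0");
```

With these helpers, has_all(p, Props::Read | Props::Write) keeps working when the device reports additional flags, whereas p == (Props::Read | Props::Write) only matches the exact combination.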
This is basic bit checking.
enum { Flag1 = 2, Flag2 = 8 };
To check a single bit: if (x & Flag1) ...
To check if any of two bits are set: if (x & (Flag1 | Flag2)) ...
To check if both bits are set: if ((x & (Flag1 | Flag2)) == (Flag1 | Flag2)) ...
The previous check can be rewritten as:
auto const bits = Flag1 | Flag2;
if ((x & bits) == bits)...
In your case, Flag1 & Flag2 is always 0 (2 & 8 is 0).
x == (Flag1 | Flag2) only works if x does not contain other bits (that are irrelevant flags). 2 | 8 is 10.
I saw a post on CodeProject that helped me see the logic in the bitwise operators. I was thinking in terms of the C++ && and || logic, but the CodeProject post had a table.
The table doesn't copy and paste here, but it showed:
A = 0011, B = 0101
A and B = 0001 (i.e. what is in both A and B)
A or B = 0111 (i.e. what is in either A or B)
I believe this was mentioned in so many words above but this simple table was what my brain needed to see the binary logic.
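The same table can be written down as a few constants, which also shows how the bitwise operators differ from && and ||. This is only an illustrative sketch; the XOR line is not in the CodeProject table and is added here for completeness.

```cpp
#include <cstdint>

// Values from the table (binary literals need C++14).
constexpr std::uint8_t A = 0b0011;
constexpr std::uint8_t B = 0b0101;

constexpr std::uint8_t bit_and = A & B;  // 0b0001: bits set in both A and B
constexpr std::uint8_t bit_or  = A | B;  // 0b0111: bits set in either A or B
constexpr std::uint8_t bit_xor = A ^ B;  // 0b0110: bits set in exactly one of them

// The logical operators only ask whether each operand is non-zero.
constexpr bool logical_and = A && B;     // true, both are non-zero
constexpr bool logical_or  = A || B;     // true
```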
I've spent too many brain cycles on this over the last day.
I'm trying to come up with a set of bitwise operations that may re-implement the following condition:
uint8_t a, b;
uint8_t c, d;
uint8_t e, f;
...
bool result = (a == 0xff || a == b) && (c == 0xff || c == d) && (e == 0xff || e == f);
Code I'm looking at has four of these expressions, short-circuit &&ed together (as above).
I know this is an esoteric question, but the short-circuit nature of this and the timing of the above code in a tight loop makes the lack of predictable time a royal pain, and quite frankly, it seems to really suck on architectures where branch prediction isn't available, or so well implemented.
Is there such a beast that would be concise?
So, if you really want to do bit-twiddling to make this "fast" (which you really should only do after profiling your code to make sure this is a bottleneck), what you want to do is vectorize this by packing all the values together into a wider word so you can do all the comparisons at once (one instruction), and then extract the answer from a few bits.
There are a few tricks to this. To compare two values for equality, you can xor (^) them and test whether the result is zero. To test whether a field of a wider word is zero, you can 'pack' it with a 1 bit above it, then subtract one and see if the extra bit you added is still 1. If it is now 0, the value of the field was zero.
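Both tricks can be tried on a single byte before scaling up to a packed word. The function name below is made up for this sketch.

```cpp
#include <cstdint>

// Returns true when a == b without comparing them directly.
// (a ^ b) is zero exactly when the two bytes are equal; bit 8 acts as the guard bit.
constexpr bool equal_via_guard_bit(std::uint8_t a, std::uint8_t b) {
    std::uint32_t field = static_cast<std::uint32_t>(a ^ b) | 0x100u;  // guard bit above the 8-bit field
    field -= 1u;                   // borrows from the guard bit only if the field was 0
    return (field & 0x100u) == 0;  // guard bit cleared => field was zero => a == b
}
```

Because the guard bit absorbs the borrow, the subtraction never reaches past the 9-bit field, which is what lets several such fields sit side by side in one word.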
Putting all this together, you want to do 6 8-bit compares at once. You can pack these values into 9-bit fields in a 64-bit word (9 bits to get that extra guard bit you're going to test after the subtraction). You can fit up to 7 such 9-bit fields in a 64-bit int, so no problem:
// pack 6 9-bit values into a word
#define VEC6x9(A,B,C,D,E,F) (((uint64_t)(A) << 45) | ((uint64_t)(B) << 36) | ((uint64_t)(C) << 27) | ((uint64_t)(D) << 18) | ((uint64_t)(E) << 9) | (uint64_t)(F))
// the two values to compare
uint64_t v1 = VEC6x9(a, a, c, c, e, e);
uint64_t v2 = VEC6x9(b, 0xff, d, 0xff, f, 0xff);
uint64_t guard_bits = VEC6x9(0x100, 0x100, 0x100, 0x100, 0x100, 0x100);
uint64_t ones = VEC6x9(1, 1, 1, 1, 1, 1);
uint64_t alt_guard_bits = VEC6x9(0, 0x100, 0, 0x100, 0, 0x100);
// do the comparisons in parallel
uint64_t res_vec = ((v1 ^ v2) | guard_bits) - ones;
// keep only the guard bits (optional for clarity, not needed for correctness)
res_vec &= guard_bits;
// combine each pair of comparisons in parallel (the 3 ORs become ANDs on the inverted bits)
res_vec &= res_vec >> 9;
// get the result
bool result = (res_vec & alt_guard_bits) == 0;
The ORs and ANDs at the end are 'backwards' because the result bit for each comparison is 0 if the comparison was true (values were equal) and 1 if it was false (values were not equal).
All of the above is mostly of interest if you are writing a compiler (it's how you end up implementing a vector comparison), and it may well be the case that a vectorizing compiler will do it all for you automatically.
This can be much more efficient if you can arrange to have your initial values pre-packed into vectors. This may in turn influence your choice of data structures and allowable values -- if you arrange for your values to be 7-bit or 15-bit (instead of 8-bit) they may pack nicer when you add the guard bits...
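For anyone who wants to experiment, the steps above can be wrapped in a function and checked against the plain condition. This is a sketch under the same layout as VEC6x9; the function names are invented here, and whether it is actually faster than the branchy version has to be measured on the target.

```cpp
#include <cstdint>

// Pack six values into 9-bit fields of a 64-bit word (same layout as VEC6x9).
constexpr std::uint64_t vec6x9(std::uint64_t A, std::uint64_t B, std::uint64_t C,
                               std::uint64_t D, std::uint64_t E, std::uint64_t F) {
    return (A << 45) | (B << 36) | (C << 27) | (D << 18) | (E << 9) | F;
}

// The condition exactly as written in the question.
constexpr bool check_reference(std::uint8_t a, std::uint8_t b, std::uint8_t c,
                               std::uint8_t d, std::uint8_t e, std::uint8_t f) {
    return (a == 0xff || a == b) && (c == 0xff || c == d) && (e == 0xff || e == f);
}

// Packed version: six equality tests done with one XOR and one subtraction.
constexpr bool check_packed(std::uint8_t a, std::uint8_t b, std::uint8_t c,
                            std::uint8_t d, std::uint8_t e, std::uint8_t f) {
    const std::uint64_t v1 = vec6x9(a, a, c, c, e, e);
    const std::uint64_t v2 = vec6x9(b, 0xff, d, 0xff, f, 0xff);
    const std::uint64_t guard_bits = vec6x9(0x100, 0x100, 0x100, 0x100, 0x100, 0x100);
    const std::uint64_t ones = vec6x9(1, 1, 1, 1, 1, 1);
    const std::uint64_t alt_guard_bits = vec6x9(0, 0x100, 0, 0x100, 0, 0x100);
    // After the subtraction a guard bit is 0 if its two bytes were equal, 1 otherwise.
    std::uint64_t res = (((v1 ^ v2) | guard_bits) - ones) & guard_bits;
    // A pair fails only if both of its comparisons failed, so AND neighbouring guard bits.
    res &= res >> 9;
    return (res & alt_guard_bits) == 0;
}
```

Comparing check_packed against check_reference over a range of inputs is a cheap way to confirm the masks before relying on the packed form.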
You could modify how you store and interpret the data:
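When a is 0xFF, do you need the value of b? If not, then make b equal to 0xFF and simplify the expression by removing the part that tests for 0xFF.
Also, you might combine the three left-hand values (a, c and e) in a single variable and the three right-hand values (b, d and f) in another (called abc and def below).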
uint32_t abc;
uint32_t def;
bool result = abc == def;
Other operations might be slower but that loop should be much faster (single comparison instead of up to 6 comparisons).
You might want to use a union to be able to access the bytes individually or as a group. In that case, make sure that the fourth byte is always 0.
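A rough sketch of that idea, assuming the value compared against a 0xFF wildcard really is not needed afterwards. The helper names pack3, normalise and check_prepacked are hypothetical, and shifts are used instead of a union to keep the byte layout explicit.

```cpp
#include <cstdint>

// Pack three bytes into the low 24 bits; the top byte stays 0.
constexpr std::uint32_t pack3(std::uint8_t x, std::uint8_t y, std::uint8_t z) {
    return (std::uint32_t{x} << 16) | (std::uint32_t{y} << 8) | std::uint32_t{z};
}

// Done once, when the data is stored: if the left value is the 0xFF wildcard,
// overwrite the right value with 0xFF so that plain equality matches.
constexpr std::uint8_t normalise(std::uint8_t left, std::uint8_t right) {
    return left == 0xff ? std::uint8_t{0xff} : right;
}

// The test in the hot loop is then a single 32-bit comparison.
constexpr bool check_prepacked(std::uint32_t lhs, std::uint32_t rhs) {
    return lhs == rhs;
}
```

The cost moves from the loop to the place where the data is written, which only pays off if values are compared much more often than they are stored.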
To remove timing variations with && and ||, use & and |, as @molbdnilo suggested. Possibly faster, maybe not. Certainly easier to parallelize.
// bool result = (a == 0xff || a == b) && (c == 0xff || c == d)
// && (e == 0xff || e == f);
bool result = ((a == 0xff) | (a == b)) & ((c == 0xff) | (c == d))
& ((e == 0xff) | (e == f));
1st Semester Student here. I'm confused about the differences between the bitand (&) and the logical and (&&), and the same with the bitor (|), the logical or (||), and the Xor (^). Specifically, I've a bit of code that I thought required the || or && to function, but.. apparently not? Here's the code in question:
cout << "Please enter your biological sex (M or F): " << endl;
cin >> sex;
//Repetition Structure
while (toupper(sex) != 'M' && 'F')
{
cout << "Invalid entry. Please enter your biological sex (M or F): " << endl;
cin >> sex;
} //end while
I tried using || at first, but it gave me the "invalid entry" reply no matter how I answered it. I looked around on the forums, found that && would be better and tried using it. It worked - at first. Then, for no apparent reason, it stopped taking 'F' as an answer (the code wasn't changed).
To correct this, I tried using Xor (^) which failed, and then bitand (&, instead of &&), which apparently made it function correctly once more. But, I've seen warnings that you can still get the right answers using either & or &&, but one does not equal the other and could cause problems later. This is why I need clarification (also, it's just good stuff to know).
UPDATE: Since this was tagged as a duplicate, I'll clarify my intent: I want to know more about & and &&, why you use one and not the other, as well as the same info for ^, | and ||. The tagged thread has examples of using one versus another, but they don't explain the details of the operations themselves. As I'm struggling with understanding the very nature of the operations themselves, those explanations are crucial; examples alone won't clarify my understanding.
This isn't something you use bit operators for. You want to compare the input character with 'M' or 'F', so all you have to do is:
while (toupper(sex) != 'M' && toupper(sex) != 'F') {...
In your code you were missing the second toupper(sex) != part. You need to look at this statement as two requirements that have to be met in order for while to continue.
First one is: toupper(sex) != 'M'
Second one: toupper(sex) != 'F'
The reason why && (logical and) should be between those two is that you want the while loop to run if sex isn't M and at the same time isn't F.
Operators like & and | work on the individual bits of a variable. For example, if you want to use bit flags, you assign each flag to one bit and test the resulting flag combination with &.
EDIT: Bitwise operators process the values you give them bit by bit. For bit flags, you define the flags as powers of 2 (1, 2, 4, 8...), each of which represents one position in binary (00000001, 00000010, 00000100, 00001000 ...). So for example when you have flags:
a = 1;
b = 2;
c = 4;
You can add them to your set of flags with |:
setFlags = setFlags | a | c;
This combines setFlags bit by bit with a and c:
00000000
|00000001 // a
|00000100 // c
=00000101 // result
So the | operator checks all the bits of the same position and if one of them is true, the result will be true just like logical OR.
Now to read them you use & like this:
if (setFlags & a)
which does:
00000101 // setFlags
&00000001 // a
=00000001 // result
this leaves only bits where they both are true (just like logical AND), therefore you can get true if the setFlags contains that flag.
00000101 // setFlags
&00000010 // b
=00000000 // result
in this case there are no bits set to true on same position so the result tells you that b isn't set in setFlags.
while (toupper(sex) != 'M' && 'F') is evaluated as:
while ((toupper(sex) != 'M') && (bool)'F')
which is one of:
while (true && true) // sex is not 'M' or 'm': the loop repeats
while (false && true) // sex is 'M' or 'm': the loop ends
'F' is a non-zero value, so (bool)'F' is always true and the second operand never changes the result. Only 'M' or 'm' is accepted.
It is like if ((bool)5), which is true because any non-zero value converts to true.
To correct your example:
while (toupper(sex) != 'M' && toupper(sex) != 'F')
Logical operators such as logical and (&&) and logical or (||) evaluate their operands as conditions and return a bool value: true or false (1 or 0).
Bitwise operators such as bitwise and (&) and bitwise or (|) work on the individual bits and can return any value:
unsigned char uc1 = 13; // 0x0D 00001101
unsigned char uc2 = 11; // 0x0B 00001011
unsigned char uc3 = (uc1 & uc2);// 00001001
As you can see, each result bit is 1 only where both input bits are 1: 1 & 1 = 1, 0 & 1 = 0, 1 & 0 = 0, 0 & 0 = 0.
The result is 00001001, which is 9 in decimal.
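To see the difference in running code, here is a small sketch that puts the loop condition from the question next to the corrected one. The two function names are made up for the comparison.

```cpp
#include <cctype>

// The condition as written in the question. 'F' is non-zero, so the right-hand
// operand is always true and only the comparison with 'M' matters.
inline bool invalid_as_written(char sex) {
    return std::toupper(static_cast<unsigned char>(sex)) != 'M' && 'F';
}

// The intended condition: invalid when the answer is neither M nor F.
inline bool invalid_as_intended(char sex) {
    const int s = std::toupper(static_cast<unsigned char>(sex));
    return s != 'M' && s != 'F';
}
```

invalid_as_written('F') returns true, which matches the observation that the loop stopped accepting 'F', while invalid_as_intended('F') returns false.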
I come here to ask for tricks. I've got a 32-bit integer (that's 4 bytes). I want to test each byte for zero, and return true if any of them is zero.
E.g.
int c1 = 0x01020304;
cout<<test(c1)<<endl; // output false
int c2 = 0x00010203;
cout<<test(c2)<<endl; // output true
int c3 = 0xfffefc00;
cout<<test(c3)<<endl; // output true
Are there any tricks to do it in the least number of CPU cycles?
There are several ways on the well-known Bit Twiddling Hacks page:
bool hasZeroByte(unsigned int v)
{
return ~((((v & 0x7F7F7F7F) + 0x7F7F7F7F) | v) | 0x7F7F7F7F);
}
or
bool hasZeroByte = ((v + 0x7efefeff) ^ ~v) & 0x81010100;
if (hasZeroByte) // or may just have 0x80 in the high byte
{
hasZeroByte = ~((((v & 0x7F7F7F7F) + 0x7F7F7F7F) | v) | 0x7F7F7F7F);
}
And the likely most compact way when compiling to assembly
#define haszero(v) (((v) - 0x01010101UL) & ~(v) & 0x80808080UL)
As they're tricks, they're hard to understand, so if you want clarity, mask out each byte and check it as in dasblinkenlight's answer.
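For readers who want to see why the haszero expression works, here it is as a function with each step commented. The function name is chosen for this sketch; the expression itself is the one from the macro.

```cpp
#include <cstdint>

// Same expression as the haszero macro, written as a function.
constexpr bool has_zero_byte(std::uint32_t v) {
    // v - 0x01010101 : subtracting 1 from a zero byte borrows, which sets its top bit
    // & ~v           : discard bytes whose top bit was already set in v
    // & 0x80808080   : keep only the top bit of each byte
    return ((v - 0x01010101u) & ~v & 0x80808080u) != 0;
}

// The three inputs from the question.
static_assert(!has_zero_byte(0x01020304u), "no zero byte");
static_assert(has_zero_byte(0x00010203u), "zero in the high byte");
static_assert(has_zero_byte(0xfffefc00u), "zero in the low byte");
```

Taking std::uint32_t rather than int avoids any question about the sign bit when the top byte is 0x80 or above.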
You can test it by masking each of the bytes in an & operation, and comparing the result to zero:
bool hasZeroByte(int32_t n) {
return !(n & 0x000000FF)
|| !(n & 0x0000FF00)
|| !(n & 0x00FF0000)
|| !(n & 0xFF000000);
}
The fastest way to do this is probably to use strnlen, since most C libraries implement it with optimized low-level instructions for finding zero bytes in strings. Note that strnlen is a POSIX function rather than part of standard C++.
bool hasZeroByte(int32_t n) {
return strnlen(reinterpret_cast<char *>(&n), 4) < 4;
}
If you want to be a little more explicit, you could use the memchr function which is documented to do exactly what you are asking:
bool hasZeroByte(int32_t n) {
return memchr(reinterpret_cast<void *>(&n), 0, 4) != nullptr;
}
For those who don't believe this answer, feel free to take a look at the glibc implementation of strlen and see that it is already doing all of the mentioned bit twiddling tricks in the other answers.
See also:
http://www.strchr.com/optimized_strlen_function
http://www.strchr.com/strcmp_and_strlen_using_sse_4.2
http://www.int80h.org/strlen/
I have written the code below. It checks the most significant bit of every byte. As long as that bit is 0, it shifts the accumulated value left by 7 bits and ORs in the current byte, storing the result in var1. Here pos points to the bytes of an encoded integer. An integer in my implementation is a uint64_t and can occupy up to 8 bytes.
uint64_t func(const unsigned char* pos)
{
uint64_t var1 = 0; int i=0;
while ((pos[i] >> 7) == 0)
{
var1 = (var1 << 7) | (pos[i]);
i++;
}
return var1;
}
I am calling func() repeatedly, a trillion times for trillions of integers, so it runs slowly. Is there a way to optimize this code?
EDIT: Thanks to Joe Z.; it is indeed a form of uleb128 unpacking.
I have only tested this minimally; I am happy to fix glitches with it. With modern processors, you want to bias your code heavily toward easily predicted branches. And, if you can safely read the next 10 bytes of input, there's nothing to be saved by guarding their reads by conditional branches. That leads me to the following code:
// fast uleb128 decode
// assumes you can read all 10 bytes at *data safely.
// assumes standard uleb128 format, with LSB first, and
// ... bit 7 indicating "more data in next byte"
uint64_t unpack( const uint8_t *const data )
{
uint64_t value = ((data[0] & 0x7F ) << 0)
| ((data[1] & 0x7F ) << 7)
| ((data[2] & 0x7F ) << 14)
| ((data[3] & 0x7F ) << 21)
| ((data[4] & 0x7Full) << 28)
| ((data[5] & 0x7Full) << 35)
| ((data[6] & 0x7Full) << 42)
| ((data[7] & 0x7Full) << 49)
| ((data[8] & 0x7Full) << 56)
| ((data[9] & 0x7Full) << 63);
if ((data[0] & 0x80) == 0) value &= 0x000000000000007Full; else
if ((data[1] & 0x80) == 0) value &= 0x0000000000003FFFull; else
if ((data[2] & 0x80) == 0) value &= 0x00000000001FFFFFull; else
if ((data[3] & 0x80) == 0) value &= 0x000000000FFFFFFFull; else
if ((data[4] & 0x80) == 0) value &= 0x00000007FFFFFFFFull; else
if ((data[5] & 0x80) == 0) value &= 0x000003FFFFFFFFFFull; else
if ((data[6] & 0x80) == 0) value &= 0x0001FFFFFFFFFFFFull; else
if ((data[7] & 0x80) == 0) value &= 0x00FFFFFFFFFFFFFFull; else
if ((data[8] & 0x80) == 0) value &= 0x7FFFFFFFFFFFFFFFull;
return value;
}
The basic idea is that small values are common (and so most of the if-statements won't be reached), but assembling the 64-bit value that needs to be masked is something that can be efficiently pipelined. With a good branch predictor, I think the above code should work pretty well. You might also try removing the else keywords (without changing anything else) to see if that makes a difference. Branch predictors are subtle beasts, and the exact character of your data also matters. If nothing else, you should be able to see that the else keywords are optional from a logic standpoint, and are there only to guide the compiler's code generation and provide an avenue for optimizing the hardware's branch predictor behavior.
Ultimately, whether or not this approach is effective depends on the distribution of your dataset. If you try out this function, I would be interested to know how it turns out. This particular function focuses on standard uleb128, where the value gets sent LSB first, and bit 7 == 1 means that the data continues.
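If you want something to compare a fast decoder against, a plain bounds-checked encoder and decoder for standard uleb128 (LSB first, bit 7 set means more data follows) is handy. This is a reference sketch with made-up function names, written for clarity rather than speed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encode value as uleb128: 7 data bits per byte, least significant group first,
// bit 7 set on every byte except the last.
inline std::vector<std::uint8_t> uleb128_encode(std::uint64_t value) {
    std::vector<std::uint8_t> out;
    do {
        std::uint8_t byte = static_cast<std::uint8_t>(value & 0x7F);
        value >>= 7;
        if (value != 0) byte |= 0x80;  // more bytes follow
        out.push_back(byte);
    } while (value != 0);
    return out;
}

// Decode at most 10 bytes. Returns false if the input ends (or exceeds 10 bytes)
// before a byte with bit 7 clear is seen. Excess high bits in a 10th byte are dropped.
inline bool uleb128_decode(const std::uint8_t* data, std::size_t size, std::uint64_t& value) {
    std::uint64_t result = 0;
    for (std::size_t i = 0; i < size && i < 10; ++i) {
        result |= static_cast<std::uint64_t>(data[i] & 0x7F) << (7 * i);
        if ((data[i] & 0x80) == 0) {
            value = result;
            return true;
        }
    }
    return false;
}
```

Round-tripping random values through uleb128_encode and then through both this decoder and a faster one is a simple way to catch mask or shift mistakes.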
There are SIMD approaches, but none of them lend themselves readily to 7-bit data.
Also, if you can mark this inline in a header, then that may also help. It all depends on how many places this gets called from, and whether those places are in a different source file. In general, though, inlining when possible is highly recommended.
Your code is problematic:
uint64_t func(const unsigned char* pos)
{
uint64_t var1 = 0; int i=0;
while ((pos[i] >> 7) == 0)
{
var1 = (var1 << 7) | (pos[i]);
i++;
}
return var1;
}
First a minor thing: i should be unsigned.
Second: You don't ensure that you don't read beyond the boundary of pos. E.g. if every value in your pos array has its most significant bit clear (all zeros, for instance), then you will reach pos[size], where size is the size of the array, and invoke undefined behaviour. You should pass the size of your array to the function and check that i is smaller than this size.
Third: If pos[i] has its most significant bit equal to zero for i=0,..,k with k>10, then previous work gets discarded (as you push the old value out of var1).
The third point actually helps us:
uint64_t func(const unsigned char* pos, size_t size)
{
size_t i(0);
while ( i < size && (pos[i] >> 7) == 0 )
{
++i;
}
// At this point, i is either equal to size or
// i is the index of the first pos value you don't want to use.
// Therefore we want to use the values
// pos[i-10], pos[i-9], ..., pos[i-1]
// if i is less than 10, we obviously need to ignore some of the values
const size_t start = (i >= 10) ? (i - 10) : 0;
uint64_t var1 = 0;
for ( size_t j(start); j < i; ++j )
{
var1 <<= 7;
var1 += pos[j];
}
return var1;
}
In conclusion: We separated the logic and got rid of all discarded entries. The speed-up depends on the actual data you have. If lots of entries are discarded, then you save a lot of writes to var1 with this approach.
Another thing: Mostly, if one function is called massively, the best optimization you can do is to call it less. Perhaps you can come up with an additional condition that makes the call of this function unnecessary.
Keep in mind that if you actually use 10 values, the first value ends up being truncated.
64 bits means that 9 values are represented with their full 7 bits of information (63 bits), leaving exactly one bit for the tenth. You might want to switch to a 128-bit type (for example unsigned __int128 on GCC and Clang).
A small optimization would be:
while ((pos[i] & 0x80) == 0)
A bitwise and can be cheaper than a shift on some platforms, although many CPUs execute both in a single cycle, and it's also possible that the compiler will do this optimization itself.
Can you change the encoding?
Google came across the same problem, and Jeff Dean describes a really cool solution on slide 55 of his presentation:
http://research.google.com/people/jeff/WSDM09-keynote.pdf
http://videolectures.net/wsdm09_dean_cblirs/
The basic idea is that reading the first bit of several bytes is poorly supported on modern architectures. Instead, let's take 8 of these bits, and pack them as a single byte preceding the data. We then use the prefix byte to index into a 256-item lookup table, which holds masks describing how to extract numbers from the rest of the data.
I believe it's how protocol buffers are currently encoded.
Can you change your encoding? As you've discovered, using a bit on each byte to indicate if there's another byte following really sucks for processing efficiency.
A better way to do it is to model UTF-8, which encodes the length of the full int into the first byte:
0xxxxxxx // one byte with 7 bits of data
110xxxxx 10xxxxxx // two bytes with 11 bits of data
1110xxxx 10xxxxxx 10xxxxxx // three bytes with 16 bits of data
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx // four bytes with 21 bits of data
// etc.
But UTF-8 spends bits on a 10 marker in every trailing byte so that it stays self-synchronizing and compatible with ASCII. This bloats the data and you don't need those properties, so you'd modify it to look like this:
0xxxxxxx // one byte with 7 bits of data
10xxxxxx xxxxxxxx // two bytes with 14 bits of data.
110xxxxx xxxxxxxx xxxxxxxx // three bytes with 21 bits of data
1110xxxx xxxxxxxx xxxxxxxx xxxxxxxx // four bytes with 28 bits of data
// etc.
This has the same compression level as your method (up to 64 bits = 9 bytes), but is significantly easier for a CPU to process.
From this you can build a lookup table for the first byte which gives you a mask and length:
// byte_counts[255] contains the number of additional
// bytes if the first byte has a value of 255.
extern uint8_t const byte_counts[256]; // a global constant, defined elsewhere.
// byte_masks[255] contains a mask for the useful bits in
// the first byte, if the first byte has a value of 255.
extern uint8_t const byte_masks[256]; // a global constant, defined elsewhere.
And then to decode:
// the resulting value.
uint64_t v = 0;
// mask off the data bits in the first byte.
v = *data & byte_masks[*data];
// read in the rest.
switch(byte_counts[*data])
{
case 3: v = v << 8 | *++data;
case 2: v = v << 8 | *++data;
case 1: v = v << 8 | *++data;
case 0: return v;
default:
// If you're on VC++, this'll make it take one less branch.
// Better make sure you've got all the valid inputs covered, though!
__assume(0);
}
No matter the size of the integer, this hits only one branch point: the switch, which will likely be put into a jump table. You can potentially optimize it even further for ILP by not letting each case fall through.
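One possible way to fill the two tables and decode with them is sketched below. It assumes the prefix scheme shown above (the number of leading 1 bits in the first byte is the number of additional bytes) and uses a plain loop instead of the fall-through switch to stay short. PrefixTables, make_prefix_tables and decode_prefixed are names invented for this example.

```cpp
#include <array>
#include <cstdint>

// Number of leading 1 bits in a byte (0 to 8).
constexpr int leading_ones(unsigned byte) {
    int n = 0;
    while (n < 8 && (byte & (0x80u >> n))) ++n;
    return n;
}

struct PrefixTables {
    std::array<std::uint8_t, 256> counts{};  // additional bytes after the first
    std::array<std::uint8_t, 256> masks{};   // data bits held in the first byte
};

inline PrefixTables make_prefix_tables() {
    PrefixTables t;
    for (unsigned b = 0; b < 256; ++b) {
        const int n = leading_ones(b);
        t.counts[b] = static_cast<std::uint8_t>(n);
        // n prefix ones plus the terminating 0 carry no data; for n = 7 or 8 nothing is left.
        t.masks[b] = static_cast<std::uint8_t>(0xFFu >> (n + 1));
    }
    return t;
}

// Decode one value; the caller must guarantee that all encoded bytes are readable.
inline std::uint64_t decode_prefixed(const std::uint8_t* data) {
    static const PrefixTables tables = make_prefix_tables();
    std::uint64_t v = data[0] & tables.masks[data[0]];
    for (int i = 1; i <= tables.counts[data[0]]; ++i) {
        v = (v << 8) | data[i];
    }
    return v;
}
```

The loop trades the jump table for a short counted loop; which one wins depends on the length distribution of the data, so it is worth measuring both.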
First, rather than shifting, you can do a bitwise test on the
relevant bit. Second, you can use a pointer, rather than
indexing (but the compiler should do this optimization itself).
Thus:
uint64_t
readUnsignedVarLength( unsigned char const* pos )
{
uint64_t results = 0;
while ( (*pos & 0x80) == 0 ) {
results = (results << 7) | *pos;
++ pos;
}
return results;
}
At least, this corresponds to what your code does. For variable
length encoding of unsigned integers, it is incorrect, since
1) variable length encodings are little endian, and your code is
big endian, and 2) your code doesn't OR in the high order byte.
Finally, the Wiki page suggests that you've got the test
inverted. (I know this format mainly from BER encoding and
Google protocol buffers, both of which set bit 7 to indicate
that another byte will follow.)
The routine I use is:
uint64_t
readUnsignedVarLen( unsigned char const* source )
{
int shift = 0;
uint64_t results = 0;
uint8_t tmp = *source ++;
while ( ( tmp & 0x80 ) != 0 ) {
results |= static_cast<uint64_t>( tmp & 0x7F ) << shift;
shift += 7;
tmp = *source ++;
}
return results | (static_cast<uint64_t>( tmp ) << shift);
}
For the rest, this wasn't written with performance in mind, but
I doubt that you could do significantly better. An alternative
solution would be to pick up all of the bytes first, then
process them in reverse order:
uint64_t
readUnsignedVarLen( unsigned char const* source )
{
unsigned char buffer[10];
unsigned char* p = std::begin( buffer );
while ( p != std::end( buffer ) && (*source & 0x80) != 0 ) {
*p = *source & 0x7F;
++ p;
++ source;
}
assert( p != std::end( buffer ) );
*p = *source;
++ p;
uint64_t results = 0;
while ( p != std::begin( buffer ) ) {
-- p;
results = (results << 7) + *p;
}
return results;
}
The necessity of checking for buffer overrun will likely make
this slightly slower, but on some architectures, shifting by
a constant is significantly faster than shifting by a variable,
so this could be faster on them.
Globally, however, don't expect miracles. The motivation for
using variable length integers is to reduce data size, at
a cost in runtime for decoding and encoding.
Is it possible to implement a function that results in this mapping:
{
(0x01, 0x01),
(0x10, 0x10),
(0x11, 0x00)
}
Using only bitwise operations?
Context
In the Flixel framework, there is a set of four constants:
FlxObject.LEFT:uint = 0x0001;
FlxObject.RIGHT:uint = 0x0010;
FlxObject.UP:uint = 0x0100;
FlxObject.DOWN:uint = 0x1000;
They are obviously designed to be manipulated with bitwise operators. I was trying to write a function, using only bitwise operators, that would return the opposite direction of whatever was passed in (in terms of these FlxObject constants).
Some example mappings:
{
(0x0110, 0x1001),
(0x0100, 0x1000),
(0x1010, 0x0101),
(0x0001, 0x0010),
(0x1100, 0x0000)
}
The problem is, my solution tends to break down when you pass it something like 0x0011, 0x1100, 0x1110 and similar, and requires a check against this case.
Testing code here (also at http://pastie.org/3420169):
#!/usr/bin/env python
from sys import stdout
from os import linesep
# Implementation without conditional
def horiz(dir):
    return (dir ^ 0x0011 ^ 0x1100) & 0x0011
def vert(dir):
    return (dir ^ 0x1100 ^ 0x0011) & 0x1100
def oppositeDirection(dir):
    return horiz(dir) | vert(dir)
# Implementation with conditional
def horizFix(dir):
    dir = horiz(dir)
    return dir if dir != 0x0011 else 0
def vertFix(dir):
    dir = vert(dir)
    return dir if dir != 0x1100 else 0
def oppositeDirectionFix(dir):
    return horizFix(dir) | vertFix(dir)
failcount = 0
testcount = 0
def test(dir, expect, func):
    global failcount, testcount
    testcount += 1
    result = func(dir)
    stdout.write('Testing: {0:04x} => {1:04x}'.format(dir, result))
    if result != expect:
        stdout.write('\t GOT {0:04x} expected {1:04x}'.format(result, expect))
        failcount += 1
    stdout.write(linesep)
test_cases = [0x0000, 0x0001, 0x0010, 0x0100, 0x1000, 0x0011, 0x0101, 0x1001, 0x0110, 0x1010, 0x1100, 0x0111, 0x1011, 0x1101, 0x1110, 0x1111]
print('Testing full oppositeDirection function----------------')
for case in test_cases:
    test(case, oppositeDirectionFix(case), oppositeDirection)
print('\nTesting horiz function---------------------------------')
for case in test_cases:
    test(case, horizFix(case), horiz)
print('\nTesting vert function----------------------------------')
for case in test_cases:
    test(case, vertFix(case), vert)
print('{0}Succeeded: {2}/{1}, Failed: {3}/{1}'.format(linesep, testcount, testcount - failcount, failcount))
If you run the test, you'll see that horiz returns 0x0011 whenever neither horizontal bit is set (e.g. for 0x0000 and 0x1100), and vert returns 0x1100 whenever neither vertical bit is set (e.g. for 0x0000 and 0x0011). So close!
This is clearly an incredibly insignificant problem, and there will never be any situation in my game code where a direction value would be simultaneously left and right or up and down. But, I'm taking this as an opportunity to hone my bit twiddling skills. Is there some logic principle I'm missing here that will help me either solve it or realize it's an unsolvable problem?
No, you have to make some kind of test because your result depends on two adjacent bits.
You could use XOR with 1111 and then test if the result contains 1100 or 0011.
But since you only have 16 values to verify would it not be simpler to make a switch/select/match function for the 8 valid values?
I realize this is an old question by now, but...
If bit shifts are allowed, then yes it can be done with just bitwise operations... though a lookup table or switch is probably more practical. The trick is just that since the value of each bit in your result depends on more than one of your original bits, you need to incorporate a shifted copy of your input value into the overall operation.
If you change the definitions of horiz(), vert(), and oppositeDirection() in your testing code to the following:
def horiz(dir):
    return (((dir << 4) & ~dir & 0x0010) | ((dir >> 4) & ~dir & 0x0001))
def vert(dir):
    return (((dir << 4) & ~dir & 0x1000) | ((dir >> 4) & ~dir & 0x0100))
def oppositeDirection(dir):
    return (((dir << 4) & ~dir & 0x1010) | ((dir >> 4) & ~dir & 0x0101))
then all the tests pass. If we look at horiz() (the others are similar):
The second digit (from the right) is given by:
((dir << 4) & ~dir & 0x0010)
and the first digit by:
((dir >> 4) & ~dir & 0x0001)
Looking further at how the first digit is calculated, (dir >> 4) gets us the original second digit, but lined up with the first. ~dir gets the inverse of the original digit. We then AND those together, and AND with a mask to get just the digit we're calculating so we don't mess up the other digits.
So if we call the original digits A (first) and B (second), and call the new first digit X, we have:
X = ~A & B
The second digit is calculated the same way, shifting in the other direction. We can also combine the calculations of the horizontal bits and the vertical ones by just adjusting the masks appropriately.
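For comparison, the combined expression carries over directly to C++. This is a sketch with the four constants copied from the question; the function name is chosen here.

```cpp
#include <cstdint>

// Direction constants as given in the question (one hex digit per direction).
constexpr std::uint16_t LEFT = 0x0001, RIGHT = 0x0010, UP = 0x0100, DOWN = 0x1000;

// For each pair of neighbouring digits (A, B): new A = ~A & B and new B = ~B & A.
// A digit is set in the result only if its partner was set and it was not.
constexpr std::uint16_t opposite_direction(std::uint16_t dir) {
    const unsigned d = dir;
    return static_cast<std::uint16_t>(((d << 4) & ~d & 0x1010u) |
                                      ((d >> 4) & ~d & 0x0101u));
}

static_assert(opposite_direction(LEFT) == RIGHT, "left flips to right");
static_assert(opposite_direction(UP) == DOWN, "up flips to down");
static_assert(opposite_direction(LEFT | RIGHT) == 0, "contradictory input gives 0");
```

Working on unsigned values keeps ~d well defined, and the masks throw away the bits that the shifts move outside the four digits.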