Reversed axis algorithm issue - c++

I'm currently working on an electronics project and there's a little problem with the joystick values. The values are "correct" but they look weird.
A classic joystick axis usually works like this (for example, left to right):
Totally left : -128
Center : 0
Totally right : +128
But here's what I read from this one :
Totally left : -0
Slightly on the left : - 128
Center : "Random" (never totally zeroed, float between -125 and +125)
Slightly on the right : + 128
Totally right : +0
For the moment I'm using the following workaround to get a linear progression from -128 to +128 :
if (value > 0)
value = -(128 - value);
else
value = 128 + value;
The problem is that I have to do this on several inputs: 2 axes per joystick, 3 joysticks per device, 4 devices in total, so 24 times, and I need to keep the response time under 20 ms for the entire operation. And that's freaking cycle-consuming!
I can manipulate the value at the bit level.
Here's how I actually center it. raw_dump contains an array of 0s and 1s read from the controller I/O:
for (i = 0; i<8; i++) {
value |= raw_dump[pos + i] ? (0x80 >> i):0 ;
}
Do you have any ideas or a good algorithm? I'm starting to get desperate and I totally suck at binary manipulation... :'(

It looks like whatever mechanism is sampling the joystick actually returns an unsigned byte in the range of 0 .. 255, with 0 at the far left and 255 at the far right.
You can convert that value to the range -128 to 127 with one statement:
value = (value & 0xFF) - 128;
If value is a byte variable, you can shorten that to:
value ^= 0x80;
That conversion should be very quick on any processor, even a 1MHz 6502.
I'm not sure what your second bit of code is about. If you could describe what you're trying to accomplish there, I can offer further insight.
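A minimal sketch of the conversion described above, assuming the raw sample really is an unsigned byte with 0 at one extreme and 255 at the other. The function names are illustrative and not part of the original post.

```cpp
#include <cstdint>

// Map a raw 0..255 sample to the signed range -128..127.
// Assumption: 0 is one end of travel and 255 is the other.
inline int center_axis(std::uint8_t raw) {
    return static_cast<int>(raw) - 128;
}

// Same mapping by flipping the top bit. The narrowing cast wraps modulo 256
// on two's-complement targets (guaranteed by the language since C++20).
inline std::int8_t center_axis_xor(std::uint8_t raw) {
    return static_cast<std::int8_t>(raw ^ 0x80u);
}
```

Either form is a single arithmetic operation per axis, so 24 axes should be far inside a 20 ms budget on most microcontrollers, though that is worth measuring on the actual target.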

Related

Constrain a 16 bit signed value between 0 and 4095 using Bit Manipulation only (without branching)

I want to constrain the value of a signed short variable between 0 and 4095, after which I take the most significant 8 bits as my final value for use elsewhere. Right now I'm doing it in a basic manner as below:
short color = /* some external source */;
/*
* I get the color value as a 16 bit signed integer from an
* external source I cannot trust. 16 bits are being used here
* for higher precision.
*/
if ( color < 0 ) {
color = 0;
}
else if ( color > 4095 ) {
color = 4095;
}
unsigned char color8bit = 0xFF & (color >> 4);
/*
* color8bit is my final value which I would actually use
* in my application.
*/
Is there any way this can be done using bit manipulation only, i.e. without using any conditionals? It might help quite a bit in speeding things up as this operation is happening thousands of times in the code.
The following won't help as it doesn't take care of edge cases such as negative values and overflows:
unsigned char color8bit = 0xFF & (( 0x0FFF & color ) >> 4 );
Edit: Adam Rosenfield's answer is the one which takes the correct approach but it's incorrectly implemented. ouah's answer gives correct results but takes a different approach than what I originally intended to find out.
This is what I ended up using:
const static short min = 0;
const static short max = 4095;
color = min ^ (( min ^ color ) & -( min < color ));
color = max ^ (( color ^ max ) & -( color < max ));
unsigned char color8bit = 0xFF & (( 0x0FFF & color ) >> 4 );
Yes, see these bit-twiddling hacks:
short color = ...;
color = color ^ (color & -(color < 0)); // color = max(color, 0)
color = 4096 ^ ((color ^ 4096) & -(color < 4096)); // color = min(color, 4096)
unsigned char color8bit = 0xFF & (color >> 4);
Whether this actually turns out to be faster, I don't know -- you should profile. Most modern x86 and x86-64 chips these days support "conditional move" instructions (cmov) which conditionally store a value depending on the EFLAGS status bits, and optimizing compilers will often produce these instructions from ternary expressions like color >= 0 ? color : 0. Those will likely be fastest, but they won't run on older x86 chips.
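For comparison, here is a sketch of the plain ternary form mentioned above. Whether a given compiler turns it into conditional moves depends on the compiler and its flags, so treat that as something to check in the generated assembly. The function name clamp_to_8bit is made up for this example.

```cpp
#include <cstdint>

// Clamp to 0..4095 with ternaries, then keep the top 8 of the 12 bits.
inline std::uint8_t clamp_to_8bit(std::int16_t color) {
    const int lo = color < 0 ? 0 : color;       // max(color, 0)
    const int clamped = lo > 4095 ? 4095 : lo;  // min(lo, 4095)
    return static_cast<std::uint8_t>(clamped >> 4);
}
```

This version is also a convenient reference to test any bit-twiddling variant against.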
You can do the following:
BYTE data[0x10000] = { ..... };
BYTE byte_color = data[(unsigned short)short_color];
These days a 64 KB table is nothing outrageous and may be acceptable. The number of machine instructions in this variant will be the absolute minimum compared to other possible approaches.
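One way the table could be filled in, as a sketch. The names build_clamp_table and lookup_color are hypothetical, and a std::vector is used here instead of the static BYTE array shown above.

```cpp
#include <cstdint>
#include <vector>

// Build a 65536-entry table indexed by the 16-bit pattern of the input.
inline std::vector<std::uint8_t> build_clamp_table() {
    std::vector<std::uint8_t> table(0x10000);
    for (int c = -32768; c <= 32767; ++c) {
        const int clamped = c < 0 ? 0 : (c > 4095 ? 4095 : c);
        // Converting to uint16_t wraps modulo 65536, so negative inputs
        // land in the upper half of the table.
        table[static_cast<std::uint16_t>(c)] =
            static_cast<std::uint8_t>(clamped >> 4);
    }
    return table;
}

// One table read per conversion, no comparisons on the hot path.
inline std::uint8_t lookup_color(const std::vector<std::uint8_t>& table,
                                 std::int16_t color) {
    return table[static_cast<std::uint16_t>(color)];
}
```

The table is built once at startup. Whether the lookup beats the arithmetic versions depends on cache behaviour, so it should be profiled like the other options.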
short color = /* ... */
color = ((((!!(color >> 12)) * 0xFFF)) | (!(color >> 12) * color ))
& (!(color >> 15) * 0xFFF);
unsigned char color8bit = 0xFF & (color >> 4);
It assumes two's complement representation.
This has the advantage of not using any equality or relational operators. There are situations you want to avoid branches at all costs: in some security applications you don't want the attackers to perform branch predictions. Without branches (in embedded processors particularly) you can make your function run in constant time for all inputs.
Note that: x * 0xFFF can be further reduced to (x << 12) - x. Also the multiplication in (!(color >> 12) * color ) can also be further optimized as the left operand of * here is 0 or 1.
EDIT:
I add a little explanation: the expression above simply does the same as below without the use of the conditional and relational operators:
y = ((y > 4095 ? 4095 : 0) | (y > 4095 ? 0 : y))
& (y < 0 ? 0 : 4095);
EDIT2:
As @HotLicks correctly noted in his comment, the ! is still a conceptual branch. Nevertheless it can also be computed with bitwise operators. For example !!a can be done with the trivial:
b = (a >> 15 | a >> 14 | ... | a >> 1 | a) & 1
and !a can be done as b ^ 1. And I'm sure there is a nice hack to do it more effectively.
I assume a short is 16 bits.
Remove negative values:
int16_t mask=(int16_t)(((uint16_t)color>>15)-1);//0xFFFF if +ve, 0 if -ve
short value=color&mask;//0 if -ve, colour if +ve
value is now between 0 and 32767 inclusive.
You can then do something similar to clamp the value:
mask=(uint16_t)(value-4096)>>15;//1 if <=4095, 0 if >4095
--mask;//0 if <=4095, 0xFFFF if >4095
mask&=0xFFF;//0 if <=4095, 4095 if >4095
value|=mask;//unchanged if <=4095, low 12 bits all set if >4095
Bits 12 and above may still be set at this point, but the final 0xFF & (value >> 4) discards them, so the 8-bit result is 255 as required.
You could also easily vectorize this using Intel's SSE intrinsics. One 128-bit register would hold 8 of your short and there are functions to min/max/shift/mask all of them in parallel. In a loop the constants for min/max can be preloaded into a register. The pshufb instruction (part of SSSE3) will even pack the bytes for you.
I'm going to leave an answer even though it doesn't directly answer the original question, because in the end I think you'll find it much more useful.
I'm assuming that your color is coming from a camera or image scanner running at 12 bits, followed by some undetermined processing step that might create values beyond the 0 to 4095 range. If that's the case the values are almost certainly derived in a linear fashion. The problem is that displays are gamma corrected, so the conversion from 12 bit to 8 bit will require a non-linear gamma function rather than a simple right shift. This will be much slower than the clamping operation your question is trying to optimize. If you don't use a gamma function the image will appear too dark.
short color = /* some external source */;
unsigned char color8bit;
if (color <= 0)
color8bit = 0;
else if (color >= 4095)
color8bit = 255;
else
color8bit = (unsigned char)(255.99 * pow(color / 4095.0, 1/2.2));
At this point you might consider a lookup table as suggested by Kirill Kobelev.
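A sketch of what such a table could look like for the gamma case. Only the 4096 in-range values need entries, because the out-of-range inputs are handled by the two comparisons. The names build_gamma_table and gamma_convert are illustrative, and 2.2 is the gamma value used in the snippet above.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Precompute the 12-bit to 8-bit gamma curve once.
inline std::vector<std::uint8_t> build_gamma_table(double gamma = 2.2) {
    std::vector<std::uint8_t> table(4096);
    for (int i = 0; i < 4096; ++i) {
        table[static_cast<std::size_t>(i)] = static_cast<std::uint8_t>(
            255.99 * std::pow(i / 4095.0, 1.0 / gamma));
    }
    return table;
}

// Clamp, then replace the pow() call with a table read.
inline std::uint8_t gamma_convert(const std::vector<std::uint8_t>& table,
                                  std::int16_t color) {
    if (color <= 0) return 0;
    if (color >= 4095) return 255;
    return table[static_cast<std::size_t>(color)];
}
```

With the table in place, the per-pixel cost is two comparisons and one load, and the expensive pow() runs only 4096 times at startup.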
This is somewhat akin to Tom Seddon's answer, but uses a slightly cleaner way to do the clamp above. Note that both Mr. Seddon's answer and mine avoid the issue in ouah's answer that shifting a negative signed value to the right is implementation-defined behavior, and hence not guaranteed to work on all architectures.
#include <inttypes.h>
#include <iostream>
int16_t clamp(int16_t value)
{
// clampBelow is 0xffff for -ve, 0x0000 for +ve
int16_t const clampBelow = -static_cast<int16_t>(static_cast<uint16_t>(value) >> 15);
// value is now clamped below at zero
value &= ~clampBelow;
// subtract 4095 so we can do the same trick again
value -= 4095;
// clampAbove is 0xffff for -ve, 0x0000 for +ve,
// i.e. 0xffff for original value < 4095, 0x0000 for original >= 4095
int16_t const clampAbove = -static_cast<int16_t>(static_cast<uint16_t>(value) >> 15);
// adjusted value now clamped above at zero
value &= clampAbove;
// and restore to original value.
value += 4095;
return value;
}
void verify(int16_t value)
{
int16_t const clamped = clamp(value);
int16_t const check = (value < 0 ? 0 : value > 4095 ? 4095 : value);
if (clamped != check)
{
std::cout << "Verification failure for value: " << value << ", clamped: " << clamped << ", check: " << check << std::endl;
}
}
int main()
{
for (int16_t i = 0x4000; i != 0x3fff; i++)
{
verify(i);
}
return 0;
}
That's a full test program (OK, so it doesn't test 0x3fff - sue me. ;) ) from which you can extract the clamp() routine for whatever you need.
I've also broken clamp out to "one step per line" for the sake of clarity. If your compiler has a half way decent optimizer, you can leave it as is and rely on the compiler to produce the best possible code. If your compiler's optimizer is not that great, then by all means, it can be reduced in line count, albeit at the cost of a little readability.
"Never sacrifice clarity for efficiency" -- Bob Buckley, comp sci professor, U-Warwick, Coventry, England, 1980.
Best piece of advice I ever got. ;)

bitwise bitmanipulation puzzle

Hello, I have a question about a school assignment. I need to:
Read a whole number and look at its internal binary code, with bit 0 on the right and bit 7 on the left.
Now I need to swap:
bit 0 with bit 7
bit 1 with bit 6
bit 2 with bit 5
bit 3 with bit 4
For example:
hex F703 becomes F7C0
because 03 = 0000 0011 and C0 = 1100 0000
(only the right byte (8 bits) needs to be swapped).
The lesson was about bit manipulation, but I can't find a way to make it work for all 16 hex digits.
I have been puzzling over this for a while now.
I am thinking of using an array for this problem, or can someone tell me whether it can be done with only the bitwise ^, &, ~, <<, >> operators?
Study the following two functions:
bool GetBit(int value, int bit_position)
{
return value & (1 << bit_position);
}
void SetBit(int& value, int bit_position, bool new_bit_value)
{
if (new_bit_value)
value |= (1 << bit_position);
else
value &= ~(1 << bit_position);
}
So now we can read and write arbitrary bits just like an array.
1 << N
gives you:
000...0001000...000
Where the 1 is in the Nth position.
So
1 << 0 == 0000...0000001
1 << 1 == 0000...0000010
1 << 2 == 0000...0000100
1 << 3 == 0000...0001000
...
and so on.
Now what happens if I BINARY AND one of the above numbers with some other number Y?
X = 1 << N
Z = X & Y
What is Z going to look like? Well, every bit apart from the Nth is definitely going to be 0, isn't it? Because those bits are 0 in X.
What will the Nth bit of Z be? It depends on the value of the Nth bit of Y, doesn't it? So under what circumstances is Z zero? Precisely when the Nth bit of Y is 0. So by converting Z to a bool we can separate out the value of the Nth bit of Y. Take another look at the GetBit function above; this is exactly what it is doing.
Now that's reading bits; how do we set a bit? Well, if we want to set a bit on we can use BINARY OR with one of the (1 << N) numbers from above:
X = 1 << N
Z = Y | X
What is Z going to be here? Well every bit is going to be the same as Y except the Nth right? And the Nth bit is always going to be 1. So we have set the Nth bit on.
What about setting a bit to zero? What we want to do is take a number like 11111011111 where just the Nth bit is off and then use BINARY AND. To get such a number we just use BINARY NOT:
X = 1 << N // 000010000
W = ~X // 111101111
Z = W & Y
So all the bits in Z apart from the Nth will be copies of Y. The Nth will always be off. So we have effectively set the Nth bit to 0.
Using the above two techniques is how we have implemented SetBit.
So now we can read and write arbitrary bits. Now we can reverse the bits of the number just like it was an array:
int ReverseBits(int input)
{
const int N = 8; // number of bits to reverse; 8 (one byte) is assumed here
int output = 0;
for (int i = 0; i < N; i++)
{
bool bit = GetBit(input, i); // read ith bit
SetBit(output, N-i-1, bit); // write (N-i-1)th bit
}
return output;
}
Please make sure you understand all this. Once you have understood this all, please close the page and implement and test them without looking at it.
If you enjoyed this then try some of these:
http://graphics.stanford.edu/~seander/bithacks.html
And/or get this book:
http://www.amazon.com/exec/obidos/ASIN/0201914654/qid%3D1033395248/sr%3D11-1/ref%3Dsr_11_1/104-7035682-9311161
This does one quarter of the job, but I'm not going to give you any more help than that; if you can work out why I said that, then you should be able to fill in the rest of the code.
if ((i ^ (i >> (5 - 2))) & (1 << 2))
i ^= (1 << 2) | (1 << 5);
Essentially you need to reverse the bit ordering.
We're not going to solve this for you.. but here's a hint:
What if you had a 2-bit value. How would you reverse these bits?
A simple swap would work, right? Think about how to code this swap with operators that are available to you.
Now let's say you had a 4-bit value. How would you reverse these bits?
Could you split it into two 2-bit values, reverse each one, and then swap them? Would that give you the right result? Now code this.
Generalizing that solution to the 8-bit value should be trivial now.
Good luck!
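For readers who have already worked through the hint, the split-and-swap idea can be sketched as follows. This is one possible implementation, and the function names are made up for the example. reverse_low_byte applies it to the low byte only, as in the F703 to F7C0 example from the question.

```cpp
#include <cstdint>

// Reverse the 8 bits of a byte by swapping progressively smaller groups.
inline std::uint8_t reverse_byte(std::uint8_t b) {
    unsigned v = b;
    v = ((v & 0xF0u) >> 4) | ((v & 0x0Fu) << 4);  // swap the two nibbles
    v = ((v & 0xCCu) >> 2) | ((v & 0x33u) << 2);  // swap the bit pairs inside each nibble
    v = ((v & 0xAAu) >> 1) | ((v & 0x55u) << 1);  // swap neighbouring bits
    return static_cast<std::uint8_t>(v);
}

// Keep the high byte and reverse only the low byte of a 16-bit value.
inline std::uint16_t reverse_low_byte(std::uint16_t x) {
    const std::uint8_t low = static_cast<std::uint8_t>(x & 0xFFu);
    return static_cast<std::uint16_t>((x & 0xFF00u) | reverse_byte(low));
}
```

Each step uses only &, |, << and >>, so it stays within the operators the assignment allows.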

Set A Float's Fractional Part Using 6 Bits

I am uncompressing some data from double words.
unsigned char * current_word = [address of most significant byte]
My first 14 MSB are an int value. I plan to extract them using a bitwise AND with 0xFFFC.
int value = (int)( (uint_16)current_word & 0xFFFC );
My next 6 bits are a fractional value. Here I am stuck on an efficient implementation. I could extract one bit at a time and build the fraction as 1/2*bit + 1/4*bit + 1/8*bit etc., but that's not efficient.
float fractional = ?
The last 12 LSB are another int value, which I feel I can pull out using bitwise AND again.
int other_value = (int) ( (uint_16)current_word[2] & 0x0FFF );
This operation will be done on 16348 double words and needs to be finished within 0.05 ms to run at least 20Hz.
I am very new to bit operations, but I'm excited to learn. Reading material and/or examples would be greatly appreciated!
Edit: I wrote OR when I meant AND
Since you're starting with [address of most significant byte] and using increasing addresses from there, your data is apparently in Big-Endian byte order. Casting pointers will therefore fail on nearly all desktop machines, which use Little-Endian byte order.
The following code will work, regardless of native byte order:
int value = (current_word[0] << 6) | (current_word[1] >> 2);
double fractional = (current_word[1] & 0x03) / 4.0 + (current_word[2] & 0xF0) / 1024.0;
int other_value = (current_word[2] & 0x0F) << 8 | current_word[3];
Firstly you'd be more efficient getting the double-word all at once into an int and masking/shifting from there.
Getting the fractional part from that is easy: mask and shift to get an integer, then divide by a float to scale the result.
float fractional = ((current_int >> 12) & 0x3f) / 64.;
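Putting the two answers together, a sketch of the whole decode could look like this. It assumes the layout from the question (14-bit integer, 6-bit fraction, 12-bit integer, most significant byte first). The Decoded struct and decode_word function are hypothetical names.

```cpp
#include <cstdint>

struct Decoded {
    int value;         // top 14 bits
    float fractional;  // next 6 bits, scaled to [0, 1)
    int other_value;   // low 12 bits
};

inline Decoded decode_word(const unsigned char* p) {
    // Assemble the big-endian bytes by hand so the result does not depend
    // on the byte order of the machine running the code.
    const std::uint32_t w = (static_cast<std::uint32_t>(p[0]) << 24) |
                            (static_cast<std::uint32_t>(p[1]) << 16) |
                            (static_cast<std::uint32_t>(p[2]) << 8) |
                            static_cast<std::uint32_t>(p[3]);
    Decoded d;
    d.value = static_cast<int>(w >> 18);
    d.fractional = static_cast<float>((w >> 12) & 0x3Fu) / 64.0f;
    d.other_value = static_cast<int>(w & 0xFFFu);
    return d;
}
```

The 6-bit field is read as one integer in 0..63 and scaled once, which avoids the bit-by-bit summation the question was worried about.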
There are 5 kinds of shift instructions:
1. Shift right with sign extend: It will copy your current leftmost bit as the new leftmost bit after shifting all the bits to the right. The rightmost one gets dropped.
2. Shift right with zero extend: Same as (1), but the new leftmost bit is always zero.
3. Shift left: Replace right in (1) and (2) with left, left with right, and read (2) again.
4. Roll right: Shift your bits to the right; instead of the rightmost one being dropped, it becomes your leftmost.
5. Roll left: Replace right in (4) with left, left with right, and read (4) again.
You can shift as many times as you want. In C, shifting by a count equal to or greater than the number of bits in the (promoted) datatype is undefined. Unsigned and signed types shift differently although the syntax is the same.
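C and C++ have operators for the plain shifts but none for the roll (rotate) variants, so those are usually written by combining two shifts. A minimal sketch for 32-bit values follows. The names rotl32 and rotr32 are illustrative, and C++20 also offers std::rotl and std::rotr in the <bit> header.

```cpp
#include <cstdint>

// Roll left: bits shifted out at the top re-enter at the bottom.
inline std::uint32_t rotl32(std::uint32_t x, unsigned n) {
    n &= 31u;  // keep the count in 0..31 so neither shift is out of range
    return (x << n) | (x >> ((32u - n) & 31u));
}

// Roll right: bits shifted out at the bottom re-enter at the top.
inline std::uint32_t rotr32(std::uint32_t x, unsigned n) {
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}
```

Masking the count avoids the undefined case of shifting a 32-bit value by 32 when n is 0.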
If you are reading your data as unsigned char *, you are not going to be able to get more than 8-bits at a time of data and your example needs to change. If your address is aligned, or your platform allows, you should read your data in as an int *, but then that also begs the question of just how your data is stored. Is it stored 20-bits per integer with 12-bits of other info, or is it a 20-bit stream where you need to keep track of your bit pointer. If the second, it's even more complex than you realize. I'll post further once I have a feel for how your data is laid out in RAM.

C/C++ Bit Array or Bit Vector

I am learning C/C++ programming and have encountered the use of 'bit arrays' or 'bit vectors'. I am not able to understand their purpose. Here are my doubts:
Are they used as boolean flags?
Can one use int arrays instead? (more memory of course, but..)
What's this concept of Bit-Masking?
If bit-masking is just simple bit operations to get the appropriate flag, how does one program for them? Isn't it difficult to do this operation in one's head to see what the flag would be, as opposed to decimal numbers?
I am looking for applications so that I can understand better. For example:
Q. You are given a file containing integers in the range (1 to 1 million). There are some duplicates and hence some numbers are missing. Find the fastest way of finding the missing numbers.
For the above question, I have read solutions telling me to use bit arrays. How would one store each integer in a bit?
I think you've got yourself confused between arrays and numbers, specifically what it means to manipulate binary numbers.
I'll go about this by example. Say you have a number of error messages and you want to return them in a return value from a function. Now, you might label your errors 1,2,3,4... which makes sense to your mind, but then how do you, given just one number, work out which errors have occurred?
Now, try labelling the errors 1,2,4,8,16... increasing powers of two, basically. Why does this work? Well, when you work in base 2 you are manipulating a number like 00000000 where each digit corresponds to a power of 2 determined by its position from the right. So let's say errors 1, 4 and 8 occur. Well, then that could be represented as 00001101. Reading from the right, the first digit = 1*2^0, the third digit 1*2^2 and the fourth digit 1*2^3. Adding them all up gives you 13.
Now, we are able to test if such an error has occurred by applying a bitmask. For example, if you wanted to work out if error 8 has occurred, use the bit representation of 8 = 00001000. Now, in order to extract whether or not that error has occurred, use a binary and like so:
00001101
& 00001000
= 00001000
I'm sure you know how an and works or can deduce it from the above - working digit-wise, if any two digits are both 1, the result is 1, else it is 0.
Now, in C:
int func(...)
{
    int retval = 0;
    if ( sometestthatmeans an error )
    {
        retval |= 1; /* set bit 0 */
    }
    if ( sometestthatmeans an error )
    {
        retval |= 2; /* set bit 1 */
    }
    return retval;
}
int anotherfunc(...)
{
    uint8_t x = func(...);
    /* binary and with 8 and shift 3 places to the right
     * so that the resultant expression is either 1 or 0 */
    if ( ( ( x & 0x08 ) >> 3 ) == 1 )
    {
        /* that error occurred */
    }
}
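A compilable variant of the same idea, as a sketch. The error names are invented for the example, and the flag values follow the 1, 2, 4, 8 labelling described above.

```cpp
#include <cstdint>

// Hypothetical error flags, one bit each.
enum ErrorFlag : std::uint8_t {
    kErrFileMissing = 1u << 0,  // 0x01
    kErrBadFormat   = 1u << 1,  // 0x02
    kErrTooLarge    = 1u << 2,  // 0x04
    kErrTimeout     = 1u << 3   // 0x08
};

// True when the given flag bit is set in the combined value.
inline bool has_error(std::uint8_t flags, ErrorFlag f) {
    return (flags & f) != 0;
}
```

Setting errors 1, 4 and 8 with flags |= kErrFileMissing and so on gives 00001101 (13), matching the worked example, and has_error(flags, kErrTimeout) then reports whether error 8 occurred.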
Now, to practicalities. When memory was scarce and protocols didn't have the luxury of verbose XML etc., it was common to delimit a field as being so many bits wide. In that field, you assign various bits (flags, powers of 2) to a certain meaning and apply binary operations to deduce if they are set, then operate on these.
I should also add that binary operations are close in idea to the underlying electronics of a computer. Imagine if the bit fields corresponded to the output of various circuits (carrying current or not). By using enough combinations of said circuits, you make... a computer.
Regarding the usage of the bit array:
If you know there are "only" 1 million numbers, you use an array of 1 million bits. In the beginning all bits are zero, and every time you read a number you use that number as an index and set the bit at that index to one (if it's not one already).
After reading all the numbers, the missing numbers are the indices of the zeros in the array.
For example, if we had only numbers between 0 and 4, the array would look like this in the beginning: 0 0 0 0 0.
If we read the numbers 3, 2, 2,
the array would change like this: read 3 --> 0 0 0 1 0. read 2 --> 0 0 1 1 0. read 2 (again) --> 0 0 1 1 0. Check the indices of the zeros: 0, 1, 4. Those are the missing numbers.
BTW, of course you can use integers instead of bits, but it may take (depending on the system) 32 times the memory.
Sivan
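The small 0 to 4 example can be sketched in code like this. std::vector<bool> is used as the bit array because typical implementations pack it to one bit per element. The function name find_missing is illustrative.

```cpp
#include <cstddef>
#include <vector>

// Return the values in [0, limit) that never appear in `numbers`.
inline std::vector<int> find_missing(const std::vector<int>& numbers, int limit) {
    std::vector<bool> seen(static_cast<std::size_t>(limit), false);
    for (int n : numbers) {
        if (n >= 0 && n < limit) {
            seen[static_cast<std::size_t>(n)] = true;  // duplicates just set the bit again
        }
    }
    std::vector<int> missing;
    for (int i = 0; i < limit; ++i) {
        if (!seen[static_cast<std::size_t>(i)]) {
            missing.push_back(i);
        }
    }
    return missing;
}
```

Calling find_missing({3, 2, 2}, 5) yields 0, 1 and 4, the same result as the walkthrough above.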
Bit arrays or bit vectors can be thought of as an array of boolean values. Normally a boolean variable needs at least one byte of storage, but in a bit array/vector only one bit is needed.
This comes in handy if you have lots of such data, since you save a lot of memory.
Another usage is if you have numbers which do not exactly fit in standard variables, which are 8, 16, 32 or 64 bits in size. This way you could store in a 16-bit bit vector one number that is 4 bits, one that is 2 bits and one that is 10 bits in size. Normally you would have to use 3 variables with sizes of 8, 8 and 16 bits, so 50% of the storage would be wasted.
But all these uses are rare in business applications; they come up mostly when interfacing with drivers through P/Invoke/interop functions and when doing low-level programming.
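The 4 + 2 + 10 bit packing mentioned above could be sketched like this. The field order (a in the top 4 bits, then b, then c in the low 10 bits) is an assumption made for the example, as are the function names.

```cpp
#include <cstdint>

// Assumed layout: bits 15..12 = a (4 bits), 11..10 = b (2 bits), 9..0 = c (10 bits).
inline std::uint16_t pack_fields(unsigned a, unsigned b, unsigned c) {
    return static_cast<std::uint16_t>(((a & 0xFu) << 12) |
                                      ((b & 0x3u) << 10) |
                                      (c & 0x3FFu));
}

// Each accessor shifts its field down to bit 0 and masks off the rest.
inline unsigned field_a(std::uint16_t w) { return (w >> 12) & 0xFu; }
inline unsigned field_b(std::uint16_t w) { return (w >> 10) & 0x3u; }
inline unsigned field_c(std::uint16_t w) { return w & 0x3FFu; }
```

All three values share one 16-bit word instead of occupying 32 bits across three separate variables.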
Bit arrays or bit vectors are used as a mapping from position to some bit value. Yes, it's basically the same thing as an array of bool, but a typical bool implementation is one to four bytes long, which uses too much space.
We can store the same amount of data much more efficiently by using arrays of words and binary masking operations and shifts to store and retrieve them (less overall memory used, less accesses to memory, less cache miss, less memory page swap). The code to access individual bits is still quite straightforward.
There is also some bit field support built into the C language (you write things like int i:1; to say "only consume one bit"), but it is not available for arrays and you have less control over the overall result (details of the implementation depend on the compiler and alignment issues).
Below is a possible way to answer your "search missing numbers" question. I fixed the int size to 32 bits to keep things simple, but it could be written using sizeof(int) to make it portable. And (depending on the compiler and target processor) the code could be made faster using >> 5 instead of / 32 and & 31 instead of % 32, but this is just to give the idea.
#include <stdio.h>
#include <errno.h>
#include <stdint.h>
int main(){
/* put all numbers from 0 to 999999 in a file, except 765, 777760 and 777791 */
{
printf("writing test file\n");
int x = 0;
FILE * f = fopen("testfile.txt", "w");
for (x=0; x < 1000000; ++x){
if (x == 765 || x == 777760 || x == 777791){
continue;
}
fprintf(f, "%d\n", x);
}
fprintf(f, "%d\n", 57768); /* this one is a duplicate */
fclose(f);
}
uint32_t bitarray[1000000 / 32] = {0}; /* every bit must start cleared */
/* read file containing integers in the range [1,1000000] */
/* any non number is considered as separator */
/* the goal is to find missing numbers */
printf("Reading test file\n");
{
unsigned int x = 0;
FILE * f = fopen("testfile.txt", "r");
while (1 == fscanf(f, " %u",&x)){
bitarray[x / 32] |= 1u << (x % 32); /* unsigned 1 so that shifting by 31 is well defined */
}
fclose(f);
}
/* find missing number in bitarray */
{
int x = 0;
for (x=0; x < (1000000 / 32) ; ++x){
uint32_t n = bitarray[x];
if (n != (uint32_t)-1){
printf("Missing number(s) between %d and %d [%x]\n",
x * 32, (x+1) * 32, bitarray[x]);
int b;
for (b = 0 ; b < 32 ; ++b){
if (0 == (n & (1u << b))){
printf("missing number is %d\n", x*32+b);
}
}
}
}
}
}
That is used for bit flags storage, as well as for parsing different binary protocols fields, where 1 byte is divided into a number of bit-fields. This is widely used, in protocols like TCP/IP, up to ASN.1 encodings, OpenPGP packets, and so on.

Find "edges" in 32 bits word bitpattern

I'm trying to find the most efficient algorithm to count "edges" in a bit pattern. An edge means a change from 0 to 1 or from 1 to 0. I am sampling each bit every 250 us and shifting it into a 32-bit unsigned variable.
This is my algorithm so far
void CountEdges(void)
{
uint_least32_t feedback_samples_copy = feedback_samples;
signal_edges = 0;
while (feedback_samples_copy > 0)
{
uint_least8_t flank_information = (feedback_samples_copy & 0x03);
if (flank_information == 0x01 || flank_information == 0x02)
{
signal_edges++;
}
feedback_samples_copy >>= 1;
}
}
It needs to be at least 2 or 3 times as fast.
You should be able to bitwise XOR the value with a copy of itself shifted by one bit to get a bit pattern representing the flipped bits. Then use one of the bit counting tricks on this page: http://graphics.stanford.edu/~seander/bithacks.html to count how many 1's there are in the result.
One thing that may help is to precompute the edge count for all possible 8-bit values (a 512-entry lookup table, since you have to include the bit that precedes each value) and then sum up the counts 1 byte at a time.
// prevBit is the last bit of the previous 32-bit word
// edgeLut is a 512 entry precomputed edge count table
// Some of the shifts and & are extraneous, but there for clarity
edgeCount =
edgeLut[(prevBit << 8) | (feedback_samples >> 24) & 0xFF] +
edgeLut[(feedback_samples >> 16) & 0x1FF] +
edgeLut[(feedback_samples >> 8) & 0x1FF] +
edgeLut[(feedback_samples >> 0) & 0x1FF];
prevBit = feedback_samples & 0x1;
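The table itself is not shown above. One way it could be generated, as a sketch: index bit 8 is taken to be the bit that precedes the byte and bits 7..0 the byte itself, matching the indexing in the snippet. The names build_edge_lut and count_edges_lut are hypothetical.

```cpp
#include <array>
#include <cstdint>

// Entry = number of transitions among the 9 bits of the index.
inline std::array<std::uint8_t, 512> build_edge_lut() {
    std::array<std::uint8_t, 512> lut{};
    for (unsigned idx = 0; idx < 512; ++idx) {
        // Bit i of diff is set when bits i and i+1 of idx differ (i = 0..7).
        unsigned diff = (idx ^ (idx >> 1)) & 0xFFu;
        std::uint8_t count = 0;
        while (diff != 0) {
            diff &= diff - 1;  // clear the lowest set bit
            ++count;
        }
        lut[idx] = count;
    }
    return lut;
}

// Sum four lookups; prev_bit is the last bit of the previous 32-bit word.
inline unsigned count_edges_lut(const std::array<std::uint8_t, 512>& lut,
                                unsigned prev_bit, std::uint32_t samples) {
    return lut[((prev_bit & 1u) << 8) | ((samples >> 24) & 0xFFu)] +
           lut[(samples >> 16) & 0x1FFu] +
           lut[(samples >> 8) & 0x1FFu] +
           lut[samples & 0x1FFu];
}
```

Because each lookup includes the bit above its byte, the edges between bytes are counted exactly once and no separate boundary check is needed.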
My suggestion:
copy your input value to a temp variable, left-shifted by one
copy the LSB of your input into the LSB of your temp variable
XOR the two values. Every bit set in the result value represents one edge.
use a bit-counting (population count) algorithm to count the number of bits set.
This might be the code for the first 3 steps:
uint32_t input; //some value
uint32_t temp = (input << 1) | (input & 0x00000001);
uint32_t result = input ^ temp;
//continue to count the bits set in result
//...
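A complete version of these steps might look like the sketch below. std::bitset is used here for the final bit count purely for brevity, and any of the counting tricks from the bithacks page would do instead. The function name count_edges is illustrative.

```cpp
#include <bitset>
#include <cstdint>

// Count 0->1 and 1->0 transitions between neighbouring bits of a 32-bit word.
inline unsigned count_edges(std::uint32_t input) {
    // Shift left by one and duplicate the LSB so that bit 0 compares equal
    // to itself and does not produce a spurious edge.
    const std::uint32_t shifted = (input << 1) | (input & 1u);
    const std::uint32_t edges = input ^ shifted;  // bit i set when bits i and i-1 differ
    return static_cast<unsigned>(std::bitset<32>(edges).count());
}
```

This counts only the 31 transitions inside the word. An edge between the last bit of one word and the first bit of the next would have to be handled separately.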
Create a look-up table so you can get the transitions within a byte or 16-bit value in one shot - then all you need to do is look at the differences in the 'edge' bits between bytes (or 16-bit values).
You are looking at only 2 bits during every iteration.
The fastest algorithm would probably be to build a hash table for all possible values. Since there are 2^32 values, that is not the best idea.
But why don't you look at 3, 4, 5 ... bits in one step? You can for instance precalculate for all 4 bit combinations your edgecount. Just take care of possible edges between the pieces.
You could always use a lookup table for, say, 8 bits at a time.
This way you get a speed improvement of around 8 times.
Don't forget to check the edges between those 8-bit groups, though. These then have to be checked 'manually'.