I'm using a dsPIC33F and GCC. I want to rotate the bits in a word once left or right, like this:
MSB LSB
input: 0101 1101 0101 1101
right: 1010 1110 1010 1110
left : 1011 1010 1011 1010
(In case it's not clear, the LSB moves into the MSB's position for the right rotate and vice versa.)
My processor already has a rotate right (rrnc, rrc) and rotate left instruction (rlnc, rlc), so I'm hoping the compiler will optimise this in. If not, I might have to use inline assembly.
You can write them as the obvious combination of conventional shifts:
x rol N == (x << N) | (x >> (width - N))
x ror N == (x >> N) | (x << (width - N))
where width is the number of bits in the value you are rotating.
A smart compiler may (I think it will) detect this combination and compile it to a rotate instruction.
Note that this only works for unsigned values, for 0 < N < width (shifting by the full width is undefined in C), and only if width equals the number of bits in the type you are operating on (16 for unsigned int on the dsPIC).
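A minimal sketch of the shift-and-OR idiom above for a 16-bit word. The names rotl16 and rotr16 are made up for illustration and are not part of any toolchain; the counts are masked so that neither shift ever reaches the full width. Whether a particular compiler turns this into a single rotate instruction has to be checked in the generated assembly.

```cpp
#include <cstdint>

// Rotate a 16-bit value left by n bits (n is reduced modulo 16).
// Masking both counts with 15 keeps every shift in the range 0..15,
// so n == 0 and n == 16 are handled without an out-of-range shift.
constexpr std::uint16_t rotl16(std::uint16_t x, unsigned n) {
    return static_cast<std::uint16_t>((x << (n & 15u)) | (x >> ((16u - n) & 15u)));
}

// Rotate a 16-bit value right by n bits (n is reduced modulo 16).
constexpr std::uint16_t rotr16(std::uint16_t x, unsigned n) {
    return static_cast<std::uint16_t>((x >> (n & 15u)) | (x << ((16u - n) & 15u)));
}
```

With the input from the question, rotl16(0x5D5D, 1) gives 0xBABA and rotr16(0x5D5D, 1) gives 0xAEAE.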
There is no circular shift operator in C. (Reference)
Inline assembly might be the way to go if performance is critical. Otherwise you could use the code in the article linked above.
There is a GCC for dsPIC? Check its manual to see whether it has an intrinsic for circular shifts. The other option is inline asm.
I am doing a little game physics networking project right now, and I am trying to optimize the packets I am sending using this guide:
https://gafferongames.com/post/snapshot_compression/
In the "Optimize Quaternions" section it says:
Don’t always drop the same component due to numerical precision issues. Instead, find the component with the largest absolute value and ENCODE its index using two bits [0,3] (0=x, 1=y, 2=z, 3=w), then send the index of the largest component and the smallest three components over the network
Now my question is, how do I encode an integer down to 2 bits... or have I misunderstood the task?
I know very little about compressing data, but reducing a 4 byte integer (32 bits) down to ONLY 2 bits seems a bit insane to me. Is that even possible, or have I completely misunderstood everything?
EDIT:
Here is some code of what I have so far:
void HavNetConnection::sendBodyPacket(HavNetBodyPacket bp)
{
    RakNet::BitStream bsOut;
    bsOut.Write((RakNet::MessageID)ID_BODY_PACKET);
    float maxAbs = std::abs(bp.rotation(0));
    int maxIndex = 0;
    for (int i = 1; i < 4; i++)
    {
        float rotAbs = std::abs(bp.rotation(i));
        if (rotAbs > maxAbs) {
            maxAbs = rotAbs;
            maxIndex = i;
        }
    }
    bsOut.Write(bp.position(0));
    bsOut.Write(bp.position(1));
    bsOut.Write(bp.position(2));
    bsOut.Write(bp.linearVelocity(0));
    bsOut.Write(bp.linearVelocity(1));
    bsOut.Write(bp.linearVelocity(2));
    bsOut.Write(bp.rotation(0));
    bsOut.Write(bp.rotation(1));
    bsOut.Write(bp.rotation(2));
    bsOut.Write(bp.rotation(3));
    bsOut.Write(bp.bodyId.toRawInt(bp.bodyId));
    bsOut.Write(bp.stepCount);
    // Send body packets over UDP (UNRELIABLE), priority could be low.
    m_peer->Send(&bsOut, MEDIUM_PRIORITY, UNRELIABLE,
                 0, RakNet::UNASSIGNED_SYSTEM_ADDRESS, true);
}
The simplest solution to your problem is to use bitfields:
// working type (use your existing Quaternion implementation instead)
struct Quaternion{
    float w,x,y,z;
    Quaternion(float w_=1.0f, float x_=0.0f, float y_=0.0f, float z_=0.0f) : w(w_), x(x_), y(y_), z(z_) {}
};
struct PacketQuaternion
{
    enum LargestElement{
        W=0, X=1, Y=2, Z=3,
    };
    LargestElement le : 2; // 2 bits;
    signed int i1 : 9, i2 : 9, i3 : 9; // 9 bits each
    PacketQuaternion() : le(W), i1(0), i2(0), i3(0) {}
    operator Quaternion() const { // convert packet quaternion to regular quaternion
        const float s = 1.0f/float(1<<8); // scale int to [-1, 1]; you could also scale to [-sqrt(.5), sqrt(.5)]
        const float f1=s*i1, f2 = s*i2, f3 = s*i3;
        const float f0 = std::sqrt(1.0f - f1*f1-f2*f2-f3*f3);
        switch(le){
        case W: return Quaternion(f0, f1, f2, f3);
        case X: return Quaternion(f1, f0, f2, f3);
        case Y: return Quaternion(f1, f2, f0, f3);
        case Z: return Quaternion(f1, f2, f3, f0);
        }
        return Quaternion(); // default, can't happen
    }
};
If you have a look at the assembler code this generates, you will see a bit of shifting to extract le and i1 to i3 -- essentially the same code you could write manually as well.
Your PacketQuaternion structure will always occupy a whole number of bytes, so (on any non-exotic platform) you will still waste 3 bits (you could just use 10 bits per integer field here, unless you have other use for those bits).
I left out the code to convert from regular quaternion to PacketQuaternion, but that should be relatively simple as well.
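For the direction that was left out, here is a rough sketch under a few assumptions: the components arrive as four plain floats in w, x, y, z order, the quaternion is normalised, and the struct SmallestThree and the function encode_smallest_three are names invented here, not part of RakNet or of the code above. It uses the same 1/256 scale as the decoder and flips the sign of all components when the largest one is negative (q and -q describe the same rotation), so that the dropped component can be rebuilt with a positive square root.

```cpp
#include <cmath>

// Hypothetical result type: index of the dropped component (W=0, X=1, Y=2, Z=3)
// plus the three remaining components quantised to the signed 9-bit range.
struct SmallestThree {
    unsigned largest;
    int a, b, c;
};

inline SmallestThree encode_smallest_three(float w, float x, float y, float z) {
    const float q[4] = {w, x, y, z};

    // Find the component with the largest absolute value.
    unsigned largest = 0;
    for (unsigned i = 1; i < 4; ++i) {
        if (std::fabs(q[i]) > std::fabs(q[largest])) largest = i;
    }

    // Make the dropped component positive so the receiver can use +sqrt.
    const float sign = (q[largest] < 0.0f) ? -1.0f : 1.0f;

    int out[3] = {0, 0, 0};
    unsigned k = 0;
    for (unsigned i = 0; i < 4; ++i) {
        if (i == largest) continue;
        long v = std::lround(sign * q[i] * 256.0f);  // inverse of the 1/256 scale
        if (v > 255) v = 255;                        // clamp to the signed 9-bit range
        if (v < -256) v = -256;
        out[k++] = static_cast<int>(v);
    }
    return SmallestThree{largest, out[0], out[1], out[2]};
}
```

The three values are emitted in w, x, y, z order with the largest one skipped, which matches the order the switch statement in the decoder expects.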
Generally (as always when networking is involved), be extra careful that data is converted correctly in all directions, especially if different architectures or different compilers are involved!
Also, as others have noted, make sure that network bandwidth really is a bottleneck before doing aggressive optimization here.
I'm guessing they want you to fit the 2 bits into some value you are already sending that doesn't need all of the available bits, or to pack several small bit fields into a single int for transmission.
You can do things like this:
// these are going to be used as 2 bit fields,
// so we can only go to 3.
enum addresses
{
x = 0, // 00
y = 1, // 01
z = 2, // 10
w = 3 // 11
};
int val_to_send;
// set the value to send, and shift it 2 bits left.
val_to_send = 1234;
// bit pattern: 0000 0100 1101 0010
// bit shift left by 2 bits
val_to_send = val_to_send << 2;
// bit pattern: 0001 0011 0100 1000
// set the address to the last 2 bits.
// this value is address w (bit pattern 11) for example...
val_to_send |= w;
// bit pattern: 0001 0011 0100 1011
send_value(val_to_send);
On the receive end:
receive_value(&rx_value);
// pick off the address by masking with the low 2 bits
address = rx_value & 0x3;
// address now = 3 (w)
// bit shift right to restore the value
rx_value = rx_value >> 2;
// rx_value = 1234 again.
You can 'pack' bits this way, any number of bits at a time.
int address_list;
// set address to w (11)
address_list = w;
// 0000 0011
// bit shift left by 2 bits
address_list = address_list << 2;
// 0000 1100
// now add address x (00)
address_list |= x;
// 0000 1100
// bit shift left 2 more bits
address_list = address_list << 2;
// 0011 0000
// add the address y (01)
address_list |= y;
// 0011 0001
// bit shift left 2 more bits
address_list = address_list << 2;
// 1100 0100
// add the address z. (10)
address_list |= z;
// 1100 0110
// w x y z are now in the lower byte of 'address_list'
This packs 4 addresses into the lower byte of 'address_list';
You just have to do the unpacking on the other end.
This has some implementation details to work out. You only have 30 bits now for the value, not 32. If the data is a signed int, you have more work to do to avoid shifting the sign bit out to the left, etc.
But, fundamentally, this is how you can stuff bit patterns into data that you are sending.
Obviously this assumes that sending is more expensive than the work of packing bits into bytes and ints, etc. This is often the case, especially where low baud rates are involved, as in serial ports.
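The shift-and-mask steps above can be wrapped in a few helpers. This is only a sketch: pack_tag, unpack_tag and unpack_value are names made up here, and the value is kept unsigned so that the sign-bit problem mentioned above does not arise.

```cpp
#include <cstdint>

// Append a 2-bit tag below a value. The value must fit in 30 bits,
// because its top two bits are shifted out.
constexpr std::uint32_t pack_tag(std::uint32_t value, unsigned tag) {
    return (value << 2) | (tag & 0x3u);
}

// Recover the 2-bit tag from the low bits.
constexpr unsigned unpack_tag(std::uint32_t packed) {
    return packed & 0x3u;
}

// Recover the original value by shifting the tag back out.
constexpr std::uint32_t unpack_value(std::uint32_t packed) {
    return packed >> 2;
}
```

pack_tag(1234, 3) yields 4939 (bit pattern 0001 0011 0100 1011), and chaining pack_tag(pack_tag(pack_tag(3, 0), 1), 2) reproduces the 1100 0110 byte from the address_list example.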
There are a lot of possible understandings and misunderstandings in play here.
ttemple addressed your technical problem of sending less than a byte.
I want to reiterate the more theoretical points.
This is not done
You originally misunderstood the quoted passage.
We do not use two bits to say “not sending 2121387”,
but to say “not sending z-component”.
That these match exactly, should be easy to see.
This is impossible
If you want to send a 32 bit integer which might take any of the 2^32 possible values,
you need at least 32 bits.
As n bits can represent at most 2^n states,
any smaller number of bits just will not suffice.
This is kinda possible
Beyond your actual question:
When we relax the requirement that we will always use 2 bits
and have sufficiently strong assumptions
on the probability distribution of the values,
we can get the expected value of the number of bits down.
Ideas like this are used all over the place in the linked article.
Example
Let c be some integer that is 0 almost all the time (97%, say)
and can take any value the rest of the time (3%).
Then we can take one bit to say whether “c is zero”
and need no further bits most of the time.
In the cases where c is not zero,
we spend another 32 bits to encode it regularly.
In total we need 0.97*1+0.03*(1+32) = 1.96 bits on average.
But we need 33 bits sometimes,
which makes this compatible with my earlier assertion of impossibility.
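The arithmetic of this example can be checked with a small sketch. The function names are invented for illustration, and the one flag bit plus 32 payload bits are exactly the assumptions stated above.

```cpp
#include <cstdint>

// Bits spent on one value under the "is it zero?" flag scheme:
// 1 flag bit for zero, otherwise the flag bit plus a full 32-bit payload.
constexpr unsigned bits_used(std::uint32_t c) {
    return c == 0 ? 1u : 33u;
}

// Expected number of bits per value when c is zero with probability p_zero.
constexpr double expected_bits(double p_zero) {
    return p_zero * 1.0 + (1.0 - p_zero) * 33.0;
}
```

expected_bits(0.97) comes out at about 1.96, while bits_used never drops below 1 and reaches 33 for any non-zero value.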
This is complicated
Depending on your background (in math, bit-fiddling etc.) it might just seem like an enormous, unknowable piece of black magic.
(It isn't. You can learn this stuff.)
You do not seem completely lost, and you seem to be a quick learner,
but I agree with Remy Lebeau
that you seem to be out of your depth.
Do you really need to do this?
Or are you optimizing prematurely?
If it runs well enough, let it run.
Concentrate on the important stuff.
I am currently working on a programming assignment in which I have to mask only a certain index of the whole 32-bit number(EX: If I take 8 4-bit numbers into my 32-bit integer, I would have 8 indices with 4 bits in each). I want to be able to print only part of the bits out of the whole 32 bits, which can be done with masking. If the bits were to only be in one place, I would not have a problem, for I would just create a mask that puts the 1s in a set place(EX: 00000000 00000000 00000000 00000001). However, I need to be able to shift the mask throughout only one index to print each bit (EX: I want to loop through the first index with my 0001 mask, shifting the 1 left every time, but I do not want to continue after the third bit of that index). I know that I need a loop to accomplish this; however, I am having a difficult time wrapping my head around how to complete this part of my assignment. Any tips, suggestions, or corrections would be appreciated. Also, I'm sorry if this was difficult to understand, but I could not find a better way to word it. Thanks.
First of all, about representation. You need binary numbers to represent bits and masks. C and C++ had no binary literals before C++14, so until then you had to use hexadecimal or octal to represent your binary values, i.e.
0000 1111 == 0x0F
1111 1010 == 0xFA
Since C++14 you can use
0b00001111;
Now, if you shift your binary mask left or right, you get the following results:
00001111 (0x0F) << 2 ==> 00111100 (0x3C)
00001111 (0x0F) >> 2 ==> 00000011 (0x03)
Now, suppose you have a number in which you are interested in bits 4 to 7 (4 bits)
int bad = 0x0BAD; // == 0000 1011 1010 1101
you can create a mask as
int mask = 0x00F0; // == 0000 0000 1111 0000
and do a bitwise AND
int result = bad & mask; // ==> 0000 0000 1010 0000 (0x00A0)
This masks 4 bits in the middle of the word, but it will print as 0xA0, which is probably not what you expect. To print it as 0xA you need to shift the result 4 bits right: result >> 4. I prefer doing it in a slightly different order, shifting 'bad' first and then masking:
int result = (bad >> 4) & 0xF;
I hope the above will help you to understand bits.
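Applied to the original question (eight 4-bit fields in a 32-bit integer), the shift-then-mask idea could look like the sketch below. The function names and the convention that index 0 is the lowest four bits are assumptions made for this example.

```cpp
#include <cstdint>
#include <string>

// Extract the 4-bit field at the given index (0 = lowest four bits, 7 = highest).
constexpr unsigned nibble(std::uint32_t value, unsigned index) {
    return (value >> (4u * index)) & 0xFu;
}

// Walk a single-bit mask across one field only, most significant bit first,
// and collect the bits as text.
inline std::string nibble_bits(std::uint32_t value, unsigned index) {
    std::string out;
    for (int bit = 3; bit >= 0; --bit) {
        const std::uint32_t mask = std::uint32_t{1} << (4u * index + static_cast<unsigned>(bit));
        out += (value & mask) ? '1' : '0';
    }
    return out;
}
```

For 0x0BAD, nibble(0x0BAD, 1) is 0xA and nibble_bits(0x0BAD, 1) is "1010"; the loop never touches bits outside the chosen field.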
Wildcard masks are commonly used in networking.
Wildcard masks typically have "wildcard" bits, meaning that the bit can be either a 0 or a 1.
This binary wildcard mask (where the x's represent the wild-card bit)
10xx
covers all these values:
1000
1001
1010
1011
Is there an efficient way of adding/subtracting bit masks?
For example...
x011 + 0111 + xx01 + xxx0 + 1111 = xxxx
There are several common ways to represent bitmasks with wildcards. Here is how to compute the "join" (the union of the sets represented by the inputs, then "rounded up" to the strictest mask that encompasses at least that set) for each of them.
Known/value
Consists of a pair of masks, known, value (k,v for short), where known has a 1 iff a bit has a fixed value, 0 for a wildcard. value has the values of non-wildcard bits, for wildcard bits the value is not relevant by itself but it simplifies the math if you choose it 0.
The representations of the masks from the example would be
mask   known   value
x011   0111    0011
0111   1111    0111
xx01   0011    0001
xxx0   0001    0000
1111   1111    1111
The join of two of them, (kr, vr) = (ka, va) ⋁ (kb, vb) is
kr = ka & kb & ~(va ^ vb) // known if known in both inputs and same value
vr = va & kr // value is the same as in either input, with wildcards normalized to 0
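As a sketch, the known/value join could be written like this for 4-bit masks stored in a byte. The struct and function names are assumptions made for this example.

```cpp
#include <cstdint>

// A wildcard mask as a (known, value) pair: known has a 1 for fixed bits,
// value holds the fixed bits and is 0 at wildcard positions.
struct KnownValue {
    std::uint8_t known;
    std::uint8_t value;
};

// Join: a bit stays known only if it is known in both inputs with the same value.
inline KnownValue join(KnownValue a, KnownValue b) {
    const std::uint8_t kr =
        static_cast<std::uint8_t>(a.known & b.known & ~(a.value ^ b.value));
    const std::uint8_t vr = static_cast<std::uint8_t>(a.value & kr);
    return KnownValue{kr, vr};
}
```

Joining x011 with 0111 gives known 0011, value 0011 (that is, xx11), and folding in the remaining three masks from the example ends with known 0000, i.e. xxxx.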
Z,O
Confusing name, but it's a pair of masks where Z (zero) has a 1 iff the bit can be 0 (so it's either 0 or a wildcard) and O (one) has a 1 iff the bit can be 1 (so it's either 1 or a wildcard). Compared to known/value it has some pros and cons:
More symmetric. Computations for Z and O are usually either the same or "dual", whereas computations for known and value are fundamentally different.
Can represent the empty set. Whether this is a pro or a con depends on what you're doing. When a bit is 0 in both Z and O, that means the bit cannot have any value.
Usually the math is more efficient, OTOH it's often harder to think about. The join is easy though.
The representations of the masks from the example would be
mask   Z      O
x011   1100   1011
0111   1000   0111
xx01   1110   1101
xxx0   1111   1110
1111   0000   1111
The join of two of them, (zr, or) = (za, oa) ⋁ (zb, ob) is
zr = za | zb
or = oa | ob
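The same join in the Z,O representation, again as a sketch with invented names and 4-bit masks stored in a byte:

```cpp
#include <cstdint>

// A wildcard mask as a (Z, O) pair: z has a 1 where the bit may be 0,
// o has a 1 where the bit may be 1. A wildcard bit has a 1 in both.
struct ZeroOne {
    std::uint8_t z;
    std::uint8_t o;
};

// Join: a bit may be 0 (or 1) in the result if it may be 0 (or 1) in either input.
inline ZeroOne join_zo(ZeroOne a, ZeroOne b) {
    return ZeroOne{static_cast<std::uint8_t>(a.z | b.z),
                   static_cast<std::uint8_t>(a.o | b.o)};
}
```

Joining x011 with 0111 gives Z = 1100, O = 1111, which reads as xx11, and joining all five example masks gives Z = O = 1111, i.e. xxxx.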
I've seen the operators >> and << in various code that I've looked at (none of which I actually understood), but I'm just wondering what they actually do and what some practical uses of them are.
If the shifts are like x * 2 and x / 2, what is the real difference from actually using the * and / operators? Is there a performance difference?
Here is an applet where you can exercise some bit-operations, including shifting.
You have a collection of bits, and you move some of them beyond their bounds:
1111 1110 << 2
1111 1000
It is filled from the right with fresh zeros. :)
0001 1111 >> 3
0000 0011
Filled from the left. A special case is a leading 1. It often indicates a negative value, depending on the language and datatype. So it is often desirable that the first bit stays as it is when you shift right.
1100 1100 >> 1
1110 0110
And it is conserved over multiple shifts:
1100 1100 >> 2
1111 0011
If you don't want the first bit to be preserved, you use (in Java, Scala and JavaScript, and maybe more) a triple-sign operator. C and C++ have no >>> operator; there you get the same effect by applying >> to an unsigned value.
1100 1100 >>> 1
0110 0110
There isn't any equivalent in the other direction, because it doesn't make any sense - maybe in your very special context, but not in general.
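In C++ the choice between the two right shifts is made by the operand type rather than by a separate operator. The sketch below uses invented function names. Right-shifting a negative signed value was implementation-defined before C++20, but mainstream compilers keep the sign bit, and the comments assume that behaviour.

```cpp
#include <cstdint>

// Signed operand: the sign bit is copied in from the left (arithmetic shift).
inline std::int8_t shift_keep_sign(std::int8_t v, unsigned n) {
    return static_cast<std::int8_t>(v >> n);
}

// Unsigned operand: zeros are shifted in from the left (logical shift).
inline std::uint8_t shift_zero_fill(std::uint8_t v, unsigned n) {
    return static_cast<std::uint8_t>(v >> n);
}
```

The bit pattern 1100 1100 is -52 as a signed byte, so shift_keep_sign(-52, 1) is -26 (1110 0110), while shift_zero_fill(0xCC, 1) is 0x66 (0110 0110).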
Mathematically, a left-shift is a *=2, 2 left-shifts is a *=4 and so on. A right-shift is a /= 2 and so on.
Left bit shifting multiplies by a power of two, and right bit shifting divides by a power of two.
For example, x = x * 2; can also be written as x = x << 1; and x = x * 8; can be written as x = x << 3; (since 2 to the power of 3 is 8). Similarly, x = x / 2; is x = x >> 1; and so on.
Left Shift
x = x * 2^value (normal operation)
x << value (bit-wise operation)
x = x * 16 (which is the same as 2^4)
The left shift equivalent would be x = x << 4
Right Shift
x = x / 2^value (normal arithmetic operation)
x >> value (bit-wise operation)
x = x / 8 (which is the same as 2^3)
The right shift equivalent would be x = x >> 3
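A short sketch of where the equivalence holds and where it does not. The function names are made up for this example. For unsigned values the shift and the arithmetic agree exactly; for negative signed values, division rounds towards zero while an arithmetic right shift rounds down (the last function assumes the compiler shifts signed values arithmetically, as mainstream compilers do).

```cpp
// Unsigned: x << n equals x * 2^n as long as no bits are shifted out.
constexpr unsigned times_pow2(unsigned x, unsigned n) { return x << n; }

// Unsigned: x >> n equals x / 2^n.
constexpr unsigned div_pow2(unsigned x, unsigned n) { return x >> n; }

// Signed division truncates towards zero.
constexpr int half_by_division(int x) { return x / 2; }

// An arithmetic right shift rounds towards negative infinity instead.
constexpr int half_by_shift(int x) { return x >> 1; }
```

times_pow2(5, 4) is 80 and div_pow2(80, 3) is 10, but half_by_division(-7) is -3 whereas half_by_shift(-7) is -4.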
Left shift: It is equal to the product of the value which has to be shifted and 2 raised to the power of number of bits to be shifted.
Example:
1 << 3
0000 0001 ---> 1
Shift by 1 bit
0000 0010 ----> 2 which is equal to 1*2^1
Shift By 2 bits
0000 0100 ----> 4 which is equal to 1*2^2
Shift by 3 bits
0000 1000 ----> 8 which is equal to 1*2^3
Right shift: It is equal to quotient of value which has to be shifted by 2 raised to the power of number of bits to be shifted.
Example:
8 >> 3
0000 1000 ---> 8 which is equal to 8/2^0
Shift by 1 bit
0000 0100 ----> 4 which is equal to 8/2^1
Shift By 2 bits
0000 0010 ----> 2 which is equal to 8/2^2
Shift by 3 bits
0000 0001 ----> 1 which is equal to 8/2^3
Left bit shifting to multiply by any power of two.
Right bit shifting to divide by any power of two.
x = x << 5; // Left shift
y = y >> 5; // Right shift
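In C/C++ the same arithmetic can be written as,
#include <math.h>
x = x * pow(2, 5); // pow() works in double; the result is converted back to x's type
y = y / pow(2, 5);
The bit shift operators are generally at least as cheap as the / or * operators, although modern compilers usually turn a multiplication or division by a constant power of two into a shift on their own.
On many architectures, divide (/) and multiply (*) take more than one cycle and more than one register to compute the result, while a bit shift is typically a single-register, single-cycle operation.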
Some examples:
Bit operations for example converting to and from Base64 (which is 6 bits instead of 8)
doing power of 2 operations (1 << 4 equal to 2^4 i.e. 16)
Writing more readable code when working with bits. For example, defining constants using
1 << 4 or 1 << 5 is more readable.
Yes, I think you might find a performance difference, as a bitwise left or right shift is a single O(1) operation regardless of the shift count.
For example, calculating 2 ^ n with a loop:
int value = 1;
int exponent = 0;
while (exponent < n)
{
    value = value * 2; // Equivalent to a machine-level left shift by one bit
    exponent++;
}
Similar code with a bitwise left shift operation would be like:
value = 1 << n;
Moreover, a bitwise operation maps directly onto the machine-level instruction that the microcontroller or processor finally executes for the corresponding arithmetic operation.
Here is an example:
#include <stdio.h>
int main(void)
{
    int rm;
    printf("Enter any number (e.g., 1, 2, 5): ");
    scanf("%d", &rm);
    // e.g. rm = 5 (0101): 5 << 2 appends two zeros, so the value is 10100 (20)
    printf("Left shift value: %d << 2 = %d\n", rm, rm << 2);
    printf("Right shift value: %d >> 2 = %d\n", rm, rm >> 2);
    return 0;
}
I'm using C++ for hardware-based model design with SystemC. SystemC as a C++ extension introduces specific datatypes useful for signal and byte descriptions.
How can I access the first bits of a datatype in general, like:
sc_bv<16> R0;
or access the first four bits of tmp.
int my_array[42];
int tmp = my_array[1];
sc_bv is a bit-vector data type that stores binary sequences. Now I want, e.g., the first four bits of that data type. My background is C# and Java, so I miss some of the OOP and reflection-based API constructs in general. I need to perform conversion on this low-level stuff. Useful introductory stuff would help a lot.
Thanks :),
wishi
For sc_bv, you can use the indexing operator []
For the int, just use normal bitwise operations with constants, e.g. the least significant bit in tmp is tmp & 1
I can't really speak for SystemC (sounds interesting though). In normal C you'd read out the lower four bits with a mask like so:
temp = R0 & 0xf;
and write into only the lower four bits (assuming a 32-bit register, and temp<16) like so:
R0 = (R0 & 0xfffffff0) | temp;
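The two mask operations above as small helpers, assuming a 32-bit unsigned register value. The function names are invented for this sketch, and the new value is masked so that anything above 15 cannot spill into the upper bits.

```cpp
#include <cstdint>

// Read the lower four bits.
constexpr std::uint32_t low_nibble(std::uint32_t r) {
    return r & 0xFu;
}

// Return r with only its lower four bits replaced by v (v is masked to four bits).
constexpr std::uint32_t with_low_nibble(std::uint32_t r, std::uint32_t v) {
    return (r & ~std::uint32_t{0xF}) | (v & 0xFu);
}
```

For example, low_nibble(0x1234ABCD) is 0xD and with_low_nibble(0x1234ABCD, 0x5) is 0x1234ABC5.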
To access the first four bits of tmp (I assume you mean the four highest bits), i.e. to get their values, you use bit masks. So if, for example, you want to know whether the second bit is set, you do the following:
int second_bit = (tmp & 0x40000000) >> 30;
now second_bit is 1 if the bit is set and zero otherwise. The idea behind this is the following:
Imagine tmp is (in binary)
1101 0000 0000 0000 0000 0000 0000 0000
Now you use bitwise AND ( the & ) with the following value
0100 0000 0000 0000 0000 0000 0000 0000 // which is 0x40000000 in hex
ANDing produces a 1 on the given bit if and only if both operands have corresponding bits set (they are both 1). So the result will be:
0100 0000 0000 0000 0000 0000 0000 0000
Then you shift this 30 bits to the right, which makes it be:
0000 0000 0000 0000 0000 0000 0000 0001 // which is 1
Note that if the original value had the tested bit zero, the result would be zero.
This way you can test any bit you like; you just need to provide the correct mask. Note that I assumed here that int is 32 bits wide, which should be true in most cases.
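A general form of the same test, as a sketch with an invented function name. It shifts first and masks afterwards, so one constant works for every bit position, and it takes an unsigned 32-bit value so the top bit needs no special handling.

```cpp
#include <cstdint>

// Return 1 if the given bit (0 = least significant, 31 = most significant)
// is set in value, otherwise 0.
constexpr unsigned test_bit(std::uint32_t value, unsigned bit) {
    return (value >> bit) & 1u;
}
```

With the value from the example (1101 followed by 28 zero bits, 0xD0000000), test_bit(0xD0000000, 30) is 1 and test_bit(0xD0000000, 29) is 0.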
You will have to know a bit more about sc_bv to make sure you get the right information. Also, when you say the "first four bytes" I assume you mean the "first four bits." However, that is misleading as well, because you really need to distinguish between the low-order and high-order bits.
In any event, you use the C bitwise operators for this kind of thing. However, you will need to know the size of the integer values AND the "endian-ness" of the runtime architecture to get that right.
But, if you REALLY want just the first four bits, then you would do something like this...
inline unsigned char
first_4_bits(void const * ptr)
{
return (*reinterpret_cast<unsigned char const *>(ptr) & 0xf0) >> 4;
}
and that will grab the very first 4 bits of what is being pointed at. So, if the first byte pointed to is 0x38, then this function will return the first 4 bits, so the result will be 3.
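To illustrate why endianness matters here, the sketch below contrasts the top four bits of a 32-bit value with the top four bits of its first byte in memory. Both function names are made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Highest four bits of the value itself; independent of byte order.
constexpr unsigned top_nibble(std::uint32_t v) {
    return v >> 28;
}

// Highest four bits of the first byte as stored in memory; which byte of the
// value that is depends on the byte order of the machine.
inline unsigned first_byte_high_nibble(std::uint32_t v) {
    unsigned char first;
    std::memcpy(&first, &v, 1);
    return first >> 4;
}
```

For 0x38000000, top_nibble always returns 3, whereas first_byte_high_nibble returns 3 on a big-endian machine and 0 on a little-endian one.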