Practical Application of Bitwise Operators [duplicate]


This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
practical applications of bitwise operations
I have been programming for several years now and I have always wondered about the practical application of bitwise operators.
In my programming experience, I have not had to utilize the bitwise operators.
When are they most commonly used?
In my programming career, is it necessary for me to learn these?
Thank you.
Amicably,
James

Bitwise operations are frequently used close to the hardware - when packing data, doing compression, or packing multiple booleans into a byte. Bitwise operations map directly to processor instructions, and are often extremely fast.
If you're working with I/O or device interfaces, bitwise operations become very necessary - to separate parts of a bitfield into important data.
Or you could just use it as a fast multiply-by-two. :)

Another fun usage for binary and bit twiddling.
Packing Morse code into a single byte. A . is 0 and a - is 1.
A = .-
A = 00000001xB
// Add a 'start bit'
A = 00000101xB
Shift the bit around 8 times, start playing sounds when you find the start bit.
    +------- Monitor this position
    V
A = 00000101 // Starting off
A = 00001010 // Nothing yet
A = 00010100 // Still nothing
A = 00101000 // Wow, a lot of nothing
A = 01010000 // Our boring life, we do nothing
A = 10100000 // Wow! A start bit! Prep to play sound.
A = 01000000 // Play a short
A = 10000000 // And play a long.
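The walk-through above could be sketched in C++ roughly as follows. The name decodeMorse is made up for illustration, and returning the symbols as text instead of playing sounds is an assumption.

```cpp
#include <cstdint>
#include <string>

// Decode a Morse byte packed as described above: leading zeros, one
// 'start bit' set to 1, then one bit per symbol (0 = dot, 1 = dash).
// Hypothetical helper: returns the symbols as text rather than playing them.
std::string decodeMorse(std::uint8_t packed) {
    std::string symbols;
    bool started = false;
    for (int i = 0; i < 8; ++i) {
        const bool top = (packed & 0x80) != 0;            // monitor the top bit
        packed = static_cast<std::uint8_t>(packed << 1);  // shift left by one
        if (!started) {
            started = top;               // the first 1 seen is the start bit
        } else {
            symbols += top ? '-' : '.';  // every later bit is a symbol
        }
    }
    return symbols;
}
```

With the byte from the example, decodeMorse(0x05) yields ".-".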

I have not needed it lately, but back when I was coding in Pascal I used shifts to multiply or divide whenever the multiplier or divisor was a power of 2.
Color was stored in a byte with the text color in the low 4 bits and the background color in the high 4 bits.
Using c << 4 instead of c * 16, and c >> 4 instead of c / 16, to save or retrieve the background color was many times faster.
And retrieving the text color with c << 4 >> 4 was also faster than c & 15 (bitwise AND) for some reason. Probably register related ;) but that's way over my head too :D
But unless you are doing checksum calculations, compression or encryption, you can probably do without.
Even if you can store bits in an int, many times drivers can optimize things for you anyway, and in C# you can use Flags enums to automatically pack bit flags into byte, word or integer values.
So I would guess that since you have not found a use, you are probably not doing work in an area where they make sense.
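As a rough C++ illustration of the color byte described above (text color in the low 4 bits, background in the high 4 bits); the function names are made up for this sketch:

```cpp
#include <cstdint>

// Pack a 4-bit text color (low nibble) and a 4-bit background color
// (high nibble) into one byte. Names are hypothetical.
constexpr std::uint8_t packColor(std::uint8_t text, std::uint8_t background) {
    return static_cast<std::uint8_t>(((background & 0x0F) << 4) | (text & 0x0F));
}

// Low nibble: mask off the upper four bits.
constexpr std::uint8_t textColor(std::uint8_t c) {
    return static_cast<std::uint8_t>(c & 0x0F);
}

// High nibble: shift it down into the low four bits.
constexpr std::uint8_t backgroundColor(std::uint8_t c) {
    return static_cast<std::uint8_t>(c >> 4);
}
```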

Related

Writing binary data in c++

I am in the process of building an assembler for a rather unusual machine that me and a few other people are building. This machine takes 18 bit instructions, and I am writing the assembler in C++.
I have collected all of the instructions into a vector of 32 bit unsigned integers, none of which is any larger than what can be represented with an 18 bit unsigned number.
However, there does not appear to be any way (as far as I can tell) to output such an unusual number of bits to a binary file in C++. Can anyone help me with this?
(I would also be willing to use C's stdio and File structures. However there still does not appear to be any way to output such an arbitrary amount of bits).
Thank you for your help.
Edit: It looks like I didn't specify how the instructions will be stored in memory well enough.
Instructions are contiguous in memory. Say the instructions start at location 0 in memory:
The first instruction will be at 0. The second instruction will be at 18, the third instruction will be at 36, and so on.
There are no gaps and no padding between the instructions. There can be a few superfluous 0s at the end of the program if needed.
The machine uses big endian instructions. So an instruction stored as 3 should map to: 000000000000000011

Keep an eight-bit accumulator.
Shift bits from the current instruction into the accumulator until either:
    The accumulator is full; or
    No bits remain of the current instruction.
Whenever the accumulator is full:
    Write its contents to the file and clear it.
Whenever no bits remain of the current instruction:
    Move to the next instruction.
When no instructions remain:
    Shift zeros into the accumulator until it is full.
    Write its contents.
    End.
For n instructions, this will leave (8 - 18n mod 8) mod 8 zero bits after the last instruction, provided the final padding step is skipped when the accumulator is already empty.
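A minimal sketch of the accumulator approach above, assuming instructions are emitted most significant bit first and that the final byte is padded only when it is partly filled. The name packInstructions is made up for illustration.

```cpp
#include <cstdint>
#include <vector>

// Pack the low 18 bits of each instruction, MSB first, into a byte stream.
// Hypothetical helper; the last byte is zero-padded if it is only partly full.
std::vector<std::uint8_t> packInstructions(const std::vector<std::uint32_t>& instructions) {
    constexpr int kBitsPerInstruction = 18;
    std::vector<std::uint8_t> out;
    std::uint8_t accumulator = 0;
    int bitsInAccumulator = 0;

    for (std::uint32_t instruction : instructions) {
        for (int bit = kBitsPerInstruction - 1; bit >= 0; --bit) {
            // Shift the next instruction bit into the accumulator.
            accumulator = static_cast<std::uint8_t>(
                (accumulator << 1) | ((instruction >> bit) & 1u));
            if (++bitsInAccumulator == 8) {  // accumulator full: emit and clear
                out.push_back(accumulator);
                accumulator = 0;
                bitsInAccumulator = 0;
            }
        }
    }
    if (bitsInAccumulator > 0) {  // pad the final partial byte with zeros
        out.push_back(static_cast<std::uint8_t>(accumulator << (8 - bitsInAccumulator)));
    }
    return out;
}
```

The resulting bytes could then be written with a std::ofstream opened in std::ios::binary mode.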

There are a lot of ways you can achieve the same end result (I am assuming the end result is a tight packing of these 18 bits).
A simple method would be to create a bit-packer class that accepts the 32-bit words, and generates a buffer that packs the 18-bit words from each entry. The class would need to do some bit shifting, but I don't expect it to be particularly difficult. The last byte can have a few zero bits at the end if the original vector length is not a multiple of 4. Once you give all your words to this class, you can get a packed data buffer, and write it to a file.

You could maybe represent your data in a bitset and then write the bitset to a file.
Wouldn't work with fstreams write function, but there is a way that is described here...

The short answer: Your C++ program should output the 18-bit values in the format expected by your unusual machine.
We need more information, specifically the format that your "unusual machine" expects, or more precisely, the format that your assembler should be outputting. Once you understand the format of the output you're generating, the answer should be straightforward.
One possible format — I'm making things up here — is that we could take two of your 18-bit instructions:
         instruction 1      instruction 2      ...
         MSB            LSB MSB            LSB ...
bits →   ABCDEFGHIJKLMNOPQR abcdefghijklmnopqr ...
...and write them in an 8-bits/byte file thus:
KLMNOPQR CDEFGHIJ 000000AB klmnopqr cdefghij 000000ab ...
...this is basically arranging the values in "little-endian" form, with 6 zero bits padding the 18-bit values out to 24 bits.
But I'm assuming: the padding, the little-endianness, the number of bits / byte, etc. Without more information, it's hard to say if this answer is even remotely near correct, or if it is exactly what you want.
Another possibility is a tight packing:
ABCDEFGH IJKLMNOP QRabcdef ghijklmn opqr0000
or
ABCDEFGH IJKLMNOP abcdefQR ghijklmn 0000opqr
...but I've made assumptions about where the corner cases go here.

Just output them to the file as 32 bit unsigned integers, just as you have in memory, with the endianness that you prefer.
Then, in the loader, EEPROM writer, JTAG tool or whatever you use to send the code to the machine, drop the 14 most significant bits of each 32-bit word that is read and send only the real 18 bits to the target.
Unless, of course, you have written a FAT driver for your machine...

Weird usage of "&" for a novice C++ programmer

I have some code here, and don't really understand the ">>" and the "&". Can someone clarify?
buttons[0] = indata[byteindex]&1;
buttons[1] = (indata[byteindex]>>1)&1;
rawaxes[7] = (indata[byteindex]>>4)&0xf;

These are bitwise operators, meaning they operate on the binary bits that make up a value. See Bitwise operation on Wikipedia for more detail.
& is for AND
If indata[byteindex] is the number 4, then in binary it would look like 00000100. ANDing this number with 1 gives 0, because the lowest bit (bit 0) is not set:
00000100 AND 00000001 = 0
If the value is 5 however, then you will get this:
00000101 AND 00000001 = 1
Any bit matched with the mask is allowed through.
>> is for right-shifting
Right-shifting shifts bits along to the right!
00010000 >> 4 = 00000001

One of the standard patterns for extracting a bit field is (reg >> offset) & mask, where reg is the register (or other memory location) you're reading, offset is how many least-significant bits you skip over, and mask is the set of bits that matter. The >> offset step can be omitted if offset is 0. mask is usually equal to 2^width - 1, or (1 << width) - 1 in C, where width is the number of bits in the field.
So, looking at what you have:
buttons[0] = indata[byteindex]&1;
Here, offset is 0 (it was omitted) and mask is 1. So this gets just the least-significant bit in indata[byteindex]:
bit number ->      7 6 5 4 3 2 1 0
                  +-+-+-+-+-+-+-+-+
indata[byteindex] | | | | | | | |*|
                  +-+-+-+-+-+-+-+-+
                                 |
                                 \----> buttons[0]
Next:
buttons[1] = (indata[byteindex]>>1)&1;
Here, offset is 1 and width is 1...
bit number ->      7 6 5 4 3 2 1 0
                  +-+-+-+-+-+-+-+-+
indata[byteindex] | | | | | | |*| |
                  +-+-+-+-+-+-+-+-+
                               |
                               \------> buttons[1]
And, finally:
rawaxes[7] = (indata[byteindex]>>4)&0xf;
Here, offset is 4 and width is 4 (2^4 - 1 = 16 - 1 = 15 = 0xf):
bit number ->      7 6 5 4 3 2 1 0
                  +-+-+-+-+-+-+-+-+
indata[byteindex] |*|*|*|*| | | | |
                  +-+-+-+-+-+-+-+-+
                   | | | |
                   \--v--/
                      |
                      \---------------> rawaxes[7]
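The (reg >> offset) & mask pattern can be wrapped in a small helper. This is only a sketch: extractField is a made-up name, and it assumes a 32-bit register with 0 < width < 32 and offset < 32.

```cpp
#include <cstdint>

// Extract `width` bits starting at bit `offset` (bit 0 = least significant).
// Hypothetical helper; assumes 0 < width < 32 and offset < 32.
constexpr std::uint32_t extractField(std::uint32_t reg, unsigned offset, unsigned width) {
    return (reg >> offset) & ((1u << width) - 1u);  // mask is 2^width - 1
}
```

The three lines from the question then correspond to extractField(b, 0, 1), extractField(b, 1, 1) and extractField(b, 4, 4), where b stands for indata[byteindex].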
EDIT...
but I don't understand what the point of it is...
Mike pulls up a rocking chair and sits down.
Back in the old days of 8-bit CPUs, a computer typically had 64K (65 536 bytes) of address space. Now we wanted to do as much as we could with our fancy whiz-bang machines, so we would do things like buy 64K of RAM and map everything to RAM. Shazam, 64K of RAM and bragging rights all around.
But a computer that can only access RAM isn't much good. It needs some ROM for an OS (or at least a BIOS), and some addresses for I/O. (You in the back--siddown. I know Intel chips had separate address space for I/O, but it doesn't help here because the I/O space was much, much smaller than the memory space, so you ran into the same constraints.)
Address space used for ROM and I/O was space that wasn't accessible as RAM, so you wanted to minimize how much space wasn't used for RAM. So, for example, when your I/O peripheral had five different things whose status amounted to a single bit each, rather than give each one of those bits its own byte (and, hence, address), they got the brilliant idea of packing all five of those bits into one byte, leaving three bits that did nothing. Voila, the Interrupt Status Register was born.
The hardware designers were also impressed with how fewer addresses resulted in fewer address bits (since address bits is ceiling of log-base-2 of number of addresses), meaning fewer address pins on the chip, freeing pins for other purposes. (These were the days when 48-pin chips were considered large, and 64-pins huge, and grid array packages were out of the question because multi-layer circuit boards were prohibitively expensive. These were also the days before multiplexing the address and data on the same pins became commonplace.)
So the chips were taped out and fabricated, and hardware was built, and then it fell to the programmers to make the hardware work. And lo, the programmers said, "WTF? I just want to know if there is a byte to read in the bloody serial port, but there are all these other bits like "receiver overrun" in the way." And the hardware guys considered this, and said, "tough cookies, deal with it."
So the programmers went to the Guru, the guy who hadn't forgotten his Boolean algebra and was happy not to be writing COBOL. And the Guru said, "use the Bit AND operation to force those bits you don't care about to 0. If you need a number, and not just a zero-or-nonzero, use a logical shift right (LSR) on the result." And they tried it. It worked, and there was much rejoicing, though the wiser ones started wondering about things like race conditions in a read-modify-write cycle, but that's a story for another time.
And so the technique of packing loosely or completely unrelated bits into registers became commonplace. People developing protocols, which always want to use fewer bits, jumped on these techniques as well. And so, even today, with our gigabytes of RAM and gigabits of bandwidth, we still pack and unpack bitfields with expressions whose legibility borders on keyboard head banging.
(Yes, I know bit fields probably go back to the ENIAC, and maybe even the Difference Engine if Lady Ada needed to stuff two data elements into one register, but I haven't been alive that long, okay? I'm sticking with what I know.)
(Note to hardware designers out there: There really isn't much justification anymore for packing things like status flags and control bits that a driver writer will want to use independently. I've done several designs with one bit per 32-bit register in many cases. No bit shifting or masking, no races, driver code is simpler to write and understand, and the address decode logic is trivially more complex. If the driver software is complex, simplifying flag and bitfield handling can save you a lot of ROM and CPU cycles.)
(More random trivia: The Atmel AVR architecture (used in the Arduino, among many other places) has some specialized bit-set and bit-clear instructions. The avr-libc library used to provide macros for these instructions, but now the gcc compiler is smart enough to recognize that reg |= (1 << bitNum); is a bit set and reg &= ~(1 << bitNum); is a bit clear, and puts in the proper instruction. I'm sure other architectures have similar optimizations.)

These are bitwise operators.
& ands two arguments bit by bit.
'>>' shifts first argument's bit string to the right by second argument.
'<<' does the opposite. | is bitwise or and ^ is bitwise xor just like & is bitwise and.

In English, the first line grabs only the lowest bit (bit 0) of indata[byteindex]. Basically, if the value is odd, the result is 1; if even, it is 0.
The second line grabs the second bit (bit 1). If that bit is set, it returns 1, else 0. It could also have been written as
buttons[1] = (indata[byteindex]&2)>>1;
and it would have done the same thing.
The last (3rd) line grabs the 5th through 8th bits (bits 4-7). Basically, the result will be a number from 0 to 15. It could also have been written as
rawaxes[7] = (indata[byteindex]&0xf0) >> 4;
and done the same thing. I'd also guess from context that these arrays are unsigned char arrays. Just a guess though.

The '&' (in this case) is a bitwise AND operator and ">>" is the bit-shift operator (so x>>y yields x shifted right Y bits).
So, they're taking the least significant bit of indata[byteindex] and putting it into buttons[0]. They're taking the next least significant bit and putting it into buttons[1].
The last one probably needs to be looked at in binary to make a lot of sense. 0xf is 1111 in binary, so they're taking the input, shifting it right 4 bits, then retaining the 4 least significant bits of that result.

Is there any advantage to using '<< 1' instead of '* 2'?

I've seen this a couple of times, but it seems to me that using the bitwise shift left hinders readability. Why is it used? Is it faster than just multiplying by 2?

You should use * when you are multiplying, and << when you are bit shifting. They are mathematically equivalent, but have different semantic meanings. If you are building a flag field, for example, use bit shifting. If you are calculating a total, use multiplication.

It is faster on old compilers that don't optimize the * 2 calls by emitting a left shift instruction. That optimization is really easy to detect and any decent compiler already does.
If it affects readability, then don't use it. Always write your code in the most clear and concise fashion first, then if you have speed problems go back and profile and do hand optimizations.

It's used when you're concerned with the individual bits of the data you're working with. For example, if you want to set the upper byte of a word to 0x9A, you would not write
n |= 0x9A * 256
You'd write:
n |= 0x9A << 8
This makes it clearer that you're working with bits, rather than the data they represent.

For some architectures, bit shifting is faster than multiplying. However, any compiler worth its salt will optimize *2 (or any multiplication by a power of 2) to a left bit shift (when a bit shift would be faster).

For readability of values used as bitfields:
enum Flags { UP      = (1<<0),
             DOWN    = (1<<1),
             STRANGE = (1<<2),
             CHARM   = (1<<3),
             ...
which I think is preferable to either '=1, =2, =4, ...' or '=1, =2, =2*2, =2*2*2, ...', especially if you have 8+ flags.
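A possible way to use such shift-defined flags, sketched with the four names from the enum above. The helper functions setFlag, clearFlag and hasFlag are made up for illustration.

```cpp
// Flag values built with shifts; each enumerator names exactly one bit.
enum Flags : unsigned {
    UP      = 1u << 0,
    DOWN    = 1u << 1,
    STRANGE = 1u << 2,
    CHARM   = 1u << 3
};

// Hypothetical helpers operating on a plain unsigned flag word.
constexpr unsigned setFlag(unsigned value, Flags f) {
    return value | static_cast<unsigned>(f);    // OR turns the bit on
}
constexpr unsigned clearFlag(unsigned value, Flags f) {
    return value & ~static_cast<unsigned>(f);   // AND with the inverse turns it off
}
constexpr bool hasFlag(unsigned value, Flags f) {
    return (value & static_cast<unsigned>(f)) != 0;
}
```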

If you are using an old C compiler, it is preferable to use the bitwise shift. For readability you can comment your code, though.

How do you use bitwise flags in C++?

As per this website, I wish to represent a Maze with a 2 dimensional array of 16 bit integers.
Each 16 bit integer needs to hold the following information:
Here's one way to do it (this is by no means the only way): a 12x16 maze grid can be represented as an array m[16][12] of 16-bit integers. Each array element would contain all the information for a single corresponding cell in the grid, with the integer bits mapped like this:
(source: mazeworks.com)
To knock down a wall, set a border, or create a particular path, all we need to do is flip bits in one or two array elements.
How do I use bitwise flags on 16-bit integers so I can set each one of those bits and check whether they are set?
I'd like to do it in an easily readable way (i.e., Border.W, Border.E, Walls.N, etc.).
How is this generally done in C++? Do I use hexadecimal to represent each one (i.e., Walls.N = 0x02, Walls.E = 0x04, etc.)? Should I use an enum?
See also How do you set, clear, and toggle a single bit?.

If you want to use bitfields then this is an easy way:
// 16 one-bit flags; how they are laid out in memory is compiler-dependent.
struct MAZENODE
{
    bool backtrack_north : 1;
    bool backtrack_south : 1;
    bool backtrack_east : 1;
    bool backtrack_west : 1;
    bool solution_north : 1;
    bool solution_south : 1;
    bool solution_east : 1;
    bool solution_west : 1;
    bool maze_north : 1;
    bool maze_south : 1;
    bool maze_east : 1;
    bool maze_west : 1;
    bool walls_north : 1;
    bool walls_south : 1;
    bool walls_east : 1;
    bool walls_west : 1;
};
Then your code can just test each one for true or false.

Use std::bitset

Use hex constants/enums and bitwise operations if you care about which particular bits mean what.
Otherwise, use C++ bitfields (but be aware that the ordering of bits in the integer will be compiler-dependent).

Learn your bitwise operators: &, |, ^, and ~.
At the top of a lot of C/C++ files I have seen flags defined in hex to mask each bit.
#define ONE 0x0001
To see if a bit is turned on, you AND the value with a mask that has a 1 in that position. To turn it on, you OR the value with that mask. To toggle it like a switch, you XOR the value with that mask.

To manipulate sets of bits, you can also use ....
std::bitset<N>
std::bitset<4*4> bits;
bits[ 10 ] = false;
bits.set(10);
bits.flip();
assert( !bits.test(10) );

You can do it with hexadecimal flags or enums as you suggested, but the most readable/self-documenting is probably to use what are called "bitfields" (for details, Google for C++ bitfields).

Yes, a good way is to use hexadecimal to represent the bit patterns. Then you use the bitwise operators to manipulate your 16-bit ints.
For example:
if(x & 0x01){} // tests if bit 0 is set using bitwise AND
x ^= 0x02; // toggles bit 1 (0 based) using bitwise XOR
x |= 0x10; // sets bit 4 (0 based) using bitwise OR
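For the maze cells, named constants can make the same operations readable. The bit positions below are an ASSUMPTION for illustration only, since the original bit map is in an image that is not reproduced here; the helper names are also made up. Namespaces give Walls::N rather than Walls.N.

```cpp
#include <cstdint>

// ASSUMED bit layout, for illustration only (not taken from the original map).
namespace Walls {
    constexpr std::uint16_t N = 0x0001;
    constexpr std::uint16_t E = 0x0002;
    constexpr std::uint16_t S = 0x0004;
    constexpr std::uint16_t W = 0x0008;
}
namespace Border {
    constexpr std::uint16_t N = 0x0010;
    constexpr std::uint16_t E = 0x0020;
    constexpr std::uint16_t S = 0x0040;
    constexpr std::uint16_t W = 0x0080;
}

// Hypothetical helpers for one 16-bit maze cell.
inline void setBits(std::uint16_t& cell, std::uint16_t mask) {
    cell = static_cast<std::uint16_t>(cell | mask);
}
inline void clearBits(std::uint16_t& cell, std::uint16_t mask) {
    cell = static_cast<std::uint16_t>(cell & ~mask);
}
inline bool anySet(std::uint16_t cell, std::uint16_t mask) {
    return (cell & mask) != 0;
}
```

Knocking down the north wall of a cell would then read clearBits(m[row][col], Walls::N), where m is the maze array from the question.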

I'm not a huge fan of bitset. It's just more typing in my opinion. And it doesn't hide what you are doing anyway. You still have to & and | bits, unless you are picking on just 1 bit. That may work for small groups of flags. Not that we need to hide what we are doing either. But the intention of a class is usually to make something easier for its users. I don't think this class accomplishes that.
Say, for instance, you have a flag system with 64 flags. If you want to test, say, 39 of them in one if statement to see if they are all on, using bitfields is a huge pain. You have to type them all out. Of course, I'm assuming you use only the bitfield functionality and don't mix and match methods. Same thing with bitset. Unless I am missing something with the class, which is quite possible since I rarely use it, I don't see a way you can test all 39 flags unless you type out the whole thing or resort to "standard methods" (using enum flag lists or some defined value for 39 bits and using the bitset's & operator). This can start to get messy depending on your approach. And I know, 64 flags sounds like a lot. And well, it is, depending on what you are doing. Personally speaking, most of the projects I'm involved with depend on flag systems, so 64 is not that unheard of, though 16~32 is far more common in my experience. I'm actually helping out in a project right now where one flag system has 640 bits. It's basically a privilege system, so it makes some sense to arrange them all together. Admittedly, I would like to break that up a bit, but I'm helping, not creating.

What is the fastest way to get the 4 least significant bits in a byte (C++)?

I'm talking about this:
If we have the letter 'M', which is 77 in decimal and 4D in hex.
I am looking for the fastest way to get D.
I thought about two ways:
Given x is a byte.
x << 4; x >> 4
x %= 16
Any other ways? Which one is faster?

Brevity is nice - explanations are better :)
x &= 0x0f
is, of course, the right answer. It exactly expresses the intent of what you're trying to achieve, and on any sane architecture will always compile down to the minimum number of instructions (i.e. 1). Do use hex rather than decimal whenever you put constants in a bit-wise operator.
x <<= 4; x >>= 4
will only work if your 'byte' is a proper unsigned type. If it is actually a signed char, the second operation might cause sign extension (i.e. your original bit 3 would then appear in bits 4-7 too).
Without optimization this will of course take 2 instructions, but with GCC on OS X, even -O1 will reduce this to the first answer.
x %= 16
even without the optimizer enabled your compiler will almost certainly do the right thing here and turn that expensive div/mod operation into the first answer. However it can only do that for powers of two, and this paradigm doesn't make it quite so obvious what you're trying to achieve.
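To make the comparison concrete, here are the three variants side by side for an unsigned 8-bit value. The function names are made up for this sketch; all three return the same result for every byte value.

```cpp
#include <cstdint>

// Mask: keep only the low four bits.
constexpr std::uint8_t lowNibbleAnd(std::uint8_t x) {
    return static_cast<std::uint8_t>(x & 0x0F);
}

// Modulo by a power of two: same result for unsigned values.
constexpr std::uint8_t lowNibbleMod(std::uint8_t x) {
    return static_cast<std::uint8_t>(x % 16);
}

// Shift up and back down. x is promoted to int for the shift, so the
// intermediate must be truncated to 8 bits before shifting right.
constexpr std::uint8_t lowNibbleShift(std::uint8_t x) {
    return static_cast<std::uint8_t>(static_cast<std::uint8_t>(x << 4) >> 4);
}
```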

I always use x &= 0x0f

There are many good answers and some of them are technically the right ones.
On a broader scale, one should understand that C/C++ is not an assembler. The programmer's job is to tell the compiler what you intend to achieve. The compiler will pick the best way to do it depending on the architecture and various optimization flags.
x &= 0x0F; is the most clear way to tell the compiler what you want to achieve. If shifting up and down is faster on some architecture, it is the compiler's job to know it and do the right thing.

Single AND operation can do it.
x = (x & 0x0F);

It will depend on the architecture to some extent - shifting up and back down on an ARM is probably the fastest way - however the compiler should do that for you. In fact, all of the suggested methods will probably be optimized to the same code by the compiler.

x = x & 15
