bit shifting - replacing a section of a bitset with a new number - c++

I have a list of numbers encoded as a boost dynamic bitset. I dynamically choose the size of this bitset depending on the maximum value any number in this list can take. So let's say I have numbers from just 0 to 7, I only need three bits and my string 0,2,7 will be encoded as
000010111.
I now need to change say the 2nd number in this list (2) to another number, say 4.
I thought the most efficient way to do this would be to represent 4 as a dynamic bitset of the same length as the list but with all other values set to 1, so 111111011. I would then bitshift this by the required amount, with 1s used to fill in the vacated positions, to get 111011111, and then just bitwise AND this with the original bitset to get my desired result.
However, I cannot find a way to do these two things, as it seems that both when initialising a bitset from an integer and when bit shifting, the default and fill-in values are always set to 0, not 1. How can I get around this problem, or achieve my goal in a different and efficient way?
Thanks

If that is really the implementation, the most general and efficient method I can think of would be to first mask off all the bits for the part you are replacing:
value &= 111000111;
Then "or" in the actual bits for that position:
value |= 000100000;
Hopefully someone here has a better trick for me to learn, but that's what I do.
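The mask-then-OR steps above can be sketched as follows. This is only an illustration under an assumption that is not in the question: the whole list fits in a 64-bit integer, and replace_field is a made-up name. boost::dynamic_bitset offers the same &=, |=, << and ~ operators, so the same two steps should carry over to it.

```cpp
#include <cassert>
#include <cstdint>

// Replace the field at `index` (0 = least significant field) in a packed
// value in which every field is `width` bits wide.
// Assumption: the whole list fits in 64 bits; the name is illustrative.
std::uint64_t replace_field(std::uint64_t packed, unsigned index,
                            unsigned width, std::uint64_t new_value) {
    assert(width > 0 && width < 64 && (index + 1) * width <= 64);
    const std::uint64_t field_mask = (std::uint64_t{1} << width) - 1;
    assert(new_value <= field_mask);
    const unsigned shift = index * width;
    packed &= ~(field_mask << shift);  // step 1: mask off the old field
    packed |= new_value << shift;      // step 2: OR in the new bits
    return packed;
}
```

For the question's example, the list 0,2,7 is 000010111 (23 in decimal); replacing the middle field with 4 gives 000100111 (39).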

XOR the old value and the new value:
int valuetoset = oldvalue ^ newvalue; // 4 XOR 2 in your example
Just shift the value you need to set:
int bitstoset = valuetoset << position; // (4 XOR 2) << 3 in your example
Then XOR bitstoset with your bitset again, and that's it!
int result = bitstoset ^ bitset;
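A minimal sketch of the XOR approach, again assuming the packed list fits in a 64-bit integer (replace_field_xor is a hypothetical name). Note that, unlike the mask-and-OR approach, the caller has to know the field's current value.

```cpp
#include <cassert>
#include <cstdint>

// XOR-based replacement: flips exactly the bits in which the old and new
// field values differ. `shift` is the bit offset of the field.
std::uint64_t replace_field_xor(std::uint64_t packed, unsigned shift,
                                std::uint64_t old_value,
                                std::uint64_t new_value) {
    assert(shift < 64);
    const std::uint64_t diff = old_value ^ new_value;  // e.g. 2 ^ 4 == 6
    return packed ^ (diff << shift);
}
```

With the example from the question, replace_field_xor(23, 3, 2, 4) turns 000010111 into 000100111.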

Would you be able to use a vector of dynamic bitsets? Depending on your needs that might be sufficient and allow for easy updates.
Alternatively, fill your new bitset similarly to how you proposed, but exactly inverted. Then, right before you do the AND at the end, flip all the bits.

I guess your understanding of a bitset is fundamentally wrong:
"set" means it is NOT ordered, and the idea of a bitset is that only one bit is necessary to show whether an element is inside or outside the set.
So your original set 0,2,7 would have 8 bits, because 0..7 are 8 elements, and NOT 3 * 3 (3 bits required to represent 0..7), and the bitmap would look like 10000101.
What you describe is just a "packed" coding of the values. In your coding scheme 0,2,7 and 2,0,7 would be coded completely differently, but in a bitset they are the same.
In a (real) bitset (if that is what you want) you can then really easily "replace" elements by removing the old one and adding the new one. This happens as T.E.D. describes it.
To get the right mask you can easily use shift operations. So imagine you start counting by 0, you get the mask for value x by doing: 1<<x;
So you remove element x from the set by
value &= ~(1<<x);
and add another element x (which might be the same) with
value |= 1<<x;
From your comment, you are using the bitset as packed storage rather than as a set, so the masks must be built differently (and you already had an almost-right idea of how to build them).
The command with the bitmask for removal of the element at position p:
value &= ~(111 << p);
Here p is the bit offset of the element, and 111 (binary) is for the above example, where you need 3 bits per element. If you don't want to hardcode it, you could just take the next power of 2 above the maximum value and subtract 1, and then you have your all-ones string.
And to add, you would just take your suggested bit list that contains only the new element and OR it into your bit list:
value |= new_element_bitlist;
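To make the two readings concrete, here is a rough sketch using plain 32-bit integers. The function names are made up for illustration, and the limit of 32 elements is an assumption of the sketch, not something from the question.

```cpp
#include <cassert>
#include <cstdint>

// Set-style use: bit x is 1 exactly when x is a member of the set.
std::uint32_t set_remove(std::uint32_t set, unsigned x) {
    assert(x < 32);
    return set & ~(std::uint32_t{1} << x);
}

std::uint32_t set_add(std::uint32_t set, unsigned x) {
    assert(x < 32);
    return set | (std::uint32_t{1} << x);
}

// Packed use: the smallest all-ones mask that can hold values 0..max_value,
// i.e. "next power of 2, minus 1" without hardcoding the width.
std::uint32_t field_mask_for(std::uint32_t max_value) {
    std::uint32_t mask = 0;
    while (mask < max_value) mask = (mask << 1) | 1u;
    return mask;
}
```

The set {0,2,7} is 10000101 (133); removing 2 and adding 4 gives 10010001 (145). For values 0..7, field_mask_for(7) yields 111.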

Related

Count the bits set in 1 for binary number in C++

How many bits are set to 1 in a binary number of 15 digits?
I have no idea how to start this one. Any help/hints?
Smells like homework, so I'll be all vague and cryptic. But helpful, since that's what we do here at SO.
First, let's figure out how to check the first bit. Hint: you want to set all other bits of the variable to zero, and check the value of the result. Since all other bits are zero, the value of the variable will be the value of the first bit (zero or one). More hint: to set bits to zero, use the AND operation.
Second, let's move the second bit to the first position. There's an operation in C++ just for that.
Third, rinse and repeat until done. Count them ones as you do so.
EDIT: so in pseudocode, assuming X is the source variable:
CountOfOnes = 0
while X != 0
    Y = the first bit of X (Y becomes either 0 or 1)
    CountOfOnes = CountOfOnes + Y
    X = X right shift 1
Specifically for a C++ implementation, you need to make X an unsigned variable; otherwise, the right shift will act up on you for negative values (it typically copies the sign bit, so X never reaches zero).
Oh, and the << and >> operators are exactly bitwise shifts. In C++, they're sometimes overloaded in classes to mean something else (like I/O), but when acting on integers, they perform bit shifting.
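For readers who are past the homework stage, the pseudocode translates fairly directly into C++. The function name count_ones is just an illustrative choice.

```cpp
// Count the 1 bits of x by testing the lowest bit and shifting right.
// x is unsigned so that the right shift always brings in zeros.
unsigned count_ones(unsigned x) {
    unsigned count = 0;
    while (x != 0) {
        count += x & 1u;  // lowest bit: 0 or 1
        x >>= 1;          // move the next bit into the lowest position
    }
    return count;
}
```

A 15-digit binary number fits comfortably in an unsigned int, so count_ones(0x7FFF) covers the all-ones case.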

Bit count in the following case

I got the following question in an interview. Please give me some ideas on how to solve it, as I have no idea how to proceed.
A non-empty array A of N elements contains octal representation of a non-negative integer K, i.e. each element of A belongs to the interval [0; 7]
Write a function:
int bitcount_in_big_octal(const vector<int> &A);
that returns the number of bits set to 1 in the binary representation of K. The function should return -1 if the number of bits set to 1 exceeds 10,000,000.
Assume that the array can be very large.
Assume that N is an integer within the range [1..100,000].
Is there any time restriction?
I have one idea: first, build the following dictionary: {0->0, 1->1, 2->1, 3->2, 4->1, 5->2, 6->2, 7->3}. Then loop over the array A and sum the 1s in every element using the dictionary.
Iterate over your representation.
For each element in that iteration, convert it to its number of set bits. @meteorgan's answer is a great way to do just that. If you need the representation for something other than bit counts, you'll probably want to convert it to some intermediate form useful for whatever else you'll be doing, e.g. a byte[]: each octal digit in the representation should correspond to a single byte, and since all you're doing is counting bits it doesn't matter that Java's byte is signed. Then, for each byte in the array, count the bits: use an existing bit-counting library, cast the byte to an int and use Integer.bitCount(...), or roll your own.
Add the result to a running total, and exit the loop if you hit your threshold.
That's a Java answer in the details (like the lib I linked), but the algorithm steps are fine for C++; find a replacement library (or use the dictionary answer).
Here's the solution using the indexed (dictionary) based approach.
INDEX = [0, 1, 1, 2, 1, 2, 2, 3]
def bitcount_in_big_octal(A):
    counter = 0
    for octal in A:
        counter += INDEX[octal]
    return counter
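Since the interview question asks for the C++ signature, here is a sketch of the same table lookup in C++, with the -1 threshold from the problem statement added. The table and constant names are my own. Because each octal digit maps to exactly three bits, the order of the digits in A does not affect the count.

```cpp
#include <vector>

// Number of 1 bits in the binary form of the octal number stored in A,
// or -1 if that number exceeds 10,000,000.
int bitcount_in_big_octal(const std::vector<int>& A) {
    static const int kBitsPerDigit[8] = {0, 1, 1, 2, 1, 2, 2, 3};
    const long kLimit = 10000000;
    long total = 0;
    for (int digit : A) {
        total += kBitsPerDigit[digit & 7];  // & 7 guards the table index
        if (total > kLimit) return -1;
    }
    return static_cast<int>(total);
}
```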

huffman encoding

I am trying to implement the huffman algorithm for compression, which requires writing bits of variable length to a file. Is there any way in C++ to write variable length data with 1-bit granularity to a file?
No, the smallest amount of data you can write to a file is one byte.
You can use a bitset to make manipulating bits easier, then use an ofstream to write to file. If you don't want to use bitset, you can use the bitwise operators to manipulate your data before saving it.
The smallest number of bits you can access and save is 8 (1 byte). You can access individual bits in a byte using the bit operators ^, & and |.
You can set n'th bit to 1 using:
my_byte = my_byte | (1 << n);
where n is 0 to 7.
You can set n'th bit to 0 using:
my_byte = my_byte & ~(1 << n);
You can toggle n'th bit using:
my_byte = my_byte ^ (1 << n);
More details here.
klew's answer is probably the one you want, but just to add something to what Bill said, the Boost libraries have a dynamic_bitset that I found helpful in a similar situation.
All the info you need on bit twiddling is here:
How do you set, clear, and toggle a single bit?
But the smallest object that you can put in a file is a byte.
I would use dynamic_bitset and every time the size got bigger than 8 extract the bottom 8 bits into a char and write this to a file, then shift the remaining bits down 8 places (repeat).
No. You will have to pack bytes. Accordingly, you will need a header in your file that specifies how many elements are in your file, because you are likely to have trailing bits that are unused.
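The packing idea that several of these answers describe might look roughly like the sketch below. BitWriter and its member names are invented for illustration, and the choice to fill each byte starting from its most significant bit is an assumption; a real Huffman coder only needs the reader and writer to agree.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Accumulates single bits and emits whole bytes, most significant bit first.
class BitWriter {
public:
    void put_bit(bool bit) {
        current_ = static_cast<std::uint8_t>((current_ << 1) | (bit ? 1 : 0));
        if (++used_ == 8) flush_byte();
    }

    // Write the low `length` bits of `code`, highest of those bits first.
    void put_code(std::uint32_t code, unsigned length) {
        assert(length <= 32);
        for (unsigned i = length; i-- > 0;) put_bit(((code >> i) & 1u) != 0);
    }

    // Pad the last partial byte with zeros; returns the number of padding
    // bits, which is what a file header would need to record.
    unsigned finish() {
        unsigned padding = 0;
        while (used_ != 0) { put_bit(false); ++padding; }
        return padding;
    }

    const std::vector<std::uint8_t>& bytes() const { return bytes_; }

private:
    void flush_byte() { bytes_.push_back(current_); current_ = 0; used_ = 0; }

    std::vector<std::uint8_t> bytes_;
    std::uint8_t current_ = 0;
    unsigned used_ = 0;
};
```

After finish(), the contents of bytes() can be written with std::ofstream::write in binary mode, and the returned padding count (or the total bit count) stored in the header so the decoder knows where the data ends.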

How can I set all bits to '1' in a binary number of an unknown size?

I'm trying to write a function in assembly (but let's assume language agnostic for the question).
How can I use bitwise operators to set all bits of a passed-in number to 1?
I know that I can use the bitwise "or" with a mask containing the bits I wish to set, but I don't know how to construct a mask based on a binary number of N size.
~(x & 0)
x & 0 will always result in 0, and ~ will flip all the bits to 1s.
Set it to 0, then flip all the bits to 1 with a bitwise-NOT.
You're going to find that in assembly language you have to know the size of a "passed in number". And in assembly language it really matters which machine the assembly language is for.
Given that information, you might be asking either
How do I set an integer register to all 1 bits?
or
How do I fill a region in memory with all 1 bits?
To fill a register with all 1 bits, on most machines the efficient way takes two instructions:
Clear the register, using either a special-purpose clear instruction, or load immediate 0, or xor the register with itself.
Take the bitwise complement of the register.
Filling memory with 1 bits then requires 1 or more store instructions...
You'll find a lot more bit-twiddling tips and tricks in Hank Warren's wonderful book Hacker's Delight.
Set it to -1. This is usually represented by all bits being 1.
Set x to 1
While x < number
    x = x * 2
Answer = number or (x - 1)
The code assumes your input is called "number". It should work fine for positive values. Note that for negative values in two's complement the operation makes no sense, as the high bit will always be one.
Use T(~T(0)).
Where T is the typename (if we are talking about C++.)
This prevents the unwanted promotion to int if the type is smaller than int.
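As a small illustration of the T(~T(0)) idiom, here is a sketch wrapped in a function template; all_ones is a made-up name. For a type narrower than int, ~T(0) is computed as an int (all bits set, i.e. -1), and the outer conversion back to T is what trims it to the width of T.

```cpp
#include <cstdint>

// Returns a value of type T with every bit set to 1.
template <typename T>
constexpr T all_ones() {
    return static_cast<T>(~T(0));
}
```

For example, all_ones<std::uint8_t>() is 0xFF and all_ones<std::uint16_t>() is 0xFFFF, while for a signed int the result compares equal to -1 on two's complement machines.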

Looking for a Hash Function /Ordered Int/ to /Shuffled Int/

I am looking for a constant-time algorithm that can change an ordered integer index value into a random hash index. It would be nice if it were reversible. I need the hash key to be unique for each index. I know that this could be done with a table lookup in a large file, i.e. create an ordered set of all ints, shuffle them randomly and write them to a file in random sequence. You could then read them back as you need them. But this would require a seek into a large file. I wonder if there is a simple way to use, say, a pseudo-random generator to create the sequence as needed?
In "Generating shuffled range using a PRNG rather than shuffling", the answer by erikkallen about Linear Feedback Shift Registers looks like the right sort of thing. I just tried it, but it produces repeats and holes.
Regards
David Allan Finch
The question is now if you need a really random mapping, or just a "weak" permutation. Assuming the latter, if you operate with unsigned 32-bit integers (say) on 2's complement arithmetics, multiplication by any odd number is a bijective and reversible mapping. Of course the same goes for XOR, so a simple pattern which you might try to use is e.g.
unsigned int hash(int x) {
return (((x ^ 0xf7f7f7f7) * 0x8364abf7) ^ 0xf00bf00b) * 0xf81bc437;
}
There is nothing magical about the numbers, so you can change them, and they can even be randomized. The only requirements are that the multiplicands must be odd and that you calculate with wraparound (ignoring overflows). This can be inverted. To do the inversion, you need to be able to calculate the correct complementary multiplicands A and B, after which the inversion is
unsigned int rhash(int h) {
return (((h * B) ^ 0xf00bf00b) * A) ^ 0xf7f7f7f7;
}
You can calculate A and B mathematically, but the easier thing for you is just to run a loop and search for them (once offline, that is).
The equation uses XORs mixed with multiplications to make the mapping nonlinear.
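As an alternative to searching for A and B with a loop, they can be computed directly, since they are the multiplicative inverses of the two odd multipliers modulo 2^32. The sketch below uses Newton's iteration for that; this is my substitution, not the method the answer describes, and the names inverse_mod_2_32, hash32 and rhash32 are illustrative. It assumes 32-bit unsigned arithmetic via std::uint32_t.

```cpp
#include <cassert>
#include <cstdint>

// Multiplicative inverse of an odd number modulo 2^32 (Newton's iteration).
std::uint32_t inverse_mod_2_32(std::uint32_t a) {
    assert(a & 1u);              // only odd numbers are invertible
    std::uint32_t inv = a;       // correct to 3 bits: a * a == 1 (mod 8)
    for (int i = 0; i < 5; ++i)  // each step doubles the number of correct bits
        inv *= 2u - a * inv;
    return inv;
}

// Same constants as the answer's hash, on explicitly 32-bit unsigned values.
std::uint32_t hash32(std::uint32_t x) {
    return (((x ^ 0xf7f7f7f7u) * 0x8364abf7u) ^ 0xf00bf00bu) * 0xf81bc437u;
}

// Undo the steps of hash32 in reverse order.
std::uint32_t rhash32(std::uint32_t h) {
    static const std::uint32_t A = inverse_mod_2_32(0x8364abf7u);
    static const std::uint32_t B = inverse_mod_2_32(0xf81bc437u);
    return (((h * B) ^ 0xf00bf00bu) * A) ^ 0xf7f7f7f7u;
}
```

Because every step is a bijection on 32-bit values, rhash32(hash32(x)) returns x for every x, and no two indices share a hash.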
You could try building a suitable Feistel network. These are normally used for cryptography (e.g. DES), but with at least 64 bits, so you may need to build one yourself that suits your needs. They are invertible by construction.
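A toy 32-bit Feistel network could look like the sketch below. The round function and the four round keys are arbitrary placeholders that I made up, so this shows the structure only and is not a vetted cipher. The point is that decoding works no matter what the round function is.

```cpp
#include <cstdint>

// Hypothetical round function: any deterministic function of (half, key)
// works, because the Feistel structure never needs to invert it.
std::uint16_t round_fn(std::uint16_t half, std::uint16_t key) {
    const std::uint32_t v = static_cast<std::uint32_t>(half ^ key) * 0x9E37u + 0x79B9u;
    return static_cast<std::uint16_t>(v ^ (v >> 16));
}

// Arbitrary example keys, one per round.
const std::uint16_t kRoundKeys[4] = {0x1234, 0xBEEF, 0x0F0F, 0xC0DE};

std::uint32_t feistel_encode(std::uint32_t x) {
    std::uint16_t left = static_cast<std::uint16_t>(x >> 16);
    std::uint16_t right = static_cast<std::uint16_t>(x & 0xFFFFu);
    for (int i = 0; i < 4; ++i) {  // (L, R) -> (R, L ^ F(R))
        const std::uint16_t next =
            static_cast<std::uint16_t>(left ^ round_fn(right, kRoundKeys[i]));
        left = right;
        right = next;
    }
    return (static_cast<std::uint32_t>(left) << 16) | right;
}

std::uint32_t feistel_decode(std::uint32_t y) {
    std::uint16_t left = static_cast<std::uint16_t>(y >> 16);
    std::uint16_t right = static_cast<std::uint16_t>(y & 0xFFFFu);
    for (int i = 3; i >= 0; --i) {  // undo the rounds in reverse order
        const std::uint16_t prev =
            static_cast<std::uint16_t>(right ^ round_fn(left, kRoundKeys[i]));
        right = left;
        left = prev;
    }
    return (static_cast<std::uint32_t>(left) << 16) | right;
}
```

Since feistel_decode undoes feistel_encode exactly, the mapping is a permutation of the 32-bit integers, which gives the uniqueness the question asks for.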
Assuming your goal is to spread out grouped values across the whole range,
it seems like shuffling the bits in some pre-defined order might do the trick.
i.e. given 8 bits ABCDEFGH, arrange them like EGDBHCFA, or some such pattern.
The code would just be a simple sequence of masks, shifts and adds.
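A table-driven sketch of such a fixed bit shuffle for 8 bits follows. The particular permutation in kSource is an arbitrary example rather than the EGDBHCFA pattern above, and a production version would more likely use hand-written masks and shifts than a loop.

```cpp
#include <cstdint>

// kSource[i] is the input bit index that feeds output bit i.
// Any permutation of 0..7 works; this one is an arbitrary example.
const unsigned kSource[8] = {3, 6, 1, 7, 0, 5, 2, 4};

std::uint8_t permute_bits(std::uint8_t x) {
    unsigned out = 0;
    for (unsigned i = 0; i < 8; ++i)
        out |= ((x >> kSource[i]) & 1u) << i;
    return static_cast<std::uint8_t>(out);
}

// Inverse: send output bit i back to input position kSource[i].
std::uint8_t unpermute_bits(std::uint8_t y) {
    unsigned out = 0;
    for (unsigned i = 0; i < 8; ++i)
        out |= ((y >> i) & 1u) << kSource[i];
    return static_cast<std::uint8_t>(out);
}
```

Adjacent inputs end up far apart only to a limited degree, because a bit shuffle preserves the number of 1 bits, so it is a weak scrambler on its own.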
Mmm... depending on whether you have a lot of numbers, you could use a normal STL list and order it by a "random" criterion:
bool
nonsort(int i, int j)
{
    // Parentheses are needed: without them, > binds tighter than &.
    // Caveat: an inconsistent comparator breaks the strict weak ordering
    // that list::sort requires.
    return (random() & 31) > 16;
}
std::list<int> li;
// insert elements
li.sort(nonsort);
Then you can get all the integers with a normal iterator. Remember to seed the generator first (srandom() for random()) with the time or any other pseudo-random value.
For this set of constraints there really is no solution. An attempt to hash a 32-bit unsigned integer into a 32-bit unsigned integer will give you collisions, unless you do something simple, like a one-to-one mapping. Every number is its own hash.