platform difference? - c++

I was trying out the bitset class in C++, using the number 137 as an example.
I converted it to binary, which gave me 10001001. Now I wanted to cut off the MSB and store the remaining bits, 0001001, in another bitset instance called bitarray, but bitarray wasn't ending up with the value I expected. What could the problem be? I'm just trying to split the MSB off from the rest of the bits in the binary representation of 137. Here is the code:
bitset<8> bitarray;
bitset<8> bitsetObject(num);
int val = bitsetObject.size();
for (int i = 0; i <= (val - 1); i++)
{
    if (i == 6)
        break;
    else
        bitarray[i] = bitsetObject[i + 1];
}
If anyone knows how I could easily slice from the second element to the last element of the bitsetObject array, let me know. Thanks.

If you're just trying to make a new bitset object with the most significant set bit reset, then consider the following:
template<std::size_t N>
std::bitset<N> strip_mssb(std::bitset<N> bitarray)
{
    for (std::size_t i = bitarray.size(); i--;)
        if (bitarray[i])
        {
            bitarray.reset(i);
            break;
        }
    return bitarray;
}
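For example, a small driver of my own (just to illustrate the intended behaviour) prints 00001001 for the input 137:
#include <bitset>
#include <iostream>

int main()
{
    std::bitset<8> b(137);              // 10001001
    std::cout << strip_mssb(b) << '\n'; // prints 00001001
}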

You set bitarray[0] equal to bitsetObject[1], which is 0 (assuming num really is 137).
You seem to expect the least significant bit of bitarray to be 1.
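If the goal is simply to drop bit 7 of the 8-bit value and keep bits 0 through 6 where they are, a sketch like this (reusing the names from the question) avoids the copy loop entirely:
std::bitset<8> bitsetObject(num);        // 10001001 when num == 137
std::bitset<8> bitarray = bitsetObject;  // copy every bit ...
bitarray.reset(7);                       // ... then clear the MSB: 00001001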

Related

Translate C++ functions to TypeScript

Given the following functions written in C++:
#define getbit(s,i) ((s)[(i)/8] & 0x01<<(i)%8)
#define setbit(s,i) ((s)[(i)/8] |= 0x01<<(i)%8)
How can I turn them into compatible TypeScript functions?
I came up with:
function setbit(s: string, i: number): number {
    return +s[i / 8] | 0x01 << i % 8;
}
function getbit(s: string, i: number): number {
    return +s[i / 8] & 0x01 << i % 8;
}
I found out that the equivalent of a |= b is a = a | b, but I'm not sure about the getbit function implementation. Also, I don't really understand what those functions are supposed to do. Could someone explain them, please?
Thank you.
EDIT:
Using the ideas from #Thomas, I ended up doing this:
function setBit(x: number, mask: number) {
    return x | 1 << mask;
}
// not really get, more like a test
function getBit(x: number, mask: number) {
    return ((x >> mask) % 2 !== 0);
}
since I don't really need a string for the binary representation.
Strings aren't good storage here. And by the way, JS strings use 16-bit characters, so you're using only 1/256th of the possible storage.
function setbit(string, index) {
    // you could do `index >> 3`, but this will/may fail if index > 0xFFFFFFFF
    // (fail as in produce wrong results, not as in throwing an error)
    var position = Math.floor(index / 8),
        bit = 1 << (index & 7),
        char = string.charCodeAt(position);
    return string.substr(0, position) + String.fromCharCode(char | bit) + string.substr(position + 1);
}
function getbit(string, index) {
    var position = Math.floor(index / 8),
        bit = 1 << (index & 7),
        char = string.charCodeAt(position);
    return Boolean(char & bit);
}
A (typed) array would be better:
function setBit(array, index) {
    var position = Math.floor(index / 8),
        bit = 1 << (index & 7);
    array[position] |= bit; // JS knows `|=` too
    return array;
}
function getBit(array, index) {
    var position = Math.floor(index / 8),
        bit = 1 << (index & 7);
    return Boolean(array[position] & bit);
}
var storage = new Uint8Array(100);
setBit(storage, 42);
console.log(storage[5]);
var data = [];
setBit(data, 42);
console.log(data);
This works with both, but:
All typed arrays have a fixed length that cannot be changed after creation (memory allocation).
Regular arrays don't have a fixed element type, like 8 bits per index; the limit is 53 bits with floats, but for performance reasons you should stick to at most int31 (31, not 32), i.e. 30 bits plus a sign. In that case the JS engine can optimize things behind the scenes, reducing memory impact and running a little faster.
But if performance is the topic, use typed arrays, although you have to know in advance how big the thing can get.

Checking a pattern of bits in a sequence

So basically I need to check whether a certain sequence of bits occurs in another sequence of bits (32 bits).
The function should take 3 arguments:
n, the number of rightmost bits of a value;
a value;
the sequence in which those n bits should be checked for occurrence.
The function has to return the number of the bit where the desired sequence starts. Example: check whether the last 3 bits of 0x5 occur in 0xe1f4.
void bitcheck(unsigned int source, int operand, int n)
{
    int i, lastbits, mask;
    mask = (1 << n) - 1;
    lastbits = operand & mask;
    for (i = 0; i < 32; i++)
    {
        if ((source & (lastbits << i)) == (lastbits << i))
            printf("It start at bit number %i\n", i + n);
    }
}
Your loop goes too far, I'm afraid. It could, for example 'find' the bit pattern '0001' in a value ~0, which consists of ones only.
This will do better (I hope):
void checkbit(unsigned value, unsigned pattern, unsigned n)
{
    unsigned size = 8 * sizeof value;
    if (0 < n && n <= size)
    {
        unsigned mask = ~0U >> (size - n);
        pattern &= mask;
        for (int i = 0; i <= size - n; i++, value >>= 1)
            if ((value & mask) == pattern)
                printf("pattern found at bit position %u\n", i + n);
    }
}
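For the example in the question, a call like this (the call itself is mine, just for illustration) prints a single line, since the 3-bit pattern 101 occurs in 0xe1f4 only at bits 4..2:
checkbit(0xe1f4, 0x5, 3);  // prints "pattern found at bit position 5" (i == 2 plus n == 3)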
I take you to mean that you want to take source as a bit array, and to search it for a bit sequence specified by the n lowest-order bits of operand. It seems you would want to perform a standard mask & compare; the only (minor) complication being that you need to scan. You seem already to have that idea.
I'd write it like this:
void bitcheck(uint32_t source, uint32_t operand, unsigned int n) {
    uint32_t mask = ~((~0u) << n);  /* unsigned literal, to avoid left-shifting a negative value */
    uint32_t needle = operand & mask;
    int i;
    for (i = 0; i <= (32 - n); i += 1) {
        if (((source >> i) & mask) == needle) {
            /* found it */
            break;
        }
    }
}
There are some differences in the details between mine and yours, but the main functional difference is the loop bound: you must be careful to ignore cases where some of the bits you compare against the target were introduced by a shift operation, as opposed to originating in source, lest you get false positives. The way I've written the comparison makes it clearer (to me) what the bound should be.
I also use the explicit-width integer data types from stdint.h for all values where the code depends on a specific width. This is an excellent habit to acquire if you want to write code that ports cleanly.
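As written, bitcheck only breaks out of the loop when the pattern is found. If the position is actually needed, a small variation of my own (not part of the answer above) could return it, or -1 when there is no match:
int bitcheck_pos(uint32_t source, uint32_t operand, unsigned int n) {
    uint32_t mask = ~((~0u) << n);
    uint32_t needle = operand & mask;
    for (int i = 0; i <= (int)(32 - n); i += 1)
        if (((source >> i) & mask) == needle)
            return i;   /* e.g. bitcheck_pos(0xe1f4, 0x5, 3) == 2 for the question's example */
    return -1;          /* pattern not present */
}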
Perhaps:
if((source&(mask<<i))==(lastbits<<i))
Because:
finding 10 in 11 would come out true with your old code. In fact, your original condition will always return true when source is made entirely of ones.
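To see the false positive concretely: take source = 11 (binary) and the two-bit pattern lastbits = 10. At i = 0 the old test computes source & (lastbits << 0) = 10, which equals lastbits << 0, so a match is reported even though the low two bits of source are 11, not 10. Masking with mask << i first compares all n bits of that window, and the spurious match goes away.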

add 1 to c++ bitset

I have a C++ bitset of a given length. I want to generate all possible combinations of this bitset, for which I thought of adding 1 to it 2^bitset.length times. How can I do this? A Boost library solution is also acceptable.
Try this:
/*
 * This function adds 1 to the bitset.
 *
 * Since the bitset does not natively support addition we do it manually.
 * If XOR-ing a bit with 1 leaves it at 1, the bit was previously 0, so there is
 * no carry and we can break out. Otherwise the bit is now 0, meaning it was
 * previously 1, so the carry has to be added to the next bit, and so on.
 */
void increment(boost::dynamic_bitset<>& bitset)
{
    // Note: size(), not count() -- count() returns the number of set bits.
    for (std::size_t loop = 0; loop < bitset.size(); ++loop)
    {
        if ((bitset[loop] ^= 0x1) == 0x1)
        {
            break;
        }
    }
}
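For the original question (enumerating every combination), a minimal sketch using this helper might look like the following; the 4-bit width is only an illustration:
#include <boost/dynamic_bitset.hpp>
#include <iostream>

int main()
{
    boost::dynamic_bitset<> bits(4);             // starts as 0000
    const unsigned total = 1u << bits.size();    // 2^length combinations
    for (unsigned i = 0; i < total; ++i)
    {
        std::cout << bits << '\n';               // 0000, 0001, 0010, ...
        increment(bits);                         // the helper defined above
    }
}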
All possible combinations? Just use a 64-bit unsigned integer and make your life easier.
Not the best approach, more of a brute-force way, but you can add 1 by converting with to_ulong():
bitset<32> b (13);
b = b.to_ulong() + 1;
Using boost library, you can try the following:
For example, a bitset of length 4
boost::dynamic_bitset<> bitset;
for (int i = 0; i < pow(2.0, 4); i++) {
    bitset = boost::dynamic_bitset<>(4, i);
    std::cout << bitset << std::endl;
}

What am I doing wrong in this bloom filter implementation?

I have this bit table for a segmented bloom filter. Here every column is managed by a single hash function.
unsigned char bit_table_[ROWS][COLUMNS]; // bit_table_ now has 8*ROWS*COLUMNS bits
unsigned char bit_mask[bits_per_char] = { 0x01, 0x02, 0x04, 0x08,
                                          0x10, 0x20, 0x40, 0x80 };
There are ROWS hash functions, each of which handles the setting and checking of COLUMNS*8 bits.
Elements are hashed, and bit_index and bit are calculated as:
compute_indices(unsigned int hash)
{
    bit_index = hash % COLUMNS;
    bit = bit_index % 8;
}
Insertion is then done as:
for (std::size_t i = 0; i < ROWS; ++i)
{
    hash = compute_hash(i, set_element);
    compute_indices(hash);
    bit_table_[i][bit_index] |= bit_mask[bit];
}
And the query is
for (std::size_t i = 0; i < ROWS; ++i)
{
    hash = compute_hash(i, set_element);
    compute_indices(hash);
    if (((bit_table_[i][bit_index]) & bit_mask[bit]) != bit_mask[bit])
    {
        return false;
    }
}
My problem is that the bloom filter gets full too soon, and I suspect that I am not using the individual bits of the characters correctly. For example, I suppose I should have something like:
bit_table_[i][bit_index][bit] |= bit_mask[bit];
for insertion, but since bit_table_ is declared as a two-dimensional array I am not allowed to do this.
What should I do to make use of the individual bits of the char array?
English is my second language, so you might have trouble understanding my question. I would be happy to explain my points further if requested.
EDIT:
compute_hash(i, set_element) uses predefined salt values to compute the hash value of the element to be inserted or queried.
There is an error in your compute_indices method.
You compute a column index and then apply modulo 8 to that column index, so you will always use the same bit within any given column.
For example, for column 10 you will always use bit 2.
You should have:
compute_indices(unsigned int hash)
{
    int bitIndex = hash % (COLUMNS * 8);
    bit_index = bitIndex / 8;
    bit = bitIndex % 8;
}
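As a quick worked example (COLUMNS = 100 and hash = 1234 are values picked only for illustration): the original code gives bit_index = 1234 % 100 = 34 and bit = 34 % 8 = 2, and since bit is derived from bit_index, column 34 can only ever use bit 2; seven of its eight bits are dead. The corrected version gives bitIndex = 1234 % 800 = 434, hence bit_index = 54 and bit = 2, and different hashes can now land on every bit of every byte, which is why the original filter saturates roughly eight times too early.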

How to convert large integers to base 2^32?

First off, I'm doing this for myself, so please don't suggest "use GMP / xint / bignum" (if it even applies).
I'm looking for a way to convert large integers (say, OVER 9000 digits) into an int32 array of base-2^32 digits. The numbers will start out as base-10 strings.
For example, if I wanted to convert the string a = "4294967300" (in base 10), which is just over 2^32, to the new base-2^32 array, it would be int32_t b[] = {1, 4}. If int32_t b[] = {3, 2485738}, the base-10 number would be 3 * 2^32 + 2485738. Obviously the numbers I'll be working with are beyond the range of even int64, so I can't exactly turn the string into an integer and mod my way to success.
I have a function that does subtraction in base 10. Right now I'm thinking I'll just do subtraction(char* number, "2^32") and count how many times before I get a negative number, but that will probably take a long time for larger numbers.
Can someone suggest a different method of conversion? Thanks.
EDIT
Sorry in case you didn't see the tag, I'm working in C++
Assuming your bignum class already has multiplication and addition, it's fairly simple:
bignum str_to_big(char* str) {
    bignum result(0);
    while (*str) {
        result *= 10;
        result += (*str - '0');
        str = str + 1;
    }
    return result;
}
Converting the other way is the same concept, but requires division and modulo
std::string big_to_str(bignum num) {
    std::string result;
    do {
        result.push_back('0' + int(num % 10)); // convert the digit to its character; assumes bignum converts to int for small values
        num /= 10;
    } while (num > 0);
    std::reverse(result.begin(), result.end());
    return result;
}
Both of these are for unsigned only.
To convert from base-10 strings to your numbering system: start with zero, and for each base-10 digit multiply your number by 10 and add the digit. Every time you get a carry, add a new digit to your base-2^32 array.
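As a quick worked example of that process (using the "4294967300" string from the question): after consuming the first nine digits the running value is 429496730, which still fits in one base-2^32 digit; multiplying by 10 and adding the final 0 gives 4294967300 = 1 * 2^32 + 4, so the carry creates a second digit and the result is {1, 4} in the question's {high, low} notation.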
The simplest (not the most efficient) way to do this is to write two functions, one to multiply a large number by an int, and one to add an int to a large number. If you ignore the complexities introduced by signed numbers, the code looks something like this:
(EDITED to use vector for clarity and to add code for actual question)
void mulbig(vector<uint32_t> &bignum, uint16_t multiplicand)
{
    uint32_t carry = 0;
    for (unsigned i = 0; i < bignum.size(); i++) {
        uint64_t r = ((uint64_t)bignum[i] * multiplicand) + carry;
        bignum[i] = (uint32_t)(r & 0xffffffff);
        carry = (uint32_t)(r >> 32);
    }
    if (carry)
        bignum.push_back(carry);
}
void addbig(vector<uint32_t> &bignum, uint16_t addend)
{
    uint32_t carry = addend;
    for (unsigned i = 0; carry && i < bignum.size(); i++) {
        uint64_t r = (uint64_t)bignum[i] + carry;
        bignum[i] = (uint32_t)(r & 0xffffffff);
        carry = (uint32_t)(r >> 32);
    }
    if (carry)
        bignum.push_back(carry);
}
Then, implementing atobignum() using those functions is trivial:
void atobignum(const char *str, vector<uint32_t> &bignum)
{
    bignum.clear();
    bignum.push_back(0);
    while (*str) {
        mulbig(bignum, 10);
        addbig(bignum, *str - '0');
        ++str;
    }
}
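Usage would look roughly like this; note that bignum[0] ends up holding the least significant base-2^32 digit, which is the reverse of the {high, low} ordering used in the question:
vector<uint32_t> n;
atobignum("4294967300", n);
// n.size() == 2, n[0] == 4 (low digit), n[1] == 1 (high digit): 1 * 2^32 + 4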
I think Docjar: gnu/java/math/MPN.java might contain what you're looking for, specifically the code for public static int set_str (int dest[], byte[] str, int str_len, int base).
Start by converting the number to binary. Then, starting from the right, each group of 32 bits is a single base-2^32 digit.
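A sketch of that second step, assuming the decimal-to-binary conversion has already produced a string of '0'/'1' characters (that first step still needs something like the multiply-and-add approach above); the function name is mine:
#include <cstdint>
#include <string>
#include <vector>

// Group a binary string into base-2^32 digits, least significant digit first.
std::vector<uint32_t> limbs_from_binary(const std::string& bits)
{
    std::vector<uint32_t> limbs;
    std::size_t end = bits.size();
    while (end > 0) {
        std::size_t start = end >= 32 ? end - 32 : 0;
        limbs.push_back(static_cast<uint32_t>(
            std::stoul(bits.substr(start, end - start), nullptr, 2)));
        end = start;
    }
    return limbs;
}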