Signed int from bitset<n> - c++

How can I convert a given bitset of length N (where 0 < N < 64) to a signed int? For instance, given:
std::bitset<13> b("1111111101100");
I would like to get back the value -20, not 8172.
My approach:
int t = (static_cast<int>(b.to_ullong()));
if(t > pow(2, 13)/2)
t -= pow(2, 13);
Is there a more generic way to approach this?
Edit: Also, the bitset is actually std::bitset<64>, and N can be a value known only at run time, passed by other means.

We can write a function template to do this for us:
template <size_t N, class = std::enable_if_t<(N > 0 && N < 64)>>
int64_t as_signed(const std::bitset<N>& b)
{
int64_t v = b.to_ullong(); // safe since we know N < 64
return b[N-1] ? (v - (1LL << N)) : v; // subtract 2^N when the sign bit is set
}

Perhaps the best option is to let the compiler sign-extend it itself:
struct S { int64_t x:N; } s;
int64_t result = s.x = b.to_ullong();
The compiler will likely optimize s away.
This should be safe, since int64_t (where available) is required to be two's complement.
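As a rough sketch, the bit-field idea can be wrapped in a function template when N is a compile-time constant. The name sign_extend_bitfield is made up for illustration, and note the assumption that the compiler truncates the stored value to N bits (this is implementation-defined before C++20, although common compilers behave this way):

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: sign-extends the low N bits of a bitset by
// round-tripping the value through an N-bit signed bit-field.
template <std::size_t N>
std::int64_t sign_extend_bitfield(const std::bitset<N>& b)
{
    static_assert(N > 0 && N < 64, "N must be in 1..63");
    struct { std::int64_t x : N; } s;
    // Assumption: the value is truncated to N bits here
    // (only guaranteed by the standard since C++20).
    s.x = static_cast<std::int64_t>(b.to_ullong());
    return s.x;
}
```

With the bitset from the question, sign_extend_bitfield(std::bitset<13>("1111111101100")) should give -20.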
Edit: When the actual bit count to extend is known only at run time, the most portable algorithm uses a mask:
// Do this if bits above position N in b may be nonzero, to clear them.
int64_t x = b.to_ullong() & ((1ULL << N) - 1);
// Otherwise just
int64_t x = b.to_ullong();
int64_t const mask = 1ULL << (N - 1);
int64_t result = (x ^ mask) - mask;
A slightly faster but less portable method for dynamic bit counts uses bit shifts (it works when the architecture has a signed arithmetic right shift):
int const shift = 64 - N;
int64_t result = ((int64_t)b.to_ullong() << shift) >> shift;
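Putting the mask variant together, a minimal sketch for a std::bitset<64> with a run-time bit count could look like this (sign_extend and its parameter names are hypothetical, not from any library):

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Hypothetical helper: interprets the low n bits of b as an n-bit
// two's-complement number (n is only known at run time, 0 < n < 64).
std::int64_t sign_extend(const std::bitset<64>& b, unsigned n)
{
    assert(n > 0 && n < 64);
    const std::uint64_t low_bits = (std::uint64_t{1} << n) - 1;
    // Clear anything above bit n-1, then flip and subtract the sign bit.
    const std::int64_t x = static_cast<std::int64_t>(b.to_ullong() & low_bits);
    const std::int64_t sign = std::int64_t{1} << (n - 1);
    return (x ^ sign) - sign;
}
```

For the value from the question, sign_extend(std::bitset<64>(0x1FEC), 13) evaluates to -20.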

Related

Issue with Modular Exponentiation C++

I'm trying to perform Modular Exponentiation for large values (up to 64 bits) and I wrote this function for it:
uint64_t modularExp(uint64_t num, uint64_t exp, uint64_t mod)
{
string expBits = bitset<64>(exp).to_string();
expBits = expBits.substr(expBits.find("1")+1);
string operations = "";
uint64_t result = num;
for (int i = 0; i < expBits.length(); ++i)
{
result = (uint64_t)pow(result, 2) % mod;
if (expBits[i] == '1')
result = (result * num) % mod;
}
return result;
}
This works well with small numbers (8 digits or less), but for large numbers, even though they're in the 64-bit range, the result comes out wrong.
Additionally, when the value of mod exceeds 4294967296 (2^32, just past the maximum 32-bit value), the result just comes out as zero. I suspect the pow function has a role to play in this issue, but I can't figure it out for sure.
Any advice would be greatly appreciated.
First of all, some general advice:
It's better not to use strings when working with integers, as operations with strings are much slower and might become a bottleneck for performance. It's also less clear what is actually being done when strings are involved.
You shouldn't use std::pow with integers, because it operates on floating-point numbers and loses precision.
For the main question, as a workaround, you can use this O(log^2(n)) solution, which should work for arguments up to 63 bits (since it only ever uses addition and multiplication by 2). Note how all that string magic is unnecessary if you just iterate over the bits in small-to-large order:
#include <cstdint>
uint64_t modular_mul(uint64_t a, uint64_t b, uint64_t mod) {
uint64_t result = 0;
for (uint64_t current_term = a; b; b >>= 1) {
if (b & 1) {
result = (result + current_term) % mod;
}
current_term = 2 * current_term % mod;
}
return result;
}
uint64_t modular_pow(uint64_t base, uint64_t exp, uint64_t mod) {
uint64_t result = 1;
for (uint64_t current_factor = base; exp; exp >>= 1) {
if (exp & 1) {
result = modular_mul(result, current_factor, mod);
}
current_factor = modular_mul(current_factor, current_factor, mod);
}
return result;
}
Also, GCC provides a (non-standard) __uint128_t on some targets, which can be used to replace modular_mul with an ordinary multiplication.
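A minimal sketch of that 128-bit variant, assuming a compiler that supports the non-standard unsigned __int128 type (GCC and Clang do on 64-bit targets); the names mul_mod and pow_mod are made up here:

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the compiler provides the non-standard unsigned __int128
// type, so the full 128-bit product can be reduced in one step.
std::uint64_t mul_mod(std::uint64_t a, std::uint64_t b, std::uint64_t mod)
{
    return static_cast<std::uint64_t>(
        static_cast<unsigned __int128>(a) * b % mod);
}

// Square-and-multiply, scanning the exponent from its lowest bit upwards.
std::uint64_t pow_mod(std::uint64_t base, std::uint64_t exp, std::uint64_t mod)
{
    assert(mod != 0);
    std::uint64_t result = 1 % mod;   // yields 0 when mod == 1
    base %= mod;
    for (; exp != 0; exp >>= 1) {
        if (exp & 1)
            result = mul_mod(result, base, mod);
        base = mul_mod(base, base, mod);
    }
    return result;
}
```

Unlike the addition-based version, this should work for the full 64-bit range of mod, at the cost of portability.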

How can I subtract two IPv6 addresses (128bit numbers) in C/C++?

I'm storing the IP address in sockaddr_in6, which holds it as an array of four 32-bit values, addr[4]. Essentially, it is a 128-bit number.
I'm trying to calculate number of IPs in a given IPv6 range (how many IPs between). So it's a matter of subtracting one from another using two arrays with a length of four.
The problem is since there's no 128bit data type, I can't convert into decimal.
Thanks a ton!
You could use some kind of big-int library (if you can tolerate LGPL, GMP is the choice). Fortunately, 128 bit subtraction is easy to simulate by hand if necessary. Here is a quick and dirty demonstration of computing the absolute value of (a-b), for 128 bit values:
#include <iostream>
#include <iomanip>
struct U128
{
unsigned long long hi;
unsigned long long lo;
};
bool subtract(U128& a, U128 b)
{
unsigned long long carry = b.lo > a.lo;
a.lo -= b.lo;
unsigned long long carry2 = b.hi > a.hi || (a.hi == b.hi && carry);
a.hi -= carry;
a.hi -= b.hi;
return carry2 != 0;
}
int main()
{
U128 ipAddressA = { 45345, 345345 };
U128 ipAddressB = { 45345, 345346 };
bool carry = subtract(ipAddressA, ipAddressB);
// Carry being set means that we underflowed; that ipAddressB was > ipAddressA.
// Let's just compute 0 - ipAddressA as a means to calculate the negation
// (0-x) of our current value. This gives us the absolute value of the
// difference.
if (carry)
{
ipAddressB = ipAddressA;
ipAddressA = { 0, 0 };
subtract(ipAddressA, ipAddressB);
}
// Print gigantic hex string of the 128-bit value
std::cout.fill ('0');
std::cout << std::hex << std::setw(16) << ipAddressA.hi << std::setw(16) << ipAddressA.lo << std::endl;
}
This gives you the absolute value of the difference. If the range is not huge (64 bits or less), then ipAddressA.lo can be your answer as a simple unsigned long long.
If you have perf concerns, you can make use of compiler intrinsics for taking advantage of certain architectures, such as amd64 if you want it to be optimal on that processor. _subborrow_u64 is the amd64 intrinsic for the necessary subtraction work.
The in6_addr structure stores the address in network byte order - or 'big endian' - with the most significant byte at s6_addr[0]. You can't count on the other union members being consistently named, or defined. Even if you accessed the union through a (non-portable) uint32_t field, the values would have to be converted with ntohl. So a portable method of finding the difference needs some work.
You can convert the in6_addr to uint64_t[2]. Sticking with typical 'bignum' conventions, we use [0] for the low 64-bits and [1] for the high 64-bits:
static inline void
in6_to_u64 (uint64_t dst[2], const struct in6_addr *src)
{
uint64_t hi = 0, lo = 0;
for (unsigned int i = 0; i < 8; i++)
{
hi = (hi << 8) | src->s6_addr[i];
lo = (lo << 8) | src->s6_addr[i + 8];
}
dst[0] = lo, dst[1] = hi;
}
and the difference:
static inline unsigned int
u64_diff (uint64_t d[2], const uint64_t x[2], const uint64_t y[2])
{
unsigned int b = 0, bi;
for (unsigned int i = 0; i < 2; i++)
{
uint64_t di, xi, yi, tmp;
xi = x[i], yi = y[i];
tmp = xi - yi;
di = tmp - b, bi = tmp > xi;
d[i] = di, b = bi | (di > tmp);
}
return b; /* borrow flag = (x < y) */
}
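To show the two steps working together without depending on a particular socket header, here is a self-contained sketch that takes the 16 address bytes directly (for example a copy of s6_addr). The type U128Pair and the functions load_be128 and sub128 are hypothetical names for illustration only:

```cpp
#include <cassert>
#include <cstdint>

struct U128Pair { std::uint64_t hi, lo; };  // hypothetical helper type

// Reads 16 bytes in network (big-endian) order, e.g. the s6_addr array.
U128Pair load_be128(const unsigned char bytes[16])
{
    U128Pair v = {0, 0};
    for (int i = 0; i < 8; ++i) {
        v.hi = (v.hi << 8) | bytes[i];
        v.lo = (v.lo << 8) | bytes[i + 8];
    }
    return v;
}

// Returns x - y modulo 2^128; *borrow is set when x < y.
U128Pair sub128(U128Pair x, U128Pair y, bool* borrow)
{
    U128Pair d;
    d.lo = x.lo - y.lo;
    const std::uint64_t carry = x.lo < y.lo;   // borrow out of the low half
    d.hi = x.hi - y.hi - carry;
    *borrow = x.hi < y.hi || (x.hi == y.hi && carry);
    return d;
}
```

If the borrow flag comes back set, swapping the operands and subtracting again gives the absolute difference.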

Gather bits at specific positions into a new value

I have a bit-mask of N chars in size, which is statically known (i.e. can be calculated at compile time, but it's not a single constant, so I can't just write it down), with bits set to 1 denoting the "wanted" bits. And I have a value of the same size, which is only known at runtime. I want to collect the "wanted" bits from that value, in order, into the beginning of a new value. For simplicity's sake let's assume the number of wanted bits is <= 32.
Completely unoptimized reference code which hopefully has the correct behaviour:
template<int N, const char mask[N]>
unsigned gather_bits(const char* val)
{
unsigned result = 0;
char* result_p = (char*)&result;
int pos = 0;
for (int i = 0; i < N * CHAR_BIT; i++)
{
if (mask[i/CHAR_BIT] & (1 << (i % CHAR_BIT)))
{
if (val[i/CHAR_BIT] & (1 << (i % CHAR_BIT)))
{
if (pos < sizeof(unsigned) * CHAR_BIT)
{
result_p[pos/CHAR_BIT] |= 1 << (pos % CHAR_BIT);
}
else
{
abort();
}
}
pos += 1;
}
}
return result;
}
Although I'm not sure whether that formulation actually allows access to the contents of the mask at compile time. But in any case, it's available for use, maybe a constexpr function or something would be a better idea. I'm not looking here for the necessary C++ wizardry (I'll figure that out), just the algorithm.
An example of input/output, with 16-bit values and imaginary binary notation for clarity:
mask = 0b0011011100100110
val = 0b0101000101110011
--
wanted = 0b__01_001__1__01_ // retain only those bits which are set in the mask
result = 0b0000000001001101 // bring them to the front
^ gathered bits begin here
My questions are:
What's the most performant way to do this? (Are there any hardware instructions that can help?)
What if both the mask and the value are restricted to be unsigned, so a single word, instead of an unbounded char array? Can it then be done with a fixed, short sequence of instructions?
There will be pext (parallel bit extract), which does exactly what you want, in Intel Haswell. I don't know what the performance of that instruction will be, but it is probably better than the alternatives. This operation is also known as "compress-right" or simply "compress"; the implementation from Hacker's Delight is this:
unsigned compress(unsigned x, unsigned m) {
unsigned mk, mp, mv, t;
int i;
x = x & m; // Clear irrelevant bits.
mk = ~m << 1; // We will count 0's to right.
for (i = 0; i < 5; i++) {
mp = mk ^ (mk << 1); // Parallel prefix.
mp = mp ^ (mp << 2);
mp = mp ^ (mp << 4);
mp = mp ^ (mp << 8);
mp = mp ^ (mp << 16);
mv = mp & m; // Bits to move.
m = m ^ mv | (mv >> (1 << i)); // Compress m.
t = x & mv;
x = x ^ t | (t >> (1 << i)); // Compress x.
mk = mk & ~mp;
}
return x;
}
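For checking either approach, a simple portable reference is handy. The sketch below (gather_bits_ref is a made-up name) walks the set bits of the mask one at a time, so it is slower than the code above but easy to verify. On CPUs with BMI2, the _pext_u32(value, mask) intrinsic from <immintrin.h> should return the same result:

```cpp
#include <cassert>
#include <cstdint>

// Portable reference for "compress" (what pext computes in hardware):
// visits the set bits of mask from lowest to highest and packs the
// corresponding bits of value into the low end of the result.
std::uint32_t gather_bits_ref(std::uint32_t value, std::uint32_t mask)
{
    std::uint32_t result = 0;
    for (std::uint32_t out = 1; mask != 0; out <<= 1) {
        const std::uint32_t lowest = mask & (~mask + 1);  // isolate lowest set bit
        if (value & lowest)
            result |= out;
        mask &= mask - 1;                                 // clear that bit
    }
    return result;
}
```

With the 16-bit example from the question, gather_bits_ref(0x5173, 0x3726) yields 0x4D, i.e. 0b01001101.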

How to get the smallest n such that 2 ^ n >= x for a given integer x in O(1)?

For a given unsigned integer x, how can I find the smallest n such that 2 ^ n ≥ x in O(1)? In other words, I want to find the index of the highest set bit in the binary representation of x (plus 1 if x is not a power of 2) in O(1) (not dependent on the size of the integer or the size of a byte).
If you have no memory constraints, then you can use a lookup table (one entry for each possible value of x) to achieve O(1) time.
If you want a practical solution, most processors will have some kind of "find highest bit set" opcode. On x86, for instance, it's BSR. Most compilers will have a mechanism to write raw assembler.
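Instead of raw assembler, many compilers expose this opcode as a builtin. A minimal sketch, assuming GCC or Clang (which provide __builtin_clz); the function name ceil_log2 is made up:

```cpp
#include <cassert>
#include <limits>

// Assumption: GCC or Clang, which provide __builtin_clz (typically a
// single BSR/LZCNT instruction on x86).
// Returns the smallest n with 2^n >= x; x == 0 is treated like x == 1.
int ceil_log2(unsigned x)
{
    if (x <= 1)
        return 0;   // also avoids __builtin_clz(0), which is undefined
    // For x >= 2 the answer is the bit length of x - 1.
    return std::numeric_limits<unsigned>::digits - __builtin_clz(x - 1);
}
```

Subtracting 1 before counting leading zeros is what handles exact powers of two: ceil_log2(4) is 2, while ceil_log2(5) is 3.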
Ok, since so far nobody has posted a compile-time solution, here's mine. The precondition is that your input value is a compile-time constant. If you have that, it's all done at compile-time.
#include <iostream>
#include <iomanip>
// This should really come from a template meta lib, no need to reinvent it here,
// but I wanted this to compile as is.
namespace templ_meta {
// A run-of-the-mill compile-time if.
template<bool Cond, typename T, typename E> struct if_;
template< typename T, typename E> struct if_<true , T, E> {typedef T result_t;};
template< typename T, typename E> struct if_<false, T, E> {typedef E result_t;};
// This so we can use a compile-time if tailored for types, rather than integers.
template<int I>
struct int2type {
static const int result = I;
};
}
// This does the actual work.
template< int I, unsigned int Idx = 0>
struct index_of_high_bit {
static const unsigned int result =
templ_meta::if_< I==0
, templ_meta::int2type<Idx>
, index_of_high_bit<(I>>1),Idx+1>
>::result_t::result;
};
// just some testing
namespace {
template< int I >
void test()
{
const unsigned int result = index_of_high_bit<I>::result;
std::cout << std::setfill('0')
<< std::hex << std::setw(2) << std::uppercase << I << ": "
<< std::dec << std::setw(2) << result
<< '\n';
}
}
int main()
{
test<0>();
test<1>();
test<2>();
test<3>();
test<4>();
test<5>();
test<7>();
test<8>();
test<9>();
test<14>();
test<15>();
test<16>();
test<42>();
return 0;
}
'twas fun to do that.
In <cmath> there are logarithm functions that will perform this computation for you.
ceil(log(x) / log(2));
Some math to transform the expression:
int n = ceil(log(x)/log(2));
This is obviously O(1).
It's a question about finding the highest bit set (as lshtar and Oli Charlesworth pointed out). Bit Twiddling Hacks gives a solution which takes about 7 operations for 32 Bit Integers and about 9 operations for 64 Bit Integers.
You can use precalculated tables.
If your number is in [0,255] interval, simple table look up will work.
If it's bigger, then you may split it by bytes and check them from high to low.
Perhaps this link will help.
Warning : the code is not exactly straightforward and seems rather unmaintainable.
uint64_t v; // Input value to find position with rank r.
unsigned int r; // Input: bit's desired rank [1-64].
unsigned int s; // Output: Resulting position of bit with rank r [1-64]
uint64_t a, b, c, d; // Intermediate temporaries for bit count.
unsigned int t; // Bit count temporary.
// Do a normal parallel bit count for a 64-bit integer,
// but store all intermediate steps.
// a = (v & 0x5555...) + ((v >> 1) & 0x5555...);
a = v - ((v >> 1) & ~0UL/3);
// b = (a & 0x3333...) + ((a >> 2) & 0x3333...);
b = (a & ~0UL/5) + ((a >> 2) & ~0UL/5);
// c = (b & 0x0f0f...) + ((b >> 4) & 0x0f0f...);
c = (b + (b >> 4)) & ~0UL/0x11;
// d = (c & 0x00ff...) + ((c >> 8) & 0x00ff...);
d = (c + (c >> 8)) & ~0UL/0x101;
t = (d >> 32) + (d >> 48);
// Now do branchless select!
s = 64;
// if (r > t) {s -= 32; r -= t;}
s -= ((t - r) & 256) >> 3; r -= (t & ((t - r) >> 8));
t = (d >> (s - 16)) & 0xff;
// if (r > t) {s -= 16; r -= t;}
s -= ((t - r) & 256) >> 4; r -= (t & ((t - r) >> 8));
t = (c >> (s - 8)) & 0xf;
// if (r > t) {s -= 8; r -= t;}
s -= ((t - r) & 256) >> 5; r -= (t & ((t - r) >> 8));
t = (b >> (s - 4)) & 0x7;
// if (r > t) {s -= 4; r -= t;}
s -= ((t - r) & 256) >> 6; r -= (t & ((t - r) >> 8));
t = (a >> (s - 2)) & 0x3;
// if (r > t) {s -= 2; r -= t;}
s -= ((t - r) & 256) >> 7; r -= (t & ((t - r) >> 8));
t = (v >> (s - 1)) & 0x1;
// if (r > t) s--;
s -= ((t - r) & 256) >> 8;
s = 65 - s;
As has been mentioned, the length of the binary representation of x + 1 is the n you're looking for (unless x is in itself a power of two, meaning 10.....0 in a binary representation).
I seriously doubt there exists a true solution in O(1), unless you consider translations to binary representation to be O(1).
For a 32 bit int, the following pseudocode will be O(1).
highestBit(x)
bit = 1
highest = 0
for i 1 to 32
if x & bit != 0
highest = i
bit = bit * 2
return highest + 1
It doesn't matter how big x is, it always checks all 32 bits. Thus constant time.
If the input can be of any integer size, say n digits long, then any solution that reads the input will read n digits and must be at least O(n). Unless someone comes up with a solution that does not read the input, it is impossible to find an O(1) solution.
After some searching on the internet I found these two versions for 32-bit unsigned integers. I have tested them and they work. It is clear to me why the second one works, but I'm still thinking about the first one...
1.
unsigned int RoundUpToNextPowOf2(unsigned int v)
{
unsigned int r = 1;
if (v > 1)
{
float f = (float)v;
unsigned int const t = 1U << ((*(unsigned int *)&f >> 23) - 0x7f);
r = t << (t < v);
}
return r;
}
2.
unsigned int RoundUpToNextPowOf2(unsigned int v)
{
v--;
v |= v >> 1;
v |= v >> 2;
v |= v >> 4;
v |= v >> 8;
v |= v >> 16;
v++;
return v;
}
edit: The first one is clear as well.
An interesting question. What do you mean by not depending on the size
of int or the number of bits in a byte? To encounter a different number
of bits in a byte, you'll have to use a different machine, with
a different set of machine instructions, which may or may not affect the
answer.
Anyway, based sort of vaguely on the first solution proposed by Mihran,
I get:
int
topBit( unsigned x )
{
int r = 1;
if ( x > 1 ) {
if ( frexp( static_cast<double>( x ), &r ) != 0.5 ) {
++ r;
}
}
return r - 1;
}
This works within the constraint that the input value must be exactly
representable in a double; if the input is unsigned long long, this
might not be the case, and on some of the more exotic platforms, it
might not even be the case for unsigned.
The only other constant time (with respect to the number of bits) I can
think of is:
int
topBit( unsigned x )
{
return x == 0 ? 0.0 : ceil( log2( static_cast<double>( x ) ) );
}
, which has the same constraint with regards to x being exactly
representable in a double, and may also suffer from rounding errors
inherent in the floating point operations (although if log2 is
implemented correctly, I don't think that this should be the case). If
your compiler doesn't support log2 (a C++11 feature, but also present
in C99, so I would expect most compilers to already have implemented
it), then of course, log( x ) / log( 2 ) could be used, but I suspect
that this will increase the risk of a rounding error resulting in
a wrong result.
FWIW, I find the O(1) on the number of bits a bit illogical, for the
reasons I specified above: the number of bits is just one of the many
"constant factors" which depend on the machine on which you run.
Anyway, I came up with the following purely integer solution, which is
O(lg n) for the number of bits n, and O(1) for everything else:
template< int k >
struct TopBitImpl
{
static int const k2 = k / 2;
static unsigned const m = ~0U << k2;
int operator()( unsigned x ) const
{
unsigned r = ((x & m) != 0) ? k2 : 0;
return r + TopBitImpl<k2>()(r == 0 ? x : x >> k2);
}
};
template<>
struct TopBitImpl<1>
{
int operator()( unsigned x ) const
{
return 0;
}
};
int
topBit( unsigned x )
{
return TopBitImpl<std::numeric_limits<unsigned>::digits>()(x)
+ (((x & (x - 1)) != 0) ? 1 : 0);
}
A good compiler should be able to inline the recursive calls, resulting
in close to optimal code.

How to determine how many bytes an integer needs?

I'm looking for the most efficient way to calculate the minimum number of bytes needed to store an integer without losing precision.
e.g.
int: 10 = 1 byte
int: 257 = 2 bytes;
int: 18446744073709551615 (UINT64_MAX) = 8 bytes;
Thanks
P.S. This is for a hash functions which will be called many millions of times
Also the byte sizes don't have to be a power of two
The fastest solution seems to be one based on tronics' answer:
int bytes;
if (hash <= UINT32_MAX)
{
if (hash < 16777216U)
{
if (hash <= UINT16_MAX)
{
if (hash <= UINT8_MAX) bytes = 1;
else bytes = 2;
}
else bytes = 3;
}
else bytes = 4;
}
else if (hash <= UINT64_MAX)
{
if (hash < 72057594037927936ULL) // 2^56
{
if (hash < 281474976710656ULL)
{
if (hash < 1099511627776ULL) bytes = 5;
else bytes = 6;
}
else bytes = 7;
}
else bytes = 8;
}
The speed difference using mostly 56-bit values was minimal (but measurable) compared to Thomas Pornin's answer. Also, I didn't test the solution using __builtin_clzl, which could be comparable.
Use this:
int n = 0;
while (x != 0) {
x >>= 8;
n ++;
}
This assumes that x contains your (positive) value.
Note that zero will be declared encodable as no byte at all. Also, most variable-size encodings need some length field or terminator to know where encoding stops in a file or stream (usually, when you encode an integer and mind about size, then there is more than one integer in your encoded object).
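To illustrate the point about length fields, here is a small sketch of one possible encoding: a length byte followed by that many value bytes. The format and the names encode_uint and decode_uint are invented for this example, not taken from any existing standard:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical format: one length byte (0..8) followed by that many
// value bytes, least significant first. Zero is encoded as length 0.
void encode_uint(std::uint64_t x, std::vector<std::uint8_t>& out)
{
    const std::size_t length_pos = out.size();
    out.push_back(0);                       // placeholder for the length
    while (x != 0) {
        out.push_back(static_cast<std::uint8_t>(x & 0xFF));
        x >>= 8;
        ++out[length_pos];
    }
}

// Decodes one integer starting at pos and advances pos past it.
// Assumes the buffer holds a complete, well-formed record.
std::uint64_t decode_uint(const std::vector<std::uint8_t>& in, std::size_t& pos)
{
    const std::size_t n = in[pos++];
    std::uint64_t x = 0;
    for (std::size_t i = 0; i < n; ++i)
        x |= static_cast<std::uint64_t>(in[pos++]) << (8 * i);
    return x;
}
```

Here the loop from the answer doubles as the encoder: each shift by 8 emits one byte, and the count ends up in the length field.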
You need just two simple ifs if you are interested in the common sizes only. Consider this (assuming that you actually have unsigned values):
if (val < 0x10000) {
    if (val < 0x100) bytes = 1; // 8 bit
    else bytes = 2; // 16 bit
} else {
    if (val < 0x100000000ULL) bytes = 4; // 32 bit
    else bytes = 8; // 64 bit
}
Should you need to test for other sizes, choosing a middle point and then doing nested tests will keep the number of tests very low in any case. However, in that case making the testing a recursive function might be a better option, to keep the code simple. A decent compiler will optimize away the recursive calls so that the resulting code is still just as fast.
Assuming a byte is 8 bits, to represent an integer x you need [log2(x) / 8] + 1 bytes where [x] = floor(x).
Ok, I see now that the byte sizes aren't necessarily a power of two. Consider the byte sizes b. The formula is still [log2(x) / b] + 1.
Now, to calculate the log, either use lookup tables (best way speed-wise) or use binary search, which is also very fast for integers.
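A sketch of the binary-search idea for 64-bit values, assuming 8-bit bytes (the function name bytes_needed_bsearch is made up for this example):

```cpp
#include <cassert>
#include <cstdint>

// Binary search over the byte count: three comparisons for a 64-bit value.
// Zero is reported as needing 1 byte.
int bytes_needed_bsearch(std::uint64_t x)
{
    int n = 1;
    if (x >> 32) { x >>= 32; n += 4; }   // something set in the upper 4 bytes
    if (x >> 16) { x >>= 16; n += 2; }   // something set in the upper 2 of those
    if (x >> 8)  { n += 1; }             // something set in the upper byte
    return n;
}
```

Each step halves the remaining range, so the result equals floor(log2(x) / 8) + 1 for any nonzero x without calling a log function.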
The function to find the position of the first '1' bit from the most significant side (clz or bsr) is usually a simple CPU instruction (no need to mess with log2), so you could divide that by 8 to get the number of bytes needed. In gcc, there's __builtin_clz for this task:
#include <limits.h>
int bytes_needed(unsigned long long x) {
    if (x == 0)
        return 1; // __builtin_clzll(0) is undefined, so handle zero first
    int bits_needed = sizeof(x)*CHAR_BIT - __builtin_clzll(x);
    return (bits_needed + 7) / 8;
}
(On MSVC you would use the _BitScanReverse intrinsic.)
You may first get the highest bit set, which is the same as floor(log2(N)), and then get the bytes needed by floor(log2(N) / 8) + 1.
Here are some bit hacks for getting the position of the highest bit set, which are copied from http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogObvious, and you can click the URL for details of how these algorithms work.
Find the integer log base 2 of an integer with a 64-bit IEEE float
int v; // 32-bit integer to find the log base 2 of
int r; // result of log_2(v) goes here
union { unsigned int u[2]; double d; } t; // temp
t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] = 0x43300000;
t.u[__FLOAT_WORD_ORDER!=LITTLE_ENDIAN] = v;
t.d -= 4503599627370496.0;
r = (t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] >> 20) - 0x3FF;
Find the log base 2 of an integer with a lookup table
static const char LogTable256[256] =
{
#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
-1, 0, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 3,
LT(4), LT(5), LT(5), LT(6), LT(6), LT(6), LT(6),
LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7), LT(7)
};
unsigned int v; // 32-bit word to find the log of
unsigned r; // r will be lg(v)
register unsigned int t, tt; // temporaries
if (tt = v >> 16)
{
r = (t = tt >> 8) ? 24 + LogTable256[t] : 16 + LogTable256[tt];
}
else
{
r = (t = v >> 8) ? 8 + LogTable256[t] : LogTable256[v];
}
Find the log base 2 of an N-bit integer in O(lg(N)) operations
unsigned int v; // 32-bit value to find the log2 of
const unsigned int b[] = {0x2, 0xC, 0xF0, 0xFF00, 0xFFFF0000};
const unsigned int S[] = {1, 2, 4, 8, 16};
int i;
register unsigned int r = 0; // result of log2(v) will go here
for (i = 4; i >= 0; i--) // unroll for speed...
{
if (v & b[i])
{
v >>= S[i];
r |= S[i];
}
}
// OR (IF YOUR CPU BRANCHES SLOWLY):
unsigned int v; // 32-bit value to find the log2 of
register unsigned int r; // result of log2(v) will go here
register unsigned int shift;
r = (v > 0xFFFF) << 4; v >>= r;
shift = (v > 0xFF ) << 3; v >>= shift; r |= shift;
shift = (v > 0xF ) << 2; v >>= shift; r |= shift;
shift = (v > 0x3 ) << 1; v >>= shift; r |= shift;
r |= (v >> 1);
// OR (IF YOU KNOW v IS A POWER OF 2):
unsigned int v; // 32-bit value to find the log2 of
static const unsigned int b[] = {0xAAAAAAAA, 0xCCCCCCCC, 0xF0F0F0F0,
0xFF00FF00, 0xFFFF0000};
register unsigned int r = (v & b[0]) != 0;
for (i = 4; i > 0; i--) // unroll for speed...
{
r |= ((v & b[i]) != 0) << i;
}
Find the number of bits by taking the log2 of the number (rounded down, plus one), then divide that by 8, rounding up, to get the number of bytes.
You can find logn of x by the formula:
logn(x) = log(x) / log(n)
Update:
Since you need to do this really quickly, Bit Twiddling Hacks has several methods for quickly calculating log2(x). The look-up table approach seems like it would suit your needs.
This will get you the number of bytes. It's not strictly the most efficient, but unless you're programming a nanobot powered by the energy contained in a red blood cell, it won't matter.
int count = 0;
while (numbertotest > 0)
{
numbertotest >>= 8;
count++;
}
You could write a little template meta-programming code to figure it out at compile time if you need it for array sizes:
template<unsigned long long N> struct NBytes
{ static const size_t value = NBytes<N/256>::value+1; };
template<> struct NBytes<0>
{ static const size_t value = 0; };
int main()
{
std::cout << "short = " << NBytes<SHRT_MAX>::value << " bytes\n";
std::cout << "int = " << NBytes<INT_MAX>::value << " bytes\n";
std::cout << "long long = " << NBytes<ULLONG_MAX>::value << " bytes\n";
std::cout << "10 = " << NBytes<10>::value << " bytes\n";
std::cout << "257 = " << NBytes<257>::value << " bytes\n";
return 0;
}
output:
short = 2 bytes
int = 4 bytes
long long = 8 bytes
10 = 1 bytes
257 = 2 bytes
Note: I know this isn't answering the original question, but it answers a related question that people will be searching for when they land on this page.
Floor((log2(N) / 8) + 1) bytes
You need exactly the log function
nb_bytes = floor(log(x)/log(256))+1
if you use log2, log2(256) == 8 so
floor(log2(x)/8)+1
You need to raise 256 to successive powers until the result is larger than your value.
For example: (Tested in C#)
long long limit = 1;
int byteCount;
for (byteCount = 1; byteCount < 8; byteCount++) {
limit *= 256;
if (limit > value)
break;
}
If you only want byte sizes to be powers of two (If you don't want 65,537 to return 3), replace byteCount++ with byteCount *= 2.
I think this is a portable implementation of the straightforward formula:
#include <limits.h>
#include <math.h>
#include <stdio.h>
int main(void) {
int i;
unsigned int values[] = {10, 257, 67898, 140000, INT_MAX, INT_MIN};
for ( i = 0; i < sizeof(values)/sizeof(values[0]); ++i) {
printf("%d needs %.0f bytes\n",
values[i],
1.0 + floor(log(values[i]) / (M_LN2 * CHAR_BIT))
);
}
return 0;
}
Output:
10 needs 1 bytes
257 needs 2 bytes
67898 needs 3 bytes
140000 needs 3 bytes
2147483647 needs 4 bytes
-2147483648 needs 4 bytes
Whether and how much the lack of speed and the need to link floating-point libraries matter depends on your needs.
I know this question didn't ask for this type of answer but for those looking for a solution using the smallest number of characters, this does the assignment to a length variable in 17 characters, or 25 including the declaration of the length variable.
//Assuming v is the value that is being counted...
int l=0;
for(;v>>l*8;l++);
This is based on SoapBox's idea of creating a solution that contains no jumps, branches etc... Unfortunately his solution was not quite correct. I have adopted the spirit and here's a 32bit version, the 64bit checks can be applied easily if desired.
The function returns number of bytes required to store the given integer.
unsigned short getBytesNeeded(unsigned int value)
{
unsigned short c = 0; // 0 => size 1
c |= !!(value & 0xFF00); // 1 => size 2
c |= (!!(value & 0xFF0000)) << 1; // 2 => size 3
c |= (!!(value & 0xFF000000)) << 2; // 4 => size 4
static const int size_table[] = { 1, 2, 3, 3, 4, 4, 4, 4 };
return size_table[c];
}
For each of eight times, shift the int eight bits to the right and see if there are still 1-bits left. The number of times you shift before you stop is the number of bytes you need.
More succinctly, the minimum number of bytes you need is ceil(min_bits/8), where min_bits is the index (i+1) of the highest set bit.
There are a multitude of ways to do this.
Option #1.
int numBytes = 0;
do {
numBytes++;
} while (i >>= 8);
return (numBytes);
In the above example, i is the number you are testing; this generally works for any processor and any size of integer.
However, it might not be the fastest. Alternatively, you can try a series of if statements ...
For 32-bit integers:
if ((upper = (value >> 16)) == 0) {
/* Bit in lower 16 bits may be set. */
if ((high = (value >> 8)) == 0) {
return (1);
}
return (2);
}
/* Bit in upper 16 bits is set */
if ((high = (upper >> 8)) == 0) {
return (3);
}
return (4);
For 64-bit integers, another level of if statements would be required.
If the speed of this routine is as critical as you say, it might be worthwhile to do this in assembler if you want it as a function call. That could allow you to avoid creating and destroying the stack frame, saving a few extra clock cycles if it is that critical.
A bit basic, but since there will be a limited number of outputs, can you not pre-compute the breakpoints and use a case statement? No need for calculations at run-time, only a limited number of comparisons.
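As one way to read that suggestion, the breakpoints can sit in a small table and be compared one by one. This sketch swaps the case statement for a loop of comparisons, and the names kBreakpoints and bytes_needed_table are made up:

```cpp
#include <cassert>
#include <cstdint>

// Precomputed breakpoints: kBreakpoints[k] is the smallest value that
// needs k + 2 bytes (2^8, 2^16, ..., 2^56).
static const std::uint64_t kBreakpoints[7] = {
    1ULL << 8,  1ULL << 16, 1ULL << 24, 1ULL << 32,
    1ULL << 40, 1ULL << 48, 1ULL << 56
};

int bytes_needed_table(std::uint64_t x)
{
    int bytes = 1;
    for (int k = 0; k < 7; ++k)
        bytes += (x >= kBreakpoints[k]);   // comparisons only, no arithmetic on x
    return bytes;
}
```

It always performs seven comparisons, so whether it beats the nested ifs probably depends on how predictable the inputs are.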
Why not just use a 32-bit hash?
That will work at near-top-speed everywhere.
I'm rather confused as to why a large hash would even be wanted. If a 4-byte hash works, why not just use it always? Excepting cryptographic uses, who has hash tables with more than 2^32 buckets anyway?
There are lots of great recipes for stuff like this over at Sean Anderson's "Bit Twiddling Hacks" page.
This code has 0 branches, which could be faster on some systems. Also, on some systems (GPGPU) it's important for threads in the same warp to execute the same instructions. This code always executes the same number of instructions, no matter what the input value is.
inline int get_num_bytes(unsigned long long value) // where unsigned long long is the largest integer value on this platform
{
int size = 1; // starts at 1 so that 0 will return 1 byte
size += !!(value & 0xFF00);
size += !!(value & 0xFFFF0000);
if (sizeof(unsigned long long) > 4) // every sane compiler will optimize this out
{
size += !!(value & 0xFFFFFFFF00000000ull);
if (sizeof(unsigned long long) > 8)
{
size += !!(value & 0xFFFFFFFFFFFFFFFF0000000000000000ull);
}
}
static const int size_table[] = { 1, 2, 4, 8, 16 };
return size_table[size];
}
g++ -O3 produces the following (verifying that the ifs are optimized out):
xor %edx,%edx
test $0xff00,%edi
setne %dl
xor %eax,%eax
test $0xffff0000,%edi
setne %al
lea 0x1(%rdx,%rax,1),%eax
movabs $0xffffffff00000000,%rdx
test %rdx,%rdi
setne %dl
lea (%rdx,%rax,1),%rax
and $0xf,%eax
mov _ZZ13get_num_bytesyE10size_table(,%rax,4),%eax
retq
Why so complicated? Here's what I came up with:
bytesNeeded = (numBits/8)+((numBits%8) != 0);
Basically numBits divided by eight + 1 if there is a remainder.
There are already a lot of answers here, but if you know the number ahead of time, in C++ you can use a template to have the compiler compute the result at compile time.
template <unsigned long long N>
struct RequiredBytes {
enum : int { value = 1 + (N > 255 ? RequiredBytes<(N >> 8)>::value : 0) };
};
template <>
struct RequiredBytes<0> {
enum : int { value = 1 };
};
const int REQUIRED_BYTES_18446744073709551615 = RequiredBytes<18446744073709551615ULL>::value; // 8
or for a bits version:
template <unsigned long long N>
struct RequiredBits {
enum : int { value = 1 + RequiredBits<(N >> 1)>::value };
};
template <>
struct RequiredBits<1> {
enum : int { value = 1 };
};
template <>
struct RequiredBits<0> {
enum : int { value = 1 };
};
const int REQUIRED_BITS_42 = RequiredBits<42>::value; // 6
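With C++11 or later, the same compile-time results can probably be had more simply from constexpr functions. A minimal sketch (required_bytes and required_bits are names invented for this example):

```cpp
#include <cassert>

// C++11 constexpr alternative to template recursion: the compiler
// evaluates these when the argument is a constant expression.
constexpr int required_bytes(unsigned long long n)
{
    return n > 255 ? 1 + required_bytes(n >> 8) : 1;
}

constexpr int required_bits(unsigned long long n)
{
    return n > 1 ? 1 + required_bits(n >> 1) : 1;
}

static_assert(required_bytes(18446744073709551615ULL) == 8, "UINT64_MAX needs 8 bytes");
static_assert(required_bits(42) == 6, "42 needs 6 bits");
```

The same functions can also be called with run-time values, which the template versions cannot do.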