I was asked to get the internal binary representation of different types in C++. My program currently works fine with int, but I would like to use it with double and float. My code looks like this:
template <typename T>
string findBin(T x) {
    string binary;
    for (int i = 4096; i >= 1; i /= 2) {
        if ((x & i) != 0) binary += "1";
        else binary += "0";
    }
    return binary;
}
The program fails when I try to instantiate the template using a "double" or a "float".
Succinctly, you don't.
The bitwise operators do not make sense when applied to double or float, and the standard says that the bitwise operators (~, &, |, ^, >>, <<, and the assignment variants) do not accept double or float operands.
Both double and float have 3 sections - a sign bit, an exponent, and the mantissa. Suppose for a moment that you could shift a double right. The exponent, in particular, means that there is no simple translation to shifting a bit pattern right - the sign bit would move into the exponent, and the least significant bit of the exponent would shift into the mantissa, with completely non-obvious sets of meanings. In IEEE 754, there's an implied 1 bit in front of the actual mantissa bits, which also complicates the interpretation.
Similar comments apply to any of the other bit operators.
So, because there is no sane or useful interpretation of the bit operators to double values, they are not allowed by the standard.
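To see concretely why a built-in shift on double would be meaningless, here is a quick sketch (my code, not part of the original answer; assumes 64-bit IEEE 754 doubles):
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    double d = 2.0;
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits); // copy out the object representation
    bits >>= 1;                          // "shift the double right by one"
    std::memcpy(&d, &bits, sizeof d);
    std::printf("%g\n", d);              // prints roughly 1.5e-154, not 1.0
    return 0;
}
The shift moved the low bit of the exponent into the mantissa, exactly as described above.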
From the comments:
I'm only interested in the binary representation. I just want to print it, not do anything useful with it.
This code was written several years ago for SPARC (big-endian) architecture.
#include <stdio.h>

union u_double
{
    double dbl;
    char   data[sizeof(double)];
};

union u_float
{
    float flt;
    char  data[sizeof(float)];
};

static void dump_float(union u_float f)
{
    int exp;
    long mant;

    printf("32-bit float: sign: %d, ", (f.data[0] & 0x80) >> 7);
    exp = ((f.data[0] & 0x7F) << 1) | ((f.data[1] & 0x80) >> 7);
    printf("expt: %4d (unbiased %5d), ", exp, exp - 127);
    mant = ((((f.data[1] & 0x7F) << 8) | (f.data[2] & 0xFF)) << 8) | (f.data[3] & 0xFF);
    printf("mant: %16ld (0x%06lX)\n", mant, mant);
}

static void dump_double(union u_double d)
{
    int exp;
    long long mant;

    printf("64-bit float: sign: %d, ", (d.data[0] & 0x80) >> 7);
    exp = ((d.data[0] & 0x7F) << 4) | ((d.data[1] & 0xF0) >> 4);
    printf("expt: %4d (unbiased %5d), ", exp, exp - 1023);
    mant = ((((d.data[1] & 0x0F) << 8) | (d.data[2] & 0xFF)) << 8) | (d.data[3] & 0xFF);
    /* Cast each byte before shifting so sign extension can't corrupt the result. */
    mant = (mant << 32)
         | ((long long)(d.data[4] & 0xFF) << 24)
         | ((long long)(d.data[5] & 0xFF) << 16)
         | ((long long)(d.data[6] & 0xFF) << 8)
         |  (long long)(d.data[7] & 0xFF);
    printf("mant: %16lld (0x%013llX)\n", mant, mant);
}

static void print_value(double v)
{
    union u_double d;
    union u_float f;

    f.flt = v;
    d.dbl = v;

    printf("SPARC: float/double of %g\n", v);
    // image_print(stdout, 0, f.data, sizeof(f.data));
    // image_print(stdout, 0, d.data, sizeof(d.data));
    dump_float(f);
    dump_double(d);
}

int main(void)
{
    print_value(+1.0);
    print_value(+2.0);
    print_value(+3.0);
    print_value( 0.0);
    print_value(-3.0);
    print_value(+3.1415926535897932);
    print_value(+1e126);
    return 0;
}
The commented-out `image_print()` function prints an arbitrary set of bytes in hex, with various minor tweaks. Contact me if you want the code (see my profile).
If you're using Intel (little-endian), you'll probably need to tweak the code to deal with the reverse bit order. But it shows how you can do it - using a union.
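For what it's worth, one plausible tweak (my assumption, not part of the original answer) is to reverse the bytes before dumping on a little-endian host:
#include <stddef.h>

/* Hypothetical helper: swap the union's data[] into big-endian order,
   so the dump functions above see bytes in the order the SPARC code expects. */
static void reverse_bytes(char *data, size_t n)
{
    for (size_t i = 0; i < n / 2; i++) {
        char tmp = data[i];
        data[i] = data[n - 1 - i];
        data[n - 1 - i] = tmp;
    }
}
Calling reverse_bytes(d.data, sizeof(d.data)) before dump_double(d) would then present the bytes in big-endian order.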
You cannot directly apply bitwise operators to float or double, but you can still access the bits indirectly by putting the variable in a union with a character array of the appropriate size, then reading the bits from those characters. For example:
string BitsFromDouble(double value) {
    union {
        double doubleValue;
        char asChars[sizeof(double)];
    };
    doubleValue = value; // Write to the union
    /* Extract the bits. */
    string result;
    for (size_t i = 0; i < sizeof(double); ++i)
        result += CharToBits(asChars[i]);
    return result;
}
You may need to adjust your routine to work on chars, whose values don't range up to 4096, and there may also be some endianness weirdness here, but the basic idea should work. It won't be cross-platform compatible, since machines differ in endianness and in how they represent doubles, so be careful how you use this.
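The CharToBits helper is not defined in the snippet above; a minimal sketch (my code, assuming CHAR_BIT == 8 and most-significant-bit-first output) might be:
#include <climits>
#include <string>

std::string CharToBits(char c) {
    std::string bits;
    // Emit the bits of one byte, most significant first.
    for (int i = CHAR_BIT - 1; i >= 0; --i)
        bits += ((static_cast<unsigned char>(c) >> i) & 1) ? '1' : '0';
    return bits;
}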
Bitwise operators don't generally work with "binary representation" (also called object representation) of any type. Bitwise operators work with value representation of the type, which is generally different from object representation. That applies to int as well as to double.
If you really want to get at the internal binary representation of an object of any type, as you stated in your question, you need to reinterpret the object of that type as an array of unsigned char objects and then use the bitwise operators on those unsigned chars.
For example:
double d = 12.34;
const unsigned char *c = reinterpret_cast<unsigned char *>(&d);
Now by accessing elements c[0] through c[sizeof(double) - 1] you will see the internal representation of type double. You can use bitwise operations on these unsigned char values, if you want to.
Note, again, that in the general case, in order to access the internal representation of type int you have to do the same thing. This applies to any type other than the char types.
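A minimal sketch of the inspection loop described above (my code, not the original answer's): it prints each byte of the object representation in binary, in memory order.
#include <cstddef> // std::size_t
#include <cstdio>  // std::putchar

int main() {
    double d = 12.34;
    const unsigned char *c = reinterpret_cast<const unsigned char *>(&d);
    for (std::size_t i = 0; i < sizeof d; ++i) {
        for (int bit = 7; bit >= 0; --bit)  // assumes 8-bit bytes
            std::putchar((c[i] >> bit) & 1 ? '1' : '0');
        std::putchar(' ');
    }
    std::putchar('\n');
    return 0;
}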
Do a bit-wise cast of a pointer to the double to long long * and dereference.
Example:
inline double bit_and_d(double* d, long long mask) {
    long long t = (*(long long*)d) & mask;
    return *(double*)&t;
}
Edit: This is almost certainly going to run afoul of gcc's enforcement of strict aliasing. Use one of the various workarounds for that. (memcpy, unions, __attribute__((__may_alias__)), etc)
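For instance, the memcpy workaround might look like this (a sketch, my code; assumes long long and double are both 8 bytes wide):
#include <cstring>

inline double bit_and_d_safe(double d, long long mask) {
    long long t;
    std::memcpy(&t, &d, sizeof t); // copy the bits out without aliasing UB
    t &= mask;
    std::memcpy(&d, &t, sizeof d); // and copy them back
    return d;
}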
Another solution is to get a pointer to the floating point variable, cast it to a pointer to an integer type of the same size, and then read the integer value this pointer points to. Now you have an integer variable with the same binary representation as the floating point one, and you can use your bitwise operator.
string findBin(float f) {
    string binary;
    long x = *(long*)&f; // reinterpret the float's bits as an integer
    for (long i = 4096; i >= 1; i /= 2) {
        if ((x & i) != 0) binary += "1";
        else binary += "0";
    }
    return binary;
}
But remember: you have to cast to a type of the same size. Otherwise unpredictable things may happen (buffer overflows, access violations, and so on).
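One way to enforce that size requirement at compile time (my addition, assuming C++11; int32_t from <cstdint> is a safer partner for float than long, whose width varies across platforms):
#include <cstdint>

// Fail the build if the integer type doesn't match the float's size.
static_assert(sizeof(std::int32_t) == sizeof(float),
              "need an integer type exactly as wide as float");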
As others have said, you can use a bitwise operator on a double by casting double* to long long* (or sometimes just long*).
#include <stdio.h>
#include <stdlib.h>

int main() {
    double *x = (double*)malloc(sizeof(double));
    *x = -5.12345;
    printf("%f\n", *x);
    *((long long*)x) &= 0x7FFFFFFFFFFFFFFFLL; // clear the sign bit
    printf("%f\n", *x);
    free(x);
    return 0;
}
On my computer, this code prints:
-5.123450
5.123450
Related
I am new to C++ programming. I am trying to implement code that builds a single integer value from 6 or more individual bytes.
I have implemented the same for 4 bytes and it works.
My Code for 4 bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x",command[2], command[3], command[4], command[5], value);
Using this code, the value of value is 82a12122, but when I try the same for 6 bytes the result is wrong.
Code for 6 Bytes:
char *command = "\x42\xa0\x82\xa1\x21\x22";
__int64 value;
value = (__int64)(((unsigned char)command[0] << 40) + ((unsigned char)command[1] << 32) + ((unsigned char)command[2] << 24) + ((unsigned char)command[3] << 16) + ((unsigned char)command[4] << 8) + (unsigned char)command[5]);
printf("%x %x %x %x %x %x %x", command[0], command[1], command[2], command[3], command[4], command[5], value);
The output value of value is 82a163c2, which is wrong; I need 42a082a12122.
So can anyone tell me how to get the expected output, and what is wrong with the 6-byte code?
Thanks in advance.
Just cast each byte to a sufficiently large unsigned type before shifting. Even after integral promotion (to int), the type is not large enough to shift by 32 or more bits (in the usual case, which seems to apply to you).
See here for demonstration: https://godbolt.org/g/x855XH
unsigned long long large_ok(char x)
{
    return ((unsigned long long)x) << 63;
}

unsigned long long large_incorrect(char x)
{
    return ((unsigned long long)x) << 64;
}

unsigned long long still_ok(char x)
{
    return ((unsigned char)x) << 31;
}

unsigned long long incorrect(char x)
{
    return ((unsigned char)x) << 32;
}
In simpler terms:
The shift operators promote their operands to int/unsigned int automatically. This is why your four-byte version works: unsigned int is large enough for all your shifts. However (in your implementation, as in most common ones), it can only hold 32 bits, and the compiler will not automatically choose a 64-bit type if you shift by 32 or more bits (that would be impossible for the compiler to know).
If you use large enough integral types for the shift operands, the shift will have the larger type as the result and the shifts will do what you expect.
If you turn on warnings, your compiler will probably also complain to you that you are shifting by more bits than the type has and thus always getting zero (see demonstration).
(The bit counts mentioned are of course implementation defined.)
A final note: Types beginning with double underscores (__) or underscore + capital letter are reserved for the implementation - using them is not technically "safe". Modern C++ provides you with types such as uint64_t that should have the stated number of bits - use those instead.
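Applied to the asker's six-byte case, the fix might look like this (a sketch; uses uint64_t as recommended above):
#include <cstdint>
#include <cstdio>

int main() {
    const char *command = "\x42\xa0\x82\xa1\x21\x22";
    // Cast each byte up to uint64_t before shifting, so no shift
    // ever happens in a 32-bit type.
    std::uint64_t value =
        ((std::uint64_t)(unsigned char)command[0] << 40) |
        ((std::uint64_t)(unsigned char)command[1] << 32) |
        ((std::uint64_t)(unsigned char)command[2] << 24) |
        ((std::uint64_t)(unsigned char)command[3] << 16) |
        ((std::uint64_t)(unsigned char)command[4] << 8)  |
         (std::uint64_t)(unsigned char)command[5];
    std::printf("%llx\n", (unsigned long long)value); // prints 42a082a12122
    return 0;
}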
Your shifts overflow the 32-bit int type, and you are not printing the 64-bit integer correctly.
This code is working:
(Take note of the print format and how the shifts are done in uint64_t)
#include <cstdio>
#include <cstdint>

int main()
{
    const unsigned char *command = (const unsigned char *)"\x42\xa0\x82\xa1\x21\x22";
    uint64_t value = 0;
    for (int i = 0; i < 6; i++)
    {
        value <<= 8;
        value += command[i];
    }
    printf("%x %x %x %x %x %x %llx\n",
           command[0], command[1], command[2], command[3], command[4], command[5],
           (unsigned long long)value);
}
I need to know whether an integer is 32 bits long or not; that is, whether its value is exactly 32 bits long (8 hexadecimal characters). How could I achieve this in C++? Should I do this with the hexadecimal representation or with the unsigned int one?
My code is as follows:
mistream.open("myfile.txt");
if (mistream)
{
    for (int i = 0; i < longArray; i++)
    {
        mistream >> hex >> datos[i];
    }
}
mistream.close();
Where mistream is of type ifstream and datos is an unsigned int array.
Thank you
std::numeric_limits<unsigned>::digits
is a static integer constant (or constexpr in C++11) giving the number of bits (since unsigned is stored in base 2, it gives binary digits).
You need to #include <limits> to get this, and you'll notice that this gives the same value as Thomas' answer (while also being generalizable to other primitive types).
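For example, to see the bit width of unsigned on your platform:
#include <iostream>
#include <limits>

int main() {
    // Typically prints 32: the number of value bits in unsigned.
    std::cout << std::numeric_limits<unsigned>::digits << '\n';
}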
For reference (you changed your question after I answered), every integer of a given type (eg, unsigned) in a given program is exactly the same size.
What you're now asking is not the size of the integer in bits, because that never varies, but whether the top bit is set. You can test this trivially with
bool isTopBitSet(uint32_t v) {
    return v & 0x80000000u;
}
(replace the unsigned hex literal with something like T{1} << (std::numeric_limits<T>::digits-1) if you want to generalise to unsigned T other than uint32_t).
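Spelled out, that generalisation might look like this (a sketch, assuming C++11 and an unsigned integer type T):
#include <limits>

template <typename T>
bool isTopBitSet(T v) {
    // Test the highest value bit of any unsigned integer type T.
    return (v & (T{1} << (std::numeric_limits<T>::digits - 1))) != 0;
}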
As already hinted in a comment by #chux, you can use a combination of the sizeof operator and the CHAR_BIT macro constant. The former tells you (at compile-time) the size (in multiples of sizeof(char) aka bytes) of its argument type. The latter is the number of bits to the byte (usually 8).
You can encapsulate this nicely into a function template.
#include <climits> // CHAR_BIT
#include <cstddef> // std::size_t
#include <iostream> // std::cout, std::endl
template <typename T>
constexpr std::size_t
bit_size() noexcept
{
    return sizeof(T) * CHAR_BIT;
}

int
main()
{
    std::cout << bit_size<int>() << std::endl;
    std::cout << bit_size<long>() << std::endl;
}
On my implementation, it outputs 32 and 64.
Since the function is constexpr, you can use it in static contexts, such as static_assert(bit_size<int>() >= 32, "too small");.
Try this:
#include <climits>
unsigned int bits_per_byte = CHAR_BIT;
unsigned int bits_per_integer = CHAR_BIT * sizeof(int);
The identifier CHAR_BIT represents the number of bits in a char.
The sizeof operator returns the number of char locations occupied by the integer.
Multiplying them gives us the number of bits for an integer.
OP said "if it's exactly 32 bits long (8 hexadecimal characters)" and further with ".. interested in knowing if the value is between power(2, 31) and power(2, 32) - 1". So it is a little fuzzy on negative 32-bit numbers.
Certainly OP wants to know the result based on the value and not the type.
bool integer_is_32_bits_long(int x) {
    return
        // cope with 32-bit int
        ((INT_MAX == 0x7FFFFFFF) && (x < 0)) ||
        // int wider than 32 bits
        ((INT_MAX > 0x7FFFFFFF) && (x >= 0x80000000) && (x <= 0xFFFFFFFF));
}
Of course if int is 16-bit, then the result is always false.
I want to know if it's exactly 32 bits long (8 hexadecimal characters)
I am interested in knowing if the value is between power(2, 31) and power(2, 32) - 1
So you want to know if the upper bit is set? Then you can simply test if the number is negative:
bool upperBitSet(int x)
{
    return x < 0;
}
For unsigned numbers, you can simply shift left and back right and then check if you lost data:
bool upperBitSet(unsigned x)
{
    return (x << 1 >> 1) != x;
}
The simplest way probably is to check if the 32nd bit is set:
bool isReally32bitsLong(uint32_t in) {
    return (in >> 31) != 0;
}

bool isExactly32BitsLong(uint64_t in) {
    return ((in >> 31) != 0) && ((in >> 32) == 0);
}
How can I tell if a binary number is negative?
Currently I have the code below. It works fine converting to binary. When converting back to decimal, I need to know if the leftmost bit is 1 to tell whether it is negative, but I cannot seem to figure out how to do that.
Also, instead of making my Bin2 function print 1's and 0's, how can I make it return an integer? I didn't want to store it in a string and then convert to int.
EDIT: I'm using 8 bit numbers.
#include <iostream>

int Bin2(int value, int Padding = 8)
{
    for (int I = Padding; I > 0; --I)
    {
        if (value & (1 << (I - 1)))
            std::cout << '1';
        else
            std::cout << '0';
    }
    return 0;
}

int Dec2(int Value)
{
    //bool Negative = (Value & 10000000);
    int Dec = 0;
    for (int I = 0; Value > 0; ++I)
    {
        if (Value % 10 == 1)
        {
            Dec += (1 << I);
        }
        Value /= 10;
    }
    //if (Negative) (Dec -= (1 << 8));
    return Dec;
}

int main()
{
    Bin2(25);
    std::cout << "\n\n";
    std::cout << Dec2(11001);
}
You are checking for negative value incorrectly. Do the following instead:
bool Negative = (value & 0x80000000); // works only where int is 32 bits
Or may be just compare it with 0.
bool Negative = (value < 0);
Why don't you just compare it to 0? That should work fine, and almost certainly you can't do this more efficiently than the compiler.
I am entirely unclear if this is what the OP is looking for, but it's worth a toss:
If you know you have a value in a signed int that is supposed to be representing a signed 8-bit value, you can pull it apart, store it in a signed 8-bit value, then promote it back to a native int signed value like this:
#include <stdio.h>

int main(void)
{
    // signed integer, value is 245. 8-bit signed value is (-11)
    int num = 0xF5;

    // pull out the low 8 bits, storing them in a signed char.
    signed char ch = (signed char)(num & 0xFF);

    // now let the signed char promote to a signed int.
    int res = ch;

    // finally print both.
    printf("%d ==> %d\n", num, res);

    // do it again for an 8-bit positive value,
    // this time with just direct casts.
    num = 0x70;
    printf("%d ==> %d\n", num, (int)((signed char)(num & 0xFF)));

    return 0;
}
Output
245 ==> -11
112 ==> 112
Is that what you're trying to do? In short, the code above will take the 8 bits sitting at the bottom of num, treat them as a signed 8-bit value, then promote them to a native signed int. The result is that you can now "know" not only whether the 8 bits were a negative number (since res will be negative if they were); you also get the 8-bit signed number as a native int in the process.
On the other hand, if all you care about is whether the 8th bit is set in the input int, and it is supposed to denote a negative value state, then why not just:
int IsEightBitNegative(int val)
{
    return (val & 0x80) != 0;
}
I tried this:
float a = 1.4123;
a = a & (1 << 3);
I get a compiler error saying that the operand of & cannot be of type float.
When I do:
float a = 1.4123;
a = (int)a & (1 << 3);
I get the program running. The only thing is that the bitwise operation is done on the integer representation of the number obtained after rounding off.
The following is also not allowed.
float a = 1.4123;
a = (void*)a & (1 << 3);
I don't understand why int can be cast to void* but not float.
I am doing this to solve the problem described in Stack Overflow question How to solve linear equations using a genetic algorithm?.
At the language level, there's no such thing as a "bitwise operation on floating-point numbers". Bitwise operations in C/C++ work on the value representation of a number, which is generally different from the object representation, and the value representation of floating-point numbers is not defined in C/C++ (unsigned integers are an exception in this regard, as their shift is defined as if they were stored in 2's complement). Floating-point numbers don't have bits at the level of value representation, which is why you can't apply bitwise operations to them.
All you can do is analyze the bit content of the raw memory occupied by the floating-point number. For that you need to either use a union as suggested below or (equivalently, and only in C++) reinterpret the floating-point object as an array of unsigned char objects, as in
float f = 5;
unsigned char *c = reinterpret_cast<unsigned char *>(&f);
// inspect memory from c[0] to c[sizeof f - 1]
And please, don't try to reinterpret a float object as an int object, as other answers suggest. That doesn't make much sense, and is not guaranteed to work in compilers that follow strict-aliasing rules in optimization. The correct way to inspect memory content in C++ is by reinterpreting it as an array of [signed/unsigned] char.
Also note that you technically aren't guaranteed that floating-point representation on your system is IEEE754 (although in practice it is unless you explicitly allow it not to be, and then only with respect to -0.0, ±infinity and NaN).
If you are trying to change the bits in the floating-point representation, you could do something like this:
union fp_bit_twiddler {
    float f;
    int i;
} q;

q.f = a;
q.i &= (1 << 3);
a = q.f;
As AndreyT notes, accessing a union like this invokes undefined behavior, and the compiler could grow arms and strangle you. Do what he suggests instead.
You can work around the strict-aliasing rule and perform bitwise operations on a float type-punned as a uint32_t (if your implementation defines it, which most do) without undefined behavior by using memcpy():
#include <cstdint>
#include <cstring>

float a = 1.4123f;
uint32_t b;
std::memcpy(&b, &a, sizeof b);
// perform bitwise operation
b &= 1u << 3;
std::memcpy(&a, &b, sizeof a);
float a = 1.4123;
unsigned int* inta = reinterpret_cast<unsigned int*>(&a);
*inta = *inta & (1 << 3);
Have a look at the following. Inspired by fast inverse square root:
#include <iostream>
using namespace std;

int main()
{
    float x, td = 2.0;
    int ti = *(int*) &td;
    cout << "Cast int: " << ti << endl;
    ti = ti >> 4;
    x = *(float*) &ti;
    cout << "Recast float: " << x << endl;
    return 0;
}
FWIW, there is a real use case for bit-wise operations on floating point (I just ran into it recently) - shaders written for OpenGL implementations that only support older versions of GLSL (1.2 and earlier did not have support for bit-wise operators), and where there would be loss of precision if the floats were converted to ints.
The bit-wise operations can be implemented on floating point numbers using remainders (modulo) and inequality checks. For example:
float A = 0.625; //value to check; ie, 160/256
float mask = 0.25; //bit to check; ie, 1/4
bool result = (mod(A, 2.0 * mask) >= mask); //non-zero if bit 0.25 is on in A
The above assumes that A is between [0..1) and that there is only one "bit" in mask to check, but it could be generalized for more complex cases.
This idea is based on some of the info found in is-it-possible-to-implement-bitwise-operators-using-integer-arithmetic
If there is not even a built-in mod function, then that can also be implemented fairly easily. For example:
float mod(float num, float den)
{
    return num - den * floor(num / den);
}
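Putting the pieces together, here is a sketch (my code, not from the original answer) of a full bitwise AND over the fractional bits of two floats in [0, 1), using std::fmod in place of GLSL's mod:
#include <cmath>

float float_and(float a, float b, int bits) {
    float result = 0.0f;
    float mask = 0.5f; // the most significant fractional bit, 1/2
    for (int i = 0; i < bits; ++i) {
        // A "bit" at position mask is on iff fmod(x, 2*mask) >= mask.
        bool abit = std::fmod(a, 2.0f * mask) >= mask;
        bool bbit = std::fmod(b, 2.0f * mask) >= mask;
        if (abit && bbit) result += mask;
        mask *= 0.5f;
    }
    return result;
}
For example, float_and(0.625f, 0.75f, 8) yields 0.5f, matching 0.101b & 0.110b = 0.100b.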
@mobrule:
Better:
#include <stdint.h>

...

union fp_bit_twiddler {
    float f;
    uint32_t u;
} q;

/* mutatis mutandis ... */
For these values int will likely be ok, but generally, you should use
unsigned ints for bit shifting to avoid the effects of arithmetic shifts. And
the uint32_t will work even on systems whose ints are not 32 bits.
The Python implementation in Floating point bitwise operations (Python recipe) of floating point bitwise operations works by representing numbers in binary that extends infinitely to the left as well as to the right from the fractional point. Because floating point numbers have a signed zero on most architectures it uses ones' complement for representing negative numbers (well, actually it just pretends to do so and uses a few tricks to achieve the appearance).
I'm sure it can be adapted to work in C++, but care must be taken so as to not let the right shifts overflow when equalizing the exponents.
Bitwise operators should NOT be used on floats, as floats are hardware specific, regardless of any similarity between whatever hardware you might have. Which project/job do you want to risk on "well, it worked on my machine"? Instead, for C++, you can get a similar "feel" for the bit-shift operators by overloading them on an "object" wrapper for a float:
// Simple object wrapper for float type as templates want classes.
class Float
{
    float m_f;
public:
    Float( const float & f )
        : m_f( f )
    {
    }
    operator float() const
    {
        return m_f;
    }
};

float operator>>( const Float & left, int right )
{
    float temp = left;
    for( ; right > 0; --right )
    {
        temp /= 2.0f;
    }
    return temp;
}

float operator<<( const Float & left, int right )
{
    float temp = left;
    for( ; right > 0; --right )
    {
        temp *= 2.0f;
    }
    return temp;
}

int main( int argc, char ** argv )
{
    int a1 = 40 >> 2;
    int a2 = 40 << 2;
    int a3 = 13 >> 2;
    int a4 = 256 >> 2;
    int a5 = 255 >> 2;

    float f1 = Float( 40.0f ) >> 2;
    float f2 = Float( 40.0f ) << 2;
    float f3 = Float( 13.0f ) >> 2;
    float f4 = Float( 256.0f ) >> 2;
    float f5 = Float( 255.0f ) >> 2;
}
You will have a remainder, which you can throw away based on your desired implementation.
How can a hexadecimal floating-point constant, as specified in C99, be printed from an array of bytes representing the machine representation of a floating-point value? e.g. given
union u_double
{
    double dbl;
    char data[sizeof(double)];
};
An example hexadecimal floating point constant is a string of the form
0x1.FFFFFEp127f
A syntax specification for this form of literal can be found on the IBM site, and a brief description of the syntax is here on the GCC site.
The printf function can be used to do this on platforms with access to C99 features in the standard library, but I would like to be able to perform the printing in MSVC, which does not support C99, using standard C89 or C++98.
printf manual says:
a,A
(C99; not in SUSv2) For a conversion, the double argument is converted to hexadecimal notation (using the letters abcdef) in the style [-]0xh.hhhhp±d; for A conversion the prefix 0X, the letters ABCDEF, and the exponent separator P is used. There is one hexadecimal digit before the decimal point, and the number of digits after it is equal to the precision. The default precision suffices for an exact representation of the value if an exact representation in base 2 exists and otherwise is sufficiently large to distinguish values of type double. The digit before the decimal point is unspecified for non-normalized numbers, and non-zero but otherwise unspecified for normalized numbers.
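Where a C99 library is available, the conversion described above is a one-liner (shown for contrast; the asker's MSVC lacks it):
#include <stdio.h>

int main(void)
{
    printf("%a\n", 255.5); /* typically prints 0x1.ffp+7 on IEEE 754 systems */
    return 0;
}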
You can use frexp(), which has been in math.h since at least C90, and then do the conversion yourself. Something like this (not tested, and not designed to handle boundaries like NaN, infinities, buffer limits, and so on):
#include <math.h>   /* frexp */
#include <stdio.h>  /* sprintf */
#include <string.h> /* strcpy */

void hexfloat(double d, char* ptr)
{
    double fract;
    int exp = 0;

    if (d < 0) {
        *ptr++ = '-';
        d = -d;
    }
    fract = frexp(d, &exp);
    if (fract == 0.0) {
        strcpy(ptr, "0x0.0");
    } else {
        fract *= 2.0;
        --exp;
        *ptr++ = '0';
        *ptr++ = 'x';
        *ptr++ = '1';
        fract -= 1.0;
        fract *= 16.0;
        *ptr++ = '.';
        do {
            char const hexdigits[] = "0123456789ABCDEF";
            *ptr++ = hexdigits[(int)fract]; /* truncate */
            fract -= (int)fract;
            fract *= 16;
        } while (fract != 0.0);
        if (exp != 0) {
            sprintf(ptr, "p%d", exp);
        } else {
            *ptr = '\0';
        }
    }
}
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    union { double d; uint64_t u; } value;
    value.d = -1.234e-5;

    // see http://en.wikipedia.org/wiki/Double_precision
    _Bool sign_bit = value.u >> 63;
    uint16_t exp_bits = (value.u >> 52) & 0x7FF;
    uint64_t frac_bits = value.u & 0xFFFFFFFFFFFFFull;

    if (exp_bits == 0)
    {
        if (frac_bits == 0)
            printf("%s0x0p+0\n", sign_bit ? "-" : "");
        else
            puts("subnormal, too lazy to parse");
    }
    else if (exp_bits == 0x7FF)
        puts("infinity or nan, too lazy to parse");
    else
        printf("%s0x1.%013llxp%+i\n", /* pad to 13 hex digits so leading zero bits print */
               sign_bit ? "-" : "",
               (unsigned long long)frac_bits,
               (int)exp_bits - 1023);

    // check against libc implementation
    printf("%a\n", value.d);
}
This might be an "outside the box" answer, but why not convert the double to a string using sprintf, then parse the string for the mantissa and exponent, and convert those to hex?
e.g., something like:
char str[256];
int a, b, c;
sprintf(str, "%e", dbl);
sscanf(str, "%d.%de%d", &a, &b, &c);
printf("0x%x.%xp%x", a, b, c);
I'm sure you'd have to modify the formats for sprintf and sscanf. And you'd never get a first hex digit between A and F. But in general, I think this idea should work. And it's simple.
A better way would be to find an open source library that implements this format for printf (e.g., newlib, uClibc?) and copy what they do.