Convert Sign-Bit, Exponent and Mantissa to float? - c++

I have the Sign Bit, Exponent and Mantissa (as shown in the code below). I'm trying to take this value and turn it into a float. The goal is to get 59.98 (it will read as 59.9799995).
uint32_t FullBinaryValue = (Converted[0] << 24) | (Converted[1] << 16) |
(Converted[2] << 8) | (Converted[3]);
unsigned int sign_bit = (FullBinaryValue & 0x80000000);
unsigned int exponent = (FullBinaryValue & 0x7F800000) >> 23;
unsigned int mantissa = (FullBinaryValue & 0x7FFFFF);
What I originally tried was just placing the bits where they should be, like so:
float number = (sign_bit << 32) | (exponent << 24) | (mantissa);
But this gives me 2.22192742e+009.
I was then going to use the formula 1.mantissa * 2^(exponent-127), but you can't write a binary number with a fractional point directly.
Then I tried grabbing each individual value (characteristic, mantissa and exponent) and I got:
Characteristic: 0x3B (Decimal: 59)
Mantissa: 0x6FEB85 (Decimal: 7334789)
Exponent: 0x5 (Decimal: 5) This is after subtracting the bias of 127
I was then going to take these numbers and just feed them into a printf. But I don't know how to convert the mantissa's hexadecimal value into what it actually represents (a fraction whose bits are scaled by negative powers of two).
Any ideas on how to convert these three variables (sign bit, exponent, and mantissa) into a floating number?
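For what it's worth, here is a minimal sketch of that arithmetic route, assuming IEEE-754 single precision and a normalised value (subnormals, infinities and NaN would need separate handling); std::ldexp(x, e) computes x * 2^e:
#include <cmath>
#include <cstdint>
#include <cstdio>

int main()
{
    uint32_t bits = 0x426FEB85;              // the bit pattern from the question

    unsigned sign     = (bits >> 31) & 1;    // 1 bit:  0 = positive, 1 = negative
    unsigned exponent = (bits >> 23) & 0xFF; // 8 bits: biased exponent (132 here)
    uint32_t mantissa =  bits & 0x7FFFFF;    // 23 bits: fraction (0x6FEB85 here)

    // value = (-1)^sign * (1 + mantissa / 2^23) * 2^(exponent - 127)
    double value = 1.0 + mantissa / 8388608.0;       // 8388608 = 2^23
    value = std::ldexp(value, (int)exponent - 127);  // scale by the power of two
    if (sign)
        value = -value;

    std::printf("%f\n", value);              // prints 59.980000 (i.e. 59.9799995...)
    return 0;
}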
EDIT FOR PAUL R
Here is the code in Minimal, Complete and Verifiable format.
I added the uint8_t Converted[4] initialisation just because it is the value I end up with, and it makes the example runnable.
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char *argv[])
{
uint8_t Converted[4];
Converted[0] = 0x42;
Converted[1] = 0x6f;
Converted[2] = 0xEB;
Converted[3] = 0x85;
uint32_t FullBinaryValue = (Converted[0] << 24) | (Converted[1] << 16) |
(Converted[2] << 8) | (Converted[3]);
unsigned int sign_bit = (FullBinaryValue & 0x80000000);
unsigned int exponent = (FullBinaryValue & 0x7F800000) >> 23;
unsigned int mantissa = (FullBinaryValue & 0x7FFFFF);
float number = (sign_bit) | (exponent << 23) | (mantissa);
return 0;
}

The problem is that the expression float number = (sign_bit << 32) | (exponent << 24) | (mantissa); first computes an unsigned int and then casts that value to float. Casting between fundamental types will preserve the value rather than the memory representation. What you are trying to do is reinterpret the memory representation as a different type. You can use reinterpret_cast.
Try this instead :
uint32_t FullBinaryValue = (Converted[0] << 24) | (Converted[1] << 16) |
(Converted[2] << 8) | (Converted[3]);
float number = reinterpret_cast<float&>(FullBinaryValue);
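If you would rather sidestep the aliasing question that reinterpret_cast raises, a minimal alternative sketch is to copy the bytes instead (std::bit_cast does the same thing in C++20); this assumes float is 32 bits wide:
#include <cstdint>
#include <cstdio>
#include <cstring>

int main()
{
    uint32_t FullBinaryValue = 0x426FEB85;   // same bit pattern as in the question

    static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits here");

    float number;
    std::memcpy(&number, &FullBinaryValue, sizeof number);  // reinterpret the bits, not the value

    std::printf("%f\n", number);             // prints 59.980000
    return 0;
}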

Related

C++ Bitshift 4 int8_t into a normal integer (32 bit)

I had already asked a question about how to get 4 int8_t values into a 32-bit int, and I was told that I have to cast each int8_t to uint8_t first in order to pack them into a 32-bit integer with bit shifting.
int8_t offsetX = -10;
int8_t offsetY = 120;
int8_t offsetZ = -60;
using U = std::uint8_t;
int toShader = (U(offsetX) << 24) | (U(offsetY) << 16) | (U(offsetZ) << 8) | (0 << 0);
std::cout << (int)(toShader >> 24) << " "<< (int)(toShader >> 16) << " " << (int)(toShader >> 8) << std::endl;
My Output is
-10 -2440 -624444
It's not what I expected, of course. Does anyone have a solution?
In the shader I want to unpack the values later, and that is only possible through a 32-bit integer because GLSL does not have any smaller integer types.
int offsetX = data[gl_InstanceID * 3 + 2] >> 24;
int offsetY = data[gl_InstanceID * 3 + 2] >> 16 ;
int offsetZ = data[gl_InstanceID * 3 + 2] >> 8 ;
What is written inside the square brackets does not matter; the question is about the correct shifting of the bits, or casting, after the bracket.
If any of the offsets is negative, then the shift results in undefined behaviour.
Solution: Convert the offsets to an unsigned type first.
However, this brings another potential problem: if you convert to a full-width unsigned type, negative numbers become very large values with bits set in the most significant bytes, and ORing those bits in sets them to 1 regardless of offsetX and offsetY. One solution is to convert to a small unsigned type (std::uint8_t); another is to mask off the unused bytes. The former is probably simpler:
using U = std::uint8_t;
int third = U(offsetX) << 24u
| U(offsetY) << 16u
| U(offsetZ) << 8u
| 0u << 0u;
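As a quick round-trip check, here is a minimal sketch that packs the three offsets this way and unpacks them again on the CPU side; the casts back to int8_t (my assumption about how you intend to read the fields) restore the signs, which is what the plain cout << (int)(toShader >> 24) above was missing:
#include <cstdint>
#include <iostream>

int main()
{
    std::int8_t offsetX = -10, offsetY = 120, offsetZ = -60;

    using U = std::uint8_t;
    // Pack: widen through an unsigned type so no sign bits leak into the other fields.
    std::uint32_t packed = (std::uint32_t(U(offsetX)) << 24)
                         | (std::uint32_t(U(offsetY)) << 16)
                         | (std::uint32_t(U(offsetZ)) << 8);

    // Unpack: shift each field down and narrow to int8_t, which reinstates the sign.
    std::cout << int(std::int8_t(packed >> 24)) << " "
              << int(std::int8_t(packed >> 16)) << " "
              << int(std::int8_t(packed >> 8))  << std::endl;   // -10 120 -60
    return 0;
}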
I think you're forgetting to mask the bits that you care about before shifting them.
Perhaps this is what you're looking for:
int32 offsetX = (data[gl_InstanceID * 3 + 2] & 0xFF000000) >> 24;
int32 offsetY = (data[gl_InstanceID * 3 + 2] & 0x00FF0000) >> 16 ;
int32 offsetZ = (data[gl_InstanceID * 3 + 2] & 0x0000FF00) >> 8 ;
if (offsetX & 0x80) offsetX |= 0xFFFFFF00;
if (offsetY & 0x80) offsetY |= 0xFFFFFF00;
if (offsetZ & 0x80) offsetZ |= 0xFFFFFF00;
Without the bit mask, the X part will end up in offsetY, and the X and Y part in offsetZ.
On the CPU side you can use a union to avoid bit shifts, bit masking and branches...
int8_t x,y,z,w; // your 8bit ints
int32_t i; // your 32bit int
union my_union // just helper union for the casting
{
int8_t i8[4];
int32_t i32;
} a;
// 4x8bit -> 32bit
a.i8[0]=x;
a.i8[1]=y;
a.i8[2]=z;
a.i8[3]=w;
i=a.i32;
// 32bit -> 4x8bit
a.i32=i;
x=a.i8[0];
y=a.i8[1];
z=a.i8[2];
w=a.i8[3];
If you do not like unions, the same can be done with pointers...
Beware: on the GLSL side this is not possible (neither unions nor pointers), and you have to use bit shifts and masks as in the other answer...
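If you prefer to avoid the union (type punning through a union is fine in C but formally undefined behaviour in C++), a minimal memcpy-based sketch does the same round trip, with the same endianness caveat:
#include <cstdint>
#include <cstring>

int main()
{
    std::int8_t  x = -10, y = 120, z = -60, w = 0;  // your 8bit ints
    std::int32_t i;                                 // your 32bit int

    // 4x8bit -> 32bit (byte order follows the machine's endianness, as with the union)
    std::int8_t bytes[4] = { x, y, z, w };
    std::memcpy(&i, bytes, sizeof i);

    // 32bit -> 4x8bit
    std::memcpy(bytes, &i, sizeof i);
    x = bytes[0];
    y = bytes[1];
    z = bytes[2];
    w = bytes[3];
    return 0;
}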

Splitting an integer into bytes and combining them back into the integer results in an error

A toy program that splits an integer into 4 bytes and later combines these bytes to get back the input value gives a wrong result. The program does work for positive integers, however; I am interested in signed integers. Need help.
Expected Output: -12345
Actual Output: -57
int main()
{
int j,i = -12345;
char b[4];
b[0] = (i >> 24) & 0xFF;
b[1] = (i >> 16) & 0xFF;
b[2] = (i >> 8) & 0xFF;
b[3] = (i >> 0) & 0xFF;
j = (int)((b[0] << 24) | (b[1] << 16) | (b[2] << 8) | (b[3] << 0));
std::cout << j;
return 0;
}
There are actually two problems that leads to your "error".
The first is that the result of e.g. b[0] << 24 will be an int. When you cast that to a char (and assuming that char is an 8-bit type) then you cut off the top 24 bits of the value, truncating it.
The second problem is that char could be unsigned (it's implementation-defined if char is signed or unsigned). If char is unsigned then the value -1 (0xffffffff) will become 255 (0x000000ff).
When you then bring all that together it will almost certainly result in wrong values.
In general, whenever you feel the need to do a C-style cast (like in (char)(b[0] << 24)) when programming in C++, you should take that as a sign that you're doing something wrong.
One possible way to solve your problem is to always work with explicitly unsigned data types.
First you need to copy the original int value to an unsigned int:
unsigned ui;
memcpy(&ui, &i, sizeof ui);
Then use ui instead of i when doing the "split". And explicitly use unsigned char:
unsigned char b[sizeof(unsigned)] = { 0 };
b[0] = (ui >> 24) & 0xFF;
b[1] = (ui >> 16) & 0xFF;
b[2] = (ui >> 8) & 0xFF;
b[3] = (ui >> 0) & 0xFF;
Then to put it all back, again use an explicit unsigned type, and copy it to the resulting variable:
unsigned uj = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | (b[3] << 0);
memcpy(&j, &uj, sizeof j);
I suggest using unsigned data types here to avoid possible problems that can come from sign-extension during conversion.
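Putting those pieces together, a minimal complete sketch (assuming 32-bit int and 8-bit unsigned char) that prints -12345:
#include <cstdio>
#include <cstring>

int main()
{
    int i = -12345, j;

    // Work on an unsigned copy so shifting never involves a negative value.
    unsigned ui;
    std::memcpy(&ui, &i, sizeof ui);

    // Split into bytes, most significant first.
    unsigned char b[sizeof(unsigned)] = { 0 };
    b[0] = (ui >> 24) & 0xFF;
    b[1] = (ui >> 16) & 0xFF;
    b[2] = (ui >>  8) & 0xFF;
    b[3] = (ui >>  0) & 0xFF;

    // Recombine; the explicit unsigned casts keep every shift well-defined.
    unsigned uj = ((unsigned)b[0] << 24) | ((unsigned)b[1] << 16)
                | ((unsigned)b[2] <<  8) | ((unsigned)b[3] <<  0);
    std::memcpy(&j, &uj, sizeof j);

    std::printf("%d\n", j);   // prints -12345
    return 0;
}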
Your code works only for positive numbers! "i" is negative, and after the shift and mask b[0] holds a value (255) that does not fit in a signed char; the sign extension when the bytes are recombined then produces the error!
try
int main()
{
int j, i = -12345;
const char* bytes = reinterpret_cast<const char*>(&i);
j = *reinterpret_cast<const int*>(bytes);
std::cout << j;
return 0;
}

Number made of 4 chars with hex

I am programming in C++ and I am coming up against another "stupid" problem. If I have 4 chars like these:
char a = 0x90;
char b = 0x01;
char c = 0x00;
char d = 0x00;
Together they represent the hexadecimal number 0x00000190, which is 400 in decimal.
How do I convert these chars into one int? I know I can do
int number = a;
but this will convert only one char to int. Could anybody help please?
You may use:
int number = (a & 0xFF)
| ((b & 0xFF) << 8)
| ((c & 0xFF) << 16)
| ((d & 0xFF) << 24);
It would be simpler with unsigned values.
like this
int number = ((d & 0xff) << 24) | ((c &0xff) << 16) | ((b & 0xff) << 8) | (a & 0xff);
The << is the bit-shift operator, and the & 0xff is necessary to avoid sign-extended negative values when char is promoted to int in the expression (as Jarod42 rightly points out).
This works:
unsigned char ua = a;
unsigned char ub = b;
unsigned char uc = c;
unsigned char ud = d;
unsigned long x = ua + ub * 0x100ul + uc * 0x10000ul + ud * 0x1000000ul;
It is like place-value arithmetic in decimal but you are using base 0x100 instead of base 10.
If you are doing this a lot you could wrap it up in an inline function or a macro.
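For example, a minimal sketch of such an inline helper (the name and argument order are just an assumption; a is the least significant byte, as in the question):
#include <cstdint>
#include <iostream>

// Hypothetical helper: combine four bytes into a 32-bit value, a being least significant.
// Taking the parameters as unsigned char avoids sign extension of values such as 0x90.
inline std::uint32_t u32_from_bytes(unsigned char a, unsigned char b,
                                    unsigned char c, unsigned char d)
{
    return std::uint32_t(a)
         | (std::uint32_t(b) << 8)
         | (std::uint32_t(c) << 16)
         | (std::uint32_t(d) << 24);
}

int main()
{
    // The four bytes from the question.
    std::uint32_t number = u32_from_bytes(0x90, 0x01, 0x00, 0x00);
    std::cout << number << std::endl;   // prints 400
}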
Note - the other answers posted so far using bitwise operations on (char)0x90 are all wrong for systems where plain char is signed, as they are forgetting that (char)0x90 is a negative value there, so after the integral promotions are applied, there are a whole lot of 1 bits on the left.

c++ 64 bit network to host translation

I know there are answers for this question on the web using gcc's byteswap and other alternatives, but I was wondering why my code below isn't working.
Firstly, I get gcc warnings (which I feel shouldn't be appearing). The reason I don't want to use byteswap is that I need to determine whether my machine is big endian or little endian and call byteswap accordingly, i.e. if my machine is big endian I could memcpy the bytes as-is without any translation, otherwise I need to swap them and then copy.
static inline uint64_t ntohl_64(uint64_t val)
{
unsigned char *pp =(unsigned char *)&val;
uint64_t val2 = ( pp[0] << 56 | pp[1] << 48
| pp[2] << 40 | pp[3] << 32
| pp[4] << 24 | pp[5] << 16
| pp[6] << 8 | pp[7]);
return val2;
}
int main()
{
int64_t a=0xFFFF0000;
int64_t b=__const__byteswap64(a);
int64_t c=ntohl_64(a);
printf("\n %lld[%x] [%lld] [%lld]\n ", a, a, b, c);
}
Warnings:-
In function 'uint64_t ntohl_64(uint64_t)':
warning: left shift count >= width of type
warning: left shift count >= width of type
warning: left shift count >= width of type
warning: left shift count >= width of type
Output:-
4294901760[00000000ffff0000] 281470681743360[0000ffff00000000] 65535[000000000000ffff]
I am running this on a little endian machine, so byteswap and ntohl_64 should produce exactly the same values, but unfortunately I get completely unexpected results. It would be great if someone could point out what's wrong.
The reason your code does not work is that you're shifting unsigned chars, which are promoted only to int before the shift. The byte's bits fall off the top, and a shift count of 32 or more is undefined behaviour (hence the warnings; some implementations end up with weird results due to the way the machine-code shift instructions work, x86 being an example). You have to cast each byte to whatever you want the final size to be first, like:
((uint64_t)pp[0]) << 56
Your optimal solution with gcc would be to use htobe64. This function does everything for you.
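A minimal sketch of that route, assuming a glibc/Linux system where <endian.h> provides htobe64 and be64toh (BSDs spell the header <sys/endian.h>, so this is not fully portable):
#include <endian.h>     // htobe64 / be64toh (glibc; not part of standard C++)
#include <cinttypes>
#include <cstdio>

int main()
{
    std::uint64_t host = 0xFFFF0000ULL;
    std::uint64_t wire = htobe64(host);   // host order -> big-endian (network) order
    std::uint64_t back = be64toh(wire);   // network order -> host order

    std::printf("%016" PRIx64 " %016" PRIx64 "\n", wire, back);
    // On a little-endian machine: 0000ffff00000000 00000000ffff0000
    return 0;
}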
P.S. It's a little bit off topic, but if you want to make the function portable across endianness you could do:
Edit based on Nova Denizen's comment:
static inline uint64_t htonl_64(uint64_t val)
{
union{
uint64_t retVal;
uint8_t bytes[8];
};
bytes[0] = (val & 0x00000000000000ff);
bytes[1] = (val & 0x000000000000ff00) >> 8;
bytes[2] = (val & 0x0000000000ff0000) >> 16;
bytes[3] = (val & 0x00000000ff000000) >> 24;
bytes[4] = (val & 0x000000ff00000000) >> 32;
bytes[5] = (val & 0x0000ff0000000000) >> 40;
bytes[6] = (val & 0x00ff000000000000) >> 48;
bytes[7] = (val & 0xff00000000000000) >> 56;
return retVal;
}
static inline uint64_t ntohl_64(uint64_t val)
{
union{
uint64_t inVal;
uint8_t bytes[8];
};
inVal = val;
return bytes[0] |
((uint64_t)bytes[1]) << 8 |
((uint64_t)bytes[2]) << 16 |
((uint64_t)bytes[3]) << 24 |
((uint64_t)bytes[4]) << 32 |
((uint64_t)bytes[5]) << 40 |
((uint64_t)bytes[6]) << 48 |
((uint64_t)bytes[7]) << 56;
}
Assuming the compiler doesn't do something to the uint64_t on its way back through the return, and assuming the user treats the result as an 8-byte value (and not an integer), that code should work on any system. With any luck, your compiler will be able to optimize out the whole expression if you're on a big endian system and use some builtin byte swapping technique if you're on a little endian machine (and it's guaranteed to still work on any other kind of machine).
uint64_t val2 = ( pp[0] << 56 | pp[1] << 48
| pp[2] << 40 | pp[3] << 32
| pp[4] << 24 | pp[5] << 16
| pp[6] << 8 | pp[7]);
pp[0] is an unsigned char and 56 is an int, so in pp[0] << 56 the left operand is promoted only to int before the shift, which is far too narrow (this is what the warnings are about). This isn't what you want, because you want all these shifts to have type unsigned long long.
The way to fix this is to cast, like ((unsigned long long)pp[0]) << 56.
Since pp[x] is only 8 bits wide (and is promoted no further than int), the expression pp[0] << 56 loses the bits you want. You need explicit masking on the original 64-bit value and then shifting:
uint64_t val2 = (( val & 0xff ) << 56 ) |
(( val & 0xff00 ) << 48 ) |
...
In any case, just use compiler built-ins, they usually result in a single byte-swapping instruction.
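For instance, a minimal sketch using the GCC/Clang built-in, swapping only when the host is little-endian (the __BYTE_ORDER__ macros and __builtin_bswap64 are GCC/Clang extensions; the function name here is just illustrative):
#include <cinttypes>
#include <cstdio>

static inline std::uint64_t ntoh64(std::uint64_t val)
{
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    return __builtin_bswap64(val);   // little-endian host: swap the bytes
#else
    return val;                      // big-endian host: already in network order
#endif
}

int main()
{
    std::uint64_t a = 0xFFFF0000ULL;
    std::printf("%016" PRIx64 "\n", ntoh64(a));   // 0000ffff00000000 on a little-endian host
    return 0;
}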
Casting and shifting works as PlasmaHH suggested, but I don't know why 32-bit shifts up-convert automatically and 64-bit shifts don't.
typedef uint64_t __u64;
static inline uint64_t ntohl_64(uint64_t val)
{
unsigned char *pp =(unsigned char *)&val;
return ((__u64)pp[0] << 56 |
(__u64)pp[1] << 48 |
(__u64)pp[2] << 40 |
(__u64)pp[3] << 32 |
(__u64)pp[4] << 24 |
(__u64)pp[5] << 16 |
(__u64)pp[6] << 8 |
(__u64)pp[7]);
}

Get a signed 16-bit integer from 2 signed bytes?

So this sensor I have returns a signed value between -500 and 500 as two (high and low) signed bytes. How can I use these to figure out what the actual value is? I know I need to handle 2's complement, but I'm not sure how. This is what I have now:
real_velocity = temp.values[0];
if(temp.values[1] != -1)
real_velocity += temp.values[1];
//if high byte > 1, negative number - take 2's complement
if(temp.values[1] > 1) {
real_velocity = ~real_velocity;
real_velocity += 1;
}
But it just returns the negative of what would be a positive value. So for instance, -200 comes back as bytes 255 (high) and 56 (low). Added together these are 311, but when I run the above code it tells me -311. Thank you for any help.
-200 in hex is 0xFF38,
you're getting two bytes 0xFF and 0x38,
converting these back to decimal you might recognise them
0xFF = 255,
0x38 = 56
Your sensor is not returning 2 signed bytes but simply the high and low byte of a signed 16-bit number.
so your result is
value = (highbyte << 8) + lowbyte
value being a 16 bit signed variable.
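For example, a minimal sketch with the bytes from the question (the cast to int16_t is what makes the 16-bit result signed; this assumes a two's-complement target, which is essentially universal):
#include <cstdint>
#include <cstdio>

int main()
{
    std::uint8_t high = 0xFF, low = 0x38;   // the sensor bytes for -200

    // Combine the bytes, then narrow to a signed 16-bit type.
    std::int16_t value = static_cast<std::int16_t>((high << 8) | low);

    std::printf("%d\n", value);   // prints -200
    return 0;
}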
Based on the example you gave, it appears that the value is already 2's complement. You just need to shift the high byte left 8 bits and OR the values together.
real_velocity = (short) (temp.values[0] | (temp.values[1] << 8));
You can shift the bits and mask the values.
int main()
{
char data[2];
data[0] = 0xFF; //high
data[1] = 56; //low
int value = 0;
if (data[0] & 0x80) //sign
value = 0xFFFF8000;
value |= ((data[0] & 0x7F) << 8) | data[1];
std::cout<<std::hex<<value<<std::endl;
std::cout<<std::dec<<value<<std::endl;
std::cin.get();
}
Output:
ffffff38
-200
real_velocity = temp.values[0];
real_velocity = real_velocity << 8;
real_velocity |= temp.values[1];
// And, assuming 32-bit integers
real_velocity <<= 16;
real_velocity >>= 16;
For 8-bit bytes, first just convert to unsigned:
typedef unsigned char Byte;
unsigned const u = (Byte( temp.values[1] ) << 8) | Byte( temp.values[0] );
Then if that is greater than the upper range for 16-bit two's complement, subtract 2^16:
int const i = int(u >= (1u << 15)? u - (1u << 16) : u);
You could do tricks at the bit level, but I don't think there's any point in that.
The above assumes that CHAR_BIT == 8, that unsigned is more than 16 bits wide, and that the machine and the desired result are two's complement.
#include <iostream>
using namespace std;
int main()
{
typedef unsigned char Byte;
struct { char values[2]; } const temp = { 56, 255 };
unsigned const u = (Byte( temp.values[1] ) << 8) | Byte( temp.values[0] );
int const i = int(u >= (1u << 15)? u - (1u << 16) : u);
cout << i << endl;
}