union int bits to float bits sometimes interpreted wrong - c++

I just ran into an odd problem while interleaving some floats. I've simplified the issue and run some tests:
#include <iostream>
#include <vector>
std::vector<float> v; // global instance
union{ // shared memory space
float f; // to store data in interleaved float array
unsigned int argb; // int color value
}color; // global instance
int main(){
std::cout<<std::hex; // print hexadecimal
color.argb=0xff810000; // NEED A==ff AND R>80 (idk why)
std::cout<<color.argb<<std::endl; // NEED TO PRINT (i really dk why)
v.insert(v.end(),{color.f,0.0f,0.0f}); // color, x, y... (need the x, y too. heh..)
color.f=v[0]; // read float back (so we can see argb data)
std::cout<<color.argb<<std::endl; // ffc10000 (WRONG!)
}
the program prints
ff810000
ffc10000
If someone can show me I'm just being dumb somewhere, that'd be great.
Update: with optimizations turned off:
#include <iostream>
union FLOATINT{float f; unsigned int i;};
int main(){
std::cout<<std::hex; // print in hex
FLOATINT a;
a.i = 0xff810000; // store int
std::cout<<a.i<<std::endl; // ff810000
FLOATINT b;
b.f = a.f; // store float
std::cout<<b.i<<std::endl; // ffc10000
}
or
#include <iostream>
int main(){
std::cout<<std::hex; // print in hex
unsigned int i = 0xff810000; // store int
std::cout<<i<<std::endl; // ff810000
float f = *(float*)&i; // store float from int memory
unsigned int i2 = *(unsigned int*)&f; // store int from float memory
std::cout<<i2<<std::endl; // ffc10000
}
solution:
#include <iostream>
#include <cstring> // for memcpy
int main(){
std::cout<<std::hex;
unsigned int i=0xff810000;
std::cout<<i<<std::endl; // ff810000
float f; memcpy(&f, &i, 4);
unsigned int i2; memcpy(&i2, &f, 4);
std::cout<<i2<<std::endl; // ff810000
}
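The memcpy approach in the solution above can be wrapped in a small helper so the size check is not repeated by hand. This is only a sketch: `bit_copy` and `float_bits` are hypothetical names, not standard functions, and the expected bit patterns assume an IEEE 754 32-bit float.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: copies the object representation of `from` into a new To.
// memcpy is the well-defined way to reinterpret bytes, unlike reading an
// inactive union member or casting pointers.
template <typename To, typename From>
To bit_copy(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "types must have the same size");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

// Returns the raw bits of a float as an integer (assumes a 32-bit float).
inline std::uint32_t float_bits(float f) {
    return bit_copy<std::uint32_t>(f);
}
```

Note that this only makes the reinterpretation itself well defined. If the data is really a color, keeping it in an integer type the whole time avoids passing a NaN bit pattern through floating-point hardware at all.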

The behavior you're seeing is well defined IEEE floating point math.
The value you're storing in argb, when interpreted as a float will be a SNaN (Signaling NaN). When this SNaN value is loaded into a floating point register, it will be converted to a QNaN (Quiet NaN) by setting the most significant fraction bit to a 1 (and will raise an exception if floating point exceptions are unmasked).
This load changes your value from ff810000 to ffc10000.
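The conversion described above can be reproduced with integer arithmetic alone, which also explains the "A==ff AND R>80" observation in the question: A==ff together with the top bit of R sets the sign and all eight exponent bits, and any other set bit makes the fraction non-zero. The sketch below assumes the IEEE 754 binary32 layout and the common convention (used on x86) that the most significant fraction bit is the quiet bit; the function names are illustrative only.

```cpp
#include <cstdint>

// IEEE 754 binary32 layout: 1 sign bit, 8 exponent bits, 23 fraction bits.
constexpr std::uint32_t kExponentMask = 0x7f800000u;
constexpr std::uint32_t kFractionMask = 0x007fffffu;
constexpr std::uint32_t kQuietBit     = 0x00400000u; // most significant fraction bit

// True if the pattern encodes any NaN (exponent all ones, fraction non-zero).
constexpr bool is_nan_bits(std::uint32_t bits) {
    return (bits & kExponentMask) == kExponentMask && (bits & kFractionMask) != 0;
}

// True if the pattern is a signaling NaN (a NaN with the quiet bit clear).
constexpr bool is_signaling_nan_bits(std::uint32_t bits) {
    return is_nan_bits(bits) && (bits & kQuietBit) == 0;
}

// Pattern that results when a signaling NaN is quieted; other patterns are unchanged.
constexpr std::uint32_t quieted(std::uint32_t bits) {
    return is_signaling_nan_bits(bits) ? (bits | kQuietBit) : bits;
}

// The value from the question maps to the value the program printed.
static_assert(quieted(0xff810000u) == 0xffc10000u, "SNaN becomes QNaN");
```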

Writing to the int member and then reading from the float member of the union is undefined behavior (UB) in C++. If you want to create a vector of mixed value types, make a struct to hold them. Also, don't use unsigned int when you need exactly 32 bits; use uint32_t.
#include <cstdint> // for uint32_t
#include <iostream>
#include <vector>
struct gldata {
uint32_t argb;
float x;
float y;
};
std::vector<gldata> v;
int main() {
std::cout << std::hex; // print hexadecimal
v.emplace_back(gldata{0xff810000, 0.0f, 0.0f});
std::cout << v[0].argb << "\n"; // ff810000
}

Related

Value of a variable that i did not initialize

I have this exercise where I don't understand the method float set(void). At first A::v holds some unknown value (1234, for example), and then A::v = v + 1.0 runs.
I expected the result to be A::v = 1234 + 1.0.
It isn't; instead it is A::v = 1.
#include <iostream>
using namespace std;
class A {
public:
float v;
float set(void) {
A::v = v + 1.0;
return A::v;
}
};
int main() {
A a;
cout<<a.set()<<endl;
return 0;
}
As for why the value is always 1:
well, it isn't. It depends on whatever is left there in memory. You have seen a few runs where the uninitialized float value was so small that adding 1.0 to it yielded something close to 1.0.
But that certainly may not always be the case! Strictly speaking, reading an uninitialized variable is undefined behavior.
Initialize the variable to 0.
Some further reading:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
You simply did not initialize the value inside the class.
When you allocate space on the stack, the values are random junk.
#include <iostream>
using namespace std;
class A {
public:
float v = 1234.0; //You must initialize it with 1234 if you want result 1235
float set(void) {
/*A::v*/ v = v + 1.0;
return A::v;
}
};
int main() {
A a; // Allocates space on the stack; uninitialized members typically hold 0x00, 0xcd (debug fill), or leftovers from other functions.
cout<<a.set()<<endl;
return 0;
}
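The advice in these answers can be sketched in two ways: give the member a default initializer, or value-initialize the object with empty braces. The struct `B` below is a hypothetical variant added only to show the second option.

```cpp
#include <iostream>

class A {
public:
    float v = 0.0f;   // default member initializer: v always starts at 0
    float set() {
        v = v + 1.0f;
        return v;
    }
};

// Hypothetical variant without a member initializer.
struct B {
    float v;
};

// B b;    would leave b.v indeterminate.
// B b{};  value-initializes the aggregate, so b.v is 0.
```

With either form the starting value is known, so the result of `set()` no longer depends on what happened to be in memory.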

Size of an object without using sizeof in C++

This was an interview question:
Say there is a class having only an int member. You do not know how many bytes the int will occupy. And you cannot view the class implementation (say it's an API). But you can create an object of it. How would you find the size needed for int without using sizeof.
He wouldn't accept using bitset, either.
Can you please suggest the most efficient way to find this out?
The following program demonstrates a valid technique to compute the size of an object.
#include <iostream>
struct Foo
{
int f;
};
int main()
{
// Create an object of the class.
Foo foo;
// Create a pointer to it.
Foo* p1 = &foo;
// Create another pointer, offset by 1 object from p1
// It is legal to compute (p1+1) but it is not legal
// to dereference (p1+1)
Foo* p2 = p1+1;
// Cast both pointers to char*.
char* cp1 = reinterpret_cast<char*>(p1);
char* cp2 = reinterpret_cast<char*>(p2);
// Compute the size of the object.
size_t size = (cp2-cp1);
std::cout << "Size of Foo: " << size << std::endl;
}
Using pointer algebra:
#include <iostream>
class A
{
int a;
};
int main() {
A a1;
A * n1 = &a1;
A * n2 = n1+1;
std::cout << int((char *)n2 - (char *)n1) << std::endl;
return 0;
}
Yet another alternative without using pointers. You can use it if in the next interview they also forbid pointers. Your comment "The interviewer was leading me to think on lines of overflow and underflow" might also be pointing at this method or similar.
#include <iostream>
int main() {
unsigned int x = 0, numOfBits = 0;
for(x--; x; x /= 2) numOfBits++;
std::cout << "number of bits in an int is: " << numOfBits;
return 0;
}
It gets the maximum value of an unsigned int (decrementing zero in unsigned mode) then subsequently divides by 2 until it reaches zero. To get the number of bytes, divide by CHAR_BIT.
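The last step, dividing by CHAR_BIT, might look like the sketch below. It assumes unsigned int has no padding bits (true on mainstream platforms) and relies on int and unsigned int occupying the same amount of storage; the function names are illustrative.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t

// Counts the value bits of unsigned int by shifting the all-ones value down to zero.
inline std::size_t unsigned_int_bits() {
    std::size_t bits = 0;
    for (unsigned int x = ~0u; x != 0; x >>= 1) {
        ++bits;
    }
    return bits;
}

// Converts the bit count to a byte count without using sizeof.
inline std::size_t unsigned_int_bytes() {
    return unsigned_int_bits() / CHAR_BIT;
}
```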
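Pointer arithmetic can be used without actually creating any objects (strictly speaking, arithmetic on a null pointer is undefined behavior, although it works on common compilers):
class c {
int member;
};
c *ptr = 0;
++ptr;
// std::uintptr_t (from <cstdint>) can hold a pointer value; a plain int may be too narrow
std::uintptr_t size = reinterpret_cast<std::uintptr_t>(ptr);
Alternatively:
std::uintptr_t size = reinterpret_cast<std::uintptr_t>( static_cast<c*>(0) + 1 );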

C++ manipulating Raw Data of a struct

I have a simple struct that looks like this:
struct Object
{
int x_;
double y_;
};
I am trying to manipulate the raw data of an Object, this is what I've done:
int main()
{
Object my_object;
unsigned char* raw_data = reinterpret_cast<unsigned char*>(&my_object);
int x = 10;
memcpy(raw_data, &x, sizeof(x));
raw_data += sizeof(x);
double y = 20.1;
memcpy(raw_data, &y, sizeof(y));
Object* my_object_ptr = reinterpret_cast<Object *>(raw_data);
std::cout << *(my_object_ptr).x << std::endl; //prints 20 (expected 10)
std::cout << *(my_object_ptr).y << std::endl; //prints Rubbish (expected 20.1)
}
I was expecting the above code to work.
What is the real problem? Is this even possible?
You need to use the offsetof macro. There were a few more problems too; most importantly, you modified the raw_data pointer and then cast the modified value back to an Object* pointer, resulting in undefined behavior. I chose to remove the raw_data modification (the alternative would have been to not cast it back, but to just inspect my_object directly). Here's the fixed code, with explanations in the comments:
#include <iostream>
#include <cstring> // for memcpy
#include <cstddef> // for offsetof macro
struct Object
{
int x_;
double y_;
};
int main()
{
Object my_object;
unsigned char* raw_data = reinterpret_cast<unsigned char*>(&my_object);
int x = 10;
// 1st memcpy fixed to calculate offset of x_ (even though it is probably 0)
memcpy(raw_data + offsetof(Object, x_), &x, sizeof(x));
//raw_data += offsetof(Object, y_); // if used, add offset of y_ instead of sizeof x
double y = 20.1;
// 2nd memcpy fixed to calculate offset of y_ (offset could be 4 or 8, depends on packing, sizeof int, etc)
memcpy(raw_data + offsetof(Object, y_), &y, sizeof(y));
// cast back to Object* pointer
Object* my_object_ptr = reinterpret_cast<Object *>(raw_data);
std::cout << my_object_ptr->x_ << std::endl; //prints 10
std::cout << my_object_ptr->y_ << std::endl; //prints 20.1
}
This is probably a structure padding issue. If you had double y_ as the first member, you'd probably have seen what you expected. The compiler will pad the structure with extra bytes to make the alignment correct in case the struct is used in an array. Try
#pragma pack(4)
before your struct definition.
The #pragma pack reference for Visual Studio: http://msdn.microsoft.com/en-us/library/2e70t5y1.aspx Your struct is packed to 8 bytes by default, so there's a 4 byte pad between x_ and y_.
Read http://www.catb.org/esr/structure-packing/ to really understand what's going on.
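To check the padding on a particular compiler rather than guessing, the offsets can be measured directly. A minimal sketch using the Object struct from the question; the two helper functions are hypothetical names, and the actual numbers depend on the platform and packing settings.

```cpp
#include <cstddef>   // offsetof, std::size_t

struct Object {
    int x_;
    double y_;
};

// Number of padding bytes the compiler placed between x_ and y_.
inline std::size_t padding_before_y() {
    return offsetof(Object, y_) - (offsetof(Object, x_) + sizeof(int));
}

// Number of trailing bytes after y_ (usually zero for this layout).
inline std::size_t padding_after_y() {
    return sizeof(Object) - (offsetof(Object, y_) + sizeof(double));
}
```

If `padding_before_y()` is non-zero, advancing a raw pointer by `sizeof(int)` lands inside the padding rather than on `y_`, which is what the original code did.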

C++ Class/Structure size

Hi, I have a problem with the size of a class/struct.
Here is my Graphnode.h. It has only four members: one array of 16 unsigned chars and three unsigned chars, so I think the size should be 19. Why is it 32?
Graphnode currentNode;
cout<< sizeof(currentNode)<<endl;// why this is 32 ?
cout<< sizeof(currentNode.state)<< endl;// this is 16
Graphnode.h:
#include <stdio.h>
#include <stdlib.h>
#include <tr1/array>
//using namespace std;
class Graphnode {
public:
std::tr1::array<unsigned char, 16> state;
unsigned char x;
unsigned char depth;
unsigned char direction;
Graphnode(std::tr1::array<unsigned char, 16>,unsigned char,unsigned char, unsigned char);
Graphnode();
};
Graphnode::Graphnode()
{
int i=0;
for(i=0;i<16;i++)
{
state[i] = 0;
}
x = 0;
depth = 0;
direction = 0;
}
Graphnode::Graphnode(std::tr1::array<unsigned char, 16> _state,unsigned char _x,unsigned char _d,unsigned char _direction)
{
int i=0;
for(i=0;i<16;i++)
{
state[i] = _state[i];
}
x = _x;
depth = _d;
direction = _direction;
}
Because the compiler does not lay out the data structure members one directly after the other; it also leaves padding in between.
Usually this means that the size of any structure will be a multiple of some alignment that depends on the types it contains and the target platform, even if the sum of the sizes of the fields is less.
Most compilers offer non-standard extensions that let you control this packing to a lesser or greater degree.
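Since every member here is an unsigned char, ordinary padding alone would not be expected to add anything; one plausible explanation is that the array type's storage carries a larger alignment requirement. The sketch below reproduces the 19 versus 32 difference under that assumption. Both structs are hypothetical stand-ins for Graphnode, and the 16-byte alignment is a guess about the library, not something the question confirms.

```cpp
#include <cstddef>

// Same members as Graphnode, with a plain array and no extra alignment.
// On common ABIs sizeof(PlainNode) is 19.
struct PlainNode {
    unsigned char state[16];
    unsigned char x;
    unsigned char depth;
    unsigned char direction;
};

// Assumption: the array storage is aligned to 16 bytes.
// The struct then inherits that alignment, and its size is rounded up
// from 19 to the next multiple of 16, which is 32.
struct AlignedNode {
    alignas(16) unsigned char state[16];
    unsigned char x;
    unsigned char depth;
    unsigned char direction;
};
```

Printing `alignof(Graphnode)` alongside `sizeof(Graphnode)` on the original class would show whether this is what is happening.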

How to byteswap a double?

I'm trying to write a byteswap routine for a C++ program running on Win XP. I'm compiling with Visual Studio 2008. This is what I've come up with:
int byteswap(int v) // This is good
{
return _byteswap_ulong(v);
}
double byteswap(double v) // This doesn't work for some values
{
union { // This trick is first used in Quake2 source I believe :D
__int64 i;
double d;
} conv;
conv.d = v;
conv.i = _byteswap_uint64(conv.i);
return conv.d;
}
And a function to test:
void testit() {
double a, b, c;
CString str;
for (a = -100; a < 100; a += 0.01) {
b = byteswap(a);
c = byteswap(b);
if (a != c) {
str.Format("%15.15f %15.15f %15.15f", a, c, a - c);
}
}
}
Getting these numbers not matching:
-76.789999999988126 -76.790000000017230 0.000000000029104
-30.499999999987718 -30.499999999994994 0.000000000007276
41.790000000014508 41.790000000029060 -0.000000000014552
90.330000000023560 90.330000000052664 -0.000000000029104
This is after having read through:
How do I convert between big-endian and little-endian values in C++?
Little Endian - Big Endian Problem
You can't use << and >> on double, by the way (unless I'm mistaken?)
Although a double in main memory is 64 bits, the x87 floating-point registers on x86 CPUs are 80 bits wide. So if one of your values is stored in a register throughout, but the other makes a round-trip through main memory and is truncated to 64 bits, this could explain the small differences you're seeing.
Maybe you can force variables to live in main memory by taking their address (and printing it, to prevent the compiler from optimizing it out), but I'm not certain that this is guaranteed to work.
b = byteswap(a);
That's a problem. After swapping the bytes, the value is no longer a proper double. Storing it back to a double is going to cause subtle problems when the FPU normalizes the value. You have to store it back into an __int64 (long long). Modify the return type of the method.
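A sketch of that suggestion: the swapped value lives only in an integer and is converted back to a double after it has been swapped again. The helper names are hypothetical, `reverse_bytes64` is a portable stand-in for `_byteswap_uint64`, and the code assumes a 64-bit IEEE 754 double.

```cpp
#include <cstdint>
#include <cstring>

// Portable 64-bit byte reversal using shifts.
inline std::uint64_t reverse_bytes64(std::uint64_t v) {
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xffu);   // move the lowest byte of v onto the end of r
        v >>= 8;
    }
    return r;
}

// double -> byte-swapped integer. The swapped pattern is never stored in a double.
inline std::uint64_t swapped_bits_of(double d) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "assumes a 64-bit double");
    std::uint64_t bits;
    std::memcpy(&bits, &d, sizeof bits);
    return reverse_bytes64(bits);
}

// Byte-swapped integer -> double. Swaps back first, then reinterprets.
inline double double_from_swapped(std::uint64_t swapped) {
    const std::uint64_t bits = reverse_bytes64(swapped);
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}
```

With this shape the test loop would compare `a` against `double_from_swapped(swapped_bits_of(a))`, and no intermediate value ever passes through the FPU in swapped form.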
Try 3
Okay, found out there's a better way. The other way you have to worry about the order you pack/unpack stuff. This way you don't:
// int and float
static void swap4(void *v)
{
char in[4], out[4];
memcpy(in, v, 4);
out[0] = in[3];
out[1] = in[2];
out[2] = in[1];
out[3] = in[0];
memcpy(v, out, 4);
}
// double
static void swap8(void *v)
{
char in[8], out[8];
memcpy(in, v, 8);
out[0] = in[7];
out[1] = in[6];
out[2] = in[5];
out[3] = in[4];
out[4] = in[3];
out[5] = in[2];
out[6] = in[1];
out[7] = in[0];
memcpy(v, out, 8);
}
typedef struct
{
int theint;
float thefloat;
double thedouble;
} mystruct;
static void swap_mystruct(void *buf)
{
mystruct *ps = (mystruct *) buf;
swap4(&ps->theint);
swap4(&ps->thefloat);
swap8(&ps->thedouble);
}
Send:
char buf[sizeof (mystruct)];
memcpy(buf, &s, sizeof (mystruct));
swap_mystruct(buf);
Recv:
mystruct s;
swap_mystruct(buf);
memcpy(&s, buf, sizeof (mystruct));
Try 2
Okay, got it working! Hans Passant was right. They got me thinking with the "no longer a proper double" comment. So you can't byteswap a float into another float because then it might be in an improper format, so you have to byteswap to a char array and unswap back. This is the code I used:
int pack(int value, char *buf)
{
union temp {
int value;
char c[4];
} in, out;
in.value = value;
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(buf, out.c, 4);
return 4;
}
int pack(float value, char *buf)
{
union temp {
float value;
char c[4];
} in, out;
in.value = value;
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(buf, out.c, 4);
return 4;
}
int pack(double value, char *buf)
{
union temp {
double value;
char c[8];
} in, out;
in.value = value;
out.c[0] = in.c[7];
out.c[1] = in.c[6];
out.c[2] = in.c[5];
out.c[3] = in.c[4];
out.c[4] = in.c[3];
out.c[5] = in.c[2];
out.c[6] = in.c[1];
out.c[7] = in.c[0];
memcpy(buf, out.c, 8);
return 8;
}
int unpack(char *buf, int *value)
{
union temp {
int value;
char c[4];
} in, out;
memcpy(in.c, buf, 4);
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(value, &out.value, 4);
return 4;
}
int unpack(char *buf, float *value)
{
union temp {
float value;
char c[4];
} in, out;
memcpy(in.c, buf, 4);
out.c[0] = in.c[3];
out.c[1] = in.c[2];
out.c[2] = in.c[1];
out.c[3] = in.c[0];
memcpy(value, &out.value, 4);
return 4;
}
int unpack(char *buf, double *value)
{
union temp {
double value;
char c[8];
} in, out;
memcpy(in.c, buf, 8);
out.c[0] = in.c[7];
out.c[1] = in.c[6];
out.c[2] = in.c[5];
out.c[3] = in.c[4];
out.c[4] = in.c[3];
out.c[5] = in.c[2];
out.c[6] = in.c[1];
out.c[7] = in.c[0];
memcpy(value, &out.value, 8);
return 8;
}
And a simple test function:
typedef struct
{
int theint;
float thefloat;
double thedouble;
} mystruct;
void PackStruct()
{
char buf[sizeof (mystruct)];
char *p;
p = buf;
mystruct foo, foo2;
foo.theint = 1;
foo.thefloat = 3.14f;
foo.thedouble = 400.5;
p += pack(foo.theint, p);
p += pack(foo.thefloat, p);
p += pack(foo.thedouble, p);
// Send or recv char array
p = buf;
p += unpack(p, &foo2.theint);
p += unpack(p, &foo2.thefloat);
p += unpack(p, &foo2.thedouble);
}
How to swap the bytes in any basic data type or array of bytes
ie: How to swap the bytes in place in any array, variable, or any other memory block, such as an int16_t, uint16_t, uint32_t, float, double, etc.:
Here's a way to improve the efficiency from 3 entire copy operations of the array to 1.5 entire copy operations of the array. See also the comments I left under your answer. I said:
Get rid of this: memcpy(in, v, 4); and just copy-swap straight into out from v, then memcpy the swapped values back from out into v. This saves you an entire unnecessary copy, reducing your copies of the entire array from 3 to 2.
There's also a further optimization to reduce the copies of the entire array from 2 to 1.5: copy the left half of the array into temporary variables, and the right-half of the array straight into the left-half, swapping as appropriately. Then copy from the temporary variables, which contain the old left-half of the array, into the right-half of the array, swapping as appropriately. This results in the equivalent of only 1.5 copy operations of the entire array, to be more efficient. Do all this in-place in the original array, aside from the temp variables you require for half of the array.
1. Here is my general C and C++ solution:
/// \brief Swap all the bytes in an array to convert from little-endian
/// byte order to big-endian byte order, or vice versa.
/// \note Works for arrays of any size. Swaps the bytes **in place**
/// in the array.
/// \param[in,out] byte_array The array in which to swap the bytes in-place.
/// \param[in] len The length (in bytes) of the array.
/// \return None
void swap_bytes_in_array(uint8_t * byte_array, size_t len)
{
if (len < 2) return; // nothing to swap; also keeps len - 1 from wrapping around when len == 0
size_t i_left = 0; // index for left side of the array
size_t i_right = len - 1; // index for right side of the array
while (i_left < i_right)
{
// swap left and right bytes
uint8_t left_copy = byte_array[i_left];
byte_array[i_left] = byte_array[i_right];
byte_array[i_right] = left_copy;
i_left++;
i_right--;
}
}
Usage:
// array of bytes
uint8_t bytes_array[16];
// Swap the bytes in this array of bytes in place
swap_bytes_in_array(bytes_array, sizeof(bytes_array));
double d;
// Swap the bytes in the double in place
swap_bytes_in_array((uint8_t*)(&d), sizeof(d));
uint64_t u64;
// swap the bytes in a uint64_t in place
swap_bytes_in_array((uint8_t*)(&u64), sizeof(u64));
2. And here is an optional C++ template wrapper around that to make it even easier to use in C++:
template <typename T>
void swap_bytes(T *var)
{
// Note that `sizeof(*var)` is the exact same thing as `sizeof(T)`
swap_bytes_in_array((uint8_t*)var, sizeof(*var));
}
Usage:
double d;
// Swap the bytes in the double in place
swap_bytes(&d);
uint64_t u64;
// swap the bytes in a uint64_t in place
swap_bytes(&u64);
Notes & unanswered questions
Note, however, that @Hans Passant seems to be onto something here. Although the above works perfectly on any signed or unsigned integer type, and seems to work on float and double for me too, it seems to be broken on long double. I think it's because when I store the swapped long double back into a long double variable, if it is determined to be not-a-valid long double representation anymore, something automatically changes a few of the swapped bytes or something. I'm not entirely sure.
On many 64-bit systems, long double is 16 bytes, so perhaps the solution is to keep the swapped version of the long double inside a 16-byte array and NOT attempt to use it or cast it back to a long double from the uint8_t 16-byte array until either A) it has been sent to the receiver (where the endianness of the system is opposite, so it's in good shape now) and/or B) byte-swapped back again so it's a valid long double again.
Keep the above in mind in case you see problems with float or double types too, as I see with only long double types.
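One way to follow that advice is to never hold the swapped form in a floating-point variable at all, and to keep it in a byte array until it is swapped back. The sketch below uses hypothetical helper names and reverses the whole object representation, including any padding bytes a long double may contain.

```cpp
#include <algorithm>
#include <array>
#include <cstring>

// Copies the bytes of `value` into an array and reverses them.
// The swapped form only ever exists as raw bytes.
template <typename T>
std::array<unsigned char, sizeof(T)> to_swapped_bytes(const T& value) {
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));
    std::reverse(bytes.begin(), bytes.end());
    return bytes;
}

// Reverses the bytes again and only then rebuilds a T from them.
template <typename T>
T from_swapped_bytes(std::array<unsigned char, sizeof(T)> bytes) {
    std::reverse(bytes.begin(), bytes.end());
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

Whether a straight reversal of all `sizeof(long double)` bytes is the right wire format for a given receiver is a separate question, since long double layouts differ between platforms.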
Linux byteswap and endianness and host-to-network byte order utilities
Linux also has a bunch of built-in utilities via gcc GNU extensions that you can use. See:
https://man7.org/linux/man-pages/man3/bswap.3.html - #include <byteswap.h>
https://man7.org/linux/man-pages/man3/endian.3.html - #include <endian.h>
https://man7.org/linux/man-pages/man3/byteorder.3.html - #include <arpa/inet.h> - generally used for network sockets (Ethernet packets) and things; inet stands for "internet"
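As a small illustration of the last entry, `htonl` and `ntohl` from `<arpa/inet.h>` convert a 32-bit value between host order and network (big-endian) order. This sketch assumes a POSIX system; the wrapper function names are made up for the example.

```cpp
#include <arpa/inet.h>  // htonl, ntohl (POSIX)
#include <cstdint>
#include <cstring>

// Host order -> network (big-endian) order.
inline std::uint32_t to_network(std::uint32_t host_value) {
    return htonl(host_value);
}

// Network (big-endian) order -> host order.
inline std::uint32_t from_network(std::uint32_t net_value) {
    return ntohl(net_value);
}

// First byte that would go out on the wire for a given host value.
// In network order this is always the most significant byte.
inline unsigned char first_wire_byte(std::uint32_t host_value) {
    const std::uint32_t net = to_network(host_value);
    unsigned char bytes[sizeof net];
    std::memcpy(bytes, &net, sizeof net);
    return bytes[0];
}
```

On a big-endian host both conversions are no-ops, so the same code works unchanged regardless of the machine's byte order.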