I'm trying to re-construct a 32-bit floating point value from an eeprom.
The 4 bytes in eeprom memory (0-4) are : B4 A2 91 4D
and the PC (VS Studio) reconstructs it correctly as 3.054199 * 10^8 (the floating point value I know should be there)
Now I'm moving this eeprom to be read from an 8-bit Arduino, so not sure if it's compiler/platform thing, but when I try reading the 4 bytes into a 32-bit dword, and then typecast it to a float, the value I get isn't even close.
Assuming the conversion can't be done automatically with the standard ansi-c compiler, how can the 4 bytes be manually parsed to be a float?
The safest way, and due to compiler optimization also as fast as any other, is to use memcpy:
uint32_t dword = 0x4D91A2B4;
float f;
memcpy(&f, &dw, 4);
Demo: http://ideone.com/riDfFw
As Shafik Yaghmour mentioned in his answer - it's probably an endianness issue, since that's the only logical problem you could encounter with such a low-level operation. While Shafiks answer in the question he linked, basically covers the process of handling such an issue, I'll just leave you some information:
As stated on the Anduino forums, Anduino uses Little Endian. If you're not sure about what will be the endianness of the system you'll end up working on, but want to make your code semi-multiplatform, you can check the endianness at runtime with a simple code snippet:
bool isBigEndian(){
int number = 1;
return (*(char*)&number != 1);
}
Be advised that - as all things - this consumes some of your procesor time and makes your program run slower, and while that's nearly always a bad thing, you can still use this to see the results in a debug version of your app.
How this works is that it tests the first byte of the int stored at the address pointed by &number. If the first byte is not 1, it means the bytes are Big Endian.
Also - this only will work if sizeof(int) > sizeof(char).
You can also embed this in your code:
float getFromEeprom(int address){
char bytes[sizeof(float)];
if(isBigEndian()){
for(int i=0;i<sizeof(float);i++)
bytes[sizeof(float)-i] = EEPROM.read(address+i);
}
else{
for(int i=0;i<sizeof(float);i++)
bytes[i] = EEPROM.read(address+i);
}
float result;
memcpy(&result, bytes, sizeof(float));
return result;
}
You need to cast at the pointer level.
int myFourBytes = /* something */;
float* myFloat = (float*) &myFourBytes;
cout << *myFloat;
Should work.
If the data is generated on a different platform that stores values in the opposite endianness, you'll need to manually swap the bytes around. E.g.:
unsigned char myFourBytes[4] = { 0xB4, 0xA2, 0x91, 0x4D };
std::swap(myFourBytes[0], myFourBytes[3]);
std::swap(myFourBytes[1], myFourBytes[2]);
I've been stumped on this one for days. I've written this program from a book called Write Great Code Volume 1 Understanding the Machine Chapter four.
The project is to do Floating Point operations in C++. I plan to implement the other operations in C++ on my own; the book uses HLA (High Level Assembly) in the project for other operations like multiplication and division.
I wanted to display the exponent and other field values after they've been extracted from the FP number; for debugging. Yet I have a problem: when I look at these values in memory they are not what I think they should be. Key words: what I think. I believe I understand the IEEE FP format; its fairly simple and I understand all I've read so far in the book.
The big problem is why the Rexponent variable seems to be almost unpredictable; in this example with the given values its 5. Why is that? By my guess it should be two. Two because the decimal point is two digits right of the implied one.
I've commented the actual values that are produced in the program in to the code so you don't have to run the program to get a sense of whats happening (at least in the important parts).
It is unfinished at this point. The entire project has not been created on my computer yet.
Here is the code (quoted from the file which I copied from the book and then modified):
#include<iostream>
typedef long unsigned real; //typedef our long unsigned ints in to the label "real" so we don't confuse it with other datatypes.
using namespace std; //Just so I don't have to type out std::cout any more!
#define asreal(x) (*((float *) &x)) //Cast the address of X as a float pointer as a pointer. So we don't let the compiler truncate our FP values when being converted.
inline int extractExponent(real from) {
return ((from >> 23) & 0xFF) - 127; //Shift right 23 bits; & with eight ones (0xFF == 1111_1111 ) and make bias with the value by subtracting all ones from it.
}
void fpadd ( real left, real right, real *dest) {
//Left operand field containers
long unsigned int Lexponent = 0;
long unsigned Lmantissa = 0;
int Lsign = 0;
//RIGHT operand field containers
long unsigned int Rexponent = 0;
long unsigned Rmantissa = 0;
int Rsign = 0;
//Resulting operand field containers
long int Dexponent = 0;
long unsigned Dmantissa = 0;
int Dsign = 0;
std::cout << "Size of datatype: long unsigned int is: " << sizeof(long unsigned int); //For debugging
//Properly initialize the above variable's:
//Left
Lexponent = extractExponent(left); //Zero. This value is NOT a flat zero when displayed because we subtract 127 from the exponent after extracting it! //Value is: 0xffffff81
Lmantissa = extractMantissa (left); //Zero. We don't do anything to this number except add a whole number one to it. //Value is: 0x00000000
Lsign = extractSign(left); //Simple.
//Right
**Rexponent = extractExponent(right); //Value is: 0x00000005 <-- why???**
Rmantissa = extractMantissa (right);
Rsign = extractSign(right);
}
int main (int argc, char *argv[]) {
real a, b, c;
asreal(a) = -0.0;
asreal(b) = 45.67;
fpadd(a,b, &c);
printf("Sum of A and B is: %f", c);
std::cin >> a;
return 0;
}
Help would be much appreciated; I'm several days in to this project and very frustrated!
in this example with the given values its 5. Why is that?
The floating point number 45.67 is internally represented as
2^5 * 1.0110110101011100001010001111010111000010100011110110
which actually represents the number
45.6700000000000017053025658242404460906982421875
This is as close as you can get to 45.67 inside float.
If all you are interested in is the exponent of a number, simply compute its base 2 logarithm and round down. Since 45.67 is between 32 (2^5) and 64 (2^6), the exponent is 5.
Computers use binary representation for all numbers. Hence, the exponent is for base two, not base ten. int(log2(45.67)) = 5.
I need to be able to read in a float or double from binary data in C++, similarly to Python's struct.unpack function. My issue is that the data I am receiving will always be big-endian. I have dealt with this for integer values as described here, but working byte by byte does not work with floating point values. I need a way to extract floating point values (both 32 bit floats and 64 bit doubles) in in C++, similar to how you would use struct.unpack(">f", num) or struct.unpack(">d", num) in Python.
here's an example of what I have tried:
stuct.unpack("d", num) ==> *(double*) str; // if str is a char* containing the data
That works fine if str is little-endian, but not if it is big-endian, as I know it will always be. The problem is that I do not know what the native endianness of the environment will be, so I need to be able to extract the binary data as big-endian at all times.
If you look at the linked question, you'll see this is easily using bitwise-ors and bitshifts for integer values, but that method does not work for floating point.
NOTE I should have pointed this out earlier, but I cannot use c++11 or any third party libraries other than Boost.
Why working byte by byte does not work with floating point values?
Just extract 32bit integer as usual, then reinterpret it as float: float f = *(float*)&i
And the same for 64bit integers and double
void ByteSwap(void * data, int size)
{
char * ptr = (char *) data;
for (int i = 0; i < size/2; ++i)
std::swap(ptr[i], ptr[size-1-i]);
}
bool LittleEndian()
{
int test = 1;
return *((char *)&test) == 1;
}
if (LittleEndian())
ByteSwap(&my_double, sizeof(double));
I have the following code:
typedef __int64 BIG_INT;
typedef double CUT_TYPE;
#define CUT_IT(amount, percent) (amount * percent)
void main()
{
CUT_TYPE cut_percent = 1;
BIG_INT bintOriginal = 0x1FFFFFFFFFFFFFF;
BIG_INT bintAfter = CUT_IT(bintOriginal, cut_percent);
}
bintAfter's value after the calculation is 144115188075855872 instead of 144115188075855871 (see the "2" in the end, instead of "1"??).
On smaller values such as 0xFFFFFFFFFFFFF I get the correct result.
How do I get it to work, on 32bit app? What do I have to take in account?
My aim is to cut a certain percentage of a very big number.
I use VC++ 2008, Vista.
double has a 52 bit mantissa, you're losing precision when you try to load a 60+ bit value into it.
Floating point calculations aren't guaranteed to be perfectly accurate, and you've defined CUT_TYPE as double.
See this answer for more info: Dealing with accuracy problems in floating-point numbers
I'm wondering if there is a way to represent a float using a char in C++?
For example:
int main()
{
float test = 4.7567;
char result = charRepresentation(test);
return 0;
}
I read that probably using bitset I can do it but I'm not pretty sure.
Let's suppose that my float variable is 01001010 01001010 01001010 01001010 in binary.
If I want a char array of 4 elements, the first element will be 01001010, the second: 01001010 and so on.
Can I represent the float variable in a char array of 4 elements?
I suspect what you're trying to say is:
int main()
{
float test = 4.7567;
char result[sizeof(float)];
memcpy(result, &test, sizeof(test));
/* now result is storing the float,
but you can treat it as an array of
arbitrary chars
for example:
*/
for (int n = 0; n < sizeof(float); ++n)
printf("%x", result[n]);
return 0;
}
Edited to add: all the people pointing out that you can't fit a float into 8 bits are of course correct, but actually the OP is groping towards the understanding that a float, like all atomic datatypes, is ultimately a simple contiguous block of bytes. This is not obvious to all novices.
the best you can is create custom float that is byte size. or use char as fixed point decimal. on all cases this will lead to significant loss of precision.
using a union is clean and easy
union
{
float f;
unsigned int ul;
unsigned char uc[4];
} myfloatun;
myfloatun.f=somenum;
printf("0x%08X\n",myfloatun.ul);
Much safer from a compiler perspective than pointers. Memcpy works just fine too.
EDIT
Okay, okay, here are fully functional examples. Yes, you have to use unions with care if you dont keep an eye on how this compiler allocates the union and pads or aligns it it can break and this is why some/many say it is dangerous to use unions in this way. Yet the alternatives are considered safe?
Doing some reading C++ has its own problems with unions and a union may very well just not work. If you really meant C++ and not C then this is probably bad. If you said kleenex and meant tissues then this might work.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef union
{
float f;
unsigned char uc[4];
} FUN;
void charRepresentation ( unsigned char *uc, float f)
{
FUN fun;
fun.f=f;
uc[0]=fun.uc[3];
uc[1]=fun.uc[2];
uc[2]=fun.uc[1];
uc[3]=fun.uc[0];
}
void floatRepresentation ( unsigned char *uc, float *f )
{
FUN fun;
fun.uc[3]=uc[0];
fun.uc[2]=uc[1];
fun.uc[1]=uc[2];
fun.uc[0]=uc[3];
*f=fun.f;
}
int main()
{
unsigned int ra;
float test;
char result[4];
FUN fun;
if(sizeof(fun)!=4)
{
printf("It aint gonna work!\n");
return(1);
}
test = 4.7567F;
charRepresentation(result,test);
for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");
test = 1.0F;
charRepresentation(result,test);
for(ra=0;ra<;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");
test = 2.0F;
charRepresentation(result,test);
for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");
test = 3.0F;
charRepresentation(result,test);
for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");
test = 0.0F;
charRepresentation(result,test);
for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");
test = 0.15625F;
charRepresentation(result,test);
for(ra=0;ra<4;ra++) printf("0x%02X ",(unsigned char)result[ra]); printf("\n");
result[0]=0x3E;
result[1]=0xAA;
result[2]=0xAA;
result[3]=0xAB;
floatRepresentation(result,&test);
printf("%f\n",test);
return 0;
}
And the output looks like this
gcc fun.c -o fun
./fun
0x40 0x98 0x36 0xE3
0x3F 0x80 0x00 0x00
0x40 0x00 0x00 0x00
0x40 0x40 0x00 0x00
0x00 0x00 0x00 0x00
0x3E 0x20 0x00 0x00
0.333333
You can verify by hand, or look at this website as I took examples directly from it, the output matches what was expected.
http://en.wikipedia.org/wiki/Single_precision
What you do not ever want to do is point at memory with a pointer to look at it with a different type. I never understood why this practice is used so often, particularly with structs.
int broken_code ( void )
{
float test;
unsigned char *result
test = 4.567;
result=(unsigned char *)&test;
//do something with result here
test = 1.2345;
//do something with result here
return 0;
}
That code will work 99% of the time but not 100% of the time. It will fail when you least expect it and at the worst possible time, like the day after your most important customer receives it. Its the optimizer that eats your lunch with this coding style. Yes, I know most of you do this and were taught this and perhaps have never been burned....yet. That just makes it more painful when it finally does happen, because now you know that it can and has failed (with popular compilers like gcc, on common computers like a pc).
After seeing this fail when using this method for testing an fpu, programmatically building specific floating point numbers/patterns, I switched to the union approach which so far has never failed. By definition the elements in the union share the same chunk of storage, and the compiler and optimizer do not get confused about the two items in that shared chunk of storage being...in that same shared chunk of storage. With the above code you are relying on an assumption that there is non-register memory storage behind every use of the variables and that all variables are written back to that storage before the next line of code. Fine if you never optimize or if you use a debugger. The optimizer does not know in this case that result and test share the same chunk of memory, and that is the root of the problem/bug. To do the pointer game you have to get into putting volatile on everything, like a union you still have to know how the compiler aligns and pads, you still have to deal with endians.
The problem is generic that the compiler doesnt know the two items share the same memory space. For the specific trivial example above I have watched the compiler optimize out the assignment of the number to the floating point variable because that value/variable is never used. The address for the storage of that variable is used and if you were to say printf the *result data the compiler would not optimize out the result pointer and thus not optimize out the address to test and thus not optimize out the storage for test, but in this simple example it can and has happened where the numbers 4.567 and 1.2345 never make it into the compiled program. I have also see the compiler allocate the storage for test, but assign the numbers to a floating point register then never use that register nor copy the contents of that register to the storage that it has assigned. The reasons why it fails for less trivial examples can be harder to follow, often having to do with register allocation and eviction, change a line of code and it works, change another and it breaks.
Memcpy,
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void charRepresentation ( unsigned char *uc, float *f)
{
memcpy(uc,f,4);
}
void floatRepresentation ( unsigned char *uc, float *f )
{
memcpy(f,uc,4);
}
int main()
{
unsigned int ra;
float test;
unsigned char result[4];
ra=0;
if(sizeof(test)!=4) ra++;
if(sizeof(result)!=4) ra++;
if(ra)
{
printf("It aint gonna work\n");
return(1);
}
test = 4.7567F;
charRepresentation(result,&test);
printf("0x%02X ",(unsigned char)result[3]);
printf("0x%02X ",(unsigned char)result[2]);
printf("0x%02X ",(unsigned char)result[1]);
printf("0x%02X\n",(unsigned char)result[0]);
test = 0.15625F;
charRepresentation(result,&test);
printf("0x%02X ",(unsigned char)result[3]);
printf("0x%02X ",(unsigned char)result[2]);
printf("0x%02X ",(unsigned char)result[1]);
printf("0x%02X\n",(unsigned char)result[0]);
result[3]=0x3E;
result[2]=0xAA;
result[1]=0xAA;
result[0]=0xAB;
floatRepresentation(result,&test);
printf("%f\n",test);
return 0;
}
gcc fcopy.c -o fcopy
./fcopy
0x40 0x98 0x36 0xE3
0x3E 0x20 0x00 0x00
0.333333
With the flaming I am going to get about my above comments, and depending on which side of the argument you choose to be on. Perhaps memcpy is your safest route. You still have to know the compiler very well, and manage your endians. The compiler should not screw up the memcpy it should store the registers to memory before the call, and execute in order.
You can only do so partially in a way that won't allow you to fully recover the original float. In general, this is called Quantization, and depending on your requirements there is an art to picking a good quantization. For example, floating point values used to represent R, G and B in a pixel will be converted to a char before being displayed on a screen.
Alternatively, it's easy to store a float in its entirety as four chars, with each char storing some of the information about the original number.
You can create, for that number, a fixed point value using 2 bits for the whole number and 5 bits for the fractional portion (or 6 if you want it to be unsigned). That would be able to store roughly 4.76 in terms of accuracy. You don't quite have enough size to represent that number much more accurately - unless you used a ROM lookup table of 256 entries where you are storing your info outside the number itself and in the translator.
int main()
{
float test = 4.7567;
char result = charRepresentation(test);
return 0;
}
If we ignore that your float is a float and convert 47567 into binary, we get 10111001 11001111. This is 16 bits, which is twice the size of a char (8 bits). Floats store their numbers by storing a sign bit (+ or -), an exponent (where to put the decimal point, in this case 10^-1), and then the significant digits (47567).
There's just not enough room in a char to store a float.
Alternatively, consider that a char can store only 256 different values. With four decimal places of precision, there are far more than 256 different values between 1 and 4.7567 or even 4 and 4.7567. Since you can't differentiate between more than 256 different values, you don't have enough room to store it.
You could conceivably write something that would 'translate' from a float to a char by limiting yourself to an extremely small range of values and only one or two decimal places*, but I can't think of any reason you would want to.
*You can store any value between 0 and 256 in a char, so if you always multiplied the value in the char by 10^-1 or 10^-2 (you could only use one of these options, not both since there isn't enough room to store the exponent) you could store any number between 0 and 25.6 or 0 and 2.56. I don't know what use this would have though.
A C char is only 8 bits (on most platforms). The basic problem this causes is two-fold. First, almost all FPUs in existence support IEEE floating point. That means floating point values either require 32 bits, or 64. Some support other non-standard sizes, but the only ones I'm aware of are 80 bits. None I have ever heard of support floats of only 8 bits. So you couldn't have hardware support for an 8-bit float.
More importantly, you wouldn't be able to get a lot of digits out of an 8-bit float. Remember that some bits are used to represent the exponent. You'd have almost no precision left for your digits.
Are you perhaps instead wanting to know about Fixed point? That would be doable in a byte.