"Floating-point invalid operation" when inputting float to a stringstream - c++

I have a simple piece of code that extracts a float from a FORTRAN-generated REAL array, and then inserts it into a stream for logging. Although this works for the first 30 cases, on the 31st it crashes with a "Floating-point invalid operation".
The code is:
int FunctionDeclaration(float* mrSwap)
{
...
float swap_float;
stringstream message_stream;
...
swap_float = *(mrSwap+30-1);
...
message_stream.clear();
message_stream << 30 << "\t" << swap_float << "\tblah blah blah \t";
When debugging, the value of swap_float the instant before the crash (on the last line, above) is 1711696.3 - other than this being much larger than most of the values up until this point, there is nothing particularly special about it.
I have also tried replacing message_stream with cerr, and got the same problem. I had hitherto believed cerr to be pretty much indestructible - how can a simple float destroy it?
Edit:
Thanks for the comments: I've added the declaration of mrSwap. mrSwap is approximately 200 elements long, so I'm a long way off the end. It is populated outside of my control, and individual entries may not be populated - but to the best of my understanding, this would just mean that swap_float would be set to a random float?

individual entries may not be populated - but to the best of my understanding, this would just mean that swap_float would be set to a random float?
Emphatically not. Certain bit patterns in an IEEE floating-point number indicate an invalid number -- for instance, the result of an overflowing arithmetic operation, or an invalid one (such as 0.0/0.0). The puzzling thing here is that the debugger apparently accepts the number as valid, while the stream output doesn't.
Try getting the bit layout of swap_float. On a 32-bit system:
int i = *(int*)&swap_float;
Then print i in hexadecimal, and let us know what you see.
Updated to add: From Mike's comment, i=1238430338, which is 49D0F282 in hex. This is a valid floating-point number, equal to exactly 1711696.25. So I don't know what's going on, I'm afraid. The only thing I can suggest is that maybe the compiler is loading the invalid floating-point number directly from the mrSwap array into the floating-point register bank, without going through swap_float. So the true value of swap_float is simply not available to the debugger. To check this, try
int j = *(int*)(mrSwap+30-1);
and tell us what you see.
Updated again to add: Another possibility is a delayed floating-point trap. The floating-point co-processor (built into the CPU these days) generates a floating-point interrupt because of some illegal operation, but the interrupt doesn't get noticed until the next floating-point operation is attempted. So this crash might be a result of the previous floating-point operation, which could be anywhere. Good luck with that...

I'm just adding this answer to highlight the correct solution within TonyK's answer above - because we went through a few iterations, the answer has been edited, and because several salient points are within the comments, the actual answer may not be immediately apparent. All credit should go to TonyK for the solution.
"Another possibility is a delayed floating-point trap. The floating-point co-processor (built into the CPU these days) generates a floating-point interrupt because of some illegal operation, but the interrupt doesn't get noticed until the next floating-point operation is attempted. So this crash might be a result of the previous floating-point operation, which could be anywhere." - TonyK
This was indeed the problem: in my comparison using IsSame, the other value was NaN (which is a valid value in this context), and although it happily subtracted it from swap_float, it set a flag saying to report the next floating-point operation as an error. I have to say that I was completely unaware that that was possible - I thought that if it worked, it worked.
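For anyone curious to see that mechanism, here is a minimal sketch (not from the original thread; the variable names are made up) using the standard <cfenv> facilities to look at the sticky IEEE exception flags that a comparison against NaN leaves behind:
#include <cfenv>
#include <iostream>
#include <limits>

int main()
{
    std::feclearexcept(FE_ALL_EXCEPT);
    // Stand-in for the unpopulated FORTRAN array entry.
    volatile double unset = std::numeric_limits<double>::quiet_NaN();
    // The ordered comparison "works", but raises the invalid-operation flag
    // on IEEE-754 systems; with FP exceptions unmasked, the *next* FP
    // instruction would then fault, just as described above.
    bool smaller = (unset < 0.0);
    if (std::fetestexcept(FE_INVALID))
        std::cerr << "FE_INVALID is pending\n";
    std::cout << smaller << '\n';
}
Whether a pending flag actually becomes a hardware fault depends on whether floating-point exceptions have been unmasked, which something in the original environment evidently had done, given the crash.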

Related

A unique type of data conversion

In the following code:
int i, tt;
tt = 5;
for (i = 0; i < tt; i++)
{
    int c, d, l;
    scanf("%lld%lld%lld", &c, &d, &l);
    printf("%d %d %d %d", c, d, l, tt);
}
In the first iteration, the value of tt changes to 0 by itself.
I know that I have declared c, d and l as int while reading them as long long, so that is making c and d zero. But I still can't understand how tt is becoming 0.
A small but obligatory announcement. As was said in the comments, you are facing undefined behavior, so
don't be surprised that tt gets set to zero
don't be surprised that tt is not set to zero after insignificant code changes (e.g. reordering the declaration from "int i,tt;" to "int tt, i;" or vice versa)
don't be surprised that tt is not set to zero after compiling with different flags, a different compiler version, for a different platform, or when testing with different input
don't be surprised by anything. Any behavior is possible.
You can't expect this code to work one way or another, so don't ever use it in a real program.
However, you seem to be OK with that, and the question is "what is actually happening with tt". IMHO this question is really great: it shows a desire to understand programming more deeply, and it helps in digging into the lower layers. So let's get started.
Possible explanation
I failed to reproduce the behavior on VS2015, but the situation is quite clear. The actual data alignment, variable sizes, endianness, stack growth direction and other details may differ on your PC, but the general idea should be the same.
The variables i, tt, c, d and l are local, so they are stored on the stack. Let's assume sizeof(int) is 4 and sizeof(long long) is 8, which is quite common. One possible layout (the original answer showed it as a picture, with addresses growing from left to right and each cell representing one byte) places tt in the 4 bytes immediately after c.
When calling scanf, you pass the address of c to be filled with data. But the data written is 8 bytes, so both c and tt are overwritten. With a little-endian representation you always write zeroes into tt unless the user enters a really big number, while c actually gets valid data for small numbers.
However, the valid data in c will be overwritten the same way while filling d, and the same will happen to d while filling l. So only l gets a nonzero value in the described case. Easy test: enter large numbers for c, d and l and check whether tt is still zero.
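Since the picture is not reproduced here, a quick way to see the layout on your own machine is to print the addresses of the locals; this is only a sketch, and the layout it reveals is implementation-specific:
#include <stdio.h>

int main(void)
{
    int i, tt;
    tt = 5;
    for (i = 0; i < tt; i++)
    {
        int c, d, l;
        /* If &tt turns out to sit 4 bytes above &c, an 8-byte %lld write
           into c spills over and clobbers tt. */
        printf("&tt=%p &c=%p &d=%p &l=%p\n",
               (void *)&tt, (void *)&c, (void *)&d, (void *)&l);
        break;
    }
    return 0;
}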
How to get a precise answer
You can get all the answers from the assembly code. Enable a disassembly listing (the exact steps depend on the toolchain: gcc has the -S option, Visual Studio has a "Go To Disassembly" item in the context menu while stopped on a breakpoint) and analyze the listing. It's really helpful to see the exact instructions your CPU is going to execute. Some debuggers allow executing instructions one by one. You need to find out how the variables are aligned on the stack and when exactly they are overwritten. Analyzing scanf is hard for beginners, so you can start with a simplified version of your program: replace scanf with the following (I can't test it, but it should work):
*((long long *)(&c)) = 1; // or any other user-specified values
*((long long *)(&d)) = 2; // each line stores 8 bytes at the address of a 4-byte int,
*((long long *)(&l)) = 3; // mimicking what scanf("%lld", ...) does

Taking a root higher than 2

I'm trying to take the 11th root of an expression and I'm getting a return of -inf.
std::cout << pow(j,(1.0/11.0)) << std::endl;
where j is just some log expression. I've checked that number to make sure it's valid, and it is. I'm thinking it's the way that the power expression is being run. Is there a better way to do this? Thanks.
And yes, I've included cmath in my code.
I can't think of a valid reason for pow to return -inf if your inputs are even marginally sane. However, in case you're passing in a negative number, something that may be worth trying is:
if(j==0) return 0;
if(j<0) return -pow(-j, 1.0/11.0);
return pow(j,1.0/11.0);
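A self-contained version of that snippet, wrapped in a hypothetical helper (the function name is mine): an odd root such as the 11th is mathematically defined for negative inputs, but pow(x, 1.0/11.0) is not, so the sign is handled explicitly.
#include <cmath>
#include <iostream>

double eleventh_root(double j)
{
    if (j == 0.0) return 0.0;
    if (j < 0.0)  return -std::pow(-j, 1.0 / 11.0);
    return std::pow(j, 1.0 / 11.0);
}

int main()
{
    std::cout << eleventh_root(2048.0)  << '\n';  // ~2 (2^11 == 2048)
    std::cout << eleventh_root(-2048.0) << '\n';  // ~-2 instead of NaN
}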
Try looking for FPU errors.
The most common is a forgotten return of a float/double from some function, which leads to problems on the FPU stack, which is really small.
You can also try adding this before the pow call:
asm { fninit; };
This resets the FPU, so if you have problems on the stack it will help. But of course do not do this in the middle of some FPU computation, as it would destroy the result. If you are not on an x87 platform then this will not help.
The value of j before the crash would be a good thing to share with us.
Try storing the result of pow in some float/double variable and cout that variable, not a temporary. If it prints -inf, look inside that variable as well to see whether it is also -inf (it could be something wrong with the cout, not pow...).
Minimize your code (turn things off part by part) and see if the problem suddenly disappears. Hidden memory leaks and code overwrites are evil...
Let us know what you have found.
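A minimal sketch of the "store the result and inspect it" suggestion above; the value of j is a stand-in, since the real log expression isn't shown:
#include <cmath>
#include <iostream>

int main()
{
    double j = 123.456;                  // stand-in for your log expression
    double r = std::pow(j, 1.0 / 11.0);  // store it, don't stream a temporary
    if (std::isinf(r) || std::isnan(r))
        std::cerr << "pow itself produced " << r << " for j = " << j << '\n';
    else
        std::cout << r << '\n';          // if this is fine, suspect the output path, not pow
}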

Initialize a variable

Is it better to declare and initialize the variable or just declare it?
What's the best and the most efficient way?
For example, I have this code:
#include <stdio.h>
int main()
{
int number = 0;
printf("Enter with a number: ");
scanf("%d", &number);
if(number < 0)
number= -number;
printf("The modulo is: %d\n", number);
return 0;
}
If I don't initialize number, the code works fine, but I want to know: is it faster, better, or more efficient? Is it good practice to initialize the variable?
scanf can fail, in which case nothing is written to number. So if you want your code to be correct you need to initialize it (or check the return value of scanf).
The speed of incorrect code is usually irrelevant, but for you example code if there is a difference in speed at all then I doubt you would ever be able to measure it. Setting an int to 0 is much faster than I/O.
Don't attribute speed to a language; that attribute belongs to implementations of the language. There are fast implementations and slow implementations. There are optimisations associated with fast implementations; a compiler that produces well-optimised machine code would optimise the initialisation away if it can deduce that the initialisation isn't needed.
In this case, it actually does need the initialisation. Consider what happens if scanf were to fail. When scanf fails, its return value reflects this failure. It'll either return:
A value less than zero if there was a read error or EOF (which can be triggered in an implementation-defined way, typically CTRL+Z on Windows and CTRL+d on Linux),
A number less than the number of objects provided to scanf (since you've provided only one object, this failure return value would be 0) when a conversion failure occurs (for example, entering 'a' on stdin when you've told scanf to convert sequences of '0'..'9' into an integer),
The number of objects scanf managed to assign to. This is 1, in your case.
Since you aren't checking for any of these return values (particularly #3), your compiler can't deduce that the initialisation is unnecessary and hence can't optimise it away. When the variable is uninitialised, failure to check these return values results in undefined behaviour. A chicken might appear to be living, even when it is missing its head. It would be best to check the return value of scanf, as in the sketch below. That way, when your variable is uninitialised you can avoid using an uninitialised value, and when it isn't your compiler can optimise away the initialisation, presuming you handle erroneous return values by producing error messages rather than using the variable.
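A minimal reworking of the question's program along those lines; it is only a sketch, and the error handling shown is one of several reasonable choices:
#include <stdio.h>

int main(void)
{
    int number;  /* no initialiser needed if we only use it on success */
    printf("Enter with a number: ");
    if (scanf("%d", &number) != 1)  /* 1 == number of objects assigned */
    {
        fprintf(stderr, "Invalid input\n");
        return 1;
    }
    if (number < 0)
        number = -number;
    printf("The modulo is: %d\n", number);
    return 0;
}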
Edit: On the topic of undefined behaviour, consider what happens in this code:
if(number < 0)
number= -number;
If number is -32768, and INT_MAX is 32767, then section 6.5, paragraph 5 of the C standard applies because -(-32768) isn't representable as an int.
Section 6.5, paragraph 5 says:
"If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined."
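A small sketch of one way to guard against that case, using INT_MIN from <limits.h> for whatever width int has on your platform; widening to a larger type first is another option:
#include <limits.h>
#include <stdio.h>

int main(void)
{
    int number = INT_MIN;  /* worst case: -number is not representable */
    if (number < 0)
    {
        if (number == INT_MIN)
        {
            fprintf(stderr, "cannot negate INT_MIN\n");
            return 1;
        }
        number = -number;
    }
    printf("The modulo is: %d\n", number);
    return 0;
}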
Suppose you don't initialize a variable and your code is buggy (e.g. you forgot to read number). Then the uninitialized value of number is garbage, and different runs will produce different results or behaviour.
But if you initialize all of your variables, the program will produce consistent results: an easy-to-trace error.
Yes, initialization adds extra steps to your code at the low level, for example mov $0, 28(%esp). But it's a one-time cost and doesn't kill your code's efficiency.
So always initializing is good practice!
With modern compilers, there isn't going to be any difference in efficiency. Coding style is the main consideration. In general, your code is more self-explanatory and less likely to have mistakes if you initialize all variables upon declaring them. In the case you gave, though, since the variable is effectively initialized by the scanf, I'd consider it better not to have a redundant initialization.
First, you need to answer these questions:
1) How many times is this function called? If you call it 10,000,000 times, it's a good idea to have the most efficient version.
2) If I don't initialize my variable, am I sure that my code is safe and won't throw any exception?
That said, an int initialization doesn't change much in your code, but a string initialization does.
Make sure you do all the checks, because if you have an uninitialized variable your program is potentially buggy.
I can't tell you how many times I've seen simple errors because a programmer doesn't initialize a variable. Just two days ago there was another question on SO where the end result of the issue being faced was simply that the OP didn't initialize a variable and thus there were problems.
When you talk about "speed" and "efficiency", don't simply consider how much faster the code might compile or run (and in this case it's pretty much irrelevant anyway); also consider your debugging time when there's a simple mistake in the code due to the fact that you didn't initialize a variable that very easily could have been.
Note also, my experience is that when coding for larger corporations they will run your code through tools like Coverity or Klocwork, which will ding you for uninitialized variables because they present a security risk.

strange results with /fp:fast

We have some code that looks like this:
inline int calc_something(double x) {
if (x > 0.0) {
// do something
return 1;
} else {
// do something else
return 0;
}
}
Unfortunately, when using the flag /fp:fast, we get calc_something(0)==1 so we are clearly taking the wrong code path. This only happens when we use the method at multiple points in our code with different parameters, so I think there is some fishy optimization going on here from the compiler (Microsoft Visual Studio 2008, SP1).
Also, the above problem goes away when we change the interface to
inline int calc_something(const double& x) {
But I have no idea why this fixes the strange behaviour. Can anyone explain this behaviour? If I cannot understand what's going on, we will have to remove the /fp:fast switch, but this would make our application quite a bit slower.
I'm not familiar enough with FPUs to comment with any certainty, but my guess would be that the compiler is letting an existing value that it thinks should be equal to x sit in on that comparison. Maybe you go y = x + 20.; y = y - 20; y is already on the FP stack, so rather than load x the compiler just compares against y. But due to rounding errors, y isn't quite 0.0 like it is supposed to be, and you get the odd results you see.
For a better explanation: Why is cos(x) != cos(y) even though x == y? from the C++FAQ lite. This is part of what I'm trying to get across, I just couldn't remember where exactly I had read it until just now.
Changing to a const reference fixes this because the compiler is worried about aliasing. It forces a load from x because it can't assume its value hasn't changed at some point after creating y, and since x is actually exactly 0.0 [which is representable in every floating point format I'm familiar with] the rounding errors vanish.
I'm pretty sure MS provides a pragma that allows you to set the FP flags on a per-function basis. Or you could move this routine to a separate file and give that file custom flags. Either way, it could prevent your whole program from suffering just to keep that one routine happy.
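A sketch of the per-function approach mentioned above, assuming MSVC's float_control pragma is available in your compiler version (the specific pragma is my suggestion, not part of the original answer):
// Force precise floating-point semantics around this one routine while the
// rest of the translation unit is still built with /fp:fast.
#pragma float_control(precise, on, push)
inline int calc_something(double x) {
    if (x > 0.0) {
        // do something
        return 1;
    } else {
        // do something else
        return 0;
    }
}
#pragma float_control(pop)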
What are the results of calc_something(0L) or calc_something(0.0f)? It could be linked to the size of the types before conversion. An int is 4 bytes, a double is 8.
Have you tried looking at the assembly code to see how the aforementioned conversion is done?
Googling for 'fp fast', I found this post [social.msdn.microsoft.com]
As I've said in another question, compilers suck at generating floating-point code. The article Dennis links to explains the problems well. Here's another: an MSDN article.
If the performance of the code is important, you can easily¹ out-perform the compiler by writing your own assembler code. If your algorithm is vectorisable then you can make use of SIMD too (with a slight loss of precision though).
¹ Assuming you understand the way the FPU works.
inline int calc_something(double x) will (probably) use an 80-bit register. inline int calc_something(const double& x) would store the double in memory, where it takes 64 bits. That at least explains the difference between the two.
However, I find your test quite fishy to begin with. The results of calc_something are extremely sensitive to rounding of its input. Your FP algorithms should be robust to rounding. calc_something(1.0-(1.0/3.0)*3) should be the same as calc_something(0.0).
I think the behavior is correct.
You should never compare a floating-point number more precisely than the holding type's precision allows.
Something that is computed to be zero may compare equal to, greater than, or less than another zero.
See http://floating-point-gui.de/
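For illustration, one common way to make the original test robust to rounding noise; the tolerance here is an arbitrary value chosen for this sketch and should be matched to the scale of the actual data:
inline int calc_something(double x) {
    const double eps = 1e-9;  // tolerance: pick to suit your data
    if (x > eps) {
        // do something: x is positive beyond rounding noise
        return 1;
    } else {
        // do something else: x is zero or negative (within tolerance)
        return 0;
    }
}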

How to detect an overflow in C++?

I just wonder if there is some convenient way to detect whether overflow happens to any variable of any built-in data type in a C++ program during runtime. By convenient, I mean no need to write code that checks each variable against the range of its data type every time its value changes. Or, if it is impossible to achieve this, how would you do it?
For example,
float f1=FLT_MAX+1;
cout << f1 << endl;
doesn't give any error or warning in either compilation with "gcc -W -Wall" or running.
Thanks and regards!
Consider using Boost's Numeric Conversion library, which gives you negative_overflow and positive_overflow exceptions (examples).
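A minimal sketch of that approach using boost::numeric_cast, which throws those exceptions when a value does not fit the destination type:
#include <boost/numeric/conversion/cast.hpp>
#include <iostream>

int main()
{
    try
    {
        double big = 1e20;                      // far beyond INT_MAX
        int n = boost::numeric_cast<int>(big);  // throws positive_overflow
        std::cout << n << '\n';
    }
    catch (const boost::numeric::positive_overflow& e)
    {
        std::cerr << "overflow: " << e.what() << '\n';
    }
}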
Your example doesn't actually overflow in the default floating-point environment on an IEEE-754 compliant system.
On such a system, where float is 32 bit binary floating point, FLT_MAX is 0x1.fffffep127 in C99 hexadecimal floating point notation. Writing it out as an integer in hex, it looks like this:
0xffffff00000000000000000000000000
Adding one (without rounding, as though the values were arbitrary precision integers), gives:
0xffffff00000000000000000000000001
But in the default floating-point environment on an IEEE-754 compliant system, any value between
0xfffffe80000000000000000000000000
and
0xffffff80000000000000000000000000
(which includes the value you have specified) is rounded to FLT_MAX. No overflow occurs.
Compounding the matter, your expression (FLT_MAX + 1) is likely to be evaluated at compile time, not runtime, since it has no side effects visible to your program.
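A quick way to confirm this behaviour; the second computation is included for contrast, since it really does overflow:
#include <cfloat>
#include <iostream>

int main()
{
    float f1 = FLT_MAX + 1;                // rounds back to FLT_MAX, no overflow
    std::cout << (f1 == FLT_MAX) << '\n';  // prints 1
    float f2 = FLT_MAX * 2.0f;             // this one does overflow...
    std::cout << f2 << '\n';               // ...and prints inf
}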
In situations where I need to detect overflow, I use SafeInt<T>. It's a cross-platform solution which throws an exception in overflow situations.
SafeInt<float> f1 = FLT_MAX;
f1 += 1; // throws
It is available on CodePlex:
http://www.codeplex.com/SafeInt/
Back in the old days when I was developing C++ (199x) we used a tool called Purify. Back then it was a tool that instrumented the object code and logged everything 'bad' during a test run.
I did a quick Google search and I'm not quite sure if it still exists.
As far as I know, several open-source tools exist nowadays that do more or less the same.
Check out Electric Fence and Valgrind.
Clang provides -fsanitize=signed-integer-overflow and -fsanitize=unsigned-integer-overflow.
http://clang.llvm.org/docs/UsersManual.html#controlling-code-generation
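A minimal example of how these flags might be used; the file name and test program are mine, not from the linked page:
// overflow_test.cpp -- compile with:
//   clang++ -fsanitize=signed-integer-overflow -g overflow_test.cpp
#include <limits>

int main()
{
    int x = std::numeric_limits<int>::max();
    x += 1;   // signed overflow: the sanitizer reports this at runtime
    return 0;
}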