What will happen if a wrong argument is passed into MPI_Allreduce? - fortran

I was writing Fortran code using MPI_Allreduce. I accidentally passed MPI_DOUBLE_PRECISION instead of MPI_INTEGER when computing an integer sum. Sometimes I get the correct answer, and sometimes I get huge negative numbers. I am taking a parallel computing course and checked my textbook, but I could not find what exactly happens when the wrong MPI_Datatype is passed to MPI_Allreduce. Worse, neither the compiler nor the runtime complains at all, which makes debugging extremely difficult! Is there a way to detect that a wrong MPI_Datatype was passed into MPI_Allreduce? Could somebody explain in a bit more detail how MPI_Allreduce behaves when the wrong datatype argument is passed in?
Many thanks!

Generally speaking, an incorrect program has an undefined behavior.
In your case, using MPI_DOUBLE_PRECISION instead of MPI_INTEGER will likely make the library read and write past the ends of your buffers (DOUBLE PRECISION is typically 8 bytes where INTEGER is 4), hence memory corruption, and then all bets are off.
Also, the reduction operator (MPI_SUM, i.e. +) will operate on the data as floating-point numbers instead of integers, which is incorrect.
Feel free to study the binary format of DOUBLE PRECISION numbers and work out what happens when integer bit patterns are summed as doubles and the result is then read back as integers.
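For contrast, here is a minimal sketch of a correct call, written against the C bindings (the Fortran rule is identical): the count/datatype pair must describe exactly what is in the buffers.

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, local, sum;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    local = rank + 1;
    /* The buffers hold int, so the datatype must be MPI_INT
       (MPI_INTEGER in Fortran). Passing MPI_DOUBLE here would make
       the library read and sum 8 bytes per element from 4-byte data. */
    MPI_Allreduce(&local, &sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);
    if (rank == 0)
        printf("sum = %d\n", sum);
    MPI_Finalize();
    return 0;
}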

Why is fprintf causing memory leak and behaving unpredictably when width argument is missing

The following simple program behaves unpredictably. Sometimes it prints "0.00000", sometimes it prints more "0"s than I can count. Sometimes it uses up all the memory on the system before the system either kills some process or the program fails with bad_alloc.
#include "stdio.h"
int main() {
fprintf(stdout, "%.*f", 0.0);
}
I'm aware that this is incorrect usage of fprintf: there should be another int argument supplying the precision for the "*". It's just surprising that the behavior is so unpredictable. Sometimes it seems to use a default precision, while sometimes it fails very badly. Could this not be made to always fail, or always use some default behaviour?
I came across similar usage in some code at work and spent a lot of time figuring out what was happening. It only seemed to happen with debug builds, but would not happen while debugging with gdb. Another curiosity is that running it through valgrind would consistently bring about the many-"0"s case, which otherwise happens quite seldom, but then the memory usage issue would never occur either.
I am running Red Hat Enterprise Linux 7, and compiled with gcc 4.8.5.
Formally this is undefined behavior.
As for what you're observing in practice:
My guess is that fprintf ends up using an uninitialized integer as the number of decimal places to output. That's because it'll try to read a number from a location where the caller didn't write any particular value, so you'll just get whatever bits happen to be stored there. If that happens to be a huge number, fprintf will try to allocate a lot of memory to store the result string internally. That would explain the "running out of memory" part.
If the uninitialized value isn't quite that big, the allocation will succeed and you'll end up with a lot of zeroes.
And finally, if the random integer value happens to be just 5, you'll get 0.00000.
Valgrind probably consistently initializes the memory your program sees, so the behavior becomes deterministic.
Could this not be made to always fail
I'm pretty sure it won't even compile if you use gcc -pedantic -Wall -Wextra -Werror.
The format string does not match the arguments, therefore the behaviour of fprintf is undefined. Google "undefined behaviour C" for more information about undefined behaviour.
This would be correct:
// print 0.0 with 7 decimal places
fprintf(stdout, "%.*f", 7, 0.0);
Or maybe you just want this:
// print 0.0 with the default format (6 decimal places)
fprintf(stdout, "%f", 0.0);
About this part of your question: "Sometimes it seems to use a default width, while sometimes it fails very badly. Could this not be made to always fail or always use some default behaviour?"
There cannot be any default behaviour: fprintf reads the arguments according to the format string. If the arguments don't match, fprintf ends up with seemingly random values.
About this part of your question: "Another curiosity is that running it through valgrind would consistently bring about the printing of many "0"s case, which otherwise happens quite seldom, but the memory usage issue would never occur then either."
This is just another manifestation of undefined behaviour: under valgrind the conditions are quite different, and therefore the actual undefined behaviour can be different.
Undefined behaviour is undefined.
However, on the x86-64 System V ABI it is well known that arguments are not passed on the stack but in registers: floating-point values go in floating-point registers, and integers go in general-purpose registers. No parameters are stored on the stack, so the widths of the arguments don't matter. Since you never passed an integer in the variable-argument part, the general-purpose register that would carry the first variadic integer argument contains whatever garbage it held from before.
This program will show how the floating point values and integers are passed separately:
#include <stdio.h>
int main() {
    fprintf(stdout, "%.*f\n", 42, 0.0); /* 42 goes to an integer register, 0.0 to an FP register */
    fprintf(stdout, "%.*f\n", 0.0, 42); /* order swapped in the source, but they land in the same registers */
}
Compiled on x86-64, GCC + Glibc, both printfs will produce the same output:
0.000000000000000000000000000000000000000000
0.000000000000000000000000000000000000000000
This is undefined behaviour in the standard. It means "anything is fair game" because you're doing wrong things.
The worst part is that almost certainly your compiler warned you, but the warning was ignored. Adding some kind of validation beyond the compiler would incur a cost that everybody pays just so you can do what's wrong.
That's the opposite of what C and C++ stand for: you only pay for what you use. If you want to pay the cost of checking, it's up to you to do the checking.
What's really happening depends on the ABI, compiler and architecture. It's undefined behaviour because the language gives the implementer the freedom to do what's better on every machine (meaning, sometimes faster code, sometimes shorter code).
As an example, when you call a function on the machine, it just means that you're instructing the microprocessor to go to a certain code location.
In some made-up assembly and ABI, then, printf("%.*f", 5, 1.0); would translate into something like
mov A, STR_F   ; load into register A the address of the string "%.*f"
mov B, 5       ; load the integer precision argument into register B
mov F0, 1.0    ; load the floating-point argument into register F0
call printf    ; call the function
Now, if you omit a parameter, in this case the one destined for B, the register simply keeps whatever value it held from before.
The thing with functions like printf is that they accept anything in their parameter list (the signature is printf(const char*, ...), so any arguments are valid as far as the compiler is concerned). That's why you shouldn't use printf in C++: you have better alternatives, such as streams. printf bypasses the compiler's checks; streams are aware of the types involved and are extensible to your own types. That's also why your code should compile without warnings.

Exceeding built-in data type ranges

I have been using C++ for quite some time now and have taken many things for granted.
Recently, I asked myself how the compiler can always return accurate values when I use supposedly out-of-range values in calculations.
I understand the 2^n concept (n = number of bits).
For example, if I add two ints holding large values such as 10e6, I would expect the result to be wrong once the bits overflow, ultimately representing a wrong integer. But this never seems to happen.
Can anyone shed some light on this?
Thanks.
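For what it's worth, a minimal sketch of why such sums still come out right: 10e6 is only ten million, far below the roughly 2.1e9 limit of a typical 32-bit int, so no bits are overwritten; genuine signed overflow near INT_MAX is undefined behaviour in C and C++.

#include <stdio.h>
#include <limits.h>

int main()
{
    int a = 10000000, b = 10000000;      /* 10e6 each: well within range */
    printf("%d\n", a + b);               /* 20000000, exactly right */
    printf("INT_MAX = %d\n", INT_MAX);   /* typically 2147483647 */
    /* int c = INT_MAX + 1;  -- signed overflow: undefined behaviour */
    return 0;
}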

Initialize a variable

Is it better to declare and initialize the variable or just declare it?
What's the best and the most efficient way?
For example, I have this code:
#include <stdio.h>
int main()
{
    int number = 0;
    printf("Enter with a number: ");
    scanf("%d", &number);
    if (number < 0)
        number = -number;
    printf("The modulo is: %d\n", number);
    return 0;
}
If I don't initialize number, the code works fine, but I want to know, is it faster, better, more efficient? Is it good to initialize the variable?
scanf can fail, in which case nothing is written to number. So if you want your code to be correct you need to initialize it (or check the return value of scanf).
The speed of incorrect code is usually irrelevant, but for your example code, if there is a difference in speed at all, I doubt you would ever be able to measure it. Setting an int to 0 is much faster than I/O.
Don't attribute speed to a language; that attribute belongs to implementations of the language. There are fast implementations and slow implementations, and there are optimisations associated with the fast ones: a compiler that produces well-optimised machine code will optimise an initialisation away if it can deduce that the initialisation isn't needed.
In this case, it actually does need the initialisation. Consider what happens if scanf fails; when scanf fails, its return value reflects this failure. It'll return either:
A value less than zero if there was a read error or EOF (which can be triggered in an implementation-defined way, typically CTRL+Z on Windows and CTRL+d on Linux),
A number less than the number of objects provided to scanf (since you've provided only one object, this failure return value would be 0) when a conversion failure occurs (for example, entering 'a' on stdin when you've told scanf to convert sequences of '0'..'9' into an integer),
The number of objects scanf managed to assign to. This is 1, in your case.
Since you aren't checking for any of these return values (in particular #3), your compiler can't deduce that the initialisation is unnecessary and hence can't optimise it away. When the variable is uninitialised, failing to check these return values results in undefined behaviour; a chicken might appear to be living even when it is missing its head. It is best to check the return value of scanf, as in the sketch below. That way you never use an uninitialised value when scanf fails, and when you do initialise, your compiler can optimise the initialisation away, presuming you handle erroneous return values by producing error messages rather than using the variable.
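A minimal sketch of that pattern (the error message is illustrative, not prescribed):

#include <stdio.h>

int main()
{
    int number;
    printf("Enter with a number: ");
    if (scanf("%d", &number) != 1) {   /* read error, EOF, or conversion failure */
        fprintf(stderr, "Invalid input.\n");
        return 1;                      /* number is never read uninitialised */
    }
    if (number < 0)
        number = -number;
    printf("The modulo is: %d\n", number);
    return 0;
}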
edit: On that topic of undefined behaviour, consider what happens in this code:
if (number < 0)
    number = -number;
If number is -32768, and INT_MAX is 32767, then section 6.5, paragraph 5 of the C standard applies because -(-32768) isn't representable as an int.
Section 6.5, paragraph 5 says:
If an exceptional condition occurs during the evaluation of an
expression (that is, if the result is not mathematically defined or
not in the range of representable values for its type), the behavior
is undefined.
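A minimal sketch of one way to guard that negation (clamping to INT_MAX is just one hypothetical policy; reporting an error is another):

#include <limits.h>

int safe_abs(int number)
{
    if (number == INT_MIN)     /* -INT_MIN is not representable: would be UB */
        return INT_MAX;        /* hypothetical fallback policy */
    return number < 0 ? -number : number;
}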
Suppose you don't initialize a variable and your code is buggy (e.g. you forgot to read number). Then the uninitialized value of number is garbage, and different runs will output (or behave) differently.
But if you initialize all of your variables, the program produces a consistent result: an easy-to-trace error.
Yes, initialization adds an extra instruction at a low level, for example mov $0, 28(%esp). But it's a one-time cost and doesn't kill your code's efficiency.
So always initializing is a good practice!
With modern compilers, there isn't going to be any difference in efficiency. Coding style is the main consideration. In general, your code is more self-explanatory and less likely to have mistakes if you initialize all variables upon declaring them. In the case you gave, though, since the variable is effectively initialized by the scanf, I'd consider it better not to have a redundant initialization.
First, you need to answer these questions:
1) How many times is this function called? If you call it 10,000,000 times, it's a good idea to have the best version.
2) If I don't initialize my variable, am I sure that my code is safe and won't misbehave?
That said, an int initialization doesn't change much in your code, but a string initialization can.
Be sure that you do all the checks, because if you have an uninitialized variable your program is potentially buggy.
I can't tell you how many times I've seen simple errors because a programmer didn't initialize a variable. Just two days ago there was another question on SO where the root cause of the issue was simply that the OP hadn't initialized a variable.
When you talk about "speed" and "efficiency", don't only consider how much faster the code might compile or run (in this case it's pretty much irrelevant anyway); consider your debugging time when there's a simple mistake in the code due to the fact that you didn't initialize a variable that very easily could have been.
Note also, in my experience, larger corporations will run your code through tools like Coverity or Klocwork, which will flag uninitialized variables because they present a security risk.

Bizarre behavior from sprintf in C++/VS2010

I have a super-simple class representing a decimal # with fixed precision, and when I want to format it I do something like this:
assert(d.DENOMINATOR == 1000000);
char buf[100];
sprintf(buf, "%d.%06d", d._value / d.DENOMINATOR, d._value % d.DENOMINATOR);
Astonishingly (to me at least) this does not work. The %06d term comes out all 0s even when d.DENOMINATOR does not evenly divide d._value. And if I throw an extra %d in the format string, I see the right value show up in the third spot -- so it's like something is secretly creating an extra argument between my two.
If I compute the two terms outside of the call to sprintf, everything behaves how I expect. I thought to reproduce this with a more simple test case:
char testa[200];
char testb[200];
int x = 12345, y = 1000;
sprintf(testa, "%d.%03d", x/y, x%y);
int term1 = x/y, term2 = x%y;
sprintf(testb, "%d.%03d", term1, term2);
...but this works properly. So I'm completely baffled as to exactly what's going on, how to avoid it in the future, etc. Can anyone shed light on this for me?
(EDIT: Problem ended up being that d._value and d.DENOMINATOR are both long longs so %d doesn't suffice. Thanks very much to Serge's comment below which pointed to the problem, and Mark's answer submitted shortly thereafter.)
Almost certainly your term components are a 64-bit type (perhaps long on a 64-bit system) which is getting passed into the non-type-safe sprintf. Thus when you create an intermediate int the size is right and it works fine.
g++ will warn about this and many other useful things with -Wall. The preferred solution is of course to use C++ iostreams for your formatting as they're totally type safe.
The alternate solution is to cast the result of your expression to the type that you told sprintf to expect so it pulls the proper number of bytes out of memory.
Finally, never use sprintf when almost every compiler supports snprintf which prevents all sorts of silly mistakes. Your code is fine now but when someone modifies it later and it runs off the end of the buffer you may spend days tracking down the corruption.
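Putting that advice together, a hedged sketch assuming the fields are long long, as the edit confirms (the names value and DENOM are illustrative, not the asker's actual members):

#include <stdio.h>

int main()
{
    long long value = 12345678LL;        /* stands in for d._value */
    const long long DENOM = 1000000LL;   /* stands in for d.DENOMINATOR */
    char buf[100];
    /* %lld matches long long, and snprintf bounds the write. */
    snprintf(buf, sizeof buf, "%lld.%06lld", value / DENOM, value % DENOM);
    puts(buf);                           /* prints 12.345678 */
    return 0;
}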

How to store doubles in memory

Recently I changed some code
double d0, d1;
// ... assign things to d0/d1 ...
double result = f(d0, d1);
to
double d[2];
// ... assign things to d[0]/d[1]
double result = f(d[0], d[1]);
I did not change any of the assignments to d, nor the calculations in f, nor anything else apart from the fact that the doubles are now stored in a fixed-length array.
However when compiling in release mode, with optimizations on, result changed.
My question is, why, and what should I know about how I should store doubles? Is one way more efficient, or better, than the other? Are there memory alignment issues? I'm looking for any information that would help me understand what's going on.
EDIT: I will try to get some code demonstrating the problem, however this is quite hard as the process that these numbers go through is huge (a lot of maths, numerical solvers, etc.).
However there is no change when compiled in Debug. I will double check this again to make sure but this is almost certain, i.e. the double values are identical in Debug between version 1 and version 2.
Comparing Debug to Release, results have never ever been the same between the two compilation modes, for various optimization reasons.
You probably have a 'fast math' compiler switch turned on, or are doing something in the "assign things" (which we can't see) which allows the compiler to legally reorder calculations. Even though the sequences are equivalent, it's likely the optimizer is treating them differently, so you end up with slightly different code generation. If it's reordered, you end up with slight differences in the least significant bits. Such is life with floating point.
You can prevent this by not using 'fast math' (if that's turned on), or by forcing the ordering through the way you construct the formulas and intermediate values. Even that is hard (impossible?) to guarantee. The question is really "Why is the compiler generating different code for arrays vs. named variables?", but that's basically an analysis of the code generator.
No, these are equivalent; you have something else wrong.
Check the /fp:precise flag (or equivalent): the floating-point hardware can run in a higher-accuracy or a higher-speed mode, and it may have a different default in an optimized build.
With regard to floating-point semantics, these are equivalent. However, it is conceivable that the compiler might decide to generate slightly different code sequences for the two, and that could result in differences in the result.
Can you post a complete code example that illustrates the difference? Without that to go on, anything anyone posts as an answer is just speculation.
To your concerns: memory alignment cannot affect the value of a double, and a compiler should be able to generate equivalent code for either example, so you don't need to worry that you're doing something wrong (at least, not in the limited example you posted).
The first way is more efficient, in a very theoretical way. It gives the compiler slightly more leeway in assigning stack slots and registers. In the second example, the compiler has to pick 2 consecutive slots - except of course if the compiler is smart enough to realize that you'd never notice.
It's quite possible that the double[2] causes the array to be allocated as two adjacent stack slots where it wasn't before, and that in turn can cause code reordering to improve memory-access efficiency. IEEE 754 floating-point math doesn't obey the regular algebraic rules; in particular, addition is not associative: (a+b)+c != a+(b+c).
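A minimal C sketch of that last point, using the classic absorption example:

#include <stdio.h>

int main()
{
    double a = 1e16, b = -1e16, c = 1.0;
    printf("%.1f\n", (a + b) + c);   /* 1.0 */
    printf("%.1f\n", a + (b + c));   /* 0.0: c is absorbed into b's rounding */
    return 0;
}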