I've a code when I run in visual studio, I get different results and when I run using g++ compiler I get different results. It has a seed of 1, so I guess this should not affect it, Also there are some parts of the code which run in thread(but this part doesn't contain any rand function)
I get the same results by running the application on the same platform , but different if I use different compilers
For all behaviour that the standard defines, programs generated by all compilers must behave the same way.
For all behaviour that the standard leaves unspecified, compilers do not need to behave the same. The standard makes no guarantees about programs that violate the standard for example. The standard also leaves many details up to the implementation.
Also, compilers tend to not always comply to the standard in all cases and some compilers may not support same version of the standard as another.
Finally, some standard rules are found to be ambiguous, and different compilers may have chosen an opposite interpretation. These should be documented as defect reports.
.. seed ... rand function ...
The random sequence produced by rand is implementation defined. Yes, the results can be different with different compilers.
C++11 introduced <random> header. Of the random number generators defined there, default_random_engine is the only one that has implementation defined behaviour.
If you want a reproducible pseudo-random numbers, use the C++ facilities instead, so you can choose a well-defined generator.
The C pseudo-random-number generator isn't guaranteed to be the same across compilers or platforms.
Pseudo-random number generation is a very tough problem. It is better to use a third party library. For example, there are several random number generators included with the GNU Scientific Library (https://www.gnu.org/software/gsl/). When I need a reproducible random number generator I tend to use the Mersenne Twister which is the GSL default.
Random number generation for cryptography is done with other libraries.
Also you can test the quailty of you generator is using Die Harder II (https://www.phy.duke.edu/~rgb/General/dieharder.php). And yes, technically you cannot tell how random something is, but if you can find a pattern in it - it is definitely not random.
Related
I read an article in which different compilers were compared to infer which is the best in different circumstances. It gave me a thought. Even though I tried to google, I didn't manage to find a clear and lucid answer: will the program run faster or slower if I use different compilers to compile it? Suppose, it's some uncommon complicated algorithm that is used along with templating.
Yes. The compiler is what writes a program that implements the behavior you've described with your C or C++ code. Different compilers (or even the same compiler, given different options) can come up with vastly different programs that implement the same behavior.
Remember, your CPU does not execute C or C++ code. It only executes machine code. There is no defined standard for how the former gets transformed into the latter.
It may depend on the compiler, compiler version, compiler optimization settings, C++ language version used when compiling, the linker used, linker optimization options and much more. So in short, the answer to your question is Yes.
I have been developing some hopefully generic C++ code using MS Visual Studio 2008 / Windows. The code is ultimately going to be used within both IOS and Android apps. After some initial testing we found that my program behaved differently on Android/IOS and we traced this down to different values of RAND_MAX. Now the code is behaving better, but it is still not exactly the same as on Windows and it is a tricky process finding the differences, especially as I do not have the IOS/Android development environments set up at my end and my client is in a different time zone.
My question is, what could I do to avoid issues with different subtle compiler differences. For example is there a way of making one compiler behave like another? Or perhaps a website that lists common problems with compiler differences?... any ideas?
EDIT: The program does not employ any third party libraries.
The way to make code easier to go from one compiler to another is to make your code as standard compliant as possible. If you take RAND_MAX as an example the C11 7.22.2.1 (5) standard says
The value of the RAND_MAX macro shall be at least 32767
So if you are using RAND_MAX you have to take into account that it could be more than 32767 depending on what compiler you are using.
I would suggest getting a copy of both the C and C++ standards and start getting familiar with them. Whenever you are going to make an assumption of how the code will be treated you should consult those standard to make sure that you are using well defined behavior.
If you are having problems with RAND_MAX being different, it does imply that you're also using srand and rand - are you using them? Keep in mind that no standard actually say what pseudo random number generator they need to implement, so you will not get the same sequence of random numbers on different platforms even if RAND_MAX happens to have the same value. In order to get identical sequences of random numbers on varying platforms, you need to implement a pseudo random number generator of your own - see e.g. http://en.wikipedia.org/wiki/Linear_congruential_generator for an example of a very simple one.
As for the rest of your question - in most cases, it's not compiler differences that will cause you problems (i.e., as long as you don't rely on undefined behaviour or odd corner cases, chances are that your code itself will behave the same assuming that it doesn't fail to compile at all), but differences in the APIs and environment. In this case, RAND_MAX doesn't depend on the compiler, but is a feature of the standard C library of your target platform.
So if your code fails to compile, you clearly are relying on some nonstandard feature of the language (or nonstandard/nonportable API), but if it does compile but behaves differently, you're relying on some undefined detail in the standard C/C++ libraries.
In my work, our code is compiled with Visual 2008 (and soon 2013), gcc 4.8 on Linux, gcc 4.8 for Android, XCode (clang) for iOS. We're doing standard C++, or try to, but each compiler has its own way to deal with what standard defines.
The best thing to do is use only standard librairies (STL, boost), as much as possible. If there are some functions that are available only on one platform or compiler, you have to define a generic one yourself, and call each one for each platform.
And for what I've seen, building with gcc, almost 90% of the code (if not 99%) will be good on Android and iOS.
You can give a try with cygwin and gcc, that runs on Windows, that could help you to detect issues before your code is tested on other platforms.
Do all C++ STLs produce the same random numbers (for the same seed)?
Does this hold for all platforms?
Is this specified somewhere?
No, the standard does not require a specific implementation. Also, the only standard way to get random numbers are rand and srand, and they are not part of what originally was the STL, but rather functions taken over from C.
I have to work on a fortran program, which used to be compiled using Microsoft Compaq Visual Fortran 6.6. I would prefer to work with gfortran but I have met lots of problems.
The main problem is that the generated binaries have different behaviours. My program takes an input file and then has to generate an output file. But sometimes, when using the binary compiled by gfortran, it crashes before its end, or gives different numerical results.
This a program written by researchers which uses a lot of float numbers.
So my question is: what are the differences between these two compilers which could lead to this kind of problem?
edit:
My program computes the values of some parameters and there are numerous iterations. At the beginning, everything goes well. After several iterations, some NaN values appear (only when compiled by gfortran).
edit:
Think you everybody for your answers.
So I used the intel compiler which helped me by giving some useful error messages.
The origin of my problems is that some variables are not initialized properly. It looks like when compiling with compaq visual fortran these variables take automatically 0 as a value, whereas with gfortran (and intel) it takes random values, which explain some numerical differences which add up at the following iterations.
So now the solution is a better understanding of the program to correct these missing initializations.
There can be several reasons for such behaviour.
What I would do is:
Switch off any optimization
Switch on all debug options. If you have access to e.g. intel compiler, use ifort -CB -CU -debug -traceback. If you have to stick to gfortran, use valgrind, its output is somewhat less human-readable, but it's often better than nothing.
Make sure there are no implicit typed variables, use implicit none in all the modules and all the code blocks.
Use consistent float types. I personally always use real*8 as the only float type in my codes. If you are using external libraries, you might need to change call signatures for some routines (e.g., BLAS has different routine names for single and double precision variables).
If you are lucky, it's just some variable doesn't get initialized properly, and you'll catch it by one of these techniques. Otherwise, as M.S.B. was suggesting, a deeper understanding of what the program really does is necessary. And, yes, it might be needed to just check the algorithm manually starting from the point where you say 'some NaNs values appear'.
Different compilers can emit different instructions for the same source code. If a numerical calculation is on the boundary of working, one set of instructions might work, and another not. Most compilers have options to use more conservative floating point arithmetic, versus optimizations for speed -- I suggest checking the compiler options that you are using for the available options. More fundamentally this problem -- particularly that the compilers agree for several iterations but then diverge -- may be a sign that the numerical approach of the program is borderline. A simplistic solution is to increase the precision of the calculations, e.g., from single to double. Perhaps also tweak parameters, such as a step size or similar parameter. Better would be to gain a deeper understanding of the algorithm and possibly make a more fundamental change.
I don't know about the crash but some differences in the results of numerical code in an Intel machine can be due to one compiler using 80-doubles and the other 64-bit doubles, even if not for variables but perhaps for temporary values. Moreover, floating-point computation is sensitive to the order elementary operations are performed. Different compilers may generate different sequence of operations.
Differences in different type implementations, differences in various non-Standard vendor extensions, could be a lot of things.
Here are just some of the language features that differ (look at gfortran and intel). Programs written to fortran standard work on every compiler the same, but a lot of people don't know what are the standard language features, and what are the language extensions, and so use them ... when compiled with a different compiler troubles arise.
If you post the code somewhere I could take a quick look at it; otherwise, like this, 'tis hard to say for certain.
For instance why does most members in STL implementation have _M_ or _ or __ prefix?
Why there is so much boilerplate code ?
What features C++ is lacking that would allow make vector (for instance) implementation clear and more concise?
Implementations use names starting with an underscore followed by an uppercase letter or two underscores to avoid conflicts with user-defined macros. Such names are reserved in C++.
For example, one could define a macro called Type and then #include <vector>. If vector implementations used Type as a template parameter name, it would break.
However, one is not allowed to define macros called _Type (or __type, type__ etc.). Therefore, vector can safely use such names.
Lots of STL implementations also include checking for debug builds, such as verifying that two iterators are from the same container when comparing them, and watching for iterators going out of bounds. This involves fairly complex code to track the container and validity of every iterator created, but is invaluable for finding bugs. This code is also all interwoven with the standard release code with #ifdefs - even in the STL algorithms. So it's never going to be as clear as their most basic operation. Sites like this one show the most basic functionality of STL algorithms, stating their functionality is "equivalent to" the code they show. You won't see that in your header files though.
In addition to the good reasons robson and AshleysBrain have already given, one reason that C++ standard library implementations have such terse names and compact code is that virtually every C++ program (compilation unit, really) includes a large number of the standard library headers, and they are thus repeatedly recompiled (remember that they're largely inlined and template-based, whereas the C standard library headers only contain a handful of function declarations). A standard library written to "industry standard" style guidelines would take longer to compile and thus lead to the perception that a particular compiler was "slow". By minimizing whitespace and using short identifier names, the lexer and parser have less work to do, and the whole compilation process completes a little bit faster.
Another reason worth mentioning is that many standard library implementations (e.g. Dinkumware, Rogue Wave (old), etc.) can be used with several different compilers with widely different standards compliance and quirks. There's frequently a lot of macro hackery aimed at satisfying each supported platform.