I have a small sorting program in C++ developed using Xcode on a 2.4 GHz i7 MacBook Pro (I didn't change any of the configurations, so Xcode is probably using LLVM as the compiler).
The program only uses very standard operations like calculating sums over (parts of) lists (i.e. no explicit use of pointers or the like) and only uses standard types and vectors.
When I compile the same code using CL within Visual Studio 2010 on a 2.4 GHz i5 notebook, the program runs significantly slower (by a factor of at least 100).
Are there any well-known performance issues with translations from Xcode to VS like the one I just described?
I haven't changed much in Visual Studio 2010 either: Are there some options for CL to be turned on or off that do the job?
Many thanks in advance.
The i7 and i5 processors have a similar architecture. The two you mention have the same clock rate but belong to different tiers, and the i7 is the more powerful of the two, so they are not directly comparable on a benchmark like this. You would only get a valid comparison of the two programs by installing Windows on your Mac and timing both on the same hardware. @Jerry Coffin also has a point there.
Check out the difference between i5 and i7
In addition to @Jerry Coffin's comment, you need to use the Ctrl+F5 combination (Start Without Debugging) to run your code without the debugger (yes, the debugger is attached for release builds as well).
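To check whether the build configuration rather than the compiler explains a gap like this, the same small timing harness can be built and run in both environments. The following is only a sketch: the function names (partial_sum, time_partial_sums, msvc_build_kind) are made up for illustration and are not taken from the question's program. It sticks to std::clock so that it also compiles with older compilers such as the one in Visual Studio 2010.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <numeric>
#include <vector>

// Sum of the elements in the half-open range [first, last) of v.
double partial_sum(const std::vector<double>& v, std::size_t first, std::size_t last) {
    assert(first <= last && last <= v.size());
    return std::accumulate(v.begin() + first, v.begin() + last, 0.0);
}

// Returns the CPU time in seconds spent on `repeats` full sums over v.
// The accumulated value is written through `out` so the compiler
// cannot discard the work as unused.
double time_partial_sums(const std::vector<double>& v, int repeats, double* out) {
    const std::clock_t start = std::clock();
    double total = 0.0;
    for (int r = 0; r < repeats; ++r) {
        total += partial_sum(v, 0, v.size());
    }
    *out = total;
    return static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}

// Reports whether MSVC's _DEBUG macro is defined, which is the case when
// compiling against the debug runtime (checked iterators, no optimization
// in the default Debug configuration).
const char* msvc_build_kind() {
#ifdef _DEBUG
    return "MSVC debug build (_DEBUG defined)";
#else
    return "_DEBUG not defined";
#endif
}
```

Calling time_partial_sums from main on a large vector and printing the result next to msvc_build_kind() makes it easy to see whether the slow timings come from a Debug configuration or from running under the debugger.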
Visual Studio 2015 brought a lot of changes on the C++ compiler side, and I'm looking for a benchmark/performance comparison between the Intel C++ compiler and Visual Studio 2015.
By performance, I mean the performance of the generated code, something like this: https://software.intel.com/en-us/c-compilers/iss
Is there any benefit in using the Intel C++ compiler? Will it produce faster code?
Thanks
A few years ago, I did some tests on a Mac Pro with an Intel processor.
Results:
icc+linux
vc+win
icc+win
gcc+linux
icc+linux was the very best.
vc+win, icc+win were pretty close.
Explanation: the more assumptions the compiler vendor can make about the system and hardware, the better it can design a compiler that generates fast code.
Intel is the best because it can exploit its processor and the system (open source).
VC under windows works great too, they know their OS.
Now, this depends on the kind of software. If your program loads a lot of data from disk, the best will certainly be vc+win (they have a great implementation of internal buffers...). If your program is heavily multithreaded, icc+linux is going to win for sure. These are the only 2 examples I can talk about because I tested these use cases.
I compared ICC and VC on Windows, and they were very close in terms of performance. I was able to make ICC beat VC only by using the "profile guided optimization" feature.
I am currently developing a cross-platform framework in which I want to use current OpenMP features.
I would like to make use of the "new features" of OpenMP 3.0 (or later),
such as unsigned parallel for loops, tasks, etc.
I haven't developed on a Windows platform for quite a while, and
as far as I have seen, even Visual Studio 2015 only supports OpenMP 2.0 (at least when using MSVC; see e.g. All OpenMP Tasks running on the same thread or https://blogs.msdn.microsoft.com/vcblog/2014/11/12/visual-studio-2015-preview-is-now-available/). So my questions are:
Is there any sane reason to not support openmp3.0 in Visual Studio?
Is there any way to get it work under Visual Studio?
I am aware that I could use the Intel C++ compiler, but unfortunately I do not have access to one. So is there a free alternative to the Intel compiler with OpenMP 3.0 support?
Thanks in advance
Well, you might try GCC ports for Windows, native (mingw64) and on top of cygwin.
Try installing MSYS2 and you'll get mingw64 as well as Cygwin compilers with OpenMP support.
You can try cygwin.
Cygwin provides a port of the GNU gcc compiler for Windows.
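As a quick way to check whether a given toolchain accepts the OpenMP 3.0 features mentioned in the question, a small test like the one below can be compiled, for example with g++ -fopenmp from one of the GCC ports suggested above. This is only a sketch with made-up function names (sum_unsigned_loop, fib_tasks, fib_parallel); it uses an unsigned loop index in a parallel for and the task/taskwait constructs, both of which were introduced in OpenMP 3.0. Compiled without OpenMP support, the pragmas are ignored and the code runs serially.

```cpp
#include <cassert>
#include <vector>

// Parallel reduction with an unsigned loop index, which OpenMP 2.0
// does not allow (it requires a signed integer loop variable).
unsigned long long sum_unsigned_loop(const std::vector<unsigned>& v) {
    unsigned long long total = 0;
    const unsigned n = static_cast<unsigned>(v.size());
    #pragma omp parallel for reduction(+:total)
    for (unsigned i = 0; i < n; ++i) {
        total += v[i];
    }
    return total;
}

// Recursive Fibonacci using OpenMP 3.0 tasks. Deliberately naive:
// it only exists to exercise the task and taskwait constructs.
long fib_tasks(int n) {
    assert(n >= 0);
    if (n < 2) return n;
    long a = 0, b = 0;
    #pragma omp task shared(a)
    a = fib_tasks(n - 1);
    #pragma omp task shared(b)
    b = fib_tasks(n - 2);
    #pragma omp taskwait
    return a + b;
}

// Opens the parallel region once and lets a single thread create the
// root of the task tree; the other threads pick up the spawned tasks.
long fib_parallel(int n) {
    long result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_tasks(n);
    }
    return result;
}
```

If the compiler only implements OpenMP 2.0, it is expected to reject or ignore the task pragmas and to complain about the unsigned loop variable, which makes this a convenient probe when evaluating alternatives.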
I have a large Fortran/C++ project that assembles hundreds of Fortran intermediate files into a single executive. When I monitor some of the global single precision floating point variables, I get different results when I run the executive on a Windows 7 x64 machine vs a Windows XP SP2 x86 machine. The differences are as much as 1-2%.
The project was built on the x86 machine and not rebuilt before testing on the x64 machine, although I am using the exact same compiler (Compaq Visual Fortran 6.6) and development studio (Visual Studio 6.0), and identical code for both machines. The x64 machine has a Pentium E5400, the x86 machine has a Pentium 4 dual core. Could this be an example of Deterministic Lockstep?
I know this is vague - I wish I could provide some code, but there's over 1 million lines. All of the variables are REAL*4 and are calculated in the Fortran code several hundred times per second. The c++ MFC code assembles it into the executive.
Introduction
The differences you are observing are (probably) due to the fact that your executable includes optimized floating-point instructions, and the results of these instructions can differ between architectures.
Enable floating-point consistency
Note: The following only applies to older (6.0) versions of msvc++.
Unless you explicitly tell the compiler that you don't want it to optimize floating-point operations (where the trade-off might be some slight inaccuracy), it will do so.
Passing /Op as a flag to the compiler enables the "'consistency' floating-point model"; effectively disabling the previous mentioned optimization.
msdn.microsoft.com - VS6.0 - /Op (Improve Float Consistency)
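In later versions (VS2005 onwards, including VS2008) /Op is replaced by the /fp family of options: /fp:precise is the default, and /fp:strict selects the strictest model.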
Turn off SSE2
Note: The following only applies to newer versions of msvc++.
Unless you explicitly tell msvc++ not to generate SSE{,2} instructions for your floating-point calculations, they will be included in your executable.
You can force the compiler to disable generation of SSE and SSE2 instructions by passing the flag /arch:IA32 to it.
msdn.microsoft.com - /arch (x86)
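To illustrate why single-precision results are sensitive to how intermediate values are kept, here is a small sketch (not taken from the poster's project; the function names are made up). It accumulates the same float term once in a float accumulator, rounding to single precision at every step, and once in a wider accumulator that is rounded only at the end. This is roughly the difference between code that rounds every intermediate result and code that keeps intermediates in wider registers.

```cpp
#include <cassert>
#include <cstddef>

// Adds `term` to a single-precision accumulator `count` times.
// Every addition is rounded to float, so rounding errors build up.
float sum_in_float(float term, std::size_t count) {
    assert(count > 0);
    float acc = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        acc += term;
    }
    return acc;
}

// Same sum, but the intermediate value is kept in double precision
// and rounded to float only once, when the result is returned.
float sum_in_double(float term, std::size_t count) {
    assert(count > 0);
    double acc = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        acc += term;
    }
    return static_cast<float>(acc);
}
```

Comparing sum_in_float(0.1f, 1000000) with sum_in_double(0.1f, 1000000) typically shows a visible gap between the two, even though both functions return a float and perform the same arithmetic on paper.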
I am working on a high performance scientific application and found that moving the computations to the Intel compiler gives a lot of speedup through fast generated code, vectorization and better auto-parallelization. But my main application is still in Microsoft C++ and uses COM. My questions are
1) Is it possible to build an assembly in Intel C++ compiler and load it into an application built with Microsoft compiler? Will it have incompatibilities?
2) What is the level of support for COM in Intel compilers.
Any advice in this area is appreciated.
Thank you
--Sai
Posting Sai Venkat comment as an answer:
Here is the reply I received from Intel. Intel compiler has 100% support for Microsoft compiler as long as we don't use /clr in compilation
I am writing a server application which has a large amount of source code. Compiling the application on my Intel Atom z510 takes around 15-20 minutes, and about 2-3 minutes on my Intel i7.
I am very new to cross compiling, new as in I've never done it. I can't find any reference on how to cross compile to the Z510. I found a great SO article on optimization flags for the atom here. However, no description on how to use them on my Intel i7 pc for my Intel Atom CPU.
I am assuming that anything compiled on my i7 will by default be optimized for my i7, causing performance drops on the Atom. Any advice/search terms/websites would be greatly appreciated.
As always, thank you so much ahead of time.
Edit: I am using gcc 4.4. Apologies. (The one that comes with Ubuntu 10.04)
Constantin
I think your assumption that code compiled on the Atom is automatically optimized for the Atom is faulty.
Even if you request that behavior via -march=native -mtune=native, gcc 4.4 doesn't know how to optimize for Atom.
And code optimized for the Core i7 would run more slowly than code compiled on the Atom only if you are passing those flags to get code optimized for the Core i7 (which I think also requires a later version of gcc). Getting rid of those flags would cause the compiler on the i7 to generate the same code as the one on the Atom.
If you're on your i7 and want to compile binaries compatible with and optimised for your Atom, just pass the -march=atom option to gcc (this option was added in GCC 4.5, so it needs a newer compiler than gcc 4.4). The binaries produced should work, provided you're running the same OS on both systems (this includes agreeing on 32/64 bit-ness) and any necessary run-time dependencies are present.
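One way to confirm what a given -march setting actually enabled is to inspect the predefined macros GCC sets for each instruction-set extension. The helper below is a hypothetical sketch (the name enabled_isa_extensions is made up); it only reports a handful of the x86 feature macros and is not a complete list.

```cpp
#include <string>

// Lists a few x86 instruction-set extensions the compiler was allowed
// to use for this translation unit, based on GCC's predefined macros.
// The result depends on the -march/-m flags used at compile time,
// not on the CPU the program later runs on.
std::string enabled_isa_extensions() {
    std::string out = "isa:";
#ifdef __SSE2__
    out += " sse2";
#endif
#ifdef __SSE3__
    out += " sse3";
#endif
#ifdef __SSSE3__
    out += " ssse3";
#endif
#ifdef __SSE4_2__
    out += " sse4.2";
#endif
#ifdef __AVX__
    out += " avx";
#endif
    return out;
}
```

Printing the returned string from a binary built on the i7 shows whether extensions beyond what the target Atom supports were enabled; if they were, the binary may fail with an illegal instruction on the Atom.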