Nowadays almost every user has 2 or 4 cores on the desktop (and on a good number of notebooks). Power users have 6-12 cores with AMD or Core i7 CPUs.
Which x86/x86_64 C/C++ compilers can use several threads to do the compilation?
There are already 'make -j N'-like solutions, but sometimes (with -fwhole-program or -ipo) there is a last big and slow step which is started sequentially.
Can any of these do it: GCC, Intel C++ Compiler, Borland C++ compiler, Open64, LLVM/GCC, LLVM/Clang, Sun compiler, MSVC, OpenWatcom, Pathscale, PGI, TenDRA, Digital Mars?
Is there an upper limit on the number of threads for the compilers that are multithreaded?
Thanks!
GCC has -flto=n or -flto=jobserver to make the linking step (which with LTO does the optimization and code generation) parallel. According to the documentation, these have been available since version 4.6, although I am not sure how good the implementation was in those early versions.
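For example, a minimal sketch (the flag names follow the GCC manual; the job count is only an illustration):

gcc -O2 -flto -c a.c
gcc -O2 -flto -c b.c
gcc -O2 -flto=8 a.o b.o -o prog

The first two invocations emit objects carrying the intermediate representation; the final step runs the link-time optimization and code generation with up to 8 parallel jobs. With -flto=jobserver, GCC instead takes job slots from the make jobserver, so the whole build respects a single -j limit.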
Some build systems can compile independent modules in parallel, but the compilers themselves are still single-threaded. I'm not sure there is anything to gain by making the compiler multi-threaded. The most time-consuming compilation phase is processing all the #include dependencies and they have to be processed sequentially because of potential dependencies between the various headers. The other compilation phases are heavily dependent on the output of previous phases so there's little benefit to be gained from parallelism there.
Newer Visual Studio versions can compile distinct translation units in parallel. It helps if your project uses many implementation files (such as .c, .cc, .cpp).
MSDN Page
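Concretely, that is the /MP switch of cl.exe (described on the MSDN page above); a rough sketch, with the job count picked only as an illustration:

cl /MP4 /O2 /c a.cpp b.cpp c.cpp d.cpp

Each translation unit still goes through a single-threaded front end, but up to four of them are compiled by separate cl processes at the same time.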
It is not really possible to multi-process the link stage. There may be some degree of multi-threading possible, but it is unlikely to give much of a performance boost. As such, many build systems will simply fire off a separate process per source file. Once they are all compiled, it will, as you note, perform a long single-threaded link. Alas, as I say, there is precious little you can do about this :(
Multithreaded compilation is not really useful as build systems (Make, Ninja) will start multiple compilation units at once.
And as Ferrucio stated, concurrent compilation is really difficult to implement.
Multithreaded linking, though, can be useful (concurrent .o/.a reading and symbol resolution), as this will most likely be the last build step.
The GNU gold linker can be multithreaded, with the LLVM ThinLTO implementation:
https://clang.llvm.org/docs/ThinLTO.html
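A rough sketch of what that looks like with Clang (the parallelism option below follows the ThinLTO documentation linked above for the gold plugin; treat the exact spelling as something to verify against your toolchain version):

clang -O2 -flto=thin -c a.c
clang -O2 -flto=thin -c b.c
clang -O2 -flto=thin -fuse-ld=gold -Wl,-plugin-opt,jobs=8 a.o b.o -o prog

The per-file steps emit ThinLTO bitcode, and the link step runs the cross-module optimization backends on up to 8 threads.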
Go 1.9 compiler claims to have:
Parallel Compilation
The Go compiler now supports compiling a package's functions in parallel, taking advantage of multiple cores. This is in addition to the go command's existing support for parallel compilation of separate packages.
but of course, it compiles Go, not C++
I can't name any C++ compiler doing likewise, even in October 2017. But I guess the multi-threaded Go compiler shows that multi-threaded C or C++ compilers are (in principle) possible. There are few of them, though; making new ones is a huge amount of work, and you would practically need to start such an effort from scratch.
For Visual C++, I am not aware of it doing any parallel compilation (I don't think it does). For versions later than Visual Studio 2005 (i.e. Visual C++ 8), the projects within a solution are built in parallel, as far as the solution dependency graph allows.
I have been learning Chapel with small programs, and they are working great. But as a program becomes longer, the compilation time also becomes longer. So I looked for a way to compile multiple files one by one, but without success so far. Searching the internet, I found this page and this page, and the latter says
All of these incremental compilation features are enabled with the new --incremental flag in the Chapel compiler, which will be made available in Chapel 1.14.0 release.
Although the Chapel compiler on my computer accepts this option, it does not seem to generate any *.o (or *.a?) when compiling a file containing only a procedure (i.e. no main()). Is this because the above project is experimental...? In that case, can we expect this feature to be included in some future version of Chapel?
(Or does "incremental compilation" above not mean what I'm expecting from usual compilers like GCC?)
My environment: Chapel-1.14.0 installed via homebrew on Mac OSX 10.11.6.
The Chapel implementation only fully compiles code that is used through the execution of the main() routine. As a starting foray, the incremental compilation project tried to minimize the executable difference between code compiled through normal compilation and code compiled with the --incremental flag. This was to ensure that the user would not encounter a different set of errors when developing in one mode than they would in the other. As a consequence, a file containing only a procedure would not be compiled until a compilation attempt in which that file/procedure was used.
The project you reference was an excellent first start but exposed many considerations to the team which we had not previously considered (including the one you have raised). We're still discussing the future direction of this feature, so it isn't entirely clear what that would entail. One possible expansion is "separate compilation", where code could be compiled into a .o or .a which could be linked to other programs. Again, though, this is still under discussion.
If you have thoughts on how this feature should develop, we would love to hear them via an issue on our Github page, or via our developers or users mailing lists.
I have a shared DLL that was last compiled in 1997 using Visual Studio 6. We're now using this application and shared DLL on MS Server 2008 and it seems less stable.
I'm assuming if I recompiled using VS 2005 or newer, it would also include improvements in the included Microsoft libraries, right? Is this common to have to recompile for MS bug fixes?
Is there a general practice when it comes to using old compiled code in newer environments?
I can't really speak from a MS/VS-specific vantage point, but my experiences with other compilers have been the following:
The ABI (i.e. calling conventions or layout of class information) may change between compilers or even compiler versions. So you may get weird crashes if you compile the app and the library with different compiler versions. (That's why there are things like COM or NSObject -- they define a stable way for different modules to talk to each other)
Some OSes change their behaviour depending on the compiler version that generated a binary, or the system libraries it was linked against. That way they can fix bugs without breaking workarounds. If you use a newer compiler or build again with the newer libraries, it is assumed that you test again, so they expect you to notice that your workaround is no longer needed and remove it. (This usually applies to the entire application, though, so an older library in a newer app generally gets the new behavior, and its workarounds may already have broken.)
The new compiler may be better. It may have a better optimizer and generate faster code, it may have bugs fixed, it may support new CPUs.
A new compiler/new libraries may have newer versions of templates and other stub, glue and library code that gets compiled into your application (e.g. C++ template classes). This may be a variant of #3, or of #1 above. E.g. if you have an older implementation of a std::vector that you pass to the newer app, it might crash trying to use parts of it that have changed. Or it might just be less efficient or have fewer features.
So in general it's a good idea to use a new compiler, but it also means you should be careful and test it thoroughly to make sure you don't miss any changes.
You tagged this with "C++" and "MFC". If you actually have a C++ interface, you really should compile the DLL with the same compiler that you build the clients with. MS doesn't keep the C++ ABI completely stable across compiler versions (especially the standard library), so incompatibilities could lead to subtle errors.
In addition, newer compilers are generally better at optimizing.
If the old DLL seems more stable, in my experience that is only because bugs are obscured better with VC6. (Are extensive runtime checks enabled in the new version?!)
The main benefit is that you can debug the DLL seamlessly while interacting with the main application. There are other improvements you won't want to miss, e.g. CTime being able to hold dates past the year 2037.
I want to compile a C++ program to an intermediate code. Then, I want to compile the intermediate code for the current processor with all of its resources.
The first step is to compile the C++ program with optimizations (-O2), run the linker and do most of the compilation procedure. This step must be independent of operating system and architecture.
The second step is to compile the result of the first step, without the original source code, for the operating system and processor of the current computer, with optimizations and special instructions of the processor (-march=native). The second step should be fast and with minimal software requirements.
Can I do it? How to do it?
Edit:
I want to do it because I want to distribute a platform-independent program that can use all resources of the processor, without distributing the original source code, instead of distributing a compilation for each platform and operating system. It would be good if the second step were fast and easy.
Processors of the same architecture may have different features. x86 processors may have SSE1, SSE2 or other extensions, and they can be 32- or 64-bit. If I compile for a generic x86, the program will lack SSE optimizations. Years from now, processors will have new features, and the program will need to be compiled for those new processors.
Just a suggestion: google Clang and LLVM.
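That suggestion maps fairly directly onto the two-step model in the question: compile once to LLVM bitcode, then lower the bitcode on the user's machine. A minimal sketch, with file names chosen only for illustration, and with the caveat that bitcode produced this way is not fully platform-independent (the ABI, word size and preprocessor decisions are baked in during the first step):

clang++ -O2 -emit-llvm -c prog.cpp -o prog.bc
clang++ -O2 -march=native prog.bc -o prog

The first command would be run by the distributor; the second on the target machine, where -march=native lets the code generator use whatever instruction set extensions the local CPU has.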
How much do you know about compilers? You seem to treat "-O2" as some magical flag.
For instance, register assignment is a typical optimization. You definitely need to know how many registers are available. There is no point in assigning foo to register 16, only to discover in phase 2 that you're targeting an x86.
And those architecture-dependent optimizations can be quite complex. Inlining depends critically on call cost, and that in turn depends on architecture.
Once you get to "processor-specific" optimizations, things get really tricky. It's really tough for a platform-specific compiler to be truly "generic" in its generation of object or "intermediate" code at an appropriate "level": Unless it's something like "IL" (intermediate language) code (like the C#-IL code, or Java bytecode), it's really tough for a given compiler to know "where to stop" (since optimizations occur all over the place at different levels of the compilation when target platform knowledge exists).
Another thought: What about compiling to "preprocessed" source code, typically with a "*.i" extension, and then compile in a distributed manner on different architectures?
For example, most (if not all) C and C++ compilers support something like:
cc /P MyFile.cpp
gcc -E MyFile.cpp
...each generates MyFile.i, which is the preprocessed file. Now that the file has included ALL the headers and other #defines, you can compile that *.i file to the target object file (or executable) after distributing it to other systems. (You might need to get clever if your preprocessor macros are specific to the target platform, but it should be quite straight-forward with your build system, which should generate the command line to do this pre-processing.)
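For instance, assuming the C++ case (where g++ expects the .ii extension for preprocessed input; plain C uses .i as above), the second machine needs nothing but the compiler:

g++ -E MyFile.cpp -o MyFile.ii
g++ -O2 -c MyFile.ii -o MyFile.o

The first command runs on the machine that has all the headers; the second can run anywhere, since every #include has already been expanded into the .ii file.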
This is the approach used by distcc to preprocess the file locally, so remote "build farms" need not have any headers or other packages installed. (You are guaranteed to get the same build product, no matter how the machines in the build farm are configured.)
Thus, it would similarly have the effect of centralizing the "configuration/pre-processing" for a single machine, but provide cross-compiling, platform-specific compiling, or build-farm support in a distributed manner.
FYI -- I really like the distcc concept, but the last update for that particular project was in 2008. So, I'd be interested in other similar tools/products if you find them. (In the mean time, I'm writing a similar tool.)
Since I am compiling my C++ code on a very large server box (32 or 64 cores in total), is there a way of tweaking compiler options to speed up the compilation times? E.g. to tell the compiler to compile independent .cpp files using multiple threads.
Sun Studio includes parallel build support in the included dmake version of make.
See the dmake manual for details.
This depends on what toolchain you're using.
If you're using GNU Make, then add -j 32 to your make invocation to tell Make to start 32 jobs (for example) in parallel. Just make sure that you're not exhausting RAM and thrashing your swap file as a result.
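For example (sizing the job count to the machine, e.g. via nproc on Linux, is a common choice):

make -j32
make -j"$(nproc)"

Either form works; the second just avoids hard-coding the core count.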
Use something like Boost JAM, which does this sort of multithreading for you - and in my experience much more efficiently than multi-threaded make.
Sun's C++ compiler also has an -xjobs option that makes the compiler fork multiple threads internally. For this to be efficient you would probably have to pass all .cc files to a single invocation of CC.
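A sketch of what that might look like (the -xjobs spelling is taken from the Sun Studio documentation; verify it against your compiler version):

CC -O -xjobs=16 a.cc b.cc c.cc -o app

Passing all the source files in one invocation, as noted above, gives the compiler something to spread across those jobs.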
I have built an open-source application from the source code. Unfortunately, the original executable runs significantly faster. I tried to enable few compiler optimizations, but the result wasn't satisfactory enough. What else do I need to make in Visual Studio 2008 to increase the executable performance?
Thanks!
Basically, try enabling everything under Optimisation in the project settings, then ensure Link-Time Code Generation is on, enable function-level linking and full COMDAT folding (the latter only reduces the size of the EXE but could help with caching), and turn off security features, e.g. by defining _SECURE_SCL=0. Remember that some of these settings have other implications, especially the security ones.
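On the compiler and linker command lines those settings roughly correspond to something like this (switch names per the MSVC documentation; the project-settings UI is the usual place to change them):

cl /O2 /GL /Gy /D_SECURE_SCL=0 /c app.cpp
link /LTCG /OPT:REF /OPT:ICF app.obj

/O2 and /GL enable optimization and whole-program code generation, /Gy is function-level linking, /LTCG and /OPT:ICF are the link-time code generation and COMDAT-folding parts, and the /D define turns off the checked iterators.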
Define _SECURE_SCL=0.
http://msdn.microsoft.com/en-us/library/aa985896(VS.80).aspx
Try to enable SSE instructions when compiling. Also, you can try to compile using a different compiler (GNU GCC).
+ Some debug defines might be enabled, which can also reduce speed.
+ Check that the original .exe has the same version as the one you are trying to compile.
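For instance, flags along these lines (names per the respective compiler documentation; whether they help depends entirely on the code) enable SSE code generation:

cl /O2 /arch:SSE2 /c app.cpp
g++ -O2 -msse2 -c app.cpp

The first is MSVC (the /arch switch matters mainly for 32-bit targets), the second GCC/MinGW.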
The open-source precompiled binary is most likely (without knowing which project you are working with) compiled with GNU GCC (MinGW on Windows). That might be the reason it is faster. According to the question performance g++ vs. VC++, some things are considerably slower if you use VC++.