Building LLVM eats up all the RAM - llvm

I have been trying to build and install LLVM on my system (i7, 16 GB RAM), following this tutorial: LLVM Install. During the build it eats up all the RAM and the terminal closes automatically. Is there any way to solve this?
Thanks.

The resources consumed during the build depend on several factors:
The number of projects/targets you are building. In general you can skip a number of subprojects (compiler-rt, libcxx, etc.).
The type of binaries generated, i.e. shared vs. static. Enabling shared libraries (-DBUILD_SHARED_LIBS=ON) consumes far less memory during linking.
The build type. Debug, Release, and RelWithDebInfo also have an effect: a Debug build produces larger binaries, so it may consume more memory during the link step, although it compiles faster because fewer optimizations are run; a Release build usually consumes less RAM when linking.
The number of parallel jobs (-jN).
TLDR for reducing RAM pressure:
Enable shared libraries
Use Release builds
Keep the number of parallel jobs low (instead of the maximum -jN, try -j(N-2)). Using -j1 uses the least RAM but takes a long time to build.
Skip building as many libraries (e.g., LLVM_ENABLE_RUNTIMES) and targets (e.g., LLVM_TARGETS_TO_BUILD) as you can. This may not be trivial as it requires spending time with the CMakeCache.txt file.
Build only what you need, e.g., instead of invoking plain ninja, invoke ninja clang or ninja opt. (An example configuration along these lines is sketched below.)
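For concreteness, a minimal low-RAM configuration might look like the following sketch. It assumes a fresh build directory next to an llvm-project checkout; the target, project list, and job counts are illustrative and should be adapted to your machine:
$ # shared libs, Release, one backend, only clang, fewer parallel link jobs
$ # (LLVM_PARALLEL_LINK_JOBS takes effect with the Ninja generator)
$ cmake -G Ninja ../llvm \
    -DCMAKE_BUILD_TYPE=Release \
    -DBUILD_SHARED_LIBS=ON \
    -DLLVM_TARGETS_TO_BUILD=X86 \
    -DLLVM_ENABLE_PROJECTS=clang \
    -DLLVM_PARALLEL_LINK_JOBS=2
$ # build only the tool you need, with a reduced overall job count
$ ninja -j4 clang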

Related

Build size for LLVM 6.0.0 is huge (42G)

I built llvm-6.0.0 from source and everything works fine. I'm just wondering how come its size is so huge (42G). Can I easily erase some object files or other artifacts to make the build directory smaller?
$ du -hs ~/GIT/llvm-6.0.0/build/
42G /home/oren/GIT/llvm-6.0.0/build/
You're building without shared libraries, which means that a number of very large libraries are linked statically into a large number of (otherwise small) tools. I'm guessing that you may also be building for all targets (32-bit ARM, 64-bit ARM, 32-bit X86, 64-bit X86, and a couple of dozen more).
If you run cmake -DLLVM_TARGETS_TO_BUILD=HOST -DCMAKE_BUILD_TYPE=RelWithDebInfo -DBUILD_SHARED_LIBS=on ., you should reduce the space usage to about 10G. (At least I have a 10G build tree produced from a similar command line. I also have bigger trees, because those settings aren't the best match for all purposes.)
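If you are not sure how an existing build tree was configured, you can inspect the cache before reconfiguring (these are the standard CMake/LLVM cache variable names):
$ grep -E 'LLVM_TARGETS_TO_BUILD|BUILD_SHARED_LIBS|CMAKE_BUILD_TYPE' CMakeCache.txt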
To anybody who is using vcpkg to build LLVM:
You can add set(VCPKG_BUILD_TYPE release) to the triplet that you use, for example the x64-linux triplet. This keeps the build tree under 10GB instead of over 20GB (it was around 5GB when I tried it), and you can delete the build tree and reclaim the space once installation is done.
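As a sketch, a custom overlay triplet along those lines might look like the following; the file and directory names are made up, and the VCPKG_* values mirror the stock x64-linux triplet with the release-only setting added:
$ cat my-triplets/x64-linux-release.cmake
set(VCPKG_TARGET_ARCHITECTURE x64)
set(VCPKG_CRT_LINKAGE dynamic)
set(VCPKG_LIBRARY_LINKAGE static)
set(VCPKG_CMAKE_SYSTEM_NAME Linux)
set(VCPKG_BUILD_TYPE release)
$ vcpkg install llvm --overlay-triplets=my-triplets --triplet x64-linux-release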

Building a C++ program that runs on different Linux versions

Some Linux programs, for example the mongodb binaries, can run on different Linux versions regardless of the host machine's gcc and glibc versions.
How is that done? By statically linking all the libraries? But I have heard that glibc is not supposed to be statically linked.
To make an executable that is independent of the installed libraries, you must statically link it.
However, if the application isn't very large/complex to build, it's often better to either distribute the source and build on/for the target system, or pre-build for the most popular variants.
The reason that you don't want to statically link glibc (and all other libs that the application may use) is that even the most simple application becomes about 700K-1MB. Given that my distribution has 1900 entries in /usr/bin, that would make it around 2GB minimum, where now it is 400MB (and that includes beasts like clang, emacs and skype, all weighing in at over 7MB in non-statically linked form - they probably have more than a dozen library dependencies each - clang, for example, grows from under 10MB to around 100-120MB if you compile it with static linking).
And of course, with static linkage, all the code for each application needs to be loaded into memory as a separate copy. So the overall memory usage goes up quite dramatically.
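A minimal sketch of the trade-off, assuming a trivial main.cpp and a GNU toolchain:
$ # default dynamic linking: small binary that depends on the system's glibc/libstdc++
$ g++ -O2 -o hello main.cpp
$ ldd hello            # lists libstdc++, libm, libgcc_s, libc, ...
$ # static linking: no runtime library dependencies, but a much larger binary
$ g++ -O2 -static -o hello-static main.cpp
$ ldd hello-static     # prints "not a dynamic executable"
$ ls -lh hello hello-static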

Why isn't ccache used with gcc more often?

I've been wondering... Are there some limitations with ccache? If the difference in later compile times is so large, why don't more Linux developers use ccache more often?
I guess the simple answer is that ccache is great when the build system is broken (i.e. the dependencies are not correctly tracked, and to get everything built correctly you might need make clean; make). On the other hand, if dependencies are correctly tracked, then ccache will not yield any advantage over plain make, and will actually incur the cost of maintaining and updating the cache (which might be huge, depending on the size of the project).
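That said, for anyone who wants to try it, ccache is simply dropped in front of the compiler; a typical setup looks roughly like this (the project paths are illustrative):
$ # let CMake invoke the compilers through ccache
$ cmake -DCMAKE_C_COMPILER_LAUNCHER=ccache -DCMAKE_CXX_COMPILER_LAUNCHER=ccache ..
$ # or, for plain make-based builds, override the compiler variables
$ make CC="ccache gcc" CXX="ccache g++"
$ # check hit/miss statistics and the cache size afterwards
$ ccache -s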

C++ application - should I use static or dynamic linking for the libraries?

I am going to start a new C++ project that will rely on a series of libraries, including part of the Boost libraries, the log4cxx or the google logging library - and as the project evolves other ones as well (which I can not yet anticipate).
It will have to run on both 32 and 64 bit systems, most probably in a quite diverse Linux environment where I do not expect to have all the required libraries available nor su privileges.
My question is, should I build my application by dynamically or statically linking to all these libraries?
Notes:
(1) I am aware that static linking might be a pain during development (longer compile times, cross-compiling for both 32 and 64 bit, going down dependency chains to include all libraries, etc.), but it's a lot easier during testing - just move the file and run.
(2) On the other hand, dynamic linking seems easier during the development phase - short compile times (though I don't really know how to handle dynamic linking to 64-bit libraries from my 32-bit dev environment), no hassle with dependency chains. Deployment of new versions, on the other hand, can be ugly - especially when new libraries are required (see the conditions above of not having su rights on the targeted machines, nor these libraries available).
(3) I've read the related questions regarding this topic but couldn't really figure out which approach would best fit my scenario.
Conclusions:
Thank you all for your input!
I will probably go with static linking because:
Easier deployment
Predictable performance and more consistent results during perf. testing (look at this paper: http://www.inf.usi.ch/faculty/hauswirth/publications/CU-CS-1042-08.pdf)
As pointed out, the size and duration of compilation of static vs. dynamic does not seem to be such a huge difference
Easier and faster test cycles
I can keep all the dev. cycle on my dev. machine
Static linking has a bad rap. We have huge hard drives these days, and extraordinarily fat pipes. Many of the old arguments in favor of dynamic linking are way less important now.
Plus, there is one really good reason to prefer static linking on Linux: The plethora of platform configurations out there make it almost impossible to guarantee your executable will work across even a small fraction of them without static linking.
I suspect this will not be a popular opinion. Fine. But I have 11 years of experience deploying applications on Linux, and until something like LSB really takes off and really extends its reach, Linux will continue to be much more difficult to deploy applications on. Until then, statically link your application if you have to run across a wide range of platforms.
I would probably use dynamic linking during (most of) development, and then change over to static linking for the final phases of development and (all of) deployment. Fortunately, there's little need for extra testing when switching from dynamic to static linkage of the libraries.
This is another vote for static linking. I haven't noticed significantly longer linking times for our application. The app in question is a ~50K line console app with multiple libraries, compiled for a bunch of out-of-the-ordinary machines, mostly supercomputers with 100-10,000 cores. With static linking, you know exactly which libraries you are going to be using and can easily test out new versions of them.
In general, this is the way that most Mac apps are built. It is what allows installation to be simply copying a directory onto the system.
Best is to leave that up to the packager and provide both options in the configure/make scripts. Usually dynamic linking is preferred, since it makes it easy to upgrade the libraries when necessary, e.g. when security vulnerabilities are discovered.
Note that if you do not have root privileges to install the libraries in the system directories, you can compile the program so that it will first look elsewhere for any needed dynamic libraries; this is accomplished by setting the runpath in the ELF binary. You can specify such a directory with the -rpath option of the linker ld.
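As a sketch, assuming your libraries live in a directory you control ($HOME/mylibs and libfoo are placeholders):
$ # link against libraries outside the system directories and record that
$ # directory as the runpath, so the dynamic loader finds them at run time
$ g++ -o myapp main.cpp -L$HOME/mylibs -lfoo -Wl,-rpath,$HOME/mylibs
$ # verify what was recorded in the ELF dynamic section
$ readelf -d myapp | grep -E 'RPATH|RUNPATH'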

Handling binary dependencies across platforms

I've got a C++ project where we have loads and loads of dependencies. The project should work on Linux and Windows, so we've ported it to CMake. Most dependencies are now included right into the source tree and build alongside the project, so there are no problems with those.
However, we have one binary which depends on Fortran code etc. and is really complicated to build. For Linux, it's not available as a package either, only as precompiled binaries or as full source (which needs a BLAS library installed plus several other dependencies). For Windows, the same library is available as a binary; building it for Windows seems even more complicated.
The question is, how do you handle such dependencies? Just check in the binaries for the supported platforms and require the user to set up their build environment otherwise (that is, manually point to the binary location)? Really try to get them compiled alongside the project (even if that requires installing something like 10 libraries -- the BLAS libraries are the biggest pain here)? Or is there some other recommended way to handle this?
If the binary is independent of the rest of your build process, you should definitely check it in. But since you cannot include every version of the binary (i.e. for every platform and set of compile flags a user might use), building from source seems mandatory.
I have done something similar. I checked in the source archives of the libraries/binaries I needed, then wrote makefiles/scripts that build them for the targeted platform/flags into a specific location (not a standard OS location), and made my main build process point to that location (there is a rough sketch of this below). I did this so I could control the exact versions and options of the libraries/binaries I depend on. It is quite a lot of work to make everything work across different platforms, but it's worth the time!
Oh, and of course it's easier if you use crossplatform build tools :)
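A rough sketch of that kind of setup, with made-up archive names and paths (the real thing would be wrapped in a makefile or script per platform):
$ # unpack a checked-in source archive of the dependency and build it into a
$ # per-platform prefix inside the project tree
$ mkdir -p build/deps && tar xzf third_party/somelib-1.2.tar.gz -C build/deps
$ cd build/deps/somelib-1.2
$ ./configure --prefix=$HOME/project/third_party/prebuilt/linux-x86_64
$ make -j4 && make install
$ # the main CMake build is then pointed at that prefix, e.g. with
$ # -DCMAKE_PREFIX_PATH=third_party/prebuilt/linux-x86_64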
One question for you: do the users need to modify this binary, or are they just happy it's there so they can use/access it? If they don't need to modify it, check in the binaries.
I would agree, check in the binaries for each platform if they are not going to be modified very often. Not only will this reduce build times, but it will also reduce frustration from unnecessary compilations.