Linking taking too long with /bigobj - c++

I am using Visual Studio 2012 to compile a program in debug mode. StylesDatabase.cpp and LanguagesDatabase.cpp used to compile fine without /bigobj ... until I removed some functions and moved some functions from protected to public.
Both C++ files are fairly small but use templated container classes like Boost.MultiIndex containers, Boost.Unordered maps and Wt::Dbo::ptrs. Wt::Dbo::ptr is a pointer to a database object and Wt::Dbo is an ORM library.
After this change, the compiler fails and asks me to set /bigobj. After I set /bigobj the compilation works fine, but the linker takes more than 30 minutes.
So my question is:
How come a fairly small file can exceed the limit of Visual C++? What exactly causes the limit to be exceeded?
How can I prevent the limit from being exceeded without splitting the cpp files?
Why is the linker taking so much time?
I can provide the source if it's necessary.

Your files are not the only ones that the linker has to handle - it also has to deal with library files, and in your case these are the Boost template libraries that require the /bigobj flag. Take a look at this Microsoft page: http://msdn.microsoft.com/en-US/library/ms173499.aspx. Even if your files are small, heavily templated libraries may require you to use /bigobj anyway.
You can think of it this way: somebody had to produce a lot of code so that you can write much less code in your program, but that code produced by someone else is still there and has to be dealt with at some point as well.
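As a rough sketch of the kind of code involved (hypothetical types and indices, not the asker's actual files), a single Boost.MultiIndex container can instantiate enough templates to exceed the default limit of roughly 65,000 addressable sections per .obj, which /bigobj raises to about 2^32:

// Built with something like: cl /c /EHsc heavy.cpp  (plus a Boost include path);
// add /bigobj once C1128 appears.
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/member.hpp>
#include <string>

struct Style {
    int id;
    std::string name;
};

// Every index, comparator and iterator of this container is a separate set of
// template instantiations, and each instantiated function typically lands in
// its own COMDAT section of the object file.
using StyleTable = boost::multi_index_container<
    Style,
    boost::multi_index::indexed_by<
        boost::multi_index::ordered_unique<
            boost::multi_index::member<Style, int, &Style::id>>,
        boost::multi_index::ordered_non_unique<
            boost::multi_index::member<Style, std::string, &Style::name>>>>;

StyleTable g_styles;  // defining an object instantiates the container's constructors/destructors in this .obj

Each additional index multiplies the number of instantiations, so a handful of such containers in one .cpp can push the section count past the default limit even though the source file itself is short.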

Related

the meaning of visual studio /Z7 [duplicate]

Background
There are several different debug flags you can use with the Visual Studio C++ compiler. They are:
(none)
Create no debugging information
Faster compilation times
/Z7
Produce full-symbolic debugging information in the .obj files using CodeView format
/Zi
Produce full-symbolic debugging information in a .pdb file for the target using Program Database format.
Enables support for minimal rebuilds (/Gm) which can reduce the time needed for recompilation.
/ZI
Produce debugging information like /Zi except with support for Edit-and-Continue
Issues
The /Gm flag is incompatible with the /MP flag for Multiple Process builds (Visual Studio 2005/2008)
If you want to enable minimal rebuilds, then the /Zi flag is necessary over the /Z7 flag.
If you are going to use the /MP flag, then judging from MSDN there is seemingly no difference between /Z7 and /Zi. However, the SCons documentation states that you must use /Z7 to support parallel builds.
Questions
What are the implications of using /Zi vs /Z7 in a Visual Studio C++ project?
Are there other pros or cons for either of these options that I have missed?
Specifically, what is the benefit of a single Program Database format (PDB) file for the target vs multiple CodeView format (.obj) files for each source?
References
MSDN /Z7, /Zi, /ZI (Debug Information Format)
MSDN /MP (Build with Multiple Processes)
SCons Construction Variables $CCPDBFLAGS
Debug Info
Codeview is a much older debugging format that was introduced with Microsoft's old standalone debugger back in the "Microsoft C Compiler" days of the mid-1980s. It takes up more space on disk, it takes longer for the debugger to parse, and it's a major pain to process during linking. We generated it from our compiler back when I was working on CodeWarrior for Windows in 1998-2000.
The one advantage is that Codeview is a documented format, and other tools can often process it when they couldn't deal with PDB-format debug databases. Also, if you're building multiple files at a time, there's no contention to write into the debug database for the project. However, for most uses these days, using the PDB format is a big win, both in build time and especially in debugger startup time.
One advantage of the old C7 format is that it's all-in-one, stored in the EXE, instead of a separate PDB and EXE. This means you can never have a mismatch. The VS dev tools will make sure that a PDB matches its EXE before it will use it, but it's definitely simpler to have a single EXE with everything you need.
This adds new problems: you need to be able to strip the debug info when you release, and you end up with a giant EXE file, not to mention the ancient format and the lack of support for other modern features like minimal rebuild. Still, it can be helpful when you're trying to keep things as simple as possible. One file is easier than two.
Not that I ever use the C7 format; I'm just putting this out there as a possible advantage, since you're asking.
Incidentally, this is how GCC does things on a couple of platforms I'm using: DWARF2 format buried in the output ELFs. Unix people think they're so hilarious. :)
BTW the PDB format can be parsed using the DIA SDK.
/Z7 keeps the debug info in the .obj files in CodeView format and lets the linker extract it into a .pdb, while /Zi consolidates it into a common .pdb file already during compilation by syncing with mspdbsrv.exe.
So /Z7 means more file I/O, more disk space used and more work for the linker (unless /DEBUG:FASTLINK is used), because there is a lot of duplicate debug info in those .obj files. But it also means every compilation is independent, so with enough parallelization it can actually still be faster than /Zi.
By now the /Zi situation has been improved by reducing the inter-process communication with mspdbsrv.exe: https://learn.microsoft.com/en-us/cpp/build/reference/zf
Another use case for /Z7 is "standalone" (though larger) static libraries that don't require shipping a separate .pdb, if you want that. It also avoids the annoying issues arising from the awful default vcxxx.pdb name cl uses as long as you don't override it with a proper /Fd (https://learn.microsoft.com/en-us/cpp/build/reference/fd-program-database-file-name), which most people forget to do.
/ZI is like /Zi but adds additional data etc. to make the Edit and Continue feature work.
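As a minimal sketch of the difference on the command line (hypothetical file names):

rem /Z7: CodeView debug info is embedded in foo.obj; the linker later merges it into the final .pdb
cl /c /Z7 foo.cpp
rem /Zi: debug info goes to a compiler PDB (vcXXX.pdb unless you name it with /Fd) via mspdbsrv.exe
cl /c /Zi /Fdfoo_compiler.pdb foo.cpp
rem in both cases, linking with /DEBUG produces foo.exe plus the final foo.pdb
link /DEBUG foo.obj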
There is one more disadvantage for /Z7:
It's not compatible with incremental linking, which may alone be a reason to avoid it.
Link: http://msdn.microsoft.com/en-us/library/4khtbfyf%28v=vs.100%29.aspx
By the way: even though Microsoft says a full link (instead of an incremental one) is performed when "An object that was compiled with the /Yu /Z7 option is changed.", it seems this is only true for static libraries built with /Z7, not for object files.
Another disadvantage of /Z7 is the large size of the object files. This has already been mentioned here, but it may escalate to the point where the linker is unable to link the executable because it breaks the size limit of the linker or of the PE format (it gives you linker error LNK1248). Visual Studio or the PE format seems to have a hard limit of 2 GB (also on x64 machines). When building a debug version you may run into this limit. Apparently this affects not only the size of the final executable, but also temporary data. Only Microsoft knows the linker internals, but we ran into this problem here (though the executable was of course nowhere near 2 GB, even in debug). The problem miraculously went away, and never came back, when we switched the project to /ZI.

Penalty of the MSVS compiler flag /bigobj

A basic Google search for the bigobj issue shows that a lot of people are experiencing the fatal error C1128: "number of sections exceeded object file format limit : compile with /bigobj". The error is more likely to occur if one makes heavy use of C++ template libraries, like the Boost libraries or the CGAL libraries.
That error is strange, because it states its own solution: set the compiler flag /bigobj!
So here is my question: why is that flag not set by default? There must be a penalty to using that flag, otherwise it would be set by default. That penalty is not documented on MSDN. Does anybody have a clue?
I ask the question because I wonder whether the configuration system of CGAL should set /bigobj by default.
The documentation does mention an important drawback to /bigobj:
Linkers that shipped prior to Visual C++ 2005 cannot read .obj files
that were produced with /bigobj.
So, setting this option by default would restrict the set of linkers that can consume the resulting object files. Better to activate it on an as-needed basis.
why is not that flag set by default? There must be a penalty of using that flag, otherwise it would be set by default.
My quick informal experiment shows .obj files to be about 2% larger with /bigobj than without. So it's a small penalty but it's not zero.
Someone submitted a feature request to make /bigobj the default; see https://developercommunity.visualstudio.com/t/Enable-bigobj-by-default/1031214.
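If you want to see how close a particular object file is to the limit before enabling the flag project-wide, one rough check (hypothetical file name) is to read its COFF header with dumpbin; without /bigobj an object can hold roughly 65,000 sections, with /bigobj about 2^32:

rem prints the section count from the FILE HEADER VALUES block
dumpbin /headers heavy.obj | findstr /c:"number of sections"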

Low performance of Incremental linking in Visual Studio C++

I have a large binary which is built from many static libs and standalone cpp files. It is configured to use incremental linking, and all optimizations are disabled by /Od - it is a debug build.
I noticed that if I change any standalone cpp file then incremental linking runs fast - about 1 minute. But if I change any cpp file in any static lib then it runs for a long time - about 10 minutes, the same as an ordinary link. In this case I gain no benefit from incremental linking. Is it possible to speed it up? I use VS2005.
Set "Use Library Dependency Inputs" in the Linker General property page for your project. That will link the individual .obj files from the dependency .lib instead of the .lib, which may have some different side effects.
I'm going to give you a different type of answer: hardware.
What is your development environment? Is there any way to get more RAM or to put your project onto a solid-state drive? I found that using an SSD sped up my link times by an order of magnitude on my work projects. It helped a little with compile times, but the effect on linking was huge. Getting a faster system of course also helped.
If I understand correctly (after using Visual Studio for some years), the incremental linking feature does not work for object files that are part of static libraries.
One way to solve this is to restructure your solution so that your application project contains all the source files.

g++ produces big binaries despite small project

Probably this is a common question. In fact I think I asked it years ago... but I can't remember the answer.
The problem is: I have a project composed of 6 source files, none of them more than 200 lines of code. It uses many STL containers, stdlib.h and iostream. Now the executable is around 800 KB in size... I guess I shouldn't statically link libraries. How do I do this with GCC? And in Eclipse CDT?
EDIT:
Since the responses are moving away from what I want, I think a clarification is in order. What I want to know is why such a small program is so big, and what the relationship is to static and shared libraries and the difference between them. If it's too long a story to tell, feel free to give pointers to docs. Thank you.
If you give g++ dynamic library names, and don't pass the -static flag, it should link dynamically.
To reduce size, you could of course strip the binary, and pass the -Os (optimize for size) optimization flag to g++.
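A minimal sketch of the relevant invocations (hypothetical file names):

# dynamic linking is the default as long as you don't pass -static
g++ main.cpp -o app
# optimize for size and strip symbols from the result
g++ -Os -s main.cpp -o app
# for comparison, -static copies the library code into the executable, which is what makes it large
g++ -static main.cpp -o app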
One thing to remember is that using the STL results in having that extra code in your executable even if you are dynamically linking with the C++ library. This is by virtue of the fact that the STL is a bunch of templates that aren't actually compiled until you write and compile your code. Since the library can't anticipate what you might store in a container, there's no way for the library to already contain the code for that particular usage of the container. Same goes with algorithms and everything else in the STL.
I'm not saying this is definitely the reason your executable is so much larger than you expect. But it may be a factor.
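A small self-contained sketch of why this happens: each distinct instantiation of a template container produces its own copy of the generated code in your binary, even though the source looks tiny.

#include <string>
#include <vector>

int main() {
    std::vector<int> a{1, 2, 3};            // generates code for vector<int>
    std::vector<std::string> b{"x", "y"};   // generates a second, separate copy for vector<string>
    return static_cast<int>(a.size() + b.size());
}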
Use -O3 and -s flags to produce the most optimized binary. Also see this link for some more information.
If you are building for Windows, consider using the Microsoft compiler. It always produces the smallest binary on that platform.
Eclipse should be linking dynamically by default, unless you've set the static flag on the linker in your makefile.
In response to your EDIT:
- when you link statically, the executable contains a full copy of each library you've linked to.
- when you link dynamically, the executable only contains references and hooks to the linked libraries, which is a much, much smaller amount of code.
The executable has to contain more than just your code.
At the very least, it contains some startup code, setting up the environment and if necessary, loading any external libraries, before the program launches.
If you've statically linked the runtime library, you also get that included in your executable. Otherwise you only get a small stub, just big enough to redirect system calls to the external runtime.
It may, depending on compiler settings, also include a lot of debugging info and other non-essential data. If optimizations are enabled, that may increase code size as well.
The real question is: why does this matter? 800 KB still fits easily on a floppy disk!
Most of this is a one-time cost. It doesn't mean that if you write twice as much code, it'll take up 1600 KB. More likely, it'll take 810 KB or so.
Don't worry about one-time startup costs.
The size usually results from static libraries being linked into your application.
You can reduce the size of the compiled binary by compiling a RELEASE version, with optimizations for binary size.
Another source of executable size is the libraries. You said that you don't use external libraries except for the standard library, so I believe you're including the C runtime in your executable, i.e. linking statically, so check for dynamic linking.
IMO you shouldn't really worry about that, but if you're really paranoid, check this: Smallest x86 ELF Hello World
Use Visual C++ 6.0.
It is supported from Windows 95 through Windows 7,
and it can compile for x86 platforms, but only for Windows.
So if you are a Windows user, just stick with the Windows compilers instead of GCC, which is actually poor. Most of the people who say Visual C++ is bad say so because they are anti-Microsoft.
Also remember to use "Visual C++ 6.0"; if you use a newer one, you probably can't run your files on Windows 95. I have tested all of these things, which is why I say this.
GCC produces the largest binaries, but Visual C++ does not. The Intel compiler can be used to save more than 30% of the space, but it demands an Intel processor, otherwise performance will be horrible.
Another thing you need to remember is that when you use templates, even though you only see a few lines of source,
those functions are expanded when you compile, so the result is a larger binary.
If you need smaller binaries I suggest moving to C, because C is actually widely used but not OO.
In fact, C is easier to use than C++;
this makes more sense than the C++ example:
cout << "Hello World" << endl;
printf("%s","Hello World");
The second one says: print the field %s, which means you type a string, so it's easy. :P

why are my visual studio .obj files massive in size compared to the output .exe?

As background, I am a developer of an open-source project, a C++ library called openframeworks, which is a wrapper for different libraries, like opengl, quicktime, freeImage, etc. In the next release, we've added a C++ library called POCO, which is similar to boost in some ways in that it's an alternative for Java-foundation-library-type functionality.
I've just noticed that in this latest release, where I've added the POCO library as a statically linked library, the .obj files that are produced during compilation are really massive - for example, several .obj files for really small .cpp files are 2 MB each. The compiled .obj files overall are about 12 MB or so. On the flip side, the exes that are produced are small - 300 KB to 1 MB.
In comparison, the same library compiled in code::blocks produces .obj files that are roughly the same size as the exe - they are all fairly small.
Is there something happening with linking and the .obj process in Visual Studio that I don't understand? For example, is it doing some kind of smart prelinking, or something else, that's adding to the .obj size? I've experimented a bit with settings, such as incremental linking, etc., and not seen any changes.
thanks in advance for any ideas to try or insights!
-zach
note: thanks very much! I just tried dumpbin, which says "anonymous object" and doesn't return info about the object. This might be the reason why...
note 2: after checking out the above link and removing LTCG (link-time code generation, /GL), the .obj files are much smaller and dumpbin understands them. Thanks again!!
I am not a Visual Studio expert by any stretch of the imagination, having hardly used it, but I believe Visual Studio employs link-time optimizations, which can make the resulting code run faster but can cost a lot of space in the libraries. Also, it may be (I don't know the internals) that debugging information isn't stripped until the actual linking phase.
I'm sure someone will come up with a better, more detailed answer anyway.
Possibly the difference is debug information.
The compiler outputs the debug information into the .obj, but the linker does not put that data into the .exe or .dll. It is either discarded or put into a .pdb.
In any case use the Visual Studio DUMPBIN utility on the .obj files to see what's in them.
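As a rough sketch (hypothetical file name): with /Z7 the CodeView debug data shows up as .debug$S and .debug$T sections in the object, while an object compiled with /GL is an "anonymous object" that dumpbin cannot fully describe.

rem list the COFF sections, including the .debug$S / .debug$T debug sections
dumpbin /headers MyFile.obj
rem list the symbols the linker will see
dumpbin /symbols MyFile.obj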
Object files need to contain sufficient information for linking. In C++, this is name-based: two object files refer to the same object (data/function/class) if they use the same name. This implies that all object files must contain names for all objects that might be referenced by other object files. The executable, however, only needs the names visible from outside the library. In the case of a DLL, this means only the exported names. The saving is twofold: there are fewer names, and those names are present only once in the DLL.
Modern C++ libraries use namespaces. These namespaces mean that object names become longer, as they include the names of the enclosing namespaces too.
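A tiny sketch (hypothetical names) of what that looks like; the decorated name stored in the .obj grows with every level of nesting:

namespace mylib { namespace detail {
    template <typename T>
    T twice(T v) { return v + v; }  // instantiated below as twice<int>
}}

int use() { return mylib::detail::twice(21); }

// The object file stores the full decorated name for twice<int>, something
// roughly like ??$twice@H@detail@mylib@@YAHH@Z with MSVC; deeper namespaces
// and more template parameters make these symbols correspondingly longer.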
The compiled library .obj files will be huge because they must contain all of the functions, classes and templates that your end users might eventually use.
Executables which link to your library will be smaller because they include only the compiled code that they require to run. This will usually be a tiny subset of the library.