I currently use the following preprocessor defines, and various optimization settings:
WIN32_LEAN_AND_MEAN
VC_EXTRALEAN
NOMINMAX
_CRT_SECURE_NO_WARNINGS
_SCL_SECURE_NO_WARNINGS
_SECURE_SCL=0
_HAS_ITERATOR_DEBUGGING=0
My question is what other things do fellow SOers use, add, define, in order to get a Release Mode build from VS C++ (2008,2010) to be as performant as possible?
Btw, I've tried PGO etc.; it helps a bit, but nothing that comes to parity with GCC. Also, I'm not using streams; the C++ I'm talking about is more like C, but making use of templates, STL algorithms, etc.
As it stands now, very simple code segments pale in performance compared to what GCC produces on, say, an equivalent x86 machine running Linux (2.6+ kernel) using -O2.
Side note: I believe a lot of the issues relate directly to the STL version (Dinkumware) provided by MS. Could people please elaborate on experiences using STLport etc. with VS C++?
I don't see how the inclusion of:
_CRT_SECURE_NO_WARNINGS
_SCL_SECURE_NO_WARNINGS
...gives you a better or more performant build. All you are doing is disabling the warnings about the deprecated MS CRT functions. If you are doing this because you know what you are doing and require platform-agnostic code, fine; otherwise I would reconsider.
UPDATE: Furthermore, the compiler can only do so much. I'd wager you would get more performant code if you instrumented and fixed your existing hotspots rather than trying to eke tiny percentage gains (if that) from the compiling and linking phase.
UPDATE 2: _HAS_ITERATOR_DEBUGGING cannot be used when compiling release builds anyway, according to MSDN. WIN32_LEAN_AND_MEAN and VC_EXTRALEAN (and probably NOMINMAX, although performance isn't the chief reason to disable it) might give you some performance boost, although all the rest have dubious value. You should favour correct, fast code over (maybe, and I stress maybe) slightly faster but more risk-prone code.
Related
I've been getting more into C++ programming as of late and keep running into the whole 'debug vs release' compiled versions. Now I feel like I've got a pretty decent understanding of some of the differences between release and debug versions of compiled code. For the debug version, the compiler doesn't attempt to optimize the code, so that you can run a debugger and step through your program line by line; essentially, the compiled code closely resembles your source code in how it is executed. When compiling in release mode, the compiler attempts to optimize the program so that it has the same functionality but is more efficient.
However, I'm curious as to whether or not there are instances where the source code between release and debug version can be different. That is, when we refer to debug vs release, are we always just talking about the compiled code, or can there exist differences in the source code?
This question arises due to me working in a proprietary programming language in which a formal, step by step debugger doesn't exist, yet serial monitors do exist. Thus a lot of our 'debug' vs 'release' code is implemented via #defines which look something like this:
#ifdef _DEBUG
check that error didn't occur...
SerialPrint("Error occurred")
#endif
So to summarize my question, depending on your IDE, are there often settings for implementing what I've illustrated? That is, when you attempt to compile to a debug version, can it be integrated with changes in the source code? Or does release vs debug typically just refer to the compiled binaries?
Thank you!
Is there a difference in source code for release and debug compiled programs?
It depends on the source code, and the options used to compile the library or program. Below are a few differences I am aware of.
ASSERTS
The simplest of "debugging and diagnostics" is an assert. They are in effect when NDEBUG is not defined. Asserts create self-debugging code, and they snap when an unexpected condition is encountered. The trick is you have to assert everything. Everywhere you validate parameters and state, you should see an assert. Everywhere there's an assert, you should see an if to validate parameters and state.
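For instance, a parameter check might pair the two like this (an illustrative sketch, not taken from the answer; copy_buffer is a made-up function):

#include <cassert>
#include <cstddef>

// The assert traps the bad call loudly in debug builds, while the if still
// validates the parameters in release builds.
bool copy_buffer(const char* src, char* dst, std::size_t len)
{
    assert(src != nullptr);
    assert(dst != nullptr);

    if (src == nullptr || dst == nullptr)
        return false;   // release builds still reject the bad call

    while (len--)
        *dst++ = *src++;
    return true;
}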
I laugh when I see a code base without asserts. I kind of say to myself, the devs have too much time on their hands if they are wasting it under the debugger. I often ask why they don't use asserts, and they usually answer with the following...
Posix assert sucks because it calls abort. If you are debugging a program, then you usually want to step through the code to see how it handles the negative condition that caused the assert to fire. Terminating the program runs afoul of the "debugging and diagnostic" purpose. It has got to be one of the dumbest decisions in the history of C/C++. No one seems to recall the reasoning for the abort (a few years ago I tried to track down the pedigree on various C/C++ standards lists).
Usually you replace the useless Posix assert with something more useful, like an assert that raises a SIGTRAP on Linux or calls DebugBreak on Windows. See, for example, a sample trap.h. You replace the Posix assert with your assert to ensure libraries you are using get the updated behavior (if they have already been compiled, then its too late).
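A minimal sketch of such a replacement (assuming raise(SIGTRAP) from <signal.h> on Linux and DebugBreak from <windows.h> on Windows; this is not the trap.h linked above, just an illustration):

#if defined(_WIN32)
#  include <windows.h>
#  define TRAP() DebugBreak()          /* break into the attached debugger */
#else
#  include <signal.h>
#  define TRAP() raise(SIGTRAP)        /* stop under gdb instead of aborting */
#endif

#ifndef NDEBUG
#  define MY_ASSERT(cond) do { if (!(cond)) TRAP(); } while (0)
#else
#  define MY_ASSERT(cond) ((void)0)
#endif

The point is that a failed check drops you into the debugger at the offending line instead of tearing the process down.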
I also laugh when projects like ISC's BIND (the DNS server that powers the Internet) DoS themselves with their asserts (they have their own assert; they don't use Posix assert). There are a number of CVEs against BIND for its self-inflicted DoS. DoS'ing yourself is right up there with "let's abort a program being debugged".
For completeness, the Microsoft Foundation Classes (MFC) used to have something like 16,000 or 20,000 asserts to help catch mistakes early. That was back in the late 1990s or mid 2000s. I don't know what the state is today.
APIs
Some APIs exist that are purposefully built for "debugging and diagnostics". Other APIs can be used for it even though they are not necessarily safe to use in production.
An example of the former (purposefully built) is a Logging and DebugPrint API. Apple successfully used it to egress a user's FileVault passwords and keys. Also see os x filevault debug print.
An example of the latter (not safe for production) is Windows' IsBadReadPointer and IsBadWritePointer. They are not safe for production because they suffer from a race condition, but they are usually fine for development because you want the extra scrutiny.
When we perform security reviews and audits, we often ask/recommend removing all non-essential logging; and ensure the logging level cannot be changed at runtime. When an app goes production, the time for debugging is over. There's no reason to log everything.
Libraries
Sometimes there are special libraries to help with debugging and diagnostics. Linux's Electric Fence and Microsoft's debug CRT library come to mind. Both are memory checkers with APIs. In this case, your link command will be different, too.
Options
Sometimes you need additional options or defines to help with debugging and diagnostics. Glibc++ and -D_GLIBCXX_DEBUG come to mind. Another one is concept checking, which used to be enabled by the define -D_GLIBCXX_CONCEPT_CHECKS. It's Boost code and it's broken, so you should not use it. In these cases, your compile flags will be different.
Another one I often laugh at is a Release build that lacks the NDEBUG define. That includes Debian and Ubuntu as a matter of policy. The NSA, GCHQ and other three-letter agencies thank them for taking sensitive information (like server keys), stripping the encryption (writing it to a file unprotected), and then egressing the sensitive information (sending it via Windows Error Reporting, Apport Error Reporting, etc.).
Initialization
Some development environments perform initialization with special bit patterns when a value is not explicitly initialized. It's really just a feature of the tools, like the compiler or linker. Microsoft's tools come to mind; see When and why will an OS initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete? GCC had a feature request for it, but I don't think anything was ever done with it.
I often laugh when I disassemble a production DLL and see the Microsoft debug bit patterns, because I know they are shipping a Debug DLL. I laugh because it often indicates the Release DLL has a memory error that the dev team was not able to clear. Adobe is notorious for doing this (not surprisingly, Adobe supplies some of the most insecure software on the planet, even though they don't supply an operating system like Apple or Microsoft).
#ifdef _DEBUG
check that error didn't occur...
SerialPrint("Error occurred")
#endif
It makes me want to cry, but you still have to do this in 2016. GDB is (was?) broken under Aarch64, X32 and S/390, so you have to use printf's to debug your code.
The C++ standard supports a kind of debug versus release via the assert macro, whose behavior is governed by whether the NDEBUG macro symbol is defined. But this is not intended as an application wide setting. The standard explicitly notes that each time <assert.h> or <cassert> is included, regardless of how many times it's already been included, it changes the effective definition of assert according to the current definitional state of NDEBUG.
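A small sketch of that per-inclusion behaviour (illustrative only):

#include <cassert>
void checked(int x)   { assert(x > 0); }  // active: NDEBUG was not defined at this inclusion

#define NDEBUG
#include <cassert>                         // re-inclusion redefines assert for the new NDEBUG state
void unchecked(int x) { assert(x > 0); }  // compiled out: NDEBUG is defined here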
The compiler vendor's implementation of the standard library may rely on other symbols.
And application frameworks may rely on yet other symbols, e.g. _DEBUG, which is a symbol defined by the Visual C++ compiler when you specify the (debug library) /MTd or /MDd option.
In regards to IDE settings, you're free to do what you want. Yes, some IDEs (like MS Visual Studio) or tools like CMake add the _DEBUG macro definition specifically for debug configurations, but you could also define one yourself if it's missing. Also, the _DEBUG name is not set in stone; you could define MY_PROJECT_DEBUG or whatever instead.
If release and debug versions stay identical in regards to their primary functionality, you're fine. You could add any code wrapped in #ifdef _DEBUG (or otherwise #ifndef _DEBUG) as long as the end result produced by the program is the same.
The usual mistake there is when debug code, which is considered optional, produces side effects. Consider the assert example given by others; an approximate implementation looks like this:
#ifdef NDEBUG
#define assert(x) ((void)0)
#else
#define assert(x) ((x) ? (void)0 : abort())
#endif
Notice how assert doesn't evaluate x when in release mode (provided that NDEBUG is only defined in release mode). This means that if the condition passed as the macro argument has side effects, your code will behave differently in debug and release modes:
#include <assert.h>

int main()
{
    int x = 5;
    assert(x-- == 5);
    return x; // returns 5 in release mode, 4 in debug mode
}
The behavior above is not something you want, as it changes the end result. Real-world code may be more complex and introduce side effects less obviously, e.g. assert(SomeFunctionCall()) and the like.
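One common way to keep both builds equivalent (a sketch; SomeFunctionCall is the hypothetical function mentioned above) is to evaluate the expression unconditionally and assert only on the result:

#include <assert.h>

extern bool SomeFunctionCall();     // hypothetical function with side effects

void Example()
{
    bool ok = SomeFunctionCall();   // side effects happen in both debug and release builds
    assert(ok);                     // only the check disappears in release mode
    (void)ok;                       // avoid an unused-variable warning when NDEBUG is defined
}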
Note that asserts may not be the best example though, as some people like to have them enabled even in release builds.
I am writing a C++ MFC-based application. Occasionally I will add #if defined(_DEBUG) statements in order to help me during the development process. My manager has demanded that I remove all such statements as he doesn't want two versions of the code. My feeling is that without the ability to use #if defined(_DEBUG), bugs are more likely to creep in undetected during the development process.
Does anyone else have any thoughts on this?
Well, the runtime library and MFC have two versions, debug and release, so there will always be two versions of the code.
The usage of #ifdef _DEBUG and assert() will help you in the debugging process.
BUT...
It is not recommended to add class/struct members inside an #ifdef clause, because the binary interface of the object will be different, and if you serialize such a struct or send it between debug and release versions, the two will not match.
E.g.:

#include <assert.h>

class MyClass
{
public:
    void SetA(int a)
    {
        assert(a < 10000); // this is recommended
    }

#ifdef _DEBUG
    int m_debugCounter;    // this is not recommended
#endif
};

In the example, sizeof(MyClass) differs between the debug and release builds.
There are pros and cons for having debug code compiled out of production code.
Some of the pros:
The production executables will be smaller
Resources are not wasted preparing logging messages, statistics, etc which are never used
You may be able to remove dependencies on external libraries, if they are only used by the debugging code
Some of the cons:
You end up with two different executables
It makes troubleshooting in the field more difficult because some of the debugging logs are not available in the production version
There is a risk that the behavior of the two versions is not the same. For example, bugs might appear in the production version which did not appear during testing, because the testing was done on the debug version.
There is a risk of incompatibilities between the versions, such as files having slightly different formats (as indicated in eranb's answer).
In particular, if you use code that is compiled out (even assert() calls), be very careful that the debug code has no side effects. Otherwise you will create bugs which disappear when debugging is turned on.
You can achieve at least some of pros by using a decent logging framework. For example, I have had some success using rLog - one of the things I like about it is that it is optimized to minimize overhead of dormant logging statements. Other more up to date logging frameworks provide similar functionality.
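As a rough illustration of how dormant statements can cost next to nothing (this is not rLog's actual API, just a hand-rolled sketch; LOG_COMPILED_LEVEL is a made-up build-time setting): the level check is a compile-time constant, so the compiler can drop the whole call in builds where the level is too low.

#include <cstdio>

#define LOG_LEVEL_NONE  0
#define LOG_LEVEL_ERROR 1
#define LOG_LEVEL_DEBUG 2

#ifndef LOG_COMPILED_LEVEL
#define LOG_COMPILED_LEVEL LOG_LEVEL_ERROR   // hypothetical build-time setting
#endif

#define LOG(level, msg) \
    do { if ((level) <= LOG_COMPILED_LEVEL) std::fprintf(stderr, "%s\n", (msg)); } while (0)

// LOG(LOG_LEVEL_DEBUG, "expensive detail") folds away entirely in a build
// compiled with LOG_COMPILED_LEVEL set to LOG_LEVEL_ERROR.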
Having said that, every C++ environment I have worked with has at least some level of compile in/out of debug code. For example, assert() is compiled in/out depending on the NDEBUG macro. The ASSERT() / _DEBUG seems to be an MSVC specific variation on that theme (although as far as I know MSVC also supports the more standard assert() / NDEBUG).
If the manager wants this, he doesn't understand what it takes to target code quality...
If the #if blocks should be removed, you can also remove ASSERTs of any kind, because they just hide the #if _DEBUG.
And keep in mind that MFC and the CRT themselves are full of this (really) useful #if _DEBUG code!
Without such special _DEBUG blocks I wouldn't be able to catch misuse of classes, or to trap internal problems.
For those who develop software for multiple platforms, how do you handle the possibility that some compilers might do certain things better than others?
Say you develop for OS X, Windows, Linux and you are using Clang/LLVM, VS and GCC.
So if someone compiles your app on OS X using GCC, and another person compiles it on OS X using the Intel compilers, you could optimize sections of the code for the Intel compilers if that person has them.
Would you just check a Preprocessor directive?
#ifdef __GCC_
// do it this way
#endif
#ifdef __INTEL__
// do it this way
#endif
#ifdef __GCC_WITH C++_V11_Support__
// do it this way
#endif
#ifdef __WINDOWS_VISUAL_STUDIO
// do it this way
#endif
Or is there a better way?
How does one find a list of what directives a compiler offers for checking the compiler version, etc.?
Don't choose the implementation based on predefined macros. Let the build system control it.
This lets you build and compare multiple implementations against each other during unit testing.
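In practice that usually means the build system compiles exactly one of several interchangeable source files per target, rather than the code sniffing compiler macros. A hypothetical sketch (copy.h, copy_portable.cpp, and copy_fast_intel.cpp are made-up names):

// copy.h -- one interface; the build system decides which single .cpp
// implementing it gets compiled into each target (no #ifdef in the code).
#include <cstddef>
void copy_block(const void* src, void* dst, std::size_t n);

// copy_portable.cpp -- generic fallback listed in every target
#include <cstring>
void copy_block(const void* src, void* dst, std::size_t n)
{
    std::memcpy(dst, src, n);
}
// copy_fast_intel.cpp (not shown) would define the same symbol and be listed
// only for Intel-compiler builds; unit tests can build both and compare them.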
Typically, optimization follows the traditional 80/20 or 90/10 rule of "20% of the code takes 80% of the time to run" (and "20% of the code takes 80% of the time to develop"). Substitute 80/20 for 90/10 if you like - it's nearly always somewhere between those two...
So, the first stage of "do we optimize for a particular compiler" is to figure out what parts of your code are slow, and if you can make it any better in a generic way that works on all compilers (e.g. passing const reference rather than a copy of a large object). Once you have exhausted all generic improvements to the code, you may want to look at compiler specific optimizations - but that really requires that you gain enough that it really is worth the extra maintenance of having code that is different between the different compilers.
In general, I would very much avoid the "things are different in different compilers".
Generally speaking, compilers are written to optimize common code, not something specialized written specifically for the compiler. So generally you should just focus on writing clean code using the fastest algorithms. However, some compilers are hintable, for instance gcc through attributes; using these attributes lets the compiler do its job better.
For instance using the noreturn attribute will allow gcc to discard function return code, thereby minimizing code size. I guess a lot of compilers have similar hinting schemes.
One could then do;
#ifdef __GNUC__
#define NO_RETURN __attribute__((noreturn))
#else
#define NO_RETURN
#endif
And use NO_RETURN in your code.
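A hedged usage sketch (fatal_error is a hypothetical function; it assumes the NO_RETURN macro defined above, and the attribute tells the compiler that nothing after a call to it executes):

#include <cstdio>
#include <cstdlib>

NO_RETURN void fatal_error(const char* msg)
{
    std::fprintf(stderr, "fatal: %s\n", msg);
    std::exit(EXIT_FAILURE);   // never returns, matching the hint
}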
Doing cross platform development with 64bit. Using gcc/linux and msvc9/server 2008.
Just recently deployed a customer on Windows, and during some testing of upgrades I found out that although std::streamoff is 8 bytes, the program crashes when seeking past 4 GB.
I immediately switched to stlport which fixes the problem, however stlport seems to have other issues. Is STL with msvc9 really that broken, or am I missing something?
Since the code is cross platform I have zero interest in using any win32 calls.
Related
iostream and large file support
Reading files larger than 4GB using c++ stl.
Even though you say that you have "zero" interest in using "win32" calls, in situations like this you're stuck between a rock and a hard place.
I would just implement my own version of a file iostream using the "win32" calls that looks and feels like the fstream interfaces. This is easy to do and I've done it hundreds of times.
Call it say 'fstreamwin32'.
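A very rough sketch of what the Win32-backed class might start as (illustrative only; a real fstream look-alike needs the full stream interface, error handling, write support, and so on):

#include <windows.h>
#include <cstdint>

class fstreamwin32 {
public:
    explicit fstreamwin32(const char* path)
        : h_(CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                         OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL)) {}
    ~fstreamwin32() { if (h_ != INVALID_HANDLE_VALUE) CloseHandle(h_); }

    // 64-bit seek, which is the part that was breaking past 4 GB
    bool seekg(std::int64_t offset)
    {
        LARGE_INTEGER li;
        li.QuadPart = offset;
        return SetFilePointerEx(h_, li, NULL, FILE_BEGIN) != 0;
    }

    bool read(char* buffer, DWORD bytes, DWORD& bytesRead)
    {
        return ReadFile(h_, buffer, bytes, &bytesRead, NULL) != 0;
    }

private:
    HANDLE h_;
};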
Then I would have a header file that would do something like:
#ifdef WIN32
typedef fstreamwin32 fstreamnative;
#else
typedef fstream fstreamnative;
#endif
Then I would use fstreamnative everywhere. That way you keep your code cross-platform and still solve your problem.
If the problem is ever fixed, you can easily remove your "win32" workaround by changing the typedef back to fstream. This is why lots of cross-platform codebases have many levels of indirection (e.g. their own typedefs for standard stuff), so that they can do things like this without having to change a lot of code.
Another link I found on this subject:
http://cplusplus.com/forum/general/6813/
I ended up using STLport. The biggest difference with STLport is that some unit tests which previously crashed during multiplication of double-precision numbers now pass. There are some other differences in relative precision popping up, but those seem minor.
I really hate using STL containers because they make the debug version of my code run really slowly. What do other people use instead of STL that has reasonable performance for debug builds?
I'm a game programmer and this has been a problem on many of the projects I've worked on. It's pretty hard to get 60 fps when you use STL containers for everything.
I use MSVC for most of my work.
EASTL is a possibility, but still not perfect. Paul Pedriana of Electronic Arts did an investigation of various STL implementations with respect to performance in game applications, the summary of which can be found here:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2271.html
Some of these adjustments are being reviewed for inclusion in the C++ standard.
And note that even EASTL doesn't optimize for the non-optimized case. I had an Excel file with some timings a while back (I think I've lost it), but for element access it was roughly:
            debug   release
STL           100        10
EASTL          10         3
array[i]        3         1
The most success I've had was rolling my own containers. You can get those down to near array[x] performance.
My experience is that well-designed STL code runs slowly in debug builds because the optimizer is turned off. STL containers emit a lot of calls to constructors and operator=, which (if they are lightweight) get inlined/removed in release builds.
Also, Visual C++ 2005 and up has checking enabled for STL in both release and debug builds. It is a huge performance hog for STL-heavy software. It can be disabled by defining _SECURE_SCL=0 for all your compilation units. Please note that having different _SECURE_SCL status in different compilation units will almost certainly lead to disaster.
You could create a third build configuration with checking turned off and use that to debug with performance. I recommend you to keep a debug configuration with checking on though, since it's very helpful to catch erroneous array indices and stuff like that.
If you're running Visual Studio you may want to consider the following:
#define _SECURE_SCL 0
#define _HAS_ITERATOR_DEBUGGING 0
That's just for iterators; what type of STL operations are you performing? You may want to look at optimizing your memory operations, e.g. using resize() to insert several elements at once instead of using push/pop to insert elements one at a time.
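A small illustration of the difference (sketch only):

#include <vector>

void fill_presized(std::vector<int>& v, int n)
{
    v.resize(n);                 // one allocation up front
    for (int i = 0; i < n; ++i)
        v[i] = i;
}

void fill_one_at_a_time(std::vector<int>& v, int n)
{
    for (int i = 0; i < n; ++i)
        v.push_back(i);          // may reallocate and copy repeatedly as it grows
}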
For big, performance critical applications, building your own containers specifically tailored to your needs may be worth the time investment.
I'm talking about real game development here.
I'll bet your STL uses a checked implementation for debug. This is probably a good thing, as it will catch iterator overruns and such. If it's that much of a problem for you, there may be a compiler switch to turn it off. Check your docs.
If you're using Visual C++, then you should have a look at this:
http://channel9.msdn.com/shows/Going+Deep/STL-Iterator-Debugging-and-Secure-SCL/
and the links from that page, which cover the various costs and options of all the debug-mode checking which the MS/Dinkumware STL does.
If you're going to ask such a platform dependent question, it would be a good idea to mention your platform, too...
Check out EASTL.
MSVC uses a very heavyweight implementation of checked iterators in debug builds, which others have already discussed, so I won't repeat it (but start there)
One other thing that might be of interest to you is that your "debug build" and "release build" probably involves changing (at least) 4 settings which are only loosely related.
Generating a .pdb file (cl /Zi and link /DEBUG), which allows symbolic debugging. You may want to add /OPT:REF to the linker options; the linker drops unreferenced functions when not making a .pdb file, but in /DEBUG mode it keeps them all (since the debug symbols reference them) unless you add this explicitly.
Using a debug version of the C runtime library (probably MSVCR*D.dll, but it depends on what runtime you're using). This boils down to /MD or /MDd (or /MT vs /MTd if you are not using the DLL runtime).
Turning off the compiler optimizations (/Od)
Setting the preprocessor #defines DEBUG or NDEBUG
These can be switched independently. The first costs nothing in runtime performance, though it adds size. The second makes a number of functions more expensive, but has a huge impact on malloc and free; the debug runtime versions are careful to "poison" the memory they touch with values to make uninitialized-data bugs clear. I believe with the MSVCP* STL implementations it also eliminates all the allocation pooling that is usually done, so that leaks show exactly the block you'd think and not some larger chunk of memory that it's been sub-allocating; that means it makes more calls to malloc on top of them being much slower. The third, well, that one does lots of things (this question has some good discussion of the subject). Unfortunately, it's needed if you want single-stepping to work smoothly. The fourth affects lots of libraries in various ways, but most notably it compiles in or eliminates assert() and friends.
So you might consider making a build with some lesser combination of these selections. I make a lot of use of builds that have symbols (/Zi and link /DEBUG) and asserts (/DDEBUG), but are still optimized (/O1 or /O2 or whatever flags you use), with stack frame pointers kept for clear backtraces (/Oy-) and using the normal runtime library (/MT). This performs close to my release build and is semi-debuggable (backtraces are fine; single-stepping is a bit wacky at the source level, though the assembly level works fine, of course). You can have however many configurations you want; just clone your release one and turn on whatever parts of the debugging seem useful.
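For illustration, a command line exercising that mixed configuration might look roughly like this (assuming a single main.cpp; in practice you would set the equivalent options in the project's properties):

cl /O2 /Oy- /Zi /MT /DDEBUG main.cpp /link /DEBUG /OPT:REF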
Sorry, I can't leave a comment, so here's an answer: EASTL is now available at github: https://github.com/paulhodge/EASTL
Ultimate++ has its own set of containers - not sure if you can use them separately from the rest of the library: http://www.ultimatepp.org/
What about the ACE library? It's an open-source object-oriented framework for concurrent communication software, but it also has some container classes.
Checkout Data Structures and Algorithms with Object-Oriented Design Patterns in C++
By Bruno Preiss
http://www.brpreiss.com/
Qt has reimplemented most of the C++ standard library stuff with different interfaces. It looks pretty good, but it can be expensive for the commercially licensed version.
Edit: Qt has since been released under the LGPL, which usually makes it possible to use it in a commercial product without buying the commercial version (which also still exists).
STL containers should not run "really slowly" in debug or anywhere else. Perhaps you're misusing them. You're not running against something like Electric Fence or Valgrind in debug, are you? They slow down anything that does lots of allocations.
All the containers can use custom allocators, which some people use to improve performance - but I've never needed to use them myself.
There is also the ETL (https://www.etlcpp.com/). This library is aimed especially at time-critical (deterministic) applications.
From the webpage:
The ETL is not designed to completely replace the STL, but to complement it. Its design objective covers four main areas:
Create a set of containers where the size or maximum size is determined at compile time. These containers should be largely equivalent to those supplied in the STL, with a compatible API.
Be compatible with C++03 but implement as many of the C++11 additions as possible.
Have deterministic behaviour.
Add other useful components that are not present in the standard library.
The embedded template library has been designed for lower-resource embedded applications. It defines a set of containers, algorithms and utilities, some of which emulate parts of the STL. There is no dynamic memory allocation. The library makes no use of the heap. All of the containers (apart from intrusive types) have a fixed capacity, allowing all memory allocation to be determined at compile time. The library is intended for any compiler that supports C++03.
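For a flavour of the fixed-capacity containers (a minimal sketch assuming ETL's etl::vector<T, MAX_SIZE> container and its etl/vector.h header):

#include "etl/vector.h"

etl::vector<int, 16> samples;   // capacity fixed at compile time, no heap use

void record(int value)
{
    if (!samples.full())        // capacity checks replace reallocation
        samples.push_back(value);
}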