Is there a way (g++ option?) to check what code is generated implicitly by the C++ compiler (e.g. all the default constructors/destructors)?
Having the generated C++ code would be ideal, but at least the assembly would be good. Using:
g++ -S -g -O0 <file.cpp>
does not give me any label with generated constructors/destructors.
I think the -fdump-tree-original option is about as close as you can get. Unfortunately it shows both your own code and the automatically generated code without labelling which is which. However, it's the most readable of GCC's dumps, and it shows the generated code before any optimizations are performed.
Another option would be -fdump-translation-unit. That creates a raw dump of the tree with literally everything in it; the nodes the compiler made up are marked as "artificial". However, the format isn't easy for humans to read, and there's a lot of it to wade through even for a trivial source file. To get any useful information out of it you'd probably need to write a program that reads the dump, walks the tree to find the nodes you're interested in, and prints them out in a more readable format.
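For example, a minimal sketch (file and class names made up; the numeric prefix in the dump file name varies between GCC versions): a class whose special members are all compiler-generated, fed through -fdump-tree-original.
// person.cpp -- no user-declared constructors, destructor or assignment operators
#include <string>
struct Person {
    std::string name;   // non-trivial member, so the synthesized members contain real code
};
int main() {
    Person a;           // implicit default constructor
    Person b = a;       // implicit copy constructor
    return 0;           // implicit destructors for b and a run here
}
Running g++ -c -fdump-tree-original person.cpp writes a dump such as person.cpp.003t.original in the current directory; searching it for Person should show the synthesized constructor and destructor bodies interleaved with your own code.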
Related
I'm using uncrustify to format a directory full of C and C++ code. I need to ensure that uncrustify won't change the resulting code; I can't do a diff on the object file or binaries because the object files have a timestamp and so won't ever be identical. I can't check the source of the files one by one because I'd be here for years.
The project uses make for the build process so I was wondering if there is some way to output something there that could be checked.
I've searched SO and Google to no avail, so my apologies if this is a duplicate.
EDIT: I'm using gcc/g++ and compiling for 32 bit.
One possibility would be to compile them with Clang and get the output as LLVM IR. If memory serves, the command-line arguments for this are -S -emit-llvm.
To do the same with gcc/g++, you can use one of its flags to generate a file containing its intermediate representation at some stage of compilation. Early stages will still show differences from changes in white space and such, but a quick test indicates that by the SSA stage, such non-operational changes have disappeared from the IR.
g++ -c -fdump-tree-ssa foo.cpp
In addition to the normal object file, this will produce a file named foo.cpp.018t.ssa that represents the semantic actions in your source file.
As noted above, I haven't tested this extensively though -- it's possible that at this stage some non-operational changes will still produce different output files (though I kind of doubt it). If necessary, you can use -fdump-tree-all to get output from all stages of compilation [1]. As a simple rule of thumb, I'd expect later stages to be more immune to changes in formatting and such, so if the ssa stage doesn't work, my next choice would probably be the optimized stage, which is one of the last stages (note: the files produced are numbered in the order of the stage that produced them, so when you dump all stages it's obvious which come from early stages and which from later ones).
1. Note that this produces quite a few files, many of them quite large. The first time you do this, you probably want to do it on a single source file in a directory by itself to keep from drowning in files, so to speak. Also, don't be surprised if compiling this way takes quite a bit longer than normal.
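As a concrete sketch of that workflow (untested here; the glob assumes GCC produces exactly one .ssa dump per source file):
# before reformatting
g++ -c -fdump-tree-ssa foo.cpp
mv foo.cpp.*.ssa foo.ssa.before
# after running uncrustify on foo.cpp
g++ -c -fdump-tree-ssa foo.cpp
mv foo.cpp.*.ssa foo.ssa.after
diff foo.ssa.before foo.ssa.after   # no output means the IR did not change
In a make-based build you could hook the dump-and-diff step into a separate target that runs over every translation unit.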
How to see the added code in C++ by the compiler?
E.g., we know that
when an object of some class goes out of scope, the destructor for that object is called, but how do you see the specific code that does the destructor call? Is that code still written in C++?
It's compiler-dependent and in assembly language. For example, with the Microsoft compiler, compiling with /FAsc will generate a .cod file for each object file, containing the assembly code along with the original C++ lines as comments. It will show the calls to constructors/destructors as well.
There's not necessarily any "code" that gets added. C++ is pretty clear on when such things happen, and for the compiler, making a new object clearly means calling its constructor -- no additional "code" anywhere.
You're right, however, that things like calls to the constructor or destructor must end up somewhere in the assembly -- but there's absolutely no guarantee that looking at the assembly reveals much more than what you'd have known anyway. C++ compilers are pretty mature in these respects and inline a lot of things where that makes sense, making the same code look different in different places.
The closest thing you'll get is adding debug symbols to your build and using a debugger to get a call graph -- that way you will actually notice when the code you can see gets called.
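For instance (a minimal sketch with made-up names), with a debug build you can break on a destructor and let the debugger show you which scope exit triggered the call the compiler inserted:
g++ -g -O0 widget.cpp -o widget
gdb ./widget
(gdb) break Widget::~Widget
(gdb) run
(gdb) backtrace        # shows where the compiler-inserted destructor call came from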
You can add flags to the compile command that let you see the file at various stages of the compiler's work. For example, the -S flag produces a file after the preprocessor and the initial compilation have run, but before the assembler runs. However, this code will not be written in C++.
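A small sketch of that (class and file names made up; the exact mangled symbols depend on the ABI): compile a class with a non-trivial destructor and look for the call the compiler inserted at the end of the scope.
// widget.cpp
#include <cstdio>
struct Widget {
    ~Widget() { std::puts("bye"); }
};
void f() {
    Widget w;
}   // the compiler inserts a call to Widget::~Widget() here
g++ -S widget.cpp
c++filt < widget.s | grep '~Widget'   # shows the demangled destructor label and the call to it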
I have a very simple question but I haven't been able to find the answer yet so here I go:
I am using a shared library and I'd like to know whether or not it has been compiled with an optimization flag such as -O3.
Is there a way to find that information?
Thank you very much
If you are using gcc 4.3 or later, check out -frecord-gcc-switches. After you build the binary file use readelf -n to read the notes section.
More info can be found here: Detect GCC compile-time flags of a binary
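A rough sketch of what that looks like (library name made up; on many ELF targets GCC stores the recorded flags in a .GNU.command.line section, so readelf -p on that section is worth trying in addition to readelf -n):
g++ -O3 -g -fPIC -shared -frecord-gcc-switches mylib.cpp -o libmylib.so
readelf -n libmylib.so                       # notes section
readelf -p .GNU.command.line libmylib.so     # recorded compiler command line, if present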
Unless whoever compiled the library in the first place used a compiler that saves these flags to the binary somehow (I think only recent GCC allows that, and probably clang), there's inherently no way to know exactly what flags were used. You can, of course, if you have a lot of experience looking at assembly, deduce a lot (for example, "this looks like an automatically unrolled loop", "this looks like someone optimized for a processor where A xor A is faster than A := 0x0", etc.).
Generally, different source code can always end up as the same compiled code, so in many cases there's no way to tell whether what was compiled had been optimized "by hand" in the first place or had been through compiler optimization.
Also, there are a lot of C++ compilers out there, a lot of versions of these and even more flags...
Now, your question must come from somewhere; I'm guessing you're asking this because either
you want to know if there's debugging symbols in there, or
you want to make sure something isn't crashing because of incorrect optimization, or
you want to know whether there's potential for optimization.
Now, 1. is really rather independent of the level of optimization; of course, the more you optimize, the less your machine code corresponds to "lines of source code", but you can still have debugging symbols (a quick way to check for them is sketched below).
The second point: I've learned the hard way that unless I've successfully excluded every other alternative, I'm the one to blame for bugs (and not my compiler).
The third point: There's always room for optimization, but that won't help you unless you're in a position to recompile the library yourself. If you recompile, you'll set the flags, so no need to find out if they were set in the first place. If you're not able to recompile: Knowing there is room won't help you. If you're just getting your library out of a complex build process: Most build systems leave you with a log that will include things like compiler flags.
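Coming back to the first point, a quick sketch of how to check for debug symbols with standard binutils (library name made up):
readelf -S libfoo.so | grep debug    # .debug_info and friends appear only if built with -g
file libfoo.so                       # "not stripped" only means the symbol table is there, which is not the same as -g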
I have a task to create optimized C++ source code and give it to a friend for compilation. That means I do not control the final compilation; I just write the source code of the C++ program.
I know that I can enable optimization during compilation with GCC's -O1 (and -O2 and other) options. But how can I get this optimized source code instead of a compiled program? I am not able to configure the parameters of my friend's compiler, which is why I need to produce good source code on my side.
The optimizations performed by GCC are low level; that means you won't get C++ code back, at best assembly code, and you won't be able to convert that back into C++ source.
In sum: optimize the source code at the code level, not at the object level.
You could ask GCC to dump its internal (Gimple, ...) representations, at various "stages". The middle-end of GCC is made of hundreds of passes, and you could ask GCC to dump them, with arguments like -fdump-tree-all or -fdump-gimple-all; beware that you can get hundreds of dump files for a single compilation!
However, GCC internal representations are quite low level, and you should not expect to understand them without reading a lot of material.
The dump options I am mentioning are mostly useful to those working inside GCC, or extending it through plugins coded in C or extensions coded in MELT (a high-level domain-specific language for extending GCC). I am not sure they will be very useful to your friend. However, they can be useful to make you understand that optimization passes do a lot of complex processing.
And don't forget that premature optimization is evil: you should first make your program run correctly, then benchmark and profile it, and only then optimize the few parts worth your effort. You probably won't be able to write correct and efficient programs without testing and running them yourself before giving them to your friend.
Easy - choose the best algorithm possible, let the rest be handled by the optimizer.
Optimizing the source code is different than optimizing the binary. You optimize the source code, the compiler will optimize the binary.
For anything more than algorithm choice, you'll need to do some profiling. Sure, there are practices that can speed up code, but some of them make it less readable. Only optimize when you have to, and only after you measure.
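As a toy illustration of a source-level (algorithmic) optimization, independent of any compiler flags:
#include <algorithm>
#include <cstddef>
#include <vector>

// O(n^2): compare every pair
bool has_duplicates_slow(const std::vector<int>& v) {
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = i + 1; j < v.size(); ++j)
            if (v[i] == v[j]) return true;
    return false;
}

// O(n log n): sort a copy, then look for equal neighbours
bool has_duplicates_fast(std::vector<int> v) {
    std::sort(v.begin(), v.end());
    return std::adjacent_find(v.begin(), v.end()) != v.end();
}
No -O flag will turn the first version into the second; that kind of improvement has to happen in the source.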
I just asked a question related to how the compiler optimizes certain C++ code, and I was looking around SO for any questions about how to verify that the compiler has performed certain optimizations. I was trying to look at the assembly listing generated with g++ (g++ -c -g -O2 -Wa,-ahl=file.s file.c) to possibly see what is going on under the hood, but the output is too cryptic to me. What techniques do people use to tackle this problem, and are there any good references on how to interpret the assembly listings of optimized code or articles specific to the GCC toolchain that talk about this problem?
GCC's optimization passes work on an intermediary representation of your code in a format called GIMPLE.
Using the -fdump-* family of options, you can ask GCC to output intermediary states of the tree.
For example, feed this to gcc -c -fdump-tree-all -O3
unsigned fib(unsigned n) {
if (n < 2) return n;
return fib(n - 2) + fib(n - 1);
}
and watch as it gradually transforms from a simple exponential algorithm into a complex polynomial algorithm. (Really!)
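The dumps worth comparing for that example are the earliest and latest tree passes (the numeric prefixes in the file names vary between GCC versions):
gcc -c -fdump-tree-all -O3 fib.c
ls fib.c.*                # one numbered dump file per pass
less fib.c.*.original     # close to the code as written
less fib.c.*.optimized    # the final GIMPLE after inlining and the other tree optimizations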
A useful technique is to run the code under a good sampling profiler, e.g. Zoom under Linux or Instruments (with Time Profiler instrument) under Mac OS X. These profilers not only show you the hotspots in your code but also map source code to disassembled object code. Highlighting a source line shows the (not necessarily contiguous) lines of generated code that map to the source line (and vice versa). Online opcode references and optimization tips are a nice bonus.
Instruments: developer.apple.com
Zoom: www.rotateright.com
Not gcc, but when debugging in Visual Studio you have the option to intersperse assembly and source, which gives a good idea of what has been generated for what statement. But sometimes it's not quite aligned correctly.
The output of the gcc toolchain and of objdump -dS isn't at the same granularity. This article on getting gcc to output source and assembly has the same options as you are using.
Adding the -L option (e.g., gcc -L -ahl) may provide slightly more intelligible listings.
The equivalent MSVC option is /FAcs (and it's a little better because it intersperses the source, machine language, and binary, and includes some helpful comments).
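For example (assuming the Visual Studio command-line tools are on the PATH):
cl /c /O2 /FAcs file.cpp    # writes file.cod with source, assembly and machine code interleaved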
About one third of my job consists of doing just what you're doing: juggling C code around and then looking at the assembly output to make sure it's been optimized correctly (which is preferred to just writing inline assembly all over the place).
Game-development blogs and articles can be a good resource for the topic, since games are effectively real-time applications in constant memory -- I have some notes on it, as do Mike Acton and others. I usually like to keep Intel's instruction set reference up in a window while going through listings.
The most helpful thing is to get a good ground-level understanding of assembly programming generally first -- not because you want to write assembly code, but because having done so makes reading disassembly much easier. I've had a hard time finding a good modern textbook though.
In order to output the optimizations that were applied, you can use:
-fopt-info-optimized
To see those that have not been applied:
-fopt-info-missed
Beware that the output is sent to the standard error stream, so to see it you have to redirect it (hint: 2>&1).
Here is a nice example:
g++ -O3 -std=c++11 -march=native -mtune=native -fopt-info-optimized h2d.cpp -o h2d 2>&1
h2d.cpp:225:3: note: loop vectorized
h2d.cpp:213:3: note: loop vectorized
h2d.cpp:198:3: note: loop vectorized
h2d.cpp:186:3: note: loop vectorized
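For reference, a toy loop that typically triggers such a note (file name made up; whether it actually vectorizes depends on the GCC version and the target flags):
// saxpy.cpp
void saxpy(float* __restrict y, const float* __restrict x, float a, int n) {
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];    // candidate for the "loop vectorized" note
}
g++ -O3 -march=native -fopt-info-optimized -c saxpy.cpp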
You can check the interleaved output, having compiled with -g, using objdump -dS | c++filt, but that will not get you very far. Enjoy!
Zoom from RotateRight ( http://rotateright.com ) is mentioned in another answer, but to expand on that: it shows you the mapping of source to assembly in what they call the "code browser". It's incredibly handy even if you're not an asm expert because they have also integrated assembly documentation into the app. And the assembly listing is annotated with comments and timing for several CPU types.
You can just open your object or executable file with Zoom and take a look at what the compiler has done with your code.
Victor, in your case the optimization you are looking for is just a smaller allocation of local memory on the stack. You should see a smaller allocation at function entry and a smaller deallocation at function exit if the space used by the empty class is optimized away.
As for the general question, I've been reading (and writing) assembly language for more than (gulp!) 30 years and all I can say is that it takes practice, especially to read the output of a compiler.
Instead of trying to read through an assembler dump, run your program inside a debugger. You can pause execution, single-step through instructions, set breakpoints on the code you want to check, etc. Many debuggers can display your original C code alongside the generated assembly so you can more easily see what the compiler did to optimize your code.
Also, if you are trying to test a specific compiler optimization you can create a short dummy function that contains the type of code that fits the optimization you are interested in (and not much else, the simpler it is the easier the assembly is to read). Compile the program once with optimizations on and once with them off; comparing the generated assembly code for the dummy function between builds should show you what the compiler's optimizers did.
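A minimal sketch of that last suggestion (file and function names made up):
// dummy.cpp -- small enough that the assembly stays readable
int sum_to_100() {
    int s = 0;
    for (int i = 1; i <= 100; ++i)
        s += i;
    return s;    // an optimizing compiler typically folds the whole loop to the constant 5050
}
g++ -S -O0 dummy.cpp -o dummy_O0.s
g++ -S -O2 dummy.cpp -o dummy_O2.s
diff dummy_O0.s dummy_O2.s    # the -O2 version usually just loads 5050 and returns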