Find out compilation optimization flag from executable - c++

I have an executable whose build environment I don't know, though I assume gcc/g++ was used.
Is there a way to find out which optimization flag was used during compilation (like -O0, -O2, ...)?
Any approach is welcome, whether it means analyzing the binary or running some debugging test in gdb (assuming the -g flag was used during compilation).

If you are lucky, the command line is present in the executable file itself, depending on the operating system and file format used. If it is an ELF file, try dumping its contents with objdump from GNU binutils (for example, objdump -s prints the raw contents of every section).
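One place the command line often survives is the DWARF producer string: when GCC compiles with -g, it normally records its version and options there (something like "GNU C17 ... -g -O2"). Here is a minimal sketch in C of searching for it, assuming the debug strings are stored uncompressed and the caller has already read the file into memory. find_gnu_producer is a hypothetical helper name, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: scan a memory image of the executable for a
 * NUL-terminated string that starts with "GNU C" (this also matches
 * "GNU C++"). GCC's producer string begins that way and usually lists
 * the options, including the -O level.
 * Returns a pointer into buf, or NULL if no such string is found. */
const char *find_gnu_producer(const unsigned char *buf, size_t len)
{
    static const char prefix[] = "GNU C";
    const size_t plen = sizeof prefix - 1;

    for (size_t i = 0; i + plen <= len; i++) {
        if (memcmp(buf + i, prefix, plen) != 0)
            continue;
        /* Accept the match only if the string is terminated inside buf. */
        if (memchr(buf + i, '\0', len - i) != NULL)
            return (const char *)(buf + i);
    }
    return NULL;
}
```

Treat any hit as a candidate only: other strings can start with the same prefix, a stripped binary has no debug info at all, and each compilation unit has its own producer string, so different parts of the program may have been built with different flags.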

I really don't know if this helps, but -O0 is quite easy to spot in the disassembly (objdump -d), since the generated code is not optimized at all and contains a few extra instructions that simplify debugging.
Typically on x86, the function prologue saves the caller's frame pointer and sets up a new one (for backtraces, I presume). So if you locate, for example, the main function, you should see something like:
... main:
... push %rbp
... mov %rsp,%rbp
And you should see this "pattern" at the beginning of almost every function.
For other targets (I dunno what your target platform is), you should see more or less similar assembly sequences in the prologues or before the function calls.
For other optimization levels, things are much trickier.
Sorry for remaining vague and not answering the entire question... Just hoping it will help.
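The prologue check can be automated once you have the first bytes of a function (from objdump -d or from the file itself). This is a rough sketch for x86-64 only; has_frame_prologue is a hypothetical name, and it looks for the encoding GCC typically emits for the two instructions shown above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: report whether the machine code at `code` starts
 * with the x86-64 frame-pointer prologue
 *     push %rbp          (55)
 *     mov  %rsp,%rbp     (48 89 e5)
 * optionally preceded by endbr64 (f3 0f 1e fa), which CET-enabled
 * toolchains place at function entry. Returns 1 on a match, else 0. */
int has_frame_prologue(const unsigned char *code, size_t len)
{
    static const unsigned char endbr64[]  = { 0xf3, 0x0f, 0x1e, 0xfa };
    static const unsigned char prologue[] = { 0x55, 0x48, 0x89, 0xe5 };

    /* Skip the optional endbr64 marker. */
    if (len >= sizeof endbr64 &&
        memcmp(code, endbr64, sizeof endbr64) == 0) {
        code += sizeof endbr64;
        len  -= sizeof endbr64;
    }
    return len >= sizeof prologue &&
           memcmp(code, prologue, sizeof prologue) == 0;
}
```

A match on nearly every function hints at -O0 but does not prove it, because -fno-omit-frame-pointer produces the same prologue at higher levels. The absence of the pattern is the stronger signal: on x86-64, GCC omits the frame pointer by default once optimization is on.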

Related

Meaning of this=<optimized out> in GDB

I understand the general concept of using optimization flags like -O2 and ending up having had things optimized out, makes sense. But what does it mean for the 'this' function parameter in a gdb frame to be optimized out? Does it mean the use of an Object was determined to be entirely pointless, and that it, and the following function call was elided from existence? Is it indicative of a function having been inlined? Is it indicative of the function call having been elided?
How would I go about investigating further? This occurs with both -O0 and -Og.
If it makes any difference, this is with an ARM process. I'm doing remote debugging using GNU gdbserver (GDB) 7.12.1.20170417-git and 'gdb-multiarch' GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1.
But what does it mean for the 'this' function parameter in a gdb frame to be optimized out?
It means that GDB doesn't have sufficient debug info to understand the current value of this.
It could happen for two reasons:
1. the compiler failed to emit relevant debug info
2. the info is there, but GDB failed to understand it
GCC used to do (1) a lot with -O2 and higher optimization levels, but that has been significantly improved around 2015-2016. I have never seen <optimized out> with GCC at -O0.
Clang still does (1) with -O2 and above on x86_64 in 2022, but again I've never seen it do that at -O0.
How would I go about investigating further?
You can run readelf --debug-dump ./a.out and see what info is present in the binary. Beware -- there is a lot of info, and making sense of it requires understanding of what's supposed to be there.
Or you could file a bugzilla issue with exact compiler and debugger versions and compilation command, attach a small binary, and hope that someone will look.
But first make sure you still get this behavior from the latest released version of GCC and GDB (or the current tip-of-trunk versions if you can build them).

Step into standard library call with godbolt

I want to know how various compilers implement std::random_device, so I popped it into godbolt.
Unfortunately, the only thing it says is
std::random_device::operator()():
push rbp
mov rbp, rsp
sub rsp, 16
mov QWORD PTR [rbp-8], rdi
mov rax, QWORD PTR [rbp-8]
mov rdi, rax
call std::random_device::_M_getval()
leave
ret
which is not very helpful. How can I step into the _M_getval() call and examine the assembly there?
You can't "step into" functions; Godbolt isn't a debugger, it's a disassembler (in "binary" mode, otherwise a compiler asm-text output filter / viewer). Your program doesn't run, it just gets compiled. (And unless you choose the "binary" output option, it only compiles to asm, not to machine code, and doesn't actually link.)
But regardless of terminology, no, you can't get Godbolt to show you disassembly for whatever version of a library it happens to have installed.
Single-step the program on your desktop. (Compile with gcc -O3 -fno-plt to avoid having to step through PLT lazy dynamic linking.)
(I did, and libstdc++ 6.2.1 on Arch Linux runs cpuid in the constructor for std::random_device. If rdrand is available, it uses it on calls to _M_getval(). Figuring this out from just disassembly would have been tricky; there are several levels of function calls and branching, and without symbols it would have been hard to figure out what's what. My Skylake has rdseed available, but it didn't use it. Yes, as you commented, that would be a better choice.)
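For reference, the kind of cpuid test described here can be sketched in C with the cpuid.h header that GCC and clang ship. This is not libstdc++'s actual code, just an illustration of the feature bits involved (RDRAND is leaf 1, ECX bit 30; RDSEED is leaf 7 subleaf 0, EBX bit 18). All function names and the HAVE_CPUID_H macro are made up for the example.

```c
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

/* Pure bit tests, so they can be checked without running cpuid. */
int ecx_has_rdrand(unsigned int leaf1_ecx) { return (leaf1_ecx >> 30) & 1u; }
int ebx_has_rdseed(unsigned int leaf7_ebx) { return (leaf7_ebx >> 18) & 1u; }

/* Hypothetical wrappers: return 1 if the running CPU reports the
 * feature, 0 if it does not or if this is not an x86 build. */
int cpu_has_rdrand(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return ecx_has_rdrand(ecx);
#endif
    return 0;
}

int cpu_has_rdseed(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return ebx_has_rdseed(ebx);
#endif
    return 0;
}
```

On a machine like the Skylake mentioned above, both wrappers would be expected to return 1; which instruction a given library version then chooses is a separate question that only the library source or single-stepping answers.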
Different compilers can generate different versions of library functions from the same source; that's the main point of the compiler explorer's existence. And no, it doesn't have a separate version of libstdc++ compiled by every compiler in the dropdown.
There'd be no guarantee that the library code you saw would match what's on your desktop, or anything.
It does actually have x86-64 Linux libraries installed, though, so in theory it would be possible for Godbolt to give you an option to find and disassemble certain library functions, but that functionality does not exist currently. And would only work for targets where the "binary" option is available; I think for most of the cross-compile targets it only has headers not libraries. Or maybe there's some other reason it won't link and disassemble for non-x86 ISAs.
Using -static and binary mode shows stuff, but not what we want.
I tried compiling with -static -fno-plt -fno-exceptions -fno-rtti -nostartfiles -O3 -march=skylake (so rdrand and rdseed would be available in case they got inlined; they don't). -fno-plt is redundant with -static, but it's useful without to remove that clutter.
-static causes the library code to actually end up in the linked binary that Godbolt disassembles. But the output is limited to 500 lines, and the definition of std::random_device::_M_getval() happens not to be near the start of the file.
-nostartfiles avoids cluttering the binary with _start and so on from CRT startup files. I think Godbolt already filters these out of the disassembly, though, because you don't see them in the normal binary output (without -static). You're not going to run the program, so it doesn't matter that the linker couldn't find a _start symbol and just defaulted to putting the ELF entry point at the start of the .text section.
Despite compiling with -fno-exceptions -fno-rtti (so no unwind handler for your function is included), libstdc++ functions were compiled with exception handling enabled. So linking them pulls in boatloads of exception code. The static executable starts out with definitions for functions like std::__throw_bad_exception(): and std::__throw_bad_alloc():
BTW, without -fno-exceptions, there's also a get_random_seed() [clone .cold]: definition, which I think is an unwind handler. It's not a definition of your actual function. Near the start of the static binary is operator new(unsigned long) [clone .cold]: which again I think is libstdc++'s exception-handler code.
I think the .text.cold or .init sections got linked first, unfortunately, so none of the interesting functions are going to be visible in the first 500 lines.
Even if this had worked, it's only binary-mode disassembly, not compiler asm
Even with debug symbols, we wouldn't know which struct member was being accessed, just numeric offsets from registers, because objdump doesn't fill those in.
And with lots of branching, it's hard to follow complicated logic possibilities. Single-stepping at run-time automatically follows the actual path of execution.
Related:
How to remove "noise" from GCC/clang assembly output? about using Matt Godbolt's Compiler Explorer for things it is good for.
Matt Godbolt's CppCon2017 talk “What Has My Compiler Done for Me Lately? Unbolting the Compiler's Lid” is an excellent guide, and points out that you can clone the compiler-explorer repo and set it up locally with your own choices of compiler. You could even hack it to allow larger output, but that's still obviously a bad approach for this problem.

When we compile a source code that contains a 'main' without linking, why can't we run it?

I am learning about the compilation process, and I know that linking is mainly used to combine a binary file that contains a 'main' function with other binary files that contain the helper functions used by main.
However, when I try to run an object file built from this code:
int main() {
return 0;
}
compiled with gcc's -c option on Ubuntu, I get the error:
"bash: ./source.o: cannot execute binary file: Exec format error"
Read Levine's Linkers & Loaders.
Read about ELF.
Try compiling with gcc -v (you'll see the actual programs used: cc1 to compile C code into assembler code, the assembler as to turn that into an object file, and ld & collect2 to link). Look also at the generated assembler file with gcc -S -fverbose-asm -O. Notice that gcc knows about (and compiles specially) the main function. And the starting point of your executable is provided by some crt0, etc. (it is not main but some _start routine coded in assembler which calls your main).
Object files are not the same as executables. The executable contains stuff like crt0 and the C standard library, or some way to dynamically link it as a shared object (and you need to link your source.o -compiled from your empty main in source.c- into an executable because of that).
On Linux, play with objdump(1) & readelf(1) (on some existing binaries, and also on your source.o object file)
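To see the difference between source.o and a linked program without any tools, you can look at the e_type field of the ELF header, which is what readelf -h reports as "Type". Below is a small sketch in C; elf_kind is a hypothetical helper, and it reads the header bytes by offset rather than through elf.h so that it builds anywhere.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: classify an ELF image by its e_type field.
 * buf must hold at least the first 18 bytes of the file. */
const char *elf_kind(const unsigned char *buf, size_t len)
{
    unsigned e_type;

    if (len < 18 || memcmp(buf, "\177ELF", 4) != 0)
        return "not ELF";

    /* Byte 5 (EI_DATA) gives the byte order: 1 = little, 2 = big.
     * e_type is the 16-bit field at offset 16 in both ELF32 and ELF64. */
    if (buf[5] == 2)
        e_type = ((unsigned)buf[16] << 8) | buf[17];
    else
        e_type = ((unsigned)buf[17] << 8) | buf[16];

    switch (e_type) {
    case 1:  return "relocatable object (ET_REL)";
    case 2:  return "executable (ET_EXEC)";
    case 3:  return "shared object or PIE executable (ET_DYN)";
    case 4:  return "core dump (ET_CORE)";
    default: return "unknown ELF type";
    }
}
```

Fed the first bytes of source.o, this reports a relocatable object. The Linux ELF loader only accepts ET_EXEC and ET_DYN files, which is why execve rejects the object file with "Exec format error".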
See also elf(5), execve(2), ld-linux(8), Linux assembly howto, syscalls(2), Advanced Linux Programming, Operating Systems: Three Easy Pieces, and (to understand about libc.so) Drepper's How To Write Shared Libraries, the Dragon Book ...
(you need to read entire books to understand the details; I gave some references)
Look also into Common Lisp & SBCL. Its compiler has a very different model (really different from C).
You don't have a bootstrap. You are in a chicken-and-egg problem.
The code (for that function) is there, but it makes assumptions; first and foremost, you need a stack. Depending on the architecture, your return address may be on that stack, for example. The return value may be on that stack. The C language itself doesn't provide for that directly; there is always at least a little bit of assembly or some other language required in order to "bootstrap" your function. For example, on ARM with the GNU tools:
bs.s
.globl _start
_start:
mov sp,#0x8000
bl main
b .
so.c
int main ( void )
{
return(0);
}
For ARM, the function is complete; the instructions don't need to be modified by the linker. But there is no address space defined: either one is specified, or the disassembler assumes zero as the address for this object. It is an object, not a loadable binary.
00000000 <main>:
0: e3a00000 mov r0, #0
4: e12fff1e bx lr
Now, if we add the bootstrap and link to some address, we get a real, executable program:
00008000 <_start>:
8000: e3a0d902 mov sp, #32768 ; 0x8000
8004: eb000000 bl 800c <main>
8008: eafffffe b 8008 <_start+0x8>
0000800c <main>:
800c: e3a00000 mov r0, #0
8010: e12fff1e bx lr
It doesn't mean one couldn't craft an operating system or an environment where you could load functions in this way, using the compiler's object output. But that is the reason for the word chain in toolchain. The compiler makes assembly language, the assembler assembles it, and the linker combines the result with the other necessary objects (bootstrap plus compiler libraries plus C libraries, etc.), defines the address spaces for everything, and modifies the code/data as needed to resolve externals. A sequence or chain of events to get the final result.
Even the most basic functions like exit aren't part of the language itself; they live in the C library and need to be linked in.
http://en.cppreference.com/w/c/program/exit

-mimplicit-it compiler flag not recognized

I am attempting to compile a C++ library for a Tegra TK1. The library links to TBB, which I pulled using the package manager. During compilation I got the following error
/tmp/cc4iLbKz.s: Assembler messages:
/tmp/cc4iLbKz.s:9541: Error: thumb conditional instruction should be in IT block -- `strexeq r2,r3,[r4]'
A bit of googling and this question led me to try adding -mimplicit-it=thumb to CMAKE_CXX_FLAGS, but the compiler doesn't recognize it.
I am compiling on the Tegra with kernel 3.10.40-grinch-21.3.4, using the gcc 4.8.4 compiler (that's what comes back when I type c++ -v).
I'm not sure what the initial error message means, though I think it has something to do with the TBB linked library rather than the source I'm compiling. The problem with the fix is also mysterious. Can anyone shed some light on this?
-mimplicit-it is an option to the assembler, not to the compiler. Thus, in the absence of specific assembler flags in your makefile (which you probably don't have, given that you don't appear to be using a separate assembler step), you'll need to use the -Wa option to the compiler to pass it through, i.e. -Wa,-mimplicit-it=thumb.
The source of the issue is almost certainly some inline assembly (possibly from a static inline in a header file, if you're really only linking pre-built libraries) that contains conditionally executed instructions; I'm going to guess it's something like a cmpxchg implementation. Your toolchain could well be configured to compile to the Thumb instruction set by default, and Thumb requires a preceding it (If-Then) instruction to set up conditional instructions. So another alternative might be to just compile with -marm (and/or remove -mthumb if appropriate) and sidestep the issue by not using Thumb at all.
Adding the compiler option:
-Wa,-mimplicit-it=thumb
should solve the problem.

How can I see the assembly code that is generated by a gcc (any flavor) compiler for a C/C++ program?

I am trying to optimize a lot of multiplications and pointer arithmetic and would like to see what the compiler does underneath when I put in optimization flags.
--Edit--
How to restrict it to a specific function or a code block?
--Edit_2--
How to let gcc generate a less verbose assembly-code?
Add the -S switch to your command line.
Edit: Do not forget that it will write the assembly to the file you specified with the -o switch.
How to restrict it to a specific function or a code block?
Put that function in a separate source file (and use a different command-line parameter for that one source file).
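As a sketch of that layout, the function of interest gets its own file and is compiled with -S on its own. The file name kernel.c and the function scale_sum are made up for illustration; the body just stands in for the multiplication and pointer arithmetic mentioned in the question.

```c
/* kernel.c (hypothetical file name): the one function whose assembly
 * is of interest, kept in its own translation unit.
 *
 * Emit assembly only, without assembling or linking:
 *     gcc -O2 -S -fverbose-asm kernel.c -o kernel.s
 */
#include <stddef.h>

/* Multiply each element by a constant and accumulate the result. */
long scale_sum(const int *v, size_t n, int factor)
{
    long total = 0;
    for (const int *p = v; p != v + n; ++p)
        total += (long)*p * factor;
    return total;
}
```

Running the same command with -O0 and then with -O2 and diffing the two kernel.s files shows what the optimizer changed for just this function.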
You could also run that program in a debugger like gdb and use a disassembly view. In gdb you could use the command disass/m to view the assembly mixed with the C code at the current location.
You could stop your program at a breakpoint in the Visual Studio debugger and do "show assembly" and even step through it one instruction at a time.