I start with a binary executable and I want to see the source code, not just the assembly code. Is this possible? The documentation at "https://sourceware.org/gdb/onlinedocs/gdb/Machine-Code.html" suggests that gdb can show source lines alongside the disassembly.
If it is possible, why is the source code not showing? I have set no breakpoints, and the binary is not stripped. I used the gdb command "disas /s main". A screenshot follows, starting with some information about my configuration.
┌──(root㉿kali)-[/home/kali/Downloads]
└─# uname -a
Linux kali 5.15.0-kali3-amd64 #1 SMP Debian 5.15.15-2kali1 (2022-01-31) x86_64 GNU/Linux
┌──(root㉿kali)-[/home/kali/Downloads]
└─# gdb -v
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
┌──(root㉿kali)-[/home/kali/Downloads]
└─# file RE1_64bit
RE1_64bit: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.24, BuildID[sha1]=8616e4f2a4a3c325c2a1f32b8ebb8366694f7a03, not stripped
┌──(root㉿kali)-[/home/kali/Downloads]
└─# gdb RE1_64bit
GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from RE1_64bit...
(No debugging symbols found in RE1_64bit)
(gdb) disas /s main
Dump of assembler code for function main:
0x000000000040084e <+0>: push %rbp
0x000000000040084f <+1>: mov %rsp,%rbp
0x0000000000400852 <+4>: sub $0x40,%rsp
0x0000000000400856 <+8>: mov %edi,-0x34(%rbp)
0x0000000000400859 <+11>: mov %rsi,-0x40(%rbp)
0x000000000040085d <+15>: cmpl $0x2,-0x34(%rbp)
0x0000000000400861 <+19>: je 0x400886 <main+56>
0x0000000000400863 <+21>: mov -0x40(%rbp),%rax
0x0000000000400867 <+25>: mov (%rax),%rax
0x000000000040086a <+28>: mov %rax,%rsi
0x000000000040086d <+31>: mov $0x4009a2,%edi
0x0000000000400872 <+36>: mov $0x0,%eax
0x0000000000400877 <+41>: call 0x400580 <printf@plt>
0x000000000040087c <+46>: mov $0x1,%edi
0x0000000000400881 <+51>: call 0x4005e0 <exit@plt>
0x0000000000400886 <+56>: mov -0x40(%rbp),%rax
0x000000000040088a <+60>: add $0x8,%rax
0x000000000040088e <+64>: mov (%rax),%rax
0x0000000000400891 <+67>: mov %rax,%rdi
0x0000000000400894 <+70>: call 0x400570 <strlen@plt>
0x0000000000400899 <+75>: cmp $0x4,%rax
0x000000000040089d <+79>: je 0x4008c2 <main+116>
0x000000000040089f <+81>: mov -0x40(%rbp),%rax
0x00000000004008a3 <+85>: mov (%rax),%rax
0x00000000004008a6 <+88>: mov %rax,%rsi
0x00000000004008a9 <+91>: mov $0x4009a2,%edi
0x00000000004008ae <+96>: mov $0x0,%eax
0x00000000004008b3 <+101>: call 0x400580 <printf@plt>
0x00000000004008b8 <+106>: mov $0x1,%edi
0x00000000004008bd <+111>: call 0x4005e0 <exit@plt>
0x00000000004008c2 <+116>: movl $0x0,-0x4(%rbp)
0x00000000004008c9 <+123>: mov $0x4009b3,%edi
0x00000000004008ce <+128>: mov $0x0,%eax
0x00000000004008d3 <+133>: call 0x400580 <printf@plt>
0x00000000004008d8 <+138>: lea -0x30(%rbp),%rax
0x00000000004008dc <+142>: mov %rax,%rdi
0x00000000004008df <+145>: call 0x4005d0 <gets@plt>
0x00000000004008e4 <+150>: cmpl $0x0,-0x4(%rbp)
0x00000000004008e8 <+154>: je 0x4008f8 <main+170>
0x00000000004008ea <+156>: mov -0x40(%rbp),%rax
0x00000000004008ee <+160>: mov %rax,%rdi
0x00000000004008f1 <+163>: call 0x4006dd <fg>
0x00000000004008f6 <+168>: jmp 0x400902 <main+180>
0x00000000004008f8 <+170>: mov $0x4009cd,%edi
0x00000000004008fd <+175>: call 0x400560 <puts@plt>
0x0000000000400902 <+180>: mov $0x0,%eax
0x0000000000400907 <+185>: leave
0x0000000000400908 <+186>: ret
End of assembler dump.
As has been said in the comments, this line:
(No debugging symbols found in RE1_64bit)
indicates that the binary does not include any debug information, so you're not going to be able to match assembler code to source lines.
If the binary did include debug information then it would only contain a table mapping addresses in the binary to file names and line numbers. You would still need to have the actual source files in order to view the source lines, and, of course, the source files need to be the exact versions that were compiled into that specific binary, otherwise the line numbers in the debug information will not match up correctly.
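To make that second point concrete, here is a toy model in C of what the debug information gives a debugger. The struct, the table contents, the function name and the file name re1.c are all invented for illustration; real DWARF line tables are considerably more elaborate. The point is only that a lookup yields a file name and a line number, never the text of the line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for one row of a debug line table. */
struct line_entry {
    uint64_t    addr;  /* first address covered by this row */
    const char *file;  /* source file name recorded at compile time */
    int         line;  /* line number within that file */
};

/* Rows sorted by address; addresses, file name and line numbers are made up. */
static const struct line_entry table[] = {
    { 0x40084e, "re1.c", 10 },
    { 0x40085d, "re1.c", 12 },
    { 0x400886, "re1.c", 16 },
};

/* Return the last row whose address is <= pc, or NULL if pc precedes the table. */
const struct line_entry *lookup_line(uint64_t pc)
{
    const struct line_entry *best = NULL;
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i].addr <= pc)
            best = &table[i];
    }
    return best;
}

#ifdef DEMO_MAIN
int main(void)
{
    const struct line_entry *e = lookup_line(0x400861);
    assert(e != NULL);
    /* Only a name and a number come back; the source text must be read from the file. */
    printf("%s:%d\n", e->file, e->line);
    return 0;
}
#endif
```

Compile with -DDEMO_MAIN to get a standalone program. A debugger would take the returned name and number and then open the file itself, which is where a missing or mismatched source tree shows up.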
I used gdb to attach to a program and then set a breakpoint in the function engine::monAppendSystemInfo. When the breakpoint was hit, I got a core dump (actually, it was my program that crashed inside engine::monAppendSystemInfo). The problem is intermittent: it has appeared only twice and I cannot reproduce it.
Here is a comparison of the disassembly of engine::monAppendSystemInfo.
The code below is disassembled from the coredump file:
Dump of assembler code for function engine::monAppendSystemInfo(bson::BSONObjBuilder&, unsigned int):
0x00000000011188f1 <+0>: push %rbp
0x00000000011188f2 <+1>: mov %rsp,%rbp
0x00000000011188f5 <+4>: push %r12
0x00000000011188f7 <+6>: push %rbx
0x00000000011188f8 <+7>: sub $0xb20,%rsp
0x00000000011188ff <+14>: mov %rdi,-0xa98(%rbp)
0x0000000001118906 <+21>: mov %esi,-0xa9c(%rbp)
0x000000000111890c <+27>: int3 // strange point
=> 0x000000000111890d <+28>: mov 0x28,%rax // crash for accessing 0x28
0x0000000001118915 <+36>: mov %rax,-0x18(%rbp)
The code below is disassembled in a normal gdb session, where the program can continue to run:
Dump of assembler code for function engine::monAppendSystemInfo(bson::BSONObjBuilder&, unsigned int):
0x00000000011188f1 <+0>: push %rbp
0x00000000011188f2 <+1>: mov %rsp,%rbp
0x00000000011188f5 <+4>: push %r12
0x00000000011188f7 <+6>: push %rbx
0x00000000011188f8 <+7>: sub $0xb20,%rsp
0x00000000011188ff <+14>: mov %rdi,-0xa98(%rbp)
0x0000000001118906 <+21>: mov %esi,-0xa9c(%rbp)
=> 0x000000000111890c <+27>: mov %fs:0x28,%rax // "%fs:0x28" was changed to "0x28" in above
0x0000000001118915 <+36>: mov %rax,-0x18(%rbp)
My Linux is Ubuntu 16.04.4 LTS, and the environment is as below:
root@lyysdbserver1:~# g++ --version
g++ (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
Copyright (C) 2015 Free Software Foundation, Inc.
root@lyysdbserver1:~# gdb --version
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
Why was "%fs:0x28" changed to "0x28"? Is this a gdb bug?
Presumably you're looking at a core dump of your program taken while GDB had a breakpoint set. int 3 is how GDB sets a breakpoint: it overwrites the first byte of the instruction with the one-byte opcode 0xcc. When you resume, the int 3 byte should be replaced by the original byte.
Look how the machine code compares for these two instruction sequences:
mov %fs:0x28,%rax; 64 48 8b 04 25 28 00 00 00
int 3; mov 0x28,%rax; cc 48 8b 04 25 28 00 00 00
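The byte swap can be replayed in a few lines of C. This is only a sketch of the idea, not GDB's actual code; the function names are made up. It overwrites the first byte of the nine-byte instruction above (0x64, the FS segment-override prefix) with 0xcc and later restores it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define INT3 0xCC  /* one-byte x86 breakpoint opcode */

/* Overwrite the first byte of an instruction with int3; return the saved byte. */
unsigned char insert_breakpoint(unsigned char *insn)
{
    unsigned char saved = insn[0];
    insn[0] = INT3;
    return saved;
}

/* Put the original byte back, as a debugger does before resuming. */
void remove_breakpoint(unsigned char *insn, unsigned char saved)
{
    insn[0] = saved;
}

#ifdef DEMO_MAIN
int main(void)
{
    /* mov %fs:0x28,%rax */
    unsigned char code[9] = { 0x64, 0x48, 0x8b, 0x04, 0x25, 0x28, 0x00, 0x00, 0x00 };
    /* int3 followed by mov 0x28,%rax */
    const unsigned char patched[9] = { 0xcc, 0x48, 0x8b, 0x04, 0x25, 0x28, 0x00, 0x00, 0x00 };

    unsigned char saved = insert_breakpoint(code);
    assert(saved == 0x64);                            /* the FS prefix was displaced */
    assert(memcmp(code, patched, sizeof code) == 0);  /* matches the core dump bytes */

    for (size_t i = 0; i < sizeof code; i++)
        printf("%02x ", code[i]);
    putchar('\n');

    remove_breakpoint(code, saved);
    assert(code[0] == 0x64);
    return 0;
}
#endif
```

With the prefix byte gone, the remaining eight bytes decode as a load from absolute address 0x28, which is what the disassembly of the core dump shows.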
It has been a while since I started working with SSE/AVX intrinsic functions. I recently began writing a header for matrix transposition. I used a lot of if constexpr branches so that the compiler always selects the optimal instruction set depending on some template parameters. Now I wanted to check if everything works as expected by looking into the local disassembly with objdump. When using Clang, I get a clear output which basically contains only the assembly instructions corresponding to the utilized intrinsic functions. However, if I use GCC, the disassembly is quite bloated with extra instructions. A quick check on Godbolt shows me that those extra instructions in the GCC disassembly shouldn't be there.
Here is a small example:
#include <x86intrin.h>
#include <array>
std::array<__m256, 1> Test(std::array<__m256, 1> a)
{
std::array<__m256, 1> b;
b[0] = _mm256_unpacklo_ps(a[0], a[0]);
return b;
}
I compile with -march=native -Wall -Wextra -Wpedantic -pthread -O3 -DNDEBUG -std=gnu++1z. Then I use objdump -S -Mintel libassembly.a > libassembly.dump on the object file. For Clang (6.0.0), the result is:
In archive libassembly.a:
libAssembly.cpp.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z4TestSt5arrayIDv8_fLm1EE>:
0: c4 e3 7d 04 c0 50 vpermilps ymm0,ymm0,0x50
6: c3 ret
which is the same as Godbolt returns: Godbolt - Clang 6.0.0
For GCC (7.4) the output is
In archive libassembly.a:
libAssembly.cpp.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z4TestSt5arrayIDv8_fLm1EE>:
0: 4c 8d 54 24 08 lea r10,[rsp+0x8]
5: 48 83 e4 e0 and rsp,0xffffffffffffffe0
9: c5 fc 14 c0 vunpcklps ymm0,ymm0,ymm0
d: 41 ff 72 f8 push QWORD PTR [r10-0x8]
11: 55 push rbp
12: 48 89 e5 mov rbp,rsp
15: 41 52 push r10
17: 48 83 ec 28 sub rsp,0x28
1b: 64 48 8b 04 25 28 00 mov rax,QWORD PTR fs:0x28
22: 00 00
24: 48 89 45 e8 mov QWORD PTR [rbp-0x18],rax
28: 31 c0 xor eax,eax
2a: 48 8b 45 e8 mov rax,QWORD PTR [rbp-0x18]
2e: 64 48 33 04 25 28 00 xor rax,QWORD PTR fs:0x28
35: 00 00
37: 75 0c jne 45 <_Z4TestSt5arrayIDv8_fLm1EE+0x45>
39: 48 83 c4 28 add rsp,0x28
3d: 41 5a pop r10
3f: 5d pop rbp
40: 49 8d 62 f8 lea rsp,[r10-0x8]
44: c3 ret
45: c5 f8 77 vzeroupper
48: e8 00 00 00 00 call 4d <_Z4TestSt5arrayIDv8_fLm1EE+0x4d>
As you can see, there are a lot of additional instructions. In contrast to that, Godbolt does not include all these extra instructions: Godbolt - GCC 7.4
So what is going on here? I have just started learning assembly, so maybe it is totally clear to someone with assembly experience, but I am a little bit confused why GCC creates those extra instructions on my machine.
Greetings and thank you in advance.
EDIT
To avoid further confusion, I just compiled using:
gcc-7 -I/usr/local/include -O3 -march=native -Wall -Wextra -Wpedantic -pthread -std=gnu++1z -o test.o -c /<PathToFolder>/libAssembly.cpp
Output remains the same. I am not sure if this is relevant, but it generates the warning:
warning: ignoring attributes on template argument ‘__m256 {aka __vector(8) float}’ [-Wignored-attributes]
Usually I suppress this warning and it shouldn't be an issue:
Implication of GCC warning: ignoring attributes on template argument (-Wignored-attributes)
Processor is Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
Here is the gcc -v:
gcc-7 -v
Using built-in specs.
COLLECT_GCC=gcc-7
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/7/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 7.4.0-1ubuntu1~18.04.1' --with-bugurl=file:///usr/share/doc/gcc-7/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++ --prefix=/usr --with-gcc-major-version-only --program-suffix=-7 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --enable-default-pie --with-system-zlib --with-target-system-zlib --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 7.4.0 (Ubuntu 7.4.0-1ubuntu1~18.04.1)
Use -fno-stack-protector
Your local GCC defaults to -fstack-protector-strong but Godbolt's GCC install doesn't.
mov rax,QWORD PTR fs:0x28 is the telltale clue: thread-local storage at fs:40 (aka fs:0x28) is where GCC keeps its stack-cookie constant. The call after the ret is call __stack_chk_fail, but you disassembled a .o without using objdump -dr to show relocations, so the placeholder zero displacement makes it look like a call to a target inside this function.
Since you have arrays (or a class containing an array), stack-protector-strong kicks in even though their sizes are compile-time constants. So you get the code to store the stack cookie, then check it and branch on stack overflow. (Even the array of size 1 in this MCVE is enough to trigger that.)
Making arrays on the stack with 32-byte alignment (for __m256) requires aligning the stack to 32 bytes, and your GCC is older than GCC8, so you get the ridiculously clunky stack-alignment code that builds a full copy of the stack frame including a return address (see "Generated assembly for extended alignment of stack variables"). To be clear, GCC8 still does align the stack here, just wasting fewer instructions on it.
This is pretty much a missed optimization; gcc never actually spills or reloads to those arrays so it could have just optimized them away, along with the stack alignment, like it did without stack-protector.
More recent GCC is better at optimizing away stack alignment after optimizing away the memory for aligned locals in more cases, but this has been a persistent missed optimization in AVX code. Fortunately the cost is pretty negligible in a function that loops, as long as small helper functions inline.
Compiling on Godbolt with -fstack-protector-strong reproduces your output. Newer GCC, including current trunk pre-10, still has both missed optimizations, but stack alignment costs fewer instructions because it just uses RBP as a frame pointer and aligns RSP, then references locals relative to aligned RSP. It still checks the stack cookie (with no instructions between storing it and checking it).
On your desktop, compiling with -fno-stack-protector should make good asm.
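If you want to see the effect in isolation, a plain C function with a local array is enough. This is a minimal sketch; the file name canary_demo.c and the function name are hypothetical, and the exact output depends on your GCC version and distro defaults. The idea is to generate assembly twice and look for the fs:0x28 load and the __stack_chk_fail call in one listing but not the other.

```c
/*
 * Compare the two listings (canary_demo.c is a hypothetical file name):
 *   gcc -O3 -fstack-protector-strong -S canary_demo.c -o with.s
 *   gcc -O3 -fno-stack-protector     -S canary_demo.c -o without.s
 * With the protector enabled you would expect a %fs:0x28 load and a call
 * to __stack_chk_fail in with.s, and neither in without.s.
 */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The local array is what makes -fstack-protector-strong instrument this function. */
size_t bounded_length(const char *s)
{
    char buf[16];
    strncpy(buf, s, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';  /* strncpy does not terminate on truncation */
    return strlen(buf);
}

#ifdef DEMO_MAIN
int main(void)
{
    assert(bounded_length("hello") == 5);
    printf("%zu\n", bounded_length("a string longer than the buffer"));
    return 0;
}
#endif
```

Running objdump -dr on the resulting object file instead of plain -d should also show the relocation against __stack_chk_fail, rather than a call that appears to land inside the function.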
In the file file1.c, there is a call to a function that is implemented in the file file2.c.
When I link file1.o and file2.o into an executable, if the function in file2 is very small, will the linker automatically detect that the function is small and inline its call?
In addition to the support for Link Time Code Generation (LTCG) that James McNellis mentioned, the GCC toolchain also supports link time optimization. Starting with version 4.5, GCC supports the -flto switch which enables Link Time Optimization (LTO), a form of whole program optimization that lets it inline functions from separate object files (and perform whatever other optimizations a compiler might be able to make if it were compiling all the object files as if they were from a single C source file).
Here's a simple example:
test.c:
void print_int(int x);
int main(){
print_int(1);
print_int(42);
print_int(-1);
return 0;
}
print_int.c:
#include <stdio.h>
void print_int( int x)
{
printf( "the int is %d\n", x);
}
First compile them using GCC4.5.x - examples from GCC docs use -O2, but to get visible results in my simple test, I had to use -O3:
C:\temp>gcc --version
gcc (GCC) 4.5.2
# compile with preparation for LTO
C:\temp>gcc -c -O3 -flto test.c
C:\temp>gcc -c -O3 -flto print_int.c
# link without LTO
C:\temp>gcc -o test-nolto.exe print_int.o test.o
To get the effect of LTO you're supposed to use the optimization options even at the link stage - the linker actually invokes the compiler to compile pieces of intermediate code that the compiler put into the object file in the first steps above. If you don't pass the optimization option at this stage as well, the compiler won't perform the inlining that you'd be looking for.
# link using LTO
C:\temp>gcc -o test-lto.exe -flto -O3 print_int.o test.o
Disassembly of the version without link time optimization. Note that the calls are made to the print_int() function:
C:\temp>gdb test-nolto.exe
GNU gdb (GDB) 7.2
(gdb) start
Temporary breakpoint 1 at 0x401373
Starting program: C:\temp/test-nolto.exe
[New Thread 3324.0xdc0]
Temporary breakpoint 1, 0x00401373 in main ()
(gdb) disassem
Dump of assembler code for function main:
0x00401370 <+0>: push %ebp
0x00401371 <+1>: mov %esp,%ebp
=> 0x00401373 <+3>: and $0xfffffff0,%esp
0x00401376 <+6>: sub $0x10,%esp
0x00401379 <+9>: call 0x4018ca <__main>
0x0040137e <+14>: movl $0x1,(%esp)
0x00401385 <+21>: call 0x401350 <print_int>
0x0040138a <+26>: movl $0x2a,(%esp)
0x00401391 <+33>: call 0x401350 <print_int>
0x00401396 <+38>: movl $0xffffffff,(%esp)
0x0040139d <+45>: call 0x401350 <print_int>
0x004013a2 <+50>: xor %eax,%eax
0x004013a4 <+52>: leave
0x004013a5 <+53>: ret
Disassembly of the version with link time optimization. Note that the calls to printf() are made directly:
C:\temp>gdb test-lto.exe
GNU gdb (GDB) 7.2
(gdb) start
Temporary breakpoint 1 at 0x401373
Starting program: C:\temp/test-lto.exe
[New Thread 1768.0x126c]
Temporary breakpoint 1, 0x00401373 in main ()
(gdb) disassem
Dump of assembler code for function main:
0x00401370 <+0>: push %ebp
0x00401371 <+1>: mov %esp,%ebp
=> 0x00401373 <+3>: and $0xfffffff0,%esp
0x00401376 <+6>: sub $0x10,%esp
0x00401379 <+9>: call 0x4018da <__main>
0x0040137e <+14>: movl $0x1,0x4(%esp)
0x00401386 <+22>: movl $0x403064,(%esp)
0x0040138d <+29>: call 0x401acc <printf>
0x00401392 <+34>: movl $0x2a,0x4(%esp)
0x0040139a <+42>: movl $0x403064,(%esp)
0x004013a1 <+49>: call 0x401acc <printf>
0x004013a6 <+54>: movl $0xffffffff,0x4(%esp)
0x004013ae <+62>: movl $0x403064,(%esp)
0x004013b5 <+69>: call 0x401acc <printf>
0x004013ba <+74>: xor %eax,%eax
0x004013bc <+76>: leave
0x004013bd <+77>: ret
End of assembler dump.
And here's the same experiment with MSVC (first with LTCG):
C:\temp>cl -c /GL /Zi /Ox test.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
test.c
C:\temp>cl -c /GL /Zi /Ox print_int.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
print_int.c
C:\temp>link /LTCG test.obj print_int.obj /out:test-ltcg.exe /debug
Microsoft (R) Incremental Linker Version 10.00.40219.01
Copyright (C) Microsoft Corporation. All rights reserved.
Generating code
Finished generating code
C:\temp>"\Program Files (x86)\Debugging Tools for Windows (x86)"\cdb test-ltcg.exe
Microsoft (R) Windows Debugger Version 6.12.0002.633 X86
Copyright (c) Microsoft Corporation. All rights reserved.
CommandLine: test-ltcg.exe
// ...
0:000> u main
*** WARNING: Unable to verify checksum for test-ltcg.exe
test_ltcg!main:
00cd1c20 6a01 push 1
00cd1c22 68d05dcd00 push offset test_ltcg!__decimal_point_length+0x10 (00cd5dd0)
00cd1c27 e8e3f3feff call test_ltcg!printf (00cc100f)
00cd1c2c 6a2a push 2Ah
00cd1c2e 68d05dcd00 push offset test_ltcg!__decimal_point_length+0x10 (00cd5dd0)
00cd1c33 e8d7f3feff call test_ltcg!printf (00cc100f)
00cd1c38 6aff push 0FFFFFFFFh
00cd1c3a 68d05dcd00 push offset test_ltcg!__decimal_point_length+0x10 (00cd5dd0)
00cd1c3f e8cbf3feff call test_ltcg!printf (00cc100f)
00cd1c44 83c418 add esp,18h
00cd1c47 33c0 xor eax,eax
00cd1c49 c3 ret
0:000>
Now without LTCG. Note that with MSVC you have to compile the .c file without the /GL to prevent the linker from performing LTCG - otherwise the linker detects that /GL was specified, and it'll force the /LTCG option (hey, that's what you said you wanted the first time around with /GL):
C:\temp>cl -c /Zi /Ox test.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
test.c
C:\temp>cl -c /Zi /Ox print_int.c
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
print_int.c
C:\temp>link test.obj print_int.obj /out:test-noltcg.exe /debug
Microsoft (R) Incremental Linker Version 10.00.40219.01
Copyright (C) Microsoft Corporation. All rights reserved.
C:\temp>"\Program Files (x86)\Debugging Tools for Windows (x86)"\cdb test-noltcg.exe
Microsoft (R) Windows Debugger Version 6.12.0002.633 X86
Copyright (c) Microsoft Corporation. All rights reserved.
CommandLine: test-noltcg.exe
// ...
0:000> u main
test_noltcg!main:
00c41020 6a01 push 1
00c41022 e8e3ffffff call test_noltcg!ILT+5(_print_int) (00c4100a)
00c41027 6a2a push 2Ah
00c41029 e8dcffffff call test_noltcg!ILT+5(_print_int) (00c4100a)
00c4102e 6aff push 0FFFFFFFFh
00c41030 e8d5ffffff call test_noltcg!ILT+5(_print_int) (00c4100a)
00c41035 83c40c add esp,0Ch
00c41038 33c0 xor eax,eax
00c4103a c3 ret
0:000>
One thing that Microsoft's linker supports in LTCG that is not supported by GCC (as far as I know) is Profile Guided Optimization (PGO). That technology allows Microsoft's linker to optimize based on profiling data gathered from previous runs of the program. This allows the linker to do things such as gather 'hot' functions onto the same memory pages and seldom used code sequences onto other memory pages to reduce the working set of a program.
Edit (28 Aug 2011): GCC supports profile guided optimization using such options as -fprofile-generate and -fprofile-use, but I'm completely uninformed about them.
Thanks to Konrad Rudolph for pointing this out to me.
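For anyone who wants to try the GCC options mentioned in the edit, the usual workflow is build, run, rebuild. The sketch below is only an outline under that assumption: the file name pgo_demo.c, the function and the input data are invented, and I have not measured what GCC does differently with the profile.

```c
/*
 * Hypothetical three-step workflow (pgo_demo.c is a made-up file name):
 *   gcc -O2 -fprofile-generate -DDEMO_MAIN pgo_demo.c -o pgo_demo
 *   ./pgo_demo        (the instrumented run writes .gcda profile data on exit)
 *   gcc -O2 -fprofile-use -DDEMO_MAIN pgo_demo.c -o pgo_demo
 */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Count negative entries; the branch is rarely taken in the training run below. */
size_t count_negative(const int *v, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (v[i] < 0)
            count++;
    }
    return count;
}

#ifdef DEMO_MAIN
int main(void)
{
    static int data[100000];
    for (size_t i = 0; i < 100000; i++)
        data[i] = (i % 1000 == 0) ? -1 : (int)i;  /* 100 negatives out of 100000 */

    size_t negatives = count_negative(data, 100000);
    assert(negatives == 100);
    printf("%zu\n", negatives);
    return 0;
}
#endif
```

The second compile reads the recorded branch and call counts, so the training run should resemble real usage for the profile to be meaningful.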
I am seeing some strange behavior when attempting to use libdfp in a C++ program. Specifically, it appears as if GCC is always rounding to 8 decimal places, even when I use the 64- and 128-bit decimal types.
To test this, I created an incredibly simple test program:
std::decimal::decimal64 testval = 0.044575289999999997DD;
printf("Decimal float test: expected=0.044575289999999997, actual=%.16Da\n", testval);
Which outputs:
Decimal float test: expected=0.044575289999999997, actual=0.04457529000000000
I am fairly certain that this is not a printing problem in libdfp as I was able to trace the source and found that the number is already rounded by the first line of the printf handler. Additionally, the printf handler will also round, however I have verified that this code is not being called.
For reference, I am building libdfp with:
./configure --with-backend=libdecnumber --enable-decimal-float=bid && make
I suspect the problem to either be in the underlying decimal float representation (BID, in my case) or the raw types being provided by GCC. It almost looks as if everything is being rounded to the size of a 32-bit decimal float. My host arch is x86_64 so this should all be supported natively. Furthermore, GCC does have the corresponding _Decimal[32|64|128] types and <decimal/decimal> can be found on the system. I am building on Fedora 25 for a native x86_64 CPU (Intel Xeon). AFAIK, this processor does not have native decimal float support so everything is being rendered in software.
The only clue I have is that GCC does not list the --enable-decimal-float build option in the configuration summary:
$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/6.3.1/lto-wrapper
Target: x86_64-redhat-linux
Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,objc,obj-c++,fortran,ada,go,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --disable-libgcj --with-isl --enable-libmpx --enable-gnu-indirect-function --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux
Thread model: posix
gcc version 6.3.1 20161221 (Red Hat 6.3.1-1) (GCC)
$ g++ --version
g++ (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
That being said, my compiler does define the _Decimal[32|64|128] types and provides operator<< overloads for them. I wouldn't expect these to be available at all if such support were not enabled. I certainly shouldn't be able to compile a program using them and get almost valid output back.
Finally, I could see this being a problem with libdecnumber but I am just about at the limit of my current knowledge as to who manages assignment to these types.
Has anybody seen this issue before? Failing that, has anybody built and successfully used libdfp on a similar setup? Which piece of software controls rounding for the internal representation (as opposed to rounding for display)?
EDIT
I finally managed to get a disassembly. It appears that the full value is being loaded into rdx and the decimal64 constructor called. The value 0x2fafd619589efa00 works out to 3436200445056317952 in decimal, which I suspect is 0.044575289999999997 represented in BID format. I am not sure how GCC knows to use BID vs DPD as this is specified at libdfp build time, however I will leave that particular mystery for another post.
If my understanding is correct, it seems to imply that GCC is doing the rounding. Is there anything that can be done about this other than rebuilding the compiler (which I would really like to avoid)? I know IEEE-754 provides mechanisms to 'tune' the behavior of fp operations (including rounding mode), does GCC expose any of this to the user?
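For what it is worth, the constant can be decoded by hand. The sketch below assumes the standard IEEE 754-2008 BID layout for decimal64 (one sign bit, a 10-bit exponent with bias 398, and an integer coefficient); the struct and function names are made up, and infinities and NaNs are not handled.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical container for the three fields of a finite decimal64. */
struct bid64_parts {
    int      negative;     /* sign bit */
    int      exponent;     /* unbiased power of ten */
    uint64_t coefficient;  /* integer significand */
};

/* Decode a finite BID-encoded decimal64 (assumed layout, see the text above). */
struct bid64_parts bid64_decode(uint64_t bits)
{
    struct bid64_parts p;
    p.negative = (int)(bits >> 63);
    if (((bits >> 61) & 0x3) != 0x3) {
        /* common form: 10 exponent bits followed by a 53-bit coefficient */
        p.exponent    = (int)((bits >> 53) & 0x3ff) - 398;
        p.coefficient = bits & ((UINT64_C(1) << 53) - 1);
    } else {
        /* large-coefficient form: implicit leading bits 100, then 51 explicit bits */
        p.exponent    = (int)((bits >> 51) & 0x3ff) - 398;
        p.coefficient = (UINT64_C(1) << 53) | (bits & ((UINT64_C(1) << 51) - 1));
    }
    return p;
}

#ifdef DEMO_MAIN
int main(void)
{
    struct bid64_parts p = bid64_decode(UINT64_C(0x2fafd619589efa00));
    assert(p.negative == 0);
    /* prints the coefficient and the power of ten, e.g. <coefficient>e<exponent> */
    printf("%" PRIu64 "e%d\n", p.coefficient, p.exponent);
    return 0;
}
#endif
```

For 0x2fafd619589efa00 this gives a coefficient of 4457529000000000 and an exponent of -17, that is 0.04457529000000000 held in 16 significant digits. Sixteen digits is the full precision of decimal64, while the literal in the test has 17 significant digits, so the constant in the binary appears to be the literal rounded to the nearest representable decimal64 rather than to a 32-bit decimal.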
Disassembly
│0x401180 <main(int, char**)> push rbp │
│0x401181 <main(int, char**)+1> mov rbp,rsp │
│0x401184 <main(int, char**)+4> sub rsp,0x30 │
│0x401188 <main(int, char**)+8> mov DWORD PTR [rbp-0x14],edi │
│0x40118b <main(int, char**)+11> mov QWORD PTR [rbp-0x20],rsi │
b+ │0x40118f <main(int, char**)+15> movabs rdx,0x2fafd619589efa00 │
│0x401199 <main(int, char**)+25> lea rax,[rbp-0x10] │
│0x40119d <main(int, char**)+29> mov QWORD PTR [rbp-0x28],rdx │
│0x4011a1 <main(int, char**)+33> movq xmm0,QWORD PTR [rbp-0x28] │
│0x4011a6 <main(int, char**)+38> mov rdi,rax │
│0x4011a9 <main(int, char**)+41> call 0x4012e6 <std::decimal::decimal64::decimal64(decimal64)> │
│0x4011ae <main(int, char**)+46> mov rax,QWORD PTR [rbp-0x10] │
│0x4011b2 <main(int, char**)+50> mov QWORD PTR [rbp-0x28],rax │
│0x4011b6 <main(int, char**)+54> movq xmm0,QWORD PTR [rbp-0x28] │
│0x4011bb <main(int, char**)+59> mov edi,0x4018e8 │
│0x4011c0 <main(int, char**)+64> mov eax,0x1 │
│0x4011c5 <main(int, char**)+69> call 0x400a30 <printf@plt> │
│0x4011ca <main(int, char**)+74> mov eax,0x0 │
│0x4011cf <main(int, char**)+79> leave │
│0x4011d0 <main(int, char**)+80> ret │
│0x4011d1 <__static_initialization_and_destruction_0(int, int)> push rbp │
│0x4011d2 <__static_initialization_and_destruction_0(int, int)+1> mov rbp,rsp │
│0x4011d5 <__static_initialization_and_destruction_0(int, int)+4> sub rsp,0x10 │
│0x4011d9 <__static_initialization_and_destruction_0(int, int)+8> mov DWORD PTR [rbp-0x4],edi │
│0x4011dc <__static_initialization_and_destruction_0(int, int)+11> mov DWORD PTR [rbp-0x8],esi │
│0x4011df <__static_initialization_and_destruction_0(int, int)+14> cmp DWORD PTR [rbp-0x4],0x1 │
│0x4011e3 <__static_initialization_and_destruction_0(int, int)+18> jne 0x40120c <__static_initialization_and_destruction_0(int, int)+59> │
│0x4011e5 <__static_initialization_and_destruction_0(int, int)+20> cmp DWORD PTR [rbp-0x8],0xffff │
│0x4011ec <__static_initialization_and_destruction_0(int, int)+27> jne 0x40120c <__static_initialization_and_destruction_0(int, int)+59> │
│0x4011ee <__static_initialization_and_destruction_0(int, int)+29> mov edi,0x60309d │
│0x4011f3 <__static_initialization_and_destruction_0(int, int)+34> call 0x400ae0 <_ZNSt8ios_base4InitC1Ev@plt> │
│0x4011f8 <__static_initialization_and_destruction_0(int, int)+39> mov edx,0x4018d8 │
│0x4011fd <__static_initialization_and_destruction_0(int, int)+44> mov esi,0x60309d │
│0x401202 <__static_initialization_and_destruction_0(int, int)+49> mov edi,0x400aa0 │
│0x401207 <__static_initialization_and_destruction_0(int, int)+54> call 0x400ac0 <__cxa_atexit@plt> │
Complete Test Source
#include <float.h>
#include <decimal/decimal>
#include <math.h>
#include <fenv.h>
#include <stdlib.h>
#include <wchar.h>
#include <cstdlib>
int main (int argc, char *argv[])
{
std::decimal::decimal64 testval = 0.044575289999999997DD;
printf("Decimal float test: expected=0.044575289999999997, actual=%.16Da\n", testval);
return EXIT_SUCCESS;
}
Having trouble stepping into string.h in GDB 7.5. Here's a simple example program:
Source code:
#include <stdio.h>
#include <string.h>
int main() {
char str1[20];
strcpy(str1, "STEP INTO ME\n");
printf(str1);
}
Compiled: ~$ gcc -g foo.c
Invoked: ~$ gdb -q ./a.out
GDB:
(gdb) break 5
Breakpoint 1 at 0x8048471: file foo.c, line 6.
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
(gdb) run
Starting program: /home/user/a.out
Breakpoint 1, main () at foo.c:6
6 strcpy(str_a, "Hello, world!\n");
(gdb) step
7 printf(str_a);
Shouldn't I be in the string library at this point? Instead it continues to the printf().
EDIT:
Scott's suggestion "worked", but not in the expected manner.
Breakpoint 1, main () at foo.c:6
6 strcpy(str_a, "Hello, world!\n");
(gdb) i r $eip
eip 0x80484a1 0x80484a1 <main+21>
(gdb) step
Breakpoint 2, __strcpy_ssse3 () at ../sysdeps/i386/i686/multiarch/strcpy-ssse3.S:78
78 ../sysdeps/i386/i686/multiarch/strcpy-ssse3.S: No such file or directory.
(gdb) i r $eip
eip 0xb7e9c820 0xb7e9c820 <__strcpy_ssse3>
I am surprised by the path reported for line 78; I expected something like /lib/.../cmov/libc.so.6. I am also surprised by the claim that there is no such file or directory.
Recompile your code with gcc -fno-builtin -g foo.c and the gdb step command will work. (See -fno-builtin documentation). Otherwise small strcpy(), memcpy() calls would often be translated into open coded data movement instructions, e.g. on x86-64:
4 int main() {
0x000000000040052c <+0>: push %rbp
0x000000000040052d <+1>: mov %rsp,%rbp
0x0000000000400530 <+4>: sub $0x20,%rsp
5 char str1[20];
6 strcpy(str1, "STEP INTO ME\n");
0x0000000000400534 <+8>: lea -0x20(%rbp),%rax
0x0000000000400538 <+12>: movl $0x50455453,(%rax)
0x000000000040053e <+18>: movl $0x544e4920,0x4(%rax)
0x0000000000400545 <+25>: movl $0x454d204f,0x8(%rax)
0x000000000040054c <+32>: movw $0xa,0xc(%rax)
7 printf(str1);
0x0000000000400552 <+38>: lea -0x20(%rbp),%rax
0x0000000000400556 <+42>: mov %rax,%rdi
0x0000000000400559 <+45>: mov $0x0,%eax
0x000000000040055e <+50>: callq 0x400410 <printf@plt>
8 }
0x0000000000400563 <+55>: leaveq
0x0000000000400564 <+56>: retq
You can see the strcpy() call being compiled into multiple MOV instructions.
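The immediates in those MOV instructions are just the bytes of the string literal, packed in little-endian order. As a rough illustration (the helper names are invented), replaying the four stores in C reproduces the original string:

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Store the low `size` bytes of `imm` at dst in little-endian order, as x86 does. */
static void store_le(unsigned char *dst, uint32_t imm, size_t size)
{
    for (size_t i = 0; i < size; i++)
        dst[i] = (unsigned char)(imm >> (8 * i));
}

/* Replay the four stores from the disassembly into buf (at least 14 bytes). */
void replay_stores(char *buf)
{
    unsigned char *p = (unsigned char *)buf;
    store_le(p + 0x0, 0x50455453, 4);  /* movl $0x50455453,(%rax)    -> "STEP" */
    store_le(p + 0x4, 0x544e4920, 4);  /* movl $0x544e4920,0x4(%rax) -> " INT" */
    store_le(p + 0x8, 0x454d204f, 4);  /* movl $0x454d204f,0x8(%rax) -> "O ME" */
    store_le(p + 0xc, 0x0000000a, 2);  /* movw $0xa,0xc(%rax)        -> "\n\0" */
}

#ifdef DEMO_MAIN
int main(void)
{
    char str1[20];
    replay_stores(str1);
    assert(strcmp(str1, "STEP INTO ME\n") == 0);
    fputs(str1, stdout);
    return 0;
}
#endif
```

Since no call instruction is emitted at all in that build, there is nothing for the step command to step into.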
gcc -fno-builtin compiles the same program into:
4 int main() {
0x000000000040057c <+0>: push %rbp
0x000000000040057d <+1>: mov %rsp,%rbp
0x0000000000400580 <+4>: sub $0x20,%rsp
5 char str1[20];
6 strcpy(str1, "STEP INTO ME\n");
0x0000000000400584 <+8>: lea -0x20(%rbp),%rax
0x0000000000400588 <+12>: mov $0x400660,%esi
0x000000000040058d <+17>: mov %rax,%rdi
0x0000000000400590 <+20>: callq 0x400450 <strcpy@plt>
7 printf(str1);
0x0000000000400595 <+25>: lea -0x20(%rbp),%rax
0x0000000000400599 <+29>: mov %rax,%rdi
0x000000000040059c <+32>: mov $0x0,%eax
0x00000000004005a1 <+37>: callq 0x400460 <printf@plt>
8 }
0x00000000004005a6 <+42>: leaveq
0x00000000004005a7 <+43>: retq
and you can see the call to <strcpy@plt>.
Assuming you wanted to step into strcpy() to study its implementation, you'd want to have debug info for libc.so installed. Unfortunately the way to get debug info differs between Linux distros. On Fedora it's as simple as debuginfo-install glibc. It takes more steps on Ubuntu and Debian. This RPM DPKG Rosetta Stone page has links to instructions for Fedora, Ubuntu and Debian (search for debuginfo).
Since you're on Ubuntu 12.10 and actually want to see the strcpy() assembly source code:
$ sudo apt-get install libc6-dbg
$ sudo apt-get source libc6-dev
$ gdb ./a.out
(gdb) directory eglibc-2.15/sysdeps
Source directories searched: /home/scottt/eglibc-2.15/sysdeps:$cdir:$cwd
(gdb) break strcpy
Breakpoint 1 at 0x400450
(gdb) run
Starting program: /home/scottt/a.out
Breakpoint 1, __strcpy_sse2 () at ../sysdeps/x86_64/multiarch/../strcpy.S:32
32 movq %rsi, %rcx /* Source register. */
You tried to set a breakpoint on a function defined in the string library, which is usually part of the standard C library, libc.so.
And as gdb informs you:
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
the library is not loaded yet.
But the real problem is, even when the library is loaded, if the library i.e. libc.so does not have debug symbols in it, you would not be able to step through the code within the library using gdb.
You could enable verbose mode to see which symbols gdb is able to load:
(gdb) b main
Breakpoint 1 at 0x400914: file test.cpp, line 7.
(gdb) set verbose on
(gdb) run
Starting program: /home/agururaghave/.scratch/gdb-test/test
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from system-supplied DSO at 0x7ffff7ffb000...(no debugging symbols found)...done.
Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols found)...done.
Registering libstdc++-v6 pretty-printer for /usr/lib64/libstdc++.so.6 ...
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Breakpoint 1, main () at test.cpp:7
7 bool result = myObj1 < myObj2;
This line for example tells you whether it was able to get the symbols for libc.so:
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
You could then figure out where the debug symbols are picked up from using show debug-file-directory:
(gdb) show debug-file-directory
The directory where separate debug symbols are searched for is "/usr/lib/debug".
As you see /usr/lib/debug here does not contain the full .so with debug symbols. Instead it only has the debug info without any .text or .data sections of the actual libc.so which the program uses for execution.
The solution to install the debug info for libraries would be distro specific.
I think the package is called libc6-dbg on Debian-based distros. On my openSUSE machine, it seems to be called glibc-debuginfo.
BTW, +1 on scottt's suggestion of using -fno-builtin so that gcc does not use its built-in methods for functions like strcpy and other standard ones defined as part of C standard.
You probably don't have any symbols for your C library. Try stepi, but be prepared to see only assembly instructions.