Consider the following code:
#include <stdio.h>
void __attribute__ ((constructor)) a_constructor()
{
printf("%s\n", __func__);
}
void __attribute__ ((constructor)) b_constructor()
{
printf("%s\n", __func__);
}
int main()
{
printf("%s\n",__func__);
}
I compile the above code with gcc -ggdb prog2.c -o prog2, and it runs as expected:
a_constructor
b_constructor
main
But when I look at its disassembly using objdump -d prog2 > f, there is neither a call to __do_global_ctors_aux in _init or anywhere else, nor a definition of __do_global_ctors_aux. So how do the constructors get called? Where is the definition of __do_global_ctors_aux? Is this some optimization?
I also tried compiling it with optimization explicitly disabled: gcc -ggdb -O0 prog2.c -o prog2. Please clarify.
The compilation is being done on a 32-bit Linux machine.
EDIT
My output from gdb bt is:
Breakpoint 1, a_constructor () at prog2.c:5
5 printf("%s\n", __func__);
(gdb) bt
#0 a_constructor () at prog2.c:5
#1 0x080484b2 in __libc_csu_init ()
#2 0xb7e31a1a in __libc_start_main (main=0x8048445 <main>, argc=1, argv=0xbffff014, init=0x8048460 <__libc_csu_init>,
fini=0x80484d0 <__libc_csu_fini>, rtld_fini=0xb7fed180 <_dl_fini>, stack_end=0xbffff00c) at libc-start.c:246
#3 0x08048341 in _start ()
So, how do the constructors get called?
If you look at the disassembly produced with gcc -g -O0 -S -fverbose-asm prog2.c -o prog2.s, there is the following:
.text
.Ltext0:
.globl a_constructor
.type a_constructor, @function
a_constructor:
.LFB0:
.file 1 "test.c"
.loc 1 4 0
.cfi_startproc
pushq %rbp #
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp #,
.cfi_def_cfa_register 6
.loc 1 5 0
movl $__func__.2199, %edi #,
call puts #
.loc 1 6 0
popq %rbp #
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size a_constructor, .-a_constructor
.section .init_array,"aw"
.align 8
.quad a_constructor
In the above, function a_constructor is put into .text section. And a pointer to the function is also appended to .init_array section. Before calling main glibc iterates over this array and invokes all constructor functions found there.
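The loop that walks the array can be modelled in a few lines of C. This is only a simplified sketch: it uses a hand-built array (fake_init_array, a hypothetical name) instead of the real .init_array section, and the helper names are made up for illustration. The three-argument signature reflects that glibc passes argc, argv and envp to each entry, which ordinary constructors simply ignore.

```c
#include <stddef.h>
#include <stdio.h>

/* Signature glibc uses when calling .init_array entries. */
typedef void (*init_fn)(int, char **, char **);

static int calls_seen;  /* counts invocations, for illustration only */

static void first_ctor(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    calls_seen++;
    puts("first_ctor");
}

static void second_ctor(int argc, char **argv, char **envp)
{
    (void)argc; (void)argv; (void)envp;
    calls_seen++;
    puts("second_ctor");
}

/* Hand-built stand-in for the .init_array section (hypothetical). */
static init_fn fake_init_array[] = { first_ctor, second_ctor };

/* Walk the half-open range [start, end) and call each entry in order. */
static size_t run_init_array(init_fn *start, init_fn *end,
                             int argc, char **argv, char **envp)
{
    size_t n = (size_t)(end - start);
    for (size_t i = 0; i < n; i++)
        start[i](argc, argv, envp);
    return n;
}

/* Runs the model once and reports how many entries were called. */
int demo_run_fake_init_array(void)
{
    size_t count = sizeof fake_init_array / sizeof fake_init_array[0];
    calls_seen = 0;
    run_init_array(fake_init_array, fake_init_array + count, 0, NULL, NULL);
    return calls_seen;
}
```

Calling demo_run_fake_init_array() from main invokes both entries in array order, which is the same order in which the real startup code would run the pointers the compiler placed in .init_array.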
The details are implementation-specific and you don't mention your implementation.
A perfectly valid strategy used by some implementations is to create a run-time library that contains the real entry point for your program. That real entry point first calls all constructors, and then calls main. If your program is dynamically linked and the code behind that real entry point resides in a shared library (like, say, libc), then clearly disassembling your program cannot possibly show you where the constructor gets called.
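One way to convince yourself that the constructors run before main, without disassembling anything, is to have them record their own execution. The sketch below assumes GCC or Clang; the constructor priorities (101, 102) are a GCC extension that the question does not use, added here only to make the order deterministic, and all names are hypothetical.

```c
#include <stdio.h>

static int ctor_calls;      /* incremented by each constructor */
static char ctor_order[3];  /* records the order in which they ran */

/* Lower priority numbers run first; 0..100 are reserved. */
static void __attribute__((constructor(101))) early_ctor(void)
{
    ctor_order[ctor_calls++] = 'a';
}

static void __attribute__((constructor(102))) late_ctor(void)
{
    ctor_order[ctor_calls++] = 'b';
}

/* Number of constructors that have run so far. */
int constructors_already_ran(void)
{
    return ctor_calls;
}

/* NUL-terminated record of the order, e.g. "ab". */
const char *constructor_order(void)
{
    return ctor_order;
}
```

If main calls constructors_already_ran() as its very first statement, the result is already 2: whatever the real entry point is on your implementation, it has invoked both functions by then.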
A simple approach for figuring where precisely the call is coming from is by loading your program in a debugger, setting a breakpoint on one of the constructors, and asking for the call stack when the breakpoint is hit. For example, on Cygwin:
$ gdb ./test
GNU gdb (GDB) 7.8
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i686-pc-cygwin".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
.
Find the GDB manual and other documentation resources online at:
.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./test...done.
(gdb) b a_constructor
Breakpoint 1 at 0x4011c6: file test.cc, line 5.
(gdb) run
Starting program: /home/Harald van Dijk/test
[New Thread 4440.0x1734]
[New Thread 4440.0xa8c]
b_constructor
Breakpoint 1, a_constructor () at test.cc:5
5 printf("%s\n", __func__);
(gdb) bt
#0 a_constructor () at test.cc:5
#1 0x61006986 in __main () from /usr/bin/cygwin1.dll
#2 0x004011f6 in main () at test.cc:14
(gdb)
This shows that on Cygwin, a variant of the strategy I mentioned is used: the real entry point is the main function, but the compiler inserts a call to a Cygwin-specific __main function right at the start, and it's that __main function that searches for all constructors and calls them directly.
(Incidentally, clearly this breaks if main is called recursively: the constructors would run a second time. This is why C++ does not allow main to be called recursively. C does allow it, but then, standard C doesn't have constructor functions.)
And you can get a hint of how that __main function searches for them, by not disassembling the executable program, but asking the compiler for the generated assembly:
$ gcc -S test.c -o -
I won't copy the whole assembly listing here, but it shows that on this particular implementation, pointers to constructor functions get emitted in a .ctors section, so it would be easy for a __main function to simply call all functions in that section, without the compiler having to enumerate each such function one by one.
Ftrace supports dynamic tracing: it can trace any global function in the kernel and its modules. It relies on gcc's -pg option to add a stub at the beginning of each function, so that the function can be redirected to tracing code when needed. gcc 4.6 added -pg -mfentry support, which inserts a call to __fentry__ at the very beginning of the function, like:
[root@localhost kernel-4.4.27]# echo 'void foo(){}' | gcc -x c -S -o - - -pg -mfentry
foo:
.LFB0:
.cfi_startproc
call __fentry__
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
What I need is similar to the -pg option, but I want to reserve 5 extra bytes at both the beginning and the end of each function that I can patch for my own use. I need some advice on how to modify clang or LLVM to do this (gcc may be too hard for me to understand and modify).
I have a C++ library with a C API, and I have set the -fvisibility=hidden compiler flag,
and then I have set __attribute__ ((visibility ("default"))) on C API methods.
However, I still see visible C++ symbols. When I create a debian package for my library,
I get the following symbols file
Why are these symbols still visible ?
You should run your symbols file through c++filt which converts the "mangled" symbol names to what is readable [in the c++ sense].
If you do, you'll find that two thirds of the symbols are std::whatever, and not your symbols. So, they are being pulled in because of the STL. You may not be able to control them.
The other symbols are grk_*, if that helps.
There are object file utilities (e.g. readelf, objdump, objcopy, etc) that may allow you to edit/patch your object files.
Or, you might be able to use a linker script.
Or, you could compile with -S to get a .s file. You could then write a [perl/python] script to modify the asm source and add/change whatever attribute(s) you need to change the visibility. Then, just do: c++ -c modified.s
For a given symbol (e.g.):
int __attribute__((visibility("hidden")))
main(void)
{
return 0;
}
The asm file is:
.file "main.c"
.text
.globl main
.hidden main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
movl $0, %eax
popq %rbp
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (GNU) 8.3.1 20190223 (Red Hat 8.3.1-2)"
.section .note.GNU-stack,"",@progbits
Notice the asm directive:
.hidden main
Even without such a directive, it should be easy to write a script to add one [after the corresponding .globl]
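For symbols whose source you control, the same effect can be had without post-processing the assembly. The following is a minimal sketch, assuming GCC or Clang on an ELF target; the macro and function names (DEMO_API, DEMO_LOCAL, demo_helper, demo_double_plus_one) are hypothetical and not taken from the library in the question.

```c
/* Assumed build line, for illustration only:
 *   gcc -shared -fPIC -fvisibility=hidden demo.c -o libdemo.so
 */
#define DEMO_API   __attribute__((visibility("default")))
#define DEMO_LOCAL __attribute__((visibility("hidden")))

/* Internal helper: emitted with a .hidden directive, so it is not
 * exported from the shared object. */
DEMO_LOCAL int demo_helper(int x)
{
    return x * 2;
}

/* Public entry point: stays exported even under -fvisibility=hidden. */
DEMO_API int demo_double_plus_one(int x)
{
    return demo_helper(x) + 1;
}
```

This only helps for declarations you write yourself. Symbols that come from instantiating standard library templates are a different matter, which is why the assembly or linker-script route may still be needed for them.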
Having trouble stepping into string.h in GDB 7.5. Here's a simple example program:
Source code:
#include <stdio.h>
#include <string.h>
int main() {
char str1[20];
strcpy(str1, "STEP INTO ME\n");
printf(str1);
}
Compiled: ~$ gcc -g foo.c
Invoked: ~$ gdb -q ./a.out
GDB:
(gdb) break 5
Breakpoint 1 at 0x8048471: file foo.c, line 6.
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
(gdb) run
Starting program: /home/user/a.out
Breakpoint 1, main () at foo.c:6
6 strcpy(str_a, "Hello, world!\n");
(gdb) step
7 printf(str_a);
Shouldn't I be in the string library at this point? Instead it continues to the printf().
EDIT:
Scott's suggestion "worked", but not in the expected manner.
Breakpoint 1, main () at foo.c:6
6 strcpy(str_a, "Hello, world!\n");
(gdb) i r $eip
eip 0x80484a1 0x80484a1 <main+21>
(gdb) step
Breakpoint 2, __strcpy_ssse3 () at ../sysdeps/i386/i686/multiarch/strcpy-ssse3.S:78
78 ../sysdeps/i386/i686/multiarch/strcpy-ssse3.S: No such file or directory.
(gdb) i r $eip
eip 0xb7e9c820 0xb7e9c820 <__strcpy_ssse3>
I am surprised at the path shown for line 78; I expected something like /lib/.../cmov/libc.so.6. I am also surprised by the claim that there is no such file or directory.
Recompile your code with gcc -fno-builtin -g foo.c and the gdb step command will work. (See -fno-builtin documentation). Otherwise small strcpy(), memcpy() calls would often be translated into open coded data movement instructions, e.g. on x86-64:
4 int main() {
0x000000000040052c <+0>: push %rbp
0x000000000040052d <+1>: mov %rsp,%rbp
0x0000000000400530 <+4>: sub $0x20,%rsp
5 char str1[20];
6 strcpy(str1, "STEP INTO ME\n");
0x0000000000400534 <+8>: lea -0x20(%rbp),%rax
0x0000000000400538 <+12>: movl $0x50455453,(%rax)
0x000000000040053e <+18>: movl $0x544e4920,0x4(%rax)
0x0000000000400545 <+25>: movl $0x454d204f,0x8(%rax)
0x000000000040054c <+32>: movw $0xa,0xc(%rax)
7 printf(str1);
0x0000000000400552 <+38>: lea -0x20(%rbp),%rax
0x0000000000400556 <+42>: mov %rax,%rdi
0x0000000000400559 <+45>: mov $0x0,%eax
0x000000000040055e <+50>: callq 0x400410 <printf@plt>
8 }
0x0000000000400563 <+55>: leaveq
0x0000000000400564 <+56>: retq
You can see the strcpy() call being compiled into multiple MOV instructions.
gcc -fno-builtin compiles the same program into:
4 int main() {
0x000000000040057c <+0>: push %rbp
0x000000000040057d <+1>: mov %rsp,%rbp
0x0000000000400580 <+4>: sub $0x20,%rsp
5 char str1[20];
6 strcpy(str1, "STEP INTO ME\n");
0x0000000000400584 <+8>: lea -0x20(%rbp),%rax
0x0000000000400588 <+12>: mov $0x400660,%esi
0x000000000040058d <+17>: mov %rax,%rdi
0x0000000000400590 <+20>: callq 0x400450 <strcpy@plt>
7 printf(str1);
0x0000000000400595 <+25>: lea -0x20(%rbp),%rax
0x0000000000400599 <+29>: mov %rax,%rdi
0x000000000040059c <+32>: mov $0x0,%eax
0x00000000004005a1 <+37>: callq 0x400460 <printf@plt>
8 }
0x00000000004005a6 <+42>: leaveq
0x00000000004005a7 <+43>: retq
and you can see the call to <strcpy@plt>.
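If you cannot change the compiler flags, a source-level alternative (not part of the suggestion above, and offered only as a sketch) is to call strcpy through a volatile function pointer. The compiler then has to load the pointer at run time and make an indirect call, so it should not be able to replace the call with inline moves. The function name copy_with_real_call is hypothetical.

```c
#include <string.h>

/* Copies src into dst by way of the library's strcpy.
 * The volatile qualifier forces the pointer to be read at run time,
 * which keeps the compiler from expanding the call as a builtin. */
char *copy_with_real_call(char *dst, const char *src)
{
    char *(*volatile real_strcpy)(char *, const char *) = strcpy;
    return real_strcpy(dst, src);
}
```

With this in place, a breakpoint set on strcpy should be hit when copy_with_real_call runs, although stepping through the library source still requires the debug info discussed in this answer.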
Assuming you wanted to step into strcpy() to study its implementation, you'd want to have debug info for libc.so installed. Unfortunately the way to get debug info differs between Linux distros. On Fedora it's as simple as debuginfo-install glibc. It takes more steps on Ubuntu and Debian. This RPM DPKG Rosetta Stone page has links to instructions for Fedora, Ubuntu and Debian (search for debuginfo).
Since you're on Ubuntu 12.10 and actually want to see the strcpy() assembly source code:
$ sudo apt-get install libc6-dbg
$ sudo apt-get source libc6-dev
$ gdb ./a.out
(gdb) directory eglibc-2.15/sysdeps
Source directories searched: /home/scottt/eglibc-2.15/sysdeps:$cdir:$cwd
(gdb) break strcpy
Breakpoint 1 at 0x400450
(gdb) run
Starting program: /home/scottt/a.out
Breakpoint 1, __strcpy_sse2 () at ../sysdeps/x86_64/multiarch/../strcpy.S:32
32 movq %rsi, %rcx /* Source register. */
You tried to set a breakpoint on a function defined in the string library, which is usually part of the standard C library, libc.so.
And as gdb informs you:
(gdb) break strcpy
Function "strcpy" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (strcpy) pending.
the library is not loaded yet.
But the real problem is, even when the library is loaded, if the library i.e. libc.so does not have debug symbols in it, you would not be able to step through the code within the library using gdb.
You could enable verbose mode to see which symbols, gdb is able to load:
(gdb) b main
Breakpoint 1 at 0x400914: file test.cpp, line 7.
(gdb) set verbose on
(gdb) run
Starting program: /home/agururaghave/.scratch/gdb-test/test
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from system-supplied DSO at 0x7ffff7ffb000...(no debugging symbols found)...done.
Reading symbols from /usr/lib64/libstdc++.so.6...(no debugging symbols found)...done.
Registering libstdc++-v6 pretty-printer for /usr/lib64/libstdc++.so.6 ...
Loaded symbols for /usr/lib64/libstdc++.so.6
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Breakpoint 1, main () at test.cpp:7
7 bool result = myObj1 < myObj2;
This line for example tells you whether it was able to get the symbols for libc.so:
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
You could then figure out where the debug symbols are picked up from using show debug-file-directory:
(gdb) show debug-file-directory
The directory where separate debug symbols are searched for is "/usr/lib/debug".
As you see /usr/lib/debug here does not contain the full .so with debug symbols. Instead it only has the debug info without any .text or .data sections of the actual libc.so which the program uses for execution.
The solution to install the debug info for libraries would be distro specific.
I think the package is called libc6-dbg on the debian based distros. On my openSUSE machine, it seems to be called glibc-debuginfo
BTW, +1 on scottt's suggestion of using -fno-builtin so that gcc does not use its built-in methods for functions like strcpy and other standard ones defined as part of C standard.
You probably don't have any symbols for your C library. Try stepi, but be prepared to see only assembly instructions.
I have a program in assembly (NASM) and I want to obtain an address, but something strange happened while I was debugging with GDB: I typed "next" and the program exited. Is there some bug in GDB?
test.asm
BITS 32
section .text
global _start
_start:
call function
mov eax,0x41414141
function:
; esi gets the address of "mov eax,0x41414141"
pop esi
; Exit
xor eax,eax
xor ebx,ebx
mov al,0x01
int 0x80
Debugging
$ nasm -f elf test.asm
$ ld test.o -o test
$ gdb -q ./test
Reading symbols from /root/Desktop/test...(no debugging symbols found)...done.
(gdb) info functions
All defined functions:
Non-debugging symbols:
0x08048060 _start
0x0804806a function
(gdb) b function
Breakpoint 1 at 0x804806a
(gdb) run # Execute _start
Starting program: /root/Desktop/test
Breakpoint 1, 0x0804806a in function ()
(gdb) # We're going to execute "pop esi" now
(gdb) next # Execute only 1 instruction
Single stepping until exit from function function,
which has no line number information.
[Inferior 1 (process 26492) exited normally]
# WHY EXIT? We was going to execute "pop esi" !!
You used "next", which tells gdb to do a source-level step (move to the next line of source). Because you did not build your executable with debug information, gdb has no line information for the function, so it single-steps until the function returns. Here the function never returns; it makes the exit system call, so the process simply exits.
There are two solutions:
Build with debug info enabled. I do not know nasm well, but it appears to use the usual -g switch to enable debug info. Add it when assembling.
Use nexti in gdb. This executes just the next assembly instruction and does not care about source lines.
I am working with embedded Linux, cross-compiling the code on an Ubuntu box to run on a COMX-p2020 module. I'm assuming a compiler setting is either missing or incorrect, which is causing the illegal instruction. Here are my compiler flags.
CPPFLAGS = -MD -MP -w -g $(DEFINES) $(INCLUDES) -pthread -mcpu=powerpc
And here is the output from gdb.
Program received signal SIGILL, Illegal instruction.
0x0ff1d3f4 in std::ios_base::Init::Init() () from /usr/lib/libstdc++.so.6
(gdb) bt
#0 0x0ff1d3f4 in std::ios_base::Init::Init() () from /usr/lib/libstdc++.so.6
#1 0x100d3074 in __static_initialization_and_destruction_0 (__initialize_p=1,__priority=65535) at /opt/Freescale/CodeWarrior_PA_10.0/Cross_Tools/freescale-4.4/bin/../lib/gcc/powerpc-linux-gnu/4.4.1/../../../../powerpc-linux-gnu/include/c++/4.4.1/iostream:72
#2 0x100d30d0 in global constructors keyed to outDmxData() () at ../Luminaire/Mac Source/VArtnetManager.cpp:769
#3 0x100def88 in __do_global_ctors_aux ()
#4 0x10001a58 in _init ()
#5 0x100deed8 in __libc_csu_init ()
#6 0x0fc1d684 in generic_start_main () from /lib/libc.so.6
#7 0x0fc1d8b0 in __libc_start_main () from /lib/libc.so.6
#8 0x00000000 in ?? ()
It is never reaching main, looks like it is trying to allocate a global and chokes during initialization. Here is the global in question.
unsigned char outDmxData[kNumDmxBuses][513];
I then started to strip code down to make sure I could get it to run. I can compile and successfully run a simple hello world with same compiler settings with no problem. I then started slowly adding objects back in till I ran into this.
Program received signal SIGILL, Illegal instruction.
0x0ff6b680 in std::string::assign(std::string const&) ()
from /usr/lib/libstdc++.so.6
(gdb) bt
#0 0x0ff6b680 in std::string::assign(std::string const&) () from /usr/lib/libstdc++.so.6
#1 0x0ff6b6e4 in std::string::operator=(std::string const&) () from /usr/lib/libstdc++.so.6
#2 0x10008014 in VxQueue::InitQueue (this=0x10052038) at ../Common/SystemObjects/VxQueue.cpp:114
#3 0x10007a6c in VxQueue::VxQueue (this=0x10052038,queueName=0x1001d580 "DestoryedObjects") at ../Common/SystemObjects/VxQueue.cpp:40
#4 0x10004aa4 in VxMessageManager::CreateQueueContext (this=0x10052008,queueName=0x1001d580 "DestoryedObjects") at ../Common/SystemObjects/VxMessageManager.cpp:209
#5 0x10004750 in VxMessageManager::VxMessageManager (this=0x10052008) at ../Common/SystemObjects/VxMessageManager.cpp:187
#6 0x10003fa0 in VxMessageManager::CreateSharedMessageManager () at ../Common/SystemObjects/VxMessageManager.cpp:36
#7 0x10001714 in main () at ../Luminaire_gcc/main.cpp:68
The line in question looks like this.
// set default queue name
char queueName[32];
snprintf(queueName, sizeof(queueName), "queue_%04d", m_queueId);
m_queueName = std::string(queueName); // <- error in question
Edit
Here is the disassembly for std::ios_base::Init::Init(). It looks like .long is the instruction it is having problems with. Will post the std::string in a few.
0x0ff1d3dc <+76>: stw r28,48(r1)
0x0ff1d3e0 <+80>: lwz r24,-32768(r30)
0x0ff1d3e4 <+84>: stw r31,60(r1)
0x0ff1d3e8 <+88>: cmpwi cr7,r24,0
0x0ff1d3ec <+92>: beq- cr7,0xff1d870 <_ZNSt8ios_base4InitC1Ev+1248>
0x0ff1d3f0 <+96>: lwz r27,-32764(r30)
=> 0x0ff1d3f4 <+100>: .long 0x7c2004ac
0x0ff1d3f8 <+104>: lwarx r28,0,r27
0x0ff1d3fc <+108>: addi r9,r28,1
0x0ff1d400 <+112>: stwcx. r9,0,r27
0x0ff1d404 <+116>: bne- 0xff1d3f8 <_ZNSt8ios_base4InitC1Ev+104>
The std::string issue appears to look the same. I'm guessing .long 0x7c2004ac means it doesn't know what the instruction is?
0x0ff6b528 <+72>: lwz r0,-32760(r30)
0x0ff6b52c <+76>: cmpwi cr7,r0,0
0x0ff6b530 <+80>: beq- cr7,0xff6b564 <_ZNSsD1Ev+132>
0x0ff6b534 <+84>: addi r10,r3,8
=> 0x0ff6b538 <+88>: .long 0x7c2004ac
0x0ff6b53c <+92>: lwarx r9,0,r10
0x0ff6b540 <+96>: addi r11,r9,-1
0x0ff6b544 <+100>: stwcx. r11,0,r10
0x0ff6b548 <+104>: bne- 0xff6b53c <_ZNSsD1Ev+92>
Edit
Sorry for the length; this is more for my benefit, to document this as I go. It looks like 0x7c2004ac translates to PPC_INST_LWSYNC, which led me to this article, http://gcc.gnu.org/ml/gcc-patches/2006-11/msg01238.html, that sounds like my exact problem (lwsync doesn't work on e500 processors). The next problem is that I am strapped for time and the toolchain I'm using was packaged with the dev kit, so I don't know of a way to patch this quickly without figuring out how to build the toolchain from scratch, which I know will not be a quick task, at least for me. I guess I can contact the vendor, but they have not been responsive in the past and it's usually up to me to fix their problems.
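The raw word that gdb printed as .long can be decoded by hand to confirm what it is. The sketch below assumes the standard PowerPC X-form layout (primary opcode in the top 6 bits, 10-bit extended opcode above the lowest bit) and that sync is opcode 31 with extended opcode 598, where the L field selects the lightweight variant. The function names are hypothetical.

```c
#include <stdint.h>

/* Primary opcode: the top 6 bits of a 32-bit PowerPC instruction word. */
static unsigned ppc_primary_opcode(uint32_t insn)
{
    return insn >> 26;
}

/* Extended opcode of X-form instructions: 10 bits above the low bit. */
static unsigned ppc_extended_opcode(uint32_t insn)
{
    return (insn >> 1) & 0x3ffu;
}

/* L field of sync: 0 selects the heavyweight sync, 1 selects lwsync. */
static unsigned ppc_sync_l_field(uint32_t insn)
{
    return (insn >> 21) & 0x3u;
}

/* True for any form of sync (opcode 31, extended opcode 598). */
int ppc_is_sync_family(uint32_t insn)
{
    return ppc_primary_opcode(insn) == 31u &&
           ppc_extended_opcode(insn) == 598u;
}

/* True only for the lightweight form, i.e. sync with L = 1. */
int ppc_is_lwsync(uint32_t insn)
{
    return ppc_is_sync_family(insn) && ppc_sync_l_field(insn) == 1u;
}
```

For the word in the disassembly, ppc_is_lwsync(0x7c2004ac) is true, while the plain sync encoding 0x7c0004ac is in the same family but has L = 0.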
Use the gdb command disassemble to see what the instruction is in frame 0, and then find out if that is a legal instruction for your hardware platform.
I finally resolved my issue by switching to a different toolchain. I was using the toolchain that shipped with CodeWarrior 10.0.2, which was causing the issue. I then used crosstool-NG 1.17.0 with the powerpc-e500v2-linux-gnuspe sample to build a new toolchain. I ran into one issue where crosstool-NG failed to build gdb with [ERROR] configure: error: python is missing or unusable. I did have python installed, so I am unsure why I got the error. I'm already cross-compiling gdb myself, so I just disabled it in the menuconfig (Debug Facilities->gdb). I also had to disable -Werror on the kernel with the compiler upgrade.
Also, I am specifying e500mc for the -mcpu option when compiling my binary.
CPPFLAGS = -MD -MP -w -g $(DEFINES) $(INCLUDES) -pthread -mcpu=e500mc
Thanks Jonathan for pointing me in the right direction. Hope this helps someone debug similar issue faster than it took me.