I know there is already a lot of topics about this subject, however I didn't find in any of them the explanation to how it is possible to have one definition of an inline function per translation unit. I know that inline functions have external linkage, so when I call one inline function in different translation units they all reference the same function. I also know that in Linux the compiler registers inline function with the "W" symbol, so it is registered as a weak symbol (thanks to it the first definition encountered by the compiler of the inline function is the one considered). However whenever I use nm on the object files created by g++ in Windows there is no weak symbol attached to the function signature. How can the linker know which definition to consider and not to accuse of multiple definitions? (I don't show the linker's work but I tested it).
Here is some example code:
// main1.cpp
#include "main.hpp"
/* inline accepts one definition in each TU's */
inline void Test::hello_world()
{
std::cout << "Hello world" << std::endl;
}
/*void Test::hello_world()
{
std::cout << "Hello world" << std::endl;
}*/
extern void return_hello_world();
int main()
{
Test t;
t.hello_world();
return_hello_world();
return 0;
}
// main2.cpp
#include "main.hpp"
/* inline accepts one definition in each TU's */
inline void Test::hello_world()
{
std::cout << "Hello world" << std::endl;
}
/* void Test::hello_world()
{
std::cout << "Hello world" << std::endl;
} */
void return_hello_world()
{
Test t;
t.hello_world();
}
// main.hpp
#ifndef TEST
#define TEST
#include <iostream>
class Test
{
public:
void hello_world();
};
#endif
The results I receive are:
Windows:
Experimentation4> g++ -g -O0 -c main1.cpp main2.cpp
Experimentation4> nm main1.o
0000000000000000 b .bss
0000000000000000 d .ctors
0000000000000000 d .data
0000000000000000 N .debug_abbrev
0000000000000000 N .debug_aranges
0000000000000000 N .debug_frame
0000000000000000 N .debug_info
0000000000000000 N .debug_line
0000000000000000 N .debug_ranges
0000000000000000 N .debug_str
0000000000000000 p .pdata
0000000000000000 p .pdata$_ZN4Test11hello_worldEv
0000000000000000 r .rdata
0000000000000000 r .rdata$.refptr._ZSt4cout
0000000000000000 r .rdata$.refptr._ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 r .rdata$zzz
0000000000000000 R .refptr._ZSt4cout
0000000000000000 R .refptr._ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 t .text
0000000000000000 t .text$_ZN4Test11hello_worldEv
0000000000000000 r .xdata
0000000000000000 r .xdata$_ZN4Test11hello_worldEv
U __main
0000000000000029 t __tcf_0
0000000000000080 t _GLOBAL__sub_I_main
U _Z18return_hello_worldv
0000000000000044 t _Z41__static_initialization_and_destruction_0ii
0000000000000000 T _ZN4Test11hello_worldEv
U _ZNSolsEPFRSoS_E
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 r _ZStL19piecewise_construct
0000000000000000 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
U atexit
0000000000000000 T main
Linux:
Experimentation2$ g++ -O0 -g -c main1.cpp main2.cpp
Experimentation2$ nm main1.o
U __cxa_atexit
U __dso_handle
0000000000000081 t _GLOBAL__sub_I_main
0000000000000000 T main
U __stack_chk_fail
U _Z18return_hello_worldv
0000000000000043 t _Z41__static_initialization_and_destruction_0ii
0000000000000000 W _ZN4Test11hello_worldEv
U _ZNSolsEPFRSoS_E
U _ZNSt8ios_base4InitC1Ev
U _ZNSt8ios_base4InitD1Ev
U _ZSt4cout
U _ZSt4endlIcSt11char_traitsIcEERSt13basic_ostreamIT_T0_ES6_
0000000000000000 b _ZStL8__ioinit
U _ZStlsISt11char_traitsIcEERSt13basic_ostreamIcT_ES5_PKc
I am dealing with a pesky compilation/linker error involving C++ homebrew compiled for the Game Boy Advance. While my library, libsaturn, is getting compiled fine with the proper C++ symbols, the compiler seems to be discarding the namespace qualifiers given in the header files for the library and just placing in undefined symbols as if they had C linkage. I have inspected the object code files using nm and confirmed that this is the case.
This is the full repository for the program I am trying to build, although I have provided snippets below. Compiling the repository requires a devkitARM installation and Node.js; I have only ensured things work on Linux sadly.
I originally encountered this problem with devkitARM r46, but unfortunately dialing things back to r45 does not alleviate the issue. I have had the code combed over by several peers of mine and they have found nothing out-of-the-ordinary with my config.
This is the code I am dealing with. I’m afraid it isn’t going to be all that useful considering I have very little idea where the source of the issue lies, but regardless…
src/mainloop.cc
bool saturn::mainloop( )
{
return false;
}
include/saturn/mainloop.hh
namespace saturn
{
bool mainloop( );
}
test/src/main.cc
int main( )
{
while(!saturn::mainloop( ))
{
// Do something here
}
return 0;
}
alex#henen-nesw saturn $ arm-none-eabi-nm -pC /tmp/saturn-buildtool/code/test+src+main.cc.o
00000000 T main
U init
U halt
U mainloop
alex#henen-nesw saturn $ arm-none-eabi-nm -pC libsaturn.a
src+bios.cc.o:
00000000 T bios::halt()
U _sat__bios_halt
00000014 T bios::softReset()
U _sat__bios_soft_reset
00000028 T bios::waitInterrupt(unsigned long, unsigned long)
U _sat__bios_intr_wait
0000004c T bios::waitVblank()
U _sat__bios_vblank_intr_wait
src+bootsector.s.o:
00000000 T __start
000000c0 T _sat__rom_start
000000ec T _sat__irq_handler
U main
src+error.cc.o:
00000000 r kDispcntBgMode0
00000002 r kDispcntBgMode1
00000004 r kDispcntBgMode2
00000006 r kDispcntBgMode3
00000008 r kDispcntBgMode4
0000000a r kDispcntBgMode5
0000000c r kDispcntCgbMode
0000000e r kDispcntFrameSel
00000010 r kDispcntHblankIntv
00000012 r kDispcntObjVramDim
00000014 r kDispcntForceBlank
00000016 r kDispcntShowBg0
00000018 r kDispcntShowBg1
0000001a r kDispcntShowBg2
0000001c r kDispcntShowBg3
0000001e r kDispcntShowObj
00000020 r kDispcntShowWin0
00000022 r kDispcntShowWin1
00000024 r kDispcntShowObjWin
00000028 r ioDispcnt
0000002c r ioGreenswap
00000030 r kDispstatVblank
00000032 r kDispstatHblank
00000034 r kDispstatVcounter
00000036 r kDispstatVblankIrq
00000038 r kDispstatHblankIrq
0000003a r kDispstatVcounterIrq
0000003c r ioDispstat
00000040 r ioVcount
00000044 r kBgcntMosaic
00000046 r kBgcntPalMode4
00000048 r kBgcntPalMode8
0000004a r kBgcntOverflow
0000004c r ioBg0cnt
00000050 r ioBg1cnt
00000054 r ioBg2cnt
00000058 r ioBg3cnt
0000005c r ioBg0hofs
00000060 r ioBg0vofs
00000064 r ioBg1hofs
00000068 r ioBg1vofs
0000006c r ioBg2hofs
00000070 r ioBg2vofs
00000074 r ioBg3hofs
00000078 r ioBg3vofs
0000007c r ioBg2pa
00000080 r ioBg2pb
00000084 r ioBg2pc
00000088 r ioBg2pd
0000008c r ioBg2xL
00000090 r ioBg2xH
00000094 r ioBg2yL
00000098 r ioBg2yH
0000009c r ioBg3pa
000000a0 r ioBg3pb
000000a4 r ioBg3pc
000000a8 r ioBg3pd
000000ac r ioBg3xL
000000b0 r ioBg3xH
000000b4 r ioBg3yL
000000b8 r ioBg3yH
000000bc r ioWin0H
000000c0 r ioWin1H
000000c4 r ioWin0V
000000c8 r ioWin1V
000000cc r ioWinIn
000000d0 r ioWinOut
000000d4 r ioMosaic
000000d8 r ioBldcnt
000000dc r ioBldalpha
000000e0 r ioBldy
000000e4 r segBios
000000e8 r segEwram
000000ec r segIwram
000000f0 r segIo
000000f4 r segPal
00000000 b segPalBg
000000f8 r segPalObj
000000fc r segVram
00000004 b segVramBg
00000100 r segVramObj
00000104 r segOam
00000108 r segRom
0000010c r segSram
00000110 r ioDma0sad
00000114 r ioDma0dad
00000118 r ioDma0cntL
0000011c r ioDma0cntH
00000120 r ioDma1sad
00000124 r ioDma1dad
00000128 r ioDma1cntL
0000012c r ioDma1cntH
00000130 r ioDma2sad
00000134 r ioDma2dad
00000138 r ioDma2cntL
0000013c r ioDma2cntH
00000140 r ioDma3sad
00000144 r ioDma3dad
00000148 r ioDma3cntL
0000014c r ioDma3cntH
000000a0 t __static_initialization_and_destruction_0(int, int)
000000e4 t _GLOBAL__sub_I_error.cc
00000000 T saturn::error(saturn::Error)
src+init.cc.o:
00000000 T saturn::init()
U saturn::error(saturn::Error)
src+lowbios.s.o:
00000000 T _sat__bios_soft_reset
00000004 T _sat__bios_register_ram_reset
00000008 T _sat__bios_halt
0000000c T _sat__bios_stop
00000010 T _sat__bios_intr_wait
00000014 T _sat__bios_vblank_intr_wait
00000018 T _sat__bios_div
0000001c T _sat__bios_div_arm
00000020 T _sat__bios_sqrt
00000024 T _sat__bios_arc_tan
00000028 T _sat__bios_arc_tan2
0000002c T _sat__bios_cpu_set
00000030 T _sat__bios_cpu_fast_set
00000034 T _sat__bios_bg_affine_set
00000038 T _sat__bios_obj_affine_set
0000003c T _sat__bios_bit_unpack
00000040 T _sat__bios_lzss_decomp_wram
00000044 T _sat__bios_lzss_decomp_vram
00000048 T _sat__bios_huff_decomp
0000004c T _sat__bios_rl_decomp_wram
00000050 T _sat__bios_rl_decomp_vram
00000054 T _sat__bios_diff_8bit_unfilter_wram
00000058 T _sat__bios_diff_8bit_unfilter_vram
0000005c T _sat__bios_diff_16bit_unfilter
00000060 T _sat__bios_sound_bias
00000064 T _sat__bios_midi_key_to_freq
00000068 T _sat__bios_multi_boot
src+mainloop.cc.o:
00000000 T saturn::mainloop()
src+math.cc.o:
00000000 T saturn::divide(long, long)
U _sat__bios_div
00000030 T saturn::sqroot(unsigned long)
U _sat__bios_sqrt
00000050 T saturn::modulus(long, long)
src+memory.cc.o:
Thanks!
Here's a simplified build script that links the test program correctly: https://gist.github.com/nnevatie/9fe11e3933ed3f51e5344639c6881bd5
Using Ulrich Drepper's relinfo.pl script, one can easily count the number of relocations of a DSO, but it doesn't work on .o files.
Say I have a large shared library and I'm not happy about the number of its relocations. is there a way to find out where they come from (symbol, or at least .o), to check whether they're of the easily fixable type (e.g.: const char * str = "Hello World";' -> const char str[] = "Hello World";)?
Short answer: Use objdump or readelf instead.
Long answer: Let's look at an actual example case, example.c:
#include <stdio.h>
static const char global1[] = "static const char []";
static const char *global2 = "static const char *";
static const char *const global3 = "static const char *const";
const char global4[] = "const char []";
const char *global5 = "const char *";
const char *const global6 = "const char *const";
char global7[] = "char []";
char *global8 = "char *";
char *const global9 = "char *const";
int main(void)
{
static const char local1[] = "static const char []";
static const char *local2 = "static const char *";
static const char *const local3 = "static const char *const";
const char local4[] = "const char []";
const char *local5 = "const char *";
const char *const local6 = "const char *const";
char local7[] = "char []";
char *local8 = "char *";
char *const local9 = "char *const";
printf("Global:\n");
printf("\t%s\n", global1);
printf("\t%s\n", global2);
printf("\t%s\n", global3);
printf("\t%s\n", global4);
printf("\t%s\n", global5);
printf("\t%s\n", global6);
printf("\t%s\n", global7);
printf("\t%s\n", global8);
printf("\t%s\n", global9);
printf("\n");
printf("Local:\n");
printf("\t%s\n", local1);
printf("\t%s\n", local2);
printf("\t%s\n", local3);
printf("\t%s\n", local4);
printf("\t%s\n", local5);
printf("\t%s\n", local6);
printf("\t%s\n", local7);
printf("\t%s\n", local8);
printf("\t%s\n", local9);
return 0;
}
You can compile it to an object file using e.g.
gcc -W -Wall -c example.c
and to an executable using
gcc -W -Wall example.c -o example
You can use objdump -tr example.o to dump the symbol and relocation information for the (non-dynamic) object file, or objdump -TtRr example to dump the same for the executable file (and dynamic object files). Using
objdump -t example.o
on x86-64 I get
example.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 example.c
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000000 l O .rodata 0000000000000015 global1
0000000000000000 l O .data 0000000000000008 global2
0000000000000048 l O .rodata 0000000000000008 global3
00000000000000c0 l O .rodata 0000000000000015 local1.2053
0000000000000020 l O .data 0000000000000008 local2.2054
00000000000000d8 l O .rodata 0000000000000008 local3.2055
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000050 g O .rodata 000000000000000e global4
0000000000000008 g O .data 0000000000000008 global5
0000000000000080 g O .rodata 0000000000000008 global6
0000000000000010 g O .data 0000000000000008 global7
0000000000000018 g O .data 0000000000000008 global8
00000000000000a0 g O .rodata 0000000000000008 global9
0000000000000000 g F .text 000000000000027a main
0000000000000000 *UND* 0000000000000000 puts
0000000000000000 *UND* 0000000000000000 printf
0000000000000000 *UND* 0000000000000000 putchar
0000000000000000 *UND* 0000000000000000 __stack_chk_fail
The output is described in man 1 objdump, under the -t heading. Note that the second "column" is actually fixed-width: seven characters wide, describing the type of the object. The third column is the section name, *UND* for undefined, .text for code, .rodata for read-only (immutable) data, .data for initialized mutable data, and .bss for uninitialized mutable data, and so on.
We can see from the above symbol table that local4, local5, local6, local7, local8, and local9 variables didn't actually get entries in the symbol table at all. This is because they are local to main(). The contents of the strings they refer to are stored in .data or .rodata (or constructed on the fly), depending on what the compiler sees best.
Let's look at the relocation records next. Using
objdump -r example.o
I get
example.o: file format elf64-x86-64
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000037 R_X86_64_32S .rodata+0x000000000000005e
0000000000000040 R_X86_64_32S .rodata+0x000000000000006b
0000000000000059 R_X86_64_32S .rodata+0x0000000000000088
0000000000000062 R_X86_64_32S .rodata+0x000000000000008f
0000000000000067 R_X86_64_32 .rodata+0x00000000000000a8
000000000000006c R_X86_64_PC32 puts-0x0000000000000004
0000000000000071 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000076 R_X86_64_32 .rodata
0000000000000083 R_X86_64_PC32 printf-0x0000000000000004
000000000000008a R_X86_64_PC32 .data-0x0000000000000004
000000000000008f R_X86_64_32 .rodata+0x00000000000000b0
000000000000009f R_X86_64_PC32 printf-0x0000000000000004
00000000000000a6 R_X86_64_PC32 .rodata+0x0000000000000044
00000000000000ab R_X86_64_32 .rodata+0x00000000000000b0
00000000000000bb R_X86_64_PC32 printf-0x0000000000000004
00000000000000c0 R_X86_64_32 .rodata+0x00000000000000b0
00000000000000c5 R_X86_64_32 global4
00000000000000d2 R_X86_64_PC32 printf-0x0000000000000004
00000000000000d9 R_X86_64_PC32 global5-0x0000000000000004
00000000000000de R_X86_64_32 .rodata+0x00000000000000b0
00000000000000ee R_X86_64_PC32 printf-0x0000000000000004
00000000000000f5 R_X86_64_PC32 global6-0x0000000000000004
00000000000000fa R_X86_64_32 .rodata+0x00000000000000b0
000000000000010a R_X86_64_PC32 printf-0x0000000000000004
000000000000010f R_X86_64_32 .rodata+0x00000000000000b0
0000000000000114 R_X86_64_32 global7
0000000000000121 R_X86_64_PC32 printf-0x0000000000000004
0000000000000128 R_X86_64_PC32 global8-0x0000000000000004
000000000000012d R_X86_64_32 .rodata+0x00000000000000b0
000000000000013d R_X86_64_PC32 printf-0x0000000000000004
0000000000000144 R_X86_64_PC32 global9-0x0000000000000004
0000000000000149 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000159 R_X86_64_PC32 printf-0x0000000000000004
0000000000000163 R_X86_64_PC32 putchar-0x0000000000000004
0000000000000168 R_X86_64_32 .rodata+0x00000000000000b5
000000000000016d R_X86_64_PC32 puts-0x0000000000000004
0000000000000172 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000177 R_X86_64_32 .rodata+0x00000000000000c0
0000000000000184 R_X86_64_PC32 printf-0x0000000000000004
000000000000018b R_X86_64_PC32 .data+0x000000000000001c
0000000000000190 R_X86_64_32 .rodata+0x00000000000000b0
00000000000001a0 R_X86_64_PC32 printf-0x0000000000000004
00000000000001a7 R_X86_64_PC32 .rodata+0x00000000000000d4
00000000000001ac R_X86_64_32 .rodata+0x00000000000000b0
00000000000001bc R_X86_64_PC32 printf-0x0000000000000004
00000000000001c1 R_X86_64_32 .rodata+0x00000000000000b0
00000000000001d6 R_X86_64_PC32 printf-0x0000000000000004
00000000000001db R_X86_64_32 .rodata+0x00000000000000b0
00000000000001ef R_X86_64_PC32 printf-0x0000000000000004
00000000000001f4 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000209 R_X86_64_PC32 printf-0x0000000000000004
000000000000020e R_X86_64_32 .rodata+0x00000000000000b0
0000000000000223 R_X86_64_PC32 printf-0x0000000000000004
0000000000000228 R_X86_64_32 .rodata+0x00000000000000b0
000000000000023d R_X86_64_PC32 printf-0x0000000000000004
0000000000000242 R_X86_64_32 .rodata+0x00000000000000b0
0000000000000257 R_X86_64_PC32 printf-0x0000000000000004
0000000000000271 R_X86_64_PC32 __stack_chk_fail-0x0000000000000004
RELOCATION RECORDS FOR [.data]:
OFFSET TYPE VALUE
0000000000000000 R_X86_64_64 .rodata+0x0000000000000015
0000000000000008 R_X86_64_64 .rodata+0x000000000000005e
0000000000000018 R_X86_64_64 .rodata+0x0000000000000088
0000000000000020 R_X86_64_64 .rodata+0x0000000000000015
RELOCATION RECORDS FOR [.rodata]:
OFFSET TYPE VALUE
0000000000000048 R_X86_64_64 .rodata+0x0000000000000029
0000000000000080 R_X86_64_64 .rodata+0x000000000000006b
00000000000000a0 R_X86_64_64 .rodata+0x000000000000008f
00000000000000d8 R_X86_64_64 .rodata+0x0000000000000029
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text
The relocation records are grouped by the section they relocation resides in. Because string contents are in the .data or .rodata sections, we can restrict ourselves to look at the relocations where the VALUE starts with .data or .rodata. (Mutable strings, like char global7[] = "char []";, are stored in .data, and immutable strings and string literals in .rodata.)
If we were to compile the code with debugging symbols enabled, it would be easier to determine which variable was used to refer to which string, but I might just look at the actual contents at each relocation value (target), to see which references to the immutable strings need fixing.
The command combination
objdump -r example.o | awk '($3 ~ /^\..*\+/) { t = $3; sub(/\+/, " ", t); n[t]++ } END { for (r in n) printf "%d %s\n", n[r], r }' | sort -g
will output the number of relocations per target, followed by the target section, followed by the target offset in the section, sorted with the target that occurs most in relocations last. That is, the last lines output above are the ones you need to concentrate on. For me, I get
1 .rodata
1 .rodata 0x0000000000000044
1 .rodata 0x00000000000000a8
1 .rodata 0x00000000000000b5
1 .rodata 0x00000000000000c0
1 .rodata 0x00000000000000d4
2 .rodata 0x0000000000000015
2 .rodata 0x0000000000000029
2 .rodata 0x000000000000005e
2 .rodata 0x000000000000006b
2 .rodata 0x0000000000000088
2 .rodata 0x000000000000008f
18 .rodata 0x00000000000000b0
If I add optimization (gcc -W -Wall -O3 -fomit-frame-pointer -c example.c), the result is
1 .rodata 0x0000000000000020
1 .rodata 0x0000000000000040
1 .rodata.str1.1
1 .rodata.str1.1 0x0000000000000058
2 .rodata.str1.1 0x000000000000000d
2 .rodata.str1.1 0x0000000000000021
2 .rodata.str1.1 0x000000000000005f
2 .rodata.str1.1 0x000000000000006c
3 .rodata.str1.1 0x000000000000003a
3 .rodata.str1.1 0x000000000000004c
18 .rodata.str1.1 0x0000000000000008
which shows that compiler options do have a big effect, but that there is that one target that is anyways used 18 times: section .rodata offset 0xb0 (.rodata.str1.1 offset 0x8 if optimization is enabled at compile time).
That is the `"\t%s\n" string literal.
Modifying the original program into
char *local8 = "char *";
char *const local9 = "char *const";
const char *const fmt = "\t%s\n";
printf("Global:\n");
printf(fmt, global1);
printf(fmt, global2);
and so on, replacing the format string with an immutable string pointer fmt, eliminates those 18 relocations altogether. (You can also use the equivalent const char fmt[] = "\t%s\n";, of course.)
The above analysis indicates that at least with GCC-4.6.3, most of the avoidable relocations are caused by (repeated use of) string literals. Replacing them with an array of const chars (const char fmt[] = "\t%s\n";) or a const pointer to const chars (const char *const fmt = "\t%s\n";) -- both cases putting the contents to .rodata section, read-only, and the pointer/array reference itself is immutable too -- seems an effective and safe strategy to me.
Furthermore, conversion of string literals to immutable string pointers or char arrays is completely a source-level task. That is, if you convert all string literals using the above method, you can eliminate at least one relocation per string literal.
In fact, I don't see how object-level analysis will help you much, here. It will tell you if your modifications reduce the number of relocations needed, of course.
The above awk stanza can be extended to a function that outputs the string constants for dynamic references with positive offsets:
#!/bin/bash
if [ $# -ne 1 ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
exec >&2
echo ""
echo "Usage: %s [ -h | --help ]"
echo " %s object.o"
echo ""
exit 1
fi
export LANG=C LC_ALL=C
objdump -wr "$1" | awk '
BEGIN {
RS = "[\t\v\f ]*[\r\n][\t\n\v\f\r ]*"
FS = "[\t\v\f ]+"
}
$1 ~ /^[0-9A-Fa-f]+/ {
n[$3]++
}
END {
for (s in n)
printf "%d %s\n", n[s], s
}
' | sort -g | gawk -v filename="$1" '
BEGIN {
RS = "[\t\v\f ]*[\r\n][\t\n\v\f\r ]*"
FS = "[\t\v\f ]+"
cmd = "objdump --file-offsets -ws " filename
while ((cmd | getline) > 0)
if ($3 == "section") {
s = $4
sub(/:$/, "", s)
o = $NF
sub(/\)$/, "", o)
start[s] = strtonum(o)
}
close(cmd)
}
{
if ($2 ~ /\..*\+/) {
s = $2
o = $2
sub(/\+.*$/, "", s)
sub(/^[^\+]*\+/, "", o)
o = strtonum(o) + start[s]
cmd = "dd if=\"" filename "\" of=/dev/stdout bs=1 skip=" o " count=256"
OLDRS = RS
RS = "\0"
cmd | getline hex
close(cmd)
RS = OLDRS
gsub(/\\/, "\\\\", hex)
gsub(/\t/, "\\t", hex)
gsub(/\n/, "\\n", hex)
gsub(/\r/, "\\r", hex)
gsub(/\"/, "\\\"", hex)
if (hex ~ /[\x00-\x1F\x7F-\x9F\xFE\xFF]/ || length(hex) < 1)
printf "%s\n", $0
else
printf "%s = \"%s\"\n", $0, hex
} else
print $0
}
'
This is a bit crude, just slapped together, so I don't know how portable it is. On my machine, it does seem to find the string literals for the few test cases I tried it on; you should probably rewrite it to match your own needs. Or even use an actual programming language with ELF support to examine the object files directly.
For the example program shown above (prior to the modifications I suggest to reduce the number of relocations), compiled without optimization, the above script yields the output
1 .data+0x000000000000001c = ""
1 .data-0x0000000000000004
1 .rodata
1 .rodata+0x0000000000000044 = ""
1 .rodata+0x00000000000000a8 = "Global:"
1 .rodata+0x00000000000000b5 = "Local:"
1 .rodata+0x00000000000000c0 = "static const char []"
1 .rodata+0x00000000000000d4 = ""
1 .text
1 __stack_chk_fail-0x0000000000000004
1 format
1 global4
1 global5-0x0000000000000004
1 global6-0x0000000000000004
1 global7
1 global8-0x0000000000000004
1 global9-0x0000000000000004
1 putchar-0x0000000000000004
2 .rodata+0x0000000000000015 = "static const char *"
2 .rodata+0x0000000000000029 = "static const char *const"
2 .rodata+0x000000000000005e = "const char *"
2 .rodata+0x000000000000006b = "const char *const"
2 .rodata+0x0000000000000088 = "char *"
2 .rodata+0x000000000000008f = "char *const"
2 puts-0x0000000000000004
18 .rodata+0x00000000000000b0 = "\t%s\n"
18 printf-0x0000000000000004
Finally, you might notice that using a function pointer to printf() instead of calling printf() directly will reduce another 18 relocations from the example code, but I would consider that a mistake.
For code, you want relocations, as indirect function calls (calls via function pointers) are much slower than direct calls. Simply put, those relocations make function and subroutine calls much faster, so you most definitely want to keep those.
Apologies for the long answer; hope you find this useful. Questions?
Based on Nomainal Animals's answer, which I still have to fully digest, I have come up with the following simple shell script that seems to work for finding what I called the "easily fixable" variety:
for i in path/to/*.o ; do
REL="$(objdump -TtRr "$i" 2>/dev/null | grep '.data.rel.ro.local[^]+-]')"
if [ -n "$REL" ]; then
echo "$(basename "$i"):"
echo "$REL" | c++filt
echo
fi
done
Sample output (for the QtGui library):
qimagereader.o:
0000000000000000 l O .data.rel.ro.local 00000000000000c0 _qt_BuiltInFormats
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
qopenglengineshadermanager.o:
0000000000000000 l O .data.rel.ro.local 0000000000000090 QOpenGLEngineShaderManager::getUniformLocation(QOpenGLEngineShaderManager::Uniform)::uniformNames
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
qopenglpaintengine.o:
0000000000000000 l O .data.rel.ro.local 0000000000000020 vtable for (anonymous namespace)::QOpenGLStaticTextUserData
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
qtexthtmlparser.o:
0000000000000000 l O .data.rel.ro.local 00000000000003b0 elements
0000000000000000 l d .data.rel.ro.local 0000000000000000 .data.rel.ro.local
Looking up those symbols in the source file usually leads quickly to a fix, or else to the discovery that they're not easily fixable.
But I guess I'll have to revisit Nominal Animal's answer once I run out of .data.rel.ro.locals to fix...
I have a massive C/C++ library that I'm trying to use through JNI for an Android project. It's comprised of several classes that were originally written for MFC and we've ported them over for execution on the Android environment.
The library builds fine (at least according to ndk-build). The size of the library is 56 MB in size.
When I call System.loadLibrary, the application terminates with the following being logged:
Fatal Signal 11 (SIGSEGV) at 0x00000004 (Code = 1), thread 16123
I performed an objdump on my library and there is nothing at 0004. Here's the first few lines of the dump :
DYNAMIC SYMBOL TABLE:
00000000 DF *UND* 00000000 __cxa_finalize
002fe600 g D .bss 00000000 __dso_handle
002f1e9c g D .init_array 00000000 __INIT_ARRAY__
002f20a4 g D .fini_array 00000000 __FINI_ARRAY__
0011dc58 g DF .text 000000ec WideToJava
00000000 DF *UND* 00000000 __android_log_print
0021f460 g DF .text 00000030 wcslen
00000000 DF *UND* 00000000 malloc
00000000 DF *UND* 00000000 free
0011dd44 g DF .text 000000e4 WideToJava2
0011de28 g DF .text 0000004e JavaToWSZ
0027443c g DF .text 0000001c _Znaj
0011dec4 w DF .text 00000018
I then performed a readelf to see if there is another library that needs to be loaded, listed here :
Dynamic section at offset 0x2d99e0 contains 28 entries:
Tag Type Name/Value
0x00000003 (PLTGOT) 0x2fe464
0x00000002 (PLTRELSZ) 776 (bytes)
0x00000017 (JMPREL) 0x11d4a0
0x00000014 (PLTREL) REL
0x00000011 (REL) 0x107770
0x00000012 (RELSZ) 89392 (bytes)
0x00000013 (RELENT) 8 (bytes)
0x6ffffffa (RELCOUNT) 11169
0x00000006 (SYMTAB) 0x128
0x0000000b (SYMENT) 16 (bytes)
0x00000005 (STRTAB) 0x31cd8
0x0000000a (STRSZ) 791389 (bytes)
0x00000004 (HASH) 0xf3038
0x00000001 (NEEDED) Shared library: [liblog.so]
0x00000001 (NEEDED) Shared library: [libandroid.so]
0x00000001 (NEEDED) Shared library: [libstdc++.so]
0x00000001 (NEEDED) Shared library: [libm.so]
0x00000001 (NEEDED) Shared library: [libc.so]
0x00000001 (NEEDED) Shared library: [libdl.so]
0x0000000e (SONAME) Library soname: [libCSEntry.so]
0x00000019 (INIT_ARRAY) 0x2f1e9c
0x0000001b (INIT_ARRAYSZ) 520 (bytes)
0x0000001a (FINI_ARRAY) 0x2f20a4
0x0000001c (FINI_ARRAYSZ) 12 (bytes)
0x00000016 (TEXTREL) 0x0
0x00000010 (SYMBOLIC) 0x0
0x0000001e (FLAGS) SYMBOLIC TEXTREL
0x00000000 (NULL) 0x0
Then I added those libraries before my call to loadLibrary :
public class ApplicationInterface
{
static
{
System.loadLibrary("log");
System.loadLibrary("android");
System.loadLibrary("stdc++");
System.loadLibrary("m");
System.loadLibrary("c");
System.loadLibrary("dl");
// when i call load library it blows up
System.loadLibrary("CSEntry");
}
// code
}
What's stranger is that there is a specific class that blows up this code. When I commented out the constructor to this class the library loads fine. I've verified that the constructor exists in the resulting library using objdump. I then proceeded to comment out the code in the constructor and it fails as well. Here's the offending code in the C++:
// Code
m_pPifFile = new CNPifFile(sPifFilePath);
m_pPifFile->SetAppType(ENTRY_TYPE);
m_pPifFile->SetAppFName(sApplicationFilename);
m_pPifFile->SetBinaryLoad(true);
// load the PIF file
if(m_pPifFile->LoadPifFile())
{
// PIF file loaded create a new Run App
// the offending line
m_pRunAplEntry = new CRunAplEntry(m_pPifFile);
// Code
RunAplEntry.h
class AFX_EXT_CLASS CRunAplEntry : public CRunApl
{
public:
CRunAplEntry(CNPifFile* pPifFile);
~CRunAplEntry();
// code
};
RunApl.h
class CLASS_DECL_ZBRIDGEO CRunApl : public CObject
{
public:
CRunApl();
virtual ~CRunApl();
// code
};
AFX_EXT_CLASS and CLASS_DECL_ZBRIDGEO are #defined to empty space.
We wrote a CObject equivalient for the Android NDK.
Here are the makefiles :
Application.mk
# set the platform to the latest processor type
APP_ABI := armeabi-v7a
# build for GNU STL
APP_STL := gnustl_static
# turn on exceptions and runtime type info
APP_CPPFLAGS += -fexceptions -frtti
Android.mk
LOCAL_LDLIBS := -llog -landroid $(BOOST_LIBS) -lgnustl_static
LOCAL_LDFLAGS := -L$(BOOST_PATH)/lib/arm
LOCAL_CFLAGS := -D__GLIBC__=1
LOCAL_CFLAGS += -D_GLIBCXX_USE_C99_MATH=1
LOCAL_CFLAGS += -DUNICODE=1
LOCAL_CFLAGS += -D_UNICODE=1
LOCAL_CFLAGS += -DGENERATE_BINARY=1
LOCAL_CFLAGS += -DUSE_BINARY=1
LOCAL_CFLAGS += -DPORTABLE=1
LOCAL_CFLAGS += -fpermissive
LOCAL_CFLAGS += $(CSPROMOBILE_INCLUDE)
LOCAL_STATIC_LIBRARIES := Engine zTbdO zFormO zDictO zToolsO zUtilO zCommonO
I've noticed that the GCC compiler doesn't handle virtual functions very well and I suspect this could be the problem, but I'm not sure. Here are the details for my environment.
NDK: Crystax r75
ABI Target: 4.6.3
ADT: v21.0.1-543035
Debug Device: Nexus 7
Any help will be greatly appreciated.
If you have any additional info please don't hesitate to ask.
Thanks,
Will
I want to know programmatic way to get the memory consumed by my user defined class.
Following is the declaration of the class
struct TrieNode {
typedef std::map<char, TrieNode *> ChildType;
std::string m_word;
bool m_visited;
}
I have inserted around 264061 words into this Trie. After this when i do sizeof(trieobject) it just show me 32. How do i know how much exact memory is used by such data structures.
I use
valgrind --tool=massif ./myprogram -opt arg1 arg2
ms_print massif.* | less -SR
for that. Sample output from this page
19.63^ ###
| #
| # ::
| # : :::
| :::::::::# : : ::
| : # : : : ::
| : # : : : : :::
| : # : : : : : ::
| ::::::::::: # : : : : : : :::
| : : # : : : : : : : ::
| ::::: : # : : : : : : : : ::
| ###: : : # : : : : : : : : : #
| ::# : : : # : : : : : : : : : #
| :::: # : : : # : : : : : : : : : #
| ::: : # : : : # : : : : : : : : : #
| ::: : : # : : : # : : : : : : : : : #
| :::: : : : # : : : # : : : : : : : : : #
| ::: : : : : # : : : # : : : : : : : : : #
| :::: : : : : : # : : : # : : : : : : : : : #
| ::: : : : : : : # : : : # : : : : : : : : : #
0 +----------------------------------------------------------------------->KB 0 29.48
Number of snapshots: 25
Detailed snapshots: [9, 14 (peak), 24]
The remainder of the log details the highest percentiles of memory allocations, you can specifically see what type of class takes what % of heap memory (and where the allocations originate in terms of call stack), e.g.:
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
10 10,080 10,080 10,000 80 0
11 12,088 12,088 12,000 88 0
12 16,096 16,096 16,000 96 0
13 20,104 20,104 20,000 104 0
14 20,104 20,104 20,000 104 0
99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->49.74% (10,000B) 0x804841A: main (example.c:20)
|
->39.79% (8,000B) 0x80483C2: g (example.c:5)
| ->19.90% (4,000B) 0x80483E2: f (example.c:11)
| | ->19.90% (4,000B) 0x8048431: main (example.c:23)
| |
| ->19.90% (4,000B) 0x8048436: main (example.c:25)
|
->09.95% (2,000B) 0x80483DA: f (example.c:10)
->09.95% (2,000B) 0x8048431: main (example.c:23)
Well, this is not so easy to do. First of all m_word is a string with variable size right? Internally the std::string holds an array of chars among other things. The same stands for std::map. I guess you could get a rough estimation based on the size of the map * TrieNode but this will be just a rough estimate.
I think some code profiling with an external tool would be of more help. Hell you can even use the task manager if you are out of any tools left :).
Your "object size" is sizeof(std::string) + sizeof(bool) + m_word.capacity() + padding bytes or sizeof(trieobject) + m_word.capacity()
Here's a piece of code for GCC that I came up with that you can use in a test program where you only instantiate one object of your class and do some typical work with it. The code replaces the global operator new() and operator delete(); so it will only track allocations through ::new expressions and the standard allocator, provided that the standard allocator itself uses ::operator new() (this is the case for GCC).
Since we need to track the pointers and their allocations, we need a separate map for that, which of course cannot use the standard allocator itself; GCC's malloc-allocator comes to the rescue.
We use a statically initialized global to make the memory tracker print its data after main returns.
#include <unordered_map>
#include <string>
#include <iostream>
#include <ext/malloc_allocator.h>
struct Memtrack
{
typedef std::unordered_map<void*, std::size_t, std::hash<void*>,
std::equal_to<void*>, __gnu_cxx::malloc_allocator<void*>> AllocMap;
static int memtrack;
static int memmax;
static AllocMap allocs;
Memtrack() { std::cout << "starting tracker: cur = " << memtrack << ", max = " << memmax << ".\n"; }
~Memtrack() { std::cout << "ending tracker: cur = " << memtrack << ", max = " << memmax << ".\n"; }
static void track_new(std::size_t n, void * p)
{
memtrack += n;
if (memmax < memtrack) memmax = memtrack;
allocs[p] = n;
std::cout << "... allocating " << n << " bytes...\n";
}
static void track_delete(void * p)
{
const int n = int(allocs[p]);
memtrack -= n;
std::cout << "... freeing " << n << " bytes...\n";
}
} m;
int Memtrack::memtrack = 0;
int Memtrack::memmax = 0;
Memtrack::AllocMap Memtrack::allocs;
void * operator new(std::size_t n) throw(std::bad_alloc)
{
void * const p = std::malloc(n);
Memtrack::track_new(n, p);
return p;
}
void operator delete(void * p) throw()
{
Memtrack::track_delete(p);
std::free(p);
}
int main()
{
std::cout << "Beginning of main.\n";
std::unordered_map<std::string, int> m; // this piece of code
m["hello"] = 4; // is a typical test for working
m["world"] = 7; // with dynamic allocations
std::cout << "End of main.\n";
}
Some typical output:
starting tracker: cur = 0, max = 0.
Beginning of main.
... allocating 48 bytes...
... allocating 12 bytes...
... allocating 12 bytes...
End of main.
... freeing 12 bytes...
... freeing 12 bytes...
... freeing 48 bytes...
ending tracker: cur = 0, max = 72.
Trivial. If you have some time (which might be the case, if you are only interested in the size for debugging/optimising purposes). This approach might be unsuited for production code!
#include <malloc.h>
template <typename T> int objSize(T const* obj) {
// instead of uordblks, you may be interested in 'arena', you decide!
int oldSize = mallinfo().uordblks;
T* dummy = new T(*obj);
int newSize = mallinfo().uordblks;
delete dummy;
return newSize - oldSize;
}