I am trying to perform loop bound analysis for ARMv7-M code using Z3, as part of a larger framework.
I would like to find the memory addresses that are used by a certain function inside an .elf file.
For example, in a function foo() I have the basic block below:
ldr r1, [r3, #0x20]
strb r2, [r3, #6] {__elf_header}
str r2, [r3, #0x24] {__elf_header}
str r2, [r3, #0x20] {__elf_header}
mov r3, r1
cmp r1, #0
bne #0x89f6
How can I get the initial memory location used by this function, [r3, #0x20]? Are there memory segments for every function to access, or is it random?
Given that the above basic block is a loop, is there a way to know the memory addresses that will be used during its execution?
Does the compiler, for example, save a memory address range from 0x20 to 0x1234 to be accessed only during the execution of such a basic block? In other words, is there a map between a function and the range of memory addresses used by it?
It is confusing as to what you are asking. First off, why would any linker put effort into randomizing things? Perhaps there is one that intentionally makes the output non-repeatable. But a linker is just a program, and normally it will do things like process the items on the command line in order, and then process each object from beginning to end... not random.
So far the rest of this seems pretty straightforward: just use the tools. Your comment implies GNU tools? Since this is in part tool-specific, you should have tagged it as such, as you cannot really make generalizations across all toolchains ever created.
unsigned int one ( void )
{
return(1);
}
unsigned int two ( void )
{
return(2);
}
unsigned int three ( void )
{
return(3);
}
arm-none-eabi-gcc -O2 -c so.c -o so.o
arm-none-eabi-objdump -d so.o
so.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <one>:
0: e3a00001 mov r0, #1
4: e12fff1e bx lr
00000008 <two>:
8: e3a00002 mov r0, #2
c: e12fff1e bx lr
00000010 <three>:
10: e3a00003 mov r0, #3
14: e12fff1e bx lr
as shown they are all in .text, simple enough.
arm-none-eabi-gcc -O2 -c -ffunction-sections so.c -o so.o
arm-none-eabi-objdump -d so.o
so.o: file format elf32-littlearm
Disassembly of section .text.one:
00000000 <one>:
0: e3a00001 mov r0, #1
4: e12fff1e bx lr
Disassembly of section .text.two:
00000000 <two>:
0: e3a00002 mov r0, #2
4: e12fff1e bx lr
Disassembly of section .text.three:
00000000 <three>:
0: e3a00003 mov r0, #3
4: e12fff1e bx lr
and now each function has its own section name.
So the rest relies heavily on linking, and there is no one linker script. You, the programmer, choose one directly or indirectly, and how the final binary (elf) is built is a direct result of that choice.
If you have something like this
.text : { *(.text*) } > rom
and nothing else with respect to these functions then all of them will land in this definition, but the linker script or instructions to the linker can indicate something else causing one or more to land in its own space.
arm-none-eabi-ld -Ttext=0x1000 so.o -o so.elf
arm-none-eabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000001000
arm-none-eabi-objdump -d so.elf
so.elf: file format elf32-littlearm
Disassembly of section .text:
00001000 <one>:
1000: e3a00001 mov r0, #1
1004: e12fff1e bx lr
00001008 <two>:
1008: e3a00002 mov r0, #2
100c: e12fff1e bx lr
00001010 <three>:
1010: e3a00003 mov r0, #3
1014: e12fff1e bx lr
and then of course
arm-none-eabi-nm -a so.elf
00000000 n .ARM.attributes
00011018 T __bss_end__
00011018 T _bss_end__
00011018 T __bss_start
00011018 T __bss_start__
00000000 n .comment
00011018 T __data_start
00011018 T _edata
00011018 T _end
00011018 T __end__
00011018 ? .noinit
00001000 T one <----
00000000 a so.c
00080000 T _stack
U _start
00001000 t .text
00001010 T three <----
00001008 T two <----
which is simply because there is a symbol table in the file
Symbol table '.symtab' contains 22 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00001000 0 SECTION LOCAL DEFAULT 1
2: 00000000 0 SECTION LOCAL DEFAULT 2
3: 00000000 0 SECTION LOCAL DEFAULT 3
4: 00011018 0 SECTION LOCAL DEFAULT 4
5: 00000000 0 FILE LOCAL DEFAULT ABS so.c
6: 00001000 0 NOTYPE LOCAL DEFAULT 1 $a
7: 00001008 0 NOTYPE LOCAL DEFAULT 1 $a
8: 00001010 0 NOTYPE LOCAL DEFAULT 1 $a
9: 00001008 8 FUNC GLOBAL DEFAULT 1 two
10: 00011018 0 NOTYPE GLOBAL DEFAULT 1 _bss_end__
11: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __bss_start__
12: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __bss_end__
13: 00000000 0 NOTYPE GLOBAL DEFAULT UND _start
14: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __bss_start
15: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __end__
16: 00001000 8 FUNC GLOBAL DEFAULT 1 one
17: 00011018 0 NOTYPE GLOBAL DEFAULT 1 _edata
18: 00011018 0 NOTYPE GLOBAL DEFAULT 1 _end
19: 00080000 0 NOTYPE GLOBAL DEFAULT 1 _stack
20: 00001010 8 FUNC GLOBAL DEFAULT 1 three
21: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __data_start
but if
arm-none-eabi-strip so.elf
arm-none-eabi-nm -a so.elf
arm-none-eabi-nm: so.elf: no symbols
arm-none-eabi-objdump -d so.elf
so.elf: file format elf32-littlearm
Disassembly of section .text:
00001000 <.text>:
1000: e3a00001 mov r0, #1
1004: e12fff1e bx lr
1008: e3a00002 mov r0, #2
100c: e12fff1e bx lr
1010: e3a00003 mov r0, #3
1014: e12fff1e bx lr
The elf file format is somewhat trivial; you can easily write code to parse it, and you do not need a library or anything like that. With simple experiments like these you can easily understand how these tools work.
How can I get the initial memory used by this function ?
Assuming you mean the initial address, and assuming the code is not relocated, you just read it out of the file. Simple.
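As a rough illustration of "just read it out of the file", here is a minimal Python sketch that decodes one 16-byte ELF32 symbol table entry and turns a function symbol into an address range. The helper names (parse_elf32_sym, function_range) are made up for this example, and locating .symtab inside the file (via the section headers) is left out. The thumb flag reflects the ARM ELF convention that Thumb function symbols have bit 0 set in their value.

```python
import struct

# Elf32_Sym, little-endian: st_name, st_value, st_size, st_info, st_other, st_shndx
ELF32_SYM = struct.Struct("<IIIBBH")
STT_FUNC = 2  # symbol type used for functions


def parse_elf32_sym(entry: bytes) -> dict:
    """Decode one 16-byte .symtab entry (hypothetical helper for this sketch)."""
    name, value, size, info, _other, shndx = ELF32_SYM.unpack(entry)
    return {
        "name_offset": name,   # offset into the string table, not the name itself
        "value": value,
        "size": size,
        "type": info & 0xF,    # low nibble of st_info
        "bind": info >> 4,     # high nibble of st_info
        "shndx": shndx,
    }


def function_range(sym: dict, thumb: bool = False):
    """Return (start, end) with end exclusive, or None if this is not a function."""
    if sym["type"] != STT_FUNC:
        return None
    start = sym["value"] & ~1 if thumb else sym["value"]
    return (start, start + sym["size"])


if __name__ == "__main__":
    # Entry equivalent to "00001008 8 FUNC GLOBAL ... two" from the dump above.
    raw = struct.pack("<IIIBBH", 1, 0x1008, 8, 0x12, 0, 1)
    print([hex(a) for a in function_range(parse_elf32_sym(raw))])
```

This only covers where the function's code lives. It says nothing about which data addresses the code touches at run time, such as [r3, #0x20], since those depend on register contents.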
Are there memory segments for every function to access or is it random ?
As demonstrated above, the command line option you mention later in a comment (it should have been in the question; you should edit the question for completeness) does exactly that: it makes a custom section name per function. (What happens if you have the same non-global function name in two or more objects? You can easily figure this out on your own.)
Nothing is random here, you would need to have a reason to randomize things for security or other, it is more often preferred that a tool outputs the same or at least similar results each time with the same inputs (some tools will embed a build date/time in the file and that may vary from one build to the next).
If you are not using gnu tools then binutils may still be very useful with parsing and displaying elf files anyway.
arm-none-eabi-nm so.elf
00011018 T __bss_end__
00011018 T _bss_end__
00011018 T __bss_start
00011018 T __bss_start__
00011018 T __data_start
00011018 T _edata
00011018 T _end
00011018 T __end__
00001000 T one
00080000 T _stack
U _start
00001010 T three
00001008 T two
nm so.elf (x86 binutils not arm)
00001000 t $a
00001008 t $a
00001010 t $a
00011018 T __bss_end__
00011018 T _bss_end__
00011018 T __bss_start
00011018 T __bss_start__
00011018 T __data_start
00011018 T _edata
00011018 T _end
00011018 T __end__
00001000 T one
00080000 T _stack
U _start
00001010 T three
00001008 T two
Or can build with clang and examine with gnu, etc. Obviously disassembly won't work, but some tools will.
If this is not what you were asking then you need to re-write your question or edit it so we can understand what you are actually asking.
Edit
I would like to know if there is a map between a function and the range of memory address used by it ?
In general, no. The term function implies, but is not limited to, high-level languages like C. The machine code has no clue about functions, nor should it, and well-optimized code does not necessarily have a single exit point from a function, much less a return marking the end. For architectures like the various ARM instruction sets, the return instruction is not the end of the "function"; there is pool data that may follow.
But let's look at what gcc does.
unsigned int one ( unsigned int x )
{
return(x+1);
}
unsigned int two ( void )
{
return(one(2));
}
unsigned int three ( void )
{
return(3);
}
arm-none-eabi-gcc -O2 -S so.c
cat so.s
.cpu arm7tdmi
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 1
.eabi_attribute 30, 2
.eabi_attribute 34, 0
.eabi_attribute 18, 4
.file "so.c"
.text
.align 2
.global one
.arch armv4t
.syntax unified
.arm
.fpu softvfp
.type one, %function
one:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
add r0, r0, #1
bx lr
.size one, .-one
.align 2
.global two
.syntax unified
.arm
.fpu softvfp
.type two, %function
two:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r0, #3
bx lr
.size two, .-two
.align 2
.global three
.syntax unified
.arm
.fpu softvfp
.type three, %function
three:
@ Function supports interworking.
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
mov r0, #3
bx lr
.size three, .-three
.ident "GCC: (GNU) 10.2.0"
we see this is being placed in the file, but what does it do?
.size three, .-three
One reference says this is used so that the linker can remove the function if it is not used. And I have seen that feature in play so good to know (you could have looked this up just as easily as I did)
So in that context the info is there and you can extract it (lesson for the reader).
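Once the start address and size of each function have been extracted, mapping an arbitrary code address back to a function is a small lookup. The sketch below is only illustrative: the table reuses the addresses and sizes from the linked so.elf symbol table shown earlier (one, two, three at 0x1000, 0x1008, 0x1010, 8 bytes each), and function_at is a name invented for this example.

```python
from bisect import bisect_right

# (start address, size in bytes, name), copied from the .symtab dump of so.elf above
FUNCS = sorted([(0x1008, 8, "two"), (0x1000, 8, "one"), (0x1010, 8, "three")])


def function_at(addr, funcs=FUNCS):
    """Return the name of the function whose [start, start+size) contains addr."""
    starts = [start for start, _size, _name in funcs]
    i = bisect_right(starts, addr) - 1   # last function starting at or before addr
    if i < 0:
        return None
    start, size, name = funcs[i]
    return name if addr < start + size else None


if __name__ == "__main__":
    for probe in (0x1000, 0x100C, 0x1018):
        print(hex(probe), function_at(probe))
```

As discussed, this reflects only what the symbol table claims. Inlined copies of a function and literal pool data placed after a function are not necessarily separated out.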
And then if you use this gcc compiler option that you mentioned
-ffunction-sections
Disassembly of section .text.one:
00000000 <one>:
0: e2800001 add r0, r0, #1
4: e12fff1e bx lr
Disassembly of section .text.two:
00000000 <two>:
0: e3a00003 mov r0, #3
4: e12fff1e bx lr
Disassembly of section .text.three:
00000000 <three>:
0: e3a00003 mov r0, #3
4: e12fff1e bx lr
[ 4] .text.one
PROGBITS 00000000 000034 000008 00 0 0 4
[00000006]: ALLOC, EXEC
[ 5] .rel.text.one
REL 00000000 0001a4 000008 08 12 4 4
[00000040]: INFO LINK
[ 6] .text.two
PROGBITS 00000000 00003c 000008 00 0 0 4
[00000006]: ALLOC, EXEC
[ 7] .rel.text.two
REL 00000000 0001ac 000008 08 12 6 4
[00000040]: INFO LINK
[ 8] .text.three
PROGBITS 00000000 000044 000008 00 0 0 4
[00000006]: ALLOC, EXEC
[ 9] .rel.text.three
REL 00000000 0001b4 000008 08 12 8 4
[00000040]: INFO LINK
That is giving us a size of the sections.
In general with respect to software compiled or in particular assembled, assume that a function doesn't have boundaries. As you can see above the one function is inlined into the two function, invisibly, so how big is an inlined function within another function? How many instances of a function are there in a binary? Which one do you want to monitor and know the size of, performance of, etc? Gnu has this feature with gcc, you can see if it is there with other languages or tools. Assume the answer is no, and then if you happen to find a way, then that is good.
Does the compiler save a memory segment to be only accessed by a certain function ?
I have no idea what this means. The compiler doesn't make memory segments; the linker does. How the binary is put into a memory image is a linker thing, not a compiler thing, for starters. Segments are just a way to communicate between tools that these bytes are code (read only, ideally), initialized data, or uninitialized data, perhaps extending to read-only data, and then you can make up your own types.
If your ultimate goal is to find the bytes that represent the high-level concept of a "function" in memory (assuming no relocation, etc.) by looking at the elf binary using the GNU toolchain, that is possible in theory.
The first thing we appear to know is that the OBJECT contains this information so that a linker feature can remove unused functions for size. But that does not automatically mean that the output binary from the linker also includes this information. You need to find where this .size lands in the object and then look for that in the final binary.
The compiler turns one language into another, often from a higher level to a lower level, but not always; it depends on the compiler and the input/output languages. C to assembly, or C to machine code, or what about Verilog to C++ for a simulation: is that higher or lower? The terms .text, .data and .bss are not part of the language but more of a habit based on learned experience, and they help, as mentioned, to communicate with the linker so that the output binaries can be better controlled for various targets. As shown above, with gcc (no generalities can be made in this area across all tools and languages, or even across all C or C++ tools), all the code for all the functions in a source file lands in one .text section by default. You have to do extra work to get something different. So the compiler, in general, does not make a "segment" or "memory segment" for each function. You already solved your problem, it seems, by using a command line option that turns every function into its own section, and now you have a lot more control over size and location, etc.
Just use the file format and/or the tools. This question, or series of questions, boils down to: just go look at the elf file format. This is not a Stack Overflow question, as questions seeking recommendations for external information are not for this site.
Does the compiler, for example, save a memory address range from 0x20 to 0x1234 to be accessed only during the execution of such a basic block? In other words, is there a map between a function and the range of memory addresses used by it?
"Save"? The compiler does not link; the linker links. Is that memory "only" accessed during the execution of that block? Well, in pure textbook theory, yes, but in reality branch prediction, prefetch and cache line fills can also access that "memory".
Unless doing self-modifying code or using the mmu in interesting ways you do not re-use an address space for more than one function within an application. In general. So function foo() is implemented somewhere and bar() somewhere else. Hand written asm from the good old days you might have foo() branch right into the middle of bar() to save space, get better performance or to make the code harder to reverse engineer or whatever. But compilers are not that efficient, they do their best to turn concepts like functions into first off functional(ly equivalent to the high level code) and then second, if desired smaller or faster or both relative to a straight brute force conversion between languages. So barring inlining and tail (leaf?, I call it tail) optimizations and such, one could say there are some number of bytes at some address that define a compiled function. But due to the nature of processor technology you cannot assume those bytes are only accessed by the processor/chip/system busses only when executing that function.
I have a case where g++ refuses to load a library. I have a file deps/lib/libskgxp11.so. I place -L deps/lib and -lskgxp11 on the g++ command. I get the following error:
error while loading shared libraries: libskgxp11.so: cannot open shared object file: No such file or directory
My overall purpose is to get a test to run with gtest that connects to Oracle, executes select * from dual, and compares the result.
I have the following in Makefile:
g++ -m32 -o $(test_target) -L deps/lib -Wl,--start-group $(dep_libs) -lpthread -ldl -lskgxpr -lskgxp11 -locrb11 -locr11 -lhasgen11 -lnnz11 -lskgxn2 -locrutl11 -lclntsh $(test_objects) -Wl,--end-group
I have used the same basic sequence of -Wl,--start-group, all dependent static libraries, all dependent shared libraries, all object files, -Wl,--end-group on other projects and it works just fine.
Notice the -m32; we're doing everything in 32 bits right now. All the other shared libraries load fine, and they are all in the same dir:
ls deps/lib/libskgxpr.so deps/lib/libskgxp11.so deps/lib/libocrb11.so deps/lib/libocr11.so deps/lib/libhasgen11.so deps/lib/libnnz11.so deps/lib/libskgxn2.so deps/lib/libocrutl11.so deps/lib/libclntsh.so | cat
deps/lib/libclntsh.so
deps/lib/libhasgen11.so
deps/lib/libnnz11.so
deps/lib/libocr11.so
deps/lib/libocrb11.so
deps/lib/libocrutl11.so
deps/lib/libskgxn2.so
deps/lib/libskgxp11.so
deps/lib/libskgxpr.so
I do notice one strange thing, it seems that the following group of libraries are somehow related:
deps/lib/libskgxp11.so
deps/lib/libskgxpcompat.so
deps/lib/libskgxpd.so
deps/lib/libskgxpg.so
deps/lib/libskgxpr.so
They each seem to define the same functions. The other 4 seem to depend on libskgxp11, in the sense that if I link any of the other 4 by themselves with a -l option, g++ complains that it can't load libskgxp11.
I have a command sequence that can tell me for each remaining function I need to get from some shared lib, which shared lib(s) contain it. It gives the following:
skgxpcdel
deps/lib/libskgxp11.so
deps/lib/libskgxpcompat.so
deps/lib/libskgxpd.so
deps/lib/libskgxpg.so
deps/lib/libskgxpr.so
skgxpcini_with_stats
deps/lib/libskgxp11.so
deps/lib/libskgxpcompat.so
deps/lib/libskgxpd.so
deps/lib/libskgxpg.so
deps/lib/libskgxpr.so
skgxpcon_with_stats
deps/lib/libskgxp11.so
deps/lib/libskgxpcompat.so
deps/lib/libskgxpd.so
deps/lib/libskgxpg.so
deps/lib/libskgxpr.so
...
where skgxpcdel is a func I'm looking for. All of the outstanding functions I need give the same list of the same 5 libraries.
If I run objdump -T on the 5 libraries, they all seem to be 32-bit shared libs, I don't see anything special about the one that doesn't load compared to the ones that do:
for i in deps/lib/libskgxp11.so deps/lib/libskgxpcompat.so deps/lib/libskgxpd.so deps/lib/libskgxpg.so deps/lib/libskgxpr.so; do objdump -T $i | head; done
deps/lib/libskgxp11.so: file format elf32-i386
DYNAMIC SYMBOL TABLE:
0000380c l d .init 00000000 .init
00003b90 l d .text 00000000 .text
00008cc4 l d text.unlikely 00000000 text.unlikely
000a719c l d .fini 00000000 .fini
000a71c0 l d .rodata 00000000 .rodata
000bbe80 l d .eh_frame 00000000 .eh_frame
deps/lib/libskgxpcompat.so: file format elf32-i386
DYNAMIC SYMBOL TABLE:
00000ca0 l d .init 00000000 .init
00000d18 l d .text 00000000 .text
00000e04 l d text.unlikely 00000000 text.unlikely
00001618 l d .fini 00000000 .fini
00001640 l d .rodata 00000000 .rodata
0000188c l d .eh_frame 00000000 .eh_frame
deps/lib/libskgxpd.so: file format elf32-i386
DYNAMIC SYMBOL TABLE:
00000ca0 l d .init 00000000 .init
00000d18 l d .text 00000000 .text
00000e04 l d text.unlikely 00000000 text.unlikely
00001618 l d .fini 00000000 .fini
00001640 l d .rodata 00000000 .rodata
0000188c l d .eh_frame 00000000 .eh_frame
deps/lib/libskgxpg.so: file format elf32-i386
DYNAMIC SYMBOL TABLE:
0000380c l d .init 00000000 .init
00003b90 l d .text 00000000 .text
00008cc4 l d text.unlikely 00000000 text.unlikely
000a719c l d .fini 00000000 .fini
000a71c0 l d .rodata 00000000 .rodata
000bbe80 l d .eh_frame 00000000 .eh_frame
deps/lib/libskgxpr.so: file format elf32-i386
DYNAMIC SYMBOL TABLE:
0000380c l d .init 00000000 .init
00003b90 l d .text 00000000 .text
00008cc4 l d text.unlikely 00000000 text.unlikely
000a719c l d .fini 00000000 .fini
000a71c0 l d .rodata 00000000 .rodata
000bbe80 l d .eh_frame 00000000 .eh_frame
I'm scratching my head wondering why I can't link the libskgxp11.so, and what is the relationship between this group of 5 libs.
Any help would be greatly appreciated.
For reference, here are some command sequences I ran to get the list of problem functions and track down the libs:
# Get what project needs
make run-tests 2>&1 | grep -Po '(?<=undefined reference to ).*' | tr -d "\`'" | sort -u > undefined.txt
# Complete content of undefined.txt
skgxpcdel
skgxpcini_with_stats
skgxpcon_with_stats
skgxpdis
skgxpdmpctx
skgxpdmpobj
skgxp_get_epid
skgxpgettabledef
skgxpmmap
skgxpnetmappush
skgxppost
skgxprqhi
skgxpsz
skgxptrace
skgxpunmap
skgxpveri
skgxpvrpc
skgxpwait
# Get what deps provide, for comparison to what project needs
(for i in deps/lib/*.so;do objdump -TC $i | grep -E '^...............F';done) | grep -v '[*]UND[*]' | awk '{print $NF}' | sort -u > have.txt
# First few lines of have.txt
_A_BSafeError
AddCRLBerToList
add_error_table
afidrv
AHChooseRandomConstructor2
AHSecretCBCConstructor2
AHSecretCBCPadConstructor2
AI_AES_CBC
# See what is common between what project needs and what deps provide
comm -12 have.txt undefined.txt > left.txt
# A diff of undefined.txt and left.txt indicates they are identical
# Get what deps provide, in a way searchable by a person
(for i in deps/lib/*.so;do echo "====$i"; objdump -TC $i | grep -E '^...............F';done) | grep -v '[*]UND[*]' | awk '{print $NF}' > have-files.txt
# Here's a sample of first two shared libs and some of their funcs, from have-files.txt
====deps/lib/libagfw11.so
clsagfw_get_check_type
clsagfw_exit
clsagfw_get_attrvalue
====deps/lib/libagtsh.so
naecsn
lmsapbn
kokogtv
# for each func left, try to find lib that contains it
(for i in `cat left.txt`;do echo $i; for j in deps/lib/*.so;do (objdump -TC $j | grep -E '^...............F' | grep -v '[*]UND[*]' | grep -q $i) && echo " $j"; done; done) 2>&1 | more
# Output for first two missing funcs:
skgxpcdel
deps/lib/libskgxp11.so
deps/lib/libskgxpcompat.so
deps/lib/libskgxpd.so
deps/lib/libskgxpg.so
deps/lib/libskgxpr.so
skgxpcini_with_stats
deps/lib/libskgxp11.so
deps/lib/libskgxpcompat.so
deps/lib/libskgxpd.so
deps/lib/libskgxpg.so
deps/lib/libskgxpr.so
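For readers who prefer a script to the pipeline above, the same filtering can be sketched in Python. This is only an approximation under stated assumptions: it works on objdump -T output lines that have already been captured (for example via subprocess), it relies on the same fixed-column layout that grep -E '^...............F' relies on, and the function names exported_functions and providers are invented here. One deliberate difference is that it matches symbol names exactly, whereas grep -q $i also accepts substrings.

```python
def exported_functions(objdump_lines):
    """Names of defined function symbols found in `objdump -T` output lines."""
    names = set()
    for line in objdump_lines:
        # Index 15 is the 16th column, where objdump prints 'F' for functions;
        # this mirrors grep -E '^...............F' followed by grep -v '[*]UND[*]'.
        if len(line) > 15 and line[15] == "F" and "*UND*" not in line:
            names.add(line.split()[-1])   # like awk '{print $NF}'
    return names


def providers(needed, libs):
    """Map each needed symbol to the sorted list of libraries that define it.

    `libs` maps a library path to the list of its `objdump -T` output lines.
    """
    exported = {path: exported_functions(lines) for path, lines in libs.items()}
    return {
        name: sorted(path for path, funcs in exported.items() if name in funcs)
        for name in needed
    }


if __name__ == "__main__":
    # Hypothetical line in objdump -T layout (address, 7 flag columns, section, ...).
    defined = "00004a10 " + "g" + " " * 4 + "DF" + " .text\t0000002a  Base        skgxpcdel"
    print(providers(["skgxpcdel"], {"deps/lib/libskgxp11.so": [defined]}))
```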
As helpfully suggested in the comments, the problem was occurring trying to run the program. I just needed to use the env command to set LD_LIBRARY_PATH to include the deps/lib dir where all the shared libs are stored:
run-tests: $(test_target)
env "LD_LIBRARY_PATH=deps/lib:$$LD_LIBRARY_PATH" $(test_target)
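The same fix can be applied when the test binary is launched from a script instead of make. The following Python sketch builds the environment by hand; env_with_library_path is a name made up for this example, and the test binary path in the comment is hypothetical. Unlike the Makefile line, it avoids a trailing colon when LD_LIBRARY_PATH was previously unset, since an empty entry is treated by the dynamic loader as the current directory.

```python
import os


def env_with_library_path(lib_dir, base_env=None):
    """Return a copy of the environment with lib_dir prepended to LD_LIBRARY_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("LD_LIBRARY_PATH", "")
    # Only add the separator when there is an existing value to keep.
    env["LD_LIBRARY_PATH"] = lib_dir + (":" + current if current else "")
    return env


if __name__ == "__main__":
    env = env_with_library_path("deps/lib")
    print(env["LD_LIBRARY_PATH"])
    # To launch the tests (hypothetical binary name):
    #   import subprocess
    #   subprocess.run(["./run_tests"], env=env, check=True)
```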
Thanks.
I have a 32-bit MFC app with some .NET 2.0 assemblies loaded.
I have received a dump because the app appears to hang (it is frozen). It is running on 64-bit Windows 7.
To me it looks like the FinalizerThread is to blame, but how can I find the root cause?
Loaded symbol image file: mscorwks.dll
Image path: C:\Windows\Microsoft.NET\Framework\v2.0.50727\mscorwks.dll
0:006> !analyze -hang -v
<Cut some verbose output>
BUILD_VERSION_STRING: 6.1.7601.18229 (win7sp1_gdr.130801-1533)
MANAGED_THREAD_ID: 77c
DERIVED_WAIT_CHAIN:
Dl Eid Cid WaitType
-- --- ------- --------------------------
6 d48.3fc Pseudo Thread Handle
WAIT_CHAIN_COMMAND: ~6s;k;;
THREAD_ATTRIBUTES:
PROBLEM_CLASSES:
BlockedOn_EventHandle
Tid [0x3fc]
BLOCKING_THREAD: 000003fc
DEFAULT_BUCKET_ID: APPLICATION_HANG_BlockedOn_EventHandle
THREAD_SHA1_HASH_MOD_FUNC: 3088ee7b2b2c579d04c782e4cf604f0179eeb747
THREAD_SHA1_HASH_MOD_FUNC_OFFSET: 75f294bbc1ac358b92e8d5e7660d0d49028dba99
LAST_CONTROL_TRANSFER: from 76bc15e9 to 774a015d
FAULTING_THREAD: 000003fc
STACK_TEXT:
0751fc5c 76bc15e9 00000002 0751fcac 00000001 ntdll!NtWaitForMultipleObjects+0x15
0751fcf8 750519fc 0751fcac 0751fd20 00000000 KERNELBASE!WaitForMultipleObjectsEx+0x100
0751fd40 750541d8 00000002 7efde000 00000000 kernel32!WaitForMultipleObjectsExImplementation+0xe0
0751fd5c 72caef76 00000002 7317a410 00000000 kernel32!WaitForMultipleObjects+0x18
0751fd7c 72cb2f46 002e4498 0751fe80 002e44c8 mscorwks!WKS::WaitForFinalizerEvent+0x77
0751fd90 72c3a0bf 0751fe80 00000000 00000000 mscorwks!WKS::GCHeap::FinalizerThreadWorker+0x49
0751fda4 72c3a05b 0751fe80 0751fe2c 72d55c37 mscorwks!Thread::DoADCallBack+0x32a
0751fe38 72c39f81 0751fe80 6bdfde61 00000000 mscorwks!Thread::ShouldChangeAbortToUnload+0xe3
0751fe74 72ce36ac 0751fe80 00000000 003037c0 mscorwks!Thread::ShouldChangeAbortToUnload+0x30a
0751fe9c 72ce36bd 72cb2efb 00000008 0751fee4 mscorwks!ManagedThreadBase_NoADTransition+0x32
0751feac 72d304c4 72cb2efb 6bdfdef1 00000000 mscorwks!ManagedThreadBase::FinalizerBase+0xd
0751fee4 72d70647 00000000 00000000 00000000 mscorwks!WKS::GCHeap::FinalizerThreadStart+0xbb
0751ff88 7505336a 002e44c8 0751ffd4 774b9f72 mscorwks!Thread::intermediateThreadProc+0x49
0751ff94 774b9f72 002e44c8 984bccd8 00000000 kernel32!BaseThreadInitThunk+0xe
0:006> !threads
ThreadCount: 2
UnstartedThread: 0
BackgroundThread: 2
PendingThread: 0
DeadThread: 0
Hosted Runtime: no
PreEmptive GC Alloc Lock
ID OSID ThreadOBJ State GC Context Domain Count APT Exception
0 1 77c 003078f0 4220 Enabled 00000000:00000000 003037c0 0 STA
6 2 3fc 00313de0 b220 Enabled 00000000:00000000 003037c0 0 MTA (Finalizer)
0:006> !pe
There is no current managed exception on this thread
0:000> !FinalizeQueue
SyncBlocks to be cleaned up: 0
MTA Interfaces to be released: 0
STA Interfaces to be released: 0
----------------------------------
generation 0 has 1 finalizable objects (00313518->0031351c)
generation 1 has 0 finalizable objects (00313518->00313518)
generation 2 has 0 finalizable objects (00313518->00313518)
Ready for finalization 0 objects (0031351c->0031351c)
Statistics:
MT Count TotalSize Class Name
722e131c 1 56 System.Threading.Thread
Total 1 objects
0:006> ~0s
eax=00000000 ebx=00288a70 ecx=00000000 edx=00000000 esi=00288a70 edi=00288a70
eip=756a78d7 esp=0018fe88 ebp=0018fea8 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
user32!NtUserGetMessage+0x15:
756a78d7 83c404 add esp,4
0:000> kn
# ChildEBP RetAddr
00 0018fe88 756a7c1d user32!NtUserGetMessage+0x15
01 0018fea8 7417a685 user32!GetMessageA+0xa1
02 0018fec4 7417ad32 mfc90!AfxInternalPumpMessage+0x1a [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\thrdcore.cpp @ 153]
03 0018fee4 7414717d mfc90!CWinThread::Run+0x5b [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\thrdcore.cpp @ 629]
04 0018fef8 00412b18 mfc90!AfxWinMain+0x6a [f:\dd\vctools\vc7libs\ship\atlmfc\src\mfc\winmain.cpp @ 47]
05 0018ff88 7505336a <MyMFCApp>+0x12b18
06 0018ff94 774b9f72 kernel32!BaseThreadInitThunk+0xe
07 0018ffd4 774b9f45 ntdll!__RtlUserThreadStart+0x70
08 0018ffec 00000000 ntdll!_RtlUserThreadStart+0x1b
Update:
0:000> !eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x05381018
generation 1 starts at 0x0538100c
generation 2 starts at 0x05381000
ephemeral segment allocation context: (0x0539044c, 0x05391ff4)
segment begin allocated size
05380000 05381000 05391ff4 0x00010ff4(69620)
Large object heap starts at 0x06381000
segment begin allocated size
06380000 06381000 0638d488 0x0000c488(50312)
Total Size 0x1d47c(119932)
------------------------------
GC Heap Size 0x1d47c(119932)
Update 2:
I have run !DumpHeap 0x05381018 (the address from "generation 0 starts at"), with no error output.
I also ran !VerifyHeap and !heap -s -v (on the native heaps) with no suspicious output.
Problem:
I'm trying to get the ITEE EQ (i.e. 'If-Then-Else-Else Equal') block to work when R6 == 0, with the THEN instruction branching to the END label, but the assembler reports an error on the line: BEQ END
Program info:
I'm writing a program related to optimization. I'm using gradient descent to converge to the point where the gradient is 0, in order to find the solution x* that minimises some function f(x). I'm using C to call an assembly function, which is the program shown here.
Here is my program where the error is:
CMP R6, #0 @ Compare f'(x) with 0
ITEE EQ @ If R6 == 0, Then-Else-Else
BEQ END @ Calls END label if equal
SUBNE R0, R6 @ Change R0(x) in the opp direction of gradient to get lower value of f(x) if not equal
BNE optimize @ Branch to optimize if not equal
This is my first assembly program, written for the NXP LPC1769 as a school assignment. Do let me know what I'm missing or what I have done wrong. Thank you!
Here is my whole program:
.syntax unified
.cpu cortex-m3
.thumb
.align 2
.global optimize
.thumb_func
optimize:
@ Write optimization function in assembly language here
MOV R5, INLAMBDA @ R5 holds value of inverse lambda(10) ie to eliminate floating point
LDR R6, #2 @ Load R6 with value '2' ie constant of f'(x)
MUL R6, R1, R6 @ Multiply R6(2) with R1(a) & store to R6(results)
MLA R6, R6, R0, R2 @ Multiply R6(results) with R0(x) & sum with R2(b) to get f'(x). Store & update results to R6
SDIV R6, R5 @ Divide R6(results) by R5(1/lambda) to get f'(x) * lambda
CMP R6, #0 @ Compare f'(x) with 0
ITEE EQ @ If R6 == 0, Then-Else-Else
BEQ END @ Calls END label if equal
SUBNE R0, R6 @ Change R0(x) in the opp direction of gradient to get lower value of f(x) if not equal
BNE optimize @ Branch to optimize if not equal
@ End label
END:
BX LR
@ Define constant values
CONST: .word 123
INLAMBDA: .word 10 @ Inverse lambda 1 / lambda(0.1) is 10
The problem is that BEQ END is in the middle of the IT block. To quote some documentation for IT:
A branch or any instruction that modifies the PC is only permitted in an IT block if it is the last instruction in the block.
That said, because it's a branch, the "else" is implicit anyway - if you take the branch, you won't be executing the following instructions on account of being elsewhere, and if you don't take it you've got no choice but to execute them, so there's no need for them to be explicitly conditional at all. In fact, you don't even need the IT either, since B<cond> has a proper Thumb instruction encoding in its own right. But then you also don't even need that, because you're doing a short forward branch based on a register being zero, and there's a specific Thumb-only compare-and-branch instruction for doing exactly that!
In other words, your initial 5-line snippet can be expressed simply as:
CBZ R6, END
SUB R0, R6
B optimize
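To check the control flow of the three-instruction version, it can help to model it in a higher-level language first. The Python sketch below is a reading of what the assembly appears to intend, not a verified translation: it assumes R0 = x, R1 = a, R2 = b as in the question's comments, an inverse lambda of 10, and that SDIV rounds toward zero. The max_iter guard is an addition for safety and has no counterpart in the assembly.

```python
def sdiv(n, d):
    """Signed integer division rounding toward zero, as SDIV does."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def optimize(x, a, b, inv_lambda=10, max_iter=10_000):
    """Integer gradient descent on f(x) = a*x^2 + b*x + c."""
    for _ in range(max_iter):
        step = sdiv(2 * a * x + b, inv_lambda)   # f'(x) / (1/lambda)
        if step == 0:                            # CBZ R6, END
            return x
        x -= step                                # SUB R0, R6
    return x                                     # B optimize, bounded in this model


if __name__ == "__main__":
    print(optimize(0, 1, -20))
```

Because the step is truncated to an integer, the loop stops as soon as |f'(x)| drops below the inverse lambda, which can be some distance from the true minimum.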
Suppose I've written the following:
enum class Color { Red, Green, Blue, };
template <Color c> Color foo() { return c; }
template Color foo<Color::Green>();
and compiled it. When I look at an objdump of my compiled code, I get:
[einpoklum@myhost /tmp]$ objdump -t f.o | grep "\.text\." | sed 's/^.*\.text\.//;' | c++filt
Color foo<(Color)1>()
Color foo<(Color)1>() 000000000000000b Color foo<(Color)1>()
And if I use abi::__cxa_demangle() from <cxxabi.h> (GCC; maybe it's different with your compiler), the result is similar: (Color)0 or (Color)1 are the template parameters, not Red or Green, nor Color::Red or Color::Green.
Obviously, I can't have names mangled the way I like them. But I would really like to be able to obtain (or write?) a variant of the demangling call which, instead of "Color foo<(Color)1>()", returns "Color foo<Color::Green>()" (or "Color foo<Green>()"). Is this doable?
It might be possible for object files with debug info - section .debug_info contains info about enum class Color, it requires some tool to read ELF debug info, parse data semantically and apply/pass info to the c++filt. I don't know if such tools exist or not (maybe, in the GDB it is all glued together)
It is pretty much impossible in general with object files compiled with optimization, or with stripped debug info - information about enum class Color is just NOT there...
From optimized build
objdump -s aaa.o
aaa.o: file format pe-x86-64
Contents of section .text$_Z3fooIL5Color1EES0_v:
0000 554889e5 b8010000 005dc390 90909090 UH.......]......
Contents of section .xdata$_Z3fooIL5Color1EES0_v:
0000 01040205 04030150 .......P
Contents of section .pdata$_Z3fooIL5Color1EES0_v:
0000 00000000 0b000000 00000000 ............
Contents of section .rdata$zzz:
0000 4743433a 20287838 365f3634 2d706f73 GCC: (x86_64-pos
0010 69782d73 65682d72 6576302c 20427569 ix-seh-rev0, Bui
0020 6c742062 79204d69 6e47572d 57363420 lt by MinGW-W64
0030 70726f6a 65637429 20352e33 2e300000 project) 5.3.0..
Debug build has partial contents of section .debug_info:
0070 00000000 00000000 00000002 436f6c6f ............Colo
0080 720004a3 00000001 01a30000 00035265 r.............Re
0090 64000003 47726565 6e000103 426c7565 d...Green...Blue
00a0 00020004 0405696e 74000566 6f6f3c28 ......int..foo<(
00b0 436f6c6f 7229313e 0001065f 5a33666f Color)1>..._Z3fo
00c0 6f494c35 436f6c6f 72314545 53305f76 oIL5Color1EES0_v
00d0 007b0000 00000000 00000000 000b0000 .{..............
00e0 00000000 00019c06 63007b00 00000100 ........c.{.....
00f0 00
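If the enumerator names can be recovered by some means (from the debug info as described above, or from the source), the last step is plain text substitution on the demangled name. The Python sketch below shows only that step and assumes the enum table is supplied by hand; name_enumerators and the table layout are made up for this example, and nothing here reads .debug_info.

```python
import re

# Matches a C-style cast of an integer literal, e.g. "(Color)1" or "(ns::Color)-1".
CAST = re.compile(r"\((\w+(?:::\w+)*)\)(-?\d+)")


def name_enumerators(demangled, enums):
    """Replace "(Enum)N" with "Enum::Name" wherever the enum table knows N."""
    def repl(match):
        enum_name, value = match.group(1), int(match.group(2))
        label = enums.get(enum_name, {}).get(value)
        # Leave anything we cannot resolve exactly as c++filt printed it.
        return f"{enum_name}::{label}" if label is not None else match.group(0)
    return CAST.sub(repl, demangled)


if __name__ == "__main__":
    table = {"Color": {0: "Red", 1: "Green", 2: "Blue"}}   # supplied by hand here
    print(name_enumerators("Color foo<(Color)1>()", table))
```

Several enumerators may share one value, in which case the mapping back to a name is ambiguous and some policy for choosing among them would be needed.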
When you list the symbol table of a static library, like nm mylib.a, what does the 8-digit hex number that shows up next to each symbol mean? Is that the relative location of each symbol in the code?
Also, can multiple symbols have the same symbol value? Is there something wrong with a bunch of different symbols all having the symbol value of 00000000?
Here's a snippet of code I wrote in C:
#include <stdio.h>
#include
void foo();
int main(int argc, char* argv[]) {
foo();
}
void foo() {
printf("Foo bar baz!");
}
I ran gcc -c foo.c on that code. Here is what nm foo.o showed:
000000000000001b T foo
0000000000000000 T main
U printf
For this example I am running 64-bit Ubuntu Linux; that is why the 8-digit hex number you see has 16 digits here. :-)
The hex number you see is the address of the code in question within the object file, relative to the beginning of the .text section (assuming we address sections of the object file beginning at 0x0). If you run objdump -td foo.o, you'll see the following in the output:
Disassembly of section .text:
0000000000000000 <main>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: 48 83 ec 10 sub $0x10,%rsp
8: 89 7d fc mov %edi,-0x4(%rbp)
b: 48 89 75 f0 mov %rsi,-0x10(%rbp)
f: b8 00 00 00 00 mov $0x0,%eax
14: e8 00 00 00 00 callq 19 <main+0x19>
19: c9 leaveq
1a: c3 retq
000000000000001b <foo>:
1b: 55 push %rbp
1c: 48 89 e5 mov %rsp,%rbp
1f: b8 00 00 00 00 mov $0x0,%eax
24: 48 89 c7 mov %rax,%rdi
27: b8 00 00 00 00 mov $0x0,%eax
2c: e8 00 00 00 00 callq 31 <foo+0x16>
31: c9 leaveq
32: c3 retq
Notice that these two symbols line right up with the entries we saw in the symbol table from nm. Bear in mind that these addresses may change if you link this object file with other object files. Also bear in mind that the callq at 0x2c will change when you link this file against whatever libc your system provides, since it is currently an incomplete call to printf (it doesn't know where printf is right now).
As for your mylib.a, there is more going on here. The file you have is an archive; it contains multiple object files, each with its own text section. As an example, here is part of an nm run against /usr/lib/libm.a on my box:
e_sinh.o:
0000000000000000 r .LC0
0000000000000008 r .LC1
0000000000000010 r .LC2
0000000000000018 r .LC3
0000000000000000 r .LC4
U __expm1
U __ieee754_exp
0000000000000000 T __ieee754_sinh
e_sqrt.o:
0000000000000000 T __ieee754_sqrt
e_gamma_r.o:
0000000000000000 r .LC0
U __ieee754_exp
0000000000000000 T __ieee754_gamma_r
U __ieee754_lgamma_r
U __rint
You'll see that multiple text symbols, indicated by the T in the second column, rest at address 0x0, but each individual object file has only one text symbol at 0x0.
As for individual files having multiple symbols resting at the same address, it seems like it would be possible perhaps. After all, it is just an entry in a table used to determine the location and size of a chunk of data. But I don't know for certain. I have never seen multiple symbols referencing the same part of a section before. Anyone with more knowledge on this than me can chime in. :-)
Hope this helps some.
The hex numeral is the memory offset into the object files where the symbol can be found. It's literally the number of bytes into the object code.
That value is used by the linker to locate and make a copy of the symbol's value. You can see generally how it's laid out if you add the -S option to nm, which will show you the size of the value for each symbol.
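To make the -S suggestion concrete, here is a small Python sketch that parses nm -S style output into value and size pairs. The sample text is not captured from a real run: it is reconstructed from the foo.o disassembly earlier in this thread (main spans 0x0 to 0x1a, foo spans 0x1b to 0x32), and parse_nm_sizes is a name invented for this example.

```python
def parse_nm_sizes(text):
    """Parse `nm -S` output into {name: (value, size, type)}.

    value and size are ints, or None when nm did not print them.
    """
    symbols = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 4:            # value, size, type, name
            value, size, kind, name = fields
            symbols[name] = (int(value, 16), int(size, 16), kind)
        elif len(fields) == 3:          # value, type, name (no size recorded)
            value, kind, name = fields
            symbols[name] = (int(value, 16), None, kind)
        elif len(fields) == 2:          # undefined symbol: type, name
            kind, name = fields
            symbols[name] = (None, None, kind)
    return symbols


# Reconstructed sample, assuming the foo.o shown earlier in this thread.
SAMPLE = """\
000000000000001b 0000000000000018 T foo
0000000000000000 000000000000001b T main
                 U printf
"""

if __name__ == "__main__":
    for name, (value, size, kind) in parse_nm_sizes(SAMPLE).items():
        print(name, kind, value, size)
```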
nm shows the values of symbols. Some symbols in a library or object file may show up as zero simply because they haven't been given a value yet. They'll get their actual value at link time.
Some symbols are code symbols, some are data, etc. Before linking the symbol value is often the offset in the section it resides in,