I am trying to perform loop bound analysis on ARMv7-M code using Z3, as part of a larger framework.
I would like to find the memory addresses that are used by a certain function inside an .elf file.
For example, in a function foo() I have the basic block below:
ldr r1, [r3, #0x20]
strb r2, [r3, #6] {__elf_header}
str r2, [r3, #0x24] {__elf_header}
str r2, [r3, #0x20] {__elf_header}
mov r3, r1
cmp r1, #0
bne #0x89f6
How can I get the initial memory location used by this function, [r3, #0x20]? Are there memory segments for every function to access, or is it random?
Given that the above basic block is a loop, is there a way to know the memory addresses that will be used during its execution?
Does the compiler, for example, save a range of memory addresses from 0x20 to 0x1234 to be accessed only during the execution of such a basic block? In other words, is there a map between a function and the range of memory addresses used by it?
It is unclear what you are asking. First off, why would any linker put effort into randomizing things? Perhaps there is one that intentionally makes its output non-repeatable. But a linker is just a program, and normally it will do things like process the items on the command line in order, and then process each object from beginning to end. Not random.
So far the rest of this seems pretty straightforward: just use the tools. Your comment implies GNU tools? Since this is in part tool-specific you should have tagged it as such, as you cannot really make generalizations across all toolchains ever created.
unsigned int one ( void )
{
return(1);
}
unsigned int two ( void )
{
return(2);
}
unsigned int three ( void )
{
return(3);
}
arm-none-eabi-gcc -O2 -c so.c -o so.o
arm-none-eabi-objdump -d so.o
so.o: file format elf32-littlearm
Disassembly of section .text:
00000000 <one>:
0: e3a00001 mov r0, #1
4: e12fff1e bx lr
00000008 <two>:
8: e3a00002 mov r0, #2
c: e12fff1e bx lr
00000010 <three>:
10: e3a00003 mov r0, #3
14: e12fff1e bx lr
as shown they are all in .text, simple enough.
arm-none-eabi-gcc -O2 -c -ffunction-sections so.c -o so.o
arm-none-eabi-objdump -d so.o
so.o: file format elf32-littlearm
Disassembly of section .text.one:
00000000 <one>:
0: e3a00001 mov r0, #1
4: e12fff1e bx lr
Disassembly of section .text.two:
00000000 <two>:
0: e3a00002 mov r0, #2
4: e12fff1e bx lr
Disassembly of section .text.three:
00000000 <three>:
0: e3a00003 mov r0, #3
4: e12fff1e bx lr
and now each function has its own section name.
So the rest relies heavily on linking, and there is no one linker script. You, the programmer, choose one directly or indirectly, and how the final binary (elf) is built is a direct result of that choice.
If you have something like this
.text : { *(.text*) } > rom
and nothing else with respect to these functions, then all of them will land in this definition. The linker script, or other instructions to the linker, can say something else, causing one or more functions to land in their own space.
arm-none-eabi-ld -Ttext=0x1000 so.o -o so.elf
arm-none-eabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000001000
arm-none-eabi-objdump -d so.elf
so.elf: file format elf32-littlearm
Disassembly of section .text:
00001000 <one>:
1000: e3a00001 mov r0, #1
1004: e12fff1e bx lr
00001008 <two>:
1008: e3a00002 mov r0, #2
100c: e12fff1e bx lr
00001010 <three>:
1010: e3a00003 mov r0, #3
1014: e12fff1e bx lr
and then of course
arm-none-eabi-nm -a so.elf
00000000 n .ARM.attributes
00011018 T __bss_end__
00011018 T _bss_end__
00011018 T __bss_start
00011018 T __bss_start__
00000000 n .comment
00011018 T __data_start
00011018 T _edata
00011018 T _end
00011018 T __end__
00011018 ? .noinit
00001000 T one <----
00000000 a so.c
00080000 T _stack
U _start
00001000 t .text
00001010 T three <----
00001008 T two <----
which is simply because there is a symbol table in the file
Symbol table '.symtab' contains 22 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 00000000 0 NOTYPE LOCAL DEFAULT UND
1: 00001000 0 SECTION LOCAL DEFAULT 1
2: 00000000 0 SECTION LOCAL DEFAULT 2
3: 00000000 0 SECTION LOCAL DEFAULT 3
4: 00011018 0 SECTION LOCAL DEFAULT 4
5: 00000000 0 FILE LOCAL DEFAULT ABS so.c
6: 00001000 0 NOTYPE LOCAL DEFAULT 1 $a
7: 00001008 0 NOTYPE LOCAL DEFAULT 1 $a
8: 00001010 0 NOTYPE LOCAL DEFAULT 1 $a
9: 00001008 8 FUNC GLOBAL DEFAULT 1 two
10: 00011018 0 NOTYPE GLOBAL DEFAULT 1 _bss_end__
11: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __bss_start__
12: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __bss_end__
13: 00000000 0 NOTYPE GLOBAL DEFAULT UND _start
14: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __bss_start
15: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __end__
16: 00001000 8 FUNC GLOBAL DEFAULT 1 one
17: 00011018 0 NOTYPE GLOBAL DEFAULT 1 _edata
18: 00011018 0 NOTYPE GLOBAL DEFAULT 1 _end
19: 00080000 0 NOTYPE GLOBAL DEFAULT 1 _stack
20: 00001010 8 FUNC GLOBAL DEFAULT 1 three
21: 00011018 0 NOTYPE GLOBAL DEFAULT 1 __data_start
but if
arm-none-eabi-strip so.elf
arm-none-eabi-nm -a so.elf
arm-none-eabi-nm: so.elf: no symbols
arm-none-eabi-objdump -d so.elf
so.elf: file format elf32-littlearm
Disassembly of section .text:
00001000 <.text>:
1000: e3a00001 mov r0, #1
1004: e12fff1e bx lr
1008: e3a00002 mov r0, #2
100c: e12fff1e bx lr
1010: e3a00003 mov r0, #3
1014: e12fff1e bx lr
The elf file format is fairly simple; you can easily write code to parse it, and you do not need a library or anything like that. With simple experiments like these you can easily understand how these tools work.
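To give a feel for how little code that takes, here is a minimal sketch in Python that decodes the fixed 52-byte ELF32 file header using only the standard library. The function name parse_elf32_header is made up for this illustration, and the sketch only handles 32-bit files; it is a starting point for experiments, not a complete parser.

```python
import struct

def parse_elf32_header(data: bytes) -> dict:
    """Decode the fixed 52-byte ELF32 file header found at the start of data."""
    if len(data) < 52 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if data[4] != 1:
        raise ValueError("only ELF32 (EI_CLASS == 1) is handled in this sketch")
    # EI_DATA selects the byte order: 1 = little endian, 2 = big endian.
    endian = {1: "<", 2: ">"}.get(data[5])
    if endian is None:
        raise ValueError("unknown EI_DATA value")
    # The 16-byte e_ident array is followed by 13 fixed-width fields.
    fields = struct.unpack_from(endian + "HHIIIIIHHHHHH", data, 16)
    names = ("e_type", "e_machine", "e_version", "e_entry", "e_phoff",
             "e_shoff", "e_flags", "e_ehsize", "e_phentsize", "e_phnum",
             "e_shentsize", "e_shnum", "e_shstrndx")
    return dict(zip(names, fields))
```

Reading so.elf from disk and calling parse_elf32_header(open("so.elf", "rb").read()) gives, among other things, e_shoff and e_shnum, which is where you would go next to walk the section headers and reach the symbol table.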
How can I get the initial memory used by this function ?
Assuming you mean the initial address of the function, and assuming it is not relocated: you just read it out of the file. Simple.
Are there memory segments for every function to access or is it random ?
As demonstrated above, the command line option you mention later in a comment (it should have been in the question; you should edit the question for completeness) does exactly that: it makes a custom section name per function. (What happens if you have the same non-global function name in two or more objects? You can easily figure this out on your own.)
Nothing is random here. You would need a reason to randomize things, for security or otherwise; it is more often preferred that a tool outputs the same or at least similar results each time with the same inputs (some tools will embed a build date/time in the file and that may vary from one build to the next).
If you are not using GNU tools, binutils may still be very useful for parsing and displaying elf files anyway.
arm-none-eabi-nm so.elf
00011018 T __bss_end__
00011018 T _bss_end__
00011018 T __bss_start
00011018 T __bss_start__
00011018 T __data_start
00011018 T _edata
00011018 T _end
00011018 T __end__
00001000 T one
00080000 T _stack
U _start
00001010 T three
00001008 T two
nm so.elf (x86 binutils not arm)
00001000 t $a
00001008 t $a
00001010 t $a
00011018 T __bss_end__
00011018 T _bss_end__
00011018 T __bss_start
00011018 T __bss_start__
00011018 T __data_start
00011018 T _edata
00011018 T _end
00011018 T __end__
00001000 T one
00080000 T _stack
U _start
00001010 T three
00001008 T two
Or you can build with clang and examine with the GNU tools, etc. Obviously disassembly with x86 tools won't work on ARM code, but some of the tools will.
If this is not what you were asking then you need to re-write your question or edit it so we can understand what you are actually asking.
Edit
I would like to know if there is a map between a function and the range of memory address used by it ?
In general, no. The term function implies, but is not limited to, high-level languages like C. The machine code clearly has no clue, nor should it, and well-optimized code does not necessarily have a single exit point from the function, much less a return marking the end. For architectures like the various ARM instruction sets, the return instruction is not the end of the "function"; there is pool data that may follow.
But let's look at what gcc does.
unsigned int one ( unsigned int x )
{
return(x+1);
}
unsigned int two ( void )
{
return(one(2));
}
unsigned int three ( void )
{
return(3);
}
arm-none-eabi-gcc -O2 -S so.c
cat so.s
.cpu arm7tdmi
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 1
.eabi_attribute 30, 2
.eabi_attribute 34, 0
.eabi_attribute 18, 4
.file "so.c"
.text
.align 2
.global one
.arch armv4t
.syntax unified
.arm
.fpu softvfp
.type one, %function
one:
# Function supports interworking.
# args = 0, pretend = 0, frame = 0
# frame_needed = 0, uses_anonymous_args = 0
# link register save eliminated.
add r0, r0, #1
bx lr
.size one, .-one
.align 2
.global two
.syntax unified
.arm
.fpu softvfp
.type two, %function
two:
# Function supports interworking.
# args = 0, pretend = 0, frame = 0
# frame_needed = 0, uses_anonymous_args = 0
# link register save eliminated.
mov r0, #3
bx lr
.size two, .-two
.align 2
.global three
.syntax unified
.arm
.fpu softvfp
.type three, %function
three:
# Function supports interworking.
# args = 0, pretend = 0, frame = 0
# frame_needed = 0, uses_anonymous_args = 0
# link register save eliminated.
mov r0, #3
bx lr
.size three, .-three
.ident "GCC: (GNU) 10.2.0"
we see this is being placed in the file, but what does it do?
.size three, .-three
One reference says this is used so that the linker can remove the function if it is not used, and I have seen that feature in play, so good to know (you could have looked this up just as easily as I did). The directive itself records the size of the symbol, here the current location minus the address of three, in the object's symbol table.
So in that context the info is there and you can extract it (lesson for the reader).
And then if you use this gcc compiler option that you mentioned
-ffunction-sections
Disassembly of section .text.one:
00000000 <one>:
0: e2800001 add r0, r0, #1
4: e12fff1e bx lr
Disassembly of section .text.two:
00000000 <two>:
0: e3a00003 mov r0, #3
4: e12fff1e bx lr
Disassembly of section .text.three:
00000000 <three>:
0: e3a00003 mov r0, #3
4: e12fff1e bx lr
[ 4] .text.one
PROGBITS 00000000 000034 000008 00 0 0 4
[00000006]: ALLOC, EXEC
[ 5] .rel.text.one
REL 00000000 0001a4 000008 08 12 4 4
[00000040]: INFO LINK
[ 6] .text.two
PROGBITS 00000000 00003c 000008 00 0 0 4
[00000006]: ALLOC, EXEC
[ 7] .rel.text.two
REL 00000000 0001ac 000008 08 12 6 4
[00000040]: INFO LINK
[ 8] .text.three
PROGBITS 00000000 000044 000008 00 0 0 4
[00000006]: ALLOC, EXEC
[ 9] .rel.text.three
REL 00000000 0001b4 000008 08 12 8 4
[00000040]: INFO LINK
That is giving us a size of the sections.
In general, for compiled and especially assembled software, assume that a function doesn't have boundaries. As you can see above, the function one is inlined into the function two, invisibly, so how big is an inlined function within another function? How many instances of a function are there in a binary? Which one do you want to monitor and know the size of, performance of, etc.? GNU has this feature with gcc; you can check whether it exists for other languages or tools. Assume the answer is no, and if you happen to find a way, then that is good.
Does the compiler save a memory segment to be only accessed by a certain function ?
I have no idea what this means. The compiler doesn't make memory segments; the linker does. How the binary is put into a memory image is a linker thing, not a compiler thing, for starters. Segments are just a way to communicate between tools that these bytes are code (read-only ideally), initialized data, or uninitialized data, perhaps extending to read-only data and then whatever types you make up yourself.
If your ultimate goal is to find the bytes that represent the high-level concept of a "function" in memory (assuming no relocation, etc.) by looking at the elf binary using the GNU toolchain, that is possible in theory.
The first thing we appear to know is that the OBJECT contains this information so that a linker feature can remove unused functions for size. But that does not automatically mean that the output binary from the linker also includes this information. You need to find where this .size lands in the object and then look for that in the final binary.
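For the small so.elf example, the symbol table listing shown earlier already carries a Size column with 8 for each FUNC entry, so in that case the information did survive linking. Below is a rough sketch in Python that turns lines in that listing's layout into a map from function name to address range. The helper names function_ranges and owner_of are invented for this illustration, and the column layout is assumed to match the listing above exactly. Clearing bit 0 of the value is there because Thumb function symbols on ARM carry the Thumb bit in their address; it has no effect on the ARM-mode example.

```python
def function_ranges(symtab_text: str) -> dict:
    """Map each FUNC symbol to its half-open address range (start, end)."""
    ranges = {}
    for line in symtab_text.splitlines():
        parts = line.split()
        # Assumed columns: Num: Value Size Type Bind Vis Ndx Name
        if len(parts) != 8 or parts[3] != "FUNC":
            continue
        start = int(parts[1], 16) & ~1   # drop the Thumb bit if present
        size = int(parts[2], 0)          # accepts decimal or 0x-prefixed sizes
        ranges[parts[7]] = (start, start + size)
    return ranges

def owner_of(address: int, ranges: dict):
    """Return the name of the function whose range contains address, or None."""
    for name, (start, end) in ranges.items():
        if start <= address < end:
            return name
    return None
```

With the three FUNC lines from the listing, owner_of(0x100c, ranges) names two. Keep in mind the caveats from this answer: inlined copies and literal pool data are not described by these ranges, and a stripped binary has no symbol table to read.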
The compiler turns one language into another, often from a higher level to a lower level, but not always; it depends on the compiler and the input/output languages. C to assembly, or C to machine code, or what about Verilog to C++ for a simulation: is that higher or lower? The terms .text, .data, and .bss are not part of the language; they are more of a habit based on learned experience, and as mentioned they help communicate with the linker so that the output binaries can be more controlled for various targets. No generalities can be made in this area across all tools and languages, or even across all C or C++ tools, but as shown above, with gcc all the code for all the functions in the source file lands in one .text section by default. You have to do extra work to get something different. So the compiler in general does not make a "segment" or "memory segment" for each function. You have already solved your problem, it seems, by using a command line option that puts every function into its own section, and now you have a lot more control over size, location, etc.
Just use the file format and/or the tools. This question, or series of questions, boils down to: just go look at the elf file format. This is not a Stack Overflow question, as questions seeking recommendations for external information are not for this site.
Does the compiler for example save a memory location address from 0x20 to 0x1234 to be only accessed during the execution of such basic block ? In other words, is there a map between a function and the range of memory address used by it ?
"Save"? The compiler does not link; the linker links. Is that memory "only" accessed during the execution of that block? In pure textbook theory, yes, but in reality branch prediction, prefetch, or cache line fills can also access that "memory".
Unless you are doing self-modifying code or using the MMU in interesting ways, you do not re-use an address space for more than one function within an application. In general. So function foo() is implemented somewhere and bar() somewhere else. In hand-written asm from the good old days you might have foo() branch right into the middle of bar() to save space, get better performance, make the code harder to reverse engineer, or whatever. But compilers are not that efficient; they do their best to turn concepts like functions into code that is first functional (equivalent to the high-level code) and then, if desired, smaller or faster or both relative to a straight brute-force conversion between languages. So barring inlining, tail (leaf? I call it tail) optimizations, and such, one could say there are some number of bytes at some address that define a compiled function. But due to the nature of processor technology you cannot assume those bytes are accessed by the processor/chip/system busses only when executing that function.
I am trying to build the RegFS sample to better understand the Windows Projected File System. My code is building without a warning, but I am getting dynamic linking errors. Below is a sample error, with the code causing it right below.
"The procedure entry point PrjWritePlaceholderInfo could not be located in the dynamic link library."
HRESULT VirtualizationInstance::WritePlaceholderInfo(
LPCWSTR relativePath,
PRJ_PLACEHOLDER_INFO* placeholderInfo,
DWORD length
) {
return PrjWritePlaceholderInfo(
_instanceHandle,
relativePath,
placeholderInfo,
length);
}
I'm sure I did something wrong when I was linking. Under [Project Property Pages] > Linker > Input, I prepended "ProjectedFSlib.lib" to "Additional Dependencies."
This is my first time using Visual Studio with libraries not linked in by default, and I've been unable to find instructions on how to locate and link libraries within the Windows SDK.
Thanks for your help!
EDIT:
The DUMPBIN output is:
Dump of file ProjectedFSLib.lib
File Type: LIBRARY
Exports
ordinal name
PrjAllocateAlignedBuffer
PrjClearNegativePathCache
PrjCloseFile
PrjCommandCallbacksInit
PrjCompleteCommand
PrjConfigureVolume
PrjConvertDirectoryToPlaceholder
PrjCreatePlaceholderAsHardlink
PrjDeleteFile
PrjDetachDriver
PrjDoesNameContainWildCards
PrjFileNameCompare
PrjFileNameMatch
PrjFillDirEntryBuffer
PrjFreeAlignedBuffer
PrjGetOnDiskFileState
PrjGetVirtualizationInstanceIdFromHandle
PrjGetVirtualizationInstanceInfo
PrjMarkDirectoryAsPlaceholder
PrjOpenFile
PrjReadFile
PrjStartVirtualizationInstance
PrjStartVirtualizationInstanceEx
PrjStartVirtualizing
PrjStopVirtualizationInstance
PrjStopVirtualizing
PrjUpdateFileIfNeeded
PrjUpdatePlaceholderIfNeeded
PrjWriteFile
PrjWriteFileData
PrjWritePlaceholderInfo
PrjWritePlaceholderInformation
PrjpReadPrjReparsePointData
Summary
D8 .debug$S
14 .idata$2
14 .idata$3
8 .idata$4
8 .idata$5
14 .idata$6
A DUMPBIN of the executable imports results in:
Dump of file regfs.exe
File Type: EXECUTABLE IMAGE
Section contains the following imports:
PROJECTEDFSLIB.dll
14006D2A0 Import Address Table
14006D9E0 Import Name Table
0 time date stamp
0 Index of first forwarder reference
1E PrjWritePlaceholderInfo
1D PrjWriteFileData
19 PrjStopVirtualizing
17 PrjStartVirtualizing
C PrjFileNameMatch
D PrjFillDirEntryBuffer
E PrjFreeAlignedBuffer
0 PrjAllocateAlignedBuffer
11 PrjGetVirtualizationInstanceInfo
12 PrjMarkDirectoryAsPlaceholder
B PrjFileNameCompare
KERNEL32.dll
14006D098 Import Address Table
14006D7D8 Import Name Table
0 time date stamp
0 Index of first forwarder reference
389 IsProcessorFeaturePresent
382 IsDebuggerPresent
466 RaiseException
1B1 FreeLibrary
BA CreateDirectoryW
116 DeleteFileW
59A TerminateProcess
4BD RemoveDirectoryW
621 WriteFile
C2 CreateFile2
86 CloseHandle
267 GetLastError
3F2 MultiByteToWideChar
21D GetCurrentProcess
57B SetUnhandledExceptionFilter
5BC UnhandledExceptionFilter
4E1 RtlVirtualUnwind
4DA RtlLookupFunctionEntry
4D3 RtlCaptureContext
477 ReadFile
2B5 GetProcAddress
5DD VirtualQuery
2BB GetProcessHeap
60D WideCharToMultiByte
450 QueryPerformanceCounter
21E GetCurrentProcessId
2F0 GetSystemTimeAsFileTime
36C InitializeSListHead
352 HeapFree
34E HeapAlloc
27E GetModuleHandleW
2D7 GetStartupInfoW
222 GetCurrentThreadId
ADVAPI32.dll
14006D000 Import Address Table
14006D740 Import Name Table
0 time date stamp
0 Index of first forwarder reference
299 RegQueryValueExW
293 RegQueryInfoKeyW
28C RegOpenKeyExW
27D RegEnumValueW
27A RegEnumKeyExW
25B RegCloseKey
281 RegGetValueW
ole32.dll
14006D438 Import Address Table
14006DB78 Import Name Table
0 time date stamp
0 Index of first forwarder reference
2A CoCreateGuid
MSVCP140D.dll
14006D228 Import Address Table
14006D968 Import Name Table
0 time date stamp
0 Index of first forwarder reference
A5 ??1_Lockit@std@@QEAA@XZ
6D ??0_Lockit@std@@QEAA@H@Z
296 ?_Xlength_error@std@@YAXPEBD@Z
297 ?_Xout_of_range@std@@YAXPEBD@Z
VCRUNTIME140D.dll
14006D360 Import Address Table
14006DAA0 Import Name Table
0 time date stamp
0 Index of first forwarder reference
3C memcpy
3D memmove
1 _CxxThrowException
E __CxxFrameHandler3
36 _purecall
3B memcmp
21 __std_exception_copy
22 __std_exception_destroy
8 __C_specific_handler
9 __C_specific_handler_noexcept
25 __std_type_info_destroy_list
2E __vcrt_GetModuleFileNameW
2F __vcrt_GetModuleHandleW
31 __vcrt_LoadLibraryExW
ucrtbased.dll
14006D498 Import Address Table
14006DBD8 Import Name Table
0 time date stamp
0 Index of first forwarder reference
2B6 _register_thread_local_exe_atexit_callback
B5 _configthreadlocale
2CE _set_new_mode
4D __p__commode
11D _free_dbg
52C strcpy_s
528 strcat_s
68 __stdio_common_vsprintf_s
2C2 _seh_filter_dll
B6 _configure_narrow_argv
171 _initialize_narrow_environment
172 _initialize_onexit_table
9F _c_exit
E5 _execute_onexit_table
C2 _crt_atexit
C1 _crt_at_quick_exit
54B terminate
39C _wmakepath_s
3B8 _wsplitpath_s
564 wcscpy_s
A4 _cexit
48D getchar
60 __stdio_common_vfwprintf
35 __acrt_iob_func
4 _CrtDbgReport
567 wcslen
176 _invalid_parameter
4B __p___wargv
49 __p___argc
2CB _set_fmode
EA _exit
450 exit
175 _initterm_e
174 _initterm
13E _get_initial_wide_environment
173 _initialize_wide_environment
B7 _configure_wide_argv
5B __setusermatherr
2C6 _set_app_type
561 wcscmp
5 _CrtDbgReportW
4D8 malloc
2B5 _register_onexit_function
A1 _callnewh
2C3 _seh_filter_exe
Summary
1000 .00cfg
1000 .data
2000 .idata
1000 .msvcjmc
5000 .pdata
17000 .rdata
1000 .reloc
1000 .rsrc
37000 .text
18000 .textbss
As is evident, the executable imports all the necessary functions from PROJECTEDFSLIB.dll.
Either add ProjectedFSLib.lib to your libraries or add a:
#pragma comment(lib, "ProjectedFSLib.lib")
line in your code. Also, make sure you are using version 10.0.17763.0 of the SDK. If you are using mingw it would not surprise me if this library has not been made available yet.
The Projected FS is still an optional feature of Windows that requires manual installation to use. Go to Control Panel -> Programs and Features -> Turn Windows Features on or off. In that list of optional features, scroll down to "Windows Projected File System" and make sure that it is enabled there. Only after that is done will you have a ProjectedFSLib.dll show up in your system32 directory.
It's also probably worth noting that it looks like there's only an x64 version of this DLL, so if you're building an x86 program, that might be the reason why you're unable to dynamically link with that DLL.
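Since the import library lists PrjWritePlaceholderInfo and the link succeeds, one thing worth checking is whether the DLL actually found at run time exports every name the executable imports. Here is a rough sketch in Python, assuming you have saved the text of DUMPBIN /IMPORTS for regfs.exe and DUMPBIN /EXPORTS for the ProjectedFSLib.dll on the target machine. The helper names are invented for this illustration, and matching on the "Prj" prefix is a shortcut based on the listings in the question rather than a general parser for DUMPBIN output.

```python
import re

def prj_names(dumpbin_text: str, prefix: str = "Prj") -> set:
    """Collect every identifier in the dump that starts with the given prefix."""
    return set(re.findall(r"\b" + re.escape(prefix) + r"\w+", dumpbin_text))

def missing_imports(imports_text: str, exports_text: str) -> list:
    """Names the executable imports that the DLL dump does not list as exports."""
    return sorted(prj_names(imports_text) - prj_names(exports_text))
```

An empty result means every imported Prj function is exported by that DLL; any name that does show up is a candidate for the "entry point could not be located" error.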
I'm trying to parse some text information from within the project files of an Adobe program (Adobe Premiere Pro). The project files are gzip-compressed XML files. These contain base64-encoded fields which hold further information about a component of the project. I'm trying to look at the information within these fields.
I can't for the life of me identify the compression scheme used here. It doesn't look like GZip as it doesn't have the appropriate headers. Can anyone help?
An example of a base64 encoded field is:
AQAAAAAAAACk1QAAAAAAAENvbXByZXNzZWRUaXRsZQB4nO0da1MbSa5/iuu+L34Eg6maY4sQ2OWWhBRmk/hTytiGeM+xfbZJYH/83enRPf2cGfMIMyZdFOCR1Gp1t6TWSOPp//03Eb+KW/FVTERNfBMjsRBLMRYzMRX/FP8QTbElGvC/BpipGAB8CNipuCbsn+JCHItfgGqHaH4V+yIRB0AzE5fQ4rM4h08zsfLgF8BlBX2OCPPB6ndftKDPhujA7y78toCi7tEk4i1x5uvPogv4FV2jdEuieA899MUdYI+BaiG+w9UCpNgHuoW4gRbIOZsqAa4ruloBVvUzgOsRzce+uALsBHpjTkXUKNEinZcJ/OzD+FgGF56II+C1JOgJtWZKF4p0U5AWe7yCn3wJ82lRvhnQroIS+hhTxjOYz1VASgVHaPGK1XO0BMe7JHnHhF+IU1q3GfWwBP37kKm/2N6kdq+PaD5GwH1F/VwBPMSvJXmZ9BfAZU4yqrGHcCw/Ws/AWxUTg9dFkrnSv4G2X2kNlax5ss/FJ1jNE/h7BFy6AGuKPbI57NvHYosefD4R7wDG9GydTG/jkLorDkFLjuDnHaznObVgWh+TwNovQMbvMI4vNJKFHK2ydIU/B8g10OB48XpKPItau/Ppz5U7mweklaxleD2SutkFGlwJrZUH1FufsHh9JS1gS45XwxL4O4Pxm1gFSagvHhnj9whvQhOQZUZyzUB6k8qGsxfKk1NhD4xZyh5HQ7RzR6LweWNprzmadsF4siSur7V6r4FmIP4t9eWGLC5kJdvSSrogax9oEH9Oox9Rn7gLoj1vg9R7YAUsYT5tAl7Q98VnUvfmAEUfqDzhOVzjf2w5pR3pK7UzR8nWlN0O8aERI+YNSdgHyATa4p51l/rYopGGR1fP5WnjLuDqluxwlHrzuWWvTyNDcT+/AawPsC/kec9AX/6SvsPmxPozovk7pBbaqy/Ff4CqT7KgDGGqBDjPaJUU7BA4z0gTcE30+iCPfFqf1xn9nRby0XRo5YgZiY/wf0i+cz+1dB+TgAaOyZZwzx3LK9UijEvAJ49JlteA+xs+s6W5VpBFlRijOSQrmBLuGGgmpFu8Mq8gRtwCSWpP9Il91To9J6ArE9KqOa3Nc0u5fu9IfT9tV9ZzSJ4NVyhEZXontXf00zsEFQ+5UPb+t2QlU5IL929XL8I0qFUzis4/yvHYsbyPTch60A/iLJ1K7V5CfOP2mE2XxaO3Jo+evP+Z0B44lfEHWonaVzny9fEJ3VFcUuSM+4vpWT/KsfL+m0WVkE3OifeAbPOCaLu0fwzSVWnSPRfyWo8+i6/2eubYimjVOJW22leunrFuvNtw7TNbPkz3ijlEzXuc5oU17TKd7WrooLujRi18aVqYpW9LyXUquWye/sU9eDN00NW0+poxYv59WDhrp+7B89vWAneC3BrvFXdSHjy+Lv0dSKsZw73GiOLyHTnjGobzcCt+h886Y7ENkTfPiYth6j9ICp2ltWFob5wPVTPYpThc5VnY2rIp1rdr7Lcr7z9x/nfT/IwNdylD+YkQhWo3Ia+wssZrQpnuT7qHwlnQmuP3EaLi9uY9qLkC9r0pt2J7932BiUsAO4Y7TcxfD1PZbRjTnJD2s62YdDacaa9I3yfgm/vkNfYpOzUmuregf6qtT6faz2jWjgmOd253QV6aTx69zdP0Qmgt1yQD5itcbra/MjHaQ7S9Vqb3uKJZ6VMu4Jr84h21QRvrUCVnDzxcK+Xh07JX8e21XuAJHusrdqOviL4i+oroK9bwFZ3oK6KviL7ip/EVRTSqNqxG6V7XYG3fBO5LNMVraaF5d0IrY64a8oe1Vs/TnKpE41SyN1RDRT3tC1UnUzX7YkpdI+LKAXoDdUccsoUQXUJPjPAo2afup7PAvtHFJsQbJcA6PeJxDVeUAVDWkoXPatsraNuTknINF7Mcn+FqRj6ScwfLlEMRVUKrsSDMVFr0inRQzXwYez9v
cwSrdUk+fUIj+ZJmK7pytzkne/T94n1a4tNAqsLMle0+ZRC0bbLmosUh7yJqpPG1HuHHpDnXqSyncn1CFtGWFuG2sfdff+/lJ56wR7272LBEcP18YnAwIQntzfwUlDuvJiYh6Y/IZ4zkHjRMnxFAzfFlt7ndvz3Wp12o+1xSHgXajo07o7lxJcuiSig7xNb1u9zzcOXc9llUPPdY51mR91b+dwfWGz19J10Nnyarb/OpjBM5j7cCny5s0q6/DVxxD1ERwP24mM9tIO4tfEZ4k+qzP+4vSxrqGTEhzXicpTTXsJVtuYM/l7WYFYcq2YopV1mWsrsxltIAbX7M7/0twYeqnYYjMi1/3g7UzHxCrObsge9AG77KZ1XNJzCXXks3vjAxvG+G4PeX4FXpEmyXLkG7dAl2Spdgt3QJOqVLsFe6BM302fQyZWhWQIbyPWOzAr6xWQHv2Mz1jy3KuWzDXLUp6us8i0y/5OqoK9Pes9luoxL2i1JUwYYb9F2tashRBVtGOapgzw36lkY15Cg/8mE5yo9/WI7yoyCWoxqxUDXioUZFYqJGReKiRkVio0ZF4qNGQYz0EDlsuJmVMHPmLHFWZUn5tJ+7srQrYmUpVpZiZSlWlpgqVpZiZSlWlmJlKVaWfmxcXP4dQvn3BuVnWcrPr5SfWSk/p1KNbEr5MlQhi1K+Z6xG9qR877hOZakts5F7IG9VKktapvaz2W6sLMXKUqwsxcpSrCw9Vca+Gv60GjFRrCzFytJ9KkvKl/zclSX1rFOsLMXKUqwsxcpSrCzFylKsLMXKUqws/di4uPw7hPLvDcrPspSfXyk/s1J+TqUa2ZTyZahCFqV8z1iN7En53nG9ytIOxRuvnkmm9SpLSqbms9lurCzFylKsLMXKUqwsPVXGvhr+tBoxUawsxcqSX1kyr5feGHxIXlVpICVAfPh8qDBFQnODObcbgtsjD+OwN8wVYc3lzHov4LbsKYy155jfcGnm2XxsQjnIsVXV0HnMEM5swXk6u06QhcVzlbjKwSc6Xgg+uU290TOMdVu9FuYpba1gW5tGczjwKhYhTEL6iyeSTqjuwu8ANSuC2Xjd1yFo3SDYl4nhk5z6VA/DypCpvQ250ll43Zebj/XhCZ3SxBqj3kNZo1Uakv6hnqvdBPVwaPBSdbRr+szneKmV1hA+m+TGmiV1nUAPmHtcpDh9jVe+bOtJ3Nw4iVsbJ/GrjZN425G4RU8BZEut8VmSa4rHSq99o++LwjjTG2kf7Poh0ztraOhUUrdl+ORSjX8vxtKHuGe0+BRmO398YZw9Ptyd5nIHckeocW4bnLWxMN8jnY0322LkoN56nTU6k8Zs+5b2UD7z0txhwxS8L7Nuhd5M7WLdGOfxcUr4/Tn3i1jMN+jGuCXGLdWMW1pU28339+1Cf6/OBI4xTIxhYgwTY5gYw1Qvhtl7QAzzsKyL2tN+bPQSY5efPXb5MTsTZsZ3Y8yyURJvfsxStsQxTolxSplxivuN8ofkWtQJYTHXEuOVKsYrMdcS45aXFrfEXEuMYWIME34DTcy1xNjlpcQuMdcSY5aXErOULXGMU2KcUm6cYn/H6iG5ls6DopWYa4nxSsy1xLglxi0x1xJjmBjDPEUM03xADBNzLTF2qW7sEnMtMWZ5KTFL2RLHOCXGKQ+NU2wIf//5QpgxQwgWimjUOyEe/k0hG/6O9hmNV2+M8Tm4lBzd4B6l3tNrvpdV52eyKBJa1280whXZnE3pv280nxrfOXhDEQNLOJGW4fLJojLbq3cSuhoSpkANGVAfI/HJGKF6x08IZ7cJ0do0vRy+vQy+vSAt68A47cUcnwnXVL0Mqp6ctQnZ2TQw9jDObtMF+KWBbXgtXQpXhz+BJ0SrxRUJR+Q2hdm/P69hnN2mSGafwpW5Vyhzz5HZxuo3ufot9RsmUaI5eZ5uqgHmOH2c289BSvUNtOgPgNxl9BqmNCP3EN6N4cM87Aj6Qu61rmVnUbljCnnxEAW+
ExRnkve9pYwFVun4s7CIsz350/n2/CcTo2+Pvj369ujbo2/fRN+eXwmNvj369ujbo2+Pvv15fbsL40zNKXiSa/JjTGNfn4KMd0LnevQ1Xr2heftO7/C7TnsJQVXPIdghzfdYjlhfvQc+MzGXHvcuuBupN2l+pyznUHykvufe7Pr4RNZn9LlMnE/dB54t+N2i74lzBop1GTWhQ3cuW0DRkdWeOznqEDczI4o82tBuC1o206qD4qvOelDczJZoF5hxs7OLLhTpZrD+fG7RKcFxjV0Pm02VxcH1vtlUmJucwagOBVahfqP1++6tRZgmIT9zCb5jRtlFs7r0Ua4e20kWFe7naMMTsj5ciQui7VJOfJDOVDONL9ajz+KrK1oT2slXOTztc5myddy2AG6rMqr42X5XZQiW1ZptTa1XyJawFlSDq0vxl/Qk6iy1JmHmaYul4PNfbArsR50zZY9rTBqSd77UCqTTZ1TtgeW9AkvZpco2r/qtdYbVJ2lvIWvVuOIzpr6lEvIOpObJ1dlsusSasVPHKluy1zwaxOfNG1IcUVV3In0MtjszOGrtukytE7mu2yqhfY8jB32Cy0B+QlmQW4gGNeucrHCRjudfMBbWTheTUDwypRlAyAp0lWVkax5K7V2Q/XHPB2IleVxS7KtmZD3KmuwR4/UbkllruvYdHInrUwN3Ca/WOAv7ieKDRVrXUXVrxrEOrjJp1Iqr8/y4RnRFq25T1h893mbBeHdyx7tTynjrT6grK4o61P3jfrqPuDDXS9pQO06pavSyQ88wb8HfbXkfrqMMPFOKfWVT3h/fJ3pBvh1ojc8Zxfglxi+bE7+0CuMX9RzM08Qv/DTelnwmLz9+8e21SvHL5kYvpk6vG7kMBT4DcQWeYkyfb+DzF/g/rHg8w2/qz9vhO7k7fOcF7vBT+Vn39tgdX8PNLIZ+wob3ghvQSpZJX9foiS7+7N4xzWkOBtDTBfziLuWfTelTJGRnfemjdUbMhSaWzWDfNfAFyB13Qx0bqjnSMhePoFWREbRyRlDPWZ+6k82qe9muDwDp0qdjMTbsynyiOu/MyivZyvUa9UwMzthhmkXE+0fOnttQRcd2YtdMbAxCsmVnvB6lmoEZ2RZDToAWs/dsc3xehqZRM3kghrRLjMRnqk3MqP3/AUeAlD0=
It's a Zlib-compressed string with a 32-byte uncompressed header.
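using System;
using System.Text;
using Ionic.Zlib; // assumed: ZlibStream here is the DotNetZip class

var encoding = @""; // paste the base64 field here
byte[] data = Convert.FromBase64String(encoding);
// Skip the 32-byte uncompressed header; the zlib stream starts right after it.
var compressedArray = new byte[data.Length - 32];
Array.Copy(data, 32, compressedArray, 0, data.Length - 32);
var decompressed = ZlibStream.UncompressBuffer(compressedArray);
var str = Encoding.Unicode.GetString(decompressed);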
The header contains the uncompressed length of the data in little-endian order at offset 8: a4 d5, or 0xd5a4, which equals 54692. It is hard to tell from this example whether the uncompressed length is stored as a4 d5, a4 d5 00 00, or a4 d5 00 00 00 00 00 00.
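The same steps can be sketched with nothing but the Python standard library. This is a minimal sketch under the assumptions stated in this answer: a 32-byte header, a zlib stream after it, and UTF-16 little-endian text inside. The function name decode_field is invented here, and reading the length as a 32-bit value is a guess, since the sample cannot distinguish the possible field widths.

```python
import base64
import struct
import zlib

HEADER_SIZE = 32  # uncompressed header observed in the sample field

def decode_field(b64_text: str):
    """Return (text, declared_length) for one base64-encoded project field."""
    raw = base64.b64decode(b64_text)
    # Assumption: the uncompressed length is a little-endian 32-bit value at offset 8.
    (declared_length,) = struct.unpack_from("<I", raw, 8)
    payload = zlib.decompress(raw[HEADER_SIZE:])
    # Encoding.Unicode in the C# version corresponds to UTF-16 little endian.
    return payload.decode("utf-16-le"), declared_length
```

Comparing declared_length with the number of bytes in the encoded text is a cheap sanity check that the header was interpreted the way this sketch assumes.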
When debugging a Windows application with OllyDbg, we can add comments to the assembly language output as follows:
00401020 push ebp ; add comment here
Can we add comments to gdb output just like the way above?
When we enter disassemble in gdb, the output looks like this:
(gdb) disassemble main
Dump of assembler code for function main:
0x0804841d <+0>: push %ebp
0x0804841e <+1>: mov %esp,%ebp
0x08048420 <+3>: and $0xfffffff0,%esp
0x08048423 <+6>: sub $0x10,%esp
0x08048426 <+9>: movl $0x80484d0,(%esp)
0x0804842d <+16>: call 0x80482f0 <puts@plt>
0x08048432 <+21>: mov $0x0,%eax
0x08048437 <+26>: leave
0x08048438 <+27>: ret
End of assembler dump.
Can we add a comment at line 0x0804841d so that the gdb output looks like this:
(gdb) disassemble main
Dump of assembler code for function main:
0x0804841d <+0>: push %ebp ; add comment here
0x0804841e <+1>: mov %esp,%ebp
0x08048420 <+3>: and $0xfffffff0,%esp
0x08048423 <+6>: sub $0x10,%esp
0x08048426 <+9>: movl $0x80484d0,(%esp)
0x0804842d <+16>: call 0x80482f0 <puts@plt>
0x08048432 <+21>: mov $0x0,%eax
0x08048437 <+26>: leave
0x08048438 <+27>: ret
End of assembler dump.
Yes, GDB commands can be commented with the #.
00401020 push ebp ; # add comment here
http://www.chemie.fu-berlin.de/chemnet/use/info/gdb/gdb_16.html
Can we add some comments
No.
Obviously you can save GDB output into a text file and add comments there to your heart's content. But GDB will not display them next time you disas main.
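If you go the text-file route, the annotating can be scripted so the comments are re-applied whenever you save a fresh dump. Here is a small sketch in Python; the function name annotate and the "; comment" suffix are choices made for this illustration, and it works purely on saved text, not inside GDB.

```python
import re

def annotate(disassembly: str, comments: dict) -> str:
    """Append '; comment' to each line whose leading address is a key of comments."""
    out = []
    for line in disassembly.splitlines():
        # Lines look like '   0x0804841d <+0>: push %ebp', optionally led by '=>'.
        m = re.match(r"\s*(?:=>\s*)?(0x[0-9a-fA-F]+)", line)
        if m:
            note = comments.get(int(m.group(1), 16))
            if note:
                line = f"{line.rstrip()} ; {note}"
        out.append(line)
    return "\n".join(out)
```

For example, annotate(saved_dump, {0x0804841d: "add comment here"}) leaves every other line untouched and tags only the push %ebp line.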
I have an executable for an embedded device.
It does not have header information that gdb recognizes, but instead uses a proprietary header specified by the vendor.
I can analyse the file just fine using IDA-pro, but I'd like to run some code to see what it does.
The executable is loaded at address 0x52000000
However if I just load the file using
exec-file myfile
I get
"myfile": not in executable format: File format not recognized
And if I restore the memory at the correct location using:
restore myfile 52000000
I get:
You can't do that without a process to debug.
How do I get out of this chicken-and-egg problem?
I just want to jump in the middle of the code, set some registers to predetermined values and run some code to see what happens.
Note that I'm using the gdb ARM toolchain from ARM itself.
As per @artless_noise's suggestion, I did the following:
objcopy.exe
--output-target=elf32-bigarm
--input-target=binary
--change-start=0x52000000
INPUTFILE OUTPUTFILE
This adds an elf header to the file.
However it does not fix the whole problem.
The output of
readelf.exe -a OUTPUTFILE
gives:
ELF Header:
Magic: 7f 45 4c 46 01 02 01 61 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, big endian
Version: 1 (current)
OS/ABI: ARM
ABI Version: 0
Type: REL (Relocatable file)
Machine: ARM
Version: 0x1
Entry point address: 0x52000000
Start of program headers: 0 (bytes into file)
Start of section headers: 57316 (bytes into file)
Flags: 0x0
Size of this header: 52 (bytes)
Size of program headers: 0 (bytes)
Number of program headers: 0
Size of section headers: 40 (bytes)
Number of section headers: 5
Section header string table index: 2
Section Headers:
[Nr] Name Type Addr Off Size ES Flg Lk Inf Al
[ 0] NULL 00000000 000000 000000 00 0 0 0
[ 1] .data PROGBITS 00000000 000034 00df8c 00 WA 0 0 1
.....
Note that the .data section still has an address of 0x00000000. This should be 0x52000000.
To fix this I opened up a hex editor at address 0xdf8c.
This is close to where the section headers are.
The structure of a section header is as follows, along with the data I expect to be there.
typedef struct {
Elf32_Word sh_name;
Elf32_Word sh_type; /* = 1 (.data) */
Elf32_Word sh_flags; /* = ? */
Elf32_Addr sh_addr; /* = 0x00000000 */
Elf32_Off sh_offset; /* = 0x00000034 */
Elf32_Word sh_size; /* = 0x0000df8c */
Elf32_Word sh_link;
Elf32_Word sh_info;
Elf32_Word sh_addralign;
Elf32_Word sh_entsize;
} Elf32_Shdr;
The first header is always all zeros, the second header is the .data section.
So I look for the magic numbers and fill in the starting address, save the file and reload it into gdb.
Now it works.
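The hex editor step can also be scripted. The following Python sketch locates a section header through e_shoff and e_shentsize in the ELF32 file header and overwrites its sh_addr field, which sits 12 bytes into Elf32_Shdr. The function name set_section_addr is made up for this illustration, and the sketch assumes a well-formed ELF32 file like the one objcopy produced above.

```python
import struct

def set_section_addr(elf: bytes, index: int, addr: int) -> bytes:
    """Return a copy of an ELF32 image with sh_addr of section `index` replaced."""
    if elf[:4] != b"\x7fELF" or elf[4] != 1:
        raise ValueError("expected an ELF32 file")
    endian = "<" if elf[5] == 1 else ">"   # EI_DATA: 1 = little, 2 = big endian
    (e_shoff,) = struct.unpack_from(endian + "I", elf, 32)
    e_shentsize, e_shnum = struct.unpack_from(endian + "HH", elf, 46)
    if not 0 <= index < e_shnum:
        raise IndexError("no such section header")
    # sh_name, sh_type and sh_flags come first, so sh_addr starts at byte 12.
    patched = bytearray(elf)
    struct.pack_into(endian + "I", patched, e_shoff + index * e_shentsize + 12, addr)
    return bytes(patched)
```

For the file above that would be set_section_addr(data, 1, 0x52000000), with data read from OUTPUTFILE in binary mode and the result written back out before loading it into gdb.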