How can debugged variables be NULL when source code proves otherwise - c++

I'm debugging a full memory dump (procdump -ma ...), and I'm investigating the call stack, which corresponds to the following piece of source code:
unsigned int __stdcall ExecutionThread(void* pArg)
{
__try
{
BOOL bRunning = TRUE;
CInternalManagerObject* pInternalManagerObject = (CInternalManagerObject*) pArg;
pInternalManagerObject->Init();
CInternaStartlManagerObject* pInternaStartlManagerObject = pInternalManagerObject->GetInternaStartlManagerObject();
while(bRunning)
{
bRunning = pInternalManagerObject->Poll(pInternaStartlManagerObject);
if (CSLGlobal::IsValidHandle(_Module.m_hNeverEvent))
WaitForSingleObject(_Module.m_hNeverEvent, 15);
} <<<<<<<<<<<<<<<<============== here is the call stack pointer
pInternalManagerObject->DeInit();
As you can see, pArg is cast and then used, so it seems impossible for pArg to be NULL, yet this is exactly what the watch window is telling me. On top of this, the local variables do not seem to be known (as also shown in the watch window).
Watch-window content :
pArg 0x0000000000000000 void *
bRunning identifier "bRunning" is undefined
pInternalManagerObject identifier "pInternalManagerObject" is undefined
I can understand bRunning being optimised away, as this variable is not used anymore, but this is not correct for pInternalManagerObject, which is still used in the following line.
The symbols seem to be loaded fine.
I'm viewing this using Visual Studio Professional 2017, version 15.8.8.
Does anybody have a clue what might be causing this weird behaviour and what I can do in order to get a dump with correct values for the internal variables?
Edit after question for generated assembly code
The generated assembly is:
27:
28: unsigned int __stdcall ExecutionThread(void* pArg)
29: {
00007FF69C7A1690 48 89 5C 24 08 mov qword ptr [rsp+8],rbx
00007FF69C7A1695 48 89 74 24 10 mov qword ptr [rsp+10h],rsi
00007FF69C7A169A 57 push rdi
00007FF69C7A169B 48 83 EC 20 sub rsp,20h
00007FF69C7A169F 48 8B F9 mov rdi,rcx
30: __try
31: {
32: BOOL bRunning = TRUE;
00007FF69C7A16A2 BB 01 00 00 00 mov ebx,1
33: CInternalManagerObject* pInternalManagerObject = (CInternalManagerObject*) pArg;
34:
35: pInternalManagerObject->Init();
00007FF69C7A16A7 E8 64 EA FD FF call CInternalManagerObject::Init (07FF69C780110h)
36:
37: CBaseManager* pBaseManager = pInternalManagerObject->GetBaseManager();
00007FF69C7A16AC 48 8B CF mov rcx,rdi
00007FF69C7A16AF E8 0C E9 FD FF call CInternalManagerObject::GetBaseManager (07FF69C77FFC0h)
00007FF69C7A16B4 48 8B F0 mov rsi,rax
40: {
41: bRunning = pInternalManagerObject->Poll(pBaseManager);
00007FF69C7A16B7 48 8B CF mov rcx,rdi
38:
39: while(bRunning)
00007FF69C7A16BA 85 DB test ebx,ebx
00007FF69C7A16BC 74 2E je ExecutionThread+5Ch (07FF69C7A16ECh)
40: {
41: bRunning = pInternalManagerObject->Poll(pBaseManager);
00007FF69C7A16BE 48 8B D6 mov rdx,rsi
40: {
41: bRunning = pInternalManagerObject->Poll(pBaseManager);
00007FF69C7A16C1 E8 7A ED FD FF call CInternalManagerObject::Poll (07FF69C780440h)
00007FF69C7A16C6 8B D8 mov ebx,eax
42:
43: if (CSLGlobal::IsValidHandle(_Module.m_hNeverEvent))
00007FF69C7A16C8 48 8D 0D C1 13 0E 00 lea rcx,[_Module+550h (07FF69C882A90h)]
00007FF69C7A16CF E8 3C F2 FB FF call __Skyline_Global::CSLGlobal::IsValidHandle (07FF69C760910h)
00007FF69C7A16D4 85 C0 test eax,eax
00007FF69C7A16D6 74 12 je ExecutionThread+5Ah (07FF69C7A16EAh)
44: WaitForSingleObject(_Module.m_hNeverEvent, 15);
00007FF69C7A16D8 BA 0F 00 00 00 mov edx,0Fh
00007FF69C7A16DD 48 8B 0D AC 13 0E 00 mov rcx,qword ptr [_Module+550h (07FF69C882A90h)]
00007FF69C7A16E4 FF 15 16 0B 08 00 call qword ptr [__imp_WaitForSingleObject (07FF69C822200h)]
45: }
00007FF69C7A16EA EB CB jmp ExecutionThread+27h (07FF69C7A16B7h)
46:
47: pInternalManagerObject->DeInit();
00007FF69C7A16EC E8 FF E7 FD FF call CInternalManagerObject::DeInit (07FF69C77FEF0h)
48: }
I suppose this means that the correct value of pArg can be found in register RDI.
The Register window gives me following information:
RAX = 0000000000000000
RBX = 0000000000000001
RCX = 0000000000000000
RDX = 0000000000000000
RSI = 00000072A1E83220
RDI = 00000072A14A9990
...
Having a look into the memory at the mentioned place, I see hexadecimal values like:
0x00000072A14A9990 98 59 82 9c f6 7f 00 00 01 00 00 00 00 00 08 00 28 d2 28 62 f9 7f 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 50 0e 78 a2 72 00 ˜Y.œö...........(Ò(bù...................................P.x¢r.
0x00000072A14A99CE 00 00 ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 5c 07 00 00 00 00 00 00 d0 07 00 02 00 00 00 00 ff ff ff ff ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ..ÿÿÿÿ............\.......Ð.......ÿÿÿÿÿÿÿÿÿÿÿÿ................
0x00000072A14A9A0C 00 00 00 00 d0 07 00 02 00 00 00 00 38 59 82 9c f6 7f 00 00 f0 90 60 a2 72 00 00 00 00 00 00 00 00 00 00 00 09 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 68 59 82 9c f6 7f 00 00 00 00
Does this mean that pArg is not NULL indeed? (Sorry, but I'm not experienced in assembly debugging)

Does this mean that pArg is not NULL indeed?
No it does not mean that; pArg is null. The watch window tells you, the registers tell you.
As you can see, pArg is being typecasted and then being used, so it's
impossible for pArg to be NULL.
That's not correct; that's not what the cast does. If the variable is null then the result of the cast will be null.
https://en.cppreference.com/w/c/language/cast
I suppose this means that the correct value of pArg can be found in
register RDI.
No; pArg is mounted onto rcx; mov works right-to-left.
mov rcx,rdi
RCX = 0000000000000000
https://c9x.me/x86/html/file_module_x86_id_176.html
I can understand bRunning being optimised away, as this variable is not used anymore, but this is not correct for pInternalManagerObject,
which is still used in the following line.
My guess is that you've observed the watch window when the program counter is on the first line of your function. bRunning and pInternalManagerObject are out of scope. (Although they could potentially be stripped due to optimisation). Note that if a variable is stripped you won't be able to see it even if it is used.
Thoughts
Program defensively. Make a call to assert (or whatever assertion macro the codebase uses) to check the value of pArg (or any other pointer) before dereferencing it. If it's an error you could reasonably see in production, go one step further: log the unexpected behaviour and early-out of the function. http://www.cplusplus.com/reference/cassert/assert/
KISS: Whilst I'd commend anyone who's willing to get their "hands dirty", in this case it's just not necessary to start cracking open the disassembly. The answer's right there. https://en.wikipedia.org/wiki/KISS_principle
Additionally, you get a better response on SO if the question is phrased in a manner that's easier to read. Remember to explain what it is you are doing and what the problem is before going into code. Explain what fault you are facing (along with any error output), along with asking a question. https://stackoverflow.com/help/how-to-ask
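As a minimal sketch of the defensive check suggested above (the Manager type and the function name ExecutionThreadChecked are hypothetical stand-ins, not the asker's real classes), the thread entry point could validate pArg before casting and dereferencing it:

```cpp
#include <cstdio>

// Hypothetical stand-in for CInternalManagerObject.
struct Manager {
    bool initialised = false;
    void Init() { initialised = true; }
};

// Returns 0 on success and 1 when the thread was started without an object.
unsigned int ExecutionThreadChecked(void* pArg)
{
    if (pArg == nullptr) {
        // Log and leave early instead of dereferencing a null pointer.
        std::fprintf(stderr, "ExecutionThread: pArg is null, leaving thread\n");
        return 1;
    }
    Manager* pManager = static_cast<Manager*>(pArg);
    pManager->Init();
    return 0;
}
```

In a debug build, an assert(pArg != nullptr) placed before the if would additionally stop in the debugger at the point where the bad pointer arrives.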

Related

Switch on template argument: does gcc remove the switch?

Will (or can?) a switch statement based on a template argument be removed by the compiler?
See for example the following function in which the template argument funcStep1 is used to select the appropriate function. Will this switch statement be removed as its argument is known at compile time? I tried to learn it from the assembly code (given below) but I have no experience in reading assembly yet so this is not really a feasible task for me.
This question has now become relevant for me since the newly introduced if constexpr provides an alternative that is guaranteed to be evaluated at compile time.
template<int funcStep1>
double Mdp::expectedCost(int sdx0, int x0, int x1, int adx0, int adx1) const
{
switch (funcStep1)
{
case 1:
case 3:
return expectedCost_exact(sdx0, x0, x1, adx0, adx1);
case 2:
case 4:
return expectedCost_approx(sdx0, x0, x1, adx0, adx1);
default:
throw string("Unknown value (Mdp::expectedCost.cc)");
}
}
// Define all the template types we need
template double Mdp::expectedCost<1>(int, int, int, int, int) const;
template double Mdp::expectedCost<2>(int, int, int, int, int) const;
template double Mdp::expectedCost<3>(int, int, int, int, int) const;
template double Mdp::expectedCost<4>(int, int, int, int, int) const;
Here you find the output of 'objdump -D' when the above function is compiled with gcc -O2 -ffunction-sections:
1expectedCost.o: file format elf64-x86-64
Disassembly of section .group:
0000000000000000 <.group>:
0: 01 00 add %eax,(%rax)
2: 00 00 add %al,(%rax)
4: 08 00 or %al,(%rax)
6: 00 00 add %al,(%rax)
8: 09 00 or %eax,(%rax)
...
Disassembly of section .group:
0000000000000000 <.group>:
0: 01 00 add %eax,(%rax)
2: 00 00 add %al,(%rax)
4: 0a 00 or (%rax),%al
6: 00 00 add %al,(%rax)
8: 0b 00 or (%rax),%eax
...
Disassembly of section .group:
0000000000000000 <.group>:
0: 01 00 add %eax,(%rax)
2: 00 00 add %al,(%rax)
4: 0c 00 or $0x0,%al
6: 00 00 add %al,(%rax)
8: 0d .byte 0xd
9: 00 00 add %al,(%rax)
...
Disassembly of section .group:
0000000000000000 <.group>:
0: 01 00 add %eax,(%rax)
2: 00 00 add %al,(%rax)
4: 0e (bad)
5: 00 00 add %al,(%rax)
7: 00 0f add %cl,(%rdi)
9: 00 00 add %al,(%rax)
...
Disassembly of section .bss:
0000000000000000 <_ZStL8__ioinit>:
...
Disassembly of section .text._ZNK3Mdp12expectedCostILi1EEEdiiiii:
0000000000000000 <_ZNK3Mdp12expectedCostILi1EEEdiiiii>:
0: e9 00 00 00 00 jmpq 5 <_ZNK3Mdp12expectedCostILi1EEEdiiiii+0x5>
Disassembly of section .text._ZNK3Mdp12expectedCostILi2EEEdiiiii:
0000000000000000 <_ZNK3Mdp12expectedCostILi2EEEdiiiii>:
0: e9 00 00 00 00 jmpq 5 <_ZNK3Mdp12expectedCostILi2EEEdiiiii+0x5>
Disassembly of section .text._ZNK3Mdp12expectedCostILi3EEEdiiiii:
0000000000000000 <_ZNK3Mdp12expectedCostILi3EEEdiiiii>:
0: e9 00 00 00 00 jmpq 5 <_ZNK3Mdp12expectedCostILi3EEEdiiiii+0x5>
Disassembly of section .text._ZNK3Mdp12expectedCostILi4EEEdiiiii:
0000000000000000 <_ZNK3Mdp12expectedCostILi4EEEdiiiii>:
0: e9 00 00 00 00 jmpq 5 <_ZNK3Mdp12expectedCostILi4EEEdiiiii+0x5>
Disassembly of section .text.startup._GLOBAL__sub_I_expectedCost.cc:
0000000000000000 <_GLOBAL__sub_I_expectedCost.cc>:
0: 48 8d 3d 00 00 00 00 lea 0x0(%rip),%rdi # 7 <_GLOBAL__sub_I_expectedCost.cc+0x7>
7: 48 83 ec 08 sub $0x8,%rsp
b: e8 00 00 00 00 callq 10 <_GLOBAL__sub_I_expectedCost.cc+0x10>
10: 48 8b 3d 00 00 00 00 mov 0x0(%rip),%rdi # 17 <_GLOBAL__sub_I_expectedCost.cc+0x17>
17: 48 8d 15 00 00 00 00 lea 0x0(%rip),%rdx # 1e <_GLOBAL__sub_I_expectedCost.cc+0x1e>
1e: 48 8d 35 00 00 00 00 lea 0x0(%rip),%rsi # 25 <_GLOBAL__sub_I_expectedCost.cc+0x25>
25: 48 83 c4 08 add $0x8,%rsp
29: e9 00 00 00 00 jmpq 2e <_GLOBAL__sub_I_expectedCost.cc+0x2e>
Disassembly of section .init_array:
0000000000000000 <.init_array>:
...
Disassembly of section .comment:
0000000000000000 <.comment>:
0: 00 47 43 add %al,0x43(%rdi)
3: 43 3a 20 rex.XB cmp (%r8),%spl
6: 28 55 62 sub %dl,0x62(%rbp)
9: 75 6e jne 79 <_ZStL8__ioinit+0x79>
b: 74 75 je 82 <_ZStL8__ioinit+0x82>
d: 20 37 and %dh,(%rdi)
f: 2e 33 2e xor %cs:(%rsi),%ebp
12: 30 2d 32 37 75 62 xor %ch,0x62753732(%rip) # 6275374a <_ZStL8__ioinit+0x6275374a>
18: 75 6e jne 88 <_ZStL8__ioinit+0x88>
1a: 74 75 je 91 <_ZStL8__ioinit+0x91>
1c: 31 7e 31 xor %edi,0x31(%rsi)
1f: 38 2e cmp %ch,(%rsi)
21: 30 34 29 xor %dh,(%rcx,%rbp,1)
24: 20 37 and %dh,(%rdi)
26: 2e 33 2e xor %cs:(%rsi),%ebp
29: 30 00 xor %al,(%rax)
Disassembly of section .eh_frame:
0000000000000000 <.eh_frame>:
0: 14 00 adc $0x0,%al
2: 00 00 add %al,(%rax)
4: 00 00 add %al,(%rax)
6: 00 00 add %al,(%rax)
8: 01 7a 52 add %edi,0x52(%rdx)
b: 00 01 add %al,(%rcx)
d: 78 10 js 1f <.eh_frame+0x1f>
f: 01 1b add %ebx,(%rbx)
11: 0c 07 or $0x7,%al
13: 08 90 01 00 00 10 or %dl,0x10000001(%rax)
19: 00 00 add %al,(%rax)
1b: 00 1c 00 add %bl,(%rax,%rax,1)
1e: 00 00 add %al,(%rax)
20: 00 00 add %al,(%rax)
22: 00 00 add %al,(%rax)
24: 05 00 00 00 00 add $0x0,%eax
29: 00 00 add %al,(%rax)
2b: 00 10 add %dl,(%rax)
2d: 00 00 add %al,(%rax)
2f: 00 30 add %dh,(%rax)
31: 00 00 add %al,(%rax)
33: 00 00 add %al,(%rax)
35: 00 00 add %al,(%rax)
37: 00 05 00 00 00 00 add %al,0x0(%rip) # 3d <.eh_frame+0x3d>
3d: 00 00 add %al,(%rax)
3f: 00 10 add %dl,(%rax)
41: 00 00 add %al,(%rax)
43: 00 44 00 00 add %al,0x0(%rax,%rax,1)
47: 00 00 add %al,(%rax)
49: 00 00 add %al,(%rax)
4b: 00 05 00 00 00 00 add %al,0x0(%rip) # 51 <.eh_frame+0x51>
51: 00 00 add %al,(%rax)
53: 00 10 add %dl,(%rax)
55: 00 00 add %al,(%rax)
57: 00 58 00 add %bl,0x0(%rax)
5a: 00 00 add %al,(%rax)
5c: 00 00 add %al,(%rax)
5e: 00 00 add %al,(%rax)
60: 05 00 00 00 00 add $0x0,%eax
65: 00 00 add %al,(%rax)
67: 00 14 00 add %dl,(%rax,%rax,1)
6a: 00 00 add %al,(%rax)
6c: 6c insb (%dx),%es:(%rdi)
6d: 00 00 add %al,(%rax)
6f: 00 00 add %al,(%rax)
71: 00 00 add %al,(%rax)
73: 00 2e add %ch,(%rsi)
75: 00 00 add %al,(%rax)
77: 00 00 add %al,(%rax)
79: 4b 0e rex.WXB (bad)
7b: 10 5e 0e adc %bl,0xe(%rsi)
7e: 08 00 or %al,(%rax)
Yes, this is optimized. There are a few things that make reading assembly easier, such as demangling names (example: _ZNK3Mdp12expectedCostILi1EEEdiiiii is the mangled form of double Mdp::expectedCost<1>(int, int, int, int, int) const), stripping comments and text (and using Intel syntax):
double expectedCost<1>(int, int, int, int, int): # #double expectedCost<1>(int, int, int, int, int)
jmp expectedCost_exact(int, int, int, int, int) # TAILCALL
double expectedCost<2>(int, int, int, int, int): # #double expectedCost<2>(int, int, int, int, int)
jmp expectedCost_approx(int, int, int, int, int) # TAILCALL
double expectedCost<3>(int, int, int, int, int): # #double expectedCost<3>(int, int, int, int, int)
jmp expectedCost_exact(int, int, int, int, int) # TAILCALL
double expectedCost<4>(int, int, int, int, int): # #double expectedCost<4>(int, int, int, int, int)
jmp expectedCost_approx(int, int, int, int, int) # TAILCALL
https://godbolt.org/z/ZtoKFH
The above site simplifies this whole process for you.
In this case I didn't provide definitions for expectedCost_approx so the compiler just leaves a jump. But in any case, compilers are definitely smart enough to realize that each template function has a constant value in the switch.
The answer to your question is: Yes, any moderately useful compiler will perform dead code elimination.
if constexpr is not so much about forcing compile time evaluation for reasons of performance. In terms of performance, there's not really going to be any difference between if constexpr and a normal if when the given expression is a compile-time constant because compilers will end up optimizing the unused branch away either way. What if constexpr enables is to have code in the inactive branch that must not be instantiated with the given template arguments (e.g., because it would be invalid in that particular case). For your switch above, the whole code will be instantiated for all cases. Only afterwards will the unused code be removed by the optimizer. if constexpr on the other hand, guarantees that the code in the unused branch will never be instantiated to begin with. See, e.g., here for more on that…
We don't have switch constexpr, and there is no guarantee of branch elimination for a plain switch, even on a constexpr value (nor for a regular if, in fact), but I expect that a compiler would remove the dead branches with the proper optimization flags.
Notice also that the unused branches would still instantiate any template methods/objects they use, whereas if constexpr would not.
So if you want a guarantee that only the relevant code is there, or want to avoid unneeded instantiations, use if constexpr. Otherwise use whichever you find clearer.
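To make the difference concrete, here is a small sketch (C++17; describe is a made-up function, not part of the question's code) in which the inactive branch would not even compile for some template arguments. With a plain if, the call describe(42) would be rejected because int has no size(); with if constexpr, that branch is discarded and never instantiated:

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical helper: the numeric value for arithmetic types,
// the length for anything that has size().
template <typename T>
std::size_t describe(const T& value)
{
    if constexpr (std::is_arithmetic_v<T>) {
        return static_cast<std::size_t>(value);
    } else {
        // Only instantiated when T is not an arithmetic type.
        return value.size();
    }
}
```

Both describe(42) and describe(std::string("abc")) compile, because each instantiation only contains the branch that applies to its type.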

What are these seemingly-useless callq instructions in my x86 object files for?

I have some template-heavy C++ code that I want to ensure the compiler optimizes as much as possible due to the large amount of information it has at compile time. To evaluate its performance, I decided to take a look at the disassembly of the object file that it generates. Below is a snippet of what I got from objdump -dC:
0000000000000000 <bar<foo, 0u>::get(bool)>:
0: 41 57 push %r15
2: 49 89 f7 mov %rsi,%r15
5: 41 56 push %r14
7: 41 55 push %r13
9: 41 54 push %r12
b: 55 push %rbp
c: 53 push %rbx
d: 48 81 ec 68 02 00 00 sub $0x268,%rsp
14: 48 89 7c 24 10 mov %rdi,0x10(%rsp)
19: 48 89 f7 mov %rsi,%rdi
1c: 89 54 24 1c mov %edx,0x1c(%rsp)
20: e8 00 00 00 00 callq 25 <bar<foo, 0u>::get(bool)+0x25>
25: 84 c0 test %al,%al
27: 0f 85 eb 00 00 00 jne 118 <bar<foo, 0u>::get(bool)+0x118>
2d: 48 c7 44 24 08 00 00 movq $0x0,0x8(%rsp)
34: 00 00
36: 4c 89 ff mov %r15,%rdi
39: 4d 8d b7 30 01 00 00 lea 0x130(%r15),%r14
40: e8 00 00 00 00 callq 45 <bar<foo, 0u>::get(bool)+0x45>
45: 84 c0 test %al,%al
47: 88 44 24 1b mov %al,0x1b(%rsp)
4b: 0f 85 ef 00 00 00 jne 140 <bar<foo, 0u>::get(bool)+0x140>
51: 80 7c 24 1c 00 cmpb $0x0,0x1c(%rsp)
56: 0f 85 24 03 00 00 jne 380 <bar<foo, 0u>::get(bool)+0x380>
5c: 48 8b 44 24 10 mov 0x10(%rsp),%rax
61: c6 00 00 movb $0x0,(%rax)
64: 80 7c 24 1b 00 cmpb $0x0,0x1b(%rsp)
69: 75 25 jne 90 <bar<foo, 0u>::get(bool)+0x90>
6b: 48 8b 74 24 10 mov 0x10(%rsp),%rsi
70: 4c 89 ff mov %r15,%rdi
73: e8 00 00 00 00 callq 78 <bar<foo, 0u>::get(bool)+0x78>
78: 48 8b 44 24 10 mov 0x10(%rsp),%rax
7d: 48 81 c4 68 02 00 00 add $0x268,%rsp
84: 5b pop %rbx
85: 5d pop %rbp
86: 41 5c pop %r12
88: 41 5d pop %r13
8a: 41 5e pop %r14
8c: 41 5f pop %r15
8e: c3 retq
8f: 90 nop
90: 4c 89 f7 mov %r14,%rdi
93: e8 00 00 00 00 callq 98 <bar<foo, 0u>::get(bool)+0x98>
98: 83 f8 04 cmp $0x4,%eax
9b: 74 f3 je 90 <bar<foo, 0u>::get(bool)+0x90>
9d: 85 c0 test %eax,%eax
9f: 0f 85 e4 08 00 00 jne 989 <bar<foo, 0u>::get(bool)+0x989>
a5: 49 83 87 b0 01 00 00 addq $0x1,0x1b0(%r15)
ac: 01
ad: 49 8d 9f 58 01 00 00 lea 0x158(%r15),%rbx
b4: 48 89 df mov %rbx,%rdi
b7: e8 00 00 00 00 callq bc <bar<foo, 0u>::get(bool)+0xbc>
bc: 49 8d bf 80 01 00 00 lea 0x180(%r15),%rdi
c3: e8 00 00 00 00 callq c8 <bar<foo, 0u>::get(bool)+0xc8>
c8: 48 89 df mov %rbx,%rdi
cb: e8 00 00 00 00 callq d0 <bar<foo, 0u>::get(bool)+0xd0>
d0: 4c 89 f7 mov %r14,%rdi
d3: e8 00 00 00 00 callq d8 <bar<foo, 0u>::get(bool)+0xd8>
d8: 83 f8 04 cmp $0x4,%eax
The disassembly of this particular function continues on, but one thing I noticed is the relatively large number of call instructions like this one:
20: e8 00 00 00 00 callq 25 <bar<foo, 0u>::get(bool)+0x25>
These instructions, always with the opcode e8 00 00 00 00, occur frequently throughout the generated code, and from what I can tell, are nothing more than no-ops; they all seem to just fall through to the next instruction. This begs the question, then, is there a good reason why all these instructions are generated?
I'm concerned about the instruction cache footprint of the generated code, so wasting 5 bytes many times throughout a function seems counterproductive. It seems a bit heavyweight for a nop, unless the compiler is trying to preserve some kind of memory alignment or something. I wouldn't be surprised if this were the case.
I compiled my code using g++ 4.8.5 using -O3 -fomit-frame-pointer. For what it's worth, I saw similar code generation using clang 3.7.
The 00 00 00 00 (relative) target address in e8 00 00 00 00 is intended to be filled in by the linker. It doesn't mean that the call falls through. It just means you are disassembling an object file that has not been linked yet.
Also, a call to the next instruction, if that was the end result after the link phase, would not be a no-op, because it changes the stack (a certain hint that this is not what is going on in your case).
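For illustration, the target of a call rel32 (opcode e8) is the address of the following instruction plus the signed 32-bit displacement stored after the opcode. The helper below is only a sketch (the function name is made up) that decodes this by hand; with the unlinked placeholder displacement of 0, the computed target is simply the next instruction, which is exactly what objdump prints:

```cpp
#include <cstdint>

// Target of a 5-byte x86 "call rel32" (opcode 0xE8) located at `address`.
std::uint64_t callTarget(std::uint64_t address, const unsigned char (&insn)[5])
{
    // The displacement is stored little-endian after the opcode byte.
    std::uint32_t raw = static_cast<std::uint32_t>(insn[1])
                      | static_cast<std::uint32_t>(insn[2]) << 8
                      | static_cast<std::uint32_t>(insn[3]) << 16
                      | static_cast<std::uint32_t>(insn[4]) << 24;
    // Sign-extend to 64 bits, then add to the address of the next instruction.
    std::int64_t rel = static_cast<std::int32_t>(raw);
    return address + 5 + static_cast<std::uint64_t>(rel);
}
```

For the instruction at offset 0x20 in the listing (e8 00 00 00 00), this gives 0x25, matching the "callq 25" shown by objdump. Once the linker patches in the real displacement, the same formula yields the actual callee.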

Localizing function body chunk in .o file

I have a simple code file,
mangen.c:
///////////// begin of the file
void mangen(int* data)
{
for(int j=0; j<100; j++)
for(int i=0; i<100; i++)
data[j*100+i] = 111;
}
//////// end of the file
I compile it with mingw (on win32)
c:\mingw\bin\gcc -std=c99 -c mangen.c -fno-exceptions -march=core2 -mtune=generic -mfpmath=both -msse2
This yields a mangen.o file of 400 bytes:
00000000 4C 01 03 00 00 00 00 00-D8 00 00 00 0A 00 00 00 L...............
00000010 00 00 05 01 2E 74 65 78-74 00 00 00 00 00 00 00 .....text.......
00000020 00 00 00 00 4C 00 00 00-8C 00 00 00 00 00 00 00 ....L...........
00000030 00 00 00 00 00 00 00 00-20 00 30 60 2E 64 61 74 ........ .0`.dat
00000040 61 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 a...............
00000050 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
00000060 40 00 30 C0 2E 62 73 73-00 00 00 00 00 00 00 00 @.0..bss........
00000070 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
00000080 00 00 00 00 00 00 00 00-80 00 30 C0 55 89 E5 83 ..........0.U...
00000090 EC 10 C7 45 FC 00 00 00-00 EB 34 C7 45 F8 00 00 ...E......4.E...
000000A0 00 00 EB 21 8B 45 FC 6B-D0 64 8B 45 F8 01 D0 8D ...!.E.k.d.E....
000000B0 14 85 00 00 00 00 8B 45-08 01 D0 C7 00 6F 00 00 .......E.....o..
000000C0 00 83 45 F8 01 83 7D F8-63 7E D9 83 45 FC 01 83 ..E...}.c~..E...
000000D0 7D FC 63 7E C6 C9 C3 90-2E 66 69 6C 65 00 00 00 }.c~.....file...
000000E0 00 00 00 00 FE FF 00 00-67 01 6D 61 6E 67 65 6E ........g.mangen
000000F0 2E 63 00 00 00 00 00 00-00 00 00 00 5F 6D 61 6E .c.........._man
00000100 67 65 6E 00 00 00 00 00-01 00 20 00 02 01 00 00 gen....... .....
00000110 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 ................
00000120 2E 74 65 78 74 00 00 00-00 00 00 00 01 00 00 00 .text...........
00000130 03 01 4B 00 00 00 00 00-00 00 00 00 00 00 00 00 ..K.............
00000140 00 00 00 00 2E 64 61 74-61 00 00 00 00 00 00 00 .....data.......
00000150 02 00 00 00 03 01 00 00-00 00 00 00 00 00 00 00 ................
00000160 00 00 00 00 00 00 00 00-2E 62 73 73 00 00 00 00 .........bss....
00000170 00 00 00 00 03 00 00 00-03 01 00 00 00 00 00 00 ................
00000180 00 00 00 00 00 00 00 00-00 00 00 00 04 00 00 00 ................
Now I need to know where the binary chunk containing
the above function body is located in this file.
Could someone provide some simple code that will allow me to retrieve
these boundaries?
(Assume that the function body may be shorter or longer, and that
other functions or data may be added to the source file, so
the chunk will move, but I suspect the procedure to locate it
should not be very complex.)
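You can use objdump -Fd mangen.o to find out the file offset and length of a function.
Alternatively, you can use readelf -s mangen.o to find out the size of a function (readelf only works on ELF objects, though, so it won't help with the COFF file that mingw produces here).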
You may define something like int abc = 0x11223344; in the beginning and end of function and use the constants to locate the function body.
You can use objdump or nm.
For instance, try:
nm mangen.o
Or
objdump -t mangen.o
If you need to use your own code, have a look here:
http://www.rohitab.com/discuss/topic/38591-c-import-table-parser/
It will give you something to start with. You can find much more information about the format in MSDN.
If you are into Python, there is nice tool/library (including source code) that can be helpful:
https://code.google.com/p/pefile/
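If you do want to roll your own code, the following is a rough sketch of the idea, assuming the object file is COFF (which is what mingw emits on win32; the leading bytes 4C 01 in the dump are the i386 machine type). It assumes the standard layout of a 20-byte file header followed by 40-byte section headers, and the names Chunk, readLE and findTextSection are made up for this example. Note that it locates the whole .text section, not an individual function; with several functions in the file you would additionally have to walk the symbol table.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

struct Chunk {
    std::uint32_t offset;  // file offset of the raw section data
    std::uint32_t size;    // size of the raw section data in bytes
};

// Reads an n-byte little-endian integer starting at pos.
static std::uint32_t readLE(const std::vector<unsigned char>& b, std::size_t pos, int n)
{
    std::uint32_t v = 0;
    for (int i = 0; i < n; ++i)
        v |= static_cast<std::uint32_t>(b[pos + i]) << (8 * i);
    return v;
}

// Finds the .text section of a COFF object held in memory.
std::optional<Chunk> findTextSection(const std::vector<unsigned char>& obj)
{
    constexpr std::size_t kFileHeaderSize = 20;
    constexpr std::size_t kSectionHeaderSize = 40;
    if (obj.size() < kFileHeaderSize)
        return std::nullopt;

    const std::uint32_t sectionCount = readLE(obj, 2, 2);     // NumberOfSections
    const std::uint32_t optHeaderSize = readLE(obj, 16, 2);   // SizeOfOptionalHeader
    const std::size_t table = kFileHeaderSize + optHeaderSize;

    for (std::uint32_t i = 0; i < sectionCount; ++i) {
        const std::size_t h = table + i * kSectionHeaderSize;
        if (h + kSectionHeaderSize > obj.size())
            return std::nullopt;
        // The section name occupies the first 8 bytes, padded with zeros.
        if (std::memcmp(&obj[h], ".text\0\0\0", 8) == 0) {
            // SizeOfRawData is at +16, PointerToRawData at +20.
            return Chunk{readLE(obj, h + 20, 4), readLE(obj, h + 16, 4)};
        }
    }
    return std::nullopt;
}
```

For the dump in the question this gives offset 0x8C and size 0x4C, i.e. the bytes from 55 89 E5 ... up to ... C9 C3 90.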

Why does C++ inline function has call instructions?

I read that with inline functions, wherever the function call is made, the call is replaced with the body of the function definition.
According to the above explanation there should not be any function call when inline is used.
If that is the case, why do I see three call instructions in the assembly code?
#include <iostream>
inline int add(int x, int y)
{
return x+ y;
}
int main()
{
add(8,9);
add(20,10);
add(100,233);
}
meow@vikkyhacks ~/Arena/c/temp $ g++ -c a.cpp
meow@vikkyhacks ~/Arena/c/temp $ objdump -M intel -d a.o
0000000000000000 <main>:
0: 55 push rbp
1: 48 89 e5 mov rbp,rsp
4: be 09 00 00 00 mov esi,0x9
9: bf 08 00 00 00 mov edi,0x8
e: e8 00 00 00 00 call 13 <main+0x13>
13: be 0a 00 00 00 mov esi,0xa
18: bf 14 00 00 00 mov edi,0x14
1d: e8 00 00 00 00 call 22 <main+0x22>
22: be e9 00 00 00 mov esi,0xe9
27: bf 64 00 00 00 mov edi,0x64
2c: e8 00 00 00 00 call 31 <main+0x31>
31: b8 00 00 00 00 mov eax,0x0
36: 5d pop rbp
37: c3 ret
NOTE
Complete dump of the object file is here
You did not optimize, so the calls are not inlined.
You produced an object file (not a .exe), so the calls are not resolved. What you see is a dummy call whose address will be filled in by the linker.
If you compile a full executable you will see the correct addresses for the calls.
See page 28 of:
http://www.cs.princeton.edu/courses/archive/spr04/cos217/lectures/Assembler.pdf

What is the ds:0023:003a3000=?? stuff on the end of a drwatson FAULT?

I have the following entry in a Dr Watson log. What is the significance of the "ds:0023:003a3000=??" part of the entry to the right of the FAULT line?
*----> State Dump for Thread Id 0xdfc <----*
eax=00000000 ebx=00390320 ecx=0854ff48 edx=09e44bfc esi=00012ce1 edi=0854ff61
eip=00465c51 esp=0854ff30 ebp=00000000 iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
function: sysman
00465c37 49 dec ecx
00465c38 eb02 jmp sysman+0x65c3c (00465c3c)
00465c3a 33c9 xor ecx,ecx
00465c3c 8d542428 lea edx,[esp+0x28]
00465c40 52 push edx
00465c41 51 push ecx
00465c42 8d4c2418 lea ecx,[esp+0x18]
00465c46 e8d5c0fcff call sysman+0x31d20 (00431d20)
00465c4b 33c0 xor eax,eax
00465c4d 8d4c2418 lea ecx,[esp+0x18]
FAULT ->00465c51 8a441eff mov al,[esi+ebx-0x1] ds:0023:003a3000=??
00465c55 50 push eax
00465c56 6864074900 push 0x490764
00465c5b 51 push ecx
00465c5c e8cfd0fcff call sysman+0x32d30 (00432d30)
00465c61 8d542424 lea edx,[esp+0x24]
00465c65 68689f4800 push 0x489f68
00465c6a 8d44242c lea eax,[esp+0x2c]
00465c6e 52 push edx
00465c6f 50 push eax
00465c70 e83bc3fcff call sysman+0x31fb0 (00431fb0)
*----> Stack Back Trace <----*
ChildEBP RetAddr Args to Child
00000000 00000000 00000000 00000000 00000000 sysman+0x65c51
*----> Raw Stack Dump <----*
000000000854ff30 58 01 55 08 75 07 c8 09 - 00 00 00 00 18 6d c7 01 X.U.u........m..
000000000854ff40 fc 4b e4 09 04 bd 47 00 - 04 bd 47 00 fc 0c c9 01 .K....G...G.....
000000000854ff50 ac ca ae 09 64 5f c4 01 - 20 37 37 30 32 34 3a 20 ....d_.. 77024:
000000000854ff60 00 b3 42 00 a8 ff 54 08 - 90 a6 47 00 02 00 00 00 ..B...T...G.....
000000000854ff70 8b c5 42 00 b8 ff 54 08 - 2e 03 39 00 28 99 cb 01 ..B...T...9.(...
000000000854ff80 ff ff ff ff 00 00 00 00 - 00 00 00 00 20 1e cb 01 ............ ...
000000000854ff90 a6 f7 ba 77 06 00 00 00 - c9 f7 ba 77 e1 6b d9 09 ...w.......w.k..
000000000854ffa0 06 00 00 00 1f 00 00 00 - 68 00 55 08 c1 a0 47 00 ........h.U...G.
000000000854ffb0 00 00 00 00 58 c4 42 00 - c9 a5 ca 09 d1 fb 38 0a ....X.B.......8.
000000000854ffc0 27 00 00 00 e1 6b d9 09 - ef f2 41 00 c9 a5 ca 09 '....k....A.....
000000000854ffd0 01 59 cc 01 38 00 55 08 - ec 00 55 08 00 00 00 00 .Y..8.U...U.....
000000000854ffe0 e0 00 55 08 ff ff ff ff - 89 00 00 00 01 00 01 01 ..U.............
000000000854fff0 c8 ff 54 08 b8 ff 54 08 - 77 00 55 08 29 a5 ca 09 ..T...T.w.U.)...
0000000008550000 51 00 00 00 5f 00 00 00 - 00 9f 82 7c 61 36 ca 01 Q..._......|a6..
0000000008550010 25 00 00 00 3f 00 00 00 - 00 ce bb 77 91 b7 c7 01 %...?......w....
0000000008550020 19 00 00 00 1f 00 00 00 - 00 ff ff ff d9 28 cc 01 .............(..
0000000008550030 0b 00 00 00 1f 00 00 00 - 00 00 55 08 d1 fb 38 0a ..........U...8.
0000000008550040 27 00 00 00 3f 00 00 00 - 00 20 ba 77 00 00 00 00 '...?.... .w....
0000000008550050 00 00 00 00 00 00 00 00 - 20 b7 c7 01 00 00 00 00 ........ .......
0000000008550060 00 00 00 00 00 00 00 00 - b4 00 55 08 1b 90 47 00 ..........U...G.
To summarize:
You get a register dump here:
eax=00000000 ebx=00390320 ecx=0854ff48 edx=09e44bfc esi=00012ce1 edi=0854ff61
eip=00465c51 esp=0854ff30 ebp=00000000 iopl=0 nv up ei pl zr na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
The eip indicates the instruction that failed:
FAULT ->00465c51 8a441eff mov al,[esi+ebx-0x1] ds:0023:003a3000=??
The stuff at the end is the address that failed to be read: the "usual" data segment selector 23 and the address 3A3000, which is composed of ebx plus esi minus 1: 390320+12ce1-1 (all hex). To me, that looks like an index gone bad - 3a3000 would be the first address of a new "page" in memory, so that's why it's failing at that point. 77025 bytes (0x12ce1) into an array is quite a long way, but it is of course possible that it's something else that is wrong.
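The address arithmetic can be double-checked with a few lines of code. This is only a sketch: the function names are made up, and the 4 KiB page size is the usual x86 value, assumed here rather than taken from the log.

```cpp
#include <cstdint>

// Effective address of an operand of the form [base + index + disp].
constexpr std::uint32_t effectiveAddress(std::uint32_t base, std::uint32_t index, std::int32_t disp)
{
    return base + index + static_cast<std::uint32_t>(disp);
}

// True when addr is the first byte of a page.
constexpr bool isPageStart(std::uint32_t addr, std::uint32_t pageSize = 4096)
{
    return addr % pageSize == 0;
}

// Values taken from the register dump: esi=00012ce1, ebx=00390320.
static_assert(effectiveAddress(0x00012ce1, 0x00390320, -1) == 0x003a3000,
              "matches the address on the FAULT line");
static_assert(isPageStart(0x003a3000), "first byte of a 4 KiB page");
```

Both checks pass at compile time, which is consistent with the read running one byte past the end of the last mapped page.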