I'm trying to better understand the structure of Windows PE executable files, namely how the C++ linker handles static variable references (in my case, for Visual Studio C++).
Say I have the following example:
::PathFindExtension(L"file.exe");
which may translate into the following assembly (Debug build in this case):
I understand that the pointer to the PathFindExtension API for the call instruction is calculated and inserted into the PE image at the .exe file start-up when the corresponding DLL is loaded in, but I'm not clear about how pointers to static strings are handled.
As you see the push instruction:
00568AC6 68 70 48 60 00 push offset string L"file.exe" (604870h)
is the 68 id machine code, i.e. PUSH imm32, so the imm32 is just an absolute memory address, 604870h in this case.
But my question is, since that value is hardcoded at the compile time, how was that offset (i.e. 604870h) calculated?
And the second question, how does ASLR handle this situation, i.e. when this executable could be loaded at some arbitrary address in memory? (If I understand ASLR concept correctly.)
X86 Assembly - How to calculate instruction opcodes length in bytes [closed]
For example, at the address 0x762C51, there is the instruction call sub_E91E50.
In bytes, this is E8 FA F1 72 00.
Next, at the address 0x762C56, there is the instruction push 0. In bytes, this is 6A 00.
Now, when it comes to C++ reading a function like this, it would only have the bytes like: E8 FA F1 72 00 6A 00
How can I determine where the first instruction ends and the next begins?
For variable-length instruction sets you can't really do this in general. Many tools try, but it is often trivial to trip them up if you set out to. Compiler-generated code disassembles more reliably.
The best approach, which won't necessarily result in a complete disassembly, is to go in execution order. You need a known-correct entry point; from there you follow the execution paths, look for collisions, and set those aside for a human to figure out.
Simulation is even better and might give you better coverage in some areas, but it will also leave gaps where an execution-order disassembler wouldn't.
Even GNU assembler messes up with variable-length instruction sets (next time please specify the target/ISA as stated in the assembly tag). So whatever a "full disassembler" is, it can't possibly do it either, in general.
If you have been told where the entry point is (say, in a class assignment), or you compiled a function to an object file and, based on the label in the disassembly, feel comfortable that it is a valid entry point, then you can start disassembling there in execution order. Bear in mind that an object file contains incomplete instructions that are filled in later by the linker. In a linked binary, if you assume that a function label's address is an entry point, you can disassemble from it in execution order.
C++ really has nothing to do with this. If you have a sequence of bytes and you are sure you know the entry point, it doesn't much matter how that code was created, unless it intentionally contains anti-disassembly tricks (hand-written or generated by a compiler or tool). Generally this is not a problem, but technically a tool could do it, and it is trivial for a human to do by hand.
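For the two instructions in the question, the execution-order idea can be sketched in C as below. This is only a minimal sketch, assuming 32-bit code with no prefixes; the function names insn_length and call_target and the five-entry opcode table are made up for illustration, and a real decoder has to handle prefixes, ModRM/SIB bytes and many more opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of the instruction starting at p, or 0 if the opcode
   is not in this deliberately tiny table (hypothetical helper). */
size_t insn_length(const uint8_t *p)
{
    switch (p[0]) {
    case 0x6A: return 2;  /* PUSH imm8  */
    case 0x68: return 5;  /* PUSH imm32 */
    case 0xE8: return 5;  /* CALL rel32 */
    case 0xE9: return 5;  /* JMP  rel32 */
    case 0xC3: return 1;  /* RET        */
    default:   return 0;  /* unknown: stop and let a human look */
    }
}

/* Target of a CALL rel32 located at address addr. The displacement is
   little-endian and relative to the address of the next instruction. */
uint32_t call_target(uint32_t addr, const uint8_t *p)
{
    assert(p[0] == 0xE8);
    uint32_t rel = (uint32_t)p[1]
                 | ((uint32_t)p[2] << 8)
                 | ((uint32_t)p[3] << 16)
                 | ((uint32_t)p[4] << 24);
    return addr + 5 + rel;
}
```

Fed the bytes E8 FA F1 72 00 6A 00 starting at 0x762C51, the first call to insn_length gives 5, so the next instruction starts at 0x762C56, and call_target gives 0xE91E50, which matches sub_E91E50.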
So recently I have been learning about low-level programming languages (such as Assembly, which from my understanding is just symbolic binary) and have come across shellcoding (e.g. "\x4D..." etc). I found out that you can input shellcode into a C/C++ application and then execute it. My question is: is it possible to generate shellcode from an existing exe application and then use this generated shellcode in a C/C++ application? Have I misunderstood the possibilities of shellcoding? Many thanks, from a person with very limited knowledge of low-level programming
is it possible to generate Shellcode from an existing exe application and then use this generated Shellcode in a C/C++ application
Answer: No. Shellcode is base-independent, whereas a PE executable file has a large number of headers and other structures; you can't execute it without doing some preparatory work first.
Shellcode is a very broad topic.
First of all, you need to know that the addresses of functions in external libraries such as kernel32 and user32 are stored in the Import Address Table, which is filled in by the Windows loader at startup. All memory accesses use addresses that were computed at compile time. So you need to find the addresses yourself.
To call functions from shellcode you need your own loader for function addresses. This loader must load the kernel32.dll library, search for the GetProcAddress function, and fill in the IAT.
You don't know at what address your shellcode will be loaded, but you can find out with code like the following, known as the "delta offset" technique:
call delta
delta:
pop ebp
sub ebp,offset delta
Now ebp holds the offset to the real addresses, so to get the address of a variable or function you need to add that offset. For example:
lea eax, [variable]
add eax, ebp; adding a delta-offset
mov ecx, dword ptr DS:[eax]
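The arithmetic behind the delta offset can be written out in C. This is just a sketch of the calculation, not of the shellcode itself; the function names and the example addresses in the note below are hypothetical.

```c
#include <stdint.h>

/* delta = where the label really is at run time minus where the assembler
   assumed it would be (this is what "sub ebp, offset delta" leaves in ebp).
   Unsigned arithmetic wraps, so a "negative" delta works too. */
uint32_t delta_offset(uint32_t runtime_label, uint32_t assembled_label)
{
    return runtime_label - assembled_label;
}

/* Real address of a variable: its assembled address plus the delta
   (this is the "add eax, ebp" step). */
uint32_t runtime_address(uint32_t assembled_address, uint32_t delta)
{
    return assembled_address + delta;
}
```

For instance, if the label was assembled for 0x00401005 but the pop yields 0x00351005, the delta is 0xFFF50000, and a variable assembled at 0x00402000 really lives at 0x00352000.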
To assemble the code for later use you can use something like FASM. After assembling, open the output in the WinHex editor and use copy -> copy all -> GREP C source.
You will get something like "\x00\x28" and so on. To call it, you need to give your shellcode array execute permission and redirect EIP to it with a control-transfer instruction such as jmp or call.
Here is an example that shows a Hello, World MessageBox on Windows:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <windows.h>
int
main(void)
{
char *shellcode = "\x33\xc9\x64\x8b\x49\x30\x8b\x49\x0c\x8b"
"\x49\x1c\x8b\x59\x08\x8b\x41\x20\x8b\x09"
"\x80\x78\x0c\x33\x75\xf2\x8b\xeb\x03\x6d"
"\x3c\x8b\x6d\x78\x03\xeb\x8b\x45\x20\x03"
"\xc3\x33\xd2\x8b\x34\x90\x03\xf3\x42\x81"
"\x3e\x47\x65\x74\x50\x75\xf2\x81\x7e\x04"
"\x72\x6f\x63\x41\x75\xe9\x8b\x75\x24\x03"
"\xf3\x66\x8b\x14\x56\x8b\x75\x1c\x03\xf3"
"\x8b\x74\x96\xfc\x03\xf3\x33\xff\x57\x68"
"\x61\x72\x79\x41\x68\x4c\x69\x62\x72\x68"
"\x4c\x6f\x61\x64\x54\x53\xff\xd6\x33\xc9"
"\x57\x66\xb9\x33\x32\x51\x68\x75\x73\x65"
"\x72\x54\xff\xd0\x57\x68\x6f\x78\x41\x01"
"\xfe\x4c\x24\x03\x68\x61\x67\x65\x42\x68"
"\x4d\x65\x73\x73\x54\x50\xff\xd6\x57\x68"
"\x72\x6c\x64\x21\x68\x6f\x20\x57\x6f\x68"
"\x48\x65\x6c\x6c\x8b\xcc\x57\x57\x51\x57"
"\xff\xd0\x57\x68\x65\x73\x73\x01\xfe\x4c"
"\x24\x03\x68\x50\x72\x6f\x63\x68\x45\x78"
"\x69\x74\x54\x53\xff\xd6\x57\xff\xd0";
DWORD why_must_this_variable;
BOOL ret = VirtualProtect (shellcode, strlen(shellcode),
PAGE_EXECUTE_READWRITE, &why_must_this_variable);
if (!ret) {
printf ("VirtualProtect\n");
return EXIT_FAILURE;
}
printf("strlen(shellcode)=%u\n", (unsigned)strlen(shellcode)); /* strlen returns size_t, so cast to match %u */
((void (*)(void))shellcode)();
return EXIT_SUCCESS;
}
You are probably looking for the RunPE algorithm, which can execute one PE executable inside another. You open another process, copy the sections, fill in the IAT, and resume the target process from the new entry point. It is a code injection technique used by malware, so I will not explain how to implement it.
Shellcode is machine code that's used as the payload of an exploit (such as a buffer overflow). Depending on the exploit it's used with, it may have limitations such as a maximum length, or certain byte values (e.g. zero) not allowed. There's no one-size-fits-all answer to what shellcode can be.
In general, though: yes, it's possible in principle to embed a complete program in shellcode. It could take the form of a small wrapper (probably hand-written in assembly) that writes the program to a new .exe file and then runs it, or it could use more-sophisticated techniques to replace the current program in memory. There are probably automated tools to create this sort of shellcode, though I don't know of any specifically.
However, the tone of your question makes me think you might be misunderstanding something important:
I found out that you can input Shellcode into a C/C++ application and then execute it
This is a bug, not a feature. Being able to inject new code into a running program, where the program isn't specifically meant to allow that, is a major security flaw. This sort of thing has been the root of a great many security breaches over the span of decades, and developers spend a great deal of effort trying to prevent it from happening.
If it's possible to inject shellcode into a program, the program is broken.
While looking through the assembly for a console "hello world" program (compiled using the visual c++ compiler), I came across this:
pre_c_init proc near
.text:00401AFE mov eax, 5A4Dh
.text:00401B03 cmp ds:400000h, ax
The code above seems to be accessing memory that isn't filled with anything in particular: All segments start at 0x401000 or even further down in the file. (The image base is at 0x400000, but the first segment is at 0x401000).
I used OllyDbg to see what the actual value at 0x400000 is, and every single time it's the same as in the code (0x5A4D). What's going on here?
5A4D is "MZ" in little-endian ASCII, and MZ is the signature of MS-DOS and, more recently, PE executables.
The comparison checks whether the executable has been mapped at the default base address, 0x400000. This, I believe, is used to determine whether it is necessary to perform relocation.
This is discussed further in the following thread: Why does PE need a relocation table?
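The signature check itself is easy to reproduce in C. A minimal sketch, assuming you have the first bytes of the image in a buffer; the function name has_mz_signature is made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the buffer starts with the bytes 'M','Z'. Read as a
   little-endian 16-bit word these are 0x5A4D, the constant that the
   "cmp ds:400000h, ax" instruction compares against. */
int has_mz_signature(const uint8_t *image, size_t size)
{
    if (image == NULL || size < 2)
        return 0;
    uint16_t magic = (uint16_t)(image[0] | (image[1] << 8));
    return magic == 0x5A4D;
}
```

The byte at the lower address ('M', 0x4D) ends up in the low half of the word, which is why the constant reads "ZM" when written as hex.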
I just used DUMPBIN for the first time and I see the term HIGHLOW repeatedly in the output file:
BASE RELOCATIONS #7
11000 RVA, E0 SizeOfBlock
...
3B5 HIGHLOW 2001753D ___onexitbegin
3C1 HIGHLOW 2001753D ___onexitbegin
...
I'm curious what this term stands for. I didn't find anything on Google or Stackoverflow about it.
To apply a fixup, a delta is calculated as the difference between the
preferred base address, and the base where the image is actually
loaded.
The basic idea is that when doing a fixup at some address, we must know:
- what memory must be changed ("offset" field)
- what value is needed for its relocation ("delta" value)
- which parts of relocated data and delta value to use ("type" field)
Here are some possible values of the "type" field:
- HIGH - add higher word (16 bits) of delta to the 16-bit value at "offset"
- LOW - add lower word of delta to the value at "offset"
- HIGHLOW - add full delta to the 32-bit value at "offset"
In other words, HIGHLOW type tells the program that it's doing a fix-up on offset "offset" from the page of this relocation block*, and that there is a doubleword that needs to be modified in order to have properly working executable.
* all of the relocation entries are grouped into blocks, and every block has a page on which its entries are applied
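The three types above could be applied roughly like this in C. It is a sketch under assumptions: the enum names and apply_fixup are hypothetical, although the numeric values are the ones the PE format uses (0 ABSOLUTE, 1 HIGH, 2 LOW, 3 HIGHLOW), and there is no bounds checking on the page.

```c
#include <stdint.h>

/* Numeric values as in the PE format's IMAGE_REL_BASED_* constants;
   the short names here are hypothetical. */
enum { REL_ABSOLUTE = 0, REL_HIGH = 1, REL_LOW = 2, REL_HIGHLOW = 3 };

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static void wr16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8); }

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void wr32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Apply one fixup of the given type at page + offset.
   Returns 1 if the type was handled, 0 if it is not covered here. */
int apply_fixup(uint8_t *page, unsigned type, unsigned offset, uint32_t delta)
{
    uint8_t *p = page + offset;
    switch (type) {
    case REL_ABSOLUTE: return 1;                                   /* padding, nothing to do */
    case REL_HIGH:     wr16(p, (uint16_t)(rd16(p) + (delta >> 16)));    return 1;
    case REL_LOW:      wr16(p, (uint16_t)(rd16(p) + (delta & 0xFFFF))); return 1;
    case REL_HIGHLOW:  wr32(p, rd32(p) + delta);                        return 1;
    default:           return 0;
    }
}
```

The values are read and written byte by byte so the sketch does not depend on the host's endianness or alignment rules.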
Let's say that you have this instruction in your code:
section .data
message: db "Hello World!", 0
section .code
...
mov eax, message
...
You run assembler and immediately after it you run disassembler. Now your code looks like this:
mov eax, dword [0x702000]
You're now curious why the address starts at 0x700000, and when you look into the file dump, you see that
ImageBase: 0x00700000
Now you understand where this number came from and you're ready to run the executable.
The loader, which loads executable files into memory and creates an address space for them, finds that the memory at 0x700000 is unavailable and that it needs to place the file somewhere else. It decides that 0xf00000 will be OK and copies the file contents there.
But your program was linked to work only with data at 0x700000, and there was no way for the linker to know that its output would be relocated. Because of this, the loader must do its magic. It:
1. calculates the delta value: the old address (the preferred image base) is 0x700000, but the file is actually loaded at 0xf00000. It subtracts one from the other and gets 0x800000 as the result.
2. goes to the .reloc section of the file
3. checks whether there is another page (4KB of data) to be relocated. If not, it continues toward calling the file's entry point.
4. for every relocation for the current page, it
   - gets the data at the relocation offset
   - adds the delta value (in the way the type field states)
   - places the new value at the relocation offset
   and then continues at step 3
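The loader steps above can be sketched in C as follows. This is a simplified illustration, not how Windows actually implements it: it assumes the standard .reloc layout (each block is a 32-bit page RVA, a 32-bit block size, then 16-bit entries whose top 4 bits are the type and low 12 bits the offset), handles only HIGHLOW entries, and does no bounds checking on the image. The name apply_relocations is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t load32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static void store32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Walk every block in the relocation data and add delta to each 32-bit
   value named by a HIGHLOW (type 3) entry. image points at the mapped
   image, so RVAs index straight into it. Returns the number of fixups. */
size_t apply_relocations(uint8_t *image, const uint8_t *reloc,
                         size_t reloc_size, uint32_t delta)
{
    size_t applied = 0, pos = 0;
    while (pos + 8 <= reloc_size) {                 /* step 3: another page? */
        uint32_t page_rva   = load32(reloc + pos);
        uint32_t block_size = load32(reloc + pos + 4);
        if (block_size < 8 || block_size > reloc_size - pos)
            break;                                  /* malformed block */
        for (size_t i = 8; i + 2 <= block_size; i += 2) {   /* step 4 */
            unsigned entry  = reloc[pos + i] | (reloc[pos + i + 1] << 8);
            unsigned type   = entry >> 12;
            unsigned offset = entry & 0x0FFF;
            if (type != 3)
                continue;                           /* skip padding and other types */
            uint8_t *target = image + page_rva + offset;
            store32(target, load32(target) + delta);
            applied++;
        }
        pos += block_size;
    }
    return applied;
}
```

With the numbers from the example, a value of 0x702000 covered by a HIGHLOW entry becomes 0xF02000 when delta is 0x800000.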
There are also more types of relocation entry and some of them are architecture-specific. To see a full list, read the "Microsoft Portable Executable and Common Object File Format, section 6.6.2. Fixup Types".
What you see here is the content of the "Base relocation table" in Microsoft Windows executable files.
Base relocation tables are necessary in Windows for DLL files and they are optional for executable files; they contain information about the location of address information in the EXE/DLL file that must be updated when the actual address of the DLL file in memory is known (when loading the DLL into memory). Windows uses the information stored in this table to update the address information.
The table supports different types of addresses while the naming is Microsoft-specific: ABSOLUTE (= dummy), HIGH, LOW, HIGHLOW, HIGHADJ and MIPS_JMPADDR.
The full name of the constant is "IMAGE_REL_BASED_HIGHLOW".
The "ABSOLUTE" type is typically a dummy entry inserted to ensure the parts of the table are a multiple of 4 (or 8) bytes long.
On x86 CPUs only the "HIGHLOW" type is used: It tells Windows about the location of an absolute (32-bit) address in the file.
Some background info:
In your example the "Image Base" could be 0x20000000 which means that the EXE/DLL file has been compiled to be loaded into address 0x20000000. At the addresses 0x200113B5 (0x20000000 + 0x11000 + 0x3B5) and 0x200113C1 there are absolute addresses.
Let's say the memory at location 0x200113B5 contains the value 0x20012345 which is the address of a function or variable in the program.
Maybe the memory at address 0x20000000 cannot be used and Windows decides to load the DLL into the memory at 0x50000000 instead. Then the 0x20012345 must be replaced by 0x50012345.
The information in the base relocation table is used by Windows to find all addresses that must be replaced.
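The address arithmetic in this example can be checked with a few lines of C. The function names are hypothetical; the numbers are the ones from the DUMPBIN output and the paragraphs above.

```c
#include <stdint.h>

/* Address of a fixup: image base + RVA of the relocation block
   + the entry's offset inside that block's page. */
uint32_t fixup_location(uint32_t image_base, uint32_t block_rva,
                        uint32_t entry_offset)
{
    return image_base + block_rva + entry_offset;
}

/* New value of an absolute address once the image has moved from its
   preferred base to the base it was actually loaded at. */
uint32_t rebase(uint32_t value, uint32_t preferred_base, uint32_t actual_base)
{
    return value + (actual_base - preferred_base);
}
```

Here fixup_location(0x20000000, 0x11000, 0x3B5) gives 0x200113B5, and rebase(0x20012345, 0x20000000, 0x50000000) gives 0x50012345.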
When debugging in Visual Studio, if symbols for a call stack are missing, for example:
00 > HelloWorld.exe!my_function(int y=42) Line 291
01 dynlib2.dll!10011435()
[Frames below may be incorrect and/or missing, no symbols loaded for dynlib2.dll]
02 dynlib2.dll!10011497()
03 HelloWorld.exe!wmain(int __formal=1, int __formal=1) Line 297 + 0xd bytes
04 HelloWorld.exe!__tmainCRTStartup() Line 594 + 0x19 bytes
05 HelloWorld.exe!wmainCRTStartup() Line 414
06 kernel32.dll!_BaseProcessStart#4() + 0x23 bytes
the debugger will display the warning Frames below may be incorrect and/or missing.
(Note that only lines 01 and 02 have no symbols. Line 00, where I set a breakpoint and all other lines have symbols loaded.)
Now, I know how to fix the warning (get the pdb file); what I do not quite get is why it is displayed at all! The stack I pasted above is fully OK; it's just that I do not have a pdb file for the dynlib2.dll module.
Why does the debugger need a symbols file to make sure the stack is correct?
I think this is because not all the functions follow the "standard" stack layout. Usually every function starts with:
push ebp
mov ebp,esp
and ends with
pop ebp
ret
This way every function creates its so-called stack frame. EBP always points to the beginning of the top stack frame. In every frame the first two values are the pointer to the previous stack frame and the function return address.
Using this information one can easily reconstruct the stack. However:
- This stack information won't include function names and parameters info.
- Not all functions obey this stack frame layout. If some optimizations are enabled (/Oy, omit stack frame pointers, for instance), the stack layout is different.
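To make the "easy" case concrete, here is a sketch in C of walking the EBP chain over a simulated 32-bit stack. It is an illustration under assumptions, not what the Visual Studio debugger does: the stack is an array of words, walk_ebp_chain is a hypothetical name, and the walk simply stops when the chain stops making sense, which is exactly what happens to a naive walker when a function omits the frame pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* stack[i] holds the 32-bit word at address base + 4*i. Starting from
   ebp, each frame stores the caller's EBP at [ebp] and the return
   address at [ebp+4]. Collects return addresses and returns how many
   frames were found. */
size_t walk_ebp_chain(const uint32_t *stack, size_t words, uint32_t base,
                      uint32_t ebp, uint32_t *ret_addrs, size_t max_frames)
{
    size_t n = 0;
    while (n < max_frames) {
        if (ebp < base || (ebp - base) % 4 != 0)
            break;                      /* not a plausible frame pointer */
        size_t i = (ebp - base) / 4;
        if (i + 1 >= words)
            break;                      /* frame would lie outside the stack */
        ret_addrs[n++] = stack[i + 1];  /* return address sits just above saved EBP */
        uint32_t prev = stack[i];
        if (prev <= ebp)
            break;                      /* callers live at higher addresses */
        ebp = prev;
    }
    return n;
}
```

If one function in the chain uses EBP as a general-purpose register, the saved value at [ebp] is no longer a frame pointer and everything after that point is guesswork, hence the warning.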
I tried to understand this myself a while ago.
As of 2013, FPO is not used within MSFT and is generally frowned upon. I did come across a different MS binary technology used internally, that probably hampers naive EBP chain traversal: Basic Block Tools.
As noted in the post, PDBs do include 'StackFrameTypeEnum', and elsewhere it's hinted that they include the 'unwind program' for a stack frame. So all in all, they are still needed, and the gory details of why exactly - are not documented.
Symbols are decoupled from the associated binary code to reduce the size of shipping binaries. Check how big your PDB files are - huge, especially compared to the matching binary file (EXE/DLL). You would not want that overhead every time the binary is shipped, installed and used. This is especially important at load time. The symbol information is only for debugging after all, not required for correct operation of your code. Provided you keep symbols that match your shipped binaries, you can still debug problems post mortem with all symbols loaded.