Going through http://hackoftheday.securitytube.net/2013/04/demystifying-execve-shellcode-stack.html
I understood the nasm program which invokes execve and was trying to re-write it.
Some background information:
int execve(const char *filename, char *const argv[], char *const envp[]);
So, eax = 11 (the system call number for execve), ebx should point to char* filename, ecx should point to argv[] (which will be the same as ebx since the first argument is the filename itself, e.g. "/bin/sh" in this case), and edx will point to envp[] (null in this case).
Original nasm code:
global _start
section .text
_start:
xor eax, eax
push eax
; PUSH //bin/sh in reverse i.e. hs/nib//
push 0x68732f6e
push 0x69622f2f
mov ebx, esp
push eax
mov edx, esp
push ebx
mov ecx, esp
mov al, 11
int 0x80
The stack is as follows:
I then tried to optimize this by removing a few instructions. I agree that the code stays the same up to mov ebx, esp. However, since ecx will need to point to ebx, I can re-write the code as follows:
global _start
section .text
_start:
xor eax, eax
push eax
; PUSH //bin/sh in reverse i.e. hs/nib//
push 0x68732f6e
push 0x69622f2f
mov ebx, esp
mov ecx,ebx
push eax
mov edx, esp
mov al, 11
int 0x80
However, I get a segmentation fault when I run my re-written code.
My stack is as follows:
Any ideas why the re-written code does not work? I've also run gdb, and the address values match what I expect, but it just won't run.
In both cases ebx points to the string "//bin/sh". That is equivalent to C code like this:
char *EBX = "//bin/sh";
But in your first example, ecx is set to the address of a pointer to that string. That is equivalent to C code like this:
char *temp = "//bin/sh"; // push ebx
char **ECX = &temp; // mov ecx, esp
While in your second example, ecx is just set to the same value as ebx.
char *ECX = "//bin/sh";
The two examples are thus fundamentally different, with ecx having two completely different types and values.
Update:
I should add that technically ecx points to an array of char pointers (the argv argument), not just to a single char pointer. You're actually building up a two-item array on the stack.
char *argv[2];
argv[1] = NULL; // push eax, eax being zero
argv[0] = "//bin/sh"; // push ebx
ECX = argv; // mov ecx,esp
It's just that half of that array is doubling as the envp argument too. Since envp is a single-item array with that single item set to NULL, you can think of the envp argument being set with C code like this:
EDX = envp = &argv[1];
This is achieved by setting edx to esp while the argv array is only half constructed. Combining the code for the two assignments together you get this:
char *argv[2];
argv[1] = NULL; // push eax, eax being zero
EDX = &argv[1]; // mov edx,esp
argv[0] = "//bin/sh"; // push ebx
ECX = argv; // mov ecx,esp
It's a bit convoluted, but I hope that makes sense to you.
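To make the sharing between argv and envp concrete, here is a minimal C++ sketch of the same layout. It does not call execve; it only builds the three pointer values so the relationships can be checked. The names ExecveArgs and build_execve_args are hypothetical, and the two-element slots array stands in for the two dwords pushed on the stack.

```cpp
#include <cassert>

// Hypothetical holder for the values the shellcode leaves in ebx, ecx and edx.
struct ExecveArgs {
    const char*        filename;  // ebx
    const char* const* argv;      // ecx
    const char* const* envp;      // edx
};

// Builds the arguments in caller-provided storage, in the same order as the
// pushes in the original code. `slots` must have room for two pointers.
inline ExecveArgs build_execve_args(const char* path, const char* slots[2]) {
    assert(path != nullptr);
    ExecveArgs args{};
    args.filename = path;    // mov ebx, esp
    slots[1] = nullptr;      // push eax (eax is zero)
    args.envp = &slots[1];   // mov edx, esp
    slots[0] = path;         // push ebx
    args.argv = &slots[0];   // mov ecx, esp
    return args;
}
```

With this layout argv[0] equals filename, argv[1] is NULL, and envp is simply argv + 1. The rewritten code from the question corresponds to setting argv to filename itself, which makes the kernel read the characters of "//bin/sh" as if they were pointers.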
Update 2
All of the arguments to execve are passed in registers, but those registers hold pointers to memory that has to be allocated somewhere; in this case, on the stack. Since the stack grows downwards in memory, the chunks of memory need to be constructed in reverse order.
The memory for the three arguments looks like this:
char *filename: 2f 2f 62 69 | 6e 2f 73 68 | 00 00 00 00
char *argv[]: filename | 00 00 00 00
char *envp[]: 00 00 00 00
The filename is constructed like this:
push eax // '\0' terminator plus some extra
push 0x68732f6e // 'h','s','/','n'
push 0x69622f2f // 'i','b','/','/'
The argv argument like this:
push eax // NULL pointer
push ebx // filename
And the envp argument like this:
push eax // NULL pointer
But as I said, the original example decided to share memory between argv and envp, so there is no need for that last push eax.
I should also note that the reverse order of the characters in the two dwords used when constructing the string is because of the endianness of the machine, not the stack direction.
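The two effects (push order and byte order) can be separated in a small C++ sketch. It extracts the bytes by shifting, so it does not depend on the endianness of the machine that runs it. The helper names append_le32 and stack_bytes are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Appends the four bytes of `value` lowest byte first, which is how a
// little-endian CPU such as x86 stores a pushed dword in memory.
inline void append_le32(std::string& out, std::uint32_t value) {
    for (int shift = 0; shift < 32; shift += 8) {
        out.push_back(static_cast<char>((value >> shift) & 0xFFu));
    }
}

// Takes dwords in push order and returns the bytes as they lie in memory from
// the final stack pointer upwards. The last push sits at the lowest address.
inline std::string stack_bytes(const std::vector<std::uint32_t>& pushes) {
    assert(!pushes.empty());
    std::string bytes;
    for (auto it = pushes.rbegin(); it != pushes.rend(); ++it) {
        append_le32(bytes, *it);
    }
    return bytes;
}
```

Passing the three pushes from the filename construction (0, 0x68732f6e, 0x69622f2f) yields the bytes of "//bin/sh" followed by four zero bytes, matching the memory layout shown above.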
Related
I am trying to print an array using inline assembly. The printf function keeps interpreting the value on the stack as an address to read from, which results in an error (Screenshot: https://prnt.sc/r692d3). If I pass the address to printf instead, it prints garbage values like these: (Screenshot: https://prnt.sc/r691de).
Also, does anyone know how to put a '\n' inside a string with inline ASM? Thanks :)
int main()
{
int mas[5] = { 1,2,3,4,5 };
int32_t diff = sizeof(int);
__asm
{
mov esi, 0x0
lea ecx, [mas]
mov eax, [ecx]
push ecx
call printf; Here it tries to read value '1' as an address
pop eax
loop_t:
xor ebx, ebx; Clear the registers
xor ecx, ecx;
lea ecx, [mas]; ECX = &mas
mov ebx, diff;
add ebx, ecx; &mas + diff
mov eax, [ebx]; Transfer the value
push eax; Push it on stack
call printf; Same thing here, interprets it as an address
pop eax;
add diff, 0x4;
inc esi; Cleanup process and looping back on
cmp esi, 0x5;
jne loop_t;
}
}
The first parameter of the printf function is the format string, i.e. a pointer to the first character of a null terminated character array. Therefore, the first parameter will always be treated as an address.
If you pass the value 1 as the first parameter to printf (by pushing it onto the stack last), then it will try to read the format string from the address 1 (which will fail).
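As a plain C++ illustration of that argument order, the sketch below formats each element with a format string first and the value second. It uses snprintf rather than printf so the result can be inspected; the function name format_ints is hypothetical. Note that the newline lives inside the format string itself.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Formats each value on its own line. The format string is always the first
// formatting argument; the value to substitute for %d comes after it.
inline std::string format_ints(const int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    std::string out;
    char line[32];
    for (std::size_t i = 0; i < count; ++i) {
        std::snprintf(line, sizeof line, "%d\n", values[i]);
        out += line;
    }
    return out;
}
```

In the inline assembly from the question this would correspond to pushing the value first and the address of a format string such as "%d\n" last, then removing both arguments with add esp, 8 after the call. That mapping is my reading of cdecl and has not been tested against the original project.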
My assignment is to implement a function in assembly that would do the following:
loop through a sequence of characters and swap them such that the end result is the original string in reverse ( 100 points )
Hint: collect the string from user as a C-string then pass it to the assembly function along with the number of characters entered by the user. To find out the number of characters use strlen() function.
I have written both the C++ and the assembly programs, and they work to an extent: for example, if I input 12345 the output is correctly shown as 54321, but with more than 5 characters the output becomes incorrect: for example, if I input 123456 the output is 653241. I would greatly appreciate it if anyone could point out where my mistake is:
.code
_reverse PROC
push ebp
mov ebp,esp ;stack pointer to ebp
mov ebx,[ebp+8] ; address of first array element
mov ecx,[ebp+12] ; the number of elements in array
mov eax,ebx
mov ebp,0 ;move 0 to base pointer
mov edx,0 ; set data register to 0
mov edi,0
Setup:
mov esi , ecx
shr ecx,1
add ecx,edx
dec esi
reverse:
cmp ebp , ecx
je allDone
mov edx, eax
add eax , edi
add edx , esi
Swap:
mov bl, [edx]
mov bh, [eax]
mov [edx],bh
mov [eax],bl
inc edi
dec esi
cmp edi, esi
je allDone
inc ebp
jmp reverse
allDone:
pop ebp ; pop ebp out of stack
ret ; return the value of eax
_reverse ENDP
END
And here is my C++ code:
#include <cstring>
#include <iostream>
#include <string>
using namespace std;
extern "C" char reverse(char*, int);
int main()
{
    char str[64] = {0};
    int length;
    cout << " Please Enter the text you want to reverse:";
    cin >> str;
    length = static_cast<int>(strlen(str)); // strlen is declared in <cstring>
    reverse(str, length);
    cout << " the reversed of the input is: " << str << endl;
}
Your comments don't explain what the code is meant to do, so I don't know exactly what you're aiming for, but it looks like you are doing the array indexing manually with MOV / ADD instead of using an addressing mode like [eax + edi].
However, it looks like you're modifying your original value and then using it in a way that would only make sense if it were unmodified.
mov edx, eax ; EAX holds a pointer to the start of array, read every iter
add eax , edi ; modify the start of the array!!!
add edx , esi
Swap:
inc edi
dec esi
EAX grows by EDI every step, and EDI increases linearly, so EAX increases quadratically rather than linearly (the running sum 0 + 1 + 2 + ... + n is n(n+1)/2).
Single-stepping this in a debugger should have found this easily.
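If you don't have a debugger at hand, the effect can also be reproduced with an index-level model of the loop in C++. This is a sketch under my reading of the assembly: variables are named after the registers, eax is tracked as an offset from the real start of the string, and the function name simulate_buggy_reverse is hypothetical. The model stops if an index leaves the string, where the real code would touch memory outside it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Index-level model of the loop from the question, bug included.
inline std::string simulate_buggy_reverse(std::string s) {
    assert(s.size() < 64);  // matches the 64-byte buffer in the question
    const int n = static_cast<int>(s.size());
    int eax = 0;            // "start of array" offset; drifts because it is never reset
    int edi = 0;            // low index
    int esi = n - 1;        // high index
    const int ecx = n / 2;  // iteration limit (shr ecx, 1)
    for (int ebp = 0; ebp != ecx; ++ebp) {
        int edx = eax;      // mov edx, eax
        eax += edi;         // add eax, edi  (modifies the base)
        edx += esi;         // add edx, esi
        if (eax >= n || edx >= n) break;  // model only: stay inside the string
        std::swap(s[eax], s[edx]);        // the Swap: block
        ++edi;
        --esi;
        if (edi == esi) break;            // cmp edi, esi / je allDone
    }
    return s;
}
```

For "12345" the drift has not kicked in yet and the model returns "54321"; for "123456" the third swap uses offsets 3 and 4 instead of 2 and 3, which gives the "653241" reported in the question.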
BTW, the normal way to do this is to walk one pointer up, one pointer down, and fall out of the loop when they cross. Then you don't need a separate counter, just cmp / ja. (Don't check for JNE or JE, because they can cross each other without ever being equal.)
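The two-pointer approach can be sketched in C++ as follows; the same structure carries over to assembly with one register per pointer and a single unsigned compare at the loop test. The names reverse_in_place and reversed are hypothetical, not part of the assignment.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Reverses `length` characters in place by walking one pointer up from the
// start and one down from the end until they meet or cross.
inline void reverse_in_place(char* str, std::size_t length) {
    assert(str != nullptr || length == 0);
    if (length < 2) return;
    char* lo = str;
    char* hi = str + length - 1;
    while (lo < hi) {  // ordered compare, so crossing pointers also end the loop
        char tmp = *lo;
        *lo = *hi;
        *hi = tmp;
        ++lo;
        --hi;
    }
}

// Convenience wrapper that returns a reversed copy.
inline std::string reversed(std::string s) {
    if (!s.empty()) reverse_in_place(&s[0], s.size());
    return s;
}
```

Because the loop test is lo < hi rather than an equality check, odd and even lengths are handled by the same condition and no separate iteration counter is needed.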
Overall you have the right idea: start at both ends of the string and swap elements until you get to the middle. The implementation is horrible, though.
mov ebp,0 ;move 0 to base pointer
This seems to be the loop counter (the comment is useless or even worse); I guess the idea was to swap length/2 elements, which is perfectly fine. HINT: I'd just compare pointers/indexes and exit once they collide.
mov edx,0 ; set data register to 0
...
add ecx,edx
mov edx, eax
Useless and misleading.
mov edi,0
mov esi , ecx
dec esi
Looks like indexes to start/end of the string. OK. HINT I'd go with pointers to start/end of the string; but indexes work too
cmp ebp , ecx
je allDone
Exit if did length/2 iterations. OK.
mov edx, eax
add eax , edi
add edx , esi
eax and edx point to the current characters to be swapped. Almost OK, but this clobbers eax! Each loop iteration after the second will use wrong pointers! This is what caused your problem in the first place. It wouldn't have happened if you had used pointers instead of indexes, or offset addressing such as [eax+edi]/[eax+esi].
...
Swap part is OK
cmp edi, esi
je allDone
Second exit condition, this time checking for index collision! Generally one exit condition should be enough; several exit conditions are usually either superfluous or a hint at some flaw in the algorithm. An equality comparison is also not enough: the indexes can go from edi<esi to edi>esi in a single iteration.
I am currently learning assembly programming as part of one of my university modules. I have a program written in C++ with inline x86 assembly which takes a string of 6 characters and encrypts them based on the encryption key.
Here's the full program: https://gist.github.com/anonymous/1bb0c3be77566d9b791d
My code for the encrypt_chars function:
void encrypt_chars (int length, char EKey)
{ char temp_char; // char temporary store
for (int i = 0; i < length; i++) // encrypt characters one at a time
{
temp_char = OChars [i]; // temp_char now contains the address values of the individual character
__asm
{
push eax // Save values contained within register to stack
push ecx
movzx ecx, temp_char
push ecx // Push argument #2
lea eax, EKey
push eax // Push argument #1
call encrypt
add esp, 8 // Clean parameters of stack
mov temp_char, al // Move the temp character into a register
pop ecx
pop eax
}
EChars [i] = temp_char; // Store encrypted char in the encrypted chars array
}
return;
// Inputs: register EAX = 32-bit address of Ekey,
// ECX = the character to be encrypted (in the low 8-bit field, CL).
// Output: register EAX = the encrypted value of the source character (in the low 8-bit field, AL).
__asm
{
encrypt:
push ebp // Set stack
mov ebp, esp // Set up the base pointer
mov eax, [ebp + 8] // Move value of parameter 1 into EAX
mov ecx, [ebp + 12] // Move value of parameter 2 into ECX
push edi // Used for string and memory array copying
push ecx // Loop counter for pushing character onto stack
not byte ptr[eax] // Negation
add byte ptr[eax], 0x04 // Adds hex 4 to EKey
movzx edi, byte ptr[eax] // Moves value of EKey into EDI using zeroes
pop eax // Pop the character value from stack
xor eax, edi // XOR character to give encrypted value of source
pop edi // Pop original address of EDI from the stack
rol al, 1 // Rotates the encrypted value of source by 1 bit (left)
rol al, 1 // Rotates the encrypted value of source by 1 bit (left) again
add al, 0x04 // Adds hex 4 to encrypted value of source
mov esp, ebp // Deallocate values
pop ebp // Restore the base pointer
ret
}
//--- End of Assembly code
}
My questions are:
What is the best/ most efficient way to convert this for loop into assembly?
Is there a way to remove the call for encrypt and place the code directly in its place?
How can I optimise/minimise the use of registers and instructions to make the code smaller and potentially faster?
Is there a way for me to convert the OChars and EChars arrays into assembly?
If possible, would you be able to provide me with an explanation of how the solution works as I am eager to learn.
I can't help with optimization or the cryptography, but I can show you one way to build a loop. Consider the loop in this function:
void f()
{
int a, b ;
for(a = 10, b = 1; a != 0; --a)
{
b = b << 2 ;
}
}
The loop is essentially:
for(/*initialize*/; /*condition*/; /*modify*/)
{
// run code
}
So the function in assembly would look something along these lines:
_f:
push ebp
mov ebp, esp
sub esp, 8 ; int a,b
initialize: ; for
mov dword ptr [ebp-4], 10 ; a = 10,
mov dword ptr [ebp-8], 1 ; b = 1
mov eax, [ebp-4]
condition:
test eax, eax ; tests if a == 0
je exit
runCode:
mov eax, [ebp-8]
shl eax, 2 ; b = b << 2
mov dword ptr [ebp-8], eax
modify:
mov eax, [ebp-4]
sub eax, 1 ; --a
mov dword ptr [ebp-4], eax
jmp condition
exit:
mov esp, ebp
pop ebp
ret
The source also shows how to make local variables: subtract the space from the stack pointer, and access them through the base pointer.
I tried to keep the source as close to generic Intel x86 assembly syntax as I could, so my apologies if anything needs changing for your specific environment. I was aiming to give a general idea of how to construct a loop in assembly rather than something you can copy, paste and run.
I would suggest looking at the assembly code generated by the compiler. You can change and optimize it later.
See: How do you get assembler output from C/C++ source in gcc?
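One way to act on that suggestion is to write the routine in plain C++ first and let the compiler produce the assembly (for example with gcc -S, which writes an assembly listing). The sketch below is my reading of the encrypt routine in the question, with the call folded into an ordinary function the compiler is free to inline. The parameter lists are assumptions; in particular, the question's OChars and EChars globals are replaced here by explicit in and out pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Encrypts one character and updates the key, following the question's asm.
inline std::uint8_t encrypt_char(std::uint8_t& key, std::uint8_t ch) {
    key = static_cast<std::uint8_t>(~key);                // not byte ptr [eax]
    key = static_cast<std::uint8_t>(key + 0x04);          // add byte ptr [eax], 0x04
    std::uint8_t x = static_cast<std::uint8_t>(ch ^ key); // xor eax, edi (low byte)
    x = static_cast<std::uint8_t>((x << 2) | (x >> 6));   // rol al, 1 (twice)
    return static_cast<std::uint8_t>(x + 0x04);           // add al, 0x04
}

// Loop equivalent of encrypt_chars: the key is a local copy that changes
// from one character to the next, as it does with `lea eax, EKey`.
inline void encrypt_chars(const char* in, char* out, std::size_t length,
                          std::uint8_t key) {
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < length; ++i) {
        out[i] = static_cast<char>(
            encrypt_char(key, static_cast<std::uint8_t>(in[i])));
    }
}
```

Comparing the compiler's listing for this version with the hand-written one is a reasonable starting point for deciding which registers and instructions are actually needed.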
I had to implement the cdecl calling convention in this program, which originally used a non-standard convention. As far as I can tell it looks right, but I get an unhandled exception saying "Access violation writing location 0x00000066", which seems to hit when the program reaches the line "not byte ptr[eax]", or at least that is where the arrow points after the program breaks.
Could anyone tell me what is wrong with my program and how I may fix it? Thank you.
void encrypt_chars (int length, char EKey)
{ char temp_char;
for (int i = 0; i < length; i++)
{
temp_char = OChars [i];
__asm {
push eax
movzx eax, temp_char
push eax
lea eax, EKey
push eax
call encrypt
mov temp_char, al
pop eax
}
EChars[i] = temp_char;
return;
// Inputs: register EAX = 32-bit address of Ekey,
// ECX = the character to be encrypted (in the low 8-bit field, CL).
// Output: register EAX = the encrypted value of the source character (in the low 8-bit field, AL).
__asm {
encrypt:
push ebp
mov ebp, esp
mov ecx, 8[ebp]
mov eax, 12[ebp]
push edi
push ecx
not byte ptr[eax]
add byte ptr[eax], 0x04
movzx edi, byte ptr[eax]
pop eax
xor eax, edi
pop edi
rol al, 1
rol al, 1
add al, 0x04
mov esp, ebp
pop ebp
ret
}
By inspection, the comment on the encrypt function is wrong. Remember: the stack grows down, so when the arguments are pushed onto the stack, the ones pushed first have the higher address and, therefore, the higher offset from the base pointer in the stack frame.
The comment to encrypt says:
// Inputs: register EAX = 32-bit address of Ekey,
// ECX = the character to be encrypted
However, your calling sequence is:
movzx eax, temp_char ; push the char to encrypt FIRST
push eax
lea eax, EKey ; push the encryption key SECOND
push eax
call encrypt
So the character is pushed first and the key pointer second, which puts the character to encrypt at the higher offset. But encrypt is loading them this way:
; On function entry, the old Instruction Pointer (4 bytes) is pushed onto the stack
; so now the EKey is +4 bytes from the stack pointer
; and the character is +8 bytes from the stack pointer
;
push ebp
mov ebp, esp
; We just pushed another 4 bytes onto the stack (the old ebp value)
; and THEN we put the stack pointer (esp) into ebp as base pointer
; to the stack frame.
;
; That means EKey is now +8 bytes off of the base pointer
; and the char to encrypt is +12 off of the base pointer
;
mov ecx, 8[ebp] ; This loads EKey pointer to ECX
mov eax, 12[ebp] ; This loads char-to-encrypt to EAX
The code then treats EAX as a pointer (since it thinks that's EKey). EAX actually holds your character to encrypt, so the first attempt to dereference it causes an access violation. That first dereference is here:
not byte ptr[eax]
So your debugger pointer was right! :)
You can fix it just by swapping these two registers:
mov eax, 8[ebp] ; This loads EKey pointer to EAX
mov ecx, 12[ebp] ; This loads char-to-encrypt to ECX
Finally, your call to encrypt doesn't clean up the stack pointer when it's done. Since you pushed 8 bytes of data onto the stack before calling encrypt, and since encrypt does a standard ret with no stack clean-up, you need to clean up after the call:
...
call encrypt
add esp, 8
...
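The offsets can be checked with a toy model of a downward-growing stack in C++. Everything here is illustrative: ToyStack, FrameView and model_call are hypothetical names, and the return address and saved ebp are placeholder values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack. slots_[0] is the most recently pushed dword,
// so slots_[k] corresponds to [esp + 4*k].
class ToyStack {
public:
    void push(std::uint32_t value) { slots_.insert(slots_.begin(), value); }

    // Reads the dword at [esp + offset]; offset must be a multiple of 4.
    std::uint32_t at_offset(std::size_t offset) const {
        assert(offset % 4 == 0 && offset / 4 < slots_.size());
        return slots_[offset / 4];
    }

private:
    std::vector<std::uint32_t> slots_;
};

struct FrameView {
    std::uint32_t at_ebp_plus_8;
    std::uint32_t at_ebp_plus_12;
};

// Replays the call sequence and reports what the callee sees in its frame.
inline FrameView model_call(std::uint32_t first_pushed,
                            std::uint32_t second_pushed) {
    ToyStack stack;
    stack.push(first_pushed);   // caller: push eax (the character)
    stack.push(second_pushed);  // caller: push eax (address of EKey)
    stack.push(0xAAAAAAAAu);    // call: return address (placeholder)
    stack.push(0xBBBBBBBBu);    // callee: push ebp (placeholder)
    // mov ebp, esp: ebp equals esp here, so [ebp + n] is [esp + n].
    return FrameView{stack.at_offset(8), stack.at_offset(12)};
}
```

Calling model_call(0x66, some_key_address) returns the key address at ebp+8 and 0x66 at ebp+12. 0x66 is the ASCII code for 'f', which is consistent with the faulting address in the question being a character value rather than a pointer.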
I would like to dump a function's memory (void) to a byte array (unsigned char[]). Afterwards, a dummy function pointer should be pointed at the byte array and the dummy function executed.
The function I want to dump:
void CallMessageBoxExA()
{
message = "ManualMessageBoxExA";
caption = "Caption";
pAddr = GetProcAddress(GetModuleHandle(L"User32.dll"), "MessageBoxExA");
__asm // Call MessageBoxA
{
push dword ptr 0 //--- push languageID: 0
push dword ptr 0 //--- push style: 0
push dword ptr caption //--- push DWORD parameter (caption)
push dword ptr message //--- push DWORD parameter (message)
push dword ptr 0 //--- push hOwner: 0
mov eax, pAddr
call eax //-- call address of the function, which is currently in EAX
}
}
Dumping the memory:
string DumpMemory(void *pAddress, int maxLength)
{
string result = "";
const unsigned char * p = reinterpret_cast< const unsigned char *>(pAddress);
cout << "Memory location: 0x" << hex << (unsigned int)p << endl;
for (unsigned int i = 0; i < maxLength; i++) {
string code = "";
stringstream ss;
ss << hex << int(p[i]);
ss >> code;
result += code;
}
return result;
}
When looking at the memory location DumpMemory prints to the console, ollydbg shows a JMP instruction at this location:
CPU Disasm
Address Hex dump Command Comments
00281627 $-/E9 044C0000 JMP CallMessageBoxExA
Is this the correct memory-location, or do I have to follow the JMP?
Memory location jump leads to:
CPU Disasm
Address Hex dump Command Comments
00286230 /$ 55 PUSH EBP ; ASM.CallMessageBoxExA(void)
00286231 |. 8BEC MOV EBP,ESP
00286233 |. 81EC C0000000 SUB ESP,0C0
00286239 |. 53 PUSH EBX
0028623A |. 56 PUSH ESI
0028623B |. 57 PUSH EDI
0028623C |. 8DBD 40FFFFFF LEA EDI,[EBP-0C0]
00286242 |. B9 30000000 MOV ECX,30
00286247 |. B8 CCCCCCCC MOV EAX,CCCCCCCC
0028624C |. F3:AB REP STOS DWORD PTR ES:[EDI]
0028624E |. C705 A8562900 MOV DWORD PTR DS:[message],OFFSET 00291D ; ASCII "ManualMessageBoxExA"
00286258 |. C705 AC562900 MOV DWORD PTR DS:[caption],OFFSET 00291D ; ASCII "Caption"
00286262 |. 8BF4 MOV ESI,ESP
00286264 |. 68 041E2900 PUSH OFFSET 00291E04 ; ASCII "MessageBoxExA"
00286269 |. 8BFC MOV EDI,ESP
0028626B |. 68 D01D2900 PUSH OFFSET 00291DD0 ; /ModuleName = "User32.dll"
00286270 |. FF15 00602900 CALL DWORD PTR DS:[<&KERNEL32.GetModuleH ; \KERNEL32.GetModuleHandleW
00286276 |. 3BFC CMP EDI,ESP
00286278 |. E8 5BB2FFFF CALL 002814D8 ; [_RTC_CheckEsp
0028627D |. 50 PUSH EAX ; |hModule
0028627E |. FF15 04602900 CALL DWORD PTR DS:[<&KERNEL32.GetProcAdd ; \KERNEL32.GetProcAddress
00286284 |. 3BF4 CMP ESI,ESP
00286286 |. E8 4DB2FFFF CALL 002814D8 ; [_RTC_CheckEsp
0028628B |. A3 B0562900 MOV DWORD PTR DS:[pAddr],EAX
00286290 |. 6A 00 PUSH 0
00286292 |. 6A 00 PUSH 0
00286294 |. FF35 AC562900 PUSH DWORD PTR DS:[caption]
0028629A |. FF35 A8562900 PUSH DWORD PTR DS:[message]
002862A0 |. 6A 00 PUSH 0
002862A2 |. A1 B0562900 MOV EAX,DWORD PTR DS:[pAddr]
002862A7 |. FFD0 CALL EAX
002862A9 |. 5F POP EDI
002862AA |. 5E POP ESI
002862AB |. 5B POP EBX
002862AC |. 81C4 C0000000 ADD ESP,0C0
002862B2 |. 3BEC CMP EBP,ESP
002862B4 |. E8 1FB2FFFF CALL 002814D8 ; [_RTC_CheckEsp
002862B9 |. 8BE5 MOV ESP,EBP
002862BB |. 5D POP EBP
002862BC \. C3 RETN
Pointing the dummy function at the byte array:
void(*func_ptr)();
func_ptr = (void(*)()) &foo[0]; // make function point to foo[]
(*func_ptr)(); // Call the function
Is this the correct way to make the dummy function point to the byte array?
At which point is the function's end reached? Should I simply check for the different return opcodes (C3 -> return near to caller, CB -> return far to caller, ...)?
PS:
A simple (i.e. not very elaborate) solution is preferred as I am new to C++.
Edit: I want to achieve this in a Windows environment.
You need to store the "copied" function in a block of memory allocated with VirtualAlloc (or VirtualAllocEx, which additionally takes a process handle). On modern OSs, each page carries a bit that declares whether its contents are executable. This is used to limit the damage of buffer overruns. By default, your memory is not executable. If you allocate with the PAGE_EXECUTE_READWRITE protection mode, you'll be able to write to the block of memory and then execute from it.
As for your question of "when do you reach the end of a function," that is actually not answerable. There are common patterns you can look for, but x86 lacks any way of identifying the "end" of a function.
It looks like you need to follow the jump. When you follow the jump, the code you're seeing matches what you compiled above.
Also, your DumpMemory is using the address of pAddressIn. Your function is being passed a variable called pAddress. Either this is a typo, or you're referencing a variable declared somewhere else. I assume you meant to use pAddress.
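One more detail about DumpMemory as posted: each byte is streamed with hex but without a field width, so a byte such as 0x04 comes out as "4" and the resulting string can no longer be split back into bytes. A minimal sketch of a padded variant follows; the name dump_hex is hypothetical and the function only formats bytes, it does not copy or execute anything.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <sstream>
#include <string>

// Returns `length` bytes starting at `address` as two hex digits per byte.
inline std::string dump_hex(const void* address, std::size_t length) {
    assert(address != nullptr || length == 0);
    const unsigned char* p = static_cast<const unsigned char*>(address);
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (std::size_t i = 0; i < length; ++i) {
        // setw applies only to the next item, so it is repeated per byte.
        out << std::setw(2) << static_cast<unsigned int>(p[i]);
    }
    return out.str();
}
```

For the first bytes of the disassembly above (55 8B EC) this produces "558bec", with every byte occupying exactly two characters.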
The memory you allocate may need special privilege to be allowed to run. As is, the memory you allocate with the raw function data will be marked as "data". "Data Execution Prevention" may stop this depending on your environment.